P1: TIX/OSW JWBS074-bref P2: ABC JWBS074-Huitema August 21, 2011 8:18 Printer Name: Yet to Come References Abebe, A., McKean, J. W., and Huitema, B. E. (2008). Robust Propensity Score Analysis for Causal Inference in Observational Studies. Salt Lake City, UT: American Statistical Association. Abebe, A., McKean, J. W., and Kloke, J. D. (In preparation). Iterated Reweighted Rank-Based Estimates for GEE Models. Abelson, R. P., and Tukey, J. W. (1963). Efficient utilization of non-numerical information in quantitative analysis: General theory and the case of simple order. Annals of Mathematical Statistics, 34, 1347–1369. American Psychological Association (2001). Publication Manual of the American Psychological Association (5th edn.). Washington, DC: Author. Atiqullah, M. (1964). The robustness of the covariance analysis of a one-way classification. Biometrika, 51, 365–373. Awosoga, O. A. (2009). Meta-analyses of multiple baseline time-series design intervention models for dependent and independent series. Unpublished Doctoral Dissertation. Kalamazoo: Western Michigan University. Bancroft, T. A. (1964). Analysis and inference for incompletely specified models involving the use of preliminary tests of significance. Biometrics, 20, 427–439. Barlow, D., Nock, M., and Hersen, M. (2009). Single Case Experimental Designs: Strategies for Studying Behavior for Change. Boston: Pearson Allyn and Bacon. Bathke, A., and Brunner, E. (2003). A nonparametric alternative to analysis of covariance. In M. G. Akritas and D. N. Politis (Eds.), Recent Advances and Trends in Nonparametric Statistics (pp. 109–120). Amsterdam: Elsevier. Beach, M. L., and Meier, P. (1989). Choosing covariates in the analysis of clinical trials. Controlled Clinical Trials, 10, 161S–175S. Begg, C. B. (1990). Significance of tests of covariate imbalance. Controlled Clinical Trials, 11, 223–225. Benson, K., and Hartz, A. J. (2000).
A comparison of observational studies and randomized controlled trials: Special articles. New England Journal of Medicine, 342, 1878–1886. The Analysis of Covariance and Alternatives: Statistical Methods for Experiments, Quasi-Experiments, and Single-Case Studies, Second Edition. Bradley E. Huitema. © 2011 John Wiley & Sons, Inc. Published 2011 by John Wiley & Sons, Inc. Berger, V. W. (2005a). Quantifying the magnitude of baseline imbalances resulting from selection bias in randomized clinical trials. Biometrical Journal, 47, 119–127. Berger, V. W. (2005b). Selection Bias and Covariate Imbalances in Randomized Clinical Trials. Chichester, West Sussex, England: Wiley. Berger, V. W., and Exner, D. V. (1999). Detecting selection bias in randomized clinical trials. Controlled Clinical Trials, 20, 319–327. Berger, V. W., and Weinstein, S. (2004). Ensuring the comparability of comparison groups: Is randomization enough? Controlled Clinical Trials, 25, 515–524. Bernard, C. (1865). An Introduction to the Study of Experimental Medicine. First English translation by Henry Copley Greene, 1927; reprinted in 1949. London: Macmillan & Co., Ltd. Borich, G. D., Godbout, R. D., and Wunderlich, K. W. (1976). The Analysis of Aptitude-Treatment Interactions: Computer Programs and Applications. San Francisco: Jossey-Bass. Borich, G. D., and Wunderlich, K. W. (1973). A note on some statistical considerations for using Johnson-Neyman regions of significance. Annual meeting of the American Psychological Association. Montreal, Quebec. Box, G. E. P., Jenkins, G. M., and Reinsel, G. C. (2008). Time Series Analysis: Forecasting and Control (4th edn.). Hoboken, NJ: Wiley. Box, G. E. P., and Tiao, G. C. (1965). A change in level of a nonstationary time series. Biometrika, 52, 181–192. Box, G. E. P., and Tiao, G. C. (1975).
Intervention analysis with applications to economic and environmental problems. Journal of the American Statistical Association, 70, 70–79. Bradley, R. A., and Srivastava, S. S. (1979). Correlation in polynomial regression. American Statistician, 33, 11–14. Browne, R. H. (2010a). The t-test p value and its relationship to the effect size and P(X > Y). American Statistician, 64, 30–33. Browne, R. H. (2010b). Correction: The t-test p value and its relationship to the effect size and P(X > Y). American Statistician, 64, 195. Bryant, J. L., and Brunvold, N. T. (1980). Multiple comparison procedures in the ANCOVA. Journal of the American Statistical Association, 75, 874–880. Bryant, J. L., and Paulson, A. S. (1976). An extension of Tukey’s method of multiple comparisons to experimental designs with random concomitant variables. Biometrika, 63, 631–638. Budescu, D. V. (1980). A note on polynomial regression. Multivariate Behavioral Research, 15, 497–506. Burk, D. (1980). Cancer mortality linked with artificial fluoridation in Birmingham, England. Paper presented at the 4th International Symposium on the Prevention and Detection of Cancer. Wembley, UK. Cahen, L., and Linn, R. L. (1971). Regions of significant criterion differences in aptitude-treatment-interaction research. American Educational Research Journal, 8, 521–530. Campbell, D. T., and Kenny, D. A. (1999). A Primer on Regression Artifacts. New York: Guilford Press. Campbell, D. T., and Stanley, J. (1966). Experimental and Quasi-Experimental Designs for Research. Chicago: Rand McNally. Carroll, R. J., Ruppert, D., Stefanski, L. A., and Crainiceanu, C. M. (2006). Measurement Error in Nonlinear Models: A Modern Perspective (2nd edn.). Boca Raton, FL: Chapman & Hall/CRC. Chen, S., and Cox, C. (1992). Use of baseline data for estimation of treatment effects in the presence of regression to the mean.
Biometrics, 48, 593–598. Chen, S., Cox, C., and Cui, L. (1998). A more flexible regression-to-the-mean model with possible stratification. Biometrics, 54, 939–947. Cobb, G. W. (1998). Introduction to Design and Analysis of Experiments. New York: Springer. Cochran, W. G. (1968). Errors of measurement in statistics. Technometrics, 10, 637–666. Cohen, A. C. (1955). Restriction and selection in samples from bivariate normal distributions. Journal of the American Statistical Association, 50, 884–893. Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd edn.). Hillsdale, NJ: Lawrence Erlbaum. Cohen, P., Cohen, J., West, S., and Aiken, L. S. (2003). Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences (3rd edn.). Hillsdale, NJ: Lawrence Erlbaum. Concato, J., Shah, N., and Horwitz, R. (2000). Randomized controlled trials, observational studies, and the hierarchy of research designs. New England Journal of Medicine, 342, 1887–1892. Conover, W. J., and Iman, R. L. (1982). Analysis of covariance using the rank transformation. Biometrics, 38, 715–724. Cook-Mozaffari, P., Bulusu, L., and Doll, R. (1981). Fluoridation of water supplies and cancer mortality. I. A search for an effect in the U.K. on risk of death from cancer. Journal of Epidemiology and Community Health, 35, 227–232. Cook-Mozaffari, P., and Doll, R. (1981). Fluoridation of water supplies and cancer mortality. II. Mortality trends after fluoridation. Journal of Epidemiology and Community Health, 35, 233–238. Cooper, J. O., Heron, T. E., and Heward, W. L. (2006). Applied Behavior Analysis (2nd edn.). Englewood Cliffs, NJ: Prentice Hall. Cramer, E. M., and Appelbaum, M. I. (1978). The validity of polynomial regression in the random regression model. Review of Educational Research, 48, 511–515. DeGracie, J. S., and Fuller, W. A. (1972). Estimation of the slope and analysis of covariance when the concomitant variable is measured with error.
Journal of the American Statistical Association, 67, 930–937. Dehejia, R. H., and Wahba, S. (2002). Propensity score-matching methods for nonexperimental causal studies. The Review of Economics and Statistics, 84, 151–161. Dixon, S. L., and McKean, J. W. (1996). Rank based analysis of the heteroscedastic linear model. Journal of the American Statistical Association, 91, 699–712. Dmitrienko, A., Tamhane, A. C., and Bretz, F. (2010). Multiple Testing Problems in Pharmaceutical Statistics. New York: CRC Press. Drake, C. (1993). Effects of misspecification of the propensity score on estimators of treatment effect. Biometrics, 49, 1231–1236. Dunn, O. J. (1961). Multiple comparisons among means. Journal of the American Statistical Association, 56, 52–64. Durbin, J., and Watson, G. S. (1950). Testing for serial correlation in least squares regression: I. Biometrika, 37, 409–428. Durbin, J., and Watson, G. S. (1951). Testing for serial correlation in least squares regression: II. Biometrika, 38, 159–178. Dyer, O. (2003). GMC reprimands doctor for research fraud. British Medical Journal, 326, 730. Dyer, K., Schwartz, I. S., and Luce, S. C. (1984). A supervision program for increasing functional activities for severely handicapped students in a residential setting. Journal of Applied Behavior Analysis, 17, 249–259. Efron, B., and Tibshirani, R. J. (1993). An Introduction to the Bootstrap. New York: Chapman & Hall. Elashoff, J. D. (1969). Analysis of covariance: A delicate instrument. American Educational Research Journal, 6, 383–401. Enders, W. (2010). Applied Econometric Time Series (3rd edn.). Hoboken, NJ: Wiley. Erlander, S., and Gustavsson, J. (1965). Simultaneous confidence regions in normal regression analysis with an application to road accidents. Review of the International Statistical Institute, 33, 364–377. Fante, R. M., Dickinson, A. M., and Huitema, B.
E. (Submitted). A comparison of three training methods on the acquisition and retention of automotive product knowledge. Farthing, M. J. G. (2004). “Publish and be damned” . . . the road to research misconduct. Journal of the Royal College of Physicians of Edinburgh, 34, 301–304. Forsythe, A. B. (1977). Post-hoc decision to use a covariate. Journal of Chronic Disease, 30, 61–64. Fredericks, H. D. (1969). A comparison of Doman-Delacato method and a behavior modification method upon the coordination of mongoloids. Unpublished Doctoral Dissertation. University of Oregon, Eugene. Fuller, W. A. (1987). Measurement Error Models. New York: Wiley. Fuller, W. A., and Hidiroglou, M. A. (1978). Regression estimation after correction for attenuation. Journal of the American Statistical Association, 73, 99–104. Gelman, A., and Hill, J. (2007). Data Analysis Using Regression and Multilevel/Hierarchical Models. New York: Cambridge University Press. Glass, G. V., Willson, V. L., and Gottman, J. (1975). Design and Analysis of Time-Series Experiments. Boulder, CO: Colorado Associated University Press. Gong, G., and Samaniego, F. J. (1981). Pseudo maximum likelihood estimation: Theory and applications. Annals of Statistics, 9, 861–869. Gould, S. J. (1966). Allometry and size in ontogeny and physiology. Biological Reviews, 41, 587–640. Greenhouse, G. R. (2003). The growth and future of biostatistics. Statistics in Medicine, 22, 3323–3335. Grice, G. R., and Hunter, J. J. (1964). Stimulus intensity effects depend upon the type of experimental design. Psychological Review, 71, 247–256. Guyatt, G. H., Heyting, A., Jaeschke, R., Keller, J., Adachi, J. D., and Roberts, R. S. (1990). N of 1 randomized trials for investigating new drugs. Controlled Clinical Trials, 11, 88–100. Hamilton, B. L. (1976). A Monte Carlo test of the robustness of parametric and nonparametric analysis of covariance against unequal regression slopes. Journal of the American Statistical Association, 71, 864–869. 
Hayes, A. F., and Matthes, J. (2009). Computational procedures for probing interactions in OLS and logistic regression: SPSS and SAS implementations. Behavior Research Methods, 41, 924–936. Hayes, R. J., and Moulton, L. H. (2009). Cluster Randomized Trials. New York: Chapman & Hall. Hettmansperger, T. P., and McKean, J. W. (1998). Robust Nonparametric Statistical Methods. London: Arnold. Hill, J. (2008). Comment. Journal of the American Statistical Association, 103, 1346–1350. Hill, J., and Reiter, J. P. (2006). Interval estimation for treatment effects using propensity score matching. Statistics in Medicine, 25, 2230–2256. Hochberg, Y., and Varon-Salomon, Y. (1984). On simultaneous pairwise comparisons in analysis of covariance. Journal of the American Statistical Association, 79, 863–866. Hollingsworth, H. H. (1980). An analytic investigation of the effects of heterogeneous regression slopes in analysis of covariance. Educational and Psychological Measurement, 40, 611–618. Hosmer, D. W., and Lemeshow, S. (2000). Applied Logistic Regression (2nd edn.). New York: Wiley. Howell, D. (2010). Statistical Methods for Psychology (7th edn.). New York: Wadsworth. Huitema, B. E. (1980). The Analysis of Covariance and Alternatives. New York: Wiley. Huitema, B. E. (1985). Autocorrelation in applied behavior analysis: A myth. Behavioral Assessment, 7, 109–120. Huitema, B. E. (1986a). Autocorrelation in behavioral research: Wherefore Art Thou? In A. Poling and R. W. Fuqua (Eds.), Research Methods in Applied Behavior Analysis: Issues and Advances (pp. 187–208). New York: Plenum Press. Huitema, B. E. (1986b). Statistical analysis and single-subject designs: Some misunderstandings. In A. Poling and R. W. Fuqua (Eds.), Research Methods in Applied Behavior Analysis: Issues and Advances (pp. 209–232). New York: Plenum. Huitema, B. E. (1988).
Autocorrelation: 10 years of confusion. Behavioral Assessment, 10, 253–294. Huitema, B. E. (2004). Analysis of interrupted time-series experiments using ITSE: A critique. Understanding Statistics: Statistical Issues in Psychology, Education, and the Social Sciences, 3, 27–46. Huitema, B. E. (2008). A four phase time-series intervention model with parameters measuring immediate and delayed effects. Unpublished manuscript. Kalamazoo: Western Michigan University. Huitema, B. E. (2009). Reversed ordinal logistic regression: A method for the analysis of experiments with ordered treatment levels. Unpublished manuscript. Kalamazoo: Western Michigan University. Huitema, B. E., McKean, J. W., and McKnight, S. (1994, June). New methods of intervention analysis: Simple and Complex. Presented at the meeting of the Association for Behavior Analysis. Atlanta, GA. Huitema, B. E., and McKean, J. W. (1998). Irrelevant autocorrelation in least-squares intervention models. Psychological Methods, 3, 104–116. Huitema, B. E., and McKean, J. W. (2000a). A simple and powerful test for autocorrelated errors in OLS intervention models. Psychological Reports, 87, 3–20. Huitema, B. E., and McKean, J. W. (2000b). Design specification issues in time-series intervention models. Educational and Psychological Measurement, 60, 38–58. Huitema, B. E., and McKean, J. W. (2005). Propensity score methodology combined with modified ANCOVA. Paper presented at the 6th International Conference on Health Policy Research: Methodological Issues in Health Services and Outcomes Research. Boston, MA. Huitema, B. E., and McKean, J. W. (2007a). Identifying autocorrelation generated by various error processes in interrupted time-series regression designs. Educational and Psychological Measurement, 67, 447–459. Huitema, B. E., and McKean, J. W. (2007b).
An improved portmanteau test for autocorrelated errors in interrupted time-series regression. Behavior Research Methods, Instruments, & Computers, 39, 343–349. Huitema, B. E., McKean, J. W., and Laraway, S. (2008). Time-series intervention analysis using ITSACORR: Fatal flaws. Journal of Modern Applied Statistical Methods, 6, 367–379. Huitema, B. E., McKean, J. W., and Laraway, S. (In press). Erratum for: Time-series intervention analysis using ITSACORR: Fatal flaws. Journal of Modern Applied Statistical Methods. Hunka, S. (1994). Using Mathematica to solve Johnson-Neyman problems. Mathematica in Education, 3, 32–36. Hunka, S. (1995). Identifying regions of significance in ANCOVA problems having nonhomogeneous regressions. British Journal of Mathematical and Statistical Psychology, 48, 161–188. Hunka, S., and Leighton, J. (1997). Defining Johnson-Neyman Regions of significance in the three-covariate ANCOVA using Mathematica. Journal of Educational and Behavioral Statistics, 22, 361–387. Imbens, G., and Lemieux, T. (2008). Regression discontinuity design: A guide to practice. Journal of Econometrics, 142, 615–635. Imbens, G., and Rubin, D. B. (In preparation.) Causal Inference in Statistics, and in the Social and Biomedical Sciences. New York: Cambridge University Press. Ioannidis, J., Haidich, A., and Pappa, M., et al. (2001). Comparison of evidence of treatment effects in randomized and nonrandomized studies. Journal of the American Medical Association, 286, 821–830. Johnson, P. O., and Neyman, J. (1936). Tests of certain linear hypotheses and their application to some educational problems. Statistical Research Memoirs, 1, 57–93. Johnston, J. M., and Pennypacker, H. S. (2008). Strategies and Tactics of Behavioral Research (3rd edn.). New York: Routledge. Jonckheere, A. R. (1954). A distribution free k-sample test against ordered alternatives. Biometrika, 41, 133–145. Jones, R. R., Vaught, R. S., and Weinrott, M. R. (1977). Time-series analysis in operant research. 
Journal of Applied Behavior Analysis, 10, 151–167. Karpman, M. B. (1980). ANCOVA—a one covariate Johnson-Neyman algorithm. Educational and Psychological Measurement, 40, 791–793. Karpman, M. B. (1983). The Johnson-Neyman technique using SPSS or BMDP. Educational and Psychological Measurement, 43, 137–147. Karpman, M. B. (1986). Comparing two non-parallel regression lines with the parametric alternative to analysis of covariance using SPSS-X or SAS—The Johnson-Neyman technique. Educational and Psychological Measurement, 46, 639–644. Kazdin, A. E. (1982). Single-Case Research Designs: Methods for Clinical and Applied Settings. New York: Oxford University Press. Keppel, G., and Wickens, T. D. (2004). Design and Analysis: A Researcher’s Handbook (4th edn.). Upper Saddle River, NJ: Pearson Prentice-Hall. Kirk, R. (1995). Experimental Design (3rd edn.). Pacific Grove, CA: Brooks/Cole. Kocher, A. T. (1974). An investigation of the effects of non-homogeneous within-group regression coefficients upon the F test of analysis of covariance. Paper presented at the annual meeting of the American Educational Research Association, Chicago. Koehler, M. J., and Levin, J. R. (2000). RegRand: Statistical software for the multiple-baseline design. Behavior Research Methods, Instruments, & Computers, 32, 367–371. Kosten, S. F. (2010). Robust interval estimation of a treatment effect in observational studies using propensity score matching. Unpublished Doctoral Dissertation. Kalamazoo: Western Michigan University. Kosten, S. F., McKean, J. W., and Huitema, B. E. (Submitted). Robust and nonrobust interval estimation of treatment effects in observational studies designed using propensity score matching. Manuscript submitted for publication. Kramer, C. Y. (1956). Extensions of multiple range tests to group means with unequal numbers of replications. Biometrics, 12, 307–310.
Kreft, I., and de Leeuw, J. (1998). Introducing Multilevel Modeling. London: Sage. Lautenschlager, G. J. (1987). JOHN-NEY: An interactive program for computing the Johnson-Neyman confidence region for nonsignificant prediction differences. Applied Psychological Measurement, 11, 194–195. Lehmacher, W., Wassmer, G., and Reitmeir, P. (1991). Procedures for two-sample comparisons with multiple endpoints controlling the experimentwise error rate. Biometrics, 47, 511–521. Lehmann, E. L., and D’Abrera, H. J. M. (1975). Nonparametrics: Statistical Methods Based On Ranks. San Francisco: Holden-Day. Levy, K. (1980). A Monte Carlo study of analysis of covariance under violations of the assumptions of normality and equal regression slopes. Educational and Psychological Measurement, 40, 835–840. Little, R. J., Long, Q., and Lin, X. (2008). Comment. Journal of the American Statistical Association, 103, 1344–1346. Long, J. S. (1997). Regression Models for Categorical and Limited Dependent Variables. Thousand Oaks, CA: Sage Publications. Lord, F. M. (1960). Large-sample covariance analysis when the control variable is fallible. Journal of the American Statistical Association, 55, 307–321. Lundervold, D. A., and Belwood, M. F. (2000). The best kept secret in counseling: Single-case (N = 1) experimental designs. Journal of Counseling & Development, 78, 92–102. Maddox, J., Randi, J., and Stewart, W. W. (1988). “High-dilution” experiments a delusion. Nature, 334, 287–290. Madsen, L. G., and Bytzer, P. (2002). Single subject trials as a research instrument in gastrointestinal pharmacology. Alimentary Pharmacology & Therapeutics, 16, 189–196. Manly, B. F. J. (1992). The Design and Analysis of Research Studies. London: Cambridge University Press. Maxwell, S. E., and Delaney, H. D. (2004). Designing Experiments and Analyzing Data: A Model Comparison Perspective (2nd edn.).
Mahwah, NJ: Lawrence Erlbaum. Maxwell, S. E., Delaney, H. D., and Manheimer, J. M. (1985). ANOVA of residuals and ANCOVA: Correcting an illusion by using model comparisons and graphs. Journal of Educational Statistics, 10, 197–209. Mayo, J., White, O., and Eysenck, H. J. (1978). An empirical study of the relation between astrological factors and personality. Journal of Social Psychology, 105, 229–236. McKean, J. W. (2004). Robust analysis of linear models. Statistical Science, 19, 562–570. McKean, J. W., Naranjo, J., and Huitema, B. E. (2001). A robust method for the analysis of experiments with ordered treatment levels. Psychological Reports, 89, 267–273. McKean, J. W., Naranjo, J., and Sheather, S. J. (1999). Diagnostics for comparing robust and least squares fits. Journal of Nonparametric Statistics, 11, 161–188. McKean, J. W., and Sheather, S. J. (1991). Small sample properties of robust analyses of linear models based on R-estimates: A Survey. In W. Stahel and S. Weisberg (Eds.), Directions in Robust Statistics and Diagnostics, Part II (pp. 1–19). New York: Springer-Verlag. McKean, J. W., Sheather, S. J., and Hettmansperger, T. P. (1993). The use and interpretation of residuals based on robust estimation. Journal of the American Statistical Association, 88, 1254–1263. McKean, J. W., and Vidmar, T. J. (1994). A comparison of two rank-based methods for the analysis of linear models. American Statistician, 48, 220–229. McKnight, S., McKean, J. W., and Huitema, B. E. (2000). A double bootstrap method to analyze linear models with autoregressive error terms. Psychological Methods, 5, 87–101. Mee, R. W., and Chau, T. C. (1991). Regression toward the mean and the paired sample t test. The American Statistician, 45, 39–41. Mendro, R. L. (1975). A Monte Carlo Study of the Robustness of the Johnson-Neyman Technique. Annual meeting of the American Educational Research Association, Washington, DC. Methot, L. L. (1995).
Autocorrelation in single-subject data: A meta-analytic view. Unpublished Doctoral Dissertation. Kalamazoo: Western Michigan University. Milliken, G. A., and Johnson, D. E. (2002). Analysis of Messy Data, Volume III: Analysis of Covariance. New York: Chapman & Hall/CRC. Montgomery, D. C., Jennings, C. L., and Kulahci, M. (2008). Introduction to Time Series Analysis and Forecasting. Hoboken, NJ: Wiley. Morgan, D. L., and Morgan, R. K. (2001). Single Participant Research Design: Bringing science to managed care. American Psychologist, 56, 119–127. Morgan, O. W., Griffiths, C., and Majeed, A. (2007). Interrupted time-series analysis of regulations to reduce paracetamol (acetaminophen) poisoning. Public Library of Science (PLoS) Medicine, 4, e105. Naranjo, J. D., and McKean, J. W. (2001). Adjusting for regression effect in uncontrolled studies. Biometrics, 57, 178–181. O’Brien, P. C. (1984). Procedures for comparing samples with multiple endpoints. Biometrics, 40, 1079–1087. Olejnik, S. F., and Algina, J. (1985). A review of nonparametric alternatives to analysis of covariance. Evaluation Review, 9, 51–83. O’Neill, R. (2006). Patient-reported outcome instruments: Overview and comments on the FDA draft guidance. 42nd Annual meeting of the Drug Information Association, Philadelphia. Packard, G. C., and Boardman, T. J. (1988). The misuse of ratios, indices, and percentages in ecophysiological research. Physiological Zoology, 61, 1–9. Parsonson, B. S., and Baer, D. M. (1986). The graphic analysis of data. In A. Poling and R. W. Fuqua (Eds.), Research Methods in Applied Behavior Analysis: Issues and Advances (pp. 157–186). New York: Plenum. Partridge, L., and Farquhar, M. (1981). Sexual activity reduces lifespan of male fruitflies. Nature, 294, 580–581. Permutt, T. (1990). Testing for imbalance of covariates in controlled experiments.
Statistics in Medicine, 9, 1455–1462. Pocock, S. J., Geller, N. L., and Tsiatis, A. (1987). The analysis of multiple endpoints in clinical trials. Biometrics, 43, 487–498. Poling, A. D., Methot, L. L., and LeSage, M. G. (1995). Fundamentals of Behavior Analytic Research. New York: Plenum. Potthoff, R. F. (1964). On the Johnson-Neyman technique and some extensions thereof. Psychometrika, 29, 241–256. Preacher, K. J., Curran, P. J., and Bauer, D. J. (2006). Computational tools for probing interactions in multiple linear regression, multilevel modeling, and latent curve analysis. Journal of Educational & Behavioral Statistics, 31, 437–448. Rantz, W. G. (2007). The effects of feedback on the accuracy of completing flight checklists. Unpublished master’s thesis. Kalamazoo: Western Michigan University. Rantz, W. (2009). Comparing the accuracy of performing digital and paper checklists using a feedback package during normal workload conditions in simulated flight. Unpublished doctoral dissertation. Western Michigan University, Kalamazoo. Raudenbush, S. W., and Bryk, A. S. (2002). Hierarchical Linear Models: Applications and Data Analysis Methods. Thousand Oaks, CA: Sage. Rheinheimer, D. C., and Penfield, D. A. (2001). The effects of type I error rate and power of the ANCOVA F test and selected alternatives under nonnormality and variance heterogeneity. Journal of Experimental Education, 4, 373–391. Rogosa, D. (1977). Some results for the Johnson-Neyman technique. Unpublished Doctoral Dissertation. Stanford, California: Stanford University. Rogosa, D. (1980). Comparing nonparallel regression lines. Psychological Bulletin, 88, 307–321. Rogosa, D. (1981). On the relationship between the Johnson-Neyman region of significance and statistical tests of parallel within-group regressions. Educational and Psychological Measurement, 41, 73–84. Rosenbaum, P. R. (2002). Observational Studies (2nd edn.). New York: Springer-Verlag. Rosenbaum, P. R. (2010).
Design of Observational Studies. New York: Springer. Rosenbaum, P. R., and Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70, 41–55. Rosenbaum, P. R., and Rubin, D. B. (1984). Reducing bias in observational studies using subclassification on the propensity score. Journal of the American Statistical Association, 79, 516–524. Rothman, K. J. (1990). No adjustments are needed for multiple comparisons. Epidemiology, 1, 43–46. Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66, 688–701. Rubin, D. B. (1979). Using multivariate matched sampling and regression adjustment to control bias in observational studies. Journal of the American Statistical Association, 74, 318–328. Rubin, D. B. (1997). Estimating causal effects from large data sets using propensity scores. Annals of Internal Medicine, 127, 757–763. Rubin, D. B. (2001). Using propensity scores to help design observational studies: Application to the tobacco litigation. Health Services & Outcomes Research Methodology, 2, 169–188. Rubin, D. B. (2005). Causal inference using potential outcomes: Design, modeling, decisions. Journal of the American Statistical Association, 100, 322–331. Rubin, D. B. (2006). Matched Sampling for Causal Effects. New York: Cambridge University Press. Rubin, D. B. (2008). Comment. Journal of the American Statistical Association, 103, 1350–1353. Samuels, M. L. (1991). Statistical reversion toward the mean: more universal than regression toward the mean. American Statistician, 45, 344–346. Scheffé, H. (1959). The Analysis of Variance. New York: Wiley. Scherrer, M. D., and Wilder, D. A. (2008). Training to increase safe tray carrying among cocktail servers. Journal of Applied Behavior Analysis, 41, 131–135. Schluchter, M.
D., and Forsythe, A. B. (1985). Post-hoc selection of covariates in randomized experiments. Communication in Statistical Theory and Methods, 14, 679–699. Schmittlein, D. C. (1989). Surprising inferences from unsurprising observations: do conditional expectations really regress to the mean? American Statistician, 43, 176–183. Schultz, K. F. (1995). Subverting randomization in controlled trials. Journal of the American Medical Association, 274, 1456–1458. Senn, S. J. (1994). Testing for baseline balance in clinical trials. Statistics in Medicine, 13, 1715–1726. Senn, S. J. (1995). In defense of analysis of covariance: a reply to Chambless and Roeback [letter comment]. Statistics in Medicine, 14, 2283–2285. Senn, S. J. (1998). Applying results of randomized trials to patients. N of 1 trials are needed. British Medical Journal, 317, 537–538. Senn, S. J. (2005). Comment. Biometrical Journal, 47, 133–135. Senn, S. J., and Brown, R. A. (1985). Estimating treatment effects in clinical trials subject to regression to the mean. Biometrics, 41, 555–560. Shadish, W. R., Clark, M. H., and Steiner, P. M. (2008). Can nonrandomized experiments yield accurate answers? A randomized experiment comparing random and nonrandom assignments. Journal of the American Statistical Association, 103, 1334–1343. Shadish, W. R., Cook, T. D., and Campbell, D. T. (2002). Experimental and Quasi-Experimental Designs for Generalized Causal Inference. New York: Houghton Mifflin. Shields, J. L. (1978). An empirical investigation of the effects of heteroscedasticity and heterogeneity of variance on the analysis of covariance and the Johnson-Neyman technique. Technical paper No. 292, U.S. Army Research Institute for the Behavioral and Social Sciences, Alexandria, Virginia. Shirley, E. A. C. (1981). A distribution-free method for analysis of covariance based on ranked data.
Applied Statistics, 30, 158–162. Shurbutt, J., Van Houten, R., Turner, S., and Huitema, B. E. (2009). An analysis of the effects of LED rectangular rapid flash beacons (RRFB) on yielding to pedestrians using multilane crosswalks. Transportation Research Record, 2140, 85–95. Sideridis, G. D., and Greenwood, C. R. (1997). Is human behavior autocorrelated? An empirical analysis. Journal of Behavioral Education, 7, 273–293. Sidman, M. (1960). Tactics of Scientific Research. New York: Basic Books. Skinner, B. F. (1956). A case history in scientific method. American Psychologist, 11, 221–233. Slowiak, J. M., Huitema, B. E., and Dickinson, A. M. (2008). Reducing wait time in a hospital pharmacy to promote customer service. Quality Management in Health Care, 17, 112–127. Smith, R. J. (1984). Allometric scaling in comparative biology: problems of concept and method. American Journal of Physiology, 246, 152–160. Spiegelman, D. (2010). Approaches to uncertainty in exposure assessment in environmental epidemiology. Annual Review of Public Health, 31, 149–163. Steering Committee of the Physicians’ Health Study Research Group. (1988). Preliminary report: Findings from the aspirin component of the ongoing physicians’ health study. New England Journal of Medicine, 318, 262–264. Stevens, J. P. (2009). Applied Multivariate Statistics for the Social Sciences. New York: Taylor and Francis. Stoline, M. R., Huitema, B. E., and Mitchell, B. (1980). Intervention time series model with different pre- and postintervention first order autoregressive parameters. Psychological Bulletin, 88, 46–53. Stone, R. (1993). The assumptions on which causal inferences rest. Journal of the Royal Statistical Society, 55, 455–466. Stricker, G., and Trierweiler, S. J. (1995). The local clinical scientist: A bridge between science and practice. American Psychologist, 50, 995–1002. Sullivan, L. M., and D’Agostino, R. B., Sr. (2002). 
Robustness and power of analysis of covariance applied to data distorted from normality by floor effects: non-homogeneous regression slopes. Journal of Statistical Computation and Simulation, 72, 141–165.
Tamhane, A. C., and Logan, B. R. (2004). On O’Brien’s OLS and GLS tests for multiple endpoints. In Y. Benjamini, F. Bretz, and S. Sarkar (Eds.), Recent Developments in Multiple Comparison Procedures, IMS Lecture Notes and Monograph Series (pp. 76–88). Bethesda, MD: Institute of Mathematical Statistics.
Terpstra, T. J. (1952). The asymptotic normality and consistency of Kendall’s test against trend, when ties are present in one ranking. Indagationes Mathematicae, 14, 327–333.
Terpstra, J. T., and McKean, J. W. (2004). Rank-based analysis of linear models using R. Technical Report 151. Statistical Computation Lab, Western Michigan University.
Thigpen, C. C., and Paulson, A. S. (1974). A multiple range test for analysis of covariance. Biometrika, 61, 479–484.
Thorndike, R. L. (1942). Regression fallacies in the matched groups experiment. Psychometrika, 7, 85–102.
Trochim, W. M. K. (1984). Research Design for Program Evaluation: The Regression-Discontinuity Approach. Newbury Park, CA: Sage.
Trochim, W. M. K. (1990). The regression-discontinuity design. In L. Sechrest, E. Perrin, and J. Bunker (Eds.), Research Methodology: Strengthening Causal Interpretations of Nonexperimental Data (pp. 199–140). Rockville, MD: Public Health Service, Agency for Health Care Policy and Research.
Tryon, P. V., and Hettmansperger, T. P. (1973). A class of non-parametric tests for homogeneity against ordered alternatives. Annals of Statistics, 1, 1061–1070.
Van Den Noortgate, W., and Onghena, P. (2003). Hierarchical linear models for the quantitative integration of effect sizes in single-case research. Behavior Research Methods, Instruments, & Computers, 35, 1–10.
Wang, S., Huitema, B. E., Bruyere, R., Weintrub, K., Megregian, P., and Steinhorn, D. M. (2010). Pain perception and touch healing in healthy adults: A preliminary prospective randomized controlled study. Journal of Alternative Medicine Research, 2, 75–82.
Watcharotone, K. (2010). On robustification of some procedures used in analysis of covariance. Unpublished doctoral dissertation. Kalamazoo: Western Michigan University.
Watcharotone, K., McKean, J. W., and Huitema, B. E. (2010). Robust procedures for heterogeneous regression ANCOVA. Manuscript submitted for publication.
Westfall, P. H., Tobias, R. D., Rom, D., Wolfinger, R. D., and Hochberg, Y. (1999). Multiple Comparisons and Multiple Tests Using the SAS System. Cary, NC: SAS Institute.
Wilcox, R. R. (2005). Introduction to Robust Estimation and Hypothesis Testing. Burlington, MA: Elsevier Academic Press.
Wu, M-J, Becker, B. J., and Netz, Y. (2007). Effects of physical activity on psychological change in advanced age: a multivariate meta-analysis. Journal of Modern Applied Statistical Methods, 6, 2–7.
Wu, Y. B. (1984). The effects of heterogeneous regression slopes on the robustness of two test statistics in the analysis of covariance. Educational and Psychological Measurement, 44, 647–663.
Wunderlich, K. W., and Borich, G. D. (1974). Curvilinear extensions to Johnson-Neyman regions of significance and some applications to educational research. Annual meeting of the American Educational Research Association. Chicago.