Understanding Approaches to Account for Clustering of Observations

Understanding Approaches to Account for Clustering of Observations in Health Services Research

A. Russell Localio
Division of Biostatistics
Center for Clinical Epidemiology and Biostatistics
University of Pennsylvania, School of Medicine

Academy Health 2004 Annual Research Meeting
San Diego, June 6, 2004 (corrected 06/08/2004)

This project was supported in part by an Agency for Healthcare Research and Quality (AHRQ) Centers for Education and Research on Therapeutics cooperative agreement (grant # U18 HS10399) and by Agency for Healthcare Research and Quality, Grant No. R03 HS 1148101.

R. Localio, Clustered Observations in HSR, 06/05/2004: Page 1 of 80

Abstract

This outline provides an overview of many of the problems and some of the alternative solutions to account for clustering of observations within “centers” in health services research. The “center” refers to any natural or purposeful grouping of individuals. Studies are categorized across several dimensions: randomized vs observational, randomization within or across centers, fixed vs random effects, continuous or binary outcomes. Issues such as profiling of centers, confounding by center, and volume/outcome studies appear as special examples. The outline reviews and discusses the analytic alternatives and their strengths and weaknesses, as well as the software options. A bibliography offers references for materials covered as well as general guidance for further details.

Outline

§1 Applications and methods
§2 Multicenter Designs – randomization within or among centers
§3 Observational Studies – Patient or center-level factors
§4 Complex Designs – Longitudinal Analyses
§5 Analysis options
§6 Report cards and profiling
§7 Confounding by Cluster
§8 Volume-outcome studies – Correlations of fixed and random effect
§9 Crossed vs nested effects
§10 Model specification and “Interactions”
§11 Comments and Conclusions
References

§1 Applications and methods – Studying grouped data

A. Applications to which these notes are relevant
  -Multicenter clinical trials of the effect of drugs and therapeutics
  -Multicenter studies on the use or misuse of drugs or on modification of behavior or exposure of clinicians and/or patients
  -Surveys of patients
  -Profiling of physicians, clinics, hospitals, and health plans
  -Observational studies of associations of outcomes and exposures at the patient and/or physician and/or hospital level

B. Reasons for studying grouped data:
  (1) Patients are naturally grouped into clusters: hospitals, clinics, physicians, neighborhoods, communities
  (2) Analysis of groups realizes efficiencies in design
  (3) Simple random sampling designs are too expensive

C. Analytic methods
  (1) Frequentist methods
      -Survey methods – design based
      -Population averaged
      -Center (cluster) specific
  (2) Bayesian methods (Gibbs Sampling for complex analyses)
  (3) Permutation-test-based methods (assumption free)

D. Some notation and taxonomy
  Y = outcome
  X = covariates including factor of interest
  Center = Any natural grouping of individuals (a cluster)
  Centers indexed $j = 1, \dots, J$
  Patients within centers indexed $i = 1, \dots, n_j$
  Treatment/exposure – Usually the factor (X) of interest

E. Outcomes (Y)
  -Focus on continuous and binary outcomes
  -Issues for binary data usually apply for ordered categorical data and counts

F. Paradigm of generalized linear model
  $\mu = \alpha + X\beta$ -- linear combination of factors
  $\mu = h(E(y \mid X))$ -- link between outcome and linear combination

G. Time and Center – key factors in study design and analysis

                   Single time                     Multiple times
  Within center    Parallel group RCT              Longitudinal, repeated measures of patients
  Between center   Cluster randomization design    (1) Repeated cross sectional design
                                                   (2) Clustered longitudinal design

§2 Multicenter Studies – randomization within or among centers

§2.1 Randomization within centers
  -Both treated and nontreated patients within each center
  -Advantage of accrual of large samples
  -Common for FDA submissions for drug trials
  -Paradigm for
     -Studies conducted by single sponsor at one time period (multi-center RCT)
     -Multiple studies conducted by multiple investigators over different times (meta analysis) (Normand 1999)
  -Design and analysis issues (Fig 2.1)
     -Intercept -- Variation in risk across centers among the control (standard care) patients
     -Slope -- Variation in the effect size across centers
     -Balance -- Proportion assigned to treatment (and followed) across centers (can be lost from incomplete followup or missing data)
     -Sample size -- Number of centers
                  -- Number of patients per center
                  -- Variation in the sample sizes across centers

Fig 2.1 Example: (variation in baseline risk and treatment effect across centers)
[Figure 2.1: risk p (0 to 1) plotted against treatment tx (0 or 1), one line per center]
  -30 centers
  -Patients randomized within center to treatment (tx=1) or to control (tx=0)
  -Variation across centers in average risk among the control patients (variation in points at left)
  -Variation across centers in degree of improvement among treated compared to controls (slopes)

§2.2 Randomization among centers – Cluster randomization designs

Characteristics:
  -Centers are randomized to treatments (or interventions)
  -Patients receive treatment by reason of being members of centers
  -Numbers of centers usually small
  -Numbers of patients within centers usually large
  (Green 1995; Murray 1998; Campbell 1998; Cornfield 1978; Donner 2000, 1994, 1980)

Challenges
  -Bias -- Randomization at center level does not ensure balance of patient characteristics
  -Variance – Naïve analytic methods overlook the “design effect” and so overstate significance
  -Sample size estimation must consider correlation of subjects within centers
  -Reporting now subject to standards – CONSORT (Campbell 2004) (Donner and Klar 2000)

§3.1 Observational Studies – Interest in effect of treatment/exposure

Characteristics
  -Responses based on complex sampling designs with stratification, clustering, unequal probabilities of selection (NHANES, NHIS, NMES and other surveys)
  -Effect of treatment or exposure within centers (e.g., patient age)
  -Effect of exposure at the center level (e.g., hospital teaching status)

Challenges
  -Estimate the effect of exposure while
     -Controlling for confounders and
     -Adjusting for the “design effect” (ratio of variance to that under simple random sampling)
  -Design effects >1.0 common
     -clustering of respondents within sampled units
     -unequal sampling probabilities and thus need to adjust for “sampling weights”
  -Absent randomization, adjustment for confounders is usually essential to control for
     -Patient-level factors that are associated with outcome and exposure of interest
     -Center-level factors
     -Context effects – Influence of a factor at the patient level as well as at the group level
        Individual income might influence individual’s access to care in the community
        Community income might influence access to care of anyone in community
  -Bias from confounding by center

§3.2 Observational studies – Profiling -- Variation of outcomes across centers

Characteristics
  -Patients differ across centers (“case mix” problem)
  -Centers, or “outlier” centers, are the object of comparison - “profiling”

Challenges
  -Multiple comparisons without any prior estimates of which centers are “outliers”
  -Added complexity of correct and appropriate analyses
  -Intense demand for “some results” to identify “bad apples” and quality providers

§4 Complex designs -- longitudinal designs

§4.1 Repeated measures within patients over time – patients as the “center”

Characteristics
  -Patients are measured repeatedly over time
  -Patients then become the “clusters” of correlated observations

Challenges
  -Correlation structure over time becomes part of the model
  -Must fit and test models with alternative correlation structures
  -Not a focus of this talk. New texts on applied statistics are helpful (Fitzmaurice, Laird, Ware 2004; Diggle 2002, 1994)

§4.2 Repeated cross sectional studies – groups of patients as the center

[Figure 4.2: Pre-Post Intervention Cluster Randomization -- Example -- 50 persons per time per site. Panels tx==0 and tx==1; y-axis: mean proportion (0 to 1); x-axis: time (0, 1)]

Characteristics (Fig 4.2) (Feldman 1984)
  -Same treatment/exposure assignment to all persons within a center
  -Patients clustered within centers, but individuals are not followed over time
  -Subsequent time periods involve different sets of patients in the same centers
  -2 sets of clusters randomized to treatment (tx=1) and control (tx=0) (simulated data)
  -Variation of center-specific rates at baseline (time = 0)
  -Variation over time among centers randomized to control but no overall improvement
  -Variation over time among centers randomized to treatment and overall improvement
  -Increased validity of estimates of treatment effect because of “control” centers
  -Decrease in power from clustering of patients in centers offset by increased power of repeated measures over time

Challenges
  -Design effect from cluster randomization
  -Model estimates of interest are the time*treatment interaction terms

§4.3 Clustered cohort designs

Characteristics
  -Patients are clustered within centers
  -Treatment (exposure) applied to entire center
  -Individual patients followed over time and measured repeatedly
  -Added power of having each patient serve as his/her own control
  -Validity of having simultaneous control group of centers followed over time

Challenges
  -Two levels of clustering with different degrees of correlation
     Within patient over time
     Among patients within centers

§5 Analysis options – strengths and limitations

§5.0 Preliminary distinctions

§5.0.1 Fixed vs Random effects

Define carefully what is “fixed” and what is “random”

Intercept
  -Fixed effect -- Centers in sample assumed to represent only themselves
     Model variation across centers by a separate intercept for each center:
     $\mu_{ij} = \alpha_1 + \dots + \alpha_k + \beta \, tx$
  -Random effect -- Centers in sample represent a population of centers
     Variation across centers assumed to follow a distribution, often normal ($U_j$). Constant treatment effect $\beta$:
     $\mu_{ij} = \alpha + U_j + \beta \, tx$

Slope
  -Same treatment effect at each center: $\beta$
  -Variation across centers and variation of treatment effect across centers ($\Delta_j$). Assume a distribution, often normal:
     $\mu_{ij} = \alpha + U_j + \beta \, tx + \Delta_j \, tx$
  (Longford 1993)

§5.0.2 Population-averaged (PA) vs Center-Specific (CS) models
(Liang 1993; Neuhaus 1991; Graubard 1994; Diggle 2002; Burton 1998; Hu 1998; Carlin 2001)

Population-averaged data model -- center effect enters the residual variances:
  $Y_{ij} = \alpha + \beta_{PA} \, x_i + \varepsilon_{ij}$, where $\mathrm{cov}(\varepsilon) = \sigma_e^2 I + \sigma_u^2 J$, with $J$ block diagonal (a block of ones for each center)
  There are two sources of error variance: random error across patients ($\sigma_e^2$) and variation across centers ($\sigma_u^2$)
  $\mu_{ij} = h[E(Y_{ij} \mid x_i)]$, i.e., the marginal mean = average response given the covariate(s)
  $\hat\beta_{PA}$ measures the effect of the intervention/exposure averaged over centers

Center-specific data model – center effect is explicit in the model:
  $Y_{ij} = \alpha + \beta_{CS} \, x_i + U_j + \varepsilon_{ij}$
  $U_j$ represents variation of outcome across centers not accounted for by covariates
  $U_j \sim N(0, \sigma_u^2)$; $\varepsilon_{ij} \sim N(0, \sigma_e^2)$; $\mathrm{cov}(U_j, \varepsilon_{ij}) = 0$
  $\mu_{ij} = h[E(Y_{ij} \mid x_i, U_j)]$, i.e., the conditional mean is the response for a covariate pattern and center
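
The difference between the marginal (PA) and conditional (CS) means defined above can be checked numerically for a logit link. The sketch below is illustrative only: the values of `alpha`, `beta_cs`, and `sigma_u` are assumptions chosen for the example, not taken from these notes. It integrates the center-specific risk over a normal random intercept by Gauss-Hermite quadrature and compares the resulting marginal log odds ratio with the center-specific one. The closed-form comparison value is a widely used approximation for the logistic-normal model and is likewise not part of the source.

```python
import numpy as np

def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-z))

def logit(p):
    return np.log(p / (1.0 - p))

def marginal_prob(eta, sigma_u, n_nodes=60):
    """E[expit(eta + U)] for U ~ N(0, sigma_u^2), by Gauss-Hermite quadrature."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    values = expit(eta + np.sqrt(2.0) * sigma_u * nodes)
    return np.sum(weights * values) / np.sqrt(np.pi)

# Illustrative (assumed) values: baseline log odds, CS log odds ratio,
# and standard deviation of the random intercept across centers.
alpha, beta_cs, sigma_u = -1.0, 1.0, 1.5

p0 = marginal_prob(alpha, sigma_u)            # marginal risk at x = 0
p1 = marginal_prob(alpha + beta_cs, sigma_u)  # marginal risk at x = 1
beta_pa = logit(p1) - logit(p0)               # population-averaged log odds ratio

# Common closed-form approximation: beta_PA ~ beta_CS / sqrt(1 + c^2 sigma_u^2)
c = 16.0 * np.sqrt(3.0) / (15.0 * np.pi)
beta_pa_approx = beta_cs / np.sqrt(1.0 + c**2 * sigma_u**2)

print(f"beta_CS = {beta_cs:.3f}, beta_PA = {beta_pa:.3f}, approx = {beta_pa_approx:.3f}")
```

With no random intercept (`sigma_u = 0`) the two log odds ratios coincide; the gap grows with the between-center variance.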

Interpretation: (Hu 1998)
  $\hat\beta_{PA}$ = change in E(y) for a change in x from baseline (x=0) to the comparison level (x=1), adjusted for covariates, for the population of individuals at x=0 compared to the population at x=1
  $\hat\beta_{CS}$ = change in E(y) for an individual if he/she were to change from x=0 to x=1 (adjusted for covariates)

The same interpretation applies to center-level factors, such as hospital size or teaching status, but changes in these center-level factors are more difficult to explain with the CS model. For example, think of the interpretation of a factor x=0/1, urban/rural. How can one describe the change of a hospital from urban to rural?

Estimation:
  PA – GEE, robust regression (Diggle 2002, chapters 7, 8)
     -Regression uses robust variance estimates to allow for correlation of observations within center
     -Appropriate when the interest lies not in changes over time or within center but across centers
  CS -- Random or mixed effects models; conditional methods
     -When covariates vary within center, CS methods might be more satisfactory

  -Linear models: PA and CS estimates should be similar
  -Poisson models: PA and CS intercepts will differ; $\hat\beta$ (ln(RR)) should be similar
  -Logistic models: $\hat\beta_{PA} < \hat\beta_{CS}$

This result (PA estimates are attenuated relative to CS) in logistic models has two analogies:
  (1) Logistic regression with an omitted covariate produces estimates attenuated toward the null (Gail 1984)
  (2) In the absence of conventional confounding, unconditional (compared to conditional) estimates will be attenuated toward the null = noncollapsibility of the odds ratio (Gail 1984)

Model misspecification – when the data and analysis models differ
  If the CS model is true and PA methods are used
     -Attenuated estimates of $\hat\beta$
     -But $\hat\beta / se(\hat\beta)$ should be appropriate
     -As one adds covariates, the degree of attenuation of the OR should lessen
  If the PA model is true and CS methods are used (remains for further work)

Choice of PA vs CS – Still controversial, with each having its adherents
  CS -- Lindsay & Lambert 1998; Goldstein 2002
  PA -- Fitzmaurice, Laird, Ware 2004
  Carlin (2001) has a good discussion of practical issues

§5.0.3 Permutation-test based methods (Good 2000; Gail 1996)

What are they?
  (1) Arrive at a test statistic based on observed data, e.g., difference in treatment means
  (2) Compute the test statistic for the observed data
  (3) Permute the observations across the treatments in all possible ways to arrive at a distribution of the test statistic under the null hypothesis – that there is no treatment difference
  (4) Compare the observed test statistic to the distribution under the null and determine how far out on the tail of this distribution the observed statistic lies
  (5) Estimate a confidence bound for the observed test statistic by adding (and subtracting) very small and increasing values until the upper (and lower) bounds are just statistically significant when compared to the null distribution (algorithms can do this computer-intensive process efficiently)
  (7) Covariates are handled by defining test statistics based on residuals after adjusting for covariates
  (8) In theory these methods should provide statistical tests, estimates, and confidence intervals

§5.0.4 Bayesian methods (Gelman 2004)

As contrasted with “classical” (frequentist) approaches

Starting with mixed effects models, Bayesian methods add prior distributions to the model parameters. For example, for the simple random intercept model:
  $\beta \sim N(0, 1/0.001)$; $1/\sigma^2 \sim \mathrm{gamma}(a = 0.001, b = 0.001)$

Using intensively iterative procedures (e.g. Gibbs Sampling) characterized by:
  -a preliminary “burn in” sequence and
  -a subsequent sequence that is followed to some equilibrium values of the parameter estimates
  -prior distributions selected to be “flat” or “noninformative” so that the data overwhelm the prior distribution (of the estimates) to arrive at a posterior
  -use of diagnostics to determine convergence of sequences
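
The Gibbs sampling procedure described for §5.0.4 can be sketched for a continuous outcome with a random intercept. This is a minimal illustration, not the implementation used by BUGS or any other package: the data are simulated, all names and numerical settings (`J`, `n`, `n_burn`, `n_keep`) are assumptions, and the priors follow the vague N(0, 1/0.001) and gamma(0.001, 0.001) forms given in the text, applied here to the intercept and to the two precisions.

```python
import numpy as np

rng = np.random.default_rng(2004)

# Simulated (hypothetical) data: J centers with n patients each,
# y_ij = alpha + U_j + e_ij
J, n = 30, 30
center = np.repeat(np.arange(J), n)
u_true = rng.normal(0.0, 1.0, J)
y = 2.0 + u_true[center] + rng.normal(0.0, 1.0, J * n)
N = y.size

prior_prec_alpha = 0.001      # alpha ~ N(0, 1/0.001)
a, b = 0.001, 0.001           # gamma(a, b) priors on the precisions
n_burn, n_keep = 1000, 4000   # "burn in" draws, then retained draws

alpha, u = y.mean(), np.zeros(J)
prec_e, prec_u = 1.0, 1.0
draws = np.empty((n_keep, 3))  # columns: alpha, sigma_u, sigma_e

for it in range(n_burn + n_keep):
    # U_j | rest: normal, shrinking each center's residual mean toward 0
    post_prec = n * prec_e + prec_u
    resid_sum = np.bincount(center, weights=y - alpha, minlength=J)
    u = rng.normal(prec_e * resid_sum / post_prec, 1.0 / np.sqrt(post_prec))

    # alpha | rest: normal, combining the data with the vague prior
    post_prec_alpha = N * prec_e + prior_prec_alpha
    mean_alpha = prec_e * np.sum(y - u[center]) / post_prec_alpha
    alpha = rng.normal(mean_alpha, 1.0 / np.sqrt(post_prec_alpha))

    # Precisions | rest: gamma (NumPy parameterizes by shape and scale = 1/rate)
    sse = np.sum((y - alpha - u[center]) ** 2)
    prec_e = rng.gamma(a + 0.5 * N, 1.0 / (b + 0.5 * sse))
    prec_u = rng.gamma(a + 0.5 * J, 1.0 / (b + 0.5 * np.sum(u ** 2)))

    if it >= n_burn:
        draws[it - n_burn] = alpha, 1.0 / np.sqrt(prec_u), 1.0 / np.sqrt(prec_e)

post_mean = draws.mean(axis=0)
print("posterior means (alpha, sigma_u, sigma_e):", np.round(post_mean, 3))
```

The draws of `alpha` and the `u` values are strongly correlated in this parameterization, so convergence diagnostics of the kind the text calls for remain necessary even in this simple case.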

§5.0.5 Bootstrap methods (Good 2001; Carpenter 2000; Diaconis 1983; Efron 1991)

Resampling the sample repeatedly (with replacement) to ascertain robust estimates that ask: What would be the range of estimates if we had many samples from the same underlying population?

These can be:
  Nonparametric -- Resampling residuals adjusted to have a “correct” covariance structure consistent with the study design
  Parametric – Resampling is based on a set of random effects $U_j$ and errors $e_{ij}$ sampled from their respective distributions (assumed to be normal)

But these methods, as applied to multi-center data, need more exposition

§5.0.6 Randomized vs observational studies – Treatment/Exposure

Exposure in an observational study can be handled as “treatment” in a randomized design, with due attention to issues:
  (1) Causation is more difficult to demonstrate
  (2) Confounding by observed and unobserved factors

§5.1 Analysis options: Stratified 2 by 2 tables in presence of clustering

  -Goal is stratified analyses (outcome*exposure by age category)
  -Observations are not independent owing to clustering
  -Solution: Reduce the effective sample by the “design effect”
     Reduce the Mantel-Haenszel $\chi^2$ statistic by the design effect
     Apply standard statistical tables to adjusted statistic (Rao 1992)
  -Simple solution using standard software and adjusted chisq statistic

[Table: centers 1, 2, ..., K (rows) by exposure category 1 to 5 (columns); observations a, b fall in center 1, c, d in center 2, and e, f, g, h in subsequent centers]

In the absence of clustering, the observations a, b, c, d, …, h should be independent. But in the presence of clustering, a and b are more closely related to each other than are a and c through h. The absence of independence means that standard chisq tests are not appropriate.
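
The design-effect adjustment of §5.1 is simple arithmetic, sketched below. Two pieces are assumptions rather than content of these notes: the design effect is computed with the usual equal-cluster-size formula 1 + (m - 1) × ICC, and the cluster size, intraclass correlation, and chi-square value are made-up inputs. The function names are hypothetical.

```python
import math

def design_effect(mean_cluster_size, icc):
    """Assumed formula for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1.0 + (mean_cluster_size - 1.0) * icc

def chisq1_pvalue(chisq):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chisq / 2.0))

def adjusted_chisq(chisq, deff):
    """Divide the naive statistic by the design effect, then use standard tables."""
    adjusted = chisq / deff
    return adjusted, chisq1_pvalue(adjusted)

# Hypothetical inputs: 50 patients per center, ICC = 0.02, naive MH chi-square = 6.0
deff = design_effect(50, 0.02)
naive_p = chisq1_pvalue(6.0)
adj_stat, adj_p = adjusted_chisq(6.0, deff)

print(f"design effect = {deff:.2f}")
print(f"naive:    chisq = 6.00, p = {naive_p:.4f}")
print(f"adjusted: chisq = {adj_stat:.2f}, p = {adj_p:.4f}")
```

Even a small intraclass correlation roughly halves the statistic here, because the design effect grows with the number of patients per center.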

§5.2 Multicenter studies with intervention/exposure varying within centers

While the following comments focus on designed interventions, they also apply to observational studies in which the exposure of interest is contrasted within center

§5.2.1 Pooled vs stratified methods -- with emphasis on binary outcomes

Underlying issues are:
  (a) Does risk in baseline/reference (unexposed) persons vary across centers?
  (b) Does the effect of intervention vary across centers? = interaction
  (c) Are the centers “fixed” or “random”?

Common practice: treat all centers as a single center if intervention or exposure is “balanced” within center. “Naïve pooling”

But what if:
  -Followup is incomplete across centers?
  -Inferences of interest are within subgroups (interaction of intervention and gender)?
  -Populations differ across centers so that baseline risks differ?
  -Centers differ in size?
  -Treatment effects seem to differ across centers?
  (Agresti 2000; Senn 1998)

§5.2.2.1 Methods in which the center effect is estimated

Fixed effects models – Indicator variable for each center
  $\mu = \alpha_1, \dots, \alpha_J$ (assuming no treatment by center interactions)
  -Can estimate baseline risk/outcome in unexposed or control group in each center
  -Too few patients per center results in badly biased estimates
  -Low power because of large number of terms for estimation
  -Requires very large sample of patients per center
  -Adding interaction terms (centers*treatment) to allow for variation in effect across centers adds J-1 more degrees of freedom to model and requires even larger samples
  -Regular regression software applicable
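
For a continuous outcome, the fixed effects model with an indicator variable for each center can be fit with ordinary least squares. The sketch below uses simulated data (all values are assumptions for illustration) and shows that the treatment coefficient from the indicator-variable regression equals the within-center estimate obtained by subtracting center means from outcome and treatment.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated (hypothetical) data: J centers, n patients each, treatment varies within center
J, n = 8, 40
center = np.repeat(np.arange(J), n)
tx = rng.integers(0, 2, J * n).astype(float)
alpha_j = rng.normal(0.0, 1.0, J)              # center-specific baselines
y = alpha_j[center] + 0.5 * tx + rng.normal(0.0, 1.0, J * n)

# (1) Indicator variable for each center (no common intercept) plus treatment
X = np.column_stack([np.eye(J)[center], tx])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
beta_indicator = coef[-1]                      # treatment effect
baseline_by_center = coef[:J]                  # estimated baseline in each center

# (2) Within-center estimate: remove center means, then regress y on tx
def demean(v):
    means = np.bincount(center, weights=v, minlength=J) / np.bincount(center, minlength=J)
    return v - means[center]

tx_w, y_w = demean(tx), demean(y)
beta_within = np.sum(tx_w * y_w) / np.sum(tx_w ** 2)

print(f"indicator-variable estimate: {beta_indicator:.4f}")
print(f"within-center estimate:      {beta_within:.4f}")
```

The equivalence holds for linear models only; for binary outcomes with few patients per center, the indicator approach runs into the bias noted above.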

Random intercept models
  -Assume that each center’s baseline value (risk) varies about the average following a distribution ($\mu = \alpha + X\beta + U_j$; $U_j \sim N(0, \tau)$)
  -This assumption:
     Reduces degrees of freedom (fewer parameters to be estimated)
     Allows for greater than sampling variation of baseline risk across centers
  -Many software options: SAS (Mixed); Stata (xtreg, xtlogit …); MLWin; HLM
  -Issues of concern in implementation
     Will frequentist methods estimate the components of variance?
     Are Bayesian methods necessary to allow for uncertainty in estimates?
     Software options are BUGS 0.6, WinBUGS 1.4, or MLWin

Random slope models: When the treatment/exposure effects vary across centers (as contrasted with varying baseline risks)
  -Fewer degrees of freedom than comparable fixed effects model
  -Commonly seen in meta analyses (but issues apply to all studies)
     Studies from different protocols and across widely different populations can expect different effects of treatment or exposure
  -Sometimes seen in multi-center studies and observational studies
     Reason for variation in effect across centers needs explanation

Random slope models (cont)

Methods:
  (a) DerSimonian & Laird - simple, readily available (Stata “metan”; Rev Man)
      Used primarily in meta analyses, but it can apply to multicenter RCTs and to observational studies
      Can produce biased estimates
  (b) Mixed effects models for continuous outcomes
      Generalized linear mixed effects models for binary, Poisson, and ordered categorical outcomes
      Treatment is a “fixed effect”
      Center is a random effect (random intercept representing baseline variation)
      Treatment at center level has random slope

$\mu_{ij} = \alpha + \beta_1 \, tx_{ij} + U_j + \Delta_j \, tx_{ij}$, where
  -j indexes the center and i indexes the patient within center,
  -$\alpha$ represents a “fixed intercept”, the average baseline across centers, and $\beta_1$ represents the “fixed slope”, the average treatment effect across centers
  -$U_j \sim N(0, \tau_1)$ is a random intercept representing variation of the baseline effect across centers. Assumption saves degrees of freedom
  -$\Delta_j \sim N(0, \tau_2)$ is a random “slope” representing variation of the treatment effect across centers. Assumption saves degrees of freedom

Confidence intervals for $\beta_1$ are generally wider than for the same estimate from a fixed effects analysis

Software options
  -Continuous outcomes
     -SAS Mixed (popular and well documented)
     -MLWin; HLM
     -Splus and R (lme) (Venables 2002; Everitt 2001)
  -Binary outcomes
     -Quadrature is recommended
        SAS NLMIXED
        Splus 6.2 (correlated data library)
     -Approximations (recommended only for many large centers)
        SAS glimmix
        R (glmmPQL)
        MLWin (PQL)
        HLM
        (Works well with many centers; awaiting simulations for smaller numbers of smaller centers)

Additional concerns: Attention to reporting and explaining differences -- just as with any instance of “effect modification” across strata of a covariate
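
Method (a) for random slope models, the DerSimonian and Laird estimator, needs only a treatment effect estimate and its variance from each center. The sketch below is a plain moment-based version written for illustration, not the code behind Stata “metan” or RevMan; the function name and the center-level log odds ratios and variances are hypothetical.

```python
import numpy as np

def dersimonian_laird(effects, variances):
    """Moment estimator of between-center variance (tau^2) and the pooled effect."""
    y = np.asarray(effects, dtype=float)
    v = np.asarray(variances, dtype=float)
    w = 1.0 / v                                   # inverse-variance (fixed effect) weights
    fixed = np.sum(w * y) / np.sum(w)
    se_fixed = np.sqrt(1.0 / np.sum(w))
    q = np.sum(w * (y - fixed) ** 2)              # Cochran's Q heterogeneity statistic
    k = y.size
    denom = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (k - 1)) / denom)        # truncated at zero
    w_star = 1.0 / (v + tau2)                     # random effects weights
    pooled = np.sum(w_star * y) / np.sum(w_star)
    se_pooled = np.sqrt(1.0 / np.sum(w_star))
    return {"fixed": fixed, "se_fixed": se_fixed,
            "pooled": pooled, "se_pooled": se_pooled, "tau2": tau2}

# Hypothetical center-specific log odds ratios and their variances
log_or = [0.10, 0.35, -0.05, 0.60, 0.25]
var_log_or = [0.04, 0.02, 0.05, 0.03, 0.04]

result = dersimonian_laird(log_or, var_log_or)
for name, value in result.items():
    print(f"{name:>9s} = {value:.4f}")
```

When the estimated between-center variance is positive, the pooled standard error exceeds the fixed effect one, in line with the wider confidence intervals noted above.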

§5.2.2.2 Marginal methods in which the baseline center effect is not estimated

Here the analysis model is: $\mu_{ij} = \alpha + X\beta + \varepsilon_{ij}$
Variances are adjusted to account for the inflated error from having multiple centers

Assumptions:
  -Effect of intervention does not vary across centers
  -Random intercept ($U_j$) is a nuisance parameter not in need of estimation
  -Must adjust confidence intervals to allow for excess variance

Common methods:
  -Generalized estimating equations (GEE)
     SAS Genmod
     Stata xtgee
     Splus 6.2 (correlated data library – download)
  -Survey methods (the equivalent of GEE with independence correlation)
     SUDAAN
     Stata xtgee
  -Resampling methods – if done at the center level
     Stata “bs” “bootstrap” functions
     Splus 6.2 – resample library

§5.2.2.3 Conditional methods in which the baseline center effect is not estimated

Conditional regression -- For estimating within-center effects
  Variance – typically less of a problem for RCTs, but all estimates assume “fixed effects”, i.e., that these centers are fixed and not a sample from a larger population of centers
  Bias – the principal concern in analysis

Binary outcomes
  -Central issue involves noncollapsibility of the odds ratio
  -Must stratify an analysis even in the absence of imbalance of treatment across centers
  -The pooled OR will differ from the stratified OR and be attenuated towards 1.0
  -Mantel-Haenszel for binary outcomes stratified by center
     Estimates for odds ratios (OR), relative risk (RR) and risk difference (RD)
  -Conditional logistic regression to control for patient-level covariates
  -Easy to compute using many standard software packages (SAS, Stata, StatXact)
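
The noncollapsibility point for binary outcomes can be seen in a small made-up example. The two 2×2 tables below are hypothetical: treatment is balanced 1:1 within each center and the odds ratio is 4 in both, but the baseline risks differ (0.2 vs 0.8 among controls). The sketch compares the Mantel-Haenszel odds ratio stratified by center with the odds ratio from the naively pooled table.

```python
# Each table: (treated events, treated non-events, control events, control non-events)
tables = [
    (50, 50, 20, 80),     # center 1: 100 treated, 100 controls, control risk 0.2
    (160, 10, 136, 34),   # center 2: 170 treated, 170 controls, control risk 0.8
]

def odds_ratio(a, b, c, d):
    return (a * d) / (b * c)

def mantel_haenszel_or(tables):
    """Mantel-Haenszel summary odds ratio across strata (centers)."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

center_ors = [odds_ratio(*t) for t in tables]
mh_or = mantel_haenszel_or(tables)

# Naive pooling: add the cells across centers and treat as a single table
pooled_table = tuple(sum(cells) for cells in zip(*tables))
pooled_or = odds_ratio(*pooled_table)

print("center-specific ORs:", center_ors)
print(f"Mantel-Haenszel OR:  {mh_or:.3f}")
print(f"pooled OR:           {pooled_or:.3f}")
```

The pooled odds ratio falls between 1 and the common center-specific value even though treatment is perfectly balanced within each center, so the attenuation here is not confounding by center.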

Ordered or continuous outcomes
  -Stratified linear rank tests – Permutation tests and their special cases
     Wilcoxon Rank Sum; Normal Scores (van der Waerden) (StatXact; SAS)
  -Permutation tests for any outcome
     Stata “permute” function with stratification
  -Fixed effects regression – Within-center effects of treatment (Stata “xtreg, fe”)

Survival data
  -Stratified logrank test (StatXact, Stata)
  -Stratified Cox regression (Stata, Splus, R)

§5.3 Cluster randomized designs (Atienza 2002; Donner 2000; Murray 1998, 2001)

§5.3.1 Single-time study (Donner 2000; Green 1995, 1997)

Simple methods: For both continuous and binary outcomes
  -t-tests (2-sample) – use mean of each center as an observation
     The method is simple but can be conservative (p-values too large)
     Can adjust for covariates (see permutation-test-based methods)
  -Adjusted chisq test (Donald 1987; Donner & Klar 1994)
     For the association of Y and X across centers

Generalized linear models to adjust for covariates at the patient level
  -Generalized estimating equations (GEE) (Bellamy 2000)
     These will work well when the number of centers is large, but are anticonservative (p-values too small) with few centers
     Population averaged models seem more appropriate because the contrast is between rather than within centers
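
The simplest analysis above, using the mean of each center as one observation, can be sketched together with an exact permutation test of the same contrast. The center-level means below are hypothetical (five centers per arm), and the code is an illustration rather than a substitute for Stata “permute” or StatXact.

```python
import itertools
import numpy as np

# Hypothetical center-level mean outcomes (one value per randomized center)
means_tx = np.array([0.62, 0.55, 0.70, 0.58, 0.66])
means_ctl = np.array([0.50, 0.57, 0.45, 0.52, 0.49])

def two_sample_t(x, y):
    """Pooled-variance two-sample t statistic on center means."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)
    return (x.mean() - y.mean()) / np.sqrt(pooled_var * (1.0 / nx + 1.0 / ny))

t_stat = two_sample_t(means_tx, means_ctl)     # refer to t with J - 2 = 8 df
observed = means_tx.mean() - means_ctl.mean()

# Exact permutation distribution: every way of labelling 5 of the 10 centers "treated"
all_means = np.concatenate([means_tx, means_ctl])
n_total, n_tx = len(all_means), len(means_tx)
perm_diffs = []
for chosen in itertools.combinations(range(n_total), n_tx):
    mask = np.zeros(n_total, dtype=bool)
    mask[list(chosen)] = True
    perm_diffs.append(all_means[mask].mean() - all_means[~mask].mean())
perm_diffs = np.array(perm_diffs)

# Two-sided p-value: share of relabellings at least as extreme as the observed one
p_perm = np.mean(np.abs(perm_diffs) >= abs(observed) - 1e-12)

print(f"difference in center means = {observed:.3f}, t = {t_stat:.2f}")
print(f"{len(perm_diffs)} relabellings, exact permutation p = {p_perm:.4f}")
```

With only ten centers there are 252 possible assignments, so the smallest attainable two-sided p-value is 2/252, which is one way to see why few centers limit what any method can show.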

  -Center-specific methods – penalized quasi likelihood (PQL – SAS Glimmix; MLWin)
     Works better than GEE with few centers but still problematic with few centers because of bias in estimating the intraclass correlation coefficient (ICC) (Bellamy 2000)
  -Quadrature methods (SAS NLMIXED; Stata gllamm; Splus 6.2 (correlated data library)) – Performance remains for further analysis
  -Other approximations (HLM Laplace) – Performance remains for further analysis
  -Bayesian methods (BUGS 0.6, WinBUGS etc) – Performance remains for further analysis
  -Permutation tests
     -These can be configured to allow for covariates by permuting the residuals (Braun 2001; Gail 1996)
     -Confidence intervals – obtained by “inverting the test” (find the bounds of the value of the effect size [e.g. risk difference or difference in means] such that the results of the permutation test are exactly p=0.05) (Good 2000; §3.2)
     -Strong theoretical justification -- strength
     -Absence of standard software -- weakness

§5.3.2 Repeated cross sectional designs – Analysis options (Bellamy 2000)

  -Patients are not followed as individuals, but centers are followed over time
  -Single level of clustering
  -Interest lies in the time*treatment interaction (does the treatment group of centers improve more than the control group?)
  -Assume (1) a random intercept (variation across centers)
           (2) a random slope (variation in the treatment effect over time)
  -Continuous outcomes
     REML -- Widely available in SAS (mixed), Splus (lme), R (lme)
     RIGLS – Used in MLWin
     HLM (Widely used but requires multistep data setup)
     Quadrature – Stata (gllamm); SAS (nlmixed)
        These are far slower than REML algorithms and do not offer improved performance in results
  -Binary outcomes
     -PQL (penalized quasi likelihood) – MLWin, SAS (glimmix), R (glmmPQL), Splus (correlatedData)
        PQL has been criticized for most (McCulloch 2002) or all (Neuhaus 2001) applications, but performs adequately with large centers and modest random effects. Performance becomes unsatisfactory (coverage <0.9 for 95% confidence intervals) with 10 or fewer centers and even modest std deviation of random effects
     -Laplace approximation (HLM) – Performance needs further evaluation
     -Adaptive quadrature – SAS NLMIXED; Stata gllamm; Splus 6.2 (correlatedData)
        Offers improved performance at the expense of much slower execution times
        Also results in undercoverage (not as severe) with smaller numbers of centers and increased variability of baseline risks (random intercept) and treatment effects (random slope)
        In some applications, results can be sensitive to the number of quadrature points
        Do not rely on default quadrature points. Try 8, 12, 16.
     -Bayesian methods
        BUGS 0.6
        WinBUGS 1.4 – Very poor data entry
        R (bugs.R) (Gelman 2004) -- A front end for WinBUGS with vastly improved input and output
        MLWin 2.0
        These offer best performance with acceptable execution times (based on work in progress)
     -Permutation-based methods (Rosenbaum 2002)
        -These methods make no assumptions about parametric distributions; do not have to assume that $\hat\beta / se(\hat\beta) \sim N(0, 1)$
        -Covariates? – Fit a linear model of the outcome (Y) as a function of (X), except for treatment, and apply a permutation-test based method on the residuals. Covariates might be (a) patient-level factors that differ across center, (b) baseline risk at each center
        -Confidence intervals can be obtained by “inverting the test”, i.e., searching for values $(\theta_L, \theta_U)$ of the test statistic that exactly coincide with the rejection region of the null distribution. This search can be very inefficient (requiring 1000s of permutations if not done efficiently). One option is reported by Garthwaite (1996) – “Robbins-Monro search”
        -For centers of unequal size, weighting methods are possible (Braun 1999)
        -Limitations – Unbalanced data (unequal numbers of centers in intervention/control groups)
        -Software – No easy-to-use software. Stata (permute) might offer a solution

Overall comments on repeated cross-sectional designs
  -Many analyses are tractable using standard software
  -Major challenges – Too few centers for analysis
  -Much additional work needed on performance of alternative methods

§5.3.3 Clustered cohort design – Analysis options

Model the multiple layers of clustering – time within patient, patient within center
Nested modeling cannot ignore the multiple layers (Ten Have 1999)

What is the more appropriate perspective: population averaged or center-specific?
  -The question is: What is the improvement to an individual within a center over time (subject to treatment), and how does this improvement compare to what the same individual would experience in another center (not subject to treatment)?
  -CS models are likely more appropriate in this context

Analysis paradigm: Multilevel modeling – nested random effects (Sullivan 1999)
  -Continuous outcomes
     REML (MLWin, SAS proc mixed, Splus (lme), R (lme))
     -Fast and efficient algorithms
     -Syntax for SAS is especially flexible and documentation is ample
-Binary outcomes
  -PQL (penalized quasi-likelihood) – SAS glimmix, MLWin, R (glmmPQL)
    These methods fail when (as is usual) the number of repeated measures within patients over time is small. Not recommended
  -Laplace approximation (HLM) – Performance uncertain with this application
  -Adaptive quadrature – Stata (gllamm); Splus 6.2 (Correlated Data Library)
    (SAS nlmixed not an option – handles only a single level of random effects)
    Far better than PQL in terms of bias and coverage
  -Non-linear mixed effects models (Splus, R “nlme”) (Venables 2002)
    (Performance in these applications needs assessment)
  -Alternating logistic regression (Carey 1993) – SAS GENMOD
    (Performance in these applications needs assessment)
-Any outcome
  -Permutation-based methods (Gail 1996)
    Works well with additive models – for differences (between treatment groups) of differences (within patient over time). Covariates not a problem
    Performance for logit models with few repeated measures within patient is uncertain. Donner & Klar (2000) dismiss this option.
  -Bayesian methods: BUGS (Spiegelhalter 1997); WinBUGS; bugs.R; MLWin; (S-plus 6.2 (Bayes) – problematic as of this writing)
    Perhaps the easiest and most promising methods (work in progress)
  -Bootstrap resampling
    Remains for further investigation as to implementation and performance

§5.4 Surveys – complex survey – stratified and clustered
Characteristics of survey data:
-Stratification – strata of different sizes with variable numbers of clusters
-Clustering – “primary sampling units” (PSUs) within strata (of different sizes)
-Unequal sampling fractions of PSUs and of individuals within PSUs → weighted data
-National surveys come with survey designs and weights in the datasets
[Figure: diagram labeled “Large stratum”, “Multiple PSUs per stratum”, “Small stratum”]
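The step from unequal sampling fractions to weighted data can be sketched in a few lines. The example below is an added illustration, not part of the original outline. It assumes that each person's sampling weight is the inverse of the product of the PSU sampling fraction and the within-PSU sampling fraction; all numbers and names (`psu_fraction`, `within_fraction`, `weighted_mean`) are hypothetical. It produces only a point estimate; design-based standard errors require survey software such as Sudaan or Stata (svy).

```python
# Hypothetical data: four sampled individuals drawn from two PSUs.
outcome = [1, 1, 0, 1]                   # binary outcome for each sampled person
psu_fraction = [0.5, 0.5, 0.1, 0.1]      # assumed probability that the person's PSU was sampled
within_fraction = [0.2, 0.2, 0.5, 0.5]   # assumed probability of selection within the PSU


def sampling_weights(psu_fraction, within_fraction):
    """Weight = 1 / (overall selection probability) for each person."""
    return [1.0 / (f1 * f2) for f1, f2 in zip(psu_fraction, within_fraction)]


def weighted_mean(values, weights):
    """Weighted mean of values using the supplied sampling weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


weights = sampling_weights(psu_fraction, within_fraction)
unweighted = sum(outcome) / len(outcome)
weighted = weighted_mean(outcome, weights)
print(f"unweighted prevalence: {unweighted:.3f}")
print(f"weighted prevalence:   {weighted:.3f}")
```

In this made-up sample the people from the lightly sampled PSU each stand for more of the population, so the weighted prevalence differs from the raw sample proportion.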
Software options for analysis of survey data:
-Sudaan: means, ratios, totals, regression, logistic regression, log linear models, survival
-Stata (svy): means, ratios, totals, regression, logistic regression, multinomial logit, ordered logit, Poisson, negative binomial
-SAS (some limited options)
Interpretation is population averaged (Hansen 1953; Kish 1965; Korn & Graubard 1999)

§5.4.1 Survey methods for non-survey applications (LaVange 2001)
Application of survey methods to
-Multicenter clinical trials
-Repeated measures within patients over time
-Multiple outcomes within patients
-Nested clusters (repeated measures within patients, who are clustered within centers)
Potential for arriving at a “population averaged” estimate accounting for multilevel data
Remains for extensive study via simulations of performance with varying:
-Number and size of strata
-Number and size of centers
-Number of subjects, overall and per center
-Balance across treatments and exposures
-Prevalences of factors and covariates

§5.5 Observational “hierarchical” models with center- and patient-level factors of interest
“Hierarchical models” – the term refers to a range of analyses including mixed effects models. This section focuses on observational studies.
A large literature on “hierarchical models” from social and health sciences, mostly on continuous outcomes (Goldstein 1995; Leyland 2001; Raudenbush 2002)
Analysis options are similar to those previously outlined, with attention to whether factors are within or across centers
-Population averaged methods (GEE) – Especially where interest lies with center-level factors
-Center-specific methods – Especially where interest lies in a patient-level factor (conditional on being in a particular center)
Hierarchies can be more than two levels:
-Patients clustered within physicians, and physicians within health plans

Population averaged methods might apply
-GEE
-Survey software also a potential analysis method (SUDAAN, Stata)
Mixed effects methods available in software packages
-SAS MIXED (ideal for continuous outcomes)
-SAS glimmix (PQL) – if centers are large
-SAS nlmixed (only for 2 levels of data or single level of clustering)
-Stata gllamm (adaptive and non-adaptive quadrature)
-MLWin (has Bayesian analysis options)
-HLM (v5) (uses Laplace approximation)
-R (lme and nlme)
-S-plus 6.2 (Correlated Data library)
Performance depends, again, on:
-Outcome – continuous, binary, ordered, counts
-Number of centers
-Number of observations within center
-Dispersion of outcomes across centers

Bayesian hierarchical models
-BUGS v0.6 – Data input is simple, examples abound (Spiegelhalter 1997)
-WinBUGS v1.4 – Data input is difficult, unless dataset is small (Spiegelhalter 2003). Many examples (Congdon 2001; 2003)
-bugs.R (Gelman 2004 – Appendix C) – Resolves the data input problems in WinBUGS. Requires writing program in BUGS. Good documentation
-Splus 6.2 – S+Bayes module provides hierarchical mixed effects models for [...] Few examples, crashes unexpectedly
§6 Report cards, league tables, and profiling (Marshall 1998)
-Misguided methods used by Pennsylvania Healthcare Cost Containment Council (and others)
  (a) Estimate expected risk (mortality) by a logistic regression for each patient
  (b) Sum expected risks across hospitals (surgeons)
  (c) For each center j estimate $(O_j - E_j) / \sqrt{\mathrm{var}}$
  -Does not control at all for Type I error (multiple comparisons)
  -High risk of finding outliers when cause is random variation
-More soundly based method – Mixed effects model
  $\mu_{ij} = \alpha + X_{ij} B + U_j$, where $U_j \sim N(0, \tau)$ represent the j centers and $X B$ represents a matrix of fixed covariates (patient + center level)
  Then the goal is to look for “outlier” centers: $\Gamma_j = U_j / \mathrm{se}(U_j) > 1.96$
  These $\Gamma_j$ represent how far the jth center departs from the overall average.
  These estimators are called “best linear unbiased predictors” (BLUPs), or sometimes “empirical Bayes” or “shrunken” estimates

Report card methods for estimating the $U_j$'s and $\Gamma_j$'s
(a) Continuous measures – REML-based methods likely will perform adequately provided there is a large number of large centers
  -SAS mixed; S-plus – lme; R – lme
(b) Binary outcomes – Perhaps more common (surgical deaths)
  -PQL-based methods suffer from bias in estimating $\sigma_u^2$ (Evans 2001)
    SAS glimmix; R – glmmPQL; MLWin
  -Quadrature-based methods likely perform better
    SAS nlmixed; Stata – gllamm; Splus 6.2 (correlated data library)
(c) Bayesian models for ordering centers based on outcomes
  -Mixed-model-based estimators have standard errors that can be too small because variance components from the mixed model are estimated rather than known. But the extent remains to be determined in different settings.
  -Bayesian methods attempt to account for additional uncertainty (Goldstein 1996)
(d) Use of more complex designs (repeated cross-sectional models)
  -The same mixed effects models could easily be extended to estimate “outliers” and yet avoid Type I and Type II error.
  -Follow centers over time and model
  -This method remains for further development and testing.

§7 Confounding by Cluster
Issues of variance estimation for clustered data are well-known. Problems of bias from confounding are less appreciated (Localio 2002; Berlin 1999; Neuhaus 1998; Ten Have 1996)
Ingredients for confounding by center:
(1) Focus on factors at patient level (race, gender, age) rather than center level (hospital size)
(2) Association between:
  -Center and outcome – e.g., outcome varies across centers
(3) Variation in prevalence of patient-level factors across centers
  -Therefore, if the patient-level factor is balanced (identically distributed) within center, confounding will not occur (except in the special case in which the OR is the outcome measure)
(4) If the odds ratio (OR) is the outcome: Variation in the odds of outcome in the reference patient group across centers (noncollapsibility)
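Ingredient (4) can be illustrated with a small worked calculation. The sketch below is an added illustration with hypothetical numbers: two equal-sized centers, exposure perfectly balanced (50% exposed in each), the same within-center odds ratio of 2.0, but different baseline risks (0.1 and 0.5). Because exposure is balanced there is no confounding by center in the usual sense, yet the pooled odds ratio is smaller than the within-center value.

```python
def odds(p):
    """Convert a risk (probability) to odds."""
    return p / (1.0 - p)


def risk_from_odds(o):
    """Convert odds back to a risk (probability)."""
    return o / (1.0 + o)


within_center_or = 2.0        # assumed common within-center odds ratio
baseline_risks = [0.1, 0.5]   # assumed risk in the reference (unexposed) group, by center

# Risk among exposed patients in each center, implied by the common odds ratio.
exposed_risks = [risk_from_odds(within_center_or * odds(p0)) for p0 in baseline_risks]

# Centers have equal size and equal exposure prevalence,
# so the pooled risks are simple averages across centers.
pooled_unexposed = sum(baseline_risks) / len(baseline_risks)
pooled_exposed = sum(exposed_risks) / len(exposed_risks)
pooled_or = odds(pooled_exposed) / odds(pooled_unexposed)

print(f"within-center OR: {within_center_or:.2f}")
print(f"pooled OR:        {pooled_or:.2f}")
```

The gap between the two odds ratios grows as the baseline risks of the centers move further apart, which is why variation in the reference-group odds across centers matters when the OR is the outcome measure.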
Example: Reperfusion therapy among Medicare Beneficiaries (Canto NEJM 2000)
Question: Do minority patients have lower access to reperfusion therapy for acute myocardial infarction?
Key Facts Known:
  N = 26,575 patients
  Outcome = 57% of patients received reperfusion therapy
  Prevalence of risk factor (race) = 6%
Key Factors Not Known:
  Number of hospitals – perhaps 2000 or more
  Variation in rate of reperfusion across hospitals
  Variation of risk factor across hospitals
Findings: Black women (RR=0.9) and black men (RR=0.85) significantly less likely to receive reperfusion therapy than white men, adjusting for patient and hospital factors (cath lab, urban, 3 sizes, 4 regions)
Proposed explanation: Physicians’ clinical ambiguity, lack of training, insufficient knowledge
Alternative explanation: Caucasian Medicare patients receive care at hospitals in which all patients are more likely to receive reperfusion therapy.

Table 7.1. Interquartile Ranges of Center-Specific Baseline Outcome (overall risk = 0.2)

  Std dev Random Effect   25th %ile   75th %ile   Range
  0.0                     0.15        0.24        0.09
  0.5                     0.11        0.31        0.20
  1.0                     0.10        0.34        0.24

Some variation across centers is consistent with random variation (sd=0.0), assuming that observations from each center come from the same true underlying risk and only sampling variation applies

Table 7.2. Bias under typical methods for analysis of clustered data. Mean odds ratios from 500 simulations.

                              True RR→   1.0    1.5    2.0
                              True OR→   1.0    1.71   2.67
  Cluster specific (CS)                  1.09   1.84   2.84
  Population averaged (GEE)              1.07   1.59   2.21*
  Survey (sandwich)                      1.57   2.32   3.24

Dispersion of exposure and outcome: 2.0; Correlation of dispersion = 0.4
Baseline risk = 0.2; Exposure prevalence = 0.2
(*Note, in this example the PA (GEE) estimate is lower than 2.67 because there are two effects at work.
This estimate uses an exchangeable correlation structure and there is ample attenuation of the results towards the null. This attenuation offsets the bias of confounding by center. See Localio 2002 for details).
Standard methods confound two different attributes of the exposure
(1) The among-center effect – differences across hospitals according to race of patients treated
(2) The strictly within-center (hospital) effect – differences in treatment according to race of patients given that patients of multiple races are treated at the same hospital
Both PA (GEE) and CS (MLWin, NLMIXED, gllamm) methods will result in bias

Solutions?
(1) Condition on center to find the within-center effect
  -eliminates the random center effect by conditioning
  -random effects become nuisance parameters not estimated
  -eliminates all center-level factors
  -uses only centers with discordant outcomes (must have events and nonevents within each center)
  -fixed effects regression (e.g., Stata “xtreg, fe”)
  -conditional regression (conditional logistic or Poisson)
(2) Decompose the within- and among-center effects
  Add the mean rate of exposure for each center as a regression covariate.
  Alternative parameterization: the mean exposure is subtracted from the individual-level (binary) exposure to yield the form:
  $\mathrm{logit}\{E[Y_{ij} \mid \Gamma_j, X_{ij}]\} = \alpha + U_j + \beta_A \bar{X}_j + \beta_W (X_{ij} - \bar{X}_j)$
  Subscripts A and W represent the among and within components of exposure
  $U \sim N(0, \tau^2)$ represents a random intercept for the J centers
  The random intercept is not explicitly estimated in population-averaged models (GEE or survey methods).
The within component – Does medical care differ within an institution because of the patient’s characteristic?
The among component – Does medical care differ across institutions based on the institutional average ($\bar{X}_j$) of the patient’s characteristic?
Each component measures a different effect

Table 7.3. Conditional analyses. Mean odds ratios from 500 simulations.

                       True RR→   1.0    1.5    2.0
                       True OR→   1.0    1.71   2.67
  # Centers   Size
  30          (20-50)             1.02   1.72   2.66
  60          (20-50)             1.01   1.71   2.69
  10          (80-150)            1.00   1.71   2.66

Dispersion of exposure and outcome: 2.0; Correlation of dispersion = 0.4
Baseline risk = 0.2; Exposure prevalence = 0.2
The conditional analysis (in this case conditional logistic regression) gives unbiased estimates regardless of institutional (center) size and the number of centers

Table 7.4. Decomposition of Cluster Specific and Population Averaged Models. Mean odds ratios

                                              True RR→   1.0    1.5    2.0
                                              True OR→   1.0    1.71   2.67
  Method of Analysis
  Center specific                                        1.02   1.72   2.67
  Population averaged (GEE, exch)                        1.01   1.53   2.13
  Population averaged (Survey, sandwich)                 1.01   1.53   2.15

Dispersion of exposure and outcome: 2.0; Correlation of dispersion = 0.4
Baseline risk = 0.2; Exposure prevalence = 0.2
Decomposition of the center specific (CS) estimate gives an unbiased estimate of the true within-center effect
The decomposed GEE and survey (population averaged, PA) estimates are attenuated towards the null, as expected.

Table 7.5. Fixed Effects Analyses – Relation to Number of Centers

                            RR→   1.0    1.5    2.0
                            OR→   1.0    1.71   2.67
  # Centers   Size
  30          (20-50)             1.02   1.75   2.74
  60          (20-50)             1.01   1.74   2.77
  10          (80-150)            1.00   1.71   2.69

Dispersion of exposure and outcome: 2.0; Correlation of dispersion = 0.4
Baseline risk = 0.2; Exposure prevalence = 0.2
A fixed effects analysis (using an indicator variable for each center) will lead to some bias if the number of centers is large. The fixed effects method shows least bias with few centers relative to the number of patients.
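The decomposition in Solution (2) amounts to building two covariates from the patient-level exposure: the center mean $\bar{X}_j$ and the deviation $X_{ij} - \bar{X}_j$. The sketch below is an added illustration that shows this construction with NumPy on a small hypothetical data set. The function name `decompose_exposure` is an assumption for illustration only, and fitting the model itself is left to the mixed-model or GEE software discussed in this outline.

```python
import numpy as np


def decompose_exposure(center, exposure):
    """Split a patient-level exposure into among- and within-center parts.

    Returns (center_mean, deviation), both aligned with the input rows:
    center_mean[i] is the mean exposure in patient i's center, and
    deviation[i] is the patient's own exposure minus that mean.
    """
    center = np.asarray(center)
    exposure = np.asarray(exposure, dtype=float)
    # Map each center label to an integer index 0..J-1.
    _, index = np.unique(center, return_inverse=True)
    sums = np.bincount(index, weights=exposure)
    counts = np.bincount(index)
    center_mean = (sums / counts)[index]
    return center_mean, exposure - center_mean


# Hypothetical data: 7 patients in 2 centers, binary exposure.
center = ["A", "A", "A", "B", "B", "B", "B"]
exposure = [1, 0, 0, 1, 1, 0, 1]
among, within = decompose_exposure(center, exposure)
print(among)   # covariate for the among-center coefficient (beta_A)
print(within)  # covariate for the within-center coefficient (beta_W)
```

Entering both columns in the regression, instead of the raw exposure alone, lets the among-center and within-center coefficients be estimated separately. By construction the within-center deviations sum to zero in every center.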
Example: Reperfusion and Race – Confounding by hospital? (Canto’s example)
Assumptions of a simulation:
-2000 hospitals with 6 to 24 patients per hospital (n=30,000)
-Dispersion of random effects for outcome and exposure = 0.5
-Interquartile range of hospital-level prevalence of race = 0% to 12%
-Interquartile range of rate of reperfusion = 34% to 65%
-Correlation of random effects = 0.5
-Spearman correlation of reperfusion rate and race = 0.077
-True association of race and reperfusion: RR = OR = 1.0
Results: Population averaged (sandwich) method: OR = 0.92 (Canto found RR=0.90 and 0.85)

§8 Volume-outcome studies – Correlations of fixed and random effect
A common application of clustered data – do outcomes improve with larger centers?
A simple model:
  $\ln\left(\frac{\hat{p}_{ij}}{1 - \hat{p}_{ij}}\right) = \alpha + X B + \delta \cdot \mathrm{vol}_j + U_j$,
where $X B$ represents a linear combination of patient-level factors and their coefficients, $\mathrm{vol}_j$ represents a fixed effect for the volume in center j, and $U_j$ represents a random effect for the center, i.e., the variation of each center about the mean $\hat{p}$.
Issue #1: Should the analysis use a PA or CS model? (Panageas 2003)
  PA model makes more sense because
    Volume does not change within center (Graubard 1994)
    There is (usually) no effort to demonstrate effect over time.
  CS and PA methods can give qualitatively different estimates (Panageas 2003)
Issue #2: Whether random effects, $\Gamma_j$, are independent of fixed effects $\mathrm{vol}_j$
  For example, will high volume centers, after controlling for patient characteristics and for volume, tend to have smaller positive departures from average risk than small volume centers?
  -If yes, then simulations show that the negative association of volume and outcome (larger centers have lower risks) will be biased toward the null, with both GEE and center-specific models.
  -A positive association will bias the estimate of association, both PA and CS, away from the null
  -Both methods assume independence of random and fixed effects
  -There is no good way to address the problem, except perhaps by sensitivity analyses

§9 Crossed (non-nested) vs nested effects
Examples:
  Patients are seen by more than one physician
  Physicians practice in more than one hospital
  “Volume” of a surgeon who practices at more than one hospital, where each hospital has its own “volume”
Solution 1: Treat multiple occurrences of repeated measures as coming from an independent observation. If Dr. Smith works at two hospitals, treat Dr. Smith as being two different physicians – her practice might differ across hospitals. (Clayton 1999; Rasbash 2001; Goldstein 2002)
Solution 2: Assign weights to data so that a physician’s time is allocated to Hospital A random effect and Hospital B random effect (Rasbash 2001)
Software options
  MLWin
  Splus
  R
  Bayesian methods
  (Note: Stata’s “gllamm” might not be an option for non-nested models)

§10 Model specification and “interaction”
Desire to express effect of intervention in terms of:
  “risk differences” (for a main effect), or
  “differences of risk differences” (for an interaction of time and treatment/exposure)
“Interaction” is scale dependent
Example: t=time, tx=treatment. Effect of t and tx on outcome (risk)
  No interaction for a multiplicative effect (risk doubles regardless of tx)
  Interaction on additive scale: 5% points (tx=0) vs 10% points (tx=1); a difference of risk differences of 5% points.

          t=1    t=0
  tx=1    0.20   0.10
  tx=0    0.10   0.05
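The scale dependence in this example can be checked with a few lines of arithmetic. The sketch below is an added illustration that uses only the four risks from the example above; the variable names are hypothetical.

```python
# Risks from the example above, indexed as risk[(tx, t)].
risk = {
    (1, 1): 0.20, (1, 0): 0.10,
    (0, 1): 0.10, (0, 0): 0.05,
}

# Multiplicative scale: risk ratio for time within each treatment group.
rr = {tx: risk[(tx, 1)] / risk[(tx, 0)] for tx in (0, 1)}
# Additive scale: risk difference for time within each treatment group.
rd = {tx: risk[(tx, 1)] - risk[(tx, 0)] for tx in (0, 1)}

# A ratio of ratios equal to 1 means no interaction on the multiplicative scale.
ratio_of_ratios = rr[1] / rr[0]
# A difference of differences equal to 0 would mean no interaction on the additive scale.
difference_of_differences = rd[1] - rd[0]

print(f"risk ratios by tx:      {rr}")
print(f"risk differences by tx: {rd}")
print(f"ratio of risk ratios:   {ratio_of_ratios:.2f}")
print(f"difference of risk differences: {difference_of_differences:.2f}")
```

The same four risks therefore show no interaction on the ratio scale and a five percentage point interaction on the difference scale.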
Should the investigator use a linear hierarchical model (as contrasted with a logistic hierarchical model) simply to obtain a risk difference (or a difference in risk differences)?
“The linear approach was adopted because there was more interest in public health effects (that is, percent increase in the outcome per unit change in a predictor) than in epidemiologic association” (Unnamed author of a manuscript reviewed by ARL)
Solution = Fit a statistical model appropriate to the data and then express the results in a manner appropriate to the audience. (Lindsey & Jones 1998)

§11 Comments and Conclusions
-Clustering within centers is a common statistical issue in health services research. It results in challenges to adjust for bias and to estimate variance.
-Choice of software for analysis should be governed by the scientific question, rather than by availability
-Many statistical problems remain unsolved (volume-outcome analysis, for example)
-Solutions for some statistical problems remain controversial (Bayesian methods or permutation-test-based methods)
-Specialized software presents challenges (some does not work), expense (in some instances), and false assurances (when not used properly)
-Performance of statistical software might be good in some applications and poor in others
-Statistical expertise is essential for complex analyses (but statisticians are too often absent in health services research studies based on multi-center data)

References:
Agresti A, Hartzel J. Strategies for comparing treatments on a binary response with multi-center data. Statist Med. 2000;19:1115-39.
Albert PS. Longitudinal data analysis (repeated measures) in clinical trials. Statist Med. 1999;18:1707-32.
Andersen PK, Klein JP, Zhang MJ.
Testing for centre effects in multi-centre survival studies: a Monte Carlo comparison of fixed and random effects tests. Statist Med. 1999;18:1489-1500.
Ashby M, Neuhaus JM, Hauck WW, Bacchetti P, Heilbron DC, Jewell NP, et al. An annotated bibliography of methods for analyzing correlated categorical data. Statist Med. 1992;11:67-99.
Atienza AA, King AC. Community-based health intervention trials: an overview of methodological issues. Epidemiologic Reviews. 2002;24:72-79.
Bellamy SL, Gibberd R, Hancock L, Howley P, Kennedy B, Klar N, Lipsitz S, Ryan L. Analysis of dichotomous outcome data for community intervention studies. Statistical Methods in Medical Research. 2000;9:135-159.
Berlin JA, Kimmel SE, Ten Have TR, Sammel MD. An empirical comparison of several clustered data approaches under confounding due to cluster effects in the analysis of complications of coronary angiography. Biometrics. 1999;55:470-6.
Braun TM, Feng Z. Optimal permutation tests for the analysis of group randomized trials. J Am Statist Assn. 2001;96:1424-32.
Burton P, Gurrin L, Sly P. Extending the simple linear regression model to account for correlated responses: an introduction to generalized estimating equations and multi-level mixed modeling. Statist Med. 1998;17:1261-91.
Campbell MK, Elbourne DR, Altman DG. CONSORT statement: extension to cluster randomized trials. BMJ. 2004;702-8.
Campbell MK, Grimshaw JM. Cluster randomization trials: time for improvement. BMJ. 1998;317:171-2.
Canto JG, Allison JJ, Kiefe CI, Fincher C, Farmer R, Sekar P, Sharina P, Weissman NW. Relation of race and sex to the use of reperfusion therapy in Medicare beneficiaries with acute myocardial infarction. N Engl J Med. 2000;342:1094-1100.
Carlin JB, Wolfe R, Brown CH, Gelman A. A case study on the choice, interpretation and checking of multilevel models for longitudinal binary outcomes. Biostatistics. 2001;2:397-416.
Carpenter J, Bithell J. Bootstrap confidence intervals: when, which, what? A practical guide for medical statisticians. Statist Med. 2000;19:1141-64.
Carlin BP, Louis TA. Bayes and Empirical Bayes Methods for Data Analysis. Second Edition. Boca Raton: Chapman & Hall/CRC; 2000:35.
Carey V, Zeger SL. Modelling multivariate binary data with alternating logistic regression. Biometrika. 1993;80:517-26.
Clayton D, Rasbash J. Estimation in large crossed random effects models by data augmentation. J R Statist Soc A. 1999;162:425-36.
Cnaan A, Laird NM, Slasor P. Using the general linear mixed model to analyze unbalanced repeated measures and longitudinal data. Statist Med. 1997;16:2349-80.
Congdon P. Bayesian Statistical Modeling. New York: John Wiley & Sons; 2001.
Congdon P. Applied Bayesian Modeling. New York: John Wiley & Sons; 2003.
Cornfield J. Randomization by group: a formal analysis. Am J Epidemiol. 1978;108:100-2.
Diaconis P, Efron B. Computer intensive methods in statistics. Scientific American. 1983 May;248(5):116-130.
Diez-Roux AV. Multi-level analysis in public health research. Annu Rev Pub Health. 2000;21:171-92.
Diggle P, Liang KY, Zeger SL. The Analysis of Longitudinal Data. New York: Oxford University Press; 1994.
Diggle PJ, Heagerty P, Liang K-Y, Zeger SL. Analysis of Longitudinal Data. Second Edition. New York: Oxford; 2002.
Donald A, Donner A. Adjustments to the Mantel-Haenszel chi-square statistic and odds ratio variance estimator when the data are clustered. Statist Med. 1987;6:491-9.
Donner A, Klar N. Design and Analysis of Cluster Randomization Trials in Health Research. London: Arnold; 2000.
Donner A, Klar N. Methods for comparing event rates in intervention studies when the unit of allocation is the cluster. Am J Epidemiol. 1994;140:279-89.
Donner A, Brown KS, Brasher P. A methodological review of non-therapeutic intervention trials employing cluster randomization, 1979-1989.
International Journal of Epidemiology. 1990;19:795-800.
Efron B, Tibshirani R. Statistical data analysis in the computer age. Science. 1991;253:390-5.
Evans BA, Feng Z, Peterson AV. A comparison of generalized linear mixed model procedures with estimating equations for variance and covariance parameter estimation in longitudinal studies and group randomized trials. Statist Med. 2001;20:3353-73.
Everitt B, Rabe-Hesketh S. Analyzing Medical Data Using S-plus. New York: Springer; 2001.
Feldman HA, McKinlay SM. Cohort versus cross-sectional design in large field trials: precision, sample size, and a unifying model. Statist Med. 1994;13:61-78.
Feng Z, Diehr P, Peterson A, McLerran D. Selected statistical issues in group randomization trials. Annu Rev Public Health. 2001;22:167-87.
Fitzmaurice G, Laird N, Ware J. Applied Longitudinal Analysis. New York: Wiley; 2004 (June).
Gail MH, Wieand S, Piantadosi S. Biased estimates of treatment effect in randomized experiments with nonlinear regressions and omitted covariates. Biometrika. 1984;71(3):431-44.
Gail MH, Mark SD, Carroll RJ, Green SB. On design considerations and randomization-based inference for community intervention trials. Statist Med. 1996;15:1069-92.
Garthwaite PH. Confidence intervals from randomization tests. Biometrics. 1996;52:1387-93.
Gelman A, Carlin JB, Stern HS, Rubin DB. Bayesian Data Analysis. Second Edition. Boca Raton, FL: Chapman & Hall/CRC; 2004.
Goldstein H. Multilevel Statistical Models. London: Edward Arnold; 1995:1-13.
Goldstein H, Spiegelhalter DJ. League tables and their limitations: statistical issues in comparisons of institutional performance. J R Statist Soc A. 1996;159:385-409.
Goldstein H, Browne W, Rasbash J. Tutorial in biostatistics. Multilevel modeling of medical data. Statist Med. 2002;21:3291-3315.
Good P. Permutation tests. A Practical Guide to Resampling Methods for Testing Hypotheses. 2nd Ed.
New York: Springer-Verlag; 2000.
Good P. Resampling methods: A Practical Guide to Data Analysis. 2nd Edition. Boston: Birkhäuser; 2001.
Gould AL. Multicenter trial analysis revisited. Statist Med. 1998;17:1779-97.
Graubard BI, Korn EL. Regression analysis with clustered data. Statist Med. 1994;13:509-22.
Green SB. The advantages of community-randomized trials for evaluating lifestyle modification. Controlled Clinical Trials. 1997;18:506-13.
Green SB, Corle DK, Gail MH, Mark SD, Pee D, Freedman LS, Graubard BI, Lynn WR. Interplay between design and analysis for behavioral intervention trials with community as the unit of randomization. Am J Epidemiol. 1995;142:587-93.
Guo G, Zhao H. Multilevel modeling for binary data. Annu Rev Sociol. 2000;26:441-62.
Hansen MH, Hurwitz WN, Madow WG. Sample Survey Methods and Theory, Vol 1: Methods and Applications. New York: John Wiley & Sons; 1953.
Hedeker D, Siddiqui O, Hu FB. Random-effects regression analysis of correlated grouped-time survival data. Statistical Methods in Medical Research. 2000;9:161-79.
Horton NJ, Lipsitz SR. Review of software to fit generalized estimating equation regression models. American Statistician. 1999;53:160-9.
International Conference on Harmonization of Technical Requirements for Registration of Pharmaceuticals for Human Use. ICH Harmonized Tripartite Guideline. Statistical Principles for Clinical Trials. Statist Med. 1999;18:1905-42. http://www.fda.gov/cder/guidance/index.html.
Kerry SM, Bland JM. The intracluster correlation coefficient in cluster randomization. BMJ. 1998;316:1455-60.
Kerry SM, Bland JM. Statistical notes: sample size in cluster randomization. BMJ. 1998;316:549.
Kish L. Survey Sampling. New York: John Wiley & Sons; 1965:88.
Koepsell TD, Martin DC, Diehr PH, Psaty BM, Wagner EF, Perrin EB, Cheadle A.
Data analysis and sample size issues in evaluations of community-based health promotion and disease prevention programs: a mixed model analysis of variance approach. J Clin Epidemiol. 1991;44:701-13.
Korn EL, Graubard BI. Analysis of Health Surveys. New York: John Wiley & Sons; 1999.
LaVange LM, Koch GG, Schwartz TA. Applying sample survey methods to clinical trials data. Statist Med. 2001;20:2609-23.
Leyland AH, Goldstein H, eds. Multilevel Modeling of Health Statistics. Chichester: John Wiley & Sons; 2001.
Liang K-Y, Zeger SL. Regression analysis for correlated data. Annu Rev Pub Health. 1993;14:43-68.
Lindsey JK, Jones B. Choosing among generalized linear models applied to medical data. Statist Med. 1998;17:59-68.
Lindsey JK, Lambert P. On the appropriateness of marginal models for repeated measurements in clinical trials. Statist Med. 1998;17:447-69.
Localio AR, Berlin JA, Ten Have TR. Confounding due to cluster in multicenter studies – causes and cures. Health Services & Outcomes Research Methodology. 2002;3:1-16.
Longford NT. Random coefficient models. Oxford: Clarendon Press; 1993.
Marshall EC, Spiegelhalter DJ. Reliability of league tables of in vitro fertilization clinics: retrospective analysis of live birth rates. BMJ. 1998;316:1701-5.
McCulloch CE, Searle SR. Generalized, Linear, and Mixed Models. New York: John Wiley & Sons; 2001:232-4.
Murray DM. Design and Analysis of Group Randomization Trials. New York: Oxford University Press; 1998.
Murray DM. Statistical models appropriate for designs often used in group-randomization trials. Statist Med. 2001;20:1373-85.
Murray DM, Hannan PJ, Wolfinger RD, Baker WL, Dwyer JH. Analysis of data from group-randomized trials with repeat observations on the same groups. Statist Med. 1998;17:1581-1600.
Hannan PJ, Murray DM. Gauss or Bernoulli?
A Monte Carlo comparison of the performance of the linear mixed-model analysis of simulated community trials with a dichotomous outcome variable at the individual level. Evaluation Review. 1996;20:338-52.
Neuhaus JM. Assessing change with longitudinal and clustered binary data. Annu Rev Public Health. 2001;22:115-28.
Neuhaus JM. Statistical methods for longitudinal and clustered designs with binary responses. Statistical Methods in Medical Research. 1992;1:249-73.
Neuhaus JM, Segal MR. Design effects for binary regression models fitted to dependent data. Statist Med. 1993;12:1259-68.
Neuhaus JM, Kalbfleisch JD, Hauck WW. A comparison of cluster-specific and population-averaged approaches for analyzing correlated binary data. International Statistical Review. 1991;59:25-35.
Neuhaus JM, Kalbfleisch JD. Between- and within-cluster covariate effects in the analysis of clustered data. Biometrics. 1998;54:638-64.
Nixon RM, Thompson SG. Baseline adjustments for binary data in repeated cross-sectional cluster randomized trials. Statist Med. 2003;22:2673-92.
Normand ST. Tutorial in biostatistics. Meta-analysis: formulating, evaluating, combining, and reporting. Statist Med. 1999;18:321-359.
Ukoumunne OC, Thompson SG. Analysis of cluster randomization trials with repeated cross-sectional binary measurements. Statist Med. 2001;20:417-33.
Ukoumunne OC, Gulliford MC, Chinn S, Sterne JAC, Burney PGJ, Donner A. Evaluation of health interventions at area and organization level. BMJ. 1999;319:376-9.
Omar RZ, Thompson SG. Analysis of a cluster randomized trial with binary outcome data using a multi-level model. Statist Med. 2000;19:2675-88.
Panageas KS, Schrag D, Riedel E, Bach PB, Begg CB. The effect of clustering of outcomes on the association of procedure volume and surgical outcomes. Ann Intern Med. 2003;139:658-665.
Rabe-Hesketh S, Pickles A, Skrondal A.
Reliable estimation of generalized linear mixed models using adaptive quadrature. The Stata Journal. 2002;2:1-21.
Rabe-Hesketh S, Pickles A, Skrondal A. GLLAMM Manual. London: King's College; 2001. http://www.gllamm.org/
Rao JNK, Scott AJ. A simple method for the analysis of clustered binary data. Biometrics. 1992;48:577-85.
Rao JNK, Scott AJ. A simple method for analyzing overdispersion in clustered Poisson data. Statist Med. 1999;18:1373-85.
Rasbash J, Browne W. Modelling non-hierarchical structures. In: Leyland AH, Goldstein H, eds. Multilevel Modelling of Health Statistics. New York: John Wiley & Sons; 2001:93-105.
Raudenbush SW, Bryk AS. Hierarchical Linear Models: Applications and Data Analysis Methods. Newbury Park, CA: Sage; 2002.
Raudenbush SW, Yang ML, Matheos Y. Maximum likelihood for generalized linear models with nested random effects via high-order, multivariate Laplace approximation. J Computational and Graphical Statistics. 2000;9:141-57.
Rosenbaum PR. Covariance adjustment in randomized experiments and observational studies. Statistical Science. 2002;17:286-327.
Senn S. Some controversies in planning and analyzing multi-centre trials. Statist Med. 1998;17:1753-65.
Simpson JM, Klar N, Donner A. Accounting for cluster randomization: a review of primary prevention trials, 1990-1993. Am J Pub Hlth. 1995;85:1378-83.
Spiegelhalter D, Thomas A, Best N, Gilks W. BUGS 0.6. Bayesian Inference Using Gibbs Sampling. Cambridge: MRC Biostatistics Unit, Institute of Public Health; 1997.
Spiegelhalter D, Thomas A, Best N, Lunn D. WinBUGS User Manual. Version 1.4. Cambridge: MRC Biostatistics Unit, Institute of Public Health; 2003.
Sullivan LM, Dukes KA, Losina E. Tutorial in biostatistics. An introduction to hierarchical linear modeling. Statist Med. 1999;18:855-88.
Ten Have TR, Kunselman AR, Tran L.
A comparison of mixed effects logistic regression models for binary response data with two nested levels of clustering. Statist. Med. 1999;18:947-60. Ten Have TR, Landis JR, Weaver S. Association models for periodontal disease progression: a comparison of methods for clustered binary data. (Letter). Statist Med. 1996;15:1227-9. R. Localio, Clustered Observations in HSR, 06/05/2004: Page 79 of 80 C Ten Have TR, Landis JR, Weaver SL. Association models for periodontal disease progression: a comparison of methods for clustered binary data. Statist Med. 1995;14:413-29. Ten Have TR, Kunselman A, Zharichenko E. Accommodating negative intracluster correlation with a mixed effects logistic model for bivariate binary data. J Biopharmaceutical Statistics. 1998;8:131-49. Thompson SG, Warn DE, Turner RM. Bayesian methods for analysis of binary outcome data in cluster randomized trials on the absolute risk scale. Statist Med. 2004; 23:389-410. Venables WN, Ripley BD. Modern Applied Statistics with S. Fourth Edition. New York: Springer; 2002:297-8. Wei LJ, Glidden DV. An overview of statistical methods for multiple failure time data in clinical trials. Statist Med. 1997;16:833-39. Williams RL. A note on robust variance estimation for cluster-correlated data. Biometrics. 2000;56:645-6. Zeger SL, Liang KY, Albert PS. Models for longitudinal data: a generalized estimating equation approach. Biometrics. 1988:44:1049-60. This project was supported in part by an Agency for Healthcare Research and Quality (AHRQ) Centers for Education and Research on Therapeutics cooperative agreement (grant # U18 HS10399) and by Agency for Healthcare Research and Quality, Grant No. R03 HS 1148101. R. Localio, Clustered Observations in HSR, 06/05/2004: Page 80 of 80 C
