Sample Size, Power and Effect Size

Power Calculations for DNA Applications

Below we provide estimates of power and effect size for various genetic tests and subgroups within the Framingham Heart Study. Estimates are provided for the following situations:

1) linkage analyses with the 330 pedigrees included in the genome scan with microsatellite markers, conducted by the Mammalian Genotyping Service,
2) association analyses for unrelated individuals with varying sample sizes,
3) association analyses for family-based association tests.

Linkage Analyses in 330 Pedigrees

All power calculations for linkage were performed using SOLAR v1.6.6 (Almasy and Blangero, 1998). We evaluated the power of a likelihood-ratio test for linkage for a range of QTL heritabilities (5% to 40%) and various sample sizes. The different sample sizes reflect the variation in the availability of data for different phenotypes. Three sample sizes, ranging from the largest possible size (e.g. for a trait measured in all Framingham participants such as blood pressure) to smaller sizes (e.g. traits measured only in the Offspring Cohort or a subset of Offspring) were considered within the existing 330 pedigrees. For these samples we computed power for three critical values (LOD scores 1.0, 2.0 and 3.0). These critical values encompass LOD scores that are suggestive of linkage (LOD ≥ 1.0 or 1.5) or establish significant linkage (LOD ≥ 3).

Given a selected critical value, an approximate value of the power of the likelihood-ratio test is computed using the non-central χ² distribution whose non-centrality parameter is twice the difference between the expected log-likelihoods of the null and alternative hypotheses of linkage (Sham et al., 2000). In this analysis the non-centrality parameters were estimated by means of simulation using our pedigrees.
For each QTL heritability, 100 data sets were simulated using the 330 pedigree structure and a genetic model containing a marker with 10 alleles of equal frequency (0.90 heterozygosity) that was linked to a diallelic QTL with 10% allelic frequency. We evaluated other allele frequencies, but found that all were essentially equivalent since the primary determinant of power here is the proportion of variance that the QTL explains. Simulation was performed using the "simqtl" command in SOLAR. First, the marker and the phenotypic data (under the assumed QTL variance) were generated and the IBD probabilities were computed for all observations in the pedigrees. Then, for a given sample, the non-centrality parameter of the χ² distribution was estimated from a two-point linkage analysis of that sample.

Table 1 presents the expected LOD score (ELOD, the average two-point LOD score from 100 simulated data sets) and power computed for the various samples and levels of QTL heritability in the 330 families. In this table the three columns for power represent different critical values. For example, the estimated power for sample size N=2885 in 330 pedigrees and 20% QTL heritability was 99%, 92% and 76% if linkage is deemed significant for LOD ≥ 1.0, LOD ≥ 2.0 and LOD ≥ 3.0, respectively.
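The power approximation described above can be sketched in a few lines. This is an illustration only, not the SOLAR procedure: it assumes a 1-degree-of-freedom non-central χ² (the degrees of freedom are not stated in the text), converts LOD units to likelihood-ratio units with the factor 2 ln(10), and takes the ELOD as the expected LOD under the alternative. The function names are hypothetical. Under these assumptions the sketch appears to give values close to those quoted for Table 1.

```python
import math

LN10 = math.log(10.0)

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def linkage_power(elod, lod_threshold):
    """Approximate power of a LOD-score test (assumed 1-df non-central chi-square).

    elod          -- expected LOD score under the alternative
    lod_threshold -- critical LOD value (e.g. 1.0, 2.0 or 3.0)
    """
    ncp = 2.0 * LN10 * elod            # non-centrality: twice the expected log-likelihood difference
    crit = 2.0 * LN10 * lod_threshold  # critical value on the chi-square scale
    a, b = math.sqrt(crit), math.sqrt(ncp)
    # A 1-df non-central chi-square is (Z + b)^2 with Z standard normal,
    # so P(X > crit) = P(Z > a - b) + P(Z < -a - b).
    return (1.0 - normal_cdf(a - b)) + normal_cdf(-a - b)

if __name__ == "__main__":
    for lod in (1.0, 2.0, 3.0):
        print(lod, round(100 * linkage_power(4.22, lod)))
```

With ELOD = 4.22 (20% QTL heritability, N=2885) the loop prints the approximate power in percent at each of the three critical values, which can be compared with the corresponding Table 1 row.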
Table 1: Power for Linkage Analyses in 330 Pedigrees

ELOD is the expected LOD score; LOD1, LOD2 and LOD3 give power (%) at critical LOD scores of 1.0, 2.0 and 3.0.

                      Original Cohort and Offspring   Offspring                       Offspring Subset
                      [N=2885, Ped=330]               [N=1672, Ped=330]               [N=1228, Ped=326]
QTL Heritability (%)    ELOD  LOD1  LOD2  LOD3          ELOD  LOD1  LOD2  LOD3          ELOD  LOD1  LOD2  LOD3
 5                      0.39    21     4     1          0.30    16     3     1          0.21    12     2     0
10                      1.20    58    25     9          0.83    43    14     4          0.51    27     7     1
15                      2.47    89    63    36          1.68    74    40    17          0.98    49    18     6
20                      4.22    99    92    76          2.84    93    72    46          1.63    72    38    16
25                      6.50   100    99    96          4.36    99    93    78          2.47    89    63    37
30                      9.35   100   100   100          6.26   100    99    95          3.53    97    84    62
35                     12.87   100   100   100          8.60   100   100    99          4.84    99    95    84

*For 40% QTL heritability, power greater than 95%

Association Studies with Biologically Unrelated Subjects

Table 2 displays detectable effect sizes (ΔR², or percent variation explained by a QTL SNP) for a quantitative trait measured on unrelated subjects for power of 80% or 90% and alpha levels of 0.01, 0.001 and 0.0001. These calculations are performed for varying sample sizes, from 1800 to 800. The numbers for 1800 correspond to genotyping the unrelated plate set and all subjects having the trait. Thus, for a 1% significance level, if background (baseline) covariates explain 30% of the variation in the trait (R² base), then a sample of size 1800 will have 80% power to detect a QTL variance of 0.0054 and 90% power to detect a QTL variance of 0.0068.
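The following sketch shows one way such detectable-effect calculations can be approximated. The source does not state the formula used, so everything here is an assumption: a large-sample χ² test with 2 degrees of freedom (unrestricted genotype means), with non-centrality parameter N·ΔR² / (1 − R² base − ΔR²). The function names are hypothetical. Under these assumptions the worked example above (N=1800, R² base = 0.3, alpha = 0.01) comes out at roughly 80% and 90% power.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a central chi-square with even df (closed form)."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

def association_power(n, delta_r2, r2_base, alpha, terms=200):
    """Approximate power of a 2-df test for a QTL explaining delta_r2 of the variance.

    Assumed non-centrality: n * delta_r2 / (1 - r2_base - delta_r2).
    """
    ncp = n * delta_r2 / (1.0 - r2_base - delta_r2)
    crit = -2.0 * math.log(alpha)  # exact upper-alpha point of a 2-df chi-square
    # A non-central chi-square is a Poisson(ncp/2) mixture of central
    # chi-squares with df = 2 + 2j.
    weight = math.exp(-ncp / 2.0)
    power = 0.0
    for j in range(terms):
        if j > 0:
            weight *= (ncp / 2.0) / j
        power += weight * chi2_sf_even_df(crit, 2 + 2 * j)
    return power

if __name__ == "__main__":
    print(association_power(1800, 0.0054, 0.3, 0.01))
    print(association_power(1800, 0.0068, 0.3, 0.01))
```

Searching over `delta_r2` until the returned power reaches 0.80 or 0.90 would give a detectable QTL variance for a given sample size, baseline R² and significance level.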
Table 2: QTL variance sufficient for power 0.80 (0.90) in association analyses of unrelated subjects

Analysis with General Linear Model having unrestricted genotype means [u(AA), u(Aa), u(aa)]. Entries are values of QTL variance for stated power and significance level given fixed sample size and non-QTL baseline R-squared, with total variance=1.0.

                                   Significance Level
Sample Size   Power   R² base     0.01      0.001     0.0001
1800          0.8     0.5         0.0039    0.0055    0.0070
1800          0.8     0.3         0.0054    0.0076    0.0098
1800          0.8     0.1         0.0069    0.0098    0.0126
1800          0.9     0.5         0.0048    0.0066    0.0083
1800          0.9     0.3         0.0068    0.0092    0.0116
1800          0.9     0.1         0.0087    0.0119    0.0149
1700          0.8     0.5         0.0041    0.0058    0.0074
1700          0.8     0.3         0.0057    0.0081    0.0104
1700          0.8     0.1         0.0074    0.0104    0.0133
1700          0.9     0.5         0.0051    0.0070    0.0087
1700          0.9     0.3         0.0072    0.0098    0.0122
1700          0.9     0.1         0.0092    0.0126    0.0157
1600          0.8     0.5         0.0043    0.0061    0.0079
1600          0.8     0.3         0.0061    0.0086    0.0110
1600          0.8     0.1         0.0078    0.0110    0.0142
1600          0.9     0.5         0.0054    0.0074    0.0093
1600          0.9     0.3         0.0076    0.0104    0.0130
1600          0.9     0.1         0.0098    0.0133    0.0167
1500          0.8     0.5         0.0046    0.0065    0.0084
1500          0.8     0.3         0.0065    0.0092    0.0117
1500          0.8     0.1         0.0083    0.0118    0.0151
1500          0.9     0.5         0.0058    0.0079    0.0099
1500          0.9     0.3         0.0081    0.0111    0.0139
1500          0.9     0.1         0.0104    0.0142    0.0178
1400          0.8     0.5         0.0050    0.0070    0.0090
1400          0.8     0.3         0.0069    0.0098    0.0126
1400          0.8     0.1         0.0089    0.0126    0.0162
1400          0.9     0.5         0.0062    0.0085    0.0106
1400          0.9     0.3         0.0087    0.0119    0.0148
1400          0.9     0.1         0.0112    0.0152    0.0191
1300          0.8     0.5         0.0053    0.0075    0.0097
1300          0.8     0.3         0.0075    0.0106    0.0135
1300          0.8     0.1         0.0096    0.0136    0.0174
1300          0.9     0.5         0.0067    0.0091    0.0114
1300          0.9     0.3         0.0094    0.0128    0.0160
1300          0.9     0.1         0.0120    0.0164    0.0205
1200          0.8     0.5         0.0058    0.0082    0.0105
1200          0.8     0.3         0.0081    0.0114    0.0147
1200          0.8     0.1         0.0104    0.0147    0.0188
1200          0.9     0.5         0.0072    0.0099    0.0124
1200          0.9     0.3         0.0101    0.0138    0.0173
1200          0.9     0.1         0.0130    0.0178    0.0222
1100          0.8     0.5         0.0063    0.0089    0.0114
1100          0.8     0.3         0.0088    0.0125    0.0160
1100          0.8     0.1         0.0114    0.0160    0.0206
1100          0.9     0.5         0.0079    0.0108    0.0135
1100          0.9     0.3         0.0111    0.0151    0.0189
1100          0.9     0.1         0.0142    0.0194    0.0242
1000          0.8     0.5         0.0069    0.0098    0.0126
1000          0.8     0.3         0.0097    0.0137    0.0176
1000          0.8     0.1         0.0125    0.0177    0.0226
1000          0.9     0.5         0.0087    0.0118    0.0148
1000          0.9     0.3         0.0122    0.0166    0.0207
1000          0.9     0.1         0.0156    0.0213    0.0266
 900          0.8     0.5         0.0077    0.0109    0.0139
 900          0.8     0.3         0.0108    0.0152    0.0195
 900          0.8     0.1         0.0139    0.0196    0.0251
 900          0.9     0.5         0.0097    0.0131    0.0164
 900          0.9     0.3         0.0135    0.0184    0.0230
 900          0.9     0.1         0.0174    0.0236    0.0296
 800          0.8     0.5         0.0087    0.0122    0.0157
 800          0.8     0.3         0.0122    0.0171    0.0219
 800          0.8     0.1         0.0156    0.0220    0.0282
 800          0.9     0.5         0.0109    0.0148    0.0185
 800          0.9     0.3         0.0152    0.0207    0.0258
 800          0.9     0.1         0.0195    0.0266    0.0332

Differences between SNP genotypes relate to ΔR² through allele frequencies (let f denote minor allele frequency), which determine expected genotype frequencies (g1, g2 and g3), and through covariate-adjusted genotype-specific means (μ1, μ2 and μ3). Without loss of generality, we assume trait values are standardized to variance 1. Then

    ΔR² = Σ g_i (μ_i − μ)²,   where μ = Σ g_i μ_i is the overall mean.

With an additive genetic model, μ2 − μ1 = θ and μ3 − μ1 = 2θ. Figure 1 displays the relation between the increment ΔR², the minor allele frequency, f, and the additive difference between means, θ. For example, suppose we determine that we have power 0.80 to detect a SNP that contributes 2% of the variance (ΔR² = 0.02). If the minor allele frequency is f = 0.10 then θ ≈ 0.33; note that θ is in standard deviation units. If the minor allele frequency is f = 0.30, then θ ≈ 0.22. In conjunction with the power table, this Figure helps us to determine which combinations of allele frequencies and genotype mean differences will be compatible with the value of ΔR² from power calculations for specified power and significance level.
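The relation between ΔR², f and θ described above can be checked numerically. The sketch below assumes Hardy-Weinberg proportions for the expected genotype frequencies (the text says only that allele frequencies determine them), in which case the additive model reduces to ΔR² = 2f(1 − f)θ². The function names are hypothetical.

```python
import math

def delta_r2(f, theta):
    """Delta R-square = sum of g_i * (mu_i - mu)^2 under an additive model.

    Assumes Hardy-Weinberg genotype frequencies for minor allele frequency f
    and genotype means 0, theta, 2*theta (trait standardized to variance 1).
    """
    g = [(1.0 - f) ** 2, 2.0 * f * (1.0 - f), f ** 2]
    means = [0.0, theta, 2.0 * theta]
    overall = sum(gi * mi for gi, mi in zip(g, means))
    return sum(gi * (mi - overall) ** 2 for gi, mi in zip(g, means))

def theta_for(delta, f):
    """Additive difference theta (in SD units) giving the stated Delta R-square."""
    return math.sqrt(delta / (2.0 * f * (1.0 - f)))

if __name__ == "__main__":
    for f in (0.10, 0.30):
        print(f, round(theta_for(0.02, f), 2))
```

For ΔR² = 0.02 this gives θ of about 0.33 at f = 0.10 and about 0.22 at f = 0.30, matching the example in the text.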
Delta R-square related to Allele Frequency (Figure 1; plot not reproduced here). Theta = difference between genotype means. Vertical axis: Theta, from 0 to 0.5. Horizontal axis: minor allele frequency, from 0 to 0.5. One curve is drawn for each value of Delta R-sq: 0.005, 0.01, 0.015 and 0.02.

Association Studies using Family-Based Association Test

Table 3 displays power calculated using PBAT for the family-based association test with the family plate set. Two sample sizes were evaluated: n=1398 from 400 nuclear families, using essentially all subjects on the plate set, and n=1061 from 291 nuclear families, a subset of the total. Power was computed for a range of QTL and marker minor allele frequencies. The QTL and the marker are assumed to be in linkage disequilibrium (D' ≈ 1). Thus, when the minor allele frequency of the QTL is equal to that of the marker, it is assumed that the marker and the QTL are the same and power is at its maximum. Although D' remains constant, as the difference between the minor allele frequencies of the QTL and the marker increases, the LD correlation (R²) and the study power decrease. For example, if a study used 1400 subjects on the family plate set for a trait with allele frequency for the QTL of 5% and allele frequency of the nearby marker of 10%, the power to detect a QTL variance (heritability) of 10% would be 52% for an alpha of 0.001 and 93% for an alpha of 0.05. When the allele frequencies of the marker and the underlying functional variant of the QTL are the same, power in these examples increases to a maximum of 98% and 100%.
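To illustrate why power falls as the two allele frequencies diverge even though D' stays fixed, the sketch below computes the LD correlation from the standard definitions D = D' × Dmax and R² = D² / (pA(1 − pA) pB(1 − pB)). It assumes positive LD with the two minor alleles on the same haplotype. It is not part of the PBAT calculation, and the function name is hypothetical.

```python
def ld_r2(p_qtl, p_marker, d_prime=1.0):
    """LD correlation (R-square) between a QTL and a marker.

    p_qtl, p_marker -- minor allele frequencies (as proportions)
    d_prime         -- normalized disequilibrium, assumed positive
    """
    # Largest positive D compatible with the two allele frequencies.
    d_max = min(p_qtl * (1.0 - p_marker), (1.0 - p_qtl) * p_marker)
    d = d_prime * d_max
    return d * d / (p_qtl * (1.0 - p_qtl) * p_marker * (1.0 - p_marker))

if __name__ == "__main__":
    for p_marker in (0.05, 0.10, 0.20, 0.30):
        print(p_marker, round(ld_r2(0.05, p_marker), 3))
```

With a QTL frequency of 5%, R² is 1 when the marker frequency is also 5% and drops to about 0.47 at a marker frequency of 10%, which is the direction of the power loss reported in Table 3.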
Table 3: Power for FBAT analyses

Allele frequencies, QTL heritability (H2) and power are in percent.

                                              Power (%)
              Allele Freq   Allele Freq   QTL H2 = 5            QTL H2 = 10
Sample        for Functional  for Typed   Alpha     Alpha       Alpha     Alpha
Size          Variant       Marker        0.001     0.05        0.001     0.05
1400           5             5            66        98          98        100
1400           5            10            21        72          52         93
1400           5            20             6        41          16         65
1400           5            30             2        26           7         44
1400          10             5            24        77          65         97
1400          10            10            71        98          99        100
1400          10            20            22        72          57         94
1400          10            30             9        50          27         77
1400          20             5             6        43          21         73
1400          20            10            24        75          66         97
1400          20            20            74        98          99        100
1400          20            30            36        85          80         99
1400          30             5             3        28           9         51
1400          30            10            10        53          33         84
1400          30            20            39        87          86         99
1400          30            30            75        98          99        100
1061           5             5            29        85          72         99
1061           5            10             8        50          22         76
1061           5            20             2        27           7         45
1061           5            30             1        18           3         30
1061          10             5             9        54          28         84
1061          10            10            36        87          82        100
1061          10            20             9        51          27         78
1061          10            30             4        33          11         55
1061          20             5             2        28           7         50
1061          20            10             9        54          31         84
1061          20            20            40        88          86        100
1061          20            30            15        64          46         91
1061          30             5             1        19           3         33
1061          30            10             4        35          13         62
1061          30            20            17        66          53         93
1061          30            30            41        88          88        100
