Comparing 2 Population Means

Comparison of 2 Population Means
• Goal: To compare 2 populations/treatments with respect to a numeric outcome
• Sampling Design: Independent Samples (Parallel Groups) vs Paired Samples (Crossover Design)
• Data Structure: Normal vs Non-normal
• Sample Sizes: Large ($n_1, n_2 > 20$) vs Small

Independent Samples
• Units in the two samples are different
• Sample sizes may or may not be equal
• Large-sample inference based on Normal Distribution (Central Limit Theorem)
• Small-sample inference depends on distribution of individual outcomes (Normal vs non-Normal)

Parameters/Estimates (Independent Samples)
• Parameter: $\mu_1 - \mu_2$
• Estimator: $\bar{Y}_1 - \bar{Y}_2$
• Estimated standard error: $\sqrt{S_1^2/n_1 + S_2^2/n_2}$
• Shape of sampling distribution:
  – Normal if data are normal
  – Approximately normal if $n_1, n_2 > 20$
  – Non-normal otherwise (typically)

Large-Sample Test of $\mu_1 - \mu_2$
• Null hypothesis: The population means differ by $D_0$ (which is typically 0): $H_0: \mu_1 - \mu_2 = D_0$
• Alternative Hypotheses:
  – 1-Sided: $H_A: \mu_1 - \mu_2 > D_0$
  – 2-Sided: $H_A: \mu_1 - \mu_2 \ne D_0$
• Test Statistic: $z_{obs} = \frac{(\bar{y}_1 - \bar{y}_2) - D_0}{\sqrt{S_1^2/n_1 + S_2^2/n_2}}$
• Decision Rule:
  – 1-sided alternative $H_A: \mu_1 - \mu_2 > D_0$
    • If $z_{obs} \ge z_{\alpha}$ ==> Conclude $\mu_1 - \mu_2 > D_0$
    • If $z_{obs} < z_{\alpha}$ ==> Do not reject $\mu_1 - \mu_2 = D_0$
  – 2-sided alternative $H_A: \mu_1 - \mu_2 \ne D_0$
    • If $z_{obs} \ge z_{\alpha/2}$ ==> Conclude $\mu_1 - \mu_2 > D_0$
    • If $z_{obs} \le -z_{\alpha/2}$ ==> Conclude $\mu_1 - \mu_2 < D_0$
    • If $-z_{\alpha/2} < z_{obs} < z_{\alpha/2}$ ==> Do not reject $\mu_1 - \mu_2 = D_0$
• Observed Significance Level (P-Value)
  – 1-sided alternative $H_A: \mu_1 - \mu_2 > D_0$
    • $P = P(z \ge z_{obs})$ (From the std. Normal distribution)
  – 2-sided alternative $H_A: \mu_1 - \mu_2 \ne D_0$
    • $P = 2P(z \ge |z_{obs}|)$ (From the std. Normal distribution)
• If P-Value $\le \alpha$, then reject the null hypothesis

Large-Sample $(1-\alpha)100\%$ Confidence Interval for $\mu_1 - \mu_2$
• Confidence Coefficient $(1-\alpha)$ refers to the proportion of times this rule would provide an interval that contains the true parameter value $\mu_1 - \mu_2$ if it were applied over all possible samples
• Rule: $(\bar{y}_1 - \bar{y}_2) \pm z_{\alpha/2} \sqrt{S_1^2/n_1 + S_2^2/n_2}$
• For 95% Confidence Intervals, $z_{.025} = 1.96$
• Confidence Intervals and 2-sided tests give identical conclusions at the same $\alpha$-level:
  – If entire interval is above $D_0$, conclude $\mu_1 - \mu_2 > D_0$
  – If entire interval is below $D_0$, conclude $\mu_1 - \mu_2 < D_0$
  – If interval contains $D_0$, do not reject $\mu_1 - \mu_2 = D_0$

Example: Vitamin C for Common Cold
• Outcome: Number of Colds During Study Period for Each Student
• Group 1: Given Placebo: $\bar{y}_1 = 2.2$, $s_1 = 0.12$, $n_1 = 155$
• Group 2: Given Ascorbic Acid (Vitamin C): $\bar{y}_2 = 1.9$, $s_2 = 0.10$, $n_2 = 208$
Source: Pauling (1971)

2-Sided Test to Compare Groups
• $H_0: \mu_1 - \mu_2 = 0$ (No difference in treatment effects)
• $H_A: \mu_1 - \mu_2 \ne 0$ (Difference in treatment effects)
• Test Statistic: $z_{obs} = \frac{(2.2 - 1.9) - 0}{\sqrt{(0.12)^2/155 + (0.10)^2/208}} = \frac{0.3}{0.0119} = 25.3$
• Decision Rule ($\alpha = 0.05$)
  – Conclude $\mu_1 - \mu_2 > 0$ since $z_{obs} = 25.3 > z_{.025} = 1.96$

95% Confidence Interval for $\mu_1 - \mu_2$
• Point Estimate: $\bar{y}_1 - \bar{y}_2 = 2.2 - 1.9 = 0.3$
• Estimated Std. Error: $\sqrt{(0.12)^2/155 + (0.10)^2/208} = 0.0119$
• Critical Value: $z_{.025} = 1.96$
• 95% CI: $0.30 \pm 1.96(0.0119) \equiv 0.30 \pm 0.023 \equiv (0.277, 0.323)$. Entire interval > 0

Small-Sample Test for $\mu_1 - \mu_2$, Normal Populations
• Case 1: Common Variances ($\sigma_1^2 = \sigma_2^2 = \sigma^2$)
• Null Hypothesis: $H_0: \mu_1 - \mu_2 = D_0$
• Alternative Hypotheses:
  – 1-Sided: $H_A: \mu_1 - \mu_2 > D_0$
  – 2-Sided: $H_A: \mu_1 - \mu_2 \ne D_0$
• Test Statistic (where $S_p^2$ is a "pooled" estimate of $\sigma^2$):
  $t_{obs} = \frac{(\bar{y}_1 - \bar{y}_2) - D_0}{\sqrt{S_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$, where $S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}$
• Decision Rule: (Based on t-distribution with $\nu = n_1 + n_2 - 2$ df)
  – 1-sided alternative
    • If $t_{obs} \ge t_{\alpha,\nu}$ ==> Conclude $\mu_1 - \mu_2 > D_0$
    • If $t_{obs} < t_{\alpha,\nu}$ ==> Do not reject $\mu_1 - \mu_2 = D_0$
  – 2-sided alternative
    • If $t_{obs} \ge t_{\alpha/2,\nu}$ ==> Conclude $\mu_1 - \mu_2 > D_0$
    • If $t_{obs} \le -t_{\alpha/2,\nu}$ ==> Conclude $\mu_1 - \mu_2 < D_0$
    • If $-t_{\alpha/2,\nu} < t_{obs} < t_{\alpha/2,\nu}$ ==> Do not reject $\mu_1 - \mu_2 = D_0$
• Observed Significance Level (P-Value)
• Special Tables Needed, Printed by Statistical Software Packages
  – 1-sided alternative
    • $P = P(t \ge t_{obs})$ (From the $t_{\nu}$ distribution)
  – 2-sided alternative
    • $P = 2P(t \ge |t_{obs}|)$ (From the $t_{\nu}$ distribution)
• If P-Value $\le \alpha$, then reject the null hypothesis

Small-Sample $(1-\alpha)100\%$ Confidence Interval for $\mu_1 - \mu_2$, Normal Populations
• Confidence Coefficient $(1-\alpha)$ refers to the proportion of times this rule would provide an interval that contains the true parameter value $\mu_1 - \mu_2$ if it were applied over all possible samples
• Rule: $(\bar{y}_1 - \bar{y}_2) \pm t_{\alpha/2,\nu} \sqrt{S_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$
• Interpretations same as for large-sample CIs

Small-Sample Inference for $\mu_1 - \mu_2$, Normal Populations
• Case 2: $\sigma_1^2 \ne \sigma_2^2$
• Don't pool variances: $S_{\bar{y}_1 - \bar{y}_2} = \sqrt{S_1^2/n_1 + S_2^2/n_2}$
• Use "adjusted" degrees of freedom (Satterthwaite's Approximation):
  $\nu^* = \frac{\left( \frac{S_1^2}{n_1} + \frac{S_2^2}{n_2} \right)^2}{\frac{(S_1^2/n_1)^2}{n_1 - 1} + \frac{(S_2^2/n_2)^2}{n_2 - 1}}$

Example - Scalp Wound Closure
• Groups: Stapling ($n_1 = 15$) / Suturing ($n_2 = 16$)
• Outcome: Physician Reported VAS Score at 1-Year

                  Mean     Std Dev   Sample Size
Stapling (i=1)    96.92    7.51      15
Suturing (i=2)    96.31    8.06      16

• Conduct a 2-sided test of whether mean scores differ
• Construct a 95% Confidence Interval for true difference
Source: Khan, et al (2002)
• $H_0: \mu_1 - \mu_2 = 0$, $H_A: \mu_1 - \mu_2 \ne 0$ ($\alpha = 0.05$)
• Pooled variance: $S_p^2 = \frac{(15 - 1)(7.51)^2 + (16 - 1)(8.06)^2}{15 + 16 - 2} = 60.83$
• Test Statistic: $t_{obs} = \frac{96.92 - 96.31}{\sqrt{60.83 \left( \frac{1}{15} + \frac{1}{16} \right)}} = \frac{0.61}{2.80} = 0.22$
• Rejection Region: $|t_{obs}| \ge t_{.025,29} = 2.045$
• 95% CI: $0.61 \pm 2.045(2.80) \equiv 0.61 \pm 5.73 \equiv (-5.12, 6.34)$
• No significant difference between the 2 methods

Small Sample Test to Compare Two Medians - Nonnormal Populations
• Two Independent Samples (Parallel Groups)
• Procedure (Wilcoxon Rank-Sum Test):
  – Rank measurements across samples from smallest (1) to largest ($n_1 + n_2$). Ties take average ranks.
  – Obtain the rank sum for each group ($T_1$, $T_2$)
  – 1-sided tests: Conclude $H_A: M_1 > M_2$ if $T_2 \le T_0$
  – 2-sided tests: Conclude $H_A: M_1 \ne M_2$ if $\min(T_1, T_2) \le T_0$
  – Values of $T_0$ are given in many texts for various sample sizes and significance levels. P-values printed by statistical software packages.

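The ranking and rank-sum steps of the Wilcoxon Rank-Sum procedure can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `average_ranks` and `rank_sums` are made up for this example, the data are the AUC values from the Levocabastine example that follows, and the critical value 26 is the tabled value quoted in that example rather than something the code derives.

```python
def average_ranks(values):
    """Rank values from smallest (1) to largest; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = ((pos + 1) + (end + 1)) / 2  # mean of the 1-based positions in the tie block
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def rank_sums(sample1, sample2):
    """Return (T1, T2): rank sums of each sample within the combined ranking."""
    ranks = average_ranks(list(sample1) + list(sample2))
    return sum(ranks[:len(sample1)]), sum(ranks[len(sample1):])


# AUC values from the Levocabastine example (hypothetical variable names).
non_dialysis = [857, 567, 626, 532, 444, 357]
hemodialysis = [527, 740, 392, 514, 433, 392]

t1, t2 = rank_sums(non_dialysis, hemodialysis)
T0 = 26  # tabled 2-sided critical value quoted in the example
medians_differ = min(t1, t2) <= T0
print(t1, t2, medians_differ)
```

In practice a statistics package would also report a P-value; the sketch stops at the table-based decision rule the slides describe.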
Example - Levocabastine in Renal Patients
• 2 Groups: Non-Dialysis/Hemodialysis ($n_1 = n_2 = 6$)
• Outcome: Levocabastine AUC (1 Outlier/Group)

Group          AUC (rank)                                                    Rank sum
Non-Dialysis   857 (12), 567 (9), 626 (10), 532 (8), 444 (5), 357 (1)        T1 = 45
Hemodialysis   527 (7), 740 (11), 392 (2.5), 514 (6), 433 (4), 392 (2.5)     T2 = 33

• 2-sided Test: Conclude Medians differ if $\min(T_1, T_2) \le 26$
• Here $\min(T_1, T_2) = 33 > 26$, so the test does not conclude that the medians differ
Source: Zagornik, et al (1993)
Computer Output - SPSS (output table not recoverable)

Inference Based on Paired Samples (Crossover Designs)
• Setting: Each treatment is applied to each subject or pair (preferably in random order)
• Data: $d_i$ is the difference in scores (Trt1 - Trt2) for subject (pair) $i$
• Parameter: $\mu_D$ - Population mean difference
• Sample Statistics: $\bar{d} = \frac{\sum_{i=1}^{n} d_i}{n}$, $s_d^2 = \frac{\sum_{i=1}^{n} (d_i - \bar{d})^2}{n - 1}$, $s_d = \sqrt{s_d^2}$

Test Concerning $\mu_D$
• Null Hypothesis: $H_0: \mu_D = D_0$ (almost always 0)
• Alternative Hypotheses:
  – 1-Sided: $H_A: \mu_D > D_0$
  – 2-Sided: $H_A: \mu_D \ne D_0$
• Test Statistic: $t_{obs} = \frac{\bar{d} - D_0}{s_d / \sqrt{n}}$, which reduces to $\frac{\bar{d}}{s_d / \sqrt{n}}$ when $D_0 = 0$
• Decision Rule: (Based on t-distribution with $\nu = n - 1$ df)
  – 1-sided alternative
    • If $t_{obs} \ge t_{\alpha,\nu}$ ==> Conclude $\mu_D > D_0$
    • If $t_{obs} < t_{\alpha,\nu}$ ==> Do not reject $\mu_D = D_0$
  – 2-sided alternative
    • If $t_{obs} \ge t_{\alpha/2,\nu}$ ==> Conclude $\mu_D > D_0$
    • If $t_{obs} \le -t_{\alpha/2,\nu}$ ==> Conclude $\mu_D < D_0$
    • If $-t_{\alpha/2,\nu} < t_{obs} < t_{\alpha/2,\nu}$ ==> Do not reject $\mu_D = D_0$
• Confidence Interval for $\mu_D$: $\bar{d} \pm t_{\alpha/2,\nu} \left( \frac{s_d}{\sqrt{n}} \right)$

Example - Evaluation of Transdermal Contraceptive Patch In Adolescents
• Subjects: Adolescent Females on oral contraceptives (O.C.) who then received Ortho Evra Patch
• Response: 5-point scores on ease of use for each type of contraception (1=Strongly Agree)
• Data: $d_i$ = difference (O.C. - EVRA) for subject $i$
• Summary Statistics: $\bar{d} = 1.77$, $s_d = 1.48$, $n = 13$
Source: Rubinstein, et al (2004)
• 2-sided test for differences in ease of use ($\alpha = 0.05$)
• $H_0: \mu_D = 0$, $H_A: \mu_D \ne 0$
• Test Statistic: $t_{obs} = \frac{1.77}{1.48 / \sqrt{13}} = \frac{1.77}{0.41} = 4.31$
• Rejection Region: $|t_{obs}| \ge t_{.025,12} = 2.179$
• 95% CI: $1.77 \pm 2.179(0.41) \equiv 1.77 \pm 0.89 \equiv (0.88, 2.66)$
• Conclude Mean Scores are higher for O.C.; girls find the Patch easier to use (low scores are better)

Small-Sample Test For Nonnormal Data
• Paired Samples (Crossover Design)
• Procedure (Wilcoxon Signed-Rank Test)
  – Compute Differences $d_i$ (as in the paired t-test) and obtain their absolute values (ignoring 0s)
  – Rank the observations by $|d_i|$ (smallest=1), averaging ranks for ties
  – Compute $T^+$ and $T^-$, the rank sums for the positive and negative differences, respectively
  – 1-sided tests: Conclude $H_A: M_1 > M_2$ if $T^- \le T_0$
  – 2-sided tests: Conclude $H_A: M_1 \ne M_2$ if $\min(T^+, T^-) \le T_0$
  – Values of $T_0$ are given in many texts for various sample sizes and significance levels. P-values printed by statistical software packages.

Example - New MRI for 3D Coronary Angiography
• Previous vs new Magnetization Prep Schemes ($n = 7$)
• Response: Blood/Myocardium Contrast-Noise-Ratio

Subject   Previous   New   Diff (Previous-New)   |Diff|   Rank(|Diff|)
A         20         36    -16                   16       7
B         31         37    -6                    6        1
C         20         27    -7                    7        2
D         19         32    -13                   13       5
E         40         48    -8                    8        3
F         28         40    -12                   12       4
G         10         25    -15                   15       6

• All Differences are negative, $T^- = 1 + 2 + \dots + 7 = 28$, $T^+ = 0$
• From tables for 2-sided tests, $n = 7$, $\alpha = 0.05$, $T_0 = 2$
• Since $\min(0, 28) \le 2$, conclude the scheme means differ
Source: Nguyen, et al (2004)
Computer Output - SPSS (output table not recoverable). Note that SPSS is taking NEW-PREVIOUS in the top table.
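The Wilcoxon Signed-Rank steps applied to the MRI example can be sketched as follows. This is a minimal illustration under stated assumptions: `signed_rank_sums` is a hypothetical helper name, the data are the Previous/New values from the table above, and the critical value 2 is the tabled value quoted in the example, not computed here.

```python
def signed_rank_sums(first, second):
    """Return (T_plus, T_minus) for paired samples, using differences first - second."""
    # Drop zero differences, as the procedure prescribes.
    diffs = [a - b for a, b in zip(first, second) if a != b]
    abs_sorted = sorted(abs(d) for d in diffs)

    def avg_rank(value):
        # Ties occupy consecutive positions; their average rank is the midpoint.
        first_pos = abs_sorted.index(value) + 1
        count = abs_sorted.count(value)
        return first_pos + (count - 1) / 2

    t_plus = sum(avg_rank(abs(d)) for d in diffs if d > 0)
    t_minus = sum(avg_rank(abs(d)) for d in diffs if d < 0)
    return t_plus, t_minus


# Contrast-noise-ratio data from the MRI example, subjects A through G.
previous = [20, 31, 20, 19, 40, 28, 10]
new = [36, 37, 27, 32, 48, 40, 25]

t_plus, t_minus = signed_rank_sums(previous, new)
T0 = 2  # tabled 2-sided critical value for n = 7, alpha = 0.05, as quoted above
schemes_differ = min(t_plus, t_minus) <= T0
print(t_plus, t_minus, schemes_differ)
```

Swapping the argument order (New minus Previous, as the SPSS note describes) exchanges the two rank sums but leaves their minimum, and so the 2-sided decision, unchanged.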

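The z and t statistics in the worked examples can be re-derived from the reported summary statistics alone. The sketch below does so with plain arithmetic; the function names are invented for illustration, and the critical values (1.96, 2.045, 2.179) are copied from the slides rather than computed, since the Python standard library has no t-distribution. Small discrepancies from the slide values come from the slides rounding the standard error before using it.

```python
from math import sqrt


def z_two_sample(y1, s1, n1, y2, s2, n2, d0=0.0):
    """Large-sample z statistic and standard error for mu1 - mu2."""
    se = sqrt(s1**2 / n1 + s2**2 / n2)
    return ((y1 - y2) - d0) / se, se


def pooled_t(y1, s1, n1, y2, s2, n2, d0=0.0):
    """Equal-variance t statistic, standard error, and pooled variance."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    se = sqrt(sp2 * (1 / n1 + 1 / n2))
    return ((y1 - y2) - d0) / se, se, sp2


def satterthwaite_df(s1, n1, s2, n2):
    """Adjusted degrees of freedom when variances are not pooled."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


def paired_t(d_bar, s_d, n, d0=0.0):
    """Paired t statistic and standard error of the mean difference."""
    se = s_d / sqrt(n)
    return (d_bar - d0) / se, se


def interval(estimate, critical_value, se):
    """Two-sided confidence interval: estimate +/- critical_value * se."""
    return estimate - critical_value * se, estimate + critical_value * se


# Vitamin C example (slides report z = 25.3 and CI (0.277, 0.323)).
z_obs, z_se = z_two_sample(2.2, 0.12, 155, 1.9, 0.10, 208)
z_ci = interval(2.2 - 1.9, 1.96, z_se)

# Scalp wound closure example (slides report Sp^2 = 60.83, t = 0.22).
t_obs, t_se, sp2 = pooled_t(96.92, 7.51, 15, 96.31, 8.06, 16)
t_ci = interval(96.92 - 96.31, 2.045, t_se)
nu_star = satterthwaite_df(7.51, 15, 8.06, 16)  # Case 2 df for the same data

# Contraceptive patch example (slides report t = 4.31, CI (0.88, 2.66)).
d_obs, d_se = paired_t(1.77, 1.48, 13)
d_ci = interval(1.77, 2.179, d_se)

print(round(z_obs, 1), round(t_obs, 2), round(d_obs, 2))
```

For the scalp data the two sample variances are close, so the Satterthwaite degrees of freedom land very near the pooled value of 29.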