SDC 3 Appendix: Background information and formulas for the p

advertisement
SDC 3 Appendix: Background information and formulas for the p-values and
confidence interval for the problem of comparing two means.
Brief summary: The purpose of this technical appendix is to present the detailed formulas
used to compute the p-values and the confidence interval for the problem of comparing two
means. The derivation of the formulas relating pb and ph to p-values and the derivation of
the confidence interval are also presented..
The model we assume is that Y11 …Y1n1 are independent and normally distributed with mean
1 and variance  12 and, independently, Y21 , Y2n are independent and normally distributed
2
2
with mean  2 and variance  2 . The problem is to make inferences about the difference in
means 2  1 .
The sample mean of sample i  1 2 is Y i  ni1  ji1 Yij (presented in L62 and L93), and the
n
sample variance of sample i  1 2 is si2  (ni  1) 1  ji1 (Yij  Y i )2 , (the sample standard
n
deviation si is presented in L63 and L94). The standard error is
SE Y 2  Y 1  s12  n1  s22  n2 (presented in L64 and L95) and the approximate degrees of


freedom is   SE Y 2  Y 1   s12  n1    n1  1   s22  n2    n2  1 (presented in L65 and
4
2
2
L96). Under the assumed model, the sampling distribution of
Y
2
 Y 1   2  1   SE Y 2  Y 1 is approximated by the Student t distribution with 
degrees of freedom. In the equal variance case (  12   22 ), the sampling distribution of
Y
2
 Y 1   2  1   SE Y 2  Y 1 where
SE Y 2  Y 1 
 n 1 s   n
2
1
1
2
 1 s22    n1  n2  2  1  n1  1  n2 , is exactly the Student t
distribution with   n1  n2  2 degrees of freedom.
Let G be the distribution function of the Student t distribution with  degrees of freedom
and let T denote a random variable with this distribution. The Student t distribution is
symmetric so T and T have the same distribution. The one-sided p-value for testing the
null hypothesis that 2  1   against the alternative that 2  1   is

Pr T   Y2  Y1    / SE  Y2  Y1  2  1  





 Pr T      Y2  Y1  / SE  Y2  Y1  2  1   


 Pr   T     Y2  Y1  / SE  Y2  Y1  2  1   


 G     Y2  Y1  / SE  Y2  Y1  


 ph


given by [2]. Thus ph can be interpreted as the one-sided p-value for testing the null
hypothesis that 2  1   against the alternative that 2  1   (i.e. the probability
under the null hypothesis 2  1   that we observe a t-ratio greater than or equal to the
observed value of the t-ratio Y 2  Y 1    / SE Y 2  Y 1 ). Similarly, pb can be interpreted as
the one-sided p-value for testing the null hypothesis that 2  1   against the alternative
that 2  1   . The derivation is

Pr Tv   Y2  Y1    / SE  Y2  Y1  2  1  





 Pr Tv      Y2  Y1  / SE  Y2  Y1  2  1   


 Pr   Tv     Y2  Y1  / SE  Y2  Y1  2  1   


 1  Gv     Y2  Y1  / SE  Y2  Y1  


 pb .


The confidence interval [1] for 2  1 is obtained as
1    Pr  t  Y 2  Y 1   2  1   SE Y 2  Y 1   t 
 Pr Y 2  Y 1  t SE Y 2  Y 1  2  1  Y 2  Y 1  t SE Y 2  Y 1  .
The critical value t  G 1 1   / 2 , where G 1 is the inverse of the distribution function
of the Student t distribution.
Download