Lecture 4

Sampling from a MVN Distribution
BMTRY 726
1/17/2014

Sample Mean Vector
• We can estimate a sample mean for $X_1, X_2, \ldots, X_n$

$$\bar{X} = \frac{1}{n}\left(X_1 + X_2 + \cdots + X_n\right) = \frac{1}{n}\begin{pmatrix} \sum_{j=1}^{n} X_{1j} \\ \sum_{j=1}^{n} X_{2j} \\ \vdots \\ \sum_{j=1}^{n} X_{pj} \end{pmatrix} = \begin{pmatrix} \bar{X}_1 \\ \bar{X}_2 \\ \vdots \\ \bar{X}_p \end{pmatrix} = \frac{1}{n}\sum_{j=1}^{n} X_j$$

Sample Mean Vector
• Now we can estimate the mean of our sample
• But what about the properties of $\bar{X}$?
– It is an unbiased estimate of the mean
– It is a sufficient statistic
– Also, the sampling distribution is: $\bar{X} \sim N_p\left(\mu, \tfrac{1}{n}\Sigma\right)$

Sample Covariance
• And the sample covariance for $X_1, X_2, \ldots, X_n$

$$S = \frac{1}{n-1}\sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})' = \begin{pmatrix} s_{11} & s_{12} & \cdots & s_{1p} \\ s_{21} & s_{22} & \cdots & s_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ s_{p1} & s_{p2} & \cdots & s_{pp} \end{pmatrix}$$

• Sample variance

$$s_{ii} = s_i^2 = \frac{1}{n-1}\sum_{j=1}^{n} (x_{ij} - \bar{x}_i)^2$$

• Sample covariance

$$s_{ik} = \frac{1}{n-1}\sum_{j=1}^{n} (x_{ij} - \bar{x}_i)(x_{kj} - \bar{x}_k)$$

Sample Mean Vector
• So we can also estimate the variance of our sample
• And like $\bar{X}$, $S$ also has some nice properties
– It is an unbiased estimate of the variance
– It is also a sufficient statistic
– It is also independent of $\bar{X}$
• But what about the sampling distribution of $S$?

Wishart Distribution
• Given $Z_1, Z_2, \ldots, Z_n \sim NID_p(0, \Sigma)$, the distribution of $\sum_{j=1}^{n} Z_j Z_j'$ is called a Wishart distribution with $n$ degrees of freedom.
• $A = (n-1)S = \sum_{j=1}^{n} (x_j - \bar{x})(x_j - \bar{x})'$ has a Wishart distribution with $n-1$ degrees of freedom
• The density function is

$$W_{n-1}(A \mid \Sigma) = \frac{|A|^{(n-p-2)/2}\, e^{-\operatorname{tr}(A\Sigma^{-1})/2}}{2^{p(n-1)/2}\, \pi^{p(p-1)/4}\, |\Sigma|^{(n-1)/2} \prod_{i=1}^{p} \Gamma\left(\frac{n-i}{2}\right)}$$

where $A$ and $\Sigma$ are positive definite

Wishart cont’d
• The Wishart distribution is the multivariate analog of the central chi-squared distribution.
– If $A_1 \sim W_q(A_1 \mid \Sigma)$ and $A_2 \sim W_r(A_2 \mid \Sigma)$ are independent, then $A_1 + A_2 \sim W_{q+r}(A_1 + A_2 \mid \Sigma)$
– If $A \sim W_n(A \mid \Sigma)$, then $CAC'$ is distributed $W_n(CAC' \mid C\Sigma C')$
– The distribution of the $(i, i)$ element of $A$ is $a_{ii} = \sum_{j=1}^{n} (x_{ij} - \bar{x}_i)^2 \sim \sigma_{ii}\,\chi^2_{n-1}$

Large Sample Behavior
• Let $X_1, X_2, \ldots, X_n$ be a random sample from a population with mean and variance (not necessarily normally distributed)

$$\mu = \begin{pmatrix} \mu_1 \\ \vdots \\ \mu_p \end{pmatrix} \quad \text{and} \quad \Sigma = \begin{pmatrix} \sigma_{11} & \cdots & \sigma_{1p} \\ \vdots & \ddots & \vdots \\ \sigma_{p1} & \cdots & \sigma_{pp} \end{pmatrix}$$

Then $\bar{X}$ and $S$ are consistent estimators for $\mu$ and $\Sigma$. This means

$$\bar{X} \xrightarrow{P} \mu \ \text{ as } n \to \infty, \qquad S \xrightarrow{P} \Sigma \ \text{ as } n \to \infty$$

Large Sample Behavior
• If we have a random sample $X_1, X_2, \ldots, X_n$ from a population with mean $\mu$ and variance $\Sigma$, we can apply the multivariate central limit theorem as well
• The multivariate CLT says that, approximately for large $n$,

$$\sqrt{n}\,(\bar{X} - \mu) \sim N_p(0, \Sigma) \quad \text{and} \quad (\bar{X} - \mu)' \left(\tfrac{1}{n} S\right)^{-1} (\bar{X} - \mu) \sim \chi^2_p$$

Checking Normality Assumptions
• Check univariate normality for each component of $X$
– Normal probability plots (i.e. Q-Q plots)
– Tests:
  • Shapiro-Wilk
  • Correlation
  • EDF
• Check bivariate (and higher)
– Bivariate scatter plots
– Chi-square probability plots

Univariate Methods
• If $X_1, X_2, \ldots, X_n$ are a random sample from a p-dimensional normal population, then the data for the ith trait are a random sample from a univariate normal distribution (from result 4.2)
• Q-Q plot
(1) Order the data $x_{i(1)} \le x_{i(2)} \le \cdots \le x_{i(n)}$
(2) Compute the quantiles $q_1 \le q_2 \le \cdots \le q_n$ according to

$$\int_{-\infty}^{q_j} \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\, dz = \frac{j - \frac{1}{2}}{n}, \quad \text{which leads to} \quad q_j = \Phi^{-1}\left(\frac{j - \frac{1}{2}}{n}\right), \quad j = 1, 2, \ldots, n$$

(3) Plot the pairs of observations $(x_{i(1)}, q_1), (x_{i(2)}, q_2), \ldots, (x_{i(n)}, q_n)$

Correlation Tests
• Shapiro-Wilk test
• Alternative is a modified version of Shapiro-Wilk test
• Uses correlation coefficient from the Q-Q plot

$$r_Q = \frac{\sum_{j=1}^{n} (x_{i(j)} - \bar{x}_i)(q_j - \bar{q})}{\sqrt{\sum_{j=1}^{n} (x_{i(j)} - \bar{x}_i)^2}\ \sqrt{\sum_{j=1}^{n} (q_j - \bar{q})^2}}$$

• Reject normality if $r_Q$ is too small (values in Table 4.2)

Empirical Distribution Tests
• Anderson-Darling and Kolmogorov-Smirnov statistics measure how much the empirical distribution function (EDF)

$$F_n(x_i) = \frac{\text{number of observations less than or equal to } x_i}{n}$$

differs from the hypothesized distribution $F(x, \theta)$, using $\hat{\theta}$ to estimate $\theta$
• For a univariate normal distribution

$$\theta = \begin{pmatrix} \mu \\ \sigma^2 \end{pmatrix}, \quad \hat{\theta} = \begin{pmatrix} \bar{x} \\ s^2 \end{pmatrix}, \quad \text{and} \quad F(x, \hat{\theta}) = \Phi\left(\frac{x - \bar{x}}{s}\right)$$

• Large values for either statistic indicate observed data were not sampled from the hypothesized distribution

Multivariate Methods
• You can generate bivariate plots of all pairs of traits and look for unusual observations
• A chi-square plot checks for normality in p > 2 dimensions
(1) For each observation compute $d_j^2 = (x_j - \bar{x})' S^{-1} (x_j - \bar{x})$, $j = 1, 2, \ldots, n$
(2) Order these values from smallest to largest $d^2_{(1)} \le d^2_{(2)} \le \cdots \le d^2_{(n)}$
(3) Calculate quantiles for the chi-squared distribution with p d.f., $q_1 \le q_2 \le \cdots \le q_n$, where

$$P\left(\chi^2_p \le q_j\right) = \frac{j - \frac{1}{2}}{n}$$

Multivariate Methods
(4) Plot the pairs $(d^2_{(1)}, q_1), (d^2_{(2)}, q_2), \ldots, (d^2_{(n)}, q_n)$, that is, $d^2_{(j)}$ against $q_j$
Do the points deviate too much from a straight line?

Things to Do with non-MVN Data
• Apply normal based procedures anyway
– Hope for the best….
– Resampling procedures
• Try to identify a more appropriate multivariate distribution
• Nonparametric methods
• Transformations
• Check for outliers

Transformations
• The idea of transformations is to re-express the data to make it more normal looking
• Choosing a suitable transformation can be guided by
– Theoretical considerations
  • Count data can often be made to look more normal by using a square root transformation
– The data themselves
  • If the choice is not particularly clear consider power transformations

Power Transformations
• Commonly used, but note they are defined only for positive variables
• Defined by a parameter $\lambda$ as follows:

$$y_j = \begin{cases} x_j^{\lambda} & \text{if } \lambda \ne 0 \\ \ln(x_j) & \text{if } \lambda = 0 \end{cases}$$

• So what do we use?
– Right skewed data consider $\lambda < 1$ (fractions, 0, negative numbers…)
– Left skewed data consider $\lambda > 1$

Power Transformations
• Box-Cox transformations are a popular modification of power transformations where

$$y_j^{(\lambda)} = \begin{cases} \dfrac{x_j^{\lambda} - 1}{\lambda} & \text{if } \lambda \ne 0 \\ \ln(x_j) & \text{if } \lambda = 0 \end{cases}$$

• Box-Cox transformations determine the best $\lambda$ by maximizing:

$$\ell(\lambda) = -\frac{n}{2} \ln\left[\frac{1}{n}\sum_{j=1}^{n} \left(y_j^{(\lambda)} - \bar{y}^{(\lambda)}\right)^2\right] + (\lambda - 1)\sum_{j=1}^{n} \ln x_j, \qquad \bar{y}^{(\lambda)} = \frac{1}{n}\sum_{j=1}^{n} y_j^{(\lambda)} = \frac{1}{n}\sum_{j=1}^{n} \frac{x_j^{\lambda} - 1}{\lambda}$$

Transformations
• Note, in the multivariate setting, this would be considered for every trait
• However… normality of each individual trait does not guarantee joint normality
• We could iteratively try to search for the best transformations for joint and marginal normality
– May not really improve our results substantially
– And often univariate transformations are good enough in practice
• Be very cautious about rejecting normality

Next Time
• Examples of normality checks in SAS and R
• Begin our discussion of statistical inference for MV vectors
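
As a recap of the Q-Q plot steps and the correlation coefficient r_Q from the Univariate Methods and Correlation Tests slides, here is a minimal Python sketch. It is not from the lecture: the function name `qq_correlation`, the seed, and the simulated data are assumptions made for illustration, and the critical values from Table 4.2 are not reproduced.

```python
import numpy as np
from scipy.stats import norm


def qq_correlation(x):
    """Return (ordered data, standard normal quantiles, r_Q) for a 1-D sample.

    Hypothetical helper for illustration; follows steps (1)-(3) of the
    Q-Q plot and then correlates the plotted pairs.
    """
    # Step 1: order the data x_(1) <= ... <= x_(n)
    x_sorted = np.sort(np.asarray(x, dtype=float))
    n = x_sorted.size
    # Step 2: quantiles q_j with Phi(q_j) = (j - 1/2) / n
    probs = (np.arange(1, n + 1) - 0.5) / n
    q = norm.ppf(probs)
    # Step 3 would plot (x_(j), q_j); r_Q is their sample correlation
    r_q = np.corrcoef(x_sorted, q)[0, 1]
    return x_sorted, q, r_q


# Simulated example data (an assumption, not lecture data)
rng = np.random.default_rng(726)
sample = rng.normal(loc=10.0, scale=2.0, size=50)
x_sorted, q, r_q = qq_correlation(sample)
print(f"r_Q = {r_q:.4f}")
```

A value of r_Q close to 1 is consistent with normality; how small is "too small" depends on n and the chosen significance level.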
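
The chi-square plot steps from the Multivariate Methods slides can be sketched in the same way. This is only an illustration under stated assumptions: the helper name `chi_square_plot_points`, the seed, and the simulated trivariate normal data are hypothetical, and the plotting call itself is left out.

```python
import numpy as np
from scipy.stats import chi2


def chi_square_plot_points(X):
    """Return (ordered squared distances, chi-square quantiles) for an n x p array.

    Hypothetical helper for illustration of the chi-square plot steps.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    xbar = X.mean(axis=0)            # sample mean vector
    S = np.cov(X, rowvar=False)      # sample covariance, divisor n - 1
    centered = X - xbar
    # d_j^2 = (x_j - xbar)' S^{-1} (x_j - xbar), using a solve instead of an explicit inverse
    solved = np.linalg.solve(S, centered.T).T
    d2 = np.sum(centered * solved, axis=1)
    # Order the distances and pair them with chi-square(p) quantiles at (j - 1/2) / n
    d2_sorted = np.sort(d2)
    q = chi2.ppf((np.arange(1, n + 1) - 0.5) / n, df=p)
    return d2_sorted, q


# Simulated example data (an assumption, not lecture data)
rng = np.random.default_rng(4)
X = rng.multivariate_normal(mean=[0.0, 0.0, 0.0], cov=np.eye(3), size=200)
d2_sorted, q = chi_square_plot_points(X)
print(np.column_stack([d2_sorted[:5], q[:5]]))
```

Plotting `d2_sorted` against `q` gives the chi-square plot; points far from a straight line through the origin suggest departures from multivariate normality.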
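
For the Box-Cox slide, one simple way to find the maximizing λ is a grid search over the criterion ℓ(λ). The sketch below assumes simulated lognormal data and an arbitrary grid from -2 to 2; the names `box_cox` and `box_cox_loglik` are hypothetical helpers, not functions from the lecture or from a library.

```python
import numpy as np


def box_cox(x, lam):
    """Box-Cox transform: (x^lam - 1) / lam for lam != 0, ln(x) for lam == 0."""
    x = np.asarray(x, dtype=float)
    if np.isclose(lam, 0.0):
        return np.log(x)
    return (x ** lam - 1.0) / lam


def box_cox_loglik(x, lam):
    """Criterion l(lam) = -(n/2) ln[(1/n) sum (y_j - ybar)^2] + (lam - 1) sum ln x_j."""
    x = np.asarray(x, dtype=float)
    n = x.size
    y = box_cox(x, lam)
    return -0.5 * n * np.log(np.mean((y - y.mean()) ** 2)) + (lam - 1.0) * np.log(x).sum()


# Simulated right-skewed, strictly positive data (an assumption, not lecture data)
rng = np.random.default_rng(2014)
x = rng.lognormal(mean=1.0, sigma=0.5, size=300)

# Evaluate the criterion on a grid of candidate lambda values and keep the maximizer
grid = np.linspace(-2.0, 2.0, 81)
loglik = np.array([box_cox_loglik(x, lam) for lam in grid])
best_lambda = grid[np.argmax(loglik)]
print(f"best lambda on grid = {best_lambda:.2f}")
```

Because the simulated data are lognormal, the selected λ should land near 0 (the log transformation). In the multivariate setting the same search would be repeated for each trait.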
