population denote

※ Discrimination and classification

Discrimination and classification have two goals:
1. To describe the differential features of objects from several known collections.
2. To assign new objects to two or more labeled classes by using a rule.

Example: We want to distinguish two species of chickweed by sepal and petal length, petal cleft depth, bract length, scarious tip length, and pollen diameter.

Let $f_1(x)$ and $f_2(x)$ be the density functions associated with the $p \times 1$ random vector $X$ for the populations $\pi_1$ and $\pi_2$. Let $\Omega$ be the sample space of all possible observations $x$. Let $R_1$ be the set of $x$ values for which we classify objects as $\pi_1$, and let $R_2 = \Omega - R_1$ be the remaining $x$ values, for which we classify objects as $\pi_2$.

The conditional probability of classifying an object as $\pi_2$ when it is from $\pi_1$ is
$$p(2|1) = P(X \in R_2 \mid \pi_1) = \int_{R_2} f_1(x)\,dx .$$
Similarly, the conditional probability of classifying an object as $\pi_1$ when it is really from $\pi_2$ is
$$p(1|2) = P(X \in R_1 \mid \pi_2) = \int_{R_1} f_2(x)\,dx .$$

Let $p_1$ be the prior probability of $\pi_1$ and $p_2$ be the prior probability of $\pi_2$, where $p_1 + p_2 = 1$. Then
$$P(\text{observation is misclassified as } \pi_1) = P(X \in R_1 \mid \pi_2)\,P(\pi_2) = p(1|2)\,p_2 ,$$
$$P(\text{observation is misclassified as } \pi_2) = P(X \in R_2 \mid \pi_1)\,P(\pi_1) = p(2|1)\,p_1 .$$

Let $c(1|2)$ be the cost when an observation from $\pi_2$ is incorrectly classified as $\pi_1$, and let $c(2|1)$ be the cost when an observation from $\pi_1$ is incorrectly classified as $\pi_2$. The expected cost of misclassification is
$$\mathrm{ECM} = c(2|1)\,p(2|1)\,p_1 + c(1|2)\,p(1|2)\,p_2 .$$

1. For two populations (minimum ECM):
Allocate $x$ to $\pi_1$ if
$$\frac{f_1(x)}{f_2(x)} \ge \frac{c(1|2)}{c(2|1)} \cdot \frac{p_2}{p_1} .$$
In particular, if the two populations are normally distributed, then:
a. If $\Sigma_1 = \Sigma_2 = \Sigma$:
Allocate $x$ to $\pi_1$ if
$$(\mu_1 - \mu_2)'\Sigma^{-1}x - \frac{1}{2}(\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 + \mu_2) \ge \ln\!\left[\frac{c(1|2)}{c(2|1)} \cdot \frac{p_2}{p_1}\right] .$$
PS: The above inequality is implemented by substituting the sample quantities $\bar{x}_1, \bar{x}_2, S_{pooled}$ for $\mu_1, \mu_2, \Sigma$ respectively.
b.
If $\Sigma_1 \ne \Sigma_2$:
Allocate $x$ to $\pi_1$ if
$$-\frac{1}{2}x'(\Sigma_1^{-1} - \Sigma_2^{-1})x + (\mu_1'\Sigma_1^{-1} - \mu_2'\Sigma_2^{-1})x - K \ge \ln\!\left[\frac{c(1|2)}{c(2|1)} \cdot \frac{p_2}{p_1}\right] ,$$
where
$$K = \frac{1}{2}\ln\!\left(\frac{|\Sigma_1|}{|\Sigma_2|}\right) + \frac{1}{2}\left(\mu_1'\Sigma_1^{-1}\mu_1 - \mu_2'\Sigma_2^{-1}\mu_2\right) .$$
PS: The above inequality is implemented by substituting the sample quantities $\bar{x}_1, \bar{x}_2, S_1, S_2$ for $\mu_1, \mu_2, \Sigma_1, \Sigma_2$ respectively.

2. For two populations using Fisher's discrimination (maximum separation):
It does not assume that the populations are normal, but it needs equal population covariance matrices.
Choose a linear transformation $y = a'x$ such that the separation
$$\frac{|\bar{y}_1 - \bar{y}_2|}{s_y}$$
is maximum, where
$$s_y^2 = \frac{\sum_{j=1}^{n_1}(y_{1j} - \bar{y}_1)^2 + \sum_{j=1}^{n_2}(y_{2j} - \bar{y}_2)^2}{n_1 + n_2 - 2}$$
is the pooled estimate of variance. The linear transformation
$$\hat{y} = \hat{a}'x = (\bar{x}_1 - \bar{x}_2)'S_{pooled}^{-1}x$$
maximizes this separation. So, allocate $x$ to $\pi_1$ if
$$(\mu_1 - \mu_2)'\Sigma^{-1}x - \frac{1}{2}(\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 + \mu_2) \ge \ln\!\left[\frac{c(1|2)}{c(2|1)} \cdot \frac{p_2}{p_1}\right] .$$
PS: The above inequality is implemented by substituting the sample quantities $\bar{x}_1, \bar{x}_2, S_{pooled}$ for $\mu_1, \mu_2, \Sigma$ respectively.
Remark: Fisher's linear discrimination rule is equivalent to the minimum ECM rule with equal prior probabilities and equal costs of misclassification, in which case the right-hand side is $\ln 1 = 0$.

3. For several populations (minimum TPM, the total probability of misclassification):
Allocate $x$ to $\pi_k$ if
$$\ln p_k f_k(x) > \ln p_i f_i(x) \quad \text{for all } i \ne k .$$
In particular, if all populations are normally distributed, then:
a. Unequal $\Sigma_i$:
Allocate $x$ to $\pi_k$ if $d_k^Q(x) = \max_{1 \le i \le g} d_i^Q(x)$, where
$$d_i^Q(x) = -\frac{1}{2}\ln|\Sigma_i| - \frac{1}{2}(x - \mu_i)'\Sigma_i^{-1}(x - \mu_i) + \ln p_i .$$
PS: The above rule is implemented by substituting the sample quantities $\bar{x}_i, S_i$ for $\mu_i, \Sigma_i$ respectively.
b. Equal $\Sigma_i = \Sigma$, $i = 1, 2, \ldots, g$:
Allocate $x$ to $\pi_k$ if $d_k(x) = \max_{1 \le i \le g} d_i(x)$, where
$$d_i(x) = \mu_i'\Sigma^{-1}x - \frac{1}{2}\mu_i'\Sigma^{-1}\mu_i + \ln p_i .$$
PS: The above rule is implemented by substituting the sample quantities $\bar{x}_i, S_{pooled}$ for $\mu_i, \Sigma$ respectively.

4. For several populations using Fisher's discrimination (maximum separation):
Now we introduce this method.
It does not assume that the populations are normal, but it needs equal population covariance matrices. We consider the linear combination $Y = a'X$ and the ratio
$$\frac{\text{sum of squared distances from the population means to the overall mean of } Y}{\text{variance of } Y} = \frac{\sum_{i=1}^{g}(\mu_{iY} - \bar{\mu}_Y)^2}{\sigma_Y^2} = \frac{\sum_{i=1}^{g}(a'\mu_i - a'\bar{\mu})^2}{a'\Sigma a} = \frac{a'\left(\sum_{i=1}^{g}(\mu_i - \bar{\mu})(\mu_i - \bar{\mu})'\right)a}{a'\Sigma a} = \frac{a'B_\mu a}{a'\Sigma a} ,$$
where
$$B_\mu = \sum_{i=1}^{g}(\mu_i - \bar{\mu})(\mu_i - \bar{\mu})' .$$

Ordinarily, $\Sigma$ and the $\mu_i$ are unavailable. Suppose we have a random sample of size $n_i$ from population $\pi_i$, $i = 1, 2, \ldots, g$. Denote the $n_i \times p$ data set from population $\pi_i$ by $X_i$ and its $j$th row by $x_{ij}'$. We define
$$\bar{x}_i = \frac{1}{n_i}\sum_{j=1}^{n_i} x_{ij}, \qquad \bar{x} = \frac{\sum_{i=1}^{g}\sum_{j=1}^{n_i} x_{ij}}{\sum_{i=1}^{g} n_i} ,$$
and the sample between-groups matrix
$$B = \sum_{i=1}^{g} n_i(\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})' .$$
The sample covariance matrix of population $\pi_i$ is $S_i$, and
$$S_{pooled} = \frac{\sum_{i=1}^{g}\sum_{j=1}^{n_i}(x_{ij} - \bar{x}_i)(x_{ij} - \bar{x}_i)'}{\sum_{i=1}^{g}(n_i - 1)} = \frac{W}{\sum_{i=1}^{g}(n_i - 1)} .$$

Consequently we want to choose an $\hat{a}$ maximizing
$$\frac{\hat{a}'B\hat{a}}{\hat{a}'S_{pooled}\hat{a}} ,$$
which is equivalent to choosing an $\hat{a}$ maximizing
$$\frac{\hat{a}'B\hat{a}}{\hat{a}'W\hat{a}} .$$
Then $\hat{a}_1 = \hat{e}_1, \hat{a}_2 = \hat{e}_2, \ldots, \hat{a}_s = \hat{e}_s$, where $\hat{e}_1, \hat{e}_2, \ldots, \hat{e}_s$ are the eigenvectors of $W^{-1}B$, ordered by decreasing eigenvalue and scaled so that $\hat{e}'S_{pooled}\hat{e} = 1$, and $s = \min\{g - 1, p\}$. The linear combination $\hat{a}_1'x$ is called the sample first discriminant, and $\hat{a}_k'x$ is called the sample $k$th discriminant.

Remark: If $e$ is an eigenvector of the symmetric matrix $\Sigma^{-1/2}B_\mu\Sigma^{-1/2}$ with eigenvalue $\lambda$, then $a = \Sigma^{-1/2}e$ is an eigenvector of $\Sigma^{-1}B_\mu$ with the same eigenvalue, so the two matrices share their eigenvalues. Similarly, since $W$ is a scalar multiple of $S_{pooled}$, the eigenvectors $\hat{e}_1, \hat{e}_2, \ldots, \hat{e}_s$ of $W^{-1}B$ are also the eigenvectors of $S_{pooled}^{-1}B$.

Moreover,
$$Y = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_s \end{pmatrix} = \begin{pmatrix} a_1'X \\ a_2'X \\ \vdots \\ a_s'X \end{pmatrix}$$
has mean vector
$$\mu_{iY} = \begin{pmatrix} a_1'\mu_i \\ a_2'\mu_i \\ \vdots \\ a_s'\mu_i \end{pmatrix}$$
under population $\pi_i$ and covariance matrix $I$.
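The computation of the sample discriminants described above can be sketched in code. This is a minimal illustration assuming NumPy is available; the function name `fisher_discriminants`, the simulated data, and the group centers are hypothetical and not part of the notes. Instead of forming $W^{-1}B$ directly, the sketch uses a Cholesky factor $W = LL'$ and solves the symmetric problem $L^{-1}BL'^{-1}$, which plays the same role as the symmetric form in the remark (a Cholesky factor is swapped in for the symmetric square root).

```python
import numpy as np

def fisher_discriminants(groups):
    """groups: list of (n_i, p) arrays, one per population (hypothetical helper).

    Returns (A, eigvals, B, W, S_pooled); the columns of A are the
    sample discriminant vectors, ordered by decreasing eigenvalue.
    """
    g = len(groups)
    p = groups[0].shape[1]
    n = sum(len(X) for X in groups)
    means = [X.mean(axis=0) for X in groups]
    grand_mean = sum(X.sum(axis=0) for X in groups) / n

    # Sample between-groups matrix B and within-groups matrix W.
    B = sum(len(X) * np.outer(m - grand_mean, m - grand_mean)
            for X, m in zip(groups, means))
    W = sum((X - m).T @ (X - m) for X, m in zip(groups, means))
    S_pooled = W / (n - g)

    # Symmetric reformulation: with W = L L', the matrix L^{-1} B L'^{-1}
    # has the same eigenvalues as W^{-1} B.
    L = np.linalg.cholesky(W)
    L_inv = np.linalg.inv(L)
    M = L_inv @ B @ L_inv.T
    eigvals, eigvecs = np.linalg.eigh(M)  # eigenvalues in ascending order

    # Keep the s = min(g - 1, p) largest eigenvalues.
    order = np.argsort(eigvals)[::-1][:min(g - 1, p)]
    A = L_inv.T @ eigvecs[:, order]       # eigenvectors of W^{-1} B

    # Rescale each column so that a' S_pooled a = 1.
    A = A / np.sqrt(np.sum(A * (S_pooled @ A), axis=0))
    return A, eigvals[order], B, W, S_pooled

# Simulated data, for illustration only: g = 3 groups, p = 2 variables.
rng = np.random.default_rng(0)
centers = ([0.0, 0.0], [3.0, 1.0], [1.0, 4.0])
groups = [rng.normal(loc=c, scale=1.0, size=(10, 2)) for c in centers]

A, lam, B, W, S_pooled = fisher_discriminants(groups)
scores = groups[0] @ A  # discriminant scores for the first group
print("discriminant vectors (columns):")
print(A)
print("eigenvalues:", lam)
```

With this scaling, `A.T @ S_pooled @ A` is numerically the identity matrix, which is the sample counterpart of the statement that the discriminants have covariance matrix $I$.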
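Then the appropriate measure of squared distance from $Y = y$ to $\mu_{iY}$ is
$$(y - \mu_{iY})'(y - \mu_{iY}) = \sum_{j=1}^{s}(y_j - \mu_{iY_j})^2 .$$

Allocate $x$ to $\pi_k$ if
$$\sum_{j=1}^{s}(y_j - \mu_{kY_j})^2 = \sum_{j=1}^{s}[a_j'(x - \mu_k)]^2 \le \sum_{j=1}^{s}[a_j'(x - \mu_i)]^2 \quad \text{for all } i \ne k .$$
PS: The above inequality is implemented by substituting the sample quantities $\bar{x}_i, \hat{a}_j$ for $\mu_i, a_j$, where $\hat{a}_j$ is defined as above.

In fact, Fisher's discrimination among several populations is a special case of the "normal theory" discriminant score $d_i(x)$. When all $p$ discriminants are used,
$$\sum_{j=1}^{p}(y_j - \mu_{iY_j})^2 = \sum_{j=1}^{p}[a_j'(x - \mu_i)]^2 = (x - \mu_i)'\Sigma^{-1}(x - \mu_i) = -2d_i(x) + x'\Sigma^{-1}x + 2\ln p_i ,$$
where $y_j = a_j'x$, $a_j = \Sigma^{-1/2}e_j$, and $e_j$ is an eigenvector of $\Sigma^{-1/2}B_\mu\Sigma^{-1/2}$, with $B_\mu$ the population between-groups matrix. The discriminants beyond the first $s$ have zero eigenvalue and contribute the same amount for every population, so with equal prior probabilities the two rules make the same allocation.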
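The identity linking the squared distance $(x - \mu_i)'\Sigma^{-1}(x - \mu_i)$ to the linear score $d_i(x)$ can be checked numerically. The following is a minimal sketch assuming NumPy; the means, covariance matrix, priors, and the point `x` are made-up illustrative values, and `linear_scores` is a hypothetical helper name rather than anything defined in the notes.

```python
import numpy as np

def linear_scores(x, means, S, priors):
    """Linear discriminant scores
    d_i(x) = m_i' S^{-1} x - 0.5 * m_i' S^{-1} m_i + ln p_i, one per population."""
    S_inv = np.linalg.inv(S)
    return np.array([m @ S_inv @ x - 0.5 * (m @ S_inv @ m) + np.log(p)
                     for m, p in zip(means, priors)])

# Illustrative (made-up) quantities standing in for x-bar_i, S_pooled, p_i.
means = [np.array([0.0, 0.0]), np.array([3.0, 1.0]), np.array([1.0, 4.0])]
S = np.array([[1.0, 0.3],
              [0.3, 2.0]])
priors = np.array([1 / 3, 1 / 3, 1 / 3])
x = np.array([2.0, 1.5])

d = linear_scores(x, means, S, priors)
k = int(np.argmax(d))  # index of the population x is allocated to

# Squared distances (x - m_i)' S^{-1} (x - m_i) ...
S_inv = np.linalg.inv(S)
maha = np.array([(x - m) @ S_inv @ (x - m) for m in means])
# ... and the right-hand side -2 d_i(x) + x' S^{-1} x + 2 ln p_i.
rhs = -2 * d + x @ S_inv @ x + 2 * np.log(priors)

print("scores d_i(x):", d)
print("allocated to population index:", k)
print("identity holds:", np.allclose(maha, rhs))
```

Because the priors are equal in this sketch, the population with the largest score is also the one with the smallest squared distance, so `np.argmax(d)` and `np.argmin(maha)` agree.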
