The PAM method is used to find the genes in a set of DNA chips that best classify samples, and to validate a set of genes for classifying samples. The goal is to find the smallest subset of genes that gives the best classification.

PAM can be divided into 2 main parts:

- The first step is to calculate a parameter characterizing each gene. This parameter makes it possible to generate the various subsets of genes that are tested in the second step. Starting with all genes, subsets are produced by selecting the genes whose parameter (in absolute value), reduced by Δ, is greater than or equal to zero. Δ is a threshold that starts at zero for the first group (all genes) and increases up to the highest parameter value (Nearest Shrunken Centroids method).
- The second step is a K-fold cross-validation. Samples are divided into K equal parts. Based on K-1 of the parts, the selected genes are used to predict the classes of the Kth part, and an error rate is computed. This prediction is repeated K times, so that each part is predicted from the gene expression of the other K-1 parts, and an average error rate is calculated.

The evolution of the average prediction error is used to find the limiting value of Δ that produces the smallest group of genes with the lowest average error rate. This group of genes allows the class of each sample to be predicted with the lowest probability of error. To validate a set of genes, the average error rate must be zero for Δ = 0 and increase as Δ increases.

There are n samples in K classes (here K is the number of classes, not the number of cross-validation parts). For each gene, $x_{ij}$ is the expression of gene i in sample j, $\bar{x}_{ik}$ is the average expression of gene i in class k (the class centroid), and $\bar{x}_i$ is the overall centroid for gene i:

$$\bar{x}_{ik} = \frac{1}{n_k} \sum_{j \in C_k} x_{ij}$$

$$\bar{x}_i = \frac{1}{n} \sum_{j=1}^{n} x_{ij}$$

where $C_k$ is the set of samples in class k and $n_k$ is the number of samples in that class.

The parameter characterizing each gene, $d_{ik}$, is calculated from $\bar{x}_{ik}$ and $\bar{x}_i$:

$$d_{ik} = \frac{\bar{x}_{ik} - \bar{x}_i}{S_i}$$

where $S_i$ is the pooled within-class standard deviation of gene i.
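
The first step can be illustrated with a minimal Python sketch. It follows the simplified formula given in the text (the published method adds further normalizing terms to the denominator), and it assumes that the pooled within-class variance uses the usual n - K denominator. The function names `gene_scores`, `shrink` and `surviving_genes`, and the toy data, are hypothetical and chosen only for this illustration. Selection is done with the soft-thresholding rule sign(d)(|d| - Δ)+, keeping a gene while at least one of its class scores remains non-zero.

```python
import math


def gene_scores(x, labels):
    """Return d[i][k] for every gene i and class k.

    x      : list of rows, one per gene; x[i][j] is gene i in sample j
    labels : list with the class label of each sample j
    Assumes every gene varies within at least one class (pooled SD > 0).
    """
    classes = sorted(set(labels))
    n = len(labels)
    scores = []
    for row in x:
        overall = sum(row) / n                       # overall centroid of the gene
        centroid = {}
        for c in classes:
            values = [v for v, lab in zip(row, labels) if lab == c]
            centroid[c] = sum(values) / len(values)  # class centroid
        # Pooled within-class SD (assumed form): squared deviations from each
        # sample's own class centroid, summed and divided by n - K.
        ss = sum((v - centroid[lab]) ** 2 for v, lab in zip(row, labels))
        pooled_sd = math.sqrt(ss / (n - len(classes)))
        scores.append([(centroid[c] - overall) / pooled_sd for c in classes])
    return scores


def shrink(d, delta):
    """Soft threshold: move d toward zero by delta, stopping at zero."""
    return math.copysign(max(abs(d) - delta, 0.0), d)


def surviving_genes(scores, delta):
    """Indices of genes whose shrunken score is non-zero in at least one class."""
    return [i for i, row in enumerate(scores)
            if any(shrink(d, delta) != 0.0 for d in row)]


# Toy data: 3 genes, 4 samples, 2 classes (illustrative values only).
x = [
    [1.0, 2.0, 7.0, 8.0],   # gene 0: strong class difference
    [4.0, 6.0, 5.0, 6.0],   # gene 1: weak class difference
    [1.0, 3.0, 3.0, 5.0],   # gene 2: moderate class difference
]
labels = ["A", "A", "B", "B"]

scores = gene_scores(x, labels)
for delta in (0.0, 0.5, 1.0, 5.0):
    print(delta, surviving_genes(scores, delta))
```

On this toy data the gene subsets shrink as the threshold grows: all three genes at Δ = 0, genes 0 and 2 at Δ = 0.5, gene 0 alone at Δ = 1, and no gene at Δ = 5. In the second step, each of these subsets would be evaluated by cross-validation.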

The subsets of genes are produced by calculating the shrunken parameter $d'_{ik}$:

$$d'_{ik} = \mathrm{sign}(d_{ik}) \, (|d_{ik}| - \Delta)_+$$

where $(\cdot)_+$ denotes the positive part: if $|d_{ik}| - \Delta < 0$ then $d'_{ik} = 0$. A gene for which this holds in every class is eliminated.

References:
- The Stanford website: http://www-stat.stanford.edu/%7Etibs/PAM/
- "Diagnosis of multiple cancer types by shrunken centroids of gene expression", PNAS 2002 99:6567-6572 (May 14).