Name___________________________

Microarray Unit Questionnaire

1. A “dye swap” design in microarray analysis …
a) … is designed as a biological replication, in which each sample is labeled alternately with either red or green dye and hybridized against the array.
b) … allows the researcher to assess variation in the print quality between one array and the next.
c) … statistically counts as one experiment, not two.
d) … is a method with which the researcher can test the quality of the cDNA synthesis reaction.
e) … allows for comparison of the rate of incorporation of the green fluorophore versus the red fluorophore into each cDNA copy.

2. Microarray data are often transformed into logarithmic values …
a) … because logarithmic transformation is a mathematical way to avoid biologically impossible “negative gene expression” values. These values can then be flagged as technical artifacts.
b) … because the log2 transformation converts fractions smaller than 1 into negative numbers, so that genes with the same magnitude of “fold-change” (up or down) have the same absolute value.
c) … because tests to assess the statistical difference between treatment and control samples will not provide a p-value unless the data are displayed as logarithms.
d) … because the ratio between two dyes (red/green or green/red) mathematically cannot be displayed in non-logarithmic values.
e) … because it is the mean (as opposed to the median) that is used in the statistical analysis.

3. The “null hypothesis” in microarray data …
a) … states that a gene has the same expression in the treatment as in the control group.
b) … states that two genes have different expression levels in the treatment and the control group.
c) … assumes that green and red dye in a dye swap analysis incorporate into the same cDNAs of the probe at different rates.
d) … states that the M value for the same gene in treatment versus control groups does not equal 0.
e) … states that the mean of all repeated measurements of the same gene, as it appears on an M/A plot, lies either above or below the 0 line.

4. Assume that the statistical analysis of several repeated data points for the same gene from several arrays results in a p-value > 0.05. This means that …
a) … the alternative hypothesis can be rejected, which means that the control and treatment group are indeed different in their gene expression.
b) … according to the data, either the treatment or the control group does not contain the gene in question in its DNA.
c) … the null hypothesis can be rejected, which means that the control and treatment group are indeed different in their gene expression.
d) … the null hypothesis cannot be rejected, which means the control and treatment group are not different from each other in their gene expression.
e) … statistically, the question of whether or not the gene is expressed differently in the control and treatment groups cannot be answered.

5. If microarrays are to be evaluated statistically, it is important to have at least one replicate, …
a) … because no biological data can be obtained from the analysis of a single slide.
b) … but a dye swap design would not provide replicated data.
c) … which, furthermore, has to include a technical replicate, not just a biological replicate, if the data are to have any statistical value.
d) … because if you have only one value for each gene, you cannot perform any statistics on its value.
e) … because each replicate gives a value for either the control or the treatment group (red or green).

6. In microarray analysis, thousands of data points are analyzed, resulting in thousands of p-values after statistical analysis has been performed. This results in “multiple testing” problems because …
a) … due to the large number of genes analyzed, the p-value for each gene is less likely to reflect true statistical (in-)significance for the gene that it is assigned to.
b) … due to the large number of genes analyzed, the likelihood that several of the assigned p-values are false positives is close to 100%.
c) … due to the large number of genes analyzed, the likelihood that there are false positives among those p-values is equal to or less than 5% (0.05).
d) … only one p-value is calculated to reflect the significance for all represented genes on the array.
e) … each p-value has to be re-calculated as many times as there are genes on the array.

7. The Bonferroni correction method …
a) … calculates p-values not by the use of t-statistics but by assigning ranks to each gene and dividing the rank by the total sample size.
b) … is basically a method that, due to the constraints of multiple testing issues of thousands of genes, considers p-values as significant even when they are greater than 0.05.
c) … is basically a method that, due to the constraints of multiple testing issues of thousands of genes, considers p-values as significant only when they are much smaller than 0.05.
d) … does not rely on a p-value but compares each gene with every other gene on the array in pairwise comparisons to eliminate the problems of multiple testing.
e) … calculates a more stringent p-value because it can apply a greater degree of freedom due to the thousands of replicated data points on each array.