"Calibration of microarray hybridisation readout for genome-scale profiling of gene expression with improved specificity"

Andreas Sommer
FH main supervisor: Dr. Anton Beyer

1 Introduction

The quality of microarray measurements is affected by a multitude of parameters covering the entire process, from details of wet-lab protocols to the subsequent image analysis steps. Recent developments by members of the Chair of Bioinformatics, University of Natural Resources and Applied Life Sciences, Vienna, have yielded a novel quantitative approach for the comparative assessment of the specific information content of multisample hybridisation signals. This allows an informed adaptation of process parameters and a comparison of available technologies.

In this thesis, the method was used to evaluate the performance of four commercial microarray scanner models:
– Axon GenePix 4000A (Molecular Devices Corporation, Sunnyvale, CA, USA)
– Agilent G2565BA (Agilent Technologies, Santa Clara, CA, USA)
– Axon GenePix 4000B (Molecular Devices Corporation, Sunnyvale, CA, USA)
– Tecan LS Reloaded (Tecan Group Ltd., Grödig, Austria)

Additionally, we assayed the impact of dynamic range extension through multiple scans. The first scan was performed at the lower PMT (photomultiplier tube) gain, namely the gain at which the first spots started to show saturation. The second scan was performed at the highest possible PMT setting.

Each slide was also scanned in two orientations: first in the standard alignment, and second after rotating the slide by 180°. The rotated scan mitigates systematic spatial effects introduced by the scanner and strengthens the statistical analysis by increasing the number of analysed scans to 8 measurements per slide.

After scanning, the data of the low and the high PMT scans were combined: a linear regression was calculated using the scan at the lower PMT gain, and the saturated spots were replaced by extrapolated values, thus creating a new combined data set.
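The scan combination described above can be illustrated with a minimal sketch. It is not the pipeline used in the thesis: it assumes a 16-bit saturation ceiling of 65535, assumes that the regression is fitted on spots that are unsaturated in both scans, and uses hypothetical names (`fit_line`, `combine_scans`) and made-up intensities.

```python
from statistics import mean

SATURATION = 65535  # assumed 16-bit scanner ceiling; not stated in the text


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def combine_scans(low, high, saturation=SATURATION):
    """Merge a low- and a high-PMT scan of the same spots.

    Spots below saturation in the high scan keep their measured value;
    saturated spots are replaced by a value extrapolated from the low scan.
    """
    # Fit the regression only on spots that are unsaturated in both scans.
    pairs = [(l, h) for l, h in zip(low, high)
             if l < saturation and h < saturation]
    slope, intercept = fit_line([p[0] for p in pairs], [p[1] for p in pairs])
    return [h if h < saturation else slope * l + intercept
            for l, h in zip(low, high)]


# Illustrative intensities only: the last spot saturates in the high scan.
low = [100, 200, 400, 800, 20000]
high = [400, 800, 1600, 3200, 65535]
combined = combine_scans(low, high)
print(combined)
```

In this toy example the unsaturated spots follow a slope of 4 with zero intercept, so the saturated spot is extrapolated to about 80000 on the scale of the high PMT scan, beyond the range the detector can record directly.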
Four commonly used normalisation methods were applied in order to assess the influence of normalisation on scanner performance:
– Location Normalisation: Adjustment of the mean intensity value for a series of experiments.
– Location and Scale Normalisation: Adjustment of mean values and scaling of the standard deviation over the whole intensity range.
– Quantile Normalisation: Forces the microarray data of several experiments into the same distribution, including adjustment to a general baseline.
– VSN (Variance Stabilizing Normalisation): Data transformation using a parametrised asinh function, which renders the variance approximately independent of the mean intensity.

2 Results

The Agilent scanner performed best under vsn and quantile normalisation. The quantile-normalised low PMT image from the Agilent scanner yielded the highest sample separability information. The GenePix 4000B ranked second under these two normalisation methods and performed better than the Agilent scanner under location and location/scale normalisation. The Tecan scanner always performed worst, independent of the normalisation method used.

Quantile normalisation produced the best results for all scanners, followed by vsn normalisation, while location normalisation, being the most conservative transformation, showed the smallest improvements. Quantile and vsn normalisation also significantly reduced the median absolute deviation, indicating an effective calibration over all measurements.

Combining the low and high PMT results did not yield any improvement when applied to untransformed data, but better results were obtained when normalised data were analysed. For the two less conservative normalisations, quantile and vsn, the combined results showed an improvement over the high PMT data in most cases (7 out of 8) and often (5 out of 8) an improvement over both the low and the high PMT data.
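The normalisation methods compared here, and the median absolute deviation used as a spread measure, can be sketched in a few lines. This is an illustrative sketch only and not the analysis code of the thesis: the function names are hypothetical, all arrays are assumed to have equal length, ties are broken by position, and the asinh parameters `a` and `b` are fixed placeholders, whereas a real variance stabilizing normalisation estimates them from the data.

```python
import math
from statistics import mean, median


def location_normalise(arrays):
    """Shift each array so that all arrays share the same mean."""
    grand = mean(mean(a) for a in arrays)
    return [[v - mean(a) + grand for v in a] for a in arrays]


def quantile_normalise(arrays):
    """Force equal-length arrays into one common distribution.

    The reference distribution is the mean of the sorted arrays; each value
    is replaced by the reference value at its rank within its own array.
    """
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [mean(s[i] for s in sorted_arrays) for i in range(n)]
    result = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # indices by rank
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        result.append(out)
    return result


def asinh_transform(values, a=0.0, b=1.0):
    """Parametrised asinh transform asinh(a + b * x).

    a and b are placeholder values, not fitted parameters.
    """
    return [math.asinh(a + b * v) for v in values]


def mad(values):
    """Median absolute deviation from the median."""
    med = median(values)
    return median(abs(v - med) for v in values)


# Two toy "arrays" of three spots each (made-up intensities).
arrays = [[5, 2, 3], [4, 1, 6]]
located = location_normalise(arrays)
quantiled = quantile_normalise(arrays)
print(quantiled)
print(mad([1, 2, 3, 4, 100]))
```

After quantile normalisation every array contains exactly the same set of values, only in a different order. This is why the method is regarded as an aggressive transformation compared with a simple shift of the mean.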
3 Discussion

In contrast to most published calibration studies, we chose an approach that quantifies the information content of microarray images. The applied method proved exceptionally useful in detecting performance differences between the scanner models. The results reveal strong performance differences between the four scanner models, favouring the Agilent G2565BA scanner. Observed aberrations were discussed with the scientists in charge of each scanner, leading to improved scanning settings.

The performance improvements yielded by four popular normalisation methods highlight the importance of normalisation in microarray analysis. Quantile normalisation had the strongest positive impact on the data, followed by variance stabilizing normalisation. As quantile normalisation is thought to be an aggressive transformation that may distort the data structure, we recommend its use only in combination with careful data exploration, whereas vsn normalisation could be more suitable for routine analysis.

Extending the dynamic range by repeated scanning and combination of the data improved performance in many cases, especially for the GenePix 4000B and the Tecan LS Reloaded. However, the best results were not always achieved by data combination, and low PMT scans yielded more information than high PMT scans. Both observations raise serious questions about scanner characteristics at high PMT settings, which will be addressed in future analyses.

In summary, this thesis presented an exemplary application of a calibration method based on sample separability information. As platform establishment continues at the host group, several parameters, including hybridisation temperature, buffer composition and process durations, will be calibrated using the data analysis pipeline designed in the course of this thesis.

4 Literature

Dudley A.M., Aach J., Steffen M.A.
and Church G.M., Measuring absolute expression with microarrays with a calibrated reference sample and an extended signal intensity range. Proc Natl Acad Sci U S A. 2002 May 28;99(11):7554-9.
Jaluria P., Konstantopoulos K., Betenbaugh M. and Shiloach J., A perspective on microarrays: current applications, pitfalls, and potential uses. Microb Cell Fact. 2007 Jan 25;6:4.
Lyng H., Badiee A., Svandsrud D.H., Hovig E., Myklebost O. and Stokke T., Profound influence of microarray scanner characteristics on gene expression ratios: analysis and procedure for correction. BMC Genomics. 2004 Feb 3;5(1):10.
Quackenbush J., Computational approaches to analysis of DNA data. Methods Inf Med. 2006;45 Suppl 1:91-103.
Yuen T., Wurmbach E., Pfeffer R.L., Ebersole B.J. and Sealfon S.C., Accuracy and calibration of commercial oligonucleotide and custom cDNA microarrays. Nucleic Acids Res. 2002 May 15;30(10):e48.