SIGMA: A Platform to Visualize and Analyze DNA Copy Number Microarray Data
  Raj Chari, PhD Student
  BC Cancer Research Centre
  Department of Cancer Genetics and Developmental Biology
  APIII Conference, August 17th, 2006

Overview
  - DNA microarrays and array comparative genomic hybridization (array CGH)
  - Architecture of SIGMA
  - Examples
  - Current/Future directions

Studying DNA changes
  - Methods to study DNA aberrations are getting better => movement to array-based methods
  - Different from expression microarrays
      - Measure genomic content vs. RNA transcript levels
      - Dynamic range of values is much smaller
      - Discrete vs. continuous data (segmentation algorithms)

Array CGH Technology
  Chari et al., Cancer Informatics, 2006, 2, 48-58

Rationale for SIGMA
  - Many different platforms for array CGH
  - Software developed tends to be platform-specific
  - Inefficient data processing pipeline
  - Need to encapsulate data processing and support different types of data
  => System for Integrative Genomic Microarray Analysis (SIGMA)

Architecture of SIGMA
  [Diagram: Java application (main interface), connected via JDBC to a local MySQL database and to a server MySQL database, and via JGR to R (analysis)]

Functionalities of SIGMA
  - Importing data from multiple array CGH platforms
  - Built-in segmentation algorithms
      - DNAcopy
      - Edge detection-based segmentation (Poster #105)
  - Integration with other types of DNA microarray-based assays
      - Chromatin immunoprecipitation on microarray chips (ChIP on chip) (Poster #116) => Histone acetylation
      - Methylation Dependent Immunoprecipitation array CGH (MeDIP array CGH) (Poster #120) => DNA methylation
      - Gene expression => RNA levels

Example: cancer cell line database
  - “Stripped-down” version of SIGMA
  - Database of pre-processed data
  - Poster #104
  - Case #1: Examining a single sample for copy number aberrations
  - Case #2: Identifying recurrent alterations in lung adenocarcinoma

H2087 Lung cancer cell line
  A. Whole genome karyogram
  B. Chromosome 8
  C. Region on arm 8q
  D. Highlight and find genes

Identifying recurrent alterations
  [Figure: Individual profile -> Segment & curate changes -> Detection of alterations (calls of -1 and +1) -> Frequency of alterations (aligning many profiles; frequency axis running to 100% in each direction)]

Summary of 24 Lung Adenocarcinomas

Current / Future Directions
  - Database of cancer cell lines will soon be publicly available
  - Full application to be completed by October
  - Integration with proteomics: DNA-RNA-Protein
  - Multi-dimensional views of the cell will enhance understanding of pathogenesis => “Systems” approach

Acknowledgements
  - Wan Lam lab
  - Calum MacAulay
  - Funding organizations:
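
Sketch: frequency of alterations
  The "Frequency of alterations (aligning many profiles)" step can be illustrated with a minimal sketch. It is not SIGMA's implementation (SIGMA is a Java application with R for analysis); the function name, the sample names, and the data layout are assumptions made for illustration. Each profile is assumed to be already segmented and called, with one value per aligned locus: -1 for loss, 0 for no change, +1 for gain.

```python
from typing import Dict, List, Tuple


def alteration_frequency(
    calls: Dict[str, List[int]],
) -> Tuple[List[float], List[float]]:
    """Return per-locus gain and loss frequencies across samples.

    `calls` maps a (hypothetical) sample name to its list of calls,
    one per locus, where -1 = loss, 0 = no change, +1 = gain.
    All samples must cover the same loci in the same order.
    """
    profiles = list(calls.values())
    if not profiles:
        raise ValueError("at least one profile is required")

    n_loci = len(profiles[0])
    if any(len(p) != n_loci for p in profiles):
        raise ValueError("all profiles must have the same number of loci")

    n_samples = len(profiles)
    # Fraction of samples called as gained / lost at each locus.
    gain = [sum(1 for p in profiles if p[i] > 0) / n_samples for i in range(n_loci)]
    loss = [sum(1 for p in profiles if p[i] < 0) / n_samples for i in range(n_loci)]
    return gain, loss


# Two made-up profiles over four aligned loci (illustrative values only).
example_calls = {
    "sample_1": [-1, +1, +1, 0],
    "sample_2": [-1, 0, +1, +1],
}
gain_freq, loss_freq = alteration_frequency(example_calls)
print("gain:", gain_freq)
print("loss:", loss_freq)
```

  Plotting the gain frequencies in one direction and the loss frequencies in the other, locus by locus along a chromosome, gives the kind of summary view used to spot recurrent alterations across many samples.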