MICROARRAY SUBMISSIONS TO EXPERIBASE

by

Aidan Rawle Downes

Submitted to the Department of Electrical Engineering and Computer Science in Partial Fulfillment of the Requirements for the Degrees of Master of Engineering in Electrical Engineering and Computer Science at the Massachusetts Institute of Technology

May 19, 2005

Copyright 2005 Aidan R. Downes. All rights reserved.

The author hereby grants to M.I.T. permission to reproduce and distribute publicly paper and electronic copies of this thesis and to grant others the right to do so.

Author: Department of Electrical Engineering and Computer Science, May 19, 2005

Certified by: C. Forbes Dewey, Jr., Thesis Supervisor

Accepted by: Arthur C. Smith, Chairman, Department Committee on Graduate Theses

ABSTRACT

MICROARRAY SUBMISSIONS TO EXPERIBASE

By Aidan Rawle Downes

Thesis Supervisor: Professor C. Forbes Dewey, Jr., Professor of Mechanical Engineering / Bioengineering

Experibase is an experimental database that supports the storage of data from leading biological experiment techniques. The Experibase ontology was extended to include a robust representation of data from microarrays, a leading experimental technique. The microarray submission system takes advantage of Experibase's new microarray storage capabilities by allowing biologists to submit microarray data to Experibase using an application that they are already familiar with. The transformation of data from the submitted format to a format suitable for Experibase takes place without the submitter's knowledge, reducing the need for an Experibase-specific submission application.

TABLE OF CONTENTS

Introduction
    Thesis Outline
Overview of Relevant Technologies
    DNA Microarrays
    Experibase
    The MIAME Standard
    ArrayExpress and MIAMExpress
Design
    Project Motivation
    Project Requirements
    Project Design
MIAMExpress Customizations
    Automatic User Login and/or Registration
    Data Synchronization
Experibase Microarray Schema
    Introduction
    Experibase Common Package
    Study Plan
    Administration Package
    Sample
    Experiment Package
Results and Conclusions

LIST OF FIGURES

Figure 1: Image scan of a hybridized microarray
Figure 2: The five packages of Experibase and their relations
Figure 3: Overview of the components of the system and the relationships between them
Figure 4: MIAMExpress Submission data table and all its related tables
Figure 5: Flow of data as submitted through MIAMExpress
Figure 6: The identifiable interface, which is realized by all Experibase classes
Figure 7: A UML class diagram of the Study Plan package
Figure 8: The Administration package
Figure 9: Sample.PhysicalSample class diagram
Figure 10: Sample.MeasuredSample sub-package
Figure 11: The experiment package
Figure 12: Experiment.Protocol sub-package diagram

ACKNOWLEDGMENTS

The author wishes to acknowledge his supervisor, C. Forbes Dewey, for his guidance and help throughout the year. I would also like to thank my colleagues, Howard Chou and Shiva Ayyadurai, and Ronald Taylor and Abigail Corrigan at PNNL. I would like to thank my family for being there for me, especially my mother, father, and sister. They are the most important people in the world, and I know they think the same of me.

GLOSSARY

Bioinformatics. The use of ideas, algorithms and techniques from applied mathematics, informatics, statistics, and computer science to solve or aid in the solving of biological problems.

Microarray. An array of DNA sequences, grouped into spots (also known as probes) and attached to a solid surface like glass or silicon. Used in experiments to measure the expression level of genes attached to the array.

MIAME. Minimum Information about Microarray Experiments. A standard that defines the minimum information that must be reported about microarray experiments, in order to ensure the interpretability of the experimental results generated as well as their potential independent verification.

MAGE. Microarray Gene Expression. A group that aims to provide a standard for the representation of microarray expression data that would facilitate the exchange of microarray information between different data systems. Responsible for MAGE-OM, a microarray object model, and MAGE-ML, a microarray data exchange format.

Ontology. A concise and unambiguous description of principle relevant entities with their potential, valid relations to each other.

Chapter 1

INTRODUCTION

DNA microarrays provide a simple and natural vehicle for exploring the genome in a way that is both systematic and comprehensive [1].
Using the concept of complementary base pairing in DNA and RNA, DNA microarrays allow biologists to study various biological phenomena such as gene expression. Biologists can determine which genes are being expressed in a given sample and how actively each gene is being expressed. Other advantages of microarrays are that they are relatively cheap to produce, can be produced quickly, and are easy to control [1].

It is no surprise that experimental labs that use DNA microarrays produce a lot of data. This data includes image data from scanners, results from software analysis packages, and data about the experiment itself, including the protocols used, the samples used, and the experiment design. In order to take full advantage of experiment results, the data produced should be stored in a medium where it is easily accessible, such as an experimental database. This thesis addresses the need for storage mediums for microarray data, the input of that data, and its export.

This thesis consists of the design and implementation of a system for collecting output data from microarray experiments, processing that data, and storing the resulting information in an experimental database. This system consists of three distinct components: a web application for entering microarray experiment data; another web application for system and user administration; and an experimental database, Experibase [2].

The data entry web application is a modified version of the MIAMExpress web application, a data submission tool developed and supported by the European Bioinformatics Institute (EBI) [3]. The target user base of the system will be experimental biologists. Many of these biologists are familiar with the MIAMExpress user interface and workflow. Using an augmented version of MIAMExpress reduces the time that these biologists have to devote to learning a new application user interface.

The second component of the system is the user administration web application.
This application controls access to the submission tool using individual and group security. This component also provides some access to the experimental database. Users of this application can export data from the experimental database in XML format to other databases.

The third component of the system is the experimental database. An experimental database is a database whose schema (ontology) is designed for storing experimental data. The experimental database of choice is Experibase. Experibase accommodates the storage and retrieval of data from leading biological experiment techniques in a single database. The Experibase schema was updated to make it MIAME (Minimum Information about Microarray Experiments) compliant. MIAME is a proposed standard that describes the information that should be stored for microarray experiments.

Thesis Outline

Chapter 2 presents an overview of the relevant technologies used throughout this thesis. Chapter 3 discusses the system design. Chapter 4 discusses the customizations made to MIAMExpress. Chapter 5 discusses the Experibase schema. Chapter 6 reports the results and conclusions.

Chapter 2

OVERVIEW OF RELEVANT TECHNOLOGIES

DNA Microarrays

A DNA microarray consists of an array of DNA sequences, grouped into spots (also known as probes) and attached to a solid surface like glass or silicon. The most common use of microarrays is to measure the mRNAs transcribed by the different genes represented on the microarray. RNA is extracted from sample cells and then converted to cDNA or cRNA. The resulting cDNA or cRNA is then tagged with a fluorescent compound. Because of complementary base pairing, a cDNA or cRNA strand will hybridize with the probe that contains a DNA sequence complementary to the strand's own sequence. Spots where cDNA or cRNA sequences have hybridized can be detected visually by the fluorescent glow emitted by the hybridized sequences. The fluorescence intensity of each probe is analyzed by software packages.
The level of intensity of a probe indicates whether the cells in the sample have recently transcribed, or ceased transcribing, a gene that contains the probed sequence. The intensity of the fluorescence is proportional to the number of copies of a particular mRNA that were present and is used to quantify the expression level of the gene. Microarrays are usually used to determine which genes in a cell are activated or deactivated when the experimental conditions change.

Figure 1 shows an image scan of a microarray. Two different samples were used. One sample was labeled with a red fluorescent compound and the other with a green fluorescent compound. The yellow spots indicate genes that were expressed in both samples.

Figure 1: Image scan of a hybridized microarray

Experibase

Experibase is an experimental database designed by Professor C. Forbes Dewey's group at M.I.T. Its data model was designed by forming a composite of the data storage needs of several leading experimental techniques [2]. As a result, Experibase can store data from leading experimental techniques. Currently the data model supports data from the following experimental techniques:

* Gel Electrophoresis
* Flow Cytometry
* Mass Spectrometry

The schema can also be expanded to add data from even more experimental techniques.

One of the major ideas behind Experibase is that most biological experimental data models can be partitioned into five distinct packages in such a way that data entities within a package share similarities in function, data stored, and relationships with other data entities. This similarity is exploited in the Experibase data model by creating, within each package, data entities that contain the data found in all experimental techniques. Data entities that are specific to a particular experimental technique inherit from these common data entities. The five packages of Experibase are: study plan, sample, experiment, high level analysis, and administration [2].
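The inheritance pattern described above, in which technique-specific entities extend common entities, can be illustrated with a minimal Python sketch. The class and field names here are hypothetical and chosen only for illustration; they are not taken from the Experibase schema.

```python
from dataclasses import dataclass


@dataclass
class CommonExperiment:
    """Hypothetical common entity: fields shared by every technique."""
    name: str
    protocol: str


@dataclass
class MicroarrayExperiment(CommonExperiment):
    """Hypothetical technique-specific entity; inherits the common fields."""
    array_design: str = ""


@dataclass
class FlowCytometryExperiment(CommonExperiment):
    """A second hypothetical technique-specific entity."""
    channel_count: int = 0


def describe(experiment: CommonExperiment) -> str:
    # Touches only the common fields, so it works for any technique.
    return f"{experiment.name} ({experiment.protocol})"
```

Code written against the common entity, such as `describe`, keeps working unchanged when a new technique-specific entity is added.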
Figure 2 shows the five packages and the relationships between the packages. The individual package descriptions are as follows:

* Study Plan Package: Data classes in this package describe information about experimental projects. This includes hypotheses, references, and project reports.
* Sample Package: Data classes in this package describe information about experimental samples. This includes ideal biological samples and derived samples.
* Experiment Package: Data classes in this package describe information about experiments. This includes experiment design, experiment protocols, raw data, and preprocessed data.
* High Level Analysis Package: This package contains classes that represent advanced analysis of the experiment results. This can include data output from analytic and statistics software applications.
* Administration Package: This package contains classes that represent contact, audit and security information about an experiment. This includes the experimenter, laboratory, institution and permissions.

[Figure 2 diagram: the StudyPlan, Sample, Administration, HighLevelAnalysis, and Experiment packages. Legend: one line style marks a dependency between A and B (changes in A can cause changes in B); the other marks a reference.]

Figure 2: The five packages of Experibase and their relations

The MIAME Standard

The Minimum Information about Microarray Experiments (MIAME) standard describes the information that microarray data sources should contain. The standard was created with the belief that it is necessary to define the minimum information that must be reported, in order to ensure the interpretability of the experimental results generated using microarrays as well as their potential independent verification [5]. MIAME was created and is maintained by the Microarray Gene Expression Database group (MGED), a widely supported group that creates microarray data standards. The MIAME standard consists of six parts: experimental design, array design, samples, hybridizations, measurements, and normalization [5].
Each part describes the data that should be represented in any MIAME compliant database.

Experibase was extended to store microarray data. The MIAME standard was used as a guideline in designing the microarray experiment data model. The following guidelines were followed in the extension of Experibase:

* Information stored for experiment design:
    o The goal or name of the experiment
    o A brief description of the experiment
    o Keywords, for example, time course, cell type comparison, array CGH
    o Experimental factors: the parameters or conditions tested, such as time, dose, or genetic variation
    o Experimental design: relationships between samples, treatments, extracts, labeling, and arrays
    o Quality control steps taken
    o Links to the publication, any supplemental websites or database accession numbers
* Information stored about the samples used, extract preparation and labeling:
    o The origin of each biological sample and its characteristics
    o Manipulation of biological samples and protocols used
    o Experimental factor value for each experimental factor, for each sample
    o Technical protocols for preparing the hybridization extract and labeling
* Information stored about hybridization procedures and parameters:
    o The protocol and conditions used for hybridization, blocking and washing
* Measurement data stored:
    o The raw data: the feature extraction output from the array scanner. This includes the intensity for each probe on the array.
    o The normalized and summarized data. This can include the averaged normalized log ratios of the intensities.
    o Image scanning hardware and software, and processing procedures and parameters
* Array design:
    o Array platform, surface and coating specifications
    o Spotting protocols and product information for commercial array designs
    o Array spot and reporter information. This includes the location of each spot.

ArrayExpress and MIAMExpress

ArrayExpress is a public database of microarray gene expression data at the EBI.
It is a generic gene expression database designed to hold data from all microarray platforms. The ArrayExpress object model is based on MAGE-OM (Microarray Gene Expression Object Model), an object model for microarray experiments developed by the MGED [4]. Using MAGE-OM as the object model ensures that ArrayExpress is MIAME compliant, as MAGE-OM is MIAME compliant. ArrayExpress accepts data in MAGE-ML (Microarray Gene Expression Markup Language). MAGE-ML is an XML schema based on MAGE-OM.

MIAMExpress is a data submission web application developed by the EBI. Data entered into MIAMExpress can be exported in an XML format suitable for acceptance by ArrayExpress. It is a well designed application with a simple and very intuitive user interface. As its name suggests, MIAMExpress is MIAME compliant. MIAMExpress is a Perl CGI application, which stores most of its data in a MySQL database. Raw and preprocessed experiment data files are not parsed; instead, these files are stored on the server's file system and their locations are stored in the database.

Chapter 3

DESIGN

Project Motivation

The project was commissioned by Pacific Northwest National Laboratory (PNNL). PNNL was one of the initial adopters of Experibase, which was used internally to store data from experiments occurring at the lab. The microarray data storage capabilities of the initial version of Experibase were immature. Specifically, there was no user interface for entering microarray data, there were no export capabilities, and the schema did not support the storage of data from experiment files generated by microarray scanners. PNNL was committed to using Experibase because it offers a single database for many experiment types. Therefore PNNL commissioned Professor C.F. Dewey's group at M.I.T. to add the needed features to Experibase.

Project Requirements

The project had the following requirements:

1. Build a web application where users can enter experimental data from microarray experiments.
2. The web application should be easy to use and grasp intuitively.
3. Experimental data should be stored in an Experibase instance.
4. The web application should interoperate with an existing web application for Experibase administration.
5. Data stored in Experibase should be exportable to ArrayExpress for journal publication purposes.

Project Design

Figure 3 displays a schematic illustrating the network layout of the system. The system is a distributed application consisting of three components that can be deployed to different servers. MIAMExpress was chosen as the data submission tool because of its maturity. The MIAMExpress development project has been active for at least three years, with several stable versions available for download. Also, because the software is developed and maintained by the EBI, MIAMExpress adheres to the latest microarray standards.

[Figure 3 diagram. Recoverable labels: Client; Enter data; Login and request microarray experiment; User presented with experiment submission form; Creates new submission or opens existing submission; Experibase administration web application; Checks user credentials and returns existing submissions; Stores experiment data in Experibase; Experibase instance.]

Figure 3: Overview of the components of the system and the relationships between them

The web applications communicate with the database using standard database connections. The web applications communicate with each other over HTTP. Any state necessary for the completion of an experiment submission is passed to the MIAMExpress web application over HTTP.

The database provides storage services to both web applications. The Administration web application depends on the database for storage and access of user account information, project information and user groups. The MIAMExpress submission application does not need to depend on the database for storage (it has its own internal database).
However, since the project requires that all data be stored in Experibase, the MIAMExpress web application was rewritten to store information in Experibase.

The database schema is an important part of the design, as it determines what information can and cannot be exported to external databases. The designed data schema is MIAME compliant and is heavily influenced by the MAGE object model. The schema was designed to take full advantage of Experibase's common components whenever possible, to facilitate better interoperability with data models for other experimental techniques.

Chapter 4

MIAMEXPRESS CUSTOMIZATIONS

Automatic User Login and/or Registration

Both the Experibase administration web application and MIAMExpress have their own user authentication systems. Consequently, a potential user would have to register with both applications to use the system, which is undesirable. The solution was to automatically register the user with the MIAMExpress web application.

The main entrance to the system is through the Experibase Administration web application. New system users register for a user account, which gains them access to that application. After the Administration application has successfully authenticated a user, the user can submit microarray data to Experibase, using MIAMExpress as the user interface. The first step is to create a new microarray experiment in the administration web application. When a request is made to create a new microarray experiment in the administration application, the user is forwarded to MIAMExpress. HTTP query parameters and values are added to the forward address so that MIAMExpress can automatically log in the user. If the user has not been registered with MIAMExpress, MIAMExpress will query Experibase for the user's credentials and automatically register the user with the application.
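As an illustration of the forwarding step described above, the following Python sketch builds a forward address carrying the login and Experibase parameters and then reads the login name back on the receiving side. It is a sketch under stated assumptions, not the code used in the system (which is Perl CGI on the MIAMExpress side): the parameter names are copied as printed in Table 1, while the base URL, the function names, and all values are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_forward_url(base_url, login_name, study_plan, group_no, start_date):
    """Append the login and Experibase parameters to the forward address."""
    params = {
        # Parameter names as printed in Table 1; values are supplied by the caller.
        "MXPGVARjloginname": login_name,
        "ExperiB3_studyplan": study_plan,
        "ExperiB3_groupno": str(group_no),
        "ExperiB3_startDate": start_date,
    }
    # urlencode escapes the values so they are safe inside a query string.
    return base_url + "?" + urlencode(params)


def read_login_name(url):
    """Receiving side: recover the login name, or None if it is absent."""
    query = parse_qs(urlsplit(url).query)
    values = query.get("MXPGVARjloginname", [])
    return values[0] if values else None
```

In this sketch the receiving application would use the recovered login name to decide whether to log the user in directly or to register the user first.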
The request parameters passed to MIAMExpress are given in Table 1.

Table 1: Query parameters for automatic user login and/or registration

Query Parameter       Description
ACTION                Parameter used by MIAMExpress. Used, in conjunction with the SELECTSUBMISSION parameter, to direct the user to any page in the MIAMExpress application.
SELECTSUBMISSION      Parameter used by MIAMExpress. Used, in conjunction with the SELECT_ACTION parameter, to direct the user to any page in the MIAMExpress application.
MXPGVARjloginname     The login name of the user in MIAMExpress, which is also the login name of the user in Experibase.
ExperiB3_studyplan    The Experibase study plan. Needed to update the Experibase experimental database record.
ExperiB3_groupno      The Experibase group number. Also needed to update the Experibase experiment dataset record.
ExperiB3_startDate    The Experibase start date. Also needed to update the Experibase experiment record.

Data Synchronization

The data synchronization application is responsible for synchronizing data stored in MIAMExpress with data stored in Experibase. Experiment submissions in MIAMExpress have a one-to-one correspondence with microarray experiments stored in Experibase. Whenever a MIAMExpress submission is created or updated, the data synchronization application is invoked. For new experiments, the application creates a new microarray experiment in Experibase and then copies the data over to Experibase. For experiment updates, the application determines what data has changed and updates Experibase with the necessary changes.

The application is written in Java. Perl wrappers that invoke the application were also written. These wrappers give the MIAMExpress Perl CGI the ability to invoke the application. The application consists of four modules: the Experiment File Parser, the MIAMExpress data object, the Experibase data object, and the Importer module.
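The create-or-update behaviour described above can be sketched as follows. This is a minimal illustration that assumes each submission is a flat mapping of field names to values and uses an in-memory dictionary in place of Experibase; the function names and data layout are hypothetical, and the actual application is written in Java against database tables.

```python
_MISSING = object()  # sentinel so that a stored value of None is not mistaken for "absent"


def changed_fields(stored, submitted):
    """Return the submitted fields that are new or differ from the stored copy."""
    return {
        key: value
        for key, value in submitted.items()
        if stored.get(key, _MISSING) != value
    }


def synchronize(target, submission_id, submitted):
    """Create the experiment if it is new; otherwise apply only the changes.

    `target` is a dict standing in for Experibase, keyed by submission id.
    Fields removed from a submission are not handled in this sketch.
    """
    if submission_id not in target:
        target[submission_id] = dict(submitted)
        return "created"
    changes = changed_fields(target[submission_id], submitted)
    if not changes:
        return "unchanged"
    target[submission_id].update(changes)
    return "updated"
```

Keying the stand-in store by submission id mirrors the one-to-one correspondence between MIAMExpress submissions and Experibase microarray experiments.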
The Experiment File Parser module consists of Java packages and classes responsible for parsing the experiment data files. Currently this module is capable of parsing the following data files:

* Affymetrix
    o CHP files: contain probe set analysis results generated by Affymetrix software.
    o CEL files: store the results of the intensity calculations on the pixel values of the array image.
    o EXP files: contain information entered in the Experiment window of Affymetrix MAS 5 software.
* NimbleGen
    o Raw data files: store the results of the intensity calculations on the pixel values of the array image.
    o Design files: store the array design.
    o Pair files: hold the gene expression data.

The MIAMExpress data object module provides an object relational mapping from the MIAMExpress data tables to Java objects. The mappings were created using Oracle's Business Components for Java (BC4J). The data object module also represents the relationships between data tables that are not declared in the MIAMExpress database but are evident in the MIAMExpress application code. Figure 4 shows the data table that stores the MIAMExpress submission records, and all the tables that are related to it. All of these relationships are represented in the data object module through simple get and set methods. The BC4J application development environment essentially takes Java code and creates the appropriate SQL calls to the database.

The Experibase data object module is similar in function to the MIAMExpress data object module. It provides an object relational mapping from Experibase data tables to Java objects. This module provides other modules with the ability to query Experibase data tables, update Experibase data tables, and add new data records. This module was also developed with Oracle's Business Components for Java (BC4J).

[Figure 4 diagram. Recoverable labels include the tables Tsubmis, Tpooled, and Tpublic and the associations SubmissionSubmitter, SubmissionArrayDesign, SubmissionExperiment, SubmissionLabeledHybrid, SubmissionPooled, and SubmissionPublication.]

Figure 4: MIAMExpress Submission data table and all its related tables

The Importer module acts as the controller for the data synchronization application. This module makes use of the other modules in its goal of synchronizing Experibase with the data stored in MIAMExpress. It contains logic that maps the data stored in MIAMExpress to data tables in Experibase. It also contains logic that maps the data in experiment files to data tables in Experibase. Figure 5 displays a schematic which illustrates the flow of data when a user makes a submission to the MIAMExpress application.

[Figure 5 diagram. Recoverable labels: User; Data Files; Entered Data; MIAMExpress; Local File System; Internal Database; Importer; Experibase.]

Figure 5: Flow of data as submitted through MIAMExpress

Chapter 5

EXPERIBASE MICROARRAY SCHEMA

Introduction

An ontology is a concise and unambiguous description of principle relevant entities with their potential, valid relations to each other [6]. A database schema is an example of an ontology. Each entity is well defined by a data table, and the relationships between entities are well defined by foreign key constraints.

Ontologies are becoming increasingly important to the field of bioinformatics. This importance can be attributed to the realization that comparing different experiments is only feasible if consistent terminology is used in describing experimental data. The existence of a standard ontology to describe experimental data will greatly aid in the sharing and comparing of experimental results. It will also help biologists take full advantage of the extensive amount of data produced by biology laboratories. It is no surprise that ontologies are now being developed for many different areas in experimental biology.
MAGE-OM is an ontology that applies to the area of DNA microarrays, and Gene Ontology (GO) applies to the area of gene products and their behavior in a cellular context.

In theory, Experibase, the experimental database, can be viewed as an ontology for all of experimental biology. In practice, the Experibase ontology is incomplete. But as more experimental techniques are incorporated into Experibase, it will approach its ideal state. As mentioned earlier, Experibase currently contains data representations for Gel Electrophoresis, Flow Cytometry, and Mass Spectrometry experiments. Experibase was extended in this thesis to include data representations for microarray experiments.

Experibase Common Package

The Experibase Common package does not contain any concrete data representations but instead contains only one interface that all classes in Experibase implement (realize, in UML terms). The identifiable interface ensures that all records in the database are identifiable under a common naming scheme. Figure 6 displays the Experibase Common package as a UML class diagram.

[Figure 6 diagram: the Identifiable interface with the attributes Id, LSID, Name, Identifier, DateCreated, and DateModified.]

Figure 6: The identifiable interface, which is realized by all Experibase classes

The Id column is a database generated identifier. All instances of classes in the same inheritance hierarchy have an Id that is unique among the other instances in the same inheritance hierarchy. The LSID (Life Science Identifier) column holds the object instance's LSID. LSIDs are globally unique identifiers across all LSID aware data sources. They can be used to quickly retrieve information about the object through the use of an LSID resolver. The name column stores a potentially human recognizable and possibly ambiguous identifier for an object instance. The identifier column stores a machine friendly identifier that ties an object to its source. More than one
However, objects that share the same identifier should be regarded as equivalent objects. The DateCreated and DateModified fields are booking keeping fields that store the date the record was added and the last time the record was modified. Study Plan The Study Plan packages consist of classes that provide metadata about an experiment. The Submission class (also known as project or study plan class) stores information related to new experiment added to the database. The metadata stored about an experiment in the Study Plan includes: The experiment's submitter; stored as a relationship between the ' Submission class and the Person class. * The experiment's hypothesis, stored as a relationship between the Submission class and the Hypothesis class. * For experiments stored in public databases, the relationship between the Submission class and the DatabaseEntry class records that information. * Experiments whose findings are published in a scientific journal have their publication information stored as relationship between the Submission class and Publication Class. * Information that the biologist associate most with an experiment is stored in the Experiment class. There is a relationship between the Submission class and the Experiment class that ties the experiment's details to the rest of the experiment's metadata 22 The MicroarraySubmission class extends the Submission class to add information specific to microarray experiments. The MicroarraySubmission has a relationship to the array designs used in the microarray experiment. Figure 7 displays the Study Plan package. 
[Class diagram: Submission and its relationships to Hypothesis, DatabaseEntry, Experiment, Person (from Administration, as Submitter), Publication, MicroarraySubmission, and ArrayDesign (from Experiment)]

Figure 7: A UML class diagram of the Study Plan package

Administration Package

The Administration package contains data objects that represent information about people and organizations related to the experiment. The people possibly related to an experiment include the experiment submitter and the publication authors. Possible organizations related to the experiment include laboratories, universities, research companies, software vendors, and hardware manufacturers. The Administration package also contains security information about the experiment. Role-based security is used to determine the capabilities of a person registered with the system. Individuals can also possess user accounts for access to the database through a database client application. Figure 8 displays the classes of the Administration package.

[Class diagram: Contact, Person, and Organization classes]

Figure 8: The Administration package

Sample

The Sample package contains classes that represent biological samples (physical samples), measured samples, and derived samples [2]. It also includes classes that represent treatments applied to the samples for the purpose of the experiment. The package is divided into three sub-packages: PhysicalSample, MeasuredSample, and DerivedSample.

The PhysicalSample sub-package represents ideal biological samples and their treatment. Ideal biological samples are parts of an organism, such as cells or tissue. Figure 9 displays the class diagram for the PhysicalSample sub-package. It includes seven classes that are specific to microarray experiments.
These classes are:

* Extract: Represents the extraction of RNA or a similar product from sample cells. See Chapter 2 for information about RNA extraction.
* ExtractProtocol: Stores a description of the protocol used to extract RNA from the sample cells. Information stored includes the type of nucleic acid or molecule extracted and the amplification agent, if amplification was used.
* LabeledExtract: Represents an extract solution tagged with a fluorescent label.
* LabelingProtocol: Stores a description of the protocol used to label the extract with the fluorescent compound. It includes the amount of label used and the name of the fluorescent compound.
* Array: Represents a physical microarray as described in Chapter 2. Every array complies with an array design that dictates the position of each probe set and the DNA sequences at each probe set.
* PhysicalBioAssay: Represents the product of hybridization between an array and a labeled extract solution.
* HybridizationProtocol: Stores a description of the creation of a PhysicalBioAssay. Information stored includes the amount of labeled extract solution used, the duration of the hybridization, the temperature during hybridization, and the total volume of the hybridization.

The remaining classes in the sub-package are common to most biological experiments. The BioSample class describes a sample from an organism. The BioSequence class represents biological sequences such as DNA. Figure 9 displays the class diagram.

The MeasuredSample sub-package represents information that has been extracted from physical samples and that can be used again as input to the experiment. It is diagrammed in Figure 10. The MeasuredBioAssay class represents the product of feature extraction performed on a PhysicalBioAssay instance.
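The chain of classes described above, from BioSample through Extract, LabeledExtract, and PhysicalBioAssay to MeasuredBioAssay, can be pictured as a set of linked objects. The following is a minimal sketch under stated assumptions rather than the Experibase schema: the class names follow the text, but every field, constructor, the originatingSample helper, and the example values are hypothetical.

```java
// Illustrative sketch only: class names follow the text, while every field,
// constructor, and helper method is an assumption made for this example.

class BioSample {
    final String name;
    BioSample(String name) { this.name = name; }
}

// Extraction of RNA (or a similar product) from the sample cells.
class Extract {
    final BioSample source;
    final String extractedProduct;
    Extract(BioSample source, String extractedProduct) {
        this.source = source;
        this.extractedProduct = extractedProduct;
    }
}

// An extract tagged with a fluorescent label.
class LabeledExtract {
    final Extract extract;
    final String label;
    LabeledExtract(Extract extract, String label) {
        this.extract = extract;
        this.label = label;
    }
}

// A physical microarray that complies with a named array design.
class Array {
    final String arrayDesignName;
    final String serialNo;
    Array(String arrayDesignName, String serialNo) {
        this.arrayDesignName = arrayDesignName;
        this.serialNo = serialNo;
    }
}

class HybridizationProtocol {
    final double durationHours;
    final double temperatureCelsius;
    HybridizationProtocol(double durationHours, double temperatureCelsius) {
        this.durationHours = durationHours;
        this.temperatureCelsius = temperatureCelsius;
    }
}

// Product of hybridizing a labeled extract to an array.
class PhysicalBioAssay {
    final Array array;
    final LabeledExtract labeledExtract;
    final HybridizationProtocol protocol;
    PhysicalBioAssay(Array array, LabeledExtract labeledExtract, HybridizationProtocol protocol) {
        this.array = array;
        this.labeledExtract = labeledExtract;
        this.protocol = protocol;
    }
}

// Product of feature extraction performed on a PhysicalBioAssay.
class MeasuredBioAssay {
    final PhysicalBioAssay source;
    MeasuredBioAssay(PhysicalBioAssay source) { this.source = source; }

    // Walk the chain back to the biological sample the measurement came from.
    BioSample originatingSample() {
        return source.labeledExtract.extract.source;
    }
}

class SampleChainSketch {
    public static void main(String[] args) {
        BioSample sample = new BioSample("liver tissue");
        Extract extract = new Extract(sample, "total RNA");
        LabeledExtract labeled = new LabeledExtract(extract, "Cy3");
        Array array = new Array("ExampleDesign", "SN-001");
        PhysicalBioAssay physical =
                new PhysicalBioAssay(array, labeled, new HybridizationProtocol(16.0, 45.0));
        MeasuredBioAssay measured = new MeasuredBioAssay(physical);
        System.out.println("Measured assay traces back to: " + measured.originatingSample().name);
    }
}
```

Keeping each processing step as its own object means a measured result can always be traced back to the sample, the label, and the array it came from.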
The scanning protocol relationship between Protocol and MeasuredBioAssay associates with each MeasuredBioAssay a description of the protocol used in feature extraction, including the hardware and software used. MeasuredBioAssayData represents the information obtained from feature extraction.

[Class diagram: ArrayDesign (from Experiment), BioSample, BioSequence, Array, DatabaseEntry (from StudyPlan), PhysicalBioAssay, Extract, LabeledExtract, Protocol (from Experiment), ExtractProtocol, SampleGrowthProtocol, LabelingProtocol, HybridizationProtocol]

Figure 9: Sample.PhysicalSample class diagram

[Class diagram: MeasuredSample classes and the scanning Protocol (from Experiment)]

Figure 10: Sample.MeasuredSample sub-package

The DerivedSample sub-package contains information about the transformation of MeasuredBioAssay and/or PhysicalBioAssay instances. The transformation could be the normalization of a MeasuredBioAssay instance or some other manipulation of available data. The DerivedBioAssay class represents the product of a transformation performed on PhysicalBioAssay and/or MeasuredBioAssay instances.

Experiment Package

The Experiment package contains details about an experiment, including the protocols used, the experiment description, the experiment type, and the experimental factors. Figure 11 displays the Experiment package. None of the classes are specific to microarrays, but they are all used in the submission of a microarray experiment. The Experiment class represents a microarray experiment.
It has an ExperimentDesign instance associated with it. An experiment design describes the experiment and records the experiment's type and its experimental factors, such as sample age.

[Class diagram: Experiment and ExperimentDesign classes]

Figure 11: The Experiment package

The Experiment package has several sub-packages. One of these is the Experiment.Protocol package. This package contains the definition of the Protocol class and its relationships with other entities in the sub-package. Figure 12 displays a diagram containing the package components.

[Class diagram: Protocol class and related hardware and software classes]

Figure 12: Experiment.Protocol sub-package diagram

The RawData sub-package contains the class MeasuredBioAssayData and its subclasses. Subclasses include classes for storing data from Affymetrix commercial arrays (CEL file) and NimbleGen commercial arrays (RawData file). The PreprocessedData sub-package contains the class DerivedBioAssayData and its subclasses. Subclasses include classes for storing data from Affymetrix CHP files and NimbleGen Pair files.

Chapter 6

RESULTS AND CONCLUSIONS

The system described in the previous chapters is currently being deployed at Pacific Northwest National Labs. Great interest has been shown in the application. After the deployment and testing phase of the application is over, it is quite possible that the application will be deployed to other experimental laboratories.

The project has met all its requirements, with the exception of data exportation. Data export is only partially functional because of time constraints.

The system is deployed on two servers. MIAMExpress is deployed on a Linux server, while the administration application is hosted on a Windows 2003 server running Apache Tomcat. Experibase is hosted in an Oracle instance on the Windows 2003 server.
The project requirements created a system design challenge by requiring that two different, existing, and unknown systems be able to interoperate with each other. A database design challenge was created by the need for an extension of Experibase. Both challenges were solved using knowledge from the computer science and computer system design fields. The system design challenge was solved by modifying the applications so that they communicated with each other, making one application the controller and the other the worker. The database design challenge was solved using UML modeling to represent the data produced by microarray experiments. The end result is a system that makes biologists' lives easier while promoting a new all-inclusive approach to data storage that is attractive because of its compactness.

BIBLIOGRAPHY

[1] Patrick O. Brown; David Botstein. Exploring the new world of the genome with DNA microarrays. Nature Genetics Supplement. Volume 21. Pg 33-37. January 1999.

[2] C.F. Dewey Jr; Aidan Downes; Howard Chou; Shixin Zhang. A Unique Opportunity in Biological Information Standards. W3C Workshop on Semantic Web for Life Sciences. Cambridge, MA. October 2004.

[3] Alvis Brazma et al. ArrayExpress - a public repository for microarray gene expression data at the EBI. Nucleic Acids Research. Volume 31, No. 1. Pg 68-71. 2003.

[4] Ugis Sarkans et al. The ArrayExpress gene expression database: a software engineering and implementation perspective. Bioinformatics. Volume 21, No. 8. Pg 1495-1501. 2005.

[5] Alvis Brazma et al. Minimum information about a microarray experiment (MIAME) - toward standards for microarray data. Nature Genetics. Volume 29. Pg 365-371. December 2001.

[6] Steffen Schulze-Kremer. Ontologies for molecular biology and bioinformatics. In Silico Biology. Volume 2. Pg 0017. 2002.
APPENDIX 1

MIAMExpress Mappings (HTML form fields to Experibase Database Location)

1.1 ExperimentDesign

Form Field                              Experibase Location
Experiment name                         ExperimentTable.Name
Experiment design type                  ControlVcb.Value[type="ExperimentType"]
Experimental factors                    ControlVcb.Value[type="ExperiemntalFactor"]
Experiment description                  ExperimentDesign.Description
Public Release Date                     ExperimentDesign.HoldDate

1.2 Publication

Form Field                              Experibase Location
Publication Status                      Publication.Status
Journal                                 Publication.Journal
Title                                   Publication.Title
Year                                    Publication.Year
Volume                                  Publication.Volume
First Page                              Publication.FirstPage
Last Page                               Publication.LastPage

1.3 Author

Form Field                              Experibase Location
First name                              Person.Name
Initial                                 Person.MiddleInitial
Last name                               Person.LastName

1.4 Sample

Form Field                              Experibase Location
Sample name                             Sample.Name
Organism                                Sample.Taxonomy
Gender                                  Sample.Sex
Provider                                Sample.CellProvider
Sample Type                             Sample.SampleType
Development stage                       Sample.DevStage
Age                                     Sample.AgeStatus
Age Min                                 Sample.AgeRangeMin
Age Max                                 Sample.AgeRangeMax
InitialTimePoint                        Sample.TimePoint
Unit                                    Sample.TimeUnit
Organism part                           Sample.OrganismPart
Gene modification                       Sample.GeneticVariation
Individual Identifier                   Sample.Individual
Individual Genetic trait or genotype    Sample.IndividualGen
Disease State                           Sample.DiseaseState
Cell type or Target Cell Type           Sample.TargetCellType
Additional Clinical Information         Sample.Additional
Separation Technique                    Sample.SeperationTech

1.5 Extract

Form Field                              Experibase Location
Extract Name                            Extract.Name
Protocol                                Protocol via Extract.ProtocolId
Pooling Protocol                        Protocol via Extract.PoolProtocolId

1.6 Labeled Extract

Form Field                              Experibase Location
Label Extract Name                      LabeledExtract.Name
Protocol                                Protocol via LabeledExtract.ProtocolId

1.7 Hybrid

Form Field                              Experibase Location
Hybrid                                  NA
Array Design Name                       ArrayDesign.Name
ArrayBatch                              Array.batch
Serial No                               Array.serialno
LabelExtractName                        LabeledExtract.Name via PhysicalBioAssay.LabelExtractId

1.8 Hybrid

Form Field                              Experibase Location
Hybridization Name                      NA
Raw Data file                           BioAssayData (locations depend on file type)
Normalized data file                    BioAssayData (locations depend on file type)

BinaryFileIO.java    Tue May 03 21:31:24 2005

/*
 * Created on Mar 8, 2005
 *
 * TODO To change the template for this generated file go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
package edu.mit.parsers;

import java.nio.ByteBuffer;

/**
 * @author Aidan Downes
 *
 * TODO To change the template for this generated type comment go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
public class BinaryFileIO {

    public static int readInt(ByteBuffer buffer) {
        return buffer.getInt();
    }

    public static long readLong(ByteBuffer buffer) {
        return buffer.getLong();
    }

    public static short readShort(ByteBuffer buffer) {
        return buffer.getShort();
    }

    public static char readChar(ByteBuffer buffer) {
        return (char) buffer.get();
    }

    public static float readFloat(ByteBuffer buffer) {
        return buffer.getFloat();
    }

    public static short readUnsignedChar(ByteBuffer bb) {
        return ((short) (bb.get() & 0xff));
    }

    public static int readUnsignedShort(ByteBuffer bb) {
        return (bb.getShort() & 0xffff);
    }

    public static long readUnsignedInt(ByteBuffer bb) {
        return ((long) bb.getInt() & 0xffffffffL);
    }

    // Reads a string stored as a 4-byte length followed by that many bytes.
    public static String readString(ByteBuffer buffer) {
        int len = readInt(buffer);
        byte[] bytes = new byte[len];
        buffer.get(bytes);
        return new String(bytes);
    }

    // Reads a fixed-width field and cuts the string at the first zero byte.
    public static String readFixedString(ByteBuffer buffer, int len) {
        byte[] bytes = new byte[len];
        buffer.get(bytes);
        int i = 0;
        for (i = 0; i < bytes.length; i++) {
            if (bytes[i] == 0)
                break;
        }
        return new String(bytes, 0, i);
    }
}

AffymetrixImporter.java    Thu May 12 13:36:46 2005    2

else dbaImpl.
setProbeArrayType (header.getChipType )); ViewObject allSubmissions = eModule.findViewObject("SubmissionView"); logger.info("Found Submission View"); if (dataExists (header.getAlgorithmNameo()) subImpi = (SubmissionViewRowImpl)allSubmissions.createRow); subImpl.setExperibaseSubId(new Number(experibaseId)); allSubmissions.insertRow(subImpl); logger.info("Created new row in submission table"); eModule.getTransaction ().conmit ); logger.info("Cosmit initial submission"); dbaImpl . setAlgorithmName (header.getAlgorithmName )); if (dataExists(header.getAlgorithmVersion) )) dbaImpl . setAlgorithmVersion (header.getAlgorithmVersion )); } I } for(Iterator iter2 = header.getAlgorithmParameters().iterator(); iter2.hasNext(); public void importData(CHPData data) ) importCHPHeader (data. getHeader)); importExpressionProbeSetResults (data.getExpressionProbeSetResults)); NameValuePair nvp = (NameValuePair)iter2.next); logger.info(nvp.getName() +": "+nvp.getValue)); //TO DO Add iterDBA.insertRow(dbaImpl); private void importExpressionProbeSetResults(java.util.List probeSets) } RowIterator iter = dbaImpl.getCHPAnalysisView); logger.info("reading probe sets with size " + probeSets.size)); for(Iterator probeSetIter = probeSets.iterator); probeSetIter.hasNext();) public void importData(ExperimentData data) ExpressionProbeSetResults probeSet = (ExpressionProbeSetResults)probeSetIter.nex to if(expImpl.getName() == null) expImpl.setName(data.getName (); ; CHPAnalysisViewRowImpl chpImpl = (CHPAnalysisViewRowImpl)iter.createRow); chpImpl. setDetection (new Integer(probeSet.getDetection ))); chpImpl.setDetectionPValue(new Float (probeSet.getDetectionPValue))); chpImpl . setNoOf Pairs (new Integer(probeSet.getNoOf Pairs())); chpImpl . 
setNoOfUsedPairs (new Integer(probeSet.getNoOfUsedPairs ))); chpImpl.setSignal(new Float(probeSet.getSignal())); chpImpl.setSubId(new Number(subImpl.getId() .getValue()); iter.insertRow(chpImpl); logger.info("finished reading "+ probeSets.size() + " probe sets"); private void importCHPHeader(CHPHeaderData header) RowIterator iterDBA = subImpl.getAffyChpDataView); dbaImpl = (AffyChpDataViewRowImpl)iterDBA.createRow); dbaImpl .setCols (new Number(header.getCols))); dbaImpl .setRows (new Number(header.getRows))); dbaImpl . setNumberOf ProbeSets (new Number(header.getNoOf ProbeSets())); for(Iterator iter = header.getSunmaryParameters) .iterator(); iter.hasNext);) //PhysicalArrayDesign if (dataExists (data.getChipType ()) logger. info("Entering array design information"); RowIterator iter = subImpl . getArrayDesigns ); ArrayDesignTableViewRowImpl row = (ArrayDesignTableViewRowImpl)iter. createRow(); row.setName(data.getChipType)); iter.insertRow(row); logger.info("inserted array design info to cache"); //ArrayManufacture if(dataExists(data.getChipLot()) //Person Person person = null; if(dataExists(data.getOperator())) / /Biosource if(dataExists(data.getSanpleType()) dataExists(data.getComments())1 dataExists(data.getDescription()) dataExists (data. getProject () NameValuePair nvp = (NameValuePair)iter.next ); logger.info(nvp.getName) +": "+nvp.getValue)); //TOD ADD //Protocol if(dataExists(header.getChipType())) if(dataExists(data.getProtocol())) ---- -- -. 
000110 N N 0 NOR Thu May 12 13:36:46 2005 AffymetrixImporter.java 3 public void importData(CELData data importCELHeader(data.getHeaderData()); importCELEntries(data.getEntries()); if (dataExists(data.getFilter())j dataExists(data.getStation()) dataExists(data.getPixelSize() )I dataExists(data.getScannerId)) 11 dataExists(data.getNumberOfScans())I dataExists(data.getScannerType() private void importCELHeader(CELHeaderData data) RowIterator iter = subImpl.getAffyCelDataView); mbaImpl = (AffyCelDataViewRowImpl)iter.createRow(); iter.insertRow(mbaImpl); //BioAssay private void importCELEntries(List entries) //compound RowIterator iter = mbaImpl.getCELAnalysisView); logger.info("Adding entry information for "+entries.size()+" entries"); if(dataExists(data.getSolutionType())) int count = 0; for(Iterator entriesIter = entries.iterator(); //Channel if(dataExists(data.getFilter()) CELFileEntryType entry = entriesIter.hasNext();) (CELFileEntryType)entriesIter.next(); CELAnalysisViewRowImpl row = (CELAnalysisViewRowImpl)iter.createRow(); row.setSubId(subImpl.getId().getSequenceNumber()); row.setIntensity(new Float(entry.getIntensity())); row.setMask(new Boolean(entry.getMask))); row.setOutlier(new Boolean(entry.getOutlier()); row.setPixels(new Integer(entry.getPixels())); row.setStdDev(new Float(entry.getStdv))); if(dataExists(data.getStation()) row.setX(new Integer(entry.getX))); row.setY(new Integer(entry.getY()); iter.insertRow(row); if(dataExists(data.getPixelSize() )II dataExists(data.getScannerId)) I1 dataExists(data.getNumberOfScans()) I if(++count % 1000 == 0) dataExists(data.getScannerType()) persist(); logger.info(count + " entries have been added to the database"); } //Create Parameters logger.info("finshed add entries to database"); if(dataExists(data.getPixelSize())) } boolean dataExists(String str) if(dataExists(data.getScannerId())) return str != null && !str.trim).equals("1); public void persist) if(dataExists(data.getNumberOfScans()) logger.info("Committing 
all changes made to the db"); try subImpl.setCompleted(new Boolean(true)); eModule.getTransaction().commit(); logger.info("All changes commited to db"); if(dataExists(data.getScannerType()) catch (Exception e) logger.log(Level.WARNING, "Exception thrown in committing to database", e); } } CHPFileParser.java Tue Apr 19 02:12:54 2005 // rereading magic number int magicNo = BinaryFileIO.readInt(buffer); CHPFILEMAGICNUMBER) { if (magicNo throw new IOException("Incorrect Magic number"); Created on Mar 8, 2005 * TODO To change the template for this generated file go to Window - Preferences - Java - Code Style - Code Templates */ package edumit parsers affymetrix; import import import import import import 1 header.setMagicNumber (magicNo); // read version int version = BinaryFileIO.readInt(buffer); if (version > CHPFILEVERSIONNUMBER) { throw new IOException("Incompatible version"); edu.mit.data.affymetrix.CHPData; java.io.IOException; java.nio.ByteBuffer; java.nio.ByteOrder; java.util.ArrayList; java.util.List; header.setVersion(version); edu.mit.data.affymetrix.CHPHeaderData; edu.mit.data.affymetrix.DataConstants; import edu.mit.data.affymetrix.ExpressionProbeSetResults; import edu.mit.data.affymetrix.GenotypeProbeSetResults; // read cols and rows dimension header.setCols(BinaryFileIO.readUnsignedShort(buffer)); header.setRows(BinaryFileIO.readUnsignedShort(buffer)); import edu.mit.data.affymetrix.NameValuePair; import import edu.mit.parsers.BinaryFileIO; edu.mit.parsers.FileParser; // read no of probe sets header.setNoOfProbeSets(BinaryFileIO.readInt(buffer)); import import // skip qc data BinaryFileIO.readInt(buffer); * @author Aidan Downes TODO To change the template for this generated type comment go to Window Preferences - Java - Code Style - Code Templates */ public class CHPFileParser extends FileParser // read type header.setGeneChipAssayType(BinaryFileIO.readlnt(buffer)); // read progID header.setProgID(BinaryFileIO.readString(buffer)); private CHPHeaderData 
header; // read parentCellFile header.setParentCellFile(BinaryFileIO.readString(buffer)); private List genotypeResults; // header chipType header.setChipType(BinaryFileIO.readString(buffer)); private List expressionResults; public static final char DELIMCHAR = Ox14; // read algorithm header.setAlgorithmName(BinaryFileIO.readString(buffer)); public static final int MINCELLSTR = 4; // read algorithm parameters int noOfParams = BinaryFileIO.readInt(buffer); public static final int CHPFILEMAGIC_NUMBER = 65; public static final int CHPFILEVERSIONNUMBER = 2; public static final int EXPRESSIONABSOLUTE_STAT_ANALYSIS = 2; public static final int EXPRESSION_COMPARISONSTATANALYSIS = 3; for (int i = 0; i < noOfParams; i++) f NameValuePair param = new NameValuePair(); param.setName(BinaryFileIo.readString(buffer)); param.setValue(BinaryFileIO.readString(buffer)); header.getAlgorithmParameters().add(param); public static final String APPNAME = "GeneChip Sequence File"; private void reset() header = new CHPHeaderDatao; header.setAlgorithmParameters(new ArrayList)); header.setSunmaryParameters (new ArrayList()); genotypeResults = new ArrayList(); expressionResults = new ArrayList(; public void read(String filePath) throws IOException ByteBuffer buffer = getBuffer(filePath); buffer.position(0); buffer.order(ByteOrder.nativeOrder()); reset(); // check if file is compatible if (isXDAComptableFile(buffer)) // set position to start buffer.position(0); // read summary paramters noOfParams = BinaryFileIO.readInt(buffer); for (int i = 0; i < noOfParams; i++) ( NameValuePair param = new NameValuePair(); param.setName(BinaryFileIO.readString(buffer)); param.setValue(BinaryFileIO.readString(buffer)); header.getSummaryParameters() .add(param); // skip noOfParams = BinaryFileIO.readInt(buffer); float f = BinaryFileIO.readFloat(buffer); for (int i = 0; i < noOfParams; i++) ( BinaryFileIO.readFloat(buffer); BinaryFileIO.readFloat(buffer); BinaryFileIO.readFloat(buffer); 4 Tue Apr 19 02:12:54 
2005 CHPFileParser. java 2 .readUnsignedShort(buf fer)); // finished header if (header.getGeneChipAssayType() != DataConstants.GENE_CHIP_.A expressionResults.add(results); SSAY_TYPEEXPRESSION && header.getGeneChipAssayType() != DataConsta } nts.GENE_CHIPASSAYTYPEGENOTYPING) ( throw new IOException("Supports only Expression or Gen otypes"); else int ival = BinaryFileIO.readInt(buffer); for (int i = 0; i < header.getNoOfProbeSets(; i++) GenotypeProbeSetResults results = new Genotype if (header.getGeneChipAssayType() == DataConstants.GENECHIP_A ProbeSetResults(); SSAYTYPEEXPRESSION) results int analysisType = BinaryFileIO.readUnsignedChar(buffe .setAlleleCall(BinaryFileIO .readUnsignedC r); int ival = BinaryFileIO.readInt(buffer); har(buffer)); if (analysisType != EXPRESSIONABSOLUTESTATANALYSIS && analysisType != EXPRESSIONCOMPARIS results.setConfidence(BinaryFileIO.readFloat(b uffer)); results.setRasl(BinaryFileIO.readFloat(buffer) ONSTATANALYSIS) throw new IOException( results.setPvalueAA(results.getRasl()); "outdataed expression CHP file s, should be MAS 5 or higher"); results. setRas2 (BinaryFileIO. 
readFloat (buffer) results.setPvalueAB(results.getRas2()); for (int i = 0; i < header.getNoOfProbeSets(; i++) ExpressionProbeSetResults results = new Expres results.setPvalueBB(BinaryFileIO.readFloat(bu sionProbeSetResults(; ffer)); results.setPvalueNoCall(BinaryFileIO.readFloa results.setDetection(BinaryFileIO.readUnsigned t(buffer)); Char(buffer)); genotypeResults.add(results); results.setDetectionPValue(BinaryFileIo.readFl oat(buffer)); else results.setSignal(BinaryFileIO.readFloat(buffe buffer.position(0); String verString = BinaryFileIO.readFixedString(buffer, APPNA r)); results ME .setNoOfPairs(BinaryFileIO .readUnsignedS .length)); if (!verString.equals(APP_NAME)) throw new IOException("incompatible file format"); hort(buffer)); results.setNoOfUsedPairs(BinaryFileIO .readUnsignedShort(buffer)); int version = BinaryFileIO.readInt(buffer); if (version < 12) ( throw new IOException( "This chip file is not supported by th results.setHasCompResults(false); if (analysisType == EXPRESSIONCOMPARISONSTAT e parser"); -ANALYSIS) results.setHasCompResults(true); results .setChange(BinaryFileI 0 .readU header.setVersion(version); // read algorithm header.setAlgorithmName(BinaryFileIO.readtring(buffer)); nsignedChar(buffer)); results.setChangePValue(BinaryFileIO.r eadFloat(buffer)); results.setSignalLogRatio(BinaryFileIO .readFloat(buffer)); results.setSignalLogRatioLow(BinaryFil eIO .readFloat(buffer)); results.setSignalLogRatioHigh(BinaryFi // read version header.setAlgorithmVersion(BinaryFileIO.readString(buffer)); // read parameters ?? 
BinaryFileIO.readString(buffer); BinaryFileIO.readString(buffer); // read cols and rows dimension leIO .readFloat(buffer)); results.setNoOfCommonPairs(BinaryFileI 0 header.setRows(BinaryFileIo.readInt(buffer)); header.setCols(BinaryFileIO.readInt(buffer)); Tue Apr 19 02:12:54 2005 CHPFileParser.java 3 BinaryFileIO.readInt(buffer); BinaryFileIO.readInt(buffer); // read no of probe sets header.setNoOfProbeSets(BinaryFileIO.readInt(buffer)); // unused data int maxvalue = BinaryFileIO.readInt(buffer); results.setDetectionPValue(BinaryFileIO.readFl oat(buffer)); BinaryFileIO.readInt(buffer); if (header.getVersion() == 12) BinaryFileIO.readFloat(buffer); for (int i = 0; i < header.getNoOfProbeSets); i++) { BinaryFileIO.readInt(buffer); results.setSignal(BinaryFileIO.readFloat(buffe r) ); for (int i = 0; i < maxvalue; i++) BinaryFileIO.readInt(buffer); results. setDetection (BinaryFileIO. readInt (buff er)); for (int i = 0; i < maxvalue; i++) { int type = BinaryFileIO.readInt(buffer); if (i == 0) { if for (int j = 0; j < results.getNoOfPairs(); j+ +) { BinaryFileIO.readFloat(buffer); BinaryFileIO.readInt(buffer); (type == 3) header . setGeneChipAssayType( if (header.getVersion) == 12) BinaryFileIO. readInt (buffer); BinaryFileIO.readInt(buffer); BinaryFileIO.readFloat (buffer) DataConstants.GENECHIPASSAYTYPE_EXPRESSION); else if (type == 2) header .setCeneChipAssayType( DataConstants.GENECHIPASSAYTYPEGENOTYPING); BinaryFileIO.readFloat(buffer) else header BinaryFileIO. 
readInt (buffer); .setGeneChipAssayType( BinaryFileiO.readChar(buffer); DataConstants.GENECHIPASSAYTYPEJUNKNOWN); BinaryFileIO.readChar(buffer); } else for (int i = 0; i < header.getNoOfProbeSetso; i++) BinaryFileIO.readInt(buffer); BinaryFileIO.readUnsignedShort (buffer); BinaryFileIO.readUnsignedShort (buffer); header.setChipType(BinaryFileIO.readFixedString(buffer, 256)); header.setParentCellFile(BinaryFileIO.readFixedString(buffer, if (header.getVersion() == 12) BinaryFileIO.readInt (buffer); BinaryFileIO. readInt (buffer); BinaryFileIO. readFloat (buffer) 256)); header.setProgID(BinaryFileIO.readString(buffer)); if (header.getGeneChipAssayType() !=DataConstants.GENECHIP_A BinaryFileIO. readFloat (buffer) SSAYTYPEEXPRESSION && header.getGeneChipAssayType() != DataConsta nts.GENECHIPASSAYTYPEGENOTYPING) BinaryFileIO.readInt(buffer); BinaryFileIO.readChar(buffer); BinaryFileIO.readChar(buffer); { throw new IOException("Supports only Expression or Gen otypes"); } else BinaryFileIO.readUnsignedShort if == DataConstants.GENECHIP-A (buffer); for (int i = 0; i < header.getNoOfProbeSets(); i++) { ExpressionProbeSetResults results = new Expres (buffer); (header.getGeneChipAssayType) BinaryFileIO.readunsignedShort SSAYTYPEEXPRESSION) ( sionProbeSetResults(); results.setNoOfPairs(BinaryFileIO.readInt(buff results er))); results.setNoOfUsedPairs(BinaryFileIO.readInt( setHasCompResults (BinaryFileI O.readInt(buffer) == 1 ? true buffer)); : false); if (header.getVersion() <= 12) if (results.isHasCompResults() results .setNoOfCommonPairs(Bi BinaryFileIO.readInt(buffer); BinaryFileIO.readInt(buffer); naryFileIO if (header.getVersion() == 12) ( BinaryFileIO.readInt(buffer); .readI nt(buffer)); CHPFileParser. 
java 4 Tue Apr 19 02:12:54 2005 BinaryFileIO.readInt(buffer); BinaryFileIO.readInt(buffer); BinaryFileIO.readInt(buffer); if (header.getVersion() == 12) { BinaryFileIO.readInt(buffer); BinaryFileIO.readInt(buffer); BinaryFileIO.readInt(buffer); int cval if (buffer)); BinaryFileIO.readChar(buffer); (header.getVersion() == 12) { BinaryFileIO.readChar(buffer); BinaryFileIO.readInt(buffer); BinaryFileIO.readInt(buffer); 1) results.setAlleleCall(BinaryFileIO .readUnsignedChar(buff if (header.getVersion() == 12) { results.setConfidence((float) results.setSignalLogRatioHigh((float) BinaryFileIO .readInt(buffer) / 100 BinaryFileIO.readChar(buffer); BinaryFileIO.readInt(buffer); BinaryFileIO.readString(buffer); BinaryFileIO.readstring(buffer); BinaryFileIO.readString(buffer); BinaryFileIO.readInt(buffer); BinaryFileIO.readInt(buffer); results.setChange(BinaryFileIO.readInt if = (cval == BinaryFileIO .readInt(buffe 0); r) / 1000); } else { BinaryFileIO.readInt(buffer); results.setConfidence(BinaryFi if (header.getVersion() == 12) BinaryFileIO.readInt(buffer); leIO .readFloat(buf fer)); results.setSignalLogRatio((float) Bina ryFileIO BinaryFileIO.readFloat(buffer); BinaryFileIO.readFloat(buffer); BinaryFileIo.readFloat(buffer); .readInt(buffer) / 100 0); if (header.getVersion() == 12) BinaryFileIO.readInt(buffer); results.setRasl(BinaryFileIO.readFloat (buffer)); results.setRas2 (BinaryFilelO.readFloat results.setSignalLogRatioLow((float) B inaryFileIO (buffer)); .readInt(buffer) / 100 } else results.setConfidence(Of); results.setRasl(Of); results.setRas2(Of); results.setAlleleCall(DataConstants.AL 0); if (header.getVersion() == 12) { results.setChangePValue((float ) BinaryFileIO LELENOCALL); .readInt(buffe r) / 1000); else { results.setChangePValue(Binary FileIO .readFloat(buf results.setPvalue-AA(0.Of); results.setPvalueAB(Of); results.setPvalueBB(Of); results.setPvalueNoCall(Of); fer)); BinaryFileIO.readString(buffer); BinaryFileIO.readString(buffer); int np = 
BinaryFileIO.readInt(buffer); expressionResults.add(results); for (int j = 0; j < np; i++) { BinaryFileIO.readInt(buffer); else for (int i = 0; i < header.getNoQfProbeSets); i++) GenotypeProbeSetResults results = new Genotype ProbeSetResults(); int ngroups = BinaryFileIO.readInt(buffer); for (int j 0; j < ngroups; j++) { BinaryFileIO.readInt(buffer); BinaryFileIO.readString(buffer); BinaryFileIO.readChar(buffer); if (header.getVersion() == 12) { BinaryFileIO.readInt (buffer); BinaryFileIO.readInt(buffer); BinaryFileIO.readInt(buffer); BinaryFileIO.readInt(buffer); BinaryFileIO.readInt(buffer); BinaryFileIO. readChar (buffer); BinaryFileIO. readChar (buffer); else ( BinaryFileIO.readUnsignedChar( Tue Apr 19 02:12:54 2005 CHPFileParser. java 5 return genotypeResults; buffer); BinaryFileIO.readUnsignedChar( ) buffer); if * * */ (header.getVersion() == 12) BinaryFileIO.readInt(buffer); @param genotypeResults The genotypeResults to set. public void setGenotypeResults(List genotypeResults) this.genotypeResults = genotypeResults; BinaryFileIO.readInt(buffer); BinaryFileIO.readInt(buffer); BinaryFileIO.readInt(buffer); BinaryFileIO.readInt(buffer); BinaryFileIO.readChar(buffer); BinaryFileIO.readChar(buffer); @return Returns the header. */ public CHPHeaderData getHeader) return header; * } else { BinaryFileIO.readUnsignedChar( buffer); BinaryFileIO.readUnsignedChar( buffer); @param header The header to set. 
 */
public void setHeader(CHPHeaderData header) {
    this.header = header;
}

public CHPData getCHPData() {
    CHPData data = new CHPData();
    data.setExpressionProbeSetResults(this.expressionResults);
    data.setGenotypeProbeSetResults(this.genotypeResults);
    data.setHeader(this.header);
    return data;
}

            this.genotypeResults.add(results);
        }
    }
}

private boolean isXDAComptableFile(ByteBuffer buffer) {
    int magic = BinaryFileIO.readInt(buffer);
    return (magic == CHPFILEMAGIC_NUMBER);
}

public static void main(String[] args) throws IOException {
    String file = args[0];
    CHPFileParser parser = new CHPFileParser();
    parser.read(file);
    CHPHeaderData header = parser.getHeader();
    System.out.println(header.getAlgorithmName());
    System.out.println(header.getAlgorithmVersion());
    System.out.println(header.getMagicNumber());
    System.out.println(header.getVersion());
    System.out.println(header.getAlgorithmParameters());
    System.out.println(header.getParentCellFile());
    System.out.println(header.getChipType());
    System.out.println(header.getProgID());
}

/**
 * @return Returns the expressionResults.
 */
public List getExpressionResults() {
    return expressionResults;
}

/**
 * @param expressionResults The expressionResults to set.
 */
public void setExpressionResults(List expressionResults) {
    this.expressionResults = expressionResults;
}

/**
 * @return Returns the genotypeResults.
 */
public List getGenotypeResults() {
    return genotypeResults;
}
}

CHPHeaderData.java    Wed Mar 09 01:41:46 2005    1

/*
 * Created on Mar 8, 2005
 *
 * TODO To change the template for this generated file go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
package edu.mit.data.affymetrix;

import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

/**
 * @author Aidan Downes
 *
 * TODO To change the template for this generated type comment go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
public class CHPHeaderData implements Serializable {

    /**
     * Comment for <code>serialVersionUID</code>
     */
    private static final long serialVersionUID = 1L;

    private int magicNumber;
    private int version;
    private int cols;
    private int rows;
    private int noOfProbeSets;
    private int geneChipAssayType;
    private String chipType;
    private String algorithmName;
    private String algorithmVersion;
    private String parentCellFile;
    private String progID;
    private List algorithmParameters;
    private List summaryParameters;

    /**
     * @return Returns the algorithmName.
     */
    public String getAlgorithmName() {
        return algorithmName;
    }

    /**
     * @param algorithmName The algorithmName to set.
     */
    public void setAlgorithmName(String algorithmName) {
        this.algorithmName = algorithmName;
    }

    /**
     * @return Returns the algorithmParameters.
     */
    public List getAlgorithmParameters() {
        return algorithmParameters;
    }

    /**
     * @param algorithmParameters The algorithmParameters to set.
     */
    public void setAlgorithmParameters(List algorithmParameters) {
        this.algorithmParameters = algorithmParameters;
    }

    /**
     * @return Returns the algorithmVersion.
     */
    public String getAlgorithmVersion() {
        return algorithmVersion;
    }

    /**
     * @param algorithmVersion The algorithmVersion to set.
     */
    public void setAlgorithmVersion(String algorithmVersion) {
        this.algorithmVersion = algorithmVersion;
    }

    /**
     * @return Returns the summaryParameters.
     */
    public List getSummaryParameters() {
        return summaryParameters;
    }

    /**
     * @param summaryParameters The summaryParameters to set.
     */
    public void setSummaryParameters(List summaryParameters) {
        this.summaryParameters = summaryParameters;
    }

    /**
     * @return Returns the chipType.
     */
    public String getChipType() {
        return chipType;
    }

    /**
     * @param chipType The chipType to set.
     */
    public void setChipType(String chipType) {
        this.chipType = chipType;
    }

    /**
     * @return Returns the cols.
     */
    public int getCols() {
        return cols;
    }

    /**
     * @param cols The cols to set.
     */
    public void setCols(int cols) {
        this.cols = cols;
    }

    /**
     * @return Returns the geneChipAssayType.
     */
    public int getGeneChipAssayType() {
        return geneChipAssayType;
    }

    /**
     * @param geneChipAssayType The geneChipAssayType to set.
     */
    public void setGeneChipAssayType(int geneChipAssayType) {
        this.geneChipAssayType = geneChipAssayType;
    }

    /**
     * @return Returns the magicNumber.
     */
    public int getMagicNumber() {
        return magicNumber;
    }

    /**
     * @param magicNumber The magicNumber to set.
     */
    public void setMagicNumber(int magicNumber) {
        this.magicNumber = magicNumber;
    }

    /**
     * @return Returns the noOfProbeSets.
     */
    public int getNoOfProbeSets() {
        return noOfProbeSets;
    }

    /**
     * @param noOfProbeSets The noOfProbeSets to set.
     */
    public void setNoOfProbeSets(int noOfProbeSets) {
        this.noOfProbeSets = noOfProbeSets;
    }

    /**
     * @return Returns the parentCellFile.
     */
    public String getParentCellFile() {
        return parentCellFile;
    }

    /**
     * @param parentCellFile The parentCellFile to set.
     */
    public void setParentCellFile(String parentCellFile) {
        this.parentCellFile = parentCellFile;
    }

    /**
     * @return Returns the progID.
     */
    public String getProgID() {
        return progID;
    }

    /**
     * @param progID The progID to set.
     */
    public void setProgID(String progID) {
        this.progID = progID;
    }

    /**
     * @return Returns the rows.
     */
    public int getRows() {
        return rows;
    }

    /**
     * @param rows The rows to set.
     */
    public void setRows(int rows) {
        this.rows = rows;
    }

    /**
     * @return Returns the version.
     */
    public int getVersion() {
        return version;
    }

    /**
     * @param version The version to set.
     */
    public void setVersion(int version) { this.version = version; }
}

DataConstants.java Thu Apr 28 23:27:08 2005

/*
 * Created on Mar 8, 2005
 *
 * TODO To change the template for this generated file go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
package edu.mit.data.affymetrix;

/**
 * @author Aidan Downes
 *
 * Not elegant but it works
 */
public class DataConstants {

    // Expression, Genotyping, Resequencing, Universal, Unknown
    public static final int GENECHIP_ASSAY_TYPE_EXPRESSION = 0;
    public static final int GENECHIP_ASSAY_TYPE_GENOTYPING = 1;
    public static final int GENECHIP_ASSAY_TYPE_RESEQUENCING = 2;
    public static final int GENECHIP_ASSAY_TYPE_UNIVERSAL = 3;
    public static final int GENECHIP_ASSAY_TYPE_UNKNOWN = 4;

    public static final int ALLELE_A_CALL = 6;
    public static final int ALLELE_B_CALL = 7;
    public static final int ALLELE_AB_CALL = 8;
    public static final int ALLELE_NO_CALL = 11;

    public static final int ABS_PRESENT_CALL = 0;
    public static final int ABS_MARGINAL_CALL = 1;
    public static final int ABS_ABSENT_CALL = 2;
    public static final int ABS_NO_CALL = 3;

    public static final int COMP_INCREASE_CALL = 1;
    public static final int COMP_DECREASE_CALL = 2;
    public static final int COMP_MOD_INCREASE_CALL = 3;
    public static final int COMP_MOD_DECREASE_CALL = 4;
    public static final int COMP_NO_CHANGE_CALL = 5;
    public static final int COMP_NO_CALL = 6;

    public static final char CELL_DELIM_CHAR = 0x14;
    public static final int MIN_CELL_STR = 4;
    public static final int CELL_FILE_MAGIC_NUMBER = 64;
    public static final int CELL_FILE_VERSION_NUMBER = 4;

    public static String getAlleleCallString(int alleleCall) {
        switch (alleleCall) {
        case ALLELE_A_CALL:
            return "A";
        case ALLELE_B_CALL:
            return "B";
        case ALLELE_AB_CALL:
            return "AB";
        case ALLELE_NO_CALL:
            return "No Call";
        default:
            break;
        }
        return "";
    }

    public static String getDetectionString(int detection) {
        switch (detection) {
        case ABS_PRESENT_CALL:
            return "P";
        case ABS_MARGINAL_CALL:
            return "M";
        case ABS_ABSENT_CALL:
            return "A";
        case ABS_NO_CALL:
            return "No Call";
        default:
            break;
        }
        return "";
    }

    public static String getChangeString(int change) {
        switch (change) {
        case COMP_INCREASE_CALL:
            return "I";
        case COMP_DECREASE_CALL:
            return "D";
        case COMP_MOD_INCREASE_CALL:
            return "MI";
        case COMP_MOD_DECREASE_CALL:
            return "MD";
        case COMP_NO_CHANGE_CALL:
            return "NC";
        case COMP_NO_CALL:
            return "No Call";
        default:
            break;
        }
        return "";
    }
}

ExperibaseImporter.java Thu May 12 13:32:56 2005

package org.experibase.importer;

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.logging.*;

import oracle.jbo.ViewObject;

import org.experibase.importer.utils.GetOpt;

import edu.mit.parsers.affymetrix.*;

/**
 * Adds data from raw files to experibase experiment
 */
public class ExperibaseImporter {

    private int experibaseId = -1;
    private static Logger logger = Logger.getLogger("org.experibase");

    public ExperibaseImporter(int experibaseId) {
        this.experibaseId = experibaseId;
    }

    /**
     * Imports a file into experibase, file types supported include:
     * EXP files
     * CHP files
     *
     * @param fileNames
     */
    public void importFiles(ArrayList fileNames) {
        AffymetrixImporter affyImporter = new AffymetrixImporter(experibaseId);
        for (Iterator iter = fileNames.iterator(); iter.hasNext();) {
            File file = new File(iter.next().toString());
            if (file.exists() && !file.isDirectory()) {
                String ext = getFileExtension(file.getName());
                if (ext.equalsIgnoreCase(".CHP")) {
                    try {
                        CHPFileParser chpParser = new CHPFileParser();
                        chpParser.read(file.getAbsolutePath());
                        affyImporter.importData(chpParser.getCHPData());
                    } catch (IOException e) {
                        logger.log(Level.WARNING, "problems parsing chp file", e);
                    }
                } else if (ext.equalsIgnoreCase(".EXP")) {
                    try {
                        ExperimentFileParser expParser = new ExperimentFileParser();
                        expParser.read(file.getAbsolutePath());
                        affyImporter.importData(expParser.getData());
                    } catch (IOException e) {
                        logger.log(Level.WARNING, "problems parsing exp file", e);
                    }
                } else if (ext.equalsIgnoreCase(".CEL")) {
                    try {
                        CELFileParser celParser = new CELFileParser();
                        celParser.read(file.getAbsolutePath());
                        affyImporter.importData(celParser.getData());
                    } catch (IOException e) {
                        logger.log(Level.WARNING, "problems parsing cel file", e);
                    }
                }
            }
        }
        affyImporter.persist();
        logger.info("finished import files");
    }

    /**
     * Returns the extension of a file including the leading '.'
     *
     * @param fileName
     * @return The file extension or the empty string if fileName is null
     *         or the file has no extension
     */
    private static String getFileExtension(String fileName) {
        if (fileName == null)
            return "";
        int index = fileName.lastIndexOf('.');
        if (index < 0)
            return "";
        else
            return fileName.substring(index);
    }

    /**
     * Adds data from files to experiment
     *
     * @param args Args to program
     */
    public static void main(String[] args) {
        try {
            FileHandler fh = new FileHandler("experibasefiles.txt", 5242880, 1, true);
            logger.addHandler(fh);
        } catch (IOException e) {
            // ignore
        }
        logger.setLevel(Level.ALL);

        GetOpt go = new GetOpt(args, "he:");
        go.optErr = true;
        int ch = -1;
        // process options in command line arguments
        boolean usagePrint = false;
        int eId = -1;
        while ((ch = go.getopt()) != go.optEOF) {
            if ((char) ch == 'h')
                usagePrint = true;
            else if ((char) ch == 'e') {
                eId = go.processArg(go.optArgGet(), eId);
                logger.info("ExperibaseId is " + eId);
            } else
                doHelp(1); // undefined option
        }
        if (usagePrint)
            doHelp(0);

        ArrayList files = new ArrayList();
        // process non-option command line arguments
        for (int k = go.optIndexGet(); k < args.length; k++) {
            logger.info("Processing " + args[k]);
            files.add(args[k]);
        }
        ExperibaseImporter importer = new ExperibaseImporter(eId);
        importer.importFiles(files);
    }

    /*
     * Stub for providing help on usage.
     * You can write a longer help than this, certainly.
     */
    static void doHelp(int returnValue) {
        System.err.println("Usage: Importer -e experibaseId file ... ");
        System.err.println("\te\tExperibaseId");
        System.err.println("\th\tPrints this menu");
        logger.info("help shown with return value " + returnValue);
        System.exit(returnValue);
    }
}

ExperimentData.java Tue Apr 19 02:16:12 2005

/*
 * Created on Mar 7, 2005
 *
 * TODO To change the template for this generated file go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
package edu.mit.data.affymetrix;

import java.io.Serializable;

/**
 * @author Aidan Downes
 *
 * TODO To change the template for this generated type comment go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
public class ExperimentData implements Serializable {

    /** Comment for <code>serialVersionUID</code> */
    private static final long serialVersionUID = 3257288036928009526L;

    private String name;
    private String directoryPath;
    private String absolutePath;

    // [Sample Info]
    // Chip Type       Ecoli_ASv2
    // Chip Lot
    // Operator
    // Sample Type
    // Description
    // Project
    // Comments
    // Solution Type
    // Solution Lot
    private String chipType;
    private String chipLot;
    private String operator;
    private String sampleType;
    private String description;
    private String project;
    private String comments;
    private String solutionType;
    private String solutionLot;

    // [Fluidics]
    // Protocol
    // Station
    // Module
    // Hybridize Date
    private String protocol;
    private String station;
    private String module;
    private String hybridizeDate;

    // [Scanner]
    // Pixel Size
    // Filter
    // Scan Temperature
    // Scan Date
    // Scanner ID
    // Number of Scans
    // Scanner Type
    private String pixelSize;
    private String filter;
    private String scanTemperature;
    private String scanDate;
    private String scannerId;
    private String numberOfScans;
    private String scannerType;

    /** @return Returns the absolutePath. */
    public String getAbsolutePath() { return absolutePath; }
    /** @param absolutePath The absolutePath to set. */
    public void setAbsolutePath(String absolutePath) { this.absolutePath = absolutePath; }

    /** @return Returns the chipLot. */
    public String getChipLot() { return chipLot; }
    /** @param chipLot The chipLot to set. */
    public void setChipLot(String chipLot) { this.chipLot = chipLot; }

    /** @return Returns the chipType. */
    public String getChipType() { return chipType; }
    /** @param chipType The chipType to set. */
    public void setChipType(String chipType) { this.chipType = chipType; }

    /** @return Returns the comments. */
    public String getComments() { return comments; }
    /** @param comments The comments to set. */
    public void setComments(String comments) { this.comments = comments; }

    /** @return Returns the description. */
    public String getDescription() { return description; }
    /** @param description The description to set. */
    public void setDescription(String description) { this.description = description; }

    /** @return Returns the directoryPath. */
    public String getDirectoryPath() { return directoryPath; }
    /** @param directoryPath The directoryPath to set. */
    public void setDirectoryPath(String directoryPath) { this.directoryPath = directoryPath; }

    /** @return Returns the filter. */
    public String getFilter() { return filter; }
    /** @param filter The filter to set. */
    public void setFilter(String filter) { this.filter = filter; }

    /** @return Returns the hybridizeDate. */
    public String getHybridizeDate() { return hybridizeDate; }
    /** @param hybridizeDate The hybridizeDate to set. */
    public void setHybridizeDate(String hybridizeDate) { this.hybridizeDate = hybridizeDate; }

    /** @return Returns the module. */
    public String getModule() { return module; }
    /** @param module The module to set. */
    public void setModule(String module) { this.module = module; }

    /** @return Returns the name. */
    public String getName() { return name; }
    /** @param name The name to set. */
    public void setName(String name) { this.name = name; }

    /** @return Returns the numberOfScans. */
    public String getNumberOfScans() { return numberOfScans; }
    /** @param numberOfScans The numberOfScans to set. */
    public void setNumberOfScans(String numberOfScans) { this.numberOfScans = numberOfScans; }

    /** @return Returns the operator. */
    public String getOperator() { return operator; }
    /** @param operator The operator to set. */
    public void setOperator(String operator) { this.operator = operator; }

    /** @return Returns the pixelSize. */
    public String getPixelSize() { return pixelSize; }
    /** @param pixelSize The pixelSize to set. */
    public void setPixelSize(String pixelSize) { this.pixelSize = pixelSize; }

    /** @return Returns the project. */
    public String getProject() { return project; }
    /** @param project The project to set. */
    public void setProject(String project) { this.project = project; }

    /** @return Returns the protocol. */
    public String getProtocol() { return protocol; }
    /** @param protocol The protocol to set. */
    public void setProtocol(String protocol) { this.protocol = protocol; }

    /** @return Returns the sampleType. */
    public String getSampleType() { return sampleType; }
    /** @param sampleType The sampleType to set. */
    public void setSampleType(String sampleType) { this.sampleType = sampleType; }

    /** @return Returns the scanDate. */
    public String getScanDate() { return scanDate; }
    /** @param scanDate The scanDate to set. */
    public void setScanDate(String scanDate) { this.scanDate = scanDate; }

    /** @return Returns the scannerId. */
    public String getScannerId() { return scannerId; }
    /** @param scannerId The scannerId to set. */
    public void setScannerId(String scannerId) { this.scannerId = scannerId; }

    /** @return Returns the scannerType. */
    public String getScannerType() { return scannerType; }
    /** @param scannerType The scannerType to set. */
    public void setScannerType(String scannerType) { this.scannerType = scannerType; }

    /** @return Returns the scanTemperature. */
    public String getScanTemperature() { return scanTemperature; }
    /** @param scanTemperature The scanTemperature to set. */
    public void setScanTemperature(String scanTemperature) { this.scanTemperature = scanTemperature; }

    /** @return Returns the solutionLot. */
    public String getSolutionLot() { return solutionLot; }
    /** @param solutionLot The solutionLot to set. */
    public void setSolutionLot(String solutionLot) { this.solutionLot = solutionLot; }

    /** @return Returns the solutionType. */
    public String getSolutionType() { return solutionType; }
    /** @param solutionType The solutionType to set. */
    public void setSolutionType(String solutionType) { this.solutionType = solutionType; }

    /** @return Returns the station. */
    public String getStation() { return station; }
    /** @param station The station to set. */
    public void setStation(String station) { this.station = station; }
}
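The fields of ExperimentData mirror the labeled lines of an Affymetrix .EXP file, grouped under [Sample Info], [Fluidics] and [Scanner]. The sketch below is not part of the thesis code. It illustrates, under stated assumptions, how one labeled line maps to one stored value: the class name ExpLineSketch, the abbreviated label list, and the tab between label and value in the sample text are all hypothetical, and only the label names and the example chip type Ecoli_ASv2 are taken from the listing above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of the thesis code. */
final class ExpLineSketch {

    // Labels copied from the comments in ExperimentData (list abbreviated).
    private static final List<String> LABELS = Arrays.asList(
            "Chip Type", "Chip Lot", "Operator", "Sample Type",
            "Protocol", "Station", "Scan Date", "Scanner ID", "Scanner Type");

    /** Returns label -> value for every known label that has a non-empty value. */
    static Map<String, String> parse(String text) {
        Map<String, String> values = new LinkedHashMap<>();
        for (String line : text.split("\\r?\\n")) {
            for (String label : LABELS) {
                if (line.startsWith(label)) {
                    // Everything after the label, minus surrounding whitespace.
                    String value = line.substring(label.length()).trim();
                    if (!value.isEmpty()) {
                        values.put(label, value);
                    }
                    break; // a line carries at most one label
                }
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // Assumed sample text; the real file layout may differ.
        String sample = "[Sample Info]\nChip Type\tEcoli_ASv2\nOperator\t\nScan Date\tApr 19 2005\n";
        System.out.println(parse(sample));
    }
}
```

Section headers such as [Sample Info] match no label and are skipped, and a label with an empty value (Operator in the sample) is not stored, so the corresponding field would stay null.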
ExperimentFileParser.java Tue Apr 19 05:06:56 2005

/*
 * Created on Mar 7, 2005
 *
 * TODO To change the template for this generated file go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
package edu.mit.parsers.affymetrix;

import java.io.*;
import java.nio.CharBuffer;
import java.util.logging.Logger;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import edu.mit.data.affymetrix.ExperimentData;
import edu.mit.parsers.FileParser;

/**
 * @author Aidan Downes
 *
 * TODO To change the template for this generated type comment go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
public class ExperimentFileParser extends FileParser {

    private static final String CHIP_TYPE = "Chip Type";
    private static final String CHIP_LOT = "Chip Lot";
    private static final String SAMPLE_TYPE = "Sample Type";
    private static final String DESCRIPTION = "Description";
    private static final String PROJECT = "Project";
    private static final String COMMENTS = "Comments";
    private static final String SOLUTION_TYPE = "Solution Type";
    private static final String SOLUTION_LOT = "Solution Lot";
    private static final String PROTOCOL = "Protocol";
    private static final String STATION = "Station";
    private static final String MODULE = "Module";
    private static final String HYBRIDIZE_DATE = "Hybridize Date";
    private static final String PIXEL_SIZE = "Pixel Size";
    private static final String FILTER = "Filter";
    private static final String SCAN_TEMPERATURE = "Scan Temperature";
    private static final String SCAN_DATE = "Scan Date";
    private static final String SCANNER_ID = "Scanner ID";
    private static final String NUMBER_OF_SCANS = "Number of Scans";
    private static final String SCANNER_TYPE = "Scanner Type";
    private static final String OPERATOR = "Operator";

    ExperimentData data;

    public void read(String filePath) throws IOException {
        File file = new File(filePath);
        String name = file.getName();
        int index = name.lastIndexOf(".");
        name = name.substring(0, index);

        CharBuffer buffer = getCharBuffer(filePath);

        // line pattern
        Pattern linePattern = Pattern.compile(".*$", Pattern.MULTILINE);

        // Match line
        Matcher lineMatcher = linePattern.matcher(buffer);
        data = new ExperimentData();
        data.setName(name);
        String value = "";

        logger.info("reading file contents");
        while (lineMatcher.find()) {
            String line = lineMatcher.group();
            if (line.startsWith(CHIP_TYPE)) {
                value = line.substring(CHIP_TYPE.length()).trim();
                if (!value.equals(""))
                    data.setChipType(value);
            } else if (line.startsWith(CHIP_LOT)) {
                value = line.substring(CHIP_LOT.length()).trim();
                if (!value.equals(""))
                    data.setChipLot(value);
            } else if (line.startsWith(SAMPLE_TYPE)) {
                value = line.substring(SAMPLE_TYPE.length()).trim();
                if (!value.equals(""))
                    data.setSampleType(value);
            } else if (line.startsWith(DESCRIPTION)) {
                value = line.substring(DESCRIPTION.length()).trim();
                if (!value.equals(""))
                    data.setDescription(value);
            } else if (line.startsWith(PROJECT)) {
                value = line.substring(PROJECT.length()).trim();
                if (!value.equals(""))
                    data.setProject(value);
            } else if (line.startsWith(COMMENTS)) {
                value = line.substring(COMMENTS.length()).trim();
                if (!value.equals(""))
                    data.setComments(value);
            } else if (line.startsWith(SOLUTION_TYPE)) {
                value = line.substring(SOLUTION_TYPE.length()).trim();
                if (!value.equals(""))
                    data.setSolutionType(value);
            } else if (line.startsWith(SOLUTION_LOT)) {
                value = line.substring(SOLUTION_LOT.length()).trim();
                if (!value.equals(""))
                    data.setSolutionLot(value);
            } else if (line.startsWith(PROTOCOL)) {
                value = line.substring(PROTOCOL.length()).trim();
                if (!value.equals(""))
                    data.setProtocol(value);
            } else if (line.startsWith(STATION)) {
                value = line.substring(STATION.length()).trim();
                if (!value.equals(""))
                    data.setStation(value);
            } else if (line.startsWith(MODULE)) {
                value = line.substring(MODULE.length()).trim();
                if (!value.equals(""))
                    data.setModule(value);
            } else if (line.startsWith(HYBRIDIZE_DATE)) {
                value = line.substring(HYBRIDIZE_DATE.length()).trim();
                if (!value.equals(""))
                    data.setHybridizeDate(value);
            } else if (line.startsWith(PIXEL_SIZE)) {
                value = line.substring(PIXEL_SIZE.length()).trim();
                if (!value.equals(""))
                    data.setPixelSize(value);
            } else if (line.startsWith(FILTER)) {
                value = line.substring(FILTER.length()).trim();
                if (!value.equals(""))
                    data.setFilter(value);
            } else if (line.startsWith(SCAN_TEMPERATURE)) {
                value = line.substring(SCAN_TEMPERATURE.length()).trim();
                if (!value.equals(""))
                    data.setScanTemperature(value);
            } else if (line.startsWith(SCAN_DATE)) {
                value = line.substring(SCAN_DATE.length()).trim();
                if (!value.equals(""))
                    data.setScanDate(value);
            } else if (line.startsWith(SCANNER_ID)) {
                value = line.substring(SCANNER_ID.length()).trim();
                if (!value.equals(""))
                    data.setScannerId(value);
            } else if (line.startsWith(NUMBER_OF_SCANS)) {
                value = line.substring(NUMBER_OF_SCANS.length()).trim();
                if (!value.equals(""))
                    data.setNumberOfScans(value);
            } else if (line.startsWith(SCANNER_TYPE)) {
                value = line.substring(SCANNER_TYPE.length()).trim();
                if (!value.equals(""))
                    data.setScannerType(value);
            } else if (line.startsWith(OPERATOR)) {
                value = line.substring(OPERATOR.length()).trim();
                if (!value.equals(""))
                    data.setOperator(value);
            }
        }
        logger.info("finished reading file contents");
    }

    /** @return Returns the data. */
    public ExperimentData getData() { return data; }

    public static void main(String[] args) {
    }
}

ExpressionProbeSetResults.java Wed Mar 09 04:19:54 2005

/*
 * Created on Mar 8, 2005
 *
 * TODO To change the template for this generated file go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
package edu.mit.data.affymetrix;

import java.io.Serializable;

/**
 * @author Aidan Downes
 *
 * TODO To change the template for this generated type comment go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
public class ExpressionProbeSetResults implements Serializable {

    /** Comment for <code>serialVersionUID</code> */
    private static final long serialVersionUID = 1L;

    float detectionPValue;
    float signal;
    int noOfPairs;
    int noOfUsedPairs;
    int detection;
    boolean hasCompResults;
    float changePValue;
    float signalLogRatio;
    float signalLogRatioLow;
    float signalLogRatioHigh;
    int noOfCommonPairs;
    int change;

    /** @return Returns the change. */
    public int getChange() { return change; }
    /** @param change The change to set. */
    public void setChange(int change) { this.change = change; }

    /** @return Returns the changePValue. */
    public float getChangePValue() { return changePValue; }
    /** @param changePValue The changePValue to set. */
    public void setChangePValue(float changePValue) { this.changePValue = changePValue; }

    /** @return Returns the detection. */
    public int getDetection() { return detection; }
    /** @param detection The detection to set. */
    public void setDetection(int detection) { this.detection = detection; }

    /** @return Returns the detectionPValue. */
    public float getDetectionPValue() { return detectionPValue; }
    /** @param detectionPValue The detectionPValue to set. */
    public void setDetectionPValue(float detectionPValue) { this.detectionPValue = detectionPValue; }

    /** @return Returns the hasCompResults. */
    public boolean isHasCompResults() { return hasCompResults; }
    /** @param hasCompResults The hasCompResults to set. */
    public void setHasCompResults(boolean hasCompResults) { this.hasCompResults = hasCompResults; }

    /** @return Returns the noOfCommonPairs. */
    public int getNoOfCommonPairs() { return noOfCommonPairs; }
    /** @param noOfCommonPairs The noOfCommonPairs to set. */
    public void setNoOfCommonPairs(int noOfCommonPairs) { this.noOfCommonPairs = noOfCommonPairs; }

    /** @return Returns the noOfPairs. */
    public int getNoOfPairs() { return noOfPairs; }
    /** @param noOfPairs The noOfPairs to set. */
    public void setNoOfPairs(int noOfPairs) { this.noOfPairs = noOfPairs; }

    /** @return Returns the noOfUsedPairs. */
    public int getNoOfUsedPairs() { return noOfUsedPairs; }
    /** @param noOfUsedPairs The noOfUsedPairs to set. */
    public void setNoOfUsedPairs(int noOfUsedPairs) { this.noOfUsedPairs = noOfUsedPairs; }

    /** @return Returns the signal. */
    public float getSignal() { return signal; }
    /** @param signal The signal to set. */
    public void setSignal(float signal) { this.signal = signal; }

    /** @return Returns the signalLogRatio. */
    public float getSignalLogRatio() { return signalLogRatio; }
    /** @param signalLogRatio The signalLogRatio to set. */
    public void setSignalLogRatio(float signalLogRatio) { this.signalLogRatio = signalLogRatio; }

    /** @return Returns the signalLogRatioHigh. */
    public float getSignalLogRatioHigh() { return signalLogRatioHigh; }
    /** @param signalLogRatioHigh The signalLogRatioHigh to set. */
    public void setSignalLogRatioHigh(float signalLogRatioHigh) { this.signalLogRatioHigh = signalLogRatioHigh; }

    /** @return Returns the signalLogRatioLow. */
    public float getSignalLogRatioLow() { return signalLogRatioLow; }
    /** @param signalLogRatioLow The signalLogRatioLow to set. */
    public void setSignalLogRatioLow(float signalLogRatioLow) { this.signalLogRatioLow = signalLogRatioLow; }
}

FileParser.java Tue Mar 08 21:55:58 2005

/*
 * Created on Mar 7, 2005
 *
 * TODO To change the template for this generated file go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
package edu.mit.parsers;

import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.util.logging.Level;
import java.util.logging.Logger;

/**
 * @author Aidan Downes
 *
 * TODO To change the template for this generated type comment go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
public class FileParser {

    protected Logger logger;

    public FileParser() {
        logger = Logger.getLogger("edu.mit.parsers");
    }

    /**
     * Gets a character buffer containing the contents of the file at
     * <code>filePath</code>
     *
     * @param filePath
     * @return
     * @throws IOException
     */
    public CharBuffer getCharBuffer(String filePath) throws IOException {
        try {
            ByteBuffer buffer = getBuffer(filePath);
            // Convert to char buffer
            Charset charset = Charset.forName("ISO-8859-1");
            CharsetDecoder decoder = charset.newDecoder();
            return decoder.decode(buffer);
        } catch (FileNotFoundException e) {
            logger.log(Level.WARNING, e.getLocalizedMessage(), e);
            throw e;
        } catch (IOException e) {
            logger.log(Level.WARNING, e.getLocalizedMessage(), e);
            throw e;
        }
    }

    /**
     * @param filePath
     * @return
     * @throws FileNotFoundException
     * @throws IOException
     */
    protected ByteBuffer getBuffer(String filePath) throws FileNotFoundException, IOException {
        // Map file to file byte buffer
        FileInputStream input = new FileInputStream(filePath);
        FileChannel channel = input.getChannel();
        long fileLength = channel.size();
        MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, fileLength);
        return buffer;
    }
}

GenotypeProbeSetResults.java Wed Mar 09 00:06:00 2005

/*
 * Created on Mar 8, 2005
 *
 * TODO To change the template for this generated file go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
package edu.mit.data.affymetrix;

import java.io.Serializable;

/**
 * @author Aidan Downes
 *
 * TODO To change the template for this generated type comment go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
public class GenotypeProbeSetResults implements Serializable {

    /** Comment for <code>serialVersionUID</code> */
    private static final long serialVersionUID = 1L;

    int alleleCall;
    float confidence;
    float ras1;
    float ras2;
    float pvalue_AA;
    float pvalue_AB;
    float pvalue_BB;
    float pvalue_NoCall;

    /** @return Returns the alleleCall. */
    public int getAlleleCall() { return alleleCall; }
    /** @param alleleCall The alleleCall to set. */
    public void setAlleleCall(int alleleCall) { this.alleleCall = alleleCall; }

    /** @return Returns the confidence. */
    public float getConfidence() { return confidence; }
    /** @param confidence The confidence to set. */
    public void setConfidence(float confidence) { this.confidence = confidence; }

    /** @return Returns the pvalue_AA. */
    public float getPvalue_AA() { return pvalue_AA; }
    /** @param pvalue_AA The pvalue_AA to set. */
    public void setPvalue_AA(float pvalue_AA) { this.pvalue_AA = pvalue_AA; }

    /** @return Returns the pvalue_AB. */
    public float getPvalue_AB() { return pvalue_AB; }
    /** @param pvalue_AB The pvalue_AB to set. */
    public void setPvalue_AB(float pvalue_AB) { this.pvalue_AB = pvalue_AB; }

    /** @return Returns the pvalue_BB. */
    public float getPvalue_BB() { return pvalue_BB; }
    /** @param pvalue_BB The pvalue_BB to set. */
    public void setPvalue_BB(float pvalue_BB) { this.pvalue_BB = pvalue_BB; }

    /** @return Returns the pvalue_NoCall. */
    public float getPvalue_NoCall() { return pvalue_NoCall; }
    /** @param pvalue_NoCall The pvalue_NoCall to set. */
    public void setPvalue_NoCall(float pvalue_NoCall) { this.pvalue_NoCall = pvalue_NoCall; }

    /** @return Returns the ras1. */
    public float getRas1() { return ras1; }
    /** @param ras1 The ras1 to set. */
    public void setRas1(float ras1) { this.ras1 = ras1; }

    /** @return Returns the ras2. */
    public float getRas2() { return ras2; }
    /** @param ras2 The ras2 to set. */
    public void setRas2(float ras2) { this.ras2 = ras2; }
}

GetOpt.java Thu Apr 21 02:15:46 2005

package org.experibase.importer.utils;

// NOTES
// Original Author not known

// OVERVIEW:
//
// GetOpt provides a general means for a Java program to parse command
// line arguments in accordance with the standard Unix conventions;
// it is analogous to, and based on, getopt(3) for C programs.
// (The following documentation is based on the man page for getopt(3).)

// DESCRIPTION:
//
// GetOpt is a Java class that provides one method, getopt,
// and some variables that control behavior of or return additional
// information from getopt.
//
// GetOpt interprets command arguments in accordance with the standard
// Unix conventions: option arguments of a command are introduced by "-"
// followed by a key character, and a non-option argument terminates
// the processing of options. GetOpt's option interpretation is controlled
// by its parameter optString, which specifies what characters designate
// legal options and which of them require associated values.
//
// The getopt method returns the next, moving left to right, option letter
// in the command line arguments that matches a letter in optString.
// optString must contain the option letters the command using getopt
// will recognize. For example, getopt("ab") specifies that the command
// line should contain no options, only "-a", only "-b", or both "-a" and
// "-b" in either order. (The command line can also contain non-option
// arguments after any option arguments.) Multiple options per argument
// are allowed, e.g., "-ab" for the last case above.
//
// If a letter in optString is followed by a colon, the option is expected
// to have an argument. The argument may or may not be separated by
// whitespace from the option letter. For example, getopt("w:") allows
// either "-w 80" or "-w80". The variable optArg is set to the option
// argument, e.g., "80" in either of the previous examples. Conversion
// functions such as Integer.parseInt(), etc., can then be applied to
// optArg.
//
// getopt places in the variable optIndex the index of the next command
// line argument to be processed; optIndex is automatically initialized
// to 1 before the first call to getopt.
//
// When all options have been processed (that is, up to the first
// non-option argument), getopt returns optEOF (-1). getopt recognizes the
// command line argument "--" (i.e., two dashes) to delimit the end of
// the options; getopt returns optEOF and skips "--". Subsequent,
// non-option arguments can be retrieved using the String array passed to
// main(), beginning with argument number optIndex.

// DIAGNOSTICS:
//
// getopt prints an error message on System.err and returns a question
// mark ('?') when it encounters an option letter in a command line argument
// that is not included in optString. Setting the variable optErr to
// false disables this error message.

// NOTES:
//
// The following notes describe GetOpt's behavior in a few interesting
// or special cases; these behaviors are consistent with getopt(3)'s
// behaviors.
// -- A '-' by itself is treated as a non-option argument.
// -- If optString is "a:" and the command line arguments are "-a -x",
//    then "-x" is treated as the argument associated with the "-a".
// -- Duplicate command line options are allowed; it is up to user to
//    deal with them as appropriate.
// -- A command line option like "-b-" is considered as the two options
//    "b" and "-" (so "-" should appear in option string); this differs
//    from "-b --".
// -- Sun and DEC getopt(3)'s differ w.r.t. how "---" is handled.
//    Sun treats "---" (or anything starting with "--") the same as "--";
//    DEC treats "---" as two separate "-" options
//    (so "-" should appear in option string).
//    Java GetOpt follows the DEC convention.
// -- An option 'letter' can be a letter, number, or most special character.
//    Like getopt(3), GetOpt disallows a colon as an option letter.

public class GetOpt {

    private String[] theArgs = null;
    private int argCount = 0;
    private String optString = null;

    public GetOpt(String[] args, String opts) {
        theArgs = args;
        argCount = theArgs.length;
        optString = opts;
    }

    // user can toggle this to control printing of error messages
    public boolean optErr = false;

    public int processArg(String arg, int n) {
        int value;
        try {
            value = Integer.parseInt(arg);
        } catch (NumberFormatException e) {
            if (optErr)
                System.err.println("processArg cannot process " + arg
                    + " as an integer");
            return n;
        }
        return value;
    }

    public int tryArg(int k, int n) {
        int value;
        try {
            value = processArg(theArgs[k], n);
        } catch (ArrayIndexOutOfBoundsException e) {
            if (optErr)
                System.err.println("tryArg: no theArgs[" + k + "]");
            return n;
        }
        return value;
    }

    public long processArg(String arg, long n) {
        long value;
        try {
            value = Long.parseLong(arg);
        } catch (NumberFormatException e) {
            if (optErr)
                System.err.println("processArg cannot process " + arg
                    + " as a long");
            return n;
        }
        return value;
    }

    public long tryArg(int k, long n) {
        long value;
        try {
            value = processArg(theArgs[k], n);
        } catch (ArrayIndexOutOfBoundsException e) {
            if (optErr)
                System.err.println("tryArg: no theArgs[" + k + "]");
            return n;
        }
        return value;
    }

    public double processArg(String arg, double d) {
        double value;
        try {
            value = Double.valueOf(arg).doubleValue();
        } catch (NumberFormatException e) {
            if (optErr)
                System.err.println("processArg cannot process " + arg
                    + " as a double");
            return d;
        }
        return value;
    }

    public double tryArg(int k, double d) {
        double value;
        try {
            value = processArg(theArgs[k], d);
        } catch (ArrayIndexOutOfBoundsException e) {
            if (optErr)
                System.err.println("tryArg: no theArgs[" + k + "]");
            return d;
        }
        return value;
    }

    public float processArg(String arg, float f) {
        float value;
        try {
            value = Float.valueOf(arg).floatValue();
        } catch (NumberFormatException e) {
            if (optErr)
                System.err.println("processArg cannot process " + arg
                    + " as a float");
            return f;
        }
        return value;
    }

    public float tryArg(int k, float f) {
        float value;
        try {
            value = processArg(theArgs[k], f);
        } catch (ArrayIndexOutOfBoundsException e) {
            if (optErr)
                System.err.println("tryArg: no theArgs[" + k + "]");
            return f;
        }
        return value;
    }

    public boolean processArg(String arg, boolean b) {
        // 'true' in any case mixture is true; anything else is false
        return Boolean.valueOf(arg).booleanValue();
    }

    public boolean tryArg(int k, boolean b) {
        boolean value;
        try {
            value = processArg(theArgs[k], b);
        } catch (ArrayIndexOutOfBoundsException e) {
            if (optErr)
                System.err.println("tryArg: no theArgs[" + k + "]");
            return b;
        }
        return value;
    }

    public String tryArg(int k, String s) {
        String value;
        try {
            value = theArgs[k];
        } catch (ArrayIndexOutOfBoundsException e) {
            if (optErr)
                System.err.println("tryArg: no theArgs[" + k + "]");
            return s;
        }
        return value;
    }

    private static void writeError(String msg, char ch) {
        System.err.println("GetOpt: " + msg + " -- " + ch);
    }

    public static final int optEOF = -1;

    private int optIndex = 0;
    public int optIndexGet() {return optIndex;}

    private String optArg = null;
    public String optArgGet() {return optArg;}

    private int optPosition = 1;

    public int getopt() {
        optArg = null;
        if (theArgs == null || optString == null) return optEOF;
        if (optIndex < 0 || optIndex >= argCount) return optEOF;
        String thisArg = theArgs[optIndex];
        int argLength = thisArg.length();
        // handle special cases
        if (argLength <= 1 || thisArg.charAt(0) != '-') {
            // e.g., "", "a", "abc", or just "-"
            return optEOF;
        } else if (thisArg.equals("--")) { // end of option args
            optIndex++;
            return optEOF;
        }
        // get next "letter" from option argument
        char ch = thisArg.charAt(optPosition);
        // find this option in optString
        int pos = optString.indexOf(ch);
        if (pos == -1 || ch == ':') {
            if (optErr) writeError("illegal option", ch);
            ch = '?';
        } else { // handle colon, if present
            if (pos < optString.length()-1 && optString.charAt(pos+1) == ':') {
                if (optPosition != argLength-1) {
                    // take rest of current arg as optArg
                    optArg = thisArg.substring(optPosition+1);
                    optPosition = argLength-1; // force advance to next arg below
                } else { // take next arg as optArg
                    optIndex++;
                    if (optIndex < argCount
                            && (theArgs[optIndex].charAt(0) != '-' ||
                            theArgs[optIndex].length() >= 2 &&
                            (optString.indexOf(theArgs[optIndex].charAt(1)) == -1
                            || theArgs[optIndex].charAt(1) == ':'))) {
                        optArg = theArgs[optIndex];
                    } else {
                        if (optErr) writeError("option requires an argument", ch);
                        optArg = null;
                        ch = ':'; // Linux man page for getopt(3) says : not ?
                    }
                }
            }
        }
        // advance to next option argument,
        // which might be in thisArg or next arg
        optPosition++;
        if (optPosition >= argLength) {
            optIndex++;
            optPosition = 1;
        }
        return ch;
    }

    public static void main(String[] args) { // test the class
        GetOpt go = new GetOpt(args, "Uab:f:h:w:");
        go.optErr = true;
        int ch = -1;
        // process options in command line arguments
        boolean usagePrint = false;   // set
        int aflg = 0;                 // default
        boolean bflg = false;         // values
        String filename = "out";      // of
        int width = 80;               // options
        double height = 1;            // here
        while ((ch = go.getopt()) != go.optEOF) {
            if ((char)ch == 'U') usagePrint = true;
            else if ((char)ch == 'a') aflg++;
            else if ((char)ch == 'b')
                bflg = go.processArg(go.optArgGet(), bflg);
            else if ((char)ch == 'f') filename = go.optArgGet();
            else if ((char)ch == 'h')
                height = go.processArg(go.optArgGet(), height);
            else if ((char)ch == 'w')
                width = go.processArg(go.optArgGet(), width);
            else System.exit(1);      // undefined option
        }                             // getopt() returns '?'
        if (usagePrint) {
            System.out.println("Usage: -a -b bool -f file -h height -w width");
            System.exit(0);
        }
        System.out.println("These are all the command line arguments "
            + "before processing with GetOpt:");
        for (int i=0; i<args.length; i++) System.out.print(" " + args[i]);
        System.out.println();
        System.out.println("-U " + usagePrint);
        System.out.println("-a " + aflg);
        System.out.println("-b " + bflg);
        System.out.println("-f " + filename);
        System.out.println("-h " + height);
        System.out.println("-w " + width);
        // process non-option command line arguments
        for (int k = go.optIndexGet(); k < args.length; k++) {
            System.out.println("normal argument " + k + " is " + args[k]);
        }
    }
}

MiamexpressImporter.java Thu May 19 22:50:34 2005

package org.experibase.importer;

import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;
import java.util.logging.FileHandler;
import java.util.logging.Level;
import java.util.logging.Logger;
import oracle.jbo.ApplicationModule;
import oracle.jbo.Key;
import oracle.jbo.Row;
import oracle.jbo.RowIterator;
import oracle.jbo.ViewObject;
import oracle.jbo.client.Configuration;
import oracle.jbo.domain.Number;
import org.experibase.importer.utils.GetOpt;
import org.experibase.miamexpress.ExperimentFactorViewRowImpl;
import org.experibase.miamexpress.views.TardesinViewRowImpl;
import org.experibase.miamexpress.views.TarrayViewRowImpl;
import org.experibase.miamexpress.views.TauthorViewRowImpl;
import org.experibase.miamexpress.views.TctlvcbViewRowImpl;
import org.experibase.miamexpress.views.TexpfctrViewRowImpl;
import org.experibase.miamexpress.views.TexprmntViewRowImpl;
import org.experibase.miamexpress.views.TexprtypViewRowImpl;
import org.experibase.miamexpress.views.TextractViewRowImpl;
import org.experibase.miamexpress.views.ThybridViewRowImpl;
import org.experibase.miamexpress.views.TlabelViewRowImpl;
import org.experibase.miamexpress.views.TntxsynViewRowImpl;
import org.experibase.miamexpress.views.TothersViewRowImpl;
import org.experibase.miamexpress.views.TprotclsViewRowImpl;
import org.experibase.miamexpress.views.TpublicViewRowImpl;
import org.experibase.miamexpress.views.TsubmisViewRowImpl;
import org.experibase.miamexpress.views.TlabhybViewRowImpl;
import org.experibase.miamexpress.views.TsampleViewRowImpl;
import org.experibase.microarrays.SubmissionViewRowImpl;
import oracle.jbo.domain.*;
import org.experibase.microarrays.array.ArrayTableViewRowImpl;
import org.experibase.microarrays.arraydesign.ArrayDesignTableViewRowImpl;
import org.experibase.microarrays.bioassay.PhysicalBioAssayViewRowImpl;
import org.experibase.microarrays.biomaterial.BioSampleViewRowImpl;
import org.experibase.microarrays.biomaterial.LabeledExtractViewRowImpl;
import org.experibase.microarrays.bqs.PersonViewRowImpl;
import org.experibase.microarrays.bqs.PublicationAuthorViewRowImpl;
import org.experibase.microarrays.bqs.PublicationViewRowImpl;
import org.experibase.microarrays.description.DescriptionTableViewRowImpl;
import org.experibase.microarrays.experiment.ExperimentDesignViewRowImpl;
import org.experibase.microarrays.experiment.ExperimentTableViewRowImpl;
import org.experibase.microarrays.experiment.ExperimentTypeAssocViewRowImpl;
import org.experibase.microarrays.experiment.ExperimentTypeViewRowImpl;
import org.experibase.microarrays.experiment.ExperimentalFactorAssocViewRowImpl;
import org.experibase.microarrays.experiment.ExperimentalFactorViewRowImpl;

/**
 * Imports data in MIAMExpress into Experibase
 *
 * In order to import data from MIAMExpress, the MIAMExpress submission Id is
 * needed to find the data to import. In order to maintain the tie to the
 * MIAMExpress submission with an Experibase project, pass the experibase
 * project Id as well.
 * Data mapping was gained from reverse engineering MIAMExpress mage-ml
 * converter tool provided with miamexpress
 */
public class MiamexpressImporter {

    private static Logger logger = Logger.getLogger("org.experibase");

    private int experibaseId;
    private int miamexpressId;
    private ApplicationModule eModule;
    private ApplicationModule mxModule;
    private TsubmisViewRowImpl submisRow;
    private SubmissionViewRowImpl subImpl;
    private boolean createMode = true;

    public MiamexpressImporter(int experibaseId, int miamexpressId) {
        this.experibaseId = experibaseId;
        this.miamexpressId = miamexpressId;
        eModule = Configuration.createRootApplicationModule(
            "org.experibase.microarrays.ExperibaseModule", "ExperibaseModuleLocal");
        mxModule = Configuration.createRootApplicationModule(
            "org.experibase.miamexpress.views.MiamexpressModule", "MiamexpressModuleLocal");
    }

    public void importData() {
        findMXSubmission();
        findExperibaseSubmission();
        importExperimentData();
    }

    private void persist() {
        logger.info("Committing all changes made to the db");
        try {
            eModule.getTransaction().commit();
            logger.info("All changes commited to db");
        } catch (Exception e) {
            logger.log(Level.WARNING, "Exception thrown in committing to database", e);
        }
    }

    private void findMXSubmission() {
        logger.info("finding submission with id " + miamexpressId);
        ViewObject submis = mxModule.findViewObject("TsubmisView1");
        if (submis == null) {
            logger.log(Level.WARNING, "Could not find miamexpress submission");
            throw new RuntimeException("Cant find submision");
        }
        Key key = new Key(new Object[]{new Number(miamexpressId)});
        Row[] rows = submis.findByKey(key, 1);
        if (rows.length == 0) {
            logger.log(Level.WARNING, "Could not find miamexpress submission");
            System.err.println("Error finding miamexpress submission");
            throw new RuntimeException("Can't find miamexpress submission");
        }
        submisRow = (TsubmisViewRowImpl)rows[0];
        logger.info("Found miamexpress submission");
    }

    private void findExperibaseSubmission() {
        ViewObject submis = eModule.findViewObject("ExperimentSubmissionView");
        submis.setWhereClauseParam(0, new Number(experibaseId));
        submis.executeQuery();
        if (submis.hasNext()) {
            logger.info("Found existing row");
            subImpl = (SubmissionViewRowImpl)submis.next();
            if (subImpl.getExperibaseSubId().intValue() != experibaseId) {
                logger.warning("inconsistent state, miamexpress submission added previously with a different submision id. Exiting");
                System.exit(0);
            }
            createMode = false;
            logger.info("In update mode");
        } else {
            ViewObject allSubmissions = eModule.findViewObject("SubmissionView");
            logger.info("Found Submission View");
            createMode = true;
            subImpl = (SubmissionViewRowImpl)allSubmissions.createRow();
            subImpl.setExperibaseSubId(new Number(experibaseId));
            allSubmissions.insertRow(subImpl);
            logger.info("Created new row in submission table");
            eModule.getTransaction().commit();
            logger.info("Commit initial submission");
            logger.info("in create mode");
        }
    }

    /**
     * Imports experiment data into Experibase
     */
    private void importExperimentData() {
        //EXPERIMENTS
        RowIterator texperimentIterator = submisRow.getTexprmntView();
        if (!texperimentIterator.hasNext()) {
            logger.log(Level.WARNING, "Error in MIAMExpress");
            return;
        }
        TexprmntViewRowImpl experiment = (TexprmntViewRowImpl)texperimentIterator.next();
        logger.info("experiment id: " + experiment.getTexprmntExprid());
        if (createMode)
            createExperiment(experiment);
        else
            updateExperiment(experiment);

        //PUBLICATION
        RowIterator tpubs = submisRow.getTpublicView();
        while (tpubs.hasNext()) {
            TpublicViewRowImpl tpub = (TpublicViewRowImpl)tpubs.next();
            if (createMode)
                createPublication(tpub);
            else
                updatePublication(tpub);
        }

        //LABELEDHYBRIDS
        RowIterator labelHybridIterator = submisRow.getTlabhybView();
//        while(labelHybridIterator.hasNext())
//        {
//            TlabhybViewRowImpl labeledHybrid = (TlabhybViewRowImpl)labelHybridIterator.next();
//            if(create) createLabeledHybrid(labeledHybrid);
//        }
        persist();
    }

    private void createPublication(TpublicViewRowImpl tpub) {
        RowIterator pubs = subImpl.getPublicationView();
        PublicationViewRowImpl pub = (PublicationViewRowImpl)pubs.createRow();
        String identifier = "MIAMEXPRESS:PUBLICATION:" + tpub.getTpublicSysuid();
        pub.setIdentifier(identifier);
        pub.setFirstPage(tpub.getTpublicFirstPage().toString());
        pub.setLastPage(tpub.getTpublicLastPage().toString());
        pub.setTitle(tpub.getTpublicTitle());
        pub.setYear(tpub.getTpublicYear().toString());
        pub.setVolume(tpub.getTpublicVolume());
        pub.setStatus(getCV(new Number(tpub.getTpublicStatus().intValue())));
        pub.setJournal(getCV(new Number(tpub.getTpublicPublication().intValue())));
        pubs.insertRow(pub);
        persist();

        //adding authors
        RowIterator tauthors = tpub.getTauthorView1();
        RowIterator assocs = pub.getPublicationAuthorView();
        while (tauthors.hasNext()) {
            TauthorViewRowImpl tauthor = (TauthorViewRowImpl)tauthors.next();
            ViewObject persons = eModule.findViewObject("PersonView1");
            Row person = persons.createRow();
            person.setAttribute("FirstName", tauthor.getTauthorFname());
            person.setAttribute("LastName", tauthor.getTauthorLname());
            person.setAttribute("MidInitials", tauthor.getTauthorInitial());
            persons.insertRow(person);

            PublicationAuthorViewRowImpl assoc = (PublicationAuthorViewRowImpl)assocs.createRow();
            assoc.setAuthorId((Number)person.getAttribute("Id"));
            assoc.setPublicationId(pub.getId().getSequenceNumber());
            assocs.insertRow(assoc);
        }
    }

    private void updatePublication(TpublicViewRowImpl tpub) {
        PublicationViewRowImpl pub = getPublicationForTpub(tpub);
        if (pub == null) {
            createPublication(tpub);
            return;
        }
        String identifier = "MIAMEXPRESS:PUBLICATION:" + tpub.getTpublicSysuid();
        pub.setIdentifier(identifier);
        pub.setFirstPage(tpub.getTpublicFirstPage().toString());
        pub.setLastPage(tpub.getTpublicLastPage().toString());
        pub.setTitle(tpub.getTpublicTitle());
        pub.setYear(tpub.getTpublicYear().toString());
        pub.setVolume(tpub.getTpublicVolume());
        pub.setStatus(getCV(new Number(tpub.getTpublicStatus().intValue())));
        pub.setJournal(getCV(new Number(tpub.getTpublicPublication().intValue())));

        //adding authors
        RowIterator tauthors = tpub.getTauthorView1();
        RowIterator assocs = pub.getPublicationAuthorView();
        while (tauthors.hasNext()) {
            TauthorViewRowImpl tauthor = (TauthorViewRowImpl)tauthors.next();
            ViewObject persons = eModule.findViewObject("PersonView1");
            Row person = persons.createRow();
            person.setAttribute("FirstName", tauthor.getTauthorFname());
            person.setAttribute("LastName", tauthor.getTauthorLname());
            person.setAttribute("MidInitials", tauthor.getTauthorInitial());
            persons.insertRow(person);

            PublicationAuthorViewRowImpl assoc = (PublicationAuthorViewRowImpl)assocs.createRow();
            assoc.setAuthorId((Number)person.getAttribute("Id"));
            assoc.setPublicationId(pub.getId().getSequenceNumber());
            assocs.insertRow(assoc);
        }
    }

    private void createExperiment(TexprmntViewRowImpl row) {
        //Get experiment table in experibase
        RowIterator experimentIter = subImpl.getExperiments();

        //identifier
        String identifier = "MIAMEXPRESS:EXPERIMENT" + row.getTexprmntExprid();

        //add experiment
        ExperimentTableViewRowImpl newExperiment = (ExperimentTableViewRowImpl)experimentIter.createRow();
        newExperiment.setIdentfier(identifier);
        newExperiment.setName(submisRow.getTsubmisSubDescr());
        experimentIter.insertRow(newExperiment);

        //addExperiment design
        ViewObject experimentDesigns = eModule.findViewObject("ExperimentDesignView1");
        ExperimentDesignViewRowImpl experimentDesign = (ExperimentDesignViewRowImpl)experimentDesigns.createRow();
        experimentDesign.setExperimentTableId(newExperiment.getId());
        experimentDesign.setDescription(row.getTexprmntDescr());
        experimentDesign.setHoldDate(submisRow.getTsubmisHoldDate().toString());
        experimentDesigns.insertRow(experimentDesign);

        //Experimental Factors
        RowIterator texpFactorsIter = row.getTexpfctrView();
        RowIterator expFactorsAssocIter = experimentDesign.getExperimentalFactorAssocView();
        while (texpFactorsIter.hasNext()) {
            TexpfctrViewRowImpl texpfct = (TexpfctrViewRowImpl)texpFactorsIter.next();
            ExperimentalFactorAssocViewRowImpl assoc = (ExperimentalFactorAssocViewRowImpl)expFactorsAssocIter.createRow();
            String value = getCV(new Number(texpfct.getTexpfctrId().intValue()));
            ExperimentalFactorViewRowImpl factor = getExperimentFactorForValue(value);
            assoc.setExperimentalFactorId(factor.getId().getSequenceNumber());
            expFactorsAssocIter.insertRow(assoc);
        }

        //Experiment Types
        RowIterator texpTypesIter = row.getTexprtypView();
        RowIterator experimentTypeAssocIter = experimentDesign.getExperimentTypeAssocView();
        while (texpTypesIter.hasNext()) {
            TexprtypViewRowImpl texpType = (TexprtypViewRowImpl)texpTypesIter.next();
            ExperimentTypeAssocViewRowImpl assoc = (ExperimentTypeAssocViewRowImpl)experimentTypeAssocIter.createRow();
            String value = getCV(new Number(texpType.getTexprtypId().intValue()));
            ExperimentTypeViewRowImpl expType = getExperimentTypeForValue(value);
            assoc.setExperimentTypeId(expType.getId().getSequenceNumber());
            experimentTypeAssocIter.insertRow(assoc);
        }

        RowIterator tothersIter = row.getTothersView();
        while (tothersIter.hasNext()) {
            TothersViewRowImpl tothers = (TothersViewRowImpl)tothersIter.next();
            if (tothers.getTothersId().equals("EXPERIMENTTYPE")) {
                ExperimentTypeAssocViewRowImpl assoc = (ExperimentTypeAssocViewRowImpl)experimentTypeAssocIter.createRow();
                ExperimentTypeViewRowImpl expType = getExperimentTypeForValue(tothers.getTothersValue());
                assoc.setExperimentTypeId(expType.getId().getSequenceNumber());
                experimentTypeAssocIter.insertRow(assoc);
            } else if (tothers.getTothersId().equals("EXPERIMENTALFACTOR")) {
                ExperimentalFactorAssocViewRowImpl assoc = (ExperimentalFactorAssocViewRowImpl)expFactorsAssocIter.createRow();
                ExperimentalFactorViewRowImpl factor = getExperimentFactorForValue(tothers.getTothersValue());
                assoc.setExperimentalFactorId(factor.getId().getSequenceNumber());
                expFactorsAssocIter.insertRow(assoc);
            }
        }

        //create Samples
        RowIterator tsamplesIter = row.getTsampleView();
        while (tsamplesIter.hasNext()) {
            TsampleViewRowImpl tsample = (TsampleViewRowImpl)tsamplesIter.next();
            logger.info("Viewing sample id:" + tsample.getTsampleSysuid());
            createSample(newExperiment, tsample);
        }
        persist();
    }

    /**
     * Updates experibase experiment
     * @param texperiment
     */
    private void updateExperiment(TexprmntViewRowImpl texperiment) {
        RowIterator experimentIter = subImpl.getExperiments();
        String identifier = "MIAMEXPRESS:EXPERIMENT" + texperiment.getTexprmntExprid();
        if (!experimentIter.hasNext()) {
            logger.log(Level.WARNING, "ExperibaseExperiment not found, creating experiment");
            createExperiment(texperiment);
            return;
        }

        ExperimentTableViewRowImpl experiment = (ExperimentTableViewRowImpl)experimentIter.next();
        experiment.setIdentfier(identifier);
        experiment.setName(submisRow.getTsubmisSubDescr());

        ExperimentDesignViewRowImpl experimentDesign = (ExperimentDesignViewRowImpl)experiment.getExperimentDesignView();
        experimentDesign.setDescription(texperiment.getTexprmntDescr());
        experimentDesign.setHoldDate(submisRow.getTsubmisHoldDate().toString());

        //ExperimentFactors
        RowIterator experimentalFactorAssocIter = experimentDesign.getExperimentalFactorAssocView();
        while (experimentalFactorAssocIter.hasNext())
            experimentalFactorAssocIter.next().remove();

        RowIterator texpFactorsIter = texperiment.getTexpfctrView();
        while (texpFactorsIter.hasNext()) {
            TexpfctrViewRowImpl texpfct = (TexpfctrViewRowImpl)texpFactorsIter.next();
            ExperimentalFactorAssocViewRowImpl assoc = (ExperimentalFactorAssocViewRowImpl)experimentalFactorAssocIter.createRow();
            String value = getCV(new Number(texpfct.getTexpfctrId().intValue()));
            ExperimentalFactorViewRowImpl factor = getExperimentFactorForValue(value);
            assoc.setExperimentalFactorId(factor.getId().getSequenceNumber());
            experimentalFactorAssocIter.insertRow(assoc);
        }

        //Experiment Types
        RowIterator experimentTypeAssocIter = experimentDesign.getExperimentTypeAssocView();
        while (experimentTypeAssocIter.hasNext())
            experimentTypeAssocIter.next().remove();

        RowIterator texpTypesIter = texperiment.getTexprtypView();
        while (texpTypesIter.hasNext()) {
            TexprtypViewRowImpl texpType = (TexprtypViewRowImpl)texpTypesIter.next();
            ExperimentTypeAssocViewRowImpl assoc = (ExperimentTypeAssocViewRowImpl)experimentTypeAssocIter.createRow();
            String value = getCV(new Number(texpType.getTexprtypId().intValue()));
            ExperimentTypeViewRowImpl expType = getExperimentTypeForValue(value);
            assoc.setExperimentTypeId(expType.getId().getSequenceNumber());
            experimentTypeAssocIter.insertRow(assoc);
        }

        //create Samples
        RowIterator tsamplesIter = texperiment.getTsampleView();
        while (tsamplesIter.hasNext()) {
            TsampleViewRowImpl tsample = (TsampleViewRowImpl)tsamplesIter.next();
            logger.info("Viewing sample id:" + tsample.getTsampleSysuid());
            BioSampleViewRowImpl sample = getSampleForTsample(tsample);
            if (sample != null) {
                updateSample(sample, tsample);
                if (!sample.getExperimentId().equals(experiment.getId()))
                    sample.setExperimentId(experiment.getId());
            } else {
                logger.log(Level.WARNING, "Could not find sample, so creating new one");
                createSample(experiment, tsample);
            }
        }
    }

    private ExperimentTypeViewRowImpl getExperimentTypeForValue(String value) {
        ViewObject experimentTypes = eModule.findViewObject("ExperimentTypeQueryView1");
        experimentTypes.setWhereClauseParam(0, value);
        experimentTypes.executeQuery();
        if (experimentTypes.hasNext()) {
            return (ExperimentTypeViewRowImpl)experimentTypes.next();
        } else {
            experimentTypes = eModule.findViewObject("ExperimentTypeView1");
            ExperimentTypeViewRowImpl experimentType = (ExperimentTypeViewRowImpl)experimentTypes.createRow();
            experimentType.setValue(value);
            experimentType.setSource("MIAMExpress");
            experimentType.setDescription("Experiment Type");
            experimentTypes.insertRow(experimentType);
            persist();
            return experimentType;
        }
    }

    private ExperimentalFactorViewRowImpl getExperimentFactorForValue(String value) {
        ViewObject experimentFactors = eModule.findViewObject("ExperimentFactorQueryView1");
        experimentFactors.setWhereClauseParam(0, value);
        experimentFactors.executeQuery();
        if (experimentFactors.hasNext()) {
            return (ExperimentalFactorViewRowImpl)experimentFactors.next();
        } else {
            experimentFactors = eModule.findViewObject("ExperimentalFactorView1");
            ExperimentalFactorViewRowImpl experimentFactor = (ExperimentalFactorViewRowImpl)experimentFactors.createRow();
            experimentFactor.setValue(value);
            experimentFactor.setSource("MIAMExpress");
            experimentFactor.setDescription("Experiment Type");
            experimentFactors.insertRow(experimentFactor);
            persist();
            return experimentFactor;
        }
    }

    private void createSample(ExperimentTableViewRowImpl exp, TsampleViewRowImpl tsample) {
        RowIterator samples = exp.getBioSampleView();
        BioSampleViewRowImpl sample = (BioSampleViewRowImpl)samples.createRow();
        sample.setIdentifier("MIAMEXPRESS:BIOSAMPLE:" + tsample.getTsampleSysuid());
        if (tsample.getTsampleAdditional() != null)
            sample.setAdditional(tsample.getTsampleAdditional());
        if (tsample.getTsampleAgerangeMax() != null)
            sample.setAgeRangeMax(new Number(tsample.getTsampleAgerangeMax().floatValue()));
        if (tsample.getTsampleAgerangeMin() != null)
            sample.setAgeRangeMin(new Number(tsample.getTsampleAgerangeMin().floatValue()));
        if (tsample.getTsampleAgeStatus() != null)
            sample.setAgeStatus(getCV(new Number(tsample.getTsampleAgeStatus().intValue())));
        if (tsample.getTsampleCellLine() != null)
            sample.setCellLine(tsample.getTsampleCellLine());
        if (tsample.getTsampleCellProvider() != null)
            sample.setCellProvider(tsample.getTsampleCellProvider());
        if (tsample.getTsampleDevStage() != null)
            sample.setDevStage(getCV(new Number(tsample.getTsampleDevStage().intValue())));
        if (tsample.getTsampleDiseaseState() != null)
            sample.setDiseaseState(tsample.getTsampleDiseaseState());
        if (tsample.getTsampleGeneticVariation() != null)
            sample.setGeneticVariation(getCV(new Number(tsample.getTsampleGeneticVariation().intValue())));
        if (tsample.getTsampleIndividual() != null)
            sample.setIndividual(tsample.getTsampleIndividual());
        if (tsample.getTsampleIndividualGen() != null)
            sample.setIndividualGen(tsample.getTsampleIndividualGen());
        if (tsample.getTsampleOrganismPart() != null)
            sample.setOrganismPart(tsample.getTsampleOrganismPart());
        if (tsample.getTsampleSampleType() != null)
            sample.setSampleType(getCV(new Number(tsample.getTsampleSampleType().intValue())));
        if (tsample.getTsampleSeparationTech() != null)
            sample.setSeparationTech(getCV(new Number(tsample.getTsampleSeparationTech().intValue())));
        if (tsample.getTsampleSex() != null)
            sample.setSex(getCV(new Number(tsample.getTsampleSex().intValue())));
        if (tsample.getTsampleTargetCellType() != null)
            sample.setTargetCellType(tsample.getTsampleTargetCellType());
        if (tsample.getTntxsynView() != null) {
            TntxsynViewRowImpl taxonomy = (TntxsynViewRowImpl)tsample.getTntxsynView();
            sample.setTaxonomy(taxonomy.getTntxsynNameTxt());
        }
        if (tsample.getTsampleTimePoint() != null)
            sample.setTimePoint(getCV(new Number(tsample.getTsampleTimePoint().intValue())));
        if (tsample.getTsampleTimeUnit() != null)
            sample.setTimeUnit(getCV(new Number(tsample.getTsampleTimeUnit().intValue())));
        samples.insertRow(sample);
        persist();
    }

    private BioSampleViewRowImpl getSampleForTsample(TsampleViewRowImpl tsample) {
        String identifier = "MIAMEXPRESS:BIOSAMPLE:" + tsample.getTsampleSysuid();
        ViewObject query = eModule.findViewObject("BioSampleQueryView1");
        query.setWhereClauseParam(0, identifier);
        query.executeQuery();
        if (query.hasNext()) {
            BioSampleViewRowImpl sample = (BioSampleViewRowImpl)query.next();
            logger.info("found biosample id: " + sample.getId().getSequenceNumber());
            return sample;
        } else
            return null;
    }

    private PublicationViewRowImpl getPublicationForTpub(TpublicViewRowImpl tpub) {
        String identifier = "MIAMEXPRESS:PUBLICATION:" + tpub.getTpublicSysuid();
        ViewObject query = eModule.findViewObject("PublicationQueryView1");
        query.setWhereClauseParam(0, identifier);
        query.executeQuery();
        if (query.hasNext()) {
            PublicationViewRowImpl pub = (PublicationViewRowImpl)query.next();
            return pub;
        } else
            return null;
    }

    private void updateSample(BioSampleViewRowImpl sample, TsampleViewRowImpl tsample) {
        if (tsample.getTsampleAdditional() != null)
            sample.setAdditional(tsample.getTsampleAdditional());
        if (tsample.getTsampleAgerangeMax() != null)
            sample.setAgeRangeMax(new Number(tsample.getTsampleAgerangeMax().floatValue()));
        if (tsample.getTsampleAgerangeMin() != null)
            sample.setAgeRangeMin(new Number(tsample.getTsampleAgerangeMin().floatValue()));
        if (tsample.getTsampleAgeStatus() != null)
            sample.setAgeStatus(getCV(new Number(tsample.getTsampleAgeStatus().intValue())));
        if (tsample.getTsampleCellLine() != null)
            sample.setCellLine(tsample.getTsampleCellLine());
        if (tsample.getTsampleCellProvider() != null)
            sample.setCellProvider(tsample.getTsampleCellProvider());
        if (tsample.getTsampleDevStage() != null)
            sample.setDevStage(getCV(new Number(tsample.getTsampleDevStage().intValue())));
        if (tsample.getTsampleDiseaseState() != null)
            sample.setDiseaseState(tsample.getTsampleDiseaseState());
        if (tsample.getTsampleGeneticVariation() != null)
            sample.setGeneticVariation(getCV(new Number(tsample.getTsampleGeneticVariation().intValue())));
        if (tsample.getTsampleIndividual() != null)
            sample.setIndividual(tsample.getTsampleIndividual());
        if (tsample.getTsampleIndividualGen() != null)
            sample.setIndividualGen(tsample.getTsampleIndividualGen());
        if (tsample.getTsampleOrganismPart() != null)
            sample.setOrganismPart(tsample.getTsampleOrganismPart());
        if (tsample.getTsampleSampleType() != null)
            sample.setSampleType(getCV(new Number(tsample.getTsampleSampleType().intValue())));
        if (tsample.getTsampleSeparationTech() != null)
            sample.setSeparationTech(getCV(new Number(tsample.getTsampleSeparationTech().intValue())));
        if (tsample.getTsampleSex() != null)
            sample.setSex(getCV(new Number(tsample.getTsampleSex().intValue())));
        if (tsample.getTsampleTargetCellType() != null)
            sample.setTargetCellType(tsample.getTsampleTargetCellType());
        if (tsample.getTntxsynView() != null) {
            TntxsynViewRowImpl taxonomy = (TntxsynViewRowImpl)tsample.getTntxsynView();
            sample.setTaxonomy(taxonomy.getTntxsynNameTxt());
        }
        if (tsample.getTsampleTimePoint() != null)
            sample.setTimePoint(getCV(new Number(tsample.getTsampleTimePoint().intValue())));
        if (tsample.getTsampleTimeUnit() != null)
            sample.setTimeUnit(getCV(new Number(tsample.getTsampleTimeUnit().intValue())));
    }

    private void createLabeledHybrid(TlabhybViewRowImpl row) {
//        ThybridViewRowImpl thybrid = (ThybridViewRowImpl)row.getThybridView();
//        TlabelViewRowImpl tlabel = (TlabelViewRowImpl)row.getTlabelView();
//        TextractViewRowImpl textract = (TextractViewRowImpl)tlabel.getTextractView();
//        TarrayViewRowImpl tarray = (TarrayViewRowImpl)thybrid.getTarrayView();
//
//        //create pba
//        RowIterator pbas = subImpl.getPhysicalBioAssays();
//        PhysicalBioAssayViewRowImpl pba = (PhysicalBioAssayViewRowImpl)pbas.createRow();
//        pba.setIdentifier("MIAMEXPRESS:"+row.getTlabhybSysuid());
//        pbas.insertRow(pba);
//
//        //import array
//        ViewObject experibaseArrays = eModule.findViewObject("ArrayView");
//        if(arrays.hasNext())
//
//        //import label
//        ViewObject labeledExtracts = eModule.findViewObject("LabeledExtract");
//        while(labels.hasNext())
//        {
//            TlabelViewRowImpl label = (TlabelViewRowImpl)labels.next();
//            LabeledExtractViewRowImpl labeledExtract = (LabeledExtractViewRowImpl)labeledExtracts.createRow();
//            labeledExtract.setName(label.getTlabelId());
//            labeledExtracts.insertRow(labeledExtract);
//            experibasePba.setLabeledExtractId(labeledExtract.getId().getSequenceNumber());
//            persist();
//        }
//
//        //import protocols
//        while(protocols.hasNext())
//        {
//            TprotclsViewRowImpl protocol = (TprotclsViewRowImpl)protocols.next();
//        }
//
//        //import scan protocol;
//        if(scanProtocol != null)
//        {
//        }
        persist();
    }

    private void importExperimentDesign() {
//        //create record in database
//        ViewObject experimentDesigns = eModule.findViewObject("ExperimentDesignView1");
//        ExperimentDesignViewRowImpl row = (ExperimentDesignViewRowImpl)experimentDesigns.createRow();
//        row.setExperimentTableId(expImpl.getId());
//
//        //add types
//        RowIterator types = row.getExperimentTypeView();
//        RowIterator ctlvcbs = mx_exp.getExperimentTypeView();
//        while(ctlvcbs.hasNext())
//        {
//            TctlvcbViewRowImpl ctlvcb = (TctlvcbViewRowImpl)ctlvcbs.next();
//            ExperimentTypeViewRowImpl type = (ExperimentTypeViewRowImpl)types.createRow();
//            type.setSource("MIAMEXPRESS");
//            type.setDescription(ctlvcb.getTctlvcbDescr());
//            type.setValue(ctlvcb.getTctlvcbValue());
//            types.insertRow(type);
//        }
//        experimentDesigns.insertRow(row);
    }

    private String getCV(Number number) {
        ViewObject tctlvcbView = mxModule.findViewObject("TctlvcbView1");
        logger.info("Finding ctrlvcb with key " + number);
        Key key = new Key(new Object[]{number});
        Row[] rows = tctlvcbView.findByKey(key, 1);
        logger.info("found " + rows.length);
        if (rows.length == 1) {
            TctlvcbViewRowImpl row = (TctlvcbViewRowImpl)rows[0];
            return row.getTctlvcbValue();
        } else
            return null;
    }

    /**
     * Transfers data from Miamexpress to Experibase. First argument is miamexpress
     * submission Id, second argument is experibase submission Id;
     * @param args First argument is miamexpress submission Id, second argument is
     *             experibase submission Id;
     */
    public static void main(String[] args) {
        try {
            FileHandler fh = new FileHandler("miamexpesslog.txt", 5242880, 1, false);
            logger.addHandler(fh);
        } catch (IOException e) {
            //ignore
        }
        logger.setLevel(Level.WARNING);

        GetOpt go = new GetOpt(args, "vhm:e:");
        go.optErr = true;
        int ch = -1;
        // process options in command line arguments
        boolean usagePrint = false;
        int miamexpressId = -1;
        int experibaseId = -1;
        while ((ch = go.getopt()) != go.optEOF) {
            if ((char)ch == 'h')
                usagePrint = true;
            else if ((char)ch == 'm') {
                miamexpressId = go.processArg(go.optArgGet(), miamexpressId);
                logger.info("MiamexpressId is " + miamexpressId);
            } else if ((char)ch == 'e') {
                experibaseId = go.processArg(go.optArgGet(), experibaseId);
                logger.info("ExperibaeId is " + experibaseId);
            } else if ((char)ch == 'v')
                logger.setLevel(Level.ALL);
            else
                doHelp(1); // undefined option
        }
        if (usagePrint) doHelp(0);
        if (miamexpressId < 0 || experibaseId < 0)
            doHelp(1);
        else {
            MiamexpressImporter importer = new MiamexpressImporter(experibaseId, miamexpressId);
            importer.importData();
            importer.persist();
        }
    }

    /**
     * Stub for providing help on usage
     * You can write a longer help than this, certainly.
     */
    static void doHelp(int returnValue) {
        System.err.println("Usage: MiameExpressImporter -m miamexpressId -e experibaseId");
        System.err.println("\tp\tExperibaseId");
        System.err.println("\th\tPrints this menu");
        logger.info("help shown with return value " + returnValue);
        System.exit(returnValue);
    }
}

NameValuePair.java Wed Mar 09 01:53:30 2005 1

/*
 * Created on Mar 8, 2005
 *
 * TODO To change the template for this generated file go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
package edu.mit.data.affymetrix;

import java.io.Serializable;

/**
 * @author Aidan Downes
 *
 * TODO To change the template for this generated type comment go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
public class NameValuePair implements Serializable {
    /**
     * Comment for <code>serialVersionUID</code>
     */
    private static final long serialVersionUID = 1L;
    private String name;
    private String value;

    /**
     *
     */
    public NameValuePair() {
        super();
    }

    /**
     * @param name
     * @param value
     */
    public NameValuePair(String name, String value) {
        super();
        this.name = name;
        this.value = value;
    }

    /**
     * @return Returns the name.
     */
    public String getName() {
        return name;
    }

    /**
     * @param name The name to set.
     */
    public void setName(String name) {
        this.name = name;
    }

    /**
     * @return Returns the value.
     */
    public String getValue() {
        return value;
    }

    /**
     * @param value The value to set.
     */
    public void setValue(String value) {
        this.value = value;
    }

    /* (non-Javadoc)
     * @see java.lang.Object#toString()
     */
    public String toString() {
        return new StringBuffer().append(name).append("=>").append(value).toString();
    }
}

AffymetrixImporter.java Thu May 12 13:36:46 2005

package org.experibase.importer;

import edu.mit.data.affymetrix.CELData;
import edu.mit.data.affymetrix.CELFileEntryType;
import edu.mit.data.affymetrix.CELHeaderData;
import edu.mit.data.affymetrix.CHPData;
import edu.mit.data.affymetrix.CHPHeaderData;
import edu.mit.data.affymetrix.ExperimentData;
import edu.mit.data.affymetrix.ExpressionProbeSetResults;
import edu.mit.data.affymetrix.NameValuePair;
import java.io.IOException;
import java.io.StringWriter;
import java.util.Iterator;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;
import oracle.jbo.domain.Number;
import org.biomage.Array.ArrayManufacture;
import org.biomage.Array.Array_package;
import org.biomage.ArrayDesign.ArrayDesign_package;
import org.biomage.ArrayDesign.PhysicalArrayDesign;
import org.biomage.AuditAndSecurity.AuditAndSecurity_package;
import org.biomage.AuditAndSecurity.Person;
import org.biomage.BioAssay.BioAssay_package;
import org.biomage.BioAssay.Channel;
import org.biomage.BioAssay.Hybridization;
import org.biomage.BioAssay.ImageAcquisition;
import org.biomage.BioAssay.PhysicalBioAssay;
import org.biomage.BioAssayData.BioAssayData_package;
import org.biomage.BioAssayData.DerivedBioAssayData;
import org.biomage.BioAssayData.Transformation;
import org.biomage.BioEvent.BioEvent_package;
import org.biomage.BioMaterial.BioMaterial_package;
import org.biomage.BioMaterial.BioSource;
import org.biomage.BioMaterial.Compound;
import org.biomage.Common.MAGEJava;
import org.biomage.Common.NameValueType;
import org.biomage.Description.Description;
import org.biomage.Description.OntologyEntry;
import org.biomage.Experiment.Experiment;
import org.biomage.Experiment.Experiment_package;
import org.biomage.Protocol.Hardware;
import org.biomage.Protocol.HardwareApplication;
import org.biomage.Protocol.Parameter;
import org.biomage.Protocol.ParameterValue;
import org.biomage.Protocol.Protocol;
import org.biomage.Protocol.ProtocolApplication;
import org.biomage.Protocol.Protocol_package;
import oracle.jbo.*;
import oracle.jbo.domain.*;
import oracle.jbo.client.Configuration;
import org.experibase.microarrays.SubmissionViewRowImpl;
import org.experibase.microarrays.arraydesign.ArrayDesignTableViewRowImpl;
import org.experibase.microarrays.bioassaydata.AffyCelDataViewRowImpl;
import org.experibase.microarrays.bioassaydata.AffyChpDataViewRowImpl;
import org.experibase.microarrays.bioassaydata.CELAnalysisViewRowImpl;
import org.experibase.microarrays.bioassaydata.CHPAnalysisViewRowImpl;
import org.experibase.microarrays.bioassaydata.DerivedBioAssayDataViewRowImpl;
import org.experibase.microarrays.bioassaydata.MeasuredBioAssayDataViewImpl;
import org.experibase.microarrays.bioassaydata.MeasuredBioAssayDataViewRowImpl;
import org.experibase.microarrays.common.*;
import org.experibase.microarrays.experiment.ExperimentTableViewRowImpl;

public class AffymetrixImporter
{
    private int experibaseId;
    private MAGEJava mage;
    private BioAssay_package bap;
    private BioEvent_package bep;
    private Protocol_package pp;
    private BioMaterial_package bmp;
    private AuditAndSecurity_package aasp;
    private Array_package ap;
    private ArrayDesign_package adp;
    private Experiment_package ep;
    private Experiment exp;
    private BioAssayData_package badp;
    private Protocol protocol;
    private static Logger logger = Logger.getLogger("org.experibase");
    private ApplicationModule eModule;
    private ExperimentTableViewRowImpl expImpl;
    private SubmissionViewRowImpl subImpl;
    private AffyChpDataViewRowImpl dbaImpl;
    private AffyCelDataViewRowImpl mbaImpl;

    /**
     * Creates importer for particular affymetrix submission
     * @param gId The Experibase GroupId
     * @param pId The Experibase ProjectId
     */
    public AffymetrixImporter(int eId)
    {
        this.experibaseId = eId;
        init();
    }

    private void init()
    {
        logger.entering("AffymetrixImporter", "init");
        eModule = Configuration.createRootApplicationModule("org.experibase.microarrays.ExperibaseModule", "ExperibaseModuleLocal");
        logger.info("Found application module");
        findSubmission();
        RowIterator iter = subImpl.getExperiments();
        if(iter.hasNext())
        {
            expImpl = (ExperimentTableViewRowImpl)iter.next();
            logger.info("Found existing experiment");
        }
        else
        {
            expImpl = (ExperimentTableViewRowImpl)iter.createRow();
            iter.insertRow(expImpl);
            logger.info("Created new row in experiment table");
        }
        logger.exiting("AffymetrixImporter", "init");
    }

    private void findSubmission()
    {
        ViewObject submis = eModule.findViewObject("ExperimentSubmissionView");
        submis.setWhereClauseParam(0, new Number(experibaseId));
        submis.executeQuery();
        if(submis.hasNext())
        {
            logger.info("Found existing row");
            subImpl = (SubmissionViewRowImpl)submis.next();