MICROARRAY SUBMISSIONS TO
EXPERIBASE
by
Aidan Rawle Downes
Submitted to the Department of Electrical Engineering and Computer Science
in Partial Fulfillment of the Requirements for the Degrees of
Master of Engineering in Electrical Engineering and Computer Science
at the Massachusetts Institute of Technology
May 19, 2005
Copyright 2005 Aidan R. Downes. All rights reserved.
The author hereby grants to M.I.T. permission to reproduce and
distribute publicly paper and electronic copies of this thesis
and to grant others the right to do so.
Author
Department of Electrical Engineering and Computer Science
May 19, 2005
Certified by
C. Forbes Dewey, Jr.
Thesis Supervisor
Accepted by
Arthur C. Smith
Chairman, Department Committee on Graduate Theses
ABSTRACT
MICROARRAY SUBMISSIONS TO
EXPERIBASE
By Aidan Rawle Downes
Thesis Supervisor:
Professor C. Forbes Dewey, Jr.
Professor of Mechanical Engineering / Bioengineering
Experibase is an experimental database that supports the storage of data from
leading biological experiment techniques. The Experibase ontology was extended to
include a robust representation of microarray data, a leading experimental
technique. The microarray submission system takes advantage of Experibase's
new microarray storage capabilities by allowing biologists to submit microarray
data to Experibase using an application that they are already familiar with. The
transformation of data from the submitted format to a format suitable for
Experibase takes place without the submitter's knowledge, reducing the need for
an Experibase-specific submission application.
TABLE OF CONTENTS
Introduction
    Thesis Outline
Overview of Relevant Technologies
    DNA Microarrays
    Experibase
    The MIAME Standard
    ArrayExpress and MIAMExpress
Design
    Project Motivation
    Project Requirements
    Project Design
MIAMExpress Customizations
    Automatic User Login and/or Registration
    Data Synchronization
Experibase Microarray Schema
    Introduction
    Experibase Common Package
    Study Plan
    Administration Package
    Sample
    Experiment Package
Results and Conclusions
LIST OF FIGURES
Figure 1: Image scan of a hybridized microarray
Figure 2: The five packages of Experibase and their relations
Figure 3: Overview of the components of the system and the relationships between them
Figure 4: MIAMExpress Submission data table and all its related tables
Figure 5: Flow of data as submitted through MIAMExpress
Figure 6: The identifiable interface, which is realized by all Experibase classes
Figure 7: A UML class diagram of the Study Plan package
Figure 8: The Administration package
Figure 9: Sample.PhysicalSample class diagram
Figure 10: Sample.MeasuredSample sub-package
Figure 11: The experiment package
Figure 12: Experiment.Protocol sub-package diagram
ACKNOWLEDGMENTS
The author wishes to acknowledge his supervisor, C. Forbes Dewey, for his
guidance and help throughout the year. I would also like to thank my colleagues,
Howard Chou and Shiva Ayyadurai, and Ronald Taylor and Abigail Corrigan at
PNNL.
I would like to thank my family for being there for me, especially my mother,
father, and sister. They are the most important people in the world, and I know they
think the same of me.
GLOSSARY
Bioinformatics. The use of ideas, algorithms and techniques from applied
mathematics, informatics, statistics, and computer science to solve or aid in the
solving of biological problems.
Microarray. An array of DNA sequences, grouped into spots (also known as
probes) attached to a solid surface such as glass or silicon. Used in experiments to
measure the expression level of genes attached to the array.
MIAME. Minimum Information about Microarray Experiments. A standard that
defines the minimum information that must be reported about microarray
experiments, in order to ensure the interpretability of the experimental results
generated as well as their potential independent verification.
MAGE. Microarray Gene Expression. A group that aims to provide a standard
for the representation of microarray expression data that would facilitate the
exchange of microarray information between different data systems. Responsible
for MAGE-OM, a microarray object model, and MAGE-ML, a microarray data
exchange format.
Ontology. A concise and unambiguous description of principle relevant entities
with their potential, valid relations to each other.
Chapter 1
INTRODUCTION
DNA microarrays provide a simple and natural vehicle for exploring the genome
in a way that is both systematic and comprehensive [1]. Using the concept of
complementary base pairings in DNA and RNA components, DNA microarrays
allow biologists to study various biological phenomena such as gene expression.
Biologists can determine which genes are being expressed in a given sample and
how actively each gene is being expressed. Other advantages of microarrays are
that they are relatively cheap to produce, can be produced quickly, and are
easy to control [1]. It is no surprise that experimental labs that use DNA
microarrays produce a lot of data. This data includes image data from scanners,
results from software analysis packages, and data about the experiment itself,
including the protocols used, the samples used, and the experiment design. In order to
take full advantage of experiment results, the data produced should be stored in a
medium where it is easily accessible, such as an experimental database. This
thesis addresses the need for storage media for microarray data, the input of that
data, and its export.
This thesis consists of the design and implementation of a system for collecting
output data from Microarray experiments, processing that data and storing the
resulting information in an experimental database. This system consists of three
distinct components: a web application for entering microarray experiment data;
another web application for system and user administration; and an experimental
database, Experibase [2].
The data entry web application is a modified version of the MIAMExpress web
application, a data submission tool developed and supported by the European
Bioinformatics Institute (EBI) [3]. The target user base of the system will be
experimental biologists. Many of these biologists are familiar with the
MIAMExpress user interface and work flow. Using an augmented version of
MIAMExpress reduces the time that these biologists have to devote to learning a
new application user interface.
The second component of the system is the user administration web application.
This application controls access to the submission tool using individual and
group security. This component also provides some access to the experimental
database. Users of this application can export data from the experimental
database in XML format to other databases.
The third component of the system is the experimental database. An
experimental database is a database whose schema (ontology) is designed for
storing experimental data. The experimental database of choice is Experibase.
Experibase accommodates the storage and retrieval of data from leading biological
experiment techniques in a single database. The Experibase schema was updated
to make it MIAME (Minimum Information about Microarray Experiments)
compliant. MIAME is a proposed standard that describes the information that
should be stored for microarray experiments.
Thesis Outline
Chapter 2 presents an overview of relevant technologies used throughout this
thesis document. Chapter 3 discusses the system design. Chapter 4 discusses the
customizations made to MIAMExpress. Chapter 5 discusses the Experibase
schema. Chapter 6 reports the results and conclusions.
Chapter 2
OVERVIEW OF RELEVANT TECHNOLOGIES
DNA Microarrays
A DNA microarray consists of an array of DNA sequences, grouped into spots
(also known as probes) and attached to a solid surface such as glass or silicon. The
most common use of microarrays is to measure the mRNAs transcribed by the
different genes found on the microarray. RNA is extracted from sample cells and then
converted to cDNA or cRNA. The resulting cDNA or cRNA is then tagged with
a fluorescent compound. Because of complementary DNA base pairings, a cDNA
or cRNA strand will hybridize with the probe that contains a DNA sequence
complementary to the cDNA or cRNA strand's own sequence.
Spots where cDNA or cRNA sequences have hybridized can be detected
visually by the fluorescent glow emitted by the hybridized sequences.
The fluorescence intensity of each probe is analyzed by software packages. The
intensity of a probe indicates whether the cells in the sample have
recently transcribed, or ceased transcribing, a gene that contains the probed
sequence. The intensity of the fluorescence is proportional to the number of
copies of a particular mRNA that were present and is used to quantify the
expression level of the gene.
Microarrays are usually used to determine which genes in a cell become active or
inactive when the experimental conditions change. Figure 1 shows an
image scan of a microarray. Two different samples were used. One sample was
labeled with a red fluorescent compound and the other with a green fluorescent
compound. The yellow spots indicate genes that were expressed in both samples.
Figure 1: Image scan of a hybridized microarray
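As an illustration of how two-channel intensities such as those in Figure 1 are commonly compared, the following minimal Python sketch computes a log2 ratio for a single spot and labels the spot by colour. The function names, the intensity values, and the threshold of 1.0 are hypothetical and are not taken from the thesis or from any scanner software.

```python
import math

def log2_ratio(red: float, green: float) -> float:
    """Return log2(red / green) for one spot; both intensities must be positive."""
    if red <= 0 or green <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(red / green)

def classify_spot(red: float, green: float, threshold: float = 1.0) -> str:
    """Label a spot by its dominant channel.

    Assumes the spot has already passed a background-intensity check, so a
    ratio near zero means both samples expressed the gene (a yellow spot).
    The threshold is an illustrative assumption.
    """
    ratio = log2_ratio(red, green)
    if ratio > threshold:
        return "red"
    if ratio < -threshold:
        return "green"
    return "yellow"
```

A spot with equal red and green intensities has a log2 ratio of zero and is reported as yellow, matching the description of spots expressed in both samples.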
Experibase
Experibase is an experimental database designed by Professor C. Forbes Dewey's
group at M.I.T. Its data model was designed by forming a composite of the data
storage needs of several leading experimental techniques [2]. As a result, Experibase
can store data from leading experimental techniques. Currently the data model
supports data from the following experimental techniques:
* Gel Electrophoresis
* Flow Cytometry
* Mass Spectrometry
The schema can also be expanded to add data from even more experimental
techniques.
One of the major ideas behind Experibase is that most biological experimental
data models can be partitioned into five distinct packages in such a way that data
entities within a package share similarities in function, data stored, and
relationships with other data entities. The Experibase data model exploits this
similarity by creating, within each package, data entities that contain the data
found in all experimental techniques. Data entities that are specific to a particular
experimental technique inherit from these common data entities.
The five packages of Experibase are: study plan, sample, experiment, high level
analysis, and administration package [2]. Figure 2 shows the five packages and the
relationships between the packages. The individual package descriptions are as
follows:
* Study Plan Package: Data classes in this package describe information
  about experimental projects. This includes hypotheses, references, and
  project reports.
* Sample Package: Data classes in this package describe information about
  experimental samples. This includes ideal biological samples and derived
  samples.
* Experiment Package: Data classes in this package describe information
  about experiments. This includes experiment design, experiment
  protocols, raw data, and preprocessed data.
* High Level Analysis Package: This package contains classes that represent
  advanced analysis of the experiment results. This can include data output
  from analytic and statistics software applications.
* Administration Package: This package contains classes that represent
  contact, audit and security information about an experiment. This
  includes the experimenter, laboratory, institution and permissions.
[Figure 2 diagram: the StudyPlan, Sample, Administration, HighLevelAnalysis, and Experiment packages. Legend: dependency arrows (A to B: changes in A can cause changes in B) and reference arrows.]
Figure 2: The five packages of Experibase and their relations
The MIAME Standard
The Minimum Information about Microarray Experiments (MIAME) standard
describes the information that microarray data sources should contain. The
standard was created with the belief that it is necessary to define the minimum
information that must be reported, in order to ensure the interpretability of the
experimental results generated using microarrays as well as their potential
independent verification [5]. MIAME was created and is maintained by the
Microarray Gene Expression Database group (MGED), a widely supported
group that creates microarray data standards. The MIAME standard consists of
six parts: experimental design, array design, samples, hybridizations,
measurements, and normalization controls [5]. Each part describes the data that
should be represented in any MIAME compliant database.
Experibase was extended to store microarray data. The MIAME standard was
used as a guideline in designing the microarray experiment data model. The
following guidelines were followed in the extension of Experibase:
* Information stored for Experiment Design:
    o The goal or name of the experiment
    o A brief description of the experiment
    o Keywords, for example, time course, cell type comparison, array
      CGH.
    o Experimental factors - the parameters or conditions tested, such
      as time, dose, or genetic variation
    o Experimental design - relationships between samples, treatments,
      extracts, labeling, and arrays
    o Quality control steps taken
    o Links to the publication, any supplemental websites or database
      accession numbers.
* Information stored about the samples used, extract preparation and
  labeling:
    o The origin of each biological sample and its characteristics
    o Manipulation of biological samples and protocols used
    o Experimental factor value for each experimental factor, for each
      sample
    o Technical protocols for preparing the hybridization extract and
      labeling.
* Information stored about Hybridization procedures and parameters:
    o The protocol and conditions used for hybridization, blocking and
      washing.
* Measurement data stored:
    o The raw data; the feature extraction output from the array
      scanner. This includes the intensity for each probe on the array.
    o The normalized and summarized data; this can include the
      averaged normalized log ratios of the intensities.
    o Image scanning hardware and software, and processing
      procedures and parameters.
* Array Design:
    o Array platform, surface and coating specifications.
    o Spotting protocols and product information for commercial array
      designs.
    o Array spot and reporter information. This includes the location of
      each spot.
ArrayExpress and MIAMExpress
ArrayExpress is a public database of microarray gene expression data at the EBI.
It is a generic gene expression database designed to hold data from all microarray
platforms. The ArrayExpress object model is based on MAGE-OM (Microarray
Gene Expression Object Model), an object model for microarray experiments
developed by the MGED [4]. Using MAGE-OM as the object model ensures
that ArrayExpress is MIAME compliant, as MAGE-OM is MIAME compliant.
ArrayExpress accepts data in MAGE-ML (Microarray Gene Expression Markup
Language). MAGE-ML is an XML schema based on MAGE-OM.
MIAMExpress is a data submission web application developed by the EBI. Data
entered into MIAMExpress can be exported in an XML format suitable for
acceptance by ArrayExpress. It is a well designed application with a simple and
very intuitive user interface. As its name suggests, MIAMExpress is MIAME
compliant.
MIAMExpress is a Perl CGI application that stores most of its data in a
MySQL database. Raw and preprocessed experiment data files are not parsed;
instead, these files are stored on the server's file system and their locations are
stored in the database.
Chapter 3
DESIGN
Project Motivation
The project was commissioned by Pacific Northwest National Laboratory (PNNL).
PNNL was one of the initial adopters of Experibase. Experibase was used
internally to store data from experiments occurring at the lab. The microarray
data storage capabilities of the initial version of Experibase were immature. Specifically,
there was no user interface for entering microarray data, no export capability,
and the schema did not support the storage of data from experiment files
generated by microarray scanners. PNNL was committed to using Experibase
because it provides a single database for many kinds of experiments. Therefore PNNL
commissioned Professor C.F. Dewey's group at M.I.T. to add the needed features
to Experibase.
Project Requirements
The project had the following requirements:
1. Build a web application where users can enter experimental data from
   microarray experiments.
2. The web application should be easy to use and grasp intuitively.
3. Experimental data should be stored in an Experibase instance.
4. The web application should interoperate with an existing web application for
   Experibase administration.
5. Data stored in Experibase should be exportable to ArrayExpress for
   journal publication purposes.
Project Design
Figure 3 displays a schematic illustrating the network layout of the system. The
system is a distributed application consisting of three components that can be
deployed to different servers. MIAMExpress was chosen as the data submission
tool because of its maturity. The MIAMExpress development project has been
active for at least three years, with several stable versions available for download.
Also, because the software is developed and maintained by the EBI,
MIAMExpress adheres to the latest microarray standards.
[Figure 3 diagram: the client logs in and requests a microarray experiment through the Experibase administration web application, which checks user credentials and returns existing submissions; the MIAMExpress submission application creates a new submission or opens an existing one and presents the user with the experiment submission form; experiment data is stored in the Experibase instance.]
Figure 3: Overview of the components of the
system and the relationships between them
The web applications communicate with the database using standard database
connections. The web applications communicate with each other over HTTP.
Any state necessary for the completion of an experiment submission is passed to
the MIAMExpress web application over HTTP.
The database provides storage services to both web applications. The
Administration web application depends on the database for storage and access
of user account information, project information and user groups. The
MIAMExpress submission application does not need to depend on the database for storage
(it has its own internal database). However, since the project requires that all data
be stored in Experibase, the MIAMExpress web application was rewritten to store
information in Experibase.
The database schema is an important part of the design as it determines what
information can and cannot be exported to external databases. The designed data
schema is MIAME compliant, and is heavily influenced by the MAGE object
model. The schema was designed to take full advantage of Experibase's
common components whenever possible, to facilitate better interoperability with
data models for other experimental techniques.
Chapter 4
MIAMEXPRESS CUSTOMIZATIONS
Automatic User Login and/or Registration
Both the Experibase administration web application and MIAMExpress have
their own user authentication systems. Consequently, a potential user would have to register
with both applications to use the system, which is undesirable. The solution was
to automatically register the user with the MIAMExpress web application.
The main entrance to the system is through the Experibase Administration web
application. New system users register for a user account, which gains them
access to that application. After the Administration application has successfully
authenticated a user, the user can submit microarray data to Experibase,
using MIAMExpress as the user interface. The first step is to create a new
microarray experiment in the administration web application. When a request is
made to create a new microarray experiment in the administration application, the
user is forwarded to MIAMExpress. HTTP query parameters and values are added
to the forward address so that MIAMExpress can automatically log in the user. If
the user has not been registered with MIAMExpress, MIAMExpress will query
Experibase for the user's credentials and automatically register the user with the
application. The request parameters passed to MIAMExpress are given in Table 1.
Table 1: Query parameters for automatic user login and/or registration
Query Parameter       Description
SELECT_ACTION         Parameter used by MIAMExpress. Used, in conjunction with
                      the SELECTSUBMISSION parameter, to direct the user to any
                      page in the MIAMExpress application.
SELECTSUBMISSION      Parameter used by MIAMExpress. Used, in conjunction with
                      the SELECT_ACTION parameter, to direct the user to any
                      page in the MIAMExpress application.
MXPGVARjloginname     The login name of the user in MIAMExpress, which is
                      also the login name of the user in Experibase.
ExperiB3_studyplan    The Experibase study plan. Needed to update the
                      Experibase experimental database record.
ExperiB3_groupno      The Experibase group number. Also needed to
                      update the Experibase experiment dataset record.
ExperiB3_startDate    The Experibase start date. Also needed to update the
                      Experibase experiment record.
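The forwarding step described in this section can be sketched as follows. This is a minimal Python illustration, not the thesis's Perl or Java code. The parameter names are copied from Table 1 as they appear in the scanned text and may be affected by the scan; the base URL, the function name, and all values are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical MIAMExpress entry point; the real address is deployment specific.
MIAMEXPRESS_URL = "http://miamexpress.example.org/cgi-bin/login.cgi"

def build_forward_url(login_name, study_plan, group_no, start_date,
                      action, submission):
    """Build the address the administration application forwards the user to.

    Parameter names follow Table 1 as printed; the values are supplied by
    the caller and are placeholders in this sketch.
    """
    params = {
        "SELECT_ACTION": action,
        "SELECTSUBMISSION": submission,
        "MXPGVARjloginname": login_name,
        "ExperiB3_studyplan": study_plan,
        "ExperiB3_groupno": group_no,
        "ExperiB3_startDate": start_date,
    }
    # urlencode percent-escapes names and values for use in a query string.
    return MIAMEXPRESS_URL + "?" + urlencode(params)
```

On the receiving side, the query string can be decoded with `parse_qs(urlsplit(url).query)` to recover the login name and the Experibase identifiers.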
Data Synchronization
The data synchronization application is responsible for synchronizing data stored
in MIAMExpress with data stored in Experibase. Experiment submissions in
MIAMExpress have a one-to-one correspondence with microarray experiments
stored in Experibase. Whenever a MIAMExpress submission is created or
updated, the data synchronization application is invoked. For new experiments,
the application creates a new microarray experiment in Experibase and then
copies the data over to Experibase. For experiment updates, the application
determines what data has changed and updates Experibase with the necessary
changes.
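The create-or-update behaviour described above can be sketched in a few lines. This is a minimal Python illustration under stated assumptions: each submission is modelled as a flat dictionary of fields and Experibase as a dictionary keyed by submission id. The function names and the in-memory store are hypothetical and do not reflect the thesis's implementation.

```python
_MISSING = object()  # sentinel so that a stored value of None is still compared

def changed_fields(stored: dict, submitted: dict) -> dict:
    """Return the submitted fields whose values differ from the stored copy.

    Fields that were removed from the submission are not detected here.
    """
    return {key: value for key, value in submitted.items()
            if stored.get(key, _MISSING) != value}

def synchronize(experibase: dict, submission_id, submitted: dict) -> str:
    """Create the experiment if it is new, otherwise apply only the changes."""
    if submission_id not in experibase:
        experibase[submission_id] = dict(submitted)
        return "created"
    delta = changed_fields(experibase[submission_id], submitted)
    experibase[submission_id].update(delta)
    return "updated" if delta else "unchanged"
```

Computing the difference first means an update touches only the fields that actually changed, which is the behaviour the text describes for experiment updates.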
The application is written in Java. Perl wrappers that invoke the application were
also written. These wrappers give the MIAMExpress Perl CGI the ability to
invoke the application. The application consists of four modules: the Experiment
File Parser, the MIAMExpress data object, the Experibase data object, and the
Importer module.
The Experiment File Parser module consists of Java packages and classes
responsible for parsing the experiment data files. Currently this module is capable
of parsing the following data files:
* Affymetrix
    o CHP files: contain probe set analysis results generated by
      Affymetrix software.
    o CEL files: store the results of the intensity calculations on the
      pixel values of the array image.
    o EXP files: contain information entered in the Experiment
      window of the Affymetrix MAS 5 software.
* NimbleGen
    o Raw data files: store the results of the intensity calculations on
      the pixel values of the array image.
    o Design files: store the array design.
    o Pair files: hold the gene expression data.
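One way a parser module might choose a parser for an uploaded file is by file extension. The sketch below is a hypothetical Python illustration, not the thesis's Java code. It covers only the three Affymetrix file types named above, whose names double as their usual extensions; the NimbleGen files are omitted because the text does not give their extensions. The labels and function name are assumptions.

```python
from pathlib import Path

# Extension-to-parser labels for the Affymetrix files listed above.
# A real parser module would map to parser classes instead of strings.
PARSERS = {
    ".chp": "Affymetrix CHP",
    ".cel": "Affymetrix CEL",
    ".exp": "Affymetrix EXP",
}

def select_parser(filename: str) -> str:
    """Return the parser label for a data file, matching the extension case-insensitively."""
    suffix = Path(filename).suffix.lower()
    try:
        return PARSERS[suffix]
    except KeyError:
        raise ValueError(f"no parser registered for {filename!r}") from None
```

Unrecognized files raise an error rather than being silently skipped, so a submission with an unsupported data file is noticed early.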
The MIAMExpress data object module provides object-relational mapping from
the MIAMExpress data tables to Java objects. The mappings were created using
Oracle's Business Components for Java (BC4J). The data object module also
represents all the relationships between data tables that are not declared in the
MIAMExpress database but are clearly visible in the MIAMExpress application
code. Figure 4 shows the data table that stores the MIAMExpress submission
records, and all the tables that are related to it. All of these relationships are
represented in the data object module through simple get and set methods. The
BC4J application development environment essentially takes Java code and
creates the appropriate SQL calls to the database.
The Experibase data object module is similar to the MIAMExpress data object
module in function. It provides an object-relational mapping from Experibase
data tables to Java objects. This module provides other modules with the ability to
query Experibase data tables, update Experibase data tables, and add new data
records. This module was also developed with Oracle's Business Components for
Java (BC4J).
[Figure 4: MIAMExpress Submission data table and all its related tables. The diagram relates the Submission class to the SubmissionSubmitter, SubmissionArrayDesign, SubmissionExperiment, SubmissionLabeledHybrid, SubmissionPooled, and SubmissionPublication associations.]
The Importer module acts as the controller for the data synchronization application.
This module makes use of the other modules in its goal of synchronizing
Experibase with the data stored in MIAMExpress. It contains logic that maps the
data stored in MIAMExpress to data tables in Experibase. It also contains logic
that maps the experiment data files to data tables in Experibase.
Figure 5 displays a schematic that illustrates the flow of data when a user
makes a submission to the MIAMExpress application.
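[Figure 5 diagram: the user submits entered data and data files to MIAMExpress; data files are stored on the local file system and entered data in the MIAMExpress internal database; the Importer transfers both to Experibase.]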
Figure 5: Flow of data as submitted through
MIAMExpress
Chapter 5
EXPERIBASE MICROARRAY SCHEMA
Introduction
An ontology is a concise and unambiguous description of principle relevant
entities with their potential, valid relations to each other [6]. A database schema is
an example of an ontology. Each entity is well defined by a data table and the
relationships between entities are well defined by foreign key constraints.
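To make the link between a schema and an ontology concrete, the following minimal Python sketch defines two entities as tables and one relationship as a foreign key constraint, using an in-memory SQLite database. The table and column names are hypothetical and are not the Experibase schema.

```python
import sqlite3

def build_demo_schema() -> sqlite3.Connection:
    """Create two entities (tables) linked by one relationship (foreign key)."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE submission ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE experiment ("
        " id INTEGER PRIMARY KEY,"
        " submission_id INTEGER NOT NULL REFERENCES submission(id),"
        " description TEXT)"
    )
    return conn
```

With the constraint in place, the database rejects an experiment row that points at a submission that does not exist, so the relationship is enforced by the schema itself.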
Ontologies are becoming increasingly important to the field of bioinformatics.
This importance can be attributed to the realization that making a comparison
between different experiments is only feasible if consistent terminology is used in
describing experimental data. The existence of a standard ontology to describe
experimental data will greatly aid in the sharing and comparing of experimental
results. It will also help biologists take full advantage of the extensive amount of data
produced by biology laboratories. It is no surprise that ontologies are
now being developed for many different areas in experimental biology. MAGE-OM
is an ontology that applies to the area of DNA microarrays, and Gene
Ontology (GO) applies to the area of gene products and their behavior in a
cellular context.
In theory, Experibase, the experimental database, can be viewed as an ontology
for all of experimental biology. In practice, the Experibase ontology is
incomplete. But as more experimental techniques are incorporated into
Experibase, it will approach its ideal state. As mentioned earlier, Experibase
currently contains data representations for Gel Electrophoresis, Flow Cytometry,
and Mass Spectrometry experiments. Experibase was extended in this thesis to
include data representations for microarray experiments.
Experibase Common Package
The Experibase Common package does not contain any concrete data
representations but instead contains only one interface that all classes in
Experibase implement (realize in UML terms). The Identifiable interface ensures
that all records in the database are identifiable under a common naming scheme.
Figure 6 displays the Experibase Common package as a UML Class diagram.
[Figure 6 diagram: the Identifiable interface with attributes Id, LSID, Identifier, DateCreated, and DateModified.]
Figure 6: The identifiable interface, which is
realized by all Experibase Classes
The Id column is a database generated identifier. All instances of classes in the
same inheritance hierarchy have an Id that is unique among the other instances in
the same inheritance hierarchy.
The LSID (Life Science Identifier) column holds the object instance's LSID.
LSIDs are globally unique identifiers across all LSID-aware data sources. They
can be used to quickly retrieve information about the object through the use of an
LSID resolver.
The Name column stores a potentially human-recognizable and possibly
ambiguous identifier for an object instance. The Identifier column stores a
machine-friendly identifier that ties an object to its source. More than one
instance of a class may share an identifier value. However, objects that share the
same identifier should be regarded as equivalent objects.
The DateCreated and DateModified fields are bookkeeping fields that store
the date the record was added and the last time the record was modified.
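The fields of the Identifiable interface can be sketched as a small Python data class. This is an illustration only: the attribute names are adapted from the column names in the text, the types are assumptions, and the `is_equivalent_to` helper is a hypothetical way of expressing the rule that objects sharing an identifier are regarded as equivalent.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Identifiable:
    """Common identity fields carried by every record (types are assumed)."""
    id: int                   # database generated, unique within a hierarchy
    lsid: Optional[str]       # Life Science Identifier, if one was assigned
    name: str                 # human-recognizable, possibly ambiguous
    identifier: str           # machine-friendly, ties the object to its source
    date_created: datetime
    date_modified: datetime

    def is_equivalent_to(self, other: "Identifiable") -> bool:
        """Two objects with the same identifier are treated as equivalent."""
        return self.identifier == other.identifier
```

Two records can therefore have different Id values and names yet still be equivalent, which is distinct from the field-by-field equality that a data class provides by default.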
Study Plan
The Study Plan package consists of classes that provide metadata about an
experiment. The Submission class (also known as the project or study plan class)
stores information related to a new experiment added to the database. The
metadata stored about an experiment in the Study Plan includes:
* The experiment's submitter, stored as a relationship between the
  Submission class and the Person class.
* The experiment's hypothesis, stored as a relationship between the
  Submission class and the Hypothesis class.
* For experiments stored in public databases, the database entry, stored as a
  relationship between the Submission class and the DatabaseEntry class.
* For experiments whose findings are published in a scientific journal, the
  publication information, stored as a relationship between the Submission
  class and the Publication class.
* The information that biologists associate most with an experiment, stored
  in the Experiment class. A relationship between the Submission class and
  the Experiment class ties the experiment's details to the rest of the
  experiment's metadata.
The MicroarraySubmission class extends the Submission class to add
information specific to microarray experiments. A MicroarraySubmission
has a relationship to the array designs used in the microarray experiment.
Figure 7 displays the Study Plan package.
Figure 7: A UML class diagram of the Study Plan package (classes shown:
Submission, MicroarraySubmission, DatabaseEntry, Publication, Experiment,
Person, ArrayDesign)
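The inheritance relationship between Submission and MicroarraySubmission can be sketched in Java as shown here. This is only an illustration under stated assumptions: the field names, the use of plain strings in place of the Person, Hypothesis, and ArrayDesign classes, and the example design name are all hypothetical and are not taken from the Experibase schema.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; strings stand in for related classes such as Person and Hypothesis.
class Submission {
    private final String submitter;  // stands in for the relationship to Person
    private String hypothesis;       // stands in for the relationship to Hypothesis
    private boolean complete;        // assumed completion flag for the submission

    Submission(String submitter) {
        this.submitter = submitter;
    }

    String getSubmitter() { return submitter; }
    String getHypothesis() { return hypothesis; }
    void setHypothesis(String hypothesis) { this.hypothesis = hypothesis; }
    boolean isComplete() { return complete; }
    void markComplete() { this.complete = true; }
}

// Adds the microarray-specific relationship to the array designs used in the experiment.
class MicroarraySubmission extends Submission {
    private final List<String> arrayDesigns = new ArrayList<>();

    MicroarraySubmission(String submitter) {
        super(submitter);
    }

    void addArrayDesign(String designName) {
        arrayDesigns.add(designName);
    }

    List<String> getArrayDesigns() {
        return Collections.unmodifiableList(arrayDesigns);
    }
}
```

Because MicroarraySubmission is a Submission, code that handles generic study plan metadata (submitter, hypothesis, publication) would not need to know about array designs at all.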
Administration Package
The administration package contains data objects that represent information
about the people and organizations related to the experiment. The people
possibly related to an experiment include the experiment submitter and the
publication authors. Possible organizations related to the experiment include
laboratories, universities, research companies, software vendors, and hardware
manufacturers.
The administration package also contains security information about the
experiment. Role-based security is used to determine the capabilities of a person
registered with the system. Individuals can also possess user accounts for access
to the database through a database client application.
Figure 8 displays the classes of the administration package.
Figure 8: The Administration package (classes shown include Contact, Person,
Organization, and Role)
Sample
The sample package contains classes that represent biological samples (physical
samples), measured samples, and derived samples [2]. It also includes classes that
represent treatments applied to the samples for the purpose of the experiment.
The package is divided into three sub-packages: PhysicalSample, MeasuredSample,
and DerivedSample.
The PhysicalSample sub-package represents ideal biological samples and their
treatment. Ideal biological samples are parts of an organism, such as cells
or tissue. Figure 9 displays the class diagram for the PhysicalSample sub-package.
It includes seven classes that are specific to microarray experiments:
* Extract: Represents the extraction of RNA or a similar product from
  sample cells. See Chapter 2 for information about RNA extraction.
* ExtractProtocol: Stores a description of the protocol used to extract
  RNA from the sample cells. Information stored includes the type of
  nucleic acid or molecules extracted and the amplification agent, if
  amplification was used.
* LabeledExtract: Represents an extract solution tagged with a fluorescent
  label.
* LabelingProtocol: Stores a description of the protocol used to label the
  extract with the fluorescent compound. Includes the amount of label
  used and the name of the fluorescent compound.
* Array: Represents a physical microarray as described in Chapter 2. Every
  array complies with an array design that dictates the position of each
  probe set and the DNA sequences at each probe set.
* PhysicalBioAssay: Represents the product of hybridization between an
  array and a labeled extract solution.
* HybridizationProtocol: Stores a description of the creation of a
  PhysicalBioAssay. Information stored includes the amount of labeled
  extract solution used, the duration of the hybridization, the temperature
  during hybridization, and the total volume of the hybridization.
The remaining classes in the sub-package are common to most biological
experiments. The BioSample class describes a sample from an organism. The
BioSequence class represents biological sequences such as DNA. Figure 9 displays
the class diagram.
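The chain from a biological sample to a hybridized array can be pictured with a small Java sketch. The class names follow the PhysicalSample classes named in the text, but the constructors, fields, and the originatingSample helper are assumptions made for illustration only, and the protocol classes are left out for brevity.

```java
// Hypothetical sketch of the PhysicalSample chain; fields and constructors are assumed.
final class BioSample {
    final String name;
    BioSample(String name) { this.name = name; }
}

// Product of extracting RNA (or a similar product) from sample cells.
final class Extract {
    final BioSample source;
    final String extractedProduct;
    Extract(BioSample source, String extractedProduct) {
        this.source = source;
        this.extractedProduct = extractedProduct;
    }
}

// An extract solution tagged with a fluorescent label.
final class LabeledExtract {
    final Extract extract;
    final String label;
    LabeledExtract(Extract extract, String label) {
        this.extract = extract;
        this.label = label;
    }
}

// A physical microarray that complies with a named array design.
final class Array {
    final String arrayDesign;
    final String batchNo;
    Array(String arrayDesign, String batchNo) {
        this.arrayDesign = arrayDesign;
        this.batchNo = batchNo;
    }
}

// Product of hybridization between an array and a labeled extract solution.
final class PhysicalBioAssay {
    final Array array;
    final LabeledExtract labeledExtract;
    PhysicalBioAssay(Array array, LabeledExtract labeledExtract) {
        this.array = array;
        this.labeledExtract = labeledExtract;
    }

    // Walks the chain back to the biological sample the assay originated from.
    BioSample originatingSample() {
        return labeledExtract.extract.source;
    }
}
```

Keeping each step as its own object means a single sample can feed several extracts, and a single labeled extract can be hybridized to more than one array, without duplicating the upstream data.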
The MeasuredSample sub-package represents information that has been extracted
from physical samples and that can be used again as input to the experiment. It is
diagrammed in Figure 10. The MeasuredBioAssay class represents the product of
feature extraction performed on a PhysicalBioAssay instance. The scanning
protocol relationship between Protocol and MeasuredBioAssay associates a
MeasuredBioAssay with a description of the protocol used in feature extraction,
including the hardware and software used. MeasuredBioAssayData represents the
information obtained from feature extraction.
Figure 9: Sample.PhysicalSample class diagram (classes shown include BioSample,
BioSequence, CellLine, Array, Extract, LabeledExtract, PhysicalBioAssay,
Protocol, ExtractProtocol, SampleGrowthProtocol, LabelingProtocol, and
HybridizationProtocol)
Figure 10: Sample.MeasuredSample sub-package
The DerivedSample sub-package contains information about the transformation
of MeasuredBioAssay and/or PhysicalBioAssay instances. The transformation
could be the normalization of a MeasuredBioAssay instance or some other
manipulation of the available data. The DerivedBioAssay class represents the
product of a transformation performed on PhysicalBioAssay and/or
MeasuredBioAssay instances.
Experiment Package
The experiment package contains details about an experiment, including the
protocols used and the experiment's description, type, and experimental factors.
Figure 11 displays the experiment package. None of its classes is specific to
microarrays, but all of them are used in the submission of a microarray
experiment. The Experiment class represents a microarray experiment. It has an
ExperimentDesign instance associated with it. An experiment design describes
the experiment and records the experiment's type and its experimental factors,
such as sample age.
Figure 11: The experiment package
The experiment package has several sub-packages. One of these packages is the
Experiment.Protocol package. This package contains the definition of the
Protocol class and its relationships with other entities in the sub-package. Figure
12 displays a diagram containing the package components.
Figure 12: Experiment.Protocol sub-package diagram
The RawData sub-package contains the class MeasuredBioAssayData and its
subclasses. Subclasses include classes for storing data from Affymetrix
commercial arrays (CEL file) and NimbleGen commercial arrays (RawData file).
The PreprocessedData sub-package contains the class DerivedBioAssayData and
its subclasses. Subclasses include classes for storing data from Affymetrix CHP
files and NimbleGen Pair files.
Chapter 6
RESULTS AND CONCLUSIONS
The system described in the previous chapters is currently being deployed at
Pacific Northwest National Labs. Great interest has been shown in the
application. Once the deployment and testing phase is over, it is quite possible
that the application will be deployed to other experimental laboratories. The
project has met all of its requirements except data export, which is only
partially functional because of time constraints.
The system is deployed on two servers. MIAMExpress is deployed on a Linux
server, while the administration application is hosted on a Windows 2003 server
running Apache Tomcat. Experibase is hosted in an Oracle instance on the
Windows 2003 server.
The project requirements created a system design challenge by requiring that two
different, existing, and unknown systems be able to interoperate with each other.
A database design challenge was created by the need to extend Experibase.
Both challenges were solved using knowledge from the computer science and
computer system design fields. The system design challenge was solved by
modifying the applications so that they communicate with each other, making one
application the controller and the other the worker. The database design
challenge was solved by using UML modeling to represent the data produced by
microarray experiments. The end result is a system that makes biologists' lives
easier while promoting a new, all-inclusive approach to data storage that is
attractive because of its compactness.
BIBLIOGRAPHY
[1] Patrick O. Brown; David Botstein. Exploring the new world of the genome
with DNA microarrays. Nature Genetics Supplement. Volume 21. Pg 33-37.
January 1999.
[2] C.F. Dewey Jr; Aidan Downes; Howard Chou; Shixin Zhang. A Unique
Opportunity in Biological Information Standards. W3C Workshop on Semantic Web
for Life Sciences. Cambridge, MA. October 2004.
[3] Alvis Brazma et al. ArrayExpress - a public repository for microarray gene
expression data at the EBI. Nucleic Acids Research. Volume 31, No. 1. Pg 68-71.
2003.
[4] Ugis Sarkans et al. The ArrayExpress gene expression database: a software
engineering and implementation perspective. Bioinformatics. Volume 21, No. 8.
Pg 1495-1501. 2005.
[5] Alvis Brazma et al. Minimum information about a microarray experiment
(MIAME) - toward standards for microarray data. Nature Genetics. Volume 29.
Pg 365-371. December 2001.
[6] Steffen Schulze-Kremer. Ontologies for molecular biology and bioinformatics.
In Silico Biology. Volume 2. Pg 0017. 2002.
APPENDIX
1 MIAMExpress Mappings (HTML form fields to Experibase Database Location)

1.1 ExperimentDesign
Form Field                       Experibase Location
Experiment name                  ExperimentTable.Name
Experiment design type           ControlVcb.Value[type="ExperimentType"]
Experimental factors             ControlVcb.Value[type="ExperimentalFactor"]
Experiment description           ExperimentDesign.Description
Public Release Date              ExperimentDesign.HoldDate

1.2 Publication
Form Field                       Experibase Location
Publication Status               Publication.Status
Journal                          Publication.Journal
Title                            Publication.Title
Year                             Publication.Year
Volume                           Publication.Volume
First Page                       Publication.FirstPage
Last Page                        Publication.LastPage

1.3 Author
Form Field                       Experibase Location
First name                       Person.Name
Initial                          Person.MiddleInitial
Last name                        Person.LastName

1.4 Sample
Form Field                       Experibase Location
Sample name                      Sample.Name
Organism                         Sample.Taxonomy
Gender                           Sample.Sex
Provider                         Sample.CellProvider
Sample Type                      Sample.SampleType
Development stage                Sample.DevStage
Age                              Sample.AgeStatus
Age Min                          Sample.AgeRangeMin
Age Max                          Sample.AgeRangeMax
Initial Time Point               Sample.TimePoint
Unit                             Sample.TimeUnit
Organism part                    Sample.OrganismPart
Gene modification                Sample.GeneticVariation
Individual Identifier            Sample.Individual
Individual Genetic trait or
genotype                         Sample.IndividualGen
Disease State                    Sample.DiseaseState
Cell type or Target Cell Type    Sample.TargetCellType
Additional Clinical Information  Sample.Additional
Separation Technique             Sample.SeperationTech

1.5 Extract
Form Field                       Experibase Location
Extract Name                     Extract.Name
Protocol                         Protocol via Extract.ProtocolId
Pooling Protocol                 Protocol via Extract.PoolProtocolId

1.6 Labeled Extract
Form Field                       Experibase Location
Label Extract Name               LabeledExtract.Name
Protocol                         Protocol via LabeledExtract.ProtocolId

1.7 Labeled Hybrid
Form Field                       Experibase Location
Hybrid                           NA
Array Design Name                ArrayDesign.Name
ArrayBatch                       Array.batch
Serial No                        Array.serialno
LabelExtractName                 LabeledExtract.Name via
                                 PhysicalBioAssay.LabelExtractId

1.8 Hybrid
Form Field                       Experibase Location
Hybridization Name               NA
Raw Data file                    BioAssayData (locations depend on file type)
Normalized data file             BioAssayData (locations depend on file type)
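A mapping of this kind can be held in a simple lookup table. The sketch below is only an illustration of that idea and is not the code used by the submission system: the FormFieldMapping class and its methods are hypothetical, and only the Publication rows of section 1.2 are loaded.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical lookup table holding the Publication mappings of section 1.2.
final class FormFieldMapping {
    private static final Map<String, String> PUBLICATION = new LinkedHashMap<>();

    static {
        PUBLICATION.put("Publication Status", "Publication.Status");
        PUBLICATION.put("Journal", "Publication.Journal");
        PUBLICATION.put("Title", "Publication.Title");
        PUBLICATION.put("Year", "Publication.Year");
        PUBLICATION.put("Volume", "Publication.Volume");
        PUBLICATION.put("First Page", "Publication.FirstPage");
        PUBLICATION.put("Last Page", "Publication.LastPage");
    }

    private FormFieldMapping() { }

    // Returns the Experibase location for a form field, or null when the field is not mapped.
    static String locationFor(String formField) {
        return PUBLICATION.get(formField);
    }

    // Number of form fields currently mapped.
    static int size() {
        return PUBLICATION.size();
    }
}
```

A table-driven approach like this keeps the correspondence between form fields and database locations in one place, so a change to either side touches a single entry.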
BinaryFileIO.java    Tue May 03 21:31:24 2005

/*
 * Created on Mar 8, 2005
 *
 * TODO To change the template for this generated file go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
package edu.mit.parsers;

import java.nio.ByteBuffer;

/**
 * @author Aidan Downes
 *
 * TODO To change the template for this generated type comment go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
public class BinaryFileIO {

    public static int readInt(ByteBuffer buffer) {
        return buffer.getInt();
    }

    public static long readLong(ByteBuffer buffer) {
        return buffer.getLong();
    }

    public static short readShort(ByteBuffer buffer) {
        return buffer.getShort();
    }

    // Reads a single byte and widens it to a char.
    public static char readChar(ByteBuffer buffer) {
        return (char) buffer.get();
    }

    public static float readFloat(ByteBuffer buffer) {
        return buffer.getFloat();
    }

    // The masks below discard the sign extension so the value is read as unsigned.
    public static short readUnsignedChar(ByteBuffer bb) {
        return ((short) (bb.get() & 0xff));
    }

    public static int readUnsignedShort(ByteBuffer bb) {
        return (bb.getShort() & 0xffff);
    }

    public static long readUnsignedInt(ByteBuffer bb) {
        return ((long) bb.getInt() & 0xffffffffL);
    }

    // Reads a string stored as a 4-byte length followed by that many bytes.
    public static String readString(ByteBuffer buffer) {
        int len = readInt(buffer);
        byte[] bytes = new byte[len];
        buffer.get(bytes);
        return new String(bytes);
    }

    // Reads a fixed-width field and returns the text before the first zero byte.
    public static String readFixedString(ByteBuffer buffer, int len) {
        byte[] bytes = new byte[len];
        buffer.get(bytes);
        int i = 0;
        for (i = 0; i < bytes.length; i++) {
            if (bytes[i] == 0)
                break;
        }
        return new String(bytes, 0, i);
    }
}
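The masking used by the unsigned readers in BinaryFileIO can be seen in a small self-contained example. This sketch is not part of the thesis code: the UnsignedReadDemo class is hypothetical, and it fixes the byte order to little-endian purely so that the example behaves the same on every machine.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical demonstration of unsigned reads and length-prefixed strings.
final class UnsignedReadDemo {

    private UnsignedReadDemo() { }

    // Builds a buffer holding the 16-bit value 0xFFFF followed by the string "CHP"
    // stored as a 4-byte length and then its bytes.
    static ByteBuffer sample() {
        byte[] text = "CHP".getBytes(StandardCharsets.US_ASCII);
        ByteBuffer buffer = ByteBuffer.allocate(2 + 4 + text.length);
        buffer.order(ByteOrder.LITTLE_ENDIAN);
        buffer.putShort((short) 0xFFFF);
        buffer.putInt(text.length);
        buffer.put(text);
        buffer.flip();
        return buffer;
    }

    // Java shorts are signed, so masking with 0xffff recovers the unsigned value.
    static int readUnsignedShort(ByteBuffer bb) {
        return bb.getShort() & 0xffff;
    }

    // Reads a 4-byte length and then that many bytes as text.
    static String readString(ByteBuffer bb) {
        int len = bb.getInt();
        byte[] bytes = new byte[len];
        bb.get(bytes);
        return new String(bytes, StandardCharsets.US_ASCII);
    }
}
```

Without the mask, the same two bytes would be returned by getShort as -1 rather than 65535, which is why values such as row and column counts are read through the unsigned helpers.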
AffymetrixImporter.java    Thu May 12 13:36:46 2005

        else {
            ViewObject allSubmissions = eModule.findViewObject("SubmissionView");
            logger.info("Found Submission View");
            subImpl = (SubmissionViewRowImpl)allSubmissions.createRow();
            subImpl.setExperibaseSubId(new Number(experibaseId));
            allSubmissions.insertRow(subImpl);
            logger.info("Created new row in submission table");
            eModule.getTransaction().commit();
            logger.info("Commit initial submission");
        }
    }
    }

    public void importData(CHPData data) {
        importCHPHeader(data.getHeader());
        importExpressionProbeSetResults(data.getExpressionProbeSetResults());
    }

    private void importExpressionProbeSetResults(java.util.List probeSets) {
        RowIterator iter = dbaImpl.getCHPAnalysisView();
        logger.info("reading probe sets with size " + probeSets.size());
        for(Iterator probeSetIter = probeSets.iterator(); probeSetIter.hasNext();) {
            ExpressionProbeSetResults probeSet = (ExpressionProbeSetResults)probeSetIter.next();
            CHPAnalysisViewRowImpl chpImpl = (CHPAnalysisViewRowImpl)iter.createRow();
            chpImpl.setDetection(new Integer(probeSet.getDetection()));
            chpImpl.setDetectionPValue(new Float(probeSet.getDetectionPValue()));
            chpImpl.setNoOfPairs(new Integer(probeSet.getNoOfPairs()));
            chpImpl.setNoOfUsedPairs(new Integer(probeSet.getNoOfUsedPairs()));
            chpImpl.setSignal(new Float(probeSet.getSignal()));
            chpImpl.setSubId(new Number(subImpl.getId().getValue()));
            iter.insertRow(chpImpl);
        }
        logger.info("finished reading "+ probeSets.size() + " probe sets");
    }

    private void importCHPHeader(CHPHeaderData header) {
        RowIterator iterDBA = subImpl.getAffyChpDataView();
        dbaImpl = (AffyChpDataViewRowImpl)iterDBA.createRow();
        dbaImpl.setCols(new Number(header.getCols()));
        dbaImpl.setRows(new Number(header.getRows()));
        dbaImpl.setNumberOfProbeSets(new Number(header.getNoOfProbeSets()));
        for(Iterator iter = header.getSummaryParameters().iterator(); iter.hasNext();) {
            NameValuePair nvp = (NameValuePair)iter.next();
            logger.info(nvp.getName() +": "+nvp.getValue());
            //TODO ADD
        }
        if(dataExists(header.getChipType())) {
            dbaImpl.setProbeArrayType(header.getChipType());
        }
        if (dataExists(header.getAlgorithmName())) {
            dbaImpl.setAlgorithmName(header.getAlgorithmName());
        }
        if (dataExists(header.getAlgorithmVersion())) {
            dbaImpl.setAlgorithmVersion(header.getAlgorithmVersion());
        }
        for(Iterator iter2 = header.getAlgorithmParameters().iterator(); iter2.hasNext();) {
            NameValuePair nvp = (NameValuePair)iter2.next();
            logger.info(nvp.getName() +": "+nvp.getValue());
            //TODO Add
        }
        iterDBA.insertRow(dbaImpl);
    }

    public void importData(ExperimentData data) {
        if(expImpl.getName() == null)
            expImpl.setName(data.getName());
        //PhysicalArrayDesign
        if (dataExists(data.getChipType())) {
            logger.info("Entering array design information");
            RowIterator iter = subImpl.getArrayDesigns();
            ArrayDesignTableViewRowImpl row = (ArrayDesignTableViewRowImpl)iter.createRow();
            row.setName(data.getChipType());
            iter.insertRow(row);
            logger.info("inserted array design info to cache");
        }
        //ArrayManufacture
        if(dataExists(data.getChipLot())) { }
        //Person
        Person person = null;
        if(dataExists(data.getOperator())) { }
        //Biosource
        if(dataExists(data.getSampleType()) ||
           dataExists(data.getComments()) ||
           dataExists(data.getDescription()) ||
           dataExists(data.getProject())) { }
        //Protocol
        if(dataExists(data.getProtocol())) { }
        if (dataExists(data.getFilter()) ||
            dataExists(data.getStation()) ||
            dataExists(data.getPixelSize()) ||
            dataExists(data.getScannerId()) ||
            dataExists(data.getNumberOfScans()) ||
            dataExists(data.getScannerType())) { }
        //BioAssay
        //compound
        if(dataExists(data.getSolutionType())) { }
        //Channel
        if(dataExists(data.getFilter())) { }
        if(dataExists(data.getStation())) { }
        if(dataExists(data.getPixelSize()) ||
           dataExists(data.getScannerId()) ||
           dataExists(data.getNumberOfScans()) ||
           dataExists(data.getScannerType())) { }
        //Create Parameters
        if(dataExists(data.getPixelSize())) { }
        if(dataExists(data.getScannerId())) { }
        if(dataExists(data.getNumberOfScans())) { }
        if(dataExists(data.getScannerType())) { }
    }

    public void importData(CELData data) {
        importCELHeader(data.getHeaderData());
        importCELEntries(data.getEntries());
    }

    private void importCELHeader(CELHeaderData data) {
        RowIterator iter = subImpl.getAffyCelDataView();
        mbaImpl = (AffyCelDataViewRowImpl)iter.createRow();
        iter.insertRow(mbaImpl);
    }

    private void importCELEntries(List entries) {
        RowIterator iter = mbaImpl.getCELAnalysisView();
        logger.info("Adding entry information for "+entries.size()+" entries");
        int count = 0;
        for(Iterator entriesIter = entries.iterator(); entriesIter.hasNext();) {
            CELFileEntryType entry = (CELFileEntryType)entriesIter.next();
            CELAnalysisViewRowImpl row = (CELAnalysisViewRowImpl)iter.createRow();
            row.setSubId(subImpl.getId().getSequenceNumber());
            row.setIntensity(new Float(entry.getIntensity()));
            row.setMask(new Boolean(entry.getMask()));
            row.setOutlier(new Boolean(entry.getOutlier()));
            row.setPixels(new Integer(entry.getPixels()));
            row.setStdDev(new Float(entry.getStdv()));
            row.setX(new Integer(entry.getX()));
            row.setY(new Integer(entry.getY()));
            iter.insertRow(row);
            // commit in batches of 1000 entries
            if(++count % 1000 == 0) {
                persist();
                logger.info(count + " entries have been added to the database");
            }
        }
        logger.info("finished adding entries to database");
    }

    boolean dataExists(String str) {
        return str != null && !str.trim().equals("");
    }

    public void persist() {
        logger.info("Committing all changes made to the db");
        try {
            subImpl.setCompleted(new Boolean(true));
            eModule.getTransaction().commit();
            logger.info("All changes commited to db");
        }
        catch (Exception e) {
            logger.log(Level.WARNING, "Exception thrown in committing to database", e);
        }
    }
}
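The importer commits CEL entries in groups of 1000 rather than one row at a time or all at once. The general pattern can be sketched independently of the database layer as follows. This is an illustrative stand-in, not the importer's code: the BatchCommitter class is hypothetical and simply counts rows where the real code would commit a transaction.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of committing rows in fixed-size batches.
final class BatchCommitter {
    private final int batchSize;
    private final List<String> pending = new ArrayList<>();
    private int commits = 0;
    private int committedRows = 0;

    BatchCommitter(int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
    }

    // Queues a row and commits automatically once a full batch has accumulated.
    void add(String row) {
        pending.add(row);
        if (pending.size() == batchSize) {
            commit();
        }
    }

    // Flushes whatever is pending; a real importer would commit a transaction here.
    void commit() {
        if (pending.isEmpty()) {
            return;
        }
        committedRows += pending.size();
        pending.clear();
        commits++;
    }

    int getCommits() { return commits; }
    int getCommittedRows() { return committedRows; }
    int getPendingRows() { return pending.size(); }
}
```

With a batch size of 1000, adding 2500 rows triggers two automatic commits and leaves 500 rows pending, so a final explicit commit is still needed once the input is exhausted.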
CHPFileParser.java    Tue Apr 19 02:12:54 2005

/*
 * Created on Mar 8, 2005
 *
 * TODO To change the template for this generated file go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
package edu.mit.parsers.affymetrix;

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

import edu.mit.data.affymetrix.CHPData;
import edu.mit.data.affymetrix.CHPHeaderData;
import edu.mit.data.affymetrix.DataConstants;
import edu.mit.data.affymetrix.ExpressionProbeSetResults;
import edu.mit.data.affymetrix.GenotypeProbeSetResults;
import edu.mit.data.affymetrix.NameValuePair;
import edu.mit.parsers.BinaryFileIO;
import edu.mit.parsers.FileParser;

/**
 * @author Aidan Downes
 *
 * TODO To change the template for this generated type comment go to Window -
 * Preferences - Java - Code Style - Code Templates
 */
public class CHPFileParser extends FileParser {

    private CHPHeaderData header;
    private List genotypeResults;
    private List expressionResults;

    public static final char DELIM_CHAR = 0x14;
    public static final int MIN_CELL_STR = 4;
    public static final int CHP_FILE_MAGIC_NUMBER = 65;
    public static final int CHP_FILE_VERSION_NUMBER = 2;
    public static final int EXPRESSION_ABSOLUTE_STAT_ANALYSIS = 2;
    public static final int EXPRESSION_COMPARISON_STAT_ANALYSIS = 3;
    public static final String APP_NAME = "GeneChip Sequence File";

    private void reset() {
        header = new CHPHeaderData();
        header.setAlgorithmParameters(new ArrayList());
        header.setSummaryParameters(new ArrayList());
        genotypeResults = new ArrayList();
        expressionResults = new ArrayList();
    }

    public void read(String filePath) throws IOException {
        ByteBuffer buffer = getBuffer(filePath);
        buffer.position(0);
        buffer.order(ByteOrder.nativeOrder());
        reset();
        // check if file is compatible
        if (isXDAComptableFile(buffer)) {
            // set position to start
            buffer.position(0);
            // rereading magic number
            int magicNo = BinaryFileIO.readInt(buffer);
            if (magicNo != CHP_FILE_MAGIC_NUMBER) {
                throw new IOException("Incorrect Magic number");
            }
            header.setMagicNumber(magicNo);
            // read version
            int version = BinaryFileIO.readInt(buffer);
            if (version > CHP_FILE_VERSION_NUMBER) {
                throw new IOException("Incompatible version");
            }
            header.setVersion(version);
            // read cols and rows dimension
            header.setCols(BinaryFileIO.readUnsignedShort(buffer));
            header.setRows(BinaryFileIO.readUnsignedShort(buffer));
            // read no of probe sets
            header.setNoOfProbeSets(BinaryFileIO.readInt(buffer));
            // skip qc data
            BinaryFileIO.readInt(buffer);
            // read type
            header.setGeneChipAssayType(BinaryFileIO.readInt(buffer));
            // read progID
            header.setProgID(BinaryFileIO.readString(buffer));
            // read parentCellFile
            header.setParentCellFile(BinaryFileIO.readString(buffer));
            // header chipType
            header.setChipType(BinaryFileIO.readString(buffer));
            // read algorithm
            header.setAlgorithmName(BinaryFileIO.readString(buffer));
            // read algorithm parameters
            int noOfParams = BinaryFileIO.readInt(buffer);
            for (int i = 0; i < noOfParams; i++) {
                NameValuePair param = new NameValuePair();
                param.setName(BinaryFileIO.readString(buffer));
                param.setValue(BinaryFileIO.readString(buffer));
                header.getAlgorithmParameters().add(param);
            }
            // read summary parameters
            noOfParams = BinaryFileIO.readInt(buffer);
            for (int i = 0; i < noOfParams; i++) {
                NameValuePair param = new NameValuePair();
                param.setName(BinaryFileIO.readString(buffer));
                param.setValue(BinaryFileIO.readString(buffer));
                header.getSummaryParameters().add(param);
            }
            // skip
            noOfParams = BinaryFileIO.readInt(buffer);
            float f = BinaryFileIO.readFloat(buffer);
            for (int i = 0; i < noOfParams; i++) {
                BinaryFileIO.readFloat(buffer);
                BinaryFileIO.readFloat(buffer);
                BinaryFileIO.readFloat(buffer);
            }
            // finished header
            if (header.getGeneChipAssayType() != DataConstants.GENE_CHIP_ASSAY_TYPE_EXPRESSION
                    && header.getGeneChipAssayType() != DataConstants.GENE_CHIP_ASSAY_TYPE_GENOTYPING) {
                throw new IOException("Supports only Expression or Genotypes");
            }
            if (header.getGeneChipAssayType() == DataConstants.GENE_CHIP_ASSAY_TYPE_EXPRESSION) {
                int analysisType = BinaryFileIO.readUnsignedChar(buffer);
                int ival = BinaryFileIO.readInt(buffer);
                if (analysisType != EXPRESSION_ABSOLUTE_STAT_ANALYSIS
                        && analysisType != EXPRESSION_COMPARISON_STAT_ANALYSIS) {
                    throw new IOException(
                            "outdated expression CHP files, should be MAS 5 or higher");
                }
                for (int i = 0; i < header.getNoOfProbeSets(); i++) {
                    ExpressionProbeSetResults results = new ExpressionProbeSetResults();
                    results.setDetection(BinaryFileIO.readUnsignedChar(buffer));
                    results.setDetectionPValue(BinaryFileIO.readFloat(buffer));
                    results.setSignal(BinaryFileIO.readFloat(buffer));
                    results.setNoOfPairs(BinaryFileIO.readUnsignedShort(buffer));
                    results.setNoOfUsedPairs(BinaryFileIO.readUnsignedShort(buffer));
                    results.setHasCompResults(false);
                    // comparison analyses carry extra per-probe-set fields
                    if (analysisType == EXPRESSION_COMPARISON_STAT_ANALYSIS) {
                        results.setHasCompResults(true);
                        results.setChange(BinaryFileIO.readUnsignedChar(buffer));
                        results.setChangePValue(BinaryFileIO.readFloat(buffer));
                        results.setSignalLogRatio(BinaryFileIO.readFloat(buffer));
                        results.setSignalLogRatioLow(BinaryFileIO.readFloat(buffer));
                        results.setSignalLogRatioHigh(BinaryFileIO.readFloat(buffer));
                        results.setNoOfCommonPairs(BinaryFileIO.readUnsignedShort(buffer));
                    }
                    expressionResults.add(results);
                }
            } else {
                int ival = BinaryFileIO.readInt(buffer);
                for (int i = 0; i < header.getNoOfProbeSets(); i++) {
                    GenotypeProbeSetResults results = new GenotypeProbeSetResults();
                    results.setAlleleCall(BinaryFileIO.readUnsignedChar(buffer));
                    results.setConfidence(BinaryFileIO.readFloat(buffer));
                    results.setRas1(BinaryFileIO.readFloat(buffer));
                    results.setPvalueAA(results.getRas1());
                    results.setRas2(BinaryFileIO.readFloat(buffer));
                    results.setPvalueAB(results.getRas2());
                    results.setPvalueBB(BinaryFileIO.readFloat(buffer));
                    results.setPvalueNoCall(BinaryFileIO.readFloat(buffer));
                    genotypeResults.add(results);
                }
            }
        } else {
            buffer.position(0);
            String verString = BinaryFileIO.readFixedString(buffer, APP_NAME.length());
            if (!verString.equals(APP_NAME)) {
                throw new IOException("incompatible file format");
            }
            int version = BinaryFileIO.readInt(buffer);
            if (version < 12) {
                throw new IOException(
                        "This chip file is not supported by the parser");
            }
            header.setVersion(version);
            // read algorithm
            header.setAlgorithmName(BinaryFileIO.readString(buffer));
            // read version
            header.setAlgorithmVersion(BinaryFileIO.readString(buffer));
            // read parameters ??
            BinaryFileIO.readString(buffer);
            BinaryFileIO.readString(buffer);
            // read cols and rows dimension
            header.setRows(BinaryFileIO.readInt(buffer));
            header.setCols(BinaryFileIO.readInt(buffer));
BinaryFileIO.readInt(buffer);
BinaryFileIO.readInt(buffer);
// read no of probe sets
header.setNoOfProbeSets(BinaryFileIO.readInt(buffer));
// unused data
int maxvalue = BinaryFileIO.readInt(buffer);
results.setDetectionPValue(BinaryFileIO.readFl
oat(buffer));
BinaryFileIO.readInt(buffer);
if (header.getVersion() == 12)
BinaryFileIO.readFloat(buffer);
for (int i = 0; i < header.getNoOfProbeSets); i++) {
BinaryFileIO.readInt(buffer);
results.setSignal(BinaryFileIO.readFloat(buffe
r) );
for (int i = 0; i < maxvalue; i++)
BinaryFileIO.readInt(buffer);
results. setDetection (BinaryFileIO. readInt (buff
er));
for (int i = 0; i < maxvalue; i++) {
int type = BinaryFileIO.readInt(buffer);
if (i == 0) {
if
for (int j = 0; j < results.getNoOfPairs();
j+
+) {
BinaryFileIO.readFloat(buffer);
BinaryFileIO.readInt(buffer);
(type == 3)
header
. setGeneChipAssayType(
if (header.getVersion) == 12)
BinaryFileIO. readInt (buffer);
BinaryFileIO.readInt(buffer);
BinaryFileIO.readFloat (buffer)
DataConstants.GENECHIPASSAYTYPE_EXPRESSION);
else if (type == 2)
header
.setCeneChipAssayType(
DataConstants.GENECHIPASSAYTYPEGENOTYPING);
BinaryFileIO.readFloat(buffer)
else
header
BinaryFileIO. readInt (buffer);
.setGeneChipAssayType(
BinaryFileiO.readChar(buffer);
DataConstants.GENECHIPASSAYTYPEJUNKNOWN);
BinaryFileIO.readChar(buffer);
} else
for (int i = 0; i < header.getNoOfProbeSetso; i++)
BinaryFileIO.readInt(buffer);
BinaryFileIO.readUnsignedShort
(buffer);
BinaryFileIO.readUnsignedShort
(buffer);
header.setChipType(BinaryFileIO.readFixedString(buffer, 256));
header.setParentCellFile(BinaryFileIO.readFixedString(buffer,
if (header.getVersion() == 12)
BinaryFileIO.readInt (buffer);
BinaryFileIO. readInt (buffer);
BinaryFileIO. readFloat (buffer)
256));
header.setProgID(BinaryFileIO.readString(buffer));
if
(header.getGeneChipAssayType()
!=DataConstants.GENECHIP_A
BinaryFileIO. readFloat (buffer)
SSAYTYPEEXPRESSION
&& header.getGeneChipAssayType() != DataConsta
nts.GENECHIPASSAYTYPEGENOTYPING)
BinaryFileIO.readInt(buffer);
BinaryFileIO.readChar(buffer);
BinaryFileIO.readChar(buffer);
{
throw new IOException("Supports only Expression or Gen
otypes");
} else
BinaryFileIO.readUnsignedShort
if
== DataConstants.GENECHIP-A
(buffer);
for (int i = 0; i < header.getNoOfProbeSets(); i++) {
ExpressionProbeSetResults results = new Expres
(buffer);
(header.getGeneChipAssayType)
BinaryFileIO.readunsignedShort
SSAYTYPEEXPRESSION) (
sionProbeSetResults();
results.setNoOfPairs(BinaryFileIO.readInt(buff
results
er)));
results.setNoOfUsedPairs(BinaryFileIO.readInt(
setHasCompResults (BinaryFileI
O.readInt(buffer) == 1 ? true
buffer));
:
false);
if (header.getVersion() <= 12)
if (results.isHasCompResults()
results
.setNoOfCommonPairs(Bi
BinaryFileIO.readInt(buffer);
BinaryFileIO.readInt(buffer);
naryFileIO
if
(header.getVersion() == 12) (
BinaryFileIO.readInt(buffer);
.readI
nt(buffer));
BinaryFileIO.readInt(buffer);
BinaryFileIO.readInt(buffer);
BinaryFileIO.readInt(buffer);
if (header.getVersion() == 12) {
BinaryFileIO.readInt(buffer);
BinaryFileIO.readInt(buffer);
BinaryFileIO.readInt(buffer);
int cval
if
(buffer));
BinaryFileIO.readChar(buffer);
(header.getVersion() == 12) {
BinaryFileIO.readChar(buffer);
BinaryFileIO.readInt(buffer);
BinaryFileIO.readInt(buffer);
1)
results.setAlleleCall(BinaryFileIO
.readUnsignedChar(buff
if (header.getVersion() == 12) {
results.setConfidence((float)
results.setSignalLogRatioHigh((float)
BinaryFileIO
.readInt(buffer) / 100
BinaryFileIO.readChar(buffer);
BinaryFileIO.readInt(buffer);
BinaryFileIO.readString(buffer);
BinaryFileIO.readstring(buffer);
BinaryFileIO.readString(buffer);
BinaryFileIO.readInt(buffer);
BinaryFileIO.readInt(buffer);
results.setChange(BinaryFileIO.readInt
if
=
(cval ==
BinaryFileIO
.readInt(buffe
0);
r) / 1000);
} else {
BinaryFileIO.readInt(buffer);
results.setConfidence(BinaryFi
if
(header.getVersion() == 12)
BinaryFileIO.readInt(buffer);
leIO
.readFloat(buf
fer));
results.setSignalLogRatio((float) Bina
ryFileIO
BinaryFileIO.readFloat(buffer);
BinaryFileIO.readFloat(buffer);
BinaryFileIo.readFloat(buffer);
.readInt(buffer) / 100
0);
if (header.getVersion() == 12)
BinaryFileIO.readInt(buffer);
results.setRasl(BinaryFileIO.readFloat
(buffer));
results.setRas2 (BinaryFilelO.readFloat
results.setSignalLogRatioLow((float) B
inaryFileIO
(buffer));
.readInt(buffer) / 100
} else
results.setConfidence(Of);
results.setRasl(Of);
results.setRas2(Of);
results.setAlleleCall(DataConstants.AL
0);
if
(header.getVersion() == 12) {
results.setChangePValue((float
) BinaryFileIO
LELENOCALL);
.readInt(buffe
r) / 1000);
else {
results.setChangePValue(Binary
FileIO
.readFloat(buf
results.setPvalue-AA(0.Of);
results.setPvalueAB(Of);
results.setPvalueBB(Of);
results.setPvalueNoCall(Of);
fer));
BinaryFileIO.readString(buffer);
BinaryFileIO.readString(buffer);
int np = BinaryFileIO.readInt(buffer);
expressionResults.add(results);
for (int j = 0; j < np;
i++) {
BinaryFileIO.readInt(buffer);
else
for (int i = 0; i < header.getNoQfProbeSets); i++)
GenotypeProbeSetResults results = new Genotype
ProbeSetResults();
int ngroups = BinaryFileIO.readInt(buffer);
for (int j
0; j < ngroups; j++) {
BinaryFileIO.readInt(buffer);
BinaryFileIO.readString(buffer);
BinaryFileIO.readChar(buffer);
if (header.getVersion() == 12) {
BinaryFileIO.readInt (buffer);
BinaryFileIO.readInt(buffer);
BinaryFileIO.readInt(buffer);
BinaryFileIO.readInt(buffer);
BinaryFileIO.readInt(buffer);
BinaryFileIO. readChar (buffer);
BinaryFileIO. readChar (buffer);
else (
BinaryFileIO.readUnsignedChar(
return genotypeResults;
buffer);
BinaryFileIO.readUnsignedChar(
)
buffer);
if
*
*
*/
(header.getVersion() == 12)
BinaryFileIO.readInt(buffer);
@param genotypeResults
The genotypeResults to set.
public void setGenotypeResults(List genotypeResults)
this.genotypeResults = genotypeResults;
BinaryFileIO.readInt(buffer);
BinaryFileIO.readInt(buffer);
BinaryFileIO.readInt(buffer);
BinaryFileIO.readInt(buffer);
BinaryFileIO.readChar(buffer);
BinaryFileIO.readChar(buffer);
@return Returns the header.
*/
public CHPHeaderData getHeader)
return header;
*
} else {
BinaryFileIO.readUnsignedChar(
buffer);
BinaryFileIO.readUnsignedChar(
buffer);
@param header
The header to set.
*/
public void setHeader(CHPHeaderData header)
this.header = header;
*
*
this.genotypeResults.add(results);
}
}
}
    public CHPData getCHPData() {
        CHPData data = new CHPData();
        data.setExpressionProbeSetResults(this.expressionResults);
        data.setGenotypeProbeSetResults(this.genotypeResults);
        data.setHeader(this.header);
        return data;
    }

    private boolean isXDAComptableFile(ByteBuffer buffer) {
        int magic = BinaryFileIO.readInt(buffer);
        return (magic == CHPFILEMAGIC_NUMBER);
    }

    public static void main(String[] args) throws IOException {
        String file = args[0];
        CHPFileParser parser = new CHPFileParser();
        parser.read(file);
        CHPHeaderData header = parser.getHeader();
        System.out.println(header.getAlgorithmName());
        System.out.println(header.getAlgorithmVersion());
        System.out.println(header.getMagicNumber());
        System.out.println(header.getVersion());
        System.out.println(header.getAlgorithmParameters());
        System.out.println(header.getParentCellFile());
        System.out.println(header.getChipType());
        System.out.println(header.getProgID());
    }

    /**
     * @return Returns the expressionResults.
     */
    public List getExpressionResults() {
        return expressionResults;
    }

    /**
     * @param expressionResults
     *            The expressionResults to set.
     */
    public void setExpressionResults(List expressionResults) {
        this.expressionResults = expressionResults;
    }

@return
Returns the genotypeResults.
public List getGenotypeResults()
}
CHPHeaderData.java
Wed Mar 09 01:41:46 2005
/*
 * Created on Mar 8, 2005
 *
 * TODO To change the template for this generated file go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
package edu.mit.data.affymetrix;

import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

/**
 * @author Aidan Downes
 *
 * TODO To change the template for this generated type comment go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
public class CHPHeaderData implements Serializable {
    /**
     * Comment for <code>serialVersionUID</code>
     */
    private static final long serialVersionUID = 1L;

    private int magicNumber;
    private int version;
    private int cols;
    private int rows;
    private int noOfProbeSets;
    private int geneChipAssayType;
    private String chipType;
    private String algorithmName;
    private String algorithmVersion;
    private String parentCellFile;
    private String progID;
    private List algorithmParameters;
    private List summaryParameters;

    /** @return Returns the algorithmName. */
    public String getAlgorithmName() { return algorithmName; }

    /** @param algorithmName The algorithmName to set. */
    public void setAlgorithmName(String algorithmName) { this.algorithmName = algorithmName; }

    /** @return Returns the algorithmParameters. */
    public List getAlgorithmParameters() { return algorithmParameters; }

    /** @param algorithmParameters The algorithmParameters to set. */
    public void setAlgorithmParameters(List algorithmParameters) { this.algorithmParameters = algorithmParameters; }

    /** @return Returns the algorithmVersion. */
    public String getAlgorithmVersion() { return algorithmVersion; }

    /** @param algorithmVersion The algorithmVersion to set. */
    public void setAlgorithmVersion(String algorithmVersion) { this.algorithmVersion = algorithmVersion; }

    /** @return Returns the chipType. */
    public String getChipType() { return chipType; }

    /** @param chipType The chipType to set. */
    public void setChipType(String chipType) { this.chipType = chipType; }

    /** @return Returns the cols. */
    public int getCols() { return cols; }

    /** @param cols The cols to set. */
    public void setCols(int cols) { this.cols = cols; }

    /** @return Returns the geneChipAssayType. */
    public int getGeneChipAssayType() { return geneChipAssayType; }

    /** @param geneChipAssayType The geneChipAssayType to set. */
    public void setGeneChipAssayType(int geneChipAssayType) { this.geneChipAssayType = geneChipAssayType; }

    /** @return Returns the magicNumber. */
    public int getMagicNumber() { return magicNumber; }

    /** @param magicNumber The magicNumber to set. */
    public void setMagicNumber(int magicNumber) { this.magicNumber = magicNumber; }

    /** @return Returns the noOfProbeSets. */
    public int getNoOfProbeSets() { return noOfProbeSets; }

    /** @param noOfProbeSets The noOfProbeSets to set. */
    public void setNoOfProbeSets(int noOfProbeSets) { this.noOfProbeSets = noOfProbeSets; }

    /** @return Returns the parentCellFile. */
    public String getParentCellFile() { return parentCellFile; }

    /** @param parentCellFile The parentCellFile to set. */
    public void setParentCellFile(String parentCellFile) { this.parentCellFile = parentCellFile; }

    /** @return Returns the progID. */
    public String getProgID() { return progID; }

    /** @param progID The progID to set. */
    public void setProgID(String progID) { this.progID = progID; }

    /** @return Returns the rows. */
    public int getRows() { return rows; }

    /** @param rows The rows to set. */
    public void setRows(int rows) { this.rows = rows; }

    /** @return Returns the summaryParameters. */
    public List getSummaryParameters() { return summaryParameters; }

    /** @param summaryParameters The summaryParameters to set. */
    public void setSummaryParameters(List summaryParameters) { this.summaryParameters = summaryParameters; }

    /** @return Returns the version. */
    public int getVersion() { return version; }

    /** @param version The version to set. */
    public void setVersion(int version) { this.version = version; }
}
DataConstants.java
Thu Apr 28 23:27:08 2005
/*
 * Created on Mar 8, 2005
 *
 * TODO To change the template for this generated file go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
package edu.mit.data.affymetrix;

/**
 * @author Aidan Downes
 *
 * Not elegant but it works
 */
public class DataConstants {
    // Expression, Genotyping, Resequencing, Universal, Unknown
    public static final int GENECHIPASSAYTYPE_EXPRESSION = 0;
    public static final int GENECHIPASSAYTYPE_GENOTYPING = 1;
    public static final int GENECHIPASSAYTYPE_RESEQUENCING = 2;
    public static final int GENECHIPASSAYTYPE_UNIVERSAL = 3;
    public static final int GENECHIPASSAYTYPE_UNKNOWN = 4;

    public static final int ALLELE_A_CALL = 6;
    public static final int ALLELE_B_CALL = 7;
    public static final int ALLELE_AB_CALL = 8;
    public static final int ALLELE_NO_CALL = 11;

    public static final int ABS_PRESENT_CALL = 0;
    public static final int ABS_MARGINAL_CALL = 1;
    public static final int ABS_ABSENT_CALL = 2;
    public static final int ABS_NO_CALL = 3;

    public static final int COMP_INCREASE_CALL = 1;
    public static final int COMP_DECREASE_CALL = 2;
    public static final int COMP_MOD_INCREASE_CALL = 3;
    public static final int COMP_MOD_DECREASE_CALL = 4;
    public static final int COMP_NO_CHANGE_CALL = 5;
    public static final int COMP_NO_CALL = 6;

    public static final char CELLDELIMCHAR = 0x14;
    public static final int MINCELLSTR = 4;
    public static final int CELLFILEMAGICNUMBER = 64;
    public static final int CELLFILEVERSIONNUMBER = 4;

    public static String getAlleleCallString(int alleleCall) {
        switch (alleleCall) {
        case ALLELE_A_CALL:
            return "A";
        case ALLELE_B_CALL:
            return "B";
        case ALLELE_AB_CALL:
            return "AB";
        case ALLELE_NO_CALL:
            return "No Call";
        default:
            break;
        }
        return "";
    }

    public static String getDetectionString(int detection) {
        switch (detection) {
        case ABS_PRESENT_CALL:
            return "P";
        case ABS_MARGINAL_CALL:
            return "M";
        case ABS_ABSENT_CALL:
            return "A";
        case ABS_NO_CALL:
            return "No Call";
        default:
            break;
        }
        return "";
    }

    public static String getChangeString(int change) {
        switch (change) {
        case COMP_INCREASE_CALL:
            return "I";
        case COMP_DECREASE_CALL:
            return "D";
        case COMP_MOD_INCREASE_CALL:
            return "MI";
        case COMP_MOD_DECREASE_CALL:
            return "MD";
        case COMP_NO_CHANGE_CALL:
            return "NC";
        case COMP_NO_CALL:
            return "No Call";
        default:
            break;
        }
        return "";
    }
}
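The lookups in DataConstants translate integer call codes into display strings with one switch per code family. As a hedged illustration only, the detection family could also be expressed as an enum on Java 5 or later. The sketch below assumes the detection codes 0 through 3 listed in DataConstants; the enum DetectionCall and its method labelFor are hypothetical names and are not part of the thesis code.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical enum counterpart of DataConstants.getDetectionString. */
enum DetectionCall {
    PRESENT(0, "P"),
    MARGINAL(1, "M"),
    ABSENT(2, "A"),
    NO_CALL(3, "No Call");

    // Index the constants by their raw integer code for constant-time lookup.
    private static final Map<Integer, DetectionCall> BY_CODE = new HashMap<>();
    static {
        for (DetectionCall call : values()) {
            BY_CODE.put(call.code, call);
        }
    }

    private final int code;
    private final String label;

    DetectionCall(int code, String label) {
        this.code = code;
        this.label = label;
    }

    /** Returns the display string for a raw code, or "" for an unknown code. */
    static String labelFor(int code) {
        DetectionCall call = BY_CODE.get(code);
        return call == null ? "" : call.label;
    }
}
```

An unknown code yields the empty string, matching the fall-through behaviour of the switch statements in the listing.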
ExperibaseImporter.java
Thu May 12 13:32:56 2005
package org.experibase.importer;

import java.io.File;
import edu.mit.parsers.affymetrix.*;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import oracle.jbo.ViewObject;
import org.experibase.importer.utils.GetOpt;
import java.util.logging.*;

/**
 * Adds data from raw files to experibase experiment
 */
public class ExperibaseImporter {
    private int experibaseId = -1;
    private static Logger logger = Logger.getLogger("org.experibase");

    public ExperibaseImporter(int experibaseId) {
        this.experibaseId = experibaseId;
    }

    /**
     * Imports a file into experibase, file types supported include:
     * EXP files
     * CHP files
     *
     * @param fileName
     */
    public void importFiles(ArrayList fileNames) {
        AffymetrixImporter affyImporter = new AffymetrixImporter(experibaseId);
        for (Iterator iter = fileNames.iterator(); iter.hasNext();) {
            File file = new File(iter.next().toString());
            if (file.exists() && !file.isDirectory()) {
                String ext = getFileExtension(file.getName());
                if (ext.equalsIgnoreCase(".CHP")) {
                    try {
                        CHPFileParser chpParser = new CHPFileParser();
                        chpParser.read(file.getAbsolutePath());
                        affyImporter.importData(chpParser.getCHPData());
                    } catch (IOException e) {
                        logger.log(Level.WARNING, "problems parsing chp file", e);
                    }
                } else if (ext.equalsIgnoreCase(".EXP")) {
                    try {
                        ExperimentFileParser expParser = new ExperimentFileParser();
                        expParser.read(file.getAbsolutePath());
                        affyImporter.importData(expParser.getData());
                    } catch (IOException e) {
                        logger.log(Level.WARNING, "problems parsing exp file", e);
                    }
                } else if (ext.equalsIgnoreCase(".CEL")) {
                    try {
                        CELFileParser celParser = new CELFileParser();
                        celParser.read(file.getAbsolutePath());
                        affyImporter.importData(celParser.getData());
                    } catch (IOException e) {
                        logger.log(Level.WARNING, "problems parsing cel file", e);
                    }
                }
            }
        }
        affyImporter.persist();
        logger.info("finished import files");
    }

    /**
     * Returns the extension of a file including the leading '.'
     *
     * @param fileName
     * @return The file extension or the empty string if fileName is null
     *         or the file has no extension
     */
    private static String getFileExtension(String fileName) {
        if (fileName == null)
            return "";
        int index = fileName.lastIndexOf('.');
        if (index < 0)
            return "";
        else return fileName.substring(index);
    }

    /**
     * Adds data from files to experiment
     *
     * @param args Args to program
     */
    public static void main(String[] args) {
        try {
            FileHandler fh = new FileHandler("experibasefiles.txt", 5242880, 1, true);
            logger.addHandler(fh);
        } catch (IOException e) {
            //ignore
        }
        logger.setLevel(Level.ALL);

        GetOpt go = new GetOpt(args, "he:");
        go.optErr = true;
        int ch = -1;
        // process options in command line arguments
        boolean usagePrint = false;
        int eId = -1;
        while ((ch = go.getopt()) != go.optEOF) {
            if ((char)ch == 'h') usagePrint = true;
            else if ((char)ch == 'e') {
                eId = go.processArg(go.optArgGet(), eId);
                logger.info("ExperibaseId is " + eId);
            }
            else doHelp(1); // undefined option
        }
        if (usagePrint)
            doHelp(0);

        ArrayList files = new ArrayList();
        // process non-option command line arguments
        for (int k = go.optIndexGet(); k < args.length; k++) {
            logger.info("Processing " + args[k]);
            files.add(args[k]);
        }
        ExperibaseImporter importer = new ExperibaseImporter(eId);
        importer.importFiles(files);
    }

    /*
     * Stub for providing help on usage
     * You can write a longer help than this, certainly.
     */
    static void doHelp(int returnValue) {
        System.err.println("Usage: Importer -e experibaseId file ...");
        System.err.println("\te\tExperibaseId");
        System.err.println("\th\tPrints this menu");
        logger.info("help shown with return value " + returnValue);
        System.exit(returnValue);
    }
}
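ExperibaseImporter chooses a parser from the file extension, compared case-insensitively against ".CHP", ".EXP" and ".CEL". The following is a minimal sketch of that rule in isolation, assuming Java 7 or later for the string switch. The class ExtensionDispatchSketch and its methods are hypothetical; the sketch returns only the name of the parser class that the importer would use and does not call any thesis code.

```java
import java.util.Locale;

/** Hypothetical stand-alone illustration of extension-based parser selection. */
final class ExtensionDispatchSketch {

    /** Same rule as getFileExtension: everything from the last '.' onward, or "". */
    static String extensionOf(String fileName) {
        if (fileName == null) {
            return "";
        }
        int index = fileName.lastIndexOf('.');
        return index < 0 ? "" : fileName.substring(index);
    }

    /** Names the parser class that would handle the file, or "" if unsupported. */
    static String parserFor(String fileName) {
        // Upper-casing once gives the same effect as equalsIgnoreCase per branch.
        String ext = extensionOf(fileName).toUpperCase(Locale.ROOT);
        switch (ext) {
            case ".CHP":
                return "CHPFileParser";
            case ".EXP":
                return "ExperimentFileParser";
            case ".CEL":
                return "CELFileParser";
            default:
                return "";
        }
    }
}
```

Files with any other extension, or with none, fall through without being imported, which mirrors the if/else chain in importFiles.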
ExperimentData.java
Tue Apr 19 02:16:12 2005
/*
 * Created on Mar 7, 2005
 *
 * TODO To change the template for this generated file go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
package edu.mit.data.affymetrix;

import java.io.Serializable;

/**
 * @author Aidan Downes
 *
 * TODO To change the template for this generated type comment go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
public class ExperimentData implements Serializable {
    /**
     * Comment for <code>serialVersionUID</code>
     */
    private static final long serialVersionUID = 3257288036928009526L;

    private String name;
    private String directoryPath;
    private String absolutePath;

    // [Sample Info]
    // Chip Type        Ecoli_ASv2
    // Chip Lot
    // Operator
    // Sample Type
    // Description
    // Project
    // Comments
    // Solution Type
    // Solution Lot
    private String chipType;
    private String chipLot;
    private String operator;
    private String sampleType;
    private String description;
    private String project;
    private String comments;
    private String solutionType;
    private String solutionLot;

    // [Fluidics]
    // Protocol
    // Station
    // Module
    // Hybridize Date
    private String protocol;
    private String station;
    private String module;
    private String hybridizeDate;

    // [Scanner]
    // Pixel Size
    // Filter
    // Scan Temperature
    // Scan Date
    // Scanner ID
    // Number of Scans
    // Scanner Type
    private String pixelSize;
    private String filter;
    private String scanTemperature;
    private String scanDate;
    private String scannerId;
    private String numberOfScans;
    private String scannerType;

    /** @return Returns the absolutePath. */
    public String getAbsolutePath() { return absolutePath; }

    /** @param absolutePath The absolutePath to set. */
    public void setAbsolutePath(String absolutePath) { this.absolutePath = absolutePath; }

    /** @return Returns the chipLot. */
    public String getChipLot() { return chipLot; }

    /** @param chipLot The chipLot to set. */
    public void setChipLot(String chipLot) { this.chipLot = chipLot; }

    /** @return Returns the chipType. */
    public String getChipType() { return chipType; }

    /** @param chipType The chipType to set. */
    public void setChipType(String chipType) { this.chipType = chipType; }

    /** @return Returns the comments. */
    public String getComments() { return comments; }

    /** @param comments The comments to set. */
    public void setComments(String comments) { this.comments = comments; }

    /** @return Returns the description. */
    public String getDescription() { return description; }

    /** @param description The description to set. */
    public void setDescription(String description) { this.description = description; }

    /** @return Returns the directoryPath. */
    public String getDirectoryPath() { return directoryPath; }

    /** @param directoryPath The directoryPath to set. */
    public void setDirectoryPath(String directoryPath) { this.directoryPath = directoryPath; }

    /** @return Returns the filter. */
    public String getFilter() { return filter; }

    /** @param filter The filter to set. */
    public void setFilter(String filter) { this.filter = filter; }

    /** @return Returns the hybridizeDate. */
    public String getHybridizeDate() { return hybridizeDate; }

    /** @param hybridizeDate The hybridizeDate to set. */
    public void setHybridizeDate(String hybridizeDate) { this.hybridizeDate = hybridizeDate; }

    /** @return Returns the module. */
    public String getModule() { return module; }

    /** @param module The module to set. */
    public void setModule(String module) { this.module = module; }

    /** @return Returns the name. */
    public String getName() { return name; }

    /** @param name The name to set. */
    public void setName(String name) { this.name = name; }

    /** @return Returns the numberOfScans. */
    public String getNumberOfScans() { return numberOfScans; }

    /** @param numberOfScans The numberOfScans to set. */
    public void setNumberOfScans(String numberOfScans) { this.numberOfScans = numberOfScans; }

    /** @return Returns the operator. */
    public String getOperator() { return operator; }

    /** @param operator The operator to set. */
    public void setOperator(String operator) { this.operator = operator; }

    /** @return Returns the pixelSize. */
    public String getPixelSize() { return pixelSize; }

    /** @param pixelSize The pixelSize to set. */
    public void setPixelSize(String pixelSize) { this.pixelSize = pixelSize; }

    /** @return Returns the project. */
    public String getProject() { return project; }

    /** @param project The project to set. */
    public void setProject(String project) { this.project = project; }

    /** @return Returns the protocol. */
    public String getProtocol() { return protocol; }

    /** @param protocol The protocol to set. */
    public void setProtocol(String protocol) { this.protocol = protocol; }

    /** @return Returns the sampleType. */
    public String getSampleType() { return sampleType; }

    /** @param sampleType The sampleType to set. */
    public void setSampleType(String sampleType) { this.sampleType = sampleType; }

    /** @return Returns the scanDate. */
    public String getScanDate() { return scanDate; }

    /** @param scanDate The scanDate to set. */
    public void setScanDate(String scanDate) { this.scanDate = scanDate; }

    /** @return Returns the scannerId. */
    public String getScannerId() { return scannerId; }

    /** @param scannerId The scannerId to set. */
    public void setScannerId(String scannerId) { this.scannerId = scannerId; }

    /** @return Returns the scannerType. */
    public String getScannerType() { return scannerType; }

    /** @param scannerType The scannerType to set. */
    public void setScannerType(String scannerType) { this.scannerType = scannerType; }

    /** @return Returns the scanTemperature. */
    public String getScanTemperature() { return scanTemperature; }

    /** @param scanTemperature The scanTemperature to set. */
    public void setScanTemperature(String scanTemperature) { this.scanTemperature = scanTemperature; }

    /** @return Returns the solutionLot. */
    public String getSolutionLot() { return solutionLot; }

    /** @param solutionLot The solutionLot to set. */
    public void setSolutionLot(String solutionLot) { this.solutionLot = solutionLot; }

    /** @return Returns the solutionType. */
    public String getSolutionType() { return solutionType; }

    /** @param solutionType The solutionType to set. */
    public void setSolutionType(String solutionType) { this.solutionType = solutionType; }

    /** @return Returns the station. */
    public String getStation() { return station; }

    /** @param station The station to set. */
    public void setStation(String station) { this.station = station; }
}
ExperimentFileParser.java
Tue Apr 19 05:06:56 2005
/*
 * Created on Mar 7, 2005
 *
 * TODO To change the template for this generated file go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
package edu.mit.parsers.affymetrix;

import java.io.*;
import java.nio.CharBuffer;
import java.util.logging.Logger;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import edu.mit.data.affymetrix.ExperimentData;
import edu.mit.parsers.FileParser;

/**
 * @author Aidan Downes
 *
 * TODO To change the template for this generated type comment go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
public class ExperimentFileParser extends FileParser {
    private static final String CHIP_TYPE = "Chip Type";
    private static final String CHIP_LOT = "Chip Lot";
    private static final String SAMPLE_TYPE = "Sample Type";
    private static final String DESCRIPTION = "Description";
    private static final String PROJECT = "Project";
    private static final String COMMENTS = "Comments";
    private static final String SOLUTION_TYPE = "Solution Type";
    private static final String SOLUTION_LOT = "Solution Lot";
    private static final String PROTOCOL = "Protocol";
    private static final String STATION = "Station";
    private static final String MODULE = "Module";
    private static final String HYBRIDIZE_DATE = "Hybridize Date";
    private static final String PIXEL_SIZE = "Pixel Size";
    private static final String FILTER = "Filter";
    private static final String SCAN_TEMPERATURE = "Scan Temperature";
    private static final String SCAN_DATE = "Scan Date";
    private static final String SCANNER_ID = "Scanner ID";
    private static final String NUMBER_OF_SCANS = "Number of Scans";
    private static final String SCANNER_TYPE = "Scanner Type";
    private static final String OPERATOR = "Operator";
    private ExperimentData data;

    public void read(String filePath) throws IOException {
        File file = new File(filePath);
        String name = file.getName();
        int index = name.lastIndexOf(".");
        name = name.substring(0, index);

        CharBuffer buffer = getCharBuffer(filePath);

        //line pattern
        Pattern linePattern = Pattern.compile(".*$", Pattern.MULTILINE);

        //Match line
        Matcher lineMatcher = linePattern.matcher(buffer);
        data = new ExperimentData();
        data.setName(name);
        String value = "";
        logger.info("reading file contents");
        while (lineMatcher.find()) {
            String line = lineMatcher.group();
            if (line.startsWith(CHIP_TYPE)) {
                value = line.substring(CHIP_TYPE.length()).trim();
                if (!value.equals(""))
                    data.setChipType(value);
            } else if (line.startsWith(CHIP_LOT)) {
                value = line.substring(CHIP_LOT.length()).trim();
                if (!value.equals(""))
                    data.setChipLot(value);
            } else if (line.startsWith(SAMPLE_TYPE)) {
                value = line.substring(SAMPLE_TYPE.length()).trim();
                if (!value.equals(""))
                    data.setSampleType(value);
            } else if (line.startsWith(DESCRIPTION)) {
                value = line.substring(DESCRIPTION.length()).trim();
                if (!value.equals(""))
                    data.setDescription(value);
            } else if (line.startsWith(PROJECT)) {
                value = line.substring(PROJECT.length()).trim();
                if (!value.equals(""))
                    data.setProject(value);
            } else if (line.startsWith(COMMENTS)) {
                value = line.substring(COMMENTS.length()).trim();
                if (!value.equals(""))
                    data.setComments(value);
            } else if (line.startsWith(SOLUTION_TYPE)) {
                value = line.substring(SOLUTION_TYPE.length()).trim();
                if (!value.equals(""))
                    data.setSolutionType(value);
            } else if (line.startsWith(SOLUTION_LOT)) {
                value = line.substring(SOLUTION_LOT.length()).trim();
                if (!value.equals(""))
                    data.setSolutionLot(value);
            } else if (line.startsWith(PROTOCOL)) {
                value = line.substring(PROTOCOL.length()).trim();
                if (!value.equals(""))
                    data.setProtocol(value);
            } else if (line.startsWith(STATION)) {
                value = line.substring(STATION.length()).trim();
                if (!value.equals(""))
                    data.setStation(value);
            } else if (line.startsWith(MODULE)) {
                value = line.substring(MODULE.length()).trim();
                if (!value.equals(""))
                    data.setModule(value);
            } else if (line.startsWith(HYBRIDIZE_DATE)) {
                value = line.substring(HYBRIDIZE_DATE.length()).trim();
                if (!value.equals(""))
                    data.setHybridizeDate(value);
            } else if (line.startsWith(PIXEL_SIZE)) {
                value = line.substring(PIXEL_SIZE.length()).trim();
                if (!value.equals(""))
                    data.setPixelSize(value);
            } else if (line.startsWith(FILTER)) {
                value = line.substring(FILTER.length()).trim();
                if (!value.equals(""))
                    data.setFilter(value);
            } else if (line.startsWith(SCAN_TEMPERATURE)) {
                value = line.substring(SCAN_TEMPERATURE.length()).trim();
                if (!value.equals(""))
                    data.setScanTemperature(value);
            } else if (line.startsWith(SCAN_DATE)) {
                value = line.substring(SCAN_DATE.length()).trim();
                if (!value.equals(""))
                    data.setScanDate(value);
            } else if (line.startsWith(SCANNER_ID)) {
                value = line.substring(SCANNER_ID.length()).trim();
                if (!value.equals(""))
                    data.setScannerId(value);
            } else if (line.startsWith(NUMBER_OF_SCANS)) {
                value = line.substring(NUMBER_OF_SCANS.length()).trim();
                if (!value.equals(""))
                    data.setNumberOfScans(value);
            } else if (line.startsWith(SCANNER_TYPE)) {
                value = line.substring(SCANNER_TYPE.length()).trim();
                if (!value.equals(""))
                    data.setScannerType(value);
            } else if (line.startsWith(OPERATOR)) {
                value = line.substring(OPERATOR.length()).trim();
                if (!value.equals(""))
                    data.setOperator(value);
            }
        }
        logger.info("finished reading file contents");
    }

    /**
     * @return Returns the data.
     */
    public ExperimentData getData() {
        return data;
    }

    public static void main(String[] args) {
    }
}
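ExperimentFileParser walks the file one line at a time with a MULTILINE pattern and, for each line that starts with a known label, stores the trimmed remainder of the line. The sketch below shows the same technique on an in-memory string with a table of labels in place of the if/else chain. It is an illustration under assumptions: the class ExpLineSketch, its KEYS subset, and the tab-separated sample used to exercise it are hypothetical and are not taken from the thesis or from an actual EXP file.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical table-driven variant of the label/value scan. */
final class ExpLineSketch {
    // A small subset of the labels recognised by ExperimentFileParser.
    private static final String[] KEYS = {"Chip Type", "Chip Lot", "Scanner ID", "Scanner Type"};

    // With MULTILINE, '$' matches at each line end, so every find() yields one line
    // (plus some empty matches, which no label can match and which are ignored).
    private static final Pattern LINE = Pattern.compile(".*$", Pattern.MULTILINE);

    /** Returns label to value for every known label that has a non-empty value. */
    static Map<String, String> parse(CharSequence contents) {
        Map<String, String> values = new LinkedHashMap<>();
        Matcher lines = LINE.matcher(contents);
        while (lines.find()) {
            String line = lines.group();
            for (String key : KEYS) {
                if (line.startsWith(key)) {
                    String value = line.substring(key.length()).trim();
                    if (!value.isEmpty()) {
                        values.put(key, value);
                    }
                    break;
                }
            }
        }
        return values;
    }
}
```

As in the listing, a label with nothing after it is skipped, so fields that the operator left blank never overwrite the defaults in the data object.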
ExpressionProbeSetResults.java
Wed Mar 09 04:19:54 2005
/*
 * Created on Mar 8, 2005
 *
 * TODO To change the template for this generated file go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
package edu.mit.data.affymetrix;

import java.io.Serializable;

/**
 * @author Aidan Downes
 *
 * TODO To change the template for this generated type comment go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
public class ExpressionProbeSetResults implements Serializable {
    /**
     * Comment for <code>serialVersionUID</code>
     */
    private static final long serialVersionUID = 1L;

    float detectionPValue;
    float signal;
    int noOfPairs;
    int noOfUsedPairs;
    int detection;
    boolean hasCompResults;
    float changePValue;
    float signalLogRatio;
    float signalLogRatioLow;
    float signalLogRatioHigh;
    int noOfCommonPairs;
    int change;

    /** @return Returns the change. */
    public int getChange() { return change; }

    /** @param change The change to set. */
    public void setChange(int change) { this.change = change; }

    /** @return Returns the changePValue. */
    public float getChangePValue() { return changePValue; }

    /** @param changePValue The changePValue to set. */
    public void setChangePValue(float changePValue) { this.changePValue = changePValue; }

    /** @return Returns the detection. */
    public int getDetection() { return detection; }

    /** @param detection The detection to set. */
    public void setDetection(int detection) { this.detection = detection; }

    /** @return Returns the detectionPValue. */
    public float getDetectionPValue() { return detectionPValue; }

    /** @param detectionPValue The detectionPValue to set. */
    public void setDetectionPValue(float detectionPValue) { this.detectionPValue = detectionPValue; }

    /** @return Returns the hasCompResults. */
    public boolean isHasCompResults() { return hasCompResults; }

    /** @param hasCompResults The hasCompResults to set. */
    public void setHasCompResults(boolean hasCompResults) { this.hasCompResults = hasCompResults; }

    /** @return Returns the noOfCommonPairs. */
    public int getNoOfCommonPairs() { return noOfCommonPairs; }

    /** @param noOfCommonPairs The noOfCommonPairs to set. */
    public void setNoOfCommonPairs(int noOfCommonPairs) { this.noOfCommonPairs = noOfCommonPairs; }

    /** @return Returns the noOfPairs. */
    public int getNoOfPairs() { return noOfPairs; }

    /** @param noOfPairs The noOfPairs to set. */
    public void setNoOfPairs(int noOfPairs) { this.noOfPairs = noOfPairs; }

    /** @return Returns the noOfUsedPairs. */
    public int getNoOfUsedPairs() { return noOfUsedPairs; }

    /** @param noOfUsedPairs The noOfUsedPairs to set. */
    public void setNoOfUsedPairs(int noOfUsedPairs) { this.noOfUsedPairs = noOfUsedPairs; }

    /** @return Returns the signal. */
    public float getSignal() { return signal; }

    /** @param signal The signal to set. */
    public void setSignal(float signal) { this.signal = signal; }

    /** @return Returns the signalLogRatio. */
    public float getSignalLogRatio() { return signalLogRatio; }

    /** @param signalLogRatio The signalLogRatio to set. */
    public void setSignalLogRatio(float signalLogRatio) { this.signalLogRatio = signalLogRatio; }

    /** @return Returns the signalLogRatioHigh. */
    public float getSignalLogRatioHigh() { return signalLogRatioHigh; }

    /** @param signalLogRatioHigh The signalLogRatioHigh to set. */
    public void setSignalLogRatioHigh(float signalLogRatioHigh) { this.signalLogRatioHigh = signalLogRatioHigh; }

    /** @return Returns the signalLogRatioLow. */
    public float getSignalLogRatioLow() { return signalLogRatioLow; }

    /** @param signalLogRatioLow The signalLogRatioLow to set. */
    public void setSignalLogRatioLow(float signalLogRatioLow) { this.signalLogRatioLow = signalLogRatioLow; }
}
FileParser.java
Tue Mar 08 21:55:58 2005
/*
 * Created on Mar 7, 2005
 *
 * TODO To change the template for this generated file go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
package edu.mit.parsers;

import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.util.logging.Level;
import java.util.logging.Logger;

/**
 * @author Aidan Downes
 *
 * TODO To change the template for this generated type comment go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
public class FileParser {
    protected Logger logger;

    public FileParser() {
        logger = Logger.getLogger("edu.mit.parsers");
    }

    /**
     * Gets a character buffer containing the contents of the file at <code>filePath</code>
     *
     * @param filePath
     * @return
     * @throws IOException
     */
    public CharBuffer getCharBuffer(String filePath) throws IOException {
        try {
            ByteBuffer buffer = getBuffer(filePath);
            //Converter to char buffer
            Charset charset = Charset.forName("ISO-8859-1");
            CharsetDecoder decoder = charset.newDecoder();
            return decoder.decode(buffer);
        } catch (FileNotFoundException e) {
            logger.log(Level.WARNING, e.getLocalizedMessage(), e);
            throw e;
        } catch (IOException e) {
            logger.log(Level.WARNING, e.getLocalizedMessage(), e);
            throw e;
        }
    }

    /**
     * @param filePath
     * @return
     * @throws FileNotFoundException
     * @throws IOException
     */
    protected ByteBuffer getBuffer(String filePath) throws FileNotFoundException, IOException {
        //Map file to file byte buffer
        FileInputStream input = new FileInputStream(filePath);
        FileChannel channel = input.getChannel();
        long fileLength = channel.size();
        MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, fileLength);
        return buffer;
    }
}
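FileParser maps the whole file into memory and decodes it as ISO-8859-1 before the text parsers scan it. The listing leaves the input stream open after mapping. As a hedged sketch only, the same read can be written for Java 7 or later with try-with-resources so that the channel is closed once the mapping exists. The class MappedReadSketch and both of its methods are hypothetical names; roundTrip exists purely to exercise the sketch against a temporary file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.CharBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical variant of FileParser.getCharBuffer that closes its channel. */
final class MappedReadSketch {

    /** Maps the file read-only and decodes its bytes as ISO-8859-1. */
    static CharBuffer readLatin1(Path path) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            MappedByteBuffer bytes = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            // decode() copies into a new CharBuffer, so the result does not depend on the file.
            return StandardCharsets.ISO_8859_1.decode(bytes);
        }
    }

    /** Writes text to a temporary file and reads it back through readLatin1. */
    static String roundTrip(String text) {
        try {
            Path tmp = Files.createTempFile("mapped-read-sketch", ".txt");
            tmp.toFile().deleteOnExit();
            Files.write(tmp, text.getBytes(StandardCharsets.ISO_8859_1));
            return readLatin1(tmp).toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

ISO-8859-1 maps every byte to exactly one character, so decoding cannot fail on arbitrary input, which is presumably why it suits files whose encoding is not declared.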
GenotypeProbeSetResults.java
Wed Mar 09 00:06:00 2005
/*
 * Created on Mar 8, 2005
 *
 * TODO To change the template for this generated file go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
package edu.mit.data.affymetrix;

import java.io.Serializable;

/**
 * @author Aidan Downes
 *
 * TODO To change the template for this generated type comment go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
public class GenotypeProbeSetResults implements Serializable {
    /**
     * Comment for <code>serialVersionUID</code>
     */
    private static final long serialVersionUID = 1L;

    int alleleCall;
    float confidence;
    float ras1;
    float ras2;
    float pvalue_AA;
    float pvalue_AB;
    float pvalue_BB;
    float pvalue_NoCall;

    /** @return Returns the alleleCall. */
    public int getAlleleCall() { return alleleCall; }

    /** @param alleleCall The alleleCall to set. */
    public void setAlleleCall(int alleleCall) { this.alleleCall = alleleCall; }

    /** @return Returns the confidence. */
    public float getConfidence() { return confidence; }

    /** @param confidence The confidence to set. */
    public void setConfidence(float confidence) { this.confidence = confidence; }

    /** @return Returns the pvalue_AA. */
    public float getPvalue_AA() { return pvalue_AA; }

    /** @param pvalue_AA The pvalue_AA to set. */
    public void setPvalue_AA(float pvalue_AA) { this.pvalue_AA = pvalue_AA; }

    /** @return Returns the pvalue_AB. */
    public float getPvalue_AB() { return pvalue_AB; }

    /** @param pvalue_AB The pvalue_AB to set. */
    public void setPvalue_AB(float pvalue_AB) { this.pvalue_AB = pvalue_AB; }

    /** @return Returns the pvalue_BB. */
    public float getPvalue_BB() { return pvalue_BB; }

    /** @param pvalue_BB The pvalue_BB to set. */
    public void setPvalue_BB(float pvalue_BB) { this.pvalue_BB = pvalue_BB; }

    /** @return Returns the pvalue_NoCall. */
    public float getPvalue_NoCall() { return pvalue_NoCall; }

    /** @param pvalue_NoCall The pvalue_NoCall to set. */
    public void setPvalue_NoCall(float pvalue_NoCall) { this.pvalue_NoCall = pvalue_NoCall; }

    /** @return Returns the ras1. */
    public float getRas1() { return ras1; }

    /** @param ras1 The ras1 to set. */
    public void setRas1(float ras1) { this.ras1 = ras1; }

    /** @return Returns the ras2. */
    public float getRas2() { return ras2; }

    /** @param ras2 The ras2 to set. */
    public void setRas2(float ras2) { this.ras2 = ras2; }
}
GetOpt.java
Thu Apr 21 02:15:46 2005
1
// ------------------------------------------------------------------------
package org.experibase.importer.utils;

// NOTES
//   Original Author not known
//
// OVERVIEW:
//
//   GetOpt provides a general means for a Java program to parse command
//   line arguments in accordance with the standard Unix conventions;
//   it is analogous to, and based on, getopt(3) for C programs.
//   (The following documentation is based on the man page for getopt(3).)
//
// DESCRIPTION:
//
//   GetOpt is a Java class that provides one method, getopt,
//   and some variables that control behavior of or return additional
//   information from getopt.
//
//   GetOpt interprets command arguments in accordance with the standard
//   Unix conventions: option arguments of a command are introduced by "-"
//   followed by a key character, and a non-option argument terminates
//   the processing of options. GetOpt's option interpretation is controlled
//   by its parameter optString, which specifies what characters designate
//   legal options and which of them require associated values.
//
//   The getopt method returns the next, moving left to right, option letter
//   in the command line arguments that matches a letter in optString.
//   optString must contain the option letters the command using getopt
//   will recognize. For example, getopt("ab") specifies that the command
//   line should contain no options, only "-a", only "-b", or both "-a" and
//   "-b" in either order. (The command line can also contain non-option
//   arguments after any option arguments.) Multiple options per argument
//   are allowed, e.g., "-ab" for the last case above.
//
//   If a letter in optString is followed by a colon, the option is expected
//   to have an argument. The argument may or may not be separated by
//   whitespace from the option letter. For example, getopt("w:") allows
//   either "-w 80" or "-w80". The variable optArg is set to the option
//   argument, e.g., "80" in either of the previous examples. Conversion
//   functions such as Integer.parseInt(), etc., can then be applied to
//   optArg.
//
//   getopt places in the variable optIndex the index of the next command
//   line argument to be processed; optIndex is automatically initialized
//   to 0 before the first call to getopt.
//
//   When all options have been processed (that is, up to the first
//   non-option argument), getopt returns optEOF (-1). getopt recognizes the
//   command line argument "--" (i.e., two dashes) to delimit the end of
//   the options; getopt returns optEOF and skips "--". Subsequent,
//   non-option arguments can be retrieved using the String array passed to
//   main(), beginning with argument number optIndex.
//
// DIAGNOSTICS:
//
//   getopt prints an error message on System.err and returns a question
//   mark ('?') when it encounters an option letter in a command line argument
//   that is not included in optString. Setting the variable optErr to
//   false disables this error message.
//
// NOTES:
//
//   The following notes describe GetOpt's behavior in a few interesting
//   or special cases; these behaviors are consistent with getopt(3)'s
//   behaviors.
//   - A '-' by itself is treated as a non-option argument.
//   - If optString is "a:" and the command line arguments are "-a -x",
//     then "-x" is treated as the argument associated with the "-a".
//   - Duplicate command line options are allowed; it is up to user to
//     deal with them as appropriate.
//   - A command line option like "-b-" is considered as the two options
//     "b" and "-" (so "-" should appear in option string); this differs
//     from "-b --".
//   - Sun and DEC getopt(3)'s differ w.r.t. how "---" is handled.
//     Sun treats "---" (or anything starting with "--") the same as "--".
//     DEC treats "---" as two separate "-" options
//     (so "-" should appear in option string).
//     Java GetOpt follows the DEC convention.
//   - An option 'letter' can be a letter, number, or most special character.
//     Like getopt(3), GetOpt disallows a colon as an option letter.
// ------------------------------------------------------------------------

public class GetOpt {
    private String[] theArgs = null;
    private int argCount = 0;
    private String optString = null;

    public GetOpt(String[] args, String opts) {
        theArgs = args;
        argCount = theArgs.length;
        optString = opts;
    }

    // user can toggle this to control printing of error messages
    public boolean optErr = false;

    public int processArg(String arg, int n) {
        int value;
        try {
            value = Integer.parseInt(arg);
        } catch (NumberFormatException e) {
            if (optErr)
                System.err.println("processArg cannot process " + arg
                    + " as an integer");
            return n;
        }
        return value;
    }

    public int tryArg(int k, int n) {
        int value;
        try {
            value = processArg(theArgs[k], n);
        } catch (ArrayIndexOutOfBoundsException e) {
            if (optErr)
                System.err.println("tryArg: no theArgs[" + k + "]");
            return n;
        }
        return value;
    }

    public long processArg(String arg, long n) {
        long value;
        try {
            value = Long.parseLong(arg);
        } catch (NumberFormatException e) {
            if (optErr)
                System.err.println("processArg cannot process " + arg
                    + " as a long");
            return n;
        }
        return value;
    }

    public long tryArg(int k, long n) {
        long value;
        try {
            value = processArg(theArgs[k], n);
        } catch (ArrayIndexOutOfBoundsException e) {
            if (optErr)
                System.err.println("tryArg: no theArgs[" + k + "]");
            return n;
        }
        return value;
    }

    public double processArg(String arg, double d) {
        double value;
        try {
            value = Double.valueOf(arg).doubleValue();
        } catch (NumberFormatException e) {
            if (optErr)
                System.err.println("processArg cannot process " + arg
                    + " as a double");
            return d;
        }
        return value;
    }

    public double tryArg(int k, double d) {
        double value;
        try {
            value = processArg(theArgs[k], d);
        } catch (ArrayIndexOutOfBoundsException e) {
            if (optErr)
                System.err.println("tryArg: no theArgs[" + k + "]");
            return d;
        }
        return value;
    }

    public float processArg(String arg, float f) {
        float value;
        try {
            value = Float.valueOf(arg).floatValue();
        } catch (NumberFormatException e) {
            if (optErr)
                System.err.println("processArg cannot process " + arg
                    + " as a float");
            return f;
        }
        return value;
    }

    public float tryArg(int k, float f) {
        float value;
        try {
            value = processArg(theArgs[k], f);
        } catch (ArrayIndexOutOfBoundsException e) {
            if (optErr)
                System.err.println("tryArg: no theArgs[" + k + "]");
            return f;
        }
        return value;
    }

    public boolean processArg(String arg, boolean b) {
        // 'true' in any case mixture is true; anything else is false
        return Boolean.valueOf(arg).booleanValue();
    }

    public boolean tryArg(int k, boolean b) {
        boolean value;
        try {
            value = processArg(theArgs[k], b);
        } catch (ArrayIndexOutOfBoundsException e) {
            if (optErr)
                System.err.println("tryArg: no theArgs[" + k + "]");
            return b;
        }
        return value;
    }

    public String tryArg(int k, String s) {
        String value;
        try {
            value = theArgs[k];
        } catch (ArrayIndexOutOfBoundsException e) {
            if (optErr)
                System.err.println("tryArg: no theArgs[" + k + "]");
            return s;
        }
        return value;
    }

    private static void writeError(String msg, char ch) {
        System.err.println("GetOpt: " + msg + " -- " + ch);
    }

    public static final int optEOF = -1;

    private int optIndex = 0;
    public int optIndexGet() { return optIndex; }

    private String optArg = null;
    public String optArgGet() { return optArg; }

    private int optPosition = 1;

    public int getopt() {
        optArg = null;
        if (theArgs == null || optString == null) return optEOF;
        if (optIndex < 0 || optIndex >= argCount) return optEOF;
        String thisArg = theArgs[optIndex];
        int argLength = thisArg.length();
        // handle special cases
        if (argLength <= 1 || thisArg.charAt(0) != '-') {
            // e.g., "", "a", "abc", or just "-"
            return optEOF;
        } else if (thisArg.equals("--")) { // end of non-option args
            optIndex++;
            return optEOF;
        }
        // get next "letter" from option argument
        char ch = thisArg.charAt(optPosition);
        // find this option in optString
        int pos = optString.indexOf(ch);
        if (pos == -1 || ch == ':') {
            if (optErr) writeError("illegal option", ch);
            ch = '?';
        } else { // handle colon, if present
            if (pos < optString.length()-1 && optString.charAt(pos+1) == ':') {
                if (optPosition != argLength-1) {
                    // take rest of current arg as optArg
                    optArg = thisArg.substring(optPosition+1);
                    optPosition = argLength-1; // force advance to next arg below
                } else { // take next arg as optArg
                    optIndex++;
                    if (optIndex < argCount
                            && (theArgs[optIndex].charAt(0) != '-' ||
                            theArgs[optIndex].length() >= 2 &&
                            (optString.indexOf(theArgs[optIndex].charAt(1)) == -1
                            || theArgs[optIndex].charAt(1) == ':'))) {
                        optArg = theArgs[optIndex];
                    } else {
                        if (optErr) writeError("option requires an argument", ch);
                        optArg = null;
                        ch = ':'; // Linux man page for getopt(3) says : not ?
                    }
                }
            }
        }
        // advance to next option argument,
        // which might be in thisArg or next arg
        optPosition++;
        if (optPosition >= argLength) {
            optIndex++;
            optPosition = 1;
        }
        return ch;
    }

    public static void main(String[] args) { // test the class
        GetOpt go = new GetOpt(args, "Uab:f:h:w:");
        go.optErr = true;
        int ch = -1;
        // process options in command line arguments
        boolean usagePrint = false;   // set
        int aflg = 0;                 // default
        boolean bflg = false;         // values
        String filename = "out";      // of
        int width = 80;               // options
        double height = 1;            // here
        while ((ch = go.getopt()) != go.optEOF) {
            if ((char)ch == 'U') usagePrint = true;
            else if ((char)ch == 'a') aflg++;
            else if ((char)ch == 'b')
                bflg = go.processArg(go.optArgGet(), bflg);
            else if ((char)ch == 'f') filename = go.optArgGet();
            else if ((char)ch == 'h')
                height = go.processArg(go.optArgGet(), height);
            else if ((char)ch == 'w')
                width = go.processArg(go.optArgGet(), width);
            else System.exit(1);      // undefined option
        }                             // getopt() returns '?'
        if (usagePrint) {
            System.out.println("Usage: -a -b bool -f file -h height -w width");
            System.exit(0);
        }
        System.out.println("These are all the command line arguments " +
            "before processing with GetOpt:");
        for (int i=0; i<args.length; i++) System.out.print(" " + args[i]);
        System.out.println();
        System.out.println("-U " + usagePrint);
        System.out.println("-a " + aflg);
        System.out.println("-b " + bflg);
        System.out.println("-f " + filename);
        System.out.println("-h " + height);
        System.out.println("-w " + width);
        // process non-option command line arguments
        for (int k = go.optIndexGet(); k < args.length; k++) {
            System.out.println("normal argument " + k + " is " + args[k]);
        }
    }
}
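
The optString convention described in the GetOpt header comment can be illustrated with a small, self-contained sketch. The class name OptStringDemo and its fields are hypothetical and are not part of the Experibase code. The sketch also simplifies GetOpt's behavior in two ways that should be read as assumptions: it parses all options in one pass instead of returning one letter per call, and it throws an exception where GetOpt returns '?' or ':'. It always takes the next argument as the value of a colon option, whereas GetOpt first checks whether that argument is itself a recognized option.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; a simplified illustration, not the thesis code.
class OptStringDemo {

    // Options in the order seen; flags map to "", valued options to their argument.
    final Map<Character, String> options = new LinkedHashMap<>();
    // Arguments left over once option processing stops.
    final List<String> rest = new ArrayList<>();

    OptStringDemo(String[] args, String optString) {
        int i = 0;
        while (i < args.length) {
            String arg = args[i];
            // "", a lone "-", or anything not starting with '-' ends option processing.
            if (arg.length() <= 1 || arg.charAt(0) != '-') break;
            i++;
            if (arg.equals("--")) break; // explicit end-of-options marker is skipped
            for (int p = 1; p < arg.length(); p++) {
                char ch = arg.charAt(p);
                int pos = optString.indexOf(ch);
                if (pos == -1 || ch == ':') {
                    throw new IllegalArgumentException("illegal option -- " + ch);
                }
                boolean takesValue = pos + 1 < optString.length()
                        && optString.charAt(pos + 1) == ':';
                if (!takesValue) {
                    options.put(ch, ""); // plain flag; keep scanning, e.g. "-ab"
                    continue;
                }
                if (p < arg.length() - 1) {
                    options.put(ch, arg.substring(p + 1)); // attached value, "-w80"
                } else if (i < args.length) {
                    options.put(ch, args[i++]);            // separate value, "-w 80"
                } else {
                    throw new IllegalArgumentException(
                            "option requires an argument -- " + ch);
                }
                break; // the remainder of this argument was consumed as the value
            }
        }
        for (; i < args.length; i++) rest.add(args[i]);
    }

    public static void main(String[] args) {
        String[] demo = {"-ab", "-w80", "-f", "out.txt", "--", "x"};
        OptStringDemo d = new OptStringDemo(demo, "abf:w:");
        // prints {a=, b=, w=80, f=out.txt} [x]
        System.out.println(d.options + " " + d.rest);
    }
}
```

With optString "a:" and the arguments "-a -x", the sketch stores "-x" as the value of "a", which matches the special case listed in the NOTES of the GetOpt comment.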
MiamexpressImporter.java
Thu May 19 22:50:34 2005
package org.experibase.importer;

import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;
import java.util.logging.FileHandler;
import java.util.logging.Level;
import java.util.logging.Logger;
import oracle.jbo.ApplicationModule;
import oracle.jbo.Key;
import oracle.jbo.Row;
import oracle.jbo.RowIterator;
import oracle.jbo.ViewObject;
import oracle.jbo.client.Configuration;
import oracle.jbo.domain.Number;
import org.experibase.importer.utils.GetOpt;
import org.experibase.miamexpress.ExperimentFactorViewRowImpl;
import org.experibase.miamexpress.views.TardesinViewRowImpl;
import org.experibase.miamexpress.views.TarrayViewRowImpl;
import org.experibase.miamexpress.views.TauthorViewRowImpl;
import org.experibase.miamexpress.views.TctlvcbViewRowImpl;
import org.experibase.miamexpress.views.TexpfctrViewRowImpl;
import org.experibase.miamexpress.views.TexprmntViewRowImpl;
import org.experibase.miamexpress.views.TexprtypViewRowImpl;
import org.experibase.miamexpress.views.TextractViewRowImpl;
import org.experibase.miamexpress.views.ThybridViewRowImpl;
import org.experibase.miamexpress.views.TlabelViewRowImpl;
import org.experibase.miamexpress.views.TntxsynViewRowImpl;
import org.experibase.miamexpress.views.TothersViewRowImpl;
import org.experibase.miamexpress.views.TprotclsViewRowImpl;
import org.experibase.miamexpress.views.TpublicViewRowImpl;
import org.experibase.miamexpress.views.TsubmisViewRowImpl;
import org.experibase.miamexpress.views.TlabhybViewRowImpl;
import org.experibase.miamexpress.views.TsampleViewRowImpl;
import org.experibase.microarrays.SubmissionViewRowImpl;
import oracle.jbo.domain.*;
import org.experibase.microarrays.array.ArrayTableViewRowImpl;
import org.experibase.microarrays.arraydesign.ArrayDesignTableViewRowImpl;
import org.experibase.microarrays.bioassay.PhysicalBioAssayViewRowImpl;
import org.experibase.microarrays.biomaterial.BioSampleViewRowImpl;
import org.experibase.microarrays.biomaterial.LabeledExtractViewRowImpl;
import org.experibase.microarrays.bqs.PersonViewRowImpl;
import org.experibase.microarrays.bqs.PublicationAuthorViewRowImpl;
import org.experibase.microarrays.bqs.PublicationViewRowImpl;
import org.experibase.microarrays.description.DescriptionTableViewRowImpl;
import org.experibase.microarrays.experiment.ExperimentDesignViewRowImpl;
import org.experibase.microarrays.experiment.ExperimentTableViewRowImpl;
import org.experibase.microarrays.experiment.ExperimentTypeAssocViewRowImpl;
import org.experibase.microarrays.experiment.ExperimentTypeViewRowImpl;
import org.experibase.microarrays.experiment.ExperimentalFactorAssocViewRowImpl;
import org.experibase.microarrays.experiment.ExperimentalFactorViewRowImpl;

/**
 * Imports data in MIAMExpress into Experibase
 *
 * In order to import data from MIAMExpress, the MIAMExpress submission Id
 * is needed to find the data to import. In order to maintain the tie to the
 * MIAMExpress submission with an Experibase project, pass the experibase project Id as
 * well.
 * Data mapping was gained from reverse engineering MIAMExpress mage-ml converter tool
 * provided with miamexpress
 */
public class MiamexpressImporter {
    private static Logger logger = Logger.getLogger("org.experibase");

    private int experibaseId;
    private int miamexpressId;
    private ApplicationModule eModule;
    private ApplicationModule mxModule;
    private TsubmisViewRowImpl submisRow;
    private SubmissionViewRowImpl subImpl;
    private boolean createMode = true;

    public MiamexpressImporter(int experibaseId, int miamexpressId) {
        this.experibaseId = experibaseId;
        this.miamexpressId = miamexpressId;
        eModule = Configuration.createRootApplicationModule(
                "org.experibase.microarrays.ExperibaseModule", "ExperibaseModuleLocal");
        mxModule = Configuration.createRootApplicationModule(
                "org.experibase.miamexpress.views.MiamexpressModule", "MiamexpressModuleLocal");
    }

    public void importData() {
        findMXSubmission();
        findExperibaseSubmission();
        importExperimentData();
    }

    private void persist() {
        logger.info("Committing all changes made to the db");
        try {
            eModule.getTransaction().commit();
            logger.info("All changes commited to db");
        } catch (Exception e) {
            logger.log(Level.WARNING, "Exception thrown in committing to database", e);
        }
    }

    private void findMXSubmission() {
        logger.info("finding submission with id " + miamexpressId);
        ViewObject submis = mxModule.findViewObject("TsubmisView1");
        if (submis == null) {
            logger.log(Level.WARNING, "Could not find miamexpress submission");
            throw new RuntimeException("Cant find submision");
        }
        Key key = new Key(new Object[]{new Number(miamexpressId)});
        Row[] rows = submis.findByKey(key, 1);
        if (rows.length == 0) {
            logger.log(Level.WARNING, "Could not find miamexpress submission");
            System.err.println("Error finding miamexpress submission");
            throw new RuntimeException("Can't find miamexpress submission");
        }
        submisRow = (TsubmisViewRowImpl) rows[0];
        logger.info("Found miamexpress submission");
    }

    private void findExperibaseSubmission() {
        ViewObject submis = eModule.findViewObject("ExperimentSubmissionView");
        submis.setWhereClauseParam(0, new Number(experibaseId));
        submis.executeQuery();
        if (submis.hasNext()) {
            logger.info("Found existing row");
            subImpl = (SubmissionViewRowImpl) submis.next();
            if (subImpl.getExperibaseSubId().intValue() != experibaseId) {
                logger.warning("inconsistent state, miamexpress submission added previously with a different submision id. Exiting");
                System.exit(0);
            }
            createMode = false;
            logger.info("In update mode");
        } else {
            ViewObject allSubmissions = eModule.findViewObject("SubmissionView");
            logger.info("Found Submission View");
            createMode = true;
            subImpl = (SubmissionViewRowImpl) allSubmissions.createRow();
            subImpl.setExperibaseSubId(new Number(experibaseId));
            allSubmissions.insertRow(subImpl);
            logger.info("Created new row in submission table");
            eModule.getTransaction().commit();
            logger.info("Commit initial submission");
            logger.info("in create mode");
        }
    }

    /**
     * Imports experiment data into Experibase
     */
    private void importExperimentData() {
        //EXPERIMENTS
        RowIterator texperimentIterator = submisRow.getTexprmntView();
        if (!texperimentIterator.hasNext()) {
            logger.log(Level.WARNING, "Error in MIAMExpress");
            return;
        }
        TexprmntViewRowImpl experiment = (TexprmntViewRowImpl) texperimentIterator.next();
        logger.info("experiment id: " + experiment.getTexprmntExprid());
        if (createMode)
            createExperiment(experiment);
        else
            updateExperiment(experiment);

        //PUBLICATION
        RowIterator tpubs = submisRow.getTpublicView();
        while (tpubs.hasNext()) {
            TpublicViewRowImpl tpub = (TpublicViewRowImpl) tpubs.next();
            if (createMode)
                createPublication(tpub);
            else
                updatePublication(tpub);
        }

//        //LABELEDHYBRIDS
//        RowIterator labelHybridIterator = submisRow.getTlabhybView();
//        while(labelHybridIterator.hasNext())
//        {
//            TlabhybViewRowImpl labeledHybrid = (TlabhybViewRowImpl)labelHybridIterator.next();
//            if(create)
//                createLabeledHybrid(labeledHybrid);
//        }
        persist();
    }

    private void createPublication(TpublicViewRowImpl tpub) {
        RowIterator pubs = subImpl.getPublicationView();
        PublicationViewRowImpl pub = (PublicationViewRowImpl) pubs.createRow();
        String identifier = "MIAMEXPRESS:PUBLICATION:" + tpub.getTpublicSysuid();
        pub.setIdentifier(identifier);
        pub.setFirstPage(tpub.getTpublicFirstPage().toString());
        pub.setLastPage(tpub.getTpublicLastPage().toString());
        pub.setTitle(tpub.getTpublicTitle());
        pub.setYear(tpub.getTpublicYear().toString());
        pub.setVolume(tpub.getTpublicVolume());
        pub.setStatus(getCV(new Number(tpub.getTpublicStatus().intValue())));
        pub.setJournal(getCV(new Number(tpub.getTpublicPublication().intValue())));
        pubs.insertRow(pub);
        persist();

        //adding authors
        RowIterator tauthors = tpub.getTauthorView1();
        RowIterator assocs = pub.getPublicationAuthorView();
        while (tauthors.hasNext()) {
            TauthorViewRowImpl tauthor = (TauthorViewRowImpl) tauthors.next();
            ViewObject persons = eModule.findViewObject("PersonView1");
            Row person = persons.createRow();
            person.setAttribute("FirstName", tauthor.getTauthorFname());
            person.setAttribute("LastName", tauthor.getTauthorLname());
            person.setAttribute("MidInitials", tauthor.getTauthorInitial());
            persons.insertRow(person);
            PublicationAuthorViewRowImpl assoc =
                    (PublicationAuthorViewRowImpl) assocs.createRow();
            assoc.setAuthorId((Number) person.getAttribute("Id"));
            assoc.setPublicationId(pub.getId().getSequenceNumber());
            assocs.insertRow(assoc);
        }
    }

    private void updatePublication(TpublicViewRowImpl tpub) {
        PublicationViewRowImpl pub = getPublicationForTpub(tpub);
        if (pub == null)
        {
            createPublication(tpub);
            return;
        }
        String identifier = "MIAMEXPRESS:PUBLICATION:" + tpub.getTpublicSysuid();
        pub.setIdentifier(identifier);
        pub.setFirstPage(tpub.getTpublicFirstPage().toString());
        pub.setLastPage(tpub.getTpublicLastPage().toString());
        pub.setTitle(tpub.getTpublicTitle());
        pub.setYear(tpub.getTpublicYear().toString());
        pub.setVolume(tpub.getTpublicVolume());
        pub.setStatus(getCV(new Number(tpub.getTpublicStatus().intValue())));
        pub.setJournal(getCV(new Number(tpub.getTpublicPublication().intValue())));

        //adding authors
        RowIterator tauthors = tpub.getTauthorView1();
        RowIterator assocs = pub.getPublicationAuthorView();
        while (tauthors.hasNext()) {
            TauthorViewRowImpl tauthor = (TauthorViewRowImpl) tauthors.next();
            ViewObject persons = eModule.findViewObject("PersonView1");
            Row person = persons.createRow();
            person.setAttribute("FirstName", tauthor.getTauthorFname());
            person.setAttribute("LastName", tauthor.getTauthorLname());
            person.setAttribute("MidInitials", tauthor.getTauthorInitial());
            persons.insertRow(person);
            PublicationAuthorViewRowImpl assoc =
                    (PublicationAuthorViewRowImpl) assocs.createRow();
            assoc.setAuthorId((Number) person.getAttribute("Id"));
            assoc.setPublicationId(pub.getId().getSequenceNumber());
            assocs.insertRow(assoc);
        }
    }

    private void createExperiment(TexprmntViewRowImpl row) {
        //Get experiment table in experibase
        RowIterator experimentIter = subImpl.getExperiments();
        //identifier
        String identifier = "MIAMEXPRESS:EXPERIMENT" + row.getTexprmntExprid();
        //add experiment
        ExperimentTableViewRowImpl newExperiment =
                (ExperimentTableViewRowImpl) experimentIter.createRow();
        newExperiment.setIdentfier(identifier);
        newExperiment.setName(submisRow.getTsubmisSubDescr());
        experimentIter.insertRow(newExperiment);
        //addExperiment design
        ViewObject experimentDesigns = eModule.findViewObject("ExperimentDesignView1");
        ExperimentDesignViewRowImpl experimentDesign =
                (ExperimentDesignViewRowImpl) experimentDesigns.createRow();
        experimentDesign.setExperimentTableId(newExperiment.getId());
        experimentDesign.setDescription(row.getTexprmntDescr());
        experimentDesign.setHoldDate(submisRow.getTsubmisHoldDate().toString());
        experimentDesigns.insertRow(experimentDesign);

        //Experimental Factors
        RowIterator texpFactorsIter = row.getTexpfctrView();
        RowIterator expFactorsAssocIter = experimentDesign.getExperimentalFactorAssocView();
        while (texpFactorsIter.hasNext()) {
            TexpfctrViewRowImpl texpfct = (TexpfctrViewRowImpl) texpFactorsIter.next();
            ExperimentalFactorAssocViewRowImpl assoc =
                    (ExperimentalFactorAssocViewRowImpl) expFactorsAssocIter.createRow();
            String value = getCV(new Number(texpfct.getTexpfctrId().intValue()));
            ExperimentalFactorViewRowImpl factor = getExperimentFactorForValue(value);
            assoc.setExperimentalFactorId(factor.getId().getSequenceNumber());
            expFactorsAssocIter.insertRow(assoc);
        }

        //Experiment Types
        RowIterator texpTypesIter = row.getTexprtypView();
        RowIterator experimentTypeAssocIter = experimentDesign.getExperimentTypeAssocView();
        while (texpTypesIter.hasNext()) {
            TexprtypViewRowImpl texpType = (TexprtypViewRowImpl) texpTypesIter.next();
            ExperimentTypeAssocViewRowImpl assoc =
                    (ExperimentTypeAssocViewRowImpl) experimentTypeAssocIter.createRow();
            String value = getCV(new Number(texpType.getTexprtypId().intValue()));
            ExperimentTypeViewRowImpl expType = getExperimentTypeForValue(value);
            assoc.setExperimentTypeId(expType.getId().getSequenceNumber());
            experimentTypeAssocIter.insertRow(assoc);
        }

        RowIterator tothersIter = row.getTothersView();
        while (tothersIter.hasNext()) {
            TothersViewRowImpl tothers = (TothersViewRowImpl) tothersIter.next();
            if (tothers.getTothersId().equals("EXPERIMENTTYPE")) {
                ExperimentTypeAssocViewRowImpl assoc =
                        (ExperimentTypeAssocViewRowImpl) experimentTypeAssocIter.createRow();
                ExperimentTypeViewRowImpl expType =
                        getExperimentTypeForValue(tothers.getTothersValue());
                assoc.setExperimentTypeId(expType.getId().getSequenceNumber());
                experimentTypeAssocIter.insertRow(assoc);
            } else if (tothers.getTothersId().equals("EXPERIMENTALFACTOR")) {
                ExperimentalFactorAssocViewRowImpl assoc =
                        (ExperimentalFactorAssocViewRowImpl) expFactorsAssocIter.createRow();
                ExperimentalFactorViewRowImpl factor =
                        getExperimentFactorForValue(tothers.getTothersValue());
                assoc.setExperimentalFactorId(factor.getId().getSequenceNumber());
                expFactorsAssocIter.insertRow(assoc);
            }
        }

        //create Samples
        RowIterator tsamplesIter = row.getTsampleView();
        while (tsamplesIter.hasNext()) {
            TsampleViewRowImpl tsample = (TsampleViewRowImpl) tsamplesIter.next();
            logger.info("Viewing sample id:" + tsample.getTsampleSysuid());
            createSample(newExperiment, tsample);
        }
        persist();
    }

    /**
     * Updates experibase experiment
     * @param texperiment
     */
    private void updateExperiment(TexprmntViewRowImpl texperiment) {
        RowIterator experimentIter = subImpl.getExperiments();
        String identifier = "MIAMEXPRESS:EXPERIMENT" + texperiment.getTexprmntExprid();
        if (!experimentIter.hasNext()) {
            logger.log(Level.WARNING, "ExperibaseExperiment not found, creating experiment");
            createExperiment(texperiment);
            return;
        }
        ExperimentTableViewRowImpl experiment =
                (ExperimentTableViewRowImpl) experimentIter.next();
        experiment.setIdentfier(identifier);
        experiment.setName(submisRow.getTsubmisSubDescr());
        ExperimentDesignViewRowImpl experimentDesign =
                (ExperimentDesignViewRowImpl) experiment.getExperimentDesignView();
        experimentDesign.setDescription(texperiment.getTexprmntDescr());
        experimentDesign.setHoldDate(submisRow.getTsubmisHoldDate().toString());

        //ExperimentFactors
        RowIterator experimentalFactorAssocIter =
                experimentDesign.getExperimentalFactorAssocView();
        while (experimentalFactorAssocIter.hasNext())
            experimentalFactorAssocIter.next().remove();
        RowIterator texpFactorsIter = texperiment.getTexpfctrView();
        while (texpFactorsIter.hasNext()) {
            TexpfctrViewRowImpl texpfct = (TexpfctrViewRowImpl) texpFactorsIter.next();
            ExperimentalFactorAssocViewRowImpl assoc =
                    (ExperimentalFactorAssocViewRowImpl) experimentalFactorAssocIter.createRow();
            String value = getCV(new Number(texpfct.getTexpfctrId().intValue()));
            ExperimentalFactorViewRowImpl factor = getExperimentFactorForValue(value);
            assoc.setExperimentalFactorId(factor.getId().getSequenceNumber());
            experimentalFactorAssocIter.insertRow(assoc);
        }

        //Experiment Types
        RowIterator experimentTypeAssocIter = experimentDesign.getExperimentTypeAssocView();
        while (experimentTypeAssocIter.hasNext())
            experimentTypeAssocIter.next().remove();
        RowIterator texpTypesIter = texperiment.getTexprtypView();
        while (texpTypesIter.hasNext()) {
            TexprtypViewRowImpl texpType = (TexprtypViewRowImpl) texpTypesIter.next();
            ExperimentTypeAssocViewRowImpl assoc =
                    (ExperimentTypeAssocViewRowImpl) experimentTypeAssocIter.createRow();
            String value = getCV(new Number(texpType.getTexprtypId().intValue()));
            ExperimentTypeViewRowImpl expType = getExperimentTypeForValue(value);
            assoc.setExperimentTypeId(expType.getId().getSequenceNumber());
            experimentTypeAssocIter.insertRow(assoc);
        }

        //create Samples
        RowIterator tsamplesIter = texperiment.getTsampleView();
        while (tsamplesIter.hasNext()) {
            TsampleViewRowImpl tsample = (TsampleViewRowImpl) tsamplesIter.next();
            logger.info("Viewing sample id:" + tsample.getTsampleSysuid());
            BioSampleViewRowImpl sample = getSampleForTsample(tsample);
            if (sample != null) {
                updateSample(sample, tsample);
                if (!sample.getExperimentId().equals(experiment.getId()))
                    sample.setExperimentId(experiment.getId());
            } else {
                logger.log(Level.WARNING, "Could not find sample, so creating new one");
                createSample(experiment, tsample);
            }
        }
    }

    private ExperimentTypeViewRowImpl getExperimentTypeForValue(String value) {
        ViewObject experimentTypes = eModule.findViewObject("ExperimentTypeQueryView1");
        experimentTypes.setWhereClauseParam(0, value);
        experimentTypes.executeQuery();
        if (experimentTypes.hasNext()) {
            return (ExperimentTypeViewRowImpl) experimentTypes.next();
        } else {
            experimentTypes = eModule.findViewObject("ExperimentTypeView1");
            ExperimentTypeViewRowImpl experimentType =
                    (ExperimentTypeViewRowImpl) experimentTypes.createRow();
            experimentType.setValue(value);
            experimentType.setSource("MIAMExpress");
            experimentType.setDescription("Experiment Type");
            experimentTypes.insertRow(experimentType);
            persist();
            return experimentType;
        }
    }

    private ExperimentalFactorViewRowImpl getExperimentFactorForValue(String value) {
        ViewObject experimentFactors = eModule.findViewObject("ExperimentFactorQueryView1");
        experimentFactors.setWhereClauseParam(0, value);
        experimentFactors.executeQuery();
        if (experimentFactors.hasNext()) {
            return (ExperimentalFactorViewRowImpl) experimentFactors.next();
        } else {
            experimentFactors = eModule.findViewObject("ExperimentalFactorView1");
            ExperimentalFactorViewRowImpl experimentFactor =
                    (ExperimentalFactorViewRowImpl) experimentFactors.createRow();
            experimentFactor.setValue(value);
            experimentFactor.setSource("MIAMExpress");
            experimentFactor.setDescription("Experiment Type");
            experimentFactors.insertRow(experimentFactor);
            persist();
            return experimentFactor;
        }
    }

    private void createSample(ExperimentTableViewRowImpl exp, TsampleViewRowImpl tsample) {
        RowIterator samples = exp.getBioSampleView();
        BioSampleViewRowImpl sample = (BioSampleViewRowImpl) samples.createRow();
        sample.setIdentifier("MIAMEXPRESS:BIOSAMPLE:" + tsample.getTsampleSysuid());
        if (tsample.getTsampleAdditional() != null)
            sample.setAdditional(tsample.getTsampleAdditional());
        if (tsample.getTsampleAgerangeMax() != null)
            sample.setAgeRangeMax(new Number(tsample.getTsampleAgerangeMax().floatValue()));
        if (tsample.getTsampleAgerangeMin() != null)
            sample.setAgeRangeMin(new Number(tsample.getTsampleAgerangeMin().floatValue()));
        if (tsample.getTsampleAgeStatus() != null)
            sample.setAgeStatus(getCV(new Number(tsample.getTsampleAgeStatus().intValue())));
        if (tsample.getTsampleCellLine() != null)
            sample.setCellLine(tsample.getTsampleCellLine());
        if (tsample.getTsampleCellProvider() != null)
            sample.setCellProvider(tsample.getTsampleCellProvider());
        if (tsample.getTsampleDevStage() != null)
            sample.setDevStage(getCV(new Number(tsample.getTsampleDevStage().intValue())));
        if (tsample.getTsampleDiseaseState() != null)
            sample.setDiseaseState(tsample.getTsampleDiseaseState());
        if (tsample.getTsampleGeneticVariation() != null)
            sample.setGeneticVariation(getCV(new Number(tsample.getTsampleGeneticVariation().intValue())));
        if (tsample.getTsampleIndividual() != null)
            sample.setIndividual(tsample.getTsampleIndividual());
        if (tsample.getTsampleIndividualGen() != null)
            sample.setIndividualGen(tsample.getTsampleIndividualGen());
        if (tsample.getTsampleOrganismPart() != null)
            sample.setOrganismPart(tsample.getTsampleOrganismPart());
        if (tsample.getTsampleSampleType() != null)
            sample.setSampleType(getCV(new Number(tsample.getTsampleSampleType().intValue())));
        if (tsample.getTsampleSeparationTech() != null)
            sample.setSeparationTech(getCV(new Number(tsample.getTsampleSeparationTech().intValue())));
        if (tsample.getTsampleSex() != null)
            sample.setSex(getCV(new Number(tsample.getTsampleSex().intValue())));
        if (tsample.getTsampleTargetCellType() != null)
            sample.setTargetCellType(tsample.getTsampleTargetCellType());
        if (tsample.getTntxsynView() != null) {
            TntxsynViewRowImpl taxonomy = (TntxsynViewRowImpl) tsample.getTntxsynView();
            sample.setTaxonomy(taxonomy.getTntxsynNameTxt());
        }
        if (tsample.getTsampleTimePoint() != null)
            sample.setTimePoint(getCV(new Number(tsample.getTsampleTimePoint().intValue())));
        if (tsample.getTsampleTimeUnit() != null)
            sample.setTimeUnit(getCV(new Number(tsample.getTsampleTimeUnit().intValue())));
        samples.insertRow(sample);
        persist();
    }

    private BioSampleViewRowImpl getSampleForTsample(TsampleViewRowImpl tsample) {
        String identifier = "MIAMEXPRESS:BIOSAMPLE:" + tsample.getTsampleSysuid();
        ViewObject query = eModule.findViewObject("BioSampleQueryView1");
        query.setWhereClauseParam(0, identifier);
        query.executeQuery();
        if (query.hasNext()) {
            BioSampleViewRowImpl sample = (BioSampleViewRowImpl) query.next();
            logger.info("found biosample id: " + sample.getId().getSequenceNumber());
            return sample;
        }
        else return null;
    }

    private PublicationViewRowImpl getPublicationForTpub(TpublicViewRowImpl tpub) {
        String identifier = "MIAMEXPRESS:PUBLICATION:" + tpub.getTpublicSysuid();
        ViewObject query = eModule.findViewObject("PublicationQueryView1");
        query.setWhereClauseParam(0, identifier);
        query.executeQuery();
        if (query.hasNext()) {
            PublicationViewRowImpl pub = (PublicationViewRowImpl) query.next();
            return pub;
        }
        else return null;
    }

    private void updateSample(BioSampleViewRowImpl sample, TsampleViewRowImpl tsample)
    {
        if (tsample.getTsampleAdditional() != null)
            sample.setAdditional(tsample.getTsampleAdditional());
        if (tsample.getTsampleAgerangeMax() != null)
            sample.setAgeRangeMax(new Number(tsample.getTsampleAgerangeMax().floatValue()));
        if (tsample.getTsampleAgerangeMin() != null)
            sample.setAgeRangeMin(new Number(tsample.getTsampleAgerangeMin().floatValue()));
        if (tsample.getTsampleAgeStatus() != null)
            sample.setAgeStatus(getCV(new Number(tsample.getTsampleAgeStatus().intValue())));
        if (tsample.getTsampleCellLine() != null)
            sample.setCellLine(tsample.getTsampleCellLine());
        if (tsample.getTsampleCellProvider() != null)
            sample.setCellProvider(tsample.getTsampleCellProvider());
        if (tsample.getTsampleDevStage() != null)
            sample.setDevStage(getCV(new Number(tsample.getTsampleDevStage().intValue())));
        if (tsample.getTsampleDiseaseState() != null)
            sample.setDiseaseState(tsample.getTsampleDiseaseState());
        if (tsample.getTsampleGeneticVariation() != null)
            sample.setGeneticVariation(getCV(new Number(tsample.getTsampleGeneticVariation().intValue())));
        if (tsample.getTsampleIndividual() != null)
            sample.setIndividual(tsample.getTsampleIndividual());
        if (tsample.getTsampleIndividualGen() != null)
            sample.setIndividualGen(tsample.getTsampleIndividualGen());
        if (tsample.getTsampleOrganismPart() != null)
Thu May 19 22:50:34 2005
MiamexpressImporter.java
sample.setOrganismPart (tsample.getTsampleOrganismPart ));
if(tsample.getTsampleSampleType() !=null)
sample.setSampleType(getCV(new Number(tsample.getTsampleSampleType() .intVa
lue()
));
if(tsample.getTsampleSeparationTech()
null)
sample.setSeparationTech(getCV(new Number(tsample.getTsampleSeparationTech
().intValue()));
null)
if(tsample.getTsampleSex()
sample.setSex(getCV(new Number(tsample.getTsampleSex() . intValue ()));
if (tsample.getTsampleTargetCellType() != null)
sample.setTargetCellType(tsample.getTsampleTargetCellType) );
if (tsample.getTntxsynView) != null)
TntxsynViewRowImpl taxonomy = (TntxsynViewRowImpl)tsample.getTntxsynView()
sample.setTaxonomy(taxonomy.getTntxsynNameTxt));
if (tsample.getTsampleTimePoint)
6
experibasePba.setLabeledExtractId(labeledExtract.getId) .getSequenceNumber
//
());
persist );
//
//
/'/
//
//
//
//
////
//
//
//
//
//
//
//import protocols
while(protocols.hasNext()
TprotclsViewRowImpl protocol = (TprotclsViewRowImpl)protocols.next );
//import scan protocol;
if(scanProtocol != null)
I
= null)
sample. setTimePoint (getCV(new Number(tsample.getTsampleTimePoint () . intValu
e)))));
if (tsample.getTsampleTimeUnit()
= null)
sample.setTimeUnit(getCV(new Number(tsample.getTsampleTimeUnit () .intValue(
private void createLabeledHybrid(TlabhybViewRowImpl row)
//
//
//
ThybridViewRowImpl thybrid = (ThybridViewRowImpl)row.getThybridView();
TlabelViewRowImpl tlabel = (TlabelViewRowImpl)row.getTlabelView();
//
//
TarrayViewRowImpl tarray = (TarrayViewRowImpl)thybrid.getTarrayView();
TextractViewRowImpl textract = (TextractViewRowImpl)tlabel.getTextractView();
//
//
//
//create pba
RowIterator pbas = subImpl.getPhysicalBioAssays();
PhysicalBioAssayViewRowImpl pba = (PhysicalBioAssayViewRowImpl)pbas.createRow();
//
pba.setIdentifier("MIAMEXPRESS: "+row.getTlabhybSysuid());
//pbas.insertRow(pba);
//
//
//import array
//
ViewObject experibaseArrays = eModule.findViewObject("ArrayView");
if(arrays.hasNext())
//
//
//
//
//import label
//ViewObject labeledExtracts = eModule.findViewObject("LabeledExtract");
//
while(labels.hasNext())
////TlabelViewRowImpl label = (TlabelViewRowImpl)labels.next();
//
//LabeledExtractViewRowImpl labeledExtract = (LabeledExtractViewRowImpl)labeledExtracts.createRow();
//labeledExtract.setName(label.getTlabelId());
//
labeledExtracts.insertRow(labeledExtract);
//
persist();
private void importExperimentDesign()
//
//create record in database
//
ViewObject experimentDesigns = eModule.findViewObject("ExperimentDesignView1");
//
ExperimentDesignViewRowImpl row = (ExperimentDesignViewRowImpl)experimentDesigns.createRow();
row.setExperimentTableId(expImpl.getId());
//
//
//
//add types
RowIterator types = row.getExperimentTypeView();
//
RowIterator ctlvcbs = mx_exp.getExperimentTypeView();
//
//
while(ctlvcbs.hasNext())
//
//
{
TctlvcbViewRowImpl ctlvcb = (TctlvcbViewRowImpl)ctlvcbs.next();
//
ExperimentTypeViewRowImpl type = (ExperimentTypeViewRowImpl)types.createRow();
//
type.setSource("MIAMEXPRESS");
//
type.setDescription(ctlvcb.getTctlvcbDescr());
//
type.setValue(ctlvcb.getTctlvcbValue());
//
types.insertRow(type);
//
//
//
}
//
//
experimentDesigns.insertRow(row);
private String getCV(Number number)
{
    ViewObject tctlvcbView = mxModule.findViewObject("TctlvcbView1");
    logger.info("Finding ctrlvcb with key " + number);
    Key key = new Key(new Object[]{number});
    Row[] rows = tctlvcbView.findByKey(key, 1);
    logger.info("found " + rows.length);
    if(rows.length == 1)
    {
        TctlvcbViewRowImpl row = (TctlvcbViewRowImpl)rows[0];
        return row.getTctlvcbValue();
    }
    else
        return null;
}
/**
 * Transfers data from Miamexpress to Experibase. First argument is miamexpress
 * submission Id, second argument is experibase submission Id;
 * @param args First argument is miamexpress submission Id, second argument is
 * experibase submission Id;
 */
public static void main(String[] args)
{
    try
    {
        FileHandler fh = new FileHandler("miamexpesslog.txt", 5242880, 1, false);
        logger.addHandler(fh);
    }
    catch (IOException e)
    {
        //ignore
    }
    logger.setLevel(Level.WARNING);
    GetOpt go = new GetOpt(args, "vhm:e:");
    go.optErr = true;
    int ch = -1;
    // process options in command line arguments
    boolean usagePrint = false;
    int miamexpressId = -1;
    int experibaseId = -1;
    while ((ch = go.getopt()) != go.optEOF)
    {
        if ((char)ch == 'h') usagePrint = true;
        else if ((char)ch == 'm')
        {
            miamexpressId = go.processArg(go.optArgGet(), miamexpressId);
            logger.info("MiamexpressId is " + miamexpressId);
        }
        else if ((char)ch == 'e')
        {
            experibaseId = go.processArg(go.optArgGet(), experibaseId);
            logger.info("ExperibaeId is " + experibaseId);
        }
        else if ((char)ch == 'v')
            logger.setLevel(Level.ALL);
        else doHelp(1); // undefined option
    }
    if (usagePrint)
        doHelp(0);
    if(miamexpressId < 0 || experibaseId < 0)
        doHelp(1);
    else
    {
        MiamexpressImporter importer = new MiamexpressImporter(experibaseId, miamexpressId);
        importer.importData();
        importer.persist();
    }
}

/**
 * Stub for providing help on usage
 * You can write a longer help than this, certainly.
 */
static void doHelp(int returnValue)
{
    System.err.println("Usage: MiameExpressImporter -m miamexpressId -e experibaseId");
    System.err.println("\tp\tExperibaseId");
    System.err.println("\th\tPrints this menu");
    logger.info("help shown with return value " + returnValue);
    System.exit(returnValue);
}
NameValuePair. java
Wed Mar 09 01:53:30 2005
1
/*
 * Created on Mar 8, 2005
 *
 * TODO To change the template for this generated file go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
package edu.mit.data.affymetrix;

import java.io.Serializable;

/**
 * @author Aidan Downes
 *
 * TODO To change the template for this generated type comment go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
public class NameValuePair implements Serializable{

    /**
     * Comment for <code>serialVersionUID</code>
     */
    private static final long serialVersionUID = 1L;
    private String name;
    private String value;

    /**
     *
     */
    public NameValuePair()
    {
        super();
    }

    /**
     * @param name
     * @param value
     */
    public NameValuePair(String name, String value)
    {
        super();
        this.name = name;
        this.value = value;
    }

    /**
     * @return Returns the name.
     */
    public String getName()
    {
        return name;
    }

    /**
     * @param name The name to set.
     */
    public void setName(String name)
    {
        this.name = name;
    }

    /**
     * @return Returns the value.
     */
    public String getValue()
    {
        return value;
    }

    /**
     * @param value The value to set.
     */
    public void setValue(String value)
    {
        this.value = value;
    }

    /*
     * (non-Javadoc)
     * @see java.lang.Object#toString()
     */
    public String toString()
    {
        return new StringBuffer().append(name).append("=>").append(value).toString();
    }
}
AffymetrixImporter.java
Thu May 12 13:36:46 2005
package org.experibase.importer;
import edu.mit.data.affymetrix.CELData;
import edu.mit.data.affymetrix.CELFileEntryType;
import edu.mit.data.affymetrix.CELHeaderData;
import edu.mit.data.affymetrix.CHPData;
import edu.mit.data.affymetrix.CHPHeaderData;
import edu.mit.data.affymetrix.ExperimentData;
import edu.mit.data.affymetrix.ExpressionProbeSetResults;
import edu.mit.data.affymetrix.NameValuePair;
import java.io.IOException;
import java.io.StringWriter;
import java.util.Iterator;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;
import oracle.jbo.domain.Number;
import org.biomage.Array.ArrayManufacture;
import org.biomage.Array.Array_package;
import org.biomage.ArrayDesign.ArrayDesign_package;
import org.biomage.ArrayDesign.PhysicalArrayDesign;
import org.biomage.AuditAndSecurity.AuditAndSecurity_package;
import org.biomage.AuditAndSecurity.Person;
import org.biomage.BioAssay.BioAssay_package;
import org.biomage.BioAssay.Channel;
import org.biomage.BioAssay.Hybridization;
import org.biomage.BioAssay.ImageAcquisition;
import org.biomage.BioAssay.PhysicalBioAssay;
import org.biomage.BioAssayData.BioAssayData_package;
import org.biomage.BioAssayData.DerivedBioAssayData;
import org.biomage.BioAssayData.Transformation;
import org.biomage.BioEvent.BioEvent_package;
import org.biomage.BioMaterial.BioMaterial_package;
import org.biomage.BioMaterial.BioSource;
import org.biomage.BioMaterial.Compound;
import org.biomage.Common.MAGEJava;
import org.biomage.Common.NameValueType;
import org.biomage.Description.Description;
import org.biomage.Description.OntologyEntry;
import org.biomage.Experiment.Experiment;
import org.biomage.Experiment.Experiment_package;
import org.biomage.Protocol.Hardware;
import org.biomage.Protocol.HardwareApplication;
import org.biomage.Protocol.Parameter;
import org.biomage.Protocol.ParameterValue;
import org.biomage.Protocol.Protocol;
import org.biomage.Protocol.ProtocolApplication;
import org.biomage.Protocol.Protocol_package;
import oracle.jbo.*;
import oracle.jbo.domain.*;
import oracle.jbo.client.Configuration;
import org.experibase.microarrays.SubmissionViewRowImpl;
import org.experibase.microarrays.arraydesign.ArrayDesignTableViewRowImpl;
import org.experibase.microarrays.bioassaydata.AffyCelDataViewRowImpl;
import org.experibase.microarrays.bioassaydata.AffyChpDataViewRowImpl;
import org.experibase.microarrays.bioassaydata.CELAnalysisViewRowImpl;
import org.experibase.microarrays.bioassaydata.CHPAnalysisViewRowImpl;
import org.experibase.microarrays.bioassaydata.DerivedBioAssayDataViewRowImpl;
import org.experibase.microarrays.bioassaydata.MeasuredBioAssayDataViewImpl;
import org.experibase.microarrays.bioassaydata.MeasuredBioAssayDataViewRowImpl;
import org.experibase.microarrays.common.*;
import org.experibase.microarrays.experiment.ExperimentTableViewRowImpl;
public class AffymetrixImporter
{
    private int experibaseId;
    private MAGEJava mage;
    private BioAssay_package bap;
    private BioEvent_package bep;
    private Protocol_package pp;
    private BioMaterial_package bmp;
    private AuditAndSecurity_package aasp;
    private Array_package ap;
    private ArrayDesign_package adp;
    private Experiment_package ep;
    private Experiment exp;
    private BioAssayData_package badp;
    private Protocol protocol;
    private static Logger logger = Logger.getLogger("org.experibase");
    private ApplicationModule eModule;
    private ExperimentTableViewRowImpl expImpl;
    private SubmissionViewRowImpl subImpl;
    private AffyChpDataViewRowImpl dbaImpl;
    private AffyCelDataViewRowImpl mbaImpl;

    /**
     * Creates importer for particular affymetrix submission
     * @param gId The Experibase GroupId
     * @param pId The Experibase ProjectId
     */
    public AffymetrixImporter(int eId)
    {
        this.experibaseId = eId;
        init();
    }

    private void init()
    {
        logger.entering("AffymetrixImporter", "init");
        eModule = Configuration.createRootApplicationModule("org.experibase.microarrays.ExperibaseModule", "ExperibaseModuleLocal");
        logger.info("Found application module");
        findSubmission();
        RowIterator iter = subImpl.getExperiments();
        if(iter.hasNext())
        {
            expImpl = (ExperimentTableViewRowImpl)iter.next();
            logger.info("Found existing experiment");
        }
        else
        {
            expImpl = (ExperimentTableViewRowImpl)iter.createRow();
            iter.insertRow(expImpl);
            logger.info("Created new row in experiment table");
        }
        logger.exiting("AffymetrixImporter", "init");
    }

    private void findSubmission()
    {
        ViewObject submis = eModule.findViewObject("ExperimentSubmissionView");
        submis.setWhereClauseParam(0, new Number(experibaseId));
        submis.executeQuery();
        if(submis.hasNext())
        {
            logger.info("Found existing row");
            subImpl = (SubmissionViewRowImpl)submis.next();