Adding Custom Agilent Annotations to the GenMAPP database
This document describes the process of adding custom Agilent annotation to the GenMAPP
Gene Database. These instructions require that the Microsoft Office Access database application
is installed.
1 Principles of the GenMAPP Gene Database and strategy for adding annotations
2 Formatting the data
2.1 Formatting the Agilent Gene Table
2.2 Formatting the Relationship Tables
3 Importing the data in Access
1 Principles of the GenMAPP Gene Database and strategy for adding annotations
The GenMAPP Gene Database is a species-specific library of gene information used by the
GenMAPP program. It contains the relationships between gene IDs from different systems.
The Gene Database contains two main types of information: Gene Tables and Relationship
Tables.
A Gene Table is a collection of gene identifiers from a gene or probe cataloging system for your
species. In addition to the gene identifiers, the Gene Table can also contain other annotation,
such as gene symbol, gene name, or gene description. The Agilent probe set identifiers for all
Agilent arrays supported in a database are stored in the Agilent Gene Table.
A Relationship Table provides links between the gene IDs of two gene ID systems. If your Gene
Database contains the two systems A and B as Gene Tables, it should also contain the
Relationship Table A – B. For example, if you have UniGene and UniProt annotations for your
Agilent probes, then you need two Relationship Tables: Agilent-UniGene and Agilent-UniProt.
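As a rough illustration of how a Relationship Table links two ID systems, the following Python sketch models a Gene Table and a Relationship Table as plain in-memory structures. The probe IDs, UniGene IDs, and the helper name related_ids are made up for illustration; this is a toy model, not how GenMAPP stores or queries its data.

```python
# Toy model of the two table types. All IDs below are made-up placeholders.

# Gene Table: one entry per Agilent probe ID, plus optional annotation.
agilent_gene_table = {
    "A_EXAMPLE_1": {"Remarks": "custom probe"},
    "A_EXAMPLE_2": {"Remarks": ""},
}

# Relationship Table Agilent-UniGene: pairs of (Agilent ID, UniGene ID).
agilent_unigene = [
    ("A_EXAMPLE_1", "Xx.0001"),
    ("A_EXAMPLE_1", "Xx.0002"),
    ("A_EXAMPLE_2", "Xx.0003"),
]


def related_ids(relationship_table, primary_id):
    """Return every ID of the second system that is linked to primary_id."""
    return [related for primary, related in relationship_table
            if primary == primary_id]


print(related_ids(agilent_unigene, "A_EXAMPLE_1"))
```

One probe can be linked to several IDs in the other system, which is why the relationship is kept as a separate list of pairs rather than as a column in the Gene Table.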
The strategy used in this document for adding annotations to the database does not rely on the
GenMAPP built-in features for customizing the database, because these features do not allow
additions to tables that already exist in the database (such as the Agilent table).
Instead, the new annotations are added directly to the existing tables. For this to work, the
format of the new annotations has to exactly match that of the existing table.
2 Formatting the data
The input data for custom arrays can come from an Agilent annotation file, or you may have
created the annotations by some other means. To be imported into the GenMAPP database, the
annotations have to be formatted correctly.
2.1 Formatting the Agilent Gene Table
NOTE: This step is only necessary if specific annotations exist for the probes, such as
custom descriptions, or if a direct web link can be established to a remote site, which
will then be linked to from GenMAPP backpages and the Gene Finder.
2.1.1 Open the Agilent annotations in MS Excel. Change the column headers (case sensitive)
and content to match the following scheme: (see example figure below)
ID        Contains the probe IDs.
Chip      Contains the name of the Agilent array used. This column should have the same
          entry in every row for a given array.
Species   Two-letter species code, for example |Dm| for Drosophila melanogaster.
Date      Date of annotation. It is recommended to leave the Date field blank, since its
          format is usually not recognized by Access.
Remarks   Can contain comments you have about specific probes, or it can be empty.
2.1.2 Save the file as a tab-delimited text file (.txt) and close the file.
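If the annotations are prepared with a script rather than in Excel, the same layout can be produced programmatically. The following is a minimal Python sketch, assuming the probe annotations are available as (probe ID, remark) pairs; the function name, the sample probe IDs, and the array name are hypothetical, and the species value is written exactly as passed in.

```python
import csv
import io

# Column headers from step 2.1.1 (case sensitive).
GENE_TABLE_COLUMNS = ["ID", "Chip", "Species", "Date", "Remarks"]


def format_gene_table(probes, chip, species):
    """Build tab-delimited Gene Table text.

    probes  -- iterable of (probe_id, remark) pairs (assumed input shape)
    chip    -- array name, repeated on every row
    species -- species entry, written as given
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(GENE_TABLE_COLUMNS)
    for probe_id, remark in probes:
        # The Date field is left blank, as recommended in 2.1.1.
        writer.writerow([probe_id, chip, species, "", remark])
    return buffer.getvalue()


text = format_gene_table(
    [("A_EXAMPLE_1", "custom probe"), ("A_EXAMPLE_2", "")],
    chip="MyCustomArray",
    species="|Dm|",
)
print(text)
```

The returned text can be written to a .txt file with open(path, "w", newline="") so that the line endings are not altered on Windows.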
2.2 Formatting the Relationship Tables
For each annotation available for the Agilent array (Unigene, UniProt, etc.), a Relationship Table
has to be created according to the instructions below. The instructions refer to the
Agilent-Unigene relationship as an example.
2.2.1 Open the file containing the Agilent-Unigene relationship in MS Excel. Remove the
column headers to match the following scheme: (see example figure below). Save the file as
a text file (.txt).
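The header removal in 2.2.1 can also be done with a short script. The sketch below is only an illustration: it assumes the first non-empty line of the tab-delimited text is the header row, and the sample header names and IDs are made up, since the actual column scheme is defined by the example figure. It also checks that all data rows have the same number of columns, because a ragged file is unlikely to match the format of the existing table.

```python
def strip_header(tab_text):
    """Drop the header line of tab-delimited text and return the data rows.

    Raises ValueError if the remaining rows differ in column count.
    """
    # Ignore completely blank lines; keep all other lines verbatim.
    lines = [line for line in tab_text.splitlines() if line.strip()]
    data_lines = lines[1:]  # assumption: the first non-empty line is the header
    widths = {len(line.split("\t")) for line in data_lines}
    if len(widths) > 1:
        raise ValueError(f"inconsistent column counts: {sorted(widths)}")
    return "\n".join(data_lines) + "\n" if data_lines else ""


# Hypothetical sample content; real files follow the scheme in the figure.
sample = "Agilent\tUnigene\nA_EXAMPLE_1\tXx.0001\nA_EXAMPLE_2\tXx.0002\n"
print(strip_header(sample))
```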
3 Importing the data in Access
Adding custom annotations to a GenMAPP database in Access involves importing the newly
formatted files. During import, we will choose to add the data to an existing table rather than
create a new table.
3.1 Locate the GenMAPP Database to be modified. Right-click on the file icon and select Open
With>Microsoft Office Access. While the file is opening, you might get a security warning; if so,
click Open to continue.
3.2 When the database opens in Access, click on the Tables tab on the left side to display all
tables in the database. Confirm that the tables you are planning to add to are indeed present
in the database. For example, if you have formatted files for an Agilent-Unigene
Relationship table, then make sure that there is an Agilent-Unigene table.
3.3 In the Tables tab tool bar, click New, and then select Import Table in the window that
appears.
3.4 Browse to find the Agilent-Unigene text file created under 2.2.1 above. A series of windows
will appear, asking for information about the file:
Step 1: Select Delimited and click Next.
Step 2: Select Tab and un-check the option for First Row Contains Field Names. Click Next.
Step 3: Select In an Existing Table and then select the Agilent-Unigene table from the dropdown box. Click Next.
Step 4: In the final window, click Finish.
A message will appear informing you whether the import was successful.
3.5 Repeat 3.3 and 3.4 for each of the remaining Relationship Tables, remembering to change
the target table in Step 3 to the appropriate table. The database is now ready to use.