Mouse Genome Informatics Workshop
University of North Carolina, Chapel Hill
February 20 - 22, 2007
Mouse Genome Informatics (MGI) provides integrated access to data on the genetics,
genomics and biology of the laboratory mouse. In this workshop you will explore the MGI
database in depth. You will use MGI to:
find mouse models of human disease;
identify single nucleotide polymorphisms (SNPs), PCR polymorphisms and biological sequences specific to selected mouse strains;
find primer sequences, genotyping protocols, and suppliers of mouse mutants;
locate suppliers of BAC or cDNA clones;
review nomenclature for mouse genes, markers, alleles, mutations and strains;
learn how to submit your own mouse genome, mutant, allele, strain or expression data to MGI;
locate quantitative trait loci (QTL) associated with a specific phenotype;
view mouse markers, alleles and phenotypes mapped onto a genome browser;
find gene expression assays and images for specific anatomical structures and developmental stages;
retrieve upstream regulatory sequence of a gene;
view terms describing the molecular function, biological process and cellular component of a gene, and retrieve a list of genes annotated with a specific Gene Ontology (GO) term.
Human Disease (OMIM) and Mouse Models
The Human Disease Vocabulary Browser contains terms from Online Mendelian Inheritance in
Man (OMIM). You can use the vocabulary browser to find mouse models of human disease.
The following example searches for mouse models of Greig Cephalopolysyndactyly Syndrome
and Bruton Type Agammaglobulinemia.
1. Go to the MGI home page at www.informatics.jax.org. In the Search Menus column on the
right, click the link to Vocabulary Browsers. Then click the link to the
Human Disease Vocabulary Browser.
2. Click on the letter G to view a list of terms beginning with this character. Scroll down the list
to Greig Cephalopolysyndactyly Syndrome; GCPS. Click this link to view details about the one
mouse model associated with this disease.
3. On the Human Disease and Mouse Model Detail page, click the link to Gli3Xt-J/Gli3Xt-J in the
Mouse Models section.
4. On the Gli3Xt-J Phenotypic Allele Detail page, click on the link at top right to show the
3 image(s) involving this allele. Click directly on any image of mouse feet for an expanded
view.
5. Return to the Phenotypic Allele Detail page by using your browser’s back button, or clicking
the Gli3Xt-J symbol. Scroll down through the phenotype annotations to the bottom of the page.
6. At the bottom of the page, click the number in parentheses in All references (x). This page
lists papers curated by MGI staff to provide phenotype annotations, disease associations, and
other information for allele Gli3Xt-J.
7. Return to the vocabulary browser by clicking the Human Disease (OMIM) link in the left
hand navigation bar.
8. Type btk in the text box and click the Search button.
9. Click the link to Bruton Agammaglobulinemia Tyrosine Kinase; BTK.
10. Note the 6 mouse models in the table and the different genotypes of each. Click the links in
the Ref(s) column to read reference abstracts.
11. To find mouse genes associated with human diseases, click the Genes/Markers link in the
left hand navigation bar. Click the Genes and Markers link on the main part of the page to
retrieve the Genes and Markers Query Form.
12. In the Gene/Marker section at the top of the page, type pax in the
Gene/Marker Symbol/Name text box.
13. In the Mouse phenotypes & mouse models of human disease section, click the link to
Anatomical Systems Affected by Phenotypes, scroll down to the bottom of the list and select
vision/eye. Click the Add To Query button.
14. Click the Search button.
15. Click the Pax6 gene symbol in the Symbol, Name column on the Query Results page.
16. On the Pax6 Gene Detail page, find the Phenotypes section and click the number in
parentheses in Alleles Annotated to Human Diseases(x).
17. Note that the alleles listed on the Query Results page have been associated with
Aniridia, Type II as well as other human diseases. Find allele symbol Pax6Sey-Dey in the first
column and click this link.
18. Click directly on the image of the mice to view a larger image and read additional detail
about phenotype.
Phenotypes and Alleles
Alleles in MGI may be annotated with Mammalian Phenotype (MP) terms or free text notes.
You may find mutant or genetically engineered alleles, transgenes, or QTL variants by
phenotype annotation, human disease annotation, nomenclature or chromosomal location.
1. Click the Phenotypes/Alleles link in the navigation bar. Click the link to
Phenotypes and Alleles to retrieve the Phenotypes and Alleles Query Form.
2. Click the link to Anatomical Systems Affected by Phenotypes and select nervous system
from the menu. Click the Add to Query button.
3. Select chromosome 1 in the Nomenclature & location section of the form.
4. In the Categories section, check the box next to QTL. Click the Search button.
5. Scroll down the Query Results page and click the link to Allele Symbol Szs1DBA2J.
6. From the QTL Variant Detail page, click the See Below link for references and additional
information about this variant.
7. Move back up the page and click the link to Mammalian Phenotype (MP) term seizures, or
any other hyperlinked MP term.
8. On the Mammalian Phenotype Browser page, click the link to (x genotypes, y annotations) to
view a list of the genotypes annotated with the term.
9. Click the link to Mouse GBrowse in the left hand navigation bar.
10. Erase any text in the Landmark or Region text box and type Sox10. Click the Search
button.
11. Further down the page, find the Tracks section. Leave the default boxes checked, and in
addition, check the box next to:
Alleles
All on
For Phenotypes, check boxes next to:
Endocrine_exocrine_gland
Growth_size
Nervous_system
Pigmentation
Click the Update image button.
12. Find the Scroll/Zoom pull-down menu at the top center of the page, above the overview of
chromosome 15. Select Show 1 Mbp.
13. GBrowse now displays a 1 Mbp region of chromosome 15 surrounding the Sox10 gene.
The display includes representative transcripts, gene models, alleles and associated phenotype
terms. These mapped features are clickable. Click one of your choice to retrieve details for that
feature.
14. Return to the MGI home page by clicking the link to MGI Home in the top left corner of the
page. At the bottom left of the MGI home page, click the link to Deltagen and Lexicon
Knockout (KO) Mice.
15. Scroll down the page to gene symbol Slc12a6. To find this symbol quickly, press
Ctrl+F on Windows machines, Command (⌘)+F on Macintosh machines, or open the Edit menu on your
browser and choose Find. Type Slc12a6 in the text box.
16. Click the link to allele symbol Slc12a6Gt(IRESBetageo)105Lex. On the Phenotypic Allele Detail page,
find the Phenotypes section and note the behavior/neurological phenotypes our curators have
assigned to this allele. In the Additional Information section, click the link to data as provided
by Lexicon Genetics, Inc.
17. On the Lexicon Knockout Mice Phenotypic Data Summary page, click the Neurology folder
at left to open. Open the folders for Inverted_Screen, Openfield, or Tail_Suspension. Click on
Results for a test. Note that data are given for wild type (WT), heterozygotes (HET), and
homozygotes (HOM) for this allele.
18. Open the folders for Radiology or Opthalmology and view results within the subfolders.
These contain images from CAT scans, angiograms, and other scans displaying skeletal and
ocular anatomy. Click on the images to zoom in.
19. MGI makes four out-of-print books, including The Coat Colors of Mice by Willys K. Silvers,
available electronically. To return to the MGI home page, click on the MGI logo in the top left
corner. Click Additional MGI Tools and Links on the MGI home page beneath the blue Search
box. Click the Online Books link above the Nomenclature section. Click the link to
The Coat Colors of Mice. Click directly on the image of the book to enter. In the left hand
navigation bar, scroll to the bottom and click the link to List of Figures. Click on any link to
Plates 1, 2 or 3 to view photographs displaying a variety of coat color mutants.
20. Return to the List of Figures and find Chapter 12: Microphthalmia and Other
Considerations in the left hand navigation bar. Click the link to Roman numeral
I: Microphthalmia Locus. Click the link to mi Allele (MGI) in the box beneath the subheading
A. Microphthalmia (mi).
21. On the Phenotypic Allele Detail page, click directly on the image of MitfMi/MitfMi and
control mice to enlarge the image and read a brief description.
22. Return to the MitfMi Phenotypic Allele Detail page by clicking the allele symbol or using
your browser’s back button. To find mice with Mitf mutations, click the link to
Search for IMSR strains in the Allele details section of the page. The International Mouse Strain
Resource contains information from a number of different repositories including Oak Ridge
National Laboratory (ORNL), The Jackson Laboratory (JAX), and the Mutant Mouse Regional
Resources Centers (MMRRC) among others. Note that column 1 links to strain information at
the repository site; column 2 links to the holder for questions or orders; farther right, the allele
symbol links to the Phenotypic Allele Detail page in MGI.
Strains and Polymorphisms
You may use MGI to compare polymorphisms between mouse strains. MGI contains single
nucleotide polymorphisms (SNPs), restriction fragment length polymorphisms (RFLPs), and
PCR polymorphisms for a long list of strains. You can also retrieve strain-specific biological
sequences, locate suppliers of BAC or cDNA clones, review nomenclature for genes, alleles,
mutations and strains, and submit data about a new mutation or strain.
1. From the IMSR, return to MGI by clicking the MGI logo or using your browser’s back
button. Click the link to Strains/Polymorphisms in the navigation bar on the left. Then click
the SNPs link to retrieve the Mouse SNP Query Form.
2. From the Available Strains list, select the following strains. To select several at once, hold down
the Ctrl key on Windows machines while selecting each strain. On Macintosh machines, hold
down the Command (⌘) key.
A/J
C3H/HeJ
C57BL/6J
DBA/2J
3. Click the Add button to move strains to the Selected Strains list.
4. In the Associated genes section of the Mouse SNP Query Form, find the
Gene Symbol/Name text box and type Pparg. Click the Search button. A new browser
window will open displaying SNPs identified within or near the gamma peroxisome
proliferator activated receptor (Pparg) gene. Click any link to MGI SNP Detail in the SNP ID
column to view flanking sequence for a SNP.
5. To view SNPs associated with all strains assayed, return to the Mouse SNPs Query Results
page and click the Pparg link in the Gene: dbSNP Function Class column. On the Pparg Gene
Detail page, find the Polymorphisms section and click the number in parentheses next to
SNPs within 2kb(x).
6. To find PCR polymorphisms, return to the Strains and Polymorphisms menu by clicking the
link in the left hand navigation bar. Click the link to RFLP/PCR Polymorphisms. On the
RFLP/PCR Polymorphism Query Form, type Snca in the Locus Symbol/Name text box. Click
a PCR link in the Type column to view details about alleles, fragment sizes and strains. Click a
link in the Probe column to retrieve further detail, including primer sequences in many cases.
7. To find genotyping protocols including reaction components, cycling conditions and
primers, visit JAX Mice at this URL:
http://jaxmice.jax.org
In the left hand navigation bar, click the link to Mouse Strain Information. Find the JAX
Mice Resources section in the column at right, and click the link to Genotyping Protocols.
Type Snca in the Gene Symbol text box and click the Search button. Click a link in the
Protocol column for detailed information. Scroll down the page below the gel image to view
detailed reaction conditions. For additional information please contact JAX Mice directly at
micetech@jax.org or 800-422-MICE.
8. To find sequences by strain, return to MGI and click the Sequences link in the navigation
bar. Click the Mouse Sequences link to retrieve the Mouse Sequence Query Form. For
Sequence Type, select RNA. In the Sequence Source section, type C57BL/6 in the
Strain/Species box, and check the box next to NOT. Change “begins” to “contains” in the
drop-down menu.
9. Scroll down the page to the Protein domains section and type wnt in the text box.
10. In the Expression section of the form, type limb, skeleton in the Anatomical
Structure(s) text box.
11. Find the Sorting and output format section at the bottom of the page. Change
Maximum number of items returned to 15 to speed up the search. Change the Sort by menu
from Type to Strain/Species. Click the Search button.
12. This complex search returns the first fifteen RNA sequences from strain names not
containing “C57BL/6”. All sequences are annotated with an Interpro domain containing
“wnt” and have expression data for the anatomical structures “limb” or “skeleton” or their
substructures.
13. To locate suppliers of a BAC or cDNA clone, click the link to Probes/Clones in the left
hand navigation bar, then the link to the Molecular Probes and Clones Query Form.
14. For Gene/Marker Symbol/Name, change “contains” to "=" in the drop-down menu and
type Park2 in the text box.
15. For Sequence Type, select “genomic” from the drop-down menu in the Probe/Clone
attributes section. Click the Search button.
16. Click on the MGI ID for any of the RPCI23 clones. For example, click the link to
MGI:2790721. On the page that follows, copy the ID number RP23-355B17.
17. Go to Tools and Links and click the link to Sources for clones in the Community Links
section. Click the link to BACPAC Resources Center to order this clone.
18. At the BACPAC website, click the link to Online Ordering. Then click the link to Clones
beneath Product Selector.
19. Enter ID number RP23-355B17 in the box and click the button to Verify These Clones.
20. Click the Add Clones to Cart button for pricing information.
21. For cDNA clones, return to MGI and repeat the search on the Molecular Probes and Clones
Query Form. Select “cDNA” for sequence type. Detail pages for cDNA clones may contain
direct links to IMAGE or RIKEN clone suppliers.
22. To review mouse nomenclature, return to MGI and click Tools and Links in the left hand
navigation bar.
23. Find the Nomenclature section and click the link to the Mouse Nomenclature Main Page.
The Quick Guides are good places to start for gene, allele and mutation nomenclature. More
detailed information may be found in the full guides. For assistance in using these documents,
see the contact information for the Mouse Genomic Nomenclature Committee (MGNC) at the
bottom of the page.
24. To submit data, click the links to Submit a proposed locus symbol or mutant allele, or
Register a new mouse strain name.
25. For experimental data such as mapping, linkage, haplotype, expression or other data, click
Tools and Links again and locate the link to Data and Nomenclature Submissions.
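The logic of the complex sequence search in steps 8 to 12 above can be sketched in a few lines of Python. This is only an illustration of how the criteria combine (sequence type, NOT strain, protein domain, anatomical structure, sort order, result limit). The SequenceRecord layout and its field names are hypothetical and are not MGI's schema, and unlike the MGI query form this sketch matches structure names exactly and does not expand to substructures.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SequenceRecord:
    # Hypothetical record layout, for illustration only.
    seq_id: str
    seq_type: str          # e.g. "RNA" or "DNA"
    strain: str            # strain/species name
    domains: List[str]     # protein domain names
    structures: List[str]  # anatomical structures with expression data


def complex_search(records, max_items=15):
    """Combine the criteria from steps 8-11 on an in-memory list."""
    wanted_structures = ("limb", "skeleton")
    hits = [
        r for r in records
        if r.seq_type == "RNA"
        # NOT checked, "contains": exclude strains containing C57BL/6
        and "c57bl/6" not in r.strain.lower()
        # at least one domain name containing "wnt"
        and any("wnt" in d.lower() for d in r.domains)
        # "limb" OR "skeleton" (exact names only in this sketch)
        and any(s.lower() in wanted_structures for s in r.structures)
    ]
    hits.sort(key=lambda r: r.strain)  # Sort by Strain/Species
    return hits[:max_items]            # Maximum number of items returned
```

Calling complex_search(records) with a list of such records returns at most fifteen matches ordered by strain name, which mirrors the settings chosen in step 11.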
Gene Expression and Regulatory Regions
The Gene Expression Database (GXD) in MGI emphasizes endogenous gene expression during
mouse development. GXD stores primary data from different types of assays, including blots,
RNA in situ, and immunohistochemical studies. GXD may be searched by anatomical term,
developmental stage, assay type, gene or chromosomal location. This section also
demonstrates how to access upstream regulatory sequence from resources linked to MGI.
1. Click on the Expression link at left. Click the link to Gene Expression Data to retrieve the
Gene Expression Data Query Form.
2. In the Expression section of the query form, change the radio button from “either” to
"detected in".
3. For Anatomical Structure(s), change the pull-down menu from "contains" to "=", and type
neural crest in the text box.
4. To sort results by embryonic age, select the radio button next to Age in the Sorting and
output format section near the bottom of the page. Also select the Assay Results radio button.
Click the Search button.
5. The Query Results page lists positive gene expression in the neural crest and its
substructures. Click the link to MGI:2386962 in the Result Details column. This links to further
detail of RNA in situ results for gene Adrbk1.
6. On the Result Details page, find any table with an Image column and click the link to the
figure. These link to color images from the original paper.
7. Click the back button twice on your browser window to return to the Query Results page for
neural crest. In the Structure column, click any link, such as TS11: future brain: neural fold:
neural crest.
8. On the Anatomical Dictionary Browser page, click the link to show all gene expression
results.
9. To compare expression between anatomical structures and/or developmental stages, use the
Gene Expression Data Expanded Query Form. Click the link to Expression on the left, then the
link to Gene Expression Data (Expanded).
10. In the Expression section, type cerebral cortex in the first Anatomical Structure(s) text
box, and hypothalamus in the second.
11. Select TS 21 (12.5 – 14.0 dpc) from both Developmental Stage(s) menus.
12. Click the Search button. This search returns genes expressed in the cerebral cortex but not
detected in hypothalamus from 12.5 to 14.0 days post-conception.
13. Click the link to Foxd1 in the Gene column. Find the Gene Ontology (GO) classifications
section, and click the link to GO term axon guidance. To retrieve a list of genes annotated with
this term, click the link to (x genes, y annotations).
14. To find regulatory elements of a gene, click the link to gene symbol Ablim1 in the
Symbol, Name column.
15. On the Ablim1 Gene Detail page, click the link to either Ensembl ContigView or VEGA
ContigView in the Sequence Map section. Ensembl genes are computationally predicted,
while VEGA genes are manually curated and will likely contain more accurate, complete
information. The following example uses the Ensembl ContigView link, though you may use
either, understanding that coordinate values will differ between the two.
16. At Ensembl, find the Detailed view. In the text boxes next to Jump to region, notice that
this gene has been mapped to chromosome 19 from base pairs 57,085,524 to 57,270,282.
17. Scroll to the bottom of the box named Detailed view. Directly above the Gene legend, find
the track labeled Length. This track indicates the length of the displayed region, as well as the
direction of transcripts and other features mapped to this region. Notice the arrow at right
indicating the reverse strand.
(Length track as displayed: 184.76 Kb, reverse strand)
18. Scroll up the Detailed view and find the track labeled DNA(contigs). All transcripts and
features mapped below this track run in the reverse direction (from right to left), and those
above it in the forward direction (from left to right).
19. Three transcripts for Ablim1 lie below the DNA(contigs) track, indicating that they run
from right to left on the reverse strand. The gene coordinates for Ablim1 are given as 57,085,524
to 57,270,282, so the 5’ end of the gene is located at 57,270,282 bp.
20. To retrieve upstream sequence for Ablim1, add 2,000 bp to the 5’ end of the gene. Change
the values in the text boxes next to Jump to region to 57,270,282 and 57,272,282. Click the
Refresh button.
21. In the navigation bar at the left, click the link to Export sequence as FASTA. Select Text as
the Output format and click the Continue >> button. Select and copy the sequence.
22. A number of promoter prediction tools are available online at no charge. Search for a
phrase like “promoter finder” or “promoter prediction” using Google or another search engine
of your choice. Paste the sequence into the text box of your chosen promoter finding tool.
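The coordinate arithmetic in steps 19 and 20 can be checked with a short Python sketch. For a gene on the reverse strand the 5' end is the larger coordinate, so the upstream region extends toward higher base pair values; for a forward strand gene it extends toward lower values. The function names below are illustrative and are not part of MGI or Ensembl. The reverse_complement helper is included on the assumption that an exported sequence may be given on the forward strand; whether that applies depends on the export tool you use.

```python
def upstream_region(start, end, strand, flank=2000):
    """Return (region_start, region_end) spanning `flank` bp upstream of the 5' end.

    start, end: gene coordinates with start < end
    strand: "+" for forward, "-" for reverse
    """
    if strand == "-":
        five_prime = end                      # 5' end is the larger coordinate
        return five_prime, five_prime + flank
    five_prime = start                        # 5' end is the smaller coordinate
    return max(1, five_prime - flank), five_prime


# Translation table for complementing DNA bases (N stays N).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Ablim1 coordinates as given in step 19 (reverse strand).
print(upstream_region(57085524, 57270282, "-"))
```

For Ablim1 this prints (57270282, 57272282), the same values entered in the Jump to region boxes in step 20.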
Batch Data Download
1. To download plain text files of all genes and markers in MGI, sequences associated with
these markers, or orthologies between mouse, human and rat, return to MGI and click the link
to Additional Tools and Links.
2. Under Downloads, click the link to Database Reports.
3. The MGI Data and Statistical Reports page is divided into several sections, each containing
links to plain text files. For example, click on the link to MRK_List1.sql.rpt to retrieve a list of
all markers in MGI.
4. For those comfortable with complex SQL queries, MGI grants direct access with public SQL
accounts. Before considering this, please see the more than 180 user tables in the Schema
Browser. Back up to the Additional MGI Tools and Links page and click the Help link beneath
the MGI logo. Click the link to the Schema Browser beneath Software Developer Resources.
Contact MGI User Support for a public account.
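Once a report has been downloaded, it can be processed with a short script. The Python sketch below assumes the file is tab-delimited with a header row; the column names "Chr" and "Marker Symbol" and the sample rows are assumptions made for illustration, so check the actual header of the report you download before relying on them.

```python
import csv
import io


def read_report(text):
    """Parse tab-delimited report text with a header row into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)


def markers_on_chromosome(rows, chrom, chr_column="Chr",
                          symbol_column="Marker Symbol"):
    """Return marker symbols whose chromosome column equals `chrom`.

    The default column names are assumptions, not confirmed report headers.
    """
    return [row[symbol_column] for row in rows if row[chr_column] == chrom]


# Made-up sample text with placeholder IDs, standing in for a downloaded file.
sample = (
    "MGI Accession ID\tChr\tMarker Symbol\n"
    "MGI:EXAMPLE1\t19\tAblim1\n"
    "MGI:EXAMPLE2\t15\tSox10\n"
)
rows = read_report(sample)
print(markers_on_chromosome(rows, "19"))
```

To use a real download, read the file contents into a string, for example with open(path, encoding="utf-8").read(), and pass that string to read_report.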
For more information, please contact MGI User Support by clicking the Help link at upper left, by e-mailing
mgi-help@informatics.jax.org, or by calling (207) 288-6445. For comments or corrections, click the
Your Input Welcome button on any Gene Detail, Phenotypic Allele Detail, or GXD Query Results page.
Susan McClatchy
Office: (207) 288-6445
URL: www.informatics.jax.org
E-mail: mgi-help@informatics.jax.org