UCSC Genome Browser: navigating a genomic region

Introduction to BioComputing
Biology in silico
3rd February 2010
Carrie Iwema, PhD, MLS
Molecular Biology Information Specialist
Health Sciences Library System
University of Pittsburgh
iwema@pitt.edu
http://www.hsls.pitt.edu/guides/genetics
General Topics
  Information Overload
  Genome → Gene → Protein
Specific Topics
  Information Overload
    PubMed
    Alternatives to PubMed
      GoPubMed
      Novoseek
      PubGet
    Molecular Databases
    HSLS Molecular Biology Information Service
  Genome → Gene → Protein
    Genome Biology
    Genome Browsers
      UCSC Genome Browser
      NCBI MapViewer
    Entrez Gene
    UniProt
Information Overload
5,394 journals
1.3 billion searches in 2009
Result counts shown on the slide: 209K, 4K, 84K, 52K
• Breast Cancer
• Colon Cancer
• p53
• STAT1
Alternatives to PubMed
Growth of Molecular Databases
2010: 1230
2009: 1170
2008: 1075
Source: Nodal Point Blog
Molecular Databases
Journals
  Nucleic Acids Research: Oxford Journals
    Annual Database Issue
    Annual Web Server Issue
  Bioinformatics: Oxford Journals
  BMC Bioinformatics: BioMed Central
  Database: Oxford Journals *new in 2009*
Articles on “genetic databases”
  PubMed: 21,851 results
  MeSH: 16,398 results
HSLS Molecular Biology Information Service
Workshops
Bioinformatics
Consultations
Website
Software
Licensing
HSLS OBRC
HSLS OBRC in Science
HSLS OBRC: 2441 links to databases and software; ~3000 hits/day
search.HSLS.MolBio
Integrated search system
  Databases & Software
  Articles on Databases & Software
  Genes/Proteins
  Pathways
  Protocols
  Videos
  Recommended Articles
Tabbed browsing
Clustered search results
Hands-on exercises
Locate databases on
  Natural antisense, UTR, copy number variation
Retrieve gene information for
  Your favorite gene, BRCA1, STAT1
Find a suitable protocol for
  Methylation PCR, in situ hybridization, primer design
Identify videos on
  Protein structure prediction, human genome project
Genome Biology
From Cell to Gene
Human Genome Project Video
Genome Biology Time Line
1976: RNA Bacteriophage MS2
1995: Haemophilus influenzae
2001: Human Genome Draft Seq
2003: Published Complete Human Ref Genome
2007: Diploid Genome seq of an Individual Human
2008: Jim Watson Genome
2010: Published Complete Genomes: 1191 organisms
Human Genome Project Video
Genome Resources
NCBI:
  Genomes Resources: Link
    Genome Project
    Genome: 6108 species
Genomes OnLine Database (GOLD): Link
JGI: Integrated Microbial Genomes: Link
NCBI Genome Resources
Practice Question:

Query: Check the status of genome sequencing for an organism, such as rabbit.

Answer:

Pick an organism or metagenome project name.
Search the Genome Project database. For the most precise results, specify the organism field when searching with an organism name, for example: human[orgn].
If the search returns more than one result, click the desired Genome Project.
The Genome Project summary page provides information on available projects and sequencing status.
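
The search step above can also be written as an NCBI E-utilities request. The sketch below only builds the request URL and does not contact NCBI. It assumes the `esearch` endpoint and the database name `bioproject` (the Genome Project database was later renamed BioProject); the helper name `build_project_search_url` is hypothetical.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities esearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_project_search_url(organism: str, db: str = "bioproject") -> str:
    """Return an esearch URL that restricts the term to the organism field.

    The default database name is an assumption: the Genome Project
    database described on the slide is assumed to map to BioProject.
    """
    term = f"{organism}[orgn]"  # same field tag as the human[orgn] example
    return f"{ESEARCH}?{urlencode({'db': db, 'term': term})}"


url = build_project_search_url("rabbit")
print(url)
```

The square brackets are percent-encoded in the printed URL; the server decodes them back to `rabbit[orgn]`.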



NCBI Genome Project

A collection of complete and in-progress large-scale sequencing,
assembly, annotation, and mapping projects for cellular organisms. The
database is organized into organism-specific overviews that function as
portals for browsing and retrieving projects pertaining to each organism.
Rabbit
CLICK
NCBI Genome Project : Rabbit Genome
NCBI Entrez Genome
Genomes Online Database (GOLD)


http://genomesonline.org/index2.htm
Global resource for comprehensive access to information regarding
complete and ongoing genome projects, metagenomes, and metadata.
“genome sequencing has come of age, and genomics will become
central to microbiology's future. It may appear at the moment that
the human genome is the main focus and primary goal of genome
sequencing, but do not be deceived. The real justification in the long
run, is microbial genomics”
Carl Woese, 1998
Genome Browsers
Genome Browsers: What are they?
Genome Browsers enable researchers
to visualize and browse
entire genomes with annotated data
including gene prediction and
structure, proteins, expression,
regulation, variation, comparative
analysis, etc.
Genome Browsers
Display: Vertical
The Big Three
  NCBI MapViewer
  UCSC Genome Browser
  EBI Ensembl
Display: Horizontal
Generic Genome Browser (GBrowse)
JBrowse (Ajax-based, like Google Maps)
Tutorial Articles
Link
Link
Link
Link
Tutorial/Seminar Videos
Link
Link
Link
Link
UCSC Genome Browser
Navigating the Human Genome
Browse the region of human chromosome 7 between bp 54,318,043 and 55,974,438
UCSC Genome Browser
UCSC Genome Browser: navigating a genomic region
Set up basic browser parameters
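
As a rough illustration of setting the browser position, the sketch below turns the coordinates from the exercise into a `chrom:start-end` position string and an `hgTracks` URL. The assembly name passed as `db` is an assumption, since the slides do not state which human assembly the coordinates refer to; the helper names `parse_bp` and `region_url` are hypothetical.

```python
from urllib.parse import urlencode

# Entry point of the UCSC Genome Browser track display.
HGTRACKS = "https://genome.ucsc.edu/cgi-bin/hgTracks"


def parse_bp(text: str) -> int:
    """Convert a coordinate written with thousands separators to an int."""
    return int(text.replace(",", ""))


def region_url(chrom: str, start: str, end: str, db: str = "hg19") -> str:
    """Build an hgTracks URL for a region.

    The default assembly in `db` is an assumption; the slides do not say
    which human assembly the coordinates belong to.
    """
    position = f"{chrom}:{parse_bp(start)}-{parse_bp(end)}"
    return f"{HGTRACKS}?{urlencode({'db': db, 'position': position})}"


start, end = parse_bp("54,318,043"), parse_bp("55,974,438")
region_length = end - start + 1  # both end coordinates are included
print(region_length)
print(region_url("chr7", "54,318,043", "55,974,438"))
```

The same position string, `chr7:54318043-55974438`, can be typed directly into the browser's position box.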
UCSC Genome Browser: navigating a genomic region
Start fresh
UCSC Genome Browser: navigating a genomic region
Browse the region of human chromosome 7 between bp 54,318,043 and 55,974,438
What genes are present in this region?
NCBI sequence databases

RefSeq
  based on GenBank records; non-redundant, expert-verified databases of reference sequences Link
GenBank
  archival database of nucleotide sequences from >160,000 organisms Link
International Nucleotide Sequence Database Collaboration
Primary Vs Derivative databases
RefSeq Scope & Accessions

Genomic DNA
  NC_123456 - complete genome, chromosome, plasmid
  NG_123456 - genomic region
  NT_123456 - genomic contig
mRNA
  NM_123456
Protein
  NP_123456
more about RefSeq scope and accessions...
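
The accession prefixes listed above lend themselves to a simple lookup. This is a minimal sketch that covers only the five prefixes named on the slide; RefSeq defines others, and the function name `classify_accession` is hypothetical.

```python
# Prefix meanings taken from the slide; other RefSeq prefixes are not covered.
REFSEQ_PREFIXES = {
    "NC": "genomic DNA: complete genome, chromosome, plasmid",
    "NG": "genomic DNA: genomic region",
    "NT": "genomic DNA: genomic contig",
    "NM": "mRNA",
    "NP": "protein",
}


def classify_accession(accession: str) -> str:
    """Return the molecule type implied by a RefSeq accession prefix."""
    prefix, sep, _ = accession.partition("_")
    if not sep or prefix.upper() not in REFSEQ_PREFIXES:
        return "unknown"
    return REFSEQ_PREFIXES[prefix.upper()]


for acc in ("NC_123456", "NM_123456", "NP_123456", "XX_123456"):
    print(acc, "->", classify_accession(acc))
```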
UCSC Genome Browser: navigating a genomic region
Browse the region of human chromosome 7 between bp 54,318,043 and 55,974,438
Zoom in and display only the EGFR gene
UCSC Genome Browser: navigating a genomic region
Select the gene region from the “Scale” track to zoom in
UCSC Genome Browser: navigating a genomic region
Browse the region of human chromosome 7 between bp 54,318,043 and 55,974,438
Display all single nucleotide polymorphisms (SNPs) present in this gene
UCSC Genome Browser: navigating a genomic region
Browse the region of human chromosome 7 between bp 54,318,043 and 55,974,438
Retrieve the nucleotide sequence of this genomic region, showing all exons in blue and SNPs in red, bold, and underlined.
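
To make the requested formatting concrete, here is a toy sketch that marks up a sequence as HTML, with exon bases in blue and SNP positions in red, bold, and underlined. It is not how the UCSC Genome Browser produces its sequence view; the function name `mark_up`, the example sequence, the exon interval, and the SNP position are all made up for illustration.

```python
EXON_STYLE = "color:blue"
SNP_STYLE = "color:red;font-weight:bold;text-decoration:underline"


def mark_up(seq, exons, snps):
    """Return HTML for seq with exon and SNP positions highlighted.

    exons: list of (start, end) half-open, 0-based intervals within seq.
    snps: set of 0-based positions within seq; SNP styling takes
    precedence over exon styling at the same position.
    """
    out = []
    for i, base in enumerate(seq):
        if i in snps:
            out.append(f'<span style="{SNP_STYLE}">{base}</span>')
        elif any(start <= i < end for start, end in exons):
            out.append(f'<span style="{EXON_STYLE}">{base}</span>')
        else:
            out.append(base)
    return "".join(out)


# Hypothetical input: a 4-base sequence, one exon at positions 1-2, one SNP at 2.
html = mark_up("ACGT", exons=[(1, 3)], snps={2})
print(html)
```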
UCSC Genome Browser: navigating a genomic region: sequence view
UCSC Genome Browser: navigating a genomic region
Browse the region of human chromosome 7 between bp 54,318,043 and 55,974,438
Look in the probable promoter region and see if there’s anything interesting…
UCSC Genome Browser: navigating a genomic region
Zoom out
UCSC Genome Browser: navigating a genomic region
Browse the region of human chromosome 7 between bp 54,318,043 and 55,974,438
What transcription factors bind in this region?
Discovery Tool…
NCBI MapViewer

How To: View/download features around an object or between two objects on a chromosome
Starting with...CHROMOSOMAL COORDINATES

Begin on the Map Viewer home page. Click the "R" icon under Tools for the desired organism and build.
Select the chromosome, enter the coordinates in the From and To boxes, and click Go. Use either exact coordinates, e.g., 61551076, or values such as 61M or 61551K.
If necessary, use the Maps & Options dialog box to change the displayed maps; the maps and region displayed determine the data available.
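
The coordinate shorthand mentioned in the steps above (61M, 61551K) can be expanded to base-pair positions as follows. This sketch only mirrors the notation; it is not Map Viewer's own parser, and the function name `parse_coordinate` is hypothetical. It assumes a whole number before the K or M suffix.

```python
# K = thousand base pairs, M = million base pairs.
MULTIPLIERS = {"K": 1_000, "M": 1_000_000}


def parse_coordinate(text: str) -> int:
    """Convert '61551076', '61551K', or '61M' to a base-pair position."""
    value = text.strip().replace(",", "").upper()
    if value and value[-1] in MULTIPLIERS:
        # Assumes an integer before the suffix, as in the slide's examples.
        return int(value[:-1]) * MULTIPLIERS[value[-1]]
    return int(value)


for example in ("61551076", "61M", "61551K"):
    print(example, "->", parse_coordinate(example))
```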


Entrez Gene
Common Questions
What is its genomic sequence?
How many splice variants are there?
What is its intron-exon architecture?
What is its function?
In which tissues is it expressed?
What are its neighboring genes?
What diseases are associated with it?
How can I get its cDNA clone?
NCBI : Entrez Gene
Chromosomal Localization
Genomic Sequence
mRNA Sequence
Amino Acid Sequence
Homologous Sequences
Expression Profile
Disease
3D Structure
SNP
Interacting Partners
Entrez Gene
Find:
  gene symbols and aliases
  sequences: genomic, mRNA, protein
  intron-exon architecture
  genomic context: neighboring and antisense genes
  interacting partners
  associated gene ontology terms: function, cellular component and biological process
Entrez Gene

a searchable database of genes, from RefSeq genomes, and defined by sequence and/or located in the NCBI Map Viewer

each record represents a single gene from a given organism
Entrez Gene Sequences
Genomic Seq
Protein Seq
mRNA Seq
Entrez Gene Links
Gene Ontology (GO)
Controlled vocabulary tagging
  Function
  Biological Processes
  Cellular Component
Entrez Gene: Gene Table
Introns/Exons
Try it!
Find the mRNA sequence for your gene of interest
Find mRNA Sequence for Reelin Gene
FASTA vs GenBank records
NCBI Entrez Gene Tutorials

Information page with wiki, video, blog, etc.
Entrez Gene: A Directory of Genes, NCBI Handbook
Short Video Tutorial (MIT)
UCSC Genome Browser: find a gene in the genome
Bioinformatics Databases & Software Providers

NCBI
  Home page
  Site map
  Resource Guide
EBI
  Home page
  Databases
  Software
UniProt
world's most comprehensive catalog of information on proteins
a central repository of protein sequence and function created by joining the information contained in Swiss-Prot, TrEMBL, and PIR
Thank you!
Any questions?
Carrie Iwema
iwema@pitt.edu
412-383-6887
Ansuman Chattopadhyay
ansuman@pitt.edu
412-648-1297