Week 9 Handouts

UCSC Genome Browser
Accessing the browser and tools:
1. Genomes:
a. Gateway Page: Genomes on the left blue bar and Genome Browser on top.
b. Photo Gateway: Photographs of each species represented in the browser.
Genome Browser Interface
- Click the track-name hyperlink under a track group to open that track's description.
- Right-click the browser image and choose View Image to get an image for publications.
Searching the database:
1. Searching genes:
a. PPP1R1B
2. Searching coordinates:
a. chr21:33,031,597-33,041,570
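A coordinate search uses the form chrom:start-end, with optional commas in the numbers. The sketch below shows one way to split such a string and build a browser link from it. The function names are made up for illustration, and the hg19 assembly is an assumption (the handout does not say which assembly the example coordinates refer to).

```python
import re
from urllib.parse import urlencode


def parse_position(text):
    """Split a position such as 'chr21:33,031,597-33,041,570' into parts."""
    match = re.fullmatch(r"\s*(\w+):([\d,]+)-([\d,]+)\s*", text)
    if match is None:
        raise ValueError(f"not a chrom:start-end position: {text!r}")
    chrom = match.group(1)
    start = int(match.group(2).replace(",", ""))  # commas are only for display
    end = int(match.group(3).replace(",", ""))
    if start > end:
        raise ValueError("start must not exceed end")
    return chrom, start, end


def browser_url(chrom, start, end, db="hg19"):
    """Build an hgTracks link; db='hg19' is an assumption, not from the handout."""
    query = urlencode({"db": db, "position": f"{chrom}:{start}-{end}"})
    return "https://genome.ucsc.edu/cgi-bin/hgTracks?" + query


chrom, start, end = parse_position("chr21:33,031,597-33,041,570")
region_length = end - start + 1  # positions typed in the search box are 1-based, inclusive
print(chrom, start, end, region_length)
print(browser_url(chrom, start, end))
```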
Specific Tracks
1. Genetic Association Database (GAD) view
a. Example BRCA1, GAD full view
b. Mouse over red box to see the list of associated genes
2. Gene and Gene Prediction Tracks
a. UCSC genes
b. TransMap Cross-species Alignments
c. RefSeq
d. mRNA and EST Tracks
e. Conservation and Regulation Data
f. And many more.
2. Blat: On top blue bar
a. Fast alignment tool
b. BLAT on DNA is designed to quickly find sequences of 95% and greater similarity of length 25 bases or more.
c. BLAT on proteins finds sequences of 80% and greater similarity of length 20 amino acids or more.
d. Allows user to align DNA/protein to the genome assemblies.
e. Example: Align ‘PORIN_HUMAN’ from GCBA814 dataset page.
3. Tables: On the top blue bar and table browser on the left blue bar.
a. Retrieve data associated with tracks
b. Example: Mammal/Human/hg19/Genes and Gene Prediction Tracks/RefSeq Genes/refGene/genome
4. Gene Sorter
a. Filter: human/hg19/ FOXA2/Expression(GNF Atlas 2)/configure(columns to display)
b. Description page:
i. http://genome.ucsc.edu/goldenPath/help/hgNearHelp.html
5. Genome Graphs:
a. Upload data set (hgGenome_example1.txt)
b. Description:
i. http://genome.ucsc.edu/goldenPath/help/hgGenomeHelp.html
6. PCR: Search for PCR primer sequence in a genome assembly.
a. Human/hg19/genome assembly/ TAACAGATTGATGATGCATGAAATGGG/
CCCATGAGTGGCTCCTAAAGCAGCTGC
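To illustrate what a primer search does, here is a minimal sketch that finds the forward primer and the reverse complement of the reverse primer in a template and returns the stretch between them. It assumes the first sequence above is the forward primer and the second the reverse primer, uses exact matching on one strand only, and runs on a made-up template string rather than real genomic sequence. It is not the UCSC tool's algorithm.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_amplicon(template, forward, reverse):
    """Return the amplicon (primers included), or None if either primer is absent.

    Simplification: exact matches only, and only the template's forward strand.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so on the forward
    # strand we look for its reverse complement downstream of the forward primer.
    reverse_site = reverse_complement(reverse.upper())
    hit = template.find(reverse_site, start + len(forward))
    if hit == -1:
        return None
    return template[start:hit + len(reverse_site)]


forward = "TAACAGATTGATGATGCATGAAATGGG"
reverse = "CCCATGAGTGGCTCCTAAAGCAGCTGC"

# Hypothetical template built only for this demo (not real hg19 sequence).
template = "GGGG" + forward + "ACGTACGTAC" + reverse_complement(reverse) + "TTTT"

amplicon = find_amplicon(template, forward, reverse)
print(len(amplicon), amplicon)
```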
7. VisiGene: a virtual microscope for viewing in situ hybridization images.
a. NM_007492
8. Utilities: tools to remove non-sequence-related characters from DNA or protein, to create a GIF image for a phylogenetic tree, and to convert genome coordinates between assemblies
a. http://genome.ucsc.edu/util.html
9. Downloads: Dumps of UCSC genome browser databases and open source tools.
10. Custom Tracks: Displaying your own annotations in the genome browser.
a. Session:
i. Creating account
ii. Resetting session from My Data -> Session management -> Click here to reset
b. Example:
i. Upload test file “NO6_nc_small”
ii. URL example from golgi
c. Description: http://genome.ucsc.edu/goldenPath/help/customTrack.html
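A custom track is a plain-text file: an optional "browser" line, a "track" line with display attributes, then the annotation rows (BED rows in this sketch). The example below assembles such a file as a string. The helper name, the track name and description, and the feature labels are all made up for illustration; the two intervals are the chr22 exon coordinates that appear later in this handout.

```python
def make_custom_track(name, description, bed_rows, position=None):
    """Assemble custom-track text from (chrom, start, end, label) tuples."""
    lines = []
    if position:
        # Optional: tells the browser which region to open first.
        lines.append(f"browser position {position}")
    # visibility=2 asks for the "full" display mode.
    lines.append(f'track name="{name}" description="{description}" visibility=2')
    for chrom, start, end, label in bed_rows:
        lines.append(f"{chrom}\t{start}\t{end}\t{label}")
    return "\n".join(lines) + "\n"


# Hypothetical track contents for the demo.
rows = [
    ("chr22", 16258185, 16258303, "exon_A"),
    ("chr22", 16266928, 16267095, "exon_B"),
]
track_text = make_custom_track(
    "demoTrack", "Demo exons", rows, position="chr22:16258186-16267095"
)
print(track_text)
```

The resulting text can be pasted into the custom track form or saved to a file and uploaded.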
Other Resources:
https://genome.ucsc.edu/training.html
Reference:
Harte, R. A., Diekhans, M., Kent, W. J. & Haussler, D. Guide to the UCSC Genome Browser. Cambridge, MA: NPG Education, 2010.
Galaxy (https://usegalaxy.org/)
Account Creation:
1. User -> Register
a. Required for
Input data:
1. Upload File from local computer
a. Download data from UCSC genome browser
i. Table Browser -> Mammal/Human/hg19/Genes and Gene Prediction Tracks/UCSC Genes/knownGene/position:chr22/…./(output format) BED - browser extensible data/(output file) Exons/plain text/get output
ii. Next Page -> Whole Gene/Send query to Galaxy
2. UCSC Main
a. UCSC Main -> Mammal/Human/hg19/Variation and Repeats/All SNPs(138)/snp138/(position) chr22/…./(output format) BED - browser extensible data/Galaxy checked/plain text/get output
3. Other sources
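Both downloads above use the BED output format. BED starts are 0-based and ends are exclusive, while positions typed into the browser are 1-based and inclusive, so the start shifts by one when converting. The sketch below parses a six-column BED line and converts it. The class and function names are illustrative, and the sample line is one exon row from the join output shown later in this handout.

```python
from typing import NamedTuple


class BedRecord(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive
    name: str
    score: str
    strand: str


def parse_bed6(line):
    """Parse the first six whitespace-separated BED columns."""
    fields = line.split()
    if len(fields) < 6:
        raise ValueError(f"expected at least 6 BED columns, got {len(fields)}")
    chrom, start, end, name, score, strand = fields[:6]
    return BedRecord(chrom, int(start), int(end), name, score, strand)


def to_browser_position(record):
    """Convert 0-based half-open BED coordinates to a 1-based position string."""
    return f"{record.chrom}:{record.start + 1}-{record.end}"


exon = parse_bed6(
    "chr22 16258185 16258303 uc002zlh.1_cds_1_0_chr22_16258186_r 0 -"
)
exon_length = exon.end - exon.start  # half-open intervals: no +1 needed
print(to_browser_position(exon), exon_length)
```

Note that the exon name in this row embeds 16258186, which is the BED start plus one.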
Simple Data Manipulation
1. Finding Exons with highest number of SNPs
a. Operate on Genomic Intervals -> Join
i. Join 1:Exons with 2:SNPs with min overlap 1
ii. Return: Only records that are joined (INNER JOIN)
iii. Execute
b. Error: All datasets must belong to same genomic build, this dataset is linked to build 'hg19'
c. Fix: Edit Attributes and change Database/Build to Human Feb. 2009 (GRCh37/hg19) (hg19)
d. Output:
chr22 16258185 16258303 uc002zlh.1_cds_1_0_chr22_16258186_r 0 - chr22 16258278 16258279 rs2845178 0 +
chr22 16258185 16258303 uc002zlh.1_cds_1_0_chr22_16258186_r 0 - chr22 16258285 16258286 rs113311305 0 +
chr22 16258185 16258303 uc002zlh.1_cds_1_0_chr22_16258186_r 0 - chr22 16258284 16258285 rs199586979 0 +
chr22 16258185 16258303 uc002zlh.1_cds_1_0_chr22_16258186_r 0 - chr22 16258293 16258294 rs200891952 0 +
chr22 16266928 16267095 uc002zlh.1_cds_2_0_chr22_16266929_r 0 - chr22 16266963 16266964 rs10154680 0 +
chr22 16266928 16267095 uc002zlh.1_cds_2_0_chr22_16266929_r 0 - chr22 16266984 16266985 rs376011635 0 +
e. Group the exons to count SNPs
i. Join, Subtract and Group -> Group
2. Sorting exons by SNP count
a. Filter and Sort -> Sort
3. Selecting top hits
a. Text Manipulation -> Select First
4. Recovering exon info and displaying data in genome browser
a. Join, Subtract and Group -> Compare two Datasets
i. 1:Exons/c4/7:Select first on data 6/c1/Matching rows of 1st dataset
5. Visualizing selected exons in genome browser
a. display at UCSC main
6. Histories
a. History of data, analysis and results.
7. Workflows
a. Pipeline that can be used to rerun the analysis with minimal clicking
b. History Settings -> Extract Workflow -> give name -> Create Workflow
c. Select Workflow from top menu bar.
8. Other tools.
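Steps 1 to 4 of the exon/SNP exercise (Join, Group, Sort, Select First, then recovering the exon rows) can also be sketched outside Galaxy. The code below is a rough plain-Python equivalent, not Galaxy's own implementation. It uses only the two exons and six SNPs from the sample join output above, and the variable and function names are illustrative.

```python
from collections import Counter

# (chrom, start, end, name) taken from the sample join output in this handout.
exons = [
    ("chr22", 16258185, 16258303, "uc002zlh.1_cds_1_0_chr22_16258186_r"),
    ("chr22", 16266928, 16267095, "uc002zlh.1_cds_2_0_chr22_16266929_r"),
]
snps = [
    ("chr22", 16258278, 16258279, "rs2845178"),
    ("chr22", 16258285, 16258286, "rs113311305"),
    ("chr22", 16258284, 16258285, "rs199586979"),
    ("chr22", 16258293, 16258294, "rs200891952"),
    ("chr22", 16266963, 16266964, "rs10154680"),
    ("chr22", 16266984, 16266985, "rs376011635"),
]


def overlap(a_start, a_end, b_start, b_end):
    """Bases shared by two half-open intervals (zero or negative if disjoint)."""
    return min(a_end, b_end) - max(a_start, b_start)


# Step 1: inner join of exons with SNPs, minimum overlap of 1 base.
joined = [
    (exon, snp)
    for exon in exons
    for snp in snps
    if exon[0] == snp[0] and overlap(exon[1], exon[2], snp[1], snp[2]) >= 1
]

# Step 1e: group by exon name (column 4) and count the joined SNPs.
snp_counts = Counter(exon[3] for exon, _ in joined)

# Step 2: sort exons by SNP count, highest first.
ranked = sorted(snp_counts.items(), key=lambda item: item[1], reverse=True)

# Step 3: select the first (top) hit.
top_hits = ranked[:1]

# Step 4: recover the full exon rows by matching on the exon name.
top_names = {name for name, _ in top_hits}
top_exons = [exon for exon in exons if exon[3] in top_names]

for name, count in ranked:
    print(name, count)
print(top_exons)
```

On real chr22 data the nested loop would be slow; a sorted sweep or an interval index would be the usual replacement, but the small version keeps each Galaxy step visible.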
Reference
http://galaxyproject.org/