UCSC Genome Browser
Dror Hollander
Gil Ast Lab
Sackler Medical School

Understanding the Genome
How can you examine a genomic segment while taking all of these factors into account?
promoters, genes, non-coding RNA, conservation, miRNA, histone modifications, DNA, SNPs, nucleosome occupancy, GC content, repetitive elements, gene expression, RNA secondary structure, exon-intron structure, alternative splicing, protein

Lecture Overview
• UCSC Genome Browser Interface & selected tracks
• Detecting alternative splicing events
• Chromatin organization & epigenetics
• BLAT
• PCR
• Galaxy

UCSC Genome Browser
Basic design: “the Genome Browser stacks annotation tracks beneath genome coordinate positions, allowing rapid visual correlation of different types of information”
[Figure labels: genome (64 eukaryote genomes); track; track]

Genome Browsing…

Basic Genome Browser Interface
[Figure labels: chromosomal position; genomic coordinates; zoom; mark and drag here to zoom in; start codons in green; stop codons in red; UCSC Genes track; RefSeq Genes track; intron; CDS; UTR; gene direction (> / <)]
• Black - feature has a corresponding entry in the Protein Data Bank (PDB)
• Dark blue - transcript has been reviewed or validated by either the RefSeq, SwissProt or CCDS staff
• Medium blue - other RefSeq transcripts
• Light blue - non-RefSeq transcripts

Basic Genome Browser Interface
Configure track visualization:

Basic Genome Browser Interface
“RefSeq track shows known human protein-coding and non-protein-coding genes taken from the NCBI RNA reference sequences collection (RefSeq).
The data are updated daily”

“The UCSC track shows gene predictions based on data from RefSeq, GenBank, CCDS and UniProt… includes both protein-coding and putative non-coding transcripts… Compared to RefSeq, this gene set has generally about 10% more protein-coding genes, approximately five times as many putative non-coding genes, and about twice as many splice variants”

Let’s examine a few examples online…

Basic Genome Browser Interface
A few more tracks worth mentioning:
• miRNA (Genes and Gene Prediction Tracks -> sno/miRNA)
• Conservation (Comparative Genomics -> Conservation)
• Expression tracks
• Regulation tracks (chromatin structure and modifications, DNA methylation, etc.; includes ENCODE data)
• RNA secondary structure (Genes and Gene Prediction Tracks -> EvoFold)
• SNPs (Variation and Repeats -> SNPs)

Basic Genome Browser Interface
• Browser window to an image file:
• Get genomic DNA sequence for the viewed coordinates:
• Convert to a different genome assembly or genome:

Detecting Alternative Splicing Events
Via Human mRNAs & Spliced ESTs tracks (mRNA and EST Tracks)
“The mRNA track shows alignments between human mRNAs in GenBank and the genome”
“…alignments between human expressed sequence tags (ESTs) in GenBank and the genome… ESTs are single-read sequences, typically about 500 bases in length”
[Figure labels: gene; DNA; mRNA]

Detecting Alternative Splicing Events
Via Alt Events track (Genes and Gene Prediction Tracks) – based on UCSC genes
[Figure labels: gene; DNA; cassetteExon >]

Detecting Alternative Splicing Events
Via Burge RNA-seq track (Expression)
Click on the track name to choose tissues
[Figure labels: gene; DNA; Burge RNA-seq]

Different Alternative Splicing Types
• Exon skipping
• Alternative splice site (3’)
• Intron retention

Chromatin Organization & Epigenetics
• Histone Modifications
• Transcription Factor Binding
• DNA Methylation

BLAT
BLAT = BLAST-Like Alignment Tool
BLAT is designed to find sequences with 95% or greater similarity for DNA and 80% or greater similarity for protein
[Figure labels: BLAT query; BLAT query]

PCR
[Figure labels: coordinates; strand; primers; amplicon in FASTA format; amplicon; temperatures]

Galaxy
“Galaxy allows you to do analyses you cannot do anywhere else without the need to install or download anything. You can analyze multiple alignments, compare genomic annotations, profile metagenomic samples and much much more...”

Galaxy – What Is It Good for?
• Getting the best out of UCSC
• Operating on UCSC data
• Supports operations at both the interval level and the sequence level
• Designed for biologists!

Galaxy – Typical Workflow
• Extract sets of coordinates: either upload them from your computer or get them from the UCSC Table Browser
• Operate on different sets of coordinates (intersect, subtract, etc.)
• Fetch the genomic sequences of the resulting coordinates
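
To make the "intersect" and "subtract" operations in the workflow above concrete, here is a minimal Python sketch of what they mean for two sets of coordinates. This is not Galaxy's implementation: the `Interval` type, the function names and the example coordinates are all hypothetical, and intervals are assumed to be 0-based and half-open, as in BED files. Overlapping results are not merged, and the third workflow step (fetching sequences) is not covered.

```python
from collections import namedtuple

# Hypothetical interval type: 0-based, half-open [start, end), as in BED files.
Interval = namedtuple("Interval", ["chrom", "start", "end"])


def intersect(a_list, b_list):
    """Return the parts of intervals in a_list that overlap intervals in b_list."""
    out = []
    for a in a_list:
        for b in b_list:
            if a.chrom != b.chrom:
                continue
            start, end = max(a.start, b.start), min(a.end, b.end)
            if start < end:  # keep only non-empty overlaps
                out.append(Interval(a.chrom, start, end))
    return sorted(out)


def subtract(a_list, b_list):
    """Return the parts of intervals in a_list not covered by any interval in b_list."""
    out = []
    for a in a_list:
        pieces = [(a.start, a.end)]
        for b in b_list:
            if b.chrom != a.chrom:
                continue
            remaining = []
            for start, end in pieces:
                if b.end <= start or b.start >= end:
                    remaining.append((start, end))  # no overlap: piece survives whole
                    continue
                if start < b.start:
                    remaining.append((start, b.start))  # part left of b survives
                if b.end < end:
                    remaining.append((b.end, end))  # part right of b survives
            pieces = remaining
        out.extend(Interval(a.chrom, s, e) for s, e in pieces)
    return sorted(out)


# Made-up coordinates, for illustration only.
exons = [Interval("chr1", 100, 200), Interval("chr1", 300, 400)]
repeats = [Interval("chr1", 150, 320)]

print(intersect(exons, repeats))  # exon parts that fall inside repeats
print(subtract(exons, repeats))   # exon parts that fall outside repeats
```

With these made-up inputs, the intersection is chr1:150-200 and chr1:300-320, and the subtraction leaves chr1:100-150 and chr1:320-400. In Galaxy the same operations are applied to whole datasets, for example coordinate sets exported from the UCSC Table Browser.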