Galaxy

advertisement
UCSC Genome
Browser
Dror Hollander
Gil Ast Lab
Sackler Medical School
Understanding the Genome
promoters
genes
non-coding RNA
conservation
miRNA
How can you examine a genomic segment while
taking all of these factors into account?
histone
DNA SNPs
midifications
nucleosome
occupancy
GC content
repetitive elements
gene expression
RNA
secondary structure
exon-intron structure
alternative splicing
protein
Lecture Overview


UCSC Genome Browser

Interface & selected tracks

Detecting alternative splicing events

Chromatin organization & epigenetics

BLAT

PCR
Galaxy
UCSC Genome Browser

Basic design: “the Genome Browser
stacks annotation tracks beneath
genome
coordinate
positions,
allowing rapid visual correlation of
different types of information”
genome
track
track
(64 eukaryote
genomes)
Genome Browsing…
Basic Genome Browser Interface
chromosomal
position
genomic
coordinates
start codons in
green
zoom
mark and drag here
to zoom in
stop codons in
red
UCSC
Refseq
•Black
feature
has
a
corresponding entry in the Protein Data Bank (PDB)
Genes
Genes
intron
•Dark blue - transcript has been reviewed or validated by
either the CDS
RefSeq,
track
gene
track
SwissProt or CCDS staff
direction
•Medium blue - other RefSeq transcripts
•Light blue - non-RefSeq transcripts (> / <)
UTR
Basic Genome Browser Interface

Configure track visualization:
Basic Genome Browser Interface

“RefSeq track shows known human protein-coding and
non-protein-coding genes taken from the NCBI RNA
reference sequences collection (RefSeq). The data are
updated daily”

a few
examples
online…
“TheLet’s
UCSCexamine
track shows
gene
predictions
based on data
from RefSeq, Genbank, CCDS and UniProt… includes both
protein-coding and putative non-coding transcripts…
Compared to RefSeq, this gene set has generally about
10% more protein-coding genes, approximately five times
as many putative non-coding genes, and about twice as
many splice variants”
Basic Genome Browser Interface

A few more tracks worth mentioning:






miRNA (Genes and Gene Prediction Tracks ->
sno/miRNA)
conservation (Comparative Genomics ->
Conservation)
Expression tracks
Regulation tracks (chromatin structure and
modifications, DNA methylation, etc.; includes
ENCODE data)
RNA secondary structure (Genes and Gene
Prediction Tracks -> EvoFold)
SNPs (Variation and Repeats -> SNPs)
Basic Genome Browser Interface

browser
to
an
Get genomic
DNAwindow
for
viewed
Convert
sequence
to athe
different
image file:
coordinates:
genome
assembly or genome:
Lecture Overview


UCSC Genome Browser

Interface & selected tracks

Detecting alternative splicing events

Chromatin organization & epigenetics

BLAT

PCR
Galaxy
Detecting Alternative Splicing Events

Via Human mRNAs & Spliced ESTs
tracks (mRNA and EST Tracks)

“The mRNA track shows alignments between
human mRNAs in GenBank and the genome”
gene DNA

mRNA
“…alignments between human expressed
sequence tags (ESTs) in GenBank and the
genome… ESTs are single-read sequences,
typically about 500 bases in length”
Detecting Alternative Splicing Events

Via Alt Events track (Genes and Gene
Prediction Tracks) – based on UCSC genes
gene DNA
cassetteExon
>
Detecting Alternative Splicing Events

Via Burge RNA-seq track (Expression)

gene DNA
Burge
RNA-seq
Click on the track name to choose
tissues
Different Alternative Splicing Types

Exon skipping

Alternative splice site (3’)

Intron retention
Lecture Overview


UCSC Genome Browser

Interface & selected tracks

Detecting alternative splicing events

Chromatin organization & epigenetics

BLAT

PCR
Galaxy
Histone Modifications
Transcription Factor Binding
DNA Methylation
Lecture Overview


UCSC Genome Browser

Interface & selected tracks

Detecting alternative splicing events

Chromatin organization & epigenetics

BLAT

PCR
Galaxy
BLAT


BLAT
query
BLAT
query
BLAT = Blast-Like Alignment Tool
BLAT is designed to find similarity of
>95% on DNA and >80% for
protein
Lecture Overview


UCSC Genome Browser

Interface & selected tracks

Detecting alternative splicing events

Chromatin organization & epigenetics

BLAT

PCR
Galaxy
PCR
coordinates
strand
primers
amplicon
in fasta
format
amplicon
temperatures
Lecture Overview


UCSC Genome Browser

Interface & selected tracks

Detecting alternative splicing events

Chromatin organization & epigenetics

BLAT

PCR
Galaxy
Galaxy

“Galaxy allows you to do analyses
you cannot do anywhere else
without the need to install or
download
anything.
You
can
analyze
multiple
alignments,
compare genomic annotations,
profile metagenomic samples and
much much more...”
Galaxy – What Is It Good for?

Getting the best out of UCSC



Operating on UCSC data
Supports operations both at the
interval level, and at the sequence
level
Designed for biologists!
Galaxy – Typical Workflow

Extract sets of coordinates



either upload from computer or from
UCSC table browser
Operate on different sets of
coordinates (intersect, subtract
etc.)
Fetch genomic sequences of
coordinates
Download