UCSC Genome Browser Activities
The purpose of this set of activities is to familiarize you with the UCSC Genome Browser. By the end of
this set, you will have learned how to use:
- the UCSC Genome Browser to look at different tracks for genes and other genome
  characteristics
- the Table Browser to extract and manipulate data
- Genome Browser tools such as BLAT
1. CTCF is an evolutionarily well-conserved protein that is involved in multiple cellular processes. Using
the Feb. 2009 human genome assembly (hg19), complete the following tasks:
a. identify the location of the gene CTCF (isoform 1, RefSeq annotation).
b. examining the graphical output on the browser, identify and describe any RefSeq isoforms that
are readily identifiable.
c. identify the length of the mature mRNA of the shortest isoform.
d. using the table browser (notice the tools link at the top of the browser page) and dbSNP 138,
identify how many common SNPs are at the CTCF locus.
e. create a BED file of the SNPs.
f. identify the number of SNPs that intersect with transcription factor (Txn Factor) binding sites
identified using ChIP-seq.
g. identify the location of the SNP with ID rs72140612.
h. identify the number of SNPs that intersect with the mature mRNAs produced by this locus.
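Tasks 1e, 1f, and 1h come down to writing intervals in BED format and counting overlaps between two sets of intervals. The following is a minimal Python sketch of that idea, not a substitute for the Table Browser's own intersection feature. The coordinates, SNP names, and function names are hypothetical and are not real CTCF data. It shows BED's 0-based, half-open coordinates and one way an intersection count could be computed.

```python
# Hypothetical SNPs and features; all coordinates are illustrative only.
# BED uses 0-based starts and half-open ends, so a single-base SNP at
# 1-based position 101 is written as start=100, end=101.
snps = [
    ("chr16", 100, 101, "snpA"),
    ("chr16", 250, 251, "snpB"),
    ("chr16", 900, 901, "snpC"),
]
features = [
    ("chr16", 50, 150),   # e.g. an exon or a ChIP-seq peak
    ("chr16", 800, 900),  # ends at 900, so it does not contain snpC
]


def to_bed_line(chrom, start, end, name):
    """Format one interval as a tab-separated four-column BED line."""
    return f"{chrom}\t{start}\t{end}\t{name}"


def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def count_intersecting(snps, features):
    """Count SNPs that overlap at least one feature on the same chromosome."""
    count = 0
    for chrom, start, end, _name in snps:
        if any(
            chrom == f_chrom and overlaps(start, end, f_start, f_end)
            for f_chrom, f_start, f_end in features
        ):
            count += 1
    return count


bed_text = "\n".join(to_bed_line(*snp) for snp in snps)
n_hits = count_intersecting(snps, features)
print(bed_text)
print("SNPs intersecting a feature:", n_hits)
```

Each SNP is counted once even if it overlaps several features, which matches the question "how many SNPs intersect" rather than "how many overlaps exist".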
2. Obtain the amino acid sequence for the largest CTCF isoform and BLAT it against the mouse genome
(mm10) to find the mouse homolog. (Hint: scroll down a bit after clicking on the graphic of the isoform
in the browser and look for the predicted protein.)
a. What is the percent identity of the best match?
b. Go to the mouse browser and identify the top mouse mRNA there.
c. How many SINE elements are in intron 2 of the mouse homolog? Remember, you’ll need to
figure out what the orientation of the gene is. (Hint: to determine the orientation, click on one
of the mRNAs. ‘+’ or ‘-’ indicates the strand. Genes on the + strand are read from left to right;
genes on the - strand are read from right to left.)
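For task 2c, intron numbering follows the direction of transcription, not the left-to-right position on the browser. The following is a minimal Python sketch of how the strand changes which gap between exons counts as intron 2. The exon coordinates and the function name are hypothetical; they are not the real exons of the mouse homolog.

```python
def introns_in_transcript_order(exons, strand):
    """Return introns as (start, end) pairs, numbered in transcription order.

    exons: list of (start, end) genomic intervals, half-open.
    strand: "+" (read left to right) or "-" (read right to left).
    """
    ordered = sorted(exons)
    # An intron is the gap between two neighbouring exons on the genome.
    introns = [
        (ordered[i][1], ordered[i + 1][0]) for i in range(len(ordered) - 1)
    ]
    if strand == "-":
        # On the minus strand the rightmost intron is transcribed first.
        introns.reverse()
    return introns


# Hypothetical five-exon gene; coordinates are illustrative only.
exons = [(100, 200), (300, 400), (500, 600), (700, 800), (900, 1000)]

intron2_plus = introns_in_transcript_order(exons, "+")[1]
intron2_minus = introns_in_transcript_order(exons, "-")[1]
print("intron 2 on + strand:", intron2_plus)
print("intron 2 on - strand:", intron2_minus)
```

With the same exons, intron 2 is the second gap from the left on the + strand and the second gap from the right on the - strand, so the SINE count must be taken in the region that matches the gene's actual orientation.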
3. Go back to the CTCF locus in the human genome. There are 33 GO annotations in three categories
associated with the gene.
a. What is the second GO annotation (include its ID number)?
b. What is the definition of that function found on the page that links from the ID number?
c. Does this protein have a zinc finger domain?
d. Using information from the Comparative Toxicogenomics Database (CTD),
determine whether the gene interacts with acetaminophen.
e. Using information from microarray expression data, determine whether this gene is expressed in the
thymus.
f. If so, is that expression higher or lower than what is found in skeletal muscle?