Assign #5

BIO 224 Laboratory
Oct 4 & 6, 2010
Gene Expression: Lab Assignment 5
(due Wed, October 13th; 15 pts)
1. Go to the UniGene site ( ) and answer the following questions.
A. Examine the Homo sapiens entry (click on the name). Why does Homo sapiens
have so many clusters that have been identified? If it is projected that there are approximately
35,000 protein-coding genes within the genome, why would there be such a discrepancy as to the
number of clusters that have been identified?
B. Find your favorite organism on the UniGene homepage and click on the species
name. Describe the type of information you find in the entry. In addition, explain the “Histogram
of cluster sizes” table listed for your species (note: cluster size is on the left, and # of clusters is
on the right side).
C. Find the UniGene accession number entry that corresponds to your assigned human
mRNA sequence. Record it below (note: you can either search the UniGene database directly or
find its cross-listing within the UniProt entry you worked with in the last assignment).
D. What types of information did you find regarding Gene Expression?
E. Specifically, what did you find out about where this gene is expressed?
(Note: don't use the GEO profile to answer this question, since it will be addressed in a separate
question, Q4.)
F. For the EST profile, explain the information that is listed for the tissue that has the
highest expression (e.g. what do the numbers mean?). Why does this provide only relative
expression estimates?
G. How many mRNA sequences and EST sequences have been documented for your
protein in this database?
H. For the EST sequences, examine the first three listed that are from different cDNA
libraries and answer the following:
a. How long was the sequence read? (Sequence length)
b. From what tissues did these libraries originate?
c. What was the vector that the cDNA was cloned into? (hint: you need to click on
the library link to find out how it was constructed)
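As a starting point for question 1F, the short Python sketch below shows one way an EST count can be scaled by the size of its tissue pool to give a transcripts-per-million (TPM) style value. The tissue names and counts are made up for illustration, and the function name is hypothetical; it is not part of UniGene.

```python
def transcripts_per_million(gene_ests: int, pool_ests: int) -> float:
    """Scale a gene's EST count by the total number of ESTs in the tissue pool."""
    if pool_ests <= 0:
        raise ValueError("pool_ests must be positive")
    if not 0 <= gene_ests <= pool_ests:
        raise ValueError("gene_ests must be between 0 and pool_ests")
    return gene_ests * 1_000_000 / pool_ests


# Hypothetical (gene EST count, total ESTs in pool) pairs, not real UniGene data.
pools = {"liver": (25, 200_000), "brain": (40, 1_000_000)}

tpm = {tissue: transcripts_per_million(g, n) for tissue, (g, n) in pools.items()}
for tissue, value in sorted(tpm.items(), key=lambda item: -item[1]):
    print(f"{tissue}: {value:.0f} TPM")
```

Notice that in this made-up example the tissue with the larger raw count (brain) ends up with the lower scaled value, because its pool contains many more ESTs. Keep this in mind when you explain why the profile gives only relative estimates.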
2. Go to the Mammalian Gene Collection ( ) and answer the following questions.
A. Find the full-length MGC clone for your assigned human mRNA and record the IMAGE
ID #. (note: if more than one exists, record the total number of clones and select only one for
the next questions)
B. From what type of tissue did this sequence come (you can find this out by examining
the library link)?
C. What type of vector was used to construct this cDNA library?
D. What “universal” vector primers could you use to sequence the cDNA insert?
3. Go to the Digital Differential Display website ( ) and answer the following questions.
A. What is Digital Differential Display (DDD) based on? How are differences in
expressed genes evaluated? Why is it important to know the number of EST sequences that
have been examined within each library for the DDD analyses? (Note: you will need to go to the
following website to find out more info on DDD: )
B. Click on the "Begin DDD for" link at the top of the page and list the species Homo
sapiens. Use the information you obtained in question 1H regarding cDNA libraries from which
your ESTs were derived. Why are normalized or subtracted libraries not particularly useful in
DDD analyses?
C. Compare two different cDNA libraries that you found for the above ESTs using DDD.
What were the libraries you chose to examine (and from what tissues)?
(clip notes: Within Homo sapiens, edit library and then find your library in the list; try to look for
the tissue type and also cross-reference the library accession to make sure you have the correct
library. After choosing a name for the library, accept changes; then click “New” for a new library
and find the second library for comparison.)
D. Was your protein differentially expressed in the tissues? (Note: if your protein was not
significantly differentially expressed, it would not be listed in the results of the differential display.)
E. List the top three proteins that were differentially expressed in the comparison of
these libraries. Which library expressed significantly more of these particular proteins? Explain
what these proteins do (hint: accession numbers) and try to connect this to the differences noted
between the two libraries (meaning, try to come up with an explanation or hypothesis as to why
they might be differentially expressed given the tissues you chose).
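To help you think about questions 3A and 3D, here is a minimal Python sketch of a two-sided Fisher exact test applied to EST counts from two libraries. The DDD help pages describe a Fisher exact test as the basis for its comparisons, but this sketch is only an illustration and does not reproduce the tool's exact procedure. The function name and all counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(hits_a: int, total_a: int, hits_b: int, total_b: int) -> float:
    """Two-sided Fisher exact p-value for one gene's EST counts in two libraries.

    hits_* = ESTs assigned to the gene; total_* = all ESTs sequenced from the library.
    """
    if not (0 <= hits_a <= total_a and 0 <= hits_b <= total_b):
        raise ValueError("hits must lie between 0 and the library total")
    total = total_a + total_b
    hits = hits_a + hits_b
    denom = comb(total, total_a)

    def prob(x: int) -> float:
        # Chance that x of the gene's ESTs land in library A if expression is equal.
        return comb(hits, x) * comb(total - hits, total_a - x) / denom

    lowest = max(0, total_a - (total - hits))
    highest = min(hits, total_a)
    p_observed = prob(hits_a)
    # Sum every outcome that is no more likely than the observed one.
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical counts: the same 6:1 ratio of hits, sampled at two library depths.
p_deep = fisher_exact_two_sided(12, 1000, 2, 1000)
p_shallow = fisher_exact_two_sided(6, 500, 1, 500)
print(f"deep libraries:    p = {p_deep:.4f}")
print(f"shallow libraries: p = {p_shallow:.4f}")
```

The two comparisons use the same ratio of hits but a different number of ESTs per library. Compare the two p-values when you explain why the number of ESTs examined in each library matters.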
4. Go back to the UniGene entry for your human protein. Click on the Gene Expression
Omnibus (GEO) link within the Gene Expression section. Examine some of the entries. Pick one
of the microarray experiments listed to examine it in more detail. Study the bar graph figure to
the right of the description by clicking on it to view the results and also examine the description of
the experiment (click on the GDS??? accession number listed next to the check box).
A) Briefly describe what the scientists were testing in the microarray experiment (in
other words, the goal of the experiment; e.g. expression profiles of normal, precancer and cancer
cells). Note: often there is literature cited and you can view the abstract.
B) What were the results of the experiment with regard to your gene of interest only (i.e.
its mRNA expression level)? To answer this you will need to click on the bar graph shown to the
right of the initial GEO entry. There is a "Graph caption help" function that can assist your
interpretation of the data.