Functional Genomics, Final 2009
1. In 2008, the Venter team successfully assembled a complete synthetic
Mycoplasma genitalium genome. However, they stopped short of transplanting the
synthetic genome into a donor cytoplasm to spark the flame of synthetic life.
a) Given that existing technology only allows efficient synthesis of 5-7 kb
oligonucleotides, how is it possible to synthesize a genome from scratch?
b) Compare and contrast the genomes of 1) wild type M. genitalium, 2) M.
genitalium JCVI-1.0, and 3) the theoretical minimal M. genitalium derived genome.
c) Could a synthetic Mycoplasma genome be “sparked” to life using an E. coli donor
cytoplasm? Why or why not? According to Venter, how might this have been an
advantage in his research?
2. Assuming RNAi can be delivered successfully into a cell of interest, what is one
potential problem with using RNAi to seek the function of a novel gene? How can
this potential problem be used to the researcher's advantage?
3. RNA interference is an excellent technique for targeted genetic
studies, especially in organisms that do not readily perform homologous
recombination with exogenous DNA. It has other advantages as well, but also
disadvantages that impede its use as a research tool and as a potential
therapeutic technique. What are some of these advantages and disadvantages?
4. How can microarrays be used to do comparative analysis of transcripts?
Why would you want to do such an experiment?
5. What is the basic RNA-seq methodology? How can it improve on microarray
technology for measuring transcriptome expression levels?
6. In “The complete genome of an individual by massively parallel DNA sequencing”,
why did the experimenters find it surprising that Watson was homozygous
for a four-base-pair deletion in SGEF? What was the researchers' explanation for this,
and what did the 4-bp deletion ultimately suggest about Watson’s genome
and about the use of reference genomes in general?
7. Please explain the molecular technique the researchers in the paper “The
complete genome of an individual by massively parallel DNA sequencing” used to
validate the SNPs found by 454 sequencing and alignment. What did the technique
reveal about accuracy, and how did they explain the low accuracy? (Hints: Table 2
and Fig. 1b)
8. The research lab you work in has recently discovered an mRNA transcript
and the corresponding protein (PrNew) in M. genitalium. After a
comparison to the predicted gene sequences generated by OTTO, you find that
the new transcript/protein corresponds to a hypothetical gene/protein.
A) What is a hypothetical gene, and how would a program like OTTO
predict one?
B) A bioinformatics study based on gene sequence, protein sequence and
protein structure yielded high similarity scores to lactate
dehydrogenase. You think that lactate dehydrogenase and PrNew may
have complementary/compensatory functions. How could you test whether
PrNew and lactate dehydrogenase have similar functions in a cell, i.e.,
interact with similar proteins, undergo similar reactions, etc.?
C) During a gene essentiality study, both are deemed nonessential as single
knockouts. Can these data be extrapolated to consider them
nonessential overall, in any circumstance? Detail an experiment to back
up your statement.
9. After graduating from Western you enter the job market and are hired by a
biotechnology firm that has just received a hefty Gates Foundation grant to
develop new treatment strategies for influenza. The hope is to be better
prepared for the anticipated pandemic when avian influenza subtype H5N1,
commonly referred to as “Bird Flu”, mutates and becomes more transmissible
and lethal to humans. The H5N1 genome has been sequenced, along with
other less pathogenic strains of influenza that commonly infect humans. Your
lab has infected tissue samples available. Your research group leader believes
that determining the H5N1 protein interaction network and comparing it to
other influenza interactomes may be of assistance in developing a treatment
strategy. How would you go about doing this? After the interaction networks
have been determined, how would you go about analyzing and comparing
them?
10. You are working with a microorganism and have found a gene that has not
previously been described. Curious about its function, you set out to study
it. What method can you use to study this gene? You do not know whether it is
an essential gene, and this microorganism is known for its high rate of nonhomologous recombination.
11. In the study about the next-generation sequencing of Watson’s genome, a
reference genome was used to find SNPs, CNVs, and indels. Why was a
reference genome necessary? Was the reference genome sufficient? Why or
why not?
12. In terms of pharmacogenomics, how will the Human Genome Project help a
subset of the population? What particular subset of the population will the data
from the Human Genome Project benefit and why?
13. In “ . . . Cloning of a Mycoplasma genitalium Genome” assembly of the genome
was carried out in two stages. The first stage was in vitro assembly of synthetic
cassettes and the second in vivo assembly by recombination in yeast. These two
methods employed the use of different cloning vectors.
What initially prompted the use of two different vectors? Please discuss the process
of assembly in both cases, making note of similarities as well as differences between
the two vectors used. Also, include the following terms in your discussion:
-NotI restriction sites
-T4 polymerase
-3’ exonuclease
-Recombination ‘hooks’
14. Explain three ethical issues addressed in the paper “The complete genome of an
individual by massively parallel DNA sequencing”. Explain how each of these issues was
addressed for Watson’s genome sequence and how this is relevant to
sequencing the genomes of the general public.
15. Based on what was presented on transcriptomics by Todd, Mark, and Tom: Suppose
you were given a microarray that was correlated with a mutant phenotype
(meaning expression levels were measured on a +phenotype and a -phenotype, then both
placed on a microarray to measure relative expression levels), and it was done over
time. First, how would you try to extract meaningful data from the array? Second,
what are the problems of doing this (include general problems of any microarray
study), and how would you try to solve them? Third, how would you test/study
this information about the expression levels to find cause and effect?
16. In the paper “Gene expression profiling predicts clinical outcome of breast
cancer”, the expression profiles of genes associated with cancerous tissue are used
to predict disease outcome. Approximately 25,000 human genes are screened using
microarray technology. Compare and contrast the benefits of microarray and RNA
sequencing for use in researching the transcriptome of cancer tissue.
17. Explain what a candidate gene is and how candidate genes can be identified using at least
two techniques we have learned in class. Why do candidate genes have such a high
failure rate in terms of discovering adverse drug reactions?
18. Using examples from the presentation, why is it important that
pharmaceutical companies expand their drug trials into a wider range of
racial backgrounds? How will advances in the sequencing of the human
genome aid these trials, and why might pharmaceutical companies not be
inclined to expand their studies?
19. You are working with the CDC on a virus with a very broad host range,
including most eukaryotic cells. You’ve identified a gene on the virus that,
when altered, attenuates virulence. Your hypothesis is that the viral product
of this gene interacts with host protein(s). In fact, preliminary data show
that there is one interacting protein in your host model organism. You
perform bioinformatics and learn that it is part of a large gene family with
unknown function. Further, the gene family, though ubiquitous, has <20%
amino acid similarity. You need to write a grant to get funding.
a. Convince the grant panel that you have a Plan A to find host interacting
proteins, with confirmation. And a Plan B.
b. For the second part of the grant, you ask for money to sequence many, many
viral genomes that have known hosts (the gene sequences of the hosts are
already known.) Why? What will you do with the viral sequences?
Hint: Why might this particular host protein family have low amino acid
similarity?
20. What are the barriers to and benefits of pharmacogenomics and how are
companies working to make personalized healthcare a reality for all (or at
least those who can afford it)?
21. In the network article the researchers have tested and eventually predicted
meaningful protein networks by discovering protein-protein interactions in
Kaposi’s sarcoma-associated herpesvirus (KSHV), varicella zoster virus (VZV) and
their interactions in the human protein network. What benefit(s) could result from
making this network?
22. The authors discussed three types of genetic variation present in Watson’s
genome. Choose two of these types of variation for discussion. Touch on the
methods of identification and validation. Also include a summary of the
analysis, relevance, and possible shortcomings and/or future possibilities for
this type of variation.