Exam III, Functional Genomics 2010

1. You’re a researcher in the employ of a pharmaceutical company. Your assignment is to create a new energy drug. Because many of the easy paths in the discovery and marketing of new energy-boosting supplements have already been taken, you are looking to think outside the box, so to speak, when you come across a paper titled “Metagenomic Analysis of the Human Distal Gut Microbiome.” Your heart begins to flutter as you realize that this could be the big break you’re looking for. You notice the extreme level of enrichment in the 2-methyl-D-erythritol 4-phosphate (MEP) pathway. Being the smart researcher you are, you remember that there are about 25,000 known compounds involved in that pathway, everything from vitamins to amino acids, which are involved in a multitude of human homeostatic functions. You hatch a plan to exploit this hidden and little-known realm of human/microbial biology. A rough hypothesis is formed which involves targeting not so much the human as the microbes in the development of your new drug. Unfortunately, you and everyone else know far too little about the gut flora to even begin designing drugs that would supplement it in any meaningful manner. So you call up 176 of your closest colleagues and hatch a grandiose plan to be the first to generate a complete model system of the human gut flora. You decide you need to perform experiments which go way beyond the current level of analysis and begin to characterize the flora of the gut as a huge symbiotic organ. You forget all about the drug, and your new goal in life is a large-scale, multidimensional analysis of the entire gut region. Good thing you’ve just been given clearance to spend 1 billion dollars on this project, because you’re going to need it. To begin, you need a very thorough analysis of the metabolic functions and community composition of the gut microbes in your samples, one that goes beyond stool sample composition and looks at the entire gut region.
What sequencing methods do you use for the phylogenetic (community composition) and metabolic analyses? Be specific. What challenges are faced by these methods? Your goal is to perform a “complete” multidimensional analysis, so what sampling methods should be employed, i.e. location, timing, subjects?

2. A happy little family has a nasty history of predisposition for heart disease. A woman from this family wants to participate in whole-genome research and have her genome sequenced, but her son is opposed to her involvement. The woman is a genetics research technician and is curious about her probability of developing heart disease, while the son is an artist and simply does not want to know. The researcher wants to conduct a study where he obtains genome sequences and uses them to predict the odds that the participant will develop heart disease; he is contemplating extending his research to test whether his predictions work on families. Describe the ethical concerns from the perspective of the woman, the son, and the researcher. Put yourself in their shoes and try to understand where each of them is coming from. What can each of them do/change to get around the son’s hesitation?

3. In the paper “A small-cell lung cancer genome with complex signatures of tobacco exposure,” what rearrangement was recurrent in all three small-cell lung cancer cell lines? Describe each of these three rearrangements (e.g. fusion gene of x and y) and how expression of this gene was affected in each cell line. In two of these cell lines this rearrangement resulted in the co-expression of what other gene? What is the significance of this rearrangement’s recurrence in all three cell lines?

4. Recent transcriptome analyses show that the mammalian genome contains thousands of long noncoding RNAs (ncRNAs) with unknown function. As a new graduate student at AR University, you’re pioneering the ncRNA front.
Your lab is investigating axon regeneration, and your job is to find an ncRNA involved in axon regeneration and characterize its role as much as possible in the next two years. Describe experiments you could run, and why, to i) select an ncRNA involved, and ii) identify the role of this ncRNA. Be creative and think outside the box. For example, what genomic tools might you use to confirm ncRNA function?

5. In the paper “Comprehensive genomic characterization defines human glioblastoma genes and core pathways” it was shown that the perturbation of the p53, RB and RTK/RAS/PI(3)K signaling pathways leads to the development of glioblastoma. In 74 percent of sampled glioblastoma tissues, all three pathways harbored an alteration in at least one gene. Interestingly, some genetic alterations associated with glioblastoma development are activating alterations (alterations include single nucleotide polymorphisms, insertion/deletion events and copy number alterations). Discuss how, in general, an activating alteration in a signaling pathway might increase the likelihood of that cell becoming cancerous. EGFR was an interesting example of a protein that underwent an “activating genetic alteration.” What was the significance of “focal amplifications” of EGFR, and how does this relate to the general mechanism of cancer development through an activating mutation that you previously described?

6. In the paper “Cancer Proliferation Gene Discovery Through Functional Genomics” the technique of using half-hairpin RNA (hhRNA) was described. With this method in mind, how would you use hhRNA to elucidate an unknown gene pathway in Arabidopsis?

7. Why is MS-MS a “genomic” technique if it just finds specific proteins? Suggest a novel (for this class) MudPIT experiment and explain how you would design the experiment.

______________________________________________________

8. You decide to elucidate the genes necessary for cancer cell proliferation within a specific human tissue.
You perform an hhRNA microarray analysis and observe the following results:
a) Given what you know about hhRNA microarrays, explain what the results for hhRNAs A, B, and C indicate (A = yellow, B = green, C = red).
b) In order to demonstrate your understanding of hhRNA microarrays, be sure to describe the logic behind Cy5 and Cy3 labeling of cell samples and how this plays into the results shown on the right.
c) Furthermore, this particular type of cancer is known to be associated with a complete inactivating mutation in protein p53 (a tumor suppressor protein). With this information, match up hhRNAs A, B, and C to some likely gene candidates (listed below) and explain your reasoning.
hhRNAs: A, B, C
Gene candidates: MDM2, TP53, FF, RBX1

9. You isolate a tissue biopsy from a 62-year-old woman with lung cancer caused by decades of smoking. You are unsure as to the type of lung cancer she has and decide to sequence her lung cancer cell line. You find that there is an excess of nonsynonymous mutations present (as compared to the number of mutations occurring by chance) and that many are clustered near the promoter regions of certain genes. Describe these findings and their possible implications. What does this tell you about this lung cancer line? How does it compare with small-cell lung cancer?

10. You are a researcher planning to do whole-genome analysis to find genes that may confer susceptibility to brain cancers. You are working with possible participants, and several of them have expressed a desire to maintain their right to withdraw from the study for as long as possible. You understand the participants’ desire, but two colleagues of yours from other research groups have expressed interest in future access to the sequence data from your participants. How do you design the study to accommodate both groups?

11. After reading the paper “The Human Microbiome Project” you become extremely interested in all the microbes that make up your distal gut community.
Design an experiment that would allow you to study this mixed microbial community, starting with the identification of individual microbes in the human. What temporal and spatial scales will you be looking at, and what results might be interesting for future experiments? Any limits (give 3)? What information could the use of mice give you about the interactions and the functional contributions between these intriguing microbes and the human supraorganism? (Hint: look at the supporting distal gut microbe paper to “design/outline” your experiment.)

12. You are studying the amount of point mutation due to smoking tobacco products on the cytosine (C>T, C>G, and C>A) of the commonly occurring dinucleotide pair CpG. You have the entire lung cancer genome sequenced. How would you determine whether a mutagen is preferentially mutating the cytosines at CpG sites as opposed to mutating all cytosines throughout the genome equally?

13. Microarray experiments use fluorescence to measure transcription levels of RNAs. How are microarrays able to determine differential transcription of an RNA between two different time points? Does this method work for expression of a single transcript as well? That is, if RNA.A is present at level Y in condition 1 and present at level X in condition 2, can an array experiment determine the difference?

14. In the paper “Cancer Proliferation Gene Discovery Through Functional Genomics” (Schlabach et al. 2008), the authors knock down specific genes and test whether the knocked-down cells remain viable (to identify which genes are crucial to cancer cells vs. healthy cells). To do this, they use retroviruses to deliver short-hairpin RNA sequences to the target genome, which they then detect before and after an incubation period using a PCR-based technique to develop half-hairpin barcode amplicons specific to each short-hairpin RNA target gene.
The innovation in this paper is that, unlike other shRNA methods that require sequencing a barcode to identify the gene being knocked down, this technique uses hybridization of the interfering sequence to a chip for identification. However, in amplifying this region, why don’t they see false positives due to the presence of viral genetic material or from the organism's original gene? Why would you expect to see false positives if you did not take the measures they did?

15. The problem facing today's scientific community? How to move cancer treatment beyond the carpet-bombing approach of standard chemotherapy into an era of smart bombs - using targeted therapies that destroy only the cancer. A worldwide initiative called The Cancer Genome Atlas (TCGA) is currently seeking to build a complete catalogue of all the genetic glitches that make “good cells go bad.” What exactly is TCGA, and what techniques does it employ in its quest to one day complete this cancer genome catalogue? When it came to understanding glioblastomas, the first cancer TCGA studied, what were the three main techniques that played major roles in the discovery of previously unknown genes associated with glioblastomas? Describe these techniques, the three newly discovered genes, and the pathways they affect. In terms of pathway discovery, why were these connections so novel, and what do they mean as a whole for the future of cancer treatments?

16. In the paper “Comprehensive genomic characterization defines human glioblastoma genes and core pathways,” many genes were implicated in the interruption of three major cell signaling pathways. What effects on the pathways did the authors find? Explain the techniques used to compare gene expression between glioblastoma cells and normal cells. What were some of the goals of this study?

17. A rival research group has recently published a paper.
They used a whole-genome tiling microarray to measure levels of transcription in developing neurons every 12 hours of development and have claimed a high degree of positive correlation between several genes and antisense noncoding transcripts that appear to overlap them at the 5’ ends. The coding genes have been previously characterized as markers of neurodevelopment and are expressed only at the fore of developing axons. The group then performed in situ hybridization to demonstrate that the noncoding transcripts colocalize with the coding genes. To illustrate the importance of the associated noncoding transcripts, they developed a knockout of the first exon of one of the identified genes, which entirely deletes the promoter sequence of the noncoding partner, and showed that in the knockout, neurodevelopment is significantly altered morphologically. Genetically, the noncoding transcript is not expressed and the coding gene expression fails to turn off at the normal time in development. With this, they conclude that overlapping antisense transcripts activate the miRNA silencing pathway.

18. Genetic privacy is a major ethical consideration in the public release of whole-genome sequence data. Define the components of genetic privacy in relation to an individual’s genome, along with ways in which each of its components can be infringed on.

19. A mouse breeder claims that he has the best mice: they are easy to take care of, can be fed anything without weight gain or negative effects, and have better general health. The breeder has noticed that only females have been passing on these traits. Having exhausted Mendelian genetics, he hires you to find out a few things: what is causing this phenotype, and why only females can pass it on. Explain your hypothesis, planned testing methods and techniques.

20. Smoking allows more than 60 mutagens to attach and chemically bind to DNA, causing modification.
Explain how DNA distortion occurs (and how it can vary), along with an explanation of how the cell tries to fix the error caused by the mutagen. Also list one interesting fact that you learned from the presentation.

21. Suppose Venter placed a patent on his methods of doing whole-genome shotgun sequencing. If you were lead researcher of the Human Microbiome Project and couldn’t use the whole-genome shotgun sequencing method because Venter’s lab placed a patent on the method and Venter simply hates your guts, what other experimental methodologies would you be able to propose to your team to try to analyze the human microbiome? What would be the advantages and disadvantages of your proposed idea?

22. In the cancer genomics paper that used half-hairpin RNA, which genes were used as controls and why were they used?

23. New developments in sequencing have allowed for the analysis of the human metagenome. Your team is trying to find the most basic / common microbiome composition. You have two bacteria in mind that you believe everyone has in their microbiome. Explain how you would attempt to find a minimal microbiome for all humans.
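
Note for question 23: one way to make “minimal microbiome” concrete is to treat it as the set of taxa detected in every subject, or in at least some chosen fraction of subjects. The Python sketch below is only an illustration of that idea and not an expected answer. The function name `core_taxa`, the `min_prevalence` parameter, and the sample data are all hypothetical, and it assumes presence/absence calls per subject have already been made.

```python
from collections import Counter


def core_taxa(samples, min_prevalence=1.0):
    """Return taxa detected in at least `min_prevalence` (0-1) of subjects.

    `samples` maps a subject ID to the set of taxa detected in that subject.
    """
    if not samples:
        return set()
    # Count, for each taxon, the number of subjects in which it was detected.
    counts = Counter(taxon for taxa in samples.values() for taxon in set(taxa))
    threshold = min_prevalence * len(samples)
    return {taxon for taxon, n in counts.items() if n >= threshold}


# Made-up presence calls for three subjects (illustration only, not real data).
samples = {
    "subject_1": {"Bacteroides", "Faecalibacterium", "Methanobrevibacter"},
    "subject_2": {"Bacteroides", "Faecalibacterium"},
    "subject_3": {"Bacteroides", "Prevotella"},
}

strict_core = core_taxa(samples)        # taxa found in every subject
relaxed_core = core_taxa(samples, 0.6)  # taxa found in at least 60% of subjects
print(sorted(strict_core))
print(sorted(relaxed_core))
```

Relaxing `min_prevalence` shows how sensitive the “core” is to the definition chosen; a real analysis would also have to account for sequencing depth and detection limits, which this sketch ignores.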