The plant biologists’ goals are to…
> communicate desired outcomes of visualization
> generate tools that unite data sets
> address the genotype<>phenotype conundrum - G2P
> enable new insights into processes that are rendered in a visually intuitive, accessible form

Investing Images of Biological Processes with Added Meaning
• Events occurring in time and space, in three dimensions.
• Relationships between different levels of biological organization/data types.
• Transforming images by populating them with quantitative, dynamic data.

Questions Plant Biologists Ask
• 1. What chains of changes occur in a plant over time and space in response to alterations in the environment or “signals”? Drought, cold, light.
• 2. What internally “programmed” chains of changes in a plant occur over time and space? Flowering, getting old, beginning photosynthesis.
• 3. How are these events regulated and what interactions occur between 1 and 2?
That’s why we need more sophisticated, and quantifiable, images.

Overarching Goal and Difficulty
• Goal: To find dependable ways to unite the data available at various scales into coherent pictures for particular plants in particular contexts. Generalizations/hypotheses may follow.
• Difficulty: There are no complete data sets, and the mechanistic bases for the data that are available are not fully understood. Intelligent guesses are the norm! We want to create a context in which those guesses can become more accurate.

The Current Terminology
• The term “plant biology” is used to avoid categorization into scales and modes of measuring the data.
• A unifying goal of the plant biology community is to reveal and make clear the underlying cellular events (“mechanisms”) that give rise to a particular outcome (a “phenotype”).
• This is ultimately proved by establishing a causal connection between “genotype” and “phenotype”. Not easy.

Genes and Genomes
• Genes contain information. They are encoded in the molecule DNA (deoxyribonucleic acid).
• Genomes contain all the DNA in an organism. Different information is accessed under different circumstances (“readout”).
• Not all of that DNA is in the form of genes.
• The ancestral relationships of organisms are manifest in the relatedness of their genomes.

Cells and Genomes
• All organisms are composed entirely of cells, or the products of cellular activity.
• With some exceptions, all cells contain a complete copy of that organism’s genome.

Diagram of a Plant Cell
Plant cells have internal compartments or organelles.
Cytoplasm – site of translation.
Each organelle serves a distinct function and is surrounded by a barrier or membrane.
Nucleus – majority of the plant genome.
Chloroplast – site of photosynthesis. (Tom will show data about photosynthesis.)
The plant genome is divided among the nucleus, the chloroplast and the mitochondrion. Readouts from the three organelles are coordinated.

Genes and Genomes 2
• No two genomes are identical, even within a species. (Genetic variation)
• Each genome is also a historical document, illustrating past evolutionary processes. We cannot completely read the document, however.

Genotype, Phenotype?
• A fundamental distinction has to be made between all the genes possessed by an organism (Mendel’s “heritable units”), its “genotype”, AND
• The characteristics that it exhibits, or its appearance, its “phenotype”. Phenotype is the manifestation of a particular readout of the genome.

“Phenotype”
“Phenotype” means appearance. It originally meant “appearance” as seen by the human eye of the observer, e.g. the surface of Mendel’s pea seeds or fruit fly eye color. Nowadays it can also mean any manifestation of genome readout, many of which are observed with the help of contemporary, high-throughput technology, as “read” by computational tools.

Plants Have A Special Problem
“I’m out of here!” versus “Running away is not an option”

External and Internal Signals
Sorry, Tom, you’ll have to pretend it’s a corn plant!
Signal is perceived at the cellular level. Genome responds.
Figure labels: Signals; Light; One kind of readout; Internal Developmental Signals.
Buchanan, Gruissem, Jones, 2000, Biochem. Mol. Biol. of Plants (BGJ)

Scales Over Which Data Are Read
• 1. Whole fields of plants, or whole plant behavior (Crop Science). (Steve W., Jeff White, Jim Jones)
• 2. Individual tissues, e.g. roots or leaves (Plant Physiology). (Steve W., Siobhan Brady)
• 3. Cell types, e.g. mesophyll cells or bundle sheath cells. (Tom B.)
• 4. The genome, transcriptome, proteome, metabolome (the language of “omics”). (Nick P., Tom B., RG, Siobhan B., all of us, ultimately)
• Note: Individual genes come in different “flavors”, called alleles, giving rise to slightly different mRNAs, and slightly different proteins, but still with the same function.

The Language of “omics”
“omics” alludes to the study of the totality of information associated with defined levels of biological organization. (This classification is way over-simplified.)
“Genomics” – genes, structure/“landscape” of the genome. A constant for any given organism.
“Transcriptomics” – transcripts. Variable.
“Proteomics” – proteins. Variable.
“Metabolomics” – metabolites. Variable.
“Phenomics” – phenotypes. Variable.

Different Views of Genomes
• 1. Some people want to know about the history of genomes, how they came to have a particular structure, and, from that, how different organisms are related. (Eric L.)
• 2. Others want to know how the information in the genome (“genotype”) is variously read so as to produce a specific organism that grows, reproduces, and responds to internal and external signals (“phenotypic plasticity”). (RG, TB, SW, NP, BU)

Genome Function
Would that it were that simple! The readout cannot really be described as simply proceeding from the first level of transcription because there are many other points of control.
Complete information
Readout from the genome
Different genes are expressed under different conditions.
Different proteins are present under different conditions.
Translation from the language of DNA/RNA into protein.
Proteins: the workhorses of the cell.

Harder Solutions
• The era of “blame it all on one gene/allele” is coming to an end, although it still holds in some cases.
• We now know that any particular phenotype may be the outcome of a network of interacting genes. (Actually, the readout from those genes.)
• That interaction may be manifested between genes or protein molecules, and always at a specific location and under a specific condition within the plant cell.

Different Levels of Genome Regulation
The amount of mRNA available for translation can be varied.
The amount of protein can be varied.
The protein can be chemically modified in different tissues.
Figure label: mRNA.

Functions of Proteins
• The proteins that are produced perform functions such as
– Catalyzing chemical reactions that would not otherwise go on in the cell, producing “metabolites”.
– Acting as structural components of the specialized membranes in the cell.
– Causing the activation of particular genes by binding to their promoter region (“transcription factors”).
– Interacting with each other in a “signal cascade”.
• For each of these categories, “There’s a database for that”, but they are not yet fully integrated, nor fully visualized. Nick will show us a tool that integrates databases.

Transcription Factors (TFs)
DNA binding proteins often interact with each other.
The TFs bind to sequences that are specific for the gene.
The expression of groups of genes is often coordinated.
Much active work in analyzing patterns of gene expression in response to given stimuli (Nick, RG, and many others).
“Promoter region”: Gene is transcribed to mRNA after proteins have bound to adjacent DNA, the “promoter”.

External Signals
Signal acts on the genome through TFs.

A “Signal Cascade”
Transmission of a signal, from, say, the periphery of a cell to the nucleus, where a particular gene(s) is then activated.
This mechanism involves the physical interaction of proteins with each other. Databases of known protein-protein interactions are available. (Nick’s work on “The Interactome”)

Simplified carotenoid biosynthetic pathway in plants
Example of a metabolic pathway.
Caps and bold: enzymes.
Tom and his colleagues studied the effect of genetic variation in maize on the levels of the nutrient compounds in red. The variation is due to small differences in the genes that encode these enzymes.
C. E. Harjes et al., Science 319, 330-333 (2008). Published by AAAS.

Metabolic Pathways
Many metabolic pathways involve physical interactions among enzymes.
Figure label: Stress hormone.

Metabolic Pathways: “MapMan”
• Bjoern will show us a tool for the depiction of metabolic pathways in plants, and for pasting gene expression values onto those pathways. In MapMan, genes are “binned” into pathways.

What is Missing
• A way to integrate these phenomena so that the user can visually follow known events from, say, the appearance of the signal all the way to the phenotype, passing through the corresponding chain of events.

A Crossroads for Plant Biology
There are models and then there are models.
“…models bring together two of biologists’ favorite tools: abstraction and a love of drawing. Often they indulge in both of these passions by sketching down simplistic models, which may include five arrows, a factor X, and two question marks.
What has worked for decades to illustrate interactions among a few genes, has clearly reached its limits in the postgenomic era where research is no longer data-limited.”

Figure labels: Sensing; Signaling; Salinity; Drought; Temperature; Ionic; Responses; Stress-specific; Homeostatic Adjustment; Osmotic; Temperature*; Growth Control/Development; Injury Status***; Cell Death; Detoxification**; Damage Control & Repair; TOLERANCE or AVOIDANCE.
SOSs, DREBs, CBFs, NACs [important stress response transcription factors] are affected by and themselves affect ROS- and redox-dependent pathways.
* protein unfolding, membrane leakage, water/ion imbalance; ** distinct pathway, not specific for a particular stress; *** overlapping pathways.
According to Zhu, 2001; Bohnert & Bressan, 2001; Grene & Bohnert, 2009.

The Study of Genome Function is
• 1. We only know the full sequence of a few genomes.
• 2. Even for those in which the sequence is known, we do not have a full picture of how the genome is “read”.
• 3. Valid generalizations about genome structure and genome readout can be made, but, mostly, we cannot identify all the “working parts” for any organism.
• 4. Some biologists draw “models”, which usually means a static pictorial summation of what we know about a given phenomenon. No ODEs or stats involved, unless you are Steve Welch and the like.

MYB96 mediates ABA signals via RD22 in regulating drought stress response
An early response to drought is the synthesis of ABA.
The TF activates this gene.
When drought occurs, ABA is made, causing the TF gene to be expressed.
ABA is a plant hormone, produced under stress.
Figure labels: Expression of this TF; Phenotype.
Seo, P. J., et al. Plant Physiol. 2009;151:275-289

A “Model” of ABA-Dependent and Independent Stress Responses in Arabidopsis
Inhibition by these proteins.
TFs binding to the promoter of this gene.
Everything beyond expression of this gene is a black box.
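
The MYB96 slide above describes a chain of events from signal to phenotype. As a minimal sketch of how such a chain could be stored so that software can follow it, the Python below writes the steps down as a small directed graph and traces one path through it. The node names paraphrase the slide; the dictionary layout, the `trace` helper, and the reading of "this gene" as RD22 are assumptions made for illustration, not part of any tool mentioned in the talk.

```python
from collections import deque

# Edges paraphrase the MYB96 slide. Reading "this gene" as RD22 is an assumption.
SIGNAL_GRAPH = {
    "drought": ["ABA synthesis"],
    "ABA synthesis": ["MYB96 expression"],
    "MYB96 expression": ["RD22 activation"],
    "RD22 activation": ["drought stress response"],
    "drought stress response": [],
}

def trace(graph, start, goal):
    """Breadth-first search: return one path from start to goal, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

path = trace(SIGNAL_GRAPH, "drought", "drought stress response")
print(" -> ".join(path))
```

A static drawing fixes one such path on the page; a graph like this could, in principle, be extended with timing, location, and measured quantities on each edge.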
A “Model” for Temperature Responses in Arabidopsis
Grey ovals contain TFs that respond to more than one stress.
(ABA, GA, auxin, ethylene are plant hormones.)
The rectangles contain generalized allusions to phenotypes.
The arrow leading to each rectangle is a type of hand-waving.
Time and space are hard to decipher here. Can we improve on this?

Wish List: Layering
• A system of layers, where the user can mine a given set of data/“layer” (repositories of gene expression, metabolite data, protein-protein interactions), and view those data in a cellular context in 3D. A suite of genes/proteins/metabolites could be visualized dynamically, showing changes over time. Layers could be fused at will, within a temporal or spatial context.
• The user could upload his/her own data.
• All data would be linked to other available information, for example, orthologs in other species, published diagrams of signaling pathways, what is known about regulation of the genes, proteins or pathways of interest…

Each Sub-Discipline has its Own “Center of the Universe”, e.g. Fields, Tissues, Cells, Genomes
Figure labels: Manhattan, NY; The Flyover; everybody else.

Rallying Cry
Unite the languages; enable a view that yields quantitative data, from fields of chilly plants, to cells where signaling cascades operate to activate defense genes, to altered metabolism, to chilling-tolerant plants.

Bjoern Presents His Data

A “Model” of the Regulation of ABA Signaling by AREB1 (a TF)
Figure label: Stress Response.
Fujita, Y., et al. Plant Cell 2005;17:3470-3488

Genome Readout 3
• Production of mRNA and its regulation.
• Production of protein and its regulation.
• Production of functionally active protein in a specific subcellular location.
• Mechanisms for providing other factors needed for function (e.g. substrates for enzymes).
• Effect of the functioning of proteins on cell/tissue/plant function.

Making the G2P Connection
Connecting a particular allele to a particular phenotype. How is gene expression regulated?
Connecting a substring within a given chromosome to a given phenotype. Stats.
Connecting a gene network to a given phenotype. How is the network regulated?
Connecting a metabolic pathway to a given phenotype. How is the pathway regulated?
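
The simplest of these connections, an allele (or chromosome segment) to a phenotype, comes down to asking whether plants carrying different alleles differ in a measured trait. The sketch below shows one common way to put a number on that, Welch's t statistic, using only the Python standard library. The allele labels and trait values are made up for illustration, and the choice of statistic is an assumption rather than the method used by any of the speakers.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two independent samples with unequal variances."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))

# Hypothetical data: one trait value per plant, grouped by the allele it carries.
phenotype_by_allele = {
    "allele_A": [10.0, 12.0, 11.0, 13.0],
    "allele_B": [14.0, 15.0, 13.0, 16.0],
}

t = welch_t(phenotype_by_allele["allele_B"], phenotype_by_allele["allele_A"])
print(f"Welch t statistic (B vs. A): {t:.2f}")
```

A large absolute value of t suggests that the trait differs between the two allele groups. It says nothing about how the gene is regulated, which is the question the remaining items raise.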