Systems and Synthetic Biology 499A
5th Jan 2009
Herbert Sauro
hsauro@u.washington.edu
www.sys-bio.org

This course is about networks: The Science and Engineering of Biological Networks

The world is full of networks
• Electronic
• WWW
• Road
• Social
• Biological

Metabolic Networks
About 1000-1400 genes code for metabolic enzymes in E. coli (out of a total of about 4300 genes).

Protein-Protein Networks
Protein Signaling Network: CellDesigner, Kohn MIMs
20% of the human protein-coding genes encode components of signaling pathways, including transmembrane proteins, guanine-nucleotide binding proteins (G proteins), kinases, phosphatases and proteases.

Genetic Networks
Gene Regulatory Networks: BioTapestry
Example: Ventral Neural Tube in Vertebrate Embryo

Genetic Units
Understanding the Dynamic Behavior of Genetic Regulatory Networks by Functional Decomposition. William Longabaugh and Hamid Bolouri. Curr Genomics. 2006 November; 7(6): 333–341.

Hybrid Network: Cell Cycle Control in Bacteria

Two Kinds of Representations
1. Non-Stoichiometry – or ball and stick networks
   No stoichiometry, kinetics or mass conservation
   Cytoscape: Ball and Stick
2. Stoichiometry – reaction maps
?? – Stuff that people make up; who knows what they really mean

Network Classification
Networks
• Stoichiometric: Elementary, Non-Elementary
• Non-Stoichiometric: Probabilistic, Ball and Stick (data dependent)

Systems and Synthetic Biology
[Diagram labels: Systematic Biology, Network Physiology, Synthetic Biology; Top Down, Bottom Up; Systems Biology, Synthetic Biology]

Top Down and Bottom Up
Top Down "-omics"
• System: whole cell
• Model: statistical correlations
• Data: high-throughput
Bottom Up "mechanistic"
• System: networks/pathways
• Model: mechanistic, biophysical
• Data: quantitative, single-cell
[Figure: Yeast Protein-Protein Interaction Map]

Genomes
Smallest Genome – as of 1999
[Figure annotation: Single Gene]
One of the smallest genomes: Mycoplasma genitalium (small parasitic bacterium)

Smallest Genome
Total genes: 521
Protein coding genes: 482
tRNA and rRNA: 39
This genome is of interest to synthetic biology because Craig Venter wants to use this organism as the basis for a minimal organism for genetic engineering. Venter's group has removed roughly 101 genes and the organism is still viable. The idea then is to patent the minimal set of genes required for life.
PNAS (2006) 103, 425-430

Gene Function
The complexity of simplicity. Scott N Peterson and Claire M Fraser. Genome Biol. 2001;2(2):COMMENT2002. Epub 2001 Feb 8.

But the real prize goes to…
The 160-Kilobase Genome of the Bacterial Endosymbiont Carsonella. Atsushi Nakabachi, Atsushi Yamashita, Hidehiro Toh, Hajime Ishikawa, Helen E. Dunbar, Nancy A. Moran, and Masahira Hattori (13 October 2006) Science 314 (5797), 267.
Endosymbiont: an organism that lives inside the cells of another organism.
Carsonella is a symbiont of the sap-sucking psyllids, or 'jumping plant lice'.
~182 genes

Prokaryotic Cells: E. coli
1. Bacteria lack membrane-bound nuclei
2. DNA is circular
3. No complex internal organelles
[Figure: E. coli cell, 2-3 µm long. Source: http://www.ucmp.berkeley.edu/bacteria/bacteriamm.html]
[Figure source: http://atlas.arabslab.com]

Comparison to Eukaryotic Cells
[Figure source: http://www.cod.edu/people/faculty/fancher/ProkEuk.htm]

E. coli Cytoplasm
Average spacing between proteins: 7 nm/molecule
Diameter of a protein: 5 nm
[Illustration: David S. Goodsell (Scripps)]

E. coli Statistics
Length: 2 to 3 µm
Diameter: 1 µm
Generation time: 20 to 30 min
Translation rate: 40 aa/sec
Transcription rate: 70 nt/sec
Number of ribosomes per cell: 18,000
Small molecules/ions per cell:
   Alanine: 350,000
   Pyruvate: 370,000
   ATP: 2,000,000
   Ca ions: 2,300,000
   Fe ions: 7,000,000
Data from:
http://bionumbers.hms.harvard.edu
http://redpoll.pharmacy.ualberta.ca/CCDB/cgi-bin/STAT_NEW.cgi
[Illustration: David S. Goodsell (Scripps)]

E. coli Statistics
E. coli has approximately 4300 protein coding genes.
Protein abundance per cell:
   ATP-dependent helicase: 104
   LacI repressor: 10 to 50 molecules
   LacZ (galactosidase): 5,000
   CheA kinase (chemotaxis): 4,500
   CheB (feedback): 240
   CheY (motor signal): 8,200
   Chemoreceptors: 15,000
Glycolysis
   Phosphofructokinase: 1,550
   Pyruvate kinase: 11,000
   Enolase: 55,800
   Phosphoglycerate kinase: 124,000
Krebs Cycle
   Malate dehydrogenase: 3,390
   Citrate synthase: 1,360
   Aconitase: 1,630
Source: Protein abundance profiling of the Escherichia coli cytosol. Ishihama et al. BMC Genomics 2008, 9:102.

E. coli Statistics
E. coli has approximately 4300 protein coding genes.
Molecule Numbers in Prokaryotes:
1. Ions: millions
2. Small molecules: 10,000 – 100,000
3. Metabolic enzymes: 1000 – 10,000s
4. Signaling proteins: 100 – 1000s
5. Transcription factors: 10s to 100s
6. DNA: 1 – 10s
Source: Protein abundance profiling of the Escherichia coli cytosol. Ishihama et al. BMC Genomics 2008, 9:102.
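
The per-cell counts on the E. coli Statistics slides are easier to compare with kinetic constants once they are converted to concentrations. The Python sketch below does that arithmetic. It assumes, which the slides do not state, that the cell can be treated as a plain cylinder 2 µm long and 1 µm in diameter; the function names are illustrative only.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def cell_volume_litres(length_um, diameter_um):
    """Volume of a cylinder with the given dimensions, in litres.

    Assumption: the cell is approximated as a cylinder (no rounded caps).
    """
    radius_um = diameter_um / 2.0
    volume_um3 = math.pi * radius_um ** 2 * length_um
    return volume_um3 * 1e-15  # 1 cubic micrometre = 1e-15 litres


def molar_concentration(molecules, volume_litres):
    """Convert a molecule count in a given volume to mol/L."""
    return molecules / (AVOGADRO * volume_litres)


if __name__ == "__main__":
    volume = cell_volume_litres(length_um=2.0, diameter_um=1.0)
    # Per-cell counts taken from the statistics slides.
    counts = {"single molecule": 1, "LacI (upper bound)": 50, "ATP": 2_000_000}
    for name, count in counts.items():
        print(f"{name}: {molar_concentration(count, volume):.3e} M")
```

Under these assumptions a single molecule per cell corresponds to roughly 1 nM, and 2,000,000 ATP molecules to roughly 2 mM. A longer cell (3 µm) would lower both figures by about a third.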
Circular Chromosome in E. coli
Most prokaryotic DNA is circular.
Genes are located on both strands of the DNA. Genes on the outside are transcribed clockwise and those on the inside anticlockwise.
E. coli's genome is 4,639,221 base pairs, coding for 4472 genes, of which 4316 code for proteins.
   Proteins: 4316
   tRNAs: 89
   rRNAs: 22
   Other RNAs: 64

Circular Chromosome in E. coli
88% of the E. coli genome codes for proteins; the rest includes RNA-coding regions, promoters, terminators etc.
In contrast, the human genome has 3,000,000,000 base pairs and about 25,000 genes. Only 2% of the human genome codes for proteins. The rest is… an RNA regulatory network?
Human genes are also segmented into exons and introns; alternative splicing significantly increases the actual number of proteins.

EcoCyc: http://ecocyc.org/

Preview Model – Repressilator
A Synthetic Oscillatory Network of Transcriptional Regulators. Michael B. Elowitz and Stanislas Leibler. Nature. 2000 Jan 20;403(6767):335-8.
[Figure, built up over several slides: the tetR gene produces TetR, the lambda cI gene produces CI, the lacI gene produces LacI, and the gfp gene produces GFP (fluorescence). Legend: gene, protein XYZ, degradation.]

Preview Model
[Figure: rate v against repressor level for Hill coefficients n = 1, 2, 4, 8, shown for TetR production from tetR under repression by LacI]
This is an empirical model of gene expression.
Rate law annotations: basal rate (leakage), Hill coefficient, maximal rate (Vmax). In the notation of the script below these are a0, n and a in v = a0 + a/(1 + R^n), where R is the level of the repressor.

Preview Model
// Repressilator: three repressors, each degraded linearly
p = defn cell
   TetR -> $w; k1*TetR;
   LacI -> $w; k2*LacI;
   CI -> $w; k3*CI;
   // Production: basal rate a0 plus a Hill-type repressible term
   $s -> TetR; a0 + a/(1+LacI^n);
   $s -> CI; a0 + a/(1+TetR^n);
   $s -> LacI; a0 + a/(1+CI^n);
end;

p.a0 = 0.1;
p.k1 = 0.127;
p.k2 = 0.116;
p.k3 = 0.08;
p.a = 1;
p.n = 8;

m = p.sim.eval (0, 450, 1000, [<p.Time>, <p.TetR>, <p.LacI>, <p.CI>]);
graph (m);

The next ten weeks…
Week 1: Basic Prokaryotic Biology and Genetics
Week 2: Kinetic rate laws
Week 3: Kinetics of gene expression; mass-balance; computer simulation; steady state
Week 4: Stability, stochastic systems
Week 5: Synthetic Biology and experimental techniques
Week 6: Networks, feedback and feed-forward systems
Week 7: Oscillators, bistability, filters, band detectors etc.
Week 8: Protein networks, design patterns, modularity
Week 9: Structural characteristics of networks
Week 10: Presentations
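
For readers without the simulator used on the Preview Model slides, the same three equations can be sketched in plain Python. This is a re-expression under stated assumptions, not the course's own tool: the parameter values are copied from the slide script, but the initial concentrations (all zero), the fixed-step fourth-order Runge-Kutta integrator, the reading of the script's third simulation argument as a number of output points, and every function name are choices made here for illustration.

```python
def hill_repression(repressor, a0=0.1, a=1.0, n=8):
    """Production rate a0 + a/(1 + R^n): basal leakage plus a repressible term."""
    return a0 + a / (1.0 + repressor ** n)


def derivatives(state, k1=0.127, k2=0.116, k3=0.08):
    """Rates of change of (TetR, LacI, CI), following the slide script."""
    tetr, laci, ci = state
    return (
        hill_repression(laci) - k1 * tetr,  # TetR is repressed by LacI
        hill_repression(ci) - k2 * laci,    # LacI is repressed by CI
        hill_repression(tetr) - k3 * ci,    # CI is repressed by TetR
    )


def rk4_step(state, dt):
    """Advance the state by one classical Runge-Kutta step of size dt."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    s1 = derivatives(state)
    s2 = derivatives(shifted(state, s1, dt / 2))
    s3 = derivatives(shifted(state, s2, dt / 2))
    s4 = derivatives(shifted(state, s3, dt))
    return tuple(
        x + dt / 6 * (d1 + 2 * d2 + 2 * d3 + d4)
        for x, d1, d2, d3, d4 in zip(state, s1, s2, s3, s4)
    )


def simulate(t_end=450.0, n_points=1000, substeps=10, initial=(0.0, 0.0, 0.0)):
    """Return rows of (time, TetR, LacI, CI).

    Assumption: all species start at zero; the slides give no initial values.
    """
    interval = t_end / (n_points - 1)
    dt = interval / substeps
    state = initial
    rows = [(0.0,) + state]
    for i in range(1, n_points):
        for _ in range(substeps):
            state = rk4_step(state, dt)
        rows.append((i * interval,) + state)
    return rows


if __name__ == "__main__":
    for row in simulate()[::100]:
        print("  ".join(f"{value:8.3f}" for value in row))
```

Plotting the three concentration columns against time gives the Python counterpart of the script's graph call. The sketch only integrates the equations; whether and how the traces oscillate depends on the parameters and on the assumed starting point.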