What is systems biology?
Being a mathematician in a biologist’s and a bioinformatician’s world
Zofia Jones, PhD Mathematical Sciences

Outline
• Intro
• Mass action / ODE example.
• Metabolic modelling example.

A mathematical biologist is… by definition interdisciplinary.
My main motivation: a chance to use a broader range of mathematical skills than I could in physics, with fewer rules.
(but that’s just me, lots of people love biology for its own sake)
However, progress is somewhat dependent on effective collaboration and communication.

First: What is the computational focus of this group?
Something like…
• Identification of species at OTU level.
• Fitting diversity distributions to hypotheses – rarefaction, neutral theory with immigration from one or more metacommunities
• Accounting for diversity using environmental variables.
• Lots of bioinformatics – de-noising, assembly, visualisation, phylogenetic trees, identification
• Metagenomics – takes the bioinformatics challenge up a notch in complexity and looks at function as well as phylogeny

But more about diversity than function
But to understand diversity we need to ask how microorganisms are competitive
And for that we need to understand how they function
How does the available energy in the environment translate to fitness?
• What are the principles underlying diversity?
• Then match to data.

The big goals
• Pharmaceuticals £££
• Alternative fuels – bacteria to produce ethanol from waste biomass
• Bioremediation, phytoremediation, mycoremediation
• Use microorganisms to grow building materials and cellulose based “plastics” – check out BioMason, Ecovative…
• Want to engineer specific metabolic pathways and their efficiency

Scientific Method - hypothesis and evidence
Deduction / Induction
Need predictive models
• on a cell scale
• and/or on a community / ecological scale
This covers a LOT of science and expertise

What do predictive models do?
• Need to integrate information on metabolic pathways, regulation, kinetics...
• See if we can reproduce on a computer what we observe in experiments
• Predict growth/no growth, specific pathways, coexistence, inhibition factors
• Help plan experiments.
• Help save money.
• Help save time.
• (Inspire funding)

The most common tasks of systems biologists when modelling a standard gene regulatory network - an example systems biology model

Figure 2. Network models, derived from the heuristic MIM shown in Figure 1, for simulation.
Detailed diagram - lots of interactions. Lots of research and biologists’ input went into this - many years of work.
Kim S, Aladjem MI, McFadden GB, Kohn KW (2010) Predicted Functions of MdmX in Fine-Tuning the Response of p53 to DNA Damage. PLoS Comput Biol 6(2): e1000665. doi:10.1371/journal.pcbi.1000665 http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1000665

• Research similar models with elements similar to yours.
• Make assumptions on rate constants - group them by type, using evidence of relative magnitude
• Write down some equations - here we have simple mass action (rate proportional to concentration) - though more complicated with less information
• Make assumptions on initial conditions
• Fit to experimental data - some well developed data here
• Sensitivity analysis – how sensitive are outputs to parameters?

Figure 10. Prediction of late response to DNA damage. (Kim et al. 2010)
Make some predictions on dynamics - qualitative statements are best, e.g. oscillation or decay?
Give this back to the biologists.

Figure 7. Bifurcation diagram of the effects of MdmX on p53 oscillatory behavior. (Kim et al. 2010)
So some stability analysis - where in parameter space do these behaviours occur?

What might you think of when you think about mathematical modelling?
Experimental biologists are pragmatic people…
• Something complicated or time consuming.
• Memories of tedious maths lessons at school.
• Michaelis-Menten enzyme kinetics.
• Don’t know where to start.
• Something to be used sparingly for specific questions or problems.
• There are interdisciplinary courses available.

Actually need a lot of qualitative knowledge.
I spend a lot of time reading the small print.
This is the real research bottleneck for numerical people in biology.
I miss sums … Bad luck Charlie Brown.

What I think of …
• Asking lots of questions
• What is the science I need to learn?
• What is the experimental evidence?
• What do we want to explain?
• What is the fundamental cause/effect behind these results and discussion papers?
• How can I account for these relations quantitatively?

What I think of …
• I need to know a lot about the biology: details on what drives syntrophy, evolutionary trade-offs, information on primary and secondary metabolites, phylogeny of enzymes, thermodynamic limitations, thermodynamic gradients, co-regulation, regulation of pH, limiting factors, summary of open questions, good experimental data.
• Need specific, precise information and clearly expressed ideas and theories.
• I need to know a lot about the relevant mathematics which fits this biological information.
• The maths is not driven by complexity or difficulty but relevance.
• ODEs, graph theory, flux balance analysis with further constraints….
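
A minimal sketch of the mass action step described earlier (rate proportional to concentration), followed by a crude sensitivity check. The reaction A + B <-> C, the rate constants kf and kr, the initial concentrations and all function names are illustrative assumptions, not taken from the Kim et al. model. It only uses the Python standard library.

```python
# Toy mass-action model for A + B <-> C.
# Species, rate constants and initial conditions are made up for illustration.

def mass_action_rhs(state, kf, kr):
    """Return (dA/dt, dB/dt, dC/dt) under mass-action kinetics."""
    a, b, c = state
    forward = kf * a * b   # rate proportional to reactant concentrations
    reverse = kr * c       # first-order dissociation of C
    net = forward - reverse
    return (-net, -net, net)

def rk4_step(state, dt, kf, kr):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = mass_action_rhs(state, kf, kr)
    k2 = mass_action_rhs(shifted(state, k1, dt / 2), kf, kr)
    k3 = mass_action_rhs(shifted(state, k2, dt / 2), kf, kr)
    k4 = mass_action_rhs(shifted(state, k3, dt), kf, kr)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def simulate(initial, kf, kr, dt=0.01, steps=5000):
    """Integrate from the initial conditions and return the final state."""
    state = initial
    for _ in range(steps):
        state = rk4_step(state, dt, kf, kr)
    return state

def sensitivity(initial, kf, kr, rel=0.01):
    """Relative change in final C per relative change in kf (finite difference)."""
    base = simulate(initial, kf, kr)[2]
    bumped = simulate(initial, kf * (1 + rel), kr)[2]
    return ((bumped - base) / base) / rel

final = simulate((1.0, 1.0, 0.0), kf=2.0, kr=1.0)
kf_sensitivity = sensitivity((1.0, 1.0, 0.0), kf=2.0, kr=1.0)
print("final A, B, C:", final)
print("sensitivity of C to kf:", kf_sensitivity)
```

For these assumed values the system relaxes to a steady state rather than oscillating, which is the kind of qualitative statement to hand back to the biologists. Fitting to data would replace the hand-picked kf and kr.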
Again…
• Need specific, precise information and clearly expressed ideas and theories. (and biology is big on vagueness … )
• The maths is not driven by complexity or difficulty but relevance and context.
• Should be helpful and accessible to both mathematicians and biologists specialised in that area.
• If it is confusing, the modellers haven’t done their job.

Let’s look at some infographics

“The Virtuous Cycle”
• to constrain models to be more precise
• infer regulatory networks
• better kinetic data, information on interactions
• to put the bigger picture theories on evolutionary biology or cell function to the test

Modern research: different people involved in different steps.
• metabolic reconstruction
• mass action kinetics
• directed graphs
• microarray
• transcriptomics
• diversity data
• environmental variables

E.g. one lady, Jeanette Johnson
• Who works with mathematicians in Oxford on diffusion problems.
• Said that she often finds things in her experiments which the modellers can then explain in theory.
• And they can predict things which are then found in experiment.
• So that’s the ideal situation.

Requires lots of skills and lots of people…
Requires communication, teamwork and patience.
After all, each cell can be viewed as a tiny computer with a core program modified by experimental evidence.

Scientific Deduction / Induction
Use your imagination!
Need predictive models
• on a cell scale
• and/or on a community / ecological scale
This covers a LOT of science and expertise

Another example: Metabolic Modelling
i.e. only need the genome to get the stoichiometry of the network, plus some estimates of parameters, which can be provided by KBase. Then can improve annotation as required.

Solution of metabolic model ==
• Net flux at each node = 0
• Flux is concentration x rate
• Extracellular source terms for substrates etc.
• Sink terms for biomass.
• Assign a vector to be optimised e.g. biomass flux
• Standard linear optimisation (linear programming) problem.
• Many alternative solutions are usually possible.
• Finding the biologically meaningful ones…

How to build a metabolic model
• Genome: get it, annotate it.
• All you need is databases.
• Get it: sequencing then assembly, NCBI, arCOG
• Initial annotation: curated genomes NCBI, RAST
• Additional annotation: comparative genomics with MAUVE, literature review

• Draft a model: join the dots and create an SBML file.
• SEED or KBase – same software, different packages.
• SEED: slow and over-subscribed.
• KBase: command line and faster.
• Draft built on RAST annotation, then can add additional missing reactions manually.

• Edit model: add or delete reactions
• Fit model to growth or no growth data.
• This data is usually specified by choice of media or gene deletions.
• False growth positive requires more gaps.
• False growth negative requires more reactions.
• Can be done efficiently on KBase.
• Put model in paper.

Some KBase Commands
kbws-login zofia1 -p l******1
kbws-url http://kbase.us/services/workspace_service
kbws-workspace glasgow
kbws-listobj
kbws-url http://kbase.us/services/fba_model_services
kbfba-getmedia acetate_minimal glasgow -e >> acetate_minimal.txt
annotate_genome contigs.fasta concilii
kbfba-buildfbamodel concilii methanosaeta
kbfba-gapfill methanosaeta -m acetate_minimal
= gapfilled model
Curate using the gapfilling results.
= good enough model

Metabolic models are mainly used to show we have a correct understanding of the network via growth / no growth data.
Metabolic model building can be used to check understanding as we make guesses about which pathways are present or active in a given environment.
• Set constraints to reflect presence/absence of substrates.
• Run flux balance analysis to get steady state solution.
• See which pathways are active in solution.
• Use solution to help interpret transcriptomics, microarray data, metagenomics or PCR.
• Adjust understanding of organism or model as necessary.
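
Flux balance analysis itself needs a linear programming solver. As a lighter stand-in, the sketch below uses a purely topological producibility check (not FBA): starting from the media, it repeatedly fires any reaction whose substrates are all available, and predicts growth if every biomass precursor becomes reachable. The reaction names, metabolites, media and biomass list are invented for illustration and are not real biochemistry or KBase output.

```python
# Topological growth / no-growth check on a toy network.
# This is a reachability sketch, not flux balance analysis:
# it ignores stoichiometric coefficients and steady-state constraints.

def producible(reactions, media):
    """Return the set of metabolites reachable from the media."""
    available = set(media)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions.values():
            # A reaction can fire once all of its substrates are available.
            if set(substrates) <= available and not set(products) <= available:
                available |= set(products)
                changed = True
    return available

def can_grow(reactions, media, biomass_precursors):
    """Predict growth if every biomass precursor can be produced."""
    return set(biomass_precursors) <= producible(reactions, media)

# Hypothetical reactions: name -> (substrates, products)
toy_reactions = {
    "R1_acetate_activation": (["acetate", "ATP"], ["acetyl_CoA"]),
    "R2_energy": (["acetate"], ["ATP"]),
    "R3_amino_acids": (["acetyl_CoA", "ammonium"], ["amino_acids"]),
    "R4_lipids": (["acetyl_CoA"], ["lipids"]),
}
biomass = ["amino_acids", "lipids", "ATP"]

# Choice of media: with and without a nitrogen source.
grows_on_acetate = can_grow(toy_reactions, ["acetate", "ammonium"], biomass)
grows_without_nitrogen = can_grow(toy_reactions, ["acetate"], biomass)

# Gene deletion: remove one reaction and test again.
knockout = {k: v for k, v in toy_reactions.items() if k != "R4_lipids"}
grows_after_knockout = can_grow(knockout, ["acetate", "ammonium"], biomass)

print(grows_on_acetate, grows_without_nitrogen, grows_after_knockout)
```

Comparing such predictions with observed growth on each medium or knockout is the same logic as the fitting step: a false growth negative suggests a missing reaction, a false growth positive suggests a reaction that should not be there. A real model would get fluxes and yields from FBA instead.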
How to use a metabolic model
• Check understanding of topology: the most complicated bit
• Done mainly by referring to growth or no growth data.
• Databases are light on info on archaea and non-pathogenic microorganisms.
• Difficult if microorganisms can only grow on a limited range of media – fewer experimental options.
• Challenge to choose the correct edits.

• Curate detail: learn about your pet organisms
• Some detail to add: proton/electron ratio = getting the net ATP produced correct
• Some detail to add: accurate rate of ATP production -> accurate growth rate. Minimal ATP requirement. ATP requirement increases linearly with growth.
• Some detail to add: biomass composition
• Some detail to add: thermodynamics: is a reaction reversible or not?

• Add kinetics: model works out yield
• Just add Michaelis-Menten kinetics for carbon and nitrogen sources.

• Use to integrate omics data
• Transcriptome, ribosome, proteome, metabolome …
• Various statistical methods to try.
• Need money and experimental expertise.
• Ask what is necessary for a specific question.

• Search and brain-storm good questions.
• Then find ways of testing them in silico
• There is no set way of doing this.

Visual starting point - Cytoscape
• Ask theoretical questions – hurrah!
• What is being optimised – ATP production, efficiency, flux minimisation? What is the optimisation trade-off?
• What does the structure of the metabolism imply? Can we infer regulatory networks?
• Does more modularity indicate robustness?
• Can many similar networks (genotypes) produce a similar phenotype?
• What role does thermodynamic buffering play?
• Simplification – which pathways are responsible for the bulk of growth?

Complex or just Complicated
• “Complexity” arises from the application of fundamental principles.
• But are there fundamental principles in biology?
• After we exhaustively make lists and databases of what we know, is it just complicated?
• Or is it just something in between where principles do exist, but in a less black and white way?

Elementary Flux Modes
• A unique path through a network
• Form a basis set to all other paths
• Typically millions – computationally expensive or impossible - meaningless?
• Find the k shortest EFMs
• Look for EFMs which connect two points of interest
• Prune reactions with low flux values
• Can then test co-regulation and yield predictions with transcriptomics or microarray data
• Check understanding of what is responsible for yield
• Find ways to improve yield via gene deletions etc.

Scientific Deduction / Induction
Requires lots of skills and lots of people…
Requires communication, teamwork and patience.

What do predictive models do?
• Need to integrate information on metabolic pathways, regulation, kinetics...
• See if we can reproduce on a computer what we observe in experiments
• Predict growth/no growth, specific pathways, coexistence.
• Help plan experiments.
• Help save money.
• Help save time.

Please ask if you’re interested in constructing KBase metabolic models. Tutorial coming soon!

Mathematical biologist?
I spend a lot of time reading the small print and reducing what it says to modellable parts.
You need to know your system to model your system.
I miss sums … Bad luck Charlie Brown.