From silicon cell to silicon human

Hans V. Westerhoff1,2, Malkhey Verma1, Frank J. Bruggeman2, Alexey Kolodkin2, Maciej Swat2, Neil Hayes1, Maria Nardelli1, Barbara M. Bakker3, and Jacky L. Snoep1,2,4

1 Manchester Centre for Integrative Systems Biology, the University of Manchester
2 Netherlands Institute for Systems Biology, VU University Amsterdam
3 Department of Paediatrics, Centre for Liver, Digestive and Metabolic Diseases, University Medical Centre Groningen, University of Groningen
4 Department of Biochemistry, Stellenbosch University

Summary

This chapter discusses the silicon cell paradigm, i.e. the existing systems biology activity of making experiment-based computer replicas of parts of biological systems. Now that such mathematical models are accessible to in silico experimentation through the world-wide web, a new future has come to biology. Some experimentation can now be done in silico, leading to significant discoveries of new mechanisms of robustness, of new drug targets, as well as to harder validations or falsifications of biological hypotheses. One aspect of this future is the association of such live models into models that simulate larger parts of the human body, up to organs and the whole individual. Reasons to embark on this type of systems biology, as well as some of the challenges that lie ahead, are discussed. It is shown that true silicon-cell models are hard to obtain. Short-cut solutions are indicated. One of the major attempts at silicon-cell systems biology, in the Manchester Centre for Integrative Systems Biology, is discussed in some detail. Early attempts at higher order, human, silicon-cell models are described briefly, one addressing interactions between intracellular compartments and a second trying to deal with interactions between organs.

Table of Contents

Summary
Introduction
    Where Systems Biology is different
    What Systems Biology?
How Systems Biology?
    Top-down Systems Biology
    The silicon cell
    Silicon cell models: advantages and disadvantages
    Blueprint modelling
    The wisdom of MOSES: domino systems biology
    Metabolic Control Analysis models
    The silicon-cell strategy in yeast
    Silicon cell and differential network-based drug design
    The true silicon cell
    Crossing the scales
    Different types of modelling
Towards the silicon human
Acknowledgements
References

Introduction

This chapter addresses how the molecular biology of cell types may be related to their cell biology, and how both of these may be related to the functioning of a multi-cellular organism. It focuses on methodologies that make realistic models. These methodologies enable the understanding of mechanism and control of function. This analysis package is comprehensive, because the authors of this chapter have invested considerable effort to make it such. Although the endocrine role of insulin production by beta cells is what the authors have in mind, this application is not made explicit, in part because too little has been done, and in part because this chapter wishes to inspire experts to have a fresh go at this. We shall first address the differences that systems biology may make. Subsequently we shall describe multiple aspects of our silicon-cell mode of systems biology. We end by speculating how the approach may lead to a true-to-life model of how a human functions through its interacting molecules.

Where Systems Biology is different

Genomics and Molecular Biology have focused on the identification of all the individual macromolecules, their inherent activities, and sometimes their interactions with immediate partners. Molecular Cell Biology has drawn schemes that indicate which macromolecules interact with which other macromolecules, either directly or indirectly.
Some of these schemes distinguish between stimulatory and inhibitory interactions. Few of them indicate the strengths of the interactions and none of them indicate how the strengths of the interactions may depend on other factors, such as concentrations of other molecules in the network or the concentrations of the interactors. Probably because of the robustness and adaptability of biological functions, the latter tend to be regulated through both positive and negative interactions. As a consequence one cannot come to understanding and predictions without assessing the strength of the interactions quantitatively. Because insufficient attention has been paid to collecting intermolecular-interaction data quantitatively and because all data relevant to a certain function have rarely been integrated into a single frame of reference, network analyses have remained qualitative and thereby speculative.

On the other hand, mathematical biology has had the tendency to abstract away from the detail and the actual, because it aimed for generic principles. Of the principles that were found, such as gradient-driven self-organization as a possible mechanism for developmental biology, specific predictions could be falsified. This made self-organization theories irrelevant in the eyes of experimental developmental biologists (Lawrence 1992; Davidson 2006; Peter and Davidson 2009). As an alternative paradigm for developmental biology, the concept of the genetic program became popular, in which the expression of one gene would lead to a protein activating the expression of the genes of the subsequent phase. Although feedback and feedforward loops are recognizable in the corresponding networks, it is not clear whether self-organization plays a role (Peter and Davidson 2009).

To understand living organisms we need to appreciate with sufficient precision how their components interact.
We need to reckon with a combination of a genetic programme that came about accidentally in evolution and mechanisms that involve self-organization. This will require integration of the historical paradigms of mathematical biology and molecular genetics (Westerhoff and Palsson 2004). It is in this integration that systems biology differs from both mathematical biology and molecular genetics, and in fact from mainstream physics and biology (Westerhoff, Winder et al. 2009). Systems biology also differs from physiology, which describes the functioning of biological systems in their entirety, without complete reference to the components. Cell physiology helps describe qualitatively how ATP levels change when muscle is innervated and why this leads to contraction. It does not explain this in a mode that predicts on the basis of changes in molecular processes.

What Systems Biology?

Systems biology has existed for more than 10 years now. Some of the low-hanging fruits have been picked. These included the discovery of interesting potential patterns of networking (Albert and Barabasi 2000) and regulation (Alon 2007) based on computational analyses of the completely sequenced genomes. However, even definitive information that two network components can interact does not certify that they actually do interact, or that the flow of mass or information between the two components is significant. A transcription factor can interact with a gene only under the, possibly rare, condition where the former is actually expressed. A metabolite for which an enzyme has a binding site may only rarely attain concentrations that exceed its binding constant in the compartment the enzyme resides in. Without dynamic information about the actual states of the living systems, conclusions about scale-free intracellular networking and about prevalent gene-network motifs for biological function are preliminary.
Understanding of network function requires the experimental determination of the kinetic or binding properties of the macromolecular components. Systems Biology should then assemble this information into a mathematical replica and calculate the fluxes. The latter should then correspond to what is measured experimentally. Lack of correspondence should be taken as a lead to discovery of new interactions or parameter values.

How Systems Biology?

Accepting the above ideal scenario for systems biology, one should translate this into something that is operational. At present this is almost impossible, because too little is known or can be measured quantitatively. In addition, some parameter values are ‘soft’, i.e. depend on intracellular conditions that are not quite known. Examples are expression levels and hence Vmax and KM values that depend on pH or even on the concentrations of other medium components (van Eunen, Bouwman et al. 2010). In addition, it is difficult to measure the properties of some enzymes, whereas it can be easier to do this for others. The strategies for systems biology have not yet been tried out. Below we shall review some such strategies, in particular the ones that relate to the silicon cell.

Top-down Systems Biology

The strategy that is closest to genomics is called top-down systems biology (Alberghina and Westerhoff 2005). Here the concentrations of all components of a certain class (mRNA, proteins, or metabolites) are measured in a genome-wide sense, as a function of time, or of conditions. The components that behave similarly are then grouped together, assuming that correlation indicates a mechanistic or functional relationship. This may then lead to the proposal that all members of a group are regulated by the same transcription factor. Such a hypothesis may then be tested by identification of that transcription factor.
It may also lead to the proposal of a temporal sequence of the action of regulatory molecules, hence to a regulatory pathway. Risks include the confounding of causes with effects, as well as the fact that regulation does not proceed through a single level of cellular organization (such as mRNA levels) but tends to involve at least gene expression and covalent modification through signal transduction, if not metabolism as well.

The silicon cell

The silicon cell approach (Westerhoff 2001; Snoep 2005) is a strong form of the so-called ‘bottom-up systems biology’. The approach has been elaborated most for metabolic pathways. It consists of isolating all the enzymes of the pathway that is studied and of determining their kinetic properties, as well as their Vmax values. The rate equations of all these enzymes are then put into a computer model, together with balance equations that give the change in time of the concentrations of all the metabolites as functions of all the reaction rates. The resulting system of equations is solved numerically for steady state or, after addition of initial conditions, for time evolution. Thus a computer replica of a biochemical pathway is created with behaviour identical to real behaviour, if the model is right.

The above approach may not seem new, but in its precise sense it is: although silicon cell type models have been made before, in many cases kinetic information was taken from databases for enzymes assayed under conditions that were not the same for all enzymes, nor corresponded to the condition in vivo. The silicon-cell models of human erythrocyte glycolysis (Rapoport, Otto et al. 1977), of T. brucei glycolysis (Bakker, Michels et al. 1997), of yeast glycolysis (Teusink, Passarge et al. 2000), and of the bacterial phosphotransferase system (Rohwer, Meadow et al. 2000) are early examples of what is close to the silicon cell approach.
Yet, some of these were imperfect because the kinetics of the pathway enzymes were determined in cell extracts rather than with purified enzymes, or the cells were derived from fairly undefined pre-culture (which was however of immediate relevance for the application, e.g. baker’s yeast). The silicon cell is a rather loose research program that is greatly stimulated by the JWS-Online modelling web site (Snoep, Bruggeman et al. 2006). JWS (short for Java Web Simulation project) is a ‘live’ model repository, from which mathematical models of biochemical pathways can be downloaded in SBML form (Systems Biology Markup Language is the model specification language through which systems biology models are exchanged between modelling platforms (Hucka, Finney et al. 2003)). The model repository is ‘live’ in the sense that the models can also be run through a web interface to JWS-Online, without downloading. A user can therefore be completely ignorant of modelling and still do experiments in silico. The models come with the standard parameter set taken from their primary publication, which should correspond to the standard physiological state. Parameter values can be altered and then the changes of concentrations and fluxes can be calculated as functions of time. In addition, systems properties such as the magnitudes and the control of steady state fluxes and concentrations can be calculated. Before acceptance of models, JWS Online checks that they reproduce simulations and calculations they express a claim to in their original publication. Because this reproduction is rarely complete, this model repository has an important function in quality control. BioModels, with which JWS collaborates, is another model repository with an even larger set of mathematical models, most of which can also be simulated in the JWS Online simulator via a direct link within BioModels. Its models have a more systematic annotation facility (Le Novere, Bornstein et al. 2006). 
The way JWS-Online is populated with models is not completely systematic yet, because there is too little funding for JWS-Online per se. Consequently, the first generation of models in JWS-Online was made by the small JWS-silicon cell community. The second generation consists of models published in the high-quality scientific journals (e.g. FEBS Journal) that became interested in the quality control aspects of JWS-Online. As part of their refereeing procedure, models in submitted papers are put into a (non-public) version of JWS, and the models are run to check that they reproduce every figure and table in the submitted manuscript. Perhaps surprisingly, this quality control mechanism finds faults with more than 90 % of the submitted manuscripts. Only if the paper is accepted does the model become part of JWS Online (unless the authors do not want it to). In addition, there are a number of models that have been contributed to JWS-Online by authors interested in getting their model used by colleagues through JWS-Online or getting their citation numbers increased. It is the second and the third modes of contribution that should become most important in the future.

In Figs 1, 3 and 5 of this chapter we give just three examples of silicon cell models. More such models are in JWS-Online, e.g. accessible through http://jjj.mib.ac.uk/index.html. When going through the models in the JWS repository the reader will find some diversity. However, she/he will also recognize that the variety of models is not representative for biology or even cell biology. This is because until now some parts of cell biology have led to more computer replicas than others, or because some authors have not submitted their models to JWS Online. Reasons for the relative abundance of metabolic and especially glycolytic models are that in metabolism the law of conservation of the elements has direct consequences: at steady state, what flows into any node of the network must be equal to what flows out.
This helps tremendously when defining the models and the associated experiments. This has led to such metabolic models being much more concrete and complete than signal transduction and gene expression models. Moreover, silicon cell models require accurate experimental data. Until recently, these were obtained either in extracts of cells, or with enzymes purified from wild-type cells. In both cases, highly active enzymes are analyzed most readily and hence the pathways that carry most flux can be approached most successfully.

Most models in JWS-Online are of the bag-of-enzymes type, i.e. they assume that enzymes convert metabolites that are present in well-defined pools and that there is no direct transfer of metabolites between enzymes, i.e. no metabolite channelling. Likewise, enzyme sequestration by binding to other macromolecules, macromolecular crowding, and active structuring are underrepresented, as are metabolic pathways that are subject to adaptation through gene-expression regulation. These issues are underrepresented, but they are not absent. In particular the silicon cell model of the E. coli phosphotransferase system (Rohwer, Meadow et al. 2000) is rich in these complications: it addresses signal transduction, transport, channelling and macromolecular crowding. Gene expression regulation and DNA structure regulating gene expression are modelled in (Snoep, van der Weijden et al. 2002).

Many models are about steady states and the approaches to steady states. However, in biology, systems with steady states as the main attractor dominate, and the rather large number of models in JWS-Online that deal with oscillations may actually over-represent oscillatory systems. They include yeast glycolytic oscillations (e.g. (Wolf, Passarge et al. 2000)), the cell cycle (Conradie, Bruggeman et al. 2010) and oscillations in NFκB signaling (Ihekwaba, Wilkinson et al. 2007).
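The silicon-cell recipe described in this section (one rate equation per enzyme, one balance equation per metabolite, numerical solution for time evolution and steady state) can be illustrated with a deliberately tiny toy pathway. The sketch below is not any of the published models: the two-enzyme pathway S -> X -> P, the irreversible Michaelis-Menten rate laws, and all parameter values are assumptions chosen only for illustration, and the fixed-step Runge-Kutta integrator stands in for whatever solver a real modelling platform would use.

```python
def michaelis_menten(vmax, km, s):
    """Irreversible Michaelis-Menten rate law (assumed kinetics for this toy)."""
    return vmax * s / (km + s)

# Hypothetical parameters, illustrative only (not measured values).
PARAMS = {"S": 5.0, "vmax1": 1.0, "km1": 1.0, "vmax2": 2.0, "km2": 0.5}

def rates(x, p):
    """Rate equations: enzyme 1 converts S (held constant) to X, enzyme 2 converts X to P."""
    v1 = michaelis_menten(p["vmax1"], p["km1"], p["S"])
    v2 = michaelis_menten(p["vmax2"], p["km2"], x)
    return v1, v2

def dxdt(x, p):
    """Balance equation for the single free metabolite: production minus consumption."""
    v1, v2 = rates(x, p)
    return v1 - v2

def simulate(x0, p, dt=0.01, t_end=50.0):
    """Integrate the balance equation with classical fourth-order Runge-Kutta steps."""
    x = x0
    for _ in range(int(round(t_end / dt))):
        k1 = dxdt(x, p)
        k2 = dxdt(x + 0.5 * dt * k1, p)
        k3 = dxdt(x + 0.5 * dt * k2, p)
        k4 = dxdt(x + dt * k3, p)
        x += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x

# Time evolution from an empty pool; long enough here to settle at steady state.
x_ss = simulate(0.0, PARAMS)
v1, v2 = rates(x_ss, PARAMS)
print(f"steady-state [X] = {x_ss:.4f}, influx = {v1:.4f}, efflux = {v2:.4f}")
```

At the end of the run the influx and efflux of X coincide, which is the steady-state mass balance. In this toy, a steady state exists only while the supply rate stays below the Vmax of the consuming enzyme; with a larger supply, X would keep accumulating.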
Silicon cell models: advantages and disadvantages

What is the advantage of having a silicon-cell type model, a ‘computer replica’, of a biochemical pathway? If perfect, such a model is just as complex as reality, hence it does not correspond to the abstraction and simplification of reality that is often associated with ‘understanding’. Mathematical biology has long made models of biological systems that aimed at this type of ‘understanding’. Why not stay with those models of mathematical biology?

Examples of such mathematical biology models include the Turing type of models which were used to show that self-organization might explain pattern formation in developmental biology (Glansdorff & Prigogine, 1972; Gierer & Meinhardt, 1972). These models each contained a simple network with positive and negative feedbacks. Using simple parameter values their predictions were calculated and shown to lead to pattern formation. Most often no attempt was made to produce a precise correspondence between simulation results and experimental data. Where such attempts were made and the predictions of the model did not fit experimental observations, the parameters in the model would be adjusted until a statistically satisfactory fit was obtained. In principle the fitted parameter values could then be verified experimentally, but this is rarely undertaken in practice: the number of parameters exceeds the number that could be determined experimentally at the required level of accuracy, or, more often, the parameters refer to abstract properties that cannot be measured directly. Even if a parameter value could be measured and was shown not to correspond to what was assumed in the model, then other parameter values would be adjusted so as to obtain a renewed fit between model prediction and experimental system behaviour. Only if such fitting proved completely impossible could the model serve the important function of falsifying a hypothesis about mechanism, but this has been rare.
More often, parameter values could be found for which the model fitted the experimental behaviour, but there was no assurance that those parameter values corresponded to reality. For instance the model would fit the data if a lower than actual Vmax was inserted for an enzyme (such as hexokinase in Teusink, Walsh et al., 1998). The fitted model would be wrong mechanistically, even though it would appear to explain the phenomenon of interest, such as pattern formation. The resulting model could still be used, but then as a phenomenological, descriptive model. Phenomenological models have a long and successful history in both physics and engineering. In physics, because of greater simplicity, subsequent experimental testing was possible and often led to reformulation in terms of a more detailed, mechanistic model, and then validation or falsification. In engineering the models were considered useful also without such validation, because the purpose of a model was the description of the behaviour of the system, not necessarily an explanation of how that behaviour was actually achieved. Most of biology is different however; it is much more complex than physics, actual detail matters (see above), and it often wishes to relate physiological behaviour of the system to its components’ properties. The latter is important for metabolic engineering and therapeutic purposes. Now we get to the answer to the question why one could not stay with the usual models of mathematical biology. The reason is that they do not enable one to validate that proposed mechanisms are actually operative in, and explanatory for, observed functional behaviour. Silicon cell models are realistic and suitable for a falsification/validation strategy. This is a prime utility of silicon cell type of models, i.e. scientific validation/falsification of proposed understanding of systems. 
Although silicon cell models do not themselves constitute understanding in the sense of simplification to what is most important, they do instantiate another type of understanding, i.e. that of the ability to predict. If the prediction fails to correspond to reality, experimental follow-up can lead to improved understanding. In other words, silicon cells are the tools that are ultimately required for the continued development of our understanding of biological systems. In addition, silicon-cell models can contribute considerably to understanding by enabling computational experiments. Complex actual mechanisms may be elucidated more readily by interrogating a computer replica of reality through computational biology, than by experimental biology. Fig. 1 illustrates how this has worked already. It shows that the silicon cell model of yeast glycolysis was rather unrobust with respect to the activity of the glucose import system; as shown in Fig. 1B only a slight increase in that activity, could lead to a ‘metabolic explosion’, i.e. to a continued increase in the concentrations of some metabolites. Because real yeast is robust in this respect, but a mutant is not, this led us to understand an aspect of the ‘turbo’ organization of many catabolic pathways that could lead to fragility and then to a hypothesis on how a regulatory interaction for which no function was known and which had not been included in the silicon cell, might be quite important for yeast glycolysis (Teusink, Walsh et al. 1998). Silicon cell models have two additional advantages. One is that their parameters are ‘hard’ in the sense that they correspond to properties of real molecules. This means that, once known, the parameter values should not change anymore unless the model is wrong, or the properties of the molecules involved change. 
Fitted, phenomenological models have the disadvantage that for every new experiment the entire model should be refitted to all existing experiments, allowing all parameter values to be adjusted so as to make the fit optimal (Novak, Csikasz-Nagy et al. 1998). For large models this can become increasingly bothersome. The second additional advantage of silicon cell models is that because they are formulated in terms of real entities, models that address adjacent parts of cell function tend to be formulated in the same terms, or in terms that can be readily translated into one another. Thereby, the silicon cell strategy should allow for the assembly of some of its models into larger models. Related to this, the silicon cell initiative furthers standardization. Many modellers like to see their models used by others in a wider context and are therefore willing to standardize them. The development of SBML (Hucka, Finney et al. 2003) is a sign of this, but the silicon cell initiative tends to go further in certain aspects. Whereas SBML is a standardization of a model description format, we aim for a standardization of model construction protocols.

Fig. 1. Non-robustness of a silicon cell for yeast glycolysis. Development in time of a number of concentrations. A: the normal state (see www.jjj.bio.vu.nl for the model (Teusink, Passarge et al. 2000)). B: the same but after increasing the Vmax of glucose uptake from 95 to 150; the concentrations of pyruvate and fructose bisphosphate fail to reach steady state.

The silicon cell strategy also has many disadvantages. One is that it requires an awful lot of careful experimentation to determine all the kinetic parameters. In addition it requires all components to be assayed, which is impossible for realistic systems, first because they contain too many components and second because there is always a component that is most difficult to isolate or assay.
A second disadvantage is that it is excruciatingly slow and not always maximally exciting. For instance, the silicon cell approach suggests that having made such a model for an organism for a particular experimental condition, one should start all over again if one is interested in a different organism or a different condition; the organism may then express different isoenzymes. However, repeating the procedure for the different condition, one may obtain the same result in terms of true understanding of function, as one had obtained for the original conditions and organism. On the other hand, quite similar organisms may have entirely different functions or mechanisms which they may achieve by differences in networking of essentially the same molecules (compare (Haanstra, van Tuijl et al. 2008) to (Teusink, Walsh et al. 1998)). This issue now leads to comparative systems biology. A third disadvantage is that until now, the actual silicon cell models have been about parts of cell function that were considered to belong together, such as metabolic pathways in their classical definition. Strategies for a more rational definition of what pathways silicon cell models should begin to focus on are being developed (Westerhoff et al., 2009).

Blueprint modelling

Blueprint modelling tries to deal with this demotivating feature of having to redo silicon cell models of related organisms and with the motivating feature of comparative systems biology. The blueprint procedure starts from the silicon-cell model that is already available of a related organism and then changes this in the light of what is already known of the molecular properties of the organism under study. Comparing the predictions of this adjusted blueprint model with physiological behaviour measured experimentally, one then prioritizes which parts of the blueprint model need to be detailed further.

The wisdom of MOSES: domino systems biology

Intracellular networks are vast and virtually completely connected.
In principle, a true silicon cell model is a model of the total expressed genome. This is impossible to achieve, at least for the foreseeable time, and one needs to start with a part of the intracellular network. Ways to divide the intracellular network into modules that can be considered separately are therefore highly important (Schuster, Kahn et al. 1993; van der Gugten and Westerhoff 1997; Hartwell, Hopfield et al. 1999; Schuster 1999).

Figure 2: Several modules linked by their consumption, production or other interactions (e.g. allosteric) with the adenine nucleotide pool. [Diagram labels: Growth, Nucleotide Synthesis, Maintenance, Glycolysis, Drug Efflux, DNA repair; adenine nucleotides ATP, ADP, AMP.]

‘Domino systems biology’ begins at a key metabolite and then uses pre-existing knowledge concerning the pathways and processes that synthesize this metabolite and the processes that consume it. It determines, by using pre-existing pathway models from silicon cell, by performing new in vitro enzyme kinetic assays, or by modular kinetic analysis (Ciapaite, Van Eikenhorst et al. 2005), how these processes depend on the concentration of the key metabolite. Starting with the most important synthesis process and the most important degradation process, it then formulates a first model with the intermediate in the middle and the two processes around it. It then predicts how activation of the processes affects the concentration of the intermediate at steady state and the fluxes, and compares this with the results of corresponding experiments. Failure of the model to predict the latter type of observations is then used to invoke either an additional process or an additional metabolic intermediate. By incorporating a next additional process or metabolite one adds the next domino stone. Micro-Organism Systems biology: Energy and Saccharomyces cerevisiae (MOSES) is a research program that develops domino systems biology for yeast.
Fig 2 shows the example for when one takes ATP as the central intermediate, which is relevant for cellular energetics. Fig. 3 shows a modelling result that comes from this approach, i.e. a perhaps somewhat paradoxical dynamic behaviour of the ATP level upon activation of the glycolytic pathway producing ATP (Somsen, Hoeben et al. 2000).

[Figure 3 plot: adenine nucleotide concentrations (mM; AMP, ATP, ADP, AXP) versus time (min), with the moment of glucose addition marked.]

Figure 3: Adenine nucleotide dynamics upon glucose perturbation, obtained by integration of glycolysis and maintenance modules. AXP = ATP + ADP + AMP.

Metabolic Control Analysis models

Another strategy to enable precise modelling does not seek to limit the network size, but to reduce the types of questions that are addressed by the model. Metabolic Control Analysis is such an approach. It only addresses the control of fluxes and concentrations, not their magnitudes. It is possible to calculate the flux and concentration control coefficients from enzyme kinetic properties called elasticity coefficients (Kacser and Burns 1973; Westerhoff and Kell 1987; Reder 1988; Westerhoff and Kell 2009). Elasticity coefficients contain limited information about the enzymes that participate in the pathway and can hence be estimated in the absence of the full information. Galazzo & Bailey pioneered this approach experimentally, using a fair number of rather precise rate equations which enabled them to calculate the elasticity coefficients, because they had measured the intracellular concentrations of some metabolites by NMR (Nuclear Magnetic Resonance) (Galazzo and Bailey 1990).
Galazzo and Bailey found that much, but not all, of the control of the flux resided in the glucose transport system, but this was partly the result of a proposed inhibition of the transporter by glucose-6-phosphate, for which there is no direct experimental evidence.

The silicon-cell strategy in yeast

Of course, an alternative to the above approximate approaches is to carry out the silicon cell agenda as completely as possible. It is indeed one of the main aims of the Manchester Centre for Integrative Systems Biology to provide a first, fully predictive, and essentially complete systems biology of the most important function of an organism, in terms of a silicon cell model. The initial strategy was to over-express and partially purify each enzyme of yeast and then to determine its kinetic and interactive properties. This approach was not efficient enough, as high-throughput kinetic assays were successful for only some enzymes. For most others the substrates were not available commercially, or the enzymes were too unstable. It was therefore decided to leave this genomics-driven strategy and to switch to a function-driven strategy, i.e. to select a function of interest, estimate which enzymes are most involved in that function, isolate and characterize those enzymes, and then make a silicon cell model (Westerhoff, Winder et al. 2009). The resulting strategy is illustrated in Fig. 4.

[Figure 4 diagram, threads: A. Experimental design and pathway finding; B. Component characterization; C. Pathway modelling; D. Validation; E. Discovery.]

Fig. 4.
The strategy of the Manchester Centre for Integrative Systems Biology (MCISB) toward a silicon cell, focusing on a single function, i.e. most of the carbon flux through the organism under study.

Silicon cell and differential network-based drug design

Most drugs have multiple effects on the patient. One reason is that their targets are parts of molecular networks that connect with other networks. The concept that drugs should be targeted at single molecules may be good for the ability to define drug action biochemically, but it will not be able to define that action biologically. For the latter definition the multiple effects of the target molecule on network performance should be understood. There have been attempts at the corresponding systems-biology driven drug targeting. One of these has used the silicon cell approach to find molecular targets for drugs against Trypanosoma brucei, the causative agent of sleeping sickness. Indeed, one of the first silicon cells was the glycolytic network of T. brucei (Bakker, Michels et al. 1997). The functional target was the ATP synthesis by the parasite. However, rather than targeting pyruvate kinase, i.e. the enzyme that makes most of the cytosolic ATP, the network was scanned for the molecule that had the greatest influence on ATP production. The glucose transporter came out as the number-one target (Bakker, Westerhoff et al. 2000).

Drug toxicity is as important as drug effectiveness. Accordingly, a drug should be maximally effective against the parasite but minimally effective against the host. A differential analysis comparing trypanosomes with human erythrocytes confirmed that the glucose transporter might be a good target, because the glucose transporter of human erythrocytes was calculated to have little control on ATP synthesis (Bakker, Assmus et al. 2002). However, the human host contains many more cell types than erythrocytes, and the drug should be ineffective against all host targets.
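The differential analysis described above compares the control that a candidate target exerts in the parasite with the control it exerts in the host. The sketch below shows the kind of calculation involved for a deliberately tiny two-step pathway (a transporter followed by one lumped consuming step). It is not the published trypanosome or erythrocyte model: the linear rate laws, the parameter sets labelled "parasite-like" and "host-like", and all function names are hypothetical.

```python
import math

# Toy two-step pathway: S_out --(transporter, e1)--> X --(lumped step, e2)--> product.
# Assumed rate laws: v1 = e1*(k1*s_out - k1r*x), v2 = e2*k2*x.

def steady_state_flux(e1, e2, k1, k1r, k2, s_out=1.0):
    """Analytical steady-state flux of the toy pathway."""
    x = e1 * k1 * s_out / (e1 * k1r + e2 * k2)
    return e2 * k2 * x

def flux_control_coefficients(params, rel_step=1e-4):
    """Estimate C = d ln J / d ln e for both steps by a small perturbation."""
    j_ref = steady_state_flux(**params)
    coeffs = {}
    for enzyme in ("e1", "e2"):
        perturbed = dict(params)
        perturbed[enzyme] *= 1.0 + rel_step
        j_new = steady_state_flux(**perturbed)
        coeffs[enzyme] = math.log(j_new / j_ref) / math.log(1.0 + rel_step)
    return coeffs

# Hypothetical parameter sets; they differ only in transporter abundance e1.
parasite_like = dict(e1=1.0, e2=1.0, k1=1.0, k1r=0.2, k2=5.0)
host_like = dict(e1=50.0, e2=1.0, k1=1.0, k1r=0.2, k2=5.0)

c_parasite = flux_control_coefficients(parasite_like)
c_host = flux_control_coefficients(host_like)
print("transporter control, parasite-like:", round(c_parasite["e1"], 3))
print("transporter control, host-like:    ", round(c_host["e1"], 3))
```

In this toy setting, a step with a high flux control coefficient in the parasite-like network and a low one in the host-like network would be the more selective target; a realistic assessment would require validated kinetic models of both networks.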
For further evaluation of drugs, silicon cells of most host tissues should be useful if not necessary.

The true silicon cell

Until now the words ‘silicon cell’ have been misnomers. All that exists presently, as exemplified by the collection of models on the JWS-Online model repository, are models of mostly metabolic pathways. There is no model of an entire cell. The name silicon cell stems from the ambition ultimately to combine silicon-cell models of pathways into models of entire cells. Cells are compartmentalized and involve more than metabolic pathways. Fig. 5A shows a network that is not involved in metabolism but in signalling. It represents a blueprint model of nuclear hormone receptor signalling in various human cell types. Nuclear receptors (NRs) belong to a family of transcription factors involved in a diverse range of regulatory functions, such as the ones that are active during development, inflammation and metabolism (Carlberg and Dunlop 2006). An NR is a protein that is synthesized in the cytoplasm, shuttles between the nucleus and the cytoplasm, and binds to its response element on the DNA. Addition of ligand L results in the appearance of NRL in the cytoplasm. NRL then shifts into the nucleus and binds to the response element, causing a transcriptional response (Fig. 5). A curious aspect is that both export of importins and export of liganded receptor are driven by RanGTP hydrolysis. Why would the cell spend free energy on these processes that both seem to work in the wrong direction? We made a model of this network that is on its way to, but not yet equivalent to, a silicon cell model; many kinetic parameters are still unknown. In this model we asked what would happen if we decreased ∆G of both processes by a factor of 100 (dashed line). We found that the high investment of Gibbs free energy would stimulate transcription at high concentration ratios of importin to nuclear hormone receptor.
This leads us to formulate the hypothesis that the investment of free energy serves to prevent sequestration of nuclear receptor by importin.

Fig. 5. Silicon cell model of a nuclear hormone receptor signalling network and prediction of the dependence of transcription activation on the total concentration of importin (Imp) in the system and the free energy driving the transport cycles.

Crossing the scales

In the above venture from pathway to cell, we met the complication of an extra compartment. When two compartments have different volumes, processes in the one compartment are likely to have different kinetics from processes in the other. Even in a single compartment, time scales may be diverse. One origin of this is in the gene expression cascade. The concentrations of the enzymes in metabolic pathways may adjust to changes at the level of metabolism through regulated gene expression. Because the lifetimes of proteins are mostly longer than the lifetimes of many intermediary metabolites, the dynamics of gene expression are often quite a bit slower than the dynamics of metabolic changes. The methodology of rate and balance equations that has mostly been used in the silicon cell up to now can deal with a full range of dynamics. However, for conceptual purposes, and for the purpose of more rapid computation, methods that summarize the behaviour of more detailed, faster scales into behaviour at the often more relevant, slower and less detailed scales are important (de la Fuente, Snoep et al. 2002).

A well-known issue at the faster time scales is that of the dynamic behaviour of enzymes. This can be described at the level of the free substrate [S], the free enzyme [E] and the enzyme-substrate complex [ES], or at the level of the total substrate and the total enzyme concentration. The latter has the advantage that total enzyme is set by gene expression, not by metabolism.
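The difference between the detailed and the summarized description can be made concrete numerically. The sketch below integrates the full mass-action scheme E + S <-> ES -> E + P and, next to it, the Michaelis-Menten rate law, i.e. the reduction obtained when the enzyme-substrate complex is assumed to be at quasi-steady state. The rate constants, concentrations and function names are assumptions made for illustration; the integrator is a plain fourth-order Runge-Kutta scheme.

```python
# Compare the full mass-action description of E + S <-> ES -> E + P with the
# Michaelis-Menten reduction. Rate constants and concentrations are
# illustrative assumptions.

K_ON, K_OFF, K_CAT = 10.0, 1.0, 1.0   # binding, unbinding, catalysis
K_M = (K_OFF + K_CAT) / K_ON          # Michaelis constant

def full_rhs(state, e_total):
    """Derivatives of free substrate, complex and product (mass action)."""
    s, es, _p = state
    e_free = e_total - es             # enzyme conservation
    binding = K_ON * e_free * s - K_OFF * es
    return (-binding, binding - K_CAT * es, K_CAT * es)

def reduced_rhs(state, e_total):
    """Derivatives of substrate and product under the Michaelis-Menten law."""
    s, _p = state
    v = K_CAT * e_total * s / (K_M + s)
    return (-v, v)

def rk4_step(rhs, state, e_total, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, factor):
        return tuple(x + factor * dx for x, dx in zip(state, k))
    k1 = rhs(state, e_total)
    k2 = rhs(shifted(k1, dt / 2), e_total)
    k3 = rhs(shifted(k2, dt / 2), e_total)
    k4 = rhs(shifted(k3, dt), e_total)
    return tuple(x + dt / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))

def max_product_gap(e_total, s0=1.0, t_end=10.0, dt=1e-3):
    """Largest difference in product formed between the two descriptions."""
    full, reduced, gap = (s0, 0.0, 0.0), (s0, 0.0), 0.0
    for _ in range(int(round(t_end / dt))):
        full = rk4_step(full_rhs, full, e_total, dt)
        reduced = rk4_step(reduced_rhs, reduced, e_total, dt)
        gap = max(gap, abs(full[2] - reduced[1]))
    return gap

gap_low_enzyme = max_product_gap(e_total=0.01)   # enzyme much scarcer than substrate
gap_high_enzyme = max_product_gap(e_total=1.0)   # enzyme comparable to substrate
print(gap_low_enzyme, gap_high_enzyme)
```

With these assumed values the two descriptions nearly coincide when the enzyme is scarce relative to the substrate, and diverge clearly when the two concentrations are comparable.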
Even though the quasi-steady-state approach (QSSA) to enzyme kinetics is usually a fair way to deal with this, there are recent methods for dealing with the cases where enzyme and substrate concentrations are comparable and the QSSA fails (de la Fuente, Snoep et al. 2002; Hardin, Zagaris et al. 2009). MCA has been extended to address the multi-scale issue of signal transduction (Kahn and Westerhoff 1991) and that of metabolism versus gene expression (Westerhoff, Koster et al. 1990; Westerhoff 2008), also experimentally (Snoep, van der Weijden et al. 2002; Hardin, Zagaris et al. 2009). When moving up from the cell level to the whole body, additional scales appear, such as the scale of the circulation, which is important for the action of beta-cells at the level of the organism. The coupling of models of the silicon cell type should again help at those scales. We shall discuss this below.

Different types of modelling

This chapter is motivated by the question of how molecular issues in beta cells might be put in the perspective of their biological function. Since their biological function is at the level of the whole human, this involves the crossing of temporal and spatial scales from molecules to the whole mammalian body. Apart from, but sometimes related to, the scales at which one is considering these issues, there are different modelling methodologies. Above we have discussed a few, i.e. top-down systems biology, blueprint modelling, domino systems biology, metabolic control analysis, and silicon cell. Three of these five modelling methodologies involve balance equations and kinetic equations. Metabolic control analysis uses less than this (Westerhoff and Kell 1987; Reder 1988), but is limited to control aspects. Top-down systems biology tends to lead to phenomenological models describing patterns. There are quite a few other modelling methodologies that we have not discussed until now.
This is because this chapter is devoted to describing the methods that we find most important for obtaining a useful mathematical representation of the human that enables one to relate her function to her molecules. This is not to say that other modelling methods are not more useful for other important problems, or even that they will not be important for some aspects of the silicon human. For instance, flux balance analysis as a modelling method may help establish where to look first for important pathways (Westerhoff, Winder et al. 2009). However, it ultimately suffers from the fact that we do not know what the relevant objective functions are. A sole objective function of maximum yield of ATP is likely to be irrelevant for most human cells.

Modelling cells in terms of Boolean networks may be quite helpful for initial understanding, but suffers from the limitation that in reality the intracellular networks are based on ensembles of molecules and not on individual chains of molecules. One could therefore engage this method only after it has been shown that parts of the intracellular networks do indeed act as switches. Transcription does not yield a single mRNA molecule that is then translated into a single enzyme which then makes a single molecule of the product. The difference matters for Life, which critically depends on the ability to deal with the challenges imposed by the second law of thermodynamics. The latter depends on the law of large numbers and entropy (Westerhoff and Van Dam 1987). Boolean networks have no problem with violating the second law of thermodynamics.

Bayesian networks are a more subtle alternative to Boolean networks, allowing for event probabilities in between 0 and 1. Living systems operate in states that are steady or steady on average. Part of the essence is thereby not how they move from one state to the next, but how a single state functions.
For sure, when a glucose molecule enters a tumour cell, its C1 carbon atom has a certain probability to end up in carbon dioxide and a different probability to end up in lactate. One would be interested in how these probabilities are influenced by the expression of glycolytic genes. This is a matter of a steady-state balance between rates, the implications of which for metabolite concentrations are modelled best by rate and balance equations. Bayesian networks operate by forward logic, i.e. what happens can only be determined by the present and not by the future. Already shortly after activation of intracellular networks, what happens at their beginning is codetermined by what happens at their end. At steady state the end of the pathway just coexists with its beginning: the former depends on feedback loops through the latter, one of the reasons why the first step is not completely rate limiting. Bayesian networks do not seem to accommodate this essential feedback property of living cells.

One interpretation of ‘computer replica’ of the living organism would indeed model the system in terms of all its individual molecules as they are interacting. This would inspire a gigantic Monte Carlo simulation including the quasi-Brownian motion through Cartesian and chemical space. This, however, would generate models that are more complex than can be calculated in the lifetime of the planet, even after introducing the simplifications offered by biological organization discussed above. In addition, it would depend on the initial conditions of all the individual molecules, which one could never determine. It would also be impossible to trace the behaviour of every individual molecule without perturbing it; this problem is not unique to quantum mechanics. The silicon cell project models mostly in terms of ensemble-averaged concentrations, whenever this is feasible on the basis of statistical mechanical considerations (Westerhoff and Van Dam 1987).
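A simple estimate indicates when ensemble-averaged concentrations suffice. Assume, purely for illustration, that a molecular species is made at a constant rate $k$ and degraded in a first-order process with rate constant $\gamma$ (a birth-death process, which is not a model from this chapter). Its copy number $n$ is then Poisson distributed at steady state, so that the relative fluctuation falls with the square root of the mean copy number:

```latex
\langle n \rangle = \frac{k}{\gamma},
\qquad
\sigma_n^2 = \langle n \rangle,
\qquad
\frac{\sigma_n}{\langle n \rangle} = \frac{1}{\sqrt{\langle n \rangle}}
```

For $\langle n \rangle = 100$ this gives relative fluctuations of 10%, and for $\langle n \rangle = 10^4$ of 1%.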
Stochastic modelling does become important when molecule numbers in the relevant compartments are below 100. This is rare, though occasionally important. Partial differential equation based modelling is needed when gradients within compartments become important (Kholodenko 2006).

Towards the silicon human

In the context of the human, the ambition is even greater, i.e. to combine models of cells into models of tissues and then to combine models of tissues into body-wide models. Because the cell models would still be in terms of molecular activities, the result would be a multi-scale model relating whole body function to molecular activities in time and space. Here, the silicon-cell project will become a silicon organism project, with variations such as the virtual physiological human and the digital human projects. The idea is similar to that of integrating pathway models. Relatively autonomous models of organs are to be combined. One thought is to leave the coordination of each organ model, including the corresponding computations, to an individual research centre and then to integrate the models dynamically through web services. Although perhaps slow, this would have the advantage of maximum responsibility of a group over a part of the whole model, ensuring quality control. Fig. 6 illustrates this approach, where of course the beta cell component model will play an important role. Another thought has models for parts of the system uploaded to JWS-Online by the respective research groups, these models being automatically merged into the complete model, available to all participating groups.

Pharmacokinetics has already studied the human body as a multi-compartment problem. Recently it has been proposed that more mechanistic information should be incorporated into pharmacokinetics (Lave, Chapman et al. 2009). We are therefore elaborating the silicon cell approach for tissue-tissue interaction in the whole human body. We thereby focus on the part of Fig.
6 that is depicted in Fig. 7A. The pancreatic beta-cells, shown schematically on the left, are connected with a model for C-peptide kinetics. Based on experimentally measured C-peptide levels in a patient we are able, using this model, to estimate the static and dynamic components of the insulin secretion, the former being a function of the glucose concentration above a certain threshold level, the latter being a function of the rate of increase of the glucose concentration. Fig. 7B and C give the calculated insulin secretion rates in the normal and in the hypercaloric state for two different silicon humans (i.e. different mechanistic parameter values for the two models). Fig. 8A illustrates a complementary model for glucose and insulin dynamics. It allows for estimation of the insulin sensitivity of a virtual patient, a numerically calculated measure quantifying the interplay between insulin level and the ability of the organism to balance its glucose concentration. The figure shows that, provided individuals can be characterized in terms of a few mechanistic parameter values, implications of food intake for insulin dynamics can be predicted. At this stage, it is unclear whether those predictions would be correct or not, but this is now accessible to experimental validation.

Fig. 7. Minimalistic whole body silicon-cell model relevant for insulin, glucose and C-peptide dynamics and some of its predictions. A. The scheme referring to the insulin release model and C-peptide kinetics. B. Calculations of insulin secretion after administration of glucose for a silicon human subject to a normal (the line that is the highest in the beginning) and a hypercaloric diet. C. The same calculations for a different silicon human.

Fig. 8. Another minimal whole-body silicon-cell model relevant for insulin and glucose dynamics and some of its predictions. A. The scheme referring to interplay between insulin and its effect on glucose utilization and storage. B.
Calculations of glucose absorption profile during an oral glucose tolerance test (bottom plot) and fitted glucose time course (top plot).

To many, the idea of a silicon human seems too complex to even think about. This may however derive from a failure to appreciate that biological organization greatly reduces complexity (Westerhoff 2010). Moreover, the silicon human is already developing. Models of important aspects of the heart (Noble 2006) and of the liver cell (Vera, Bachmann et al. 2008) are being constructed. Thirty years from now we will have thousands of mathematical models that each describe a part of the human. Perhaps the only strategic decision we need to make now is whether all those models will have resulted from a cottage industry such that it will be impossible to integrate them with each other, or all those models will have been developed in a common context and can be merged into a larger, more complete model. The latter possibility should enable each researcher working on her/his part of the human to appreciate the implications of her/his findings for understanding the functioning of the human as a whole. And, because there will be simultaneous top-down, bottom-up and ‘middle-out’ (Noble 2006) strategies towards mathematical models of the human, we also have another choice. Either these three methodologies will be developed independently of each other and their results will be in different languages, or some time is spent now to ensure that ultimately they become continuous with each other. The choice is (y)ours.

Acknowledgements

We thank the BBSRC, EPSRC (BBD0190791, BBC0082191, BBF0035281, BBF0035521, BBF0035521, BBF0035361, BBG5302251, SySMO P 49 ), EU-FP7 (BioSim, NucSys, EC-MOAN) and other funders (http://www.systembiology.net/support/ ) for support of this rather encompassing activity.

References

Alberghina, L. and H. V. Westerhoff (2005). Systems Biology: Definitions and Perspectives. Berlin, Springer.
Albert, R. and A. L. Barabasi (2000). "Dynamics of complex systems: scaling laws for the period of boolean networks." Phys Rev Lett 84(24): 5660-3.
Alon, U. (2007). "Network motifs: theory and experimental approaches." Nat Rev Genet 8(6): 450-61.
Bakker, B. M., H. E. Assmus, et al. (2002). "Network-based selectivity of antiparasitic inhibitors." Molecular Biology Reports 29(1-2): 1-5.
Bakker, B. M., P. A. M. Michels, et al. (1997). "Glycolysis in bloodstream form Trypanosoma brucei can be understood in terms of the kinetics of the glycolytic enzymes." Journal of Biological Chemistry 272(6): 3207-3215.
Bakker, B. M., H. V. Westerhoff, et al. (2000). "Metabolic control analysis of glycolysis in trypanosomes as an approach to improve selectivity and effectiveness of drugs." Molecular and Biochemical Parasitology 106(1): 1-10.
Carlberg, C. and T. W. Dunlop (2006). "An integrated biological approach to nuclear receptor signaling in physiological control and disease." Critical Reviews in Eukaryotic Gene Expression 16(1): 1-22.
Ciapaite, J., G. Van Eikenhorst, et al. (2005). "Modular kinetic analysis of the adenine nucleotide translocator-mediated effects of palmitoyl-CoA on the oxidative phosphorylation in isolated rat liver mitochondria." Diabetes 54(4): 944-951.
Conradie, R., F. J. Bruggeman, et al. (2010). "Restriction point control of the mammalian cell cycle via the cyclin E/Cdk2:p27 complex." FEBS Journal 277(2): 357-367.
Davidson, E. H. (2006). The regulatory genome: gene regulatory networks in development and evolution. New York, Academic Press.
de la Fuente, A., J. L. Snoep, et al. (2002). "Metabolic control in integrated biochemical systems." European Journal of Biochemistry 269(18): 4399-4408.
Galazzo, J. L. and J. E. Bailey (1990). "Fermentation pathway kinetics and metabolic flux control in suspended and immobilized Saccharomyces cerevisiae." Enzyme and Microbial Technology 12(3): 162-172.
Haanstra, J. R., A. van Tuijl, et al. (2008). "Compartmentation prevents a lethal turbo-explosion of glycolysis in trypanosomes." Proceedings of the National Academy of Sciences of the United States of America 105(46): 17718-17723.
Hardin, H. M., A. Zagaris, et al. (2009). "Simplified yet highly accurate enzyme kinetics for cases of low substrate concentrations." FEBS Journal 276(19): 5491-5506.
Hartwell, L. H., J. J. Hopfield, et al. (1999). "From molecular to modular cell biology." Nature 402(6761 Suppl): C47-52.
Hucka, M., A. Finney, et al. (2003). "The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models." Bioinformatics 19(4): 524-31.
Ihekwaba, A. E., S. J. Wilkinson, et al. (2007). "Bridging the gap between in silico and cell-based analysis of the nuclear factor-kappaB signaling pathway by in vitro studies of IKK2." FEBS J 274(7): 1678-90.
Kacser, H. and J. A. Burns (1973). "The control of flux." Symp Soc Exp Biol 27: 65-104.
Kahn, D. and H. V. Westerhoff (1991). "Control theory of regulatory cascades." Journal of Theoretical Biology 153(2): 255-285.
Kholodenko, B. N. (2006). "Cell-signalling dynamics in time and space." Nat Rev Mol Cell Biol 7(3): 165-76.
Lave, T., K. Chapman, et al. (2009). "Human clearance prediction: shifting the paradigm." Expert Opinion on Drug Metabolism & Toxicology 5(9): 1039-1048.
Lawrence, P. A. (1992). The making of a fly. Oxford, Blackwell Scientific Publications.
Le Novere, N., B. Bornstein, et al. (2006). "BioModels Database: a free, centralized database of curated, published, quantitative kinetic models of biochemical and cellular systems." Nucleic Acids Research 34: D689-D691.
Noble, D. (2006). The Music of Life: biology beyond genes. Oxford, Oxford University Press.
Novak, B., A. Csikasz-Nagy, et al. (1998). "Mathematical model of the fission yeast cell cycle with checkpoint controls at the G1/S, G2/M and metaphase/anaphase transitions." Biophys Chem 72(1-2): 185-200.
Peter, I. S. and E. H. Davidson (2009). "Modularity and design principles in the sea urchin embryo gene regulatory network." FEBS Letters 583(24): 3948-3958.
Rapoport, T. A., M. Otto, et al. (1977). "An extended model of the glycolysis in erythrocytes." Acta Biol Med Ger 36(3-4): 461-8.
Reder, C. (1988). "Metabolic control theory: a structural approach." J Theor Biol 135(2): 175-201.
Rohwer, J. M., N. D. Meadow, et al. (2000). "Understanding glucose transport by the bacterial phosphoenolpyruvate : glycose phosphotransferase system on the basis of kinetic measurements in vitro." Journal of Biological Chemistry 275(45): 34909-34921.
Schuster, S. (1999). "Use and limitations of modular metabolic control analysis in medicine and biotechnology." Metab Eng 1(3): 232-42.
Schuster, S., D. Kahn, et al. (1993). "Modular analysis of the control of complex metabolic pathways." Biophys Chem 48(1): 1-17.
Snoep, J. L. (2005). "The Silicon Cell initiative: working towards a detailed kinetic description at the cellular level." Curr Opin Biotechnol 16: 336-343.
Snoep, J. L., F. Bruggeman, et al. (2006). "Towards building the silicon cell: A modular approach." Biosystems 83(2-3): 207-216.
Snoep, J. L., C. C. van der Weijden, et al. (2002). "DNA supercoiling in Escherichia coli is under tight and subtle homeostatic control, involving gene-expression and metabolic regulation of both topoisomerase I and DNA gyrase." European Journal of Biochemistry 269(6): 1662-1669.
Somsen, O. J., M. A. Hoeben, et al. (2000). "Glucose and the ATP paradox in yeast." Biochem J 352 Pt 2: 593-9.
Teusink, B., J. Passarge, et al. (2000). "Can yeast glycolysis be understood in terms of in vitro kinetics of the constituent enzymes? Testing biochemistry." European Journal of Biochemistry 267(17): 5313-5329.
Teusink, B., M. C. Walsh, et al. (1998). "The danger of metabolic pathways with turbo design." Trends in Biochemical Sciences 23(5): 162-169.
van der Gugten, A. A. and H. V. Westerhoff (1997). "Internal regulation of a modular system: the different faces of internal control." Biosystems 44(2): 79-106.
van Eunen, K., J. Bouwman, et al. (2010). "Measuring enzyme activities under standardized in vivo-like conditions for systems biology." FEBS Journal Jan 7.
Vera, J., J. Bachmann, et al. (2008). "A systems biology approach to analyse amplification in the JAK2-STAT5 signalling pathway." BMC Systems Biology 2: 13.
Westerhoff, H. V. (2001). "The silicon cell, not dead but live!" Metab Eng 3(3): 207-10.
Westerhoff, H. V. (2008). "Signalling control strength." Journal of Theoretical Biology 252(3): 555-567.
Westerhoff, H. V. (2010). Biochem. Soc. Trans., in the press.
Westerhoff, H. V. and D. B. Kell (1987). "Matrix method for determining steps most rate-limiting to metabolic fluxes in biotechnological processes." Biotechnology and Bioengineering 30(1): 101-107.
Westerhoff, H. V. and D. B. Kell (2009). "Matrix method for determining steps most rate-limiting to metabolic fluxes in biotechnological processes." Biotechnology and Bioengineering 104(1): 3-9.
Westerhoff, H. V., J. G. Koster, et al. (1990). "On the control of gene expression." In: Cornish-Bowden, A. and M. Luz Cardenas (Ed.). NATO ASI (Advanced Science Institutes) Series, Series A: Life Sciences, Vol. 190. Control of Metabolic Processes; Workshop, Il Ciocco, Italy, April 9-15, 1989. xiii+454p. Plenum Publishing Corp., New York and London: 399-412.
Westerhoff, H. V. and B. O. Palsson (2004). "The evolution of molecular biology into systems biology." Nat Biotechnol 22(10): 1249-52.
Westerhoff, H. V. and K. Van Dam (1987). Thermodynamics and control of biological free-energy transduction. Amsterdam, Elsevier.
Westerhoff, H. V., C. Winder, et al. (2009). "Systems Biology: The elements and principles of Life." FEBS Letters.
Wolf, J., J. Passarge, et al. (2000). "Transduction of intracellular and intercellular dynamics in yeast glycolytic oscillations." Biophysical Journal 78(3): 1145-1153.