1. Introduction to Molecular Biology

6. Gene Regulatory Networks

EECS 600: Systems Biology & Bioinformatics

Instructor: Mehmet Koyuturk

Regulation of Gene Expression

Transcriptional Regulation of telomerase protein component gene hTERT

Genetic Regulation & Cellular Signaling

Organization of Genetic Regulation

[Figure: Genetic network that controls flowering time in A. thaliana (Blazquez et al., EMBO Reports, 2001). Legend: up-regulation and down-regulation of a gene. Label: negative ligand-independent repression at chromatin level.]

Gene Regulatory Networks

Transcriptional Regulatory Networks

Nodes with outgoing edges are limited to transcription factors

Can be reconstructed by identifying regulatory motifs (through clustering of gene expression and sequence analysis) and finding transcription factors that bind to the corresponding promoters (through structural/sequence analysis)

Gene Regulatory Networks

Gene expression networks

General model of genetic regulation

Identify the regulatory effects of genes on each other, independent of the underlying regulatory mechanism

Can be inferred from correlations in gene expression data, time-series gene expression data, and/or gene knock-out experiments

Observation Inference

Boolean Network Model

Binary model, a gene has only two states

ON (1): The gene is expressed

OFF (0): The gene is not expressed

Each gene’s next state is determined by a boolean function of the current states of a subset of other genes

A boolean network is specified by two sets

Set of nodes (genes): $V = \{x_1, \dots, x_n\}$

State of a gene: $x_i \in \{0, 1\}$

Collection of boolean functions: $F = \{f_1, \dots, f_n\}$, one per gene
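
The definition above can be sketched in a few lines of Python. The three-gene wiring in `RULES` is hypothetical and chosen only for illustration; the update is synchronous, as the model assumes.

```python
from typing import Callable, Dict, List, Tuple

State = Tuple[int, ...]

# Hypothetical 3-gene network (not taken from the slides).
# Each gene maps to (indices of its input genes, boolean function of those inputs).
RULES: Dict[int, Tuple[Tuple[int, ...], Callable[..., int]]] = {
    0: ((1,), lambda b: int(not b)),         # gene 0 is ON when gene 1 is OFF
    1: ((0, 2), lambda a, c: int(a and c)),  # gene 1 needs gene 0 AND gene 2
    2: ((0,), lambda a: int(a)),             # gene 2 copies gene 0
}


def step(state: State) -> State:
    """Synchronously compute the next state of every gene."""
    return tuple(
        func(*(state[j] for j in inputs))
        for _, (inputs, func) in sorted(RULES.items())
    )


def trajectory(state: State, n_steps: int) -> List[State]:
    """Return the list of states visited, starting from `state`."""
    visited = [state]
    for _ in range(n_steps):
        state = step(state)
        visited.append(state)
    return visited
```

Calling `trajectory((1, 0, 1), 5)` walks through five updates of this toy network; in this particular wiring the trajectory returns to its starting state.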

Logic Diagram

Cell cycle regulation

Retinoblastoma (Rb) inhibits DNA synthesis

Cyclin-dependent kinase 2 (cdk2) and cyclin E inactivate Rb to release the cell into S phase

Up-regulated by the CAK complex and down-regulated by p21/WAF1, p53

Wiring Diagram

Dynamics of Boolean Networks

Gene activity profile (GAP)

Collection of the states of individual genes in the genome (network)

The number of possible GAPs is $2^n$

The system ultimately transitions into attractor states

Steady state (point) attractors

Dynamic attractors: state cycle

Each transient state is associated with an attractor (basins of attraction)

In practice, only a small number of GAPs correspond to attractors

What is the biological meaning of an attractor?
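
For a small network, attractors and their basins can be found by following every one of the $2^n$ GAPs until a state repeats. The sketch below assumes a hypothetical three-gene update rule (`step`) that is not from the slides; `find_attractors` itself works for any synchronous update function.

```python
from itertools import product


def step(state):
    """Hypothetical 3-gene rule: a' = b, b' = a, c' = a AND b."""
    a, b, c = state
    return (b, a, a & b)


def find_attractors(step_fn, n_genes):
    """Map each attractor (a frozenset of states) to the size of its basin.

    The basin count includes the attractor states themselves.
    """
    basins = {}
    for start in product((0, 1), repeat=n_genes):
        seen = []
        state = start
        while state not in seen:      # follow the trajectory until a state repeats
            seen.append(state)
            state = step_fn(state)
        cycle = frozenset(seen[seen.index(state):])  # states from the first repeat on
        basins[cycle] = basins.get(cycle, 0) + 1
    return basins


basins = find_attractors(step, 3)
for attractor, size in basins.items():
    kind = "point attractor" if len(attractor) == 1 else "state cycle"
    print(kind, sorted(attractor), "basin size:", size)
```

In this toy network, the 8 GAPs collapse onto two point attractors and one state cycle of length 2, which matches the observation that only a few GAPs are attractors.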

State Space of Boolean Networks

Equate cellular states with attractors

Attractor states are stable under small perturbations

Most perturbations cause the network to flow back to the attractor

Some genes are more important and changing their activation can cause the system to transition to a different attractor

This slide is taken from the presentation by I. Shmulevich

Identification of Boolean Networks

We have the “truth table” available

Binarize time-series gene expression data

REVEAL

Use mutual information to derive logical rules that determine each variable

If the mutual information between a set of variables and the target variable is equal to the entropy of that variable, then that set of variables completely determines the target variable

For each variable, consider functions of 1 variable, then 2, then 3, …, then i, …, until one is found

Once the minimum set of variables that determines a variable is found, we can infer the function from the truth table

In general, the indegrees of genes in the network are small
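
A minimal sketch of the search described above, not the published REVEAL implementation. It assumes the binarized data are given as `(input state, output state)` pairs; the function names and the toy truth table are hypothetical.

```python
from collections import Counter
from itertools import combinations, product
from math import log2


def entropy(samples):
    """Shannon entropy (in bits) of a list of hashable observations."""
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in Counter(samples).values())


def mutual_information(xs, ys):
    """M(X; Y) = H(X) + H(Y) - H(X, Y)."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def reveal(pairs, target, max_k=3, tol=1e-9):
    """Find the smallest input set that fully determines gene `target`.

    Returns (input gene indices, truth table) or None if no set of size
    <= max_k qualifies.
    """
    inputs = [before for before, _ in pairs]
    outputs = [after[target] for _, after in pairs]
    h_out = entropy(outputs)
    n_genes = len(inputs[0])
    for k in range(1, max_k + 1):
        for subset in combinations(range(n_genes), k):
            xs = [tuple(state[j] for j in subset) for state in inputs]
            if abs(mutual_information(xs, outputs) - h_out) < tol:
                return subset, dict(zip(xs, outputs))
    return None


# Toy truth table from a hypothetical rule: a' = b, b' = a, c' = a AND b.
pairs = [((a, b, c), (b, a, a & b)) for a, b, c in product((0, 1), repeat=3)]
```

Because the search stops at the first qualifying subset, small indegrees keep the number of subsets examined manageable.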

REVEAL

Limitations of Boolean Networks

The effect of intermediate gene expression levels is ignored

It is assumed that the transitions between states are synchronous

A model incorporates only a partial description of a physical system

Noise

Effects of other factors

One may wish to model an open system

A particular external condition may alter the parameters of the system

Boolean networks are inherently deterministic

Probabilistic Models

Stochasticity can account for

Noise

Variability in the biological system

Aspects of the system that are not captured by the model

Random variables include

Observed attributes

Expression level of a particular gene in a particular sample

Hidden attributes

The boolean function assigned to a gene?

Probabilistic Boolean Networks

Each gene is associated with multiple boolean functions

Each function is associated with a probability

Can characterize the stochastic behavior of the system
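
A minimal sketch of one probabilistic update, assuming each gene picks its function independently of the other genes at every step. The two-gene network `PBN` and its probabilities are made up for illustration.

```python
import random

# Hypothetical 2-gene PBN: gene -> list of (selection probability, boolean function).
PBN = {
    0: [(0.7, lambda s: s[1]), (0.3, lambda s: 1 - s[1])],
    1: [(1.0, lambda s: s[0] & s[1])],
}


def pbn_step(state, rng):
    """Sample one function per gene, then update all genes synchronously."""
    nxt = []
    for gene in sorted(PBN):
        probs, funcs = zip(*PBN[gene])
        func = rng.choices(funcs, weights=probs, k=1)[0]
        nxt.append(func(state))
    return tuple(nxt)


def transition_distribution(state):
    """Exact distribution over next states under independent function choice."""
    dist = {(): 1.0}
    for gene in sorted(PBN):
        extended = {}
        for partial, p in dist.items():
            for q, func in PBN[gene]:
                key = partial + (func(state),)
                extended[key] = extended.get(key, 0.0) + p * q
        dist = extended
    return dist
```

`transition_distribution` gives one row of the Markov chain that the network induces on the state space; `pbn_step(state, random.Random(0))` draws a single sample from that row.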

Bayesian Networks

A Bayesian network is a representation of a joint probability distribution

A Bayesian network $B = (G, \Theta)$ is specified by two components

A directed acyclic graph G, in which directed edges represent the conditional dependence between expression levels of genes (represented by nodes of the graph)

A function $\Theta$ that specifies the conditional distribution of the expression level of each gene, given the expression levels of its parents

Gene A is gene B’s parent if there is a directed edge from A to B

$P(B \mid \mathrm{Pa}(B)) = \Theta(B, \mathrm{Pa}(B))$

Conditional Independence

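In a Bayesian network, if there is no direct edge between two genes, then these genes are said to be conditionally independent (given an appropriate set of other genes)

The probability of observing a cellular state (configuration of expression levels) can be decomposed into product form

$$P(X_1, \dots, X_n) = \prod_{i=1}^{n} P(X_i \mid \mathrm{Pa}(X_i))$$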
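
As a small illustration of the product form, the sketch below multiplies one conditional probability per gene. The DAG (A -> B, A -> C) and every number in `CPT` are hypothetical.

```python
from itertools import product

# Hypothetical DAG: A -> B and A -> C. Each gene lists its parents.
PARENTS = {"A": (), "B": ("A",), "C": ("A",)}

# CPT[gene][parent values] = P(gene is ON | parents); values are made up.
CPT = {
    "A": {(): 0.3},
    "B": {(0,): 0.1, (1,): 0.8},
    "C": {(0,): 0.5, (1,): 0.9},
}


def joint_probability(assignment):
    """P(assignment) as the product of P(gene | parents) over all genes."""
    p = 1.0
    for gene, parents in PARENTS.items():
        p_on = CPT[gene][tuple(assignment[q] for q in parents)]
        p *= p_on if assignment[gene] == 1 else 1.0 - p_on
    return p


# Sanity check: the probabilities of all 2^3 cellular states sum to 1.
total = sum(
    joint_probability(dict(zip("ABC", values)))
    for values in product((0, 1), repeat=3)
)
```

Here B and C share no edge, and in this toy structure they are independent once A is known.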

Variables in Bayesian Network

Discrete variables

Again, genes’ expression levels are modeled as ON and OFF (or more discrete levels)

If a gene has k parents in the network, then the conditional distribution is characterized by on the order of $r^k$ parameters, one conditional distribution per parent configuration (r is the number of discrete levels)

Continuous variables

Real valued expression levels

We have to specify multivariate continuous distribution functions

Linear Gaussian distribution:

Hybrid networks
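
One common way to write a linear Gaussian conditional distribution is shown below. The symbols $a_0$, $a_i$, $\sigma^2$, and $u_i$ (the values of the k parents of gene X) are assumed notation, not necessarily the slide's own:

```latex
P(X \mid u_1, \dots, u_k) \sim \mathcal{N}\!\left(a_0 + \sum_{i=1}^{k} a_i u_i,\; \sigma^2\right)
```

Under this form, the mean expression of X depends linearly on its parents' expression levels, while the variance does not depend on them.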

Equivalence Classes of Bayesian Nets

Observe that each network structure implies a set of independence assumptions

Given its parents, each variable is independent of its nondescendants

More than one graph can imply exactly the same set of independencies (e.g., X->Y and Y->X)

Such graphs are said to be equivalent

By looking at observations of a distribution, we cannot distinguish between equivalent graphs

An equivalence class can be uniquely represented by a partially directed graph (some edges are undirected)

Learning Bayesian Networks

Given a training set $D = \{x_1, x_2, \dots, x_m\}$ of m independent instances of the n random variables, find an equivalence class of networks $B = (G, \Theta)$ that best matches D

The x’s are the gene expression profiles

Based on Bayes’ formula, the posterior probability of a network given the data can be evaluated as

$$S(G : D) = \log P(G \mid D) = \log P(D \mid G) + \log P(G) + C$$

where C is a constant (independent of G) and

$$P(D \mid G) = \int P(D \mid G, \Theta)\, P(\Theta \mid G)\, d\Theta$$

is the marginal likelihood that averages the probability of data over all possible parameter assignments to G

Learning Algorithms

The Bayes score S(G : D) depends on the particular choice of priors P(G) and $P(\Theta \mid G)$

The priors can be chosen to be

structure equivalent, so that equivalent networks will have the same score

decomposable, so that the score can be represented as the superposition of contributions of each gene

The problem becomes finding the optimal structure (G)

We can estimate the gain associated with addition, removal, and reversal of an edge

Then, we can use greedy-like heuristics (e.g., hill climbing)
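
A rough sketch of greedy hill climbing over edge additions, removals, and reversals. As a stand-in for the Bayes score it uses the BIC score on binary data, which is also decomposable and gives equivalent networks the same score; that substitution, the function names, and the toy data are all assumptions.

```python
from itertools import permutations
from math import log


def family_score(data, child, parents):
    """BIC contribution of one binary gene given its (binary) parents."""
    counts = {}
    for row in data:
        key = tuple(row[p] for p in parents)
        counts.setdefault(key, [0, 0])[row[child]] += 1
    loglik = sum(c * log(c / sum(pair)) for pair in counts.values() for c in pair if c)
    return loglik - 0.5 * log(len(data)) * 2 ** len(parents)


def score(data, parents_of):
    """Decomposable score: a sum of per-gene terms."""
    return sum(family_score(data, g, sorted(ps)) for g, ps in parents_of.items())


def is_acyclic(parents_of):
    done, active = set(), set()

    def visit(gene):
        if gene in done:
            return True
        if gene in active:
            return False  # reached a gene that is still being explored: cycle
        active.add(gene)
        ok = all(visit(p) for p in parents_of[gene])
        active.discard(gene)
        done.add(gene)
        return ok

    return all(visit(g) for g in parents_of)


def neighbors(parents_of):
    """Yield every graph one edge addition, removal, or reversal away."""
    for a, b in permutations(parents_of, 2):
        cand = {g: set(ps) for g, ps in parents_of.items()}
        if a in cand[b]:
            cand[b].remove(a)          # removal of a -> b
            yield cand
            rev = {g: set(ps) for g, ps in cand.items()}
            rev[a].add(b)              # reversal: b -> a replaces a -> b
            yield rev
        else:
            cand[b].add(a)             # addition of a -> b
            yield cand


def hill_climb(data, n_genes):
    """Greedy search from the empty graph; needs at least two genes."""
    current = {g: set() for g in range(n_genes)}
    best = score(data, current)
    while True:
        options = [(score(data, c), c) for c in neighbors(current) if is_acyclic(c)]
        top_score, top = max(options, key=lambda item: item[0])
        if top_score <= best + 1e-9:   # no strictly better neighbor: local optimum
            return current, best
        current, best = top, top_score


# Toy data: gene 1 always copies gene 0; gene 2 is unrelated to both.
data = [(a, a, c) for a in (0, 1) for c in (0, 1)] * 10
graph, final_score = hill_climb(data, 3)
```

On the toy data the search links genes 0 and 1 and leaves gene 2 isolated. The direction of the recovered edge is arbitrary, since both orientations belong to the same equivalence class and score identically.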

Causal Patterns

Bayesian networks model dependencies between multiple measurements

How about the mechanism that generated these measurements?

Causal network model: Flow of causality

Model not only the distribution of observations, but also the effect of interventions

If gene X codes for a transcription factor of gene Y, manipulating X will affect Y, but not vice versa

But in Bayesian networks, X->Y and Y->X are equivalent

Intervention experiments (as compared to passive observation): Knock X out, then measure Y

Dynamic Bayesian Networks

Dependencies do not uncover temporal relationships

Gene expression varies over time

Dynamic Bayesian Networks model the dependency between a gene’s expression level at time t and the expression levels of its parent genes at time t-1

Linear Additive Regulation Model

The expression level of a gene at a certain time point can be calculated by the weighted sum of the expression levels of all genes in the network at a previous time point

$$e_i(t) = \sum_j w_{ij}\, e_j(t-1) + \sum_k n_{ik}\, u_k(t-1) + b_i$$

$e_i$: expression level of gene i

$w_{ij}$: effect of gene j on gene i

$u_k$: kth external variable

$n_{ik}$: effect of kth external variable on gene i

$b_i$: gene-specific bias

Can be fitted using linear regression
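
A minimal least-squares sketch in plain Python. As a simplifying assumption it omits the external variables $u_k$ and fits only the weights $w_{ij}$ and bias $b_i$ of one gene from a noise-free time series. All function names and the toy parameters are hypothetical.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def simulate(w, b, start, n_steps):
    """Generate e(t) = W e(t-1) + b for n_steps time points after `start`."""
    series = [tuple(start)]
    for _ in range(n_steps):
        prev = series[-1]
        series.append(tuple(
            sum(w_ij * e_j for w_ij, e_j in zip(row, prev)) + b_i
            for row, b_i in zip(w, b)
        ))
    return series


def fit_gene(series, i):
    """Fit the weights and bias of gene i via the normal equations."""
    rows = [list(prev) + [1.0] for prev in series[:-1]]  # trailing 1.0 carries the bias
    targets = [cur[i] for cur in series[1:]]
    p = len(rows[0])
    xtx = [[sum(r[a] * r[c] for r in rows) for c in range(p)] for a in range(p)]
    xty = [sum(r[a] * y for r, y in zip(rows, targets)) for a in range(p)]
    *weights, bias = solve(xtx, xty)
    return weights, bias


# Toy example: recover the parameters of gene 0 from a simulated series.
W = [[0.5, -0.2], [0.3, 0.4]]
B = [1.0, 0.5]
series = simulate(W, B, (1.0, 2.0), 6)
weights, bias = fit_gene(series, 0)
```

With real expression data the number of time points is usually much smaller than the number of genes, so the system is underdetermined and some form of constraint on the weights would be needed.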
