

A GUIDED SAMPLING ALGORITHM FOR IDENTIFYING NETWORK MOTIFS IN A TRANSCRIPTION REGULATORY NETWORK

Raymond Wan*, Nelson Hayes*, Susumu Goto, Hiroshi Mamitsuka

{rwan,nelson,goto,mami}@bic.kyoto-u.ac.jp

Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, 611-0011, Japan

The identification of network motifs is an important problem in bioinformatics because of its computational complexity and its applications in other areas. Network motifs are subgraphs that occur significantly more often in a given network than in a random network [Milo et al., 2002]. Milo et al. showed that the technique used for transcription regulatory networks can also be applied to artificial networks such as electronic circuits and links between Web documents. One refinement to their brute-force technique involved random sampling [Kashtan et al., 2004].
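To make the notion of statistical significance concrete, the following is a minimal sketch and not the procedure of any cited paper. It counts feed-forward loops (a->b, b->c, a->c) in a directed edge list and compares the count against randomized networks that keep every vertex's in-degree and out-degree. The function names, the number of swaps, and the use of a z-score are assumptions made for illustration; the input is assumed to contain no duplicate edges and no self-loops.

```python
import random
from statistics import mean, pstdev


def count_ffl(edges):
    """Count feed-forward loops: a->b, b->c and a->c with distinct a, b, c."""
    succ = {}
    for u, v in edges:
        if u != v:  # ignore self-loops
            succ.setdefault(u, set()).add(v)
    count = 0
    for a, outs in succ.items():
        for b in outs:
            for c in succ.get(b, ()):
                if c != a and c in outs:
                    count += 1
    return count


def randomize(edges, n_swaps, rng):
    """Degree-preserving randomization by swapping the targets of edge pairs."""
    edges = list(edges)
    present = set(edges)
    for _ in range(n_swaps):
        i, j = rng.randrange(len(edges)), rng.randrange(len(edges))
        (a, b), (c, d) = edges[i], edges[j]
        # Proposed swap: a->b, c->d becomes a->d, c->b.
        # Skip it if it would create a self-loop or a duplicate edge.
        if a == d or c == b or (a, d) in present or (c, b) in present:
            continue
        present -= {(a, b), (c, d)}
        present |= {(a, d), (c, b)}
        edges[i], edges[j] = (a, d), (c, b)
    return edges


def ffl_z_score(edges, n_random=100, seed=0):
    """Return (observed count, mean random count, z-score or None)."""
    rng = random.Random(seed)
    observed = count_ffl(edges)
    counts = [
        count_ffl(randomize(edges, 10 * len(edges), rng))
        for _ in range(n_random)
    ]
    mu, sd = mean(counts), pstdev(counts)
    z = (observed - mu) / sd if sd > 0 else None  # undefined when sd is 0
    return observed, mu, z
```

A large positive z-score would suggest that the feed-forward loop is over-represented relative to the randomized ensemble; the threshold for calling it a motif is a separate choice that this sketch does not make.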

In this poster, we describe work in progress that extends the random sampling approach. Whether a network motif is judged statistically significant depends on the number of samples taken. However, graphs such as transcription regulatory networks contain local clusters [Artzy-Randrup et al., 2004], which has motivated us to investigate the possibility of guiding the sampling process.

Essentially, not all vertices in the graph are equal. We assign a score to each vertex based on the number of directed edges incident to it and to its neighbors. Then, we group the vertices into n buckets based on the distribution of the scores. By sampling the same proportion from each bucket, there is an approximately equal probability of sampling vertices from regions of high and low connectivity. Preliminary experiments with the transcription regulatory network of E. coli have shown that the feed-forward loop identified by others [Milo et al., 2002] could be located by our method using roughly 25% of the vertices. In the future, we plan to expand our experiments to other data sets and explore other vertex scoring mechanisms in order to refine our technique.
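The scoring, bucketing, and sampling steps described above might be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact score or bucketing rule, so the score here is a vertex's degree plus the degrees of its neighbors, the buckets are equal-sized groups of vertices ranked by score, and all function names and default parameters are hypothetical.

```python
import random
from collections import defaultdict


def vertex_scores(edges):
    """Assumed score: a vertex's degree plus the degrees of its neighbors."""
    degree = defaultdict(int)
    neighbors = defaultdict(set)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        neighbors[u].add(v)
        neighbors[v].add(u)
    return {
        v: degree[v] + sum(degree[w] for w in neighbors[v] if w != v)
        for v in neighbors
    }


def bucket_vertices(scores, n_buckets):
    """Assumed rule: rank vertices by score, then split into equal-sized groups."""
    ranked = sorted(scores, key=lambda v: (scores[v], str(v)))
    size, extra = divmod(len(ranked), n_buckets)
    buckets, start = [], 0
    for i in range(n_buckets):
        end = start + size + (1 if i < extra else 0)
        buckets.append(ranked[start:end])
        start = end
    return buckets


def guided_sample(edges, n_buckets=4, fraction=0.25, seed=0):
    """Draw the same proportion of vertices from every bucket."""
    rng = random.Random(seed)
    sample = []
    for bucket in bucket_vertices(vertex_scores(edges), n_buckets):
        if not bucket:
            continue
        # Take at least one vertex from each non-empty bucket.
        k = min(len(bucket), max(1, round(fraction * len(bucket))))
        sample.extend(rng.sample(bucket, k))
    return sample


if __name__ == "__main__":
    # Toy graph: a directed ring of 8 vertices plus a hub at vertex 0.
    toy = [(i, (i + 1) % 8) for i in range(8)] + [(0, j) for j in range(2, 7)]
    print(guided_sample(toy, n_buckets=4, fraction=0.5))
```

The sampled vertices would then serve as starting points for subgraph enumeration. Equal-width score ranges are an equally plausible reading of "based on the distribution of the scores" and would change how many vertices each bucket holds.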

References

Y. Artzy-Randrup, S. J. Fleishman, N. Ben-Tal, and L. Stone. Comment on `Network Motifs: Simple Building Blocks of Complex Networks' and `Superfamilies of Evolved and Designed Networks'. Science, 305(5687):1107c, 2004.

N. Kashtan, S. Itzkovitz, R. Milo, and U. Alon. Efficient Sampling Algorithm for Estimating Subgraph Concentrations and Detecting Network Motifs. Bioinformatics, 20(11):1746-1758, 2004.

R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D. Chklovskii, and U. Alon. Network Motifs: Simple Building Blocks of Complex Networks. Science, 298(5594):824-827, 2002.

* These authors contributed equally to this work.
