On Network Randomization Methods: A Negative Control Study
2012 NSF Bio-Grid REU Research Fellow
University of Connecticut
Max Espinoza (Fairfield 2013)

Understanding negative controls is a necessity in any scientific discipline that tries to show correlations. A negative control is a sample in an experiment that is expected to yield a negative result. In network theory, as in many other sciences, graph randomization is vital to generating a negative control group. A typical example can be found in motif detection algorithms, and we use motif detection as the case for this study because it needs randomized graphs as a negative control. Given an input network, a motif detection algorithm finds that network's subgraph frequencies, then compares them to the subgraph frequencies of the negative control to determine which subgraphs are statistically overrepresented. These statistically overrepresented subgraphs are defined as motifs.

The randomized networks that constitute the control group are generated by a graph randomization algorithm, and many such algorithms are currently in use in motif detection. The problem is that, if randomized graphs serve as the negative control, different randomization algorithms may yield different results. Randomization algorithms differ in the topological properties they preserve. The more topological properties a randomization process preserves, the lower the probability of encountering a false positive. In motif detection, some networks may have higher frequencies of specific subgraphs because of their topological properties, and those subgraphs could be misinterpreted as motifs if the randomized graphs did not preserve the same properties. We intend to compare the sets of motifs reported by motif detection software to investigate how changing the randomization algorithm affects the motifs found for a fixed input network.
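The comparison against the negative control described above can be illustrated with a z-score, one common way to quantify overrepresentation. This is a minimal sketch under stated assumptions, not the criterion of any particular tool: the function names `z_score` and `is_overrepresented` and the threshold of 2.0 are hypothetical choices made for illustration.

```python
import statistics


def z_score(observed_count, random_counts):
    """Z-score of a subgraph's count in the input network relative to
    its counts in an ensemble of randomized (negative control) networks.

    Returns None when the randomized counts have zero variance, since
    the z-score is undefined in that case.
    """
    mean = statistics.mean(random_counts)
    std = statistics.pstdev(random_counts)  # population standard deviation
    if std == 0:
        return None
    return (observed_count - mean) / std


def is_overrepresented(observed_count, random_counts, threshold=2.0):
    """Flag a subgraph as a candidate motif when its z-score exceeds
    `threshold`. The default of 2.0 is an illustrative assumption."""
    z = z_score(observed_count, random_counts)
    return z is not None and z > threshold


# Usage: one subgraph type counted in the input network and in eight
# randomized networks (illustrative numbers, not measured data).
candidate = is_overrepresented(10, [2, 4, 4, 4, 5, 5, 7, 9])
```

Because the mean and standard deviation come from the randomized ensemble, a randomization algorithm that preserves different properties shifts both values, and with them the set of subgraphs that cross the threshold.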
We will use mFinder, an open-source, network-centric motif detection tool, because of its documentation and modularity. An E. coli protein-protein interaction (PPI) network will serve as the fixed input network. We will compare the motifs found by mFinder using the following graph randomization algorithms: the switching method, the stubs method, and the go-with-the-winners algorithm. Future research could include modifying mFinder to use other graph randomization algorithms, such as the Erdős-Rényi algorithm and the Barabási-Albert preferential attachment algorithm. Searching for motifs of three nodes or fewer would not yield many significant results to compare, and searching for motifs of more than six nodes is computationally expensive. Therefore, we will use mFinder to search for four- to six-node motifs in the E. coli PPI network, using each of these graph randomization algorithms to generate the negative control group. We will then compare the motifs found and examine the effect that altering the graph randomization algorithm has on motif detection. Demonstrating that changing the graph randomization algorithm in mFinder can yield different results is an incremental step toward future research. The importance of this study is to help researchers understand and see the effects that altering the graph randomization algorithm has on the negative control. Network motif detection is only one of the many applications that require randomization algorithms to generate negative control groups.
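As an illustration of what a degree-preserving randomization does, the switching method named above can be sketched as repeated edge swaps on an undirected simple graph. This is a minimal sketch and not mFinder's implementation: the function names, the fixed number of swap attempts, and the assumption that the input has no self-loops or duplicate edges are all choices made here for illustration.

```python
import random
from collections import Counter


def degree_sequence(edges):
    """Map each node to its degree in an undirected edge list."""
    degrees = Counter()
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    return degrees


def switch_randomize(edges, num_switch_attempts, seed=None):
    """Randomize an undirected simple graph by edge switching.

    Each attempt picks two edges (a, b) and (c, d) and proposes
    (a, d) and (c, b). Proposals that would create a self-loop or a
    duplicate edge are rejected, so every node keeps its degree.
    Assumes the input has no self-loops or duplicate edges.
    """
    rng = random.Random(seed)
    edge_list = [tuple(sorted(e)) for e in edges]
    edge_set = set(edge_list)
    for _ in range(num_switch_attempts):
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c  # consider both orientations of the second edge
        if a == d or c == b:
            continue  # would create a self-loop
        new1 = tuple(sorted((a, d)))
        new2 = tuple(sorted((c, b)))
        if new1 in edge_set or new2 in edge_set:
            continue  # would create a duplicate edge
        edge_set.discard(edge_list[i])
        edge_set.discard(edge_list[j])
        edge_set.update((new1, new2))
        edge_list[i], edge_list[j] = new1, new2
    return edge_list


# Usage: a small hypothetical network, randomized with 100 swap attempts.
original = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 3)]
randomized = switch_randomize(original, 100, seed=1)
```

The stubs method and the go-with-the-winners algorithm sample graphs with the same degree sequence in different ways, which is why the three can produce different control ensembles for the same input network.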