SUPPLEMENT
PANET: A GPU-based tool for fast parallel analysis of robustness dynamics and feedforward/feedback loop structures in large-scale biological networks
Hung-Cuong Trinh1, Duc-Hau Le2 and Yung-Keun Kwon1,*
1 School of Electrical Engineering, University of Ulsan, 93 Daehak-ro, Nam-gu, Ulsan 680-749
2 School of Computer Science and Engineering, Water Resources University, 175 TaySon, Dong Da, Hanoi, Vietnam
Hung-Cuong Trinh et al.
Text S1. A brief introduction to OpenCL
In this work, we employed an OpenCL library which is designed to run on any available multi-core central processing
unit (CPU) or graphics processing unit (GPU) (http://www.khronos.org/opencl/). It utilizes the tremendous computing power
of a normal computer by operating the cores of the CPU or hundreds/thousands of cores in the GPU. In general, an OpenCL-executable device is divided into one or more compute units (CUs). Each of these is further divided into one or more processing elements (PEs). CUs refer to cores in a multi-core CPU or streaming multiprocessors in a GPU, whereas a PE represents a virtual scalar processor: an arithmetic logic unit of a CPU or a scalar processor of a GPU.
In general, an OpenCL application is divided into two parts, the device and the host programs. The device program consists of special functions, called kernels, which are coded with the OpenCL programming language. On the other hand, the
host program offers an interface to manage the device execution flow. In other words, the kernel is a basic unit of executable
code that can run on GPU or CPU devices whereas the host program takes responsibility for sending kernels to be executed
on devices using command queues.
From the perspective of logical data parallelism, the host program defines an N-dimensional array of work-items (N = 1, 2 or 3), in
each of which the same kernel is executed. In addition, work-items are grouped into work-groups, and each work-group can synchronize its work-items, which share local memory. From the viewpoint of the OpenCL hardware architecture, the work-groups are distributed to CUs and the work-items in a work-group are executed concurrently on PEs of the
same CU.
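The index arithmetic behind this grouping can be illustrated with a small serial simulation. This is only a sketch in plain Python, not OpenCL code: the names `ndrange_1d` and `run_kernel` are hypothetical, the range is one-dimensional, and the global size is assumed to be a multiple of the work-group size.

```python
from typing import Callable, List, Tuple


def ndrange_1d(global_size: int, local_size: int) -> List[Tuple[int, int, int]]:
    """List (global_id, group_id, local_id) for a 1-D range of work-items.

    Assumption: global_size is a multiple of local_size.
    """
    if local_size <= 0 or global_size % local_size != 0:
        raise ValueError("global_size must be a positive multiple of local_size")
    items = []
    for group_id in range(global_size // local_size):
        for local_id in range(local_size):
            # Each work-item is identified globally by its group and its
            # position inside that group.
            global_id = group_id * local_size + local_id
            items.append((global_id, group_id, local_id))
    return items


def run_kernel(kernel: Callable[..., None], global_size: int,
               local_size: int, *buffers) -> None:
    """Serially emulate the execution of the same kernel in every work-item."""
    for global_id, _group_id, _local_id in ndrange_1d(global_size, local_size):
        kernel(global_id, *buffers)


def square_kernel(global_id: int, src: list, dst: list) -> None:
    """Toy kernel: each work-item handles exactly one array element."""
    dst[global_id] = src[global_id] * src[global_id]


data = list(range(8))
result = [0] * 8
run_kernel(square_kernel, 8, 4, data, result)
```

On a real device the work-items of a work-group run concurrently on the PEs of one CU; the loop above only reproduces the mapping from identifiers to data.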
OpenCL defines a hierarchy of different memory types in terms of functionality, size, and speed. The first type of memory
is global memory, which has the largest size but the lowest bandwidth. It can be read and written by the host and the
OpenCL device, and thus allows intercommunication between the host and the OpenCL device. The second type of memory
is constant memory, which is the part of the global memory that remains constant during the execution of a kernel. The third
type of memory is the local memory, which is the smallest but the fastest. Each CU has an individual local memory to be
shared by the PEs within the CU. It can be used to synchronize between the work-items in the same work-group. The last one
is private memory, which is private to a work-item. Variables defined in the private memory of a work-item are not visible to
the other work-items. The programmer must choose the most appropriate memory in order to achieve the best possible performance with the available memory bandwidth.
Text S2. OpenCL-based parallel computation of robustness
(a) Pseudo-codes for robustness computation in parallel
The following figure shows the pseudo-codes of two important functions, parallel_computing_attractors_for_all_states
and parallel_computing_attractors_for_all_rules which can compute the attractors in parallel for all initial states (S) and every update rule (F), respectively, given a Boolean network. In computing attractors, we used an array ATT where each element
ATT[s, f] represents an attractor of a network G(V, A) starting from the initial state s and the sequence of update rules f. The
algorithm iteratively computes a state transition until it arrives at a state which has already been visited. We note that the
dashed blocks denote kernel codes which are executed in parallel on CPUs or GPUs. In other words, the original NetDS serially computed the attractors for a number of initial states or update rules, whereas PANET computes them in parallel by distributing the tested cases to PEs in the OpenCL device.
function [ATT] parallel_computing_attractors_for_all_states(V, A, f, S)
// V, A: A set of nodes V={v1, v2, …, vN} and a set of links A of a network
//       (Here, V[i] represents vi∈V.)
// f:    A sequence of update rules (Here, f = f1 f2 … fN and fi represents the
//       update rule with respect to vi∈V.)
// S:    A collection of initial states considered for the robustness investigation
//       (Here, S[i] represents the ith initial state in S.)
// ATT:  The resulting collection of attractors, each of which is represented by a
//       sequence of states.
ATT[0..2^|V|-1] ← NULL; // Every element of ATT is initialized by NULL.
nth[0..2^|V|-1] ← 0;    // Every element of nth is initialized by 0.
for i←1 to |S| // for every state
    s ← S[i];
    if (ATT[s, f] ≠ NULL) continue;
    endif
    traj ← NULL;
    count ← 0;
    while (TRUE)
        count++;
        traj ← traj ⊕ s; // Here ⊕ represents the string concatenation operation.
        nth[s] ← count;
        s’ ← update_states(V, A, f, s); // This computes the next state.
        if (nth[s’] ≠ 0)
            if (ATT[s’, f] = NULL)
                att ← traj[nth[s’]..count]; // Given a string t = t1 t2 … tT, t[i..j] represents
                                            // ti ti+1 … tj-1 tj, which is a substring of t.
            else
                att ← ATT[s’, f];
            endif
            for j←1 to count
                ATT[traj[j], f] = att;
            endfor
            break;
        else
            s ← s’;
        endif
    endwhile
endfor
return ATT;
end

function [ATT] parallel_computing_attractors_for_all_rules(V, A, F, s)
// V, A: A set of nodes V={v1, v2, …, vN} and a set of links A of a network
//       (Here, V[i] represents vi∈V.)
// F:    A collection of sequences of update rules (Here, F[i] represents the ith
//       sequence of update rules in F.)
// s:    An initial state considered for the robustness investigation
// ATT:  The resulting collection of attractors, each of which is represented
//       by a sequence of states.
ATT[0..2^|V|-1] ← NULL; // Every element in ATT is initialized by NULL.
for i←1 to |F| // for every rule
    nth[0..2^|V|-1] ← 0; // Every element in nth is initialized by 0.
    traj ← NULL;
    count ← 0;
    cur ← s; // Every rule starts from the given initial state s.
    while (TRUE)
        count++;
        traj ← traj ⊕ cur; // ⊕ represents the string concatenation operation.
        nth[cur] ← count;
        s’ ← update_states(V, A, F[i], cur); // This computes the next state.
        if (nth[s’] ≠ 0)
            att ← traj[nth[s’]..count]; // Given a string t = t1 t2 … tT, t[i..j] represents
                                        // ti ti+1 … tj-1 tj, which is a substring of t.
            ATT[s, F[i]] = att;
            break;
        else
            cur ← s’;
        endif
    endwhile
endfor
return ATT;
end
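The state-wise attractor search above can be sketched serially in Python as follows. This is an illustrative sketch under stated assumptions, not the PANET kernel: the source does not define update_states, so the sketch assumes signed links encoded as (source, target, sign), a synchronous update in which every node applies AND or OR to its (possibly negated) inputs, and a node without incoming links keeping its value. All function names are hypothetical.

```python
from itertools import product
from typing import Dict, Sequence, Tuple

State = Tuple[int, ...]
Link = Tuple[int, int, int]  # (source, target, sign); sign is +1 or -1 (assumed encoding)


def update_state(links: Sequence[Link], rules: Sequence[str], s: State) -> State:
    """Synchronous update under the assumed AND/OR semantics."""
    nxt = list(s)
    for j in range(len(s)):
        # A negative link contributes the negated value of its source node.
        inputs = [s[i] if sign > 0 else 1 - s[i] for i, t, sign in links if t == j]
        if inputs:  # assumption: a node without inputs keeps its value
            nxt[j] = int(all(inputs)) if rules[j] == "AND" else int(any(inputs))
    return tuple(nxt)


def attractors_for_all_states(links: Sequence[Link], rules: Sequence[str],
                              states: Sequence[State]) -> Dict[State, Tuple[State, ...]]:
    """Map every visited state to its attractor (a tuple of states)."""
    att: Dict[State, Tuple[State, ...]] = {}
    for s in states:
        traj = []      # states visited in this trajectory, in order
        position = {}  # state -> index in traj (plays the role of nth)
        while s not in att and s not in position:
            position[s] = len(traj)
            traj.append(s)
            s = update_state(links, rules, s)
        if s in att:
            found = att[s]                     # reached a known attractor
        else:
            found = tuple(traj[position[s]:])  # closed a new cycle
        for visited in traj:
            att[visited] = found               # label the whole trajectory
    return att


# Two nodes that copy each other: (0,1) and (1,0) form a cycle of length 2.
example_links = [(0, 1, 1), (1, 0, 1)]
example_rules = ["AND", "AND"]
all_states = list(product((0, 1), repeat=2))
example_att = attractors_for_all_states(example_links, example_rules, all_states)
```

As in the pseudocode, a trajectory stops as soon as it reaches a state whose attractor is already known, so shared transients are not recomputed.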
By using those functions, we can easily compute not only the robustness of a node against the initial-state perturbation
and the update-rule perturbation (γs(v) and γr(v), respectively), but also the robustness of a network G against the initial-state
perturbation and the update-rule perturbation (γs(G) and γr(G), respectively) as shown in the following pseudo-codes.
function [γ] robustness_initial_state(V, A, f, S)
// V, A: A set of nodes V={v1, v2, …, vN} and a set of links A of a network
//       (Here, V[i] represents vi∈V.)
// f:    A sequence of update rules (Here, f = f1 f2 … fN and fi represents the
//       update rule with respect to vi∈V.)
// S:    A collection of initial states considered for the robustness
//       investigation (Here, S[i] represents the ith initial state in S.)
// γ:    The resulting robustness against initial-state perturbations
// Step 1: Examine the original attractors.
ATT ← parallel_computing_attractors_for_all_states(V, A, f, S);
// Step 2: Examine the changed attractors by initial-state perturbations.
R[1..|V|] ← 0; // Every element of R is initialized by 0.
for i←1 to |S|
    S’[1..|V|] ← NULL; // Every element of S’ is initialized by NULL.
    for j←1 to |V|
        s ← S[i];
        s[j] ← 1 - s[j]; // s[j] denotes the value of vj in s, and then the resultant s
                         // denotes an initial-state perturbation at a node vj∈V.
        S’[j] ← s;
    endfor
    ATT’ ← parallel_computing_attractors_for_all_states(V, A, f, S’);
    for j←1 to |V|
        if (ATT[i] = ATT’[j]) R[j]++;
        endif
    endfor
endfor
// Step 3: Compute the robustness against the initial-state perturbations
γ ← 0;
for j←1 to |V|
    R[j] = R[j]/|S|; // Here, R[j] represents γs(vj).
    γ ← γ + R[j];
endfor
γ ← γ / |V|; // As a result, γ represents the robustness of the given network.
return γ;
end

function [γ] robustness_update_rule(V, A, f, S)
// V, A: A set of nodes V={v1, v2, …, vN} and a set of links A of a network
//       (Here, V[i] represents vi∈V.)
// f:    A sequence of update rules (Here, f = f1 f2 … fN and fi represents the
//       update rule with respect to vi∈V.)
// S:    A collection of initial states considered for the robustness
//       investigation (Here, S[i] represents the ith initial state in S.)
// γ:    The resulting robustness against update-rule perturbations
// Step 1: Examine the original attractors.
ATT ← parallel_computing_attractors_for_all_states(V, A, f, S);
// Step 2: Examine the changed attractors by update-rule perturbations.
R[1..|V|] ← 0; // Every element of R is initialized by 0.
for i←1 to |S|
    F[1..|V|] ← NULL; // Every element of F is initialized by NULL.
    for j←1 to |V|
        f’ ← f;
        if (f[j] = AND) f’[j] ← OR;
        else f’[j] ← AND; // f’ means an update-rule perturbation at a node vj∈V.
        endif
        F[j] ← f’;
    endfor
    ATT’ ← parallel_computing_attractors_for_all_rules(V, A, F, S[i]);
    for j←1 to |V|
        if (ATT[i] = ATT’[j]) R[j]++;
        endif
    endfor
endfor
// Step 3: Compute the robustness against the update-rule perturbations
γ ← 0;
for j←1 to |V|
    R[j] = R[j]/|S|; // Here, R[j] represents γr(vj).
    γ ← γ + R[j];
endfor
γ ← γ / |V|; // As a result, γ represents the robustness of the given network.
return γ;
end
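A compact serial sketch of both robustness measures is given below. It is not the PANET implementation: it assumes the same hypothetical network encoding as before (signed links as (source, target, sign), synchronous AND/OR updates, nodes without inputs keep their value) and it compares attractors as sets of states, which is sufficient for deterministic synchronous dynamics because every state has exactly one successor.

```python
from itertools import product
from typing import FrozenSet, List, Sequence, Tuple

State = Tuple[int, ...]
Link = Tuple[int, int, int]  # (source, target, sign); assumed encoding


def update_state(links: Sequence[Link], rules: Sequence[str], s: State) -> State:
    """Synchronous update; a node without inputs keeps its value (assumption)."""
    nxt = list(s)
    for j in range(len(s)):
        inputs = [s[i] if sign > 0 else 1 - s[i] for i, t, sign in links if t == j]
        if inputs:
            nxt[j] = int(all(inputs)) if rules[j] == "AND" else int(any(inputs))
    return tuple(nxt)


def attractor(links: Sequence[Link], rules: Sequence[str], s: State) -> FrozenSet[State]:
    """Follow the trajectory from s until a state repeats; return the cycle."""
    position, traj = {}, []
    while s not in position:
        position[s] = len(traj)
        traj.append(s)
        s = update_state(links, rules, s)
    return frozenset(traj[position[s]:])


def robustness_initial_state(links, rules, states) -> Tuple[List[float], float]:
    """Return ([gamma_s(v_j)], gamma_s(G)) over the given initial states."""
    n = len(rules)
    same = [0] * n
    for s in states:
        original = attractor(links, rules, s)
        for j in range(n):
            flipped = s[:j] + (1 - s[j],) + s[j + 1:]  # perturb node j
            if attractor(links, rules, flipped) == original:
                same[j] += 1
    per_node = [c / len(states) for c in same]
    return per_node, sum(per_node) / n


def robustness_update_rule(links, rules, states) -> Tuple[List[float], float]:
    """Return ([gamma_r(v_j)], gamma_r(G)) over the given initial states."""
    n = len(rules)
    same = [0] * n
    for s in states:
        original = attractor(links, rules, s)
        for j in range(n):
            changed = list(rules)
            changed[j] = "OR" if rules[j] == "AND" else "AND"  # perturb rule j
            if attractor(links, changed, s) == original:
                same[j] += 1
    per_node = [c / len(states) for c in same]
    return per_node, sum(per_node) / n


# Toy network: nodes 0 and 1 hold their values; node 2 computes v0 AND v1.
toy_links = [(0, 0, 1), (1, 1, 1), (0, 2, 1), (1, 2, 1)]
toy_rules = ["AND", "AND", "AND"]
toy_states = list(product((0, 1), repeat=3))
gs_nodes, gs_net = robustness_initial_state(toy_links, toy_rules, toy_states)
gr_nodes, gr_net = robustness_update_rule(toy_links, toy_rules, toy_states)
```

In the toy network, flipping node 0 or node 1 always changes the fixed point, whereas flipping node 2 never does; switching the rule of node 2 from AND to OR leaves the attractor unchanged only when the two inputs agree.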
Text S3. OpenCL-based parallel examination of feedback and feed-forward loops
(a) A pseudo-code for efficient FBL search in parallel
The following figure shows the pseudo-code of the ‘searchFBL’ function to search all feedback loops of a maximum
length L in a given network G(V, A). For each link (vi, vj) ∈ A, the algorithm starts to search all the FBLs involving the link
(vi, vj) based on depth-first-search (DFS). We note that the dashed block explains the searching task. It is a kernel code which
can be executed in parallel on CPUs or GPUs. In addition, we improved the search speed by avoiding redundant search (*
and ** lines in the pseudo-code).
function [FBL] searchFBL(V, A, L)
// V, A: A set of nodes V={v1, v2, …, vN} and a set of links A of a network (Here, V[i] represents vi∈V.)
// L:    The maximal FBL length to be examined
// FBL:  The resultant set of the feedback loops found
FBL ← NULL;
for each (vi, vj) ∈ A with i < j // To avoid redundant search (*)
    k ← 0;
    stack[k] ← (vi, vj);
    visited[(vi, vj)] ← TRUE;
    while (k ≥ 0)
        (va, vb) ← stack[k];
        if (k = L or b < i) // To avoid redundant search (**)
            k ← k-1;
            continue;
        endif
        if (b = i)
            FBL ← FBL ∪ {(stack[0], stack[1], …, stack[k])}; // The sequence of links, (stack[0], stack[1], …, stack[k]),
                                                             // eventually represents the feedback loop found.
            k ← k-1;
            continue;
        endif
        if (∃(vb, vc) ∈ A such that visited[(vb, vc)] ≠ TRUE)
            visited[(vb, vc)] ← TRUE;
            k ← k+1;
            stack[k] ← (vb, vc);
        else
            k ← k-1;
            for each (vb, v) ∈ A, visited[(vb, v)] ← FALSE;
            endfor
        endif
    endwhile
endfor
return FBL;
end
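The idea of rooting each loop at its smallest-indexed node, which is what the (*) and (**) lines achieve, can be sketched in Python as below. This is a simplified serial sketch and not the searchFBL kernel itself: it tracks visited nodes on the current path instead of visited links, it ignores self-loops, and the function name `search_fbl` and the (source, target) link encoding are assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def search_fbl(links: Sequence[Tuple[int, int]], max_len: int) -> List[Tuple[int, ...]]:
    """Enumerate simple directed cycles with at most max_len links.

    Each cycle is returned once, as the tuple of its nodes starting from
    the smallest node index.
    """
    adj: Dict[int, List[int]] = {}
    for a, b in links:
        adj.setdefault(a, []).append(b)

    loops: List[Tuple[int, ...]] = []
    for start in sorted(adj):
        # Iterative depth-first search; every stack entry is (node, path so far).
        stack = [(start, [start])]
        while stack:
            node, path = stack.pop()
            for nxt in adj.get(node, []):
                if nxt == start and len(path) >= 2:
                    # Closing link found: the cycle has len(path) links.
                    loops.append(tuple(path))
                elif nxt > start and nxt not in path and len(path) < max_len:
                    # Only nodes with a larger index than the root are explored,
                    # so a cycle is never rediscovered from another of its nodes.
                    stack.append((nxt, path + [nxt]))
    return loops


example_links = [(0, 1), (1, 0), (1, 2), (2, 0), (2, 3)]
loops_up_to_3 = sorted(search_fbl(example_links, 3))
loops_up_to_2 = sorted(search_fbl(example_links, 2))
```

Because the searches started from different root nodes are independent of each other, each of them can in principle be assigned to a separate work-item.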
(b) A pseudo-code for efficient FFL search in parallel
The following figure shows the pseudo-code of the ‘searchFFL’ function to search all feed-forward loops of a maximum
length L in a given network G(V, A). We have a set of source nodes VS and a set of destination nodes VD used to find FFLs.
For each link (vs, vj) ∈ A with vs ∈ VS, the algorithm starts to search all the FFLs involving the link (vs, vj) based on depth-first-search. We note that the dashed block explains the searching task. It is a kernel code which can be executed in parallel
on CPUs or GPUs.
function [FFL] searchFFL(VS, VD, A, L)
// VS, VD: A set of source nodes VS and a set of destination nodes VD of a network (VS∩VD=∅)
// A:      A set of links A of a network
// L:      The maximal FFL length to be examined
// FFL:    The resultant set of the feed-forward loops found
FFL ← NULL;
for each (vs, vj) ∈ A with vs ∈ VS
    k ← 0;
    stack[k] ← (vs, vj);
    visited[(vs, vj)] ← TRUE;
    while (k ≥ 0)
        (va, vb) ← stack[k];
        if (k = L)
            k ← k-1;
            continue;
        endif
        if (vb ∈ VD)
            FFL ← FFL ∪ {(stack[0], stack[1], …, stack[k])}; // The sequence of links, (stack[0], stack[1], …, stack[k]),
                                                             // eventually represents the simple path found.
            k ← k-1;
            continue;
        endif
        if (∃(vb, vc) ∈ A such that visited[(vb, vc)] ≠ TRUE)
            visited[(vb, vc)] ← TRUE;
            k ← k+1;
            stack[k] ← (vb, vc);
        else
            k ← k-1;
            for each (vb, v) ∈ A, visited[(vb, v)] ← FALSE;
            endfor
        endif
    endwhile
endfor
return FFL;
end
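The path enumeration that underlies this search can be sketched in Python as follows. As with the previous sketch, this is a simplified serial illustration rather than the searchFFL kernel: it tracks visited nodes instead of visited links, and the name `search_paths` and the (source, target) link encoding are assumptions. It returns the simple paths from source nodes to destination nodes with at most max_len links and, like the pseudocode, stops extending a path once a destination node is reached.

```python
from typing import Dict, List, Sequence, Set, Tuple


def search_paths(sources: Set[int], targets: Set[int],
                 links: Sequence[Tuple[int, int]], max_len: int) -> List[Tuple[int, ...]]:
    """Enumerate simple paths (as node tuples) from sources to targets.

    Assumption: sources and targets are disjoint, as required for VS and VD.
    """
    adj: Dict[int, List[int]] = {}
    for a, b in links:
        adj.setdefault(a, []).append(b)

    paths: List[Tuple[int, ...]] = []
    for s in sorted(sources):
        stack = [[s]]  # iterative depth-first search over partial paths
        while stack:
            path = stack.pop()
            node = path[-1]
            if node in targets:
                paths.append(tuple(path))  # record and do not extend further
                continue
            if len(path) - 1 >= max_len:   # the path already has max_len links
                continue
            for nxt in adj.get(node, []):
                if nxt not in path:        # keep the path simple
                    stack.append(path + [nxt])
    return paths


example_links = [(0, 1), (1, 2), (0, 2), (2, 3)]
paths_up_to_2 = sorted(search_paths({0}, {2}, example_links, 2))
paths_up_to_1 = sorted(search_paths({0}, {2}, example_links, 1))
```

In the example, node 0 reaches node 2 both directly and through node 1; two such paths between the same pair of nodes are the ingredients of a feed-forward structure.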
Text S4. Format of an output file by batch-mode simulation on RBNs
After the batch-mode simulation is completed, two resultant files are created: “net_based_result.txt” and
“node_based_result.txt”. The former and the latter describe network-based and node-based results, respectively.
(a) Network-based result
As shown in the figure below, “net_based_result.txt” consists of 11 network-based results with respect to robustness and
FFL/FBL structures of RBNs. Each row describes a result of one RBN.
Column  Name         Description
1       Network ID   The unique identification number of an RBN
2       No.Nodes     The number of nodes of an RBN
3       No.Edges     The number of edges of an RBN
4       sRobustness  The robustness against initial-state perturbation of an RBN
5       rRobustness  The robustness against update-rule perturbation of an RBN
6       NuFBL+       The number of positive FBLs of an RBN
7       NuFBL-       The number of negative FBLs of an RBN
8       NuCoFBL      The number of coherently coupled FBLs of an RBN
9       NuInCoFBL    The number of incoherently coupled FBLs of an RBN
10      NuCoFFL      The number of coherently coupled FFLs of an RBN
11      NuInCoFFL    The number of incoherently coupled FFLs of an RBN
(Column description in “net_based_result.txt”)
(Example of “net_based_result.txt”)
(b) Node-based result
As shown in the figure below, “node_based_result.txt” shows more detailed results than “net_based_result.txt” because
it includes the results with respect to robustness and FBL structures at each node level in the RBNs (The result regarding FFL
structures are not included for simplicity, though, because there can be as many cases as the number of all pairs of nodes).
Column  Name          Description
1       Network ID    The unique identification number of an RBN
2       No.Nodes      The number of nodes of an RBN
3       No.Edges      The number of edges of an RBN
4       Node ID       The unique identification number of a node
5       sRobustness   The robustness against initial-state perturbation of a node
6       rRobustness   The robustness against update-rule perturbation of a node
7       NuFBL<=L      The number of FBLs of length <= L in which a node is involved
        NuFBL=L       The number of FBLs of length = L in which a node is involved
8       PosNuFBL<=L   The number of positive FBLs of length <= L in which a node is involved
        PosNuFBL=L    The number of positive FBLs of length = L in which a node is involved
9       NegNuFBL<=L   The number of negative FBLs of length <= L in which a node is involved
        NegNuFBL=L    The number of negative FBLs of length = L in which a node is involved
(Column description in “node_based_result.txt”)
(Example of “node_based_result.txt”)
(Figure S1: four scatter plots, panels (a) to (d). x-axis: Ratio of coherent FFLs; y-axis: γr(G). Panel titles: “Shuffle I, |V| = 1609 and |A| = 5063”, “Shuffle I, |V| = 818 and |A| = 1801”, “Shuffle II, |V| = 1609 and |A| = 5063”, “Shuffle II, |V| = 818 and |A| = 1801”.)
Figure S1. Relationship between the ratio of coherent FFLs and update-rule robustness in large-scale Boolean networks by Shuffling models.
(a) Result of Shuffle I-based RBNs of the same size with HSN. (b) Result of Shuffle I-based RBNs of the same size with
CCSN. (c) Result of Shuffle II-based RBNs of the same size with HSN. (d) Result of Shuffle II-based RBNs of the same size
with CCSN. The maximal length of examined FFLs is set to 4 or 6 for (a) and (c), or (b) and (d), respectively. For robustness
against update-rule perturbation, |S| is set to 1,024. In (a), (b) and (c), the correlations are not statistically significant (P-values
= 0.715, 0.490, and 0.832, respectively). On the other hand, the correlation only in (d) is statistically significant (the slope of
the regression line = 0.03297, P-value = 0.015).
(Figure S2: four scatter plots, panels (a) to (d). x-axis: Ratio of coherent FBLs; y-axis: γs(G). Panel titles: “|V| = 1609 and |A| = 5063”, “|V| = 818 and |A| = 1801”, “|V| = 50 and |A| = 97”, “|V| = 50 and |A| = 117”.)
Figure S2. Relationship between the ratio of coherent FBLs and initial-state robustness in Boolean networks by the ER model.
(a) Result of RBNs of the same size with the HSN. (b) Result of RBNs of the same size with the CCSN. (c) Result of RBNs
with |V| = 50 and |A| = 97. (d) Result of RBNs with |V| = 50 and |A| = 117. The maximal length of examined FBLs is set to 6,
8, 50 and 12, in (a) through (d), respectively. For robustness against initial-state perturbation, |S| is set to 1,024. In (a), (b),
and (d), the correlations are not statistically significant (P-values = 0.052, 0.384, and 0.080, respectively). On the other hand,
the correlation is significantly positive in (c) (the slope of the regression line = 0.05610, P-value = 0.012).
(Figure S3: four scatter plots, panels (a) to (d). x-axis: Ratio of coherent FBLs; y-axis: γs(G). Panel titles: “Shuffle I, |V| = 1609 and |A| = 5063”, “Shuffle I, |V| = 818 and |A| = 1801”, “Shuffle II, |V| = 1609 and |A| = 5063”, “Shuffle II, |V| = 818 and |A| = 1801”.)
Figure S3. Relationship between the ratio of coherent FBLs and initial-state robustness in Boolean networks by Shuffling models.
(a) Result of Shuffle I-based RBNs of the same size with the HSN. (b) Result of Shuffle I-based RBNs of the same size with
the CCSN. (c) Result of Shuffle II-based RBNs of the same size with the HSN. (d) Result of Shuffle II-based RBNs of the
same size with the CCSN. The maximal length of examined FBLs is set to 6 or 8 for (a) and (c), or (b) and (d), respectively.
For robustness against initial-state perturbation, |S| is set to 1,024. In (a), (b), (c) and (d), the correlations are not statistically
significant (P-values = 0.603, 0.356, 0.211 and 0.551, respectively).