Parallel Apriori Algorithm and Frequent itemsets with Gene Expression Data

CSE 633 Fall 2012
Parallel Apriori Algorithm and
Frequent itemsets with Gene Expression Data
Hyunwoo Sung
Generating Frequent Itemsets
Gene Expression Data Used For Input
            Sample1  Sample2  Sample3  Sample4  Sample5  Sample6  Sample7  Sample8
Gene 1      UP       UP       Down     Down     UP       UP       UP       Down
Gene 2      UP       Down     Down     Down     UP       Down     UP       Down
Gene 3      Down     UP       UP       Down     Down     Down     Down     UP
Gene 4      UP       Down     UP       Down     Down     Down     UP       UP
Cancer      ALL      ALL      ALL      AML      ALL      ALL      AML      ALL

Encoded transactions (one per sample; each entry pairs the item index with its 0/1 value):
Sample1: 11 21 30 41 51
Sample2: 11 20 31 40 51
Sample3: 10 20 31 41 51
Sample4: 10 20 30 40 50
Sample5: 11 21 30 40 51
Sample6: 11 20 30 40 51
Sample7: 11 21 30 41 50
Sample8: 10 20 31 41 51
When transmitting data between processors, the processors exchange integers rather than strings: UP is encoded as 1 and Down as 0, and the cancer names ALL and AML are encoded as 1 and 0, respectively.
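As a minimal sketch of this preprocessing step (the function and container names below are illustrative assumptions, not taken from the project code), each string cell of the expression matrix can be mapped to an integer before it is exchanged over MPI:

#include <cstddef>
#include <string>
#include <vector>

// Illustrative encoding: UP -> 1, Down -> 0, ALL -> 1, AML -> 0.
int encodeCell(const std::string &value)
{
    return (value == "UP" || value == "ALL") ? 1 : 0;
}

// Convert the whole string-valued matrix into an integer matrix.
std::vector<std::vector<int> > encodeMatrix(
        const std::vector<std::vector<std::string> > &raw)
{
    std::vector<std::vector<int> > encoded(raw.size());
    for (std::size_t i = 0; i < raw.size(); ++i)
        for (std::size_t j = 0; j < raw[i].size(); ++j)
            encoded[i].push_back(encodeCell(raw[i][j]));
    return encoded;
}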
Sequential Apriori Algorithm Implementation in C++

itemsetIndex = 1;
while( itemsetIndex < numberOfItems )
{
    generateCandidates(itemsetIndex);
    generateFrequentItemsets(itemsetIndex);
    print(frequentItemsetTableElements);
    itemsetIndex++;
}
Continued from Implementation

generateCandidates(int sequence)
    for i = 0 to candidates.size()
        for j = i + 1 to candidates.size()
            construct the candidates through self-joining
            store the candidates in the map (key = candidate, value = 0)
Worst case: O(C_i^2) when sequence >= 2, where C_i is the number of candidates at level i (i = 1, 2, 3, ...)
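A short C++ sketch of the self-join step, assuming the previous level's itemsets are kept as sorted vectors of item codes and the candidate map uses the itemset as key with an initial support count of 0. It follows the standard Apriori join rule (two itemsets that agree on all but their last item are combined); it is an illustration, not the project's actual code:

#include <algorithm>
#include <cstddef>
#include <map>
#include <vector>

typedef std::vector<int> Itemset;   // items are kept in ascending order

// Self-join the previous level's itemsets into candidates one item larger,
// each stored in the map with an initial support count of 0.
std::map<Itemset, int> generateCandidates(const std::vector<Itemset> &previousLevel)
{
    std::map<Itemset, int> candidates;
    for (std::size_t i = 0; i < previousLevel.size(); ++i)
    {
        for (std::size_t j = i + 1; j < previousLevel.size(); ++j)
        {
            const Itemset &a = previousLevel[i];
            const Itemset &b = previousLevel[j];
            // Join only if the two itemsets agree on all but their last item.
            if (std::equal(a.begin(), a.end() - 1, b.begin()))
            {
                Itemset candidate(a);
                candidate.push_back(b.back());
                std::sort(candidate.begin(), candidate.end());
                candidates[candidate] = 0;
            }
        }
    }
    return candidates;
}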
Continued from Implementation

generateFrequentItemsets(int sequence)
    for i = 0 to numberOfTransactions (N) in the transaction matrix
        for j = 0 to numberOfItems (W)
            iterate through the HashMap of candidates:
                get the key (candidate itemset) and compare it with the actual
                transaction items in order to count the number of supporting transactions
                update the value mapped to the key in the HashMap
    if the support of an itemset >= MIN_SUPPORT
        place the selected itemsets on the CandidateItemsetTable
Best case: O(N * W * C) when sequence == 1, where C = # of candidates
Worst case: O(N * W * C' * C'') when sequence >= 2, where (for i = itemsetIndex)
    C'  = # of candidates at level i
    C'' = # of items in each candidate itemset
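A matching sketch of the counting pass (same assumed types as above, again an illustration rather than the project code): every transaction is compared against every candidate in the map, the counts are updated, and the candidates that reach MIN_SUPPORT are kept:

#include <algorithm>
#include <cstddef>
#include <map>
#include <vector>

typedef std::vector<int> Itemset;

// Count the transactions containing each candidate and keep the candidates
// whose relative support reaches minSupport (e.g. 0.5 for 50%).
std::vector<Itemset> generateFrequentItemsets(std::map<Itemset, int> &candidates,
                                              const std::vector<Itemset> &transactions,
                                              double minSupport)
{
    for (std::size_t t = 0; t < transactions.size(); ++t)
        for (std::map<Itemset, int>::iterator it = candidates.begin();
             it != candidates.end(); ++it)
            // Both the transaction and the candidate are kept sorted, so
            // containment is a sorted-subset test.
            if (std::includes(transactions[t].begin(), transactions[t].end(),
                              it->first.begin(), it->first.end()))
                ++it->second;

    std::vector<Itemset> frequent;
    for (std::map<Itemset, int>::const_iterator it = candidates.begin();
         it != candidates.end(); ++it)
        if (it->second >= minSupport * transactions.size())
            frequent.push_back(it->first);
    return frequent;
}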
A Parallel Implementation
// RANK, NUM_PROCESSORS, globalInput, input, output, and instance are set up
// before this fragment (not shown on the slide). Partial results are merged
// pairwise in a binary tree over log2(NUM_PROCESSORS) rounds.
bool *alive = new bool[NUM_PROCESSORS];
for( int i = 0; i < NUM_PROCESSORS; ++i )
    alive[i] = true;

get_initial_data( RANK, globalInput, input );

for( int i = 0; i < log2(NUM_PROCESSORS); ++i )
{
    if( alive[RANK] )
    {
        instance.process( input, output );          // run Apriori on the local input
        if( should_request( alive, i ) )            // this rank receives in round i
        {
            int neighbor = RANK + (int) pow( 2.0, i );
            request_data( neighbor, neighbor_output );
            merge_results( output, neighbor_output, input );
            alive[neighbor] = false;                // the sender terminates after sending
        }
Implementation (Continued)

        else                                        // this rank sends in round i
        {
            alive[RANK] = false;
            int neighbor = RANK - (int) pow( 2.0, i );
            send_data( neighbor, output );          // hand the partial result to the
            delete[] alive;                         // surviving neighbor, then stop
            MPI::Finalize();
            exit( 0 );
        }
    }
}
delete[] alive;
MPI::Finalize();
return 0;
}
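The helpers request_data and send_data are only named on the slides; the fragment below is one plausible way to implement them with the same MPI C++ bindings the slide already uses (MPI::Finalize), assuming each partial result has been flattened into a vector of ints. The message tag and the flattening scheme are assumptions:

#include <mpi.h>
#include <vector>

static const int RESULT_TAG = 0;   // assumed tag for partial-result messages

// Send this processor's flattened partial result to the surviving neighbor.
void send_data(int neighbor, const std::vector<int> &output)
{
    int size = static_cast<int>(output.size());
    MPI::COMM_WORLD.Send(&size, 1, MPI::INT, neighbor, RESULT_TAG);
    if (size > 0)
        MPI::COMM_WORLD.Send(&output[0], size, MPI::INT, neighbor, RESULT_TAG);
}

// Receive the neighbor's flattened partial result.
void request_data(int neighbor, std::vector<int> &neighbor_output)
{
    int size = 0;
    MPI::COMM_WORLD.Recv(&size, 1, MPI::INT, neighbor, RESULT_TAG);
    neighbor_output.resize(size);
    if (size > 0)
        MPI::COMM_WORLD.Recv(&neighbor_output[0], size, MPI::INT, neighbor, RESULT_TAG);
}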
A Parallel Solution Illustration

ID    Transaction
100   1,3,4
101   2,3,5
102   1,2,3
103   2,4,5
Minimum_Support = 50%

One processor holds transactions 1,3,4 and 2,3,5 and counts the 1-itemsets locally:
Item:   1  2  3  4  5
Count:  1  1  2  1  1

The other processor holds transactions 1,2,3 and 2,4,5 and counts the 1-itemsets locally:
Item:   1  2  3  4  5
Count:  1  2  1  1  1
A Parallel Solution (continued)

ID    Transaction
100   1,3,4
101   2,3,5
102   1,2,3
103   2,4,5
Minimum_Support = 50%

Local 1-itemset counts carried over from the previous step:
Item:                          1  2  3  4  5
Transactions 1,3,4 and 2,3,5:  1  1  2  1  1
Transactions 1,2,3 and 2,4,5:  1  2  1  1  1

Candidate 2-itemsets and their counts over the four transactions:
Itemset:  12  13  23  24  25  45
Count:     1   2   2   1   2   1
A Parallel Solution (continued)

ID    Transaction
100   1,3,4
101   2,3,5
102   1,2,3
103   2,4,5
Minimum_Support = 50%

Merged 2-itemset counts over all four transactions:
Itemset:  12  13  23  24  25  45
Count:     1   2   2   1   2   1

Candidate 3-itemsets generated from the 2-itemsets by self-joining: 123, 245, 235, 234
A Parallel Solution (continued)

ID    Transaction
100   1,3,4
101   2,3,5
102   1,2,3
103   2,4,5
Minimum_Support = 50%

The surviving processor now holds all four transactions (1,3,4; 2,3,5; 1,2,3; 2,4,5)
and counts the candidate 3-itemsets over them:
Itemset:  123  245  235  234
Count:      1    1    1    1
The number of input rows on each surviving processor doubles at every level as the reduction tree is trimmed (half of the processors stop after sending their results). In this example each processor starts with 1 row, the survivors hold 2 rows after the first merge, and the final processor ends with the full 4 x 3 matrix.
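The slides do not show merge_results; one reading that matches the illustration above (the surviving processor ends up holding both processors' transaction rows and the summed candidate counts) is sketched below with assumed types, as an illustration only:

#include <map>
#include <vector>

typedef std::vector<int> Itemset;

// Assumed shape of a partial result: the transactions a processor holds
// plus its local support counts for the current candidates.
struct PartialResult
{
    std::vector<Itemset> transactions;
    std::map<Itemset, int> counts;
};

// The surviving processor takes over the neighbor's transaction rows (so its
// input doubles, as in the illustration) and adds in the neighbor's counts.
void merge_results(const PartialResult &mine, const PartialResult &neighbor,
                   PartialResult &merged)
{
    merged = mine;
    merged.transactions.insert(merged.transactions.end(),
                               neighbor.transactions.begin(),
                               neighbor.transactions.end());
    for (std::map<Itemset, int>::const_iterator it = neighbor.counts.begin();
         it != neighbor.counts.end(); ++it)
        merged.counts[it->first] += it->second;
}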
Result
• The gene expression data consists of cancer-causing gene information and the names of the cancers.
• There are two inputs: the numbers stored in the two-dimensional matrix and the minimum support rate.
• The input sizes for the runs are 100 x 100, 150 x 150, 200 x 200, 250 x 250, 300 x 300, 350 x 350, 500 x 500, and 1000 x 1000.
• The min_support is set to 50% for all of the runs.
  - In the future, I will try the benchmarks with a lower min_support, since that is more likely to reveal the true speedup/efficiency.
• The number of processors (cores) used is 1, 2, 4, 8, 16, and 32.
• Each test was run three times and the average runtime was recorded.
Average Runtime vs. Input Size
Average runtime (seconds) by input size and number of processors:

Input size        1        2        4        8       16       32
10000         0.126    0.071    0.045    0.032    0.029    0.024
22500         0.287    0.166    0.109    0.076    0.068    0.056
40000         0.606    0.370    0.245    0.168    0.151    0.118
62500         1.062    0.582    0.386    0.268    0.240    0.207
90000         1.713    0.968    0.640    0.451    0.405    0.337
122500        2.458    1.415    0.936    0.637    0.554    0.489
250000        9.531    5.506    3.646    2.494    2.161    1.980
1000000      66.223   37.225   25.383   17.421   15.686   15.247
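For reference, speedup here is T(1 processor) / T(p processors) and efficiency is speedup / p; these are the quantities plotted on the following Speedup and Efficiency slides. For example, for the 500 x 500 input (250,000 cells) on 32 processors the table gives speedup = 9.531 / 1.980, which is about 4.81, and efficiency of about 4.81 / 32, which is roughly 0.15.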
Average Runtime vs. Input Size

[Chart: average running time in seconds plotted against input size (10,000 to 1,000,000) for 1, 2, 4, 16, 8, and 32 processors.]
Speedup vs. Number of Processors

[Chart: speedup plotted against the number of processors (1 to 32) for input sizes 100 x 100 through 1000 x 1000.]
Efficiency vs. Number of Processors

[Chart: efficiency plotted against the number of processors (1 to 32) for input sizes 100 x 100 through 1000 x 1000.]
Observation
• Average runtime grows faster than linearly as the input size grows.
• Speedup grows nearly linearly for small processor counts, then the gains flatten out as more processors are added.
• Efficiency drops as more processors are added; splitting the input across additional processors does not reduce the per-processor work enough to offset the merging and communication overhead.
Future work
• Test with various minimum support rates other than the fixed 50%.
• Use other kinds of data, such as Walmart or Tops transaction records.
• Modify the current implementation to achieve better efficiency.
• Add an OpenMP implementation written in C.
Reference
http://www.cse.buffalo.edu/faculty/azhang/cse601/cse601associationrule.ppt