I/O-Algorithms
Lars Arge
Spring 2012
April 17, 2012
I/O-Model
[Figure: disk D, main memory M, processor P; data moves between D and M by block I/O]
• Parameters
N = # elements in problem instance
B = # elements that fit in disk block
M = # elements that fit in main memory
T = output size in searching problem
• We often assume that $M > B^2$
• I/O: Movement of block between memory
and disk
Fundamental Bounds
             Internal      External
• Scanning:  $N$           $\frac{N}{B}$
• Sorting:   $N \log N$    $\frac{N}{B} \log_{M/B} \frac{N}{B}$
• Permuting: $N$           $\min\{N, \frac{N}{B} \log_{M/B} \frac{N}{B}\}$
• Searching: $\log_2 N$    $\log_B N$
• Note:
– Linear I/O: O(N/B)
– Permuting not linear
– Permuting and sorting bounds are equal in all practical cases
– B factor VERY important: $\frac{N}{B} < \frac{N}{B} \log_{M/B} \frac{N}{B} \ll N$
– Cannot sort optimally with search tree
Scalability Problems: Block Access Matters
• Example: Traversing linked list (List ranking)
– Array size N = 10 elements
– Disk block size B = 2 elements
– Main memory size M = 4 elements (2 blocks)
1 5 2 6 3 8 9 4 7 10
Algorithm 1: N=10 I/Os
1 2 10 9 5 6 3 4 8 7
Algorithm 2: N/B=5 I/Os
• Difference between N and N/B is large since block size is large
– Example: $N = 256 \times 10^6$, $B = 8000$, 1 ms disk access time
⇒ N I/Os take $256 \times 10^3$ sec ≈ 4266 min ≈ 71 hr
⇒ N/B I/Os take 256/8 sec = 32 sec
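The two I/O counts and the two running times on this slide can be reproduced with a small simulation. This is a sketch under stated assumptions: each number in a layout is read as the rank of the element stored at that array position, memory holds M/B blocks and evicts the least recently used one, and the names `traversal_ios` and `seconds` are made up for the example.

```python
from collections import OrderedDict

def traversal_ios(layout, B, M):
    """Count block reads when the list is followed in rank order.

    layout[i] is the rank of the element stored at array position i
    (an assumed reading of the slide's two example arrays).
    """
    position = {rank: i for i, rank in enumerate(layout)}
    capacity = M // B              # number of blocks that fit in memory
    cache = OrderedDict()          # block id -> None, least recently used first
    ios = 0
    for rank in range(1, len(layout) + 1):
        block = position[rank] // B
        if block in cache:
            cache.move_to_end(block)       # hit: mark as most recently used
        else:
            ios += 1                       # miss: one I/O to load the block
            cache[block] = None
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used block
    return ios

def seconds(num_ios, access_ms=1.0):
    """Total time for num_ios block accesses at access_ms milliseconds each."""
    return num_ios * access_ms / 1000.0

print(traversal_ios([1, 5, 2, 6, 3, 8, 9, 4, 7, 10], B=2, M=4))  # 10
print(traversal_ios([1, 2, 10, 9, 5, 6, 3, 4, 8, 7], B=2, M=4))  # 5
print(seconds(256 * 10**6), seconds(256 * 10**6 // 8000))        # 256000.0 32.0
```

In the first layout consecutive list elements always sit in different blocks, so every step is a miss; in the second, each loaded block serves two consecutive elements.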
List Ranking
• Problem:
– Given N-vertex linked list stored in array
– Compute rank (number in list) of each vertex
[Figure: linked list stored in an array, each vertex labeled with its rank]
• One of the simplest graph problems one can think of
• Straightforward O(N) internal algorithm
– Also uses O(N) I/Os in external memory
• Much harder to get $O(\frac{N}{B} \log_{M/B} \frac{N}{B})$ external algorithm
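The straightforward internal algorithm simply follows the successor pointers from the head. A minimal sketch, assuming the list is given as an array `succ` of successor positions with `None` at the tail (the function name and this representation are illustrative, not taken from the slides):

```python
def list_rank(succ, head):
    """Return rank[i] = 1-based position in the list of the vertex stored at i."""
    rank = [0] * len(succ)
    v, r = head, 1
    while v is not None:      # one pointer dereference per vertex: O(N) time
        rank[v] = r
        r += 1
        v = succ[v]           # in external memory this may cost one I/O per step
    return rank

# Successor pointers for the example array 1 5 2 6 3 8 9 4 7 10
succ = [2, 3, 4, 8, 7, 6, 9, 1, 5, None]
print(list_rank(succ, head=0))   # [1, 5, 2, 6, 3, 8, 9, 4, 7, 10]
```

Each step may jump to an arbitrary block, which is why the same loop can cost O(N) I/Os when the array lives on disk.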
List Ranking
• We will solve more general problem:
– Given N-vertex linked list with edge-weights stored in array
– Compute sum of weights (rank) from start for each vertex
• List ranking: All edge weights one
[Figure: example list 1 5 2 6 3 8 9 4 7 10 with weight 1 on every edge]
• Note: Weight stored in array entry together with edge (next vertex)
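One way to picture the array entries is as (next vertex, edge weight) pairs. The sketch below assumes that layout and assumes the head has rank 0, so with unit weights the result is the "number in list" minus one; `Entry` and `weighted_rank` are hypothetical names.

```python
from typing import NamedTuple, Optional

class Entry(NamedTuple):
    next: Optional[int]   # array position of the successor, None at the tail
    weight: int           # weight of the edge to the successor (unused at the tail)

def weighted_rank(entries, head):
    """Return rank[i] = sum of edge weights on the path from head to vertex i."""
    rank = [0] * len(entries)
    v, total = head, 0
    while v is not None:
        rank[v] = total
        total += entries[v].weight
        v = entries[v].next
    return rank

succ = [2, 3, 4, 8, 7, 6, 9, 1, 5, None]
entries = [Entry(s, 1) for s in succ]        # list ranking: all edge weights one
print(weighted_rank(entries, head=0))        # [0, 4, 1, 5, 2, 7, 8, 3, 6, 9]
```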
List Ranking
[Figure: example list with ranks 1 to 10; edges created by bridging out carry weight 2]
• Algorithm:
1. Find and mark independent set of vertices
2. “Bridge-out” independent set: Add new edges
3. Recursively rank resulting list
4. “Bridge-in” independent set: Compute rank of independent set
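The four steps can be sketched in memory as follows. This is an illustration of the recursion only, not the I/O-efficient version: dictionaries stand in for the sort-and-scan passes, the independent set is chosen by coin flips, the head is never put in the set, and the head is assumed to have rank 0. The names `rank_list`, `succ`, and `weight` are assumptions for the example.

```python
import random

def rank_list(succ, weight, head, rng=None):
    """Return {v: sum of edge weights from head to v}.

    succ[v] is the successor of v (None at the tail);
    weight[v] is the weight of edge v -> succ[v] (ignored at the tail).
    """
    rng = rng or random.Random(0)
    if len(succ) <= 2:                       # small list: rank it directly
        ranks, v, r = {}, head, 0
        while v is not None:
            ranks[v] = r
            r += weight[v]
            v = succ[v]
        return ranks
    pred = {s: v for v, s in succ.items() if s is not None}

    # Step 1: independent set = vertices with heads whose successor has tails
    chosen = set()
    while not chosen:                        # retry in the unlucky empty case
        coin = {v: rng.random() < 0.5 for v in succ}
        chosen = {v for v in succ if v != head and coin[v]
                  and (succ[v] is None or not coin[succ[v]])}

    # Step 2: bridge out, each removed vertex is replaced by one summed edge
    new_succ, new_weight = {}, {}
    for v in succ:
        if v in chosen:
            continue
        s, w = succ[v], weight[v]
        if s in chosen:
            s, w = succ[s], w + weight[s]
        new_succ[v], new_weight[v] = s, w

    # Step 3: recursively rank the shorter list
    ranks = rank_list(new_succ, new_weight, head, rng)

    # Step 4: bridge in, a removed vertex gets its predecessor's rank plus the edge
    for v in chosen:
        ranks[v] = ranks[pred[v]] + weight[pred[v]]
    return ranks

order = [1, 5, 2, 6, 3, 8, 9, 4, 7, 10]      # vertex ids in list order
succ = dict(zip(order, order[1:] + [None]))
weight = {v: 1 for v in order}
print(rank_list(succ, weight, head=1))
```

Because the chosen set is independent, the predecessor of every removed vertex survives into the recursive call, so its rank is available in step 4.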
• Steps 1, 2 and 4 in $O(\frac{N}{B} \log_{M/B} \frac{N}{B})$ I/Os
• Independent set of size $\alpha N$ for $0 < \alpha \le 1$
⇒ $T(N) = T((1-\alpha)N) + O(\frac{N}{B} \log_{M/B} \frac{N}{B}) = O(\frac{N}{B} \log_{M/B} \frac{N}{B})$ I/Os
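The recurrence solves to the sorting bound because the subproblem sizes shrink geometrically. A short derivation, assuming $\alpha$ is a fixed constant and writing $c$ for the constant hidden in the per-level sorting cost (both symbols' roles are assumptions of this sketch):

```latex
\begin{align*}
T(N) &= T((1-\alpha)N) + O\!\left(\tfrac{N}{B}\log_{M/B}\tfrac{N}{B}\right) \\
     &\le \sum_{i \ge 0} c\,(1-\alpha)^{i}\,\tfrac{N}{B}\log_{M/B}\tfrac{N}{B} \\
     &= \tfrac{c}{\alpha}\,\tfrac{N}{B}\log_{M/B}\tfrac{N}{B}
      \;=\; O\!\left(\tfrac{N}{B}\log_{M/B}\tfrac{N}{B}\right)
\end{align*}
```

The second line uses that level $i$ of the recursion works on at most $(1-\alpha)^i N$ vertices and that the logarithmic factor only decreases with the list size.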
List Ranking: Bridge-out/in
[Figure: example list before and after bridging out the independent set]
• Obtain information (edge or rank) of successor
– Make copy of original list
– Sort original list by successor id
– Scan original and copy together to obtain successor information
– Sort modified original list by id
⇒ $O(\frac{N}{B} \log_{M/B} \frac{N}{B})$ I/Os
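The sort-and-scan pattern above can be sketched with in-memory lists standing in for the on-disk arrays. The record layout `(id, successor id, info)` and the function name are assumptions for the example; the copy is taken to be sorted by id so that both lists can be scanned in step.

```python
import math

def attach_successor_info(records):
    """records: list of (vid, succ_id, info) with succ_id None at the tail.

    Returns (vid, succ_id, info, succ_info) tuples sorted by vid.
    """
    copy = sorted(records, key=lambda r: r[0])          # copy, ordered by id
    by_succ = sorted(records,                           # original, by successor id
                     key=lambda r: math.inf if r[1] is None else r[1])
    out, j = [], 0
    for vid, s, info in by_succ:                        # simultaneous scan
        if s is None:
            out.append((vid, s, info, None))            # tail has no successor
            continue
        while copy[j][0] < s:                           # advance copy to id s
            j += 1
        out.append((vid, s, info, copy[j][2]))
    out.sort(key=lambda r: r[0])                        # back to id order
    return out

# List 3 -> 1 -> 2, each vertex carrying a label as its "info"
print(attach_successor_info([(1, 2, "b"), (2, None, "c"), (3, 1, "a")]))
# [(1, 2, 'b', 'c'), (2, None, 'c', None), (3, 1, 'a', 'b')]
```

Every vertex has at most one predecessor, so the successor ids met during the scan are distinct and increasing and the pointer into the copy never moves backwards.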
List Ranking: Independent Set
• Easy to design $O(\frac{N}{B} \log_{M/B} \frac{N}{B})$ randomized algorithm:
– Scan list and flip a coin for each vertex
– Independent set is vertices with heads whose successor has tails
⇒ Independent set of expected size N/4
• Deterministic algorithm:
– 3-color vertices (no vertex same color as predecessor/successor)
– Independent set is vertices with most popular color
⇒ Independent set of size at least N/3
• $O(\frac{N}{B} \log_{M/B} \frac{N}{B})$ 3-coloring ⇒ $O(\frac{N}{B} \log_{M/B} \frac{N}{B})$ I/O algorithm
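Both selection rules are short enough to sketch directly. The code assumes the list is a dictionary `succ` mapping each vertex to its successor (`None` at the tail) and that a valid coloring is already available as a dictionary; the function names are illustrative.

```python
import random
from collections import Counter

def random_independent_set(succ, rng):
    """Vertices whose coin shows heads and whose successor's coin shows tails."""
    coin = {v: rng.random() < 0.5 for v in succ}        # True means heads
    return {v for v, s in succ.items()
            if coin[v] and s is not None and not coin[s]}

def independent_set_from_coloring(color):
    """Vertices of the most popular color in a valid coloring."""
    popular, _count = Counter(color.values()).most_common(1)[0]
    return {v for v, c in color.items() if c == popular}

succ = {i: i + 1 for i in range(9)}
succ[9] = None
chosen = random_independent_set(succ, random.Random(1))
print(all(succ[v] not in chosen for v in chosen))       # True: no two adjacent

color = {i: "RGB"[i % 3] for i in range(10)}            # a valid 3-coloring of the chain
print(sorted(independent_set_from_coloring(color)))     # [0, 3, 6, 9]
```

Two adjacent vertices cannot both be picked by the coin rule, since the second would need heads and tails at once; with three colors the most popular one covers at least a third of the vertices.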
List Ranking: 3-coloring
• Algorithm:
– Consider forward and backward lists (each head/tail vertex is in two lists)
– Color forward lists (except tail) alternately red and blue
– Color backward lists (except tail) alternately green and blue
⇒ 3-coloring
List Ranking: Forward List Coloring
• Identify heads and tails
• For each head, insert red element in priority-queue (priority=position)
• Repeatedly:
– Extract minimal element from queue
– Access and color corresponding element in list
– Insert opposite color element corresponding to successor in queue
• Scan of list
• O(N) priority-queue operations
⇒ $O(\frac{N}{B} \log_{M/B} \frac{N}{B})$ I/Os
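The priority-queue procedure can be sketched with Python's `heapq` standing in for an external priority queue. Several details are assumptions of this sketch rather than statements from the slides: a forward list is taken to be a maximal run of edges pointing to higher array positions, backward lists are handled by the mirrored pass and start with green, and the final tail of the whole list gets any color that differs from its predecessor's.

```python
import heapq

RED, BLUE, GREEN = "red", "blue", "green"

def _color_lists(succ, pred, color, sign, first, second):
    """Color all lists of one direction (sign=+1 forward, -1 backward)."""
    def in_list(v):                    # edge leaving v points in direction `sign`
        return succ[v] is not None and sign * (succ[v] - v) > 0
    # One queue element per list head, priority = position (negated if backward)
    queue = [(sign * v, first) for v in range(len(succ))
             if in_list(v) and (pred[v] is None or not in_list(pred[v]))]
    heapq.heapify(queue)
    while queue:
        key, c = heapq.heappop(queue)  # positions come out in scan order
        v = sign * key
        if not in_list(v):
            continue                   # tail of this list: left to the other pass
        color[v] = c
        heapq.heappush(queue, (sign * succ[v], second if c == first else first))

def three_color(succ):
    """succ[i] = array position of the successor of vertex i, None at the tail."""
    pred = [None] * len(succ)
    for v, s in enumerate(succ):
        if s is not None:
            pred[s] = v
    color = [None] * len(succ)
    _color_lists(succ, pred, color, +1, RED, BLUE)     # forward lists
    _color_lists(succ, pred, color, -1, GREEN, BLUE)   # backward lists
    for v, s in enumerate(succ):
        if s is None:                                  # final tail of the whole list
            used = color[pred[v]] if pred[v] is not None else None
            color[v] = next(c for c in (RED, GREEN) if c != used)
    return color

succ = [2, 3, 4, 8, 7, 6, 9, 1, 5, None]
color = three_color(succ)
print(all(color[v] != color[s] for v, s in enumerate(succ) if s is not None))  # True
```

Since a successor in a forward list always lies at a higher position than the vertex being processed, the queue only ever sends colors to vertices that the scan has not reached yet.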
Summary: List Ranking
• Simplest graph problem: Traverse linked list
[Figure: example linked list stored in an array]
• Very easy O(N) algorithm in internal memory
• Much more difficult $O(\frac{N}{B} \log_{M/B} \frac{N}{B})$ external-memory algorithm
– Finding independent set via 3-coloring
– Bridging vertices in/out
• Permuting bound $O(\min\{N, \frac{N}{B} \log_{M/B} \frac{N}{B}\})$ best possible
– Also true for other graph problems
Summary: List Ranking
• External list ranking algorithm similar to PRAM algorithm
– Sometimes external algorithms by “PRAM algorithm simulation”
• Forward list coloring algorithm example of “time forward processing”
– Use external priority-queue to send information “forward in time”
to vertices to be processed later
Algorithms on Trees
TBD
References
• External-Memory Graph Algorithms
Y-J. Chiang, M. T. Goodrich, E. F. Grove, R. Tamassia, D. E. Vengroff, and J. S. Vitter. Proc. SODA'95
– Sections 3-6
• I/O-Efficient Graph Algorithms
Norbert Zeh. Lecture notes
– Sections 2-4
• Cache-Oblivious Priority Queue and Graph Algorithm Applications
L. Arge, M. Bender, E. Demaine, B. Holland-Minkley and I. Munro. SICOMP, 36(6), 2007
– Sections 3.1-3.2