Brandon Grant
Tomas Mann
Florida State University
Spring 2010 Multicore Programming
CIS4930
• Introduction
• Implementation
• Performance
• Conclusions
Sunday, April 12, 2020
File Compression Using the CUDA Framework
• Why is compression important?
– Limited disk space
– Shrink bloated and redundant files
– Expedite file transfer
• What benefits can CUDA bring to the table?
– NVIDIA Compute Unified Device Architecture
– Massively parallel GPU
– Can run thousands of threads
• We want to implement a parallel lossless compression algorithm that can compress larger files faster by taking advantage of parallelism.
• Modified Burrows-Wheeler Compression
– Implement a sequential version to identify individual tasks.
– Identify potentially parallelizable tasks.
– Implement parallel versions of these tasks and use them to replace their sequential counterparts.
• Parallel: Burrows-Wheeler Transformation
– Computes the Burrows-Wheeler code.
– Computes the index of the original string in the sorted string rotations table.
Algorithm 1: Parallel Burrows-Wheeler (executed by each thread)
    s := string (rotation of the input) for which the thread is responsible
    rank := 0
    for each string x in the list of rotations
        if x < s
            rank := rank + 1
        end if
    end for
    output[rank] := last character of s
    if s == original input sequence
        BW_index := rank
    end if
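Algorithm 1 can be simulated sequentially to check its logic. The following is a minimal Python sketch, not the authors' CUDA kernel: the function names (`rotation`, `thread_rank`, `bwt_by_rank`) are hypothetical, the loop over `i` stands in for one GPU thread per rotation, and the tie-break on the start index is an added assumption so that identical rotations (as in "abab") do not receive the same rank.

```python
def rotation(text, i):
    """Return the rotation of text that starts at index i."""
    return text[i:] + text[:i]


def thread_rank(text, i):
    """Work of one hypothetical thread: count the rotations that sort before rotation i."""
    s = rotation(text, i)
    rank = 0
    for j in range(len(text)):
        x = rotation(text, j)
        # Tie-break on the start index (an assumption, not part of Algorithm 1)
        # so that identical rotations still get distinct ranks.
        if x < s or (x == s and j < i):
            rank += 1
    return rank


def bwt_by_rank(text):
    """Simulate one thread per rotation; return (Burrows-Wheeler code, index)."""
    n = len(text)
    output = [""] * n
    bw_index = -1
    for i in range(n):  # on a GPU, each value of i would be handled by its own thread
        rank = thread_rank(text, i)
        output[rank] = rotation(text, i)[-1]  # last character of this rotation
        if i == 0:  # rotation 0 is the original input sequence
            bw_index = rank
    return "".join(output), bw_index


code, index = bwt_by_rank("banana")
print(code, index)  # prints: nnbaaa 3
```

Each simulated thread compares its rotation against all n rotations, so the total work is quadratic in the number of rotations (before counting the cost of each string comparison); the approach trades extra work for the absence of any dependency between threads.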
• Parallel: Huffman Coding
– ASCII table initialization
– Character occurrence counter
– Node sorter
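The three tasks listed above can be illustrated with a short sequential sketch in Python. The slides name only the tasks, so everything else here is an assumption: the 256-entry table size, the function names, and the choice to sort nodes by (count, symbol). Each loop is the kind of per-element work that could in principle be distributed across threads.

```python
def init_table(size=256):
    """ASCII table initialization: one zeroed counter per byte value (size is assumed)."""
    return [0] * size


def count_occurrences(data, table):
    """Character occurrence counter: increment the counter for every byte of input."""
    for byte in data:
        table[byte] += 1
    return table


def sorted_nodes(table):
    """Node sorter: build (count, symbol) leaf nodes and order them by ascending count."""
    nodes = [(count, symbol) for symbol, count in enumerate(table) if count > 0]
    nodes.sort()  # ties on count are broken by symbol value
    return nodes


data = b"abracadabra"
nodes = sorted_nodes(count_occurrences(data, init_table()))
print(nodes)  # prints: [(1, 99), (1, 100), (2, 98), (2, 114), (5, 97)]
```

The sorted list is the usual starting point for building a Huffman tree, where the two lowest-count nodes are merged repeatedly.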
• Our parallel algorithm showed no significant performance advantage over the sequential version
– The Burrows-Wheeler algorithm is optimized for a single-core implementation, so significant performance gains are difficult to realize
– Memory hierarchy restrictions in CUDA hamper performance
Compression Times (Seconds)
File            Sequential    Parallel    Bzip2
English.dic          3.456       4.146    0.713
World95.txt          6.114       6.752    0.806
Delaware.osm       294.817     270.002   63.097
Decompression Times (Seconds)
File            Sequential    Parallel    Bzip2
English.dic          2.002       2.747    0.241
World95.txt          2.208       3.005    0.360
Delaware.osm        40.308      38.658    3.460
Original File Sizes (Bytes)
File            Size
English.dic      4,067,439
World95.txt      2,988,578
Delaware.osm    79,648,840
Compression Ratios
File            Sequential    Parallel    Bzip2
English.dic          0.336       0.336    0.300
World95.txt          0.242       0.242    0.193
Delaware.osm         0.161       0.161    0.045
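To make the comparison between the sequential and parallel versions concrete, the reported compression times can be turned into speedup factors (sequential time divided by parallel time). The sketch below is illustrative only: the numbers are copied from the compression-time figures above, the pairing of values with files assumes the row and column reading of that table, and the `speedup` helper is a hypothetical name.

```python
# Compression times in seconds, copied from the compression-time figures above.
times = {
    "English.dic": {"sequential": 3.456, "parallel": 4.146},
    "World95.txt": {"sequential": 6.114, "parallel": 6.752},
    "Delaware.osm": {"sequential": 294.817, "parallel": 270.002},
}


def speedup(sequential, parallel):
    """Speedup of the parallel version; values below 1 mean it was slower."""
    return sequential / parallel


speedups = {
    name: round(speedup(t["sequential"], t["parallel"]), 2)
    for name, t in times.items()
}
for name, value in speedups.items():
    print(f"{name}: {value:.2f}x")
# prints:
# English.dic: 0.83x
# World95.txt: 0.91x
# Delaware.osm: 1.09x
```

Under this reading, the parallel version is slower on the two smaller files and about 9% faster only on the largest one.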
Burrows-Wheeler Performance (charts for english.dic and world95.txt)
Thank you for viewing our presentation.