File Compression Using the CUDA Framework


Brandon Grant

Tomas Mann

Florida State University

Spring 2010 Multicore Programming

CIS4930

Contents

• Introduction

• Implementation

• Performance

• Conclusions

Sunday, April 12, 2020

File Compression Using the CUDA

Framework


Introduction

• Why is compression important?

– Limited disk space

– Shrink bloated and redundant files

– Expedite file transfer


Introduction

• What benefits can CUDA bring to the table?

– NVIDIA Compute Unified Device Architecture (CUDA)

– Massively parallel GPU

– Can run thousands of threads

• We want to implement a parallel lossless compression algorithm that can compress larger files faster by taking advantage of parallelism.


Implementation

• Modified Burrows-Wheeler Compression

– Implement a sequential version to identify individual tasks.

– Identify potentially parallelizable tasks.

– Implement parallel versions of these tasks and use them to replace their sequential counterparts.


Implementation

• Parallel: Burrows-Wheeler Transformation

– Computes the Burrows-Wheeler code.

– Computes the index of the original string in the sorted string rotations table.

Algorithm 1: Parallel Burrows-Wheeler

s := string for which the thread is responsible; rank := 0
for each string x in the list
    if x < s
        rank := rank + 1
    end if
end for
output[rank] := last character of s
if s == original input sequence
    BW_index := rank
end if
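The rank-counting idea in Algorithm 1 can be sketched sequentially in Python, with each iteration of the outer loop standing in for one GPU thread. This is only an illustrative sketch, not the presenters' CUDA kernel: the function name `bwt_by_rank` is hypothetical, and two details are assumptions added here. Ties between identical rotations are broken by rotation index so that no two threads write to the same output slot, and the original input is identified as rotation 0 instead of by string comparison.

```python
def bwt_by_rank(text):
    """Rank-based Burrows-Wheeler transform; one loop iteration per 'thread'."""
    n = len(text)
    # Rotation i is the input rotated left by i characters.
    rotations = [text[i:] + text[:i] for i in range(n)]
    output = [""] * n
    bw_index = -1

    for tid in range(n):  # on a GPU, each tid would be a separate thread
        s = rotations[tid]
        rank = 0
        for j, x in enumerate(rotations):
            # Count rotations that sort before s.
            # Assumption: equal rotations are ordered by their index.
            if x < s or (x == s and j < tid):
                rank += 1
        output[rank] = s[-1]  # last character of s goes to its sorted slot
        if tid == 0:          # rotation 0 is the original input sequence
            bw_index = rank

    return "".join(output), bw_index


if __name__ == "__main__":
    print(bwt_by_rank("banana"))
```

For the input "banana" this returns the code "nnbaaa" and index 3. Every thread compares its string against all n rotations, so the total work is n^2 string comparisons, which is one plausible reason the parallel version struggles to beat a sort-based sequential transform.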


Implementation

• Parallel: Huffman Coding

– ASCII table initialization

– Character occurrence counter

– Node sorter
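The three Huffman tasks listed above can be illustrated with a short Python sketch. The slides do not say how the work was divided among threads, so the chunking scheme here is an assumption: each hypothetical worker initializes its own 256-entry table, counts one slice of the input, and the partial tables are then summed. The names `count_occurrences` and `sorted_nodes` are made up for this example.

```python
def count_occurrences(data, num_workers=4):
    """Count byte occurrences using per-worker partial tables (simulated)."""
    chunk = max(1, -(-len(data) // num_workers))  # ceiling division
    partials = []
    for start in range(0, len(data), chunk):  # one slice per simulated worker
        table = [0] * 256                     # table initialization
        for byte in data[start:start + chunk]:
            table[byte] += 1                  # occurrence counter
        partials.append(table)

    # Merge the partial tables into one histogram.
    totals = [0] * 256
    for table in partials:
        for value, count in enumerate(table):
            totals[value] += count
    return totals


def sorted_nodes(counts):
    """Return (count, byte) leaf nodes in ascending order of frequency."""
    return sorted((c, b) for b, c in enumerate(counts) if c > 0)


if __name__ == "__main__":
    counts = count_occurrences(b"abracadabra")
    print(sorted_nodes(counts))
```

Giving each worker a private table avoids contention on shared counters, at the cost of a final merge step. The sorted leaf list is the starting point for building the Huffman tree.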


Conclusions

• Our parallel algorithm did not show any significant performance advantages over the sequential version

– The Burrows-Wheeler algorithm is optimized for a single-core implementation, so significant performance boosts would be difficult to realize

– Memory hierarchy restrictions in CUDA hamper performance


Conclusions

Compression Times (Seconds)

File            Sequential    Parallel     Bzip2
English.dic          3.456       4.146     0.713
World95.txt          6.114       6.752     0.806
Delaware.osm       294.817     270.002    63.097
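The compression times reported above can be turned into a speedup figure (sequential time divided by parallel time) for each file. The short sketch below does that arithmetic. The dictionary layout and the name `speedup` are illustrative choices, and the values assume each file's first two timings are the sequential and parallel runs, in that order.

```python
# (sequential seconds, parallel seconds) per file, from the compression table
compression_seconds = {
    "English.dic": (3.456, 4.146),
    "World95.txt": (6.114, 6.752),
    "Delaware.osm": (294.817, 270.002),
}


def speedup(sequential, parallel):
    """Speedup > 1 means the parallel version was faster."""
    return sequential / parallel


speedups = {name: speedup(*times) for name, times in compression_seconds.items()}

if __name__ == "__main__":
    for name, value in speedups.items():
        print(f"{name}: {value:.2f}x")
```

Only Delaware.osm, the largest file, comes out above 1 (about 1.09x). The two smaller files are slower in parallel, at about 0.83x and 0.91x.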

Decompression Times (Seconds)

File            Sequential    Parallel     Bzip2
English.dic          2.002       2.747     0.241
World95.txt          2.208       3.005     0.360
Delaware.osm        40.308      38.658     3.460

Original File Sizes (Bytes)

File                  Size
English.dic      4,067,439
World95.txt      2,988,578
Delaware.osm    79,648,840

Compression Ratios

File            Sequential    Parallel     Bzip2
English.dic          0.336       0.336     0.300
World95.txt          0.242       0.242     0.193
Delaware.osm         0.161       0.161     0.045


Conclusions

Burrows-Wheeler Performance (charts for english.dic and world95.txt)


Thank you for viewing our presentation.

QUESTIONS?
