Radar Pulse Compression Using the NVIDIA CUDA SDK HPEC 2008

advertisement
Radar Pulse Compression Using
the NVIDIA CUDA SDK
Stephen Bash, David Carpman, and David Holl
HPEC 2008
September 23-25, 2008
MIT Lincoln Laboratory
0839-118-1
This work is sponsored by the Air Force Research Laboratory under Air Force contract FA8721-05-C-0002.
Opinions, interpretations, conclusions and recommendations are those of the author and not necessarily endorsed
by the United States Government.
NVIDIA Compute Unified Device
Architecture SDK
•
•
•
•
Create custom kernels that run on GPU
Extension of C language
Provides driver- and runtime-level APIs
Includes numerical libraries
– CUFFT
– CUBLAS
• $/GFLOP  GPU=$1.27 CPU=$29.18
HPEC 07:
Benchmarking the NVIDIA 8800GTX
with the CUDA Development Platform
Michael McGraw-Herdeg, MIT
Douglas P. Enright, The Aerospace Corporation
B. Scott Michel, The Aerospace Corporation
©2007 The Aerospace Corporation
MIT Lincoln Laboratory
0839-118-2
NVIDIA Compute Unified Device
Architecture SDK
•
•
•
•
Create custom kernels that run on GPU
Extension of C language
Provides driver- and runtime-level APIs
Includes numerical libraries
– CUFFT
– CUBLAS
• $/GFLOP  GPU=$1.27 CPU=$29.18
HPEC 07:
Benchmarking the NVIDIA 8800GTX
with the CUDA Development Platform
Michael McGraw-Herdeg, MIT
Douglas P. Enright, The Aerospace Corporation
B. Scott Michel, The Aerospace Corporation
©2007 The Aerospace Corporation
MIT Lincoln Laboratory
0839-118-3
Radar Pulse Compression
• Waveform design and processing
Replica
to achieve higher range
resolution and sensitivity*
Fast Time
FFT
• Processing consists of
Fast Time
IFFT
convolution with FIR filter
– Doppler tolerant (top): traditional
frequency domain convolution
– Doppler intolerant (bottom):
additional FFT and Doppler
correction required
Slow Time
FFT
Doppler
Correction
Replica
Fast Time
FFT
Fast Time
IFFT
Fast Time
IFFT
MIT Lincoln Laboratory
0839-118-4
* Skolnik, Radar Handbook, Second Edition. McGraw Hill Publishing, Boston, MA, 1990.
GPU vs. CPU Comparison
CPU vs GPU comparison in real-world
conditions
–
–
2 GHz dual quad-core AMD Opterons
vs eVGA eGeForce 8800 Ultra
Memory transfer to and from GPU
included in timing
1D FFT
500
3
490
2
480
1
0.5
0.3
1
4
16
64
Stage Time (ms)
27000
10368
4725
1960
1000
65536
32768
16384
8192
4096
2048
1024
GPU Speedup
FFT Size
•
Processing Time Per Stage
CPU
GPU
60
50
40
30
20
10
0
256
Batch Size
MIT Lincoln Laboratory
0839-118-5
Backups
MIT Lincoln Laboratory
0839-118-6
Reference: $/GFLOP
As of July 2007, these products represent the top of the line consumer CPU and graphics
card according to floating point computational power:
1. Kentsfield Core 2 Extreme QX6800
37.7 GFLOPS – fastest CPU as of 7/16/2007
http://www.tomshardware.com/2007/07/16/cpu_charts_2007/page36.html
$1100 – price as of March 10, 2008
http://www.google.com/products?q=Kentsfield+Core+2+Extreme+QX6800
$/GFLOPS = $29.18
Notes: Price excludes motherboard + power supply + memory + GPU
2. EVGA GeForce 8800 Ultra Superclocked (NVIDIA)
576 GFLOPS – theoretical peak
http://en.wikipedia.org/wiki/GeForce_8_Series
$730 – price as of March 10, 2008
http://www.google.com/products?q=768-P2-N887-AR&scoring=p
$/GFLOPS = $1.27
Notes: Price includes 768 MB GDDR3 memory, but excludes: motherboard + power supply + CPU
MIT Lincoln Laboratory
0839-118-7
Download