Digital Signal Processing

West Coast Spectrometer Team
Mark Wagner, Berkeley project manager, FPGA designer
Terry Filiba, data transport: FPGA --> CPU --> GPU
Suraj Gowda, boosting FFT/FPGA clock speed
Glenn Jones, digital downconverter design (Caltech)
Guifre Molera, 10 Gbit Ethernet protocol, GUPPI mods
Gregory Desvignes, GUPPI code modifications
Simon Scott, systems integration, arriving March 26
Hong Chen, FFT optimizations (bit growth, unscrambler)
Billy Mallard, DSP library optimizations (DSP48, etc.)
Andrew Siemion, galactic center pulsar application
Dan Werthimer, taking credit for above work
Observing Modes: 250 MHz  GPU
Overall Block Diagram
Roach Motel (Roach Nest) (KAT)
Roach I vs Roach II
• Roach I works well.
Deployed at many observatories
• Roach II doesn’t exist yet. Prototypes in spring; production in winter?
• Roach I resources are tight,
harder to get to work at high speed
hard to add features, 500 MHz 8K channels won’t fit
• Roach II can use SFP+ connectors,
a more reliable 10GbE connector
• Plan: develop and test using Roach I. Decide later.
3 GS/s ADC Board
Polyphase Filter Bank
FPGA Spectrometer – Mark Wagner
FPGA DDC/Packetizer (Mark Wagner)
(extract sub-band(s) and send to GPU)
64 channel spectrometer
2 GHz bandwidth with Xilinx place/route
1024 channel spectrometer, 3 GHz BW
Suraj Gowda scripts for autoplacement
“Automated Placement for Parallelized
FPGA FFTs” Suraj Gowda et al, 2011
                        No Placement Constraint    Placement Constrained using our algorithm
Processable Bandwidth   < 2.4 GHz                  > 3 GHz
Compile time            80:19 minutes              38:22 minutes
Existing CASPER DDC/Decimation filter
Quarter band filter for 8 real inputs
Input: 8*Fclk real samples per second (BW=4*Fclk), inputs x0 ... x7
Each input: multiply by complex sinusoid (x), then polyphase filter component E0(z) ... E7(z)
Sum (+) of the 8 branch outputs
Output: Fclk cplx samples per second (BW=Fclk): … y3 y2 y1 y0
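The quarter band structure above (mix, polyphase components, sum) can be sketched in NumPy as follows. This is only an illustrative model, not the CASPER block itself: the function names, the Hamming-windowed sinc taps, the mixing frequency and the mapping of branch index to input lane are all assumptions, and the input length is assumed to be a multiple of 8.

```python
import numpy as np

def lowpass_taps(n_taps=64, decim=8):
    """Hamming-windowed sinc low-pass, cutoff 1/(2*decim) cycles/sample (assumed design)."""
    t = np.arange(n_taps) - (n_taps - 1) / 2
    return np.sinc(t / decim) / decim * np.hamming(n_taps)

def ddc_direct(x, h, f0, decim=8):
    """Reference: mix, filter at the full rate, then keep every decim-th sample."""
    n = np.arange(len(x))
    mixed = x * np.exp(-2j * np.pi * f0 * n)      # multiply by complex sinusoid
    n_out = len(x) // decim
    return np.convolve(mixed, h)[:n_out * decim:decim]

def ddc_polyphase(x, h, f0, decim=8):
    """Same output, but every branch E_p(z) runs at the decimated rate."""
    n = np.arange(len(x))
    mixed = x * np.exp(-2j * np.pi * f0 * n)
    n_out = len(x) // decim
    # Pad the taps so all polyphase components have the same length.
    h = np.concatenate([h, np.zeros(-len(h) % decim)])
    # Leading zeros stand in for samples before the start of the stream.
    padded = np.concatenate([np.zeros(decim - 1, dtype=complex), mixed])
    y = np.zeros(n_out, dtype=complex)
    for p in range(decim):
        e_p = h[p::decim]                            # polyphase component E_p(z)
        u_p = padded[decim - 1 - p::decim][:n_out]   # u_p[m] = mixed[m*decim - p]
        y += np.convolve(u_p, e_p)[:n_out]           # branch filter, then the "+"
    return y
```

With `decim=8`, one complex output is produced per group of 8 real inputs, which matches the Fclk complex samples per second on the slide. The polyphase form does the same arithmetic as the direct form but never computes the 7 of 8 filtered samples that decimation would discard.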
Half Band DDC/Filter - Glenn Jones
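Input: 8*Fclk real samples per second (BW=4*Fclk)
Input pairs (x0, x4), (x1, x5), (x2, x6), (x3, x7): each multiplied by complex sinusoid (x), then filtered by E0(z), E1(z), E2(z), E3(z) respectively
Branch outputs v0 ... v7, summed (+) into two output streams: … y2 y0 and … y3 y1
Output: 2*Fclk cplx samples per second (BW=2*Fclk)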
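A rough NumPy model of the half band case: 8 real samples per clock, decimation by 4, so 2 complex outputs per clock. It is a sketch under assumptions (the names, the taps passed in, and the branch indexing convention are illustrative and not taken from the actual design; the input length is assumed to be a multiple of 8). The helper `branch_lanes` shows why inputs k and k+4 end up sharing a filter component, as in the pairing above.

```python
import numpy as np

LANES = 8                        # real samples delivered per FPGA clock (x0..x7)
DECIM = 4                        # keep 1 of every 4 mixed samples
OUT_PER_CLK = LANES // DECIM     # 2 complex outputs per clock

def branch_lanes(p):
    """Input lanes that feed polyphase branch p (indexing convention assumed)."""
    return sorted({(m * DECIM - p) % LANES for m in range(OUT_PER_CLK)})

def halfband_ddc(x, h, f0):
    """Mix and decimate by 4; row c of the result holds the 2 outputs of clock c."""
    n = np.arange(len(x))
    mixed = x * np.exp(-2j * np.pi * f0 * n)      # multiply by complex sinusoid
    n_out = len(x) // DECIM
    h = np.concatenate([h, np.zeros(-len(h) % DECIM)])
    padded = np.concatenate([np.zeros(DECIM - 1, dtype=complex), mixed])
    y = np.zeros(n_out, dtype=complex)
    for p in range(DECIM):
        u_p = padded[DECIM - 1 - p::DECIM][:n_out]    # u_p[m] = mixed[m*DECIM - p]
        y += np.convolve(u_p, h[p::DECIM])[:n_out]    # branch E_p(z), then sum
    # Column 0 is the even stream (y0, y2, ...), column 1 the odd stream (y1, y3, ...).
    return y.reshape(-1, OUT_PER_CLK)
```

Because each of the 4 filter components handles two inputs per clock, the half band version needs half as many branch filters as the quarter band one while delivering twice the output bandwidth.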
SPEAD packet - FPGA to 10GbE (Guifre Molera)
Preliminary Design Work
• Concentrating on the hard parts
– 3 GS/s sampling and PFB/FFT calculations
– Heterogeneous Computing Approach
• Divide processing into front/back ends
• Use FPGAs to fully process bandwidths greater than 250 MHz
• Use FPGA front-ends to pre-process, split and packetize data, then
GPUs to provide fine channelization on narrower chunks
– Software Design
• Adapting code from the Green Bank Ultimate Pulsar Processing
Instrument (GUPPI)
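The front/back-end split can be illustrated with a toy two-stage channelizer: a coarse FFT standing in for the FPGA front end, then a fine FFT on one coarse channel standing in for the GPU back end. This is a simplified sketch, not the instrument's design: it uses plain unwindowed FFTs instead of a PFB, complex input instead of real ADC samples, NumPy instead of GPU code, and hypothetical function names and sizes.

```python
import numpy as np

def coarse_channelize(x, n_coarse):
    """Front-end stand-in: split into blocks and FFT each block.

    Returns an array of shape (n_blocks, n_coarse); column k is the
    time series of coarse channel k, one sample per block.
    """
    n_blocks = len(x) // n_coarse
    blocks = x[:n_blocks * n_coarse].reshape(n_blocks, n_coarse)
    return np.fft.fft(blocks, axis=1)

def fine_channelize(chan_ts, n_fine):
    """Back-end stand-in: FFT one coarse channel's time series into n_fine bins."""
    n_spec = len(chan_ts) // n_fine
    segs = chan_ts[:n_spec * n_fine].reshape(n_spec, n_fine)
    return np.fft.fft(segs, axis=1)

# Toy usage: a tone in coarse channel 5, fine bin 10 (sizes are arbitrary).
N_COARSE, N_FINE = 32, 64
freq = (5 + 10 / N_FINE) / N_COARSE                  # cycles per sample
tone = np.exp(2j * np.pi * freq * np.arange(N_COARSE * N_FINE))
coarse = coarse_channelize(tone, N_COARSE)
fine = fine_channelize(coarse[:, 5], N_FINE)
```

The back end only ever sees one narrow channel at a time, at 1/32 of the input sample rate in this toy setup, which is the point of packetizing sub-bands on the FPGA before handing them to GPUs.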
Pulsars at the Galactic Center ??
100’s of pulsars predicted in the central pc
none discovered - Macquart, Frail, Ransom, Bower..
Map gravitational field (timing), ISM at GC,
black hole spin?
Extreme scattering smears out the pulse
High Frequency Observation to minimize scattering
High Bandwidth Needed at High Frequency (low flux)
800 MHz  8 GHz
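The trade-off above can be put in numbers with two standard scalings: scatter broadening falls steeply with observing frequency (roughly as frequency to the power -4; the exact exponent depends on the scattering medium and is an assumption here), and radiometer sensitivity improves as the square root of bandwidth. The function names and the example values are hypothetical, chosen only to show the arithmetic.

```python
import math

def scattering_time(tau_ref, nu_ref_ghz, nu_ghz, alpha=4.0):
    """Scale a scattering time from nu_ref_ghz to nu_ghz, assuming tau ~ nu**(-alpha)."""
    return tau_ref * (nu_ghz / nu_ref_ghz) ** (-alpha)

def sensitivity_gain(bw_new_mhz, bw_old_mhz):
    """Radiometer equation: noise drops as 1/sqrt(bandwidth)."""
    return math.sqrt(bw_new_mhz / bw_old_mhz)

# Hypothetical example: going from 1 GHz to 8 GHz shrinks the scattering
# time by 8**4 = 4096; quadrupling bandwidth doubles sensitivity.
reduction = scattering_time(1.0, 1.0, 8.0)
gain = sensitivity_gain(800.0, 200.0)
```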
Worries
• Speed of FPGA - difficult, time consuming layouts,
perhaps impossible for full Roach I chip
• Lots of modes - time consuming
(design/test/software/document)
• Will everything fit in Roach I?
Use Roach II?
• Might lose Mark Wagner in September