Design of a 96 Element FX Correlator for the LOFAR-UK Station

G. Foster (1,2) and K. Zarb Adami (1)
(1) University of Oxford
(2) griffin.foster@astro.ox.ac.uk

The LOFAR-UK Station

The international LOFAR station at Chilbolton Observatory consists of a 96 element low band array (LBA) and a 96 element high band array (HBA) connected to a single digital backend. The station was completed in September 2010 and has been commissioned for operation. The current backend is designed to create beamlets from the station antennas to be beamformed and correlated at the LOFAR correlator in Groningen, Netherlands.

[Image: LOFAR-UK: Chilbolton Observatory]

RSP Interface

The current station digital backend uses a XAUI loop across the 24 RSP boards for forming beams and for correlator calibration. Approximately 25% of the total bandwidth is unused. Each XAUI contains four lines; three will continue to be part of the main loop and the remaining line will be connected to the station correlator. An RSP firmware modification will allow a selectable 7 MHz of the band to be output over a single XAUI line. This firmware modification will be completed by ASTRON and will also be used in the SuperTERP correlator for the AARTFAAC [c] project.

ROACH II Hardware

The next generation ROACH-II board designed by CASPER [a]/KAT is based on a Xilinx Virtex 6 FPGA. The CASPER design tools are built around reusable DSP blocks. Designs are built and simulated using Simulink and the Xilinx toolflow. Traditional HDL can also be incorporated into designs. Each board can process up to 60 Gbps using CX-4 adapter cards. A modular design will be used to compute correlations for subsets of the band across multiple boards. Two ROACH-II boards will be required to compute the full Stokes correlation of all 96 elements for the 7 MHz band.
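As an illustration of the modular approach, the following minimal Python sketch shows the cross-multiply-and-accumulate step that the X stage of an FX correlator performs for one subband, together with one simple way of dividing subbands between boards. It is not the CASPER implementation: the function names correlate and split_subbands, the contiguous equal-sized split, and the figure of 36 subbands in the usage note are assumptions made for illustration.

```python
from itertools import combinations_with_replacement


def correlate(samples):
    """Cross-multiply and accumulate one subband.

    samples: list of time steps; each time step is a list of complex
    voltages, one per antenna-polarization input.
    Returns {(i, j): accumulated product} for every pair with i <= j,
    i.e. all auto- and cross-correlations.
    """
    n_inputs = len(samples[0])
    pairs = list(combinations_with_replacement(range(n_inputs), 2))
    acc = {pair: 0j for pair in pairs}
    for step in samples:
        for i, j in pairs:
            # Visibility estimate: voltage i times conjugate of voltage j.
            acc[(i, j)] += step[i] * step[j].conjugate()
    return acc


def split_subbands(subbands, n_boards):
    """Assign contiguous, equal-sized subband groups to boards.

    Hypothetical partitioning scheme; the real reorder logic may differ.
    """
    per_board = len(subbands) // n_boards
    return [subbands[b * per_board:(b + 1) * per_board]
            for b in range(n_boards)]
```

For example, split_subbands(list(range(36)), 2) gives two groups of 18 subbands, and correlate applied to a single time step of 192 inputs yields 18528 accumulated products, the baseline count listed in the correlator specifications.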
A Single Station Correlator

Each LOFAR station has a limited calibration correlator, which has been used throughout station commissioning to produce single-station, wide-field images, both as a diagnostic tool and for developing the imaging pipeline. This correlator cycles through the individual subbands to produce a single-channel correlation on timescales of seconds. A dedicated correlator is in development which can process a selectable portion of the band (7 MHz per module), provide further subband channelization, and output correlations on subsecond timescales. One of the key science goals for this instrument is the monitoring and imaging of short-timescale transient events. In addition to the FPGA-based correlator, a CPU/GPU real-time imaging pipeline will be necessary to cope with the large output data rates. This instrument will interface with the current LOFAR RSP so that commensal observations can be performed while the station is being used during international LOFAR operations. Development of the correlator and imaging pipeline is expected to be completed in early 2012.

[Image: CASPER: ROACH-II]

PELICAN Imaging Framework

Imaging on short timescales leads to very large correlator output data rates. In order to cope with these rates and produce updated calibration coefficients, it is necessary to process the output data stream in real time. PELICAN [b], developed by the Oxford e-Research Centre (OeRC), is an efficient and modular framework for processing real-time data streams. Data is split into parallel streams that are processed on CPUs/GPUs to form images of the transiting sky and differential images for transient detection.
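The differential imaging idea can be sketched in a few lines of Python. This is only an illustration and does not use the PELICAN API: the function names, the use of a mean of earlier frames as the reference image, and the global mean/standard-deviation threshold are all assumptions chosen for simplicity.

```python
from statistics import mean, pstdev


def mean_image(frames):
    """Pixel-wise mean of equally sized 2D images (lists of rows)."""
    n = len(frames)
    return [[sum(px) / n for px in zip(*rows)] for rows in zip(*frames)]


def difference_image(current, reference):
    """Subtract the reference image from the current image."""
    return [[c - r for c, r in zip(c_row, r_row)]
            for c_row, r_row in zip(current, reference)]


def detect_transients(current, previous_frames, n_sigma=5.0):
    """Return (row, col) of pixels that brighten by more than n_sigma.

    The threshold is taken from the mean and standard deviation of the
    difference image itself (an assumed, deliberately simple estimator).
    """
    diff = difference_image(current, mean_image(previous_frames))
    flat = [v for row in diff for v in row]
    mu, sigma = mean(flat), pstdev(flat)
    if sigma == 0:
        return []  # nothing changed between frames
    return [(y, x)
            for y, row in enumerate(diff)
            for x, v in enumerate(row)
            if v - mu > n_sigma * sigma]
```

With three identical 8 x 8 frames as history and a new frame in which one pixel is 10 units brighter, detect_transients reports only that pixel. A single bright outlier inflates the standard deviation, so a production detector would more likely use a robust noise estimate.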
Correlator Specs

  Ant-pols           192
  Baselines          18528 (Auto + Cross)
  Bandwidth          7 MHz / XAUI
  RSP Boards         24 (4 antennas w/ 2 pol per board)
  Subband Width      200 kHz
  Data Format        16 bit complex
  Clock              ~300 MHz
  Integration Time   ~10-100 ms
  Integration Size   141 MB
  Input Data Rate    60 Gbps

ROACH II IO/Memory

  FPGA               Virtex 6 SX475T
  QDR                4 x 36 bit x 2M QDR II+
  DRAM               144 bit DDR3 DRAM
  Interface          10 GbE (3 CX-4 or 4 SFP+) x 2

Xilinx FPGA

  Device             Virtex 6 SX475T
  Logic Slices       74400
  DSP48e*            2016
  GTX IO             36 (6.6 Gbps Max)

* DSP48e contains a 25x18 multiplier and accumulator

The PELICAN pipeline will be used to form sky images and to provide a calibration routine that can update the correlator phase and amplitude coefficients in real time. A local sky model will provide the initial calibration. A GPU based 2D FFT will be used to form the dirty image. For short integrations and low resolution, bright point sources will dominate the field. A short CLEAN loop can be used to isolate the sources based on the sky model. A differential comparison of images based on a number of time steps will be performed and a threshold detector will be used to find transient events. A slower stacking module will also be used to form a high dynamic range sky survey image.

[Figure: signal path block diagram (ROACH-II 0 and PELICAN Framework). Input labels: RCU0 (Ant 0: pol X, pol Y), RSP0 (Ant 0…4: pol X, pol Y), XAUI0, XAUI1, XAUI2. ROACH-II 0 blocks: XAUI inputs, Subband Splitter, Antenna Reorder, Corner Turn, Quantization/Equalization (4 bits), Windowed X Engine 0 and 1 (48 dual pol taps, 9 subbands each), Vector Accumulator 0 and 1, 10 GbE. PELICAN Framework blocks: Subband Chunker, RFI Flagging, Calibration, FFT Imager, Local Sky Model, Global Sky Model, CLEAN, Differential Image, Threshold Detection, Stack Image.]
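Two of the quoted figures can be checked with simple arithmetic: the 18528 baselines for 192 antenna-polarization inputs, and the 784 multipliers per windowed X engine (1 auto + 48 cross taps, 4 CMACs per tap, 16 multipliers per tap). The short Python sketch below reproduces both; the function names are hypothetical, and the assumption that each CMAC uses 4 real multipliers is inferred from the "16 4x4 multipliers" per tap quoted on the poster.

```python
def n_baselines(n_inputs):
    """Number of auto- plus cross-correlations for n_inputs signals."""
    return n_inputs * (n_inputs + 1) // 2


def x_engine_multipliers(n_antennas, cmacs_per_tap=4, mults_per_cmac=4):
    """Real multipliers in one windowed X engine.

    Taps: 1 auto tap plus n_antennas / 2 cross taps (dual pol array).
    Each tap holds cmacs_per_tap complex multiply-accumulators, and each
    CMAC is assumed to use mults_per_cmac real multipliers.
    """
    taps = 1 + n_antennas // 2
    return taps * cmacs_per_tap * mults_per_cmac


print(n_baselines(192))          # 96 antennas x 2 polarizations
print(x_engine_multipliers(96))  # 49 taps x 16 multipliers
```

Both results match the figures given on the poster (18528 and 784).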
[Figure: signal path block diagram (RSP inputs and ROACH-II 1). Input labels: RCU95 (Ant 95: pol X, pol Y), RSP23 (Ant 92…95: pol X, pol Y), x24 RSP boards, XAUI3, XAUI4, XAUI5, ASTRON ring. ROACH-II 1 blocks: XAUI inputs, Subband Splitter, Antenna Reorder, Corner Turn, Quantization/Equalization (4 bits), Windowed X Engine 2 and 3 (48 dual pol taps, 9 subbands each), Vector Accumulator 2 and 3, 10 GbE. PELICAN side: Subband Chunker, Phase/Amp Coefficients, Image Database.]

To interface the correlator with the current LOFAR digital backend, a modification to the RSP firmware is required along with a passive XAUI combiner board.

[Figure: RSP board. Labels: AP0, AP1, AP2, AP3, BP, serdes, inter board interface (IBI), XAUI line, ring, LCU, CEP.]

The current correlator design is implemented on 2 ROACH-II boards; each board has 3 CX-4 interfaces for input from the RSP and 4 SFP+ interfaces for interboard communications and 10 GbE output. The input data streams need to be reordered such that half of the subbands go into a single ROACH-II. A CASPER windowed X Engine is used to optimize resource utilization.

Windowed X Engine

  Taps                     1 auto + 48 cross (1/2 the number of antennas for a dual pol array)
  CMACs / Tap              4 (16 4x4 multipliers and 4 accumulators)
  Multipliers / X Engine   784

[Figure: PELICAN module. Labels: PELICAN C++ Module Class, XML Configuration, DataBlob input data abstraction, DataBlob output data abstraction, Auxiliary DataBlob.]

Image by Jean-Mathias Grassmeier, produced using data from LOFAR Station FR606 at Nancay.

References

[a] Collaboration for Astronomy Signal Processing and Electronics Research (http://casper.berkeley.edu)
[b] Pipeline for Extensible, Lightweight Imaging and CAlibratioN (https://wiki.oerc.ox.ac.uk/svn/pelican/user/index.html, https://github.com/pelican/)
[c] Amsterdam-ASTRON Radio Transients Facility and Analysis Centre (http://www.aartfaac.org/)