SERPent - Automated RFI software for e-MERLIN
Luke W. Peck∗ & Danielle M. Fenech†
University College London
August 31, 2012
Abstract
This memo summarizes the SERPent pypeline used to address the varying RFI environment present across the e-MERLIN array. SERPent is an automated RFI mitigation procedure utilizing the SumThreshold methodology (LOFAR pipeline) and is written in the Parseltongue language to interact with the AIPS program. In addition to the flagging of RFI-affected visibilities, the script also flags the Lovell stationary scans inherent to the e-MERLIN system. Both flagging and computational performances of SERPent are presented here with e-MERLIN commissioning datasets for both L and C band observations. The refining of automated reduction and calibration procedures is essential for the e-MERLIN Legacy projects, where the vast data sizes (> TB) mean that the traditional astronomer interactions with the data are unfeasible.
∗ email: lwp@star.ucl.ac.uk
† email: dmf@star.ucl.ac.uk
1 Introduction
With the advent of new receivers, electronics, correlators and optical fibre networks, modern interferometers such as e-MERLIN, EVLA and ALMA are becoming an ever more sensitive window on the radio universe. Wide receiver bandwidths are particularly important in increasing the sensitivity of an interferometer, by increasing the spectral range of radio frequencies observed and thus increasing the uv coverage. However, this increased bandwidth now incorporates more radio frequencies reserved for commercial purposes such as mobile phones, satellites and radio stations, to name a few sources. Such Radio Frequency Interference (RFI) is an old nemesis of radio astronomers, who traditionally removed or 'flagged' RFI manually; but with the increase in data sizes from the order of Megabytes to Terabytes due to recent improvements to arrays, this has become an unreasonable task. This highlights the importance of automated procedures, particularly for RFI mitigation.
The Scripted E-merlin Rfi-mitigation PypelinE for iNTerferometry (SERPent) was created to tackle this problem for the RFI environment affecting e-MERLIN using Parseltongue, a Python-based language which calls upon AIPS tasks.
2 SERPent Requirements
SERPent has been run on a number of systems and seems to be fairly stable. Here is a list of the program versions on which we run the code; these should probably be considered the 'minimum' requirements for the code to work. For computational requirements and timings for the script to run on test data, please see Section 6.
• AIPS release 31DEC11
• Python 2.6.5
• Parseltongue 2.0 (with Obit 1.1.0)
• Numpy 1.6.1
3 Lovell Stationary Scans
A problem unique to the e-MERLIN array is the Lovell stationary scan. Due to the size of the Lovell telescope and the subsequent slew time, the Lovell telescope only participates in every alternate phase-cal scan, remaining stationary on the target for the other scans. The other antennas in the array are not affected. This results in the visibilities from baselines containing the Lovell telescope having two different amplitude levels for the phase-cal. In most cases the phase-cal will be brighter than the target, so when the Lovell is observing the phase-cal the received flux will be greater than when the Lovell does not participate in the phase-cal scan and remains on the target source. This behaviour can be seen using the IBLED task within AIPS on the phase-cal source, as figure 1 clearly shows. This figure also displays another problem with early e-MERLIN commissioning data: multiple amplitude levels for scans throughout the observation. This property has been traced to hardware issues within the receivers, and new filters appear to have resolved the issue for future observations. However, it was necessary to normalize this problem before flagging this dataset.
Figure 1: AIPS IBLED task window, displaying the phase-cal source 2007+404, stokes LL (for greater clarity). The top panel shows all scans for the entire observation run, and the main central panel shows a small selection of scans for closer inspection, before running SERPent.

In the main window each group of points represents one scan, for which there are three distinct amplitude levels. The highest two levels are scans where the Lovell telescope contributes to the observation (including the aforementioned filter issues affecting amplitude levels) and the lowest-level scans are where the Lovell does not contribute. Across the entire observation (top panel) the Lovell stationary scans are consistent in magnitude and occur in every other scan, despite the varying amplitude levels of the Lovell on-source scans, indicating that the Lovell dropout scans are indeed the cause of the lowest-level scans in figure 1.
If the array is e-MERLIN, SERPent will run an extra piece of code, which firstly determines the Lovell baselines. It makes a first pass through all the integration times and isolates each scan, evaluating the mean amplitude of each scan, the highest and lowest scan statistics and the integration time step. A second pass again isolates each individual scan and tests the following condition: if the mean of the scan lies within ±σ of the lowest mean found in the previous pass, then the entire scan is flagged. The results are written to a text file via the cPickle Python module and are combined with the main SumThreshold flagging results later in the script. Figure 2 shows the IBLED task window for the same phase-cal source as in figure 1 after the Lovell stationary scans have been removed by SERPent.
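For illustration, a minimal sketch of the scan-mean test described above is given below. The function name and the ±σ tolerance handling are illustrative assumptions rather than SERPent's actual implementation.

```python
import numpy as np

def lovell_dropout_scans(scan_means, sigma):
    """Identify Lovell dropout scans on a Lovell baseline.

    scan_means : mean visibility amplitude of each phase-cal scan.
    sigma      : allowed spread about the lowest scan mean.
    Returns a boolean array, True where the whole scan should be flagged.
    """
    scan_means = np.asarray(scan_means, dtype=float)
    lowest = scan_means.min()
    # A scan is a dropout if its mean lies within +/- sigma of the lowest mean.
    return np.abs(scan_means - lowest) <= sigma
```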
Figure 2: AIPS IBLED task window, displaying the phase-cal source 2007+404, stokes LL (for greater clarity). The top panel shows all scans for the entire observation run, and the main central panel shows a small selection of scans for closer inspection, after running SERPent. The lowest-level scans present in figure 1 have been removed.

4 RFI Mitigation for e-MERLIN

One of the toughest challenges in RFI mitigation is accounting for its variable intensity, morphology and unpredictable nature. There are numerous methods available to astronomers for both pre- and post-correlation mitigation, each with advantages and disadvantages. Given the facilities in place at Jodrell Bank, we decided post-correlation techniques would be most prudent. Early commissioning data from e-MERLIN contained RFI varying in both time and frequency, thus necessitating threshold detection methods. We now outline the RFI mitigation techniques deployed by SERPent.
4.1 SumThreshold Method
The most effective thresholding method was demonstrated by Offringa et al. 2010b [3] to be the SumThreshold, and this is the adopted RFI detection method for SERPent. An overview of the method is given here; for a more in-depth analysis of the method please see the aforementioned literature.
Threshold methods work on the basis that RFI increases visibility amplitudes at the times and frequencies where it is present. There will therefore be a considerable difference compared to RFI-free visibility amplitudes, so these RFI will be statistical outliers. If these RFI are above a certain threshold condition then they are detected and flagged. The threshold level is dictated by the statistics of the sample population, which can be the entire observation (all time scans, frequency channels, baselines etc.) or a smaller portion, for example separate baselines and IFs. Separating the visibilities this way increases the computational performance (Python is faster when operating on many smaller chunks of data rather than one big chunk), and also makes the statistics more reliable, as the RFI may be independent of baseline and distributed differently between IFs. This is particularly relevant for L band observations, where the RFI is more problematic.
The SumThreshold method works on data which are separated by baseline, IF and stokes and arranged in a 2D array, with the individual time scans and frequency channels comprising the array axes, i.e. time-frequency space. The frequency channels were further split by IF for the reasons previously stated. The idea is that peak RFI and broadband RFI will be easily detectable when the visibility amplitudes are arranged in time-frequency space. The e-MERLIN correlator outputs three numbers associated with any single visibility: the real part, the imaginary part and the weight of the visibility. When appending visibilities in time-frequency space, if the weight is greater than 0.0, i.e. data exist for that time and frequency, then the magnitude of the real and imaginary parts of the visibility is taken to constitute the amplitude. If the weight is 0.0 or less, i.e. no data exist for this baseline, time scan etc., then the amplitude is set to 0.0. This visibility will thus have no effect on the statistics or threshold value, but will act as a placeholder for that elemental position within the array. The Python module NumPy is employed to create and manipulate the 2D arrays, as its core is written in C (which is intrinsically faster than Python) and has been optimized¹.
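The construction of the amplitude array can be sketched as follows. This is a simplified illustration; the function and array names are assumptions rather than SERPent's actual data structures.

```python
import numpy as np

def amplitude_array(re, im, wt):
    """Build the 2D time-frequency amplitude array for one baseline/IF/stokes.

    re, im, wt : arrays of shape (ntimes, nchannels) holding the real part,
                 imaginary part and weight of each visibility.
    Where the weight is positive the amplitude is sqrt(re^2 + im^2); where the
    weight is 0.0 or less (no data) the amplitude is set to 0.0, so the element
    acts only as a placeholder and does not affect the statistics.
    """
    amp = np.sqrt(re ** 2 + im ** 2)
    return np.where(wt > 0.0, amp, 0.0)
```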
There are two concepts associated with the SumThreshold method: the threshold and the subset size, i.e. a small slice of the total elements (in this case visibility amplitudes) in a certain direction of the array (time or frequency). The difference between the SumThreshold method (a type of combinatorial thresholding) and normal thresholding is that after each individual element in the array has been tested against the first threshold level χ1, the values of a group of elements can be averaged and tested against a smaller threshold level χi, where i is the subset number, i.e. the number of elements averaged and tested. Empirically a small set of subsets i = [1, 2, 4, 8, 16, 32, 64] works well (Offringa et al. 2010b) [3]. A window of size i cycles through the array in one direction (e.g. time) for every possible position given the array and subset size. After each subset cycle a binary array of identical size records the positions of any elements which are flagged: 0.0 denotes a normal visibility and 1.0 signifies RFI found in the time direction (2.0 for the frequency direction and higher values for any subsequent runs of the flagger). At the beginning of the next subset cycle, for any element within the flag array whose value is greater than 0.0, the corresponding amplitude in the visibility array is reduced to the threshold level χi, which progressively gets smaller with increasing subset size. If the average of a group of elements of any subset size i is found to be greater than the threshold level χi, then all elements within that window are flagged. This method is implemented in both array directions (i.e. time and frequency).
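A minimal one-directional sketch of this procedure is shown below, assuming the threshold levels χi have already been computed (see Section 4.2). For brevity a boolean flag array is used rather than SERPent's run/direction codes, and only a single 1D slice of the time-frequency array is processed.

```python
import numpy as np

def sum_threshold_1d(values, flags, chi, subsets=(1, 2, 4, 8, 16, 32, 64)):
    """One-directional SumThreshold pass over a 1D slice (e.g. one channel
    across all times).

    values : 1D array of visibility amplitudes.
    flags  : boolean array of the same length, updated in place.
    chi    : dict mapping subset size i to the threshold level chi_i.
    """
    work = np.asarray(values, dtype=float).copy()
    for i in subsets:
        chi_i = chi[i]
        # Clip previously flagged samples to the current threshold so they
        # do not dominate the window sums on later passes.
        work[flags] = np.minimum(work[flags], chi_i)
        # Slide a window of length i; flag windows whose average exceeds chi_i.
        for start in range(len(work) - i + 1):
            window = slice(start, start + i)
            if work[window].sum() > chi_i * i:
                flags[window] = True
    return flags
```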
4.2 SERPent's Implementation of the SumThreshold Method
In addition to the SumThreshold methodology, certain clauses have been added to prevent the algorithm from overflagging the dataset. If any threshold level reaches the mean + variance estimate, the flagging run for that direction stops. Before the full implementation of the SumThreshold method is deployed, an initial single-subset (i = 1) run is performed to remove any extremely strong RFI. The amplitudes of any RFI detected are subsequently set to zero and the full flagging run begins. The flagging process can run multiple times at the cost of computational time; by default the code performs a second run if the maximum value within the array is a certain factor of the median and if there are flags from the previous run. On this second run all flagged visibilities from the first run are set to 0.0 in the visibility array so the statistics are not skewed, and this run can then search for weaker RFI which may remain. This may be necessary as some RFI in the early e-MERLIN commissioning data were found to be over 10,000 times stronger than the astronomical signal, and some weaker RFI were still present. Note that the first run's subsets increase in size in binary steps up to 32, and the second run goes deeper, to 256. These can easily be changed manually, to lower values to save time if there isn't much RFI in the observations, or to greater subset sizes if necessary.
¹ It should be noted here that how this module is compiled and called upon can have a significant effect on performance.
The first threshold level can be calculated by a range of methods and statistics. The variance of the sample is an important component of this threshold, and various methods are described and tested by Fridman (2008) [1]. The author concluded that Exponential Weighting is the best method from the point of view of LOSS: a measure of the difference in standard deviation between a robustly estimated variance and a simple estimate, in the absence of outliers. However, the Median Absolute Deviation (MAD) and the Median of Pairwise Averaged Squares are the most effective at rejecting outliers, although the author comments that both are not particularly efficient, requiring more samples to produce the same power as other methods. Since the sample from any given e-MERLIN observation will be of adequate size, this is not such an issue. The breakdown point for MAD is also very high (0.5), i.e. almost half the data may be contaminated by outliers (Fridman 2008) [1]. MAD is adopted by this algorithm due to these robust properties. Again, the author stresses that the type and intensity of RFI, the type of observation and the method of implementation are important factors when deciding which estimate to use for any given interferometer.
The MAD variance estimate used in the SERPent algorithm is defined by equation 1, where median_i(x_i) is the median of the original population. Each sample of the population is replaced by the absolute value of its difference from this median; the median of this new absolute-deviation population is then taken and multiplied by a constant 1.4826 to make the estimate consistent with that expected for a Gaussian distribution.

    MAD = 1.4826 · median_j { |x_j − median_i(x_i)| }        (1)
The first threshold level χ1 is thus determined by an estimate of the mean x̄, the variance σ and an aggressiveness parameter β (equation 2) (Niamsuwan, Johnson & Ellingson 2005) [2]. Since the median is less sensitive to outliers it is preferred to the traditional mean in this equation (thus x̄ = median), and the MAD is preferred to the traditional standard deviation for the variance for similar reasons (σ = MAD). If the data are Gaussian in nature then the MAD value will be similar to the standard deviation and the median to the mean. A range of values for β was tested until a value stable across multiple observations and frequencies was found, of around β = 25. Increasing the value of β reduces the aggressiveness of the threshold and decreasing it increases the aggressiveness.

    χ1 = x̄ + βσ        (2)
The subsequent threshold levels are determined by equation 3, where ρ = 1.5 (a value which empirically works well for the SumThreshold method; Offringa et al. 2010b [3]) defines how 'coarse' the difference between threshold levels is, and i is the subset size.

    χi = χ1 / ρ^(log₂ i)        (3)
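Putting equations (1)-(3) together, the threshold levels for one baseline/IF/stokes chunk could be computed as in the sketch below. Function and parameter names are illustrative; β, ρ and the subset sizes are the defaults quoted above.

```python
import numpy as np

def threshold_levels(amplitudes, beta=25.0, rho=1.5,
                     subsets=(1, 2, 4, 8, 16, 32, 64)):
    """Compute the first threshold and the SumThreshold levels chi_i.
    Zero-amplitude placeholder visibilities should be excluded beforehand."""
    x = np.asarray(amplitudes, dtype=float)
    med = np.median(x)
    mad = 1.4826 * np.median(np.abs(x - med))        # equation (1)
    chi1 = med + beta * mad                          # equation (2), x_bar = median
    chi = dict((i, chi1 / rho ** np.log2(i)) for i in subsets)  # equation (3)
    return chi1, chi
```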
In summary, SERPent first calculates the median and MAD for each IF, baseline and stokes. An initial run cycles through the visibilities to remove any individual amplitudes which are over the first threshold, in case extremely strong RFI are present, and sets them to zero for the subsequent full flagging runs. Then the script starts the 'first' full run of the SumThreshold method in both the time and frequency directions. After this is completed it again sets any flagged visibility amplitudes to zero and recalculates the statistics. The second SumThreshold run is then performed to try and remove any weak RFI which remains. All the parameters described here can be changed manually by the user via the SERPent input file.
5 SERPent Outputs
Whilst SERPent is running it will continuously output text files containing information on the flagging to a designated folder set by the user. The cPickle module in Python is used to store the NumPy arrays from the flagging, which are all read back in once the flagging has finished. SERPent operates in this fashion due to the way it is parallelized for performance: each CPU does not need to retain any Python variables or information and is free to move on to further flagging runs. Each of these files is automatically deleted by SERPent at the end of the script.
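The storage of intermediate results can be sketched roughly as follows; the file paths and function names are hypothetical, and SERPent's actual naming scheme may differ.

```python
import cPickle  # Python 2, as used by SERPent; use pickle on Python 3

def dump_flags(path, flag_array):
    """Write one chunk's flag array to disk so the worker process does not
    need to hold it in memory once its flagging run is finished."""
    with open(path, 'wb') as f:
        cPickle.dump(flag_array, f, protocol=2)

def load_flags(path):
    """Read a flag array back in when combining results into the FG table."""
    with open(path, 'rb') as f:
        return cPickle.load(f)
```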
SERPent will combine all the flag arrays and format them into the appropriate FG extension table required by AIPS. To maximize the FG table efficiency, the SERPent FG output is fed through the REFLG task in AIPS to condense the number of FG rows. This is important due to the limit imposed by certain calibration tasks in AIPS on the number of FG entries which can be applied. Multiple FG tables will be created by SERPent: from the flagging, from the Lovell dropouts, from a combination of both, and from running the REFLG task (if the AIPS version is recent enough). Whilst the REFLG'd FG table is automatically attached to the input file, these tables all remain available for user manipulation.
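In ParselTongue, running REFLG on a catalogued UV file looks roughly like the sketch below. The AIPS user number and the data name/class/disk/sequence are placeholders, and the task's specific adverbs (e.g. which FG version to compress) should be set according to the AIPS help file; only the generic AIPSTask interface is shown here.

```python
from AIPS import AIPS
from AIPSData import AIPSUVData
from AIPSTask import AIPSTask

AIPS.userno = 100                                # placeholder AIPS user number

# name, class, disk, seq of the catalogued UV file (placeholders)
uvdata = AIPSUVData('TARGET', 'UVDATA', 1, 1)

reflg = AIPSTask('REFLG')
reflg.indata = uvdata                            # UV file carrying the SERPent FG table
reflg.go()                                       # condense the FG rows into a new FG version
```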
6 SERPent Performance
Here we document the performance of early test runs of SERPent on old MERLIN data, early e-MERLIN commissioning data, and RFI test data supplied by Rob Beswick (Jodrell Bank). Table 1 gives details of the datasets tested here. All tests used SERPent version 31/07/12.
Table 1: SERPent Performance Test Datasets

Telescope | Dataset Name                | Size    | Band | Visibilities | Sources | Baselines | IFs | Channels | Stokes
MERLIN    | M82V                        | 212 MB  | L    | 82692        | 6       | 21        | 1   | 31       | 2
e-MERLIN  | RFI Test Data: 1436+6336    | 1.63 GB | L    | 5812         | 1       | 10        | 12  | 512      | 4
e-MERLIN  | COBRaS W1 2011: 0555+398    | 2.33 GB | C    | 99149        | 1       | 10        | 4   | 128      | 4
e-MERLIN  | COBRaS W1 2011: All Sources | 25.3 GB | C    | 1079150      | 6       | 10        | 4   | 128      | 4

6.1 Flagging Performance
SERPent has been tested on both L and C band observations and has been found to flag all of the C band RFI and the majority of the L band RFI. The remaining L band RFI is usually weak broadband RFI or very weak RFI close to the median value of the sample.
Firstly we present some results from L band data. Figure 3 shows some RFI test data of 1436+6336 (data courtesy of Rob Beswick), with one baseline displayed via the AIPS task SPFLG in time-frequency space. The first IF is completely wiped out with noisy data, and some weak broadband RFI remains in the central IFs. Almost everything else has been flagged, including some very intricate RFI which cannot be flagged as accurately by more simplistic RFI flagging routines.
The L band results have shown that SERPent can flag complicated RFI in time-frequency
space, and figure 4 shows this also applies to the C band with the infamous ‘wiggly’ RFI found in
commissioning data. Note that this was very poor quality data and SERPent even started to flag
some of the noise. However this is a good example of the thresholding method in action.
Figure 3: AIPS SPFLG image of 1436+6336, L band, baseline 7 − 8, stokes: RR, IF: 1 − 12 after SERPent flagging.
The AIPS task REFLG was also deployed in this image. The vertical axis is time and horizontal axis is frequency.
6.2 Computational Performance
We have conducted multiple runs on a range of datasets and computers to assess the flagging and computational performance of SERPent. In the manner in which SERPent is currently parallelized, the script distributes the data by baseline to different CPUs and thus runs multiple flagging jobs in parallel. The speed will therefore increase with more CPUs, until the number of CPUs exceeds the number of baselines; for e-MERLIN this is 21 baselines for the full array. The performance increases in steps as the number of CPUs approaches a factor of the number of baselines, e.g. 10 baselines distributed over 4 CPUs means 2 CPUs will run 2 baselines each and the other 2 CPUs will run 3 baselines each. Similarly, running 12 baselines on 4 CPUs takes the same time as running 9 baselines over 4 CPUs, since the limiting factor is the CPU running the largest number of baselines. It would also be correct to state that running 9 baselines on 3 CPUs will take roughly the same time as running 9 baselines on 4 CPUs.
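In other words, the wall-clock time is set by the busiest CPU, i.e. by ceil(baselines / CPUs). A trivial illustration:

```python
import math

def busiest_cpu_load(n_baselines, n_cpus):
    """Number of baselines handled by the most heavily loaded CPU, which
    sets the overall run time when jobs are distributed by baseline."""
    return int(math.ceil(float(n_baselines) / n_cpus))

print(busiest_cpu_load(12, 4))   # 3 - same limiting load as ...
print(busiest_cpu_load(9, 4))    # 3 - ... 9 baselines on 4 CPUs
print(busiest_cpu_load(9, 3))    # 3 - and as 9 baselines on 3 CPUs
```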
There are plans to increase the parallelization by further splitting the jobs by IF as well as by baseline; a sketch of this follows below. This will spread the workload more evenly across different CPUs, where previously some CPUs would be idle. Computers with a large number of CPUs (> 20) will also benefit from this type of separation. We are also discussing splitting the data in time, as currently all the time scans for a baseline and IF are passed through the flagger together. This should improve performance because Python performs faster on many smaller chunks of data rather than a few big chunks.
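Such a finer job granularity could, for example, be fed to a process pool as sketched below; `flag_chunk` is a placeholder for the per-chunk flagging work, not SERPent's actual worker function.

```python
from multiprocessing import Pool

def flag_chunk(job):
    """Placeholder worker: flag one (baseline, IF) chunk and pickle the
    resulting flag array to disk, returning only the job identifier."""
    baseline, IF = job
    # ... read visibilities, run SumThreshold, dump flags with cPickle ...
    return baseline, IF

def run_parallel(baselines, n_ifs, n_cpus):
    """Distribute (baseline, IF) jobs over n_cpus worker processes."""
    jobs = [(bl, IF) for bl in baselines for IF in range(1, n_ifs + 1)]
    pool = Pool(processes=n_cpus)
    done = pool.map(flag_chunk, jobs)
    pool.close()
    pool.join()
    return done
```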
We have so far analysed two datasets, M82V and COBRaS W1 2011, for computational performance on two different computer systems. Table 2 gives details of the computer systems on which we have tested SERPent's performance.
Table 2: Computer Systems

Computer Name | Memory (GB) | NCPUs
Leviathan     | 100         | 16
Desktop       | 4           | 4
Figure 4: AIPS SPFLG image of 0555+398, C band, baseline 5 − 7, stokes RR, IF 2, before (left) and after (right) SERPent flagging. The AIPS task REFLG was also deployed in this image. The vertical axis is time and the horizontal axis is frequency.

Firstly we compare the time taken to flag the MERLIN M82V and e-MERLIN COBRaS W1
2011 datasets with both the Desktop and Leviathan, using the entire range of CPUs available on each system. Figure 5 shows the average time taken over three separate runs for each number of CPUs on each computer. It is clear that increasing the number of CPUs, and thus splitting the workload, results in an increase in performance which levels off as the number of CPUs reaches the number of baselines in the dataset. To improve the performance further, additional parallelization will be needed, splitting tasks by IF in addition to baseline across individual CPUs.
Increasing the amount of memory also increased the computational performance, albeit by a smaller amount than the parallelization. Leviathan has 25x more memory than the standard Desktop computer in our tests and is consistently faster by a factor of 1.7 when comparing runs with matching numbers of CPUs across both computers and datasets. This shows that the limiting factor in running SERPent on interferometric datasets is the sheer volume of data that needs processing, rather than RAM.
We display both datasets next to each other to highlight the ∼11x increase in data size between the datasets and the seemingly linear increase in the time needed to process them, i.e. SERPent takes ∼11x longer to run with the same setup on the 2.33 GB dataset than on the 212 MB dataset. Initial tests on the full 25.3 GB COBRaS W1 2011 dataset reveal that Leviathan needs ∼6 hours with 10 CPUs to process the data. Comparing this to the time needed by Leviathan with 10 CPUs to process 2.33 GB of data (an ∼11x increase in size) gives a ratio of over 9. Thus, as a simple estimate, it is reasonable to assume that the computational time to run SERPent scales roughly linearly with dataset size.
Figure 5: Time taken to flag the MERLIN M82V (left panel) and the e-MERLIN COBRaS W1 2011 (right panel) datasets using a common Desktop computer and Leviathan over a range of CPUs (vertical axis: time taken in seconds; horizontal axis: NCPUs). Each point is an average of 3 runs using the same number of CPUs. Note that the COBRaS W1 2011 dataset only had 10 baselines and, due to the current parallelization, would only benefit from using up to 10 CPUs with Leviathan.

Lastly we present the performance of the Desktop and Leviathan on multiple CPUs, as a ratio to a single CPU, in figure 6. At a low number of CPUs a linear relation exists in the performance increase, which plateaus at two distinct levels. This is due to how SERPent distributes tasks to different CPUs, as explained above.
7 Conclusion and Discussion
We have presented a simple script to flag RFI from radio interferometric data, utilizing common software packages and programs. The readily available scripts provide a simple and easy way for the astronomer to remove RFI from their data. SERPent also addresses the Lovell stationary scan problem, which would otherwise affect the automated flagging methods within SERPent and perhaps other scripts. We have discussed the RFI mitigation techniques involved and demonstrated the flagging and computational performance of SERPent on a range of machines and setups. We now discuss some areas of RFI mitigation which may be of interest.
Achieving complete automation of any procedure is a challenging task, particularly when many variables and unexpected problems arise within arrays and in real datasets. The amplitude-splitting problem arising from the receiver filter issues discussed in Section 3 provides a clear example of the unexpected problems that may occur in any individual dataset. These issues have to be resolved if successful calibration and analysis of the data is to happen, but they cannot always be predicted. To produce a complete pipeline (including reduction, flagging and calibration) which can run blind on an observation is perhaps something for the future, when the interferometer's systems are stable and their behaviour is well established.
Figure 6: Speed of running SERPent on multiple CPUs on the Desktop and Leviathan, relative to a single CPU on the same system, for the MERLIN M82V dataset (vertical axis: speed ratio relative to 1 CPU; horizontal axis: NCPUs). The 'plateau' fluctuations will be due to minor differences in running conditions during the tests.

This point carries into another discussion, on the size of datasets from modern interferometers. The reason we strive to achieve complete pipelines is the data volume from modern interferometers. The computational performance demonstrated here shows that with old MERLIN data (212MB, 6 sources) the entire dataset can be flagged in around 11 minutes on a modest desktop
(4GB memory and 4 CPUs), but a modest dataset from early e-MERLIN commissioning observations (25GB total, 6 sources, with only half the full number of baselines) needs around 6 hours on a more powerful computer (100GB memory and 10 CPUs). Full e-MERLIN Legacy project datasets are expected to be even larger, ranging from hundreds of GB up to a TB in size. This clearly shows the computational challenge involved in reducing these datasets, and also the necessity of automating procedures, as manually handling this amount of data is unfeasible.
In the future, as interferometers become ever more powerful with arrays like the SKA, these issues will only magnify. The scripts and procedures written for the current crop of interferometers such as e-MERLIN provide valuable insight and act as a stepping stone to the next generation of radio interferometers.
References

[1] P. A. Fridman. Statistically Stable Estimates of Variance in Radio-Astronomy Observations as Tools for Radio-Frequency Interference Mitigation. ApJ, 135:1810–1824, May 2008.

[2] N. Niamsuwan, J. T. Johnson, and S. W. Ellingson. Examination of a simple pulse-blanking technique for radio frequency interference mitigation. Radio Science, 40:5, June 2005.

[3] A. R. Offringa, A. G. de Bruyn, M. Biehl, S. Zaroubi, G. Bernardi, and V. N. Pandey. Post-correlation radio frequency interference classification methods. MNRAS, 405:155–167, June 2010.