Computational Astronomy 2009 Exam part B

Computational Astronomy 2009 exam part A (the 'take-home' exam)
Pages in this document: 6
Declaration:
1. I know that plagiarism is wrong. Plagiarism is to use another's work and pretend
that it is one's own.
2. All material I have written in answer to these exam questions is my own work.
3. I have not allowed, and will not allow, anyone to copy my work with the intention
of passing it off as his or her own work.
Signature ______________________________
Answers to this exam should take the form of electronic files which should be sent via
email to ims@ast.uct.ac.za. Your email must bear a date stamp before 0945 local
time, Monday 1 June 2009.
You should also sign the plagiarism declaration on page 1 and return this to your
lecturer Ian Stewart at your earliest convenience. Your answers to this exam will not
be assessed until you do so.
For the purpose of answering the questions in this exam, you may make use of any
resources consistent with the plagiarism declaration on page 1.
You are advised to read the question paper carefully before starting work on it - in
particular, the hints and reminders towards the end. If you make a mistake through
too-hasty reading, you may lose some assessment. 'Evil' coding practices may also
lose some assessment.
The four FITS files mentioned in this exam, namely xmm_image.fits,
background_map.fits, like_map.fits and source.fits, are available at
http://www.ast.uct.ac.za/~ims/teaching/comp_astron/exam/
The questions.
1) For this question, you are required to write two python programs, which I will call
program A and program B. Taken together, the two programs implement a
sliding-box algorithm for detecting sources in x-ray images.
a) Program A is intended to create a map of the likelihood that the x-rays in an
x-ray image come from point sources rather than continuous background. This
program requires two input files and produces one output file. File input 1
contains an image of the sky at x-ray wavelengths. File input 2 contains an
image (of the same dimensions as input 1) which records the expected amount
of background in each pixel. The output file should contain an image which
records, in each pixel, the likelihood that the x-ray counts there come only
from the background.
The algorithm you must employ is described as follows:
- The first thing you must do is write a function named like_calc, with a call
  and return structure indicated as follows:

    def like_calc(counts, background):
        """This function does three things:
        - Calculates the Cash log-likelihood ratio C.
        - Calculates the probability that C would occur
          as a natural fluctuation if the null hypothesis
          is true: namely that all the observed counts
          are just due to the background.
        - Calculates -log of this probability.
        The formula for C in the present case is
          C = 2*(counts*log(counts/background)-counts+background).
        The probability P of the null hypothesis is
          P=Q(1/2.0, C/2.0) where Q is the complementary
          incomplete gamma function.
        The present function returns -log(P).
        The input argument 'counts' is a scalar
        integer; 'background' is a scalar floating-point;
        and the returned value 'minus_log_P' is
        a scalar floating-point."""
        # The remaining statements come here.
        return minus_log_P

- Your function algorithm should contain the following three tests:
  1. Test for counts<=0 or background<=0. If either is true, the expression
     for the Cash statistic C cannot be evaluated: make the function return
     the value 0 in these cases.
  2. Test for counts<background. Such negative fluctuations can't be
     sources. Again, make like_calc return a value of 0 in this case.
  3. For safety, you should also test the value of C before calling the Q
     function. If C<=0, the Q function may return an error. In these cases
     you should also make like_calc return a value of 0.
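
The function and its three tests might be sketched as follows. This is only one possible reading of the specification, not a model answer. It assumes the identity Q(1/2, x) = erfc(sqrt(x)), which allows the standard-library math.erfc to stand in for a general incomplete gamma routine (scipy.special.gammaincc(0.5, x) would be an alternative). The guard against P underflowing to zero is an addition that the question does not ask for.

```python
import math


def like_calc(counts, background):
    """Return -ln(P) for a box holding `counts` events where `background` are expected."""
    # Tests 1 and 2: C cannot be evaluated, or the fluctuation is negative.
    if counts <= 0 or background <= 0 or counts < background:
        return 0.0

    # Cash log-likelihood ratio for the box.
    c_stat = 2.0 * (counts * math.log(counts / background) - counts + background)

    # Test 3: do not call Q with a non-positive argument.
    if c_stat <= 0:
        return 0.0

    # Assumption: Q(1/2, x) = erfc(sqrt(x)), so P = erfc(sqrt(C/2)).
    prob = math.erfc(math.sqrt(c_stat / 2.0))

    # Addition not in the question: for very large C, P underflows to zero
    # and log(0) would raise an error, so report infinity instead.
    if prob <= 0.0:
        return float("inf")

    minus_log_P = -math.log(prob)
    return minus_log_P
```

With 10 counts on an expected background of 5.0, this returns a value close to 3; with fewer counts than background it returns 0.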
- Now you should begin the main part of program A. The first thing you
  must do is read into separate numpy arrays the following two FITS files:
  - xmm_image.fits (this file contains integer values, which should be
    stored in an integer-valued array)
  - background_map.fits (this file contains floating-point values)
- Obtain the X and Y dimensions of the first array. (You may assume that
  the second array has the same dimensions.) For convenience in explaining
  the rest of the exercise, I will assume that these dimensions are stored in
  two variables named xi_size and yi_size respectively.
- Construct a new numpy array, to be called like_array. This array should
  have the same dimensions as the input images; its data type should be
  floating-point; and it should initially be filled with zero values.
- Write 2 nested loops. The outer loop index (let's call it xi) should have a
  starting value of 2 and a final value of xi_size-3; the inner loop index (let's
  call it yi) should have a starting value of 2 and a final value of yi_size-3.
  - Inside both loops, add together the pixel values in the 5 by 5 'slice' of
    the xmm_image array which extends in the X direction from xi-2 to
    xi+2 and in the Y direction from yi-2 to yi+2.
  - Do the same with the similar 5 by 5 slice from the background_map
    array.
  - At this point, still inside the two loops, you should have 2 numbers: the
    first, an integer, being the total number of x-ray events detected within
    the 5 by 5-pixel patch or box of xmm_image centred on pixel (xi,yi);
    the second, a floating-point value, being the total expected background
    counts within the same box. You should then call the function
    like_calc with these two numbers as input arguments.
  - Write the function return value to pixel (xi,yi) of like_array.
That is the last statement which needs to go inside the nested loops. After
finishing the loops, like_array should be full of data. These values
represent -ln(P) where P is the probability that the observed counts are a
natural fluctuation of the background. Thus, a large value of -ln(P) means
it is likely that there is a source at that position. The final task in program
A is to write like_array to a new FITS file.
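
The sliding-box loops described above might look something like the sketch below. It leaves out the FITS reading and writing, and the function name make_like_array is hypothetical. The likelihood function is passed in as an argument so that the block stands alone. The small demonstration uses a made-up stand-in (counts minus background) instead of the real like_calc, purely to show which pixels get filled.

```python
import numpy as np


def make_like_array(image, background, like_calc):
    """Fill a likelihood map using a 5 by 5 sliding box (hypothetical helper)."""
    # The first numpy index is Y, the second is X.
    yi_size, xi_size = image.shape
    like_array = np.zeros((yi_size, xi_size), dtype=float)

    # range(2, n-2) runs from 2 to n-3 inclusive, as the question requires.
    for xi in range(2, xi_size - 2):
        for yi in range(2, yi_size - 2):
            # A slice A:B stops at B-1, so yi-2:yi+3 covers yi-2 .. yi+2.
            box_counts = int(image[yi - 2:yi + 3, xi - 2:xi + 3].sum())
            box_background = float(background[yi - 2:yi + 3, xi - 2:xi + 3].sum())
            like_array[yi, xi] = like_calc(box_counts, box_background)
    return like_array


# Demonstration on made-up data: one bright pixel on a flat background.
image = np.zeros((7, 8), dtype=int)
image[3, 4] = 9
background = np.full((7, 8), 0.1)
demo = make_like_array(image, background, lambda c, b: c - b)
```

The two-pixel border of the output is never visited by the loops, so it keeps its initial value of zero.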
Now run program A.
You may wish, for purposes of debugging, to compare your like_array with
the precalculated one available, which is named like_map.fits. If yours doesn't
match, use the precalculated file as the input for program B. In this case you
should submit your program A code for assessment just the same - don't
discard it - if the fault in it is minor, your assessment won't be severely
affected.
b) Program B is intended to look for peaks in a map of likelihood values. This
program requires one input file and produces one output file. The input file is a
FITS file which contains like_array values (preferably the output file created
by program A above, but if you need to, you can use the precalculated file
named like_map.fits). The output file is a new FITS file which contains a
binary table extension, in which is stored the pixel coordinates and the
like_array value for each detected source. This table should have 1 row per
detected source, and 3 columns: two integer-valued columns for the X and Y
pixel locations of the source, and one floating-point column for the like_array
value.
The algorithm you must employ is described as follows:
- Read a value for the variable like_cutoff from the command line.
- Read the like_array data values from the FITS file into a numpy array.
- Read the dimensions of this array into variables xi_size and yi_size, as in
  program A.
- Declare (in other words construct, set up, or initialize) an empty list named
  source_list.
- Declare a tally integer by setting it to zero. This is to contain the total
  number of peaks detected.
- Write now a set of nested loops, very similar to the loops in program A,
  except now you want, at each loop iteration, a 3 by 3 patch of like_array.
  You will thus need xi to start at 1 and end at xi_size-2, and similarly for yi.
  - Within these loops, obtain a 3 by 3 slice of like_array from xi-1 to xi+1
    and yi-1 to yi+1.
  - If the value in the central pixel of this 3 by 3 square patch is larger than
    the values in all of the other 8 pixels, do the following:
    - Extract the value of like_array at this pixel. I'll call this number
      like_value for convenience.
    - Add 1 to the tally integer.
    - If like_value >= like_cutoff, then append the following dictionary
      to source_list: {'xi':xi, 'yi':yi, 'like':like_value}.
- The final task is to save source_list to a FITS file. You'll have to construct
  a binary table, with three columns. Use the supplied FITS file sources.fits
  as a pattern.
- Finally, write three keywords to the header of the table:
  - Write the value of the tally integer to a keyword named 'N_PEAKS'.
  - Write the value of like_cutoff to a keyword named 'LIKE_CUT'.
  - Set the name of the table extension to 'SRCLIST' by writing this value
    to a keyword named 'EXTNAME'.
Now run program B. Use a value of 8 for like_cutoff.
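
The peak-finding part of program B could be sketched roughly as below. It leaves out the FITS input and output, and the function name find_peaks is hypothetical. The peak test used here (the centre equals the patch maximum and that maximum occurs only once) is one way of expressing "larger than all of the other 8 pixels"; other formulations are equally valid.

```python
import numpy as np


def find_peaks(like_array, like_cutoff):
    """Return (source_list, tally) for strict local maxima (hypothetical helper)."""
    yi_size, xi_size = like_array.shape  # first index is Y
    source_list = []
    tally = 0

    for xi in range(1, xi_size - 1):      # 1 .. xi_size-2
        for yi in range(1, yi_size - 1):  # 1 .. yi_size-2
            patch = like_array[yi - 1:yi + 2, xi - 1:xi + 2]
            like_value = patch[1, 1]
            # Strict peak: the centre is the maximum and nothing else ties with it.
            is_peak = (like_value == patch.max()
                       and np.count_nonzero(patch == like_value) == 1)
            if is_peak:
                tally += 1
                if like_value >= like_cutoff:
                    source_list.append(
                        {'xi': xi, 'yi': yi, 'like': float(like_value)})
    return source_list, tally


# In program B the cutoff would come from the command line, for example
# like_cutoff = float(sys.argv[1]); a fixed value is used here instead.
demo = np.zeros((5, 6))
demo[2, 2] = 10.0  # a peak above the cutoff
demo[3, 4] = 5.0   # a peak below the cutoff
sources, n_peaks = find_peaks(demo, 8.0)
```

Both bright pixels count towards the tally, but only the one at or above the cutoff ends up in the source list.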
To ensure full assessment, you should return three computer files to the examiner:
comprising two text files, which contain working copies of programs A and B
respectively; plus the FITS file which contains your output source list.
Hints and reminders:
- Don't forget that the first index of a 2-dimensional numpy array is the Y index,
  not the X index. If you get the order wrong, your errors will probably cancel out
  in this exercise; but doing so may lose you some assessment.
- Be careful with the range() function and with slicing. Remember that
  range(A,B) for example returns the list [A,A+1,...,B-1]; remember that a slice
  such as myarray[A:B] returns a sub-array starting at element A and ending at
  element B-1.
- Check that the various parts of your program are functioning properly before (or
  at least, as well as) testing the whole.
- The FITS format strings associated with integer and floating-point values are 'I'
  and 'E' respectively.
- See chapter 2 of the pyfits manual for how to write table data to a new FITS file.
- Every FITS file must contain a primary Header-Data Unit (HDU); and its data
  must be an array, not a table. But an empty primary HDU is perfectly acceptable
  and can be constructed by calling pyfits.PrimaryHDU() with no arguments.
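
The first two reminders (index order, and the half-open behaviour of range() and slices) can be checked interactively with a few lines such as these. The array is made up for illustration. In current versions of Python, range() returns a lazy sequence and not a list, hence the list() call.

```python
import numpy as np

# A made-up image with 3 rows (Y) and 4 columns (X).
image = np.arange(12).reshape(3, 4)

# The shape is reported as (Y size, X size).
yi_size, xi_size = image.shape

# range(A, B) stops at B-1.
indices = list(range(2, 5))

# A slice A:B also stops at B-1: this picks X = 0, 1, 2 from row Y = 1.
row_part = image[1, 0:3]

# Pixel (X=1, Y=2) is addressed with the Y index first.
pixel = image[2, 1]
```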