Computational Astronomy 2009 exam part A (the 'take home' exam)

Pages in this document: 6

Declaration:

1. I know that plagiarism is wrong. Plagiarism is to use another's work and pretend that it is one's own.
2. All material I have written in answer to these exam questions is my own work.
3. I have not allowed, and will not allow, anyone to copy my work with the intention of passing it off as his or her own work.

Signature ______________________________

Answers to this exam should take the form of electronic files which should be sent via email to ims@ast.uct.ac.za. Your email must bear a date stamp before 0945 local time, Monday 1 June 2009. You should also sign the plagiarism declaration on page 1 and return this to your lecturer Ian Stewart at your earliest convenience. Your answers to this exam will not be assessed until you do so.

For the purpose of answering the questions in this exam, you may make use of any resources consistent with the plagiarism declaration on page 1.

You are advised to read the question paper carefully before starting work on it - in particular, the hints and reminders towards the end. If you make a mistake through too-hasty reading, you may lose some assessment. 'Evil' coding practices may also lose some assessment.

The four FITS files mentioned in this exam, namely xmm_image.fits, background_map.fits, like_map.fits and source.fits, are available at

http://www.ast.uct.ac.za/~ims/teaching/comp_astron/exam/

The questions.

1) For this question, you are required to write two python programs, which I will call program A and program B. Taken together, the two programs implement a sliding-box algorithm for detecting sources in x-ray images.

a) Program A is intended to create a map of the likelihood that the x-rays in an x-ray image come from point sources rather than continuous background. This program requires two input files and produces one output file. File input 1 contains an image of the sky at x-ray wavelengths.
File input 2 contains an image (of the same dimensions as input 1) which records the expected amount of background in each pixel. The output file should contain an image which records, in each pixel, the likelihood that the x-ray counts there come only from the background. The algorithm you must employ is described as follows:

- The first thing you must do is write a function named like_calc, with a call and return structure indicated as follows:

    def like_calc(counts, background):
        """This function does three things:
        - Calculates the Cash log-likelihood ratio C.
        - Calculates the probability that C would occur as a natural
          fluctuation if the null hypothesis is true: namely that all the
          observed counts are just due to the background.
        - Calculates -log of this probability.

        The formula for C in the present case is

          C = 2*(counts*log(counts/background)-counts+background).

        The probability P of the null hypothesis is

          P = Q(1/2.0, C/2.0)

        where Q is the complementary incomplete gamma function. The present
        function returns -log(P).

        The input argument 'counts' is a scalar integer; 'background' is a
        scalar floating-point; and the returned value 'minus_log_P' is a
        scalar floating-point."""

        # The remaining statements come here.

        return minus_log_P

- Your function algorithm should contain the following three tests:

  1. Test for counts<=0 or background<=0. If either is true, the expression for the Cash statistic C cannot be evaluated: make the function return the value 0 in these cases.
  2. Test for counts<background. Such negative fluctuations can't be sources. Again, make like_calc return a value of 0 in this case.
  3. For safety, you should also test the value of C before calling the Q function. If C<=0, the Q function may return an error. In these cases you should also make like_calc return a value of 0.

- Now you should begin the main part of program A.
  The first thing you must do is read into separate numpy arrays the following two FITS files:

  - xmm_image.fits (this file contains integer values, which should be stored in an integer-valued array)
  - background_map.fits (this file contains floating-point values)

- Obtain the X and Y dimensions of the first array. (You may assume that the second array has the same dimensions.) For convenience in explaining the rest of the exercise, I will assume that these dimensions are stored in two variables named xi_size and yi_size respectively.

- Construct a new numpy array, to be called like_array. This array should have the same dimensions as the input images; its data type should be floating-point; and it should initially be filled with zero values.

- Write 2 nested loops. The outer loop index (let's call it xi) should have a starting value of 2 and a final value of xi_size-3; the inner loop index (let's call it yi) should have a starting value of 2 and a final value of yi_size-3.

- Inside both loops, add together the pixel values in the 5 by 5 'slice' of the xmm_image array which extends in the X direction from xi-2 to xi+2 and in the Y direction from yi-2 to yi+2.

- Do the same with the similar 5 by 5 slice from the background_map array.

- At this point, still inside the two loops, you should have 2 numbers: the first, an integer, being the total number of x-ray events detected within the 5 by 5-pixel patch or box of xmm_image centred on pixel (xi,yi); the second, a floating-point value, being the total expected background counts within the same box. You should then call the function like_calc with these two numbers as input arguments.

- Write the function return value to pixel (xi,yi) of like_array. That is the last statement which needs to go inside the nested loops.

After finishing the loops, like_array should be full of data. These values represent -ln(P) where P is the probability that the observed counts are a natural fluctuation of the background.
Thus, a large value of -ln(P) means it is likely that there is a source at that position. The final task in program A is to write like_array to a new FITS file.

Now run program A. You may wish, for purposes of debugging, to compare your like_array with the precalculated one available, which is named like_map.fits. If yours doesn't match, use the precalculated file as the input for program B. In this case you should submit your program A code for assessment just the same - don't discard it - if the fault in it is minor, your assessment won't be severely affected.

b) Program B is intended to look for peaks in a map of likelihood values. This program requires one input file and produces one output file. The input file is a FITS file which contains like_array values (preferably the output file created by program A above, but if you need to, you can use the precalculated file named like_map.fits). The output file is a new FITS file which contains a binary table extension, in which are stored the pixel coordinates and the like_array value for each detected source. This table should have 1 row per detected source, and 3 columns: two integer-valued columns for the X and Y pixel locations of the source, and one floating-point column for the like_array value. The algorithm you must employ is described as follows:

- Read a value for the variable like_cutoff from the command line.

- Read the like_array data values from the FITS file into a numpy array.

- Read the dimensions of this array into variables xi_size and yi_size, as in program A.

- Declare (in other words construct, set up, or initialize) an empty list named source_list.

- Declare a tally integer by setting it to zero. This is to contain the total number of peaks detected.

- Now write a set of nested loops, very similar to the loops in program A, except that now you want, at each loop iteration, a 3 by 3 patch of like_array. You will thus need xi to start at 1 and end at xi_size-2, and similarly for yi.
- Within these loops, obtain a 3 by 3 slice of like_array from xi-1 to xi+1 and yi-1 to yi+1.

- If the value in the central pixel of this 3 by 3 square patch is larger than the values in all of the other 8 pixels, do the following:

  - Extract the value of like_array at this pixel. I'll call this number like_value for convenience.
  - Add 1 to the tally integer.
  - If like_value>=like_cutoff, then append the following dictionary to source_list: {'xi':xi, 'yi':yi, 'like':like_value}.

- The final task is to save source_list to a FITS file. You'll have to construct a binary table, with three columns. Use the supplied FITS file sources.fits as a pattern.

- Finally, write three keywords to the header of the table:

  - Write the value of the tally integer to a keyword named 'N_PEAKS'.
  - Write the value of like_cutoff to a keyword named 'LIKE_CUT'.
  - Set the name of the table extension to 'SRCLIST' by writing this value to a keyword named 'EXTNAME'.

Now run program B. Use a value of 8 for like_cutoff.

To ensure full assessment, you should return three computer files to the examiner: two text files, which contain working copies of programs A and B respectively; plus the FITS file which contains your output source list.

Hints and reminders:

- Don't forget that the first index of a 2-dimensional numpy array is the Y index, not the X index. If you get the order wrong, your errors will probably cancel out in this exercise; but doing so may lose you some assessment.

- Be careful with the range() function and with slicing. Remember that range(A,B) for example returns the list [A,A+1,...,B-1]; remember that a slice such as myarray[A:B] returns a sub-array starting at element A and ending at element B-1.

- Check that the various parts of your program are functioning properly before (or at least, as well as) testing the whole.

- The FITS format strings associated with integer and floating-point values are 'I' and 'E' respectively.
- See chapter 2 of the pyfits manual for how to write table data to a new FITS file.

- Every FITS file must contain a primary Header-Data Unit (HDU); and its data must be an array, not a table. But an empty primary HDU is perfectly acceptable and can be constructed by calling pyfits.PrimaryHDU() with no arguments.
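
The hints about index order, range() and slicing can be tried out on a toy image before touching the real data. The following is only a sketch: it uses plain nested lists in place of numpy arrays so that it runs without extra packages, and the image size, the pixel values and the helper name box_sum are all invented for illustration. With a numpy array the same 5 by 5 box would be written image[yi-2:yi+3, xi-2:xi+3].

```python
# Hypothetical 6-row by 7-column image stored as nested lists; the value at
# column x of row y is 10*y + x, so every pixel is easy to recognise.
y_size, x_size = 6, 7
image = [[10 * y + x for x in range(x_size)] for y in range(y_size)]

# The first index selects the row (Y), the second the column (X).
pixel = image[1][4]  # row y=1, column x=4

# range(2, x_size - 2) stops one short of its second argument, so the loop
# index runs from 2 to x_size - 3 inclusive, as the 5 by 5 box requires.
xi_values = list(range(2, x_size - 2))


def box_sum(img, xi, yi, half_width):
    """Sum the square box of pixels centred on column xi, row yi."""
    # Slice in Y first; the +1 is needed because a slice excludes its end.
    rows = img[yi - half_width:yi + half_width + 1]
    return sum(sum(row[xi - half_width:xi + half_width + 1]) for row in rows)


# 5 by 5 box centred on (xi, yi) = (3, 2): columns 1 to 5, rows 0 to 4.
total = box_sum(image, 3, 2, 2)
```

Printing pixel, xi_values and total, and comparing them with values worked out by hand, is a quick way to confirm that the index order and the slice limits are what you intended.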
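
When checking like_calc it can help to have an independent evaluation of the two formulas in its docstring. The sketch below is not the required function: it omits the three tests, and the function names and the box totals are made up for illustration. It relies on the identity Q(1/2, x) = erfc(sqrt(x)), which holds for x >= 0, so that only the standard math module is needed; in scipy the same Q function is provided as scipy.special.gammaincc.

```python
import math


def cash_statistic(counts, background):
    """Cash log-likelihood ratio C for observed counts against expected background."""
    return 2.0 * (counts * math.log(counts / background) - counts + background)


def minus_log_p(c_value):
    """Return -ln(P), where P = Q(1/2, C/2) is evaluated as erfc(sqrt(C/2))."""
    p = math.erfc(math.sqrt(c_value / 2.0))
    return -math.log(p)


# Hypothetical box totals, chosen only for illustration.
c_equal = cash_statistic(10, 10.0)    # counts equal to the background
c_excess = cash_statistic(30, 10.0)   # counts three times the background
like_excess = minus_log_p(c_excess)
```

Note that for very large C the probability underflows to zero and the logarithm fails, so this form is suitable for spot checks on moderate values rather than for a whole image.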
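
The peak test in program B can likewise be exercised on hand-made patches before it is run on a real likelihood map. This sketch assumes that the central pixel must be strictly larger than all eight neighbours, so that a tie does not count as a peak; the helper name is_strict_peak, the patch values and the pixel positions are invented for illustration, and nested lists indexed [y][x] again stand in for numpy slices.

```python
def is_strict_peak(patch):
    """True if the centre of a 3 by 3 patch exceeds all eight neighbours."""
    centre = patch[1][1]
    neighbours = [patch[y][x] for y in range(3) for x in range(3)
                  if (y, x) != (1, 1)]
    return all(centre > value for value in neighbours)


# Hypothetical patches, indexed [y][x].
peak_patch = [[0.0, 1.0, 0.0],
              [2.0, 9.5, 3.0],
              [0.0, 1.5, 0.0]]
tied_patch = [[0.0, 1.0, 0.0],
              [2.0, 9.5, 9.5],   # the right-hand neighbour ties with the centre
              [0.0, 1.5, 0.0]]

like_cutoff = 8.0
source_list = []
tally = 0
for xi, yi, patch in [(4, 7, peak_patch), (5, 7, tied_patch)]:
    if is_strict_peak(patch):
        tally += 1                      # every peak is counted
        like_value = patch[1][1]
        if like_value >= like_cutoff:   # only peaks above the cutoff are kept
            source_list.append({'xi': xi, 'yi': yi, 'like': like_value})
```

With a strict comparison, flat regions of like_array (for example the border pixels that program A leaves at zero) are never counted as peaks.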