Image Subtraction or... If I Could Redo Everything Again for PTF, This Is What I Would Do...
Peter Nugent (LBNL/UCB)

Things to Know
• Understand the instrument and changes to it. De-trending is key to getting off to a good start: talk to the instrument scientists!
• NEVER be happy with what you have:
  – Speed/turn-around
  – Types of db queries
  – References
  – Catalogs (stars, galaxies, etc.)
• You do not need to visit the observatory! I have processed ~1 PB of data (20M CCD chips) between Palomar-QUEST and PTF. I did not have to go to the mountain, the mountain came to me...
• Know what science the collaboration would like to achieve:
  – Try to accommodate everything from the start
  – Be flexible enough to adapt mid-way
  – Always look for new scientific opportunities
  – Learn their science
• Do not mix image subtraction with other parts of the pipeline

iPTF Summer School

PTF Pipeline
50-100 GB/night

Image Subtraction
There are two types of image subtraction and they should never be confused:
• Real-Time
  – Goal is to identify transients
  – Photometry should be good, but does not have to be perfect; in principle it cannot be
• Final Photometry
  – Good enough to write a paper on cosmology
  – Strives for perfection
  – Major advantage: you know where the object is... zoom in, pick your calibration stars, make perfect references, etc.

What is out there
• hotpants, by Andy Becker: High Order Transform of PSF ANd Template Subtraction
  http://www.astro.washington.edu/users/becker/v2.0/hotpants.html
• There are a few variants (and you will hear more about one tomorrow) but they all have the same form:
  – Make a reference image
  – Align and convolve with a new image
  – Perform a subtraction
  – Identify the candidates

hotpants
  hotpants -inim ${new} -hki -n i -c t -tmplim ${refremap} -outim ${sub} \
    -tu ${template_saturation} -iu ${new_saturation} \
    -tl ${template_lower} -il ${input_lower} \
    -r ${2.5*seeing} -rss ${6.0*seeing} \
    -tni ${refremapnoise} -ini ${newnoise} -imi ${submask} \
    -nsx ${nsx} -nsy ${nsy}
• -hki : verbose output
• -c t : convolve the template
• -n i : normalize to the image
• -r, -rss : 2.5 and 6.0 times the seeing (${2.5*seeing} is shorthand; compute the value before calling)
• nsx & nsy : size of regions within image (128x128 pixels ~ 2.5’)
• submasks are key to getting things right (bad pixels kill)
• I used the standard 3 Gaussian & 6 degree polynomial for the kernel. No need to do more or less.

Reference
Ideally the reference comes from one image, contributes no noise in the subtraction, and is of comparable seeing. Nothing is ideal:
• PTF had a dead chip.
• Pointing was atrocious; it became ~1’ after improvements.
• Took ~3 months to obtain images from each field that could make up a good reference.
• Photometric calibration was the USNO B1 catalog!
• Constantly made an effort to make better reference images during the survey.
• Settled on ~7 images with the best seeing (but not undersampled) to make a reference on a PTF field/chip basis: depth, area & bad pix.

New
• Don’t settle for having the survey forced down your throat; complain when things are going wrong!
• Demand that FITS header keywords are right, say for example the FILTER: this separates you from them
• Know what the pointing/survey strategy is ahead of time (hitting M31 30 times in one night causes problems if you are not prepared for it)
• Don’t bother with subtractions when they are not needed (|galactic latitude| < 10°)
• Everything is relative; treat the references as gold for photometric and astrometric calibration. Work out differences with the universe later (HST guide stars, absolute photometric calibration, etc.)

New - Ref = Sub
[Figure: Reference Image, New Image and Subtraction panels; the moon is labeled]
This will always be a needle in a haystack problem.
• Per image we would have ~250 5-σ detections.
• We would require 2 independent detections.
• Up to 300 images taken per night ~ 1000 sq. deg.
Use Machine Learning to get rid of the crap... Do not attempt to make the perfect subtraction!

PTF Sky Coverage
References were made for ~20000 sq. deg. in R-band (minimum 7 minutes w/ seeing < 3.0” and limiting magnitude > 19.9).

NERSC
• Access through general DOE HEP call for compute time at NERSC.
• 3B cpu hrs / year
• Hopper (N6): Cray XE6 Opteron w/ 153,216 cores
• Edison (N7): Cray XC30 Intel Ivy Bridge w/ 133,824 cores
• Cori (N8) will be one of the first large Intel KNL systems and will have unique data capabilities. 9,300 single-socket nodes with 60 cores per node and burst buffer (NVRAM) for the entire memory footprint.
• NERSC has a Global Filesystem which is viewable from all compute systems (40 GB/s). Very high-speed local scratch space on each of the big-irons (168 GB/s)
• 240 PB tape archive
• Data Transfer nodes using ESnet
• Science Gateway and Database nodes for access outside NERSC

Why NERSC
• Why buy the cow, when you get the milk for free?
• You always want ~10X the compute you need to run a single night on hand at any time to catch up (network, shutdowns, new refs, etc.)
• The subtractions are the source of all complaints, whether they are justified or not.
  – Where are my fields from last night?
  – How come it is taking so long to see the subs?
  – What is my SN/CV/GRB doing now?
  Thus you don’t want computing to be one of them.
• NERSC operates 24/7 with staff on-call for issues that come up round the clock.
• As PTF was special, 100 khrs/yr but real-time, we were granted special privileges: special queues, db’s, global disk space, etc.
• On average there are 3-4 shutdowns per year: all moved to full moon since 2009.

[Diagram labels: Observatory; Pipeline Processing/db; Data Transfer Nodes; Science Gateway Node 1; Science Gateway Node 2; Subtractions; PTF Collaboration via Web; Carver; NERSC GLOBAL FILESYSTEM 250TB (170TB used)]

PTF db
• Chose a Postgres db with q3c for spatial queries
• Based on studies comparing Oracle, MySQL and Postgres
• Runs at NERSC on their scidb nodes: 32-core nodes on a ZFS filesystem
• This currently houses the iPTF database, which has ~3M images and ~1.5B detections that are queried in real time 24/7.
ZFS is a combined file system and logical volume manager designed by Sun Microsystems. The features of ZFS include protection against data corruption, support for high storage capacities, efficient data compression, integration of the concepts of filesystem and volume management, snapshots and copy-on-write clones, continuous integrity checking and automatic repair, RAID-Z and native NFSv4 ACLs.

q3c
Q3C is a plugin for the PostgreSQL database, designed for working with large astronomical catalogs or any catalogs of objects on the sphere. Q3C allows you to perform fast circular, elliptical or polygonal searches on the sphere as well as fast positional cross-matches and nearest neighbor queries. It is similar to HTM (Hierarchical Triangular Mesh). The ideas behind Q3C are described in Koposov et al.
(2006)

PTF Database
                R-band    g-band
images          1.82M     305k
subtractions    1.52M     146k
references      29.2k     6.3k
candidates      890M      197M
transients      42945     3120
All in 851 nights. An image is an individual chip (~0.7 sq. deg.). The database reached 1 TB.

Turn-around
What does “real-time” subtractions really mean?
In the last 2 years of PTF, for 95% of the nights all images were processed, subtractions were run, candidates were put into the database and the local universe script was run in < 1 hr after observation.
Median turn-around is 30m.
[Figure, 2012-07-06: number of subtractions vs. minutes from observation to candidates in database]

[Pipeline diagram labels: Palomar 48” Telescope; HPWREN Microwave Relay; SDSC to ESNET; NERSC Data Transfer Node; 100 TBs of Reference Imaging; Computing, I/O; Networking, Data Transfer; Astrometric Solution; Reference Image Creation; Image Processing / Detrending; Image Subtraction; Nightly Image Stacking; Star/Asteroid Rejection; Transient Candidate; Real-Bogus ML Screening; 500 GB/night; Scanning Page; Publish to Web; Web UI; Outside Database for Triggers; Marshal]

Wake Me Up: Real-Time Trigger
[Diagram labels: Real-Time Trigger; 40 Minutes; Heavy DB Access; 1.5B objects in D; Outside Telescope Follow-up]

Future Surveys
Telescope    AΩ
iPTF/PTF     8.7
DES          11.7
ZTF          42.6
LSST         82.2
[Figure labels: ZTF (46 deg²), iPTF (7.2 deg²)]
ZTF image processing will be more challenging as the goal will be to do everything even faster and it is 12 times more data.

Parallel Processing/Subtractions
• All computers will have many cores (10-100), and the same amount of memory, 2+ years from now.
• Current pipelines work at the level of one CCD chip per core; this will fail in the future.
• Need to parallelize all aspects of the pipeline where possible.
• Threading is easy for most of this; keeping things in memory where possible is ideal:
  – Astrometric catalog matching
  – Bad pixel masks, CR’s
  – Flats, biases, masks, etc.
  – Asteroid rejection (verification)
  – Comparison with historical transients

Bottlenecks… crude vs. real
[Figure: brightness vs. time; label: 5-σ data in db]

Conclusions - Future
• LSST - 15TB data/night
• Only one 30-m telescope
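
Appendix sketch: assembling the hotpants call
The hotpants slide writes the kernel radii as ${2.5*seeing} and ${6.0*seeing}, which a shell will not evaluate as written. The sketch below is one way to build the same argument list in Python, with the arithmetic done first. It uses only the flags shown on that slide. The function name, the parameter names and the assumption that seeing is given in pixels are illustrative, not part of the PTF pipeline.

```python
import shlex


def build_hotpants_command(new, refremap, sub, seeing,
                           template_saturation, new_saturation,
                           template_lower, input_lower,
                           refremapnoise, newnoise, submask,
                           nsx, nsy):
    """Return the hotpants argument list from the slide as a list of strings.

    Hypothetical helper: `seeing` is assumed to be in pixels, and the
    2.5x / 6.0x scalings for -r and -rss are taken from the slide.
    """
    return [
        "hotpants",
        "-inim", new,
        "-hki",
        "-n", "i",
        "-c", "t",
        "-tmplim", refremap,
        "-outim", sub,
        "-tu", str(template_saturation),
        "-iu", str(new_saturation),
        "-tl", str(template_lower),
        "-il", str(input_lower),
        "-r", f"{2.5 * seeing:.2f}",
        "-rss", f"{6.0 * seeing:.2f}",
        "-tni", refremapnoise,
        "-ini", newnoise,
        "-imi", submask,
        "-nsx", str(nsx),
        "-nsy", str(nsy),
    ]


if __name__ == "__main__":
    # File names and numeric values here are placeholders, not PTF settings.
    cmd = build_hotpants_command(
        new="new.fits", refremap="ref.remap.fits", sub="sub.fits",
        seeing=2.0,
        template_saturation=50000, new_saturation=50000,
        template_lower=-100, input_lower=-100,
        refremapnoise="ref.remap.noise.fits", newnoise="new.noise.fits",
        submask="sub.mask.fits", nsx=16, nsy=32,
    )
    # Print a shell-quoted command line for inspection; pass `cmd` to
    # subprocess.run(cmd, check=True) to actually execute it.
    print(shlex.join(cmd))
```

Keeping the command as a list rather than a single string avoids shell quoting problems with file paths and makes it easy to log exactly what was run for each chip.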