Weak Lensing Data Analysis Challenges GRavitational lEnsing Accuracy Testing Thomas Kitching

Weak Lensing: Qualitative
Foreground galaxies
Weak Lensing: Quantitative
 Matrix distortion of each galaxy image
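A minimal sketch of this matrix distortion, assuming the common weak-lensing convention in which a reduced shear (g1, g2) maps image-plane coordinates to source-plane coordinates through A = [[1 - g1, -g2], [-g2, 1 + g1]], with the convergence absorbed. The function names and the shear value 0.03 are illustrative and not taken from the talk.

```python
import numpy as np

def distortion_matrix(g1, g2):
    """Assumed convention: source-plane position = A @ image-plane position."""
    return np.array([[1.0 - g1, -g2],
                     [-g2, 1.0 + g1]])

def lensed_outline(g1, g2, n=360):
    """Outline of the lensed image of a circular source of unit radius."""
    theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    circle = np.vstack([np.cos(theta), np.sin(theta)])  # source boundary points
    # Image points x satisfy A @ x = source point, so solve for x.
    return np.linalg.solve(distortion_matrix(g1, g2), circle)

outline = lensed_outline(0.03, 0.0)
semi_major = np.abs(outline[0]).max()  # along x when g1 > 0
semi_minor = np.abs(outline[1]).max()
axis_ratio = semi_minor / semi_major   # equals (1 - g) / (1 + g) for this convention
```

Under this convention a circle becomes an ellipse whose axis ratio is (1 - g)/(1 + g), so a shear of a few per cent changes the shape by only a few per cent.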
The Promise and the Problem
 Weak lensing has the potential to become one of the most
powerful cosmological probes
PS1 Weak Lensing
 Dark Energy Task Force
 “the method with the greatest
potential for constraining dark energy”
Planck CMB
 Have a “wall” of data arriving
(next 5 years and beyond)
 Pan-STARRS (20,000)
 DES (5000)
 Euclid (20,000), ≈3x10^9 galaxies
Small Effects, Big Issues
 Weak lensing has inherent systematic effects
 Instrumental, theoretical, astrophysical
 However these can all be potentially removed
 Intrinsic alignments
 Photometric redshift errors
 Shape measurement systematics
 The bias in the shape measurement needs to be of
order 10^-3
 For Euclid/LSST-like surveys, need Q = 10^-4/MSE >~ 1000
 Current methods have Q ~> 100
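The quality factor can be sketched as follows, assuming the simplified form quoted on the slide, Q = 10^-4 / MSE, with MSE taken as the mean squared difference between estimated and true shear. The full challenge definition may average within image sets first. The function name and the test shears are illustrative.

```python
import numpy as np

def quality_factor(g_est, g_true):
    """Q = 1e-4 / MSE, where MSE is the mean squared shear error (assumed simplified form)."""
    g_est = np.asarray(g_est, dtype=float)
    g_true = np.asarray(g_true, dtype=float)
    mse = np.mean((g_est - g_true) ** 2)
    return 1e-4 / mse

# Illustrative values: every estimate is off by 3.16e-4 in absolute shear.
g_true = np.array([0.02, -0.01, 0.03, 0.00])
g_est = g_true + 3.16e-4 * np.array([1.0, -1.0, 1.0, -1.0])
q = quality_factor(g_est, g_true)  # close to 1000
```

Because Q scales as the inverse of the mean squared error, going from Q of about 100 to Q of about 1000 means reducing the rms shear error by roughly a factor of three.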
Why is shape measurement hard?
 Galaxies are not circles or ellipses! (complex shapes)
 Galaxy orientations may align during formation
 Intrinsic alignments
 Telescope and atmosphere convolve image
 point spread function (psf)
 spatially varying
 time varying
 CCD responsivity, cosmic rays, meteors, unresolved
sources, variable atmosphere, saturated stars
 Pixelisation of images (~sum of light over pixel)
 Partial and patchy sky coverage
 We don’t have galaxy distances, only uncertain redshifts
Typical galaxy used for cosmic shear analysis
Typical star used for finding the convolution kernel
Gravitational Lensing
Galaxies seen through dark matter distribution
analogous to light seen through your bathroom window
Cosmic Lensing
Real data:
Atmosphere and Telescope
Convolution with kernel
Real data: Kernel size ~ Galaxy size
Sum light in each square
Real data: Pixel size ~ Kernel size /2
Mostly Poisson. Some Gaussian and bad pixels.
Uncertainty on total light ~ 5 per cent
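The chain above (lensing, convolution with a kernel, summing light in pixels, adding noise) can be sketched as a toy forward simulation. Everything specific here is an assumption made for illustration: a circular Gaussian source (real galaxies are not Gaussian), a Gaussian kernel, and parameter values chosen only to roughly echo the ratios on the slide (kernel size about equal to galaxy size, pixel size about half the kernel width, about 5 per cent uncertainty on the total light).

```python
import numpy as np

def simulate_galaxy_image(g1=0.03, g2=0.0, n_fine=128, block=4,
                          gal_sigma=3.4, psf_sigma=3.4, flux=400.0,
                          read_noise=0.1, seed=0):
    """Toy forward model: shear -> convolve -> pixelise -> add noise."""
    rng = np.random.default_rng(seed)
    y, x = np.indices((n_fine, n_fine)) - n_fine // 2  # fine-grid coordinates

    # 1. Lensing: evaluate a circular Gaussian source at sheared coordinates.
    xs = (1.0 - g1) * x - g2 * y
    ys = -g2 * x + (1.0 + g1) * y
    galaxy = np.exp(-0.5 * (xs ** 2 + ys ** 2) / gal_sigma ** 2)
    galaxy *= flux / galaxy.sum()

    # 2. Atmosphere and telescope: convolve with a normalised Gaussian kernel.
    psf = np.exp(-0.5 * (x ** 2 + y ** 2) / psf_sigma ** 2)
    psf /= psf.sum()
    kernel_ft = np.fft.rfft2(np.fft.ifftshift(psf))  # move kernel centre to the origin
    blurred = np.fft.irfft2(np.fft.rfft2(galaxy) * kernel_ft, s=galaxy.shape)

    # 3. Pixelisation: sum the light in each block x block square.
    n_pix = n_fine // block
    pixels = blurred.reshape(n_pix, block, n_pix, block).sum(axis=(1, 3))
    pixels = np.clip(pixels, 0.0, None)  # remove tiny negative FFT round-off

    # 4. Noise: mostly Poisson, plus a small Gaussian component.
    noisy = rng.poisson(pixels) + rng.normal(0.0, read_noise, pixels.shape)
    return pixels, noisy

clean, noisy = simulate_galaxy_image()
```

Shape measurement is the inverse problem: recover g1 and g2 from `noisy` given only star images that sample the kernel.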
 Weak lensing community set a series of blind
challenges 2004-2008
 Constant shear, PSF unknown, object detection
 Shear TEsting Programme (STEP)
 STEP 1
 simplistic galaxy shapes (Heymans et al 2005)
 STEP 2
 more realistic galaxies (Massey et al 2006)
 STEP 3
 difficult (space) PSFs (Rhodes et al 2009)
Variance with PSF type (Heymans et al 2005)
→ Results on current data are reliable
But for the future we require 0.03%
Community Methods
 Two broad classes
 Model independent
 Kaiser, Squires & Broadhurst 1995 (KSB) Q ~ 10
– Quadrupole moments of image
– Works surprisingly well despite many flaws
 Model fitting
 Shapelets Q ~ 10-50
Refregier, Bacon, Bernstein, Jarvis, Nakajima et al
A basis set for galaxies
“Quantum Mechanics” inspired basis set
Polynomials times a Gaussian (Laguerre polynomials)
 Lensfit v1.0 (T. Kitching, L. Miller) Q >~120
Fits realistic galaxy shapes (exponential profiles)
Bayesian estimator to remove bias
Works on individual exposures (PSFs), combines optimally
Currently best method that works on individual galaxies
More than good enough for PS1-2, DES, KiDS, CFHTLS
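As an illustration of the model-independent class, here is a minimal sketch of an ellipticity estimate from unweighted quadrupole moments of an image. It is not KSB itself: KSB additionally uses a weight function and corrects for the PSF, both omitted here. The function name and the test image are illustrative.

```python
import numpy as np

def quadrupole_ellipticity(image):
    """Ellipticity (e1, e2) from unweighted second moments of the light distribution."""
    image = np.asarray(image, dtype=float)
    y, x = np.indices(image.shape)
    total = image.sum()
    xc = (x * image).sum() / total  # centroid
    yc = (y * image).sum() / total
    qxx = ((x - xc) ** 2 * image).sum() / total
    qyy = ((y - yc) ** 2 * image).sum() / total
    qxy = ((x - xc) * (y - yc) * image).sum() / total
    e1 = (qxx - qyy) / (qxx + qyy)
    e2 = 2.0 * qxy / (qxx + qyy)
    return e1, e2

# Noise-free elliptical Gaussian with sigma_x = 4 and sigma_y = 2 pixels.
y, x = np.indices((65, 65)) - 32
blob = np.exp(-0.5 * ((x / 4.0) ** 2 + (y / 2.0) ** 2))
e1, e2 = quadrupole_ellipticity(blob)  # e1 near (16 - 4) / (16 + 4) = 0.6, e2 near 0
```

On real data the unweighted sums are dominated by pixel noise far from the galaxy, which is one reason practical methods introduce a weight function and then have to correct for it.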
Beyond STEP
 Pressure on Time
 Large volume of data is imminent
 Pressure on Resources
 Weak lensing community is relatively small
 There are many unsolved weak lensing problems
 Must prepare for the data wall
 Shape measurement
 There should exist an optimal method(s) to measure shear to the
required accuracy
 Statistical inference and image processing problem (no cosmology or
astronomy required)
 Bring in people from outside weak lensing and astronomy who are
 We want to
 Motivate and excite people about weak lensing and cosmology
 GRavitational lEnsing Accuracy Testing 2008 (GREAT08)
 Open up the shape measurement problem to
 The computational learning
 Statistical inference
 Wider astronomical community
 The 2008 PASCAL challenge
 EU network of computational learning community
 Set it as a PASCAL challenge
 Formulate the problem
 Back to basics - constant shear, PSF Known
 Be open and transparent about everything
 No astronomy (!)
 No jargon
PIs: S. Bridle, J. Shawe-Taylor
Star images → Convolution kernel (as function of position)
Galaxy images → Shear estimate per galaxy → Apply statistic (e.g. correlation function)
Predict statistic from cosmological theory
Goodness of fit (e.g. χ2) → Dark energy, cosmology
The challenge
 Run as a competition
 October 2008 to April 2009
 Results submitted to a live leader board
 Users downloaded 150 GB of simulated images
 Analysed 30 million galaxy images
Accuracy and Speed are issues
Similar in statistical (not actual) scale to large surveys
Achievable accuracy matched to Euclid/LSST
Q=1000 possible
Each star, galaxy placed ~in centre of a separate image
→ No overlapping objects
→ No object detection question
Told which are stars
→ No classification question
PSF same for set of images
Shear same for set of images
Should be enough to find g to 0.03%!
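A rough arithmetic check of why a common shear per set helps: averaging N galaxy shapes reduces the random error on g as 1/sqrt(N). This sketch assumes an intrinsic ellipticity scatter of about 0.3 per component (a typical value, not stated in the talk) and reads 0.03% as an absolute shear error of 3e-4. Both are assumptions, and the function names are illustrative.

```python
import math

def shear_error(sigma_e, n_galaxies):
    """Random error on the mean shear from averaging n_galaxies shapes."""
    return sigma_e / math.sqrt(n_galaxies)

def galaxies_needed(sigma_e, target_error):
    """Smallest N with sigma_e / sqrt(N) <= target_error."""
    return math.ceil((sigma_e / target_error) ** 2)

SIGMA_E = 0.3          # assumed intrinsic ellipticity scatter per component
TARGET = 3.0e-4        # 0.03% read as an absolute shear error (assumption)

n_needed = galaxies_needed(SIGMA_E, TARGET)    # about one million galaxies
err_30m = shear_error(SIGMA_E, 30_000_000)     # error if all 30 million shared one shear
```

This counts only random shape noise. A method with a systematic bias does not improve with N, which is what the challenge is designed to expose.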
GREAT08 Results (Bridle et al., 2009 in prep)
Present and Future
 Weak lensing is maturing into potentially the most powerful
cosmological probe
 Present methods are currently good enough and may improve
 But need much higher accuracy in the future
 Improvement by a factor of two
 Q=1000 accuracy achieved in some regimes
 However, the winners made use of some unrealistic aspects (stacking)
 Need a roadmap of staged simulated challenges so that we
reach required accuracy
STEP 1/2/3
 Only 4 more challenges to solve problem (2009=half way)
 GREAT10 (PI: T. Kitching; CoIs: A. Amara, A. Storkey, S. Bridle)
 More realistic & more matched to CompSci
 Power spectrum of shear varying across the field
 PSF/convolution kernel must be determined
 Object detection
 Challenge will be launched End 2009 / Jan 2010
 PASCAL2 Challenge
 Weak lensing can be one of the most powerful cosmological probes
 The shape measurement problem
 OK for now DES/PS1
 NOT solved for Euclid/LSST
 This is a computational problem
 GREAT challenges ideally matched to e-Science theme
 Many other e-Science matched problems in weak lensing
 Photo-z’s, Spectra
 Parameter Estimation
 PetaByte Simulations (ref Andy T talk)