Patrick Ogle and the NED Team
IPAC, California Institute of Technology
A fusion of multi-wavelength extragalactic data from journal articles and large catalogs
2MASS PSC
And much more, including classifications, notes, images, spectra…
• Very Large Catalogs (VLCs, >10⁷ sources)
• Find candidate matches in NED
• Select best match
– Rule-based
– Statistical analysis
• Match data recorded in DB
• Reversible and iterable
GALEX ASC (NUV) vs. SDSS DR6 (gri, 6’x6’)
• VLC Source and NED Object Positions (RA, Dec, ±)
• Source-Object Separation (s, ±σ)
• Source and Object Types
(galaxy, galaxy cluster, star, UV source, etc…)
• Background Object Density (measured for each source)
• Instrumental Beam Size
• Other: redshift, photometry, diameters
NED Pipeline for Very Large Catalogs
• Source Loader
– Load Very Large Catalog (VLC) source names and positions into NED.
• CSearch (PostgreSQL)
– Find match candidates with NED near position search
– Count background objects
– Spatial indexing will speed up search (e.g. Q3C, HTM)
• MatchExpert (Python)
– Select best match from CSearch match candidates
– Object associations for no-matches
– Record match statistics for each match
– Match statistic distributions and integrals
– Code migration to DBMS for speed
• Object Loader (PostgreSQL)
– Create NED cross-IDs
– new objects
– associations
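The CSearch step (near-position search plus background count) could be sketched in Python as below. This is an illustration only, not NED's PostgreSQL implementation: the function names, the dictionary layout, and the choice to count every object inside the background radius (rather than an annulus) are assumptions, and the brute-force loop stands in for a spatially indexed query.

```python
import math

def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcsec between two positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, numerically stable at small separations.
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a))) * 3600.0

def csearch(source, objects, r_search, r_background):
    """Return (match candidates sorted by separation, background object count).

    source and objects are hypothetical dicts with "ra" and "dec" in degrees;
    radii are in arcsec. Assumption: the background count includes every
    object within r_background, candidates included.
    """
    candidates = []
    n_background = 0
    for obj in objects:
        s = angular_sep_arcsec(source["ra"], source["dec"], obj["ra"], obj["dec"])
        if s <= r_background:
            n_background += 1
        if s <= r_search:
            candidates.append((s, obj["name"]))
    candidates.sort()
    return candidates, n_background

# Toy usage with made-up positions (not real NED objects).
toy_objects = [
    {"name": "A", "ra": 10.001, "dec": 0.0},
    {"name": "B", "ra": 10.0, "dec": 0.01},
    {"name": "C", "ra": 10.1, "dec": 0.0},
]
toy_candidates, toy_n = csearch({"ra": 10.0, "dec": 0.0}, toy_objects, 7.5, 46.5)
print(toy_candidates, toy_n)
```

A spatial index such as Q3C or HTM would replace the loop over all objects with an indexed radial query.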
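[Pipeline diagram: Source Loader → CSearch → MatchEx → Object Loader]
[MatchEx flowchart. Input: match list from CSearch. Tests: separation thresholds (S<Scut), type match, Poisson probability (P>Pcut), S1/S2<0.33, error circles overlap, NED duplicate, name prefix match. Outcomes: single good match (NED cross-ID match) or no match (create NED object and associations).]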
• Where a match is not made to a nearby object, an association record may be created.
• Association types:
– Source and object position error circles overlap
– Object is within the beam (PSF) of the source
[Flowchart labels: Error Circles Overlap → Create Error Overlap association record; No Match; S<beam → Create In Beam association record]
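A minimal sketch of the two association tests, under stated assumptions: "error circles overlap" is taken to mean that the separation is smaller than the sum of the two error radii, and the overlap test is checked before the in-beam test. Neither detail is spelled out on the slide, and the function and label names are hypothetical.

```python
def classify_association(sep, sigma_source, sigma_object, beam):
    """Return an association type for an unmatched source-object pair, or None.

    All arguments are in arcsec. Assumptions: circles overlap when
    sep < sigma_source + sigma_object, and overlap takes precedence
    over the in-beam test.
    """
    if sep < sigma_source + sigma_object:
        return "error_overlap"
    if sep < beam:
        return "in_beam"
    return None
```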
[Figure: GALEX ASC (NUV) vs. NED, with the NED object, GALEX search region, and background region marked; comparison image from SDSS DR6 (gri, 6’x6’)]
• GALEX All-Sky Catalog of ~40 million unique NUV sources created by M. Seibert (2012)
• Matched against ~180 million NED objects (2013)
• Search radius: $r_s = 7.5''$ for GALEX
• Background radius: $r_b = 46.5''$ for GALEX
• Density of background NED objects: $n = N/(\pi r_b^2)$
• Expected number inside $s$: $\langle N_s \rangle = N (s/r_b)^2$, where $s$ is the separation
• Poisson probability of $x = k$ objects closer than $s$:
– $P_s(x=k) = \langle N_s \rangle^k \exp(-\langle N_s \rangle)/k!$
– For $k=0$, simplifies to: $P_s(x=0) = \exp(-\langle N_s \rangle) = \exp(-N (s/r_b)^2)$
• False-match probability: $P_f = 1 - P_s(0)$
• Example: $N = 4$, $s/r_b = 0.08$ gives $P_s(0) = 0.975$ and $P_f = 0.025$
[Diagram: separation $s$, search radius $r_s$, background radius $r_b$]
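The Poisson statistic above can be written as a short Python sketch. The function names are illustrative and not taken from MatchExpert; the last lines reproduce the worked example with N = 4 and s/r_b = 0.08.

```python
import math

def expected_background(n_background, sep, r_background):
    """Expected number of background objects closer than sep: N * (s / r_b)**2."""
    return n_background * (sep / r_background) ** 2

def poisson_prob(k, mean):
    """Poisson probability of exactly k objects given the expected number."""
    return mean ** k * math.exp(-mean) / math.factorial(k)

def false_match_probability(n_background, sep, r_background):
    """P_f = 1 - P_s(0): chance that at least one background object lies within sep."""
    return 1.0 - poisson_prob(0, expected_background(n_background, sep, r_background))

# Worked example: N = 4, s/r_b = 0.08 (r_b set to 1 so sep is the ratio itself).
p0 = poisson_prob(0, expected_background(4, 0.08, 1.0))
print(round(p0, 3), round(false_match_probability(4, 0.08, 1.0), 3))  # 0.975 0.025
```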
• Optimize on 100K subsample in SDSS region
• False-positive rate decreases with increasing Poisson cutoff.
• False-negative rate increases with increasing Poisson cutoff.
• Give 10x weight to false positives: it is worse to make an incorrect match than to miss a match.
• Poisson cutoff value of 90% minimizes the combined, weighted error rate.
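One way to express the weighted optimization is sketched below. It assumes a match is accepted when its Poisson probability P exceeds the cutoff, and that a truth label is available for each source in the test subsample. How truth was established is not stated on the slide, so `is_true_match` is a hypothetical input, as are the function names.

```python
def weighted_error_rate(probs, is_true_match, cutoff, fp_weight=10.0):
    """Combined error rate with false positives weighted fp_weight times.

    probs: Poisson probability P for the best candidate of each source.
    is_true_match: hypothetical truth labels, one per source.
    """
    pairs = list(zip(probs, is_true_match))
    false_pos = sum(1 for p, truth in pairs if p > cutoff and not truth)
    false_neg = sum(1 for p, truth in pairs if p <= cutoff and truth)
    return (fp_weight * false_pos + false_neg) / len(pairs)

def best_cutoff(probs, is_true_match, cutoffs, fp_weight=10.0):
    """Return the trial cutoff with the lowest weighted error rate."""
    return min(cutoffs,
               key=lambda c: weighted_error_rate(probs, is_true_match, c, fp_weight))
```

Scanning a grid of cutoffs with `best_cutoff` on the subsample is the kind of search that would locate a minimum such as the 90% value quoted above.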
• 39,570,031 input GALEX ASC UV sources
• NED (2013) contained ~180 million distinct objects
• 10,595,382 (26.8%) of the ASC sources matched NED objects and became cross-IDs
• 28,974,649 (73.2%) were not matched and became new NED objects
– 68.2% of GASC sources are in blank NED fields
– 5.0% have multiple match candidates
Image credit : GALEX
NASA/JPL-Caltech/SSC
GALEX ASC Match Results: Background Rejection and False-Negative Rate
• Uncorrelated background out to 15 arcsec fit by straight line: dN/ds ~ s
• MatchEx is successful at filtering out this background.
• False-negative rate $f_n$ = 2.4%, estimated by comparison to background-subtracted match candidates (red line).
[Plot: match candidates vs. separation (arcsec), with false negatives marked]
GALEX ASC Results: False Positive Rate
• The false-positive match rate is estimated by summing the Poisson statistic (1-P) over all matches and dividing by the total number of sources: $f_p$ = 0.25%
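The estimate described above amounts to a one-line sum. In this sketch, `match_probs` is a hypothetical list holding the Poisson probability P of each accepted match.

```python
def false_positive_rate(match_probs, n_sources):
    """Sum (1 - P) over all accepted matches, divided by the total number of sources."""
    return sum(1.0 - p for p in match_probs) / n_sources
```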
GALEX ASC Results: Position Error Distribution
• The distribution of normalized separation r=s/σ deviates from a Gaussian. The peak is at 0.9 instead of 1.0, and the tail is stronger.
[Plot: distribution of r=s/σ, with a comparison curve labeled "Derivative of a Gaussian"]
Important Lessons Learned:
1. Do not assume reported catalog position errors are correct.
2. Do not assume position error distributions are Gaussian.
3. A 3.5σ threshold on match separation rejected more candidates than expected.
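For reference, if position errors were circular Gaussians with per-axis dispersion σ (an assumption, and the one these lessons warn against), the normalized separation r = s/σ of true matches would follow a Rayleigh distribution, which is proportional to the derivative of a Gaussian and peaks at r = 1. The sketch below shows how few true matches a 3.5σ cut would then be expected to reject; the function names are illustrative.

```python
import math

def rayleigh_pdf(r):
    """Density of r = s/sigma for circular Gaussian errors: r * exp(-r**2 / 2)."""
    return r * math.exp(-r * r / 2.0)

def rayleigh_tail(r_cut):
    """Fraction of true matches expected beyond r_cut: exp(-r_cut**2 / 2)."""
    return math.exp(-r_cut * r_cut / 2.0)

print(f"{rayleigh_tail(3.5):.4f}")  # 0.0022
```

A rejected fraction well above this value would point to underestimated or non-Gaussian position errors.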
• While no color criteria were used to select matches to GALEX sources, the NUV-g colors of GALEX-SDSS matches were checked: most matches have -7<NUV-g<7
• GALEX ASC range: 14<NUV<24
• Detection rate falls at NUV>21.7
• Object Types ordered by candidate match frequency
• Most GALEX sources matched to galaxies (G) and stars (*)
• QSO, Galactic star (!*), UV excess object (UvES), and WD* matches overrepresented, as might be expected for a UV-selected catalog.
• Matches to RadioS, XrayS, GGroup, and GPair candidates were disallowed.
• GALEX ASC photometry added to NED spectral energy distribution of 3C 382 (CGCG 173-014)
• Over 145 million GALEX ASC NUV and FUV photometry records added to NED (2 extraction methods per band)
• GALEX ASC: ~40,000,000 UV sources loaded and matched (2013)
• GALEX MSC: ~22,000,000 UV sources loaded and matched (2014)
• Spitzer Source List: ~42,000,000 MIR sources (2014)
• 2MASS PSC: ~471,000,000 NIR sources loaded (2015 finish)
• AllWISE: ~748,000,000 MIR sources (2015 start)
• SDSS DR10: ~469,000,000 Vis sources (2015 start)
• SDSS DR6: ~154,000,000 Vis sources loaded and matched (out of 217M), excluding sources with undesirable flag values (2008)
NED aims to quadruple its object holdings in the next year!