Slide show illustrating motivation and challenges

Astro-DISC:
Astronomy and cosmology applications
of distributed super computing
Purpose
A toolkit for massive astronomical and
cosmological computations on large
clusters, which will include software
tools and algorithmic methods.
Universe
The visible universe is large:
• 28 billion light years
• 10 million galaxy superclusters
• 25 billion galaxy clusters
• 350 billion large galaxies
• 7 trillion dwarf galaxies
• 30 sextillion (3 ∙ 10^22) stars
Data sets
Sky surveys:
• Billions of objects
(stars, galaxies, …)
• Multiple images
of the same object
• Sloan Digital Sky Survey (2000–2008):
230 million objects, 50 TByte
• Pan-STARRS (started in 2008):
Half-order of magnitude larger than Sloan
• Large Synoptic Survey Telescope (2016):
Order of magnitude larger than Sloan
Simulations:
• Billions of objects
• Multiple runs
• Multiple time points
• McWilliams Center at CMU:
Black holes and dark matter,
15B particles, 14 TByte / run
• LANL Coyote universe:
1B particles, 1 TByte / run, 30 runs
• Many other projects
Data sets
The sizes of modern survey and
simulation datasets are between
1 and 100 billion objects.
Even larger sets are coming soon.
Their analysis requires
distributed computing.
Astronomers vs. computer scientists
DISC Cloud cluster
at Carnegie Mellon
Sixty-four nodes
Each node:
• Eight 2.83GHz cores
• Four 1TB disks
• 16 GByte memory
10GBit / second network
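As a rough check of the cluster's aggregate capacity against the data set sizes quoted on the earlier slides, the arithmetic can be sketched as below. The function names are hypothetical, and the conversion of 1 TByte = 1000 GByte is an assumption made only for this comparison.

```python
def cluster_totals(nodes=64, cores_per_node=8, disks_per_node=4,
                   disk_tbyte=1, memory_gbyte_per_node=16):
    """Aggregate capacity of the cluster, using the per-node figures from the slide."""
    return {
        "cores": nodes * cores_per_node,
        "disk_tbyte": nodes * disks_per_node * disk_tbyte,
        "memory_gbyte": nodes * memory_gbyte_per_node,
    }


# Data set sizes in TByte, as quoted on the data set slides.
DATASETS_TBYTE = {
    "Sloan Digital Sky Survey": 50,
    "McWilliams Center, one run": 14,
    "LANL Coyote universe, 30 runs": 30 * 1,
}


def fits_in_memory(size_tbyte, totals, gbyte_per_tbyte=1000):
    """True if the data set fits in aggregate memory (assumed unit conversion)."""
    return size_tbyte * gbyte_per_tbyte <= totals["memory_gbyte"]


totals = cluster_totals()
for name, size_tbyte in DATASETS_TBYTE.items():
    print(name, size_tbyte, "TByte, fits in memory:", fits_in_memory(size_tbyte, totals))
```

Under these figures the cluster has 512 cores, 256 TByte of disk, and 1024 GByte of memory, so each listed data set fits on disk but none fits in aggregate memory.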
Specific problems
• Friends of Friends:
Identification of
galaxy clusters
• Correlation functions:
Analyzing distribution of
distances between galaxies
• Spatial matching:
Identification of observed
objects in the catalog
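To make the Friends of Friends problem concrete, here is a minimal single-machine sketch, not the distributed method of this project. It assumes objects are given as Cartesian coordinates and that two objects are "friends" when they lie within a chosen linking length; the name `friends_of_friends` and the brute-force pair loop are illustrative choices only.

```python
import math


def friends_of_friends(points, linking_length):
    """Group point indices so that friends, and friends of friends, share a group."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the group representative, halving paths on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Brute force over all pairs: quadratic time, fine only for tiny inputs.
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= linking_length:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


points = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (10, 0, 0), (10.5, 0, 0)]
clusters = friends_of_friends(points, linking_length=1.5)
print(clusters)
```

The quadratic pair loop is what breaks down at billions of objects, which is where spatial indices and distributed computation become necessary.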
[Figure: observed object and catalog objects; labels: distance, 40%, 60%]
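Spatial matching can be illustrated with a simple nearest-neighbor sketch. This is an assumption-laden example rather than the project's method: positions are treated as Cartesian coordinates, a match is the closest catalog entry within a hypothetical `max_distance`, and the catalog is scanned linearly.

```python
import math


def match_to_catalog(observed, catalog, max_distance):
    """Return (index, distance) of the nearest catalog entry within max_distance, else None."""
    best = None
    for index, entry in enumerate(catalog):
        distance = math.dist(observed, entry)
        if distance <= max_distance and (best is None or distance < best[1]):
            best = (index, distance)
    return best


catalog = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
match = match_to_catalog((0.9, 0.1), catalog, max_distance=0.5)
no_match = match_to_catalog((3.0, 3.0), catalog, max_distance=0.5)
print(match, no_match)
```

A linear scan per observed object does not scale to catalogs with billions of entries; a spatial index would replace the loop in practice.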
More problems
• Quasar detection:
Identifying quasars based on the
five passband fluxes
• Particle history:
Tracking the history of particles
in astrophysics simulations
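Particle history tracking can be sketched as a regrouping of simulation output by particle. The example below assumes each time point is stored as a snapshot mapping a stable particle identifier to a position; that layout and the name `particle_histories` are hypothetical, not taken from any specific simulation code.

```python
from collections import defaultdict


def particle_histories(snapshots):
    """Turn [(time, {particle_id: position}), ...] into {particle_id: [(time, position), ...]}."""
    histories = defaultdict(list)
    # Visit snapshots in time order so every history comes out chronologically.
    for time, particles in sorted(snapshots, key=lambda snapshot: snapshot[0]):
        for particle_id, position in particles.items():
            histories[particle_id].append((time, position))
    return dict(histories)


snapshots = [
    (1.0, {7: (1, 1, 1), 8: (0, 0, 0)}),
    (0.0, {7: (0, 0, 0)}),
]
histories = particle_histories(snapshots)
print(histories[7])
```

With billions of particles and many time points, this regrouping becomes a large shuffle of data keyed by particle, which is a natural fit for distributed processing.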
Future work
• Distributed computation for other
standard astronomy problems:
Density distribution, photometric calibration,
asteroid detection,...
• General-purpose astronomy toolkit:
Massive spatial indices of celestial objects,
integrated with distributed algorithms