Data Grids: A New Computational Infrastructure for Data Intensive Science

Paul Avery
October 21, 2001
Version 1

1 Introduction
Twenty-first century scientific and engineering enterprises are increasingly characterized by their geographic dispersion and their reliance on large data archives. These characteristics bring with them unique
challenges. First, the increasing size and complexity of modern data collections require significant investments in information technologies to store, retrieve and analyze them. Second, the increased distribution of
people and resources in these projects has made resource sharing and collaboration across significant geographic and organizational boundaries critical to their success.
Infrastructures known as “Grids”[1] are being developed to address the problem of resource sharing. An
excellent introduction to Grids can be found in the article[2], “The Anatomy of the Grid”, which provides
the following interesting description:
“The real and specific problem that underlies the Grid concept is coordinated resource
sharing and problem solving in dynamic, multi-institutional virtual organizations. The
sharing that we are concerned with is not primarily file exchange but rather direct access
to computers, software, data, and other resources, as is required by a range of collaborative problem-solving and resource-brokering strategies emerging in industry, science, and
engineering. This sharing is, necessarily, highly controlled, with resource providers and
consumers defining clearly and carefully just what is shared, who is allowed to share, and
the conditions under which sharing occurs. A set of individuals and/or institutions defined by such sharing rules form what we call a virtual organization (VO).”
The existence of very large distributed data collections adds a significant new dimension to enterprise-wide
resource sharing, and has led to substantial research and development effort on “Data Grid” infrastructures,
capable of supporting this more complex collaborative environment. This work has taken on more urgency
for new scientific collaborations, which in some cases will reach global proportions and share data archives
with sizes measured in dozens or even hundreds of Petabytes within a decade. These collaborations have
recognized the strategic importance of Data Grids for realizing the scientific potential of their experiments,
and have begun working with computer scientists, members of other scientific and engineering fields and
industry to research and develop this new technology and create production-scale computational environments. Figure 1 below shows a U.S. based Data Grid consisting of a number of heterogeneous resources.
Figure 1: A transcontinental Data Grid composed of computational and storage resources of different types linked by high-speed networks.
My aim in this paper is to review Data Grid technologies and how they can benefit data intensive sciences.
Developments in industry are not included here, but since most Data Grid work is presently carried out to
address the urgent data needs of advanced scientific experiments, the omission is not a serious one. (The
problems solved while dealing with these experiments will in any case be of enormous benefit to industry
in a short time.) Furthermore, I will concentrate on those projects which are developing Data Grid infrastructures for a variety of disciplines, rather than “vertically integrated” projects that benefit a single experiment or discipline, and explain the specific challenges faced by those disciplines.
2 Data Intensive Activities
The number and diversity of data intensive projects are expanding rapidly. The following survey of projects, while incomplete, shows the scope of, and the immense interest in, data intensive methods for solving scientific problems.
Physics and space sciences: High energy and nuclear physics experiments at accelerator laboratories at
Fermilab, Brookhaven and SLAC already generate dozens to hundreds of Terabytes of colliding beam data
per year that is distributed to and analyzed by hundreds of physicists around the world to search for subtle
new interactions. Upgrades to these experiments and new experiments planned for the Large Hadron Collider at CERN will increase data rates to Petabytes per year. Gravitational wave searches at LIGO, VIRGO
and GEO will accumulate yearly samples of approximately 100 Terabytes of mostly environmental and
calibration data that must be correlated and filtered to search for rare gravitational events. New multiwavelength all-sky surveys utilizing telescopes instrumented with gigapixel CCD arrays will soon drive
yearly data collection rates from Terabytes to Petabytes. Similarly, remote-sensing satellites operating at
multiple wavelengths will generate several Petabytes of spatial-temporal data that can be studied by researchers to accurately measure changes in our planet’s support systems.
Biology and medicine: Biology and medicine are rapidly increasing their dependence on data intensive
methods. Experiments at new generation light sources have the potential to generate massive amounts of
data while recording the changes in shape of individual protein molecules. Organism genomes are being
sequenced by new generations of sequencing engines and stored in databases, and their properties are compared using new statistical methods requiring massive computational power. Proteomics, the study of protein structure and function, is expected to generate enormous amounts of data, easily dwarfing the data
samples obtained from genome studies. In medicine, a single three dimensional brain scan can generate a
significant fraction of a Terabyte of data, while systematic adoption of digitized radiology scans will produce dozens of Petabytes of data that can be quickly accessed and searched for breast cancer and other diseases. Exploratory studies have shown the value of converting patient records to electronic form and attaching digital CAT scans, X-Ray charts and other instrument data, but systematic use of such methods
would generate databases many Petabytes in size. Medical data pose additional ethical and technical challenges stemming from exacting security restrictions on data access and on patient identification.
Computer simulations: Advances in information technology in recent years have given scientists and engineers the ability to develop sophisticated simulation and modeling techniques for improved understanding of the behavior of complex systems. When coupled to the huge processing power and storage resources
available in supercomputers or large computer clusters, these advanced simulation and modeling methods
become tools of rare power, permitting detailed and rapid studies of physical processes while sharply reducing the need to conduct lengthy and costly experiments or to build expensive prototypes. The following
examples provide a hint of the potential of modern simulation methods. High energy and nuclear physics
experiments routinely generate simulated datasets whose size (in the multi-Terabyte range) is comparable
to and sometimes exceeds the raw data collected by the same experiment. Supercomputers generate enormous databases from long-term simulations of climate systems with different parameters that can be compared with one another and with remote satellite sensing data. Environmental modeling of bays and estuaries using fine-scale fluid dynamics calculations generates massive datasets that permit the calculation of
pollutant dispersal scenarios under different assumptions that can be compared with measurements. These
projects also have geographically distributed user communities who must access and manipulate these databases.
Physics at the Large Hadron Collider: Although I briefly discussed high energy physics earlier, the requirements for experiments at CERN’s Large Hadron Collider (LHC), due to start operations in 2006, are
so extreme as to merit a separate treatment. LHC experiments face computing challenges of unprecedented
scale in terms of data volume and complexity, processing requirements and the complexity and distributed
nature of the analysis and simulation tasks among thousands of scientists worldwide. Every second, each
of the two general purpose detectors will filter one billion collisions and record one hundred of them to
mass storage, generating data rates of 100 Mbytes per second and several Petabytes per year of raw, processed and simulated data in the early years of operation. The data storage rate is expected to grow in response to the pressures of increased beam intensity, additional physics processes that must be recorded and
better storage capabilities, leading to LHC data collections totaling approximately 100 Petabytes by the end
of this decade, and up to an Exabyte (1000 Petabytes) by the middle of the following decade. The challenge facing the LHC laboratory and scientific community is how to build an information technology infrastructure that will provide these computational and storage resources while enabling their effective and
efficient use by a scientific community of thousands of physicists spread across the globe.
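A back-of-envelope calculation reproduces these figures. The sketch below is illustrative only: the per-event size of roughly 1 Megabyte and the assumption of about 10 million live seconds of accelerator operation per year are my own working assumptions, not numbers quoted by the experiments.

```python
# Back-of-envelope estimate of LHC raw data rates per detector.
# Assumptions (illustrative only): ~1 MB per recorded event,
# ~1e7 live seconds of accelerator operation per year.

events_recorded_per_second = 100      # collisions written to mass storage each second
event_size_bytes = 1.0e6              # assumed ~1 Megabyte per raw event
live_seconds_per_year = 1.0e7         # assumed live time per year

rate_bytes_per_second = events_recorded_per_second * event_size_bytes
yearly_volume_bytes = rate_bytes_per_second * live_seconds_per_year

print(f"data rate : {rate_bytes_per_second / 1e6:.0f} MB/s")    # ~100 MB/s, as quoted above
print(f"raw data  : {yearly_volume_bytes / 1e15:.1f} PB/year")  # ~1 PB/year of raw data;
# adding processed and simulated data yields the several PB/year figure cited in the text.
```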
3 Data Grids and Data Intensive Sciences
To develop the argument that Data Grids offer a comprehensive solution to data intensive activities, I first
summarize some general features of Grid technologies. These technologies comprise a mixture of protocols, services, and tools that are collectively called “middleware”, reflecting the fact that they are accessed
by “higher level” applications or application tools while they in turn invoke processing, storage, network
and other services at “lower” software and hardware levels. Grid middleware includes security and policy
mechanisms that work across multiple institutions; resource management tools that support access to remote information resources and simultaneous allocation (“co-allocation”) of multiple resources; general
information protocols and services that provide important status information about hardware and software
resources, site configurations, and services; and data management tools that locate and transport datasets
between storage systems and applications.
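To make this division of labor concrete, the following sketch models, in deliberately simplified Python, how a "collective" operation such as co-allocation might be built from per-site resource reservations. All class and method names here are hypothetical illustrations of the layering discussed in this section, not the interfaces of any real Grid middleware; security and authentication are omitted for brevity.

```python
# Hypothetical sketch of Grid middleware layering: a collective-level
# co-allocation call composed from resource-level operations at single sites.
# None of these names correspond to a real toolkit API.

from dataclasses import dataclass

@dataclass
class Site:
    """Fabric-level resource: a site offering CPUs and disk."""
    name: str
    free_cpus: int
    free_disk_tb: float

class ResourceLayer:
    """Resource layer: negotiate access to one site at a time."""
    def reserve(self, site: Site, cpus: int, disk_tb: float) -> bool:
        if site.free_cpus >= cpus and site.free_disk_tb >= disk_tb:
            site.free_cpus -= cpus
            site.free_disk_tb -= disk_tb
            return True
        return False

class CollectiveLayer:
    """Collective layer: coordinate several sites at once (co-allocation)."""
    def __init__(self, resources: ResourceLayer):
        self.resources = resources

    def co_allocate(self, sites, cpus: int, disk_tb: float):
        """Reserve cpus/disk at every site, or release everything on failure."""
        reserved = []
        for site in sites:
            if self.resources.reserve(site, cpus, disk_tb):
                reserved.append(site)
            else:
                for s in reserved:            # roll back partial reservations
                    s.free_cpus += cpus
                    s.free_disk_tb += disk_tb
                return None
        return reserved

# Example: an application-level request spanning two (fictional) sites.
sites = [Site("lab_a", free_cpus=64, free_disk_tb=5.0),
         Site("univ_b", free_cpus=32, free_disk_tb=2.0)]
granted = CollectiveLayer(ResourceLayer()).co_allocate(sites, cpus=16, disk_tb=1.0)
print("co-allocation granted at:", [s.name for s in granted] if granted else "none")
```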
The diagram in Figure 2 outlines in a simple way the roles played by
various Grid technologies. The lowest level Fabric contains shared
resources such as computer clusters, data storage systems, catalogs,
networks, etc. that Grid tools must access and manipulate. The Resource and Connectivity layers provide, respectively, access to individual resources and the communication and authentication tools
needed to communicate with them. Coordinated use of multiple
resources – possibly at different sites – is handled by Collective protocols, APIs, and services. Applications and application toolkits
utilize these Grid services in myriad ways to provide “Grid-aware”
services for members of a particular virtual organization. A much
more detailed explication of Grid architecture can be found in reference [2].
Figure 2: Diagram showing how Grid services in different levels (from bottom to top: Fabric, Connectivity, Resource, Collective, Application) provide applications access to resources (Fabric).

While standard Grid infrastructures provide distributed scientific communities the ability to collaborate and share resources, additional capabilities are needed to cope with the specific challenges associated with scientists accessing and manipulating very large distributed data collections. These collections, ranging in size from Terabytes to Petabytes, comprise raw (measured) and many levels of processed or refined data as well as comprehensive metadata describing, for
example, under what conditions the data was generated or collected,
how large it is, etc. New protocols and services must facilitate access to significant tertiary (e.g., tape) and
secondary (disk) storage repositories to allow efficient and rapid access to primary data stores, while taking
advantage of disk caches that buffer very large data flows between sites. They also must make efficient use
of high performance networks that are critically important for the timely completion of these transfers.
Thus to transport 10 Terabytes of data to a computational resource in a single day requires a 1 Gigabit per
second network operated at 100% utilization. Efficient use of these extremely high network bandwidths
also requires special software interfaces and programs that in most cases have yet to be developed.
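The arithmetic behind the bandwidth figure is simple and worth making explicit:

```python
# Bandwidth needed to move 10 Terabytes in one day at 100% utilization,
# using the figures quoted in the text above.

data_bytes = 10e12                 # 10 Terabytes
seconds_per_day = 24 * 3600

required_bits_per_second = data_bytes * 8 / seconds_per_day
print(f"required bandwidth: {required_bits_per_second / 1e9:.2f} Gb/s")  # ~0.93 Gb/s, i.e. ~1 Gb/s
```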
The computational and data management problems encountered in these experiments include the following
challenging aspects:
• Computation-intensive as well as data-intensive: Analysis tasks are compute-intensive and data-intensive and can involve thousands of computer, data handling, and network resources. The central problem is coordinated management of computation and data, not just data curation and movement (see the sketch after this list).

• Need for large-scale coordination without centralized control: Rigorous performance goals require coordinated management of numerous resources, yet these resources are, for both technical and strategic reasons, highly distributed and not always amenable to tight centralized control.

• Large dynamic range in user demands and resource capabilities: It must be possible to support and arbitrate among a complex task mix of experiment-wide, group-oriented, and (perhaps thousands of) individual activities, using I/O channels, local area networks, and wide area networks that span several distance scales.

• Data and resource sharing: Large dynamic communities would like to benefit from the advantages of intra- and inter-community sharing of data products and the resources needed to produce and store them.
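The first of these points can be made concrete with a toy cost model: given a dataset resident at one site and idle processors at another, a planner must weigh the cost of moving the data against the cost of waiting to compute where the data already sits. The function and numbers below are purely illustrative assumptions, not part of any existing Grid scheduler.

```python
# Toy cost model: ship the data to free CPUs elsewhere, or wait and compute
# where the data already sits?  All numbers are purely illustrative.

def transfer_hours(data_tb: float, link_gbps: float) -> float:
    """Time to move the dataset over the wide-area link."""
    return data_tb * 1e12 * 8 / (link_gbps * 1e9) / 3600

def plan(data_tb, link_gbps, compute_hours, local_queue_wait_hours):
    remote = transfer_hours(data_tb, link_gbps) + compute_hours   # ship data, then run
    local = local_queue_wait_hours + compute_hours                # wait for local CPUs
    return ("move data to remote site", remote) if remote < local else ("run at data site", local)

# Example: 5 TB dataset, 1 Gb/s link, 6 h of processing, 20 h local queue backlog.
choice, hours = plan(data_tb=5, link_gbps=1.0, compute_hours=6, local_queue_wait_hours=20)
print(f"{choice}: about {hours:.1f} hours to completion")
```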
The “Data Grid” has been introduced as a unifying concept to describe the new technologies required to
support such next-generation data-intensive applications—technologies that will be critical to future data-intensive computing in the many areas of science and commerce in which sophisticated software must harness large amounts of computing, communication and storage resources to extract information from data.
Data Grids are typically characterized by the following elements: (1) they have large extent (national and
even global) and scale (many resources and users); (2) they layer sophisticated new services on top of existing local mechanisms and interfaces, facilitating coordinated sharing of remote resources; and (3) they
provide a new dimension of transparency in how computational and data processing are integrated to provide data products to user applications. This transparency is vitally important for sharing heterogeneous
distributed resources in a manageable way, a point to which I will return in the next section.
4 Major Data Grid Efforts Today
I describe in this section the major projects that have been undertaken to develop Data Grids. Without exception, these efforts have been driven by the extreme needs of current and planned scientific experiments,
and have led to fruitful collaborations between application scientists and computer scientists. High energy
physics has been at the forefront of this activity, a fact that can be attributed to its historical standing as
both a highly data intensive and highly collaborative discipline. It is widely recognized, however, that existing high energy physics computing infrastructures are not scalable to upgraded experiments at SLAC,
Fermilab and Brookhaven, and new experiments at the LHC, which will generate Petabytes of data per year to be analyzed by global collaborations. Participants in other planned experiments in nuclear physics,
gravitational research, large digital sky surveys, and virtual astronomical observatories, fields which also
face challenges associated with massive distributed data collections and a dispersed user community, have
also decided to adopt Data Grid computing infrastructures. Scientists from these and other disciplines are
exploiting new initiatives and partnering with computer scientists and each other to develop production-scale Data Grids.
A tightly coordinated set of projects has been established that together are developing and applying Data
Grid concepts to problems of tremendous scientific importance, in such areas as high energy physics, nuclear physics, astronomy, bioinformatics and climate science. These projects include (1) the Particle Physics Data Grid (PPDG [3]), which is focused on the application of Data Grid concepts to the needs of a
number of U.S.-based high energy and nuclear physics experiments; (2) the Earth System Grid project [4],
which is exploring applications in climate and specific technical problems relating to request management;
(3) the GriPhyN [5] project, which plans to conduct extensive computer science research on Data Grid
problems and develop general tools that will support the automatic generation and management of derived,
or “virtual”, data for four leading experiments in high energy physics, gravitational wave searches and astronomy; (4) the European Data Grid (EDG) project, which aims to develop an operational Data Grid infrastructure supporting high energy physics, bioinformatics, and satellite sensing; (5) the TeraGrid [6], which
will provide a massive distributed computing and data resource connected by ultra-high speed optical networks; and (6) the International Virtual Data Grid Laboratory (iVDGL [7]), which will provide a world-
wide set of resources for Data Grid tests by a variety of disciplines. These projects have adopted the Globus [8] toolkit for their basic Grid infrastructure to speed the development of Data Grids. The Globus directors have played a leadership role in establishing a broad national—and indeed international—consensus
on the importance of Data Grid concepts and on specifics of a Data Grid architecture. I will discuss these
projects in varying levels of detail in the rest of this section while leaving coordination and common architecture issues for the following section.
4.1 Globus-Based Infrastructure
The Globus Project [8], a joint research and development effort of Argonne National Laboratory, the Information Sciences Institute of the University of Southern California, and the University of Chicago, has
over the past several years developed the most comprehensive and widely used Grid framework available
today. The widespread adoption of Globus technologies is due in large measure to the fact that its members
work closely with a variety of scientific and engineering projects while maintaining compatibility with other computer science toolkits used by these projects. Globus tools are being studied, developed, and enhanced at institutions worldwide to create new Grids and services, and to conduct computing research.
Globus components provide the capabilities to create “Grids” of computing resources and users; track the
capabilities of resources within a grid; specify the resource needs of users’ computing tasks; mutually authenticate both users and resources; and deliver data to and from remotely executed computing tasks. Globus is distributed in a modular and open “toolkit” form, facilitating the incorporation of additional services
into scientific environments and applications. For example, Globus has been integrated with technologies
such as the Condor high-throughput computing environment [9,10] and the Portable Batch System (PBS)
job scheduler [11]. Both of these integrations demonstrate the power and value of the open protocol toolkit
approach. Projects such as the NSF’s National Technology Grid [12], NASA’s Information Power Grid [13] and DOE ASCI’s Distributed Resource Management Project [14] have provided considerable experience with the creation of production infrastructures.
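As a concrete, if simplified, illustration of how these toolkit components appear to a user, the sketch below drives three command-line tools that shipped with Globus toolkits of this era (grid-proxy-init to create a proxy credential, globus-job-run for remote execution, and globus-url-copy for GridFTP transfers) from Python. The host names and file paths are placeholders, and the invocations should be checked against the documentation of the installed toolkit version; treat this as a sketch of the workflow, not a verified recipe.

```python
# Sketch of a minimal Globus-based workflow driven from Python.
# Host names and paths are placeholders; verify command options against
# the documentation of the Globus Toolkit version actually installed.

import subprocess

def run(cmd):
    print("$", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Obtain a short-lived proxy credential for mutual authentication.
run(["grid-proxy-init"])

# 2. Execute a simple job on a remote Grid resource.
run(["globus-job-run", "gatekeeper.example.org", "/bin/hostname"])

# 3. Move a result file back with GridFTP.
run(["globus-url-copy",
     "gsiftp://gridftp.example.org/data/results.dat",
     "file:///tmp/results.dat"])
```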
4.2 Earth System Grid
4.3 The TeraGrid
The TeraGrid Project [6] was recently funded by the National Science Foundation for $53M over 3 years to
construct a distributed supercomputing facility at four sites: the National Center for Supercomputing Applications [15] in Illinois, the San Diego Supercomputer Center, Caltech’s Center for Advanced Computational Research and Argonne National Laboratory. The project aims to build and deploy the world's largest, fastest and most comprehensive distributed infrastructure for open scientific research. When completed,
the TeraGrid will include 13.6 teraflops of Linux cluster computing power distributed at the four TeraGrid
sites, facilities capable of managing and storing more than 450 terabytes of data, high-resolution visualization environments, and toolkits for Grid computing. These components will be tightly integrated and connected through an optical network that will initially operate at 40 gigabits per second and later be upgraded
to 50-80 gigabits/second—an order of magnitude beyond today's fastest research network. TeraGrid aims
to partner with other Grid projects and has active plans to help several discipline sciences deploy their applications, including dark matter calculations, weather forecasting, biomolecular electrostatics, and quantum molecular calculations [16].
4.4 Particle Physics Data Grid
The Particle Physics Data Grid (PPDG) is a collaboration of computer scientists and physicists from six
experiments who plan to develop, evaluate and deliver Grid-enabled tools for data-intensive collaboration
in particle and nuclear physics. The project has been funded by the U.S. Department of Energy since 1999
and was recently funded for over US$3.1M in 2001 (funding is expected to continue at a similar level for
2002-2003) to establish a “collaboratory pilot” to pursue these goals. The new three-year program will
exploit the strong driving force provided by currently running high energy and nuclear physics experiments
at SLAC, Fermilab and Brookhaven together with recent advances in Grid middleware.
Novel mechanisms and policies will be vertically integrated with Grid middleware and experiment specific
applications and computing resources to form effective end-to-end capabilities. The project’s goals and plans are guided by the immediate, medium-term and longer-term needs and perspectives of the LHC experiments
ATLAS [17] and CMS [18], which will run for at least a decade from late 2005, and by the research and development agenda of other Grid-oriented efforts. PPDG exploits the immediate needs of running experiments – BaBar [19], D0 [20], STAR [21] and the Jlab [22] experiments – to stress-test both concepts and software in
return for significant medium-term benefits. PPDG is actively involved in establishing the necessary coordination between potentially complementary data-grid initiatives in the US, Europe and beyond.
The BaBar experiment faces the challenge of data volumes and analysis needs that are planned to grow by more than a factor of 20 by 2005. During 2001, the CNRS-funded computer center CCIN2P3 in Lyon, France, will
join SLAC in contributing data analysis facilities to the fabric of the collaboration. The STAR experiment
at RHIC has already acquired its first data and has identified Grid services as the most effective way to
couple the facilities at Brookhaven with its second major center for data analysis at LBNL. An important
component of the D0 fabric is the SAM [23] distributed data management system at Fermilab, to be linked
to applications at major US and international sites. The LHC collaborations have identified data-intensive
collaboratories as a vital component of their plan to analyze tens of petabytes of data in the second half of
this decade. US CMS is developing a prototype worldwide distributed data production system for detector
and physics studies.
4.5 GriPhyN
GriPhyN is a large collaboration of computer scientists and experimental physicists and astronomers who
aim to provide the information technology (IT) advances required for Petabyte-scale data intensive sciences. Funded by the National Science Foundation with US$11.9M for 2000-2005, the project’s requirements are driven by four forefront experiments: the ATLAS [17] and CMS [18] experiments at the LHC, the Laser
Interferometer Gravitational Wave Observatory (LIGO [24]) and the Sloan Digital Sky Survey (SDSS
[25]). These requirements, however, easily generalize to other sciences as well as 21st century commerce,
so the GriPhyN team is pursuing IT advances centered on the creation of Petascale Virtual Data Grids
(PVDG) that meet the data-intensive computational needs of a diverse community of thousands of scientists spread across the globe.
GriPhyN has adopted the concept of virtual data as a unifying theme for its investigations of Data Grid concepts and technologies. This term is used to refer to two related concepts: transparency with respect to location, as a means of improving access performance with respect to speed and/or reliability, and transparency with respect to materialization, as a means of facilitating the definition, sharing, and use of data derivation mechanisms. These characteristics combine to enable the definition and delivery of a potentially unlimited virtual space of data products derived from other data. In this virtual space, requests can be satisfied via direct retrieval of materialized products and/or computation, with local and global resource management, policy, and security constraints determining the strategy used.

The concept of virtual data recognizes that all except irreproducible raw experimental data need ‘exist’ physically only as the specification for how they may be derived. The grid may instantiate zero, one, or many copies of derivable data depending on probable demand and the relative costs of computation, storage, and transport. On a much smaller scale, this dynamic processing, construction, and delivery of data is precisely the strategy used to generate much, if not most, of the web content delivered in response to queries today.

Figure 3: Virtual data in action. In this example, a data request is satisfied by data from a major archive facility, data from one regional center and computation on data at a second regional center, plus both data and computation from the local site and neighbor.
Figure 3 illustrates what the virtual data grid concept means in practice. Consider an astronomer using
SDSS to investigate correlations in galaxy orientation due to lensing effects by intergalactic dark matter
[26,27,28]. A large number of galaxies – some 10^7 – must be analyzed to get good statistics, with careful
filtering to avoid bias. For each galaxy, the astronomer must first obtain an image, a few pixels on a side;
process it in a computationally intensive analysis; and store the results. Execution of this request involves
virtual data catalog accesses to determine whether the required analyses have been previously constructed.
If they have not, the catalog must be accessed again to locate the applications needed to perform the transformation and to determine whether the required raw data is located in network cache, remote disk systems,
or deep archive. Appropriate computer, network, and storage resources must be located and applied to access and transfer raw data and images, produce the missing images, and construct the desired result. The
execution of this single request may involve thousands of processors and the movement of terabytes of data
among archives, disk caches, and computer systems nationwide.
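The bookkeeping behind such a request can be sketched in a few lines of code. The catalog below is a deliberately simplified, hypothetical stand-in for a virtual data catalog (it is not the GriPhyN Virtual Data Toolkit interface): each derived product is recorded either as a materialized replica or as a recipe, a transformation plus its inputs, and a request either returns an existing copy or recursively derives, caches, and registers one.

```python
# Hypothetical virtual data catalog: derived products exist either as
# materialized replicas or as recipes (transformation + inputs) for
# producing them on demand.  Not the actual VDT interface.

class VirtualDataCatalog:
    def __init__(self):
        self.replicas = {}   # product name -> materialized value (or storage location)
        self.recipes = {}    # product name -> (transformation, list of input product names)

    def add_replica(self, name, value):
        self.replicas[name] = value

    def add_recipe(self, name, transform, inputs):
        self.recipes[name] = (transform, inputs)

    def request(self, name):
        """Return the product, deriving (and caching) it if not yet materialized."""
        if name in self.replicas:                  # direct retrieval
            return self.replicas[name]
        transform, inputs = self.recipes[name]     # otherwise derive it
        value = transform(*[self.request(i) for i in inputs])
        self.replicas[name] = value                # register the new replica
        return value

# Example: a "shape catalog" derived from a raw image via two steps.
catalog = VirtualDataCatalog()
catalog.add_replica("raw_image", [1, 2, 3, 4])
catalog.add_recipe("calibrated_image", lambda img: [2 * x for x in img], ["raw_image"])
catalog.add_recipe("shape_catalog", lambda img: sum(img) / len(img), ["calibrated_image"])
print(catalog.request("shape_catalog"))   # derived on demand, then cached
```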
Virtual data grid technologies will be of immediate benefit to numerous other scientific and engineering
application areas. For example, NSF and NIH fund scores of X-ray crystallography labs that together are
generating Petabytes of molecular structure data each year. Only a small fraction of this data is being
shared via existing publication mechanisms. Similar observations can be made concerning long-term seismic data generated by geologists, data synthesized from studies of the human genome database, brain imaging data, output from long-duration, high-resolution climate model simulations, and data produced by
NASA’s Earth Observing System. In order to realize these concepts, GriPhyN is conducting research into
virtual data cataloging, execution planning, execution management, and performance analysis issues (see
Figure 4). The results of this research, and other relevant technologies, are developed and integrated to
form a Virtual Data Toolkit (VDT). Successive VDT releases will be applied and evaluated in the context
of the four partner experiments. VDT 1.0 was released in October 2001 and the next release is expected
early in 2002.
Figure 4: A Petascale Virtual Data Grid, showing different kinds of users (production teams, individual investigators, other users) accessing Grid resources and services: interactive user tools; request planning and scheduling, virtual data, and request execution management tools; and resource management, security and policy, and other Grid services, all operating over distributed resources (code, storage, computers, and networks) and raw data sources.
4.6 European Data Grid
4.7 International Virtual Data Grid Laboratory
5 Common Infrastructure
Given the international nature of the experiments participating in these projects (some of them, like the LHC experiments, are participating in several projects), there is widespread recognition by scientists in these projects of the importance of developing common protocols and tools to enable inter-Grid operation and avoid costly duplication. Scientists from several of these projects are already working together on these coordination and common-architecture issues.
6 Summary
Data Grid technologies embody entirely new approaches to the analysis of large data collections, in which
the resources of an entire scientific community are brought to bear on the analysis and discovery process,
and data products are made available to all community members, regardless of location. Large interdisciplinary efforts recently funded in the U.S. and EU have begun research and development of the basic technologies required to create working Data Grids. Over the coming years, they will deploy, evaluate, and
optimize Data Grid technologies on a production scale, and integrate them into production applications.
The experience gained with these new information infrastructures, providing transparent managed access to
massive distributed data collections, will be applicable to large-scale data-intensive problems in a wide
spectrum of scientific and engineering disciplines, and eventually in industry and commerce. Such systems
will be needed in the coming decades as a central element of our information-based society.
References
1 Formerly known by the name “Computational Grid”, the term “Grid” reflects the fact that the resources
to be shared may be quite heterogeneous and have little to do with computing per se.
2 I. Foster, C. Kesselman, S. Tuecke, “The Anatomy of the Grid: Enabling Scalable Virtual Organizations”, International Journal of High Performance Computing Applications, 15(3), 200-222, 2001,
http://www.globus.org/anatomy.pdf.
3 PPDG home page, http://www.ppdg.net/.
4 Earth System Grid homepage, http://www.earthsystemgrid.org/.
5 GriPhyN Project home page, http://www.griphyn.org/.
6 TeraGrid home page, http://www.teragrid.org/.
7 International Virtual Data Grid Laboratory home page, http://www.ivdgl.org/.
8 Globus home page, http://www.globus.org/.
9 Livny, M. High-Throughput Resource Management. In Foster, I. and Kesselman, C. eds. The Grid:
Blueprint for a New Computing Infrastructure, Morgan Kaufmann, 1999, 311-337.
10 Moore, R., Baru, C., Marciano, R., Rajasekar, A. and Wan, M. Data-Intensive Computing, in Foster, I.
and Kesselman, C. eds. The Grid: Blueprint for a New Computing Infrastructure, Morgan Kaufmann,
1999, 105-129.
11 Johnston, W.E., Gannon, D. and Nitzberg, B., Grids as Production Computing Environments: The Engineering Aspects of NASA's Information Power Grid. In Proc. 8th IEEE Symposium on High Performance Distributed Computing, 1999, IEEE Press.
12 Stevens, R., Woodward, P., DeFanti, T. and Catlett, C. From the I-WAY to the National Technology
Grid. Communications of the ACM, 40(11):50-61. 1997.
13 Information Power Grid home page, http://www.ipg.nasa.gov/.
14 Beiriger, J., Johnson, W., Bivens, H., Humphreys, S. and Rhea, R., Constructing the ASCI Grid. In
Proc. 9th IEEE Symposium on High Performance Distributed Computing, 2000, IEEE Press.
15 NCSA home page at http://www.ncsa.edu/.
16 These applications are described more fully at http://www.teragrid.org/about_faq.html.
17 The ATLAS Experiment, A Toroidal LHC ApparatuS, http://atlasinfo.cern.ch/Atlas/Welcome.html
18 The CMS Experiment, A Compact Muon Solenoid, http://cmsinfo.cern.ch/Welcome.html
19 BaBar, http://www.slac.stanford.edu/BFROOT
20 D0, http://www-d0.fnal.gov/
21 STAR, http://www.star.bnl.gov/
22 Jlab experiments, http://www.jlab.org/
23 SAM, http://d0db.fnal.gov/sam/
24 LIGO home page, http://www.ligo.caltech.edu/.
25 SDSS home page, http://www.sdss.org/.
26 Fischer, P., McKay, T.A., Sheldon, E., Connolly, A., Stebbins, A. and the SDSS collaboration: Weak
Lensing with SDSS Commissioning Data: The Galaxy-Mass Correlation Function To 1/h Mpc, Astron.J. in press, 2000.
27 Luppino, G.A. and Kaiser, N., Ap.J. 475, 20, 1997.
28 Tyson, J.A., Kochanek, C. and Dell’Antonio, I.P., Ap.J., 498, L107, 1998.