The DOSAR VO Statement of Work

Introduction
DOSAR (Distributed Organization for Scientific and Academic Research) is a 'grassroots' grid organization which evolved from the Southern Analysis Region consortium of
D0 institutions as that group expanded beyond the realms of both D0 and the southern
region of the US.
DOSAR is involved in many aspects of 'community' grid activities, such as deploying
desktop computer clusters and generic computing resources for grid computing. Ongoing
grid computing activities and development at DOSAR institutions are listed below by
institution. This work has been supported by leftover funds and carried out in the spare
research time of the researchers involved. Following each description of institutional OSG-related
activities is a section describing needs for additional funding.
Institution: Louisiana Tech University
Activities: Integration Testbed and Production
At Louisiana Tech University, both Integration Testbed and Production work are
conducted. The Integration work is spearheaded by computer scientist Box Leangsuksun,
a well-known expert in the area of High Availability grid computing. Leangsuksun,
assisted by physicist Dick Greenwood and their students, is applying techniques
successfully developed for the HA-OSCAR software to the OSG software stack. In order
to eliminate single points of failure, the following features will be incorporated in the OSG
software: self-healing, failure detection, recovery, automatic fail-over, and automatic
fail-back. The production work, directed by Dick Greenwood, presently includes
SAMGrid-OSG MC Production at LTU and at the Center for Computation and
Technology (CCT) at Louisiana State University (LSU) via the Louisiana Optical
Network Initiative (LONI). D0 data reprocessing began in January 2007 on
250 new Dell processors on the LONI grid using SAMGrid-OSG, and ATLAS production
will begin on the LONI grid later in spring 2007 using the OSG stack.
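The high-availability features listed above (failure detection, automatic fail-over, and automatic fail-back) can be illustrated with a minimal sketch. This is not the HA-OSCAR implementation; the class name `FailoverMonitor`, the `max_misses` threshold, and the simulated health checks are all hypothetical and assume a simple primary/standby pair of head nodes.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class FailoverMonitor:
    """Toy monitor for a hypothetical primary/standby pair of head nodes."""
    check_primary: Callable[[], bool]  # returns True when the primary is healthy
    max_misses: int = 2                # assumed: failed checks before fail-over
    active: str = "primary"
    misses: int = 0
    log: List[str] = field(default_factory=list)

    def tick(self) -> str:
        """Run one health check and return the node that should be serving."""
        healthy = self.check_primary()
        if self.active == "primary":
            if healthy:
                self.misses = 0
            else:
                self.misses += 1                 # failure detection
                if self.misses >= self.max_misses:
                    self.active = "standby"      # automatic fail-over
                    self.log.append("fail-over")
        elif healthy:
            self.active = "primary"              # automatic fail-back
            self.misses = 0
            self.log.append("fail-back")
        return self.active


def simulate(health_sequence: List[bool]):
    """Feed a list of simulated health-check results through the monitor."""
    results = iter(health_sequence)
    monitor = FailoverMonitor(check_primary=lambda: next(results))
    states = [monitor.tick() for _ in health_sequence]
    return states, monitor.log
```

Requiring several consecutive missed checks before failing over is one common way to avoid switching nodes on a single transient error; a real deployment would replace the simulated check with an actual service probe.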
Barriers to Success:
For the production work, the institution has relied on a large fraction of the time of the
departmental system administrator. The institution requires additional technical and
manpower support for these activities, which could be fulfilled with support for two
graduate students at the level of $15k/year each, or by a dedicated grid system manager
at the level of $45k/year. In addition, participation in OSG-related activities has resulted
in additional travel demands beyond those covered by the institutional PI's base grant. We
estimate these additional travel expenses to be around $5k/year.
Institution: Sao Paulo
Activities: Tier 2 Production and D0 Reprocessing
Another of the D0 reprocessing sites employing SAMGrid-OSG is SPRACE, the
grid computing center at Sao Paulo, Brazil. SPRACE is an active OSG CMS Tier 2
center associated with the FNAL Tier 1.
Barriers to Success:
Institution: Iowa State University
Activities: Production and Cluster Development
At Iowa State University (Jim Cochran and Charles Zaruba), a cluster is being developed
to do ATLAS production. Also, there is an ongoing effort to install MS Virtual Server
(similar to VMware) on about 50 departmental Windows machines to allow Linux batch
operation under Condor when these units are otherwise idle.
Barriers to Success:
Institution: UT-Arlington
Activities: Tier 2 Production and Monitoring Software Development
At UT Arlington, Kaushik De is co-leading the production software development for
ATLAS, and Jae Yu is directing an effort to produce MonALISA-based monitoring for
ATLAS distributed analysis and for data handling, in addition to the
local replica service in data handling for distributed analyses. UT Arlington and the
University of Oklahoma (OU) are co-hosts of the ATLAS Southwest Tier 2 computing
center. Jae Yu is also the associate director of the HiPCAT organization in the state of
Texas, through which he is involved in broader, interdisciplinary grid computing activities as
part of the DOSAR missions.
Barriers to Success:
A $50k/year level of support for a highly qualified CSE Ph.D. student is needed to
investigate and specify a longer-term development project for a non-invasive system for
running Linux applications on Windows.
Institution: The University of Oklahoma
Activities: OSG Integration and Production, ATLAS Tier 2 Production,
D0 Production, Campus/Community Grid, and Cyber Infrastructure
(CI)
At OU, a Condor pool has been created that currently consists of about 200 student lab
PCs, but is scheduled to increase to 750 by the spring. It has the OSG middleware stack
installed, and is already being used by OSG VOs (including DOSAR and D0) and local
OU users. Science outside High Energy Physics being conducted on that Condor pool
includes materials science (nanotechnology) and data storage research (data encoding for
storage media). The project recently received additional funding from a CI-TEAM grant, which
will enable us to incorporate the existing campus Condor pool more efficiently
into both research and education. The OUHEP group has also created a Condor pool
using desktop computers dedicated to High Energy Physics, which is being used for OSG
integration work, as well as both ATLAS and D0 production, and is currently also
serving as a data storage cache for the SAMGrid-OSG interface for the entire D0
collaboration.
Barriers to Success:
In order to continue all grid computing work at OU successfully,
we need at least one more full-time person to work on this,
at a level of $45k/year (?), plus additional travel funds of $5k/year
to attend various grid computing and software meetings.