
Description of Work
Cloud services for science e-Infrastructure in 2011
OBJECTIVE
The main objectives are to implement and deploy cloud computing and cloud storage
services for Norwegian and Swedish research groups and to disseminate knowledge that
enables the participating centers and end-users to build competence on supporting and
maintaining cloud-based services.
1. Background
In 2010, UNINETT Sigma and SNIC participated in the NDGF project “NEON”, a fully
Nordic collaboration coordinated by NDGF through SNIC/KTH, to assess the feasibility,
ease of use and cost efficiency of cloud computing technology in the Nordic realm. The
findings of this project are:
• Private cloud technology is not yet mature enough to provide a transparent user
experience. It is not expected to reach that maturity until mid-2012 at the earliest.
• Public cloud technology is mature enough but lacks certain features that will be
necessary in order to include cloud resources in a transparent manner in a national
infrastructure (e.g. quota management)
• Public clouds are competitive on price at the low end, i.e. for non-HPC jobs (low
memory, low number of cores).
• A significant fraction (ca. 20%) of the jobs running on the current supercomputer
infrastructure is potentially suitable for cloud environments. This holds in particular
for single-threaded or single-node jobs with small/medium memory requirements.
• There is sometimes a backlog of “real” supercomputer jobs that suffers from the
non-HPC jobs on the supercomputer infrastructure.
• Available storage capacity is not accessible in a user friendly way; most storage
clouds are only accessible via programmable interfaces.
A pilot “cloud-backed storage” service was developed as part of Notur/NorStore’s
contribution to the NEON project. This service gives users access to cloud storage via
either a web interface or network drive (using the WebDAV protocol). The storage can
have a quota per user, and can use multiple storage back-ends. In 2010, an Amazon S3
and a local disk back-end were developed. The aim is to provide end-users a
low-threshold interface to cloud storage resources to manage and share scientific data and
eventually build community-specific services with the assistance of selected user groups.
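The combination of a per-user quota and multiple storage back-ends can be illustrated with a minimal Python sketch. It is not the code of the pilot service: the names QuotaStore, InMemoryBackend and QuotaExceeded, the single quota shared by all users, and the in-memory back-end standing in for local disk or Amazon S3 are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


class QuotaExceeded(Exception):
    """Raised when a write would push a user past the quota."""


class InMemoryBackend:
    """Hypothetical stand-in for a real back-end (local disk, Amazon S3)."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = data

    def get(self, key):
        return self._objects[key]


@dataclass
class QuotaStore:
    backends: dict      # back-end name -> back-end object
    quota_bytes: int    # assumption: one quota value applies to every user
    usage: dict = field(default_factory=dict)  # user -> bytes stored
    sizes: dict = field(default_factory=dict)  # (backend, user, path) -> bytes

    def put(self, user, path, data, backend="local"):
        # An overwrite replaces the old object, so its size is subtracted first.
        old_size = self.sizes.get((backend, user, path), 0)
        new_total = self.usage.get(user, 0) - old_size + len(data)
        if new_total > self.quota_bytes:
            raise QuotaExceeded(f"{user}: {new_total} > {self.quota_bytes} bytes")
        self.backends[backend].put(f"{user}/{path}", data)
        self.sizes[(backend, user, path)] = len(data)
        self.usage[user] = new_total

    def get(self, user, path, backend="local"):
        return self.backends[backend].get(f"{user}/{path}")
```

In this sketch the quota is counted across all back-ends together, so a user cannot exceed it by spreading data over several providers. Whether the pilot service counts usage this way is not stated in this document.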
SNIC established a multi-site private cloud (Eucalyptus) that is API compatible with the
Amazon public cloud. The objective was to monitor the maturity of private cloud solutions
to determine their suitability as a low-cost alternative once they reach the same maturity
level as public clouds like Amazon.
2. Strategy
In the proposed activity, the national computing infrastructures in Norway and Sweden will
build on the results of the NEON activity in 2010. The infrastructures will investigate the
migration of scientific tasks from Notur/SNIC resources to cloud environments in 2011. As
the public cloud has a low threshold for entry-level user groups and communities, another
aim is to reach user groups and communities for which the compute and storage
resources in the present national e-Infrastructures are difficult to use.
To achieve these objectives, the national infrastructures will be offering cloud computing
services and storage services on a small to medium size scale in a public cloud. Both
computing and storage services will be gradually made available to a selected group of
end users, following the recommendations of the NEON project. Notur and SNIC will work
closely with a limited number of projects to get the services deployed on top of the public
cloud. This will lead to organizational and operational experience. It should also bootstrap
and organize user communities that are able to share computational and data cloud
environments.
The focus in the proposed activity lies strongly on providing end users a low-threshold
interface to both cloud computing and cloud storage.
This is achieved by:
1. Funding: UNINETT Sigma and SNIC cover the cost of using the (public) cloud
resources
2. User support: partial funding is provided for advanced user (application) support
for selected user communities
3.
4. Dissemination and training: workshops are held to train the participating centers
and help end-user projects get started (e.g., hands-on boot camps)
5. Repositories: build a catalogue of default configured cloud resources (e.g.
standard virtual machines) for the national infrastructure
By the end of the project, research projects will be able to request access to cloud resources
and application support by applying through the regular Notur and SNIC channels.
Partners that are interested in participating in deploying the cloud services should have the
necessary skills to support end users on the provided cloud services.
3. Activities and milestones
Common to the work packages in this section are user and project management, billing
and accounting: investigating and implementing transparent methods to monitor and
handle costs for research groups. This includes the national and cross-national view, i.e.
how to ensure that the project allocations from the national infrastructure are mapped
onto contracts (cf. current consolidated billing in Amazon Web Services).
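As an illustration of the accounting described above, the following Python sketch groups usage costs by project and flags projects that have spent more than their allocation. The record layout (the "project" and "cost_eur" fields), the function names and the project identifiers in the usage example are hypothetical; they do not describe the export format of any cloud provider or of the national administration systems.

```python
from collections import defaultdict
from decimal import Decimal


def cost_per_project(usage_records):
    """Sum the cost in euros of all usage records, grouped by project."""
    totals = defaultdict(Decimal)  # Decimal() starts each project at 0
    for record in usage_records:
        totals[record["project"]] += Decimal(record["cost_eur"])
    return dict(totals)


def over_budget(totals, allocations):
    """Return the projects whose spend exceeds their allocation.

    A project without an allocation is treated as having a budget of 0.
    """
    return sorted(
        project
        for project, spent in totals.items()
        if spent > Decimal(allocations.get(project, "0"))
    )
```

A possible use, with made-up figures:

```python
records = [
    {"project": "project-a", "cost_eur": "12.50"},
    {"project": "project-b", "cost_eur": "5.25"},
    {"project": "project-a", "cost_eur": "7.50"},
]
totals = cost_per_project(records)
flagged = over_budget(totals, {"project-a": "15.00", "project-b": "10.00"})
```

Amounts are kept as Decimal values rather than floats so that sums of euro amounts do not pick up rounding errors.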
3.1 Cloud computing
To start offering a cloud computing service, the following activities are defined:
Work package 1: Cloud computing service setup
Start date: April 1, 2011
Delivery date: July 1, 2011
The objective is to set up a ready-to-use public cloud computing service that can be used
by research projects and communities. This needs to be prepared thoroughly to provide a
smooth user experience from the start. The allocation of cloud services should be offered
to users of the national infrastructure in a transparent manner (and at the same level as
existing services in the national infrastructure). Any services rendered in the "cloud" should
thus be subject to the same accounting, user and project management policies. The user
management and accounting regime from the national infrastructure must therefore be
interfaced with the cloud service provider, thereby using the existing administration
systems as much as possible.
Activities include:
• Set up access management, i.e. cloud service provider set up with the current
administrative management tools being used by the national infrastructures.
• Identify and engage communities that are candidates to run on a public computing
cloud. Select users/communities with cloud-friendly applications (e.g., life sciences).
• Establish a support process that guides users to the most effective support
resource, ranging from public support forums (e.g. Amazon) to national
infrastructure support services.
• Transfer support knowledge to partners (e.g. site operators) via documentation and
workshops; organize a workshop on how to use the available cloud technologies.
Participation is mandatory for partners that wish to deploy the cloud support
services locally.
Result: After this work package, all preparations have been done for a selected group of
end users to start applying for (and use) a public cloud service through the national
infrastructure.
Work package 2: Deploy cloud computing service
Start date: July 1, 2011
Delivery date: November 1, 2011
The objective is to deploy the computing service from Work package 1 in the national
infrastructure. Users can apply for cloud resources via the normal national allocation
procedures. The aim is to train at least one person from each participating centre on
providing support for the cloud computing service. These people should be able to support
(local) users on the cloud towards the end of the project.
Activities include:
• End user documentation for using the cloud computing service
• Raise awareness of the available cloud computing services (and their advantages,
e.g., immediate availability, no scheduling, low-threshold interfaces)
• Implement and fine-tune the support and management process for users and
communities of the cloud computing service
• Build and maintain a catalogue of standard virtual machine images for use in
various “types” of research projects. These images are selected based on user
community input.
• Evaluate and forecast actual usage and cost of the cloud computing service
• Maintain a private cloud solution that is compatible with the public cloud service (for
cost comparisons).
Result: The result of this work package is a public cloud computing service with an
implemented user support structure. The computing service is integrated in the national
infrastructure and includes a repository of pre-configured resources. The
participating partners are capable of supporting local users.
3.2 Cloud backed storage
Work package 3: Personal scientific storage
Start date: Feb 15, 2011
Delivery date: July 1, 2011.
The objective of this work package is to provide researchers with a personal cloud storage
service for managing scientific data. The work builds on the results of the NEON activity in
2010 and consists of deploying and polishing the end-user (web) interface and integrating
the service with a national identity provider (IdP). The metadata “tagging” system will be
enhanced.
The service will initially be deployed in the Amazon cloud, though the design must ensure
that other cloud service providers can be added with reasonable effort. The service will
have a WebDAV interface.
Activities include:
• Design and integrate the web interface with a national IdP.
• Design and implement support for the WebDAV interface, i.e., “WebDAV disk
access”.
• Design and implement support for a metadata system
• Improve user-friendliness of the web interface
• Test and document use cases for the cloud-backed storage
• Document and publish software as open source; bootstrap an open source
community around the software
• End user documentation
• Deploy an initial version on the public cloud
• Operate the initial version
Result: Users will be able to use this cloud service as their personal scientific storage area
via a drive interface and web interface, with the option of tagging data and making data
publicly available. The interface provides the essential functionality for users to
manage data. The design and use of personal spaces are properly documented both for
developers and users.
Work package 4: Shared spaces for community building
Start date: July 1, 2011
Delivery date: November 1, 2011
Users that have access to the personal cloud storage service must be able to create
shared storage spaces (much like project areas) with other users that have access to the
service. These shared spaces are set up and configured via the available web interface.
After initial set up, they are also available via the disk interface (WebDAV).
Activities:
• Design and implement support for users to create shared spaces
• Design and implement support for sharing of existing files and folders with full
access control
• Test and document use cases for shared spaces
• Deploy the version developed in this work package on the (public) cloud
• Deploy the version developed in this work package in the national infrastructure for
scientific data.
• Operate the deployed versions
Result: Users will be able to use the cloud-backed storage to create shared spaces. The
design and use of shared spaces are properly documented both for developers and users.
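A shared space with access control can be sketched as follows. This is an illustrative model only, not the design of the service: the SharedSpace class, the two permission levels ("read" and "write") and the rule that only the owner may grant access are assumptions.

```python
from dataclasses import dataclass, field

READ, WRITE = "read", "write"


@dataclass
class SharedSpace:
    name: str
    owner: str
    members: dict = field(default_factory=dict)  # user -> READ or WRITE

    def grant(self, acting_user, user, permission):
        """Give `user` access; in this sketch only the owner may do so."""
        if acting_user != self.owner:
            raise PermissionError(f"{acting_user} does not own {self.name}")
        if permission not in (READ, WRITE):
            raise ValueError(f"unknown permission: {permission}")
        self.members[user] = permission

    def can_read(self, user):
        # Any member can read, whichever permission they hold.
        return user == self.owner or user in self.members

    def can_write(self, user):
        return user == self.owner or self.members.get(user) == WRITE
```

The same check would apply regardless of whether a request arrives through the web interface or the WebDAV disk interface, which is why the sketch keeps the permissions on the space itself rather than in either interface.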
Work package 5: Deploying shared spaces for key user communities
Start date: May 1, 2011
Delivery date: December 15, 2011.
Work packages 3 and 4 provide general tools for storing and sharing data via a cloud
backed storage service. This work package focuses on adapting these general services to
specific user communities. In this work package, we define the (minimum) metadata that is
considered mandatory for specific user communities to document their (shared) data
collections, as well as any additional metadata that is required to integrate the data in the
national infrastructures for scientific data and computation. This activity is done in
collaboration with key user communities that help in defining the metadata requirements
(e.g., life sciences, climate, linguistics).
Activities:
• Engage selected user communities; define relevant test cases in collaboration with
these communities
• Define the minimum required metadata per community
• Adapt the cloud backed storage service to provide and require the specified
metadata per community; a user-friendly interface is built for managing and
querying the metadata.
• Transfer support knowledge to partners (e.g. site operators) via documentation and
workshops. The aim is to train at least one person from each centre on the cloud
backed storage service. These people should be able to support (local) users on the
cloud towards the end of the project
Result: A number of shared spaces are made available in the national infrastructure,
tailored to specific user communities. The whole service is properly documented, both for
developers and users.
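The idea of a mandatory minimum set of metadata per community can be sketched as a simple validation step. The community names are the examples given above; the field names ("title", "owner", "organism", "time_range", "language") are purely hypothetical placeholders, since the actual minimum metadata is to be defined together with the communities.

```python
# Hypothetical minimum metadata per community; the real sets are to be
# defined in collaboration with the user communities.
REQUIRED_METADATA = {
    "life sciences": {"title", "owner", "organism"},
    "climate": {"title", "owner", "time_range"},
    "linguistics": {"title", "owner", "language"},
}


def missing_metadata(community, metadata):
    """Return the required fields that are absent or empty, sorted by name."""
    required = REQUIRED_METADATA[community]
    present = {
        key for key, value in metadata.items() if value not in (None, "")
    }
    return sorted(required - present)
```

A storage service could refuse to publish a data collection to a community's shared space until this function returns an empty list for it.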
4. Resources and timeline
The project starts in January 2011 and runs for one year, until December 31, 2011. The
project has 3 FTEs (1.5 each from SE and NO). In addition, the project has 0.2 FTE
for coordination (0.1 per country). Operational resources are taken into account. Costs of
using public cloud resources are specified in euros.
4.1 Resources for the project
Manpower:
Country | FTE | Name
NO | 1.0 | Maarten Koopmans (coordination), Andreas Bach, NN
SE | 1.0 | To be determined
Public cloud costs:
The budget for using the public cloud for the described services.
Country | Service | Cost
NO | Cloud computing service |
NO | Cloud backed storage |
SE | Cloud computing service |
SE | Cloud backed storage |
4.2 Work groups
Group 1 | Work packages 1 and 2 | Maarten Koopmans, Andreas Bach, NN, To be determined
Group 2 | Work packages 3, 4 and 5 | Maarten Koopmans, Andreas Bach, NN, To be determined
4.3 Time plan
Deliverable | Description | Due date | Group
1 | Support procedures cloud computing, provisioning | May 1 | 1
2 | Initial launch computing service | July 1 | 1
3 | Transfer knowledge computing service | September 1 | 1
4 | Catalog of shared cloud images | November 1 | 1
5 | Deploy test version cloud backed storage as personal scientific drive | April 15 | 2
6 | Deploy cloud backed storage as personal scientific drive | July 1 | 2
7 | Deploy test version cloud backed storage with shared spaces | September 15 | 2
8 | Deploy cloud backed storage with shared spaces | November 1 | 2
9 | Identify and engage user communities for cloud backed storage | July 1 | 2
10 | Deploy community specific metadata on cloud-backed storage | December 15 | 2
5. Risks and measures
Risk | Impact | Measure
Cost and quota for services | Medium | Active monitoring; start with a small set of projects.
Cloud backed storage for end users is new technology | Medium | Phased roll out to gain operational experience
Little enthusiasm end users | Low | Advertise and promote services
Too much enthusiasm end users | High | Use as advertising;
Conservative sites/operators | Medium | Most universities already work on clouds in their curriculum in their comp. science dept. (and some other places as well). Activate those users directly
Demand for Windows support | Low | Accept at most one pilot to bootstrap this for roadmap 2012 and beyond
Upon success tender required for continuation | Medium | Evaluate in Aug/Sept timeframe. If needed, start tender process, maybe collaborative.