What is Grid Computing System - Department of Computer Science

Grid Computing Systems
UMBC CMSC 621 Fall 07 Report
Fuesane Cheng
What is Grid Computing System
• Coordinated resource sharing and problem solving in dynamic,
multi-institutional virtual organizations (VOs).
• Virtualization of distributed computing and data resources such as
processing, network bandwidth and storage capacity to create a
single system image
• Individual users can access computers and data transparently,
without having to consider location, operating system, account
administration, and other details. Users essentially see a single,
large virtual computer
• Based on an open set of standards and protocols, e.g., Open Grid
Services Architecture (OGSA) that enable communication across
heterogeneous, geographically dispersed environments.
• With grid computing, organizations can optimize computing and data
resources, pool them for large capacity workloads, share them
across networks and enable collaboration
• Creates a "virtual supercomputer" by using
– spare computing resources within an organization.
– a network of geographically dispersed computers
Grid Computing System
[Figure: diagram labeled "Grid Computing Layer", "INTERNET", "Grid Computing Layer"]
Types of Grid Computing Systems
• The Heavy-weight, feature-rich systems that tend to concern
themselves primarily with providing access to large-scale, intra- and
inter-institutional resources such as clusters or multiprocessors. Grid
systems developed using the Globus Toolkit are examples of this
class.
• The Desktop Grid, in which cycles are scavenged from idle desktop
computers. The Berkeley Open Infrastructure for Network
Computing (BOINC), a descendant of the SETI@home project, is an
example of middleware for public Desktop Grid computing, as it
harnesses resources that exist outside of institutional control.
• The hybrid system, which allows BOINC- and Globus-based Grid systems
to interoperate and thus provides a means for Globus-based computational
Grids to incorporate a much greater range of resources.
– By decreasing the startup cost for new Desktop Grid computing
projects, it makes Desktop Grids a viable option for a broader
range of projects, and it provides Desktop Grids with features
inherent in Globus (e.g., authentication, authorization, file
transfer).
The Anatomy of the Grid
Layered Grid Architecture
Fabric: Interfaces to Local Control
• Provide shared access to resources (e.g., computational, storage, catalogs,
network, repository)
• Implement the local, resource-specific operations on specific resources
(physical or logical) to allow sharing operations at higher levels.
• There is a trade-off between requesting richer Fabric functionality and
simplifying Grid infrastructure deployment. For example:
– Advance reservation makes it possible for higher-level services to aggregate
(coschedule) resources in interesting ways that would otherwise be impossible to
achieve
– However, as in practice few resources support advance reservation “out of the
box,” a requirement for advance reservation increases the cost of incorporating
new resources into a Grid
• At minimum, resources should implement enquiry and resource
management mechanisms
• Sample resource capabilities:
– Computational: starting, monitoring, controlling the execution of processes, and
management, advance reservation, and enquiry functions
– Storage: Putting and getting files, remote data selection and reduction, and
management, advance reservation, and enquiry functions
– Network: Control over the resources for network transfers (e.g., prioritization,
reservation), Enquiry functions to determine network characteristics and load.
– Code Repository: Managing versioned source and object code
– Catalogs: catalog query and update operations on databases
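The minimum Fabric-layer contract described above (enquiry plus resource management, with advance reservation as an optional richer feature) might be sketched as a small interface. This is only an illustration: the names `FabricResource`, `enquire`, `start`, `terminate`, and `reserve` are hypothetical and do not come from any Grid toolkit.

```python
from dataclasses import dataclass, field


class ReservationUnsupported(Exception):
    """Raised by resources that cannot reserve capacity in advance."""


@dataclass
class FabricResource:
    """Hypothetical Fabric-layer wrapper around one local compute resource."""
    name: str
    total_cpus: int
    supports_reservation: bool = False
    running: dict = field(default_factory=dict)       # job id -> CPUs in use
    reservations: list = field(default_factory=list)  # (start, end, cpus)
    _next_id: int = 0

    def enquire(self) -> dict:
        """Enquiry: report the structure and current state of the resource."""
        used = sum(self.running.values())
        return {"name": self.name,
                "total_cpus": self.total_cpus,
                "free_cpus": self.total_cpus - used,
                "supports_reservation": self.supports_reservation}

    def start(self, cpus: int) -> int:
        """Management: start a job if enough CPUs are free; return its id."""
        if cpus > self.enquire()["free_cpus"]:
            raise RuntimeError("not enough free CPUs")
        self._next_id += 1
        self.running[self._next_id] = cpus
        return self._next_id

    def terminate(self, job_id: int) -> None:
        """Management: stop a job and release its CPUs."""
        del self.running[job_id]

    def reserve(self, start: int, end: int, cpus: int) -> None:
        """Optional richer functionality: advance reservation."""
        if not self.supports_reservation:
            raise ReservationUnsupported(self.name)
        # Conservative check: count every reservation overlapping the window.
        booked = sum(c for s, e, c in self.reservations if s < end and start < e)
        if booked + cpus > self.total_cpus:
            raise RuntimeError("slot is fully booked")
        self.reservations.append((start, end, cpus))
```

A resource that leaves `supports_reservation` off can still join the Grid through `enquire`, `start`, and `terminate`, which mirrors the trade-off noted above: requiring `reserve` everywhere would raise the cost of adding new resources.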
Connectivity: Communicating Easily and Securely
• Defines core communication and authentication protocols
• Communication Protocols enable the exchange of data between
Fabric layer resources.
– Internet (IP and ICMP), transport (TCP, UDP), and application (DNS,
OSPF, RSVP, etc.) layers of the Internet layered protocol architecture
– New Protocols
• Authentication protocols build on communication services to provide
cryptographically secure mechanisms for verifying the identity of
users and resources.
– Single sign on: logon once and have access to multiple Grid resources
defined in the Fabric layer
– Delegation: a program authorized by a user to run on the user’s behalf is able to
access the resources that the user is authorized to use. The program may
also be able to delegate a subset of its rights to another program.
– Integration with various local security solutions: must be able to interoperate
with each site’s or resource’s security solutions, such as Kerberos and
Unix security
– User-based trust relationships: different sites or resources are not
required to cooperate or interact with each other in order for an
authorized user to use them at the same time
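The delegation and user-based trust ideas above can be illustrated with a toy credential chain. This is a sketch under strong simplifying assumptions: there is no cryptography at all, and `Credential`, `delegate`, `root_user`, and `authorize` are hypothetical names, not GSI APIs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Credential:
    """Hypothetical stand-in for a signed proxy credential (no real crypto)."""
    subject: str
    rights: frozenset
    issuer: Optional["Credential"] = None

    def delegate(self, to: str, rights: set) -> "Credential":
        """Issue a child credential carrying a subset of this one's rights."""
        if not set(rights) <= self.rights:
            raise PermissionError("cannot delegate rights that are not held")
        return Credential(subject=to, rights=frozenset(rights), issuer=self)

    def root_user(self) -> str:
        """Walk the delegation chain back to the user who signed on."""
        cred = self
        while cred.issuer is not None:
            cred = cred.issuer
        return cred.subject


def authorize(cred: Credential, right: str, site_acl: dict) -> bool:
    """Each site consults only its own policy for the originating user."""
    return right in cred.rights and right in site_acl.get(cred.root_user(), set())
```

The user signs on once to obtain the root credential; programs then act through delegated children. Because `authorize` looks only at the site's own access list, two sites never need to coordinate with each other for the same user to use both.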
Resource: Sharing Single Resources
• Builds on Connectivity layer communication and authentication protocols to
define protocols (and APIs and SDKs) for the secure negotiation, initiation,
monitoring, control, accounting, and payment of sharing operations on
individual resources
• Concerned entirely with individual resources and hence ignores issues of
global state and atomic actions across distributed collections; such issues
are the concern of the Collective layer
• Two primary classes of protocols
– Information protocols are used to obtain information about the structure and state
of a resource, for example, its configuration, current load, and usage policy (e.g.,
cost).
– Management protocols are used to negotiate access to a shared resource by:
• Specifying
– Resource requirements (including advance reservation and quality of service)
– Operation(s) to be performed, such as process creation or data access
• Enforcing resource sharing policy
• Monitoring the status of an operation and controlling (for example, terminating) the
operation.
• The Resource and Connectivity protocol layers form the neck of the
hourglass model, and as such should be limited to a small and focused set that
– captures the fundamental mechanisms of sharing across many different resource
types (for example, different local resource management systems); but
– does not overly constrain the types or performance of higher-level protocols that
may be developed.
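The two protocol classes above can be pictured as message types handled by a single resource endpoint. The sketch below is illustrative only; the message fields (`type`, `cpus`, `op`) and the class `ResourceEndpoint` are assumptions made for this example, not a real Grid wire format.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceEndpoint:
    """Hypothetical endpoint for one resource, speaking two toy protocols."""
    load: float
    cost_per_hour: float
    max_cpus_per_request: int                       # local sharing policy
    operations: dict = field(default_factory=dict)  # operation id -> status

    def handle(self, msg: dict) -> dict:
        kind = msg["type"]
        if kind == "info":         # information protocol: state and policy
            return {"load": self.load, "cost_per_hour": self.cost_per_hour}
        if kind == "create":       # management: negotiate access, enforce policy
            if msg["cpus"] > self.max_cpus_per_request:
                return {"ok": False, "reason": "policy"}
            op_id = len(self.operations) + 1
            self.operations[op_id] = "active"
            return {"ok": True, "op": op_id}
        if kind == "status":       # management: monitor an operation
            return {"status": self.operations[msg["op"]]}
        if kind == "terminate":    # management: control an operation
            self.operations[msg["op"]] = "terminated"
            return {"ok": True}
        raise ValueError(f"unknown message type: {kind}")
```

Note that the endpoint knows nothing about other resources; anything involving global state across several endpoints is left to the Collective layer.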
Collective: Coordinating Multiple Resources
• Contains protocols and services (and APIs and SDKs) that are not
associated with any one specific resource but rather are global in
nature and capture interactions across collections of resources.
• Because Collective components build on the narrow Resource and Connectivity
layer “neck” in the protocol hourglass, they can implement a wide variety of
sharing behaviors without placing new requirements on the resources being
shared
• Service Examples:
– Directory services
– Co-allocation, scheduling, and brokering services
– Monitoring and diagnostics services
– Data replication services
– Grid-enabled programming systems
– Workload management systems and collaboration frameworks
– Software discovery services
– Community authorization servers
– Community accounting and payment services
– Collaboratory services
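As one concrete illustration of a Collective-layer service, a co-allocation broker might combine the free capacity that individual resources report and split a request across them. The function below is a minimal greedy sketch; `co_allocate` and its input format are hypothetical, and a real broker would also weigh policy, cost, and reservations.

```python
from typing import Optional


def co_allocate(free_cpus: dict, needed: int) -> Optional[dict]:
    """Greedy co-allocation: spread one request over several resources.

    free_cpus maps a resource name to the free CPU count it reported
    through its (assumed) information protocol. Returns a plan mapping
    resource name -> CPUs to request, or None if the collection as a
    whole cannot satisfy the request.
    """
    plan = {}
    remaining = needed
    # Prefer resources with the most free capacity to keep the plan small;
    # ties are broken by name so the result is deterministic.
    for name, free in sorted(free_cpus.items(), key=lambda kv: (-kv[1], kv[0])):
        if remaining == 0:
            break
        take = min(free, remaining)
        if take > 0:
            plan[name] = take
            remaining -= take
    return plan if remaining == 0 else None
```

For example, `co_allocate({"a": 4, "b": 10, "c": 2}, 12)` yields `{"b": 10, "a": 2}`. The broker needs nothing from the resources beyond what the Resource layer already exposes.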
Collective Layer Example
Collective and Resource layer protocols, services, APIs, and SDKs can be combined
in a variety of ways to deliver functionality to applications
Applications
• Applications are constructed in terms of, and by calling
upon, services defined at any layer
• At each layer, well-defined protocols provide access
to some useful service such as resource management,
data access, resource discovery, and so forth
• At each layer, APIs may also be defined whose
implementations (ideally provided by third-party SDKs)
exchange protocol messages with the appropriate
service(s) to perform desired actions
• APIs are implemented by software development kits
(SDKs), which in turn use Grid protocols to interact with
network services that provide capabilities to the end user
• Higher level SDKs can provide functionality that is not
directly mapped to a specific protocol, but may combine
protocol operations with calls to additional APIs as well
as implement local functionality
Application Programmer’s View of Grid Architecture
Solid lines represent direct calls; dashed lines represent protocol interactions
“On the Grid”: The Need for Intergrid Protocols
• Currently, it is quite feasible to define multiple
instantiations of key Grid architecture elements
• Grids constructed with these different protocols are not
interoperable and cannot share essential services
• Long-term success of Grid computing requires selecting
and achieving widespread deployment of one set of
protocols at the Connectivity and Resource layers and,
to a lesser extent, at the Collective layer
• These Intergrid protocols enable different organizations
to interoperate and exchange or share resources.
• Resources that speak these protocols can be said to be
“on the Grid.” Standard APIs are also highly useful if Grid
code is to be shared.
Relationships with Other Technologies
• Web Service and SOA
– The ubiquity of Web technologies (i.e., IETF and W3C standard
protocols—TCP/IP, HTTP, SOAP, etc.—and languages, such as HTML
and XML) makes them attractive as a platform for constructing VO Grid
systems and applications
– With the emergence of SOA standards for Web Services, Grids are just
another, but important, service capability being provided
– They do an excellent job of supporting the browser-client-to-web-server
interactions, but lack features required for the richer interaction models
that occur in VOs. E.g. TLS vs. Single sign-on
– Grid Security Infrastructure (GSI) extensions to TLS with delegation
capabilities would permit a browser client to delegate capabilities to a
Web server so that the server could act on the client’s behalf
• Application and Storage Service Providers
– Application service providers (ASPs), storage service providers (SSPs),
and hosting companies offer outsourcing services for specific business
and engineering applications and storage capabilities by service level
agreement that defines access to a specific combination of hardware
and software
– Security, dynamic reconfiguration of resources, and load sharing across
providers are challenging and currently rarely attempted
Relationships with Other Technologies (2)
• Enterprise Computing Systems
– Technologies such as CORBA, EJB, J2EE, and DCOM are all systems
designed to enable the construction of distributed applications
– Standard resource interfaces, remote invocation mechanisms, and
services discovery make it easy to share resources within a single
organization
– However, sharing arrangements are relatively static and restricted to
occur within a single organization primarily in client-server form rather
than the coordinated use of multiple resources
– Integration with Grid protocols provides enhanced capability and enables
interoperability, for example:
• A CORBA ORB that uses GSI mechanisms to address cross-organizational
security issues
• A Portable Object Adapter that speaks the Grid resource management
protocol to access resources spread across a VO
• Grid-enabled Naming and Trading services that use Grid information service
protocols to query information sources distributed across large VOs
• Internet and Peer-to-Peer Computing
– Peer-to-peer computing and Internet computing are examples of the
more general (“beyond client-server”) sharing modalities and
computational structures, and have much in common with Grid technologies
– But they need to shift focus from vertical solutions to shared infrastructure and
interoperability
Protocols and Standards for Web Services
Copied from “Service Oriented Computing,” Munindar Singh and Michael Huhns
Globus Toolkit
• Grid Computing Layer (Middleware) development toolkit which has
been developed since the late 1990s to support the development of
service-oriented distributed computing applications and
infrastructures
• An open source software toolkit used for building grids. It is being
developed by the Globus Alliance and many others all over the world.
• Includes software for security, information infrastructure, resource
management, data management, communication, fault detection, and
portability
• Packaged as a set of components that can be used either
independently or together to develop applications.
• Grid Resource Allocation and Management (GRAM) protocol and its
gatekeeper (factory) service; these provide for the secure and reliable
creation and management of arbitrary computations, termed transient
service instances. A two-phase commit protocol is used for reliable
invocation
• Grid Security Infrastructure (GSI), which supports single sign on,
delegation, and credential mapping
• Meta Directory Service (MDS-2), which provides for information
discovery through soft-state registration, data modeling, and a local
registry
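Soft-state registration, mentioned above for information discovery, means that a registry entry survives only while the resource keeps refreshing it, so crashed or departed resources disappear without an explicit deregistration step. The toy registry below sketches the idea; `SoftStateRegistry` and its methods are hypothetical and do not reflect the MDS-2 interface. Time is passed in explicitly to keep the example deterministic.

```python
class SoftStateRegistry:
    """Toy soft-state registry: entries vanish unless periodically refreshed."""

    def __init__(self, ttl: float):
        self.ttl = ttl
        self._entries = {}  # name -> (description, expiry time)

    def register(self, name: str, description: dict, now: float) -> None:
        """Register or refresh an entry; both use the same message."""
        self._entries[name] = (description, now + self.ttl)

    def lookup(self, now: float) -> dict:
        """Drop expired entries, then return those still alive."""
        self._entries = {n: (d, exp) for n, (d, exp) in self._entries.items()
                         if exp > now}
        return {n: d for n, (d, _) in self._entries.items()}
```

With a time-to-live of 30, a resource registered at time 0 and refreshed at time 25 is still listed at time 40, while one that registered at time 0 and then went silent is not.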
Globus Toolkit Components
Selected GT4 Components and Interactions
Shaded boxes are GT4 code and white boxes are user code
Globus Architecture
• As shown in the previous figure, the GT4 architecture comprises three sets of
components
– A set of Service Implementations
• Execution management (GRAM)
• Data access and movement (GridFTP, RFT, OGSA-DAI)
• Replica management (RLS, DRS)
• Monitoring and discovery (Index, Trigger, WebMDS)
• Credential management (MyProxy, Delegation, SimpleCA)
• Instrument management (GTCP)
• Most are Java Web Services, but some are in other languages and/or use
other protocols
– Three Containers
• Used to host user-developed services written in Java, Python, and C
respectively
• Provide implementations of security, management, discovery, state
management, and other mechanisms frequently required when building
services
• Extend open source service hosting environments with support for useful
Web Service specifications
– A set of Client Libraries
• Allow client programs to invoke operations on both GT4 and user-developed
services with multiple interfaces providing different levels of control: WS-I
SOAP, common security and messaging infrastructure, a powerful and
extensible authorization framework, common WS interfaces and behaviors,
lifetime management of stateful components
Examples of Grid Services
http://lattice.umiacs.umd.edu/gridservices.php
http://www.gridforum.org/documents/GFD.29.pdf
UMBC Planned Grid Connectivity
[Figure: network diagram of UMBC planned grid connectivity. Labels shown: College Park; Bowie; SURAgrid; 12 institutions; Lattice Grid; Bluegrit; National Lambda Rail (NLR); Globus Toolkit/Condor; WebSphere App. Server; Rationale S/W; Lambda Ram; Fiber; Matisse; UMBC HyperWall (6 CPUs/12 screens); 1150 CPUs, including 80 x86 node cluster; 224 XServe blades; +900 CPUs]
SURAgrid Participants (as of April 2006)
[Figure: map of participants. Legend: SURA member; resources on-grid. Institutions shown: Bowie State, GMU, UMD, UMich, UKY, UVA, UArk, GPN, Vanderbilt, ODU, UAH, USC, NCState, OleMiss, TTU, SC, TACC, LSU, ULL, UFL, UAB, TAMU, GSU, Tulane, UNCC]
Lattice Grid
• What it is:
– The Lattice Project is an attempt to effectively share
computational resources among departments and institutions,
starting with those in the University System of Maryland.
– The Grid is focused on computation, and we have not yet made
efforts to enable large-scale data access, storage, or replication.
• Grid Software
– We make heavy use of the Globus Toolkit, which forms the backbone
of our Grid system. It provides mechanisms for job submission,
file transfer, and authentication and authorization of Grid entities,
among other things.
– We have also done extensive work with BOINC, which enables
public participation in the Grid and represents a potentially huge
resource. We have developed software that allows Globus (and
hence our Grid system) to submit jobs to a BOINC pool.
– We work with scheduling software, such as Condor and PBS, that
controls local resources. Such software is being deployed where
it is most appropriate.
UMBC Near Term Bluegrit Design
[Figure: Bluegrit layout. Labels shown: College Park, UMBC Network, 10 Gb, Head node, JS21 Blades, JS20 Blades, Storage]
Hardware:
– 1 Intel-based head node
– 1 Intel-based storage server
– 33 2-processor JS20 blades (2.2 GHz + 0.5 GB)
– 14 4-processor JS21 blades (2.5 GHz + 2 GB)
– 5.4 TB of shared storage
– 1.3 TB of node storage
Operating System:
– Red Hat Enterprise Linux 4
Network:
– 10 Gb external connection to College Park
– 1 Gb Ethernet interconnect
– 100 Mb external connection
UMBC Future Bluegrit Potentials
• 5 available chassis with 70 blade slots
• Add Cell blade architecture for future computing
• Upgrade interconnects between chassis/blades
• Increase RAM availability
• Build out campus Grid
References
1. The Anatomy of the Grid: Enabling Scalable Virtual Organizations. I.
Foster, C. Kesselman, S. Tuecke. International J. Supercomputer
Applications, 15(3), 2001.
2. Service-Oriented Science. I. Foster. Science, vol. 308, May 6, 2005.
3. Globus Toolkit Version 4: Software for Service-Oriented Systems. I.
Foster. IFIP International Conference on Network and Parallel Computing, Springer-Verlag LNCS 3779, pp. 2-13, 2006.
4. Lattice Project http://lattice.umiacs.umd.edu/gridservices.php
5. SURA Grid http://www1.sura.org/3000/3200_ITGridPlan.htm
6. UMBC Multicore Computing Center (MC2)
http://www.umbc.edu/research/blog/2007/08/ibm_gift_to_bring_orchestra_of_1.html