Toward Internet Distributed Computing


By Milan Milenkovic, Scott H. Robinson, Rob C. Knauerhase, David Barkai, Sharad Garg, Vijay Tewari, Todd A. Anderson and Mic Bowman

The Internet we have today was designed to fetch static information from the network and display it for humans to read. The data formats and protocols in place today were not designed for machine-to-machine communication; they require human intervention. However, promising technologies like peer-to-peer (P2P) and grid computing could help the Internet evolve into a distributed computing platform. The Internet could also be transformed into an application-hosting platform, providing an environment similar to the one operating systems provide today. The proposed solution is to 'disaggregate' and 'virtualize' individual system resources as services that can be published, discovered and configured at run time to execute an application. Such a system can be built by combining and extending web services, P2P and grid computing technologies.

Distributed computing offers many advantages: resource sharing, load balancing, flexibility, reliability, availability and fault tolerance. The main force pushing for a distributed Internet is the increasing reliance of e-business on automation. The idea behind a distributed Internet is to virtualize resources such as computational cycles, network connections and storage, and to maintain them in a network pool. Resources can be aggregated at run time to accomplish a task, i.e., to execute an application. The aggregation is temporary: resources return to the pool upon completion of the allocated task. The underlying assumption is the availability of a mechanism that resources use to publish their willingness to collaborate.

The requirements of an Internet distributed environment can be met by creating intelligent, self-configuring networks. Distributed intelligence increases the scalability of the network, whereas the self-organizing feature provides the flexibility to create a network of resources available on an ad hoc basis. The performance improvement can be attributed to the preference for local resources: using local resources reduces latency by shortening the distance between the data source, the processor and the information consumer. This design scheme is evident in wireless networks, e.g., Bluetooth, which tend to consume fringe bandwidth rather than backbone bandwidth. Adding intelligence to the network also helps with load balancing and reduces resource contention. Now let's consider the network services and abstraction layers necessary to host applications on the Internet.
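The preference for nearby resources can be reduced to a simple heuristic: probe each candidate and pick the one with the lowest round-trip time. The sketch below assumes a hypothetical candidate list and uses a timed TCP connect as a crude latency probe; the article does not prescribe any particular measurement.

```python
import socket
import time

def rtt_ms(host, port=80, timeout=1.0):
    """Rough proximity estimate: time a TCP connect to the resource."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.monotonic() - start) * 1000.0
    except OSError:
        return float("inf")  # unreachable resources sort last

# Hypothetical candidates; in IDC these would come from resource discovery.
candidates = ["192.168.1.10", "10.0.0.5", "storage.example.com"]
nearest = min(candidates, key=rtt_ms)
print("preferring", nearest)
```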

Resource virtualization: A network service can be considered a software functional component that abstracts its underlying implementation. A service publishes itself along with an interface through which applications can use its functionality. Virtualization can also be extended to hardware resources, e.g., computational cycles and storage.

Resource Discovery: The ability of an application to find a service based on functionality, characteristics, location and cost.

Dynamic configuration and run-time binding: The capability to bind to a resource at run time as opposed to design or link time. Run-time binding also helps improve load balancing and reliability.

Resource aggregation and orchestration: Refers to the management of resources. The system synchronizes resources much as an operating system does, but at the network level, and releases them back into the shared pool upon successful completion of tasks or sub-tasks. A minimal sketch of this life cycle appears below.
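Taken together, these capabilities suggest a publish/discover/bind/release life cycle. The following in-memory sketch is illustrative only (the class and method names are assumptions, not an API from the article):

```python
from dataclasses import dataclass, field

@dataclass
class Resource:
    name: str
    kind: str                               # e.g. "cpu", "storage", "service"
    attributes: dict = field(default_factory=dict)
    in_use: bool = False

class Registry:
    def __init__(self):
        self._pool = []

    def publish(self, resource):
        """A resource advertises its willingness to collaborate."""
        self._pool.append(resource)

    def discover(self, kind, **required):
        """Find available resources matching a kind and attribute constraints."""
        return [r for r in self._pool
                if r.kind == kind and not r.in_use
                and all(r.attributes.get(k) == v for k, v in required.items())]

    def bind(self, resource):
        """Late binding: claim the resource at run time, not at link time."""
        resource.in_use = True
        return resource

    def release(self, resource):
        """Return the resource to the shared pool when the task completes."""
        resource.in_use = False

registry = Registry()
registry.publish(Resource("node-a", "cpu", {"arch": "x86", "cores": 4}))
candidates = registry.discover("cpu", arch="x86")
cpu = registry.bind(candidates[0])
# ... run the task ...
registry.release(cpu)
```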

To implement IDC, the designers have identified certain technologies to be used as-is or extended. The implementation plan is based on utilizing P2P, grid computing and web services as the building blocks of IDC. Let's take a look at what each of these technologies contributes.

Web services: "Web services are loosely coupled, self-contained components that define and use Internet protocols to describe, publish, discover and invoke other dynamically located web services." Services can advertise and discover one another using the Universal Description, Discovery and Integration (UDDI) specification or the Web Services Inspection Language. For a service to be used by a client, it should be described in a well-structured language such as the Web Services Description Language (WSDL). The service also needs to specify the interface, i.e., SOAP and XML, that a client uses to communicate with it.
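To make the SOAP interaction concrete, here is a minimal hand-rolled invocation sketch. The endpoint, namespace and operation are hypothetical placeholders; a real client would derive them from the service's WSDL description.

```python
import urllib.request

# SOAP 1.1 request envelope for a hypothetical conversion operation.
SOAP_ENVELOPE = """<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <CelsiusToFahrenheit xmlns="http://example.com/convert">
      <Celsius>25</Celsius>
    </CelsiusToFahrenheit>
  </soap:Body>
</soap:Envelope>"""

request = urllib.request.Request(
    "http://example.com/convert",            # hypothetical endpoint
    data=SOAP_ENVELOPE.encode("utf-8"),
    headers={
        "Content-Type": "text/xml; charset=utf-8",
        # SOAPAction names the operation; its value is defined in the WSDL.
        "SOAPAction": "http://example.com/convert/CelsiusToFahrenheit",
    },
)
with urllib.request.urlopen(request) as response:
    print(response.read().decode("utf-8"))   # raw SOAP response envelope
```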

Peer-to-Peer (P2P): This concept is based on utilizing the aggregated computing power of millions of computers. Typical applications include file transfer and information sharing, e.g., Kazaa. P2P provides a distributed environment in which all participating machines contribute to and draw from the resource pool. A P2P solution must overcome all the challenges associated with a heterogeneous collection of computers, such as dynamically assigned IP addresses and network address translation.
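A toy in-process sketch can illustrate the pool idea: each peer both contributes files and queries its neighbors for them. The hop-limited flooding below is a classic P2P search technique (as in early Gnutella), used here as an assumption rather than anything the article specifies.

```python
class Peer:
    def __init__(self, name, files):
        self.name = name
        self.files = set(files)
        self.neighbors = []

    def query(self, filename, ttl=3, seen=None):
        """Flood the query to neighbors until found or the hop limit expires."""
        seen = seen or set()
        seen.add(self.name)
        if filename in self.files:
            return self.name
        if ttl == 0:
            return None
        for peer in self.neighbors:
            if peer.name not in seen:
                hit = peer.query(filename, ttl - 1, seen)
                if hit:
                    return hit
        return None

a, b, c = Peer("a", ["song.mp3"]), Peer("b", []), Peer("c", [])
b.neighbors, c.neighbors = [a, c], [b]
print(c.query("song.mp3"))    # -> "a", found two hops away
```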

Grid computing: The term derives from the analogy of a power grid, which pools resources to deal with varying load conditions without user awareness. The Open Grid Services Architecture (OGSA) incorporates web services to offer a new distributed computing environment, one typically used in high-performance computing. Grid computing provides the architecture and middleware for resource virtualization, aggregation and abstraction.

Mobile computing is inherently nomadic: network participants move in and out of networks, continually establishing and breaking sessions. This behavior makes dynamic discovery an issue. Mobile devices tend to participate in two types of networks: ad hoc networks (e.g., a marketplace) and networks where a wireless infrastructure is present (e.g., an office). This raises security issues such as the willingness to share information, but either way a service-discovery phase is mandatory before collaboration. What is needed is a dynamic discovery protocol that meets the following criteria: it should support both directory-free and directory-oriented modes, be scalable and platform-independent, and provide information about services and how to access them. With these criteria in mind, a prototype was developed that used WSDL and UDDI formats. A UDDI-like directory was developed with entries for all locally available services and participants. Small devices can maintain a streamlined version of the directory and inspect each other's directories when forming small ad hoc networks. For large ad hoc networks, however, the power-constrained devices elect a Local Master Directory (LMD), i.e., the device having the most power, which performs resource aggregation and query resolution and takes care of all service advertisements.
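The election itself can be as simple as picking the peer that reports the most remaining power. Here is a sketch with an assumed data shape and tie-breaking rule; the article only states that the most powerful device is chosen.

```python
def elect_lmd(devices):
    """devices: list of (device_id, battery_level) tuples.
    Returns the id of the device with the highest remaining power;
    ties break on device id so every node elects the same LMD."""
    return max(devices, key=lambda d: (d[1], d[0]))[0]

peers = [("phone-1", 0.42), ("laptop-7", 0.90), ("pda-3", 0.55)]
lmd = elect_lmd(peers)    # -> "laptop-7"
# The LMD then aggregates service advertisements and resolves queries
# on behalf of the other, more power-constrained devices.
```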

The proposed design is neither final nor complete, but it is certainly a promising candidate that could change the Internet from its present static state into a distributed computing environment.

Questions

1. What do you think is the motivation for a device to give up its local resources to a shared pool when the possibility exists that a resource will not be available to it when it actually needs it?

2. The LMD is chosen so that the majority of devices do not waste energy by participating in service advertisements in large ad hoc networks. But there is no guarantee of how long the LMD will remain in the network. Is choosing an LMD not just as resource-intensive as re-electing an LMD after a short period?

3. The IDC concept is based on existing technologies like web services, P2P and grid computing, all of which assume low latency. The lowest possible latency is offered by local resources. Is this a valid assumption or not? If you say it is valid, what is the application performance degradation with the IDC approach?
