Autonomic System Design
Visa Holopainen, visa@netlab.hut.fi
Enabling autonomic behavior in
systems software with hot swapping,
J. Appavoo et al. 2003



Focus on object-oriented systems software
By hot swapping, new algorithms and monitoring code can be added to a
running system without disruption
Hot swapping is accomplished either by interpositioning of code, or by
replacement of code



Interpositioning involves inserting a new component between two existing ones.
This enables more detailed monitoring when problems occur, while minimizing
run-time costs when the system is performing acceptably
Replacement allows an active component to be switched with a different
implementation of that component while the system is running
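Interpositioning can be sketched in a few lines of Python. This is only an illustration of the idea, not code from K42: the names `FileCache`, `TimingInterposer` and `Slot` are hypothetical, and the sketch ignores the concurrency issues a real kernel has to handle when swapping a live component.

```python
import time


class FileCache:
    """Hypothetical component that callers use on the fast path."""

    def lookup(self, key):
        return f"data-for-{key}"


class TimingInterposer:
    """Wraps a component and records how long each call takes."""

    def __init__(self, target):
        self.target = target
        self.samples = []

    def lookup(self, key):
        start = time.perf_counter()
        try:
            return self.target.lookup(key)
        finally:
            self.samples.append(time.perf_counter() - start)


class Slot:
    """Indirection point through which every caller reaches the component."""

    def __init__(self, component):
        self.component = component

    def lookup(self, key):
        return self.component.lookup(key)

    def interpose(self, wrapper_cls):
        # Insert the wrapper between callers and the current component.
        self.component = wrapper_cls(self.component)

    def remove_interposer(self):
        # Drop the wrapper again, if one is installed.
        self.component = getattr(self.component, "target", self.component)


slot = Slot(FileCache())
slot.lookup("a")                  # normal operation: no monitoring cost
slot.interpose(TimingInterposer)  # a problem is suspected: start measuring
slot.lookup("b")
monitor = slot.component
slot.remove_interposer()          # back to the unmonitored fast path
slot.lookup("c")
print(len(monitor.samples))       # one call was observed
```

Callers only ever hold the `Slot`, so the monitoring code costs nothing while it is not installed.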
Triggering hot swapping



In many cases an object is expected to trigger a replacement itself
(autonomously).
For example, if an object is designed to support small files and it registers an
increase in file size, then the object can trigger a hot swap with an object that
supports large files
In other cases, the system infrastructure is expected to determine the need for an
object replacement through a hot swap. Monitoring is required for this purpose.
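The small-file example can be sketched as follows. This is a minimal illustration under stated assumptions: `SmallFile`, `LargeFile`, `FileHandle` and the 4096-byte threshold are hypothetical names and values chosen for the sketch, not part of K42.

```python
class SmallFile:
    """Keeps the whole file in one bytes object; intended for small files."""

    LIMIT = 4096  # hypothetical threshold in bytes

    def __init__(self, handle, data=b""):
        self.handle = handle
        self.data = data

    def append(self, chunk):
        self.data += chunk
        if len(self.data) > self.LIMIT:
            # The object notices it has left its design range and
            # triggers its own replacement.
            self.handle.swap(LargeFile(self.handle, self.data))

    def size(self):
        return len(self.data)


class LargeFile:
    """Stores the data as a list of chunks; intended for large files."""

    def __init__(self, handle, data=b""):
        self.handle = handle
        self.chunks = [data] if data else []

    def append(self, chunk):
        self.chunks.append(chunk)

    def size(self):
        return sum(len(c) for c in self.chunks)


class FileHandle:
    """What clients hold; the implementation behind it can be swapped."""

    def __init__(self):
        self.impl = SmallFile(self)

    def swap(self, new_impl):
        self.impl = new_impl

    def append(self, chunk):
        self.impl.append(chunk)

    def size(self):
        return self.impl.size()


f = FileHandle()
f.append(b"x" * 1000)
kind_before = type(f.impl).__name__   # still the small-file object
f.append(b"x" * 5000)
kind_after = type(f.impl).__name__    # swapped to the large-file object
print(kind_before, kind_after, f.size())
```

The client keeps calling `append` and `size` on the same handle; the state is carried over during the swap, so the replacement is invisible to it.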
Adaptive code vs. hot swapping




Among other features, hot swapping allows systems software to
react to changes in environment
A more traditional approach to handling varying environments is
to use adaptive code
In a system using adaptive code, all possible configurations must be
built into the system beforehand
Adaptive code has many problematic features (presented below)
Illustration of adaptive code vs. hot swapping



An adaptive code implementation (A) vs a hot-swapping
implementation (B) of the same function
The adaptive code approach is monolithic and includes monitoring
code that collects the data needed by the adaptive algorithm to
choose a particular code path
With hot swapping, each algorithm is implemented independently
(resulting in reduced complexity per component), and is hot
swapped in when needed
Benefits of hot swapping


Hot swapping can be beneficial at least in the following respects:
Optimizing for the (non) common case
Dynamic replacement allows efficient implementations of common paths to
be used when suitable, and less-efficient, less-common implementations to
be switched in when necessary
Optimizing for a wide range of file attribute values
For example, although the vast majority of files accessed are small (< 4
KB), OSs must also support large files
Access patterns
Researchers have shown up to 30 percent fewer cache misses by using
the appropriate cache management policy
Multiprocessor optimizations
Some applications perform better when distributed to many processors
while others perform better when run on a single processor
Enabling client-specific customization
Exporting system structure information

Always gathering the necessary profiling information increases overhead
Testing system

A research operating system (K42) has been developed to test the hot
swapping approach




Runs on PowerPC and MIPS architectures (soon available for x86 also)
K42 scales well to multiprocessor systems
Performance advantages of hot swapping have been demonstrated in K42
K42 is available at http://www.research.ibm.com/K42
Adding Autonomic Functionality to
object-oriented applications,
M. Schanne, W. Tichy, T. Gelhausen, 2003




The goal is to separate autonomic functionality from
applications (similar to hot swapping)
This is accomplished by creating a system based on
class renaming and proxy/wrapper generation
A list of the proxy objects is kept in a registry
Each proxy object always holds a pointer to the latest version
of the actual object and has access to its member functions




This is accomplished using the Byte Code Engineering Library
(BCEL)
Wrapper functions ensure synchronization of variables
The design ensures that there is no need for the user
to adapt his source code in any way or even to restart
the program
Supported environment: the Java 2 platform and the like
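The proxy and registry idea can be illustrated with a small Python analogue. The paper works on Java bytecode with BCEL; the sketch below does not reproduce that mechanism. All names (`Registry`, `Proxy`, `CounterV1`, `CounterV2`) are hypothetical, and copying the instance dictionary stands in for the variable synchronization done by the generated wrappers.

```python
class Registry:
    """Keeps track of every proxy so that all of them can be retargeted."""

    def __init__(self):
        self.proxies = []

    def register(self, proxy):
        self.proxies.append(proxy)

    def upgrade(self, old_cls, new_cls):
        # Point every proxy of an old-version object at a new-version object.
        for proxy in self.proxies:
            if type(proxy._target) is old_cls:
                proxy._retarget(new_cls)


class Proxy:
    """Stands in for the real object and forwards member access to it."""

    def __init__(self, registry, target):
        self._target = target
        registry.register(self)

    def __getattr__(self, name):
        # Called only for names not found on the proxy itself.
        return getattr(self._target, name)

    def _retarget(self, new_cls):
        # Create the new version without running __init__ and copy the state.
        replacement = new_cls.__new__(new_cls)
        replacement.__dict__.update(self._target.__dict__)
        self._target = replacement


class CounterV1:
    def __init__(self):
        self.count = 0

    def step(self):
        self.count += 1
        return self.count


class CounterV2:
    def __init__(self):
        self.count = 0

    def step(self):
        self.count += 10   # new behaviour, same interface
        return self.count


registry = Registry()
counter = Proxy(registry, CounterV1())
first = counter.step()                   # served by CounterV1
registry.upgrade(CounterV1, CounterV2)   # no restart, caller code unchanged
second = counter.step()                  # served by CounterV2, state kept
print(first, second)
```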
Usable Autonomic Computing
Systems: the Administrator’s Perspective, R. Barrett, P. Maglio, E. Kandogan, J. Bailey, 2004



Autonomic computing seeks to solve the problem of increasingly complex
configurations through increased automation
However, the AC strategy of managing complexity through automation runs
the risk of making management harder (more powerful commands)
This is why autonomic systems should:






Provide facilities that make rehearsing and planning easy
Be designed to allow administrators to quickly undo changes, making operations
(whether on production systems or test systems) less risky and therefore easier
Inform the administrator if undoing a command will not be (easily) possible
Have enhanced capabilities for testing complex end-to-end systems so that
administrators will be confident that their changes are not having unintended
consequences
Provide access to arbitrary levels of configuration detail if need be
An autonomic system should also

Contain a command line interface (in addition to a GUI)
An Architectural Approach to
Autonomic Computing, S. White, J. Hanson, I.
Whalley, D. Chess, J. Kephart, 2004



An autonomic system can be decomposed into 1) interfaces, 2) interactions
and 3) design patterns
A somewhat RFC-style paper with MUST and SHOULD statements about
Autonomic Elements (AE)
MUST Examples:




An AE MUST be self-managing
An AE MUST handle problems locally whenever possible
An AE MUST be capable of establishing and maintaining relationships with other
autonomic elements
SHOULD Examples:



An AE SHOULD ask for a realistic set of requirements when requesting a service
from another element
An AE SHOULD offer a range of performance, reliability, availability and security
associated with its service
An AE SHOULD protect itself against inappropriate service requests and
responses
Use of policies
The use of policies is essential for autonomic systems
Three (3) policy levels are presented:
1) Action policies (IF condition THEN action)
An AE employing action policies MUST measure and/or synthesize the
quantities stated in the condition
2) Goal policies (“Response time must not exceed 2 sec.”)
AEs employing goal policies MUST possess sufficient modeling or planning
capabilities to translate goals into actions
3) Utility function policies (automatically determine the most valuable goal
in any situation)
AEs employing utility function policies MUST have sophisticated modeling and
optimization capabilities to translate utility functions into actions
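The difference between the three policy levels can be sketched in Python. Everything here is an assumption made for illustration: the state fields, the action names, the toy `predict` model (response time = load / servers) and the utility weights do not come from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

State = Dict[str, float]
ACTIONS: List[str] = ["no_op", "add_server", "remove_server"]


def predict(state: State, action: str) -> State:
    """Hypothetical model: what the state looks like after an action."""
    delta = {"no_op": 0, "add_server": 1, "remove_server": -1}[action]
    servers = state["servers"] + delta
    return {"servers": servers, "load": state["load"],
            "response_time": state["load"] / servers}


@dataclass
class ActionPolicy:
    """IF condition THEN action: only needs to measure the condition."""
    condition: Callable[[State], bool]
    action: str

    def decide(self, state: State) -> Optional[str]:
        return self.action if self.condition(state) else None


@dataclass
class GoalPolicy:
    """States a desired condition; a model is needed to find an action."""
    goal: Callable[[State], bool]

    def decide(self, state: State) -> Optional[str]:
        for action in ACTIONS:
            if self.goal(predict(state, action)):
                return action
        return None


@dataclass
class UtilityPolicy:
    """Scores every reachable state; the element optimizes over actions."""
    utility: Callable[[State], float]

    def decide(self, state: State) -> str:
        return max(ACTIONS, key=lambda a: self.utility(predict(state, a)))


state = {"servers": 2, "load": 6.0, "response_time": 3.0}

by_action = ActionPolicy(lambda s: s["response_time"] > 2, "add_server").decide(state)
by_goal = GoalPolicy(lambda s: s["response_time"] <= 2).decide(state)
by_utility = UtilityPolicy(
    lambda s: -s["response_time"] - 0.5 * s["servers"]).decide(state)
print(by_action, by_goal, by_utility)
```

Only the action policy names the action explicitly. The goal and utility policies rely on `predict`, which is why the corresponding MUST statements demand modeling, planning or optimization capabilities.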
Interfaces

Making a system autonomic requires additional interfaces to be
added to the system

Monitoring and test interfaces
Enable an element to be monitored by any other element that has established
the appropriate administrative relationships with it
Lifecycle interfaces
Enable administrative elements to determine the lifecycle state of an element
(e.g. starting, paused), to cause a state change, and to determine the lifecycle
model that applies to the element
Policy interfaces
Enable administrative elements to send new policies to an element, and to
determine the policies currently in use by the element
Negotiation and binding interfaces
Permit an element to request a service from other elements, or to request to
provide a service
Relationships




When an AE has agreed to provide service to another AE, then those two
elements have a relationship
Relationships are typically formed at run-time
Autonomic systems are built from relationships
A request-response paradigm is used to form relationships
From autonomic elements to autonomic systems
Assembling an autonomic system requires:
1) A collection of AEs that implement the desired function
2) Additional autonomic elements to implement system functions that enable the
needed system-level behaviors (= infrastructure elements)
3) Design patterns for system self-management
An infrastructure element can be a






Registry (provides mechanisms for elements to find one another)
Sentinel (provides monitoring services to other elements)
Aggregator (combines two or more existing elements and uses them to provide
improved service)
Broker (facilitates interaction)
Negotiator (assists elements with complex negotiations)
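How a registry and the request-response paradigm combine to form a relationship at run-time can be sketched as below. The class names, the `capacity` attribute and the acceptance rule are hypothetical simplifications. The paper describes the roles only, not this code.

```python
class Registry:
    """Infrastructure element that lets elements find one another."""

    def __init__(self):
        self.providers = {}

    def advertise(self, service, element):
        self.providers.setdefault(service, []).append(element)

    def find(self, service):
        return list(self.providers.get(service, []))


class AutonomicElement:
    def __init__(self, name, capacity=0):
        self.name = name
        self.capacity = capacity      # hypothetical resource budget
        self.relationships = []       # (partner name, service) pairs

    def handle_request(self, requester, service, demand):
        # Protect against requests that cannot be served.
        if demand > self.capacity:
            return False
        self.capacity -= demand
        self.relationships.append((requester.name, service))
        return True

    def request_service(self, registry, service, demand):
        # Ask each advertised provider in turn until one agrees.
        for provider in registry.find(service):
            if provider.handle_request(self, service, demand):
                self.relationships.append((provider.name, service))
                return provider
        return None


registry = Registry()
db1 = AutonomicElement("db1", capacity=5)
db2 = AutonomicElement("db2", capacity=20)
registry.advertise("storage", db1)
registry.advertise("storage", db2)

web = AutonomicElement("web")
provider = web.request_service(registry, "storage", demand=10)
print(provider.name, web.relationships)
```

Here `db1` declines the request because it lacks capacity, so the relationship is formed with `db2`.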
Towards Requirements-Driven
Autonomic Systems Design, A.
Lapouchnian, S. Liaskos, J. Mylopoulos, Y. Yu, 2005
There are three basic ways to make a system autonomic:
1) Design the system to support a space of possible behaviors
2) Equip the system with planning and social capabilities so that it can delegate tasks
to external software components (agents)
3) Build the system so that it has evolutionary capabilities (like biological systems)
The first approach was studied in the paper
Requirements engineering



Development of a framework for capturing and analyzing stakeholder intentions
to generate functional and non-functional requirements
Illustration of requirements engineering: goal model

Top-level “hard” goal: Schedule meeting, AND-composed of lower level hard goals
4 top-level “softgoals”: Good quality schedule, Minimal effort, Minimal disturbances, Accurate constraints
Lower level softgoals can be related to higher levels by help (+), hurt (-), make (++) or break (--) relationships
6 alternative ways to fulfill the goal “Schedule Meeting”
An autonomic system should address all different ways of fulfilling the top-level goals
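Enumerating the alternative ways of fulfilling a goal is a simple recursion over an AND/OR tree. The sketch below uses a hypothetical decomposition of “Schedule meeting” with four alternatives. It is not the paper's goal model, whose decomposition yields six.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import List


@dataclass
class Goal:
    name: str
    kind: str = "LEAF"                      # "AND", "OR" or "LEAF"
    children: List["Goal"] = field(default_factory=list)


def alternatives(goal: Goal) -> List[List[str]]:
    """Return every set of leaf goals that fulfils the given goal."""
    if goal.kind == "LEAF":
        return [[goal.name]]
    child_alts = [alternatives(child) for child in goal.children]
    if goal.kind == "OR":
        # Any alternative of any child fulfils an OR goal.
        return [alt for alts in child_alts for alt in alts]
    # AND: pick one alternative per child and combine them.
    return [sum(combo, []) for combo in product(*child_alts)]


# Hypothetical decomposition, for illustration only.
model = Goal("Schedule meeting", "AND", [
    Goal("Collect timetables", "OR",
         [Goal("by person"), Goal("by system")]),
    Goal("Choose schedule", "OR",
         [Goal("manually"), Goal("automatically")]),
])

alts = alternatives(model)
for alt in alts:
    print(alt)
```

An autonomic system that supports this space of behaviors could then rank the alternatives by their contributions to the softgoals and switch between them at run-time.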
Goal model -> Feature model -> Component-connector model
Goal model is integrated into the knowledge of an
autonomic element
Architectural Design of a Distributed
Application with Autonomic Quality
Requirements, D. Weyns, K. Schelfthout and T. Holvoet,
2005



A reference architecture for situated multi-agent systems (situated MAS)
was developed
This reference architecture was applied to a real-world software system
The architecture:



A situated MAS consists of an environment populated with agents (autonomous
entities)
Intelligence in a situated MAS originates from the interaction between agents,
rather than from their individual capabilities
The architecture holds three abstractions: agents, ongoing activities and the
environment
High-level model view of the architecture



The Perception module maps the
local state of the environment onto
a percept for the agent
The Consumption module handles
the effects of environment
changes that affect the agent
The Decision module is
responsible for action selection
The application


A system in which robots transport loads from one place to another within a
warehouse and recharge themselves whenever needed
Old system: centralized server controlled robots


Main problem: inflexibility; robots can’t adapt to changing situations
Improvement: Robots are agents acting in a MAS

Drawback: more complicated system
Module view of the application

Two kinds of agents: transport
agents and AGV agents


Transport agents are ”managers”;
they determine the priority of the
transport, assign transports to
AGVs and ensure that the
transport succeeds
AGV agents are responsible for
executing the assigned transport
Architecture of the environment




To cope with the complexity of the
environment, it is presented
through a layered architecture
The virtual environment uses a
middleware layer that enables
agents to communicate with each
other
The virtual environment enables agent
routing and prevents collisions
Each agent observes a 3-5 meter
circle of the virtual environment
at a time



In this circle the agent marks the
path it is going to use and removes
this path when leaving the circle
This way collisions can be avoided
Transport agents use the virtual
environment to locate AGV agents
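The path-marking scheme for collision avoidance can be sketched as a reservation table. This is a simplification under stated assumptions: the environment is modelled as grid cells, the observed circle is reduced to the list of cells an agent wants to cross, and `VirtualEnvironment`, `mark_path` and `clear_path` are hypothetical names.

```python
class VirtualEnvironment:
    """Shared map in which AGV agents mark the cells they intend to use."""

    def __init__(self):
        self.marks = {}   # cell (x, y) -> name of the agent that marked it

    def mark_path(self, agent, cells):
        # Refuse the whole path if any cell is marked by another agent.
        if any(self.marks.get(cell, agent) != agent for cell in cells):
            return False
        for cell in cells:
            self.marks[cell] = agent
        return True

    def clear_path(self, agent):
        # Called when the agent leaves the area: remove all of its marks.
        self.marks = {c: a for c, a in self.marks.items() if a != agent}


env = VirtualEnvironment()
ok_first = env.mark_path("agv1", [(0, 0), (1, 0), (2, 0)])
blocked = env.mark_path("agv2", [(2, 0), (2, 1)])   # crosses agv1's path
env.clear_path("agv1")                              # agv1 has moved on
ok_retry = env.mark_path("agv2", [(2, 0), (2, 1)])
print(ok_first, blocked, ok_retry)
```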
A Control Theory Foundation for Self-Managing Computing Systems, Y. Diao, J.
Hellerstein, S. Parekh, R. Griffith, G. Kaiser, D. Phung, 2005


Control theory is used as a way to identify a
number of requirements for and challenges in
building self-managing systems
What does control theory bring to the table in terms
of self-management?


Autonomic computing and control theory have
slightly different points of focus: autonomic
computing focuses on the specification and
construction of management components that
interoperate well, while the focus of control
theory is on analyzing and/or developing
components and algorithms so that the resulting
system achieves the control objectives
For example, control theory provides design
techniques for determining the values of
parameters in commonly used control algorithms
so that the resulting control system is stable and
settles quickly in response to disturbances
Feedback Control Theory








Reference Input (I/P) : Desired Output (O/P) (as specified by the
human)
Control Error : (Reference I/P – Measured O/P)
Control Input : Parameters which affect behavior of the system
Disturbance I/P : affects Control I/P
Controller : Changes Control I/P to achieve Reference I/P
Measured O/P : Measurable feature of the system
Noise I/P : affects Measured O/P
Transducer : Transforms measured O/P to compare with Reference
I/P
Properties of Control Systems

SASO
Stable
Bounded input produces bounded output
Unstable systems are not usable in mission-critical work
Accurate
Measured output converges to the reference (desired) input
Short Settling Times
Converges to the stable value quickly
No Overshoot
Achieves objectives in a steady manner
Control Analysis and Design

Transfer functions and the Z-transform are used to model and control
response times and settling times
Example (Notes Server): control input u(k) = MaxUsers, measured output y(k) = actual RIS
Model of System Dynamics
$$y(k+1) = a_1 y(k) + b_1 u(k)$$
Transfer Function
$$N(z) = \frac{b_1}{z - a_1}$$
$$a_1 = 0.43, \quad b_1 = 0.47$$
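The first-order model can be simulated directly to see what the parameters imply. The sketch below applies a unit step to the control input. The function name `simulate` and the choice of a unit step are assumptions for illustration; only the values of a1 and b1 come from the slide.

```python
A1, B1 = 0.43, 0.47   # model parameters from the slide


def simulate(inputs, y0=0.0):
    """Iterate y(k+1) = a1*y(k) + b1*u(k) and return the output sequence."""
    y = [y0]
    for u_k in inputs:
        y.append(A1 * y[-1] + B1 * u_k)
    return y


# Steady-state gain of the transfer function, N(1) = b1 / (1 - a1).
steady_state_gain = B1 / (1 - A1)

response = simulate([1.0] * 20)   # unit step applied for 20 samples
print(round(steady_state_gain, 4), round(response[-1], 4))
```

The pole of N(z) lies at z = a1 = 0.43, inside the unit circle, so the open-loop model is stable and the step response settles at the steady-state gain within a few samples.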
Example: control theory approach to web server
management




Objective : CPU Utilization < 50%
Measured Output : CPU utilization
Control Input : “MaxClients”
During the first 300 s, the system
operates without feedback control.
When the controller is turned on, a
reference input of 0.5 is used. At this
point, the system begins to oscillate
and the amplitude of the oscillations
increases. This is a result of a
controller design that overreacts to
the stochastics in the CPU utilization
measurement.
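Growing oscillations of this kind can be reproduced with a very small simulation. The sketch below is not the web server experiment: it assumes a hypothetical first-order plant that reuses the parameters a = 0.43 and b = 0.47 from the Notes Server model, an integral controller u(k+1) = u(k) + K e(k), and no measurement noise. It also does not clip the output to the 0..1 range of a real utilization.

```python
A, B = 0.43, 0.47     # hypothetical plant parameters (see lead-in)
REFERENCE = 0.5       # reference input, as in the example


def closed_loop(gain, steps=200):
    """Simulate the plant under integral control and return the output trace."""
    y, u = 0.0, 0.0
    trace = [y]
    for _ in range(steps):
        error = REFERENCE - y      # control error e(k)
        y = A * y + B * u          # plant: y(k+1) = a*y(k) + b*u(k)
        u = u + gain * error       # controller: u(k+1) = u(k) + K*e(k)
        trace.append(y)
    return trace


def max_deviation(values):
    return max(abs(v - REFERENCE) for v in values)


# For this model the loop is stable only for 0 < K < (1 - a) / b, about 1.21.
stable = closed_loop(gain=0.3)
unstable = closed_loop(gain=1.5)

print(round(stable[-1], 4))
print(max_deviation(unstable[:20]) < max_deviation(unstable[-20:]))
```

With the moderate gain the output converges to the reference. With the aggressive gain every correction overshoots more than the last, which is the behaviour control-theoretic design techniques are meant to rule out before deployment.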
<username>, I Need You!
Initiative and Interaction in Autonomic
Systems, P. Kaminski, P. Agrawal, H. Kienle, H. Müller, 2005

Autonomic job requirements

If I hired a person instead, what qualities would I look for?



attention to detail, strong communication skills, initiative, tempered by job
boundaries, self-knowledge and willingness to seek help
Treat users as partners, not masters
Basic idea:

The system has an optimization engine that decides whether the preferred
mode of action in some situation is to 1) contact a human or 2) try to
repair the system





Decision based on 1) explicit instructions and 2) learning
Balance match, bother, rush, risk
The system learns from human actions and becomes more competent in
solving problems on its own
Balance initiative and interaction
Send messages via e-mail, instant messenger, etc.
Human (operator) is added to the traditional
autonomic computing cycle
Autonomic interaction manager
[Figure: the Monitor, Analyze, Plan and Execute cycle around shared Knowledge, extended with “ask for help” and “receive advice” links to the human operator]