
Ambrosia
Autonomous Agent Execution Environment (AXE)
A secure, distributed agent execution system for the Java Virtual Machine.
Joshua Baer, Bob Dean, John Huebner, Jed Pickel
15-612 Distributed Systems, Dr. Raj Rajkumar, TA Andrew Berry
Carnegie Mellon University, Spring 1998
Chapter 1: Introduction
The Agent eXecution Environment (AXE)
Purpose
Ambrosia has developed an agent execution environment and proven it has practical use
by implementing several agents. An Agent eXecution Environment (AXE) is more flexible
than a client/server design model because it allows arbitrary code to be executed on a
remote machine. An agent system can also localize computations near data, reducing
network transmissions and increasing perceived and actual performance.
For example, a traditional client/server implementation of a mechanism to find and retrieve
files on the network (“find file”) would require a specific server on every machine from
which files were to be found. This “find file” server would actively wait for find file requests,
wasting local resources. In a more flexible design, the server might provide remote
directory operations as a service. This would allow the client to download the available
directory structures from the server, search the directories locally, and then send a request
to the server for a file transfer. The client would be more efficient if it made a set of
concurrent requests for directories, but this would also make the client more complex.
In contrast, an agent system would allow the search routines to be sent to multiple servers
for execution. The file search would be able to occur local to the server, and also
concurrently on multiple machines. This would improve latency, and reduce network
bandwidth. Furthermore, each instance of the algorithm used by the client could be
tailored to the search being performed.
Overview
An agent system offers high availability and fault tolerance using a fail-stop model.
Availability is increased because a user can obtain an agent from multiple sources and
execute it on multiple hosts. If an agent is replicated to a node, and that node fails before
or during the execution of the agent, the parent agent can create and replicate a new
agent to another node in the system. If the system semantics are designed to allow
multiple copies of an agent, and/or agent cooperation, then a shared state method of
active replication is easily supported. If an agent sends a copy of itself to another node
and does not hear a timely reply, it can re-direct that task to another node.
Performance can be increased by using long-term caching. An execution node saves the
code and static data of an agent so that they need not be sent to the node the next time
the agent is invoked.
Terms
This section defines a number of terms used to describe the Agent eXecution
Environment. Some of these terms have been changed or updated since the submission
of the original design proposal.
Agent: A self-contained execution including code, data, and security information.
AXEKey: An agent object which stores the public and private keys, and also provides methods for manipulating and authenticating them.
Agent eXecution Environment (AXE): The distributed collection of applications that accepts, executes, and transmits agents. It mediates between the agent and the operating system to acquire resources for the agent.
Execution Node: A single application in the execution environment. It will only send agents that it will execute itself.
Data Object: The portion of an agent containing the state of the agent: the variables needed by the agent to run. (Note: constants are stored in the code object at compilation time.)
Code Object: The portion of an agent that contains the execution instructions. It also includes static data, since static data shares its properties of being read-only and fixed after compilation.
Node: A single instance of an Agent eXecution Environment on a single host.
Security Manager (sandbox, SecMan): The internal AXE system which controls an agent's access to resources.
Resources: CPU cycles, file system, memory, network, display.
Sync: An agent entry point called by the system before transporting the agent to a different node. It allows the agent to save its state so that it is not lost during replication.
Short Term Caching: The execution node storing the entire agent to disk, temporarily, for use in transmitting or re-transmitting to another node.
Long Term Caching: The execution node storing the code and static data of an agent between invocations, to increase performance the next time that agent is invoked.
Chapter 2: Architecture
The Agent eXecution Environment (AXE)
Overall Design
An Agent eXecution Environment (AXE) is best defined as a distributed collection of synchronized execution nodes. An execution node is a machine that provides, as a service, the ability to accept and execute objects (agents) from the network.
An agent is a combination of code, data, and log information that has the ability to travel through the Agent eXecution Environment under the constraints imposed by individual execution nodes. The details of agent transport are explained later in this paper.
[Figure 2.1: The Architecture of a Single AXE Node. An executing agent (an object with methods, a log, a data structure, and a privilege list) runs inside an AXE node alongside the Security Manager (SecMan) and the GUI, all on top of the JVM.]
To join the system, a newly launched node contacts an existing member and authenticates, after which the member contacts all existing nodes to add the new member to their lists of nodes. The AXE is locked during this update, using a two-phase commit to maintain integrity.
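A rough sketch of how this join step could be expressed over RMI is shown below. The interface name AxeNode and the prepare/commit/abort methods are illustrative assumptions, not the project's actual API; they only mirror the two-phase commit described above.

    import java.rmi.Remote;
    import java.rmi.RemoteException;

    public interface AxeNode extends Remote {
        // Phase 1: ask every existing member to lock its node list and
        // tentatively record the newcomer; a member votes "no" by returning false.
        boolean prepareAddNode(String newHost, byte[] newPublicKey) throws RemoteException;

        // Phase 2a: make the addition permanent on every member.
        void commitAddNode(String newHost) throws RemoteException;

        // Phase 2b: discard the addition if any member voted "no" or could not be reached.
        void abortAddNode(String newHost) throws RemoteException;
    }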
During the initialization of an AXE, the AXE registers its own security manager with the
JVM. This gives control of the Java security checks to the AXE Agent Security Manager.
Thus, the AXE has the ability to accept or deny every possible security request made
within the JVM. See figure 2.1.
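The registration itself is a single call at node start-up. The sketch below shows that step; the empty AgentSecurityManager subclass is only a stand-in so the snippet is self-contained (the real class, described in Chapter 3, overrides the individual check methods).

    public class AxeBootstrap {

        // Stand-in for the AXE Agent Security Manager described in Chapter 3.
        static class AgentSecurityManager extends SecurityManager {
            // The real class overrides the individual check*() methods.
        }

        public static void main(String[] args) {
            // From this point on, every security-sensitive call made inside the
            // JVM (sockets, files, threads, ...) is routed through the AXE.
            System.setSecurityManager(new AgentSecurityManager());
        }
    }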
[Figure 2.2: Agent Transport. A master process on one machine creates an agent (code plus data) and sends it to an Agent eXecution Environment; each AXE that receives the agent can in turn send copies of its code and data on to other AXE nodes.]
An agent can be introduced into the system by sending it to any node. Usually a user
launches an agent by executing a master program that initializes the agent’s data and
sends it to a known node. One future addition would be to allow a node to launch agents
without the aid of a master. We could then add the requirement that an agent be introduced into the system only from a node which is itself participating in the AXE, as an additional security measure. To provide an incentive to share local resources, another future restriction would allow a node to introduce only agents which have permission to execute on the local node (the Golden Rule).
Each node keeps a list of all functioning nodes. In the future a node would be able to store
public keys for agents and other nodes. The node would have the abilities to function as a
trusted source for other nodes for obtaining agents, and to authenticate checksums or
hashes for known agents.
The administrator at each individual node has the ability to configure the environment
according to their local policies and procedures, such as a default security policy for
unknown agents. At the present time, all agents have the same privileges, but the system
has been designed so that it could easily be enhanced to support per-agent security
restrictions.
The distributed nature of this project takes place on two levels. The AXE itself is a
distributed system that must maintain state, availability, and security. On top of that, the
AXE provides a framework for individual agents to build their own distributed systems.
Each node is multi-threaded, has the ability to process multiple agents simultaneously,
implements a GUI for configuration, and maintains its own log file. Any change of state is
recorded by the log file.
Logging
Whenever a problem occurs within a complex computer system, administrators need the
ability to immediately and effectively isolate the problem and fix it. The System Log is
invaluable in this task. For this reason, every aspect of an AXE node is logged. The log
allows the administrator to trace and plug holes in the security model. The AXE system design tries to ease the administrator's work by incorporating log viewing into the graphical user interface and by having each system module log its own events.
The GUI aids with log analysis by the inclusion of two elements: the Quick Log Window
and the Log Viewer. The Quick Log Window resides in the lower right side of the GUI’s
main window. It tracks the last one hundred messages logged by the system, allowing a
quick reference for when the administrator becomes aware of a problem. If the Quick Log
is not enough, the GUI also includes a Log Viewer. The Log Viewer is a separate window
from the main GUI window. It also includes basic text manipulation features such as
searching.
The AXE system also helps by allowing the administrator to control the amount of logging that takes place. At the lowest logging level, basic system events relating to the life cycle of agents are logged, along with events relating to the addition or removal of nodes from the environment. The second logging level is Trace. At this level the Agent Interface logs all API calls, and the Security Manager begins to log every security check, including the agent which caused the check and whether or not the check succeeded. The third and final logging level is Verbose. At this level every aspect of the system is logged: when an agent makes a system access, the state of every agent is written to the log for comparison, and when the Security Manager is called, a failed authentication results in a dump of the full execution stack. Such measures give the administrator a baseline to compare an event against, separating an error caused by a rogue agent from possible system glitches.
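The three levels can be pictured as a simple threshold on log messages. The sketch below is illustrative only; the SystemLog class and level constants are assumptions, not the project's actual logging code.

    public class SystemLog {
        public static final int BASIC = 0;    // agent life cycle, node add/remove
        public static final int TRACE = 1;    // plus every API call and security check
        public static final int VERBOSE = 2;  // plus full agent state and stack dumps

        private final int level;

        public SystemLog(int level) { this.level = level; }

        // A message is written only if its level is within the configured level.
        public void log(int messageLevel, String message) {
            if (messageLevel <= level) {
                System.out.println(System.currentTimeMillis() + " " + message);
            }
        }
    }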
Agent Transportation
Agent transport is one of the major elements of the Agent eXecution Environment. The
transportation system is what allows an agent to be sent around the network from node to
node. There are three main parts to the transport system design: the agent object, the
agent replication process, and agent caching.
The Agent as an Object
The agent object has three main objects within it. These objects are the code object, data
object, and the security/authentication object. The agent has been designed in this way to
limit the executable segment’s access to corruptible data. This design allows the agent to
be easily transported across the network as a single object, and also allows the execution
environment access to vital security information about the agent before the agent is
executed.

- Code Object: contains the actual Java byte code for the agent. The code object also contains constants required for execution, but not large static data structures. This includes such items as final variables and predefined strings. As far as the agent is concerned, this object is execute-only.
- Data Object: contains the current state of the agent. If an agent is to be transferred and restarted on another execution environment at the current point of execution, all necessary data for this restart is saved by calling for a Sync. If no state is needed at the new location, this object will be "empty." This object is only available to the agent through Execution Environment system calls such as GetData and SetData; no other access is allowed to the agent. Between these calls, a local copy of the object can be manipulated and saved. This object also includes separate static data such as graphics files, agent-specific help, and other static data which the agent may wish to bring with it.
- Security/Authentication Object: contains all security data needed by the AXE for authentication and tracking of the agent. For example, this object contains a public key to allow for agent authentication, and methods for manipulating and authenticating keys and data. The agent, through the use of AXE system calls, can read this object without restriction, but has no write access. The authentication portion of this object contains a checksum to ensure that the object is intact, version information, and the author's name.
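A minimal sketch of this three-part structure is shown below, assuming serializable member objects; the class and field names follow the terms defined in Chapter 1 but are otherwise illustrative, not the project's actual classes.

    import java.io.Serializable;

    public class Agent implements Serializable {
        // Execute-only from the agent's point of view: byte code plus constants.
        private final byte[] codeObject;

        // Mutable state, reachable by the agent only through GetData/SetData-style
        // calls into the execution environment.
        private Serializable dataObject;

        // Public key, checksum, version, and author information; readable but not
        // writable by the agent itself.
        private final Serializable securityObject;

        public Agent(byte[] codeObject, Serializable dataObject, Serializable securityObject) {
            this.codeObject = codeObject;
            this.dataObject = dataObject;
            this.securityObject = securityObject;
        }

        public Serializable getData()              { return dataObject; }
        public void setData(Serializable newState) { this.dataObject = newState; }
    }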
The Agent Transport and Replication Process
The Agent eXecution Environment supports two methods of agent transport and
replication. The first of these methods is manual control by the user. Each node has the
ability to send an agent directly to another execution node of the AXE from the command
line or (not fully implemented) within the GUI.
The second method of agent transport is Agent Replication. This is the process whereby an agent sends a copy of itself to one or more execution nodes. The agent achieves
this through the use of execution node system calls. Currently, there are two options for
agent replication. The first is a straight transfer of the agent currently residing in the
client’s cache. This means that when the agent is executed on the new node, the
execution will be independent of the parent agent’s current state at time of transfer. The
second transfer method is where the agent requests that its current state be sent along
with the cached agent to the new node. In this method, the node calls a Sync in the agent
before transport, to allow it to update the data object with any relevant state information.
The first method could be used, for example, to upgrade a common utility agent such as a
global find file. Such an agent does not require knowledge of any execution node for its own operation and therefore can be transferred without updating its cached Execution
State. The second method is used by the Agent Interface (the AXE agent API) to send
running agents from one node to another. The replicated agents will resume execution at
the start of their code block, but their data will be preserved in the state saved by the
sync() method. It is possible for a restarted agent to return to its pre-transport execution
state by efficient use of its Data Object.
Under the current system design, it is the responsibility of the agent to update and store its own state by calling its setData() method before it is replicated. Each node of the
execution environment contains a setData() method for the agent to complete this task if it
so desires.
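The second transfer method can be pictured as the short sequence below: call the agent's sync entry point, take the cached agent plus its updated Data Object, and ship the copy to the target node. RunningAgent, Replicator, and sendToNode() are hypothetical names, and Agent refers to the earlier sketch; none of this is the project's actual code.

    interface RunningAgent {
        void sync();                       // agent saves its state via setData()
        Agent getTransportableAgent();     // cached code plus current data object
    }

    class Replicator {
        void replicateWithState(RunningAgent agent, String targetNode) {
            agent.sync();                                  // let the agent checkpoint itself
            Agent copy = agent.getTransportableAgent();    // code, data, security objects
            sendToNode(copy, targetNode);
        }

        private void sendToNode(Agent copy, String targetNode) {
            // In the current system this uses object serialization; see Chapter 3.
        }
    }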
Caching
The level and complexity of the caching system used by each node is completely at the
control of the node administrator. At the minimum level, all agents that are executed on a
node should be placed into the node’s agent cache. This allows the node to start and stop
an agent as needed without having to download the agent from the network each time. At
this level, the administrator can decide to only allow hand-picked agents to run on the
node, and for the node to refuse replication requests from other nodes. At the most
complex level, the node accepts all replication requests from other nodes, and caches any
agent that is sent to it independent of the agents executing on the node. Agents are
cached regardless of whether or not they are ever executed.
Security Manager
The first security issue to address is which agents will be permitted by a node.
Administrators should be able to decide whether to accept anonymous agents and choose
in particular which agents to accept while rejecting others. The use of the agent's Security/Authentication Object eases this task greatly and also allows it to be automated.
Sandbox
This environment is designed such that administrators at individual nodes have the ability
to configure a default security policy for access to selected resources by the anonymous
agents. Anonymous agents are agents that are not known by the local server. Known
agents eventually will have a custom security policy based on the administrator’s level of
trust for that agent. In the current implementation, all agents are treated as Anonymous.
The security policy provides access control to:
- Network Resources
  - Accept socket connections
  - Open socket connections
  - Listen for a network connection
  - Use IP multicast
  - Set the socket factory
- Local File System
  - Delete files
  - Read files
  - Write files
- Process Control
  - Modify thread arguments
  - Modify thread group arguments
  - Know about the thread group for new threads
  - Create subprocesses
- System Resources
  - Use the printer
  - Set system properties
  - Access the clipboard
  - Bring windows to the foreground
- Java Specific
  - Dynamically load and link code libraries
  - Access the AWT event queue
  - Manipulate class loaders
  - Halt the Java VM
  - Access members
  - Access Java packages
  - Define classes in packages
  - Use the Security API
  - Examine the stack depth of a class
Authentication
Upon receipt of an Agent, the Agent eXecution Environment must perform a number of
functions to authenticate that agent. Fundamentally, the two primary authentication
requirements are: knowledge of where the agent came from, and assurance that the
agent code is not modified from the known version.
The AXE includes a public key infrastructure such that each node has a unique
public/private key pair and each instance of an agent has the option of having a
public/private key pair. Ideally, outgoing agents would be signed, and incoming agents would be verified by checking the signature. This functionality would be implemented at the node
and could not be altered by an agent. This form of authentication proves the true source
of an agent, and that the agent was not modified in transit.
In order to assure that agent code is not modified from a known version by a malicious
node, a built-in Java one-way hash function is used. Hashes of running agents are stored
on every node. Upon receipt of an agent, a hash of the agent code is computed and
compared with the hash stored on the node. To reduce the chance of man in the middle
attacks, this comparison could be encrypted using the public key infrastructure in place.
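The check itself reduces to computing a digest of the received code and comparing it with the stored value. The sketch below uses java.security.MessageDigest as the built-in one-way hash; the AgentVerifier class and the knownHashes table are illustrative assumptions.

    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;
    import java.util.Arrays;
    import java.util.Map;

    class AgentVerifier {
        private final Map<String, byte[]> knownHashes;   // agent name -> expected digest

        AgentVerifier(Map<String, byte[]> knownHashes) { this.knownHashes = knownHashes; }

        boolean codeMatchesKnownVersion(String agentName, byte[] agentCode)
                throws NoSuchAlgorithmException {
            byte[] actual = MessageDigest.getInstance("SHA-1").digest(agentCode);
            byte[] expected = knownHashes.get(agentName);
            return expected != null && Arrays.equals(expected, actual);
        }
    }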
Encryption
With a public key infrastructure already implemented, we could implement the option of
encrypting all data transmitted between servers. This would add significant processing and
network overhead, and might be best implemented as an optional AXE service which
agents could take advantage of if necessary.
Shared Distributed State
Each node stores a list of names, IP addresses, and public keys for the other primary
nodes of the AXE.
Each node is able to:
- List all machine names in the AXE
- Return the most idle node in the AXE (least number of agents, but ideally the largest number of free cycles)
- Return the public key for a given machine name
- Return the public key or hash for a given agent
- Verify a signed piece of data
Removing Nodes from the AXE
Nodes can be removed from the AXE on demand or in response to an error. Choosing "Exit" allows a node to gracefully remove itself from the AXE. Additionally, if any errors are encountered communicating with an existing member of the AXE, the member who discovers the error notifies all other members, and each node removes the offending member from the AXE.
Chapter 3: Detailed Design and Implementation
Problem areas, tradeoffs, and design decisions
RMI vs. Object Serialization vs. Applets
Java has a well-developed mechanism for running untrusted code, called the Applet class.
Existing Java Virtual Machines (JVMs) already implement a sandbox for this class. The
advantage of using applets for our agents is that we could exploit the existing sandbox.
The disadvantage of using applets for agents is that we have limited control over the
existing sandbox.
For the development of the AXE, Ambrosia chose to use Object Serialization.
"java.io.ObjectOutputStream" marshalls objects for sending over a socket.
"java.io.ObjectInputStream" unmarshalls the stream into an object again. Javaís Remote
Method Invocation also has facilities to load a class locally. These classes provide the
foundation for building a very rich execution environment, although they are at a lower
level than applets. With the combination of the Agent Security Manager, the advantages
of the Applet approach are achieved without the disadvantages.
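The serialization path amounts to a writeObject/readObject pair over a socket, roughly as sketched below. The AgentTransport class is illustrative, and Agent refers to the serializable agent object of Chapter 2; this is not the project's actual transport code.

    import java.io.ObjectInputStream;
    import java.io.ObjectOutputStream;
    import java.net.ServerSocket;
    import java.net.Socket;

    class AgentTransport {
        // Sender side: marshal the agent onto a socket.
        static void send(Agent agent, String host, int port) throws Exception {
            Socket socket = new Socket(host, port);
            ObjectOutputStream out = new ObjectOutputStream(socket.getOutputStream());
            out.writeObject(agent);
            out.close();
            socket.close();
        }

        // Receiver side: unmarshal the stream back into an Agent object.
        static Agent receive(ServerSocket listener) throws Exception {
            Socket socket = listener.accept();
            ObjectInputStream in = new ObjectInputStream(socket.getInputStream());
            Agent agent = (Agent) in.readObject();
            in.close();
            socket.close();
            return agent;
        }
    }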
Additionally, RMI is used for maintaining state across all nodes. After initially implementing
agent transport with standard object serialization, it was realized that RMI could have been
used for this process. This would have had the uniformity of RMI for all network
communication. Strict object serialization has a small performance advantage over RMI. If
the AXE were to be re-engineered, RMI would be used for agent transport as well as
maintaining shared state.
Primary Backup vs. Shared State
There were two design choices for implementing the shared state of the AXE. One
approach was to use a central server with one or more primary backup servers. This is
efficient for updates, but limited in backup capabilities to the number of backup machines.
Another approach was a shared state system where each node maintains an identical
data structure with all pertinent AXE information. This is more processor and network
intensive, but maintains higher availability and better matches the peer-focused agent
environment.
The shared state system was decided to be the best implementation for the AXE. This
turned out to be simpler to implement since there is no primary server or list of backup
servers to keep track of or allocate. It also simplifies adding new nodes to the system and
controlling updates.
Scalability
The downside of choosing the shared state approach is that it does not scale well. Each node locks down the entire AXE every time it joins the AXE, exits the AXE, launches an agent, or terminates an agent. With hundreds of nodes, this would result in excessive network traffic and reduced processing time due to the locks. Ideally, future implementations would provide a method of linking multiple AXEs together without sharing state between the different AXEs. This would allow agents to be authenticated and transported between AXEs based on geography, network topology, or human resources.
Agent Security
The AXE overrides the Java Security Manager in order to implement the above-mentioned security model.
The Security Policy for a node is defined by a Privilege Setting List. This list defines the
agent security context for the node by containing information on each aspect of the Agent
Security Manager. When the Agent Security Manager receives a specific security check
request, the appropriate context within the Privilege List is checked. If the Privilege Setting
allows the action that prompted the security check, then the check is a success and the
calling thread is allowed to continue executing.
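A Privilege Setting List can be pictured as a map from a named privilege to an allow/deny flag, as in the sketch below. The class name and the string privilege identifiers are illustrative assumptions, not the project's actual representation.

    import java.util.HashMap;
    import java.util.Map;

    class PrivilegeSettingList {
        private final Map<String, Boolean> settings = new HashMap<>();

        void set(String privilege, boolean allowed) { settings.put(privilege, allowed); }

        // Unknown privileges fall back to the node's default policy for
        // anonymous agents; in this sketch the default is "deny".
        boolean isAllowed(String privilege) {
            Boolean allowed = settings.get(privilege);
            return allowed != null && allowed.booleanValue();
        }
    }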
There is one major difficulty when dealing with the Java Security Manager: there can only
be one Security Manager per Java Virtual Machine. To explain this, take into account the
above mentioned security implementation and the following scenario. An administrator
identifies an agent that is abusing the node’s networking capabilities. Instead of killing the
agent, the administrator changes the default security policy so agents can no longer
accept socket connections. The difficulty is that with the general JVM, this now means
that the AXE can no longer accept socket connections either.
This problem is solved by authentication. If a security check fails the initial check by the
Privilege Setting List, then the Agent Security Manager checks the authentication level of
the calling thread. Where an agent may have restricted system access, the threads
involved in running the AXE do not.
This authentication is accomplished in two ways (a sketch follows this list):
- Thread Groups: Within the node, each agent is executed as a thread. When the thread is created it is assigned to the Agent Thread Group. Part of the security authentication process is to check the Thread Group of the calling thread. If the Thread Group is the Agent Thread Group, access is denied.
- Execution Stack: The Security Manager has the ability to access the class execution stack of the Java Virtual Machine. Authentication is also accomplished by searching the class names on the stack for a name containing the "agent" keyword. If such a match occurs, then an agent was the source of the security check and access is denied. This necessitates the requirement for each agent to have the keyword within its class name. Since all agents are verified by the node before they are executed, it is possible to deny access to agents that do not comply with the naming requirement.
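The sketch below shows how the two checks might be combined inside a SecurityManager subclass. The constructor argument and helper method are assumptions; the stack check here matches on the ambrosia.agent package prefix (the naming rule given in Appendix A) rather than on the bare "agent" keyword, to avoid matching the AXE's own classes.

    class AgentSecurityManager extends SecurityManager {
        private final ThreadGroup agentThreadGroup;

        AgentSecurityManager(ThreadGroup agentThreadGroup) {
            this.agentThreadGroup = agentThreadGroup;
        }

        // True when the current security check was triggered by an agent, in
        // which case the restricted Privilege Setting List applies.
        boolean callerIsAgent() {
            // 1) Thread Group check: agents run only inside the Agent Thread Group.
            ThreadGroup current = Thread.currentThread().getThreadGroup();
            if (agentThreadGroup.parentOf(current)) {
                return true;
            }
            // 2) Execution stack check: look for an agent class on the call stack.
            for (Class<?> frame : getClassContext()) {
                if (frame.getName().startsWith("ambrosia.agent.")) {
                    return true;
                }
            }
            return false;
        }
    }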
Both authentication processes are used in the current implementation of the Agent Security Manager; although each is reasonably safe on its own, the two together provide more safety than either alone.
There is another advantage to overriding the Java Security Manager: the system can maintain statistical information about the execution of an agent external to the log. This information can then be displayed to the User/Administrator in a simple quick-reference format.
Since each agent is given an Agent Control Block to maintain information on the agent as it executes, it is possible to accomplish resource tracking. This is done by maintaining flags within the Agent Control Block for each type of system resource. Thus, when the security manager determines that an agent is making a file system call, the File System flag is set to true. This is advantageous because it allows the Administrator to determine which agents are accessing the various aspects of the system without having to resort to the master log.
Tracking statistical data for each agent also allows the AXE to identify the stability of the agent. Stability is classified as one of three settings: Stable, Unstable, and Hostile. A Stable agent is one that is executing without error. An Unstable agent is one that has made at least one violation of the node's security policy. Such an agent is more likely to make further violations as well; the Unstable setting flags this to the Administrator. The final setting is Hostile. An agent is deemed Hostile if it has made multiple general security violations, and/or has tried to access parts of the node system that the Administrator has decided to "watch". For example, suppose an Administrator decides that he does not want agents to access the Java System for the node. A Hostile agent would be one that tries to access the Java System, even if it has not made the multiple security violations otherwise required to be classified as Hostile.
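The per-agent bookkeeping described above might look like the sketch below: resource flags set by the security manager plus a stability rating derived from the violation history. All names are illustrative assumptions.

    class AgentControlBlock {
        static final int STABLE = 0, UNSTABLE = 1, HOSTILE = 2;

        // Resource flags set as the security manager observes checks.
        boolean usedFileSystem;
        boolean usedNetwork;
        boolean usedSystemResources;

        private int violations;                  // security-policy violations so far
        private boolean touchedWatchedResource;  // accessed something the admin "watches"

        void recordViolation(boolean watchedResource) {
            violations++;
            if (watchedResource) {
                touchedWatchedResource = true;
            }
        }

        int stability() {
            if (touchedWatchedResource || violations > 1) return HOSTILE;
            if (violations == 1) return UNSTABLE;
            return STABLE;
        }
    }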
There is only one major way to improve upon the Agent Security Manager. With the addition of public key agent verification, it would be possible to have multiple privilege levels as opposed to the single level currently implemented. This would also be easy to implement. Each privilege level would require its own Privilege Setting List, and the code for this already exists. Additional logic would need to be added to the authentication method, responsible for mapping an agent to the relevant Privilege Setting List based on its hashcode and AXEKey. A better implementation would be to add a reference to the Privilege Setting List to the Agent Control Block when the agent is initially added to the node. When authentication is performed, the authentication function would merely need to extract this reference from the Agent Control Block for the current agent.
The Graphical User Interface
The Graphical User Interface of the AXE has been designed to give the maximum amount
of information to the user/administrator quickly. This way the administrator can react
quickly when a problem arises. For this reason the AXE GUI contains an Agent List, Active
Node List, and the Recent Log. The purposes of these elements have been introduced
above. The Recent Log was discussed in the Logging section.
The main object in the GUI is the Agent List. This lists all agents that are currently on the
node. In addition, the statistical data collected within the security manager is also
displayed here.
The only improvement currently under consideration is to have the Active AXE Node list
control the main display. Thus the GUI could be used to administer any node of the AXE.
Chapter 4: Agents
Specific Agent Implementations and Ideas
Distributed Processing
We implemented a distributed processing agent to show the tremendous increase in performance that harnessing multiple CPUs for a calculation delivers. In order to
distribute a computation, one must first devise a way to break the computation into
independent parts that can be processed individually and return results that can be easily
combined to yield the final result.
The calculation chosen was numerical integration, because the integral can be broken into many sections that can later be summed. This follows the process of Riemann sums, which work on the principle that an integral can be represented as a sum of rectangles whose height equals the value of the function at the rectangle's location. There can be great error in this process depending on the width of the rectangles used, so the rectangle width is varied depending on the slope of the function to reduce error. The function x^2 is the function used by the agent.
[Figures 4.1 and 4.2: rectangle approximations of the integral over x = 0.0 to 2.0.]
This method of approximation introduces some error where the corners of the rectangles fall below or above the true area, as shown in figure 4.3. If we assume the function is monotonic, then the worst case can be calculated where the function has a discontinuity immediately after the lower bound, as shown in figure 4.4. This comes out to a maximum error of (fn(max) - fn(min)) * (max - min).
[Figures 4.3 and 4.4: the error regions of the rectangle approximation over x = 0 to 2.0; figure 4.4 shows the worst case, where the error covers the whole region between the calculated area and the function's maximum.]
The distributed processing agent is executed by running a master application written in Java. It creates an agent interface, an agent, and a special data object used by the distributed processing agent. It initializes the data with zero for a lower bound and its local host as the site to send results to. The master sets the upper bound and threshold according to the command line parameters. Once the data object is initialized, the agent and its data are sent to the AXE on the local host.
When the distributed processing agent first starts executing, it calculates the worst case
error and checks to see if it is less than the threshold value. If the error is greater than the
threshold value, then the agent sends itself to the next two hosts and terminates. The first
copy of itself is sent with bounds equal to the parent’s lower bound and mean. The second
copy of itself is sent with bounds equal to the parent’s mean and upper bound. In this way,
it splits the area to be summed into two smaller areas.
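The split-or-compute decision for f(x) = x^2 can be sketched as below, using the worst-case bound (fn(max) - fn(min)) * (max - min) from the previous section. replicate() and reportPartialSum() stand in for the AXE calls that send out child agents and return results to the master; here they simply recurse and print, so the sketch runs on one machine.

    class DistProcSketch {
        static double f(double x) { return x * x; }

        static void run(double lower, double upper, double threshold) {
            double worstCaseError = (f(upper) - f(lower)) * (upper - lower);
            if (worstCaseError > threshold) {
                double mean = (lower + upper) / 2.0;
                replicate(lower, mean, threshold);   // first child agent
                replicate(mean, upper, threshold);   // second child agent
            } else {
                // Error bound acceptable: approximate this slice with one rectangle.
                reportPartialSum(lower, upper, f(lower) * (upper - lower));
            }
        }

        static void replicate(double lower, double upper, double threshold) {
            run(lower, upper, threshold);   // the real agent sends a copy of itself instead
        }

        static void reportPartialSum(double lower, double upper, double sum) {
            System.out.println(lower + "-" + upper + " -> " + sum);
        }

        public static void main(String[] args) {
            run(0.0, 2.0, 0.01);
        }
    }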
The initial design called for an agent to tell the master that it was splitting. The master
would then reserve space for the children and wait for their results. This made resending
an agent that timed out simple, since the master only needed to check the list for children
that had not reported in. This design was flawed, however, because the children’s results
often preceded the parent’s notification that it was splitting. As a result the child’s response
would be lost and the parent’s splitting notice would create two empty entries.
This was fixed by modifying the protocol to be time independent. Instead of trying to predict what messages were expected, we sent only partial sums with boundaries. The master is then able to compare the boundaries of the sums it has so far to determine which ranges need to be recalculated. To assist this process, the master creates a binary tree of results. Each result has a value for the sum, lower and upper bounds, and a flag indicating whether it is partially or totally complete.
[Figure: the master's binary tree of results for the range 0.0-2.0, subdivided into 0.0-1.0 and 1.0-2.0, then 1.0-1.5 and 1.5-2.0, and further into 1.0-1.25, 1.25-1.5, 1.5-1.75, and 1.75-2.0.]
Fault tolerance is implemented by using Java's ServerSocket.setSoTimeout() to specify the number of milliseconds for the master to wait between agents reporting partial sums. We then catch the java.io.InterruptedIOException and send out a new agent to fill in the empty nodes in the tree.
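The master's receive loop might therefore look roughly like the sketch below. The ResultTree type and the resend step are hypothetical; only the setSoTimeout() call and the InterruptedIOException handling follow the text.

    import java.io.InterruptedIOException;
    import java.net.ServerSocket;
    import java.net.Socket;

    class MasterLoopSketch {

        interface ResultTree {
            boolean isComplete();
            void insert(Socket report) throws Exception;   // read and record a partial sum
            double[][] missingRanges();                    // [lower, upper] pairs still open
        }

        void collectResults(ServerSocket listener, ResultTree tree) throws Exception {
            listener.setSoTimeout(30000);                  // ms to wait for the next report
            while (!tree.isComplete()) {
                try {
                    Socket report = listener.accept();     // a slave agent reporting in
                    tree.insert(report);
                } catch (InterruptedIOException timedOut) {
                    // Nobody reported in time: launch new agents for the gaps.
                    resendAgentsFor(tree.missingRanges());
                }
            }
        }

        void resendAgentsFor(double[][] ranges) {
            // In the real system this replicates a new agent per missing range.
        }
    }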
FindFile
The purpose of the Find File agent is to demonstrate the ability of the Agent eXecution
Environment to share global resources. The resource in this case is long term storage
media. The agent gives the client user the ability to search for data concurrently on
multiple execution nodes, and then retrieve that data. The agent can be viewed as being
similar to the Find File utility found in Windows 95 and NT, but on a global scale as opposed to a local one.
The Agent consists of three parts: the application interface dialog, the slave agent and the
file transfer agent. The process is as follows:
1) The user fills in the interface dialog. The information entered can be an exact filename
(foobar.doc), or a substring of possible file names (foo*). This data is placed within the
Data Object of a slave agent. The slave agent is then replicated to all the nodes of
the AXE.
2) The slave agent searches the shared directory tree of each node for a match to the criteria housed in its Data Object (a sketch of this search follows the list). The slave compiles all matches and sends them back to the interface dialog.
3) The interface dialog collates all return data and displays it in a graphical list to the
user. The user then selects the file(s) that they wish to download. At this point the
interface dialog places the file name in the Data Object of a file transfer agent. The file
transfer agent is then sent to the appropriate node where it opens the file, and sends it
back across the network to the interface dialog.
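A hedged sketch of the slave agent's search in step 2: walk a shared directory tree and collect names containing the requested substring. The shared-directory root, the class name, and the treatment of "foo*" as a plain substring are assumptions made for illustration.

    import java.io.File;
    import java.util.ArrayList;
    import java.util.List;

    class FindFileSketch {
        static List<String> search(File dir, String pattern) {
            List<String> matches = new ArrayList<>();
            File[] entries = dir.listFiles();
            if (entries == null) {
                return matches;                             // unreadable directory
            }
            for (File entry : entries) {
                if (entry.isDirectory()) {
                    matches.addAll(search(entry, pattern)); // recurse into subdirectories
                } else if (entry.getName().contains(pattern)) {
                    matches.add(entry.getPath());
                }
            }
            return matches;
        }
    }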
Administrative Reporter
The administrative agent would allow one machine to monitor other machines on the network. We wanted to track as much information as possible; however, Java turned out to be a major limitation in this area, since achieving cross-platform execution comes at the cost of detailed system information. We would ideally like to track:
- Idle CPU cycles
- Free disk space
- Free RAM
- Network traffic (kb/s)
- Currently running processes
- Currently running agents
- Percentage of user/agent processing time
Unfortunately, the only information which is readily available from Java is (see the sketch after this list):
- Java VM version
- Java machine architecture
- Resources accessed
- Number of agents running on the node
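The first two of these values come straight from the standard system properties, as in the sketch below; the agent count would come from the node itself, so it is passed in as a parameter here.

    class AdminReportSketch {
        static String report(int runningAgents) {
            return "Java VM version:  " + System.getProperty("java.version") + "\n"
                 + "Architecture/OS:  " + System.getProperty("os.arch")
                 + " / " + System.getProperty("os.name") + "\n"
                 + "Agents running:   " + runningAgents;
        }
    }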
To use the Admin agent, one execution node will launch the agent, which will send a copy
of itself to a single node (arbitrarily). The agent will then bounce from node to node,
sending back status reports to the master. The agent can be configured to do a single
pass through the AXE or just keep rotating through all available nodes on a continuous
basis.
The master would watch the slaves for extreme values or known patterns. Upon detecting
a possible problem, a human would be notified via email or possibly numeric pager.
Humans could check AXE status at any time by viewing a web page which summarized
the current statistics.
We decided not to implement the Administrative Reporter agent for a few reasons. First,
some of the functionality originally intended for Admin was brought into the node itself,
such as tracking the agents running on each node. Second, Java turned out to be
extremely limited in the information available about the local system.
Intrusion Response Agent
While there are any number of examples where distributed agent technology can be
applied, we have elected to include a short discussion about how this technology can be
applied to the network security field.
After discovery of a network intrusion, determining the full extent of the intrusion often
requires a coordination effort across multiple machines and multiple sites. Among other
issues, this coordination requires detailed analysis of audit information that often extends
well beyond the realm of a single network. This is a long and painful process that requires a significant amount of human effort.
Agent technology can be leveraged to solve this problem quickly, and reduce the large
overhead of human effort involved in responding to network intrusions. A single agent
could follow audit trail information across multiple machines and multiple networks to the
extent permitted by individual nodes.
Of course, there are a whole set of other security issues introduced by applying agent technology across untrusted networks. Agent technology, a piece of mobile code performing some function in an automated fashion, could easily be applied for insidious purposes. There are also a number of issues about access to system resources, as well as potential denial of service attacks. Although we have thought through these problems to some extent, we will avoid going into great detail in order to preserve the brevity of this report.
Chapter 5: Discussion
The process, the results, the experience
Features Not Implemented
There were a number of features we were not able to implement due to time constraints. They are listed in order of rough priority.
- Persistent agent trail
- Sign all nodes, agents, and data (code for signing developed but not deployed)
- Distinction between "Allowed Agents" and "Anonymous Agents"
- Individual security policies for known agents
- Caching (developed but not deployed)
- Encourage sharing of resources with "The Golden Rule" policy
- Link together more than one AXE for scalability
Coordinating a Team of Programmers
Many of the beneficial lessons learned from this project revolved around issues of
coordinating a team of four programmers. One of the significant technical issues learned
by all of us was using CVS in order to maintain a shared state for our code development;
however, most of our lessons learned resulted from human interaction issues. These
issues included agreeing on meeting times, differences in schedules, differences in coding
style, and differences in work habits. In retrospect, we could have improved our process
by maintaining regular meeting times, providing regular status updates, and establishing
milestones, goals and deadlines for integration. Overall, we all learned some very valuable
lessons from this project that we can apply next time we work on a group project.
One tool which proved invaluable for the development of this project was a group email
discussion list. Besides providing basic group communication facilities, it also archived all
messages on a secure web site. This allowed us to hash out many of the details of our project.
Chapter 6: Summary
A few concluding comments
What was accomplished
This turned out to be a very interesting project for the four of us. We were a small group
relative to the others, and composed of people with very diverse backgrounds. Unlike
many other groups, none of us knew each other prior to taking this class and forming our
group. Needless to say, this created some obstacles to teamwork and communication
which took us a while to overcome. The first few weeks were spent getting a feel for each
team member’s capabilities, work style, and level of commitment.
Once we agreed on the autonomous agents theme, we became excited about the project and its possibilities. We were intrigued by the idea of a useful, cross platform FindFile
solution, as well as a cross platform server monitoring system. Two of us had never
programmed in Java before, and were excited to gain experience with it.
We were able to construct the basic Agent eXecution Environment and get it to run on three platforms: Solaris, Linux, and Windows NT. We could not get a Macintosh node up and running due to lack of support for rmiregistry. Agents can be introduced into the
environment and replicate themselves to multiple nodes. All nodes are aware of all other
nodes and agents in the system at all times. All agents are monitored by a node-specific
security policy with fine control over system resources. Detailed logging is present at
multiple levels with a variety of ways to access and view the log.
Of the three initially planned agents, we were able to get two functional prototypes operational. DistProc is an interesting example of an agent replicating itself many times for distributed processing. FindFile is a usable implementation of network searching. The administrative reporting agent turned out to be quite limited, since most of the information we would be interested in is still platform specific at this time (idle CPU cycles, process list, system resources).
Appendix A: Code Summary
The nitty-gritty
Overview
All parts of the AXE lie within the ambrosia package. The package is divided into two
sections, ambrosia.axe for code relating to the environment and nodes, and
ambrosia.agent for specific agent code and generic agent support code.
ambrosia.axe
ambrosia.axe.core
This contains fundamental code for the AXE.
ambrosia.axe.gui
This contains all of the code for the AXE graphical user interface.
ambrosia.axe.server
This contains the RMI stubs and other communication code.
ambrosia.axe.util
This contains utilities and tools used by the server, including authentication tools.
ambrosia.agent
ambrosia.agent.(agent)
This is where specific agents must be located. For security reasons, only agents with
“ambrosia.agent” in their package name will be allowed to execute on the AXE.
ambrosia.agent.core
This contains fundamental code required by all agents.
ambrosia.agent.sample
This is a sample agent useful for testing purposes.
ambrosia.agent.util
This includes utilities and tools which would be useful to agent developers.