Designing Autonomous Agents for Process Control. Gensym

Designing AutonomousAgents for Process Control.
From: AAAI Technical Report WS-98-10. Compilation copyright © 1998, AAAI (www.aaai.org). All rights reserved.
AnshuMehra
am@gensym.com
Virginio Chiodini
vlc@gensym.com
GensymCorporation
125 CambridgePark
Drive
CambridgeMA,02140.
example,guaranteeddelivery and subject-based addressing.
ADEuses a delegation based event mechanism
similar to
JDK1.1 model.Agentsuse messagesto generate and listen
for events. EachAgent has a network-wideunique Nameand
can be endowedwith specific Properties. Usingan SQL
language, Agentscan query the Multi-Agentenvironmentin
search of Agentswith specific properties. ADE
also enables
Agentsto send messagesto other Agentsqualified by their
properties. AnAgentcan concurrently performmultiple
Activities whilemultiple Agentscan performa specific
Activity. AgentActivities can be Transient or Permanent.
TransientActivities are created and started by messagessent
by Agentsand are deleted whenthe tasks assigned to the
activity havebeencompleted.Permanentactivities are created
and started at Agentinitialization and are deleted at Agent
termination. Permanentactivities concurrentlyperformtasks
requested by multiple messages.
Duringthe past two years we have developedthe Agent
DevelopmentEnvironment (ADE)and have used ADE
multipleplanningand control applications. In this paperwe
provide a brief overviewof ADEand present two Adaptive
Control applications developedwith ADE.
The Agent DevelopmentEnvironmentis an integrated
platformfor buildingdistributed multi-agentapplications.
ADE
enables a graphical definition of Agentsand Agents’
Activities. ADEprovides a simulation environmentin which
Agentscan be tested before being deployed.ADEis built on
G2,an object-oriented graphical environment
that offers a
robust platform for the development
of real-time systems.
Distinctive componentsof ADEare:
¯
A predefmedclass hierarchy of agents and agent
components.
¯
Anagent communications"middleware".
AgentActivities are defined as Grafcet, using the ADE
Grafcet DevelopmentEnvironment. The Grafcet
Development
Environment
facilitates the design and
development
of single- and multi-thread AgentActivities as
standard IEC-848FunctionCharts with a sophisticated
graphical environment.Agentactivities definedas Grafcet
Charts can be executedin interpreted or compiledmode.In
Interpreted Mode,Agentsactivities can be monitoredand
suspendedwith a variety of graphical tools and can be
modifiedin real time by modifyingthe correspondingGrafcet
Chart.
A graphical programming
language to design and
develop Agents’behavior based on the Grafcet
standard.
¯
A complete debuggingenvironment.
¯
Adistributed simulationenvironment
to test MultiAgentapplications built with ADE.
¯
A deploymentcenter to deploy Agentsas G2objects or
JavaBeansin a Java virtual machine.
¯
Applications.
ADEis currently being used in two prototype
control applications:
A LeamingContext to enable Agentsto develop their
behaviorthrougha learning process.
Multi-Agentapplications built with ADEmayrun on a single
machineor on a distributed network. Agents communicate
through Messages.ADEprovides a basic direct addressing
messageservice, with someoptional functionality, for
113
¯
A ModelPredictive Control application of an
Induction HeatingProcess.
¯
A ModelFree Control application of an
Electrolytic MetalPlating Process.
Closed-LoopControl of an Induction Heating Process.
Agent-BasedDesign.
InductionHeatinginvolvesplacing an electrical conducting
workpiece in a varying magneticfield. Themagneticfield
induceseddycurrents at a powerlevel and frequencyin the
workpiece. The workpiece heats up becauseit has resistivity
to the eddycurrents inducedby the magneticfield. Thegoal
of inductionheating for forgingis to uniformlyheat a work
piece to the properforging temperature.Workpieces are
movedthroughseveral induction coils whosevoltage can be
controlled. Multiple workpieces maybe concurrently
processedwithin one inductioncoil. The goal of the closedloop control systemis to adjust the voltage of the induction
coils to stabilize the temperatureof the outgoingworkpieces,
managingvariations in temperatureof incomingpieces and
transitions from"run" to "hold" and backto "run" of the
workpieces alongthe heating line.
In both applicationsprocessentities are representedby
Agents.Wedistinguish two types of Agents:
¯
StaticAgents.
The behaviorof Static Agentsis defined whenAgentsare
created throughGrafcets or standard methods.Workpieces
heated by an Induction HeatingProcess are examplesof
Static Agents.The maintask of a "WorkPiece Agent"is to
evaluateits final temperatureprofile, giventhe valuesof the
control variables (for example,voltage and frequency)
utilizing the mathematicalmodelof inductionheating.
¯
Anadequatemathematicalmodelis available for the
induction heating process. Usingthe mathematicalmodel, we
developedand validated an AgentBasedprocess simulator.
Thesimulator is then used to enable Agentsto leaman
appropriatecontrol policy.
Closed-LoopControl of a Plating Process.
In a Plating Process work pieces are movedthrough a set
of baths that deposit various metals via electrolysis. At the
end of the process a control device checks the thickness of
each metallic layer and evaluates discrepancies between
set points and actual values. The process has two control
variables: the rectifier current of each electrolytic bath
and the line speed. Increasing the speed of the line
increases the production throughput, but demandshigher
current in the electrolytic baths. Highcurrent levels may
cause instabilities in the plating process. Whenthe
thickness of a specific metal is insufficient, plates mustbe
discarded, causing a loss in productivity. Onthe other
hand, raising thickness set points wouldcause a waste of
precious metals, increasing production costs. Otherstate
variables like metal concentration, temperature, and PH
affect the plating process. The goal of the closed-loop
control systemis to maintain the best possible equilibrium
amongthese competing requirements and to manage
unpredictable changesin the conditions of the
environmentthat affect the plating process. The
functional relationship betweencurrent, process time and
metal thickness can be determined only via complex
mathematical processes. Lengthy and costly system
identification procedures must be performedfor each
different type and shapeof plate to derive transfer
functions linking each control variable to the thickness of
metallic layers. Noaccurate modelis available to evaluate
the interaction amongthe different control variables.
114
DynamicAgents.
The behavior (policy) of DynamicAgentsis developedand
refined duringthe learning phase. All the Agentsthat operate
control variables are modeledas DynamicAgents. Dynamic
Agentsadapt their behaviorby interacting amongthemand
with the environmentthrough the LearningContext. The
LearningContextis a special activity of Dynamic
Agentsthat
continuouslymodifiestheir control by optimizing
environmentalfeedback, through a mappingbetween
perceptionsand actions. LearningContextsoperate until
stable and consistent sequencesof goodcontrol decisionsare
made.The LearningContext is activated wheneverthe
current control policies fail to producegoodresults in
unexploredregions of the state space. Severallearning
contextswith different learningstrategies havebeen
developed. All the DynamicAgentsof an application must
use the sameLeamingContext.
Whena modelof the process is not available (for examplein
the Plating Process)twooptions are available:
Learn an approximatemodelusing Supervised Learning
techniqueslike Feed-forwardNeuralNetworks.In this
case the learning processcantake place in a simulated
environment.Work-piecescan be represented as Static
Agentswhosebehavior is defined by a Neural Network.
.
Leamcontrol policies without learning a modelof the
environment.In this case a simulationenvironment
is not
available. TheLearningphasemustbe performedin realtime concurrentlywith the executionof the actual
production process. Sometechniques developedin
ReinforcementLearningenable Agentsto learn directly
state-action value. A model-freeapproachis feasible
only if the speedof convergenceof the LearningProcess
is fast.
Bothapproachesare currently being evaluated.