Designing Autonomous Agents for Process Control

From: AAAI Technical Report WS-98-10. Compilation copyright © 1998, AAAI (www.aaai.org). All rights reserved.

Anshu Mehra (am@gensym.com), Virginio Chiodini (vlc@gensym.com)
Gensym Corporation, 125 CambridgePark Drive, Cambridge, MA 02140

During the past two years we have developed the Agent Development Environment (ADE) and have used ADE in multiple planning and control applications. In this paper we provide a brief overview of ADE and present two Adaptive Control applications developed with ADE.

The Agent Development Environment is an integrated platform for building distributed multi-agent applications. ADE enables a graphical definition of Agents and Agents' Activities. ADE provides a simulation environment in which Agents can be tested before being deployed. ADE is built on G2, an object-oriented graphical environment that offers a robust platform for the development of real-time systems.

Optional messaging functionality includes, for example, guaranteed delivery and subject-based addressing. ADE uses a delegation-based event mechanism similar to the JDK 1.1 model. Agents use messages to generate and listen for events. Each Agent has a network-wide unique Name and can be endowed with specific Properties. Using an SQL language, Agents can query the Multi-Agent environment in search of Agents with specific properties. ADE also enables Agents to send messages to other Agents qualified by their properties.

An Agent can concurrently perform multiple Activities, while multiple Agents can perform a specific Activity. Agent Activities can be Transient or Permanent. Transient Activities are created and started by messages sent by Agents and are deleted when the tasks assigned to the activity have been completed. Permanent Activities are created and started at Agent initialization and are deleted at Agent termination. Permanent Activities concurrently perform tasks requested by multiple messages.

Distinctive components of ADE are:
• A predefined class hierarchy of agents and agent components.
• An agent communications "middleware".
• A graphical programming language to design and develop Agents' behavior, based on the Grafcet standard.
• A complete debugging environment.
• A distributed simulation environment to test Multi-Agent applications built with ADE.
• A deployment center to deploy Agents as G2 objects or JavaBeans in a Java virtual machine.
• A Learning Context to enable Agents to develop their behavior through a learning process.

Multi-Agent applications built with ADE may run on a single machine or on a distributed network. Agents communicate through Messages. ADE provides a basic direct addressing message service, with some optional functionality.

Agent Activities are defined as Grafcet, using the ADE Grafcet Development Environment. The Grafcet Development Environment facilitates the design and development of single- and multi-thread Agent Activities as standard IEC-848 Function Charts with a sophisticated graphical environment. Agent Activities defined as Grafcet Charts can be executed in interpreted or compiled mode. In Interpreted Mode, Agents' activities can be monitored and suspended with a variety of graphical tools and can be modified in real time by modifying the corresponding Grafcet Chart.

Applications.

ADE is currently being used in two prototype control applications:
• A Model Predictive Control application of an Induction Heating Process.
• A Model Free Control application of an Electrolytic Metal Plating Process.

Closed-Loop Control of an Induction Heating Process.

Agent-Based Design.

Induction Heating involves placing an electrically conducting workpiece in a varying magnetic field. The magnetic field induces eddy currents at a power level and frequency in the workpiece. The workpiece heats up because it has resistivity to the eddy currents induced by the magnetic field. The goal of induction heating for forging is to uniformly heat a workpiece to the proper forging temperature. Workpieces are moved through several induction coils whose voltage can be controlled.
Multiple workpieces may be concurrently processed within one induction coil. The goal of the closed-loop control system is to adjust the voltage of the induction coils to stabilize the temperature of the outgoing workpieces, managing variations in the temperature of incoming pieces and transitions from "run" to "hold" and back to "run" of the workpieces along the heating line.

An adequate mathematical model is available for the induction heating process. Using the mathematical model, we developed and validated an Agent-Based process simulator. The simulator is then used to enable Agents to learn an appropriate control policy.

In both applications process entities are represented by Agents. We distinguish two types of Agents:
• Static Agents. The behavior of Static Agents is defined when Agents are created, through Grafcets or standard methods. Workpieces heated by an Induction Heating Process are examples of Static Agents. The main task of a "Work Piece Agent" is to evaluate its final temperature profile, given the values of the control variables (for example, voltage and frequency), utilizing the mathematical model of induction heating.

Closed-Loop Control of a Plating Process.

In a Plating Process work pieces are moved through a set of baths that deposit various metals via electrolysis. At the end of the process a control device checks the thickness of each metallic layer and evaluates discrepancies between set points and actual values. The process has two control variables: the rectifier current of each electrolytic bath and the line speed. Increasing the speed of the line increases the production throughput, but demands higher current in the electrolytic baths. High current levels may cause instabilities in the plating process. When the thickness of a specific metal is insufficient, plates must be discarded, causing a loss in productivity. On the other hand, raising thickness set points would cause a waste of precious metals, increasing production costs.
Other state variables like metal concentration, temperature, and pH affect the plating process. The goal of the closed-loop control system is to maintain the best possible equilibrium among these competing requirements and to manage unpredictable changes in the conditions of the environment that affect the plating process.

The functional relationship between current, process time and metal thickness can be determined only via complex mathematical processes. Lengthy and costly system identification procedures must be performed for each different type and shape of plate to derive transfer functions linking each control variable to the thickness of metallic layers. No accurate model is available to evaluate the interaction among the different control variables.

• Dynamic Agents. The behavior (policy) of Dynamic Agents is developed and refined during the learning phase. All the Agents that operate control variables are modeled as Dynamic Agents. Dynamic Agents adapt their behavior by interacting among themselves and with the environment through the Learning Context.

The Learning Context is a special activity of Dynamic Agents that continuously modifies their control by optimizing environmental feedback, through a mapping between perceptions and actions. Learning Contexts operate until stable and consistent sequences of good control decisions are made. The Learning Context is activated whenever the current control policies fail to produce good results in unexplored regions of the state space. Several learning contexts with different learning strategies have been developed. All the Dynamic Agents of an application must use the same Learning Context.

When a model of the process is not available (for example in the Plating Process) two options are available:
• Learn an approximate model using Supervised Learning techniques like Feed-forward Neural Networks. In this case the learning process can take place in a simulated environment. Work pieces can be represented as Static Agents whose behavior is defined by a Neural Network.
• Learn control policies without learning a model of the environment. In this case a simulation environment is not available. The Learning phase must be performed in real time, concurrently with the execution of the actual production process. Some techniques developed in Reinforcement Learning enable Agents to learn state-action values directly. A model-free approach is feasible only if the Learning Process converges quickly.

Both approaches are currently being evaluated.
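As an illustration of learning state-action values without a model, the following is a minimal sketch of tabular Q-learning, one standard Reinforcement Learning technique; the text above does not state which algorithm the Learning Contexts use. The environment is a hypothetical toy stand-in for a plating bath: five discretized thickness bands with band 2 as the set point, and three actions (lower, hold, or raise the rectifier current). The names `toy_step`, `q_learning`, and `greedy_policy`, the reward, and all parameter values are assumptions made for this example.

```python
import random

TARGET_BAND = 2  # hypothetical set point: the middle of five thickness bands


def toy_step(state, action):
    """Hypothetical plant: action 0 lowers, 1 holds, 2 raises the thickness band."""
    next_state = min(4, max(0, state + action - 1))
    reward = -abs(next_state - TARGET_BAND)  # penalize distance from the set point
    return next_state, reward


def q_learning(step, n_states, n_actions, episodes=2000, steps=20,
               alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn Q(s, a) from observed transitions only; no process model is used."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = rng.randrange(n_states)  # each episode starts in a random band
        for _ in range(steps):
            # Epsilon-greedy selection: mostly exploit, occasionally explore.
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda i: q[s][i])
            s_next, reward = step(s, a)
            # Move Q(s, a) toward the one-step bootstrapped target.
            target = reward + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


def greedy_policy(q):
    """Return the index of the highest-valued action for every state."""
    return [max(range(len(row)), key=lambda i: row[i]) for row in q]


if __name__ == "__main__":
    q_table = q_learning(toy_step, n_states=5, n_actions=3)
    print(greedy_policy(q_table))
```

The agent only observes the transitions and rewards returned by `toy_step` and never consults a process model. Every update costs one interaction with the process, which is why convergence speed matters when the same loop has to run against a real production line rather than a simulator.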