23AIE231M Introduction to AI & Data Science
Minor: Artificial Intelligence & Data Science
November 2023

Intelligent Agents
Unit 02: L04-L06

Puja Dutta, PhD, Assistant Professor, Civil Engineering
Amit Agarwal, PhD, Professor, Electrical & Electronics Engineering | Cybersecurity
+91 97432 94057 | +91 98679 10690
https://www.amrita.edu/faculty/dr-puja-dutta
https://www.amrita.edu/faculty/amit-agarwal
https://www.linkedin.com/in/amit-agarwal-635a548

Outline
I. Definitions
II. Vacuum Cleaning Agent
III. Concept of Rationality
IV. PEAS – Performance, Environment, Actuators & Sensors
V. Environment Types
VI. Agent Structures
VII. Agent Types
VIII. Problem Set

1. Definitions [1/1]
An agent is anything that perceives its environment through sensors and acts upon that environment through actuators.
A percept is the set of all sensory inputs available to an agent at any given time. An agent's percept sequence is the time-ordered set of all percepts the agent has perceived.
If we have a mapping from the agent's percept sequence to the agent's choice of actions, then we can say the agent knows what it wants. A necessary condition for this mapping to exist is that the payoffs of the choices are known.
An agent function maps any given percept sequence to an action. An agent program implements the agent function.

2. Vacuum Cleaning Agent [1/1]
The slide showed the vacuum cleaner's environment (two cells, A and B) and a table mapping percept sequences to the vacuum cleaner's actions.
Problem: Should the amount of dirt collected in 1 hour be an objective? Under which situation would this objective be irrational?
Problem: What objective function would you write for the vacuum cleaner?
Problem: Before writing the objective function, you will need to define sensing and the environment. So, do that first.

3. Concept of Rationality [1/2]
What is rational at any given time depends on four things:
➢ The performance measure that defines the criterion of success.
➢ The agent's prior knowledge of the environment.
➢ The actions that the agent can perform.
➢ The agent's percept sequence to date.
Rational Agent: For each possible percept sequence, a rational agent should select an action that is expected to maximize its performance measure, given the evidence provided by the percept sequence and whatever built-in knowledge the agent has.
Problem: Consider the rule "if the current cell is clean, move left or move right; if it is not clean, suck" (see the sketch below). Is this rational for the vacuum cleaner if it has memory? What if it does not have memory?
Does rationality require omniscience? No! But a rational agent is expected to engage in information gathering prior to acting.
Problem: What sort of information can a vacuum cleaner gather? (About its own actions, and about its environment.)
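To make the rule in the problem above concrete, here is a minimal Python sketch of a memoryless reflex vacuum agent for the two-cell world; the function name and the (location, status) percept format are assumptions for illustration, not part of the slides. Note that once both cells are clean, the memoryless agent keeps shuttling between them, which is exactly the rationality concern raised above.

# A minimal sketch (illustrative assumptions) of the memoryless reflex rule
# for the two-cell vacuum world with cells A and B.

def reflex_vacuum_agent(percept):
    """Map the current percept directly to an action, with no memory."""
    location, status = percept          # e.g., ('A', 'Dirty')
    if status == 'Dirty':
        return 'Suck'
    elif location == 'A':
        return 'Right'
    else:
        return 'Left'

# Example: the agent is in cell A.
print(reflex_vacuum_agent(('A', 'Dirty')))   # -> Suck
print(reflex_vacuum_agent(('A', 'Clean')))   # -> Right (even if B is already clean)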
3. Concept of Rationality [2/2]
Information gathering is necessary, but not sufficient. The agent must also learn from the information it has gathered. Learning does not happen on sensed information alone: sensed information must be used in addition to a priori knowledge. Without learning, the rationality of the agent is fragile.
Example: A female sphex wasp will dig a burrow, then (1) go out, (2) sting a caterpillar, (3) drag it to the burrow, (4) enter the burrow, (5) check that the burrow is still in the desired state, (6) drag the caterpillar inside, and (7) lay its eggs. (8) The caterpillar serves as a food source when the eggs hatch. If, however, the caterpillar is moved a few inches away while the wasp is inside performing the check in step (5), the wasp will restart at step (3), dragging the caterpillar back and checking the burrow again. Even after dozens of caterpillar-moving interventions, the wasp fails to learn that its plan is failing.

4. PEAS [1/2]
An agent that relies only on prior knowledge lacks autonomy. A rational agent should be able to correct for partial or incorrect knowledge, or add to that knowledge base, in order to maximize its utility function.
Problem: Can an agent with no knowledge base be autonomous? Yes. After sufficient experience of its environment, the behavior of a rational agent can become effectively independent of its prior knowledge. Hence, the incorporation of learning allows one to design a single rational agent that will succeed in a vast variety of environments.
To analyze an agent, we need to specify the performance measure, environment, actuators, and sensors. This specification is called PEAS, or the Task Environment. A sample PEAS for a driverless taxi (after Russell & Norvig):
Performance measure: safe, fast, legal, comfortable trip; maximize profits.
Environment: roads, other traffic, pedestrians, customers.
Actuators: steering, accelerator, brake, signal, horn, display.
Sensors: cameras, sonar, speedometer, GPS, odometer, accelerometer, engine sensors, keyboard.

4. PEAS [2/2]
The slide tabulated PEAS for more examples.
Problem: Study these and suggest an addition of at least one more item to each of the PEAS.
Problem: Are the environment states fully or partially observable?
Problem: Is the impact of an actuator's action fully or partially observable?
Problem: Is each system, including the driverless taxi system, single-agent or multi-agent? Justify.
Note: an object is an agent only if its actions depend on those of another object in the environment.

5. Environment Types [1/2]
If the next state of the environment is completely determined by the current state and the action executed by the agent, then we say the environment is deterministic; else, stochastic. E.g., a driverless taxi's environment is stochastic because one can never predict exactly the behavior of traffic, or when a tire will blow out or an engine will seize up.
In an episodic task environment, the agent receives a percept and then performs a single action, independent of the actions taken in previous episodes. E.g., once a defective part on an assembly line is detected, it is removed, regardless of the previous actions of the checking robot. In a sequential task environment, current decisions affect all future decisions, e.g., a driverless car or chess. Sequential task environments are much more difficult than episodic ones.
If the environment can change while an agent is deliberating what to do next, then we say the environment is dynamic for that agent; otherwise, it is static. Chess is static.
The sensing or action space can be continuous or discrete. The environment could be known or unknown. (See the sketch below.)
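As a concrete illustration of these dimensions, the sketch below records them as a small Python data structure; the class, its fields, and the two example classifications are assumptions for illustration (the crossword and taxi rows follow the usual textbook treatment in Russell & Norvig).

# A minimal sketch: the environment dimensions discussed above as a small
# data structure. Names, fields, and classifications are illustrative.
from dataclasses import dataclass

@dataclass
class TaskEnvironment:
    name: str
    fully_observable: bool
    single_agent: bool
    deterministic: bool
    episodic: bool
    static: bool
    discrete: bool
    known: bool

# Classic classifications (after Russell & Norvig):
crossword = TaskEnvironment("Crossword puzzle",
                            fully_observable=True, single_agent=True,
                            deterministic=True, episodic=False,
                            static=True, discrete=True, known=True)

taxi = TaskEnvironment("Driverless taxi",
                       fully_observable=False, single_agent=False,
                       deterministic=False, episodic=False,
                       static=False, discrete=False, known=True)

for env in (crossword, taxi):
    print(env)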
5. Environment Types [2/2]
The hardest case is partially observable, multiagent, stochastic, sequential, dynamic, continuous, and unknown. Examples of task environments and their characteristics were tabulated on the slide; the sketch above classifies two of them.

6. Agent Structures [1/3]
Earlier we described i) how agents interact with their environment and ii) the environment types they experience. We now switch attention to how agents map percepts to actions. This covers the agent's sensors, end effectors, and the agent program. An agent that takes only the current percept as input needs far less memory than one that works with the entire percept history.
The slide showed pseudocode for an agent that i) maintains the entire percept history and ii) maintains a mapping from percept sequences to actions (a runnable sketch follows at the end of this subsection). Problem: Does it work with the percept history, or with percepts alone?

6. Agent Structures [2/3]
If at stage t the number of possible percepts is #℘_t and there are a total of T observations, then the lookup table will have #℘_1 × #℘_2 × ⋯ × #℘_T entries.
Problem: Can a (solely) table-driven approach be practical? Justify. A camera of a driverless taxi captures 3-channel images with 8-bit quantization at a 10 Hz sampling rate. Each image is 640 × 480 pixels. a) Estimate the size of one instantaneous percept. b) Estimate the storage needed for the entire history of percepts over an hour of driving.
Solution: 3 channels at 8-bit quantization means 24 bits, i.e., a 3-byte representation per pixel. Number of pixels in one frame: 640 × 480 = 3.072 × 10^5.
a) Thus, one percept needs (3 × 3.072 × 10^5)/1024 = 900 kB.
b) One hour of driving needs (900 × 10 × 3600)/(1024 × 1024) = 30.9 GB. But this is NOT the memory needed to store the table over the 1-hour percept history; it is only the memory needed to store 1 hour of video. The number of possible percept sequences over an hour is vastly larger: even grossly undercounting, and treating each 9.216 × 10^5-byte frame as just one of 9.216 × 10^5 possibilities, a table over 10 × 3600 = 36,000 frames would need (9.216 × 10^5)^36000 ≈ 10^214,723 entries, which is ≫ the number of atoms in the observable universe (about 10^80).

6. Agent Structures [3/3]
The example illustrates that there will not be any:
• physical space available to store all the percepts;
• computer fast enough to create the percept table;
• agent that could learn all the right table entries from its experience;
• agent that will know enough to fill the table of percept histories and actions.
The key challenge for AI is to find out how to write programs that, to the extent possible, produce rational behavior from a smallish program rather than from a vast table.
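Here is a minimal runnable sketch of the table-driven agent discussed above, for the two-cell vacuum world; the table contents and names are illustrative assumptions. The point to notice is that the table is indexed by the entire percept history, so it needs one row per possible sequence, which is what makes the approach infeasible.

# A minimal sketch of a table-driven agent: it appends each new percept to
# the full percept history and looks the whole history up in a table.
# The tiny example table below is an illustrative assumption.

percepts = []  # the entire percept history, growing without bound

# One entry per *complete* percept sequence; this is why the table explodes.
table = {
    (('A', 'Dirty'),): 'Suck',
    (('A', 'Clean'),): 'Right',
    (('A', 'Clean'), ('B', 'Dirty')): 'Suck',
    # ... one row for every possible percept sequence, of every length
}

def table_driven_agent(percept):
    percepts.append(percept)
    return table.get(tuple(percepts), 'NoOp')  # undefined histories -> NoOp

print(table_driven_agent(('A', 'Clean')))   # -> Right
print(table_driven_agent(('B', 'Dirty')))   # -> Suck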
7. Agent Types [1/8]
Four types of agent programs:
• Simple reflex agents;
• Model-based reflex agents;
• Goal-based agents;
• Utility-based agents.
1. SIMPLE REFLEX AGENTS: These choose actions on the basis of the current percept alone. The slide showed pseudocode implementing the agent function tabulated earlier in the vacuum-cleaner slide; the program is quite small compared to the agent function. This reduction is a natural outcome of ignoring the percept history, which cuts the possibility space from 4^T down to just 4.

7. Agent Types [2/8]
A further small reduction comes from the fact that the decision whether to suck does not depend on which cell the agent is in. Thus, the instantaneous percept space reduces from {A_clean, A_dirty, B_clean, B_dirty} to {A, B, Dirty?}. The INTERPRET-INPUT function generates an abstracted description of the current state from the percept, and the RULE-MATCH function returns the first rule in the set of rules that matches the given state description.

7. Agent Types [3/8]
In the figure: rectangles denote the current internal state of the agent's decision process, and ovals denote the background information used in the process.

7. Agent Types [4/8]
Challenges with simple reflex agents:
➢ One may need more than one frame to decide, as a single frame may present misleading information (e.g., the 2018 Paris riots).
➢ Partial observation may lead to infinite looping (the vacuum example). Randomization can help escape infinite looping. Thus, can we say that some degree of irrationality, in some contexts, is more intelligent?

7. Agent Types [5/8]
2. MODEL-BASED REFLEX AGENTS: One way to deal with partial observability is to keep track of the part of the world that cannot be perceived now, i.e., to maintain it as internal state. In the case of the video of the 2018 Paris riots, one could a) keep some frames in memory. This is necessary but not sufficient. One also needs some knowledge of the world, i.e., b) a model of the world, which leads to questions such as:
➢ If the smoke or fire is so massive, what is it that is burning?
➢ Given that Paris is densely populated, do I see reports of people dead or injured, given the rather strong impression of unrest the image conveys?
➢ Do I see any reports of the loss of something treasured, important, or expensive?
➢ …
Finally, we need c) a model of how the world changes due to the agent's own actions.

7. Agent Types [6/8]
Note: the models are approximations of reality, not reality itself.
Figure: a model-based reflex agent.
3. GOAL-BASED AGENTS: A model-based reflex agent may not be adequate for deciding what to do; a goal is needed. Actions under the same environment estimate and possibility set can differ dramatically for different goals. E.g., if a car sees the brake lights of the car right ahead, which has stopped due to a jam, it could continue to wait. Or, if it were carrying an emergency medical patient, it could back up and try to get away from the jam, perhaps by using a nonstandard route to the hospital, or even take the patient to another hospital.

7. Agent Types [7/8]
Goal-based agents can lead to simple actions, as in the ambulance example, or to a complex set of actions should the goal require them, e.g., the actions following the loss of a queen in a game of chess. While a goal-based agent is less efficient than a model-based reflex agent, it is also more flexible.
4. UTILITY-BASED AGENTS: Even goals are inadequate. Some paths to the goal are cheaper, more efficient, or more reliable than others. A utility function captures the costs and payoffs of reaching goals. It also helps choose an action when goals conflict, such as speed and safety. In real life, the world is only partially observable, and the effect of an action, i.e., both its cost and its payoff, is often stochastic rather than deterministic. The goal itself may be unreachable, or several goals may be incompatible. Each of these limitations means that focusing on a goal may not even be possible. Under such real-life conditions, an agent's rational option is to maximize expected utility.
5. LEARNING AGENTS: An agent that maximizes expected utility will have trouble doing so when its stored mappings between observations and actions lead to sub-optimal choices. Learning enables the agent to operate in initially unknown environments and to become more competent than its initial knowledge alone might allow.

7. Agent Types [8/8]
A learning agent comprises four components:
i) performance element: any of the (non-learning) agents discussed so far;
ii) critic: tells the agent how well it is doing;
iii) learning element: updates the percept history, the way the agent uses its sensors, and the mapping from the environment to the action space;
iv) problem generator: suggests actions that will lead to new and informative experiences.
Without a learning element, a performance element will keep doing the actions that are best given what it already knows. But if the agent is willing to explore a little, i.e., to accept suboptimal actions in the short run for the sake of exploration, it may discover better actions for the long run (see the sketch below).
Figure: a learning agent.
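The sketch below ties the last two ideas together: actions are scored by expected utility, and, in the spirit of the problem generator, the agent occasionally explores a random action (epsilon-greedy) instead of always exploiting its current estimates. All probabilities, utilities, and names are illustrative assumptions, not values from the slides.

# A minimal sketch: expected-utility action selection with a little
# exploration (epsilon-greedy). All numbers and names are illustrative.
import random

# Assumed toy values: P(outcome | action) and U(outcome) for a taxi
# deciding how to respond to a traffic jam.
outcome_probs = {
    'wait':   {'arrive_late': 0.9, 'arrive_on_time': 0.1},
    'detour': {'arrive_late': 0.3, 'arrive_on_time': 0.7},
}
utility = {'arrive_late': -10.0, 'arrive_on_time': 5.0}

def expected_utility(action):
    """EU(a) = sum over outcomes of P(outcome | a) * U(outcome)."""
    return sum(p * utility[o] for o, p in outcome_probs[action].items())

def choose_action(epsilon=0.1):
    """Exploit the best known action, but explore occasionally."""
    actions = list(outcome_probs)
    if random.random() < epsilon:               # explore: try something else
        return random.choice(actions)
    return max(actions, key=expected_utility)   # exploit: maximize EU

print({a: expected_utility(a) for a in outcome_probs})
print(choose_action())   # usually 'detour' (EU = 0.5 vs -8.5 for 'wait')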
8. Problem Set [1/1]
Q1: For each of the following intelligent systems, fill in the PEAS table:
• Health Monitoring for Artillery Barrels
• Practicing tennis against a wall
• Apple Disease Early Warning System
• Knitting a sweater
• Real-time Counting for Passengers in a Bus
• Equity bidding bot
• Environmental Comfort Monitoring in a Computer Lab
• Course recommendation system
Q2: Define each of the following terms in 3 lines: agent, agent function, agent program, rationality, autonomy, reflex agent, model-based agent, goal-based agent, utility-based agent, learning agent.
Q3: Write the objective function you seek to maximize over the next 5 years after graduation with respect to your personal life (only).