Decision Making AI
John See
20 Dec 2010
Games Programming III (TGP2281) – T1, 2010/2011
Decision Making AI
• Ability of a game character to decide what to do
• Decision Making in Millington’s Model
Decision Making AI
• Decision Trees
• Finite State Machines (FSM)
• Rule-based Systems
• Fuzzy Logic & Neural Networks
• Blackboard Architecture
Decision Trees
• Fast, easy to understand
• Simplest technique to implement, but extensions to the basic algorithm can be sophisticated
• Typically used to control characters, animation or other in-game decision making
• Can be learned, and learning is relatively fast (compared to fuzzy logic/NN)
Decision Trees – Problem Statement
• Given a set of knowledge, we need to generate a corresponding action from a set of possible actions
• Map inputs to outputs – typically, the same action is used for many different sets of input
• Need a method to easily group many inputs together under one particular action, while letting the significant input values control the output
Decision Trees – Problem Statement
• Example: Grouping a set of inputs under an action
Enemy is visible, and < 10m away → Attack
Enemy is visible, still far (> 10m) but not at flank → Attack
Enemy is visible, still far (> 10m), at flank → Move
Enemy is not visible, but audible → Creep
Decision Trees – Algorithm Overview
• Made up of connected decision points
• Tree has starting decision, its root
• For each decision, starting from the root, one of a set of ongoing options is chosen
• The choice is made based on the character’s knowledge (internal/external) – fast, and no prior representation of the knowledge is needed!
• The process continues along the tree, making choices at each decision node, until there are no more decisions to consider
• At each leaf of the tree, an action is attached
• Action is carried out immediately
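• A minimal sketch of this traversal in C++ (hypothetical class names, not a listing from the textbook):

// Minimal decision tree sketch (hypothetical class names)
class DecisionTreeNode {
public:
    virtual ~DecisionTreeNode() {}
    // Walks the tree from this node and returns the leaf (action) that is reached
    virtual DecisionTreeNode* makeDecision() = 0;
};

class Action : public DecisionTreeNode {
public:
    // A leaf: returns itself so the caller can carry out the attached action
    virtual DecisionTreeNode* makeDecision() { return this; }
};

class Decision : public DecisionTreeNode {
public:
    DecisionTreeNode* trueNode;    // branch taken when the test passes
    DecisionTreeNode* falseNode;   // branch taken when the test fails

    // Tests the character's knowledge; a simple boolean check is assumed here
    virtual bool getBranch() = 0;

    virtual DecisionTreeNode* makeDecision() {
        // Choose a branch based on the test, then keep descending until a leaf is reached
        DecisionTreeNode* branch = getBranch() ? trueNode : falseNode;
        return branch->makeDecision();
    }
};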
Decision Trees
• Each decision in the tree checks a single value and doesn’t contain any Boolean logic (AND, OR)
• A representative set of decision types:
• Boolean – Value is true
• Enumeration – Matches one of the given set of values
• Numeric value – Value is within given range
• 3D Vector – Vector has a length within given range
• Examples?
Decision Trees – Combining Decisions
• AND two decisions – place in series in the tree
• OR two decisions – also use decisions in series, but the two actions are swapped over from the AND
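• A rough illustration in code (hypothetical decisions A, B and actions act1/act2):

// Two single-value decisions combined in series (act1/act2 are hypothetical actions)
void act1();
void act2();

void combinedAND(bool A, bool B) {
    // "A AND B": act1 only runs when both tests pass
    if (A) { if (B) act1(); else act2(); }
    else   { act2(); }
}

void combinedOR(bool A, bool B) {
    // "A OR B": the same decisions in series, but the two actions are swapped over
    if (A) { act1(); }
    else   { if (B) act1(); else act2(); }
}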
Decision Complexity
• Number of decisions that need to be considered is usually much smaller than number of decisions in the tree.
• Imagine using a long chain of IF-ELSE statements to test every decision instead!
• Method of building DTs: start with a simple tree; as the AI is tested in game, additional decisions can be added to trap special cases or add new behaviors
Decision Making - Branching
• So far, we have considered only binary trees – decisions choose between 2 options.
• It is possible to build a DT whose decisions have any number of options, or where different decisions have different numbers of branches
Decision Making - Branching
• Deep DTs may result in the same value being checked numerous times before a decision is found
• Flat DTs are more efficient, requiring fewer decision checks
Decision Making - Branching
• Still common to find DTs using only binary decisions
• Why?
• Underlying code for multiple branches simplifies down to a series of binary tests (IF-ELSE statements)
• Binary DTs are easier to optimize. Some learning algorithms that work with DTs require them to be binary
Decision Making – Performance
• Takes practically no memory; performance is linear with the number of nodes visited
• Assuming each decision takes a constant amount of time and the tree is balanced, performance is O(log2 n), where n is the number of decision nodes in the tree
• This DOES NOT consider the execution time of the different checks required in the DT, which can vary a lot!
Decision Making – Balancing the Tree
• DTs run fastest when the tree is balanced
• A balanced tree keeps the height of its branches approximately equal (within 1 to be considered balanced).
In our context, it will have about the same number of decision making levels on each branch
Decision Making – Balancing the Tree
• 8 behaviors, 7 decisions
• 1st tree – extremely unbalanced, 2nd tree – balanced
• To get to H, the 1st tree needs 7 decisions, the 2nd tree needs only 3
Decision Making – Balancing the Tree
• If all behaviors are equally likely , what is the average number of decisions needed for both trees?
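• One possible working (assuming the unbalanced tree in the figure is a simple chain of the 7 decisions): the balanced tree reaches every behavior in 3 decisions, so the average is 3; the unbalanced tree reaches the behaviors in 1, 2, 3, 4, 5, 6, 7 and 7 decisions respectively, giving an average of (1 + 2 + 3 + 4 + 5 + 6 + 7 + 7) / 8 = 35/8 ≈ 4.4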
Decision Making – Balancing the Tree
• If we were likely to end up at decision A the majority of the time, which tree is more efficient?
• How do we treat decisions that are time-consuming to run?
Let’s say A is the most time-consuming decision…
Decision Making – Merging Patterns
• DTs can be extended to allow multiple branches to merge into a new decision – efficient, but some care is needed!
• A decision/action can be reached in more than one way
• Take care to avoid creating loops – with a loop, the tree may never find an action leaf
Random Decision Trees
• To provide some unpredictability and variation to making decisions in DTs
• Simplest way: Generate a random number and choose a branch based on its value
• DTs are normally intended to run frequently, reacting to the game state, so random decisions can be a problem
• What is a potential problem with the following DT?
Random Decision Trees
Possible considerations…
1. Allow the random decision to keep track of what it did last time
• When the decision is first considered, a choice is made at random and that choice is stored. The next time the decision is considered, there is no more randomness – the previous choice is maintained.
• If something in the world changes and a different part of the tree is reached, the stored choice should be discarded.
2. Timing out: allow the stored choice to “time out” after a set period, so that a random choice is made again. This gives variety and realism.
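• A rough sketch combining both ideas (hypothetical names; getCurrentFrame() is an assumed game-side helper):

// Random binary decision that remembers its last choice and times out (illustrative sketch)
#include <cstdlib>

extern int getCurrentFrame();   // assumed game-side helper returning the current frame number

class RandomDecision {
public:
    RandomDecision(int timeOutFrames)
        : timeOut(timeOutFrames), lastVisit(-10), choiceMadeAt(-10), lastChoice(false) {}

    // Returns which branch to take, keeping the previous choice while this decision
    // is being reached every frame, until the stored choice times out
    bool getBranch() {
        int now = getCurrentFrame();
        bool notVisitedRecently = (now > lastVisit + 1);   // the tree took a different path in between
        bool timedOut = (now > choiceMadeAt + timeOut);    // stored choice is too old
        if (notVisitedRecently || timedOut) {
            lastChoice = (std::rand() % 2) == 0;           // make a fresh random choice
            choiceMadeAt = now;
        }
        lastVisit = now;
        return lastChoice;
    }

private:
    int timeOut;       // frames before a new random choice is forced
    int lastVisit;     // frame at which this decision was last considered
    int choiceMadeAt;  // frame at which the stored choice was made
    bool lastChoice;   // the stored choice
};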
State Machines
• Often, characters in a game act in one of a limited set of ways
• Carry on doing the same thing until some event or influence makes them change
• Can use decision trees, but it is easier to model this behavior using state machines (or finite state machines, FSM)
• State machines take into consideration
• the world around them
• their internal state
State Machines – Basics
• Each AI character occupies exactly one state at any instant
• Actions or behaviors are associated with each state
• So long as the character remains in that state, it will continue carrying out the same actions/behavior
• States are connected by transitions
• Each transition leads from one state to another (the target state), and each has a set of associated conditions
• Changing states: when the game determines that the conditions of a transition are met, the transition triggers and the new state is entered
State Machines – Simple Example
• State machine to model a soldier – 3 states
• Each state has its own transitions
• The solid circle (with a transition w/o trigger condition) points to the initial state that will be entered when the state machine is first run
State Machines vs. Decision Trees
• Now, can you name some obvious differences between making decisions with decision trees and with state machines?
Finite State Machines (FSM)
• In game AI, a state machine with this kind of structure
(as seen earlier) is usually called a finite state machine
(FSM)
• An FSM has a finite number of states and transitions
• It has finite internal memory to store its states and transitions
FSM – Generic Implementation
• Use a generic state machine interface that keeps track of the set of possible states and records the current state it is in
• With each state, a series of transitions are maintained.
Each transition is also a generic interface with conditions
• At each iteration (game loop), an update function is called to check if any of the transitions from the current state is triggered.
• If a transition is triggered, then the transition will be fired
• The separation of triggering and firing of transitions allows the transitions to have their own actions
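• A bare-bones sketch of such a generic interface (hypothetical names, heavily simplified; refer to the textbook for the full listing):

// Bare-bones generic FSM sketch (hypothetical names, heavily simplified)
#include <vector>
#include <string>

class State;

class Transition {
public:
    virtual ~Transition() {}
    virtual bool isTriggered() = 0;                  // checks this transition's conditions
    virtual State* getTargetState() = 0;             // state to move to when fired
    virtual std::string getAction() { return ""; }   // optional action owned by the transition
};

class State {
public:
    virtual ~State() {}
    virtual std::string getAction() = 0;             // action(s) to perform while in this state
    std::vector<Transition*> transitions;            // outgoing transitions
};

class StateMachine {
public:
    State* currentState;

    // Called once per game loop; returns the action to carry out this frame
    std::string update() {
        // Check whether any transition from the current state is triggered
        for (size_t i = 0; i < currentState->transitions.size(); ++i) {
            Transition* t = currentState->transitions[i];
            if (t->isTriggered()) {
                // Fire it: change state and perform the transition's own action (if any)
                currentState = t->getTargetState();
                return t->getAction();
            }
        }
        // No transition fired: keep performing the current state's action
        return currentState->getAction();
    }
};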
FSM – Generic Implementation
• Refer to the textbook or other references for a more in-depth code-level implementation of FSMs
FSM – Complexity
• State machines only require memory to hold a triggered transition and the current state
• O(1) in memory and O(m) in time, where m is the (average) number of transitions per state
• The algorithm calls other supporting functions to perform actions, etc. These probably account for most of the time spent in the algorithm.
• Hard-coded FSM – inflexible, does not give level designers control over building the FSM logic
Hard-coded FSM
• Hard-coded FSM:
• Consists of an enumerated value indicating which state is currently occupied, and a function that checks whether a transition should be followed
• States are HARD-CODED, and limited to what was HARD-CODED
• Pros – easy and quick to implement, useful for small FSMs
• Cons:
• Inflexible, does not give level designers control over building the FSM logic
• Difficult to maintain (alter) – large FSMs lead to messy code
• Every character needs its own AI behaviors coded by hand…
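• For contrast, a tiny hard-coded FSM for a soldier-like character might look like this (state names and condition helpers are invented for illustration):

// Hard-coded FSM sketch (illustrative states and condition helpers only)
enum SoldierState { PATROL, ATTACK, FLEE };

extern bool enemyVisible();   // assumed game-side condition checks
extern bool lowHealth();

class SoldierAI {
public:
    SoldierAI() : state(PATROL) {}

    void update() {
        // Each state hard-codes its own transition tests
        switch (state) {
        case PATROL:
            if (enemyVisible()) state = ATTACK;
            else                patrolAction();
            break;
        case ATTACK:
            if (lowHealth())          state = FLEE;
            else if (!enemyVisible()) state = PATROL;
            else                      attackAction();
            break;
        case FLEE:
            if (!enemyVisible()) state = PATROL;
            else                 fleeAction();
            break;
        }
    }

private:
    SoldierState state;
    void patrolAction() { /* ... */ }
    void attackAction() { /* ... */ }
    void fleeAction()   { /* ... */ }
};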
Hierarchical State Machines
• A single state machine is a powerful tool, but it can still have difficulty expressing some behaviors
• This is especially so if you wish to model somewhat different behaviors from more than one state machine for a single AI character
• Example: modeling alarm behaviors with a hierarchical s/m (using a basic cleaning robot state machine)
Hierarchical State Machines
• If the robot needs to get power when it runs low and resume its original duties after recharging, these transitions must be added to ALL existing states to ensure the robot acts correctly
Hierarchical State Machines
• This is not very efficient. Imagine if you had to add many more concurrent behaviors to your primary state machine!
Hierarchical State Machines
• A hierarchical state machine for the cleaning robot
• Nested states – could be in more than one state at a time
• States are arranged in a hierarchy – the state machine at the next level down is only considered when the higher-level state machine is not responding to its alarm
Hierarchical State Machines
• H* is the “history state” node
• When the composite state (lower hierarchy) is first entered, the H* node indicates which sub-state should be entered
• If composite state already entered, then previous sub-state is restored using the H* node
Hierarchical State Machines
• Hierarchical state machine with cross hierarchy transition
• Most hierarchical s/m support transitions between levels of the hierarchy
• Let’s say we want the robot to go back to refuel when it does not find any more trash to collect…
Hierarchical State Machines
• Refer to textbook for more details on its implementation
• Performance:
• O(n) in memory (n is number of layers in hierarchy)
• O(nt) in time, where t is the number of transitions per state
DT + SM
• Combining decision trees and state machines
• One approach: Replace transitions from a state with a decision tree
• Leaves of DT (rather than straightforward conditions/actions) are now transitions to other states
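• As a rough sketch (building on the hypothetical DecisionTreeNode, State and Transition classes from the earlier sketches), a decision-tree leaf can simply hold the transition’s target state:

// Decision-tree leaf acting as a state machine transition (sketch, hypothetical types)
class TargetStateLeaf : public DecisionTreeNode {
public:
    State* targetState;                         // where to go if this leaf is reached
    virtual DecisionTreeNode* makeDecision() { return this; }
};

// A transition whose condition is a whole decision tree rather than a single test
class DecisionTreeTransition : public Transition {
public:
    DecisionTreeNode* root;                     // the tree replacing a simple condition
    TargetStateLeaf* lastResult;

    virtual bool isTriggered() {
        // Walk the tree; only leaves of type TargetStateLeaf trigger the transition
        lastResult = dynamic_cast<TargetStateLeaf*>(root->makeDecision());
        return lastResult != 0;
    }

    virtual State* getTargetState() {
        return lastResult ? lastResult->targetState : 0;
    }
};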
DT + SM
• To implement the same state machine without decision tree transitions…
• We may need to model complex conditions that require more checks per transition
• This may be time-consuming, as the conditions need to be checked all the time
Fuzzy Logic
• Founded by Lotfi Zadeh (1965)
• “the essence of fuzzy logic is that everything is a matter of degree”
• Imprecision in data…
• and uncertainty in solving problems
• Fuzzy logic vs. Boolean logic
• 50%–80% fewer rules than traditional rule-based systems to accomplish identical tasks
• Examples: Air-conditioner thermostat or washing machine
Fuzzy Logic in Games
• Example of uses:
• To control movement of bots/NPCs (to smooth out movements based on imprecise target areas)
• To assess threats posed by players (to make further strategic decisions)
• To classify player and NPCs in terms of some useful game information (such as combat or defensive prowess)
Crisp data & Fuzzy data
• Crisp data (real numbers, value)
• Fuzzy data (a predicate or description, with degree value)
• Fuzzy logic gives a predicate a degree value. Instead of belonging to a set or being excluded from it (1 or 0, Boolean logic), everything can partially belong to a set, and some things belong more than others
Fuzzy sets
• Fuzzy sets – the numeric value is called the degree of membership (these values are NOT probability values!)
• For each set, a degree of membership of 1 is given to something completely in the set, and a degree of membership of 0 to something completely outside the fuzzy set
• It is typical to use integers in the implementation instead of floating-point values (between 0 and 1), for fast computation in game
• Note : Anything can be a member of multiple sets at the same time
Fuzzy Control / Inference Process
• 3 basic steps in a fuzzy control or fuzzy inference process
Step 1 - Fuzzification
• Mapping process – converts crisp data (real numbers) to fuzzy data (degree of membership)
• E.g.: Given a person’s weight, find the degree to which a person is underweight, overweight or at ideal weight
Membership Functions
• Membership functions map input variables to a degree of membership in a fuzzy set, between 0 and 1
• Any function can be used; the shape is usually governed by the desired accuracy, the nature of the problem, or ease of implementation
• Boolean logic m/f
Membership Functions
• Grade m/f
• Reverse grade m/f
• Triangular m/f
• Trapezoid m/f
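• A few of these shapes in code, applied to the earlier weight example (the breakpoint values below are made up purely for illustration):

// Simple membership function shapes (breakpoints are invented for illustration)

// Reverse grade: 1 below x0, falling linearly to 0 at x1
float reverseGrade(float x, float x0, float x1) {
    if (x <= x0) return 1.0f;
    if (x >= x1) return 0.0f;
    return (x1 - x) / (x1 - x0);
}

// Grade: 0 below x0, rising linearly to 1 at x1
float grade(float x, float x0, float x1) {
    return 1.0f - reverseGrade(x, x0, x1);
}

// Triangular: 0 at x0, 1 at the peak, back to 0 at x1
float triangular(float x, float x0, float peak, float x1) {
    if (x <= x0 || x >= x1) return 0.0f;
    if (x <= peak) return (x - x0) / (peak - x0);
    return (x1 - x) / (x1 - peak);
}

// Fuzzify a person's weight (kg) into three sets with made-up breakpoints
void fuzzifyWeight(float kg, float& underweight, float& ideal, float& overweight) {
    underweight = reverseGrade(kg, 50.0f, 60.0f);
    ideal       = triangular(kg, 55.0f, 65.0f, 75.0f);
    overweight  = grade(kg, 70.0f, 85.0f);
}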
Membership Functions
• Earlier example of using a set of membership functions to represent a person’s weight
Step 2 – Fuzzy rule base
• Once all inputs are expressed in fuzzy set membership, combine them using logical fuzzy rules to determine degree to which each rule is true
• E.g.
• Given a person’s weight and activity level as input, define rules to make a health decision
• If overweight AND NOT active then frequent exercise
• If overweight AND active then moderate diet
• But having a fuzzy output such as “frequent exercise” is not enough – we need to quantify the amount of exercise (e.g. 3 hours per week)
Fuzzy rules
• Usually uses IF-THEN style rules
• If A then B
• A – antecedent / premise
• B – consequent / conclusion
• To apply the usual logical operators to fuzzy input, we need the following fuzzy axioms:
• A OR B = MAX(A, B)
• A AND B = MIN(A, B)
• NOT A = 1 – A
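• These axioms translate directly into code, e.g. a trivial sketch:

#include <algorithm>   // for std::min / std::max

// Fuzzy logical operators on degrees of membership in [0, 1]
inline float fuzzyAnd(float a, float b) { return std::min(a, b); }
inline float fuzzyOr (float a, float b) { return std::max(a, b); }
inline float fuzzyNot(float a)          { return 1.0f - a; }

// e.g. the worked example on the next slide:
// fuzzyAnd(0.7f, 0.3f) -> 0.3, fuzzyOr(0.7f, 0.3f) -> 0.7, fuzzyNot(0.7f) -> 0.3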
Fuzzy rules
• Earlier example on weight (and now also including height):
overweight AND tall = MIN(0.7, 0.3) = 0.3
overweight OR tall = MAX(0.7, 0.3) = 0.7
NOT overweight = 1 – 0.7 = 0.3
NOT tall = 1 – 0.3 = 0.7
NOT (overweight AND tall) = 1 – MIN(0.7, 0.3) = 0.7
• Note that these fuzzy axioms (AND, OR, NOT) are not the only definition of the logical operators. There are other definitions that can be used…
Complete Rule Base
• With the above m/f for each input variable, a common requirement is to construct a complete set of rules covering all possible combinations of inputs. In this case, we need 18 rules (2 × 3 × 3)
Rule evaluation (Creature example)
• We have an AI fuzzy decision making system, which needs to evaluate whether a creature should attack the player. Input variables: range, health, opponent ranking
Rule evaluation (Creature example)
• Rule base:
• If (in melee range AND uninjured) AND NOT hard then attack
• If (NOT in melee range) AND uninjured then do nothing
• If (NOT out of range AND NOT uninjured) AND (NOT wimp) then flee
• Given specific degrees for the input variables, we might get outputs that are like:
• Attack degree: 0.2
• Do nothing degree: 0.4
• Flee degree: 0.7
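• A sketch of how this rule base might be evaluated from fuzzified inputs, using the MIN/MAX axioms (structure and names are illustrative only):

// Evaluate the creature's rule base from fuzzified inputs (illustrative sketch)
#include <algorithm>

struct CreatureInputs {
    float meleeRange;   // degree of "in melee range"
    float outOfRange;   // degree of "out of range"
    float uninjured;    // degree of "uninjured"
    float hard;         // degree of "opponent is hard"
    float wimp;         // degree of "opponent is a wimp"
};

struct CreatureOutputs {
    float attack, doNothing, flee;
};

CreatureOutputs evaluateRules(const CreatureInputs& in) {
    CreatureOutputs out;
    // If (in melee range AND uninjured) AND NOT hard then attack
    out.attack    = std::min(std::min(in.meleeRange, in.uninjured), 1.0f - in.hard);
    // If (NOT in melee range) AND uninjured then do nothing
    out.doNothing = std::min(1.0f - in.meleeRange, in.uninjured);
    // If (NOT out of range AND NOT uninjured) AND (NOT wimp) then flee
    out.flee      = std::min(std::min(1.0f - in.outOfRange, 1.0f - in.uninjured), 1.0f - in.wimp);
    return out;
}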
Rule evaluation (Creature example)
• So what do we do with those fuzzy membership output values??
• Missing link: We also need to represent the output variable as a fuzzy membership set!
Step 3 – Defuzzification
• Defuzzification process: Fuzzy output → Crisp output
• From previous step, each rule in rule base results in a degree of membership in some output fuzzy set
• With the numerical output we got earlier (0.2 for attack,
0.4 for do nothing, 0.7 for flee), we shall construct a composite output membership function
Step 3 – Defuzzification
• Defuzzification cannot be exact/accurate, but there are methods that get as near as possible
• Using the Highest Membership Function
• Choose the fuzzy set with the highest degree of membership, and use the output value that represents that set
• 4 common points: min, max, average of min/max, bisector
• Very simple to implement but coarse defuzzification
Step 3 – Defuzzification
• Blending Based on Membership
• Blend each characteristic point based on its corresponding degree of membership
• E.g. a character with 0.2 attack, 0.4 do nothing, 0.7 flee will produce a crisp output given by (0.2 * attack direction) + (0.4 * do nothing direction) + (0.7 * flee direction)
• Make sure that the eventual result is normalized (otherwise result may be over-the-bounds or unrealistic)
• Common normalization technique: Divide total blended sum by the sum of fuzzy output values
• Minimum values blended (Smallest of Maximum, SoM)
• Maximum values blended (Largest of Maximum, LoM)
• Average values blended (Mean of Maximum, MoM)
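• A sketch of the blending approach for the creature example, assuming each output set has a single characteristic crisp value (the values below are invented):

// Defuzzify by blending characteristic points, weighted by degree of membership
float defuzzifyBlend(float attack, float doNothing, float flee) {
    // Hypothetical characteristic crisp values for each output set
    const float attackValue    = 1.0f;    // e.g. move towards the player
    const float doNothingValue = 0.0f;    // hold position
    const float fleeValue      = -1.0f;   // move away from the player

    float weighted = attack * attackValue
                   + doNothing * doNothingValue
                   + flee * fleeValue;

    // Normalise by the sum of the fuzzy output values so the result stays in range
    float total = attack + doNothing + flee;
    return (total > 0.0f) ? weighted / total : doNothingValue;
}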
Step 3 – Defuzzification
• Center of Gravity
• Also known as the Centroid of Area method – takes into account all membership values, rather than specific ones (largest, smallest, average, etc.)
• First, each m/f is cropped at the membership value of its set
• Center of mass is found by integrating each in turn. This point is chosen as output crisp value
• Unlike bisector of area method, we can’t compute this offline since we do not know in advance the fuzzy membership values, and how the m/f will be cropped
Misc: Dealing with Complex Rule Base
• We may have multiple rules in our rule base that result in the same output membership fuzzy set.
• E.g.
• Corner-entry AND going-slow THEN accelerate
• Corner-exit AND going-fast THEN accelerate
• Corner-exit AND going-slow THEN accelerate
• How do we deal with such situations? Which output membership value should we choose for accelerate?
Some Good Examples
• Two examples: a control example & a threat assessment example, taken from the “AI for Game Developers” reference book