KNOWLEDGE REPRESENTATION
• Classical cognitive science and Artificial
Intelligence relied on the idea of
“knowledge representation”
The Representational-Computational theory of mind
1. Knowledge consists of mental
representations (mental symbols).
2. Thinking consists of the manipulation of
these symbols.
3. These computations have effects on
behavior or on other representations.
THE COMPUTER ANALOGY
• The mind is like a computer.
• A computer consists of:
1) Symbols or data structures:
– A string of letters, like “abc”.
– Numbers, like 3.
– Lists
– Trees
– Etc.
2) Algorithms (step-by-step procedures to operate on those
structures).
For instance, a procedure may “reverse” the order of
elements in a list.
Computers and mind

COMPUTERS | MIND
Data structures | Mental representations
Algorithms | Computations
Running programs | Thinking
“The fundamental working hypothesis of AI
is that intelligent behavior can be precisely
described as symbol manipulation and can
be modeled with the symbol processing
capabilities of the computer.”
Robert S. Engelmore and Edward Feigenbaum
• A cognitive theory describes the mental
representations (symbols) and the
procedures or computations on these
representations.
• What kinds of representations are there?
Different theories have different views about how
the mind represents knowledge.
The “symbols” could include:
IMAGES
LOGICAL SYMBOLS
RULES
CONCEPTUAL SYMBOLS (frames, scripts).
IMAGES
• Empiricists and other philosophers
believed that mental representations are
mainly visual images.
• It is clear that we sometimes think in terms
of images.
• In the exercise below (not reproduced here), for example, we must
manipulate (rotate) images.
• Most philosophers, however, believe that
images cannot express many important
features of abstract human thinking.
• For instance, logical relations like “if…
then”.
LOGIC
• Another option involves using formal logic to
model human thinking.
• There are several systems of logic.
• One system is the propositional calculus, also
known as sentential logic.
• Formulas like “P” or “Q” represent propositions
like “Peter is in school” and “Mary is in school”.
• A proposition is a statement that refers to a fact.
• An expression is any sequence of
sentence letters, connectives, or
parentheses.
• For example, these are all expressions:
P → Q
A)
PQE → (QQF ↔ P)))) ((
• 22 → 5 is not an expression, because 22 and 5 are not sentence letters.
• Not all expressions are well-formed.
– Many expressions on the previous page were
not well-formed.
• A well-formed formula (wff) is an
expression that follows certain rules:
1. A sentence letter by itself is a wff
Example: P
2. If we add the negation symbol ~ to any well-formed expression, the result is also well-formed.
Example: ~P
Note: Since ~P is well-formed, it follows from rule 2 that
~~P is also well-formed.
Note: ~ goes together with only one expression.
~PQ is not well-formed.
3. Given two well-formed expressions, the result of
connecting them by means of &, →, ↔, or V is also a
well-formed formula.
Example: P, Q, and ~Q are all well-formed, so the following
are also well-formed:
P & ~Q
P ↔ Q
P → Q
• Note: &, →, ↔, and V must always go together with
two wffs. Otherwise, the expression is not well-formed.
• Sample expressions that are not well-formed:
P ↔
&Q
• 4. No other expression is well-formed.
• Exercise: Which of these are well-formed?
Why, or why not?
A&B
~P → Q
~ (A & B)
A V →
~A
~
• A → B can be translated as:
– If A, then B
– B, if A
– B is necessary for A
– A is sufficient for B
– B, provided that A
– Whenever A, B
– B, on the condition that A
• A&B
– A and B
– Both A and B
– A, but B
– A, although B
– A, also B
• AVB
– A or B
– Either A or B
• A ↔ B
– A if and only if B
– A is equivalent to B
– A is a necessary and sufficient condition for B
– A just in case B
• How would you write “neither A nor B” in
the propositional calculus?
• The sentence can be written in two ways:
~(A V B)
(~A & ~B)
Exercise--Translate the following into the propositional
calculus:
1. Maggie is smiling but Zoe is not smiling
2. If Zoe does not smile, then Janice will not be happy
3. Maggie’s smiling is necessary to make Janice happy.
4. If Maggie smiles although Janice is not happy, then
Zoe will smile.
• Use the following translation scheme:
A: Maggie is smiling
B: Zoe is smiling
C: Janice is happy
• Maggie is smiling but Zoe is not smiling
A & ~B
• If Zoe does not smile, then Janice will not
be happy
~B → ~C
• Maggie’s smiling is necessary to make
Janice happy
C → A
(The necessary condition appears as the consequent: if Janice is happy,
then Maggie is smiling. Equivalently: ~A → ~C.)
If Maggie smiles although Janice is not
happy, then Zoe will smile.
(A & ~C) → B
• The truth (T) or falsehood (F) of a
proposition is called its TRUTH VALUE.
• The logical systems that we are studying
today only have two possible truth values:
T or F.
• Note: there are several systems of logic
that involve three or more values.
• A Truth Table (TT) gives every possible
combination of truth values between
propositions.
• A Truth Table gives the meaning (the grammar) of logical
sentences.
• If we want to know the meaning of ~A, we just make a
Truth Table.
A | ~A
T | F
F | T

If A is true, then ~A is false.
If A is false, then ~A is true.
A | B | A & B
T | T | T
T | F | F
F | T | F
F | F | F

A & B is only true if both A and B are true.
Otherwise, it is false.
A | B | A V B
T | T | T
T | F | T
F | T | T
F | F | F

A V B is false if both A and B are false.
Otherwise, it is true.
A | B | A → B
T | T | T
T | F | F
F | T | T
F | F | T

A → B is true, except when A is true and B
is false.
A | B | A ↔ B
T | T | T
T | F | F
F | T | F
F | F | T

A ↔ B is only true if A and B are both true
or both false.
• Logic is concerned with truth.
• Its concern is how the truth or falsehood of
one proposition depends on the truth or
falsehood of one or more other propositions.
– For instance, if A and B are both true, then A
→ B must also be true.
• A complex proposition, such as A ↔ B,
is a truth function of simple propositions A
and B.
• Its truth values depend on the truth values
of its components.
– These components are simple propositions.
• We can consider a well-formed formula
like A → B as an expression, and then
use the previous rules to construct a new
truth table.
Example:
• Construct a TT for the expression
(P → Q) V (~Q & R)
First write down all the possible
combinations for P, Q, and R.
Construct first the table for P → Q, then for
~Q, then for ~Q & R.
Now you can do a TT for the whole formula!
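If it helps, this table-building recipe can also be checked mechanically. The following Python sketch (an illustration added here, not part of the original slides) enumerates all eight combinations for P, Q, and R and evaluates the whole formula:

```python
from itertools import product

# Evaluate (P -> Q) V (~Q & R) under every assignment of truth values.
# "implies" encodes the truth table of the arrow given above.
def implies(a, b):
    return (not a) or b

print("P Q R | (P -> Q) V (~Q & R)")
for p, q, r in product([True, False], repeat=3):
    value = implies(p, q) or ((not q) and r)
    print(" ".join("T" if v else "F" for v in (p, q, r)), "|",
          "T" if value else "F")
```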
• Construct a TT for the expressions:
P V (~P V Q)
~ (P & Q) V P
R ↔ ~P V (R & Q)
• A TAUTOLOGY is a formula that is always
true.
• For instance, P → (~P → Q)
To prove this, please construct a truth table,
and you will see that for every value of P
and Q the whole formula comes out true!
• Is P V ~P a tautology?
• What about ((P → Q) → P) → P?
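A brute-force check of this kind is easy to automate. Here is a minimal sketch (my own illustration): a formula is a tautology if it comes out true under every assignment.

```python
from itertools import product

def implies(a, b):
    return (not a) or b

# A formula (given as a Python function of booleans) is a tautology
# if it is true under all 2^n assignments of its n variables.
def is_tautology(formula, num_vars):
    return all(formula(*values)
               for values in product([True, False], repeat=num_vars))

print(is_tautology(lambda p, q: implies(p, implies(not p, q)), 2))          # True
print(is_tautology(lambda p: p or (not p), 1))                              # True
print(is_tautology(lambda p, q: implies(implies(implies(p, q), p), p), 2))  # True (Peirce's law)
```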
• An INCONSISTENT FORMULA is always
false.
Example:
P & ~P
• A wff that is neither tautologous nor
inconsistent is contingent.
• Is this formula tautologous, inconsistent, or
contingent?
(P ↔ Q) → (P V ~R)
• A proposition that is always true (a
tautology) is so general that it says nothing
in particular.
• Tautologies contain no information about
the world.
– Only contingent propositions give information
about the world.
• To repeat:
– All propositions that assert some particular
information about the world are contingent.
• The modern theory of truth tables for
propositional logic was developed by…
…the philosopher Ludwig Wittgenstein in
his book Tractatus Logico-Philosophicus.
• Logicians are mainly interested in
reasoning.
• Logical reasoning begins with some
assumptions or premises.
• The philosopher then applies certain rules
of reasoning to reach conclusions.
• We are now going to study several rules of
valid reasoning.
Two important rules
Modus Ponens:
P → Q
P
Therefore Q
Modus Tollens:
P → Q
~Q
Therefore ~P
Other rules
• From the conjunction (&) of two sentences,
we can always conclude either sentence.
Example:
P&Q
P
Q
• From any expression, we can conclude any
sentence that has it as a disjunct.
Example:
Given the proposition P, we can conclude all of the
following:
PVQ
(R → T) V P
(R & F) V P
Another example:
P → Q
(P → Q) V (~P & F)
(P → Q) V T
• If there is a disjunction (V), and one of its
terms is denied, you can conclude the
other term.
Example:
PVQ
~P
Q
• Given two assumptions, an arrow can be
introduced as shown in the following
examples.
Example:
R
P
PR
Another example:
~P V Q
P
Q
PQ
• The arrows can be eliminated:
Example:
PQ
P
Q
An interpretation of this example:
If it is raining, I will bring an umbrella.
It is raining.
I will bring an umbrella.
• We can introduce double arrows in the
manner of the following example:
PQ
QP
P↔Q
Q↔P
• We can also eliminate the double arrows:
P↔Q
PQ
QP
• If we assume a sentence and its denial, we can conclude
the denial of any assumption that appears before the two
sentences. (This is called the indirect method or reductio
ad absurdum)
Example:
PQ
~Q
P
Q
~P
• Exercise--Prove the following:
P V ~R
~R → S
~P
S
• The rules we have studied are “truth-preserving”.
– If we start from true assumptions and then
apply these rules, the conclusions thus
reached will also be true.
– The rules preserve truth from the premises to
the conclusions.
– Truth is not lost if the rules are followed.
• To reason is to construct proofs.
• A proof is a sequence of lines.
– Each line contains one sentence.
– Each sentence is either an assumption or the
result of applying the rules of reasoning to
some assumptions.
– The last sentence is the conclusion or
theorem that is proved.
• There are many very simple and important
examples of logical reasoning that cannot be
expressed in the propositional calculus:
All human beings are animals;
Bryan is a human being,
therefore Bryan is an animal.
Another example:
All positive integers are divisible by themselves.
2 is an integer,
therefore 2 is divisible by itself.
What is the problem?
Every sentence in this type of argument is different
from every other sentence.
Different sentences are represented by different
sentence letters (for instance, A, B, and C).
We cannot represent the similarities between the
various sentences, because we take the whole
sentence as a unit.
Propositional logic cannot show how it is possible
to conclude the last sentence from the first.
• We need to break down each sentence
into parts, and then…
…show how different sentences share the
same parts.
We need another language
How to express the similarity between:
1. Sam is smiling
2. Janice is smiling ?
We can represent them as follows:
1. Fa
2. Fb
• Now, letters like a and b represent names,
whereas F represents the predicate “is
smiling”.
• A proposition can be understood as a
relation between a subject and a predicate.
The sentence “Hector is Spanish” can be
written as
Fa
where a = Hector and F = is Spanish.
The letter “a” represents “Hector”.
The letter is a name for the subject.
F represents the predicate.
• It is clear that the following two sentences,
although different, have the same
structure:
Hector is Spanish
Picasso is Spanish
One sentence can be represented as Fa
(Hector is Spanish) and the other as Fb
(Picasso is Spanish).
• Many people are Spanish, not just Hector
and Picasso.
• We can make the expression more
general by replacing the name “a” with a
variable “x”.
Fx
• A propositional function is formed by
replacing a name with a variable.
• A function is more general than a
proposition.
• We use the word “function” to indicate that
the language of logic is (or is very close to)
the language of mathematics.
• A function always has an empty place.
For instance:
“are animals”, “is Spanish”, and “is smiling”
are incomplete expressions.
– The subject is missing.
– Only when it is complete can the sentence
say something that is either true or false.
Problem:
• if we take “x” to represent the general term
“people”, then Fx represents “people are
Spanish”.
• But not all people are Spanish!
• So we need to find a way to express that
some people are Spanish.
• In other cases, however, it is appropriate
to say “all”.
For instance:
All people are rational animals.
All people will die some day.
• ∀xFx indicates that the predicate F
describes “all x”.
• ∃xFx indicates that the predicate F
describes “some x”.
• To indicate “all” or “some” is to quantify an
expression.
• This new logical language is focused on
“propositional functions” like “x is Spanish”
or “x is smiling”.
• These functions are always quantified.
• The logical system we are discussing is
called the Predicate calculus.
THE PREDICATE CALCULUS
Elements
1. Names:
a, b, c, d, a1, b1, c1, d1, a2, …
Names represent individuals, like “Tim” or “this
chair”
2. Variables
u, v, w, x, y, z, u1, v1 , w1 , …
3. Predicate letters:
A, B, C, …, Z, A1, …, Z1, A2, …
4. The identity symbol =
5. Quantifiers:
a) The universal quantifier ∀x
The universal quantifier corresponds to “every”
or “all”.
b) The existential quantifier ∃x
The existential quantifier corresponds to “some”,
which means “at least one”.
A quantifier must always be followed by a variable
(never a name).
6. All the elements of the propositional calculus:
sentence letters, connectives, and parentheses.
Note:
The Predicate Calculus is an extension of the
propositional calculus.
It includes the same elements plus several new
ones (names, variables, predicate letters, and
the identity symbol).
For convenience, we can also introduce the
symbol ≠ as an abbreviation.
a ≠ b really means ~(a = b).
The symbol ≠ is not really part of the basic
vocabulary of the Predicate Calculus.
Rules of well-formed formulas in
the Predicate Calculus
1. All the rules of the propositional calculus
also apply to the Predicate Calculus.
2. A predicate letter followed by one or more
names is well-formed
Examples:
• Fa
• Fab
3. Expressions of the form a = b (identity of names)
are wff.
Strictly speaking, identity is a kind of predicate.
The proper way of writing this should be
=ab
For historical reasons, however, it is written a=b.
• 4. Any name in a wff can be replaced by a
variable; the result is also well-formed if it
is quantified.
• For instance, if Fa is well-formed, then we
can:
– Replace “a” by “x” and form Fx.
– Quantify the new sentence as ∀xFx.
• When we use variables, we must quantify:
Fx is not well-formed.
– It is an example of an open formula.
– (An open formula is made by replacing a
name in a wff with a new variable without
quantifying the variable.)
• Fx and Fx are not well-formed.
– x Fx and x Fx, however, are well-formed.
• ∀aFa and ∃bFb are not well-formed.
– You cannot quantify a name, only a variable.
5. Nothing else is well-formed.
Are the following wff?
1. Fz
2. ∃Fb
3. ∀xGab
4. ∃xGax
5. ∀x (Gxy ↔ ∃yHy)
(Hint: There is only one wff here!)
Translate the following sentences:
1. All Spanish men are clever
2. Some Chinese people live in Hong Kong
3. Not all Chinese people live in Hong Kong
4. Only Spanish people live in Madrid
5. No people study in City University unless
they are stupid.
1. All Spanish men are clever
∀x(Sx → Cx)
2. Some Chinese people live in Hong Kong
∃x(Cx & Hx)
3. Not all Chinese people live in Hong Kong
~∀x(Cx → Hx)
Alternative form: ∃x(Cx & ~Hx)
4. Only Spanish people live in Madrid
(M = lives in Madrid; S = is Spanish)
∀x(Mx → Sx)
The formula ∀x(Sx → Mx) is an incorrect
translation!
Why is it incorrect?
5. No people study in City University unless
they are stupid.
(P = is a person; U = studies in CityU; S = is stupid)
∀x(Px → (~Ux V Sx))
Another form: ∀x((Px & Ux) → Sx)
• How to indicate that there are exactly n
things?
– For instance, exactly one, or exactly two, etc.
• xy x=y
Exactly one
• xy (xy & z (z=x V z=y)
Exactly two
• xyz (((xy & xz) & yz
& w ((w=x V w=y) V (w=z)) Exactly three
• Quantities are always expressed by using:
The identity symbol = ,
together with
Quantifiers.
• In addition to the rules of propositional
logic, several new rules are helpful.
• To explain them, we need to explain three
concepts:
– Universalization
– Existentialization
– Instantiation
• We have already introduced the important
procedure of universalization.
For instance:
Take the formula Fa & Ga.
– ∀x (Fx & Gx) is a universalization of that
formula.
• Another procedure is existentialization.
For instance:
Take the formula Fa & Ga.
– ∃x (Fx & Gx) is an existentialization of that
formula.
The reverse of existentialization and
universalization is instantiation.
The formula Fa & Ga is an instance of the
formula ∀x (Fx & Gx) and of the formula
∃x (Fx & Gx).
• We can now formulate several rules of
reasoning.
• For any universally quantified sentence, we can
conclude any instance of that sentence.
Example:
∀x Fx
Fa
Fb
• Given a sentence with one name, we can
conclude an existentialization of that
sentence:
Example:
Fa
∃xFx
• The following argument is not valid:
Fa
∀x Fx
• You cannot conclude that, just because
something is F, therefore everything is F!
Here is a simple chain of reasoning:
∀x(Fx → Gx)
Fa → Ga
Fa
Ga
Fa & Ga
∃x (Fx & Gx)
Exercise:
Prove:
∃x (Gx & ~Fx)
∀x (Gx → Hx)
∃x (Hx & ~Fx)

∃x (Gx & ~Fx)
∀x (Gx → Hx)
Ga → Ha
Ga & ~Fa
Ga
Ha
~Fa
Ha & ~Fa
∃x (Hx & ~Fx)
Quantifier exchange
Example:
∀x ~Fx can be turned into:
~∃x Fx
Example:
~∀x Fx can be turned into:
∃x ~Fx
Leibniz’s Law (Substitutivity of
Identity)
• If a=b, then a and b can be interchanged in any
sentence.
Example:
Fa
a=b
Fb
Another example of Leibniz’s Law:
Fa & Ga
a=b
Fb & Ga
Fb & Gb
Fa & Gb
• That identical terms can be substituted for
one another is an extremely important law
of logic.
• Some philosophers, such as Frege,
believe that it is one of the most important.
• We can use this law to prove that if a=b
and b≠c, then a≠c.
• Please remember that we have simplified
the laws of reasoning.
Take this as an introductory overview…
• Formal logic can be used to construct
plans.
• A plan is a logical deduction from some
initial state to a goal state.
• Planning has been very important in
the theory of Artificial Intelligence.
CRITICAL QUESTIONS ABOUT
LOGIC AND COGNITION
• Formal logic was developed to provide an ideal
standard of excellent thinking.
• It was not developed to describe how human
beings actually think in their everyday lives.
• Many scholars believe that ordinary people do
not typically use formal logic in their plans.
• On this view, logic has nothing to do with
psychology or actual cognition.
• Psychologist Peter Wason developed a
method of testing whether people do tend
to apply logical rules in their everyday
thinking.
Four cards appear on the page (shown as an image in the original slide): two with words face up and two with numbers face up, one of which is a 7.
A rule states:
If one side has a word, then the other has an even number
(In short: if word, then even)
• Your task is to decide which cards must be turned to see
if the rule is true of the four cards.
• Turn only the minimum number.
• Most people turn the two cards with words on them to
check the other side.
• This can be seen as an application of the modus
ponens rule.
• Many people fail to turn over the 7.
• They often say they have not done so because:
The rule doesn't say anything about odd numbers.
But if the 7 card has a word on the other side, then
the rule would be refuted.
So it is necessary to turn over the 7.
To understand this, people need to know the
modus tollens.
• The point of the experiment is that logical
rules do not quite describe how people
actually reason.
– People normally use representations and
procedures very different from those of formal
logic.
The problem of consistency.
Most logicians insist that systems should be
consistent.
This requirement is too strong.
Human thinking is often inconsistent.
• Classical logical laws are supposed to apply with
unrestricted generality.
• Many people, however, treat rules as rough
generalizations that admit of exceptions.
– A rule is often treated as a default.
• The rule that if x is an SCM student, then x is
stupid is roughly true, but there are exceptions.
• Traditional logical thinking is mainly sequential.
• It does not account for other forms of thinking, such as lateral
thinking.
• "Lateral Thinking aims to change concepts and perceptions“.
• Lateral thinking involves searching for different ways of looking at
things.
• It involves looking at a situation from a new angle or POV.
• “With logic you start out with certain ingredients just as in playing
chess you start out with given pieces. In most real life situations, we
assume certain perceptions and certain concepts. Lateral thinking is
concerned not with playing with the existing pieces but with
changing those very pieces”
• A lateral thinking puzzle:
A father and his son are involved in a car
accident, as a result of which the son (but
not the father) is rushed to hospital for
emergency surgery.
The surgeon looks at him and says "I can't
operate on him, he's my son".
• Computer scientist Marvin Minsky insisted
that the influence of logic (particularly the
issue of consistency) has been harmful to
AI research.
• The weaknesses of any logical approach
to AI can be considered in relation to the
Frame Problem
Suppose that we define a world.
A state of this world will include:
a. Objects (e.g., cubes A, B, and C)
b. Relations between objects (e.g., location)
c. Properties of objects (color, shape, etc.)
• An event in this world is a relation on
states (a function that maps one state to
another state).
• Examples:
– Put one object on another.
– Move one object from one location to another.
• Suppose block B stands on block A.
• One robot must move block A to a new
location.
• We have only moved block A. But of
course, this implies that block B has also
been moved.
• In a complex world, one single change will
produce many other changes.
• Knowledge of the world is interdependent.
• In normal cases, people do not consider
everything that does not change.
– For instance, when an object is moved, its color and
shape will normally not change. If it is blue, it will
remain blue; etc.
– These are irrelevant factors.
• It is obviously out of the question to write a long
list of logical rules specifying all that does not
change.
• In most everyday situations, there is a
potentially unlimited number of relevant
features.
• How can we avoid specifying everything
that does not change?
This is known as the frame problem.
• The frame problem arises especially when a
goal requires a sequence of events or
actions.
The following problems can occur:
• To miss one of the possible consequences
of some event.
• To waste resources examining facts that
are completely irrelevant.
• Many cognitive scientists believe that all
assumptions involved in thinking can be
represented EXPLICITLY by means of symbols
(for instance, mathematical or logical symbols)
• But the frame problem shows that many
assumptions involved in thinking cannot be
explicitly represented.
– They must remain implicit.
• This problem is not just about robots.
• The frame problem raises important
philosophical questions about the nature
of mind.
• Disappointed with the logic-based approach, AI
researchers like Marvin Minsky have looked for
alternative ways to represent knowledge.
• Some alternatives include:
– Frames
– Scripts
– Rules
FRAMES
• When we come across some situation, we select
from memory a structure, called a frame.
• A frame is "a data-structure for representing a
stereotyped situation” or category.
• A frame is a mental representation of typical
things and situations related to some concept.
• For instance, the FLYING ON A PLANE frame includes
the following categories:
• FLIGHT ATTENDANT, LIFE VEST, SAFETY BELT,
FIRST CLASS, ECONOMY CLASS, SAFETY
INSTRUCTIONS, etc.
• There are relations between these categories (X has a Y,
X is on Y, X is a part of Y, X is a kind of Y, etc.)
• Both the categories and their relations are part of the
frame.
• The frame FLYING ON A PLANE must
also include “subframes” like:
– GOING TO THE TOILET,
– WATCHING A MOVIE,
– FINDING ONE’S SEAT,
– STORING ONE’S HAND LUGGAGE
OVERHEAD, etc.
• A frame models our expectations and
assumptions.
– If we see a certain kind of door, we expect to
find a room behind it.
– And we have some assumptions about what a
normal room must look like (it may have
windows and shelves, etc.)
• One important difference between the
frame-based approach and the logic-based approach:
– The frame approach was designed mainly to
study psychology and AI.
– Logic was not designed for psychological
research.
Example: The concept of a “Western”.
• A kind of fiction
• Normally takes place in the US
• Set around or after the US Civil War
• Characters: sheriff, cavalry officer, farmer, cattle
driver, bounty hunter.
• Locations: small town, fort, saloon, stagecoach
• Objects: guns, horses, roulettes, etc.
• Typical situations: gunfights, burning a farm or a
fort, stampede, driving cattle, etc.
• Film examples: Unforgiven, Rio Bravo, Fort
Apache, My Darling Clementine.
• A frame is a package of information that can be
applied as a whole.
• According to this sort of theory:
Cognition does not consist of step-by-step
logical deductions.
It is the application of a whole frame to a
particular situation.
• Information is often structured into “parts” and “wholes”.
• The act of looking at a cube, for instance, involves a
part-whole structure (shown as a diagram in the original slide):
• A frame can be represented as a graph of
nodes and relations.
• The top levels tend to be more “fixed” and
stable.
• The lower levels consist of “terminals” or
slots that can be filled by data.
• When we move around a cube, one or
more faces may go out of view, the whole
shape of the cube may change, etc.
• Thus we have a series of view-frames.
• Some relations are more stable than
others.
• For instance, “next to” is more stable than
“top” and “bottom”.
• Do we build such frames for every single
object?
– This sounds too complicated.
– People probably just build frames for
important objects and for a few simple or
basic shapes (cube, sphere, cone, etc.)
Default assignments
Minsky hypothesized that frames are stored in
long-term memory with default terminal values.
For instance, if I say “Peter is on the chair” you
probably do not think of an abstract chair. You
perhaps imagine a particular chair with a shape,
color, etc.
These characteristics are default assignments to
the terminals of frame systems.
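To make the idea concrete, a frame can be sketched as a structure whose terminals carry default values that observed data may override. The Python below is a minimal illustration (the frame name and slot values are invented for the example, not Minsky's):

```python
# A frame as a dict of terminals with default assignments.
CHAIR_FRAME = {
    "is_a": "furniture",
    "legs": 4,         # default assignment
    "color": "brown",  # default assignment
    "has_back": True,
}

def instantiate(frame, **observed):
    """Fill the frame's terminals; observed data override the defaults."""
    instance = dict(frame)
    instance.update(observed)
    return instance

# "Peter is on the chair": with no further data we imagine the defaults...
print(instantiate(CHAIR_FRAME))
# ...but defaults can be changed to fit reality:
print(instantiate(CHAIR_FRAME, color="red", legs=3))
```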
• If needed, default assignments can be
changed to fit reality.
– A frame can adapt, if the data do not match
our expectations or assumptions.
– Surrealist artists force us to modify many of
our default expectations.
– We open a door and there is not a room but a
landscape!
• What happens when the data do not
match the frame?
1. We can replace our original frame choice
with another frame.
2. We can find an excuse or an explanation.
“It is an experimental movie”.
“It is broken or poorly designed”.
“It is not finished”.
“It is not a real door but a toy”.
3. When trying to replace a frame, we can use
advice from a similarity network.
This network represents similarities and
differences between related concepts.
A box, unlike a table, has no room for
knees; the box is similar to a chair because one
can sit on it, etc.
If something is not a chair, perhaps it is a
box!
In this way, similarity networks can help us
to replace our original frame with a more
appropriate frame.
Frames are thus open to change.
Frame modification or replacement
resembles the scientific process:
1. Producing a hypothesis (a frame)
2. Testing it (matching the frame against the
data)
3. Modifying or replacing it.
• A key idea behind the theory of frames:
The basic ingredients of intelligence are typically
structured into chunks of some sort.
These structures are open to revision.
The use of changeable structures accounts for the
power and efficiency of human thinking.
• In film, we can think of the following as examples
of frames:
– Popular genres (martial arts, action, sci-fi, ghost,
Western, detective, romance).
– Narrative (story) structures: the classical Hollywood
structure.
– Documentary or Fiction
– Individual “authors”: such as the style of a famous
director.
Films can encourage us:
• To search for the right frame, when it is not clear
which frame is relevant
• To revise the frame in the course of a film by
maintaining an inconsistent or changing frame.
• Some scholars prefer the word “schemata”.
A schema is roughly the same as a frame:
“A schema is a knowledge structure characteristic of a
concept or category. “ (David Bordwell)
• Schemata are embodied in prototypes, or "best
examples."
– Our prototype of buying and selling probably involves one
person purchasing something from another with cash, check, or
credit card.
• This prototype is a sort of default assignment.
• On the basis of the schema, people can apply the same
essential structure to a variety of differing situations.
• This approach has often been applied to the study of art.
• Viewers (readers, etc.) must actively search for
frames or schemata that match the data of the art
work.
• The art work can thus engage the active participation of
viewers.
• When an artist is aware of cognition, s/he will invite
users to participate by using ambiguous or changing
frames.
• Artists can also presuppose unusual frames.
• Our experience will become dynamic and rich.
SCRIPTS
• In addition to frames, an alternative way to
model conceptual thinking involves scripts.
• A SCRIPT is a knowledge structure
specifically designed for typical event
sequences.
• A famous example is the [RESTAURANT]
script developed by the computer scientist
Roger Schank and the social psychologist
Robert Abelson in 1977.
• THE RESTAURANT SCRIPT
1) Actor goes to a restaurant.
2) Actor is seated.
3) Actor orders a meal from waiter.
4) Waiter brings the meal to actor.
5) Actor eats the meal.
6) Actor gives money to the restaurant.
7) Actor leaves the restaurant.
• The underlined words (actor, restaurant, meal, waiter, money) represent variables.
• A script can be understood as a sequence
of conceptualizations, with some variables
in them (called script variables).
• The script often involves branching:
– If you arrive in a restaurant, it may be that
either there is a menu already on the table or
that the waiter will bring you one.
– Eventually, you will go on to order your meal,
so the two branches will join again.
• A script enables you to “fill in” unstated or
missing information.
– Think of a movie where you see a character
walking out of a house and (in the next scene)
into a restaurant.
– You infer that the person walked or took some
form of transport between the two scenes.
– Later, you can assume that the person paid
for the dinner, even if this is not shown in the
movie.
• A script may also include:
– “props” (table, menu, food, silverware, plates,
money, bill)
– roles (waiter, customer)
– perhaps some entry conditions (the customer
is hungry, the customer has money, the
customer likes the restaurant)
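As a rough illustration (not Schank and Abelson's own notation), the restaurant script can be sketched as an ordered list of event templates whose script variables are filled in for a particular episode:

```python
# The restaurant script as a sequence of conceptualizations with
# script variables ({actor}, {meal}, ...) to be bound per episode.
RESTAURANT_SCRIPT = [
    "{actor} goes to {restaurant}",
    "{actor} is seated",
    "{actor} orders {meal} from {waiter}",
    "{waiter} brings {meal} to {actor}",
    "{actor} eats {meal}",
    "{actor} gives money to {restaurant}",
    "{actor} leaves {restaurant}",
]

def instantiate_script(script, **bindings):
    return [step.format(**bindings) for step in script]

for step in instantiate_script(RESTAURANT_SCRIPT, actor="Mary",
                               restaurant="a diner", meal="soup",
                               waiter="the waiter"):
    print(step)
```

Because the whole sequence is stored, steps an episode never shows (paying, travelling between scenes) can still be filled in by inference.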
• Like frames, scripts are also open to
revision.
• Parts of a script can be adapted to
changing circumstances.
• New parts can be added to existing
scripts.
RULES
• Another model of reasoning involves the
use of “rules”.
• A rule is an IF-THEN structure.
– The “if” part is called the condition.
– The “then” part is called the action.
• A rule may be taken as a default.
• A default is a rough generalization that may be
changed when exceptions appear.
• In this sense, rules are not fixed once and for all.
– For instance: IF x is a bird, THEN x can fly.
– A penguin is an exception…
• These “rules” are different from classical logical
rules, which do not admit of exceptions.
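A minimal sketch of such a default rule, assuming a simple policy where a more specific exception overrides the general IF-THEN default (the predicate and species names are invented):

```python
# Default: IF x is a bird, THEN x can fly -- unless an exception applies.
EXCEPTIONS = {"penguin", "ostrich"}   # known exceptions to the default

def can_fly(x):
    if x.get("species") in EXCEPTIONS:  # the exception overrides the default
        return False
    return bool(x.get("is_bird"))       # otherwise apply the default rule

print(can_fly({"is_bird": True, "species": "sparrow"}))   # True
print(can_fly({"is_bird": True, "species": "penguin"}))   # False
```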
• Rules can represent many sorts of knowledge
about (for example):
• Concepts and their relations.
– “If x has four legs, wags its tail, and barks, then x is a
dog.”
– IF x is a dog, THEN x is a mammal.
• Causes and effects in the world
– If x is kicked, then x will move.
• Goals or tasks
– IF you want to obtain a better job, THEN you should
get a degree from a good university.
• Regulations
– IF you do not attend the exam, THEN you will fail the
subject.
• Rules can be arranged hierarchically in
tree form:
– There are rules having other rules under them
– For instance, the rule for recognizing dogs
includes a rule for recognizing tails.
• The concept of “rule” was not developed
as an ideal logical tool.
– It was different from the predicate and
propositional calculus.
• The rule concept was developed to model
real human thinking.
• In particular, rule systems are used to
model problem-solving.
• Allen Newell, Cliff Shaw, and Herbert Simon
(NSS) developed the field of Artificial Intelligence
in 1955.
• They understood intelligence as the capacity
to solve problems.
• A problem consists of a gap between some starting
state and some goal state.
– Examples: finding a way to exit the house and start the
car, finding something to say in response to a certain
statement, choosing the right school, finding the
fastest path to reach a destination, etc.
• To solve a problem means…
….to find a sequence of rules that gives a
path from the starting state to the goal
state.
These rules are a plan or strategy for action.
Work in classical AI often described
intelligence in terms of planning.
• Rules can be used to think forward or backward.
– We can either work forward from the starting point or
backward from the goal.
• Example of backward thinking:
– If I want to reach school, I might reason thus: “To get
to school, I must take the highway. To reach the
highway, I must take the main street. To reach the
main street…”
– In forward thinking, however, I would start out from
the house and consider all possible choices.
• Another strategy is bidirectional search.
– This approach combines backward and forward
thinking.
• When you have to solve a problem, you must
often SEARCH through a space of possibilities.
• For instance:
– If you are trying to find your way out of a maze, you
must search through the space of possible paths.
– If you are looking for a good design for a car, you
must search through the space of possible designs.
– You might want to search through all possible
combinations of colors in order to pick the right
combination of clothing.
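Searching a space of possibilities can be pictured with a standard breadth-first search; the maze below is a made-up toy example, not from the slides:

```python
from collections import deque

def bfs(start, goal, neighbors):
    """Search the space of states, returning one path from start to goal."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for nxt in neighbors(path[-1]):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None  # the whole space was searched without success

# A tiny maze as a graph of rooms.
MAZE = {"entrance": ["hall"], "hall": ["dead end", "corridor"],
        "corridor": ["exit"], "dead end": [], "exit": []}
print(bfs("entrance", "exit", lambda room: MAZE[room]))
# ['entrance', 'hall', 'corridor', 'exit']
```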
• In a logical system, the main operation is
“deduction” (proving formulas from the axioms
plus the rules of proof).
• In a frame-based theory, the main operation is
the application of a frame to a given situation.
• In a rule-based system, however, the main
operation is SEARCHING THROUGH A
SPACE OF POSSIBILITIES.
• In many cases, however, the space of possible
solutions is too large.
– It is not practically possible to search through the
entire space.
• To try every possible solution would be highly
inefficient.
• Consider how actual people conduct efficient
searches.
– For instance, a doctor diagnosing an illness does not
need to go through every possible cause.
– Her knowledge leads her to narrow down the search
space considerably.
– We say that she is an expert in this area.
• Knowledge relevant to a problem can reduce
search time. This is the essence of expertise.
• For instance, if I already know a city, I can use
my knowledge of landmarks, streets, and
neighborhoods to find a particular location.
• Knowledge-search duality:
1. “Search compensates for the lack of
knowledge”
2. “Knowledge reduces uncertainty by reducing
search.”
• Expertise combines two kinds of knowledge:
• Knowledge of facts
• Practical heuristics acquired through years of
experience and trial and error.
– These include the tricks, shortcuts, and rules-of-thumb
that experts have learnt through years of experience.
– Heuristics are rules that can help you find a good
enough solution without having to search through the
complete space of possibilities.
• An Expert System contains large
amounts of very specific, specialized
knowledge (i.e., knowledge of a narrow
area) about complex problems.
• “Expert systems” (ES) and “knowledge-based
systems” (KBS) are often used synonymously.
• Building an expert system is known as
knowledge engineering.
• Knowledge engineering begins by
interviewing experts in various fields
(science, medicine, business) to find out
how they think.
• Expertise is then reduced to a set of
interconnected general rules, the
knowledge base (normally if-then rules,
but possibly also frames).
Sample rules
•
IF engine_getting_petrol
AND engine_turns_over
THEN problem_with_spark_plugs
•
IF NOT engine_turns_over
AND NOT lights_come_on
THEN problem_with_battery
•
IF NOT engine_turns_over
AND lights_come_on
THEN problem_with_starter
•
IF petrol_in_fuel_tank
THEN engine_getting_petrol
• A reasoning or inference engine then matches
new evidence against this knowledge base.
A complete interaction with the car repair system might be:
System: Is it true that there's petrol in the fuel tank?
User: Yes.
System: Is it true that the engine turns over?
User: No.
System Is it true that the lights come on?
User: No.
System: I conclude that there is a problem with battery.
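A toy forward-chaining engine for the car-repair rules above might look like this in Python (a sketch of the general mechanism; negative answers are encoded here as explicit "NOT ..." facts):

```python
# Each rule: (set of required facts, concluded fact).
RULES = [
    ({"petrol_in_fuel_tank"}, "engine_getting_petrol"),
    ({"engine_getting_petrol", "engine_turns_over"}, "problem_with_spark_plugs"),
    ({"NOT engine_turns_over", "NOT lights_come_on"}, "problem_with_battery"),
    ({"NOT engine_turns_over", "lights_come_on"}, "problem_with_starter"),
]

def forward_chain(facts):
    facts = set(facts)
    changed = True
    while changed:              # keep firing rules until nothing new is added
        changed = False
        for conditions, conclusion in RULES:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

# The interaction above: petrol yes, engine turns over no, lights no.
evidence = {"petrol_in_fuel_tank", "NOT engine_turns_over", "NOT lights_come_on"}
print(forward_chain(evidence))  # includes 'problem_with_battery'
```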
• http://www.expertise2go.com/webesie/tutorials/ESIntro/
• http://easydiagnosis.com/
• Expert systems can be used in game design.
• Baseball simulation games, for instance, were often based on expert
knowledge from baseball managers.
• When a human played the game against the computer, the computer asked
the Expert System for a decision on what strategy to follow.
• Even those choices where some randomness was part of the natural
system (such as when to throw a surprise pitch-out to try to trick a runner
trying to steal a base) were decided based on probabilities supplied by
expert sportsmen.
• Tony La Russa Baseball
http://www.mobygames.com/game/dos/tony-la-russa-baseball-ii/coverart/gameCoverId,32721/
• Earl Weaver Baseball
http://www.sportplanet.com/features/articles/ewb/
• The medical expert system Mycin represented its
knowledge as a set of IF-THEN rules with certainty
factors.
• It was coded in Lisp.
The following is an English version of one of Mycin's rules:
• IF the infection is primary-bacteremia
AND the site of the culture is one of the sterile sites
AND the suspected portal of entry is the gastrointestinal
tract
THEN there is suggestive evidence (0.7) that infection is
bacteroid.
• The use of expert systems in some professions, such as
medicine, can give rise to legal issues.
• The system Mycin, developed for medical diagnosis in
the 1970s, was never actually used in practice.
• This wasn't because of any weakness in its
performance: in tests it outperformed members of the
Stanford medical school.
• It had to do with legal issues:
– If it gives the wrong diagnosis, who do you sue?
• The following art work can be understood
as a parody of questionnaires.
• http://www.diacenter.org/closky/
PROBLEMS WITH EXPERT
SYSTEMS
The core problem
Cognitive science and AI researchers often believe that:
a. people carry a list of background assumptions inside
their minds
(This list may involve rules, frames, and/or scripts)
b. This list of background assumptions could be
completed in theory, so that all assumptions could be
enumerated completely, and then coded in some form
in a computer.
Do these two beliefs make sense?
• Most human experts do not know explicitly all
the things that they know implicitly or non-consciously.
– It is like tying up our shoes, something that we do
without thinking of the complicated steps involved.
– The knowledge is not the object of conscious
attention.
• People who have strong expert knowledge may
not be able to explain what that knowledge
involves.
• If we only ask them, we may never obtain
any information on “intuitive” or “nonconscious” knowledge.
• Experts often use common sense and
intuitive knowledge, which are not easily
programmable.
• There is a difference between knowing-how
and knowing-that.
1. Knowing-how (procedural knowledge) involves
task-oriented skills.
2. Knowing-that is the knowledge that a statement is
true or false.
• A lot of our knowledge is “knowing-how”.
• Is it possible to put all of this procedural
knowledge into words?
• The problem: how to encompass all of
intuitive knowledge in a finite system of
rules?
• In practical terms, classical AI had serious
limitations.
• Classical AI systems can give incorrect
responses to questions that lie just slightly
outside the narrow areas for which they were
programmed.
• These practical problems point to a deeper
problem with the underlying philosophical assumptions.
• Most of our actions are social (they involve or
presuppose more than one person).
• Social action relies on background knowledge:
– It is taken for granted by all participants in an
interaction.
– It is shared by all participants.
– It is seldom put explicitly into words.
• Any system of rules in the real world is
incomplete.
• Knowledge may not involve any data structure at
all:
– For instance, I can learn to ride a bicycle simply by
practicing certain patterns of body responses
acquired through trial and error, imitation, practice,
formal or informal training, etc.
• How do you know the difference between the
sound of a violin and the sound of a cello? Is it a
matter of learning a system of rules?
Some critics complain that classical AI
overemphasizes rules and planning.
What is planning?
• A plan controls the order in which a sequence
of actions is to be carried out.
• From this point of view, the execution or
implementation itself is unimportant: every move
is a stage in the implementation of the plan.
• Action is controlled by a plan.
• Is this an accurate model of human thinking?
• The world is too dynamic and unpredictable; for this
reason, it cannot be completely and reliably represented
inside the machine.
 Our plans are essentially vague.
 Plans do not represent the circumstances of actual actions in full
detail: they could not possibly do so!
“No amount of anticipation, planning, and programming can
ever enumerate, a priori, all the variants of even a
routine situation that may occur in daily life.”
George N. Reeke and Gerald Edelman
“In the real world any system of rules has to be
incomplete.”
Dreyfus
 If we take “plans” to determine every aspect of action,
then our actions are never planned in this strong
sense.
• A plan does not specify everything that a person must
do next.
• Even a detailed plan requires some improvisation on
the spot.
 Improvisation is by definition not bounded by plans.
 People respond to their situations: the material and
social circumstances of their actions.
 At most, a plan is only a rough guide.
• This does not mean that people never plan or use rules.
• People do make plans and use rules, quite often…
…but plans do not control every aspect of an action.
– Let us consider this point in detail:
• How do we use planning?
• Plans are “imagined projections” that we use to prepare
for an action before it happens.
• They are also “retrospective reconstructions” that we use
to explain an action that has happened or to review its
outcomes.
• Plans come before or after action.
• Plans are resources for our practical thinking about
action.
• Sometimes we use plans while conducting
an action…
…But only because the action has somehow
run into problems.
• This is not the “usual” or “normal”
situation.
• The system of our assumptions is
essentially vague:
• There is no finite list of rules or
assumptions that could be completely
coded into a computer.
• The problem is that classical AI and cognitive science
emphasize knowledge representation.
• They assume that knowledge of the world is
represented inside the mind, in the form of symbols
that code rules, frames, scripts, etc.
• Cognition does not only happen “in the head”.
• It is a complicated, interactive achievement that involves:
a. The body and skills of the person.
b. Interaction with the environment (including other persons).
• Classical AI studied individual minds.
• It ignored the extent to which intelligence
involves the environment, including other
people or agents.
• For instance, the meanings of words or
sentences depend on the context in which
they are used.
• Language is essentially 'indexical'.
– It can only be understood in relation to its
surrounding context.
Situated action
• Lucy Suchman used the term “situated
action” to emphasize “that every course of
action depends in essential ways upon its
material and social circumstances.”
• People’s behavior is often regular.
• It is not random.
• But this does not mean that people carry a
list of complete and precise rules in their
minds.
• Rules or frames are loose and ambiguous.
• People rely on context to guide their
actions.
• This approach was developed by
researchers in a field called
“ethnomethodology”.
• Ethnomethodologists insist that knowledge
and understanding are context-dependent
(situated).
• Ethnomethodology studies our taken-for-granted
knowledge.
• One research method is to 'breach' or 'break' the
everyday routine of interaction:
– Pretending to be a stranger in one's own home;
– Blatantly cheating at board games;
– Attempting to bargain for goods on sale in stores.
• This makes explicit the shared regularities that
sustain the normal flow of everyday life.
NOUVELLE AI
AI without knowledge
representation
Key principles of new AI
• Cognition-in-practice (situated action)
• Ecological niche
• Distributed cognition
• Emergence
• Adaptive systems
• Cheap design
• Autopoiesis (self-production)
• Cognition cannot be separated from practical action in
some environment (ecological niche).
• Cognition is not done only in the head: it is distributed
over mind, body, and environment.
• “Cognition observed in everyday practice is
distributed—stretched over, not divided among—mind,
body, activity and culturally organized settings (which
include other actors).” (Jean Lave)
• “…knowledge-in-practice, constituted in the settings of
(social) practice, is the locus of the most powerful
knowledgeability of people in the lived-in world.”
• Cognition-in-action
• http://www.ace.uci.edu/penny/works/petitmal/petitcode.html
Emergence
• Emergence occurs whenever a collection
of simple, interacting subunits gives rise to
new characteristics (known as emergent
properties) that are not found in the
individual subunits.
• An emergent property is normally the
product of collective activity without a
central planner or controller.
Think of an ant colony.
• Its individual members follow very simple
rules.
– “Take an object”, “Follow a trail” (etc.)
• The whole colony generates complex
patterns on the basis of these simple
rules.
• The colony only exists in virtue of the
behavior of its individual members.
• The colony as a whole, however, is more
powerful than its members:
– It can do things that individual members
cannot do (respond to food, enemies, etc.)
– It lives longer than its members.
– It modifies its environment in ways that
individual members cannot.
• Suppose you want to design robots that collect wood into
a pile
• You can either program the robot to achieve that
purpose directly (top-down approach),
• or you can instead program very simple rules that will
yield the desired behavior indirectly (bottom-up
approach).
• Mitchel Resnick developed robots that followed simple
rules (a toy simulation follows below):
– “Walk around randomly until bumping into a wood
chip.”
– “If you are not carrying anything, then pick up the
chip.”
– “If you are carrying a chip and bump into another, then
put it down.”
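A toy one-dimensional version of this simulation, under the three rules above (my own sketch; Resnick's original robots lived on a two-dimensional grid), shows the chips drifting into clumps with no central plan:

```python
import random

random.seed(1)
world = [random.random() < 0.3 for _ in range(60)]  # True = wood chip here
carrying = False
for _ in range(20000):
    pos = random.randrange(len(world))              # walk around randomly
    if world[pos] and not carrying:                 # bump into a chip, hands empty
        world[pos], carrying = False, True          # -> pick up the chip
    elif world[pos] and carrying:                   # carrying and bump into another
        for q in (pos - 1, pos + 1):                # -> put it down next to it
            if 0 <= q < len(world) and not world[q]:
                world[q], carrying = True, False
                break
print("".join("#" if chip else "." for chip in world))
```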
Self-organization
• This refers to the spontaneous rise and maintenance of
order or complexity out of a state that is less ordered.
• It is not imposed by an external designer or central
planner.
• Order arises out of the internal organization of the
system.
• Resnick’s robots (previous slide) perform a function
which has not been directly programmed in them.
• The flocking behaviors of birds and other animals
demonstrate the emergent self-organization of a
group acting as a single creature.
• Flocks develop without a leader.
• Intelligence here becomes collective rather than
individual (“swarm intelligence”).
(Image: Birds Flocking, Sylvia Nickerson, collage, 2004.)
SUBSUMPTION ARCHITECTURE
(developed by Rodney Brooks, MIT)
Example of: distributed cognition, cognition-in-practice,
importance of the ecological niche, emergence and self-organization.
• Brooks faced the usual problem of classical AI:
How to design a cognitive structure (“mind”) that
would control the behavior of the robot.
Brooks decided to get rid of symbolic cognition.
No symbolic representation of the world inside the
robot.
Nothing but sense and action.
“Seeing, walking, navigating, and aesthetically
judging do not usually take explicit thought, or
chains of thought… They just happen.” (Brooks)
In insects and other lower animals, sensation and
actuation are closely linked, without the
intermediary of some internal symbolic
representation of the world (by rules, etc.)
Basic skills are based mainly on the unthinking
coordination of perception and action.
• The key principle is the direct linkage of
perception and action.
– INTELLIGENCE WITHOUT
REPRESENTATION
• Close coupling between sensors and actuators
ensures a short reaction time.
• No need for some representation or plan to
control the action.
• By avoiding world representations, the
architecture saves:
– The time needed to read and write them,
– The time-cost of algorithms that might employ them
– The difficulty of keeping up their accuracy.
• A single robot (agent) is a collection of many Finite State
Machines (FSMs) augmented with timers.
• The timers enable state changes after preprogrammed
periods of time.
• Each FSM performs an independent, simple task
such as controlling a particular sensor or actuator.
• These simple actions can produce more complex
behavior when they are organized into layers.
• Each layer implements a recognizable behavior such as
wander aimlessly, avoid an obstacle, or follow a moving
object.
For instance, a robot might have three layers (a code sketch follows this list):
1. One layer might ensure that robots avoid
colliding with objects.
2. Another layer would make the robot move
around without a fixed goal.
– Because of the first layer, the second layer has no
need to worry about collisions.
3. A third layer would make it move towards some
object sensed in the distance.
– Because of the first layer, this layer also has no need
to worry about clashing with the object.
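A minimal sketch of such layering (invented sensor names; a simple priority scheme stands in here for Brooks' inhibition and suppression wiring):

```python
# Each layer maps sensor readings to a proposed action, or None.
def avoid_layer(sensors):                    # layer 1: avoid collisions
    return "turn away" if sensors["obstacle_near"] else None

def approach_layer(sensors):                 # layer 3: head for a distant object
    return "move toward object" if sensors["object_in_distance"] else None

def wander_layer(sensors):                   # layer 2: move with no fixed goal
    return "wander randomly"

def act(sensors):
    # Collision avoidance suppresses the other layers, so they
    # "have no need to worry about" obstacles.
    for layer in (avoid_layer, approach_layer, wander_layer):
        action = layer(sensors)
        if action is not None:
            return action

print(act({"obstacle_near": True, "object_in_distance": True}))   # turn away
print(act({"obstacle_near": False, "object_in_distance": True}))  # move toward object
print(act({"obstacle_near": False, "object_in_distance": False})) # wander randomly
```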
• Some layers may inhibit or suppress the
behavior of other layers.
• This method allows different levels to have
their own hierarchical "rank of control".
• The higher layers (or levels of competence) build upon
the lower levels to create more complex behaviors.
• Each higher layer can be built independently and added
on to the system to create a higher level of competence.
• The designer begins by creating the lower layers and
then adds more and more layers to create more
complicated behavior.
• The behavior of the whole system is the result of many
interacting simple behaviors.
• Simple behaviors are combined in a bottom-up way to
form more complex behavior (emergence)
Distributed intelligence
• This behavior is distributed:
– Different FSMs do different tasks that
contribute to the overall behavior.
– The FSMs operate independently of each
other without a central control.
– The timers of the FSMs do not have to
synchronize with one another.
ADAPTIVE SYSTEM
• No set of rules could possibly prepare a
robot for all the events that might happen.
• In classical AI, unexpected inputs from the
environment could trigger a highly
inappropriate response.
• A bottom-up approach, like the
subsumption architecture, results in a
more flexible and adaptable design.
• An adaptive system adapts its behavior
according to how it senses changes in its
environment.
• Adaptation mainly involves the mutual
adjustment of action and environment.
• It does not necessarily require mental
representation.
Cheap design
• Computation can be reduced to a very small fraction of what classical AI requires.
• Brooks only uses simple sensors, cheap
microprocessors, and algorithms with low memory
requirements.
• This is the principle of cheap design: less is more.
• Since robots are inexpensive, it is possible to create a
world populated by many and observe their collective
behavior.
• Each agent is self-contained
(autonomous).
– All computation is performed by the robot.
• This approach shows that it is possible to
design an autonomous agent with a
minimal representation of the world.
The behavior follows from the
interaction between organisms and
environment.
It is not controlled only by the internal
rules of the organism.
• A robot might be rather poor at individual
tasks, but survive well in a dynamic real-world environment.
• “Individual behaviors can be composed to
compensate for each other's failures,
resulting in an emergently coherent
behavior despite the limitations of the
component behaviors.”
• No need to store all the necessary knowledge, because
a lot of the information is simply there in the
environment.
• No need to program every feature of the environment
into the robot.
• We use the interactions between the hardware and the
environment (the ecological niche).
• There is no internal representation of the state of the environment.
• The robot does not predict all of the effects that its
actions will have on the world.
– The frame problem does not really arise.
• No longer necessary to assume any complicated
internal processing of symbols (drawing
inferences, matching representations, retrieving
precedents from memory, etc.)
• Intelligence is no longer “stored” inside the head
of the robot.
• Intelligence is distributed between the robot and
its environment (niche).
• Artist Ken Rinaldo used a version of
Brooks’ subsumption architecture.
• http://www.ylem.org/artists/krinaldo/emergent1.html
AUTOPOIESIS
• Autopoiesis literally means “self-production” in Greek.
• The term was originally introduced by the
Chilean biologists Francisco Varela and
Humberto Maturana in the early 1970s.
Example of autopoiesis
The eukaryotic cell is made of various biochemical
components (nucleic acids, proteins, etc.), and is
organized into bounded structures (the cell nucleus, the
organelles, the cell membrane, etc.)
These structures, thanks to the external flow of molecules
and energy, produce the components which continue to
maintain the organized structure of the cell.
It is the structure of the cell that gives rise to these
components.
It is these components that reproduce the cell.
The biological cell therefore produces itself.
An autopoietic system is to be contrasted with an
allopoietic system.
– An allopoietic system is “other-producing” rather than
“self-producing”.
A car factory uses raw materials (components) to
produce a car (an organized structure) which is
something other than itself (a factory).
The car does not reproduce itself.
"An autopoietic machine is a machine organized
(defined as a unity) as a network of processes of
production (transformation and destruction) of
components which:
(i) through their interactions and transformations
continuously regenerate and realize the network of
processes (relations) that produced them; and
(ii) constitute it (the machine) as a concrete unity in
space in which they (the components) exist by
specifying the topological domain of its realization
as such a network."
Maturana and Varela
• Classical AI begins with a task-neutral machine
and then devises instructions (programmes) that
make the machine carry out a task.
– Top-down processing.
– Central planning
• New AI and A-Life define very simple rules from
which complex behavior will eventually
emerge.
– Bottom-up processing.
– Local interactions by simple units.
– No unit has an overall plan of action.
CONNECTIONISM
• An important development in the late 1980s was the use
of neural networks.
• Neural networks are often used to implement the ideas
of emergence and self-organization mentioned before.
Historical note
The concept of neural networks was already anticipated in
the work of the cybernetics movement in the late 40s
and 50s.
In the 60s, however, classical AI became the dominant
force. Neural networks were pushed aside…
• A network is an organization of interconnected processing units.
• A neural network is a very large collection
of very simple processing units.
• The term “neural” shows that the original
inspiration was the way neurons are
connected in the brain.
• The biological appropriateness of the
model, however, is debatable.
• It is best to understand the model without
thinking too much of the biological brain.
• Key properties of a neural network:
1. A set of processing units
2. A pattern of connectivity among these
units
– Network connections are channels through
which information flows between members of
a network. In the absence of such
connections, no group of objects is a network.
3. An input connection is a conduit through which
a member of a network receives information
(INPUT).
4. An output connection is a conduit through
which a member of a network sends information
(OUTPUT).
No computer belongs to a network unless it can
receive information (INPUT) from other
computers or send information (OUTPUT) to
other computers.
• A processing unit is roughly analogous to
a neuron in the brain.
• In the brain, information flows from neuron
to neuron through the synapses.
• In a neural network, the flow of information
is often organized into layers.
• In this example,
every unit is
connected to all the
units above it.
• This is called a
three-layered
feedforward
network.
• The inputs (signals received by a unit) are
called activation values (or simply
activations).
• An activation value is a number.
• The activation value is normally between 0
and 1.
• This number indicates how active the unit
is.
• A unit typically receives activation values
from several other input units.
• The unit computes its own activation value
depending on the activation values it
receives from the input units.
• The unit then sends its activation value to
other units, thus helping to transmit
information through the net.
• The output of a unit depends on its inputs.
• It also depends on the weights of its
incoming connections (the connection weights).
• Weights may be positive or negative.
– They usually range from -1 to 1.
• Weights represent the strength and character of a
connection.
– The magnitude of a weight indicates how strong
the connection is.
– A positive weight is excitatory: it raises the receiving
unit’s activation. A negative weight is inhibitory: it lowers it.
• A unit receives many incoming signals.
• It must compute its combined input before
it can produce its own output.
• The COMBINED INPUT to a unit is the sum
of each INPUT activation multiplied by its
connection weight.
• The activation function of a unit keeps its
output within an acceptable range
(between a minimum and a maximum).
• For example, if the acceptable range is 0
to 1, then the value of the combined input
must be translated into a number between
0 and 1.
• This value is the output activation of the
unit.
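• As a concrete illustration, here is a minimal Python sketch of a single unit; the sigmoid is an assumed choice of activation function, since the slides do not name one:

    import math

    def unit_output(input_activations, weights):
        # Combined input: the sum of each input activation
        # multiplied by its connection weight.
        combined_input = sum(a * w for a, w in zip(input_activations, weights))
        # Activation function (a sigmoid here): squashes the
        # combined input into the acceptable range 0 to 1.
        return 1.0 / (1.0 + math.exp(-combined_input))

    # Three incoming signals, with both positive and negative weights.
    print(unit_output([1.0, 0.5, 0.2], [0.8, -0.3, 0.5]))  # about 0.68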
• Neural networks tend to be massively
parallel.
• Neural nets resemble the brain in two ways:
1. Distribution
The execution of particular tasks is often distributed
over several brain regions. Functions are not always
localized in a specific physical area of the brain.
2. Parallelism.
Brain activity is not serial but vastly parallel.
• Connectionist models of computation combine
these two ideas.
– The model is often also called parallel distributed
processing (PDP).
• The simple net we have discussed is
called a feedforward net.
• Activation flows directly from inputs to
hidden units and then on to the output
units.
• More complex models include many layers
of hidden units, and recurrent connections
sending signals backwards.
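• To make the layered picture concrete, here is a minimal Python sketch of a forward pass through a small three-layered feedforward network; the layer sizes and weights are invented purely for illustration:

    import math

    def layer_forward(inputs, weight_matrix):
        # weight_matrix[j] holds the incoming weights of unit j in this layer.
        outputs = []
        for unit_weights in weight_matrix:
            combined = sum(a * w for a, w in zip(inputs, unit_weights))
            outputs.append(1.0 / (1.0 + math.exp(-combined)))  # sigmoid
        return outputs

    # Invented weights: 2 input units -> 3 hidden units -> 1 output unit.
    hidden_weights = [[0.5, -0.4], [0.9, 0.1], [-0.3, 0.8]]
    output_weights = [[0.7, -0.2, 0.6]]

    hidden = layer_forward([1.0, 0.0], hidden_weights)  # input -> hidden
    output = layer_forward(hidden, output_weights)      # hidden -> output
    print(output)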
• Connectionism is another name for the
use of neural nets in AI.
• Connectionism is also a cognitive theory.
– “Thinking” can be explained by collections of
units that operate in this way in the brain.
• All the units calculate roughly the same
(very simple) activation function.
• The operation of the network to a large
extent depends on the weights between
the units.
• A central goal of connectionist research is
finding the right weights between units.
• Here is one method:
1. Suppose we want a net to carry out some
task (such as recognizing male and female
faces in a picture).
2. The net might have two output units (one
for “male”, one for “female”) and many
input units, one devoted to the brightness
of each pixel in the picture.
3. The weights of the net to be trained are
initially set to random values.
4. The net is then “shown” some picture(s).
5. The actual output of the net is compared
with the desired output.
6. Every weight in the net is modified slightly
to bring the net's actual output values
closer to the desired output values.
7. The process is repeated until the desired output
values are produced at the appropriate times.
8. The ideal objective is to let the net “generalize”
its behavior, so as to “recognize” even male and
female faces it has never “seen” before.
To “recognize” something, in this sense, is to
send an appropriate output when confronted
with it.
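• As a toy version of steps 3–7, the sketch below trains a single unit with the delta rule; the delta rule and the learning rate are assumptions, since the slides only describe the general idea of small corrective weight changes:

    import random

    def train_unit(examples, n_inputs, learning_rate=0.1, epochs=200):
        # Step 3: the weights to be trained start at random values.
        weights = [random.uniform(-1.0, 1.0) for _ in range(n_inputs)]
        for _ in range(epochs):
            for inputs, desired in examples:
                # Steps 4-5: "show" an example; compare actual and desired output.
                actual = sum(a * w for a, w in zip(inputs, weights))
                error = desired - actual
                # Step 6: nudge every weight slightly so the actual output
                # moves closer to the desired output.
                for i in range(n_inputs):
                    weights[i] += learning_rate * error * inputs[i]
        return weights

    # Toy task: the desired output is simply the value of the first input.
    examples = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0), ([1.0, 1.0], 1.0)]
    print(train_unit(examples, n_inputs=2))  # weights approach [1.0, 0.0]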
• Thus a neural network can be said to “learn”.
Its acquired ability is an emergent property or
characteristic of the network.
• An emergent characteristic is a characteristic
of a whole (such as a network) that cannot be
predicted from knowledge of its parts (the
processing units) alone.
• Connectionists no longer need to program
all the knowledge into the computer by
using explicit symbols.
• The computer evolves the knowledge.
• There is no need to program a particular
plan, script, or frame into the computer.
Some tasks that neural networks can do:
• Pronounce some English text.
• Recognize the tenses of verbs.
• Recognize other grammatical structures
and construct correct sentences.
• Recognize shapes.
• Recognize speech.
• Many definitions have exceptions.
• Philosophers and psychologists believe
that concepts have flexible boundaries.
– Concepts do not always have clear-cut
membership conditions.
• Connectionist models seem especially
well suited to capturing the flexibility
and ambiguity of concepts.
Distributed representation
• Connectionist models have been used to
tackle the question:
How does the brain represent information?
• It is not true that an individual neuron or a small
chunk of neurons (a node in the net) represents
one thought (for instance, our memory of a
particular place or the concept “dog”).
• Instead, every thought is represented by a
complex pattern of activity across various parts
of the brain.
• Representation is distributed rather than
local.
• Each representation involves many units.
• Each unit participates in many
representations.
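• As a toy illustration (the activation patterns below are invented), two concepts can be represented as patterns over the same pool of units, with most units taking part in both:

    # Invented activation patterns for two concepts over the same eight units.
    dog = [0.9, 0.1, 0.8, 0.7, 0.0, 0.6, 0.2, 0.9]
    cat = [0.8, 0.2, 0.1, 0.7, 0.9, 0.5, 0.0, 0.8]

    # Each representation involves many units...
    print(sum(1 for a in dog if a > 0.5), "units strongly active for 'dog'")
    # ...and each unit participates in many representations.
    both = [i for i in range(8) if dog[i] > 0.5 and cat[i] > 0.5]
    print("units strongly active in both patterns:", both)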
Connectionism vs. classical AI
• Classical AI involved systems which use
explicit logical principles, rules, scripts,
frames, or similar symbolic structures.
• It often used rules with conditions and
actions.
• Classical AI relied on symbolic
representation.
• Instead of symbols stored in the brain (the
classical idea of knowledge representation in
AI), cognitive representation involves activation
patterns distributed throughout the brain.
• "Information is not stored anywhere in particular.
Rather it is stored everywhere. Information is
better thought of as 'evoked' than 'found'"
(Rumelhart & Norman 1981).
• Whereas classical AI emphasized
symbolic representation, connectionist
representation is sub-symbolic.
– Connectionism works on cognitive
microstructures.
• Connectionist systems involve the
massively parallel processing of sub-symbols.
• Neural networks can be used to represent
traditional logical relations.
• Suppose, for example, that a node y becomes active
only when its combined input is at least 1. Such a
threshold unit can implement a logic gate, as in the
sketch below.
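• The slide’s figure is not reproduced here, but one standard construction (an assumption consistent with the text) gives each of two inputs a weight of 0.5, so that y’s combined input reaches 1 only when both inputs are active; y then computes logical AND:

    def threshold_unit(inputs, weights, threshold=1.0):
        # y is active (1) only when its combined input reaches the threshold.
        combined = sum(a * w for a, w in zip(inputs, weights))
        return 1 if combined >= threshold else 0

    # With a weight of 0.5 on each input, the combined input reaches 1
    # only when both inputs are 1: the unit computes AND.
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "->", threshold_unit([a, b], [0.5, 0.5]))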
• But neural nets are often used in a manner
very different from classical AI. This is the
emphasis of today’s lecture.
• Connectionism has influenced the
development of a philosophical viewpoint
known as ELIMINATIVE MATERIALISM.
Note: A psychologist or computer scientist
who works with neural networks does not
necessarily have to support Eliminative
Materialism! Many of them do not!
Eliminative materialism (E. M.)
• This strong version of materialism was defended
by Paul and Patricia Churchland.
• They do not say that mental processes are the
same as brain processes.
– They say that mental processes do not exist.
– Descriptions of desires, emotions, sensations, etc.
are empty.
– They do not refer to anything that exists.
• All that really exists are patterns of neural
activation in the brain.
Comparison of E. M. and
Functionalism
• Functionalism attempts to explain mental
states.
• E. M. claims that mental states are an
illusion.
• They should not be explained, but
“eliminated” from the theory.
• Most of us explain one another’s actions
by speaking of “beliefs”, “emotions” and
“desires”.
• This commonsense theory of the mind is
called FOLK PSYCHOLOGY.
• According to eliminative materialism, folk
psychology is all wrong.
• According to E. M.,
…folk psychology will be completely displaced
by a true theory of the brain.
• In any case, we need not accept E. M. to
see that connectionism is a powerful
model of AI.