Artificial Intelligence: Recognition Tasks

CS 100, Lecture 24

(S&G, § 13.4)

The Cognitive Inversion

• Computers can do some things very well that are difficult for people

– e.g., arithmetic calculations

– playing chess & other board games

– doing proofs in formal logic & mathematics

– handling large amounts of data precisely

• But computers are very bad at some things that are easy for people (and even some animals)

– e.g., face recognition & general object recognition

– autonomous locomotion

– sensory-motor coordination

• Conclusion: brains work very differently from Von Neumann computers


The 100-Step Rule

• Typical recognition tasks take less than one second

• Neurons take several milliseconds to fire

• Therefore there can be at most about 100 sequential processing steps
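As a rough sanity check of the arithmetic behind this rule (the slide only says "less than one second" and "several milliseconds", so the exact figures below are illustrative assumptions), a minimal sketch:

```python
# Rough arithmetic behind the 100-step rule (illustrative figures).
recognition_time = 0.5   # seconds; assume a recognition task takes about half a second
firing_time = 0.005      # seconds; assume a neuron takes about 5 ms to fire

max_sequential_steps = recognition_time / firing_time
print(max_sequential_steps)   # 100.0 -> at most on the order of 100 sequential steps
```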


Two Different Approaches to Computing

• Von Neumann: Narrow but Deep

• Neural Computation: Shallow but Wide

How Wide?

• Retina: 100 million receptors

• Optic nerve: one million nerve fibers

A Very Brief Tour of Real Neurons (and Real Brains)

Left Hemisphere

(fig. from internet)

Typical Neuron

Grey Matter vs. White Matter

(fig. from Carter 1998)

Neural Density in Cortex

• 148,000 neurons / sq. mm

• Hence, about 15 million / sq. cm

Relative Cortical Areas


Information Representation in the Brain

Macaque Visual System

(fig. from Van Essen & al. 1992)

Hierarchy of Macaque Visual Areas

(fig. from Van Essen & al. 1992)

Bat Auditory Cortex

(figs. from Suga, 1985)

Real Neurons

Typical Neuron

Dendritic Trees of Some Neurons

A. inferior olivary nucleus
B. granule cell of cerebellar cortex
C. small cell of reticular formation
D. small gelatinosa cell of spinal trigeminal nucleus
E. ovoid cell, nucleus of tractus solitarius
F. large cell of reticular formation
G. spindle-shaped cell, substantia gelatinosa of spinal cord
H. large cell of spinal trigeminal nucleus
I. putamen of lenticular nucleus
J. double pyramidal cell, Ammon’s horn of hippocampal cortex
K. thalamic nucleus
L. globus pallidus of lenticular nucleus

(fig. from Truex & Carpenter, 1964)

Layers and Minicolumns


(fig. from Arbib 1995, p. 270)


Neural Networks in Visual System of Frog

(fig. from Arbib 1995, p. 1039)

Frequency Coding

(fig. from Anderson, Intr. Neur. Nets)


Variations in Spiking Behavior


Chemical Synapse

1. Action potential arrives at synapse

2. Ca ions enter cell

3. Vesicles move to membrane, release neurotransmitter

4. Transmitter crosses cleft, causes postsynaptic voltage change

(fig. from Anderson, Intr. Neur. Nets)


Neurons are Not Logic Gates

• Speed

– electronic logic gates are very fast (nanoseconds)

– neurons are comparatively slow (milliseconds)

• Precision

– logic gates are highly reliable digital devices

– neurons are imprecise analog devices

• Connections

– logic gates have few inputs (usually 1 to 3)

– many neurons have >100,000 inputs


Artificial Neural Networks


Typical Artificial Neuron

(diagram labels: inputs, connection weights, net input, summation, activation function, threshold, output)
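To make these labels concrete, here is a minimal Python sketch of such a unit (the class name and the hard threshold activation are illustrative choices for this sketch, not anything prescribed by the slides):

```python
class ThresholdNeuron:
    """Artificial neuron: weighted inputs are summed and compared to a threshold."""

    def __init__(self, weights, threshold):
        self.weights = weights        # one connection weight per input
        self.threshold = threshold

    def net_input(self, inputs):
        # summation: multiply each input by its connection weight and add them up
        return sum(w * x for w, x in zip(self.weights, inputs))

    def output(self, inputs):
        # activation function: a step at the threshold (1 if the net input exceeds it, else 0)
        return 1 if self.net_input(inputs) > self.threshold else 0
```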

Operation of Artificial Neuron

• Inputs: 0, 1, 1; weights: 2, –1, 4; threshold: 2

• Weighted inputs: 0 × 2 = 0, 1 × (–1) = –1, 1 × 4 = 4

• Net input: 0 + (–1) + 4 = 3

• Since 3 > 2, the output is 1

Operation of Artificial Neuron

• Inputs: 1, 1, 0; weights: 2, –1, 4; threshold: 2

• Weighted inputs: 1 × 2 = 2, 1 × (–1) = –1, 0 × 4 = 0

• Net input: 2 + (–1) + 0 = 1

• Since 1 < 2, the output is 0
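Both worked examples can be checked directly with the hypothetical ThresholdNeuron sketch given earlier:

```python
neuron = ThresholdNeuron(weights=[2, -1, 4], threshold=2)

print(neuron.net_input([0, 1, 1]))   # 3 -> 3 > 2, so
print(neuron.output([0, 1, 1]))      # 1

print(neuron.net_input([1, 1, 0]))   # 1 -> 1 < 2, so
print(neuron.output([1, 1, 0]))      # 0
```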

Feedforward Network

(diagram labels: input, input layer, hidden layers, output layer)

Feedforward Network in Operation (1)–(8)

(animation: activation flows from the input, through the input layer and the hidden layers, to the output layer, producing the output)
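A minimal sketch of such a forward pass, reusing the step activation from the single-neuron sketch (the layer sizes and random weights below are purely illustrative assumptions):

```python
import random

def step(net, threshold=0.0):
    # step activation, as in the single-neuron examples (threshold choice is illustrative)
    return 1 if net > threshold else 0

def layer_output(inputs, weight_matrix):
    # each neuron in the layer: weighted sum of the layer's inputs, then the activation
    return [step(sum(w * x for w, x in zip(weights, inputs)))
            for weights in weight_matrix]

def feedforward(inputs, layers):
    # activation flows from the input layer, through the hidden layers, to the output layer
    activation = inputs
    for weight_matrix in layers:
        activation = layer_output(activation, weight_matrix)
    return activation

# a tiny 3-input, 4-hidden, 2-output network with random weights, purely for illustration
random.seed(0)
w_hidden = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
w_output = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(2)]
print(feedforward([0, 1, 1], [w_hidden, w_output]))   # a list of 0/1 values, one per output neuron
```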

Comparison with Non-Neural Net Approaches

• Non-NN approaches typically decide output from a small number of dominant factors

• NNs typically look at a large number of factors, each of which weakly influences output

• NNs permit:

– subtle discriminations

– holistic judgments

– context sensitivity


Connectionist Architectures

• The knowledge is implicit in the connection weights between the neurons

• Items of knowledge are not stored in dedicated memory locations, as in a Von Neumann computer

• “Holographic” knowledge representation:

– each knowledge item is distributed over many connections

– each connection encodes many knowledge items

• Memory & processing are robust in the face of damage, errors, inaccuracy, noise, …


Differences from Digital Calculation

• Information represented in continuous images (rather than language-like structures)

• Information processing by continuous image processing (rather than explicit rules applied in individual steps)

• Indefiniteness is inevitable (rather than definiteness assumed)


Supervised Learning

• Produce desired outputs for training inputs

• Generalize reasonably & appropriately to other inputs

• Good example: pattern recognition

• Neural nets are trained rather than programmed

– another difference from Von Neumann computation


Learning for Output Neuron (1)

• Inputs: 0, 1, 1; weights: 2, –1, 4; threshold: 2

• Net input: 3; since 3 > 2, the output is 1

• Desired output: 1, so no change!

Learning for Output Neuron (2)

• Inputs: 0, 1, 1; desired output: 0, but the actual output is 1: the output is too large

• So the weights on the active inputs are decreased: –1 → –1.5 and 4 → 3.25

• New net input: 1.75; since 1.75 < 2, the output is now 0

Learning for Output Neuron (3)

• Inputs: 1, 1, 0; weights: 2, –1, 4; threshold: 2

• Net input: 1; since 1 < 2, the output is 0

• Desired output: 0, so no change!

Learning for Output Neuron (4)

• Inputs: 1, 1, 0; desired output: 1, but the actual output is 0: the output is too small

• So the weights on the active inputs are increased: 2 → 2.75 and –1 → –0.5

• New net input: 2.25; since 2.25 > 2, the output is now 1
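The adjustments in examples (1)–(4) follow a perceptron-style rule: when the output is too large, decrease the weights on the active inputs; when it is too small, increase them; otherwise leave them alone. A minimal sketch with a single uniform learning rate (the slides' exact adjustment amounts differ, and the rate used here is an arbitrary assumption):

```python
def train_step(weights, threshold, inputs, desired, rate=0.5):
    """One learning step for an output neuron: nudge the weights on
    active inputs toward producing the desired output."""
    net = sum(w * x for w, x in zip(weights, inputs))
    actual = 1 if net > threshold else 0
    error = desired - actual   # 0: no change, +1: output too small, -1: output too large
    return [w + rate * error * x for w, x in zip(weights, inputs)]

weights, threshold = [2, -1, 4], 2
weights = train_step(weights, threshold, [0, 1, 1], desired=1)   # output already correct: no change
weights = train_step(weights, threshold, [0, 1, 1], desired=0)   # output too large: active weights decrease
print(weights)   # [2.0, -1.5, 3.5]
# new net input for [0, 1, 1] is 2, no longer above the threshold, so the output becomes 0
```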

Credit Assignment Problem

How do we adjust the weights of the hidden layers?

(diagram labels: input layer, hidden layers, output layer, desired output)

Back-Propagation (animation sequence)

• Forward pass

• Actual vs. desired output

• Correct output neuron weights

• Correct last hidden layer

• Correct all hidden layers

• Correct first hidden layer

(diagram labels at each step: input, desired output)

Use of Back-Propagation (BP)

• Typically the weights are changed slowly

• Typically the net will not give correct outputs for all training inputs after one adjustment

• Each input/output pair is used repeatedly for training (see the sketch below)

• BP may be slow

• But there are many better ANN learning algorithms
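A minimal sketch of a single back-propagation step for a network with one hidden layer, following the sequence in the animation above (forward pass, compare actual vs. desired output, correct the output neuron weights, then the hidden weights), plus the repeated use of a training pair described in these bullets. The sigmoid activation, layer sizes, learning rate, and absence of bias terms are all simplifying assumptions of this sketch, not details given in the lecture:

```python
import math, random

def sigmoid(x):
    # smooth activation; back-propagation needs a differentiable activation, not a hard threshold
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, w_hidden, w_out):
    # forward pass: input layer -> hidden layer -> output layer
    hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs))) for ws in w_hidden]
    outputs = [sigmoid(sum(w * h for w, h in zip(ws, hidden))) for ws in w_out]
    return hidden, outputs

def backprop_step(inputs, desired, w_hidden, w_out, rate=0.5):
    hidden, outputs = forward(inputs, w_hidden, w_out)
    # error at the output neurons (actual vs. desired output)
    out_err = [(d - o) * o * (1 - o) for d, o in zip(desired, outputs)]
    # propagate the error back to the hidden layer (credit assignment)
    hid_err = [h * (1 - h) * sum(e * ws[j] for e, ws in zip(out_err, w_out))
               for j, h in enumerate(hidden)]
    # correct output neuron weights, then hidden layer weights (a small change each step)
    for e, ws in zip(out_err, w_out):
        for j, h in enumerate(hidden):
            ws[j] += rate * e * h
    for e, ws in zip(hid_err, w_hidden):
        for i, x in enumerate(inputs):
            ws[i] += rate * e * x

# one training pair, used repeatedly; the output gradually moves toward the desired value
random.seed(0)
w_hidden = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(3)]   # 2 inputs -> 3 hidden
w_out = [[random.uniform(-1, 1) for _ in range(3)]]                        # 3 hidden -> 1 output
inputs, desired = [1, 0], [1]
print(forward(inputs, w_hidden, w_out)[1])   # initial output (whatever the random weights give)
for _ in range(200):
    backprop_step(inputs, desired, w_hidden, w_out)
print(forward(inputs, w_hidden, w_out)[1])   # after repeated training, the output has moved toward 1
```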


ANN Training Procedures

• Supervised training: we show the net the output it should produce for each training input (e.g., BP)

• Reinforcement training: we tell the net if its output is right or wrong, but not what the correct output is

• Unsupervised training: the net attempts to find patterns in its environment without external guidance



Applications of ANNs

• “Neural nets are the second-best way of doing everything”

• If you really understand a problem, you can design a special purpose algorithm for it, which will beat a NN

• However, if you don’t understand your problem very well, you can generally train a NN to do it well enough
