Action-Perception-Learning Cycles
2012 Fall Graduate Course
Byoung-Tak Zhang
Department of Computer Science and Engineering &
Cognitive Science and Brain Science Programs
Seoul National University
http://bi.snu.ac.kr/
What is a Learning System?
• Learning is the improvement of performance in some
environment through the acquisition of knowledge
resulting from experience in that environment.
[Diagram: the improvement of behavior on some performance task through acquisition of knowledge based on partial task experience]
2012 (c) SNU Biointelligence Lab,
http://bi.snu.ac.kr/
Machine Learning: An Example
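Error Backpropagation (weight update):
$w_i \leftarrow w_i + \Delta w_i, \quad \Delta w_i = -\eta \frac{\partial E}{\partial w_i}$
Output Comparison (error on training example $d$):
$E_d(\mathbf{w}) \equiv \frac{1}{2} \sum_{k \in outputs} (t_k - o_k)^2$
Information Propagation: Output $o = f(\mathbf{x})$
[Figure: feedforward network with Inputs x1, x2, x3; Input Layer (Scaling Function), Hidden Layer (Activation Function), Output Layer (Activation Function), connected by Weights]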
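The backpropagation slide above can be illustrated with a minimal sketch. The network size (3 inputs, 2 hidden units, 1 output), the sigmoid activation, the learning rate `eta`, and the single training pair are all assumptions chosen for illustration; they are not taken from the slides.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Information propagation: input -> hidden layer -> single output."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    o = sigmoid(sum(w * hi for w, hi in zip(w_out, h)))
    return h, o


def backprop_step(x, t, w_hidden, w_out, eta=0.5):
    """One gradient step on E_d = 0.5 * (t - o)^2.

    Updates the weights in place and returns the error measured before the step.
    """
    h, o = forward(x, w_hidden, w_out)
    # Error terms (negative derivative of E_d w.r.t. each unit's net input).
    delta_o = (t - o) * o * (1.0 - o)
    delta_h = [hj * (1.0 - hj) * w_out[j] * delta_o for j, hj in enumerate(h)]
    # w_i <- w_i + delta_w_i, with delta_w_i = -eta * dE/dw_i.
    for j in range(len(w_out)):
        w_out[j] += eta * delta_o * h[j]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] += eta * delta_h[j] * x[i]
    return 0.5 * (t - o) ** 2


def train(steps=200, seed=0):
    """Repeatedly fit one illustrative (x, t) pair; returns the error history."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(2)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(2)]
    x, t = [1.0, 0.5, -0.5], 0.9
    return [backprop_step(x, t, w_hidden, w_out) for _ in range(steps)]
```

Calling `train()` returns a list of errors that should shrink as the weights adapt; a real application would loop over a full training set rather than a single example.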
Application Example:
Autonomous Land Vehicle (ALV)
• NN learns to steer an autonomous vehicle.
• 960 input units, 4 hidden units, 30 output units
• Driving at speeds up to 70 miles per hour
ALVINN System
Google “Self-Driving Car”
• DARPA Grand Challenge (2005)
• DARPA Urban Challenge (2007)
• Google Self-Driving Car (2009)
Machine Learning (ML): Three Tasks
• Supervised Learning
– Estimate an unknown mapping from known input and target output pairs
– Learn $f_w$ from training set $D = \{(x, y)\}$ s.t. $f_w(x) \approx y \approx f(x)$
– Classification: y is discrete
– Regression: y is continuous
• Unsupervised Learning
– Only input values are provided
– Learn $f_w$ from $D = \{(x)\}$ s.t. $f_w(x) \approx x$
– Compression
– Clustering
• Reinforcement Learning
– Not targets, but rewards (critiques) are provided “sequentially”
– Learn a heuristic function $f_w$ from $D_t = \{(s_t, a_t, r_t) \mid t = 1, 2, \ldots\}$ s.t. $f_w(s_t, a_t, r_t)$
– Action selection
– Policy learning
[Zhang, B.-T., Next-Generation Machine Learning Technologies, Communications of KIISE, 25(3), 2007]
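The three tasks differ mainly in what the training data contain. The sketch below shows one toy learner per task; the function names, the linear model, the 2-means clustering, and the stateless (bandit-style) reward setting are simplifying assumptions for illustration, not methods from the slides.

```python
import random


def fit_supervised(pairs):
    """Supervised: least-squares slope w so that f_w(x) = w * x approximates y."""
    num = sum(x * y for x, y in pairs)
    den = sum(x * x for x, _ in pairs)
    return num / den


def cluster_unsupervised(xs, iters=10):
    """Unsupervised: 1-D 2-means on inputs only; returns the two centers."""
    c0, c1 = min(xs), max(xs)
    for _ in range(iters):
        g0 = [x for x in xs if abs(x - c0) <= abs(x - c1)]
        g1 = [x for x in xs if abs(x - c0) > abs(x - c1)]
        c0 = sum(g0) / len(g0)
        c1 = sum(g1) / len(g1)
    return c0, c1


def learn_from_rewards(reward_fn, n_actions=2, steps=500, eps=0.1, seed=0):
    """Reinforcement: estimate action values from sequentially observed rewards.

    Uses epsilon-greedy action selection and an incremental sample average.
    States are omitted here, so this is a bandit simplification of (s_t, a_t, r_t).
    """
    rng = random.Random(seed)
    q = [0.0] * n_actions
    counts = [0] * n_actions
    for _ in range(steps):
        if rng.random() < eps:
            a = rng.randrange(n_actions)                     # explore
        else:
            a = max(range(n_actions), key=lambda i: q[i])    # exploit
        r = reward_fn(a)
        counts[a] += 1
        q[a] += (r - q[a]) / counts[a]
    return q
```

For example, `fit_supervised([(1, 2), (2, 4), (3, 6)])` recovers a slope of 2, while `learn_from_rewards` only ever sees a reward for the action it actually took.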
Machine Learning Models
• Symbolic Learning
– Version Space Learning
– Case-Based Learning
• Neural Learning
– Multilayer Perceptrons
– Self-Organizing Maps
– Support Vector Machines
– Kernel Machines
• Evolutionary Learning
– Evolution Strategies
– Evolutionary Programming
– Genetic Algorithms
– Genetic Programming
– Molecular Programming
• Probabilistic Learning
– Bayesian Networks
– Helmholtz Machines
– Markov Random Fields
– Hypernetworks
– Latent Variable Models
– Generative Topographic Mapping
• Other Methods
– Decision Trees
– Reinforcement Learning
– Boosting Algorithms
– Mixture of Experts
– Independent Component Analysis
[Zhang, B.-T., Next-Generation Machine Learning Technologies, Communications of KIISE, 25(3), 2007]
From Machine Learning to
Brain-Like Cognitive Learning
Machine Learning vs. Human Learning
Machine Learning
• Clear separation of learning
and inference
• Examples are assumed to be
statistically independent
• Mainly numerical, quantitative
change
• One-shot learning is difficult
• Requires uniquely labeled
examples (supervised
classification)
• Good at discrimination and
classification (discriminative)
Human Learning
• Learning and inference
interleaved
• Previous learning affects the
next learning (dynamic)
• Relational, qualitative change
possible
• One-shot learning is frequent
• Learns from unlabeled or self-labeled examples (self-supervised)
• Can generate prototypes and
instances (generative)
Human Learning: Properties
• Sensorimotor
• Real-time
• Predictive
• Incremental
• Dynamic
• Structural
• One-shot
• Self-supervised
• Prototypical
• Generative
• Recall
Humans and Computers
The Entire Problem Space
Human Computers
What Kind of
Computers?
Current Computers
Cognitive Systems
Cognitive Systems Require Cognitive Computing or
Cognitive Information Processing
Cognitive Computing
Cognitive System
Real-Time Dynamics
Openness
Multisensory Integration
Perception
Sequential Generation
Action
[Zhang, B.-T., Communications of KIISE, 30(1):75-111, 2012]
TU Munich “Rosie” the Cognitive Robot
Apple “Siri” Personal Assistant
Toward Human-Level Computational Intelligence:
A Perspective of the SNU Biointelligence Lab
• Q1: What capability is fundamentally missing for achieving
human-level computational intelligence?
– A1: Human-level machine learning that enables rapid, flexible, and robust
decisions and actions in dynamic and uncertain environments.
• Q2: What aspect is the most essential to study human-level
machine learning?
– A2: Lifelong learning with perception-action cycles, i.e. the circular flow of
information that takes place between the organism and its environment in the
course of a sensory-guided sequence of behavior towards a goal (Fuster, 2004).
• Q3: What capabilities are required for lifelong learning in
perception-action cycle systems?
– A3: Dynamic, incremental, online, and predictive learning. Flexible
representation and fast reorganization. Multisensory integration, sensorimotor
imagery, and sequential decision making. Active, selective attention. Balancing
exploration and exploitation. Self-awareness, motivation, self-sustainability….
Course Introduction
• From machine learning to brain-like
cognitive learning
• Brain as a physical, thermodynamic computer
• Perception-action cycles and Carnot cycles
• Models of action-perception-learning cycles
Brain as a Physical, Thermodynamic Computer
• Brain is an open, dissipative system, operating far
from thermodynamic equilibrium.
• Brain must exchange energy and matter with
its environment to maintain stability.
• Brain can be excited internally by chemical
(enzymes) and electrical means (action potentials)
as well as externally.
• Continuous sensing of external world and internal
world.
• Continuous action on external world and internal
world.
Mapping the World
Carnot Cycle for a Pyramidal Neuron
[Fry, 2005; Fry, 2008]
Carnot Cycle for the Brain
[Freeman et al., 2012]
Information Physics of Biological Systems
[Bialek et al., 2007]
[Two slides by Robert Fry]
Perception-Action Cycles
(Reference: Andrew Ng, Stanford Univ.)
Perception-Action Cycle in Autonomous
Helicopter Control
Stanford Autonomous Helicopter - Airshow #2:
http://www.youtube.com/watch?v=VCdxqn0fcnE
Perception-Action Cycle in Humans
[Trommershaeuser et al., Sensory Cue Integration, 2011]
Perception-Action Cycle in Communication
between A and B
Perception-Action Cycle in Language
Comprehension
Perception-Action Cycle in Robots
[Zahedi et al., Adaptive Behavior, 2009]
Perception-Action Cycle
[Zahedi et al., Adaptive Behavior, 2009]
Predictive Information
[Zahedi et al., Adaptive Behavior, 2009]
Sensory Prediction
[Zahedi et al., Adaptive Behavior, 2009]
Free Energy and the Perception-Action Cycle
[Friston, Trends in Cognitive Sciences, 2009]
Reinforcement Learning and the Perception-Action Cycles
= (information-to-go) – (value-to-go)
[Tishby & Polani, 2010]
Brain Mechanisms for the Perception-Action-Learning Cycle
Brain Computation: Speed, Flexibility, Robustness
• How can brain computation be so fast, flexible, and
robust in a changing environment?
– Fast
• Object recognition: within 100 ms
• Anomaly detection: N400, P600
• Instant decision-making
– Flexible
• Invariant to shift, scale, and rotation
• Various utterances for the same meaning
• Art, music, literature, and dancing
– Robust
• Cluttered image
• Noisy speech
• Intention reading under complex situations
• What brain mechanisms for information processing and
organization allow this?
Language Processing in the Brain
• N400: a brain wave related to linguistic processes.
• Increased when semantically mismatched
Fig. 9.30: ERP waveforms differentiate between congruent words at the end of sentences (work) and anomalous last words that do not fit the semantic specifications of the preceding context (socks).
Syntactic Processing in the Brain
• LAN (left anterior negativity): negative wave over the left
frontal areas when words violate the required word category in
a sentence (syntactic violation)
• e.g. “the red eats”, “he mow”
[Figure panels: Semantic, Syntactic. ERPs related to semantic and syntactic processing.]
Brain as a Widely Distributed, Parallel, Interactive,
Overlapping, Dynamic Relational Memory Network
[Fuster, 2004]
Neural Representations and Processing
• “Chemical” and “molecular” basis of synapses
• Distributed representation
• Multiple overlapping representations
• Hierarchical representation
• Associative recall
• Population coding
• Assembly coding
• Sparse coding
• Temporal coding
• Synfire chain
• Dynamic coordination
• Correlation coding
Bayesian Brain: Multisensory Integration
[Knill & Pouget, 2004]
Population Coding (Representation)
Rate Coding (population activity):
$A(t) = \lim_{\Delta T \to 0} \frac{1}{\Delta T} \frac{\text{number of spikes in population of size } N}{N} = \lim_{\Delta T \to 0} \frac{1}{\Delta T} \int_{t - \Delta T/2}^{t + \Delta T/2} \frac{1}{N} \sum_{i=1}^{N} \sum_{f} \delta(t' - t_i^{f}) \, dt'$
Gain Coding
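For a finite window, the population activity above reduces to counting spikes. The following is a minimal sketch; the function name, the half-open window, and the spike times in the usage note are illustrative assumptions.

```python
def population_activity(spike_trains, t, dt):
    """Approximate A(t) with a finite window of width dt centered on t.

    spike_trains: one list of spike times per neuron (N = number of lists).
    Returns spikes per neuron per unit time within [t - dt/2, t + dt/2).
    """
    n = len(spike_trains)
    lo, hi = t - dt / 2.0, t + dt / 2.0
    count = sum(1 for train in spike_trains for s in train if lo <= s < hi)
    return count / (n * dt)
```

With four neurons and spike times in seconds, `population_activity([[0.010, 0.050], [0.012], [0.200], []], t=0.010, dt=0.010)` counts two spikes in the window, giving roughly 50 spikes per second per neuron.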
Probabilistic Inference with Population Codes
[Knill and Pouget, Trends in Neurosciences, 2004]
Dynamics in Sensory Cue Integration
[Deneve et al., Nature Neuroscience, 2001, from Knill and Pouget, Trends in Neurosciences, 2004]
Models of Perception-Action-Learning Cycles
Markov Models
(Markov Chains)
First-order Markov Model
(Markov Chain)
Second-order Markov Model
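The difference between first-order and second-order Markov models is the length of the context each symbol is conditioned on. As a rough sketch, the hypothetical helper `fit_markov` below estimates the conditional probabilities from counts; the name and the maximum-likelihood estimator are assumptions for illustration.

```python
from collections import Counter, defaultdict


def fit_markov(sequence, order=1):
    """Maximum-likelihood estimate of p(x_t | previous `order` symbols).

    Returns {context tuple: {next symbol: probability}}.
    """
    counts = defaultdict(Counter)
    for i in range(order, len(sequence)):
        context = tuple(sequence[i - order:i])
        counts[context][sequence[i]] += 1
    return {
        ctx: {sym: c / sum(ctr.values()) for sym, c in ctr.items()}
        for ctx, ctr in counts.items()
    }
```

On the sequence `"AABAAB"`, the first-order model is uncertain after `A` (either `A` or `B` follows with probability 0.5), whereas the second-order model predicts `B` after `AA` with probability 1.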
Latent Markov Models
(Hidden Markov Models)
Filtering / Tracking
• We want to track the unknown state x of a
system as it evolves over time based on the
(noisy) observations y that arrive sequentially.
[Figure: state-space model. Hidden states $x_{t-1} \to x_t \to x_{t+1}$ evolve via the Transition $p(x_t \mid x_{t-1})$; each state emits an Observation $y_t$ via $p(y_t \mid x_t)$.]
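For a discrete state, tracking amounts to alternating a prediction through the transition model with a correction by the observation model. Below is a minimal sketch of this forward filtering recursion; the function name and the two-state model in the usage note are illustrative assumptions.

```python
def hmm_filter(prior, transition, emission, observations):
    """Forward filtering: returns p(x_t | y_1..y_t) for each time step t.

    prior[i]         = p(x_1 = i)
    transition[i][j] = p(x_t = j | x_{t-1} = i)
    emission[j][y]   = p(y_t = y | x_t = j)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for t, y in enumerate(observations):
        if t > 0:
            # Predict: push the previous belief through the transition model.
            belief = [sum(belief[i] * transition[i][j] for i in range(n))
                      for j in range(n)]
        # Correct: weight by the observation likelihood, then normalize.
        belief = [belief[j] * emission[j][y] for j in range(n)]
        z = sum(belief)
        belief = [b / z for b in belief]
        history.append(belief)
    return history
```

For instance, with `prior = [0.5, 0.5]`, `transition = [[0.9, 0.1], [0.1, 0.9]]`, and `emission = [[0.8, 0.2], [0.2, 0.8]]`, observing `[0, 0, 1]` first sharpens the belief in state 0 and then pulls it back toward state 1 as the conflicting observation arrives.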
Linear Dynamical Systems
(Kalman Filters)
Kalman Filter
• Process to be estimated:
$y_k = A y_{k-1} + B u_k + w_{k-1}$, with process noise $w$ of covariance $Q$
$z_k = H y_k + v_k$, with measurement noise $v$ of covariance $R$
• Kalman Filter
Predicted: $\hat{y}^-_k$ is the estimate based on measurements at previous time steps
$\hat{y}^-_k = A \hat{y}_{k-1} + B u_k$
$P^-_k = A P_{k-1} A^T + Q$
Corrected: $\hat{y}_k$ has additional information, namely the measurement at time $k$
$K = P^-_k H^T (H P^-_k H^T + R)^{-1}$
$\hat{y}_k = \hat{y}^-_k + K (z_k - H \hat{y}^-_k)$
$P_k = (I - KH) P^-_k$
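The predict and correct steps above can be sketched for the scalar case, where every matrix is a single number and the transpose and inverse become ordinary multiplication and division. The function name and the default parameter values are assumptions for illustration only.

```python
def kalman_step(y_hat, P, z, u=0.0, A=1.0, B=0.0, H=1.0, Q=1e-4, R=0.1):
    """One predict/correct cycle of a scalar Kalman filter.

    y_hat, P: previous state estimate and its variance.
    z: new measurement; u: control input.
    Returns (new estimate, new variance, Kalman gain).
    """
    # Predict from the previous estimate.
    y_pred = A * y_hat + B * u
    P_pred = A * P * A + Q
    # Correct with the measurement at this time step.
    K = P_pred * H / (H * P_pred * H + R)
    y_new = y_pred + K * (z - H * y_pred)
    P_new = (1.0 - K * H) * P_pred
    return y_new, P_new, K
```

With `y_hat=0.0`, `P=1.0`, `z=1.0`, `Q=0.0`, and `R=1.0`, prediction and measurement are equally uncertain, so the gain is 0.5 and the corrected estimate lands halfway between them.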
Filtering
Discrete x
Continuous x
[Barber et al., 2011]
Smoothing
Parallel Smoothing
Sequential Smoothing
Discrete x
Continuous x
[Barber et al., 2011]
Prediction
Interpolation
Most-likely latent trajectory
[Barber et al., 2011]
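For a discrete latent state, the most-likely latent trajectory is commonly computed with the Viterbi algorithm. The sketch below is one such implementation under assumed names and a toy two-state model; the slide itself does not specify which algorithm is used. It multiplies raw probabilities, so long sequences would need log-probabilities to avoid underflow.

```python
def viterbi(prior, transition, emission, observations):
    """Most likely state sequence argmax_x p(x_1..x_T | y_1..y_T).

    prior[i]         = p(x_1 = i)
    transition[i][j] = p(x_t = j | x_{t-1} = i)
    emission[j][y]   = p(y_t = y | x_t = j)
    """
    n = len(prior)
    score = [prior[j] * emission[j][observations[0]] for j in range(n)]
    back = []  # back[t][j] = best predecessor of state j at step t + 1
    for y in observations[1:]:
        prev = score
        pointers, score = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: prev[i] * transition[i][j])
            pointers.append(best_i)
            score.append(prev[best_i] * transition[best_i][j] * emission[j][y])
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(range(n), key=lambda j: score[j])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]
```

Unlike filtering, which reports a belief at each time step, this returns a single jointly most probable sequence of states.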
Sequential Importance Sampling
Choosing the proposal distribution:
Optimal choice (minimum variance)
Bootstrap filter
[Barber et al., 2011]
Sequential Importance Resampling
or Particle Filter
[Barber et al., 2011]
Example: PF with N=4
[Barber et al., 2011]
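A bootstrap particle filter uses the transition prior as its proposal, weights each particle by the observation likelihood, and then resamples. The sketch below assumes a 1-D random-walk state with Gaussian observation noise; the model, the parameter names `q_std` and `r_std`, and the multinomial resampling scheme are illustrative choices, not details from the slides.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def bootstrap_filter(observations, n_particles=200, q_std=0.5, r_std=1.0, seed=0):
    """Sequential importance resampling for x_t = x_{t-1} + noise, y_t = x_t + noise.

    Returns the posterior-mean estimate of x_t after each observation.
    """
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    estimates = []
    for y in observations:
        # Propose: sample each particle from the transition prior.
        particles = [x + rng.gauss(0.0, q_std) for x in particles]
        # Weight: importance weight is the observation likelihood p(y | x).
        weights = [gaussian_pdf(y, x, r_std) for x in particles]
        total = sum(weights)
        weights = [w / total for w in weights]
        estimates.append(sum(w * x for w, x in zip(weights, particles)))
        # Resample: draw particles with probability proportional to weight.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates
```

Feeding a constant observation, e.g. `bootstrap_filter([1.0] * 20)`, should move the estimate from the prior mean near 0 toward 1. Observations far outside the particle cloud can drive all weights to zero, which this sketch does not guard against.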
Course Overview
Action-Perception-Learning Cycles
Course Description
How can the brain learn so fast, flexibly, and robustly? What representational
mechanisms and organizational principles does the brain use? How can we
apply these principles to constructing intelligent cognitive machines that learn
like humans? To address these questions, it is important to observe that the
brain is embodied with sensors and actuators, and interacts with its
environment in a continuous perception-action cycle. Living in a dynamic
environment under uncertainty requires the brain to learn moment by moment
in real time and incrementally in this continuous, rapid perception-action cycle.
In this course we review recent experimental and theoretical work on
perception-action cycles and neural coding principles in the brain. We also
study mathematical tools developed in information theory, control theory, and
Bayesian statistics that may be useful to model the biological information
processing in the brain. The goal is to develop computational models of
sequential learning processes, i.e. action-perception-learning cycle machines,
that enable rapid, continuous, and reliable action and decision-making in a
changing environment over an extended period of time, or even a lifetime.
Plan
• Part I: Neurocognitive Models
– Cortical Models
– Language Models
– Thermodynamic Models
– Free Energy Models
– Decision-Theoretic Models
– Information-Theoretic Models
– Exam 1: Thursday, Oct. 18, 2012
• Part II: Computational Models
– Markov Models
– Dynamical Systems
– Kalman Filters
– Probabilistic Population Codes
– Particle Filters
– Exam 2: Thursday, Nov. 29, 2012