Ali Ghadirzadeh, Atsuto Maki, Mårten Björkman Presented by Jen-Fang Chang

Sept 28 - Oct 2, 2015, Hamburg, Germany
Outline
 Introduction
 Proposed method
 Experiment results
 Conclusion and Future work
Introduction
 Sensorimotor contingencies (SMC) based method
 Sensory awareness is the result of integrating sensorimotor
coupling into the planning system.
 Being grounded directly in the environment gives the flexibility to
design self-learning systems.
Visual servoing
 Visual servoing is an approach to controlling a robot's motion
using visual feedback.
 Unlike traditional system design methods, the SMC-based
approach does not require calibrating the robot.
 The task is grounded in the environment, enabling
self-learning without intervention.
Main contribution
 Eliminate the need for prior knowledge of kinematic or
dynamic models of the robot
 Use a forward model to search for proper actions to solve
tasks by minimizing a cost function, instead of training a
separate inverse model, to speed up training
 Encode the 3D spatial position of a target object to avoid
calibration against an external coordinate system
Related Works
 Forward model with Distal Supervised learning (DSL)
 Combined forward-inverse model learning
 Affordance theory learning
 Reinforcement learning
 PILCO
Proposed method
 Training a forward model
 Finding inverse output
 Gaussian Process regression
 Kinematic system
 Visuomotor tasks
Training a forward model
General form of the forward model:
Training flowchart (boxes recovered from the figure):
 START: initially train the forward model with randomly generated action-observation pairs
 Get current state S_t
 Find inverse output: find ∆J_t using the cost function, the predicted new state S_{t+1} and the target S*
 Get new state S_{t+1} by executing ∆J_t
 Decision: target state reached? (Yes / No), with threshold ζ, the distance between S_t and S*
 Decision: (S_{t+1}, S_{t+1}) > ζ (Yes), next to the Gaussian Process regression box
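The loop in the flowchart can be sketched on a toy problem. This is only an illustration under stated assumptions: the 2-D linear system `A_TRUE`, the helper names and the clipping limit are hypothetical, and a linear least-squares fit stands in for the Gaussian Process regression used on the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the robot: the learner never reads A_TRUE directly.
A_TRUE = np.array([[0.8, 0.2], [-0.1, 0.9]])

def execute(state, action):
    """Execute an action on the toy system and observe the new state."""
    return state + A_TRUE @ action

def fit_forward_model(actions, deltas):
    """Least-squares fit so that (new state - state) ~= actions @ W."""
    W, *_ = np.linalg.lstsq(actions, deltas, rcond=None)
    return W

def find_action(W, state, target, max_step=0.5):
    """Least-squares action toward the target, clipped per joint."""
    action, *_ = np.linalg.lstsq(W.T, target - state, rcond=None)
    return np.clip(action, -max_step, max_step)

# Initial training with randomly generated action-observation pairs.
actions = rng.uniform(-0.5, 0.5, size=(5, 2))
deltas = np.array([execute(np.zeros(2), a) for a in actions])
W = fit_forward_model(actions, deltas)

target, threshold = np.array([1.0, -0.5]), 1e-3
state = np.zeros(2)
for step in range(50):
    if np.linalg.norm(state - target) <= threshold:  # target state reached?
        break
    action = find_action(W, state, target)           # find inverse output
    new_state = execute(state, action)               # execute the action
    # Add the new observation and refit the forward model.
    actions = np.vstack([actions, action])
    deltas = np.vstack([deltas, new_state - state])
    W = fit_forward_model(actions, deltas)
    state = new_state
```

Because the toy system is linear, a handful of random pairs already gives an accurate model; a real robot would need the non-parametric model described later in the slides.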
Inverse output
 The cost function is defined as
(1)
 To find the optimum motor command that minimizes the cost
function, we use a gradient-based method:
(2)
 Equation (2) can be written as:
(3)
 Applying gradient descent repeatedly yields an optimum
unbounded value:
(4)
 Insert the optimum value into
(5)
to obtain the motor commands that optimize the cost
function
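Since equations (1) to (5) did not survive on these slides, the following is only a sketch of the general idea: gradient descent on an unbounded variable, which is then mapped to a bounded motor command. The squared-distance cost, the `tanh` squashing, the numerical gradient and the toy `forward_model` are all assumptions made for illustration, not the formulas from the talk.

```python
import numpy as np

def forward_model(state, action):
    """Hypothetical learned forward model: predicts the next state."""
    B = np.array([[0.8, 0.2], [-0.1, 0.9]])
    return state + B @ action

def cost(u, state, target, max_step):
    """Assumed cost: squared distance between predicted state and target."""
    action = max_step * np.tanh(u)          # assumed bounding of the command
    predicted = forward_model(state, action)
    return float(np.sum((target - predicted) ** 2))

def numerical_gradient(f, u, eps=1e-6):
    """Central-difference gradient of f at u."""
    grad = np.zeros_like(u)
    for i in range(u.size):
        step = np.zeros_like(u)
        step[i] = eps
        grad[i] = (f(u + step) - f(u - step)) / (2 * eps)
    return grad

def inverse_output(state, target, max_step=1.0, lr=0.5, iters=500):
    """Search for a motor command by gradient descent on the unbounded u."""
    u = np.zeros(2)
    for _ in range(iters):
        u -= lr * numerical_gradient(lambda v: cost(v, state, target, max_step), u)
    return max_step * np.tanh(u)            # bounded motor command

state = np.zeros(2)
target = np.array([0.3, -0.2])
action = inverse_output(state, target)
```

No separate inverse model is trained here: the forward model alone is queried repeatedly until the predicted state matches the target.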
Gaussian Process regression
 A non-parametric model where the representation is
given by the training samples.
 The key factor in successfully learning the forward models
in real time.
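A minimal NumPy sketch of Gaussian Process regression used as a forward model is given below. The RBF kernel, its hyperparameters, the class name `GPForwardModel` and the toy data are assumptions for illustration; the slides do not state which kernel or settings were used.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return signal_var * np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

class GPForwardModel:
    """The model is represented directly by its stored training samples."""

    def __init__(self, noise_var=1e-4):
        self.noise_var = noise_var

    def fit(self, X, Y):
        self.X = X
        K = rbf_kernel(X, X) + self.noise_var * np.eye(len(X))
        self.L = np.linalg.cholesky(K)
        self.alpha = np.linalg.solve(self.L.T, np.linalg.solve(self.L, Y))
        return self

    def predict(self, X_new):
        """Posterior mean and variance at the rows of X_new."""
        K_s = rbf_kernel(X_new, self.X)
        mean = K_s @ self.alpha
        v = np.linalg.solve(self.L, K_s.T)
        var = rbf_kernel(X_new, X_new).diagonal() - np.sum(v**2, axis=0)
        return mean, var

# Toy data: each input row is (state, action), the output is the next state.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(30, 2))
Y = X[:, :1] + 0.5 * np.sin(X[:, 1:])
model = GPForwardModel().fit(X, Y)
mean, var = model.predict(X)
```

The predictive variance grows away from the training samples, which indicates where the model has not yet been explored.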
Kinematic system
 The camera joints consist of Jnp (neck pan), Jnt (neck tilt)
and Jv (vergence angle)
 The robot arm joints consist of Jsp (shoulder pan),
Jsl (shoulder lift), Jar (arm roll) and Je (elbow)
 The control architecture is hierarchical and can be regarded as
a look-and-move architecture
 A position is defined by the camera joint values that would be
required for the robot to fixate onto the object
Visuomotor tasks
 Fixation task
 Fixate on an object so that it is observed with
maximum overlap between the two camera views
 The process serves as a means to probe the object's 3D position
 Reaching task
 Use the 3D position information to reduce the
distance between the end-effector and the target
Forward model structures (figures):
 Fixation forward model structure (slide note: Jv missing)
 Reaching forward model structure
 Image features shown: y = (y_l + y_r)/2 and disparity d = x_r − x_l
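The two image features can be computed as follows. The function name and the (x, y) pixel-coordinate convention for the target in the left and right images are assumptions for illustration.

```python
def fixation_features(left_xy, right_xy):
    """Mean vertical position and horizontal disparity of a target
    seen at left_xy in the left image and right_xy in the right image."""
    x_l, y_l = left_xy
    x_r, y_r = right_xy
    y = (y_l + y_r) / 2   # y = (y_l + y_r) / 2
    d = x_r - x_l         # disparity d = x_r - x_l
    return y, d

y, d = fixation_features((10.0, 4.0), (6.0, 6.0))
```

With these inputs the result is y = 5.0 and d = -4.0.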
Experiment results
 Model training
 Learning performance for the
training and test phases of the
fixation task, implemented on the
PR2 robot.
Learning performance for the reaching task, (a) with the same target used
for training and test, and (b) for two new targets.
Cost function evaluation
3D position encoding
 The Euclidean distance between
two points in joint space, plotted
against their 3D distance in metric
space
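To make the comparison concrete, the sketch below maps camera joint values to a 3D fixation point and compares the two distances. The idealized symmetric-vergence geometry, the `BASELINE` value and the function name are assumptions for illustration and are not taken from the talk.

```python
import numpy as np

BASELINE = 0.1  # assumed distance between the two cameras, in metres

def fixation_point(pan, tilt, vergence, baseline=BASELINE):
    """3D fixation point for an idealized head with symmetric vergence."""
    r = (baseline / 2) / np.tan(vergence / 2)   # distance to the fixated point
    return r * np.array([np.cos(tilt) * np.cos(pan),
                         np.cos(tilt) * np.sin(pan),
                         np.sin(tilt)])

# Two positions encoded as (Jnp, Jnt, Jv) in radians.
a = np.array([0.0, 0.0, 0.2])
b = np.array([0.1, 0.0, 0.2])
joint_dist = np.linalg.norm(a - b)
metric_dist = np.linalg.norm(fixation_point(*a) - fixation_point(*b))
```

Under this geometry, nearby points in joint space correspond to nearby points in metric space, which is the property the plot on this slide examines.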
Tolerance against image distortion
The cost functions are averaged over
10 different trials performed
in the simulation environment.
Images with no distortion, moderate distortion (λ = 0.9)
and considerable distortion (λ = 0.7)
Conclusion
 The most important features of the proposed method
are real-time learning and the fact that the robot
needs only a few interactions with the
environment.
 One key factor in speeding up training is that we
apply the forward model to search for the motor
commands by minimizing a given cost, instead of
training a separate inverse model.
Future work
 Control a robot arm while avoiding obstacles
 Utilize other regression models for different events
 https://www.youtube.com/watch?v=qxDPU-nS0xo
 https://www.youtube.com/watch?v=9JQOcg3Jcqs