Ali Ghadirzadeh, Atsuto Maki, Mårten Björkman
Sept 28 - Oct 2, 2015, Hamburg, Germany
Presented by Jen-Fang Chang

Outline
- Introduction
- Proposed method
- Experiment results
- Conclusion and future work

Introduction
Sensorimotor contingencies (SMC) based method
- Sensory awareness is the result of integrating sensorimotor coupling into the planning system.
- Because the method is grounded directly in the environment, it gives the flexibility to design self-learning systems.

Visual servoing
- Visual servoing is an approach to controlling a robot's motion using visual feedback.
- Unlike traditional system design methods, the SMC-based approach does not require calibrating the robot.
- The task is grounded in the environment, enabling self-learning without intervention.

Main contribution
- Eliminate the need for prior knowledge of the robot's kinematic or dynamic models.
- Use the forward model to search for proper actions that solve a task by minimizing a cost function, instead of training a separate inverse model, which speeds up training.
- Encode the 3D spatial position of a target object so that calibration against an external coordinate system is avoided.

Related work
- Forward model with distal supervised learning (DSL)
- Combined forward-inverse model learning
- Affordance theory learning
- Reinforcement learning
- PILCO

Proposed method
- Training a forward model
- Finding inverse output
- Gaussian Process regression
- Kinematic system
- Visuomotor tasks

Training a forward model
- General form of the forward model: (equation shown on the slide)
- Flowchart of the training loop, box labels in reading order:
  - START: initially train the forward model with randomly generated action-observation pairs
  - Gaussian Process regression
  - Get current state S_t
  - Target state reached? (Yes / No), using the threshold Ϛ on the distance between S_t and the target S*
  - Find inverse output: find ∆J_t from the cost function
  - Predict new state S_{t+1}
  - Get new state S_{t+1} by executing ∆J_t
  - Check (S_{t+1}, S_{t+1}) > Ϛ (Yes / No)

Inverse output
- The cost function is defined as (1).
- To find the optimum motor command that minimizes the cost function, we use a gradient-based method: (2)
- Equation (2) can be written as: (3)
- Applying gradient descent repeatedly gives an optimum unbounded value: (4)
- Inserting the optimum value into (5) gives the motor commands that optimize the cost function.

Gaussian Process regression
- A non-parametric model whose representation is given by the training samples.
- The key factor in successfully learning the forward models in real time.

Kinematic system
- The camera joints consist of Jnp (neck pan), Jnt (neck tilt) and Jv (vergence angle).
- The robot arm joints consist of Jsp (shoulder pan), Jsl (shoulder lift), Jar (arm roll) and Je (elbow).
- The control architecture is hierarchical and can be regarded as a look-and-move architecture.
- A position is defined by the camera joint values that would be required for the robot to fixate on the object.

Visuomotor tasks
Fixation task
- Fixate on an object so that it is observed with the maximum overlap between the two camera views.
- The process serves as a means to probe the 3D position.
Reaching task
- Use the 3D position information to reduce the distance between the end-effector and the target.

Forward model structures
- Figures: fixation forward model structure; reaching forward model structure
- y = (y_l + y_r)/2
- Disparity d = x_r - x_l
- Jv missing

Experiment results
Model training
- Learning performance for the training and test phases of the fixation task, implemented on the PR2 robot.
- Learning performance for the reaching task: (a) with the same target used for training and test, and (b) for two new targets.
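Sketch: forward-model loop in code
The loop from the "Training a forward model" and "Inverse output" slides can be illustrated with a small plain-Python sketch. It is not the authors' implementation, and every name in it is hypothetical. Assumptions: the plant() function is a made-up two-joint response standing in for the robot; the state change depends only on the motor command; the forward model is the posterior mean of a Gaussian Process with a fixed squared-exponential kernel; the bounding step that the slides refer to as equation (5) is replaced by a simple clip; the kernel length, noise, learning rate and threshold are arbitrary choices.

```python
import math
import random

def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two vectors."""
    return math.exp(-0.5 * sum((p - q) ** 2 for p, q in zip(a, b)) / length ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

class GPForwardModel:
    """GP posterior mean: motor command dJ -> predicted state change dS."""
    def __init__(self, length=1.0, noise=1e-4):
        self.length, self.noise = length, noise  # assumed hyperparameters
        self.X, self.alpha = [], []

    def fit(self, X, Y):
        self.X = [list(x) for x in X]
        n = len(self.X)
        K = [[rbf(self.X[i], self.X[j], self.length) + (self.noise if i == j else 0.0)
              for j in range(n)] for i in range(n)]
        # One weight vector per output dimension: alpha_d = K^-1 y_d.
        self.alpha = [solve(K, [y[d] for y in Y]) for d in range(len(Y[0]))]

    def predict(self, x):
        k = [rbf(x, xi, self.length) for xi in self.X]
        return [sum(a * ki for a, ki in zip(al, k)) for al in self.alpha]

    def jacobian(self, x):
        """Analytic d(mean)/dx for the RBF kernel, indexed [output][input]."""
        k = [rbf(x, xi, self.length) for xi in self.X]
        return [[sum(a * ki * (xi[j] - x[j]) for a, ki, xi in zip(al, k, self.X))
                 / self.length ** 2 for j in range(len(x))] for al in self.alpha]

def find_command(model, state, target, steps=200, lr=0.5, limit=1.0):
    """Gradient descent on ||S_t + f(dJ) - S*||^2 with respect to dJ."""
    dj = [0.0] * len(model.X[0])
    for _ in range(steps):
        err = [s + p - t for s, p, t in zip(state, model.predict(dj), target)]
        jac = model.jacobian(dj)
        grad = [2.0 * sum(err[d] * jac[d][j] for d in range(len(err)))
                for j in range(len(dj))]
        dj = [v - lr * g for v, g in zip(dj, grad)]
    # Assumed bounding step: a plain clip, not the slides' equation (5).
    return [max(-limit, min(limit, v)) for v in dj]

def plant(dj):
    """Hypothetical robot response to a command; unknown to the learner."""
    return [0.8 * dj[0] + 0.1 * math.sin(dj[1]), 0.6 * dj[1] - 0.1 * dj[0] ** 2]

def run_episode(seed=0, target=(0.5, -0.3), threshold=0.02, max_iters=30):
    rng = random.Random(seed)
    # Initial training on randomly generated action-observation pairs.
    X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(25)]
    Y = [plant(x) for x in X]
    model = GPForwardModel()
    model.fit(X, Y)
    state = [0.0, 0.0]
    for _ in range(max_iters):
        if math.dist(state, target) <= threshold:  # target state reached?
            break
        dj = find_command(model, state, target)    # inverse output
        predicted = [s + p for s, p in zip(state, model.predict(dj))]
        delta = plant(dj)                          # execute the command
        state = [s + d for s, d in zip(state, delta)]
        if math.dist(predicted, state) > threshold:
            # Prediction was poor: add the new pair and refit the GP.
            X.append(dj)
            Y.append(delta)
            model.fit(X, Y)
    return state, math.dist(state, target)

if __name__ == "__main__":
    print(run_episode())
```

Calling run_episode() returns the final state and its distance to the target. The point of the design is that no inverse model is trained: the command comes from differentiating the forward model's prediction, and new samples are added only when a prediction misses by more than the threshold.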
Cost function evaluation

3D position encoding
- The Euclidean distance between two points in joint space plotted against their 3D distance in metric space.

Tolerance against image distortion
- The cost functions are given as the average of 10 different trials, performed in the simulation environment.
- Images with no distortion, moderate distortion (λ = 0.9) and considerable distortion (λ = 0.7).

Conclusion
- The most important features of the proposed method are real-time learning and the fact that the robot needs only a few interactions with the environment.
- One key factor in speeding up training is that we applied the forward model to search for the motor commands by minimizing a given cost, instead of training a separate inverse model.

Future work
- Control a robot arm while avoiding obstacles.
- Utilize other regression models for different events.

https://www.youtube.com/watch?v=qxDPU-nS0xo
https://www.youtube.com/watch?v=9JQOcg3Jcqs