Development of intelligent systems
(RInS)
Introductory information
Danijel Skočaj
University of Ljubljana
Faculty of Computer and Information Science
Academic year: 2021/22
Development of intelligent systems
About the course
 Development of Intelligent Systems
 University study program Computer and Information Science, 3rd year
 3 hours of lectures and 2 hours of tutorials (practice classes) weekly
 6 ECTS credits
 Lectures on Fridays 8:15 – 11:00 (in P22)
 (Online using MS Teams)
 Tutorials, 4 groups:
 Mondays 8:15 - 10:00 and 11:15-13:00 (in R2.38)
 Tuesdays 10:15 – 12:00 and 12:15 – 14:00 in (R2.38)
 Course home page:
https://ucilnica.fri.uni-lj.si/course/view.php?id=69
Development of intelligent systems, Introductory information
2
About the course
 Module Artificial intelligence
Development of intelligent systems, Introductory information
3
Lecturer
 Prof. dr. Danijel Skočaj






Visual Cognitive Systems Laboratory
e-mail: danijel.skocaj@fri.uni-lj.si
url: http://www.vicos.si/danijels
tel: 01 479 8225
room: second floor, room R2.57
office hours: Thursday, 13:00-14:00
 Teaching assistants:
Matej Dobrevski
 e-mail: matej.dobrevski@fri.uni-lj.si
 tel: 01 479 8245
 room: second floor, room R2.37
Assist. prof. Luka Čehovin Zajc
 e-mail: luka.cehovin@fri.uni-lj.si
 tel: 01 479 8252
 room: second floor, room R2.35
Development of intelligent systems, Introductory information
4
Goal of the course
The course aims at teaching the students to
develop an intelligent system by integrating
techniques from artificial intelligence and
machine perception. Students will learn how
to design an intelligent system, how to select
which tools and methods to use, and how to
implement new components and integrate
them into a functional system.
Development of intelligent systems, Introductory information
5
Course goals
 To learn about intelligent robot systems




requirements
methodologies
applications
middleware
 Development of an intelligent robot system







design
architecture
use of appropriate components
development of new components
integration
robot programming
testing, debugging
 Extremely practically oriented course!
Development of intelligent systems, Introductory information
6
Development platform
 Robot platform: iRobot Roomba 531 + TurtleBot + Kinect
 Software platform: ROS, Robot Operating System
Development of intelligent systems, Introductory information
7
TurtleBot++
Development of intelligent systems, Introductory information
8
Herd
Development of intelligent systems, Introductory information
9
Robot platform
Development of intelligent systems, Introductory information
10
Diploma theses
G. Pušnik
D. Tratnik
A. Rezelj
Development of intelligent systems, Introductory information
J. Bizjak
11
RInS 2012
 Slalom
Development of intelligent systems, Introductory information
12
RInS 2013
 Object search
Development of intelligent systems, Introductory information
13
RInS 2014
 Mini Cluedo
Development of intelligent systems, Introductory information
14
RInS 2015
 DeliveryBot
Development of intelligent systems, Introductory information
15
RInS 2016
 TaxiBot
Development of intelligent systems, Introductory information
16
RInS 2017
 Robot of the rings
Development of intelligent systems, Introductory information
17
RInS 2018
 CryptoBot
Development of intelligent systems, Introductory information
18
RInS 2019
 TreasureHunt
Development of intelligent systems, Introductory information
19
RInS 2020
Development of intelligent systems, Introductory information
20
2021
StopCorona
Development of intelligent systems, Introductory information
21
2022 Final task
FooDeRo: Food Delivery Robot
Development of intelligent systems, Introductory information
22
2022 Final task
 Setup:




„Small city“ scene (fenced area).
Several persons (faces) in the scene.
Four „restaurants“ (cylinders) of different colours.
Four parking slots marked with rings of different sizes and colours.
 Goal:
 Deliver the (virtual) food to the persons.
 Task:








Find all persons in the city.
Find all the restaurants and check which food they serve.
Park in the starting parking slot.
Accept orders from the web application (who orders what).
Collect the food and (virtually) bring it to the corresponding persons.
Talk to them if they need anything else.
Fulfil their requests.
Park in the end parking slot.
Development of intelligent systems, Introductory information
23
2022 Final task
Development of intelligent systems, Introductory information
24
2022 Final task
Development of intelligent systems, Introductory information
25
Intermediate tasks
 Task 1: Autonomous navigation and human search
 The robot should autonomously navigate around the
competition area
 It should search and detect faces
 It should approach every detected face
 It should be completed in simulator
AND with a real robot in real world
 Task 2: Parking





Detect rings
Recognize and say the colour of the rings
Approach the green ring
Detect the marked parking place below the ring
Park in the marked parking place
Development of intelligent systems, Introductory information
26
Competencies to be developed
 System setup
 Running ROS
 Tele-operating TurtleBot
 Autonomous navigation







Autonomous control of the mobile platform
Acquiring images and 3D information
Simultaneous mapping and localization (SLAM)
Path planning, obstacle avoidance
Advanced fine manoeuvring
Basic mobile manipulation
Intelligent navigation and exploration of space
 Advanced perception and cognitive capabilities




Detection of faces, circles, 3D rings, 3D cylinders
Recognition of faces, food, digits, colour
Speech synthesis, speech recognition, dialogue processing (reading QR codes)
Belief maintenance, reasoning, planning
Development of intelligent systems, Introductory information
27
Autonomous navigation
 Autonomous control of the mobile platform
 components for controlling the robot
 Acquiring images and 3D information
 using Kinect
 OpenCV for processing images
 Point Cloud Library for processing 3D information
 Simultaneous mapping and localization (SLAM)
 building the map of the environment, navigation using the map
 transformations between coordinate frames
 Path planning, obstacle avoidance
 setting the goals, approaching specific local goals
 detecting and avoiding obstacles
 Advanced maneuvering
 precise maneuvering
 Intelligent navigation and exploration of space

autonomous exploration, setting the goals (a small code sketch follows this list)
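To make the "transformations between coordinate frames" bullet more concrete, here is a minimal roscpp sketch (not from the slides; the map/base_link frame names are the usual navigation-stack convention) that looks up the robot pose in the map frame using tf2:

#include <ros/ros.h>
#include <geometry_msgs/TransformStamped.h>
#include <tf2_ros/transform_listener.h>

int main(int argc, char** argv)
{
  ros::init(argc, argv, "pose_lookup");
  ros::NodeHandle nh;

  // The buffer stores recent transforms; the listener fills it from /tf.
  tf2_ros::Buffer tf_buffer;
  tf2_ros::TransformListener tf_listener(tf_buffer);

  ros::Rate rate(1.0);
  while (ros::ok())
  {
    try
    {
      // Pose of the robot base in the map frame (latest available transform).
      geometry_msgs::TransformStamped t =
          tf_buffer.lookupTransform("map", "base_link", ros::Time(0));
      ROS_INFO("Robot at x=%.2f y=%.2f",
               t.transform.translation.x, t.transform.translation.y);
    }
    catch (tf2::TransformException& ex)
    {
      ROS_WARN("%s", ex.what());
    }
    rate.sleep();
  }
  return 0;
}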
Development of intelligent systems, Introductory information
28
Perception and recognition
 Face detection and recognition
 Circle detection
 Detection of restaurants
 detection of 3D cylinders
 Detection of rings
 localization of rings in 3D space
 Food recognition
 Colour learning and recognition
 Circles, cylinders, rings
 (QR code reading)
 Dialogue processing
 Speech synthesis
 Speech recognition
 Speech understanding
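As an illustration of the "Circle detection" bullet above (a rough sketch, not the reference solution of the course), the classical Hough transform in OpenCV can be applied roughly like this; the file name is only an example:

#include <opencv2/imgproc.hpp>
#include <opencv2/imgcodecs.hpp>
#include <cstdio>
#include <vector>

int main()
{
  // Load a grayscale camera image (the file name is illustrative).
  cv::Mat gray = cv::imread("frame.png", cv::IMREAD_GRAYSCALE);
  cv::medianBlur(gray, gray, 5);  // suppress noise before the Hough transform

  std::vector<cv::Vec3f> circles;  // each circle: (x, y, radius)
  cv::HoughCircles(gray, circles, cv::HOUGH_GRADIENT,
                   1,              // accumulator resolution (same as image)
                   gray.rows / 8,  // minimum distance between circle centres
                   100, 30,        // Canny high threshold, accumulator threshold
                   10, 100);       // min and max radius in pixels

  std::printf("Detected %zu circles\n", circles.size());
  return 0;
}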
Development of intelligent systems, Introductory information
29
Advanced perception and cognition
 Belief maintenance, reasoning, planning





anchoring the detected objects to the map
creating and updating the beliefs
reasoning using the beliefs
planning for information gathering
What to do next?
 Intelligent navigation




considering the map
optimize the exploration of space
minimize the distance travelled
Where to go next?
 Visual servoing
 move the mobile camera to optimise perception
 visual servoing while parking
Development of intelligent systems, Introductory information
30
Challenges
 Robot control (ROS)
 „engineering“ issues
 robot system (actuators, sensors,…), real world
 Selection of appropriate components for solving subproblems
 many of them are given, many of them are available in ROS
 Development of new components
 implementing algorithms for solving new problems
 Integration
 integrating very different components
 „Estimate the time needed for integration, multiply it by 3, but
you have still probably underestimated the time actually needed.“
 Difficult debugging; visualize, log!
 Very heterogeneous distributed system
 mobile robotics, navigation, manipulation
 computer vision, machine learning
 reasoning, planning
Development of intelligent systems, Introductory information
31
Competencies to be developed
 System setup
 Running ROS
 Tele-operating TurtleBot
 Autonomous navigation






Autonomous control of the mobile platform
Acquiring images and 3D information
Simultaneous mapping and localization (SLAM)
Path planning, obstacle avoidance, approaching
Advanced fine manoeuvring and parking
Intelligent navigation and exploration of space
 Advanced perception and cognitive capabilities





Detection of faces, circles, 3D rings, 3D cylinders
Recognition of faces, food, digits, colour
Basic manipulation and visual servoing
Speech synthesis, speech recognition, dialogue processing (reading QR codes)
Belief maintenance, reasoning, planning
Development of intelligent systems, Introductory information
32
Types of challenges
 System setup
 Running ROS
 Tele-operating TurtleBot
 Autonomous navigation






Autonomous control of the mobile platform
Acquiring images and 3D information
Simultaneous mapping and localization (SLAM)
Path planning, obstacle avoidance, approaching
Advanced fine manoeuvring and parking
Intelligent navigation and exploration of space
engineering
issues
integration of
components
development
of components
 Advanced perception and cognitive capabilities





Detection of faces, circles, 3D rings, 3D cylinders
Recognition of faces, food, digits, colour
Basic manipulation and visual servoing
Speech synthesis, speech recognition, dialogue processing (reading QR codes)
Belief maintenance, reasoning, planning
Development of intelligent systems, Introductory information
33
Research areas
 System setup
 Running ROS
 Tele-operating TurtleBot
 Autonomous navigation






Mobile robotics
Computer vision, ML
Dialogue processing, AI
Autonomous control of the mobile platform
Acquiring images and 3D information
Simultaneous mapping and localization (SLAM)
Path planning, obstacle avoidance, approaching
Advanced fine manoeuvring and parking
Intelligent navigation and exploration of space
 Advanced perception and cognitive capabilities





Detection of faces, circles, 3D rings, 3D cylinders
Recognition of faces, food, digits, colour
Basic manipulation and visual servoing
Speech synthesis, speech recognition, dialogue processing (reading QR codes)
Belief maintenance, reasoning, planning
Development of intelligent systems, Introductory information
34
Requirements
 System setup
 Running ROS
 Tele-operating TurtleBot
 Autonomous navigation






For 6
For + max. 2
For + max. 2
Autonomous control of the mobile platform
Acquiring images and 3D information
Simultaneous mapping and localization (SLAM)
Path planning, obstacle avoidance, approaching
Advanced fine manoeuvring and parking
Intelligent navigation and exploration of space
 Advanced perception and cognitive capabilities





Detection of faces, circles, 3D rings, 3D cylinders
Recognition of faces, food, digits, colour
Basic manipulation and visual servoing
Speech synthesis, speech recognition, dialogue processing (reading QR codes)
Belief maintenance, reasoning, planning
Development of intelligent systems, Introductory information
35
Tasks
 System setup
 Running ROS
 Tele-operating TurtleBot
 Autonomous navigation






Task 1
Task 2
Task 3
Autonomous control of the mobile platform
Acquiring images and 3D information
Simultaneous mapping and localization (SLAM)
Path planning, obstacle avoidance, approaching
Advanced fine manoeuvring and parking
Intelligent navigation and exploration of space
 Advanced perception and cognitive capabilities





Detection of faces, circles, 3D rings, 3D cylinders
Recognition of faces, food, digits, colour
Basic manipulation and visual servoing
Speech synthesis, speech recognition, dialogue processing (reading QR codes)
Belief maintenance, reasoning, planning
Development of intelligent systems, Introductory information
36
Simulation vs. real-world robot
Majority of the tasks to be implemented in simulator
At least part to be implemented on a real robot
Also depending on the pandemic situation
Development of intelligent systems, Introductory information
37
Simulation vs. real-world robot
 System setup
 Running ROS
 Tele-operating TurtleBot
 Autonomous navigation






Autonomous control of the mobile platform
Acquiring images and 3D information
Simultaneous mapping and localization (SLAM)
Path planning, obstacle avoidance, approaching
Advanced fine manoeuvring and parking
Intelligent navigation and exploration of space
Sim.+real
Sim. only
Task 1
Task 2
Task 3
Optional: also parts
of Task 2 and Task 3
 Advanced perception and cognitive capabilities





Detection of faces, circles, 3D rings, 3D cylinders
Recognition of faces, food, digits, colour
Basic manipulation and visual servoing
Speech synthesis, speech recognition, dialogue processing (reading QR codes)
Belief maintenance, reasoning, planning
Development of intelligent systems, Introductory information
38
Homeworks
 In case only the work in simulation is possible
 Will be decided during the semester
 A couple of assignments on computer vision tasks on real data:




Face detection observation model
Colour recognition
Digit recognition
Food recognition
 To be submitted at učilnica
 Assessed at major milestones
Development of intelligent systems, Introductory information
39
Lectures
 Additional knowledge needed for understanding and
implementation of the individual components of the system:








introduction to intelligent systems
ROS
sensors
transformations between the coordinate frames
mobile robotics
computer vision and machine learning
robot manipulation
artificial cognitive systems
Development of intelligent systems, Introductory information
40
Practical implementation
 Five robots are available
 Teams of three students
 Each team should ensure that:
there is at least one good C++/Python programmer
there is at least one member who can work with Linux and robots/hardware
there is at least one member skilled in computer vision and machine learning
all the members contribute equally – the work should be fairly distributed – no
piggybacking!
 all the members of the team attend the same tutorial group
 the team preferably also has its own laptop / powerful desktop




 sufficiently powerful
 native Linux (Ubuntu 20.04 Focal Fossa,…; ROS Noetic Ninjemys)
 USB port
 Mobile platforms are available
 during the practice classes (tutorials)
 at other times in RoboRoom R2.38 (booking required)
Development of intelligent systems, Introductory information
41
Continuous integration
 It is essential to work during the entire semester
 Time during the official classes does not suffice
 Book the robot and work at other times in RoboRoom R2.38
Development of intelligent systems, Introductory information
42
Milestones
 Milestone 1 (25.3.): Autonomous navigation and human search
 Autonomous navigation around the competition area
 Find and approach the faces
 Milestone 2 (29.4.): Parking
 Detect the 3D rings
 Basic visual servoing
 Fine manoeuvring and parking
 Milestone 3 (27.5.): FooDeRo
 Deliver the food
 Computer vision, machine learning
 Dialogue, mobile manipulation
 Belief maintenance, reasoning, planning
 + Autonomous navigation and human search on a real robot
Development of intelligent systems, Introductory information
43
Evaluation





Huge emphasis on practical work
Continuing work and assessment during the semester
Different levels of requirements available
There is no written exam!
Oral exam
 Grading:
10 points: M1 in simulator (system operation)
10 points: M1 on real robot (system operation)
15 points: M2 in simulator (system operation)
25 points: M3 in simulator (concepts used, system operation, system performance)
10 points: Final report (description of the methods used, experimental results,
implementation, innovation)
 10 points: Active participation (participation at the practice classes, active work in the
lab, active problem solving, homeworks)
 20 points: Oral exam (concepts presented during the lectures, discussion about
theoretical and implementation details of the developed system, experimental results)





Development of intelligent systems, Introductory information
44
Requirements
 Requirements:
 at least 30 points (50%) earned at milestones
 at least 5 points (50%) for the final report
 at least 50 points (50%) altogether
 If the student fails to carry out the work in time (fails to successfully
demonstrate the robot at the milestones), he/she cannot pass the course in the
current academic year.
 If the student does not pass the oral exam, he/she is allowed to take it again
(this year).
 If it is determined that the student has not contributed enough to the
development of the system in his/her team, he/she cannot pass the course in the
current academic year.
 The students have to participate in the continuous assessment of their work (at
milestones M1, M2, M3).
 Attendance at the practice classes is compulsory.
 Completed requirements are only valid in the current academic year.
Development of intelligent systems, Introductory information
45
Conclusion








Very „hands-on“ course
Gaining practical experience on a real robot
Real-world problems
Collaboration
Creative thinking
Problem solving
Innovativeness
Practical skills
Development of intelligent systems, Introductory information
46
Development of intelligent systems
(RInS)
Introduction
Danijel Skočaj
University of Ljubljana
Faculty of Computer and Information Science
Academic year: 2021/22
Development of intelligent systems
Intelligent systems
 Software intelligent systems
 Passive situated robot systems
 Active embodied robot systems
Development of intelligent systems, Introduction
2
Robotics
ro·bot noun \ˈrō-ˌbät, -bət\: a real or imaginary
machine that is controlled by a computer and is often
made to look like a human or animal
: a machine that can do the work of a person and that
works automatically or is controlled by a computer
Merriam – Webster dictionary
 Robot
 Karel Čapek: R.U.R. (Rossum's Universal Robots),
1921
 „robota“ – work; forced, hard labour
Development of intelligent systems, Introduction
3
Intelligent autonomous robot systems
Drive
Development of intelligent systems, Introduction
Walk
4
Intelligent autonomous robot systems
Float
Development of intelligent systems, Introduction
Dive
5
Intelligent autonomous robot systems
Fly
Development of intelligent systems, Introduction
Surround us
6
Types of robots






Industrial robots
Robot manipulators
Mobile robots
Humanoid robots
Cognitive systems
Unmanned aerial vehicles, ...
Development of intelligent systems, Introduction
7
Industrial robots
Development of intelligent systems, Introduction
8
Domestic robots
Development of intelligent systems, Introduction
9
Autonomous car navigation
 Autonomous navigation
 Self-driving cars
 Navigation assistants


Pedestrian detection
Several cameras + other sensors
http://www.mobileye.com
Development of intelligent systems, Introduction
Bloomberg, Uber self-driving car
10
Autonomous boat navigation (USV)
UNI-LJ, FE, LSI
FRI, LUVSS
Harpha Sea
Development of intelligent systems, Introduction
11
Autonomous drones (UAV)
UNI-LJ, FRI, LUVSS
Development of intelligent systems, Introduction
12
Cognitive robotics

Wikipedia:
Cognitive robotics is concerned with endowing robots with mammalian and
human-like cognitive capabilities to enable the achievement of complex goals
in complex environments. Robotic cognitive capabilities include perception
processing, attention allocation, anticipation, planning, reasoning about
other agents, and perhaps reasoning about their own mental states. Robotic
cognition embodies the behaviour of intelligent agents in the physical
world.
 A cognitive robot should exhibit:






knowledge
beliefs
preferences
goals
informational attitudes
motivational attitudes (observing, communicating, revising beliefs, planning)
Development of intelligent systems, Introduction
13
Cognitive systems
 Cognitive assistant
 Explores the environment
and builds a map of it
 Learns to recognize and
identify objects
 Understands object
affordances
 Can verbally and non-verbally
communicate with people in
its vicinity
 Detects new situations and
reacts accordingly
• Built-in basic functionalities,
which are then further
developed, adapted and
extended by learning
Development of intelligent systems, Introduction
Morpha
Univ. Karlsruhe
14
Cognitive systems
EURON video
Development of intelligent systems, Introduction
15
Intelligent robot systems
UL FRI
Development of intelligent systems, Introduction
16
Mobile robots
EURON
video
Development of intelligent systems, Introduction
17
Mobile robots
iRobot Roomba TurtleBot
Development of intelligent systems, Introduction
Ubiquity robotics Magni
18
Mobile robots
UL FRI LUVSS
Development of intelligent systems, Introduction
19
Robotics
 Routine industrial robotic sensor system
EURON video
EURON video
 Intelligent artificial visual cognitive systems
Development of intelligent systems, Introduction
20
Sensor-robot system
 Perception – action cycle
Development of intelligent systems, Introduction
21
Simulation of robot perception and control
Development of intelligent systems, Introduction
22
Sensors
 Range sensors
 Object recognition
 Bumper –
collision detector
 Odometer
Development of intelligent systems, Introduction
23
Planning and control
 Planning
 Control
Development of intelligent systems, Introduction
24
CS 545 Robotics
Introduction to ROS
Slides adapted from Sachin Chitta and Radu Rusu (Willow Garage)
Overview
[Diagram: a new application alongside a web browser, email client and window manager, all running on top of OS services – memory management, scheduler, process management, device drivers, file system – with a question mark where the supporting layer for "my new application" should be]
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
Overview
Standards
Hardware: PCI bus, USB port, FireWire, ...
Software: HTML, JPG, TCP/IP, POSIX, ...
[Diagram: the same application/OS stack – applications on top of memory management, scheduler, process management, device drivers and the file system – held together by these standards]
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
Overview
...but what about robots?
[Diagram: the application/OS stack again – for robots there is no comparable standard layer underneath "my new application"]
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
Lack of standards for robotics
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
Typical scenario
[Diagram: robot–world loop with three stages: 1 perceiving, 2 processing, 3 manipulating]
1 Many sensors require device drivers and calibration procedures.
For example cameras: stereo processing, point cloud generation...
Common to many sensors: filtering, estimation, coordinate transformation,
representations, voxel grid/point cloud processing, sensor fusion, ...
2 Algorithms for object detection/recognition, localization, navigation, path/motion
planning, decision making, ...
3 Motor control: inverse kinematics/dynamics, PID control, force control, ...
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
Control loops
[Diagram: the same robot–world loop: 1 perceiving, 2 processing, 3 manipulating]
Many control loops on different time scales
The outermost control loop may run once every second (1 Hz) or slower
The innermost may run at 1000 Hz or even higher rates
Software requirements:
Distributed processing with loose coupling. Sensor data comes in
at various time scales.
Real-time capabilities for tight motor control loops.
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
Debugging tools
[Diagram: the robot–world loop (perceiving, processing, manipulating) extended with visualization and simulation]
Simulation: no risk of breaking real robots, shorter debugging cycles, testing faster than real time, controlled physics, a perfect model is available...
Visualization: facilitates debugging, ...looking at the world from the robot's perspective.
Data trace inspections allow debugging on small time scales.
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
Overview
ROS
[Diagram: ROS plays the role of the OS for robot applications, providing the task executive, navigation, simulation, planning, visualization, perception, data logging, control, message passing, real-time capabilities and device drivers – analogous to an OS providing memory management, scheduler, process management, device drivers and the file system to desktop applications]
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
Overview
1 Orocos: <http://www.orocos.org>
2 OpenRTM: <http://www.is.aist.go.jp>
3 ROS: <http://www.ros.org>
4 OPRoS: <http://opros.or.kr>
5 JOSER: <http://www.joser.org>
6 InterModalics: <http://intermodalics.eu>
7 Denx: <http://denx.de>
8 GearBox: <http://gearbox.sourceforge.net/gbx_doc_overview.html>
Why should we agree on one standard ?
Code reuse, code sharing:
stop reinventing the wheel again and again... instead build on top of each other’s code.
Ability to run the same code across multiple robots:
portability facilitates collaborations and allows for comparison of similar
approaches which is very important especially in science.
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
What is ROS ?
ROS is an open-source, meta-operating system
and stands for Robot Operating System.
It provides the services you would expect from an operating
system, including hardware abstraction, low-level device
control, implementation of commonly-used functionality,
message-passing between processes, and package
management.
http://www.ros.org (documentation)
https://lists.sourceforge.net/lists/listinfo/ros-users (mailing list)
http://www.ros.org/wiki/ROS/Installation (it’s open, it’s free !!)
Mainly supported for Ubuntu linux, experimental for Mac
OS X and other unix systems.
http://www.ros.org/wiki/ROS/StartGuide (tutorials)
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
Robots using ROS
http://www.ros.org/wiki/Robots
many more....
...and many more to come...
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
ROS package system
How to facilitate code sharing and code reuse ?
A package is a building block and implements a reusable capability
Complex enough to be useful
Simple enough to be reused by other packages
A package contains one or more executable
processes (nodes) and provides a ROS interface:
Messages describe the data format of the in/output
of the nodes. For example, a door handle detection
node gets camera images as input and spits out
coordinates of detected door handles.
Service and topics provide the standardized ROS
interface to the rest of the system.
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
ROS package system
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
ROS package system
Collection of packages and stacks, hosted online
Many repositories (>50): Stanford, CMU, TUM, Leuven, USC, Bosch, ...
http://www.ros.org/wiki/Repositories (check it out...)
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
ROS package system
ROS packages tend to follow a common structure. Here are
some of the directories and files you may notice.
• bin/: compiled binaries (C++ nodes)
• include/package_name: C++ include headers
• msg/: Message (msg) types
• src/package_name/: Source files
• srv/: Service (srv) types
• scripts/: executable scripts (Python nodes)
• launch/: launch files
• CMakeLists.txt: CMake build file (see CMakeLists)
• manifest.xml: Package Manifest
• mainpage.dox: Doxygen mainpage documentation
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
ROS package system
manifest.xml
The manifest is a minimal specification about a package and supports a wide variety of ROS tools.
<package>
  <description brief="one line of text">
    long description goes here,
    <em>XHTML is allowed</em>
  </description>
  <author>Alice/alice@somewhere.bar</author>
  <license>BSD</license>
  <depend package="roscpp"/>
  <depend package="my_package"/>
  <rosdep name="libreadline5-dev"/>
  <export>
    <cpp cflags="-I${prefix}/include"
         lflags="-L${prefix}/lib -lmy_lib"/>
  </export>
</package>
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
ROS core
The roscore is a collection of nodes and programs that
are pre-requisites for a ROS-based system.
master
roscore
It provides naming and registration services to the rest of the
nodes in the ROS system. It tracks publishers and subscribers to
topics as well as services.
The role of the master is to enable individual ROS nodes to locate
one another. Once these nodes have located each other they
communicate with each other peer-to-peer.
ROS uses socket communication to facilitate networking. The
roscore starts on http://my_computer:11311
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
ROS package system
node1.launch
<launch>
  <node pkg="my_package" type="node1" name="node1" args="--test">
    <param name="my_param" value="42" />
  </node>
</launch>
[Diagram: roslaunch sets my_param=42 on the parameter server via the master (roscore); node 1 then gets the value from the parameter server]
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
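A possible counterpart on the node side (a sketch consistent with the launch file above, not taken from the slides): because the <param> tag is nested inside the <node> tag, the value lands in the node's private namespace and can be read like this:

#include <ros/ros.h>

int main(int argc, char** argv)
{
  ros::init(argc, argv, "node1");
  ros::NodeHandle private_nh("~");  // private namespace: /node1/...

  int my_param = 0;
  // Reads /node1/my_param, which roslaunch set to 42 (falls back to 0).
  private_nh.param("my_param", my_param, 0);
  ROS_INFO("my_param = %d", my_param);
  return 0;
}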
ROS: message passing
Problem:
Synchronization and message passing across multiple processes, maybe even
across multiple computers and/or robots.
[Diagram: three nodes and the master (roscore) – how do the nodes exchange data?]
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
ROS: message passing
Problem:
Synchronization and message passing across multiple processes, maybe even
across multiple computers and/or robots.
[Diagram: node 1 registers with the master (roscore) via ros::init; node 2 and node 3 are also running.]
#include "ros/ros.h"
#include "std_msgs/String.h"
#include <sstream>

int main(int argc, char **argv)
{
  ros::init(argc, argv, "node1");
  ros::NodeHandle n;
  ros::Publisher chatter_pub =
    n.advertise<std_msgs::String>("info", 1000);
  ros::Rate loop_rate(10);
  int count = 0;
  while (ros::ok())
  {
    std_msgs::String msg;
    std::stringstream ss;
    ss << "hello world " << count;
    msg.data = ss.str();
    ROS_INFO("%s", msg.data.c_str());
    chatter_pub.publish(msg);
    ros::spinOnce();
    loop_rate.sleep();
    ++count;
  }
  return 0;
}
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
ROS: message passing
[Diagram sequence for the publisher code above: ros::init registers node 1 with the master (roscore), advertise creates the topic /node1/info, and publish sends messages on it.]
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
ROS: message passing
Problem:
Synchronization and message passing across multiple processes, maybe even
across multiple computers and/or robots.
[Diagram: node 1 publishes on the topic /node1/info; node 2 and node 3 subscribe to it through the master (roscore).]
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
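The slides only show the publisher; for completeness, a matching subscriber (a sketch following the standard roscpp tutorial pattern, not part of the original deck) might look roughly like this:

#include "ros/ros.h"
#include "std_msgs/String.h"

// Called whenever a message arrives on the subscribed topic.
void infoCallback(const std_msgs::String::ConstPtr& msg)
{
  ROS_INFO("I heard: [%s]", msg->data.c_str());
}

int main(int argc, char **argv)
{
  ros::init(argc, argv, "node2");
  ros::NodeHandle n;
  // Subscribe to the same topic name the publisher advertised.
  ros::Subscriber sub = n.subscribe("info", 1000, infoCallback);
  ros::spin();  // process incoming messages until shutdown
  return 0;
}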
ROS: message passing
Problem:
Synchronization and message passing across multiple processes, maybe even
across multiple computers and/or robots.
[Diagram: node 3 additionally advertises a topic of its own while the other nodes keep publishing and subscribing through the master (roscore).]
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
ROS: message passing
Problem:
Synchronization and message passing across multiple processes, maybe even
across multiple computers and/or robots.
[Diagram: two topics now exist, /node1/info and /node3/info; node 1 and node 3 publish, and the nodes subscribe to each other's topics through the master (roscore).]
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
ROS: logging
Problem:
Synchronization and message passing across multiple processes, maybe even
across multiple computers and/or robots.
[Diagram: a logging node subscribes to the topics /node1/info and /node3/info and records all traffic.]
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
ROS: logging
rosbag: This is a set of tools for recording from and playing
back to ROS topics. It can be used to mimic real sensor
streams for offline debugging.
http://www.ros.org/wiki/rosbag
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
ROS: device drivers
Problem:
Many sensors do not come with standardized interfaces. Often the manufacturer
only provides support for a single operating system (e.g. Microsoft Windows).
Thus, everybody that wants to use a particular sensor is required to write their own
device driver, which is time consuming and tedious.
Instead, a few people did the
work and the rest of the
world (re-)uses their code
and builds on top of it.
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
ROS: robot descriptions
urdf: This package contains a C++ parser for the Unified Robot
Description Format (URDF), which is an XML format for representing a
robot model.
<robot name="test_robot">
  <link name="link1" />
  <link name="link2" />
  <link name="link3" />
  <link name="link4" />

  <joint name="joint1" type="continuous">
    <parent link="link1"/>
    <child link="link2"/>
  </joint>
  <joint name="joint2" type="continuous">
    <parent link="link1"/>
    <child link="link3"/>
  </joint>
  <joint name="joint3" type="continuous">
    <parent link="link3"/>
    <child link="link4"/>
  </joint>
</robot>
calibration required !!
http://www.ros.org/wiki/urdf
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
ROS: calibration
Provides a toolchain running through the robot calibration
process. This involves capturing PR2 calibration data, estimating
PR2 parameters, and then updating the PR2 URDF.
http://www.ros.org/wiki/pr2_calibration
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
ROS: visualization
rviz: This is a 3D visualization environment for robots. It allows
you to see the world through the eyes of the robot.
http://www.ros.org/wiki/rviz
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
ROS: 2D/3D perception
OpenCV: (Open Source Computer Vision) is a library of programming functions for
real time computer vision. http://opencv.willowgarage.com/wiki/
Check out CS 574 (Prof. Ram Nevatia) !!
PCL - Point Cloud Library: a comprehensive open source library for n-D Point
Clouds and 3D geometry processing. The library contains numerous state-of-the-art
algorithms for: filtering, feature estimation, surface reconstruction, registration,
model fitting and segmentation, etc.
http://www.ros.org/wiki/pcl
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
ROS: planning
The motion_planners stack contains different motion
planners including probabilistic motion planners, search-based
planners, and motion planner based on trajectory optimization.
http://www.ros.org/wiki/motion_planners
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
ROS: navigation
navigation: A 2D navigation stack that takes in information
from odometry, sensor streams, and a goal pose and outputs
safe velocity commands that are sent to a mobile base.
http://www.ros.org/wiki/navigation
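To illustrate how a node hands a goal pose to the navigation stack (a minimal sketch assuming the standard move_base action interface; the goal coordinates are arbitrary example values):

#include <ros/ros.h>
#include <actionlib/client/simple_action_client.h>
#include <move_base_msgs/MoveBaseAction.h>

int main(int argc, char** argv)
{
  ros::init(argc, argv, "send_goal");

  // Action client for the move_base node (true = spin its own thread).
  actionlib::SimpleActionClient<move_base_msgs::MoveBaseAction> ac("move_base", true);
  ac.waitForServer();

  move_base_msgs::MoveBaseGoal goal;
  goal.target_pose.header.frame_id = "map";
  goal.target_pose.header.stamp = ros::Time::now();
  goal.target_pose.pose.position.x = 1.0;    // example goal: 1 m along x in the map
  goal.target_pose.pose.orientation.w = 1.0; // identity orientation

  ac.sendGoal(goal);
  ac.waitForResult();
  if (ac.getState() == actionlib::SimpleClientGoalState::SUCCEEDED)
    ROS_INFO("Goal reached");
  else
    ROS_WARN("Navigation failed");
  return 0;
}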
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
ROS: task executive
SMACH, which stands for 'state machine', is a task-level
architecture for rapidly creating complex robot behavior.
http://www.ros.org/wiki/smach
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
Example application
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
Overview
ROS
[Diagram (recap): ROS provides the task executive, navigation, simulation, planning, visualization, perception, data logging, control, message passing, real-time capabilities and device drivers – the robotics counterpart of the OS stack]
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
Why should one use ROS ?
Build on top of existing software, make use of existing tools, and focus on
your own research.
Provide the community your own work such that people can reproduce
your experiments and build on top of it.
More information about ROS
Stanford Course: Robot Perception
http://pr.willowgarage.com/wiki/Stanford_CS324_PerceptionForManipulation
PR2 workshop (Good tutorial videos)
http://www.ros.org/wiki/Events/PR2BetaTraining/Videos
CS 545 Robotics
Introduction to ROS
Peter Pastor
Computational Learning and Motor Control Lab
Development of intelligent systems
(RInS)
Robot sensors and TurtleBot
Danijel Skočaj
University of Ljubljana
Faculty of Computer and Information Science
Academic year: 2021/22
Development of intelligent systems
Robotic sensors
Sensors
Robot platforms
http://ias.cs.tum.edu
Development of intelligent systems, Robot sensors
2
Sensors
 Equivalent to human senses
 Acquire information from the environment
 Electronic/mechanic/chemical device that maps the attributes of the environment
into a quantitative measurement
 Robot can differentiate only between the states in the environment, which can be
sensed differently
see
action
AGENT
ENVIRONMENT
Development of intelligent systems, Robot sensors
3
Perception action cycle
 Significant
abstraction
of the real
world
sense
perception
modelling
planning
task execution
motor control
Development of intelligent systems, Robot sensors
act
4
Senses
 Human senses:
 The list of robot senses is much longer!






Beyond human capabilities
Vision beyond visual spectrum (IR cameras, etc.)
Active vision (radar, LIDAR)
Hearing beyond the range 20 Hz-20 kHz (ultrasound)
Chemical analysis for better taste and smell
Measurement of temperature, humidity, illumination, radiation,
pressure, volume, position, direction, acceleration, velocity, etc.
Development of intelligent systems, Robot sensors
5
Classification of sensors
 Proprioceptive and exteroceptive sensors
 Proprioceptive: measure internal states of the robot (battery status, position of wheels,
angle between the segments in the robot arm)
 Exteroceptive: measure the state of the environment (majority of the sensors)
 Passive and active sensors
 Passive: only receive the energy from the environment (e.g., camera)
 Active: also emit the energy in the environment (e.g., radar)
 Noninvasive and invasive sensors
 Noninvasive (contactless): no contact with the object
 Invasive: measurement with contact
 Visual, non-visual
Development of intelligent systems, Robot sensors
6
Classification of sensors
Development of intelligent systems, Robot sensors
7
Classification of sensors
Development of intelligent systems, Robot sensors
8
Sensors in robots
Gas Sensor
Piezo Bend Sensor
Metal Detector
Pendulum Resistive
Tilt Sensors
Geiger-Muller
Radiation Sensor
UV Detector
Pyroelectric Detector
Resistive Bend Sensors
Digital Infrared Ranging
CDS Cell
Resistive Light Sensor
Pressure Switch
Miniature Polaroid Sensor
Limit Switch
Touch Switch
Mechanical Tilt Sensors
Gyro
IR Pin
Diode
IR Sensor w/lens
Thyristor
Magnetic Sensor
Polaroid Sensor Board
IR Reflection
Sensor
Magnetic Reed Switch
IR Amplifier Sensor
IRDA Transceiver
Hall Effect
Magnetic Field
Sensors
Accelerometer
Lite-On IR
Remote Receiver
Radio Shack
Remote Receiver
IR Modulator
Receiver
Development of intelligent systems, Robot sensors
Solar Cell
Compass
Compass
Piezo Ultrasonic Transducers
9
Cameras
Electromagnetic spectrum
Visual
“light”
Near infrared
“light”
(NIR)
Long-wavelength
infrared “light”
(FLIR)
Terahertz
“light”
(T-ray)
Development of intelligent systems, Robot sensors
10
Sensing EM radiation
Development of intelligent systems, Robot sensors
11
Resistive sensors
 Bend sensor
 The resistance changes
by bending the sensor
 Potentiometer
 Position sensor in sliding or
rotating mechanisms
 Photoresistor
 Small resistance at
high illumination
 Light detection
Development of intelligent systems, Robot sensors
12
Infrared sensors
 Intensity IR sensors
 Emit and receive IR light
 Photo-transistor
 Sensitive to daylight, reflections, distance
 Robust, cheap
 Application: object detection, optical encoder
 Modulated IR sensors




Modulation and demodulation
Pulse detection
More robust
IR remotes, etc.
Development of intelligent systems, Robot sensors
13
Infrared sensors
 Range sensors
 Measuring angle between the emitted and received light
-> triangulation
 Insensitive to ambient light
Development of intelligent systems, Robot sensors
14
Measuring rotation
 Incremental Optical Encoders
 Relative rotation
light sensor
light
emitter
decode
circuitry
 Absolute Optical Encoders
 Absolute position
 Gray code
Development of intelligent systems, Robot sensors
15
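A small sketch of how incremental encoder ticks are typically turned into travelled distance (the tick count and wheel radius below are made-up example values, not specifications of the course robot):

#include <cmath>
#include <cstdio>

// Distance travelled by a wheel given encoder ticks.
double wheel_distance(long ticks, double wheel_radius_m, int ticks_per_revolution)
{
  const double revolutions = static_cast<double>(ticks) / ticks_per_revolution;
  return revolutions * 2.0 * M_PI * wheel_radius_m;  // circumference per revolution
}

int main()
{
  // Example: 508 ticks per revolution, 3.6 cm wheel radius (illustrative values).
  std::printf("%.3f m\n", wheel_distance(1000, 0.036, 508));
  return 0;
}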
Inertial sensors
 Gyroscope
 Measuring change of orientation
 based on the principles of
angular momentum
 Accelerometer
 Measures acceleration,
also orientation
 Uniaxial, triaxial
 Vibration sensor, vibration analysis, detection of orientation
 Nintendo Wii, smart phones
Development of intelligent systems, Robot sensors
16
Compass
 Electronic compass
 Absolute orientation of the robot
 N, S, E, W
Development of intelligent systems, Robot sensors
17
GPS
 Global Positioning System
 24 satellites at the height of
20200 km
 Atomic clock
 Satellite emit the time and
position data
 At least 4 satellites should
be visible
 Differential GPS – additional
(terrestrial) signals are
considered
Development of intelligent systems, Robot sensors
18
Tactile sensors





Haptic technology
Buttons, switches
Bumpers (collision sensors)
Touch sensors on the robot arm
Different types:




Piezoresistive
Piezoelectric
Capacitive
Elastoresistive
 Artificial skin
Development of intelligent systems, Robot sensors
19
Acoustic sensors
 Perception of sound
 Sonar
 Microphone
 Array of microphones
 Detecting the sound
direction
Development of intelligent systems, Robot sensors
20
Range sensors
 Stereo vision
 Shape from X
 Coded light range sensor
 IR range sensor
 Time Of Flight sensors





Emit the signal, wait until it is back, measure the time
RADAR
SONAR
LIDAR
ToF cameras
Development of intelligent systems, Robot sensors
21
Sonar
 Emits ultrasound
 Measures the time of flight
 Bat, dolphin
 From a couple of cm to 30 m
 30 degrees angular accuracy
 Quite slow: 200 ms for 30 m
Development of intelligent systems, Robot sensors
22
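The underlying computation is simple (a sketch, not from the slides; 343 m/s is the approximate speed of sound in air at room temperature). The measured echo time covers the path to the obstacle and back, so it is halved:

#include <cstdio>

// Convert a sonar echo time (seconds) to range (metres).
double sonar_range(double echo_time_s, double speed_of_sound = 343.0)
{
  return speed_of_sound * echo_time_s / 2.0;  // out and back -> divide by two
}

int main()
{
  std::printf("%.2f m\n", sonar_range(0.0175));  // ~3 m to the obstacle
  return 0;
}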
Sonar
 Usage: mapping of space
[Figure: sonar scan moving from left to right past a chair, a doorway, and another chair – the echo lengths outline the room]
 Problem: noise, interference
Development of intelligent systems, Robot sensors
23
Laser range sensors








LIDAR (Light Detection And Ranging)
Emits laser pulses
Rotating mirror – different angles (up to 180 degrees)
Vertical movement – the entire hemisphere
Better angular accuracy (0.25 degrees)
Faster
Different ranges, indoor, outdoor
Robust
Development of intelligent systems, Robot sensors
24
TOF cameras
 Time-of-flight cameras
 Time of pulse travel
Development of intelligent systems, Robot sensors
25
Coded light range sensor
 Camera and stripe projector
[Figure: projector–camera triangulation geometry (baseline s, image coordinates u, v) and the colour-coded stripe pattern]
Development of intelligent systems, Robot sensors
26
Stereo cameras
Development of intelligent systems, Robot sensors
27
Structure from motion
Development of intelligent systems, Robot sensors
28
Other sensors
 Exteroceptive sensors
 Wind speed
 Temperature
 Humidity
 Proprioceptive sensors
 Battery level
 Temperature of CPU, motors, sensors, etc.
Development of intelligent systems, Robot sensors
29
Multimodal perception
Setup
Thermal
Development of intelligent systems, Robot sensors
Polarized cam
Zed – stereo
LIDAR
Radar
Wide-baseline stereo
UL FE, FRI, Janez Perš, Matej Kristan
30
Sensor fusion
 One sensor often does not suffice




Noise
Limited accuracy
Unreliability
Limited sensing range
=>Fuse the results of several sensors
 Sensor fusion: fusion on the level of sensors
 Combine signals in one data structure on a low level
 Sensor integration: Fusion on the level of representations
 Process data from every sensor independently and merge the obtained information on
a higher level
 Fusion of data from multiple sources:
 Measurement from different sensors
 Measurement from different times
 Measurement from different locations
Development of intelligent systems, Robot sensors
31
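A standard example of fusion on the measurement level (a textbook formula, not taken from the slides) is inverse-variance weighting of two measurements $x_1$ and $x_2$ of the same quantity with noise variances $\sigma_1^2$ and $\sigma_2^2$:

$\hat{x} = \dfrac{x_1/\sigma_1^2 + x_2/\sigma_2^2}{1/\sigma_1^2 + 1/\sigma_2^2}, \qquad \dfrac{1}{\hat{\sigma}^2} = \dfrac{1}{\sigma_1^2} + \dfrac{1}{\sigma_2^2}$

The less noisy sensor dominates the estimate, and the fused variance is smaller than either individual variance.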
TurtleBot++
Development of intelligent systems, Robot sensors
32
iRobot Roomba
 Actuators and sensors
Development of intelligent systems, Robot sensors
33
Motors
 Changeable speed of the wheels
 pulse-width modulation (PWM)
wheels
brushes
vacuum
cleaner
 On/off motors for brushes and vacuum cleaner
Development of intelligent systems, Robot sensors
34
Wheels
 Differential control system
 Two independently controlled
wheels
 Electric motor
 high speed
 25:1 reduction
 large torque
Development of intelligent systems, Robot sensors
35
Sensors
bumper
base
buttons
Development of intelligent systems, Robot sensors
wall
cliff
dirt
odometry
wheels
36
IR sensors
 IR sensors
base
bumper
wall
 Micro switches:
 Capacitive sensor:
Development of intelligent systems, Robot sensors
buttons
odometry
cliff
wheels
dirt
37
Power supply
 Measuring power supply




capacity of the accumulator [mAh]
voltage [V]
current [A]
connectors
temperature
Development of intelligent systems, Robot sensors
38
Indicators
 Led lights
 Status (green, red)
 Dirt detection (blue)
status
dirt
 Speaker
 piezoelectric beeper
speaker
Development of intelligent systems, Robot sensors
39
RGBD sensor Kinect
 PrimeSense sensor
Development of intelligent systems, Robot sensors
40
Components
IR
projector
IR
camera
RGB
camera
Development of intelligent systems, Robot sensors
41
Scheme
Development of intelligent systems, Robot sensors
42
Projected pattern
Development of intelligent systems, Robot sensors
43
Projected pattern
Development of intelligent systems, Robot sensors
44
Patent
Development of intelligent systems, Robot sensors
45
Patent
Development of intelligent systems, Robot sensors
46
Kinect performance
 Specifications:







Horizontal field of view: 57 degrees
Vertical field of view: 43 degrees
Physical tilt range: ± 27 degrees
Depth sensor range: 1.2m - 3.5m
320x240 16-bit depth @ 30 frames/sec
640x480 32-bit colour @ 30 frames/sec
16-bit audio @ 16 kHz
Kotnik, 2018
Khoshelham, 2011
Development of intelligent systems, Robot sensors
47
RGBD information
Development of intelligent systems, Robot sensors
48
TurtleBot in simulation
Development of intelligent systems, Robot sensors
49
Gazebo and RViz
Development of intelligent systems, Robot sensors
50
Literature
 Dr. John (Jizhong) Xiao, City College of New York,
Robot Sensing and Sensors
 Tod E. Kurt, Hacking Roomba: ExtremeTech, Wiley, 2006
 http://www.ifixit.com/Teardown/Microsoft-Kinect-Teardown/4066/3
 Futurepicture, http://www.futurepicture.org/?p=116
 United States Patent, Garcia et al., Patent No. 7,433,024 B2
 Peter Corke, Robotics, Vision and Control, 2017
 other
Development of intelligent systems, Robot sensors
51
Development of intelligent systems
(RInS)
Object detection
Matej Kristan, Danijel Skočaj
University of Ljubljana
Faculty of Computer and Information Science
Academic year: 2021/22
Development of intelligent systems
Computer vision
Visual information
Computer vision tasks
Face detection!
Development of intelligent systems, Object detection
2
Visual information
Images
Video
3D
Development of intelligent systems, Object detection
3
Classification
What is depicted in the image?
Categorisation
Localisation
Recognition/identification of instances
Development of intelligent systems, Object detection
4
Detection
Where in the image?
Detection
Development of intelligent systems, Object detection
Instance segmentation
5
Segmentation
What does every pixel represent?
Semantic segmentation
Development of intelligent systems, Object detection
Panoptic segmentation
6
Two stage object detection and recognition
 Face detection (very fast, efficient): HOG+SVM, AdaBoost, SSD
 Face recognition of the detected face, e.g. „Scarlet“ (can be slower, computationally more complex): CNN, PCA/LDA
Development of intelligent systems, Object detection
7
Observation model
 Three face detectors given




HOG+SVM
AdaBoost
SSD
Any other?
 Not perfect
 Which one is better?
 More true positives
 Less false positives
 Test set
 Images, videos
 Different angles, illumination
 Motion blur, etc.
 Observation model
 Performance
 at different distances and angles
 at different illuminations
Development of intelligent systems, Object detection
8
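On "which one is better": detector comparisons on a test set are usually summarized with precision and recall (standard definitions, added here for reference) computed from true positives (TP), false positives (FP) and false negatives (FN):

$\text{precision} = \dfrac{TP}{TP + FP}, \qquad \text{recall} = \dfrac{TP}{TP + FN}$

A detector with more true positives and fewer false positives has higher recall and precision; how this trade-off varies with distance, angle and illumination is what the observation model needs to capture.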
Robustification of detection
 Use and robustify the better detector
 Take into account temporal dimension
 Repetitive detections more robust
 Filter out false positives
 Take into account spatial dimension
 Non-maximum suppression
 Observation model




Map the image from 2D image to 3D world
Anchor the image into the map
Non-maximum suppression in the map
Redetection of faces from different directions
Development of intelligent systems, Object detection
9
Two stage object detection and recognition
 Face detection (very fast, efficient): HOG+SVM, AdaBoost, SSD
 Face recognition of the detected face, e.g. „Scarlet“ (can be slower, computationally more complex): CNN, PCA/LDA
Development of intelligent systems, Object detection
10
Machine Perception
Category detection
Matej Kristan
Laboratorij za Umetne Vizualne Spoznavne Sisteme,
Fakulteta za računalništvo in informatiko,
Univerza v Ljubljani
Machine Perception
Category detection
Matej Kristan
Laboratorij za Umetne Vizualne Spoznavne Sisteme,
Fakulteta za računalništvo in informatiko,
Univerza v Ljubljani
Object categorization
• How to detect/recognize any car?
• How to detect/recognize any cow?
slide credit: B. Leibe
Challenge: Robustness
Illumination
Occlusion
Object pose
Within-class
variability
Clutter
Aspect
Slide credit: Kristen Grauman
Detection by classification: Standard approach
• Apply machine learning to learn “features” that
differ between the object of interest and the background.
• Basic component: a binary classifier
Classifier:
car / non-car
Yes, a car.
No, not a car.
Slide credit: Kristen Grauman
Detection by classification: Standard approach
• Apply a sliding window to cope with clutter and localize the
object.
Classifier
Car / non-Car
• This is essentially a greedy search using a (very) large number
of local decisions.
Slide credit: Kristen Grauman
Detection by classification: Standard approach
A bit more detailed
description:
1. Get training data
2. Determine features
(semi or fully automatic)
3. Train a classifier
Examples of cars
Examples of non-cars
Training examples
Classifier
Car / non-car
Feature
space
Not a car
Images taken from: Kristen Grauman
Is recognition really that difficult?
Find this chair in the image
Output of a normalized cross correlation.
A chair
Not really?!
Slide credit: A. Torralba
Is recognition really that difficult?
Analyze this!
Completely useless – does not work at all.
Main issue: Feature space!
Images from: A. Torralba
Straight-forward global features
Feature
space
A simple holistic description of image content, e.g.,
• Intensity/color histograms
• Intensity vector
• …
Slide credit: Kristen Grauman
Learned global features (e.g., PCA)
PCA learns features that maximize dataset reconstruction.
• Calculate a low-dimensional representation by a linear subspace.
• The average image and the eigenimages are calculated from the covariance matrix of the training images.
• Project a new image into the subspace.
• Recognize by using a nearest-neighbor search in the subspace.
[Turk & Pentland, 1991]
Slide credit: Kristen Grauman
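As a small illustration of the eigenimage idea (a sketch using OpenCV's PCA class, not the course code; the data matrix below is a zero-filled placeholder for a real training set with one flattened image per row):

#include <opencv2/core.hpp>

int main()
{
  // data: one flattened training image per row (CV_32F), e.g. N x (w*h).
  cv::Mat data = cv::Mat::zeros(100, 64 * 64, CV_32F);  // placeholder training set

  // Learn the mean image and the leading 20 eigenimages.
  cv::PCA pca(data, cv::Mat(), cv::PCA::DATA_AS_ROW, 20);

  // Project a (flattened) image into the subspace and reconstruct it.
  cv::Mat sample = data.row(0);
  cv::Mat coeffs = pca.project(sample);      // low-dimensional representation
  cv::Mat approx = pca.backProject(coeffs);  // reconstruction from the subspace

  // Recognition would then be a nearest-neighbour search over 'coeffs'.
  return 0;
}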
Hand-crafting global features
• Problem 1: Pixel-based representations, are sensitive
to small shifts:
• Problem 2: Color or gray-level representation is
sensitive to illumination changes or within-class
color variations.
Slide credit: Kristen Grauman
Hand-crafting global features
• Solution: Edges, contours and oriented intensity
gradients
Change intensity features into
gradient-based features…
Slide credit: Kristen Grauman
Gradient-based representation
• Edges, contours and oriented intensity gradients
• Encode local gradient distributions using histograms
• Locally unordered: invariant to small shifts and rotations
• Contrast normalization:
addresses non-uniform illumination and varying intensity.
Slide credit: Kristen Grauman
Gradient-based representation: HOG
Histogram of Oriented Gradients: HOG
For each cell visualize the strongest
gradient orientations.
Code available at:
http://pascal.inrialpes.fr/soft/olt/
[Dalal & Triggs, CVPR 2005]
Slide credit: Kristen Grauman
Let’s build a classifier
• We hand-crafted a feature descriptor that is invariant
to illumination changes and small deformation.
• How do we calculate a decision in each sub-window?
Classifier
Car / non-car
Feature
space
Feature calculated from the image
Lots of choices for a classifier
• Neural networks: LeCun, Bottou, Bengio, Haffner 1998; Rowley, Baluja, Kanade 1998; …
• Nearest neighbor (10^6 examples): Shakhnarovich, Viola, Darrell 2003; Berg, Berg, Malik 2005, ...
• Support Vector Machines: Guyon, Vapnik; Heisele, Serre, Poggio, 2001, …
• Boosting: Viola, Jones 2001; Torralba et al. 2004; Opelt et al. 2006, …
• Conditional Random Fields: McCallum, Freitag, Pereira 2000; Kumar, Hebert 2003, …
Adapted from Antonio Torralba
Consider a linear classifier
A decision boundary, in general, a hyper-plane:
$a x_1 + c x_2 + b = 0$
Define: $\mathbf{w} = \begin{bmatrix} a \\ c \end{bmatrix}$, $\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$
A general hyper-plane equation: $\mathbf{w}^T \mathbf{x} + b = 0$
Training examples on one side of the boundary have class id $y = 1$, on the other side $y = -1$.
Classification of $\mathbf{x}$ = sign checking: $f(\mathbf{x}) = \mathrm{sign}(\mathbf{w}^T \mathbf{x} + b)$
Learning = choosing $\mathbf{w}$ and $b$!
Best separation hyper-plane?
A general hyper-plane equation: $\mathbf{w}^T \mathbf{x} + b = 0$
Classification of $\mathbf{x}$ = sign checking: $f(\mathbf{x}) = \mathrm{sign}(\mathbf{w}^T \mathbf{x} + b)$
Choosing $\mathbf{w}$ and $b$?
Best separation hyper-plane?
A general hyper-plane equation: $\mathbf{w}^T \mathbf{x} + b = 0$
Classification of $\mathbf{x}$ = sign checking: $f(\mathbf{x}) = \mathrm{sign}(\mathbf{w}^T \mathbf{x} + b)$
Given training examples $\{\mathbf{x}_i, y_i\}_{i=1:N}$:
• The best hyper-plane maximizes the margin between positive ($y_i = 1$) and negative ($y_i = -1$) training examples.
• It is spanned by the support vectors: $\mathbf{w} = \sum_{i=1}^{N} \alpha_i y_i \mathbf{x}_i$
• Have to select the SVs and learn the $\alpha_i$s!
Classification == similarity to SVs
Inserting w = Σᵢ αᵢ yᵢ xᵢ into the decision function:
f(x) = sign(wᵀx + b)
     = sign( Σᵢ αᵢ yᵢ xᵢᵀ x + b )
     = sign( Σᵢ αᵢ yᵢ φ(xᵢ, x) + b )
where φ(·,·) is a "kernel" measuring the similarity between x and the support vector xᵢ.
Non-linear kernels
A non-linear kernel, e.g., the RBF kernel: φ(xᵢ, x) = exp(−½ (xᵢ − x)² / σ²)
Classification function unchanged: f(x) = sign( Σᵢ αᵢ yᵢ φ(xᵢ, x) + b )
Non-linear kernels implicitly lift the data into a higher-dimensional space and apply a linear hyper-plane in that high-dimensional space.
The hyper-plane becomes nonlinear in the original space.
Application: Pedestrian detection
• Sliding window:
1. extract the HOG descriptor x at each displacement
2. classify by a linear SVM: f(x) = sign(wᵀx + b)
Figure: the input, its HOG, and the HOG cells weighted by the positive and by the negative support vectors.
Dalal and Triggs, Histograms of oriented gradients for human detection, CVPR2005
Pedestrian detection HoG+SVM
Navneet Dalal, Bill Triggs, Histograms of Oriented Gradients for Human Detection, CVPR 2005
Slide credit: Kristen Grauman
Objects (non)rigidly deform
• Nonrigid/deformable objects poorly detected using a
fixed structure.
Slide credit: Kristen Grauman
Deformable parts models (DPM)
• Each part is a HOG descriptor.
• Learn a classifier for HOGs and
geometric constraints
simultaneously by structured SVM.
Root (global) part
Overlaid parts
Geometric constraints
P. Felzenszwalb, R. Girshick, D. McAllester, D.
Ramanan, Object Detection with
Discriminatively Trained Part Based Models
IEEE Transactions on Pattern Analysis and
Machine Intelligence, Vol. 32, No. 9, Sep. 2010
A Great tutorial on DPMs at ICCV2013:
http://www.cs.berkeley.edu/~rbg/ICCV2013/
Time/computation criticality
• A lot of applications are time- and resources-critical
• Require efficient feature construction
• Require efficient classification
• A case study:
• Face detection
Face detection
Application specifics:
• Frontal faces are a good example, where the global
appearance model + sliding window works well:
• Regular 2D structure
• Central part of the face is well approximated by rectangle.
Images from: Kristen Grauman
Faces: Terminology
• Detection:
Given an image, where are
the faces?
Anna
Image credit: H. Rowley
• Recognition:
Whose face is it?
• Classification:
Gender, Age?
Fast face detection
• To apply in real-time applications
1. Feature extraction should be fast
2. Classifier application should be fast
• These points addressed next
Slide credit: Kristen Grauman
Choice of classifiers
Neural networks [LeCun, Bottou, Bengio, Haffner 1998; Rowley, Baluja, Kanade 1998; …]
Nearest neighbor with 10⁶ examples [Shakhnarovich, Viola, Darrell 2003; Berg, Berg, Malik 2005; …]
Support Vector Machines [Guyon, Vapnik; Heisele, Serre, Poggio 2001; …]
Boosting [Viola, Jones 2001; Torralba et al. 2004; Opelt et al. 2006; …]
Conditional Random Fields [McCallum, Freitag, Pereira 2000; Kumar, Hebert 2003; …]
Adapted from Antonio Torralba
Boosting
• Build a strong classifier from a combination of many “weak
classifiers” – weak learners (each at least better than random)
• Flexible choice of weak learners
• This includes fast but inaccurate classifiers!
• We’ll have a look at the AdaBoost (Freund & Schapire)
• Simple to implement.
• Basis for the popular Viola-Jones face detector.
Y. Freund and R. Schapire, A short introduction to boosting, Journal of Japanese Society for
Artificial Intelligence, 14(5):771-780, 1999.
Adaboost: Intuition
• Task: Build a classifier which is a weighted sum of many weak classifiers:
  h(x) = Σₜ θₜ hₜ(x)
• Example of a weak classifier: a threshold on a single feature value (f₁(x) or f₂(x)).
AdaBoost: Intuition
• Train a sequence of weak classifiers.
• Each weak classifier splits train
examples with at least 50% accuracy.
• Those examples that are incorrectly
classified by the weak classifier, get
more weight in training the next
weak classifier.
Final classifier is a
combination of many
weak classifiers!
AdaBoost algorithm
Start with uniform
weights of training
samples.
{x1,…xn}
Repeat T-times:
(add a weak classifier)
Select a feature that minimizes
weighted classification error
and build a weak classifier with
that feature.
Reweight the examples:
Incorrectly classified → higher weights
Correctly classified → lower weights
Final classifier is a combination of weak
classifiers, which are weighted according
to their error.
[Freund & Schapire 1995]
Face detection
• To apply in real-time applications
1. Feature extraction should be fast
(? How to calculate fast/strong features ?)
2. Classifier application should be fast
(weak classifiers = fast evaluation)
Computing features
Simple rectangular (Haar-like) filters as feature extractors: f₁(x), f₂(x), f₃(x), f₄(x), …
The output of each feature is the difference between the intensity in the "black" and "white" regions.
Black is weighted as −1, white as +1.
Example: region sums of 2000 and 1500 give a filter response of f₁(x) = 500.
Computing features
Simple rectangular filters as feature extractors
Require evaluation at many displacements and multiple scales!
Possible to evaluate such a simple filter efficiently!
Efficient computation – Integral images
• Our filters are based on sums of intensities
within rectangular regions.
• This can be done in constant time for an arbitrarily large region!
• Requires precomputing the integral image.
Integral image: the value at (x,y) is the sum of pixel intensities above and to the left of (x,y).
Efficient computation – Integral images
• Consider a more complex (three-rectangle) filter: its response is computed from the integral image by combining the values at the rectangle corners with weights −1, +2, −1 and +1, −2, +1.
Large collection of filters
Account for all possible
parameters:
position, scale, type
More than 180,000
different features in a
24x24 window.
etc...
Apply Adaboost for
(i) selecting most informative features and
(ii) composing a classifier.
[Viola & Jones, CVPR 2001]
Slide credit: Kristen Grauman
Training Adaboost for face detection
Train a cascade of
classifiers using
the Adaboost
faces
non-faces
Selected features,
thresholds and weights.
AdaBoost algorithm
Start with uniform
weights of training
samples.
Repeat T-times
{x1,…xn}
Build a test classifier along each
feature. Calculate weighted
classification error for each
feature and choose a feature
with smallest error (and its
classifier).
…
[Freund & Schapire 1995]
Feature selection and classification: Adaboost
• In each round select a single feature and threshold that best
separate positive (faces) and negative (non-faces) examples given
the weighted error.
Obtained weak classifier:
…
Outputs of
rectangular filters
(features) for faces
and non-faces.
For next round of training reweight
the training examples using the
errors and select the next
feature-threshold pair.
[Viola & Jones, CVPR 2001]
Slide credit: Kristen Grauman
AdaBoost algorithm
Start with uniform
weights of training
samples.
Repeat T-times
{x1,…xn}
Build a test classifier along each
feature. Calculate weighted
classification error for each
feature and choose a feature
with smallest error (and its
classifier).
Reweight the examples:
Incorrectly classified → higher weights
Correctly classified → lower weights
Final classifier is a combination of weak
classifiers, which are weighted according
to their error.
[Freund & Schapire 1995]
Adaboost and feature selection (summary)
• Image features = weak classifiers
• In each round of the Adaboost:
1. Evaluate each rectangular filter on each training example
2. Sort examples w.r.t. filter responses
3. Select the threshold for each filter (with minimum error)
• Determine the optimal threshold in sorted list
4. Select the best combination of filter and threshold
5. The weight of the selected feature is determined by its classification error
6. Reweight examples
P. Viola, M. Jones, Robust Real-Time Face Detection, IJCV, Vol. 57(2), 2004.
(first version appeared at CVPR 2001)
Slide credit: Kristen Grauman
Efficiency issues
Extract features at
each bounding box
and apply Adaboost
classifier.
• Filter responses can be evaluated fast.
• But each image contains a lot of windows that we need to classify – potentially a great amount of computation!
• How to make detection efficient?
Cascade of classifiers
• Efficient: Apply less accurate but fast classifiers first, to reject the windows that obviously do not contain the particular category!
…
Cascade of classifiers
• Chain classifiers from the least complex ones (with a low rate of rejecting true positives) to the most complex ones.
• Each stage trades % detection against % false positives (its operating point on the ROC curve).
• A window cropped from the image passes through Classifier 1, Classifier 2, Classifier 3, …; as soon as a stage answers F the window is rejected as non-face; only windows accepted (T) by all stages are labeled Face.
Slide credit: Svetlana Lazebnik
Viola-Jones face detector
Train a cascade of
classifiers using
the Adaboost
New image
faces
non-faces
Selected features,
thresholds and weights.
Postprocess detections by
non-maxima suppression.
• Train using 5k positives and 350M negatives
• Real-time detector using 38 layers in cascade
• 6061 features in the final layer (classifier)
• [OpenCV implementation: http://sourceforge.net/projects/opencvlibrary/]
Slide credit: Kristen Grauman
Viola-Jones: results
Guess what these correspond to!
Interesting:
First two selected features.
• Performance
 384x288 images, detection at 15 fps on a 700 MHz Intel Pentium III desktop (2001).
 Training time = weeks!
Slide adapted from Kristen Grauman
Detection in progress
• The video visualizes all the “features”, i.e., filter
responses checked in a cascade.
• Observe how the number of evaluated cascade stages increases once the window is close to a face.
http://cvdazzle.com/
Make your face invisible
• Know how it works?
Break it!
http://cvdazzle.com/
Viola-Jones: results
Slide credit: Kristen Grauman
Viola-Jones: results
Slide credit: Kristen Grauman
Viola-Jones: results
Note the missing profiles!
Detector trained only on
frontal faces.
Slide credit: Kristen Grauman
Profile detection
Profile detection requires learning a separate detector using profile
faces.
Slide credit: Kristen Grauman
Profile detection
Paul Viola, ICCV tutorial
Try it at home!
• Viola & Jones detector was a great success
• First (!) real-time face detector
• Lots of improvements since initial publication
• C++ implementation OpenCV [Lienhart, 2002]
• http://sourceforge.net/projects/opencvlibrary/
• Matlab wrappers for C++:
• OpenCV version: Mex OpenCV
• Without OpenCV:
http://www.mathworks.com/matlabcentral/fileexchange/20976-fdlibmexfast-and-simple-face-detection
Slide credit: Kristen Grauman
Application example
Frontal faces
detected and
tracked. Names
inferred from
subtitles and
scripts.
Everingham, M., Sivic, J. and Zisserman, A.
"Hello! My name is... Buffy" - Automatic naming of characters in TV video,
BMVC 2006.
http://www.robots.ox.ac.uk/~vgg/research/nface/index.html
Slide credit: Kristen Grauman
Boosting by context
D. Hoiem, A. Efros, and M. Herbert. Putting Objects in Perspective. CVPR 2006.
slide credit: Rob Fergus
Drawbacks remain…
• Some objects poorly described by a single box
• Occlusion not accounted
for at all
?
Choice of classifiers
Neural networks [LeCun, Bottou, Bengio, Haffner 1998; Rowley, Baluja, Kanade 1998; …]
Nearest neighbor with 10⁶ examples [Shakhnarovich, Viola, Darrell 2003; Berg, Berg, Malik 2005; …]
Support Vector Machines [Guyon, Vapnik; Heisele, Serre, Poggio 2001; …]
Boosting [Viola, Jones 2001; Torralba et al. 2004; Opelt et al. 2006; …]
Conditional Random Fields [McCallum, Freitag, Pereira 2000; Kumar, Hebert 2003; …]
Adapted from Antonio Torralba
Recent advances in feature learning
• Learning features by convolutional neural networks
Yann LeCun, http://yann.lecun.com/
• First successful application to large-scale object
detection by Krizhevsky et al. 1
• Huge number of parameters to learn (millions).
• Impressive performance.
• Currently a hot research topic
(Microsoft, Facebook, Google, etc.)
¹ Krizhevsky et al., ImageNet Classification with Deep Convolutional Neural Networks, NIPS 2012
Recent advances in feature learning
• Some examples on COCO challenge
• CNN trained for region proposals + CNN trained for
classification.
Zagoruyko et al., “FAIR” team at MS COCO & ILSVRC Object Detection and Segmentation Challenge, ICCV2015
Recent advances in feature learning
• Some examples on COCO challenge
• CNN trained for region proposals + CNN trained for
classification.
Zagoruyko et al., “FAIR” team at MS COCO & ILSVRC Object Detection and Segmentation Challenge, ICCV2015
References
• David A. Forsyth, Jean Ponce, Computer Vision: A Modern Approach (2nd Edition) (first edition available online)
• R. Szeliski, Computer Vision: Algorithms and Applications, Springer, 2011
• P. Viola, M. Jones, Robust Real-Time Face Detection, IJCV, Vol. 57(2), 2004.
• Viola-Jones Face Detector
  • C++ implementation in OpenCV [Lienhart, 2002]
    • http://sourceforge.net/projects/opencvlibrary/
  • Matlab wrappers:
    • http://www.mathworks.com/matlabcentral/fileexchange/19912
• Convolutional neural networks
  • Yann LeCun, http://yann.lecun.com/
  • Caffe, Torch, TensorFlow, etc.
Development of intelligent systems
(RInS)
Transformations between
coordinate frames
Danijel Skočaj
University of Ljubljana
Faculty of Computer and Information Science
Literature: Tadej Bajd (2006).
Osnove robotike, chapter 2
Academic year: 2021/22
Development of intelligent systems
Coordinate frames
Development of intelligent systems, Transformations between coordinate frames
2
3D environment
Development of intelligent systems, Transformations between coordinate frames
3
2D navigation
Development of intelligent systems, Transformations between coordinate frames
4
Degrees of freedom
 DOF
 6 DOF for full description of the pose of an object in space
 3 translations (position)
 3 rotations (orientation)
Development of intelligent systems, Transformations between coordinate frames
5
Degrees of freedom
Development of intelligent systems, Transformations between coordinate frames
6
Degrees of freedom
3 translations
2 rotations
1 rotation
ORIENTATION
POSITION
POSE
Development of intelligent systems, Transformations between coordinate frames
7
Position and orientation of the robot
Development of intelligent systems, Transformations between coordinate frames
8
Pose of the object in 3D space
Development of intelligent systems, Transformations between coordinate frames
9
Robot manipulator
 ViCoS LCLWOS robot manipulator
 5DOF
 6DOF needed for general grasping
Development of intelligent systems, Transformations between coordinate frames
10
Chains of coordinate frames
 Transformations between coordinate frames
Development of intelligent systems, Transformations between coordinate frames
11
Position and orientation
 Pose=Position+Orientation
 Position(P2) = Position(P3)
 Position(P1) ≠ Position(P2)
 Orientation(P1) = Orientation(P3)
 Orientation(P2) ≠ Orientation(P3)
 Pose(P1) ≠ Pose(P2) ≠ Pose(P3)
Development of intelligent systems, Transformations between coordinate frames
12
Translation and rotation
 Moving objects:
 P1 to P3: Translation (T)
 P2 to P3: Rotation (R)
 P1 to P2: Translation and rotation
Development of intelligent systems, Transformations between coordinate frames
13
Position
 Position: vector from the origin of the coordinate frame to the point
 Position of the object P1:
Development of intelligent systems, Transformations between coordinate frames
14
Orientation
 Right-handed coordinate frame
 Rotation around x0 axis:
 Rotation matrix:
 Orientation of c.f.
with respect to c.f.
 Transformation of a vector expressed in one coordinate frame into the coordinates expressed in the other coordinate frame:
Development of intelligent systems, Transformations between coordinate frames
15
Rotation matrices
 Rotation around x axis:
 Rotation around y axis :
 Rotation around z axis :
Development of intelligent systems, Transformations between coordinate frames
16
Properties of rotation matrix
 Rotation is an orthogonal transformation matrix
 Inverse transformation:
 In the right-handed coordinate frame the determinant equals to 1
 Addition of angles:
 Backward rotation:
Development of intelligent systems, Transformations between coordinate frames
17
Consecutive rotations
 Postmultiply the vector with the rotation matrix
 Consecutive rotations:
 Rotation matrices are postmultiplied:
 In general:
 Postmultiply matrices for all rotations
 Rotations always refer to the respective relative current coordinate frame
Development of intelligent systems, Transformations between coordinate frames
18
Transformations
 Transformation from one c.f. to another:
 If c.f. are parallel:
 Only translation
 If c.f. are not parallel:
 Rotation and translation
 General pose description
Development of intelligent systems, Transformations between coordinate frames
19
Matrix notation
 Three coordinate frames:
 Combine the transformations:
 We can add the translation vectors if they are expressed in the same coordinate frame
 The two equations in the matrix form:
Development of intelligent systems, Transformations between coordinate frames
20
Homogeneous transformations
 General pose
can be expressed in the matrix form:
 Homogeneous transformation - homogenises (combines) rotation and translation
in one matrix
 Very concise and convenient format
 Homogeneous matrix of size 4x4 (for 3D space)
 One row is added, also 1 in the position vector
Development of intelligent systems, Transformations between coordinate frames
21
Homogeneous matrix
 Rotation R and translation d:
 Only rotation:
Only translation:
Development of intelligent systems, Transformations between coordinate frames
22
Properties of homogeneous transformation
 Inverse of homogeneous transformation:
 Consecutive poses:
 Postmultiplication of homogeneous transformations:
 An element can be transformed an arbitrary number of times –
by multiplying homogeneous matrices
Development of intelligent systems, Transformations between coordinate frames
23
Example
 Two rotations
 Vector is first rotated by 90° around the z axis
and then by 90° around the y axis
Development of intelligent systems, Transformations between coordinate frames
24
Example– two rotations
Development of intelligent systems, Transformations between coordinate frames
25
Example - translation
 After two rotations also translate the vector by (4,−3,7)
 Merge
 Translation
with rotations
 Transformation of the point (7,3,2):
Development of intelligent systems, Transformations between coordinate frames
26
Transformation of the coordinate frame
 Homogeneous transformation matrix transforms the base coordinate frame
 Vector of origin of c.f.:
 Unit vectors:
Development of intelligent systems, Transformations between coordinate frames
27
Pose of the coordinate frame
 Unit vectors of the new coordinate frame:
 Transformation matrix
describes the coordinate frame!
Development of intelligent systems, Transformations between coordinate frames
28
Movement of the coordinate frame
 Premultiplication or postmultiplication (of an object or c.f.) with transformation
 Example:
 Coordinate frame:
 Transformation:
Development of intelligent systems, Transformations between coordinate frames
29
Premultiplication
 The pose of the object
is transformed with
respect to the fixed
reference coordinate
frame in which the
object coordinates
were given.
 Order of
transformations:
2.
1.
Development of intelligent systems, Transformations between coordinate frames
30
Postmultiplication
 The pose of the
object is
transformed with
respect to its own
relative current
coordinate frame
 Order of
transformations:
2.
1.
Development of intelligent systems, Transformations between coordinate frames
31
Movement of the reference c.f.
 Example: Trans(2,1,0)Rot(z,90)
Development of intelligent systems, Transformations between coordinate frames
32
Movement of the reference c.f.
 Example: Trans(2,1,0)Rot(z,90)
With respect to the reference coordinate frame:
Rot(z,90)
Trans(2,1,0)
With respect to the relative coordinate frame:
Trans(2,1,0)
Development of intelligent systems, Transformations between coordinate frames
Rot(z,90)
33
Package TF in ROS
 Maintenance of the coordinate frames through time
Development of intelligent systems, Transformations between coordinate frames
34
Conventions
 Right-handed coordinate frame
 Orientation of the robot or object axes
 x: forward
 y: left
 z: up
x
 Orientation of the camera axes
 z: forward
 x: right
 y: down
 Rotation representations
 quaternions
 rotation matrix
 rotations around X, Y and Z axes
 Euler angles
Development of intelligent systems, Transformations between coordinate frames
35
Coordinate frames on mobile platforms
 map (global map)
 world coordinate frame
 does not change (or very rarely)
 long-term reference
 useless in short-term
 odom (odometry)
 world coordinate frame
 changes with respect to odometry
 useless in long-term
 useful in short-term
 base_link (robot)
 attached to the robot
 robot coordinate frame
Development of intelligent systems, Transformations between coordinate frames
36
Tree of coordinate frames
 ROS TF2
 tree of coordinate frames and
their relative poses
 distributed representation
 dynamic representation
 changes through time
 accessible representation
 querying relations between arbitrary coordinate frames
Development of intelligent systems, Transformations between coordinate frames
37
Development of intelligent systems
(RInS)
Mobile robotics
Danijel Skočaj
University of Ljubljana
Faculty of Computer and Information Science
Slides: Wolfram Burgard, Cyrill Stachniss, Maren Bennewitz, Kai
Arras, UNI Freiburg, Introduction to Mobile Robotics
Academic year: 2021/22
Development of intelligent systems
Introduction
 Slides credit to:
Autonomous Intelligent Systems
Wolfram Burgard
Autonomous Intelligent Systems
Cyrill Stachniss
Humanoid Robots
Maren Bennewitz
Social Robotics
Kai Arras
Albert-Ludwigs-Universität, Freiburg, Germany
Development of intelligent systems, Mobile robotics
2
Introduction
 Wolfram Burgard,
Albert-Ludwigs-Universität Freiburg
 Sebastian Thrun, Wolfram Burgard
and Dieter Fox,Probabilistic
Robotics, The MIT Press, 2005
Development of intelligent systems, Mobile robotics
3
Introduction
 Course Introduction to Mobile Robotics – Autonomous
Mobile Systems at the Albert-Ludwigs-Universität Freiburg
Development of intelligent systems, Mobile robotics
4
Introduction
 Course Introduction to Mobile Robotics – Autonomous
Mobile Systems at the Albert-Ludwigs-Universität Freiburg
 This course:
Development of intelligent systems, Mobile robotics
5
Goal of this course

Provide an overview of problems /
approaches in mobile robotics

Probabilistic reasoning: Dealing with
noisy data

Hands-on experience
AI View on Mobile Robotics
Sensor data
Control system
World model
Actions
Components of Typical Robots
cameras
sensors
sonars
laser
base
actuators
8
Architecture of a Typical Control
System
Introduction to
Mobile Robotics
Robot Control Paradigms
Wolfram Burgard, Cyrill Stachniss,
Maren Bennewitz, Kai Arras
10
Classical / Hierarchical Paradigm
Sense
Plan
Act
 70’s
 Focus on automated reasoning and knowledge representation
 STRIPS (Stanford Research Institute Problem Solver): Perfect
world model, closed world assumption
 Find boxes and move them to designated position
11
Classical Paradigm
Stanford Cart
1. Take nine images of the environment, identify
interesting points in one image, and use other
images to obtain depth estimates.
2. Integrate information into global world model.
3. Correlate images with previous image set to
estimate robot motion.
4. On basis of desired motion, estimated motion,
and current estimate of environment, determine
direction in which to move.
5. Execute the motion.
12
Stanford Cart
13
Classical Paradigm as Horizontal/Functional Decomposition
Diagram: Sensing → Perception → Model → Plan → Execute → Motor Control → Action (the Sense–Plan–Act pipeline acting on the Environment)
14
Reactive / Behavior-based Paradigm
Sense
Act
 No models: The world is its own, best
model
 Easy successes, but also limitations
 Investigate biological systems
15
Reactive Paradigm as
Vertical Decomposition
Diagram: Sensing feeds parallel behaviors (Avoid obstacles, Wander, Explore, …) that directly produce Action in the Environment
16
Characteristics of Reactive
Paradigm
 Situated agent, robot is integral part of the
world.
 No memory, controlled by what is
happening in the world.
 Tight coupling between perception and
action via behaviors.
 Only local, behavior-specific sensing is
permitted (ego-centric representation).
17
Behaviors
 … are a direct mapping of sensory
inputs to a pattern of motor actions
that are then used to achieve a task.
 … serve as the basic building block for
robotics actions, and the overall
behavior of the robot is emergent.
 … support good software design
principles due to modularity.
18
Subsumption Architecture
 Introduced by Rodney Brooks ’86.
 Behaviors are networks of sensing and
acting modules (augmented finite
state machines AFSM).
 Modules are grouped into layers of
competence.
 Layers can subsume lower layers.
 No internal state!
19
Level 0: Avoid
Diagram: Sonar readings form a polar plot; Feel force produces a force that Run away turns into a heading for Turn and Forward; Collide issues halt.
20
Level 1: Wander
Diagram: Wander generates a heading that is combined (via a suppression node) with the Avoid / Run away heading from level 0; the Collide, Turn and Forward modules are reused.
21
Level 2: Follow Corridor
Diagram: Look finds the corridor and Stay in middle produces a heading to the middle that subsumes the Wander heading; Integrate tracks distance and direction traveled from the encoders; the level 0/1 modules (Feel force, Run away, Avoid, Collide, Turn, Forward) are reused.
22
Reactive Paradigm
 Representations?
 Good software engineering principles?
 Easy to program?
 Robustness?
 Scalability?
23
Hybrid Deliberative/reactive
Paradigm
Plan
Sense
Act
 Combines advantages of previous paradigms


World model used for planning
Closed loop, reactive control
24
Introduction to
Mobile Robotics
Probabilistic Motion Models
Wolfram Burgard, Cyrill Stachniss,
Maren Bennewitz, Kai Arras
25
Robot Motion
 Robot motion is inherently uncertain.
 How can we model this uncertainty?
26
Dynamic Bayesian Network for
Controls, States, and Sensations
27
Probabilistic Motion Models
 To implement the Bayes Filter, we need the
transition model p(x | x’, u).
 The term p(x | x’, u) specifies a posterior
probability, that action u carries the robot
from x’ to x.
 In this section we will specify, how
p(x | x’, u) can be modeled based on the
motion equations.
28
Coordinate Systems
 In general the configuration of a robot can be
described by six parameters.
 Three-dimensional Cartesian coordinates plus
three Euler angles pitch, roll, and tilt.
 Throughout this section, we consider robots
operating on a planar surface.
 The state space of such
systems is three-dimensional (x, y, θ).
29
Typical Motion Models
 In practice, one often finds two types of
motion models:
 Odometry-based
 Velocity-based (dead reckoning)
 Odometry-based models are used when
systems are equipped with wheel encoders.
 Velocity-based models have to be applied
when no wheel encoders are given.
 They calculate the new pose based on the
velocities and the time elapsed.
30
Example Wheel Encoders
These modules require
+5V and GND to power
them, and provide a 0 to
5V output. They provide
+5V output when they
"see" white, and a 0V
output when they "see"
black.
These disks are
manufactured out of high
quality laminated color
plastic to offer a very crisp
black to white transition.
This enables a wheel
encoder sensor to easily
see the transitions.
Source: http://www.active-robots.com/
31
Dead Reckoning
 Derived from “deduced reckoning.”
 Mathematical procedure for determining
the present location of a vehicle.
 Achieved by calculating the current pose of
the vehicle based on its velocities and the
time elapsed.
32
Reasons for Motion Errors
ideal case
bump
different wheel
diameters
carpet
and many more …
33
Odometry Model
• Robot moves from ⟨x, y, θ⟩ to ⟨x', y', θ'⟩.
• Odometry information u = ⟨δrot1, δrot2, δtrans⟩:
  δtrans = √((x' − x)² + (y' − y)²)
  δrot1 = atan2(y' − y, x' − x) − θ
  δrot2 = θ' − θ − δrot1
Noise Model for Odometry
 The measured motion is given by the true motion corrupted with noise:
  δ̂rot1 = δrot1 + ε(α1·|δrot1| + α2·|δtrans|)
  δ̂trans = δtrans + ε(α3·|δtrans| + α4·(|δrot1| + |δrot2|))
  δ̂rot2 = δrot2 + ε(α1·|δrot2| + α2·|δtrans|)
where ε(b) denotes zero-mean noise with variance b.
Typical Distributions for
Probabilistic Motion Models
Normal distribution:
  εσ²(x) = 1/√(2πσ²) · exp(−x²/(2σ²))
Triangular distribution:
  εσ²(x) = 0 if |x| > √(6σ²), otherwise (√(6σ²) − |x|) / (6σ²)
Application
 Repeated application of the motion model for short movements.
 Typical banana-shaped distributions obtained for the 2d-projection of the 3d posterior p(x|u,x').
Sample Odometry Motion Model
Algorithm sample_motion_model(u, x):
  u = ⟨δrot1, δrot2, δtrans⟩, x = ⟨x, y, θ⟩
  1. δ̂rot1 = δrot1 − sample(α1·|δrot1| + α2·δtrans)
  2. δ̂trans = δtrans − sample(α3·δtrans + α4·(|δrot1| + |δrot2|))
  3. δ̂rot2 = δrot2 − sample(α1·|δrot2| + α2·δtrans)
  4. x' = x + δ̂trans·cos(θ + δ̂rot1)
  5. y' = y + δ̂trans·sin(θ + δ̂rot1)
  6. θ' = θ + δ̂rot1 + δ̂rot2
  7. Return ⟨x', y', θ'⟩
where sample(b) draws from a zero-mean distribution (e.g., sample_normal_distribution)
Sampling from Our Motion
Model
Start
Examples (Odometry-Based)
Introduction to
Mobile Robotics
Probabilistic Sensor Models
Wolfram Burgard, Cyrill Stachniss, Maren
Bennewitz, Giorgio Grisetti, Kai Arras
41
Sensors for Mobile Robots
 Contact sensors: Bumpers
 Internal sensors
 Accelerometers (spring-mounted masses)
 Gyroscopes (spinning mass, laser light)
 Compasses, inclinometers (earth magnetic field, gravity)
 Proximity sensors
 Sonar (time of flight)
 Radar (phase and frequency)
 Laser range-finders (triangulation, tof, phase)
 Infrared (intensity)
 Visual sensors: Cameras
 Satellite-based sensors: GPS
42
Proximity Sensors
 The central task is to determine P(z|x), i.e., the
probability of a measurement z given that the robot
is at position x.
 Question: Where do the probabilities come from?
 Approach: Let’s try to explain a measurement.
43
Beam-based Sensor Model
 Scan z consists of K measurements.
z  {z1 , z2 ,..., z K }
 Individual measurements are independent
given the robot position.
K
P ( z | x , m )   P ( z k | x, m )
k 1
44
Beam-based Sensor Model
P(z | x, m) = ∏_{k=1..K} P(z_k | x, m)
45
Typical Measurement Errors of
Range Measurements
1. Beams reflected by
obstacles
2. Beams reflected by
persons / caused
by crosstalk
3. Random
measurements
4. Maximum range
measurements
46
Proximity Measurement
 Measurement can be caused by …
 a known obstacle.
 cross-talk.
 an unexpected obstacle (people, furniture, …).
 missing all obstacles (total reflection, glass, …).
 Noise is due to uncertainty …
 in measuring distance to known obstacle.
 in position of known obstacles.
 in position of additional obstacles.
 whether obstacle is missed.
47
Beam-based Proximity Model
Measurement noise:
  Phit(z | x, m) = η · 1/√(2πb) · exp(−(z − zexp)² / (2b)),  for 0 ≤ z ≤ zmax
Unexpected obstacles:
  Punexp(z | x, m) = η · λ · e^(−λz) if z ≤ zexp, 0 otherwise
48
Beam-based Proximity Model
Random measurement:
  Prand(z | x, m) = 1 / zmax
Max range:
  Pmax(z | x, m) = 1 / zsmall  (a narrow peak at z = zmax)
49
Resulting Mixture Density
P(z | x, m) = (αhit, αunexp, αmax, αrand)ᵀ · (Phit(z | x, m), Punexp(z | x, m), Pmax(z | x, m), Prand(z | x, m))
How can we determine the model parameters?
50
Raw Sensor Data
Measured distances for expected distance of 300 cm.
Sonar
Laser
51
Approximation
 Maximize log likelihood of the data
P ( z | zexp )
 Search space of n-1 parameters.




Hill climbing
Gradient descent
Genetic algorithms
…
 Deterministically compute the n-th
parameter to satisfy normalization
constraint.
52
Approximation Results
Laser
Sonar
300cm
400cm
53
Example
z
P(z|x,m)
54
Scan-based Model
 Probability is a mixture of …
 a Gaussian distribution with mean at distance to
closest obstacle,
 a uniform distribution for random
measurements, and
 a small uniform distribution for max range
measurements.
 Again, independence between different
components is assumed.
55
Example
Likelihood field
Map m
P(z|x,m)
56
San Jose Tech Museum
Occupancy grid map
Likelihood field
57
Scan Matching
 Extract likelihood field from scan and use it
to match different scan.
58
Scan Matching
 Extract likelihood field from first scan and
use it to match second scan.
~0.01 sec
59
Properties of Scan-based Model
 Highly efficient, uses 2D tables only.
 Smooth w.r.t. to small changes in robot
position.
 Allows gradient descent, scan matching.
 Ignores physical properties of beams.
 Will it work for ultrasound sensors?
60
Additional Models of Proximity
Sensors
 Map matching (sonar, laser): generate
small, local maps from sensor data and
match local maps against global model.
 Scan matching (laser): map is represented
by scan endpoints, match scan into this
map.
 Features (sonar, laser, vision): Extract
features such as doors, hallways from
sensor data.
61
Landmarks
 Active beacons (e.g., radio, GPS)
 Passive (e.g., visual, retro-reflective)
 Standard approach is triangulation
 Sensor provides
 distance, or
 bearing, or
 distance and bearing.
62
Distance and Bearing
63
Summary of Sensor Models
 Explicitly modeling uncertainty in sensing is key to
robustness.
 In many cases, good models can be found by the
following approach:
1. Determine parametric model of noise free measurement.
2. Analyze sources of noise.
3. Add adequate noise to parameters (eventually mix in
densities for noise).
4. Learn (and verify) parameters by fitting model to data.
5. Likelihood of measurement is given by “probabilistically
comparing” the actual with the expected measurement.
 This holds for motion models as well.
 It is extremely important to be aware of the
underlying assumptions!
64
Introduction to
Mobile Robotics
Mapping with Known Poses
Wolfram Burgard, Cyrill Stachniss,
Maren Bennewitz, Kai Arras
65
Why Mapping?
 Learning maps is one of the fundamental
problems in mobile robotics
 Maps allow robots to efficiently carry out
their tasks, allow localization …
 Successful robot systems rely on maps for
localization, path planning, activity
planning etc.
66
The General Problem of
Mapping
What does the
environment look like?
67
The General Problem of
Mapping
 Formally, mapping involves, given the
sensor data,
d  {u1 , z1 , u2 , z2 ,, un , zn }
to calculate the most likely map
m  arg max P(m | d )
*
m
68
Mapping as a Chicken and Egg
Problem
 So far we learned how to estimate the pose
of the vehicle given the data and the map.
 Mapping, however, involves to
simultaneously estimate the pose of the
vehicle and the map.
 The general problem is therefore denoted
as the simultaneous localization and
mapping problem (SLAM).
 Throughout this section we will describe
how to calculate a map given we know the
pose of the vehicle.
69
Types of SLAM-Problems
 Grid maps or scans
[Lu & Milios, 97; Gutmann, 98: Thrun 98; Burgard, 99; Konolige & Gutmann, 00; Thrun, 00; Arras, 99; Haehnel, 01;…]
 Landmark-based
[Leonard et al., 98; Castelanos et al., 99: Dissanayake et al., 2001; Montemerlo et al., 2002;…
70
Problems in Mapping
 Sensor interpretation
 How do we extract relevant information
from raw sensor data?
 How do we represent and integrate this
information over time?
 Robot locations have to be estimated
 How can we identify that we are at a
previously visited place?
 This problem is the so-called data
association problem.
71
Occupancy Grid Maps
 Introduced by Moravec and Elfes in 1985
 Represent environment by a grid.
 Estimate the probability that a location is
occupied by an obstacle.
 Key assumptions
 Occupancy of individual cells (m[xy]) is
independent
Bel(m_t) = P(m_t | u₁, z₂, …, u_{t−1}, z_t) = ∏_{x,y} Bel(m_t^[xy])
 Robot positions are known!
72
Incremental Updating
of Occupancy Grids (Example)
73
Resulting Map Obtained with
Ultrasound Sensors
74
75
Occupancy Grids: From scans to maps
76
Tech Museum, San Jose
CAD map
occupancy grid map
77
Introduction to
Mobile Robotics
Bayes Filter – Discrete Filters
Wolfram Burgard, Cyrill Stachniss,
Maren Bennewitz, Kai Arras
78
Probabilistic Localization
79
Grid-based Localization
80
Sonars and
Occupancy Grid Map
81
Introduction to
Mobile Robotics
Bayes Filter – Particle Filter
and Monte Carlo Localization
Wolfram Burgard, Cyrill Stachniss,
Maren Bennewitz, Kai Arras
82
Motivation
 Recall: Discrete filter
 Discretize the continuous state space
 High memory complexity
 Fixed resolution (does not adapt to the belief)
 Particle filters are a way to efficiently represent
non-Gaussian distribution
 Basic principle
 Set of state hypotheses (“particles”)
 Survival-of-the-fittest
83
Sample-based Localization (sonar)
Mathematical Description
 Set of weighted samples
State hypothesis
Importance weight
 The samples represent the posterior
85
Function Approximation
 Particle sets can be used to approximate functions
 The more particles fall into an interval, the higher
the probability of that interval
 How to draw samples from a function/distribution?
86
Rejection Sampling
 Let us assume that f(x) < 1 for all x
 Sample x from a uniform distribution
 Sample c from [0,1]
 If f(x) > c, keep the sample; otherwise reject the sample.
87
Particle Filters
Sensor Information: Importance Sampling
  Bel(x) ← α · p(z | x) · Bel⁻(x)
  w ← α · p(z | x) · Bel⁻(x) / Bel⁻(x) = α · p(z | x)
Robot Motion
  Bel⁻(x) ← ∫ p(x | u, x') · Bel(x') dx'
Particle Filter Algorithm
 Sample the next generation for particles using the
proposal distribution
 Compute the importance weights :
weight = target distribution / proposal distribution
 Resampling: “Replace unlikely samples by more
likely ones”
93
Particle Filter Algorithm
1. Algorithm particle_filter(S_{t−1}, u_{t−1}, z_t):
2.   S_t = ∅, η = 0
3.   For i = 1 … n                                  (generate new samples)
4.     Sample index j(i) from the discrete distribution given by w_{t−1}
5.     Sample x_t^i from p(x_t | x_{t−1}, u_{t−1}) using x_{t−1}^{j(i)} and u_{t−1}
6.     w_t^i = p(z_t | x_t^i)                        (compute importance weight)
7.     η = η + w_t^i                                 (update normalization factor)
8.     S_t = S_t ∪ {⟨x_t^i, w_t^i⟩}                  (insert)
9.   For i = 1 … n
10.    w_t^i = w_t^i / η                             (normalize weights)
94
Mobile Robot Localization
 Each particle is a potential pose of the robot
 Proposal distribution is the motion model of
the robot (prediction step)
 The observation model is used to compute
the importance weight (correction step)
[For details, see PDF file on the lecture web page]
95
Initial Distribution
114
After Incorporating Ten
Ultrasound Scans
115
After Incorporating 65 Ultrasound
Scans
116
Estimated Path
117
Summary – Particle Filters
 Particle filters are an implementation of
recursive Bayesian filtering
 They represent the posterior by a set of
weighted samples
 They can model non-Gaussian distributions
 Proposal to draw new samples
 Weight to account for the differences
between the proposal and the target
 Monte Carlo filter, Survival of the fittest,
Condensation, Bootstrap filter
118
Summary – PF Localization
 In the context of localization, the particles
are propagated according to the motion
model.
 They are then weighted according to the
likelihood of the observations.
 In a re-sampling step, new particles are
drawn with a probability proportional to the
likelihood of the observation.
119
Introduction to
Mobile Robotics
SLAM: Simultaneous Localization
and Mapping
Wolfram Burgard, Cyrill Stachniss,
Maren Bennewitz, Kai Arras
Slides by Kai Arras and Wolfram Burgard
Last update: June 2010
The SLAM Problem
SLAM is the process by which a robot builds
a map of the environment and, at the same
time, uses this map to compute its location
• Localization: inferring location given a map
• Mapping: inferring a map given a location
• SLAM: learning a map and locating the robot
simultaneously
121
The SLAM Problem
• SLAM is a chicken-or-egg problem:
→ A map is needed for localizing a robot
→ A pose estimate is needed to build a map
• Thus, SLAM is (regarded as) a hard problem in
robotics
122
The SLAM Problem
• SLAM is considered one of the most
fundamental problems for robots to become
truly autonomous
• A variety of different approaches to address the
SLAM problem have been presented
• Probabilistic methods rule
• History of SLAM dates back to the mid-eighties
(stone-age of mobile robotics)
123
The SLAM Problem
Given:
• The robot’s controls
• Relative observations
Wanted:
• Map of features
• Path of the robot
124
The SLAM Problem
• Absolute
robot pose
• Absolute
landmark positions
• But only relative
measurements of
landmarks
125
SLAM Applications
SLAM is central to a range of indoor,
outdoor, in-air and underwater applications
for both manned and autonomous vehicles.
Examples:
•At home: vacuum cleaner, lawn mower
•Air: surveillance with unmanned air vehicles
•Underwater: reef monitoring
•Underground: exploration of abandoned mines
•Space: terrain mapping for localization
126
SLAM Applications
Indoors
Undersea
Space
Underground
127
Map Representations
Examples:
Subway map, city map, landmark-based map
Maps are topological and/or metric
models of the environment
128
Map Representations
• Grid maps or scans, 2d, 3d
[Lu & Milios, 97; Gutmann, 98: Thrun 98; Burgard, 99; Konolige & Gutmann, 00; Thrun, 00; Arras,
99; Haehnel, 01;…]
• Landmark-based
[Leonard et al., 98; Castelanos et al., 99: Dissanayake et al., 2001; Montemerlo et al., 2002;…
129
Why is SLAM a hard problem?
1. Robot path and map are both unknown
2. Errors in map and pose estimates correlated
130
Why is SLAM a hard problem?
Robot pose
uncertainty
• In the real world, the mapping between
observations and landmarks is unknown
(origin uncertainty of measurements)
• Data Association: picking wrong data
associations can have catastrophic
consequences (divergence)
131
SLAM:
Simultaneous Localization And Mapping
• Full SLAM:
  p(x_{0:t}, m | z_{1:t}, u_{1:t})
  Estimates the entire path and map!
• Online SLAM:
  p(x_t, m | z_{1:t}, u_{1:t}) = ∫∫…∫ p(x_{1:t}, m | z_{1:t}, u_{1:t}) dx₁ dx₂ … dx_{t−1}
  Integrations (marginalization) typically done recursively, one at a time
  Estimates the most recent pose and map!
132
Graphical Model of Full SLAM
p(x_{1:t}, m | z_{1:t}, u_{1:t})
133
Graphical Model of Online SLAM
p(x_t, m | z_{1:t}, u_{1:t}) = ∫∫…∫ p(x_{1:t}, m | z_{1:t}, u_{1:t}) dx₁ dx₂ … dx_{t−1}
134
Graphical Model: Models
"Motion model"
"Observation model"
135
EKF SLAM: State representation
• Localization
3x1 pose vector
3x3 cov. matrix
• SLAM
Landmarks are simply added to the state.
Growing state vector and covariance matrix!
136
EKF SLAM: Building the Map
Filter Cycle, Overview:
1. State prediction (odometry)
2. Measurement prediction
3. Observation
4. Data Association
5. Update
6. Integration of new landmarks
137
EKF SLAM: Building the Map
• State Prediction
Odometry:
Robot-landmark crosscovariance prediction:
(skipping time index k)
138
EKF SLAM: Building the Map
• Measurement Prediction
Global-to-local
frame transform h
139
EKF SLAM: Building the Map
• Observation
(x,y)-point landmarks
140
EKF SLAM: Building the Map
• Data Association
Associates predicted
measurements
with observation
?
(Gating)
141
EKF SLAM: Building the Map
• Filter Update
The usual Kalman
filter expressions
142
EKF SLAM: Building the Map
• Integrating New Landmarks
State augmented by
Cross-covariances:
143
EKF SLAM
Map
Correlation matrix
144
EKF SLAM
Map
Correlation matrix
145
EKF SLAM
Map
Correlation matrix
146
SLAM: Loop Closure
• Loop closure is the problem of recognizing an
already mapped area, typically after a long
exploration path (the robot "closes a loop")
• Structurally identical to data association, but
• high levels of ambiguity
• possibly useless validation gates
• environment symmetries
• Uncertainties collapse after a loop closure
(whether the closure was correct or not)
147
SLAM: Loop Closure
• Before loop closure
148
SLAM: Loop Closure
• After loop closure
149
SLAM: Loop Closure
• By revisiting already mapped areas, uncertainties in robot and landmark estimates can be
reduced
• This can be exploited to "optimally" explore
an environment for the sake of better (e.g.
more accurate) maps
• Exploration: the problem of where to acquire
new information (e.g. depth-first vs. breadth
first)
→ See separate chapter on exploration
150
KF-SLAM Properties (Linear Case)
• The determinant of any sub-matrix of the map
covariance matrix decreases monotonically as
successive observations are made
• When a new landmark is initialized, its uncertainty is maximal
• Landmark uncertainty decreases monotonically with each new observation
[Dissanayake et al., 2001] 151
KF-SLAM Properties (Linear Case)
• In the limit, the landmark estimates become
fully correlated
[Dissanayake et al., 2001] 152
KF-SLAM Properties (Linear Case)
• In the limit, the covariance associated with any
single landmark location estimate is determined
only by the initial covariance in the vehicle
location estimate.
[Dissanayake et al., 2001] 153
EKF SLAM Example: Victoria Park
Sydney, Australia
154
Victoria Park: Landmarks
[courtesy by E. Nebot]
155
156
Victoria Park: Estimated Trajectory
[courtesy by E. Nebot]
157
Victoria Park: Landmarks
[courtesy by E. Nebot]
158
EKF SLAM Example: Tennis Court
[courtesy by J. Leonard]
159
EKF SLAM Example: Tennis Court
odometry
estimated trajectory
[courtesy by John Leonard]
160
EKF SLAM Example: Line Features
• KTH Bakery Data Set
[Wulf et al., ICRA 04]
161
EKF-SLAM: Complexity
• Cost per step: quadratic in n, the number of landmarks: O(n²)
• Total cost to build a map with n landmarks: O(n³)
• Memory: O(n²)
Problem: becomes computationally intractable for large maps!
 Approaches exist that make EKF-SLAM amortized O(n) / O(n²) / O(n²), e.g. D&C SLAM [Paz et al., 2006]
162
SLAM Techniques
• EKF SLAM
• FastSLAM
• Graphical SLAM
• Topological SLAM
(mainly place recognition)
• Scan Matching / Visual Odometry
(only locally consistent maps)
• Approximations for SLAM: Local submaps,
Sparse extended information filters, Sparse
links, Thin junction tree filters, etc.
163
Introduction to
Mobile Robotics
SLAM –
Grid-based FastSLAM
Wolfram Burgard, Cyrill Stachniss,
Maren Bennewitz, Kai Arras
164
Grid-based SLAM
 Can we solve the SLAM problem if no pre-defined
landmarks are available?
 Can we use the ideas of FastSLAM to build grid
maps?
 As with landmarks, the map depends on the poses
of the robot during data acquisition
 If the poses are known, grid-based mapping is easy
(“mapping with known poses”)
165
Mapping with Known Poses
 Mapping with known poses using laser range data
166
Rao-Blackwellization
poses
map
observations & movements
Factorization first introduced by Murphy in 1999
167
Rao-Blackwellization
poses
map
observations & movements
SLAM posterior
Robot path posterior
Mapping with known poses
Factorization first introduced by Murphy in 1999
168
Rao-Blackwellization
This is localization, use MCL
Use the pose estimate
from the MCL and apply
mapping with known poses
169
Rao-Blackwellized Mapping
 Each particle represents a possible trajectory of
the robot
 Each particle
 maintains its own map and
 updates it upon “mapping with known poses”
 Each particle survives with a probability
proportional to the likelihood of the observations
relative to its own map
170
Particle Filter Example
3 particles
map of particle 1
map of particle 2
map of particle 3
171
Problem
 Each map is quite big in case of grid maps
 Since each particle maintains its own map
 Therefore, one needs to keep the number
of particles small
 Solution:
Compute better proposal distributions!
 Idea:
Improve the pose estimate before applying
the particle filter
172
Pose Correction Using Scan
Matching
Maximize the likelihood of the i-th pose and
map relative to the (i-1)-th pose and map
x̂_t = argmax_{x_t} { p(z_t | x_t, m̂_{t−1}) · p(x_t | u_{t−1}, x̂_{t−1}) }
(current measurement and map constructed so far; robot motion)
173
Motion Model for Scan Matching
Raw Odometry
Scan Matching
174
FastSLAM with Improved
Odometry
 Scan-matching provides a locally
consistent pose correction
 Pre-correct short odometry sequences
using scan-matching and use them as
input to FastSLAM
 Fewer particles are needed, since the
error in the input is smaller
[Haehnel et al., 2003]
175
176
177
FastSLAM with Scan-Matching
178
Conclusion (so far…)
 The presented approach is a highly efficient
algorithm for SLAM combining ideas of scan
matching and FastSLAM
 Scan matching is used to transform sequences of
laser measurements into odometry measurements
 This version of grid-based FastSLAM can handle
larger environments than before in “real time”
179
What’s Next?
 Further reduce the number of particles
 Improved proposals will lead to more
accurate maps
 Use the properties of our sensor when
drawing the next generation of particles
180
Intel Lab
 15 particles
 four times faster
than real-time
P4, 2.8GHz
 5cm resolution
during scan
matching
 1cm resolution in
final map
181
Outdoor Campus Map
 30 particles
 250x250m2
 1.75 km
(odometry)
 20cm resolution
during scan
matching
 30cm resolution
in final map
182
MIT Killian Court
 The “infinite-corridor-dataset” at MIT
183
MIT Killian Court
184
MIT Killian Court - Video
185
Conclusion
 The ideas of FastSLAM can also be applied in the
context of grid maps
 Utilizing accurate sensor observation leads to
good proposals and highly efficient filters
 It is similar to scan-matching on a per-particle
base
 The number of necessary particles and
re-sampling steps can seriously be reduced
 Improved versions of grid-based FastSLAM can
handle larger environments than naïve
implementations in “real time” since they need
one order of magnitude fewer samples
186
More Details on FastSLAM
 M. Montemerlo, S. Thrun, D. Koller, and B. Wegbreit. FastSLAM: A factored solution to simultaneous localization and mapping, AAAI02
(The classic FastSLAM paper with landmarks)
 D. Haehnel, W. Burgard, D. Fox, and S. Thrun. An efficient FastSLAM algorithm for generating maps of large-scale cyclic environments from raw laser range measurements, IROS03
(FastSLAM on grid-maps using scan-matched input)
 G. Grisetti, C. Stachniss, and W. Burgard. Improving grid-based SLAM with Rao-Blackwellized particle filters by adaptive proposals and selective resampling, ICRA05
(Proposal using laser observation, adaptive resampling)
 A. Eliazar and R. Parr. DP-SLAM: Fast, robust simultaneous localization and mapping without predetermined landmarks, IJCAI03
(A representation to handle big particle sets)
187
Introduction to
Mobile Robotics
Robot Motion Planning
Wolfram Burgard, Cyrill Stachniss,
Maren Bennewitz, Kai Arras
Slides by Kai Arras Last update July 2011
With material from S. LaValle, JC. Latombe, H. Choset et al., W. Burgard
Robot Motion Planning
J.-C. Latombe (1991):
“…eminently necessary since, by definition,
a robot accomplishes tasks by moving in
the real world.”
Goals
 Collision-free trajectories
 Robot should reach the goal location
as fast as possible
194
Problem Formulation
 The problem of motion planning can be
stated as follows. Given:




 A start pose of the robot
 A desired goal pose
 A geometric description of the robot
 A geometric description of the world
 Find a path that moves the robot
gradually from start to goal while
never touching any obstacle
195
Problem Formulation
Motion planning is sometimes also called piano mover's problem
196
Piano mover's problem
197
Configuration Space
 Although the motion planning problem is
defined in the regular world, it lives in
another space: the configuration space
 A robot configuration q is a specification of
the positions of all robot points relative to
a fixed coordinate system
 Usually a configuration is expressed as a
vector of positions and orientations
198
Configuration Space
Rigid-body robot example
Robot
Reference direction
y

Reference point
x
 3-parameter representation: q = (x, y, θ)
 In 3D, q would be of the form (x, y, z, α, β, γ)
199
Configuration Space
 Example: circular robot
 C-space is obtained by sliding the robot
along the edge of the obstacle regions
"blowing them up" by the robot radius
200
Configuration Space
 Example: polygonal robot, translation only
 C-space is obtained by sliding the robot
along the edge of the obstacle regions
201
Configuration Space
 Example: polygonal robot, translation only
Work space
Configuration space
Reference point
 C-space is obtained by sliding the robot
along the edge of the obstacle regions
202
Configuration Space
 Example: polygonal robot, trans+rotation
 C-space is obtained by sliding the robot
along the edge of the obstacle regions
in all orientations
203
Configuration Space
Free space and obstacle region
 With
being the work space,
the set of obstacles,
the robot in
configuration
 We further define
 : start configuration
 : goal configuration
204
Configuration Space
Then, motion planning amounts to
 Finding a continuous path
with
 Given this setting,
we can do planning
with the robot being
a point in C-space!
205
C-Space Discretizations
 Continuous terrain needs to be
discretized for path planning
 There are two general approaches
to discretize C-spaces:
 Combinatorial planning
Characterizes Cfree explicitely by capturing the
connectivity of Cfree into a graph and finds
solutions using search
 Sampling-based planning
Uses collision-detection to probe and
incrementally search the C-space for solution
206
Combinatorial Planning
 We will look at four combinatorial
planning techniques




Visibility graphs
Voronoi diagrams
Exact cell decomposition
Approximate cell decomposition
 They all produce a road map
 A road map is a graph in Cfree in which each
vertex is a configuration in Cfree and each edge
is a collision-free path through Cfree
207
Combinatorial Planning
 Without loss of generality, we will consider
a problem in
with a point robot
that cannot rotate. In this case:
 We further assume a polygonal world
qG
qI
208
Visibility Graphs
 Idea: construct a path as a polygonal line
connecting qI and qG through vertices of Cobs
 Existence proof for such paths, optimality
 One of the earliest path planning methods
qG
qG
qI
qI
 Best algorithm: O(n2 log n)
209
Generalized Voronoi Diagram
 Defined to be the set of points q whose
cardinality of the set of boundary points of
Cobs with the same distance to q is greater
than 1
 Let us decipher
this definition...
 Informally:
the place with the
same maximal
clearance from
all nearest obstacles
qI'
qI
qG'
qG
210
Generalized Voronoi Diagram
 Geometrically: a point q off the diagram has one closest boundary point; a point q on the diagram has two (or more) closest boundary points, both at distance clearance(q)
 For a polygonal Cobs, the Voronoi diagram consists of O(n) line and parabolic segments
 Naive algorithm: O(n⁴), best: O(n log n)
211
Voronoi Diagram
 Voronoi diagrams have been well studied
for (reactive) mobile robot path planning
 Fast methods exist to compute and
update the diagram in real-time for lowdim. C's
 Pros: maximize clearance is a good idea for
an uncertain robot
 Cons: unnatural attraction to open space,
suboptimal paths
 Needs extensions
212
Exact Cell Decomposition
 Idea: decompose Cfree into non-overlapping
cells, construct connectivity graph to
represent adjacencies, then search
 A popular implementation of this idea:
1. Decompose Cfree into trapezoids with vertical
side segments by shooting rays upward and
downward from each polygon vertex
2. Place one vertex in the interior of every
trapezoid, pick e.g. the centroid
3. Place one vertex in every vertical segment
4. Connect the vertices
213
Exact Cell Decomposition
 Trapezoidal decomposition (panels (a)–(d) in the figure)
 Best known algorithm: O(n log n) where n is
the number of vertices of Cobs
214
Approximate Cell Decomposition
 Exact decomposition methods can be involved and inefficient for complex problems
 Approximate decomposition uses cells with
the same simple predefined shape
qI
qG
qG
qI
Quadtree decomposition
215
Approximate Cell Decomposition
 Exact decomposition methods can be involved and inefficient for complex problems
 Approximate decomposition uses cells with
the same simple predefined shape
 Pros:




Iterating the same simple computations
Numerically more stable
Simpler to implement
Can be made complete
216
Combinatorial Planning
Wrap Up
 Combinatorial planning techniques are
elegant and complete (they find a
solution if it exists, report failure otherwise)
 But: become quickly intractable when
C-space dimensionality increases (or n resp.)
 Combinatorial explosion in terms of
facets needed to represent Cfree, Cobs and C,
especially when rotations bring in nonlinearities
and make C a nontrivial manifold
➡
Use sampling-based planning
Weaker guarantees but more efficient
217
Sampling-Based Planning
 Abandon the concept of explicitly
characterizing Cfree and Cobs and leave the
algorithm in the dark when exploring Cfree
The only light is provided by a collision-detection algorithm that probes C to
see whether some configuration lies in Cfree
 We will have a look at
 Probabilistic road maps (PRM)
[Kavraki et al., 92]
 Rapidly exploring random trees (RRT)
[Lavalle and Kuffner, 99]
218
Probabilistic Road Maps
 Idea: Take random samples from C,
declare them as vertices if in Cfree, try to
connect nearby vertices with local planner
 The local planner checks if line-of-sight is
collision-free (powerful or simple methods)
 Options for "nearby": k-nearest neighbors
or all neighbors within a specified radius
 Configurations and connections are added
to graph until roadmap is dense enough
219
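To make this concrete, here is a minimal Python sketch of PRM construction in a 2D C-space. The `in_collision(q)` predicate, the sampling bounds and the connection radius are assumptions; a real planner would use a proper collision checker and a spatial index for the neighbour queries.

```python
import random, math

def prm(in_collision, n_samples=500, radius=0.2, bounds=(0.0, 1.0)):
    """Build a probabilistic road map as an adjacency list over sampled configurations."""
    lo, hi = bounds
    # Sample random configurations and keep those that lie in C_free
    vertices = []
    while len(vertices) < n_samples:
        q = (random.uniform(lo, hi), random.uniform(lo, hi))
        if not in_collision(q):
            vertices.append(q)

    def edge_free(a, b, steps=20):
        # Local planner: check the straight line a-b at a fixed resolution
        return all(not in_collision((a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1])))
                   for t in (i / steps for i in range(steps + 1)))

    edges = {i: [] for i in range(len(vertices))}
    for i, a in enumerate(vertices):
        for j, b in enumerate(vertices[i + 1:], start=i + 1):
            if math.dist(a, b) <= radius and edge_free(a, b):
                edges[i].append(j)
                edges[j].append(i)
    return vertices, edges
```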
Probabilistic Road Maps
 Example
[Figure: roadmap vertices connected to all neighbors within a specified radius; the local planner checks the straight-line connection]
What does "nearby" mean on a manifold?
Defining a good metric on C is crucial
220
Probabilistic Road Maps
Good and bad news:
 Pros:
 Probabilistically complete
 Do not construct C-space
 Apply easily to high-dim. C's
 PRMs have solved previously unsolved problems
[Figure: roadmap connecting qI and qG around several Cobs regions]
 Cons:
 Do not work well for some problems, narrow passages
 Not optimal, not complete
[Figure: narrow passage between Cobs regions that random sampling rarely hits]
221
Rapidly Exploring Random Trees
 Idea: aggressively probe and explore the
C-space by expanding incrementally
from an initial configuration q0
 The explored territory is marked by a
tree rooted at q0
[Figure: the tree after 45 iterations and after 2345 iterations]
222
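A matching sketch of the RRT expansion loop under the same assumptions (2D C-space, hypothetical `in_collision` predicate): sample a random configuration, find the nearest tree node, extend towards the sample by a fixed step, and keep the new node if it is collision-free.

```python
import random, math

def rrt(q_init, in_collision, n_iter=2000, step=0.05, bounds=(0.0, 1.0)):
    """Grow a rapidly exploring random tree rooted at q_init; returns nodes and parent links."""
    nodes, parent = [q_init], {0: None}
    lo, hi = bounds
    for _ in range(n_iter):
        q_rand = (random.uniform(lo, hi), random.uniform(lo, hi))
        # Nearest node already in the tree
        i_near = min(range(len(nodes)), key=lambda i: math.dist(nodes[i], q_rand))
        q_near = nodes[i_near]
        d = math.dist(q_near, q_rand)
        if d == 0.0:
            continue
        # Extend towards q_rand by at most one step
        t = min(step / d, 1.0)
        q_new = (q_near[0] + t * (q_rand[0] - q_near[0]),
                 q_near[1] + t * (q_rand[1] - q_near[1]))
        if not in_collision(q_new):
            parent[len(nodes)] = i_near
            nodes.append(q_new)
    return nodes, parent
```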
From Road Maps to Paths
 All methods discussed so far construct a
road map (without considering the query
pair qI and qG)
 Once the investment is made, the same
road map can be reused for all queries
(provided world and robot do not change)
1. Find the cells/vertices that contain or are close to qI
and qG (not needed for visibility graphs)
2. Connect qI and qG to the road map
3. Search the road map for a path from qI to qG
223
Sampling-Based Planning
Wrap Up
 Sampling-based planners are more efficient in
most practical problems but offer weaker
guarantees
 They are probabilistically complete: the
probability tends to 1 that a solution is found if
one exists (otherwise it may still run forever)
 Performance degrades in problems with narrow
passages. Subject of active research
 Widely used. Problems with high-dimensional
and complex C-spaces are still computationally
hard
224
Potential Field Methods
 All techniques discussed so far aim at capturing the connectivity of Cfree into a graph
 Potential Field methods follow a
different idea:
The robot, represented as a point in C, is
modeled as a particle under the influence
of an artificial potential field U
U superimposes
 Repulsive forces from obstacles
 Attractive force from goal
225
Potential Field Methods
 Potential function: U(q) = U_att(q) + U_rep(q)
 Simply perform gradient descent
 C-space is typically discretized in a grid
226
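As an illustration, a sketch of gradient descent on a discretized potential, assuming the attractive and repulsive terms have already been summed into a 2D NumPy array `U`: the robot greedily steps to the lowest-potential neighbouring cell until no neighbour improves, which is exactly where the local-minimum problem discussed on the next slide appears.

```python
import numpy as np

def descend_potential(U, start):
    """Greedy descent on a 2D potential grid; returns the visited cell path."""
    path = [start]
    r, c = start
    while True:
        # 8-connected neighbourhood, clipped to the grid
        neigh = [(r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                 if (dr, dc) != (0, 0)
                 and 0 <= r + dr < U.shape[0] and 0 <= c + dc < U.shape[1]]
        best = min(neigh, key=lambda p: U[p])
        if U[best] >= U[r, c]:          # stuck: local minimum (hopefully the goal)
            return path
        r, c = best
        path.append(best)
```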
Potential Field Methods
 Main problems: robot gets stuck in
local minima
 Way out: Construct local-minima-free
navigation function ("NF1"), then do
gradient descent (e.g. bushfire from goal)
 The gradient of the potential function
defines a vector field (similar to a policy)
that can be used as feedback control
strategy, relevant for an uncertain robot
 However, potential fields need to represent
Cfree explicitly. This can be too costly.
227
Robot Motion Planning
 Given a road map, let's do search!
228
A* Search
 A* is one of the most widely-known
informed search algorithms with many
applications in robotics
 Where are we?
A* is an instance of an informed
algorithm for the general problem of
search
 In robotics: planning on a
2D occupancy grid map is
a common approach
229
Search
The problem of search: finding a sequence
of actions (a path) that leads to desirable
states (a goal)
Uninformed search: besides the problem
definition, no further information about the
domain ("blind search")
The only thing one can do is to expand
nodes differently
Example algorithms: breadth-first, uniform-cost, depth-first, bidirectional, etc.
230
Search
The problem of search: finding a sequence
of actions (a path) that leads to desirable
states (a goal)
Informed search: further information
about the domain through heuristics
Capability to say that a node is "more
promising" than another node
Example algorithms: greedy best-first
search, A*, many variants of A*, D*, etc.
231
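A minimal sketch of A* on a 2D occupancy grid (0 = free, 1 = occupied) with 4-connectivity and the Manhattan distance as an admissible heuristic; `grid`, `start` and `goal` are placeholder names.

```python
import heapq

def astar(grid, start, goal):
    """A* on a 2D occupancy grid; returns the path as a list of cells, or None."""
    def h(p):                           # admissible heuristic: Manhattan distance
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    open_set = [(h(start), 0, start)]   # entries are (f, g, cell)
    came_from, g_cost = {}, {start: 0}
    while open_set:
        _, g, cur = heapq.heappop(open_set)
        if cur == goal:
            path = [cur]
            while cur in came_from:     # reconstruct the path backwards
                cur = came_from[cur]
                path.append(cur)
            return path[::-1]
        for d in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + d[0], cur[1] + d[1])
            if not (0 <= nxt[0] < len(grid) and 0 <= nxt[1] < len(grid[0])):
                continue
            if grid[nxt[0]][nxt[1]] == 1:            # occupied cell
                continue
            if g + 1 < g_cost.get(nxt, float("inf")):
                g_cost[nxt] = g + 1
                came_from[nxt] = cur
                heapq.heappush(open_set, (g + 1 + h(nxt), g + 1, nxt))
    return None
```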
Any-Angle A* Examples
 A* vs. Theta*
(len: path length, nhead = # heading changes)
[Figure: four example paths with len 30.0 / nhead 11, len 24.1 / nhead 9, len 28.9 / nhead 5 and len 22.9 / nhead 2]
232
D* Search
 Problem: In unknown, partially known or
dynamic environments, the planned path
may be blocked and we need to replan
 Can this be done efficiently, avoiding
replanning the entire path?
 Idea: Incrementally repair path keeping
its modifications local around robot pose
 Several approaches implement this idea:
 D* (Dynamic A*) [Stentz, ICRA'94, IJCAI'95]
 D* Lite [Koenig and Likhachev, AAAI'02]
 Field D* [Ferguson and Stentz, JFR'06]
233
D* Family
 D* Lite produces the same paths as D*
but is simpler and more efficient
 D*/D* Lite are widely used
 Field D* was running on Mars rovers
Spirit and Opportunity (retrofitted in yr 3)
Tracks left by a drive executed with Field D*
234
Still in Dynamic Environments...
 Do we really need to replan the entire path
for each obstacle on the way?
 What if the robot has to react quickly to
unforeseen, fast moving obstacles?
 Even D* Lite can be too slow in such a situation
 Accounting for the robot shape
(it's not a point)
 Accounting for kinematic and dynamic
vehicle constraints, e.g.
 Deceleration limits,
 Steering angle limits, etc.
235
Collision Avoidance
 This can be handled by techniques called
collision avoidance (obstacle avoidance)
 A well researched subject, different
approaches exist:
 Dynamic Window Approaches
[Simmons, 96], [Fox et al., 97], [Brock & Khatib, 99]
 Nearness Diagram Navigation
[Minguez et al., 2001, 2002]
 Vector-Field-Histogram+
[Ulrich & Borenstein, 98]
 Extended Potential Fields
[Khatib & Chatila, 95]
236
Collision Avoidance
 Integration into general motion planning?
 It is common to subdivide the problem into
a global and local planning task:
 An approximate global planner computes
paths ignoring the kinematic and dynamic
vehicle constraints
 An accurate local planner accounts for the
constraints and generates (sets of) feasible
local trajectories ("collision avoidance")
 What do we lose? What do we gain?
237
Two-layered Architecture
[Diagram: a planner runs at low frequency on the map and sends sub-goals to the collision avoidance module, which runs at high frequency on sensor data and sends motion commands to the robot]
238
Development of intelligent systems
(RInS)
Object detection in 3D
Danijel Skočaj
University of Ljubljana
Faculty of Computer and Information Science
Academic year: 2021/22
Development of intelligent systems
Detection of obstacles and objects
Development of intelligent systems, Object detection in 3D
2
3D perception
Development of intelligent systems, Object detection in 3D
3
Development of intelligent systems, Object detection in 3D
Radu Bogdan Rusu, Nico Blodow, Willow Garage, PCL
4
Detection of planes
Development of intelligent systems, Object detection in 3D
5
RANSAC
 Random Sample Consensus [Fischler, Bolles ’81]
Development of intelligent systems, Object detection in 3D
6
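The following NumPy sketch illustrates the RANSAC idea on plane fitting for an (N, 3) point cloud (the slides use the PCL implementation instead); the iteration count and distance threshold are arbitrary example values.

```python
import numpy as np

def ransac_plane(points, n_iter=200, dist_thresh=0.01):
    """Fit a plane (n, d) with n·p + d = 0 to an (N, 3) point cloud using RANSAC."""
    best_inliers, best_model = None, None
    rng = np.random.default_rng()
    for _ in range(n_iter):
        # 1. Sample a minimal set: three non-collinear points define a plane
        p1, p2, p3 = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(p2 - p1, p3 - p1)
        if np.linalg.norm(normal) < 1e-9:            # degenerate (collinear) sample
            continue
        normal = normal / np.linalg.norm(normal)
        d = -normal @ p1
        # 2. Count inliers: points within dist_thresh of the plane
        inliers = np.abs(points @ normal + d) < dist_thresh
        # 3. Keep the model with the largest consensus set
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers, best_model = inliers, (normal, d)
    return best_model, best_inliers
```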
RANSAC
Development of intelligent systems, Object detection in 3D
7
RANSAC
Development of intelligent systems, Object detection in 3D
8
RANSAC in PCL
Development of intelligent systems, Object detection in 3D
9
RANSAC in PCL
Development of intelligent systems, Object detection in 3D
10
Detection of cylinders
http://pointclouds.org/documentation/tutorials/cylinder_segmentation.php
Development of intelligent systems, Object detection in 3D
11
Detection of cylinders
 Noisy data
Development of intelligent systems, Object detection in 3D
12
Detection of objects
Development of intelligent systems, Object detection in 3D
13
Collision map
Development of intelligent systems, Object detection in 3D
14
Cylinder detection
Development of intelligent systems, Object detection in 3D
15
Ring detection
Development of intelligent systems, Object detection in 3D
16
Development of intelligent systems
(RInS)
Colours
Danijel Skočaj
University of Ljubljana
Faculty of Computer and Information Science
Literature: W. Burger, M. J. Burge (2008).
Digital Image Processing, chapter 12
Academic year: 2021/22
Development of intelligent systems
Colour images
Development of intelligent systems, Colours
2
Colour images
 Sometimes colours include
meaningful information!
Development of intelligent systems, Colours
3
RGB colour images
 RGB colour scheme encodes colours as combinations of three basic colours:
red, green and blue
 Very frequently used
 Additive colour system
Development of intelligent systems, Colours
4
RGB colour space
 Every colour is a point in the 3D RGB space
Development of intelligent systems, Colours
5
RGB channels
Development of intelligent systems, Colours
6
Conversion to grayscale images
 Simple conversion:
 The human eye perceives red and green as brighter than blue, hence we can use the
weighted average:
 Grayscale RGB images have all three components equal:
Development of intelligent systems, Colours
7
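As a concrete example, the weighted average can be applied directly to an (H, W, 3) RGB array; the weights below are the commonly used luminance coefficients (0.299, 0.587, 0.114), which may differ slightly from the exact constants on the slide.

```python
import numpy as np

def rgb_to_gray(img):
    """Convert an (H, W, 3) RGB image to grayscale with a perceptual weighted average."""
    weights = np.array([0.299, 0.587, 0.114])    # red, green, blue contributions
    return img[..., :3] @ weights
```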
HSV colour space
 Hue, Saturation, Value
Development of intelligent systems, Colours
8
HSV channels
Development of intelligent systems, Colours
9
Conversion from RGB to HSV
255
Development of intelligent systems, Colours
10
Algorithm
Development of intelligent systems, Colours
11
Conversion from HSV to RGB
256
Development of intelligent systems, Colours
12
Examples
Development of intelligent systems, Colours
13
Other colour spaces
 HLS
 TV colour spaces
 YUV
 YIQ
 YCbCr
 Colour spaces for print
 CMY
 CMYK
 Colorimetric colour spaces
 CIE XYZ
 CIE YUV, YU‘V‘, L*u*v, YCbCr
 CIE L*a*b*
 sRGB
Development of intelligent systems, Colours
14
3D colour histograms
 3 components -> 3D histogram
 High space complexity, „sparse“
Development of intelligent systems, Colours
15
1D colour histograms
 1D histograms of the individual components
 Do not model correlations between individual colour components
Development of intelligent systems, Colours
16
2D colour histograms
 Calculate pairs of 2D histograms
 Encompass at least a partial correlation between the individual components
Development of intelligent systems, Colours
17
Algorithm
Development of intelligent systems, Colours
18
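A minimal NumPy sketch of a 2D colour histogram over two channels of the same image (e.g. the H and S channels of an HSV image); the bin count and value range are example values.

```python
import numpy as np

def hist2d(channel_a, channel_b, bins=16, value_range=(0, 256)):
    """Normalized 2D histogram over a pair of colour channels."""
    h, _, _ = np.histogram2d(channel_a.ravel(), channel_b.ravel(),
                             bins=bins, range=[value_range, value_range])
    return h / h.sum()   # normalize so regions of different sizes are comparable
```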
Examples
Development of intelligent systems, Colours
19
Object colours
 Rings of different colours
 Cylinders of different colours
Development of intelligent systems, Colours
20
Colour recognition
 Detect and segment the object
 in 2D or 3D
 Modelling colours
 Probability distribution
 Gaussian,
mixture of Gaussians
 Train a classifier
 SVM, ANN, kNN,…
 In 1D, 2D or 3D space
 RGB, HSV and other colour
spaces
 Working with the individual
pixels or histograms
 Working with images
Development of intelligent systems, Colours
21
Development of intelligent systems
(RInS)
Robot manipulation
Danijel Skočaj
University of Ljubljana
Faculty of Computer and Information Science
Literature: Tadej Bajd (2006) Osnove robotike, poglavje 4
Anže Rezelj (2017) Razvoj nizkocenovnega
lahkega robotskega manipulatorja
Academic year: 2021/22
Development of intelligent systems
Robot manipulator
 Industrial robot as defined by the standard ISO 8373:
An automatically controlled, reprogrammable,
multipurpose manipulator programmable in three or more
axes, which can be either fixed in place or mobile for use
in industrial automation applications.
Development of intelligent systems, Robot manipulation
2
Characteristics
 Closed-loop control
 Electrical or hydraulic motors
 Sensors
 Proprioceptive: rotation encoders, measurement of distance, speed
 Exteroceptive: tactile sensors, robot vision
 Reprogrammable:
 designed so that the programmed motions or auxiliary functions can be changed
without physical alteration
 Multipurpose:
 capable of being adapted to a different application with physical alteration
 Physical alteration:
 alteration of the mechanical system
 fixed or mobile robots
 Axis: direction used to specify the robot motion in a linear or rotary mode
 3 or more DOF
Development of intelligent systems, Robot manipulation
3
Robot manipulator
 Arm+wrist+end effector/gripper
 6DOF – can put an object in an arbitrary pose
 arm positions the object into the desired position
 wrist rotates it into the desired orientation
 gripper grasps the object
Development of intelligent systems, Robot manipulation
4
Robot arm
 Serial chain of three links
 connected by joints
 Revolute/rotational joint
 Prismatic/translational joint
Development of intelligent systems, Robot manipulation
5
Robot arm types
 Joints
 Revolute/rotational
 Prismatic/translational
 Axes of two neighbouring links
 Parallel
 Perpendicular
 3DOF
 In practice typically five different arms:
 Anthropomorphic
 Spherical
 SCARA
 Cylindrical
 Cartesian
Development of intelligent systems, Robot manipulation
6
Anthropomorphic robot arm
 Three rotational joints (RRR)
 Workspace: sphere-like
 Resembles a human arm
Development of intelligent systems, Robot manipulation
7
Spherical robot arm
 Two rotational, one translational joint (RRT)
 Workspace: sphere-like
Development of intelligent systems, Robot manipulation
8
SCARA robot arm
 Selective Compliance Assembly Robot Arm
 Two rotational, one translational joint (RRT)
 Workspace: cylinder-like
Development of intelligent systems, Robot manipulation
9
Cylindrical robot arm
 One rotational, two translational joints (RTT)
 Workspace: cylinder
Development of intelligent systems, Robot manipulation
10
Cartesian robot arm
 Three translational joints (TTT)
 Workspace: cuboid
Development of intelligent systems, Robot manipulation
11
Robot wrist
 Rotates the object in an arbitrary orientation
 Three rotational joints (RRR)
 Sometimes also one or two suffice
 Links should be as short as possible
Development of intelligent systems, Robot manipulation
12
Robot end-effector
 The final link of the robot manipulator
 Grippers with fingers
 With two fingers
 With more than two fingers
 Other type of grippers
 Vacuum
 Magnetic
 Perforation
 Other tools as end-effectors
 Welding gun
 Spray painting gun
Development of intelligent systems, Robot manipulation
13
Robot workspace
 Reachable workspace
 The end-effector can reach every
point in this space
 Dexterous workspace
 The end-effector can reach every
point in this space from an
arbitrary orientation
Development of intelligent systems, Robot manipulation
14
Kinematics
 Base coordinate frame [X1,Y1,Z1]
 Usually also world coordinate frame
 Used for defining the robotic task
 End-effector reference frame [Xm,Ym,Zm]
 End-effector position
 Vector between the origins of the coordinate frames
 Object orientation
 Three angles
 Internal robot coordinates / joint variables
 Joint states (angles, translations)
 Uniquely describe the pose of the robot
 Direct kinematics
 Determine the external robot coordinates from the internal coordinates
 Inverse kinematics
 Determine the internal robot coordinates from the external coordinates
Development of intelligent systems, Robot manipulation
15
Geometrical robot model
 Robot manipulator = a serial chain of segments connected by joints
 Every joint can be either rotational or translational
 1DOF – 1 internal coordinate
 Geometrical robot model describes
 the pose of the last segment of the robot (end-effector) expressed in the reference
(base) frame
 depending on the current internal coordinates
Development of intelligent systems, Robot manipulation
16
Geometrical robot model
 Geometrical model can be expressed by a homogeneous transformation:
 p : position of the end effector in the reference coordinate frame
 n, s, a : unit vectors of the
end-effector coordinate frame:
 a: approach
 s: sliding
 n: normal
 q : vector of internal coordinates
Development of intelligent systems, Robot manipulation
17
Poses of segments
 Every joint connects two neighbouring segments/links
 Determine the transformation between them
 Recursively build the full model for the entire robot
 Coordinate frames can be arbitrarily attached to the individual segments
 Denavit – Hartenberg rules simplify computation of the geometrical robot model
 Determine the pose of the i-th c.f. with respect to the pose of the (i-1)-th c.f.
 Axis i connects segments (i-1) and i
Development of intelligent systems, Robot manipulation
18
Denavit – Hartenberg rules
 Describe the coordinate frame of the i-th segment (having the joint i+1):
1. Define the axis zi through the axis of the joint i+1
2. Find the common normal, perpendicular to the axes zi-1 and zi
 Position the origin Oi into the intersection of the axis zi with the common normal
 Position the origin Oi' into the intersection of the axis zi-1 with the common normal
 If the axes are parallel, position the origin anywhere
3. Position the axis xi on the common normal in such a way that it is oriented from joint i
towards joint i+1
 If the axes zi-1 and zi intersect, orient the axis xi perpendicular to the plane defined by the
axes zi-1 and zi
4. Determine the axis yi in a way that gives a right-handed c.f.
 Similarly we also describe (have already described) the coordinate frame of the
segment (i-1)
 The origin Oi-1 is determined by the intersection of the common normal of the axes i-1 and i
 The axis zi-1 is oriented along the i-th axis
 xi-1 is oriented along the common normal and directed from the joint i-1 towards the joint i
Development of intelligent systems, Robot manipulation
19
Graphical illustration of DH parameters
Development of intelligent systems, Robot manipulation
20
Denavit – Hartenberg parameters
 The pose of i-th c.f. with respect to (i-1)-th c.f. is determined by 4 parameters:
1. ai – distance between Oi and Oi' along xi
2. di – distance between Oi-1 and Oi' along zi-1
3. αi – angle between zi-1 and zi around xi
4. θi – angle between xi-1 and xi around zi-1
Development of intelligent systems, Robot manipulation
21
Denavit – Hartenberg parameters
Development of intelligent systems, Robot manipulation
22
Denavit – Hartenberg parameters
 ai and αi are always constant
 They depend on the geometry of the robot, the links between the joints, etc.
 They do not change during the operation of the robot
 One of the two remaining parameters is a variable
 Θi, if the i-th joint is rotational
 di, if the i-th joint is translational
Development of intelligent systems, Robot manipulation
23
Denavit – Hartenberg parameters
 Illustration
 r=a
 n=i
Wikipedia
Development of intelligent systems, Robot manipulation
24
Denavit – Hartenberg parameters
 Video:
http://en.wikipedia.org/wiki/Denavit-Hartenberg_Parameters
Development of intelligent systems, Robot manipulation
25
Exceptions
 Some exceptions in certain situations can be used to simplify
the process:
 Axis zi and zi-1 are parallel -> di=0
 Axis zi and zi-1 intersect -> Oi is in the intersection
 In case of the base (0-th) segment: only the axis z0 is defined
-> put the origin of O0 in the first joint
-> align x0 and x1
 In case of end-effector (n-th c.f.):
Only axis xn is defined; it is perpendicular to zn-1
-> zn should be parallel to zn-1
 In case of translational joint:
-> orient the axis zi-1 in the direction of translation
-> position Oi-1 at the beginning of translation
Development of intelligent systems, Robot manipulation
26
Denavit – Hartenberg transformation
 Transformation between the i-th and (i-1)-th c.f.:
1. Take the c.f. Oi-1 attached to the segment (i-1)
2. Translate it by di and rotate it by θi
along and around zi-1,
to align it with the c.f. Oi'
3. Translate the c.f. Oi' by ai and rotate it by αi
along and around xi',
to align it with the c.f. Oi
4. DH transformation is obtained
by postmultiplication of both
transformation matrices

Function of a single variable:


Θi for rotational joint
di for translational joint
Development of intelligent systems, Robot manipulation
27
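A sketch of the resulting transformation as code, using the common convention A_i = Rot(z, θ_i) · Trans(z, d_i) · Trans(x, a_i) · Rot(x, α_i); chaining these matrices over all joints gives the geometrical model described on the next slide. The function and variable names are illustrative.

```python
import numpy as np

def dh_matrix(theta, d, a, alpha):
    """Homogeneous transform from frame (i-1) to frame i for one set of DH parameters."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [ct, -st * ca,  st * sa, a * ct],
        [st,  ct * ca, -ct * sa, a * st],
        [0.0,      sa,       ca,      d],
        [0.0,     0.0,      0.0,    1.0],
    ])

def forward_kinematics(dh_table):
    """Chain the per-joint transforms to obtain the end-effector pose in the base frame."""
    T = np.eye(4)
    for theta, d, a, alpha in dh_table:     # one row of (theta, d, a, alpha) per joint
        T = T @ dh_matrix(theta, d, a, alpha)
    return T
```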
Calculation of the geometrical robot model
1. Set the coordinate frames for all segments
2. Define the table of DH parameters
3. Calculate the DH transformations for all segments
4. Calculate the geometrical model as the product of the DH transformations
Development of intelligent systems, Robot manipulation
28
Using geometrical model
 Geometrical robot model gives the pose of the last segment of the robot (end-effector) expressed in the reference (base) frame
 Geometrical robot model defines the pose (position and orientation) of the end-effector depending on the current internal coordinates q
Development of intelligent systems, Robot manipulation
29
Anthropomorphic robot manipulator
Development of intelligent systems, Robot manipulation
30
Anthropomorphic robot manipulator
Development of intelligent systems, Robot manipulation
31
Stanford robot manipulator
Development of intelligent systems, Robot manipulation
32
Stanford robot manipulator
Development of intelligent systems, Robot manipulation
33
Spherical robot wrist
 Usually attached to the end of the robot arm
 All three axes of the rotational joints
intersect in the same point
Development of intelligent systems, Robot manipulation
34
Spherical robot wrist
Development of intelligent systems, Robot manipulation
35
Stanford manipulator with the wrist
Development of intelligent systems, Robot manipulation
36
Stanford manipulator with the wrist
Development of intelligent systems, Robot manipulation
37
Inverse kinematics model
 Direct kinematics defines the pose of the end-effector depending on the internal
coordinates

 Where will the end-effector move
T(q)
 The pose of the end-effector is uniquely determined
Inverse kinematics defines the internal coordinates that would bring the robot end-effector into the desired pose
 How to move the end-effector to reach the desired pose
 Challenging problem:
q(T)
 Nonlinear equations
 The solution is not uniquely defined
 Several solutions
 Sometimes even infinite number of solutions
 Sometimes the solution does not exist
 Take into account several criteria that determine which solution is optimal
 Sometimes we can get an analytical solution, sometimes only numerical solutions are possible
Development of intelligent systems, Robot manipulation
38
ViCoS LCLWOS robot manipulator
Development of intelligent systems, Robot manipulation
39
Requirements
Development of intelligent systems, Robot manipulation
40
Related work
Development of intelligent systems, Robot manipulation
41
Robot manipulator
Development of intelligent systems, Robot manipulation
42
Forward model
DH parameters:
Development of intelligent systems, Robot manipulation
43
Transformation
Development of intelligent systems, Robot manipulation
44
Frame
 3D printed
Development of intelligent systems, Robot manipulation
45
Frame parts
Development of intelligent systems, Robot manipulation
46
Motors
 Servo motors
Development of intelligent systems, Robot manipulation
47
Motors
Development of intelligent systems, Robot manipulation
48
Upgrading servomotors
 New control circuit
 Potentiometer
 OpenServo
 Current protection
Development of intelligent systems, Robot manipulation
49
PID controller
 Proportional, integral and derivative part
Development of intelligent systems, Robot manipulation
50
AD converter
 Increasing resolution
 Multiple sampling and decimation
 Resolution increased from 10 to 12 bits
 Sampling frequency 256 Hz
Development of intelligent systems, Robot manipulation
51
Communication
 I2C interface
 I2C bus
 OpenServo
 Communication with microcontroller
 OpenServoRobot
 Robot model (DH parameters)
 Communication with application
Development of intelligent systems, Robot manipulation
52
Workspace
Development of intelligent systems, Robot manipulation
53
Theoretical accuracy
 Expected deviation of the estimated position from the reference position of the
end effector
 Only considering motor errors
 Theoretical upper limit of precision
Development of intelligent systems, Robot manipulation
54
Theoretical accuracy
Development of intelligent systems, Robot manipulation
55
Theoretical accuracy
Development of intelligent systems, Robot manipulation
56
Empirical repeatability
Development of intelligent systems, Robot manipulation
57
Calibration
Grasping a cube (100 times):
Mean error:
Development of intelligent systems, Robot manipulation
58
Characteristics
Development of intelligent systems, Robot manipulation
59
Integration in ROS
Development of intelligent systems, Robot manipulation
60
Integration in Manus
Development of intelligent systems, Robot manipulation
61
Integration with programming languages
 Matlab
 Python
 Blockly
Development of intelligent systems, Robot manipulation
62
Communication using VGA and USB cable
 VGA cable and I2C protocol
 USB port
Development of intelligent systems, Robot manipulation
63
Registration with camera
Development of intelligent systems, Robot manipulation
64
Augmented reality
Development of intelligent systems, Robot manipulation
65
Multi-level teaching approach
Development of intelligent systems, Robot manipulation
66
Video
Development of intelligent systems, Robot manipulation
67
Mobile manipulation
Development of intelligent systems, Robot manipulation
68
Vision-based control
Moving the end effector to
the desired pose determined
by the pose of an object
 Camera on the robot
 Fixed camera
Development of intelligent systems, Robot manipulation
69
Vision-based control
 Position-based servoing
 Explicit control
 In world coordinate frame
 Image-based servoing
 Implicit control
 In image coordinate frame
Development of intelligent systems, Robot manipulation
70
Vision-based control
Development of intelligent systems, Robot manipulation
71
Vision-based control
 Mobile manipulation
 Joint control of mobile robot and robot manipulator
Development of intelligent systems, Robot manipulation
72
Development of intelligent systems
(RInS)
Object recognition with
Convolutional Neural Networks
Danijel Skočaj
University of Ljubljana
Faculty of Computer and Information Science
Academic year: 2021/22
Media hype
Development of intelligent systems, Object recognition with CNNs
2
Superior performance
 1k categories, 1.3M images, top-5 classification
[Chart: ILSVRC top-5 classification error by year: 28.2 (2010), 25.8 (2011), 16.4 (2012), 11.7 (2013), 6.7 (2014), 3.6 (2015), 3.1 (2016), 2.2 (2017); the deep learning era begins in 2012]
Development of intelligent systems, Object recognition with CNNs
3
New deep learning era
 More data!
 More computing power - GPU!
 Better learning algorithms!
Development of intelligent systems, Object recognition with CNNs
4
New deep learning era
ICCV 2019, Seoul, Korea, 27. 10. - 2. 11. 2019
Development of intelligent systems, Object recognition with CNNs
5
Machine learning in computer vision
 Conventional approach
[Pipeline: image → feature extraction (PCA, LDA, CCA, HOG, SIFT, SURF, ORB, …) → features → classification / learning (kNN, SVM, ANN, AdaBoost, …) → model → class]
Development of intelligent systems, Object recognition with CNNs
6
Deep learning in computer vision
 Conventional machine learning approach in computer vision
[Pipeline: image → feature extraction → features → classification / learning → model → class]
 Deep learning approach
[Pipeline: image → deep learning → deep model → class]
Development of intelligent systems, Object recognition with CNNs
7
Deep learning – the main concept
[Figure: end-to-end mapping from an image directly to labels such as Person, Bike, Car]
Development of intelligent systems, Object recognition with CNNs
8
End to end learning
 Representations as well
as classifier are being
learned
Development of intelligent systems, Object recognition with CNNs
9
Perceptron
 Rosenblatt, 1957
 Binary inputs and output
 Weights
 Threshold
 Bias
 Very simple!
Development of intelligent systems, Object recognition with CNNs
10
Sigmoid neurons
 Real inputs and outputs from interval [0,1]
 Activation function: the sigmoid function
 output = σ(w·x + b) = 1 / (1 + exp(-Σj wj xj - b))
Development of intelligent systems, Object recognition with CNNs
11
Sigmoid neurons
 Small changes in weights and biases cause a small change in the output
 Enables learning!
Development of intelligent systems, Object recognition with CNNs
12
Feedforward neural networks
 Network architecture:
Development of intelligent systems, Object recognition with CNNs
13
Example: recognizing digits
 MNIST database of handwritten digits
 28x28 pixels (= 784 input neurons)
 10 digits
 50,000 training images
 10,000 validation images
 10,000 test images
Development of intelligent systems, Object recognition with CNNs
14
Example code: Feedforward
 Code from https://github.com/mnielsen/neural-networks-and-deep-learning/archive/master.zip
or https://github.com/mnielsen/neural-networks-and-deep-learning
git clone https://github.com/mnielsen/neural-networks-and-deep-learning.git
 or https://github.com/chengfx/neural-networks-and-deep-learning-for-python3 (for Python 3)
Development of intelligent systems, Object recognition with CNNs
15
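Loosely following the structure of the referenced network.py, a minimal sketch of the feedforward pass, assuming the weights and biases are stored as lists of NumPy arrays with shapes (n_out, n_in) and (n_out, 1).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def feedforward(a, weights, biases):
    """Propagate a column-vector input a through all layers of the network."""
    for w, b in zip(weights, biases):
        a = sigmoid(w @ a + b)
    return a
```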
Loss function
 Given: the desired output y(x) for all training images x
 Loss function: C(w, b) = 1/(2n) Σx ||y(x) - a||²
 (mean square error – quadratic loss function)
 Find weights w and biases b that for a given input x
produce an output a that minimizes the loss function C
Development of intelligent systems, Object recognition with CNNs
16
Gradient descent
 Find the minimum of C(v)
 Change of C: ΔC ≈ ∇C · Δv
 Gradient of C: ∇C = (∂C/∂v1, …, ∂C/∂vm)ᵀ
 Change v in the opposite
direction of the gradient: Δv = -η ∇C (η is the learning rate)
 Algorithm:
 Initialize v
 Until the stopping criterion is reached
 Apply the update rule v → v' = v - η ∇C
Development of intelligent systems, Object recognition with CNNs
17
Gradient descent in neural networks
 Loss function C(w, b)
 Update rules:
wk → wk' = wk - η ∂C/∂wk
bl → bl' = bl - η ∂C/∂bl
 Consider all training samples
 Very many parameters
=> computationally very expensive
 Use stochastic gradient descent instead
Development of intelligent systems, Object recognition with CNNs
18
Stochastic gradient descent
 Compute the gradient only for a subset of m training samples:
 Mini-batch: X1, X2, …, Xm
 Approximate gradient: ∇C ≈ (1/m) Σj ∇CXj
 Update rules:
wk → wk' = wk - (η/m) Σj ∂CXj/∂wk
bl → bl' = bl - (η/m) Σj ∂CXj/∂bl
 Training:
1. Initialize w and b
2. In one epoch of training keep randomly selecting one mini-batch of m samples at a
time (and train) until all training images are used
3. Repeat for several epochs
Development of intelligent systems, Object recognition with CNNs
19
Example code: SGD
Development of intelligent systems, Object recognition with CNNs
20
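A minimal sketch of the mini-batch SGD loop described above; it assumes a `backprop(x, y)` function (sketched a few slides later) that returns the per-sample gradient lists, and it mirrors the idea of the referenced example code rather than reproducing it.

```python
import random

def sgd(training_data, epochs, mini_batch_size, eta, weights, biases, backprop):
    """Stochastic gradient descent: update (weights, biases) on random mini-batches."""
    for _ in range(epochs):
        random.shuffle(training_data)
        for k in range(0, len(training_data), mini_batch_size):
            batch = training_data[k:k + mini_batch_size]
            grad_w = [0.0 * w for w in weights]
            grad_b = [0.0 * b for b in biases]
            for x, y in batch:
                dw, db = backprop(x, y)                 # per-sample gradients
                grad_w = [gw + d for gw, d in zip(grad_w, dw)]
                grad_b = [gb + d for gb, d in zip(grad_b, db)]
            # Average the gradient over the mini-batch and take one step
            weights[:] = [w - (eta / len(batch)) * gw for w, gw in zip(weights, grad_w)]
            biases[:] = [b - (eta / len(batch)) * gb for b, gb in zip(biases, grad_b)]
    return weights, biases
```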
Backpropagation
 All we need is the gradient of the loss function
 Rate of change of C wrt. a change in any weight
 Rate of change of C wrt. a change in any bias
 How to compute the gradient?
 Numerically
 Simple, approximate, extremely slow 
 Analytically for the entire C
 Fast, exact, intractable 
 Chain individual parts of the network
 Fast, exact, doable 
Backpropagation!
Development of intelligent systems, Object recognition with CNNs
21
Main principle
 We need the gradient of the Loss function
 Two phases:
 Forward pass; propagation: the input sample is propagated through the network and
the error at the final layer is obtained
 Backward pass; weight update: the error is backpropagated to the individual levels,
the contribution of the individual neuron to the error is calculated and the weights are
updated accordingly
Development of intelligent systems, Object recognition with CNNs
22
Learning strategy
 To obtain the gradient of the loss function ∇C:
 For every neuron in the network calculate the error δ of this neuron
 This error propagates through the network and causes the final error
 Backpropagate the final error to get all δ
 Obtain all ∂C/∂w and ∂C/∂b from δ
Development of intelligent systems, Object recognition with CNNs
23
Equations of backpropagation
 BP1: Error in the output layer: δ^L = ∇_a C ⊙ σ'(z^L)
 BP2: Error in terms of the error in the next layer: δ^l = ((w^(l+1))ᵀ δ^(l+1)) ⊙ σ'(z^l)
 BP3: Rate of change of the cost wrt. any bias: ∂C/∂b^l_j = δ^l_j
 BP4: Rate of change of the cost wrt. any weight: ∂C/∂w^l_jk = a^(l-1)_k δ^l_j
Development of intelligent systems, Object recognition with CNNs
24
Backpropagation algorithm
 Input x: Set the corresponding activation a^1 for the input layer
 Feedforward: For each l = 2, 3, …, L compute z^l = w^l a^(l-1) + b^l and a^l = σ(z^l)
 Output error δ^L: Compute δ^L = ∇_a C ⊙ σ'(z^L)
 Backpropagate the error:
For each l = L-1, …, 2 compute δ^l = ((w^(l+1))ᵀ δ^(l+1)) ⊙ σ'(z^l)
 Output the gradient: ∂C/∂w^l_jk = a^(l-1)_k δ^l_j and ∂C/∂b^l_j = δ^l_j
Development of intelligent systems, Object recognition with CNNs
25
Backpropagation and SGD
For a number of epochs
  Until all training images are used
    Select a mini-batch of m training samples
    For each training sample x in the mini-batch
      Input: set the corresponding activation a^(x,1)
      Feedforward: for each l = 2, …, L compute z^(x,l) and a^(x,l)
      Output error: compute δ^(x,L)
      Backpropagation: for each l = L-1, …, 2 compute δ^(x,l)
    Gradient descent: for each l update w^l and b^l using the gradients averaged over the mini-batch
Development of intelligent systems, Object recognition with CNNs
26
Example code: Backpropagation
Development of intelligent systems, Object recognition with CNNs
27
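A hedged sketch of equations BP1-BP4 for the quadratic loss (so ∇_a C = a - y), again following the structure of the referenced example code; `weights` and `biases` are lists of NumPy arrays as in the earlier sketches.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def backprop(x, y, weights, biases):
    """Return (dC/dw, dC/db) for one training sample using BP1-BP4 with quadratic loss."""
    activations, zs = [x], []
    a = x
    for w, b in zip(weights, biases):            # forward pass, storing z and a per layer
        z = w @ a + b
        zs.append(z)
        a = sigmoid(z)
        activations.append(a)

    grad_w = [np.zeros_like(w) for w in weights]
    grad_b = [np.zeros_like(b) for b in biases]

    delta = (activations[-1] - y) * sigmoid_prime(zs[-1])     # BP1
    grad_b[-1] = delta                                        # BP3
    grad_w[-1] = delta @ activations[-2].T                    # BP4
    for l in range(2, len(weights) + 1):                      # backward pass
        delta = (weights[-l + 1].T @ delta) * sigmoid_prime(zs[-l])   # BP2
        grad_b[-l] = delta
        grad_w[-l] = delta @ activations[-l - 1].T
    return grad_w, grad_b
```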
Local computation
[Figure: at each node of the computational graph, the forward pass propagates values and the backward pass propagates gradients]
Development of intelligent systems, Object recognition with CNNs
28
Activation and loss functions
Activation function -> matching loss function:
 Linear -> Quadratic
 Sigmoid -> Binary cross-entropy
 Softmax -> Categorical cross-entropy
 Other -> Custom
Development of intelligent systems, Object recognition with CNNs
29
Activation functions
 Sigmoid: σ(x) = 1 / (1 + e^(-x))
 tanh: tanh(x)
 ReLU: max(0, x)
 Leaky ReLU: max(0.1x, x)
 ELU
Development of intelligent systems, Object recognition with CNNs
Slide credit: Fei-Fei Li, Andrej Karpathy, Justin Johnson
30
Overfitting
 Huge number of parameters
-> danger of overfitting
 Use validation set to determine
overfitting and early stopping
 Hold out method
[Figure: accuracy curves showing overfitting and the early-stopping point, for 1,000 and for 50,000 MNIST training images]
Development of intelligent systems, Object recognition with CNNs
31
Regularization
 How to avoid overfitting:
 Increase the number of training images 
 Decrease the number of parameters 
 Regularization 
 Regularization:
 L2 regularization
 L1 regularization
 Dropout
 Data augmentation
Development of intelligent systems, Object recognition with CNNs
32
Regularisation
 How to avoid overfitting:
 Increase the number of training images 
 Decrease the number of parameters 
 Regularization 
 Data Augmentation
 L1 regularisation
 L2 regularisation
 Dropout
 Batch Normalization
 DropConnect [Wan et al. 2013]
 Fractional Max Pooling [Graham, 2014]
 Stochastic Depth [Huang et al. 2016]
 Cutout / Random Crop
 Mixup
Development of intelligent systems, Object recognition with CNNs
33
Data augmentation
 Use more data!
Development of intelligent systems, Object recognition with CNNs
 Synthetically generate new data
 Apply different kinds of transformations:
translations, rotations, elastic distortions,
appearance modifications (intensity, blur)
 Operations should reflect real-world
variation
34
Parameter updates
 Different schemes for gradient-based parameter updates
 Gradient descent
 Momentum update
 Nesterov momentum
 AdaGrad update
 RMSProp update
 Adam update
 Learning rate decay
Image credits: Alec Radford
Development of intelligent systems, Object recognition with CNNs
35
Setting up the network
 Set up the network
 Coarse-fine cross-validation in stages
 Only a few epochs to get a rough idea
 Even on a smaller problem
to speed up the process
 Longer running time, finer search,…
 Cross-validation strategy
 Check various parameter settings
 Always sample parameters
 Check the results, adjust the range
 Hyperparameters to play with:
 network architecture
 learning rate, its decay schedule, update type
 regularization (L2/Dropout strength)…
 Run multiple validations simultaneously
 Actively observe the learning progress
Development of intelligent systems, Object recognition with CNNs
Slide credit: Fei-Fei Li, Andrej Karpathy, Justin Johnson
36
Convolutional neural networks
 From feedforward fully-connected neural networks
 To convolutional neural networks
Development of intelligent systems, Object recognition with CNNs
37
Convolution
 Convolution operation: s(t) = (x * w)(t) = ∫ x(a) w(t - a) da
 Discrete convolution: s(t) = Σ_a x(a) w(t - a)
 Two-dimensional convolution: S(i, j) = (I * K)(i, j) = Σ_m Σ_n I(m, n) K(i - m, j - n)
 Convolution is commutative (the kernel is flipped): S(i, j) = (K * I)(i, j) = Σ_m Σ_n I(i - m, j - n) K(m, n)
 Cross-correlation (no kernel flip): S(i, j) = Σ_m Σ_n I(i + m, j + n) K(m, n)
Development of intelligent systems, Object recognition with CNNs
38
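A direct (slow) sketch of the 2D operation with a 'valid' output size; note that, like most deep learning frameworks, it implements cross-correlation, i.e. the kernel is not flipped.

```python
import numpy as np

def conv2d(image, kernel):
    """Naive 2D cross-correlation: out[i, j] = sum_m sum_n image[i+m, j+n] * kernel[m, n]."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out
```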
Convolutional neural networks
 Data in vectors, matrices, tensors
 Neighbourhood, spatial arrangement
 2D: Images, time-frequency representations
 1D: sequential signals, text, audio, speech, time series,…
 3D: volumetric images, video, 3D grids
Development of intelligent systems, Object recognition with CNNs
39
Convolution layer
[Figure: a single filter convolved (*) with the input volume produces one activation map]
Development of intelligent systems, Object recognition with CNNs
40
Convolution layer
[Figure: several filters convolved (*) with the input volume, each followed by the nonlinearity σ, produce a stack of activation maps]
Development of intelligent systems, Object recognition with CNNs
41
Sparse connectivity
 Local connectivity – neurons are only locally connected (receptive field)
 Reduces memory requirements
 Improves statistical efficiency
 Requires fewer operations
The receptive field of the
units in the deeper layers
is large
=> Indirect connections!
from below
Development of intelligent systems, Object recognition with CNNs
from above
42
Parameter sharing
 Neurons share weights!
 Tied weights
 Every element of the kernel is used
at every position of the input
 All the neurons at the same level detect
the same feature (everywhere in the input)
 Greatly reduces the number of parameters!
 Equivariance to translation
 Shift, convolution = convolution, shift
 Object moves => representation moves
 Fully connected network with an infinitely strong prior over its weights
 Tied weights
 Weights are zero outside the kernel region
=> learns only local interactions and is equivariant to translations
Development of intelligent systems, Object recognition with CNNs
43
Convolutional neural network
[From recent Yann
LeCun slides]
Development of intelligent systems, Object recognition with CNNs
Slide credit: Fei-Fei Li, Andrej Karpathy, Justin Johnson
44
Convolutional neural network
[Figure: one filter => one activation map; example 5x5 filters (32 total) applied to the input image]
Development of intelligent systems, Object recognition with CNNs
Slide credit: Fei-Fei Li, Andrej Karpathy, Justin Johnson
45
Pooling layer
 makes the representations smaller and more manageable
 operates over each activation map independently
 downsampling
Example: Max pooling
Development of intelligent systems, Object recognition with CNNs
Slide credit: Fei-Fei Li, Andrej Karpathy, Justin Johnson
46
Pooling
 Max pooling introduces translation
invariance
Development of intelligent systems, Object recognition with CNNs
 Pooling with downsampling
 Reduces the representation size
 Reduces computational cost
 Increases statistical efficiency
47
CNN layers
 Layers used to build ConvNets:
 INPUT:
raw pixel values
 CONV:
convolutional layer
 (BN: batch normalisation)
 (ReLU:)
introducing nonlinearity
 POOL:
downsampling
 FC:
for computing class scores
 SoftMax
Development of intelligent systems, Object recognition with CNNs
48
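As an illustration of stacking these layers (CONV, BN, ReLU, POOL, FC), a hedged PyTorch sketch with hypothetical sizes (32x32 RGB input, 10 classes); the networks used in the course may differ. The final softmax is left to the loss function (e.g. CrossEntropyLoss), so the model outputs class scores.

```python
import torch.nn as nn

class SmallConvNet(nn.Module):
    """INPUT -> [CONV -> BN -> ReLU -> POOL] x 2 -> FC (class scores)."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(2),                      # 32x32 -> 16x16
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(2),                      # 16x16 -> 8x8
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))
```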
CNN architecture
 Stack the layers in an appropriate order
Babenko et. al.
Hu et. al.
Development of intelligent systems, Object recognition with CNNs
49
CNN architecture
Development of intelligent systems, Object recognition with CNNs
Slide credit: Fei-Fei Li, Andrej Karpathy, Justin Johnson
50
Typical solution
Development of intelligent systems, Object recognition with CNNs
51
Network architecture
 Training the model
 Inference
Development of intelligent systems, Object recognition with CNNs
52
Example implementation in TensorFlow
Segmentation network
Classification network
Development of intelligent systems, Object recognition with CNNs
53
Case study – LeNet-5
[LeCun et al., 1998]
Conv filters were 5x5, applied at stride 1
Subsampling (Pooling) layers were 2x2 applied at stride 2
i.e. architecture is [CONV-POOL-CONV-POOL-CONV-FC]
Development of intelligent systems, Object recognition with CNNs
Slide credit: Fei-Fei Li, Andrej Karpathy, Justin Johnson
54
Case study - AlexNet
[Krizhevsky et al. 2012]
http://fromdata.org/2015/10/01/imagenet-cnn-architecture-image/
[Architecture: INPUT - CONV1 - POOL1 - NORM1 - CONV2 - POOL2 - NORM2 - CONV3 - CONV4 - CONV5 - POOL3 - FC6 - FC7 - FC8]
Development of intelligent systems, Object recognition with CNNs
Slide credit: Fei-Fei Li, Andrej Karpathy, Justin Johnson
55
Case study - VGGNet
[Simonyan and Zisserman, 2014]
best model
Only 3x3 CONV stride 1, pad 1
and 2x2 MAX POOL stride 2
11.2% top 5 error in ILSVRC 2013
-> 7.3% top 5 error
Development of intelligent systems, Object recognition with CNNs
Slide credit: Fei-Fei Li, Andrej Karpathy, Justin Johnson
56
Case study - GoogLeNet
[Szegedy et al., 2014]
Inception module
ILSVRC 2014 winner (6.7% top 5 error)
Development of intelligent systems, Object recognition with CNNs
Slide credit: Fei-Fei Li, Andrej Karpathy, Justin Johnson
57
Case study - ResNet
spatial dim. only 56x56!
Development of intelligent systems, Object recognition with CNNs
Batch Normalization after every
CONV layer
Xavier/2 initialization from He et al.
SGD + Momentum (0.9)
Learning rate: 0.1, divided by 10
when validation error plateaus
Mini-batch size 256
Weight decay of 1e-5
No dropout used
ILSVRC 2015 winner (3.6% top 5 error)
Slide credit: Fei-Fei Li, Andrej Karpathy, Justin Johnson
58
DenseNet
 Densely Connected
Convolutional Networks
Huang et al. 2017
 Every layer connected to
every other layer in a
feed-forward fashion
 Dense connectivity
 Model compactness
 Strong gradient flow
 Implicit deep supervision
 Feature reuse
Development of intelligent systems, Object recognition with CNNs
59
MobileNets
 Efficient Convolutional Neural Networks for Mobile Applications
Howard et al. 2017
 Efficient models for mobile and embedded vision applications
 Depthwise separable convolution:
 Depthwise convolution
 Pointwise (1x1) convolution
 MobileNetV2: Inverted Residuals and Linear Bottlenecks
Sandler et al. 2018
 MobileNetV3: NAS+ NetAdapt
Development of intelligent systems, Object recognition with CNNs
Howard et al. 2019
60
NASNet
 Neural Architecture Search
[Zoph et al. 2018]
 Search the space of architectures to find the optimal one given the available resources
 500 GPUs across 4 days resulting in 2,000 GPU-hours on NVidia P100
Available operations to select from:
Best convolutional cells (NASNet-A) for CIFAR-10
 Other architecture search methods:
 AmoebaNet, Real et al., 2018
 MoreMNAS, Chu et. al, 2019, …
Development of intelligent systems, Object recognition with CNNs
61
EfficientNet
 Compound scaling of the network depth, width and input resolution
[Tan and Le, 2019]
Development of intelligent systems, Object recognition with CNNs
62
Architectures overview
 Date of publication, main type
[Hoeser and Kuenzer, 2020]
Development of intelligent systems, Object recognition with CNNs
63
Analysis of DNN models
[Canziani et al., 2017]
Development of intelligent systems, Object recognition with CNNs
64
Pretrained models
Development of intelligent systems, Object recognition with CNNs
65
Transformers
[Vaswani et.al, NIPS 2017]
Development of intelligent systems, Object recognition with CNNs
[Khan et.al, 2021]
66
ViT - Vision Transformer
 AN IMAGE IS WORTH 16X16 WORDS:
TRANSFORMERS FOR IMAGE RECOGNITION AT SCALE
[Dosovitskiy et.al,
Google, 2020,
ICLR 2021]
Development of intelligent systems, Object recognition with CNNs
67
Transfer learning
 If you don‘t have enough data use pretrained models!
1. Train on
Imagenet
2. Small dataset:
feature extractor
3. Medium dataset:
finetuning
more data = retrain more
of the network (or all of it)
[Figure: freeze the pretrained layers and train only the new top layer(s); with more data, finetune more of the network]
tip: use only ~1/10th of the original learning rate when finetuning the top layer,
and ~1/100th on intermediate layers
Development of intelligent systems, Object recognition with CNNs
Slide credit: Fei-Fei Li, Andrej Karpathy, Justin Johnson
68
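A hedged PyTorch sketch of the recipe above: load an ImageNet-pretrained backbone, freeze it, and retrain only a new top layer. The torchvision ResNet-18 and the `weights="IMAGENET1K_V1"` argument assume a recent torchvision (older versions use `pretrained=True`), and the learning rate is an example value.

```python
import torch
import torchvision

def build_finetune_model(num_classes):
    """Load an ImageNet-pretrained backbone, freeze it, and replace the classifier head."""
    model = torchvision.models.resnet18(weights="IMAGENET1K_V1")
    for p in model.parameters():
        p.requires_grad = False                 # freeze the pretrained feature extractor
    model.fc = torch.nn.Linear(model.fc.in_features, num_classes)   # new trainable head
    return model

model = build_finetune_model(num_classes=3)
# Only the new head is optimized; with more data, unfreeze deeper layers with a smaller lr
optimizer = torch.optim.SGD(model.fc.parameters(), lr=1e-3, momentum=0.9)
```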
Two stage object detection and recognition
[Pipeline: a very fast, efficient face detection stage (HOG+SVM, AdaBoost, SSD) proposes face regions; a face recognition stage (PCA/LDA, CNN), which can be slower and computationally more complex, identifies the person (e.g. "Scarlet")]
Development of intelligent systems, Object recognition with CNNs
69
Object detection and recognition
 Two stage approach:
 Detection of region proposals
 Recognition of the individual region proposals
Development of intelligent systems, Object recognition with CNNs
70
Object detection in RInS
 Information in circles
 -> detecting circles as region
proposals (Regions Of Interest)
 Rectify ROIs
 Recognize the content of ROIs
[Figure: circular ROIs containing digits (e.g. 2 and 6)]
 Rectification using homography
Development of intelligent systems, Object recognition with CNNs
71
Homography
 Two views on the same (planar) object:
 Homography: plane to plane mapping
Slide credit: Matej Kristan
Development of intelligent systems, Object recognition with CNNs
72
Computing homography
 Four corresponding points: x ↔ x'
 w x' = H x, i.e.
w [x'; y'; 1] = [[H11, H12, H13], [H21, H22, H23], [H31, H32, H33]] [x; y; 1]
 The elements of the matrix 𝑯 can be computed using Direct
Linear Transform (DLT)!
Slide credit: Matej Kristan
Development of intelligent systems, Object recognition with CNNs
73
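As an illustration, a sketch of estimating H and rectifying a region with OpenCV; the point coordinates and the file name are hypothetical. For four exact correspondences `cv2.findHomography` solves the DLT system directly; with more, noisy matches it can also use RANSAC.

```python
import numpy as np
import cv2

image = cv2.imread("frame.png")   # hypothetical input frame

# Four corresponding points (hypothetical pixel coordinates): corners of the detected
# region in the image and the corners of the rectified square we map it to
src = np.array([[120, 80], [260, 95], [250, 230], [110, 215]], dtype=np.float32)
dst = np.array([[0, 0], [200, 0], [200, 200], [0, 200]], dtype=np.float32)

H, _ = cv2.findHomography(src, dst)                       # estimate the 3x3 homography
rectified = cv2.warpPerspective(image, H, (200, 200))     # rectify the ROI to a square
```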
Application of homography
Homography mapping between part of the image and a square
Flagellation of Christ (Piero della Francesca)
Slide credit: Antonio Criminisi
Development of intelligent systems, Object recognition with CNNs
74
Two-stage detectors
First stage: Run once per image
- Backbone network
- Region proposal network
Second stage: Run once per region
- Crop features: RoI pool / align
- Predict object class
- Predict bbox offset
Ren et al, “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks”, NIPS 2015
Figure copyright 2015, Ross Girshick; reproduced with permission
Development of intelligent systems, Object recognition with CNNs
Slide credit: Fei-Fei Li, Ranjay Krishna, Danfei Xu
75
Single-Stage Object Detectors
Input image
3 xH xW
Redmon et al, “You Only Look Once:
Unified, Real-Time Object Detection”, CVPR 2016
Liu et al, “SSD: Single-Shot MultiBox Detector”, ECCV 2016
Lin et al, “Focal Loss for Dense Object Detection”, ICCV 2017
Divide image into grid
7x7
Imagine a set of base boxes
centered at each grid cell
Here B = 3
Development of intelligent systems, Object recognition with CNNs
Within each grid cell:
- Regress from each of the B
base boxes to a final box
with 5 numbers:
(dx, dy, dh, dw, confidence)
- Predict scores for each of C
classes (including
background as a class)
- Looks a lot like RPN, but
category-specific!
Output:
7 x 7 x (5 * B + C)
Slide credit: Fei-Fei Li, Ranjay Krishna, Danfei Xu
76
SSD: Single Shot MultiBox Detector
 Multi-scale feature maps for
detection
 Convolutional predictors for
detection
 Default boxes and aspect ratios
 Real time operation
[Liu et al., ECCV 2016]
Development of intelligent systems, Object recognition with CNNs
77
Wide usability of ConvNets
Detection
Instance segmentation
[Redmon, Yolo, 2018]
[Liu, SSD, 2015]
[He,
Mask R-CNN,
2017]
Development of intelligent systems, Object recognition with CNNs
78
Wide usability of ConvNets
Semantic segmentation
[Farabet et al., 2012]
[Chen, DeepLab
2017]
Development of intelligent systems, Object recognition with CNNs
79
Wide usability of ConvNets
 Image segmentation
[Jonson, 2016]
[Caicedo, 2018]
[Marmanis, 2016]
Development of intelligent systems, Object recognition with CNNs
80
Wide usability of ConvNets
 Action
recognition
[Donahue
et al.
2016]
[Luvizon et al. 2016]
[Simonyan et al. 2014]
Development of intelligent systems, Object recognition with CNNs
81
Wide usability of ConvNets
 Biometry
[Taigman et al. 2014]
[Najibi,
SSH,
2017]
[Emeršič,
2017]
Development of intelligent systems, Object recognition with CNNs
82
Wide usability of ConvNets
 Person/pose detection
[Güler, DensePose, 2018]
[Cao 2017]
Development of intelligent systems, Object recognition with CNNs
83
Wide usability of ConvNets
 Reinforcement learning for game playing
[Google DeepMind]
Development of intelligent systems, Object recognition with CNNs
84
Wide usability of ConvNets
Image
Captioning
[Vinyals et al., 2015]
Development of intelligent systems, Object recognition with CNNs
Slide credit: Fei-Fei Li, Andrej Karpathy, Justin Johnson
85
Surface-defect detection
Segmentation-based data-driven
surface-defect detection
Development of intelligent systems, Object recognition with CNNs
86
Surface-defect detection
Development of intelligent systems, Object recognition with CNNs
87
Surface-defect detection
Development of intelligent systems, Object recognition with CNNs
88
Polyp counting
Development of intelligent systems, Object recognition with CNNs
89
Polyp counting
Development of intelligent systems, Object recognition with CNNs
90
Ship detection
Development of intelligent systems, Object recognition with CNNs
91
Face detection
Development of intelligent systems, Object recognition with CNNs
92
Mask-wearing detection
Development of intelligent systems, Object recognition with CNNs
93
Obstacle detection on autonomous boat
USV equipped with
different sensors:
• stereo camera
• IMU
• GPS
• compass
Segmentation based on
RGB + IMU
Development of intelligent systems, Object recognition with CNNs
94
Semantic edge detection
Development of intelligent systems, Object recognition with CNNs
95
Object (traffic sign) detection
Development of intelligent systems, Object recognition with CNNs
96
Object (traffic sign) detection
Development of intelligent systems, Object recognition with CNNs
97
Image anonymisation
 Detection and
anonymisation of car
plates and faces
Development of intelligent systems, Object recognition with CNNs
98
Visual tracking
Development of intelligent systems, Object recognition with CNNs
99
Plank classification
Development of intelligent systems, Object recognition with CNNs
100
Place recognition
Development of intelligent systems, Object recognition with CNNs
101
Semantic segmentation
Development of intelligent systems, Object recognition with CNNs
102
Image enhancement
 Deblurring, super-resolution
Development of intelligent systems, Object recognition with CNNs
103
Development of intelligent systems, Object recognition with CNNs
104
Development of intelligent systems, Object recognition with CNNs
105
Deep reinforcement learning
 Automatic generation of
learning examples
 Goal-driven map-less
mobile robot navigation
Development of intelligent systems, Object recognition with CNNs
106
Innate and learned
 Goal-driven map-less mobile robot navigation
 Constraining the problem using a priori knowledge
Engineering
approach
Engineering approach +
deep learning
Development of intelligent systems, Object recognition with CNNs
Pure learning
107
Problem solving
 Different problem complexities
[Diagram: as complexity grows from simple, well-defined problems to complex, vaguely defined problems, the approach shifts from rule-based decision making (programming) to data-driven decision making (machine learning)]
Development of intelligent systems, Object recognition with CNNs
108
Problem solving
[Diagram: spectrum of increasing complexity: routine solutions, rule-based solutions, data-driven solutions, general intelligence]
Development of intelligent systems, Object recognition with CNNs
109
Adequate tools
[Diagram: the same spectrum: routine solutions, rule-based solutions, data-driven solutions, general intelligence]
Development of intelligent systems, Object recognition with CNNs
110
Development, deployment and maintenance
• Data, data, data!
• Enough data, representative data
• Correctly annotated data
• Appropriate deep architecture design
• Proper backbone, architecture, loss function, …
• Learning, parameter optimisation
• Efficient implementation
• Execution speed
• Integration
• Maintenance
• Incremental improvement of the learned model
• Reflecting to changes in the environment
Development of intelligent systems, Object recognition with CNNs
111
Development of deep learning solutions
[Chart: % of the solution vs. time; roughly 80% of the solution is reached in 20% of the time, while the remaining 20% takes 80% of the time (80:20? 90:10? 99:1? 60:40?)]
Development of intelligent systems, Object recognition with CNNs
112
Knowledge and experience count
Development of intelligent systems, Object recognition with CNNs
113
Software
 Neural networks in Python
 Convolutional neural networks using PyTorch or TensorFlow
 or other deep learning frameworks
 Optionally use Google Colab
Development of intelligent systems, Object recognition with CNNs
114
Literature
 Michael A. Nielsen, Neural Networks and Deep learning,
Determination Press, 2015
http://neuralnetworksanddeeplearning.com/index.html
 Ian Goodfellow and Yoshua Bengio and Aaron Courville,
Deep Learning, MIT Press, 2016
http://www.deeplearningbook.org/
 Fei-Fei Li, Andrej Karpathy, Justin Johnson, CS231n: Convolutional Neural
Networks for Visual Recognition, Stanford University, 2016
http://cs231n.stanford.edu/
 Papers
Development of intelligent systems, Object recognition with CNNs
115
Development of intelligent systems
(RInS)
Cognitive robot systems
Danijel Skočaj
University of Ljubljana
Faculty of Computer and Information Science
Academic year: 2021/2022
Development of intelligent systems, Cognitive robot systems
Robotics
 Routine industrial robotic system
EURON video
EURON video
 Intelligent artificial visual cognitive system
Development of intelligent systems, Cognitive robot systems
2
Cognitive robot systems
[Diagram: cognitive robots positioned between industrial robots and humans (and SF) with respect to capabilities such as perception, action, attention, goals, planning, reasoning, communication and learning]
Development of intelligent systems, Cognitive robot systems
3
Cognitive robotics

Wikipedia:
Cognitive robotics is concerned with endowing robots with mammalian and
human-like cognitive capabilities to enable the achievement of complex goals
in complex environments. Robotic cognitive capabilities include perception
processing, attention allocation, anticipation, planning, reasoning about
other agents, and perhaps reasoning about their own mental states. Robotic
cognition embodies the behaviour of intelligent agents in the physical
world.
 A cognitive robot should exhibit:
 knowledge
 beliefs
 preferences
 goals
 informational attitudes
 motivational attitudes (observing, communicating, revising beliefs, planning)
Development of intelligent systems, Cognitive robot systems
4
Researchers‘ definitions
Cognition is the ability to relate perception and action in a meaningful way
determined by experience, learning and memory. Mike Denham
A cognitive system possesses the ability of self-reflection (or at least self-awareness). Horst Bischof
Cognition is gaining knowledge through the senses. Majid Mermehdi
Cognition is the ability to ground perceptions in concepts together with the
ability to manipulate concepts in order to proceed toward goals. Christian
Bauckhage
An artificial cognitive system is a system that is able to perceive its
surrounding environment with multiple sensors, merge this information,
reason about it, learn from it and interact with the outside world. Barbara
Caputo
Cognition is self-aware processing of information. Cecilio Angulo
Cognitive Systems are ones that are able to extract and (most
importantly) represent useful aspects of largely redundant, possibly
irrelevant sensory information in a form that is most conducive to
achieving a particular high level goal. Sethu Vijayakumar
A cognitive system is a system that can change its behaviour based on
reasoning, using observed evidence and domain knowledge. Bob Fisher
Cognition is when I know what I am doing, when I can judge how good or
bad it is, and explain why I am doing it. Markus Vincze
Cognition is the ability to plan, reason, adapt and act according to high
level motivations or goals and using a range of senses, typically including
vision, and maybe communicate. Patrick Courtney
A cognitive system is an autonomous anti-entropy engine. David Vernon
Development of intelligent systems, Cognitive robot systems
5
Main emphasis
 Perception
 Action
 Reasoning, planning
 Goals
 Autonomy, self-awareness
 Environment
[Diagram: agent–environment loop – the agent sees the state of the environment and chooses its next action]
An example of a cognitive system
 Household robot Robi
 My command: “Fetch me a beer”.
Example
 Sequence of actions:
 The robot has to be attentive and has to listen for my command. [attention, motivation]
 It has to hear me and understand my command. [perception, speech recognition, communication]
 It has to set the corresponding goal and aim at fulfilling it. [goal, proactive behaviour]
 It has to know where the beer is located; it had to learn that previously. [learning]
 It has to plan how to fetch the beer. [planning]
 It has to plan the most appropriate path to the refrigerator, based on the map, which had to be built previously. [navigation, map building]
 It has to move along the planned path. [action – moving]
 On the way, it has to continuously monitor its path. [perception, action]
 It has to avoid obstacles. [perception, replanning, reactive behaviour]
Example
 When it arrives in front of the refrigerator, it has to position itself appropriately. [embodiment, situatedness]
 It has to know how to open the refrigerator. [recognition of object affordances]
 It has to search for the beer in the refrigerator (it has to learn the corresponding appearance in advance). [perception, categorisation, learning]
 It has to plan how to grasp the beer. [planning]
 It has to grasp the bottle suitably. [action, visual servoing, haptic control]
 It will take the reverse path and return to me. [planning, navigation, action, perception, recognition]
 Robi: "Here is your beer". [communication]
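To make the sequence above more concrete, here is a minimal sketch of the "fetch me a beer" task scripted as a linear plan in Python. The Robot class and all of its methods (listen, plan_path, move_along, grasp, say) are hypothetical placeholders for illustration, not an actual robot API used in the course.

# Minimal sketch of the "fetch me a beer" task as a linear plan.
# All robot interfaces below are hypothetical placeholders.

from dataclasses import dataclass, field

@dataclass
class Robot:
    knowledge: dict = field(default_factory=lambda: {"beer": "refrigerator"})

    def listen(self) -> str:                   # attention, perception
        return "fetch me a beer"

    def plan_path(self, target: str) -> list:  # planning, navigation
        return ["hallway", "kitchen", target]

    def move_along(self, path: list) -> None:  # action, obstacle avoidance
        for waypoint in path:
            print(f"moving to {waypoint}")

    def grasp(self, obj: str) -> None:         # visual servoing, haptic control
        print(f"grasping {obj}")

    def say(self, text: str) -> None:          # communication
        print(text)


def fetch_task(robot: Robot, command: str = "fetch me a beer") -> None:
    if robot.listen() != command:              # goal is set only if the command is understood
        return
    target = robot.knowledge["beer"]           # previously learned object location
    path = robot.plan_path(target)
    robot.move_along(path)                     # monitoring and replanning would happen here
    robot.grasp("beer bottle")
    robot.move_along(list(reversed(path)))
    robot.say("Here is your beer")


fetch_task(Robot())

In a real system each step would of course be a perception, planning, or control module of its own; the sketch only shows how the annotated capabilities chain into one task.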
Cognitive systems
 Cognitive assistant
 Explores the environment and builds the map
 Learns to recognize objects
 Understands object affordances
 Knows how to interpret verbal and nonverbal communication with persons
 Detects new situations and reacts correspondingly
 Operates robustly, in real time, in an unconstrained domestic environment
Willow Garage
 Basic functionalities are built in; they are further developed and extended by learning
An example of a cognitive system
 Autonomous car
 City drive
 Competencies
 Perception (image, 3D, collision)
 Planning
 Reasoning
 Learning
 Navigation
 Obstacle avoidance
 Action
 Flexibility
 Robustness
 Efficiency
 …
Google self-driving car
Requirements of cognitive systems
 Perception
 Representations
 Recognition
 Learning
 Reasoning
 Planning
 Communication
 Action
 Architecture
Perception
 Perception
 Visual information (image, video; RGB, BW, IR, …)
 Sound (speech, music, noise, …)
 Haptic information (haptic sensors, collision detectors, etc.)
 Range/depth/space information (range images, 3D models, 3D maps, …)
 Many different modalities – a highly multimodal system
 Attention
 Selective attention
 Handling complexity of input signals
Representation of visual information
[Figure: subspace representation – an image is expressed as a linear combination of basis images with coefficients a1, a2, a3, …]
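A minimal sketch of such a subspace representation, assuming a PCA-style decomposition in which an image is approximated by the mean image plus a few basis images weighted by coefficients a1, a2, a3, … The toy "images" below are random vectors used purely for illustration.

# Toy PCA subspace representation of flattened images.
import numpy as np

rng = np.random.default_rng(0)
images = rng.random((20, 8 * 8))             # 20 flattened 8x8 "images"
mean = images.mean(axis=0)
# principal directions via SVD of the centred data
_, _, components = np.linalg.svd(images - mean, full_matrices=False)

k = 3                                        # keep the first k basis images
x = images[0]
a = components[:k] @ (x - mean)              # coefficients a1..ak
x_hat = mean + a @ components[:k]            # reconstruction from the subspace

print("coefficients:", np.round(a, 2))
print("reconstruction error:", np.linalg.norm(x - x_hat))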
Representation of space
 Metric information
 Topological map
 Hierarchical representation
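A topological map can be kept as a simple graph of places connected by traversable edges, while the metric map stores geometry within each place. A minimal sketch, with made-up room names, of route finding over such a graph:

# Topological map: rooms as nodes, traversable doorways as edges.
from collections import deque

topo_map = {
    "hallway": ["kitchen", "living_room"],
    "kitchen": ["hallway"],
    "living_room": ["hallway", "office"],
    "office": ["living_room"],
}

def shortest_route(start: str, goal: str) -> list:
    """Breadth-first search over the topological map."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbour in topo_map[path[-1]]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return []

print(shortest_route("kitchen", "office"))   # ['kitchen', 'hallway', 'living_room', 'office']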
Representation of audio
Representation of linguistic information
Representation of knowledge
1. Natural language
 understanding the meaning of the individual words
 Spot is a brown dog and, like any dog, has four legs and a tail.
2. Formal language
 Formal logic
 “Spot is a brown dog”: dog(Spot) AND brown(Spot)
 “Every dog has four legs“: (∀x) dog(x) -> four-legged(x)
3. Graphical representation
 Knowledge is represented with nodes and edges
 Semantic nets
4. Etc.
 appropriateness, efficiency, scalability, suitability
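A minimal sketch of how the formal-language example above could be stored and used by a program: facts and rules applied by naive forward chaining. This is a toy illustration, not a full reasoner or a tool used in the course; the second rule (has_tail) simply mirrors the natural-language sentence about Spot.

# Facts: dog(Spot), brown(Spot). Rule: (forall x) dog(x) -> four_legged(x).
facts = {("dog", "Spot"), ("brown", "Spot")}

# each rule: if (premise, x) holds, add (conclusion, x)
rules = [("dog", "four_legged"), ("dog", "has_tail")]

changed = True
while changed:                       # apply rules until no new facts appear
    changed = False
    for premise, conclusion in rules:
        for pred, who in list(facts):
            if pred == premise and (conclusion, who) not in facts:
                facts.add((conclusion, who))
                changed = True

print(("four_legged", "Spot") in facts)   # True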
Recognition
 Recognition of
 objects
 properties
 faces
 rooms
 affordances
 actions
 speech
 relations
 intentions, …
 Categorisation
 Multimodal recognition
Learning
 Building representations
 Continuous learning
 Different learning modes
 Multimodal learning
 Forgetting, unlearning
 Robustness
 Nature vs. nurture
[Diagram: continuous learning loop – a representation is learned from training samples, used for recognition of test samples, and updated with each new training sample]
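A minimal sketch of the loop above, assuming the "representation" is simply a per-class mean feature vector (a nearest-prototype classifier): each new training sample updates the stored prototype incrementally, and recognition picks the closest prototype. Features and class names are invented for illustration.

# Incremental (continuous) learning with per-class mean prototypes.
import numpy as np

prototypes = {}   # class name -> mean feature vector
counts = {}       # class name -> number of samples seen

def learn(label, feature):
    """Incrementally update the stored representation for one class."""
    if label not in prototypes:
        prototypes[label] = feature.astype(float)
        counts[label] = 1
    else:
        counts[label] += 1
        prototypes[label] += (feature - prototypes[label]) / counts[label]

def recognise(feature):
    """Return the class whose prototype is closest to the feature."""
    return min(prototypes, key=lambda c: np.linalg.norm(prototypes[c] - feature))

learn("cup", np.array([0.9, 0.1]))
learn("ball", np.array([0.1, 0.8]))
print(recognise(np.array([0.8, 0.2])))    # 'cup'
learn("cup", np.array([0.7, 0.3]))        # a new training sample updates the representation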
Multimodal learning
 Self-supervised learning
 Weakly-supervised cross-modal learning
 Cross-modal learning
Reasoning
 Reasoning
 In an unpredictable environment
 With incomplete information
 With robot limitations
 In a dynamic environment
 Considering different modalities
 Self-awareness, introspection, knowledge-gap detection and communication
 Expert systems
Planning
 Planning
 In an unpredictable environment
 With incomplete information
 With robot limitations
 In a dynamic environment
Communication
 Communication
 With humans
 With other (different) agents
 In time and space
 Transfer of knowledge
 Clarification
 Coordination
 Taking initiative in the dialogue
 Verbal and nonverbal communication
 Symbol grounding
 Semantic description
 Learning language
 syntax
 ontology building
 Learning using language
Action
 Object manipulation (manipulator)
 Moving around in space (mobile robot)
 Other: sound, light signals, other grippers, etc.
 Embodiment
 Situatedness
Perception – action cycle
 Large abstraction of the real world
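A minimal sketch of this cycle, assuming a heavily abstracted toy environment: the agent repeatedly senses the state, decides on the next action, and acts until the goal is reached. All names and the 1-D world are illustrative only.

# Toy perception-action cycle over a 1-D world.
def sense(world):
    return world["robot_pos"]            # heavily abstracted observation of the state

def decide(state, goal):
    return 1 if state < goal else -1 if state > goal else 0

def act(world, action):
    world["robot_pos"] += action

world, goal = {"robot_pos": 0}, 3
while True:                              # perception-action cycle
    state = sense(world)
    action = decide(state, goal)
    if action == 0:
        break
    act(world, action)
    print("state:", sense(world))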
Architecture
[Diagram: CogAff architecture – perception, action, reasoning/planning, goals, autonomy/self-awareness; proactive and reactive processing; embedded in the environment]
Examples – PR2
U Tokyo, TUM
Willow Garage
UC Berkeley
Examples - iCub
IIT
Examples - Asimo
Examples – Boston Dynamics
Examples - Yaskawa
Examples - Nao
Aldebaran Robotics
Expero, FRI LUI
George
FRI, LUVSS
CogX, http://cogx.eu/results/george/
Curious robot George
 Incremental learning in a dialogue with a human
 Curiosity driven learning
 Learning categorical knowledge
Self-understanding for self-extension
Learning mechanisms
 Situated tutor-driven learning
 Situated tutor-assisted learning
 Non-situated tutor-assisted learning
 Situated autonomous learning
Video
http://cogx.eu/results/george
System
Conclusion
 Cognitive systems are
 intelligent
 very heterogeneous and asynchronous
 coherent
 multimodal
 They continuously upgrade their knowledge by learning
 They communicate with humans
 They interact with the environment
 They move around the environment
 They are capable of autonomous reasoning and decision making
 Literature: Skočaj, D., Vrečko, A., Mahnič, M., Janíček, M., Kruijff, G.-J., Hanheide, M., Hawes, N., Wyatt, J., Keller, T., Zhou, K., Zillich, M., Kristan, M.: An integrated system for interactive continuous learning of categorical knowledge. Journal of Experimental & Theoretical Artificial Intelligence, ISSN 0952-813X, 2016, pp. 1-26.
Conclusion
[Image sequence: T-60, T-30, T, T+30]