summer-2014-intern-presentation

Classifying Objects as New or Learned with Convolutional Networks and SGD
By Kevin Xiong and Evan Phibbs
Mentored by Yufei Wang
Introduction
• The TurtleBot runs on the Robot Operating System (ROS).
• ROS allows us to interface with the TurtleBot's motors and with the camera and depth data from its Xbox Kinect.
Goal
• Our goal was for the TurtleBot, placed anywhere in the room, to move around and find the objects around it, recognize each object as known or new, and learn the object if it was new, as an attempt at open-ended learning.
Libraries
• We wrote our program entirely in Python using two libraries:
  • Caffe: a convolutional neural network library
  • scikit-learn (sklearn): a general-purpose machine learning library, which includes a linear SVM
Methods
• The robot captures depth data and RGB data from the TurtleBot's Kinect.
• A binary mask is created from the depth data, with 1 for each object pixel and 0 for each background pixel, in order to reduce the effect of the background on classification.
• The RGB data is multiplied by the mask and used as input to the CNN.
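The masking step can be sketched with NumPy. This is only an illustration: the slides do not say how the mask is derived from the depth data, so the depth-range threshold, the `near` and `far` values, and the function names `depth_mask` and `apply_mask` are all assumptions.

```python
import numpy as np

def depth_mask(depth, near, far):
    """Binary mask: 1 where depth lies in [near, far], 0 elsewhere.

    Assumption: objects are separated from the background by a simple
    depth range. Missing readings (NaN) are treated as background.
    """
    with np.errstate(invalid="ignore"):
        in_range = (depth >= near) & (depth <= far)
    return (np.isfinite(depth) & in_range).astype(np.uint8)

def apply_mask(rgb, mask):
    """Multiply every color channel by the mask, zeroing background pixels."""
    return rgb * mask[:, :, np.newaxis]

# Toy 2x3 frame: depth in metres, NaN marks a missing depth reading.
depth = np.array([[0.8, 1.0, 3.0],
                  [np.nan, 1.2, 0.2]])
rgb = np.full((2, 3, 3), 200, dtype=np.uint8)

mask = depth_mask(depth, near=0.5, far=1.5)
masked_rgb = apply_mask(rgb, mask)
```

Broadcasting the mask over the channel axis keeps the image shape and dtype unchanged, so the result can be passed to whatever preprocessing the network expects.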
Methods (continued)
• The activations of the convolutional network's last hidden layer are used as input to an SVM classifier.
• Each input's distances to that SVM's separating hyperplanes are used as input to a second SVM classifier, which determines whether the object is new or known.
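The two-stage classifier can be sketched with sklearn's `LinearSVC`. This is a minimal sketch under several assumptions: random clusters stand in for the CNN activations, and the slides do not describe how the second SVM was trained, so labeling held-out "novel" examples as new is a guess. The names `cluster` and `classify` are hypothetical.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

def cluster(center, n):
    """Hypothetical stand-in for the CNN activations of one object."""
    return center + rng.normal(scale=0.5, size=(n, center.size))

# Three known objects and one unseen object in a 16-D feature space.
centers = rng.normal(scale=5.0, size=(4, 16))
X_known = np.vstack([cluster(c, 40) for c in centers[:3]])
y_known = np.repeat(np.arange(3), 40)
X_novel = cluster(centers[3], 40)

# Stage 1: linear SVM over the known classes.
stage1 = LinearSVC(max_iter=10000).fit(X_known, y_known)

# decision_function gives one signed score per class hyperplane,
# proportional to the distance from that hyperplane.
D_known = stage1.decision_function(X_known)
D_novel = stage1.decision_function(X_novel)

# Stage 2 (assumed training scheme): a second linear SVM on those scores,
# with label 0 for known objects and 1 for new ones.
D = np.vstack([D_known, D_novel])
is_new = np.concatenate([np.zeros(len(D_known), dtype=int),
                         np.ones(len(D_novel), dtype=int)])
stage2 = LinearSVC(max_iter=10000).fit(D, is_new)

def classify(x):
    """Return (predicted class, new-or-known flag) for one or more feature rows."""
    x = np.atleast_2d(x)
    scores = stage1.decision_function(x)
    return stage1.predict(x), stage2.predict(scores)
```

Whether a linear boundary in score space separates new objects from known ones depends on the data; the sketch only shows how the two classifiers are chained.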
Methods (continued)
[Pipeline diagram: depth → mask; rgb × mask → Caffe convolutional network → linear SVM classifier → class; → linear SVM classifier → new or old]
Hurdles
• We encountered many limitations while using the TurtleBot:
  • The TurtleBot's movements are not accurate.
  • The TurtleBot can only rotate, move forward, or move backward.
  • The Kinect's depth data is very noisy and is not aligned with the camera's RGB data.
Results
• We tested on four types of tea, two types of cubes, a bottle, a magic eight ball, and a stuffed cat.
Video Demonstration
Improvements
• Our program could be improved with the following:
  • a more robust way of circling the objects
  • an algorithm for moving about a room to ensure no objects are left undiscovered
  • combining depth data and object partitioning algorithms in order to create a finer, more accurate mask
Improvements (continued)
  • relying solely on object partitioning in order to recognize known objects and remove them from the scene, leaving only new objects to be focused on and trained
  • using an object mask such that background pixels are set to random color noise
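The last idea, replacing background pixels with random color noise instead of zeros, might look like the following NumPy sketch. The function name `noise_background` and the choice of uniform noise over the full 0 to 255 range are assumptions made for illustration.

```python
import numpy as np

def noise_background(rgb, mask, rng):
    """Keep object pixels (mask == 1); fill background with random colors."""
    noise = rng.integers(0, 256, size=rgb.shape, dtype=np.uint8)
    keep = mask[:, :, np.newaxis].astype(bool)  # broadcast over channels
    return np.where(keep, rgb, noise)

rng = np.random.default_rng(0)
rgb = np.full((2, 3, 3), 200, dtype=np.uint8)
mask = np.array([[1, 1, 0],
                 [0, 1, 0]], dtype=np.uint8)

noisy_rgb = noise_background(rgb, mask, rng)
```

Drawing fresh noise for every frame means the background never shows the classifier the same pattern twice, which is presumably the motivation for the proposal.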