DOCX - Department of Electrical Engineering and Computer Science

advertisement
FOOSE: Football Operator and
Optical Soccer Engine
Nathaniel Enos, Patrick Fenelon, Skyler
Goodell, and Nicholas Phillips
Department of Electrical Engineering and
Computer Science, University of Central
Florida, Orlando, Florida, 32816-2450
Abstract — The objective of the FOOSE project is to
automate one-half of a foosball table, in order to provide an
entertaining human vs. robot variation on the classic game.
Its major subsystems are image acquisition, table state
interpretation, artificial intelligence, and move execution,
which together form a complete foosball-playing system. This
paper will detail the methodology behind and function of
each subsystem, as well as addressing integration and testing
of the complete system.
Index Terms — Artificial intelligence, computer vision,
image processing, optical sensors, parallel processing, real
time systems.
I. INTRODUCTION
Fig. 1
Star Kick board
subsystem will be discussed briefly, and each will be
discussed in more detail in later sections.
A. Image Acquisition
To acquire an image of the table, the depth sensor of a
Microsoft Kinect for Windows takes an image of the table.
This image is passed to Table State Interpretation.
B. Table State Interpretation
Over the years, there have been several attempts to
produce a semi-automated foosball table. Of these, some
have been done in the context of senior design groups, to
varying degrees of success. On the other hand, one of the
group’s inspirations is a high-performance professionally
manufactured semi-automated foosball table, called the
Star Kick board (seen in Fig. 1), which is sold to arcades
in Europe for $27,000. The primary goal of the FOOSE
project is to maximize performance while working within
budget constraints. It should be here noted that the
Project’s budget constraints are being mitigated in part by
a very generous scholarship from Soar Technologies, Inc.
This paper will be organized according to data flow;
since the project has a linear data flow pattern, it is most
reasonable to discuss the subsystems as they appear in
order, starting with the current table state (ball position)
and ending with an executed move, as will be discussed in
Section II and ongoing.
The depth image is then processed (on an off-the-shelf
computer) to extract the candidate ball coordinates. These
coordinates are passed to the Physics Engine.
II. OVERVIEW OF MAJOR SUBSYSTEMS
The Rod Control Boards or RCBs are custom-printed
PCBs which are each responsible for one computercontrolled rod, making four in total. Each takes in an
absolute move position from the AI and powers the
stepper motors to move to that location, and kick if
necessary.
Due to its nature, the FOOSE project is easily divided
into a number of subsystems, each of which performs a
major task on the data stream. In this section, each
C. Physics Engine
The physics engine maintains a flexible internal model
of what might be happening on the table. It takes in sets of
potential coordinates and, using knowledge of past
coordinates, outputs the coordinates most likely to be
accurate to the Artificial Intelligence.
D. Artificial Intelligence
The Artificial Intelligence (AI) takes the ball’s
coordinates and uses its knowledge of previous board
states and rod locations to calculate a move for all four
computer-controlled rods. These moves are then outputted
to the four Rod Control Boards.
E. Rod Control Board
F. Mechanics
The mechanics subsystems consist of the stepper motors
which actually move the rods, and the tracks and pulleys,
cams, etc. which physically move and kick on command.
On being powered by the RCB, this subsystem physically
executes the move, the table state is changed, and the
process begins again.
III. SPECIFICATIONS
Before discussing the major components of the project
in more detail, it will be helpful to establish some figures
to work towards when designing the subsystems. In
testing, it was determined that a typical kick by a novice
player travels at approximately 1 m/s. This value means
different things for different subsystems, which can be
broken down into two major categories:
A. Processing Subsystems
For the specifications of the processing subsystems, it is
useful to discuss the combined total lag they impart to the
system. Of major subsystems, those included under this
heading are Image Acquisition, Table State Interpretation,
the Physics Engine, and the AI. Although lag in this area
is partially mitigated by the AI’s ability to predict, the
project will require that the lag be no more than 75ms.
This lag would allow a ball at 1 m/s to move 75mm, or
half the distance between any two rods of the foosball
table, before reacting. This should be sufficient to ensure a
good game.
B. Mechanical Subsystems
For the mechanical subsystems, the calculations become
more involved. It is necessary to obtain motors with
sufficient torque to move the rods linearly. For this, the
following information is taken into account: A maximum
hit travels at 1.08 m/s, the distance between rods is 0.15
m, the maximum distance any rod must move to intercept
a ball is 0.20 m, the mass of a rod is 1.049 kg, and the
radius of the pulley used in mechanical construction is
0.0129 m. From this information the following equations
can be synthesized:
𝑡=
0.15m
= 0.138s
1.08m/s
𝑎=
0.2m
= 12.21m/s 2
t2
𝐹 = (1.049) ∗ 12.21 = 12.81𝑁
𝑇 = 12.81 ∗ . 0129 = 0.165𝑁𝑚
Fig. 2.
Sample Output from Kinect
And so, in conclusion, the linear control motor must have
0.165Nm of torque at 1247rpm.
C. Defined Terms
For the purposes of the FOOSE project, the X direction is
down the table (from goal to goal) and the Y direction is
across the table; the direction along which the rods can
move.
IV. IMAGE ACQUISITION
As aforementioned, the physical component of the
image acquisition subsystem is the Kinect for Windows,
using its depth sensor. This is capable of operating at 30
frames per second. Assuming a standard shot at 1 m/s, this
means that the Kinect takes a picture of it every 33mm,
which is a sufficient time resolution to allow for further
image processing. The depth sensor operates at 640x480
pixels, which (as configured) gives it a resolution of
approximately 508x282 pixels on the table itself, which
translates to about one pixel per square millimeter. This
offers more than enough resolution to accurately track the
ball’s position. A sample output can be seen above in Fig.
2. The Kinect is physically mounted above the table using
a wooden box suspended from the ceiling by five strings.
This design makes it capable of being manipulated in
order to properly center and align it.
V. TABLE STATE INTERPRETATION
The Table State Interpretation (TSI) subsystem is tasked
with taking raw depth sensor output from the Kinect
hardware and extracting from it the coordinates most
likely to be the ball. To that end, it uses a multi-stage
approach, which processes the image in different ways
using different computer vision (CV) algorithms. Before
discussing the algorithms themselves, a word on what they
run on:
A. Development Tools
The physical component on which the TSI, Physics
Engine, and AI subsystems run is a regular computer
composed of off-the-shelf parts. It runs Windows 8, which
ensures compatibility with the Kinect SDK and the ability
to connect to the Rod Control Board. The computer is one
which was available to the team, and not custom-built for
this project. It contains two 8-core AMD processors
running at 2.3GHz, coupled with 32GB of RAM. This
provides an unrestrictive environment in which to
prototype and test code quickly. It also allows for
parallelization, which improves total system performance.
Except for the code running on the Rod Control Board, all
code for this project was written in C#.
B. Initial Phase: Preparation
The initial phase of the Table State Interpretation
subsystem is a preparation phase which is run once on
each boot, which locates the table and calibrates the TSI
system’s table depth level. It does this by first taking 300
depth images of the table using the Kinect. These are then
algorithmically processed using a Gaussian blur weighted
in the X-direction, which helps eliminate puppets (as they
are oriented that way). The depth level of these processed
images is then averaged to obtain the table’s normalized
depth level.
value to determine the location and shape of the table for
this particular frame. This is necessary as neither the
camera nor the table is rigidly fixed to any reference
location, so they are capable of moving slightly relative to
each other, necessitating table surface identification. A
picture of the board surface with identified, highlighted
board surface can be seen in Fig. 3. This algorithm also
determines the four corners of the table, which are used as
a reference for the coordinate system.
D. Phase Two: Ball Candidate Detection
Now that the table can be located in the frame, it is
necessary to determine the coordinates of ball candidates.
In order to do this, the Emgu CV implementation of the
circular Hough transform is used. (Emgu CV is a C#
wrapper for OpenCV, which allows OpenCV functions to
be called from C#). The circular Hough transform locates
all objects on the table surface that are roughly circular
(above a threshold tolerance) and these coordinates move
on to the next phase. In Fig. 4, the ball candidates can be
seen as circles drawn over the table.
C. Phase One: Table Surface Identification
This phase of the TSI subsystem begins the per-frame
processing stages, hence the label of “Phase One”. For the
incoming depth frame, the table surface is identified by
subtracting out the normalized value obtained in the initial
phase, and then searching within a hardcoded range of that
Fig. 4:
Ball Candidates
E. Phase Three: Candidate Elimination
The primary circular object on the table besides the ball
is typically the puppets, in normal operation. Since there
are so many of them, and they are always on the table,
they are a major source of noise that must be eliminated.
In this phase, a breadth-first search is instantiated from
each set of candidate ball coordinates. The actual ball will
not be connected to anything, but a puppet will be
connected to the rod, other puppets, etc. This helps reduce
false ball detections by a large factor.
Fig. 3:
Identified Table Surface
F. Phase Four: Candidate Ranking
In the final TSI phase, the remaining candidates are
investigated further and ranked according to likelihood of
being the ball. First, a Sobel gradient is determined around
the remaining candidates’ locations. Then, a Hough
accumulator is run on these gradients. This generates an
index of circularity, which can be compared among
candidates. The most circular candidate is found, and then
it and other candidates within a threshold of it are
outputted to the physics engine as probable ball
candidates.
G. Overall Algorithmic Performance
Overall, the TSI algorithms are very efficient compared
to early prototypes. Effective use of parallelization
throughout the subsystem has resulted in dramatic
decreases in latency and increases in throughput. The
system is currently capable of operating at 28.3fps, very
near the maximum of 30fps allowed by the Kinect sensor.
Additionally, the total lag is even lower than the target of
75ms, by a factor of 2/3. This can be seen in Figure 5:
Dropped frames
Effective framerate
Average lag
Fig. 5.
Actual
5%
28.3 fps
50ms
Target
0%
30fps
75ms
Table of Actual vs. Target TSI Performance
VI. PHYSICS ENGINE & BALL TRACKING
Real-time images from the Kinect sensor are subject to
noise. Even a sophisticated and well-tuned circle detection
algorithm will be susceptible to false positive detections
and missed detections of the foosball. For these reasons,
the Physics Engine and Ball Tracking subsystem is
introduced between the TSI and the AI subsystems. When
the TSI subsystem detects a circle on the field, it will pass
the point to the ball tracking system. The ball tracking
system will then integrate all of the potential ball positions
and output to the AI the most likely actual position of the
ball.
In the design phase of FOOSE, the intention was to take
the potential ball detections and create a unified physical
position through a Kalman filter. However, the Kalman
filter was decided to not be appropriate for our domain.
The problem lies in that erroneous readings from the
Kinect potentially do not correlate to the ball position.
Unlike in a domain like Satellite Global Positioning,
where you may get incorrect readings that are still
correlated to your actual position, the false positives from
our algorithm do not give us any statistical insight into our
actual position. Instead, we need a way to detect readings
that are not the ball and filter them out completely.
The need to filter out false readings lead to the unique
algorithm constructed for our ball tracking subsystem. The
algorithm is based on a list of watchers which keep track
of potential ball positions on the field. A watcher contains
a physics model that it uses to try to predict where the ball
should be in future ticks. Whenever a new circle position
is read from the camera, each watcher in turn tries to
integrate the new position into its current ball model. The
model that most accurately predicts the new sensed
position will then integrate it into its velocity and position
predictions; otherwise, if no watcher can accurately
integrate the position then a new watcher is spawned to
model a ball located at that position. Then whenever the
AI requests the position of the ball, the watcher with the
most accepted ball updates (i.e. the most accurate ball
model) will be selected to return the estimated position.
Over time, watchers that are not getting updates from the
camera are killed off.
This system allows a dynamic method of filtering out
erroneous readings. As long as the ball is the most
consistently detected circle on the field, then its watcher
will always have the highest confidence readings, while
the noisy circle detections will die off. The power of this
system comes from its ability to continue to track the ball
after a very rapid hit, when the camera will not be able to
keep up with the foosball’s movement. This sort of rapid
change in ball position may initially look like an erroneous
reading to the tracker, but as the confidence of the new
position increases it will quickly usurp the old watcher and
take its place as the new most accurate location of the ball.
In Figure 6, the output of the physics engine can be
seen. Crosses represent previous locations at which the
Fig. 6:
Physics Engine Output
ball had been found, and the circle represents the current,
predicted location of the ball.
VII. ARTIFICIAL INTELLIGENCE
The artificial intelligence (AI) subsystem is responsible
for calculating a move based on present and former table
states, and outputting those moves to the appropriate Rod
Control Boards.
A. Lateral Move Calculation
The AI subsystem begins when it is called by the
physics engine with the most recent best coordinates for
the ball. From these coordinates and the previous set of
coordinates, it projects a line and finds the intersections
with each rod, accounting for bounce. It then uses the
calculated intersection points to choose a puppet to block
at that intersection. Last, it uses a measured offset to
calculate how many stepper motor “clicks” the rod must
move in order to move the correct puppet to the correct
position. It then saves this move.
B. Kick Calculation
Once all lateral moves have been calculated, kicks are
determined. Because of the way kicking is implemented
on a mechanical level, it is necessary to prepare for a kick
before actually kicking. Fortunately, this is fairly easy to
mitigate. Using the ball’s velocity in the X-direction
(determined by the physics engine) and the ball’s Xposition relative to the rod in question, it is easy to
determine how long it will take the ball to intersect the
rod. The AI contains the tested values regarding how long
a kick takes to charge. So, the AI can determine whether
to charge, kick, or not kick depending on how long it will
take the ball to intersect the particular rod.
D. Initialization
As the AI is the intermediary between the computer and
all four Rod Control Boards (RCB), it is responsible for
certain functions outside of its normal move calculation.
The first of these is initialization. When the system is
booting up, the initialize function of the AI program is
called, which automatically determines the correct serial
port with which to address each RCB. It does this by
enumerating each serial port in the system, and
challenging each of them with a three-byte value (00 00
FF), which the code on the RCB will interpret as a request
for board identification. An RCB will respond with one
byte in the form of AX, where X is its number. So, RCB 3
will reply with 1010 0011. The AI then parses this and
sorts the RCB’s port into an array where it can be used
later.
E. Calibration
The AI is also responsible for requesting calibration
updates from the RCBs themselves. The AI keeps an
internal model of how many clicks each rod is capable of
moving. While this number would be static under ideal
conditions, due to changing conditions this number may
change by a few clicks. The RCB automatically calibrates
on power-up and stores this information. The AI requests
this information during initialization in order to maintain
an accurate internal representation of the board. Like the
ID request, the calibration request is a 3-byte value;
namely 00 00 F0. The RCB will respond with two bytes
representing its maximum range in clicks. The AI then
updates its internal information to match the measured
values.
VIII. ROD CONTROL BOARD
C. Move Output
A. Overview
After the determination of the lateral move and kick
value for each Rod Control Board, the move calculation is
complete. The moves are then output to their respective
RCBs (determined during initialization, see below section)
in the form of a three-byte value, where the first two bytes
represent the lateral move (final absolute position, in
clicks) and the final byte is either a 1 or a 0, where 0 is
block, 1 is charge kick, and going from 1 to 0 actively
kicks. The final “kick” byte is also used in initialization
and calibration, as seen below. The RCBs immediately
begin execution of the received move, and the AI
subsystem awaits further input from the physics engine.
The Rod Control Board (RCB) is a custom-printed twolayer PCB which is used to control and power the stepper
motors used in the mechanical subsystem. It connects to
the computer via a Serial over USB connection, which the
microcontroller, an Atmel ATMega32U4, is able to
support natively. The microcontroller takes in moves from
the AI via the Serial over USB line, which it uses to power
the motors and execute the move. In this project, one RCB
is used per rod controlled by the computer, for a total of
four RCBs. The schematic for the RCB can be seen in
Figure 7.
B. Microcontroller
Fig. 7:
RCB Schematic
As aforementioned, the RCB uses as its center the
Atmel ATMega32U4 microcontroller. For this project,
this model of ATMega proved to be ideal. It supports USB
without the need for additional complex circuitry and has
low cost. Also, the Arduino Leonardo uses the same
model, which allowed the use of its bootloader. Using the
Arduino bootloader greatly simplified coding for the
ATMega, as it allowed the ATMega to communicate with
the computer over USB directly using Arduino drivers,
which saved a great deal of time in development, as this
project requires two-way communication between the
computer and the RCBs. The ATMega is being run at
16MHz, which is the speed supported by the Arduino
bootloader, and a sufficiently high speed to be able to
operate more or less instantaneously in normal operation.
The ATMega’s other specifications also turned out to be
satisfactory for the FOOSE project, as seen in Figure 8. Its
other useful features include many IO pins, which are
mainly useful for controlling the H-bridges, but are also
used for setting the RCB ID (two sets of two pins are
sensed as binary bits and used for initialization, see
Section VII, part D) and to sense the buttons used in the
calibration process (section VII part E).
ATMega32U4 Specs
Fig. 8.
Flash (Kbytes):
32 Kbytes
Pin Count:
Max Frequency:
Max I/O Pins:
USB Transceiver:
USB Speed:
44
16 MHz
26
1
Full Speed
SPI:
2
ATMega32U4 Specifications [1]
C. Code
The RCB’s microcontroller is capable of running C
code. The code that runs on the RCB is responsible for all
aspects of RCB control, including the aforementioned
initialization and calibration routines (section VII parts D
and E).
Primarily, the RCB code is responsible for executing the
moves passed to the RCB by the AI subsystem. At all
times, the RCB knows the rod’s current position and
velocity. It is constantly polling for a new move. When it
receives a new move (which is formatted in terms of
absolute clicks of the stepper motors), it calculates the
direction and distance it needs to travel. Then, it either
continues to the new destination or, needing to change
directions, steps down the speed and then steps back up in
the new direction. This slowing down is too quick to really
perceive; the rod can change directions extremely quickly,
a clear advantage in a foosball game. However quick, the
slow down/speed up is still a necessary component of the
code, as the stepper motors would exhibit degraded
performance if the step down were not implemented.
The RCB code also governs kicking, of course. As
aforementioned in section VII, part B, it is necessary to
prepare for a kick before executing one. When the RCB
receives a kick byte equal to 1, it charges the kick by
rotating the kick stepper 90 degrees. When it receives a 0
after that, it rotates the rest of the way (270 degrees),
thereby executing a kick and resetting the system. For
more information about the mechanics of kicking, see
section IX.
amps per channel), and reliable. [2] Early versions of the
RCB utilized a more complex Texas Instruments chip
specifically designed to be a stepper motor driver, but this
component proved to be much less reliable than the Hbridges the group members had already been using for
testing. They would frequently overheat and stop working,
and it was discovered that the chip was not capable of
sustaining the sort of power output the stepper motors
required. The L298N has proven to be a reliable solution.
In testing with the H-bridges, it was found that, when
run for long periods of time, they would experience
thermal events much like the drivers. Unlike the drivers,
the H-bridges can be easily mounted to heatsinks, and
these can be seen as a prominent feature of the RCB’s
final design. The heatsinks (in combination with thermal
paste) effectively absorb and dissipate excess heat from
the H-bridges, even when run for long periods of time.
This has greatly increased the reliability of the RCB’s
during extended gameplay. In fact, since being mounted to
the heatsinks, the RCBs have not encountered any thermal
faults at all.
E. Future Work
Fig. 9:
Assembled RCB
D. H-Bridges
The H-bridges are perhaps the most important
component of the RCB. They are responsible for
controlling power output to the stepper motors.
Essentially, they act as switches. For the RCB, the L298N
full-bridge motor drivers are used. Each H-bridge is
capable of controlling two power channels. As each
stepper motor has two coils which need independent
power, and each rod has two stepper motors, each RCB
requires two H-bridges. The RCB’s microcontroller
controls the H-bridges, and the H-bridges route power to
the stepper motors.
The L298N H-bridges have proven to be ideal for the
FOOSE project. They are inexpensive, high-current (2
While the current RCB design seems solid, there are
two major areas to which further work could be directed
post-senior design.
The first of these is to increase the voltage on the
stepper motors. Currently, they operate at 12V from
standard computer power supplies. However, it may be
possible to increase the performance of the system by
doubling this to 24V, either from chained power supplies,
or from dedicated 24V power supplies. This option was
considered by the group, but ultimately not pursued due to
time constraints.
The second option concerns the reliability of the RCBs.
While currently very reliable in operation, it may be
possible to prevent issues in the future by incorporating an
array of diodes between the RCB output and the stepper
motors. This would prevent transient current spikes caused
by the inductive load of the stepper motors, and would
protect the board from any anomalies in the stepper
motors. Again, this option was favored, but ultimately not
pursued due to time constraints.
IX. MECHANICS
A benefit of the foosball domain is that the game
involves only two simple motions to play: linear motion to
move the puppets back and forth, and rotational motion to
kick the ball. This is a desirable quality, as none of the
members of the FOOSE engineering group are mechanical
engineers. However, combining the two motions into a
single elegant component is non-trivial. This section
discusses the engineering challenge of creating the final
mechanical subsystem.
The starting point for the mechanical subsystem was to
estimate the physical speed and forces required to
maintain a game of foosball that is “entertaining to a
novice user.” After watching recordings of foosball
games, it was determined that the fastest the ball speed
that a novice user should be able to block is around 1 m/s.
This is the maximum speed we expect the table to be able
to block, and the value that is used to calculate a minimum
torque requirement for the motors as shown in section III,
part B. The stepper motors that were purchased for this
task (Japan Servo KH56JM2-901) are well within the
minimum torque to create the linear speed.
However, the Japan Servo motors could not meet the
torque requirements for the rotational kick; in fact, none of
the motors on our list of potential parts had quite enough
torque to create a competitive kick. To get around this
limitation, the senior design group designed a cam kicking
mechanism that uses a spring force to load the kick, and
Fig. 10:
Kick Cam
using the stepper motor instead only as the force to wind
back the kick, as seen in Fig. 10. This solution is compact
and can fit easily on a linear slider. A picture of the cam
device can be seen in Fig. X.
Special thanks are also due to Dr. Samuel Richie, whose
guidance and advice have been essential to getting the
project to where it is today.
Skyler Goodell would like to thank Brandon Parmeter,
for his robotics expertise and advice.
BIOGRAPHIES
Nathaniel Enos is a senior at the
University of Central Florida, majoring in
Electrical Engineering with a minor in
business (2013 graduation). Nathaniel
enjoys system controls and embedded
processes. After graduation he will start a
rotational program at Texas Instruments as
a digital applications engineer.
Patrick Fenelon is a senior at the
University of Central Florida. He will be
graduating with a B.S. in Computer
Engineering in May 2013. His interests
include
programming,
math,
computational geometry, and solving
problems. After graduation, he will begin
working on the Visual Studio debugger
team at Microsoft.
Skyler Goodell is a senior at the University
of Central Florida. He will be graduating with
a B.S. in Computer Engineering in May
2013. His interests include machine learning,
serious games and robotics. He will begin
work at Microsoft on the Bing search engine
shortly after graduation.
X. CONCLUSION
In conclusion, the FOOSE project offers an innovative,
cost-effective, and competent solution to a tough
engineering problem. It successfully incorporates
subsystems from computer vision, physics modeling,
artificial intelligence, circuit design, and mechanical
engineering into an entertaining game that fits in the den.
ACKNOWLEDGEMENTS
The members of FOOSE would like to give special
thanks to Soar Technologies and its employees for their
sponsorship and invaluable advice and expertise, without
whom the FOOSE project would not have been possible.
Nicholas Phillips is a senior at the
University of Central Florida. He will be
graduating with a Bachelor’s of Computer
Engineering in May 2013. His interests
include reading, writing, and synthesizing
information. After graduation, he will be
working for Verizon Communications.
REFERENCES
[1] Atmel. (2010). ATmega32U4 Summary [Online]
Available: http://www.atmel.com/Images/7766S.pdf
[2] Sparkfun. Full-Bridge Motor Driver Dual - L298N
[Online]
Available:
http://www.sparkfun.com/products/9479
Download