Microsoft Kinect – Visualising a 3D World

Ryan Durrant - 23945583
University of Southampton
red1g10@soton.ac.uk
Abstract
In this paper I review the technology behind the Microsoft Kinect
and look at current and possible future applications in order to
evaluate the impact of the Kinect on the world of multimedia. I
will compare the advantages and disadvantages of the Kinect which
affect its usefulness in these areas. I also propose an experiment to
understand further the effectiveness of the Kinect in education, an
area in which I believe it could be important and effective in
improving the learning experience for pupils.
Keywords: Kinect, Education, 3D, Computer Vision, Infra Red
Camera, Microsoft, Robot Vision, Depth Imaging
1. Introduction
When first released in November 2010, the Microsoft Kinect was a
whirlwind success. Some ranked it as a technological revolution
rivalling that of the personal computer and the internet [1].
Depth imaging was initially developed as a tool for gait analysis,
facial recognition and skeletonisation in order to combat terrorism.
Depth imaging was an expensive tool, the result of billions of
pounds and many years of development [1], out of the reach of the
common person. The Kinect package itself was rumoured to take
between 5 and 10 years to develop [2]. However, when completed,
it put top of the range technology in the hands of creative and
intuitive civilians.
There are already many applications being developed, beyond the
initial release purpose of a peripheral for the XBOX 360. For
instance, it has been researched in medicine for many possible
uses. Doctors could, for example, control a computer without
having to come into contact with unsanitary keyboards [3].
In this paper, I will look at how the Kinect operates, the hardware
required and briefly at the software. I will then provide a broad
view of current applications and look at possible future
applications, especially in education, in order to analyse how the
Kinect is successfully affecting multimedia applications.
2. Related Work
Zhang, Z. (2012) wrote about the impact of the Kinect in
multimedia, focussing mainly on the different software attributes,
such as skeletonisation and facial tracking, and suggested that the
Kinect offers a huge range of applications throughout multimedia,
in particular for more immersive video conferencing [30].
3. History
Initially, this technology came out of research set up specifically
to observe crowds with more success than normal Closed Circuit
Television (CCTV). CCTV relies heavily on humans to spot
something really small and easily missed, making it unreliable.
Depth cameras were developed by many pioneers in the fields of
Computer Science and Mathematics before Microsoft did a lot of
research, along with PrimeSense, to produce a civilian version for
the XBOX console. It was hugely successful and became the
fastest selling peripheral in history [1].
3.1 Developing a new sensor
Although the idea of a depth camera is nothing new in itself, the
Kinect marks the first time this technology has been within the
reach of civilians. The system used to power the Kinect is very
cheap, but it is not by any means poor.
Many of the previous depth cameras used Time-of-Flight (ToF)
methods, which use short bursts of light and read the time it takes
for the light to bounce back as well as the wavelength, giving a
good indication of what is in front of the camera. This works as we
know the speed of light to be, roughly, 300,000,000 metres per
second, so we can calculate the distance in metres by multiplying
this speed by half the time delay in seconds. The
Kinect, however, uses a dense field of infra red (IR) points, which
create a 3D map. The structured light technique can then turn this
into depth data: it is calibrated to know where each point falls on a
flat surface a known distance away, and it calculates the
displacement of each point from this and thus its distance. This is
then stored in a 2D greyscale image with the colour value for each
pixel relating to its 3rd dimensional value, the distance, as seen in
figure 1. A pixel’s x and y co-ordinates within the image denote its
position on the xy-plane.
3.2 Why use a depth camera?
Depth cameras in general are very useful. While computers have
been able to analyse still images and even video from standard
cameras before, it is very difficult to get useful, real time data. This
is because, with a picture as an array of coloured pixels, colour
images do not give a great indication of what is going on in a 3D
space as it is very difficult for a computer to establish where one
item ends and another begins. Other limitations arise from the
nature of colour (RGB) images: the camera records an array of
pixels based on the exposure, the amount of light which passes
through the lens. This can be drastically altered by the light level
and source [1]. A picture of a human under sunlight, for example,
would look very different to an identical pose under fluorescent
tubing. Other factors, such as the exposure rate of the camera, can
also result in completely different images. This makes it very
difficult to generalise any software to get consistent results in
different environments and situations.
The depth camera, however, creates a depth image as an array of
pixels relating exactly to the distance from the camera. This allows
a computer to emulate one of the most important aspects of human
sight: the ability to recognise individual objects in 3D space. This
is because, if the distance at a pixel is suddenly much further
away than at those before it, we know we have reached the end
of an object. As can be seen in figure 1, the RGB image on the
right includes reflections and shadows from a light source, which
make it very hard for a computer to understand in its current
form. These are ignored in the depth image, centre, though this
shows up a shadow where the IR field, left, returns no data.
Figure 1 – IR image, Depth image and RGB image captured by Kinect
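The idea that a sudden jump in distance marks the end of an object can be sketched on a single row of a depth image. The function name, the threshold and the sample row below are made up for illustration; they are not taken from the Kinect or its software, and real depth data would also need its no-data pixels handled first.

```python
def object_boundaries(depth_row, jump_threshold):
    """Return the indices at which the depth differs from the previous
    pixel by more than jump_threshold (a hypothetical tuning value)."""
    return [
        i
        for i in range(1, len(depth_row))
        if abs(depth_row[i] - depth_row[i - 1]) > jump_threshold
    ]


# Illustrative row of distances in millimetres: a near object, then the
# background, then a second object.
row_mm = [1200, 1205, 1210, 3000, 3010, 3005, 1500, 1495]
print(object_boundaries(row_mm, jump_threshold=200))  # prints [3, 6]
```

The same comparison on an RGB row would be far less reliable, since colour differences between neighbouring pixels depend on lighting rather than on where one object ends.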
There are two principal methods for creating a depth image:
Passive depth imaging takes multiple 2D images and tries to
compute the distance of pixels and objects, often through the use of
computationally exhaustive correspondence algorithms or optical
flow. Active depth imaging employs additional physical sensors,
such as an IR field as seen in figure 1. The problems with
correspondence can be avoided through active methods, such as
structured-light and ToF, which do not require these complex
algorithms to triangulate the depth of each part of an image. [13]
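As a rough illustration of the two active methods, the sketch below computes a ToF distance from a round-trip delay, using the rounded speed of light of 300,000,000 metres per second, and a structured-light depth from the displacement of one projected dot against a reference plane. The triangulation model (1/Z = 1/Z0 + d/(f b)) is a simplified assumption of mine, and the focal length, baseline and reference distance in the usage lines are placeholder values, not the Kinect's actual calibration.

```python
SPEED_OF_LIGHT_M_PER_S = 300_000_000  # rounded value


def tof_distance_m(round_trip_delay_s):
    """Time-of-flight: the pulse travels to the object and back,
    so only half of the measured delay corresponds to the distance."""
    return SPEED_OF_LIGHT_M_PER_S * round_trip_delay_s / 2


def structured_light_depth_m(shift_px, reference_depth_m,
                             focal_length_px, baseline_m):
    """Structured light (assumed simplified model): a dot that would sit
    at a known position on a flat surface at reference_depth_m appears
    shifted by shift_px when the surface is nearer (positive shift) or
    further away (negative shift)."""
    inverse_depth = (1.0 / reference_depth_m
                     + shift_px / (focal_length_px * baseline_m))
    return 1.0 / inverse_depth


# A 20 nanosecond round trip corresponds to roughly 3 metres.
print(tof_distance_m(20e-9))

# Placeholder calibration values, for illustration only.
print(structured_light_depth_m(shift_px=10, reference_depth_m=2.0,
                               focal_length_px=580, baseline_m=0.075))
```

Neither function needs to match points between two camera images, which is the correspondence problem that passive methods have to solve.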
3.3 Design Motivation
Initially developed to combat terrorism, how did this technology
end up a peripheral in living rooms across the world?
Hardly a new route, technology has evolved from military and
government use to personal and business use for years. The
internet was developed for the military as a way of communicating
in the event of war. This path is also true for depth imaging and the
Kinect.
Microsoft initially had the vision of a controller-less gaming
console, which allowed people to use natural gestures. They took
this brief to PrimeSense, and the Kinect was developed. It initially
allowed Microsoft to combat systems developed by Nintendo and
Sony, which used different methods to capture motion but both still
required controllers. The aim was to attract people who were
perhaps daunted by learning to use a controller in order to play
complex computer games, expanding Microsoft's target market. [4]
More recently, development has continued to utilise the Kinect in
other fields. Medicine and robotics are two areas being researched
currently [3,7,14,15]. Microsoft released an SDK, and OpenNI
released a framework, meaning that developers could not only use
the Kinect, but actually improve it for their needs. This third party
development is pushing advancements which could apply the
Kinect to aid many aspects of our lives.
4. Hardware
The hardware offers a step forward from previous depth cameras
which have been expensive and highly specialised. The Kinect
costs very little to produce and can provide better results than
much more expensive systems, and it does all of this in the living
room of civilians. [3]
An infrared projector projects the field of IR points as in figure 3.
These are read by an IR camera and fed to the processor by a
Complementary metal oxide semiconductor (CMOS) Active Pixel
Sensor (APS). This type of APS contains an integrated circuit with
an array of photo-detectors and is often used in camera phones and
even some digital SLR cameras. It has the main advantage of being
comparatively cheap and not requiring any special manufacturing
techniques. These sensors are also immune to blooming, in which a
strong light source results in nearby pixels having extra light. One
disadvantage is the rolling shutter effect as it records one row at a
time. These traits suit the Kinect well, as they allow the sensor
to be robust in most lighting conditions as well as remaining cheap
and efficient. The RGB camera is also controlled by this sensor
which translates the light into electrical signals which are
understood by the PS1080 chip. The cameras and the projector can
be seen in figure 2 with the RGB camera and IR sensor close
together in the middle, allowing the two images to be overlaid, and
the projector off to the side. There are also 4 microphones in the
body, which allow the Kinect to recognise different sound inputs
and ignore background noise such as the television when
connected to an Xbox.
The upgrade from ToF to structured-light 3D mapping requires a
boost in processing power, made possible by the PS1080 chip
which converts the raw data from the IR camera, the RGB camera
and the microphones into usable data. The Kinect can also rotate
through 27° up or down thanks to a motor attached to the base.
5. Software
Shane Kim, an executive at Microsoft, was quoted in 2009, after
the initial announcement of the Kinect, as saying “To me, the
magic is more software.” [4] With each pixel in the image received
from the Kinect’s sensors possibly encoded with RGB, depth and
even sound information, the software has to manipulate a lot of
information: 9.2 million 3D points per second compared to 6,800
2D points per second for a laser range finder [14]. PrimeSense developed the
PS1080 chip specifically to process this influx of data [3].
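The figure of 9.2 million points per second is consistent with the depth resolution (640 x 480 pixels) and frame rate (30 frames per second) quoted elsewhere in this paper, assuming one 3D point per depth pixel per frame. The variable names are illustrative.

```python
WIDTH_PX = 640          # depth image width
HEIGHT_PX = 480         # depth image height
FRAMES_PER_SECOND = 30  # depth frame rate

# Assumption: every depth pixel yields one 3D point in every frame.
points_per_second = WIDTH_PX * HEIGHT_PX * FRAMES_PER_SECOND
print(points_per_second)  # prints 9216000
```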
The information from the IR sensor provides red points from the
field which will have slightly different positions and sizes
depending on the distance of the point, as can be seen in figure
3. It then triangulates this with a virtual image, hardwired into the
chip [5] which contains the expected positions at a known distance
to create the depth image. It is worth noting that the shadow caused
by each item, visible in figure 3, does not return any data to the
Kinect, so the chip assigns the maximum depth value, of 2047, to
each point within this space. This allows them to be ignored as
insignificant background space. It then analyses the curves in real
space to produce useable information and a 3D model. It can also
use this depth image to create “visible audio” from the microphone
array, enabling the computer (or Xbox) to understand where in
space each sound is coming from. It can overlay the colour image
to the depth image, as the two cameras are a known distance apart,
to allow the user to see their actions affecting the outcome. This is
desirable as it enables the Kinect to distinguish between players
and sounds, enabling a more engaging interface for multiple users.
However, it results in the vast amount of data mentioned above.
In order to deal with this, the PS1080 chip in the Kinect analyses
depth values from points; multiple points are often found within a
single pixel [10]. Attempts are then made to interpret the data
before sending it on. It can do things such as locate a human body,
map its joints and body parts and estimate where, in all probability,
your body parts will be in a few nanoseconds’ time [6]. This is
done based on criteria for the human body; for instance, it is
unlikely your elbow will bend backward. It then gives already
interpreted data to the connected device which enables it to use the
data without having to process all of the raw data. One of the more
impressive figures is the frame rate of 30 fps, allowing a real-time
experience.
Figure 3 –Magnified IR image captured with Kinect showing field of IR dots as seen by Kinect. The
dots in the foreground are much bigger and closer together than those in the background. This allows
the Kinect to calculate the distance at each point. The shadow caused by the object is also visible.
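Since shadowed points are reported with the maximum value of 2047, software reading the depth image can set them aside before any further analysis. A minimal sketch, assuming the depth image arrives as a list of rows of raw integer values; this layout and the function name are illustrative and not part of any Kinect API.

```python
NO_DATA = 2047  # maximum depth value, assigned where the IR field returns no data


def mask_no_data(depth_image):
    """Replace every no-data value with None so that later steps
    can skip shadowed pixels instead of treating them as distances."""
    return [
        [None if value == NO_DATA else value for value in row]
        for row in depth_image
    ]


# Illustrative 2 x 3 image of raw depth values.
raw = [[500, 2047, 510],
       [2047, 2047, 900]]
print(mask_no_data(raw))  # prints [[500, None, 510], [None, None, 900]]
```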
Other methods exist to speed up this data analysis. A fast sampling
plane filtering algorithm is proposed by Biswas and Veloso [14] to
reduce the volume of the 3D pixel cloud by sampling points to
produce an overview of the 3D map. This can be vital in systems
with less computational power, such as small robotics.
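The general idea of reducing a point cloud by sampling can be illustrated with plain uniform random sampling. This is only a stand-in to show the reduction in data volume: it is not the plane filtering algorithm of Biswas and Veloso [14], and the function name and parameters are my own.

```python
import random


def sample_points(points, sample_size, seed=None):
    """Return at most sample_size points chosen uniformly at random.
    The seed parameter only makes the illustration repeatable."""
    if sample_size >= len(points):
        return list(points)
    rng = random.Random(seed)
    return rng.sample(points, sample_size)


# Illustrative cloud of 10,000 (x, y, z) points reduced to 100.
cloud = [(x, y, 1.0) for x in range(100) for y in range(100)]
overview = sample_points(cloud, sample_size=100, seed=0)
print(len(cloud), len(overview))  # prints 10000 100
```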
6. Advantages and Limitations
In this section I will analyse the advantages of the Kinect over
other similar depth imaging technologies, as well as looking at its
limitations.
Figure 2 - Shows the RGB camera in the centre, the IR projector on the right and the IR sensor on the
left in the Kinect [25]
However, despite it being hailed as a revolution, the hardware is
nothing massively new. It is, in fact, the software which is most
innovative.
6.1 Limitations
There are a few limitations of the Kinect. It lacks the accuracy of
some of its more expensive peers, though it is still robust and
highly accurate, and even considered more accurate than many
other 3D sensors by Forsythe and Green [3]. It also has a limited
resolution, 640 x 480 pixels; the RGB camera can run at the
higher resolution of 1280 x 1024 pixels, but this results in a
lower frame-rate. A high resolution is also not hugely important
for a depth camera, as it will normally be used to follow large
objects which can be defined at lower resolution with enough
clarity to be understood. In this instance a higher resolution could
be a step backward as it would slow down processing times. One
of the main limitations is due to the structured light approach; it is
difficult, though not impossible, to include more than one Kinect in
a system as the array of IR dots will become entwined and will
confuse the results. The use of IR may also be a problem in direct
sunlight, which could limit its applications.
6.2 Comparison to other cameras
There are other cameras available, and other methods for them to
retrieve depth data. One such camera, the Point Grey Bumblebee
XB3, uses passive methods through stereo imaging and
triangulation. It has three cameras which each capture an image.
These are plugged into triangulation software to provide a 3D
image. It has a high resolution, 1280 x 960, and good depth range,
0.2 to 4.5 metres. However, the software is entirely hosted on the
connected computer, which requires a powerful processor, and the
increased frame size, along with computationally complex
algorithms, reduces the frame rate to a maximum of 15 frames per
second. It also requires parameter configuration in each
environment, making it limited in its applications.
Another camera, the Camcube, utilises an active time of flight
(ToF) method. It is very expensive, a common trait with the
Kinect’s competitors, lacks colour, and has a very limited
resolution, just 200 x 200 pixels. However, perhaps the main
limitation could be that depth data is noisy, and affected by lighting
conditions to the degree it is only useable indoors under controlled
lighting conditions.
Comparatively, the Kinect has a reasonable frame size, 640 x 480
pixels. It also maintains a good depth range, 1.2 to 3.5 metres, as
well as an incredible frame-rate at 30fps thanks to the complex
software on the chip. It has a low impact on the computer, as it
conveys depth data directly from the hardware. Also, vitally, it can
be successful in a wide range of situations, increasing its usability.
Pece et al. compared these three cameras [27] and concluded that
the Camcube provided the most accurate depth data. They also
claimed that ToF cameras will soon improve to include colour as
well as quickly come down in price. However, due to the
inflexibility of this camera, as well as the price differential, I would
argue that the Kinect is still a better and more useable option.
Figure 5 – The Kinect attached to wheelchair allowing the instructor to control the chair without
confusing the student [17]
7. Applications
Many applications of the Kinect have been researched and
developed across many disciplines since its release as a gaming
console peripheral.
7.1 Medicine
Medicine is one area in which there are many possible applications
of the Kinect. For example, the ability to operate a computer
without coming into contact with any non-sterile interfaces, such as a
mouse or keyboard, has obvious benefits to doctors who always
need to reduce risk of infection. [3]
The ability to recognise natural gestures allows for the use of a
Kinect to monitor vulnerable patients, ensuring their comfort and
safety in their own home as well as reducing the strain on health
care professionals required in hospitals or old people’s homes.
Current systems require a carer, or the use of a lifeline panic button
worn around the neck. However, the second of these is not robust:
if the patient finds themselves in trouble, they may not be able
to activate the alarm. The first is also not robust, as contact is only
temporary; it is also expensive and time consuming. One method
was proposed, using the natural gestures understood by the Kinect,
in collaboration with a data set containing gestures to watch out
for. The Kinect could then notice if the patient is showing signs of
distress, and communicate information on what the problem could
be and how severe it is with a medical professional. They could
then use the built in microphones to talk to the patient, and be able
to monitor them. This method is as yet only a concept, though its
implications are vast and could be hugely beneficial. [3, 16]
Figure 4 – The results collected for one subject by Chang et al. [17]
The Kinect has also been explored as a tool to aid the rehabilitation
of people with motor disabilities requiring regular exercise in order
to improve their mobility. This was explored by Chang et al. [16]
where two patients were asked to do their exercises with and
without the aid of the Kinect. Through a simple animated game,
which saw a whale reacting directly to the motions the patients
made, their motivation and results were massively improved, as
seen in figure 4.
A similar application, though with a different approach, has been
developed in which a Kinect is used to speed up teaching children
to use motorised wheelchairs, as in figure 5, by minimising the
distractions posed by the carer. If a child makes a mistake they risk
hurting themselves or others, which requires the carer to move his
hand into the chair to take control; this can distract and confuse
the child. This can be avoided by using a Kinect to override the
wheelchair based on the instructor’s commands. [17]
7.1.1 Advantages and Disadvantages of the Kinect
The Kinect could be particularly useful in the medical world
primarily because of what it enables people to do. The price is
particularly important when trying to rehabilitate children or young
adults with motor difficulties as it could be purchased for the home
to encourage continued exercise and increase motivation as well as
keep medical costs for insurers, health services and hospitals down.
The IR camera enables the Kinect to monitor patients in low light,
or even no light, which could be hugely beneficial as night time is
often when patients are most vulnerable. Other types of camera
which rely on visible light such as CCTV and passive depth
cameras could not do this. The use of microphones also enables
another sense to be included which can provide vital information as
well as a more personal and beneficial communicative connection
between patients and medical professionals.
One disadvantage of the Kinect is its static nature, which would
make it difficult to monitor a patient across more than one room.
This could be overcome with the use of personal robotics, as we
will discuss in the next section, or multiple sensors. The Kinect
also raises ethical issues as it captures personal information.
7.2 Robotics
Another area in which the Kinect is currently being developed is as a
navigational aid for robotics. Nick Hawes et al. have developed a
robot, named Dora the Explorer, which aims to improve its own
knowledge about its surroundings [11,12]. This could lead to
robots helping around the house, and could even be implemented
into other systems such as looking after vulnerable patients as
mentioned above. The need for a 3D sensor exists even for the
simplest of tasks; before a robot can make a cup of tea it needs to
calculate the position of the mug in front of it. Initially Dora was
designed with 2 point grey lasers to produce an idea of its
surroundings, but has since been updated with a Kinect, as in
figure 6, showing how the Kinect can hold its own against more
established technologies. Hawes claims that the Kinect is a “huge
new technology” [28] for robotics due to its incredible cheapness
and its effective data capturing. With the original system, items
were only sensed at the level of the sensors. It was noted that tables
were difficult, as they were not substantial enough at the height of
the sensor to be recognised. The Kinect has improved this
massively, allowing the robot to build a much “richer map” of the
local environment [28]. However, it still has difficulties sensing
things like towels, which are irregular and un-textured, and in
particular glass which doesn’t register with IR. Applications such
as this rely heavily on the ability to recognise surrounding objects.
Figure 6 – Dora’s Sensor layout including two point grey sensors and a Kinect (left) [29] and more
recently with just the one point grey sensor and a Kinect attached to a laptop.
Methods enabling the Kinect to recognise objects in images are
also being developed. There are many data-sets about, and one
proposed by Janoch et al. utilises both the RGB and depth cameras
to build up a database of images for various objects from different
angles in attempt to enable category level object recognition [18].
This research can add further to the usefulness of the Kinect in
robotics by giving a better understanding of the environment.
It is also shown by Stowers et al. (2011) [13] that the depth camera
and its feedback are robust enough under experimentation to be a
valid controller for the flying altitude of quadrotors, as in figure 7.
The authors concluded that the Kinect was able to work effectively
under changing conditions and that its “low cost, high frame rate
and absolute depth accuracy over a useful range make it suitable
for use on robotic platforms”.
Figure 7 – The Kinect on a Quadrotor [14]
7.2.1 Advantages and Disadvantages of Kinect
The main advantage of the Kinect to robotics is the very low cost.
Robotics is expensive; to be plausible in everyday life, however, it
must be affordable and robust. The Kinect excels here as it
provides robust data at a low cost, as well as being small and
sturdy meaning it can be integrated easily. It can also be used in a
wide range of situations as it does not need to be calibrated in
every new environment and can be used in almost any lighting.
The Kinect, however, might be rendered useless by direct sunlight,
as it uses IR. This could be a big dent to its effectiveness in
robotics, though more information is needed on the subject. It also
requires a minimum distance, 0.8 metres, to be able to calculate the
depth, which may be an issue for a robot that needs to get close to
things. This can often be solved by placing the sensor at the back
of the robot body, though this can result in blind spots.
7.3 Education
One area in which I believe the Kinect could be influential, and
deserves more research, is education. Considering Chang et al.
found that the Kinect can increase motivation [16], I feel that it
could revolutionise primary and secondary education, akin to the
impact of interactive white boards (IWB). In particular, methods
could be developed to enable teachers to teach while moving
around the class, enabling children to get more involved through
interactive games and even alert teachers when a student raises
their hand.
Development in this area has been pushed by Microsoft, who have
many lesson plans available utilising the Xbox and the Kinect,
though many of these are not specific, including games such as
Kinect Sports and relying on teachers to make it educational and
relevant. For example, the distance which a student throws a
javelin can be used to teach decimals. Bowling could be used to
help understand the base 10 system. Darts can be used to learn
addition and subtraction. These can also enable competition in the
classroom which can also improve student motivation.
It is known, and has been shown many times, including by Brekke
and Hogstad (2010) [26], that the use of computers and technology
can improve students’ motivation, satisfaction and results. We also
know that students learn in different ways, and that the more senses
used, the better they will retain information, and the stronger the
neurological connections in the brain will be. These are both
evident with the Kinect, which enables children to be physically
active while interacting with technology. When tested in Houston,
Texas, as discussed by D. Enriquez-Vontoure in [21], it was found
that the children were much more motivated and engaged in the
lessons. Also, it was evident that students were better behaved,
and developed better team-working skills. It was also found to be
an effective reward and motivational tool to increase homework
completion rates and in class activities. 100% of teachers in the
sample saw these improvements, though only 66% would
recommend it as a useful and relevant instructional tool. 66% is by
no means poor, but why is this so low? Especially if the system
enabled better engagement while combating bullying and
improving behaviour [21]. One suggestion is that teachers were not
sure how to use the resource, or how to relate it to lessons. Another
is that new technology is not yet developed enough in this field. So
why is this technology worth developing? Games have existed in
education for years, so what makes the Kinect such a revolution?
7.3.1 Advantages and Disadvantages of Kinect
The Kinect could be very beneficial in this area for many reasons.
It is cheap: compared to an IWB, which will set you back between
£600 and £1600, a Kinect can be bought and plugged into a
computer for less than £100. It is versatile, useful not just for
student interaction, but also for teachers and even parents [21].
Teachers can use the natural gestures interface to interact with the
computer without having to divert their attention from the class.
However, arguably the main benefit of this technology is its
accessibility.
The accessibility of the Kinect means it can be used by anyone.
Students who have disabilities, such as partial or even full
blindness, can use the Kinect easily. Feedback based entirely on
sound to headphones can enable the student to learn and interact
where many other input methods, such as a keyboard or controller,
would fall down [21].
Another aspect of the Kinect which lends itself to education is its
ability to recognise faces. The use of Kinect and Avatar Kinect has
enabled educators in the USA to help autistic children to learn how
to interact socially, by allowing them to interact with themselves,
as well as other people’s avatars in a safe and comforting
environment. [21, 23]
There are other aspects of the Kinect which one can see being
beneficial: The multiple microphone array, for instance, could
allow for a single player to be heard in a room full of children. The
RGB camera has been used as a webcam to provide the
opportunity for international conferences, allowing children to gain
a global awareness [21]. The speed with which the Kinect can track
people would be useful, and the ability to recognise people will
ensure that scores are kept safe even if two players change quickly.
The motorised base will also allow for players of different heights,
which sounds trivial, but would save a lot of effort and time.
One disadvantage, however, is that students’ engagement will
decrease over time when playing the same games, so the Kinect
will require constant updating to stay interesting, though this is
nothing new and is true for all tools in education. Teachers will
also be required to plan lessons with it, and to be creative with it,
and this can be daunting as well as incredibly time consuming.
However, despite all these advantages, only 66% of the teachers
in the Houston schools chosen as a sample to test the Kinect
recommended it as a useful tool. To understand why
this is, I would propose an experiment into the effectiveness of the
Kinect in education as follows:
7.3.2 Experiment
I think the future of the Kinect as a tool in education lies, not in the
ingenuity of teachers when applying games to learning, but with
more specific and specialised applications. Games programmed
with the sole purpose of teaching using the Kinect allow for much
more interactive lessons with much more specific lesson
objectives. For this reason, any experiment into the effectiveness of
the Kinect must reflect different methods of teaching. So, I would
propose that a pre-existing game or method is tested, alongside
two or three bespoke games programmed from scratch with
varying levels of interactivity and utilising different aspects of the
Kinect. It would then be important to test these fairly, alongside a
control group, to see which, if any, has the best impact on learning.
It would also be important for the testing to adhere to common
practices, such as the experimental design model, so the results can
be reliable. For this reason, a test schedule which includes two
periods of contact as well as two periods without contact should be
applied. This can provide information on how the tool actually
impacts on the class as well as hopefully showing the impact over
time. In theory, scores should be higher during the second
non-contact period than the first if the tool is effective.
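The comparison between the two non-contact periods could be summarised per class as a simple difference in mean score, as in the sketch below. The function name and the scores are purely hypothetical placeholders to show the calculation; they are not results, and a real analysis would also need an appropriate statistical test.

```python
from statistics import mean


def mean_gain(first_period_scores, second_period_scores):
    """Difference between the mean score in the second non-contact
    period and the mean score in the first."""
    return mean(second_period_scores) - mean(first_period_scores)


# Hypothetical placeholder scores out of 10 for one class.
first_non_contact = [4, 5, 6]
second_non_contact = [6, 7, 8]
print(mean_gain(first_non_contact, second_non_contact))  # prints 2
```

A positive gain alone would not show that the tool is effective, since students should improve over time anyway; the gain would need to be compared against that of a group which never uses the system.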
To get fair results, I would suggest 4 games in the same subject
area, in this case maths. The first should already be implemented, a
good example being the use of Kinect Sports darts game, as seen in
figure 8, to improve basic arithmetic skills.
Figure 8 – Kinect Darts is being used to aid with teaching maths
Secondly, a bespoke game which enables more teaching to be done
directly through the game could be used. Again, I would suggest a
game based on darts, but which utilised maths questions to tell
students what to aim for, and awards a high score for getting the
right answer. This would provide a sense of competition between
students to get the answer right, and could be supplemented with
time constraints or bonuses. This will allow a direct comparison
between bespoke software and use of a pre-existing game.
Thirdly, I would propose a less interactive game which focuses
purely on getting the right answer. A game which utilises the
Kinect’s ability to follow two players at once could have two
students side by side, competing to be first to answer the question
on screen by moving a hand over the correct option from a selection
of answers. The class could help by shouting out the answer, and
competition between two teams, say boys and girls, should provide
ample motivation. In addition, players could be required to swap
after each question to improve team-working skills.
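Selecting an answer by moving a hand over it amounts to a point-in-rectangle test on each player's tracked hand position. The sketch below assumes the hand positions have already been mapped to screen coordinates by whatever skeleton-tracking library is in use; `AnswerBox`, `answer_under_hand` and `first_correct_player` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnswerBox:
    """An on-screen answer option, in screen coordinates (assumed layout)."""
    label: str
    x: float
    y: float
    width: float
    height: float

    def contains(self, px, py):
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)


def answer_under_hand(boxes, hand_xy):
    """Return the label of the box under the hand, or None if there is none."""
    px, py = hand_xy
    for box in boxes:
        if box.contains(px, py):
            return box.label
    return None


def first_correct_player(boxes, correct_label, frames):
    """frames: time-ordered dicts mapping player name -> hand (x, y).

    Return the first player whose hand is over the correct answer, or None.
    If both players reach it in the same frame, the one listed first wins.
    """
    for hands in frames:
        for player, hand_xy in hands.items():
            if answer_under_hand(boxes, hand_xy) == correct_label:
                return player
    return None
```

A real game would probably also require the hand to dwell on the answer for a short time, so that a hand merely passing over a box does not count as a selection.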
Finally, I would use a more active game. A game in which students
attempt to pop the virtual balloon containing the answer to a
question asked on screen, by jumping on the correct one, would allow
students to be much more active. Penalties could be applied for
popping the wrong balloon, and scores could be kept to encourage
competition.
The testing method will also need to define what it means for the
Kinect to be successful as well as what it means for it to be
helpful. For the Kinect to be a helpful teaching tool, the teacher
would need to notice improvements in the class when teaching. This
would be quantified
through a questionnaire which collects information on the
motivation, behaviour and teamwork of the children as well as
other important aspects. In order to call the system successful,
there would need to be some evidence of an improvement in the
amount students are learning. This would require some form of
assessment to be performed before, during and after the
experiment. However, it was suggested by Cameron Evans, a
Microsoft employee, that conventional examination methods, such
as multiple choice, might not directly show how successful the
Kinect can be [21]. He even suggests a more interactive exam to
complement the more interactive learning style. Nevertheless, a
multiple choice test could be used, provided it is short enough for
the students to complete more than once.
These games use a wide range of techniques which the Kinect
makes available, and the relative success of each can show us
both whether the Kinect can be a useful tool and, hopefully, which
aspects are most beneficial.
A student undergoing any form of teaching should improve over
time, so there must be a randomly assigned control group which
never gets contact with the system in order to provide a
comparison. A further issue comes from the difference between
understanding and knowledge. For example, is it important for a
student to understand why 1 + 1 = 2, or just that it is? Clearly,
understanding is vital in many situations, but in others knowing is
enough: a driver does not need to understand why turning the wheel
turns the car, only that it does.
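Random assignment to the control and contact groups can be done with a seeded shuffle, as in the sketch below. The function name `assign_groups` and the group names in the usage line are assumptions for illustration; the paper does not fix how many groups there are or how large each should be.

```python
import random


def assign_groups(pupils, group_names, seed=None):
    """Shuffle the pupils and deal them round-robin into the named groups.

    A fixed seed makes the allocation reproducible for later auditing.
    """
    rng = random.Random(seed)
    shuffled = list(pupils)
    rng.shuffle(shuffled)
    groups = {name: [] for name in group_names}
    for index, pupil in enumerate(shuffled):
        groups[group_names[index % len(group_names)]].append(pupil)
    return groups


if __name__ == "__main__":
    class_list = [f"pupil{n}" for n in range(1, 11)]
    print(assign_groups(class_list, ["control", "contact"], seed=2012))
```

Dealing round-robin after the shuffle keeps the group sizes within one pupil of each other, which makes the later comparison of average scores fairer.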
The tests should include questions which are easy for most students
even before contact, plenty of examples of work which is covered
during the contact periods, and some questions which are not
covered but use skills which are taught. This will show both how
well pupils retain knowledge, such as 1 + 1 = 2, and how well they
understand it; for instance, whether they then know 102 + 1 = 103
even if it never comes up. The scores for each pupil can then be compared at each stage to
see any improvements, and the overall scores can also be
considered. These can be compared to the control group to see how
much improvement is down to the Kinect. An extension could look
into another subject area and swap the groups, such that the
control group in one subject is the contact group in another. This
would avoid the control pupils becoming demoralised, which could
skew the results.
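One simple way to estimate how much improvement is down to the Kinect is to subtract the control group's average gain from the contact group's average gain. The sketch below illustrates that arithmetic; the function names are assumptions, and the scores in the usage example are invented purely for illustration, not experimental results.

```python
from statistics import mean


def mean_gain(before, after):
    """Average per-pupil improvement; both arguments map pupil -> test score."""
    return mean(after[pupil] - before[pupil] for pupil in before)


def gain_attributable_to_kinect(contact_before, contact_after,
                                control_before, control_after):
    """Contact group's mean gain minus the control group's mean gain."""
    return (mean_gain(contact_before, contact_after)
            - mean_gain(control_before, control_after))


if __name__ == "__main__":
    # Invented example scores out of 10, for illustration only.
    contact_before = {"a": 4, "b": 6}
    contact_after = {"a": 8, "b": 8}
    control_before = {"c": 5, "d": 5}
    control_after = {"c": 6, "d": 6}
    print(gain_attributable_to_kinect(contact_before, contact_after,
                                      control_before, control_after))
```

This gives only a rough estimate. With real class sizes a significance test would still be needed before claiming that the difference is more than chance.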
The results of the questionnaire and the results of the tests can
provide qualitative and quantitative data from which to draw
conclusions about the effectiveness of the Kinect in this setting.
7.3.2.1 The Games
For this experiment, I would test in a primary school, looking at
young children, though the methods could equally be applied,
through different games, to any level, even higher education, as
mentioned by Barilovits and DePriest (2011) [24].
7.3.2.2 Collecting Results
There should be a questionnaire for the teacher and tests for the
pupils at each stage of the experiment. Initially, these establish
the ability the pupils show and the teacher’s initial opinions.
At each subsequent stage we want to see how the scores and the
teacher’s perceptions change, until we get the final results.
The questionnaire should focus mainly on how the pupils are
motivated, how they behave, how they perform and how they
interact with their peers. This information is important, as a method
which is effective at teaching but results in students misbehaving
is not beneficial.
7.3.2.3 Predictions
I would expect the results to reinforce the success of the Kinect in
other applications. Student results should be higher after the
experiment and in particular during contact periods. Also, the
teachers should find that children are more motivated and more
interested in class as well as other areas of school including
homework. I would suggest that the more active methods would be
more fun to start and therefore could have the biggest impact as
children like to be active. It is also known that physical exercise
strengthens neurological pathways when learning.
7.4 Other Applications
Other applications range from simple interactive experiences in
public buildings, such as banks or museums, to choosing what to
wear in the morning [20].
8. Future work
I would like to run the experiment proposed in this paper in order
to assess the impact which the Kinect could have on education. I
also feel that the Kinect could continually be developed in new
areas as the interface it offers is so easy to use that the possibilities
really are huge.
9. Conclusion
The Microsoft Kinect is a very impressive piece of technology
which enables a plethora of new uses. Interfaces controlled by
natural gestures are a phenomenal advancement as they will allow
learning curves to be dramatically reduced. The technology is also
cheap and durable, while still producing quality results. There are
many areas in which it can be applied, and we have discussed but a
few.
The technology in the Kinect is revolutionary as it enables people
to do extraordinary things for a very low price. It is also found to
be highly accurate with the data it provides, and though not the
most accurate camera, it is arguably preferable due to the
durability, flexibility and ease of use. We also noticed that the use
of infra-red enables the camera to be used in almost all light
conditions, an improvement over other stereo methods. It is also
less computationally intensive than passive methods as it avoids
correspondence algorithms.
It has been shown how the Kinect could be applied successfully in
the fields of medicine, robotics and education, and I have proposed a
method to evaluate its effectiveness in education. Indeed, the
Kinect offers a huge range of options for many multimedia
applications across the board, and has successfully been integrated
into some of these.
10. Bibliography
[1] Borenstein, G. (2012). Making Things See: 3D vision with Kinect, Processing, Arduino, and MakerBot. Make Books, 978-1-449-30707-3
[2] http://gilotopia.blogspot.co.uk/2010/11/how-does-kinect-really-work.html, accessed 11/11/2012
[3] Forsythe, T., & Green, M. ADVANCEMENTS IN 3-D SENSING TECHNOLOGY IMPLIMENTED BY THE KINECT, accessed 11/11/2012
[4] http://venturebeat.com/2009/06/02/microsoft-games-executive-describes-origins-of-project-natal-game-controls/, accessed 12/11/2012
[5] http://mirror2image.wordpress.com/2010/11/30/how-kinect-works-stereo-triangulation/, accessed 12/11/2012
[6] http://www.joystiq.com/2010/06/19/kinect-how-it-works-from-the-company-behind-the-tech/, accessed 12/11/2012
[7] http://www.roborealm.com/help/Microsoft_Kinect.php, accessed 11/11/2012
[8] CogX Project website, http://cogx.eu/results/dora/, accessed 11/11/2012
[9] http://www.i-programmer.info/news/105-artificial-intelligence/2176-kinects-ai-breakthrough-explained.html, accessed 11/11/2012
[10] http://electronicdesign.com/content/topic/how-microsoft-s-primesense-based-kinect-really-works/catpath/embedded, accessed 11/11/2012
[11] http://www.cs.bham.ac.uk/~nah/bibtex/papers/gobelbeckeretal2012.pdf, accessed 11/11/2012
[12] Nick Hawes, Matthew Klenk, Kate Lockwood, Graham S. Horn, John D. Kelleher, Towards a Cognitive System that can Recognize Spatial Regions Based on Context, http://www.cs.bham.ac.uk/~nah/bibtex/papers/hawesetal12cdsr.pdf, accessed 11/11/2012
[13] Stowers, J., Hayes, M., & Bainbridge-Smith, A. (2011, April). Altitude control of a quadrotor helicopter using depth map from Microsoft Kinect sensor. In Mechatronics (ICM), 2011 IEEE International Conference on (pp. 358-362). IEEE, Istanbul, Turkey, accessed 11/11/2012
[14] Biswas, J., & Veloso, M. (2012, May). Depth camera based indoor mobile robot localization and navigation. In Robotics and Automation (ICRA), 2012 IEEE International Conference on (pp. 1697-1702). IEEE.
[15] R. Rexit. (2011, December 15). Visualiztaion of Posture (for Kinenct Pain Recognizer), accessed 12/11/2012, https://docs.google.com/viewer?url=http%3A%2F%2Fwww.cs.pitt.edu%2F~chang%2F231%2Fy11%2Fproj11%2Ffinalruh.pdf
[16] Chang, Y. J., Chen, S. F., & Huang, J. D. (2011). A Kinect-based system for physical rehabilitation: A pilot study for young adults with motor disabilities. Research in developmental disabilities, 32(6), 2566-2570.
[17] D.K. Zondervan and D.J. Reinkensmeyer, Kinect-Wheelchair Interface Controlled (KWIC) Robotic Trainer for Powered Mobility
[18] A. Janoch, S. Karayev, Y. Jia, J.T. Barron, M. Fritz, K. Saenko, and T. Darrell, A category-level 3-D object dataset: Putting the Kinect to work. In Proceedings of Computational Methods for the Innovative Design of Electrical Devices. 2011, 1168-1174.
[19] Smart Boards have revolutionised the way that we teach today, http://www.stmarys.pta.school.za/pebble.asp?relid=56, accessed 12/11/2012
[20] http://www.youtube.com/watch?feature=player_embedded&v=1jbvnk1T4vQ#!, accessed 13/11/2012
[21] A.Popa, D.Enriquez-Vontoure, C.Evans, Bring Learning to Life with Kinect, accessed 27/11/2012, http://msevents.microsoft.com/CUI/VideoDisplay.aspx?RegistrationId=1312602906&culture=en-US
[22] IWB prices, http://www.projected.co.uk/smartboard.htm
[23] http://usatoday30.usatoday.com/news/health/story/2012-05-31/video-games-autism-students/55319452/1, accessed 27/11/2012
[24] DePriest, D., & Barilovits, K. (2011). LIVE: Xbox Kinect© s Virtual Realities to Learning Games. In 16th ANNUAL TCC Worldwide Online Conference, Hawa, http://etec.hawaii.edu/proceedings/2011/DePriest.pdf, accessed 27/11/2012
[25] Kinect Hardware Image, http://www.socialphy.com/posts/do-it-yourself/7717/Do-it-yourself_-Kinect-teardown.html, accessed 28/11/2012
[26] M.Brekke, P.H.Hogstad, New teaching methods - Using computer technology in physics, mathematics and computer science, International Journal of Digital Society (IJDS), Volume 1, Issue 1, March 2010, accessed 01/12/2012
[27] F.Pece, J.Kautz, T.Weyrich, Three Depth-Camera Technologies Compared, accessed 05/12/2012, http://www0.cs.ucl.ac.uk/staff/F.Pece/page9/files/abstract.pdf
[28] N.Hawes, L.Vernall, http://www.birmingham.ac.uk/accessibility/transcripts/Nick-Hawes-Robots-the-reality.aspx, accessed 05/12/2012
[29] Picture of Dora the Explorer Robot by N.Hawes, http://www.flickr.com/photos/61648734@N06/6021726504/sizes/m/in/set-72157627384115286/, accessed 05/12/2012
[30] Zhang, Z. (2012). Microsoft Kinect Sensor and Its Effect. Multimedia, IEEE, 19(2), 4-10.