>> Kori Inkpen Quinn: Okay, we'll get started. Thank you all for coming. I'm really happy to
introduce Seth Hunter, who's visiting us here today. Seth is just finishing up his PhD at the MIT
Media Lab, and I think one of the challenges Seth probably has is that he has too many interesting
projects to share with you in the timeframe that we have.
So we'll see some of his work, but feel free, if you have time scheduled to talk with him, I think
he probably has lots of other projects that he's not able to show today. With that, I will hand it
over to Seth.
>> Seth Hunter: I do have a large variety of projects, but I thought I'd start out by talking about
integrated defense acquisition, technology and logistics life cycle management. April Fool's.
Happy April Fool's Day, everybody.
What I thought I'd talk about is just my current work, so you could sort of see what I'm working
on, which is nice, because then it sort of gets it out of the way and I can talk about how I led up
to that. I always hate waiting until the end of a talk to see what someone's working on now. And
then I thought I'd talk a little bit about my background. It's a little bit different than a traditional
researcher, but I think that could be an advantage here. And I have two in-depth projects where
I'll talk a little bit more about a full kind of spectrum analysis and the process leading up to those
projects. One is the MemTable, and the other is something I call VisionPlay, which is about
using physical objects to control digital characters with other people, at a distance. And at the
end, I'd like to just sort of touch on what's next for me and some of the areas that I think would
be great to grow.
So the fundamental problem, as I see it, right now in the current research I'm doing is that we're
separated. When we're videoconferencing with each other in all these different environments,
you still have two windows. You're not really doing things in the same space together. And the
other problem is that children lose interest very quickly. A lot of Kori's research and research
from Nokia has shown that, especially in the ages of four to seven, kids get distracted easily by
what's in their environment and they have a hard time knowing where the camera can see them.
Maybe you've experienced this yourself. The Kinect and the Wii have really brought families on
their feet, using their bodies, doing things together in the same space, and I find this really
powerful as a way of getting people together and getting also parents involved in media, new
media experiences.
So this is an artist named John Clang. You may have seen some of his work, and these are just
Skype calls projected on the wall. I think it points to this idea that we really want to be together.
We want to be in the same space. So what I've been working on more recently is called
WaaZam. This is six people, three in one space, three in the other, and what is WaaZam? Well,
WaaZam is like Skype plus Kinect, plus your imagination.
You take the Kinect camera right now and you plug it into your PC, plug the PC into your TV,
and WaaZam, you're connected to the other person. So what really makes WaaZam magical is
that you can do things in the same space together. I'll show you a little bit of footage from the
pilot study that I've done.
It's really rough footage, but it's two kids in two different spaces. This is a dad and a kid and two
interfaces. A lot of goofiness, a lot of roleplaying, a lot of acting things out, this notion of
pretending, because it's not really the real self, but at the same time, you can do things you
wouldn't normally do, like punch the other person.
For mother-daughter, dancing is something that can be really fun to do together, and I think
you've seen this with the Kinect camera. Some of the more popular games are about imitation,
using your body in interesting ways.
So the next thing I wanted to highlight is sort of what is my design process and motivation?
Where do I come from? For me, interaction design is really about people, and I always think
about what it means to have a face-to-face interaction with somebody, versus a mediated one. So
this is a piece that I made at Mediamatic in a four-day hack, and it's called Staring Contest, and I
thought I'd highlight it as a way of thinking about it.
So what we did in this project is to do blink detection. Whoever blinks first loses, and a seven-second video gets uploaded to a social media site of their losing moment. And this is, like, super
popular, and also, hundreds of people talked to me about how much they enjoyed having this
very unmediated interaction with another person, where you're really staring into their eyes.
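A minimal sketch of how a blink detector like that could work, assuming an OpenCV Haar eye cascade; the installation's actual detector isn't described in the talk, so the threshold and file names here are illustrative:

```cpp
// Hypothetical sketch in the spirit of Staring Contest, not the installation's code:
// if no eyes are found for a few consecutive frames, call it a blink.
#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>

int main() {
    cv::VideoCapture cam(0);
    cv::CascadeClassifier eyes;
    eyes.load("haarcascade_eye.xml");   // cascade file that ships with OpenCV

    int framesWithoutEyes = 0;
    const int blinkThreshold = 3;       // ~0.1 s at 30 fps (assumed)

    cv::Mat frame, gray;
    while (cam.read(frame)) {
        cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
        cv::equalizeHist(gray, gray);

        std::vector<cv::Rect> found;
        eyes.detectMultiScale(gray, found, 1.1, 3, 0, cv::Size(30, 30));

        framesWithoutEyes = found.empty() ? framesWithoutEyes + 1 : 0;
        if (framesWithoutEyes == blinkThreshold) {
            // Losing moment: the installation would save and upload the
            // last seven seconds of video here.
            std::cout << "blink detected" << std::endl;
        }
        if (cv::waitKey(1) == 27) break;   // Esc to quit
    }
    return 0;
}
```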
Even though it was just a provocation -- an art project to me is a provocation -- to me, it was a
provocation about HCI. It's about what it means to interact with each other. Can you have
technology facilitating that without ruining the feeling that you have, really making eye contact
with somebody else.
But this conversation happens within a media ecology, so I see the process of interaction design
as a systemic one, where we're really thinking about information, displays and people, and that
we have a responsibility when we're looking two to five years ahead to think about how we're
going to interact with each other and to identify questions and respond to technology trends that
are happening. So these are inspiring people who've talked a lot about how we interact with
objects and people.
And I've worked a lot with Sherry Turkle, who asks the question, does technology serve our
human purposes? And this is really what I asked myself all the time, and it's why I'm interested
in the social aspects of our engagement with technology, because I think to a certain extent, we
can simulate our relationships with each other, or we can strengthen our relationships with each
other. And in this sort of transmedia environment, where people are participating in multiple
ways with media, I think that empowering them to be producers versus consumers is another part
of what motivates me personally as a designer.
So thinking about how we can strengthen our social relationships and how we can be more
creative with media, in a social way, if possible, so scaffolding how children learn socially, but
also how we interact with each other when we're interacting through a screen with each other.
So, typically, the way that I work is identify what my values are, ask really core questions about
why I'm creating something, and then I like to go through more of an art-based process, theater
process, where I sketch out things, role-play them, storyboard those things, outline, develop, test
and revise.
And the implementation of the technology, I think for me, a lot of times it happens,
programming, but really with collaboration with other people. So I think it takes a vibrant
community of technologists to build things that are meaningful to people, not just one person.
The idea is like 1 percent of it to me. Within such a complex ecosystem of tools and
methodologies, it's really how to understand that technology and how to collaborate with others
that's important to me, so that's one of the reasons I'm here, is to find really strong people to
collaborate with.
Now, looking at my background, you may ask, what role do the fine arts have in interaction
design? What role would fine arts have at MSR? Because I do have a background in the fine
arts, coming into the lab. So I want to share that with you, because I think there is a thread that's
important and often unrecognized.
I started out by making these really, really large projections of one-inch square images in a
gallery space, like the size of the wall. And to me, they're just quite beautiful. This is a train
texture. These are pictures of cement. Different pictures of rust. And I was really interested in
the structure of these things, so I started making algorithmic programs to try to simulate this and
exploring the ephemeral nature of those things, the complexity you could get with writing
programs.
I started making prints, and then printing these things and selling them in galleries, for about a
year and a half. But I did find that, as you learn to generate different algorithmic forms, they
start becoming less and less interesting, because there's no social component. You're really
operating within the art economy, which is more of a commerce or I guess a way for people to
have wine and cheese.
So I started thinking much more about visualizing something more social. These plants are part
of an exploration called Word Garden, and Word Garden is a survey, so it says, like, mother,
father, color, sister, brother, on the roots, and then all the associations people had are the sprouts.
They're organized by either negative, positive or neutral, which people rated after each
association, and the time that they took to answer is the length and the curliness of each of the
plants.
Looking at this one, you can see the person took a moderate amount of time to answer, and they
answered in a variety of ways, mostly positive. This person took a little bit less time, except for
the first two and maybe the last one, and answered quickly, as well. This person was all positive
and answered very quickly. This person took a really long time, was very thoughtful and had a
diversity of answers.
So you can begin to see the characteristics in the plants of the people who are filling out the
survey, and so using information visualization as a way to kind of make an artwork, and then
share that with the person who took the survey, so each person would get a print in the mail.
And these things were all shown in a gallery, as well.
Thinking about just visualization of the self became more and more interesting to me, so I started
making portraits about interactivity itself. These portraits, this one was called Stillness Clock,
Motion Clock. So Stillness Clock would be this clock. It would tick when you were standing
still in front of an interface, and Motion Clock would move when you're not actually paying
attention to it.
So you'd get something like this. This is sped up, but what was interesting about this was asking
the user to think about attention itself, to think about engagement, and then the interface is
measuring that in two different ways. And what we got when we showed this in six different
locations around the world, we got all these different portraits of people, split-scan portraits of
people interacting with the system.
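A hedged sketch of the kind of stillness measure that could drive the two clocks, assuming simple frame differencing in OpenCV; the constants are illustrative, not the piece's actual values:

```cpp
// Frame differencing gives a per-frame "motion energy"; below a threshold the
// Stillness Clock accumulates time, above it the Motion Clock accumulates time.
#include <opencv2/opencv.hpp>

int main() {
    cv::VideoCapture cam(0);
    cv::Mat frame, gray, prev, diff;
    double stillnessTime = 0.0, motionTime = 0.0;
    const double fps = 30.0;
    const double motionThreshold = 0.02;   // fraction of changed pixels (assumed)

    while (cam.read(frame)) {
        cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
        if (!prev.empty()) {
            cv::absdiff(gray, prev, diff);
            cv::threshold(diff, diff, 25, 255, cv::THRESH_BINARY);
            double changed = cv::countNonZero(diff) / double(diff.total());
            if (changed < motionThreshold) stillnessTime += 1.0 / fps;   // viewer is still
            else                           motionTime   += 1.0 / fps;   // viewer is moving
        }
        gray.copyTo(prev);
        if (cv::waitKey(1) == 27) break;
    }
    return 0;
}
```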
And this brings up this notion of time. So this is a picture that never happened. At different
times during a party, people came up to the camera, took the picture, and then they were all
juxtaposed together. I call this concept Chronographer, and it's something that I've been
exploring in the Media Lab as an art form. It's the notion of remixing time in some way, so
you're taking moments from the past and mixing them with the present, using background
subtraction.
This is an exploration to kind of think about this, a provocation, like of history trails, which
eventually became an installation in our lab. This is like a slow glass kind of thing, where
looking down on the space, you can see the history of the space over time, using background
subtraction in OpenCV. So, throughout the day, you could walk by this monitor and glance at it,
and you'd begin to see different things that had happened in the space.
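A minimal sketch of a history-trail accumulator of this sort, assuming OpenCV's MOG2 background subtractor; the real installation's parameters aren't given in the talk:

```cpp
// Foreground pixels from earlier in the day are stamped into a slowly
// accumulating "memory" image of the space.
#include <opencv2/opencv.hpp>

int main() {
    cv::VideoCapture cam(0);
    auto subtractor = cv::createBackgroundSubtractorMOG2(500, 16.0, false);

    cv::Mat frame, fgMask, history;
    while (cam.read(frame)) {
        if (history.empty()) frame.copyTo(history);

        subtractor->apply(frame, fgMask);
        cv::erode(fgMask, fgMask, cv::Mat());   // clean up small speckles

        frame.copyTo(history, fgMask);          // paste today's visitors into the memory
        cv::imshow("history of the space", history);
        if (cv::waitKey(1) == 27) break;
    }
    return 0;
}
```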
We also began to explore time as a medium, interactive medium, looking back and repeating
time. In this case, you would see yourself at 10 seconds, at 20 seconds, at 40 seconds, repeated
in that kind of infinite loop. And this is an installation that we put in the Media Lab, sort of
exploring this notion of seeing not only yourself, but other people who'd been in the space at the
same time.
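One plausible way to build that kind of delayed loop is a rolling frame buffer; this sketch assumes the 10-, 20-, and 40-second delays mentioned above and is not the installation's code:

```cpp
// A ring buffer of past frames lets you ghost yourself as you were 10, 20,
// and 40 seconds ago over the live image.
#include <opencv2/opencv.hpp>
#include <deque>

int main() {
    cv::VideoCapture cam(0);
    std::deque<cv::Mat> past;               // rolling history of recent frames
    const int fps = 30;
    const int delaysSec[] = {10, 20, 40};

    cv::Mat frame, small;
    while (cam.read(frame)) {
        cv::resize(frame, small, cv::Size(320, 240));   // keep the buffer affordable
        past.push_back(small.clone());
        if ((int)past.size() > 40 * fps + 1) past.pop_front();   // keep ~40 s of history

        cv::Mat view = small.clone();
        for (int d : delaysSec) {
            int idx = (int)past.size() - 1 - d * fps;
            if (idx >= 0)
                cv::addWeighted(view, 0.7, past[idx], 0.3, 0.0, view);   // blend past selves over the present
        }
        cv::imshow("looking back", view);
        if (cv::waitKey(1000 / fps) == 27) break;   // Esc to quit
    }
    return 0;
}
```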
This is also something I was talking to Kori earlier about, this notion of the second self. What
happens when you see yourself repeated? How does that change your perception of yourself
over time?
A lot of these portraits that I worked on -- this was six years ago -- were really about seeing the
self and seeing the self over time in relationship to other people. And I'm still exploring these
ideas, in many ways. Still Watching, really, was a portrait of yourself that would trickle in over
time, so the longer you stayed in front of it, the more distinct your image would be.
In a way, this is like the opposite of interactivity. I'd call it a passively responsive interface.
And so, again, these are provocations about interactivity, and they eventually led to a project in
2007 that I worked on called the MetaDome. MetaDome is an immersive space. I know Andy's
done a few things in this area, but it's an inflatable dome with a spherical projector in it. And the
goal of the project for me was to use the universe as a metaphor to draw people into awareness of
each other and themselves, during an immersive cinematic experience.
So people would go into the dome for about four or five minutes, and they'd sit in these video
chairs. The chairs would measure how much they were moving, and as a result, the stars would
either come out of the sky and join them or sort of stay in the sky, depending on how little or
how much they were moving.
Eventually, the stars would migrate to your head in the space and then join everybody else's at
the apex of the dome. And the sound would go from being a kind of a static shhhh to being more
like wooooo, something that felt more like harmony or engendered a kind of -- I worked with a
sound artist on it, but engendered this feeling of unity with others. In part, this came out of the
fact that in Chicago I couldn't see the stars. Ninety percent of the stars are occluded by this kind
of pink haze over the sky there. And in part, it was about thinking about how an interface might
engender social unity.
And finally, the last project, just about my background, I've done a lot of stuff with immersive
projection in the Media Lab, so I've built systems that create really large projection with four
projectors, thinking about how you blend together different spaces. This is really a platform for
other people in the lab to develop something, so we made sketches that work in openFrameworks,
Processing, and ActionScript. And what I was interested in was, what new social possibilities exist
between people when they're in a space, and what new ways can people use their bodies to
interact with information?
What I learned from this, though, is that in something this large, you really read it from multiple
spaces. You read it from the fifth floor or the fourth floor or the third floor. Wherever you are in
the space, it reads differently, and so you have to think about it on multiple levels. It has to be
socially scalable, but it also has to be viewed from many different places.
In a way, it's simple. I'm just tracking people and giving people that information about the
contours and the direction they're going, but a lot of people did magical stuff with it in the lab,
and it's part of the arts festival coming up. So often, in my practice and institution, I try to
provoke and think about artistic possibilities, as well as do more HCI-oriented research.
So now I'd like to show you two projects that are more on the HCI cycle or track. The first one's
called the MemTable, and it was my master's thesis at the Media Lab in 2009, so I'll spend about
10 or 15 minutes talking about that. What the MemTable was focused on is, again, thinking
about this social interaction with each other, but really thinking about how we interact with each
other in meetings and in brainstorming and looking at how we arrange spaces.
We use stuff, right? We use large surfaces together to arrange objects, so I was thinking about
this when I was taking these pictures, just really thinking about surveying how people use spaces
together, and thinking about how would an interactive table be in a space like this? One of the
critiques that I had of Microsoft Surface, at least the first one, was that you couldn't put your legs
underneath.
I've seen a lot of people using it in our space in fairly awkward ways. The display wasn't really
always integrated with the objects on it. We did use the Surface a lot to do prototypes. I was
also inspired by Vannevar Bush's original idea of this sort of Memex table, a table that keeps a
memory of what happens inside it.
Pierre Wellner's work, Rekimoto, Bill Buxton, all of them were about integrating the physical
and digital objects together. And Bjoern Hartmann's work here at this -- I guess it's not in the
related work, but I corresponded a lot with Bjoern and other researchers who had done the four-by-six table here.
Just thinking about how do you interact on large surfaces together, but still be able to put your
legs underneath, have it be an ergonomic experience. As I began to think about it more and more
deeply, we just tried to take a very integrated approach. At CHI in 2009, there was this
discussion about killer apps, or is multi-touch a dry well? My take on it was the reason why you
didn't see a lot of large screens in the workplace is it didn't work well for a number of reasons.
And so an integrated approach was to try and attack many of the different pain points in terms of
using tables together.
One of those is just being able to work at it daily, putting your legs underneath. So I worked
with Steelcase on this design, and the primary part of the design that's interesting is this box in
the middle, which centers the projectors and the cameras together, so that all the sensing can
happen and you can still put your legs underneath, because there's a border around the table and
that border supports the objects.
It was in our lab for about three years, and what it does is, it saves the history of what you do at
it, and so I'll show you a little bit about that. Not only does it save the history of what was
happening in our lives at a given time, but we also did a lot of user studies with it, and so I'll show you a
little bit of how it works.
When you touch your face, you get this menu that comes up, and you can fling the menu to the
side, kind of like a hockey puck, and it sticks to the edges. And there's two things that you can
do. You can either take things out of the table from a previous meeting, or you can put things
into it. So you have these two icons here, and then there's a heterogeneous set of inputs that you
can use.
So we try to support as many different work styles as possible, and so let's say you start a
meeting and you wanted to go back to something that happened previously. You would scan
over the landmark events on the timeline from a previous meeting, and then you could grab
something from within that widget and pull it into the current meeting. Then it would become
part of the current session, and the next time you go back to that meeting, you would see that
thing.
So somebody could come in five minutes early to a meeting and pull out a few things, just to
remind you of what you were doing the last time you met together. Each of the keyboards is
associated with the person, based on these menus, and you could enter things into the system that
way. And then each of these entries would be tagged, so you could tag it manually, or you could
tag it asynchronously offline, using Google Wave.
One interesting thing we used is the Anoto pens to synchronize paper in real time with the kind of
virtual notebook on the system, so that way you could take the paper with you, but you'd also
have a digital version saved. There were overhead cameras to capture objects, and we eventually
used Eye-Fi cameras, so that people could sort of creatively take pictures anywhere around the
system.
And there were a lot of features built into this. I worked on it for about a year and a half, off and
on, really trying to think about -- I was really trying to make a system that worked well and
pretty seamlessly, so we had PC and Mac FTP clients, so you could send a screen capture from
your computer to the system, and multiple desktops and audio recording.
A lot of it was really about, okay, now what do we do if we have a system where we can record
all of these things? I don't recommend virtual keyboards. That was just an experiment. And
also, if I'm developing on this thing daily, what utility could it really serve in my life?
So the typical scenario would be that we would meet together after having a meeting with
physical prototypes and we'd see something about our last meeting there. So I'd bring a
prototype to the table, maybe sketch a little bit about it. We found that people did not use the
audio very much. They would rather have the audio in sync with another form of input, so if you
touched the sketch, you could hear what somebody said during that time, or you touched some
other input.
So one of the things we did is analyze all these different forms of feedback, forms of entry, into
the system. This is just giving a sense of maybe what a meeting would look like as it built over
time. So the first two components were to save a memory as a real way to add utility and to make it
ergonomic, and the third thing was to integrate it with an offline review process, where people
could tag, so then you could search either at the table or offline, using Google Wave.
So we integrated this with Google Wave. I worked with some great undergraduates at MIT to do
that. I think this allows people -- people tend to make decisions not during meetings, but
asynchronously, at their desks or when they're reflecting, so this integration was really about
engendering that reflection process for people.
There were a lot of different types of input and output and integration in this project, but what
did we learn from it? So we did a study where we compared people using the table just with
objects, but the table was completely off, so they're in the same environment, but without any of
the capabilities of the table. And then we did studies with people using the table -- three of the
groups used the table and three of the groups did not, in the study. It was paper-based versus
non-paper-based groups.
So you can see the study here, what one of the groups looked like. One of the things I noticed is
that the table actually creates this formal space. You really have to navigate a lot, because the
system's on, but also, the table itself is very formal in some sense. It really made people
remember more, when they were using the system, I think because of having the menus in their
seats and watching what was entered into the system.
What we found is that people did not remember more accurately, but they remembered in greater
detail and with more richness when they had the digital component as well as the physical. So
I'll just give you a sense of, over time, how these different meetings would occur.
And subsequently, I started thinking about what could really work in a workplace, and why
haven't we seen these things? And I have a few ideas about how people use spaces. I think that
integrating two vertical displays on each corner would allow you to present when you want to
present, and then the table itself could be a sharing space, where you deposit things into the
memory of the system, but you could also share assets with each other across the table.
One of the problems for doing research in this space is that, in order to set up the system, there's
so much work in terms of keeping track of the memory and making sure that people are enrolled
in the system. You can get way deep into just the temporal aspects of how people meet together.
It was a new space for me when I was doing this, but we published a paper at CHI, and it was
really focused on thinking about an integrated approach and how did people really use this
within our lab?
Well, I think they used it in playful ways, and that's what I really liked to see. It became a kind
of place for people to leave messages for each other. It became a space for people to gather. It
was one of the few meeting tables that was always open and on, because it was a demo.
This is kind of how I would conceive of a much better version of this. Add remote capability and
add two displays on either corner in some way, so you can present, but you can also incorporate
remote groups in some interesting way. So looking at the future of the space, I think something
that might really work will incorporate those aspects.
This is the second project, and it's really a series of projects that have led up to the current
research that I'm doing. Has anyone seen Siftables or know about Siftables? I guess so. So I
worked with David Merrill in an office for two years, and what I was interested in with Siftables
is making applications for children, because they have this intuitive -- kids know how to use
blocks, but they haven't used a mouse and a keyboard as much.
So I began working on an application that connects the Siftables to a larger screen, called
TeleStory, like T-E-L-E Story, and the idea was that if you held up the sun, it would become
daytime. If you held up the moon, it would become nighttime. You changed the environmental
parameters.
If you showed the screen the tractor, then the tractor would come in. What I noticed, and you
might notice this about this particular child, was that they would show the screen something, as if
the screen could see it. This is really a nice takeaway for me, just from an intuitive design
standpoint, this notion that we want objects to communicate with each other, and so we kind of
signal to those objects. Or at least kids think that the TV can see it, so we're working on TVs
that can see us, and so I started outlining what I call the VisionPlay framework. And this is partly
because I was working for Hasbro at the time during the summers, and Hasbro was sponsoring
my research. That might put it into perspective why I'm so interested in objects and puppetry
and expression.
What can you do? What are the playful things you can do with computer vision, and why is that
interesting? So puppetry is one. Real-time animation is another way of talking about it. Remote
playing, two people playing with objects at a distance and then using that to create content, and
mixed-reality scenarios, where you see yourself in the story.
So I'll try and show you examples of each of those. But what inspires me are components of
creativity, so transformation, social play, interactivity, gaining ownership through creating
something, storytelling and fantasy.
When we interact together and we engage in some way that transforms us or engages us with
each other or gives us ownership of the content, I think then it becomes magical in some ways.
And so imagine, you'd make your own drawing, or a kid makes their own drawing, and then they
hold it up to the screen, and that's what appears on the screen. Then, they make an object or they
have their favorite toy, and they hold that up, and it appears.
And then they call a friend, and then that friend holds their object up, and that appears on the
screen. So this is kind of my dream, is that people will start making animations at a distance
with each other in many different ways, and I've been exploring how you would go about doing
that.
So the first way I explored it was pre-Kinect. This is like a floating green-screen concept, I
guess. It's a glove with a character in it, and as you move the character up, it gets smaller
towards the horizon line, and as you move it down, it gets more in the foreground. So there's this
loose 2.5-D mapping to the world.
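A small sketch of what such a loose 2.5-D mapping could look like; the horizon position and scale falloff here are illustrative guesses, not the project's actual constants:

```cpp
// The higher the hand-held character is raised, the closer it sits to the
// horizon line and the smaller it is drawn; lower means bigger and in front.
#include <algorithm>

struct Placement {
    float screenY;   // vertical position of the character on screen (1 = bottom)
    float scale;     // draw scale (1.0 = foreground size)
};

// handY: normalized vertical position of the tracked object, 0 = bottom, 1 = top.
Placement mapTo25D(float handY, float horizonY = 0.35f) {
    float t = std::clamp(handY, 0.0f, 1.0f);
    Placement p;
    p.screenY = (1.0f - t) * 1.0f + t * horizonY;   // rises toward the horizon
    p.scale   = 1.0f - 0.8f * t;                    // shrinks as it recedes
    return p;
}
```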
The cool thing about it, real-time segmentation, is that you can put any object in there. You can
put an owl or your hand, and puppets are especially expressive in terms of objects. So Hasbro
was interested in My Little Pony assets from their TV show, running up and playing with My
Little Pony when you hold up My Little Pony, and I think that's very interesting, as well. That's
probably where they would take it.
But they were also interested in action figures, so I started working with a puppeteer, and what
makes puppetry interesting to me is really the form of the character is very similar to what you're
controlling on the screen. With a PlayStation, when you press a key and that makes the character
jump, it's not the same as making the character jump with your hands.
So I was interested in this scenario where I would be holding a puppet and you would be holding
a puppet. I would be controlling the samurai, you're controlling the dragon, and we're playing
together, at a distance. How could you accomplish this?
I tried a number of different scenarios. I tried using markers. I mean, some of you have worked
with Roy, I guess, who is here. Using shape description, and ended up creating a kinetic model
in Box2D, a physics program, that could loosely correlate to the physical model on the top left
there.
This is when I was still debugging it, but eventually it started to look more like this, where you
could just hold a character up and then, 30 frames a second, the sort of costumed version of the
character on the screen would react. The difference here between this and the previous approach
with the floating green-screen is that it can interact with objects in the scene, and it can also have
behaviors of its own. So there's a much looser correlation between what you're doing and what
you see in the digital character.
The limitation is that you can't hold up multiple objects, right, unless the system somehow
figures out what it is that you're holding or what the properties of that action figure are. This is
sort of how I would costume it, using Photoshop, mostly, with Alpha PNGs.
So we did some pilot studies with kids, and what we found is that children have difficulty with
puppets that have more than three sticks. Two sticks is much more ideal, but they're extremely
interested in controlling things on the screen and also just discovering what the affordances of
those things are, so figuring out that relationship between the object and the thing on the screen.
And the more closely correlated it is, the more expressive it is, the more interesting the
engagement between the characters.
But they have trouble seeing where the camera is, so one of the things I've done is to build a
stage that kind of tells you where the camera is and try to even put glass there so that you don't
go too far forward, where the camera can't see you anymore.
So once you get over the novelty of something like this, what can you do with it? It's a difficult
space to parse. I've also tried, for Hasbro, again, putting your face into the story using Haar
classifiers, so you can kind of control your own doppelganger, I guess.
This is really tricky, because when you're trying to do faces and move the character around at the
same time, your mind gets split into two different directions, so my takeaway from this is that it
might be better to record these things in two different tracks, or have somebody else control your
puppet while you do the faces, at a distance. This is a future area I'd like to explore.
I'm going to skip over some of this stuff, but I've done a lot of green-screen theaters at Harvard
in a puppetry course, thinking about how can we be expressive, real time? How can we make
real-time animations with different objects in more flexible ways, learning things from the film
industry, in a way, about doing stuff quickly?
So I still remember what it's like to be a 13-year-old boy. I don't know how many of you do, but
I try to utilize that. I push that with my research. I've done performances as Steve Mobs, from
the Muppets, and in those performances, the objects -- the toys are advocating to be a part of the
digital experience again, because they've been left out.
So you'd set up a little stage of toys and then they're all protesting and he's speaking. It was
really during the Occupy movement, thinking about how toys might become part of our digital
experience.
Eventually, this led to what I'm doing now, because I was trying to link real-time segmented
puppets together, but I ended up enabling a more composite environment for people. Patty, my
adviser, is very good at helping me make things more generalizable to our sponsors. Not
everybody is interested in real-time animation or expression.
And so I've ended up turning this into a much more general platform and working on it for about
nine months now, and it's been, I think to date, one of the more successful projects. Just to look
at some of the related work that inspires these kinds of telepresence systems.
Many of you know Myron Krueger, I guess, sort of legendary in some circles. In 1975, he was
working on systems like this, and his original vision was this notion of an artificial reality where
we would see these versions of each other in virtual space interacting and overlapping and
transforming in magical ways.
I found a lot of inspiration from this work that explored four different ways that children could
interact together and all the different affordances that come from these different modalities. And
to me, it speaks to this idea that there's a really rich space of mixed-reality experiences that can
connect children and parents and children and children to each other.
This was a really informative work for me, and also Nokia's work on storytelling at a distance.
And Sean's a really good friend of mine. When he started collaborating with Nokia -- I think
they're doing a startup based around this concept now, which is really interesting to see, and I'd
love to follow that work.
But the concept is to see yourself and the other person in the story during bedtime experience,
experiences with children. And there have been compositing systems prior to what I'm doing.
I'll talk a little bit about what's new about what I'm doing, compared to these previous
compositing systems. So there's been one from our lab. Stefan Agamanolis from 1997, and then
there's been a lot of green-screening-based systems for people to do things together in shared
environments.
I think this one's called HyperMirror. BT has worked a lot in these spaces, too. They have an
$18 million European initiative called VConnect, which is about engendering this feeling of
being together in spaces, and I invited BT to the workshop that's happening at CHI, so they'll be
there, and we'll be able to engage more with them on this.
But I'm really interested in what researchers out there are thinking about shared experiences and
composited video environments. Roy Ascott and Paul Sermon have also been working for 20
years in telematics art experiences that happen in these sort of crazy blue-screen sets, where you
can think about how a narrative would take place through real-time acting with others at a
distance.
So what we've built so far, this is in one space, and that's in the other, is a client that transfers the
image real time and then allows them to composite in different ways, so I can either go to your
space or you can go to my space, or we can create a fantastic background. One of the things I
did is to overlap the images in OpenCV so that I can make my body yours or you can make your
body mine. You can kind of build these sets.
So the sets are layered PNG files that are organized in the Z space, and you can place them
where you want to. And what I'm working on right now is a set builder for parents and children,
so they can design their own sets together and then go in and then use them together. And so
we're doing a study where, over a three-week period, we start by introducing the system so they
can get over the novelty of it, so they can try it out together.
And once they get used to it, we will introduce them to the scene-maker, and so the parents will -- we'll teach the children how to make the scenes first. We have a fairly simple layout program.
And then the children will teach the parents, and they'll work together to make environments.
And we want to study how they use those environments over time.
So then subsequently, two or three more sessions, do those environments take on more
significance? The system already has a lot of interesting things like you can move characters,
people, around, you can transform their size, a lot of things inspired by Myron Krueger.
The technical implementation I'll skip over, but we're working on the foreground protocol, the
compositing techniques, the 2.5-D renderer. I've been writing all this code, mostly in C++,
OpenCV, to try and build the system. Because I really imagine that people, especially parents
and children -- I would love to see them interacting like this in their homes. Right now, it's only
working in the laboratory.
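To give a flavor of the compositing described, and not the WaaZam codebase itself, here is a minimal sketch that stacks two already-segmented foreground layers over a set made of layered PNGs in Z order, using OpenCV; file names, sizes, and the BGRA format are placeholder assumptions:

```cpp
// Layers are blended back to front onto a canvas: set PNGs first, then the
// two segmented people (all assumed BGRA and pre-sized to the canvas).
#include <opencv2/opencv.hpp>
#include <vector>

// Alpha-blend a BGRA layer onto a BGR canvas.
void blendLayer(cv::Mat& canvas, const cv::Mat& layerBGRA) {
    if (layerBGRA.empty() || layerBGRA.size() != canvas.size()) return;
    for (int y = 0; y < canvas.rows; ++y)
        for (int x = 0; x < canvas.cols; ++x) {
            cv::Vec4b s = layerBGRA.at<cv::Vec4b>(y, x);
            float a = s[3] / 255.0f;
            cv::Vec3b& d = canvas.at<cv::Vec3b>(y, x);
            for (int c = 0; c < 3; ++c)
                d[c] = uchar(a * s[c] + (1.0f - a) * d[c]);
        }
}

int main() {
    // Back-to-front ordering in Z; file names are placeholders.
    std::vector<cv::Mat> layers = {
        cv::imread("sky.png",           cv::IMREAD_UNCHANGED),
        cv::imread("castle.png",        cv::IMREAD_UNCHANGED),
        cv::imread("remote_person.png", cv::IMREAD_UNCHANGED),
        cv::imread("local_person.png",  cv::IMREAD_UNCHANGED),
    };
    cv::Mat canvas(480, 640, CV_8UC3, cv::Scalar(0, 0, 0));
    for (const auto& layer : layers) blendLayer(canvas, layer);
    cv::imwrite("composite.png", canvas);
    return 0;
}
```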
So puppetry comes back into this as a way of experimenting with how kids tell stories at a
distance. We make puppets. I do these workshops in Hartford, with kids, where we make
puppets, and then we use them with the system, as well. Actually, the most effective way of
using puppets with the system is just to set the near threshold at a certain level, so there's like an
invisible screen. You place it in and out of the screen and you know where the character is.
It's very intuitive. You can learn it in a few seconds. We tried tracking the left and the right
hand. That was too much, I think, in terms of you have to calibrate first, and then sometimes it
would lose the hand and then the puppet would disappear. That seems to be the most reliable
way of exploring, acting out different historical narratives and things like that.
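A tiny sketch of that near-threshold "invisible screen," assuming a 16-bit depth image in millimeters from a Kinect-style driver; the threshold value is illustrative:

```cpp
// Any valid depth pixel nearer than the threshold is treated as puppet and
// keyed into the scene; pull the puppet back past the threshold and it disappears.
#include <opencv2/opencv.hpp>

cv::Mat puppetMask(const cv::Mat& depthMM, int nearThresholdMM = 900) {
    cv::Mat mask = (depthMM > 0) & (depthMM < nearThresholdMM);   // valid and close
    cv::medianBlur(mask, mask, 5);                                // smooth ragged edges
    return mask;                                                  // 255 where the puppet is
}
```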
So just a little bit about what my user study coming up next week is. So I'd like to study a couple
of things -- one, traditional versus composited environments. Given a choice between a
traditional face-to-face sort of thing and this merged environment, which one would parents and
children choose?
What types of activities do you do in this environment? What kinds of free play? How does the
engagement vary compared to just a regular videoconferencing session? And what's the best
means of customization of the environment? Is it the offline scene creator, or is it better to stamp
things in real time from your environment?
So I'm going to try to add these features -- and what type of environment is best for the parents
and children? Is it the fantastic background? Is it me joining your space and you joining mine,
or is it a custom space that you've made yourself?
The social dynamics are something that I'm really interested in, so we have some metrics to rate
attention and engagement, and then ownership as well. Does customization give you a feeling of
wanting to go back to the space, if you've helped create it with the other parent, especially over
multiple sessions?
And then, with WaaZam, a lot of the things I talked about earlier, transformation, the magical
things that enable you to fantasize and storytell and improvise with each other, how effective are
those when you're measuring engagement and getting parents and children to play
together?
So definitely adding custom content is something that I'm working on, using physical objects sort
of like I/O Brush in our lab, where you hold the objects up to the screen and they appear, or you
can place them behind things by moving them through the depth space. And then also thinking
about how to map different gestures to things like "I'm flying" or "make me smaller" or "make
me bigger." These are things that we're working on now with some machine learning.
I probably won't get to this in the thesis, but I would love to have it so that you could see
different tracks that you've recorded with the other person and decide whether to keep those
tracks or not using gestures, simple drag and drop sort of thing.
Okay, so last two or three minutes, I just want to touch on what would I have wanted to do next,
or what are the broad areas that I'm interested in? The first one is creative telepresence systems,
so how can a broad variety of audiences really have shared experiences at a distance? That's
really what drew me to come here, because I like what Kori's doing in her group, and I'd love to
work in some capacity, collaborating on that.
So I've been working with some improv artists to make some sketches of the different types of
activities that people would do, so acting things out within our system to kind of create a remote
play together is one thing you can do. Giving a guitar lesson, it's a shared experience that you
can't do in Skype but you can do in our environment.
Co-broadcasting together is something that I think could be quite compelling, where two people
are relating to media and then rebroadcasting that out to YouTube. And even asking your
roommate if they want to come live with you, and maybe they don't, maybe they do, but you can
sort of get a sense of their place. These kinds of even negotiations seem fairly interesting.
I'm also interested in new depth sensors that are coming out that might enable real-time
animation, and so I've been playing a lot with the Leap sensor and trying to map these popsicle
sticks and felt pieces to different birds, animals, mapping out how different animals move and
then thinking about how we can control those animals in 3-D space, not just so you trigger
different pre-rendered animations, but so you actually could tie into the skeleton of the animal in
a physics world, like the previous explorations, but more in 3-D space.
It's a difficult area, because the physics world in 3-D creatures need to have some behaviors, but
you also want to -- so this is, I think, a really fascinating area, and one in which sensors will
enable new forms of creative expression.
Intel has a new camera. It's a time-of-flight camera. Has anybody seen this in the Perceptual
Computing Group? So I was just at Intel last week, and it's amazing. It's a really, really
high-definition, beautiful camera, and they're trying to think about different uses for it. And so I've
been signed up for the SDK of this, and I'm really interested in doing more stuff close to the
screen, especially if they build this thing into the lid and release it in the next year or
two. So you'll start to see laptops that have this depth capability based on stereoscopic time-of-flight cameras.
And the third thing is augmented play experiences in general. So you've seen a lot of my work
and maybe you get a sense of what I mean by that. We have a Tumblr, arplay.tumblr.com. If
anybody wants to follow it, it's where I keep all of my cool videos. If you're always looking for
new research or cool stuff, please follow it and we can discuss new videos together. And then
you can, of course, find my work at Perspectum.com.
So thanks so much for your time, and I appreciate you coming.
>> Kori Inkpen Quinn: Questions?
>>: At the end, it seemed like you were mostly talking about parents and children. Is that the
focus, or is children and children, or how do you think about those groups differently?
>> Seth Hunter: Well, I started thinking more about parents and children, because I saw the
imaginative play and the shared experience as a way to bridge the differences between how
parents and children see the world and how they play. So if you imagine children more like -- Alison Gopnik's idea that children are like lanterns and they take in input from everywhere, and
parents are more like very focused on one thing or another, the environment that I'm in, I think it
creates an imaginary context where both people are willing to act out something they normally
wouldn't.
So I saw it as being most beneficial to that age group. The children and children, when I
interviewed the children during the pilot study, they said that they would want to play it with
their friends, for sure. Their best friend, usually, they would want to play it with when they
weren't in the same space together.
But Patty is also interested in, I guess, how our sponsors will respond to the project, so I think it
has a lot of appeal for us to talk about if you're on a business trip, how you might be able to
connect, or from divorced families. It just seems like the area where there's the most need, and
so that's why I've been focused on that.
Yes.
>>: So one thing I like about what I know about Kinect and the telepresence and so forth is that
it feels like you're not really gunning for photorealism all the time. I was wondering if you could
speculate on sort of where you think that's going to go? Do you think that's there's going to be a
place for this with the non-photorealistic thing? What's it going to look like? What's the new
aesthetic of depth camera imagery?
>> Seth Hunter: It's interesting, because I visited Cisco and Polycom, and I'm like, "Hey,
would you guys use this kind of a system within your business model?" And they're like, "Nope.
We only do business-to-business, critical decisionmaking, get as realistic as you can to face to
face."
Thinking about you guys, doing more consumer to consumer, I feel like there, if you're with
somebody that you're more intimate with and you feel comfortable imagining, as soon as you
cross that threshold between it needs to be real and it can be pretend, then all of a sudden, it
opens up a whole new space. That's the space that I'm interested in, is when you're pretending
together, what kinds of things will you create?
I mentioned this earlier. I feel like every time we engage in some sort of mediated interaction
together, there is a bit of performance happening. You kind of arrange yourself so you're in front
of the camera. If you're doing a Kinect game, it's a bit of a performance in response to stimuli
and feedback.
So I'm sort of interested in crossing that line from gaming more towards building our own worlds
together. So where I see it going is more a community of people who make these sets together
and they share them online, and they start to be building their own worlds, like Minecraft, where
if you look at the gaming world, like World of Warcraft and Minecraft, all these people are
building their own worlds and they're so into it.
I can imagine people doing that, that there's a lower threshold to parents and children and people
who would normally Skype having a shared activity. They could start building their own worlds
and sharing them with each other, and you'd start to see kids more creatively engaged when
they're at a distance. Does that hint at where you're going, or do you mean more abstractly?
>>: So it sounds like you're banking on it being an emergent property of giving these tools to
other people and sort of see what happens.
>> Seth Hunter: Yes, so that's what I'm really interested in, is if I give this to parents and
children, what kind of stuff will they make? Will they really get into it, or would they prefer to
have this kind of face-to-face engagement? I would think that they would switch between it,
depending on the context of what they're doing.
If they want to go into a virtual space where they can see all the pictures that they've shared with
each other over the last year or two, that would be, I think, a really interesting immersive space,
because you could sort of pin them on your virtual space. If you're talking about more abstract
data sets, like a network, the space that many of us work in is a simulated space. So
how do you discuss that space with others at a distance?
If you sort of enter that space together, then you can point to different things and manipulate
different things in the space. So I guess what I'm seeing is that there's this merger between the
real and the virtual, and that merger is happening in many different ways. Information is coming
into the world in the more augmented reality space, but in the augmented virtuality space, we can
also bring our world into the virtual. So it's a continual conversation, where I think there's these
subdomains that are going to be really interesting to explore, one of them being the more
imaginary possibilities.
But I think a lot of it will depend on the social relationships of the group that you're engaging in.
I don't think it would be appropriate for me to use WaaZam with you guys, necessarily. It might
be fun as a demo, but definitely with my cousins and my family. For me, it's about situating it
within the social spectrum.
Yes.
>>: I came in late, so I apologize if you already answered this, but you mentioned a little bit
about the metrics that you were using, and of course, interviews and lots of stories and
anecdotes. Do you have any way -- how do you think about measuring things like, or do you
think about measuring things like, the engagement with each other, the engagement when you
add a picture of the tree branch holding it up, when you put real objects or when you add virtual
objects? Is there a metric for using your imagination to suspend disbelief? How do you gauge
things?
>> Seth Hunter: I mean, the way we're planning on measuring it is to have two monitors, one in
each room, and those monitors have the standard one-to-seven scale, but we have seven things
that we're looking at. One is how much are they coordinating with each other? How much
pretending is there? How much roleplaying is there? When do they add things from the digital
world, and when do they add them from the physical world?
In a way, that's one way of assessing the features, but we also are going to interview people, but
after the three studies, not during them. It's really observing what they're doing and trying to
observe it in a fairly structured way. I've used papers actually from Lana Yarosh to think about
what engagement is. I really turn to actually your group and Nokia's group to understand how to
assess these metrics. And I usually try to also e-mail somebody outside of myself who's had
more experience.
Lana and I and Eric wrote a survey of the motivation and values that people have for the
Interaction Design for Children Conference, so last year we presented that paper. It surveys,
using grounded research, all the papers from the nine years to try to understand what people's
motivations are and how they do assessment, what theory informs their research.
So I try to think more deeply about this, and also Mitch Resnick is one of the people who is
advising the thesis, so he's been helping me come up with some of these metrics, as well. I know
that when I present, it's very visual, and I think that's partly because of my art background, but
also that, for me, I want to make the presentation engaging for people, so they don't fall asleep.
But I also appreciate the depth that an HCI approach has to really helping you grow as an
interface designer, which is how I think of myself.
>>: You said your stuff is very visual. Have you thought at all and played with other modalities
to increase this as a presence, whether it's tactile, auditory?
>> Seth Hunter: Yes, so I think sound is one that I've used a lot. I didn't show any of the work,
but on my website, there's a lot of work about what I call soundforms. There are different
objects that would have different sound properties, and depending on how you lay them out in
relationship to each other, you generate different compositions. This is using Microsoft Surface,
because you could get really nice data about shape of objects from Surface.
That was used in more art therapy type of sessions for children who are interested in learning
musical concepts. I am interested in sound, and I love collaborating with sound artists or sound
researchers. I haven't done anything with localized sound or anything in telepresence systems,
but I've spoken a lot with Cisco and Polycom and visited them and sort of learned more about
how they approach that to create a spatial sense of where people are, who's talking when, so I do
follow the research in that area. And I'm very interested in it. It's just a difficult space to work
in.
>>: I have a question. I've actually [inaudible] dance perspective, and looking at your interplay
and also at the end of your presentation, you had two people interlaid in one screen, would there
be application of projecting out into a larger space where you'd have more physical
collaboration, including two people in different areas of the world? Have you done any
experimentation with multiple people in such a large experience and more immersive, in like a
theater setting in any sense?
>> Seth Hunter: I think there are some people who experimented with this. I'm not sure with at
a distance, but there's been a lot of people using the contours to generate graphics or to create an
immersive theater experience, where there is digital content that's mixed with real-time
performance. Those are really inspiring to me, but one of the, I think, interesting things about
awareness interfaces is, I guess you guys have a new Microsoft Research in New York.
If you could connect these two spaces together in a public installation that was in the same -- in a
central place in both buildings, I think it'd be interesting to see this sort of time-based -- to
experiment with how you could make that work. Because people wouldn't always be standing at
the same time in front of each other, but maybe you could build something that would
asynchronously mix them together.
I think it is a very interesting space. I just am not working in that space currently, but taking it to
a larger scale. Probably one of the things you're thinking is like, "Oh, there's such a broad
spectrum of projects." I think that's partly the fact that, at the Media Lab, we have a lot of liberty
to experiment. For me, coming from an arts background, it was a place for me to gain technical
competency, but also try things that I hadn't tried before. So I used it as a test bed for a lot of the
mixed-reality concepts that I was interested in.
>> Kori Inkpen Quinn: Okay. Well, if any of you would like to chat further with Seth, just
reach out to me. It's K-O-R-I and my alias, so thanks.
>> Seth Hunter: Thank you so much. I appreciate it.
[applause]