
>> Sumit Basu: Good afternoon everyone. I'm pleased to -- I'm Sumit Basu. I'm a
researcher here at MSR. And I'm pleased to introduce Gil Weinberg who's the director of
music technology at Georgia Tech. And I've actually known Gil for quite some time
back when his hair used to be very small; now it's very large because he's famous and a
musician.
So I knew Gil back when we were both grad students at the media lab, and the thing that
I've always liked about Gil is his very innovative approach to music and the fact that he's
not one of these people who's just saying that we're going to take music and automate it
in some way; he's really been thinking about what's the experience of the musician, how
does the musician interact with the instruments.
And at the time back in the media lab he'd started working on new forms of musical
instruments, and he'll talk a little bit about that. But he's also moved on to think about
synthetic performance and being able to collaboratively work with synthetic performers
in terms of jamming and creating music, and really creating some experiences that are
just brand new and very, very exciting.
So it's my great pleasure to introduce Gil Weinberg.
(Applause.)
>> Gil Weinberg: Thank you. So I was a musician way before I became interested in
computation, and even way, way before I became interested in robotics, and I'll show you
the whole gamut of how I started from music from the acoustic element of interacting
with objects to creating sound to try to enhance it with digital algorithms and other ways
in which I could use some things I couldn't use in the acoustic world, and then coming
back to the acoustic world by combining both the acoustic and the computational aspects
of creating music.
And what I was mostly interested in is trying to really change the essence of the musical
experience, because there are many ways in which you can make music and play music
faster and cheaper and more convenient. I'm sure you're familiar with what happens in
studios, software that can allow you to do things that you couldn't do without software.
And actually this kind of area has also helped with creativity, and just being more -- being faster or more convenient in editing -- can also foster new ways of creativity. I'm not saying that it's not innovative, but what I was interested in was more changing the essence of the actual interaction with the music.
So one area which I explored is new gestures. This is a great gesture, this is a great
gesture, this a great gesture, this is a great gesture, but all limited in that we all know this
gesture because of the physicality of the instruments, and this is an instrument that can
produce this and history made them to be so. But what if this is a great gesture that can
be very expressive, you can create -- you can be real creative, but we just don't have the
instrument to capture it and to do something interesting with it.
So that's what I can do with -- so that's what I was interested in in terms of using new
sensors and new mapping techniques where I can capture these kind of interesting
gestures and maybe try to create music in a new way.
Another thing that I was interested in was the network. If you network people playing music together, simultaneously or asynchronously, new gestures and new music can influence other people as well. Imagine that you're playing cello and new gestures are being sensed, captured and sent to the violin and change his timbre from more staccato to more legato. He would just change the way he interacts with the music, because he still controls the pitch, but suddenly the timbre is a little different and maybe his movements will become more abrupt or less abrupt because of the new sound.
And these movements can be sent to the viola and change her scale from minor to major; she will change the way she plays -- shifting and so on. So this is some kind of an immersive interaction; you are not the only one that controls the music that you play. That was interesting for me to explore in terms of new musical experience.
And the last area -- an additional area -- was the idea of constructionism. Because with these kinds of instruments you can allow people not only to change the output by playing differently, but you can also allow people to program the instrument and change its behavior and learn about music and learn about computation and so on, and working at the intermediate levels seemed more popular than others. I was very interested to try to create these educational activities as well.
So let's start actually -- this is an 11-year-old clip that I did I think in my first year as a graduate student, but I still show it because I think it's kind of funny and it led to many other projects.
That's the Musical Playpen. And talking about new gestures, exploring new gestures: I had a kid at this time, about this age, who was very expressive, but obviously you cannot take a toddler, put him at the piano and tell him, Play something interesting. You have to work with gestures that he or she is familiar with, and this playpen filled with balls is -- if you ever went to a science museum or children's museum, this is something that is very fun for kids; these are gestures that they like.
So I just put pairs of electric sensors on different areas of the playpen. Next to this corner, the more active the kid is, the more stable the rhythm becomes. So I have some basic algorithms that just have it introduce fewer triplets or quintuplets -- I'll talk a little bit about stability later.
Next to this corner there was an Indian raga where the more active the kid became, the higher the pitch went. So they were just learning something about stability, something about control, which I'm not sure they did, but they did learn something about causality; they saw that they control the music and not vice versa.
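To make that mapping concrete, here is a minimal sketch of the kind of activity-to-music mapping just described. It is not the original Playpen code; the scale pitches, weights and normalization are illustrative assumptions only.

    import random

    RAGA_SCALE = [60, 61, 64, 65, 67, 68, 71, 72]  # illustrative MIDI pitches, not the actual raga used

    def rhythm_subdivision(activity):
        """More activity -> a more stable rhythm (fewer triplets and quintuplets)."""
        # activity is assumed normalized to 0.0..1.0 from the sensor readings
        unstable = 1.0 - activity
        return random.choices([2, 3, 5], weights=[activity + 0.01, unstable, unstable / 2])[0]

    def pitch_for_activity(activity):
        """More activity -> higher pitch along the scale."""
        index = min(int(activity * len(RAGA_SCALE)), len(RAGA_SCALE) - 1)
        return RAGA_SCALE[index]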
(Video clip playing.)
This is a (inaudible), all over the place. She is more introspective.
(Video clip playing.)
So you see the eureka moment where she understood that she controls the music. I don't think she knows anything about Indian ragas, but that was a first start, and it was installed in the Boston Children's Museum for a while. It was pretty successful.
The other gestures that I was interested in were these gestures that I described before, the squeezable and retractable kind of gestures that maybe can be expressive. I'll show you a quick clip of that. Maybe. And you should excuse the American media narration. It's pretty bombastic. But this is what I have.
(Video clip playing:
>> Alan Alda: And in Eye on America, it's play ball. The toys of tomorrow from the
sharpest minds of today.)
(Inaudible) what I had here is a cluster of small balls, (inaudible) kind of force sensing of what happens in between the balls, so different angles of (inaudible) could control different sequences of notes; the scales were changed by interdependently changing what the other pitch is, the volume and so on. And you can become quite expressive with that.
Another version of it was just for mixing, maybe a little -- for mixing. This here is conductive fabric, which basically allows you to (inaudible) your own instrument. I worked with (inaudible), and basically each one of these electrodes, which you can actually make ergonomically just right for whatever purpose you need, controls either the volume of (inaudible) or some timbre, and sometimes it changes the pitch itself so you can create some kind of a melody.
(Music clip playing: The flute. The piano. A flute there over this electronic sound
(inaudible) again.)
All right. So that's some new gestures that I explored. But as I mentioned, the other area that I was interested in is the network. And I had a version of this instrument where in realtime you controlled -- it was actually Jell-O-filled kind of balls that I bought in some 7-Eleven, which leaked and didn't work so well (inaudible), but in realtime there were three people that were able to control each other. So there was a melody player, and the other player, by squeezing and pulling these balls, could change her timbre and so on, similar to the experience that I mentioned before.
But it was a little too interdependent, a little too immersive, in ways that you have to know what you're controlling and what others are controlling if you are to create anything meaningful. So this led to a project that used new gestures but where the interaction was much more sequential. You could create a little rhythm, you could listen to it, you could manipulate it with this bend sensor antenna over here -- at first we had bend sensors, then in the box we had Hall effect sensors, and later we put accelerometers inside the instrument to imitate or capture some of these gestures.
And only when you're ready can you throw the pattern to someone else, who can listen to it and decide whether to develop it further or to capture it. If this person wants to capture it, they have to contribute something else to society, right: if you get something, you should give something. If they want to develop it further, they develop it further; (inaudible) they don't like it, they send it to someone else, and you create this kind of collaborative experience.
Let me show you a little bit of how this works.
(Video clip playing:
>> Gil Weinberg: Now I can start to manipulate it. I change the pitch. The more I press,
the higher the pitch goes. Now I also can change the rhythm, (inaudible) it. Stop it. Try
something else.
>> Alan Alda: Beat Bugs are supposed to allow even someone with no experience at all
at playing an instrument the ability to create and collaborate.
>> Gil Weinberg: If you're happy with the motif, hit the bug hard and it's sent to
(inaudible). What you played is recorded. I can't change what you played, but I can
enhance it. I can use different rhythmic values and different tambours.
>> Alan Alda: You can sort of hide it.
>> Gil Weinberg: Develop it. So you can take this one and play with it (inaudible) I can
play with it. So I control his music, he controls my music. So that's the more confusing
part.
>> Alan Alda: The Beat Bugs made one of their first public appearances at a trial run of
the Toy Symphony in Berlin last year. The children spent about a week working together
to create their composition, to give children a hands-on experience with music, even if
they've never even touched a conventional instrument.)
And we had kids program the Beat Bugs themselves and change some of the behaviors, in the tradition of constructionism, as I mentioned before. It traveled to six cities around the world, working with the Deutsches Symphonie in Berlin, working with the National Symphony of Ireland in Dublin and so on, and working a week with kids before the public performance.
And if you see, they developed all kinds of gestures just to show each other that they're playing with each other, even though nothing was sensed. Let me give you another example of gestures that the kids made that we don't sense, which later made us change the instrument and try to capture more of what we were seeing. Or maybe I'll show you some of the workshop before, actually. I forgot about this slide.
(Video clip playing:
>>: What was it like when you first picked up this thing and it began making the noises?
What did you feel like?
>>: Famous. It's brilliant. Really good. It was really exciting. (Inaudible) and it's like
talking to each other.
>>: Is it as fun as talking to each other?
>>: Much more so.
>> Gil Weinberg: (Inaudible) who actually started to play drums after that.
>>: (Inaudible.)
>>: It makes you want (inaudible) because when you play the Beat Bugs you enjoy it, so you just --
>>: (Inaudible) this is too hard, but like Beat Bugs shows you that it's not always so hard. You can enjoy yourself.
>>: (Inaudible) it's actually -- )
I don't know if you noticed, but this was in a city, Glasgow, where it was great working with kids that had no experience -- in Berlin they were all musicians -- and to see how they can create music without having any background before. This is maybe an example from this concert.
(Video clip playing:
>>: And to play (inaudible) by Gil Weinberg for eight Beat Bugs, here are six children
from Sacred Heart primary school Mark O'Keefe (phonetic) and Ian Crawford.)
So she started with a rhythm. She doesn't know where it goes. The surprise element was very important. He received it and he developed it. Actually in this case he just pressed something all the way down, but he could have developed it more delicately. And now he can manipulate the first button; in this case, because she started, it's just the metronome. And he developed it and sent it to someone else. She listens to it, she decides to develop it further. This guy decides to enter a new button, so he keeps the buttons that he received. And he's looking to see who gets it. And whoever gets it can manipulate it and change it. And see all of these gestures just to make sure -- I'm now the head of the snake. We called it the snake because it goes from player to player.
Another element is to have someone from the orchestra -- this is from the BBC at Glasgow, (inaudible) percussionist (inaudible) -- and try to explore whether it's interesting enough for people that are very experienced musicians, not only for kids.
And at some point the system is working -- maybe I'll show this in a different clip. The
system is working for multiple hits from everyone, and then couples just pair, another
pair, another pair, just to see how they control each other's music. And especially here,
notice the gestures that they developed.
(Video clip playing.)
>> Gil Weinberg: Just -- they don't know who's going to pair. They're looking around,
these two, and manipulating each other. Now it's four of them. (Inaudible) this one
doesn't know that they are pointing to this guy and not the other guy. And of course,
Glasgow, they like soccer so they brought this from the soccer field.
So we (inaudible) sense it, and actually this is the last project that I'll show from MIT -- this one is already at Georgia Tech. MIT was nice enough to allow me to take some Beat Bugs to Georgia Tech and continue to develop them. So what we did is a couple of things. First we replaced the bend sensors with Hall effect sensors, which are more robust; robustness in these kinds of things, especially when you give them to kids, is crucial.
I just talked to somebody -- I visited the Experience Music Project yesterday and saw how difficult it is to maintain robotic instruments that the general public plays with. So this was much more robust.
We also put accelerometers inside to sense these kinds of gestures. But the application was completely different, actually. Here, these are two of my students; I play piano. And the thing that we learned is that kids had more sophisticated ideas than they could actually create themselves. Here you can create something and then manipulate it to create something more sophisticated. But if you love something that is, first of all, melody based but also more sophisticated, you cannot personalize it, you cannot take it into the other application and make it your own.
So here what we did is that with the Beat Bug you just (inaudible) and hit record, whatever you listen to, and it can be either -- one of the Beat Bugs is piano. Actually, in this case, (inaudible) piano, it can also be from audio. We do some onset detection, pitch detection and so on, if it's from audio, to capture what was being played. And then people manipulate it and change it and control it with gestures, send it to each other and create this same kind of collaborative experience that I mentioned.
(Music clip playing.)
I play this pattern. (Inaudible) capture it (inaudible) this pattern (inaudible) and so on.
Let me show you a little bit what we did with the audio.
(Music clip playing.)
(Inaudible) Beat Bugs improvise (inaudible) audio recordings of trumpet playing in realtime. Here an analysis algorithm detects the notes and pitches played by the trumpet and organizes them according to pitch and location. The piece begins with a melody section where the Beat Bug player can capture specific notes using (inaudible) and extend them by manipulating pitch and timbre. In the improvisation section the player can create his own melodic (inaudible) with the captured and analyzed audio material.
>>: When I hold the antenna here, they play the low pitches. And as I move it up, it
plays higher and higher pitch chunks. So that was the lowest one I found. And as I push
the antenna down, it plays a higher chunk.
>> Gil Weinberg: (Inaudible) includes segmentation and reshuffling of the trumpet solo
and effect manipulation such as pitch shift, delay and reverb. The other (inaudible) Beat
Bug interacts with a (inaudible) utilizing a similarity algorithm based on (inaudible) and
rhythmic density. The more the antennae are bent, the less similar the transformation
becomes in comparison to the original piano solo.
All right. So that's all very well. Those are some things that I was happy with in terms of the new gestures, and some things I was happy with in terms of the interaction in a group, the constructionism. But something was missing. And the thing that was missing for me -- having started by playing acoustic piano and other acoustic instruments -- was the acoustic sound, the richness of acoustic sound, (inaudible) because as rich as it can be, it cannot really capture the richness of playing an acoustic drum or any acoustic instrument.
There was also the whole idea that a lot of my interactive algorithms were just the computer listening, trying to analyze what's going on and playing back to you; but you don't have the visual cues. You can't anticipate. You can't coordinate. A guitar player, when he or she wants to end the song, will do this with the drummer, right? You see the gestures. If I'm going to hit hard, I'm going to see it before; I can anticipate, I can know what's going to happen, or go high on the marimba or low -- there's something about acoustic sound and the visual cues that were really missing for me with electronic and digital music.
But I really loved all the algorithms, the perceptual algorithms and the improvisation algorithms I came up with. So the only thing that I could think of that could capture both ideas -- both acoustics and visual cues, and the algorithmic aspects of interactive music -- was, unfortunately for some people who ask me, Do you want to replace all musicians with robots?, it was robots. And the robot has the benefit of actually creating the acoustic sound. I can look at it, and you'll see in some of the clips that I show you how I can synchronize and coordinate my work.
And obviously I can have the perceptual and (inaudible) algorithms there and try to see what's going on when I play with someone that plays completely differently than I do and thinks about music differently than I do. Actually, similarly, because I programmed it, but still differently.
So the main benefit was -- the main goal was to combine the benefits of computational power, doing all the things that I mentioned, with acoustic sound and visual cues. And the big goal was to create novel music experiences. So the idea is that I as a musician come with things that only I, the human, in the foreseeable future can come up with -- kinds of expressions or emotions that I don't even try to imitate or to model or to do anything with in the robot, because that would be, I think, futile.
And the robot brings things that it can bring that I cannot. For example, I'll show you a genetic algorithm-based improvisational model that luckily probably no human uses, but maybe, interestingly, a robot can -- fractals and other things like that. And then things that would make the robot listen like a human, because I want to create the connection with a human. If I play a particular tempo, a particular beat, I want the robot to know this is the tempo and this is the beat. If I play -- I'll show later some clips with melody too -- I want the robot to know that this is maybe a dominant that is not resolved, so maybe the robot can resolve it for me if it thinks that it should be resolved now. But I do want it to improvise in a way that is completely different and maybe to inspire new music, create new experiences.
So one of the things that was important was not to stick to this first design, because we started with -- actually it was a (inaudible) one of the students. And it didn't create the kind of connection that we wanted. It didn't create anticipation, it wasn't big enough, it wasn't expressive enough, so we went with some kind of anthropomorphic design that is not really humanoid but looks like a human and creates a better connection with anyone who wants to play with it.
The first hand used just a solenoid. So there wasn't great freedom there. It doesn't compare to any sophisticated robot that you see today. The focus was on the interaction and on the algorithms. But there were some degrees of (inaudible), actually four, back and forth to control the pitch. The striker with the solenoid was too small -- the movements were too small -- so for the second hand we decided to use a linear motor that creates a louder sound and much more visible cues.
In perception we had to deal with all kinds of things. Especially with the (inaudible) pow-wow drum, which was a collaborative instrument, it was important to use something that is made -- one that is (inaudible) for everyone -- that is made for collaborative play. Yes, you can play piano four hands, but this is not the main goal of the piano. With the pow-wow it's about playing together.
There's a lot of reverb to the pow-wow, so just using (inaudible) to try to figure out where the peaks were was a basic level of perception -- to figure out that if I play a loud sound and then there's another one, I can figure out where the other tones are and just recognize the new hit, the new onset.
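As a rough illustration of that basic level of perception -- and only an illustration, not Haile's actual detector -- an energy-based onset detector with a decaying threshold can ignore the reverb tail of a loud hit while still catching the next one. The frame size and constants here are assumptions.

    import numpy as np

    def detect_onsets(signal, frame=256, rise=1.5, decay=0.95):
        """Return sample positions of new hits in a mono drum signal (numpy array)."""
        energies = [float(np.sum(signal[i:i + frame] ** 2))
                    for i in range(0, len(signal) - frame, frame)]
        onsets, reference = [], 0.0
        for idx, e in enumerate(energies):
            if e > reference * rise and e > 1e-6:   # clearly above the decaying floor
                onsets.append(idx * frame)          # a new hit, not the tail of the last one
                reference = e                       # the new peak becomes the reference
            else:
                reference *= decay                  # let the reference decay with the reverb
        return onsets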
For pitch detection we used some well-known algorithms; we didn't invent anything there (inaudible) as well. Higher-level percepts were tempo and beat -- we just do some similarity, autocorrelation kind of algorithm to get the beat detection. Simple things like density: if I play really densely, the robot would know it and maybe it will play sparser, would just give the beat, and vice versa.
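A simplified version of that idea, with made-up parameter values, might look like this: correlate an onset-strength curve with delayed copies of itself and take the strongest lag as the beat period, and count onsets per second for density.

    import numpy as np

    def estimate_tempo(onset_strength, frames_per_second, min_bpm=60, max_bpm=180):
        """Autocorrelation-style tempo estimate from an onset-strength curve."""
        x = onset_strength - np.mean(onset_strength)
        min_lag = max(1, int(frames_per_second * 60.0 / max_bpm))
        max_lag = int(frames_per_second * 60.0 / min_bpm)
        scores = [float(np.dot(x[:-lag], x[lag:])) for lag in range(min_lag, max_lag)]
        best_lag = min_lag + int(np.argmax(scores))
        return 60.0 * frames_per_second / best_lag      # tempo in beats per minute

    def note_density(onset_times, window_seconds):
        """Notes per second; if this is high, the robot can answer sparsely, and vice versa."""
        return len(onset_times) / window_seconds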
Similarity algorithms -- I'll say more about them -- and stability again here. We took some research from music perception by (inaudible) and Honig (phonetic) and others and implemented it in the robot. We didn't do the real perceptual work, but we did try to focus on the human interaction, on how to use some of these models in human and robot interaction.
So something about the perception. We had a database of (inaudible) 100,000 different fragments, (inaudible) fragments. And the idea is that the robot listens to them in succession, one after the other. And for each one of them it gives a number for how stable it is and how similar it is to the original pattern. And based on that, out of the database it can produce anything. It can produce something that is as similar as possible to what the original rhythm was, or very different. It depends on the composition curve, and we allowed composers to compose with it.
As stable or less stable, more stable. But the idea is that the robot will know something about it, and then you as a composer or a system designer can tell it how to use what it knows.
Really quickly, the rhythmic similarity was based on a (inaudible) paper from 1993: just look at the similarity of onsets, with zeros and ones. You can see that this is five, and four out of five overlapping onsets -- we just use this and put it in some kind of an equation to know how similar the rhythms are. Very simple.
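In code, that overlap idea is just a couple of lines. This sketch assumes both rhythms are already quantized onto the same grid, and the exact equation in the 1993 paper may differ:

    def rhythm_similarity(a, b):
        """a, b: equal-length lists of 0/1 onset flags on a common rhythmic grid."""
        shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
        total = max(sum(a), sum(b))
        return shared / total if total else 1.0

    # Example: four of five onsets overlap -> similarity 0.8
    original = [1, 0, 1, 1, 0, 1, 0, 1]
    variant  = [1, 0, 1, 0, 0, 1, 0, 1]
    print(rhythm_similarity(original, variant))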
For stability we used (inaudible) and Honig's computational model, which is based on ratios between adjacent intervals. There's a big preference for small integers, one or two, to be more stable. (Inaudible) stable is just if you can tap your hand to it. Less stable is when something in the rhythm doesn't sound similar to what just happened before, so it continuously changes.
Onset stability is a sum of all kinds of stabilities between pairs of notes, and then the general stability of the whole pattern is a geometric mean of the onset stabilities.
So here, just quickly: between this quarter note and that quarter note it's one to one; between this quarter note and the eighth it's two to one; here it's one to one, and so on, and one to two. And then you take all of this and, based on this filter of what you prefer, you come up with a pattern. If you want to see the difference in stability between this area and this area, you just take these two quarter notes and compare them to the first eighth note -- this is four to one, so the stability is very low here -- then this to the other one, two to one, then this quarter note to the whole, which is basically one to one. And we do the same thing for this quarter note. And then you take the geometric mean, and this pattern has a stability geometric mean of three, and there you can use it.
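As a hedged sketch of the shape of that computation -- the published model has more detail, and the scoring function here is an assumption -- one could score each pair of adjacent inter-onset intervals by its ratio and take the geometric mean over the pattern:

    import math

    def ratio_score(a, b):
        """Smaller integer ratios between adjacent intervals count as more stable."""
        hi, lo = max(a, b), min(a, b)
        return lo / hi            # 1:1 -> 1.0, 2:1 -> 0.5, 4:1 -> 0.25, ...

    def pattern_stability(intervals):
        """intervals: inter-onset intervals of the rhythm, e.g. in beats."""
        scores = [ratio_score(x, y) for x, y in zip(intervals, intervals[1:])]
        return math.exp(sum(math.log(s) for s in scores) / len(scores))

    # Quarter, quarter, eighth, eighth, half note (in beats):
    print(pattern_stability([1.0, 1.0, 0.5, 0.5, 2.0]))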
More interesting was how to use all of this in human-robot interaction. Because when musicians play with each other, sometimes they play (inaudible). I listen to something -- maybe even simpler: I want to imitate. This was interesting; I want to imitate it. Sometimes I want to have some (inaudible) over it. Sometimes I give an accompaniment.
And the robot -- who should it listen to? Sometimes it should listen to this player or to that player. Based on what? We did some studies asking musicians what you expect when you play with other musicians -- who do you respond to and when? Sometimes you ignore everything and generate something within you because you're completely bored, and how would the robot decide, Now I don't want to use anything of what I created and do something new?
So we used some imitations, too: (inaudible) transformation, take something that is a little similar, just break the beat a little bit; perceptual transformation, make it more stable, less stable, or more similar or less similar; then synchronization, just to detect a beat; perceptual accompaniment, if you play loud, it will play loud; if you play dense, it will play sparse, and so on.
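A toy dispatcher in the spirit of those modes might look like the following; the mode names mirror the list above, but the thresholds and the decision logic are illustrative assumptions, not Haile's actual rules.

    def choose_response(motif, loudness, density, beat_period):
        """Pick an interaction mode from simple perceptual features (loudness 0..1, density in notes/second)."""
        if density > 8.0:                        # player is very busy: hold down the beat
            return ("synchronize", beat_period)
        if loudness > 0.8:                       # match the energy with an accompaniment
            return ("perceptual_accompaniment", motif)
        if density < 1.0:                        # sparse input: imitate to build rapport
            return ("imitate", motif)
        return ("stochastic_transformation", motif)   # otherwise break the beat a little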
And then, something that always works, synchronized unison: you have a little (inaudible) just to end the piece that you play together -- it always fishes for claps if you play on the same beat. And of course you can do interesting things such as morphing: take the (inaudible) pitch curve from one drum -- not really a pitch, because it's a drum, but low or high -- and take the rhythm from another drum, morph them together and play (inaudible) and see how you respond to it.
Let me show you some clips. Again, everything in these clips is not sequenced; it's all based on what it listens to.
(Music clip playing.)
Notice here how Scott is actually playing with Haile. Haile is the robot, Scott is a human. And how Haile picks up on the difference -- Scott will play a seven-quarter rhythm (beating rhythm). Haile -- you can still hear the seven-quarters rhythm, but you see the improvisation that Haile applies to it, some (inaudible), some perceptual. And you see how Scott captures ideas from Haile and builds off of them, Haile captures ideas from Scott and builds off of them, and so on.
(Music playing.) Introduce something new. Scott captured it. You can still hear the seven quarters here in the whole piece.
All right. Here is a clip of the beat detection. You see how Scott changes the tempo, changes the beat, and Haile tries to imitate it. It takes a couple of seconds of confusion, but it takes some seconds of confusion for humans as well. So I'm pretty happy with it.
(Music playing.) Got it. Scott is changing. Got it. Scott is changing. Got it.
Confusion. Slower. Confusion. Got it.
Okay. So we use this algorithm and others to do this perceptual accompaniment. It actually listens only to these two players. We just give the accompaniment; they actually play (inaudible) better than us. And it morphs between the pitch from this guy and the rhythm from this guy.
Also, we have a very directional microphone inside each of the Robucas (phonetic), so it basically gets two different signals that are completely separate. So when it's very dense, it will give the rhythm. When it's very sparse, it will improvise, morphing between the two motifs.
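The morphing itself can be sketched very simply -- take the onsets of one player and the pitch contour of the other and resample the contour onto those onsets; the data structures here are assumptions for illustration.

    def morph(rhythm_onsets, pitch_contour):
        """rhythm_onsets: onset times (seconds) from player A.
        pitch_contour: list of (time, pitch) pairs from player B."""
        morphed = []
        for t in rhythm_onsets:
            # use B's pitch that is closest in time to each of A's onsets
            _, pitch = min(pitch_contour, key=lambda tp: abs(tp[0] - t))
            morphed.append((t, pitch))
        return morphed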
(Music playing.) Okay.
Another thing that it can do, which is pretty fun, and I don't know if we have time, but I can experiment with you a little bit -- it's polyrhythm. So it captured a little motif from me during the piece, which actually was in nine quarters. It was (making beat). Do you want to clap? (Clapping.) (Inaudible) so let's try to do it (inaudible). (Clapping.) If you try it with one hand, that's a little bit more difficult, but it's also possible with one hand. (Clapping.) Okay.
The second pattern it received from Scott was in seven quarters. It was (making beat). Good. Now, all of you, (inaudible) play the nine in one hand and the seven in the other hand and let's see if every --
(Laughter.)
You can't? Haile can. (Music playing.) (Inaudible) create some interesting structures and sustained rhythms that are not likely to be created by humans, at least not the humans that I know. And as I mentioned before, it's always nice to end with a big unison piece. So even though there's (inaudible) perception, no improvisation, we always ended every piece with this little sequence (music playing).
(Applause.)
All right. Here's maybe a clip -- there's a paper in the Computer Music Journal where you can read more about it, but just briefly about the perceptual studies: after we embedded this algorithm in Haile, we wanted to see if it actually fit with how humans perceive what's more stable and what's less stable. This is -- and (inaudible) we have music classes at Georgia Tech, we can just tell the students to participate in studies.
And the results were pretty good. Think about it: humans do not always agree on what's more stable or less stable. We got responses in the 80 percent kind of range that fit with what Haile felt was more or less stable.
Let me show how this experiment worked.
(Music playing:
>>: Haile captured this pattern. So this is the first one. This is the second one. This is
the third one. (Inaudible) play the original one just one more time for you.)
>> Gil Weinberg: I don't know if you agreed with what was written there, but they tended to agree in the 80 percent kind of range.
And then Haile went melodic. And since we didn't -- I was interested in capturing some melody and trying to figure out if this could be extended to harmony as well. I still didn't have the robot to create a rich acoustic sound, so a lot of the point -- that this is done to extend the richness of the timbre -- doesn't really work here; maybe (inaudible) even sounds better than the xylophone here that was just retrofitted to fit the range that we already had.
But actually I'm working on an NSF-supported grant right now. There'll be a big robot playing a marimba with very good sound, so that some of the algorithms here we can actually implement in a better sound environment and with larger gestures, bigger gestures for the anticipation and so on.
So we tried a genetic algorithm. And the way it worked is that I played all kinds of small motifs and created a generation, a population of small motifs -- my musical genes, if you will. Then in realtime Haile was listening either to a saxophone player or to myself and, using dynamic time warping as a similarity metric, trying to figure out, based on this motif, what in this population is the 50 percent that is most similar to it.
And it went pretty quickly. So we ran something like 15 generations, based on all kinds of mutations and crossovers between these motifs, in almost realtime. We didn't run it for enough generations to create complete similarity, because if we ran it for, I don't know, 500 generations, with all this crossbreeding and mutation it probably would have found something that was exactly the same as what it was listening to. But we didn't want to maximize this GA. We wanted to stop when it's similar enough to what it listens to but still bears the genes from the population before, and see how the player responds to something that is similar yet different.
Some of the mutations and crossovers were very simple, such as taking the pitch and rhythm of two parents: you see the two children here -- one has the same kind of pitch contour as parent B and the same kind of rhythmic contour as parent A, and vice versa. So some of the manipulations were musical, some of them were random, just to see what happens, pretty much.
And as I mentioned, (inaudible) similarity to the input based on DTW; we also used some other algorithms, such as (inaudible) melodic attraction, to see how much tension there is between two notes melodically, and so on, and pitch proximity. And the idea is that you start with the (inaudible), we run several mutations and cross-selections here and create some kind of morphed hybrid.
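Condensed into a sketch -- with all parameter values and data structures as assumptions rather than the actual implementation -- the loop described above (DTW fitness, keep the most similar half, breed children by swapping pitch and rhythm, mutate a little) looks roughly like this:

    import random

    def dtw_distance(a, b):
        """Plain dynamic time warping distance between two pitch sequences."""
        INF = float("inf")
        d = [[INF] * (len(b) + 1) for _ in range(len(a) + 1)]
        d[0][0] = 0.0
        for i in range(1, len(a) + 1):
            for j in range(1, len(b) + 1):
                cost = abs(a[i - 1] - b[j - 1])
                d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
        return d[len(a)][len(b)]

    def crossover(parent_a, parent_b):
        """Child takes the rhythm of one parent and the pitches of the other."""
        return {"pitches": parent_b["pitches"][:], "rhythm": parent_a["rhythm"][:]}

    def mutate(motif, amount=1):
        m = {"pitches": motif["pitches"][:], "rhythm": motif["rhythm"][:]}
        i = random.randrange(len(m["pitches"]))
        m["pitches"][i] += random.choice([-amount, amount])   # nudge one pitch up or down
        return m

    def improvise(population, heard_pitches, generations=15):
        """Motifs are dicts with equal-length 'pitches' and 'rhythm' lists."""
        for _ in range(generations):
            population.sort(key=lambda m: dtw_distance(m["pitches"], heard_pitches))
            survivors = population[: len(population) // 2]    # keep the most similar half
            children = [mutate(crossover(random.choice(survivors), random.choice(survivors)))
                        for _ in range(len(population) - len(survivors))]
            population = survivors + children
        return population[0]    # close to what was heard, but still carrying the older genes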
Interaction-wise, we looked at musicians, how they played. We looked at videos, we looked at one musician at a time, we interviewed musicians, first of all, to try to figure out when you listen to whom -- based on energy, for example: if there's more energy, do you listen more to this guy, or maybe you try to ignore it or not -- just where that attention is, listening to what in each musician. When do you follow, when do you interact, when do you ignore, when do you initiate something new, when do you give an accompaniment. And when you play, do you just imitate -- just to show the audience, You see? It's not just random. It actually listens; this is exactly what it played. When do you transform and how do you transform, when do you generate something new, when do you bring back maybe something from ten minutes ago. That's something else that robots can do and humans can't: bring something that was played ten minutes ago and suddenly manipulate it in an interesting way.
The first piece was (inaudible) if you are not big fans of free jazz, I will play it shortly.
Later I have more (inaudible). But this is pretty free.
(Music playing: Similar but different. Capture it, build on it. (Inaudible) I also listen to
him and to the robot. (Inaudible) listens to me because it was new information so it
tracks with me now. (Inaudible.))
All right. Something more melodic. Give an accompaniment, listen to me and the
saxophone player and put this trio -- it's a very fast trio based on the harmony.
(Music playing.)
All right. And here is more improvisation (inaudible).
(Music playing.)
All right. So currently I'm working on a robot that actually will deliver the promise of creating better quality of sound and adds new features, such as harmony analysis, better timbre, social interaction. I'm working with Andrea Thomaz from Georgia Tech on (inaudible) social cues for robots, to see what it looks for, how it responds. Just the simple stuff that we tried was head banging based on the beat, just to create more connection. Educational -- and all this multimodal -- it's not really fair that I can listen to Haile and also see it and Haile can only listen to me. So we are thinking about using a camera and some kind of vision in order to extend its capabilities.
I'm also working on a completely different scale, because traveling with this demo is pretty difficult. Actually, (inaudible) Haile before every trip -- we traveled to close to 20 concerts around the world, from Asia to Europe to America. And I was really looking for something which is much easier to travel with. So we're using some of these algorithms and some of these ideas in cell phones. A little smaller, (inaudible) I don't have to -- so we have actually started, with Georgia Tech, a (inaudible) company now that does all kinds of music applications for mobile interaction.
And I really wanted to tell you about it, but the CEO that I just found told me, Don't say anything yet. Hopefully I will be back here, maybe even with a (inaudible) different product, to describe what we have. So that's (inaudible) mobile interaction in the future work.
And that's pretty much it. Thank you.
(Applause.)
>> Gil Weinberg: If you have questions.
>>: (Inaudible) in your videos is that in the early ones, when the drummers are playing with the robot, they're really very intently looking at the robot and following along with it, but in the later performances it seems like the performers are not really watching the system play. And the -- yeah, yeah, in a typical jazz sort of combo situation, it's not so much that you're actually watching the interactions on the instruments, but you're looking for the little head nods and things like that. Have you considered doing that level of robotic stuff, where you're purposely giving kind of head cues and stuff like that?
>> Gil Weinberg: So the second two clips -- the last two clips that I showed -- are maybe interesting in terms of algorithms but exactly opposite to what I wanted to achieve. The sound is not better and there's no visual interaction. But, as I mentioned, one of the things that we are interested in adding is the whole idea of the robot looking at us. And exactly for these kinds of situations. Because currently the robot cannot see what we are doing; it can only listen, and it doesn't have the social interaction that you have when you look at each other as well.
So that will address the idea of the robot knowing what we're doing: if we want it to play, it can maybe look at us and figure it out with vision sophisticated enough for changing lighting conditions. Vision will be tricky.
As to looking at the robot and its head movements, that's what is addressed by this social interaction work, and the head by Andrea Thomaz, which will have many degrees of freedom. And we did realize that it's very mechanical; it just moves like this. It doesn't groove. We want a grooving robot. So these two elements hopefully are going to address exactly the issues that you mentioned, which are definitely valid issues.
>>: When playing with the robot, were there specific instruments that it had more difficulty learning from than others, or improvising with?
>> Gil Weinberg: In terms of the (inaudible) itself, obviously (inaudible) instruments are easiest. We don't have to work at all. And for each of the audio instruments -- we have a trumpet, as you saw, and a saxophone -- there was a lot of tweaking in order to adjust to the particularities of each instrument. It looks as if it's easy, but it's not. Before every -- when you come to a space, you have to try everything again; you have to have half an hour just to tweak it to the reverberation in this particular room, to the sound of the new saxophone player that you never met. (Inaudible) played at ICMC and we just met the saxophone player two hours before the gig. So it was a lot of -- a lot of tweaking.
And so I think it will be -- we chose two difficult instruments. The trumpet has a lot of noise, but we wanted to push (inaudible) there. I think if we can do trumpet, we can pretty much do everything. Maybe a low double bass would be more difficult -- there are low frequencies in the reverberation, where it comes from and so on.
But in terms of horns and the horn section, trumpet and saxophone are pretty difficult. I think everything else will be a little easier in terms of detection.
>>: So based on that, are you planning on adding learning algorithms where you could
go into a new room and a new trumpet player would come in and the robot would just
figure it out on its own without you tweaking?
>> Gil Weinberg: (Inaudible) new students to work on that.
>>: So is all of this microphone based to get the original signal?
>> Gil Weinberg: (Inaudible) piano.
>>: Yeah. I mean, but I remember years ago Roger Danberg (phonetic) used to have a
trumpet mouthpiece with a hole drilled on the side. Because you could get a much better
signal there.
>> Gil Weinberg: Yeah, no, (inaudible) microphones (inaudible) 800 bucks each, so it's not a simple one that you would buy tomorrow. But they work pretty well and isolate all the crosstalk from anywhere else. And we chose the microphones for the instruments that we use, so for the (inaudible), for the pow-wow, they were just the right microphones.
>>: Did you ever use more than one copy, I guess, of (inaudible) scale to the point where
two robots could accompany each other without interfering with each other?
>> Gil Weinberg: (Inaudible)?
(Laughter.)
>> Gil Weinberg: (Inaudible) we have one version of Haile.
>>: But another instrument, like if you build another Haile, did you ever do that so that
you could see if (inaudible) there were two Hailes. Could they both play (inaudible)?
>> Gil Weinberg: They can (inaudible) and once we have the new robot that will look
like this and will play the marimba, definitely at some point we will try to see how they
interact with each other and see what happens.
But this would actually be the second phase of what we're going to do. Because the main interest is human-robot interaction: can humans bring what humans bring and robots what robots bring, and see what happens. And the second will be, okay, let's try to go out and see what happens if two robots interact with each other.
>>: I was wondering about the (inaudible) -- so if it's just in a rhythmic context, when you're playing with a drum, is there some notion -- how does it determine how long -- when the downbeat is? Because you have those examples in seven or nine (inaudible) what's happening?
>> Gil Weinberg: So in the first (inaudible), Scott playing on the pow-wow, Scott has a pedal. And it just does it. And we realized that this is not how we want it to happen. There are some sections where Haile just listens for silence. After two seconds of silence, now it's my turn (inaudible) response. And we tried this one, and this one also does not cover the whole gamut of interaction.
And in the last interaction we have a very sophisticated mechanism that decides, based on the level of energy, the kinds of notes, the number of notes, time, silence -- nothing, no pedals, a completely automated system that decides when it's going to answer: this is a motif, it started, it had some kind of a curve, it ended -- I think it ended. Something's wrong. Maybe I wanted (inaudible) to play, but then it interrupted me.
When this is (inaudible), this is too similar -- if I play the same thing again and again, Haile will -- at some point it will do something and will try to create something, completely automated. But the first steps were very didactic, very simple.
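As a hedged sketch of that kind of decision -- the real system weighs more features than this, and the thresholds are invented for illustration -- the turn-taking test might combine silence, an energy drop, and a minimum phrase length:

    def motif_ended(seconds_since_last_onset, recent_energies, notes_in_phrase,
                    silence_gap=1.5, min_notes=3):
        """Decide whether the human's motif has ended and it is the robot's turn to answer."""
        went_quiet = seconds_since_last_onset > silence_gap
        energy_fell = (len(recent_energies) >= 2
                       and recent_energies[-1] < 0.5 * max(recent_energies))
        return notes_in_phrase >= min_notes and (went_quiet or energy_fell)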
Actually, for the couple of clips (inaudible) you can see here, we have John deciding what interaction mode Haile is going to respond with. And here we just automated it totally, because we felt that this was wrong. And the results, I don't think, were as good as with John, because John is a very nice and good musician and the automated system is not there yet, but it's on the way.
>>: I was wondering how you would characterize the most important things that you've
learned about in your musicianship or robotic musicianship that you think would be
important for other people building completely different systems (inaudible) somebody
not doing robotic percussion (inaudible).
>> Gil Weinberg: I think what I learned is that it's very difficult to create music that I like, or that humans like. So, yes, it works; sometimes (inaudible) similar but different, and if you don't feel it, I say it, so you feel it. And, yes, sometimes the results are surprising -- those were the best moments, where suddenly it really inspired me. There were small moments like that. But most of the time I thought, I would never buy a CD of this yet. And in order to get music that I would want to buy the CD of, there is many, many -- a lot of research that can and should be done.
It's also very specific. Like musicians -- if you're a good musician, you will come to a drum circle and you will adjust. You go to free jazz, you'll adjust; you play rock and you'll adjust. Here we made it almost like (inaudible) for different genres. I have no idea how to -- but why am I saying that? Because that's what I see as musicianship: being able to be musical in different contexts. So Haile can be musical and can be interesting in a specific context. I think it's very -- I don't know how to go further. I think it was a lot of work just to work on this specific concept before we try to really instill musicianship in the largest meaning of it. All right. More questions?
>>: So, in general, by working on all these (inaudible) calculations that these robots use,
do you think that has developed your own musical intuition or challenged it?
>> Gil Weinberg: I think both. It developed it in the sense that since I programmed the algorithms -- and even when I have no control, of the genetic algorithm, for example, or when I didn't choose the factors, something I have no control over -- I kind of had more understanding of, or more connection with, Haile than I have with other musicians that I don't play with a lot. Because I kind of knew what it would sound like. I anticipate it and really listen carefully, because I kind of know what's going to happen, but not exactly. I think that this developed my listening skills and my ability to look for things.
But if by challenged you mean I felt (inaudible) in some of the performances because I was disappointed -- yes.
>>: Actually, I was thinking more along the lines of: by thinking so much about technicalities, do you find it may be more difficult to just improvise? Because a lot of times when you're playing music, (inaudible) it would be that whenever you try to improvise (inaudible).
>> Gil Weinberg: (Inaudible) sometimes, and yes, this is a big challenge. With this robot I was very, very attentive and looked -- listened and tried to really create, to build the piece. But it was the first performance also, so I was concerned it was going to break and not work.
But here (inaudible) Haile, partly because it was a very crowded jazz joint, partly because the monitors didn't work so well, so I didn't really hear everything that was going on. I felt much (inaudible) improvised with Haile; I think that was much better. So sometimes yes, sometimes the opposite.
>>: I was wondering what your plans for the harmony stuff were. Because it's hard for me to evaluate, like, the sort of free jazz stuff -- melodically I'm not sure what's going on there because I'm not a free jazz musician. So I'm excited about the (inaudible) harmony stuff. What's --
>> Gil Weinberg: Yeah. We discussed it before the talk. I'm looking at some studies by
(inaudible) trying to figure out harmonic relationships between the notes (inaudible) and kind of tension and release, and to try -- the first step would be to try to have (inaudible) Haile understand how harmonically tense each section is and, based on that, decide whether it wants to make it even more tense or to resolve it, and how.
But we didn't get there yet. And I'll be able to talk with you later. I know you're dealing
with similar things.
>> Sumit Basu: Thank you very much, Gil. And Gil is actually around today. He's pretty booked up for the day, but if there are people who want to talk to him, he's around tomorrow also, so you can figure out -- just send me an e-mail and we can work out a time.
>> Gil Weinberg: Yeah. I'll be in the morning. Thanks.
(Applause.)