>> Ivan Tashev: Good afternoon everyone, those who are in the room and those who are
watching this talk remotely. Today we have Piotr Bilinski. He's a PhD candidate at INRIA,
France, in the Spatio-Temporal Activity Recognition Systems research group, but today he's
going to talk about his work during the last three months on HRTFs here at Microsoft Research.
Without further ado, Piotr, you have the floor.
>> Piotr Bilinski: Thank you, Ivan, for the introduction. Good afternoon, and thank you all for
coming to this presentation, those who are here and those who are watching online. Today I will
talk about my internship project: head-related transfer functions and their personalization using
anthropometric features. This work was done together with my supervisor, Ivan Tashev, and with
Jens Ahrens and Mark Thomas. At the beginning I would like to say thanks for the wonderful
summer. It's been really a great pleasure being here. I thank Ivan as my supervisor for bringing
me here and for the collaboration. I also thank Jens Ahrens for co-supervising me, and Mark
Thomas as well. I would like to thank people from the Interactive Entertainment Business
division, especially Alex Kipman for funding my internship project and Jeremy Sampson for
helping with the data collection. I would like to thank John Platt and Misha Bilenko from the
machine learning team for their consultations on machine learning. I would like to thank
Andrzej Pastusiak for his consultations about the TLC library and Chris Burges for letting me
use his boosted decision trees. Also, I would like to thank Jasha Droppo from the CSRC team
for the consultations. Here is the overview of my talk. I will talk about HRTFs: what is an
HRTF and why do we need them? I will talk about the data collection we did and the
anthropometric feature extraction which I worked on. I will talk about whether universal HRTFs
exist, then about HRTF recommendation via learning from user studies, and then about the two
approaches we propose, one based on sparse representation and another based on neural
networks. Finally, I will compare all of the proposed techniques, conclude, and talk about
future directions. So, to start, what is an
HRTF? HRTF is the head-related transfer function, which represents the acoustical transfer
function between a sound source and the entrance of the blocked ear canal. The HRTF describes
the complex frequency response as a function of the sound source position, meaning azimuth and
elevation. In this particular work I will focus on modeling the magnitudes, as a solution for
the phases already exists. Here is a sample HRTF in the horizontal plane. On the horizontal
axis we have the azimuth of the sound source, and on the vertical axis we have frequency. The
plot shows the magnitudes of the Fourier transform of the impulse responses. We can see the
magnitudes of the HRTF on the scale, and we can see that at low frequencies, below 1000 Hz,
HRTFs are independent of the direction of the sound source, but they are direction dependent
at higher frequencies, above 2000 to 3000 Hz, and the differences can be on the order of
around 30 decibels. Why do we care? Why do we want HRTFs? Basically, we would like to have
3-D audio over headphones: imposing an HRTF onto a non-spatial audio signal and playing back
the result over headphones evokes the perception of a virtual 3-D auditory space. In other
words, having someone's HRTF allows us to control the perception of sound source localization
over headphones. So there are many potential applications. It can be used in games, where we
would like to know where the opponent is and how to move. Maybe event streaming or music
performances, also virtual reality, or, like in the movie Minority Report, where we would like
not only to see what other people see but also to hear what other people hear. So why don't we
have it? We don't have it because HRTFs are highly individual, and using HRTFs other than the
user's own can significantly impair the results. If we choose an HRTF incorrectly, the
perception of sound source location might also be incorrect. Also, measurement of HRTFs is an
expensive process; it requires specialized equipment and an anechoic chamber. What do they
depend on? HRTFs are highly
dependent on human anthropometric features, so they are dependent on ear features like the
pinna height, pinna width and they are dependent on head features like head depth, head
height et cetera, and also torso and shoulder features. What can we do? There are indications
that HRTFs are correlated with human anthropometric features. We try to select and
synthesize HRTFs for a given person from a database. In order to do that, in order to start
working, we first had to collect data. So let's start with the audio data. Here in Microsoft
Research there is an anechoic chamber. We asked participants to enter and to sit in this chair.
The chair is in the middle of this arc. The head of the participant is stabilized so the person's
head should be in the middle of this arc. We plug microphones into the participant's ears. On
this arc you can see 16 evenly distributed loudspeakers, and the arc moves to 25 different
positions. Basically, we played specific measurement signals from which we can calculate the
HRTFs. As you can see in this image, we have 16 azimuths and 25 elevations, which gives us
400 directions. Some directions we don't have, so we basically extrapolate them using
spherical harmonics; then we have 32 elevations and 16 azimuths, which gives us 512 different
directions. For all of the 512 directions we have HRTFs.
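As a rough sketch of this extrapolation step (not our exact pipeline; the function names, the spherical harmonic order, and the plain least-squares fit are illustrative assumptions), one can fit spherical harmonic coefficients to the measured magnitudes and then evaluate them on the denser grid:

```python
import numpy as np
from scipy.special import sph_harm

def sh_basis(order, azimuth, colatitude):
    """Real-valued spherical harmonic basis matrix, one row per direction."""
    cols = []
    for n in range(order + 1):
        for m in range(-n, n + 1):
            y = sph_harm(abs(m), n, azimuth, colatitude)  # complex harmonics
            # standard real basis built from the complex harmonics
            cols.append(np.sqrt(2) * y.imag if m < 0 else
                        y.real if m == 0 else np.sqrt(2) * y.real)
    return np.stack(cols, axis=1)

def extrapolate_hrtf(mags, az_meas, col_meas, az_new, col_new, order=8):
    """mags: (n_measured_dirs, n_freq_bins) magnitudes on the measured grid.
    Returns magnitudes on the new, denser grid of directions."""
    Y_meas = sh_basis(order, az_meas, col_meas)             # (n_meas, n_coeff)
    coeffs, *_ = np.linalg.lstsq(Y_meas, mags, rcond=None)  # one fit per bin
    return sh_basis(order, az_new, col_new) @ coeffs
```

The least-squares fit gives a smooth full-sphere representation from a partial measurement grid, which can then be sampled at the missing directions.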
Once we have the audio data, I would like to extract anthropometric features, and in order to
do that we first do head scans. We have a different room where we ask the participant to sit on
a chair and put on a swimming cap so we can pick up the skull geometry. The participant sits on
a rotating chair, and there are two capturing units; the participant sits at an angle of
approximately 90 degrees from these capturing units. Each of these capturing units has
two cameras so from each capturing unit we get a cloud of points, the representation of the
human head. As an example, we have a participant, we rotate the chair, we take pictures from
different angles, so then we can align images and create the 3-D head model of a person. So
there are lots of preprocessing steps like image alignment, filling of the holes, smoothing of the
mesh and I would like to thank Jeremy for his help in doing this. Then we get a 3-D head model
of the person. Once we have this model, I would like to extract some features from it. The
model is represented by a cloud of points and by 3-D triangles, and from this we can extract
contours of the head; those are the features I extract from the head. I implemented several
algorithms which extract the features automatically. I'm not going to go step by step through
how we do each of them; it would just take too much time. I will just say that we extract
features from the head, like head width and head length, and features from the neck and
features related to the pinna. We also extract from these head scans features from the ear,
such as features depicting its height and width, and other features. Apart from that, we also
take measurements by hand from the participants. We have features like the interpupillary
distance, measured using, for example, a pupilometer. We use a measuring tape and other
measuring devices to capture information about the shoulders, torso, head and neck. You can see
that some features are both extracted from the head scans and measured by hand, and that is
intentional. There are two reasons for that. First, in the 3-D model sometimes the neck is just
not visible, so we cannot extract the width or the depth of the neck, and also the boundary
between head and neck is sometimes not clear, not visible. The second reason is that we need
some features to scale the distances in model dimensions to real-world dimensions. As the
cameras are not fixed, the chairs are not fixed, and the participants on the chairs are
actually rotating, we need to scale the image pixels to real-world dimensions. Here are some
example screenshots of the software that I wrote over the summer to extract the anthropometric
features I described before, for one participant and for another. Here are some more samples.
What we also
do is ask the participant to fill out a short questionnaire. There are 12 different questions
about gender, age, race, height, weight, and so on. Some participants might not be willing to
provide all of the information or all of the details, like age or weight, so we also have
other, less personal questions that are correlated with the original ones. In total we
collected HRTFs for 115 people, and for 36 out of these 115 people we have full measurements:
head features from the head scans, ear features from the 3-D head scans, measurements taken by
hand, and the questionnaires, so in total we have 93 anthropometric features per person. Up to
this point, I created a dataset and algorithms for anthropometric feature extraction. There
are scripts for data extraction and for validation of the measurements and questionnaires.
There are many converters, as participants come from different regions: some would like to
enter data in feet, some in meters, and there are converters for weight, shoe size, et cetera.
Finally, we come to the topic of HRTF recommendation. My first question is: is there a
universal HRTF, one that can be one-size-fits-all? I took the head and torso simulator (HATS)
from the Bruel and Kjaer company;
it is a mannequin with a removable artificial mouth and pinnae. They provide this mannequin
with an average human head size, so it is supposed to fit kids and adults, females and males. I
wanted to see how far the HRTF of this HATS model is from people's HRTFs. I use the log
spectral distortion (LSD), the most commonly used distance in the literature, and I compare the
log spectral distortion between a person's own HRTF and the HRTF from the HATS model. I would
just like to mention that the perceptual meaning of log spectral distortion is not clear.
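For reference, a minimal sketch of the log spectral distortion as it is commonly defined in the literature (the exact frequency range and averaging we used may differ):

```python
import numpy as np

def log_spectral_distortion(mag_a, mag_b):
    """Average LSD in dB between two HRTF magnitude sets.
    mag_a, mag_b: (n_directions, n_freq_bins) arrays of linear magnitudes."""
    diff_db = 20.0 * np.log10(mag_a / mag_b)                    # per-bin dB error
    lsd_per_direction = np.sqrt(np.mean(diff_db ** 2, axis=1))  # RMS over bins
    return lsd_per_direction.mean()                             # mean over directions
```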
Here are the results. We have results for the straight direction, so basically when the sound
is coming from in front of the person, and for all the directions around the person, so all 512
HRTFs. We also created a perfect and a worst classifier: for the perfect classifier I don't
look at the anthropometric features at all; I always choose the HRTF that is closest in log
spectral distortion. Such a classifier doesn't exist in practice; it just shows the range of
results we can achieve. We see that the HATS model gives results very close to the worst
classifier we can create. Basically, the conclusion is that the HATS model is not suitable, and
we cannot create one universal HRTF for everyone. If we cannot create one universal HRTF,
let's try to select one
of them from our database. That's our goal in this part: to identify the best HRTF for a given
person from our database. There are, however, two problems. One is that to select the best HRTF
we need an HRTF distance, and as I already said, the perceptual meaning of LSD is not clear.
The idea was: let's do user studies and learn from them how people rank HRTFs, so we can find a
correlation between people's anthropometric features and their personal HRTF rankings. So we
designed user studies. Here's one of our participants. We provide them a laptop with
headphones; the headphones have a head tracker, and we designed an HRTF comparison experiment.
Each time, we asked people to compare two HRTFs. The participant can switch between A and B as
many times as he wants, and then he has to provide his preference, from a strong preference for
A to a strong preference for B; it is a slider, so he can put it wherever he wants. In the
training phase of this experiment we showed participants 12 different pairs of stimuli covering
the range of HRTFs, and in the testing phase 156 pairs to compare. I would just like to note
that we used [indiscernible] speech for this listening experiment.
>>: [indiscernible] better based on what it's supposed to be in a given direction or just better
for a particular reason?
>> Piotr Bilinski: They are supposed to make the decision while the sound is coming from the
screen, so they are evaluating mainly the straight direction.
>>: We intentionally didn't tell them what properties to look for, because the priorities that
people set on different properties might be individual, so we just asked them to give their
general impression of whether one sounds better than the other.
>>: They are doing this in order to measure some kind of spatial difference?
>>: Yeah. Ambience or whatever and they can just say oh it sounds nice [indiscernible] and
they give a judgment occluding the final [indiscernible]
>>: [indiscernible] true, and also it might not stay consistent, because people might use
different properties to make their decisions. We discussed many options, and for each option we
saw there were arguments supporting it and arguments against it, so eventually we just had to
go for one and [indiscernible] whatever choice.
>>: Is there an assumption that if you use the wrong HRTF it would just sound bad regardless of
directionality?
>>: Yes, yes. The two main perceptual dimensions are localization and timbre, and they are not
necessarily independent. One can influence the other; hearing the sound source from the right
direction doesn't mean that it's a good set of HRTFs, because the timbre might be too dull or
too bright. We wanted to give people the freedom to input their priorities as they wanted.
>> Piotr Bilinski: Yes, so it's up to the people to decide what they prefer and how they judge.
So how do we choose the stimuli for the experiment? We take basically all of the available
HRTFs and [indiscernible] them; for the training we cluster them into three groups and for the
testing into 12 different groups, and for each cluster we select a representative person.
Obviously, the people selected for training and testing are different. So for training we
select three HRTFs and for testing 12 HRTFs, and then we ask each participant to compare the
full matrix of selected HRTFs, and each pair of HRTFs the participant has to compare twice;
with 12 selected HRTFs we have 12 x 12 + 12 for the [indiscernible], so that gives us 156
comparisons. So how do we select representative HRTFs that cover the range of different HRTFs?
For this we use log spectral distortion. I said the perceptual meaning of log spectral
distortion is not clear, but it definitely contains some true information: if the distance is
big, the HRTFs should sound different, and if the distance is small, they should sound similar.
And here we want to select representative HRTFs which somehow cover the full range of HRTFs;
that's why we believe that using LSD for the selection of the representative HRTFs is correct.
We can arrange the distances in this form, and now we can apply a clustering algorithm like
K-Means, and that's what we actually do.
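A minimal sketch of that selection step, under the assumption that each person's HRTF set is flattened into one vector of dB magnitudes, so that Euclidean distance closely tracks LSD (the cluster count and names are illustrative):

```python
import numpy as np
from sklearn.cluster import KMeans

def select_representatives(hrtf_db, n_groups=12, seed=0):
    """hrtf_db: (n_people, n_dirs * n_bins) dB magnitudes, one row per person.
    Clusters people by HRTF similarity and returns one medoid per cluster."""
    km = KMeans(n_clusters=n_groups, n_init=10, random_state=seed).fit(hrtf_db)
    reps = []
    for c in range(n_groups):
        members = np.flatnonzero(km.labels_ == c)
        # medoid: the member whose HRTF is closest to the cluster centroid
        dist = np.linalg.norm(hrtf_db[members] - km.cluster_centers_[c], axis=1)
        reps.append(int(members[np.argmin(dist)]))
    return reps
```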
We had 23 people who participated in our experiment. For every participant and every pair of
stimuli we calculated the difference between the responses, because every participant has to
respond twice to the same pair of HRTFs. The responses lie on a scale from -2 to 2, so if a
participant first replies strongly preferring A and later strongly preferring B, that gives the
difference between his two responses. So we can see the consistency of a participant's
responses: zero means that they replied exactly the same for the same pair of stimuli; four
means that they replied in entirely opposite directions. We can see that there is some
inconsistency in participants' responses. It might be that they were tired. It might be that
their brain wanted to see a difference when there was actually no difference, because in this
experiment they were looking for differences. It also might be that people participated with
different levels of engagement; actually, one participant fell asleep. We asked him what the
reason was. Actually, it was not because of the comfortable chair and it was not because of the
experiment, but because he was at a party until 4 A.M. So participants participated with
different levels of engagement, and that's why there is some inconsistency in the data. Some
participants spent about one hour and 20 minutes on this experiment, and other participants
only 20 minutes. We also plotted the same results in a
different form, showing the selected representative HRTFs. This shows how many times one HRTF
is preferred over the other HRTFs, so the range is from 0 to 24, and we can see here that some
HRTFs are strongly preferred by this person and some are not. This leads us to conclude that
this representation provides better data for the analysis, and we can create a ranking from it.
That was the idea: let's learn a person's HRTF ranking from user studies. I have already
mentioned ranking several times, and some of you probably thought about search engines like
Bing and Google; indeed, it is a similar problem, and that's how we treated it, as a
learning-to-rank task. How does it work? We have 23 people and
each person is described by 93 anthropometric features. For each person I indirectly ranked the
12 representative HRTFs, each HRTF is represented by 29 Mel-frequency cepstral coefficients,
and for the ranking we use boosted decision trees. What I did is create a ranking formula, a
way to build a ranking from our experiment; to put it briefly, the idea is to assign high rank
values rarely and low values more often, so high values go only to the best HRTFs, and only to
a few of them. To evaluate our results we follow the metrics from the learning-to-rank domain
and use the normalized discounted cumulative gain (NDCG) metric, which ranges from zero to one,
and here are the results. However, I believe that for this audience log spectral distortion
would be more appropriate, so we also use log spectral distortion to evaluate the results. We
can see that the results are better than the HATS model, which was at 13.77, and we can also
compare to the perfect and the worst classifiers; that's the result. However, I believe a much
better way to evaluate this technique would be another user study, and that's what I will come
back to later. This shows that it is already better [indiscernible] than the HATS model. This
technique also gives us information about which features are more important for this task and
which features are entirely uncorrelated; among the important ones are features related to the
head width and the pinna.
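The boosted-decision-tree ranker we used (via the internal TLC library) isn't public, so as a stand-in here is a sketch of the same formulation with LightGBM's LambdaMART-style ranker, which descends from the boosted-tree ranking line of work. All placeholder data, shapes and relevance grades are illustrative:

```python
import numpy as np
import lightgbm as lgb

n_people, n_anthro, n_mfcc, n_hrtfs = 23, 93, 29, 12
rng = np.random.default_rng(0)

# One "query" per person: 12 candidate HRTFs, each described by the person's
# anthropometric features concatenated with the HRTF's MFCC representation.
anthro = rng.normal(size=(n_people, n_anthro))   # placeholder person features
mfcc = rng.normal(size=(n_hrtfs, n_mfcc))        # placeholder HRTF MFCCs
X = np.vstack([np.hstack([np.tile(a, (n_hrtfs, 1)), mfcc]) for a in anthro])

# Integer relevance grades derived from the user-study ranking: high grades
# are assigned rarely (only the best HRTFs), low grades more often.
y = rng.choice([0, 0, 0, 0, 1, 1, 2], size=n_people * n_hrtfs)
group = [n_hrtfs] * n_people                     # 12 candidates per person

ranker = lgb.LGBMRanker(objective="lambdarank", n_estimators=100,
                        min_child_samples=5)
ranker.fit(X, y, group=group)

# For a new person, score all candidate HRTFs and recommend the top one.
new_person = rng.normal(size=n_anthro)
scores = ranker.predict(np.hstack([np.tile(new_person, (n_hrtfs, 1)), mfcc]))
print("recommended HRTF index:", int(np.argmax(scores)))
```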
Now I will talk about synthesis: we try to generate an HRTF for a person. We have two
approaches to propose. One is based
on the sparse representation and the other on neural networks. In the sparse representation
and the goal was the synthesis of the HRTF using anthropometric features, and the idea was we
model a person's anthropometric features as a sparse linear combination of anthropometric
features from other people, and we assume that the person’s HRTFs are in the same relation as
anthropometric features. We have a full range of people in our database and we would like to
generate, synthesize HRTF for this girl. We'd like to combine a few people and say that her
anthropometric features are a combination of anthropometric features from these people, and
ideally we would like to obviously only one person, the closest person and maybe two, maybe
three people and not use any other people. So we would like to create a sparse representation.
That's the idea. Let's learn a sparse vector alpha from anthropometric features and apply it to
HRTFs. That's basically our idea. So that's the problem definition. It is a minimization problem
and here we minimize the sum of the square arrow over all anthropometric features and we
also added a regularizer to this and we solve this minimization problem using the lasso
technique. I just would like to note that we learned the sparse representation on people, so
selecting people and not features as usually is done. Once we learned these parameters from
anthropometric data we applied it on HRTFs. We again, computed the results in the log
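A minimal sketch of this step with scikit-learn's Lasso (the regularization weight and the shapes are illustrative): the design matrix has one column per person, so the learned sparse coefficients select people rather than features, and the same weights are then applied to those people's HRTFs.

```python
import numpy as np
from sklearn.linear_model import Lasso

def synthesize_hrtf(anthro_db, hrtf_db, anthro_new, lam=0.1):
    """anthro_db: (n_people, n_features); hrtf_db: (n_people, n_dirs * n_bins).
    Finds sparse weights alpha minimizing
    ||anthro_new - anthro_db.T @ alpha||^2 + lam * ||alpha||_1,
    then applies alpha to the database HRTFs."""
    lasso = Lasso(alpha=lam, fit_intercept=False, max_iter=10000)
    # Columns of the design matrix are people, so coefficients select people.
    lasso.fit(anthro_db.T, anthro_new)
    alpha = lasso.coef_           # one (mostly zero) weight per person
    return alpha @ hrtf_db        # weighted combination of their HRTFs
```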
We again computed the results in the log spectral distortion distance, and you can see that now
the results are much better compared to the other techniques; they are actually very close to
the perfect classifier that we created. Again, you can see the distance for the straight
direction, for the left and the right ears separately, and for all directions together, and
basically in all cases the results are very close to the perfect classifier. So that was the
first technique, and we also have another
technique based on neural networks. The idea is the same, to synthesize an HRTF using
anthropometric features, but here we try to map anthropometric features directly to the HRTF
using neural networks. I was using radial basis function neural networks; they contain an input
layer, a hidden layer and an output layer, and in the hidden layer there are radial basis
functions. After this mapping we get these results, which are actually even better than the
sparse representation; they are also very close to the perfect classifier.
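A minimal numpy sketch of such a radial basis function network (Gaussian hidden units centered on the training people and a linear output layer solved in closed form; the kernel width and regularization values are illustrative assumptions):

```python
import numpy as np

class RBFNet:
    """RBF network: input -> Gaussian hidden layer -> linear output layer."""
    def __init__(self, width=1.0, reg=1e-3):
        self.width, self.reg = width, reg

    def _hidden(self, X):
        # Gaussian activations, one hidden unit per training person (center)
        d2 = ((X[:, None, :] - self.centers[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * self.width ** 2))

    def fit(self, X, Y):
        """X: (n_people, n_anthro); Y: (n_people, n_dirs * n_bins) magnitudes."""
        self.centers = X
        H = self._hidden(X)
        # ridge-regularized least squares for the output-layer weights
        self.W = np.linalg.solve(H.T @ H + self.reg * np.eye(len(X)), H.T @ Y)
        return self

    def predict(self, X):
        return self._hidden(np.atleast_2d(X)) @ self.W
```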
Let's compare them. So we already have several techniques; let's see which one is the best.
Here are all the techniques that we created: there is the perfect
classifier, which serves as a reference. There is the sparse representation. There are neural
networks. There is learning to rank. We also tried ridge regression, which I haven't mentioned
before; it is like the sparse representation but without the sparsity constraint. And we have
the HATS model and the worst classifier. Using log spectral distortion, we see that the sparse
representation
is mostly preferred, especially in the frequencies which are most important, 0.5 to 8
kilohertz. Neural networks also perform very well, and these two techniques are very close to
each other. We also believe that the performance of the sparse representation can be further
improved with feature selection. We also evaluated the results using user studies: we ran a
small user study with seven participants and asked them to compare their own HRTFs with the
selected and synthesized ones. The procedure is similar to before; for distraction, we also
presented other people's HRTFs. And
here are the results. You can see five different techniques: learning to rank, sparse
representation, ridge regression, neural networks and HATS. On the left, -2 means that the
person strongly prefers his own personal HRTF, and towards the right means the person prefers
the synthesized HRTF. Basically, we would obviously like to have everything from 0 to the
right. Here we see that the technique that gives the worst results is HATS, and the second
worst is ridge regression, which is also quite natural. So let's remove these two techniques
for a second and analyze the others: learning to rank, sparse representation and neural
networks. From this plot we can see that neural networks actually work the best; one person
said at first that he preferred his own HRTF, but when he compared the same pair a second time,
he said that he strongly preferred the generated one. So actually, I would say neural networks
are entirely on the positive side. For learning to rank, that's the blue color, one person said
first that he slightly preferred his own and then that he slightly preferred the generated one.
And a second participant said he didn't see a difference and then said he slightly preferred
his own. For sparse representation, two people said that they preferred their own. One was
consistent, so he twice said that he preferred his own HRTF. The other said he slightly
preferred his own and later said he slightly preferred the generated one, so there is again
inconsistency in the results. We should probably run more user studies to try to understand
this, but from what we can see here, basically neural networks were the best, and all of the
techniques are actually quite close to each other and work relatively well.
>>: It looks like the sparse representation creates almost exactly the same HRTF, which is more
precise, while the neural network creates one that is better than their own, because the
maximum has shifted towards the right.
>>: There's one thing to keep in mind. The 0 means the subject has no preference. It doesn't
mean that the two sound identical. They can both sound good, or both sound bad, or both sound
bad in different ways.
>>: Yeah, but the reference is always their own.
>>: Yeah, but no preference doesn't mean they don't hear a difference. It just says that they
both sound equally pleasant. One maybe in terms of [indiscernible] and the other has a more
natural timbre. That is also included in the 0.
>> Piotr Bilinski: But under the assumption that your own is relatively good, this shows that
they both sound relatively good.
>>: We don't have an explanation yet for why we can't create an HRTF that sounds better than
people's own, assuming that their own are the best ones and that this is what their auditory
system is calibrated to, but I'm sure that Piotr will mention this point in future work.
>>: And the other thing is also that you can't measure HRTFs such that they are directly useful
for auralization; you always have to do some sort of compensation or equalization of
microphones and loudspeakers and all these things, and we haven't managed to find an automatic
way that spits out the perfect equalizer or calibrated HRTF, so there is some manual tuning
involved and you can never say that what you are doing is the truth. So it can happen that
there is some flaw in our equalization and that the synthesis algorithm happens to correct for
that.
>>: It's also based on initial impressions. It doesn't take into account listener fatigue, so in the
same way a lot of people might turn up the contrast on the TV straightaway, but after a while
they realize that it starts to be unnatural, and the same thing can happen here. Sometimes
things that are boomy and in your face sound more impressive, but not necessarily more
realistic. It might get irritating after a while.
>> Piotr Bilinski: That's actually what was happening. Lots of people decided at first that
they preferred some HRTF, and then after some time they said: maybe for a short while it's very
nice and I would like to have it, but their second decision went against it.
>>: And we see that they sometimes prefer one and sometimes the other: we repeatedly present
the same pair of HRTFs, and sometimes they say I prefer this one and sometimes they prefer the
other one. That suggests that either the HRTFs are roughly equally good or the users are
unreliable, maybe because their priorities shift. We can't really say what is going on, so this
is all subject to future work. There are certainly indications that subjects were overwhelmed,
especially with comparing those 150-odd HRTF pairs. It's so fatiguing and takes such a long
time that even I caught myself noticing that my priorities shifted: sometimes I put a higher
weight on the externalization and sometimes more on the timbre. It's a finesse we'll definitely
be working on.
>> Piotr Bilinski: That is definitely true. There's still lots of stuff that should be
investigated. Now the conclusion. We created a new dataset with HRTFs and anthropometric
measurements. Over the summer we created algorithms for anthropometric feature extraction. We
created four different techniques for HRTF personalization and recommendation. We evaluated our
techniques using both log spectral distortion and user studies, and the results are
encouraging. Based on the results, the best technique is the sparse representation when using
the log spectral distance, and neural networks based on the user studies. As future work, we
should definitely run more extensive user studies to assess the proposed techniques. We should
collect more data to cover a wider range of people: more females, more kids, more elderly
people. For the sparse representation-based approach, it would be very nice to add feature
selection, so we can find useful and easier-to-measure features that give good results and
remove the useless ones. For learning to rank, it may also be a good idea to learn the ranking
of HRTFs from the LSD distance, and we can definitely also try the direction of matrix
factorization for the recommendation of HRTFs. Thank you for your attention. Are there any
questions? [applause]
>> Ivan Tashev: Are there any other questions by chance?
>>: It seems like you are generating these things [indiscernible] you put in some features and
it spits out an HRTF, right; that is the goal, the synthesis goal. But it seems like these
things are only really evaluated by people in pairs. My understanding of your classifier is
that there's nothing that says the features for the left ear here and the features for the
right ear there actually have to agree in any way, so your HRTFs could be sort of
anthropometrically inconsistent. Do you know if that's ever a, can you ever, for example, mess
up one person, give them their real left ear and the wrong right ear and see what happens? I
mean, does that matter? [laughter]
>> Piotr Bilinski: Actually, what we are doing is the synthesis of both ears at the same time.
>>: From one classifier?
>> Piotr Bilinski: Yes.
>>: You put [indiscernible]
>> Piotr Bilinski: They would be equally bad or equally good.
>>: Both sets of features go in, and both ears come out?
>> Piotr Bilinski: Yes. All the anthropometric features go in, and it says: this is the HRTF
that you should use. It generates both the left and the right ear.
>>: [indiscernible]
>> Ivan Tashev: How many numbers is one HRTF?
>> Piotr Bilinski: How many numbers is one HRTF?
>>: Yes.
>> Piotr Bilinski: We have 512 directions and for each direction we have 512 values.
>> Ivan Tashev: So it's a substantial amount. So you can have a neural network with 96 inputs
and you managed to synthesize 500 x 500, a quarter of a million points.
>>: [indiscernible]
>>: I mean, at the end of the line subspace is a lot smaller than that. [indiscernible] the neural
network has [indiscernible]
>> Piotr Bilinski: They are basically very close.
>>: Given [indiscernible] generate [indiscernible]
>>: No. Actually, no. You can generate in one direction and apply the weights to all of the
other directions.
>> Piotr Bilinski: I mean, for example, when we are using the sparse representation technique,
we learn the weights and we don't care which direction it is; we basically apply the same
weights for all the HRTF directions. And the same with neural networks; basically, the idea is
that we treat all the directions in the same spirit.
>>: [indiscernible] all the measurement locations, did you, all the measurement locations, not
only just [indiscernible] directions [indiscernible]
>> Piotr Bilinski: Yes. All the directions, not only one, because to evaluate we need all of the
directions.
>>: [indiscernible] directions not future [indiscernible]
>> Ivan Tashev: It's just a backup number.
>>: Yeah, I know. But I imagine two classifiers: one where you say, these are the
anthropometric features and this is the direction and elevation, give me the HRTF for these
anthropometric features; and one where you say, give me all of the HRTFs.
>> Piotr Bilinski: All of the HRTFs.
>>: So there are fewer parameters to learn in the first case than in the second case. In one
case there are only 512; in the other it's 512 squared.
>> Ivan Tashev: Yes.
>> Piotr Bilinski: Basically, we can also just create 512 separate classifiers for the
generation algorithms.
>> Ivan Tashev: [indiscernible] are completely independent. They are related [indiscernible]
>>: But if you use them as features then you still have the same network sharing all of the
weights and it could learn the relationship between the features.
>> Piotr Bilinski: They are not exactly the same. That's why we learn each one separately.
>> Ivan Tashev: More questions? If not, let's thank our speaker. Thank you, Piotr.
>> Piotr Bilinski: Thank you. [applause]