
>> Ashish Kapoor: So it's my pleasure to welcome Selene Mota. She is interviewing for a full-time position with us. She's a PhD candidate at the MIT Media Lab, and she works on some of the most interdisciplinary areas that we wish to grow in. Specifically, she's been looking at human behavior modeling, machine learning, affective computing, as well as user-centric design, so it's an incredible mix of things.

Prior to that, she has had incredible industry experience, having worked before at Philips Research, the German Institute of AI and a Swiss institute. She has incredible startup experience as well, having been a finalist at the MIT $100K, along with other prizes, so we are looking forward to hearing her speak today, and looking forward to interacting with her during the rest of the two days. Welcome, Selene.

>> Selene Mota: Thank you. Thank you, everybody, for being here. Today, I will talk about activity recognition in natural environments. As Ashish mentioned, I am passionate about behavior understanding, especially how behavior understanding interacts with learning, emotions and health. I have participated in different projects: the Learning Companion, which looked at paralinguistic information and its relationship with learning; the Da Vinci Project at EPFL in Switzerland, which looked at how activities in a surgical environment can be supported; and the Philips [comm] lab, which was an instrumented facility for studying human behavior.

But I want to give you some facts about activity recognition. The first fact is that activity recognition is not actigraphy. Most of the activity trackers today are doing actigraphy, which is the overall amount of motion, but that doesn't tell you anything about the signatures of movement. Another fact is that activity recognition has a complex hierarchical topology, so it can go from body locomotion to complex actions like cleaning your house. So there have been many approaches to activity recognition, from top-down to bottom-up. For instance, around 2000, many research facilities were built, like the PlaceLab at MIT, the Philips HomeLab and the Georgia Tech Aware Home. Those facilities were super-expensive, huge, but why don't we have them today? There are several reasons. One is that those facilities were extremely expensive.

They were difficult to replicate, and it was not the user's own home. I actually worked in the Philips [comm] lab, and what happened is that you couldn't pay enough people to live there for a month. After 6:00 p.m., because the lab was in the middle of a research facility, everybody would leave, and the people living in the house started to close the windows, and you could tell they were feeling weird about being there. It was just very difficult to bring people to live at those facilities. So even though the intention was to look at natural human behavior, you were not seeing natural behavior. So then, over 10 years of research, the field has been evolving toward the bottom-up approach, where you use wearable systems that detect motion primitives that can build up to recognized activities. Those systems primarily use wearable sensors. And that's good news, because we are living in an era where we have an overwhelming number of wearable devices. Actually, at MIT, as was just mentioned, I participated in the 100K, and I was working in the accelerator prize, and every month we heard about a new startup, a new wearable, so it's just incredible. We also live in an era where we are collecting behavioral data in massive quantities, and we are collecting that data at different scales: at the global level, transportation systems, energy usage, sensors in public space; at the individual level, location, social interactions, digital records, behavior; and at the cellular level, where we are working on epigenetics, which is how your genes change because of factors in the environment, which is very, very exciting. But how can we use such data to answer important questions about people's health, attention, learning and emotional state? Because the big question here is that lots of data is not equal to lots of information. How can we answer questions such as: What is that person feeling? Is that person engaged in the task? What are the health markers of a person? Is a particular drug working? Or how can we develop just-in-time interventions, or assess how an intervention is working? For instance, we were cooperating with USC in tracking children who suffer from asthma. All those applications and all those questions are related to activity recognition, and activity recognition plays a crucial role in them.

And another particular characteristic is that it involves not only recognizing short-term activities; the big challenge is recognizing long-term activities, at a population scale, in natural settings and in a personalized manner. Those are big challenges. So systems that want to understand behavior -- context-aware systems, ambient-intelligence environments or personal health applications -- need to use activity recognition, sometimes not as the end-user application but as a crucial component of their functionality. But let's look at how activity recognition works.

So, usually, the main components of an activity recognition system are sensing, which gathers information about the actions and context; modeling, which extracts features and patterns that are useful for recognizing activities; and applications, which make the information actionable. These are well-known components of activity recognition. But each component has its considerations: in sensing, you have to look at economics and at what is acceptable to people; in modeling, you have to model activities in realistic scenarios; and in applications, you have to know what type of applications can engage the person and sustain that engagement. All these components interact, so when you are designing a system like this, you have to look at all of them in order to make an effective design. And each of those components has challenges. For instance, in sensing, the systems not only need to be useful but acceptable to the users, and scalability is an issue. I am coming from the Media Lab, and we always make prototypes, and they are awesome, but making a prototype is very different from making something that you will deploy with 200 participants for one year. Sometimes you will not create a new system, but you still have to look at those components. In terms of modeling, you have to develop a system that can be deployed to a broader audience. For instance, activity recognition algorithms are well known, but they are subject dependent. So what do you do? Many, many people have tried to make algorithms subject independent, but there is a fundamental problem: people behave differently. The variability in what someone does is really, really high. That's why it doesn't work; it's like trying to ignore the elephant in the room. So the type of machine learning that you need to develop has a different challenge: it's more about how you can learn to learn, how you can teach your algorithm to personalize the information.

>>: Can you give us an example of something that hasn't worked, so something that's suffering from one of the problems that you're talking about?

>> Selene Mota: Yes, actually, we have tested hidden Markov models, decision trees, support vector machines, AdaBoost -- I will talk about that in my machine-learning session. For instance, many hidden Markov model solutions work for five participants and six activities, but if you want to generalize, they start to fail. For instance, we used decision trees for a really long time, and my former adviser was a huge fan of them, because they have the advantage that you can see how things are modeled. But what happened is that whenever we had a demo and we changed the setting, were in another environment or switched sensors, it would not work. So you would see me jumping around like crazy just before the demo, trying to make my model work, because every time I added a new activity, or more data, I needed to retrain the whole model. All the things the model had learned before would be lost, and that was really, really frustrating. And then there is making the activity information usable. You do HCI, right? The practical experience is that you cannot pay enough people to use your system for more than three weeks if it's [terrible]. If you are working on a sleep study and people have trouble with the sensor, they will say, 'I sleep like this,' and there's no way they will use it. Those are issues that you need to take into consideration. To answer those questions, I will talk today about two domains: one using activities collected in experimental settings, and the other using activities collected in natural settings. And why do we need the experimental setting? We need it to evaluate and actually compare systems, because there are data sets, but not that many, and you would be surprised how poorly collected many of them are. So there is an issue about collecting useful data sets. And the other domain is obvious, because we want to develop algorithms that work in the real world. So first, let's talk about sensing, which is something I get very passionate about, because right now we have so many approaches. We have commercial sensors. We have medical sensors -- you probably know the Actigraph; it costs $2,000, and it's just an accelerometer. There are also the SenseWear and the GENEA. But why is the Actigraph so prevalent? Because it has been used for 30 years in medical studies, it's validated, and they don't disclose their algorithm. So we actually reverse engineered the algorithm. If you want it, I can give it to you; I gave it to [Rose], actually. And we also have the sensors that are used in research, like the Shimmer, the Porcupine and, some years ago, the Intel MSP that [Tansim] was doing. Another aspect is that most of the medical sensors are loggers. Because of battery issues -- transmission is one of the most costly operations if you transmit data all the time -- they log the data. They call it real time, but it's not really; it's that when the device comes into proximity with the base, it transmits the information. So if you wanted to do live recognition, you could not. And then you have the research sensors, where you actually can do that. They work well, they are real time, but as you can see, the form factor is a little bit like, eh?

>>: Does the low-power Bluetooth standard help?

>> Selene Mota: Actually, it's kind of -- this project was started four years ago. It helps. Just to give you an idea, we did many, many experiments. If you transmit just the acceleration signal in real time at 90 hertz with the previous Bluetooth, it will run for about six hours, for instance. After we built the sensor, we extended the battery life of the device to 32 hours transmitting in real time, which for us was a real challenge, but then we had the problem with the phone.

>>: Oh, okay, so the phone is running out of --

>> Selene Mota: Yes. So it's like, yeah, sure. It's all this effort. And what happened is that, with the new Bluetooth, it can run up to nine hours, so it does extend it three or four hours more. But if you really want something like 24 hours real time, you have to go beyond that, because there are practical issues. Once we developed the sensor, it's like, come on, how will we make people charge it every day and remember to do it?

>>: But it sounds like the limiting battery factor becomes then the phone's processor draw rather than --

>> Selene Mota: Yes, but the new phones, for instance, also increased their battery. The new Android -- sorry.

>>: It's --

>> Selene Mota: So it's actually --

>>: Windows Phone.

>>: Yes, exactly.

>> Selene Mota: So it works for -- before, it was working for six hours, for instance. Now it's like 12 to 15 hours, 19 hours sometimes. It depends. Okay, but then you have the Quantified Self community, these early adopters who are trying to test every single circuit and sensor, and they actually are testing that stuff, which is great. I actually go to the Quantified Self meetings in Boston -- they happen at Microsoft Research over there -- and it's very interesting, and then I see people saying, 'I was tracking myself, and actually, I didn't track that much. And actually, I was just interested in something else.' You hear that over and over, right? So I will tell you the dirty secret about wearables. They have a really bad dirty secret, and this has been shown over and over. This is one survey made last year: even activity trackers like Fitbit and Jawbone, which claim, 'oh, my God, I am changing the life of my users' -- well, you are changing it for maybe 15 months or so, because then the usage drops. I have many friends with Google Glass, and I have to tell you, they all -- that's the challenge, right? How do you build something that offers a long-term, engaging, sustained interaction? And actually, we don't know the answer very well. Of course, part of it is on the application side, but we need to work more on it. Another dirty secret -- we are really into testing mode. We even built that shaker that we called -- these are devices we put together [indiscernible]. I don't have the photograph today, but I would wear all the sensors and so on, and we tested the SenseWear, the iPhone and everything, just to show you that those sensors are all over the place. Here, they are just measuring actigraphy, not even activity recognition -- just the overall level of activity, like activity intensity and so on. And even the same sensor is giving you a different answer every time.

>>: These are the same on the graph?

>> Selene Mota: Yes. So they are different, and there are several problems here. Actually, when I went to the lab, I was talking with them, and they were saying, grrr. But the thing is, they don't disclose their algorithms. They have proprietary algorithms, starting with the Actigraph, which is the gold standard. And then they have different hardware, right? So different specifications for how they sample that information.

>>: It strikes me though that, aside from Fitbit, which sadly seems to be the worst, it seems like they're fairly self-consistent across these things. I wondered, do any of them do any kind of a calibration stage, like say --

>> Selene Mota: No. Actually, I recently went to the Measuring Behavior conference, as I was just telling [Marie], and the researchers there really cannot use these devices, because they need to compare across studies. That's why they still buy those very, very expensive sensors, and they spend lots of money.

>>: The other ones -- can you explain these graphs?

>> Selene Mota: This is just a quick graph, to give you a sense.

>>: What are the bars, what are the numbers?

>> Selene Mota: Sorry. This is just the level of activity counts that are mapped to a specific activity.

>>: So one person's wearing all the sensors. One person's wearing all of these sensors together, but they all measure different caliber --

>>: But what's on the Y-axis, is what we're asking.

>> Selene Mota: It's the activity counts.

>>: Is 01, 02, 03 the same?

>> Selene Mota: They are the same exercise in different repetitions. It's just -- yes.

>>: On the different repetitions of the same sensor.

>>: Do you expect at least the same device to show the same --

>> Selene Mota: Yes, for the same device. And this wasn't really -- this is just a rough test, so I will go through the machine learning later.

>>: Lync is pretty self-consistent always.

>>: Well, Fitbit's in three circumstances --

>> Selene Mota: So, for instance, the FuelBand, if you look, the FuelBand is consistent.

>>: New repetition [indiscernible].

>>: Why assume that this person did that --

>>: The same way.

>>: Maybe Fitbit is actually recording that person. The others are missing?

>> Selene Mota: Yes. The thing is, this was not a super-calibrated experiment -- this was more just to show you what we found. We did do -- actually, I can show it to you -- very controlled experiments using something that we call the Shaker, a device that moved the sensors in a specific way. I can show you; I have the paper on that. Because we needed to validate our sensors against the Actigraph, we needed to be very, very precise in our measurements. But in general, this is just rough: we really did just put the sensors on like that. It's just to build the argument a little that this is something that is happening, especially since they are not sharing the algorithms. And here, we are talking about actigraphy.

>>: So how much of that is because of the algorithms versus the actual sensing and position? For instance, even just orienting the hand differently, you get different results.

>> Selene Mota: Actually, the thing is that you then get into the activity count, which is: how do you measure an amount of movement? The activity count is what the medical field uses to quantify bouts of movement, and how do you tell me that my bout of movement is comparable to yours?
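To make the notion concrete, here is a minimal sketch of one common way an activity count can be computed from raw acceleration -- band-pass filter, rectify, integrate per epoch. The filter band and epoch length are illustrative assumptions, not the Actigraph's or any vendor's proprietary parameters.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def activity_counts(accel, fs=40.0, epoch_s=60.0, band=(0.25, 2.5)):
    """Toy activity-count computation for one accelerometer axis (in g).

    Band-pass to drop gravity and high-frequency noise, rectify, then
    integrate over fixed epochs. All parameters are illustrative assumptions.
    """
    b, a = butter(3, [f / (fs / 2) for f in band], btype="band")
    filtered = filtfilt(b, a, accel)      # movement-band acceleration only
    rectified = np.abs(filtered)          # magnitude of movement
    n = int(epoch_s * fs)                 # samples per epoch
    usable = len(rectified) // n * n
    return rectified[:usable].reshape(-1, n).sum(axis=1)  # one count per epoch
```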

That's the very basic problem, and they are not sharing how they compute that. But let me just move on. We need to make the sensing practical, useful, acceptable and capable of sustaining long-term engagement, so we developed these sensors that we call the Wockets. They are wearable, can be worn in multiple locations, are real time and have complementary modes. What I mean is that they can work in real time but can also work with latency, which means the sensor collects data for 60 seconds and then sends it over Bluetooth in a burst, and that extends the battery to several days.
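As a sketch of that buffer-and-burst latency mode -- accumulate a minute of samples, then wake the radio once per burst -- where read_sample and radio_send are hypothetical stand-ins for the actual firmware interfaces:

```python
import time

SAMPLE_HZ = 40        # sampling rate used for physical activity
BUFFER_SECONDS = 60   # accumulate one minute before transmitting

def run_latency_mode(read_sample, radio_send):
    """read_sample() -> (x, y, z) in g; radio_send(bytes) powers the radio
    up for one burst. Both are hypothetical placeholder interfaces."""
    buffer = []
    while True:
        buffer.append(read_sample())
        if len(buffer) >= SAMPLE_HZ * BUFFER_SECONDS:
            payload = b"".join(
                int(v * 1000).to_bytes(2, "big", signed=True)
                for sample in buffer for v in sample
            )
            radio_send(payload)   # one burst instead of thousands of tiny packets
            buffer.clear()
        time.sleep(1.0 / SAMPLE_HZ)
```

Sending once per minute amortizes the radio's fixed wake-up cost, which is why the battery stretches from hours to days.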

It has the same advantages as other wearable activity recognition sensors: it doesn't require infrastructure and isn't tied to a specific environment. And I want to highlight that the sensor was an amazing collaboration with Ben Kuris, who is a very talented hardware designer. He used to be at Intel, and he actually gave lots of input on the MSP, but we needed to go to the next level with this sensor, and he helped us do that. For making the sensor, we did several participatory design sessions -- something [indiscernible] really liked, which is technology probes. We did many, many iterations, and at each iteration, we prepared a version of our sensor, gave it to users for about three weeks, and they came back to the lab every week and told us what they thought about it and how they would change it, and we learned a lot from that. For that, we interviewed 20 people who worked with us, and they were not early adopters -- it was, say, a little old lady -- because we really wanted to know how the sensor would work for medical studies. We did several versions of this device. We were on version 5, and the connection is a micro-USB connector. It's a USB connector, but we changed the tape, because when people were sweating or running a lot, some sweat started to get into the sensor, and after three months it started to oxidize. So we needed a different solution for that, and in the end, we had something really small. Of course, it depends on the battery, the shape of your battery, but we actually contacted some people who were developing flexible batteries, and it was exciting to see. We had very flat batteries that we could bend, and then it was really, really soft. So in the end, this is our sensor. Those are the specifications: a three-axis accelerometer. It's just a normal activity sensor, but the cool aspect is that you can program the framework. It's open, so you can make your own experiments -- change the sampling, use it in a latency mode or not -- and you can determine how many sensors you want, and they work in a synchronized manner. You can imagine that opens the room for lots of applications. They can also work simultaneously. This is an example: the common setting that we had for activity recognition was wearing one sensor on the ankle, one on the wrist and another on the hip. Those were the main locations. But we also wanted this system to work with other sensors, so we tested it -- this is an example -- with the sensor connected to a phone. The phone was gathering GPS information, was running experience sampling and was connected to a heart-rate device called the Alive, so we could get that information too. The point was not that we needed that information, but to prove that the system could easily be connected with other types of information. Because the vision that we had for these sensors and this work is that, for instance, if you have a large study, you can send the sensors to the participant, the participant can download the software and connect the sensors to their mobile phone, and you can send instructions. There was this movement about having people act as scientists, so they will report and contribute data, because they are also interested in the study, and the data can be uploaded to a server, where the researcher can look at it on a regular basis.

For instance, when I learned about studies in the medical field, I was worried, because they are super-expensive -- three years, five years -- and they don't know anything about the study until after those three years. And I was thinking, in three years you kind of know, but you really don't know, and if there was a mistake, you cannot fix it. Then it's millions of dollars. And those are the short studies; they also have cohorts where they follow people during their whole lives. You can imagine: oh, my God. So we had this vision, and we started to try the sensors. What the user receives is these sensors, and they look like that because they are also do-it-yourself. That was part of the initiative, because we wanted these sensors to be super, super cheap, and also makeable everywhere in the world, in every lab, even in India or China, so they needed materials that you could find easily everywhere. And we really worked out the procedure to make that easy.

Then, in the usual experiment, you run the sensors at 40 to 90 hertz. We tested what the best sampling rate was for physical activities; the sampling rate you need depends on the application. For instance, we work with autistic people and Parkinson's patients, and you need 90 hertz for that, but for physical activity, it's 40. And that has implications for the memory of the device. Along with that, on the phone, we developed software for Android and also Windows Mobile, after we received a donation of 200 phones -- it was great, actually; I liked it. We implemented experience sampling -- I don't know if you are familiar with it; some people call it momentary assessment -- so people could also report their moods or their activities.

And this is how it looks. I will just browse. Some of these activities are labeled through the experience sampling; some of those labels are for the activity recognition, especially in natural scenarios. When we have a study, we focus on just six very simple activities, because we run into the ground-truth problem: how to label the activities and things like that. Then the day starts; you see the sleeping patterns, the posture [shifts]. This is actually my adviser -- he was in the [dark]. Then he's in a meeting. You can see how the patterns are changing, so this is sleeping. You see the [indiscernible].

>>: So these are different accelerometers.

>> Selene Mota: Yes, the ankle and the wrist over several days.

>>: Does it align with the accelerometer?

>> Selene Mota: Yes, the pink lines are the acceleration measured every second, and the red line is the average of that. That's why you see the peaks. Even with a very committed user, sometimes he forgets to charge the sensor when he's tired, or he's just taking a shower. These are everyday activities, so you can see when a person goes to the restroom, or the [indiscernible] when you're asleep. And what is important to highlight about the Wockets is that they are capable of collecting raw data at a very high granularity, not just summary data. Okay, I will just browse here. And this is what we show on the phone. First of all, one problem is where the sensor is and where the person is putting the phone, so the UI allows you to set up where you want to put the phone. These are the reports, and it also tells you when data is missing. The user can go and check the data, as well as the researcher, and every day we can automatically send summary information about --

>>: Can the user edit?

>> Selene Mota: Actually, that is something they cannot edit right now, but we were interviewing the users about their wish list, and one of the things we got was: 'My raw data is really interesting to me. I want to use it as a journal, for instance. I could even label it -- it's a very heavy task, but I want something like this.' That was clear. So we didn't make an interface especially for the users, but we got a look at what they wanted. So then we get the data like that. Up to that point, we are just collecting the data, but how can we go from the collected data to actually distinguishing between different activity patterns? These are typical accelerometer signals, and for that, we have to touch on some questions about modeling. For instance, what are the most informative features? Where do you put the sensors on the body -- depending on the activity, which location gives you more information? We focus a lot on medical studies, so we need to make sure the researchers get what they want. Which algorithm is optimal -- and not just optimal, but optimal while working in the real world? I will try to answer these questions, first in the experimental domain and then in the real-time domain, and I will go fast. We did lots of experiments validating our sensors along with about 10 other sensors -- very elaborate experiments in semi-naturalistic settings, with several protocols and many participants, and we did it many times. And we have all the software where we merged and synchronized the data. You know how every sensor has a different protocol, so we needed to synchronize it all to make sure that everything was collected -- these were the labels, etc., etc. Then we used that type of data to answer the first questions. The top-level machine-learning pipeline is like this: you have the sensor, you segment the signal, you extract the features, then you [indiscernible] and do the classification, but you can iterate. All of those work together --

>>: Did you video and then label the data afterwards?

>> Selene Mota: In those experiments, we had two people following the participant.

>>: Oh, a guy saying, at this time, he did this.

>> Selene Mota: We did, yes. And we had synchronized devices to do that. It was not video data, because we did video before in the PlaceLab, and it was super-difficult to label video data. It was just lots of work, we had to pay many people, and the labels were not even good, because the annotator was half asleep. So we decided it was better to do it live.

>>: Were the scripted activities then mostly different exercises?

>> Selene Mota: I can show you later which activities. We have many protocols, but we tried two kinds of activities: what you will see in everyday scenarios, but also activities that tell you something about intensity, because many people are interested in energy expenditure, and then it's important to have activities representative of different intensities in order to map to what is called the MET table.

>>: And you were measuring the oxygenation and heart rate.

>> Selene Mota: Yes, if you see the picture, she has the Oxycon.

>>: Is part of your goal -- maybe you'll get to this later, but it seems like one set of goals is around recognizing activities or particular levels of activity. Another approach might be that you could take cheap sensors and try to regress more detailed sensor values from them.

>> Selene Mota: Actually, it was a very generic project -- building a platform that others could use. In this project, we were collaborating with the Stanford Medical School. They were interested in detecting activity type in order to test the equations that they use to determine energy expenditure. Those equations are based on -- for instance, there is a table that maps jogging at a given speed to an oxygen level. That's something they did by hand. It's incredible that they did all that work, but it's like a book, so they wanted to do it automatically. You can imagine --

>>: They wanted to tune the parameters of the setup.

>> Selene Mota: But beyond that, we wanted to use the sensor in a wide range of experiments. I will try to go fast to get into that, because I am taking too much time here. So, for feature selection, we used information gain. Some people ask me, why don't you use boosting? But we wanted to rank features, to know what information each feature was giving us -- not just to find a particular set of features, but to know why, because that was informative for our different applications. After that, we knew the ranking of which features were giving us a particular type of information, and we clustered them accordingly. For instance, if you see the lower table, we clustered by computational cost, because what we wanted was to run this on the phone, and we wanted to know: what is the computational cost, and what information is this giving us for the classification task? Just to give you an idea, for activity recognition, people sometimes use 356 features, spanning everything from basic statistics to time-domain, frequency and energy analysis, but we realized, in many, many experiments, over and over, that you don't need more than 13 -- the basic statistics -- for basic activity recognition. We were applying the same system for detecting tremors in Parkinson's patients, and for that you need the [indiscernible]; you really need that for autism, for instance. Some features can confuse your classifier, and some others help -- it really depends on the task. That's why we did this analysis: to be informative. That's why boosting was not, in our case, the answer to our question.
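The ranking step she describes can be approximated with an information-gain estimate such as scikit-learn's mutual_info_classif; the feature names below are illustrative stand-ins, not the actual 13 features used in the work.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

# Illustrative stand-ins for the basic statistical features.
FEATURE_NAMES = ["mean_x", "mean_y", "mean_z",
                 "var_x", "var_y", "var_z",
                 "range_x", "range_y", "range_z"]

def rank_features(X, y):
    """Rank features by estimated mutual information with the activity label.

    X: (n_windows, n_features) feature matrix; y: activity labels.
    Returns (name, score) pairs, most informative first."""
    scores = mutual_info_classif(X, y, random_state=0)
    order = np.argsort(scores)[::-1]
    return [(FEATURE_NAMES[i], float(scores[i])) for i in order]
```

Ranking rather than wrapping everything in a boosted ensemble is what lets you ask why a feature helps, which is the point made above about informing different applications.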

>>: So you're saying basically you wanted an interpretation of the features, and this was the --

>> Selene Mota: Yes.

>>: But it seems like those things seem somewhat factorable in the sense that you could select the features that you felt were informative, but then, given those features, you could train multiple trees and --

>> Selene Mota: Yes, this was really because we wanted to inform the community and other people who were already applying other techniques and things like that. I received that question a lot before, so I just wanted you to be aware.

And another question: we were testing with five sensors at the same time. Actually, our system will support seven sensors at the same time, and you can imagine, syncing all those seven was challenging. But then my adviser said, you know what? We just need one. And the reason we did that is because many medical experiments can only pay for one sensor. Then the question was: if you use a common classifier, which location would you pick? We did lots of experiments with 15 activities, which I will tell you about later, and we clustered them according to activity type. Why activity type? Because we were working with the Stanford Medical School, and they wanted this information for their studies on energy expenditure and physical activity. But then we saw that, for ambulation, there is not that big a difference.

>>: So it seems like ankle is the best. It's everywhere.

>> Selene Mota: But when we look at the individual activities -- I don't have that table, actually -- for ambulation identification, yes, the ankle will always give you a good answer, much better than -- you can imagine; when you are typing, it's kind of common sense. For sleep?

>>: What is ambulation?

>> Selene Mota: Ambulation is movement -- walking, all kinds of walking, roughly. So then I was thinking, we should use something like a smart sock.

>>: Smart sock.

>>: It's being developed.

>> Selene Mota: The thing is, not in the shoe, because most people take off their shoes when they arrive home, so then you miss all of that. But then, for sleep studies, it's the wrist, definitely. And, of course, the ankle doesn't give you that much information when you are sleeping. It's a tradeoff; it really depends what you are looking for. And then, of course, if you want high granularity, you need to use both the upper and the lower motion of the body.

And then, we were exploring how to make more use of the self-report. I don't have the answer for you right now, but everything we have done experimentally on that shows it can really inform your algorithm; it can give you tips to make it better. Then, algorithm selection. We tested many algorithms -- hidden Markov models, decision trees (my adviser is a huge fan of decision trees) and support vector machines -- on these basic activities, which are the activities most people are interested in for activity recognition. We were also interested in how many features you use and the complexity of those features, because you need to actually compute them on the phone. The smallest set of features was the one that used just statistical information: 13 features -- mean, variance, total mean, difference of variances, nothing fancy. The upper table shows the results of the hidden Markov models with these activities. It's not a big data set, but with 13 features, it's not doing very well. As you can see, increasing to 42 features improves it; it really needs 64, but that involves FFTs, which are expensive to compute on the phone. And even then, its accuracy doesn't match the highest accuracy of the decision tree using 13 features. So the decision tree is superior. We ran this many times, over and over, with several data sets. The hidden Markov model didn't really handle the data well.

>>: Is the training test split across users?

>> Selene Mota: It is overall.

>>: But I mean, does the test set contain?

>> Selene Mota: So it was subject dependent.

>>: Subject dependent, so --

>>: Subject dependent, so you have train and test from the same --

>> Selene Mota: Yes. That should give a better outcome.

>>: And each individual user has a per-user [model], so one classifier for every single user.

>> Selene Mota: That is like the best situation, and [Dee] had the number.

>>: So is it just [indiscernible] per user, or is there some knowledge transfer?

>> Selene Mota: We made a classifier per person.

>>: What do you think would have happened if you had tried to use some --

>> Selene Mota: Because we have seen over and over -- and that's why I'm not showing the results -- that when you test with other users, the accuracy drops.

>>: I understand, but have you considered using some knowledge transfer in which --

>> Selene Mota: Yes, actually, I will get to that later in my talk. Here, I just wanted a rough sense of where we were: decision trees and support vector machines seemed better, so we picked those up and went on with them, because in the beginning, we just didn't know what to use, so we were exploring. So then, we decided to go on -- as you can see, there are other properties I will just go through -- to answer your question. Another property is distinguishing the unknown class, knowing that something is 'other,' something you are not focusing on, and we were looking at how these algorithms do that. As you can see, playing soccer, for instance, is a really hard task; it usually gets confused, but that's normal. It's part of the problem with the taxonomy of activities, because everything is really interconnected.

>>: You just run a lot, right?

>> Selene Mota: Or driving a car -- I mean, because it's partly also sitting, right? So we still wanted good properties in that regard, and those algorithms seemed good. I will get to your question. I just want to tell you what is going on with this data set. When you ask me which activities, what the protocol was: you can see an example of the activities here. We just made a selection to test. I have detailed results of all the experiments, but it would take me a really long time to explain all of them. And then there are the caveats of these data sets: few users, fixed sensors, clean sensor signals -- you know, the start and end of an activity is the segmentation issue in activity recognition, but this was all clean, right? And all the signals were complete.

>>: Segments that you knew from --

>>: But the classification is per frame, though? You're not looking at --

>> Selene Mota: So we established that -- we did experiments that established what the appropriate window is. Here, our window was 400 milliseconds, and that held across all our experiments. That's close to optimal for physical activity.

>>: But in this experiment, you made that the entire 400-millisecond window was within the activity, was within one activity.

>> Selene Mota: Yes.

>>: It was not crossing the boundary.

>> Selene Mota: No, we didn't have overlapping windows, because it was done through a protocol, right? I was walking, and then it said start, end. And you can see, there are lots of disadvantages, right, for really telling you something about how an algorithm is performing. So then we moved to testing in real time, to see how this algorithm actually works. For that, we were collaborating again with Stanford, but this time we had already deployed with 50 participants, and the aim is 200 participants for two years. It's a real-world data collection, in semi-naturalistic and naturalistic settings, and the semi-naturalistic setting is for data validation. You deploy a sensor in the wild, so how do you know that it's working? We had to come up with ideas for that. For instance, the participant wears the sensors and reports with the experience sampling, but we choose one day where we do the activities with all the gear, and we go with them and try to do it outside -- biking and things like that -- just to collect data. These participants do that once per month; for three hours they will [cure] -- and we had to put up a sign saying it's not a terrorist thing or anything like that. You can imagine, it was a problem. Actually, once I got caught. It was not very good.

So, anyhow, we collect data in two modes. For extended battery, there was about one minute of latency -- Bluetooth consumes a lot of power if you send one bit of information at a time, so we accumulated information and sent it once per minute, and that's how we extended the battery life for a really long time. But the sensor also works in real time. We actually deployed two little applications with those participants, which I will show you later, and these are the kinds of results. The image is not very clear, but 'other' is other activity; this is the right ankle, using a particular -- there is just a log, and this is the log from when it's in passive mode. The good news is that the basic activities, the fixed activities I showed you, are mostly identified. The problems are the transitions between activities; it's very difficult to segment that type of data. We ran many algorithms, but it was still really hard. Also, it's difficult to classify 'other,' so we need some commonsense rules for the algorithm.

Steady-state activity is rare -- well, not all the time, but if you want to measure ambulation, that is where it's rare -- and that is an active area of research, so I will just give you an idea. This was one of the applications that we deployed. We have the activity, and we asked participants to help us segment it: the signal was shown to them, because, as you can see, one researcher at the lab cannot do it all; it's just too hard. So we asked the participants, and we gave them some points at the end of the week if they would show us some of the segmentations once in a while. We did that, and it actually worked.

>>: How did they know?

>> Selene Mota: So it's because you see the activity, and then there are transitions.

>>: But they're self-analyzing it, saying, oh, right here I was running?

>> Selene Mota: When the phone measures a change, it prompts the user, who sees it or not; sometimes we notify them.

>>: Oh, I think you're driving right now, could you --

>> Selene Mota: It's a little bit stupid, but it's what we did.

>>: But it notifies them, so it finds --

>> Selene Mota: It notifies them, or the user can just put it, but it wasn't really like --

>>: Did you try doing active learning with the changes?

>>: Right. So they're not trying to adjust the boundaries. They're just answering binary questions about whether something changed, or producing a label at some point.

>> Selene Mota: Yes. But then we come back to our question, and your question: is the algorithm suitable for a real-world system? We have to look at these things. Modularity: if you ask participants what they do, maybe six basic activities will be similar across participants, but all the rest will be different. Some people bike, other people swim, so the level of complexity starts to grow. How can you add activities on the fly, depending on your interest? That's important. And there is also transfer learning. Sometimes you are interested in a new activity, but you have very few examples, so how can you borrow a model from another user who has provided lots of examples? The problem with that, especially in activity recognition, is that you cannot do it across all subjects. We tried. The thing is, physical build is very important, and it is not trivial: even when we measured physical build and so on, things still worked very differently. So we have started to think about similarity measurements for the type of movement, making clusters of types of participants in order to borrow information. This is ongoing research right now, but the idea is to compare an activity for which both of us have enough information and measure the similarity, and from that, I know that you are a good model for me, so we can borrow each other's information. That's how we are doing it. Right now, I don't have those results, because we are still testing.
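A minimal sketch of that borrowing idea, assuming similarity is measured on one shared activity and used to pick whose labeled data to reuse; the metric and the choice of k are illustrative assumptions, not the ongoing work's actual method.

```python
import numpy as np

def movement_similarity(feats_a, feats_b):
    """Distance between two users' mean feature vectors on the same
    shared activity (e.g., walking); smaller means more similar."""
    return np.linalg.norm(feats_a.mean(axis=0) - feats_b.mean(axis=0))

def borrow_training_data(new_user_shared, other_users, k=2):
    """Borrow labeled windows from the k most similar users.

    other_users: list of (shared_activity_feats, X_labeled, y_labeled)."""
    ranked = sorted(other_users,
                    key=lambda u: movement_similarity(new_user_shared, u[0]))
    X = np.vstack([u[1] for u in ranked[:k]])
    y = np.concatenate([u[2] for u in ranked[:k]])
    return X, y
```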

For instance, I mentioned that we had decision trees and support vector machines, both with good results, but we decided to go with support vector machines for a practical reason. You can train the support vector machine in [all pairs] mode, which means that you don't solve the multi-class problem jointly; you divide it into pairs. So if you have running and walking, you just have binary classifications: running versus walking, running versus sitting, running versus standing, and you compare those as the classes grow. Then the problem becomes the decision in the second layer, about how you decide who wins. We actually thought that was interesting, and it was the advantage of modularity that the support vector machine was giving us, especially in this problem of activity recognition, where many times we had models built and we didn't want to lose that information. For instance, there were classes that were very difficult to recognize -- walking and walking briskly; you need lots of examples to detect the nuances of those differences -- and retraining against all the others just because you are adding standing is not worth it, you know what I mean? That's why we went in that direction.
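A sketch of that modularity argument: one-vs-one SVMs kept as independent binary models, so adding an activity trains only the new pairs and never touches the existing ones. The quadratic kernel follows the talk; everything else here is an illustrative assumption, not the actual system.

```python
import numpy as np
from sklearn.svm import SVC

class PairwiseActivitySVM:
    def __init__(self):
        self.models = {}   # (class_a, class_b) -> fitted binary SVC
        self.data = {}     # class name -> feature matrix of its examples

    def add_activity(self, name, X):
        # Train one binary classifier against each existing class;
        # previously trained models are left untouched.
        for other, X_other in self.data.items():
            Xy = np.vstack([X, X_other])
            y = [name] * len(X) + [other] * len(X_other)
            clf = SVC(kernel="poly", degree=2)   # quadratic kernel, per the talk
            self.models[(name, other)] = clf.fit(Xy, y)
        self.data[name] = X

    def predict(self, x):
        # Second layer: majority vote among the pairwise winners.
        votes = {}
        for clf in self.models.values():
            w = clf.predict(x.reshape(1, -1))[0]
            votes[w] = votes.get(w, 0) + 1
        return max(votes, key=votes.get)
```

Adding 'walking briskly' here costs one new binary model per existing class, roughly the 30-second retraining cost mentioned later, instead of retraining a joint multi-class model from scratch.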

>>: You said that you could do that with SVM. Why couldn't you do the same thing with trees?

>> Selene Mota: Because we were training the classes jointly. Maybe if you do it one versus three -- I don't know exactly; I'm not familiar with how you could retrain it that way.

>>: So is it the fact that when a new class arises, you have to retrain the entire decision tree, versus for the [SVMs], you can just --

>> Selene Mota: Yes.

>>: If I understand correctly, for the SVM, what you did is kind of all pairs or one against all when you trained. You could do the same thing with trees, right? Train a tree to distinguish between brisk walking and running.

>> Selene Mota: Yes. We didn't do it like that, so I cannot give you the exact result, but what I can tell you is that when we train, for instance, two classes -- decision trees, support vector machines with a polynomial kernel of degree one -- and by the way, among other experimental results, we tried many types of kernels, like radial basis -- radial basis, sorry -- and all the polynomials. And I actually saw at Ubicomp last year that they reported results using random forests and AdaBoost with decision trees, and they also tested support vectors, but with linear kernels, which we know don't work. For almost all activities, you only start to get good recognition from polynomial degree two, three and sometimes four -- mostly, for efficiency, two and three. They get really, really robust when you use a quadratic polynomial. So I don't really know how to interpret those results. They're not exactly comparable, because a support vector with a linear kernel doesn't work, so you know what I mean -- it's not a fair comparison. The reason we went with support vector machines is that, in all the experiments that we did, support vectors just performed better, even as individual binary classifiers, so we decided to go with them, though I will show you some results later that might contribute a little to answering your question and add a little discussion. So I will just go on. The support vector machine worked, but when we tested it, normally it wasn't super great, and as I mentioned, the problem became the second layer: how do you vote? Sometimes you will have pairs of classes saying it's this or the other, and it will be wrong. So you need to apply a little more intelligence at that level, and there are several algorithms that can tell you more about it. These algorithms are actually inspired by the field of game theory, which I am really a fan of, because they build a little bit of memory about how each classifier did before in order to increase or decrease a particular weight. It's not as fancy as some of the game theory approaches, but it is very interesting. So we applied it, and you can see, when we evaluated it over 10 iterations on the six activities that we always use as our testing bed, the classifiers that did better -- the support vector with kernel degree two and the decision trees -- are all in about the same place, but when you use the support vector with the weighted-voting mechanism, it gets lower error.
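A sketch of that second-layer idea in the spirit of the weighted-majority algorithm from game theory: each binary classifier keeps a weight that shrinks when it votes for the wrong class. The update rule and the beta parameter are illustrative assumptions, not the exact mechanism used in the work.

```python
import numpy as np

class WeightedPairwiseVote:
    def __init__(self, classifiers, beta=0.8):
        self.classifiers = classifiers            # fitted binary models
        self.weights = np.ones(len(classifiers))  # the 'memory' of past performance
        self.beta = beta                          # multiplicative penalty in (0, 1)

    def predict(self, x):
        votes = {}
        for clf, w in zip(self.classifiers, self.weights):
            label = clf.predict(x.reshape(1, -1))[0]
            votes[label] = votes.get(label, 0.0) + w
        return max(votes, key=votes.get)

    def update(self, x, true_label):
        # Demote classifiers that voted for the wrong class on this example.
        for i, clf in enumerate(self.classifiers):
            if clf.predict(x.reshape(1, -1))[0] != true_label:
                self.weights[i] *= self.beta
```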

>>: The upper curves are one versus all, and the lower one is one versus one with weighted --

>> Selene Mota: In this one, these are the results of -- how many users do we have here, 15 users? And then we train a model per subject. It's subject dependent, not subject independent. We did it that way because we are going toward personalization, so we create a model for you, and you can transfer the information. So if you look at the confusion matrices --

>>: I didn't understand what was the upper curve? So the lower curve sounds like you're doing one versus one activity, pairwise classifiers, and you're doing this majority voting to figure out the winner, but what's the upper curve, SVM K2 by itself, the solid line?

>> Selene Mota: It's just normal voting, you see. Not weighted, not weighted, but still voting.

>>: Number one is weighted.

>> Selene Mota: It's just that one is weighted, and the other is not weighted. It commits --

>>: And the same set of one versus all, and the one versus one classification.

>> Selene Mota: Yes, the same set. I just didn't understand your question, sorry. And then you can see, analyzing it, the best classifier is still confusing playing soccer. It's obvious. But we also wanted to know which of these different modalities of weights was performing better, so we were adding different numbers of samples and trying to figure out what was better. As you can see -- and this is just more for you to see -- if we use data gathered in laboratory settings, most algorithms perform; the world is beautiful. They perform really well -- not all of them, but most of them. But when you go into the naturalistic setting, it's the same activities -- really nothing fancy, very simple -- and that's what you get in activity recognition. Another thing that we experimented with is adding new activities. For instance, the set of basic activities is the six activities, and this is how the algorithms perform when you add more. This answers a little the question he had: how do the decision tree versus the support vector machine do when you have two activities, right? This is when you have two, when you have three, when you have four, and you can see that, of the old techniques, the weighted experts is the one that did better. In general, to conclude this part: the support vector machine does better in terms of accuracy, but in computational time it takes longer -- 30 seconds. It also works better with the experts. And here you have other questions. What is acceptable? Is 30 seconds acceptable to retrain for a new activity? We thought it is when the activity is totally new, but it really depends on your setting. Also, for the real-time deployment, we deployed these two little apps. One is StepLively, which just counts your steps. The other is called EveryFit, and I will just give you a demo, to show you how some of the activities -- now it's playing. Sorry. This demo is kind of funny.

>>: Where did you publish this?

>> Selene Mota: So we have it in like Ubicomp.

>>: Was this this year?

>> Selene Mota: No, no.

>>: Last year?

>> Selene Mota: Actually, our lab has a lot of examples like that; we did it with the [mice] and with this. But this is really, yes, brutal, as you said. Of course, natural activities do not look like that. But people got very entertained.

>>: But it seems like he should have been burning way more calories than what it's showing.

>>: Do you think so?

>>: Yes.

>>: Then the number is realistic. Sorry to say that.

>>: I think this is per activity. I don't think it's cumulative.

>> Selene Mota: So actually, like the calories --

>>: This is per the net calories which [indiscernible].

>>: Biceps curl --

>>: Yes, how much weight do you have?

>>: Well, with no weight, yes.

>>: Well, running and pushups.

>> Selene Mota: Yes, and then, for instance, with StepLively, we got lots of -- we got burned by some people, because they were really into counting their steps, and they were saying, 'This app can't distinguish between walking on a flat surface versus going upstairs, and I want my points.' And we were like, yes, come on. Things like that.

>>: Fitbit, as well. Once, I was on a bus with a Fitbit, and it was counting steps.

>>: All right. That's the way to get some exercise.

>> Selene Mota: And I have to move on; otherwise, guys, I will have you here for a really long time. So, using these different methods, you can browse a day of data, days of data, months -- this is a month. This is one particular user, and what is interesting, if you look, is that each activity is color coded. This is one user in the summer, and this is the same user in another period of time, and it's just really different.

>>: Sleeping better.

>> Selene Mota: Yes, and then the desktop work. So it's ongoing research. Things like the transfer learning and so on are still to be explored; how to better exploit feedback from the user at the algorithm level is another area of research, as are how you can fill the gaps and how you can use more interventions. We learned a lot of things. You have to really go and implement in real time, and that's why we think that our sensors are really useful for machine learning. And then, just to give you a really fast roll through the applications: we did all this under an NIH initiative, which means many groups across the US were involved, especially in the medical field. They came to us, they gave us the problem, we worked with them to make it happen, and then they developed some really critical applications. This is the initiative, and those were the requirements: low cost, less than $60, do it yourself, waterproof, real time, works at large scale, can be used for all these kinds of studies -- for seniors, for children, for behavior-change interventions -- and integrates with other systems. The first project that I really liked was with Stanford. We collaborated really closely with them, and they have a big cohort of cancer patients. What they were investigating is how physical activity changes the progression of cancer, and I thought that was really, really interesting. It has to do with epigenetics. They have been tracking these people for several years, and they have seen that 30 minutes of physical activity as an intervention can slow down the progression and make the treatment more effective, which is just very interesting. Now they are investigating the other way around. They have been studying people who already have cancer, but they want to work more on preventive medicine: if you have a high probability of getting cancer, what can you do 10 years in advance to actually change that? Because if you can change the progression when you have it, imagine what you can do when you don't have it yet.

>>: It seems like some of that you could ask post facto. You could look at the population of those who are genetically predisposed to a disease, and then ask, what kind of exercise regimen did you have for the last --

>> Selene Mota: Yes, and also diet. The intervention on diet is even --

>>: Assuming they're truthful, yes.

>>: Oh, I worked out every day.

>>: But what motivation would they have --

>> Selene Mota: And then this is the study that is currently ongoing that I mentioned to you. We deployed with 50 people; the target is 200. I will reach that target before I finish, hopefully. So then the other project is about mental health and depression. [Mary] was telling me about her research on depression, and it's really heartbreaking, because it's one of the diseases -- because it is a disease -- that is not actually taken care of at all.

People don't like to talk about it, and when you do talk about it, how can someone actually help you? So then, in collaboration with Northwestern University -- they got interested in the project. Some of the indicators of depression involve physical activity. For instance, if you do not go out of your house and you are not moving a lot, it is likely that you need an intervention. For that, they are developing -- it's ongoing -- a system called Purple Robot that is pretty much based on context sensing and experience sampling, and they are trying the Wockets to have a better assessment of the physical activity. But of course, the phone as a platform has other sensors too, and what they are doing experimentally is very interesting. We had worked a little bit with them, especially on how to determine that someone needs an intervention based on behavior, and I can show you some of it -- we can discuss more about it.
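To make that kind of rule concrete, here is a minimal sketch, assuming hypothetical daily summaries of step counts and distance from home; the thresholds, field names, and the consecutive-day rule are illustrative assumptions, not Purple Robot's actual logic:

```python
# A minimal sketch of a behavior-based intervention trigger.
# All names and thresholds are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class DaySummary:
    steps: int            # daily step count from phone or wearable
    km_from_home: float   # max distance from home that day, from location sensing

LOW_STEPS = 1000          # "barely moving" (assumed threshold)
NEAR_HOME_KM = 0.2        # "did not really go out" (assumed threshold)
CONSECUTIVE_DAYS = 4

def needs_intervention(days: list) -> bool:
    """Flag when the last CONSECUTIVE_DAYS are all low-activity, stay-home days."""
    recent = days[-CONSECUTIVE_DAYS:]
    if len(recent) < CONSECUTIVE_DAYS:
        return False
    return all(d.steps < LOW_STEPS and d.km_from_home < NEAR_HOME_KM
               for d in recent)

week = [DaySummary(5200, 3.1), DaySummary(4100, 1.0),
        DaySummary(600, 0.1), DaySummary(450, 0.0),
        DaySummary(820, 0.1), DaySummary(300, 0.0)]
print(needs_intervention(week))  # True: four quiet, stay-home days in a row
```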

And then another project is autism. Matthew Goodwin, from the Groden Center, was very interested in how you can detect stereotypic movements. The problem with detecting the movement is that it is an unbalanced class -- the movements are very few and happen really, really fast, so you don't have many examples, and people do them in different ways. So it is ongoing research. This work is published with several collaborators.
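As an illustration of what detecting these short, fast events can look like, here is a minimal sketch of one common approach -- sliding a window over an accelerometer signal and flagging windows whose energy concentrates in a rocking-like frequency band. The sampling rate, band, and threshold are assumptions, and this is not the published detector:

```python
# A minimal sketch of a band-energy detector for short repetitive movements.
import numpy as np

FS = 50            # Hz, assumed accelerometer sampling rate
WIN = 2 * FS       # 2-second windows
BAND = (0.5, 3.0)  # Hz, assumed rocking frequency range
THRESHOLD = 0.6    # fraction of window energy that must fall in the band

def band_energy_ratio(window: np.ndarray) -> float:
    """Fraction of (non-DC) spectral energy inside the rocking band."""
    spec = np.abs(np.fft.rfft(window - window.mean())) ** 2
    freqs = np.fft.rfftfreq(len(window), d=1.0 / FS)
    in_band = (freqs >= BAND[0]) & (freqs <= BAND[1])
    total = spec[1:].sum()
    return spec[in_band].sum() / total if total > 0 else 0.0

def detect(signal: np.ndarray) -> list:
    """Return start samples of windows that look like stereotypic motion."""
    hits = []
    for start in range(0, len(signal) - WIN, WIN // 2):  # 50% overlap
        if band_energy_ratio(signal[start:start + WIN]) > THRESHOLD:
            hits.append(start)
    return hits

# Synthetic example: 10 s of noise with 2 s of 1.5 Hz "rocking" in the middle
t = np.arange(0, 10, 1.0 / FS)
sig = 0.05 * np.random.randn(len(t))
sig[4 * FS:6 * FS] += np.sin(2 * np.pi * 1.5 * t[4 * FS:6 * FS])
print(detect(sig))  # windows overlapping the 4-6 s burst
```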

>>: What does tactile feedback mean? Tactile feedback?

>> Selene Mota: It's because of the teacher, actually. When we were doing the experiments -- many people were involved through the many runs of this project -- the teacher realized the children were wearing the phone in their bags, and suddenly, for some reason, accidentally, the phone started to vibrate. And the child suddenly stopped. It was this repetitive rocking, and the teacher was like, oh, wow, he stopped. And then he repeated it and repeated it, and it turned out that for some children, just being made aware helps -- they are just not aware when it's happening. It's so intense emotionally that they just freeze. They are moving, but they mentally freeze. And just having a reminder that this is happening is actually the same phenomenon that happens in Parkinson's.

When someone with Parkinson's tries to open a door and someone just pushes them, they just open the door, or they can dance, even if they have a tremor. And one of the explanations is that the motor part of the brain -- which is suffering from a lack of dopamine -- is what impedes the movement.

They see several paths when they want to follow one. But some movements they remember -- they are stored in other parts of the brain -- and they can just do them, because they are not passing through that part of the brain. So that's why, for instance, they can bike or they can dance. The thinking is that the same happens with autistic children. So then another project, very quickly, is the Parkinson's project that I just mentioned to you, but the interesting part is more the 100K and the MIT accelerator. It was one of the three top finalists of the 100K. We did not win the 100K, but we won the accelerator, so we could develop it more. And the partner we developed it with was Sanofi-Aventis. They have clinical trials, and the way they measure whether the drug works or not is terrible. It's just self-report once per week. The patient has to go to the clinic, and they want a better way. Another problem with Parkinson's is that the patients very quickly suffer from medication habituation. So the medicine can work well for three months, for six hours at a time, but eventually it will work less and less, and you really don't know. That's why, for instance -- I don't know if you ever saw

Michael J. Fox -- it's actually very subject to emotions. And something that the Parkinson's patients told me again and again is, you know, sometimes the tremor is not even my biggest problem. It's the depression that the medication is causing me. The side effects of the drugs are terrible. It was really heartbreaking, actually, to work with them, because they really need help, and the thing is, their lives are not destroyed because they don't have the medication. Their lives are destroyed because of things like what happened to Michael J. Fox.

He went on MTV. He was super-nervous. He took a little bit of extra medication, or the right amount of medication, just to be ready for the interview, but he was so stressed that the medication just didn't work, and his tremors started in the middle of the interview -- everybody was saying, why did you do that? Why are you showing your tremors now? And he was almost crying, because it was not his intention. In Parkinson's, you have either freezing or dyskinesia, and the medication is so sensitive that if you take a little bit too much, instead of working well, you can have horrible movements in the middle of the street. So this is the project that I mentioned to you, and this is the asthma project that I mentioned to you.

And why not have APIs for having fun with sensors? In many of my classes, with [Adney], a friend of mine, we did the [Space Frog], a system where children can collaborate through movement, and they move the frog. And then this other one that I really like was just published last month. A friend of mine who does animation was saying, it is so difficult to move the parts of an animation every time. It's a lot of work. Can I just put on your sensors and move, and then just generate the animation? And I said, let's try, and we worked with [Kendra Lieberman] to say, okay, how can running be mapped to movement, and then, if I want to see what is the closest activity -- playing basketball involves running -- using common sense. So we did that, and it was interesting; there is a small sketch of that matching idea below.

So there are many open questions and challenges with the evaluations. For future work: epigenetics, metatranscriptomics -- it's difficult to pronounce, but I love the work of [indiscernible] -- so it's not just one way, where behavior shows how you feel; doing something also changes your biology, which is very interesting. Then, make people participants -- like PatientsLikeMe. It's not just about the quantified self. There are so many movements out there; three blocks from our lab is PatientsLikeMe, which is an amazing community, and the Parkinson's patients there are doing lots of work every day. Just give them sensors. So: creating meaning. And then there is behavior change, and I like this phrase of Dan Ariely, that we want to become better, amazing people in the future, but we don't live in the future, and that's why it's difficult to do behavior change. And so coaching, like the aim of the Parkinson's project, but also exploring other interfaces, like brain interfaces. Three months ago -- and [Kenna] probably knows about this -- I was blown away when I learned about this center for brain fitness. How can you train your brain to focus, to be better? It's just the next level, I think. And also, why not? Engage the world.
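On the matching idea mentioned above: a minimal sketch, assuming crude hand-picked motion features and a tiny hypothetical library of labeled signatures (not the published method), of mapping a new motion clip to the closest known activity:

```python
# A minimal sketch of "find the closest activity" for driving an animation.
import numpy as np

def features(clip: np.ndarray) -> np.ndarray:
    """Crude motion signature: mean, std, and mean absolute difference."""
    return np.array([clip.mean(), clip.std(), np.abs(np.diff(clip)).mean()])

rng = np.random.default_rng(0)
t = np.arange(0, 5, 0.02)  # 5 s at an assumed 50 Hz

# Hypothetical library of labeled motion signatures
library = {
    "walking":  features(np.sin(2 * np.pi * 1.0 * t)),
    "running":  features(1.8 * np.sin(2 * np.pi * 2.5 * t)),
    "standing": features(0.02 * rng.standard_normal(len(t))),
}

def closest_activity(clip: np.ndarray) -> str:
    """Nearest neighbor in feature space over the labeled library."""
    f = features(clip)
    return min(library, key=lambda name: np.linalg.norm(library[name] - f))

# A new clip that looks like running should map to the "running" animation
query = 1.7 * np.sin(2 * np.pi * 2.4 * t)
print(closest_activity(query))  # "running"
```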

One project I have been super-interested in is the SANA Project, because these sensors are low cost. Part of the challenge was making these sensors very low cost, but then they can benefit people in India, in Mexico -- people that need them and cannot go to the hospitals -- so I think it's just very interesting. And in summary, many things; I am running out of time, but it is very important to work with real-world data. Otherwise, you just cannot get the insight that you get when you test in the real world. And that's it. Thank you.

>>: I'm going to ask my questions during my interview.

>>: We ran pretty late. Does anyone have any other questions? No? Thank you.

>> Selene Mota: Thanks.
