>> Jie Liu: I'm Jie Liu of MSR. And it's my great pleasure to host four faculty members from the MD2K Center. MD2K is an NIH-funded Big Data to Knowledge center spanning many organizations and universities. Today we have Santosh Kumar from the University of Memphis, Mani Srivastava from UCLA, Jim Rehg from Georgia Tech, and Emre Ertin from Ohio State. They'll stay for the entire day to interact with us and meet people, and this morning Santosh is going to talk about the center and some of the work they're doing. Welcome.

>> Santosh Kumar: Thank you, Jie. It was very nice of you to host us and provide us this opportunity to interact with Microsoft, and we are looking forward to it. So as you mentioned, I'm representing the center of excellence, and what we'll talk about mostly is how MD2K is contributing to realizing the precision medicine initiative that President Obama launched early this year. Before proceeding further, I'd like to acknowledge that this work has been funded by both the National Science Foundation and the National Institutes of Health, and in particular that the center of excellence itself is funded by NIH under its Big Data to Knowledge, BD2K, initiative. So it was around September 29th of last year, 2014, when NIH funded 11 big data centers of excellence across the country. What we list here are the areas in which these centers were funded: there were two in neuroimaging, two in genomics, and one each on [indiscernible], phenotyping, proteomics, causal modeling, et cetera. Then there was one specifically for metadata as it relates to biomedical data, one at Stanford that is studying mobility impairments, and then the MD2K Center, which is focused on mobile health. So together with Stanford's Mobilize Center, MD2K is the flagship center for doing big data research in mobile health as it relates to NIH's interests. Here are all 11 centers on the map. Ours is the MD2K Center, and that's what we'll talk about mostly, but this is a consortium: the MD2K Center itself is a consortium, and so is the group of all 11 centers. We are all working together, so when it comes to doing any work related to mobile health, the other centers of excellence work with the MD2K Center of Excellence. The MD2K Center itself involves 12 universities and Open mHealth, and it covers a variety of disciplines: medicine, behavioral health, electrical engineering, computer science, and statistics. In terms of the key personnel, we have an amazing group of investigators on the computing side: Emre Ertin, who is the lead for sensor development at MD2K; Jim Rehg, who is the lead for data science research at MD2K; and Mani Srivastava, who is the lead for the MD2K computing platforms. They are all here and will be glad to answer any questions after the talk. And then we have an equally amazing team of health researchers. Those in black are the behavioral health and public health researchers. Those in blue are the clinical and clinician specialists. And Susan Murphy is a statistician who is a pioneer in experimental design, in particular SMART designs and, most recently, micro-randomized trials for the development of adaptive interventions.
So now let me give a brief overview of mobile health, which many of you may know, but I would like to organize it around how we think about mobile health at MD2K. First I'll talk about the capabilities of mobile health. First, it can provide measurements of exposure as it relates to health risk factors. Consider just two things here: the mobile phone and the smart watch. Both of these are easy to carry or wear, and people are usually in the habit of carrying these devices with them in their daily lives. GPS is embedded in the mobile phone, and from GPS there is a variety of exposures we could obtain that could indicate various risk factors. The smart watch, as opposed to the mobile phone, is on an exposed part of the body, and therefore it can capture or measure the ambience of the person. That includes exposure to daylight, which can have an effect on the stress level of people. It can monitor UV exposure. It could monitor noise pollution. And if it has chemical or environmental sensors integrated in it, then it could also measure exposure to environmental pollutants. So that was just a small sample of the various risk factors that could be obtained from mobile sensors that we are used to carrying every day. Then some of these sensors can also help measure behavior. On the top row, what you see are the various behaviors that could be detected by a smart watch with its built-in sensors. Sedentary behavior is a huge risk factor for various diseases, and it's pretty widespread. Other behaviors that could be detected are eating behaviors and smoking behaviors. About a couple of years ago, Emre integrated an alcohol sensor, which senses alcohol from sweat, into the smart watch form factor. And if the smart watch can measure the inter-beat interval, or the timing of successive pulses, then it could also be used to obtain measures of cocaine use behaviors. And if we have a measure of respiration or a measure of audio, then we could get a measure of conversation behaviors as well, so social interactions, which play a huge role in a variety of different diseases, mostly related to mental health. So I talked about measurement of exposures and risk factors, and then about measurement of behaviors. Next, the same mobile sensors can also measure the outcomes and the symptoms, that is, the end result we are trying to maintain or avoid. For example, they could measure stress from ECG or respiration. We could have a measurement of depression from microphone data. We could have a measurement of fatigue from smart eyeglasses that can look at the eyes. We could have a measurement of some asthma symptoms from mobile phones, and, as we'll discuss in more detail, there is a sensor that Emre is developing that could be used to obtain a measure of lung fluid condition for congestive heart failure patients. So this is again just a sample to show that mobile sensors can help us measure the various risk factors a person is exposed to in their daily life, the behaviors that also constitute risk factors, and the outcomes too. Next I'll talk about how all of this can contribute to the precision medicine initiative. In his State of the Union address in 2015, President Obama announced the launching of this largest health initiative ever undertaken in the world.
And this came after the genomic medicine initiative; this is the next initiative that the U.S. wants to lead the world in. And the goal here is indeed that personalized health should be available to everyone on an individual basis. The closest example is prescription glasses: each of our prescription glasses is tailor-made for us individually, and similarly the treatment should be tailor-made for every individual. That's essentially the goal. So let me describe how mobile health can contribute and play a crucial role in realizing this vision of precision medicine. Say we conduct a genomic analysis, proteomic analysis, or microbiome analysis of a person, and suppose it indicates a risk for hypertension. At this point it is still just a risk; we don't know when or whether it will actually occur in that person. But if in addition we had the person monitored with mobile sensors, then as the symptoms begin to appear, we could have an intervention or treatment delivered and avoid irreparable damage to the end organs. So basically, by incorporating mobile monitoring of risk factors as well as of the outcomes, or the health status, we can tremendously improve the temporal precision in delivering precision medicine. That's an example of early detection, and if we can have early detection, then we can certainly help save a life, but we can also hopefully avoid irreparable damage to the end organs that a person may otherwise have to live with for the rest of their life. So that's early detection, which is extremely helpful. The second capability is prediction. If we can measure the risk factors continuously, and if we're able to measure the health status or the outcomes, then we can do predictive analytics and find those risk factors from the sensors that may predict an adverse outcome. If so, then those predictors could be used as triggers to deliver just-in-time interventions, and then we can truly realize the vision of preventive medicine. So in addition to early detection, mobile health can also help with prediction and prevention. The third thing it can help with is adaptation of the intervention itself. Since mobile sensors can measure the environment of the person, they can determine the right context and not deliver an intervention when, say, somebody is having an important meeting with their boss or is driving a car; those may not be the right moments to deliver the intervention. Mobile sensors can also help measure the outcome, or the response to treatment, and using those as feedback, the treatment itself can be adapted to the individual. So there are various ways in which the treatment can be adapted to engage individuals much more. We're applying this paradigm, and to show the usability and applicability of the toolkit, research, and technologies, we picked two applications to demonstrate the utility of MD2K's work. One is smoking cessation. Smoking is the largest cause of mortality in the U.S. and elsewhere, and it's an extremely hard disease to treat. Our approach here is to have a way to detect when smoking occurs. If we can detect smoking, then we can look at other sensor data to find what may predict a lapse in a cessation attempt, and if we find those predictors, then they will be used in the design, development, and delivery of sensor-triggered just-in-time interventions.
And we are adopting a similar approach for congestive heart failure, which is the leading cause of hospital readmission in the country. For this, as I mentioned briefly, we are using a new sensor called EasySense to measure the lung fluid condition, and together with other measures of symptoms and physiology, we are developing an index of the condition status of congestive heart failure patients, which will enable early detection. If we also monitor other behaviors, such as, say, fast food eating or salt intake, then we can try to find predictors from the behaviors that may indicate worsening of the condition status, and if so, that will again be used for delivering just-in-time interventions. So there are two points of treatment delivery in congestive heart failure. One is that the early detection itself could be a trigger to adjust the dosage for the patient. And then, in the long term, we can help people avoid worsening of the condition itself by helping them adopt healthier behaviors. Now, of the variety of sensor data sources we use in MD2K, most of the sensors listed here were actually developed within MD2K, and that's why we have a sensor lead in MD2K. AutoSense is the sensor that measures ECG and respiration; it was developed by Emre Ertin as part of the Genes, Environment and Health initiative grant from NIH. Then the EasySense sensor, for assessment of heart motion, lung motion, and lung fluid level, which I'll describe in a little more detail, was developed recently by Emre as part of an NSF-funded Smart Health project. Then we have a smart watch sensor as well for assessment of motion, and we are now in discussions to replace that with the Microsoft Band. The fourth one is a smart eyeglass; what you see here is an image of the smart eyeglass that is being developed by the [indiscernible] group at UMass Amherst, who is also an MD2K investigator. All of these sensors stream data wirelessly in realtime to the mobile phone, where all the data are synchronized with the self-reports collected from the person and the GPS data collected by the mobile phone. In addition, for the CHF study, we also plan to use some other sensors, such as for daily weight monitoring and blood pressure monitoring. So to summarize, the goal of MD2K is basically to develop the software, tools, training, and the science to gather, analyze, and interpret health-related mobile sensor data, so that just as today any health researcher is able to collect and analyze self-report data, we would like the entire community to have the ability, the resources, and the skills to collect, analyze, interpret, and use mobile sensor data. We are also developing the right analytic tools that can help researchers develop innovative methods for early detection and prevention of complex chronic diseases and to discover the knowledge behind prediction and prevention as it relates to those diseases. Ultimately, our goal is to help lead the development of sensor-triggered mobile health interventions that can help realize the vision of precision medicine. But there is a huge number of steps before we can move to the development of efficacious, promising sensor-triggered interventions, and I'll describe that in more detail. So in terms of concrete deliverables, what is it that we are developing?
So this is a mobile sensor data-to-knowledge center, so we have contributions both on the data science research side and on the knowledge discovery side. First, on the data science research side, as I mentioned, we have developed or are developing various mobile sensors. Next are computational models that can convert this noisy mobile sensor data, collected in the natural, real-life environment, into usable, clinically relevant markers: markers of exposure, markers of behavior, and markers of symptoms and health outcomes. Then the next layer is the development of predictive analytics that can be used to discover predictors in this multivariate time series of markers. And once we have those predictors, they can be used to develop sensor-triggered mobile interventions. So MD2K is conducting research at all of these layers. The next task is to have a computational platform that can be used for reliable data collection in the field environment for a variety of disease conditions, as well as for development of each of these models, whether it's the model for converting sensor data to markers, the predictive analytic models, or the intervention itself. When we deploy the sensors, we will be generating sensor data, and when the sensor-to-marker models are applied, we will discover markers of health status and various risk factors. We also hope to discover early detectors and predictors that can be adopted widely in health practice, the practice of medicine. And ultimately, the mobile health interventions can be adopted by the end users themselves to monitor and improve their health. Next I'll describe an example of how we approach the development of computational models for converting the sensor data into markers. There are several challenges in converting sensor data into markers. First, the sensor design itself has to be very sound, and there are a variety of challenges there. It must be wearable, so people will feel like wearing it on a daily basis. It must be safe to wear. It must be reliable and robust, so that the data collected in the natural living environment can be trusted for making clinical decisions. And it should be versatile, meaning we should be able to make a variety of inferences from the few sensors that people will be willing to wear. It's not realistic to expect people to go about their daily life with ten sensors on them. Therefore, it falls to the data science researchers, the computing researchers, to find ways to develop models that can take the data from the few sensors people will be willing to wear and extrapolate or infer a variety of information from them. Then the data collection software itself has to be robust: it has to sample at the right rate, it has to recover if it crashes, it has to minimize losses due to wireless communication, the devices have to last at least the entire day, if not more, on a single battery charge while doing all of this, and it should have reliable storage. If all of this works, then we have good quality data.
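As a concrete illustration of the kind of data quality gating just described, here is a minimal Python sketch of how an incoming sensor window might be screened before any inference is attempted. The field names, thresholds, and checks are illustrative assumptions for exposition, not MD2K's actual implementation.

```python
from dataclasses import dataclass
from typing import Sequence

@dataclass
class SensorWindow:
    """A short window of raw samples from one wearable sensor channel."""
    samples: Sequence[float]     # received samples after wireless transfer
    expected_samples: int        # how many samples the window should hold at the nominal rate
    sensor_attached: bool        # e.g. from an electrode-contact or wear-detection check

def is_usable(window: SensorWindow,
              min_completeness: float = 0.9,
              min_variance: float = 1e-4) -> bool:
    """Return True only if the window looks complete and physiologically plausible.

    Downstream steps (marker extraction, lapse detection) are skipped for
    windows that fail, so no clinical decision is made on bad data.
    The thresholds are placeholders, not validated values.
    """
    if not window.sensor_attached or window.expected_samples <= 0:
        return False
    completeness = len(window.samples) / window.expected_samples
    if completeness < min_completeness:        # too much loss over the wireless link
        return False
    mean = sum(window.samples) / len(window.samples)
    variance = sum((x - mean) ** 2 for x in window.samples) / len(window.samples)
    return variance >= min_variance            # a flat-lined signal suggests detachment
```

In a deployed pipeline, a gate like this would also record why a window was rejected, since those reasons feed back into sensor and attachment improvements.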
If we have good quality data, and if we have ways to infer when the data is of good quality and when it is not, so that we don't make decisions when the data is not of usable quality, then the next challenge is to make inferences about events, so that we can distinguish the events of interest from other events. For example, if we're using the arm motion to detect smoking behaviors, the same arm motion could be involved in eating behaviors, or while yawning, and so on. There are a variety of other things people do on a daily basis, and being able to automatically and reliably infer what we are interested in from other closely related, similar-looking confounding activities is a significant challenge. The next challenge: let's say we want to detect eating from arm gestures. There is so much variability between situations. Sometimes we eat with a fork, other times with our hands, sometimes with both hands. There are a variety of ways we engage in the same behavior at different times and in different situations: sometimes we eat while standing, sometimes seated, and sometimes while driving too. Then there is wide variability between persons; how different people engage in the same behavior differs. So the model developed should ideally work on anyone without having to retrain the model and without expecting a user to go through a scripted training session. Next I'll talk about an example, which is detection of smoking from wearable sensors. Smoking is important because, as I said, it's the largest cause of death in the U.S., and therefore a lot of effort is spent in finding solutions to help people quit smoking, but it has been extremely hard. Quitting smoking usually sees less than a ten percent success rate, and right in the first week, more than 50 percent of quitters lapse. The main issue with developing the right interventions for smoking is that we don't even have a way to determine when a smoking lapse occurs. Here is an example. To find a predictor for smoking lapse, let's say we conduct a study where people who are interested in quitting go through the entire experience of enrolling in the study, receiving an intervention, picking a quit date, and then quitting. They're supposed to remain abstinent, but at some point they lapse. When they lapse, that's the important time, and if we can find what happened just prior to that lapse, we'll know what the potent precipitants and antecedents are. But today, we mostly depend on self-reports. So even if we could collect sensor data that could measure risk factors such as, say, exposure to tobacco outlets or exposure to bars or alcohol, and so on, if we still depend on participants self-reporting when they lapse, then we have significant ambiguity as to when the lapse occurred. Suppose this is a participant who is being monitored. Did the lapse occur when they were at the gas station? Did they just fill up the gas, or did they also purchase a cigarette? Or when they were at the bar, did they just have alcohol, or did they see someone else smoking, or did they see some potent cues? It's hard to tell if we depend on self-report, and that's temporally inaccurate. Therefore, we need to develop a method to detect lapse from sensors. If we can do so, then we can localize exactly when the lapse occurred, and then we can find the sensor-based predictors.
So therefore, we set out to develop a method to detect a smoking lapse from wearable sensors. There are two sensors that we used in this process. One is the respiration sensor at the chest level that captures the breathing pattern. The other is the inertial sensor on the arm that captures the arm movement. And as you can see, there are some confounding activities, like eating, that also involve changes to the breathing pattern as well as a similar motion of the arm. But there is evidence that smoking induces a different pattern, associated with the deep inhalation and exhalation while taking a puff, and that deep inhalation/exhalation commences just after the hand is taken to the mouth and just as the hand starts to come back from the mouth. So that was the pattern we set out to detect. There are several challenges in developing a detector that will work in real life on real participants. First, each smoking puff is only three to four seconds long, but in ten hours of sensor readings there are 36,000 seconds. So basically, there are roughly 67 positive instances among about 10,000 candidates in the entire day, and for it to be clinically useful, we need a very high recall rate and a low false alarm rate. Then there are other issues with how the person is wearing the sensors. For example, for arm movement, if they do what we usually do with watches and wear it on the non-dominant hand, and the smoking occurs with the dominant hand, we may have missed it entirely. We could give them two bands to wear on both hands, because we don't know which hand they might use during smoking, but then which one is left and which one is right? We could mark them, but they might switch them sometimes. We ask them to wear it in a certain position, but sometimes they wear it differently, and sometimes it can slip if it is not tight. So there are a variety of variability issues in the wearing of the sensors themselves. There is attachment degradation as the day goes by, and there is data loss as well. And again, the markings of a puff occur only in a 3- to 4-second time window, and if there is some data loss during that window, even of one or two packets, it could make detection challenging. Then, as I mentioned, there are numerous confounders. And finally, as I will show, in regular smoking people take about 15 puffs, but at the first lapse they could take very few puffs, two, three, or four, so we may not have very many instances in the vicinity to go by to improve the reliability. And then, as I discussed, there is wide variability. Sometimes people are walking and smoking, sometimes they are talking and smoking and moving their hand everywhere, sometimes they are seated and smoking, sometimes they are standing and smoking, and the gestures differ. So there are significant variabilities, and we have no control over how somebody engages in the behavior. All of these challenges exist, but still, if we want to make a difference, we need to develop a method that is highly reliable. So I'll briefly describe the approach that we adopted. First, in this continuous time series of data, we try to extract the short windows of interest that could potentially represent a puff.
And for that, we used the gyroscope to monitor the movement, to tell when the hand is coming to the mouth and going back; to check whether it is coming to the mouth and coming back, we use the accelerometer. Then, as I mentioned, there is wide variability, so we need an adaptive method for determining the timing of the start of the movement and the end of that movement. For that, we use the moving average convergence/divergence (MACD) method, which is able to adapt to the person and to the situation. With that approach, the entire time series of data gets reduced to some number of candidate windows, but that's still too many. So we then adopt a few techniques to reduce the number of candidates to a manageable level. First, is the duration of the segment appropriate for a smoking puff? If the hand is up there for too long, it may not be a puff; if it's too short, it may have just been scratching or touching the hair. Then, eating might involve a different orientation of the wrist than smoking, so we look at whether the hand orientation is appropriate or not; for that, we use the pitch and roll. With that, we can exclude a lot of the non-candidates. After that, we train an SVM model. For that, we use 17 respiration features and 12 hand gesture features, and we train the classifier on training data collected while each puff was marked by an observer on the mobile phone as a volunteer was smoking. After that, we have the markings of puffs. We then assume that nobody takes a single isolated puff, so we remove any isolated puff that does not have other puffs in its close vicinity. Then we conduct a tradeoff analysis to see what minimum number of puffs should constitute a smoking episode. We did this analysis on the real-life data that was collected for smoking cessation, which I'll show a little bit later. What we found was that if we use at least two as the minimum number of puffs in a smoking episode, then we have 100 percent recall. What this shows is that in the first lapse, the first time people lapse after a quit attempt, they could take as few as two puffs. But with two puffs as the minimum, the false alarm rate is still a little high, about 1.6 per day. We get much better results if we set the minimum number of puffs to four; with that, we get about one false alarm every six days, and that's pretty acceptable performance. So we trained this model on six smokers, where each puff was manually marked on the mobile phone, and that's what was used to train the model. After that, we applied it to an independent data set from a real-life study of smoking cessation. There were 61 participants who quit smoking under our observation. They wore the sensors one day before quitting and three days after quitting. Of them, 33 lapsed and 28 were able to abstain in the three days that we observed them. Each day, these participants reported to the lab, where they were tested for abstinence or lapse with a carbon monoxide monitor: they were asked to blow into the CO monitor, and depending on the results, they were classified as having lapsed or not.
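To make the detection pipeline described above more concrete, here is a minimal Python sketch of its stages: MACD-style adaptive segmentation of wrist motion, simple duration and orientation rule-outs, a classifier over the surviving candidate segments, and grouping of detected puffs into episodes with a minimum puff count. The window lengths, thresholds, and helper names are illustrative assumptions, not the parameters of the actual MD2K detector.

```python
import numpy as np
from sklearn.svm import SVC

def ema(x: np.ndarray, n: int) -> np.ndarray:
    """Exponential moving average with an n-sample span."""
    alpha = 2.0 / (n + 1)
    out = np.empty(len(x))
    out[0] = x[0]
    for i in range(1, len(x)):
        out[i] = alpha * x[i] + (1 - alpha) * out[i - 1]
    return out

def macd_segments(gyro_mag: np.ndarray, fast: int = 10, slow: int = 100) -> list:
    """Adaptive segmentation: a candidate hand-to-mouth segment starts when the
    fast moving average of gyroscope magnitude crosses above the slow one and
    ends when it crosses back below. The spans (10, 100 samples) are placeholders."""
    macd = ema(gyro_mag, fast) - ema(gyro_mag, slow)
    above = macd > 0
    starts = np.flatnonzero(~above[:-1] & above[1:]) + 1
    ends = np.flatnonzero(above[:-1] & ~above[1:]) + 1
    if len(starts) == 0:
        return []
    ends = ends[ends > starts[0]]          # drop an end with no matching start
    return list(zip(starts, ends))

def plausible_puff(duration_s: float, pitch_deg: float, roll_deg: float) -> bool:
    """Rule-outs before classification: drop segments too short or too long to be
    a puff, or whose wrist orientation is unlike a hand-at-mouth posture.
    All thresholds here are made up for illustration."""
    return 1.0 <= duration_s <= 5.0 and -60.0 <= pitch_deg <= 30.0 and abs(roll_deg) <= 90.0

def group_into_episodes(puff_times_s, max_gap_s: float = 60.0, min_puffs: int = 4):
    """Cluster detected puffs in time and keep only clusters with at least
    `min_puffs` puffs; the talk reports about one false alarm per six days
    when the minimum is set to four."""
    episodes, current = [], []
    for t in sorted(puff_times_s):
        if current and t - current[-1] > max_gap_s:
            if len(current) >= min_puffs:
                episodes.append(current)
            current = []
        current.append(t)
    if len(current) >= min_puffs:
        episodes.append(current)
    return episodes

# The classifier itself: an SVM over 17 respiration + 12 hand-gesture features
# per candidate segment; in the study it was trained on observer-marked puffs
# from lab sessions (training not shown here).
puff_classifier = SVC(kernel="rbf", probability=True)
```

Note that the cheap rule-outs run before the classifier, so the model only sees a manageable number of candidates per day.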
They were also asked to self-report when they lapsed. Some of them self-reported; others did not. And when they tested positive the next day, they were asked to recall what time they had lapsed the previous day. So out of the 33 lapsers, we had good quality data on 32, and we were able to detect the first lapse in 28 of them; on the 28 abstinent smokers, we had one false alarm every six days.

>>: Where does the intervention come in, in this case? The person already started smoking again, and then you come in and say, wait, you smoked?

>> Santosh Kumar: So this is an observational study which is being used just to see why people lapse.

>>: Okay.

>> Santosh Kumar: There is a regular, traditional intervention that's given to them, so every day when they come to the lab, they are given an intervention. But there is no action when they lapse.

>>: Okay.

>> Santosh Kumar: All of this is a study to move toward the development of an intervention. So we have some interesting findings. We see that usually people take about 150 puffs in a day when they're smoking regularly. On the lapse day, they're still struggling, and this marks their first failure in their cessation attempt, which is a big event for them, so the number of puffs they take is extremely low, about 7.7. But after that, as we know, in about 90 percent of the cases they go to full relapse, and as you can see, the number of puffs keeps increasing day over day. Yes?

>>: For the sensor that you were using -- what about using a sensor that will measure the chemicals that are released into the air when you smoke?

>> Santosh Kumar: So, you could follow that approach too. You could say, rather than asking people to blow into a CO meter when they come to the lab, why don't we have a CO monitor on them. But right now it requires blowing into it, so it's not a passive sensor. A passive sensor that can reliably detect when you're exposed to cigarette smoke, and not regular smoke or anything else, I don't know if that exists.

>>: I think there are some nicotine-based sensors, but there's also the basic problem of, if I detect that, am I smoking or am I in a smoky environment?

>>: But maybe when you connect it to the other sensors --

>>: Yes. You can improve the accuracy, right, yeah.

>> Santosh Kumar: Okay. So next -- remember, our goal was to have a method that is able to pinpoint, temporally precisely, when the lapse occurred. Thus far, the entire field depends on self-report to know when a lapse occurs in all the smoking cessation studies that are done, and they usually ask people to come back to the lab every day for CO verification, but that's day-level granularity. So the best granularity that exists thus far depends on self-report. What we see is that out of the 28 lapsers on whom we were able to detect when the lapse occurred, there were about nine who did not self-report but then recalled it when they came to the lab and tested positive. For those who did self-report, the inaccuracy of when they reported versus when the lapse occurred varied widely. Sometimes they reported before they actually lapsed, and in most cases they reported well after the lapse occurred. In the cases when they were asked to recall, the inaccuracy was much greater. So this is very promising, and it is very exciting for smoking researchers to be able to know precisely when a smoking lapse occurs.
So going forward, we have two new observational studies that will be conducted with about 600 smokers, and a similar protocol will be followed, but for a longer period. That means participants will be monitored for several days pre-quit and a couple of weeks post-quit. We are also moving forward to develop this entire framework for sensor-triggered just-in-time intervention. At this point, we have a method to infer stress from ECG and respiration, which means we have a continuous measure of stress, so we are also starting a smaller-scale study where, with a similar protocol of monitoring pre-quit and post-quit, people begin to receive sensor-triggered interventions based on the realtime measurement of stress from the sensors. They will receive the intervention sometimes when the stress is high, sometimes when the stress is medium, and sometimes when it is low, so that we can determine the right policy for delivering a just-in-time intervention. For this we are adopting the micro-randomized trial design that Susan Murphy initiated and is known for. What that enables us to do is to have the entire framework and research methodology to develop and evaluate sensor-triggered just-in-time interventions that are delivered on the mobile phone based on realtime sensor data. Each year, starting early next year, we'll have 75 new participants in these studies, and the predictors that we discover in the preceding year will be incorporated into improving the sensor-triggered just-in-time intervention that we deliver to the participants. With a couple of iterations, our hope is that we will have discovered both the methodology and the science for developing, delivering, and determining the right timing of sensor-triggered just-in-time interventions, as well as interventions that may be promising for supporting smoking cessation. Next I'll quickly talk about CHF management. CHF, as I mentioned, is the leading cause of rehospitalization, and the current approaches of daily weight and symptom monitoring haven't really been shown to have a statistically significant effect. As part of a Smart Health project, Emre is developing the EasySense sensor, which can measure both the motion of the heart and the lungs and changes in composition within the lungs. We have some early evidence of it being able to infer the motion of the heart as well as the composition; composition is basically measured by changes in the propagation delay as well as the degree of absorption. Here are some early results on a healthy subject changing posture from upright to supine, and as you can see, there is an observable effect both on the attenuation and on the delay. In terms of status, for the CHF management application we have a pilot study in progress at the Ohio State medical school with 20 patients who are admitted to the hospital due to a decompensated heart condition. We will use the hospital measurements of fluid intake and output and the hemodynamic markers of lung fluid to see how well EasySense is able to detect lung fluid congestion.
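The micro-randomized trial design mentioned above can be illustrated with a short Python sketch: at each decision point the phone randomizes whether to deliver a prompt, with a probability that may depend on the sensed stress level, and only when the person is available (for example, not driving). The probabilities and the availability rule here are illustrative assumptions, not the actual study protocol.

```python
import random

# Illustrative randomization probabilities per sensed stress level; a real
# micro-randomized trial fixes these (and the decision-point schedule) in its design.
TREATMENT_PROB = {"low": 0.2, "medium": 0.4, "high": 0.6}

def decision_point(stress_level: str, available: bool) -> dict:
    """Randomize intervention delivery at one decision point.

    `available` encodes context gating (e.g. not driving, not in a meeting);
    unavailable moments are never treated. Both the action and the probability
    used are logged, because those weights are needed later to estimate the
    proximal effect of delivering a prompt at each stress level.
    """
    if not available:
        return {"treated": False, "prob": 0.0, "stress": stress_level}
    p = TREATMENT_PROB[stress_level]
    treated = random.random() < p
    return {"treated": treated, "prob": p, "stress": stress_level}

# Example: one decision point while the wearer's inferred stress is high.
print(decision_point("high", available=True))
```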
Each year, we'll have 75 congestive heart failure patients who will go home with this EasySense sensor as well as a variety of other sensors to measure weight, balance, frailty, blood pressure, symptoms, respiratory effort, and activity, so that we can both obtain an index of the condition status and find predictors for prevention. Our goal here, again, is to develop just-in-time treatment for congestive heart failure patients. One potential prevention target that we hypothesize is fast food eating, which may worsen the condition, but once we have the study, then we'll know. There are several new markers under development. First, I'm listing those that can potentially be detected from smart watches, and as I said, at this point we are in the process of incorporating the Microsoft Band as our smart watch sensor. There's a newly funded project from NIH on oral health in which we are also collaborating with Procter & Gamble, who are bringing their smart toothbrushes; the goal is to detect brushing with manual toothbrushes using the smart watches, and when people use the electronic toothbrushes, that can automatically be detected by the P&G system. We're also looking at detection of eating, which will basically help in the prevention of congestive heart failure worsening, and at detection of driving, that is, whether the person is the passenger or the driver, by looking at the data collected by the smart watches. That will help in deciding when it's not the right time to deliver an intervention, so as to make intervention delivery context-sensitive. We are also developing ways to measure conversation from the respiration sensor, and, from the smart eyeglasses, to detect cues relevant to smoking cessation or congestive heart failure, which could be advertisements or [indiscernible] alcohol. And as I mentioned, for the congestive heart failure status index, we're using the EasySense sensor. So there are a variety of different activities. Once we have the markers, the next goal is time series pattern mining and discovery from the markers. To give an example, for smoking cessation the predictors could include exposure to tobacco outlets, exposure to bars or alcohol, stress, a heated or stressful conversation, or seeing a cigarette pack or somebody else smoking. Each of these is a potential target, but there are a variety of challenges in developing or finding the predictors. The events we're trying to predict are rare; a smoking lapse, the first lapse, occurs only once. There is significant variability as to what may lead to the lapse: what leads to a lapse for one person at one time may not be the same reason the next time, and it differs across people. There is limited recurrence of these adverse health events. And, just as when we visit the doctor we expect the doctor to treat us and make us healthy, not merely to claim that they were able to help 20 percent or 80 percent of their patients, every person matters. So here also, when we're talking about treatment delivery, every individual matters. This is a realtime prediction problem; the prediction must happen in realtime, without knowledge of the future. And if we indeed expect these predictors to be used in realtime adaptive interventions, then the method should be lightweight enough for mobile implementation.
It should be tolerant to data losses and data quality degradation. And ultimately, if we really want it to be adopted in healthcare or health research, it must be interpretable and clinically useful; otherwise, all of this will remain just fun for the data science researchers. Then, when we move from prediction to intervention, there are a variety of interesting challenges, both for the predictive analytics, the discovery of predictors, and for the development of the interventions. These are problems we're starting to work on after the development of several markers; these are the things we plan to take on as research problems as we move to year two of the MD2K Center. In the development of the intervention itself, the first question is what the right policy is: how do we fuse the data from a variety of data sources and the various predictors? What are the optimal times, so that we catch the person at the right moment and have the best chance that the person will engage with the intervention, not delivering it too early and not making it too burdensome? Then the content of the intervention itself should be personalized to the individual; it should be appealing or persuasive enough. And for these adaptive interventions, traditional methods of evaluation like the RCT cannot be applied directly, so new methods of evaluation need to be developed. Ultimately, if it's not clinically efficacious, then it's not going to work. So there are a variety of interesting challenges that we plan to take on. Here is just an example of the data visualization that is being used in one of the ongoing studies on the privacy, burden, and utility of the various mobile health sensors. What you see at the top is the time series of the various inferences, or the context, and then there are some pie charts that show the interaction between the various markers that people get to see, which helps them reflect upon themselves. All of this model development occurs in the computational environment. What we are building right now in terms of our software platform has a back end and a mobile version. At this point, the initial target users are data science researchers, who will use the back-end infrastructure to develop and test their models, whether for sensors to markers, markers to predictors, or predictors to intervention design, and health researchers, who are going to use our tools first to conduct a study and collect data, and then to analyze it and do publishable analytics to develop or evaluate the efficacy of interventions. This is just a quick overview of the architecture we have on the mobile phone. It facilitates the data collection, then realtime processing for data quality assessment; if the data quality is good, then extraction of various features, then from features to inferences, and then from inferences to interventions. That is connected to the user interface to collect self-reports as well, and it can collect data both from internal sensors and external sensors, and ultimately it can connect with the cloud as well.
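As a rough illustration of that on-phone chain (collection, quality check, features, inference, intervention trigger), here is a minimal sketch of the stages composed as plain Python functions. The stage names and the stress example are illustrative assumptions, not the actual MD2K software architecture or its APIs.

```python
from typing import Callable, Optional

# Each stage is a plain function; a real system would run these on a stream of
# timestamped windows arriving from internal and external (wearable) sensors.

def collect_window(raw_stream) -> dict:
    """Pull one window of synchronized sensor data (e.g. ECG, respiration, wrist motion)."""
    return next(raw_stream)

def quality_ok(window: dict) -> bool:
    """Data quality gate: skip inference entirely on unusable windows."""
    return window.get("completeness", 0.0) >= 0.9 and window.get("attached", False)

def extract_features(window: dict) -> dict:
    """Compute per-window features (placeholder: pass through precomputed values)."""
    return window.get("features", {})

def infer_marker(features: dict) -> Optional[str]:
    """Map features to a marker, e.g. a coarse stress level (placeholder logic)."""
    score = features.get("stress_score")
    if score is None:
        return None
    return "high" if score > 0.7 else "medium" if score > 0.4 else "low"

def maybe_intervene(marker: Optional[str], deliver: Callable[[str], None]) -> None:
    """Trigger a just-in-time prompt only for selected marker values."""
    if marker == "high":
        deliver("Take a two-minute breathing break?")

def run_once(raw_stream, deliver: Callable[[str], None]) -> None:
    """One pass through the chain: collect -> quality -> features -> inference -> intervention."""
    window = collect_window(raw_stream)
    if not quality_ok(window):
        return                      # never act on bad data
    marker = infer_marker(extract_features(window))
    maybe_intervene(marker, deliver)
```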
Going forward, it should also have the capability to interact and coordinate with the smart watches, because part of the intervention could be initiated on the smart watch. People are more likely, or at least more frequent, to look at their smart watch than their mobile phone, so they could get a cue to engage with the intervention on the smart watch, and if they become interested, they could be pulled into the smartphone. So that also needs to be incorporated into the mobile phone software. Then the back-end software architecture should have the capability to distribute the processing so as to enable large-scale data analysis as well as visualization, and it should have the ability to export the data in appropriate formats and enable people to visualize and explore the data to facilitate knowledge discovery. There are several challenges as they relate to the software development, on both the back end and the mobile phone. On the back end, the efficiency of computation, scalability with the volume of data, and generalizability to various disease conditions are important. When it comes to the mobile phone, the latency of computation is important because, remember, the timing of delivering an intervention is extremely critical to catch people at the right moment, when they are most likely to benefit from and engage with the intervention. Then, if the phone or the wearable sensor doesn't last the whole day, again, it will be unusable. It's also important for people to feel that their privacy is being protected while they use these devices for monitoring and improving their health, especially when it comes to sharing data to receive feedback or interventions. And provenance is important too, because that's what will add the right transparency: which sensor provides what granularity of data, which algorithms provide what sensitivity and specificity, and ultimately, when an intervention is delivered, what confidence the system has in the inferences it is making. Those are all extremely important to propagate through the entire chain so as to add transparency, and predictability too. And then, for interoperability across a variety of different sensor data sources, we are opting for an Open mHealth approach to the APIs. So, to conclude: in MD2K we are adopting this approach of early detection, prediction and prevention, and adaptation of the intervention in realtime on the mobile phone, and we are applying it to smoking cessation and congestive heart failure to begin with, but it is now being expanded to oral health and a variety of other conditions. As I mentioned, in terms of software and research, we expect to contribute on both data science research and knowledge discovery, which should result in a huge body of research that the community can build upon. All the software we're developing will be released open source, and we invite the community to contribute as well as to use or build upon it. For any markers that we discover, we will again release the software associated with them, and hopefully data as well, so that people can do apples-to-apples comparisons, which is not usually feasible these days.
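To make the provenance point concrete, here is an illustrative sketch of how a single marker value might be carried with provenance metadata through the chain, loosely in the spirit of Open mHealth's self-describing data points but not its actual schema; every field name and value below is an assumption for exposition only.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class MarkerDataPoint:
    """One inferred marker value plus the provenance needed for transparency:
    which sensor produced the raw data, which algorithm (and version) made the
    inference, and how confident that inference was."""
    marker_name: str                 # e.g. "stress_level"
    value: str                       # e.g. "high"
    timestamp: datetime
    source_sensor: str               # e.g. "chest_respiration_band"
    sampling_rate_hz: float
    algorithm: str                   # e.g. "stress_classifier"
    algorithm_version: str
    confidence: float                # classifier confidence in [0, 1]
    quality_flags: list = field(default_factory=list)   # e.g. ["minor_packet_loss"]

# Example data point as it might be exported to the back end for analysis.
point = MarkerDataPoint(
    marker_name="stress_level",
    value="high",
    timestamp=datetime.now(timezone.utc),
    source_sensor="chest_respiration_band",
    sampling_rate_hz=25.0,
    algorithm="stress_classifier",
    algorithm_version="0.1-sketch",
    confidence=0.82,
    quality_flags=[],
)
```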
And if we can do that, then it will become much easier for people to build upon, compare, or validate markers independently, and the same goes for the predictors too. We also hope to develop interventions that will indeed help realize the vision of precision medicine, and with our training efforts we expect to engage and facilitate the formation of, and communication among, this mobile health community. So with that, I'll conclude. And here's a link to our website if anyone is interested in knowing more about it. Thank you.

[Applause]

>> Jie Liu: Questions?

>>: [Indiscernible] talked about the detection of eating being a new marker. Are you also looking at trying to detect the content they're eating, or just the activity?

>> Santosh Kumar: Not yet, but as I said, it's not just enough to know when they're eating; the salt intake in the eating matters. Sodium intake is the one that's certainly believed by our clinical and clinician specialists to be an important predictor, so yes, that is one of our interests, and we would hope to undertake that problem if it hasn't been solved by anybody else by the time we get to it.

>>: Okay. Thanks.

>>: So the [indiscernible] EasySense, right?

>> Santosh Kumar: Yes.

>>: What's the principle behind it? Is it RF?

>>: It's essentially a micro radar platform that sends [indiscernible] and records the backscatter. So it's a non-contact sensor; it doesn't need to touch you. It can be used over a shirt and things like that. The idea is that by seeing the internal motion of the heart, you can detect everything related to pulse, and by seeing the movement of the lungs, you get the respiration measure without relying on a band. Moreover, since you know where things are in space, because it has enough resolution to resolve things in space, you can see the effect of water on that displacement. When you have water, essentially the speed of light slows down, so you get a little shift. In addition, you have absorption, so everything in amplitude [indiscernible]. So those combinations, you know, can potentially help you assess lung water.

>>: And how does -- I think you use that to sense eating?

>>: That is a separate -- the eating sensor is based on --

>> Santosh Kumar: It's motion.

>>: -- your gestures, right. This sensor is for respiration, heart motion, as well as this lung [indiscernible].

>>: This is great stuff. Really high-impact problems. As I look at the technology solutions, and as we work in this space, there is a close to infinite number of technology solutions we could plot for each of these. Oftentimes they're bounded by the eventual use cases and the complexities of deployment and adherence and all the other things that come further down the pipe. Can you say a little bit about how you guys roll that into the technology selection, technology decisions, and study designs?

>> Santosh Kumar: Sure.

>>: Because ultimately I suspect that the driving factor behind all of this is deployment of these technologies into much more complex ecosystems.

>> Santosh Kumar: Yes, I think that's a very good question, and it can come from someone who has struggled with this. So that has been front and center. I'll just tell a little story. As I said, AutoSense was developed in the Genes, Environment and Health initiative program, way back in 2007.
So we kept working on it, because our goal was not just to be done when we wrote a paper about it, but for this device to work in the natural field environment so that the data we get out of it is usable by us as well as by health researchers. That was a much taller goal. It took us four or five years. Emre, how many versions did you make? More than ten versions that he actually made, with an improvement every time we deployed it: we saw what the issues were, fixed them, deployed it, then fixed it again. To the extent that it became usable enough that it was used by over 100 participants, including illicit drug users, who wore it for four weeks in the natural field environment. So our goal has been to develop the sensors such that they are usable by the general population, or a specialized population, in their natural field environment. That means they should feel comfortable enough wearing them while going about their daily life, and we should get reliable data out of them. Unless these two goals were met, we did not declare victory for any particular technology that we picked. Our criterion was that the technology should be maintenance free. When we started the GEI program, we had an interstitial fluid-based sensor that harvests interstitial fluid from underneath the dead layer of your skin, and it was integrating that for alcohol assessment. We realized that it is not maintenance free: you need to create micro pores in your skin and create a vacuum to suck out the interstitial fluid, and if you take it off, then you have to create the pores again, so it was not usable. Next he integrated a transdermal alcohol sensor, which at least didn't require creating micro pores, but it requires hydration, and sometimes if you are not hydrated, it doesn't work properly. So any sensor that is not maintenance free, we did not take forward. Those that can be used, that we can send to our collaborators, health researchers, who can easily learn how to use them and then train the participants to take them off and put them on themselves, and still we get reliable data, those technologies we proceeded forward with. So that's what we did with respect to the sensor and the data collection itself. When it came to inference, as I mentioned, we developed a model for smoking detection that was based just on respiration. We had 87 percent accuracy. Was it good? Bad? Somebody could say, yes, 87 percent, that's pretty good. But then you start analyzing it: in a day, there are about 10,000 respiration cycles, so that means roughly 1,300 respiration cycles you're falsely declaring as smoking. And here, a smoking lapse is the first event that people want to detect, so if you are detecting every event as a smoking event, then it is useless. So when it came to inference too, our goal has been to get it to a level where it is clinically usable in the field. I'll give you a third example, that of cocaine use. When we used this technology in the field, we saw that if you look at the ECG response to cocaine use, it is pretty pronounced, so it should be easy to detect. But again, there were many issues with that because, I mean, how do you get the training data? You could round up some people and give them cocaine, but you can only give them tiny dosages, right? You can't have training data on those who are going to wear it in the field; they could smoke, they could ingest.
In the lab you could only really do so much. In the field, they walk; right after they take cocaine, the response to walking is very similar to that of cocaine use, so there are several issues. It took us many years, but again, our goal was to have a model that is clinically usable, one with a very high recall rate and an extremely low false positive rate. Fortunately, after years of work -- Emre spent a great amount of time not just building the sensor but developing the model too -- we had a model that performed to the extent that it had 100 percent recall, at least when we had good data. The effect was that drug use researchers now would like to use it, and now we have to negotiate with NIDA; they want to use it in their clinical trials network. So the approach has always been to take any technology we take up to the finish line, where it becomes clinically usable in the health community.

>>: Just following up on that question and the points you mentioned, are clinical trials the channel out for you? Is that the way you're looking to expose this to the world?

>> Santosh Kumar: That's a very good question. How do we plan to expose this? I think there are several answers to it; I'll give the short answer. How does the general community begin to use it? First, there is the sensing and data collection. If anything we are developing works on general, regular mobile phones, great, then anybody can use it. All we need to do is release the software and that's it, right? Anybody should be able to use it, so it needs the software and the training materials associated with it, and that would suffice. But then many of these involve wearable sensors, some of which we have developed, some of which are coming. If we have developed a sensor, then there should be similar commercially available sensors that people can buy, so that there is some way for the community to acquire the data, and then software that can work with that sensor; only then can it scale widely. Now, how do we get it out to people? Yes, we certainly work with some researchers directly, but that's not going to scale either. Even for a clinical trial, even if somebody wants to use it for a clinical trial, us producing all the sensors and supporting them is not a scalable way to go. So the model that we think would work best is for our software to work on commercial sensors: if we see that commercial sensors improve to the level at which our sensors have been able to give us the data, then our software should be able to provide similar accuracy on that data. Our goal is only to demonstrate the usability and utility by working with a few groups, but after that, it has to go into an autopilot mode for it to be adopted widely. We will take it to the level where we believe it has legs of its own.

>>: So it might be important to co-develop the software with a commercial device while you develop it, to make it more valuable, right?

>> Santosh Kumar: Precisely. That's why we're excited about this Microsoft Band collaboration, because anybody can buy a Microsoft Band. If we can have input in informing the design of the band itself, so that it is usable for all these health purposes, and the software we develop can work with a band that's commercially available, then that becomes scalable. Many of the health researchers look for specific capabilities.
And many times that's not easily available in a commercial platform, and that's why they go looking for computer scientists or electrical engineers who have those devices. But if this kind of collaboration can work out, then I think that will be the right way for its accelerated adoption by the wider community.

>> Jie Liu: Any other questions? All right. Let's thank our speakers and visitors again. Thank you very much.

[Applause]