>> Mary Czerwinski: Hi all, thanks for coming. It's my pleasure to introduce John Canny today from Berkeley. He is a professor of computer science and his research is near and dear to our hearts because he marries HCI with machine learning, with signal processing, and even big data. He is focusing these days on health and education primarily, and he will be talking to us about those topics today. And some news about Berkeley: John is participating in sponsoring the creation of the Jacobs Design Institute, which is supposed to open next year?

>> John Canny: 2014, I did say next year; 2015 I should have said.

>> Mary Czerwinski: 2015, that's kind of cool that it's happening soon, so welcome Jon.

>> John Canny: Thanks Mary, and thanks for the chance to talk to folks. So today I was going to talk about some work that we have been doing for a few years on the opportunities of sensing without sensors, or at least without things that we normally recognize as sensors. And most of the work has been directed at sensing some sort of affect. Most of our work has actually been on stress. We did a little bit of work on pain and some work that had a focus on depression and mental illness. The signals that we end up being able to detect, though, are quite similar, and we are currently starting to move from the sensing and detection into interventions based on what we find in the sensor data.

So to quickly review stress: stress is a very important and very global regulatory mechanism in human beings that helps us function at enhanced performance when we need to. That's how we pull all-nighters and make conference deadlines, but it has negative effects obviously as well if we leave it on too long, because basically our stress mechanism is a tradeoff between fight-or-flight ability, our ability to respond accurately and sort of exceed our average potential briefly, and all of the important regenerative and repair functions that we need to do as well. So it's a complex system, but roughly there are fast and slow components of the stress response. The fast one is the adrenaline or epinephrine response, which is when we get excited: our pupils dilate, we are ready to spring into action, and it happens very fast for obvious reasons. If you are going to fight or flee you probably should do it soon. And then there is a slower response, which has to do largely with shutting down or sort of toning down the rest-and-digest functions. This mechanism is perhaps the one we should be most concerned about, because when it kicks in over long periods of time, that's when we start to have trouble.

Okay, so two different physiological responses, and they tend to have two different types of signals that you can observe in the body in response to them. And I will just clarify: in both cases I am talking about parts of the sympathetic nervous system. So both of these parts are enhancing fight-or-flight and reducing rest-and-digest. There is a complementary system that helps with your regenerative functions, but we are only talking about the negative side, or the stress side I guess, on this slide. All right. So we have these fast chemical mechanisms, and the signs you see in the body are: heart rate increases with the fast stress response, breathing speeds up, pupils dilate, we improve memory retention and some types of recognition, and our emotions are typically intensified, amplified.
Then for the slow mechanism, interestingly, heart rate variability, the sort of variance in heart rate, changes with the slow stress response. We get galvanic skin response changes, little pulses effectively, changes in skin resistance with perspiration; breathing often becomes more erratic; and we have muscle tension. So when we say we feel tense because of stress, that's not just a kind of illusion: we really are tense. Many muscles in the body become tenser in response to stress.

So heart rate variability: although we are not primarily trying to measure it, this is the gold standard that we use in the [indiscernible] work. So just to quickly review what it is: when we talk about heart rate variability we are typically talking about different measures of variation of the inter-beat interval, and this is a very complex topic. There are lots of different measures. The simplest one to understand is just the variance or standard deviation of that actual number, from a suitably measured peak in the heart waveform. The others are quite a bit more difficult to explain; the low frequency/high frequency ratio requires us to Fourier transform the interval signal. The first one is easier to measure, but it's typically harder to provide physiological justification for. Biological arguments have been made that the more complicated measure directly measures the ratio of the sympathetic or stress response to the parasympathetic or rest-and-digest response. So I am bringing it up, and we will say more about it later; it is a measurement we made as part of our study of [indiscernible] sensing.

The hardware we used came from another project we call the Berkeley Tricorder. That was a custom sensor that we started building about 8 years ago, and you can probably see some dates on here. This is a 2008 version; it recorded all of these vital signs. So in particular we had a heart waveform signal, and it would both store the data on a flash card and also stream it out over Bluetooth, and it recorded a lot of other measures that we didn't use for this study. Having done the work on the Tricorder, by the way, we became quite enamored of the idea of not using hardware sensing. Hardware sensing is obviously hard to do, it's challenging, and there are lots of adoption challenges even with fairly simple technologies like wrist watches and so on. So we really started to focus on measurements that we could make implicitly from people using existing electronic devices in the way that they would normally use them. So we did some work that I will describe on monitoring voice on a cell phone, typically during phone calls, and we also have a sensor in our lab, which I won't talk about, but we have the ability to also do location analysis and affect analysis on voice signals of people in the lab at Berkeley. The most interesting stuff, which is current work, is looking at motion sensing from the mouse, from people using a mouse on a desktop, and potentially using a trackpad on a laptop, which we haven't done yet. Another similar kind of signal, which we do have measurements of and which we are analyzing right now, is the accelerometer on the phone. The phone signal is quite rich, and it includes the steady use of the phone. We are interested in both the passive signal of somebody just holding the phone in front of them and looking at something, and the tapping signals from interacting with the screen widgets, dragging and so on. All of these produce a signal that can be analyzed in a similar way to the mouse signal.
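The two HRV measures described above can be made concrete with a short sketch. This is illustrative only, not the Tricorder's code: it assumes an array of R-R intervals in milliseconds, resamples at 4 Hz (a common convention), and uses the standard LF (0.04-0.15 Hz) and HF (0.15-0.4 Hz) bands:

```python
import numpy as np

def hrv_measures(rr_ms):
    """rr_ms: inter-beat (R-R) intervals in milliseconds."""
    rr = np.asarray(rr_ms, dtype=float)
    sdnn = rr.std(ddof=1)              # simple time-domain variability measure

    # The LF/HF ratio needs a Fourier transform, so first resample the
    # irregularly spaced interval series onto an even 4 Hz grid.
    t = np.cumsum(rr) / 1000.0         # beat times in seconds
    fs = 4.0
    grid = np.arange(t[0], t[-1], 1.0 / fs)
    rr_even = np.interp(grid, t, rr)
    rr_even -= rr_even.mean()

    # Band power in the low- and high-frequency bands of the spectrum.
    power = np.abs(np.fft.rfft(rr_even)) ** 2
    freqs = np.fft.rfftfreq(len(rr_even), d=1.0 / fs)
    lf = power[(freqs >= 0.04) & (freqs < 0.15)].sum()
    hf = power[(freqs >= 0.15) & (freqs < 0.40)].sum()
    return sdnn, lf / hf
```

In practice the interval series would first need cleaning of motion artifacts, which, as discussed later in the talk, is the hard part.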
Finally, there are a lot of opportunities that we are not pursuing, but a number of other people are. Actually one of my former students has recently been looking at camera sensing on cell phones for pulse measurements, for stress also. There is a lot of opportunity there, but we are not currently following that.

Okay, so speech encodes stress in an interesting way. I am not an expert, but as I understand it, part of our response to stress is for the vocal cords to pull away and basically open up the breathing passages so we can suck down more air rapidly. So those stress-related muscles actually pull on the vocal cords and change the shape of the glottal pulses when we speak. This particular signal happens to be very accurate in predicting stress. So we have got about 95 percent accuracy, and what we were measuring was a controlled experiment where we deliberately stressed the subjects and also had them in an unstressed state. We then looked at the accuracy of the signal in predicting which state they were in. So it would be nice, it's desirable, to do analysis of that signal on the phone, but it turns out to be computationally challenging because there is a fairly complex inverse problem to solve. We are interested in the glottal signal, which is the vocal cords basically flickering, I suppose; that's the signal that produces our speech, but before we hear it, it goes through the vocal tract, and basically a complex filter is applied by the way we shape and make phonetic sounds. So what comes out is a lot more complex than the little, more or less isolated pulses that are going in. So the task, to do this analysis on a cell phone, is to model and invert this transform. And notice that we don't know what the filter function is, because it's varying as the shape of the vocal tract varies. So we have to both estimate it and try to undo it. Yeah?

>>: So when you say 90 percent accuracy at predicting stress, you mean short-term stress, right?

>> John Canny: Yeah, I mean stress that we were able to induce with math problems, yeah, short-term stress. Well, okay, to be a little bit clearer, there are both probable adrenaline and probable cortisol responses in the stress response that we measure, so it's not purely short-term, and in fact a lot of the experiments we have done have had more than a 5 minute lag between the stressor and the measurements. So they are more likely to be from the slower response. Well, this is more detail than I need, but there are a lot of features that contribute to stress analysis. The one that works really well, though, is the one that's not on this page, which is the glottal features; those are the ones that require de-convolving. So to remind you, those tend to be worth more than all of the other ones on the screen combined. So we basically did some work on accelerating that. We have an architecture for doing general speech analysis with all of those features, but the features on that list were fairly easy to implement because we used an existing toolkit called OpenSmile. Yeah?

>>: So how do the classification features that you are looking at differ from [indiscernible] recognition?

>> John Canny: Well, some of them are essentially, well, it's a broader set. I mean, speech recognition tends to be built on HMM models on top of MFCCs; MFCCs are more or less the standard features at the front end of a speech recognizer. The other ones tend to be, you know --. Pitch per se is considered more of an affect signal than a speech recognition signal.
A lot of these things are the features that speech recognition people try to avoid, because they are highly variable from male to female speakers, children and so on. So speech recognition people try to, in a sense, factor out those effects, but we are very interested in them from an affective point of view. But for pure speech these are basically sort of filter banks, nonlinear filter banks, and then they go through HMMs to basically track the little trajectories of the phoneticization of speech. When we do recognition we are actually using these more as static features. So again we are getting more of this kind of general average shape of the vocal tract, which is probably a better cue to affect than dynamics, but there are some dynamic features that we --. So my student did a course project, and one of the things that he was trying to measure, using the speech features and a simpler HMM, was speech rate, the number of phones per minute, and also the durations of pauses, because diminution of pauses is a sign of stress as well. So yeah, there is a little bit of stuff that you can get from high level phonetic analysis of speech, but we didn't do much of that, and we didn't do any analysis of the content of recognized speech either, which could have also been a good signal. Yeah?

>>: [inaudible].

>> John Canny: Yeah, so 90 percent: I am saying the experiment was counterbalanced, where people would either start off being sort of pressured into a stressed state and then relaxed, or they would go the other way around. So we were comparing the two signals for each speak --. No, I am sorry; actually the 90 percent is a speaker-independent measure. So it's actually a very strong measure. I need to keep the experiments straight. So that was a remarkable thing about our work and the work before, which is that the stress measurements are actually speaker independent. So they are surprisingly strong. The work I am going to talk about later is trained per subject. But the vocal speech features are robust, and you can get those accuracies in a speaker-independent way, surprisingly.

>>: [inaudible].

>> John Canny: Yes, so the point is that almost all the recognition, or most of the accuracy, comes from the glottal features, which are much more speaker independent. The changes in them are more speaker independent. So those are good questions. So here is the framework that we built onto a phone platform. Again, most of the features came from the OpenSmile toolkit, which is an open source speech processing toolkit, and our classification was just simple: it was linear SVMs built on those features. The toolkit, I think we have the accuracy figures, let's see, where are we? So these figures, by the way, are for recognizing a discrete set of emotions rather than stress. The numbers are lower, but they are also multiway classifications, so you would expect them to be lower. The numbers for the stress figures down here are with and without those glottal features. It's a bit misleading because you see only a 1 point increase. The point, though, is that this is a very large increase at that level of accuracy. And also, when you look at this recognizer, it's using mostly those features. So they do contribute a lot to recognition. But it's still worth pointing out that the stress features are actually remarkably strong compared to, say, recognition of emotion.
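The deconvolution described above, estimating the vocal-tract filter and inverse-filtering the speech to recover the glottal pulses, is classically done with LPC. Here is a rough sketch of that idea, not the actual phone implementation; the choice of `librosa` for the LPC fit is an assumption:

```python
import numpy as np
from scipy.signal import lfilter
import librosa   # one possible library choice for the LPC fit

def glottal_residual(frame, order=18):
    """frame: a short windowed speech frame (e.g. 25 ms at 16 kHz)."""
    x = np.asarray(frame, dtype=float)
    # Estimate the vocal tract as an all-pole (LPC) filter over the frame...
    a = librosa.lpc(x, order=order)
    # ...then apply the inverse (FIR) filter; the residual approximates the
    # glottal excitation, i.e. the more or less isolated pulses going in.
    return lfilter(a, [1.0], x)
```

Since the vocal tract shape varies as we speak, the filter is re-estimated frame by frame, which is what makes the inverse problem computationally heavy on a phone.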
Recognizing stress is relatively easy, certainly for machines, and probably for people too, I suspect, if you really ask people. A lot of people I know claim that they can tell; certainly relatives claim they can tell when I am stressed. Okay, so that's what we did earlier on recognition of stress from voice. I should point out that there was related work by others at Georgia Tech on actually discriminating depressed and non-depressed patients from very similar signals. And they also obtained accuracy in the 90s for detecting depression. But they had a control group and a clinically depressed group, and the signals were very strong between those two. Yeah, question?

>>: Yeah, I certainly feel my stress level is kind of continuous, from a little stressed to a lot stressed. Do these signals operate in that same way?

>> John Canny: Yeah, so all of the measurements are going to be quantitative measures. So I am quoting accuracies in order to give a --. Well, actually these accuracy measures here are assuming some sort of threshold has been set. ROC area is a kind of measure that captures the accuracy as you vary the threshold. So ROC area is normally used as a quantitative measure of some kind of feature. It tells you that this should work over a range of different values. So it is a useful quantitative predictor.

Okay, so the most recent work we did was on motion sensing, because the voice work is interesting, but there are a number of issues with it. It requires somebody to speak at regular intervals if you want to track stress. It can potentially, well, it suffers from the usual challenges of speech, which is that people will often do it in a noisy environment. In fact, people will often deliberately avoid quiet environments like their office so they don't disturb people. On the other hand, things like computer keyboards and mice are tremendous potential tools for sensing, because people use them so much, and they are using them in a context where their potential stressors are actually hitting them at the same time. So we started looking at phone sensing and, excuse me, mouse sensing and phone sensing. So we have a system called MouStress which looks at the stress-induced increases in muscle tension. As I mentioned, there has been work documenting stress certainly in the neck and shoulders, but also in the arms, and people have specifically looked at stress during computer work. So there is a very large literature on this. These are just a couple of references that have told us the obvious: that yes, people do have muscle tension when they are stressed, and it's measurable. So we are on good ground for trying to do this. So then to the study: our goal was to see if we could measure stress by looking at fairly typical mouse movements such as moving, clicking, dragging and steering. Steering is maybe a less common one, but it does show up, and mechanically it's one of the best ones for making the measurement. It's not as common, but it would include things like tracing your way through nested pop-up menus. So we have not yet done a naturalistic study on real interfaces. We did a more, I suppose, theoretical study where we had controlled tasks with varying distances for people to move and varying size targets. So those had similar dimensions, similar geometries, for the clicking task and for a dragging task, where the idea was to move that target onto that one. And the steering tasks look like this. These are also tasks people have used often in other studies of mouse performance.
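To pin down the ROC area mentioned a moment ago: it summarizes the accuracy of a one-dimensional score over all possible thresholds. A tiny scikit-learn sketch on made-up numbers:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

y_true = np.array([1, 1, 0, 0, 1, 0])              # hypothetical: 1 = stressed
score = np.array([2.3, 1.9, 0.7, 1.1, 2.8, 0.4])   # any real-valued stress measure
print(roc_auc_score(y_true, score))                # 1.0 = perfect ranking, 0.5 = chance
```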
So our target sizes were in factors of two across a fairly large range; this is in pixels, a big fraction of this screen, perhaps a fraction of an inch to several inches for the movements, and there were 5 different values for distance and 4 different values for width. Again, more or less very similar to what people do in other types of mouse studies. So when we want to analyze the mechanics of mouse movement, we adopt a very simple but nevertheless fairly widely used mechanical model, which is a mass-spring-damper. In biomechanics it's fairly common to consider muscles more as adjustable springs rather than some kind of motor that's supplying a controlled force. It does seem like, when we activate muscles, we typically activate them in pairs with a set point, and there is some biological damping that causes the muscles not to oscillate when we do that. So it's a simple system. It's characterized by a second order differential equation, which means that basically it rings with a diminishing sine wave. So that's all we need. We want, though, to infer those models somehow from the data. And the data is going to look like this. These are some recordings in one dimension from those mouse movements. It's interesting: you can see people very visibly hunting and making several adjustments most of the time, but the dynamics of the models are encoded in the curvatures of these little segments here, and we use LPC, which is a standard signal processing technique, in order to infer second order models that match those trajectories. Yeah?

>>: Is the model [inaudible] for different types of [inaudible]?

>> John Canny: Well, because we are trying to model the arm itself, intuitively it should work with any sort of position sensor, velocity sensor or acceleration sensor, because it's really the mechanics of the arm rather than the --. But yeah, I see what you are saying: if it's a tracking device that doesn't involve movement of the arm, then yes, we would have to look at that. The hope would be that because the muscle tension seems to be a fairly global kind of effect it might also apply to fingers, but yeah, that would be worth checking, because it's not obvious that it would work. So from this we fit the second order model. LPC is a method that's commonly used to fit second order models to signals, and it's very efficient. You basically just need to extract a few local correlation coefficients from these signals, and that gives you a second order LPC model, which has a very simple relationship to the parameters of the spring-mass model. So these are parameters that you can get from the LPC model, and those translate directly into stiffness and effective mass of the spring-mass model. In other words, we can observe a trajectory, generate these second order LPC model coefficients, find some roots, and then derive mass-spring-damper parameters. None of this is expensive to do computationally, so it's easily done as a little background process on a PC. So we designed an experiment to try to quantify the accuracy of this, and it was a counterbalanced design. Our goal was to have repeated measures within each subject, because we didn't know how strong the signal would be.
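To make the pipeline just described concrete: fit a low-order LPC (autoregressive) model from local correlations of a 1-D trajectory, take the complex pole pair, and read off the ringing frequency and damping of the implied mass-spring-damper (m·x'' + c·x' + k·x = 0). This is a minimal sketch, not the MouStress code; as the talk notes later, the real fit is fourth order, keeping only the high-frequency poles:

```python
import numpy as np

def lpc_coeffs(x, order=2):
    """Fit an AR model x[n] ~ a1*x[n-1] + ... from local autocorrelations
    (a direct Yule-Walker solve, fine for tiny orders)."""
    x = np.asarray(x, float) - np.mean(x)
    r = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])

def spring_params(x, fs):
    """x: 1-D mouse trajectory sampled at fs Hz. Returns the damped natural
    frequency (Hz) and decay rate of the implied mass-spring-damper."""
    a1, a2 = lpc_coeffs(x, order=2)
    poles = np.roots([1.0, -a1, -a2])     # z-plane poles of the AR(2) model
    s = np.log(poles[0]) * fs             # map to continuous time via z = e^(s/fs)
    freq_hz = abs(s.imag) / (2 * np.pi)   # ringing frequency of the arm
    decay = -s.real                       # damping rate; stiffness/mass = |s|^2
    return freq_hz, decay
```

The stiffness-to-mass ratio falls out because for a complex pole pair of m·s² + c·s + k, the squared pole magnitude equals k/m.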
We wanted to have some measurement periods for each subject where they were in a stressed state and an unstressed state, but because it was a one-hour experiment it was rather compressed. People would enter the experiment and have a calming phase, to try to get everybody to a similar state, and then they had to do a challenging math task, which was stressful for most people. Then we did the first mouse measurement, where they were given these pointing and dragging tasks. Then there was this kind of calming exercise given to them for five minutes, and finally there was a second mouse measurement phase and a final exit calming phase required by the IRB. So that's the first condition for subjects, and half of the subjects were in the counterbalanced condition, where they were given the calming exercise first, then the mouse task, and then finally the stressor in the second phase. So one thing to note about this design is that it wasn't ideal in terms of giving us the best stress signal to measure with the mouse, because we actually separated the phases of the stressor itself from the mouse task. So the measurement and the stressor weren't actually concurrent. And that was a choice we made in order to make sure we had a very good, sort of robust, signal: we basically used a task where people were concentrating on a math problem, similar to what people have used before in other stress studies. We could have tried to give them a mouse task at the same time, but the concern was that it might have distracted them; maybe they wouldn't really have been stressed. We thought it was safer to produce real stress, which we could validate with self-report and HRV measurements, which we did, and then see what we could get from the mouse analysis later. So we expected a certain amount of decay, and in fact there was some decay of the stress signal during the mouse measurement phase, but it was nevertheless strong enough for us to get reasonable results. In fact it was considerably better than we had expected. So I think this is all just saying what I just said. We wanted to have validation that the stressors were really working, so subjects were given self-report questionnaires at the beginning and end of each phase; that's important to note. Sort of for practical reasons we had to give them the survey when they were transitioning from one behavior to another. Those were very reliably significant, and you can see also that there is a label here saying that the surveys were taken at the stress and calm stages, when actually those numbers represent the averages of the questionnaire responses at the beginning and the end. So we called the stress questionnaire response the value averaged between here and here, and it was supposed to be something in the middle here. Similarly the MStress signal was an average of the reading before and after the phase. So what we observed was that the difference in stress was higher during the active stress phase relative to the calming phase. The difference is about twice as big there as it was here, so roughly 2.1 versus 1.0. So in fact there was some decay over time once the stressor was removed. Heart rate variability was measured continuously; all of the subjects were hooked up to our Tricorder instrument, and we measured heart rate variability, and basically the results were all over the map. We worked hard to get them cleaner than this, but the reality of heart rate variability is that it's a very difficult signal to get consistent answers out of.
Every reference we have seen on this particular instrument has similar results. The best results were actually for the fast stress response: the basic heart rate difference was the stronger signal, not the variability at all. There was a huge difference in the two measurements. So we actually went over the data a number of times to check that, and it's actually a very strong signal; it's really there. Realistically, none of the other measures, I would say, are reliably reportable, although we have some significant results here. We have taken seven different measures, and if you really want to have a significance of .05 when you are taking multiple measures, you need to correct, and the thresholds would drop below that. You really should be working at .05/7. So these effects aren't really strong enough to be reported; you are having too many chances to succeed, I would argue. The other rather disturbing thing, I would say, is that these signals were marginally significant, maybe, is one way to say it. There were some others over here though, especially this one, and there is one up there which is actually as strong, and they are in the wrong direction. So by heart rate they seem to have been more stressed in the calm phase. And again, we went over this data a lot, and while there are some outliers, a lot of it is coming from motion artifacts which we weren't able to reliably eliminate; that is, we couldn't define criteria that said we can throw this data out and keep the other data. So the high order message is that we tried our best to get good HRV results for this data set, and we got one, which was actually the fast stress response. All of the others I would say were not credible. And the last thing I would say is that when people do heart rate analysis, all of the formal work does involve hand cleanup. It involves trying to eliminate the worst phases of non-signal and the worst outliers of these values. And in spite of doing that we still didn't get a reliable signal in heart rate variability. So there were several measures, but if you actually factor in the Bonferroni correction they were less strong than the MStress measures, and none of them I think were reliably reportable, and some of them were actually in the wrong direction, so not a good result.

So now to the measures that were derived from the dynamic measurements, the mass-spring model: here we got much more credible results. And just to remind you, these measurements were made in the aftermath periods, MStress or MCalm, which were actually the intervals right after the stressor or the calming influence. The absolute values of the signals are pretty small, but you will see on the graphs that the parameters themselves are clustered around those values, and when you see differences like that they are actually quite good. So we do get a reliable difference and a nice strong p-value, and they go in the right direction; in other words, people are more tense when they are stressed. And these are just some different parameters: these are the actual frequencies of the damped response, and these are the damping parameters. It was less clear whether the damping parameters should be smaller or larger when people are stressed, and our results were not very conclusive on that. And as a reality check, you might expect people to be a little bit faster when they are stressed, but we didn't get a robust signal endorsing that; in fact we got virtually no signal.
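Returning to the multiple-comparisons point above for a moment, the Bonferroni arithmetic is simply:

```python
alpha_family = 0.05
m = 7                                  # number of HRV measures tested
alpha_per_test = alpha_family / m
print(alpha_per_test)                  # ~0.0071: marginal p-values near .05 don't survive
```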
Okay, and what I just showed you was the aggregate across all of the pointing, dragging and steering tasks. It didn't matter if you broke them down; they are individually significant in all of the cases, as long as you use the frequency features. So yeah, it seems like there is a real signal there. We didn't get a signal from time, surprisingly. So let's look at the signals a little bit more. One really nice feature that we noticed of the stress signal, as a function of task, is that it had a visible sensitivity to the distance of the task. So just to recall, we had 20 tasks in all, 5 different distances and 4 different widths, in powers of 2, and they are shown along here. These are the 5 different distances here, and these are the widths of the targets along here. And you can see that there is a clear dependence on the distance and close to zero sensitivity to the width. So it's quite different from a Fitts's law kind of sensitivity. But you can also see a nice separation between stress and no stress, and you can also see that it would be important to model the sensitivity to task if you want to really distinguish these things: there is no separation if you don't control for task.

>>: [inaudible].

>> John Canny: Well, we could, because the Fitts's law index of difficul --. Well, you will see in a second that the time follows exactly the index of difficulty. But no, the index of difficulty is basically the distance divided by the width, so that's a sawtooth going like that. And basically it's only half right: it has the right sensitivity to distance, but completely the wrong sensitivity to width. So we used a different and simpler model, and that's what we built. We found similar results for dragging. The dragging curves look like this: again virtually no width sensitivity. Maybe a very slight slope; we actually didn't model that, we just modeled the distance sensitivity. And there is a good reason for that. Let's see, finally, yeah, so we turned this observation into a model where basically we fitted a parameter based on the log of distance. The other thing to observe here is that the steps are constant in size while the target distance is varying exponentially, so it argues that the right model is alpha times the log of the distance, similar to Fitts's law, but again without the width sensitivity. And from that it's very easy to derive what alpha should be, and in fact we use an alpha that's independent of the subjects. I am sorry, I was out of sync: the dragging task is here, and the other ones were actually the pointing tasks, but you can see it's the same kind of shape. The steering showed more of a Fitts's law type of sensitivity. This one did show sensitivity to the target width as well as the distance, again though with an apparently logarithmic dependence. So we did analyze the steering task using a slightly more complex model that had both width and distance parameters. And finally, time looks like this, and if you work it out, these values are just proportional to the index of difficulty from the numbers below. But here it's also kind of obvious that there is really very little or no separation in the time signals, which was a bit of a surprise. All right, so now we have a very simple model, but nevertheless a model which is very easy to compute: we can take those raw readings and basically subtract off the staircase effect, which in the case of clicking and dragging was independent of W, just depending on D.
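A sketch of that staircase model, under the assumptions just stated: for pointing and dragging the feature behaves like beta + alpha·log(D) with no width term, so alpha can be fit by least squares and the distance effect subtracted off:

```python
import numpy as np

def fit_staircase(distances, features):
    """Least-squares fit of feature ~ beta + alpha * log(D)."""
    X = np.column_stack([np.ones(len(distances)), np.log(distances)])
    beta, alpha = np.linalg.lstsq(X, features, rcond=None)[0]
    return beta, alpha

def normalize(feature, distance, alpha):
    """Remove the distance dependence, leaving a canonical measurement."""
    return feature - alpha * np.log(distance)
```

With alpha fixed (and, as noted, shared across subjects), any observed movement of known distance can be normalized the same way.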
And the numbers that you get out are then sort of normalized, and you can simply apply a classifier to them. Is that making sense? So we basically remove the staircase dependence. So now if we have any observation, as long as we know what the distance of the movement was, we can produce a kind of canonical measurement which should be different for the stressed and unstressed case. So from that data we finally ran some kind of a classifier and derived some accuracy results. And here is the result of that. We tried a few ways of classifying. The simplest one is just taking the stressed and non-stressed points and taking the mean of them; but these measurements, like the HRV measurements, still have a lot of outliers, and mean values have a lot of outlier problems, so taking mean values is a very bad model. We instead used a max accuracy classifier, which simply means we took the threshold; it's a one dimensional signal now. So we took the threshold which gave us the highest accuracy, which would be equivalent to doing a support vector machine in one dimension, but it's just simpler to take the highest accuracy threshold. So that's what the blue curve is here. The red curve is taking a simpler threshold, which is just the mean of the two sets, so taking an average midpoint between the stressed and non-stressed populations of data. Both of those are using the staircase model. If you take the staircase model away, you get this accuracy here. And to be more specific, I think it says it here, but the measurements are made by taking our experimental data, randomly taking a sample of some of the points as the test set, then training the model, meaning just setting the threshold, on the other points, and finally using the trained threshold to classify the held-out points. So along this axis here is the number of held-out points, the number of sampled points. There were only 100 points in total, so the accuracy is generally increasing as the sample gets bigger, but at some point it tapers off because you don't have enough data for the model; the model is trained on the other points. But anyway, we get accuracy of about 70 percent, and this is per user. So given, let's say, a few hundred points of data for a user, assuming it's labeled as stressed and unstressed, or perhaps it might be labeled as neutral, you can learn a threshold, and then from about 10 subsequent observations you should be able to classify stressed versus unstressed at about 70 percent accuracy. Again, to remind you, the data that we were using was based on the state of the subjects in the MStress and MCalm phases, from the self-report and the HRV data: roughly half as much stress as the full stressor. The data for the clicking and steering tasks is here. We get similar results overall; this one is a bit lower and this one is about the same, so around 70 percent accuracy again. So it seems good enough for practical use. And the key advantage of the staircase models for clicking and dragging is that they don't require knowledge of W. That's extremely useful for a logger that's application-oblivious, that's just watching mouse activity: all the logger has to do is recognize the start and the end of a mouse movement, let's say with a time window; it doesn't need to know what the target size was in order to figure out the staircase correction, because it's only using the distance value. And presumably, because of that really nice logarithmic dependence on the values that we measured, you should get fairly accurate measurements for a distance which wasn't one of the ones that we tested.
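The "max accuracy" classifier described above is simple enough to state in a few lines; a sketch, with the polarity handling as an assumption:

```python
import numpy as np

def train_threshold(values, labels):
    """values: normalized 1-D measurements; labels: 1 = stressed, 0 = calm.
    Sweep every candidate threshold and keep the most accurate one
    (equivalent in spirit to a support vector machine in one dimension)."""
    best_t, best_acc = None, 0.0
    for t in np.sort(values):
        for pred in (values > t, values <= t):   # try both polarities
            acc = np.mean(pred == labels)
            if acc > best_acc:
                best_t, best_acc = t, acc
    return best_t, best_acc
```

In the evaluation described above, this would be trained on a user's labeled sample and the returned threshold applied to the held-out points.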
So we are actually in the process of doing a subsequent study with more realistic tasks, where we simulate angry messages from supervisors coming in e-mail to produce the stress in an actual GUI context, so we can get more realistic movements; but still, there is pretty good evidence from this study that it should work. So we are building this revised logger, which will run as an independent process and won't need to be linked to applications, which has some privacy advantages as well. But simply by looking at mouse movements it should be able to report a kind of real-time estimate of stress. So our original goal for this was sort of health related: how can we help people monitor stress? But it does suggest that we can perhaps generalize our goals a little bit: we can look at people's levels of frustration, or perhaps anxiety, about user interfaces or applications. In the absence of measures of what the stressors are in people's lives, if we simply look at these stress levels as a function of time, and if we are able to get a little bit of information about which application people are running, then we can use this kind of measurement as a kind of implicit usability measure, which I think would be pretty interesting. I mean, it's simple enough that we could get masses of data, and then depending on how you cross-cut that data you could isolate factors such as the application people are running and perhaps get implicit usability information.

All right, so that's the summary; I am going to wrap up there. We have generalized to a cell phone: we have collected data from a similar study on a cell phone. The only difference on the cell phone is that the tasks are more diverse and less controlled, though we did do some pointing and dragging tasks on the cell phone. On the other hand, we observed, and we have video of the experiments, that people's use of the cell phone was a lot more diverse than the mouse: people would rest a hand on the table and sometimes hold the phone up; in some cases they would be holding the phone in two hands and doing two tasks. So in terms of the dynamics it's a lot more complex, and most likely we will have to at least attempt to recognize the distinct mechanical states that people are in when they are using the phone. Nevertheless, there are some signs that certain types of features, like the basic tapping on the screen, have a really nice dynamic ringing signal that seems to be related to hand tension. So all right, that's the work in progress; I hope to have that soon. So to summarize, we have been working on senseless sensing, which is trying to leverage existing technologies, some of which seem to have remarkably strong signals around affect generally, but especially for stress. We described recent measurements with MouStress, which is a ubiquitous, low cost, and hopefully reliable source of stress measurements in the real world from ordinary mouse use. We would like to see if we can get similar measurements from cell phone use, and we think we would have advantages over voice-based cell phone sensing in that people would just be holding the phone, which they arguably spend a lot of time doing, and we will have to see whether the environmental vibrations and so on are trackable or not. And of course, the work that my student Pablo is doing, and Mary's group is doing, is on interventions: trying to tie some of these measurements back and deliver appropriate, timely and effective interventions for relieving stress and improving mental health.
Okay. [clapping] Yeah? [inaudible]?

>> John Canny: I don't currently have a student who is oriented towards that. I think it's a great source. My former student [indiscernible], who is at the University of Pittsburgh, has been doing some work using the camera in contact mode. He has built a small video game that involves a lot of movement of the thumbs over the camera sensor, so he gets a simple pulse signal from that data. But for the face, you might know there has been work at MIT and elsewhere on recognizing pulse from changes in the face during blood flow; there is enough change in facial color, enough blushing from the infiltration, I guess that's not the word, but from the flushing of the face, that you can get at least a pulse signal, a fast stress signal, directly from the face. And of course there is much emotion in the face, and there are a number of toolkits; unfortunately a number of them seem to be proprietary right now. There is a little bit of open source work in the OpenCV toolkit, the open computer vision toolkit, that does at least face isolation and a little bit of feature recognition. But anyway, part of the trouble is that there is a fairly significant technical on-ramp for doing vision analysis. So right now we are interested a little bit more in some related topics. We are doing some work on deep neural networks for general image recognition. It might be applicable to this, but for right now, no, we are not doing tracking. Do you know of work? [inaudible]

>> John Canny: Oh, I would love to do pupil dilation, but I don't have the resources right now. I think especially because it's both a stress cue but also an attention cue. And I think a lot of the work on stress is focusing on these rather microscopic effects. There are very important effects to do with attention and interest that are more on the positive side: you would like to know when people are being effectively engaged and having an appropriate level of engagement, but not becoming obsessed. So in a sense they are in the zone of ideal behavior; that is, they are sort of attending to things without being distracted. I think gaze patterns and pupil dilation: you want the right combination, where people are transitioning from one stimulus to the other without being haphazard, applying appropriate focus, etc. I mean, that's really about detecting whether people are in the zone. That's sort of ideally where we would like to move: from remediating stress to actually helping people get into the right cognitive zone. So yes, that would be a great topic, but we aren't really there. [inaudible]

>> John Canny: So I will tell you a little secret, which is no. The detail is we didn't really expect to see the second order behavior in the gross behavior. And the truth is it's not strictly second order; it's two second orders stacked, because the gross movement is really not what we are interested in. We are interested in the second order little wiggling that's on top of it, at a different frequency. So the truth is we actually fit a full fourth order model and throw away the lower frequency poles. We take the high frequency ones, which turn out to be in the right frequency range for the system that we are trying to detect. So you are very astute; that's not exactly what we are doing, but it works to essentially filter out the other component.

>>: And when I think about that model, when I move the mouse am I basically just making a new set point for my [indiscernible]?
>> John Canny: Well again, it's not exactly a fourth order system. What we understand is that when people do a gross motion, it's not that; the spring-damper is supposed to be a sort of open-loop system that doesn't have a big input, but when you actually move, you really have this other system with a big forcing input that's responding to that input. So you again don't expect that. And when you look at those poles they are just not robust; they are all over the place, the dominant poles. So I am not sure if I am going to answer the question, but yeah, what was the crux of the question?

>>: If the model of a motion is just making a set point from one [inaudible]?

>> John Canny: So the simplest biomechanical models that I like are basically changing the set point of the system. So yeah, we did try a variety of things that intuitively might have helped, such as trying to only run the analysis during what appears to be a passive phase. You can define passive as energy coming out of the system, and it turns out to have a simple formula in terms of the directions of the derivatives, the first and second derivatives. So we did it in the passive phase and it was not as good. We also looked at the active phase. So, I don't know, I mean, yeah, we tried the things that might have helped, but they didn't really help.

>>: What did you do to try and calm your subjects down during the study? [inaudible]

>> John Canny: Yeah, you know what, I honestly don't recall. It's a research question which interventions work best, and I think it might have been a breathing exercise, but I am not completely sure. But yeah, from the work that Pablo has been doing and Mary has been doing, there are clearly different interventions having different effects on different people. So it would be great to have a better understanding of this in the context of this experiment. I am sure you can, you get so many anomalies, because some people don't get calmed at all during the calm phase and some people aren't getting very stressed in the stress phase. So there are all these outliers which make the data a lot less clean than we would like it to be. And perhaps better machine learning and modeling, so that you are giving each person the best intervention, would probably help this data quite a bit.

>>: My second question is about checking for stress in the voice: do you think there is a different type of change in the voice during the fast version of the stress versus the slow version? Because in real life you don't know when the stimulus is, so if you can distinguish between the two you can almost figure out what the stimulus was that caused them to be stressed.

>> John Canny: So I do know that it's a chronic signal. It's the same signal they find in depression. It's most likely cortisol, because it's basically permanent in depressed people.

>>: But could there be one from [inaudible]?

>> John Canny: Yeah, you would think that somehow it might also be the same effect but stronger in a short-term situation. I don't know if anyone has measured that.

>>: [inaudible]?

>> John Canny: Yes, yes, that's certainly true. Yeah, the voice really is a wonderful signal, and the breathing --. The good thing about the work that we did earlier is that there are so many untapped signals in vital signs, if you are able to get them; it's just really hard to get them. There is definitely more in breathing that people haven't tapped yet. It would be great to be able to do that.
We just decided it is much easier to do the work, do the experiments, and have impact with the implicit signals, but the signals are often a whole lot more ambiguous.

>> Mary Czerwinski: All right, let's thank Jon again. [clapping]