Early Prediction of Alzheimer's Disease
Presenter: Dr. Eva K. Lee, Georgia Tech
Recorded on: August 6, 2014

And now we'll go to today's presenter, Dr. Lee. >> Thank you, Sherry. Today I will present a talk on some of our work related to early prediction of Alzheimer's disease. Alzheimer's disease is the 6th leading cause of death in the United States. It is a progressive, irreversible brain disease causing memory loss and other cognitive dysfunction. Within the United States, it is estimated that about 1 in 8 elderly Americans suffer from this disease, and we spend about 183 billion dollars on health care in this area. Worldwide, about 35 million people suffer from this illness. Many different clinical groups have tried to discover the earliest possible diagnosis and to provide earlier intervention. That is because Alzheimer's disease is incurable at the moment: once it is diagnosed, the drugs given to patients are an attempt to slow the progression or mitigate some of the symptoms; they do not cure the disease. Here you see MRI images of a healthy brain, one with mild cognitive impairment, and then progressively to Alzheimer's disease. Looking at all of these, there is also work on prevention, such as the types of food, the types of protein that accumulate in the brain, and how certain lifestyles may help prevent the disease, so this is a very active research area. People also study how systems inside the body change over time, how those changes may affect cognition, and try to understand, determine, and catch this early.
By the time symptoms occur, we have clearly marched into the disease stage, and at that point we cannot reverse the process; rather, patients may take drugs to slow the progression, basically treating the symptoms. So let's look at the types of data that are available now and being actively collected. There are neuropsychological tests, basically 10-to-15-minute tests, like an IQ test, where questions are asked, and based on how the individual responds to different questions and draws different pictures, the doctor can diagnose the cognitive status of the individual. There are also genome-wide association studies, MRI imaging, and then the biomarkers. The biomarkers are a bit more invasive: some are taken from the cerebrospinal fluid, and some can only be examined after the patient dies, when the definitive results can be tested. Looking at the neuropsychological tests, as I mentioned, these are very non-invasive, and you can think of them more like an IQ test. There is the mini-mental state exam, which asks many different things, including language tasks: for example, the tester will give you ten different words, and you will be asked about those words, how you remember them and in what order; both your response time and your answers are captured. There is also the drawing of clock hands, and as you see, these make the problems quite obvious, in how patients understand or visualize and how they actually put down what they see on the piece of paper. There is also a depression scale, asking questions about how they feel. All of these are part of the neuropsychological tests.
So the objective of this study is to find the earliest detection mechanism so that we can provide the earliest intervention for individuals. If you look at the studies on biomarkers and on imaging, many of them have pointed out a deficiency: by the time they identify the individuals, it is already a diagnosis of the illness. That means they cannot reverse the process, but rather only treat the symptoms or slow the progression. Neuropsychological tests give us one glimpse of hope: if we can use these tests to diagnose individuals with cognitive impairment at that early stage, and catch it early enough, then we may be able to reverse the process. This is the most important part: having a chance to reverse the process and giving doctors a means of early intervention. We received all of this data, not just the neuropsychological tests. Interestingly, at the beginning we put all the data into the system to see which kinds of data would pop up as the most effective for the earliest diagnosis, and the results of the neuropsychological tests popped up at the very top, long before the MRI images. What that tells you is that the brain changes gradually; the brain may change, but the change is not significant enough to provide a diagnosis with the current technology, in terms of the resolution of the images. But the changes in how people talk, how they understand words, how they remember words, and how they draw things can be captured, and even though these are not as precise as imaging, they provide a much more powerful tool for early diagnosis. So now, looking at the data, we turn to classification.
The idea of classification is really simple. If we look at two different sets of data here, the dark one and the light blue one, the idea of classification is to separate the two, so that you can say this group belongs to the dark blue one and this to the light blue one. You can see there are some errors in between. Am I still connected, Sherry? >> Yes, yes you are. I'm sorry, I had to mute some people. Go ahead. >> Okay, thank you. Sorry, one second, I got a bit discombobulated because I have lots of things on the screen. The idea of classification here is to separate the patients into the control group, which is the normal individuals; the patients with mild cognitive impairment, meaning patients who are progressing but not yet diagnosed with Alzheimer's; and the patients who have Alzheimer's. What we would really like is to catch the patients with the earliest form of mild cognitive impairment. Our approach is analysis using mixed-integer programming, which allows you to make a reserved judgment. If I project, for example, the three groups corresponding to Alzheimer's, mild cognitive impairment, and the control group into regions R1, R2, and R3, you can see there are some errors in between, a mix of these labels one, two, and three, especially in the middle. Our idea is that instead of classifying everyone into exactly three groups, you can make a reserved judgment: those individuals that are fuzzy in that area, you can set aside first. At the beginning you classify all three groups, and the patients that are fuzzy you classify in a second round. By doing that you can avoid overtraining.
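The two-round, reserved-judgment idea can be sketched roughly as follows. This is a minimal illustration, not Dr. Lee's actual implementation: the `score_fn` interface, the group labels, and the confidence margin are all hypothetical stand-ins for the fuzzy-region test in the real model.

```python
def classify_with_reserve(score_fn, sample, groups, margin=0.15):
    """Assign a sample to the best-scoring group only when it wins clearly;
    otherwise reserve judgment so it can be reclassified in a second round."""
    ranked = sorted(((score_fn(sample, g), g) for g in groups), reverse=True)
    (best_score, best_group), (second_score, _) = ranked[0], ranked[1]
    if best_score - second_score >= margin:
        return best_group   # confident, first-round assignment
    return None             # fuzzy case: reserved judgment

# Hypothetical group-fit scores for one patient:
scores = {"AD": 0.9, "MCI": 0.4, "CN": 0.1}
result = classify_with_reserve(lambda s, g: scores[g], None, ["AD", "MCI", "CN"])
```

A patient whose best and second-best group scores are close lands in the reserved set and is only classified in the second round, which is what keeps the first-round error rate down.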
You can also avoid too many misclassifications in the first step, and this is actually really important in the clinical setting, because a lot of the tests may not be definitive and there may be some fuzziness in the lab results. It is nice to have this fuzzy reserved-judgment area where we can continue to classify these individuals. The setup of the predictive model is as follows. We input G groups and N entities; in this case we have three groups, and the entities are the patients, so we have three groups of patients. Each entity has different attributes: here the entities are the patients, and the attributes are all the data related to the patient, including the neuropsychological data, the MRI data, the biomarkers, and all the other information we have about the individual. The mathematical model is that for each patient we associate a 0/1 variable to represent whether they are classified correctly or not. There is a mathematical expression to describe into which group an entity will be classified, and another mathematical expression to describe how to place an entity into the reserved-judgment region. We control how many misclassifications we allow, and the objective is to maximize the number of correct classifications; in this case, we want to maximize the number of patients we can classify correctly into their respective groups. At the end, the outcome of this mathematical model is a predictive rule that doctors can use when a new patient comes in: the patient takes the needed tests, and the doctor can diagnose right away what the status of the individual is. That is the most important part. I will show you just one slide; this is the only mathematical slide I will show. Here you can think of patient j from group g.
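The kind of mixed-integer formulation being described can be sketched as follows. The notation here is illustrative only, not the exact model from the slide: let $x_{jg}\in\{0,1\}$ indicate that entity $j$ is placed in group $g$, let $G_j$ be the true group of entity $j$, and let $\alpha$ be the allowed misclassification rate.

```latex
\begin{align*}
\max \quad & \sum_{j} x_{j,G_j}
  && \text{(number of correct classifications)} \\
\text{s.t.} \quad & \sum_{g} x_{jg} \le 1 \quad \forall j
  && \text{(all-zero row $\Rightarrow$ reserved judgment)} \\
& \sum_{j \,:\, G_j = g} x_{jh} \le \alpha \,\bigl|\{\, j : G_j = g \,\}\bigr|
  \quad \forall\, g \ne h
  && \text{(misclassification limit, e.g.\ } \alpha = 0.15\text{)} \\
& x_{jg} \in \{0,1\}
\end{align*}
```

The first constraint is what creates the reserved-judgment option: an entity assigned to no group at all is deferred to the second round rather than counted as an error.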
Classified back to group g, so that is maximizing the correct classifications. And here we say patient j from group g is misclassified into a different group, group h. We want to control the error rate; in this case we control it at 15%. So this is the mathematical model, and without going into detail, this model poses some challenges: how do you solve it to optimality, and how do you come up with a good classification rule that the physician can use? How can we measure how good the predictive rule is? We use the data from the hospital: one set is called the training set, and we perform what we call K-fold cross-validation. We separate the training data into K roughly equal parts, use some of the parts to establish the rule, and then validate on the remaining fold. Once we finish the training and the cross-validation, we do blind prediction on a set of patients that were not used in training. Specifically, we use 10-fold cross-validation: we partition the training set into ten subsets of roughly equal size, develop the classification rule using the mathematical model I just showed based on nine of the subsets, and then test the results on the remaining subset. We perform this ten times, each time withholding a different subset for validation, and by doing that we obtain an unbiased estimate of the classification correctness. Once we obtain the rule, we can apply it to new sets of patients. One important thing to understand is the following: developing the predictive rule is hard, meaning that solving the model and coming up with a good rule is very difficult. But once you have the rule, it takes only seconds to test new patients.
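The cross-validation procedure just described can be sketched like this. It is a generic sketch: the `train_rule` interface is hypothetical and stands in for solving the mixed-integer model on the nine training folds.

```python
import random

def k_fold_cross_validate(data, labels, train_rule, k=10, seed=0):
    """Estimate classification accuracy by k-fold cross-validation.

    train_rule(train_x, train_y) must return a function that maps one
    sample to a predicted label (hypothetical interface)."""
    idx = list(range(len(data)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]   # k roughly equal parts
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train_x = [data[i] for i in idx if i not in held_out]
        train_y = [labels[i] for i in idx if i not in held_out]
        rule = train_rule(train_x, train_y)         # fit on the other k-1 folds
        correct += sum(rule(data[i]) == labels[i] for i in fold)
    return correct / len(data)   # estimate of classification correctness
```

Each sample is held out exactly once, so the returned rate is computed entirely on data the rule never saw during that fold's training, which is what makes the estimate unbiased.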
This is the most important part, because the physician would like to use it in real time, while seeing the patient or when the lab results come in. They want to run the predictive rule right away and see within seconds what the status of the individual should be. Here I show you the training-set subjects. We have two trials of patients; it is a pretty small subset. These are the Alzheimer's patients, these are the mild cognitive impairment patients, and this is the control group, only 35 patients total for training. It is designed this way on purpose: I wanted a small set of data for training, so that we can see whether really good rules come out or whether we need to extend the data set. And I must mention, for those of you familiar with statistics, that this is not a statistical method; this is more an artificial intelligence and machine learning approach. We do not suffer from a small data set, because we are not relying on sample size for prediction. So these are the results. The training set is separated into two parts: one on which we do the 10-fold cross-validation, and the rest of the data, not used for training, for blind prediction. If you look at the training results, this is the Alzheimer's patients being classified back to Alzheimer's, this is those being classified as MCI, this is the MCI patients classified as MCI, and this is the control patients. Basically, the diagonal elements of this matrix are the correct classifications; anything off the diagonal is an error. In this case, we have 80% correct in classifying the Alzheimer's patients, 100% correct for the mild cognitive impairment patients, and 100% for the control.
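Reading the per-group accuracies off such a confusion matrix can be sketched as follows. The counts below are illustrative only, chosen to be consistent with the 80%/100%/100% rates quoted for a 35-patient training set; they are not the actual patient counts.

```python
def classification_rates(confusion):
    """Per-group correct-classification rates and overall accuracy.

    confusion[i][j] = number of group-i patients classified into group j;
    diagonal entries are correct, off-diagonal entries are errors."""
    rates = []
    for i, row in enumerate(confusion):
        total = sum(row)
        rates.append(row[i] / total if total else 0.0)
    overall = sum(row[i] for i, row in enumerate(confusion)) / sum(map(sum, confusion))
    return rates, overall

# Illustrative counts (AD, MCI, control rows): 4 of 5 AD correct, rest perfect.
rates, overall = classification_rates([[4, 1, 0], [0, 15, 0], [0, 0, 15]])
```

The same function applies unchanged to the blind-prediction matrix discussed next, where the off-diagonal cells are the patients placed in the wrong group.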
If you look at the blind prediction here, we get a pretty good result: all the Alzheimer's patients are classified back to the Alzheimer's group; in the MCI group, one patient is classified as Alzheimer's; and the control group is all classified into the control group. Now the interesting part: as I mentioned, at the beginning we put in all the data, and then the most critical discriminatory features are selected. We try to use as few as possible; here we select only five, and they all turn out to come from the neuropsychological tests. I don't remember all the details of what each one means, but here, for instance, the individuals are read ten words and then asked which word appeared in which order, and these word-recall items show up all the time as discriminators of whether the individuals remember or not. Each of these features has a different meaning; if you notice, the neuropsychological tests are quite nicely designed to let us look at the cognitive status of individuals. The blind prediction accuracy is 91% on this very small trial, so we expanded the process to a much bigger set. Here we have the blind predictions for a large group of individuals. For Alzheimer's, 102 are correct and 14 of these individuals are placed in the MCI group; of the MCI individuals, 80 are classified correctly and seven are placed as control; and in the control group, 13 are placed in MCI. Overall, the blind prediction accuracy is about 90%. This is very exciting, and as one can imagine, it is difficult to get all this patient data; we started this study in, I think, 2010, and we have now finished it.
All these results have been published; there are two papers related to these results, and we are now doing three different mega-trials, receiving data from three different sites and three different universities. We also have multi-site trials that include about 2,000 patients. So this is progressing in terms of really looking at how well it works, how much we can believe in the rule, and what its predictive power is. One has to understand that it is not just about doing ten-fold cross-validation; the blind prediction is the most important part, because that is what a decision support system is about. What is the clinical implication? The neuropsychological test has now proved to be inexpensive, and it can easily be conducted at any time. You can conduct it as a baseline even at age 18, when students are going to university; they can be tested, and it could be included in an individual's annual checkup. It appears able to show the earliest signs of cognitive impairment compared to all other types of data. That makes sense: many changes in the body may be too subtle to capture even in the imaging or in the biomarkers, yet how individuals actually perform day to day, responding to words, writing, and so on, may offer a view that we can capture. That is very important. Another really exciting implication is that these tests could also serve as a baseline for military personnel before they are deployed. We can take that information. These are very young individuals who may not have anything like Alzheimer's, but it is important to capture their cognitive status, because they are exposed to many different types of chemicals and to explosives, and that may also affect their cognition.
So this is a really inexpensive regular checkup, and the test takes about 15 minutes. The conclusion is that, although I showed you only one rule, we have identified many different rules, all quite interesting and related to different types of neuropsychological tests. It allows patients to come in and do the test inexpensively, and physicians can effectively screen these individuals for the earliest signs of cognitive impairment. This is all evidence-based, because we capture all that information, know the status of those individuals, and use that to develop the classification rule. And clearly, the scheme is generalizable to any type of disease. One thing I would like to mention: when I talk to students about Alzheimer's, they think it is only for old people. But this is really not just for old people; we had patients diagnosed definitively with Alzheimer's at the age of 40. Forty may still sound old to a student, but it is really quite young compared to the common perception of this as a disease for people over 65. Most importantly, it is critical to catch it early. If we are able to catch these individuals early, one implication is that they could be placed on prophylactic drugs; there are three types of drugs available now. If they take those medications before they progress to Alzheimer's, the results can be quite striking, in that they may not progress to the Alzheimer's disease state. That is really important, because once a patient is diagnosed with Alzheimer's, there is no cure; but before that stage happens, there is still a chance. It is most important for us to do early diagnosis.
There are two papers available on this work, and for those of you who are interested, I will be happy to post them on the website so that you can pick up the papers and look at some of our results in more detail. These are all raw data; that is, not the processed data that doctors are more used to. Usually they look at the clock drawings and then put a rating on the clock; instead, we use the full drawings and everything else, incorporate them into our model, and it turns out to be more accurate in prediction. Thank you. >> Thank you, Dr. Lee, for the presentation. Now I am going to unmute the people who use the phone function for the Q&A session, and those who use the earphone function, please send me your questions using the chat box. Does anyone have any questions for Dr. Lee? All right, if there are no questions, then I would like to conclude today's webinar. Thank you, everyone, for attending. Please feel free to send us an email if you have any questions, and we will see you at our next webinar in the fall. Thank you. >> Thank you.