>> Hoifung Poon: It's my great pleasure to welcome Dr. Jeff Shrager, who is visiting us from Stanford and Cancer Commons. Jeff got his PhD from CMU with Herb Simon and then went on to do a lot of interesting and exciting stuff, including pioneering work in eCommerce and working on computing platforms for pharmas. Today he will talk to us about molecular tumor boards, an exciting frontier in genomic medicine, and without further ado, here's Jeff. By the way, for people who are online, you can hit a button to ask questions; I will be monitoring your questions and asking Jeff in real time. Feel free to send in more questions. >> Jeff Shrager: Yes, you have to annoyingly interrupt the speaker in angle brackets. To be clear, by the way, I was working on software for pharmas, not farmers, just in case it wasn't obvious. I'm going to be speaking today about mountain bikes, which I'm told is what MTB stands for. The original title of this talk was Tools for Molecular Tumor Boards, and I never do this, but just before the talk I decided to change the slides. Most of what I'm going to talk about is tools that are built in conjunction with molecular tumor boards, or the equivalent of molecular tumor boards, to be used by molecular tumor boards. So I retitled it, to paraphrase a famous philosopher, Tools by, for, and of Molecular Tumor Boards, and you'll see how that plays out. This is the outline. I'm going to be relatively short, although the talk has a lot of pieces. I'll start off talking about precision oncology, which is essentially the problem that we're dealing with. It is essentially a decision problem: which drugs or, more generally, treatments to give to whom and when. Some of the treatments can be non-treatments, and depending upon many dimensions of the phenotype, there are many decisions to be made. That sounds like a fairly simple problem. In fact, early on in medicine it was a fairly simple problem. In the pre-omic era there were approximately 10 different phenotypes: lung cancer or breast cancer. >>: What is omic? >> Jeff Shrager: Genomic is the word for, basically, the genome, and omic is like you take *omic: genomic, metabolomic, exomic, blah, blah. It's just short for everything *omic. This is going to get bigger very quickly. There were approximately 10 phenotypes which were histologically (see, now we're getting all of the terms; that's not here, unfortunately) which were histologically defined: breast, lung cancer, et cetera. And there were a few chemotherapies. In that world, what would happen is someone would show up and they would do a biopsy, and then they would select some treatment, and then there was some regimen for doing the treatment, and then they would see what happened. It was a pretty simple model. This is the world of today, the omic world, and it's pretty hard to show an infinite matrix, so I'll use the technical term eleventy-zillion. Basically it's an eleventy-zillion by eleventy-zillion dimension problem where you have not only single treatments, but there are treatments coming online all the time, combinations of treatments, different plans for different treatments, all kinds of stuff on the treatment side. And, as I'm sure you're familiar even if you've just been reading the news, there are thousands and thousands and tens of thousands, probably hundreds of thousands, of different dimensions on the general phenotype side. 
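One way to make the decision problem concrete (this formalization is an editorial sketch, not notation from the talk): write the phenotype as $x$, the time as $\tau$, and the set of available treatments, combinations, and plans as $\mathcal{T}$. Then precision oncology is, roughly,

$$t^{*}(x, \tau) \;=\; \arg\max_{t \in \mathcal{T}} \; P(\text{good outcome} \mid t, x, \tau),$$

where both $\mathcal{T}$ and the phenotype space are eleventy-zillion-dimensional and the observed $(t, x)$ pairs are extremely sparse, which is the whole difficulty.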
If you just look at the genome, you can count from 16,000 up to 3 gigs worth of information just in the genome itself of the patient. Notice that in cancer you are talking about not only the patient's natural genome but also some tumor, some mutation which caused the problem, so there is genome times genome everywhere. In this world the workflow, although it still looks like a circle, is very different. Essentially, there are biopsies; you run the panomics, essentially getting all of the data you can out of the patient and the history; you do some kind of complicated targeting decision-making, some kind of complicated treatment planning. Sometimes you can test the proposed treatment in mice and other patients, hopefully first. You run the combo therapy and do much more short-term observation of what's going on, so you don't have to basically wait five years for the person to either die or be cured. And then they recur or don't recur. In 2011 we published a paper that was a review of the different techniques at the time, the different problems at the time in cancer, and we claimed that AI could cure cancer. I'm not sure I would claim that today; in fact, I would claim the opposite today. We divided the opportunities into three different kinds. I don't expect you to read this; I just want to give you the three gross categories of problems that we identified, and you can go to the paper if you care. There were knowledge opportunities: basically, get all the knowledge there is. There were learning opportunities, which is essentially to observe what happens when you do something and then update your models. And there were planning opportunities. Most folks focus on the knowledge and data, because computer science these days is kind of a knowledge-and-data kind of field, and, in fact, that's the second part of this. It's the obvious thing to do, and what we and other people have created are knowledge- and literature-based tools for tumor boards, generally. I'm going to show you an example of this because I'm going to get to the "by" and "for" end of this. This is a particular project. Some of this I'm going to skip through because it would be carrying coals to Newcastle for you folks. This is focusing on the modeling. Essentially, this is treatment selection or treatment ranking: you do something which implicates a model, then choose a treatment, and then it goes into treatment planning. What happens here, generally, is you say we're going to scrape the literature, get some bunch of information which is approximately this model (it could be a molecular model or not; I'll show you molecular and non-molecular versions of it), and then use that in treatment selection. This is sort of the obvious thing: what you're doing is trying to take broad-based omics and broad-based research, somehow mash them together, and come up with answers to various questions. The most relevant question for a patient is which drugs are applicable to which phenotype. If you have this data you can answer other kinds of questions. A particular instance of this is a project that we did with Mocellin et al. It's an Italian name; I have a little trouble pronouncing it. They had a bunch of physicians, oncologists, that had formed the Melanoma Molecular Map Project. What they did is they went out and they built melanoma molecular maps. They were experts at this, and they built a bunch of PowerPoint slides, so there's no underlying representation to this. 
And they mapped out, there were like 20 of them, some very complicated systems, as all of these are. And then they did this other interesting thing. They had all these pictures, and they said, what we're going to do is go to the literature and create this thing called the Targeted Therapy Database. The Targeted Therapy Database was a spreadsheet, and I'll show you this a little bit closer in a minute. It was manually created over a decade of melanoma research by experts in the field. Experts, biologists, are really good at filling in spreadsheets. They really like to deal with them; they work with them very well and they are good at it. This is a close-up of the same thing; I just selected the BRAF part. So this is the molecule, the gene of interest in this particular case, and the particular mutation. There are lots of entries that detected that, and the reference, the paper that was scraped for that information. They read the paper and they decided, basically one line per paper, approximately, whether that paper showed that, for a BRAF mutation, this was sensitive to whatever drug. This was the drug; sometimes the drug had an alias. So far that's relatively obvious. The interesting thing is these columns here. The model is what kind of experiment was done, and the number of cases, which is sometimes relevant if it's a laboratory experiment, is how many observations. In this particular case, a 5/37 is an observational study with 37 human subjects. Notice the 6s, of which there were many; this is just a little piece of what I think was a 1,400-row table. A 6 would be a randomized trial and a 7 would be a meta-analysis. We'll get back to this; keep it in mind. I'm going to go through this relatively quickly because you will all understand it without me having to explain it. You basically do some fairly simple, straightforward statistics, and you can score the likelihood that a particular molecule is actually going to have an impact on melanoma given that particular observed mutation. We built a tool to do this, and the tool had a couple of interesting features. Yes? >>: On the previous slide, when you combine all of this evidence, do you treat all of those lines in a uniform way? >> Jeff Shrager: They are weighted by the model. That's exactly right. They are weighted by the model and the number of observations, but other than that they are uniform. That's why that model column is in that table. The math underlying it, by the way, is published, so you can go to the paper if you really want to see the math. We did the obvious thing: we built a web-based tool, but the web-based tool had several interesting features that I want to show you because it emphasizes this interactive point I want to make. This was usually set to the standard set of treatments, so the inputs are basically the observations, the tests and their results, whether they were consistent with the data as expressed. Concordant is apparently Italian for yes and discordant is apparently Italian for no. You would set this. The interesting thing is it would tell you, first of all, which things you had to test. That is to say, it's ranking not only the treatments but also the tests to do. It gives you a bunch of tests. 
You would fill in the ones that you did, and then you would end up with some score that basically says yea or nay and what the probability is that this is actually going to have an effect. It's more interesting when there are multiple things, so I'm going to skip forward to this. What's going on here is it tells you the references. It shows you the rows for the relevant hypotheses, and this was very important to the doctors. Remember, we were building this with the physicians that built the table, and so they insisted on seeing the relevant hypotheses and seeing the references, because they wanted to be able to go back and figure out whether to believe it or not, and we'll come back to that because it is very important. Again, it's more interesting when there is more than one hypothesis in the set. It's telling you, essentially, it is doing what we call drug ranking or treatment ranking. This is treatment ranking based on data that was supplied by human experts reading the literature and making a judgment as to what the value is. You could do the obvious thing and say we are going to semi-automate the reading of the literature. It turns out to be extremely difficult, for reasons that I'm sure you guys could explain to me better than I can explain to you, but essentially the way that we did this was by just looking for concordance relationships and then actually going back and having to score which direction the concordance relationships went. The problem with concordance relationships, and this is out of PubMed, is that there are 2 to the 22,000 possible combinations if you just do it raw. But the great thing is, remember, it's the Melanoma Molecular Map Project, so they actually had drawn all these nice pictures, and so we could just focus on the genes in the pictures. That's a fairly easy thing to scrape out of the picture. In fact, with Mechanical Turk you can scrape the relationships also. You end up with a bunch of relationships that basically say the probability that a particular gene… Let me back up one slide, because there is a better example. The probability of this is what you really want and you can't really get it: that is, the probability of a cure given a treatment and the disease characterization. You can get things like the probability that the gene goes with the treatment, and then you have to actually score the direction of that manually, because we did not have enough natural language oomph to be able to tell what direction the result was. I still think there isn't enough natural language oomph to do that. You can go through and you can get that, and you can have those data in, and then the experts would look at it and score the direction. But a more interesting thing to do is actually use exactly… Sorry, I had backed up and lost my place. Basically, we used these things to constrain what the concordance relationships were over. A more interesting thing is to go back to this and say, if you've got the ability to code human observations, so a 5/7 is an observational study with seven subjects, what is a 5/1? A case study, right. There are thousands and thousands of case studies in the world, many more than there are papers. What you could do is actually start adding rows to this table for every case that comes down the line. It's a little complicated, because they are non-independent if there are multiple observations, so there are details. 
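A minimal sketch of this kind of evidence-weighted ranking, under stated assumptions: the weight values, field layout, and example rows below are illustrative inventions, not the published TTD math (the paper has the real scoring).

```python
# Minimal sketch of evidence-weighted treatment ranking in the spirit of the
# TTD approach described above. All weights and rows here are illustrative.

# Weight per evidence "model" code: better study designs count for more.
MODEL_WEIGHTS = {
    5: 1.0,   # observational human study
    6: 3.0,   # randomized trial
    7: 5.0,   # meta-analysis
}

def rank_treatments(evidence_rows, observed_mutations):
    """Each row: (mutation, drug, model_code, n_cases, direction),
    with direction +1 (sensitive) or -1 (resistant)."""
    scores = {}
    for mutation, drug, model, n, direction in evidence_rows:
        if mutation not in observed_mutations:
            continue  # only evidence matching the patient's tests counts
        # Weight by design quality and, sub-linearly, by sample size.
        weight = MODEL_WEIGHTS.get(model, 0.5) * (n ** 0.5)
        scores[drug] = scores.get(drug, 0.0) + direction * weight
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical rows, one per paper; a new tumor-board case would just be
# another appended row (a 5/1, n = 1), modulo the non-independence caveat.
rows = [
    ("BRAF mutation", "drug A", 6, 337, +1),
    ("BRAF mutation", "drug B", 5, 37, -1),
    ("BRAF mutation", "drug A", 5, 1, +1),
]
print(rank_treatments(rows, {"BRAF mutation"}))
```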
But the idea is that eventually, as you add cases to this, it's an extremely simple version of a learning approach from the data. They don't have to publish the cases. You want this to be sitting there, reading the cases that are coming in through the tumor boards, and I'll get to tumor boards in a second. That's the obvious approach: genotype-to-treatment models. But the problem is it's not a complete solution, not even nearly. It's not even 80 percent; it's more like 10 percent of the complete solution. The reason is, if you look at treatment planning, which reaches a little into treatment selection but is mostly beyond it, it's really an incredibly complicated thing, and you don't have to read this; I've written it out here. Same picture, but now look at all of the other considerations. You've got the disease model, whatever information you grabbed out of the literature, which is kind of that table. You've got the patient's preferences, the treatment history, financial considerations, which is almost the same thing as drug availability and affordability. You've got guidelines as to what things are legal to do, what things have been approved, what things have not been approved, what you can get away with. Tumor availability for testing is a major constraint. You might say I'm going to do every test in the book, but it turns out you've only got a small amount of tumor, and every time you have to get new tumor it's not only expensive but exceedingly painful, if you can get it at all. All of these kinds of considerations are faced by these teams; basically, all of this added to broad-based clinical experience is where we are going. The question is where do you find broad-based clinical experience, and the answer is molecular tumor boards, and that's where I'm going to spend most of the detailed time here. What is a molecular tumor board? A molecular tumor board is a team, not always in the same room, sometimes virtual, sometimes e-mail, but in any case it's a group of experts in many different areas faced with the problem of treating the patient before them. I like to think of this more as an engineering problem than as a science problem. This is Apollo 13, one of the greatest scenes in any movie, at a time when the astronauts are falling out of the sky, running out of oxygen, and they are faced with figuring out how to not suffocate. They pour all this junk out on the table and say, we've got to find a way to make this fit into the hole for that, I don't know which way it went, using nothing but this. That's really the problem faced by molecular tumor boards also. >>: How do you get 15 people to fit the same timetable? >> Jeff Shrager: It's an interesting question; let me get back to it, and I have several different approaches to it. There are many practical issues like that with tumor boards. The way this generally works, in case it's not obvious, is that a patient shows up who has progressed on whatever standard treatment they were given. These are the most advanced patients. The tumor boards basically see problem patients, and that's good. It's bad that there are problem patients, but it's good that, if you focus on molecular tumor boards, you are seeing the hard cases. The tumor board meets, and we'll get into what happens there. They come up with some hypotheses and some treatment strategies, and then it gets fed back to the oncologist and the patient. 
They make a decision as to what they are going to do. They try a few things. Sometimes it comes back to the tumor board. It's sort of an obvious workflow. Reimbursement is a big part of it, and actually the argument for reimbursement is part of what the tumor board worries about. So it's a significant problem they're facing, and the discussion they're having is not just what treatment to select but what to do with this patient who is falling out of the sky, or worse, given all of these considerations. Again, there's a group of experts in many different fields that come together on some timetable, in some discussion, often remote with e-mail, to explicitly assemble, and I'm using "model" generally, not model in the sense of a pathway model necessarily, basically some theory of what they're going to do with this patient according to best current knowledge, and reason from that; "theory" would be a much better word. The interesting thing about tumor boards is they expose their reasoning to us. By watching molecular tumor boards in process, we are watching the in-process problem solving that these experts are doing, and they are doing all kinds of stuff, and I'll show you examples, using the knowledge obtained from literature, et cetera. There are lots of unique kinds of things that molecular tumor boards deal with; here are two. First of all, they know, or they have hypotheses which they use, about what parts of the literature are too early, like what do we think really is worth looking at. They know what parts of the literature are outdated. Every time I say they know, I mean they operate as though they know; what it really means for a piece of literature to be outdated in some absolute sense is unknown, but they operate this way. They have to make decisions when knowledge is not available at all in the literature; there is all kinds of stuff that the literature doesn't cover, many things, like how to actually interact with pharma over getting a drug. What are the important relationships between these different kinds of things? The cross-domain translation of terms: we mean this by this gene and Foundation Medicine means that by that gene. What happens if the gene isn't there at all because it got ripped up? And they have heuristics that they use to do this. Basically, they are using practical knowledge; you might call this scientific pragmatics. There is another category of knowledge, which is that they are making arguments pro and con. They are really trying to work this thing out for the hypotheses. They give you the rationale for rejected hypotheses. This is very important, because if you look at what's in an EHR, you basically only see what was done, sometimes with the reasoning in the letter that went to support it or in the notes, but you don't see the reasons that they rejected hypotheses, which turn out to be, I would think, more important; that is basically lost information. This is sort of the same as the previous one. They have implicit, unpublishable knowledge, knowledge for which there is no experiment, but it's in the problem solving and they have to use it. They use it every day in their actual work. Tumor boards have stats on the cases that typically occur, which you could get by looking at the data that came out of the health records, but basically they have that information in front of them. You might call this in-context knowledge; so it's pragmatic knowledge and in-context knowledge. This distinction isn't very important. 
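To make concrete what capturing that pro-and-con knowledge might look like, here is a minimal sketch; the class names, fields, and example strings are illustrative assumptions, not the schema of any of the tools described below.

```python
# Minimal sketch of a case record that keeps the reasoning an EHR loses:
# hypotheses with their pro and con rationales, including rejected ones.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Hypothesis:
    claim: str
    rationale_pro: List[str] = field(default_factory=list)
    rationale_con: List[str] = field(default_factory=list)
    accepted: bool = False          # rejected hypotheses are kept, not dropped

@dataclass
class TumorBoardCase:
    summary: str
    hypotheses: List[Hypothesis] = field(default_factory=list)

case = TumorBoardCase(
    summary="Progressed on standard therapy; panel shows a driver candidate",
    hypotheses=[
        Hypothesis("mutation X is the driver",
                   rationale_pro=["similar cases with this mutation in COSMIC"],
                   accepted=True),
        # This rejected branch, with its reason, is what the EHR never records.
        Hypothesis("alternate pathway is the driver",
                   rationale_con=["no supporting aberration on the panel"]),
    ],
)
print([h.claim for h in case.hypotheses if not h.accepted])
```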
Let me show you what tumor boards use and produce. Everything you're seeing now is real data that either came from stuff that went into a tumor board, stuff that came out of a tumor board, or stuff that was transcribed in a tumor board process; I haven't distinguished it all. One thing that tumor boards do, because they're meeting on a time schedule and they really have to get all of these experts to think about this problem, is they really condense the dimensionality of the problem for their reasoning. For example, they never look at the entire huge radiological history of this thing. The radiologist says here's the important thing. They circle it and say okay, this is where the tumor is; this is where the tumor was. Similarly, you get the history. Also, very often these days, molecular tumor boards will actually pick a diagram out of a paper, often a review paper, and put it up and say this is the thing we are reasoning about, the Ras pathway. Actually, if you look at this one, which came from a real molecular tumor board, it's actually got where the drugs target in the pathway and things like that. They're struggling with trying to actually do real-time reasoning about this situation. I am going to go on and give you a better view of this. Here is data from the tumor. This data comes from across the bridge, Tony Blau at UW. This is the condensation of the story. They give you relevant background knowledge as opposed to dumping the entire health record. This is a different selection of background knowledge, now from literature and databases, not from the actual health record. They went out and looked at some data and said, for some reason it's relevant to them, that there are 44 of these endometrial cancers with this mutation in COSMIC. They pick out the literature that they believe is relevant and they point it out. Importantly, they often, at least in my experience, which isn't huge, so I'll change that to sometimes: you can sometimes see them state things that we can't find in the actual literature, things which seem to be implications of the papers but aren't actually in the papers if you read them. It's essentially in-context reasoning about factoids: a well-known fact can mean something different in the context of this reasoning than in the context in which you would read it out of a paper. This is partly why I think reading papers is slightly crazy if that's all you do. They also create explanations. After all of this stuff, they collectively decide that this particular FGFR mutation is the driver of this particular tumor. Now they've got a hypothesis about what's driving it, a model if you will. Now they're going to create a treatment plan. They say we're going to try this drug and this drug, and it turns out they did try that drug; there's some stuff about the approval or non-approval of the drugs, and then this is their bottom-line suggestion: they predicted that the BRCA and CHEK2 mutations may predict sensitivity to PARP inhibitors, which is what they are going to try to use. Sometimes they get test results. They tried it. The tumor board finds out what happens, usually fairly quickly, like a few meetings later, whenever they can get the actual measurements back. This is just to point out the various places where you can find relevant models, model-based reasoning, background context. I said all of this as I was talking through it, so I'm just going to page through it. This is just annotations on the same thing that I said, essentially. 
Tests of hypotheses and outcomes: that's what tumor boards do. Go ahead? >>: Have you compared this to Army intelligence people? >> Jeff Shrager: Actually, I'm going to get to that exact question. >>: The other is, [indiscernible] ever hit these proceedings? >> Jeff Shrager: I have no idea what would happen. My guess is that they're immune to it by some sort of signed-off participation, but I don't know. No one is really immune to a lawsuit, but I've never heard of a tumor board being disbanded because of a lawsuit. It may have happened, though. Your point about the military is interesting for a bunch of reasons. One is that DARPA is partially funding this. Another is that when I showed the pictures of the tumor boards, I didn't say this, but one of the pictures was actually not a tumor board; it was the Afghanistan war room. They look exactly the same, to your point. I'm going to actually talk about some technology. If you think of them as engineering teams or, to your point, essentially a war room process, it's exactly the same kind of thing, while people are dying either in the field or in the clinic. That's the thing that makes it more interesting: you've actually got lives on the line. It's not some random engineering support for building a bridge someday in the future. Somebody could be dying in that particular spot. I'm going to talk very briefly about some approaches. Each of these is partly implemented in certain ways. The first is this thing we call EPOCH, which is the tumor board case base. The idea here is that we want to capture the reasoning that goes on in the tumor board and use that to help reduce the dimensionality of this problem, basically give you hints. We're focusing on the broad-based experience. What we're doing here is capturing the reasoning. Remember, you get pro and con reasoning, so essentially this is just a repeat of what you get: you'll see the genomic aberration history, blah blah, case summary, personalized decisions, and the hypotheses and the rationales for those hypotheses. The most important part of this is the rationales for the hypotheses and the rationales for the rejected hypotheses. We capture these things in the obvious way, and what we're starting to do now is take it and code the relational meta-knowledge out of it. Currently this is manually meta-coded; you can tell by the x's and stuff left out, but basically I manually coded this page. The idea is, I believe that the technology out there can figure out that Poly-ADP ribose polymerase is PARP; that kind of natural language technology is kind of a done deal. What's not a done deal is the relations: this argument supports that; this is a fact which came out of a database and supports this; this contradicts that. That's the kind of meta-argument that we're manually coding now and hope to be able to semi-automatically code by training on the manual coding. That gives you the explanation structures. Here's just another piece of it. These references support, whatever it is, I'd have to read it, that fact; they pulled this fact out of those references. I believe existing technology can code what that fact is. What it can't code is the meta-structure of it, partly because this was written out, but often it's just spoken in a conversation, so you're not actually looking at something that somebody wrote at the time. 
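A sketch of what that relational meta-coding might look like as data; the relation names and example strings are assumptions for illustration, not EPOCH's actual representation.

```python
# Sketch of relational meta-knowledge: nodes are facts, references, and
# hypotheses; typed edges are the argument relations described above.
from enum import Enum

class Rel(Enum):
    SUPPORTS = "supports"
    CONTRADICTS = "contradicts"
    CITES = "cites"        # the fact was pulled out of these references

edges = [
    ("fact:PARP-inhibitor sensitivity in BRCA-mutant tumors",
     Rel.SUPPORTS,
     "hypothesis:try a PARP inhibitor"),
    ("fact:PARP-inhibitor sensitivity in BRCA-mutant tumors",
     Rel.CITES,
     "ref:some-paper"),    # placeholder reference id
]

def arguments_for(hypothesis, graph):
    """Everything coded as supporting a given hypothesis."""
    return [src for src, rel, dst in graph
            if rel is Rel.SUPPORTS and dst == hypothesis]

print(arguments_for("hypothesis:try a PARP inhibitor", edges))
```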
Essentially, what we're doing is gathering this from tumor boards, semi-automatically and hopefully someday automatically coding it, and that could be used for a bunch of different things. For physicians, and physicians want this: they want to basically put their cases in, get similar cases out, and start using them as evidence for their own reasoning and as reasoning examples. For payers, you get advance intelligence on which standard therapies are being considered; these tumor boards are the thought leaders. For pharma it's similar: you're basically getting intelligence about how things are being considered by the thought leaders, in combination with other things. For hospital systems, presumably, it's that you can make the argument to payers better this way. These are just kind of technical benefits; the first one is not technical, that one is for the patients. >>: What would be the approximate volume that you could conceivably get? How many tumor boards… >> Jeff Shrager: Excellent question. As of two years ago the number was zero. As of now, the number of things that actually call themselves molecular tumor boards is probably 15. The number of tumor boards that now get molecular data and have to cope with it is essentially 100 percent, because they're getting molecular data, some from their pathologists. Sometimes patients walk in with a Foundation Medicine report and say, I went and did this thing, I've got a report, use it. What generally happens, if it's not the University of Washington, is they call up somebody in some other field and they say, help, we've got this pile of genomic data and we don't know what to make of it. The number is rising quite rapidly. You had another question. >>: What is a virtual molecular tumor board? >> Jeff Shrager: It's just one where they're not meeting. It can be virtual either in time or in space; usually it's both. What happens in practice is a couple of doctors will meet in a local tumor board, a proximal tumor board, and they will say, we don't know what the hell to do with this, and so they will call somebody, or they'll send an e-mail to somebody, describe the case, send the FM report to them, and then get some guidance back, get some information back. >>: [indiscernible] >> Jeff Shrager: They do. Very good point. There was an arrow for that, but I didn't make it clear. They very often do postmortems; that's the wrong word, but they almost always get information back about the outcomes, and that's very different from normal electronic health records. With a normal health record, if the person comes back for more treatment you'll see that; if the person doesn't, you have no idea what happened. They could have died, or gone away, or gone to another hospital, or whatever. The tumor boards do follow up themselves, so an admin for a tumor board will call the patient or the patient's family up and bring that information back to the tumor board. One thing you could do with this data is the kind of usual clustering, pattern-matching cohort analysis, and that's a fairly straightforward thing to do; I think we actually have existing learning tools that could do that in a simple way. But that's not actually going to get you much distance over an eleventy-zillion-wide problem. Eleventy-zillion is a technical term. Another approach, which used to be called explanation-based generalization and now is called causal Bayesian networks, is essentially to use the explanation. 
Notice that an explanation is going to walk from a set of observations through some proof, that is to say through some inference (proof may be too strong), to some decision. So you can use this trace to tell you which pieces of the space are related to which other pieces of the space, essentially. That used to be called EBL or EBG, and these days it's essentially the same thing as a causal Bayesian net process, done statistically. Go ahead? >>: If we do the math, you've got a few thousand instances of molecular tumor board reports a year at most? It still seems very, very sparse. >> Jeff Shrager: It's very, very sparse, but it's way less sparse than if you didn't have the knowledge coming from the tumor boards, because you've got the exact same number of cases but a lot less guidance. You could be misguided; that would be the only downside. In other words, they could be going someplace that's actually in the wrong part of the space, which is possible, but hopefully you would learn that in the process. >>: You can only generalize a limited part of the space, nearby the cases and the issues that they have. >> Jeff Shrager: Absolutely, but they are all the hardest cases, so that's what it really comes down to. Almost all of the hardest cancer cases end up at a tumor board or the equivalent of a tumor board. The easy cases are easy. So basically this is, if nothing else, a way of throwing away the data you don't care about. You don't want to look at every bit of the EHR; you don't want to look at everybody's records, because that has been dealt with, essentially. I agree with you: you're not going to solve the problem entirely. It's an additional source of data, or source of guidance, essentially. The second approach here is a company that I cofounded called CollabRx. What we did was essentially use experts to build an expert system, and I'll show you what it does. The experts built a model just like the MMMP folks built a model, but the model here was more explicit. They had both a molecular model and a model in the sense of a set of hypotheses, a set of clusters, functional phenotypes. And there was a tool: you could put your stuff in and it would give you guidance. The reason I'm going through this is that I'm going to go back to a picture. These models were open; there was actually an API to them, so you could actually go to the models, and you still can, as far as I know. And they were published, so peer-reviewed, at least at an instant in time, however a model is peer-reviewed in general. But the interesting thing was our answer to Watson, which was this thing we called Norman. Notice this guy's face. This is Ravi Salgia. He is one of the top lung cancer guys in the world. He was willing to put his face on this app for no money. The interesting thing about that is it's not just him; there's actually a bunch of thought-leading lung cancer guys and girls in this particular set of people. What we were doing initially was taking advantage of this virtual tumor board, if you will. But we wanted to do that in a semi-automated way, so that's where our version of Watson, which we called Norman, came in. Any old-time Star Trek folks will recognize this; if you don't, look it up. Norman was the coordinator. What happens here is we would pull down the usual suspects from databases, and we would pull the data of what was actually happening in the apps, which are the actual instances that you're seeing, so those are the cases walking in the door. 
And Norman would essentially look in there, look at our models, and his job was to update the models. But the way it worked was not automatic. What it would do is create a case, write it out using a template that said a patient presented with blah, blah, blah, depending on what Norman's hypothesis for updating the model was, and send it to these guys. There was actually human intervention in the e-mail part of it, but basically the idea is that you would write a case out and give it to a doctor and say, what would you do with this particular case, and we would give ranked hypotheses. And they would say, yeah, that's fine, or, no, that's not good because of such and such. The idea was to wrap the experts into the process. We weren't just reading the databases and we weren't just reading the knowledge bases; we were also winding the experts into the update process. You might argue that was partly because we didn't have the technology to do it automatically; I don't think the technology can exist to do it automatically at this point in time, if for no other reason than dimensionality arguments. Finally, and this gets to your point about the military quite directly, I'm going to talk about this thing called -- actually, there's another finally, but it's just one slide. Remember, you've got this nexus of experts and they're not always in the same room. How can temporal and spatial virtual MTBs keep track of what's going on? There's an approach that the intelligence community uses called ACH, analysis of competing hypotheses, and this is taught to intelligence analysts as a way to analyze political intelligence. What they do, basically, is they have some set of hypotheses and some set of evidence. They make a matrix, and they are trained to fill in every cell as to whether the evidence is concordant with the hypothesis, discordant, or neutral with respect to the hypothesis. That is just the intelligence technology; it's just a spreadsheet, essentially. You might say, what if you have a bunch of different people doing this, or different teams at different times and places? One approach, which I call the Google Wave of collaborative decision-making, is to have everybody pile onto the same matrix: you take however many analysts there are and build a huge matrix that has like hundreds of hypotheses. This is usually like bomb Iraq, don't bomb Iraq; at the top level it's these gross hypotheses. Every piece of information you have goes in, and there are like 10 analysts trying to make consensus out of this. This is a terrible idea, but I think it is actually being used for military collaborative intelligence analysis. Another idea, which arose from an ARDA project, was this thing called a Bayesian community. The ACH idea is very powerful, because these matrices are mind-sized as long as they don't get out of control. One analyst has one matrix, or maybe a small group of analysts has one matrix; they are really working their problem. Think of one tumor board working on one patient, or one scientific group working on one drug. They would work on their problem, and the trick is how you get them to interact, without necessarily knowing that they are interacting, in some organized way. What you do is you give them each their own ACH matrix, you don't make them all pile into the same matrix, and then you wire them together in the background. Essentially, the output wires from one become the input wires to the next, where relevant. 
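Here is a minimal sketch of that wiring, under assumptions: the naive cell-counting score and the propagation interface are invented for illustration, not the ARDA project's actual design.

```python
# Sketch of wired-together ACH matrices: each team owns one mind-sized matrix
# (hypotheses x evidence, cells concordant/discordant/neutral), and one
# matrix's current conclusion feeds another matrix as an evidence row.

C, D, N = 1, -1, 0   # concordant, discordant, neutral

class ACHMatrix:
    def __init__(self, name, hypotheses):
        self.name = name
        self.hypotheses = hypotheses
        self.cells = {}            # (evidence, hypothesis) -> C/D/N
        self.subscribers = []      # downstream (matrix, hypothesis, value)

    def set_cell(self, evidence, hypothesis, value):
        self.cells[(evidence, hypothesis)] = value
        self.propagate()

    def best_hypothesis(self):
        def score(h):
            return sum(v for (e, hh), v in self.cells.items() if hh == h)
        return max(self.hypotheses, key=score)

    def propagate(self):
        # Output wire becomes a downstream input wire. Assumes the wiring
        # forms a DAG; a cycle would recurse forever.
        for downstream, hypothesis, value in self.subscribers:
            downstream.set_cell(f"{self.name}:{self.best_hypothesis()}",
                                hypothesis, value)

assays = ACHMatrix("assay-team", ["drug inhibits gene", "no effect"])
board = ACHMatrix("tumor-board", ["use drug X", "use drug Y"])
# If the assay team concludes "drug inhibits gene", that favors "use drug X".
assays.subscribers.append((board, "use drug X", C))
assays.set_cell("replicated knockdown result", "drug inhibits gene", C)
print(board.best_hypothesis())
```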
For example, these folks are a tumor board, this is some set of microarray analyses about some particular drug's impact on a gene, and this is something about personal biomarker observations. To the extent that these guys need this information, they'll draw it from something which has evidence sitting under it. The evidence sitting under it is being worked on by a team just working on that problem. If these guys update their results, it will bubble through the network (hopefully it's a DAG; if not, I'm sure it can be dealt with), and it could change the ranking of treatments at the end. So this was an idea. It was implemented. We can't tell whether the intelligence community ever used it, because they wouldn't tell us. Essentially, it draws on this idea that the ACH matrices are mind-sized, and that works very well for the intelligence community. So you can imagine a group, teams at different levels of science, working on a problem, and at the end you are trying to treat a patient, but all this data is coming in at the bottom, all the way down to sensors if you like. There's a lot more to say about that, but I won't go into it. The last thing, just very briefly, is Global Cumulative Treatment Analysis. As I implied there, it's not just one tumor board, and, to your point about how many tumor boards there are, there aren't a huge number, but there are enough that it's not one. Since you've got multiple tumor boards, you don't want them all doing the same either smart or stupid thing in parallel. You basically don't want this to operate as a trivially parallelizable system, which is what it is now doing, because the only way to connect them, if you don't have the Bayes community model, is through the literature, which is incredibly slow. There's no high-throughput communication; there's no Hadoop. An interacting-machine communication system would be a better model. Essentially, it's hundreds of these tumor boards operating, all seeing similar patients. So the idea of Global Cumulative Treatment Analysis is that, in the normal case (the green is just the normal case), someone shows up, you have a treatment hypothesis, and if there's a best choice then they do it. If there's no acceptable choice at all, then you basically have to decide. But if there are equal choices, if you don't have enough statistical strength to make a choice, essentially what you do is the equivalent of the exploration/exploitation tradeoff from reinforcement learning: you say, what's the best choice to be given to this patient for learning purposes. But you have to be watching the entire community in order to do this properly. You might say this is nuts; you would need a giant connected computer system watching all of the different patients coming in all over the world to do this. Does anyone think this is not nuts? >>: You could just choose randomly and it would be pretty close. [indiscernible] less efficiently. >> Jeff Shrager: Yeah, but remember you have very few patients. You would exploit it less efficiently, but let's say you wanted to -- I don't know, but somebody apparently thinks that this is a good idea, because there is a giant connected computer system called the VA. And the VA actually does this. This is called a point-of-care trial, and what they do is a patient comes in, and the patient and the doctor are given choice A or choice B. If they have a preference, the preference gets given. 
If they don't have a preference, normally it would be given randomly, but what happens in the VA system is the computer chooses. And then there's a whole series. I have a paper on this if you want to see the theoretical reasons for it, and you are right: in many circumstances it would be good enough to give randomly, but if you have a very limited pool to choose from and the dynamics are such, then you really want to take advantage of -- >>: [indiscernible] randomly. >> Jeff Shrager: Yeah, exactly. In cancer that's the situation. You've got a huge dimensionality problem, and the n is, to your point, very, very small, so you don't want to replicate experiments if you don't have to. >>: The problem is the fact that with cancer, the dimensionality is such that what they can offer them is just one. [indiscernible] >> Jeff Shrager: That's exactly right, fair enough. Anyway, this is called Global Cumulative Treatment Analysis. This is a horrible picture; I have no idea how to depict what you want to do here, but basically you want to somehow have these guys interacting over making choices in a sensible way, then recording in more or less real time the relationships between the choices, and getting them to do different things to search the space. >>: [indiscernible] two options. One is you just pick a random treatment. Two is you send them to a molecular tumor board, they make a decision, and you record the reasoning. >> Jeff Shrager: No, the order is the other way around, if I understand what you're asking, which I may not quite. What happens is a molecular tumor board will often try to find a trial for the patient. Once the patient goes into a trial, then the treatment is chosen randomly in the trial, usually. If it's an adaptive trial, which is essentially what a point-of-care trial is, though that's a local adaptive trial, then what happens is there is some calculation on which arm to put the patient in. If you are interested in the statistics of science, the coolest place to look right now is adaptive clinical trials, because really smart people are trying to figure out exactly how to use this information in the important way; they have plenty of tools themselves, and they essentially run simulations all the time. I think the order is the other way around. This Global Cumulative Treatment Analysis is trying to envision what would happen if you could coordinate over the tumor boards and have the tumor boards also inter-coordinated through something like a Bayes community model. We are trying to do all of that stuff simultaneously. And there you have it. Questions, comments? Other than the ones you have made. Thank you for your attention. [applause] >>: Any other questions? >> Jeff Shrager: The basic message is don't get cancer. [laughter] >>: Cancer researchers recommend not getting cancer. >> Jeff Shrager: Yes, exactly, I think they all agree on that one. Really, from a statistical standpoint it's a horribly difficult problem, in fact essentially intractable. We're picking at the bits and pieces of information we can get; we really don't want to lose information and we don't want to waste subjects, essentially. It's not clear random would be -- by the way, to your point about random, you're right that it would be close if they went random, but they are not even doing random; what they are doing is channelizing. 
So what happens is they all read the last published report and they all do thing A. That is the worst possible version of it; that is what you really want to avoid. The trial system which I was describing to you does randomize, in the trivial way; the adaptive ones randomize in a slightly smarter way. The vision is to do a grand version of the VA's point-of-care trial where you are actually randomizing and the computer is taking advantage of what's going on. That's very complicated, because the dynamics of cancer are so horrible that -- cancer in general is horrible, but the dynamics are very poor in the sense that you don't get real outcome data for a long time, even if you're looking at local results like tumor load. What will often happen is, not often, but there are cases where -- suppose you're treated here at time zero, and you have two different treatments, and one of the treatments seems to be doing much better than the other treatment. If you start taking the data away from -- sorry, the other way around. If you start taking the patients away from the one that's doing poorly and give them to the one that is doing well, you're taking away the statistical power of this observation. There are cases where they cross over again, but you won't see it, because you've taken all the statistical power away from this case. So the whole adaptive trial thing is very, very interesting and complicated. And fraught with ethical problems, yes, absolutely. >>: You mentioned this point-of-care trial in the VA. The question always for a researcher is opportunities to engage outside. Like, how do they figure out the [indiscernible]? >> Jeff Shrager: The primary paper is by Fiore and Lavori, or Lavori and Fiore; it's like Phil Lavori and Lou Fiore, or Phil Fiore and Lou Lavori. Anyway, the point is, if you look up Fiore and Lavori: Lavori, or maybe it's Fiore, is at Stanford in public health, and we've been talking to him about this thing. He designed that particular study, but there are folks designing adaptive trials; there are many adaptive trials being designed, lots of interesting work in designing trials that take as much advantage of the information as possible. In the talk at DARPA I called it robot science for real. People talk about a robot in a laboratory running science; well, here you want to basically run the entire medical community as though you had control over it and were actually making decisions in a sensible way with respect to the statistics. That's very hard to do. The VA can do it because they have a giant connected computer system and they control all decisions, to the extent they can; I mean, patients really control decisions. They also have access to all of the data. Go ahead? >>: What happens if some widow or widower says, why did he die? He chose this treatment because you wanted to improve the statistics. >> Jeff Shrager: That's the way trials work now, and that, I'm sure, has happened all the time, but that's an ethical issue that trials have faced forever. I don't think that this is any worse than that. Anyway, any other questions? Thank you.