>> Krysta Svore: Today the core group welcomes Robin Kothari who's here to speak with us. He is from the University of Waterloo. His advisers are Andrew Childs and John Watrous, and Robin will actually be joining MIT as a postdoc starting in the fall. Today he is here to talk to us about exponential improvement in precision for simulating sparse Hamiltonians. So, thank you Robin for coming and we will turn it over to you. >> Robin Kothari: Thanks, thanks for the introduction. Okay, so I am going to be talking about Hamiltonian simulation algorithms, algorithms that have an exponential improvement in precision. So this is joint work with Dominic W. Berry, Andrew M. Childs, Richard Cleve, and Rolando D. Somma. Please feel free to ask questions at any time and interrupt me if you want to; I don't mind getting side-tracked talking about anything you want to talk about. Okay, so first I am going to just start with giving you a summary of what I am going to talk about for the next hour maybe. So the main result is that we have this new algorithm for simulating sparse Hamiltonians. And the way the algorithm works is we reduce the problem to another problem, which is the problem of simulating fractional queries. So if you don't know what any of these things mean, that's fine, that's what the rest of my talk is going to be about. But, this is just like a big picture overview for those of you who know what's going on. And the new thing is we have this new reduction to simulating fractional queries and then we have a new algorithm for simulating fractional queries. And this new algorithm uses a technique that we are calling oblivious amplitude amplification. So the first part of my talk is just going to be about the Hamiltonian simulation problem. So what I am going to do is I am going to explain what the problem is, why we care about the problem, what's known about the problem and what we have done about this problem. Okay, so let me get into it. Are there any questions at this point, probably not? Okay, so what is the problem that we care about? So consider the task of simulating physical systems. So what this means is that you just have a description of a system, I tell you what it looks like right now and you have to predict what it's going to look like after five minutes or something like that. So this is a very basic fundamental problem in physics. I mean you can almost say that this is what it means to understand a system. Like if you say you understand the laws that govern a system then you need to be able to predict what's going to happen after some time. So a classical example is I give you the description of say "N" bodies under gravitational force, and I tell you where they are right now and how fast each of these bodies is moving and you need to tell me what they are going to look like after a couple of minutes. So that is just a general example and similarly a quantum example would be there are "n" qubits, there is some Hamiltonian that governs the time evolution and I tell you what the current state of the system is and I ask you what's the final state after some time "T". And if the Hamiltonian is time independent we know how to solve that, like explicitly as an equation, and the final wave function is just e to the minus iHT times the initial wave function. So more formally the Hamiltonian simulation problem is this general problem, which is specifically for quantum systems.
And the problem is you are given the Hamiltonian "H", which is just a Hermitian matrix, it's a complex Hermitian matrix of size N x N, you are given a time T and what you need to do is give me a unitary that does e to the minus iHT, and we allow some error, which is going to be epsilon, that's the error parameter. Now I am not going to talk more about what I mean by the error, but you can just take any convenient version of this that seems reasonable to you. For example the unitary that you implement should be close to the actual unitary e to the minus iHT under some suitable norm, for example the diamond norm or anything you like. Or for example the states output by this unitary and the ideal unitary should be close in L2 distance; they should be epsilon close in L2 distance, that's another reasonable way of thinking about what it means for two operators to be close. Okay, yeah, so are there any questions about the problem or the general setup? Okay, so this is what we care about and why do we care about this problem? That's always a good question to ask right after someone has defined a problem. So the first thing I can say is that this was the original motivation for building quantum computers. And Feynman originally said something to this effect: that we want to simulate the dynamics of a quantum system, but the best way we know how to do this right now is by a classical algorithm that's very inefficient, that takes exponential time to simulate the system. So that's bad and what that means is that we don't know how to get simulations of systems that we care about because they are just really large and we can't solve them on today's computers. And as a result of this a good fraction of today's computing power is actually devoted to solving these problems approximately in practice. And people in a diverse number of fields like quantum chemistry, material science, etc. care about understanding small quantum systems and they are not able to do this to the accuracy or maybe to the number of particles that they would like just because they are being limited by the computational power. So if we had efficient quantum algorithms for this and if we had a quantum computer then these two together would give you a lot of, well, these guys would be really happy. So we don't really have a quantum computer, but it's good to just get working on the algorithms and get one that's as efficient as possible. So that's the usual motivation for studying this problem. But there is also another motivation that was interesting to me as someone who works in quantum algorithms, which is that you can use this as a subroutine. I mean every time you solve a problem you can now reduce to this problem, like that's just a general fact about solving problems. But, specifically Hamiltonian simulation turns out to be a pretty useful subroutine. So for example you can use it to implement continuous time quantum walks and it's done in this paper where they show this exponential separation between classical and quantum query complexity using a glued trees graph that you need to traverse. And another example, or another set of examples, comes from using Hamiltonian simulation to --. So the last paper that's cited is this paper for solving linear systems of equations that uses Hamiltonian simulation as a subroutine.
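Coming back to the problem definition for a moment: here is a minimal numpy sketch of the brute-force approach, diagonalizing an explicit N x N Hamiltonian, together with one convenient choice of error measure (spectral-norm distance). The random Hamiltonian and the truncated-Taylor stand-in for an approximate simulator are illustrative assumptions, not anything from the talk.

```python
import numpy as np
from math import factorial

rng = np.random.default_rng(1)
N = 8
A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
H = (A + A.conj().T) / 2          # a random Hermitian "Hamiltonian"
H /= np.linalg.norm(H, 2)         # normalized so ||H|| = 1
t = 1.3

# Simulation by diagonalization: H = P diag(w) P^dagger, so e^{-iHt} is easy
# to write down -- but this costs time poly(N), i.e. exponential in log N qubits.
w, P = np.linalg.eigh(H)
U_exact = P @ np.diag(np.exp(-1j * w * t)) @ P.conj().T

# Some approximate simulator (here just a truncated Taylor series, as a stand-in):
U_approx = sum(np.linalg.matrix_power(-1j * t * H, k) / factorial(k)
               for k in range(12))

# One convenient notion of the error epsilon: spectral-norm distance.
print(np.linalg.norm(U_exact - U_approx, ord=2))   # tiny here
```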
The first paper on the last line is by Farhi, Goldstone and Gutmann and it's to solve this query problem called the NAND tree problem: you need to compute the value of a function that's recursively defined in terms of NANDs of some Boolean variables, and that's also solved by doing a continuous time quantum walk, which is essentially a Hamiltonian simulation problem. So I guess what I am trying to say is that Hamiltonian simulation is also useful as a subroutine for designing quantum algorithms. Okay, so now what I am going to talk about is, as I have said a couple of times, there are no known efficient classical algorithms for Hamiltonian simulation, but we have quantum algorithms that are good. So it's a good question to ask: what does efficient mean? Like this has to be made precise at some point: what is efficient? Of course there are classical algorithms to solve the problem if you don't require efficiency like this. There are always classical algorithms to solve everything that quantum computers can solve if you don't insist on efficiency. The main advantages are in how much time, and space, and whatever resources are needed. So as a computer scientist the first phrase is kind of obvious: "efficient is polynomial time". That's always the answer to what is efficient, but polynomial in what? There are a bunch of parameters in this problem so it's not completely straightforward what you might mean by polynomial time. So here I am going to say that polynomial time is polynomial in the size of the system. So for example if the Hamiltonian is an N by N matrix the number of qubits it acts on is log N. So you want it to be polynomial in log N, not polynomial in N. It's fairly straightforward to do Hamiltonian simulation if you don't mind a running time that's polynomial in N, because it's just an N by N matrix. You need to exponentiate it and that's easy, you just diagonalize it and exponentiate it. So the challenge is to do this in poly log N. And then you need some scaling that depends on how long you want to evolve for, because of course if you want to predict the state of a system after a long time then you should be allowed to have more time, that just makes sense. So some dependence on T and that would be fine. And then there is this odd dependence on the norm of the Hamiltonian and at first sight it's like, "Why is this weird quantity entering the picture?" But, the reason for that is that time is not uniquely defined by itself. So the thing that you want to simulate is this unitary which is e to the minus iHT. So H and T appear as a product. So you can always scale down the Hamiltonian by a factor of 2 and scale up the time by a factor of 2, or maybe do it the other way around. This would allow you to cheat if you just had some dependence on T. So somehow it should go like the product of the two of these things. So you can take any norm of the Hamiltonian you like, but there has to be some way of normalizing the Hamiltonian so you don't just put all the time into the Hamiltonian and then just evolve a very large norm Hamiltonian for unit time; that wouldn't make sense. And the last parameter that I am not going to say too much about now is, "How should it scale with epsilon?" So epsilon is this accuracy threshold where I want to be within epsilon of the right map.
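A quick numeric check of the normalization point above, that only the product of the norm of H and t can matter (toy 4 x 4 Hamiltonian, purely illustrative):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
H = (A + A.T) / 2                 # toy real symmetric Hamiltonian
t = 2.0

# e^{-i(cH)(t/c)} = e^{-iHt}: rescaling H down and t up changes nothing,
# so only the product ||H|| * t can appear in an honest complexity bound.
for c in (0.5, 2.0, 10.0):
    assert np.allclose(expm(-1j * (c * H) * (t / c)), expm(-1j * H * t))
print("only ||H|| * t matters")
```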
And maybe it's not completely apparent a priori how things should scale with epsilon and maybe a polynomial in 1 over epsilon would be fine, but if you can get poly log 1 over epsilon that would be even better. And what we do in this work is we improve the scaling with epsilon and we actually get it down to poly log 1 over epsilon, as compared to previous simulations that went like a polynomial in 1 over epsilon. Okay, so, yeah? >>: Didn't the previous work of [indiscernible] also get a log 1 over epsilon? >> Robin Kothari: Yes, so our work is now merged with their work, in the sense that ours was previous work and theirs was stuff before that. So we decided to merge our papers and that paper is now considered superseded by our paper. So, yeah, right, any other questions? Yeah, so another interesting question, or like one of the things that you might think about first, is: can you simulate all Hamiltonians efficiently? And if you think about this for a while, this is not possible, even on a quantum computer. And maybe it's not obvious to see why on a quantum computer, but the analogous classical question would be: can you compute all functions with polynomial size circuits? And that's just not possible. There are just too many functions and too few circuits, like you can just do a simple counting argument to convince yourself that all functions do not have polynomial size circuits. And it's the same thing quantumly, like you cannot simulate all Hamiltonians in polynomial time, that's just not possible. So all you can hope for is to be able to simulate some restricted classes of Hamiltonians. And so what Hamiltonians should we think about? And I guess going back to our motivation one good class to study would be the classes that actually arise in practice, like that makes sense. And Hamiltonians that arise in practice are often local Hamiltonians; that's a widely studied class of Hamiltonians that arise in practical applications. But, more generally you can think of a class of Hamiltonians called sparse Hamiltonians, that I will define on the next slide, which is a generalization of local Hamiltonians and effectively captures almost all the Hamiltonians you would want to simulate on a quantum computer, especially from practical applications. I am not aware of any application that needs you to go beyond this model of simulating Hamiltonians. And lastly I would just like to mention that this problem of simulating local Hamiltonians or even sparse Hamiltonians is BQP hard. And what this means is that it's the hardest problem that can be solved by a quantum computer. So in other words if a classical computer could solve the Hamiltonian simulation problem then every problem that can be solved by a quantum computer can also be solved by a classical computer. So this is like the hardest problem and if you can solve this then quantum computers don't do anything for you at all. In other words there would be classical algorithms for like factoring, etc. So this also makes this problem interesting. I guess it's somehow truly representative of the class of problems that can be solved on a quantum computer. >>: What's the original reduction in [indiscernible]? Is it done under classical reductions or quantum reductions? >> Robin Kothari: Uh, yeah. >>: [indiscernible]? >> Robin Kothari: Right, yeah, so it's classical reductions, but I would have to define a decision version of this problem more specifically. I mean right now I have just defined it as a problem of producing the final state.
That's of course not a problem that a classical computer could ever solve, because it cannot produce a quantum superposition for you, but you would have to find an appropriate decision version of this. And that could be something like I give you the initial state of the Hamiltonian and what you need to do is, for the final state, maybe sample from the final state's probability distribution. But even a simpler one: just tell me, say you're promised the first qubit has a very high probability of measuring 1 or 0 and you just need to decide which one. And then that's a decision problem now and this decision problem you can classically reduce to --. I mean you can show that if this decision problem has a classical algorithm then all quantum algorithms can be done classically. >>: Can you start with another problem that's not [indiscernible] hard. Say Jones polynomial, could you convert it into an instance of Hamiltonian simulation? >> Robin Kothari: Right, yes, you can do that. In fact it's a local Hamiltonian and I think it's a 4 local Hamiltonian that's the cleanest, easiest construction I know. And that reduction will be fully classical. So I will take your instance of Jones polynomial and I will spit out a local Hamiltonian, which I will write down on a piece of paper for you. So, yeah, that's the sense in which it's hard and it's a complete problem I guess in that sense. Yeah, are there any other questions about this? >>: So finding a ground state, is that also hard? >> Robin Kothari: No, no, so finding a ground state is a really hard problem. It's funny you asked that question. I am going to get to that in two slides. That's a question that's often confused with Hamiltonian simulation, but they are actually completely different problems. So I will get to that in two slides, but before I get to that let me just tell you what local and sparse Hamiltonians are more formally and then talk about how the input of the problem is specified. So what is the input of the problem first? It's a Hamiltonian, like an N by N matrix, a time, and an epsilon. So let's start with local Hamiltonians. So this is something that a lot of people are usually familiar with. So a local Hamiltonian is just a Hamiltonian that's a sum of terms that each act nontrivially only on a constant number of qubits. So for example a 3 local Hamiltonian is a Hamiltonian that's a sum of terms and each term just involves 3 qubits out of the log N total qubits that you have. So how would you specify a local Hamiltonian to me? You just write it down on a piece of paper for me. So, each term only acts on say 3 qubits, so for each triplet of qubits you can just tell me what the local term is. That's a polynomial sized description, it's not too long, and so that's just the input setup. The local Hamiltonian case is pretty easy to deal with. The input representation problem comes about when you talk about sparse Hamiltonians. So what's a sparse Hamiltonian? So recall that the Hamiltonian is an N by N matrix so in principle it can have N squared nonzero entries. And in each row or column it could have up to N nonzero entries. So we say a Hamiltonian is sparse if it has only poly log N nonzero entries in each row. And by poly log N I mean just some polynomial in log N, like say log N squared or log N to the 4 or something. So this is drastically fewer nonzero entries than is potentially possible. So it's really sparse, it's like it's almost all 0 except a couple of entries.
And even though it's sparse in this sense the matrix itself can have exponentially many nonzero entries; for example the identity matrix is a very, very sparse matrix. It has only one nonzero entry per column, but still if you had to describe the identity matrix by listing everything out there are exponentially many ones, because the matrix is N by N. So if you want me to simulate a sparse Hamiltonian for you, you cannot just write this down on a piece of paper for me, because it's going to take me exponential time to read this piece of paper and then you cannot expect me to run in polynomial time. That just wouldn't make any sense. So you need to have some kind of succinct description of this Hamiltonian. And what often happens in practice, or in all the cases of sparse Hamiltonian simulation I know, is that you have the following kind of succinct description, which is: if I tell you the row number and I ask you for a particular nonzero index you can compute this pretty efficiently. So I can tell you, "Hey, what's the fifth nonzero entry of the eighth row?" And you have some efficient algorithm that spits this out. So for example if the Hamiltonian is local then you can come up with a polynomial time algorithm that does this kind of thing. So this is what we call an efficiently row-computable Hamiltonian, which means you can just compute the nonzero entries in any row efficiently. And what we do in this sparse Hamiltonian simulation model is we assume we have been given a black box that just does this for you and we think of complexity in terms of the number of queries you need to make to this black box. And we also count the total number of gates that you will need, like the usual one and two qubit gates that you need. But, in terms of the Hamiltonian we just count the number of queries made to this black box, because we don't know how expensive this black box is to actually implement and that depends on the problem at hand. Okay, yeah, so are there any questions about local Hamiltonians, sparse Hamiltonians or how they are represented? We are good? Okay, let me summarize what we know about current Hamiltonian simulation algorithms. So as I said we are going to measure the complexity of the algorithm in terms of the number of queries made to this black box that we assumed on the last slide. And the relevant parameters are N, the size of the Hamiltonian, the time that you are evolving for, the error parameter, and D, which is going to stand for the maximum number of nonzero entries in any row. This is the thing that we assume to be poly log N, so at the end you will get a polynomial time algorithm. So the first algorithm of this kind was by Lloyd in 96 and this only worked for local Hamiltonians. And it gave that kind of scaling, so it's polynomial in D and log N and I think it's quadratic in these other parameters. This was later improved by Aharonov and Ta-Shma who extended it to sparse Hamiltonians. And they introduced the sparse Hamiltonian problem for the first time. They got better dependence on some of these parameters and they extended it to a larger class of Hamiltonians. And then after that there have been a bunch of improvements, so in the next paper by Berry, Ahokas, Cleve, and Sanders they really get down the dependence on some of the parameters.
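Going back to the black box for a moment, here is a toy sketch of what such an efficiently row-computable description can look like; the tridiagonal Hamiltonian is a made-up example, not one from the talk:

```python
# A toy black box of the kind described above: for row j and index k it returns
# the column and value of the k-th nonzero entry of row j. Hypothetical example:
# a tridiagonal (3-sparse) N x N Hamiltonian with N = 2**20, far too large to
# write down, yet each row is computable in O(1).
N = 2 ** 20

def oracle(j, k):
    """(column, value) of the k-th nonzero entry of row j."""
    cols = [c for c in (j - 1, j, j + 1) if 0 <= c < N]
    vals = {j - 1: 0.5, j: 2.0, j + 1: 0.5}   # hopping terms and a diagonal term
    return cols[k], vals[cols[k]]

print(oracle(12345, 0))   # (12344, 0.5) -- no N x N matrix is ever stored
```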
So, for example this previous one was just log N, well, it was some polynomial in log N, but here they have brought it down to log star N, which is a really, really slow growing function of N. So maybe I won't define it, but for any reasonable N that you would put in, for example the number of particles in the universe, log star of N is 6. So it's not something you should worry about much. You can almost think of that as a constant for any real application. So the thing I want you to notice is the dependence on the error parameter. It goes like 1 over epsilon to the delta for any delta greater than zero. For example you can think of delta as 0.01 or something. So the dependence on 1 over epsilon is very good, like it's a very small polynomial in 1 over epsilon, but it's still polynomial dependence. And it's the same thing in the next result, which just improved the dependence on D, the sparsity parameter, but essentially left other things unchanged. And the last one on this slide is a completely different approach, it's not related to the previous approaches, and that gets a better dependence on the degree, but at the cost of a worse dependence on epsilon. Now it goes like the square root of 1 over epsilon. So, all of these algorithms had polynomial dependence on epsilon. And this was an open question for quite a while: is this necessary, like do we need polynomial dependence on 1 over epsilon or can we get it down to poly log 1 over epsilon? And that's essentially one of the main things we are going to talk about right now, which is that our algorithm finally achieves poly log 1 over epsilon dependence. And in fact it achieves this weird function: if you just isolate the dependence on epsilon we get log of 1 over epsilon divided by log log 1 over epsilon, which kind of seems like a strange function out of nowhere, but in fact it's optimal. We also prove a matching lower bound showing that log over log log is the right dependence; even though this function looks like it's out of nowhere, it's really the right dependence with respect to epsilon. So yeah, all right, any questions about this? No, okay, so that's all I am going to say about Hamiltonian simulation, except I am going to answer Martin's question with a full blown slide. So what is the difference between the simulation problem and the problem of finding ground states? So this is something that people often ask about and sometimes people think that the simulation problem is the same as finding ground states. But they are extremely different problems morally. So take a very coarse view of life where quantum, classical and all of this stuff is the same, like don't even differentiate between quantum computers and classical computers. They are just computers that run in polynomial time. The simulation problem is an easy problem in principle. It's just mimicking the behavior of this other system and if you have enough resources you can do it. You may have a little bit of a slowdown, but in principle you can do it. Whereas the finding-a-ground-state kind of problem is a bit like: among all possible configurations that the system could have started with, what is the best one that maximizes something? It's like an optimization problem. So for example suppose you are just given a Boolean circuit and I give you an input and I ask you what the output is. So you want to mimic the behavior of this Boolean circuit.
That’s really easy, you just follow all the gates, you compute what the outputs are supposed to be. It’s a small circuit; you can do it. But, if I ask you: is there any input to this Boolean circuit that outputs 1, like that’s a satisfy ability problem. That’s a well know NP compute problem, that’s really hard. So similarly in quantum it’s like I give you a quantum circuit and I give you an input and ask you, can you do this? If you have a quantum computer yeah, sure just follow the circuit, it’s almost trivial. On the other hand finding the ground state is almost like finding something like the maximum acceptance property over all inputs or finding, like which is the input that maximized the acceptance property. So this is a really hard problem. So that’s kind of what I am trying to get across in this slide. That a simulation problem is generally easy if you have enough resources to, well if you have resources similar to the kind of system that you are trying to simulate. If you are trying to simulate a classical system and if you have classical resources it’s kind of an easy problem. But, ground state kind of problems is always really hard and in general they are NP hard if your problem is a classical problem. It could be QMA hard if it’s something about a circuit and whatever it is its going to be quite hard. Does that answer your question? >>: The definitions seem to me to be --. They have a very different flavor of those two classes, like when you say VQP hardness verses QMA hardness. One is like a semantic class in classical complexity, right. How do you even know that a problem isn’t VQP hard if your [indiscernible] definition of a language even though it’s in VQP and your reason about probability of acceptance and so on. >> Robin Kothari: Yeah, that’s right. >>: That’s different from QMA, right? >> Robin Kothari: No, QMA is also a semantic class. >>: I thought it’s a syntactic. >> Robin Kothari: No there is no syntactic definition of QMA because if I give you a QMA how do you know it has this property that has more than 2/3 accepting probability of less than 1/3. It’s the same as classical MA, like even MA has the same problem, even BPP has a problem. These are all semantic classes, so –-. >>: NP would be an example. >> Robin Kothari: NP is a syntactical class, yeah. So if you wanted to be extremely technical I should be saying promise VQP and promise QMA, because they are all hardness for the promise versions of these classes where you assume this kind of behavior. But, most people don’t care about that level of technology, so I don’t get into that. But, morally it’s hard and it’s QMA hard and in that sense it would be VQP. Any other questions about this or about Hamiltonian simulation or anything I have talked about? So that is what I would say is the intro part. I am trying to convince you that the problem is interesting and what I am thinking about is interesting and why I care about the problem. It looks good? Okay, so this is a summary of the first part of my talk. I talked about Hamiltonian simulation and I told you that I have this algorithm that scales like this where [indiscernible] is D squared times the norm of the Hamiltonian times T and it has this kind of nice dependence on error. I haven’t told you how this algorithm comes about; I have just stated the result. In the second half of my talk I am going to try to explain to you how the algorithm comes about. And to do that I am going to start with something called the fractional query model. 
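Just to quantify that nice dependence on error: a quick numeric comparison of the square-root-of-1-over-epsilon scaling mentioned above against the new log-over-log-log scaling. Only the two expressions are evaluated; constant factors from the actual algorithms are ignored.

```python
import numpy as np

# Query-count scaling in the error parameter alone: the quantum-walk style
# sqrt(1/eps) versus the new log(1/eps)/loglog(1/eps).
for eps in (1e-2, 1e-6, 1e-12):
    walk = np.sqrt(1 / eps)
    new = np.log(1 / eps) / np.log(np.log(1 / eps))
    print(f"eps = {eps:g}: sqrt(1/eps) = {walk:.0f}, log/loglog = {new:.1f}")
```

For epsilon of 1e-12 this is about a million queries versus about eight, which is the sense in which the improvement in precision is exponential.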
So if you have never heard of it that's fine. You probably shouldn't have heard of it. It's just a really exotic model to study. So I am going to describe the problem, what this model is, what we know about it, what we did, and try to prove what we have done. And when I talk about this right now it's going to be completely unrelated to Hamiltonian simulation, so just put that out of your mind for the next 20 minutes or so. And finally in the last 10 minutes or so I am going to connect up this problem and show how we reduced Hamiltonian simulation to this fractional query model. Okay, so let's talk about quantum query complexity. So this is one of my favorite topics. So quantum query complexity is just this model where you have some input and you need to compute some function of it. Classically what you have is oracle access to this input, in the sense that, so think of the input as an N bit string. And you have oracle access in the sense that you ask the oracle, "Hey, what's the fifth bit of this string" and the oracle replies, "It's a 0 or it's a 1". And quantumly it's the same thing, but now you are allowed to do this in superposition; that's where a lot of the power of quantum computing comes from. And so the standard way to represent this oracle is like this: you give the oracle two registers, i and b, where i is a number between 1 and N and b is just a bit, and what's going to happen is that it's going to put a phase up front on your state. And the phase is going to be plus or minus 1 based on the bit x_i that you are trying to learn. And so that's the standard query complexity model. And the measure of complexity is how many queries you make to this black box, or how many questions you ask the black box. So for example this is a circuit that depicts a two query algorithm. There is some unitary that's independent of the input, then you make a query, you do another unitary, you make a query and that's a two query algorithm. Now suppose I gave you the ability to make half a query. It doesn't really make sense because you are asking for a bit. What does it mean to give me half a bit? So it doesn't make any sense classically, but quantumly it does make sense, because what's happening is if the bit was 1 I was going to put a phase of minus 1. If the bit was 0 I was going to put a phase of plus 1, or basically not do anything. But, instead of putting a phase of minus 1 I can put a phase of i, and that's like doing half a query because if you do this two times you get a phase of minus 1. So 2 half queries can simulate 1 query, so that justifies calling it a half query. And so now think of this model where you are allowed to make half queries, and if you have a circuit of this kind where you have used 4 half queries I am still going to charge you two queries for it. I am going to say that your circuit made 2 queries. And then maybe generalize it to a quarter query: you can make 8 quarter queries and I am still going to count that as 2 queries, and so on. You are allowed to make arbitrary fractional queries and I am only going to count the total full queries. >>: [inaudible]. >> Robin Kothari: Sorry. >>: I went to the oracle 4 times, whether or not I got half a bit each time or I got a full bit, how does that affect the amount of work? >> Robin Kothari: Exactly, I mean it seems like this is just a crazy thing to do, like I am charging you way less --. >>: Like I just think it is 4 and 8, so why am I wrong? >> Robin Kothari: Right, right.
So I am going to convince you later that this is right. But, let's define this model, which seems like --. >>: [inaudible]. >> Robin Kothari: Right, so it means that instead of being given this gate that has a minus 1 here, you replace this definition with, instead of minus 1, the correct root of minus 1. So for half a query you put i, for a quarter of a query you put the fourth root of minus 1. So in general it will be e to the i pi times the fraction. So if you want to make an alpha fractional query it's e to the i pi alpha. That's the gate that you have been given. >>: [inaudible]. >> Robin Kothari: Right, so the only trivial observation that you can make about that gate is that if you use it the correct number of times, like if you use this gate 4 times, you get back the usual thing. So if I define this model where I use this kind of counting, as opposed to just counting the total number of times I called the oracle, it's at least as powerful as the usual model, but it seems like it's way more powerful because this model gets to do so much more. It's almost like it's cheating, but the punch line is that no, it's not more powerful. And this was shown by, let's see if I can remember all the authors: Cleve, Gottesman, Mosca, Somma, and Yonge-Mallo. So this was a STOC paper from 2009 where they showed that if you have a fractional query algorithm, so an algorithm in our crazy model that makes capital T queries counted in the way that I described, it can be simulated in the regular model, which has just normal queries, with only a slightly larger number of queries. So if you think of simulating to some constant precision it just uses something like T log T queries, which is very close to T, but the model seemed like it was way more powerful. The model seemed like it was allowed to cheat and do great things, but it turns out you can simulate it essentially with just a log factor loss. So that's really surprising, because with this model you wouldn't have thought that, but it's true. So what we do is we improve upon this result by getting a better dependence on epsilon. And this connects up with the epsilon in the Hamiltonian simulation which we are trying to improve, which is the main focus of our work I guess. So I guess that's why epsilon is the parameter where we are trying to reduce the dependence on. So what we do is we improve this scaling from this expression to this expression. The difference is only the epsilon in the denominator. Okay, so any questions about the fractional query model or what we did, because it's kind of a strange model to wrap your head around. >>: [inaudible]. >> Robin Kothari: No, I am going to explain that to you, but maybe not in great detail, but I am going to kind of explain that to you. Hopefully I can at least convey the intuition for why that's happening. Okay, so I am going to prove our stronger result. So this is the number of queries I claim that you need to simulate a fractional query algorithm that makes only T queries. So the first thing to observe, so it's broken up into two steps, the first thing you need to observe is: well, how many queries do you need to simulate a 1 query algorithm?
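First, a quick numeric sanity check of the fractional gate just defined; the hidden string x and basis ordering below are an illustrative setup, nothing more:

```python
import numpy as np

n = 4
x = np.array([1, 0, 1, 1])        # a hidden input string (hypothetical example)

def Q_frac(alpha):
    """alpha-fractional query on |i>|b>: phase exp(i*pi*alpha*b*x_i)."""
    exps = np.array([b * x[i] for i in range(n) for b in (0, 1)])
    return np.diag(np.exp(1j * np.pi * alpha * exps))

Q = Q_frac(1.0)                                    # the ordinary +/-1 phase oracle
assert np.allclose(Q_frac(0.5) @ Q_frac(0.5), Q)   # two half queries = one query
assert np.allclose(np.linalg.matrix_power(Q_frac(0.25), 4), Q)   # four quarters
```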
So I have a fractional query algorithm that makes 1 query in total, but it could be using a whole bunch of fractional queries, and I am going to convince you that it can actually be simulated using only log of 1 over epsilon by log log 1 over epsilon queries. So that's the special case where T equals one, but this special case implies the general case, because you can split up a T query algorithm into T 1-query algorithms and you choose the error bounds to be small enough in each little part so that when you combine them they are still small enough. And it's just the usual thing and you will get the right expression. So all I need to do is convince you that a single-query fractional query algorithm can be converted to a normal algorithm in the usual model that only makes this number of queries to achieve error epsilon. And so if I can prove this point number 2 then I have proved the result I have claimed. Okay, so this is a summary so far. What has happened is we have talked about Hamiltonian simulation a while ago, but now put that out of your mind. I introduced the fractional query model. I claimed that you can simulate a T query algorithm with this number of queries, and then I talked about the special case when T equals one and I argued that they are equivalent. So that part is easy, it's the sketch of the proof that I gave on the last slide. So now what I need to do is I need to convince you that this box is true. So that box is true assuming this box is true. Okay, so to prove this is true I am going to prove something even weaker. So it's going to be a chain of reductions. So what's even weaker than this is, so we are trying to simulate a 1 query algorithm and we don't know how to simulate this directly, but what we know how to do is simulate it probabilistically. Well, I like to call it probabilistically, but you can also think of it as non-deterministically. So we have a circuit that, when it succeeds, does the 1 query algorithm, and when it fails it does something bad, but we know how to fix that. So it's in the spirit of these repeat-until-success circuits. So what do I mean precisely? So what I mean is let's start with a 1 query fractional query algorithm. So what does that mean? It's some unitary V which can be written like this, where there are a whole bunch of fractional queries, like there are M fractional queries, each of which is an Mth fraction of a query. And I have taken all the fractions to be equal in this example, but it also works if they are different. And you are allowed to do arbitrary unitaries between these queries. So V is some unitary that, we say, can be implemented in the fractional query model with cost 1. And what we want to do is we want to implement V in the usual model with only log 1 over epsilon by log log 1 over epsilon queries. So this weaker thing of not being able to actually do V, but being able to probabilistically implement V, or non-deterministically implement V, is this thing that I talk about here. So think of this map U as a map that kind of implements V, or it implements V in coherent superposition with something else that you don't care about. So what it does is it takes |0> times |psi> and it maps it to V|psi>, with the first qubit telling you that this has worked, with some probability P; with probability 1 minus P it just does something else, maybe we don't know what it does.
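About the error-splitting step mentioned a moment ago: it is just the triangle inequality for products of unitaries, which a tiny numpy experiment confirms (random toy unitaries and perturbations, assumed only for illustration):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(3)
T, delta, n = 10, 1e-4, 4            # T segments, each delta-accurate

def rand_unitary(n):
    z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    q, r = np.linalg.qr(z)
    return q * (np.diagonal(r) / np.abs(np.diagonal(r)))

def perturb(u):
    a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    h = (a + a.conj().T) / 2
    h /= np.linalg.norm(h, 2)
    return u @ expm(1j * delta * h)   # a delta-sized spectral-norm error

ideal = [rand_unitary(n) for _ in range(T)]
noisy = [perturb(u) for u in ideal]

def product(us):
    out = np.eye(n, dtype=complex)
    for u in us:
        out = u @ out
    return out

total = np.linalg.norm(product(noisy) - product(ideal), 2)
print(total, "<=", T * delta)         # errors add at most linearly across segments
```

So running T segments at accuracy epsilon over T gives total error epsilon, which is why the 1-query case implies the T-query case.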
And it's important that there is a qubit that tells you that you have succeeded, otherwise you don't know when you have succeeded; and think of P as a constant. So what these guys showed is that there is really an algorithm that only takes log 1 over epsilon by log log 1 over epsilon queries and does this task for you. This is not the task that we set out to do, but it's kind of almost the task, and it's for constant P; think of P as .1 or something if you want to think of a constant. And given such a procedure as a subroutine what we want to do is we just want to get V out of it. So we want to get a circuit that just implements V for you instead of this thing that's non-deterministically implementing V. So the most straightforward thing that you would do with a circuit that implements something with a certain probability is you apply U, you measure the first qubit and you see if you got 0. If you got 0 then you are good, you have V|psi>. If you don't have 0 then you have got this state |phi>, which I haven't explained what it is, but you can work out the details and there is some description of |phi> which shows you that you haven't actually lost |psi>. Then you reverse this, get back |psi> and try to do it again. The problem is the reversal procedure is also probabilistic, because you need to run the same map again. So you have this kind of really complicated recursion relation to analyze, where with some probability you fail and then the correction procedure also fails with a certain probability, and this is really hard to analyze and you need a large number of gates to do this. So this is how they were doing this before in the previous algorithm; this is why they had this kind of bad dependence on epsilon, because to be more precise you need to bound the total number of branches that are going to fail and you want the total probability of failing to be really, really small. So we get around this problem and show that if you are given a map that does this we can just implement V|psi> for you deterministically, like no probabilities, no non-determinism, whatever. >>: Question, so does P depend on the given algorithm? >> Robin Kothari: So P is just a constant. I just didn't write it down, but it's something fixed. Like think of it as just this .1; it's independent of the algorithm. >>: But you will know it. >> Robin Kothari: I know it, yeah. Like I would know it, but it's some crazy something --. >>: Can it be computed in principle from the given description? >> Robin Kothari: Yes, yes, it's something like --. There are some sines and cosines of the fractions involved. So if all the fractional queries are 1 over M it's something like sine of 1 over M plus 1, the inverse of this, something. But, they are all basic trigonometry operations. You can compute it in principle just by looking at the V that you present to me. >>: But it depends on the U's and [indiscernible]? >> Robin Kothari: It doesn't depend on the U's, it only depends on the fractions that you use here. So if all of them are 1 over M then I can just compute the number for you and it's something simple, but if all of them are different then it's some number that depends on what each of those things are. But, it's a simple calculation; like, classically you just need to tell me the fractions in the exponent and I can tell you what P is. Okay, so let me summarize what we know now. All right. We were at this stage where this was the result I was trying to prove.
I showed you it's equivalent to this simpler result, and then I introduced this fractional query model where we are trying to probabilistically simulate this in the sense of the previous slide, where with probability P it does the right thing and with probability 1 minus P it does something wrong. And now what I want to show you is that if you have a way of doing this then you can do that efficiently, and in fact using only a constant number of uses of this circuit. So that's going to be what we call oblivious amplitude amplification. Okay, so what's the problem? The problem is that you have this unitary U, which takes |0>|psi> and maps it, with probability P, to the thing you want and with probability 1 minus P to something you don't want. So one obvious idea is, hey, let's use amplitude amplification, because that's a standard technique to increase the probability on some good subspace and decrease the probability on some bad subspace. So what does amplitude amplification do? What you would need to do is you would need to reflect about the good subspace, which in this case is the subspace that has 0 in the first register. So that's easy to do, well, it's just the Z gate on the first qubit. So that's fine, but you also need to reflect about the starting state. So amplitude amplification is a technique that takes two reflections and does them over and over again, but the other reflection is the reflection about the starting state and that's U times this thing over here. But |psi> is a state that you don't know, like this is the input state of your algorithm. You know, you are trying to simulate the behavior of a unitary on an unknown input state. You don't know |psi>, so you don't have the ability to reflect about it, like you don't even know what it is. If you measured it or did anything you would destroy the state. So you can't just use amplitude amplification, so that's the problem. So what we introduce is this thing we call oblivious amplitude amplification, and it's oblivious in the sense that you don't need to know the input state. So it works when the input state is unknown. So you're oblivious to the input state. And what we show is that given such a circuit you can indeed do exactly what amplitude amplification would do, but we don't use the same two reflections of course, because the second reflection is something we don't know how to implement. So we use a different reflection, but what we show is that the way the algorithm proceeds is that it does exactly what amplitude amplification would have done had you had the correct reflection to do it. And this uses ideas from [indiscernible], which introduced a couple of nice techniques for dealing with these kinds of things. And just as in standard amplitude amplification, if you know what P is, the success probability, then you can boost amplitude amplification to get you the right state with probability 1. So this is like in Grover's algorithm: if you know there is exactly one marked item you can find it with certainty, like it's not probabilistic anymore. And that's the general feature of amplitude amplification. If you know the probability then you can exactly get the right answer. So that's what's happening here. We know what P is; as Martin asked, I can compute what P is from the description of the circuit, and then I have this technique, oblivious amplitude amplification, that, since I know P, lets me do amplitude amplification and exactly get probability 1 here.
So what I get at the end of the day is just V|psi>, no error. And I have an exact statement of the theorem if, I don't know, someone is really interested in the technical details of what it states; well, it's a lemma. So essentially it's just that U and V are unitary matrices that have this property that U acting on |0>|psi> produces V|psi> with amplitude sine theta, and a cosine theta amount of doing the wrong thing. And just as in amplitude amplification you can define some unitary S which, when you apply it T times, increases this angle by 2 theta each time. So, you know, you start at an angle theta, you do it once you get 3 theta, you do it again you get 5 theta, and so on. So this is exactly the statement of amplitude amplification, it's just that we are using a different operator here. It's not the two reflections that you would have had in amplitude amplification, so this needs a proof that using these reflections instead of the ones that you were supposed to use actually works and still produces the same thing. So this is the heart of the technical content of this lemma. But, if you just want to use it like a black box it's just like amplitude amplification; you don't need to worry about the details. So, right, so this is --. >>: [inaudible]. >> Robin Kothari: Yes, so --. >>: [inaudible]. >> Robin Kothari: No, it's exactly the same. So the states, even the states that amplitude amplification would go through, it goes through exactly the same set of states. So it's really mimicking it, and effectively what's happening is you were supposed to use some reflection, reflection A, but we are using reflection B. And what we show is that reflections A and B are the same in the subspace in which you are operating. So from the perspective of the algorithm it doesn't know which one you used. Like did you use the one that you were supposed to, or did you use ours? Our reflection is kind of a wrong one to use, but it's the same in the subspace in which the algorithm works. So the algorithm doesn't know the difference. So that's where we are getting this power from. >>: [inaudible]. >>: If you know P as a constant. The whole point is that you know P and that's why this works. >>: You don't need to know P. >> Robin Kothari: No, you don't need to know P. >>: You just need to know P to make it deterministic. >>: Excuse me, for the deterministic part. You don't need to know P, but if you do know P it's fully deterministic. >> Robin Kothari: Right, so that's the same even in amplitude amplification; if you know P you can get it. So I guess it's a generalization in the sense that we can apply it when you don't know the input state, but it uses this very specific form for the thing that you are trying to amplitude amplify. Like, amplitude amplification works even in other cases where it's not like this; for example, one example could be that this probability P depends on psi, like it's different for every psi. >>: Ah, I see. >> Robin Kothari: Like that's a different thing and then this technique doesn't do anything about that. But, amplitude amplification would still let you amplify that. >>: [inaudible]. >> Robin Kothari: Yes, yes, the set of things you could apply amplitude amplification to is large. There is a smaller set of things in which our problem lies and for that smaller set of things we have a generalization. It doesn't generalize amplitude amplification, but if you happen to be in this class of things then that's great for you. And this technique has already found applications.
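Here is a small numpy sketch of the lemma in action. The block construction of U, embedding V with fixed amplitude sine theta and some junk unitary W on the failure branch, is a toy setup (not the circuit from the paper); the operator S is built exactly as described, from the ancilla reflection rather than a reflection about the unknown input state:

```python
import numpy as np

rng = np.random.default_rng(7)

def haar_unitary(n):
    z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    q, r = np.linalg.qr(z)
    return q * (np.diagonal(r) / np.abs(np.diagonal(r)))

n = 4
V = haar_unitary(n)              # the unitary we actually want to apply
W = haar_unitary(n)              # junk on the failure branch (toy assumption)
theta = np.pi / 6                # success amplitude sin(theta) = 1/2, so P = 1/4
s, c = np.sin(theta), np.cos(theta)

# U|0>|psi> = s |0> V|psi> + c |1> W|psi>  -- one valid unitary completion:
U = np.block([[s * V, c * V],
              [c * W, -s * W]])

P0 = np.kron(np.diag([1.0, 0.0]), np.eye(n))   # projector onto ancilla |0>
R = 2 * P0 - np.eye(2 * n)                     # the reflection we CAN do (no |psi> needed)
S = -U @ R @ U.conj().T @ R                    # oblivious amplification iterate

psi = rng.standard_normal(n) + 1j * rng.standard_normal(n)
psi /= np.linalg.norm(psi)
state = S @ (U @ np.kron([1.0, 0.0], psi))     # theta -> 3*theta = pi/2 after one round

# Output is exactly |0> V|psi>: overlap has magnitude ~1, for ANY input psi.
print(abs(np.vdot(np.kron([1.0, 0.0], V @ psi), state)))
```

With sine theta equal to one half, one round takes the angle to 3 theta, which is pi over 2 exactly; that is the "if you know P you can make it deterministic" point from the discussion.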
So [indiscernible] and Adam have used it in one of their recent papers to boost the success probability of some circuits, so that's nice. So let me summarize again what we have so far. So I said let's assume that you have this ability to probabilistically, or non-deterministically, implement this unitary, and then I introduced this technique called oblivious amplitude amplification that allows you, given a way to do this, to just use this guy a constant number of times and get the thing you want. And then I convinced you that these two are equivalent, so if you have this ability then you can get all the way up there. So now all I need to do is convince you that this makes sense and then you would be convinced that the whole thing makes sense. I still need to relate this to Hamiltonian simulation, but this is where we are right now. Any questions about this? Is it making sense so far? Okay, so now let me try to explain how we do this probabilistic or non-deterministic simulation of a 1 query algorithm. So this is really the heart of the --. So you might have this intuition that this model is really strong, but you can simulate it with just a constant number of queries for constant precision. So then how is it that all the magic is happening in this step? So let me explain how this step goes. So again let's start with the 1 query algorithm. It looks like this, just like before; some M fractional queries are being made to the oracle, there are M unitaries in between, and as these guys showed you can do this for some constant. So that's a summary of what I have been saying. Okay, so let me introduce this thing called the fractional query gadget. So it's a gadget in the sense that it's a little circuit, and it's the circuit over here and it's nice. What it does is it has three 1-qubit gates and one controlled-Q and it starts in |psi>, and what it does is, at the end, if you measure the first qubit and you get a 0 then the second register is in the state Q to the alpha |psi>. So for example think of if we set alpha to a half or set alpha to a quarter; then when you run the circuit, with some probability, you are going to get a 0 here and this guy is exactly the state you want. This is a quarter of a query done on |psi>. So right now --. >>: So I have a question. [indiscernible]. So if Q is a black box --. >> Robin Kothari: Yes, so --. >>: So is it always possible to manufacture the controlled version, or is it an assumption? >> Robin Kothari: No, so the way I define Q is I define Q to look like this [demo]. Given such a Q you can always make a controlled version of it and that's because the second register b effectively acts as a control. When you set b to 0 the map does nothing, because b is 0 and minus 1 to the 0 is 1, so it's the identity map. >>: Oh, I see. >> Robin Kothari: So the definition of the oracle already includes a control. And this is the traditional way to define it, but if you were given a version of the oracle that does not have a control then there is no way to manufacture it. >>: Okay. >> Robin Kothari: That's hard, that's provably hard I guess. So, yeah, so this is a little gadget and what it does is it uses 1 copy of Q and with some probability it does Q to the alpha times psi. So right now this seems like it doesn't help at all. Like firstly it uses 1 full query to do a fractional query, so that didn't help us out, and it does it probabilistically, so that's pretty crappy as well.
It seems like we didn't get anything out of this, but the good part is this circuit outputs 0, like the outcome you want, with probability very close to one. And it's 1 minus big theta of alpha, and by that I mean that the amount by which it's far away from 1 is linear in alpha. So if alpha is really, really small, like say 0.00001, then this is almost probability 1. So you have almost certainly succeeded when you do this map. This still doesn't seem great because you have still used 1 full query to do a fractional query. It's like, well, how does that help you? But, just keep in mind that this succeeds almost certainly. So this is just the definition of the gates. It's not really important, but just for completeness, if somebody wants to know what the gates are. So yeah, so keep this gadget in mind and now what we can do is we can use this gadget over and over again. So say you want to do, you know, you have some starting state. You first want to do U 0 on it; you just do it, well, that's just a unitary, that's free. You want to do Q to the 1 over M, so you do that using this gadget over here. Don't measure this final state, just let it be for now, and then apply U 1 on the output of the circuit, you can do that. Then take a fresh ancilla and do this gadget again on this state to enact the second gate, and so on. Like do each of these gates one after the other on the output of the circuit. So what I mean is essentially implement this circuit over here, where you have M ancillas, one for each fractional query you are trying to do. So you start with U 0, do this fractional query gadget, there should have been a gate here which I have just pushed to the end, because it doesn't matter, there are no more gates happening on this line. Then you do U 1, and then you do the fractional query gadget on the same state, because it's the same state you are trying to evolve with these fractional queries. And you do this for all M and you collect it up into this one big circuit that we are calling a segment. And what does this segment do for you? What it does is, if you measured all of these qubits and if you got zeroes for all of them, then the circuit did exactly what you wanted it to do. It did all the fractional queries right, it did all the unitaries in the middle right, like you are just good to go. So what's the probability of all of them being 0? So as I said before, the probability of one of them being 0 is very close to 1; it was bounded away from 1 by something like 1 over M. So it's like a coin toss where the probability of the bad outcome is like 1 over M and the probability of the good outcome is 1 minus 1 over M, and if you toss this M times what's the probability that you always succeed? It's some constant, so this is a circuit that has some constant probability of succeeding, and it's some fixed constant that you can compute based on this M. So this is related to what you were asking before, and it's something, maybe it's like .1 or something. Yeah, it's easy to compute with some tail bound. So this is a circuit that implements the map that we wanted to implement with constant probability, where you have succeeded when all of these qubits are 0. So that's great, right; that's what we were heading out to do. We wanted a circuit that effectively implements a 1 query algorithm in a probabilistic or non-deterministic fashion. But, we also want the circuit to make very few queries. Like, as written, this circuit now makes M queries.
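As a sanity check, here is a numpy sketch of one such gadget with the properties claimed above: one use of a controlled-Q, and on ancilla outcome 0 the system is left in Q to the alpha applied to |psi>. The specific preparation and measurement amplitudes are a reconstruction with these properties, not necessarily the exact gates on the slide:

```python
import numpy as np

rng = np.random.default_rng(5)

def gadget(Q, alpha, psi):
    """One use of controlled-Q; on ancilla outcome 0 the system holds Q**alpha |psi>.
    Returns (success probability, normalized outcome-0 system state)."""
    beta = np.pi * alpha / 2
    lam = 1.0 / (np.cos(beta) + np.sin(beta))
    a0 = np.sqrt(lam * np.cos(beta))          # ancilla preparation amplitudes
    a1 = np.sqrt(lam * np.sin(beta))
    top, bot = a0 * psi, a1 * (Q @ psi)       # joint state after the controlled-Q
    branch0 = a0 * top - 1j * a1 * bot        # <0| row of the final ancilla gate
    p0 = np.linalg.norm(branch0) ** 2
    return p0, branch0 / np.linalg.norm(branch0)

n = 8
x = rng.integers(0, 2, size=n)
Q = np.diag((-1.0) ** x).astype(complex)      # phase oracle (just the b = 1 slice)
psi = rng.standard_normal(n) + 1j * rng.standard_normal(n)
psi /= np.linalg.norm(psi)

alpha = 1 / 50
p0, out = gadget(Q, alpha, psi)
ideal = np.diag(np.exp(1j * np.pi * alpha * x)) @ psi    # the true Q**alpha |psi>
print(p0, abs(np.vdot(ideal, out)))   # ~0.94 = 1 - Theta(alpha), overlap ~1
```

For alpha equal to 1 over 50 this prints a success probability around 0.94, that is, 1 minus something linear in alpha, and overlap 1 with the ideal state up to a global phase.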
So the original circuit you are trying to implement, I mean this fractional query algorithm, this makes only 1 query, its cost is 1, but now I have given you a circuit that makes M queries, so it seems like that wasn't useful at all. That was just a huge waste of time. But that's not really the case. So now is where the real magic happens, here, or maybe it happened in the previous step, I don't know, but something really nice happens, which is: this is the circuit I showed you on the previous slide. So think of this input state over here, which I am going to blow up over here. This is some fixed state; it doesn't depend on the oracle or anything. It's just 0 with this matrix R sub alpha acting on it. And R sub alpha, let me go back to where I defined R sub alpha, here. All right. It's a matrix like this and these expressions are hard to parse, so I will just tell you what happens. R sub alpha is actually a really tiny rotation and it takes |0> to essentially |0> and a very little bit of |1>. So if you think of this tensor product state it's an M-fold tensor product of states that are essentially |0> and a little bit of non-|0> things. So if you write it down there is a large weight on the first term, which is like the all 0 state, then there is a little bit of weight on the states that have only one 1 inside them, and then there is even less weight on the states that have 2 ones, and so on. So the weight decreases with the Hamming weight of the strings. So this is a state that has superposition, like has weight on every single bit string, but the weight goes down as the number of ones increases. So what you can do is you can just kill part of the state: the part that has overlap with the high Hamming weight states, you can just kill it, just truncate it. And because the distribution is very strongly peaked around 0 this truncation is just going to affect your circuit by an epsilon. So the question is: how far in Hamming weight do you need to go to only be affected by epsilon? And that's only this much, and this comes out of doing a [indiscernible] calculation. So that's this question of: I give you M coins that are very strongly biased towards 0, so the probability of getting a 1 is like 1 over M and the probability of getting a 0 is 1 minus 1 over M, and I toss all of these M coins, and then what's the most likely event? It's probably that you get all zeroes, but in fact I don't want to just find out the most likely event, I want to cover all the events that have probability 1 minus epsilon. So you just need to go up to maybe a couple of ones appearing, because it's so unlikely that you get a 1 that if you just accounted for the possibility of there being a couple of ones you have basically covered all possibilities. And the possibilities that you have neglected are only an epsilon fraction. So what we do in this circuit is just replace this state with the truncated state, and this truncated state has the property that it's epsilon close, but it has very few ones. Why does it help us that it has very few ones? The ones decide when the queries happen. So these are controlled query gates and the queries only happen when there is a 1. So if most of the time no queries are happening, then you don't really need to be making that many queries.
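That truncation calculation is just a binomial tail bound, and it is easy to see numerically where a log-over-log-log cutoff comes from (toy parameters; the paper's exact constants differ):

```python
from math import comb, log

# Each of the M ancilla "coins" is 1 with probability about c/M. How large a
# Hamming weight must we keep so the truncated state retains 1 - eps of the
# mass?
M, p = 1000, 1.0 / 1000
for eps in (1e-2, 1e-6, 1e-12):
    mass, k = 0.0, 0
    while mass < 1 - eps:
        mass += comb(M, k) * p ** k * (1 - p) ** (M - k)
        k += 1
    print(f"eps = {eps:g}: keep Hamming weight < {k}; "
          f"log/loglog = {log(1/eps) / log(log(1/eps)):.1f}")
```

The cutoff grows at the same very slow rate as log of 1 over epsilon divided by log log of 1 over epsilon, up to constant factors, which is where the final epsilon dependence comes from.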
So for example, if you know that this state over here has at most this number of ones, then really there is only this number of queries happening in the circuit. It looks like there are more queries, but there is a way to rearrange the circuit so that you only make this number of queries, because in any branch of the superposition there is only this number of ones. >>: Are you still doing M U’s? >> Robin Kothari: Right, M U’s? The U’s are free. >>: Why are they free? >> Robin Kothari: In this model we only charge for the number of queries made to the oracle. >>: Okay? >>: In the query model. >>: And that’s fine, but my point is you have replicated the U: what you would have done alongside one query is now M U’s, and even if you only have to do two queries, I still have to do M U’s now instead of 1 U. >> Robin Kothari: Right, right. >>: And U is where the actual system is running. >> Robin Kothari: Uh. >>: The query is just telling you the parameter that goes in at this point and you still have to use it. >> Robin Kothari: Yeah. >>: So the depth of the circuit got massively bigger. >> Robin Kothari: Well actually, that’s not necessarily true. >>: Okay. >> Robin Kothari: What I need to do here now is rearrange the circuit so that it actually only makes this number of queries; as written it still makes M queries. And in the new circuit, what will happen is there will be a bunch of --. >>: What if I use different U’s? Like a U that does M minus 3 of them, which is the 0 case, and then 3 of them that aren’t [indiscernible] the queries. What about that? >> Robin Kothari: There would be new U’s, yeah. >>: I haven’t seen a reason why doing M of them is not a lot more complex, or a lot more depth, than these U’s at the level of 1 of them. >> Robin Kothari: Yeah, so if your U’s are complicated to do then this isn’t going to help. >>: That’s fair, I just needed to hear that and I am okay. >> Robin Kothari: Yeah, I mean in these query models the assumption is that the query is the main cost and we reduce the dependence on that. >>: Okay. >> Robin Kothari: Yeah, so all non-query operations are free, anything you feel like. >>: I think you need to be more cautious here. I mean even if the cost of U is negligible, right, what you are claiming is you end up doing [indiscernible] in O of 1 queries. >> Robin Kothari: Right. >>: So that means that the entire thing, right, doesn’t depend on anything. >> Robin Kothari: Uh. >>: [inaudible]. >> Robin Kothari: Right. >>: [inaudible]. >> Robin Kothari: Yes. >>: But if M is very large, if M tends to infinity, then giving even a tiny cost to U will add up and will add [indiscernible], and the second point is it will outgrow [indiscernible]. >>: Well, and plus, as the M’s grow you have to do more of the fractional Q’s to get the Hamming weight correct, so then you also have to do more queries. >> Robin Kothari: Right, right. >>: So it’s not a clean and simple setup. You can claim that it’s practically useful, but in a [indiscernible] sense you have kind of ruined them. >> Robin Kothari: No, I guess maybe we need to come back to the definition of the standard query complexity model. In the standard query complexity model you are allowed to call the query gate Q some number of times, and that’s going to be the cost measure. But you are allowed to do arbitrary unitaries otherwise, which are not charged for.
Like, you can even solve undecidable problems, you can solve exponentially hard problems. The standard query complexity model allows you to do input-independent unitaries, whichever ones you like, as many as you want. >>: Mr. Kothari, we agree; we are talking more about how, if I was asked to put this on a machine, there are practical reasons why you can’t ignore the other items. >> Robin Kothari: Right, right, so when I get to applying this to Hamiltonian simulation I will need to care about this, because I am going to give you a bound on the total number of gates needed and I am going to show you that they are also polynomial. So when I apply this to the specific case of Hamiltonian simulation everything is efficient, but in general there is not much I can say, because in the query complexity model you are allowed to use gates that are just crazy to implement. Your U 0 itself could be solving an undecidable problem, and then there is nothing you can do in any model to make that efficient. >>: Also, I have an issue here with whether alpha should be literally seen as depending on M. >> Robin Kothari: Sorry, what do you mean? >>: So right now, if I look at this literally, alpha is some magical small angle. >> Robin Kothari: Uh, so alpha is just --. >>: Like, it does not have to change as M grows? >>: Oh, yeah no. >> Robin Kothari: No, no, alpha is 1 over M. I haven’t defined it here. >>: [indiscernible]. >> Robin Kothari: Right, so alpha is the exponent of Q: when I defined the gadget, this little gadget does Q to the alpha for you. In our example we wanted Q to the 1 over M, so alpha is 1 over M for the next two slides. So here, this alpha is exactly 1 over M. >>: So we know that in practice the cost of doing a small angle increases as the angle gets smaller. I mean even with [indiscernible] magic. >> Robin Kothari: Oh, you mean for actually implementing [indiscernible]? >>: In terms of actually implementing the R’s. >> Robin Kothari: Yeah, yeah. >>: But his comment is there are very few of them, because you only have to do a few queries. So the [indiscernible]. >> Robin Kothari: No, I guess what I am saying is that in this query model we don’t charge for this, but when I actually do the Hamiltonian simulation I will charge for it, and it won’t be too bad; there will be some dependence on M when we do exactly this thing. The Hamiltonian simulation will be fine, it will scale. >>: But I would write alpha as depending on M right there, so there is no --. >>: [inaudible]. >> Robin Kothari: Yeah, it’s a good point; maybe I will put alpha equals 1 over M on the side, that’s good. >>: [inaudible]. >>: Well, let him get to the point where it actually costs him. >>: There are more simplifications ahead. >> Robin Kothari: Right, right, so you are concerned with the query cost? >>: Is it an amortized cost? You said you run over all these binary strings, but they are heavily concentrated at low weight. So couldn’t you use that to say how many of these queries actually get executed? Or don’t you have to accommodate the worst case? >> Robin Kothari: Right, yeah, so this is not a proof yet. This is a circuit that still makes M queries, because how do you count the number of queries? You look at the circuit and see how many Q gates there are. So from this circuit I would have to write down for you another circuit that only has this number of Q gates in it.
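As a minimal numeric sketch of the R sub alpha discussion, assuming R sub alpha is the single-qubit rotation taking ket 0 to cos(theta) ket 0 plus sin(theta) ket 1 with sine squared of theta equal to alpha (the exact matrix from the slides is not reproduced in the transcript):

```python
import numpy as np

M = 1000
alpha = 1.0 / M                       # alpha is exactly 1/M here
theta = np.arcsin(np.sqrt(alpha))     # tiny rotation angle

# Assumed form of R_alpha: a plain rotation by theta.
R_alpha = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
ket0 = np.array([1.0, 0.0])
state = R_alpha @ ket0
print(state**2)   # weight ~ (1 - 1/M) on |0>, ~ 1/M on |1>
```

Tensoring M copies of this state gives the strongly peaked Hamming-weight distribution the truncation argument relies on.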
Like, the full circuit, when you look at it drawn on a piece of paper, would have only a constant, or whatever this number is, of Q gates in it, and I am saying that morally this can be done because the queries aren’t really happening in most branches of the superposition. But I would need to show you that, which I am not going to, because it’s kind of technical. Morally, I am trying to convince you that if every branch of the superposition only makes 5 queries, even though the circuit has 100 query gates, then somehow there should be a way to write the circuit down so that it only makes 5 queries. I guess that’s what I am saying. Okay, so that’s the last piece; that was to show you how this is done, and it completes this chain of reasoning over here. Okay, I will probably just take 5 or 10 more minutes and wrap this up. Are there any questions about this chain here? I wanted to prove that, so I reduced it to this, which I reduced to this, then I proved this, and by the sequence of reductions I have proved what I wanted to prove. Does that make sense? Okay, so now this reduction from here to here is also kind of technical, so I am just going to try to convince you why Hamiltonian simulation is at all related to fractional query simulation, because at first glance they have absolutely nothing to do with each other. One is a result about some exotic query model and the other is about Hamiltonians. So what is this reduction? Take a very simple case where you have two Hamiltonians: H is the sum H 1 plus H 2, and let’s define Q 1 to be E to the minus I H 1 and Q 2 to be E to the minus I H 2. You want to simulate the Hamiltonian H for time T, so that means you want to do this unitary. We know, by the Lie product formula, that this matrix is approximately: do the first one for time T over M, do the second one for time T over M, and then do this whole thing M times. And this is approximately true; the error depends on M. This guy I have defined as Q 1 and this guy I have defined as Q 2, so this gate is essentially Q 1 to the T over M times Q 2 to the T over M, done M times. >>: Just a notational question: isn’t that M queries to Q 1 and Q 2? >> Robin Kothari: Uh, no, you mean here? >>: Yeah. >> Robin Kothari: Well, I guess it depends on what you mean by queries. If you think of Q 1 and Q 2 in this fractional query model where you only charge for the total sum of the exponents, this is T over M happening M times, so it really only costs T in this magical model where you only charge for the exponents. But, as I previously convinced you, even though this model seems crazy, you can actually convert it to the standard model. So, if you go through this conversion, it turns out that this actually works, but you need to decompose the Hamiltonian into a sum of Hamiltonians, each of which looks like a query oracle; they look like the kinds of things that the previous model assumed. So how do you get that, or what does that mean? Actually, let me go back to the slide where I described that. So this fractional query gadget: the only assumption we made on Q is that Q is a unitary that squares to the identity, which is a property shared by this unitary, of course; when you square it you get the identity because minus 1 squared is 1.
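To see the product formula at work numerically, here is a short sketch with random Hermitian matrices; the matrices and the norm used are illustrative, not the ones from the talk:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)

def rand_herm(n):
    # Random Hermitian matrix: symmetrize a complex Gaussian matrix.
    A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (A + A.conj().T) / 2

n, T = 8, 1.0
H1, H2 = rand_herm(n), rand_herm(n)
exact = expm(-1j * (H1 + H2) * T)

for M in [1, 10, 100, 1000]:
    # One step: e^{-i H1 T/M} e^{-i H2 T/M}, repeated M times.
    step = expm(-1j * H1 * T / M) @ expm(-1j * H2 * T / M)
    approx = np.linalg.matrix_power(step, M)
    print(M, np.linalg.norm(exact - approx, 2))
# The spectral-norm error falls off like O(T^2 / M).
```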
So you need to come up with a sum of Hamiltonians such that E to the minus I H 1 squares to the identity, which means the unitary’s eigenvalues are plus or minus 1, so the Hamiltonian’s eigenvalues are 0 or pi. You need to decompose your Hamiltonian into a sum of Hamiltonians, all of which have only the 2 eigenvalues 0 and pi. So that’s the new challenge: I give you an arbitrary sparse Hamiltonian and you need to break it up into a sum of Hamiltonians, each of which has two distinct eigenvalues, and they need to be exactly 0 and pi. Now that sounds like a crazy problem; how can I solve that, or why would that be easy? It turns out that this is actually kind of easy. People have done things similar to this before, and a lot of the previous Hamiltonian simulation techniques have studied this kind of problem. The way you do it is to break it up into two steps: first you decompose H into a bunch of 1-sparse Hamiltonians, where 1-sparse means that in every row or column there is only 1 nonzero entry. Almost every Hamiltonian simulation technique until now has had this as the first step: you always decompose into a sum of 1-sparse Hamiltonians. This can be done; there are a bunch of ways of doing it, and we give a new way in our paper, which is better in some respects. But anyway, that’s doable. And then, given a 1-sparse Hamiltonian, you can decompose it into Hamiltonians that have only 2 different eigenvalues. So that’s essentially this step: you take this Hamiltonian, break it up into a bunch of Hamiltonians, each of which is something that the fractional query model understands and already knows how to deal with, and then you use that whole reduction and see what you get at the end. You do have to go through it again specifically for Hamiltonian simulation; it’s not easy to see what happens if you just plug it in. So we go through and do this and compute what finally happens. >>: So if I have N terms to start with, right, how many do I have here at the bottom? >> Robin Kothari: Here? >>: Yeah. >> Robin Kothari: So if you have N terms in the sum? Uh, I am sorry, what do you mean? Like if the Hamiltonian is already decomposed --. >>: H is a sum of N Hamiltonians and now I am going to go and break each one of those apart; each one is going to blow up by how much? >> Robin Kothari: Okay, well, that depends on the properties of the individual terms. What do your terms look like, are they d-sparse, or are they? >>: No. >> Robin Kothari: Uh. >>: I am doing quantum chemistry. >> Robin Kothari: Oh, okay. >>: Some of them will be very tiny, some of them will span the entire set of qubits. >> Robin Kothari: Right. >>: So they are 1-sparse in this language and [inaudible]. >> Robin Kothari: Right, right. >>: [inaudible]? >> Robin Kothari: So the individual Hamiltonians are products of Pauli operators, right? They are like sigma Z tensor sigma X, something of that kind. >>: [inaudible]. >> Robin Kothari: It’s, I mean, um --. So actually, if you have these Pauli operators, as in quantum chemistry, you don’t really need to do this. The only thing we wanted from your Hamiltonian was that it have two different eigenvalues; Pauli operators already have that. So you don’t need to go through this decomposition.
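Here is a small numeric sketch of that requirement, assuming the natural construction H = (pi/2)(I - Q) for a unitary Q that squares to the identity; the talk doesn’t spell out the construction, so take this as one illustrative choice. It also shows why a Pauli product needs no further decomposition:

```python
import numpy as np
from scipy.linalg import expm

# A Pauli product like Z (x) X squares to the identity, so its
# eigenvalues are +/-1; then H = (pi/2)(I - Q) has eigenvalues
# exactly 0 and pi, and e^{-iH} recovers Q.
Z = np.array([[1, 0], [0, -1]], dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Q = np.kron(Z, X)                     # Q @ Q = I
H = (np.pi / 2) * (np.eye(4) - Q)
print(np.round(np.linalg.eigvalsh(H), 6))   # -> [0, 0, pi, pi]
print(np.allclose(expm(-1j * H), Q))        # -> True
```

This is exactly why the quantum chemistry case can skip the 1-sparse decomposition: each Pauli term is already of the required two-eigenvalue form.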
So in your application to quantum chemistry, yeah, forget about this; you can jump directly to the end, because you already have just 2 eigenvalues. >>: Okay, so I don’t blow up my number of [inaudible]? >> Robin Kothari: No, no, right. We are trying to deal with the general case where the Hamiltonians are sparse, so we need to worry about this. But yeah, in specific cases where you already have a decomposition into a nice form you can avoid this stuff. So that’s what essentially links these two problems together. I have only sketched it, because the details are gory, I guess. So this is what I wanted to explain: I have convinced you that this thing makes sense, and on the previous slide I showed you the reduction, so now the claim is that this makes sense based on everything I have tried to convince you of in the last hour. Okay, and there is also a lower bound, but maybe I will skip that. It’s not as interesting; people who work in query complexity all the time might be interested in seeing exactly how the lower bound goes, but I can tell you later, or if there is time. Instead I would just like to highlight some other results that are in this paper, or go back and give you a big picture perspective. So the previous Hamiltonian simulation algorithms had this log star dependence on N, and we show how to get rid of it, and it’s not specific to our algorithm; we show how to get rid of it in other algorithms too. There was a reason this log star N kept coming up, and we just tackle the thing directly and show that, well, you never needed it in the first place. It’s not a big deal numerically; log star N is at most 6 for all these values of N, but theoretically it’s very unsatisfying to have this annoying function of N. It also made the papers really long, because every time you explain this log star N bit it takes like 3 pages, because an algorithm whose running time involves log star has to be complicated. How would you get a function like log star unless you had a complicated algorithm to achieve it? So that’s nice, and our algorithm is also gate efficient. This gets to the point of how you actually do this. By gate efficient I mean that the total number of non-query gates, that is, the total number of 1- and 2-qubit gates we use in addition to the query gates, is pretty comparable to the query complexity; it’s maybe a log factor over the query complexity. Our algorithm also works for time dependent Hamiltonians, and there is some dependence on the first derivative of the Hamiltonian, etc., if you care about time dependent Hamiltonians. We have some improvements. There is a D squared term in our running time, but if the Hamiltonian is local then it’s actually only D, so there are some savings if you know the specific form of the Hamiltonian, or if you know an explicit decomposition into Paulis, like in the quantum chemistry example; then there is a speedup you can get. And, as I explained, the error dependence is optimal for both problems, meaning both for Hamiltonian simulation and for the fractional query model, if you think of them separately. We show two different lower bounds showing that you need this dependence on error, so it’s not possible to improve that. So what’s open? For the fractional query model, or for the other model, our algorithm goes like this expression, and we know a lower bound of this expression.
So, the epsilon dependence by itself is tight. We know that you need a linear number of queries, that was shown before, but that doesn’t mean it has to go like this exact function. In particular, this goes like T log T-ish in T, so there is definitely the possibility of improving that, and it also doesn’t need to go like this particular function; it could go like T plus a log(1 over epsilon) term. There are possibilities for improving the upper or the lower bound, I guess is what I am saying. Not by too much, they are kind of close, but there is some room to improve. And then of course there are applications, like seeing what we can do on instances that people care about, for example quantum chemistry. And then there is this third question, which is very specific to people who work on this: our dependence on the degree is quadratic, but we think it should be linear, and we don’t know how to do that, and it’s important for things that have high degree. So that’s something that’s also open. Okay, yeah, I think that’s the end of what I wanted to say. Are there any questions or anything? >>: So for number 3, do you see any way around this problem other than devising a graph coloring algorithm that’s as efficient as [indiscernible]? >> Robin Kothari: No, no, I think what’s going to have to be done is someone is going to have to come up with a good [indiscernible] coloring algorithm that uses [indiscernible] of D colors to color a [indiscernible] by [indiscernible] graph. >>: [inaudible]. >> Robin Kothari: Yeah, well that only makes your life easier, but I guess you could assume it’s [inaudible]. The only way I know [inaudible]. Any other questions? [clapping] Okay, so --.