>> Krysta Svore: Today the core group welcomes Robin Kothari who’s here to
speak with us. He is from the University of Waterloo. His advisers are
Andrew Childs and John [indiscernible], and Robin will actually be joining MIT
as a postdoc starting in the fall.
Today he is here to talk to us about exponential improvement in precision for
simulating sparse Hamiltonians. So, thank you Robin for coming and we will
turn it over to you.
>> Robin Kothari: Thanks, thanks for the introduction.
Okay, so I am going to be talking about Hamiltonian simulation algorithms,
algorithms that have an exponential improvement in precision. So this is
joint work with Dominic W. Berry, Andrew M. Childs, Richard Cleve, and
Rolando D. Somma. Please feel free to ask questions at any time and
interrupt me if you want to. I don’t mind getting sidetracked talking about
anything you want to talk about.
Okay, so first I am going to just start by giving you a summary of what
I am going to talk about for the next hour, maybe. So the main result is that we
have this new algorithm for simulating sparse Hamiltonians. And the way the
algorithm works is we reduce the problem to another problem, which is the
problem of simulating fractional queries. So if you don’t know what any of
these things mean, that’s fine, that’s what the rest of what my talk is going
to be about. But, this is just like a big picture overview for those of you
who know what’s going on.
And the new thing is we have this new reduction to simulating fractional
queries and then we have a new algorithm for simulating fractional queries.
And this new algorithm uses a technique that we are calling oblivious
amplitude amplification. So the first part of my talk is just going to be
about the Hamiltonian simulation problem. So what I am going to do is I am
going to explain about what the problem is, why we care about the problem,
what’s known about the problem and what we have done about this problem.
Okay, so let me get into it. Are there any questions at this point, probably
not? Okay, so what is the problem that we care about? So consider the task
of simulating physical systems. So what this means is that you just have a
description of a system, I tell you what it looks like right now and you have
to predict what it’s going to look like after five minutes or something like
that. So this is a very basic fundamental problem in physics. I mean you
can almost say that this is what it means to understand a system. Like if
you say you understand the laws that govern a system then you need to be able
to predict what’s going to happen after some time.
So a classical example is I give you the description of say “N” bodies under
gravitational force, and I tell you where they are right now and how fast
each of these bodies is moving and you need to tell me what they are going to
look like after a couple of minutes. So that is just a general example and
similarly a quantum example would be there are “N” qubits, there is some
Hamiltonian that governs the time evolution, and I tell you what the current
state of the system is and I ask you what’s the final state after some time
“T”. And if the Hamiltonian is time independent we know how to solve that
explicitly as an equation: the final wave function is e^{-iHT} times the
initial wave function.
So more formally the Hamiltonian simulation problem is this general problem,
specialized to quantum systems. And the problem is you are given the
Hamiltonian “H”, which is just a Hermitian matrix, a complex Hermitian matrix
of size N x N, you are given a time T, and what you need to do is give me a
unitary that does e^{-iHT}. And we allow some error, which is going to be
epsilon, that’s the error parameter.
Now I am not going to talk more about what I mean by the error, but you can
just take any convenient version of this that seems reasonable to you. For
example, the unitary that you implement should be close to the actual unitary
e^{-iHT} under some suitable norm, for example the diamond norm or anything
you like. Or, for example, the states output by this unitary and the ideal
unitary should be epsilon close in L2 distance; that’s another reasonable way
of thinking about what it means for two operators to be close.
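In symbols, with the operator norm as one concrete choice (any of the
reasonable distance measures just mentioned would do), the task is: given a
Hermitian H of size N x N, a time T, and an error parameter epsilon, implement
a unitary \(\tilde{U}\) with
\[
\bigl\| \tilde{U} - e^{-iHT} \bigr\| \le \epsilon .
\]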
Okay, yeah, so are there any questions about the problem or the general set
up?
Okay, so this is what we care about and why do we care about this problem?
That’s always a good question to ask right after someone has defined a
problem. So the first thing I can say is that this was the original
motivation for building quantum computers. And Feynman originally said
something to this effect that we want to simulate the dynamics of a quantum
system, but the best way we know how to do this right now is by a classical
algorithm that’s very inefficient, that takes exponential time to simulate
the system. So that’s bad and what that means is that we can’t get
simulations of systems that we care about, because they are just really large
and we can’t solve them on today’s computers.
And as a result of this a good fraction of today’s computing power is
actually devoted to solving these problems otherwise in practice. And people
in a diverse number of fields like quantum chemistry, material science, etc.
They care about understanding small quantum systems and they are not able to
do this to the accuracy or maybe to the number of particles that they would
like just because they are being limited by the computational power. So if
we had efficient quantum algorithms for this and if we had a quantum computer
then these two together would give you a lot of, well these guys would be
really happy.
So we don’t really have a quantum computer, but it’s good to just get working
on the algorithms and get one that’s as efficient as possible. So that’s the
usual motivation for studying this problem. But there is also another
motivation that was interesting to me as someone who works in quantum
algorithms, which is that you can use this as a subroutine. I mean every
time you solve a problem you can now reduce to this problem, like that’s just
a general fact about solving problems. But, specifically Hamiltonian
simulation turns out to be a pretty useful subroutine.
So for example you can use it to implement continuous time quantum walks and
it’s done in this paper where they show this exponential separation between
classical and quantum query complexity using a glued trees graph and you need
to traverse a tree. And another example, or another set of examples comes
from using Hamiltonian simulation to --. So this last paper that’s cited is
the paper for solving linear systems of equations, which uses Hamiltonian
simulation as a subroutine.
The first paper on the last line is by [indiscernible], [indiscernible] and
[indiscernible], and it’s to solve this query problem called the
[indiscernible] tree problem: you need to compute the value of a function
that’s recursively defined in terms of some Boolean variables, and that’s
also solved by doing a continuous time quantum walk, which is essentially a
Hamiltonian simulation problem.
So I guess what I am trying to say is that Hamiltonian simulation is also
useful as a subroutine for designing quantum algorithms. Okay, so now what I
am going to talk about is, as I have said a couple of times, that there are no
known efficient classical algorithms for Hamiltonian simulation, but we
have quantum algorithms that are good. So it’s a good question to ask: what
does efficient mean? Like this has to be made precise at some point: what is
efficient? Of course there are classical algorithms to solve the problem if
you don’t require efficiency like this. There are always classical algorithms
to solve everything that quantum computers can solve if you don’t insist on
efficiency. The main advantage is in how much time, and space, and whatever
resources are needed.
So as a computer scientist the first answer is kind of obvious: “efficient
means polynomial time”. That’s always the answer to what is efficient, but
polynomial in what? There are a bunch of parameters in this problem so it’s
not completely straightforward what you might mean by polynomial time. So
here I am going to say that polynomial time is polynomial in the size of the
system. So for example if the Hamiltonian is an N by N matrix the number of
qubits it acts on is log N. So you want it to be polynomial in log N, not
polynomial in N. It’s fairly straightforward to do Hamiltonian simulation if
you don’t mind a running time that’s polynomial in N, because it’s just an N
by N matrix. You need to exponentiate it and that’s easy: you just
diagonalize it and exponentiate it. So the challenge is to do this in poly
log N.
And then you need some scaling that depends on how long you want to evolve
for, because of course if you want to predict the state of a system after a
long time then you should be allowed to have more time, that just makes
sense. So some dependence on T and that would be fine. And then there is
this odd dependence on the norm of the Hamiltonian, and at first sight it’s
like, “Why is this weird quantity entering the picture”? But the reason for
that is that time is not uniquely defined by itself. So the thing that you
want to simulate is this unitary, which is e^{-iHT}. So H and T appear as a
product. So you can always scale down the Hamiltonian by a factor of 2 and
scale up the time by a factor of 2, or maybe do it the other way around.
This would allow you to cheat if you just had some dependence on T.
So somehow it should go like the product of the two of these things. So you
can take any norm of the Hamiltonian you like, but there has to be some way
of normalizing the Hamiltonian, so that you don’t just put all the time into
the Hamiltonian and then evolve a Hamiltonian with very large norm for unit
time instead; that wouldn’t make sense. And the last parameter that I am not
going to say too much about now is, “How should it scale with epsilon”?
So epsilon is this accuracy threshold where I want to be within epsilon of
the right map. And maybe it’s not completely clear a priori how things
should scale with epsilon, and maybe a polynomial in 1 over epsilon would be
fine, but if you can get poly log 1 over epsilon that would be even better.
And what we do in this work is we improve the scaling with epsilon and we
actually get it down to poly log 1 over epsilon, as compared to previous
simulations that went like a polynomial in 1 over epsilon.
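Putting the parameters together, the kind of scaling being discussed is,
schematically (the sparsity parameter comes in later),
\[
\text{cost} \;=\; \mathrm{poly}\bigl(\log N,\ \lVert H \rVert T\bigr)\cdot f(\epsilon),
\]
where previous simulations had \(f(\epsilon) = \mathrm{poly}(1/\epsilon)\) and
this work brings it down to \(f(\epsilon) = \mathrm{poly}(\log(1/\epsilon))\).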
Okay, so, yeah?
>>: Didn’t the previous work of [indiscernible] also get a log 1 over
epsilon?
>> Robin Kothari: Yes, so our work is now merged with their work, in the sense
that theirs was previous work, as was stuff before that. So we decided to
merge our papers, and that paper is now to be considered superseded by our
paper. So, yeah, right, any other questions?
Yeah, so another interesting question, or like one of the things that you
might think about first, is: can you simulate all Hamiltonians efficiently?
And if you think about this for a while, you see this is not possible, even
on a quantum computer. And maybe it’s not obvious to see why on a quantum
computer, but the analogous classical question would be: can you compute all
functions with polynomial size circuits? And that’s just not possible. There
are just too many functions and too few circuits, like you can just do a
simple counting argument to convince yourself that all functions do not have
polynomial size circuits.
And it’s the same thing quantumly, like you cannot simulate all Hamiltonians
in polynomial time, that’s just not possible. So all you can hope for is to
be able to simulate some restricted classes of Hamiltonians. And so what
Hamiltonians should we think about? And I guess going back to our motivation,
one good class to study would be the classes that actually arise in practice,
like that makes sense. And Hamiltonians that arise in practice are often
local Hamiltonians; that’s a widely studied class of Hamiltonians that
arise in practical applications. But more generally you can think of a
class of Hamiltonians called sparse Hamiltonians, which I will define on the
next slide, which is a generalization of local Hamiltonians and effectively
captures almost all the Hamiltonians you would want to simulate on a
quantum computer, especially from practical applications. I am not aware of
any application that needs you to go beyond this model of simulating
Hamiltonians.
And lastly I would just like to mention that this problem of simulating local
Hamiltonians or even sparse Hamiltonians is BQP hard. And what this means is
that it’s the hardest problem that can be solved by a quantum computer. So
in other words if a classical computer could solve the Hamiltonian simulation
problem then every problem that can be solved by a quantum computer can also
be solved by a classical computer. So this is like the hardest problem and
if you can solve this then quantum computers don’t do anything for you at
all. In other words there would be classical algorithms for like factoring,
etc. So this also makes this problem interesting. I guess it’s somehow
truly representative of the class of problems that can be solved on a quantum
computer.
>>: What’s the original reduction in [indiscernible]? Is it done under
classical reductions or quantum reductions?
>> Robin Kothari: Uh, yeah.
>>: [indiscernible]?
>> Robin Kothari: Right, yeah, so it’s classical reductions, but I would have
to define a decision version of this problem more specifically. I mean right
now I have just defined it as the problem of producing the final state. That’s
of course not a problem that a classical computer could ever solve, because
it cannot produce a quantum superposition for you, but you would have to find
an appropriate decision version of this. And that could be something like: I
give you the initial state and the Hamiltonian, and what you need to do is
maybe sample from the final state probability distribution.
But even a simpler one: just tell me, say you’re promised the first
qubit has a very high probability of measuring 1 or 0, and you just need to
decide which one. And then that’s a decision problem now, and this decision
problem you can classically reduce to --. I mean you can show that if this
decision problem has a classical algorithm then all quantum algorithms can be
done classically.
>>: Can you start with another problem that’s not [indiscernible] hard. Say
[indiscernible], could you convert it into an instance of [indiscernible]?
>> Robin Kothari: Right, yes, you can do that. In fact it’s a local
Hamiltonian, and I think a 4-local Hamiltonian is the cleanest, easiest
construction I know. And that reduction will be fully classical. So
I will take your instance of Jones polynomial and I will spit out a local
Hamiltonian, which I will write down on a piece of paper for you. So, yeah,
that’s the sense in which it’s hard, and it’s a complete problem I guess in
that sense. Yeah, are there any other questions about this?
>>: So finding a ground state is that also hard?
>> Robin Kothari: No, no, so finding a ground state is a really hard problem.
It’s funny you asked that question. I am going to get to that in two
slides. That’s a question that’s often confused with Hamiltonian simulation,
but they are actually completely different problems. So I will get to that
in two slides, but before I get to that let me just tell you what local and
sparse Hamiltonians are more formally and then talk about how the input of
the problem is specified.
So what is the input of the problem first? It’s a Hamiltonian, which is an N
by N matrix, a time, and an epsilon. So let’s start with local Hamiltonians.
So this is something that a lot of people are usually familiar with. So a
local Hamiltonian is just a Hamiltonian that’s a sum of terms that each act
non-trivially only on a constant number of qubits. So for example a 3-local
Hamiltonian is a Hamiltonian that’s a sum of terms and each term just
involves 3 qubits out of the log N total qubits that you have.
So how would you specify a local Hamiltonian to me? You just write it down
on a piece of paper for me. So, each term only acts on say 3 qubits, so for
each triplet of qubits you can just tell me what the local term is.
That’s a polynomial sized description, it’s not too long, and so that’s just
the input setup. The local Hamiltonian case is pretty easy to deal with.
The input representation problem comes about when you talk about sparse
Hamiltonians.
So what’s a sparse Hamiltonian? So recall that the Hamiltonian is an N by N
matrix, so in principle it can have N squared nonzero entries, and in each
row or column it could have up to N nonzero entries. So we say a Hamiltonian
is sparse if it has only poly log N nonzero entries in each row. And by poly
log N I mean just some polynomial in log N, like say log N squared or log N
to the 4 or something. So this is drastically fewer nonzero entries than is
potentially possible. So it’s really sparse; it’s almost all 0 except a
couple of entries.
And even though it’s sparse in this sense, the matrix itself can have
exponentially many nonzero entries; for example the identity matrix is a
very, very sparse matrix. It has only one nonzero entry per column, but
still, if you had to describe the identity matrix by listing everything
out, there are exponentially many ones, because the matrix is N by N.
So if you want me to simulate a sparse Hamiltonian for you, you cannot just
write this down on a piece of paper for me, because it’s going to take me
exponential time to read this piece of paper and then you cannot expect me to
run in polynomial time. That just wouldn’t make any sense.
So you need to have some kind of succinct description of this Hamiltonian.
And what often happens in practice or in all the cases of sparse Hamiltonian
simulation I know is that you have the following kind of succinct description
which is if I tell you the row number and I ask you for a particular nonzero
index you can compute this pretty efficiently. So I can tell you, “Hey
what’s the fifth nonzero entry of the eighth row?” And you have some
efficient algorithm that spits this out.
So for example if the Hamiltonian is local then you can come up with a
polynomial time algorithm that does this kind of thing. So this is what we
call an efficiently row computable Hamiltonian, which means you can just
compute the nonzero entries in any row efficiently. And what we do in this
sparse Hamiltonian simulation model is we assume we have been given a black
box that just does this for you, and we think of complexity in terms of the
number of queries you need to make to this black box. And we also count the
total number of gates that you will need, like the one and two qubit
gates that you need. But in terms of the Hamiltonian we just count the
number of queries made to this black box, because we don’t know how expensive
this black box is to actually implement, and that depends on the problem at
hand.
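As a toy illustration of what such a black box might look like (this sketch
and all the names in it are illustrative, not from the talk; the example
Hamiltonian is a tridiagonal discretized 1-D Laplacian, which is 3-sparse):

```python
# Hypothetical oracle interface for a d-sparse Hamiltonian: given a row r and
# an index k, return the column and value of the k-th nonzero entry of row r.
# The matrix is N x N with N exponentially large, but it is never written out.
def sparse_oracle(r: int, k: int, N: int):
    """k-th nonzero entry of row r of a tridiagonal (3-sparse) Laplacian."""
    cols = [c for c in (r - 1, r, r + 1) if 0 <= c < N]  # at most 3 nonzeros
    c = cols[k]
    value = 2.0 if c == r else -1.0                       # Laplacian stencil
    return c, value

# A simulation algorithm is charged one query per call to sparse_oracle;
# the one- and two-qubit gates around the calls are counted separately.
```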
Okay, yeah, so are there any questions about local Hamiltonians, sparse
Hamiltonians or how they are represented? We are good?
Okay, let me summarize what we know about current Hamiltonian simulation
algorithms. So as I said, we are going to measure the complexity of an
algorithm in terms of the number of queries made to this black box that we
assumed on the last slide. And the relevant parameters are N, the size of
the Hamiltonian; the time that you are evolving for; the error parameter; and
D, which is going to stand for the maximum number of nonzero entries in any
row. This is the thing that we assume to be poly log N, so that at the end
you will get a polynomial time algorithm.
So the first algorithm of this kind was by Lloyd in ’96, and it only worked
for local Hamiltonians. And it gave that kind of scaling, so it’s polynomial
in D and log N, and I think it’s quadratic in these other parameters. This
was later improved by [indiscernible] and [indiscernible], who extended it to
sparse Hamiltonians. And they introduced the sparse Hamiltonian problem for
the first time. They got better dependence on some of these parameters and
they extended it to a larger class of Hamiltonians.
And then after that there have been a bunch of improvements. So in the next
paper, by Berry, [indiscernible], Cleve, and Sanders, they really got down the
dependence on some of the parameters. So, for example, this previous one was
some polynomial in log N, but here they have brought it down to log star N,
which is a really, really slowly growing function of N. So maybe I won’t
define it, but for any reasonable N that you would put in, for example the
number of particles in the universe, log star of N is 6. So it’s not
something you should worry about much. You can almost think of that as a
constant for any real application.
So the thing I want you to notice is the dependence on the error parameter.
It goes like 1 over epsilon to the delta for any delta greater than zero.
For example you can think of delta as 0.01 or something. So the dependence
on 1 over epsilon is very good, like it’s a very small polynomial in 1 over
epsilon, but it’s still polynomial dependence. And it’s the same thing in
the next result, which just improved the dependence on D, the sparsity
parameter, but essentially left other things unchanged.
And the last one on this slide is a completely different approach, not
related to the previous approaches, and that gets a better dependence on
the degree, but at the cost of a worse dependence on epsilon. Now it goes
like the square root of 1 over epsilon. So, all of these algorithms had
polynomial dependence on epsilon. And this was an open question for quite a
while: is this necessary, like do we need polynomial dependence on 1 over
epsilon or can we get it down to poly log 1 over epsilon?
And that’s essentially one of the main things we are going to talk about
right now, which is that our algorithm finally achieves poly log 1 over
epsilon dependence. And in fact it achieves this weird function: if you just
isolate the dependence on epsilon, we get log of 1 over epsilon divided by
log log 1 over epsilon, which kind of seems like a strange function out of
nowhere, but in fact it’s optimal. We also prove a matching lower bound
showing that log over log log is the right dependence. Even though this
function looks like it’s out of nowhere, it’s really the right dependence
with respect to epsilon.
So yeah, all right, any questions about this? No, okay, so that’s all I am
going to say about the Hamiltonian simulation, except I am going to answer
Martin’s question with a full-blown slide.
So what is the difference between the simulation problem and the problem of
finding ground states? So this is something that people often ask about, and
sometimes people think that the simulation problem is the same as finding
ground states. But they are extremely different problems, morally. So take a
very coarse view of life where quantum, classical and all of this stuff is
the same; don’t even differentiate between quantum computers and classical
computers. They are just computers that run in polynomial time.
The simulation problem is an easy problem in principle. It’s just mimicking
the behavior of this other system, and if you have enough resources you can
do it. You may have a little bit of a slowdown, but in principle you can do
it. Whereas the finding-a-ground-state kind of problem is a bit like: among
all possible configurations that the system could have started with, what is
the best one, the one that maximizes something? It’s like an optimization
problem.
So for example, say you are just given a Boolean circuit and I give you an
input and I ask you what the output is. So you want to mimic the behavior of
this Boolean circuit. That’s really easy: you just follow all the gates, you
compute what the outputs are supposed to be. It’s a small circuit; you can
do it. But if I ask you: is there any input to this Boolean circuit that
outputs 1, that’s a satisfiability problem. That’s a well known NP-complete
problem, that’s really hard.
So similarly in the quantum case it’s like I give you a quantum circuit and I
give you an input and ask you: can you run this? If you have a quantum
computer, yeah, sure, just follow the circuit, it’s almost trivial. On the
other hand, finding the ground state is almost like finding the maximum
acceptance probability over all inputs, or finding which input maximizes the
acceptance probability. So this is a really hard problem.
So that’s kind of what I am trying to get across in this slide: that a
simulation problem is generally easy if you have enough resources, well, if
you have resources similar to the kind of system that you are trying to
simulate. If you are trying to simulate a classical system and you have
classical resources it’s kind of an easy problem. But ground state kinds of
problems are always really hard; in general they are NP hard if your
problem is a classical problem, they could be QMA hard if it’s about a
quantum circuit, and whatever it is it’s going to be quite hard. Does that
answer your question?
>>: The definitions seem to me to be --. They have a very different flavor,
those two classes, like when you say BQP hardness versus QMA hardness.
One is like a semantic class in classical complexity, right. How do you even
know that a problem is BQP hard if you don’t have a [indiscernible]
definition of a language, even though it’s in BQP, and you reason about
probability of acceptance and so on.
>> Robin Kothari: Yeah, that’s right.
>>: That’s different from QMA, right?
>> Robin Kothari: No, QMA is also a semantic class.
>>: I thought it’s a syntactic.
>> Robin Kothari: No, there is no syntactic definition of QMA, because if I
give you a QMA machine how do you know it has this property that it has more
than 2/3 accepting probability or less than 1/3? It’s the same as classical
MA; even MA has the same problem, even BPP has this problem. These are all
semantic classes, so –-.
>>: NP would be an example.
>> Robin Kothari: NP is a syntactic class, yeah. So if you wanted to be
extremely technical I should be saying promise BQP and promise QMA, because
these are all hardness results for the promise versions of these classes,
where you assume this kind of behavior. But most people don’t care about that
level of technicality, so I don’t get into that. But morally it’s hard, it’s
QMA hard, and in the same sense the simulation problem would be BQP hard.
Any other questions about this or
about Hamiltonian simulation or anything I have talked about? So that is
what I would say is the intro part. I am trying to convince you that the
problem is interesting and what I am thinking about is interesting and why I
care about the problem. It looks good?
Okay, so this is a summary of the first part of my talk. I talked about
Hamiltonian simulation and I told you that I have this algorithm that scales
like this where [indiscernible] is D squared times the norm of the
Hamiltonian times T and it has this kind of nice dependence on error. I
haven’t told you how this algorithm comes about; I have just stated the
result. In the second half of my talk I am going to try to explain to you
how the algorithm comes about.
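Spelling out the scaling in symbols (reconstructing it from the pieces stated
above, so take the exact form as approximate rather than as a quote of the
theorem): with \(\tau = d^2 \lVert H \rVert T\), the number of queries is
\[
O\!\left( \tau \,\frac{\log(\tau/\epsilon)}{\log\log(\tau/\epsilon)} \right).
\]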
And to do that I am going to start with something called the fractional query
model. So if you have never heard of it that’s fine. You probably shouldn’t
have heard of it. It’s just a really exotic model to study. So I am going
to describe the problem, what this model is, what we know about it, what we
did and try to prove what we have done. And when I talk about this right now
it’s going to be completely unrelated to Hamiltonian simulation, so just put
that out of your mind for the next 20 minutes or so. And finally in the last
10 minutes or so I am going to connect up this problem and show how we
reduced Hamiltonian simulation to this fractional query model.
Okay, so let’s talk about quantum query complexity. So this is one of my
favorite topics. So quantum query complexity is just this model where you
have some input and you need to compute some function of this. Classically
what you have is you have oracle access to this input, in the sense that, so
think of the input as an N bit string. And you have oracle access in the
sense that you ask the oracle, “Hey, what’s the fifth bit of this string” and
the oracle replies, “It’s a 0 or it’s a 1". And quantumly it’s the same
thing, but now you are allowed to do this in superposition; that’s where a
lot of the power of quantum computing comes from.
And so the standard way to represent this oracle is like this: you give the
oracle two inputs, i and b; i is a number between 1 and N and b is just a
bit, and what’s going to happen is that it’s going to put a phase up front on
your state. And the phase is going to be plus or minus 1 based on the bit
x_i that you are trying to learn. And so that’s the standard query
complexity model. And the measure of complexity is how many queries you make
to this black box, or how many questions you ask the black box.
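In symbols, the standard phase oracle for an input string \(x \in \{0,1\}^N\)
acts as
\[
O_x \lvert i, b \rangle = (-1)^{\,b \cdot x_i} \lvert i, b \rangle ,
\]
so querying with b = 1 picks up the phase \((-1)^{x_i}\), and b = 0 leaves
the state alone.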
So for example this is a circuit that depicts a two query algorithm. There
is some unitary that’s independent of the input, then you make a query, you
do another unitary, you make a query, and that’s a two query algorithm. Now
suppose I gave you the ability to make half a query. It doesn’t really make
sense classically, because you are asking for a bit; what does it mean to
give me half a bit? So it doesn’t make any sense classically, but quantumly
it does make sense, because what’s happening is: if the bit was 1 I was going
to put a phase of minus 1; if the bit was 0 I was going to put a phase of
plus 1, or basically not do anything. But instead of putting a phase of
minus 1 I can put a phase of i, and that’s like doing half a query, because
if you do this two times you get a phase of minus 1. So 2 half queries can
simulate 1 query, and that justifies calling it a half query.
And so now think of this model where you are allowed to make half queries and
if you have a circuit of this kind where you have used 4 half queries I am
still going to charge you two queries for it. I am going to say that your
circuit made 2 queries and then maybe generalize it to a quarter query. And
like you can make 8 quarter queries and I am still going to count that as 2
queries and so on. You are allowed to make arbitrary fractional queries and
I am only going to count the total full queries.
>>: [inaudible].
>> Robin Kothari: Sorry.
>>: I went to the oracle 4 times; whether I got half a bit each time or I got
a full bit, how does that affect the amount of work?
>> Robin Kothari: Exactly, I mean it seems like this is just a crazy thing to
do, like I am charging you way less --.
>>: Like I just think it is 4 and 8, so why am I wrong?
>> Robin Kothari: Right, right. So I am going to convince you later that
this is right. But, let’s define this model which seems like –-.
>>: [inaudible].
>> Robin Kothari: Right, so it means that instead of being given this gate
that has a minus 1 here, like, replace this definition: instead of minus 1
you put the correct root of minus 1. So for half a query you put i, for a
quarter of a query you put the fourth root of minus 1. So in general it will
be e to the i pi times the fraction. So if you want to make an alpha
fractional query you put e to the i pi alpha. That’s the gate that you have
been given.
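In symbols, the alpha-fractional query is the same oracle with the minus 1
replaced by the corresponding root of minus 1:
\[
O_x^{\alpha} \lvert i, b \rangle = e^{\,i \pi \alpha \, b \cdot x_i} \lvert i, b \rangle ,
\]
and applying it \(1/\alpha\) times reproduces the full phase
\((-1)^{b \cdot x_i}\), consistent with the half-query example above.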
>>: [inaudible].
>> Robin Kothari: Right, so the only trivial observation that you can make
about that gate is that if you use it the correct number of times, like if
you use this gate 4 times, you get back the usual thing. So if I define this
model where I use this kind of counting, as opposed to just counting the
total number of times I called the oracle, it’s at least as powerful as the
usual model, but it seems like it’s way more powerful, because this model
gets to do so much more. It’s almost like it’s cheating, but the punch line
is that no, it’s not more powerful. And this was shown by, let’s see if I
can remember all the authors, [indiscernible], [indiscernible],
[indiscernible], [indiscernible], [indiscernible] and [indiscernible].
So this was a STOC paper from 2009 where they showed that if you have a
fractional query algorithm, so an algorithm in this crazy model that makes
capital T queries counted in the way that I described, it can be simulated in
the regular model, which has just normal queries, and it just uses a slightly
larger number of gates. So if you think of simulating to some constant
precision it just uses something like T log T gates, which is very close to
T, but the model seemed like it was way more powerful. The model seemed like
it was allowed to cheat and do great things, but it turned out you can
simulate it essentially with just a log factor loss.
So that’s really surprising, because with this model you wouldn’t have
thought that, but it’s true. So what we do is we improve upon this result by
getting a better dependence on epsilon. And this connects up with the
epsilon in the Hamiltonian simulation problem, which we are trying to
improve, which is the main focus of our work I guess. So that’s why epsilon
is the parameter where we are trying to reduce the dependence. So what we do
is we improve the scaling from this expression to this expression; the
difference is only the epsilon in the denominator.
Okay, so any questions about the fractional query model or what we did,
because it’s kind of a strange model to wrap your head around.
>>: [inaudible].
>> Robin Kothari: No, I am going to explain that to you, but maybe not in
great detail, but I am going to kind of explain that to you. Hopefully I can
at least convey the intuition for why that’s happening.
Okay, so I am going to prove our stronger result. So this is the number of
queries I claim that you need to simulate a fractional query algorithm that
makes only T queries. So the first thing to observe, so it’s broken up into
two steps, the first thing you need to observe is: well, how many queries do
you need to simulate a 1 query algorithm? So I have a fractional query
algorithm that makes 1 query in total, but it could be using a whole bunch of
fractional queries, and I am going to convince you that it can actually be
simulated using only log of 1 over epsilon by log log 1 over epsilon
queries.
So that’s the special case where T equals one, but this special case implies
the general case, because you can split up a T query algorithm into T
1-query algorithms, and you choose the error bounds to be small enough in
each little part so that when you combine them the total error is still small
enough. And it’s just the usual thing and you will get the right expression.
So all I need to do is convince you that a single-query fractional query
algorithm can be converted to a normal algorithm in the usual model that only
makes this number of queries to achieve error epsilon. And so if I can prove
this point number 2 then I have proved the result I have claimed.
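In symbols, the composition step is just error splitting: simulate each of
the T single-query pieces to error \(\epsilon/T\), so the total cost is
roughly
\[
T \cdot O\!\left( \frac{\log(T/\epsilon)}{\log\log(T/\epsilon)} \right)
\]
queries, which is the claimed expression.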
Okay, so this is a summary so far. What has happened is we have talked about
Hamiltonian simulation a while ago, but now put that out of your mind. I
introduced the fractional query model. I claimed that this can be simulated
with, ah, you can simulate a T query algorithm with this number of queries,
and then I talked about the special case when T equals one and I argued that
they are equivalent. So that part is easy; it’s the sketch of the proof that
I gave on the last slide.
So now what I need to do is I need to convince you that this box is true. So
that box is true assuming this box is true. Okay, so to prove this is true
I am going to prove something even weaker. So it’s going to be a chain of
reductions. So what’s even weaker than this is: we are trying to simulate
a 1 query algorithm and we don’t know how to simulate this directly, but what
we know how to do is simulate it probabilistically. Well, I like to call it
probabilistically, but you can also think of it as non-deterministically. So
we have a circuit that, when it succeeds, does the 1 query algorithm; when it
fails it does something bad, but we know how to fix that. So it’s in the
spirit of these repeat-until-success circuits.
So what do I mean precisely? So what I mean is, let’s start with a 1 query
fractional query algorithm. So what does that mean? It’s some unitary V
which can be written like this, where there are a whole bunch of fractional
queries, like there are M fractional queries, each of which is a 1 over M
fraction of a query. And I have taken all the fractions to be equal in this
example, but it also works if they are different. And you are allowed to do
arbitrary unitaries between these queries. And V is some unitary that, we
say, can be implemented in the fractional query model with cost 1. And what
we want to do is we want to implement V in the usual model with only log 1
over epsilon by log log 1 over epsilon queries.
So this weaker thing, of not being able to actually do V, but being able to
probabilistically implement V, or non-deterministically implement V, is this
thing that I talk about here. So think of this map U as a map that kind of
implements V, or it implements V in coherent superposition with something
else that you don’t care about. So what it does is it takes 0 times |psi>
and it maps it to V |psi> with the first qubit telling you that this has
worked, with some probability P; with probability 1 minus P it just does
something else, and maybe we don’t know what it does. And it’s important
that there is a qubit that tells you that you have succeeded, otherwise you
don’t know when you have succeeded. And think of P as a constant.
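Written out, the map being assumed is a unitary U such that, for every input
state |psi>,
\[
U \lvert 0 \rangle \lvert \psi \rangle = \sqrt{p}\; \lvert 0 \rangle \, V \lvert \psi \rangle \;+\; \sqrt{1-p}\; \lvert 1 \rangle \lvert \Phi_{\psi} \rangle ,
\]
where the first qubit flags success, p is a fixed constant, and
\(\lvert \Phi_{\psi} \rangle\) is whatever the failure branch produces.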
So what these guys showed is that there is really an algorithm that only
takes log 1 over epsilon by log log 1 over epsilon queries and does this
task for you. This is not the task that we set out to do, but it’s almost
the task, and it’s for constant P; think of P as .1 or something if you want
to think of a constant. And given such a procedure as a subroutine, what
we want to do is we just want to get [indiscernible] out of it. So we want
to get a circuit that just implements V for you, instead of this thing that’s
non-deterministically implementing V.
So the most straightforward thing that you would do with a circuit that
implements something with a certain probability is: you apply U, you measure
the first qubit and you see if you got 0. If you got 0 then you are good,
you have V |psi>. If you don’t have 0 then you have got this state |Phi>,
which I haven’t explained what it is, but you can work out the details, and
there is some description of |Phi> which shows you that you haven’t actually
lost |psi>. Then you reverse this, get back |psi> and try to do it again.
The problem is the reversal procedure is also probabilistic, because you need
to run the same map again.
So you have this kind of really complicated recursion relation to analyze,
where with some probability you fail and then the correction procedure also
fails with a certain probability, and this is really hard to analyze and you
need a large number of gates to do this. So this is how they were doing this
before in the previous algorithm; this is why they had this kind of bad
dependence on epsilon, because to be more precise you need to bound the total
number of branches that are going to fail and you want the total probability
of failing to be really, really small.
So we get around this problem and show that if you are given a map that does
this we can just implement V |psi> for you deterministically, like no
probabilities, non-determinism, whatever.
>>: Question, so does P depend on the given algorithm?
>> Robin Kothari: So P is just a constant. I just didn’t write it down, but
it’s something fixed. Like think of it as just .1; it’s independent of
the algorithm.
>>: But you will know it.
>> Robin Kothari: I know it, yeah. Like I would know it, but it’s some crazy
something –-.
>>: Can it be computed in principle from the given description?
>> Robin Kothari: Yes, yes, it’s something like --. There are some sines and
cosines of the fractions involved. So if all the fractional queries are 1
over M, it’s something like sine of 1 over M, plus 1, the inverse of this,
something. But they are all basic trigonometric operations. You can compute
it in principle just by looking at the V that you present to me.
>>: But it depends on the U’s and [indiscernible]?
>> Robin Kothari: It doesn’t depend on the U’s, it only depends on the
fractions that you use here. So if all of them are 1 over M then I can just
compute the number for you and it’s something simple, but if all of them are
different then it’s some number that depends on what each of those things
are. But, it’s a simple calculation, like classically you just need to tell
me the fractions in the exponent and I can tell you what P is.
Okay, so let me summarize what we know now. All right. We were at this
stage where this was the result I was trying to prove. I showed you it’s
equivalent to this simpler result and then I introduced this fractional query
model where we are trying to probabilistically simulate this in the sense of
the previous slide where with probability P it does the right thing, with
probability 1 minus P it does something wrong. And now what I want to show
you is that if you have a way of doing this then you can do that efficiently
and in fact using only a constant number of uses of this circuit.
So that’s going to be what we call oblivious amplitude amplification. Okay,
so what’s the problem? The problem is that you have this unitary U, which
takes |0> |psi> and, with probability P, maps it to the thing you want; with
probability 1 minus P, to something you don’t want. So one obvious idea is,
hey, let’s use amplitude amplification, because that’s a standard technique
to increase the probability on some good subspace and decrease the
probability on some bad subspace.
So what does amplitude amplification do? What you would need to do is you
would need to reflect about the good subspace, which in this case is the
subspace that has 0 in the first register. So that’s easy to do; it’s just
the Z gate on the first qubit. So that’s fine, but you also need to reflect
about the starting state. So amplitude amplification is a technique that
takes two reflections and does them over and over again, but the other
reflection is the reflection about the starting state, and that’s U times
this thing over here.
But |psi> is a state that you don’t know, like this is the input state of
your algorithm. You know, you are trying to simulate the behavior of a
unitary on an unknown input state. You don’t know |psi>, so you don’t have
the ability to reflect about it; you don’t even know what it is. If you
measured it or did anything you would destroy the state. So you can’t just
use amplitude amplification; that’s the problem. So what we introduce is
this thing we call oblivious amplitude amplification, and it’s oblivious in
the sense that you don’t need to know the input state. So it works when the
input state is unknown. So you’re oblivious to the input state.
And what we show is that given such a circuit you can indeed do exactly what
amplitude amplification would do, but we don’t use the same two reflections
of course, because the second reflection is something we don’t know how to
implement. So we use a different reflection, but what we show is that the
way the algorithm proceeds is that it does exactly what amplitude
amplification would have done had you had the correct reflection to do it.
And this uses ideas from [indiscernible], who introduced a couple of nice
techniques in that paper for dealing with these kinds of things. And just as
in standard amplitude amplification, if you know what P is, the success
probability, then you can boost amplitude amplification to get the right
state with probability 1.
So this is like in Grover's algorithm: if you know there is exactly one
marked item you can find it with certainty, like it’s not probabilistic
anymore. And that’s the general feature of amplitude amplification: if you
know the probability then you can exactly get the right answer. So that’s
what’s happening here. We know what P is; as Martin asked, I can compute
what P is from the description of the circuit, and then I have this
technique, oblivious amplitude amplification, that, since I know P, lets me
do amplitude amplification and exactly get probability 1 here. So what I get
at the end of the day is just V |psi>, no error.
And I have an exact statement of the theorem here, in case, I don’t know,
someone is really interested in the technical details of what this theorem,
well, lemma, states. So essentially it’s just that U and V are unitary
matrices that have this property that U acting on 0 psi produces V psi with
some amplitude sine theta, and a cosine theta amplitude on a state that does
the wrong thing. And just as in amplitude amplification you can define some
unitary S which, when you apply it T times to this, increases the angle by 2
theta each time. So, you know, you start at an angle theta, you do it once
you get 3 theta, you do it again you get 5 theta and so on.
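In symbols (my paraphrase of the lemma on the slide): if, for every |psi>,
\[
U \lvert 0 \rangle \lvert \psi \rangle = \sin\theta \, \lvert 0 \rangle V \lvert \psi \rangle + \cos\theta \, \lvert \Phi^{\perp} \rangle ,
\]
then there is a unitary S, built only from U, \(U^{\dagger}\) and the
reflection about the first register being 0 (so it never touches |psi>
directly), such that
\[
S^{T} U \lvert 0 \rangle \lvert \psi \rangle = \sin\bigl((2T+1)\theta\bigr) \, \lvert 0 \rangle V \lvert \psi \rangle + \cos\bigl((2T+1)\theta\bigr) \, \lvert \Phi'^{\perp} \rangle .
\]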
So this is exactly the statement of amplitude amplification; it’s just that
we are using a different operator here. It’s not the two reflections that
you would have had in amplitude amplification, so it needs to be proved that
using these reflections, instead of the ones that you were supposed to use,
actually works and still produces the same thing. So this is the heart of
the technical content of this lemma. But if you just want to use it as a
black box it’s just like amplitude amplification; you don’t need to worry
about the details.
So, right, so this is –-.
>>: [inaudible].
>> Robin Kothari: Yes, so --.
>>: [inaudible].
>> Robin Kothari: No, it’s exactly the same. So the states, even the states
that amplitude amplification would go through, it goes through exactly the
same set of states. So it’s really mimicking it, and effectively what’s
happening is you were supposed to use some reflection, reflection A, but we
are using reflection B. And what we show is that reflection A and B are the
same in the subspace in which you are operating. So from the perspective of
the algorithm it doesn’t know which one you used. Like, did you use the one
that you were supposed to, or did you use ours? Our reflection is kind of a
wrong one to use, but it’s the same in the subspace in which the algorithm
works. So the algorithm doesn’t know the difference. So that’s where we are
getting this power from.
>>: [inaudible].
>>: If you know P as a constant. The whole point is that you know P and
that’s why this works.
>>: You don’t need to know P.
>> Robin Kothari: No, you don’t need to know P.
>>: You just need to know P to make it deterministic.
>>: Excuse me, for the deterministic part. You don’t need to know P, but if
you do know P it’s fully deterministic.
>> Robin Kothari: Right, so that’s the same even in amplitude amplification;
if you know P you can get it. So I guess it’s a generalization in the sense
that we can apply it when you don’t know the input state, but it uses this
very specific form for the thing that you are trying to amplitude amplify.
Like, amplitude amplification works even in other cases where it’s not like
this; for example, one example could be that this probability P depends on
psi, like it’s different for every psi.
>>: Ah, I see.
>> Robin Kothari: Like that’s a different thing and then this technique
doesn’t do anything about that. But, amplitude amplification would still let
you amplify that.
>>: [inaudible].
>> Robin Kothari: Yes, yes, the set of things to which you could apply
amplitude amplification is large. There is a smaller set of things in which
our problem lies, and for that smaller set of things we have a
generalization. It doesn’t generalize amplitude amplification, but if you
happen to be in this class of things then that’s great for you. And this
technique has already found applications. So [indiscernible] and Adam have
used it in one of their recent papers to boost the success probability of
some circuits, so that’s nice.
So let me summarize again what we have so far. So I said, let’s assume that
you have this ability to probabilistically, non-deterministically implement
this unitary, and then I introduced this technique called oblivious amplitude
amplification that allows you, given a way to do this, to just use this guy a
constant number of times and get the thing you want. And then I convinced
you that these two are equivalent, so if you have this ability then you can
get all the way up there.
So now all I need to do is convince you that this box makes sense, and then
you would be convinced that the whole thing makes sense. I still need to
relate this to Hamiltonian simulation, but this is where we are right now.
Any questions about this?
Is it making sense so far?
Okay, so now let me try to explain how we do this probabilistic or
non-deterministic simulation of a 1 query algorithm. So this is really the
heart of the --. So you might have this intuition that this model is really
strong, but you can simulate it with just a constant number of queries for
constant precision. So then how is it that all the magic is happening in
this step? So let me explain how this step goes.
So again let’s start with the 1 query algorithm. It looks like this, just
like before: some M fractional queries are being made to the oracle, there
are M unitaries in between, and, as these guys showed, you can do this for
some constant. So that’s a summary of what I have been saying. Okay, so let
me introduce this thing called the fractional query gadget. So it’s a little
gadget in the sense that it’s a little circuit, and it’s the circuit over
here, and it’s nice. What it does is it has three 1-qubit gates and one
controlled Q, and it starts in |psi>, and what it does is, at the end, if you
measure the first qubit and you get a 0, then the second register is in the
state Q to the alpha |psi>.
So for example, think of setting alpha to a half, or setting alpha to a
quarter; then when you run the circuit, with some probability, you are going
to get a 0 here and this guy is exactly the state you wanted. This is a
quarter of a query done on |psi>. So right now --.
>>: So I have a question. So if U is a black box squared and
[indiscernible].
>> Robin Kothari: Yes, so --.
>>: So is it always possible to manufacture or is it an assumption?
>> Robin Kothari: No, so the way I define Q is I define Q to have this form
[demo]. Given such a Q you can always make a controlled version of it, and
that’s because this second register B is effectively a control. When you set
B to 0 the map does nothing, because B is 0 and minus 1 to the 0 is 1, so
it’s the identity map.
>>: Oh, I see.
>> Robin Kothari: So the definition of the oracle already includes control.
And this is the traditional way to define it, but if you were given a version
of the oracle that does not have control then there is no way to manufacture
it.
>>: Okay.
>> Robin Kothari: That’s hard, that’s provably hard I guess.
So, yeah, so this is a little gadget, and what it does is it uses 1 copy of Q
and, with some probability, it does Q to the alpha times |psi>. So right now
this seems like it doesn’t help at all. Like, firstly, it uses 1 full query
to do a fractional query, so that didn’t help us out, and it does it
probabilistically, so that’s pretty crappy as well. It seems like we didn’t
get anything out of this, but the good part is this circuit outputs 0, like
the outcome you want, with probability very close to one. And it’s 1 minus
big theta of alpha, and by that I mean that the amount by which it’s far away
from 1 is linear in alpha. So if alpha is really, really small, like say
0.00001, then this is almost probability 1.
So you have almost certainly succeeded when you do this map. This still
doesn’t seem great because you have still used 1 full query to do a
fractional query. It’s like, well, how does that help you? But, just keep
this in mind that this succeeds almost certainly. So this is just the
definition of the gate. It’s not really important, but just for
completeness, if somebody wants to know what the gates are.
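Here is a minimal numpy sketch of such a gadget. The exact angles below are
my reconstruction from the stated properties (one controlled-Q, success
probability 1 minus Theta(alpha)), so treat them as an assumption rather than
the talk’s exact gates; the output also carries a harmless global phase.

```python
import numpy as np

def fractional_query_gadget(Q, psi, alpha):
    """Postselected application of Q**alpha (Q has eigenvalues +-1) to psi,
    using one controlled-Q and one ancilla qubit.  Returns the normalized
    ancilla-outcome-0 state and the probability of that outcome."""
    # Ancilla rotation angle chosen so the two branches interfere to give the
    # relative phase e^{i*pi*alpha} on the -1 eigenspace of Q.
    beta = np.arctan(np.sqrt(np.tan(np.pi * alpha / 2)))
    c, s = np.cos(beta), np.sin(beta)

    branch0 = c * psi          # ancilla |0>: identity acts on psi
    branch1 = s * (Q @ psi)    # ancilla |1>: the one (controlled) query to Q
    # Final ancilla gate with first row (cos(beta), -i*sin(beta)); keep only
    # the amplitude on ancilla outcome 0.
    out0 = c * branch0 - 1j * s * branch1
    p = np.vdot(out0, out0).real   # = 1/(1 + sin(pi*alpha)) = 1 - Theta(alpha)
    return out0 / np.sqrt(p), p

# Sanity check: for Q = diag(+-1), the outcome-0 state equals Q^alpha |psi>
# up to a global phase of e^{-i*pi*alpha/2}.
rng = np.random.default_rng(1)
Q = np.diag([1, -1, -1, 1]).astype(complex)
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi /= np.linalg.norm(psi)
alpha = 0.1
out, p = fractional_query_gadget(Q, psi, alpha)
Q_alpha = np.diag(np.where(np.diag(Q).real > 0, 1, np.exp(1j * np.pi * alpha)))
assert np.allclose(out * np.exp(1j * np.pi * alpha / 2), Q_alpha @ psi)
print(p)  # about 0.76 here, and it goes to 1 as alpha -> 0
```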
So yeah, so keep this gadget in mind, and now what we can do is we can use
this gadget over and over again. So say you have some starting state. You
first want to do U 0 on it; you just do it, well, that’s just a unitary,
that’s free. You want to do Q to the 1 over M, so you do that using this
gadget over here. Don’t measure this final state, just let it be for now,
and then apply U 1 on the output of the circuit; you can do that. Then take
a fresh ancilla and do this gadget again on this state to enact the second
gate, and so on. Like, do each of these gates one after the other on the
output of the circuit.
So what I mean is, essentially, implement this circuit over here, where you
have M ancillas, one for each fractional query you are trying to do. So you
start with U 0, do this fractional query gadget, there should have been a
gate here which I have just pushed to the end, because it doesn’t matter,
there are no more gates happening on this line. Then you do U 1, and then
you do the fractional query gadget on the same state, because it’s the same
state you are trying to evolve with these fractional queries. And you do
this for all M, and you collect it all up into this one big circuit that we
are calling a segment.
And what does this segment do for you? What it does is, if you measured all
of these qubits and you got zeroes for all of them, then the circuit did
exactly what you wanted it to do. It did all the fractional queries right,
it did all the unitaries in the middle right; you are just good to go. So
what’s the probability of all of them being 0? So as I said before, the
probability of one of them being 0 is very close to 1; it was bounded away
from 1 by something like 1 over M.
So it’s like a coin toss where the probability of the bad outcome is like 1
over M and the probability of the good outcome is 1 minus 1 over M, and if
you toss this M times, what’s the probability that you always succeed? It’s
some constant. So this is a circuit that has some constant probability of
succeeding, and it’s some fixed constant that you can compute based on this
M. So this is related to what you were asking before, and it’s something,
maybe it’s like .1 or something. Yeah, it’s easy to compute with some tail
bound.
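In symbols: if each gadget fails with probability c/M for some constant c,
then
\[
\Pr[\text{all } M \text{ gadgets succeed}] = \left(1 - \frac{c}{M}\right)^{M} \;\xrightarrow{\;M \to \infty\;}\; e^{-c},
\]
a fixed constant, which is the constant success probability of the segment.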
So this is a circuit that implements the map that we wanted to implement with
constant probability, where you have succeeded when all of these qubits are
0. So that’s great, right; that’s what we were heading out to do. We wanted
a circuit that effectively implements a 1 query algorithm in the
probabilistic or non-deterministic fashion. But we also want the circuit to
make very few queries. Like, as written, this circuit now makes M queries.
So the original circuit you are trying to implement, I mean this fractional
query algorithm, makes only 1 query, its cost is 1, but now I have given you
a circuit that makes M queries, so it seems like that wasn’t useful at all.
That was just a huge waste of time, but that’s not really the case.
So now is where the real magic happens, or maybe it happened in the previous
step, I don’t know, but something really nice happens. This is the circuit I
showed you on the previous slide. So think of this input state over here,
which I am going to blow up over here. This is some fixed state; it doesn’t
depend on the oracle or anything. It’s just 0 with this matrix R sub alpha
acting on it. And R sub alpha, let me go back to where I defined R sub
alpha, here. All right. It’s a matrix like this, and these expressions are
hard to parse, so I will just tell you what happens.
R sub alpha is actually a really tiny rotation, and it takes |0> to
essentially |0> and a very little bit of |1>. So if you think of this tensor
product state, it’s an M-fold tensor product of a state that’s essentially
|0> and a little bit of non-|0> stuff. So if you write it down there is a
large weight on the first term, which is the all 0 state, then there is a
little bit of weight on the states that have only one 1 inside them, and then
there is even less weight on the states that have 2 ones, and so on. So the
weight decreases with the Hamming weight of the strings.
So this is a state that has weight on every single bit string, but the weight
goes down as the number of ones increases. So what you can do is just kill
part of the state: the part that has overlap with the high Hamming weight
states, you can just kill it, just truncate it. And because the distribution
is so strongly peaked around 0, this truncation is only going to affect your
circuit by an epsilon. So the question is: how far out in Hamming weight do
you need to go to only be affected by epsilon? And that's only this much, and
this comes out of doing a [indiscernible] calculation.
So that’s this question of: I give you M coins that are very strongly biased
towards 0, so the probability of getting a 1 is like 1 over M and the
probability of getting a 0 is 1 minus 1 over M and I toss all of these M
coins and then what’s the most likely even? It’s probably you get all
zeroes, but in fact I don’t want to just find out the most likely event, I
want to cover all the events that have probability of 1 minus epsilon. So
you just need to go to maybe a couple of heads appearing or a couple of
zeroes, because it’s so unlikely that you get 1 that if you just accounted
for the possibility of there just being a couple of ones you have basically
covered all possibilities. And the possibilities that you have neglected are
only an epsilon fraction.
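Here is a small sketch of that tail calculation, modeling the M ancillas as
independent coins that each come up 1 with probability exactly 1 over M; the
talk's actual amplitudes only approximately follow this, so the cutoffs below
are illustrative:

```python
from math import comb

def tail_cutoff(M, eps):
    """Smallest k with P(Bin(M, 1/M) > k) <= eps."""
    p = 1.0 / M
    cumulative = 0.0
    for k in range(M + 1):
        cumulative += comb(M, k) * p**k * (1 - p) ** (M - k)
        if 1.0 - cumulative <= eps:
            return k
    return M

M = 1000
for eps in [1e-2, 1e-4, 1e-8, 1e-12]:
    print(f"eps = {eps:.0e}: keep Hamming weight <= {tail_cutoff(M, eps)}")
```

The cutoff grows very slowly as epsilon shrinks, which is where the
exponential improvement in precision comes from.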
So what we do in this circuit is just replace this state with the truncated
state, and this truncated state has the property that it's epsilon close, but
it has very few ones. Why does it help us that it has very few ones? The ones
decide when the queries happen. These are controlled query gates, and the
queries only happen when there is a 1. So if most of the time no queries are
happening, then you don't really need to be making that many queries. For
example, if you know that this state over here has at most this number of
ones, then really there is only this number of queries happening in the
circuit. It looks like there are more queries, but there is a way to
rearrange the circuit so that you only make this number of queries, because
in any branch of the superposition there is only this number of ones.
>>: Are you still doing M U’s?
>> Robin Kothari: Right, right, M U's. So the U's are free.
>>: Why are they free?
>> Robin Kothari: So we only charge the number of queries made to this oracle
in this model.
>>: Okay?
>>: In query model.
>>: And that’s fine, but I am just saying the point is you had replicate the
U you would have done with one query to be now M U’s and you don’t care if
you only have to do two queries, I still have to do M U’s now instead of 1 U.
>> Robin Kothari: Right, right.
>>: And U is where the actual system is running.
>> Robin Kothari: Uh.
>>: The query is just telling you the parameter that goes at this point and
you still have to use it.
>> Robin Kothari: Yeah.
>>: So the depth of the circuit got massively bigger.
>> Robin Kothari: Uh, well actually, yeah, that’s not necessarily true.
>>: Okay.
>> Robin Kothari: What I need to do here now is rearrange the circuit so that
it actually only makes this number of queries. Like, as written, it still
makes M queries. And in the new circuit, what will happen is there will be a
bunch of --.
>>: What if I use different U’s? Like a U that does M minus 3, which is the
0 case and then 3 of them that aren’t [indiscernible] the queries. What
about that?
>> Robin Kothari: There would be new U’s, yeah.
>>: Like I haven’t seen a reason why that M is not a lot more complex or more
depth then these U’s at the M the level of them.
>> Robin Kothari: Yeah, so if your U’s are complicated to do then this isn’t
going to help.
>>: That’s fair, I just needed to hear that and I am okay.
>> Robin Kothari: Yeah, yeah, yeah; I mean in these query models the
assumption is that the query is the main cost, and we reduce the dependence
on that.
>>: Okay.
>> Robin Kothari: Yeah, so I mean all non query operations are free, anything
you feel like.
>>: I think you need to be more cautious here. I mean, even if the cost of U
is negligible, right, what you are claiming is you end up doing
[indiscernible] O of 1 queries.
>> Robin Kothari: Right.
>>: So that means that the entire thing, right, doesn’t depend on anything.
>> Robin Kothari: Uh.
>>: [inaudible].
>> Robin Kothari: Right.
>>: [inaudible].
>> Robin Kothari: Yes.
>>: But if M is very large, like if M tends to infinity, then giving even a
tiny cost to U will add up and will add [indiscernible], and the second point
is it will outgrow [indiscernible].
>>: Well, and plus, as M grows you have to do more of the fractional Q's to
get the Hamming weight correct. So then you also have to do more queries.
>> Robin Kothari: Right, right.
>>: So it’s not a clean and simple setup. You can claim that it’s
practically useful, but in a [indiscernible] sense you have kind of ruined
them.
>> Robin Kothari: No, no, I guess maybe we need to come back to the
definition of the standard query complexity model. In the standard query
complexity model you are allowed to call the query gate Q some number of
times; that's going to be the cost measure. But you are allowed to do
arbitrary unitaries otherwise, which are not charged for. You can even solve
undecidable problems, you can solve exponentially hard problems. The standard
query complexity model allows you to do input independent unitaries,
whichever ones you like, as many as you want.
>>: Mr. Kothari, we agree; we are talking more about how, if I was asked to
put this on a machine, there are practical reasons why you can't ignore the
other items.
>> Robin Kothari: Right, right. So when I get to applying this to Hamiltonian
simulation I will need to care about this, because I am going to give you a
bound on the total number of gates needed and I am going to show you that
those are also polynomial. So when I apply this to the specific case of
Hamiltonian simulation everything is efficient, but in general there is not
much I can say, because in the query complexity model you are just allowed to
use gates that are crazy to implement. Your U 0 itself could be solving an
undecidable problem, and then there is nothing you can do in any model to
make that efficient.
>>: Also, I have an issue here of whether alpha should literally be seen as
depending on M.
>> Robin Kothari: Sorry, what do you mean?
>>: So right now, if I look at this literally, alpha is some magical small
angle.
>> Robin Kothari: Uh, so alpha is just --.
>>: Like it does not have to change as M grows.
>>: Oh, yeah, no.
>> Robin Kothari: No, no, alpha is 1 over M; I just haven't defined it here.
>>: [indiscernible].
>> Robin Kothari: Right, so alpha is the exponent of Q. When I defined the
gadget, this little gadget, it does Q to the alpha for you. In our example we
wanted Q to the 1 over M, so alpha is 1 over M for the next two slides. So
here, this alpha is 1 over M; it's exactly 1 over M.
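For intuition about what Q to the alpha means when Q squares to the identity,
here is a sketch; it is not the talk's gadget (which implements this with
controlled queries to Q), just the underlying linear algebra. Split Q into
its plus 1 and minus 1 eigenspaces and raise the minus 1 phase to the alpha:

```python
import numpy as np

def fractional_power(Q, alpha):
    """Q^alpha for a unitary with eigenvalues +/-1 (i.e. Q @ Q = I),
    taking the principal branch: +1 -> 1, -1 -> exp(i*pi*alpha)."""
    proj_minus = (np.eye(len(Q)) - Q) / 2   # projector onto -1 eigenspace
    proj_plus = (np.eye(len(Q)) + Q) / 2    # projector onto +1 eigenspace
    return proj_plus + np.exp(1j * np.pi * alpha) * proj_minus

# Example: Q = Pauli X, which squares to the identity.
Q = np.array([[0, 1], [1, 0]], dtype=complex)
M = 8
Q_frac = fractional_power(Q, 1.0 / M)
# Doing the 1/M-th power M times recovers Q.
print(np.allclose(np.linalg.matrix_power(Q_frac, M), Q))  # True
```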
>>: So we know that in practice the cost of doing a small angle increases as
the angle gets smaller. I mean even with [indiscernible] magic.
>> Robin Kothari: Oh, you mean for actually implementing [indiscernible]?
>>: In terms of actually implementing the R’s.
>> Robin Kothari: Yeah, yeah, yeah.
>>: But, his comment is there are very few of them because you only have to
do a few queries. So the [indiscernible].
>> Robin Kothari: No, I guess what I am saying is that in this model we don't
charge for this, but when I actually do Hamiltonian simulation I will charge
for it, and it won't be too bad, like it will be fine. It will scale; there
will be some dependence on M when we do Hamiltonian simulation because of
exactly this thing.
>>: But I would write alpha depending on M right there, so there is no --.
>>: [inaudible].
>> Robin Kothari: Yeah, yeah, it’s a good point; maybe I will put alpha
equals 1 over M on the side, that’s good.
>>: [inaudible].
>>: Well let him get to the point where it actually costs him.
>>: There are more simplifications ahead.
>> Robin Kothari: Right, right, so you are concerned with the query cost?
>>: Is it an amortized cost? You had said you ran over all these binary
strings, but they are heavily concentrated at the low weight. So couldn't you
use that to kind of say how many of these queries actually executed? Don't
you have to accommodate the worst case?
>> Robin Kothari: Right, yeah, yeah, so this is not a proof. This is a
circuit that still makes M queries, because how do you count the number of
queries? You look at the circuit and see how many Q gates there are. So from
this circuit I would have to write down for you another circuit that only has
this number of Q gates in it; the full circuit, when you look at it drawn on
a piece of paper, has only a constant, or whatever this number is, of Q gates
in it. And I am just saying that morally it can be done, because the queries
aren't really all happening in superposition.
But I need to show you that, which I am not going to do because it's kind of
technical. But yeah, morally I am trying to convince you that if every branch
of the superposition only makes 5 queries, even though the circuit has 100
query gates, then somehow there should have been a way to write this circuit
down so that it only made 5 queries. So I guess that's what I am saying.
Yeah, okay, so that’s the last piece. That was to show you how this is done.
So that completes this chain of reasoning over here. Okay, so, okay, I will
probably just take 5 or 10 more minutes and wrap this up. Okay, are there
any questions about this chain here? I wanted to prove that, so I reduced it
to this, which I reduced to this, then I proved this and then by the sequence
of reductions I have proved what I wanted to prove. Does that kind of make
sense?
Okay, yeah, so now this reduction from here to here is also kind of
technical, so I am just going to try to convince you why Hamiltonian
simulation is at all related to fractional query simulation, because at first
glance they have absolutely nothing to do with each other. One is a result
about some exotic query model and the other is about Hamiltonians. So what is
this reduction? Take a very simple case: you have two Hamiltonians, and H is
the sum H 1 plus H 2. Let's define Q 1 to be E to the minus I H 1 and Q 2 to
be E to the minus I H 2. And you want to simulate the Hamiltonian H for time
T, so that means you want to do this gate, or this unitary.
So we know, by the Lie-Trotter product formula, that this matrix is
approximately: you do the first one for time T over M, you do the second one
for time T over M, and then you do this whole thing M times. And this is
approximately true; the error depends on M. So this guy I have defined as
Q 1 and this guy I have defined as Q 2, so this gate is essentially Q 1 to
the T over M times Q 2 to the T over M, done M times.
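A quick numerical sanity check of that product formula, with small random
Hermitian matrices standing in for H 1 and H 2 (expm is SciPy's matrix
exponential):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)

def random_hermitian(n):
    A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (A + A.conj().T) / 2

n, T = 4, 1.0
H1, H2 = random_hermitian(n), random_hermitian(n)
exact = expm(-1j * (H1 + H2) * T)

for M in [1, 10, 100, 1000]:
    step = expm(-1j * H1 * T / M) @ expm(-1j * H2 * T / M)
    trotter = np.linalg.matrix_power(step, M)
    err = np.linalg.norm(trotter - exact, 2)
    print(f"M = {M:5d}: error = {err:.2e}")  # shrinks roughly like 1/M
```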
>>: Just a notational question: isn’t that M queries to Q 1 and Q 2?
>> Robin Kothari: Uh, no, no, you mean here?
>>: Yeah.
>> Robin Kothari: Well, I guess it depends on what you mean by queries. But
if you think of Q 1 and Q 2 in this fractional query model where you only
charge for the total sum of the exponents, this is T over M and it is
happening M times, so it really only costs T in this magical model where you
only charge for these things. But, as I previously convinced you, even though
this model seems crazy you can actually convert it to the standard model.
So, right, if you go through this conversion it turns out that this actually
works, but you need to decompose the Hamiltonian into a sum of Hamiltonians,
each of which looks like a query oracle. They look like the kinds of things
that the previous model assumed. So how do you get that, or what does that
mean? Actually, let me go back to the slide where I described that. So this
fractional query gadget: the only assumption we made on Q is that Q was a
unitary that squares to the identity, which is a property shared by this
unitary, of course. When you square it you get the identity because plus 1
and minus 1 both square to 1.
So you need to come up with a sum of Hamiltonians such that each E to the
minus I H 1 squares to the identity, which means the unitary's eigenvalues
are only plus or minus 1, and if the unitary's eigenvalues are plus or minus
1, the Hamiltonian's eigenvalues are 0 or pi. So you need to decompose your
Hamiltonian into a sum of Hamiltonians, all of which have only the 2
eigenvalues 0 and pi. So now that's the new challenge: I give you an
arbitrary sparse Hamiltonian and you need to break it up into a sum of
Hamiltonians, each of which has two distinct eigenvalues, and they need to be
exactly 0 and pi.
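As a sketch of why eigenvalues 0 and pi are the right condition, here is a
toy check: build H as pi times an arbitrary projector, so its spectrum is
{0, pi}, and verify that E to the minus I H squares to the identity. The
projector is just a stand-in for one term of the decomposition:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)

# H = pi * P for an orthogonal projector P has eigenvalues 0 and pi only.
n = 4
V = np.linalg.qr(rng.normal(size=(n, 2)) + 1j * rng.normal(size=(n, 2)))[0]
P = V @ V.conj().T                    # rank-2 orthogonal projector
H = np.pi * P                         # eigenvalues: 0 (twice), pi (twice)

Q = expm(-1j * H)                     # the corresponding "query" unitary
print(np.allclose(Q @ Q, np.eye(n)))  # True: Q squares to the identity
```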
Well, now that sounds like just a crazy problem. How can I solve that, or why
would that be something that's easy? It turns out that this is actually kind
of easy. People have done it, or done things similar to it, before, and a lot
of the previous Hamiltonian simulation techniques have studied this kind of
problem. The way you do this is you break it up into two steps: first you
decompose H into a bunch of 1-sparse Hamiltonians, where 1-sparse means that
in every row or column there is only 1 nonzero entry. And almost every
Hamiltonian simulation technique until now has had this as the first step:
you always decompose into a sum of 1-sparse Hamiltonians. This can be done;
there are a bunch of ways of doing it, and we give a new way of doing it in
our paper, which is better in some respects. But anyway, that's doable.
And then, given a 1-sparse Hamiltonian, you can decompose it into
Hamiltonians that have only 2 different eigenvalues. So that's essentially
this step: you take this Hamiltonian, break it up into a bunch of
Hamiltonians, each of which is something that the fractional query model
understands and already knows how to deal with, and then you use that whole
reduction and see what you get at the end. You do have to work it through
specifically for Hamiltonian simulation; it's not easy to see what happens if
you just plug it in. So we go through and do this and compute what finally
happens.
>>: So if I have N terms to start with, right, how many do I have here at the
bottom?
>> Robin Kothari: Here?
>>: Yeah.
>> Robin Kothari: So if you have N terms in the sum? Uh, I am sorry, what do
you mean? Like if the Hamiltonian already decomposes --.
>>: H is the sum of N Hamiltonians, and now I am going to go and break each
one of those apart; each one is going to explode by how much?
>> Robin Kothari: Okay, well, that depends on the properties of the
individual terms. So what do your terms look like; are they d-sparse, or are
they?
>>: No.
>> Robin Kothari: Uh.
>>: I am doing quantum chemistry.
>> Robin Kothari: Oh, okay.
>>: So some of them will be very tiny, some of them will span the entire set
of qubits.
>> Robin Kothari: Right.
>>: So they are one sparse in this language and [inaudible].
>> Robin Kothari: Right, right, right.
>>: [inaudible]?
>> Robin Kothari: So the individual Hamiltonians are like products of Pauli
operators, right? They are like sigma Z tensor sigma X, something of that
kind.
>>: [inaudible].
>> Robin Kothari: It’s, I mean, um --. So actually if you have these poly
operators as in quantum chemistry you don’t really need to do this. I mean,
the only thing we wanted from your Hamiltonian was it had two different
eigenvalues; poly operators already have that. So you don’t need to go
through this decomposition. So in your application to quantum chemistry,
yeah, forget about this, you can directly jump to the end, because you
already have just 2 eigenvalues.
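A quick check of that observation: a Pauli string squares to the identity, so
its only eigenvalues are plus and minus 1, which is all the reduction needs.
The particular string Z tensor X tensor I below is arbitrary:

```python
import numpy as np
from functools import reduce

I = np.eye(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# A Pauli string squares to the identity, so its spectrum is {+1, -1}.
P = reduce(np.kron, [Z, X, I])
print(np.allclose(P @ P, np.eye(8)))                    # True
print(sorted(set(np.round(np.linalg.eigvalsh(P), 6))))  # [-1.0, 1.0]
```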
>>: Okay, so I don’t explore my number of [inaudible]?
>> Robin Kothari: No, no, uh, right. We are trying to deal with the general
case where they are sparse, so we need to worry about this. But yeah, that's
nice: in specific cases where you already have a decomposition into a nice
form you can avoid this stuff. So yeah, that's essentially what links these
two problems together. I have only sketched it because the details are gory,
I guess. So this is what I wanted to explain. So, you know, I have convinced
you that this thing makes sense, and on the previous slide I showed you the
reduction, so now the claim is that this makes sense based on everything I
have tried to convince you of in the last hour.
Um, okay, and there is also a lower bound, but maybe I will skip that. It's
not as interesting; people who work in query complexity all the time might be
interested in seeing exactly how the lower bound goes, but maybe I can tell
you later, or I can tell you if there is time. Instead I would just like to
highlight some other results that are in this paper, or go back and give you
a big picture perspective. So the previous Hamiltonian simulation algorithms
had this log star dependence on N, and we show how to get rid of it, and it's
not specific to our algorithm; we show how to get rid of it in other
algorithms too.
So there was a reason this log star N kept coming up, and we just tackle the
thing directly and show that, well, you never needed it in the first place.
It's not a big deal; log star N is at most 6 for all these values of N, but
theoretically it's very unsatisfying to have this annoying function of N. It
also made the papers really long, because every time you explain this log
star N bit it takes like 3 pages, because an algorithm that gives you log
star N in the running time has to be complicated. How would you get a
function like log star unless you had a complicated algorithm to produce it?
So that's nice, and our algorithm also has gate efficiency.
So this get’s to the point of how you actually do this. So, and by gait
efficient I guess we mean that the total number of non-query gaits is in the
total number of 1 or 2 cubit gates that we use in addition to the query gates
this is pretty comparable to the query complexity. Like, it’s maybe a log
factor over the query complexity and our algorithm also works for time
dependent Hamiltonians and there is some dependence on the first state of the
Hamiltonian, etc. If you care about time dependent Hamiltonians. We have
some improvements. There is a D squared term in our running time, but if the
Hamiltonian is local then it’s actually only D, there are some savings if you
know the specific form for the Hamiltonian or of you know an explicit
decomposition to poly’s like in the quantum chemistry example, then there is
a speed up you can get.
And yeah, as I explained, the error dependence is optimal for both problems,
meaning both for Hamiltonian simulation and for the fractional query model,
if you think of them separately. We show two different lower bounds showing
that you need this dependence on error, so it's not possible to improve that.
So what's open? Our algorithm, say for the fractional query model, goes like
this, and we know a lower bound of this expression. So the epsilon dependence
by itself is tight. We know that you need a linear number of queries, this
was shown before, but that doesn't mean the complexity has to go like this.
In particular this goes like T log T "ish" in T, so there is definitely the
possibility of improving that, but it also doesn't need to go like this
particular function; it could go like T plus log of 1 over epsilon. I mean,
there are possibilities for improving the upper or the lower bound, I guess
is what I am saying. Not by too much, they are kind of close, but there is
some room to improve. And then of course there are applications: what we can
do in instances that people care about, for example quantum chemistry.
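For reference, the shape of the bounds being compared here, as reconstructed
from the paper this talk is based on (Berry, Childs, Cleve, Kothari, Somma);
the exact parameters should be checked against the paper itself:

```latex
% tau collects the simulation parameters: d = sparsity, t = evolution time.
\tau = d^2 \, \|H\|_{\max} \, t, \qquad
\text{upper bound: } O\!\left(\tau \,
    \frac{\log(\tau/\epsilon)}{\log\log(\tau/\epsilon)}\right), \qquad
\text{lower bound: } \Omega\!\left(\tau +
    \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right).
```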
And then there is this third question, which is very specific to people who
work on this: our dependence on the degree is quadratic, but we think it
should be linear, and we don't know how to do that, and it's important for
things that have high degree. So that's something that's also open. Okay,
yeah, I think that's the end of what I wanted to say. Are there any questions
or anything?
>>: So for number 3, do you see any way around this problem other than
devising a graph coloring algorithm that's as efficient as [indiscernible]?
>> Robin Kothari: No, no, I think what’s going to have to be done is someone
is going to have to come up with a good [indiscernible] coloring algorithm
that uses [indiscernible] of D colors to color a [indiscernible] by
[indiscernible] graph.
>>: [inaudible].
>> Robin Kothari: Yeah, well, that only makes your life easier, but I guess
you could assume it's [inaudible]. The only way I know [inaudible].
Any other questions?
[clapping]
Okay, so --.