
>> Alex Bacharov: All right. Hello, everyone. Today we are hosting Dominic Berry from Macquarie University, from down under. He's an Australian Research Council Future Fellow at Macquarie working on quantum information and metrology, and quantum optics.
Today he will tell us about progress in quantum Hamiltonian simulation. Without
further ado, Dominic Berry.
>> Dominic Berry: Thank you. So I have two parts to this talk. The first part is looking at some brand new work on doing compression of Trotterization for Hamiltonian simulation, which is actually giving us speedups which people have been trying to get for a few years unsuccessfully, and then the second part is some slightly older work, which was probably just last year, on using quantum walks for Hamiltonian simulation.
And that does outperform the newer work in some ways, but the newer work outperforms it in other ways. So with this I encourage everyone to ask questions at any time during the talk, because I don't want people to be just sitting there and not understanding things, and I'd rather not get to the end of the talk and find everyone has been bored. Okay? So now for the very simple introductory stuff. Quantum computing was originally coming from the idea of Feynman that you could use quantum systems to simulate other quantum systems, and you'd just be thinking of having some physical system, which would be encoded on a different system in a quantum computer, and then you would be trying to have evolution of the state of your quantum computer which would be mirroring what the actual evolution was in the physical system you were trying to simulate.
So essentially that's the type of thing that we're looking at. And there are two different scenarios which people have looked at for this. The first was the scenario by Lloyd, which was that you are looking at some system which is a tensor product of relatively small systems, so a product of qubits or systems with limited dimension, and then in quantum systems the Hamiltonian typically isn't some global thing over the entire system; it's going to be a sum of interaction Hamiltonians, which are just interacting between two particles or two subsystems. And what you get then is a Hamiltonian which is a sum of terms, and those individual terms are the things that are going to be easy to simulate.
Now, an alternative scenario that has been looked at is that you have a Hamiltonian which is sparse, and the idea there is that, well, if you had a Hamiltonian which is the sum of terms like this it would automatically be sparse, but then you could imagine that you don't have something which is given to you in a nice form like that. You have something where, if you are given a row number, you can calculate where the non-zero elements are in that row of the matrix, but you don't already know how to actually express it as a sum. And this is essentially the sparse Hamiltonian simulation problem, which you could use for more physical simulation problems, but you can also use Hamiltonian simulation as a basis for other algorithms.
So people looked at this for things like the Harrow, Hassidim and Lloyd work on solving linear systems. Okay? So in the standard methods going back to Seth Lloyd's work in '96 you have your Hamiltonian, which is a sum of terms, and these individual Hamiltonians, the H_k, are limited-dimension interaction Hamiltonians. Then for a short time you can express the evolution like this, so it's just a product of the evolutions under those individual Hamiltonians, and for long times you just divide it up into R intervals, and you want to choose the R in order to make the overall error small, whereas if you just did this over the entire time then the error would become large for larger T.
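To make that concrete, here is a minimal numpy sketch (my illustration, not the speaker's code) of the first-order formula: the product of the short evolutions under the individual H_k, repeated R times, approaches the full evolution as R grows.

```python
# First-order Lie-Trotter formula: e^{-iHT} with H = sum_k H_k is
# approximated by R repetitions of the product of e^{-i H_k T/R}.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)

def random_hermitian(n):
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (a + a.conj().T) / 2

h_terms = [random_hermitian(4) for _ in range(3)]   # the individual H_k
H = sum(h_terms)
T, R = 1.0, 100                                     # total time, Trotter steps

exact = expm(-1j * H * T)
step = np.eye(4, dtype=complex)
for h in h_terms:
    step = expm(-1j * h * T / R) @ step             # product of short evolutions
trotter = np.linalg.matrix_power(step, R)

# The error shrinks like 1/R for the first-order formula.
print(np.linalg.norm(trotter - exact, 2))
```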
So for the sparse Hamiltonian simulation you're thinking of something like this example here. So in this example we have no more than two elements in each column, and correspondingly each row, and of course it's Hermitian, so you have an element there corresponding to a complex conjugate there.
So then you would imagine that there would be some way you could calculate, given a column number, where the non-zero elements were and what the values were. So you would describe this by an oracle and say that you were given a j, k for the row and column number, for example, in some input register, and then it's putting the value of that matrix element into an output register. And then you would be using that to try to get the simulation.
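As a classical stand-in for that oracle (a toy illustration of mine; the function names are made up), you can picture two queries: one returning the positions of the non-zero elements in a row, and one returning the matrix element itself.

```python
# A classical stand-in for the sparse-Hamiltonian oracle just described.
import numpy as np

H = np.array([[0.0, 2.0, 0.0, 0.0],
              [2.0, 0.0, 0.0, 1.0j],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, -1.0j, 0.0, 0.0]])     # 2-sparse and Hermitian

def oracle_positions(j):
    """Column indices of the non-zero entries in row j."""
    return [k for k in range(H.shape[1]) if H[j, k] != 0]

def oracle_value(j, k):
    """The matrix element H_{jk}, written into an output register."""
    return H[j, k]

print(oracle_positions(1), oracle_value(1, 3))
```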
So to give you a quick overview of the various techniques: the original proposal was by Aharonov and Ta-Shma back in 2003, which was for decomposing the Hamiltonian into one-sparse Hamiltonians, and then the work by me and Richard Cleve and some other people was on improved decomposition techniques, which are more efficient. And Andrew Childs has also improved on that further.
And the other thing which we want to improve is the scaling of the complexity in terms of the overall time of the simulation. And for that you can use higher-order decomposition formulae, rather than just the standard Trotter formula. These arbitrary-order ones were originally given by Suzuki in the '90s, and then we looked at applying this to quantum simulation in a paper back in 2007.
Yeah?
>>: [inaudible]
>> Dominic Berry: Oh, okay. So this is actually two-sparse, because it has no more than two non-zero elements in any row or column. The one-sparse is just breaking it up so you have just one non-zero element in any row or column. So okay. So --
>>: So you're breaking things up as a [inaudible]?
>> Dominic Berry: Yeah, into a sum.
>>: We would like to know what you’re doing, of course.
>> Dominic Berry: Yes, it can always be done, but if all you're given is the oracle -- the algorithm is given the positions of things in a given row -- then it's tricky to do that in a coherent way.
And you would imagine that if you have, say, D non-zero elements in any column, then you could just break it up into a sum of D one-sparse Hamiltonians, but that turns out to be very, very tricky, and we have no way of doing that coherently on a quantum computer. The best we can actually do is D squared, and this work by Andrew Childs and Robin Kothari was actually breaking Hamiltonians up into order D. But those Hamiltonians were efficiently simulatable rather than actually one-sparse. So we still don't have any way of systematically breaking up into fewer than order D squared one-sparse Hamiltonians.
>>: What was the D?
>> Dominic Berry: Oh, the D was the number of non-zero elements in a row or
column.
>>: How [inaudible]?
>> Dominic Berry: Yeah. So just looking at what the maximum --
>>: [inaudible].
>> Dominic Berry: Okay. So Hamiltonian simulation of course is going by this formula, and there are a number of quantities which we're going to be talking about here. I've already mentioned we have the time of evolution under the Hamiltonian, we also have the sparseness, and another important thing is the epsilon, which is the allowable error in the simulation.
So one problem with a lot of earlier algorithms was that they were actually scaling polynomially in one over epsilon. And ideally we'd want algorithms which are scaling polynomially in the log of one over epsilon, which would allow you to do accurate simulations far more quickly, and that's what we're actually achieving with this new approach.
We also have another couple of things: the norm of the Hamiltonian to be simulated, and typically the norm of the Hamiltonian pairs up with the time, because you can just rescale the Hamiltonian and the time to give you the same problem. You can also look at time-dependent Hamiltonians, which I've indicated there, and then the norm of the time derivative of the Hamiltonian becomes something important. And you can also be looking at norms of higher-order derivatives as well, but I won't really be getting into that in this talk.
And there's also the dimension of the system, which I won't really be looking at the scaling in for this talk, just for simplicity. Okay. So as I was saying before, with the standard methods you always get scaling polynomial in one over the allowed error, and also if you're using these Trotterization approaches for time-dependent Hamiltonians you get a result that depends heavily on the derivatives of the Hamiltonian. So if you have a rapidly changing Hamiltonian that's going to be inefficient.
Also --
Yeah?
>>: So [inaudible] scaling here?
>> Dominic Berry: Yeah. This is scaling of the complexity, so the number of oracle calls to the Hamiltonian. You essentially quantify the complexity in two things: one is the number of oracle calls to the Hamiltonian, and the other is the number of additional gates, and really you want it to be efficient in both of those things.
Yeah?
>>: [inaudible]
>> Dominic Berry: Yeah.
>>: Suppose you have in each column [inaudible]. But some rows --
>> Dominic Berry: No, no. This is Hermitian. If you take the transpose complex conjugate it's the same thing. So it's always going to be the same maximum in the rows as it is in the columns.
>>: [inaudible]. So you want always to take the maximum of both?
>> Dominic Berry: Yeah. Well, it’s the same number.
>>: It’s always the same.
>> Dominic Berry: Yeah. Now, you can also look at a slightly different problem -- you could also look at things like unitary implementation, which I will have right at the end of this, so I might have time to talk about it, but there you have to worry about the sparseness in both directions.
Okay, so yeah. Getting back to here: we have a scaling which is always superlinear in T when we do these Trotterization approaches, whereas the lower bound is actually strictly linear scaling in T, and also the scaling is at best D cubed in the sparseness, and that's using the scheme of Andrew Childs and Kothari.
>>: What do you mean the lower bound?
>> Dominic Berry: The complexity of the simulation. So essentially it's telling you that if a system is evolving over a time T, you can't simulate the evolution of it with complexity that's less than linear in T. So --
>>: What does it mean, scaling the complexity? [inaudible].
>> Dominic Berry: Well, once again, the scaling in the number of oracle calls and -- yeah.
>>: What is the best? What is the worst?
>> Dominic Berry: Well, the worst is infinitely if you don’t have an algorithm.
>>: Right. But if I do have one [inaudible].
>> Dominic Berry: Well, for these ones it's going as D to the power of four plus one on 2k, I think, so it's effectively D to the fourth. And theirs goes like D cubed. And if you want to know what the lower bound is, then the lower bound is actually root D, because if you're thinking of having a problem which is encoding a search problem, then the lower bound on the complexity of search means that you can't solve it in less than root D. Okay.
So these are two distinct areas. This first pair of things is improved by the first algorithm, which I'll be talking about; that's the new one. And this second pair of things is improved by the quantum walks approach, which is in the second half of the talk.
>>: Are you assuming so [inaudible]?
>> Dominic Berry: No, we don’t have any dependence on condition number at all.
>>: I'm afraid I've lost you. So "at best" means worst case? If D cubed is the worst case, what can you reach in the worst case?
>> Dominic Berry: Essentially, yeah. So if you have an artificially simple problem then you can solve it more easily, but any algorithm which is going to be looking at solving all problems is going to have one problem which is going to be order D --
>>: [inaudible] So worst case, which is to say, average case -- sorry, is it average case? Because when you go --
Because when you go ->> Dominic Berry: For these types of things the average case is actually basically
the same as the worst case. And we have a slightly different thing with the second
algorithm because with the second algorithm we can actually prove that we have
scaling at worst as D to the two-thirds, about for any realistic problem it goes like
virtually root D. And it’s just that we can’t rule out really, really pathological
examples where it would be going like, D to the two-thirds, but this is D cubed
scaling isn’t like that. It’s like you’d expect pretty much everything to be scaling like
that.
Okay. So this is the first algorithm. We're calling this polylog Hamiltonian simulation because we get simulation which is linear in some things, the time for example, and then it has a polylog factor involving basically all the parameters which I was talking about. So to go into the results in a little bit more detail: our result was that we can decompose a sparse Hamiltonian into order D squared one-sparse Hamiltonians, and we have a complexity involving this log star, which is essentially, if you're starting with N, you look at how many times you have to take the log to get down to two or three or something like that. And if you're looking at any sensible number it's like no more than five or so.
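For anyone who hasn't met the iterated logarithm, a quick sketch (mine, not from the talk):

```python
# The iterated logarithm log* n: how many times you take log2
# before dropping to 2 or below.
import math

def log_star(n):
    count = 0
    while n > 2:
        n = math.log2(n)
        count += 1
    return count

# Even for astronomically large inputs log* stays tiny:
print(log_star(10**80))   # -> 4, effectively a constant
```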
So you can really imagine that that's constant. We can improve that to order D -- and I should have corrected this: those aren't actually one-sparse Hamiltonians, they're efficiently simulated Hamiltonians -- but then there's an additional cost of complexity linear in D, and that's the Childs and Kothari paper. Then if we look at using the arbitrary-order product formula, actually called the Lie-Trotter-Suzuki formula, we get a norm of H times T to the power of one plus one on 2k, and that k is like the order of this product formula.
And the scaling in terms of the allowable error is like one on epsilon to the one on 2k. So with a large k you can make this a small power of epsilon, but you can't make it logarithmic in one on epsilon. And this is actually summarizing the result in the second part of the talk, which is that we can use quantum walks, which gives us strictly linear scaling in the norm of H and T, but then that makes the scaling in the error worse.
Then if we're looking at time-dependent Hamiltonians, we get a similar scaling to this, except we now have dependence on all the derivatives of that Hamiltonian, up to 2k-th order derivatives I think, in here.
>>: Just for completeness, what’s the definition of the norm of H here?
>> Dominic Berry: So if I remember correctly this is like the 2-norm. So if you're thinking of the 2-norm of the state, and you're multiplying the Hamiltonian by the state, it's the largest factor by which the norm of the state can be increased.
>>: Okay.
>> Dominic Berry: Or if you're looking at [inaudible] values it's the --
>>: The square of a large [inaudible].
>> Dominic Berry: Yeah, yeah.
It's the maximum eigenvalue; you don't have a square factor in there. Okay. --
>>: There's various [inaudible].
>> Dominic Berry: Yeah. The important thing is that the norm is actually linear in H: multiply H by a constant factor and that norm goes up by a consistent factor.
Okay. So then another thing: there's another algorithm, which was proposed by David Poulin and colleagues back in 2011, which was for time-dependent Hamiltonians, and which actually managed to entirely get rid of the dependence on the derivatives of the Hamiltonian, but then it had much worse scaling in epsilon. I think that was like one on epsilon cubed, or sorry, one on epsilon squared in that case. And it also had worse scaling in the time T. Okay. So for the new algorithm we have here, we have scaling that goes like the norm of H times T, which is like the lower bound, so you have to at least have that scaling, and we also have a D squared, which is improving on the D cubed, which was the best before except in the case of the quantum walk approach, and then it's polylog in everything else.
So the polylog scaling in the error has never been achieved before, and you also have the polylog dependence on the derivative of the Hamiltonian. So unlike the earlier approach, which had poor scaling in epsilon but no dependence on the derivative, we have some dependence on the derivative, but it's only on the log of it, and the scaling in the norm of H and T is linear up to the polylog factor. And the only time that's been better before is with our other algorithm, which actually had worse scaling in epsilon.
Okay. So to give you the method for this: this is essentially based on Trotterization. So, well, first of all we want to decompose the Hamiltonian into one-sparse Hamiltonians. The second thing to do is to decompose those Hamiltonians into a sum, with equal weightings, of self-inverse Hamiltonians, and then we can use just a simple Lie-Trotter expansion of the evolution into a product of evolutions under those self-inverse Hamiltonians. Then we have a method, in an earlier paper, for using some control qubits to turn this Trotter decomposition -- and don't worry so much about what this part means, I'm going to explain it in detail -- into essentially discrete steps at a superposition of times, and then we're using a method, which we just have in this recent arXiv paper, for performing efficient measurements on these control qubits. So don't worry too much if you don't understand what these things mean; I'm going to explain them. Okay.
So the first part is the decomposition into -- Yeah?
>>: Which [inaudible]?
>> Dominic Berry: Well, probably I'd say this one, the decomposing into the sum of self-inverse Hamiltonians. People have looked at a few different approaches before, and there's no quick workaround to get the Hamiltonian as a sum of things which you could apply the rest of the recipe to. So this is what makes it all work. Okay?
So just going into the idea -- I think I've probably explained most of this before. We have this matrix representing the Hamiltonian. In this case it has no more than two elements in any column, and what you want to do is decompose it into two -- well, in general you'd have to decompose it into six D squared parts if you are using the general recipe on a quantum computer, but just doing it by eye you can break it into these yellow parts and blue parts, and each different color is corresponding to a Hamiltonian which is one-sparse. So it only has one element in each row or column.
So then, well, you can just use the scheme from the old paper there. Then the next
step is to break these one sparse Hamiltonians into self-inverse Hamiltonians. So
this is the trick that makes it all work.
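A rough sketch of that kind of splitting (my own illustration, with a simple greedy edge coloring standing in for the actual scheme in the paper): off-diagonal entries are edges between rows, and any coloring in which no two edges share a row or column makes each color class one-sparse.

```python
# Split a sparse Hermitian matrix into 1-sparse pieces: the diagonal is
# already 1-sparse, and each off-diagonal entry (an "edge" between rows
# i and j) gets a color not yet used at i or j, so every color class
# has at most one entry per row and column.
import numpy as np

def one_sparse_parts(H):
    n = H.shape[0]
    parts = [np.diag(np.diag(H))]             # the diagonal piece
    used = {}                                  # vertex -> colors used there
    colored = {}                               # color -> list of edges
    for i in range(n):
        for j in range(i + 1, n):
            if H[i, j] != 0:
                c = 0
                while c in used.get(i, set()) | used.get(j, set()):
                    c += 1
                used.setdefault(i, set()).add(c)
                used.setdefault(j, set()).add(c)
                colored.setdefault(c, []).append((i, j))
    for edges in colored.values():
        P = np.zeros_like(H)
        for i, j in edges:
            P[i, j], P[j, i] = H[i, j], H[j, i]   # keep Hermiticity
        parts.append(P)
    return parts

H = np.array([[0, 1, 0, 2], [1, 0, 3, 0],
              [0, 3, 0, 4], [2, 0, 4, 0]], dtype=complex)
parts = one_sparse_parts(H)
assert np.allclose(sum(parts), H)
# each part has at most one non-zero per row:
print([int(np.count_nonzero(p, axis=1).max()) for p in parts])
```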
>>: [inaudible]
>> Dominic Berry: Well, I've worked out a special example to give in a presentation.
[laughter]
Okay. So this is just for illustration. As I'll show you, in general you'd have to do it into six D squared parts. Yeah. Okay. So now I've grayed out what were the yellow ones here in this slide, and then I'm thinking of breaking these colored ones up further. So this one, which is a very light red, is what I'm calling the X component, and that is essentially things that are off the diagonal and real. And if you think of like a Pauli X matrix, then that's like ones on the cross diagonal. So essentially if you're looking at a two-dimensional subspace, which is composed of the points on those row and column numbers, then this is proportional to an X.
And similarly with the green ones here: that is things which are off the diagonal and imaginary. So that is essentially things which are proportional to sigma Y on that subspace. And similarly the Z is just things on the diagonal. So the general idea is that we then are going to want to say that, well, if we just looked at this one on its own, then we could say that that's like a minus root three times an X, and the X is self-inverse. But the problem is that we're not going to get things in the different two-dimensional subspaces which all have the same multiplying factor.
So if you look at this, for example, you have a root two times i here and an i here, and if they were both i then you could just have something self-inverse, but that root two makes it not work.
So this is just summarizing what I've just said, so I think I'll just skip to the next slide. So now what I'm showing is just these two elements, which are the X ones. Now what we want to do is decompose these into components of the same magnitude. So in practice this epsilon_H would ideally be something very small, like 10 to the minus 10 or something, so that you can accurately approximate the entries. But here I'm just taking it to be one quarter for illustration.
So to get accuracy one quarter you break it up into things of magnitude two times one quarter, or one half. So minus root three you could approximate as a minus a half, minus a half, minus a half, and then, if you're thinking of allowing things up to magnitude two, just put in a plus zero. So then you're thinking of having this part going into one Hamiltonian, this part going into another Hamiltonian, etc. So you're breaking it up further into Hamiltonians, each with equal magnitude.
So here I'm just thinking of, say, taking the second component. So then we're just looking at a minus one-half in these entries, and you just ignore everything else, which is grayed out. So then if you are just looking at these, then you have a bit of a problem, because you have things which are minus a half, and you also in general have things which are plus a half in there as well. But you'd also have zeros, and you can't have zeros in a self-inverse matrix. So what we want to do is to break it up further and say: whenever it's plus a half we say that it's plus a quarter plus a quarter, if it's minus a half it's a minus a quarter minus a quarter, and for a zero we just match a plus a quarter with a minus a quarter.
So then we're going down to two possible values, so you have something proportional to a plus or minus one, which is what we need for a self-inverse matrix. So in this example I'm taking the first component and saying we're going to look at the one which is minus a quarter for these, and then we have a whole lot of rows and columns with just zero in them. So we can just fill in with one quarters on the diagonal, and so then I'm indicating here with the subscripts which Hamiltonians we are breaking things into, and I'm putting a plus there to take the first component. And for the minus component you'd take that minus a quarter on the diagonal.
So then if you take that -- remember that one quarter was the epsilon_H -- if you take that out, you have the epsilon_H times a one-sparse self-inverse matrix. And so, since that epsilon_H is small, you've broken up your Hamiltonian to a good accuracy into just a sum of one-sparse self-inverse matrices.
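Here is that construction in miniature (values following the blackboard example, code mine): keep the plus or minus epsilon_H off-diagonal entries, pad the empty rows and columns with plus or minus epsilon_H on the diagonal so no zeros remain, and the result divided by epsilon_H squares to the identity.

```python
# Turn a 1-sparse piece with entries of magnitude eps into eps times a
# self-inverse matrix by filling empty rows/columns on the diagonal.
import numpy as np

eps = 0.25
G = np.zeros((4, 4))
G[0, 1] = G[1, 0] = -eps          # the X-type (real, off-diagonal) entries

filled = G.copy()
for i in range(4):
    if not filled[i].any():
        filled[i, i] = eps        # pad empty rows/columns on the diagonal

U = filled / eps                  # should now be self-inverse: U @ U = I
assert np.allclose(U @ U, np.eye(4))
print(U)
```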
>>: So in general do you need log precision matrices here or something?
>> Dominic Berry: Well, just --
>>: Just three [inaudible] or four?
>> Dominic Berry: No, no, no. This epsilon_H is going to be something tiny. Yeah. So you're breaking it up into an enormous number of self-inverse matrices, which totally --
>>: How big is the enormous? Is it log or [inaudible]?
>> Dominic Berry: No, it's not log in one over epsilon. It's going [inaudible], it's going into the compression. Yeah.
>>: Linear. I mean you can't do any kind of floating point scheme? You see what I'm saying?
>> Dominic Berry: Yeah. So it turns out that the compression scheme doesn't really work with -- well, it wouldn't really be floating point anyway. If it was a binary expansion then you would need things which were self-inverse but with different weightings, and then we have a compression scheme which doesn't really work in that scenario.
And one of the things I think is --
>>: The weights would be all over the place.
>> Dominic Berry: Sorry?
>>: In the case I was proposing. It would be all over the place.
>> Dominic Berry: Yeah, yeah. Okay. So now getting to the Trotter expansion. So now for illustrative purposes I'm just assuming we've broken this up into a sum of two Hamiltonians, which are then going to be self-inverse, or proportional to self-inverse. So in general we're going to have a time-ordered evolution for expressing the evolution of our quantum state. And this T, script T, is just indicating the time-ordering operator, which you don't really have to worry about. But you can think of that as breaking up the time into a large number of small time intervals, where you have evolution under the one Hamiltonian for that short time at one time t_j, and one time evolution under the other Hamiltonian, and then in the Trotterization you are basically taking a finite value of R.
So then the error is scaling like -- so in the time-dependent case this is going to be a maximum of the norm of H and the norm of H prime. It's going to be a maximum of that, times the T squared over R. So we would think of doing this first of all on our decomposition into one-sparse Hamiltonians, and then have a further decomposition into the self-inverse ones.
So the idea of bounding the complexity then is: the complexity is basically proportional to this R, so if you want to have error no larger than epsilon, you just swap around the epsilon and the R and have the R proportional to the lambda T squared over epsilon.
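To make the arithmetic concrete, here is that bound restated (my paraphrase; the symbol Lambda for the maximum of the norms is mine):

```latex
% With \Lambda = \max(\|H\|, \|H'\|), the Trotter error over R segments is
\text{error} = O\!\left( \frac{\Lambda\, T^2}{R} \right),
\qquad\text{so error} \le \epsilon \;\Rightarrow\;
R = O\!\left( \frac{\Lambda\, T^2}{\epsilon} \right).
```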
So this is what gives you the one over epsilon scaling for standard Trotterization, and also a T squared scaling. So now the next trick is that when we're looking at things which are self-inverse -- if it's, say, epsilon_H times U_1, then this evolution is cos theta times the identity, minus i U_1 times sine theta.
So the thing is that when this U_1 is self-inverse you always get an expression like this, and you don't get higher powers of U_1, because U_1 squared just gets you back to the identity.
And this thing you can implement probabilistically using a control qubit. So what you want to do is set a control qubit in a superposition of zero and one, then have a control on that to do your U_1 on your target state, and then, if you have a projective measurement which is essentially projecting onto that bra, you're going to get this final state with that evolution under that operator. And if you're looking at having a delta t which is small, then this theta is actually epsilon_H times the delta t, and we're wanting to have epsilon_H really small, which means that this state is very close to zero, this state is very close to zero, and we have an overall failure probability which is just going like epsilon_H delta t.
So if we just have one of these then it almost always works. So to turn things into a form which I'm going to be using for the larger scheme: instead of starting with this, you can think of starting with a zero and then having a rotation which is rotating it, putting those real parts on with the rotation matrix, and then have a P which is just putting the phase part on, so that we get this state here.
And similarly you can have a rotation and then the final detection of the zero state.
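A statevector sketch of that probabilistic step (my formulation as a linear-combination-of-unitaries circuit, consistent with the rotate-control-measure description above; variable names mine): the printed failure probability comes out around 2 theta, matching the epsilon_H delta t scaling just described.

```python
# Implement exp(-i*theta*U) = cos(theta) I - i sin(theta) U for
# self-inverse U with one control qubit and postselection on 0.
import numpy as np
from scipy.linalg import expm

theta = 0.01                                  # plays the role of eps_H * dt
U = np.array([[0, 1], [1, 0]], dtype=complex) # self-inverse (here Pauli X)
psi = np.array([1, 1j], dtype=complex) / np.sqrt(2)

N = np.cos(theta) + np.sin(theta)
V = np.array([[np.sqrt(np.cos(theta) / N), -np.sqrt(np.sin(theta) / N)],
              [np.sqrt(np.sin(theta) / N),  np.sqrt(np.cos(theta) / N)]])

state = np.kron(V @ np.array([1, 0]), psi)    # rotate ancilla, attach target
ctrl = np.kron(np.diag([1, 0]), np.eye(2)) + np.kron(np.diag([0, 1]), -1j * U)
state = np.kron(V.conj().T, np.eye(2)) @ (ctrl @ state)

success = state[:2]                           # ancilla-0 block, unnormalized
p0 = np.vdot(success, success).real
print("failure prob:", 1 - p0)                # ~ 2*theta for small theta
print(np.allclose(success / np.sqrt(p0), expm(-1j * theta * U) @ psi))
```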
>>: [inaudible]
>> Dominic Berry: Yeah, yeah. So essentially it doesn't actually give these with one control qubit, it gives things sort of in a more -- actually, I'm not sure the figures were exactly in that form, but it does explain things in detail.
>>: So you measure a zero or one? Which is [inaudible]?
>> Dominic Berry: Sorry? Oh. It measures a superposition of zero and one. So to think of it in terms of measuring in the zero-one basis, you have to put that rotation on it first.
>>: You're saying it's forced, you will get that there every time?
>> Dominic Berry: Yeah, with the [inaudible] you get that measurement result, but we have to really think of what's also going to happen if that fails, and how to go about correcting it if that fails.
>>: What is the probability that you get a one?
>> Dominic Berry: That is going like epsilon_H delta t. The idea is that we're making both of these small, and we have an overall probability of success which is reasonably good.
>>: And if you get one you just throw it out?
>> Dominic Berry: No, no, because we have to do a sequence of things, which means we have to do a simulation which is basically reversing the thing, and if we have a failure we actually get something like a square root of U there. And that's something which we can correct deterministically. So we can -- it's actually quite a complicated procedure, because you have to go back through the entire thing and correct it, and then you have a probability of a quarter of your correction failing, in which case you have to correct again, but overall it's like a random walk, except you have a very strong weighting towards going to success.
>>: [inaudible]
>> Dominic Berry: Yeah. Yeah, that's another very good point: the self-inverse property is important, and we've actually been looking at sort of other scenarios which could potentially make things nicer in other ways, but then we don't really know the correction operations [inaudible]. So that's why we go with this approach.
Okay. So now, as I was saying before, we're looking at an overall total time T that we're wanting to simulate over, and we would be thinking of using a Trotter decomposition over that entire time, but then breaking up the time into intervals of length one quarter -- it's one quarter in the original paper, but it's going to be something slightly smaller in our scenario. And then we call N the number of Trotter intervals which we are using in this time period of length one quarter. So then, when we do that, the probability of success for using that controlled operation multiple times is going to be at least three quarters, which means that we have a good probability of success, and if we don't have a success we have a good probability of correction.
Okay. So now this is illustrating what happens if we have many controlled operations. So we're thinking of each of these being basically what I was showing before, except I've thrown out the Ps for simplicity. So in a Trotterization you would have a U2 U1 U2 U1 etc., all interlinked together, and then we have each of these controlled on different qubits. So then if we look at what's happening at the beginning here, these are all very tiny rotations.
So this was a rotation like this, where the alpha is approximately equal to one and the beta is like one over root M, and the M was the number of these things, which is 16 in this diagram. So then, if you're looking at this state here, it's like this tensored M times, and this is like a superposition, and the weighting is essentially going down exponentially in -- so this weight of x which I've illustrated here: this x is a bit string, and the weight of x is the number of ones in this bit string.
So essentially the probability here is going down exponentially in the number of ones. So typically you're getting strings like this, where you have lots and lots of zeros and only a few ones. And so you can think of what the positions of each of those ones are. Now, what I'm illustrating here is actually a multiple-controlled operation on all of these qubits, and it's going and looking at all of those qubits and looking at where the first one is. And if you remember, before I was alternating U2 U1 U2 U1 etc., so what that's telling you is that if the first one is in an even position you should be doing U1, and if it's an odd position you should be doing a U2.
So what these controlled operations are doing is essentially just looking at the parity of the position of the one in the control qubits and doing a U2 or a U1 based on that. And --
>>: So what's with the big oval [inaudible]?
>> Dominic Berry: So then there's also a third alternative, which is that it's run off the end and there are no more ones in that bit string in the superposition. In that case it just does the identity.
Okay? So as I was saying before, you have the strongest weight on having the smallest number of ones. So what I've indicated here does the whole thing, except these ones out here are almost never going to get done. So what you can do is identify these later ones, which are never or almost never going to be done, and [inaudible], and the overall circuit is still going to be accurate within the error that you need. The K I'm going to use here is the number of these that are retained; provided that you choose that appropriately with epsilon, you still have the overall error and failure probability of the procedure bounded.
So now, just rearranging things to make it a little bit neater for the following slides: what we're doing is replacing those long forms of the control qubits with the compressed form. So what we're doing is we're having different registers which are just storing the positions of the ones. So we'd have a P1 up to P-whatever the number of ones in the string is, and then, since we have a fixed number of registers, we'd just have to pad out with something; that could just be an M, to pick something different.
I see people pulling terrible faces there. [laughter]
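The compressed encoding itself is easy to picture classically (a toy version of mine; the sentinel choice is arbitrary):

```python
# Instead of M qubits mostly in |0>, keep a fixed number of registers
# holding the positions of the ones, padded with a sentinel value.
M, K = 16, 4                       # string length, registers retained
PAD = M                            # sentinel for "no more ones"

def compress(bits):
    pos = [i for i, b in enumerate(bits) if b]
    assert len(pos) <= K           # strings with more ones are discarded
    return pos + [PAD] * (K - len(pos))

def decompress(pos):
    bits = [0] * M
    for p in pos:
        if p != PAD:
            bits[p] = 1
    return bits

x = [0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
print(compress(x))                 # -> [2, 9, 16, 16]
assert decompress(compress(x)) == x
```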
So it's a very simple compression technique, but then when you get to do the measurement it gets a little bit more difficult. So essentially what we have is -- so this is indicating the compressed form of our control qubits. So these three up here are the position of the first one. This is the position of the second one. This is the position of the third one. So actually what these -- yeah, this is encoded. So this control is actually only being controlled on these. This one is only being controlled on these. And this one is only being controlled on these. So first of all there's the question of how to do this preparation here, and second is how to do the final operation.
So if you remember, before we had all of these Rs on the R state. So to do the V on this we can use something like this. So essentially what we have is one qubit, which is a control qubit, which we have a rotation on. And then we have a whole lot of controlled rotations on the remaining qubits, which can give us almost exactly the state we want, except we have some extra clean-up steps to fix that padding.
So if you remember these padding qubits here: we need some extra clean-up steps to fix those, but otherwise it's really straightforward. And you also need to -- so this is going to give you relative positions between the ones, and you also need to do some operations to get absolute positions of the ones, which is what you need for these controls.
>>: So what's in the [inaudible]? Is this all V here?
>> Dominic Berry: What this is indicating is all V, so that we -- so we use this V to get absolute positions of the ones.
>>: So what are the ovals there, the tall ovals? What do the tall ovals have to do with anything?
>> Dominic Berry: Right. So this one is actually just controlled on the top three, this one is only controlled on the --
>>: So what it says is if all three of them were one then --
>> Dominic Berry: No, no, no, no, no. These qubits here are encoding a position. So it's just that there are three qubits encoding the position where a one falls in the previous thing.
>>: Yeah, I understand, but I mean if that thing has one [inaudible] coming out of it to [inaudible], right, then you have to attach the identity with -- I mean, that U_I, the I has to line up with the I on that first set of three U_Is. Right? So as I varies, that unitary [inaudible] the I will change?
>> Dominic Berry: Yeah, yeah. That’s right.
>>: So what you're saying is that the three I values -- so I don't understand how U_I varies with I, I guess is what I'm saying.
>> Dominic Berry: Oh. Well, it's just looking to see whether this one is odd or even for this controlled operation. This one is changing with this one's operation, and so forth. I guess when you've already got two there you'd only be needing to -- well, I guess you're needing to do two things. You need to check whether it's odd or even, or whether it's run past the end [inaudible].
So other than that it's just a normal controlled operation, which could be relatively easily --
>>: Yes. It sounds to me like the question was: what allowed you to neglect the rest?
>> Dominic Berry: To neglect what? Sorry.
>>: You lost me anyway on exactly how the controls are working and why you can
eliminate all the [inaudible].
[inaudible]
>> Dominic Berry: This one is controlled on the position of the first one, but if there are no ones at all then you just do nothing. This one is controlling on the position of the second one, but if there's no more than one one you don't do anything with it. And so if you go out to here, to this one: if it's all ones, that happens with a ridiculously small probability, so basically all of the ones you're throwing away are only happening with very, very tiny probability.
>>: Okay. So these control functions are quite complicated.
>>: Yeah, relatively complicated, but they get tamed by the compression we're working in, because statistically there are only so many ones.
>> Dominic Berry: Yes. Yes.
>>: [inaudible] So K you can ignore.
>> Dominic Berry: Yeah. So K is selected so that the error from discarding those strings is very small anyway.
Okay. Then we were talking about how to make that. So the details of that aren't so important, because what we're going to look at is how to do this measurement, which is kind of tricky, because ideally what we're wanting to do is take these compressed qubits, decompress them, and do a whole lot of these R rotations, and, as I was saying before, we want to have measurement results of zero on everything, but we've got a probability of a quarter of having some measurement results of one. So if we do have a failure we need to work out where that failure is, or we can't correct it. So we have to be able to work out where the positions of the ones are in that measurement in the uncompressed basis.
But typically there's only going to be a small number of ones. So when you get this measurement result it's actually something which you could compress back down again. So ideally what we'd want to be doing is to start with a compressed basis, uncompress, do the Rs, do the measurement, then compress back down again. But we're going from a compressed basis to a compressed basis, so surely we have some way of going from one to the other without expanding out to an inefficient number of qubits.
So --
>>: It's very large. The expansion could be really big.
>> Dominic Berry: Yeah. Yeah. Because what we're eventually doing is taking our Trotterization, which is putting like a factor of epsilon in there from the Trotterization, then we're getting another factor of epsilon because we're breaking up each of the individual Hamiltonians into these tiny parts, which are going to be proportional to self-inverse. So then that number of things is going to be like the number of control qubits we have, which is going to be huge!
So we don't want to be doing things in that uncompressed basis. Okay. So the reason why I was going back to this slide is because we're thinking of having things in a compressed form of zero originally, and then doing a V operation, which is taking us to the state we want, and then, if we had the state we wanted here and inverted this operation, we'd get all zeros, and measuring would be getting us all zeros. But because we have these control operations here we've actually changed the state, because it's now entangled with this thing. So if we were actually lucky and got the all-zeros measurement, we would have performed the correct measurement that we wanted.
So we do know how to do a measurement that succeeds, but, as I was saying, we don't always have a measurement that succeeds. If it doesn't succeed we need to know where it's failed. So to explain this, first of all what we can look at is doing a partially non-destructive measurement. So instead of just measuring in the compressed basis what all these individual bit values are, we can say we're going to measure whether it's all zeros or something orthogonal to all zeros, so something in this plane here.
And you can think of doing it with this type of controlled operation here. So this is just some extra ancilla in a state one: if all of these things are all zeros, then that flips this to a zero and you have a measurement of zero here, whereas if anything is a one here -- oh sorry, no, no, that's right -- so then, if anything is a one here, this doesn't get flipped and it comes out as a one. Yeah.
So now in the following diagrams it's going to be shown by this oblong here, and so we have a recursive measurement which we want to do, which looks something like this. So we're essentially taking the entire thing and doing this projective measurement between all zeros and everything orthogonal here, and if you find it's not all zeros, you then divide it into two and you keep on doing it like this.
>>: [inaudible]
>> Dominic Berry: Yeah, yeah. So if it does succeed you don't go any further.
>>: The bottom it just controls --
>> Dominic Berry: So the way it's happening here is: it's got a one here and a one here, so at this stage you've found at least one one; then here you're finding that you have at least one one in this half, which ends up being this one; and in this one you find at least one in this half, which turns out to be this one. And then at this stage this one finds all zeros. So these are all grayed out because you don't actually need to do any further operations.
And similarly there. And in this way you end up finding all your ones without having to go out of the fully compressed basis. And what you find is that the complexity of this -- the number of measurement steps that you need if you're actually finding K ones out here -- is going like K times log M. So essentially log M is the number of times you have to divide in two, and you have the K for having to find that one and having to find that one.
So essentially it's still efficient, because you don't have anything that's going linearly in M there. So how am I going for time?
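Classically, that recursive measurement looks like bisection with an all-zeros test as the only primitive (sketch mine):

```python
# Locate every one in a bit string using only "is this block all zeros?"
# queries; bisection needs about K*log2(M) queries instead of M.
queries = 0

def find_ones(bits, offset=0):
    global queries
    queries += 1                       # one all-zeros-or-not measurement
    if not any(bits):
        return []
    if len(bits) == 1:
        return [offset]
    mid = len(bits) // 2
    return find_ones(bits[:mid], offset) + find_ones(bits[mid:], offset + mid)

x = [0] * 16
x[3] = x[12] = 1
print(find_ones(x), queries)           # positions of the ones, ~ K log M queries
```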
So I have a bit more explanation of how this goes, which perhaps isn't so illuminating. So we have to be thinking about having to actually do things not just in that original basis: we have to do that operation which is inverting that preparation, and then redo it after the measurement, and then we have to break things up --
Then we have to have an operation which is breaking up the compression into two halves. But I'll just skip over that. Just bringing all of these details together, what we end up doing is dividing --
>>: So after about K steps you make the correction?
>> Dominic Berry: Well, if you're thinking of what happens after all those measurements, all you've done is located the values and the positions of the errors, and then you have to go back and make a correction. But the correction is basically as difficult as doing the simulation in the first place. You've got to run it backwards and put corrections for those individual errors into the forward simulation.
>>: And how many [inaudible]?
>> Dominic Berry: It goes according to the laws of probability for random walks. So you have an exponentially small probability of needing more than some fixed number of iterations in the walk, so when you go to fully quantify the complexity you have to look at sort of how many errors it's having to correct, how many steps it's having to go backwards to actually do the full correction, and the length of the walk [inaudible]. But it all turns out fine, since it's so heavily biased towards success [inaudible].
Okay. So this is where we bring all these parts together. So we had first of all a division of the Hamiltonian into this number of parts. So we originally had a D squared, or a 6 D squared, for the original decomposition; let me just throw away the six for the order scaling. And then we have a norm of H on epsilon_H, to get accuracy epsilon_H for this approximation. And that is just because -- I actually put a max-norm of H in here in the paper, because essentially you're just looking at the maximum absolute value of an entry in the matrix -- you might need to break that up into order max-norm-of-H parts to get down to order one, and then another one on epsilon_H parts to get down to size epsilon_H.
So then, if you remember, before I said you are using time segments of length one quarter, and the idea of the time segments in the original 2009 paper was to get a probability of success of three quarters. Now, we actually need to subdivide it into parts of length proportional to one on M times epsilon_H, and we have on the order of this number of parts.
Now, it might seem nasty that it has a one on epsilon_H there, but it comes in as the product M times epsilon_H, and M was divided by epsilon_H in the first place. So that actually gets rid of a factor of epsilon_H, which is something that we need. So we have this number of segments, and essentially what the scaling is coming from is that we have this number of segments, and each of these segments we can do with complexity that's essentially polylog in all the other things. So then, when we look at the overall complexity, we get this thing coming down here times the polylog factor, and this polylog factor is coming from that compression, and then just multiplying this M times the epsilon_H gives us a D squared norm of H, which is coming in here. And then the time just comes down there.
So that's where our overall scaling for the algorithm comes from. Okay.
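Putting those counts together in symbols (my paraphrase of the slide, with ||H||_max for the largest entry in absolute value):

```latex
% Number of self-inverse pieces, and the segment length:
M = O\!\left( \frac{D^2\, \|H\|_{\max}}{\epsilon_H} \right),
\qquad \text{segment length} \propto \frac{1}{M\,\epsilon_H},
% so the number of segments over total time T, each costing polylog, is
\frac{T}{\text{segment length}} = O\!\left( T\, M\, \epsilon_H \right)
  = O\!\left( D^2\, \|H\|_{\max}\, T \right).
```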
So that was part one, for that simulation. This is part two, which is just as easy. Okay, so this is using quantum walks for simulation which is strictly linear in the evolution time. So there are three ingredients to this. One of these is what's called a Szegedy quantum walk, so I'll need to explain a bit about what quantum walks are first, I think; then we use a coherent phase estimation, which is a bit like a Kitaev-style phase estimation; and then we also need to use a controlled state preparation.
So the general idea is that this quantum walk is going to be operating on the state in some expanded space, and the quantum walk is going to have eigenvalues and eigenvectors which are related to the eigenvalues and eigenvectors of the Hamiltonian. So there's a general procedure where, if you have one operation which you can do, and another operation which you want to do, and the eigenvalues and eigenvectors are related in a nice way, you can do phase estimation to work out what the eigenvalues are and then artificially put on the eigenvalues that you want for the operation that you wanted.
>>: So a quantum walk is a unitary operation?
>> Dominic Berry: Yeah, yeah. So I have the slides explaining this in more detail. So in the normal quantum walk you have what you call a position subsystem and a coin subsystem. So essentially the coin is just a qubit, and the position is like the integers. So if you think of what a walk is doing: you actually have a coin flip operator, which would be like a Hadamard, and then essentially, if you're starting on a plus or minus one, you're getting a plus or minus in this thing, and that coin operator isn't affecting the X.
And then the step operator is looking at the value of the coin, and it's doing a plus one or a minus one on the position depending on the value of the coin. So if you think of what that's doing: you can start in the zero position, and if you just had a classical walk it would randomly do a step to the left or the right, and you get essentially a Gaussian distribution approximately, and the width of that is going like the square root of the number of steps.
But when you do a quantum walk, you actually have amplitudes for going out to a distance which is linear in the number of steps, and sort of this was an original motivation that people had for thinking that quantum walks might give you computational speedups.
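A small numpy sketch of that coined walk (mine, with a Hadamard coin assumed): the quantum spread grows linearly in the number of steps, versus square root for the classical walk.

```python
# Coined quantum walk on a line: Hadamard coin flip, then a
# coin-conditioned step left or right.
import numpy as np

steps = 100
n = 2 * steps + 1                            # positions -steps..steps
amp = np.zeros((n, 2), dtype=complex)        # (position, coin)
amp[steps, 0] = 1.0                          # start at the origin

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2) # the coin flip (Hadamard)
for _ in range(steps):
    amp = amp @ H.T                          # flip the coin everywhere
    new = np.zeros_like(amp)
    new[:-1, 0] = amp[1:, 0]                 # coin 0: step left
    new[1:, 1] = amp[:-1, 1]                 # coin 1: step right
    amp = new

prob = (np.abs(amp) ** 2).sum(axis=1)
x = np.arange(-steps, steps + 1)
print("quantum spread:", np.sqrt((prob * x**2).sum()))   # grows linearly in steps
print("classical spread:", np.sqrt(steps))               # grows like sqrt(steps)
```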
And there was some very clever work with sort of doing quantum walks on more complicated graphs rather than just on lines, which gives interesting speedups, but I won't go into that. What we're using is what's called a Szegedy quantum walk, which uses more general operations, which we call controlled diffusion operations.
So to explain what these are doing -- these are a rather strange thing, and I'm not sure if you're familiar with this type of thing. It is essentially getting you a reflection, and these are controlled reflections. The first is controlled by the first register and acts on the second register -- so before we had a position and a coin register, now we have two registers which are basically symmetric between each other -- and the second one is essentially the opposite of this: it's controlled on the second register and acts on the first register.
So then what these are doing: for the C you take some general matrix C_j, and you have these states, which are expressed in terms of the square roots of the entries of that matrix, and then what this C is, is a projector onto i on the first system and a state c_i on the second system. So it's like a controlled state preparation. So then, what happens when you construct this thing is that essentially this operation is looking at the value of i in the first register, and based on that register it reflects around c_i in the second register.
Okay. So then we have a similar definition for the R, and the quantum walk is just alternately doing these two steps. So what we want to do is look at the eigenvalues and eigenvectors of this step operator, and it turns out that those eigenvalues and eigenvectors are related to a matrix which you can form from the C matrix and the R matrix.
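Definitionally, one of those controlled reflections looks like this (a numerical sketch of mine; I normalize rows of C directly rather than writing out the square-root amplitudes):

```python
# R = sum_i |i><i| (x) (2|c_i><c_i| - I): controlled on basis state |i>
# of the first register, reflect the second register about |c_i>.
import numpy as np

def controlled_reflection(C):
    n = C.shape[0]
    R = np.zeros((n * n, n * n), dtype=complex)
    for i in range(n):
        c = C[i] / np.linalg.norm(C[i])
        block = 2 * np.outer(c, c.conj()) - np.eye(n)
        # block-diagonal: the first register selects which reflection acts
        R[i * n:(i + 1) * n, i * n:(i + 1) * n] = block
    return R

C = np.array([[1.0, 1.0], [1.0, -1.0]])
R = controlled_reflection(C)
assert np.allclose(R @ R, np.eye(4))   # a reflection squares to the identity
```

The walk step is then this operator alternated with its mirror image acting on the first register.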
So the idea now is to use a symmetric system, so the dimensions of the two things are the same, and these Cs and Rs are just given by the complex conjugates of the entries of the Hamiltonian. Now, the eigenvalues and eigenvectors of the walk are going to be related to those of the Hamiltonian -- I don't think I've given the formula, just because it's not very illustrative -- but the thing which is fairly important for this is that we need to modify this state preparation.
So you remember before we had this state preparation to this state. So remember this was going to be the complex conjugate of H_ij, so if we just did the same thing we would just be having this part, but that wouldn't actually be a normalized state with this Hamiltonian. So you can think of putting this multiplying factor in, and if you just --
>>: [inaudible]
>> Dominic Berry: Sorry? So this -- if you just ignore this delta for the moment -- if you have a one on -- actually, yeah, I think you'd actually have to have a one on root sigma_i there to make sure that that thing is normalized. But then the sigma_is are dependent on the value of i, which would mean that you would have to have an i-dependent thing out here, which is not what you want. You want to have just a constant here, so to make sure that the entire thing is normalized you have to have this extra part out here.
And then you have an overall superposition which is just controlled by those matrix elements of the Hamiltonian. And there's a couple of difficulties here. One is that the sigma_i is not something which we know how to calculate -- and, well, that's the main difficulty -- and another thing that you'll notice here is I've put in a delta there rather than a one, and the reason is that we can make this delta small. If we make this delta small, then this thing becomes small and there's mostly weight on a single state, and this means that when we do a step of the quantum walk it's actually very close to the identity. And this might seem like a counterproductive thing to do, but it turns out to make things quite nice.
So we call this the laziness parameter. So, as I was saying, this thing is unknown, which is a problem. And so the three-step process we have for this Szegedy quantum walk approach -- this is due to Andrew Childs back in 2009 -- is to start with the state in one of the subsystems, so this is the state we want to simulate the Hamiltonian on, and perform controlled state preparations so that we have a joint state across both registers. Now, we have this Szegedy quantum walk which we can then do over these two registers, and that approximates the Hamiltonian evolution, and then at the end you can invert the controlled state preparation, which gives you an approximation of the desired final state in just one of the registers.
And the idea in step two is that you can do it with that small laziness parameter and
you can essentially just do things as in that procedure, but you can also use this
phase estimation approach, which I was mentioning before.
>>: It’s more like eagerness parameter. [inaudible]
>> Dominic Berry: What parameter?
>>: Eagerness.
>> Dominic Berry: Eagerness?
>>: Because when it’s small it’s lazy.
>> Dominic Berry: [inaudible]. I seem to remember a draft where we called it an
unlaziness parameter or something. Okay. So --
>>: [inaudible]
>> Dominic Berry: That's because Andrew was using the previous walk, which was defined by Szegedy, but Szegedy didn't use it for Hamiltonian simulation. Andrew --
>>: [inaudible] everything's equivalent going to triple walks. So people built [inaudible] stuff on top of that for his other things.
>> Dominic Berry: So now this is explaining how this simulation by phase estimation works. So this is a very general procedure where you can implement a unitary V, which in this case would be the step of the walk, but you want to implement a different operation O. And in our case this O is still a unitary, but you can also think of doing this for non-unitary things, which is what, for example, the Harrow, Hassidim and Lloyd linear systems work does. So now the idea is: if these operations share eigenstates, but the eigenvalues are related by a function, then this is something very useful.
So here you get an e to the i lambda, where lambda is real, for eigenvalues of this V, and then you want to put in an f of lambda here. Now, you can implement this unitary V many times and perform phase estimation in a coherent way, just by using a Kitaev style of estimation, and then what that gives you is: this side then has an extra register where you put an estimate of this lambda, which you are using to index these eigenvalues and eigenstates.
So then what you can -- and these lambda components are just the individual weightings in the original state. And then your f of lambda is something which can be imposed, and if this is a unitary, then the f of lambda is just a phase factor, which you can do deterministically, and you can do non-unitary things if you are allowing yourself some probability of error as well.
So you have a little bit of difficulty, because this isn't necessarily an exact estimate, so you have some error due to that, but ignoring that, all you're getting is this f of lambda factor in the superposition. So then, when you invert the phase estimation, you wipe out that extra register and just get this, and this is what you'd want if you were just performing the operation O on your initial state.
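Numerically, that eigenvalue surgery looks like this (a cartoon of mine; the particular function f here is an arbitrary example):

```python
# V and the target O share eigenvectors; estimate each eigenphase lambda
# of V and impose the eigenvalue f(lambda) you actually want.
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
Q, _ = np.linalg.qr(A)                     # a random unitary playing V
evals, vecs = np.linalg.eig(Q)
lams = np.angle(evals)                     # eigenphases lambda of V

f = lambda lam: np.exp(-1j * 2.0 * np.sin(lam))   # an arbitrary choice of f
O = vecs @ np.diag(f(lams)) @ np.linalg.inv(vecs)

# O is unitary with the same eigenvectors as V and eigenvalues f(lambda):
assert np.allclose(O @ O.conj().T, np.eye(4))
```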
Okay? So in our case, what the V is, is a step of the walk, and the O is the evolution for some time. And the idea -- okay, so this is where I give the eigenvalues. So if you have an eigenvalue lambda for the Hamiltonian, then the step of the quantum walk has eigenvalues like this, so you actually get a plus or minus: e to the plus or minus i arcsin of lambda delta. And the operation O would be evolution under the Hamiltonian, so you want to actually put those eigenvalues in there. And it's a little bit more complicated than the case I just explained, because you also have these plus or minuses. So a single eigenvector of the Hamiltonian goes to actually two eigenvectors of the step of the quantum walk, but you can actually work out which one it is, and it doesn't really affect the analysis much.
So we have a range of new results. Some of these are going beyond just Hamiltonian simulation. So what we do is actually combine the lazy quantum walk with phase estimation, which improves the efficiency, and we also have a procedure to achieve the controlled state preparation, and we can then combine that with Trotterization, and that is what gives you the root D scaling.
And another thing we’ve looked at is applying this to unitary operations, which is quite interesting because you can actually apply this to non-sparse cases as well. You can say, if you’re given a dimension-N unitary, then it’s got N times N entries in there, and normally, if you’re doing a decomposition of this thing into gates, you’re going to need complexity at least N squared to do your operation. But if you think in this oracle type of form, where you ask for the value of the unitary at some location and the oracle can give you that -- it’s just representing some calculation you can do -- then you can implement that unitary with complexity root N in most cases, which is kind of surprising.

And that’s the result I mentioned before, where we can only prove root N in typical cases and not in those possibly extreme cases, which give us the problems.
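For concreteness, one standard way to encode a unitary in a Hermitian matrix, which is presumably the kind of embedding meant here (an assumption, since the construction itself is on the slides), is the dilation below; evolving under it for time π/2 applies U up to a known phase:

```python
import numpy as np
from scipy.linalg import expm

N = 4
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N)))

# Hermitian dilation: H = [[0, U^dag], [U, 0]] squares to the identity,
# so exp(-i H pi/2) = -i H, and the |1> block of the output holds U|psi>.
H = np.block([[np.zeros((N, N)), U.conj().T],
              [U, np.zeros((N, N))]])
assert np.allclose(H, H.conj().T)

psi = np.zeros(N, dtype=complex)
psi[0] = 1.0
state = np.concatenate([psi, np.zeros(N)])        # |0> block carries |psi>
out = expm(-0.5j * np.pi * H) @ state
assert np.allclose(out[N:], -1j * (U @ psi))      # -i U|psi> in the |1> block
```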
So I’ve got ten minutes left there, I think, so I won’t explain everything, but I’ll just explain a part of it. So we have steps of the quantum walk giving us eigenvalues like this. So if we’re thinking of mδ as a rotation on the circle, then we have the eigenvalues given to us by the quantum walk going like arcsin(mδ) here and π - arcsin(mδ) here, and if you remember before we had the little bit of a problem between this plus and minus, but because these things are so far apart, and this δ is small, we can very easily distinguish them.

So then, if we were just using the lazy quantum walk, we could basically just say that the arcsine was close to linear and approximate arcsin(mδ) by mδ, which is just approximating this thing by this thing here, which is quite close, and that’s what happens in the lazy quantum walk.
And then the error in the lazy quantum walk is just proportional to the nonlinearity in that arcsine. But if you also use phase estimation, you can get an approximation of that mδ and just use that to work out an approximation of the correction you want there, and because that correction is a small thing in the first place and it has a small error, that gives you quite an improvement. You could alternatively estimate mδ from the phase estimation and then apply the sine to get the mδ, but if you did that you’d get a far worse result than combining the lazy walk and the phase estimation.
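A quick numerical sanity check of that comparison; the values of mδ and of the phase-estimate error are illustrative, and this is a toy model of the phases rather than the actual circuit:

```python
import numpy as np

x = 0.05                  # m*delta: the per-step phase we actually want
theta = np.arcsin(x)      # the phase the walk step actually produces
eta = 1e-3                # error of the phase estimate

# 1) Lazy walk alone: pretend theta equals x; error is the arcsin nonlinearity.
err_lazy = abs(theta - x)                       # ~ x**3 / 6

# 2) Phase estimation alone: estimate theta, then apply sin(theta_est) directly.
theta_est = theta + eta
err_pe = abs(np.sin(theta_est) - x)             # ~ eta

# 3) Combined: keep the walk's theta and use the noisy estimate only for the
#    small correction sin(theta) - theta, so the estimate's error is suppressed.
phase = theta + (np.sin(theta_est) - theta_est)
err_combined = abs(phase - x)                   # ~ eta * theta**2 / 2

print(err_lazy, err_pe, err_combined)           # combined is smallest by far
```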
So the second thing, which I mentioned, is this idea of -- so remember, what we were wanting to do is essentially controlled reflections, and you can do these controlled reflections if you are able to do the state preparation. Now, this is state preparation with an oracle. If you go to a paper back in 2000 by Grover, he looked at this problem, and the idea is that you’re starting with an equal superposition state with some ancilla, and if you rotate the ancilla according to the amplitude of the state, which you can do if you are given an oracle, then you can get an amplitude ψ_k on the zero here.

And if you were able to measure it and get rid of this one component, then you’d have exactly the right weighting that you wanted: the sum of the ψ_k times |k⟩. You can also use amplitude amplification, which gives you a better result. I won’t go into the amplitude amplification because that’s a bit complicated, but I’ll just
compare what we have here. What we have here is like this, and the state we
wanted to prepare was like this. So you remember I was saying we were using a
small value of delta, so this thing is small.
And then this thing is large, and here as well. This thing is going to be large because that’s taking the weight of all the other parts of the state other than the k part. And then this thing essentially corresponds to this thing, and the insight that we use is essentially that we can take this second term, from the Grover preparation, to take the role of the laziness term in the lazy quantum walk, and that actually enables us to get very efficient simulations in terms of the norm of H times t.
And it also means that we don’t need to know this σ_i, because remember before we had this σ_i here, which we can’t efficiently calculate just from an individual entry of the Hamiltonian, but we don’t have to calculate it because we can just use this approach.
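Here is a minimal statevector sketch of that Grover-style preparation, assuming real non-negative target amplitudes ψ_k with a known maximum (all names illustrative); postselecting the ancilla on zero leaves exactly the wanted weightings, and amplitude amplification boosts the success probability instead of measuring:

```python
import numpy as np

psi = np.array([0.1, 0.3, 0.5, 0.8])
psi = psi / np.linalg.norm(psi)      # target amplitudes psi_k (real, non-negative)
N = len(psi)
c = psi.max()                        # known bound so every rotation angle is valid

# Joint state over (k, ancilla): rotating the ancilla by the oracle-supplied
# angle leaves amplitude (1/sqrt(N)) * psi_k / c on |k>|0>, rest on |k>|1>.
a0 = (psi / c) / np.sqrt(N)
a1 = np.sqrt(1.0 / N - a0**2)
assert np.allclose(np.sum(a0**2 + a1**2), 1.0)   # the joint state is normalized

# Postselecting the ancilla on |0> leaves exactly the target weightings...
prepared = a0 / np.linalg.norm(a0)
assert np.allclose(prepared, psi)

# ...with success probability 1/(N c^2); amplitude amplification needs only
# about sqrt(N)*c rounds instead of repeated measurement.
print("success probability:", np.sum(a0**2))
```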
So then what we get is that you’re actually going to be looking at the maximum of these two things. So what you have here is the norm of H times t over a root epsilon, and the root epsilon is coming from that improvement from combining the lazy quantum walk and the phase estimation; otherwise it would be one over epsilon.

And if that term is actually small, or the sparseness parameter is large, then you have to look at this other thing. In this thing the dependence on epsilon has actually disappeared altogether, apart from having to have this be larger than this, and you have a linear scaling in that sparseness parameter.
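In symbols, the scaling being described is roughly the following; this is a reconstruction from the spoken statement, and the constants and precise norm subscripts are on the slides:

```latex
% Reconstruction from the spoken statement; constants and the precise
% norm subscripts are on the slides.
O\!\left( \max\!\left\{ \frac{\|H\|\,t}{\sqrt{\epsilon}},\; d\,\|H\|\,t \right\} \right)
```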
So that is actually getting us strictly linear scaling in the norm of H and t, and improved scaling in the sparseness. That scaling in the sparseness is even better than the scaling in the sparseness for the compressed approach I was explaining first. If you do an alternative approach for non-sparse cases -- and I think I actually seem to have switched from a lower-case d to a capital D there -- you can get an alternative bound, which actually looks like this, which is kind of a weird expression. But this is actually worse scaling in the time, because it’s now to the power of three over two, but you’ve got a root D factor in there.
But the other problem is that you don’t have just all the same norm in here; you have a product of three different norms of H. Now, if these were all just going like the spectral norm, which is this one, then you’d just have the spectral norm of H times t to the power of three over two, and you’d have a nice root D in there. But the problem is that we have a one-norm in here, and one-norms can be big.

And the worst case is, if you had a Hamiltonian with all entries of about the same magnitude, then you would have a one-norm of H which would be going like root D times the norm of H.
>>: Is that H_max the infinity norm of H?
>> Dominic Berry: That’s just taking the maximum absolute value of H. So -- [inaudible]
>>: Absolute [inaudible].
>> Dominic Berry: Okay. So --
>>: That’s what it is.
>> Dominic Berry: Yeah. So --
>>: [inaudible] infinity is a maximum.
>>: No, no, no.
>> Dominic Berry: It’s subscripted max because it’s the maximum absolute value of an entry, not because infinity is a maximum. Anyway, all that really matters is sort of the relations between these things. So typically this H_max is going to be smaller than the spectral norm, but -- just assume that the spectral norm is one for simplicity -- if you have one row that has a one in one entry and all the rest are zeros, then the H_max is going to be equal to the spectral norm. If you have the opposite case, which is that you have all entries of equal magnitude, then the one-norm is going to be going like root D times the spectral norm.
But in that case this H_max would be one over root D. So in that case the product would actually still be proportional to the spectral norm, and if you had the other case, where you had one one and the rest zeros, it would still be proportional to the spectral norm. And in the other case you’d get this scaling here, scaling like root D.
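These norm comparisons are easy to check numerically; a small sketch, with the dimension and the example matrices purely illustrative:

```python
import numpy as np

D = 256
rng = np.random.default_rng(0)

def norms(H):
    spec = np.linalg.norm(H, 2)        # spectral norm (largest singular value)
    hmax = np.abs(H).max()             # H_max: largest absolute entry
    one = np.linalg.norm(H, 1)         # induced 1-norm: max absolute column sum
    return spec, hmax, one

# Case 1: a single one and the rest zeros -- H_max equals the spectral norm.
H1 = np.zeros((D, D))
H1[0, 0] = 1.0
print(norms(H1))                       # (1.0, 1.0, 1.0)

# Case 2: all entries of equal magnitude with random signs (kept symmetric):
# spectral norm ~ 1 while H_max ~ 1/sqrt(D) and the 1-norm ~ sqrt(D)/4,
# i.e. the 1-norm exceeds the spectral norm by a factor of order sqrt(D).
H2 = rng.choice([-1.0, 1.0], size=(D, D))
H2 = (H2 + H2.T) / (4.0 * np.sqrt(D))
print(norms(H2))
```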
But you can also have a worst-case scenario where the one-norm is going like root D times the norm of H, and the max-norm is also just going like the norm of H. And that gives you a D to the 3/4 scaling. So you might ask where the improvement in D comes from. So now we’re just replacing the D with N, because we’re thinking of the non-sparse case, and the first advance is allowing us to get to root N in many cases, but not all. And the cases where this works are where the non-zero elements are of comparable size; so if you have, say, N of these elements all about the same size, then the height of these is all going to be about one over root N. I’ve just taken the spectral norm of H to be one for simplicity.

And in that case you get that scaling, and this thing turns into something with a root N in there, and you have the norm of H times t to the three over two.
>>: [inaudible]
>> Dominic Berry: Well, if you think of, for example, something which is doing a Fourier transform. If you’re encoding a unitary in a Hamiltonian and you have a Fourier transform in the unitary, then it’s just going to be like that, because you have a whole lot of elements of equal size. And if you have something which is one-sparse and it just has ones in the entries, then that’s also going to do the same thing.
So the idea to get a better simulation is to actually break this thing up into components of equal magnitudes. In this case you wouldn’t really need to break it up, but I’m just indicating this to show you what you’d do in the general case. So you have an H_1 piece, which is all the biggest elements, and you have an H_l piece, which is all the smallest elements, and then you would break it up into a whole lot of intermediate chunks as well.

So now the idea is that if you’re breaking it up into Hamiltonians where the non-zero elements are all about the same magnitude, then you get this proportionality working for each of those Hamiltonians on their own. And then you can join these up in a trotterization, and I won’t go into the details -- I think I’ll just skip to the end, actually, because we’re out of time.
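A sketch of the kind of decomposition being described, splitting H into pieces whose non-zero entries have comparable magnitude; the dyadic thresholds and the cutoff are illustrative choices:

```python
import numpy as np

def split_by_magnitude(H, levels=8):
    """Return [H_1, ..., H_levels] with the entries of H_l lying in the band
    (hmax/2**l, hmax/2**(l-1)]; entries below the last threshold are dropped."""
    hmax = np.abs(H).max()
    pieces = []
    for l in range(1, levels + 1):
        lo, hi = hmax / 2**l, hmax / 2**(l - 1)
        mask = (np.abs(H) > lo) & (np.abs(H) <= hi)
        pieces.append(np.where(mask, H, 0.0))
    return pieces

rng = np.random.default_rng(0)
A = rng.normal(size=(8, 8))
H = (A + A.T) / 2                       # a random symmetric test Hamiltonian
parts = split_by_magnitude(H)
# The pieces sum back to H up to the dropped sub-threshold entries.
print(np.allclose(sum(parts), H, atol=np.abs(H).max() / 2**8))
```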
So anyway, with that approach, which I was just explaining, we can essentially get things down to root N in most cases, but there are certain pathological cases where, when you break the Hamiltonian up into components with about equal sizes, you are going to get the one-norm blowing up and becoming much larger.
And when we do simulations we find that that basically never happens, but you can construct very artificial examples where it does happen. Anyway, the conclusion for the first part of the overall talk is essentially that the first approach gives us Hamiltonian simulation with complexity of the form d squared times the norm of H times t, with the rest polylog, which is providing polylog scaling in both the error and the derivative of the Hamiltonian, which is quite nice.
And the quantum walk approach gives us even better scaling in the d, in the norm of H, and the t, though it’s going to give us worse scaling in the epsilon. And you can also have a trade-off to get better scaling in some quantities and worse scaling in the epsilon.
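In symbols, the two headline complexities as stated are roughly the following; the norm subscripts, constants, and exact polylog arguments are on the slides and are not reconstructed here:

```latex
% Hedged summary; norm subscripts, constants, and the exact polylog
% arguments are on the slides.
\text{compressed Trotter:}\quad
O\!\left( d^{2}\,\|H\|\,t \cdot \operatorname{polylog}\!\left(1/\epsilon,\ \|\dot H\|\right) \right),
\qquad
\text{quantum walk:}\quad
O\!\left( \max\!\left\{ \frac{\|H\|\,t}{\sqrt{\epsilon}},\ d\,\|H\|\,t \right\} \right)
```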
So anyway, that’s the end of the talk so thank you.
[applause]
>>: There’s still -- I’ll talk to you later. I don’t understand why you’re penalizing
yourself for the fast decomposition. And that’s why your D squared -- what are you
talking about? I mean --
>> Dominic Berry: Oh, that’s just because you have to break -- Yeah, but you have to
break it up into D squared parts in the first place.
>>: Well, but you could break it up into D parts, which would be more work.
>> Dominic Berry: Yeah, but that would require actually going through by hand, and we don’t have -- there isn’t actually an algorithm to do it -- [inaudible]
>>: Should you have to count the complexity with the decomposition? That’s my
question.
>> Dominic Berry: You would.
>>: If I just give you the Hamiltonian in that form, or I do it classically, I mean it’s a
classical algorithm.
>> Dominic Berry: Okay, so -- to break it up you sort of have to look at a whole lot of elements of the Hamiltonian, and that’s putting a multiplying factor on your complexity already.
>>: If you kill it, yeah. If I gave you the Hamiltonian -- [inaudible]
>> Dominic Berry: [inaudible] Hamiltonian already in the form of a sum of D parts --
>>: Suppose I gave it to you in one-sparse form.
>> Dominic Berry: Yeah, if it was in a one-sparse form. But we don’t want to have to rely on one-sparseness in general.
>>: Okay.
>> Dominic Berry: Sort of the motivation for this is doing things like, for example, if you want to simulate some sparse system of linear equations and [inaudible], or if we have some other element.
>>: Well, I was once involved in a computer project that could only reference permutations.
>> Dominic Berry: Okay?
>>: Essentially it required sparse loads. It was an array machine -- it could fetch a whole array at once, but it had to be a permutation. And it was very fast if you could do that.
>>: Any other questions?
>> Dominic Berry: Did anyone understand the second part of the talk? Good.
>>: [inaudible]
>> Dominic Berry: I’m a little bit worried that the questions sort of petered out at the end.
>>: The problem is that it’s not a random walk, it’s a quantum walk. It’s deterministic, which means that you have to start with a fairly good overlap with the final state you’re looking for at the preparation level to actually get there.

And the problem is the preparation doesn’t guarantee you will ever get to the answer you’re actually looking for, even though you have a reasonable preparation, unless you know -- and one way is to adiabatically evolve the preparation into something that at least has a 50 percent overlap with the final state, and then you use the techniques for walking.
There’s an alternative technique called quantum stochastic walks that supposedly fixes this by adding noise, really, so you really are doing a random walk at that point. But I haven’t implemented it or read the papers well enough to know how well it works.
>> Dominic Berry: Okay.
>>: It’s going to work worse in --
>> Dominic Berry: It sounds like a totally different application than this.
>>: But that’s the problem with most of the quantum walk papers, when you start, for instance, with the [inaudible] ones that were done. If you do a big enough graph you’ll never get the right answers, because you can’t prepare it close enough. The small graphs will work because they’re all overlapping with the final states anyway. And the same problem holds with quantum chemistry: unless you know how to prepare the state to be close to a ground state, you never find the ground state.

So if you know the answer you get the answer, and that doesn’t help much. So yeah, a lot of this has that problem. Not your stuff -- I’m talking about the underlying approach having a lot of problems. Speeding up is a good thing, but needing to prepare better is part of the problem.
>>: So do you only use this for the application of Hamiltonian simulation [inaudible]?
>> Dominic Berry: Well, that’s one application. Another application is if you’re looking at the work on [inaudible] simulation. So you have a method of doing [inaudible] simulation, which is basically doing evolution under a Hamiltonian, and if you do that simulation of evolution under the Hamiltonian using a Hamiltonian simulation approach, you basically can turn it into discrete queries, whereas if it’s just a continuous Hamiltonian simulation it’s really requiring continuous queries to the oracle, which isn’t really realistic.

And also if you’re thinking of, say, trying to simulate differential equations, [inaudible] approach.
>>: Any more questions?
Thank you. Let’s --
[applause]