>> Krysta Svore: So today the Quantum Architecture and Computation Group is welcoming Nathan Wiebe here to talk with us. He is a post-doc at the Institute for Quantum Computing, IQC, at the University of Waterloo, and today he's going to be talking about designing quantum circuits for efficiently simulating many-body dynamics. So Nathan, I'll turn it over to you.
>> Nathan Wiebe: Great. Thank you very much. Thanks a lot for coming. Well, you've heard what I'm going to be talking about. This is work I've done collaboratively with [inaudible] as well as Barry Sanders. The outline of the talk is as follows. First I'm going to provide an introduction to quantum simulation algorithms. Then I'm going to discuss, by way of some simple examples, how to simulate a few very primitive many-body Hamiltonians, and then I'll use the intuition we develop from that discussion to show how to construct a classical algorithm that automates all of this reasoning. Finally, I'll discuss how these results can be optimized by grouping terms in appropriate ways for parallel execution.
So in general, quantum dynamics is known to be very hard to solve -- or at least it's believed to be very hard with the best known algorithms that we've got. I did a search, and the most sophisticated quantum simulation that I was able to find was performed on a supercomputer, and it was able to simulate the quantum dynamics of a 42-qubit system at a cost of roughly $100 million. Now, this to me was pretty impressive, because I've done these things by hand, and for general dynamics I have a really hard time pushing upwards of 20 qubits. 42 qubits is a substantial feat, especially since it gets much harder as you try to make the simulator work for bigger and bigger systems. In general, the computational resources needed to perform the simulation scale exponentially with the dimension of the system. So, for example, if 42 qubits took one of those supercomputer clusters, then under optimistic scaling we'd need two for 43, and so on and so forth. This gets absurd, because if we want to simulate even a modest-scale quantum system, say 115 qubits, then going through the math and computing the weight of all of the processors in the supercomputer required to simulate the 115-qubit system in the same amount of time, we'd need an amount of processor mass at least equivalent to the mass of the earth. So something's wrong here, especially if we believe that nature contains 115-qubit systems -- i.e., systems of, say, 115 interacting electrons. So how is nature getting away with answering these questions without requiring a supercomputer the size of the planet? The answer, I believe -- and many in my field believe -- is through quantum information. The basic idea is that we'd like to get around this problem by exploiting quantum mechanics in order to simulate quantum mechanics. The intuition works as follows. Imagine we've got some physical system that we would like to simulate. Specifically, we have a quantum state in that physical system, and it follows some natural evolution, described by a system Hamiltonian, to a final state over here. We'd like to simulate that by taking this initial quantum state and constructing a state that's logically equivalent to it in a quantum computer.
Then, instead of the nice, smooth evolution generated by the Hamiltonian, we approximate the evolution by a sequence of quantum gates that are permitted on the particular quantum computer. The result of applying those quantum gates, if everything is done correctly, is a quantum state that is logically equivalent, within error tolerance, to the actual state we would have observed in the physical system. So that's the whole idea behind this method of using quantum computers to simulate quantum dynamics. There are some really big strengths and some really big limitations. I'll start with the bad news first. The bad news is that, unfortunately, you get zero information out of the system by doing one simulation -- I mean zero information -- because you've got a quantum state and you haven't measured it; you haven't extracted any information from it. If you want to learn something from it, you can perform a measurement, but the problem is that for certain things you may require a large number of measurements. For instance, if you wanted to learn the quantum state in general, you'd have to do some process like quantum state tomography, which is exponentially expensive. So you're not going to get an efficient estimate of the full quantum state out of one of these simulation algorithms. Also, in order to guarantee that we can emulate the evolution with a small number of gate operations, we need to make several promises about the form of the system Hamiltonian we're simulating; this won't be efficient for all possible systems. But if these caveats are granted, then we can find expectation values and certain eigenvalues by repeating the simulation a polynomial number of times and processing the data. The other big strength is that, unlike algorithms such as Shor's algorithm, we don't necessarily need very many qubits in order to outperform what the best classical computers can do -- as the earlier discussion about the mass of the earth was hopefully alluding to. I'd now like to provide some background about how the field has progressed. Originally this idea of simulation was proposed by Feynman, when he noted that he wasn't able to write very efficient simulation code for quantum systems, but quantum systems seemed to be doing it naturally. This led him to propose the idea of a quantum computer. But it took quite some time after that for somebody to formally describe how to do these simulations. The first case was done by Seth Lloyd in '96, where he showed how to simulate quantum dynamics in the restricted case that the Hamiltonian is a sum of tensor products of low-dimensional systems. That work was subsequently improved by Aharonov and Ta-Shma and by Andrew Childs and others, and in 2007 this was improved further by Dominic Berry, Graeme Ahokas, Richard Cleve and Barry Sanders, who came up with substantial optimizations to the algorithm that caused it to run in near-linear time, as opposed to quadratic time or T to the three-halves. Recently, myself and a few other co-authors have generalized these ideas to rigorously show how to simulate time-dependent Hamiltonians as well. So those are many of the most recent results involving quantum simulation algorithms. How is the complexity of these algorithms typically assessed? It's assessed from a very high-level perspective: often what we're interested in is the number of times a Hamiltonian black box is queried.
This black box works as follows: you provide a quantum state that encodes a row and column, and it outputs, in another state, the corresponding matrix element of the Hamiltonian, H_xy. There's also an additional black box that's often considered that will tell you the locations of the nonzero matrix elements. Those two things are the resources assumed by these algorithms, and the cost is given by the number of accesses you have to make to the black box. The best of the algorithms I described previously has the following scaling for the number of calls that are needed: it scales slightly worse than the fourth power of the sparseness of the overall Hamiltonian, nearly linearly with the norm of the Hamiltonian and the evolution time, and sub-polynomially with the error tolerance.
>>: [inaudible] sparse somewhere.
>> Nathan Wiebe: Of each what?
>>: The definition of D-sparse.
>> Nathan Wiebe: A matrix is D-sparse if each row and column contains at most D nonzero matrix elements. So those are basically the known results. Now, there's a bit of an issue. Unfortunately, to the best of my knowledge, very few ion traps come equipped with a black box. So the question is how we go about taking this from a theoretical description that uses a black box to something more fundamental that we could consider actually implementing. In general, unfortunately, it's very difficult to do a circuit decomposition of an arbitrary black box. However, for some particular cases, specifically local Hamiltonians, results are known. Recently Weimer, Müller, Büchler and Lesanovsky [phonetic] found a trick that can be used to simulate many-body systems without requiring a black box or any of that sort of nonsense. The main drawback of their technique is that, although they went through the analysis and their trick is correct, you have to have a lot of knowledge and a lot of intuition to see how to apply it to simulate a general Hamiltonian. It's not something you can straightforwardly tell a computer to do and have it output the correct result. Also, they don't use the optimizations proposed by previous simulation schemes, such as high-order integrators and other such tricks. What we do is rectify these two problems: we make it automatic, so that a computer can directly output a quantum simulation circuit, and we also include the optimizations that are used in the best simulation algorithms to date. So that's basically what we do. Now, just to give an outline for the rest of the talk: first I'm going to discuss simulation of many-body Hamiltonians, I'm going to define what I mean by a local Hamiltonian, and I'm going to show how to simulate some extremely simple Hamiltonians, leading into the eventual classical circuit-building program. What do I mean by a many-body system? This is probably the most trivial slide of all of them. What I mean is a system that has many interacting bodies. That's somewhat tautological, but there are many examples in physics with these properties. For example, Ising models with interacting magnetic dipoles on some sort of lattice, or qubits in an optical lattice, are often described as many-body systems. Out of these, we're interested in a particular case: K-local Hamiltonians. These are Hamiltonians that are sums of terms, each of which is a tensor product of at most K non-identity Pauli operators.
So, for example, the following operator here, H, is 2-local, meaning that it's a sum of terms that are tensor products of at most two non-identity Pauli operators. That's what I mean, and many important Hamiltonians end up falling into this category, like the Heisenberg model, the Ising model, and the toric code Hamiltonian.
>>: C1 plus C2?
>> Nathan Wiebe: It should be tensor product with identity in both of those cases, but yes.
>>: [inaudible].
>> Nathan Wiebe: In this particular case, it should be tensored with identity on that side and tensored on this side. So there are many cases of physically relevant Hamiltonians that fall into this class, and people would like to simulate them, but there aren't any efficient methods yet. Out of K-local Hamiltonians there are actually two distinct classes that I'm going to talk about briefly in this presentation: physically local, and strictly K-local. You'll notice there's actually a pretty big difference. If you make a graph with a vertex for each qubit in the system and draw an edge between each pair of qubits that directly interact, many physical models only have nearest-neighbor interactions. What that means is that, in the graph, each vertex has constant degree for these physically local Hamiltonians. Generic K-local Hamiltonians, however, allow you to have a complete graph of interactions. As you might imagine, the physically local case is substantially simpler than the generic K-local case, and because many physical systems fall into this category, it's worth discussing the difference between the performance of our algorithms in the two cases. So how do we go about simulating these things? I'm going to present the easiest possible Hamiltonian of this form that you can consider simulating: a tensor product of K Pauli Z operators with a weighting term in front. The reason this is the simplest is that it's already diagonal in the computational basis. What that means, specifically, is that if you act with it on a particular quantum bit string, it will just give you that bit string back. To see why this is so easy, let's consider the action of the time evolution operator, which is defined to be e^{-iHt}, on an arbitrary initial state. By decomposing the state as a sum of computational basis states, we get that by definition. Since the Pauli Z operator just applies a phase flip depending on whether the corresponding bit is 0 or 1, we can look at what this operator does: it simply applies a phase depending on the values of, in this case, the first qubit, the second qubit, and all the way up to the Kth qubit. Here I don't mean direct sum; I mean exclusive-OR between all of these bits. What that means is that we can simulate the evolution by simply doing a Z rotation on a qubit that encodes the parity. If we can store the parity of all of these K bits and perform a Z rotation on it, then we enact exactly the phase rotation we see here. That's the idea behind the simulation circuit. The corresponding circuit works as follows: it takes the initial qubit string, performs controlled-NOT gates onto the last qubit to compute the parity, then does a Z rotation on that parity qubit, and then undoes the computation of the parity.
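To make that parity-circuit idea concrete, here is a minimal Python sketch (not from the talk; the gate-list format and function name are illustrative only):

```python
def z_string_circuit(qubits, a, t):
    """Gate list for exp(-i * a * Z...Z * t) acting on the given qubits.

    Parity is accumulated onto the last listed qubit with CNOTs, a Z
    rotation by angle 2*a*t is applied there (RZ(theta) = exp(-i*theta*Z/2)),
    and the parity computation is then undone, so no ancilla is needed.
    """
    target = qubits[-1]
    gates = []
    for q in qubits[:-1]:                    # compute parity onto target
        gates.append(("CNOT", q, target))
    gates.append(("RZ", target, 2 * a * t))  # phase e^{-i a t (-1)^parity}
    for q in reversed(qubits[:-1]):          # uncompute the parity
        gates.append(("CNOT", q, target))
    return gates

# Example: exp(-i * 0.5 * Z0 Z1 Z2 * t) for t = 1.0
print(z_string_circuit([0, 1, 2], 0.5, 1.0))
```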
Again, we need to undo the computation of the parity to make sure that we don't require a very large number of ancilla qubits to do this computation. Now that that's done, let's talk about a slightly more complicated example. Say that instead of Z tensored K times, I switch the first two operators to Pauli X operators instead of Z. This makes it harder, because Pauli X operators aren't diagonal, so we can't just do a rotation in the computational basis. Instead, we need to transform to the eigenbasis of this new Hamiltonian, which we do by performing the Hadamard transform on the Kth bit and the first bit; then we can treat the term as a Z rotation in its eigenbasis, and then return to the computational basis by undoing that transformation. If we do that, the resulting circuit looks identical except that now there are Hadamard transforms.
>>: [inaudible] the [inaudible] numbers.
>> Nathan Wiebe: Yes.
>>: No, no. Oh, they commute.
>> Nathan Wiebe: They're on different qubits, yes. But, yes, you'd be right if they didn't commute. So that's the resulting circuit. And you can imagine Y is pretty much exactly the same. The only difference is that if we replace those Xs with Pauli Y operators, we have to do a slightly more complicated diagonalizing operation, and the one that works is the π/8 gate to the sixth power followed by a Hadamard, with its inverse on the other side. You might be wondering why the 6 and the 2: if you apply the π/8 gate eight times you get the identity, and the 6 and 2 make sure that we can represent these transformations using just Hadamard, π/8 and CNOT, which are a universal set of gates. So that's the more complicated example. With this, you can see how to simulate any Hamiltonian of that particular form -- a tensor product of X, Y and Z operators. We perform single-qubit operations to diagonalize the Hamiltonian, we simulate the Hamiltonian in its eigenbasis, and then we return to the computational basis by undoing the transformation from the first step. And that's it. This isn't quite fully general, unfortunately, because you can have a Hamiltonian that's a sum of noncommuting terms. If it's the sum of XX and ZZ, then because those two operators aren't diagonal in the same basis, we can't directly apply the same trick. So instead we use a time-honored approach: the Trotter formula. The Trotter formula says, okay, if we ignore the fact that these two operators don't commute, then we can write the exponential of the sum as a product of exponentials. The error in that ends up scaling quadratically with the evolution time, so it's only accurate for extremely small steps. Fortunately, we can always break a long simulation up into a sequence of short steps, and if we break it into R steps, we end up with error scaling like T squared over R. So we can always make R as big as we want in order to make this error arbitrarily small. The big catch is that, unfortunately, R has to be pretty darned big to make this work out. You'd like a better trade-off in some cases, and in that case the answer is to use higher-order product formulas that actually respect commutators and simulate their action.
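Before moving on to product formulas, the general single-term recipe just described -- diagonalize with single-qubit gates, rotate in the eigenbasis, undo -- can be sketched by extending the earlier code. This is my own illustrative rendering, with T6/T2 standing for powers of the π/8 gate T (T^8 = I, so T^6 is S-dagger):

```python
def pauli_string_circuit(term, a, t):
    """Gate list for exp(-i * a * P * t), where P is a Pauli string.

    term: dict mapping qubit index -> 'X', 'Y' or 'Z'.
    An X factor is diagonalized by H; a Y factor by T^6 then H
    (undone by H then T^2); a Z factor needs nothing.
    """
    pre, post = [], []
    for q, p in sorted(term.items()):
        if p == 'X':
            pre.append(("H", q))
            post.append(("H", q))
        elif p == 'Y':
            pre += [("T6", q), ("H", q)]   # maps Y-basis to Z-basis
            post += [("H", q), ("T2", q)]  # and back again
    qubits = sorted(term)
    middle = z_string_circuit(qubits, a, t)  # from the earlier sketch
    return pre + middle + post
```

Since the pre- and post-gates act on different qubits independently, their relative order across qubits doesn't matter, which keeps the sketch simple.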
So an example of a slightly higher-order formula is this Strang splitting here, which is just like the ordinary Trotter formula but symmetrized. The reason I discuss this is that this formula is actually the basis of a much more powerful approximation-building technique, known as the Trotter-Suzuki formulas. The idea behind it is that we start out with this initial approximation S1 and then build a higher-order approximation by the following step: for a certain step fraction that you compute, you take two time steps forward using this integrator S1 between 0 and T, you get almost to the end, and then you do something bizarre -- you take a step back, and then you take another two time steps forward. Although this looks counterintuitive -- something you wouldn't want to do; you'd imagine you'd always want to go straight to the end -- it turns out that by doing the evolutions in this convoluted order, you can cancel out many of the error terms. So rather than having an approximation which is order T cubed, this ends up giving you an approximation that's order T to the fifth.
>>: The previous slide.
>> Nathan Wiebe: Sure. This one?
>>: Okay. I'm not asking anything. Just wanted to soak it in.
>> Nathan Wiebe: Sure. No problem. There's a lot of material here; my apologies. Okay. So this is a method that you can use to turn S1 into a higher-order integrator, and the method doesn't end there. It turns out you can plug S2 into the same approximation-building step and make S3, which will be a seventh-order formula. In general you can keep doing that, and here are the coefficients that end up working for generating a Kth-order formula. That is the method that is generally used to fully optimize these simulations. By doing this, we end up with near-linear scaling of the circuit size with T; without this optimization, staying with a low-order formula, we'd end up with something like T to the three-halves or worse. So that's the technique that in general lets you get around these problems involving sums of Hamiltonian terms. However, a question you might ask is: what the heck does R have to be? Can you give me a value of R and promise that if you choose it, the error will be at most epsilon? In general, upper bounds have been proven. The two best results out right now are these, and they predict that the difference between the Kth-order Suzuki integrator and the actual evolution is upper bounded by this expression over here. Numerical evidence suggests this upper bound is far too loose -- it can be too pessimistic by three to four orders of magnitude in some cases. But nonetheless, the results prove that if you take a number of time steps greater than or equal to this quantity, which is derived from the error estimate here, then you guarantee that your error is going to be less than epsilon. That's the importance of this: it promises that your error is less than epsilon. And this is especially important for simulation, because it's hard to assess the error in a simulation that you can't compare to the output of a supercomputer. So it's important to have bounds on the error. But if you don't need to be rigorous, then you can take R to be whatever you feel like.
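As a sketch of that recursive construction (standard Trotter-Suzuki, with the usual coefficient choice s_k = 1/(4 - 4^{1/(2k-1)}); the schedule format is mine, for illustration):

```python
def suzuki_steps(order, dt):
    """Return the list of sub-step durations making up one Suzuki step
    of the given (even) order over total duration dt.

    Order 2 is the symmetrized (Strang) base formula. Each higher order
    replaces one step with four forward fragments and one backward
    fragment, which is why some durations come out negative.
    """
    if order == 2:
        return [dt]
    k = order // 2
    p = 1.0 / (4.0 - 4.0 ** (1.0 / (2 * k - 1)))
    inner = suzuki_steps(order - 2, p * dt)
    middle = suzuki_steps(order - 2, (1.0 - 4.0 * p) * dt)
    return inner + inner + middle + inner + inner

# Example: fragment durations for one fourth-order step of length 1.0;
# note the single negative (backward) fragment in the middle.
print(suzuki_steps(4, 1.0))
```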
Furthermore, optimizing your choice of R is also important, because the total cost of implementing the simulation algorithm is proportional to R.
>>: I lost track of what K is.
>> Nathan Wiebe: K here should be the order of the Trotter-Suzuki formula. Sorry, that's a typo. Again, just to summarize the strategy: we want to simulate an exponential of a sum of noncommuting terms; we use one of these high-order splitting formulas to break it up into a sequence of exponentials that are just like the kind we discussed previously; and then we use the circuit decomposition intuition we developed from that in order to handle each of those independently. That's how you go about building a simulator. The question is: how do you automate this in a fashion that isn't more work than doing the simulation in the first place? The last thing you want to do is make a Rube Goldberg contraption of an automation procedure. Fortunately, our algorithm is a lot better than those examples of automation. The idea behind our classical algorithm is this. We begin with the input: an efficient representation of the system Hamiltonian as a string; a parameter that says how local the Hamiltonian is, i.e., K, if it's a K-local Hamiltonian; the number of qubits the Hamiltonian acts on; and the desired evolution time and the desired number of time steps used in the overall evolution. Then the circuit design algorithm automates the reasoning I discussed previously and outputs a string that represents the quantum circuit. So now let's discuss how to encode the inputs and the outputs. There are many ways -- this isn't unique; it's just something we came up with. The idea is basically this: the Hamiltonian in general will look something like this, a sum of local terms with different weights in front of them. We store the weights as a vector, denoted boldface a, and we also store strings that encode each of the individual terms; they tell you whether a term is an X times an X, or a Y times a Z. If we store two strings encoding those pieces of information, then we have a complete description of the Hamiltonian. The way we practically do it is as follows. We encode each term as a concatenation of two strings. The first string, L, represents the numbers of the three types of Pauli operators present in a given term. So, for example, in this X0 X1 case there are two X operators, no Y operators and no Z operators, so we store the string 200. The string S, on the other hand, tells you the locations of those Pauli operators -- and it stores the locations of the X operators, the Y operators and the Z operators separately, to avoid ambiguity. In this particular case there are no Y or Z operators, and the X operators act on the zeroth and first qubits respectively. That's how we go about encoding it, and the encoding of the first example Hamiltonian is given here. So this represents the encoding of the total Hamiltonian. This description is also efficient for constant K, because we only need a constant-sized string to represent each term.
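A minimal sketch of that term encoding in Python (the tuple layout below is my guess at the spirit of the paper's string encoding, not its literal syntax):

```python
def encode_term(weight, paulis):
    """Encode one Hamiltonian term, e.g. 0.5 * X0 X1, as (weight, L, S).

    paulis: dict mapping qubit index -> 'X', 'Y' or 'Z'.
    L counts how many X, Y and Z operators appear in the term;
    S lists the X, Y and Z locations separately, to avoid ambiguity.
    """
    locs = {p: [q for q, v in sorted(paulis.items()) if v == p]
            for p in "XYZ"}
    L = (len(locs['X']), len(locs['Y']), len(locs['Z']))
    S = (tuple(locs['X']), tuple(locs['Y']), tuple(locs['Z']))
    return (weight, L, S)

# Example: 0.5 * X0 X1 encodes as (0.5, (2, 0, 0), ((0, 1), (), ())),
# matching the "200" count string and the X locations (0, 1) in the talk.
print(encode_term(0.5, {0: 'X', 1: 'X'}))
```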
Furthermore, there are at most N to the K such terms, and K is a constant, so for a K-local Hamiltonian there's a polynomial number of terms, where N is the number of qubits in the system. Each term has constant size; therefore the input is polynomial, and therefore it's reasonable to work with.
>>: How is this bracketed zero different from a regular zero?
>> Nathan Wiebe: That's the empty string.
>>: [inaudible].
>> Nathan Wiebe: It means null.
>>: This is null. And a stand-alone zero is one element with --
>> Nathan Wiebe: Yes. The output is similarly encoded. We encode a CNOT gate in the output as just the string CNOT with i and j following it; a Hadamard gate as H with an i indicating which qubit it acts on; and the π/8 gate similarly. That's the output, and it's designed to be implemented in order from left to right. So the way this ends up working is: you begin with a description of a Hamiltonian that you have at an abstract level; you encode it as a sequence of strings; you feed that to your compiler; and your compiler outputs another string, which can then be directly interpreted as a quantum circuit. That's the overall idea. Again, the strategy we use to achieve this is: we use a Trotter-Suzuki formula to turn our simulation into a product of exponentials; we then generate a simulation circuit for each exponential and concatenate them together to get a simulation circuit that describes one of the R time steps. The final step is very easy: we just glue R identical time steps together, one after another, and that enacts the overall simulation. Now I'd like to discuss in more detail how to encode the output of the Trotter-Suzuki formula in a fashion that fits inside that paradigm. The idea is that, using some algorithm -- it doesn't matter exactly how -- we end up with this product of exponentials of individual Hamiltonian terms, which I've unfortunately denoted H1 through H3. These aren't Hadamards; they're actual products of Pauli operators. So we have some output that looks like this, and we encode each of these terms as a string. Again, we don't want to store them as matrices, because the matrices are very big. We can uniquely specify each of these terms by specifying the Hamiltonian term -- which is done by giving the weight coefficient, the number of each type of Pauli operator, and their locations -- so that specifies H1, and this specifies the evolution time for the first term. So we can come up with a string representation for each of the exponentials in the product.
>>: Are those the same?
>> Nathan Wiebe: Excellent question. In general they won't be the same. The reason why -- let me go back a bit. This. Notice that these steps are not all the same length; this backwards step is much longer than the previous two. So in general many of these durations will be the same, but some of them will be different, which is why I need to specify which ones are different.
>>: For the base Trotter formula they were all the same.
>> Nathan Wiebe: For the base Trotter formula they were all the same.
>>: Before you did the steps. Okay.
>> Nathan Wiebe: Okay. So then we use the output of that algorithm -- a sequence of strings of this form -- as input for the circuit construction algorithm. Our circuit construction algorithm is pretty straightforward, although it looks kind of ghastly.
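Combining the earlier sketches, the exponential-sequence stage just described might be coded roughly like this (hypothetical glue code of mine, building on suzuki_steps and encode_term from above):

```python
def strang_sweep(terms, d):
    """Symmetrized (Strang) sweep over the terms for duration d:
    half-steps forward, a full step on the last term, then
    half-steps backward.  Second-order accurate."""
    fwd = [(term, d / 2) for term in terms[:-1]]
    return fwd + [(terms[-1], d)] + list(reversed(fwd))

def exponential_sequence(terms, t, r, order=4):
    """(encoded term, duration) pairs for one of the R time steps;
    each pair stands for one exponential exp(-i * weight * P * duration).

    terms: list of (weight, paulis-dict) pairs making up H.
    """
    seq = []
    for frag in suzuki_steps(order, t / r):
        seq += strang_sweep([encode_term(w, p) for w, p in terms], frag)
    return seq
```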
So what we do is take as input a description of one of these exponentials, and we want to output a circuit that simulates it. The way we do it is: we take each element in S_X, the description of the locations of each Pauli X operator in the term, and we transform to its eigenbasis by applying the Hadamard transform on each qubit on which an X acts. That's what the first step says. The second step says: do the exact same thing for the Y operators, except we have to use a slightly different diagonalizing operator -- this ghastly thing here -- but it's intuitively exactly the same. So after these two steps, the individual exponential has been transformed to its eigenbasis. Then we find the particular value L which is the maximum entry of all of these sets. The reason we do that is because, as I mentioned, we want to compute the parity when we do these things, and we'd like to not use an extra ancilla qubit to store that parity. What this step does is find, among the qubits that the term acts on nontrivially, the one with the highest label, and the appropriate rotation is applied just on that qubit. Then, for each element in this S_i over here -- i.e., each location on which at least one Pauli operator acts -- we need to compute the parity. We do that with a CNOT whose control is the qubit in question and whose target is that last qubit. And obviously we don't do a CNOT from a qubit onto itself; we can skip that. Continuing through: the R_Z operators that I alluded to are not in our fundamental gate set, so we have to transform them into things that are. To do that we use the Solovay-Kitaev algorithm, as described by Dawson and Nielsen, though more efficient versions may be in existence soon. I should of course mention that in particular architectures, R_Z gates are relatively natural to carry out, but we stay away from that because arbitrary-precision rotations aren't a very physical resource, so we prefer to discretize into our fixed gate set. The last three steps are pretty self-explanatory: they just undo all the stuff that was done up here. So that's it.
>>: So in step 5, we had some epsilon as input to the overall algorithm, right?
>> Nathan Wiebe: Right.
>>: So how do we translate that into a tolerance requirement?
>> Nathan Wiebe: Excellent question.
>>: For Solovay-Kitaev.
>> Nathan Wiebe: How that translates is: from the value R you use for the number of time steps and the order of the Trotter-Suzuki formula, you know how many exponentials you're going to have in your product. So you want the error in the Solovay-Kitaev step to be like epsilon over the number of exponentials, because the error grows at most linearly with the number of such terms. You can then guarantee that the total error will be at most a constant multiple of your error tolerance. So that's what you do. Now let's put this all together and describe how we would build the overall simulation from this. It's again more or less what I said.
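A hedged Python rendering of those steps, self-contained this time (the Solovay-Kitaev discretization of the Z rotation is left symbolic, since it's a separate algorithm; gate names and formats are illustrative):

```python
def circuit_for_exponential(encoded_term, duration):
    """Circuit for exp(-i * weight * P * duration), following the steps
    above: diagonalize the X and Y factors, accumulate parity onto the
    highest-labelled qubit with CNOTs, rotate, then undo everything.
    """
    weight, _counts, (sx, sy, sz) = encoded_term
    pre, post = [], []
    for q in sx:                        # step 1: Hadamards on X locations
        pre.append(("H", q))
        post.append(("H", q))
    for q in sy:                        # step 2: T^6 then H on Y locations
        pre += [("T6", q), ("H", q)]
        post += [("H", q), ("T2", q)]
    qubits = sorted(sx + sy + sz)
    target = qubits[-1]                 # steps 3-4: highest-labelled qubit
    cnots = [("CNOT", q, target) for q in qubits if q != target]
    rz = [("RZ", target, 2 * weight * duration)]  # step 5 would
    # Solovay-Kitaev this rotation down to the fixed H/T/CNOT gate set
    return pre + cnots + rz + list(reversed(cnots)) + post
```

Gluing the full simulation together is then just concatenating this over the exponential sequence and repeating the result R times.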
We apply the Trotter-Suzuki algorithm with the inputs that describe the particular Hamiltonian, and compute a sequence that describes the set of exponentials needed to perform one small time step. Then we glue R such time steps together and output the result. That's basically it. So now that I've described the overall algorithm, I'd like to talk about some applications to show how well it performs. One of the classic examples worth talking about is the honeycomb model, which is important because of its relevance to topological quantum computing. This is an example of a physically 2-local Hamiltonian: each qubit, represented by a circle here, is coupled only to nearest neighbors, and furthermore each of them, regardless of how big you make the system, is coupled to at most three other qubits. Because the number of couplings doesn't increase with the size of the system, it's physically local, not just K-local. So that's what we --
>>: [inaudible].
>> Nathan Wiebe: The difference between white and black? There's a -- okay. Basically, they're not of the same class. All the Xs can be translationally mapped onto each other, whereas the couplings have a different sort of form for the whites and the blacks; that's why they're listed differently.
>>: [inaudible].
>> Nathan Wiebe: Yeah. So if you look at the Z couplings -- sorry, the white sites, which couple downward by Z here -- we can't just translate that down to this lower one without rotating to make the two equivalent. All the rest of these, though, can be mapped directly onto each other. So it's a rotation. When we do the simulation, there are three different types of interactions we get out of this, and each of those interactions can be decomposed into a simulation circuit that handles it. Going through the work I described previously: these describe the YY couplings, these describe the XX couplings, and these describe the ZZ couplings. We glue all of these steps together and we can figure out the cost. Doing this, we find, for a fixed number of time steps R, the following number of gates required for the simulation -- which is a neat result, because to the best of my knowledge nobody before this had provided a decomposition into fundamental gates like this. And again, you can do better; we listed it in terms of Z rotations because [inaudible] could give slightly different results depending on what your rotation angles are for a given time.
>>: What's the N down here?
>> Nathan Wiebe: N is the number of qubits in the system.
>>: Lives.
>> Nathan Wiebe: Yes. And the value R ends up scaling like this, and you can treat little m here as being order N in this particular case. So roughly speaking this ends up scaling like N squared, which is very efficient. Similarly, let's take a look at simulating models of superconductivity, particularly pairing models between electrons. Wu, Byrd and Lidar [phonetic] showed that such Hamiltonians can be written in this particular form over here. This form is also amenable to our simulation techniques, so we can simulate it, actually even using the exact same circuits we described for the honeycomb model. Doing the exact same thing and gluing everything together, we find the following costs. One of the key things to note is that before, we basically had N times R.
But here we have N squared. That difference is because the model -- if I go back, you'll notice -- actually isn't physically local: there's coupling between every pair of qubits in the system, and that fundamentally changes the complexity. Our result gives performance that scales like N to the fourth, and slightly worse than linearly in the time you want to simulate the system for. This is much better than the previous result, which is nearly quadratically worse in the simulation time and has a worse factor of N as well. So that's what we get out of this. In general, we can talk about what this looks like when applied to an arbitrary K-local Hamiltonian or an arbitrary physically local Hamiltonian, and the number of operations ends up scaling something like this: a little worse than N to the 2K for a generic K-local Hamiltonian, whereas for a physically local Hamiltonian it's a little worse than N squared. That's a big difference for a large value of K; even for K equals 2, it's a nearly quadratic difference. So that's one of the reasons why distinguishing between these two classes of local Hamiltonians is important in our context.
>>: What's the meaning of the set membership symbol here?
>> Nathan Wiebe: I mean it the way I would use big-O notation, but I can't put a big O around the outside of it because of my use of little-o in the exponent here; it would make for bad reading. The way to read this is that it asymptotically scales as this function.
>>: It belongs to this asymptotic family?
>> Nathan Wiebe: Yes, exactly. So how do we optimize these circuits further, now that I've provided a rough prescription for how to do it? One of the ways is to notice that many of the terms in the Hamiltonian will commute. If we were to just randomly feed these commuting operations into the Trotter-Suzuki formula, it would kill the depth of our circuit, because there are all sorts of operations that in principle could be done in parallel, but no longer can be if we naively push them through. So one of the ways to optimize the results is to find the terms that commute with each other and group them so we can execute them at the same time. Here's how we do that. You can see an example for this particular Hamiltonian: here are two different ways we can use the Trotter formula to approximate the time evolution operator. In the first case, you'll notice the two operations in the brackets commute, because they act on different qubits; in the second case, we can't commute any of the terms. If we run the circuit design algorithm, the resulting circuit for case A looks like this, and for case B it looks like that. It's a little hard to get an eye for, so I drew this out so you can get an idea of what the circuit depth looks like, and you'll notice that in the first case the depth is reduced in contrast to the other one. So that's one of the things we can do. How do we find out whether terms in the Hamiltonian commute? We look at the following relationship: it turns out that two terms in the Hamiltonian commute exactly when this condition over here holds.
Basically, what I mean here is that the number of locations where the two Pauli strings both act nontrivially but with different Pauli operators has to be equal to 0 mod 2 -- an even number of locations where they have differing action. You'll see things like this in, say, the toric code, where you have XX couplings and ZZ couplings acting on the same qubits, but the two terms commute because there's an even number of such interchanges. This criterion captures that, as well as the case where the terms act on entirely different qubits. That's what we get out of it, and our grouping algorithm for these terms is really straightforward: we just compute that condition and group the terms so that all commuting elements in a particular group are done first, then every element in the next commuting group after that, and so on and so forth. So that's how we do that. The question is: what does this do for our algorithm? There are actually substantial improvements in the complexity for the two cases. In these particular cases, using grouping, we end up with scaling in the number of qubits that goes like N to the 2K minus 1, up to small sub-polynomial factors; whereas if we didn't use grouping, you'll remember, the number of operations -- which is equivalent to the time, if we don't use parallelism -- scales like this. So we get a reduction by a factor of N from grouping, in the worst-case scenario, by using this algorithm. For physically local Hamiltonians it's even better: we go from something like N squared in our previous analysis to something like N, if we use grouping and perform all commuting operations in parallel. As a result, this can really seriously improve the performance of our algorithms on architectures that permit parallel execution; on architectures that don't permit parallel execution, it's not clear what form of improvement this gives. So, to conclude: we present a constructive algorithm that generates circuits for simulating many-body systems. Our circuits are more efficient than previous constructions that have been considered, and, as an important point, we don't actually require any ancillas: we use N qubits to simulate N qubits, which is a big advantage for existing quantum computer implementations that are limited to at most 12 qubits or so at present. We also show how to exploit parallelism, which can lead to improvements in execution time on hardware that allows parallel execution. There are a few open questions. One of the important ones, I think, is: this grouping idea we came up with -- is there more we can get out of it? By expressing the terms in ways that commute, are we actually reducing the error in the Trotter-Suzuki formula more than the upper bounds say? Equivalently, when we do this grouping, every set of commuting terms has a simultaneous eigenbasis; yet in the procedure I described, at every step, even for the commuting terms, I was transforming to the diagonal basis, performing a simulation, transforming back to the computational basis, then transforming to the diagonal basis again. In many cases an intelligent choice of the diagonal basis may allow us to forego almost all of the basis transformations.
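Circling back to the grouping step for a moment: a minimal sketch of the commutation test and a greedy grouping, in the same illustrative Pauli-string-as-dict format as the earlier code:

```python
def commutes(p1, p2):
    """Two Pauli-string terms commute iff they differ, with both
    non-identity, on an even number of qubits (the condition above)."""
    clashes = sum(1 for q in set(p1) & set(p2) if p1[q] != p2[q])
    return clashes % 2 == 0

def group_commuting(terms):
    """Greedily partition the terms into groups of mutually commuting
    terms, so each group can be executed in parallel."""
    groups = []
    for term in terms:
        for g in groups:
            if all(commutes(term, other) for other in g):
                g.append(term)
                break
        else:
            groups.append([term])
    return groups

# Example: XX and ZZ on the same pair commute (two clashes, as in the
# toric code), so they share a group; X0 X1 and Z1 X2 clash once, so not.
print(group_commuting([{0: 'X', 1: 'X'}, {0: 'Z', 1: 'Z'},
                       {1: 'Z', 2: 'X'}]))
```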
Foregoing those redundant basis transformations is one way in which this can be improved. Another question is whether compilers that optimize the circuit output could improve the performance of the fairly naive simulation circuits that come out of this algorithm; I would be very surprised if they couldn't be substantially improved in many cases. Also, in general, we might not be interested in every possible property of the quantum state. For example, for these systems we might only be interested in the average magnetization or some quantity like that. In those cases, do we really have to keep track of every aspect of the quantum system? There may be some things we can throw out because they won't affect it. That's an open problem, and one I hope ends up getting addressed, but I don't know the answer to it. And the final question is: can these ideas be used to design very efficient simulation circuits that can be used to bootstrap a quantum computer? So thank you very much.
[applause]
>>: So, considering your second item from the top.
>> Nathan Wiebe: Yes.
>>: I would be very much interested in a case study, if you have some cases where you have manually shown how to reduce the circuit size.
>> Nathan Wiebe: The second point here?
>>: Right.
>> Nathan Wiebe: Yes. That would be something that would be interesting.
>>: Such a case study might help me actually test some of the circuit optimization ideas.
>> Nathan Wiebe: Okay.
>>: Concerning the third from the top: it looks like a projection exercise. Can you find a projection to some system with a smaller qubit size, for example -- while at the same time making sure that whatever the projection loses is not what we're interested in?
>> Nathan Wiebe: That's exactly the idea.
>>: Could you also start with the Hamiltonian design -- could you put less into the Hamiltonian, the things you don't care about [inaudible]?
>> Nathan Wiebe: That's a very good question. But in many cases, aside from direct application of perturbation theory, it's very difficult to say which terms you'll care about and which ones you won't for a particular observable. I mean, if one term is a million times smaller than all the other ones, then it's a trivial task, but in general I think it's a hard problem.
>>: And how do you foresee this helping -- the fourth point? How do you foresee this helping to bootstrap?
>> Nathan Wiebe: The way I see this potentially having application for bootstrapping is this: ideally, what you want for bootstrapping is a partially characterized and partially controllable quantum computer that doesn't have access to all of its qubits because of poor characterization. The idea is that you would perform a simulation of how some ideal circuits should perform on a subset of that larger space, and then, based on the outcomes, you benchmark and compare the performance of that particular uncontrolled system under different sets of control parameters, in an attempt to learn how to control it and increase the number of effective qubits in the system. You do this, you build a larger quantum computer after reclaiming, say, one additional qubit, and then you repeat the process until you can control every physical qubit in the system.
>>: Bigger question, though. You seem to be using implicitly the idea of qubits. And --
>>: Qubit [inaudible].
>>: All your qubits are ideal and all of your gates are ideal.
>>: You're absolutely 100 percent correct.
>> Nathan Wiebe: This analysis does not take into account any form of gate errors or gate imperfections, other than what happens by implementing our R_Z gate via [inaudible]. So all of this is done in a very idealized model.
>>: [inaudible].
>> Nathan Wiebe: And decoherence, yes. So yes, you're absolutely right: there's much more realism we can build into this. But in many cases that's going to be device-specific, and our goal here was to come up with a general circuit-building model that's appropriate on any platform. Refinements of these ideas, I think, would be very appropriate for optimizing the performance of simulation circuits in the presence of those problems, which will be specific to an implementation.
>>: Solve [inaudible] first.
>>: If that doesn't work you're dead.
>> Nathan Wiebe: You're dead in the water.
>>: Ideally.
>>: There was zero negativity in my comment.
>>: Obviously. The point is there's a lot of work to get -- I think that might have had to do with the bootstrapping question: when you start talking about a real system, now we've got a whole different layer that makes it even harder, which is figuring out how to understand the qubits you've actually been dealt.
>> Nathan Wiebe: And under what circumstances. For a real system, you're not going to be able to perfectly control them; you're not going to be able to perfectly reclaim them. That's why I think it's still an open question, and an interesting one, to determine to what extent quantum simulation algorithms can be used to build a bigger quantum computer.
>>: Nowhere here did you talk about preparation of the starting state.
>> Nathan Wiebe: Never --
>>: So are we just going to assume --
>> Nathan Wiebe: Assume you can prepare your favorite state. Or, equivalently, assume you've got some machine that outputs it to you at the beginning of the algorithm.
>>: Have you implemented a simulator on top of these strings, or was this just the work to figure out how to get the strings?
>> Nathan Wiebe: This is just the work to figure out how to get the strings.
>>: Have you coded all that? Just wondering.
>> Nathan Wiebe: No, I haven't.
>>: Is this all in the paper, on the arXiv?
>> Nathan Wiebe: Everything's in there except for the grouping. That bit is being added, in preparation for a submission to a different journal.
>>: I'm just asking what journal I should look for it in.
>> Nathan Wiebe: We'll be submitting it shortly to the New Journal of Physics, and we'll have a new version on the arXiv hopefully very soon.
>>: I'm always looking for the updated version, whatever happens.
>> Nathan Wiebe: Exactly.
>>: Good stuff. A lot to digest. I've read the paper twice; I have to go back a third time.
>> Nathan Wiebe: I think there are some improvements that can be made.
>>: That's my understanding.
>> Krysta Svore: Thank you.
>> Nathan Wiebe: Thank you very much. Really grateful that everybody came.