>> Krysta Svore: So hi, everyone. Thanks for coming to today's talk. Today, we have Martin Roetteler here from NEC. He received his Ph.D. at the University of Karlsruhe in Germany in 2001. After that, he held a postdoc position at the Institute for Quantum Computing at the University of Waterloo in Canada. And now, he's a senior research staff member at NEC in Princeton, where he leads the quantum IT group. He's published more than 90 refereed journal and conference papers and is co-author of a book on quantum information. His research interests are quantum algorithms and quantum error correction. Today, we'll hear about quantum algorithms. Martin will tell us about quantum rejection sampling. So let's welcome Martin. >> Martin Roetteler: Thanks, Krysta. Thanks a lot for the long, elaborate introduction. Very nice. I want to talk about algorithms. In particular, about a problem of sampling. You all know classical sampling, where you have probability distributions and you want to sample from them. In quantum, there are specific situations where we would like to do some sort of quantum sampling: we want to prepare certain quantum states in the memory of the computer and maybe then, afterwards, sample from them or massage them into something else. So this talk will deal with that problem, technically, in some parts. I want to start by taking you on a little detour, though. I want to start with some news coverage you see in the media about a certain company, and give my take on it. Maybe that take is still not fully developed yet. But I want to take you on a little tour and then arrive at a problem that I'm very passionate about. It's called the hidden shift problem. And I'm passionate about it because it relates to a feature of a quantum computer that I personally think is maybe the most powerful thing: the ability to do Fourier transforms. It's very, very fast.
And the thing I want to show you today is something that maybe you've seen before, maybe not: you can use the Fourier transform in a way that helps compute convolutions. Classically, that's a very important feature of the Fourier transform. Quantumly, I think that feature has not been fully leveraged yet, and I want to take you on that trip too and show you how it can be done. Let me start first with all the hype. You see a lot of articles these days about quantum computers: are they around yet? Some machines are called quantum computers where it's maybe not clear yet whether they're indeed quantum. And Google apparently has recently purchased one. Lockheed purchased one a few years ago. So I have a few slides at the very end of the talk where I give my personal take on it. But I don't know. I personally don't know. Is it quantum or not? I don't know. I'm very curious about it. And for the last three or four years, I had to deal with that situation, because our management really wanted to know what to do with these machines. So I had to form an opinion about them, and I can share with you what my opinion is. And in a nutshell, I'm curious. I would like to find problems that are very good for these machines and where we could leverage them, or I would like to find more evidence that shows that these machines do not work. In particular, if you scale to too many spins, there are maybe reasons coming from statistical physics that tell us that these machines will break in some way. So I want to give you a little bit of a flavor, even though I'm not a physicist; I'm a computer scientist. I can only repeat some of the work we did in our group regarding that specific analysis. Okay. Coming back to the question: is it hyped too much, or is there a breakthrough coming?
So I want to talk about hidden shifts, which is a problem about convolutions, right, how we can use the Fourier transform to compute convolutions, and then get into rejection sampling, which is the idea of massaging a state into a target state, roughly speaking. We analyze that using semidefinite programming for a very specific problem. And then we get into some other algorithmic areas. Oh, by the way, feel free to ask throughout the talk. I like it much better if people ask right away. So any questions up to this point? Okay. So why do we do quantum computing at all? Well, the ultimate prize would be to find problems where we can show exponential speedups. That's the promise of a quantum computer, to find problems where we can beat a classical machine, like the best known classical algorithm, potentially. That's, I think, the holy grail of the field, and, of course, there's been a lot of work done where people show polynomial speedups, maybe a square-root improvement or a two-thirds improvement or so. These things are also very interesting, but one has to look much, much more carefully at a concrete instance to see whether you get a speedup or not. The constants really matter. I actually just finished, this weekend, my involvement in an IARPA project, so I survived the thing, where we had to look at massive sets of instances and perform baseline estimation: what is the actual cost of the algorithm, given a hardware model? It turned out that the polynomial speedups don't help much at all in these cases, because there are so many overheads. Due to error correction, for instance, you have to pay three, four, five orders of magnitude of overhead.
But there are other overheads we also have to pay. To make things reversible, for instance, you pay an overhead that usually leads to very big additive terms in our estimates, and they're so big that in order to show an improvement, you have to go very far out in the instance size. If you're interested, I can talk much more about that too. In a nutshell, we try to leverage interference in an algorithmic way. So we have computational paths, and we want to bring them to interference to cancel them out. That's something you cannot do with a classical, probabilistic computer. Otherwise, a quantum computer is a kind of probabilistic machine, but it has the extra ability of cancelling paths that you don't want. You can bring them to destructive interference. And right now, it's known that quantum computers are good for some problems that fall maybe within cryptography; it's fair to say the problems where they're really, really good are breaking RSA and elliptic-curve systems and such things. But presumably for a real business case, one would want to look at different problems, because how many of these could you sell if that's the target application? It's not so clear, right? If the quantum computer, however, is used for simulation of quantum systems, arguably there's a bigger market for that, right? More people would purchase a machine specifically to simulate quantum systems than the number of people who would purchase a machine for breaking RSA. My personal belief is that if you indeed get to the state where you have a large-scale computer, people would probably move to some other schemes for public-key cryptography. Even then, they would need larger keys and whatnot. So that's actually another thing that's worth studying: can we tackle those too? But I think the real market value, some day, will sit in simulations. NEC's quantum computer is based on superconducting qubits, and it's quite small.
This is an older layout where the qubits were very strongly coupled, very close together, and there was a circuit that performed the coupling. Nowadays, there are other ways to lay out a circuit. Santa Barbara has a group that's top in that field, and Yale and IBM, for instance. In Europe, there are very good groups. So the current trend is not to do these very close geometric couplings anymore, because they couple the qubits too strongly. Nowadays there are so-called resonators between them. They're like little cavities, so you can put the qubits very far away and still be able to couple them. And that has many advantages. For instance, you can read out the individual qubits much better than in this layout. So this one was good to demonstrate two-qubit algorithms. In a lab, two-qubit algorithms nowadays can be done with superconducting qubits in the circuit model. And as far as I heard from people who were at the APS meeting this year, the trend is now to scale them up. People are very careful scaling up, but there are groups who can do three. There might be groups who can do four, and there might be groups who can do up to five or six or so in the near future. So this will definitely scale up. Not this particular layout, but the one with resonators, I think, will scale up quickly. We're not there yet where error correction really would play a big role. But arguably within a few years, there will be a time where the systems get into the hundreds and thousands of qubits, and then error correction will be important. Any questions so far? That's just to motivate what we're doing. Okay. So I'm sure many have heard that for factoring, there is an efficient quantum algorithm. The best known classical algorithm is the number field sieve, the generalized number field sieve, to be precise. It has a run time that's exponential in the cube root of the number of bits of the number you want to factor.
And the best known quantum algorithm has a polynomial running time. So there's a big gap there. Of course, you might ask how you actually implement this, and that's a big topic. But it would break a lot of the public-key crypto that's currently used to set up, to initiate, secure communications. People usually switch to a symmetric cipher once the connection is initialized, but in the initial stage, you have some asymmetric method. So that would be broken. And some of the things I worked on were generalizations of this idea. In essence, these are abelian problems; things commute. You have a group, which in this case is just a large cyclic group, and you have a subgroup of that big cyclic group with some property. In this case, you want to find a period, so that specifies a subgroup of the group, which you want to find. And then the quantum algorithm can find a generator of that subgroup, which, in turn, can be used to factor. So this idea itself can be abstracted out, and one arrives at a problem called the hidden subgroup problem, and you can apply that suddenly to many groups. You're not restricted anymore to cyclic groups or abelian groups. You can apply it to any group, any finite group. You can even apply it to infinite groups. You could apply it to the SU(2) group and so on. People have thought about all these things. And why did people do that? Well, there are some instances which would be very nice. If you could solve the hidden subgroup problem completely generally, there would be beautiful things we could suddenly do. Actually, I have a slide about that later in the deck. We could do graph isomorphism. We could decide whether two given graphs are isomorphic or not. We could do lattice problems. Certain lattice problems we could answer, like finding the shortest vector in a lattice of some type. There's always a little asterisk, a little fine print, saying it doesn't work for all lattices, right?
The lattices have to satisfy some condition. But we could certainly tackle some of these SVP problems too if we could do it generically. >>: [indiscernible]. >> Martin Roetteler: That's a good question. The question is, how do we actually specify the subgroup in the general setting? It always has to be given implicitly by a function that you can evaluate as a circuit, for instance. The function takes as input a group element. So there's a parent group, an ambient group. It needs to take as input that group element and output something. That output domain, that range, is not really important. But it must have the property that you can evaluate the function. And then you must have the promise that the function takes the same value on the subgroup. If it would fluctuate in value there, the approach would immediately break. >>: A characteristic function? >> Martin Roetteler: It's a characteristic function, but it needs to be more. Is there some board I can write on? If I have just a characteristic function, it's not going to be enough. That's a very good point, actually. Suppose that's my ambient group, right? It's a big domain. Suppose I encode the elements as bit strings of length N, some kind of encoding. And then these guys would be implicitly defined by a function. But if it were just a characteristic function, so I get constant one here and, say, constant zero out here, this approach would not work, actually. Because what we do is take Fourier transforms of that function, sort of. And if it were a characteristic function, think of it like this. You have your group G, you have your H here, and you've got your little bump telling you where it is, right? But you've got all these zeroes out here. Now imagine that's encoded in a phase somewhere, right?
Phase-encoded, that would mean maybe I get a phase of minus one here and a phase of plus one out here. If I take a Fourier transform of this, I get a huge peak at the zero frequency. That's not good. And a Fourier transform, really, to be quite honest, is the only way we can tackle these problems. We go to a Fourier domain for this, which is not the standard Fourier transform. It's a Fourier transform for that group G. I can talk about that a little bit later too. But it's a way to diagonalize the group action in a basis so that the action suddenly becomes very local, right? Technically speaking, there are irreducible representations of that group, but you might think of them as being frequencies. So in the frequency picture, you would get a huge peak out here and then maybe a little bit of signal that tells you what that H actually is. But if it's just that, it would not work. >>: People can't see it. >> Martin Roetteler: I'm sorry, okay. I was just going to indicate down here that the spectrum of that guy would have a huge peak here and then maybe a little bit of signal outside. It would encode what that H is, but if you just measure, you would typically fall in the big peak and not sample anything. So that's why a second ingredient is necessary, and that's different values on different cosets of the group. If you shift the function slightly, you want different values. You don't want to have just that one bump. You want to have maybe a phase here, a phase here, and so on. And that turns out to be enough for abelian groups. So if the ambient group G is commutative and you have that partitioning into all these characteristic functions of all the cosets, and they're all different on different cosets, then you can, by magic, reconstruct this H. So that's why it's a little bit restrictive. Essentially, it's the idea of taking a characteristic function, that's right.
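The zero-frequency peak just described can be checked with a classical toy model. This is my own illustration, not from the talk; the specific numbers (N = 64 and a "subgroup" of multiples of 8 inside Z_64) are chosen purely for the demonstration.

```python
import numpy as np

# Phase-encode the characteristic function of H = {0, 8, 16, ..., 56}
# inside Z_64: amplitude -1 on H, +1 everywhere else.
N, step = 64, 8
f = np.ones(N)
f[::step] = -1.0

# Fourier sampling probabilities (normalized so they sum to 1).
spectrum = np.abs(np.fft.fft(f)) ** 2 / N**2
p_zero = spectrum[0]                   # weight on the useless zero frequency
p_signal = spectrum[step::step].sum()  # weight on the informative peaks
```

Here `p_zero` comes out to 0.5625: more than half of all samples land on the zero frequency and reveal nothing about H, which is exactly the big peak the talk is pointing at.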
The question is, what are the real-world problems we can map to it? Some kinds of lattice problems can be mapped to it. Graph isomorphism can be mapped to it. And there's some more. Sometimes it's kind of contrived, but there is definitely more work to be done in that particular area. >>: [inaudible]. >> Martin Roetteler: It would be very non-abelian. For graph isomorphism, if you take isomorphisms of rigid graphs, where the graphs themselves have no self-isomorphisms, you would get a wreath product of S_N with S_2, so you get two copies of S_N. You're allowed to permute the first graph and the second graph, but you're also allowed to swap the two graphs. So this would be the group. It's kind of nice. The representations are very easy to write down; they're just pairs of representations. There are two types. One type is given by a pair of representations of S_N, and then you can induce that up to the group. There's a process called induction that gives you that. And there's another type where, if you have the same representation on the two factors, it can be shown that you can actually extend it to G. So there are two cases: you can extend it, or, if the two representations are different, say different shapes, you can induce it to G. So it's completely understood what the representations are. It's completely understood what the Fourier transforms are. What's not understood is what sense we can make of that non-abelian Fourier spectrum. If the group is this nasty thing, we suddenly don't have scalars in the spectrum anymore. We have matrices in the spectrum, because the spectrum looks like a block-diagonal list of matrices. What used to be a Fourier coefficient is now a matrix. And the question is, if we zoom into these matrices, which is what we can do physically, we can actually assume that we are given one of these matrices; we can prepare it.
The question is, what do we then learn about the hidden subgroup from just having that matrix? And that actually is very frustrating for this group in particular. There does not seem to be much information in just a single one of these matrices. Actually, we can show that if you just follow the standard recipe of preparing a superposition, sampling, generating these states, getting exactly these matrices, there is not enough information there. One might, for instance, say, let's take many copies of this and then do joint processing. There might be enough information there, but nobody knows how to extract it. So that's something I actually would like to work on in the future too: can we come up with ways to make many copies of a state like this, zoom into the blocks, and then do joint processing? But not many techniques are known in general for doing joint processing on many registers at the same time. One is the so-called PGM, the pretty good measurement, but that doesn't seem to help here. All right. Yeah, more questions? So actually, some of these negative results have led to some initial results in post-quantum crypto. What that means is, let's find schemes that are still secure even if a quantum computer is around. It may seem premature to ask that question, but definitely there are people interested in that. People at NIST, for instance, might be interested in that. If a quantum computer is around, what recommendations can we actually make, what systems should we use, right? What is a good public-key system, and what are reasonable security parameters taking these quantum attacks into account? That's actually a question people might be interested in. And only very little is known right now in terms of what alternatives there are against quantum attacks, but I think it's an emerging field.
Another emerging thing, slightly different in terms of what actually needs to be done, is attacking ciphers. So you might ask, okay, now I have maybe AES, right? I can observe plaintext-ciphertext pairs. What does it actually cost to attack that on a quantum computer? What's the actual hardware cost if I wanted to break it using a quantum computer, say using Grover's algorithm? There's no exponential speedup there, but if I actually had a quantum computer and wanted to implement that attack, what would be the hardware resources to do it? That's one question, for instance, tackled in this field of quantum cryptanalysis. And one can play with certain models, variations of the model. We just finished a paper where we showed that if you give the attacker a lot of power, for instance, you want to attack a block cipher, and you give the attacker the power to ask for plaintext-ciphertext pairs encrypted not under the key but under a bit-flipped version of the key. That's a model studied in classical crypto. It's called related-key attacks. It was used, for instance, to break some wireless encryption; WEP, for instance, was broken using related-key attacks. So you don't just get the encryption itself, you get the encryption under keys that are related to the actual key, all right? In practice, of course, people just use a key, and if you know the key schedule, the next key will maybe be an increment by one or so, right? And that knowledge can be used to break the cipher. So recently, in the paper with Steinwandt, we showed that if the attacker has the ability to ask for the encryptions under all bit flips of the key, and to do it in superposition, then we can actually break any block cipher. Any block cipher.
There's only one condition on the cipher, namely that a sufficient number of plaintext-ciphertext pairs must characterize the key uniquely. But for actual ciphers, that's actually one of the design conditions, so it's not a problem. For AES, for instance, two pairs will, with very high probability, suffice. If you've viewed two plaintext-ciphertext pairs, there's only one key that can produce them. That's, of course, not a realistic attack model for actually breaking a cipher, but we found it interesting to ask, once we go there, what are the interesting attacks, and how do we formalize these things? So yeah. All right. Sometimes people joke that there are only two algorithms around, factoring and search. That's all. I'm always trying to argue, no, there's more. There is more. There are also many cases of exponential speedups now known. There are many cases of polynomial speedups now known. But there's a truth in that statement, in that there are only very few primitives, really, that we know of in algorithm design. And they stem from the ability to amplify amplitudes by some iterative process, typically very sequential: we do something that increases amplitudes we like and suppresses some we don't like, but it's slow. And the other has to do with the Fourier transform, right, the ability to find periods. And in a sense, most of these algorithms use these features, or they use them in conjunction. >>: So is it a hidden shift problem, or a hidden subgroup problem? >> Martin Roetteler: In this talk, it's going to be both. But on that slide, it is a hidden subgroup problem. Here, for instance, it's the Heisenberg group, and here, in the affine group, one can find hidden subgroups. Not for all subgroups of the affine group, but for certain subgroups, if they're large enough, one can find them. I'm not sure if there's any shift on this slide. No, here HSP always refers to a hidden subgroup problem.
So let's go back to the Fourier transform. In a nutshell, this is why Shor's algorithm works. If you have not seen it before, this gives you a very quick explanation of how it works. In order to factor a number, one reduces to the problem of finding the period of a function. What you can do is set up a state in quantum memory that has amplitudes that are essentially binary: they're either zero or nonzero, and the nonzero ones are spaced out in a very regular pattern. You know for sure that they're always spaced out with the period R that you're trying to find. Once you can find the period R, you can factor. You're done. There's a reduction to that. The only hitch is that you will not get this comb of peaks exactly; you will get it with an offset that you cannot control. That's the only downside. Otherwise, you can just set up that state perfectly in quantum memory by using the modular exponentiation map. Okay. So what to do with that? If you would just sample, it would be no good, because the next time you set up the state, you would have a different offset, so you'd just sample randomly, essentially, right? If you sample now, you get a random sample. So Shor's idea was to take the Fourier transform of this. What it does is transform that shift into a linear phase on top of the delta peaks, right? If you just look at the amplitudes, the peaks will still be very uniform, spaced out with 1 over R, but the information about the shift goes into a linear phase, a complex phase that sits on top of these pulses. But when we measure, we don't feel that phase. We pick up a particular state with probability given by the absolute value squared of that amplitude, and so we will actually sample a multiple of 1 over R. And then there is a classical reconstruction that goes from K over R and extracts that R, right? If it were just written there as a rational number, it would be easy, right?
You would just look at the denominator, and the denominator would be it. It's not as simple as that, because you cannot work with an arbitrary length for the Fourier transform; you have to work with a length which is a power of two. So you get something of the form alpha divided by a power of two. But there's a classical algorithm based on the continued fraction expansion of the number you get, which extracts that R for you. But essentially, what the algorithm does, and why it works, is it's able to forget information about something you don't want. You don't want the coset information. By taking the Fourier transform, you can forget the coset information. In general, that's also what we want to do. We want to go from the coset state that we can get, kind of zoom in here, and forget the coset representative. And what the Fourier transform does is it goes from that to some perpendicular space, all right? So if it's abelian, everything's very nice. A coset goes to just the perp group, up to phases. And then knowing the perp group, you can go back to the original group. But in the non-abelian case, there is no such concept of a perp space. That's why this doesn't work there. But for the abelian case, that's exactly how it works. This idea works for a general abelian group, and a little bit more mathematically, what happens is you would set up two registers in your memory. The zeroes stand for some state you know how to create. Actually, there are many of them. You need many qubits. You initialize them all in zero, and then you create a superposition of all inputs to your function. The function is just modular exponentiation. Then you evaluate it in superposition. You create that state, and then you can ignore the second register. And what you then have is a state that looks like this. That's all you needed, really. The bottleneck is the implementation of that map F.
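The continued-fraction reconstruction mentioned above can be sketched classically. This is a sketch of my own, not from the talk: `recover_period` and the toy numbers are illustrative, and it assumes the measured multiple of 1/R is coprime to R (Python's `Fraction.limit_denominator` performs exactly the continued-fraction convergent search).

```python
from fractions import Fraction

def recover_period(k: int, n_bits: int, max_r: int) -> int:
    """Classical post-processing sketch: the measurement returns an integer k
    such that k / 2^n_bits is close to some multiple c/r of 1/r.  The
    continued fraction expansion (via limit_denominator, which searches the
    convergents) recovers r as the denominator, provided gcd(c, r) = 1."""
    return Fraction(k, 2 ** n_bits).limit_denominator(max_r).denominator

# Toy numbers (illustrative): true period r = 12, multiple c = 5, and an
# 11-qubit register, so the measured value is round(5 * 2048 / 12) = 853.
r = recover_period(853, 11, 20)   # r == 12
```

If the multiple c shares a factor with r, the denominator comes out as a divisor of r, which is why the algorithm repeats the sampling a few times in practice.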
And there's some nice work on this. Chris in the audience works on how to lay out that F in a very shallow circuit on a 2D nearest-neighbor architecture, which is actually what you would have in a physical computer. And that, in turn, boils down to doing arithmetic efficiently. If you want to implement the function F in that particular case, that's what it reduces to. The Fourier transform itself is not even very costly; that step is not the expensive one. It turns out that you can take the classical Cooley-Tukey algorithm, which factors that very dense Fourier transform matrix into a sparse product. But it's not just sparse. It's very tensor-sparse. Sparse alone is not enough to be efficiently implementable. But if you have a lot of tensor products in there, that's always a good structure to exploit, because in a quantum computer, tensor products come for free. You only need to implement the tensor factors. Then it effects a matrix like this. So that, in essence, allows you to improve over the N log N algorithm for the FFT. In a quantum algorithm, you can do it with log squared N operations. Actually, one can even shave off a few more factors. Cleve and Watrous have shown that you can do this in log N log log N if you want. And there are various approximations one can make to the circuit. For instance, these rotations become very, very small. At some point, one can just neglect them. One can prune a lot of these controlled rotations. And there are many other things that can be done. One can, essentially, compress the circuit to linear depth by rearranging the gates. And if you have ancilla qubits available, you can compress it even further. In the extreme case, you can compress it to a depth of log N, where N is the number of qubits. But in practice, that's probably not what you want to do, because then you have a lot of ancillas to keep track of.
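The tensor-sparse Cooley-Tukey structure mentioned above can be checked numerically in a few lines. This is my own illustration (not from the talk) of the smallest nontrivial case, the decimation-in-time identity for N = 4:

```python
import numpy as np

def dft(n):
    """Dense DFT matrix of size n."""
    w = np.exp(-2j * np.pi / n)
    return np.array([[w ** (j * k) for k in range(n)] for j in range(n)])

# Cooley-Tukey for N = 4 (decimation in time):
#   DFT4 = (DFT2 (x) I2) . T4 . (I2 (x) DFT2) . P4,
# where T4 is the diagonal twiddle matrix and P4 sorts even inputs first.
F2, I2 = dft(2), np.eye(2)
T4 = np.diag([1, 1, 1, np.exp(-2j * np.pi / 4)])
P4 = np.eye(4)[[0, 2, 1, 3]]
F4 = np.kron(F2, I2) @ T4 @ np.kron(I2, F2) @ P4   # equals dft(4)
```

Everything here except T4 and P4 is a tensor product with an identity, which on a quantum computer acts on a single qubit; that is exactly the "tensor products come for free" structure being exploited.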
Anyway, the bottom line is you should really think of a Fourier transform as something that can be done at very, very little cost on a quantum computer. I like to think of it actually as being no cost. You can do a Fourier transform for free. It's like an optical computer. I don't know if you know about optical computing, but in optical computing, the idea always was that if you have a signal, which is just a light wave, it can propagate through a lens. Then if you're at the right point, in the focal point of the lens, the lens will do a Fourier transform of the incoming wave, and it will do it physically at the speed of light, almost infinitely fast. This is sort of an analog. That's a little bit cheating, of course, because you have to effect these gates. And doing a gate might take some time, right? Because you need to do it logically, and there might >>: [inaudible] I understand this is a >> Martin Roetteler: These are the rotations about one over a power of two. Yeah, the entries are something like one, and then you get e to the 2 pi i over N. Sorry, here we have 2 to the N. This is large. >>: [indiscernible]. >> Martin Roetteler: So you see, here the N is exponentially large. We need to do exponentially small rotations. Here, those would really be 2 pi i over 2 to the K, which is exponentially small. I might be off by one here, actually. Because in the smallest case, if that's the circuit, you need to do a diag(1, i) here, right? I think it's right. This one is right. So in the smallest case, if it's just one qubit, there's no phase; you do a Hadamard. If it's two, you need a controlled diag(1, i). If it's three, you need a controlled T gate and [indiscernible], and so on. But if you implement that on a logical level, these gates still cost, right? Because we have these small rotations, and you guys know best, right? They will cause a lot of complexity when you expand them.
So on top of this, say if epsilon is a constant, you get a constant-factor expansion. But that constant could be significant, right, and it might even depend on the gate set you have available for fault-tolerant implementations. >>: Would it be possible to use some other Fourier framework, like [indiscernible] or even the prime factor algorithm? If you do PFA, then your rotations are in all these different prime factors. >> Martin Roetteler: That's a very good point. I thought about it. >>: I've been trying to get Krysta to look at it. >> Martin Roetteler: Right, right. That's a very good question. The question, if I understand you correctly, is how much of the extensive classical literature can we import into quantum. It's all about the N, right? The N could be prime. The N could be a product of two primes. The N could be a prime power. The N could be whatever. And for all these cases, there are many, many algorithms that take care of it. Here, for instance, there's a method by Rader which reduces the DFT of length P, for P prime, to a cyclic convolution of length P minus 1, and then you can actually recurse, right? So you take your DFT matrix, and you realize, I don't know if you can see it, that you can permute this part of the Fourier transform matrix into circulant form by applying a suitable permutation on the left and the right. Then it's a circulant, and you can diagonalize the circulant with a smaller DFT and recurse. So these things are kind of interesting classically. In quantum, we always have the problem that we have to deal with the permutations. They might be nasty. We have to implement them. They operate on exponentially large spaces. They might still be costly.
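Rader's permute-to-circulant trick can also be checked numerically. This sketch is my own illustration, not from the talk; p = 5 and the generator g = 2 are chosen just for the demonstration. Reindexing the rows of the nontrivial block of DFT_p by powers of g and the columns by powers of g^{-1} makes each entry depend only on the index difference, i.e. gives a circulant matrix:

```python
import numpy as np

# Rader's reduction for p = 5 with generator g = 2 of (Z/5Z)*.
# The entry at (j, k) becomes w^(g^j * g^(-k)) = w^(g^(j-k)), a function
# of j - k mod p-1, so the permuted block is circulant.
p, g = 5, 2
w = np.exp(-2j * np.pi / p)
rows = [pow(g, j, p) for j in range(p - 1)]                      # 1, 2, 4, 3
cols = [pow(g, (p - 1 - k) % (p - 1), p) for k in range(p - 1)]  # inverses of g^k
M = np.array([[w ** ((r * c) % p) for c in cols] for r in rows])
```

Each row of M is a cyclic shift of the first, so applying M is a length-(p-1) cyclic convolution, which a DFT of length p - 1 diagonalizes; that is the recursion step the talk describes.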
But also, this guy, we could argue, we could do recursively. The diagonal matrix in here, though, it's been a long time since I looked at that, but I thought one has to know something about maybe the quadratic character over that field or so. Some work has to be done; it doesn't come out of the box. This one's kind of nice, because it's a tensor product. >>: That's the one I was really... >> Martin Roetteler: That one is really, that's a home run, right? In that case, we have the coprime case. We can implement it by tensor factors. Then how do we do these factors? We would have to resort to something like this. If we ever run into that case, we can't do that anymore, so we have to do a Cooley-Tukey type of formula, which is good, but it has these twiddle matrices. The twiddles connect the two factors in that case. So essentially, what you get here is a one tensor DFT_Q and a DFT_P tensor one, but there is a correction matrix in the middle. Let me call it T. That thing is a diagonal matrix, and there's also a permutation similar to a bit reversal. That thing will eat up some of the cost. But in principle, yes. What I was interested in as a Ph.D. student, and I could never show it, is this: there's a method called Bluestein's method. Bluestein allows you to embed any DFT of length N into a larger DFT of length 2 to the K. So you can choose a K large enough that you can implement the DFT of length N with a few invocations of a DFT of power-of-two length. >>: But here you don't even need such a strong result. All you really want is to find the cycle. >> Martin Roetteler: That's right. >>: So if the Fourier transform is too long... >> Martin Roetteler: You don't care, exactly. Because if you don't have the exact length, you just get some broadening of the peaks, and you're still happy because you'll have the sample.
For some of the convolution based algorithms, I think it's still fine if you have the wrong length, but then it's less clear that a widening of the peaks doesn't matter. So it's kind of all this... >>: Sounds like research to me. >> Martin Roetteler: It's research, right. But this is still an open problem: can we actually embed a DFT into a larger one, up to some correction? And Bluestein is really nice, because to embed the DFT, it again permutes the matrix and makes a Toeplitz matrix out of it, and then you have this giant Toeplitz matrix, which is Toeplitz and circulant, and you can diagonalize this guy again with that long DFT and kill it. But for that, I never found a quantum analog, because the matrices I could write down were not unitary. So, okay. I'm doing very badly on time. How much more time do I have? Like 20 minutes? >> Krysta Svore: We have the room until noon. >> Martin Roetteler: That sounds very good, but I don't want to... okay. Let me move on. Okay. This was the prime application right now in quantum computing: just sample from the Fourier spectrum of a function. But what about this idea? Classically, it's very beneficial to perform the convolution of two signals. Obviously, right, suppose you have some signal in your memory, it's prepared for you, and you want to find out how well that signal correlates with a given reference function. Maybe you're interested in parts of the signal that look like this, and you want to find the exact locations in space where it correlates very well. Classically, it's known that this can be done in N log N operations, if N is the length of the signal. How? Well, you can just invoke the Fourier transform again. You take the Fourier transform of the signal and of the function you want to correlate it with. Then you perform a pointwise multiplication of these two spectra and do an inverse Fourier transform.
It can be shown mathematically that this sequence of operations corresponds exactly to the convolution of the two functions, F and G. It's a cyclic convolution [indiscernible]. And because all the DFTs can be done in N log N operations, and the pointwise multiplication is just N operations, the whole thing is N log N. Okay? So that suddenly looks like a very good idea for the quantum computer, right, because the quantum computer can do these Fourier transforms essentially for free; it's just log N squared operations. Why can't we use it to perform massive convolutions on exponential spaces essentially for free, which would be a fantastic application for a quantum computer? Why can't we just do that? Do you see what the problem is? >>: [inaudible]. >> Martin Roetteler: Yeah, the multiplier not only gets expensive, it gets impossible, because... >>: The exponential size. >> Martin Roetteler: It's an exponential size space, and it might not be a unitary operation anymore. You see? You're right. So it has these exponentially many components, the spectrum of that reference function, which we would need to know anyway. We would need to know all these different frequencies and then perform all these little multiplications. But on top of this, some of these frequencies might even be zero, for all we know, or they might be very non-uniform. So that pointwise multiplication does not correspond to a unitary in that case. But the thing is, and this is kind of the message of this talk, for some functions it is a unitary. Exactly when the spectrum of G is flat. So let's plot the absolute value of the spectrum. If it looks like this, if it's constant, say one over square root of N, or also if it's very close to that, then we can perform the pointwise multiplication by just applying a diagonal matrix with these elements on the diagonal.
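As a classical baseline for the convolution idea, here is a short sketch (mine, not from the slides) of the N log N correlation trick: Fourier transform both signals, multiply pointwise, inverse transform, and read the shift off the correlation peak:

```python
import numpy as np

def fft_correlate(f, g):
    # circular cross-correlation in O(N log N):
    # result[s] = sum_x f(x) * g(x - s)  (indices mod N)
    return np.real(np.fft.ifft(np.fft.fft(f) * np.conj(np.fft.fft(g))))

N = 256
rng = np.random.default_rng(0)
g = rng.standard_normal(N)     # reference function
shift = 37
f = np.roll(g, shift)          # signal: f(x) = g(x - shift)

# the correlation peaks exactly at the hidden shift
peak = int(np.argmax(fft_correlate(f, g)))
print(peak)  # prints 37
```

The quantum obstruction discussed above shows up in the line `np.fft.fft(g)`: classically we can store and multiply by all N spectral values, but quantumly that diagonal of values is exponentially long and in general not unitary.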
So let's cook up a matrix, call it U_G, which has the Fourier transform at zero and so on, up to N minus one, on the diagonal. If they're all flat, then we can renormalize them by multiplying by square root of N, and then they become complex phases of modulus one, okay? So that's a unitary. Everything else is zero. And by multiplying that unitary with the spectrum, which is just F hat of omega times ket omega, we get a result that is just the pointwise multiplication of the two spectra. Okay. And then we are home and dry, right? Because we take an inverse Fourier transform and have prepared the convolution state. The question becomes... yeah, question? >>: [inaudible]. >> Martin Roetteler: The question becomes which functions we know that have this characteristic. There's actually a bunch of questions. The first question is how we actually get that state into the memory, the state we want to analyze. That's a big question. And my answer to this is we must be able to get it implicitly somehow. We must have a very small circuit that prepares that state for us. So you might ask, well, if I already have a circuit, what questions can I ask about it? It's certainly not going to be a two dimensional picture of some landscape in which I'm going to find an object, right, because we don't have a circuit for that; we'd have to actually prepare that thing into the memory. But F arguably could be a function that we want to analyze. Maybe it is a pattern of a cipher, the output of a cipher. Now you shift that same pattern in time, and you get the same pattern, but shifted in time. We certainly have a circuit for that. So arguably we can do it, but that's a tricky point, how we actually get F. The second tricky question is your point: how do we actually perform these diagonal elements out here? There must be some knowledge about G in order to do it. And the third question is, for which G is this applicable at all?
What functions actually have a flat spectrum? That sounds like a little bit too much to ask for, a function with a flat spectrum. But the third thing is actually not so untypical, because if you think of, say, a random function, it has a spectrum that is, well, not flat, but it fluctuates around this one over N, right. So it maybe looks a little bit like this, but that's kind of good. We could presumably renormalize that function: we could divide the actual value by this and make a phase out of it. Or we might do some other tricks where we add a qubit and implement that rotation even though it's non-unitary. Some of my slides deal with that situation of renormalizing. But when I first studied this problem, I was interested in what classes of functions there are that produce an exactly flat spectrum. In the finite, discrete case, you can actually find such functions that are flat in time and also flat in frequency. Yeah? >>: If you can prepare [indiscernible], could you take a logarithm of that before taking the [indiscernible] DFT? >> Martin Roetteler: You mean the logarithm of... >>: Of the DFT. >> Martin Roetteler: Of the DFT as a transformation? Like take the DFT, [indiscernible], and take the log of this? I think that would not be so easy. Wouldn't you get something like... so the eigenvalues of the Fourier transform, there are just four, right? Plus minus one, plus minus i, according to some pattern. And the matrix that diagonalizes it, that's a little bit tricky to write down. The eigenbasis of the Fourier transform, you can do it, of course. Like you can... >>: The idea here: if you take the log and then take the inverse DFT, you've got a [indiscernible] speech analysis. >> Martin Roetteler: I see. >>: And it separates out pitch from frequency periodicity. >> Martin Roetteler: I see.
>>: You have the fundamental pitch information, and so, for instance, you can become speaker independent and make the same thing [indiscernible]. >> Martin Roetteler: I see. >>: [indiscernible]. >> Martin Roetteler: I see. >>: And if you filter that at the end, you now have something you can do a lot of analysis on for speech. >>: Yeah, but that's [indiscernible]. >>: I'm just saying you could actually get the logarithm, because if you didn't have G, you only had the DFT of a path which you could prepare, let's say... >> Martin Roetteler: It's a good point. I would probably not try to diagonalize the DFT in that case, do something with the spectrum, and bring it back. I would probably look for a direct method of implementing the cepstrum. It's been a long time since I heard about that, and I forgot what the kernel was. But I have thought about some transformations related to the DFT, for instance fractional Fourier transforms or chirp transforms. They can typically all be implemented, or cosine transforms, they can be implemented too. The problem with these things is always that there doesn't seem to be a good use case in quantum computing. There has been some work on wavelets in quantum computing, for instance, how to implement wavelets, [indiscernible] cosine transforms and so on. But I never heard of any really good killer application for any of these methods. The problem is always the same: how do we get the signal inside the computer, and what is the question we actually want to solve by filtering, for instance. Having said that, there might be some applications. Maybe there's really a use case where we want to filter something out from a signal to make it independent or eliminate a feature or so. I've not thought much about these use cases. Yeah. More questions? No? So the question is what functions we can find which are flat. Before I explain some of them, the application for this would be the so-called hidden shift problem.
So in the hidden shift problem, you have a function, and now you get a shifted version of that same function, and you want to find out what that shift is. You already get the sense that that's very related to this, right? Because if you have your signal and now you shift it, say, in time, that's exactly this case. But this notion of a shift could be more general than just the time domain or a cyclic group. It could be anything. It could be a Boolean vector with the XOR, or it could be any other group, really. You could formulate that problem for any group. The catch with that is, again, well, it's a very general concept, but it really works well only if the group is abelian, and even for abelian groups, that specific problem is hard. If it's a large cyclic group, that doesn't work either. Nobody knows how to solve the hidden shift problem over a large cyclic group. Right now, the only thing I know is that it works over the Boolean domain. There it works really well. So if the shift operation is the Boolean XOR, or if one extends that to, say, Z3 to the N or some other very small but constant modulus, then one can find this S. Other than that, it's very open whether one can attack it, and that actually relates to the hidden subgroup problem. Whenever you have a problem like this, you can set up an instance of a hidden subgroup problem; you can find a suitable group. It's no longer the same group here. Even if that one is abelian, you typically get something non-abelian. >>: [indiscernible] two element groups. >> Martin Roetteler: Not two element. I mean the binary vectors are the direct product of the Z2. So, I mean, Z2... >>: To the... >> Martin Roetteler: To the N. What happens if you make the reduction? Okay, I don't know a good... it's called the HSSP, okay? No, let's call it the hidden shift problem. You can reduce hidden shift problems to hidden subgroup problems.
So if the hidden shift problem is over an abelian group G, you can set up a hidden subgroup problem over the semidirect product of that group. Sorry for those who don't know the concept of a semidirect product, but it means you take the group and you add another component, and now, in order to multiply two elements, you're no longer independent; you have to take into account what's written in the extra component to know what's going on. So that kind of... >>: [indiscernible]. >> Martin Roetteler: I was thinking of the [indiscernible] group, right, which is the special case where they interchange. But it could be modular; it could be any action of a Z2. In this particular case, it's actually the action by inversion. You take the inversion action, and that defines a semidirect product, and in the case of the inversion action, you get the dihedral group, D_N. And here's already the issue with that. If we want to solve the hidden shift problem here, we end up with a hidden subgroup problem for the dihedral group, and nobody knows how to do that. And if one knew how to do that, one could actually tackle some lattice problems. So just applying that idea to finding shifts in time, if our goal is to shift things in time and identify the time shift, it's not as easy. We don't get a free lunch. We cannot just say we'll reduce it to the dihedral hidden subgroup problem and then solve it. More has to be done. But if it's as simple as the Boolean domain, I'm going to show you next how to do it. >>: You can also apply it to GF2, the [indiscernible], because that's the hidden subgroup problem for ciphers over that. >> Martin Roetteler: G here?
>>: What do you mean? Like the group... yeah, take Z2 to the N quotient [indiscernible]. >> Martin Roetteler: And if you're interested in the multiplicative structure of that guy, that's hard. That's going to be hard, unfortunately. >>: It is hard, but it's the same... I've heard, I don't know, but I've heard that people apply hidden subgroup techniques to find the period there as well. >> Martin Roetteler: Oh, okay. If you just want to find the period, that's fine. If you want to find the order of an element, so now let's assume your polynomial was irreducible, you've got a beautiful field, you get an element and you want to find its period, that's really period finding, and yes, we can do it. Here, it seems even simpler, because we just take the additive structure of that group, not the multiplicative structure, which is this. We shift the function by an unknown shift, and we get a black box which implements the shifted function. The only thing we're allowed to do is ask the black box for input and function value pairs. And from that alone, one can actually show a classical algorithm cannot do this. There's a relatively simple argument, let's see if I have it somewhere, yeah: a classical algorithm will be faced with the problem that it makes a few evaluations of the function, and then there's always a huge number of shifts that are completely consistent with that data, right. So the classical algorithm, whatever it is, let's assume for a moment it's deterministic, will have queried the function and the shifted function, but it has access to only these very few points. Sorry, no, the red guys are the ones it queried. And if you're an adversary, you can change the problem to a different S that's completely consistent with all the samples that were made.
And that's why a classical algorithm typically has exponential query complexity for these problems. But in quantum, one can actually solve it with one query. I want to show you how that works. So there are some functions that have that property. For instance, the Legendre symbol has the property that it has a flat spectrum. Technically speaking, you have to take out the zero point; it drops to zero there, but the rest is flat. In the case of the Boolean domain, so Z2 to the N, there are functions which are completely flat. The Fourier transform there is the so-called Walsh-Hadamard transform; it's this function. And there are plus minus one functions with the property that all the frequencies, where a frequency is defined like this, are all the same, all like this. They're sometimes called bent functions in cryptography. They can only exist if N is even. There are some known families of constructions, but there's no complete classification of them, as far as I know. Some of them are very easy to write down. For instance, if N is even, you can partition the input variables into two blocks and take the inner product between the two blocks. That's always a bent function. And so on; there are a few more cases like this. So there are examples of functions which are flat, and if it's flat, you can do this. You prepare the zero state, again a register with many zeroes. You prepare an equal superposition over X. You evaluate the shifted function into an extra register, and let's say we compute it into the phase; that can always be done. And now we compute a Fourier transform. And now we use the fact that, again, if you shift a function, in the Fourier transform you pick up a linear phase. That's the really important thing. If you don't remember anything else from this talk, this is the one thing I want you to remember.
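The flatness claim for the inner-product construction is easy to check numerically. Here is a small sketch of mine (not a slide from the talk): phase-encode the inner-product function as a unit vector and look at its Walsh-Hadamard spectrum:

```python
import numpy as np

n = 6                         # bent functions need an even number of bits
N = 2 ** n

def inner_product(z):
    # split z's n bits into halves x, y; f(x, y) = x . y mod 2 is bent
    x, y = z >> (n // 2), z & ((1 << (n // 2)) - 1)
    return bin(x & y).count('1') % 2

def walsh_hadamard(v):
    # unitary Walsh-Hadamard transform via in-place butterflies
    v = v.copy()
    h = 1
    while h < len(v):
        for i in range(0, len(v), 2 * h):
            a, b = v[i:i + h].copy(), v[i + h:i + 2 * h].copy()
            v[i:i + h], v[i + h:i + 2 * h] = a + b, a - b
        h *= 2
    return v / np.sqrt(len(v))

# phase-encode f as a unit vector, then look at its spectrum
state = np.array([(-1.0) ** inner_product(z) for z in range(N)]) / np.sqrt(N)
spectrum = walsh_hadamard(state)
print(np.allclose(np.abs(spectrum), 1 / np.sqrt(N)))  # prints True: flat
```

Every one of the 2^n Walsh coefficients has exactly the same magnitude, which is the property the one-query algorithm below exploits.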
If you take a shifted function, you get a linear phase out here, and now, if you can uncompute that thing (it's kind of backwards from the earlier discussion; the roles of F and G are interchanged), if you can uncompute this by multiplying with a diagonal matrix, you end up with this state, and then you do another Fourier transform and you get S. Exactly S, and you get it deterministically. So the algorithm looks like this: it takes a Hadamard transform to prepare all the inputs, it evaluates G, it performs the Fourier transform, then it uncomputes all these Fourier coefficients, another Hadamard transform, and you get S. The only difficulty, in quotes, is that if you have a general function that's not bent, you might not have a flat spectrum anymore. Typically, you get a very non-flat spectrum. That depends on the circuit somehow; the more shallow the circuit is, it turns out, the more the high frequency guys die off. >>: [indiscernible]. >> Martin Roetteler: You must have the promise that G and F are shifts of each other. You must have the promise. This F star here is the dual bent function. That's a promise you must have on top of it. >>: You know, all you really need is some way to do that multiplication. Having it flat is one way to do it. >> Martin Roetteler: Yes. >>: Suppose, though, it was a tractable tensor product of such functions, for example. That is, it's a bunch of different things and it's a tensor product of them. So I can materialize the tensor product in the same way. I don't know if you could undo it. >> Martin Roetteler: I don't fully understand, but it sounds to me like you would take several copies of something and then use those to implement this on a subspace, maybe, and forget about the rest. It turns out that's a good idea.
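The whole one-query algorithm can be simulated on a state vector. The sketch below is mine (not from the slides) and uses the inner-product bent function, which is its own dual; the four steps match the circuit just described: Hadamards plus one phase query to G, a Walsh-Hadamard transform, uncomputing the dual's phases, and a final transform:

```python
import numpy as np

n = 6
N = 2 ** n
half = n // 2

def f(z):
    # inner-product bent function on n bits; its dual f* equals f itself
    x, y = z >> half, z & ((1 << half) - 1)
    return bin(x & y).count('1') % 2

def walsh(v):
    # unitary Walsh-Hadamard transform (the Fourier transform over Z2^n)
    v = v.astype(float).copy()
    h = 1
    while h < len(v):
        for i in range(0, len(v), 2 * h):
            a, b = v[i:i + h].copy(), v[i + h:i + 2 * h].copy()
            v[i:i + h], v[i + h:i + 2 * h] = a + b, a - b
        h *= 2
    return v / np.sqrt(len(v))

s = 0b101100                           # the hidden shift to recover (44)
g = lambda z: f(z ^ s)                 # black box: g(x) = f(x XOR s)

# 1. Hadamards plus one phase query to g: sum_x (-1)^g(x) |x> / sqrt(N)
state = np.array([(-1.0) ** g(z) for z in range(N)]) / np.sqrt(N)
state = walsh(state)                               # 2. Fourier transform
state = state * np.array([(-1.0) ** f(w) for w in range(N)])  # 3. undo dual
state = walsh(state)                               # 4. final transform
print(int(np.argmax(np.abs(state))))               # prints 44, i.e. s
```

The output state is exactly the basis state for S, so measuring recovers the shift deterministically from a single evaluation of G, which no classical algorithm can do.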
But in quantum, we've got to be very careful that we don't entangle anything outside our computation space; the stuff we want inside our computation space must not get entangled with anything outside. So if we did something like this, we must make sure that at the end of the day, all these extra tensor factors are not entangled anymore. Either they're cleaned up, brought back to the same state, or we measure them and can still say something about what happens here. I can't say we've looked at this exact idea, but I'm going to show you one idea for renormalizing coefficients by looking at a larger space. The larger space will not be much larger; it will be just one extra qubit, and we'll use that one extra qubit to perform a rotation so that we can realize any angle here. In practice, that might be very expensive, because the rotation depends on the frequency and could be any angle, so it could as well be arbitrarily costly. But then this result is really just a query result. It shows that we only need to query these boxes a few times; it doesn't say anything about the time complexity of the operations we need to do. But query complexity is a very well established model; many people have looked at it. Were there more questions? No? Okay. So if it's not flat, Dave Meyer and, I think, a Ph.D. student of his proposed a method to renormalize it like this. But that leads to very big distortions, especially if your function dies off quickly. You might pay a very big price when you renormalize it; it might not be a good phase element. I actually think of these things as phase elements, like in optics. The Fourier transforms are lenses, and these things are phase elements that just modify the wave. So we want to design a phase element that performs the convolution for us. People have done that in optics, but there don't seem to be a lot of rigorous statements about these methods.
There are a lot of heuristics for how to do it, how to enforce that it's a convolution and that it's flat, and I can tell you more about that if you're interested. But there doesn't seem to be much rigorous analysis of how long it takes to converge to a phase element and so on. So we took a different route. We wanted to come up with these phase elements, but to do it systematically. Just to rehash, we have two extreme cases. For the bent function case, we know just one query is enough, using the algorithm I just showed, to identify the shift. On the other hand, the delta function is also a valid function. You could ask to find the shift between two delta functions, but that reduces to search, right? You could encode a search problem into it: if you were able to find shifts between delta functions, the shift could be the answer to some arbitrarily complicated problem you want to solve. And for that case, it's known that you can never do better than square root of N, no matter how hard you try, even if you take several copies. There's a result showing that for search, a lower bound on the query complexity is square root of N, and it can also be matched with an upper bound. So the true answer is square root of N. But look at the spectrum of a delta function, when you have it as a plus minus one function, so it's constant everywhere except for one point where it's minus one: it again looks like this, with one huge peak. These two spectra look very different, right? On the one hand, you have something that's constant; this guy has a huge peak. So this one is easy, and this one is very, very hard. The intuition is: the flatter it gets, the more easily we can solve these hidden shift problems. We wanted to make that notion precise, and it's very tantalizing, because look at the two algorithms. This is the hidden shift algorithm, right?
Hadamard, and then you evaluate G, Hadamard, F, Hadamard, and you're done. What does Grover do? Well, it does an initial Hadamard to set up the equal superposition, and then it iterates the same operator many, many times, and that slowly, slowly rotates the state into the form you want. And at the end of the day, if everything can be done exactly, say if square root of N is an integer, you actually also get S back. So they look very similar, right? It's just that you get away with one round here, and here you do many rounds. And it turns out we can make that more precise. So the idea is: say we have a general function. We get a spectrum that might look a little bit spiky. And the spectrum is really this: it encodes the information about the shift in a phase like this, and it has these Fourier coefficients. What we would want is to forget those, to get that state, and then another Fourier transform and we're done. Here, we just Fourier transform the state and we get S. So the question is: if we have a copy of this state, or several copies of it, how can we make this state? What kind of process makes it? And the idea is extremely simple. If we have this distribution according to the Fourier coefficients, how can we make the flat distribution out of it? In classical sampling theory, that's sometimes called the rejection method. If you have the ability to sample according to some distribution, call it P, but you would really like to sample according to S, there's a way to do it. You can rescale this S so that it fits under P, and then, in order to produce a sample, you do the following. You first sample from P, which is what you can do; you have a physical apparatus that samples from P. And then you make a secondary test.
You toss a coin: you generate a uniformly distributed variable, and you accept the draw as a sample if and only if that outcome fell under here, in the rescaled S. If it falls in here, you accept; if it falls outside, you don't take the sample, and you redo the procedure: you pick another X and do the same thing. It can be shown that the expected number of times you have to do this is just one over gamma. Okay. That's also the downside of the method: if you have to rescale very much so that it fits, so your gamma is a very small number, you might have to repeat for a very, very long time. So this allows you to sample from something you actually have no physical machine for. Suppose you have no way to sample from this S directly; by just tweaking it and repeating several times, it works like this. And this thing we can do quantumly, too. So how this works is as follows. Instead of a probability distribution P, we have a state, a coherent state. It has certain amplitudes, which correspond to the probabilities, and it can be entangled with another register, actually. That other thing could be anything; we make no assumptions about it. It could even be an infinite space. What we want is to keep the same entanglement between the K and that other register, to maintain it, but to change the amplitudes to different amplitudes, the sigmas. How can we do it? In this case, it's assumed we have complete knowledge about the sigmas; they're completely known. But what is not known is these states. In the application to the hidden shift problem, they are just phases. It's not even a qubit; it's just a complex phase that somehow encodes S. But we would like to maintain the phase and make it an equal superposition.
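The classical accept/reject procedure just described fits in a few lines. This is my own sketch of the textbook method, not code from the talk; the distributions are made-up toy values:

```python
import numpy as np

# Rejection sampling: draw from target s using only samples from p.
# Accept a draw x with probability gamma * s[x] / p[x], where
# gamma = min_x p[x] / s[x], so acceptance probabilities stay <= 1.
rng = np.random.default_rng(1)
p = np.array([0.5, 0.3, 0.1, 0.1])      # distribution we can sample from
s = np.array([0.25, 0.25, 0.25, 0.25])  # distribution we actually want
gamma = np.min(p / s)                   # expected tries per sample: 1/gamma

def sample_s():
    while True:
        x = rng.choice(len(p), p=p)     # sample from the available machine
        if rng.random() < gamma * s[x] / p[x]:
            return x                    # accept: x is distributed as s

draws = np.array([sample_s() for _ in range(20000)])
freq = np.bincount(draws, minlength=len(s)) / len(draws)
print(freq)  # close to [0.25, 0.25, 0.25, 0.25]
```

The cost shows up in `gamma`: here gamma is 0.4, so on average 2.5 tries per sample, but if some p[x] is tiny where s[x] is not, gamma collapses and almost everything is rejected, which is exactly the delta-function pathology discussed below.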
So it turns out, if you want to do this exactly, with no error allowed, then one can determine exactly what this gamma has to be. There's no choice: you have to take the minimum of these quotients here, and that's going to be the query complexity. That's how many copies of the state you will need to perform the task. That's not very good if you have to do it exactly, because that could be a very, very small number. Some of these quotients could be very, very small, and that would kill the result. But it's fine to prepare the target state with some error, and that will help us a lot. >>: [indiscernible] a smaller number, not a bigger number. >> Martin Roetteler: But it turns out if some of the sigmas are small, you get a very... I'm sorry, maybe it's the pis that are bad. Some of the pis are bad. >>: But then the gamma should be a max. >> Martin Roetteler: The problem is when that quotient is a very small thing. Then we get a very large... like in the case of classical sampling, when does it happen that it's very small? >>: It's just in your picture... >> Martin Roetteler: Right, you have to scale, right. In scaling, we have to multiply by a very small number. That's what scaling does. >>: What happens when... (Multiple people speaking.) >> Martin Roetteler: I think it's fine. >>: They're rather different here. This gets really low and that goes up. >> Martin Roetteler: That's right. >>: Suppose P were minus S or something. >> Martin Roetteler: Right, exactly. Or assume that some of these components are really very, very small; maybe it drops off very quickly. That's exactly what happens for the delta functions. One component is very big, that's fine, but all the others are almost zero, and whatever you get, you have to scale against all these almost-zero components; they're not exactly zero. >>: Then you reject almost everything. >> Martin Roetteler: You reject almost everything.
It actually turns out that if you just apply this to the search problem, you don't even get the Grover speedup; you're worse than Grover if you just do [indiscernible]. But there's a middle ground between the bent case and the Grover case where this method is better than what you can do naively. So let's allow an error; we can work with an error. The algorithm now looks like this. You start with the state you're given, which corresponds to the distribution P. Now you pick an additional qubit, just one more qubit, in state zero. And what you do is you make a conditional rotation, conditional on K: you rotate that qubit into this state. You can do it; it's basically a big block diagonal matrix with two by two blocks that all perform these rotations. We can do it only because we have that knowledge about the pis and sigmas. Okay, sorry, the notation here changed from that slide, because I stole the slides from various places; those are the previous sigmas. We have complete knowledge about the pis and the sigmas, so we can perform the rotations. Then we measure the first register, and we keep only the cases where we actually measured one. If you measured one, then you have a state that looks like this: the deltas, divided by the two-norm of the whole delta vector, which is exactly what we wanted. >>: But the one could be very rare. >> Martin Roetteler: Exactly. >>: So we have to run it a lot, and it's still the same problem. >> Martin Roetteler: Exactly. And we know there are cases where we cannot hope for anything better, because of the delta functions, right? For search, we cannot hope for better. But this method helps if your function is not... if it's bent, everything is fine, right? But if it's just a little bit non-bent, if it's a random function, or if it's bent but you change only a few values, that changes the spectrum only slightly. What can we do then? The original algorithm doesn't apply anymore.
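The quantum version of the accept/reject step (one ancilla qubit, a conditional rotation, postselect on measuring one) can also be checked on a toy state vector. This sketch is mine, assuming the exact, error-free variant with real amplitudes; the numbers are made up:

```python
import numpy as np

# Coherent rejection sampling sketch: attach one ancilla qubit, rotate it
# conditionally on k, and postselect on outcome 1. The |1> branch then
# carries the target amplitudes sigma_k instead of the given pi_k.
pi = np.array([0.8, 0.5, 0.3, 0.1])
pi = pi / np.linalg.norm(pi)            # amplitudes of the given state
sigma = np.array([0.5, 0.5, 0.5, 0.5])  # amplitudes we want (unit norm)

gamma = np.min(pi / sigma)              # largest exact rescaling allowed
a1 = gamma * sigma / pi                 # ancilla |1> amplitude for each k
a0 = np.sqrt(1 - a1 ** 2)               # |0> branch keeps the rejected weight
# the conditional rotation is norm-preserving: (pi*a0)^2 + (pi*a1)^2 = pi^2

branch1 = pi * a1                       # unnormalized |1>-branch amplitudes
p_success = np.sum(branch1 ** 2)        # probability of measuring 1
post = branch1 / np.sqrt(p_success)     # postselected (renormalized) state

print(np.allclose(post, sigma))         # prints True: it's the target state
print(np.isclose(p_success, gamma ** 2))  # success probability is gamma^2
```

Just as in the classical case, the cost is hidden in `p_success`: when the given amplitudes are spiky (the delta-function case), gamma and hence the success probability become tiny, and you measure zero almost every time.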
The renormalization might ruin the whole thing. But this method is kind of graceful. It allows us to change things so that we can still prepare that state and apply the hidden shift algorithm. The rest of the algorithm is the same. So the idea is to go from that spectrum to that spectrum, and then from that state to that state, and then the Fourier transform will do the trick. Yeah, okay. Of course, we need to be able to perform these rotations, and then once we're lucky, we end up with this state, and we're done. Now we can analyze... actually, I'm sorry, this notation has changed. It kind of twisted my head now, because what was the epsilon on the previous slide is now a vector. I apologize for this. But this essentially tells you how much error we allow in each of these rotations when we go to that target state. We might not want to go exactly to it; we might also allow an error, and that error is baked into this form here. In the ideal case, there will be no error; then all these coefficients will be one and we get the exact state we want. If there is an error, we might have an error vector, which might depend on W. But we can say what the overall complexity of the algorithm is. It's exactly given by one over the norm of this epsilon vector. That's how many queries it will take to convert the one state into the other state. We can give an interpretation of this vector; some people can maybe get intuition from it. Take the spectrum and fill it to some level with water, where the water level is given by the overall error you're willing to accept. Then you get a vector, and that vector will be the epsilon vector, the vector you're trying to get here. And we can show that for a given target probability, that's the best the algorithm can do. We can characterize the best algorithm using a semidefinite program that we can write down.
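The water-level picture can be made concrete with a small sketch. The clipping rule `eps_k = min(pi_k, w*sigma_k)` is my own rendering of the intuition, not the exact construction from the paper, and the vectors are invented:

```python
import numpy as np

def water_fill(pi, sigma, w):
    """Clip the scaled target w*sigma at the available amplitude pi."""
    eps = np.minimum(pi, w * sigma)
    # Fidelity of the renormalized accepted state with the target:
    fid = float(eps @ sigma) / (np.linalg.norm(eps) * np.linalg.norm(sigma))
    cost = 1.0 / np.linalg.norm(eps)   # ~ copies needed, up to amplification
    return fid, cost

pi = np.array([0.95, 0.2, 0.2, 0.1]); pi /= np.linalg.norm(pi)
sigma = np.ones(4) / 2.0               # flat target, skewed source

for w in (0.1, 0.5, 2.0):
    print(w, water_fill(pi, sigma, w))
```

Raising the water level `w` lowers the cost but also lowers the fidelity: choosing the error schedule is a trade-off, which is what the semidefinite program is optimizing over.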
>>: So that's where the SDP comes in, picking the water level? >> Martin Roetteler: Picking the water level for that algorithm; trying to find the best algorithm, the best rotation schedule, right, because the rotations themselves are not completely defined, since we allow an error in each rotation. We want to find the best schedule for these errors, and that can be characterized as an SDP. It turns out, actually, that this whole procedure then corresponds to performing the [indiscernible] rotations and performing a single-qubit von Neumann measurement. In the modeling here, we assumed nothing: the ancilla could basically be high-dimensional, the measurement could be a POVM, and so on. But what popped out of the SDP was that one qubit is enough, and a [indiscernible] measurement is enough. You don't have to do a general POVM. That's the optimal schedule for it. [indiscernible] last year at ITCS, which is a computer science conference. And this year, we had a generalization of this, where we looked at several copies. Actually, this is just a first step, maybe, towards a new interesting result. We just looked at what happens if we don't have just one copy but several copies: how could we entangle them in a meaningful way, and what happens if we do that? So we looked at the so-called pretty good measurement, and we found out the pretty good measurement in that case has a nice structure. It just corresponds to this circuit. That's kind of the single-qubit circuit, and then in order to do the PGM, you just need to do a bunch of CNOTs in that case. And then what you get is a state that looks like this. It has the information about the shift. It's entangled with a big register, and a priori we cannot say much about it.
But it turns out that this kind of performs the self-convolution of the function: the more copies you take, you take the initial spectrum and you convolve it with itself. So you perform a convolution of the function with itself and then take its Fourier transform. >>: So it's a central [indiscernible]. >> Martin Roetteler: We didn't get there yet. We haven't done that analysis yet of how quickly... like, how to really relate T with the success probability. We have just one study, one small result we showed. But we have not shown yet how to really improve things, how to turn this into an efficient algorithm to find S by using many copies. We have not done that yet. And for that, we would need to do a study; it's probably not too involved. I don't think we need very advanced probabilistic methods. >>: Because [indiscernible] those things is just adding the [indiscernible]. >> Martin Roetteler: What you do is you take the spectrum, you [indiscernible] yeah, exactly. You have independent variables. You add them up. >>: You're adding them up, so they converge like [indiscernible]. >> Martin Roetteler: I think so too. >>: Or whatever. >> Martin Roetteler: I like to think of this in terms of the spectrum of the function: I just look at the spectrum and I raise the coefficients to a power, which means I drop them exponentially, I drop them geometrically. And the largest one will survive that process. It will all focus on the largest one. That's very, very fresh, very ongoing research, what we can do with that idea. But my feeling tells me we have to look at several copies if we want to use it to go from the Boolean domain here to, say, that other domain. One of the things to try would be to take several copies and see what the spectra look like. If you take several copies and look at the PGM, what will happen?
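That concentration effect is easy to see numerically. The coefficients below are invented, but the geometric drop-off is generic: raising a spectrum to the k-th power, which is what the k-fold convolution picture gives you, focuses the mass on the largest coefficient.

```python
import numpy as np

spec = np.array([0.6, 0.5, 0.45, 0.3])   # made-up spectral magnitudes
for k in (1, 3, 10):
    p = spec**k
    p = p / p.sum()      # view the k-fold spectrum as a distribution
    print(k, np.round(p, 3))
# With k = 10, over 80% of the mass already sits on the largest coefficient.
```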
In a sense, that has already been studied in the paper by [indiscernible], Bacon and Childs. They've looked at the [indiscernible] HSP and the PGM approach, but maybe this could rediscover some of those results. Maybe it will help to improve some of the things they did, because it's a different flavor; it's a Fourier-analytic flavor. What they did is they reduced it to subset sum problems, which might be related to that idea of taking sums of several independent variables. When you actually try to uncompute, you get these values, and then you perform a computation which tries to kind of do a bin packing for a target vector. But here... I always like Fourier methods. That's very intuitive to me. >>: You started with the exponential advantage. >> Martin Roetteler: That's right, that's right. So at the end of the day, we still have an advantage to show. >>: So the gold has been found in that neighborhood. >> Martin Roetteler: Yeah, the gold is... what, 1849? It's not 1849 yet. I think I'm running out of time. I'm going to skip this part, I think. I want to leave you with one message, because we talked about hidden subgroups already. The message is that hidden subgroup problems really depend a lot on the group structure. Abelian ones are fine; they can be done efficiently. But in the non-abelian case, if the group is a [indiscernible] group, for instance, it's already not known how to solve them. The symmetric groups are not known, even if you take... like for S_N itself, it's not known how to solve it. And there are encouraging things, though. Information-theoretically, it's known that there's always enough information about the hidden subgroup if you take enough copies. That's why many people were actually rushing in that direction, or were rushing up to some point. It's known that there exists a measurement, a POVM, that takes as input several copies of that coset state and then outputs the information you're interested in.
That picture is specific to the graph isomorphism problem, or rather the very related automorphism problem, but it's a generic picture. Whatever your hidden subgroup problem is, you can take several coset states, you can extract the subgroup, and it's enough to take log |G| many copies. But nobody knows how to implement this, and we were asking the question: do we really need to do that? Do we need to entangle all these coset states and do some classical processing? Or is it maybe enough to just take a single coset state and measure it? Or is it maybe enough to do K copies, where K is maybe two or three, some small number? Maybe that's enough to extract the subgroup. But it turns out that it's not enough. Moore, Russell and Schulman have shown that for two and for three copies it's not enough, and then in the STOC paper a few years ago, we showed that you have to go to N log N of these copies if you want to do graph isomorphism. That's actually kind of depressing, because it's a very big quantum computer you need if you want to tackle graph isomorphism. It's almost excessive, right, because there might very well be an efficient classical algorithm, so maybe you don't need any quantum computer at all to solve that problem. And in practice, it seems like it works well to just look at the spectra. Here we would need a gigantic quantum computer that takes N log N of these registers, and each register has many qubits. And we don't even know how to implement the measurement. But still, there was an analysis of the Fourier approach to that problem. It's representation-theoretical: there's a certain basis in which all these states become block-diagonal at the same time, and the basis transform is the Fourier transform over that group. After you do it, your several copies of the coset states will look block-diagonal, and you can do a POVM that selects just one block.
And the question is how you further process the information in that block. We showed that actually there's no good basis at all for the case where you want to distinguish the trivial subgroup from an order-two subgroup in the symmetric group. Intuitively, what happens is: in the trivial case, the distribution is just flat, and in the order-two case, it's also pretty flat, no matter what basis you choose. So probabilistic arguments come in, and we were able to show that the distance between the probability distribution you're able to sample, no matter what basis you choose, and the uniform distribution looks like this, where K is the number of copies. And this is a very small term, so you need K to be large in order to even have a chance to solve the problem. Technically speaking, it boils down to bounding expressions like this; they're character expressions. These are the characters of the symmetric group, H is an involution, D is the degree of the representation, and we get kind of a projection. This is the geometric part of the expression, and this is the representation-[indiscernible] part, which ranges over all the characters. It can be shown that those quantities are very, very small unless K is N log N. And it's kind of a win-win analysis. We say either the [indiscernible] has a very high degree: for S_N, typically the [indiscernible] have exponential degree, they're very large, and for those, that quantity is extremely small, so we can neglect them or bound them. In the other cases, if it's low-dimensional, then that fraction here might be a significant quantity, but then we have geometric arguments that this part also has to be small. So we can show that this whole expression is very small, meaning no matter what basis you chose so carefully, you will sample something that's very close to the uniform distribution, and that will have no information about the subgroup.
So that was, roughly speaking, the argument behind it. Essentially, that stopped all the research up to now, as far as I know, in this hidden subgroup approach to graph isomorphism. There are other very exciting approaches where people try to apply more physical intuition to it, maybe like quantum walks: try to define a walk on the two graphs and see if the walks are different. I don't know very much about that. I know that there were also [indiscernible] found with those approaches, but maybe that's a better approach to tackle graph isomorphism in the long run. Okay. One way one can actually make use of this negative result is to define a cryptosystem that's secure against certain types of attacks. Namely, you can define a one-way function such that if you could break it, you would actually be able to solve graph isomorphism. Because we have that result, we know one would need a pretty large quantum computer to actually mount this attack. The one-way function here is essentially: you pick a bunch of random vectors and you multiply your input with them, a matrix multiplication with a random matrix. It's known that if you could break that, then you could also do graph isomorphism. But there's more to be explored here. It would be nice to see other cryptosystems that have this flavor, and it would be nice to see more attacks on cryptosystems. Okay. Maybe for the last two minutes, my take on D-Wave. Yeah, I guess most of you know that it's the adiabatic paradigm. One starts with a Hamiltonian for which the ground state is very well understood; it could be, for instance, all spins up. And then your final Hamiltonian encodes a problem, a problem that you want to solve. You know that the ground state of that Hamiltonian encodes the solution of some hard problem. And now the idea is that if you vary this slowly enough, it will track the ground state and you end up here.
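The interpolation just described can be sketched for a tiny toy instance. The Hamiltonians below are hand-picked two-qubit examples of mine, nothing D-Wave-specific; the point is just to see the spectral gap along one interpolation path.

```python
import numpy as np

# H(s) = (1-s)*H0 + s*H1: start from a transverse field H0 (known ground
# state), end at a diagonal "problem" Hamiltonian H1 whose ground state
# encodes the answer (here, the basis state |11>).
X = np.array([[0., 1.], [1., 0.]])
I = np.eye(2)
H0 = -(np.kron(X, I) + np.kron(I, X))   # transverse field on 2 qubits
H1 = np.diag([3., 1., 2., 0.])          # problem Hamiltonian, ground state |11>

def gap(s):
    ev = np.linalg.eigvalsh((1 - s) * H0 + s * H1)  # sorted ascending
    return ev[1] - ev[0]

gaps = [gap(s) for s in np.linspace(0, 1, 101)]
print(min(gaps))   # minimum spectral gap along this interpolation path
```

The adiabatic run time scales with the inverse of this minimum gap (squared, in the standard condition), which is why the gap-closing arguments discussed below matter so much.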
You can sample, and you find a ground state. One path to do that would be, for instance, this one, but it is by no means the only one; there might be many paths that go from the initial Hamiltonian to the final Hamiltonian. So in our group, actually, when all this D-Wave stuff came out, I asked the members of our group: let's look at this, because we cannot ignore it. It might be very unlikely that what is advertised here is indeed working like this. But there might be a probability of maybe ten to the minus five or four, or whatever, that this works. And if that's the case, the impact would be huge. So we cannot completely ignore this whole approach, because if it works, the impact is huge. Even though, in my point of view at least, this statement that it tracks the ground state is, I think, not what happens in that system. There's probably much more noise happening, so it's not just tracking the ground state. So I asked the members. At that point, we had Boris Altshuler, who is a condensed matter physicist, a postdoc, Hari Krovi, and Jérémie Roland, who was a research staff member at that point. They looked at it. They assumed: yeah, indeed, let's assume that the computer does this time evolution. But then the question they asked was: will it work for solving 3-SAT? Around that time, that was what was advertised, that this computer could help you solve 3-SAT. So they looked at random instances of 3-SAT, which is typically what you meet in practice. If you want to solve a practical problem, a machine learning problem, you deal with a random instance. If you do software [indiscernible], you do some bounded model checking, you map to 3-SAT instances, and they look very random. And there are very good solvers for that around; 10, 20 thousand variables are not a big deal for these solvers.
So they could take on even much, much more. So the question was: what can we say about the performance of this algorithm in that very special case? They used perturbation theory and found that at the very end of this time evolution, right when your T becomes one, what happens is that you will have a level crossing and you will not find a solution of your problem. At that point, I didn't understand why that didn't end the whole discussion about what a D-Wave computer is good for, because for random instances of NP-complete problems, like 3-SAT, the computer is not useful. >>: [indiscernible]. >> Martin Roetteler: Exact cover is very closely related. It's true, it's a different problem, but 3-SAT and exact cover are both very local in the sense that all the clauses involve very few variables. But then something very exciting happened. The whole theory of NP-completeness, while it's closed under reductions, that's really not the end of the story, because reductions don't preserve randomness well. Some NP-complete problems might not have a concept of a random instance. And that's what happened immediately afterwards: we heard from D-Wave, and I have a lot of respect for these people. I spoke with several of their scientists, and we invited them to workshops. The problem they were looking at was: instead of 3-SAT, let's look at clique. Clique finding, same as maximum independent set, is also NP-complete. There's a notion of randomness for that problem, but it's very different, right? And it's not closed under the reductions. If you take a random 3-SAT instance and you reduce it to clique, you will not get the random instance that you would intuitively expect, right? There's this [indiscernible] model of randomness for graphs: you just pick edges at random according to some probability, and then you might ask what's the largest clique in there, right?
And that's what they argued might be a better problem for them to tackle than 3-SAT. And they admitted that the computer is not good for 3-SAT. And I think that this finding, to be quite honest, still relies on some assumption. It relies on the assumption that by introducing this randomness in the instance, you get an effect called localization, which actually forces the wave function to be very local, and that happens at the very end of the time evolution. This is just an assumption. So in this paper, they assume that if you have the graph, which is like the hypercube, you will have this Anderson localization effect; it's not a proven statement. It's just an intuition coming from physicists, but they believe this is what happens for disordered systems, as far as I know. I have to say I'm not an expert in this, but the people I spoke with say yes, this is very credible evidence that the algorithm would fail for this. But now, if you look at clique, it boils down to the size of the clique that you're looking for. If you just take a random graph, the largest clique you can expect is actually known to be very, very small, just something of the order of log N. If you take a random graph on N vertices, pick each edge with probability P, do that for all edges, you get a graph; you look at the largest clique and you try to find it. Then you get something like this for the binary case where P is one half. This is the expected clique size. Okay. Now, a classical algorithm can find cliques up to that size in quasi-polynomial time, for instance by just doing some sort of brute-force search over all the clique possibilities. You can do this. So this is my personal belief: I believe that this D-Wave computer might be good at finding cliques of up to this size or maybe even larger, maybe a small constant times this.
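The numbers here can be checked with a tiny script. In the Erdős–Rényi model G(n, 1/2), the largest clique concentrates around 2·log2(n), and brute force over all candidate subsets up to that size costs roughly n^(2 log n) work, which is quasi-polynomial. A toy illustration, not a benchmark:

```python
import itertools
import math
import random

def has_clique(adj, nodes):
    return all(adj[u][v] for u, v in itertools.combinations(nodes, 2))

def largest_clique(adj, n, kmax):
    # Brute force: try sizes from kmax down; the first hit is the
    # largest clique of size at most kmax.
    for k in range(kmax, 1, -1):
        if any(has_clique(adj, c) for c in itertools.combinations(range(n), k)):
            return k
    return 1

random.seed(7)
n = 16
adj = [[False] * n for _ in range(n)]
for u in range(n):
    for v in range(u + 1, n):
        adj[u][v] = adj[v][u] = random.random() < 0.5  # G(n, 1/2)

print(2 * math.log2(n))           # asymptotic prediction: 8.0
print(largest_clique(adj, n, 8))  # actual clique number sits a bit below that
```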
But the arguments that were put forward in that paper about random 3-SAT, I think, can be applied to the problem of planted cliques. So you generate a graph at random and you plant in a huge clique, say a clique of size square root of N, right, and then you forget where it was, and later you wake up and you want to know: hey, where was that clique? That problem of the planted clique should be handled by the exact same arguments that these guys used for 3-SAT. So what I'm trying to say is, it all boils down to the size of the clique. Very large, and I think these things can be done; very small, and, like I called it, the tiny clique problem, arguably one could do it with a D-Wave computer. And why is that interesting? I think it would be extremely interesting. If one could show that using a D-Wave computer one can find cliques of that size, or even two times that size, one would have a super-polynomial separation from the classical algorithm, right, which runs in time, say, N to the log N. Because N to the log N is not a polynomial. It would establish a... >>: Barely not a polynomial. >> Martin Roetteler: It's just barely not a polynomial. Some people call that quasi-polynomial. But if that could be shown, it would immediately show that a D-Wave computer, no matter what it does, does something you cannot do in polynomial time classically. >>: That is, the D-Wave computer is not the canonical [indiscernible]. >> Martin Roetteler: Yes. >>: It has, in fact, some artifacts of all sorts of... >> Martin Roetteler: Yes. >>: So let's think of this as an adiabatic quantum computer. [indiscernible] it's also non-adiabatic to a [indiscernible]. >> Martin Roetteler: Yeah, yeah. >>: But it isn't that anyway. It is something else. >> Martin Roetteler: It is something. The question is, what is it good for? Can we have some evidence that this is good for something?
And I'm interested in finding any such evidence, because this device is there, and now the question is: is it good for something? So this problem with the tiny cliques... I have no idea if it can actually be solved, because it would involve actually finding a schedule for that machine, running many experiments, and having evidence that it's faster than the classical algorithm. But my feeling tells me that might be possible. One idea would be, for instance, to just take the regular schedule and run it to a point where the wave function is still extended across all the solutions, and then you don't keep going, because you might get localized. Maybe just sample then. Maybe then the distribution over the actual cliques is very local and you will get a clique. I don't know; it's just an intuition. >>: [indiscernible] the gap closing, which means the probability of getting the right answer is... >> Martin Roetteler: You want to stay away from there. You would want to stay away from there. >>: It's past that point that you get useful information, though. >> Martin Roetteler: The thing is to find that sweet spot where you have not passed the point yet. You're close to the point, your wave function is still extended, and then you want to sample. >>: I have another quibble with this [indiscernible] paper. >> Martin Roetteler: Yes. >>: The most optimistic [indiscernible] localization is perhaps that the gap would go to zero, maybe exponentially in some power of the system size. So it may still be that Altshuler and company could be right, but that you might go faster than the... >> Martin Roetteler: They had some optimistic assumptions about how fast it closes. Like, the exponent, maybe they assumed it's one or so. >>: Whatever the exponent is, e to the L or e to the square... >> Martin Roetteler: Exactly. And there was some criticism from [indiscernible], some people at NASA.
They criticized that they assumed the gap would close too fast. I thought that Boris had a response to this, but I don't remember what the response was; how he drew the intuition that it closes so fast on the hypercube, I don't know. But what I can tell you is that [indiscernible] gave a talk very recently at Princeton about this, and he had a plot of the bimodal distribution that everybody has seen, but he also had some plots about what happens when you want to find the ground state for some [indiscernible] models and sometimes you don't find it. If you don't find it, what is the actual Hamming distance of the point you find, in terms of how many bit flips you are away? And there was a huge chunk of the distribution which was actually N bit flips away, which is what Altshuler and company predicted: that you would actually be very far away from these solutions. For me, that's an argument that the intuition they had in the paper is maybe on the money, that there is localization. Well, the analytic argument is that it happens at the very end, but it might happen even before. >>: Well, I mean, some of the numerical studies of what the gap was [indiscernible]. >> Martin Roetteler: Right, yeah. >>: All depends on the problem also. >>: Yeah, the [indiscernible] paper on it also had... you could also have a polynomial closing of the gap, but multiple gaps, so your probability of jumping the gap goes up exponentially. >> Martin Roetteler: That's right. It's very confusing. I find it very confusing. For instance, there's also the case where you just take clauses that are all linear, right? That's clearly solvable on a classical machine with Gaussian elimination, but even there the machine seems to fail, right? If you just take linear XORs or so, and you guys will know more about this, I'm sure. But my feeling tells me that machine might be interesting for some things.
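The "all clauses linear" point is worth spelling out: XOR-SAT is a linear system modulo 2, so Gaussian elimination solves it classically in polynomial time. A minimal sketch, with an instance invented for illustration:

```python
import numpy as np

# Clauses: x0^x1 = 1, x1^x2 = 0, x0^x2 = 1  -- a linear system over GF(2).
A = np.array([[1, 1, 0], [0, 1, 1], [1, 0, 1]], dtype=np.uint8)
b = np.array([1, 0, 1], dtype=np.uint8)

def gf2_solve(A, b):
    """Gauss-Jordan elimination mod 2; returns one solution (free vars = 0)."""
    M = np.column_stack([A, b]).astype(np.uint8)
    n, m = A.shape
    row, pivots = 0, []
    for col in range(m):
        piv = next((r for r in range(row, n) if M[r, col]), None)
        if piv is None:
            continue
        M[[row, piv]] = M[[piv, row]]         # swap the pivot row into place
        for r in range(n):
            if r != row and M[r, col]:
                M[r] ^= M[row]                # XOR = addition mod 2
        pivots.append(col)
        row += 1
    if any(M[r, m] for r in range(row, n)):
        return None                           # 0 = 1 row: unsatisfiable
    x = np.zeros(m, dtype=np.uint8)
    for i, col in enumerate(pivots):
        x[col] = M[i, m]
    return x

x = gf2_solve(A, b)
print(x, (A @ x) % 2)   # the returned assignment satisfies every clause
```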
Like maybe things related to, say, just storing information and retrieving it. I don't know if anybody has looked at that. But in the case of classical networks, like [indiscernible] networks, it's known, you know, maybe not exactly known, but the capacity of these networks has been narrowed down to a very small region, like between 0.138 and 0.15 times N, if N is the number of spins. >>: Much better [indiscernible], but I can do it in a circuit model better than... [indiscernible] actually does that and gives you beautiful superpositions of storage states, like 50 percent storage, but then to get everything out, Grover. Like, you only get it out once. We're over time. You need lunch, so let's do that. We'll talk after you're done. >> Martin Roetteler: Okay. But that's interesting for me to know, because I thought maybe that could be an indicator that what happens in that system is indeed quantum and useful. Like, if the capacity... I mean, we all know we cannot store more than N bits in N qubits, right, because of [indiscernible]. But maybe if the constant is just different, maybe that's an indicator that what happens is quantum; maybe it's useful for that reason. That's the most sketchy part of my presentation, but I just wanted to get it across. I don't think what happens in the developments around that machine is completely bad. Of course, there's a lot of hype and so on, a lot of press. All right. Thanks a lot. Thanks for sticking around.