>> Krysta Svore: Good afternoon. Martin Suchara is a Postdoctoral
Scholar at UC Berkeley. His research interests are in quantum
computation, including quantum error correction and quantum algorithms.
He has been working on the development of new quantum error correcting
codes that have a high error correction threshold and can be
efficiently decoded. There's a long bio. He's done a lot of good stuff,
and we're thrilled to have him here. Martin.
>> Martin Suchara: Thank you for the introduction. Today I am going to
talk about quantifying the resources needed to do real-world computations
on a quantum computer using the topological and concatenated error
correcting codes.
Quantum error correction is, of course, a very important problem. You
cannot build a quantum computer without it. And it is also a challenging
problem, because it is different from classical error correction, where
with classical information you have just bits, zeros and ones, and the
only error that can occur is a bit flip. But quantum information is
continuous, and we have a whole range of possible errors: not only bit
flips but also phase flips and phase shifts that we need to correct.
The two main families of quantum error correcting codes that can
address these errors are the concatenated codes, which were developed
starting with the work of Peter Shor in 1995, and the topological error
correcting codes, which started with the work of Alexei Kitaev in 1997.
Both of these code families have some advantages and disadvantages, and
these are summarized in this table. The topological codes have a much
higher error correction threshold, which means they can tolerate much
higher error rates. And also, computation with topological codes can be
done using local operations. For concatenated codes, if we, for example,
wanted to do a two-qubit controlled-NOT operation, we typically have to
swap qubits before we can perform a local controlled-NOT operation. But
for topological codes we can use braiding, which can be done purely with
local operations, and no swapping is needed.
But with topological codes we have to solve a difficult problem to decode
the errors. Typically a minimum weight matching problem is solved to guess
which error occurred given the syndrome information. For concatenated
codes the decoding is much simpler. So the classical controller that
controls these error correcting codes is a bit more involved for the
topological codes.
Also, it is not clear which code is better in terms of the number of
qubits, the number of gates and, overall, the resources. And this is the
focus of this talk. I'm going to address the issue of quantifying the
resources needed to do error correction with these two code families.
And I will also look at some ideas I have about simplifying the
decoding for the topological codes.
So this is the structure of the talk. First, just some background about
concatenated and topological codes. Then I will lay out the methodology
that I used to quantify the resources needed to do error correction
with the two code families. And finally I will speak about building a
faster decoder for decoding errors in topological error correction.
The stabilizer formalism is very useful for characterizing error
correcting codes. The stabilizers are basically the syndromes we need to
measure to diagnose errors. They are elements of the stabilizer group,
which is generated by a set of stabilizer generators, and the stabilizers
act trivially on the codespace. So if we have a state, psi, which is in
the codespace, then a stabilizer doesn't change the state, which means
we can safely measure these syndromes without affecting the encoded
information. And we can learn about errors from these syndrome
measurements.
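As an illustration (my own, not from the talk), here is a minimal numerical sketch of that property using the three-qubit bit-flip repetition code: its Z-type stabilizers leave any encoded state unchanged, while a bit flip anticommutes with one of them and flips the corresponding syndrome.
```python
# Check that stabilizers act trivially on the codespace of the 3-qubit repetition code.
import numpy as np

I = np.eye(2)
Z = np.diag([1.0, -1.0])
X = np.array([[0.0, 1.0], [1.0, 0.0]])

def kron(*ops):
    out = np.array([[1.0]])
    for op in ops:
        out = np.kron(out, op)
    return out

ket0 = np.zeros(8); ket0[0] = 1.0        # |000>
ket1 = np.zeros(8); ket1[7] = 1.0        # |111>
psi = (ket0 + ket1) / np.sqrt(2)         # an arbitrary encoded state

for S in (kron(Z, Z, I), kron(I, Z, Z)):  # the two stabilizer generators Z1Z2, Z2Z3
    assert np.allclose(S @ psi, psi)      # S|psi> = |psi>: measuring S is safe

err = kron(X, I, I) @ psi                 # a bit flip on the first qubit
print(np.vdot(err, kron(Z, Z, I) @ err))  # -1.0: the Z1Z2 syndrome flags the error
```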
Concatenated codes. Perhaps the simplest example of a concatenated code
is the Bacon-Shor code, which is the quantum analog of the classical
repetition code where we encode each bit three times and then we use a
majority vote to decide whether we encoded zero or one. To encode the
logical state zero in the Bacon-Shor code, we need to protect not only
against bit flips but also against phase flips. So to encode zero we
initially use three plus states, and to encode one we use three minus
states. And then, to protect these pluses and minuses against bit flips,
we use the repetition code, so zero would become triple zero and one
would become triple one.
As the name suggests, concatenated codes have a concatenated structure:
by repeatedly using smaller building blocks we can build the code
from the ground up. And up to a certain limit this will improve the error
correcting properties of the code. Decoding of errors with concatenated
codes is very simple. In the case of the Bacon-Shor code it's basically
just a simple majority vote on the stabilizers.
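To make concrete how simple that classical decoding step is, here is a small sketch (my own illustration, not the talk's tool) of decoding one three-qubit repetition block from its two parity checks, which amounts to a majority vote.
```python
# Decode one three-qubit repetition block from its two parity checks (a majority vote).
def decode_repetition(syndrome):
    """syndrome = (s12, s23): parities of qubit pairs (1,2) and (2,3).
    Returns the index of the qubit to flip, or None if no correction is needed."""
    lookup = {
        (0, 0): None,  # no error detected
        (1, 0): 0,     # flip qubit 1
        (1, 1): 1,     # flip qubit 2
        (0, 1): 2,     # flip qubit 3
    }
    return lookup[syndrome]

bits = [1, 0, 0]                              # encoded 0 with a bit flip on the first qubit
s = (bits[0] ^ bits[1], bits[1] ^ bits[2])    # the two syndrome bits
flip = decode_repetition(s)
if flip is not None:
    bits[flip] ^= 1
print(bits)                                   # [0, 0, 0]: the error is corrected
```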
What is often cited as an advantage of these concatenated codes is the
transversal nature of many of the gates. So what does this mean? If a
gate or operation is transversal in the code, then we simply need to
apply the gate n times, once for each of the n building blocks in the
code. So, for example, the controlled-NOT operation is transversal. We
have these two blocks of nine qubits each, which represent two logical
qubits, and to perform the controlled-NOT operation we need to do CNOTs
between the corresponding pairs of qubits.
But actually, from the resource estimation perspective, even though this
gate is transversal it is fairly expensive, because we cannot generally
do controlled-NOT operations on a physical quantum computer between any
pair of qubits. They have to be located at the same location; they have
to be neighboring qubits. So that means we have to use SWAP operations
to move the qubits around, and this is one reason why using
concatenated codes is expensive.
And then, of course, we also have gates that are not transversal. It is
possible to show that we need at least some non-transversal gates to do
universal quantum computation with the concatenated codes. So here is an
example of the T gate, which is non-transversal in the Bacon-Shor code.
The gate uses an ancillary state that needs to be distilled. This is the
distillation circuit, which takes 15 copies of the ancillary state T plus
and distills a single copy of the ancillary state with higher precision.
And this distillation process needs to be repeated a sufficient number of
times to achieve the target fidelity of the ancillary state. And then the
ancillary state is used in this circuit to apply the T gate. So the
original state psi was here, here was our ancilla, and after this circuit
is applied, the T gate is applied to the state psi, which appears here.
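To give a feel for how many repetitions that takes, here is a rough sketch assuming the standard leading-order error suppression of the 15-to-1 protocol, p_out ≈ 35·p_in³; the exact constants used in the actual resource estimates may differ.
```python
# Rough count of 15-to-1 magic state distillation rounds (leading-order approximation).
def distillation_rounds(p_in, p_target):
    """Rounds needed to push the ancilla error below p_target, and the raw states consumed."""
    rounds, p, raw_states = 0, p_in, 1
    while p > p_target:
        p = 35 * p ** 3        # output error after one round
        raw_states *= 15       # each output consumes 15 states from the previous level
        rounds += 1
    return rounds, p, raw_states

print(distillation_rounds(p_in=1e-3, p_target=1e-12))
# (2, ~1.5e-21, 225): two rounds suffice, consuming 225 raw T states per output state
```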
Topological quantum error correcting codes have a very different
structure. They consist of qubits that live on a regular grid. In this
picture, this is the surface code of Alexei Kitaev [inaudible]. The
qubits live at the edges of the grid, and the stabilizers are shown in
this picture -- we have two types of stabilizers, the stars here and the
plaquettes. And by measuring these stabilizers we can extract error
syndromes, which give us information about the errors that occurred.
There are other examples of topological quantum error correcting codes.
This is the 5-square code that I developed at IBM. Each code has slightly
different properties. One of the key advantages of this code is that in
order to reconstruct the syndromes we only need to do two-qubit
measurements. So the red objects in this figure are the stabilizers, but
they decompose into smaller gauges. And by measuring gauge operators of
weight two we can reconstruct the syndrome information. And this can be
beneficial in situations where weight-four entangling measurements are
too expensive to perform on certain quantum technologies.
In addition to locality, another advantage of the topological codes is
the way they perform computation. Computation can be done by braiding;
controlled-NOT operations can be done by braiding. So how does this
work exactly? Well the Pauli operators logical X and logical Z are
strings in the lattice. So here is the basic lattice which doesn't
contain any defects which means that the syndrome measurements are
enforced in the entire surface of the lattice. And this lattice in this
state encodes two logical qubits. And this picture shows the logical X
and Z operators corresponding to these two logical qubits. Now let's
say we want to encode more qubits and we want to perform controlled-NOT
operations in this system, so what do we have to do? Well, it turns out
that if we puncture a hole in the surface, we will increase the number
of logical qubits that are encoded.
So first of all what do I mean by puncturing a hole? Do we have to
change the physical layout of the system? And the answer is no. To
puncture a hole, we simply stop enforcing the measurements of the
stabilizers in the region where the hole appears. So this is the hole,
and we don't measure any of the stabilizers in that region. And if we
introduce two such holes, a pair of holes that will represent one
logical qubit. And this is easy to see because the logical X and Z
operators will be strings that connect the pairs of defects and strings
that loop around the holes. Now, which string is X and which string is Z
depends on the exact location of the hole in the lattice, because there
are two kinds of holes: rough holes and smooth holes. But regardless,
to perform controlled-NOT operations in this system we simply take one
of the holes, which corresponds to the control qubit of the
controlled-NOT operation, we move the hole to the target qubit, braid it
around the hole of the target, and move it back. Now, this works if the
control is a smooth hole and the target is a rough hole. And there is a
simple conversion circuit that can address the problem when we have both
control and target qubits represented by pairs of smooth holes.
So this is very convenient, because this operation can be done simply by
changing the location of the syndrome measurements, so there is no
physical movement of information needed to do the controlled-NOT
operations. And how do we decode errors that occur in these topological
codes? Well, when we do our syndrome measurements we will detect
locations with non-trivial syndromes, which are shown in red in this
figure. So these four syndromes were measured as non-trivial syndromes.
And it is easy to see that these syndromes occur at the end points of
strings of errors. So if we have a string of X errors, then at the end
points of the string of errors there are going to be non-trivial
syndromes. The same goes for Z errors: there is this star syndrome here
and here which detects that there is a string of errors consisting of a
single Z error here.
So once we measure our syndromes, we can use minimum weight perfect
matching to guess how these syndromes should be connected. We can use
Edmonds' Blossom Algorithm, for example, to do this. And then, once we
connect the pairs of matching syndromes, we will correct -- so this is a
plaquette syndrome, and this is also a plaquette syndrome, so we know
that if these syndromes are matched we need to do X corrections, bit
flips, on a string connecting the syndromes. So the red X's represent the
correction that is being done, and the black X marks show the actual
error that occurred. So in this case, after we apply the error
correction, we end up applying this loop operator consisting of bit flips
on this loop here. But it turns out this is fine, because this loop is in
the stabilizer group, so it doesn't affect the encoded state of the
system.
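Here is a minimal sketch of that matching step on a toy set of flagged plaquettes, using the general-purpose matching routine in networkx with negated distances; a production decoder would use a tuned Blossom implementation, and the coordinates below are made up purely for illustration.
```python
# Pair up non-trivial syndromes by minimum weight perfect matching.
import itertools
import networkx as nx

syndromes = [(0, 1), (0, 2), (3, 4), (5, 4)]   # hypothetical flagged plaquettes (row, col)

G = nx.Graph()
for (i, u), (j, v) in itertools.combinations(enumerate(syndromes), 2):
    dist = abs(u[0] - v[0]) + abs(u[1] - v[1])  # Manhattan distance on the lattice
    # max_weight_matching maximizes, so negate distances to get minimum weight;
    # maxcardinality=True forces every syndrome to be paired.
    G.add_edge(i, j, weight=-dist)

matching = nx.max_weight_matching(G, maxcardinality=True)
print([(syndromes[a], syndromes[b]) for a, b in matching])
# Each matched pair is then connected by a correction string of bit flips.
```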
Now, in reality this matching problem is actually a 3-dimensional
problem, because our syndrome measurement itself will be faulty. We need
to use a quantum circuit to measure the syndromes, and that's going to be
prone to errors. So some of the syndromes are going to show as
non-trivial even though no error occurred, and vice versa. To address
this we can basically use an analog of the 2-D problem by introducing a
third dimension. Here, in the third dimension, each of these lines
represents a single syndrome that is measured over and over again. And
we will mark a red point if the syndrome measurement outcome is different
from the measurement outcome in the previous round. This way we obtain a
set of points in three dimensions that we can match. And now, depending
on the error rate of the measurement versus the error rate in the memory,
we can adjust the weights of the edges in the temporal dimension compared
to the weights of the edges in the spatial dimension. And if we solve the
minimum weight matching problem, then with high probability we will
correct the errors that occurred.
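A common way to set those relative weights, assumed here for illustration rather than taken from the talk, is to weight each edge by the negative log of the probability of the error it represents:
```python
# Relative weighting of space-like and time-like edges in the 3-D matching graph.
import math

p_data = 1e-3   # probability of a data (memory/gate) error per round
p_meas = 5e-3   # probability that a syndrome measurement itself is wrong

w_space = -math.log(p_data)   # edge between neighboring syndromes in the same round
w_time  = -math.log(p_meas)   # edge between the same syndrome in consecutive rounds
print(w_space / w_time)       # ~1.3: rarer data errors make space-like edges "longer"
```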
This is a classical problem that the classical controller has to solve.
But it turns out that if we have millions of qubits and we need to repeat
the syndrome measurements a certain number of times, this is going to be
a large problem that is going to be expensive on a classical computer.
The surface code, of course, exhibits threshold behavior, so there is a
certain threshold. If the error rate is below the threshold, we can
correct the errors that occur at this rate perfectly well as long as the
code distance is big enough. So as long as we have enough physical qubits
encoding the information, we can correct the errors. If we exceed the
threshold, we cannot recover the information.
So I did a simple simulation that estimates the threshold for various
topological codes. I used C++, and I did a Monte Carlo simulation. I
inject a random error into the system according to the probability model
that is described here, and then I try to decode the error and correct
it. And I record the frequency with which I can correct the errors at the
given probability level after a certain number of Monte Carlo
repetitions. And this is the result that comes out of the simulation
tool. Here I varied the error rate with which I injected errors into the
system, and on the Y axis is the percentage of the time that I can
successfully recover from the error. You can see that as I increase the
distance of the code, as I increase the number of physical qubits that
encode the information, the curve in this picture gets sharper and
sharper. Which means that if I am just below the threshold, the
percentage of failures is going to be very small, so almost always I will
be able to recover. And if I'm above the threshold, then the percentage
of failures is large, so I cannot recover. But if I choose...
>> : When you say the percentage of failures, you mean the ones that
were not recovered?
>> Martin Suchara: Percentage of failures is the number of Monte Carlo
iterations that resulted in failure so I could not recover the original
encoded state.
And we can see that we need to choose a code of sufficient distance.
Let's say we want at most a 5% probability of failure and our physical
error rate is 3%; well, then we need to choose a code distance of at
least 8 to guarantee this performance.
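The simulation tool just described is written in C++; the following toy Python sketch only mirrors its outer Monte Carlo structure, with a distance-d repetition code and majority-vote decoding standing in for the real surface code decoder.
```python
# Toy Monte Carlo estimate of the logical failure rate versus code distance.
import random

def logical_failure_rate(d, p, trials=10_000):
    failures = 0
    for _ in range(trials):
        flips = sum(random.random() < p for _ in range(d))   # i.i.d. injected bit flips
        if flips > d // 2:                                    # majority vote fails
            failures += 1
    return failures / trials

for p in (0.05, 0.10, 0.20):
    print(p, [round(logical_failure_rate(d, p), 4) for d in (3, 7, 15)])
# All three rates are below this toy code's 50% threshold, so the failure
# percentage drops as the distance d grows, just as in the plotted curves.
```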
So, the key properties of topological codes: it is easy to increase and
decrease the code distance and in that way influence the reliability of
the resulting code; local operations are sufficient to do controlled-NOT
operations and, basically, computation with the codes; and they have an
error correction threshold that is significantly higher than the
threshold of concatenated codes. But the drawback is that the classical
processing for error decoding is time consuming.
So next I'm going to move to the actual resource estimation with the
topological and concatenated codes. First I'm going to give a brief
overview of the properties of the quantum technologies that we consider
as part of the quantum computer science project and which form the basis
of this resource estimation. Then I'm going to show you the methodology
that I used to quantify the resources needed to perform operations with
the error correcting codes, and the numerical results.
So this is the structure of the resource estimation task. We consider a
certain set of quantum algorithms, and for each of these algorithms we
express the number of logical qubits we need to perform the task, the
number of logical gates to do the computation, and also information
about the parallelization factor and the length of the two-qubit
operations in the system.
For the quantum technologies we consider the gate times and fidelities
and the memory error rate on qubits that sit idle in the system. And we
also consider properties of the four basic families of error correcting
codes: the Bacon-Shor, Steane, C4/C6 and surface codes. All this
information feeds into our resource estimation tool, which produces a
qubit layout and estimates the circuit delay, the gate count and the
fidelity of the entire computation.
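As a purely illustrative sketch of how these three kinds of inputs combine (this is not the actual QCS tool; every name and formula below is an assumption), a first-order estimate might look like this:
```python
# Illustrative first-order resource estimate from algorithm, technology and code inputs.
from dataclasses import dataclass

@dataclass
class Algorithm:
    logical_qubits: int
    logical_gates: float
    parallelism: float          # average number of gates executable in parallel

@dataclass
class Technology:
    gate_time_ns: float
    gate_error: float           # would drive the code distance / level choice (omitted here)

@dataclass
class Code:
    qubits_per_tile: int        # physical qubits per logical qubit
    cycle_depth: int            # physical gate steps in one error correction cycle
    cycles_per_logical_gate: int

def estimate(alg, tech, code):
    physical_qubits = alg.logical_qubits * code.qubits_per_tile
    cycle_ns = code.cycle_depth * tech.gate_time_ns
    logical_depth = alg.logical_gates / alg.parallelism
    runtime_ns = logical_depth * code.cycles_per_logical_gate * cycle_ns
    # Error correction keeps roughly every physical qubit busy on every step.
    physical_gates = (physical_qubits * code.cycle_depth
                      * code.cycles_per_logical_gate * logical_depth)
    return physical_qubits, runtime_ns, physical_gates

print(estimate(Algorithm(60, 1e12, 10),
               Technology(gate_time_ns=25, gate_error=1e-5),
               Code(qubits_per_tile=5000, cycle_depth=8, cycles_per_logical_gate=10)))
```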
Yes?
>> : You say number of logical qubits, but that number probably depends
on the input size.
>> Martin Suchara: Yes. So we will choose a specific problem of a certain
size and then we will express this. Right now we have finished phase one
of the QCS project, where the problem sizes were hard-coded. But now we
are moving to phase two and we are going to parameterize the problem
sizes, so that we can adjust a single parameter which describes the size
of the problem and that way obtain gate counts. For example, let's say we
want to run a quantum computation for one year: what size of problem can
we solve? So we can solve the inverse problem if we parameterize. And
this is what we are working on right now.
>> : Like for a given size you take the worst [inaudible] that you can
get?
>> Martin Suchara: For a given instance of the problem, are we
considering...
>> : [Inaudible] for given size of instance, that's what you said
[inaudible]. You said [inaudible] the size [inaudible].
>> Martin Suchara: The parameter would be the -- So let's say we want
to solve some graph problem, say the triangle-finding problem, and you
want to find if there is a triangle inside of a certain graph. Then the
parameter would be the number of vertices of the graph. Of course, the
more vertices, the harder the problem is going to be and the higher the
gate count is going to be.
So this is a collaborative project. I have been coordinating the work on
the resource estimation on behalf of the USC team. In this project, the
resource estimation is currently being done by four teams independently.
And I have a number of collaborators at a number of universities to whom
I owe the great work of analyzing the properties of the algorithms and
quantum technologies.
In our project we are studying four families of error correcting codes:
Bacon-Shor, Steane, C4/C6 and surface codes. We have seven algorithms,
all of which have very different qualitative properties; they use
different quantum primitives. Some of them use quantum simulation, some
of them use quantum random walks, some the [inaudible] transform. So it's
a very diverse group of algorithms. And we are also considering six
technologies and six quantum control techniques that enable us to reduce
the errors in these technologies. In this talk, rather than presenting
the cross-product of these results, which we had to obtain in the
project, I'm just going to show the highlights of our results.
So for the quantum technologies, our goal is to obtain, for a range of
realistic quantum technologies, the gate times and the gate errors for
performing physical quantum gates in the system. And in the QCS program
we studied the effects of quantum control protocols on the gate errors:
basically, which control techniques are effective at reducing the gate
errors and which are not.
Regarding our methodology, we used Monte Carlo simulations to insert
random noise and study the effect of control protocols on this noise.
We used optimization tools to optimize the choice of control parameters.
And we also used gate constructions, because not all quantum
technologies support all elementary quantum gates. And even if they do,
sometimes a smart decomposition of a gate into a different set of gates
can result in a smaller error. So this is one technique we used as well,
and it was actually very helpful in reducing some of the errors for some
technologies.
The two of our favorite technologies are superconducting qubits and ion
traps. Superconducting qubits have errors that are Markovian in nature
and, therefore, it is difficult to reduce these errors by using control
techniques. The error of a gate is basically proportional to the duration
of the gate, and if we use sophisticated control techniques that increase
the duration of the gate, that will actually lead to an increase of the
error. So primitive control is the most successful with superconducting
qubits. And the errors would be roughly ten to the minus five for a
quantum gate on average.
Ion traps have extremely low error rates except for the measurement
error. But we were able to use a circuit decomposition that uses three
measurements to bring the error of the measurement down to roughly ten
to the minus nine, which is in line with the low errors of the other
gates for ion traps. So these are our two favorite technologies, because
they have low errors and they can be used both with the topological
codes and with the concatenated codes. Neutral atoms also work -- oh
well, sort of work with topological codes, but the errors there are
about ten to the minus three, so that's just below the threshold for
topological codes, and they cannot be used with concatenated codes.
And for quantum dots and photonics, our results currently do not allow
us to use these with any of the existing quantum error correcting codes.
So here is an overview of the numbers that feed into my resource
estimation tool. I just selected some of the technologies --
superconducting qubits, ion traps and neutral atoms -- and I'm showing
the average gate time for the average operation. Our resource estimation
tool actually takes the gate times for all the individual gates, so the
information that feeds into the tool is more detailed than what this
table shows. So this is the average gate time. Then we have the gate
error for each gate operation. And then we have the memory error rate
per unit of time, per nanosecond. The key observation here is that ion
traps have the lowest error rates by far, but the gate times of ion traps
are not very short. They are roughly 3 orders of magnitude slower than
the gate times for superconductors. So superconductors are faster, but
they are a little bit more error prone; the errors would be ten to the
minus five. So it is certainly interesting to compare these two
technologies, because just by looking at this table it's not clear which
one is better.
And then we have neutral atoms which they are both slow and error
prone, so they are certainly not going to be [inaudible] among these.
>> : [Inaudible] the memory error rate is zero.
>> Martin Suchara: That is right, but I'm not convinced -- this is
unphysical. I think this is an artifact of our model.
So to do the estimation I picked three quantum algorithms. I picked
algorithms that will give us reasonable running times. In our project we
are also considering algorithms that have just quadratic speedups, and
these result in huge gate counts. We are considering running these
algorithms on problem instances that are very ambitious; again, this
leads to big gate counts. So here I'm trying to handpick the algorithms
and problem instances that will result in numbers that are
human-readable. So if we actually built the quantum computer, we would be
able to see the result of the computation in a couple of months.
So one such problem is estimating the ground state of a molecule. I
picked a molecule that is not too difficult to analyze: this glycine
molecule, a small organic molecule, which only requires 50 basis
functions to describe the ground state. And I decided to calculate the
ground state in a fairly crude way, with only five bits of accuracy. So
this is the first problem.
>> : Which basis was that? Which of the quantum [inaudible] basis did
you use?
>> Martin Suchara: I'm not sure about this.
>> : STO3G? 4G? P321? Don't remember?
>> Martin Suchara: I'm not sure.
>> : Okay. How many qubits did it require? [Inaudible].
>> Martin Suchara: Fifty qubits for the basis and then ten qubits
additionally. The second problem is the binary welded tree algorithm.
The problem formulation is as follows: we take two binary trees and weld
them together in the middle. So here the top part is one binary tree,
and here at the bottom is the second tree. The two root vertices are
special. One of them is marked as the start vertex and the other root is
the finish vertex. And the goal is to start at the start vertex and find
the finish vertex by relying on an oracle. The oracle will give us
information about the graph: if we supply the oracle with a label, a
name of a vertex in the tree, it will give us the names of the
neighboring vertices. Each vertex actually has an exponential number of
labels assigned to it, and the oracle will return just one of the labels.
And once we query the oracle at a certain vertex, we have to decide which
of those three neighbors we will move to next. And this way we have to
find the finish vertex. Using classical computation it was shown that it
is impossible to find the finish vertex in sub-exponential time. But
there is a [inaudible] quantum algorithm that uses the continuous-time
quantum random walk. And we chose this problem for tree depth 300, which
is an instance size that would be very difficult to do on a classical
computer.
>> : [Inaudible] my question.
>> Martin Suchara: Yes?
>> : So tree depth is n equals 300, tree depth?
>> Martin Suchara: This is depth. So the...
>> : So there's a lot of nodes on the tree?
>> Martin Suchara: The number of nodes is going to be huge. There's
going to be like -- Right. Exponential.
>> : Binary tree two to the...
>> : [Inaudible].
>> : Well, but it gets bigger and gets smaller so it's half.
>> : These are perfect trees?
>> Martin Suchara: Yes, well, they are not exactly perfect; they are
just almost perfect. I think we can assume that they are perfect for
this purpose. They are not perfect in the middle, because it turns out
there is this strange feature of the problem: if we have exact binary
trees and we weld them in the middle, we will have vertices of degree
two in the middle. And then there exists a classical algorithm that can
exploit this information. It knows, basically, when it hits the middle,
and it becomes easy to guess at which point in the path we made a wrong
turn. So they are scrambled a little bit, but they are basically binary
trees.
>> : So since you scramble it, it will now be one of the remaining
possible trees.
>> : [Inaudible] pass.
>> : [Inaudible] you mean the center is a random [inaudible]?
>> Martin Suchara: Right.
>> : Yes.
>> : Yes.
>> : So now -- So in addition to number of nodes there is a [inaudible]
particular tree that you used, pair of trees?
>> : So what I did the first time around, how [inaudible] number which
depends only upon the size of the problem and it also depends on
[inaudible]...
>> : Permutation of the joining, of the [inaudible]...
>> : Right.
>> : Is this problem available in the public literature?
>> Martin Suchara: It is. There are at least two papers I'm aware of
that studied this problem....
>> : [Inaudible] look it up.
>> : Yeah....
>> : [Inaudible] binary welded tree benchmark or something.
>> : Oh, okay.
>> : And find out enough detail to --.
>> Martin Suchara: And then the third algorithm I studied is the
triangle-finding problem. Again, it's a graph problem where the graph
has n vertices, and our task is to decide if there is a triangle in this
graph. The instance of the problem is set up to be this [inaudible]
instance where the graph is dense but there are very few triangles.
Actually, there is going to be either one triangle or no triangle at
all. And the task of the algorithm is to decide whether a given instance
of the graph has one triangle or no triangle.
So the triangle here is in the middle, and then we have these
components, each of which has n over six nodes. And these edges mean
that within the individual components all the pairs of vertices are
connected by edges. So there is no triangle in this region, but there is
a single triangle in here. We considered a graph with two to the
fifteenth nodes.
>> : So what's the challenge here? What's the classical way of solving
it if there is one?
>> Martin Suchara: Well, I guess the classical way of solving it -- oh
well, there is an oracle here also that tells you how the structure of
the graph looks. So you have to query it: you give it the name of a
vertex and it tells you which are the neighbors. And then you have to
put this information together...
>> : What size graph has been done classically, I think it's better to
ask?
>> Martin Suchara: I believe that for this problem 32,000 nodes is
actually pushing the boundary, because this is the original parameter
given by [inaudible] in phase one, and we tried -- or they tried -- to
come up with parameters that are pushing the boundary. Yes.
>> : [Inaudible] looking for like quadratic [inaudible] speedup here?
Or what is the [inaudible]?
>> Martin Suchara: I believe the quantum algorithm here gives
exponential speedup too.
>> : Exponential speedup. Probably?
>> Martin Suchara: I'm not sure about provably. I know that for welded
tree it's provable. For the triangle problem I am not willing to bet.
>> : It's a binomial number of triples of nodes. You can go through all
triples of nodes and just exhaust [inaudible].
>> Martin Suchara: Right, but I believe the oracle will -- All right.
[ Silence ]
>> Martin Suchara: I will have to check the speedup claim for this
algorithm. So for the welded tree it's exponential, and for tree depth
300 this would not be solvable classically. For the ground state
estimation -- so we originally studied more complicated molecules, but
the gate counts coming out of that are huge. So I decided to analyze a
simple molecule, and for this molecule this would be, I believe, a
classically solvable problem.
So this table summarizes the gate counts that we obtained. Again, I
picked the welded tree and triangle-finding problems. They use the
quantum random walk, which is fairly efficient, so the gate counts
coming out are not too bad. And also for ground state estimation we are
using a simple molecule. So for ground state estimation we have ten to
the twelfth gates. The number of qubits needed to do the calculation is
sixty; these are logical qubits. And then we have parallelization
factors, which tell us on average how many of the gates can be done in
parallel in the circuits. In my opinion there is scope for improvement
in these parallelization factors. We don't know if we can lay out the
circuits in a better way that is more parallel.
And for each of the algorithms we not only have the total gate count
but we also have a breakdown by gate type. So this is the ground state
estimation algorithm. We know how many state preparations we need, how
many [inaudible], controlled-NOT, S and Z gates and measurements. And
this is the information that feeds into...
>> : [Inaudible] go back?
>> Martin Suchara: Yes.
>> : You're able to do ground state without any rotations? [ Silence ]
Who derived the circuit? Maybe that's a better way to put it
[inaudible]. Do you know where it came from...?
>> Martin Suchara: Oh, so I think actually the Z rotations are
[inaudible] rotations. I think that's misleading.
>> : Okay. Okay. Yeah, it's arbitrary and Z, that's fine. There are
some Z's really.
>> Martin Suchara: Yes. That's -- That's a misleading choice of --.
>> : Okay, that's fair. I was like you can do that without rotations.
Okay.
>> Martin Suchara: So for the actual analysis of the error correcting
codes: what is the overhead? So far we only saw numbers of logical
qubits and logical operations. What is the number of physical operations
we need to do? We looked separately at the three concatenated codes and
at the surface code. For both code families we assumed a tiled layout.
So we have a 2-D structure that contains the physical qubits, and this
structure is divided into tiles, where each tile is in charge of one
logical qubit. The tile has enough space to include ancillas, so if we
need to error correct the information in that logical qubit, we can do
so within the tile. And the third dimension in this space is reserved
for the classical control.
So what exactly does the layout of the qubits inside of a tile look
like? We had to determine that separately for each of the error
correcting codes, because they encode information differently,
operations are done differently, and the number of qubits inside each
tile is going to differ too. So here is the tile structure for the
Steane code at the second level of concatenation. The first-level tile
contains six by eight physical qubits, and then at the second level we
use six by eight of these level-one blocks to construct the tile. And we
have to choose a sufficient level of concatenation to guarantee a high
probability of success of the calculation. So I chose a cutoff of 50%
success probability for the calculation, and I used the equation -- I
believe it was originally shown by [inaudible] in a paper -- that shows
how many concatenation levels we need. And from there we can calculate
the number of physical qubits.
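That calculation is essentially the usual threshold-theorem scaling; here is a hedged sketch of it, using a placeholder threshold value and the formula p_L(k) ≈ p_th·(p/p_th)^(2^k), which may differ in detail from the equation used in the tool.
```python
# How many concatenation levels are needed to keep the whole computation likely to succeed.
def levels_needed(p, p_th, n_gates, target_failure=0.5):
    """Smallest level k with n_gates * p_L(k) below the failure budget."""
    k = 0
    while True:
        p_logical = p_th * (p / p_th) ** (2 ** k)
        if n_gates * p_logical < target_failure:
            return k, p_logical
        k += 1

# Placeholder numbers: a 1e-4 threshold and 1e12 logical gates.
print(levels_needed(p=1e-5, p_th=1e-4, n_gates=1e12))   # ~4 levels at superconducting-like errors
print(levels_needed(p=1e-9, p_th=1e-4, n_gates=1e12))   # 1 level at ion-trap-like errors
```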
So what is going on inside of the tile during computation? Well, the
size of the tile depends on the code. There is existing literature that
shows us the optimal location of the qubits inside of the tile and the
sequence of operations we need to do to perform error correction as well
as logical operations.
So existing literature tells us what to do for the Steane and Bacon-Shor
codes, but for the C4/C6 code we had to come up with our own tile design.
Here is a specific example from the paper of Svore that shows the
operations inside the tile for the Steane code. The tile size is six by
eight, and the Steane code uses seven physical qubits to encode one
logical qubit. So these red qubits here, the red locations, are the
locations where the data lives. And then here, on the interior of the
tile, we have some ancillas that are needed in order to do syndrome
extraction and error correction for the Steane code. So this determines
the location of the data qubits and the ancillas, and also the sequence
of operations. The paper of Svore actually shows the exact gate sequence
of all the gates that need to be performed to do error correction as
well as all other operations.
This is a snapshot of the tile at a particular moment in time during the
error correction. And these arrows represent the physical gates that are
being done. So these are SWAP gates and controlled-NOT operations. The
layout has to be optimized to minimize the number of operations that
need to be done, to ensure good reliability of the circuit, and to
minimize the amount of movement. These SWAPs are done because
controlled-NOT operations can only be done on qubits that are next to
each other. So this movement is actually a very expensive thing for the
concatenated codes.
Our tools: we used recursive equations to express the gate counts for
the desired level of concatenation. So we count the number of
elementary gates and the time we need to do these operations, taking
parallelism into account. We also estimated the additional space
needed to do ancilla state generation at the desired level of fidelity.
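The recursion itself is straightforward bookkeeping; here is a toy sketch of it, with a made-up per-level expansion factor standing in for the detailed gate sequences from the tile designs.
```python
# Recursive gate count for concatenated codes.
def gates_at_level(k, expansion, base=1):
    """Physical gates for one logical gate at concatenation level k, assuming each level-k
    gate expands into `expansion` gates at level k-1 (transversal part plus error correction)."""
    if k == 0:
        return base
    return expansion * gates_at_level(k - 1, expansion, base)

# Assumed expansion factor of ~300 next-lower-level gates per logical CNOT plus its EC step:
for k in (1, 2, 3, 4):
    print(k, gates_at_level(k, 300))   # 300, 90,000, 27,000,000, 8,100,000,000
```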
We used a similar methodology for the topological codes. As I said
earlier, in topological codes a pair of holes represents a logical
qubit. So we assumed this layout. We have to ensure sufficient spacing
of the holes, because loops that connect the holes are logical
operators. If we have holes that are too close to each other, there is
going to be a low-weight logical operator and the code is going to be
error prone. So we calculated a code distance that is sufficient to,
again, guarantee that the calculation succeeds with high probability.
And we also took into account the movement of the holes during braiding:
we left enough space in between the holes so that another hole can be
braided through without negatively affecting the error properties of the
code.
So obtaining the physical qubit count is very simple: we obtain the code
distance, from the code distance we get the size of one tile, and we
multiply by the number of logical qubits, and that gives us the number
of physical qubits. For the gate counts we started by first calculating
the exact running time of the entire computation, and from there we were
able to calculate the number of gates that are needed to do the error
correction, which is done all the time on essentially all the qubits. So
this is the major component of the total gate count. And then finally we
added the small number of additional gates needed to do gates such as
measurements, [inaudible] and so on.
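Putting those steps together, here is a hedged back-of-the-envelope sketch of the surface code bookkeeping; the distance formula, the tile of roughly a few d² qubits per hole pair, and the cycle composition are common approximations I am assuming, and the sketch ignores the distillation factories and braiding space that dominate the real numbers.
```python
# Back-of-the-envelope surface code resource estimate.
def surface_code_estimate(logical_qubits, logical_depth, p, p_th,
                          gate_time_ns, target_failure=0.5):
    # Pick the distance d, assuming per-qubit, per-step logical error ~ (p/p_th)^((d+1)/2).
    d = 3
    while logical_qubits * logical_depth * (p / p_th) ** ((d + 1) / 2) > target_failure:
        d += 2
    qubits_per_tile = 2 * (2 * d) ** 2         # assumed tile: two holes spaced ~d apart
    physical_qubits = logical_qubits * qubits_per_tile
    cycle_ns = 8 * gate_time_ns                # assumed ~8 physical gate steps per syndrome cycle
    runtime_ns = logical_depth * d * cycle_ns  # each logical step takes O(d) cycles
    ec_gates = physical_qubits * logical_depth * d * 8   # error correction runs everywhere, always
    return d, physical_qubits, runtime_ns, ec_gates

print(surface_code_estimate(60, 1e11, p=1e-5, p_th=1e-2, gate_time_ns=25))
```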
So, numerical results. These are the results for superconducting qubits.
If you recall, this is the technology that has very fast gate times: on
average a gate only takes 25 nanoseconds. And as far as errors go, this
is somewhere in the middle of the road among the technologies I showed
you; it's a ten to the minus five error per physical gate. And this
table shows the resources for the three algorithms for the instance
sizes we discussed.
So you can see that the surface code will result in computation times
from months to years for these instances. Here are the gate counts. And
this is the physical qubit count, so we need a few million qubits for
the computation. The Bacon-Shor code, which is representative of the
concatenated codes, requires a much higher computation time. And I think
the reason is that we need several levels of concatenation, three or
four levels, to address the gate errors. And that will translate into a
huge gate count and then a huge time needed to do the computation. And
the number of physical qubits is also going to be high because of all
this concatenation; we need to store the information somewhere. Yes?
>> : How sensitive is this number? Like if you pushed it ten to the
minus six, would everything lose like three orders of magnitude in the
results?
>> Martin Suchara: That is a good question. It is sensitive to it...
>> : [Inaudible] state the question.
>> Martin Suchara: Oh, yes. The question is about sensitivity to the
gate errors: if we change the gate error, will the resource requirements
change a lot? And the answer is yes, it is sensitive, because for
concatenated codes, for example, there is this very sharp transition. If
you go from three concatenations to four concatenations, the resource
requirements will blow up by three or four orders of magnitude perhaps.
And that's by changing a single number. If you are sitting right at the
boundary, then this can cost you a lot. But it's very interesting -- I'm
going to discuss this in the next slide -- that...
>> : I have a question about [inaudible]....
>> Martin Suchara: Okay.
>> : So in the ground state estimation we sort of agreed here that you
need to have various [inaudible] as the gates, right?
>> Martin Suchara: Yes.
>> : Did you count each one as simple gate or did you estimate to the
kind of [inaudible] of decomposition into the [inaudible]? In other
words, does it represent a T count involved in implementation of a
simple [inaudible]?
>> : Your Z gate's one gate or were they a hundred gates or...
>> : Right.
>> : ...were they [inaudible] equivalent?
>> Martin Suchara: [Inaudible] do the decomposition.
>> : Of the rotation?
>> : You did.
>> : Yes
>> : Oh, okay. Okay. So that is the real number of gates?
>> Martin Suchara: Yes. Yes.
>> : Okay.
>> : Right. So what do -- Can I ask what [inaudible]? Yeah, I have a
sinister...
>> : Yeah, of course.
>> : ...reason to ask this question. We kind of improved [inaudible] by
three orders of magnitude recently. And I was wondering how much it
would have impacted the ground state [inaudible]...?
>> Martin Suchara: Possibly. I'm sure there could be improvements....
>> : [Inaudible] a big count?
>> : No, no, but the Z he had up there was the full rotation gate. So
when he goes into here he has to multiply it by ten to the second or ten
to the third to get the accuracy [inaudible], even five-bit accuracy.
>> : So it didn't actually have the [inaudible]?
>> : Right. So the point is you could probably drop three digits, the
ten to the twenty-second might drop down to ten to the nineteenth
instead if you didn't have to do the -- if you could get three orders
of magnitude better on the rotations. Is that what you're getting at?
>> : Yes.
>> : Yeah.
>> Martin Suchara: Yes, I would be certainly interested in learning new
techniques to improve the decompositions.
>> : It's still up at ten to the nineteenth. I mean, I'd still -- You
know, it's a simple molecule. It's still a hell of a lot of gates.
>> Martin Suchara: Right. But we have lots of problem instances that
had much higher numbers. I am just showing numbers that are human
readable.
>> : Again along these lines: is it clear that if you -- So you said
you took 50% success rate for the complete algorithm. Is it clear that
if you drop that to something like ten to the minus six and you just
repeated the experiment like a million times, is it clear that it
wouldn't be better than waiting for a thousand years for one
[inaudible] 50% success probability?
>> Martin Suchara: That's a great question. We haven't done this
calculation, and it's possible that we can obtain a lower running time
by targeting a lower fidelity for the algorithm and repeating it a few
times. Also, there are basically many optimizations that could be done
that we haven't taken into account. One such optimization is the
distillation of the ancillas. Perhaps if you distill them with a lower
target fidelity, that will introduce inaccuracies, but since the
distillation, again, is this concatenated process, if you use one
[inaudible] concatenation in the distillation process, that saves you a
lot of time and a lot of gates. So it's possible that these tradeoffs
can improve the resource estimates significantly.
>> : So in that earlier slide did you have the logical gates and so on?
I mean how many -- Suppose there was no errors.
>> Martin Suchara: Was it this one?
>> : The one -- Yeah, that one.
>> : Yeah.
>> : That Z is really rotated Z. It must've expanded out [inaudible]...
>> : I understand. So there might be some big number in there.
>> : Right. That gets much bigger.
>> Martin Suchara: Yes.
>> : All the rest of them are sort of what they are which says you
something [inaudible] to the fourteenth or something plus whatever the
Z count -- whatever the T count in Z is and -- ?
>> : Yep.
>> : So they each count and so -- Yeah.
>> Martin Suchara: So the next question that I wanted to ask was whether
there is some regime -- So we saw that the topological codes outperform
the concatenated codes for the superconductors. So the next question I
wanted to ask is: is there some regime where the surface codes are
actually not performing as well? And the answer is yes. So I considered
these three technologies. We have neutral atoms, which have huge error
rates on the gates; then we have superconductors, which have lower
errors; and then we have ion traps, which have even lower errors. And,
okay, the result here is very interesting. For the high errors of
neutral atoms you can only use the surface code, because the surface
code meets the threshold, ten to the minus three, but the concatenated
codes just cannot deliver.
And [inaudible] where we had -- oh, and the time to do the calculation
is huge, because the gate time for neutral atoms is three orders of
magnitude longer, slower, than for superconductors. For superconductors
we already know that the surface code will win. But if we decrease the
error rate even further, for the ion traps, the interesting observation
is that the concatenated codes actually do better than the surface code.
And the reason is that at these very small error rates we only need one
level of concatenation, so error correction with the concatenated code
is super cheap. But for the surface code, the duration of the
computation is almost independent of the code distance, because all the
operations in the surface code are done in parallel: all the syndrome
measurements are parallelized, and all the operations, the braiding, are
highly parallel. So what actually determines the running time for the
surface code is the duration of the gate.
So some key observations: the surface codes are better in most regimes
unless you're dealing with very low error rates. CNOT gates are the
dominant gates in the logical circuits, but as far as physical gates go,
CNOT was the most frequently used physical gate for the topological
codes, and SWAP is the most frequent gate for the concatenated codes.
And the reason is that for concatenated codes we need to use SWAPs to
move information around so that controlled-NOT operations can be done
locally, whereas surface codes do not need the swapping.
So finally, I will just use the last five minutes to describe some
thoughts about building a faster decoder for topological codes. So far
our only concern was the quantum resources, but we have to keep in mind
that for the surface code we will also need a classical controller that
decodes the errors that occur. We know that we need to be able to solve
a problem that has millions of physical qubits. We also know that
syndrome measurements will be inaccurate, so we will need to decode
errors after several rounds of syndrome measurements. And we know that
we like technologies with low gate times, because the surface code works
well with technologies with low gate times such as superconductors. So
we would really love to have a decoder that works in real time. And this
is work that I'd like to do in the coming months; I have started looking
at the decoding problem.
To decode errors for the surface code we need to solve the minimum
weight matching problem to match pairs of syndromes. And one key
observation -- so even though this problem is efficiently solvable -- we
would typically use Edmonds' matching algorithm to solve it -- it is
going to be too slow for the problem sizes we are considering here. So
one observation that is due to Fowler is that we don't need to solve the
most general minimum weight matching problem; we can prune certain edges
from consideration. Here is an example. We have a vertex V and we want
to match it to some other vertex. Fowler's observation tells us that
points in space cast shadows: if point number one is here, it will cast
a shadow here, and point three will be shadowed by it. So for vertex V
we only need to consider the edge from V to point one, and we can ignore
the edge from V to vertex number three. The reason is that you can show
that, assuming a minimum weight matching would use the edge from V to
point number three, you could find an alternate matching that would have
lower weight.
So now, this is a two-dimensional picture and, of course, we need to
solve the problem in three dimensions. It turns out that in two
dimensions there is a very simple linear time algorithm that can prune
the shadowed edges. But I did not manage to find a counterpart in three
dimensions; in fact, I believe it's unlikely that such an algorithm
exists. But I found a heuristic that allows us to prune the edges in
linear time with very favorable results. The heuristic works as follows:
we are given the vertex V and we want to find the candidate edges. We
look in the four geographic directions for the closest point, and then
we create a bounding box in these two dimensions. And we know that any
point that vertex V will connect to is going to be inside of this
bounding box. Then we look at points in the third dimension,
perpendicular to the screen, and we find the closest such point in each
direction, and we connect vertex V to these points.
So I looked at the resulting average degree of a vertex in the
three-dimensional graph that I obtained by simulating the surface code.
Here is the number of qubits in the surface code -- I went up from a
hundred qubits to one million qubits -- and I looked at the average
degree of a vertex after the pruning. It seems that the number
approaches about 180. So, in the limit, we will have a constant number
of edges per vertex. And I tried to solve the minimum weight matching
problem on just my laptop computer and see how long it takes to do the
matching. These are the results, and you can see that the scaling is
also linear. So for one million qubits it takes a bit over a minute to
construct the graph, generate the errors and prune the edges, and then
it takes about seven or eight minutes to do the matching itself. Yes,
there is a question.
>> : You've listed the number of qubits, but what is the error rate at
which you're running your simulation? Because that has a big effect on
the number of vertices you need to put in the graph.
>> Martin Suchara: I was running fairly close to the threshold -- at
about [inaudible] of the threshold. If the error rate was much lower,
then I think the result would be -- we could decode errors for a higher
number of qubits, because there would be just fewer syndromes. But I
think the statistical distribution of the points in the space would be
very similar, so I think the degree of a node would actually be the
same. Yes, another question.
>> : Performance of the decoder because now you're using this heuristic
to parallelize your algorithm, if I understand you correctly?
>> Martin Suchara: So this is not parallel yet. This is...
>> : But it can be.
>> Martin Suchara: But it can be. And I'd like to look at
parallelization....
>> : So [inaudible] possible using this [inaudible], right?
>> Martin Suchara: So it is certainly possible. I actually have a
parallel implementation; I think I have it in the next slide. I have a
parallel implementation for the pruning, which is easier to do -- I mean,
the pruning is just a very simple heuristic. I don't have a
parallelization for the matching itself, which uses Edmonds' Algorithm
with its primal and dual updates.
>> : [Inaudible] which does this trick of creating these bumps around
the vertices and having these smaller graphs. Does that affect the
performance of [inaudible]?
>> Martin Suchara: Oh yes. It's very significant. I would not be able
to do -- I would probably have to stop at somewhere between a thousand
and ten thousand qubits if I didn't do the pruning.
>> : Sorry, I didn't mean in real time. I meant in probability of
decoding errors.
>> Martin Suchara: It doesn't affect the probability of errors, because
we are still solving the exact same problem. We can show that we only
prune edges that cannot be in the minimum weight matching.
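For concreteness, here is a small sketch of the pruning heuristic as I read the description above; the exact rule for which candidates are kept (everything inside the in-plane bounding box, plus the nearest event in each time direction) is my interpretation, not necessarily the rule in the actual implementation.
```python
# Candidate-edge pruning for one detection event v = (x, y, t).
def candidate_edges(v, events):
    vx, vy, vt = v
    others = [e for e in events if e != v]
    # Nearest event in each of the four in-plane directions (if any).
    right = min((e for e in others if e[0] > vx), key=lambda e: e[0] - vx, default=None)
    left  = min((e for e in others if e[0] < vx), key=lambda e: vx - e[0], default=None)
    up    = min((e for e in others if e[1] > vy), key=lambda e: e[1] - vy, default=None)
    down  = min((e for e in others if e[1] < vy), key=lambda e: vy - e[1], default=None)
    xs = [e[0] for e in (left, right) if e] + [vx]
    ys = [e[1] for e in (up, down) if e] + [vy]
    box = [e for e in others
           if min(xs) <= e[0] <= max(xs) and min(ys) <= e[1] <= max(ys)]
    # Nearest events along the time axis, one in each direction.
    later   = min((e for e in others if e[2] > vt), key=lambda e: e[2] - vt, default=None)
    earlier = min((e for e in others if e[2] < vt), key=lambda e: vt - e[2], default=None)
    return {e for e in box + [later, earlier] if e is not None}

events = [(0, 0, 0), (2, 0, 0), (0, 3, 0), (5, 5, 0), (0, 0, 2)]
print(candidate_edges((0, 0, 0), events))   # keeps nearby events, drops the far-away (5, 5, 0)
```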
So in conclusion, I did some work on topological quantum error
correction; I helped to develop two new quantum error correcting codes
in this space, the five-squares code and the triangle code.
And in the past year I worked on the quantum computer science project
of IARPA to estimate the resources required to run a variety of
algorithms on realistic quantum machines using four families of quantum
error correcting codes. And currently I'm looking at the problem of
parallelizing a decoder for topological quantum error correcting codes.
Thank you.
[ Audience applause ]
>> Krysta Svore: Questions?
>> Martin Suchara: Yes.
>> : I don't know if there's a simple answer to this; if not, we can
take it offline. But what about the bottom and top boundaries when you
decode a surface code with all the measurements? Can you take into
account preparation errors? And what about the last layers of
measurements? I guess the point is that you measure many times to be
more sure about the syndrome outcome, but the last measurements you did
might have errors which you're not sure are good, so you want to decode
what has been the best in some sense. So how do you deal with this
issue?
>> Martin Suchara: So I think there are two or perhaps three schools of
thought on how to deal with this. One of them says, regardless of how
you solve this problem, you could just do a dummy round of perfect
measurements and then run your algorithm on this instance, with the
knowledge that in reality you would do something else, such as do the
error correction maybe ten rounds behind schedule, behind the actual
measurements.
The second school of thought would be, obviously, to simulate exactly
what would be going on in the real system, which is perhaps this: you
would do the decoding with some lag. And the third school of thought is
to use a periodic boundary condition in the time dimension, so you would
essentially connect points to the boundary. Yes, [inaudible].
>> : So just a comment. I think you collaborated with all three of us
on this project. But...
>> Martin Suchara: Yes.
>> : On the surface code estimate, you had some high numbers. And I
just want to say that for some of the distillation procedures we
assumed a very conservative model of how we were distilling magic
states. And so some of those numbers I think you showed could be
reduced fairly significantly with a penalty in area [inaudible]...
>> Martin Suchara: I changed that already actually.
>> : Oh, okay.
>> Martin Suchara: Right. So we had earlier results that were more
pessimistic because we used a conservative assumption that only a
single CNOT would be done in the surface code at any given time. But
the numbers I showed right now, I tried to parallelize all the CNOTs
that I could including in the state distillation.
>> : Yeah, but in our state distillation we...
>> Martin Suchara: And, yes, we need a larger number of qubits than
originally predicted. That is correct.
>> Krysta Svore: Thank you again.
[ Audience applause ]