Optimal Control of Quantum Systems: ss-NMR, Noise Resilience, and a Scalable Newton's Method

D.L. Goodwin, iMR-CDT: 1st Year Presentation, University of Nottingham.
3rd December 2014

Optimal Control at Southampton

1. Introduction to Optimal Control
   - Numerical Optimisation
   - Optimal Control
   - GRadient Ascent Pulse Engineering
2. Solid State Nuclear Magnetic Resonance
   - Crystallite orientation ensemble average
   - ssNMR Results
3. Noise Resilience of Optimal Control Pulses
   - Ensemble average
   - Noise Resilient Results
4. The Newton-Raphson Method
   - Gradient Calculations
   - Sophie Schirmer's Gradient
   - Hessian Calculations
   - GRAPE - Efficient Hessian Calculations
5. Acknowledgement

Introduction to Optimal Control

Numerical Optimisation
Basic extrema finding

- Optimisation is the process of finding the minima or maxima (extrema) of functions.
- The Newton-Raphson method uses the gradient at a trial point to incrementally approach an extremum of an objective function:

  x_{k+1} = x_k - \frac{f(x_k)}{\nabla f(x_k)}

Numerical Optimisation
Basic extrema finding

- The basic method uses a step length, \alpha, and a step direction, p, to iteratively find an extremum.
- We calculate a step direction, then decide how far to move in that direction:

  x_{k+1} = x_k + \alpha_k p_k

- For minimisation, the step direction should be a descent direction:

  p_k = -B_k^{-1} \nabla f_k

  Gradient Descent:      B_k is the identity matrix.
  Newton's Method:       B_k is the exact Hessian matrix, \nabla^2 f(x_k).
  Quasi-Newton Methods:  B_k is an approximation to the Hessian.

Numerical Optimisation
Comparison of Methods

  Gradient Descent   Cheap, scalable, slow convergence.
  Newton's Method    Can be expensive, good convergence.
  BFGS               Requires line search, no matrix inversion, each update in O(n^2).
  l-BFGS             Requires line search, no matrix inversion, each update in O(n).

Numerical Optimisation
Example - Rosenbrock Function
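The Rosenbrock surface and convergence plots do not survive the slide export; as a stand-in, here is a minimal Python/NumPy sketch of the update rule x_{k+1} = x_k + \alpha_k p_k with p_k = -B_k^{-1} \nabla f_k applied to the Rosenbrock function. The starting point, step lengths and iteration counts are illustrative assumptions, not values from the talk.

```python
import numpy as np

def rosenbrock(x):
    """Rosenbrock function f(x, y) = (1 - x)^2 + 100 (y - x^2)^2."""
    return (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2

def gradient(x):
    """Analytic gradient of the Rosenbrock function."""
    return np.array([-2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
                     200 * (x[1] - x[0]**2)])

def hessian(x):
    """Analytic Hessian of the Rosenbrock function."""
    return np.array([[2 - 400 * (x[1] - 3 * x[0]**2), -400 * x[0]],
                     [-400 * x[0],                     200.0]])

def minimise(x0, method="newton", alpha=1.0, n_iter=50):
    """Iterate x_{k+1} = x_k + alpha * p_k with p_k = -B_k^{-1} grad f_k."""
    x = np.array(x0, dtype=float)
    for _ in range(n_iter):
        g = gradient(x)
        B = hessian(x) if method == "newton" else np.eye(2)  # exact Hessian vs identity
        p = -np.linalg.solve(B, g)                           # step direction
        x = x + alpha * p                                    # step of length alpha
    return x

x_nw = minimise([-1.2, 1.0], method="newton")                # reaches the minimum at (1, 1)
x_gd = minimise([-1.2, 1.0], method="descent", alpha=2e-4)   # still far away after 50 steps
print("Newton:          ", x_nw, rosenbrock(x_nw))
print("Gradient descent:", x_gd, rosenbrock(x_gd))
```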

Optimal Control Theory

- The idea of optimal control is to use all controllable parts of the full Hamiltonian to steer an initial quantum state into a target state.
- A measure of "success" for this control can be fidelity: the overlap between the controlled final state and the target state (1 being identical states, 0 being orthogonal states).
- The control parts of the magnetic resonance Hamiltonians are rf pulses for NMR and mw pulses for ESR.
- The optimal solutions to the stated optimisation problem are a set of discrete-time control pulses.

Optimal Control Theory

- The Hamiltonian is split into the part we cannot control (the drift), \hat{H}_0, and the parts we can control, \hat{H}^{(k)}:

  \hat{H}(t) = \hat{H}_0 + \sum_k c^{(k)}(t)\, \hat{H}^{(k)}

- The control operators (e.g. electromagnetic fields) each have a coefficient, c^{(k)}(t), dictating the controllable time evolution.
- Find \{c^{(k)}(t)\} to take our system from an initial state, |\psi_0\rangle, to a target state, |\sigma\rangle.

Optimal Control Theory

- Use fidelity as a metric to measure the overlap between the target state and the final state (1 = total overlap, 0 = no overlap):

  J\left[\{c^{(k)}(t)\}\right] = \mathrm{Re}\, \left\langle \sigma \middle| \exp_{(0)}\!\left[ -i \int_0^T \left( \hat{H}_0 + \sum_k c^{(k)}(t)\, \hat{H}^{(k)} \right) dt \right] \middle| \psi_0 \right\rangle

  where \exp_{(0)} denotes the time-ordered exponential.
- The problem is to find the maximum of the fidelity (or the minimum of 1 - fidelity).

GRAPE

- Assumption: the control pulse sequence is piecewise constant.
- The time-ordered exponential now becomes a sequential multiplication of propagators:

  J\left[\{c^{(k)}(t)\}\right] = \mathrm{Re}\, \left\langle \sigma \middle| \hat{U}_N \hat{U}_{N-1} \cdots \hat{U}_2 \hat{U}_1 \middle| \psi_0 \right\rangle

  \hat{U}_n = \exp\!\left[ -i \left( \hat{H}_0 + \sum_k c^{(k)}(t_n)\, \hat{H}^{(k)} \right) \Delta t \right]

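A minimal sketch of this piecewise-constant discretisation for a single two-level system, using Python/NumPy and SciPy's expm; the drift operator, control operators, random pulse amplitudes and time step below are illustrative assumptions rather than the spin systems used later in the talk.

```python
import numpy as np
from scipy.linalg import expm

# Illustrative two-level system: Pauli operators as drift and controls
sx = np.array([[0, 1], [1, 0]], dtype=complex) / 2
sy = np.array([[0, -1j], [1j, 0]], dtype=complex) / 2
sz = np.array([[1, 0], [0, -1]], dtype=complex) / 2

H0 = 2 * np.pi * sz                  # drift Hamiltonian H_0 (assumed offset)
controls = [sx, sy]                  # control operators H^(k)
N, dt = 50, 0.01                     # number of time slices and slice duration

rng = np.random.default_rng(0)
c = rng.uniform(-1, 1, size=(N, len(controls)))   # piecewise-constant amplitudes c^(k)(t_n)

psi0 = np.array([1, 0], dtype=complex)            # initial state |psi_0>
sigma = np.array([0, 1], dtype=complex)           # target state |sigma>

def propagator(n):
    """U_n = exp[-i (H_0 + sum_k c^(k)(t_n) H^(k)) dt] for time slice n."""
    H = H0 + sum(c[n, k] * Hk for k, Hk in enumerate(controls))
    return expm(-1j * H * dt)

# The time-ordered exponential becomes a sequential product U_N ... U_2 U_1
psi = psi0
for n in range(N):
    psi = propagator(n) @ psi

fidelity = np.real(np.vdot(sigma, psi))           # J = Re <sigma| U_N ... U_1 |psi_0>
print("fidelity:", fidelity)
```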
Solid State Nuclear Magnetic Resonance

Ensemble Average
Crystallite orientation, Lebedev grid, Floquet space

- Take an ensemble average over the crystallite orientations.
- Feed this into a numerical optimisation through GRAPE.
- Average over a Lebedev grid.
- Initialise each member of the ensemble to the same random guess.
- Calculate the Liouvillian in a Floquet basis at each of the grid weights.
- Find the local cost and its gradient.
- Take the weighted average over the cost and gradient (a sketch of this averaging follows).
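A minimal sketch of the weighted averaging step, assuming the orientation grid (weights and angles, e.g. a Lebedev grid) and a per-orientation cost/gradient routine are supplied from elsewhere; the function and argument names are hypothetical, not Spinach calls.

```python
import numpy as np

def ensemble_average(grid, cost_and_grad, pulse):
    """Weighted average of cost and gradient over crystallite orientations.

    grid          : iterable of (weight, orientation) pairs, e.g. a Lebedev grid
    cost_and_grad : function(orientation, pulse) -> (local cost, local gradient)
    pulse         : shared control-pulse array (same initial guess for every member)
    """
    total_cost = 0.0
    total_grad = np.zeros_like(pulse)
    for weight, orientation in grid:
        cost, grad = cost_and_grad(orientation, pulse)  # local cost and gradient
        total_cost += weight * cost                     # weighted average of the cost...
        total_grad += weight * grad                     # ...and of the gradient
    return total_cost, total_grad

# average_cost, average_grad = ensemble_average(lebedev_grid, floquet_cost_and_grad, pulse)
```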

Ensemble Average

Noise Resilience of Optimal Control Pulses

HCF simulations

Figure: Interaction parameters of a molecular group used in HCF small-system state-transfer simulations, with a magnetic induction of 9.4 Tesla (couplings of +140 Hz and 160 Hz between the H, C and F spins shown in the diagram).

Protein backbone simulations

Figure: Interaction parameters of a typical protein backbone used in large-system state-transfer simulations, with a magnetic induction of 9.4 Tesla (carbon chemical shifts of 30 ppm, 55 ppm and 175 ppm, and J-couplings of -11 Hz, 55 Hz, 7 Hz and -15 Hz shown in the diagram).

Ensemble Average

- In real experimental apparatus the control channels will include some level of noise, as most electrical systems do.
- Just as with the ssNMR optimisation, we average over an ensemble of systems.
- Each system has the same initial random guess.
- In simulating this noise, the solution is to create an ensemble of systems, each with its own instance of noise affecting the control channels.
- The noise is additive Gaussian, and is defined at the start of the simulation for each noise instance.
- We then optimise over the ensemble (a sketch of this construction follows).
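A minimal sketch of this construction, assuming a fidelity routine for a given (noisy) pulse is available; the ensemble size, noise level and function names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_noise_ensemble(n_instances, pulse_shape, noise_level):
    """One fixed additive Gaussian noise instance per ensemble member.

    Each instance is drawn once, at the start of the simulation, and then
    re-used unchanged at every optimisation iteration for that member.
    """
    return [rng.normal(0.0, noise_level, size=pulse_shape)
            for _ in range(n_instances)]

def ensemble_fidelity(pulse, noise_ensemble, fidelity):
    """Average fidelity of the noisy controls over the ensemble."""
    return np.mean([fidelity(pulse + noise) for noise in noise_ensemble])

# noise_ensemble = make_noise_ensemble(n_instances=32, pulse_shape=(50, 2), noise_level=0.1)
# J = ensemble_fidelity(pulse, noise_ensemble, fidelity)   # optimise this over `pulse`
```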

Ensemble Average

Figure: Convergence of the noise simulations with the number of noise instances (N = 2^3, 2^4, 2^5, 2^6), plotted against noise level (0.01 to 1.00).

Ensemble Average

Figure: Population of states local to each spin of HCF (upper-left), total population of each state (lower-left), total populations of correlations in the system (upper-right), and coherence orders (lower-right). Panels cover the 1H, 13C and 19F spins, the 1-spin, 2-spin and 3-spin correlation orders, and the single-quantum and double-quantum coherences, plotted against time point (0 to 120).

In a direct product basis set the full state space, L, is a direct sum of correlation order subspaces:

  L = L_0 \oplus L_1 \oplus L_2 \oplus L_3

where the population of a correlation order, p_k, is given by the projection of the state onto its subspace:

  p_k = \langle L_k | \hat{\rho} \rangle

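A minimal sketch of such a correlation-order decomposition, expanding a density matrix in a direct-product basis of normalised single-spin operators; taking p_k as the squared norm of the projection onto L_k is an assumption about the exact normalisation used in the talk.

```python
import numpy as np
from itertools import product

# Normalised single-spin basis: identity plus the three Pauli operators
E  = np.eye(2, dtype=complex) / np.sqrt(2)
sx = np.array([[0, 1], [1, 0]], dtype=complex) / np.sqrt(2)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex) / np.sqrt(2)
sz = np.array([[1, 0], [0, -1]], dtype=complex) / np.sqrt(2)
single_spin = [E, sx, sy, sz]

def correlation_order_populations(rho, n_spins):
    """Populations p_k of the correlation-order subspaces L_0 ... L_{n_spins}.

    A direct-product basis operator has correlation order equal to its number
    of non-identity factors; p_k is taken here as the squared norm of the
    projection of rho onto L_k (the normalisation choice is an assumption).
    """
    p = np.zeros(n_spins + 1)
    for labels in product(range(4), repeat=n_spins):
        op = single_spin[labels[0]]
        for i in labels[1:]:
            op = np.kron(op, single_spin[i])
        order = sum(1 for i in labels if i != 0)    # number of non-identity factors
        coeff = np.trace(op.conj().T @ rho)         # <basis operator | rho>
        p[order] += abs(coeff)**2
    return p

# Example: the two-spin state |00><00|
ket = np.zeros(4, dtype=complex); ket[0] = 1.0
rho = np.outer(ket, ket.conj())
print(correlation_order_populations(rho, 2))        # [0.25, 0.5, 0.25], sums to tr(rho^2) = 1
```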
Ensemble Average

Figure: Measures of similarity between the trajectory of a noisy system and that of a similar system without noise, plotted against time point (0 to 120).

The Newton-Raphson Method

Objective function and Gradient Calculation

- For second-order methods, which require a gradient calculation, the problem is reduced to three steps (sketched below):
  1. Propagate forwards from the source.
  2. Propagate backwards from the target.
  3. Compute the expectation of the derivative:

  \left\langle \sigma \middle| \hat{U}_N \hat{U}_{N-1} \cdots \frac{\partial \hat{U}_{n=t}}{\partial c^{(k)}_{n=t}} \cdots \hat{U}_2 \hat{U}_1 \middle| \psi_0 \right\rangle
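A minimal sketch of these three steps, assuming the slice propagators U_n and their derivatives dU_n/dc_n^(k) are already available (for example from the augmented exponential on the next slide, or from finite differences); the function and variable names are illustrative.

```python
import numpy as np

def grape_gradient(propagators, derivative_propagators, psi0, sigma):
    """Gradient of J = Re <sigma| U_N ... U_1 |psi_0> from one forward and one
    backward propagation.

    propagators            : list of the N slice propagators U_n
    derivative_propagators : derivative_propagators[n][k] = dU_n / dc_n^(k)
    """
    N = len(propagators)
    K = len(derivative_propagators[0])

    # 1. propagate forwards from the source: forward[n] = U_n ... U_1 |psi_0>
    forward = [psi0]
    for U in propagators:
        forward.append(U @ forward[-1])

    # 2. propagate backwards from the target: backward[n] = U_{n+1}^+ ... U_N^+ |sigma>
    backward = [sigma]
    for U in reversed(propagators):
        backward.append(U.conj().T @ backward[-1])
    backward.reverse()

    # 3. expectation of the derivative at every time slice and control channel
    grad = np.zeros((N, K))
    for n in range(N):
        for k in range(K):
            grad[n, k] = np.real(
                np.vdot(backward[n + 1], derivative_propagators[n][k] @ forward[n]))
    return grad
```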

Augmented Exponential
Gradient Calculation

- We can use the work of C. Van Loan (1970s) and Sophie Schirmer to find the derivative of the propagator with respect to the control pulse at a specific time point:

  \exp\!\left[\begin{pmatrix} -i\hat{L}\,\Delta t & -i\hat{H}^{(k)}_{n}\,\Delta t \\ 0 & -i\hat{L}\,\Delta t \end{pmatrix}\right]
  = \begin{pmatrix} e^{-i\hat{L}\Delta t} & \dfrac{\partial}{\partial c^{(k)}_{n}}\, e^{-i\hat{L}\Delta t} \\ 0 & e^{-i\hat{L}\Delta t} \end{pmatrix}

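A minimal sketch of this 2x2 block (augmented) exponential in Python/SciPy, with a finite-difference check; the Hamiltonians, dimension and time step are arbitrary test values, and a Hilbert-space Hamiltonian is used in place of the Liouvillian for brevity.

```python
import numpy as np
from scipy.linalg import expm

def propagator_derivative(H, Hk, dt):
    """dU/dc for U = exp[-i (H + c H_k) dt], evaluated at the current value of c,
    from the upper-right block of a 2x2 block (augmented) matrix exponential."""
    dim = H.shape[0]
    aug = np.zeros((2 * dim, 2 * dim), dtype=complex)
    aug[:dim, :dim] = -1j * H * dt      # diagonal blocks: -i H dt
    aug[dim:, dim:] = -1j * H * dt
    aug[:dim, dim:] = -1j * Hk * dt     # upper-right block: -i H_k dt
    return expm(aug)[:dim, dim:]        # upper-right block of the exponential

# Finite-difference check on random Hermitian test matrices
rng = np.random.default_rng(2)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
B = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H, Hk = (A + A.conj().T) / 2, (B + B.conj().T) / 2
dt, eps = 0.1, 1e-6

exact = propagator_derivative(H, Hk, dt)
fd = (expm(-1j * (H + eps * Hk) * dt) - expm(-1j * (H - eps * Hk) * dt)) / (2 * eps)
print(np.max(np.abs(exact - fd)))       # agreement at the finite-difference noise level
```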
Hessian Matrix
Avoiding Singularities

- Symmetric.
- Non-singular.
- Its size is the number of time points multiplied by the number of control channels.
- Diagonally dominant.

Hessian Matrix
Calculating the Hessian elements

- Calculation of the Hessian elements requires the second order derivatives.
- This scales as O(n^2) calculations (compared with O(n) for a gradient calculation).
- The second order derivatives can be calculated with a 3x3 augmented exponential:

  \exp\!\left[\begin{pmatrix}
  -i\hat{L}\,\Delta t & -i\hat{H}^{(k_1)}_{n_1}\,\Delta t & 0 \\
  0 & -i\hat{L}\,\Delta t & -i\hat{H}^{(k_2)}_{n_2}\,\Delta t \\
  0 & 0 & -i\hat{L}\,\Delta t
  \end{pmatrix}\right]
  =
  \begin{pmatrix}
  e^{-i\hat{L}\Delta t} & \dfrac{\partial}{\partial c^{(k_1)}_{n_1}} e^{-i\hat{L}\Delta t} & \dfrac{\partial^2}{\partial c^{(k_1)}_{n_1}\,\partial c^{(k_2)}_{n_2}} e^{-i\hat{L}\Delta t} \\
  0 & e^{-i\hat{L}\Delta t} & \dfrac{\partial}{\partial c^{(k_2)}_{n_2}} e^{-i\hat{L}\Delta t} \\
  0 & 0 & e^{-i\hat{L}\Delta t}
  \end{pmatrix}

Hessian Calculation
Expectation of second order derivatives

- We can reduce the computation to scale with O(n) by realising that we can store the propagators from the gradient calculation, and then perform an extra set of 3x3 augmented exponentials for the (block) diagonal elements. The Hessian elements are the expectations

  \left\langle \sigma \middle| \hat{U}_N \hat{U}_{N-1} \cdots \frac{\partial^2 \hat{U}_{n=t}}{\partial c^{(k)\,2}_{n=t}} \cdots \hat{U}_2 \hat{U}_1 \middle| \psi_0 \right\rangle

  \left\langle \sigma \middle| \hat{U}_N \hat{U}_{N-1} \cdots \frac{\partial \hat{U}_{n=t+1}}{\partial c^{(k)}_{n=t+1}}\, \frac{\partial \hat{U}_{n=t}}{\partial c^{(k)}_{n=t}} \cdots \hat{U}_2 \hat{U}_1 \middle| \psi_0 \right\rangle

  \left\langle \sigma \middle| \hat{U}_N \hat{U}_{N-1} \cdots \frac{\partial \hat{U}_{n=t_2}}{\partial c^{(k)}_{n=t_2}} \cdots \frac{\partial \hat{U}_{n=t_1}}{\partial c^{(k)}_{n=t_1}} \cdots \hat{U}_2 \hat{U}_1 \middle| \psi_0 \right\rangle

Hessian Regularisation
Avoiding Singularities

- A singular matrix is one that is not invertible.
- This is common when the Hessian has negative eigenvalues.
- We can regularise the Hessian matrix:

  TRM - add a multiple of the identity to the Hessian (sketched below):

        H_{reg} = H + \lambda I,  where  \lambda > \max\{0, -\min[\mathrm{eig}(H)]\}

  RFO - rational function optimisation; prevents the algorithm from taking large steps.
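A minimal sketch of the TRM shift, assuming a symmetric Hessian and a gradient vector are given; the small offset delta that makes the inequality strict is an implementation assumption.

```python
import numpy as np

def regularise_hessian(H, delta=1e-6):
    """TRM-style regularisation H_reg = H + lambda*I with
    lambda > max{0, -min[eig(H)]}, making H_reg positive definite."""
    lam = max(0.0, -np.linalg.eigvalsh(H).min()) + delta   # delta makes the bound strict
    return H + lam * np.eye(H.shape[0])

def newton_step(H, g):
    """Descent direction p = -H_reg^{-1} g from the regularised Hessian."""
    return -np.linalg.solve(regularise_hessian(H), g)
```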

Results
Comparing the Newton method

Figure: Convergence of the optimisation, plotted on a logarithmic scale (10^0 down to 10^-6) against iteration count (up to 250), for steepest descent, BFGS, L-BFGS, a finite-difference Hessian, Newton (fmincon.m), and Newton-TRM.

Acknowledgement

Marina Carravetta     University of Southampton
Sophie Schirmer       Swansea University
Frank Langbein        Cardiff University
Gavin Morley          University of Warwick
Anton Tcholakov       University of Warwick
Sean Wright           University of Southampton

Spinach
Ilya Kuprov           University of Southampton
Hannah Hogben         University of Oxford
Luke Edwards          University of Oxford
Matthew Krzystyniak   Oxford e-Research Centre
Peter Hore            University of Oxford
Gareth Charnock       Oxford e-Research Centre
Dmitry Savostyanov    University of Southampton
Sergey Dolgov         Max-Planck Institute for Mathematics in the Sciences
Zenawi Welderufael    University of Southampton
