 z  relative Renyi entropies

advertisement
z
relative Renyi entropies
Nilanjana Datta
University of Cambridge
jointly with:
Koenraad Audenaert
Royal Holloway & University of Ghent
arXiv:1310.7178
Quantum Relative Entropy
-- a fundamental quantity in Quantum Mechanics & Quantum
Information Theory :
The quantum relative entropy of
  0, Tr   1;

w.r.t
,
  0:
(density matrix/state)
D(  ||  ) : Tr ( log  )  Tr ( log  )
supp   supp 
well-defined if

log  log 2
Classical counterpart:
D( p || q ) : 
xX
px
px log ;
qx
p   px  xX ; q  qx  xX

D(  ||  ) acts as a parent quantity for von Neumann entropy:
S (  ) : Tr ( log  )   D(  || I )

(  I )
It also acts as a parent quantity for other entropies

Conditional entropy
 AB
A
B
S ( A | B)  : S (  AB )  S (  B )   D(  AB || I A   B )

Mutual information
 B  TrA  AB
I ( A : B)  : S (  A )  S (  B )  S (  AB )  D(  AB ||  A   B )
Some Properties of
“distance”

D (  ||  )
D(  ||  )  0
 ,
states
 0 if & only if   
Joint convexity:
n
For two mixtures of states
   pi i
&  
i 1
n
 p
i 1
i
i
n
D (  ||  )   pi D ( i ||  i )
i 1

Invariance under
joint unitaries
D(U U * || U  U * )  D (  ||  )

Data-processing inequality

Monotonicity under quantum
operations
any allowed physical process on a
Quantum operation:
quantum-mechanical system
Most general description given by
a completely positive trace-preserving (CPTP) map

()
Data-processing inequality (DPI)
D ( (  ) ||  ( ))  D(  ||  )
This is a fundamental property for any relative entropy
Significance of the quantum relative entropy in
Quantum Information Theory
It acts as a parent quantity for optimal rates of
information-processing tasks e.g.
 data compression,
 transmission of information through a channel etc.
in the
“asymptotic memoryless setting”
information sources & channels are assumed to be
 memoryless
 available for infinite number of uses (asymptotic limit)
(n  )

E.g. Transmission of classical information
classical
info
N
Noisy quantum
channel
Optimal rate (of classical information transmission):
classical capacity
C (N ) 
maximum number of bits transmitted per use of N
memoryless: there is no correlation in the noise acting
on successive inputs
N n
:
n
successive uses of the channel; independent
“asymptotic, memoryless setting”

To evaluate
C (N ) :
classical
info
x
En
 x( n )
input
N n
n
N
uses
n
encoding

One requires : prob. of error
C (N ) :
N  n (  x( n ) )
channel
output
Dn
decoding
(n)
e 0
p
x'
as
n
Optimal rate of reliable information transmission
-given in terms of a mutual information:
(obtainable from the relative entropy)
Other important relative entropies
(1)

relative Renyi entropies:
  RRE
(2) Max- and min-relative entropies
(1)

relative Renyi entropies:
  RRE
1
D (  ||  ) :
log Tr(   1 ) 
 1
lim D (  ||  )  D(  ||  )
 1

Also is of important operational significance,
*
(2) Max- and Min- relative entropies

Max-relative entropy [ND 2008]
Dmax (  ||  ) : inf  :   2  


Min-relative entropy [Renner et al 2012]
Dmin (  ||  ) : 2 log ||   ||1
*
Properties of the min-max relative entropies
Positivity:

If
 ,   D (H ),
D* (  ||  )  0
for
*  max, min
just as
D(  ||  )
Data-processing inequality:

D* ( (  ) ||  ( ))  D* (  ||  )
for any CPTP map
Invariance under joint unitaries:

D* (U U || U  U )  D* (  ||  )
†

†

for any unitary
operator
U
Interestingly,
Dmin (  ||  )  D(  ||  )  Dmax (  ||  )
*
Operational significance of the Max- and Min- relative
entropies in:
One-shot Information Theory [Renner; ND]
asymptotic, memoryless
One-shot information theory
classical
info
x
x
E
input
encoding
N
single use
N
N(x )
channel
output
D
x'
decoding
One-shot classical capacity := max. number of bits that can be
transmitted on a single use
C (N)
(1)
:given in terms of a mutual information
obtained from the max-relative entropy
In Summary……
there is a plethora of different entropic quantities
which arise in Quantum Information theory

-- which are interesting both from the mathematical
and operational points of view;

hence it is desirable to have a
unifying mathematical framework
for the study of these different quantities.

Recently, such a framework was partially provided:
by a non-commutative generalization of the
  RRE
[Wilde et al; Muller-Lennert et al]

1
log Tr (
D  (  ||  ) :
 1

  Quantum Renyi Divergence
(sandwiched Renyi entropy)
(1 )
2

(1 )
2

)




Quantum Renyi Divergence
Recently, such a framework was partially provided:
  RRE
by a non-commutative generalization
  QRD
D  (  ||  ) :
IF [ ,  ]  0 THEN
[Wilde et al; Muller-Lennert et al]

1
log Tr (
 1

(1 )
2


Tr  

(1 )
2
(1 )





)


Tr(   1 )

  Quantum Renyi Divergence

Recently, such a framework was partially provided:
by a non-commutative generalization
[Wilde et al; Muller-Lennert et al]
  QRD

1
D  (  ||  ) :
log Tr 
 1

IF [ ,  ]  0 THEN
  RRE
  RRE
(1 )
2


Tr  

(1 )
2
(1 )




1
log Tr(   1 )
D (  ||  ) 
 1





“super-parent”:
D(  ||  )
Dmin (  ||  )
Dmax (  ||  )
 1
1
2
 
D  (  ||  )
if [ ,  ]  0
  RRE

Properties of
D  (  ||  )
D (  ||  )
properties of
Dmin (  ||  ), D(  ||  )
Dmax (  ||  )
“super-parent”:
D(  ||  )
Dmin (  ||  )
 1
  12
 
D  (  ||  )
 (
Joint convexity of D


Dmax (  ||  )
[Frank & Lieb]
||  )
for
1  1
2
 joint convexity of Dmin (  ||  ), D (  ||  )

D  (  ||  )
monotonically increasing in 
[Muller-Lennert et al]

for    .
 Dmin (  ||  )  D(  ||  )  Dmax (  ||  )
 (
Data-processing inequality for D

[Frank & Lieb; Beigi]
D (  ||  )  D (  ||  ),
||  )
for   1 2
 DPI for Dmin (  ||  ), D(  ||  ), Dmax (  ||  )
Limitations of the
  QRD

The data-processing inequality is not satisfied for

The important family of
from the
  QRD
  RRE

  0, 1 2
can only be obtained
in the special case of commuting operators
(Q) Can one define a more general family of relative
entropies which overcomes these limitations ?
(A) Yes!

The more general family is a……


Two-parameter family of relative entropies
D , z (  ||  );  , z  
  z relative Renyi entropies:   z  RRE
They stem from quantum entropic functionals defined by
Jaksic, Ogata, Pautrat and Pillet
for the study of entropic fluctuations in non-equilibrium
statistical mechanics
:
reference state of a dynamical system
  t : state resulting from 
due to time evolution under
the action of a Hamiltonian for a time
t.
*
  z  RRE
Definition:
  D (H );   P (H ) : supp   supp 
1
D , z (  ||  ) 
log f , z (  ||  )
 1
with the trace functional

 z
f , z (  ||  )  Tr   

, z 
Take limits for
  1;
z0

 Tr 

(1 )
2z

 2z
 Tr   

(1 )
z

 
z
(1 )
z



z
(1 )
2z


2z






z
z
*
Retrieving all other relative entropies
from the D , z (  ||  )
Quantum Renyi axioms for a relative entropy
†
†
D
(

||

)

D
(
U

U
||
U

U
)
Unitary invariance :  , z
 ,z


Tensor property:
D , z (    ||    )  D , z (  ||  )  D , z ( ||  )
*

Order Axiom:
z |   1 |
    D , z (  ||  )  0
    D , z (  ||  )  0
etc.
Order Axiom

z |   1 |
    D , z (  ||  )  0
1
D , z (  ||  ) 
log f , z (  ||  )
Let 0    1
 1
 r.t.p.
    log f , z (  ||  )  0

Proof:
    f , z (  ||  )  1
r.t.p.
    f , z (  ||  )  f , z (  ||  )
 f , z (  ||  ))  1
0  1
r.t.p.
    f , z (  ||  )  f , z (  ||  )
……….(a)
 z
Tr   


(1   )

z
For
0 
(1 )
z



z
 z
 Tr   


z
(1 )
z



z
z
 z 
 z 
Tr      Tr    







,
 1, x is operator monotone:       

z

z
 z 
 z    f (  ||  )
 ,z
f , z (  ||  ) =Tr      Tr    




(1   )
 (a) holds if 0    1, i.e. if
 1, i.e. z  (1   )


z
0  1

For

Similarly, for


 1
Order Axiom:
 z |   1 |
order axiom holds for z  (1   )
order axiom holds for z  (  1)
 
 D , z (  ||  )  0
 
 D , z (  ||  )  0
Data-processing inequality (DPI)
 : CPTP
D , z ( (  ) ||  ( ))  D , z (  ||  )
(Q) For which parameter ranges does
D , z
satisfy the DPI ?


Data-processing inequality for the
Theorem:
0  1
z  max( ,1   )
[KA,ND],
[Hiai]
  z  RRE
Data-processing inequality (DPI)
Proof of DPI in the blue region:
0    1;
z  max( ,1   )
D , z ( (  ) ||  ( ))  D , z (  ||  )
[Frank & Lieb]:

To prove DPI
it suffices to
prove that
f , z (  ||  )
(trace functional)
is jointly concave
for
0  1
Data-processing inequality (DPI) contd.
Joint concavity of

f ,z 
 : CPTP
Proof
DPI for
D , z
for
0  1
D , z ( (  ) ||  ( ))  D , z (  ||  )
Stinespring’s Dilation Theorem:
Action of

on a state
 
  D (H ) :
unitary
U (    )U *
 D (H  H 2 ) :

some fixed
: state in H
2
Tr2

Tr2 [U (    )U * ]

 ( )=Tr2 [U (    )U ]
*
Joint concavity of
Proof: contd.

Let

Set:
du:
f ,z 
DPI for
D , z
for
0  1
 ( )=Tr2 [U (    )U ]
*
normalized Haar measure on all unitaries on
I
du uAu  (Tr A). where  
N
*
X =U (    )U  D (H  H2 )
H2
*
dim H2
 (Tr2 X )  
du
(
I

u
)
X
(
I

u
)

*
=Tr2 [U (    )U ]  
*
Integral representation
= (  )  
f ,z  DPI for D , z for 0    1
Joint concavity of
1
D , z (  ||  ) 
log f , z (  ||  )
 1
D , z ( (  ) ||  ( ))  D , z (  ||  )  f , z ( (  ) ||  ( ))  f , z (  ||  )
 du (I  u ) U (    )U
  du V (    ) V
(  )   
u
*
u
*
(I  u )
*
Vu  (I  u ) U
f ,z  DPI for D , z for 0    1
Joint concavity of
1
D , z (  ||  ) 
log f , z (  ||  )
 1
D , z ( (  ) ||  ( ))  D , z (  ||  )  f , z ( (  ) ||  ( ))  f , z (  ||  )
Tensor
property
f , z ( (  ) ||  ( ))  f , z ( (  )   ||  ( )   )
 f , z (  du Vu (    ) Vu* ||  du Vu (   ) Vu* )
IF jointly concave
  du f , z (Vu (    )Vu* || Vu (   )Vu* )
  du f , z (    ||    )
 du (I  u ) U (    )U
  du V (    ) V
(  )   
u
*
u
unitary invariance
*
(I  u )
*
Vu  (I  u ) U
Joint concavity of
f ,z  DPI for D , z for 0    1
1
D , z (  ||  ) 
log f , z (  ||  )
 1
D , z ( (  ) ||  ( ))  D , z (  ||  )  f , z ( (  ) ||  ( ))  f , z (  ||  )
f , z ( (  ) ||  ( ))  f , z ( (  )   ||  ( )   )
 f , z (  du Vu (    ) Vu* ||  du Vu (   ) Vu* )
IF jointly concave
  du f , z (Vu (    )Vu* || Vu (   )Vu* )
unitary invariance
  du f , z (    ||    )
 f , z (    ||    ) normalization of the Haar measure
 f , z (  ||  )
 f , z ( (  ) ||  ( ))  f , z (  ||  )
Tensor property
*
In fact:To prove DPI it suffices to prove that

1
z
f , z ( A)  f , z ( A, K ) : Tr( A z KA
A  0, K

Why?
K )
is concave in
A.
[Carlen & Lieb]
fixed matrix
Because for
the choice
* z
0 I  A    0 


K 
;

0


0
0


f , z ( A)  f , z (  ||  )
Concavity of
f , z ( A, K ) in A

So focus on
proving concavity of


Joint concavity of
f , z (  ||  )
f , z ( A)
*


Concavity of

 z
*
f , z ( A) : Tr  A KA K 


for   (0,1); z  max{ ,1   }
Set p   ; q  1   ; z =
pq
z
z
1
f p ,q ( A)  Tr  A p KAq K * 

z
1
z
Concavity of
f p ,q ( A)
for
1
pq
p, q  (0,1)
*
Key ingredients: Pick functions
(holomorphic functions that map the upper half plane into itself)
Pick Functions: Holomorphic functions defined on the upper-half
plane:
I  () :    : Im   0
with their ranges in the closed upper half plane   : Im 
Then
f ( ) is a Pick function if
Im 
 0

Im   0  Im f ( )  0.
Re 
Also known as: Herglotz functions or Nevanlinna functions
Example of a Pick function

f ( )   p ; 0  p  1 defined on the cut plane
Let
 e
p
p log| | ip arg 
e
arg   ( ,  )
 |  | ei arg 
Im   0  arg   (0,  )
Im

 arg  p  p arg   (0,  )
( 0  p  1)
 Im   0
p
Hence
f ( )   p is a Pick function
Re 
Subclass of Pick functions

Let
( a, b)
P ( a, b)
be an open interval on the real axis
Then
P(a, b) : the subclass of Pick functions which can be
analytically continued across the interval (a, b) into the

lower half plane such that the continuation is by
reflection w.r.t. the real axis.
f ( )  f ( )
Im 


Loewner’s theorem:
P(a, b) : the class of functions
that are operator monotone on
( a, b)
a
b
Re 
Relevance of Pick functions for our proofs

A Pick function f ( )  P (a, b) has a unique canonical
integral representation of the form
b
1
t
f ( )       (
 2 ) d  (t )
t   t 1
a
where
  0;   ;
and
d  (t ) : a positive Borel measure on the real t  axis for which
b
f (iy )
;   Re f (i );
  lim
y 
iy
 (b)   (a)  lim
y 0

1
b
Im f (t  iy)dt


a
2
1
(
t

1)
d  (t )  

a
a
b
Stieltjes inversion
formula
Re 
We want to prove

Concavity of
This would
imply DPI
for
D , z (  ||  )
here
f p ,q ( A)
for
p, q  (0,1)

Concavity of
f p ,q ( A)
for
p, q  (0,1)
(Q) How do Pick functions enter into the proof?

A0
Consider 2 related functions:
F ( )  f p ,q (Q   R)
G ( )  f p ,q ( Q  R)
where: domain of

f p ,q
Q  0, R  R*
has been extended to complex matrices
The 2 functions are related :
F ( )  f p ,q (Q   R)
1
1
  f p ,q ( Q  R)   G ( )


f p ,q ( A)  Tr  A p KAq K * 
1
pq
homogeneous
of order 1
1
F ( )   G ( )

f p ,q ( A)
F ( )  f p ,q (Q   R); G ( )  f p ,q ( Q  R)
Claim: Concavity of f p ,q ( A) for A  0
amounts to proving : Concavity of F ( x), for

x  ;
(over the domain of
F)
Concavity of
f p ,q ( A)
for
A0
 Concavity of F ( x), for x  ;

Suppose
A1 , A2  0
& for
x  [0,1]
f p ,q ( xA1  (1  x) A2 )  xf p ,q ( A1 )  (1  x) f p ,q ( A2 )
Set
Q  A2 , R  A1  A2 ;
concavity
x
F ( x)  f p ,q (Q  xR);
F ( )  f p ,q (Q   R)
 f p ,q ( A2  x( A1  A2 ))
 f p ,q ( xA1  (1  x) A2 )  xf p ,q ( A1 )  (1  x) f p ,q ( A2 )
[ f p ,q ( A1 )  F (1); f p ,q ( A2 )  F (0)]
F ( x)  xF (1)  (1  x) F (0)
F ( )  f p ,q (Q   R); G ( )  f p ,q ( Q  R)
f p ,q ( A)
Concavity of
F ( x), for x  ;
implies DPI
Proof of concavity outline in 4 lines
G ( ) is a Pick function

Prove that

It hence has an integral representation


This carries over to
F ( x);
1
 F ( )   G ( )

Then proving concavity of F ( x) amounts to proving
concavity of the integral’s kernel
– which is straightforward!
To prove DPI for
0    1;
z  max( ,1   )
Concavity of F ( x), for x  ; (over the domain of
domain of holomorphy of
F)
F ( )  f p ,q (Q   R );
  x  iy, x  1/ | c |; c || Q || / min ( R)
c
2
x
F ( x)   x    
d  (t )
tx  1
c

kernel
2
x
g ( x) 
tx  1
2
0
g ( x) 
3
(tx  1)
for x  1/ | c | 1/ t
 F ( x)  0  F ( x)
concave
SUMMARY
Introduced a 2-parameter family of relative entropies,
-- that unifies the study of all known relative entropies
  z  relative Renyi entropies:   z  RRE

D , z (  ||  )

These satisfies the quantum generalizations of Renyi’s
axioms for a relative entropy.

Focus: For which parameter ranges does it satisfy the DPI ?

Proved the DPI for
0    1;
theory of Pick functions.
z  max( ,1   ) using the
DPI
?
DPI conjectured
Thank You!

Thanks to Koenraad Audenaert;

& to Felix Leditzky for preparing the figures.

Summary and conjectures regarding DPI
Regions of concavity (blue) and conjectured
convexity (orange) of the (reparametrizede) trace
functional
f p ,q ; p   / z ; q  (1-  ) / z
Download