A Quantitative Landauer's Principle
Philippe Faist,1 Frédéric Dupuis,1 Jonathan Oppenheim,2 and Renato Renner1
1Institute for Theoretical Physics, ETH Zurich, 8093 Switzerland
2Department for Physics and Astronomy, University College of London, WC1E 6BT London, U.K.
(Dated: November 7, 2012)
Landauer's Principle states that the work cost of erasure of one bit of information has a fundamental lower bound of $kT \ln(2)$. Here we prove a quantitative Landauer's principle for arbitrary processes, providing a general lower bound on their work cost. This bound is given by the minimum amount of (information theoretical) entropy that has to be dumped into the environment, as measured by the conditional max-entropy. The bound is tight up to a logarithmic term in the failure probability. Our result shows that the minimum amount of work required to carry out a given process depends on how much correlation we wish to retain between the input and the output systems, and that this dependence disappears only if we average the cost over many independent copies of the input state. Our proof is valid in a general framework that specifies the set of possible physical operations compatible with the second law of thermodynamics. We employ the technical toolbox of matrix majorization, which we extend and generalize to a new kind of majorization, called lambda-majorization. This allows us to formulate the problem as a semidefinite program and provide an optimal solution.
Introduction. Landauer's Principle [1, 2], and more generally the relation between the second law of thermodynamics and information theory, has received much attention in the past decades. Studies have notably focused on fundamental limits on heat generated by computation [2], the exorcism of Maxwell's demon via information theory (see e.g. [3]), and generalizations to quantum settings such as the characterization of entanglement through thermodynamical considerations [4], or the determination of the work cost of information erasure with the help of quantum side information [5].
Landauer's Principle can be stated in the following way. Consider the erasure process of an unknown bit, i.e. the logical operation that resets the bit to a reference state (e.g. zero). Landauer's Principle asserts that any physical implementation that performs this erasure, using a heat bath at temperature $T$, has a work cost of at least $kT \ln(2)$, where $k$ is the Boltzmann constant. More generally, Landauer noted that all irreversible operations, and not only the erasure of a bit, must cost work due to the transfer of entropy from the information-bearing degrees of freedom to the environment, which causes the system to dissipate heat. Bennett refined the formulation of this principle and showed its relevance in thermodynamics (exorcising the Maxwell demon [2, 3]) and in computation [6].
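To give a sense of scale, the bound $kT \ln(2)$ can be evaluated numerically. The following is a minimal sketch; the function name `landauer_bound_joules` and the choice of 300 K as an example temperature are illustrative assumptions, not part of the original text.

```python
import math

# Boltzmann constant in J/K (exact value in the 2019 SI definition).
BOLTZMANN_K = 1.380649e-23


def landauer_bound_joules(temperature_kelvin: float, bits: float = 1.0) -> float:
    """Lower bound kT ln(2) per bit, in joules, for erasing `bits` unknown bits.

    Hypothetical helper for illustration only.
    """
    if temperature_kelvin < 0:
        raise ValueError("temperature must be non-negative")
    return bits * BOLTZMANN_K * temperature_kelvin * math.log(2)


# Example temperature (an assumption): a heat bath at 300 K.
print(landauer_bound_joules(300.0))
```

At 300 K this evaluates to roughly $2.87 \times 10^{-21}$ J per bit.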
The work cost of thermodynamic processes in the context of information theory has been studied for various classical and quantum systems. Szilard [7] originally considered a single-particle gas enclosed in a box with a piston and noted that $kT \ln(2)$ work could be reversibly extracted from the gas at the expense of losing the information about which side of the piston the particle is on. The reverse process corresponds to erasing this information, bringing the particle on one definite side at $kT \ln(2)$ work cost. Landauer [1, 8] studied the example of a particle in a double-V shaped potential, which represents a bit of information, and showed that its erasure costs work. While these results apply to fully unknown bits, the bounds have to be adapted if the system we erase is partially known. In such a case, the average amount of work needed is lower bounded by $kT \ln(2)\, H(X)$, where $H(X)$ is the Shannon entropy of the system $X$ and where the average is taken over many independent repetitions of the erasure process [3, 9, 10]. This result has been derived and extended in several contexts, such as quantum computers performing data compression [11], Hamiltonian models [12], or a resource theory framework [13, 14]. This bound can also be generalized to other processes, for which the average work cost is then given by the amount of entropy the process transfers into the environment. We refer to Janzing [15, 16] for a proof in a resource theory framework.
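The average bound $kT \ln(2)\, H(X)$ for a partially known system can be illustrated as follows. This is a sketch under the assumption that $X$ is a classical register described by a probability vector; the function names are hypothetical.

```python
import math


def shannon_entropy_bits(probs):
    """Shannon entropy H(X) in bits of a probability vector."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log2(p) for p in probs if p > 0)


def average_erasure_cost_joules(probs, temperature_kelvin):
    """Average lower bound kT ln(2) H(X) on the erasure work, in joules."""
    boltzmann_k = 1.380649e-23  # J/K
    return boltzmann_k * temperature_kelvin * math.log(2) * shannon_entropy_bits(probs)


# A fully unknown bit costs one unit of kT ln(2) on average;
# a biased bit (here 90% / 10%, an illustrative choice) costs less.
print(shannon_entropy_bits([0.5, 0.5]))
print(shannon_entropy_bits([0.9, 0.1]))
```

The biased bit has an entropy of about 0.47 bits, so its average erasure cost is less than half that of a fully unknown bit.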
Generalizations to a single-shot regime, where statements are made about individual processes rather than many repetitions of them, have been proposed, for example in terms of majorization conditions [13], and in terms of entropic quantities which take into account approximate transitions and a probability of failure [17-19]. Explicit Hamiltonian models have also been used to study the case of erasure with quantum side information [5]. It is usually assumed that the system carrying the information has a degenerate Hamiltonian. More recently, these thermodynamic considerations have been extended to the case of non-degenerate Hamiltonians [19-21], and the majorization condition also adapted to this scenario [17, 19, 21], based on ideas from [22-25].
In the present article, we revisit Landauer's principle in the light of general quantum processes. Our main result is an explicit and rigorous expression for the fundamental minimal work cost of any process $\mathcal{E}$ that acts on a system $X$ and brings it from a state $\sigma$ to a new state $\rho$. The bound is robust, i.e. it holds even if one tolerates an error probability $\epsilon$. The work cost $W$ of such a process is lower bounded by the amount of entropy that has to be dumped into the environment, as measured by the smooth conditional max-entropy [26, 27],
\begin{equation}
  W \geq kT \ln(2)\, H_{\max}^{\epsilon}(E|X')_{\rho} . \tag{1}
\end{equation}
Here, the entropy is evaluated for the state $\rho_{X'RE}$, which is a purification of the output state obtained by applying the process $\mathcal{E}$ to a purification of the input state $\sigma_X$ (see Proposition 3). The entropy measure, $H_{\max}^{\epsilon}$, is part of the smooth entropy framework that is widely used in single-shot quantum information theory [26-30]. Its formal definition will be given later.
arXiv:1211.1037v1 [quant-ph] 5 Nov 2012
Our quantitative Landauer's Principle is tight up to logarithmic terms in the failure probability of the implementation of the process $\mathcal{E}$. Indeed, we can devise an explicit process carrying out the requested mapping $\mathcal{E}$ that is nearly optimal. This near-optimal process is based on the scheme proposed by del Rio et al. [5], which erases a system using available quantum side information.
Our bound is valid in a general framework that specifies the set of physically allowed operations. This framework conceptually separates the operations that are intrinsically thermodynamical (e.g., the erasure of information) from those that simply correspond to reversible information processing (e.g., unitaries and the addition of ancillas). The former will be those that cost work or that are capable of extracting work from a system; the latter are done for free, i.e. at no work cost. We assume that our systems have a completely degenerate Hamiltonian. The set of allowed operations is motivated by the second law of thermodynamics, which forbids cyclic processes whose net effect is to extract work.
For the proofs we use a characterization of our framework by a relatively simple and intuitive generalization of the notion of majorization, which is inspired by previous work where the eigenvalues of the input are rescaled until the input majorizes the output [17, 19], achieved for example by appending a work system [21]. We term our generalization lambda-majorization, and provide a mathematical characterization of this notion in terms of completely positive maps that satisfy some normalization conditions.
In the asymptotic limit of many independent and identically distributed (i.i.d.) copies of these systems (i.e., the process is repeated $n$ independent times, $\mathcal{E}^{\otimes n}$, on $n$ i.i.d. input states $\sigma^{\otimes n}$), we obtain as a corollary of our main result a value for the average work cost of erasure per copy,
\begin{equation}
  W \geq \left[ H(X)_{\sigma} - H(X')_{\rho} \right] kT \ln(2) , \tag{2}
\end{equation}
which is in agreement with the informal formulation of Landauer's principle, that the work cost of any process is determined by the decrease of entropy in the information-bearing degrees of freedom (see [16] for a proof in a resource framework).
We should point out that the general bound (1) can be arbitrarily larger than the average bound (2). This deviation highlights an important feature, namely that correlations between the input and the output of the transformation play a significant role in the single-shot regime. It is important to not only consider the input and output states, but also the whole process, or computation, that is performed on the actual input. This is natural and generalizes the classical case where this consideration is obvious, since a classical computer acts on the actual state of a register and not on its probability distribution. In the quantum case, we specify the full algorithm (or computation) as a completely positive map, which inherently tells us which correlations are preserved between the input and output systems. While the transformation of a state into another (e.g. in a resource theoretic approach) is a relevant question, we focus in this paper on the case where the computation is given, thus fixing all the correlations that are preserved or destroyed between the input and the output.
As a simple example, consider $X$ to be a fully mixed qubit, i.e. in the state $\sigma_X = \frac{1}{2}\mathbb{1}_2$. Suppose we wish to transform this state into another fully mixed qubit again, $\rho_{X'} = \frac{1}{2}\mathbb{1}_2$. There are two obvious processes that achieve this goal: we may (a) simply copy the input qubit to the output, or (b) throw away the input and prepare a new fully mixed qubit. Both processes (a) and (b) provide the required output. However, if we had information about the specific state in which the qubit initially was (e.g. suppose we had kept a qubit $C$ that was maximally entangled with the input), then in the case of process (a), $C$ would remain entangled with the output, whereas in the case of (b), $C$ would have lost all correlations with the output qubit. In this first example, both processes cost no work: (a) is the identity process, and in (b), the work dissipated to erase the qubit is retrieved again when we prepare a new mixed qubit.
However, the work costs of these processes differ if we consider less trivial input and output states. Let $X$ be a quantum system composed of $n + 1$ qubits, in a state $\sigma_X$ where the first qubit is randomly zero or one with probability 1/2, and the $n$ remaining qubits are either all zero if the first qubit is zero, or all in a fully mixed state if the first qubit is one. This state has the distribution $\{1/2, 2^{-(n+1)}, 2^{-(n+1)}, \ldots, 2^{-(n+1)}\}$ and is depicted in Figure 1. Suppose that we wish to bring this system into the state $\rho_{X'} = \sigma_X$, i.e. the same state as the input state, using either process (a) or (b) again. Process (a) would simply copy the input to its output, and would not cost any work, since it is the identity channel. However, process (b) first has to erase the input state and then prepare the output state. If we are lucky, the $n$ qubits are in state $|0 \ldots 0\rangle\langle 0 \ldots 0|$ (if the first qubit is $|0\rangle$) and we can just erase the first qubit using $kT \ln(2)$ work. However, if we want to erase the system with certainty, we have to consider the worst case in which we have to erase $n$ fully mixed qubits (which occurs with the non-negligible probability 1/2). So the erasure work cost may be as bad as $(n+1)\, kT \ln(2)$. In order to prepare this state again as the output of the process, we may think of tossing a coin to decide in which state, $|0 \ldots 0\rangle\langle 0 \ldots 0|$ or $|1\rangle\langle 1| \otimes 2^{-n}\mathbb{1}_{2^n}$, to prepare $X$ in. If we are lucky, we have to prepare a mixed state on $n$ qubits and extract $n\, kT \ln(2)$ work in the process, but in the worst case, we have to prepare $|0 \ldots 0\rangle\langle 0 \ldots 0|$ and cannot extract more than just $kT \ln(2)$ (from the coin toss). Hence, in the worst case, process (b) costs a total of $n\, kT \ln(2)$ work, which can be arbitrarily larger than the (zero) cost of process (a); in fact, the gap diverges as $n \to \infty$.
FIG. 1: The probability distribution of a state in which single-shot effects become important, even for large systems. A register of $n + 1$ qubits is in a state such that if the first qubit is zero (with probability 1/2), then all the rest are zero too; if the first qubit is one (with probability 1/2), then all the rest are in a fully mixed state. The spectrum has a large eigenvalue (1/2), but also has a large support size ($2^n + 1$); as a consequence, $H_{\min}(\sigma) \approx 1$ and $H_{\max}(\sigma) \approx n$ can differ by an arbitrary amount.
This example shows that in the general single-shot regime, the specification of only the input state $\sigma_X$ and the output state $\rho_{X'}$ does not suffice, and correlations between the input and the output contribute to determine the minimal work cost of the process (although these correlations are not relevant in the asymptotic i.i.d. regime). Our result (1) incorporates this property intrinsically and provides a bound that is valid for any given process.
The remainder of this paper is organized as follows. We will first present the mathematical framework used to model thermodynamic processes. We then introduce lambda-majorization, which captures all possible operations in our framework. Lambda-majorization is characterized in terms of completely positive maps that satisfy some specific normalization conditions, and we use this characterization to derive the main result by formulating the problem as a semidefinite program. The latter is solved by providing optimal primal and dual feasible plans with the same value, which guarantees optimality of the result. Finally, some special cases are derived which recover some previously known results.
Framework. Consider a quantum mechanical system $X$ in an initial state described by the density operator $\sigma$. Our task is to bring the system $X$ to another state $\rho$, while attempting to maximize some notion of extracted work in the process. Throughout this paper we assume that the Hamiltonians of the systems we consider are completely degenerate.
We first postulate two basic operations of thermodynamical nature, involving a heat bath at temperature $T$: the erasure of a single qubit to a pure state at $kT \ln(2)$ work cost, and the corresponding reverse process which extracts $kT \ln(2)$ work by transforming a pure state into a fully mixed state. Here $k$ is the Boltzmann constant. These operations are motivated by the variety of explicit physical thermodynamical frameworks in which they can be performed, for example using Szilard boxes [7, 18] or by isothermally manipulating energy levels of Hamiltonians [5, 12, 20]. Crucially, we assume the second law of thermodynamics, and require that there exist no operation that would allow us to form a cycle for which the net effect would be the extraction of work. This justifies that no other work extraction procedure can yield more work than $kT \ln(2)$ from a pure qubit, or else a cycle with net work gain could be formed by appending an erasure process, itself only costing $kT \ln(2)$.
Apart from this constraint on the set of allowed operations, it is natural to also allow usual quantum information processing. Since our Hamiltonians are degenerate, we can allow all global unitaries, and they cost no work. We do not need to use the fact that these unitaries are implementable by a device operating in contact with a heat bath, since expanding the class of allowable operations actually strengthens the bound we derive. In practice, one has very crude local control over the operations, and the acting agent does not know which unitary is being implemented; however, this is actually not an obstacle for implementation [11, 14]. In addition to unitaries, we will allow pure ancillas to be added to the system, which permits more general computation. Crucially, ancillas will have to be restored to their initial pure state, so that it is not possible to hide a work cost in an ancilla that was left mixed.
The following framework is motivated by the above considerations. The processes we allow are (finite) combinations of the following elementary operations:
(a) Bring $n$ qubits (of the system $X$ or an ancilla $A$) from any state to a pure state (erasure) at cost $n\, kT \ln 2$ work;
(b) Bring $n$ qubits (of the system $X$ or an ancilla $A$) from a pure state to a fully mixed state while extracting $n\, kT \ln 2$ work;
(c) Add and remove ancillas in a pure state at no work cost, as long as all the ancillas have been restored to their initial pure state at the end of the process;
(d) Perform arbitrary unitaries (over $X$ and any added ancillas) at no work cost.
Operations (a) and (b) are those of thermodynamical nature, and may be carried out in a wide range of existing frameworks as mentioned above. One may view these operations as defining a quantity which we call work.
On the other hand, operations (c) and (d) are purely information-theoretical. They allow us to perform any quantum information processing circuit, since we allow pure ancillas to be added. However, there is the condition that randomness may not be disposed of for free, namely that ancillas have to be restored to their initial pure states at the end of the process.
Lambda-Majorization. We will now provide a simple mathematical characterization of all operations allowed in our framework.
First, note that the operations (a)-(d) allow the use of so-called noisy operations [13], which correspond to adding an ancilla system $N$ in a fully mixed state, performing a joint unitary, and removing the ancilla. Specifically, a noisy operation is composed in our framework of first an operation of type (c) (adding a pure ancilla of $n$ qubits), followed by an operation of type (b) (extracting $n\, kT \ln 2$ work from the ancilla, making it fully mixed), then one of type (d) (performing the necessary unitary to carry out the noisy operation), and finally an operation of type (a) (erasing the ancilla back to its pure state at a work cost $n\, kT \ln 2$). The total process has a work balance of zero. This means that we may carry out noisy operations for free within our framework and use them as building blocks for more complex processes. In the following, we deal implicitly with the ancilla $N$, and it should not be confused with further ancillas that will be added.
The following result by Horodecki et al. [13] relates noisy operations to the mathematical notion of majorization [31, 32].
Noisy Operations and Majorization. The transition on system $X$ from state $\sigma$ to state $\rho$ is possible by noisy operation if and only if $\sigma \succ \rho$.
Majorization between two (normalized) states, $\sigma \succ \rho$, captures the fact that $\rho$ is more mixed than $\sigma$, or that the eigenvalues of $\rho$ can be written as a mixture of the eigenvalues of $\sigma$. Formally, majorization can be characterized by the existence of a unital, trace-preserving completely positive map that brings $\sigma$ to $\rho$ [33-36]. A channel $\mathcal{E}$ is trace-preserving if $\mathcal{E}^{\dagger}(\mathbb{1}) = \mathbb{1}$ and unital if $\mathcal{E}(\mathbb{1}) = \mathbb{1}$.
Proposition 1. Two positive matrices $\sigma$ and $\rho$ satisfy $\sigma \succ \rho$ if and only if there exists a trace-preserving, unital, completely positive map $\mathcal{E}$ satisfying $\mathcal{E}(\sigma) = \rho$.
The notion of majorization is discussed in more detail in Appendix A.
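For states that are diagonal in the same basis, majorization reduces to a comparison of partial sums of the sorted eigenvalues. The following sketch checks this condition for two probability vectors; the function name `majorizes` and the zero-padding of the shorter vector are illustrative choices.

```python
def majorizes(p, q, tol=1e-12):
    """Return True if the spectrum p majorizes the spectrum q.

    Both vectors are sorted in decreasing order and zero-padded to the same
    length; p majorizes q if every partial sum of p is at least the
    corresponding partial sum of q and the totals agree.
    """
    n = max(len(p), len(q))
    ps = sorted(p, reverse=True) + [0.0] * (n - len(p))
    qs = sorted(q, reverse=True) + [0.0] * (n - len(q))
    if abs(sum(ps) - sum(qs)) > tol:
        return False
    cum_p = cum_q = 0.0
    for a, b in zip(ps, qs):
        cum_p += a
        cum_q += b
        if cum_p < cum_q - tol:
            return False
    return True


# A pure state majorizes the fully mixed state, but not conversely.
print(majorizes([1.0, 0.0], [0.5, 0.5]))
print(majorizes([0.5, 0.5], [1.0, 0.0]))
# Majorization is only a partial order: these two spectra are incomparable.
print(majorizes([0.5, 0.25, 0.25], [0.4, 0.4, 0.2]))
print(majorizes([0.4, 0.4, 0.2], [0.5, 0.25, 0.25]))
```

The last pair shows that neither state can be reached from the other by a noisy operation alone, which is the situation where an ancilla carrying or absorbing randomness becomes useful.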
We will now provide some background insight for the meaning of our new concept of lambda-majorization. The idea is to characterize how well a state $\sigma$ majorizes a state $\rho$. Suppose that we have a system $X$ in state $\sigma_X$ and we want to bring it to the state $\rho_X$, where $\sigma_X \succ \rho_X$. In this case, one can simply carry out a noisy operation as described above. Suppose now that we have an ancilla $A$ that is in a fully mixed state, $\mathbb{1}_A/|A|$, and suppose that we are fortunate enough for $\sigma_X \otimes \mathbb{1}_A/|A| \succ \rho_X \otimes |0\rangle\langle 0|_A$ to also hold (for some pure state $|0\rangle_A$ on $A$). Then applying a joint noisy operation on both systems would correspond to actually erasing the system $A$ for free during the transition $\sigma \to \rho$. We could then say that the randomness of the ancilla $A$ was transferred into system $X$. We will view this type of transition as work extraction on system $X$ during a transition $\sigma_X \to \rho_X$.
FIG. 2: Lambda-Majorization corresponds to absorbing a certain amount of randomness from an ancilla during a unitary operation. The system $X$ starts in state $\sigma$, and the ancilla $A$ in a state with $\lambda_1$ fully mixed qubits with the remaining qubits pure. The goal is to devise a global unitary that will bring the system $X$ to the state $\rho$, while leaving the least possible number $\lambda_2$ of fully mixed qubits in $A$. The difference $\lambda = \lambda_1 - \lambda_2$ is the work extracted by the process; if the value is negative, it corresponds to a work cost. In the main text, we allow a noisy operation instead of a unitary operation, but one could simply add more mixed qubits to the ancilla on each side and use those to implement a noisy operation with a unitary.
In another situation, it might be that $\sigma_X \nsucc \rho_X$. However, in that case, for a large enough ancilla $A$ the majorization $\sigma_X \otimes |0\rangle\langle 0|_A \succ \rho_X \otimes \mathbb{1}_A/|A|$ will hold. The corresponding noisy operation then leaves us with a mixed ancilla that started off pure; we will view such a transition on system $X$ as costing work.
Such operations can be performed within our framework, using operations (a)-(d). In particular, the relation to work is given by elementary erasure and work extraction (operations (a) and (b)) applied to the ancilla $A$ after the transition to restore it to its initial state.
In general, the ancilla $A$ may start with $\lambda_1$ mixed qubits and end up with $\lambda_2$ mixed qubits after a noisy operation; in this case we consider that an amount $(\lambda_1 - \lambda_2)\, kT \ln(2)$ of work has been extracted. This situation is depicted in Figure 2. Both considerations above about work cost and work extraction are encompassed, simply because we count the difference in the amount of randomness present in the ancilla before and after the process. This is the idea behind the concept of lambda-majorization, whose definition we can now state.
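Lambda-Majorization. For two density operators $\sigma_X$, $\rho_Y$ on two systems $X$ and $Y$, we will say that $\sigma_X$ $\lambda$-majorizes $\rho_Y$, denoted by $\sigma_X \overset{\lambda}{\succ} \rho_Y$, if there exists a (large enough) ancilla system $A$, as well as $\lambda_1, \lambda_2 \geq 0$ with $\lambda = \lambda_1 - \lambda_2$, such that
\[
  2^{-\lambda_1}\mathbb{1}_{2^{\lambda_1}} \otimes \sigma_X \;\succ\; 2^{-\lambda_2}\mathbb{1}_{2^{\lambda_2}} \otimes \rho_Y ,
\]
where $2^{-\lambda_1}\mathbb{1}_{2^{\lambda_1}}$ and $2^{-\lambda_2}\mathbb{1}_{2^{\lambda_2}}$ are fully mixed states on $\lambda_1$ (respectively $\lambda_2$) qubits of $A$, and where the remaining qubits of $A$ in each case are pure.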
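For spectra of commuting states, the definition above can be tested directly by tensoring each spectrum with a uniform distribution and checking ordinary majorization. The sketch below does this by brute force; it restricts $\lambda_1$ and $\lambda_2$ to integer numbers of qubits up to a small cutoff, which is an assumption made for simplicity, and all function names are hypothetical.

```python
def majorizes(p, q, tol=1e-12):
    """True if spectrum p majorizes spectrum q (zero-padded to equal length)."""
    n = max(len(p), len(q))
    ps = sorted(p, reverse=True) + [0.0] * (n - len(p))
    qs = sorted(q, reverse=True) + [0.0] * (n - len(q))
    cum_p = cum_q = 0.0
    for a, b in zip(ps, qs):
        cum_p += a
        cum_q += b
        if cum_p < cum_q - tol:
            return False
    return abs(cum_p - cum_q) <= tol


def with_mixed_qubits(spectrum, num_qubits):
    """Spectrum of the state tensored with num_qubits fully mixed qubits."""
    dim = 2 ** num_qubits
    return [p / dim for p in spectrum for _ in range(dim)]


def lambda_majorizes(sigma, rho, lam1, lam2):
    """Check the majorization condition of the definition for given lam1, lam2."""
    return majorizes(with_mixed_qubits(sigma, lam1), with_mixed_qubits(rho, lam2))


def best_integer_lambda(sigma, rho, max_qubits=4):
    """Largest lam1 - lam2 found by brute force over integer qubit counts."""
    best = None
    for lam1 in range(max_qubits + 1):
        for lam2 in range(max_qubits + 1):
            if lambda_majorizes(sigma, rho, lam1, lam2):
                if best is None or lam1 - lam2 > best:
                    best = lam1 - lam2
    return best


mixed_qubit = [0.5, 0.5]
pure_qubit = [1.0, 0.0]
print(best_integer_lambda(mixed_qubit, pure_qubit))  # erasure of one qubit
print(best_integer_lambda(pure_qubit, mixed_qubit))  # the reverse process
```

Erasing a fully mixed qubit gives $\lambda = -1$ (a cost of $kT \ln(2)$), and the reverse process gives $\lambda = +1$, in line with elementary operations (a) and (b).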
An expression for by how much a state majorizes another was originally introduced in [17] and used in [19], in the context of work extraction games from Szilard boxes. Their measure, the relative mixedness between $\sigma$ and $\rho$, corresponds to the optimal $\lambda$ such that $\sigma \overset{\lambda}{\succ} \rho$.
Lambda-majorization captures the possible processes that are allowed in our framework. Indeed, if $\sigma \overset{\lambda}{\succ} \rho$, then one has $2^{-\lambda_1}\mathbb{1}_{2^{\lambda_1}} \otimes \sigma \succ 2^{-\lambda_2}\mathbb{1}_{2^{\lambda_2}} \otimes \rho$ for some $\lambda_1, \lambda_2$ with $\lambda = \lambda_1 - \lambda_2$. Hence, there exists a noisy operation (itself a combination of operations (a)-(d) with zero total work cost) that performs the transition from $2^{-\lambda_1}\mathbb{1}_{2^{\lambda_1}} \otimes \sigma$ to $2^{-\lambda_2}\mathbb{1}_{2^{\lambda_2}} \otimes \rho$. The $\lambda_1$ mixed qubits that we have appended to $\sigma$ can be created by appending a large pure ancilla (operation (c)), and using operation (b) to extract $\lambda_1\, kT \ln(2)$ work from $\lambda_1$ qubits, rendering them fully mixed. At the end of the process, after the noisy operation, we need to restore the ancilla to a pure state; we thus need to erase (operation (a)) the remaining $\lambda_2$ qubits, costing $\lambda_2\, kT \ln(2)$ work. The total extracted work is then $(\lambda_1 - \lambda_2)\, kT \ln(2) = \lambda\, kT \ln(2)$. Conversely, each individual operation (a)-(d), individually transforming some state $\sigma$ into a state $\rho$ and costing work $W$, implies the lambda-majorization $\sigma \overset{\lambda}{\succ} \rho$ with $W = -\lambda\, kT \ln(2)$. This is clear for operations (c) and (d). For operations (a) and (b), this follows from results derived in Appendix A 3.
The ancilla system above may be viewed as some kind of information battery, as suggested by Bennett [2], who proposed using a blank memory tape as fuel to extract work. In this case, the ancilla can be used as a storage of purity (or as a storage for mixedness or randomness which we would like to get rid of), which is increased or decreased by processes like the ones suggested above.
It turns out that one can characterize lambda-majorization by the existence of a completely positive map satisfying some special normalization conditions, analogously to Proposition 1.
Proposition 2. Two normalized density matrices $\sigma_X$ and $\rho_Y$ on two systems $X$ and $Y$ satisfy $\sigma_X \overset{\lambda}{\succ} \rho_Y$ if and only if there exists a completely positive map $\mathcal{T}_{X \to Y}$ satisfying $\rho_Y = \mathcal{T}_{X \to Y}(\sigma_X)$, such that $\mathcal{T}^{\dagger}_{X \to Y}(\mathbb{1}_Y) \leq \mathbb{1}_X$ and $\mathcal{T}_{X \to Y}(\mathbb{1}_X) \leq 2^{-\lambda}\mathbb{1}_Y$.
A channel $\mathcal{T}_{X \to Y}$ that satisfies the last two conditions will be referred to as a lambda-majorization channel.
Furthermore, although the channel $\mathcal{T}$ is not directly a physical channel (it can be, for example, trace-decreasing), it can always be viewed as part of a unital channel $\mathcal{E}$, in the sense that $\mathcal{T}$ can be obtained by projection onto specific subspaces and tracing out the ancilla $A$ of the channel $\mathcal{E}$ (see Appendix A 2). In turn, unital channels are a (strict [37]) superset of the noisy operations. Recall that our task is to find a lower bound on the work cost of all possible processes allowed in our framework, which we will do by optimizing the work cost over all processes that perform a given state transition. However, instead of considering only the unital channels $\mathcal{E}$ that are noisy operations, we will relax this last condition and consider all unital channels $\mathcal{E}$, and thus allow the optimization to range over all $\mathcal{T}$ that satisfy the conditions of the above proposition. This will make our lower bound even stronger, by showing that the lower bound still holds even if we relax somewhat the assumptions in our framework.
FIG. 3: Our main result gives a fundamental lower bound on the work cost $W$ of a process transforming a state $\sigma_X$ (purified by a fictitious $|\sigma\rangle_{XR}$) into a new state $\rho_{X'R}$ obtained by applying a process $\mathcal{E}_{X \to X'}$. The lower bound to the work cost is given by the entropy that the process $\mathcal{E}$ has to dump into the environment $E$ (in which $\rho_{X'R}$ is purified), as measured by the Rényi-zero conditional entropy $H_0(E|X')_{\rho}$.
Main Result. We are now ready to derive our main result. Consider a system $X$ in the state $\sigma_X$. This system can always be purified by a reference system, $R$, in a pure joint state $|\sigma\rangle_{XR}$.
Allowing actions defined by our framework on $X$, we will study the transition of this state to a state $\rho_{X'R}$, by applying a process $\mathcal{T}_{X \to X'}$. The systems are depicted in Figure 3.
The task we would like to solve is the following. Given $\sigma_X$ and a process $\mathcal{E}_{X \to X'}$, and given a purification $|\sigma\rangle_{XR}$ of $\sigma_X$ and an output state $\rho_{X'R} = \mathcal{E}(\sigma_{XR})$, we would like to find the least amount of work $W$ one has to pay for any process in our framework that implements the action of $\mathcal{E}$ on $\sigma$. As we have seen in the previous section, we can formulate within our framework all possible processes as lambda-majorizations, so our task is actually to find the best $\lambda$ such that $\sigma_X \overset{\lambda}{\succ} \rho_{X'}$, with the corresponding lambda-majorization channel $\mathcal{T}$ from Prop. 2 satisfying $\mathcal{T}(\sigma_{XR}) = \rho_{X'R}$.
Our main result gives an upper bound on the optimal amount of work that can be extracted by this transition, or equivalently, a lower bound on the minimum amount of work that will have to be paid in order to perform the transition. The main result follows directly from the following technical proposition.
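We are given an input state $\sigma_X$ and a process $\mathcal{E}_{X \to X'}$. Let $|\sigma\rangle_{XR}$ be a purification of $\sigma_X$, and let $\rho_{X'R} = \mathcal{E}_{X \to X'}(\sigma_{XR})$. Let also $\rho_{X'RE}$ be a purification of $\rho_{X'R}$ in an environment system $E$. The Rényi-zero entropy $H_0(E|X')_{\rho}$ [26, 38] is defined by
\begin{equation}
  H_0(E|X')_{\rho} = \max_{\substack{\omega_{X'} \geq 0 \\ \operatorname{tr}\omega_{X'} = 1}} \log \operatorname{tr}\left[ \Pi^{\rho}_{X'E}\, \omega_{X'} \right] , \tag{3}
\end{equation}
where $\Pi^{\rho}_{X'E}$ is the projector on the support of $\rho_{X'E}$ and the logarithm is taken to base 2.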
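Proposition 3. With the notation above, the $\lambda$-majorization $\sigma_X \overset{\lambda}{\succ} \rho_{X'}$ holds, with the channel $\mathcal{T}_{X \to X'}$ from Prop. 2 satisfying $\mathcal{T}(\sigma_{XR}) = \rho_{X'R}$, if and only if $\lambda \leq -H_0(E|X')_{\rho}$.
Main Result. Any process in our framework acting on system $X$ that implements the channel $\mathcal{E}$ when given input $\sigma_X$ (or equivalently, that brings the state $\sigma_{XR}$ to the state $\rho_{X'R}$) has to cost at least $kT \ln(2)\, H_0(E|X')_{\rho}$ work.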
In other words, the minimal work cost of a transition from $\sigma$ to $\rho$ is given by the amount of (information-theoretic) entropy dumped into the environment, conditioned on the output of the computation. This is precisely the quantitative generalization to correlated quantum systems of the original Landauer's principle [1].
It is worth noting that instead of specifying the channel $\mathcal{E}$, we may also simply specify the output state $\rho_{X'R}$, which completely determines the process (on the support of $\sigma_X$) since it is the Choi-Jamiołkowski state corresponding to $\mathcal{E}$ rescaled by $\sigma_X$ ($\rho_{X'R} = \mathcal{E}(\sigma_{XR})$). One can thus understand the input to the problem to actually be a bipartite state $\rho_{X'R}$, such that $X'$ is the required output, $R$ is the input that will be fed into the process, and any correlations between $X'$ and $R$ specify parts of the output that we wish to be preserved and not be modified, or thermalized, by the process.
The full proof of Prop. 3 is provided in the appendix. We provide the general idea of the proof in the following.
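Proof Sketch of the Main Result. The main idea of the proof is to write the optimization problem as a semidefinite program for the variables $\alpha = 2^{-\lambda}$, $T_{XX'}$ (the Choi-Jamiołkowski representation of $\mathcal{T}_{X \to X'}$), and the dual variables $\omega_{X'}$, $X_X$ and $Z_{X'R}$. Let $(\cdot)^{t_X}$ denote the partial transpose operation on $X$. The optimal extracted work is given by the following semidefinite program:
\begin{align*}
\textbf{Primal}\quad \text{minimize:}\quad & \alpha \\
\text{subject to:}\quad & \operatorname{tr}_{X}\left[T_{XX'}\right] \leq \alpha\,\mathbb{1}_{X'} && : \omega_{X'} \\
& \operatorname{tr}_{X'}\left[T_{XX'}\right] \leq \mathbb{1}_{X} && : X_{X} \\
& \operatorname{tr}_{X}\left[T_{XX'}\,\sigma_{XR}^{t_X}\right] = \rho_{X'R} && : Z_{X'R} \\[1ex]
\textbf{Dual}\quad \text{maximize:}\quad & \operatorname{tr}\left(Z_{X'R}\,\rho_{X'R}\right) - \operatorname{tr} X_{X} \\
\text{subject to:}\quad & \operatorname{tr}\omega_{X'} \leq 1 \\
& \operatorname{tr}_{R}\left[\sigma_{XR}^{t_X}\, Z_{X'R}\right] \leq \mathbb{1}_{X} \otimes \omega_{X'} + X_{X} \otimes \mathbb{1}_{X'}
\end{align*}
The optimal value $\alpha = 2^{H_0(E|X')_{\rho}}$ is achieved (see Appendix B) by the completely positive map $\mathcal{T}_{X \to X'}(\cdot) = \operatorname{tr}_E\left[ V_{X \to X'E}\, (\cdot)\, V^{\dagger} \right]$, where $V_{X \to X'E}$ is the partial isometry with minimal support relating $\sigma_{XR}$ to $\rho_{X'ER}$ (both being purifications of the same $\rho_R = \sigma_R$).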
While it is clear from the formulation of our problem that $\mathcal{T}$ is already completely determined on the support of $\sigma_X$ (expressed by the condition $\mathcal{T}(\sigma_{XR}) = \rho_{X'R}$), the optimization over $\mathcal{T}$ is done in order to (at least formally) find the optimal action on the complement of the support of $\sigma_X$. Also, the formulation of a lambda-majorization problem as a semidefinite program is a more general toolbox that could be used in the case where the mapping is not completely determined and where arbitrary additional semidefinite conditions can be imposed at will.$^1$
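Allowing a Probability of Error. A smooth version of the result is straightforward to obtain. In this case, we allow the actual process to not exactly implement $\mathcal{E}$, but only approximate it well. The best strategy to detect this failure is to prepare $|\sigma\rangle_{XR}$ and send $X$ into the process, and then perform a measurement on $X'R$. To ensure the probability of error does not exceed $\epsilon$, the trace distance between the ideal output of the process $\rho_{X'R}$ and the actual output $\hat{\rho}_{X'R}$ must not exceed $\epsilon$. We can apply our main result to the approximate process that brings $\sigma$ to $\hat{\rho}$, and lower bound the work cost of that process by
\begin{equation}
  W(\sigma \to \hat{\rho}) \geq H_0(E|X')_{\hat{\rho}}\, kT \ln(2) \geq H_{\max}(E|X')_{\hat{\rho}}\, kT \ln(2) , \tag{6}
\end{equation}
where the second inequality is shown in [39] and involves the max-entropy measure $H_{\max}$ as defined in [27, 28]. For any $\epsilon \geq 0$, the smooth max-entropy $H_{\max}^{\epsilon}$ is defined as
\begin{equation}
  H_{\max}^{\epsilon}(E|X')_{\rho} = \min_{\hat{\rho}} \max_{\substack{\omega_{X'} \geq 0 \\ \operatorname{tr}\omega_{X'} = 1}} \log F^2\left( \hat{\rho}_{EX'}, \mathbb{1}_E \otimes \omega_{X'} \right) , \tag{7}
\end{equation}
where the first optimization ranges over all $\hat{\rho}_{EX'}$ such that $F^2(\hat{\rho}, \rho) \geq 1 - \epsilon^2$ and where $F(\rho, \sigma) = \left\| \sqrt{\rho}\sqrt{\sigma} \right\|_1$ is the fidelity between the quantum states $\rho$ and $\sigma$ [40]. We write $H_{\max}$ to indicate $H_{\max}^{\epsilon}$ with $\epsilon = 0$.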
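If we optimize (6) over all possible channels $\mathcal{T}$ that output such $\hat{\rho}_{X'R}$, we obtain a bound on the extractable work with a probability of error $\epsilon$,
\begin{align}
  W &\geq \min_{\hat{\rho}_{X'R}} H_{\max}(E|X')_{\hat{\rho}}\, kT \ln(2) \nonumber \\
    &\geq \min_{\hat{\rho}_{X'RE}} H_{\max}(E|X')_{\hat{\rho}}\, kT \ln(2) \nonumber \\
    &= H_{\max}^{\bar{\epsilon}}(E|X')_{\rho}\, kT \ln(2) , \tag{8}
\end{align}
where the first optimization ranges over all $\hat{\rho}_{X'R}$ such that the trace distance $\frac{1}{2}\left\| \hat{\rho}_{X'R} - \rho_{X'R} \right\|_1 \leq \epsilon$, and where the second optimization ranges over all $\hat{\rho}_{X'RE}$ such that $F^2(\hat{\rho}_{X'RE}, \rho_{X'RE}) \geq 1 - \bar{\epsilon}^2$, with $\bar{\epsilon} = \sqrt{2\epsilon}$.
$^1$ For example, instead of fixing the process with $\mathcal{T}(\sigma_{XR}) = \rho_{X'R}$, one may have instead required that $\mathcal{T}(\sigma_X) = \rho_{X'}$ for given $\sigma_X$ and $\rho_{X'}$, not specifying and optimizing over what happens to correlations between the input and the output (or, equivalently, one could optimize over $\rho_{X'R}$ with fixed reductions $\rho_{X'}$ and $\rho_R$). In that case, the semidefinite program can be used to obtain bounds to the optimal value. This also implies that the relative mixedness introduced in [19] can be formulated as a semidefinite program.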
Tightness of the Bound. The bound given in the main result is tight up to error terms of the order of $\log(1/\epsilon)$. Indeed, let us consider the following simple process: one appends a large enough ancilla $A_E$ in a pure state to the input, so that we have our systems in the state $|\sigma\rangle_{XRA_E} = |0\rangle_{A_E} \otimes |\sigma\rangle_{XR}$. Let us consider a purification $|\rho\rangle_{X'RA_E}$ of $\rho_{X'R}$. Since the reduced states on $R$ of both these states are the same, $\rho_R = \sigma_R$, there exists a unitary $U$ acting on $X A_E$ such that $|\rho\rangle_{X'RA_E} = U |\sigma\rangle_{XRA_E}$. So we can apply this unitary onto our input at no work cost, and we are left with $|\rho\rangle_{X'RA_E}$ on our systems. We then apply the protocol proposed by del Rio et al. [5] on the system $A_E$, using the system $X'$ as a memory we have access to, in order to erase the ancilla $A_E$ back to a pure state. Recall that their process achieves this task without modifying the reduced state $\rho_{X'R}$, and at a work cost $kT \ln(2)\, H_{\max}^{\epsilon}(A_E|X')_{\rho} + O\left(\log(1/\epsilon)\right)$. It is also straightforward to note that their protocol can be carried out within our framework. Thus, up to error terms of the order of the logarithm of the error probability, our bound given by (8) is tight.
Special Cases. From our main result we can recover several special cases of specific interest as corollaries.
Von Neumann Limit. As we have seen in the introduction, considerable previous work has focused on the limit cases where many i.i.d. systems are provided. In such a case, the process $\mathcal{E}^{\otimes n}$ is applied on $n$ independent copies of the input $\sigma^{\otimes n}$, and outputs $\rho^{\otimes n}$. Say we tolerate a probability of error $\epsilon$. We may simply apply our (smoothed) main result to get an expression for our bound on the work cost,
\[ W \geq H^{\bar\epsilon}_{\max}(E^n|X'^n)_{\rho^{\otimes n}}\, kT\ln(2) ; \tag{9} \]
however, it is known that the smooth entropies converge to the von Neumann entropy in the i.i.d. limit [41],
\[ \lim_{\epsilon\to 0}\lim_{n\to\infty} \frac{1}{n} H^{\epsilon}_{\max}(E^n|X'^n)_{\rho^{\otimes n}} = H(E|X')_{\rho} , \tag{10} \]
which allows us to simplify the expression to
\[ H(E|X')_{\rho} = H(EX')_{\rho} - H(X')_{\rho} = H(X)_{\sigma} - H(X')_{\rho} , \]
where the last equality holds because $\rho_{EX'}$ and $\sigma_X$ have the same spectrum, being both purifications of the same $\rho_R = \sigma_R$. We conclude that in the asymptotic i.i.d. case, the work cost per copy of such a process is simply given by the difference of entropy between the initial and final state,
\[ W \geq [H(\text{initial state}) - H(\text{final state})]\, kT\ln(2) . \tag{11} \]
We emphasize that in this case the exact process is not relevant, and only the input and output states matter. If one considers the example given in the introduction with (a) the identity channel and (b) a replacement map, and applies these processes on $n$ independent copies of the distribution described in Figure 1, then in this regime both processes cost no work.
Erasure of a Quantum System Using a Quantum Memory. Consider the setting proposed in [5], where a system $S$ is correlated to a system $M$ in a joint state $\sigma_{SM}$, and where our task is to erase $S$ while preserving the reduced state on $M$ and any possible correlations of $M$ with other systems. Formally, given a purification $\sigma_{SMR}$ of $\sigma_{SM}$, we are looking for a process that will bring this state to the state $\rho_{SMR} = |0\rangle\langle 0|_S\otimes\sigma_{MR}$, i.e. we require the process to preserve $\sigma_{MR}$. In [5] a process is proposed that performs this task at work cost
\[ kT\ln(2)\, H^{\epsilon}_{\max}(S|M)_{\sigma} + O(\log\tfrac{1}{\epsilon}) , \]
where $H^{\epsilon}_{\max}$ is the smooth max-entropy [27–29].
This is a special case of the general case considered above, simply by considering $X$ to be the joint system of $S$ and the memory $M$, $H_X = H_S\otimes H_M$. Note that we have $\rho_{SMR} = |0\rangle\langle 0|_S\otimes\sigma_{MR}$, purified by $|\rho\rangle_{SMRE} = |0\rangle_S\otimes|\sigma\rangle_{MRE}$, where $|\sigma\rangle_{MRE} = U_{S\to E}|\sigma\rangle_{SMR}$ and $U_{S\to E}$ is an isometry from $S$ to $E$.
Then the bound on the work cost, tolerating a probability of error of at most $\epsilon$, is
\[ W \geq H^{\bar\epsilon}_{\max}(E|SM)_{\rho}\, kT\ln(2) = H^{\bar\epsilon}_{\max}(E|M)_{\rho}\, kT\ln(2) = H^{\bar\epsilon}_{\max}(S|M)_{\sigma}\, kT\ln(2) , \tag{12} \]
where the first equality follows because $\rho$ is pure on $S$ and the second by reversing the isometry $U$. We can immediately conclude that, within our framework, any process that performs this erasure has to cost at least $kT\ln(2)\,H^{\bar\epsilon}_{\max}(S|M)_{\sigma}$ work. Thus, the process proposed by del Rio et al. is optimal up to logarithmic factors in the error probability $\epsilon$. Note that if we take the memory $M$ to be trivial, i.e. a pure state, then we are in the standard scenario of Landauer erasure on a single system, and we have $W \geq H^{\bar\epsilon}_{\max}(S)_{\sigma}\, kT\ln(2)$, which is achievable, recovering the result of [18].
State Transformation while Decoupling from the Reference System. Another special case that we can derive as a corollary is the process that erases its input and prepares the required output independently. This would occur if we required the output state to be completely uncorrelated to the reference system $R$. Being a replacement map, this process implies that $\rho_{X'R} = \rho_{X'}\otimes\rho_R$. In this case, any third party $R$ that would have been correlated to the input is now completely uncorrelated to the output.
Again, we may simply apply our main result with the additional condition that $\rho_{X'R} = \rho_{X'}\otimes\rho_R$. In this case, the purification of $\rho_{X'R}$, $\rho_{X'RE}$, takes a special form due to the tensor product structure, with the $E$ system split into two systems $E_R$ and $E_{X'}$ ($E = E_R\otimes E_{X'}$),
\[ |\rho\rangle_{X'RE} = |\rho\rangle_{X'E_{X'}}\otimes|\rho\rangle_{RE_R} , \tag{13} \]
where $|\rho\rangle_{X'E_{X'}}$ and $|\rho\rangle_{RE_R}$ are purifications of $\rho_{X'}$ and $\rho_R$, respectively.
The lower bound on the work cost $W$, given by our main result and tolerating a probability of error of at most $\epsilon$, then reads
\[ W \geq H^{\bar\epsilon}_{\max}(E|X')_{\rho} = H^{\bar\epsilon}_{\max}(E_R)_{\rho} + H^{\bar\epsilon}_{\max}(E_{X'}|X')_{\rho} , \]
where $\bar\epsilon = \sqrt{2\epsilon}$ and $H^{\bar\epsilon}_{\max}$ is again the smooth max-entropy. Now, the spectrum of $\rho_{E_R}$ is exactly the same as the spectrum of $\rho_R$ by the Schmidt decomposition of $|\rho\rangle_{RE_R}$. This in turn has the same spectrum as $\sigma_X$, also by the Schmidt decomposition of $\sigma_{XR}$ and because $\rho_R = \sigma_R$. It follows that $H^{\bar\epsilon}_{\max}(E_R)_{\rho} = H^{\bar\epsilon}_{\max}(X)_{\sigma}$. Also, by duality of smooth min- and max-entropies [27], we have $H^{\bar\epsilon}_{\max}(E_{X'}|X')_{\rho} = -H^{\bar\epsilon}_{\min}(E_{X'})_{\rho} = -H^{\bar\epsilon}_{\min}(X')_{\rho}$, where $H^{\bar\epsilon}_{\min}$ is the smooth min-entropy with purified distance smoothing as defined in Ref. [28]. In consequence,
\[ W \geq H^{\bar\epsilon}_{\max}(X)_{\sigma} - H^{\bar\epsilon}_{\min}(X')_{\rho} . \tag{14} \]
That is, to transform a state $\sigma$ to $\rho$ while maximally decoupling from the reference system, one has to erase $\sigma$ to a pure state (at cost $H^{\bar\epsilon}_{\max}(X)_{\sigma}$), and then prepare $\rho$ (extracting work $H^{\bar\epsilon}_{\min}(X')_{\rho}$).
Example: Erasing Part of the W State. To illustrate some points mentioned above, consider the W state on a system $S$, a memory $M$ and a reference system $R$ given by
\[ |W\rangle_{SMR} = \frac{1}{\sqrt{3}}\,[\,|001\rangle + |010\rangle + |100\rangle\,]_{SMR} . \tag{15} \]
The reduced states on $SM$ and $M$ are respectively given by $\sigma_{SM} = \frac{1}{3}|00\rangle\langle 00| + \frac{2}{3}|\Psi^+\rangle\langle\Psi^+|$ and $\sigma_M = \frac{2}{3}|0\rangle\langle 0| + \frac{1}{3}|1\rangle\langle 1|$, where $|\Psi^+\rangle = \frac{1}{\sqrt{2}}(|01\rangle + |10\rangle)$. By symmetry of the W state, the reduced states on any two qubits, or on any one qubit, have the same form.
By actions on $S$ and $M$, we would like to erase $S$, leading to the final state on $S$ and $M$ given by $\rho_{SM} = |0\rangle\langle 0|_S\otimes\sigma_M$. Let us consider two processes that achieve this goal: the first one will preserve correlations with $R$ but will cost work, the second will not cost work but will modify those correlations.
We may directly apply the special case above concerning the erasure of a system conditioned on a memory: the fundamental work cost of such an erasure, if one preserves correlations with a reference system $R$, is given by $H_0(S|M)_{\sigma}$ (in units of $kT\ln(2)$). One may explicitly calculate (see Appendix C) that in this case $H_0(S|M)_{\sigma} = \log\frac{3}{2} \approx 0.585$, and thus this process must cost at least this amount of work.
However, one may easily notice that both $\sigma_{SM}$ and $\sigma_M$ have the same spectrum $\{2/3, 1/3\}$. This means that there exists a unitary $U$ that performs the erasure simply as $|0\rangle\langle 0|_S\otimes\sigma_M = U\sigma_{SM}U^\dagger$, and this unitary process does not cost any work. However, the correlations with $R$ are not preserved. Indeed, the unitary sends $|00\rangle$ to $|01\rangle$ and $|\Psi^+\rangle$ to $|00\rangle$, so one explicitly calculates that the state after the process is given by
\[ |\rho\rangle_{SMR} = U|\sigma\rangle_{SMR} = \frac{1}{\sqrt{3}}\,[\,|011\rangle + \sqrt{2}\,|000\rangle\,] = |0\rangle\otimes\frac{1}{\sqrt{3}}\,[\,|11\rangle + \sqrt{2}\,|00\rangle\,] . \]
We notice that the reduced state on $M$ and $R$ is now pure and differs from the initial one, given by $\sigma_{MR} = \frac{1}{3}|00\rangle\langle 00| + \frac{2}{3}|\Psi^+\rangle\langle\Psi^+|$.
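The free unitary erasure described above can be checked numerically with a small sketch. The text only fixes the action of $U$ on $|00\rangle$ and $|\Psi^+\rangle$; the completion used here ($|\Psi^-\rangle \to |10\rangle$, $|11\rangle \to |11\rangle$) is an assumption made so that `U` is a full unitary, and all names are illustrative. The code applies `U` to the $SM$ part of the W state and compares the purity $\mathrm{tr}(\rho_{MR}^2)$ before and after.

```python
import math

R2, R3 = math.sqrt(2), math.sqrt(3)

# Amplitudes of the W state (15), keyed by the bits (s, m, r).
w_state = {(0, 0, 1): 1 / R3, (0, 1, 0): 1 / R3, (1, 0, 0): 1 / R3}

# Rows U[out][in] in the SM basis |00>, |01>, |10>, |11>.
# Assumed completion: |00>->|01>, |Psi+>->|00>, |Psi->->|10>, |11>->|11>.
U = [
    [0, 1 / R2, 1 / R2, 0],
    [1, 0, 0, 0],
    [0, 1 / R2, -1 / R2, 0],
    [0, 0, 0, 1],
]

def apply_on_sm(unitary, state):
    """Apply a 4x4 matrix to the (S, M) qubits of a three-qubit state."""
    out = {}
    for (s, m, r), amp in state.items():
        col = 2 * s + m
        for row in range(4):
            coeff = unitary[row][col] * amp
            if coeff != 0:
                key = (row // 2, row % 2, r)
                out[key] = out.get(key, 0.0) + coeff
    return {k: v for k, v in out.items() if abs(v) > 1e-12}

def purity_mr(state):
    """tr(rho_MR^2) for a real-amplitude pure state on (S, M, R)."""
    rho = {}
    for (s, m, r), a in state.items():
        for (s2, m2, r2), b in state.items():
            if s == s2:  # partial trace over S
                key = ((m, r), (m2, r2))
                rho[key] = rho.get(key, 0.0) + a * b
    return sum(v * v for v in rho.values())

final_state = apply_on_sm(U, w_state)
purity_before = purity_mr(w_state)
purity_after = purity_mr(final_state)
```

The purity of the $MR$ state rises from $5/9$ (spectrum $\{2/3, 1/3\}$) to $1$, which is the numerical counterpart of the statement that the unitary erasure is free but destroys the original correlations between $M$ and $R$.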
Conclusion. The last few years have seen enormous technological progress in micro- and nano-fabrication, making it possible to construct engines and thermo-devices on a microscopic scale [42–50]. In this regime, standard thermodynamic considerations (devised originally for macroscopic devices such as steam engines) are not necessarily applicable. At the same time, with the miniaturization of computing circuits, thermodynamic aspects of information processing have become increasingly relevant. In fact, the heat dissipated by processors is one of the main barriers limiting their performance. Along with these developments, researchers have started to investigate the laws of thermodynamics from an information-theoretic perspective [51–57].
The present work adds to this line of research, providing a rigorous quantitative relationship between information theory and thermodynamics. One of our main findings is that this relationship is more intricate than what previous results (which focused on averaged quantities) may have suggested. In particular, it turns out that the thermodynamic cost of a given information-processing task depends not only on the input and output state, but also on the correlation between them. While this correlation dependence disappears in certain asymptotic limits, it cannot be neglected in general and, in fact, may become arbitrarily large.
Acknowledgements. The authors would like to thank Johan Åberg, Lídia del Rio and Joe Renes for many enlightening discussions. PhF, FD and RR were supported by the Swiss National Science Foundation (SNSF) through the National Centre of Competence in Research "Quantum Science and Technology" and through grant No. 200020-135048, and by the European Research Council through grant No. 258932. FD was also supported by the SNSF through grants PP00P2-128455 and 20CH21-138799, as well as by the German Science Foundation (grant CH 843/2-1). JO is funded by the Royal Society of London.
APPENDIX
Appendix A: Formal Approach to Lambda-Majorization
1. Preliminaries and Main Definition
Let $H_X$, $H_Y$ be two subspaces of a finite-dimensional Hilbert space $H_Z$, and let $H_A$, $H_B$ be two subspaces of a finite-dimensional Hilbert space $H_C$. Let $d_{(\cdot)}$ denote the dimensions of the various Hilbert spaces $H_{(\cdot)}$ and specifically let $d = d_Z = \dim H_Z$. Denote by $L(H)$ the set of linear hermitian operators on $H$, by $P(H)$ the set of positive semidefinite operators on $H$, and by $S_{=}(H)$ those operators in $P(H)$ that have unit trace. Let also $\lambda_i(\rho)$ denote the $i$-th eigenvalue of $\rho$ (in no particular order), and $\lambda^{\downarrow}_i(\rho)$ denote the $i$-th eigenvalue of $\rho$ taken in decreasing order.
Majorization is discussed in detail in Refs. [31, 32, 58].
Majorization. A matrix $\sigma\in P(H_Z)$ is said to majorize $\rho\in P(H_Z)$, denoted by $\sigma\succ\rho$, if for all $k$,
\[ \sum_{i=1}^{k}\lambda^{\downarrow}_i(\sigma) \geq \sum_{i=1}^{k}\lambda^{\downarrow}_i(\rho) , \]
and if $\mathrm{tr}\,\sigma = \mathrm{tr}\,\rho$.
The notion of majorization defines a (partial) order relation on $P(H_Z)$. When considering the set of density matrices $S_{=}(H_Z)$, there is a least element: the fully mixed state, $\frac{1}{d}\mathbb{1}_Z$.
Weak Submajorization. A matrix $\sigma\in P(H_Z)$ is said to weakly submajorize $\rho\in P(H_Z)$, denoted by $\sigma\succ_w\rho$, if for all $k$,
\[ \sum_{i=1}^{k}\lambda^{\downarrow}_i(\sigma) \geq \sum_{i=1}^{k}\lambda^{\downarrow}_i(\rho) . \]
Remark that if $\sigma,\rho\in S_{=}(H_Z)$, then the concept of weak submajorization is equivalent to regular majorization, simply because the traces of these matrices are already equal to unity.
Doubly Stochastic Matrix. A $d\times d$ matrix $S$ is doubly stochastic if $S^{j}_{i}\geq 0$, $\sum_i S^{j}_{i} = 1\ \forall j$ and $\sum_j S^{j}_{i} = 1\ \forall i$.
Doubly Substochastic Matrix. An $n\times m$ matrix $B$ is doubly substochastic if $B^{j}_{i}\geq 0$, $\sum_i B^{j}_{i}\leq 1\ \forall j$ and $\sum_j B^{j}_{i}\leq 1\ \forall i$.
The following theorem is due to Hardy, Littlewood and Pólya [59].
Theorem 4 (Hardy, Littlewood, and Pólya, 1929). Let $\sigma,\rho\in P(H_Z)$. Then $\sigma\succ\rho$ if and only if there exists a $d\times d$ doubly stochastic matrix $S^{j}_{i}$ such that $\lambda_i(\rho) = \sum_j S^{j}_{i}\,\lambda_j(\sigma)$.
A similar theorem is obtained for weak submajorization and doubly substochastic matrices [31].
Proposition 5. Let $\sigma\in P(H_X)$ and $\rho\in P(H_Y)$. Then $\sigma\succ_w\rho$ if and only if there exists a $d_Y\times d_X$ doubly substochastic matrix $B^{j}_{i}$ such that $\lambda_i(\rho) = \sum_j B^{j}_{i}\,\lambda_j(\sigma)$.
Majorization defines a partial order on states and has a smallest element, the fully mixed state. Also, a pure state majorizes any other state.
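Proposition 6. Majorization is preserved by direct sums and tensor products, i.e. if $\sigma\succ\rho$ and $\sigma'\succ\rho'$, then $\sigma\oplus\sigma'\succ\rho\oplus\rho'$ and $\sigma\otimes\sigma'\succ\rho\otimes\rho'$. The same holds for weak submajorization.
A proof for the direct sum of two vectors can be found in [31, Cor. II.1.4]. We provide here an alternative proof along with the tensor product case.
Proof. Let $S^{j}_{i}$ and $S'^{j}_{i}$ be doubly stochastic matrices such that $\lambda_i(\rho) = \sum_j S^{j}_{i}\lambda_j(\sigma)$ and $\lambda_i(\rho') = \sum_j S'^{j}_{i}\lambda_j(\sigma')$. Then $S\oplus S'$ is also doubly stochastic and satisfies $\lambda_i(\rho\oplus\rho') = \sum_j (S\oplus S')^{j}_{i}\,\lambda_j(\sigma\oplus\sigma')$, because the vectors of eigenvalues of the direct sum are simply the direct sums of the individual vectors of eigenvalues. This shows that $\sigma\oplus\sigma'\succ\rho\oplus\rho'$.
Analogously, $S\otimes S'$ satisfies
\[ \lambda_{ii'}(\rho\otimes\rho') = \lambda_i(\rho)\,\lambda_{i'}(\rho') = \sum_{jj'} S^{j}_{i}\lambda_j(\sigma)\,S'^{j'}_{i'}\lambda_{j'}(\sigma') = \sum_{jj'} (S\otimes S')^{jj'}_{ii'}\,\lambda_{jj'}(\sigma\otimes\sigma') . \]
$S\otimes S'$ is doubly stochastic: $\sum_{ii'}(S\otimes S')^{jj'}_{ii'} = \sum_{ii'} S^{j}_{i}S'^{j'}_{i'} = 1$ and $\sum_{jj'}(S\otimes S')^{jj'}_{ii'} = \sum_{jj'} S^{j}_{i}S'^{j'}_{i'} = 1$.
The same proof holds for doubly substochastic matrices, so majorization may be replaced by weak submajorization in the proposition.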
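Proposition 6 can be spot-checked on explicit spectra. The sketch below builds the spectra of a direct sum and of a tensor product from two lists of eigenvalues and tests majorization on one example; the helper names and the example spectra are assumptions for illustration, and a single example of course illustrates the statement without proving it.

```python
from itertools import product

def partial_sums_desc(values, length):
    """Partial sums of the eigenvalues in decreasing order, zero-padded to `length`."""
    ordered = sorted(values, reverse=True) + [0.0] * (length - len(values))
    sums, total = [], 0.0
    for v in ordered:
        total += v
        sums.append(total)
    return sums

def majorizes(sigma, rho, tol=1e-12):
    """True if the spectrum `sigma` majorizes the spectrum `rho`."""
    n = max(len(sigma), len(rho))
    pairs = zip(partial_sums_desc(sigma, n), partial_sums_desc(rho, n))
    return all(a >= b - tol for a, b in pairs) and abs(sum(sigma) - sum(rho)) <= tol

def direct_sum(p, q):
    """Spectrum of the direct sum: concatenation of the two spectra."""
    return list(p) + list(q)

def tensor(p, q):
    """Spectrum of the tensor product: all pairwise products."""
    return [a * b for a, b in product(p, q)]

# Hypothetical example with sigma > rho and sigma' > rho'.
sigma, rho = [0.7, 0.3], [0.5, 0.5]
sigma_p, rho_p = [1.0], [0.6, 0.4]

sum_ok = majorizes(direct_sum(sigma, sigma_p), direct_sum(rho, rho_p))
tensor_ok = majorizes(tensor(sigma, sigma_p), tensor(rho, rho_p))
```

Both `sum_ok` and `tensor_ok` evaluate to `True` for this example, as the proposition predicts.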
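We are now all set for a formal definition of lambda-majorization.
Let $\lambda\in\mathbb{R}$ and let $\lambda_1,\lambda_2\geq 0$ be such that $\lambda = \lambda_1-\lambda_2$ and $2^{\lambda_1}$, $2^{\lambda_2}$ are integers. (The case when $2^{\lambda}$ is irrational will be discussed later.) Take $H_C$ of size greater than both $2^{\lambda_1}$ and $2^{\lambda_2}$ and let $H_A$ and $H_B$ be subspaces of $H_C$ of respective dimensions $2^{\lambda_1}$ and $2^{\lambda_2}$.
Lambda-Majorization. For $\sigma\in P(H_X)$ and $\rho\in P(H_Y)$, we say that $\sigma$ $\lambda$-majorizes $\rho$, denoted by $\sigma\succ_{\lambda}\rho$, if there exist such $\lambda_1$, $\lambda_2$ such that
\[ 2^{-\lambda_1}\mathbb{1}_A\otimes\sigma \succ_w 2^{-\lambda_2}\mathbb{1}_B\otimes\rho . \]
Here $\mathbb{1}_A$, $\mathbb{1}_B$ are the projectors onto the respective subspaces $H_A$ and $H_B$ embedded in $H_C$, of respective dimensions $2^{\lambda_1}$, $2^{\lambda_2}$. Likewise, $\sigma$ and $\rho$ are considered as living in $H_Z$ by padding them with zero eigenvalues as necessary.
We have assumed here that $2^{\lambda}$ is rational. If $2^{\lambda}$ is irrational, we say that $\sigma$ $\lambda$-majorizes $\rho$ if $\sigma\succ_{\lambda'}\rho$ for all $\lambda'<\lambda$ with $2^{\lambda'}$ rational.
The following proposition guarantees that the definition above does not depend on the exact values of $\lambda_1$ and $\lambda_2$ but only on their difference. This is the same as saying that a fully mixed state cannot act as a catalyst.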
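Proposition 7. For any $\sigma,\rho\in P(H_Z)$, and for any $n$, we have $\sigma\succ_w\rho$ if and only if $\frac{1}{n}\mathbb{1}_n\otimes\sigma \succ_w \frac{1}{n}\mathbb{1}_n\otimes\rho$.
Proof. If $\sigma\succ_w\rho$, then the majorization passes over the tensor product, which proves the claim. Conversely, if $\frac{1}{n}\mathbb{1}_n\otimes\sigma \succ_w \frac{1}{n}\mathbb{1}_n\otimes\rho$, then in particular, for any $k\leq d$,
\[ \sum_{i=1}^{nk}\lambda^{\downarrow}_i\!\left(\tfrac{1}{n}\mathbb{1}_n\otimes\sigma\right) \geq \sum_{i=1}^{nk}\lambda^{\downarrow}_i\!\left(\tfrac{1}{n}\mathbb{1}_n\otimes\rho\right) . \tag{A1} \]
($d$ is the maximum rank of $\sigma$ or $\rho$.) But each eigenvalue of $\sigma$ appears $n$ times in $\frac{1}{n}\mathbb{1}_n\otimes\sigma$ with a factor $\frac{1}{n}$, i.e. $\lambda^{\downarrow}_{in}(\frac{1}{n}\mathbb{1}_n\otimes\sigma) = \frac{1}{n}\lambda^{\downarrow}_i(\sigma)$, and thus
\[ \sum_{i=1}^{k}\lambda^{\downarrow}_i(\sigma) \geq \sum_{i=1}^{k}\lambda^{\downarrow}_i(\rho) . \]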
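The following proposition is a direct consequence of the definition of lambda-majorization, and just states that one can move randomness into or out of the ancillas in the definition of lambda-majorization.
Proposition 8. For any $\sigma\in P(H_X)$, $\rho\in P(H_Y)$, and for any $\lambda\in\mathbb{R}$, $n>0$, we have
\[ \tfrac{1}{n}\mathbb{1}_n\otimes\sigma \succ_{\lambda} \rho \ \Leftrightarrow\ \sigma \succ_{\lambda+\log n} \rho \quad\text{and}\quad \sigma \succ_{\lambda-\log n} \rho \ \Leftrightarrow\ \sigma \succ_{\lambda} \tfrac{1}{n}\mathbb{1}_n\otimes\rho . \]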
Similarly to Thm. 4 and to Prop. 5, it is possible to characterize lambda-majorization by the existence of a matrix relating the vectors of eigenvalues that satisfies some specific normalization conditions.
Proposition 9. Let $\sigma\in P(H_X)$ and $\rho\in P(H_Y)$. Then $\sigma\succ_{\lambda}\rho$ if and only if there exists a $d_Y\times d_X$ matrix $T^{k}_{i}$ such that $\lambda_i(\rho) = \sum_k T^{k}_{i}\,\lambda_k(\sigma)$, satisfying $T^{k}_{i}\geq 0$, $\sum_i T^{k}_{i}\leq 1$, and $\sum_k T^{k}_{i}\leq 2^{-\lambda}$.
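Proof of Prop. 9. Suppose $2^{-\lambda_1}\mathbb{1}_A\otimes\sigma \succ_w 2^{-\lambda_2}\mathbb{1}_B\otimes\rho$ with $\lambda = \lambda_1-\lambda_2$. Then there exists a doubly substochastic matrix $S^{ak}_{bi}$ such that
\[ \lambda_{bi}\!\left(2^{-\lambda_2}\mathbb{1}_B\otimes\rho\right) = \sum_{ak} S^{ak}_{bi}\,\lambda_{ak}\!\left(2^{-\lambda_1}\mathbb{1}_A\otimes\sigma\right) , \]
with $S^{ak}_{bi}\geq 0$, $\sum_{bi} S^{ak}_{bi}\leq 1$ and $\sum_{ak} S^{ak}_{bi}\leq 1$. (Indices $a$ and $b$ refer to the mixed ancillas of respective sizes $2^{\lambda_1}$ and $2^{\lambda_2}$. Since we are considering weak submajorization, we can safely ignore all zero eigenvalues and consider only the subspaces (of different sizes on the left and right hand side of the majorization) on which $\sigma$, $\rho$, $\mathbb{1}_A$ and $\mathbb{1}_B$ have support, as in Prop. 5.)
Now we have
\[ \lambda_i(\rho) = \sum_b \lambda_{bi}\!\left(2^{-\lambda_2}\mathbb{1}_B\otimes\rho\right) = \sum_{a,b,k} S^{ak}_{bi}\,\lambda_{ak}\!\left(2^{-\lambda_1}\mathbb{1}_A\otimes\sigma\right) = \sum_k \Big(\sum_{ab} 2^{-\lambda_1} S^{ak}_{bi}\Big)\lambda_k(\sigma) , \]
so one can define
\[ T^{k}_{i} = \sum_{ab} 2^{-\lambda_1} S^{ak}_{bi} , \]
which fulfills $\lambda_i(\rho) = \sum_k T^{k}_{i}\lambda_k(\sigma)$. Because $S$ is doubly substochastic, and using the fact that the indices $a$ (resp. $b$) range up to $2^{\lambda_1}$ ($2^{\lambda_2}$), the matrix $T$ satisfies
\[ \sum_i T^{k}_{i} = \sum_{i,a,b} 2^{-\lambda_1} S^{ak}_{bi} = \sum_a 2^{-\lambda_1}\sum_{bi} S^{ak}_{bi} \leq 1 , \]
as well as
\[ \sum_k T^{k}_{i} = \sum_{k,a,b} 2^{-\lambda_1} S^{ak}_{bi} = \sum_b 2^{-\lambda_1}\sum_{ak} S^{ak}_{bi} \leq \sum_b 2^{-\lambda_1} = 2^{-\lambda} . \]
Additionally, $T^{k}_{i}\geq 0$ because $S^{ak}_{bi}\geq 0$.
Conversely, suppose that a matrix $T^{k}_{i}$ exists, with $T^{k}_{i}\geq 0$, $\sum_i T^{k}_{i}\leq 1$, $\sum_k T^{k}_{i}\leq 2^{-\lambda}$, and $\lambda_i(\rho) = \sum_k T^{k}_{i}\lambda_k(\sigma)$. Let $\lambda_1,\lambda_2$ be such that $\lambda = \lambda_1-\lambda_2$ and such that $2^{\lambda_1}$, $2^{\lambda_2}$ are integers. Then let $S^{ak}_{bi} = 2^{-\lambda_2}T^{k}_{i}$ for all $a$, $b$. Then $S^{ak}_{bi}\geq 0$ and $S$ satisfies
\[ \sum_{ak} S^{ak}_{bi} = 2^{-\lambda_2}\sum_{ak} T^{k}_{i} \leq 2^{-\lambda_2}\Big(\sum_a 1\Big)\,2^{-\lambda} = 1 , \]
as well as
\[ \sum_{bi} S^{ak}_{bi} = 2^{-\lambda_2}\sum_{bi} T^{k}_{i} \leq 2^{-\lambda_2}\Big(\sum_b 1\Big) = 1 . \]
The required weak submajorization for the desired lambda-majorization is provided by this doubly substochastic matrix,
\[ \lambda_{bi}\!\left(2^{-\lambda_2}\mathbb{1}_B\otimes\rho\right) = 2^{-\lambda_2}\lambda_i(\rho) = 2^{-\lambda_2}\sum_k T^{k}_{i}\lambda_k(\sigma) = 2^{-\lambda_2}\sum_k T^{k}_{i}\sum_a \lambda_{ak}\!\left(2^{-\lambda_1}\mathbb{1}_A\otimes\sigma\right) = \sum_{ak} S^{ak}_{bi}\,\lambda_{ak}\!\left(2^{-\lambda_1}\mathbb{1}_A\otimes\sigma\right) . \]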
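The conditions of Proposition 9 are simple to verify for a candidate matrix. The following sketch checks nonnegativity, the two sum constraints and the eigenvalue relation for a given `T`, stored as `T[i][k]` (row index `i` for $\rho$, column index `k` for $\sigma$); the function name, the storage convention and the example matrices are illustrative assumptions. Note that a failed check for one particular `T` does not by itself rule out $\lambda$-majorization, since the proposition only asks for the existence of some suitable matrix.

```python
def satisfies_prop9(T, sigma, rho, lam, tol=1e-12):
    """Check the conditions of Proposition 9 for a candidate matrix T[i][k]."""
    rows, cols = len(rho), len(sigma)
    nonneg = all(T[i][k] >= -tol for i in range(rows) for k in range(cols))
    # sum over i (output index) at fixed k must not exceed 1
    trace_ok = all(sum(T[i][k] for i in range(rows)) <= 1 + tol for k in range(cols))
    # sum over k (input index) at fixed i must not exceed 2**(-lam)
    subunital_ok = all(
        sum(T[i][k] for k in range(cols)) <= 2 ** (-lam) + tol for i in range(rows)
    )
    maps_ok = all(
        abs(sum(T[i][k] * sigma[k] for k in range(cols)) - rho[i]) <= tol
        for i in range(rows)
    )
    return nonneg and trace_ok and subunital_ok and maps_ok

# Erasing a fully mixed qubit to a pure state: works for lambda = -1.
erase_T = [[1.0, 1.0]]
erase_ok = satisfies_prop9(erase_T, [0.5, 0.5], [1.0], lam=-1)
erase_free = satisfies_prop9(erase_T, [0.5, 0.5], [1.0], lam=0)

# Preparing a fully mixed qubit from a pure state: works for lambda = +1.
prepare_T = [[0.5], [0.5]]
prepare_ok = satisfies_prop9(prepare_T, [1.0], [0.5, 0.5], lam=1)
```

Here `erase_ok` and `prepare_ok` are `True`, while `erase_free` is `False`: the single-row matrix needed to erase a fully mixed qubit has row sum 2, which exceeds $2^{-\lambda}$ for $\lambda = 0$.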
2. Formulation of Lambda-Majorization in Terms of Channels
Majorization can also be characterized in terms of unital, trace-preserving completely positive maps [33–36].
Proposition 10. Two positive semidefinite matrices $\sigma$ and $\rho$ satisfy $\sigma\succ\rho$ if and only if there exists a trace-preserving, unital, completely positive map $\mathcal{E}$ satisfying $\mathcal{E}(\sigma) = \rho$.
Similarly, one can prove an analogous characterization of weak submajorization. The proof of this proposition will be given later.
Proposition 11. Let $\sigma\in P(H_X)$ and $\rho\in P(H_Y)$. Then $\sigma\succ_w\rho$ if and only if there exists a completely positive map $\mathcal{E}_{X\to Y}: L(H_X)\to L(H_Y)$ such that $\mathcal{E}_{X\to Y}(\sigma) = \rho$, with $\mathcal{E}$ satisfying $\mathcal{E}_{X\to Y}(\mathbb{1}_X)\leq\mathbb{1}_Y$ and $\mathcal{E}^{\dagger}_{X\leftarrow Y}(\mathbb{1}_Y)\leq\mathbb{1}_X$.
Let us say that $\mathcal{E}_{X\to Y}$ is subunital if $\mathcal{E}_{X\to Y}(\mathbb{1}_X)\leq\mathbb{1}_Y$. Then the two conditions on the structure of the channel $\mathcal{E}_{X\to Y}$ in the above proposition require the channel to be subunital and trace-nonincreasing.
A subunital trace-nonincreasing completely positive map can always be seen as part of a unital, trace-preserving completely positive map on a larger Hilbert space. This is analogous to the result that doubly substochastic matrices are submatrices of doubly stochastic matrices [31].
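Proposition 12. Let $\mathcal{E}_{Z\to Z}$ be a unital, trace-preserving completely positive map. Let $H_X$ and $H_Y$ be two subspaces of $H_Z$ and let $\mathbb{1}_X$ and $\mathbb{1}_Y$ be the projectors onto those spaces, respectively. Then the channel $\mathcal{E}'_{X\to Y}(\cdot) = \mathbb{1}_Y\,\mathcal{E}(\mathbb{1}_X(\cdot)\mathbb{1}_X)\,\mathbb{1}_Y$ is subunital and trace-nonincreasing.
Conversely, let $\mathcal{E}'_{X\to Y}$ be any trace-nonincreasing, subunital completely positive map. Let $H_Z = H_X\oplus H_Y$, $G_Y = \mathbb{1}_Y - \mathcal{E}'_{X\to Y}(\mathbb{1}_X)\geq 0$, and $H_X = \mathbb{1}_X - \mathcal{E}'^{\dagger}(\mathbb{1}_Y)\geq 0$ (here $H_X$ denotes an operator on the space $X$). Then the channel defined by
\[ \mathcal{E}_{Z\to Z}(\cdot) = 0_X\oplus\mathcal{E}'_{X\to Y}(\mathbb{1}_X(\cdot)\mathbb{1}_X) + \mathcal{E}'^{\dagger}(\mathbb{1}_Y(\cdot)\mathbb{1}_Y)\oplus 0_Y + \big(0_X\oplus\sqrt{G_Y}\big)(\cdot)\big(0_X\oplus\sqrt{G_Y}\big) + \big(\sqrt{H_X}\oplus 0_Y\big)(\cdot)\big(\sqrt{H_X}\oplus 0_Y\big) \]
is unital and trace-preserving, and $\mathcal{E}'_{X\to Y}(\cdot) = \mathbb{1}_Y\,\mathcal{E}(\mathbb{1}_X(\cdot)\mathbb{1}_X)\,\mathbb{1}_Y$.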
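In order to generalize this concept to our lambda-majorization, let us introduce the concept of an $\alpha$-subunital map. These generalize the notion of subunital maps to arbitrary normalizations.
$\alpha$-subunital Maps. We call a map $T_{X\to Y}$ $\alpha$-subunital if it satisfies $T_{X\to Y}(\mathbb{1}_X)\leq\alpha\,\mathbb{1}_Y$.
Proposition 13 (Composition of $\alpha$-subunital maps). Let $H_W\subset H_Z$ be another subspace of $H_Z$ in addition to $H_X$ and $H_Y$, and let $T_{X\to Y}$, $T'_{Y\to W}$ be trace-nonincreasing maps. Assume that $T_{X\to Y}$ is $\alpha$-subunital and that $T'_{Y\to W}$ is $\alpha'$-subunital. Then their composition $[T'\circ T]_{X\to W}$ is trace-nonincreasing and $\alpha\alpha'$-subunital.
Proof of Prop. 13. The composition of $T_{X\to Y}$ and $T'_{Y\to W}$ is trace-nonincreasing,
\[ T^{\dagger}\big(T'^{\dagger}(\mathbb{1}_W)\big) \leq T^{\dagger}(\mathbb{1}_Y) \leq \mathbb{1}_X . \]
Their composition is also $\alpha\alpha'$-subunital,
\[ T'_{Y\to W}\big(T_{X\to Y}(\mathbb{1}_X)\big) \leq \alpha\,T'_{Y\to W}(\mathbb{1}_Y) \leq \alpha\alpha'\,\mathbb{1}_W . \]
We will now give proofs for Props. 11 and 12, which rely on the following lemma.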
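Lemma 14. Let $T_{Z\to Z}$ be a trace-nonincreasing map that is $2^{-\lambda}$-subunital. Denote by $\mathbb{1}_X$ (resp. $\mathbb{1}_Y$) the projectors onto the subspaces $H_X$ (resp. $H_Y$) of $H_Z$. Then $T_{X\to Y}$, defined by $T_{X\to Y}(\cdot) = \mathbb{1}_Y\,T_{Z\to Z}(\mathbb{1}_X(\cdot)\mathbb{1}_X)\,\mathbb{1}_Y$, is also a trace-nonincreasing $2^{-\lambda}$-subunital map.
Proof of Lemma 14. It suffices to note that the projection map $(\cdot)\mapsto\mathbb{1}_X(\cdot)\mathbb{1}_X$ (resp. $(\cdot)\mapsto\mathbb{1}_Y(\cdot)\mathbb{1}_Y$) is trace-nonincreasing and subunital. Then apply Prop. 13 twice.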
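Proof of Prop. 12. The first part of the proposition follows from the lemma. To prove the converse, let $\mathcal{E}_{Z\to Z}$ be as in the proposition text, and notice first that the channel is its own adjoint:
\[ \mathcal{E}^{\dagger}(\cdot) = \mathcal{E}'^{\dagger}(\mathbb{1}_Y(\cdot)\mathbb{1}_Y)\oplus 0_Y + 0_X\oplus\mathcal{E}'(\mathbb{1}_X(\cdot)\mathbb{1}_X) + \big(0_X\oplus\sqrt{G_Y}\big)(\cdot)\big(0_X\oplus\sqrt{G_Y}\big) + \big(\sqrt{H_X}\oplus 0_Y\big)(\cdot)\big(\sqrt{H_X}\oplus 0_Y\big) = \mathcal{E}_{Z\to Z}(\cdot) . \tag{A2} \]
The map is unital:
\[ \mathcal{E}_{Z\to Z}(\mathbb{1}_Z) = 0_X\oplus(\mathbb{1}_Y - G_Y) + (\mathbb{1}_X - H_X)\oplus 0_Y + 0_X\oplus G_Y + H_X\oplus 0_Y = \mathbb{1}_Z , \]
and it is thus trace-preserving because of (A2). The last condition, $\mathcal{E}'_{X\to Y}(\cdot) = \mathbb{1}_Y\,\mathcal{E}_{Z\to Z}(\mathbb{1}_X(\cdot)\mathbb{1}_X)\,\mathbb{1}_Y$, is obvious from the definition of $\mathcal{E}_{Z\to Z}$.
Proof of Prop. 11. By the weak submajorization condition, if $\mathrm{tr}\,\sigma\neq\mathrm{tr}\,\rho$, we must have $\mathrm{tr}\,\rho<\mathrm{tr}\,\sigma$. Consider an extension space $H_{Y'}\subset H_Z$ (consider a larger $H_Z$ if necessary) in which we extend $\rho$ by many small eigenvalues such that $\mathrm{tr}\,\rho_{YY'} = \mathrm{tr}\,\sigma$, while still having $\sigma\succ_w\rho_{YY'}$. Now we have a (regular) majorization, $\sigma\succ\rho_{YY'}$, and can apply Prop. 10.
The obtained map, $\mathcal{E}_{Z\to Z}$, is then unital and trace-preserving. It can be restricted by projecting the input onto $H_X$ and the output onto $H_Y$,
\[ \mathcal{E}_{X\to Y}(\cdot) = \mathbb{1}_Y\,\mathcal{E}_{Z\to Z}\big(\mathbb{1}_X(\cdot)\mathbb{1}_X\big)\,\mathbb{1}_Y . \]
This restricted operator, by the lemma, is a valid trace-nonincreasing subunital map (take $\lambda = 0$).
Conversely, if $\mathcal{E}_{X\to Y}$ is a subunital trace-nonincreasing completely positive map with $\mathcal{E}_{X\to Y}(\sigma) = \rho$, then one can dilate it with Proposition 12 to a unital, trace-preserving completely positive map $\mathcal{E}_{Z\to Z}$ such that $\mathbb{1}_Y\,\mathcal{E}_{Z\to Z}(\sigma\oplus 0_Y)\,\mathbb{1}_Y = \rho$. Note also that the map $(\cdot)\mapsto\mathbb{1}_Y(\cdot)\mathbb{1}_Y + \mathbb{1}_X(\cdot)\mathbb{1}_X$ is a pinching [31, p. 50, Prob. II.5.5], so we have
\[ \sigma\oplus 0_Y \succ \mathcal{E}_{Z\to Z}(\sigma\oplus 0_Y) \succ \mathbb{1}_X\,\mathcal{E}_{Z\to Z}(\sigma\oplus 0_Y)\,\mathbb{1}_X + \mathbb{1}_Y\,\mathcal{E}_{Z\to Z}(\sigma\oplus 0_Y)\,\mathbb{1}_Y \succ_w \mathbb{1}_Y\,\mathcal{E}_{Z\to Z}(\sigma\oplus 0_Y)\,\mathbb{1}_Y = \rho . \]
The last weak submajorization holds because some eigenvalues were left out.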
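In the same way as lambda-majorization can be characterized with differently normalized doubly substochastic matrices, it can also be characterized in terms of a differently normalized subunital channel.
Proposition 15. Let $\sigma\in P(H_X)$, $\rho\in P(H_Y)$ and $\lambda\in\mathbb{R}$. Then $\sigma\succ_{\lambda}\rho$ if and only if there exists a completely positive map $T_{X\to Y}: L(H_X)\to L(H_Y)$ such that $T_{X\to Y}(\sigma) = \rho$, that is $2^{-\lambda}$-subunital and trace-nonincreasing.
Proof of Prop. 15. ($\Rightarrow$) Assume first that $2^{-\lambda_1}\mathbb{1}_A\otimes\sigma \succ_w 2^{-\lambda_2}\mathbb{1}_B\otimes\rho$, with $H_A$, $H_B$ (of respective sizes $2^{\lambda_1}$ and $2^{\lambda_2}$) being subsystems of an ancilla system $H_C$, with $\lambda = \lambda_1-\lambda_2$.
By Prop. 11, there exists a subunital trace-nonincreasing completely positive map $\mathcal{E}_{AX\to BY}$, such that
\[ \mathcal{E}_{AX\to BY}\big(2^{-\lambda_1}\mathbb{1}_A\otimes\sigma\big) = 2^{-\lambda_2}\mathbb{1}_B\otimes\rho . \tag{A3} \]
Now let the map $T$ be defined by
\[ T_{X\to Y}(\cdot) = \mathrm{tr}_B\big[\mathcal{E}_{AX\to BY}\big(2^{-\lambda_1}\mathbb{1}_A\otimes(\cdot)\big)\big] . \tag{A4} \]
This map is trace-nonincreasing,
\[ T^{\dagger}_{X\leftarrow Y}(\mathbb{1}_Y) = 2^{-\lambda_1}\,\mathrm{tr}_A\big[\mathcal{E}^{\dagger}_{AX\leftarrow BY}(\mathbb{1}_{BY})\big] \leq 2^{-\lambda_1}\,\mathrm{tr}_A(\mathbb{1}_{AX}) = \mathbb{1}_X , \]
and $2^{-\lambda}$-subunital,
\[ T_{X\to Y}(\mathbb{1}_X) = 2^{-\lambda_1}\,\mathrm{tr}_B\big[\mathcal{E}(\mathbb{1}_{AX})\big] \leq 2^{-\lambda_1}\,\mathrm{tr}_B\,\mathbb{1}_{BY} = 2^{-\lambda}\,\mathbb{1}_Y . \]
The map $T$ brings $\sigma$ to $\rho$,
\[ T_{X\to Y}(\sigma_X) = \mathrm{tr}_B\big[\mathcal{E}\big(2^{-\lambda_1}\mathbb{1}_A\otimes\sigma_X\big)\big] = \mathrm{tr}_B\big(2^{-\lambda_2}\mathbb{1}_B\otimes\rho_Y\big) = \rho_Y , \]
so that $T$ satisfies all the claimed properties.
($\Leftarrow$) To prove the converse, assume that a trace-nonincreasing, $2^{-\lambda}$-subunital map $T_{X\to Y}$ exists, such that $T_{X\to Y}(\sigma) = \rho$.
Choose $\lambda_1,\lambda_2$ such that $\lambda = \lambda_1-\lambda_2$ and such that $2^{\lambda_1}$, $2^{\lambda_2}$ are integers. (Again, in case $2^{\lambda}$ is irrational, approximate $2^{\lambda}$ arbitrarily well by rational numbers $2^{\lambda'}$.) Choose $H_C$ large enough to contain two subspaces $H_A$ and $H_B$ of respective dimensions $2^{\lambda_1}$ and $2^{\lambda_2}$. Let
\[ \mathcal{E}_{AX\to BY}(\cdot) = 2^{-\lambda_2}\mathbb{1}_B\otimes T_{X\to Y}\big(\mathrm{tr}_A(\cdot)\big) . \tag{A5} \]
This map is trace-nonincreasing,
\[ \mathcal{E}^{\dagger}(\mathbb{1}_{BY}) = 2^{-\lambda_2}\mathbb{1}_A\otimes T^{\dagger}(\mathrm{tr}_B\,\mathbb{1}_{BY}) = 2^{-\lambda_2}\mathbb{1}_A\otimes T^{\dagger}\big(2^{\lambda_2}\mathbb{1}_Y\big) \leq \mathbb{1}_{AX} , \]
and subunital,
\[ \mathcal{E}(\mathbb{1}_{AX}) = 2^{-\lambda_2}\mathbb{1}_B\otimes T(\mathrm{tr}_A\,\mathbb{1}_{AX}) = 2^{-\lambda_2}\mathbb{1}_B\otimes T\big(2^{\lambda_1}\mathbb{1}_X\big) \leq \mathbb{1}_{BY} , \]
since $\lambda = \lambda_1-\lambda_2$ and $T$ is $2^{-\lambda}$-subunital. Also,
\[ \mathcal{E}\big(2^{-\lambda_1}\mathbb{1}_A\otimes\sigma_X\big) = 2^{-\lambda_2}\mathbb{1}_B\otimes T\big(\mathrm{tr}_A(2^{-\lambda_1}\mathbb{1}_A\otimes\sigma_X)\big) = 2^{-\lambda_2}\mathbb{1}_B\otimes T(\sigma_X) = 2^{-\lambda_2}\mathbb{1}_B\otimes\rho_Y . \]
By Prop. 11, we eventually have
\[ 2^{-\lambda_1}\mathbb{1}_A\otimes\sigma_X \succ_w 2^{-\lambda_2}\mathbb{1}_B\otimes\rho_Y . \]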
Remark 16. A trace-nonincreasing, $2^{-\lambda}$-subunital completely positive map $T_{X\to Y}$ can always be written as in Eq. (A4) for a subunital trace-nonincreasing completely positive map $\mathcal{E}_{AX\to BY}$, which itself can always be written as a projection of a unital map $\mathcal{E}_{CZ\to CZ}$ (see the text of the previous proof, and Prop. 12).
Conversely, for any unital map $\mathcal{E}_{CZ\to CZ}$ with $\mathcal{E}(2^{-\lambda_1}\mathbb{1}_A\otimes\sigma_X) = 2^{-\lambda_2}\mathbb{1}_B\otimes\rho_Y$, in particular for any noisy operation in our framework, the map $T$ obtained by Eq. (A4) is trace-nonincreasing and $2^{-\lambda}$-subunital.
In particular, for our purposes of optimizing over all possible processes of our framework with an additional condition on the channel carrying out the process (namely to preserve correlations between our system $X$ and the reference system $R$), we may impose that condition directly on the channel $T$ to obtain an upper bound on $\lambda$.
3. Properties for quantum states
We will consider in this section some useful properties of lambda-majorization in the case where we consider normalized states $\sigma$, $\rho$. Here, weak submajorization automatically implies (regular) majorization because $\mathrm{tr}\,\sigma = \mathrm{tr}\,\rho = 1$.
In this section, let $\sigma\in S_{=}(H_X)$ and $\rho\in S_{=}(H_Y)$.
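Proposition 17 (Lambda-Majorizing a Pure State). For any pure state $|0\rangle\in H_Z$, we have $\sigma\succ_{\lambda}|0\rangle\langle 0|$ if and only if $\mathrm{rank}\,\sigma\leq 2^{-\lambda}$ (obviously $\lambda$ has to be negative or zero). Equivalently, $\sigma\succ\frac{1}{n}\mathbb{1}_n$ if and only if $\mathrm{rank}\,\sigma\leq n$.
Proof of Prop. 17. Assume first that $\sigma\succ_{\lambda}|0\rangle\langle 0|$. Here $H_Y$ is the one-dimensional space spanned by $|0\rangle$, and take $H_X$ to be the subspace on which $\sigma$ has its support. By Prop. 9 there exists a single-row matrix $T^{k}_{i}$ satisfying $T^{k}_{i}\geq 0$, $\sum_i T^{k}_{i} = T^{k}_{i=1}\leq 1\ \forall k$, $\sum_k T^{k}_{i}\leq 2^{-\lambda}$, such that $1 = \lambda_{i=1}(|0\rangle\langle 0|) = \sum_k T^{k}_{i=1}\lambda_k(\sigma)$. We also have $\lambda_k(\sigma)\neq 0$ because $\sigma$ has only nonzero eigenvalues in $H_X$. Then $\sum_k T^{k}_{i=1}\lambda_k(\sigma) = 1 = \sum_k\lambda_k(\sigma)$ implies $T^{k}_{i=1} = 1\ \forall k$. That is, the condition $\sum_k T^{k}_{i=1}\leq 2^{-\lambda}$ forces $T^{k}_{i=1}$ to have at most $2^{-\lambda}$ elements, i.e. the rank of $\sigma$ may not exceed $2^{-\lambda}$.
The converse holds because any state majorizes a uniform state of the same rank.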
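Proposition 18 (Condition on Support Sizes for Lambda-Majorization). If $\sigma\succ_{\lambda}\rho$, then $\mathrm{rank}\,\sigma\leq 2^{-\lambda}\,\mathrm{rank}\,\rho$.
Proof of Prop. 18. Notice that $\rho\succ\frac{1}{\mathrm{rank}\,\rho}\mathbb{1}_{\mathrm{rank}\,\rho}$, and thus $\sigma\succ_{\lambda}\frac{1}{\mathrm{rank}\,\rho}\mathbb{1}_{\mathrm{rank}\,\rho}$. Then, by Prop. 8 we have
\[ \sigma \succ_{\lambda-\log\mathrm{rank}\,\rho} |0\rangle\langle 0| ; \]
it remains to apply Prop. 17.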
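Proposition 19 (Being Lambda-Majorized by a Pure State). Let the state $\rho$ have maximum eigenvalue $\lambda_{\max}(\rho)$. For any pure state $|0\rangle$, we have $|0\rangle\langle 0|\succ_{\lambda}\rho$ if and only if $\lambda_{\max}(\rho)\leq 2^{-\lambda}$. Equivalently, $\frac{1}{n}\mathbb{1}_n\succ\rho$ if and only if $\lambda_{\max}(\rho)\leq\frac{1}{n}$.
Proof of Prop. 19. Let $T^{k}_{i}$ be as in Prop. 9. Note that here $k$ only takes the value 1, because the input space is the one-dimensional space spanned by $|0\rangle$. Then $\lambda_i(\rho) = \sum_k T^{k}_{i}\lambda_k(|0\rangle\langle 0|) = T^{k=1}_{i}$ and thus $T^{k}_{i} = \lambda_i(\rho)$. Then $2^{-\lambda}\geq\sum_k T^{k}_{i} = T^{k=1}_{i} = \lambda_i(\rho)$ for all $i$. In particular, $2^{-\lambda}\geq\lambda_{\max}(\rho)$.
Conversely, if $\lambda_{\max}(\rho)\leq 2^{-\lambda}$, then let $T^{k=1}_{i} = \lambda_i(\rho)$. This matrix $T$ satisfies the conditions in Prop. 9 and thus $|0\rangle\langle 0|\succ_{\lambda}\rho$.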
4. Optimal Lambda Majorization for Normalized States and Relation to Single-Shot Entropy Measures
Define the absorbed randomness (or relative mixedness [19]) of a transition from $\sigma$ to $\rho$ as the maximal amount of randomness that one can get rid of, or (when negative) minus the minimal amount of randomness that one has to generate, in a noisy operation process:
\[ R(\sigma\to\rho) = \sup\{\lambda : \sigma\succ_{\lambda}\rho\} . \tag{A6} \]
Recent work has shown that this measure is relevant for the amount of extractable work of processes acting on arrays of Szilard boxes [19].
The absorbed randomness has some tight relations to single-shot entropy measures, which we present here. These are reformulations of results shown in [17, 18].
Proposition 20. The absorbed randomness defined above satisfies the following bounds:
\[ H_{\min}(\rho) - H_0(\sigma) \leq R(\sigma\to\rho) \leq H_0(\rho) - H_0(\sigma) . \]
Proposition 21. If $|0\rangle$ denotes any pure state, then the following relations hold:
\[ R(|0\rangle\to\rho) = H_{\min}(\rho) , \tag{A7} \]
\[ R(\sigma\to|0\rangle) = -H_0(\sigma) . \tag{A8} \]
Similar explicit values can be obtained in the case where either the initial state or the target state is fully mixed.
Proposition 22. If $\frac{1}{n}\mathbb{1}_n$ denotes the fully mixed state on $\log n$ qubits, then:
\[ R(\tfrac{1}{n}\mathbb{1}_n\to\rho) = H_{\min}(\rho) - \log n , \tag{A9} \]
\[ R(\sigma\to\tfrac{1}{n}\mathbb{1}_n) = \log n - H_0(\sigma) . \tag{A10} \]
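Proof of Prop. 20. Lower bound: Let $\lambda_1 = H_{\min}(\rho) = -\log\lambda_{\max}(\rho)$ and $\lambda_2 = H_0(\sigma) = \log\mathrm{rank}\,\sigma$. By Proposition 19, we have $2^{-\lambda_1}\mathbb{1}_{2^{\lambda_1}}\succ\rho$ and by Proposition 17, $\sigma\succ 2^{-\lambda_2}\mathbb{1}_{2^{\lambda_2}}$. The majorization carries over to the tensor product, $2^{-\lambda_1}\mathbb{1}_{2^{\lambda_1}}\otimes\sigma \succ 2^{-\lambda_2}\mathbb{1}_{2^{\lambda_2}}\otimes\rho$, and $\lambda_1-\lambda_2$ is a valid maximization candidate for (A6).
Upper bound: Let $\lambda = R(\sigma\to\rho)$ satisfying $\sigma\succ_{\lambda}\rho$. Proposition 18 immediately yields $2^{\lambda}\leq\frac{\mathrm{rank}\,\rho}{\mathrm{rank}\,\sigma}$, and
\[ R(\sigma\to\rho) = \lambda \leq \log\mathrm{rank}\,\rho - \log\mathrm{rank}\,\sigma . \]
Recalling the definition of the Rényi-0 entropy $H_0(\rho) = \log\mathrm{rank}\,\rho$ yields the required upper bound.
Proof of Prop. 21. Equation (A8) follows from the bounds of Proposition 20, which become tight in this special case. Equality (A7) is a direct consequence of Prop. 19.
Proof of Prop. 22. The bounds of Proposition 20 become tight for (A10). Equality (A9) is again a consequence of Prop. 19, recalling Prop. 8, which allows us to write $|0\rangle\langle 0|\succ_{\lambda+\log n}\rho$ instead of $\frac{1}{n}\mathbb{1}_n\succ_{\lambda}\rho$.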
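The bounds of Proposition 20 depend only on the rank of $\sigma$ and on the rank and largest eigenvalue of $\rho$, so they can be evaluated directly from spectra. The sketch below does this in bits (base-2 logarithms); the function names and the example spectrum are illustrative assumptions. The two examples correspond to the special cases (A7) and (A8).

```python
import math

def h0(spectrum, tol=1e-12):
    """Renyi-0 entropy in bits: log2 of the rank."""
    return math.log2(sum(1 for p in spectrum if p > tol))

def h_min(spectrum):
    """Min-entropy in bits: -log2 of the largest eigenvalue."""
    return -math.log2(max(spectrum))

def absorbed_randomness_bounds(sigma, rho):
    """(lower, upper) bounds on R(sigma -> rho) from Proposition 20."""
    return h_min(rho) - h0(sigma), h0(rho) - h0(sigma)

pure = [1.0]
rho = [0.5, 0.25, 0.25]  # hypothetical example spectrum

# Preparing rho from a pure state: the lower bound equals H_min(rho), cf. (A7).
prepare_lo, prepare_hi = absorbed_randomness_bounds(pure, rho)

# Erasing rho to a pure state: both bounds equal -H_0(rho), cf. (A8).
erase_lo, erase_hi = absorbed_randomness_bounds(rho, pure)
```

For the preparation the bounds are $1$ and $\log 3\approx 1.585$, and by (A7) the lower one is attained; for the erasure the two bounds coincide at $-\log 3$, which is the tightness used in the proof of Prop. 21.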
Appendix B: Derivation of the Main Result: Formulation as Semidefinite Program
Let $H_X$ be a quantum system in the state $\sigma_X$. Let $H_R$ be an additional quantum system and let $|\sigma\rangle_{XR}$ be a purification of $\sigma_X$.
Suppose we want to bring the system $X$ into a given state $\rho_{X'R}$ with a lambda-majorization (here $\rho_{X'R}$ is not necessarily pure; giving the joint state with $R$ allows us to specify which correlations we want to preserve). The task is then the following.
Task. Find the best (maximal) $\lambda$, such that there exists a completely positive, $2^{-\lambda}$-subunital, trace-nonincreasing map $T_{X\to X'}$ satisfying $T_{X\to X'}(\sigma_{XR}) = \rho_{X'R}$.
In other words, we would like to find the trace-nonincreasing channel that satisfies $T_{X\to X'}(\sigma_{XR}) = \rho_{X'R}$ and that has the smallest possible $\|T_{X\to X'}(\mathbb{1}_X)\|_{\infty}$.
This problem can be formulated as a semidefinite program in terms of the variables $\alpha$ (defined as $\alpha = 2^{-\lambda}$) and $T_{X\to X'}$ (through its Choi-Jamiołkowski map $T_{XX'}$). (See [60, 61] for an introduction to SDPs in a style similar to what we use here.)
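Primal
minimize: $\alpha$
subject to (the dual variable associated with each constraint is given after the colon):
\[ T_{X\to X'}(\mathbb{1}_X) \leq \alpha\,\mathbb{1}_{X'} \ :\ \omega_{X'} \tag{B1a} \]
\[ T^{\dagger}_{X\leftarrow X'}(\mathbb{1}_{X'}) \leq \mathbb{1}_X \ :\ X_X \tag{B1b} \]
\[ T_{X\to X'}(\sigma_{XR}) = \rho_{X'R} \ :\ Z_{X'R} \tag{B1c} \]
Dual
maximize: $\mathrm{tr}(Z_{X'R}\,\rho_{X'R}) - \mathrm{tr}\,X_X$
subject to:
\[ \mathrm{tr}\,\omega_{X'} \leq 1 \tag{B2a} \]
\[ \mathrm{tr}_R\big[\sigma^{t_X}_{XR}\,Z_{X'R}\big] \leq \mathbb{1}_X\otimes\omega_{X'} + X_X\otimes\mathbb{1}_{X'} . \tag{B2b} \]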
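Note that since the channel does not touch $R$, we must necessarily have $\rho_R = \sigma_R$. Let $E$ be an environment that purifies the output state as $\rho_{X'RE}$. As two purifications with the same reduced state on $R$, the two states $\rho_{X'RE}$ and $\sigma_{XR}$ must be related by an isometry $V_{X\to X'E}$ as $\rho_{X'RE} = V_{X\to X'E}\,\sigma_{XR}\,V^{\dagger}$. We can choose $V_{X\to X'E}$ to be a partial isometry such that $VV^{\dagger} = \Pi^{X'E}_{\rho}$, the projector on the support of $\rho_{X'E}$, and $V^{\dagger}V = \Pi^{X}_{\sigma}$, the projector on the support of $\sigma_X$.
Now, define $T$ by its Stinespring dilation
\[ T_{X\to X'}(\cdot) = \mathrm{tr}_E\big[V_{X\to X'E}\,(\cdot)\,V^{\dagger}\big] , \tag{B3} \]
and let $\alpha = \|T(\mathbb{1}_X)\|_{\infty}$. We will show that this choice of variables is feasible and optimal, and will derive a more explicit value of $\alpha$.
Condition (B1a) is satisfied by definition, and (B1b) because $V$ is a partial isometry. Also, verifying condition (B1c),
\[ T_{X\to X'}(\sigma_{XR}) = \mathrm{tr}_E\big[V_{X\to X'E}\,\sigma_{XR}\,V^{\dagger}\big] = \mathrm{tr}_E\,\rho_{X'RE} = \rho_{X'R} . \tag{B4} \]
Now calculate
\[ \alpha = \|T(\mathbb{1}_X)\|_{\infty} = \|\mathrm{tr}_E\,VV^{\dagger}\|_{\infty} = \|\mathrm{tr}_E\,\Pi^{X'E}_{\rho}\|_{\infty} = \max_{\omega_{X'}}\mathrm{tr}\big[\Pi^{X'E}_{\rho}\,\omega_{X'}\big] = 2^{H_0(E|X')_{\rho}} . \tag{B5} \]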
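We will now show that this value is optimal by exhibiting a solution to the dual problem that achieves the same value. Let $\omega_{X'}$ be the optimal operator in the definition of $H_0(E|X')_{\rho}$ as in (B5), let $Z_{X'R} = \sigma^{-1}_R\otimes\omega_{X'}$ and let $X_X = 0$. This choice is feasible since condition (B2a) is automatically satisfied and condition (B2b) becomes
\[ \mathrm{tr}_R\big[\sigma^{t_X}_{XR}\,Z_{X'R}\big] = \mathrm{tr}_R\big[\sigma^{t_X}_{XR}\,\sigma^{-1}_R\otimes\omega_{X'}\big] = \mathrm{tr}_R\big[\Phi^{t_X}_{X|R}\otimes\omega_{X'}\big] = \Pi^{t_X}_{X}\otimes\omega_{X'} \leq \mathbb{1}_X\otimes\omega_{X'} , \tag{B6} \]
where $\Phi_{X|R}$ is an (unnormalized) maximally entangled state on the supports of $\sigma_X$ and $\sigma_R$, and $\Pi_X$ is the projector on the support of $\sigma_X$. Let $\rho_{X'RE}$ and $V_{X\to X'E}$ be defined as before. The value achieved by this choice of dual variables is then
\[ \mathrm{tr}\big[Z_{X'R}\,\rho_{X'R}\big] = \mathrm{tr}\big[\sigma^{-1}_R\otimes\omega_{X'}\ \rho_{X'R}\big] \tag{B7} \]
\[ = \mathrm{tr}\big[\sigma^{-1}_R\otimes\omega_{X'}\ V_{X\to X'E}\,\sigma_{XR}\,V^{\dagger}\big] \tag{B8} \]
\[ = \mathrm{tr}\big[\omega_{X'}\ V_{X\to X'E}\,\Phi_{X|R}\,V^{\dagger}\big] = \mathrm{tr}\big[\omega_{X'}\,\Pi^{X'E}_{\rho}\big] = 2^{H_0(E|X')_{\rho}} . \tag{B9} \]
From this, we conclude that the optimal $\lambda$ for this problem is
\[ \lambda_{\mathrm{opt}} = -H_0(E|X')_{\rho} , \tag{B10} \]
where $\rho_{X'RE}$ is a purification of $\rho_{X'R}$.
We note also that this gives the optimal amount of extracted work. Of course, any $\lambda\leq\lambda_{\mathrm{opt}}$ is also a solution.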
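Appendix C: Rényi-zero entropy of the W state
Let $S$ and $M$ be two qubits in the state $\sigma_{SM} = \frac{1}{3}|00\rangle\langle 00|_{SM} + \frac{2}{3}|\Psi^+\rangle\langle\Psi^+|$ (where $|\Psi^+\rangle$ is the Bell state $|\Psi^+\rangle = \frac{1}{\sqrt{2}}[|01\rangle + |10\rangle]$). Written out explicitly in the basis $\{|00\rangle, |01\rangle, |10\rangle, |11\rangle\}$,
\[ \sigma_{SM} = \begin{pmatrix} 1/3 & & & \\ & 1/3 & 1/3 & \\ & 1/3 & 1/3 & \\ & & & 0 \end{pmatrix} . \]
(Empty entries are zero.) The projector on its support is
\[ \Pi_{SM} = \begin{pmatrix} 1 & & & \\ & 1/2 & 1/2 & \\ & 1/2 & 1/2 & \\ & & & 0 \end{pmatrix} . \]
We would like to compute the quantity
\[ 2^{H_0(S|M)_{\sigma}} = \max_{\omega_M\ \text{dens. op.}} \mathrm{tr}\big[\Pi_{SM}\,(\mathbb{1}_S\otimes\omega_M)\big] . \]
Let $\omega_M = \begin{pmatrix} s_1 & s_2^{*} \\ s_2 & 1-s_1 \end{pmatrix}$; then
\[ \mathrm{tr}\big[\Pi_{SM}\,(\mathbb{1}_S\otimes\omega_M)\big] = s_1 + \tfrac{1}{2}(1-s_1) + \tfrac{1}{2}s_1 = \tfrac{1}{2} + s_1 . \tag{C1} \]
Under the constraint $0\leq s_1\leq 1$, this expression is clearly maximized when $s_1 = 1$, yielding the value
\[ H_0(S|M)_{\sigma} = \log\tfrac{3}{2} . \]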
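The optimization in (C1) can be reproduced with a short numerical sketch. The code below hard-codes the projector $\Pi_{SM}$ given above, evaluates $\mathrm{tr}[\Pi_{SM}(\mathbb{1}_S\otimes\omega_M)]$ for a real-valued $\omega_M$, and scans $s_1$ over a grid; the function name, the restriction to real off-diagonal entries and the grid resolution are illustrative assumptions.

```python
import math

# Projector onto the support of sigma_SM in the basis |00>, |01>, |10>, |11>.
PI_SM = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]

def objective(s1, s2=0.0):
    """tr[Pi_SM (1_S x omega_M)] for the real matrix omega_M = [[s1, s2], [s2, 1 - s1]]."""
    omega = [[s1, s2], [s2, 1.0 - s1]]
    total = 0.0
    for s in (0, 1):
        for m in (0, 1):
            for m2 in (0, 1):
                # (1_S x omega_M) is block diagonal in s, with entries omega[m][m2].
                total += PI_SM[2 * s + m2][2 * s + m] * omega[m][m2]
    return total

grid = [i / 100 for i in range(101)]
best_s1 = max(grid, key=objective)
best_value = objective(best_s1)
h0_s_given_m = math.log2(best_value)
```

The scan returns `best_s1 = 1.0` and `best_value = 1.5`, so `h0_s_given_m` equals $\log_2\frac{3}{2}\approx 0.585$. The off-diagonal entry `s2` does not enter the objective, because $\Pi_{SM}$ has no matrix elements between $|s\,0\rangle$ and $|s\,1\rangle$ for the same $s$.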
[1] R. Landauer, IBM Journal of Research and Development5, 183
(1961), ISSN 0018-8646.
[2] C. H. Bennett, International Journal of TheoreticalPhysics
21, 905 (1982).
[3] C. H. Bennett, Studies in History and Philosophy of Mod-ern
Physics 34, 501 (2003).
[4] J. Oppenheim, M. Horodecki, P. Horodecki, andR. Horodecki,
Physical Review Letters 89, 180402(2002), ISSN 0031-9007,
0112074.
[5] L. del Rio, J. Aberg, R. Renner, O. Dahlsten, and V.
Ve-dral, Nature 474, 61 (2011), ISSN 1476-4687.
[6] C. H. Bennett, IBM Journal of Research and Develop-ment 17,
525 (1973), ISSN 0018-8646.
[7] L. Szilard, Zeitschrift fur Physik 53, 840 (1929),
ISSN1434-6001.
[8] R. W. Keyes and R. Landauer, IBM Journal of Researchand
Development 14, 152 (1970), ISSN 0018-8646.
[9] K. Maruyama, F. Nori, and V. Vedral, Reviews of Mod-ern
Physics 81, 1 (2009), ISSN 0034-6861.
[10] B. Piechocinska, Physical Review A 61, 062314 (2000),ISSN
1050-2947.
[11] L. Schulman and U. Vazirani, in Proceedings of the
thirty-first annual ACM symposium on Theory of computing(ACM,
1999), pp. 322329.
[12] R. Alicki, M. Horodecki, P. Horodecki, and R.
Horodecki,Open Systems & Information Dynamics 11, 205
(2004),ISSN 12301612.
[13] M. Horodecki, P. Horodecki, and J. Oppenheim,
PhysicalReview A 67, 10 (2003), ISSN 1050-2947.
[14] F. G. S. L. Brandao, M. Horodecki, J. Oppenheim,J. M.
Renes, and R. W. Spekkens, ArXiv e-prints (2011),1111.3882.
[15] D. Janzing, P. Wocjan, R. Zeier, R. Geiss, and T.
Beth,International Journal of Theoretical Physics 39,
2717(2000).
[16] D. Janzing, Computer science approach to quantumcontrol
(Universitatsverlag Karlsruhe, Karlsruhe, 2006),ISBN
3-86644-083-9.
[17] D. Egloff, Masters thesis, ETH Zurich (2010).
[18] O. C. O. Dahlsten, R. Renner, E. Rieper, and V. Vedral,New
Journal of Physics 13, 53015 (2011), ISSN 1367-2630.
[19] D. Egloff, O. C. O. Dahlsten, R. Renner, and V.
Vedral,ArXiv e-prints (2012), 1207.0434.
[20] J. Åberg, ArXiv e-prints (2011), 1110.6121.
[21] M. Horodecki and J. Oppenheim, ArXiv e-prints (2011),
1111.3834.
[22] H. Joe, Journal of Mathematical Analysis and Applications
148, 287 (1990), ISSN 0022247X.
[23] E. Ruch, R. Schranner, and T. H. Seligman, Journal of
Mathematical Analysis and Applications 76, 222 (1980), ISSN
0022247X.
[24] E. Ruch, R. Schranner, and T. H. Seligman, The Journal of
Chemical Physics 69, 386 (1978), ISSN 00219606.
[25] C. A. Mead, The Journal of Chemical Physics 66, 459 (1977),
ISSN 00219606.
[26] R. Renner, Ph.D. thesis, ETH Zurich (2005),
arXiv:quant-ph/0512258.
[27] M. Tomamichel, R. Colbeck, and R. Renner, IEEE Transactions
on Information Theory 56, 4674 (2010).
[28] M. Tomamichel, Ph.D. thesis, ETH Zurich (2012),
arXiv:1203.2142.
[29] R. König, R. Renner, and C. Schaffner, IEEE Transactions
on Information Theory 55, 4337 (2009), ISSN 0018-9448.
[30] N. Datta and R. Renner, IEEE Transactions on Information
Theory 55, 2807 (2009), ISSN 0018-9448.
[31] R. Bhatia, Matrix Analysis, Graduate Texts in Mathematics
(Springer, 1997).
[32] R. A. Horn and C. R. Johnson, Matrix Analysis (Cambridge
University Press, New York, 1985).
[33] A. Uhlmann, Reports on Mathematical Physics 1, 147 (1970),
ISSN 00344877.
[34] A. Uhlmann, Wiss. Z. Karl-Marx-Univ. Leipzig,
Math.-Naturwiss. 20, 633 (1971).
[35] A. Uhlmann, Wiss. Z. Karl-Marx-Univ. Leipzig,
Math.-Naturwiss. 21, 421 (1972).
[36] A. Uhlmann, Wiss. Z. Karl-Marx-Univ. Leipzig,
Math.-Naturwiss. 22, 139 (1973).
[37] U. Haagerup and M. Musat, Communications in Mathematical
Physics 303, 555 (2011), ISSN 0010-3616.
[38] A. Rényi, in Proceedings of the Fourth Berkeley Symposium
on Mathematical Statistics and Probability, Vol. 1:
Contributions to the Theory of Statistics (1960), pp. 547-561.
[39] M. Tomamichel, C. Schaffner, A. Smith, and R. Renner, IEEE
Transactions on Information Theory 57, 5524 (2011), ISSN
0018-9448.
[40] M. A. Nielsen and I. L. Chuang, Quantum Computation and
Quantum Information (Cambridge University Press, 2000).
[41] M. Tomamichel, R. Colbeck, and R. Renner, IEEE Transactions
on Information Theory 55, 5840 (2009).
[42] H. E. D. Scovil and E. O. Schulz-DuBois, Phys. Rev. Lett. 2,
262 (1959).
[43] J. E. Geusic, E. O. Schulz-DuBois, and H. E. D. Scovil,
Phys. Rev. 156, 343 (1967).
[44] R. Alicki, Journal of Statistical Physics 20, 671 (1979),
ISSN 0022-4715.
[45] J. Howard, Nature 389, 561 (1997).
[46] E. Geva and R. Kosloff, The Journal of Chemical Physics
97, 4398 (1992).
[47] P. Hänggi and F. Marchesoni, Rev. Mod. Phys. 81, 387
(2009).
[48] A. E. Allahverdyan and T. M. Nieuwenhuizen, Phys. Rev.
Lett. 85, 1799 (2000).
[49] T. Feldmann and R. Kosloff, Phys. Rev. E 73, 025107
(2006).
[50] J. Baugh, O. Moussa, C. Ryan, A. Nayak, and R. Laflamme,
Nature 438, 470 (2005).
[51] J. Gemmer, M. Michel, and G. Mahler, Quantum
thermodynamics: Emergence of thermodynamic behavior within
composite quantum systems (Springer Verlag, 2009).
[52] S. Popescu, A. Short, and A. Winter, Nature Physics 2, 754
(2006).
[53] N. Linden, S. Popescu, and P. Skrzypczyk, Physical Review
Letters 105, 130401 (2010).
[54] N. Linden, S. Popescu, A. Short, and A. Winter, New Journal
of Physics 12, 055021 (2010).
[55] C. Gogolin, M. Müller, and J. Eisert, Physical Review
Letters 106, 40401 (2011).
[56] S. Trotzky, Y. Chen, A. Flesch, I. McCulloch, U. Schollwöck,
J. Eisert, and I. Bloch, Nature Physics 8, 325 (2012).
[57] A. Hutter and S. Wehner, ArXiv e-prints (2011),
1111.3080.
[58] A. W. Marshall, I. Olkin, and B. C. Arnold, Inequalities:
Theory of Majorization and Its Applications (Springer, 2010),
ISBN 0387400877.
[59] G. H. Hardy, J. E. Littlewood, and G. Pólya, Inequalities,
Cambridge Mathematical Library (Cambridge University Press,
1952), ISBN 9780521358804.
[60] A. Barvinok, A Course in Convexity, vol. 54 of Graduate
Studies in Mathematics (American Mathematical Society, 2002),
ISBN 0-8218-2968-8.
[61] J. Watrous, Theory of Quantum Information: Lecture notes
from Fall 2008 (Online Lecture Notes, 2008),
www.cs.uwaterloo.ca/~watrous/quant-info/.