1 Topic 2: Canonical Transformations and the Hamilton-Jacobi Equation
Reading: Hand & Finch Chapter 6 (required), Goldstein 391-396 (supplemental, but I will
cover this material in the notes).
The first part of this topic includes the material in Chapter 6 with some supplementary
reading from Goldstein. At the end, we will cover in a bit more detail the relationship to
wave mechanics.
1.1 Canonical Transformations
We have considered coordinate transformations between sets of space co-ordinates
$$Q_k = Q_k\left(q(t), t\right) \qquad (1)$$
where $q$ has $N$ components in general. These are called point transformations. Obvious
examples of this are going from Cartesian to cylindrical or spherical co-ordinates.
One advantage of the Lagrange formulation of mechanics is that it easily lets us choose
any invertible function of the Cartesian coordinates as generalized coordinates and then
write down Lagrange's equations directly:
$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_i}\right) - \frac{\partial L}{\partial q_i} = 0, \qquad i = 1, \ldots, N \qquad (2)$$
so we get the equations of motion with no contortions.
Is there a similar situation for the coordinates of phase space and Hamilton's formulation?
Not in general: you can easily invent transformations from $(q_i, p_i)$ to a new set
$(Q_i(q_i, p_i),\, P_i(q_i, p_i))$ (such a transformation containing both $q$ and $p$ is called a contact
transformation) so that the equations of motion for $(Q_i, P_i)$ do not follow from a Hamiltonian.
A simple example:
$$H = \frac{p^2}{2m}, \qquad \dot p = 0, \qquad \dot q = \frac{p}{m} \qquad (3)$$
Transform
$$P = pt \quad \text{so} \quad \dot P = \dot p\, t + p = p = \frac{P}{t} \qquad (4)$$
$$Q = qt \quad \text{so} \quad \dot Q = \dot q\, t + q = \frac{pt}{m} + q = \frac{P}{m} + \frac{Q}{t} \qquad (5)$$
Does a function $K(P, Q, t)$ exist such that
$$\dot P = -\frac{\partial K}{\partial Q} \qquad (6)$$
$$\dot Q = \frac{\partial K}{\partial P} \qquad (7)$$
If so, then
$$\frac{\partial \dot P}{\partial P} = -\frac{\partial^2 K}{\partial P\,\partial Q} = -\frac{\partial \dot Q}{\partial Q} \qquad (8)$$
But
$$\frac{\partial \dot P}{\partial P} = \frac{1}{t} \qquad (9)$$
$$\frac{\partial \dot Q}{\partial Q} = \frac{1}{t} \qquad (10)$$
So this transformation does not lead to Hamilton's equations, and area in $(P, Q)$ space is not
preserved under the motion. A transformation like this is said to be non-canonical. This has
nothing to do with finding the motion in the new variables; in particular,
$$\frac{dP}{dt} = \frac{P}{t} \;\rightarrow\; \frac{dP}{P} = \frac{dt}{t} \;\rightarrow\; P = kt \qquad (11)$$
$$\frac{dQ}{dt} = \frac{P}{m} + \frac{Q}{t} = \frac{kt}{m} + \frac{Q}{t} \qquad (12)$$
or
$$\frac{1}{t}\frac{dQ}{dt} - \frac{Q}{t^2} = \frac{k}{m} \qquad (13)$$
$$\frac{d}{dt}\left(\frac{Q}{t}\right) = \frac{k}{m} \qquad (14)$$
so
$$\frac{Q}{t} = \frac{k}{m}t + a, \qquad Q = \frac{k}{m}t^2 + at \;\rightarrow\; q = \frac{k}{m}t + a$$
However, any result that follows from Hamilton's equations does not apply in the non-canonical
space of $(P, Q)$.
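As a quick check (my own sketch, not part of the notes), a few lines of sympy confirm that no $K$ can exist for this transformation, since eqs. (9) and (10) violate the condition (8):

```python
# Symbolic check that P = p*t, Q = q*t is non-canonical for H = p^2/(2m).
# A Hamiltonian K(P, Q, t) would require dPdot/dP = -dQdot/dQ,
# but both partials here equal 1/t.
import sympy as sp

t, m, P, Q = sp.symbols('t m P Q', positive=True)

Pdot = P / t              # eq. (4): Pdot = p = P/t
Qdot = P / m + Q / t      # eq. (5)

lhs = sp.diff(Pdot, P)    # 1/t
rhs = -sp.diff(Qdot, Q)   # -1/t
print(sp.simplify(lhs - rhs))  # 2/t, not zero, so no K exists
```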
We can, however, find a class of transformations that are said to be canonical, in that
they do follow from a new Hamiltonian. These transformations (involving $q_i, p_i$) are called
contact (as opposed to point) transformations. The phase space of the new variables has
the same properties as the old (e.g. Liouville's theorem holds). Two examples for the free
particle:
$$H = \frac{p^2}{2m} \qquad (15)$$
and apply the transformation
$$P = ap + bq$$
$$Q = cp + dq$$
and the inverse
$$p = \frac{1}{\Delta}(dP - bQ)$$
$$q = \frac{1}{\Delta}(-cP + aQ)$$
where we assume the determinant of the coefficients $\Delta = ad - bc \neq 0$.
So
$$\dot P = \frac{bp}{m} = \frac{b}{m\Delta}(dP - bQ)$$
$$\dot Q = \frac{dp}{m} = \frac{d}{m\Delta}(dP - bQ)$$
and we want to know if there is a $K$ such that Hamilton's equations hold. We saw above that
this requires $\partial \dot P/\partial P = -\partial \dot Q/\partial Q$. In fact
$$\frac{\partial \dot P}{\partial P} = \frac{bd}{m\Delta}, \qquad \frac{\partial \dot Q}{\partial Q} = -\frac{db}{m\Delta}$$
So there is a new Hamiltonian, $K$:
$$K(P, Q) = \Delta\, H\left(p(P, Q),\, q(P, Q)\right) = \frac{1}{2m\Delta}(dP - bQ)^2 \qquad (16)$$
so
$$\dot P = -\frac{\partial K}{\partial Q} = \frac{b}{m\Delta}(dP - bQ)$$
$$\dot Q = \frac{\partial K}{\partial P} = \frac{d}{m\Delta}(dP - bQ)$$
as above. Except for the factor $\Delta$, this looks just like what you might expect from Lagrangian
mechanics: just write the Hamiltonian in the new coordinates. If $\Delta = 1$, that's just what
happens; for the general case, however, "just evaluate the old Hamiltonian in the new variables"
doesn't work.
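A short symbolic check (my own, not in the text) that eq. (16) generates the correct equations of motion:

```python
# Verify eq. (16): with P = a p + b q, Q = c p + d q and Delta = a d - b c,
# K = Delta * H(p(P,Q), q(P,Q)) generates the correct Pdot, Qdot for H = p^2/2m.
import sympy as sp

a, b, c, d, m, P, Q = sp.symbols('a b c d m P Q')
Delta = a*d - b*c
p = (d*P - b*Q) / Delta          # inverse transformation
K = Delta * p**2 / (2*m)         # K = Delta * H

Pdot_K = -sp.diff(K, Q)          # Hamilton's equations in the new variables
Qdot_K = sp.diff(K, P)

# Direct computation: pdot = 0, qdot = p/m, so Pdot = b*p/m and Qdot = d*p/m
assert sp.simplify(Pdot_K - b*p/m) == 0
assert sp.simplify(Qdot_K - d*p/m) == 0
print("K = Delta*H reproduces the equations of motion")
```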
To show that the contact transformations are really very different from the point transformations
of Lagrangian mechanics, consider this example, again for the free particle:
$$H = \frac{p^2}{2m} \qquad (17)$$
and the transformation
$$P = p\cos q, \qquad Q = p\sin q$$
so $p^2 = P^2 + Q^2$ and
$$\dot P = \dot p\cos q - p\dot q\sin q = -\frac{p^2}{m}\sin q = -\sqrt{P^2 + Q^2}\,\frac{Q}{m}$$
$$\dot Q = \dot p\sin q + p\dot q\cos q = \frac{p^2}{m}\cos q = \sqrt{P^2 + Q^2}\,\frac{P}{m}$$
so
$$\frac{\partial \dot P}{\partial P} = -\frac{P}{\sqrt{P^2 + Q^2}}\,\frac{Q}{m}, \qquad \frac{\partial \dot Q}{\partial Q} = \frac{Q}{\sqrt{P^2 + Q^2}}\,\frac{P}{m}$$
so, as in the previous example, $K$ does exist and
$$K = \frac{1}{3m}\left(P^2 + Q^2\right)^{3/2} \qquad (18)$$
Note this is not even close to $H\left(p(P,Q),\, q(P,Q)\right)$! This transformation is "canonical" for
the free particle, but it is not canonical for other Hamiltonians. For example
$$H = \frac{p^2}{2m} + mgq \qquad (19)$$
$$\dot p = -mg, \qquad \dot q = \frac{p}{m} \qquad (20)$$
$$\dot P = -mg\frac{P}{\sqrt{P^2 + Q^2}} - \frac{Q}{m}\sqrt{P^2 + Q^2}$$
$$\dot Q = -mg\frac{Q}{\sqrt{P^2 + Q^2}} + \frac{P}{m}\sqrt{P^2 + Q^2}$$
and
$$\frac{\partial \dot P}{\partial P} \neq -\frac{\partial \dot Q}{\partial Q} \qquad (21)$$
So, this is only a "canonical-like" transformation.
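We can verify both claims symbolically. The sketch below (my own check, not in the text) computes $\partial\dot P/\partial P + \partial\dot Q/\partial Q$ by the chain rule; it vanishes for the free particle but not when the $mgq$ term is added:

```python
# Check that P = p cos(q), Q = p sin(q) is canonical only for the free particle.
import sympy as sp

m, g, p, q = sp.symbols('m g p q', positive=True)
P = p * sp.cos(q)
Q = p * sp.sin(q)

def check(pdot, qdot):
    """Return dPdot/dP + dQdot/dQ; zero means a new Hamiltonian K can exist."""
    Pdot = pdot*sp.cos(q) - p*qdot*sp.sin(q)
    Qdot = pdot*sp.sin(q) + p*qdot*sp.cos(q)
    # chain rule: invert the Jacobian of (P, Q) wrt (p, q)
    Jac = sp.Matrix([[sp.diff(P, p), sp.diff(P, q)],
                     [sp.diff(Q, p), sp.diff(Q, q)]])
    inv = Jac.inv()   # inv[0,0] = dp/dP, inv[1,0] = dq/dP, etc.
    dPdot_dP = sp.diff(Pdot, p)*inv[0, 0] + sp.diff(Pdot, q)*inv[1, 0]
    dQdot_dQ = sp.diff(Qdot, p)*inv[0, 1] + sp.diff(Qdot, q)*inv[1, 1]
    return sp.simplify(dPdot_dP + dQdot_dQ)

free = check(pdot=0, qdot=p/m)       # free particle
grav = check(pdot=-m*g, qdot=p/m)    # with gravity, eq. (20)
print(free, grav)                    # 0 and -g*m/p: no K in the second case
```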
For a contact transformation to be a general canonical transformation, it must yield
Hamilton's equations for any Hamiltonian.
Now that we're clear on what canonical transformations are, let's see how to construct them.
As your book does, I'm going to stick to 2-D phase space (1-D configuration space) for the
following discussions. I'll also note that I will, like your book, take the generating function
approach to constructing canonical transformations. There is another, seemingly unrelated
approach that can be derived in terms of the matrix, or symplectic, formalism of Hamilton's
equations. We'll cover this later (see also Appendix A of Chapter 6 and Goldstein pp. 391ff).
1.2 Generating Function Approach to Canonical Transformations
We saw last term (proven in a presentation problem) that two Lagrangians differing by a
time derivative of the form $\frac{dF(q,t)}{dt}$ are both valid descriptions of the same physical system.
Note that $F$ does not depend on $\dot q$. Consider two different descriptions of the same system,
$L'\left(Q, \dot Q, t\right)$ and $L(q, \dot q, t)$. These refer to the same physical system if
$$L'\left(Q, \dot Q, t\right) = L(q, \dot q, t) - \frac{dF(q, Q, t)}{dt} \qquad (22)$$
Note that here $F$ can be a function of $q, Q$ but not of their time derivatives.
We want the Euler-Lagrange equations to hold in terms of the new variables. Integrating
both sides we have
$$\int_{t_1}^{t_2} L'\,dt = \int_{t_1}^{t_2} L\,dt + F\left(q(t_1), Q(t_1), t_1\right) - F\left(q(t_2), Q(t_2), t_2\right) \qquad (23)$$
We can see that Hamilton's principle will hold in the new system if it holds in the old if we
take the variation of the above equation and assume that arbitrary variations $\delta q$ imply
arbitrary variations $\delta Q$ (assuming $\delta F$ vanishes at the end points). So, $F$ can be used to
generate a new Lagrangian in terms of new variables for which Hamilton's principle holds.
Now we have to figure out how to construct the new canonical momentum and the
new Hamiltonian to get the new configuration space (in which everything we derive from
Hamilton's formalism applies). Note we have specified a generating function, $F$, not a set of
transformation equations. We have to get the specific form of the transformation equations
from $F$.
To do the above, take the time derivative of $F(Q, q, t)$:
$$\frac{dF}{dt} = \frac{\partial F}{\partial q}\dot q + \frac{\partial F}{\partial Q}\dot Q + \frac{\partial F}{\partial t} \qquad (24)$$
Now $L' = L'\left(Q, \dot Q, t\right)$ does not depend on $\dot q$, so
$$\frac{\partial L'}{\partial \dot q} = \frac{\partial L}{\partial \dot q} - \frac{\partial}{\partial \dot q}\left(\frac{dF}{dt}\right) = \frac{\partial L}{\partial \dot q} - \frac{\partial F}{\partial q} = 0 \qquad (25)$$
and
$$p = \frac{\partial F}{\partial q} \qquad (26)$$
By definition the momentum canonical to $Q$ is
$$P = \frac{\partial L'}{\partial \dot Q} = -\frac{\partial F}{\partial Q} \qquad (27)$$
To get the transformation explicitly, we solve $p = \partial F/\partial q$ for $Q = Q(q, p, t)$, then we solve
$P = -\partial F/\partial Q$ for $P = P(q, p, t)$ (substituting our $Q(q, p, t)$ in for $Q$).
To find the new Hamiltonian, $K(Q, P)$, we construct it from our $Q, P$ and $L'$:
$$K(Q, P, t) \equiv P\dot Q - L'$$
$$= -\frac{\partial F}{\partial Q}\dot Q - L + \frac{\partial F}{\partial q}\dot q + \frac{\partial F}{\partial Q}\dot Q + \frac{\partial F}{\partial t}$$
$$= P\dot Q - L + p\dot q - P\dot Q + \frac{\partial F}{\partial t}$$
$$= p\dot q - L + \frac{\partial F}{\partial t}$$
and finally
$$K(Q, P, t) = H\left(q(Q, P),\, p(Q, P),\, t\right) + \frac{\partial F\left(q(Q, P), Q, t\right)}{\partial t} \qquad (28)$$
Now, can we use any $F(Q, q, t)$ we want? Without proof, it is also necessary and sufficient
that $\partial^2 F/\partial q\,\partial Q \neq 0$. If this second derivative vanishes, the transformation will not be invertible.
Suppose we are given a set of transformation equations; how do we know if they are canonical?
First, express $p, P$ as functions of $q, Q, t$. Then solve for $F$ using $P = -\partial F/\partial Q$, $p = \partial F/\partial q$, and see
if our conditions apply ($F = F(q, Q, t)$ and $\partial^2 F/\partial q\,\partial Q \neq 0$). This may or may not be possible to
solve, and we saw that not all contact transformations are canonical.
1.2.1 Types of Generating Functions
We have considered generating functions of the form $F(q, Q, t)$, where we assume $\delta F = 0$ at
$t = t_1, t_2$, and $F$ satisfies the condition on the double partial derivative above. Now we can
perform a Legendre transformation on either $q$ or $Q$ to replace them with either $p$ or $P$, so
that we can express the same canonical transformation by any of four generating functions,
$F = F_1(q, Q, t)$, $F_2(q, P, t)$, $F_3(p, Q, t)$ or $F_4(p, P, t)$. Why this is useful will be clear later.
We can therefore get any generating function from any other one by a series of transformations.
For example, to get $F_3(p, Q, t)$, transform $F_1$:
$$F_3(p, Q, t) = F_1(q, Q, t) - qp \qquad (29)$$
and from
$$\frac{\partial F_3}{\partial p} = \frac{\partial F_1}{\partial p} - q = 0 - q \qquad (30)$$
so
$$q = -\frac{\partial F_3}{\partial p} \qquad (31)$$
and
$$P = -\frac{\partial F_1}{\partial Q} = -\frac{\partial F_3}{\partial Q} \qquad (32)$$
Note that we have assumed that $q, p, Q, P$ are all independent variables. This is fine for the
purposes of the derivation, as we know they form an independent set dynamically. To get
the transformation equations which give us the functional dependence, we use $q = -\partial F_3/\partial p$
and $P = -\partial F_3/\partial Q$.
We can go through the same exercise to get $F_2$ and $F_4$, but we have to change the sign on
the transformation (because of the asymmetry of the minus sign in Hamilton's equations):
$$F_4(p, P, t) = F_3(p, Q, t) + PQ$$
$$F_2(q, P, t) = F_1(q, Q, t) + QP$$
and (without going through the straightforward steps) we get the following transformation
equations:
$$F_1(q, Q, t): \qquad p = \frac{\partial F_1}{\partial q}, \qquad P = -\frac{\partial F_1}{\partial Q}$$
$$F_2(q, P, t): \qquad p = \frac{\partial F_2}{\partial q}, \qquad Q = \frac{\partial F_2}{\partial P}$$
$$F_3(p, Q, t): \qquad q = -\frac{\partial F_3}{\partial p}, \qquad P = -\frac{\partial F_3}{\partial Q}$$
$$F_4(p, P, t): \qquad q = -\frac{\partial F_4}{\partial p}, \qquad Q = \frac{\partial F_4}{\partial P}$$
1.3 Poisson Brackets
The Poisson bracket of two (arbitrary) functions $F, G$ with respect to a canonically conjugate
pair is defined by
$$[F, G]_{q,p} \equiv \sum_{k=1}^{N}\left(\frac{\partial F}{\partial q_k}\frac{\partial G}{\partial p_k} - \frac{\partial F}{\partial p_k}\frac{\partial G}{\partial q_k}\right) \qquad (33)$$
The value of the Poisson bracket is independent of which set of conjugate variables we use
to evaluate the partials, so long as they are related by a canonical transformation:
$$[F, G]_{q,p} = [F', G']_{Q,P} \qquad (34)$$
where $F', G'$ are the transformed functions.
If we let $F = Q$, $G = P$, then
$$[Q, P]_{Q,P} = [Q(q, p),\, P(q, p)]_{q,p} = 1 \qquad (35)$$
The significance of this is that without knowing the form of the generating function for the
canonical transformation $q, p \rightarrow Q, P$ we can test whether a given relationship is canonical
or not. If this holds, then the transformation must be canonical (it is a sufficient and
necessary condition). You can demonstrate this directly using the formulae of the canonical
transformation.
If we consider an arbitrary number of dimensions, with $[F, G]_{q,p} = [F', G']_{Q,P}$ and
$\partial q_k/\partial q_l = \delta_{kl}$, $\partial q_k/\partial p_l = 0$, and so on, then we have
$$[Q_i, Q_k]_{p,q} = 0, \qquad [P_i, P_k]_{p,q} = 0, \qquad [P_i, Q_k]_{p,q} = \delta_{ik} \qquad (36)$$
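The bracket test (35) applied to the earlier linear transformation (my own check): $[Q, P]_{q,p} = ad - bc$, so the linear contact transformation is canonical exactly when its determinant $\Delta = 1$, consistent with what we found via the new Hamiltonian $K = \Delta H$:

```python
# Poisson-bracket test of canonicity for P = a p + b q, Q = c p + d q.
import sympy as sp

a, b, c, d, q, p = sp.symbols('a b c d q p')
P = a*p + b*q
Q = c*p + d*q

bracket = sp.diff(Q, q)*sp.diff(P, p) - sp.diff(Q, p)*sp.diff(P, q)
print(sp.simplify(bracket))   # a*d - b*c: canonical iff the determinant is 1
```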
1.4 The Symplectic Property of General Canonical Transformations
(See Appendix A of your book or Goldstein ch. 9)
We can work out a sufficient condition for a general canonical transformation easily using
a separate approach. First introduce a systematic notation. Let
$$\eta = \begin{pmatrix} p_1 \\ \vdots \\ p_N \\ q_1 \\ \vdots \\ q_N \end{pmatrix} = \begin{pmatrix} \vec p \\ \vec q \end{pmatrix} \qquad (37)$$
where $\eta$ is a $2N$-dimensional vector, $\vec p, \vec q$ are $N$-dimensional vectors, and we define
$$J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \qquad (38)$$
where $0, 1$ are $N \times N$ matrices and $J$ is a $2N \times 2N$ matrix. Hamilton's equations in this
notation are
$$\dot\eta_i = J_{ij}\frac{\partial H}{\partial \eta_j} \qquad (39)$$
Now a few properties of the matrix $J$: $\tilde J = -J$, $J^{-1} = -J$, so $J^2 = -1$ and
$$(\det J)^2 = \det\left(J^2\right) = \det(-1) = (-1)^{2N} = +1 \qquad (40)$$
so
$$\det J = \pm 1 \qquad (41)$$
Its value is actually $+1$, as can be seen by row interchange (I won't go through it, but you
can write it out for yourself); in any case you can see it for the simple $2 \times 2$ example
$$\begin{vmatrix} 0 & -1 \\ 1 & 0 \end{vmatrix} = 1 \qquad (42)$$
So in summary, $J$ is antisymmetric, has $-1$ as its square, and has unit determinant. Now let
$\xi_i = \xi_i\left(\eta_j, t\right)$ be invertible, so that $\eta_i = \eta_i\left(\xi_j, t\right)$ (this implies $\det\left(\partial\xi_i/\partial\eta_j\right)$ is not identically
zero). Then
$$H_\xi\left(\xi_j, t\right) \equiv H_\eta\left(\eta_i(\xi_j, t),\, t\right) \qquad (43)$$
so that
$$\frac{\partial H_\eta}{\partial \eta_k} = \frac{\partial H_\xi}{\partial \xi_i}\frac{\partial \xi_i}{\partial \eta_k} \qquad (44)$$
and
$$\dot\xi_i = \frac{\partial \xi_i}{\partial \eta_j}\dot\eta_j + \frac{\partial \xi_i}{\partial t} = \frac{\partial \xi_i}{\partial \eta_j}J_{jk}\frac{\partial H_\eta}{\partial \eta_k} + \frac{\partial \xi_i}{\partial t} \qquad (45)$$
so
$$\dot\xi_i = \frac{\partial \xi_i}{\partial \eta_j}J_{jk}\frac{\partial H_\eta}{\partial \eta_k} + \left.\frac{\partial \xi_i}{\partial t}\right|_\eta = \frac{\partial \xi_i}{\partial \eta_j}J_{jk}\frac{\partial H_\xi}{\partial \xi_l}\frac{\partial \xi_l}{\partial \eta_k} + \left.\frac{\partial \xi_i}{\partial t}\right|_\eta$$
If it is the case that
$$\text{(1)} \qquad \frac{\partial \xi_i}{\partial \eta_j}J_{jk}\frac{\partial \xi_l}{\partial \eta_k} = k\,J_{il} \quad \text{for constant } k \neq 0 \qquad (46)$$
and (2) that there is a function $F(\xi_i, t)$ such that
$$\left.\frac{\partial \xi_i}{\partial t}\right|_\eta = J_{il}\frac{\partial F}{\partial \xi_l} \qquad (47)$$
or equivalently
$$\left.\frac{\partial F}{\partial \xi_l}\right|_t = -J_{li}\left.\frac{\partial \xi_i}{\partial t}\right|_\eta \qquad (48)$$
then we have
$$\dot\xi_i = J_{il}\frac{\partial}{\partial \xi_l}\left(kH_\xi + F\right) \qquad (49)$$
and the new Hamiltonian is
$$K\left(\xi_j, t\right) = kH_\xi\left(\xi_j, t\right) + F\left(\xi_j, t\right) \qquad (50)$$
If the constant $k$ is 1, then we have an ordinary canonical transformation, and otherwise what
some people call a "general canonical transformation".
In matrix notation, if we let
$$(X)_{ij} = \frac{\partial \xi_i}{\partial \eta_j} \qquad (51)$$
then we have
$$(X)_{ij}(J)_{jk}(X)_{lk} = k\,(J)_{il} \qquad (52)$$
or
$$XJ\tilde X = kJ \quad \text{(general canonical)}$$
$$XJ\tilde X = J \quad \text{(canonical, } k = 1\text{)}$$
Example:
$$P = p + mv, \qquad Q = q + vt$$
(a Galilean transformation; here $\xi = (P, Q)$ and $\eta = (p, q)$). Then
$$X = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \qquad (53)$$
so this is an ordinary canonical transformation (but time dependent) with $k = 1$ if we can
find an $F$ so that
$$\begin{pmatrix} \partial F/\partial P \\ \partial F/\partial Q \end{pmatrix} = -\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 \\ v \end{pmatrix} = \begin{pmatrix} v \\ 0 \end{pmatrix} \qquad (54)$$
so $F = Pv$ is sufficient. We then construct
$$K(P, Q, t) = H_{p,q}\left((P - mv),\, (Q - vt),\, t\right) + Pv + \text{const} \qquad (55)$$
For a free particle
$$K = \frac{(P - mv)^2}{2m} + Pv + \text{const} = \frac{P^2}{2m} \qquad (56)$$
(a symmetry). For the SHO
$$K = \frac{P^2}{2m} + \frac{1}{2}m\omega^2(Q - vt)^2 \qquad (57)$$
and the solution is
$$Q = a\cos(\omega t + \phi) + vt \qquad (58)$$
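A quick numeric check of the $XJ\tilde X$ condition for linear transformations (my own sketch): for $P = ap + bq$, $Q = cp + dq$, the product $XJX^T$ works out to $(ad - bc)\,J$, so the Galilean case ($X$ the identity) gives $k = 1$, while a rescaling gives a "general" canonical transformation:

```python
# X J X^T for a linear phase-space map P = a p + b q, Q = c p + d q.
import numpy as np

J = np.array([[0., -1.],
              [1.,  0.]])

def symplectic_product(a, b, c, d):
    """Rows of X are dP/d(p,q) and dQ/d(p,q); returns X J X^T."""
    X = np.array([[a, b], [c, d]], dtype=float)
    return X @ J @ X.T

print(symplectic_product(1, 0, 0, 1))   # identity (Galilean case): J, so k = 1
print(symplectic_product(2, 0, 0, 1))   # det = 2: gives 2*J, k = 2 (general canonical)
```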
We supplement the above with two facts that I will not prove:
(1) The sufficient conditions given above can be shown to be necessary also.
(2) Any general canonical transformation can always be compounded of a linear transformation
$$\xi_i = \omega_{ij}\eta_j \qquad (59)$$
where $\omega_{ij}$ is a suitably chosen symmetric matrix, followed by an ordinary canonical transformation.
Thus, all the interesting transformations are the ordinary ones. In summary:
$$\xi_i = \xi_i\left(\eta_j, t\right) \qquad (60)$$
$$XJ\tilde X = J \qquad (61)$$
$$(X)_{ij} = \frac{\partial \xi_i}{\partial \eta_j} \qquad (62)$$
$$K = H + F \qquad (63)$$
$$\left.\frac{\partial F}{\partial \xi_i}\right|_{t,\ \text{all other } \xi} = -J_{ij}\left.\frac{\partial \xi_j}{\partial t}\right|_\eta \quad \text{(be careful to note what is held fixed)} \qquad (64)$$
$$\dot\xi_i = J_{ij}\frac{\partial K}{\partial \xi_j} \qquad (65)$$
Note, if the transformation does not depend on time, take $F = 0$ so
$$K(\xi_i, t) = H\left(\eta_j(\xi_i),\, t\right) \qquad (66)$$
in other words, just the analog of what happens in Lagrangian theory.
1.5 The Hamilton-Jacobi Equation
The Hamilton-Jacobi equation is an important example of how new information about
mechanics can come out of the action by considering various kinds of variation of the trajectories.
From the definition of the action, it is easy to see that it can be considered to be an
ordinary function of the variables $q_i^{(2)}, t^{(2)}, q_i^{(1)}, t^{(1)}$ by using actual motions connecting these
points in time-configuration space:
$$S\left(q_i^{(2)}, t^{(2)};\, q_i^{(1)}, t^{(1)}\right) = \int_{t^{(1)}}^{t^{(2)}} L\left(q_i, \dot q_i, t\right)dt \qquad (67)$$
where $q_i(t), \dot q_i(t)$ are the actual motions satisfying Lagrange's equations with the boundary
conditions $q_i^{(2)} = q_i\left(t^{(2)}\right)$, $q_i^{(1)} = q_i\left(t^{(1)}\right)$.
As an example:
$$L = \frac{1}{2}m\dot x^2, \qquad H = \frac{p^2}{2m}, \qquad p = m\dot x$$
then
$$x(t) = x^{(1)} + \frac{x^{(2)} - x^{(1)}}{t^{(2)} - t^{(1)}}\left(t - t^{(1)}\right) \qquad (68)$$
So
$$S = \int_{t^{(1)}}^{t^{(2)}} \frac{1}{2}m\left(\frac{x^{(2)} - x^{(1)}}{t^{(2)} - t^{(1)}}\right)^2 dt = \frac{1}{2}m\frac{\left(x^{(2)} - x^{(1)}\right)^2}{t^{(2)} - t^{(1)}}$$
Now it is interesting to notice that
$$\frac{\partial S}{\partial x^{(2)}} = m\frac{x^{(2)} - x^{(1)}}{t^{(2)} - t^{(1)}} = p^{(2)}, \qquad \frac{\partial S}{\partial x^{(1)}} = -m\frac{x^{(2)} - x^{(1)}}{t^{(2)} - t^{(1)}} = -p^{(1)}$$
and also
$$\frac{\partial S}{\partial t^{(2)}} = -\frac{1}{2}m\left(\frac{x^{(2)} - x^{(1)}}{t^{(2)} - t^{(1)}}\right)^2 = -H^{(2)}, \qquad \frac{\partial S}{\partial t^{(1)}} = \frac{1}{2}m\left(\frac{x^{(2)} - x^{(1)}}{t^{(2)} - t^{(1)}}\right)^2 = H^{(1)}$$
(note that in the above the $H$'s could have been $L$'s in this simple, ambiguous example).
Let's work out the general case of a change in $S$ when we wiggle each of the $2N + 2$
variables $q_i^{(1)}, q_i^{(2)}, t^{(1)}, t^{(2)}$. Consider the following picture. Then
$$dS = \int_{t^{(1)} + dt^{(1)}}^{t^{(2)} + dt^{(2)}} L\left(q_i(t) + dq_i(t),\, \dot q_i + d\dot q_i,\, t\right)dt - \int_{t^{(1)}}^{t^{(2)}} L\,dt$$
$$= L^{(2)}dt^{(2)} - L^{(1)}dt^{(1)} + p_i^{(2)}dq_i\left(t^{(2)}\right) - p_i^{(1)}dq_i\left(t^{(1)}\right)$$
The term $p_i^{(2)}dq_i\left(t^{(2)}\right) - p_i^{(1)}dq_i\left(t^{(1)}\right)$ comes from the case when the ends $t^{(1,2)}$ are fixed and
so the $dq$'s are the changes at these times.
But $dq_i\left(t^{(1)}\right) \neq dq_i^{(1)}$ (see the picture). In fact,
$$dq_i^{(1)} = dq_i\left(t^{(1)}\right) + \dot q_i\left(t^{(1)}\right)dt^{(1)} \qquad (69)$$
and similarly for (2). So
$$dS = L^{(2)}dt^{(2)} + p_i^{(2)}\left(dq_i^{(2)} - \dot q_i\left(t^{(2)}\right)dt^{(2)}\right) - L^{(1)}dt^{(1)} - p_i^{(1)}\left(dq_i^{(1)} - \dot q_i^{(1)}dt^{(1)}\right)$$
$$= \left.\left(p_i\,dq_i - H\,dt\right)\right|_{t^{(1)}}^{t^{(2)}}$$
So we have, with a slight change of notation in which you can think of $(q_i, t)$ as the free
variables and $\left(q_i^{(0)}, t^{(0)}\right)$ as initial constants,
$$dS = p_i\,dq_i - H\,dt - \left(p_i^{(0)}dq_i^{(0)} - H^{(0)}dt^{(0)}\right) \qquad (70)$$
We can get the result
$$\frac{\partial S}{\partial q_i} = p_i, \qquad \frac{\partial S}{\partial t} = -H \qquad (71)$$
and
$$\frac{\partial S}{\partial q_i^{(0)}} = -p_i^{(0)}, \qquad \frac{\partial S}{\partial t^{(0)}} = H^{(0)} \qquad (72)$$
very easily. Think of $S$ as an indefinite integral (i.e. $t$ as a variable):
$$S = \int^t L\,dt = \int^t\left(-H + p_i\dot q_i\right)dt \qquad (73)$$
so
$$\frac{dS}{dt} = -H + p_i\dot q_i \qquad (74)$$
But $S = S(q_i, t)$, so
$$\frac{dS}{dt} = \frac{\partial S}{\partial q_i}\dot q_i + \frac{\partial S}{\partial t} = -H + p_i\dot q_i \qquad (75)$$
so
$$\frac{\partial S}{\partial t} = -H, \qquad \frac{\partial S}{\partial q_i} = p_i \qquad (76)$$
and we also have $\partial S/\partial q_i^{(0)} = -p_i^{(0)}$, $\partial S/\partial t^{(0)} = H^{(0)}$.
Hamilton was intrigued by these equations, and he noticed the following. Remember
that
$$H = H(p_i, q_i, t) \qquad (77)$$
and so
$$\frac{\partial S}{\partial t} + H\left(\frac{\partial S}{\partial q_i}, q_i, t\right) = 0 \qquad (78)$$
which is a non-linear partial differential equation for the function $S(q_i, t)$ (since it's non-linear,
sums of solutions are not necessarily solutions). Let's see how this works for the
simplest possible case:
$$H = \frac{p^2}{2m} \qquad (79)$$
so
$$\frac{\partial S}{\partial t} + \frac{1}{2m}\left(\frac{\partial S}{\partial q}\right)^2 = 0 \qquad (80)$$
$S$ for the free particle is $S = \frac{1}{2}m\frac{(x - x_0)^2}{t - t_0}$, so
$$\frac{\partial S}{\partial t} = -\frac{1}{2}m\left(\frac{x - x_0}{t - t_0}\right)^2, \qquad \frac{\partial S}{\partial x} = m\frac{x - x_0}{t - t_0}$$
so combining,
$$\frac{1}{2m}\left(\frac{\partial S}{\partial q}\right)^2 = \frac{1}{2}m\left(\frac{x - x_0}{t - t_0}\right)^2 \qquad (81)$$
and Hamilton's partial differential equation works.
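As a quick check (my own sketch, not in H&F), sympy verifies eq. (80) directly for this $S$:

```python
# S = m (x - x0)^2 / (2 (t - t0)) solves the free-particle
# Hamilton-Jacobi equation dS/dt + (1/2m)(dS/dx)^2 = 0.
import sympy as sp

m, x, x0, t, t0 = sp.symbols('m x x0 t t0')
S = m * (x - x0)**2 / (2 * (t - t0))

hj = sp.diff(S, t) + sp.diff(S, x)**2 / (2*m)
print(sp.simplify(hj))   # 0
```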
But what about the converse? Can you use Hamilton's PDE to calculate an $S$ and use it
to solve mechanics problems? The answer is yes, if you happen to find the "right" solution.
But PDEs have an infinity of solutions. For example, the above solution to the PDE can
be found by a familiar separation of variables procedure. Assume
$$S = X(x)\,T(t) \qquad (82)$$
so Hamilton's PDE becomes
$$XT' + \frac{1}{2m}\left(X'T\right)^2 = 0 \qquad (83)$$
where prime means differentiation with respect to the function's argument. Then divide by $XT^2$
(assumed non-zero) to get
$$\frac{T'}{T^2} + \frac{1}{2m}\frac{X'^2}{X} = 0 \qquad (84)$$
Since $x, t$ are independent and we have $f(t) + \frac{g(x)}{2m} = 0$, we must have
$$f(t) = \frac{T'}{T^2} = -\frac{k}{2m} = \text{const}, \qquad g(x) = \frac{X'^2}{X} = +k$$
so
$$2mT' = -kT^2, \qquad \frac{dT}{T^2} = -\frac{k}{2m}\,dt \qquad (85)$$
so
$$\frac{1}{T} = \frac{kt}{2m} + a, \qquad T = \frac{1}{\frac{k}{2m}(t - t_0)}$$
and
$$X' = \sqrt{kX}, \qquad \frac{dX}{\sqrt{X}} = \sqrt{k}\,dx, \qquad 2\sqrt{X} = \sqrt{k}\,x + b$$
or
$$X = \frac{k}{4}(x - x_0)^2 \qquad (86)$$
and
$$S = XT = \frac{\frac{k}{4}(x - x_0)^2}{\frac{k}{2m}(t - t_0)} = \frac{m}{2}\frac{(x - x_0)^2}{t - t_0} \qquad (87)$$
which is just the same as we got by direct integration.
But what if we had assumed
$$S = T + X \qquad (88)$$
so
$$T' + \frac{1}{2m}\left(X'\right)^2 = 0, \qquad \frac{1}{2m}\left(X'\right)^2 = k \;\rightarrow\; X = \sqrt{2mk}\,x + a \qquad (89)$$
so
$$T' = -k, \qquad T = -kt + b \qquad (90)$$
so
$$S = \sqrt{2mk}\,x - kt + \text{const} \qquad (91)$$
which is a whole lot different than the previous solution! Note that this function is in fact
an indefinite integral of the Lagrangian of a free particle with motion
$$x = vt + x_0 \qquad (92)$$
so
$$\dot x = v \qquad (93)$$
$$S = \int\frac{1}{2}m\dot x^2\,dt = \frac{1}{2}mv^2 t + \text{const} \qquad (94)$$
which is equal to $\sqrt{2mk}\,x - kt = \sqrt{2mk}\,vt - kt + \sqrt{2mk}\,x_0 = \frac{1}{2}mv^2 t + \text{const}$ if
$\frac{1}{2}mv^2 = \sqrt{2mk}\,v - k$ (i.e. $k = \frac{1}{2}mv^2$) and $\text{const} = \sqrt{2mk}\,x_0$.
This simple example raises the question: How do we figure out which solutions of Hamilton's
PDE are associated with actual motion, and how do you extract that motion from the
function $S$?
Hamilton did not manage to solve the problem, but Jacobi did, and so the PDE is now
known as the Hamilton-Jacobi equation. We can understand what Jacobi produced if we
remember that if we write
$$S\left(q_i, t;\, q_i^{(0)}, t^{(0)}\right) \qquad (95)$$
(think of $q_i^{(0)}, t^{(0)}$ as constants) we also get
$$\frac{\partial S}{\partial t^{(0)}} = H\left(p_i^{(0)}, q_i^{(0)}, t^{(0)}\right), \qquad \frac{\partial S}{\partial q_i^{(0)}} = -p_i^{(0)}$$
or, when we differentiate $S$ with respect to the constants $t^{(0)}, q_i^{(0)}$, we get other constants, $p_i^{(0)}, H^{(0)}$.
So Jacobi's rule for a system with $N$ degrees of freedom:
(1) Find any solution $S$ of the H-J equation which depends on $N + 1$ arbitrary algebraically
independent (see definition below) constants,
$$S = S(q_i, t; a_i) + A, \qquad i = 1, \ldots, N \qquad (96)$$
(one of the constants is always additive and ignorable).
(2) Obtain the EOM by setting
$$\frac{\partial S}{\partial a_i} = b_i \qquad (97)$$
where the $b_i$ are $N$ more constants, and solve for the
$$q_i = q_i\left(t; a_j, b_k\right), \qquad i, j, k = 1, \ldots, N \qquad (98)$$
By algebraic independence we mean that the $N \times N$ matrix
$$\frac{\partial^2 S}{\partial q_i\,\partial a_j} \qquad (99)$$
has a determinant that is not identically zero, so that the matrix "usually" has an inverse
(there may be singular points where there is no inverse, i.e. where the determinant is zero). So
if you get a solution with $a_k$, $k = 1, \ldots, N - 1$, you cannot fill out the set of constants by, for
example, splitting one of these into the sum of two new ones, so
$$a^{\text{new}} = \left(a_1, a_2, \ldots, a'_{N-1}, a''_{N-1}\right) \qquad (100)$$
where $a'_{N-1} + a''_{N-1} = a^{\text{old}}_{N-1}$. Clearly in this case two rows (if $j$ determines rows) of
$\frac{\partial^2 S}{\partial q_i\,\partial a_j^{\text{new}}}$ will be identical and the determinant is identically zero.
Check the theorem in two simple cases: $L = \frac{1}{2}m\dot q^2$, so
$$S = \frac{1}{2}m\frac{(x - x_0)^2}{t - t_0} \qquad (101)$$
Take $x_0$ as $a_1$; then
$$\frac{\partial S}{\partial a_1} = \frac{\partial S}{\partial x_0} = -m\frac{x - x_0}{t - t_0} = b_1 \qquad (102)$$
which gives a solution.
If we take $t_0$ as $a_1$, then
$$\frac{\partial S}{\partial a_1} = \frac{\partial S}{\partial t_0} = \frac{1}{2}m\left(\frac{x - x_0}{t - t_0}\right)^2 = b_1 \qquad (103)$$
and this also gives a solution.
For
$$S = \sqrt{2mk}\,x - kt + \text{const} \qquad (104)$$
take $k$ as $a_1$, so
$$\frac{\partial S}{\partial a_1} = \frac{\partial S}{\partial k} = \frac{1}{2}\sqrt{\frac{2m}{k}}\,x - t = b_1 \qquad (105)$$
or
$$x = 2\sqrt{\frac{k}{2m}}\,t + \text{const} \qquad (106)$$
again a solution.
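Jacobi's rule is mechanical enough to hand to sympy. The sketch below (my own check) applies step (2) to the separable solution (104) and recovers the uniform motion of eq. (106):

```python
# Jacobi's rule applied to S = sqrt(2 m k) x - k t:
# set dS/dk = b1 and solve for x(t).
import sympy as sp

m, k, x, t, b1 = sp.symbols('m k x t b1', positive=True)
S = sp.sqrt(2*m*k)*x - k*t

eom = sp.Eq(sp.diff(S, k), b1)     # dS/da1 = b1 with a1 = k
x_of_t = sp.solve(eom, x)[0]
print(sp.simplify(x_of_t))         # sqrt(2k/m) * (t + b1): uniform motion
```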
We can easily prove Jacobi's theorem:
$$\frac{d}{dt}\frac{\partial S}{\partial a_i} = 0 = \frac{\partial^2 S}{\partial a_i\,\partial q_j}\dot q_j + \frac{\partial^2 S}{\partial a_i\,\partial t} \qquad (107)$$
But
$$\frac{\partial S}{\partial t} = -H\left(\frac{\partial S}{\partial q_i}, q_i, t\right) \qquad (108)$$
and only $p_i = \partial S/\partial q_i$ depends on the $a$'s, so
$$\frac{\partial^2 S}{\partial a_i\,\partial t} = -\frac{\partial H}{\partial p_j}\frac{\partial^2 S}{\partial q_j\,\partial a_i} \qquad (109)$$
so we get
$$\frac{\partial^2 S}{\partial a_i\,\partial q_j}\left(\dot q_j - \frac{\partial H}{\partial p_j}\right) = 0 \qquad (110)$$
If the matrix $\frac{\partial^2 S}{\partial a_i\,\partial q_j}$ is invertible, then the only solution to these equations is $\dot q_j = \partial H/\partial p_j$,
half of Hamilton's equations.
Next, from
$$p_i = \frac{\partial S}{\partial q_i} \qquad (111)$$
we get
$$\dot p_i = \frac{d}{dt}\left(\frac{\partial S}{\partial q_i}\right) = \frac{\partial^2 S}{\partial q_i\,\partial q_j}\dot q_j + \frac{\partial^2 S}{\partial q_i\,\partial t} = \frac{\partial}{\partial q_i}\left(\frac{\partial S}{\partial q_j}\dot q_j + \frac{\partial S}{\partial t}\right) = \frac{\partial}{\partial q_i}\left(p_j\dot q_j - H\right)$$
and
$$\dot p_i = \frac{\partial L}{\partial q_i} \qquad (112)$$
which is just Lagrange's equation.
A very important special case is that in which the Hamiltonian is conserved (i.e. independent
of $t$); since it is often the energy, call the conserved value of $H$ the constant $E$.
Then $S$ must be linear in $t$:
$$S = -Et + W(q_i) \qquad (113)$$
satisfies the H-J equation if we have
$$-E + H\left(\frac{\partial W}{\partial q_i}, q_i\right) = 0 \qquad (114)$$
where $E$ is a constant. You can consider this another proof that $H = \text{const}$ if $\partial H/\partial t = 0$.
1.5.1 The Relationship of the H-J Equation to Quantum Mechanics
The H-J equation is an important step toward formulating a wave equation for particles. Write
a solution of the H-J equation for a free particle in 3-D as
$$S = \mathbf{p}\cdot\mathbf{x} - Et \qquad (115)$$
where the three components of $\mathbf{p}$ are the three constants. By the H-J equation
$$\frac{\partial S}{\partial t} + H\left(\frac{\partial S}{\partial x_i}, x_i\right) = -E + \frac{\mathbf{p}\cdot\mathbf{p}}{2m} = 0 \qquad (116)$$
we get the expected $E = \frac{p^2}{2m}$. We get the EOM from
$$\frac{\partial S}{\partial p_i} = x_i - \frac{\partial E}{\partial p_i}t = x_{i0} \;\;(\text{constant}) \qquad (117)$$
or
$$x_i = \frac{p_i}{m}t + x_{i0} \qquad (118)$$
However, if you look at $S$ you see that at a fixed moment in time, a fixed value of $S$ defines
a plane, and the vector $\mathbf{p}$ is normal to it. At a later time $t + \Delta t$ a plane of constant $S$ will
still have $\mathbf{p}$ as its normal, and it will be displaced along $\mathbf{p}$ by an amount $\frac{E}{|\mathbf{p}|}\Delta t$. If this is
not obvious, adopt the coordinate system so $\mathbf{p} = p\,\mathbf{e}_x$ and we have
$$S = px - \frac{p^2}{2m}t = p(x + \Delta x) - \frac{p^2}{2m}(t + \Delta t) \qquad (119)$$
so $\frac{\Delta x}{\Delta t}$ is the velocity at which the plane moves $= \frac{p}{2m}$. So we see that the particle trajectories
are normal to surfaces of constant $S$ (although the planes of constant $S$ travel at half the
speed of the particle). This may seem disappointing at first, until you notice that the group
velocity of the wave is $\frac{p}{m}$. The particle trajectories have the same geometric relation to the
planes of constant $S$ as the rays of optics do to planes of constant wave phase.
This suggests considering $S$ as a wave phase factor:
$$\psi = \psi_0\, e^{iS/\hbar} \qquad (120)$$
where $\hbar$ is a scale factor with the dimensions of $S$ = (energy $\times$ time) or (momentum $\times$ distance).
No prejudice about its value follows from classical mechanics, except that if in fact classical
physics is the ray approximation to some underlying wave mechanics, then $\hbar$ must be very
tiny, since it took almost 100 years from the time that Hamilton did the above for optics
before any wave character was experimentally detected for a material particle.
From
$$\psi = \psi_0\, e^{iS/\hbar} \quad \text{and} \quad S = \mathbf{p}\cdot\mathbf{x} - Et \qquad (121)$$
what equation does $\psi$ satisfy? We get
$$\nabla\psi = \frac{i}{\hbar}\psi\,\nabla S \qquad (122)$$
Take the divergence of both sides:
$$\nabla\cdot(\nabla\psi) = \nabla^2\psi = \frac{i}{\hbar}\nabla\psi\cdot\nabla S + \frac{i}{\hbar}\psi\,\nabla^2 S \qquad (123)$$
For the free particle $\frac{i}{\hbar}\psi\,\nabla^2 S = 0$, and
$$\nabla^2\psi = \frac{i}{\hbar}\nabla\psi\cdot\nabla S = \left(\frac{i}{\hbar}\right)^2\psi\,(\nabla S)^2 \qquad (124)$$
and
$$\frac{\partial\psi}{\partial t} = \frac{i}{\hbar}\psi\,\frac{\partial S}{\partial t} \qquad (125)$$
Using the H-J equation with $H = \frac{p^2}{2m}$,
$$\frac{\partial\psi}{\partial t} = -\frac{i}{\hbar}\frac{1}{2m}(\nabla S)^2\psi = -\frac{i}{\hbar}\frac{1}{2m}\left(-\hbar^2\nabla^2\psi\right) \qquad (126)$$
or
$$i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\psi \qquad (127)$$
A familiar equation!
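We can confirm the whole chain in one line of sympy (my own check, 1-D version): with $S = px - \frac{p^2}{2m}t$, the plane wave $\psi = e^{iS/\hbar}$ satisfies the free Schroedinger equation exactly:

```python
# psi = exp(i S / hbar) with S = p x - (p^2/2m) t satisfies
# i hbar dpsi/dt = -(hbar^2/2m) d^2psi/dx^2.
import sympy as sp

x, t, p, m, hbar = sp.symbols('x t p m hbar', positive=True)
S = p*x - p**2/(2*m)*t
psi = sp.exp(sp.I*S/hbar)

lhs = sp.I*hbar*sp.diff(psi, t)
rhs = -hbar**2/(2*m)*sp.diff(psi, x, 2)
print(sp.simplify(lhs - rhs))   # 0
```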
In fact the relationship between the wave equation and the H-J equation was well known
to Hamilton, his contemporaries, and his successors. In optics, the analog of the H-J equation
is called the eikonal equation. The $S$ function is the icon, or representative, of the wave
function. The function $e^{iS/\hbar}$ for suitable "$\hbar$" in the optics case gives an approximation to
the wave equation in certain cases. It is the phase of a wave that carries most of the really
wave-like character of a wave (think of interference), and so it is usually the most interesting
physically.
It is interesting to inquire under what conditions the solution $S$ to the H-J equation
makes a good approximation to the phase of the solution of the Schroedinger equation.
Without proof, if
$$-\frac{\hbar^2}{2m}\nabla^2\psi + V\psi = i\hbar\frac{\partial\psi}{\partial t} \qquad (128)$$
and we take
$$\psi = A(\mathbf{r})\,e^{iS(\mathbf{r},t)/\hbar} = A(\mathbf{r})\,e^{\frac{i}{\hbar}\left(W(\mathbf{r}) - Et\right)} \qquad (129)$$
then
$$\left[\frac{1}{2m}(\nabla S)^2 + V - E\right] = \frac{\hbar}{2m}\left[\hbar\frac{\nabla^2 A}{A} + 2i\,\frac{\nabla W\cdot\nabla A}{A} + i\,\nabla^2 W\right] \qquad (130)$$
The r.h.s. can be approximated by zero when $\hbar$ is very small compared to other quantities
with the same dimensions in the physical problem. Thus for macroscopic bodies this is the
case, since
$$\hbar \sim (\text{mass of an electron}) \times \text{cm/s} \times \text{cm} \qquad (131)$$
It is useful to associate a wavelength with $W(x)$; in 1-D,
$$\frac{1}{\hbar}W(x) = \frac{1}{\hbar}W(x_0) + \frac{1}{\hbar}(x - x_0)\left.\frac{dW}{dx}\right|_{x_0} + \cdots = \frac{1}{\hbar}W(x_0) + \frac{x - x_0}{\lambda_0}$$
where $\frac{1}{\lambda_0} = \frac{1}{\hbar}\frac{dW}{dx}$, so
$$e^{\frac{i}{\hbar}\left(W(x) - Et\right)} \cong e^{i\left(\frac{x - x_0}{\lambda_0} - \omega t\right)} \qquad (132)$$
where $\omega = E/\hbar$. Then the r.h.s. can be shown to be well approximated by zero if
$$\lambda_0\left|\frac{dV}{dx}\right| \ll \text{kinetic energy} \qquad (133)$$
i.e. the potential is slowly varying in space. In this case, the classical action $S$ yields a
reasonable approximation to the phase function of the quantum mechanical wave function.
This corresponds to the WKB limit, where the potential is essentially constant over many de
Broglie wavelengths.
1.6 Adiabatic Invariants
This section follows the derivation of Landau and Lifshitz (see pp. 154ff).
Adiabatic invariants are quantities that remain essentially constant in a system that is
not closed, but where some parameter varies slowly. We're going to consider systems that are
strictly periodic and conservative when they are "closed" (i.e. when we keep all the system
parameters constant), and consider what we can say about the system as we slowly vary one
parameter. Specific examples of such systems would be a pendulum where we slowly change
the length of the string (by slow we mean slow compared to the natural frequency), or a
mass on a spring with a slowly changing spring constant. When you shorten the string of
a pendulum you know $\omega$ increases, and $E$ changes, but can we construct some
combination of parameters that stays essentially constant?
Call the parameter we vary $\lambda$, and let it vary slowly (adiabatically) with time as the
result of some external action. If the period is $T$, we require
$$T\frac{d\lambda}{dt} \ll \lambda \qquad (134)$$
The energy $E$ changes slowly (if we average over the rapid oscillations of the system) with
time as $\lambda$ changes, and $\overline{dE/dt}$ is then some function of $\lambda$. This dependence can be expressed as
the constancy of some combination of $E$ and $\lambda$ (called an adiabatic invariant) which remains
constant during the motion of a system with slowly varying parameters.
We can write the Hamiltonian as $H(q, p; \lambda)$, and the rate of change of the energy is
$$\frac{dE}{dt} = \frac{\partial H}{\partial t} = \frac{\partial H}{\partial \lambda}\frac{\partial \lambda}{\partial t} \qquad (135)$$
Now $\frac{\partial H}{\partial \lambda}\frac{\partial \lambda}{\partial t}$ depends on the rapidly varying $q, p$ as well as on the slowly varying $\lambda$. We want
to average over the rapid (periodic) variations resulting from the oscillatory motion to isolate
the slow variations in $\lambda$:
$$\overline{\frac{dE}{dt}} = \frac{\partial \lambda}{\partial t}\,\overline{\frac{\partial H}{\partial \lambda}} \qquad (136)$$
where we average over the rapidly varying (oscillating) $H$; but during our averaging time
$\lambda$ remains essentially constant, so we can pull the $\partial\lambda/\partial t$ out of the averaging. Furthermore,
in averaging $H$ we consider $q, p$ to vary and $\lambda$ to be constant. We are essentially averaging
over the motion of the closed system (what would happen with $\lambda$ constant).
The average is
$$\overline{\frac{\partial H}{\partial \lambda}} = \frac{1}{T}\int_0^T \frac{\partial H}{\partial \lambda}\,dt \qquad (137)$$
From Hamilton's equations, $\dot q = \partial H/\partial p$, and we can change the integral over time to one over
$q$: $dt = dq\left(\frac{\partial H}{\partial p}\right)^{-1}$, so
$$T = \int_0^T dt = \oint dq\left(\frac{\partial H}{\partial p}\right)^{-1} \qquad (138)$$
where the integral over $q$ is taken over the complete range of variation of the coordinate
during one cycle.
So
dE
dt=@�
@t
@H
@�=@�
@t
Hdq�@H@�
� �@H@p
��1
Hdq�@H@p
��1(139)
Since we are taking the averages for constant �, H = E=const also over the integral, and
p is a de�ned function of q; E; �, p = p (q;E; �). So
H (q; p; �) = E (140)
anddH
d�= 0 =
@H
@�+
�@H
@p
��@p
@�
�(141)
@H@�@H@p
= �@p@�
(142)
If we substitute this into the expression for $\overline{dE/dt}$,
$$\overline{\frac{dE}{dt}} = -\frac{d\lambda}{dt}\,\frac{\oint\left(\frac{\partial p}{\partial \lambda}\right)dq}{\oint\left(\frac{\partial p}{\partial E}\right)dq} \qquad (143)$$
so
$$\oint\left(\frac{\partial p}{\partial E}\,\overline{\frac{dE}{dt}} + \frac{\partial p}{\partial \lambda}\,\frac{d\lambda}{dt}\right)dq = 0 \qquad (144)$$
or, exchanging the order of integration and differentiation,
$$\frac{d}{dt}\oint p\,dq = 0 \qquad (145)$$
If we define
$$I \equiv \frac{1}{2\pi}\oint p\,dq \qquad (146)$$
then
$$\frac{dI}{dt} = 0 \qquad (147)$$
Remember the integral is over a period with constant $E, \lambda$. So in this approximation $I$
remains constant when $\lambda$ varies.
$I$ is the adiabatic invariant we have been looking for, and it is a function of $E, \lambda$. If we
look at the partial derivative with respect to $E$,
$$2\pi\frac{\partial I}{\partial E} = \oint\frac{\partial p}{\partial E}\,dq = T = \frac{2\pi}{\omega} \qquad (148)$$
we see that the partial is related to the period.
The geometrical significance of $I$ is that it is related to the area of phase space enclosed
by the curve:
$$I = \frac{1}{2\pi}\oint p\,dq = \frac{1}{2\pi}\iint dp\,dq \qquad (149)$$
As a simple example, compute $I$ for a simple harmonic oscillator with natural frequency
$\omega$:
$$H = \frac{1}{2}\frac{p^2}{m} + \frac{1}{2}m\omega^2 q^2 \qquad (150)$$
Since $H = E = \text{const}$, the phase path is an ellipse with semi-axes $\sqrt{2mE}$ on the $p$ axis and
$\sqrt{2E/(m\omega^2)}$ on the $q$ axis, and the area divided by $2\pi$ is
$$I = E/\omega \qquad (151)$$
The significance of this is that when the parameters of the oscillator vary slowly, the energy
is proportional to the frequency.
Suppose the SHO starts with maximum amplitude $q_0$ and mass $m_0$ at $t = 0$. Assume its
mass is gradually ("adiabatically") increased and its spring constant $k$ is held fixed. What
is the amplitude $q$ when the mass is $m$?
$$E_0 = \frac{1}{2}kq_0^2, \qquad \omega_0 = \sqrt{\frac{k}{m_0}}$$
so
$$I = \frac{1}{2}kq_0^2\sqrt{\frac{m_0}{k}} = \frac{1}{2}kq^2\sqrt{\frac{m}{k}} \qquad (152)$$
or
$$q^2 = q_0^2\sqrt{\frac{m_0}{m}}, \qquad q = q_0\left(\frac{m_0}{m}\right)^{1/4}$$
In fact
$$q(t) \cong q_0\left(\frac{m_0}{m(t)}\right)^{1/4}\cos\left(\int_0^t\sqrt{\frac{k}{m(t')}}\,dt' + \phi_0\right) \qquad (153)$$
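This is easy to test numerically (my own sketch; the parameter values are arbitrary). Integrating $\dot q = p/m(t)$, $\dot p = -kq$ while the mass slowly triples, $I = E/\omega$ stays nearly constant even though $E$ and $\omega$ individually change:

```python
# Adiabatic invariance of I = E/omega for an SHO with slowly growing mass.
import numpy as np
from scipy.integrate import solve_ivp

k, m0, eps = 1.0, 1.0, 1e-3          # eps: slow fractional mass change per unit time

def mass(t):
    return m0 * (1.0 + eps * t)

def rhs(t, y):
    q, p = y
    return [p / mass(t), -k * q]     # qdot = dH/dp, pdot = -dH/dq

t_end = 2000.0                       # hundreds of periods; mass triples
sol = solve_ivp(rhs, (0.0, t_end), [1.0, 0.0],
                rtol=1e-10, atol=1e-12, dense_output=True)

def invariant(t):
    q, p = sol.sol(t)
    E = p**2 / (2 * mass(t)) + 0.5 * k * q**2
    return E / np.sqrt(k / mass(t))  # I = E/omega

I0, I1 = invariant(0.0), invariant(t_end)
print(I0, I1)                        # nearly equal
```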
Historically the adiabatic invariance of $\oint p\,dq$ was taken to be of great significance during
the early development of quantum mechanics (the "old" quantum mechanics). Remember
Planck's famous hypothesis that for an SHO
$$E_n = n\hbar\omega \qquad (154)$$
(where $n$ is an integer and $\hbar$ a constant) are the only allowed energies. As we see above,
$$\frac{2\pi E}{\omega} = \oint p\,dq = nh \;\;(\text{by postulate}) \qquad (155)$$
is an adiabatic invariant and so remains constant under any kind of variation slow compared
to the period of the oscillation. Think of $T = \frac{2\pi}{\omega}$ for optical light frequencies:
$$\nu = \frac{c}{\lambda} \qquad (156)$$
$$T = \frac{\lambda}{c} = \frac{500\times 10^{-9}\ \text{m}}{3\times 10^{8}\ \text{m/s}} \approx 2\times 10^{-15}\ \text{s} \qquad (157)$$
so fast macroscopic changes might occur in $10^{-9}$ s and still be adiabatic.
Ehrenfest adopted the ad hoc principle that quantum conditions should be applied to
adiabatic invariants, which Sommerfeld generalized to an "action variable", $\oint p\,dq$.
1.6.1 Adiabatic Invariant for a Charged Particle in a Magnetic Field
There is a very important adiabatic invariant used in many fields involving charged particles
moving in magnetic fields. It can be expressed in many ways, one of which is that the
magnetic moment of a charged particle circulating in a magnetic field that changes slowly in
time is invariant. This also applies if a particle drifts along, circulating around magnetic field
lines in an inhomogeneous magnetic field, so that the average motion is through a changing
magnetic field.
Let's derive this using Hamilton's methods and the definition of an adiabatic invariant.
For constant $\mathbf{B}$ we need $\mathbf{A}$. Although it is not unique,
$$\mathbf{A} = \frac{1}{2}\mathbf{B}\times\mathbf{r} \qquad (158)$$
works. Take $\mathbf{B} = B\,\mathbf{e}_z$ and use cylindrical coordinates:
$$\mathbf{v} = \frac{dz}{dt}\mathbf{e}_z + \left(\rho\dot\phi\right)\mathbf{e}_\phi + \dot\rho\,\mathbf{e}_\rho \qquad (159)$$
So
$$\mathbf{A} = \frac{1}{2}B\,\mathbf{e}_z\times\left(\rho\,\mathbf{e}_\rho + z\,\mathbf{e}_z\right) = \frac{1}{2}B\rho\,\mathbf{e}_\phi \qquad (160)$$
so $\mathbf{A}$ circulates around the $z$ axis and increases linearly in $\rho$.
The Lagrangian (refer to our previous discussion of $L$ for a charged particle in a field) is
$$L = \frac{1}{2}m\left(\dot\rho^2 + \left(\rho\dot\phi\right)^2 + \dot z^2\right) + \frac{e}{c}\frac{1}{2}B\rho^2\dot\phi \qquad (161)$$
The canonical momenta are
$$p_\rho = m\dot\rho, \qquad p_\phi = m\rho^2\dot\phi + \frac{1}{2}\frac{e}{c}B\rho^2, \qquad p_z = m\frac{dz}{dt}$$
$z$ is cyclic, $\phi$ is cyclic, and $\partial L/\partial t = 0$, so $p_z = \text{const}$, $p_\phi = \text{const}$, $H = E = \text{const}$. The
Hamiltonian is
$$H = \frac{p_\rho^2}{2m} + \frac{p_z^2}{2m} + \frac{\left(p_\phi - \frac{1}{2}\frac{e}{c}B\rho^2\right)^2}{2m\rho^2} = E \qquad (162)$$
Thus the motion, treated as a trajectory in phase space, moves on "the energy shell" defined
by
$$E - \frac{p_z^2}{2m} = E_\perp = \frac{p_\rho^2}{2m} + \frac{1}{2m}\left(\frac{p_\phi}{\rho} - \frac{1}{2}\frac{e}{c}B\rho\right)^2 = \text{const} \qquad (163)$$
This algebraic equation gives a closed curve in the $(\rho, p_\rho)$ plane that is quite simple in just
one case: the special case in which $p_\phi = 0$. It is the only case in which $\rho = 0$ is allowed, i.e.
in which the orbit passes through the $z$-axis.
For $p_\phi = 0$,
$$E_\perp = \frac{p_\rho^2}{2m} + \frac{1}{2m}\left(\frac{1}{2}\frac{e}{c}B\rho\right)^2 \qquad (\rho \geq 0) \qquad (164)$$
This is the equation for half an ellipse in the $(\rho, p_\rho)$ phase plane; in configuration space the
corresponding orbit is a circle passing through the $z$-axis.
We also have
$$p_\phi = 0 \;\rightarrow\; \dot\phi = -\frac{1}{2}\frac{eB}{mc} \qquad (165)$$
and
$$2\left|\dot\phi\right| = \omega_c = \frac{eB}{mc} \qquad (166)$$
In this simple case we get out the cyclotron frequency, and we also get the adiabatic invariant:
$$2\pi I = \frac{1}{2}\pi\,\rho_{\max}\,p_{\rho,\max} = \frac{1}{2}\pi\sqrt{2mE_\perp}\;\frac{\sqrt{2mE_\perp}}{\frac{1}{2}\omega_c m} = \frac{2\pi E_\perp}{\omega_c}$$
Since it is always possible to choose the origin of the coordinates so that the orbit
of the particle passes through the $z$-axis, the above result is in fact the general case for the
adiabatic invariant of a charged particle in a $B$-field. We can express it in various ways
(see HW):
a) the radius of a particle's orbit changes inversely as the square root of the magnetic field.
b) the amount of magnetic flux linking the particle's orbit is a constant.
c) the magnetic moment of the circulating charged particle is a constant.
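A numeric illustration of point c) (my own sketch, with units chosen so $e = c = m = 1$; the linear ramp for $B(t)$ is an arbitrary choice): as $B$ slowly doubles, the induced electric field $\mathbf{E} = -\frac{1}{2}\dot B\,(\mathbf{e}_z\times\mathbf{r})$ does work on the particle, but the magnetic moment $\mu = E_\perp/B$ stays nearly constant:

```python
# Adiabatic invariance of mu = E_perp/B for a gyrating charge (e = c = m = 1).
import numpy as np
from scipy.integrate import solve_ivp

B0, eps = 1.0, 1e-3

def B(t):    return B0 * (1.0 + eps * t)
def Bdot(t): return B0 * eps

def rhs(t, y):
    # Lorentz force plus the induced E field from the slowly ramping B.
    x, ypos, vx, vy = y
    ax = vy * B(t) + 0.5 * Bdot(t) * ypos
    ay = -vx * B(t) - 0.5 * Bdot(t) * x
    return [vx, vy, ax, ay]

# start on a gyro-orbit of radius 1 centered on the origin
sol = solve_ivp(rhs, (0.0, 1000.0), [0.0, 1.0, 1.0, 0.0],
                rtol=1e-10, atol=1e-12)

def mu(i):
    x, ypos, vx, vy = sol.y[:, i]
    return 0.5 * (vx**2 + vy**2) / B(sol.t[i])

print(mu(0), mu(-1))   # nearly equal even though B has doubled
```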
1.7 Action-Angle Variables
We'll now consider periodic systems where $\lambda$ is constant, so that the system is closed. We
want to perform a canonical transformation from the old $q, p$ to a conjugate pair where $I$ is
the new "momentum". The generating function that will get us there is
$$F = W(q, E; \lambda) = \int p(q, E; \lambda)\,dq \qquad (167)$$
taken for a given constant $E$ and $\lambda$. For a closed system, we can replace $E$ with $I$, since it
is a function of the energy, and we can write $W = W(q, I; \lambda)$ and
$$\left.\frac{\partial W}{\partial q}\right|_E = \left.\frac{\partial W}{\partial q}\right|_I \qquad (168)$$
and so
$$p = \frac{\partial W(q, I; \lambda)}{\partial q} \qquad (169)$$
(from the formulae for canonical transformations). We can use the other relation for canonical
transformations to get the "coordinate" variable $\theta$:
$$\theta = \frac{\partial W(q, I; \lambda)}{\partial I} \qquad (170)$$
So $I$ and $\theta$ are canonical variables, where $I$ is called the action variable and $\theta$ the angle
variable.
We are considering a conservative system with a time-independent Hamiltonian, and we
have therefore used a generating function not explicitly dependent on time. The new $H'$ is therefore
just $H$ expressed in terms of the new variables: $H'$ is just $E(I)$ expressed as a function of
the action variable. Hamilton's equations in the action-angle variables are
$$\frac{dI}{dt} = 0, \qquad \dot\theta = \frac{dE(I)}{dI} \qquad (171)$$
The first shows that $I$ is constant (as we knew). The second equation shows that $\theta$ is
linearly increasing with time:
$$\theta = \frac{dE}{dI}t + \text{const} = \omega(I)\,t + \text{const} \qquad (172)$$
and we equate it with the phase of the oscillations.
$W(q, I)$ is a many-valued function of the coordinates which increases each period by
$$\Delta W = 2\pi I \qquad (173)$$
We see this from the definition $W = \int p\,dq$ and $I = \frac{1}{2\pi}\oint p\,dq$.
If we express $q, p$ in terms of the action-angle variables, these must remain unchanged
when $\theta \rightarrow \theta + 2\pi$ (with $I$ constant). So $q, p$ are periodic functions of $\theta$ with period $2\pi$.
The action-angle variables may seem to be of very limited utility given the restrictions,
but they are important historically and also in numerous astronomical contexts. These
action-angle variables can also be used to formulate the EOM when the system is not closed
and $\lambda = \lambda(t)$. Then we still have $p = \frac{\partial W(q, I; \lambda)}{\partial q}$ and $\theta = \frac{\partial W(q, I; \lambda)}{\partial I}$, with
$$W(q, E; \lambda) = \int p(q, E; \lambda)\,dq \qquad (174)$$
In the same approximation we used to get the adiabatic invariant, we calculate $W(q, E; \lambda) = \int p(q, E; \lambda)\,dq$ and $I = \frac{1}{2\pi}\oint p\,dq$ taking $\lambda$ to have a fixed value, so that $W(q, E; \lambda)$ is the
same function it was before, but we then allow $\lambda$ to be $\lambda(t)$.
The generating function is now an explicit function of time, so we get $H'$ from
$$H' = E(I; \lambda) + \frac{\partial W}{\partial t} = E(I; \lambda) + \left.\frac{\partial W}{\partial \lambda}\right|_{q,I}\dot\lambda$$
where we express $\left.\frac{\partial W}{\partial \lambda}\right|_{q,I}$ in terms of $I$ and $\theta$ after differentiating with respect to $\lambda$.
Hamilton's equations are then
$$\frac{dI}{dt} = -\frac{\partial H'}{\partial \theta} = -\frac{\partial}{\partial \theta}\left(\left.\frac{\partial W}{\partial \lambda}\right|_{q,I}\right)\dot\lambda \qquad (175)$$
$$\dot\theta = \frac{\partial H'}{\partial I} = \omega(I; \lambda) + \left.\frac{\partial}{\partial I}\left(\left.\frac{\partial W}{\partial \lambda}\right|_{q,I}\right)\right|_{\theta,\lambda}\dot\lambda \qquad (176)$$
where $\omega = \left(\frac{\partial E}{\partial I}\right)_\lambda$ is the oscillation frequency calculated as if $\lambda$ were constant.