Notes on “A first course in General Relativity” (Schutz, 2009) Robert B. Scott, 1,2* 1 Institute for Geophysics, Jackson School of Geosciences, The University of Texas at Austin, Austin, Texas, USA 2 Currently at the National Oceanography Centre, University of Southampton, Southampton, UK *To whom correspondence should be addressed; E-mail: [email protected]. April 17, 2010

Notes on “A first course in General Relativity”(Schutz, 2009)

Robert B. Scott,1,2∗

1 Institute for Geophysics, Jackson School of Geosciences, The University of Texas at Austin, Austin, Texas, USA

2 Currently at the National Oceanography Centre, University of Southampton, Southampton, UK

∗To whom correspondence should be addressed; E-mail: [email protected].

April 17, 2010


Chapter 1

Special Relativity

1.1 Fundamental principles of special relativity (SR) theory

In footnote 4 on p. 3, I believe the answer to the first question is “no, the soup is unaffected by an acceleration experienced by an astronaut in orbit.” This would appear to also cause problems for SR, since how do we know that an observer is in an inertial frame? The acceleration cannot, necessarily, be measured locally. And there’s no special reference frame with which to measure one’s acceleration.

1.2 Definition of an inertial observer in SR

Gives a “geometrical” definition of an inertial reference frame, or coordinate system.

Notes that gravity makes it impossible to construct such an inertial coordinate system.

1.3 New units

Introduces what Misner et al. (1973) called “geometric units”, wherein time is measured in distance of light travel.

They claim the motivation is that $c = 3 \times 10^{8}$ m/s in SI, a “ridiculous value”. I disagree, since then 1/3 s becomes the ridiculously large $10^{5}$ km! Perhaps more useful motivation comes from velocity becoming a dimensionless parameter, space-time diagrams having the same units on all axes, and the world lines of light paths having unit slope.

1.4 Spacetime diagrams

Fig. 1.4: $\mathbf{v}$ is of course a vector, so one should replace this with $v = |\mathbf{v}|$.

1.5 Construction of the coordinates used by another observer

This is an extremely important section. Unfortunately he doesn’t explain why the angle of the $\bar{x}$-axis to the $x$-axis is $\phi = \arctan(v)$, where $v = |\mathbf{v}|$ is the magnitude of the velocity of $\bar{O}$ along the $x$-axis. Rather this result appears in Fig. 1.5 without explanation, not even relegated to an “exercise for the student”. The result does follow from the construction of the $\bar{x}$-axis, but the steps involved are not trivial. Below is my attempt at a proof.

Call the unknown angle between the $\bar{x}$-axis and the $x$-axis $\alpha$. Extend the line from $P$ to $R$ all the way to the $t$-axis, and call this intersection $Q$. Draw two lines parallel to the $x$-axis: one through $R$, crossing the $t$-axis at a point we call $U$, and the other through $P$, crossing the $t$-axis at a point we call $T$. The events $\xi$, $P$, $R$ form a right triangle, with hypotenuse $\xi R = 2a$. We need the angle $R\xi P$, which turns out to be $\chi = \pi/4 - \phi$. (Call angle $ORQ$ $\gamma$. Then $\phi + \gamma + \pi/4 = \pi$, and angle $\xi R P$ is $\pi - \gamma = \pi/4 + \phi$. It follows that $\chi = \pi/4 - \phi$.) So now we can compute the length $RP = 2a\sin(\chi) = a\sqrt{2}\,(\cos\phi - \sin\phi)$. Also $UR = a\sin\phi$, so $QR = UR/\sin(\pi/4) = \sqrt{2}\,a\sin\phi$. Summing the two lengths, $QP = QR + RP = a\sqrt{2}\cos\phi$. We’re now after $OT = OQ - TQ$. But $OQ = OU + UQ$, with $UQ = UR = a\sin\phi$ and $OU = a\cos\phi$. Also, $TQ = QP\cos(\pi/4) = a\cos\phi$. So $OT = a(\sin\phi + \cos\phi) - a\cos\phi = a\sin\phi$. Note that the sought-after angle $\alpha$ satisfies $\tan\alpha = OT/TP = OT/TQ = a\sin\phi/(a\cos\phi)$, so $\alpha = \phi$, the desired result.


1.6 Invariance of the interval

This section purports to provide a proof of the invariance of the interval. But it assumes that the relationship between coordinates in different frames is linear, see discussion before Eqn. 1.2. But that’s not a general proof then! What if they were not linear, e.g.

\[ \Delta\bar{x}^{\alpha} = (M_0)^{\alpha}{}_{\beta}\,\Delta x^{\beta} + (M_1)^{\alpha}{}_{\beta}\,(\Delta x^{\beta})^2 \]

Then $M_1$ would have units of 1/length.

Eqn. 1.2 is a bit sneaky! He refers to “the numbers $\Delta\bar{t}, \Delta\bar{x}, \Delta\bar{y}, \Delta\bar{z}$ as linear combinations of their unbarred counterparts.” One would expect then, that

\[ \Delta\bar{x}^{\alpha} = A^{\alpha}{}_{\beta}\,\Delta x^{\beta} \]

where the coefficients of the matrix $A^{\alpha}{}_{\beta}$ give the linear transformation. And then the interval would be written

\[ \Delta\bar{s}^2 = \eta_{\alpha\beta}\,\Delta\bar{x}^{\alpha}\Delta\bar{x}^{\beta} \]

where $\eta_{\alpha\beta}$ is the metric tensor. And so,

\[ \Delta\bar{s}^2 = \eta_{\alpha\beta}\,A^{\alpha}{}_{\gamma}\Delta x^{\gamma}\,A^{\beta}{}_{\mu}\Delta x^{\mu} \]

But instead he wrote down his Eqn. 1.2, which is much less general. I claim he’s being sneaky because he sounds like he’s starting very general, talking about linear combinations, yet he sneaks in a much more restricted relationship!

He reduces the relationship between the interval in one frame and another to a function of the relative velocities of their origins, see Eqn. 1.5 on p. 10,

\[ \Delta\bar{s}^2 = -M_{00}\,\Delta s^2 = \phi(\mathbf{v})\,\Delta s^2. \]

To show that $\phi(\mathbf{v})$ depends only upon the magnitude of $\mathbf{v}$, not its direction, he considers the case of a rod (or two events A and B at the ends of the rod) lying along the $y$-axis. A and B are simultaneous in $O$, and he argues that they are therefore also simultaneous in $\bar{O}$, by constructing the $\bar{y}$-axis as he did in Fig. 1.3. But now the velocity of $\bar{O}$ is orthogonal to the constructed axis, so of course the simultaneity of events is not changed by the coordinate transform. The intermediate result is that the space-time interval between A and B in either frame is just the square of the length, so their ratio is the sought-after $\phi(\mathbf{v})$. Now the subtle point is that he then claims that this ratio cannot depend on the direction of the velocity, because the rod is perpendicular to it and there are no preferred directions?! So what??

I think the solution is that $\mathbf{v}$ could be in an arbitrary direction in the $x$-$z$ plane. The ratio of lengths should not depend upon this direction, because then there would be preferred directions. But as far as I can tell, this only shows that the direction of the component orthogonal to the $y$-axis cannot influence $\phi(\mathbf{v})$.

1.7 Invariant hyperbolae

At the end of the section it’s stated that

“The lesson of Fig. 1.12b is that tangent to a hyperbola at any event P is line of simultaneity of the Lorentz frame whose time axis joins P to the origin. If this frame has velocity v, the tangent has slope v.”

The above is stated without proof or even a hint that there’s some calculation involved. Fortunately it proceeds straightforwardly. We seek the slope of the tangent to a hyperbola. Differentiate any time-like hyperbola, $-t^2 + x^2 = -a^2$, wrt $x$, to obtain in general

\[ \frac{dt}{dx} = \frac{x}{t}. \]

At some point $P$ the slope of the tangent wrt the $x$-axis is $x_P/t_P$. Now if the $\bar{t}$-axis is chosen to go through the origin and $P$, its slope wrt the $t$-axis will also be $x_P/t_P$, corresponding to $\tan(\phi) = v = x_P/t_P$. But we know from Fig. 1.5 that the corresponding $\bar{x}$-axis will have slope $v$ relative to the $x$-axis. That is, the tangent is parallel to the $\bar{x}$-axis, and is therefore a line of simultaneity for $\bar{O}$. QED.

1.8 Particularly important results

Time dilation. This was straightforward once one uses the invariant hyperbolae. The event $B$ was constructed so that it had $\bar{t} = 1$. The corresponding event in $O$ is obtained by tracing the point back to the $t$-axis along the hyperbola with the same interval, $\Delta s^2 = -1$,

\[ -t^2 + x^2 = -1 \]

One must also note that the equation for the $\bar{t}$-axis is $t = x/v$. Substituting this into the hyperbola,

\[ -t_B^2 + x_B^2 = -1 \tag{1.1} \]

\[ -t_B^2 + (t_B v)^2 = -1 \tag{1.2} \]

\[ t_B = \frac{1}{\sqrt{1 - v^2}} \tag{1.3} \]

This gives Eqn. 1.8.

Lorentz contraction. I still don’t see how he came up with

\[ x_C = \frac{l}{\sqrt{1 - v^2}} \]

But I obtain the same end result using instead the invariance of the interval, which in $\bar{O}$ is

\[ \Delta\bar{s}^2_{AC} = -\Delta\bar{t}^2 + \Delta\bar{x}^2 = -\bar{t}^2 + \bar{x}^2 = 0 + l^2 = l^2, \]

and therefore must also be $l^2$ in $O$. I also used the equation for the $\bar{x}$-axis, $t = vx$. This was confusing at first since the units look wrong! But it’s clear when you go back to Fig. 1.5 and note that $\tan(\phi) = v$, which was obvious for the $\bar{t}$-axis since the observer $\bar{O}$ is moving along the $x$-axis at speed $v$. That the $\bar{x}$-axis was also inclined at the same angle $\phi$ was more complicated. One also needs a third relation, which is simply $x_C - x_B = v\,t_C$. A little algebra gives the Lorentz contraction:

\[ x_B = l\sqrt{1 - v^2}. \]

1.9 Lorentz transformation

The first step, substituting the equations for the $\bar{O}$ axes, proceeds immediately to

\[ \bar{t} = \alpha(t - v x) \tag{1.4} \]

\[ \bar{x} = \sigma(x - v t) \tag{1.5} \]

I had trouble seeing how $\alpha = \sigma$ from the path of a light ray, so I used the invariant hyperbolae instead. Substituting (1.4) and (1.5) into the equation for the interval from the origin gives

\[ \Delta\bar{s}^2 = -\bar{t}^2 + \bar{x}^2 = \Delta s^2 = -t^2 + x^2 \]

After substitution, the cross term involving $x\,t$ must be zero, giving that $\alpha^2 = \sigma^2$. Equating either of the other terms gives the Lorentz factor,

\[ \alpha^2 = \frac{1}{1 - v^2}, \qquad \text{i.e.}\quad \alpha = \frac{1}{\sqrt{1 - v^2}}. \]

As Schutz (2009, p. 22) points out, the positive root is selected so that the coordinates are not inverted when $v = 0$.

In retrospect, it is clear how the path of a light ray gives $\alpha = \sigma$. Simply note that the world line of a light ray has $\Delta x = \pm\Delta t$ and $\Delta\bar{x} = \pm\Delta\bar{t}$. Substitution into (1.4) and (1.5) gives

\[ \frac{\sigma}{\alpha}\left(\frac{\Delta x - v\,\Delta t}{\Delta t - v\,\Delta x}\right) = \frac{\Delta\bar{x}}{\Delta\bar{t}} = 1. \]

So $\sigma = \alpha$.

The Lorentz transformation is often said to reduce to the Galilean transformation in the limit $v \ll 1$, but that’s not strictly true. Unlike for the Galilean transformation, in the Lorentz transformation time is affected at large distances even for small velocities.
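The point about large distances can be illustrated numerically. The following is a minimal sketch, assuming geometric units ($c = 1$) and arbitrarily chosen illustrative values for $v$, $t$ and $x$; the function names are mine, not from Schutz.

```python
import math

def lorentz(t, x, v):
    """Boost the event (t, x) to a frame moving at speed v (units with c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (t - v * x), gamma * (x - v * t)

def galilean(t, x, v):
    """Galilean transformation: time is the same in both frames."""
    return t, x - v * t

# Illustrative values (assumptions): a small velocity and a distant event.
v = 1e-4
t, x = 1.0, 1e6

t_lor, x_lor = lorentz(t, x, v)
t_gal, x_gal = galilean(t, x, v)

# The time difference is close to -v*x, which is large here even though v << 1.
print("time difference:", t_lor - t_gal)
# The spatial coordinates agree to a relative accuracy of order v**2.
print("relative x difference:", abs(x_lor - x_gal) / x)
```

The $-v x$ term in (1.4) is first order in $v$, so it survives the small-velocity limit whenever $x$ is large enough.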

1.10 Velocity composition law

1.11 Paradoxes and physical intuition

1.12 Further reading

A more thoughtful look at fundamentals, Bohm (2008).


1.13 Appendix: the twin paradox dissected

1.14 Exercises

1.14.1 Convert to geometric units

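a)

\[ 10\ \mathrm{J} = 10\ \mathrm{N\,m} = 10\ \mathrm{kg\,m^2/s^2} = \frac{10}{9\times10^{16}}\ \mathrm{kg} = 1.1\times10^{-16}\ \mathrm{kg}. \]

b)

\[ 100\ \mathrm{W} = 100\ \mathrm{J/s} = 1.1\times10^{-15}\ \mathrm{kg/s} = \frac{1.1\times10^{-15}}{3\times10^{8}}\ \mathrm{kg/m} = 3.7\times10^{-24}\ \mathrm{kg/m} \]

c)

\[ \hbar = 1.05\times10^{-34}\ \mathrm{J\,s} = \frac{1.05\times10^{-34}\ \mathrm{J\,s}}{3\times10^{8}\ \mathrm{m/s}} = 3.5\times10^{-43}\ \mathrm{kg\,m} \]

d) Car velocity [108 km/hr]

\[ v = 30\ \mathrm{m/s} = 10^{-7} \]

e) Car momentum

\[ p = 30\ \mathrm{m/s}\times1000\ \mathrm{kg} = 10^{-4}\ \mathrm{kg} \]

f) Atmospheric pressure,

\[ 1\ \mathrm{bar} = 10^{5}\ \mathrm{N\,m^{-2}} = \frac{10^{5}\ \mathrm{kg\,m^{-1}\,s^{-2}}}{9\times10^{16}\ \mathrm{m^{2}\,s^{-2}}} = 1.1\times10^{-12}\ \mathrm{kg\,m^{-3}} \]

g) Water density

\[ 10^{3}\ \mathrm{kg\,m^{-3}} \]

h) Luminosity flux

\[ 10^{6}\ \mathrm{J\,s^{-1}\,cm^{-2}} = 10^{10}\ \mathrm{J\,s^{-1}\,m^{-2}} = \frac{10^{10}\ \mathrm{kg\,s^{-3}}}{3^{3}\times10^{24}\ \mathrm{m^{3}\,s^{-3}}} \approx 4\times10^{-16}\ \mathrm{kg\,m^{-3}} \]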
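These conversions all follow one rule: each factor of seconds in the SI unit is replaced by $c$ metres. A minimal sketch of that rule, assuming the rounded value $c = 3\times10^{8}$ m/s used in these notes; the helper name `to_geometric` is hypothetical.

```python
C = 3.0e8  # speed of light in m/s, rounded as in the notes (assumption)

def to_geometric(value_si, seconds_power):
    """Convert an SI value to geometric units.

    seconds_power is the exponent of seconds in the SI unit, e.g. -2 for
    joules (kg m^2 s^-2). Each s becomes C metres, so the value is
    multiplied by C**seconds_power.
    """
    return value_si * C ** seconds_power

# (label, SI value, power of seconds in the SI unit)
examples = [
    ("a) 10 J -> kg", 10.0, -2),
    ("b) 100 W -> kg/m", 100.0, -3),
    ("c) hbar -> kg m", 1.05e-34, -1),
    ("d) 30 m/s -> dimensionless", 30.0, -1),
    ("e) 3e4 kg m/s -> kg", 3.0e4, -1),
    ("f) 1 bar -> kg/m^3", 1.0e5, -2),
    ("g) 1e3 kg/m^3 -> kg/m^3", 1.0e3, 0),
    ("h) 1e10 J/(s m^2) -> kg/m^3", 1.0e10, -3),
]

for label, value, power in examples:
    print(f"{label}: {to_geometric(value, power):.2e}")
```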


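1.14.6 Show that Eq. (1.2) contains only $M_{\alpha\beta} + M_{\beta\alpha}$ when $\alpha \neq \beta$, not $M_{\alpha\beta}$ and $M_{\beta\alpha}$ independently. Argue that this allows us to set $M_{\alpha\beta} = M_{\beta\alpha}$ without loss of generality.

\[ \Delta\bar{s}^2 = \sum_{\alpha=0}^{3}\sum_{\beta=0}^{3} M_{\alpha\beta}\,(\Delta x^{\alpha})(\Delta x^{\beta}) \]

Pick a pair of indices, $\alpha = \alpha'$ and $\beta = \beta'$ say, where $\alpha' \neq \beta'$, and $\alpha' \in \{0 \ldots 3\}$ and $\beta' \in \{0 \ldots 3\}$. So $\Delta\bar{s}^2$ contains a term like,

\[ M_{\alpha'\beta'}\,(\Delta x^{\alpha'})(\Delta x^{\beta'}). \]

But $\Delta\bar{s}^2$ also contains a term like,

\[ M_{\beta'\alpha'}\,(\Delta x^{\beta'})(\Delta x^{\alpha'}) = M_{\beta'\alpha'}\,(\Delta x^{\alpha'})(\Delta x^{\beta'}). \]

The equality follows because of course the product does not depend upon the order of the factors. So we can group these two terms and factor out the $(\Delta x^{\alpha'})(\Delta x^{\beta'})$, leaving,

\[ (\Delta x^{\alpha'})(\Delta x^{\beta'})\,(M_{\alpha'\beta'} + M_{\beta'\alpha'}) \]

Because the off-diagonal terms always appear in pairs as above, we could without changing the interval (and therefore without loss of generality) replace them with their mean value

\[ \tilde{M}_{\alpha\beta} \equiv (M_{\alpha\beta} + M_{\beta\alpha})/2 \]

Thus the new tensor $\tilde{M}_{\alpha\beta}$ is by construction symmetric.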

1.14.7 In the discussion leading up to Eq. (1.2), assume that the coordinates of $\bar{O}$ are given as the following linear combinations of those of $O$:

\[ \bar{t} = \alpha t + \beta x, \tag{1.6} \]

\[ \bar{x} = \mu t + \upsilon x, \tag{1.7} \]

\[ \bar{y} = a y, \tag{1.8} \]

\[ \bar{z} = b z, \tag{1.9} \]

where $\alpha, \beta, \mu, \upsilon, a$, and $b$ may be functions of the velocity $\mathbf{v}$ of $\bar{O}$ relative to $O$, but they do not depend on the coordinates. Find the numbers $\{M_{\alpha\beta}, \alpha, \beta = 0, \ldots, 3\}$ of Eq. (1.2) in terms of $\alpha, \beta, \mu, \upsilon, a$, and $b$.

First note that the origins of the two coordinate systems line up, and that $\Delta t = t$ etc. Then the result follows from straightforward substitution of (1.6) to (1.9) into Eq. (1.1)

\[ \Delta\bar{s}^2 = -\Delta\bar{t}^2 + \Delta\bar{x}^2 + \Delta\bar{y}^2 + \Delta\bar{z}^2 \tag{1.10} \]

\[ = -(\alpha\Delta t + \beta\Delta x)^2 + (\mu\Delta t + \upsilon\Delta x)^2 + (a\Delta y)^2 + (b\Delta z)^2 \tag{1.11} \]

Grouping terms we find that $(-\alpha^2 + \mu^2)$ multiplies $\Delta t^2$, so $M_{00} = -\alpha^2 + \mu^2$. Similarly, the term multiplying $\Delta x^2$ is $M_{11} = -\beta^2 + \upsilon^2$. The cross terms give $M_{01} = M_{10} = -\alpha\beta + \mu\upsilon$, and the remaining diagonal terms are $M_{22} = a^2$, $M_{33} = b^2$.
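The coefficients found above can be spot-checked numerically. This is a minimal sketch, assuming arbitrary illustrative values for the transformation coefficients and displacements; the function names are mine.

```python
def interval_direct(d, coef):
    """Barred interval from direct substitution of (1.6)-(1.9)."""
    dt, dx, dy, dz = d
    al, be, mu, up, a, b = coef
    dt_bar = al * dt + be * dx
    dx_bar = mu * dt + up * dx
    return -dt_bar**2 + dx_bar**2 + (a * dy)**2 + (b * dz)**2

def interval_from_M(d, coef):
    """Barred interval from the quadratic form M_{alpha beta} dx^alpha dx^beta."""
    dt, dx, dy, dz = d
    al, be, mu, up, a, b = coef
    M00 = -al**2 + mu**2
    M11 = -be**2 + up**2
    M01 = -al * be + mu * up   # equals M10, so the cross term appears twice
    M22, M33 = a**2, b**2
    return (M00 * dt**2 + 2 * M01 * dt * dx + M11 * dx**2
            + M22 * dy**2 + M33 * dz**2)

# Illustrative values (assumptions, not from the book).
coef = (1.3, -0.4, 0.7, 2.1, 0.9, 1.5)
d = (0.8, -1.2, 0.5, 2.0)
print(interval_direct(d, coef), interval_from_M(d, coef))
```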

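1.14.8 a) Derive Eq. (1.3) from Eq. (1.2) for general $M_{\alpha\beta}$.

Start with Eq. (1.2)

\[ \Delta\bar{s}^2 = M_{\alpha\beta}\,\Delta x^{\alpha}\Delta x^{\beta}. \]

Expanding the sum,

\[ \Delta\bar{s}^2 = M_{00}\,\Delta t^2 + M_{0i}\,\Delta x^{i}\Delta t + M_{i0}\,\Delta x^{i}\Delta t + M_{ij}\,\Delta x^{i}\Delta x^{j} \]

Note that $M_{i0} = M_{0i}$ (problem 6). Consider the case $\Delta s^2 = 0$, so from Eq. (1.1), $\Delta t = \Delta r = \sqrt{\Delta x^2 + \Delta y^2 + \Delta z^2}$. Then,

\[ \Delta\bar{s}^2 = M_{00}\,\Delta r^2 + 2M_{0i}\,\Delta x^{i}\Delta r + M_{ij}\,\Delta x^{i}\Delta x^{j} \]

which is Eq. (1.3).

b) Since $\Delta\bar{s}^2 = 0$ in Eq. (1.3) for any $\{\Delta x^{i}\}$, replace $\Delta x^{i}$ by $-\Delta x^{i}$ in Eq. (1.3) and subtract the resulting equations from Eq. (1.3) to establish that $M_{0i} = 0$ for $i = 1, 2, 3$.

We have set $\Delta s^2 = 0$ and it followed, based upon the universality of the speed of light, that $\Delta\bar{s}^2 = 0$. Note that changing $\Delta x^{i}$ to $-\Delta x^{i}$ does not change $\Delta r$ nor $\Delta s$. So that’s why $\Delta\bar{s}^2 = 0$ in Eq. (1.3).

The only term in Eq. (1.3) to change sign when changing $\Delta x^{i}$ to $-\Delta x^{i}$ is the $2M_{0i}\,\Delta x^{i}\Delta r$ term. The final term doesn’t because changing $\Delta x^{i}$ to $-\Delta x^{i}$ also changes $\Delta x^{j}$ to $-\Delta x^{j}$; the $i$ is just a dummy index. So when we subtract from Eq. (1.3) the following

\[ \Delta\bar{s}^2 = M_{00}\,\Delta r^2 - 2M_{0i}\,\Delta x^{i}\Delta r + M_{ij}\,\Delta x^{i}\Delta x^{j} \]

we’re left with

\[ 0 = 4M_{0i}\,\Delta x^{i}\Delta r. \]

This must be true for arbitrary $\Delta x^{i}$ so $M_{0i} = 0$. QED.

c) Derive Eq. (1.4)b

Required to show:

\[ M_{ij} = -M_{00}\,\delta_{ij}, \quad (i, j = 1, 2, 3). \]

Adding to Eq. (1.3) the following

\[ 0 = \Delta\bar{s}^2 = M_{00}\,\Delta r^2 - 2M_{0i}\,\Delta x^{i}\Delta r + M_{ij}\,\Delta x^{i}\Delta x^{j} \]

gives, after dividing by 2,

\[ 0 = M_{00}\,\Delta r^2 + M_{ij}\,\Delta x^{i}\Delta x^{j} \tag{1.12} \]

Suppose $\Delta x = \Delta r$, $\Delta y = \Delta z = 0$. Substituting into (1.12) then gives $M_{00} = -M_{11}$. Or, when $\Delta y = \Delta r$, $\Delta x = \Delta z = 0$, we see that $M_{00} = -M_{22}$. Similarly, $M_{00} = -M_{33}$. To see that the off-diagonal terms are zero, note that it’s also possible that $\Delta x = \Delta y = \Delta r/\sqrt{2}$ and $\Delta z = 0$. Substitution into (1.12), together with the diagonal results just found, gives that

\[ 0 = (M_{12} + M_{21})\,\Delta r^2/2 = M_{12}\,\Delta r^2 \]

so $M_{12} = 0$. Similarly, $M_{13} = 0 = M_{23}$. In summary,

\[ M_{ij} = -M_{00}\,\delta_{ij}, \quad (i, j = 1, 2, 3), \]

which is Eq. (1.4)b. QED.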

1.14.18 a) Show that velocity parameters add linearly, b) apply to a specific problem

Define the velocity parameter $W$ through $w = \tanh(W)$. Want to show the velocity addition law,

\[ w' = \frac{u + w}{1 + wu} \]

implies linear addition of velocity parameters. Simply substitute the definition of velocity parameter,

\[ w' = \frac{\tanh(U) + \tanh(W)}{1 + \tanh(U)\tanh(W)} \tag{1.13} \]

\[ = \frac{(\tanh(U) + \tanh(W))\cosh(W)\cosh(U)}{\cosh(W)\cosh(U) + \sinh(U)\sinh(W)} \tag{1.14} \]

The numerator can be written as,

\[ N = \sinh(W)\cosh(U) + \cosh(W)\sinh(U) \]

so that

\[ w' = \frac{\sinh(W)\cosh(U) + \cosh(W)\sinh(U)}{\cosh(W)\cosh(U) + \sinh(U)\sinh(W)} \]

The following identities are useful:

\[ \cosh(a)\cosh(b) = \left(\frac{e^{a} + e^{-a}}{2}\right)\left(\frac{e^{b} + e^{-b}}{2}\right) = \frac{e^{a+b} + e^{-(a+b)}}{4} + \frac{e^{a-b} + e^{-(a-b)}}{4} = \frac{\cosh(a+b)}{2} + \frac{\cosh(a-b)}{2} \tag{1.15} \]

\[ \sinh(a)\sinh(b) = \left(\frac{e^{a} - e^{-a}}{2}\right)\left(\frac{e^{b} - e^{-b}}{2}\right) = \frac{e^{a+b} + e^{-(a+b)}}{4} - \frac{e^{a-b} + e^{-(a-b)}}{4} = \frac{\cosh(a+b)}{2} - \frac{\cosh(a-b)}{2} \tag{1.16} \]

\[ \sinh(a)\cosh(b) = \left(\frac{e^{a} - e^{-a}}{2}\right)\left(\frac{e^{b} + e^{-b}}{2}\right) = \frac{e^{a+b} - e^{-(a+b)}}{4} + \frac{e^{a-b} - e^{-(a-b)}}{4} = \frac{\sinh(a+b)}{2} + \frac{\sinh(a-b)}{2} \tag{1.17} \]

Using (1.15) and (1.16) the denominator above simplifies to $D = \cosh(U + W)$. Using (1.17) for each of the two terms in the numerator, the $\sinh(W - U)$ and $\sinh(U - W)$ contributions cancel and the numerator simplifies to $N = \sinh(U + W)$. So,

\[ w' = \tanh(U + W) \]

which reveals that we can linearly add velocity parameters, then apply tanh to reduce the final parameter to the final velocity.

b) Velocity of the 2nd star relative to the first, $u_2 = 0.9$. The velocity of the $n$th star relative to the $(n-1)$th is also 0.9. So the velocity of the $N$th star relative to the first is,

\[ u'_N = \tanh[(N - 1)U] \]

where $0.9 = \tanh(U)$.
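As a quick numerical check of part b), one can compare repeated application of the velocity addition law with the closed form $\tanh[(N-1)U]$. A minimal sketch; the function names are mine and the choice $N = 5$ is purely illustrative.

```python
import math

def compose(u, w):
    """Einstein velocity addition law (units with c = 1)."""
    return (u + w) / (1.0 + u * w)

def nth_star_velocity(n, step=0.9):
    """Velocity of the n-th star relative to the first, via velocity parameters."""
    U = math.atanh(step)             # velocity parameter of one step
    return math.tanh((n - 1) * U)    # parameters add linearly

def nth_star_by_composition(n, step=0.9):
    """Same quantity obtained by applying the addition law n-1 times."""
    v = 0.0
    for _ in range(n - 1):
        v = compose(v, step)
    return v

n = 5  # illustrative choice
print(nth_star_velocity(n), nth_star_by_composition(n))
```

Both routes give a speed that approaches but never reaches 1 as $N$ grows.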

1.14.19 a) Lorentz Transformation using velocity parameter

\[ \bar{t} = \gamma t - \gamma v x \tag{1.18} \]

\[ \bar{x} = -\gamma v t + \gamma x \]

\[ \bar{y} = y \]

\[ \bar{z} = z \]

Let $v = \tanh(V)$. Note that the Lorentz factor also simplifies,

\[ \gamma \equiv \frac{1}{\sqrt{1 - v^2}} = \left(1 - \tanh^2(V)\right)^{-1/2} = \left(\frac{\cosh^2(V)}{\cosh^2(V) - \sinh^2(V)}\right)^{1/2} = \pm\cosh(V) \tag{1.19} \]

I’m not sure why we always take the positive root in the Lorentz factor. The final equality follows from the following identity, which is stated without proof in b).

\[ \cosh^2(V) - \sinh^2(V) = \left(\frac{e^{V} + e^{-V}}{2}\right)^2 - \left(\frac{e^{V} - e^{-V}}{2}\right)^2 = \left(\frac{e^{2V} + e^{-2V} + 2}{4}\right) - \left(\frac{e^{2V} + e^{-2V} - 2}{4}\right) = 1 \tag{1.20} \]

Substituting $v = \tanh(V)$ and (1.19) into (1.18) gives the desired result,

\[ \bar{t} = \cosh(V)\,t - \sinh(V)\,x \tag{1.21} \]

\[ \bar{x} = -\sinh(V)\,t + \cosh(V)\,x \]

\[ \bar{y} = y \]

\[ \bar{z} = z \]

1.14.19 b) invariance of the interval using velocity parameter

The given identity is derived above (1.20). Invariance of the interval follows from straightforward substitution of (1.21).

\[ \Delta\bar{s}^2 = -\Delta\bar{t}^2 + \Delta\bar{x}^2 + \Delta\bar{y}^2 + \Delta\bar{z}^2 \]

\[ = -(\cosh(V)\Delta t - \sinh(V)\Delta x)^2 + (-\sinh(V)\Delta t + \cosh(V)\Delta x)^2 + \Delta y^2 + \Delta z^2 \]

\[ = \Delta s^2 \tag{1.22} \]

In the final equality, the cross terms cancelled directly while the squared terms simplified with the identity (1.20).

1.14.19 c) analogy between Lorentz transformation using velocity parameter and Euclidean coordinate transformation

Hyperbolic trigonometric functions replace regular trigonometric functions, but the sign changes for the sine term in the Euclidean coordinate transformation and not the sinh term of the Lorentz transformation.

The analog to the interval $\Delta s^2$ is the squared distance to the origin.

The analog to the invariant hyperbolae are circles. These could be used to calibrate axes of the rotated Euclidean frame.

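1.14.20 Lorentz transformation in matrix form

\[ \bar{\mathbf{x}} = A\,\mathbf{x} \]

where

\[ \bar{\mathbf{x}} = \begin{pmatrix} \bar{t} \\ \bar{x} \\ \bar{y} \\ \bar{z} \end{pmatrix}, \qquad \mathbf{x} = \begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix} \]

and, acting on the $(t, x)$ components while leaving $y$ and $z$ unchanged,

\[ A = \begin{pmatrix} \cosh(V) & -\sinh(V) & 0 & 0 \\ -\sinh(V) & \cosh(V) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \]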
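The matrix form makes the invariance of the interval easy to check numerically. The following is a minimal sketch, assuming a boost along $x$ with velocity parameter $V$ and an arbitrarily chosen event; plain Python lists are used to keep it self-contained, and the function names are mine.

```python
import math

def boost_x(V):
    """4x4 Lorentz boost along x for velocity parameter V (v = tanh V)."""
    c, s = math.cosh(V), math.sinh(V)
    return [[c, -s, 0.0, 0.0],
            [-s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def apply(A, x):
    """Matrix-vector product A x for a 4x4 matrix and a 4-component list."""
    return [sum(A[i][j] * x[j] for j in range(4)) for i in range(4)]

def interval(x):
    """Interval from the origin: -t^2 + x^2 + y^2 + z^2."""
    t, xx, y, z = x
    return -t * t + xx * xx + y * y + z * z

V = math.atanh(0.6)              # illustrative velocity v = 0.6
event = [2.0, 1.0, 0.5, -3.0]    # illustrative event (t, x, y, z)
event_bar = apply(boost_x(V), event)
print(interval(event), interval(event_bar))
```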

1.14.21 a) Timelike separated events can be transformed to occur at the same point.

1.15 Additional thoughts

I think it’s worth mentioning that the Lorentz transformation, which is linear by construction, transforms lines to lines. This is easily verified by substituting the equation for a line in $O$ and confirming that it’s also a line in $\bar{O}$.

It’s also worth pointing out that a tangent line to a curve in $O$ remains a tangent line in $\bar{O}$. Of course it would be quite strange if this were not true, but on the other hand it was not immediately obvious to me that it holds.

Chapter 2

Vector Analysis in Special Relativity

2.1 Definition of a vector

Buried in footnote 2 on p. 35 is an important notational point.

2.2 Vector algebra

Eq. (2.10) introduces a strange notational twist. Apparently enclosing the vectors $\vec{e}_{\alpha}$ with parentheses and writing a superscript $\beta$ implies that we are forming a tensor from the set of these vectors?

\[ (\vec{e}_{\alpha})^{\beta} = \delta^{\beta}{}_{\alpha} \]

There’s no comment to explain this. Earlier the author explained that the superscript notation will become clear when he introduces differential geometry. For now I just note that the RHS is the Kronecker delta, which is a second-rank tensor.

Eq. (2.18) is described as a key formula. Exercise 2.11c is to verify it.

2.4 The four-momentum

Typo on p. 42, in the example, p1 = mv(1− v2)−1/2 should be p1 = mv(1−v2)−1/2.


2.9 Exercises

2 a) α is the dummy index. One equation.

b) ν is the dummy index. µ is the free index. Four equations.

d) ν and µ are free indices, and there are 16 equations. Although the indices are repeated, they’re not repeated in the same factor, and one is not superscript.

3 Prove Eq. (2.5). There’s nothing to prove really. It follows immediately from the definition and notation conventions. In particular, the LHS involves a sum over all values of the dummy index $\beta \in \{0, 1, 2, 3\}$, see p. 34. The RHS merely spells this out, with the convention that Roman indices like $i$ take all values $i \in \{1, 2, 3\}$.

4 a) $-6\vec{A} \to_{O} (-30, 6, 0, -6)$

5 a) Show that the basis vectors are linearly independent. Start with a general linear combination with coefficients $a^{\alpha}$ and take its $\mu$ component,

\[ 0 = a^{\alpha}(\vec{e}_{\alpha})^{\mu} = a^{\alpha}\delta^{\mu}{}_{\alpha} \]

Start with the first component, $\mu = 0$. The equation above is $0 = a^{0} \times 1$, so $a^{0}$ must be zero. Similarly for the other components. Since this trivial solution is the only solution, the basis vectors must be linearly independent.

5 b) The given set is not linearly independent, since the linear combination (−5,−3,+2, 1) gives the zero vector.

6 As in Fig. 1.5, the $\bar{t}$ and $\bar{x}$ axes are tilted at an angle $\phi$ relative to their $O$ frame counterparts and toward the world line of the light ray $t = x$. The basis vectors are parallel to these $\bar{O}$ axes. Here $\tan(\phi) = 0.6$. For $\bar{\bar{O}}$ the axes will be tilted even further toward $t = x$. The angle $\theta$ of these basis vectors can be computed as

\[ \tan(\theta) = \tanh(2\,\mathrm{arctanh}(0.6)) \]
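For a concrete number, the expression for $\tan(\theta)$ can be evaluated in closed form using the double-argument identity $\tanh(2a) = 2\tanh(a)/(1 + \tanh^2(a))$, which is the velocity addition law applied to two equal velocities. A short worked evaluation:

```latex
\[
\tan(\theta) = \tanh\!\big(2\,\mathrm{arctanh}(0.6)\big)
             = \frac{2(0.6)}{1 + (0.6)^2}
             = \frac{1.2}{1.36}
             = \frac{15}{17} \approx 0.88
\]
```

So the second set of axes is tilted to a slope of about 0.88, compared with 0.6 for the first.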


7 a) Verify Eq. (2.10). As mentioned above, this is a strange notational twist. If we write the basis vectors as row vectors as in Eq. 2.9, then the set forms a matrix, and the matrix element is unity when row and column numbers are equal, and zero otherwise, i.e. the identity matrix. The RHS of Eq. 2.10 can of course be written as the identity matrix.

7 b) I’ve always thought of Eq. 2.11 as the definition of the vector, so it seems to me a tautology, rather than something to prove. Perhaps it’s worth stating the result in words. If you use the components of the vector $\vec{A}$ to form the linear combination of basis vectors $\vec{e}_{\alpha}$, i.e.

\[ A^{\alpha}\vec{e}_{\alpha} \]

then you, of course, recover the vector $\vec{A}$. In particular, for the first component of the sum, only the term $A^{0}\vec{e}_{0}$ contributes, since the other basis vectors are all zero in the first component. Similarly for the other components.

8 a) The zero vector has the same components in all reference frames. This follows immediately from the use of a linear transformation to go between reference frames. See p. 35, and Eq. 2.7 for the definition of the general (4-) vector and the linear transformation.

8 b) If two vectors have equal components in one frame their components are equal in all frames. My first thought is that if their components are equal in a given frame, then they’re the same vector. By the definition of a vector, they are invariant under coordinate transformation. So their components are equal in all other frames. But that doesn’t use 8a.

Using 8a, one could subtract the two equal vectors, giving the zero vector in that frame. Under coordinate transformation, this difference vector remains the zero vector. Thus their components must be equal in any other frame.

9 There are 16 terms to write out, which is too much work. It seems convincing enough to me to note that for each term in the sum on the LHS, there is a corresponding term on the RHS. In general these terms look like,

\[ \Lambda^{\bar\alpha}{}_{\beta}\,A^{\beta}\,\vec{e}_{\bar\alpha} \]

Of course the order of summation doesn’t matter for a finite sum. Substituting specific values for the dummy indices might make this more clear, say $\alpha = 0$, $\beta = 1$.

10 Prove Eq. (2.13) from

\[ A^{\alpha}\left(\Lambda^{\bar\beta}{}_{\alpha}\,\vec{e}_{\bar\beta} - \vec{e}_{\alpha}\right) = 0 \]

Choosing any $A^{\alpha}$ with only one non-zero entry, like (1, 0, 0, 0), or (10, 0, 0, 0), shows straight away that

\[ \Lambda^{\bar\beta}{}_{0}\,\vec{e}_{\bar\beta} = \vec{e}_{0} \]

and similarly (0, 1, 0, 0), or (0, 2, 0, 0), shows straight away

\[ \Lambda^{\bar\beta}{}_{1}\,\vec{e}_{\bar\beta} = \vec{e}_{1}. \]

So repeating this argument gives the result for the other two basis vectors.

Perhaps more instructive is to note that this result works for more general situations. The quantity inside the parentheses is a set of 4 different vectors $\vec{v}_{\alpha}$,

\[ \left(\Lambda^{\bar\beta}{}_{\alpha}\,\vec{e}_{\bar\beta} - \vec{e}_{\alpha}\right) = \vec{v}_{\alpha} \]

Then view the components $A^{\alpha}$ as the coefficients of a linear combination of these vectors $\vec{v}_{\alpha}$. Now it’s clear that the RHS is not just the number zero, but the 4-vector (0, 0, 0, 0). The linear combination of the set of $\vec{v}_{\alpha}$ must sum to the zero vector for arbitrary coefficients of the linear combination. If the first three led to a non-zero vector,

\[ \sum_{\alpha=0}^{2} A^{\alpha}\vec{v}_{\alpha} = (2, 4, 6, 8) \]

then $A^{3}$ would have to be chosen so as to bring this to zero. For example, if $\vec{v}_{3} = (1, 2, 3, 4)$ one would have to choose $A^{3} = -2$. But since $A^{\alpha}$ was arbitrary, choosing $A^{3} = +2$ would violate the equality. So this means that the only way it could work is if

\[ \sum_{\alpha=0}^{2} A^{\alpha}\vec{v}_{\alpha} = 0 \]

and $\vec{v}_{3} = 0$. One can now repeat this argument for the $\sum_{\alpha=0}^{1} A^{\alpha}\vec{v}_{\alpha}$ etc. and show that all $\vec{v}_{\alpha}$ are the zero 4-vector. And the result Eq. (2.13) holds.

11 (a) Matrix of $\Lambda^{\nu}{}_{\bar\mu}(-v)$. Exercise 1.20 was to put the Lorentz transformation in matrix form. Note that $\sinh(-V) = -\sinh(V)$, $\cosh(-V) = \cosh(V)$. So we only have to change the sign of the $\sinh(V)$ elements,

\[ \Lambda(-v) = \begin{pmatrix} \cosh(V) & \sinh(V) & 0 & 0 \\ \sinh(V) & \cosh(V) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \]

where $v = \tanh(V)$.

(b) $A^{\bar\alpha}$ for all $\alpha$.

\[ A^{\bar 0} = \cosh(V)A^{0} - \sinh(V)A^{1} \tag{2.1} \]

\[ A^{\bar 1} = -\sinh(V)A^{0} + \cosh(V)A^{1} \tag{2.2} \]

\[ A^{\bar 2} = A^{2} \tag{2.3} \]

\[ A^{\bar 3} = A^{3} \tag{2.4} \]

(c) Verify Eq. (2.18). Written out in matrix form Eq. (2.18) becomes,

\[ \begin{pmatrix} \cosh(V) & \sinh(V) & 0 & 0 \\ \sinh(V) & \cosh(V) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \cosh(V) & -\sinh(V) & 0 & 0 \\ -\sinh(V) & \cosh(V) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \]

To show this it’s useful to use the hyperbolic function identity,

\[ \cosh^2(x) - \sinh^2(x) = 1. \]

Eq. (2.18) follows immediately from matrix multiplication. This identity is easy to derive, and can be found at http://en.wikipedia.org/wiki/Hyperbolic_function#Similarities_to_circular_trigonometric_functions along with other properties.


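(d) The Lorentz transformation matrix from $\bar{O}$ to $O$ is just the matrix in (a). Since $\bar{O}$ is moving toward increasing $x$ with velocity $v$ with respect to $O$, then from $\bar{O}$’s point of view $O$ is moving toward increasing $x$ with velocity $-v$.

(e) $A^{\alpha}$ for all $\alpha$. The final equality in each line follows on substituting the results of (b).

\[ \Lambda^{0}{}_{\bar\beta}(-v)A^{\bar\beta} = \cosh(V)A^{\bar 0} + \sinh(V)A^{\bar 1} = A^{0} \tag{2.5} \]

\[ \Lambda^{1}{}_{\bar\beta}(-v)A^{\bar\beta} = \sinh(V)A^{\bar 0} + \cosh(V)A^{\bar 1} = A^{1} \tag{2.6} \]

\[ \Lambda^{2}{}_{\bar\beta}(-v)A^{\bar\beta} = A^{\bar 2} = A^{2} \tag{2.7} \]

\[ \Lambda^{3}{}_{\bar\beta}(-v)A^{\bar\beta} = A^{\bar 3} = A^{3} \tag{2.8} \]

Relation to Eq. (2.18): Multiplying the vector $\vec{A}$ on the left by the Lorentz transformation matrix $\Lambda(v)$ gives the components in the $\bar{O}$ frame, $A^{\bar\alpha} = \Lambda^{\bar\alpha}{}_{\beta}(v)A^{\beta}$. Multiplying this vector on the left by the Lorentz transformation matrix $\Lambda(-v)$ should return the vector to the $O$ frame. And indeed it does, when we use Eq. (2.18) in the final step below:

\[ \Lambda^{\nu}{}_{\bar\alpha}(-v)A^{\bar\alpha} = A^{\nu} \tag{2.9} \]

\[ \Lambda^{\nu}{}_{\bar\alpha}(-v)\Lambda^{\bar\alpha}{}_{\beta}(v)A^{\beta} = A^{\nu} \tag{2.10} \]

\[ \delta^{\nu}{}_{\beta}A^{\beta} = A^{\nu} \tag{2.11} \]

(f) Verify that the order of applying the transformations doesn’t matter. Physically we know this must be true. Mathematically it works out because if we repeat (c) with the matrices in the opposite order, we get the same result:

\[ \begin{pmatrix} \cosh(V) & -\sinh(V) & 0 & 0 \\ -\sinh(V) & \cosh(V) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \cosh(V) & \sinh(V) & 0 & 0 \\ \sinh(V) & \cosh(V) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \]

(g) Establish that

\[ \vec{e}_{\alpha} = \delta^{\nu}{}_{\alpha}\,\vec{e}_{\nu} \]

I find this a rather strange question. From the definition of the Kronecker delta function, Eq. (1.4c), the result is immediately obvious. Another way to see this is that the Kronecker delta can be written as the identity matrix. And of course, writing the vector on the RHS as a column vector, multiplying by the identity matrix, gives back the original vector.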

12 (b) Remember not to add the velocities linearly, but to use the Einstein law of composition of velocities Eq. (1.13), or use the velocity parameters introduced in Exercise 1.18.

(c) Note that the definition of the magnitude of the vector is analogous to the interval introduced in Chapter 1, see Eq. (2.24).

\[ \vec{A}^2 = -0^2 + (-2)^2 + 3^2 + 5^2 = 38. \]

(d) The magnitude should be independent of the reference frame, because of the invariance of the interval.

13 (a) Transformation of coordinates from $O$ to $\bar{\bar{O}}$ can be constructed in two steps. First transform to $\bar{O}$,

\[ A^{\bar\gamma} = \Lambda^{\bar\gamma}{}_{\mu}(v)\,A^{\mu}. \]

Then transform from $\bar{O}$ to $\bar{\bar{O}}$,

\[ A^{\bar{\bar\alpha}} = \Lambda^{\bar{\bar\alpha}}{}_{\bar\gamma}(v')\left(\Lambda^{\bar\gamma}{}_{\mu}(v)\,A^{\mu}\right). \]

So the Lorentz transformation from $O$ to $\bar{\bar{O}}$ is

\[ \Lambda^{\bar{\bar\alpha}}{}_{\mu} = \Lambda^{\bar{\bar\alpha}}{}_{\bar\gamma}(v')\,\Lambda^{\bar\gamma}{}_{\mu}(v). \]

(b) I thought we just did show that Eq. (2.41) was the matrix product of the two individual Lorentz transformations. Maybe he means write it out in matrix form? I’m not sure what he’s looking for.

(c) This was an important exercise for me because I learned that the Lorentz transformation matrix did not have to be symmetric when there are velocity components in two directions. Taking $v$ along $x$ and $v'$ along $y$, and multiplying the matrices in the order given in (a),

\[ \Lambda^{\bar{\bar\alpha}}{}_{\mu} = \begin{pmatrix} \gamma(v)\gamma(v') & -\gamma(v)\gamma(v')v & -\gamma(v')v' & 0 \\ -\gamma(v)v & \gamma(v) & 0 & 0 \\ -\gamma(v)\gamma(v')v' & \gamma(v)\gamma(v')vv' & \gamma(v') & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \]

(d) Show that the interval is invariant under the above transformation.

(e) Show that the order matters in constructing the Lorentz transformation as in (a), i.e.

\[ \Lambda^{\alpha}{}_{\gamma}(v)\,\Lambda^{\gamma}{}_{\mu}(v') \neq \Lambda^{\alpha}{}_{\gamma}(v')\,\Lambda^{\gamma}{}_{\mu}(v) \]

Using the example from (c), the LHS of the above would be,

\[ \mathrm{LHS} = \begin{pmatrix} \gamma(v)\gamma(v') & -\gamma(v)v & -\gamma(v)\gamma(v')v' & 0 \\ -\gamma(v)\gamma(v')v & \gamma(v) & \gamma(v)\gamma(v')vv' & 0 \\ -\gamma(v')v' & 0 & \gamma(v') & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \neq \Lambda^{\bar{\bar\alpha}}{}_{\mu} \]

Comparison with the matrix in (c) shows it’s different: it is the transpose, which is what one expects since each individual boost matrix is symmetric. This is surprising if we think in a Galilean way. However, mathematically we know in general that matrix multiplication is not commutative, http://en.wikipedia.org/wiki/Matrix_multiplication#Common_properties. Physically we know that the Lorentz transformation results in the axes tilting toward the $t = x$ line, as in Fig. 1.5. The order of rotations matters. For example, rotating the globe 90° to the east about the polar axis, then 45° clockwise about the axis through the Equator and 90°W and 90°E, puts the coordinates 0°N, 0°E where the South Indian Ocean used to be. But performing the same rotations in the opposite order leaves the coordinates 0°N, 0°E on the old Equatorial plane.
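The non-commutativity of boosts in different directions is easy to confirm numerically. This is a minimal sketch, assuming illustrative values $v = 0.6$ along $x$ and $v' = 0.8$ along $y$; the helper names are mine and plain lists stand in for matrices.

```python
import math

def gamma(v):
    """Lorentz factor for speed v (units with c = 1)."""
    return 1.0 / math.sqrt(1.0 - v * v)

def boost_x(v):
    g = gamma(v)
    return [[g, -g * v, 0.0, 0.0],
            [-g * v, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def boost_y(v):
    g = gamma(v)
    return [[g, 0.0, -g * v, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [-g * v, 0.0, g, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def matmul(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(A):
    return [[A[j][i] for j in range(4)] for i in range(4)]

v, v_prime = 0.6, 0.8  # illustrative values (assumption)

# Boost along x first, then along y: the later boost multiplies on the left.
x_then_y = matmul(boost_y(v_prime), boost_x(v))
# Opposite order.
y_then_x = matmul(boost_x(v), boost_y(v_prime))

for row in x_then_y:
    print(["%.3f" % entry for entry in row])
```

Because each single boost matrix is symmetric, the two orderings are transposes of each other, so they agree only if the product happens to be symmetric.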

14 (a) $v = -3/5$ in the positive $z$ direction. The off-diagonal term gives the direction, $-v\gamma = 0.75$, and the diagonal term gives $\gamma = 1.25$. One can confirm that $\gamma = 1/\sqrt{1 - v^2}$, once $v$ is found.

(b) Since it’s a Lorentz transformation, the inverse should be obtained from the Lorentz transformation from $\bar{O}$ back to $O$.

\[ \Lambda(-v) = \begin{pmatrix} 1.25 & 0 & 0 & -0.75 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -0.75 & 0 & 0 & 1.25 \end{pmatrix} \]

And matrix multiplication confirms this is the inverse.

(c)

\[ \begin{pmatrix} 1.25 & 0 & 0 & -0.75 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -0.75 & 0 & 0 & 1.25 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 1.25 \\ 2 \\ 0 \\ -0.75 \end{pmatrix} \]
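Both the inverse in (b) and the product in (c) can be checked with a few lines of code. A minimal sketch; the forward matrix below is my assumption, reconstructed from the values $\gamma = 1.25$ and $-v\gamma = 0.75$ quoted in (a), since the exercise's matrix is not reproduced in these notes.

```python
# Forward transformation, assumed from gamma = 1.25 and -v*gamma = 0.75 in (a).
forward = [[1.25, 0.0, 0.0, 0.75],
           [0.0, 1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 0.0],
           [0.75, 0.0, 0.0, 1.25]]

# Candidate inverse: the same boost with the sign of v reversed.
inverse = [[1.25, 0.0, 0.0, -0.75],
           [0.0, 1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 0.0],
           [-0.75, 0.0, 0.0, 1.25]]

def matmul(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def apply(A, x):
    """Matrix-vector product for a 4x4 matrix and a 4-component list."""
    return [sum(A[i][j] * x[j] for j in range(4)) for i in range(4)]

identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

product = matmul(inverse, forward)
result = apply(inverse, [1.0, 2.0, 0.0, 0.0])
print(product == identity, result)
```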

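15 (a) The particle 3-velocity is $\mathbf{v} = (v, 0, 0)$. In the frame moving with the particle, the 4-velocity is $\vec{e}_{\bar 0}$, so $\vec{A} \to_{\bar{O}} (1, 0, 0, 0)$. The Lorentz transformation back to the $O$ frame is

\[ \Lambda(-v) = \begin{pmatrix} \gamma(v) & v\gamma(v) & 0 & 0 \\ v\gamma(v) & \gamma(v) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \]

So $\vec{A}$ in the $O$ frame has components $\vec{A} \to_{O} (\gamma(v), v\gamma(v), 0, 0)$.

(b) For general particle 3-velocity $\mathbf{v} = (u, v, w)$. Let’s start with a slightly less general 3-velocity $\mathbf{v} = (u, v, 0)$ to make the algebra easier. One could rotate through an angle $\theta$ to a frame where $\mathbf{v} = (|\mathbf{v}|, 0, 0)$. Here $\theta$ is such that

\[ \begin{pmatrix} u' \\ v' \end{pmatrix} = \begin{pmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} |\mathbf{v}| \\ 0 \end{pmatrix} \]

Now we have the situation as in (a) so we can apply the Lorentz transformation back to the $O$ frame

\[ \Lambda(-\mathbf{v}) = \begin{pmatrix} \gamma(|\mathbf{v}|) & |\mathbf{v}|\gamma(|\mathbf{v}|) & 0 & 0 \\ |\mathbf{v}|\gamma(|\mathbf{v}|) & \gamma(|\mathbf{v}|) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \]

So $\vec{A}$ in a frame moving with the $O$ frame but rotated through $\theta$ has components $(\gamma(|\mathbf{v}|), |\mathbf{v}|\gamma(|\mathbf{v}|), 0, 0)$. Finally we rotate through $-\theta$ to obtain $\vec{A}$ in the $O$ frame

\[ \vec{A} \to_{O} (\gamma(|\mathbf{v}|),\ |\mathbf{v}|\gamma(|\mathbf{v}|)\cos(\theta),\ |\mathbf{v}|\gamma(|\mathbf{v}|)\sin(\theta),\ 0) \tag{2.12} \]

\[ = (\gamma(|\mathbf{v}|),\ u\gamma(|\mathbf{v}|),\ v\gamma(|\mathbf{v}|),\ 0) \tag{2.13} \]

Finally, there’s no reason for the $z$ component to behave differently, so we can generalize this. For general particle 3-velocity $\mathbf{v} = (u, v, w)$, the 4-velocity is

\[ \vec{A} \to_{O} (\gamma(|\mathbf{v}|),\ u\gamma(|\mathbf{v}|),\ v\gamma(|\mathbf{v}|),\ w\gamma(|\mathbf{v}|)) \tag{2.15} \]

where

\[ |\mathbf{v}| = \sqrt{u^2 + v^2 + w^2}. \]

(c) Starting with the 4-velocity components $\{U^{\alpha}\}$, one can write the 3-velocity,

\[ \mathbf{v} = (U^{1}/\gamma,\ U^{2}/\gamma,\ U^{3}/\gamma) \]

where $\gamma \equiv 1/\sqrt{1 - \mathbf{v}\cdot\mathbf{v}} = U^{0}$.

(d) Applying the above formula, if the 4-velocity is given as (2, 1, 1, 1) then the 3-velocity is $\mathbf{v} = (1/2, 1/2, 1/2)$. Note the magnitude of the 4-velocity is $-4 + 3 = -1$, making it a legitimate example.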

16 Particle moves with speed w, say along the x−axis, in a referenceframe O moving along the x−axis with speed v. Deriving Einstein’s velocityaddition law from a Lorentz transformation of the particle’s 4-velocity.

The particle’s 4-velocity in reference frame O, U →O (γ(w), γ(w)w, 0, 0).Lorentz transformation from O to O

Λ(−v) =

γ(v) v γ(v) 0 0v γ(v) γ(v) 0 0

0 0 1 00 0 0 1

.


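So the 4-velocity is $\vec{U} \underset{O}{\to} (\gamma(w)\gamma(v) + vw\,\gamma(w)\gamma(v),\ v\gamma(v)\gamma(w) + w\gamma(v)\gamma(w),\ 0,\ 0)$. Converting this to the 3-velocity using the formula in 15c,

$$v_x = \frac{U^x}{U^0} \tag{2.16}$$

$$= \frac{\gamma(v)\,\gamma(w)\,(v + w)}{\gamma(v)\,\gamma(w)\,(1 + v w)} \tag{2.17}$$

$$= \frac{v + w}{1 + v w}. \tag{2.18}$$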
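The same steps can be reproduced numerically: build the 4-velocity in $\bar{O}$, boost it back to $O$ with $\Lambda(-v)$, and read off $U^x/U^0$. This is only a sketch; `boost_x`, `apply` and `add_velocities` are hypothetical helper names.

```python
import math

def gamma(v):
    return 1.0 / math.sqrt(1.0 - v * v)

def boost_x(v):
    """Lorentz matrix Lambda(v) for a boost along x (units with c = 1)."""
    g = gamma(v)
    return [[g, -v * g, 0.0, 0.0],
            [-v * g, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def apply(L, A):
    """Matrix times column vector."""
    return [sum(L[i][j] * A[j] for j in range(4)) for i in range(4)]

def add_velocities(v, w):
    """Speed in O of a particle moving at w in a frame that moves at v."""
    U_bar = [gamma(w), gamma(w) * w, 0.0, 0.0]   # 4-velocity in O-bar
    U = apply(boost_x(-v), U_bar)                # components in O
    return U[1] / U[0]

for v, w in [(0.5, 0.5), (0.9, 0.9), (0.1, -0.3)]:
    # Second column is the closed form (v + w) / (1 + v w) of Eq. (2.18).
    print(v, w, add_velocities(v, w), (v + w) / (1.0 + v * w))
```

Both columns should agree to rounding error, and the result stays below 1 for any pair of subluminal speeds.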

17 (a) Prove that any timelike vector $\vec{U}$ for which $U^0 > 0$ and $\vec{U}\cdot\vec{U} = -1$ is the four-velocity of some world line.

The four-velocity is the $\vec{e}_{\bar 0}$ of the MCRF. If $\vec{U}$ is some world line’s four-velocity, then there exists a Lorentz transformation to a frame $\bar{O}$ for which $\vec{U} \underset{\bar O}{\to} (1, 0, 0, 0)$. Let’s see if that’s possible for the given vector $\vec{U}$.

The coordinate system can be rotated so that $U^\alpha = (U^0, u, 0, 0)$, just to make the algebra simpler. Now apply an arbitrary Lorentz transformation,

$$\begin{pmatrix} \gamma(v) & -v\,\gamma(v) & 0 & 0 \\ -v\,\gamma(v) & \gamma(v) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} U^0 \\ u \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix},$$

for some $v$ and $\gamma(v)$. Thus we require

$$1 = \gamma\,(U^0 - v u) \tag{2.19}$$

$$0 = \gamma\,(u - U^0 v). \tag{2.20}$$

But in general we require $\gamma \ge 1$, so the second equation (2.20) requires $v = u/U^0$. We know $U^0 > 0$ (given), and it follows from the fact that $\vec{U}$ is timelike that $U^0 > |u|$. Thus $|v| < 1$, so $\gamma(v) \ge 1$ and, most importantly, $\gamma(v) \in \mathbb{R}$, i.e. the Lorentz transformation is possible. Does this required Lorentz transformation also bring the time component to unity?

The algebra can get messy, but simplifies if we use the fact that $\vec{U}\cdot\vec{U} = -1$. Eliminating $v$ in the first equation (2.19) gives

$$\frac{1}{\gamma(v)} = U^0 - \frac{u^2}{U^0} = \frac{1}{U^0}\left((U^0)^2 - u^2\right) = \frac{1}{U^0}.$$


So $\gamma = U^0$. This agrees with $\gamma(v) = 1/\sqrt{1 - v^2}$ evaluated at $v = u/U^0$, so the time component is indeed brought to unity.

(b) Use this to prove that for any timelike vector $\vec{V}$ there is a Lorentz frame in which $\vec{V}$ has zero spatial components.

The magnitude of a vector is the interval between the origin and the coordinates of the vector. For a timelike interval the vector is timelike, and vice versa. Timelike intervals can be transformed via a Lorentz transformation to have zero spatial part, see Exercise 1.21. The corresponding vector will have zero spatial components.

If you haven’t done Exercise 1.21, you can construct a proof using part 17(a). We are no longer required to make the time part unity; we only require the space part to be zero, i.e. (2.20), $0 = \gamma(v)(u - V^0 v)$, where now $u^2 = V^i V_i$. We no longer have $V^0 > 0$, but that doesn’t matter. Because it’s a timelike vector we have

$$(V^0)^2 > V^i V_i = u^2.$$

So (2.20) now implies that

$$|v| < 1,$$

and again $\gamma(v) \in \mathbb{R}$, i.e. the Lorentz transformation is possible.

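18 (a) Sum of two spacelike orthogonal vectors is spacelike.

By definition, orthogonal vectors have $\vec{A}\cdot\vec{B} = 0$, so

$$(\vec{A} + \vec{B})\cdot(\vec{A} + \vec{B}) = \vec{A}\cdot\vec{A} + \vec{B}\cdot\vec{B} + 2\,\vec{A}\cdot\vec{B} \tag{2.21}$$

$$= \vec{A}\cdot\vec{A} + \vec{B}\cdot\vec{B} > 0. \tag{2.22}$$

The inequality holds because spacelike vectors have positive magnitude, $\vec{A}\cdot\vec{A} > 0$. So $(\vec{A} + \vec{B})$ is also spacelike.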

(b) Timelike vector and null vector cannot be orthogonal.

Consider a timelike vector $\vec{A}$. Let’s keep the algebra simple and rotate to a coordinate frame such that the spatial part of $\vec{A}$ is all in one component,

$$\vec{A} \underset{O}{\to} (A^0, A^1, 0, 0).$$

The null vector $\vec{N}$ has unknown coordinates in this frame, but

$$\vec{A}\cdot\vec{N} = -A^0 N^0 + A^1 N^1.$$

19 Stuck!

20 The particle moves in a circle in the $x$-$y$ plane of radius $b$, in a clockwise sense when viewed in the direction of decreasing $z$. The circle translates along the $x$-axis at speed $a$. It’s stated that $|b\omega| < 1$, but the requirement for a realistic particle is actually that $|a + b\omega| < 1$. The 3-velocity is computed directly by differentiating the given equations, $\mathbf{v} \underset{O}{\to} (\dot{x}, \dot{y}, 0)$, where

$$\dot{x} = a + \omega b \sin(\omega t) \tag{2.23}$$

$$\dot{y} = -\omega b \cos(\omega t). \tag{2.24}$$

The 4-velocity is obtained from the 3-velocity using the formula derived in problem 2.15b,

$$\vec{V} \underset{O}{\to} (\gamma(v),\ \dot{x}\gamma(v),\ \dot{y}\gamma(v),\ 0),$$

where $v = |\mathbf{v}| = \sqrt{(a + \omega b \sin(\omega t))^2 + (\omega b \cos(\omega t))^2} = \sqrt{a^2 + 2 a \omega b \sin(\omega t) + \omega^2 b^2}$.

To obtain the 4-acceleration we require the 4-velocity as a function of proper time, $\tau$, not $t$, the time in the inertial frame. But remember that the proper time is the time measured by a clock at, say, the origin of the MCRF. Call this frame $\bar{O}$, and then $\bar{t} = \tau = x^{\bar 0}$. And $t = \Lambda(-\mathbf{v})^{0}{}_{\bar\alpha}\, x^{\bar\alpha}$.

For simplicity we choose the MCRF with origin at the particle location, so $x^{\bar\alpha} \underset{\bar O}{\to} (\tau, 0, 0, 0)$, and $t = \gamma(-v)\tau = \gamma(v)\tau$. Then we obtain the 4-acceleration from the given equations in $t$ and the chain rule,

$$\vec{a} \equiv \frac{d\vec{U}}{d\tau} = \frac{d\vec{U}}{dt}\frac{dt}{d\tau} = \gamma(v)\,\frac{d\vec{U}}{dt}.$$

We now confront the question as to whether or not to let γ(v) in thisderivative! Stuck!

Let’s retreat to safer ground and compute the 4-velocity from the positionas a function of t.


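21 The motion is hyperbolic in frame $O$,

$$x^2 - t^2 = a^2\cosh^2\!\left(\frac{\lambda}{a}\right) - a^2\sinh^2\!\left(\frac{\lambda}{a}\right) = a^2,$$

and therefore hyperbolic in all reference frames, $-\bar{t}^2 + \bar{x}^2 = a^2$. The velocity is obtained by differentiating with respect to $\lambda$,

$$v = \frac{dx}{dt} = \frac{dx}{d\lambda}\Big/\frac{dt}{d\lambda} = \tanh\!\left(\frac{\lambda}{a}\right).$$

So we notice that $\lambda/a$ is a velocity parameter for $v$, see problem 1.18.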

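The Lorentz transformation to the MCRF can be written in a simple form with the velocity parameter, see problem 1.20:

$$\Lambda = \begin{pmatrix} \cosh\!\left(\frac{\lambda}{a}\right) & -\sinh\!\left(\frac{\lambda}{a}\right) \\ -\sinh\!\left(\frac{\lambda}{a}\right) & \cosh\!\left(\frac{\lambda}{a}\right) \end{pmatrix}.$$

Thus we find the points transform to

$$\begin{pmatrix} \bar{t}(\lambda) \\ \bar{x}(\lambda) \end{pmatrix} = \begin{pmatrix} \cosh\!\left(\frac{\lambda}{a}\right) & -\sinh\!\left(\frac{\lambda}{a}\right) \\ -\sinh\!\left(\frac{\lambda}{a}\right) & \cosh\!\left(\frac{\lambda}{a}\right) \end{pmatrix} \begin{pmatrix} a\sinh\!\left(\frac{\lambda}{a}\right) \\ a\cosh\!\left(\frac{\lambda}{a}\right) \end{pmatrix} = \begin{pmatrix} 0 \\ a \end{pmatrix}.$$

The particle always ends up on the $\bar{x}$-axis.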

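To show that the parameter $\lambda$ is the proper time, we show that

$$\frac{d\bar{t}}{d\lambda} = 1$$

for a MCRF and any $\lambda$. This is a bit subtle, because we want to hold the Lorentz transformation fixed (so hold $\lambda = \lambda_{MCRF}$ fixed), so that the MCRF is inertial. But we want to let $\lambda$ vary about $\lambda = \lambda_{MCRF}$ so we can take the derivative of $\bar{t}(\lambda)$ wrt $\lambda$. I’ve written out this dependence explicitly below:

$$\bar{t}(\lambda) = \cosh\!\left(\frac{\lambda_{MCRF}}{a}\right) a\sinh\!\left(\frac{\lambda}{a}\right) - \sinh\!\left(\frac{\lambda_{MCRF}}{a}\right) a\cosh\!\left(\frac{\lambda}{a}\right).$$

Now differentiate wrt $\lambda$, and evaluate at $\lambda = \lambda_{MCRF}$, giving

$$\frac{d\bar{t}}{d\lambda} = \cosh^2\!\left(\frac{\lambda}{a}\right) - \sinh^2\!\left(\frac{\lambda}{a}\right) = 1.$$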


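The 4-velocity is

$$\vec{U} \underset{O}{\to} \left(\cosh\!\left(\frac{\lambda}{a}\right),\ \sinh\!\left(\frac{\lambda}{a}\right),\ 0,\ 0\right).$$

The 4-acceleration is easy for this problem because we have the 4-velocity as a function of proper time!

$$\vec{a} \equiv \frac{d\vec{U}}{d\tau} = \frac{d\vec{U}}{d\lambda} \underset{O}{\to} \left(\frac{1}{a}\sinh\!\left(\frac{\lambda}{a}\right),\ \frac{1}{a}\cosh\!\left(\frac{\lambda}{a}\right),\ 0,\ 0\right).$$

We can check if it’s orthogonal to the 4-velocity, as it should be.

$$\vec{U}\cdot\vec{a} = -\frac{1}{a}\sinh\!\left(\frac{\lambda}{a}\right)\cosh\!\left(\frac{\lambda}{a}\right) + \frac{1}{a}\sinh\!\left(\frac{\lambda}{a}\right)\cosh\!\left(\frac{\lambda}{a}\right) = 0.$$

Is it uniformly accelerating?

$$\vec{a}\cdot\vec{a} = -\frac{1}{a^2}\sinh^2\!\left(\frac{\lambda}{a}\right) + \frac{1}{a^2}\cosh^2\!\left(\frac{\lambda}{a}\right) = \frac{1}{a^2}.$$

And $a$ was given as constant, and the acceleration is always pointing in the $x$-direction, so it is uniformly accelerating (see definition in problem 2.19).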
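A quick numerical check of these three scalar products, as a sketch only: it assumes the world line $t = a\sinh(\lambda/a)$, $x = a\cosh(\lambda/a)$ used above, and the helper names are hypothetical.

```python
import math

def minkowski_dot(A, B):
    """Scalar product with signature (-, +, +, +)."""
    return -A[0] * B[0] + A[1] * B[1] + A[2] * B[2] + A[3] * B[3]

def four_velocity(lam, a):
    """dx^alpha/dlambda for t = a sinh(lam/a), x = a cosh(lam/a)."""
    return (math.cosh(lam / a), math.sinh(lam / a), 0.0, 0.0)

def four_acceleration(lam, a):
    """Derivative of the 4-velocity with respect to lambda."""
    return (math.sinh(lam / a) / a, math.cosh(lam / a) / a, 0.0, 0.0)

a = 2.0
for lam in (0.0, 1.0, 3.0):
    U = four_velocity(lam, a)
    acc = four_acceleration(lam, a)
    # Each row should read (lam, -1, 0, 1/a^2) up to rounding error.
    print(lam, minkowski_dot(U, U), minkowski_dot(U, acc), minkowski_dot(acc, acc))
```

That $\vec{U}\cdot\vec{U} = -1$ at every $\lambda$ is another way of seeing that $\lambda$ is the proper time.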

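22 (a) Given 4-momentum, $\vec{p} \underset{O}{\to} (4, 1, 1, 0)$ kg. Find:

Energy in $O$: In general $\vec{p} \underset{O}{\to} (E, p^1, p^2, p^3)$, so $E = 4$ kg.

3-velocity in $O$: In general $m\vec{U} = \vec{p}$, where $m$ is the rest mass and $\vec{U}$ is the 4-velocity. And the 3-velocity is related to the 4-velocity as inferred in problem 2.15b, $U^\alpha = \Lambda(-\mathbf{v})^\alpha{}_{\bar 0}$. So $\vec{p} \underset{O}{\to} (m\gamma, mu\gamma, mv\gamma, mw\gamma)$, where $\mathbf{v} \underset{O}{\to} (u, v, w)$ are the components of the 3-velocity. Note that $E = m\gamma$, and simply dividing the spatial components by $E$ gives $\mathbf{v} \underset{O}{\to} (1/4, 1/4, 0)$.

Rest mass:

$$\gamma = \frac{1}{\sqrt{1 - \mathbf{v}\cdot\mathbf{v}}} = \frac{4}{\sqrt{14}},$$

from which it follows from $E = m\gamma = 4$ that $m = \sqrt{14}$ kg.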
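Part (a) amounts to three short formulas, sketched below in units with $c = 1$ and masses in kg. The function name `analyse_momentum` is an illustrative choice; the rest mass is computed from the invariant $m^2 = -\vec{p}\cdot\vec{p}$, which gives the same answer as $E/\gamma$.

```python
import math

def analyse_momentum(p):
    """Return (energy, 3-velocity, rest mass) for a timelike 4-momentum p."""
    E, px, py, pz = p
    velocity = (px / E, py / E, pz / E)            # v^i = p^i / E
    mass = math.sqrt(E * E - px * px - py * py - pz * pz)
    return E, velocity, mass

E, v, m = analyse_momentum((4.0, 1.0, 1.0, 0.0))
gamma = E / m
print(E, v, m, gamma)
```

Here `m` should come out as $\sqrt{14} \approx 3.74$ and `gamma` as $4/\sqrt{14} \approx 1.07$.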

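(b) We must apply the law of conservation of 4-momentum. The initial total is

$$\vec{p}_I = \vec{p}_1 + \vec{p}_2 \underset{O}{\to} (5, 0, 1, 0)\ \text{kg}.$$

By conservation of 4-momentum,

$$\vec{p}_F = \vec{p}_I = \vec{p}_3 + \vec{p}_4 + \vec{p}_5,$$

so

$$\vec{p}_5 = \vec{p}_I - \vec{p}_3 - \vec{p}_4 \underset{O}{\to} (3, -1/2, 1, 0)\ \text{kg}.$$

Now, as in part (a), we know the 4-momentum. From an analysis just like in (a), we find the 5th particle has, in this same reference frame, $E_5 = 3$ kg and $\mathbf{v}_5 \underset{O}{\to} (-1/6, 1/3, 0)$. Finally, the rest mass is $m_5 = \sqrt{31}/2$ kg.

The CM frame is found by finding the Lorentz transformation that transforms $\vec{p}_F$ to have only a time component,

$$\Lambda^{\bar\alpha}{}_{\beta}\, p_F^{\beta} \propto (\vec{e}_{\bar 0})^{\bar\alpha}.$$

Since $\vec{p}_F$ has spatial momentum only in the $y$-direction, the boost is along $y$, and the $\bar{y}$-component of this equation is

$$-v\gamma \cdot 5 + \gamma \cdot 1 = 0.$$

So the CM frame has 3-velocity $\mathbf{v} \underset{O}{\to} (0, 1/5, 0)$.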
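The numbers quoted for the fifth particle and the CM frame can be checked from the two 4-momenta stated above. This sketch uses only $\vec{p}_I$ and $\vec{p}_5$ as given in these notes; the helper names are hypothetical.

```python
import math

def rest_mass(p):
    """m = sqrt(-p.p) for a timelike 4-momentum (c = 1)."""
    E, px, py, pz = p
    return math.sqrt(E * E - px * px - py * py - pz * pz)

def velocity(p):
    """3-velocity carried by 4-momentum p: v^i = p^i / E."""
    E, px, py, pz = p
    return (px / E, py / E, pz / E)

p_total = (5.0, 0.0, 1.0, 0.0)    # p_I = p_F, in kg
p5 = (3.0, -0.5, 1.0, 0.0)        # fifth particle, in kg

v_cm = velocity(p_total)          # CM frame moves with the total momentum
g = 1.0 / math.sqrt(1.0 - v_cm[1] ** 2)
# y-component of the total momentum after boosting along y at speed v_cm[1]:
p_y_bar = -v_cm[1] * g * p_total[0] + g * p_total[2]

print(velocity(p5), rest_mass(p5))
print(v_cm, p_y_bar)
```

The last value printed is the left-hand side of the $\bar{y}$-equation above, which should vanish up to rounding error.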

23 Find the energy given the 3-velocity and rest mass.

First find the 4-momentum, $\vec{p} = m\vec{U} = m\gamma(1, u, v, w)$. And the energy is the time-part of the 4-momentum,

$$E = m\gamma.$$

We can find an approximate value of $\gamma$ from the binomial series, http://en.wikipedia.org/wiki/Binomial_series. This is just a Taylor series of $(1 + x)^\alpha$ about $x = 0$. Let $x = -\mathbf{v}\cdot\mathbf{v} = -v^2$, and $\alpha = -1/2$, so we obtain

$$\gamma = 1 + \frac{1}{2}v^2 + \frac{3}{8}v^4 + \ldots$$

So

$$E \approx m\left(1 + \frac{1}{2}v^2 + \frac{3}{8}v^4 + \ldots\right),$$

i.e. the rest mass, plus the classical kinetic energy, plus a correction of order $O(v^4)$. The correction is 1/2 the kinetic energy when

$$v = \sqrt{2/3}.$$
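A short numerical comparison of the exact $\gamma$ with its expansion, taking the fourth-order coefficient as $+3/8$ from the binomial series for $(1 - v^2)^{-1/2}$. The function names are illustrative only.

```python
import math

def gamma_exact(v):
    return 1.0 / math.sqrt(1.0 - v * v)

def gamma_series(v):
    """Binomial expansion of (1 - v^2)^(-1/2) through order v^4."""
    return 1.0 + 0.5 * v ** 2 + 0.375 * v ** 4

for v in (0.01, 0.1, 0.3):
    print(v, gamma_exact(v), gamma_series(v))

# Speed at which the v^4 term equals half the kinetic-energy term:
v_half = math.sqrt(2.0 / 3.0)
kinetic_term = 0.5 * v_half ** 2
correction_term = 0.375 * v_half ** 4
print(v_half, correction_term / kinetic_term)
```

At $v = \sqrt{2/3}$ the truncated series is of course no longer a good approximation to $\gamma$; the ratio only compares the two terms of the series with each other.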


24 Show that it’s impossible for a positron and an electron to annihilate and produce a single $\gamma$-ray.

Apparently particles come and go, but 4-momentum is conserved. Line up the coordinates such that the $x$-axis is aligned with the direction of propagation of the $\gamma$-ray. Then conservation of 4-momentum,

$$\vec{p}_{e^+} + \vec{p}_{e^-} = \vec{p}_\gamma,$$

gives two equations. The time part looks like conservation of energy,

$$p^0_{e^+} + p^0_{e^-} = p^0_\gamma,$$

while the spatial part looks like traditional conservation of momentum,

$$p^1_{e^+} + p^1_{e^-} = p^1_\gamma.$$

It’s important to realize that they are not independent, since in a reference frame wherein the electron and positron move with velocities $v_{e^-}$ and $v_{e^+}$, we have

$$m\left(\gamma(v_{e^+}) + \gamma(v_{e^-})\right) = h\nu \tag{2.25}$$

$$m\left(\gamma(v_{e^+})\,v_{e^+} + \gamma(v_{e^-})\,v_{e^-}\right) = h\nu, \tag{2.26}$$

where $m$ is the rest mass of the electron and positron and $\nu$ is the frequency of the $\gamma$-ray. Since $\gamma(v)\,v < \gamma(v)$ whenever $|v| < 1$, the left side of (2.26) is strictly smaller than that of (2.25) for any subluminal speeds. The only mathematical solution is then $v_{e^-} = v_{e^+} = 1$, which is physically impossible because of their non-zero rest mass. Nothing moves at the speed of light, except electromagnetic radiation and possibly gravitational waves if they exist.

It’s possible to produce two $\gamma$-rays. Suppose they travel in opposite directions with equal and opposite momentum in some frame of reference. Then the final total 3-momentum is zero. To satisfy momentum conservation we only require that the positron and electron have equal and opposite momentum in the same frame of reference, so $v_{e^+} = -v_{e^-}$ with arbitrary $v_{e^+}$, which can obviously be satisfied.

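25 Doppler shift.

(a) In frame $O$ the photon has 4-momentum

$$\vec{p} \underset{O}{\to} (h\nu,\ h\nu\cos\theta,\ h\nu\sin\theta,\ 0).$$

Transforming to the frame $\bar{O}$ moving at speed $v$ along the $x$-axis, we apply the Lorentz transformation

$$\Lambda(v) = \begin{pmatrix} \gamma(v) & -v\,\gamma(v) & 0 & 0 \\ -v\,\gamma(v) & \gamma(v) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

to obtain

$$\vec{p} \underset{\bar O}{\to} \begin{pmatrix} \gamma(v)\,h\nu - v\gamma(v)\,h\nu\cos\theta \\ -v\gamma(v)\,h\nu + \gamma(v)\,h\nu\cos\theta \\ h\nu\sin\theta \\ 0 \end{pmatrix}.$$

So the Doppler shift is obtained from the time component, i.e. the first component, and can be expressed as

$$\frac{\bar\nu}{\nu} = \gamma(v)\,(1 - v\cos\theta) = \frac{1 - v\cos\theta}{\sqrt{1 - v^2}},$$

as given.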

(b) No Doppler shift occurs when

$$\frac{\bar\nu}{\nu} = 1 = \frac{1 - v\cos\theta}{\sqrt{1 - v^2}},$$

or

$$\theta = \arccos\!\left(\frac{1 - \sqrt{1 - v^2}}{v}\right).$$

Extra questions: Does this have solutions? For $|v| \ll 1$ use the binomial series to see that $\theta \approx \pi/2$. What’s the maximum angle of no Doppler shift? As $v \to 1$, $\theta \to 0$. Show that at $v = 1/2$, $\theta \approx 74.5^\circ$.
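The extra questions are easy to explore numerically. The sketch below evaluates the zero-shift angle from the formula in (b) and feeds it back into the Doppler ratio of (a); `doppler_ratio` and `zero_shift_angle` are hypothetical names.

```python
import math

def doppler_ratio(v, theta):
    """Observed over emitted frequency: gamma(v) * (1 - v cos(theta))."""
    return (1.0 - v * math.cos(theta)) / math.sqrt(1.0 - v * v)

def zero_shift_angle(v):
    """Angle (radians) at which the ratio equals 1, for 0 < v < 1."""
    return math.acos((1.0 - math.sqrt(1.0 - v * v)) / v)

for v in (0.001, 0.5, 0.9, 0.999):
    theta = zero_shift_angle(v)
    # Columns: speed, angle in degrees, Doppler ratio at that angle.
    print(v, math.degrees(theta), doppler_ratio(v, theta))
```

The angle starts near $90^\circ$ for small $v$ and decreases monotonically towards zero as $v \to 1$, so the largest zero-shift angle is approached in the low-speed limit.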

(c) Eq. (2.35) of Schutz is the frame-invariant expression for the energy $\bar{E}$ relative to an observer moving with 4-velocity $\vec{U}_{obs}$,

$$-\vec{p}\cdot\vec{U}_{obs} = \bar{E},$$

and (2.38) was just $E = h\nu$. This calculation ends up being exactly the same as above, but allows one to focus on the relevant parts, i.e. just the time component. Since

$$\vec{U}_{obs} \underset{O}{\to} (\gamma(v),\ \gamma(v)v,\ 0,\ 0)$$

and recall

$$\vec{p} \underset{O}{\to} (h\nu,\ h\nu\cos\theta,\ h\nu\sin\theta,\ 0),$$

we can immediately find

$$\bar{E} = \gamma(v)\,h\nu - v\gamma(v)\,h\nu\cos\theta,$$

which was the time component of the $\vec{p} \underset{\bar O}{\to}$ found in (a) above.

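26 Energy required to accelerate an object with rest mass $m$ from $v$ to $v + \delta v$, to first order in $\delta v$.

$$E = m\gamma(v) = \frac{m}{\sqrt{1 - v^2}},$$

so the change in energy is just

$$\delta E = m\left(\gamma(v + \delta v) - \gamma(v)\right).$$

When $v \ll 1$ the problem is easy. Just differentiate $\gamma$ wrt $v$ to get the Taylor series approximation

$$\gamma(v + \delta v) - \gamma(v) = \gamma'(v)\,\delta v + \frac{1}{2}\gamma''(v)\,\delta v^2 + \ldots$$

where

$$\gamma' = \frac{d\gamma}{dv} = v\gamma^3 \tag{2.27}$$

$$\gamma'' = \gamma^3 + 3v\gamma^2\gamma'. \tag{2.28}$$

So

$$\gamma(v + \delta v) - \gamma(v) = v\gamma^3\,\delta v + O(\delta v^2).$$

And so the change in energy is

$$\delta E \approx m v \gamma^3\,\delta v.$$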
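To see how good the first-order estimate is, one can compare it with the exact energy difference. This is only a sketch in units with $c = 1$; the function names are illustrative.

```python
import math

def gamma(v):
    return 1.0 / math.sqrt(1.0 - v * v)

def delta_E_exact(m, v, dv):
    """Exact change in energy m*gamma when the speed goes from v to v + dv."""
    return m * (gamma(v + dv) - gamma(v))

def delta_E_first_order(m, v, dv):
    """First-order estimate m * v * gamma^3 * dv."""
    return m * v * gamma(v) ** 3 * dv

def relative_error(m, v, dv):
    exact = delta_E_exact(m, v, dv)
    return abs(delta_E_first_order(m, v, dv) - exact) / exact

for v in (0.1, 0.5, 0.9, 0.999):
    print(v, relative_error(1.0, v, 1e-4))
```

For a fixed $\delta v$ the relative error grows rapidly as $v$ approaches 1, which is the reason for checking the size of the second-order term.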


A subtlety arises when $v$ is not small. The coefficient $\gamma''$ becomes large relative to $\gamma'$, so ignoring the $O(\delta v^2)$ term becomes misleading. The author should have instructed us to check this. In particular,

$$\frac{\gamma''}{\gamma'} = \frac{1}{v} + 3v\gamma^2.$$

When $v \ll 1$ we can replace

$$\frac{1}{2}\gamma''\,\delta v^2 \approx \gamma'\,\delta v\left(\frac{\delta v}{2v}\right) \ll \gamma'(v)\,\delta v,$$

since we’re given that $\delta v / v \ll 1$. So we’re still justified in ignoring the 2nd term in the Taylor series. But when $v$ is not small we need another approach.

The above argument is not formally correct when $v$ is not small because the higher order terms in the Taylor series can no longer be ignored. Here is one approach.

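Write $v = 1 - \varepsilon$ where $0 < \varepsilon \ll 1$, so we’re close to the speed of light. Use $\varepsilon \ll 1$ and the binomial series to simplify $\gamma$,

$$\gamma(v) = \frac{1}{\sqrt{\varepsilon(2 - \varepsilon)}} \approx \frac{1}{\sqrt{2\varepsilon}}\left(1 + \frac{\varepsilon}{4}\right),$$

and

$$\gamma(v + \delta v) = \frac{1}{\sqrt{(\varepsilon - \delta\varepsilon)(2 - \varepsilon + \delta\varepsilon)}},$$

where $\delta\varepsilon = \delta v$, since increasing $v$ decreases $\varepsilon$. To simplify the latter we need to consider the case where $|\delta\varepsilon| \ll \varepsilon$. But this is not so restrictive. Then

$$\gamma(v + \delta v) = \frac{1}{\sqrt{(\varepsilon - \delta\varepsilon)(2 - \varepsilon + \delta\varepsilon)}} \approx \frac{1}{\sqrt{2\varepsilon}}\left(1 + \frac{\delta\varepsilon}{2\varepsilon}\right)\left(1 + \frac{\varepsilon - \delta\varepsilon}{4}\right).$$

To find the perturbation in energy we take the difference; to leading order in $\varepsilon$,

$$\gamma(v + \delta v) - \gamma(v) \approx \frac{1}{\sqrt{2\varepsilon}}\left(\frac{\delta\varepsilon}{2\varepsilon}\right).$$

It’s clear that as $\varepsilon \to 0$, so $v \to 1$, this difference grows without bound for a given $\delta\varepsilon/\varepsilon$.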

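A simpler and better solution: Write $v = 1 - \varepsilon$ where $0 < \varepsilon \ll 1$, so we’re close to the speed of light.

$$\gamma(v) = \frac{1}{\sqrt{\varepsilon(2 - \varepsilon)}}.$$

Now expand this in a Taylor series in $\varepsilon$, noting that increasing $v$ by $\delta v$ decreases $\varepsilon$ by $\delta\varepsilon = \delta v$:

$$\gamma(v + \delta v) - \gamma(v) = \frac{d\gamma}{d\varepsilon}(-\delta\varepsilon) + \frac{1}{2}\frac{d^2\gamma}{d\varepsilon^2}(-\delta\varepsilon)^2 + \ldots$$

And

$$\frac{d\gamma}{d\varepsilon} = \frac{-(1 - \varepsilon)}{(2\varepsilon)^{3/2}(1 - \varepsilon/2)^{3/2}} \approx \frac{-(1 - \varepsilon)\left(1 + \frac{3\varepsilon}{4}\right)}{(2\varepsilon)^{3/2}},$$

where the approximation exploits $0 < \varepsilon \ll 1$ with the binomial series approximation. It’s important to check the size of the 2nd derivative relative to the first. We find, again using the binomial series,

$$\frac{d^2\gamma}{d\varepsilon^2} \approx -\frac{3}{2\varepsilon}\,\frac{d\gamma}{d\varepsilon},$$

so we’re only justified in ignoring the 2nd term if $|\delta\varepsilon| \ll \varepsilon$. In this case, the change in energy is

$$\delta E \approx m\,\frac{1}{(2\varepsilon)^{3/2}}\,\delta v = m\gamma^3\,\delta v\,(1 + O(\varepsilon)).$$

This actually agrees with the result we would have obtained from using the simple Taylor series above.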

We’re asked to show that the energy becomes infinite when $v \to 1$. This is easily obtained by noting that $\gamma$ is finite for $0 \le v < 1$. However,

$$\lim_{v \to 1} \gamma(v) = \infty.$$

27 Increasing temperature increases the rest mass.

An object has rest mass $m(T_0) = 10$ kg. Its temperature is increased from $T_0$ to $T$ by a heat input $\delta Q = 100$ J. This must be reflected in an increase in rest mass, since in the MCRF of the object, $U^0 = 1$ and $mU^0 = p^0 = E$. So

$$m(T) = m(T_0) + \frac{\delta Q}{c^2} = 10\ \text{kg} + 1.1\times10^{-15}\ \text{kg}.$$

This problem is interesting to look at from a thermodynamics point of view. The heat flux increases the temperature and enthalpy of the object, which is reflected on a microscopic scale by an increase in the motion, relative to the centre of mass of the object, of the elements (atoms or molecules or sea of electrons depending on the material) composing the object. This motion increases the effective mass of the elements. Say an element has rest mass $m_i$, then when it has thermal speed $v_i$ it has “relativistic mass”

$$m_{i,rel} = m_i\,\gamma(v_i).$$

I found this website, which expands on these ideas: http://en.wikipedia.org/wiki/Massenergy_equivalence.

28 Boring.

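29

$$\frac{d}{d\tau}(\vec{U}\cdot\vec{U}) = \frac{d}{d\tau}\left(-(U^0)^2 + (U^1)^2 + (U^2)^2 + (U^3)^2\right) \tag{2.29}$$

$$= -2U^0\frac{dU^0}{d\tau} + 2U^i\frac{dU^i}{d\tau} \tag{2.30}$$

$$= 2\,\vec{U}\cdot\frac{d\vec{U}}{d\tau} \tag{2.31}$$

Q.E.D.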

30 Four-velocity of rocket ship,

$$\vec{U} \underset{O}{\to} (2, 1, 1, 1).$$

High-velocity cosmic ray with 4-momentum,

$$\vec{P} \underset{O}{\to} (300, 299, 0, 0)\times10^{-27}\ \text{kg}.$$

(a) Transform to the MCRF of the rocket ship. We know from Ex. 2.15 that for a general particle 3-velocity $\mathbf{v} = (u, v, w)$, the 4-velocity is

$$\vec{A} \underset{O}{\to} \gamma(|\mathbf{v}|)\,(1,\ u,\ v,\ w) \tag{2.32}$$

$$= (\gamma(|\mathbf{v}|),\ u\gamma(|\mathbf{v}|),\ v\gamma(|\mathbf{v}|),\ w\gamma(|\mathbf{v}|)), \tag{2.33}$$

where

$$|\mathbf{v}| = \sqrt{u^2 + v^2 + w^2}.$$

Inspection of $\vec{U}$ reveals that

$$\gamma = 2, \quad u = 1/2, \quad v = 1/2, \quad w = 1/2,$$

and $|\mathbf{v}| = \sqrt{u^2 + v^2 + w^2} = \sqrt{3}/2$. Now we need the Lorentz transformation for a reference frame moving with a 3-velocity with more than one non-zero component. Up to this point we haven’t learned this, and I’m a bit surprised Schutz has thrown this at us now. To lead one through the steps to construct a general Lorentz transformation, I’ve created supplementary problem R.1 in section 2.10. Here we note that we actually only need the first row of the Lorentz transformation matrix, since we only require $P^{\bar 0} = \bar{E}$. This first row must be such that it transforms $\vec{U} \underset{\bar O}{\to} (1, 0, 0, 0)$. Thus it must be related to the components of $\vec{U}$ as follows:

$$\Lambda^{\bar 0}{}_{0} = U^0, \qquad \Lambda^{\bar 0}{}_{i} = -U^i.$$

Applying $\Lambda^{\bar 0}{}_{\alpha}$ to the given $\vec{P}$ gives $\bar{E} = 301\times10^{-27}$ kg in the rocket ship frame.

(b)

$$-\vec{P}\cdot\vec{U}_{obs} = E_{obs} = -\left(-300\cdot2 + 299\cdot1 + 0\cdot1 + 0\cdot1\right)\times10^{-27}\ \text{kg} = 301\times10^{-27}\ \text{kg}.$$

(c) Of course (b) was faster. The same computations were performed to get the answer, but in (b) we only did the necessary computations.
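Both routes to the energy can be written out in a few lines. In this sketch the momentum is kept in units of $10^{-27}$ kg so the arithmetic stays exact; the variable names are illustrative.

```python
def minkowski_dot(A, B):
    """Scalar product with signature (-, +, +, +)."""
    return -A[0] * B[0] + A[1] * B[1] + A[2] * B[2] + A[3] * B[3]

U = (2.0, 1.0, 1.0, 1.0)        # rocket 4-velocity in O
P = (300.0, 299.0, 0.0, 0.0)    # cosmic-ray 4-momentum in O, units of 1e-27 kg

# (a) First row of the Lorentz matrix to the rocket's MCRF: (U0, -U1, -U2, -U3).
first_row = (U[0], -U[1], -U[2], -U[3])
E_via_transform = sum(row * p for row, p in zip(first_row, P))

# (b) Frame-invariant expression E = -P.U_obs.
E_via_invariant = -minkowski_dot(P, U)

print(E_via_transform, E_via_invariant)   # both in units of 1e-27 kg
```

The two expressions are literally the same sum, which is the point made in (c).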

31 Photon reflects off mirror without changing frequency ν. Angle ofincidence is θ.


This appears to be a straightforward application of conservation of 4-momentum, but it is fun because it gets us thinking about all 4 components.

Let the mirror lie in the $y$-$z$ plane, with the photon travelling initially in the $x$-$y$ plane, at angle $\theta$ to the $x$-axis. Then the initial 4-momentum of the photon is written

$$\vec{P}_i = (h\nu,\ \cos(\theta)\,h\nu,\ \sin(\theta)\,h\nu,\ 0).$$

First let’s construct the 4-momentum of the reflected photon $\vec{P}_r$. Since the photon frequency doesn’t change, we know instantly the time component,

$$P^0_r = P^0_i = h\nu.$$

For a smooth mirror we assume that the momentum transferred is only in the $x$-direction. So then we can also construct the components,

$$P^2_r = P^2_i = \sin(\theta)\,h\nu, \qquad P^3_r = P^3_i = 0.$$

Recall from Eq. (2.37) that the 4-momentum of a photon is orthogonal to itself. This alone gives us two possibilities, $P^1_r = \pm\cos(\theta)\,h\nu$. For the reflected photon, we choose the minus sign. In summary,

$$\vec{P}_r = (h\nu,\ -h\nu\cos(\theta),\ h\nu\sin(\theta),\ 0).$$

By conservation of 4-momentum, we see that the momentum transferred to the mirror must be $\Delta P^1_m = 2h\nu\cos(\theta)$ in the $x$-direction. How did the mirror acquire $x$-direction momentum without gaining energy? See Supplementary Problem R.2 in section 2.10.

If the photon is absorbed, then the 4-momentum transferred to the mirror has three nonzero components,

$$\Delta\vec{P}_m = (\Delta E_m,\ \Delta P^1_m,\ \Delta P^2_m,\ 0) = (h\nu,\ h\nu\cos(\theta),\ h\nu\sin(\theta),\ 0).$$

How did the mirror acquire the extra energy $\Delta E_m = h\nu$? See Supplementary Problem R.2 in section 2.10.

32 Derive the Compton scattering relationship Eq. 2.43.

Initially the total 4-momentum in the particle’s initial rest frame $O$ is

$$\vec{P} \underset{O}{\to} (h\nu_i,\ h\nu_i,\ 0,\ 0) + (m,\ 0,\ 0,\ 0).$$

After the scattering event,

$$\vec{P} \underset{O}{\to} (h\nu_f,\ h\nu_f\cos(\theta),\ h\nu_f\sin(\theta),\ 0) + m\,(\gamma,\ v\gamma\cos(\phi),\ v\gamma\sin(\phi),\ 0),$$

where $v$ and $\phi$ are the speed and the angle of the particle’s scattered trajectory in the $x$-$y$ plane relative to the initial direction of the incident photon. Equating the three nonzero components of 4-momentum gives 3 equations for the 3 unknowns $\nu_f$, $v$, $\phi$. In principle one can then eliminate $v$ and $\phi$ to obtain $\nu_f$ as a function of $\theta$, but I found it too tedious to do so.
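Although the algebra is tedious, the result is easy to verify numerically. The sketch below assumes the standard Compton relation in units with $c = 1$, $1/(h\nu_f) = 1/(h\nu_i) + (1 - \cos\theta)/m$, which is how Eq. 2.43 is used in problem 33. It takes the particle's recoil momentum from momentum conservation and checks that energy is then conserved as well. All names are illustrative.

```python
import math

def scattered_photon_energy(E_i, m, theta):
    """Photon energy after scattering through angle theta off a particle at rest."""
    return 1.0 / (1.0 / E_i + (1.0 - math.cos(theta)) / m)

def energy_residual(E_i, m, theta):
    """Initial minus final total energy, with the recoil fixed by momentum conservation."""
    E_f = scattered_photon_energy(E_i, m, theta)
    px = E_i - E_f * math.cos(theta)      # particle momentum, x component
    py = -E_f * math.sin(theta)           # particle momentum, y component
    E_particle = math.sqrt(m * m + px * px + py * py)
    return (E_i + m) - (E_f + E_particle)

for theta in (0.0, 0.5, math.pi / 2.0, math.pi):
    print(theta, scattered_photon_energy(1.0, 2.0, theta), energy_residual(1.0, 2.0, theta))
```

A residual of zero (to rounding error) at every angle confirms that the three component equations are mutually consistent with the relation.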

33 Compton scattering of a cosmic microwave background radiation photon off a cosmic ray (high-energy proton). What’s the max frequency of the scattered photon?

Very nice problem. At first it appears very challenging, but the extreme difference in energy between the two particles simplifies things.

First we note that in the rest frame of the particle, Compton scattering only reduces the frequency, and more so for less massive particles (see also supplementary problem R.2 below). So how can Compton scattering increase the energy of the photon? The increase in energy is revealed via the Doppler shift.

The key simplification in this problem is that the Compton scattering in the frame of the particle has very little effect on frequency,

$$\frac{1}{h\nu_i} = 5000\ \text{eV}^{-1} \gg \frac{1}{m_p} = 10^{-9}\ \text{eV}^{-1}.$$

So the angle of the Compton scattering has very little effect on the final frequency in the particle’s initial rest frame. So in considering the effect of the angle, we need only consider its effect on the Doppler shift.

Now the problem is easy. The Doppler shift in frequency is given in general by Eq. 2.42. Obviously to maximize the frequency in the cosmic ray frame, $\bar\nu_i$, we want the photon and cosmic ray traveling in a line in opposite directions, i.e. $\theta = \pi$ radians, for which Eq. 2.42 gives

$$h\bar\nu_i = h\nu_i\,\frac{1 + v}{\sqrt{1 - v^2}} \approx h\nu_i\,\frac{2}{\sqrt{1 - v^2}} = h\nu_i \times 2\times10^{9} = 4\times10^{5}\ \text{eV}.$$

The Doppler shift has made a tremendous increase in frequency! The Compton scattering will make very little difference, so to maximize the scattered frequency in the Sun’s frame, choose the Compton scattering angle to maximize the Doppler shift. That is, choose the scattering angle to be $\pi$. Eq. 2.43 gives

$$\frac{1}{h\bar\nu_f} = \frac{1}{h\bar\nu_i} + \frac{2}{m_p} = 0.25\times10^{-5} + 2\times10^{-9} \approx 0.25\times10^{-5}\ \text{eV}^{-1}.$$

Compton scattering caused a negligible decrease in energy in the proton’s frame. The proton, like the mirror in problem 31, is massive enough to cause little change in frequency of the photon in the proton’s frame. See also Supplementary problem R.2. Now Lorentz transform back to the Sun’s frame. The photon again gains tremendously from the Doppler shift (that’s why we chose the scattering angle to be complete reflection),

$$h\nu_f \approx h\bar\nu_f \times 2\times10^{9} \approx 8\times10^{14}\ \text{eV}.$$

This is a very hard $\gamma$-ray. A pair of 511 keV photons arising from annihilation of an electron and positron are considered to be $\gamma$-rays. This is more than a billion times more energetic.
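The arithmetic chain above is short enough to script. The inputs are the round values used in these notes ($h\nu_i = 2\times10^{-4}$ eV, $m_p \approx 10^{9}$ eV, Doppler factor $2\times10^{9}$); the variable names are illustrative.

```python
photon_energy_sun = 2.0e-4     # eV, incident CMB photon (1 / 5000 eV^-1)
proton_mass = 1.0e9            # eV, round value used in the notes
doppler_factor = 2.0e9         # approx 2 / sqrt(1 - v^2) for a head-on encounter

# Step 1: Doppler shift into the proton's rest frame.
incident_proton_frame = photon_energy_sun * doppler_factor

# Step 2: back-scattering (angle pi) in the proton frame, 1/E_f = 1/E_i + 2/m.
scattered_proton_frame = 1.0 / (1.0 / incident_proton_frame + 2.0 / proton_mass)

# Step 3: Doppler shift back to the Sun's frame.
scattered_sun = scattered_proton_frame * doppler_factor

fractional_compton_loss = 1.0 - scattered_proton_frame / incident_proton_frame
print(incident_proton_frame, scattered_proton_frame, scattered_sun)
print(fractional_compton_loss, scattered_sun / 511.0e3)
```

The fractional loss in step 2 is below one part in a thousand, which is why the scattering angle matters only through the two Doppler shifts.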

34 These are quite trivial. For example, expand out the dot product in terms of components using the definition in Eq. 2.26, and use the linearity property given by Eq. 2.8,

$$(\alpha\vec{A})\cdot\vec{B} = -\alpha A^0B^0 + \alpha A^1B^1 + \alpha A^2B^2 + \alpha A^3B^3$$

$$= \alpha\left(-A^0B^0 + A^1B^1 + A^2B^2 + A^3B^3\right)$$

$$= \alpha\,(\vec{A}\cdot\vec{B}). \tag{2.34}$$

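35 Show that the $\vec{e}_{\bar\beta}$ obtained from Eq. 2.15,

$$\vec{e}_{\bar\mu} = \Lambda^{\nu}{}_{\bar\mu}(-\mathbf{v})\,\vec{e}_\nu,$$

obey

$$\vec{e}_{\bar\alpha}\cdot\vec{e}_{\bar\beta} = \eta_{\bar\alpha\bar\beta}.$$

We have

$$\vec{e}_{\bar\alpha}\cdot\vec{e}_{\bar\beta} = \Lambda^{\nu}{}_{\bar\alpha}(-\mathbf{v})\,\vec{e}_\nu\cdot\Lambda^{\mu}{}_{\bar\beta}(-\mathbf{v})\,\vec{e}_\mu$$

$$= \Lambda^{\nu}{}_{\bar\alpha}\,\Lambda^{\mu}{}_{\bar\beta}\,\vec{e}_\nu\cdot\vec{e}_\mu$$

$$= \Lambda^{\nu}{}_{\bar\alpha}\,\Lambda^{\mu}{}_{\bar\beta}\,\eta_{\nu\mu}.$$

The LHS is a vector expression, and it shouldn’t depend upon the orientation of the coordinate axes. So let’s rotate the axes so that $\mathbf{v}$ is oriented along the $x$-axis. Then

$$\Lambda(v) = \begin{pmatrix} \gamma & -v\gamma & 0 & 0 \\ -v\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

Note that $\Lambda$ is symmetric so we can interchange indices on one without effect,

$$\vec{e}_{\bar\alpha}\cdot\vec{e}_{\bar\beta} = \Lambda^{\bar\beta}{}_{\mu}\,\Lambda^{\nu}{}_{\bar\alpha}\,\eta_{\nu\mu}.$$

For given $\alpha = \beta$, the RHS looks like the product of a row of $\Lambda^{\bar\beta}{}_{\mu}$ times a column of $\Lambda^{\nu}{}_{\bar\alpha}$. It’s easy to see that the result is $-1$ for $\alpha = \beta = 0$ and $+1$ for $\alpha = \beta > 0$. When $\alpha \ne \beta$, the RHS $= 0$. Q.E.D.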
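The "easy to see" step can also be checked by brute force: form $\Lambda^{\nu}{}_{\bar\alpha}\Lambda^{\mu}{}_{\bar\beta}\eta_{\nu\mu}$ for a boost along $x$ and compare with $\eta$. This is a sketch with hypothetical helper names, using plain nested lists for the matrices.

```python
import math

ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]

def boost_x(v):
    """Lorentz matrix Lambda(v) for a boost along x (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -v * g, 0.0, 0.0],
            [-v * g, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def basis_dot_products(v):
    """Matrix of e_alpha-bar . e_beta-bar, where e_alpha-bar has O-components L[nu][alpha]."""
    L = boost_x(-v)
    return [[sum(L[nu][a] * ETA[nu][mu] * L[mu][b]
                 for nu in range(4) for mu in range(4))
             for b in range(4)]
            for a in range(4)]

for row in basis_dot_products(0.6):
    print(row)
```

Up to rounding error, the printed matrix should reproduce $\eta = \mathrm{diag}(-1, 1, 1, 1)$ for any $|v| < 1$.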

2.10 Rob’s supplemental problems

R.1 Suppose the 4-velocity of a rocket ship is $\vec{U} \underset{O}{\to} (2, 1, \sqrt{2}, 0)$ in some reference frame $O$.

(a) Show that the given $\vec{U}$ is a legitimate 4-velocity. Show that $\vec{V} \underset{O}{\to} (2, 1, 1, 0)$ is not possible.

(b) Find the 3-velocity in $O$. Hint: see Ex. 2.15. (You’ll need this for (c).)

(c) Find the matrix that rotates the spatial coordinates such that the 3-velocity has only one non-zero component, in say the $x$-direction. What’s the matrix that rotates the 4-velocity to have only one nonzero spatial component?

(d) Find the inverse rotation matrices for the above. Hint: Think physically and check mathematically, i.e. $R_4^{-1}R_4 = I$.

(e) Find the Lorentz transformation from $O$ to the MCRF of the rocket ship. Confirm that it has the correct effect applied to $\vec{U}$ itself. Hint: The problem here is that we have so far only seen the Lorentz transformation when the 3-velocity has only one non-zero component. Use your rotation matrix from above and its inverse.

Solution:


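(a)

$$\vec{U}\cdot\vec{U} = -2^2 + 1^2 + (\sqrt{2})^2 = -1,$$

which is consistent with Eq. (2.28). On the other hand,

$$\vec{V}\cdot\vec{V} = -2^2 + 1^2 + 1^2 = -2,$$

which is inconsistent with Eq. (2.28).

(b) See solution to Ex. 2.15:

$$\mathbf{v} \underset{O}{\to} (1/2,\ \sqrt{2}/2,\ 0).$$

(c) Rotating anticlockwise through angle $\theta = \arccos(1/\sqrt{3})$ aligns the $x$-axis with the 3-velocity. This is accomplished with the matrix $R_3$,

$$R_3 = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

For the 4-velocity,

$$R_4 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & \sin\theta & 0 \\ 0 & -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

(d) To find the inverse of the rotation matrix just change the sign of the angle!

$$R_4^{-1} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta & 0 \\ 0 & \sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

(e) The Lorentz transformation for the case $\Lambda(u, v, 0)$ can be built from the above tools. Consider transforming a vector $\vec{U}$,

$$\bar{U} = \Lambda U = R_4^{-1}\,\Lambda'(u', 0, 0)\,R_4\,U,$$

where $u' = |\mathbf{v}|$ and

$$\Lambda'(u', 0, 0) = \begin{pmatrix} \gamma(u') & -u'\gamma(u') & 0 & 0 \\ -u'\gamma(u') & \gamma(u') & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

So this defines the desired Lorentz transformation $\Lambda(u, v, 0)$,

$$\Lambda(u, v, 0) = \begin{pmatrix} \gamma & -u\gamma & -v\gamma & 0 \\ -u\gamma & \gamma\cos^2\theta + \sin^2\theta & (\gamma - 1)\cos\theta\sin\theta & 0 \\ -v\gamma & (\gamma - 1)\cos\theta\sin\theta & \gamma\sin^2\theta + \cos^2\theta & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \tag{2.35}$$

where $\gamma = \gamma(|\mathbf{v}|)$, $|\mathbf{v}| = \sqrt{u^2 + v^2}$ and $\theta = \arctan(v/u)$. It’s straightforward, albeit a bit tedious, to show that

$$\Lambda(u, v, 0)\,\vec{U} = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}.$$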
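The tedious check is quick to do numerically. The sketch below builds $\Lambda(u, v, 0)$ as the product rotate, boost along $x$, rotate back, and applies it to the rocket's 4-velocity. Helper names such as `rotation_xy` and `boost_xy` are hypothetical.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def apply(M, V):
    return [sum(M[i][j] * V[j] for j in range(4)) for i in range(4)]

def rotation_xy(theta):
    """R4: rotates the spatial axes through theta about z, leaving time alone."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, c, s, 0.0],
            [0.0, -s, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def boost_x(speed):
    g = 1.0 / math.sqrt(1.0 - speed * speed)
    return [[g, -speed * g, 0.0, 0.0],
            [-speed * g, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def boost_xy(u, v):
    """Lambda(u, v, 0) = R4^{-1} Lambda'(|v|, 0, 0) R4."""
    speed = math.hypot(u, v)
    theta = math.atan2(v, u)
    return matmul(rotation_xy(-theta), matmul(boost_x(speed), rotation_xy(theta)))

U = [2.0, 1.0, math.sqrt(2.0), 0.0]
u, v = U[1] / U[0], U[2] / U[0]      # 3-velocity components from part (b)
L = boost_xy(u, v)
U_bar = apply(L, U)
print(U_bar)
```

The entries of `L` can also be compared against the closed form (2.35); for this example $\gamma = 2$, $\cos^2\theta = 1/3$ and $\sin^2\theta = 2/3$.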

R.2 (a) How did the mirror in problem 2.31 acquire $x$-direction momentum without acquiring energy when the photon was reflected?

(b) How did it acquire the energy when the photon was absorbed?

Solution:

(a) The change in 4-momentum is related to the change in 4-velocity of a massive object,

$$\Delta\vec{P}_m = m\,\Delta\vec{U} = m\,(\Delta\gamma,\ \Delta(u\gamma),\ 0,\ 0) = m\,(\gamma - 1,\ u\gamma,\ 0,\ 0),$$

where the 2nd equality assumes the mirror is initially at rest. Thus the ratio of energy gained to momentum gained is

$$\frac{\Delta P^0_m}{\Delta P^1_m} = \frac{\Delta E_m}{\Delta(mU^1)} = \frac{1}{u}\left(1 - \sqrt{1 - u^2}\right) \approx \frac{u}{2}.$$

The approximation applies in the limit $u \ll 1$ using the binomial series. So the change in energy can be arbitrarily small for a given change in momentum if the change in velocity is correspondingly small. This corresponds to the intuition that a more massive mirror would rebound less for a given momentum transfer. I suspect the imposition of “reflection without change in frequency” is an idealization applicable for massive “mirrors”. Indeed the next problem, 2.32, covers Compton scattering, wherein a photon reflects off a particle of mass $m$. In Eq. 2.43 we see that for

$$\frac{m}{h} \gg \nu_i,$$

where $\nu_i$ is the incident frequency of the photon, the reflected frequency $\nu_f \approx \nu_i$.

(b) For a massive mirror, the energy must have become mostly thermal energy. For a less massive mirror, more of the energy would go into the translational kinetic energy of the rebound.


Bibliography

Misner, C. W., K. S. Thorne, and J. A. Wheeler, 1973: Gravitation. W. H. Freeman and Company. 1279 + XXVI pp.

Schutz, B., 2009: A first course in General Relativity. Cambridge University Press. 2nd ed., 393 + XV pp.
