
A New Perspective on Relativity

Jan 18, 2016


New Jersey • London • Singapore • Beijing • Shanghai • Hong Kong • Taipei • Chennai

World Scientific

Bernard H Lavenda
Universita’ degli Studi di Camerino, Italy


British Library Cataloguing-in-Publication Data

A catalogue record for this book is available from the British Library.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

ISBN-13 978-981-4340-48-9

ISBN-10 981-4340-48-0

Typeset by Stallion Press

Email: [email protected]

All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.

Copyright © 2012 by World Scientific Publishing Co. Pte. Ltd.

Published by

World Scientific Publishing Co. Pte. Ltd.

5 Toh Tuck Link, Singapore 596224

USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601

UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

Printed in Singapore.

A NEW PERSPECTIVE ON RELATIVITY

An Odyssey in Non-Euclidean Geometries


Aug. 26, 2011 11:17 SPI-B1197 A New Perspective on Relativity b1197-fm

In memory of Franco Fraschetti (1924–2009)


Preface

Electrodynamics was the next oasis after thermodynamics which saw a confluence of physicists and mathematicians, many of whom had been protagonists in thermodynamics. Just as thermodynamics had an offspring, quantum theory, so too did electrodynamics, namely the theory of relativity. While a single name can be attached to the origins of thermodynamics, Sadi Carnot, and that of its offspring, Max Planck, no such simplicity exists in electrodynamics and relativity.

Relativity is as much about physics as it is about the human beings, and their failings, that made it. Every physics student will have heard of Maxwell’s equations, but will he have also heard of Weber’s force? The student may have heard of Weber and Gauss for the units named after them, but not about their championing of Ampère’s law, which threatened the supremacy of Newton’s inverse square law. The names of Helmholtz, Clausius and Boltzmann may be familiar from thermodynamics and statistical thermodynamics, but they are much less known for their theories of electromagnetism. Every student of mathematics will have heard the names of Gauss and Riemann, but will he also know of their fundamental contributions to electromagnetism? Who were Abraham, Heaviside, Larmor, Liénard, Lorenz, Ritz, Schwarzschild, and Voigt? Why have their names been struck from the annals of electromagnetism?

We are familiar with the priority disputes between Kelvin and Clausius in thermodynamics, but not with those in electromagnetism and relativity. A student of physics may have heard the name of Lorentz, because of his law of force and transformation, but not at the same level as Einstein. And Poincaré is known for just about everything other than his principle of relativity. The history of electromagnetism and relativity has been rewritten, and in a very unflattering way.

By the modern historical account of electromagnetism and relativity, there were winners and losers. Maxwell is said to have triumphed over Weber and Gauss in formulating a field theory of electromagnetism, and over Lorenz and Riemann in the formulation of his displacement current; Einstein’s absolute speed of light prevailed over Ritz’s ballistic theory of emission; Lorentz prevailed over Abraham and Bucherer in devising a model of the electron whose expressions for the variation of mass, momentum, and energy with velocity were later to be adopted in toto by relativity as a model for all matter, whether charged or not; and Einstein was given seniority in stating the principles of relativity, though they were previously enunciated by Poincaré.

Why were the experimentalists, Ives and Essen, so vehemently opposed to relativity? Ives viewed his verification of the second-order Doppler shift as a clear demonstration that a moving clock runs slow by the same factor that was predicted by Larmor and Lorentz, and not as a vindication of time dilatation in special relativity. Essen, who built the first cesium clock, queried what happens to the lost ticks when more ticks are transmitted than are received, independent of whether two clocks are approaching or receding from one another. Essen went so far as to ask whether relativity was a “joke or swindle?”

Most if not all monographs on relativity do not touch on these questions. Not so with O’Rahilly’s Electromagnetics, written in 1938. Not everyone will agree with his dispraise of Maxwell’s displacement current, or his overappraisal of Ritz, but much of what he says could not be truer today:

There is far more authoritarianism in science than physicists are aware or at least publicly acknowledge. Anybody with a scientific reputation would today hesitate to criticize Einstein, except by way of outdoing him in cosmological speculations.

Essen expressed similar views:

Students are told that the theory (relativity) must be accepted although they cannot be expected to understand it. . . The theory is so rigidly held that young scientists who have any regard for their careers dare not openly express their doubts.

Whether there is any truth in the allegations I will leave to the reader. But what I plan to do is to present relativity from a ‘new’ point of view that treats relativistic phenomena, known and unknown, from different perspectives. I put ‘new’ in quotation marks because the approach is really not new, but was suggested by Kaluza and Varicak over a century ago. What is ‘new,’ I believe, is the wealth of physical phenomena that can be drawn from the non-Euclidean geometrical perspective. This monograph is intended neither as a historical account of relativity nor as an essay in constructive criticism of it.

A recurring theme is that motion causes deformity and this can, under certain circumstances, catapult us into non-Euclidean spaces. It was also an exciting exercise to see where non-Euclidean geometries could be found, but were not appreciated as such. There are at least two eye-catching relations: the product of two longitudinal Doppler shifts is the square root of the cross-ratio, whose logarithm is hyperbolic distance, and the Beltrami metric in polar coordinates is the exact expression for the metric for the uniformly rotating disc. Gravitational phenomena, rather than being a manifestation of warped space-time, can be accounted for by a varying index of refraction in an inhomogeneous medium that modifies Fermat’s principle of least time. The reader will find old and new things alike, but the ‘old’ with a new interpretation. I don’t expect that everything is true to 100 percent; some things will have to be changed, modified or clarified. But I do believe that this is a very fruitful approach that has led to the questioning of many fundamental aspects of relativity.

According to Riemann, physics is the search for a geometric manifold upon which physical processes occur. The line element of constant curvature,

\[
\frac{1}{1 + \frac{1}{4}\alpha \sum x^2}\,\sqrt{\sum dx^2}, \tag{R}
\]

appearing in his Habilitation Dissertation, when written in polar coordinates is precisely the metric for a uniformly rotating disc with constant negative curvature, α < 0. When charge is added, it becomes the Liénard expression for the rate of energy loss due to radiation.
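As a brief worked step, assume the two-dimensional case with plane polar coordinates $x_1 = r\cos\varphi$, $x_2 = r\sin\varphi$ (the symbols $r$ and $\varphi$ are introduced here only for illustration and do not appear in the text above). The sums in (R) then reduce to $\sum x^2 = r^2$ and $\sum dx^2 = dr^2 + r^2\,d\varphi^2$, so that (R) would read:

```latex
\[
\frac{\sqrt{dr^2 + r^2\,d\varphi^2}}{1 + \tfrac{1}{4}\alpha r^2},
\qquad\text{or, squared,}\qquad
\frac{dr^2 + r^2\,d\varphi^2}{\left(1 + \tfrac{1}{4}\alpha r^2\right)^2}.
\]
```

For α < 0 the denominator vanishes at $r = 2/\sqrt{-\alpha}$, which bounds the region on which this form of the expression is defined.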

The role of the longitudinal Doppler shift means that space and time do not appear separately but only in a ratio, as a homogeneous coordinate. It is the difference in longitudinal Doppler shifts that is responsible for the slowing down of clocks in relative motion. Einstein elevated c, the velocity of light in vacuo, to a universal constant. The fact that c is a constant, even to observers in relative motion, is tantamount to making it a unit of measurement, one which is necessary for the existence of a non-Euclidean geometry. So raising c to a universal constant, as Essen has pointed out, meant that the definition of unit length or time, or both, had to be abandoned. Relativity will thus unfold in a hyperbolic space of velocities that is entirely consonant with the relativistic addition of velocities.
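The link between a hyperbolic velocity space and the relativistic addition of velocities can be checked numerically. The following is a minimal sketch, not taken from the book: it assumes units in which c = 1, collinear velocities, and the standard special-relativistic expressions for velocity composition and the longitudinal Doppler factor. The function names are hypothetical. It illustrates that the hyperbolic measure of velocity (the inverse hyperbolic tangent of v/c) adds under composition, while the Doppler factors multiply.

```python
import math

# Assumption: units with c = 1 and collinear velocities; all names are illustrative.

def add_velocities(u, v, c=1.0):
    """Relativistic composition of two collinear velocities."""
    return (u + v) / (1.0 + u * v / c**2)

def rapidity(v, c=1.0):
    """Hyperbolic measure of velocity: artanh(v/c)."""
    return math.atanh(v / c)

def doppler_factor(v, c=1.0):
    """Longitudinal Doppler factor sqrt((1 + v/c) / (1 - v/c))."""
    b = v / c
    return math.sqrt((1.0 + b) / (1.0 - b))

u, v = 0.6, 0.7
w = add_velocities(u, v)

print("composed velocity:", w)
print("sum of rapidities:", rapidity(u) + rapidity(v), "vs", rapidity(w))
print("product of Doppler factors:", doppler_factor(u) * doppler_factor(v),
      "vs", doppler_factor(w))
```

In this sketch the logarithm of the Doppler factor equals the rapidity, so multiplying Doppler factors is the same operation as adding hyperbolic distances along a line in velocity space.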

Following the historical route spreads the honors of discovery of relativity more evenly. Poincaré had arrived at the postulates of relativity at least five years before Einstein, but “because he did not fully appreciate the status of both postulates” is no argument to deny him credit.

To deny Poincaré his primary role in developing the theory of relativity because he held onto the aether concept is to deny Carnot the credit for discovering his principle because he still believed in caloric theory. It would never have crossed my mind to claim that I have detracted any credit from Boltzmann because his principle is incomplete, dealing with only part of a probability distribution (a very large number instead of a proper fraction), whereas I have shown that the entropy is the potential of a law of error for which the most probable value is the average value of the measurements. And which average is considered most probable will determine the form of the entropy.

What is incomprehensible is Poincaré’s need to ‘adjust’ the laws of physics so as to preserve Euclidean geometry, and Einstein’s later concurrence with him. Was Euclidean geometry superior to the non-Euclidean geometries, to which Poincaré made so many outstanding contributions? Why couldn’t Poincaré connect his fractional linear transformations, which preserve certain geometric properties and define a new concept of length in hyperbolic geometry, with the Lorentz transformations on which he did so much work?

Historians of science make much ado over the tortuous path that Einstein followed to arrive at his field equations of general relativity, taking for granted that they are the final solution to the gravitational problem. Little progress has been made since Einstein wrote down his equations almost a century ago, and what the general theory proposes has still to be corroborated by observation. Singularities, black holes, and gravitational waves have, as yet, to be confirmed. Why time warps and what constitutes emptiness are left to be explained. How can a gravitational field exist in the absence of matter and all other physical fields?

Young Maxwell gave a very interesting example of an optical instrument, for which the optical length of any curve in the object space is equal to that of its image, by expressing Fermat’s principle for the extremal path of a ray in terms of a varying index of refraction and a flat metric. The varying index of refraction had the exact same form as the coefficient in (R) for positive curvature (α > 0). This gave me the idea that an optico-gravitational approach might prove useful in which a non-constant index of refraction would mimic a varying gravitational field while the flat metric would include the centrifugal potential. That gravitational and centrifugal forces appeared in different parts led me to question the equivalence principle whereby a gravitational field can be annulled by acceleration.

The distortions that we observe due to motion are the result of our Euclidean rulers and clocks. Inhabitants of hyperbolic space would see no changes in the measuring devices since they change along with them. All gedanken experiments using local observers would lead to null results. Paradoxes exist because the phenomena which give rise to them are not understood. Everyone would agree that emission theories are dead, but to say that the velocity on the outward journey is c + v, and the velocity on the return journey is c − v, where v is the velocity of the aether in the Michelson–Morley experiment, is truly contradictory. To explain the null result a contraction hypothesis in the direction of the motion was assumed, yet the only contraction that arises from the Doppler shift is a second-order one in the direction normal to the motion.

The journey has been a long one for me. Along the way I have gotten to know a lot of people through their writings. I can feel the eccentricity and biting sarcasm of Oliver Heaviside, who, without a formal education, took on the establishment with his unwavering faith in Maxwell; the youthful enthusiasm of Walther Ritz for his science, the credit that was denied him, and the tragedy of his short and painful life; the nonchalance by which Poincaré added hypotheses to theories, his wavering afterthoughts about them, and his humility that led him to uphold Euclidean geometry after all the work he did in bringing hyperbolic geometry into the mainstream of mathematics, but not physics; the quarrelsome and critical Abraham, who was denied the credit he justly deserved, and whose death was also tragic; the mild mannered, cautious and pragmatic approach of Lorentz; and, finally, the enigmatic figure of Einstein, who, more often than not, contradicted his own principles. It has also shown me other sides to people whom I thought I knew. The openness to explore all avenues, no matter how distasteful, that Planck exercised in his approach to blackbody radiation is now contrasted to his opinionated view that non-Euclidean geometries were ‘child’s play,’ in comparison to the demands that relativity makes on the mind. But is it?

Bernard H. Lavenda
Trevignano Romano
March 2011

Contents

Preface

List of Figures

1. Introduction

1.1 Einstein’s Impact on Twentieth Century Physics
  1.1.1 The author(s) of relativity
  1.1.2 Models of the electron
  1.1.3 Appropriation of Lorentz’s theory of the electron by relativity
1.2 Physicists versus Mathematicians
  1.2.1 Gauss’s lost discoveries
  1.2.2 Poincaré’s missed opportunities
1.3 Exclusion of Non-Euclidean Geometries from Relativity
References

2. Which Geometry?

2.1 Physics or Geometry
  2.1.1 The heated plane
2.2 Geometry of Complex Numbers
  2.2.1 Properties of complex numbers
  2.2.2 Inversion
  2.2.3 Maxwell’s ‘fish-eye’: An example of inversion from elliptic geometry
  2.2.4 The cross-ratio
  2.2.5 The Möbius transform
2.3 Geodesics
2.4 Models of the Hyperbolic Plane and Their Properties
2.5 A Brief History of Hyperbolic Geometry
References

3. A Brief History of Light, Electromagnetism and Gravity

3.1 The Drag Coefficient: A Clash Between Absolute and Relative Velocities
3.2 Michelson–Morley Null Result: Is Contraction Real?
3.3 Radar Signaling versus Continuous Frequencies
3.4 Ives–Stilwell Non-Null Result: Variation of Clock Rate with Motion
3.5 The Legacy of Nineteenth Century English Physics
  3.5.1 Pressure of radiation
  3.5.2 Poynting’s derivation of E = mc²
  3.5.3 Larmor’s attempt at the velocity composition law via Fresnel’s drag
3.6 Gone with the Aether
  3.6.1 Elastic solid versus Maxwell’s equations
  3.6.2 The index of refraction
3.7 Motion Causes Bodily Distortion
  3.7.1 Optical effect: Double diffraction experiments
  3.7.2 Trouton–Noble null mechanical effect
  3.7.3 Anisotropy of mass
  3.7.4 e/m measurements of the transverse mass
3.8 Modeling Gravitation
  3.8.1 Maxwellian gravitation
  3.8.2 Ritzian gravitation
References


4. Electromagnetic Radiation

4.1 Spooky Actions-at-a-Distance versus Wiggly Continuous Fields
  4.1.1 Irreversibility from a reversible theory
  4.1.2 From fields to particles
  4.1.3 Absolute versus relative motion
  4.1.4 Faster than the speed of light
4.2 Relativistic Mass
  4.2.1 Gedanken experiments
  4.2.2 From Weber to Einstein
  4.2.3 Maxwell on Gauss and Weber
  4.2.4 Ritz’s electrodynamic theory of emission
4.3 Radiation by an Accelerating Electron
  4.3.1 What does the radiation reaction force measure?
  4.3.2 Constant rate of energy loss in hyperbolic velocity space
  4.3.3 Radiation at uniform acceleration
  4.3.4 Curvatures: Turning and twisting
  4.3.5 Advanced potentials as perpetual motion machines
References

5. The Origins of Mass

5.1 Introduction
5.2 From Motional to Static Deformation
  5.2.1 Potential theory
5.3 Gravitational Mass
  5.3.1 Attraction of a rod: Increase in mass with broadside motion
  5.3.2 Attraction of a spheroid on a point in its axis of revolution: Forces of attraction as minimal curves of convex bodies
5.4 Electromagnetic Mass
  5.4.1 What does the ratio e/m measure?
  5.4.2 Models of the electron
  5.4.3 Thomson’s relation between charges in motion and their mass
  5.4.4 Oblate versus prolate spheroids
5.5 Minimal Curves for Convex Bodies in Elliptic and Hyperbolic Spaces
5.6 The Tractrix
5.7 Rigid Motions: Hyperbolic Lorentz Transforms and Elliptic Rotations
5.8 The Elliptic Geometry of an Oblate Spheroid
5.9 Matter and Energy
References

6. Thermodynamics of Relativity

6.1 Does the Inertia of a Body Depend on its Heat Content?
6.2 Poincaré Stress and the Missing Mass
6.3 Lorentz Transforms from the Velocity Composition Law
6.4 Density Transformations and the Field Picture
6.5 Relativistic Virial
6.6 Which Pressure?
6.7 Thermodynamics from Bessel Functions
  6.7.1 Boltzmann’s law via modified Bessel functions
  6.7.2 Asymptotic probability densities
References

7. General Relativity in a Non-Euclidean Geometrical Setting

7.1 Centrifugal versus Gravitational Forces
7.2 Gravitational Effects on the Propagation of Light
  7.2.1 From Doppler to gravitational shifts
  7.2.2 Shapiro effect via Fermat’s principle
7.3 Optico-gravitational Phenomena
7.4 The Models
7.5 General Relativity versus Non-Euclidean Metrics
7.6 The Mechanics of Diffraction
  7.6.1 Gravitational shift of spectral lines
  7.6.2 The deflection of light
  7.6.3 Advance of the perihelion
References

8. Relativity of Hyperbolic Space

8.1 Hyperbolic Geometry and the Birth of Relativity
8.2 Doppler Generation of Möbius Transformations
8.3 Geometry of Doppler and Aberration Phenomena
8.4 Kinematics: The Radar Method of Signaling
  8.4.1 Constant relative velocity: Geometric-arithmetic mean inequality
  8.4.2 Constant relative acceleration
8.5 Comparison with General Relativity
8.6 Hyperbolic Geometry of Relativity
8.7 Coordinates in the Hyperbolic Plane
8.8 Limiting Case of a Lambert Quadrilateral: Uniform Acceleration
8.9 Additivity of the Recession and Distance in Hubble’s Law
References

9. Nonequivalence of Gravitation and Acceleration

9.1 The Uniformly Rotating Disc in Einstein’s Development of General Relativity
9.2 The Sagnac Effect
9.3 Generalizations of the Sagnac Effect
9.4 The Principle of Equivalence
9.5 Fermat’s Principle of Least Time and Hyperbolic Geometry
9.6 The Rotating Disc
9.7 The FitzGerald–Lorentz Contraction via the Triangle Defect
9.8 Hyperbolic Nature of the Electromagnetic Field and the Poincaré Stress
9.9 The Terrell–Weinstein Effect and the Angle of Parallelism
9.10 Hyperbolic Geometries with Non-Constant Curvature
  9.10.1 The heated disc revisited
  9.10.2 A matter of curvature
  9.10.3 Schwarzschild’s metric: How a nobody became a one-body
  9.10.4 Schwarzschild’s metric: The inside story
9.11 Cosmological Models
  9.11.1 The general projective metric in the plane
  9.11.2 The expanding Minkowski universe
  9.11.3 Event horizons
  9.11.4 Newtonian dynamics discovers the ‘big bang’
References

10. Aberration and Radiation Pressure in the Klein and Poincaré Models

10.1 Angular Defect and its Relation to Aberration and Thomas Precession
10.2 From the Klein to the Poincaré Model
10.3 Aberration versus Radiation Pressure on a Moving Mirror
  10.3.1 Aberration and the angle of parallelism
  10.3.2 Reflection from a moving mirror
10.4 Electromagnetic Radiation Pressure
10.5 Angle of Parallelism and the Vanishing of the Radiation Pressure
10.6 Transverse Doppler Shifts as Experimental Evidence for the Angle of Parallelism
References

11. The Inertia of Polarization

11.1 Polarization and Relativity
  11.1.1 A history of polarization and some of its physical consequences
  11.1.2 Spin
  11.1.3 Angular momentum
  11.1.4 Elastic strain
  11.1.5 Plane waves
  11.1.6 Spherical waves
  11.1.7 β-decay and parity violation
11.2 Stokes Parameters and Their Physical Interpretations
11.3 Poincaré’s Representation and Spherical Geometry
  11.3.1 Isospin and the electroweak interaction
11.4 Polarization of Mass
  11.4.1 Mass and momentum
  11.4.2 Relativistic space-time paths: An example of mass polarization
11.5 Mass in Maxwell’s Theory and Beyond
  11.5.1 A model of radiation
  11.5.2 Enter mass: Proca’s equations
  11.5.3 Proca’s approach to superconductivity
  11.5.4 Phase and mass
  11.5.5 Compressional electromagnetic waves: Helmholtz’s theory
  11.5.6 Directed electromagnetic waves
11.6 Relativistic Stokes Parameters
  11.6.1 Weyl and Dirac versus Stokes
  11.6.2 Origin of the zero helicity state
  11.6.3 Lamb shift and left-hand elliptical polarization
References

Index


List of Figures

1.1 A tiling of the hyperbolic plane by curvilinear triangles that form right-angled pentagons.

2.1 A bug’s life in the heated disk; ‘hot’ in the center and ‘cold’ on the disc.

2.2 Construction of the point of inversion P.

2.3 Circle of inversion for constructing the inverse P with respect to P′.

2.4 Maxwell’s “fish-eye.”

2.5 The magnification of the inner product as it is projected stereographically onto the Euclidean plane.

2.6 In the case of inversion both the point and its image are on the same ray emanating from the center of the disc H.

2.7 It appears that rulers get longer as they are moved further from the origin. However, the elliptic distance from x to y is exactly the same as that from X to Y.

2.8 A tiling of the plane.

2.9 Calculation of cross-ratio and perspectivity.

2.10 The four points u, a, c, v and a, d, x′, y′ from point p have the same angles, hence, have the same cross-ratio. This also is true for c, b, w, z and d, b, x′, y′.

2.11 Derivation of Snell’s law.

2.12 Angle of parallelism.

2.13 The number of lines passing through P that are hyperparallel to the line g is infinite. The lines h1 and h2 are limiting parallel to g, while the others are hyperparallel to g.


2.14 Surfaces of negative constant curvature that are mapped onto part of the hyperbolic plane. The middle figure is the mapping of a pseudosphere that produces horocycles as dashed curves.

2.15 The ratio of concentric limiting arcs depends only on the distance between them.

2.16 Using Euclidean geometry to derive the angle of parallelism by considering concentric limiting arcs.

2.17 A right triangle in hyperbolic space: As P increases without limit the angle tends to the angle of parallelism which is a function only of d.

2.18 The parallax of a star.

2.19 Tractrix and pseudosphere as its surface of revolution.

2.20 Minkowski’s vision of space-time.

2.21 Projection of the hyperboloid onto the plane.

2.22 Geodesics determined by planes cutting the hyperboloid and passing through the center.

2.23 Cayley’s calculation of distance in the projective disc model.

2.24 The Poincaré disc model as a stereographic projection from the south pole S of the bottom sheet.

2.25 Beltrami’s double mapping of Klein and his hyperbolic disc model onto the Poincaré disc model.

2.26 The combined vertical orthogonal projection upwards and the stereographic projection downwards.

2.27 Geodesics consist of arcs that cut the disc, �, orthogonally.

3.1 Fizeau’s aether-drag apparatus with mirrors placed on corners to reflect light.

3.2 Monochromatic, yellow light is split by a mirror into two beams.

3.3 Second-order wavelength shifts plotted as a function of first-order shifts.

3.4 Trouton–Noble experiment to search for effect of Earth moving through aether.

3.5 Planes formed from a moving trihedron.

3.6 Thomson’s apparatus for determining the ratio e/m for cathode rays.


3.7 The points on the parabola refer to electrons deflected by parallel and anti-parallel (left side) fields.

3.8 Elliptical orbit of Mercury showing the excess rotation of the major axis.

4.1 The configuration for calculating the retarded scalar potential.

4.2 Orientation of two circuit elements ds and ds′.

4.3 Frenet frame field for a trajectory of the motion.

5.1 Stellar aberration: (a) A telescope at rest, and (b) a telescope aimed at the same star but in relative motion.

5.2 The potential of a homogeneous rod.

5.3 A rod AB has length 2ℓ with O as its center. The attracted point P with an element of mass dm at a distance r from it. r1 and r2 are the lines joining P to the ends of the rod at A and B.

5.4 Family of ellipses and orthogonal confocal hyperbolas.

5.5 Attraction of a circular disc on its axis.

5.6 A figure of revolution.

5.7 The ratio of charge to mass as a function of the relative velocity. The sloping curve is the ratio determined by Abraham while the horizontal curve results from Lorentz’s formula.

5.8 The orientation of the fields in Bucherer’s experiment.

5.9 (a) Oblate ellipsoid with a = b > c; (b) prolate ellipsoid with a = b < c.

5.10 The caustic circle of radius c separates the bright (periodic) region a > c from the shadow (exponential) region, a < c.

5.11 The perimeter L consists of the two half-lines that are tangent to the circle and the arc length between them.

5.12 A circle inscribed in an n-gon.

5.13 A regular n-gon inscribed in a circle.

5.14 Newton’s tractrix.

7.1 The set-up for the Shapiro effect.

7.2 Rays tangent to a circular caustic of radius l.

7.3 Sector inscribed in a triangle.

7.4 Newton’s tractrix again.

7.5 The stereographic projection of a point on the sphere P onto the plane at point Q.


7.6 Comparison of the Newtonian potential (a) with that of the Schwarzschild potential (b).

7.7 Geodesic curves that cut the rim of the hyperbolic plane orthogonally are arcs of a circle whose center O lies outside the disc.

8.1 Circles of inversion.

8.2 A more detailed description of the circle containing the fixed points v1 and λ which are uniform states of motion at relative velocities u and 2u/(1 + u²). The Möbius automorphism of the disc may be considered as a composition of two hyperbolic rotations: A rotation of π about the hyperbolic midpoint between the origin and λ, and a rotation about the origin. The maximum angle φ is determined by the angle of parallelism, Π, beyond which no motion can occur.

8.3 Extension of hyperbolic trigonometry to general triangles.

8.4 Hyperbolic velocity triangle.

8.5 A Lambert quadrilateral in velocity space consisting of three right-angles and one acute angle.

8.6 A Lambert quadrilateral comprised of complementary segments where the ‘fourth vertex’ is an ideal point.

9.1 The Sagnac Interferometer as originally depicted in his 1913 article.

9.2 Disc cut out of hemisphere at an angle ϑ.

9.3 Gamow’s [62] depiction of Einstein’s gedanken experiment showing the equivalence between acceleration and gravity.

9.4 The angle of parallelism between two bounding parallels connected by the geodesic curve γ.

9.5 Geometric characterization of the metric density.

9.6 Geometric set-up for stellar aberration.

9.7 Fokker’s [65] visualization of fitting errors when objects are placed on curved surfaces. The left and right sides correspond to negative and positive curvature, respectively.

9.8 Hyperbolic right triangle inscribed in a unit disc.

9.9 Interpretation of the variables of the two metrics which are the radii of the elliptic plane.


9.10 The three possible scenarios of closed, flat and open universes. The freckles are the galaxies which are more or less evenly distributed.

9.11 The fates of the universe.

10.1 A segment H of a horocycle with center at infinity with angles of parallelism Π.

10.2 Angle of parallelism Π with transversal perpendicular to one of the parallel lines.

10.3 Poincaré’s projections of the Beltrami model vertically into the southern hemisphere and stereographically back onto the equator.

10.4 Klein model where vertical sections of the hemisphere are projected into straight lines. Geodesics retain their straightness at the cost of not being conformal.

10.5 Radiation falling obliquely on a mirror of length AB.

10.6 The Poincaré half-plane model of measuring distances.

11.1 Spherical right triangle for scheme (II).

11.2 Hyperbolic right triangle related to the scheme (III).

11.3 Weak β-decay of the neutron. In Fermi’s theory this occurs at a single point where the emission of an electron-antineutrino pair is analogous to electromagnetic photon emission.

11.4 The decay of polarized cobalt.

11.5 The decay plane of cobalt 60.

11.6 The spherical coordinates used to describe the orientation of spin.

11.7 The Poincaré sphere is the parametrization of the Stokes parameters in elliptic geometry.

11.8 The polarization ellipse swept out by the electric field vector which is enclosed by a rectangle of sides 2a and 2b. The transformation to new electric vector components E′x and E′y consists in a counter-clockwise rotation about the angle ψ.

11.9 Complex plane representation of polarized states.

11.10 Stereographic projection of the complex plane onto the Poincaré sphere.

11.11 The scattering of a neutrino and antineutrino emits a Z0 boson which decays into W bosons.


11.12 V and S interactions rotate toward one another as the electron velocity decreases.

11.13 A short vertical antenna.

11.14 The configuration of electric and magnetic fields on the surface of a sphere. P is Poynting’s vector showing the direction of radiation. In any small portion, a spherical wave cannot be distinguished from a plane wave.

11.15 The polar plots of the spherical harmonics. Maxwell’s equations prohibit the middle radiation pattern.

11.16 The diagrams of the original and deformed paths of integration with the pole at r = ∞ as if it were at a finite distance.

11.17 A right-spherical triangle.

11.18 A right-spherical triangle traced out by an orbiting electron.

11.19 Zeeman splitting: light path parallel (perpendicular) to field results in a doublet (triplet).

11.20 The conventional explanation of the Lamb shift as the shielding of the electron’s charge by virtual electron-positron pairs that are produced by the vacuum when acted upon by an electric field.

11.21 Splitting of energy levels of a hydrogen-like atom (not drawn to scale). All shifts are left-hand elliptical polarizations.


Chapter 1

Introduction

Planck made two great discoveries in his lifetime: the energy quantum and Einstein. [Miller 81]

1.1 Einstein’s Impact on Twentieth Century Physics

When one mentions the word ‘relativity’ the name Albert Einstein springs to mind. So it is quite natural to ask what Einstein’s contribution was to the theory of relativity, in particular, and to twentieth century physics, in general. Biographers and historians of science go to great lengths to rewrite history.

Undoubtedly, Abraham Pais’s [82] book, Subtle is the Lord, is the definitive biography of Einstein; it attempts to go beneath the surface and gives mathematical details of his achievements. A case in point, which will serve only for illustration, is the photoelectric effect.

Pais tells us that Einstein proposed Emax = hν − P, where ν is the frequency of the incident (monochromatic) radiation and P is the work function, the energy needed for an electron to escape the surface. He pointed out that [this equation] explains Lenard’s observation of the light intensity independence of the electron energy. Pais then goes on to say that first

E [sic Emax] should vary linearly with ν. Second, the slope of the (E, ν) plot is a universal constant, independent of the nature of the irradiated material. Third, the value of the slope was predicted to be Planck’s constant determined from the radiation law. None of this was known then.

This gives the impression that Einstein singlehandedly discovered the photoelectric law. This is certainly inaccurate. Just listen to what J. J. Thomson [28] had to say on the subject:

It was at first uncertain whether the energy or the velocity was a linear function of the frequency. . . Hughes, and Richardson and Compton were however able to show that the former law was correct. . . The relation between maximum energy and the frequency can be written in the form $\tfrac{1}{2}mv^2 = k\nu - V_0 e$, where $V_0$ is a potential characteristic of the substance. Einstein suggested that k was equal to h, Planck’s constant. [italics added]

Pais asks “What about the variation of the photoelectron energy with light frequency? One increases with the other; nothing more was known in 1905.” So it is not true that “At the time Einstein proposed his heuristic principle, no one knew how E depended on ν beyond the fact that one increases with the other.” . . . And this was the reason for Einstein’s Nobel Prize.

1.1.1 The author(s) of relativity

Referring to the second edition of Edmund Whittaker’s book, A History of the Theories of Aether and Electricity, Pais writes

Forty years later, a revised edition of this book came out. At that time Whittaker also published a second volume dealing with the period from 1910 to 1926. His treatment of the special theory of relativity in the latter volume shows how well the author’s lack of physical insight matches his ignorance of the literature. I would have refrained from commenting on his treatment of special relativity were it not for the fact that his book has raised questions in many minds about the priorities in the discovery of this theory. Whittaker’s opinion on this point is best conveyed by the title of his chapter on this subject: ‘The Relativity Theory of Poincaré and Lorentz.’

Whittaker ignited the priority debate by saying

In the autumn of the same year, in the same volume of the Annalen der Physik as his paper on Brownian motion, Einstein published a paper which set forth the relativity theory of Poincaré and Lorentz with some amplifications, and which attracted much attention. He asserted as a fundamental principle the constancy of the speed of light, i.e. that the velocity of light in vacuo is the same for all systems of reference which are moving relatively to each other: the assertion which at the time was widely accepted, but has been severely criticized by later writers. In this paper Einstein gave the modifications which must now be introduced into the formulae for aberration and the Doppler effect.

Except for the ‘severe criticism,’ which we shall address in Sec. 4.2.1, Whittaker’s appraisal is balanced. Pais’s criticism that “as late as 1909 Poincaré did not know that the contraction of rods is a consequence of the two Einstein postulates,” and that “Poincaré therefore did not understand one of the most basic traits of special relativity” is an attempt to discredit Poincaré in favor of Einstein. In fact, there have been conscientious attempts at demonstrating Poincaré’s ignorance of special relativity.


The stalwarts of Einstein, Gerald Holton [88] and Arthur Miller [81], have been joined by John Norton [04] and Michel Janssen [02]. There has been growing support for Poincaré from the Frenchmen Jules Leveugle [94] and Christian Marchal, and from Anatoly Logunov [01], a member of the Russian Academy of Sciences. It is, however, the general consensus that Poincaré arrived at the two postulates first, by at least ten years, but that “he did not fully appreciate the status of both postulates” [Goldberg 67]. Appreciation is fully in the mind of the beholder.

There is a similar debate about who ‘discovered’ general relativity: was it Einstein or David Hilbert? These debates make sense if the theories, and most of all the results they bear, are correct, unique and compelling. In this book we will argue that they are not unique. It is also very dangerous when historians of science enter the fray, for they have no means of judging the correctness of the theories. However, since it makes interesting reading we will indulge and present the pros and cons of each camp.

Why then all the appeal for Einstein’s special theory of relativity? Probably because the two predictions of the theory were found to have practical applications to everyday life. The slowing down of clocks as a result of motion should also apply to all other physical, chemical and biological phenomena. The apparently inescapable conclusions that a twin who goes on a space trip at a speed near that of light returns to earth to find his twin has aged more than he has, and the decrease in frequency of an atomic oscillator on a moving body with the increase in mass on the moving body which is converted into radiation, all have resulted in paradoxes.

All this means that the physics of the problems has yet to be understood. Just listen to the words of the eminent physicist Victor Weisskopf [60]:

We all believe that, according to special relativity, an object in motion appears to be contracted in the direction of motion by a factor $[1 - (v/c)^2]^{1/2}$. A passenger in a fast space ship, looking out the window, so it seemed to us, would see spherical objects contracted into ellipsoids.

Commenting on James Terrell’s paper on the “Invisibility of the Lorentz contraction” in 1960, Weisskopf concludes:

. . . is most remarkable that these simple and important facts of the relativistic appearance of objects have not been noticed for 55 years.

It is well to recognize that what appears to be a firmly established phenomenon keeps popping up in different guises. It is the same type of remarks that the space contraction is a ‘psychological’ state of mind, and not a ‘real’ physical effect, that prompted Einstein to reply:

The question of whether the Lorentz contraction is real or not is misleading. It is not ‘real’ insofar as it does not exist for an observer moving with the object.

Here, Einstein definitely committed himself to the ‘reality’ of the Lorentz contraction.

1.1.1.1 Einstein’s retraction of these two postulates and the existence of the aether

The cornerstones of relativity are the equivalence of all inertial frames, and the constancy of the speed of light in all directions in vacuo. These postulates were also those of Poincaré, who uttered them at least seven years prior to Einstein. So what makes Einstein’s postulates superior to those of Poincaré?

Stanley Goldberg [67] and Arthur Miller [73] tell us that Poincaré’s [04] statements

the laws of physical phenomena must be the same for a stationary observer as for an observer carried along in a uniform motion of translation; so that we have not and cannot have any means of discerning whether or not we are carried along in such a motion,

and

no velocity can surpass that of light,

were elevated to “a priori postulates” [Goldberg 67] which “stood at the head of his theory.” These postulates also carry the name of Einstein. Why then would Einstein ever think of retracting them?

If time dilatation and space contraction due to motion are actual processes then there is no symmetry between observers in different inertial frames. The first postulate of relativity is therefore violated [Essen 71]. Einstein used gedanken experiments, which is an oxymoron. Consider what Einstein [16] has to say about a pair of local observers on a rotating disc:

By a familiar result of the special theory of relativity the clock at the circumference, judged by K, goes more slowly than the other because the former is in motion and the latter is at rest. An observer at the common origin of coordinates capable of observing the clock at the circumference by means of light would therefore see it lagging behind the clock beside him. As he will not make up his mind to let the velocity of light along the path in question depend explicitly on the time, he will interpret his observations as showing that the clock at the circumference ‘really’ goes more slowly than the clock at the origin.

First, the uniformly rotating disc is not an inertial system so the special theory does not apply. Second, local observers cannot discern any changes to their clocks or rulers as to where they are on the disc because they shrink or expand with them. It is only to us Euclideans that these variations are perceptible.

If the velocity of light is independent of the velocity of its source, how then can the outward journey of a light signal to an observer moving at velocity v be c + v, while on its return it travels with a velocity c − v? Although this violates the second postulate, such assertions appear in the expression for the elapsed time of sending out a light signal from one point to another and back again in the Michelson–Morley experiment whose null result they hope to explain. They also appear alongside Einstein’s relativistic velocity composition law in his famous 1905 paper “On the Electrodynamics of Moving Bodies.”
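As a brief illustration of the elapsed-time expression alluded to here, the following is a sketch of the textbook aether-frame calculation for a light signal sent along an interferometer arm and back. The arm length L is a symbol introduced only for this illustration and does not appear in the text; which leg is assigned c − v and which c + v depends on the direction of motion and does not affect the sum.

```latex
% Round-trip time along an arm of (assumed) length L, with relative speeds c - v and c + v
t = \frac{L}{c - v} + \frac{L}{c + v}
  = \frac{2Lc}{c^{2} - v^{2}}
  = \frac{2L}{c}\,\frac{1}{1 - v^{2}/c^{2}} .
```

Both c − v and c + v enter explicitly, which is the feature the paragraph objects to.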

Also in that paper is his ‘definition’ of the velocity of light as the ratio of “light path” to the “time interval.” But we are not allowed to measure the path of the light ray and determine the time it took, for c has been elevated to a universal constant! “How can two units of measurement be made constant by definition?” Essen queries.

In his first attempt to explain the bending of rays in a gravitational field, Einstein [11] claims

For measuring time at a place which, relative to the origin of the coordinates, has a gravitation potential $\Phi$, we must employ a clock which, when removed to the origin of coordinates, goes $(1 + \Phi/c^2)$ times more slowly than the clock used for measuring time at the origin of coordinates. If we call the velocity of light at the origin of coordinates $c_0$, then the velocity of light c at a place with the gravitational potential $\Phi$ will be given by the relation

$$c = c_0\left(1 + \frac{\Phi}{c^2}\right).$$

The principle of the constancy of the velocity of light holds good according to this theory in a different form from that which usually underlies the ordinary theory of light. [italics added]

On the contrary, this violates the second postulate, which makes no reference to inertial or non-inertial frames. And is his equation a cubic equation for determining c?
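The question can be made explicit with one line of algebra. This is only a sketch, and it assumes that the c in the denominator of Einstein’s relation is read as the same local velocity c that appears on the left-hand side; writing the potential as Φ and the velocity at the origin as c0, multiplying through by c² gives

```latex
% Einstein's 1911 relation, c = c_0 (1 + \Phi / c^2), multiplied through by c^2
c^{3} - c_{0}\,c^{2} - c_{0}\,\Phi = 0 .
```

Under that reading the relation is indeed cubic in c. If the denominator is instead taken to be the constant c0², the relation is linear in Φ and the issue does not arise.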


It did not take Max Abraham [12] long to point this out, stating that Einstein had given “the death blow to relativity,” by retracting the invariance of c. Abraham said he warned “repeatedly against the siren song of this theory. . . [and] that its originator has now convinced himself of its untenability.” What Abraham objected to most was that even if relativity could be salvaged, at least in part, it could never provide a “complete world picture,” because it excludes, by its very nature, gravity.

Einstein also uses the same Doppler expression for the frequency shift. The Doppler shift is caused by the motion of the source with respect to the observer. “There is, therefore, no logical reason why it should be caused by the gravitational potential, which is assumed to be equivalent to the acceleration times distance” [Essen 71]. Thus Einstein is proposing another mechanism for the shift of spectral lines that employs accelerative motion rather than the relative motion of source and receiver. Does the acceleration of a locomotive cause a shift in the frequency of its whistle? Or is it due to its velocity with respect to an observer on a stationary platform? But no, Einstein has replaced the product of acceleration and distance with the gravitational potential, which is static! Just where a clock is in a gravitational field will change its frequency. This is a shift caused by neither velocity nor acceleration.

Everyone would agree that Einstein removed the aether. Whereas Hertz considered the aether to be dragged along with the motion of a body, Lorentz considered the aether to be immobile, a reference frame for an observer truly at rest. On the occasion of a visit to Leyden in 1920, Einstein [22a] had this to say about the aether:

. . . the whole change in the conception of the aether which the special theory of relativity brought about, consisted in taking away from the aether its last mechanical quality, namely, its immobility. . . . according to the general theory of relativity space is endowed with physical qualities; in this sense, therefore, there exists an aether. . . . space without aether is unthinkable; for in such a space there not only would be no propagation of light, but also no possibility of the existence for standards of space and time (measuring rods and clocks), nor therefore any space time intervals in the physical sense. But this aether may not be thought of as endowed with the quality characteristic of ponderable media, as consisting of parts which may be tracked through time. The idea of motion may not be applied to it.

Essentially what Einstein is saying is that what was not good for special relativity is good for general relativity, for “We know that [the new aether] determines the metrical relations in the space-time continuum.” How is it needed for the propagation of light signals and yet has not the characteristics of a medium? Einstein’s real problem is with rotations, for “Newton might no less well have called his absolute space ‘aether;’ what is essential is merely that besides observable objects, another thing, which is not perceptible, must be looked upon as real, to enable acceleration or rotation to be looked upon as something real.”

This is five years after Einstein’s formulation of general relativity, and his desire is to unite the gravitational and electromagnetic fields into “one unified conformation” that would enable “the contrast between aether and matter [to] fade away, and, through the general theory of relativity, the whole of physics would become a complete system of thought.” The search for that utopia was to occupy Einstein for the remainder of his life.

1.1.1.2 Which mass?

In Lorentz’s theory two masses result depending on how Newton’s law is expressed, i.e.

$$F = \frac{d}{dt}(mv),$$

or

$$F = ma,$$

where a is the acceleration. Both forms of the force law coincide when the mass is independent of the velocity, but not so when it is a function of the velocity. If the force is perpendicular to the velocity there results the transverse mass,

$$m_t = \frac{m_0}{\sqrt{1 - \beta^2}},$$

while if parallel to the velocity there results the longitudinal mass,

$$m_l = \frac{m_0}{(1 - \beta^2)^{3/2}}.$$

While it is true that a larger force is required to produce an acceleration in the direction of the motion than when it is perpendicular to the motion, it “is unfortunate that the concept of two masses was ever developed, for the [second] form of Newton’s law is now recognized as the correct one” [Stranathan 42].
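The way a velocity-dependent mass yields two different ratios of force to acceleration can be checked numerically. The following is a minimal sketch, assuming the momentum p = m0 v/√(1 − β²) with β = v/c (the Lorentz form implied by the two masses quoted above) and units in which m0 = c = 1. The function names are hypothetical and chosen only for illustration.

```python
import math


def momentum(v, m0=1.0, c=1.0):
    """Magnitude of the momentum m(v) * v, assuming m(v) = m0 / sqrt(1 - (v/c)^2)."""
    beta = v / c
    return m0 * v / math.sqrt(1.0 - beta * beta)


def longitudinal_mass(v, m0=1.0, c=1.0, h=1e-6):
    """Force/acceleration for a force parallel to v.

    The speed changes, so F = dp/dt = (dp/dv) * a; dp/dv is estimated
    here with a central finite difference of step h.
    """
    return (momentum(v + h, m0, c) - momentum(v - h, m0, c)) / (2.0 * h)


def transverse_mass(v, m0=1.0, c=1.0):
    """Force/acceleration for a force perpendicular to v.

    The speed stays constant, so only the direction of p changes
    and F = (p / v) * a.
    """
    return momentum(v, m0, c) / v


if __name__ == "__main__":
    v = 0.6  # speed in units of c (illustrative value)
    beta2 = v * v
    print("longitudinal, numerical:", longitudinal_mass(v))
    print("longitudinal, formula  :", 1.0 / (1.0 - beta2) ** 1.5)
    print("transverse,   numerical:", transverse_mass(v))
    print("transverse,   formula  :", 1.0 / math.sqrt(1.0 - beta2))
```

The finite difference is used instead of the analytic derivative so that the check does not presuppose the formula it is meant to confirm.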


In the early days of relativity the relativistic mass was written $m = \tfrac{4}{3}E/c^2$, and not $m = E/c^2$. Einstein was aloof to the factor of $\tfrac{4}{3}$, which was a consequence of the Lorentz transform on energy, but not to there being two masses. According to Einstein [05] “with a different definition of the force and acceleration we would obtain different numerical values for the masses; this shows that we must proceed with great caution when comparing different theories of the motion of the electron.” Apart from ‘numerical’ differences, Kaufmann’s experiments identified the mass as the transverse mass, but this did not prevent Einstein [06a] from proposing an experimental method to determine the ratio of the transverse to the longitudinal mass.

According to Einstein the ratio of the transverse to longitudinal mass would be given by the ratio of the electric force, eE, to the potential, V, “at which the shadow-forming rays get deflected,” i.e.

$$\frac{m_t}{m_l} = \frac{\rho}{2}\,\frac{E_x}{V},$$

where ρ is the radius of curvature of the shadow-forming rays and $E_x$ is the electric field in the x-direction. As the ‘definition’ of the longitudinal mass, $m_l$, Einstein takes

$$\text{kinetic energy} = \tfrac{1}{2} m_l v^2.$$

It would be very difficult for Einstein to get this energy as a nonrelativistic approximation of a relativistic expression for the kinetic energy.
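To make the difficulty concrete, the following sketch compares the low-velocity expansions of the two expressions. It assumes the standard relativistic kinetic energy T = m0c²[(1 − β²)^(−1/2) − 1], which is not written out in this passage, together with the longitudinal mass m_l = m0/(1 − β²)^(3/2) of Lorentz’s theory, where β = v/c.

```latex
% Standard relativistic kinetic energy (assumed form), expanded for small beta
T = m_0 c^2\left[(1-\beta^2)^{-1/2} - 1\right]
  = \tfrac{1}{2} m_0 v^2 + \tfrac{3}{8}\, m_0 \frac{v^4}{c^2} + \cdots

% Einstein's 'definition' with the longitudinal mass, expanded to the same order
\tfrac{1}{2} m_l v^2 = \tfrac{1}{2} m_0 v^2 \,(1-\beta^2)^{-3/2}
  = \tfrac{1}{2} m_0 v^2 + \tfrac{3}{4}\, m_0 \frac{v^4}{c^2} + \cdots
```

Under these assumptions the two expressions agree only at lowest order, where the longitudinal mass reduces to m0; the first corrections differ by a factor of two.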

Einstein’s contention that

A change of trajectory evidently is produced by a proportional change of the field only at electron velocities at which the ratio of the transverse to longitudinal mass is noticeably different from unity

is at odds with his assumption of the validity of the equation of motion,

$$m_0\frac{d^2x}{dt^2} = -eE_x,$$

which holds “if the square of the velocity of the electrons is very small compared to the square of the velocity of light.” The mass of the electron $m_0$ is not specified as to whether it is the transverse or longitudinal mass, or a combination of the two.


This example shows that Einstein was not as attached to his relativity theory as he is made out to be. Why is it that the same types of contradictions and incertitudes found in Poincaré’s statements are used as proof of his limitations as a physicist, while there is never mention of them in Einstein’s case?

1.1.1.3 Conspiracy theories

In order to defend the supremacy of German science, David Hilbert, with the help of Hermann Minkowski and Emil Wiechert, set out to deny Poincaré the authorship of relativity. Hilbert was the last in a long line of illustrious Göttingen mathematicians who sought to retain the dominance of the University which boasted of the likes of Carl Friedrich Gauss, Bernhard Riemann and Felix Klein. Whereas there existed a friendly competition between Poincaré and Felix Klein [Stillwell 89], Hilbert’s predecessor, there was jealousy between Hilbert and Poincaré, which was only exacerbated when Poincaré won the Bolyai prize in mathematics for the year 1905. Ironic as it may be, János Bolyai was the co-inventor of hyperbolic geometry, and the rivalry between Klein and Poincaré had to do with the development of that geometry.

As the story goes, Arnold Sommerfeld [04], an ex-assistant of Klein’s, Gustav Herglotz and Wiechert were working on superluminal electrons during the fall of 1904 through the spring of 1905. In the summer months of 1905, beginning on the notorious date of the 5th of June, the Göttingen mathematicians organized seminars on the ‘theory of electrons,’ in which there was a session on superluminal electrons chaired by Wiechert on the 24th of July.

The date of the 5th of June coincided with Poincaré’s [05] presentation of his paper, “Sur la dynamique de l’électron,” to the French Academy of Sciences. The printed paper was published and sent out to all correspondents of the Academy that Friday, the 9th of June. The earliest it could have arrived in Göttingen was Saturday the 10th, or given postal delays it would have arrived no later than the following Tuesday, the 13th of June.ᵃ In that paper Poincaré supposedly declared that no material body can go faster than the velocity of light in vacuum, and this threw a wrench into the works of the Göttingen school [Marchal].

ᵃ These dates are reasonable since the other German physics bi-monthly journal, Fortschritte der Physik, had a synopsis of the Poincaré paper in its 30th of June issue. Given the publication delay, it would make the 10th of June arrival date of the Comptes Rendus issue more likely.

However, this is nothing different from what Poincaré [98] had been saying since 1898, when he postulated the invariance of the velocity of light in vacuo for all observers, whether they are stationary or in motion. Nor is it different from what Poincaré reiterated in 1904: “from these results, if they are confirmed would arise a new mechanics [in which] no velocity could surpass that of light.” So the publication date of the 5th of June, all-important to the proponents of the conspiracy theory [Leveugle 04], is a red herring, for the paper said only what he had said before on the limiting velocity of light. Moreover, there was a continual boycott of Poincaré’s relativity work in such prestigious German journals as Annalen der Physik. Consequently, there was no contingency for the appearance of Einstein’s paper when it did. But let us continue.

So the plot was hatched that some German of minor importance, one who was willing to take the risks of plagiarism, had to be found who would reproduce Poincaré’s results without his name. Now Minkowski knew of Einstein since he had been his student at the ETH[b] from 1896–1900. Einstein was also in contact with Planck, since Einstein’s summaries of the work appearing in other journals for the Beiblätter zu den Annalen der Physik earned him a small income. In fact, there is one review by Einstein of a paper by A. Ponsot, “Heat in the displacement of the equilibrium of a capillary system,” that appeared in Comptes Rendus 140 just 325 pages before Poincaré’s June 5th paper. To make matters worse, an article by Weiss, which appeared in the same issue of Comptes Rendus, was summarized in the November issue of the Supplement, whereas Poincaré’s paper was not.

Neither that paper nor its longer extension, which was published in the Rendiconti del Circolo Matematico di Palermo [06], was ever summarized in the Beiblätter. Surely, these papers would have caught the eye of Planck, who was running the Annalen, and was known to be in correspondence with Einstein not only in this connection but also with regard to questions on quanta. Einstein had also published some papers on the foundations of thermodynamics during the years 1902–1903 in the Annalen, whose similarity with those of J. Willard Gibbs was “quite amazing” even to Max Born [51].

[b] The Eidgenössische Technische Hochschule (ETH) was then known as the Eidgenössische Polytechnikum; the name was officially changed in 1911.

Thus, the relativity paper was supposedly prepared by the Göttingen mathematicians and signed by Einstein, who submitted it for publication at the end of June; it arrived at the offices of the Annalen on the 30th of June. Einstein was an outsider, being considered a thermodynamicist, with a lot to gain and little, if anything, to lose. The paper fails to mention either Lorentz or Poincaré, and, for that matter, contains no references at all. If there was a referee for the paper,[c] other than Planck himself, it would have been obvious to him that the transformation of the electrodynamic quantities went under the name of Lorentz, with Lorentz’s parameter k(v) replaced by Einstein’s ϕ(v), both ultimately set equal to 1, and that the relativistic addition law had already been written down by Poincaré as a consequence of the Lorentz transform in his 1905 paper “Sur la dynamique de l’électron.” Although Einstein derives the relativistic composition law in the same way as Poincaré, he provides a new generalization when the Lorentz transformations being composed are in different planes, for that also involves rotations. It has been claimed that there was no connection between Lorentz and Einstein, for Einstein gets the wrong expression for the transverse mass in his “Electrodynamics of moving bodies,” while Lorentz errs when he subjects the electric current to a Lorentz transformation [Ohanian 08]. But it is clear from his method of derivation from the Lorentz force that Einstein’s error was a typo. Einstein’s paper appeared in the 26th of September issue of the Annalen, and Planck lost no time in organizing a symposium on his paper that November, which, in the words of von Laue, was “unforgettable.”

Not all is conjecture; certain things are known. First, Poincaré worked in friendly competition with Klein in studying universal coverings of surfaces. What initiated Poincaré on his studies of hyperbolic geometry was an 1882 letter from Klein, who informed him of previous work by Schwarz. Second, it was Klein who brought Hilbert to Göttingen. When criticized about his choice, Klein responded “I want the most difficult of all.” Third, Klein was known to pass on important letters and scientific material to Hilbert. Fourth, since Klein and Poincaré were on good terms and in contact, it would be unthinkable that Klein did not know of Poincaré’s work on relativity, or that Klein would not have passed this on to Hilbert. Fifth, there was a lack of “kindred spirit” [Gray 07] between Poincaré and Hilbert from their first meeting in Paris in 1885. Sixth, Poincaré was “unusually open about his sources” [Gray 07] and non-polemical, while Hilbert had a tremendous will and thought every problem was solvable. Lastly, Poincaré’s work on relativity was actively boycotted in Germany, and later in France thanks to Paul Langevin. Thus, it is unthinkable that Hilbert was in the dark about relativity theory prior to 1905. His colleague, Minkowski, became interested in electrodynamics through reading Lorentz’s papers. According to C. Reid, in “Hilbert,” Hilbert conducted a joint seminar with Minkowski. A year after their study, in 1905, they decided to dedicate the seminar to a topic in physics: the electrodynamics of moving bodies. Hilbert was often quoted as saying “physics is too important to be left to the physicists.” What is truly unbelievable is that the discoverer of relativity and of two models of hyperbolic geometry would not even once think there was a relation between the two. Everything else is conjecture, even Einstein’s supposed receipt of the latest issue of Volume CXL of Comptes Rendus, vested as a reviewer for the Beiblätter, on Monday the 12th of June in the Berne Patent Office. Undoubtedly, that would have created a dire urgency to finish his article on the electrodynamics of a moving body [Auffray 99]. But wherever the real truth may lie, there cannot be any doubt that Planck played a decisive role in Einstein’s rise to fame.

[c] Apparently the paper was handled by Wilhelm Röntgen, a member of the Kuratorium of the Annalen, who gave it to his young Russian assistant, Abraham Joffe [Auffray 99]. Joffe noted that the author was known to the Annalen, and recommended publication. That an experimental physicist should have handled the paper, and not the only theoretician on the Kuratorium, Planck, would have made such a refereeing procedure extremely dubious.

The behavior of Langevin toward a fellow countryman is even more baffling when we realize that he was the first French physicist to learn of the “new mechanics” of Poincaré, which would later be known as relativity, but without the name of its author. Langevin had accompanied Poincaré to the Saint-Louis Congress of 1904, where Poincaré presented his principle of relativity. It is hardly admissible that Langevin was not familiar with all of Poincaré’s publications, especially when Poincaré [06] dedicated a whole section of his 1906 article in the Rendiconti to him, entitling it “Langevin Waves,” and stating

Langevin has put forth a particularly elegant formulation of the formulas which define the electromagnetic field produced by the motion of a single electron.

Yet, in his obituary column of Poincaré, Langevin fails to note Poincaré’s priority over Einstein, writing

Einstein has rendered the things clearer by underlining the new notions of space and time which correspond to a group totally different than the conserved transformations of rational mechanics, and asserting the generality of the principle of relativity and admitting that no experimental procedure could ascertain the translational movement of a system by measurements made on its interior. He has succeeded in giving definitive form to the Lorentz group and has indicated the relations that exist between the same quantity simultaneously made on each of two systems in relative movement.

Henri Poincaré arrived at the same equations in the same time following a different route, his attention being directed to the imperfect form which the formulas for the transformation had been given by Lorentz. Familiar with the theory of groups, he was preoccupied to find the invariants of the transformation, elements which are unaltered and thanks to which it is possible to pronounce all the laws of physics in a form independent of the reference system; he sought the form that these laws must have in order to satisfy the principle of relativity.

This could not have appeared in a more appropriate place: Revue de Métaphysique et de Morale!

Another priority feud also erupted between Einstein and Hilbert over general relativity in November 1915. It ended with the publication of papers with the unpretentious titles of “The foundation of the general theory of relativity,” by Einstein, and “The foundations of physics,” by Hilbert. Historians of science make Einstein’s theory the ultimate theory of gravitation with titles like “How Einstein found his field equations” [Norton 84] and “Lost in the tensors: Einstein’s struggle with covariance principles” [Earman & Glymour 78]. In the opinion of O’Rahilly [38], “Einstein’s theory, which delights every aesthetically minded mathematician, is a much less grandiose affair as judged and assessed by the physicist.” He points out that Walther Ritz arrived at a prediction of a perihelion advance of the planets in 1908. We will use his same force equation to show he could have obtained the other predictions of general relativity in Sec. 3.8.2. Furthermore, the same experimental tests of these equations can be obtained with far more simplicity, as we shall see in Chapter 7. The proponents of the conspiracy theory claim that Einstein’s conciliatory letter of December to Hilbert may be due, in part, to the favor that Hilbert did for him ten years earlier.

The defenders of Einstein belittle Poincaré for his “lack of insight into certain aspects of the physics involved” [Goldberg 67]. The same can be said of Einstein; in a much quoted letter to Carl Seelig on the occasion of the 50th ‘anniversary’ of relativity, Einstein writes:

The new feature was the realization of the fact that the bearing of the Lorentz-transformations transcended their connection with Maxwell’s equations and was concerned with the nature of space and time in general. A further result was that the Lorentz invariance is a general condition for any physical theory. This was for me of particular importance because I had already previously found that Maxwell’s theory did not account for the micro-structure of radiation and could therefore have no general validity.

In a letter to von Laue in 1952, Einstein elaborated what he meant by a “second type” of radiation pressure:

one has to assume that there exists a second type of radiation pressure, not derivable from Maxwell’s theory, corresponding to the assumption that radiation energy consists of indivisible point-like localized quanta of energy hν (and of momentum hν/c, c = velocity of light), which are reflected undivided. The way of looking at the problem showed in a drastic and direct way that a type of immediate reality has to be ascribed to Planck’s quanta, that radiation must, therefore, possess a kind of molecular structure as far as energy is concerned, which of course contradicts Maxwell’s theory.

Maxwell’s equations together with the Lorentz force satisfy the Lorentz transform, so it is difficult to see how the transformation is more general than what it transforms. In addition, the discovery of Planck’s radiation law did not contradict the Stefan–Boltzmann radiation law, nor provide a new type of radiation. Here, Einstein is confusing macroscopic laws with the underlying microscopic processes that are entirely compatible with those laws when the latter are averaged over all frequencies of radiation. Consequently, there is no second type of radiation pressure.

What the conspiracy theories have in common with their opponents is the presumption that the end result is correct. What authority did Poincaré’s June paper of 1905 have for dashing the efforts of Sommerfeld’s investigations on superluminal electrons? Weber was no stranger to superluminal particles, nor was Heaviside. In all the years preceding that paper, there was no authority bearing down upon them even though the mathematical structure of relativity had been set in place. What was supposedly new about Einstein’s paper was the liberation of space and time from an electromagnetic framework, as he claimed in his letter to Seelig. But is this true?

1.1.1.4 Space-time in Einstein’s world

The conventional way of rebuffing the conspiracy theories is “to show the nature of Poincaré’s ideas and approach that prevented him from producing what Einstein achieved” [Cerf 06]. Einstein was not so unread as he would have us believe, for he used Poincaré’s method, radar signaling, in discussing simultaneous events, and falls into the same trap as Poincaré did.

Poincaré asks us to consider two observers, A and B, who are equipped with clocks that can be synchronized with the aid of light signals. B sends a signal to A, marking down the time instant at which it is sent. A, on the other hand, resets his clock to that instant in time when he receives the signal. Poincaré realized that such a synchronization would introduce an error because it takes a time t for light to travel between B and A. That is, A’s clock would be behind B’s clock by a time t = d/c, where d is the distance between B and A. This error, according to Poincaré, is easy to correct: Let A send a light signal to B. Since light travels at the same speed in both directions, B’s clock will be behind A’s by the same time t. Therefore, in order to synchronize their clocks it is necessary for A and B to take the arithmetic mean of the times arrived at in this way. This is also Einstein’s result.

Certainly the definition of the velocity v = d/t seems innocuous enough. But, as Louis Essen [71] has pointed out, it is possible to define the units of any two of these terms. Normally, one measures distance in meters and time in seconds so the velocity is meters per second. But making the velocity of light constant “in all directions and to all observers whether stationary or in relative motion” is tantamount to making c a unit of measurement, or what will turn out to be an absolute constant. According to Essen, “the definition of the unit of length or of time must be abandoned; or, to meet Einstein’s two conditions, it is convenient to abandon both units.”

The two conditions that Essen is referring to are the dilatation of time and the contraction of length. There is no new physical theory, but “simply a new system of units in which c is constant,” so that either time or length or both must be a function of c such that their ratio, d/t, gives c. This is not what Louis de Broglie [51] had to say:

Poincaré did not take the decisive step. He left to Einstein the glory of having perceived all the consequences of the principle of relativity and, in particular, of having clarified through a deeply searching critique of the measures of length and duration, the physical nature of the connection established between space and time by the principle of relativity.

So by elevating the velocity of light to a universal constant, Einstein implied that the geometry of relativity was no longer Euclidean. The number c is an absolute constant for hyperbolic geometry that depends for its value on the choice of the unit of measurement. To the local observers there is no such thing as time dilatation nor length contraction. These distortions are due to our Euclidean perspective. It is all a question of ‘frame of reference.’

Poincaré, after having written down his relativistic law of the composition of velocities, should have realized that the only function which could satisfy such a law is the hyperbolic tangent, which is the straight line segment in Lobachevsky (velocity) space. Thus, time and space have no separate meaning, but only their ratio does.
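The link between the composition law and the hyperbolic tangent can be checked numerically. What follows is only a minimal sketch in Python and is not part of the original argument; the function names `compose_velocities` and `rapidity` are illustrative, and units with c = 1 are assumed by default. It verifies, for one pair of sample velocities, that composing two collinear velocities gives the same result as adding their hyperbolic measures tanh⁻¹(v/c) and mapping the sum back with tanh.

```python
import math

def compose_velocities(u, v, c=1.0):
    """Relativistic composition of two collinear velocities u and v."""
    return (u + v) / (1.0 + u * v / c**2)

def rapidity(v, c=1.0):
    """Hyperbolic measure of a velocity: theta = artanh(v/c)."""
    return math.atanh(v / c)

u, v = 0.6, 0.7  # illustrative values, in units of c
composed = compose_velocities(u, v)
# Adding the hyperbolic measures and mapping back with tanh
# should reproduce the composed velocity.
from_rapidities = math.tanh(rapidity(u) + rapidity(v))
print(composed, from_rapidities)
```

The two printed numbers agree to rounding error, and the composed velocity stays below c even though u + v exceeds it.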

Consider Einstein’s two postulates which he enunciated in 1905:

(i) The same laws of electrodynamics and optics will be valid for all frames of reference for which the equations of mechanics hold.

(ii) Light is always propagated in empty space with a definite velocity c, which is independent of the state of motion of the emitting body.

Match them against Poincaré’s first two postulates as he pronounced them in 1904:

(i) The laws of physical phenomena should be the same whether for an observer fixed, or for an observer carried along in a uniform movement of translation; so that we could not have any means of discerning whether or not we are carried along in such a motion;

(ii) Light has a constant velocity and in particular that its velocity is the same in all directions.

Now Poincaré introduces a third postulate, on which Pais makes the following comment:

The new mechanics, Poincaré said, is based on three hypotheses. The first of these is that bodies cannot attain velocities larger than the velocity of light. The second is (I use modern language) that the laws of physics shall be the same in all inertial frames. So far so good. Then Poincaré introduces a third hypothesis. ‘One needs to make still a third hypothesis, much more surprising, much more difficult to accept, one which is of much hindrance to what we are currently used to. A body in translational motion suffers a deformation in the direction in which it is displaced. . . However strange it may appear to us, one must admit that the third hypothesis is perfectly verified.’ It is evident that as late as 1909 Poincaré did not know that the contraction of rods is a consequence of the two Einstein postulates. Poincaré therefore did not understand one of the most basic traits of special relativity.

Whether or not rods contract or rotate when in motion will be discussed in Sec. 9.9, but it appears that Pais is reading much too much into what Poincaré said as opposed to what he actually did. In Sec. 4 of “Sur la dynamique de l’électron,” published in 1905 and entitled “The Lorentz transformation and the principle of least action,” Poincaré shows that both time dilatation and space contraction follow directly from the Lorentz transformations. By the Lorentz transformation,

δx′ = γl(δx − βc δt), δy′ = l δy, δz′ = l δz, δt′ = γl(δt − β δx/c),

it follows that for measurements made on a body at the same moment, δt = 0, in an inertial system moving with a relative velocity β = v/c along the x-axis, the body undergoes contraction by a factor γ⁻¹ when viewed in the unprimed frame when we set l = 1. It is therefore very strange that Poincaré would reintroduce this as a third hypothesis when it is a consequence of Lorentz’s transformation, which he accepts unreservedly. As Poincaré was prone to writing popular articles and books, he may have thought that the contraction of rods was sure to catch the imagination of the layman.

The problem is in the interpretation of what is meant by the second postulate regarding the constancy of light, which is usually interpreted as the velocity of light relative to an observer, whether he be stationary or moving at a velocity v. Thus, instead of obtaining values c + v or c − v for the velocity of light, for an observer moving at ±v relative to the source, one would always ‘measure’ c. A frequency would therefore not undergo a Doppler shift, contrary to what occurs.

According to Einstein’s prescription, the time taken for a light signal to complete a ‘back-and-forth’ journey over a distance d is the arithmetic average of the two one-way times,

\[ t = \frac{1}{2}\, d\left[\frac{1}{c+v} + \frac{1}{c-v}\right] = d\,\frac{c}{c^{2}-v^{2}}. \]

We are thus forced to conclude that instead of obtaining the velocity c, we get the velocity $c(1 - v^2/c^2)$, which differs from the former by the presence of a second-order term, $-v^2/c^2$. Rather, if we use the relativistic velocities (c + v)/(1 + v/c) and (c − v)/(1 − v/c), we obtain

\[ t = \frac{1}{2}\, d\left[\frac{1+v/c}{c+v} + \frac{1-v/c}{c-v}\right] = \frac{d}{c}, \]

and the second-order effect disappears, just as it would in the Michelson–Morley experiment [cf. Sec. 3.2].
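The two averaging formulas can be compared numerically. The following Python sketch is illustrative only: the helper names `compose` and `mean_one_way_time` and the sample values of c, v and d are assumptions made for the example. It evaluates the mean travel time once with the absolute velocities c ± v and once with the relativistically composed velocities.

```python
def compose(u, v, c):
    """Relativistic composition of collinear velocities u and v."""
    return (u + v) / (1.0 + u * v / c**2)

def mean_one_way_time(d, speed_out, speed_back):
    """Arithmetic mean of the outward and return travel times over distance d."""
    return 0.5 * (d / speed_out + d / speed_back)

c, v, d = 1.0, 0.3, 2.0  # illustrative values

# Absolute velocities c + v and c - v.
t_absolute = mean_one_way_time(d, c + v, c - v)
# Relativistic velocities (c + v)/(1 + v/c) and (c - v)/(1 - v/c).
t_relativistic = mean_one_way_time(d, compose(c, v, c), compose(c, -v, c))

print(d / t_absolute, c * (1 - v**2 / c**2))  # effective velocity falls below c
print(t_relativistic, d / c)                  # second-order term is gone
```

With the absolute velocities the effective velocity d/t equals c(1 − v²/c²); with the composed velocities, each of which is just c, the mean time is d/c.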

It is not as Einstein claims: “The quotient [distance by time] is, in agreement with experience, a universal constant c, the velocity of light in empty space.” The ‘experience’ is the transmission of signals back and forth, like those envisioned by Poincaré. In this setting, the ‘principle’ of the constancy of light is untenable [Ives 51].

The velocities of light in the out and back directions, $c_o$ and $c_b$, will, in general, be different. If the distance traversed by the light signal is d, the total time for the outward and backward journey is, according to Einstein,

\[ t = \frac{1}{2}\left(\frac{d}{c_o} + \frac{d}{c_b}\right) = \left(\frac{c_o + c_b}{c_o c_b}\right)\frac{d}{2}. \qquad (1.1.1) \]

But, according to the principle of relativity, there should be no difference in the velocities of light in the outward and backward directions, so that this principle decrees

\[ t = \frac{d}{c}. \qquad (1.1.2) \]

Equating (1.1.1) and (1.1.2) yields [Ives 51]

\[ \frac{c_o + c_b}{2 c_o c_b} = \frac{1}{c}, \]

which can easily be rearranged to read:

\[ \frac{c}{\sqrt{c_o c_b}} = \frac{2\sqrt{c_o c_b}}{c_o + c_b} \le 1. \qquad (1.1.3) \]

The inequality in (1.1.3) follows from the arithmetic-geometric mean inequality, which becomes an equality only when $c_o = c_b = c$. Thus, if there are no superluminal velocities, the latter case must hold, for if not, one of the two velocities, $c_o$ or $c_b$, must be greater than c.
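A small numerical illustration of (1.1.3) may help. The Python sketch below is not from the original text; the function names and the sample one-way velocities are assumptions chosen for the example. It treats c as the harmonic mean of the two one-way velocities, as obtained by equating (1.1.1) and (1.1.2), and compares it with their geometric mean.

```python
import math

def two_way_speed(c_out, c_back):
    """Velocity c fixed by equating (1.1.1) and (1.1.2): the harmonic mean."""
    return 2.0 * c_out * c_back / (c_out + c_back)

def ratio_to_geometric_mean(c_out, c_back):
    """Left-hand side of (1.1.3): c / sqrt(c_out * c_back)."""
    return two_way_speed(c_out, c_back) / math.sqrt(c_out * c_back)

# Isotropic case: equality holds in (1.1.3).
print(ratio_to_geometric_mean(1.0, 1.0))

# Anisotropic case (illustrative values): the ratio drops below 1,
# and the larger one-way velocity exceeds the two-way velocity c.
c_out, c_back = 0.8, 1.3
c = two_way_speed(c_out, c_back)
print(ratio_to_geometric_mean(c_out, c_back), max(c_out, c_back) > c)
```

Because the harmonic mean always lies between the two one-way velocities, any difference between them forces one of the two above c.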

A similar situation occurs for the inhomogeneous dispersion equation of a wave [cf. Sec. 11.5.6],

\[ \omega^{2} = c^{2}\kappa^{2} + \omega_0^{2}, \]

where ω and κ are the frequency and wave number, and $\omega_0$ is the critical frequency below which the wave becomes attenuated. Differentiation of the dispersion equation gives

\[ \omega\, d\omega = c^{2}\kappa\, d\kappa. \]

Introducing the definitions of phase and group velocities, u = ω/κ and w = dω/dκ, it becomes apparent that u > c implies w < c [Brillouin 60]. Since $uw = c^2$, the equivalence of the two velocities requires the critical frequency to vanish and so restores the isotropy of space.
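The relation between the phase and group velocities can be sketched numerically as follows. This Python fragment is an illustration under assumed sample values (κ = 2, ω₀ = 1.5, c = 1); the function name `phase_and_group_velocity` is hypothetical.

```python
import math

def phase_and_group_velocity(kappa, omega_0, c=1.0):
    """Return (u, w) for the dispersion equation omega^2 = c^2 kappa^2 + omega_0^2."""
    omega = math.sqrt(c**2 * kappa**2 + omega_0**2)
    u = omega / kappa          # phase velocity
    w = c**2 * kappa / omega   # group velocity, from omega d(omega) = c^2 kappa d(kappa)
    return u, w

u, w = phase_and_group_velocity(kappa=2.0, omega_0=1.5)
print(u, w, u * w)   # u > c > w, with u * w = c^2

u0, w0 = phase_and_group_velocity(kappa=2.0, omega_0=0.0)
print(u0, w0)        # both equal c when the critical frequency vanishes
```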

Einstein [05] uses absolute velocities to show that two observers traveling at velocities ±v would not find that their clocks are synchronous, while those at rest would declare them so. He considers light emitted at A at time $t_A$ to be reflected at B at time $t_B$, arriving back at A at time $t'_A$. If d is the distance between A and B, the times for the outward and return journeys are

\[ t_B - t_A = \frac{d}{c+v}, \]

and

\[ t'_A - t_B = \frac{d}{c-v}, \]

respectively. Since these are not the same, Einstein concludes that what seems simultaneous from a position at rest is not so when in relative motion. But, in order to do so, Einstein is using absolute velocities: the velocity on the outward journey is c + v, and the velocity of the return journey is c − v, and so he violates his second postulate. If the relativistic law of the composition of velocities is used instead, the total times for the outward and return journeys become the same, which is what is found to within the limits of experimental error [Essen 71].

Einstein then attempts to associate physical phenomena with the fact that clocks in motion run slower than their stationary counterparts, and rods contract when in motion in comparison with identical rods at rest. He considers what is tantamount to the Lorentz transformations, as a rotation through an imaginary angle, θ,

x′ = x cosh θ − ct sinh θ, ct′ = ct cosh θ − x sinh θ,

at the origin of the system in motion so that x′ = 0. He thus obtains

\[ x/t = c\tanh\theta, \qquad t' = t\sqrt{1 - v^{2}/c^{2}} = t/\cosh\theta. \qquad (1.1.4) \]

He then concludes that clocks transported to a point will run slower by an amount $\tfrac{1}{2}tv^2/c^2$ with respect to stationary clocks at that point, which is valid up to second-order terms.
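The second-order statement can be checked against the exact expression in (1.1.4). The Python sketch below is illustrative only; the function names and the sample value v/c = 0.01 are assumptions. It computes the exact lag t − t′ with t′ = t/cosh θ and compares it with ½tv²/c².

```python
import math

def clock_lag(t, v, c=1.0):
    """Exact lag t - t' of the moving clock, with t' = t / cosh(theta)."""
    theta = math.atanh(v / c)
    return t - t / math.cosh(theta)

def clock_lag_second_order(t, v, c=1.0):
    """Second-order approximation (1/2) t v^2 / c^2."""
    return 0.5 * t * v**2 / c**2

t, v = 1.0, 0.01  # illustrative values, c = 1
exact = clock_lag(t, v)
approx = clock_lag_second_order(t, v)
print(exact, approx, exact - approx)  # the difference is of fourth order in v/c
```

For small v/c the two agree closely, and the residual is of fourth order, which is why the approximation is said to hold only up to second-order terms.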

Rather, what Einstein should have noticed is that

\[ \theta = \tanh^{-1}(v/c) = \frac{1}{2}\ln\left(\frac{1+v/c}{1-v/c}\right) \]

is the relative distance in a hyperbolic velocity space whose ‘radius of curvature’ is c. Space and time have lost their separate identities, and only appear in the ratio v = x/t, whose hyperbolic measure is θ. The role of c is that of an absolute constant, whose numerical value will depend on the arbitrary choice of a unit segment. By raising the velocity of light to a universal constant, Einstein implied that the space is no longer Euclidean. Euclidean geometry needs standards of length and time; in this sense Euclidean geometry is relative. In terms of meters and seconds, the speed of light is 3 × 10⁸ m/s. If there were no Bureau of Standards we would have no way of defining what a meter or second is.

Not so in Lobachevskian geometry, where angles determine the sides of the triangle. In Lobachevskian geometry lengths are absolute as well as angles. The ‘radius of curvature’ c is no longer an upper limit to the velocities, but, rather, defines the unit of measurement. Lobachevskian geometries with different values of c will not be congruent. As c approaches infinity, Lobachevskian formulas go over into their Euclidean counterparts.

The exponential distance,

\[ e^{\theta} = \left(\frac{1+v/c}{1-v/c}\right)^{1/2} = \frac{\nu'}{\nu}, \qquad (1.1.5) \]

is the ordinary longitudinal Doppler factor for a shift in the frequency, ν′, due to a source moving at velocity v. In the Euclidean limit, θ ≈ x/ct and (1.1.5) reduces to the usual Doppler formula [Varicak 10]:

ν′ = ν(1 + v/c).

It is undoubtedly for this reason that both Einstein and Planck found non-Euclidean geometries distasteful. For as Planck remarked [98]

It need scarcely be emphasized that this new conception of the idea of time makes the most serious demands upon the capacity of abstraction and projective power of the physicist. It surpasses in boldness everything previously suggested in speculative natural phenomena and even in the philosophical theories of knowledge: non-Euclidean geometry is child’s play in comparison. And, moreover, the principle of relativity, unlike non-Euclidean geometry, which only comes seriously into consideration in pure mathematics, undoubtedly possesses a real physical significance. The revolution introduced by this principle into the physical conceptions of the world is only to be compared in extent and depth with that brought about by the introduction of the Copernican system of the universe.

Prescinding from Planck’s degrading remarks concerning non-Euclidean geometries, we can safely conclude that

the distortion effects due to the spatial contraction and time dilatation of moving objects can be perceived by an observer using a Euclidean metric and clock. To local observers in hyperbolic space, there is no possible way of discerning these distortions because their rulers and clocks shrink or expand with them. All the ‘peculiar consequences’ are based on the issue of ‘frame of reference.’

What is truly tragic is that Poincaré never realized that his models of non-Euclidean geometries were pertinent to relativity. According to Arthur Miller [73]

For a scientist of Poincaré’s talents the awareness of Lorentz’s theory should have been the impetus for the discovery of relativity. Poincaré seemed to have all the requisite concepts for a relativity theory: a discussion of the various null experiments to first and second order accuracy in v/c; a discussion of the role of the speed of light in length measurements; the correct relativistic transformation equations for the electromagnetic field and the charge density; a relativistically invariant action principle; the correct relativistic equation for the addition of velocities; the concept of the Lorentz group; a rudimentary of the four-vector formalism and of four-dimensional space; a correct relativistic kinematics. . . [italics added]

so what went wrong? Miller claims that “his relativity was to be an inductive one with the laws of electromagnetism as the basis of all of physics.” This, according to Miller, prevented him from grasping the “universal applicability of the principle of relativity and therefore the importance of the constancy of the velocity of light in all inertial frames.” In other words, the equations are right but the deductions are wrong. One can deduce what he likes from the equations as long as it is compatible with experiment.

While Miller [81] acknowledges that both Poincaré and Einstein, “simultaneously and independently,” derived the relativistic addition law for velocities, “only Einstein’s view could achieve its full potential.” He further claims that Poincaré never proved “the independence of the velocity of light from its source. . ..” These assertions have no justification at all: Poincaré did not have to prove anything, the velocity addition law negates ballistic theories. It is also not true that “Lorentz’s theory contained special hypotheses for this purpose.” No special hypotheses are needed since the velocity addition law is a direct outcome of the Lorentz transformations. Here is a clear intent to disparage Poincaré.

And where is the experimental verification of Einstein’s theory as opposed to Poincaré’s? Or, maybe, Poincaré just did not go far enough? According to Scribner [64] the whole of the kinematical part of Einstein’s 1905 paper could have been rewritten in terms of aether theory. So according to him, the aether would play the role of the caloric in Carnot’s theory which, by careful use, did not invalidate his results. Carnot never ‘closed’ his cycle for that would have meant equating the heat absorbed at the hot reservoir with the heat rejected at the cold reservoir since, according to caloric theory, heat had to be conserved.

Where Einstein puts into quotation marks “stationary” as opposed to “moving” it does not imply a physical difference because one is relative to the other. Moreover, the distinctions between “real” and “apparent” must likewise be abandoned. If there is no distinction between the two, then why should Einstein have taken exception to Varicak’s remark that Einstein’s “contraction is, so to speak, only a psychological and not a physical fact”? This brought an immediate reaction from Einstein to the effect that Varicak’s note “must not remain unanswered because of the confusion that it could bring about.” After all these years has the confusion been abated?

To condemn Lorentz and Poincaré for their belief in the aether is absurd. The aether was for them what the caloric was for Carnot. But did the caloric invalidate Carnot’s principle? And if Carnot has his principle, why does Poincaré not have his? Carnot’s principle still stands when the scaffolding of caloric theory falls.

Another analogy associates Poincaré with Weber, and Einstein with Maxwell. Weber needed charges as the seat of electrical force, while Maxwell needed the aether as the medium in which his waves propagate. Maxwell’s circuital equations make no reference to charges as the carriers of electricity.

Miller [73] asserts that Poincaré did not realize “in a universal relativity theory the basic role is played by the energy and momentum instead of the force.” But it was Lorentz’s force that was able to bridge Maxwell’s macroscopic field equations with the microscopic world of charges and currents.

It is clear that Poincaré did not want to enter into polemics with Einstein. And Einstein, for his part, admits that his work was preceded by Poincaré. After a critical remark made by Planck on Einstein’s first derivation of $\Delta m = \Delta E/c^2$, to the effect that it is valid to first order only, the following year Einstein [06b] makes another attempt. In this study he proposes to show that this condition is both necessary and sufficient for the law of momentum, which maintains invariant the center of gravity, citing Poincaré’s 1900 paper in the Lorentz Festschrift. He then goes on to say

Although the elementary formal considerations to justify this assertion are already contained essentially in a paper of Poincaré, I have felt, for reasons of clarity, not to avail myself of that paper.

Even though Einstein clearly admits to Poincaré’s priority no one seems to have taken notice of it.

On July the 5th 1909, Mittag-Leffler, editor of Acta Mathematica, writes to Poincaré to solicit a paper on relativity, writing

You know without doubt Minkowski’s Space and Time published after his death, and also the ideas of Einstein and Lorentz on the same problem. Now, Fredholm tells me that you have reached the similar ideas before these other authors in which you express yourself in a less philosophical, but more mathematical, manner. Would you write me a paper on this subject . . . in a comprehensible language that even the simple geometer would understand.

Poincaré never responded.

Then there was Poincaré’s letter of recommendation to Weiss at the ETH where he considers Einstein

as one of the most original minds that I have met. I don’t dare to say that his predictions will be confirmed by experiment, insofar as it will one day be possible.

Notwithstanding, Einstein writes in November 1911 that “Poincaré was in general simply antagonistic.” Relativity was probably just a word to him, since it was he who postulated the ‘principle of relativity.’ But it is true that Poincaré looked to experimental confirmation for his principle. Be that as it may, what is truly incomprehensible is Poincaré’s lack of appreciation of the velocity addition law, for that should have put him on the track of introducing hyperbolic geometry. Then the distortions in space and time could be explained as the distortion we Euclideans observe when looking into another world governed by the axioms of hyperbolic geometry. To the end of his life, Poincaré maintained that Euclidean geometry is the stage where nature enacts her play, and it never once occurred to him that his mathematical investigations would have some role in that enactment.

Now Poincaré was more than familiar with Lorentz’s contraction of electrons when they are in motion. He even added the additional, non-electromagnetic, energy necessary to keep the charge on the surface of the electron from flying off in all directions. The contraction of bodies is likened to the inhabitants of this strange world becoming smaller and smaller as they approach the boundary. The absolute constant needed for such a geometry would be the speed of light which would determine the radius of curvature of this world. In retrospect, it is unbelievable how Poincaré could have missed all this.

It is also said that Poincaré was using the principle of relativity as a fact of nature, to be disproved if there is one experiment that can invalidate it. This is not much different than the second law of thermodynamics. In fact when Kaufmann’s measurements of the specific charge initially tended to favor the Abraham model of the electron [cf. Sec. 5.4.1], Poincaré [54] appears to have lost faith in his principle for

[Kaufmann’s] experiments have given grounds to the Abraham theory. The principle of relativity may well not have been the rigorous value which has been attributed to it.

Kaufmann’s experiments were set up to discriminate between various models proposed for the dependency of the mass of the electron on its speed. And if the Lorentz model had been found wanting, Einstein had much more to lose since his generalization of Lorentz’s electron theory to all of matter would certainly have been its death knell. Einstein had this to say in his Jahrbuch [07] article:

It should also be mentioned that Abraham’s and Bucherer’s theories of the motion of the electron yield curves that are significantly closer to the observed curve than the curve obtained from the theory of relativity. However, the probability that their theories are correct is rather small, in my opinion, because their basic assumptions concerning the dimensions of the moving electron are not suggested by theoretical systems that encompass larger complexes of phenomena.

The last sentence is opaque, for what do the dimensions of a moving electron share with larger complexes of phenomena? And how are both related with Kaufmann’s deflection measurements? Einstein may not have liked Abraham’s model, but Abraham did because, according to him, it was based on common sense. It must be remembered that Lorentz’s theory of the electron was also a model. According to Born and von Laue, Abraham will be remembered for his unflinching belief in “the absolute aether, his field equations, his rigid electron just as a youth loves his first flame, whose memory no later experience can extinguish.”

But how rigid could Abraham’s electron be if the electrostatic energy depended on its contraction when in motion? That is, everyone will agree that “Abraham took his electron to be a rigid spherical shell that maintained its spherical shape once set in motion. . . [yet] a sphere in the unprimed coordinate system becomes, in the primed system, an ellipsoid of revolution” [Cushing 81]. The unprimed system is related to the primed system by a dilation factor, equal to the inverse FitzGerald–Lorentz contraction, which elongates one of the axes into the major axis of the prolate ellipsoid. In the Lorentz model, one of the axes is shortened by the contraction factor so that an oblate ellipsoid results. In fact, as we shall see in Sec. 5.4.4, the models of Abraham and Lorentz are two sides of the same coin, which are related in the same way that hyperbolic geometry is related to elliptic geometry, or a prolate ellipsoid to an oblate ellipsoid.

If we take Einstein’s [Northrop 59] remark:

If you want to find out anything about theoretical physicists, about the methods they use, I advise you to stick closely to one principle: don’t listen to their words, fix your attention on their deeds.

at face value, then according to Einstein’s own admission, there is no difference between the Poincaré–Lorentz theory and his. Whether the mass comes from a specific model of an electron in motion, or from general principles which make no use of the fact that the particle is charged or not, they merge into the exact same formula for the dependence of mass on speed.

1.1.2 Models of the electron

At the beginning of the twentieth century several models of the electron were proposed that were subsequently put to the test by Kaufmann’s experiments involving the deflection of fast moving electrons by electric and magnetic fields. The two prime contenders were the Abraham and Lorentz models. If the mass of the electron were of purely electromagnetic origin, it should fly apart because the negative charges on the surface would repel one another. There is a consensus of opinion that it was for this reason Abraham chose a rigid model of an electron which would not see the accumulation of charge that a deformed sphere would.

Miller [81] contends that Abraham “chose a rigid electron because a deformable one would explode, owing to the enormous repulsive forces between its constituent elements of charge.” Even a spherical electron would prove unstable without some other type of binding forces. In that case, “the electromagnetic foundations would be excluded from the outset,” according to Abraham. In order to calculate the electrostatic energy Abraham needed an expression for the capacitance for an ellipsoid of revolution. This he found in an 1897 paper by Searle. The last thing he had to do was to postulate a dependence of the semimajor axis of revolution upon the relative velocity $\beta = v/c$. ‘Rigid’ though the electron may be, Abraham evaluated the electrostatic energy in the primed system where a sphere of radius $a$ turns into a cigar-shaped prolate ellipsoid with semimajor axis $a/\sqrt{1-\beta^2}$. So Abraham’s rigid electron was not so rigid as he might have thought for the total electromagnetic energy he found was proportional to [Bucherer 04]:

$$\frac{1}{2}\ln\left(\frac{1+\beta}{1-\beta}\right) - \beta.$$

This expression happens to be the difference between the measures of distance in hyperbolic and Euclidean velocity spaces. When the radius of curvature, $c$, becomes infinite, the total electromagnetic energy will vanish, and we return to Euclidean space. So Abraham’s total electromagnetic energy was a measure of the distance into hyperbolic space which depended on the magnitude of the electron’s velocity.
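The vanishing of this expression as the radius of curvature grows can be illustrated with a few lines of code. This is a sketch under stated assumptions: the displayed expression is read here as (1/2) ln((1+β)/(1−β)) − β, that is, the hyperbolic measure artanh β minus the Euclidean measure β, and the function names are hypothetical.

```python
import math


def hyperbolic_distance(beta):
    """Hyperbolic measure of the relative velocity: (1/2) ln((1+b)/(1-b))."""
    return 0.5 * math.log((1.0 + beta) / (1.0 - beta))


def distance_excess(beta):
    """Hyperbolic minus Euclidean measure of the same relative velocity."""
    return hyperbolic_distance(beta) - beta


if __name__ == "__main__":
    v = 1.0  # a fixed speed in arbitrary units (illustrative value)
    for c in (2.0, 10.0, 100.0, 1000.0):
        beta = v / c
        print(f"c = {c:7.1f}  beta = {beta:.4f}  excess = {distance_excess(beta):.3e}")
```

For small β the excess behaves like β³/3, so at fixed v it falls off as the cube of 1/c, which is the numerical counterpart of the return to Euclidean space.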

Abraham’s model fell into disrepute, and even Abraham abandoned it in later editions of his second volume of Theorie der Elektrizität. However his electron turns out to be a cigar-shaped, prolate ellipsoid when in motion, while Lorentz’s was a pancake-shaped, oblate ellipsoid. So the two models were complementary to one another; the former belonging to hyperbolic velocity space while the latter to elliptic velocity space, with the transition between the two being made by ‘inverting’ the semimajor and semiminor axes.

1.1.3 Appropriation of Lorentz’s theory of the electron by relativity

Another historian of science, Russell McCormmach [70], claims that:

Einstein recognized that not only electromagnetic concepts, but the mass and kinetic energy concepts, too, had to be changed. Entirely in keeping with his goal of finding common concepts for mechanics and electromagnetism, he deduced from the electron theory elements of a revised mechanics. In his 1905 paper he showed that all mass, charged or otherwise, varies with motion and satisfies the formulas he derived for the longitudinal and transverse masses of the electron. He also found a new kinetic energy formula applying to electrons and molecules alike. And he argued that no particle, charged or uncharged, can travel at a speed greater than that of light since otherwise its kinetic energy becomes infinite. He first derived these non-Newtonian mechanical conclusions for electrons only. He extended them from electrons to material particles on the grounds that any material particle can be turned into an electron by the addition of charge “no matter how small.” It is curious to speak of adding an indefinitely small charge, since the charge of an electron is finite. Einstein could speak this way because he was concerned solely with the “electromagnetic basis of Lorentzian electrodynamics and optics of moving bodies” [italics added].

The argument that takes us from electrodynamic mass to mass in general is the following. Kaufmann and others have deflected cathode rays by electric and magnetic fields to find the ratio of charge to mass. This ratio was found to change with velocity. If charge is invariant, then it must be the mass in the ratio that increases with the particle’s velocity. These measurements cannot be used to confirm that all the mass of the electron is electromagnetic in nature. The reason is that “Einstein’s theory of relativity shows that mass as such, regardless of its origin, must depend on the velocity in a way described by Lorentz’s formula” [italics added] [Born 62].

In a collection dedicated to Einstein, Dirac [86] in 1980 observed

In one aspect Einstein went much farther than Lorentz, Poincaré and others, namely in assuming that the Lorentz transforms should be applicable in all of physics, and not only in the case of phenomena related to electrodynamics. Any physical force, that may be introduced in the future, must be consistent with Lorentz transforms.

According to J. J. Thomson [28],

Einstein has shown that to conform with the principles of Relativity mass must vary with velocity according to the law $m_0/\sqrt{1 - v^2/c^2}$. This is a test imposed by Relativity on any theory of mass. We see that it is satisfied by the conception that the whole of the mass is electrical in origin, and this conception is the only one yet advanced which gives a physical explanation of the dependence of mass on velocity.

So this would necessarily rule out the existence of neutral matter, and, in fact, this is what Einstein [05] says when he remarks that charge “no matter how small” can be added to any ponderable body.

The dependencies of mass upon motion arose from the assumption that bodies underwent contraction in the direction of their motion. This follows directly from the nature of the Lorentz transformation. From the geometry of the body one could determine the energy, $W$, and momentum, $G$, since the two are related by

$$dW = v\,dG,$$

in a single dimension. Then since $G = mv$, the expression for the increment in the energy becomes

$$dW = v^2\,dm + mv\,dv.$$

Introducing $dW = c^2\,dm$, and integrating lead to

$$\frac{m}{m_0} = \frac{1}{\sqrt{1 - \beta^2}}, \tag{1.1.6}$$

where $m_0$ is a constant of integration, and $\beta = v/c$, the relative velocity.

Expression (1.1.6) was derived by Gilbert N. Lewis in 1908. The same proof was adopted by Philipp Lenard, a staunch anti-relativist, in his Über Aether und Uräther who attributes it to Hasenöhrl’s [09] derivation of radiation pressure. The only verification of a dependency of mass upon velocity at that time was Kaufmann’s experiments on canal rays. Kaufmann was able to measure the ratio $e/m$, and assuming that the charge is constant, all the variation of this ratio must be attributed to the mass.
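The integration that leads to (1.1.6) can be checked numerically. Equating $c^2\,dm$ with $v^2\,dm + mv\,dv$ gives $dm/d\beta = m\beta/(1-\beta^2)$, and the sketch below integrates this equation from rest with a classical fourth-order Runge–Kutta step. The function name `mass_ratio_numeric`, the step count and the normalization $m_0 = 1$ are assumptions made for illustration; this is not Lewis’s own procedure.

```python
import math


def mass_ratio_numeric(beta_final, steps=2000):
    """Integrate dm/dbeta = m * beta / (1 - beta**2) from beta = 0, m = 1 (RK4)."""

    def rhs(beta, m):
        return m * beta / (1.0 - beta * beta)

    h = beta_final / steps
    beta, m = 0.0, 1.0
    for _ in range(steps):
        k1 = rhs(beta, m)
        k2 = rhs(beta + h / 2, m + h * k1 / 2)
        k3 = rhs(beta + h / 2, m + h * k2 / 2)
        k4 = rhs(beta + h, m + h * k3)
        m += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        beta += h
    return m


def mass_ratio_closed_form(beta):
    """Expression (1.1.6): m/m0 = 1 / sqrt(1 - beta**2)."""
    return 1.0 / math.sqrt(1.0 - beta * beta)


if __name__ == "__main__":
    for beta in (0.1, 0.5, 0.9):
        print(beta, mass_ratio_numeric(beta), mass_ratio_closed_form(beta))
```

The two columns should agree to many digits for velocities that are not too close to that of light, where the right-hand side of the differential equation becomes singular.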

The mass of the negative particle contains both electromagnetic and non-electromagnetic contributions. However, Lewis contended that whatever its origin, mass remains mass so that “it matters not what the supposed origin of this mass may be. Equation (1.1.6) should therefore be directly applicable to the experiments of Kaufmann.” But an accelerating electron radiates, and the radiative force is missing from $dG$. This did not trouble Lewis, and he went on to compare the observed value of the relative velocity with that calculated from (1.1.6). His results are given in the following table.

m/m0    β (observed)    β (calculated)
1       0               0
1.34    0.73            0.67
1.37    0.75            0.69
1.42    0.78            0.71
1.47    0.80            0.73
1.54    0.83            0.76
1.65    0.86            0.80
1.73    0.88            0.82
2.05    0.93            0.88
2.14    0.95            0.89
2.42    0.96            0.91
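The calculated column can be reproduced by inverting (1.1.6), which gives β = √(1 − (m0/m)²). The sketch below does this for the tabulated mass ratios; the list `LEWIS_TABLE` simply transcribes the table above, and the function name is an assumption made for illustration.

```python
import math

# (m/m0, beta observed, beta calculated) transcribed from the table above.
LEWIS_TABLE = [
    (1.00, 0.00, 0.00),
    (1.34, 0.73, 0.67),
    (1.37, 0.75, 0.69),
    (1.42, 0.78, 0.71),
    (1.47, 0.80, 0.73),
    (1.54, 0.83, 0.76),
    (1.65, 0.86, 0.80),
    (1.73, 0.88, 0.82),
    (2.05, 0.93, 0.88),
    (2.14, 0.95, 0.89),
    (2.42, 0.96, 0.91),
]


def beta_from_mass_ratio(ratio):
    """Invert (1.1.6): beta = sqrt(1 - 1 / ratio**2)."""
    return math.sqrt(1.0 - 1.0 / ratio ** 2)


if __name__ == "__main__":
    print("m/m0   observed   tabulated   recomputed")
    for ratio, observed, tabulated in LEWIS_TABLE:
        recomputed = beta_from_mass_ratio(ratio)
        print(f"{ratio:4.2f}   {observed:8.2f}   {tabulated:9.2f}   {recomputed:10.4f}")
```

The recomputed values agree with the tabulated ones to within about one unit in the second decimal place, which is the precision to which the mass ratios themselves are quoted.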

Although the calculated and observed values of the relative velocities follow the same monotonic trend, the latter are about 6–8% larger. Lewis believed that this was within the limits of experimental error in Kaufmann’s experiments. While Kaufmann claimed a higher degree of accuracy is necessary, Lewis believed that

notwithstanding the extreme care and delicacy with which the observations are made, it seems almost incredible that measurements of this character, which consisted in the determination of the minute displacement of a somewhat hazy spot on a photographic plate, could have been determined with the precision claimed.

So what is Lewis comparing his results to?

Kaufmann’s initial results agreed better with the expression,

$$\frac{m}{m_0} = \frac{3}{4}\,\frac{1}{\beta^2}\left(\frac{1+\beta^2}{2\beta}\ln\frac{1+\beta}{1-\beta} - 1\right),$$

derived from Abraham’s model rather than (1.1.6), which coincides with the Lorentz model, but which has been “derived from strikingly different principles.” Why neutral matter should be subject to the deflection by the electromagnetic fields in Kaufmann’s set-up is not broached. But, Lewis considers that the mass of a positively charged particle emanating from a radioactive source would be a good test-particle because it consists of mainly ‘ponderable’ matter with a very small ‘electromagnetic’ mass.
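The two competing mass formulas can be compared side by side. The sketch below assumes that the Abraham expression is (3/4)(1/β²)[((1+β²)/(2β)) ln((1+β)/(1−β)) − 1], as read from the garbled display above, and that the Lorentz expression is (1.1.6); the function names are hypothetical.

```python
import math


def lorentz_mass_ratio(beta):
    """Expression (1.1.6): m/m0 = 1 / sqrt(1 - beta**2)."""
    return 1.0 / math.sqrt(1.0 - beta * beta)


def abraham_mass_ratio(beta):
    """Abraham's expression for m/m0, valid for 0 < beta < 1."""
    log_term = math.log((1.0 + beta) / (1.0 - beta))
    bracket = (1.0 + beta ** 2) / (2.0 * beta) * log_term - 1.0
    return 0.75 / beta ** 2 * bracket


if __name__ == "__main__":
    print("beta   Abraham   Lorentz")
    for beta in (0.10, 0.50, 0.73, 0.86, 0.96):
        print(f"{beta:4.2f}   {abraham_mass_ratio(beta):7.4f}   {lorentz_mass_ratio(beta):7.4f}")
```

At the first observed velocity of the table, β = 0.73, the Abraham expression gives a ratio of about 1.34 and the Lorentz expression about 1.46, which is consistent with the statement that Kaufmann’s initial results favored Abraham. Both expressions reduce to unity as β tends to zero.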

Lewis believed that his non-Newtonian mechanics revived the particle nature of light. From the fact that the mass, according to (1.1.6), becomes infinite as the velocity approaches that of light, it follows that “a beam of light has mass, momentum and energy, and is traveling at the velocity of light would have no energy, momentum, or mass if it were at rest. . . .” This is almost two decades before Lewis [26] was to coin the name ‘photon’ in a paper entitled “The conservation of photons.” The paper was quickly forgotten, but the name stuck.

1.2 Physicists versus Mathematicians

In attempting to unravel the priority rights to the unification of light and electricity we can appreciate a remarkable confluence of physicists and mathematicians in one single arena that was never to repeat itself. On the physics side there were André-Marie Ampère, Ludwig Boltzmann, Rudolf Clausius, Michael Faraday, Hermann von Helmholtz, James Clerk Maxwell, and Wilhelm Weber, while on the mathematics side there were Carl Friedrich Gauss and Bernhard Riemann, and those that should have been there, but were not: János Bolyai and Nicolai Ivanovitch Lobachevsky.

To Ampère credit must go for the fall of the universal validity of Newton’s inverse square law as a means by which particles interact with one another at a distance. Today, Ampère is remembered as a unit, rather than as the discoverer of that law, and contemporary treatises on electromagnetism present the alternative formulation of Jean-Baptiste Biot and Félix Savart. Although both laws of force coincide when the circuit is closed, they differ on the values that the force takes between two elements of current when open. That the interaction of persisting direct (galvanic) currents needed an angular-dependent force was loathed and scorned. Surely, magnetism cannot be the result of the motion of charged particles. Odd as it may seem, like many of the French physics community, Biot rejected Ampère’s discovery outright.

Since the angular dependencies vanish when electric currents appear in complete circuits, it seemed as extra baggage to many, including Maxwell, who reasoned in continuous fields which could store energy and media (i.e. the aether) in which waves could propagate. Yet, it was Ampère’s attempt that would initiate a search for a molecular understanding of what electricity is and how it works.

1.2.1 Gauss’s lost discoveries

It may take very long before I make public my investigations on this issue; in fact, this may not happen in my lifetime for I fear the ‘clamor of the Boeotians.’
Gauss in a letter to Bessel in 1829 on his newly discovered geometry.

Gauss’s seal was a tree but with only seven fruits; his motto read “few, but ripe.” Such was, in effect, an appraisal of Gauss’s scientific accomplishments. Gauss had an aversion for debate, and, probably, a psychological problem of being criticized by people inferior to him, like the Boeotians of Greece who were dull and ignorant.

Ampère’s discovery would have finished in oblivion had it not caught the eye of Gauss. By 1828 Gauss was resolved to test Ampère’s angle law when he came into contact with a young physicist, Wilhelm Weber. With no surprise, Weber was offered a professorship at Göttingen three years later, and an intense collaboration between the two began. According to his 1846 monograph, Weber was out to measure a force of one current on the other. This was something not contemplated by Ampère who was satisfied with making static, or what he called ‘equilibrium,’ measurements.

When Weber was ready to present his results, he shied away from a discussion of the angular force because he knew it would cause commotion. A letter from Gauss persuaded him otherwise, and insisted that further progress was needed to find a “constructible representation of how the propagation of the electrodynamic interaction occurs.”

Weber accepted Fechner’s model in which opposite charges are moving in opposite directions, and interpreted Ampère’s angular force in terms of the force arising from relative motion, depending not only on their relative velocities but also on their accelerations. In so doing, Weber can thus be considered to be the first relativist! The anomaly in Ampère’s law, where there appears a diminution of the force at a certain angle, now appeared as a diminution of the force at a certain speed. That constant later became known as Weber’s constant, and in a series of experiments carried out with Rudolf Kohlrausch it was found to be the speed of light, increased by a factor of the square root of 2. Present at these experiments was Riemann, and Riemann was later to present his own ideas on the matter.
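The role of Weber’s constant as the speed at which the force is diminished to nothing can be illustrated with the velocity- and acceleration-dependent factor that multiplies the Coulomb force in the usual modern statement of Weber’s law. That form, 1 − ṙ²/c_W² + 2 r r̈/c_W² with c_W = √2 c, is an assumption brought in here for illustration and is not written out in the text; the names below are hypothetical.

```python
import math

C_LIGHT = 299_792_458.0                   # speed of light in m/s
WEBER_CONSTANT = math.sqrt(2) * C_LIGHT   # Weber's constant, sqrt(2) times c


def weber_factor(r, r_dot, r_ddot):
    """Factor multiplying the Coulomb force in the assumed form of Weber's law.

    r      : separation of the two charges
    r_dot  : relative radial velocity
    r_ddot : relative radial acceleration
    """
    cw2 = WEBER_CONSTANT ** 2
    return 1.0 - r_dot ** 2 / cw2 + 2.0 * r * r_ddot / cw2


if __name__ == "__main__":
    # Charges at relative rest: the pure Coulomb force is recovered.
    print(weber_factor(1.0, 0.0, 0.0))
    # Uniform relative motion at Weber's constant: the force vanishes.
    print(weber_factor(1.0, WEBER_CONSTANT, 0.0))
```

In this form the force between two charges in uniform relative motion is reduced as the relative speed grows and vanishes when that speed equals Weber’s constant.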

In the 1858 paper, “A contribution to electrodynamics,” that was read but not published until after Riemann’s death, Riemann states

I have found that the electrodynamic actions of galvanic currents may be explained by assuming that the action of one electrical mass on the rest is not instantaneous, but is propagated to them with a constant velocity which, within the limits of observation, is equal to that of light.

Although he errs in referring to $\phi = -4\pi\rho$ as Poisson’s law, instead of $\nabla^2\phi = -4\pi\rho$, Riemann surely did not merit the wrath that Clausius bestowed upon him. Riemann proposes a law of force similar to that of Weber, where the accelerations along the radial coordinate connecting the two particles are replaced by the accelerations projected onto the coordinate axes, and advocates the use of retarded potentials instead of a scalar potential.

In his Treatise, Maxwell cites Clausius’s criticisms as proof of the unsoundness of Riemann’s paper. Surely, Maxwell had no need of Clausius’s help, so it was probably used to avoid direct criticism. Moreover, Clausius’s criticisms are completely unfounded, and what Maxwell found wanting in Weber’s electrokinetic potential actually applies to Clausius’s expression. Whereas Clausius had some grounds for his priority dispute with Kelvin when it came to the second law, here he has none.

Weber’s formulation, which today is all but forgotten, held sway in Germany until Heinrich Hertz [93], Helmholtz’s former assistant, verified experimentally the propagation of electromagnetic waves and showed that they had all the characteristics of light. Helmholtz then crowned Maxwell’s theory, and went even a step further by generalizing it to include longitudinal waves, if ever there would be a need of them [cf. Sec. 11.5.5].

Gauss played a fundamental role in bridging the transition from Ampère to Weber. Moreover, Maxwell’s formulation of a wave equation, from his circuit equations, in which electromagnetic disturbances propagate at the speed of light, was undoubtedly what Gauss would have regarded as an oversimplification of the problem. The complexity of the interactions in Ampère’s hypothesis persuaded him that it was not as simple as writing down a wave equation for a wave propagating at the speed of light. This will not be the only time Gauss loses out on a fundamental discovery.

Gauss’s letters are more telling than his publications, and if it had not been for his reluctance to publish he would have certainly been the discoverer of what we now know as hyperbolic geometry. Gauss wrote another famous letter, this time to Taurinus in 1824, again reluctant to publish his findings. This is what he said:

. . . that the sum of the angles cannot be less than 180°; this is the critical point, the reef on which all the wrecks occur. . . I have pondered it for over thirty years, and I do not believe that anyone can have given it more thought. . . than I, though I have never published anything on it. The assumption that the sum of three angles is less than 180° leads to a curious geometry, quite different from ours (the Euclidean), but thoroughly consistent. . .

Gauss is, in fact, referring to hyperbolic geometry, and it is another of his lost discoveries. The credit went instead to Bolyai junior and Lobachevsky. In 1831, Gauss was moved to publish his findings, as it appears in a letter to Schumacher:

I have begun to write down during the last few weeks some of my own meditations, a part of which I have never previously put in writing, so that already I have had to think it all through anew three or four times. But I wished this not to perish with me.

But it was too late: before Gauss could finish his paper, a copy of Bolyai’s Appendix arrived.

Gauss’s reply to Wolfgang Bolyai senior unveils his disappointment:

If I commenced by saying that I am unable to praise this work, you would certainly be surprised for a moment. But I cannot say otherwise. To praise it, would be to praise myself. Indeed the whole contents of the work, the path taken by your son, the results to which he is led, coincide almost entirely with my meditations, which have occupied my mind partly for the last thirty or thirty-five years. So I remained quite stupefied. . . it was my idea to write down all this later so that at least it should not perish with me. It is therefore a pleasant surprise for me that I am spared the trouble, and I am very glad that it is just the son of my old friend, who takes precedence of me in such a remarkable manner.

Even more mysterious is why Gauss failed to help the younger Bolyai gain recognition for his work. Was it out of jealousy or Gauss’s extreme prudence?

Another person who was looking to the stars for confirmation that two intersecting lines can be parallel to another line was Lobachevsky. He, like Gauss, considered geometry to have the same status as electrodynamics, that is, a science founded on experimental fact. Lobachevsky fully realized that deviations from Euclidean geometry would be exceedingly small, and, therefore, would need astronomical observations. Just as Gauss attempted to measure the angles of a triangle formed by three mountaintops, Lobachevsky claimed that astronomical distances would be necessary to show that the sum of the angles of a triangle was less than two right angles.

In 1831 Gauss deduced from the axiom that two lines through a given point can be parallel to a third line that the circumference of a circle is $2\pi R \sinh(r/R)$, where $R$ is an absolute constant. By simply replacing $R$ by $iR$, he obtained $2\pi R \sin(r/R)$, or the circumference of a circle of radius $r$ on the sphere. The former will be crucial to the geometrical interpretation of the uniformly rotating disc that had occupied so much of Einstein’s thoughts. And we will see in Sec. 9.11 that Gauss’s expression for the hyperbolic circumference is what modern cosmologists confuse with the expansion factor of the universe.
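The substitution of $iR$ for $R$ can be verified with complex arithmetic. The following is a minimal sketch; the function name `circumference` and the sample values of $r$ and $R$ are assumptions made for illustration.

```python
import cmath
import math


def circumference(r, R):
    """Gauss's circumference 2*pi*R*sinh(r/R); R may be real or imaginary."""
    return 2.0 * cmath.pi * R * cmath.sinh(r / R)


if __name__ == "__main__":
    r, R = 0.5, 2.0
    hyperbolic = circumference(r, R)        # circle in the hyperbolic plane
    spherical = circumference(r, 1j * R)    # same formula with R replaced by iR
    print("hyperbolic:", hyperbolic.real)
    print("Euclidean :", 2.0 * math.pi * r)
    print("spherical :", spherical.real, "expected", 2.0 * math.pi * R * math.sin(r / R))
```

For the same radius the hyperbolic circumference exceeds the Euclidean value 2πr, while the spherical one falls short of it, and all three coincide in the limit of infinite R.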

The first person to show that there was a complete correspondence between circular and hyperbolic functions was Taurinus in 1826, who was in Gauss’s small list of correspondents on geometrical matters. Although this lent credibility to hyperbolic geometry, neither Taurinus nor Gauss felt confident hyperbolic geometry was self-consistent. In 1827 Gauss came within a hair’s breadth of what would later be known as the Gauss–Bonnet theorem. This theorem shows that the surfaces of negative curvature produce a geometry in which the angular defect is proportional to the area. Gauss was cognizant that a pseudosphere was such a surface, and Gauss’s student Minding later showed that hyperbolic formulas for triangles are valid on the pseudosphere. But, a pseudosphere is not a plane, like the Euclidean plane, because it is infinite only in one direction. The extension of the pseudosphere to a real hyperbolic plane came much later with Eugenio Beltrami’s exposition in 1868. So it was not clear to Gauss and his associates what this new geometry was, and, if, in fact, it was logically consistent.

Gauss dabbled in many areas of physics and mathematics, and it would appear that his interests in electricity and non-Euclidean geometries are entirely disjoint. Who would have thought that these two lost discoveries might be connected in some way? Surely Poincaré did not and it is even more incredible because he developed two models of hyperbolic geometry that would have made the handwriting on the wall unmistakable to read.

1.2.2 Poincaré’s missed opportunities

Jules-Henri Poincaré began his career as a mathematician, and, undoubtedly, became interested in physics because of the courses he gave at the Sorbonne. Poincaré was not a geometer by trade, but made a miraculous discovery that the Bolyai–Lobachevsky geometry which the geometers, Beltrami and Klein, were trying to construct already existed in mainstream mathematics [Stillwell 96]. The tragedy is that he failed to see that what he called a Fuchsian group was the same type of transform that Lorentz was using in relativity, and that he would be commenting on the latter without any recognition of the former.

1.2.2.1 From Fuchsian groups to Lorentz transforms

Poincaré’s first encounter with hyperbolic geometry came when he was trying to understand the periodicity occurring in solutions to particular differential equations. The single periodicities of trigonometric functions were well-known, and so too the double periodicities of elliptic functions. Double periodicity can be best characterized by tessellations consisting of parallelograms in the complex Euclidean plane whose vertices are multiples of the doubly periodic points.

Poincaré found a new type of periodic function, which he called ‘Fuchsian,’ after the mathematician Lazarus Immanuel Fuchs who first discovered them.ᵈ The periodic function is invariant under a group of substitutions of the form

$$z \mapsto \frac{az + b}{cz + d}, \tag{1.2.1}$$

for which $ad - bc \neq 0$, for otherwise it would result in a lackluster constant mapping. Poincaré wanted to study this group of transformations by the same type of tessellations by which elliptic functions could be characterized in the complex Euclidean plane. Only now the tessellation consists of curvilinear triangles in a disc, shown in Fig. 1.1, which Poincaré obtained from earlier work by Schwarz in 1872. The curvilinear triangles form right-angled pentagons which are mapped onto themselves by the linear fractional transformation, (1.2.1). As Poincaré tells us

Just at the time I left Caen, where I was living, to go on a geological excursion . . . we entered an omnibus to go some place or other. At the moment I put my foot on the step the idea came to me, without anything in my former thoughts seeming to have paved the way for it, that the transformations I had used to define Fuchsian functions were identical with those of non-Euclidean geometry.

The linear fractional transformations, (1.2.1), can be used to define a new concept of length for which the cells of the tessellation are all of equal size. The resulting geometry is precisely that of Bolyai–Lobachevsky which, through Klein’s renaming in 1871, has come to be known as hyperbolic geometry.

If c = b and d = a, then the fractional linear transformation (1.2.1) becomes the distance-preserving and orientation-preserving map, with a² − b² = 1, of Poincaré’s conformal disc model of the hyperbolic plane, the D²-isometries. What Poincaré failed to realize is that, by interpreting z as a velocity, the linear fractional transformation (1.2.1), with a = cosh ψ and b = sinh ψ, becomes precisely the transformation he named in honor of Lorentz, where the sides of any curvilinear triangle in Fig. 1.1 are proportional to the hyperbolic measures of the three velocities in three different reference frames. Had Poincaré recognized this, it would have changed his mind about the ‘convenience’ of Euclidean geometry, and would have brought hyperbolic geometry into mainstream relativity.

Fig. 1.1. A tiling of the hyperbolic plane by curvilinear triangles that form right-angled pentagons.

ᵈ After Klein informed Poincaré in May 1880 that there were groups of linear fractional transformations other than those of Fuchs, Poincaré named them ‘groupes kleinéens,’ to the chagrin of Klein.

That is, given three bodies moving with velocities u₁, u₂ and u₃, the corresponding triangle with curvilinear sides has as its vertices the points u₁, u₂ and u₃. The relative velocities correspond to the sides of the triangle, and the angles between the velocities add up to something less than two right angles. It should also be appreciated that the square of the relative velocity is invariant under (1.2.1). Suppose that w is a relative velocity formed from the composition of u and v; then, if these velocities are replaced by the velocities u′ and v′ relative to some other frame, the value of w will be unaffected by the change. In other words, the square of the relative velocity w is invariant under a Lorentz transformation.
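The invariance claimed above can be checked numerically in the simplest setting. The following is a minimal sketch, assuming collinear velocities measured in units of c (so that z is real with |z| < 1) and writing the parameter of the transformation as `psi`; the function names `boost` and `relative_velocity` are illustrative and do not come from the text.

```python
import math

def boost(z, psi):
    """Apply (1.2.1) with a = d = cosh(psi) and b = c = sinh(psi).

    z is a collinear velocity in units of c (an assumption of this sketch).
    """
    a, b = math.cosh(psi), math.sinh(psi)
    return (a * z + b) / (b * z + a)

def relative_velocity(u, v):
    """Relativistic relative velocity of two collinear velocities (units of c)."""
    return (u - v) / (1.0 - u * v)

u, v = 0.6, 0.3   # two velocities measured in one frame
psi = 0.8         # arbitrary parameter of the transformation

# The same two bodies as seen from another frame.
u_new, v_new = boost(u, psi), boost(v, psi)

w_before = relative_velocity(u, v)
w_after = relative_velocity(u_new, v_new)

# The transformation shifts the hyperbolic measure artanh(z) by psi.
shifted = math.tanh(math.atanh(u) + psi)

print(f"w before = {w_before:.12f}, w after = {w_after:.12f}")
print(f"boost(u) = {u_new:.12f}, tanh(artanh(u) + psi) = {shifted:.12f}")
```

In this collinear case the transformation is simply a translation of the hyperbolic measure, which is why the relative velocity, a function of the difference of the two measures, is left unchanged.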


However, it never dawned on Poincaré that these curvilinear triangles might be relativistic velocity triangles, for he kept mathematics and physics well separated in his mind. For he considered

. . . the axioms of geometry . . . are only definitions in disguise. What then are we to think of the question: Is Euclidean geometry true? We might as well ask if the metric system is true and if the old weights and measures are false; if Cartesian coordinates are true and polar coordinates false. One geometry cannot be more true than another: it can only be more convenient.

Convenience was certainly not the answer.ᵉ

1.2.2.2 An author of E = mc²

Unquestionably the most famous formula in all of physics, its origins lie elsewhere than in Einstein’s [05b] paper “Does the inertia of a body depend upon its energy content?” John Henry Poynting [07] derived a relation between energy and mass from the radiation pressure around the turn of the twentieth century. Friedrich Hassenöhrl [04] obtained the effective mass of blackbody radiation as (4/3)ε/c², where ε = hν. The same factor of 4/3 was found by Comstock [08] from his electromagnetic analysis, and represents the sum of the energy and the work done by compression, the latter being equal to one-third of the energy in the ultrarelativistic limit. The sum of the two quantities is the enthalpy, as was first clearly stated by Planck [07], so in Einstein’s title ‘heat content,’ or enthalpy, should replace ‘energy content.’

Once again we find evidence of Poincaré’s priority in the derivation of the famous formula, and, as we have mentioned, Einstein’s recognition of it [cf. p. 23]. In the second edition of his text, Électricité et Optique, Poincaré [01] treats the problem of the recoil due to a body’s radiation. He considers the emission of radiation in a single direction, and in order to keep the center of gravity fixed, the body recoils like an ‘artillery cannon’ (pièce d’artillerie). According to the theory of Lorentz, the amount of the recoil will not be negligible. Suppose, says Poincaré, that the artillery piece has a mass of 1 kg, and the radiation that is sent in one direction at the velocity of light has an energy of three million Joules. Then, according to Poincaré, it will recoil with a velocity of 1 cm/sec.

ᵉ Strangely, we find Einstein [22b] uttering the same words: “For if contradictions between theory and experience manifest themselves, we should rather decide to change physical laws than to change axiomatic Euclidean geometry.”

Actually, the relation between ‘electromagnetic momentum’ and Poynting’s vector appears in an 1895 paper by Lorentz, which was commented and elaborated upon by Poincaré [00] in 1900. He derives the relation between the momentum density, G, and the energy flux, S, as

$$ G = S/c^2. \tag{1.2.2} $$

Even earlier, in 1893, J. J. Thomson refers to ‘the momentum’ arising from the motion of his Faraday tubes. It is only later that Abraham [03] introduced the term ‘electromagnetic momentum.’ Pauli [58] unjustly attributes (1.2.2) to Planck [07] as a theorem regarding the equivalence between momentum density and the energy flux density. According to Pauli,

This theorem can be considered as an extended version of the principle of the equivalence of mass and energy. Whereas the principle only refers to the total energy, the theorem has also something to say on the localization of momentum and energy.

Since the magnitude of the energy flux is S = Ec, (1.2.2) becomes

$$ mv = E/c. $$

Then introducing m = 10³ grams, E = 3 × 10¹³ ergs, and c = 3 × 10¹⁰ cm/sec, Poincaré finds v = 1 cm/sec for the recoil speed. Thus, Poincaré derived E = Gc, and if G is the momentum of radiation, G = mc, so that m = E/c² is the mass equivalent to the energy of radiation.

Poincaré was infatuated with the breakdown of Newton’s third law, the equality between action and reaction, in his new mechanics. In a follow-up paper entitled “The theory of Lorentz and the principle of reaction,” Poincaré [00] considers electromagnetic energy as a ‘fictitious fluid’ (fluide fictif) with a mass E/c². The corresponding momentum is the mass of this fluid times c. Since the mass of this fictitious fluid was ‘destructible,’ for it could reappear in other guises, he could not identify the fictitious fluid with a real fluid. What Poincaré could not rationalize became ‘fictitious’ to him.
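The recoil figures quoted above from Poincaré’s artillery-piece example are easy to verify. The sketch below simply evaluates mv = E/c and m = E/c² in CGS units with the rounded values given in the text; the function and variable names are illustrative only.

```python
def recoil_speed(energy_erg, mass_g, c_cm_s):
    """Recoil speed v obtained from m v = E / c, in cm/sec."""
    return energy_erg / (mass_g * c_cm_s)

def mass_equivalent(energy_erg, c_cm_s):
    """Mass equivalent m = E / c^2 of the radiated energy, in grams."""
    return energy_erg / c_cm_s**2

mass_g = 1.0e3        # the 1 kg artillery piece
energy_erg = 3.0e13   # three million Joules expressed in ergs
c_cm_s = 3.0e10       # rounded speed of light used in the text

v = recoil_speed(energy_erg, mass_g, c_cm_s)
m_rad = mass_equivalent(energy_erg, c_cm_s)

print(f"recoil speed = {v:.3f} cm/sec")
print(f"mass equivalent of the radiation = {m_rad:.3e} g")
```

The second number shows how small the ‘fictitious’ mass is compared with the kilogram that recoils, which is why the recoil speed is so modest.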

The lack of conservation of the fictitious mass prevented Poincaré from identifying it with real mass, which had to be conserved under all circumstances. What is conserved, however, is the inertia associated with the radiation that has produced the recoil of the artillery cannon. It is the difference between the initial mass and what is radiated that is equal to the change in energy of the system. Ives [52] showed that

$$ m - m' = \Delta m = E/c^2, \tag{1.2.3} $$

where m′ is the mass after radiation, and E/c² is the mass of the radiant energy, which follows directly from Poincaré’s 1904 relativity principle.

The difference between the Doppler shift in the frequency due to a source moving toward and away from a fixed observer is, with β = v/c,

$$ \Delta\nu = \tfrac{1}{2}\,\nu\left[\left(\frac{1+\beta}{1-\beta}\right)^{1/2} - \left(\frac{1-\beta}{1+\beta}\right)^{1/2}\right] = \frac{\nu\, v/c}{\sqrt{1-\beta^2}}. \tag{1.2.4} $$

The frequency shift becomes a nonlinear function of the velocity, just like the expression for the relativistic momentum. But here there is no mass present!

The relation between frequency and energy was known at the time; it is given by Planck’s law, E = hν, so that (1.2.4) could be written as

$$ h\,\Delta\nu/c = \frac{E\,v}{c^2\sqrt{1-\beta^2}} = G, $$

where G is the momentum imparted to the artillery piece due to recoil. It is given by

$$ G = \frac{\Delta m}{\sqrt{1-\beta^2}}\, v, $$

if (1.2.3) holds. The derivation is thus split into two parts: a nonrelativistic relation between mass and energy, (1.2.3), which depends only on the central frequency, ν, and a relativistic part that relates the size of the shift to the velocity, according to (1.2.4). It is through the difference in the Doppler shifts that the momentum acquires a nonlinear dependence upon the velocity,

$$ \tfrac{1}{2}\left(e^{\bar{v}/c} - e^{-\bar{v}/c}\right) = \sinh(\bar{v}/c) = \frac{\beta}{\sqrt{1-\beta^2}}, \tag{1.2.5} $$

where v̄ is the hyperbolic measure of the velocity whose Euclidean measure is v. Equation (1.2.5) also indicates that c is the absolute constant of velocity space. If we multiply (1.2.5) through by πc, it becomes Gauss’s expression for the semi-perimeter of a non-Euclidean circle of radius v̄, and absolute constant c, that he wrote in a letter to Schumacher in 1831 [cf. Eq. (9.11.24)].
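The chain of equalities in (1.2.4) and (1.2.5) can be confirmed numerically. The sketch below assumes the identification tanh(v̄/c) = β, which is equivalent to (1.2.5), and compares the three expressions for a few sample values of β; the function names are illustrative.

```python
import math

def half_doppler_difference(beta):
    """Bracketed term of (1.2.4): half the difference of the approaching
    and receding Doppler factors."""
    approaching = math.sqrt((1.0 + beta) / (1.0 - beta))
    receding = math.sqrt((1.0 - beta) / (1.0 + beta))
    return 0.5 * (approaching - receding)

def hyperbolic_measure(beta):
    """Hyperbolic measure of the velocity in units of c, artanh(beta)."""
    return math.atanh(beta)

for beta in (0.01, 0.5, 0.9):
    doppler = half_doppler_difference(beta)
    sinh_form = math.sinh(hyperbolic_measure(beta))
    momentum_form = beta / math.sqrt(1.0 - beta**2)
    print(f"beta = {beta}: {doppler:.12f} {sinh_form:.12f} {momentum_form:.12f}")
```

The three columns agree for every β, and the approaching Doppler factor itself equals the exponential of the hyperbolic measure, which is what allows the difference of the two shifts to be written as a hyperbolic sine.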

Where is the mass dependence on velocity?

The Doppler shifts refer to a shift in frequency, the frequency is related to an energy, and the energy is related to mass; that is, the mass equivalent of radiation. In fact, the nonlinear dependence attributed to mass on its speed, (1.2.4), can be obtained without mentioning mass at all!

Poincaré was ever so close to developing a true theory of relativity, but ultimately could not break loose of the classical bonds which held him. It is an even greater tragedy that he could not bridge the gap between his mathematical studies on non-Euclidean geometries and relativity, which could have unified his lifelong achievements.

1.3 Exclusion of Non-Euclidean Geometries from Relativity

Neither Whittaker nor Pais gave any reference to the potential role that non-Euclidean geometries could have played in relativity. Pais pays little tribute to Hermann Minkowski other than saying that Einstein had a change of heart: rather than considering the transcription of his theory into tensorial form as ‘superfluous learnedness’ (überflüssige Gelehrsamkeit), he later claimed it was essential in order to bridge the gap from his special to general theories.

Minkowski, in his November 1907 address to the Göttingen Mathematical Society, began with the words “The world in space and time is, in a certain sense, a four-dimensional non-Euclidean manifold” [cf. p. 37]. The hyperboloid of space-time, invariant under the Lorentz transform, was identified as a pseudosphere of imaginary radius, or a surface of negative, constant curvature. It is plain from Whittaker’s formulas that the Lorentz transformation consists of a rotation through an imaginary angle.

Poincaré too viewed the Lorentz transformation as a rotation in four-dimensional space-time through an imaginary angle, and saw that the ratio of the space to time transformations gave the relativistic law of velocity addition. But he could not bring himself to identify the velocity as a line element in Lobachevsky space.

Edwin Wilson, who was J. Willard Gibbs’s last doctoral student, and Lewis [12] felt the need to introduce a non-Euclidean geometry for rotations, but not for translations. They assumed, however, that Euclid’s fifth postulate (the parallel postulate) held, and, therefore, excluded hyperbolic geometry from the outset, even though their space-time rotations are through an imaginary angle. Had they realized that their non-Euclidean geometry was hyperbolic they would have retracted the statement that “Through any point on a given line one and only one parallel (non-intersecting) line can be drawn.”

It would have also saved them the trouble of inventing a new geometry for the space-time manifold of relativity. They do, in fact, disagree with Poincaré, writing that

it is, however, inconsistent with the philosophic spirit of our time to draw a sharp distinction between that which is real and that which is convenient, and it would be dogmatic to assert that no discoveries of physics might render so convenient as to be almost imperative the modification or extension of our present system of geometry.

Neither their plea nor their paper had a sequel.

In the last of his eight lectures, delivered at Columbia University in 1909, we listened to Planck’s animosity toward non-Euclidean geometries. Although blown up and completely out of proportion, Planck’s statement made it plain that he did not want any infringement on the special theory of relativity by mathematicians. Where would this infringement come from? From nowhere else than the Göttingen school of mathematicians, notably Felix Klein.

The Hungarian Academy of Science established the Bolyai Prize in mathematics in 1905. The commission was made up of two Hungarians and two foreigners, Gaston Darboux and Klein. The contenders were none other than Poincaré and Hilbert. Although the prize went to Poincaré, his old friend Klein refused to present him with it, citing ill health. According to Leveugle [04], presenting it would have meant that Klein had to publicly admit Poincaré’s priority over Einstein to the principle of relativity and to the group of transformations that has become known as the Lorentz group, a name coined by Poincaré in honor of his old friend. This would not have been received well by the Göttingen school, for not only did Hilbert come in second place, it would have been a debacle for all their efforts to retain relativity as a German creation.

Arnold Sommerfeld, a former assistant to Klein, showed in 1909 that the famous addition theorem of velocities, to which Einstein’s name was now attached, was identical to the addition formula for the hyperbolic tangent. The velocity parallelogram closes only at low speeds. This was the first demonstration that hyperbolic geometry definitely had a role in relativity, and that its Euclidean limit emerged at low speeds.
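Sommerfeld’s observation can be illustrated with a short numerical comparison. The sketch below is restricted to collinear velocities in units of c, which is an assumption made here for simplicity; `add_velocities` and `add_via_tanh` are illustrative names.

```python
import math

def add_velocities(u, v):
    """Relativistic addition of two collinear velocities (units of c)."""
    return (u + v) / (1.0 + u * v)

def add_via_tanh(u, v):
    """The same composition written as the hyperbolic tangent of the sum
    of the two hyperbolic measures artanh(u) and artanh(v)."""
    return math.tanh(math.atanh(u) + math.atanh(v))

for u, v in ((0.5, 0.5), (0.9, 0.9), (1e-4, 2e-4)):
    composed = add_velocities(u, v)
    hyperbolic = add_via_tanh(u, v)
    print(f"u = {u}, v = {v}: composed = {composed:.12f}, "
          f"tanh form = {hyperbolic:.12f}, Euclidean sum = {u + v:.12f}")
```

At the two larger speeds the composed velocity stays below c while the Euclidean sum does not, whereas for the last pair the two are practically indistinguishable, which is the low-speed Euclidean limit referred to above.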

Now Sommerfeld would surely have known that the hyperbolic tangent is the straight line segment in Lobachevsky’s non-Euclidean geometry. Acknowledgment of his former supervisor’s interest in relativity surfaced in the revision of Pauli’s [58, Footnote 111] authoritative Mathematical Encyclopedia article on relativity, where he wrote:

This connection with the Bolyai–Lobachevsky geometry can be briefly described in the following way (this had not been noticed by Varicak): If one interprets dx¹, dx², dx³, dx⁴ as homogeneous coordinates in a three-dimensional projective space, then the invariance of the equation (dx¹)² + (dx²)² + (dx³)² − (dx⁴)² = 0 amounts to introducing a Cayley system of measurement, based on a real conic section. The rest follows from the well-known arguments of Klein.

Sommerfeld just could not resist rewriting the history of relativity. He changed Minkowski’s opinion of the role Einstein had in formulating the principle of relativity. Quite inappropriately he inserted a phrase praising Einstein for having used the Michelson experiment to show that a state of absolute rest, where the immobile aether would reside, has no effect on physical phenomena [Pyenson 85]. He also exchanged the role of Einstein as the clarifier with that as the originator of the principle of relativity.ᶠ

ᶠ And Sommerfeld’s revisions did not stop at relativity. Writing in the obituary column of the recently deceased Marian von Smoluchowski, Sommerfeld lauds Einstein for his audacious assault on the derivation of the coefficient of diffusion in Brownian motion, “without stopping to bother about the details of the process.” Von Laue, writing in his History of Physics, clearly states that Smoluchowski developed a statistical theory of Brownian motion in 1904 “to which Einstein gave definitive form (1905).”

A much more earnest attempt to draw hyperbolic geometry into the mainstream of relativity was made by Vladimir Varicak. Varicak says that even before he heard Minkowski’s 1907 talk, he noticed the profound analogy between hyperbolic geometry and relativity. At low velocities, the laws of mechanics reduce to those of Newton, just as Lobachevskian geometry reduces to Euclidean geometry when the radius of curvature becomes very large. To Varicak, the Lorentz contraction appears as a deformation of lengths, just as the line segment of Lobachevskian geometry is bowed.

Taking the line element of the half-plane model of hyperbolic geometry, Varicak says that it cannot be moved around without deformation. Thus, he queries whether the Lorentz contraction can be understood as an anisotropy of the (hyperbolic) space itself. Varicak also appreciates that in relativity the velocity parallelogram does not close; hence, it does not exist, and must be replaced by hyperbolic addition, which is the addition formula of the hyperbolic tangent. Relativity abandons the absolute, but does introduce an absolute velocity, c; this corresponds to the absolute constant in the Lobachevsky velocity space.

Owing to the fact that an inhabitant of the hyperbolic plane would see no distortion of his rulers as he moves about, because his rulers would shrink or expand with him, Varicak questions the reality of the Lorentz contraction. To Varicak, the “contraction is, so to speak, only a psychological and not a physical fact.” Although known non-Euclidean geometries were not entertained by Einstein, Varicak’s formulation should have raised eyebrows. But it did not. The only thing that it did, by questioning the reality of the space contraction, was to cause confusion, and this provoked a response by Einstein himself. But whose confusion did he abate?

Apart from optical applications referring to the Doppler shift and aberration, which were already contained in Einstein’s 1905 paper in a different form, Varicak produced no new physical relations or new insights into old ones. These factors led to the demise of the hyperbolic approach to relativity, as far as physicists were concerned.

However, there was an isolated incident in 1910, where Theodor Kaluza [10] draws an analogy between a uniformly rotating disc and Lobachevskian geometry. Kaluza writes the line element as

$$ \int \sqrt{1 + \frac{r^2}{1 \pm r^2}\left(\frac{d\varphi}{dr}\right)^2 dr}, \tag{*} $$

which at constant radius becomes

$$ \int \frac{r^2}{\sqrt{1 \pm r^2}}\, d\varphi. \tag{**} $$

If Kaluza wants to show that the circumference of a hyperbolic circle is greater than its Euclidean counterpart, he has to choose the negative sign in expression (*), bring the dr out from under the square root, and remove the square in the numerator of (**). Apart from these typos, and the fact that the first term in (*) had to be divided by (1 − r²)², Kaluza was the first to draw attention to the fact that the hyperbolic metric of constant curvature describes exactly a uniformly rotating disc. The paper was stillborn.
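Reading the corrections listed above as leaving the integrand r/√(1 − r²) at constant radius (an interpretation of the text, with r measured in units of the absolute constant), the circumference that results can be compared with its Euclidean counterpart. The following is a minimal sketch under that assumption, with illustrative function names.

```python
import math

def disc_circumference(r):
    """Circumference from integrating r / sqrt(1 - r^2) over a full turn,
    valid for 0 <= r < 1 (r in units of the absolute constant)."""
    return 2.0 * math.pi * r / math.sqrt(1.0 - r * r)

def euclidean_circumference(r):
    """Ordinary circumference 2 * pi * r for comparison."""
    return 2.0 * math.pi * r

for r in (0.1, 0.5, 0.9):
    hyperbolic = disc_circumference(r)
    euclidean = euclidean_circumference(r)
    # Same quantity written with the hyperbolic measure artanh(r) of the radius.
    via_sinh = 2.0 * math.pi * math.sinh(math.atanh(r))
    print(f"r = {r}: hyperbolic = {hyperbolic:.6f}, "
          f"Euclidean = {euclidean:.6f}, 2*pi*sinh = {via_sinh:.6f}")
```

The hyperbolic value always exceeds the Euclidean one, and it coincides with 2π times the hyperbolic sine of the hyperbolic measure of the radius, the same form as Gauss’s expression for the perimeter of a non-Euclidean circle.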

Another unexplainable event is that Einstein entered into a mathematical collaboration with his old friend, Marcel Grossmann, to develop a Riemannian theory of general relativity. Grossmann was an expert in non-Euclidean geometries; so why did he not set Einstein on the track of looking at known non-Euclidean metrics instead of putting him on the track of Riemannian geometry? Probably Einstein wanted the general theory to reduce to Minkowski’s metric in the absence of gravity, which meant that the components of the metric tensor reduce to constants. But that meant he was fixing the propagation of gravitational interactions at the speed of light. Grossmann is, however, usually remembered for having led Einstein astray in rejecting the Ricci tensor as the gravitational tensor [Norton 84].

In order for the metric to reproduce correctly the curvature of ‘space-time,’ the coefficients would have to be (nonlinear) functions of space, and maybe even of time. According to Einstein the Riemannian metric should play the role of the gravitational field. Curvature would be a manifestation of the presence of mass–energy, so that if he could find a curvature tensor, composed of the components of the metric tensor, then by setting it equal to a putative energy–momentum tensor he could find the components of the metric tensor, and thereby determine the line element.

Such an equation would combine time and space with energy and momentum. The rest is history and has been amply described by historians of science. Since the metric has ten components, the search was on for a curvature tensor with the same number of components. The contraction of the Riemann–Christoffel tensor into the Ricci tensor, having ten components, seemed initially a good bet to be set equal to the energy–momentum tensor. Setting the Ricci tensor equal to zero was made a condition for the emptiness of space. It constitutes Einstein’s law of gravitation, and as Dirac [75] tells us

‘Empty’ here means that there is no matter present and no physical fields except the gravitational field. [italics added] The gravitational field does not disturb the emptiness. Other fields do.

So gravity can act where matter and radiation are not!

When the field is not empty, setting the Ricci tensor equal to the energy–momentum tensor leads to inconsistencies insofar as energy–momentum is not conserved. If the Ricci tensor vanishes, then so does all that is related to it, like the scalar, or total, curvature. Einstein found that if he subtracted one-half the curvature invariant (multiplied by the metric tensor) from the Ricci tensor and set it equal to the energy–momentum tensor, then energy–momentum would be conserved.

The equipment needed to carry out the program involves curvilinear coordinates, parallel displacement, Christoffel symbols, covariant differentiation, Bianchi relations, the Ricci tensor and its contraction, plus a knowledge of what the energy–momentum tensor is. The only outstanding solution is known as the Schwarzschild metric, in which the metric is constructed by solving for the ‘outer’ and ‘inner’ solutions [cf. Secs. 9.10.3 and 9.10.4].

All the known tests of general relativity are independent of the time-component of the metric, except for the gravitational shift of spectral lines, which is independent of the spatial component. The latter was predicted by Einstein in 1911, prior to his general theory of relativity. However, it does not follow from the Doppler shift, so Einstein was either uncannily lucky, or the true explanation lies elsewhere.

Viewed from a pseudo-Euclidean point of view, there is a clear distinction between special and general relativity. Within the hyperbolic framework, this separation between inertial and noninertial systems becomes blurred. This is because the uniformly rotating disc is, as Stachel [89] claims, the missing link to Einstein’s general theory. That the Beltrami metric describes exactly the uniformly rotating disc means that hyperbolic geometry is also the framework for noninertial systems.

We have already seen Planck’s hostility to non-Euclidean geometries. There was also Wilhelm Wien, Planck’s assistant editor of the Annalen, who insisted that relativity has “no direct point of contact with non-Euclidean geometry,” and Arnold Sommerfeld, who considered that the reinterpretation of relativity in terms of non-Euclidean geometry could “be hardly recommended.” Authoritarianism carried the day and non-Euclidean geometry was shelved for good. It is the purpose of this monograph to show that non-Euclidean geometries make inroads into relativistic phenomena and warrant our attention.

References

[Abraham 03] M. Abraham, “Prinzipien der Dynamik des Elektrons,” Ann. der Phys. 10 (1903) 105–179.

[Abraham 12] M. Abraham, “Relativität und Gravitation. Erwiderung auf eine Bemerkung des Herrn A. Einstein,” Ann. der Phys. 38 (1912) 1056–1058.

[Auffray 99] J.-P. Auffray, Einstein et Poincaré: Sur des Traces de la Relativité (Le Pommier, Paris, 1999), pp. 131, 133.

[Born 51] M. Born, “Physics in my generation, the last fifty years,” Nature 268 (1951) 625.

[Born 62] M. Born, Einstein’s Theory of Relativity (Dover, New York, 1962), p. 278.

[Brillouin 60] L. Brillouin, Wave Propagation and Group Velocity (Academic Press, New York, 1960), p. 143.

[Bucherer 04] A. H. Bucherer, Mathematische Einführung in die Elektronentheorie (Teubner, Leipzig, 1904), p. 50, Eq. (91a).

[Cerf 06] R. Cerf, “Dismissing renewed attempts to deny Einstein the discovery of special relativity,” Am. J. Phys. 74 (2006) 818–824.

[Comstock 08] D. F. Comstock, “The relation of mass to energy,” Phil. Mag. 15 (1908) 1–21.

[Cushing 81] J. T. Cushing, “Electromagnetic mass, relativity, and the Kaufmann experiments,” Am. J. Phys. 49 (1981) 1133–1149.

[de Broglie 51] L. de Broglie, Savants et Découvertes (Albin Michel, Paris, 1951), p. 50.

[Dirac 75] P. A. M. Dirac, General Theory of Relativity (Wiley-Interscience, New York, 1975), p. 25.

[Dirac 86] P. A. M. Dirac, Collection Dedicated to Einstein, 1982-3 (Nauka, Moscow, 1986), p. 218.

[Earman & Glymour 78] J. Earman and C. Glymour, “Lost in the tensors: Einstein’s struggles with covariance principles,” Stud. Hist. Phil. Sci. 9 (1978) 251–278.

[Einstein 05a] A. Einstein, “On the electrodynamics of moving bodies,” Ann. der Phys. 17 (1905); transl. in W. Perrett and G. B. Jeffrey, The Principle of Relativity (Methuen, London, 1923).

[Einstein 05b] A. Einstein, “Does the inertia of a body depend upon its energy content?,” Ann. der Phys. 18 (1905) 639–641; translated in The Collected Papers of Albert Einstein: The Swiss Years, Vol. 2 (Princeton U. P., Princeton NJ, 1989), pp. 172–174.


[Einstein 06a] A. Einstein, “On a method for the determination of the ratio of the transverse and longitudinal mass of the electron,” Ann. der Phys. 21 (1906) 583–586; translated in The Collected Papers of Albert Einstein, Vol. 2 (Princeton U. P., Princeton NJ, 1989), pp. 207–210.

[Einstein 06b] A. Einstein, “Le principe de conservation du mouvement du centre de gravité et l’inertie de l’energie,” Ann. der Phys. 20 (1906) 627–633.

[Einstein 07] A. Einstein, “On the relativity principle and the conclusions drawn from it,” Jahrbuch der Radioaktivität und Elektronik 4 (1907) 411–462; translated in The Collected Papers of Albert Einstein, Vol. 2 (Princeton U. P., Princeton NJ, 1989), pp. 252–311.

[Einstein 11] A. Einstein, “On the influence of gravitation on the propagation of light,” Ann. der Phys. 35 (1911); translated in W. Perrett and G. B. Jeffrey, The Principle of Relativity (Methuen, London, 1923), pp. 99–108.

[Einstein 16] A. Einstein, “The foundation of the general theory of relativity,” Ann. der Phys. 49 (1916); translated in W. Perrett and G. B. Jeffrey, The Principle of Relativity (Methuen, London, 1923), pp. 111–173.

[Einstein 22a] A. Einstein, “Aether and the theory of relativity,” translated in G. B. Jeffrey and W. Perrett, Sidelights on Relativity (E. P. Dutton, New York, 1922), pp. 1–24.

[Einstein 22b] A. Einstein, “Geometry and experience,” translated in G. B. Jeffrey and W. Perrett, Sidelights on Relativity (E. P. Dutton, New York, 1922), pp. 27–56.

[Essen 71] L. Essen, The Special Theory of Relativity: A Critical Analysis (Clarendon Press, Oxford, 1971).

[Goldberg 67] S. Goldberg, “Henri Poincaré and Einstein’s theory of relativity,” Am. J. Phys. 35 (1967) 934–944.

[Gray 07] J. Gray, Worlds Out of Nothing (Springer, London, 2007), p. 252.

[Hassenöhrl 04] F. Hassenöhrl, “Zur Theorie der Strahlung in bewegten Körpern,” Ann. der Phys. 320 (1904) 344–370; Berichtigung, ibid. 321 589–592.

[Hassenöhrl 09] F. Hassenöhrl, “Bericht über die Trägheit der Energie,” Jahrbuch der Radioactivität 6 (1909) 485–502.

[Hertz 93] H. Hertz, Electric Waves (Macmillan, London, 1893).

[Holton 88] G. Holton, Thematic Origins of Scientific Thought (Harvard U. P., Cambridge MA, 1988).

[Ives 51] H. Ives, “Revisions of the Lorentz transformations,” Proc. Am. Phil. Soc. 95 (1951) 125–131.

[Ives 52] H. Ives, “Derivation of the mass–energy relation,” J. Opt. Soc. Am. 42 (1952) 540–543.

[Janssen 02] M. Janssen, “Reconsidering a scientific revolution: The case of Einstein versus Lorentz,” Phys. Perspect. 4 (2002) 424–446.

[Kaluza 10] Th. Kaluza, “Zur Relativitätstheorie,” Physik Zeitschr. XI (1910) 977–978.

[Leveugle 94] J. Leveugle, “Poincaré et la relativité,” La Jaune et la Rouge 494 (1994) 31–51.


[Leveugle 04] J. Leveugle, La Relativité, Poincaré et Einstein, Planck, Hilbert. Histoire Véridique de la Théorie de la Relativité (l’Harmattan, Paris, 2004).

[Lewis 08] G. N. Lewis, “A revision of the fundamental laws of matter and energy,” Phil. Mag. 16 (1908) 705–717.

[Lewis 26] G. N. Lewis, “The conservation of photons,” Nature 118 (1926) 874–875.

[Logunov 2001] A. A. Logunov, On the Articles by Henri Poincaré “On the dynamics of the electron”, 3rd ed. (Dubna, 2001).

[Marchal] C. Marchal, “Poincaré, Einstein and the relativity: A surprising secret,” (http://www.cosmosaf.iap,fr/Poincare.htm).

[McCormmach 70] R. McCormmach, “Einstein, Lorentz and the electromagnetic view of Nature,” Hist. Studies Phys. Scis. 2 (1970) 41–87.

[Miller 73] A. I. Miller, “A study of Henri Poincaré’s ‘Sur la Dynamique de l’Électron,’” Arch. His. Exact Sci. 10 (1973) 207–328.

[Miller 81] A. I. Miller, Albert Einstein’s Special Theory of Relativity (Addison-Wesley, Reading MA, 1981), p. 254.

[Northrop 59] F. C. Northrop, “Einstein’s conception of science,” in ed. P. A. Schilpp, Albert Einstein Philosopher-Scientist, Vol. II (Harper Torchbooks, New York, 1959), p. 388.

[Norton 84] J. Norton, “How Einstein found his field equations,” Stud. Hist. Phil. Sci. 14 (1984) 253–284.

[Norton 04] J. D. Norton, “Einstein’s investigations of Galilean covariant electrodynamics prior to 1905,” Arch. His. Exact Sci. 59 (2004) 45–105.

[Ohanian 08] H. C. Ohanian, Einstein’s Mistakes: The Human Failings of Genius (W. W. Norton & Co., New York, 2008), p. 84.

[Pais 82] A. Pais, Subtle is the Lord (Oxford U. P., Oxford, 1982), p. 381.

[Pauli 58] W. Pauli, Theory of Relativity (Dover, New York, 1958), p. 125.

[Planck 07] M. Planck, “Zur Dynamik bewegter Systeme,” Berliner Sitzungsberichte Erster Halbband (29) (1907) 542–570; see also, B. H. Lavenda, “Does the inertia of a body depend on its heat content?,” Naturwissenschaften 89 (2002) 329–337.

[Planck 98] M. Planck, Eight Lectures on Theoretical Physics (Dover, New York, 1998), p. 120.

[Poincaré 98] H. Poincaré, “La mesure du temps,” Rev. Mét. Mor. 6 (1898) 371–384.

[Poincaré 00] H. Poincaré, “The theory of Lorentz and the principle of reaction,” Arch. Néderland. Sci. 5 (1900) 252–278.

[Poincaré 01] H. Poincaré, Électricité et Optique: La Lumière et les Théories Électrodynamiques, 2 ed. (Carré et Naud, Paris, 1901), p. 453.

[Poincaré 04] H. Poincaré, “L’état actuel et l’avenir de la physique mathematique,” Bulletin des sciences mathématiques 28 (1904) 302–324; translation “The principles of mathematical physics,” Congress of Arts and Science, Universal Exposition, St. Louis, 1904 Vol. 1, (1905) pp. 604–622 (http://www.archive.org/details/congressofartssc01inte).

[Poincaré 05] H. Poincaré, “Sur la dynamique de l’électron,” Comptes Rend. Acad. Sci. Paris 140 (1905) 1504–1508.

[Poincaré 06] H. Poincaré, “Sur la dynamique de l’électron,” Rend. Circ. Mat. Palermo 21 (1906) 129–175.

[Poincaré 52] H. Poincaré, Science and Hypothesis (Dover, New York, 1952), pp. 70–71; translated from the French edition, 1902.

[Poincaré 54] H. Poincaré, Oeuvres (Gauthier-Villars, Paris, 1954), p. 572.

[Poynting 07] J. H. Poynting, The Pressure of Light, 13th Boyle Lecture delivered 30/05/1906 (Henry Frowde, London, 1907); (Soc. Promo. Christ. Know., London, 1910).

[Pyenson 85] L. Pyenson, The Young Einstein: The Advent of Relativity (Adam Hilger, Bristol, 1985).

[Schribner 64] C. Scribner, Jr, “Henri Poincaré and the principle of relativity,” Am. J. Phys. 32 (1964) 672–678.

[Sommerfeld 04] A. Sommerfeld, “Überlichtgeschwindigkeitsteilchen,” K. Akad. Wet. Amsterdam Proc. 8 (1904) 346 (translated from Verslag v. d. Gewone Vergadering d. Wis-en Natuurkundige Afd. 26/11/1904, Dl. XIII); Nachr. Wiss. Göttingen 25/02/1904, 201–235.

[Stachel 89] J. Stachel, “The rigidly rotating disc as the ‘missing link’ in the history of general relativity,” in Einstein and the History of General Relativity, eds. D. Howard and J. Stachel (Birkhäuser, Basel, 1989).

[Stillwell 89] J. Stillwell, Mathematics and Its History (Springer, New York, 1989), p. 311.

[Stillwell 96] J. Stillwell, Sources of Hyperbolic Geometry (Am. Math. Soc., Providence RI, 1996), p. 113.

[Stranathan 42] J. D. Stranathan, The ‘Particles’ of Modern Physics (Blakiston, Philadelphia, 1942), p. 137.

[Thomson 28] J. J. Thomson and G. P. Thomson, Conduction of Electricity Through Gases, 3rd ed. (Cambridge U. P., Cambridge, 1928), p. 439.

[Varicak 10] V. Varicak, “Application of Lobachevskian geometry in the theory of relativity,” Physikalische Zeitschrift 11 (1910) 93–96.

[Walter 99] S. Walter, “The non-Euclidean style of Minkowskian relativity,” in J. Gray, ed. The Symbolic Universe (Oxford U. P., Oxford, 1999), pp. 91–127.

[Weisskopf 60] V. F. Weisskopf, “The visual appearance of rapidly moving objects,” Phys. Today, Sept. 1960, 24–27.

[Whitakker 53] E. Whittaker, A History of the Theories of Aether and Electricity, Vol. II The Modern Theories 1900–1926 (Thomas Nelson & Sons, London, 1953), p. 38.

[Wilson & Lewis 12] E. B. Wilson and G. N. Lewis, “The space-time manifold of relativity. The non-Euclidean geometry of mechanics and electrodynamics,” Proc. Am. Acad. Arts and Sci. 48 (1912) 387–507.

Chapter 2

Which Geometry?

2.1 Physics or Geometry

2.1.1 The heated plane

In La Science et l’Hypothèse Henri Poincaré [68] argued for the ‘passivity’ of physical space. Since all measurements involve both physical and geometrical assumptions, Poincaré considered it meaningless to ask whether space was Euclidean or non-Euclidean. We might try to measure the sum of the angles of a triangle formed by three hilltops and check whether their sum is greater or less than 180°.

In fact, Gauss attempted such a measurement. He measured the sum of the angles of a triangle formed by the three peaks of Brocken, Hohenhagen and Inselsberg. The sides of the triangle were 69, 85 and 107 km. Gauss determined that the sum exceeded 180° by 14.85″. However, to the chagrin of Gauss, the experiment was inconclusive since the experimental error was greater than the excess he found. In fact, the sum could just as well have been less than 180°. The triangle was too small since, as Gauss realized, the defect is proportional to its area, and only a big triangle, of astronomical proportions, could be used to settle the question of whether the geometry of the universe is Euclidean or not.
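As a rough check on the size of the effect, the excess of a small geodesic triangle on a sphere equals its area divided by the square of the radius. The sketch below is only illustrative: it assumes the commonly quoted side lengths of 69, 85 and 107 km, approximates the area by Heron's planar formula, and takes a mean Earth radius of 6371 km. The function name and these inputs are assumptions, not part of the text.

```python
import math

def spherical_excess_arcsec(a_km, b_km, c_km, radius_km=6371.0):
    """Angular excess (arcseconds) of a small triangle on a sphere.

    Uses excess = area / radius**2, with the area approximated by
    Heron's planar formula (adequate when the sides are << radius).
    """
    s = 0.5 * (a_km + b_km + c_km)                      # semi-perimeter
    area = math.sqrt(s * (s - a_km) * (s - b_km) * (s - c_km))
    excess_rad = area / radius_km**2
    return math.degrees(excess_rad) * 3600.0            # radians -> arcseconds

# Assumed side lengths (km) of the Brocken-Hohenhagen-Inselsberg triangle.
excess = spherical_excess_arcsec(69.0, 85.0, 107.0)
print(f"excess = {excess:.2f} arcsec")

# Doubling every side quadruples the area, and hence the excess.
ratio = spherical_excess_arcsec(138.0, 170.0, 214.0) / excess
print(f"ratio after doubling the sides = {ratio:.3f}")
```

Under these assumptions the excess comes out at roughly fifteen arcseconds, the same order as the figure quoted above, and it scales with the area of the triangle.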

Poincaré was more indecisive in that he reasoned that any defect which could be revealed could equally well be the consequence of the fact that light rays do not always travel in straight paths. It is this type of reasoning that was used against Poincaré to deny him the discovery of relativity. For we have seen in Sec. 1.2.2 that many of the concepts that were attributed to Einstein rightly belong to Poincaré, such as the velocity addition theorem, for which uniform motion is undetectable as far as physical laws are concerned, and the axiom that nothing can travel faster than light. His willingness to change a physical law so as to suit Euclidean geometry is responsible for his secondary role in twentieth century science. But not a word was muttered when Einstein [22] was found agreeing with Poincaré’s philosophy.

To make his point, Poincaré considered an imaginary universe in the interior of a sphere of radius R. In such a universe, the temperature at any point p would be given by $T(p) = k(R^2 - r^2)$, where k is a positive constant and r is the Euclidean distance from the center of the sphere to the point p in question. He also assumed that the linear dimensions of a body vary with the temperature at the point where the body is found, so that as one moves from the center of the sphere to its surface he becomes colder and contracts. In fact, it would take him an infinite amount of time to reach the surface. Even worse, he cannot detect his shrinkage because the measuring sticks he uses shrink along with him. To our traveler, the universe appears infinite.

We know that in Euclidean space the shortest path between two points is a straight line. But, because of the shrinkage, the geodesics of this universe, which are by definition the paths of shortest distance between pairs of its points, are not straight lines but curves bent inward toward the center of the sphere. Actually, they are circular arcs that cut the boundary normally. This is shown in Fig. 2.1, where the bug’s right legs are shorter than his left so that, even though he thinks he is traveling in a straight path, the unequal lengths of his legs cause him to follow a circular arc AB.

Fig. 2.1. A bug’s life in the heated disc; ‘hot’ in the center and ‘cold’ at the rim.

However, to the bug, his right legs do not appear to be shorter than his left legs because his measuring tools also contract as things get colder. But, to us Euclideans, it appears that the bug’s right legs are shorter than his left legs because we are using Euclidean measuring sticks.

So even though we owe this model of hyperbolic geometry to Poincaré, he failed to find it physically attractive. The question he posed, “Which geometry is correct?”, was answered by him with another question: “Which geometry is more convenient?” And unhesitatingly Poincaré clung to Euclidean geometry as the true geometry which Nature chooses. So if we find a discrepancy between a physical law and Euclidean geometry we must be willing to change the former so as to preserve the latter. In effect, Poincaré was debasing his models of hyperbolic geometry, along with those of Beltrami and Klein, as having no physical relevance.

Suppose that we have to deal with a rather large metal sheet which is not at a uniform temperature. Take one edge of the sheet and label it the x-axis, and let the temperature vary along its normal, the y-axis, in the following way:

$$T = by - \frac{1}{p},$$

where b and p are constants. Suppose also that the metallic sheet is fixed in such a way that it cannot bend or buckle. Lastly, we are given a measuring stick made of another metal whose thermal coefficient of expansion is p. How can we use this stick to determine the nature of the geometry of the sheet?

It would be better to have a measuring rod whose coefficient of thermal expansion were zero, but not having one we are left to make measurements with this imperfect rod. We therefore inquire how to make consistent measurements. There are two ways:

(i) At some standard temperature, which we take to be zero degrees Celsius, the measuring stick has a length ds. But because there will be points with higher temperatures, the true length at any point (x, y) will be

$$ds' = (1 + pT)\,ds = pby\,ds.$$

This choice allows us to maintain our Euclidean measure on the surface by allowing for a temperature correction factor pby. This is our physical law which allows us to preserve the Euclidean nature of the geometry.

(ii) We make measurements without taking into account the changes in the length of the rod. Here, we clearly rule out that there are changes in matter due to heat variations, and look to the geometry to make the necessary modifications.

If we opt for the second choice we realize that the x-axis represents absolute cold, which corresponds to a line at infinity. If we try to make measurements using a rod parallel to the y-axis we find that the rod becomes shorter and shorter as it approaches the x-axis, so that the x-axis appears as a line at infinity: measured in rod lengths, it is infinitely far away.

The prime interest of a geometer is to create an object, such as a triangle, made up of the measuring sticks, that remains congruent when moved over the surface. We shall refer to such displacements as motions, of which we will be interested primarily in infinitesimal ones. But, we must first determine how we measure distance, or define a metric for the space. Wanting to keep as close as possible to a Euclidean measure we might try:

$$pby\,ds = \sqrt{dx^2 + dy^2}.$$

If we agree to a choice of units where pb = 1, then

$$ds = \frac{\sqrt{dx^2 + dy^2}}{y}. \tag{2.1.1}$$

This ‘distance’ increases without limit as $y \to 0$. For x constant, integrating (2.1.1) along a vertical line gives the ‘distance’ $|\ln(y_1/y_2)|$ between the heights $y_1$ and $y_2$, so that Euclidean heights fall off exponentially with it. For example, the distances between the adjacent points $y = 1, \tfrac{1}{2}, \tfrac{1}{4}, \ldots$ at x = 0 are all equal.
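The equal spacing of the points y = 1, 1/2, 1/4, ... can be confirmed by integrating the metric (2.1.1) numerically along the line x = 0. The following is a minimal sketch; the function name, the midpoint rule and the step count are illustrative choices, not part of the text.

```python
import math

def vertical_distance(y_top, y_bottom, steps=100_000):
    """Midpoint-rule integral of dy / y between two heights on a vertical line.

    With dx = 0 the metric (2.1.1) reduces to ds = dy / y.
    """
    h = (y_top - y_bottom) / steps
    return sum(h / (y_bottom + (k + 0.5) * h) for k in range(steps))

heights = [1.0, 0.5, 0.25, 0.125]
segments = [vertical_distance(hi, lo) for hi, lo in zip(heights, heights[1:])]

for (hi, lo), s in zip(zip(heights, heights[1:]), segments):
    print(f"from y = {hi} to y = {lo}: distance = {s:.6f}")
print(f"ln 2 = {math.log(2.0):.6f}")
```

Each segment has the same length, ln 2, even though its Euclidean length is halved at every step.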

We now want the invariance property of this metric to determine the permissible motions. Consider, for instance, a point transformation:

$$x' = x'(x, y), \quad y' = y'(x, y). \tag{2.1.2}$$

We want this point transformation to conserve distance; the condition is:

$$\frac{dx'^2 + dy'^2}{y'^2} = \frac{dx^2 + dy^2}{y^2}. \tag{2.1.3}$$

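Obviously, this implies the invariance of distance, but we want something more. We want it also to preserve angles, meaning we want it to be conformal. For later use, observe that

$$\begin{aligned}
\frac{1}{y'^2}\left\{\left(\frac{\partial x'}{\partial x}\right)^2 + \left(\frac{\partial y'}{\partial x}\right)^2\right\} &= \frac{1}{y^2},\\
\frac{\partial x'}{\partial x}\frac{\partial x'}{\partial y} + \frac{\partial y'}{\partial x}\frac{\partial y'}{\partial y} &= 0,\\
\frac{1}{y'^2}\left\{\left(\frac{\partial x'}{\partial y}\right)^2 + \left(\frac{\partial y'}{\partial y}\right)^2\right\} &= \frac{1}{y^2}.
\end{aligned} \tag{2.1.4}$$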

Now consider two infinitesimal displacements, $(d_1x, d_1y)$ and $(d_2x, d_2y)$, drawn from the point (x, y) and making an angle θ, and the corresponding ones $(d_1x', d_1y')$ and $(d_2x', d_2y')$ drawn from (x′, y′) making a corresponding angle θ′. In order for the transformation to be conformal, we require the cosines of the two angles,

$$\cos\theta = \frac{(d_1x\,d_2x + d_1y\,d_2y)/y^2}{\sqrt{\left(\dfrac{d_1x^2 + d_1y^2}{y^2}\right)\left(\dfrac{d_2x^2 + d_2y^2}{y^2}\right)}}, \tag{2.1.5a}$$

and

$$\cos\theta' = \frac{(d_1x'\,d_2x' + d_1y'\,d_2y')/y'^2}{\sqrt{\left(\dfrac{d_1x'^2 + d_1y'^2}{y'^2}\right)\left(\dfrac{d_2x'^2 + d_2y'^2}{y'^2}\right)}}, \tag{2.1.5b}$$

to be equal, where we have divided numerator and denominator by $y^2$ and $y'^2$, respectively, in order to be able to use (2.1.3). That is, on account of (2.1.3) the denominators in (2.1.5a) and (2.1.5b) are equal so it remains only to show that the condition,

$$\frac{1}{y'^2}\,(d_1x'\,d_2x' + d_1y'\,d_2y') = \frac{1}{y^2}\,(d_1x\,d_2x + d_1y\,d_2y), \tag{2.1.6}$$

holds.

If we introduce,

$$dx' = \frac{\partial x'}{\partial x}\,dx + \frac{\partial x'}{\partial y}\,dy, \quad dy' = \frac{\partial y'}{\partial x}\,dx + \frac{\partial y'}{\partial y}\,dy,$$

into the left-hand side of (2.1.6) and use (2.1.4) it becomes evident that the left side coincides with the right side, thereby establishing the conformality of the point transformation (2.1.2).

Regarding motions, it is easily seen that magnification (x, y) → (ax, ay), with a > 0, and translation x → x + s, with s real, are two possible motions. It is well-known that in a two-dimensional space, like the one we are considering, if there exist two independent motions then there must be a third. This third is called inversion: two points lying on the same ray from the center of a circle, one inside and one outside its circumference, are inverse to one another if the product of their distances from the center is equal to the square of the radius. Inversion introduces the notion of anti-congruence.

The basic motions are most easily expressed in terms of complex variables z = x + iy and w = x′ + iy′, viz.

$$\begin{aligned}
\text{translation:}\quad & w = z + s \quad (s \in \mathbb{R}),\\
\text{magnification:}\quad & w = az \quad (a > 0),\\
\text{inversion:}\quad & w = 1/\bar{z},
\end{aligned} \tag{2.1.7}$$

where $\bar{z}$ is the complex conjugate of z.

These three independent motions imply that any two-dimensional object in the space may be shifted, magnified and turned inside-out and still remain congruent if the number of inversions is even, or anti-congruent if the number of inversions is odd. These motions give the object complete freedom of movement.
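That each of the three basic motions leaves the metric (2.1.1) unchanged can be checked numerically by comparing the length |dz|/y of a small displacement before and after the motion. The sketch below is a finite-difference illustration; the sample point, the displacement and the parameter values s = 0.8 and a = 2.5 are arbitrary assumptions.

```python
def hyperbolic_length(z, dz):
    """Length of a small displacement dz at z in the metric ds = |dz| / y."""
    return abs(dz) / z.imag

def translation(z, s=0.8):
    return z + s

def magnification(z, a=2.5):
    return a * z

def inversion(z):
    return 1.0 / z.conjugate()

def length_ratio(motion, z, dz=1e-6 + 2e-6j):
    """Ratio of the lengths of dz after and before applying `motion`."""
    w = motion(z)
    dw = motion(z + dz) - w          # finite-difference image of dz
    return hyperbolic_length(w, dw) / hyperbolic_length(z, dz)

z = 0.7 + 1.3j                        # any point of the upper half-plane
ratios = {m.__name__: length_ratio(m, z)
          for m in (translation, magnification, inversion)}
for name, ratio in ratios.items():
    print(f"{name:14s} length ratio = {ratio:.8f}")
```

All three ratios equal one up to the finite-difference error, which is what invariance of (2.1.3) requires.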

If we generalize the concept of inversion to include the product of an inversion in the unit circle, a translation by an amount c, and another inversion,

$$w = \frac{z}{cz + 1},$$

then we can construct a generic displacement as a product of the fundamental motions involving an even number of inversions. Such a generalized displacement will have the form [cf. (1.2.1)]:

$$w = \frac{az + b}{cz + d}, \tag{2.1.8}$$

where a, b, c, d are real numbers and the determinant $\Delta = ad - bc > 0$. The linear fractional transformation is known as a Möbius transform, and it will play a prominent role in what follows.

The Möbius transform (2.1.8) can be obtained by the following elementary motions [Archbold 70]:

$$\begin{aligned}
z_1 &= cz, && \text{(magnification)}\\
z_2 &= z_1 + d, && \text{(translation)}\\
z_3 &= 1/z_2, && \text{(inversion)}\\
z_4 &= \frac{\Delta}{c}\,z_3,
\end{aligned}$$

and so

$$w = \frac{a}{c} - z_4,$$

which is (2.1.8).

The displacement of any object in our space requires knowing the position of the object, and the alignment of a particular direction with that of an arbitrarily chosen direction in the space. For this to be accomplished we need three parameters, and the corresponding group is a three-parameter group. A space in which an object enjoys such free mobility is a congruent space and its geometry is a congruent geometry. We will now dig deeper into the notions of these motions by transferring to the complex plane.
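The decomposition of (2.1.8) into elementary steps can be verified directly. The sketch below assumes c ≠ 0, as the steps themselves require, and uses arbitrary illustrative coefficients; the function names are not from the text.

```python
def mobius(z, a, b, c, d):
    """Direct evaluation of w = (a z + b) / (c z + d), Eq. (2.1.8)."""
    return (a * z + b) / (c * z + d)

def mobius_by_steps(z, a, b, c, d):
    """The same map built from the elementary steps (requires c != 0)."""
    delta = a * d - b * c      # determinant, assumed positive
    z1 = c * z                 # magnification
    z2 = z1 + d                # translation
    z3 = 1.0 / z2              # reciprocal, the step labelled 'inversion'
    z4 = (delta / c) * z3      # rescaling by delta / c
    return a / c - z4

z = 0.3 + 1.2j
a, b, c, d = 2.0, 1.0, 1.0, 3.0    # illustrative real coefficients, ad - bc = 5
w_direct = mobius(z, a, b, c, d)
w_steps = mobius_by_steps(z, a, b, c, d)
print(w_direct, w_steps)
```

With a positive determinant the image keeps a positive imaginary part, so the upper half-plane is mapped into itself.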

2.2 Geometry of Complex Numbers

2.2.1 Properties of complex numbers

An ordered pair (x, y) is called a complex number, z = x + iy. The modulus of z is $r = \sqrt{x^2 + y^2}$. The number θ, defined by cos θ = x/r and sin θ = y/r, is called the amplitude or argument (arg) of z. In terms of r and θ the complex number can be expressed as $z = re^{i\theta} = r(\cos\theta + i\sin\theta)$, and de Moivre’s theorem follows:

$$(\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta.$$

If $z_1$ and $z_2$ are any two complex numbers then

$$\begin{aligned}
|z_1 z_2| &= |z_1| \cdot |z_2|,\\
|z_1/z_2| &= |z_1|/|z_2|,\\
\arg(z_1 z_2) &= \arg z_1 + \arg z_2,\\
\arg(z_1/z_2) &= \arg z_1 - \arg z_2.
\end{aligned}$$

The last two properties recall the property of logarithms, which we shall shortly return to.

Moreover, the product of a complex number z and its complex conjugate $\bar{z}$ is $z\bar{z} = |z|^2$. The square of the absolute value of the sum of two complex numbers is:

$$\begin{aligned}
|z_1 + z_2|^2 &= (z_1 + z_2)(\bar{z}_1 + \bar{z}_2) = z_1\bar{z}_1 + z_2\bar{z}_2 + z_1\bar{z}_2 + \bar{z}_1 z_2\\
&= |z_1|^2 + |z_2|^2 + z_1\bar{z}_2 + \overline{z_1\bar{z}_2}\\
&= |z_1|^2 + 2\,\mathrm{Re}(z_1\bar{z}_2) + |z_2|^2\\
&\le |z_1|^2 + 2|z_1||z_2| + |z_2|^2 = (|z_1| + |z_2|)^2,
\end{aligned}$$

since $\mathrm{Re}(z_1\bar{z}_2) \le |z_1\bar{z}_2| = |z_1||z_2|$, where we have used $|\bar{z}| = |z|$. Taking the positive square roots gives the triangle inequality:

$$|z_1 + z_2| \le |z_1| + |z_2|. \tag{2.2.1}$$
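These properties are easy to confirm with Python's built-in complex type and the standard `cmath` module. The two sample numbers and the helper `wrapped`, which reduces an angle to the interval (−π, π], are illustrative assumptions.

```python
import cmath

def wrapped(angle):
    """Reduce an angle to the interval (-pi, pi]."""
    return cmath.phase(cmath.exp(1j * angle))

z1, z2 = 3 + 4j, 1 - 2j

# Moduli multiply and arguments add (the latter modulo 2*pi).
modulus_gap = abs(z1 * z2) - abs(z1) * abs(z2)
argument_gap = wrapped(cmath.phase(z1 * z2) - cmath.phase(z1) - cmath.phase(z2))

# |z1 + z2|^2 = |z1|^2 + 2 Re(z1 * conj(z2)) + |z2|^2.
cross_term = (z1 * z2.conjugate()).real
expansion_gap = abs(z1 + z2) ** 2 - (abs(z1) ** 2 + 2 * cross_term + abs(z2) ** 2)

# Triangle inequality (2.2.1): the slack is never negative.
triangle_slack = abs(z1) + abs(z2) - abs(z1 + z2)

print(modulus_gap, argument_gap, expansion_gap, triangle_slack)
```

The first three gaps vanish to rounding error, while the slack in the triangle inequality is positive because the two numbers do not point in the same direction.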

2.2.2 Inversion

The property of inversion can be stated as: If a circle of radius r has a center O and two points P and P′ are inverse with respect to the circle then the following conditions must hold:

(i) O, P, P′ lie on the same straight line;

(ii) O does not lie between P and P′;

(iii) $OP \cdot OP' = r^2$.

To find the point of inversion P we construct a semicircle with diameter OP′. If Q is the point of intersection of this semicircle with the circle whose center is O then P will be the foot of the perpendicular from Q to OP′. This is a consequence of the fact that OQP′ is a right triangle, as shown in Fig. 2.2. Because OPQ will also be a right triangle,

$$\cos\vartheta = \frac{OP}{r} = \frac{r}{OP'},$$

and, consequently,

$$OP \cdot OP' = r^2; \tag{2.2.2}$$

O then is the center of inversion and the circle is called the circle of inversion.

Fig. 2.2. Construction of the point of inversion P.

These conditions can be simply stated as: If P and P′ are represented by the complex numbers z and w, then

(i) arg w = arg z,

(ii) $|w| = r^2/|z|$,

where property (i) combines both properties (i) and (ii) of the above.

Given the point of inversion P we may calculate the coordinates of P′, and vice-versa. From Fig. 2.3 it is apparent that the right triangles whose hypotenuses are OP and OP′ are similar so that

$$\frac{x}{y} = \frac{x'}{y'}. \tag{2.2.3}$$

Then since (2.2.2) holds,

$$(x^2 + y^2)(x'^2 + y'^2) = r^4.$$

Fig. 2.3. Circle of inversion for constructing the inverse P with respect to P′.

Introducing the value of y in (2.2.3) we have

$$x^2 + x^2 \cdot \frac{y'^2}{x'^2} = \frac{r^4}{x'^2 + y'^2},$$

or

$$x = \frac{x' r^2}{x'^2 + y'^2}, \quad y = \frac{y' r^2}{x'^2 + y'^2}.$$

These are the coordinates of the interior points, which are fully symmetric to those of the exterior points,

$$x' = \frac{x r^2}{x^2 + y^2}, \quad y' = \frac{y r^2}{x^2 + y^2}.$$

Although the method of inversion has found extensive use in electrostatics, apparently introduced by Lord Kelvin, it seems to be relatively unknown in other branches of science. On closer inspection, however, it appears to have been employed for the first time in optics in a completely novel way by the then twenty-three-year-old Maxwell [Born & Wolf 59]. Since it combines Fermat’s principle of least time, which we will need later on in Chapter 7, and inversion, we will now turn to a discussion of it.

2.2.3 Maxwell’s ‘fish-eye’: An example of inversion from elliptic geometry

Light emitted by a point source at $P_0$ will propagate in a medium of index of refraction η(x, y, z). Although an infinite number of rays have been emitted by our point source, only a finite number will be found to pass through any other point in the medium, with the exception of a point $P_1$ through which an infinite number of rays pass. Such a point is said to be a stigmatic, or sharp, image of $P_0$.

An optical instrument which images stigmatically in three dimensions is referred to as absolute. To every point $P_0$ in the object space there corresponds a stigmatic image $P_1$ in the image space. These points in the two spaces are said to be conjugate to one another. It was precisely Maxwell, in 1858, who proved that for an absolute instrument the optical length of any curve in the object space is equal to the optical length of its image, provided both spaces are homogeneous.

Maxwell provides us with a simple example of an absolute instrument in a medium which is characterized by a refractive index,

$$\eta(r) = \frac{\eta_0}{1 + (r/a)^2}, \tag{2.2.4}$$

where r denotes the distance from a fixed point O, and $\eta_0$ and a are constants. It is commonly referred to as Maxwell’s ‘fish-eye’, which he first studied in 1854.

According to Fermat’s principle, light will propagate between any two points in such a way as to minimize (or at least to extremize) its travel time. In a system of varying index of refraction, (2.2.4), the true path will render the optical length,

$$I = \int \eta(r)\,ds = \int \eta(r)\sqrt{dr^2 + r^2 d\varphi^2} = \eta_0 \int \frac{\sqrt{dr^2 + r^2 d\varphi^2}}{1 + r^2/a^2} = \eta_0 \int \frac{\sqrt{1 + r^2\varphi'^2}}{1 + r^2/a^2}\,dr, \tag{2.2.5}$$

an extremum, where the prime indicates differentiation with respect to r.

Calling the integrand $\Lambda$, and noting that φ is a cyclic coordinate, i.e. φ is absent but its derivative is not, we immediately obtain a first integral of the motion,

$$\frac{\partial \Lambda}{\partial \varphi'} = \frac{\eta(r)\,r^2 \varphi'}{\sqrt{1 + r^2\varphi'^2}} = c = \text{const.}$$

Solving for φ′, we get

$$\varphi = c \int^{r} \frac{dr}{r\sqrt{\eta^2(r)\,r^2 - c^2}},$$

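on integrating. To perform the integral it will be convenient to set ρ = r/a and $\kappa = c/a\eta_0$. For then we find:

$$\varphi - \varphi_0 = \int^{\rho} \frac{\kappa(1 + \rho^2)\,d\rho}{\rho\sqrt{\rho^2 - \kappa^2(1 + \rho^2)^2}} = \int^{\rho} \frac{d}{d\rho}\left[\sin^{-1}\left(\frac{\kappa}{\sqrt{1 - 4\kappa^2}}\,\frac{\rho^2 - 1}{\rho}\right)\right] d\rho,$$

and, consequently, by inverting,

$$\sin(\varphi - \alpha) = \frac{c}{\sqrt{a^2\eta_0^2 - 4c^2}}\,\frac{r^2 - a^2}{ar}, \tag{2.2.6}$$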

where α is a constant of integration.

Expression (2.2.6) is the equation of a circle in polar coordinates. All rays through the fixed point, $P_0(r_0, \varphi_0)$, must be such as to keep the ratio,

$$\frac{r^2 - a^2}{r\sin(\varphi - \alpha)} = \frac{r_0^2 - a^2}{r_0\sin(\varphi_0 - \alpha)},$$

constant. The fixed point $P_1(r_1, \varphi_1)$ must also satisfy this ratio for whatever α may be, and this leads to the conditions

$$r_0 r_1 = a^2, \quad \varphi_1 = \pi + \varphi_0. \tag{2.2.7}$$

All rays from a point $P_0$ meet at $P_1$, which lies on a line connecting $P_0$ to O. The points $P_0$ and $P_1$ lie on opposite sides of O such that $OP_0 \cdot OP_1 = a^2$. Consequently, Maxwell’s fish-eye is an absolute instrument where the image is an inversion, since the first condition in (2.2.7) is the condition for inversion, (2.2.2). Only this time O lies between the two points, contrary to condition (ii) above.
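The conjugacy conditions (2.2.7) can be checked numerically from the ray equation (2.2.6) alone. In the sketch below the radius of a ray at polar angle φ is obtained as the positive root of the quadratic in r that (2.2.6) gives at fixed φ; the helper names and the sample values of a, η0, c and α are illustrative assumptions, with aη0 > 2c > 0 so that the square root in (2.2.6) is real.

```python
import math

def ray_radius(phi, alpha, K, a):
    """Positive root r of K*(r**2 - a**2) = a*r*sin(phi - alpha), i.e. (2.2.6)."""
    s = math.sin(phi - alpha)
    return (a * s + math.sqrt((a * s) ** 2 + 4.0 * K**2 * a**2)) / (2.0 * K)

def conjugate_product(phi, alpha, c, a=1.0, eta0=1.0):
    """Product r(phi) * r(phi + pi) along the ray labelled by (c, alpha)."""
    K = c / math.sqrt(a**2 * eta0**2 - 4.0 * c**2)   # prefactor in (2.2.6)
    return ray_radius(phi, alpha, K, a) * ray_radius(phi + math.pi, alpha, K, a)

a, eta0 = 2.0, 1.5
products = [conjugate_product(0.3, alpha, c, a, eta0)
            for alpha in (0.0, 0.7, 1.1, 2.4)
            for c in (0.2, 0.6, 1.2)]
print(products)
```

For every ray in the sample the two radii at opposite polar angles multiply to a², which is the statement that all rays through a point also pass through its conjugate.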

For φ = α and φ = π + α, r = a, so each ray emanating from a fixed point $P_0$ intersects the circle r = a at diametrically opposite points. All Euclidean circles that cut the circle of radius r = a at diametrically opposite points are the routes of geodesics in the elliptic plane E. The rays emanating from any fixed point $P_0$ and coalescing at $P_1$, which lies on the line $OP_0$, are geodesics, or paths of shortest distance between the two points. Arcs of a circle replace straight lines in the elliptic plane. We shall return to this point shortly.

Fig. 2.4. Maxwell’s “fish-eye.”

To transform Eq. (2.2.6) from polar to Cartesian coordinates we set x = r cos φ and y = r sin φ. We then obtain

$$y\cos\alpha - x\sin\alpha = \frac{c}{a\sqrt{a^2\eta_0^2 - 4c^2}}\,(x^2 + y^2 - a^2),$$

or

$$(x + b\sin\alpha)^2 + (y - b\cos\alpha)^2 = a^2 + b^2 = \frac{a^2}{4\kappa^2}, \tag{2.2.8}$$

where $b = (a/2c)\sqrt{a^2\eta_0^2 - 4c^2}$. According to the theorem of chords, all chords passing through a fixed interior point, in this case O, are divided into two parts whose lengths have constant product: $OP_0 \cdot OP_1 = a^2$. We thus have to set b = 0, so that the radius of the circle of inversion is exactly $a = 2c/\eta_0$.

If we do not distinguish between the flat metric and the index of refraction in (2.2.5), then we can write the metric as

$$d\bar{s}^2 = \eta^2(r)\,ds^2 = \frac{dr^2 + r^2 d\varphi^2}{\left(1 + (r/r_0)^2\right)^2}, \tag{2.2.9}$$

where we set the absolute constant $a = r_0$. A simple way to get new geometric structures is to distort old ones. Stereographic projection carries the tangent vectors $\mathbf{x}$ and $\mathbf{y}$ at a point p on the surface of a sphere S onto tangent vectors $\bar{\mathbf{x}}$ and $\bar{\mathbf{y}}$ at a point q of the Euclidean plane, as shown in Fig. 2.5. The relation between their inner products is given by the stereographic inner product distortion [O’Neill 66]

$$\bar{\mathbf{x}} \cdot \bar{\mathbf{y}} = \left(1 + \frac{x^2 + y^2}{r_0^2}\right)\mathbf{x} \cdot \mathbf{y}, \tag{2.2.10}$$

and so transforms the Euclidean plane into the stereographic plane with constant, positive curvature, $1/r_0^2$.

Fig. 2.5. The magnification of the inner product as it is projected stereographically onto the Euclidean plane.

To rationalize (2.2.10), we consider the inverse map of a plane onto a sphere, from a horizontal plane of height $r_0$ onto a sphere of radius $r_0$. This is given by the projection along the radius $(x, y, z) \mapsto (\lambda x, \lambda y, \lambda z)$, where $z = r_0$, the domain of the plane, and $(\lambda x)^2 + (\lambda y)^2 + (\lambda r_0)^2 = r_0^2$, the codomain of the sphere. Solving for λ results in:

$$\lambda = \frac{r_0}{\sqrt{r_0^2 + x^2 + y^2}}, \tag{2.2.11}$$

and, consequently, the stereographic inner product can be written as

$$\lambda\bar{\mathbf{x}} \cdot \lambda\bar{\mathbf{y}} = \mathbf{x} \cdot \mathbf{y},$$

which is again (2.2.10).

Stereographic projection was one of the topics covered by Riemann in his 1854 lecture for his Habilitation. Although he discusses a space of positive, constant, curvature, $1/r_0^2$, he was undoubtedly aware of what happens when $r_0$ becomes imaginary. In such an event, the inner product (2.2.10) makes sense only when we restrict it to a disc $x^2 + y^2 < r_0^2$. Inside this region, the two-dimensional space is one of negative, constant, curvature. We shall come back to this in our discussion of the Poincaré disc model in Sec. 9.5.
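The scale factor (2.2.11) and its relation to the distortion factor in (2.2.10) can be checked with a few lines of code. The sketch below is illustrative: the function names and the sample values of x, y and r0 are assumptions.

```python
import math

def radial_scale(x, y, r0):
    """Scale factor lambda of Eq. (2.2.11)."""
    return r0 / math.sqrt(r0**2 + x**2 + y**2)

def project_to_sphere(x, y, r0):
    """Image of the plane point (x, y, r0) on the sphere of radius r0."""
    lam = radial_scale(x, y, r0)
    return lam * x, lam * y, lam * r0

r0 = 2.0
x, y = 1.5, -0.7
px, py, pz = project_to_sphere(x, y, r0)

# The image lies on the sphere of radius r0.
norm = math.sqrt(px**2 + py**2 + pz**2)

# lambda**2 is the reciprocal of the distortion factor in (2.2.10).
lam_sq = radial_scale(x, y, r0) ** 2
distortion = 1.0 + (x**2 + y**2) / r0**2
print(norm, lam_sq * distortion)
```

The product of λ² and the distortion factor is one, which is the content of the step from (2.2.11) back to (2.2.10).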

Stereographic projection possesses two very remarkable properties:

(i) Circles on the sphere are mapped into lines or circles in the plane.

(ii) Angles are preserved: the angle formed from two intersecting circles on the sphere is the same as the angle formed from intersecting lines or circles in the plane that correspond to the former under stereographic projection.

Therefore, by sacrificing straight line geodesics we have been able to preserve angles, so the stereographic projection is a conformal map of the surface. Maxwell, unwittingly, discovered that his expression for the refractive index, (2.2.4), was the dilatation factor in

$$d\bar{s} = \eta(r)\,ds, \tag{2.2.12}$$

in which the infinitesimal shape on the surface is represented in the map by a similar shape that differs from the original one only in size. The one on the stereographic plane is just η times bigger, and the index of refraction is the stereographic magnification factor! This was indeed a big fish to fry for the 23-year-old: he obtained an image as an inversion using stereographic projection. We will use the metric (2.2.12) in Chapter 7 to derive the tests of general relativity by identifying physically the index of refraction, η, which is the magnification factor of the flat metric, ds.

The essential point is that the point source $P_0$ and its image point $P_1$ lie on opposite collinear rays emanating from O, whereas in the case of inversion, a point $P_0$ and its image $P_1$ are on the same ray emanating from O. This is guaranteed by the form of the index of refraction (2.2.4).

As we have just mentioned, we can get a surface of negative curvature, $-1/r_0^2 = 1/(ir_0)^2$, by allowing the radius of the sphere to take on the imaginary value $ir_0$. Instead of the index of refraction (2.2.4) we now have

$$\eta(r) = \frac{\eta_0}{1 - (r/r_0)^2}, \tag{2.2.13}$$

which obviously limits us to a (hyperbolic) disc $r < r_0$, where $r_0$ is the absolute constant of the space. Following the same procedure as before, we now find the equation of the circle

$$(x - \beta\sin\alpha)^2 + (y + \beta\cos\alpha)^2 = \beta^2 - r_0^2, \tag{2.2.14}$$

where $\beta = (r_0/2c)\sqrt{r_0^2\eta_0^2 + 4c^2}$. The circle of inversion, C, has a center at (β, α), and its distance from the center of the hyperbolic disc is β. But, in order that (2.2.14) describe a circle, $\beta > r_0$; as can be seen in Fig. 2.6, the center of inversion must therefore lie outside the hyperbolic plane, H. Thus, its center does not separate the source $P_0$ and its image $P_1$ along a common line uniting the three points and, as a consequence, $P_0$ and $P_1$ will not lie on geodesic arcs of a circle. So, it was not at all fortuitous that young Maxwell chose the form (2.2.4) for the index of refraction, and not (2.2.13).

In fact, as we approach the rim of H, which does not belong to H, the index of refraction (2.2.13) becomes infinite. Since the velocity of propagation is inversely proportional to the index of refraction, it will become very small in the limit. Clocks slow down and rulers shrink as they approach the rim when viewed from our Euclidean perspective. We might expect this shrinking of rulers and slowing down of clocks to have something to do with space contraction and time dilatation.

Fig. 2.6. In the case of inversion both the point and its image are on the same ray emanating from the center of the disc H.

This ‘shrinkage’ of rulers and ‘slowing down’ of clocks is in direct contrast to what happens in the stereographic, or elliptic, plane of constant positive curvature. The projections of points in the northern hemisphere, X and Y, are much further away from the origin than projections from the southern hemisphere, x and y, as shown in Fig. 2.7. The index of refraction (2.2.4) becomes smaller and smaller the farther we move away from the origin. This means that the velocity increases and clocks speed up, while rulers get longer as they move farther away from the origin. Large circles in the stereographic plane have very small stereographic arc-length since they correspond to small circles about the north pole of the sphere S.

Fig. 2.7. It appears that rulers get longer as they are moved further from the origin. However, the elliptic distance from x to y is exactly the same as that from X to Y.

In consideration of the relationship between hyperbolic and elliptic spaces we might expect phenomena such as time contraction and space dilatation to be characteristic of elliptic spaces when viewed from our Euclidean perspective. An inhabitant of the plane would measure the same distance between x and y in Fig. 2.7 as he would measure between X and Y.
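The contrast between the two planes can be made concrete by integrating the index of refraction along a radius, which by (2.2.12) gives the distance from the origin as measured by an inhabitant. The sketch below is illustrative only: it sets η0 = 1 and a = r0 = 1, uses a simple midpoint rule, and compares the result with the closed forms r0 arctan(r/r0) and r0 artanh(r/r0), which are standard integrals rather than results quoted in the text.

```python
import math

R0 = 1.0  # absolute constant of the space (assumed value)

def eta_elliptic(r):
    """Index of refraction (2.2.4) with eta0 = 1 and a = R0."""
    return 1.0 / (1.0 + (r / R0) ** 2)

def eta_hyperbolic(r):
    """Index of refraction (2.2.13) with eta0 = 1."""
    return 1.0 / (1.0 - (r / R0) ** 2)

def optical_length(eta, r, steps=50_000):
    """Midpoint-rule estimate of the integral of eta from 0 to r."""
    h = r / steps
    return sum(eta((k + 0.5) * h) for k in range(steps)) * h

for r in (0.5, 0.9, 0.99):
    print(f"r = {r}: elliptic {optical_length(eta_elliptic, r):.4f}, "
          f"hyperbolic {optical_length(eta_hyperbolic, r):.4f}")

# Closed forms: bounded by pi*R0/2 in the elliptic plane,
# unbounded as r -> R0 in the hyperbolic disc.
far_elliptic = R0 * math.atan(1e9 / R0)
near_rim_hyperbolic = R0 * math.atanh(0.999999)
print(far_elliptic, near_rim_hyperbolic)
```

The hyperbolic length grows without limit as the rim is approached, whereas the elliptic length of an entire ray stays below πr0/2.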

2.2.4 The cross-ratio

Now consider a circular arc, defined by $\arg[(z - z_1)/(z - z_2)] = \text{const}$. In addition let there be two fixed points on the arc, $P_1$ and $P_2$, with P lying between them. We let P vary such that the angle, measured in radians, $\angle P_1 P P_2 = \theta$ is constant. If $P, P_1, P_2$ are represented respectively by $z, z_1, z_2$, the necessary and sufficient condition that the angle remains constant, as P is varied, is:

$$\arg(z - z_1) - \arg(z - z_2) = \arg\left(\frac{z - z_1}{z - z_2}\right) = \theta.$$

If 0 < θ < π, the locus of P is the arc of a circle with endpoints $P_1$ and $P_2$. For the particular value θ = π/2, the locus of P is a semi-circle, while for θ = π it is the segment $P_1P_2$.
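The constancy of the angle along an arc can be seen numerically by sampling points on a circle and evaluating the argument above. The sketch below uses the unit circle and arbitrarily chosen endpoints; the function names are assumptions, and the sign of the angle returned by `cmath.phase` depends on the orientation in which the points are taken.

```python
import cmath

def subtended_angle(z, z1, z2):
    """arg[(z - z1) / (z - z2)], the signed angle under which P sees P1 P2."""
    return cmath.phase((z - z1) / (z - z2))

def on_unit_circle(t):
    """Point of the unit circle at polar angle t."""
    return cmath.exp(1j * t)

z1, z2 = on_unit_circle(0.3), on_unit_circle(2.0)

# Points between the endpoints, and points on the complementary arc.
minor_arc = [subtended_angle(on_unit_circle(t), z1, z2) for t in (0.5, 1.0, 1.5)]
major_arc = [subtended_angle(on_unit_circle(t), z1, z2) for t in (2.5, 4.0, 5.5)]

print(minor_arc)
print(major_arc)
print(abs(minor_arc[0] - major_arc[0]))
```

The angle is the same at every point of a given arc, and the values on the two complementary arcs differ by π.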

Generalizing to four points lying on an arc we have: If points P3 and P4 lie on an arc whose endpoints are P1 and P2, then

\[ \arg\left(\frac{z_3 - z_1}{z_3 - z_2}\right) = \arg\left(\frac{z_4 - z_1}{z_4 - z_2}\right), \]

or

\[ \arg\left(\frac{z_3 - z_1}{z_3 - z_2}\cdot\frac{z_4 - z_2}{z_4 - z_1}\right) = 0, \]

where the zi’s represent the Pi’s. But, this can only be if the number,

\[ \frac{z_3 - z_1}{z_3 - z_2}\cdot\frac{z_4 - z_2}{z_4 - z_1}, \]

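is real and positive. This number is the cross-ratio of the four numbers z1, z2, z3, z4. Alternatively, if P3 and P4 lie on opposite arcs of the circle through P1 and P2, so that one of them lies outside of the arc segment P1P2, then

\[ \arg\left(\frac{z_3 - z_1}{z_3 - z_2}\cdot\frac{z_4 - z_2}{z_4 - z_1}\right) = \pi, \]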

and the corresponding cross-ratio is a negative real number.

The cross-ratio finds its origins in Renaissance art, where artists found it necessary to give depth to their two-dimensional drawings. If the points A, B, C, D lie on a line, with lengths taken as signed, and the pair of points A, B separates C, D, then the cross-ratio,

\[ \{A, B|C, D\} = \frac{AC}{BC}\Big/\frac{AD}{BD}, \]

is negative, while if they do not then the cross-ratio is positive. Four is the minimum number of points that possesses an invariant under projection, and that invariant is the cross-ratio.

A correspondence between two straight lines such that for all corresponding quadruples, A, B, C, D and A′, B′, C′, D′, their cross-ratios are equal, {A, B|C, D} = {A′, B′|C′, D′}, is called a projective correspondence. Since the ordinary projection of a line onto a line preserves the cross-ratio, it is an example of a projective correspondence. Such a correspondence is said to be perspective.


A perspective is merely a realistic representation of spatial depth on a plane. Yet, the method for correct perspective is credited to the Florentine painter Brunelleschi at the beginning of the fifteenth century. Alberti solved a special case, known as costruzione legittima, whereby nonhorizontal floor tiles are lined up on a base line with progressively smaller tiles placed behind them, converging to a vanishing point on the horizon as in Fig. 2.8.

The development of projective geometry followed, mainly through the work of Desargues, with the introduction of ‘vanishing’ points, or points at infinity where parallels meet, and transformations which change lengths and angles, i.e. projections. But, if lengths and angles are not invariant under projection, what is? Since it is possible to project any three points on a line onto any three others, no configuration of three points can carry an invariant. The smallest number of points which possesses an invariant is four, and the cross-ratio is that projective invariant.

Following the proof given by Möbius in 1827 that the cross-ratio is a projective invariant, we consider four points on a line, A, B, C, D, and a point O not lying on the line, as in Fig. 2.9. Drop a normal from O onto the line and let δ be its length. By computing the areas of the triangles OCA, OCB, ODA and ODB, first using the height δ and the bases CA, CB, DA, and DB, and then using the bases OA and OB, and expressing the height in

Fig. 2.8. A tiling of the plane.


Fig. 2.9. Calculation of cross-ratio and perspectivity.

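terms of the sines of the angles at O, we find

\[
\begin{aligned}
\tfrac{1}{2}\,\delta\cdot CA &= \text{area } OCA = \tfrac{1}{2}\,OA\cdot OC\,\sin\angle COA,\\
\tfrac{1}{2}\,\delta\cdot CB &= \text{area } OCB = \tfrac{1}{2}\,OB\cdot OC\,\sin\angle COB,\\
\tfrac{1}{2}\,\delta\cdot DA &= \text{area } ODA = \tfrac{1}{2}\,OA\cdot OD\,\sin\angle DOA,\\
\tfrac{1}{2}\,\delta\cdot DB &= \text{area } ODB = \tfrac{1}{2}\,OB\cdot OD\,\sin\angle DOB.
\end{aligned}
\]

Taking the ratio of the first and second pairs, and dividing the former by the latter, results in

\[ \frac{CA}{CB}\Big/\frac{DA}{DB} = \frac{\sin\angle COA}{\sin\angle COB}\Big/\frac{\sin\angle DOA}{\sin\angle DOB}. \]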

Any other four points A′, B′, C′, D′ in perspective with the original points A, B, C, D from the same external point O subtend the same angles at O, as shown in Fig. 2.9, and, consequently, have the same cross-ratio.
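The invariance under central projection can be illustrated with a small numerical sketch. The coordinates of the center O, the second line, and the four points are arbitrary illustrative choices, and the helper names are hypothetical; positions are treated as signed coordinates along each line.

```python
import math

def cross_ratio(a, b, c, d):
    """{A, B | C, D} = (AC/BC) / (AD/BD) for signed positions along a line."""
    return ((c - a) / (c - b)) / ((d - a) / (d - b))

def project(t, o, q0, d):
    """Project the point (t, 0) from center o onto the line q0 + s*d; return s."""
    px, py = t - o[0], 0.0 - o[1]          # direction of the ray from o through (t, 0)
    rx, ry = q0[0] - o[0], q0[1] - o[1]
    # Solve r*(px, py) - s*(dx, dy) = (rx, ry) for s by Cramer's rule.
    det = -px * d[1] + d[0] * py
    return (px * ry - py * rx) / det

# Assumed configuration: A, B, C, D on the x-axis, center O above it,
# and a second, tilted line below it.
points = (-2.0, 1.0, 0.0, 3.0)
center = (0.5, 3.0)
line_point, line_dir = (0.0, -1.0), (1.0, 0.4)

before = cross_ratio(*points)
after = cross_ratio(*(project(t, center, line_point, line_dir) for t in points))
print(before, after)
```

The two printed numbers agree to rounding error, although the individual distances between the projected points differ from the original ones.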

Projective transformations, or collineations as they are sometimes referred to, can map parallel lines onto intersecting lines thereby providing a sense of depth, like the converging parallel lines in Fig. 2.8. In order to define a projective transformation, we must add points ‘at infinity.’ These are necessary in order to ensure the one-to-one correspondence that arises in connection with the central projection of a plane onto a plane in which


some of the points of the first plane have no images. The straight lines of one that intersect the plane correspond to points of intersection with the plane, while those lines parallel to the plane are new points, called points at infinity where parallels meet. For when the straight line that intersects the plane becomes closer and closer to a parallel line, its point of intersection recedes to infinity. The Euclidean plane is transformed into the projective plane by the addition of points at infinity.

The logarithm of the cross-ratio measures hyperbolic distance. Because of the logarithmic form, one might expect that it would not satisfy the triangle inequality, (2.2.1). It is well-known that logarithmic equations of state in thermodynamics [Lavenda 09], and logarithmic measures of divergence in information theory [Kullback 59], have all the topological requisites of a distance except that of the triangle inequality. However, a remarkable property of the cross-ratio enables the hyperbolic distance to satisfy the triangle inequality, and, therefore, to be considered a bona fide distance.

Consider four collinear points with a and b between x and y with b between a and x. The cross-ratio,

{a, b|x, y} > 1,

unless a = b. If d is some other interior point,

{a, d|x, y} · {d, b|x, y} = {a, b|x, y}, (2.2.15)

and the cross-ratio is associative. Since the distance is the logarithm of the cross-ratio, it is precisely this last property that would lead one to believe that the triangle inequality cannot be satisfied. But wait a moment.

What happens if we shorten the interval, say to some x′ lying between b and x? It can be shown that [Buseman & Kelly 53]:

{a, b|x′, y} > {a, b|x, y}. (2.2.16)

So anytime we shorten the interval we increase the cross-ratio, and, consequently, the distance from a to b is also increased.
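Properties (2.2.15) and (2.2.16) are easy to check on a concrete example. The sketch below assumes the points lie on the real line in the order y < a < d < b < x′ < x and uses unsigned distances, {a, b|x, y} = (ax/bx)/(ay/by); the numerical positions and function names are illustrative only.

```python
import math

def cross_ratio(a, b, x, y):
    """{a, b | x, y} = (ax/bx) / (ay/by) with unsigned distances on a line."""
    return (abs(x - a) / abs(x - b)) / (abs(y - a) / abs(y - b))

def h(a, b, x, y):
    """Hyperbolic distance as the logarithm of the cross-ratio."""
    return math.log(cross_ratio(a, b, x, y))

# Assumed positions: y < a < d < b < x_short < x.
y, a, d, b, x_short, x = 0.0, 1.0, 2.0, 3.0, 4.0, 5.0

product = cross_ratio(a, d, x, y) * cross_ratio(d, b, x, y)
print(product, cross_ratio(a, b, x, y))                   # property (2.2.15)
print(cross_ratio(a, b, x_short, y), cross_ratio(a, b, x, y))  # property (2.2.16)
```

Because the cross-ratios multiply, the distances h add along a single line; it is the shortening property that leaves room for a strict triangle inequality.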

To establish the triangle inequality consult Fig. 2.10. The perspectivity of the lines uv and xy from the pole, p, and the inequality (2.2.16) give

{a, c|u, v} = {a, d|x′, y′} ≥ {a, d|x, y}.


Fig. 2.10. The four points u, a, c, v and a, d, x′, y′ from point p have the same angles, hence, have the same cross-ratio. This also is true for c, b, w, z and d, b, x′, y′.

Likewise, the perspectivity of wz and xy, together with inequality (2.2.16), give

{c, b|w, z} = {d, b|x′, y′} ≥ {d, b|x, y}.

Taking the product of the two inequalities, and using property (2.2.15), results in

{a, c|u, v} · {c, b|w, z} ≥ {a, d|x, y} · {d, b|x, y} = {a, b|x, y}.

Finally, forming the hyperbolic distances by taking the logarithm of both sides yields the triangle inequality,

h(a, c) + h(c, b) ≥ h(a, b), (2.2.17)

for the hyperbolic distance as the logarithm of the cross-ratio.

2.2.5 The Möbius transform

The properties of the Möbius transform that we discuss here will be used in Chapter 8, especially in Sec. 8.2.


2.2.5.1 Invariance of the cross-ratio

We will now show that the Möbius transform leaves the cross-ratio invariant. It is this property that Poincaré used to show that all the cells of the tessellations in the hyperbolic plane are of equal size.

Take all four numbers z1, z2, z3, z4 to be different, and cz + d ≠ 0 for any of them.

If

\[ w_i = \frac{a z_i + b}{c z_i + d}, \]

i = 1, . . . , 4, the wi are all different and their differences are given by:

\[ w_1 - w_2 = \frac{\Delta\,(z_1 - z_2)}{(cz_1 + d)(cz_2 + d)}, \qquad w_2 - w_3 = \frac{\Delta\,(z_2 - z_3)}{(cz_2 + d)(cz_3 + d)}, \]

with Δ denoting, again, the determinant ad − bc. Dividing the first by the second,

\[ \frac{w_1 - w_2}{w_2 - w_3} = \frac{z_1 - z_2}{z_2 - z_3}\cdot\frac{cz_3 + d}{cz_1 + d}. \]

Likewise,

\[ \frac{w_1 - w_4}{w_4 - w_3} = \frac{z_1 - z_4}{z_4 - z_3}\cdot\frac{cz_3 + d}{cz_1 + d}, \]

and again dividing the first by the second,

\[ \frac{w_1 - w_2}{w_2 - w_3}\Big/\frac{w_1 - w_4}{w_4 - w_3} = \frac{z_1 - z_2}{z_2 - z_3}\Big/\frac{z_1 - z_4}{z_4 - z_3}. \]

The left- and right-hand sides are the cross-ratios of four numbers, and they are equal. This shows that the Möbius transform preserves cross-ratios.
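A quick numerical check of this invariance, as a minimal sketch: the coefficients a, b, c, d and the four sample points below are arbitrary illustrative values (chosen so that ad − bc ≠ 0 and no point is a pole), and the cross-ratio is written in the same arrangement as in the derivation above.

```python
def mobius(z, a, b, c, d):
    """w = (a z + b) / (c z + d)."""
    return (a * z + b) / (c * z + d)

def cross_ratio(z1, z2, z3, z4):
    """[(z1 - z2)/(z2 - z3)] / [(z1 - z4)/(z4 - z3)]."""
    return ((z1 - z2) / (z2 - z3)) / ((z1 - z4) / (z4 - z3))

# Assumed coefficients with non-vanishing determinant, and four distinct points.
coeffs = (2 + 1j, -1 + 0j, 0.5j, 3 + 0j)
zs = (1 + 1j, -2 + 0.5j, 0.3 - 1j, 4 + 2j)
ws = tuple(mobius(z, *coeffs) for z in zs)

print(cross_ratio(*zs))
print(cross_ratio(*ws))
```

The two printed complex numbers coincide up to floating-point rounding.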

2.2.5.2 Fixed points

A fixed point occurs when w = z. Fixed points are, therefore, determined by the equation

\[ cz^2 + (d - a)z - b = 0. \]

It is not difficult to see that the only Möbius transform with more than two fixed points is the identity transform. For if a = d and b = c = 0, every point is fixed. The transformation reduces to w = z, or the identity


transformation. Now, if a = d, b ≠ 0, and c = 0, the quadratic has one root, i.e. ∞, which is the only fixed point. Further, if c = 0, a ≠ d, and b ≠ 0, the quadratic has distinct roots, b/(d − a) and ∞. Now, assume that c ≠ 0 and δ is either of the square roots of the discriminant, (a − d)² + 4bc. If δ ≠ 0, the quadratic has two distinct roots, (a − d ± δ)/2c. If, instead, δ = 0, the roots coalesce to a single fixed point (a − d)/2c. Hence, we have shown that there cannot be more than two fixed points of a Möbius transformation other than the identity.
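The case analysis can be collected into a short routine. This is only a sketch under the conventions of this subsection; the function name `fixed_points`, the use of `cmath.inf` to stand for the point at infinity, and the return value `None` for the identity are choices made here for illustration.

```python
import cmath

def mobius(z, a, b, c, d):
    """w = (a z + b) / (c z + d)."""
    return (a * z + b) / (c * z + d)

def fixed_points(a, b, c, d):
    """Fixed points of w = (az + b)/(cz + d); None means every point is fixed."""
    if c != 0:
        delta = cmath.sqrt((a - d) ** 2 + 4 * b * c)   # a square root of the discriminant
        roots = {(a - d + delta) / (2 * c), (a - d - delta) / (2 * c)}
        return sorted(roots, key=lambda z: (z.real, z.imag))
    if a != d:
        return [b / (d - a), cmath.inf]                # one finite root and infinity
    if b != 0:
        return [cmath.inf]                             # pure translation
    return None                                        # identity transformation

print(fixed_points(0, 1, 1, 0))   # w = 1/z
print(fixed_points(1, 0, 1, 1))   # w = z/(z + 1), coalesced roots
print(fixed_points(2, 3, 0, 1))   # w = 2z + 3
print(fixed_points(1, 2, 0, 1))   # w = z + 2
```

In every case the list contains at most two entries, in line with the conclusion above.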

2.2.5.3 Associativity

The Möbius transformation is also associative, just like the cross-ratio, (2.2.15). That is, if T1 transforms z1 into z2 and T2 transforms z2 into z3, then the product T1T2 transforms z1 into z3.

Let T1 be the Möbius transform, w = (az + b)/(cz + d), and T2 be the transform, w = (Az + B)/(Cz + D), then their product, T1T2, is defined as

\[ w = \frac{A[(az + b)/(cz + d)] + B}{C[(az + b)/(cz + d)] + D}, \]

which has the same form

\[ w = \frac{(Aa + Bc)z + (Ab + Bd)}{(Ca + Dc)z + (Cb + Dd)}. \]

Since the determinant is the product of determinants, Δ1Δ2 = (AD − BC)(ad − bc), and does not vanish, T1T2 is also a Möbius transformation.

A special product transformation will be of importance in our further developments; that is, when the product T1T2 = I, the identity transformation. The identity w = z will result only when the following conditions are met

\[ Aa + Bc = Cb + Dd, \qquad Ab + Bd = 0, \qquad Ca + Dc = 0. \]

This will happen only when the ratios A : B : C : D are the same as d : −b : −c : a. Then there is a unique transform T2, which has the Möbius transform,

\[ w = \frac{dz - b}{-cz + a}. \]


It is the inverse to T1, and is written as T1⁻¹. Hence, T1 transforms z1 into z2, and T1⁻¹ transforms z2 back into z1. Moreover, if it happens that a + d = 0, T1 and T1⁻¹ are the same, and T1 is called involutory, since it is its own inverse.
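The product, inverse, and involution properties can be sketched with the coefficients stored as 4-tuples (a, b, c, d); the composite coefficients are exactly those of the product formula above. The helper names and the sample coefficients are illustrative assumptions.

```python
def apply(t, z):
    """Apply the Möbius transform with coefficients t = (a, b, c, d) to z."""
    a, b, c, d = t
    return (a * z + b) / (c * z + d)

def compose(outer, inner):
    """Coefficients of z -> outer(inner(z)), i.e. first inner, then outer."""
    A, B, C, D = outer
    a, b, c, d = inner
    return (A * a + B * c, A * b + B * d, C * a + D * c, C * b + D * d)

def inverse(t):
    """Coefficients d : -b : -c : a of the inverse transform."""
    a, b, c, d = t
    return (d, -b, -c, a)

def det(t):
    a, b, c, d = t
    return a * d - b * c

T1 = (2.0, 1.0, 1.0, 3.0)
T2 = (1.0, -1.0, 2.0, 5.0)
z = 0.4 + 0.7j

print(apply(compose(T2, T1), z), apply(T2, apply(T1, z)))
print(compose(inverse(T1), T1))          # proportional to the identity (1, 0, 0, 1)
print(compose((1.0, 2.0, 3.0, -1.0), (1.0, 2.0, 3.0, -1.0)))  # a + d = 0: involutory
```

Since the coefficients are defined only up to a common factor, a result such as (5, 0, 0, 5) represents the identity w = z.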

2.2.5.4 Transformations for which the unit circle is invariant

These transformations are particularly interesting for they correspond to the Lorentz transformations in relativity. Considering complex coordinates on the unit disc, a Lorentz transformation corresponds to a Möbius transformation,

\[ w = \frac{az + c}{cz + a}, \tag{2.2.18} \]

for which |a| > |c|, so that their ratio, c/a, will be a point in the interior of the unit disc. This is a necessary and sufficient condition that w maps the interior of the unit disc onto its interior [Schwerdtfeger 62].

A Möbius transform which transforms three distinct points of the unit circle into three other distinct points of the circle must, obviously, transform the unit circle into itself, since if z traces a circle or a line, so too will w.

If the Möbius transform, w = (az + b)/(cz + d), transforms the unit circle into itself, then |w| = 1, implying |az + b| = |cz + d|. The latter condition must be the same as |z| = 1. Now, the condition |az + b| = |cz + d| is the same as

|z + b/a| = |c/a| · |z + d/c|.

This is the equation of a circle having a pair of inverse points −b/a and −d/c. We know that two points inverse with respect to the unit circle must be of the form z and 1/z̄, which implies b/a = c̄/d̄.

As a special case we can set b = c and d = a. For then, the Möbius transform which carries the unit circle into itself will be of the form (2.2.18). Inverting it we get

\[ z = \frac{aw - c}{-cw + a}. \]

The family of circles |z| = κ > 0, where κ is real, is transformed into the coaxial circles |aw − c| = κ|cw − a|. Coaxial circles are a family of circles such that any pair has the same radical axis. The radical axis is the line passing through the two points of intersection of a pair of circles, as the line PQ in


Fig. 8.1. The origin, which is a circle of radius κ = 0, is transformed into c/a, which lies inside the unit circle when |c| < |a|, and outside of it when |c| > |a|. The condition that the determinant must not vanish, ad − bc ≠ 0, prohibits the case |c| = |a|.
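A numerical illustration of (2.2.18), as a sketch restricted to real coefficients a > c > 0 (an assumption made here so that no complex conjugation enters; the values a = 2, c = 1 and the function name are illustrative):

```python
import cmath

def disc_map(z, a, c):
    """w = (a z + c) / (c z + a), the form (2.2.18)."""
    return (a * z + c) / (c * z + a)

# Assumed real coefficients with |a| > |c|.
a, c = 2.0, 1.0

# Points on the unit circle stay on the unit circle ...
boundary = [abs(disc_map(cmath.exp(1j * t), a, c)) for t in (0.0, 0.7, 2.1, 3.0, 5.5)]
# ... an interior point stays inside, and the origin goes to c/a.
interior = abs(disc_map(0.3 - 0.4j, a, c))
image_of_origin = disc_map(0.0, a, c)

print(boundary)
print(interior, image_of_origin)
```

Exchanging the roles of a and c (so that |c| > |a|) sends the origin to a point outside the unit circle, which corresponds to the second case described in the text.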

2.3 Geodesics

Returning to our hyperbolic model of the heated plane, we have

\[ s = \int_{p_1}^{p_2} \frac{\sqrt{dx^2 + dy^2}}{y} = \int_{p_1}^{p_2} \frac{1}{y}\left[1 + \left(\frac{dy}{dx}\right)^2\right]^{1/2} dx, \tag{2.3.1} \]

as the distance between two points p1 and p2. The pre-factor has the form of a varying index of refraction. For if we suppose that at the Earth’s surface y = 0, the index of refraction η(y) will be a function of height y only. The propagation time, τ, along a ray connecting two endpoints p1 and p2 will be given by Fermat’s principle of least time:

\[ c\tau = \int_{p_1}^{p_2} \eta(y)\sqrt{1 + y'^2}\,dx, \tag{2.3.2} \]

where the prime denotes differentiation with respect to the independent variable, x. The product cτ is known as the optical path length, where c is the velocity of light in vacuum. According to Fermat’s principle of least time, the optical path length is stationary for the true ray path.

In terms of the integrand of (2.3.2),

\[ \Phi(y, y') = \eta(y)\sqrt{1 + y'^2}, \tag{2.3.3} \]

the Euler–Lagrange equation can be written as

\[ \Phi - y'\frac{\partial\Phi}{\partial y'} = C, \tag{2.3.4} \]

where C = const is a first integral of the motion. Explicitly, the Euler–Lagrange equation (2.3.4) is

\[ \frac{\eta}{\sqrt{1 + y'^2}} = C. \tag{2.3.5} \]

So, the constant C is the value of the index of refraction where the ray becomes horizontal. The angle, θ, formed between the tangent to the ray


Fig. 2.11. Derivation of Snell’s law.

and the normal to the layers of constant index of refraction, shown in Fig. 2.11, is given by

\[ \theta = \arcsin\left(\frac{dx}{\sqrt{dx^2 + dy^2}}\right) = \arcsin\left(\frac{1}{\sqrt{1 + y'^2}}\right), \]

so that the Euler–Lagrange equation (2.3.5) coincides with Snell’s law,

\[ \eta(y)\sin\theta = C. \tag{2.3.6} \]

According to Snell’s law, the sines of the angles which the incident, θi, and transmitted, θt, rays make with the normal to an interface between two different media are proportional, i.e.

\[ \frac{\sin\theta_i}{\sin\theta_t} = \eta, \tag{2.3.7} \]

where η is the relative index of refraction of the two media. Expression (2.3.6) generalizes Snell’s law to the case where the index of refraction is a function of the height.

Ordinarily, the index of refraction decreases with altitude, and this is borne out by the heated plane model since upon comparing terms in the integrands of (2.3.1) and (2.3.2) we find η = 1/y, and, consequently, dη/dy < 0. The true ray that will connect the two points will be concave: Light minimizes its propagation time by arching its path upwards between the endpoints, like a cat ready to attack. As a result, objects do not appear to be where they are but are a little bit lower than our line of sight.


In contrast, the index of refraction will be an increasing function of height in inversion layers where mirages are formed. The ray is now convex and the images will be higher than our line of sight. In both cases, these distortions are caused by the non-Euclidean nature of the geometry. This will be a recurrent theme throughout.

Even without making the integral (2.3.1) stationary we can get some remarkable properties about hyperbolic geometry, undoubtedly the most important of which is the angle of parallelism. Transforming to polar coordinates, and considering the arc γ to increase the angle from α to π/2 we get

\[ \int_\gamma \frac{\sqrt{dx^2 + dy^2}}{y} = \int_\alpha^{\pi/2} \frac{\sqrt{r'^2 + r^2}}{r\sin\theta}\,d\theta \ge \int_\alpha^{\pi/2} \frac{d\theta}{\sin\theta} = -\ln\tan(\alpha/2), \tag{2.3.8} \]

where the prime now stands for differentiation with respect to the independent variable, θ.
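The closed form on the right of (2.3.8) can be confirmed by direct quadrature. The sketch below uses a plain composite Simpson rule; the function name and the number of subintervals are illustrative choices.

```python
import math

def arc_length(alpha, n=2000):
    """Simpson's rule for the integral of dθ / sin θ from alpha to π/2 (n even)."""
    step = (math.pi / 2 - alpha) / n
    total = 1.0 / math.sin(alpha) + 1.0 / math.sin(math.pi / 2)
    for k in range(1, n):
        weight = 4.0 if k % 2 else 2.0
        total += weight / math.sin(alpha + k * step)
    return total * step / 3.0

for alpha in (0.3, 1.0):
    print(alpha, arc_length(alpha), -math.log(math.tan(alpha / 2)))
```

For each α the numerical integral and −ln tan(α/2) agree to many decimal places; the bound in (2.3.8) is attained when r′ = 0, that is, along a circular arc centered at the origin.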

If we set (2.3.8) proportional to the minimum distance from a point P, using the perpendicular distance d, to a line ℓ, as shown in Fig. 2.12, we have one of the most remarkable formulas in all of mathematics. The number α of radians in the angle of parallelism depends only on the distance d from P to Q and not on the particular line ℓ, or the particular point P. The formula was discovered independently by J. Bolyai and N. Lobachevsky. In Euclidean geometry the rays emanating at P must coincide, so α, which is usually written as Π(d) in the literature, is always a right angle. However, under Lobachevsky’s postulate these lines are distinct and the angle Π(d) is necessarily acute. It is a function only of the hyperbolic distance d.

Fig. 2.12. Angle of parallelism.


The Euler–Lagrange equation which renders the integral (2.3.1) stationary is:

\[ y'' + \frac{1 + y'^2}{y} = 0. \]

The solution to this equation is a family of circles

\[ (x - a)^2 + y^2 = c^2, \]

whose centers lie on the x-axis, where a and c are constants of integration. Restricting ourselves to the half-circles located in the upper half-plane will give us the Poincaré half-plane model. The semi-circles will be the geodesics of our hyperbolic space.
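That the semi-circles solve the Euler–Lagrange equation, and that Snell's law (2.3.6) holds along them with η = 1/y, can be verified pointwise. In this sketch the derivatives of the semi-circle are written out analytically (y′ = −(x − a)/y and y″ = −c²/y³); the values of a and c and the function names are illustrative.

```python
import math

def semicircle(x, a, c):
    """Return y, y', y'' on the upper half of (x - a)^2 + y^2 = c^2."""
    y = math.sqrt(c * c - (x - a) ** 2)
    yp = -(x - a) / y
    ypp = -c * c / y ** 3
    return y, yp, ypp

def ode_residual(x, a, c):
    """Left-hand side of y'' + (1 + y'^2)/y = 0 evaluated on the semi-circle."""
    y, yp, ypp = semicircle(x, a, c)
    return ypp + (1.0 + yp * yp) / y

def snell_invariant(x, a, c):
    """η(y) sin θ with η = 1/y and sin θ = 1/sqrt(1 + y'^2)."""
    y, yp, _ = semicircle(x, a, c)
    return (1.0 / y) / math.sqrt(1.0 + yp * yp)

a, c = 1.0, 2.0
for x in (-0.5, 0.2, 1.0, 2.4):
    print(x, ode_residual(x, a, c), snell_invariant(x, a, c))
```

The residual vanishes to rounding error and the Snell invariant equals 1/c at every point, which is the value of η = 1/y at the top of the semi-circle, where the ray is horizontal.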

If the plane were Euclidean, we could draw only one line through any given point parallel to a given straight line. This is Euclid’s fifth postulate. In this plane there would be only one geodesic through a given point that would be parallel to another given geodesic. Not so in our heated plane! Because the geodesics are semi-circles, all geodesics through a point P not lying on the geodesic g in Fig. 2.13 are parallel to g, even h1 and h2, which are tangent to it at points U and V, because those points have been excluded by considering them infinitely far away, i.e. points at infinity.

To any student of geometry, this smacks of the geometry of Lobachevsky, who only claimed that there exist two lines parallel to a given line through a given point not on the line. However, this does not mean that he did not recognize that there were infinitely many non-intersecting lines. His parallel

Fig. 2.13. The number of lines passing through P that are hyperparallel to the line g is infinite. The lines h1 and h2 are limiting parallel to g, while the others are hyperparallel to g.


property is what is now usually referred to as ‘asymptotically parallel’ or ‘horoparallel.’

Here, Lobachevsky’s statement is due to the peculiar nature of the Poincaré half-plane model. Nevertheless, the half-plane model illustrates the hallmark of Lobachevskian geometry: the sum of the angles of a triangle is less than two right angles.

The heated plane model also illustrates other properties of projective geometry. We substitute the positive square root, √[c² − (x − a)²], for y in (2.3.1) and determine the distance s between two points x1 and x2 as

\[
\begin{aligned}
s &= \int_{x_1}^{x_2} \frac{c}{c^2 - (x - a)^2}\,dx \\
  &= \frac{1}{2}\int_{x_1}^{x_2} \left\{\frac{1}{c + x - a} + \frac{1}{c - x + a}\right\} dx \\
  &= \frac{1}{2}\ln\left[\frac{x_2 - (a - c)}{(a + c) - x_2}\cdot\frac{(a + c) - x_1}{x_1 - (a - c)}\right].
\end{aligned}
\tag{2.3.9}
\]

Now, let us define two other x-coordinates, x3, x4 with x4 > x3, as the points where the geodesic intersects the x-axis, i.e. x3 = a − c and x4 = a + c. Substituting these values into (2.3.9) results in

\[ s = \frac{1}{2}\ln\left(\frac{x_2 - x_3}{x_4 - x_2}\cdot\frac{x_4 - x_1}{x_1 - x_3}\right). \tag{2.3.10} \]

This is precisely the logarithm of the cross-ratio,

\[ \frac{x_2 - x_3}{x_4 - x_2}\cdot\frac{x_4 - x_1}{x_1 - x_3}, \]

of four ordered points, x1, x2, x3, x4. For fixed endpoints x3 and x4, (2.3.10) is the hyperbolic distance between x1 and x2, which we know by (2.2.17) satisfies the triangle inequality.
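As a numerical cross-check of (2.3.9) and (2.3.10), the integral can be evaluated by quadrature and compared with the logarithm of the cross-ratio. This is a minimal sketch; the semi-circle parameters, the endpoints, and the function names are illustrative assumptions.

```python
import math

def length_by_quadrature(x1, x2, a, c, n=2000):
    """Simpson's rule for the integral of c / (c^2 - (x - a)^2) from x1 to x2."""
    f = lambda x: c / (c * c - (x - a) ** 2)
    step = (x2 - x1) / n
    total = f(x1) + f(x2)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(x1 + k * step)
    return total * step / 3.0

def length_by_cross_ratio(x1, x2, a, c):
    """Half the logarithm of the cross-ratio, as in (2.3.10)."""
    x3, x4 = a - c, a + c
    return 0.5 * math.log(((x2 - x3) / (x4 - x2)) * ((x4 - x1) / (x1 - x3)))

a, c = 1.0, 2.0          # geodesic meets the x-axis at x3 = -1 and x4 = 3
x1, x2 = 0.0, 2.5
print(length_by_quadrature(x1, x2, a, c), length_by_cross_ratio(x1, x2, a, c))
```

Both numbers equal ½ ln 21 for these values. Splitting the interval at an intermediate point and adding the two pieces gives the same total, which is the additive counterpart of the multiplicative property (2.2.15).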

2.4 Models of the Hyperbolic Plane and Their Properties

The half-plane model, studied at the beginning of this chapter, is equipped with the distance function ds = √(dx² + dy²)/y. This is only one model of the hyperbolic plane, because anything with the same metric is also a viable model of the hyperbolic plane.


In search of these other models, we take our cue from Euclidean and spherical geometries where equivalent metrics, or isometries, are found by using complex functions. In terms of the complex number, z = x + iy, the distance ds = √(dx² + dy²)/y becomes |dz|/Im z. The map from the half-plane to the disc is

\[ w = \frac{iz + 1}{z + i}, \quad\text{or}\quad z = \frac{-iw + 1}{w - i}, \]

so that

\[
\begin{aligned}
ds = \frac{|dz|}{\operatorname{Im} z}
  &= \left|d\left(\frac{-iw + 1}{w - i}\right)\right| \Big/ \operatorname{Im}\left(\frac{-iw + 1}{w - i}\right) \\
  &= \frac{2|dw|}{|w - i|^2} \Big/ \operatorname{Im}\frac{(1 - iw)(\bar{w} + i)}{|w - i|^2} \\
  &= \frac{2|dw|}{1 - |w|^2}.
\end{aligned}
\tag{2.4.1}
\]
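The equality of the two line elements can be tested numerically by pushing a small displacement through the map. This is a sketch with an arbitrary sample point and displacement; the finite difference only approximates dw, so agreement is expected to first order in the step size.

```python
def to_disc(z):
    """Map from the upper half-plane to the unit disc, w = (iz + 1)/(z + i)."""
    return (1j * z + 1) / (z + 1j)

def half_plane_ds(z, dz):
    """|dz| / Im z."""
    return abs(dz) / z.imag

def disc_ds(w, dw):
    """2|dw| / (1 - |w|^2), as in (2.4.1)."""
    return 2 * abs(dw) / (1 - abs(w) ** 2)

# Assumed sample point in the upper half-plane and a small displacement.
z = 0.7 + 1.3j
dz = 1e-6 * (0.6 - 0.8j)

w = to_disc(z)
dw = to_disc(z + dz) - w

print(abs(w))                                  # inside the unit disc
print(half_plane_ds(z, dz), disc_ds(w, dw))    # the two line elements agree
```

The point z = i is sent to the center of the disc, w = 0, and the real axis is sent to the boundary circle |w| = 1.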

Thus, we have two models already of the hyperbolic plane:

• the upper half-plane model with distance ds = |dz|/Im z, where the ‘lines’ are semi-circles perpendicular to the real axis, as in Fig. 2.13, and angles are the same as Euclidean angles; and

• the open disc model with metric, (2.4.1), and ‘lines’ that are circular arcs orthogonal to the boundary, as shown in Fig. 2.6, with angles the same as Euclidean angles.

The reason for conformality of the Poincaré disc model is that it took two inversions to go from the half-plane to the disc. We will soon meet yet another disc model which straightens out the circular arcs at the cost of losing angle invariance.

A consequence of having a circle at infinity as a natural boundary is that the points on the boundary of the disc are actually located at infinity, as our inhabitants of the unit disc, whom we shall refer to affectionately as ‘Poincarites,’ know.ᵃ

The distance from the origin to any point tends to infinity as the modulus of the point tends to 1. Lines, or rather circular arcs, which have a common point on the circle

ᵃ The circle at infinity will take on a physical guise when it is identified as the limit of the inner solution to the Schwarzschild metric in Sec. 9.10.3. The name ‘Poincarites’ was probably first used by Needham [97].


at infinity are known as asymptotic lines. The point where they meet is not a point belonging to the lines, but, rather, a limit point because such points are at ‘infinity.’ In contrast, ultraparallels are circular arcs which cut the circle at infinity but have no common point.

The distinction between these two kinds of lines in the unit disc is far from academic. A product of reflections in ultraparallel lines constitutes a translation, whereas a product of reflections in asymptotic lines is a ‘limit’ rotation. Limit rotations on the disc are circles tangent to the circle at infinity, and are known as horocycles, or ‘limit’ cycles. An amazing find was that a horocycle, or a horosphere in three dimensions, is a circle with its center at infinity that obeys Euclidean, and not hyperbolic, geometry. This discovery was made by Wachter, a student of Gauss, way back in 1816. It will allow us to use Euclidean geometry to determine the properties of the hyperbolic plane, notably the angle of parallelism and the necessity of introducing an absolute constant, or a unit of measure, which is completely foreign to Euclidean geometry.

The horocycle, or circle whose center is at infinity, i.e. on the boundary of the unit disc, is most clearly seen by considering the ‘pseudosphere.’ Of all the mappings of surfaces of constant negative curvature onto the unit disc, it is only the middle figure in Fig. 2.14, which has been adapted from Klein’s 1928 book on non-Euclidean geometry, that shows the horocycles as dashed lines. The solid lines are the image of one turn of the covering of the pseudosphere. All three mappings show that surfaces of constant negative curvature are mapped only onto part of the disc.

We postpone a discussion of some of the remarkable properties of the pseudosphere, which is to hyperbolic geometry what the plane and sphere are to Euclidean and elliptic geometry, respectively, and use the following property of horocycles to derive the angle of parallelism.

The ratio of any two concentric limiting arcs cut by radii depends only on the distance between them and not on their size or where they are located in the hyperbolic plane.

It is no exaggeration to say that all the trigonometric relations of hyperbolic geometry follow from the fact that the ratio of concentric limiting arcs l and m, with l > m, intercepted between two radii is


Fig. 2.14. Surfaces of negative constant curvature that are mapped onto part of the hyperbolic plane. The middle figure is the mapping of a pseudosphere that produces horocycles as dashed curves.

given by

\[ l/m = e^{a/\kappa}, \tag{2.4.2} \]

where a is the distance between the arcs and κ is a positive (absolute) constant.

In Fig. 2.15 the arcs l, m, and n are cut by two radii. The distance between the first two is a, while the distance between the second and third arcs is b. We know that the ratios depend on the distance between them but


Fig. 2.15. The ratio of concentric limiting arcs depends only on the distance between them.

we do not know the functional form. That is,

\[ l/m = f(a), \qquad m/n = f(b), \qquad l/n = f(a + b), \]

where f is a positive and increasing function.

These relations suggest the functional relation,

\[ f(a)\cdot f(b) = f(a + b). \]

Such a functional relation can only be satisfied by an exponential function. The transfer from any exponential base g > 1 to the constant e entails introducing an absolute constant κ = 1/ln g such that

\[ g^a = (e^{\ln g})^a = e^{a/\kappa}, \]

and on account that g > 1, κ > 0.

If we go back to the pseudosphere, we find that for any two points on its surface, the following remarkable relationship holds

\[ x^2 + y^2 < \kappa^2. \]

When we go to plot these points on a Euclidean plane, as in Fig. 2.14, they are constrained to lie within a circle of radius κ. All points on the entire pseudosphere are thus constrained to lie within a circle of radius κ on a Euclidean plane.

This radius is called the radius of curvature, or space constant, and is an absolutely determined length. It is the analog of the radius of a sphere in spherical geometry under the transform κ → iκ. And although it is absolutely determined, its magnitude will depend upon the units chosen.


Fig. 2.16. Using Euclidean geometry to derive the angle of parallelism by considering concentric limiting arcs.

The arc lengths C′B′ and CB belong to two concentric horocycles in Fig. 2.16. The angle at C′ is a right angle so that by Euclidean geometry B′C′ = l sin β and AC′ = l cos β. By (2.4.2), the ratio of the concentric arcs separated by the distance b = CC′ is

\[ e^{b/\kappa} = e^{CC'/\kappa} = \frac{l}{B'C'} = \frac{l}{l\sin\beta} = \csc\beta. \]

Now, the ratio of the concentric limiting arcs l + AC′ and l is [Kulczycki 61]

\[ e^{a/\kappa}\big/e^{b/\kappa} = \frac{l + AC'}{l} = 1 + \cos\beta, \]

and consequently,

\[ e^{a/\kappa} = \frac{1 + \cos\beta}{\sin\beta} = \cot(\beta/2). \]

Denoting β = Π(a) as the angle of parallelism, which can only depend on the hyperbolic distance a, we obtain the Bolyai–Lobachevsky formula,

\[ \tan\frac{\Pi(a)}{2} = e^{-a/\kappa}. \tag{2.4.3} \]

Euclidean geometry can be used in the hyperbolic plane to derive non-Euclidean results by considering the properties of concentric horocycles, and from their property (2.4.2) all the trigonometric formulas of hyperbolic geometry follow.
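The Bolyai–Lobachevsky formula (2.4.3) is simple to evaluate. The sketch below solves it for Π(a) and checks the half-angle identity used in the derivation; the function names and the default κ = 1 are illustrative choices.

```python
import math

def angle_of_parallelism(a, kappa=1.0):
    """Π(a) from tan(Π/2) = exp(-a/κ)."""
    return 2.0 * math.atan(math.exp(-a / kappa))

def cot_half(beta):
    """(1 + cos β) / sin β, which equals cot(β/2)."""
    return (1.0 + math.cos(beta)) / math.sin(beta)

for a in (0.0, 0.5, 2.0, 5.0):
    beta = angle_of_parallelism(a)
    print(a, math.degrees(beta), cot_half(beta), math.exp(a))
```

At a = 0 the angle is a right angle, the Euclidean value, and it decreases monotonically toward zero as a grows. For distances a much smaller than κ the departure from a right angle is correspondingly small, which is why a large space constant is hard to detect.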

Now consider M1(x1, y1) and M2(x2, y2) as any two points on a horocycle lying in the unit disc. Let M(x, y) be any point on the arc M1M2. With


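M1M/MM2 = λ, the relations between the coordinates are known to be

\[ x = \frac{x_1 + \lambda x_2}{1 + \lambda}, \qquad y = \frac{y_1 + \lambda y_2}{1 + \lambda}. \tag{2.4.4} \]

The values λ1 and λ2 will be the roots obtained from the equation of the unit circle,

\[ x^2 + y^2 = 1, \tag{2.4.5} \]

that is, when M coincides with either P or Q on the boundary. Introducing (2.4.4) into (2.4.5),

\[ (x_1 + \lambda x_2)^2 + (y_1 + \lambda y_2)^2 - (1 + \lambda)^2 = 0, \]

and defining

\[ \Omega_{11} = x_1^2 + y_1^2 - 1, \qquad \Omega_{22} = x_2^2 + y_2^2 - 1, \qquad \Omega_{12} = x_1x_2 + y_1y_2 - 1, \]

the former equation can be written as the quadratic equation,

\[ \Omega_{22}\lambda^2 + 2\Omega_{12}\lambda + \Omega_{11} = 0. \]

The ratio of the roots to this quadratic,

\[ \lambda_{1,2} = \frac{-\Omega_{12} \pm \sqrt{\Omega_{12}^2 - \Omega_{11}\Omega_{22}}}{\Omega_{22}}, \]

is the cross-ratio,

\[ \{M_1, M_2|P, Q\} = \frac{\lambda_2}{\lambda_1} = \frac{\Omega_{12} + \sqrt{\Omega_{12}^2 - \Omega_{11}\Omega_{22}}}{\Omega_{12} - \sqrt{\Omega_{12}^2 - \Omega_{11}\Omega_{22}}}. \]

Thus, the distance between the two points M1 and M2 is:

\[ s(M_1, M_2) = \frac{\kappa}{2}\ln\left(\frac{\Omega_{12} + \sqrt{\Omega_{12}^2 - \Omega_{11}\Omega_{22}}}{\Omega_{12} - \sqrt{\Omega_{12}^2 - \Omega_{11}\Omega_{22}}}\right) = \kappa\tanh^{-1}\left(\frac{\sqrt{\Omega_{12}^2 - \Omega_{11}\Omega_{22}}}{\Omega_{12}}\right), \]

where κ is the absolute constant.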
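The two expressions for the distance can be compared numerically. In this sketch the points M1 and M2 are arbitrary points inside the unit disc, the symbol Ω follows the definitions above, and absolute values are taken so that the distance is reported as a positive number (for interior points Ω12 is negative); the function names are illustrative.

```python
import math

def omega(p, q):
    """Ω_pq = x_p x_q + y_p y_q - 1."""
    return p[0] * q[0] + p[1] * q[1] - 1.0

def distance_from_roots(m1, m2, kappa=1.0):
    """(κ/2) |ln(λ2/λ1)| with λ1, λ2 the roots of Ω22 λ² + 2 Ω12 λ + Ω11 = 0."""
    o11, o12, o22 = omega(m1, m1), omega(m1, m2), omega(m2, m2)
    root = math.sqrt(o12 * o12 - o11 * o22)
    lam1 = (-o12 + root) / o22
    lam2 = (-o12 - root) / o22
    return 0.5 * kappa * abs(math.log(lam2 / lam1))

def distance_from_atanh(m1, m2, kappa=1.0):
    """κ artanh( sqrt(Ω12² - Ω11 Ω22) / |Ω12| )."""
    o11, o12, o22 = omega(m1, m1), omega(m1, m2), omega(m2, m2)
    return kappa * math.atanh(math.sqrt(o12 * o12 - o11 * o22) / abs(o12))

m1, m2 = (0.1, 0.2), (-0.4, 0.5)
print(distance_from_roots(m1, m2), distance_from_atanh(m1, m2))
print(distance_from_atanh((0.0, 0.0), (0.5, 0.0)), math.atanh(0.5))
```

Measured from the center of the disc, the distance to a point at Euclidean radius r reduces to κ artanh r, which grows without bound as r approaches 1.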


We now transfer this result to velocity space. Let u and v be the velocity components, and s becomes the relative velocity, w. The equation for the unit disc transforms into

\[ u^2 + v^2 = c^2, \]

where c is the speed of light. Let us consider a second velocity infinitesimally close to the first, with components (u + du, v + dv). Now, to leading order in the differentials,

\[ \Omega_{11} = u^2 + v^2 - c^2, \qquad \Omega_{22} = du^2 + dv^2, \qquad \Omega_{12} = u\,du + v\,dv, \]

and for small arguments, the inverse hyperbolic tangent can be approximated by the argument itself, so that

\[
\begin{aligned}
dw^2 &= c^2\,\frac{\Omega_{12}^2 - \Omega_{11}\Omega_{22}}{\Omega_{11}^2} \\
     &= c^2\,\frac{(u\,du + v\,dv)^2 - (u^2 + v^2 - c^2)(du^2 + dv^2)}{(u^2 + v^2 - c^2)^2} \\
     &= c^2\,\frac{c^2(du^2 + dv^2) - (v\,du - u\,dv)^2}{(u^2 + v^2 - c^2)^2}.
\end{aligned}
\tag{2.4.6}
\]

Equation (2.4.6) is the famous Beltrami metric, but the derivation we followed was that of Klein [71].

If the velocities are infinitesimally close to one another, such that v = u + du, then (2.4.6) becomes

\[ dw^2 = c^2\left\{\frac{c^2(du)^2 - (u \times du)^2}{(c^2 - u^2)^2}\right\}, \]

or

\[ dw^2 = c^2\left\{\frac{(du)^2}{c^2 - u^2} + \frac{(u \cdot du)^2}{(c^2 - u^2)^2}\right\}. \tag{2.4.7} \]

We will come across the Beltrami metric on numerous occasions, for example in the radiation pressure in Sec. 4.2.4, on the uniformly rotating disc in Sec. 9.6, and in the Thomas precession in Sec. 10.1, which was discovered by Borel.


For a finite difference between the velocities u and v, the relative velocity w is

\[ w^2 = \frac{(u - v)^2 - (u \times v)^2/c^2}{(1 - u \cdot v/c^2)^2}. \tag{2.4.8} \]

The square of the relative velocity, (2.4.8), is invariant under Möbius transforms (2.2.18) with |a| > |c|. If we replace the velocities by u′ and v′ relative to some other frame, the value of (2.4.8) will be unaffected by the change.
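Formula (2.4.8) can be exercised on a few sample velocities. The sketch below works in units with c = 1 by default and treats u and v as three-component tuples; the sample values and function names are illustrative. It also checks the standard identity γ(w) = γ(u)γ(v)(1 − u·v/c²), which follows algebraically from (2.4.8).

```python
import math

def relative_speed(u, v, c=1.0):
    """|w| from (2.4.8) for 3-vectors u and v."""
    diff_sq = sum((ui - vi) ** 2 for ui, vi in zip(u, v))
    cross = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
    cross_sq = sum(x * x for x in cross)
    dot = sum(ui * vi for ui, vi in zip(u, v))
    return math.sqrt(diff_sq - cross_sq / c ** 2) / (1.0 - dot / c ** 2)

def gamma(speed, c=1.0):
    """Lorentz factor 1 / sqrt(1 - speed²/c²)."""
    return 1.0 / math.sqrt(1.0 - (speed / c) ** 2)

# Collinear, opposite velocities: reduces to (u - v) / (1 - u v / c²).
head_on = relative_speed((0.6, 0.0, 0.0), (-0.6, 0.0, 0.0))
print(head_on, 1.2 / 1.36)

# General directions.
u, v = (0.3, 0.4, 0.0), (0.1, -0.2, 0.5)
w = relative_speed(u, v)
dot = sum(ui * vi for ui, vi in zip(u, v))
norm = lambda a: math.sqrt(sum(x * x for x in a))
print(gamma(w), gamma(norm(u)) * gamma(norm(v)) * (1.0 - dot))
```

The relative speed is symmetric in u and v and stays below c whenever both speeds do.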

2.5 A Brief History of Hyperbolic Geometry

The elliptic plane can be developed on a sphere; the hyperbolic plane on a hyperboloid. It took almost two thousand years to appreciate that on such planes Euclid’s fifth postulate would be violated.

Hyperbolic geometry was born in 1829 when Lobachevsky showed that in a right triangle with a fixed side d, as the opposite vertex P moves infinitely far away, the angle α increases to a limit α0 = Π(d) < π/2, as shown in Fig. 2.17.

With a properly chosen unit of distance, Lobachevsky derived

\[ d = -\ln\tan(\Pi(d)/2). \]

Some forty years later, Beltrami showed that this unit of distance corresponds to a surface of negative curvature, −1.

The following year, Lobachevsky was already looking for applications of his ‘imaginary’ geometry. If the universe is indeed non-Euclidean, then the unit of distance must be much larger than our solar system. If the vertex is the star Sirius, and the distance d is that of the Earth’s orbit, the parallax of Sirius would be 1.24′′.ᵇ The parallax of stars is the annual oscillation

Fig. 2.17. A right triangle in hyperbolic space: As P recedes without limit the angle tends to the angle of parallelism, which is a function only of d.

ᵇ Actually, the parallax of Sirius is 0.37′′.

Page 116: A New Perspective on Relativity

Aug. 26, 2011 11:16 SPI-B1197 A New Perspective on Relativity b1197-ch02

Which Geometry? 89

of the star’s apparent position due to the Earth’s motion about the sun.It depends on the distance that the star is from the Earth. Bradley backin 1725 tried to measure the distance to a star using the diameter of theEarth’s orbit as a baseline. He had hoped to determine stellar distancesin much the same way that surveyors measure distances by triangulation.However, what he measured was not the parallax for it depended on theEarth’s motion, and not its position at a given point in the orbit. We willreturn to the phenomenon he discovered, stellar aberration, in Sec. 10.1.

The parallax is measured by comparing the apparent position of a star S with that of some reference star S′ which is much more distant. This is shown in Fig. 2.18, where the angle φ is the parallax of the star. It is the upper limit on the defect D of the triangle ASM,

$$D = \pi - \left(\frac{\pi}{2} + \alpha + \beta\right) < \frac{\pi}{2} - \alpha = \varphi.$$

Three years after Lobachevsky’s first publication, J. Bolyai published his own version of hyperbolic geometry. As we saw in Sec. 1.2.1, Gauss, a friend of Bolyai senior, did nothing to encourage his son to develop his ideas further. Probably the reason can be found in Gauss’s 1824 letter to Taurinus where he writes:

I have sometimes in jest expressed the wish that Euclidean geometry is not true. For then we would have an absolute a priori unit of measurement.

Fig. 2.18. The parallax of a star.

Gauss never published anything during his lifetime on hyperbolic geometry, as it came to be known. We have also mentioned in the preface that Riemann noted in his Habilitation that the metric of a manifold of constant curvature, α, could be written as:

$$ds = \frac{\sqrt{dx_1^2 + \cdots + dx_n^2}}{1 + \frac{\alpha}{4}\left(x_1^2 + \cdots + x_n^2\right)},$$

where α = +1, −1 for elliptic and hyperbolic spaces, respectively.

The next advance came in 1868 with two publications by Beltrami [Stillwell 91]. The conclusion of his first paper was that two-dimensional non-Euclidean geometry is simply the study of surfaces of constant negative curvature. He coined the name ‘pseudosphere’ of radius R for a bugle-shaped surface of constant negative curvature, $-1/R^2$. The pseudosphere is the surface of revolution that is obtained by revolving the tractrix about its axis of symmetry, as shown in Fig. 2.19. The pseudosphere has total curvature −2π, which, when divided by its constant curvature, $-1/R^2$, gives, surprisingly, a finite area $2\pi R^2$.

A tractrix is the track left by a dog on a leash of unit length who is being pulled by his master walking along the x-axis. The curve is determined by the property that its tangent lines meet the x-axis at a unit distance from the point of tangency. The tractrix was known to Newton as far back as 1676, and the pseudosphere was investigated by Huygens as early as 1693. Huygens established that its surface area is finite, and found that its volume and the enclosed center of mass of the solid are also finite.
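Both properties of the tractrix can be verified numerically. The sketch below assumes the standard parametrization x = t − tanh t, y = sech t of the unit tractrix (not given in the text) and a simple midpoint rule; it checks that the tangent segment to the x-axis has unit length, and that the surface swept out by this one branch has area 2π, i.e. $2\pi R^2$ with R = 1.

```python
import math

def tractrix(t):
    """Point of the unit tractrix whose asymptote is the x-axis (t >= 0)."""
    return t - math.tanh(t), 1.0 / math.cosh(t)

def tangent_length(t):
    """Distance from the curve point to where its tangent meets the x-axis (t > 0)."""
    x, y = tractrix(t)
    # Derivative of the curve with respect to t
    dx, dy = math.tanh(t) ** 2, -math.tanh(t) / math.cosh(t)
    s = -y / dy  # step along the tangent that brings y down to zero
    return math.hypot(s * dx, s * dy)

def surface_area(t_max=40.0, n=20000):
    """Midpoint-rule estimate of 2*pi * integral of y ds for 0 <= t <= t_max."""
    h = t_max / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        y = 1.0 / math.cosh(t)
        ds = math.tanh(t)  # arc-length element |d(x, y)/dt|
        total += y * ds * h
    return 2.0 * math.pi * total

leash = tangent_length(0.7)
area = surface_area()
```

The finite area arises because the integrand sech t tanh t decays exponentially even though the surface extends indefinitely along the axis.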

Fig. 2.19. Tractrix and pseudosphere as its surface of revolution.

After having read Riemann’s 1854 inaugural address, which was only published posthumously in 1868, Beltrami realized, in his second paper, that the points of an n-dimensional non-Euclidean geometry are identical to the interior points of a hemisphere:

$$y = \sqrt{c^2 - x_1^2 - \cdots - x_n^2}, \qquad y \ge 0,$$

in an (n + 1)-dimensional Euclidean space, and provided it with the Riemann metric,

$$ds = \frac{R\sqrt{dx_1^2 + \cdots + dx_n^2 + dy^2}}{y}. \tag{2.5.1}$$

The metric (2.5.1) is an obvious generalization of (2.1.1) to give it n − 1 more dimensions. So it was Beltrami who actually discovered the Poincaré disc model some 14 years before Poincaré did!

Beltrami also appreciated that the boundary points at y = 0 are infinitely far from the interior points in this metric. These points are the coldest regions of the heated plane model, and are referred to as points at infinity in the projective plane.

Projecting the hemisphere stereographically onto the disc,

$$x_1^2 + \cdots + x_n^2 \le c^2,$$

Beltrami obtained the conformal disc model with metric,

$$ds = \frac{\sqrt{dz_1^2 + \cdots + dz_n^2}}{1 - \frac{1}{4R^2}\left(z_1^2 + \cdots + z_n^2\right)},$$

which had already been obtained by Riemann. Then, performing an inversion in a boundary point of the disc, Beltrami obtained the half-plane model, with coordinates $x_1, \ldots, x_n$ and y ≥ 0, whose metric is given by (2.5.1). Beltrami gave credit to Liouville for having written down the two-dimensional case earlier, precisely the credit that was denied to him in having discovered the half-plane model in the n-dimensional case!

The two-dimensional formula, (2.1.1), was derived by Liouville in 1850 by mapping the pseudosphere into the half-plane, but he did not realize that the half-plane with his distance formula was a model of hyperbolic geometry. Twenty-one years were to pass before Klein formulated Beltrami’s projective disc model in the language of projective geometry.

A sphere in elliptic geometry with radius R has constant positive curvature, $1/R^2$. A sphere in hyperbolic space also has constant curvature, but it is negative, $-1/R^2$. Therefore, if an elliptic space has radius R, its hyperbolic counterpart has radius iR.

To construct an n-dimensional sphere we begin with the equation of a circle,

$$c^2 = x_0^2 + x_1^2 + \cdots + x_n^2,$$

to which we have added an extra dimension. This gives rise to a Euclidean metric $ds^2 = dx_0^2 + dx_1^2 + \cdots + dx_n^2$. Constraining this metric to the unit sphere, c = 1, gives a Riemannian metric of constant positive curvature, +1.

Alternatively, if we begin with the indefinite metric,

$$ds^2 = -dx_0^2 + dx_1^2 + \cdots + dx_n^2, \tag{2.5.2}$$

which is associated with a hyperbola, $c^2 = -x_0^2 + x_1^2 + \cdots + x_n^2$, then a sphere of radius i centered at the origin is the hyperboloid, $c^2 = -1$. A hyperboloid is the surface of revolution obtained by rotating the hyperbola around $x_0$. We pause for a moment to relate (2.5.2) to space-time.

Consider two inertial frames, S and S′, both traveling at the same uniform speed but in opposite directions. The space-time coordinates of one frame, x, t, must be linear functions of those of the other frame, x′, t′, viz.

$$x = Ax' + Bt',$$

$$x' = Ax - Bt.$$

The origin of S, x = 0, moves with velocity x′/t′ = −B/A = −u in S′. Likewise, the origin of S′, x′ = 0, moves with velocity x/t = B/A = u in S.

Now, we consider the propagation of light signals in both frames; in S we have x = ct, while in S′, x′ = ct′. When these equations are introduced into the above pair of linear equations, we get

$$t = (A + B/c)\,t', \qquad t' = (A - B/c)\,t.$$

The times can be eliminated from these equations to get a condition on the constants, i.e.

$$c^2 = A^2(c^2 - u^2),$$

where we used B/A = u. Rearranging we find

$$A = \frac{1}{\sqrt{1 - u^2/c^2}} =: \gamma.$$

Thus, the Lorentz transformation is

$$x' = \gamma(x - ut), \qquad ct' = \gamma(ct - ux/c). \tag{2.5.3}$$

Now squaring both sides of the first equation, and subtracting from it the square of the second, gives:

$$x'^2 - c^2t'^2 = x^2 - c^2t^2.$$
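The invariance of the interval under (2.5.3) is easily confirmed with numbers. A minimal sketch, assuming c = 1 and working with the pair (x, ct); the function names are ours:

```python
import math

def lorentz(x, ct, u, c=1.0):
    """Apply Eq. (2.5.3): returns (x', ct') for relative velocity u."""
    gamma = 1.0 / math.sqrt(1.0 - u**2 / c**2)
    return gamma * (x - u * ct / c), gamma * (ct - u * x / c)

def interval(x, ct):
    """The quadratic form x^2 - (ct)^2 preserved by the transformation."""
    return x**2 - ct**2

x, ct = 2.0, 3.0
xp, ctp = lorentz(x, ct, u=0.6)

# A light signal x = ct in S remains a light signal x' = ct' in S'
light_xp, light_ctp = lorentz(1.0, 1.0, u=0.6)
```

The pair (xp, ctp) has the same interval as (x, ct), and the light signal satisfies x′ = ct′, which is the condition that was used above to fix the constant A.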

We may now identify $x_0$ with ct in (2.5.2). The Lorentz transformation thus consists in passing from one set to another set comprised of a time-like semi-diameter ct = 1, and a space-like semi-diameter x = 1 of the hyperboloid,

$$x^2 + y^2 + z^2 - (ct)^2 = -1, \tag{2.5.4}$$

and taking the lengths of the new semi-diameters for time and space coordinates.

Actually, Minkowski wrote the Lorentz transform (2.5.3) in the form:

$$x' = x\cos\omega + (ct)\sin\omega, \qquad y' = y, \qquad z' = z,$$

$$ct' = -x\sin\omega + (ct)\cos\omega,$$

and concluded that the Lorentz transformation may be described as

a rotation in a four-dimensional space x, y, z, ct, through an imaginary angle ω in the plane x, ct, or ‘round the plane’ y, z.

Minkowski took his pseudo-Euclidean space seriously, for he wrote in his 1909 “Space and time” paper:

The world postulate permits identical treatment of the four coordinates x, y, z, t. By this means, as I shall now show, the forms in which the laws of physics are displayed gain in intelligibility. In particular the idea of acceleration acquires a clear-cut character.

Fig. 2.20. Minkowski’s vision of space-time.

I will use a geometrical manner of expression, which suggests itself at once if we tacitly disregard z in the triplet x, y, z. I take any world-point O as the zero-point of space-time. The cone $(ct)^2 - x^2 - y^2 = 0$ with apex O in Fig. 2.20 consists of two parts, one with values t < 0, the other with values t > 0. The former, the front cone of O, consists, let us say, of all the world-points which “send light to O,” the latter, the back cone of O, of all the world-points which “receive light from O.” The territory bounded by the front cone alone, we call “before” O, that which is bounded by the back cone alone, “after” O. The hyperboloid sheet already discussed [(2.5.4)] lies after O. The territory between the cones is filled by the one-sheeted hyperboloid figures

$$x^2 + y^2 + z^2 - (ct)^2 = k^2$$

for all constant positive k. We are specially interested in the hyperbolas with O as center, lying on the latter figures. The single branches of these hyperbolas may be called briefly the internal hyperbolas with center O. One of these branches, regarded as a world-line, would represent a motion which, for t = −∞ and t = ∞, rises asymptotically to the velocity of light, c.

Minkowski’s view of space-time has been reproduced in almost every book written on the special theory of relativity. It has led to many speculative thought-experiments regarding communication and space travel. Yet, it lies outside the domain of the hyperbolic plane, and it is in this plane where all the physics occurs, including the velocity addition law which stands in the defense of Poincaré’s (Einstein’s) postulate that c be the limiting velocity.

In order to get a projective space we have to identify antipodal points. These points lie on disjoint sheets of the hyperboloid, the north, N, and south, S, poles. Just as we have different types of maps which represent the surface of the Earth, different maps can be used to represent the hyperbolic plane.

But, in order to fully understand what Beltrami did, let us consider the projection of the sphere onto a plane. The map from the hemisphere onto a plane is called a geodetic projection. A sphere is centered at the origin and has radius κ. A map from a horizontal plane at height r to the hemisphere is given by the projection along the radius by the magnification $(u, v, w) \mapsto (\lambda u, \lambda v, \lambda r)$, where w = r, such that $(\lambda u)^2 + (\lambda v)^2 + (\lambda r)^2 = \kappa^2$. Solving for the magnification yields [cf. (2.2.11) above]

$$\lambda = \frac{\kappa}{\sqrt{r^2 + u^2 + v^2}}.$$

The distance between (u, v) and (u + du, v + dv) is given by the first fundamental form

$$dw^2 = E\,du^2 + 2F\,du\,dv + G\,dv^2 = \kappa^2\,\frac{(r^2 + v^2)\,du^2 - 2uv\,du\,dv + (r^2 + u^2)\,dv^2}{(r^2 + u^2 + v^2)^2}.$$

The absolute constant, κ, is related to the constant positive curvature, i.e. $1/\kappa^2$.

To see how distances become distorted, just set v = 0,

$$dw = \frac{\kappa r\,du}{r^2 + u^2}.$$

In order to get the same increments in dw, we have to consider larger and larger increments in du, as depicted in Fig. 2.7. This is to say that viewed by us Euclideans, it appears that our rulers become longer and longer the farther we travel from the origin.

In contrast, Beltrami considered the projection of a pseudosphere onto a horizontal plane by changing the surface to one of constant negative curvature, $-1/\kappa^2$. Beltrami called the surface of constant negative curvature a ‘pseudosphere,’ changing the name Minding had given his surface, Fig. 2.19.

The pseudosphere, as we have mentioned earlier, possesses some remarkable properties. Many geodesics can be drawn through a point on the surface of the pseudosphere that never meet a given geodesic. If three angles of one triangle are equal respectively to three angles of another, the triangles have equal areas. This is true also in elliptic geometry, and shows that in non-Euclidean geometries, the angles determine the sides of a triangle, something that is not true in Euclidean geometry. In other words, the size of a triangle cannot be altered without distorting it.

Beltrami writes in his Saggio that

$$dw^2 = \kappa^2\,\frac{(r^2 - v^2)\,du^2 + 2uv\,du\,dv + (r^2 - u^2)\,dv^2}{(r^2 - u^2 - v^2)^2} \tag{2.5.5}$$


represents the square of a line element on a surface whose spherical curvature is constant, negative, and equal to [$-1/\kappa^2$]. The form of this expression … has the particular advantage (from our point of view) that a linear equation in u, v represents a geodesic and, conversely, any geodesic is representable by a linear equation in these variables.

In order to derive (2.5.5), Beltrami considered the map from a horizontal plane of height r to the hyperboloid H+ given by the projection $(u, v, w) \mapsto (\lambda u, \lambda v, \lambda w)$, where w = r, onto the hyperboloid $(\lambda r)^2 - (\lambda u)^2 - (\lambda v)^2 = \kappa^2$. To see what distortion there is, we again set v = 0,

$$dw = \frac{\kappa r\,du}{r^2 - u^2}.$$

Now equal increments in dw require smaller and smaller increments in du as we move away from the origin. In other words, our rulers become smaller and smaller as they approach the rim of the disc so that it would appear that the rim is infinitely far away.
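The opposite behaviour of the rulers in the two projections can be made concrete. Integrating the two v = 0 line elements gives w = κ arctan(u/r) on the sphere and w = κ artanh(u/r) on the surface of negative curvature, so that u = r tan(w/κ) and u = r tanh(w/κ), respectively. The sketch below (with r = κ = 1 chosen only for illustration) tabulates the increments in u that correspond to equal steps in w.

```python
import math

def u_spherical(w, r=1.0, kappa=1.0):
    """Plane coordinate reached after a distance w on the sphere (v = 0)."""
    return r * math.tan(w / kappa)

def u_hyperbolic(w, r=1.0, kappa=1.0):
    """Plane coordinate reached after a distance w on the pseudosphere (v = 0)."""
    return r * math.tanh(w / kappa)

# Equal steps in w; the last value, 1.5, stays below pi/2 where tan diverges
steps = [0.3 * k for k in range(6)]
du_sph = [u_spherical(b) - u_spherical(a) for a, b in zip(steps, steps[1:])]
du_hyp = [u_hyperbolic(b) - u_hyperbolic(a) for a, b in zip(steps, steps[1:])]

for a, s, h in zip(steps, du_sph, du_hyp):
    print(f"w from {a:.1f}: du (sphere) = {s:.4f}, du (pseudosphere) = {h:.4f}")
```

The spherical increments grow without bound as w approaches κπ/2, whereas the hyperbolic ones shrink so that u never reaches the rim u = r.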

Beltrami achieved this without producing a surface in three-dimensional Euclidean space that would be analogous to a hemisphere, for no such surface exists. The pseudosphere cannot be considered such a surface since it has a discontinuity where geodesics cannot tread. Klein, however, did indicate how a pseudosphere could be drawn on Beltrami’s disc. Cut it open and place the cusp end on the rim of the disc. The above metric tells us that the rim of the disc is infinitely far away.

Now, what Beltrami did was to project stereographically the upper hyperboloid sheet, H+, onto the unit disc so that all the rays would converge at the origin, O, as shown in Fig. 2.21. In this way Klein got the projective model, which now bears his name, of rays falling onto the horizontal plane $z_0$ = const.

Somewhat earlier, Weierstrass introduced coordinates, analogous to spherical coordinates, to describe a sphere of imaginary radius. Since the radial coordinate is imaginary it is not difficult to see that the appropriate coordinates are

$$x = \kappa\sinh(r/\kappa)\cos\varphi, \qquad y = \kappa\sinh(r/\kappa)\sin\varphi, \qquad z = \kappa\cosh(r/\kappa),$$

where κ is the absolute constant that sets the scale. If we square the third term and subtract the squares of the other two terms we get

$$z^2 - x^2 - y^2 = \kappa^2,$$

which is none other than our hyperboloid with z = ct. The semi-diameter κ sets the scale and is often set equal to unity for mere convenience.

Fig. 2.21. Projection of the hyperboloid onto the plane.

On his surface of constant negative curvature, $-1/\kappa^2$, Beltrami used the Weierstrass coordinates in the form of ratios. That is, there is a map from the upper half-hyperboloid, H+, to any disc with its center on the z-axis parallel to the xy-plane. A convenient choice has the disc touching the upper hyperboloid at its lowest point z = 1. The map sends the coordinates,

$$\left(a\sinh(r/\kappa)\cos\varphi,\; a\sinh(r/\kappa)\sin\varphi,\; a\cosh(r/\kappa)\right),$$

into the homogeneous coordinates (u, v, 1), where

$$u = a\tanh(r/\kappa)\cos\varphi, \qquad v = a\tanh(r/\kappa)\sin\varphi, \tag{2.5.6}$$

are rectilinear coordinates.

Projective geometry allows points at infinity, such as those where parallel train tracks merge in a drawing, to be placed on the same level as any other coordinates in $(X, Y) \in \mathbb{R}^2$. The so-called homogeneous coordinates were introduced by Möbius and Plücker in the early part of the nineteenth century. By extending the coordinates to all real triples, (x, y, z), and dividing through by z to obtain (x/z, y/z, 1), these triples are just the coordinates of a line in $\mathbb{R}^3$ from 0 to (X, Y), where X = x/z and Y = y/z are the homogeneous coordinates. All horizontal lines whose points have coordinates (x, y, 0) correspond to points at infinity. Thus, the extra coordinate z enables us to create new points, and, in particular, the points at infinity. As z → 0, both X and Y tend to infinity, so it is entirely reasonable to consider them as ‘points at infinity.’

If Weierstrass’s coordinates are interpreted as the space-time coordinates in a two-dimensional space then Beltrami’s coordinates are the velocity coordinates of the fundamental disc. In Beltrami’s own words:

If we denote by the letters x and y the rectangular coordinates of points of an auxiliary plane, then the equations

x = u, y = v

determine a representation of the region under investigation in which to every point of the region there corresponds a uniquely determined point of the plane and vice versa; and the whole region turns out to be represented in the interior of a circle of radius [c] with center at the origin that we will call the limit circle. In this representation the chords of the limit circle correspond to the geodesics of the surface and, in particular, the parallels to the coordinate axes correspond to the coordinate geodesic lines.

In fact, Beltrami’s coordinates,

$$u = c\tanh(r/\kappa)\cos\varphi, \qquad v = c\tanh(r/\kappa)\sin\varphi,$$

satisfy

$$u^2 + v^2 = c^2\tanh^2(r/\kappa) < c^2.$$

Taking the square root and inverting we have

$$r = \kappa\tanh^{-1}\left(\frac{\sqrt{u^2 + v^2}}{c}\right) = \frac{\kappa}{2}\ln\left(\frac{c + \sqrt{u^2 + v^2}}{c - \sqrt{u^2 + v^2}}\right).$$

The argument of the logarithm is none other than the cross-ratio {0, r|c, −c}, where, inside the cross-ratio, r stands for the Euclidean distance $\sqrt{u^2 + v^2}$ from the center of the disc to a given point. Hence, r is the distance between any two arbitrary points in Beltrami’s model; and the whole sphere of imaginary radius, or hyperboloid, is represented by the interior of the circle of radius c, the speed of light in vacuo.
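The passage from the hyperbolic polar pair (r, ϕ) to Beltrami's velocity coordinates, and back again through the logarithm of the cross-ratio, can be sketched as follows. The values c = κ = 1 are assumed purely for illustration and the function names are ours.

```python
import math

def beltrami_coords(r, phi, c=1.0, kappa=1.0):
    """Beltrami (velocity) coordinates u, v from the hyperbolic polar pair (r, phi)."""
    rho = c * math.tanh(r / kappa)
    return rho * math.cos(phi), rho * math.sin(phi)

def hyperbolic_radius(u, v, c=1.0, kappa=1.0):
    """Recover r as (kappa/2) times the logarithm of the cross-ratio (c + rho)/(c - rho)."""
    rho = math.hypot(u, v)  # Euclidean distance from the center of the disc
    return 0.5 * kappa * math.log((c + rho) / (c - rho))

u, v = beltrami_coords(1.7, 0.4)
r_back = hyperbolic_radius(u, v)
```

However large r is taken, the point (u, v) stays strictly inside the limit circle, and `r_back` returns the hyperbolic distance that was put in.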


Fig. 2.22. Geodesics determined by planes cutting the hyperboloid and passing through the center.

Geodesics on the hyperboloid are described by planes that pass through a given point p in the direction of the tangent vector to the hyperboloid and pass through the origin. This is shown in Fig. 2.22. The projection onto the unit disc makes them straight lines, which is what we normally expect of geodesics. But, there is a price to pay, namely, angles become distorted and the model is not conformal.

So if we restrict ourselves to the disc, where hyperbolic geometry rules, we cannot reason in terms of space-time, but, rather, in terms of their ratios, the velocities. The distance we want is the distance between two relative velocities. And although Beltrami provided the first proof of the consistency of Lobachevsky’s plane geometry by representing it in the Euclidean plane, he gave no formula for the distance between two arbitrary points. Klein began with Cayley’s expression for the non-Euclidean measure of distance,

$$\frac{1}{2}\ln\left(\frac{e(ap)}{e(aq)}\cdot\frac{e(bq)}{e(bp)}\right) > 0, \tag{2.5.7}$$

which we recognize as one-half the natural logarithm of the cross-ratio, {p, q|a, b}, between two interior points p and q, and two boundary points a and b, as shown in Fig. 2.23.

One-half is introduced so that the curvature will be −1, and e(aq) is the Euclidean distance from a to q. The factor of one-half is very important since it is then distinguished from the Poincaré half-plane model in which the one-half is not present [cf. Chapter 10]. For Poincaré’s model will turn out to be conformal, whereas Klein’s is not. In the same paper Klein [71] also coined the term ‘hyperbolic geometry’ for the non-Euclidean geometry of Lobachevsky and Bolyai.

Fig. 2.23. Cayley’s calculation of distance in the projective disc model.

Another model of the hyperbolic plane is to stereographically project from the south pole, S, of the lower hyperboloid sheet, H−, onto the upper hyperboloid sheet, H+. The stereographic projection of these rays located on the unit disc in the horizontal plane $x_0 = 0$, shown in Fig. 2.24, has come to be known as the Poincaré disc model.

There are many proofs that non-Euclidean geometries are consistent. The earliest one was given by Beltrami who represented non-Euclidean geometries on Euclidean surfaces of constant curvature. Beltrami’s favorite was the hemisphere, and he used it to go from his flat model to the Poincaré model of the hyperbolic plane in two maps. In view of the facts that he did this in 1868, and that Poincaré did not get around to doing this till 1882, this model should also be attributed to Beltrami. This is a classic example of Stigler’s law of eponymy, which states that no scientific discovery is named after its rightful discoverer.

The maps from the Beltrami flat plane model to the Poincaré model consist of the following. Beltrami’s model is located in the disc B of radius r in (a) of Fig. 2.25. A sphere of the same radius is placed over the disc with its south pole at the center of B in (b). Using vertical parallel (and not stereographic) projection, B will be projected onto the southern hemisphere with the disc radius coinciding with the equator, as shown in (c) of Fig. 2.25. From the north pole, the southern hemisphere is, this time, stereographically projected back onto the plane, which covers a circular region P of radius r′ in (d). This is the Poincaré disc.

Fig. 2.24. The Poincaré disc model as a stereographic projection from the south pole S of the bottom sheet.

The straight lines of the Beltrami model undergo a transformation under these maps. Under the first map, a chord of B rises to the sphere to become an arc of a circle which intersects the equator at right angles in (e) of Fig. 2.25. Under the second map this arc is mapped back into a circular arc that cuts the boundary of P normally in (f). As a consequence of these two maps, one a vertical parallel projection and the other a stereographic projection, the hyperbolic straight lines are transformed into circular arcs that cut P at right angles. Although we lose the characterization of geodesics as straight lines, we still preserve the angles: Euclidean and hyperbolic angles are equal so the model is conformal. The two mappings are summarized in Fig. 2.26.
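The two maps can be composed explicitly. The sketch below follows the construction as described, with the sphere of radius r resting on the plane so that its center is at height r and its north pole at height 2r; under that assumption the image disc P has radius r′ = 2r. The function name `klein_to_poincare` is ours.

```python
import math

def klein_to_poincare(x, y, r=1.0):
    """Send a point of Beltrami's disc B (radius r) to the Poincare disc P."""
    rho2 = x * x + y * y
    # First map: vertical projection onto the southern hemisphere of the
    # sphere centered at (0, 0, r); the south pole touches the plane at the origin.
    z = r - math.sqrt(r * r - rho2)
    # Second map: stereographic projection from the north pole (0, 0, 2r)
    # back onto the plane z = 0.
    scale = 2.0 * r / (2.0 * r - z)
    return scale * x, scale * y

# The rim of B goes to the rim of P
rim_x, rim_y = klein_to_poincare(1.0, 0.0)

# A point at hyperbolic distance s from the center in Beltrami's model
s = 1.3
px, py = klein_to_poincare(math.tanh(s), 0.0)
# Its distance from the center in P, where the one-half of Cayley is absent
s_poincare = 2.0 * math.atanh(px / rim_x)
```

The hyperbolic distance from the center is the same in both discs, artanh(ρ/r) in B and 2 artanh(ρ′/r′) in P, which illustrates that the double mapping carries one model isometrically into the other.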


Fig. 2.25. Beltrami’s double mapping of Klein and his hyperbolic disc model onto the Poincaré disc model.

Fig. 2.26. The combined vertical orthogonal projection upwards and the stereographic projection downwards.


So who exactly did what, and who should get the credit: is it Beltrami or Klein? Beltrami’s approach was to show that hyperbolic geometry is consistent if Euclidean geometry is. Klein, on the other hand, worked from projective geometry and showed that all three geometries, elliptic, hyperbolic, and Euclidean, are consistent if projective geometry is consistent. And projective geometry comes before Euclidean geometry. To Beltrami the hyperbolic plane lies in a portion of the Euclidean plane, the unit disc. Outside the disc there is nothing. But to Klein, who believed in points at infinity, the plane is the projective, and not the Euclidean, plane. So Klein was able to go beyond the boundary of the disc and showed how distinct forms of rotation could take place within the disc, on the rim of the disc, and outside of the disc. Today, these differences seem marginal since the foundations of all geometries are based on real number theory, where points are n-tuples and planes are the equations they satisfy [Stillwell 91].

In 1882 Poincaré studied the Liouville–Beltrami upper half-plane model in regard to fractional linear transformations of a complex variable. Poincaré showed that the cross-ratio of two points, z and z + dz, infinitesimally separated on the same geodesic semi-circle with endpoints lying on the x-axis, could be written as 1 + |dz|/y when infinitesimals of higher than linear order are ignored. The distance is the logarithm of the cross-ratio, which in this case is

$$\ln\left(1 + |dz|/y\right) = |dz|/y, \tag{2.5.8}$$

to linear order in the infinitesimal. This connects the geometry of the half-plane, or our heated plane, with the Poincaré metric |dz|/y to the cross-ratio whose logarithm is the distance between any two points in the interior of the unit disc.

By mapping the upper half-plane onto the unit disc, Poincaré again obtained the representation of the hyperbolic plane in the interior of a circle. The price is that hyperbolic straight lines, or geodesics, now appear as arcs of circles that cut the unit disc orthogonally. This may appear as a blemish, but the model is conformal: Euclidean and hyperbolic measures of an angle are the same. Let A and B be points inside the unit disc and let P and Q be the points where the line through them intersects the boundary of the disc, as shown in Fig. 2.27.


Fig. 2.27. Geodesics consist of arcs that cut the boundary of the disc orthogonally.

Then, Poincaré defined the cross-ratio in the usual way,

$$\{A, B|P, Q\} = \frac{e(AP)}{e(AQ)}\cdot\frac{e(BQ)}{e(BP)},$$

where e(AP) is the Euclidean length of AP. But, unlike Cayley, Poincaré defined his length d(AB) as

$$d(AB) = \left|\ln\{A, B|P, Q\}\right|,$$

which is precisely twice that of Cayley’s.
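The two definitions of length can be compared directly in the simplest situation, where A and B lie on a diameter of the unit disc so that the geodesic is the same Euclidean segment in both models and the boundary points are P = −1 and Q = +1. This restriction, and the function names, are assumptions of the sketch.

```python
import math

def cross_ratio(a, b, p=-1.0, q=1.0):
    """{A, B | P, Q} for collinear points given by their coordinates on the diameter."""
    return (abs(a - p) / abs(a - q)) * (abs(b - q) / abs(b - p))

def cayley_distance(a, b):
    """Eq. (2.5.7): one half of the logarithm of the cross-ratio."""
    return 0.5 * abs(math.log(cross_ratio(a, b)))

def poincare_distance(a, b):
    """Poincare's d(AB) = |ln{A, B | P, Q}|."""
    return abs(math.log(cross_ratio(a, b)))

d_cayley = cayley_distance(0.0, 0.5)
d_poincare = poincare_distance(0.0, 0.5)
# Distances add along a geodesic because cross-ratios multiply
d_sum = cayley_distance(0.1, 0.3) + cayley_distance(0.3, 0.6)
d_direct = cayley_distance(0.1, 0.6)
```

With Cayley's one-half the distance from the center to the point 0.5 is artanh(0.5), the rapidity belonging to half the speed of light, while Poincaré's length for the same pair of points is twice as large.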

Poincaré also studied motions and considered the group of all fractional linear transformations of the form

$$z \mapsto \frac{az + b}{cz + d}, \tag{2.5.9}$$

which we recognize as Möbius transformations. Poincaré worked with real coefficients, a, b, c, d and the determinant, Δ > 1. Poincaré found in (2.5.9) a new type of periodic function that is invariant under this substitution. Up until this time the only periodic functions that were known were the trigonometric and elliptic functions. The double periodicity of the elliptic functions could be characterized by a tessellation of the Euclidean plane in which the elliptic functions take on the same value at the vertices of parallelograms.

As we have seen in Sec. 1.2.2, tessellations consist of curvilinear triangles inside the disc in the hyperbolic plane, as depicted in Fig. 1.1. As mentioned there, the tessellation was originally constructed by H. A. Schwarz in 1872 to explain the periodicities in the solutions of Gauss’s differential equation. In the passage quoted there, Poincaré had the wonderful idea that these new fractional transformations, (2.5.9), could be used to define a new distance for which the tessellations all become of equal size. His fractional transformation maps the upper half-plane model onto itself if Δ > 0, which excludes the lower half-plane. We know from (2.1.7) that (2.5.9) can be decomposed into a magnification, translation, and inversion. Each of these preserves the metric (2.5.8), and so does (2.5.9).

Poincaré’s measure of distance was twice as great as Cayley’s (2.5.7), and as we have mentioned, it preserves angles and so is conformal. Two Poincaré lines are parallel if and only if they have no point in common. An example is two arcs of circles that cut the unit circle and do not intersect. This permits all the axioms of hyperbolic geometry to be translated into Euclidean-geometrical statements. Thus, the Poincaré model gives still further proof that if Euclidean geometry is consistent so too is hyperbolic geometry.

In the following year, Poincaré studied the motions consisting of all fractional linear transformations of the plane of points at infinity by allowing the coefficients in (2.5.9) to become complex, $z \mapsto e^{i\theta}z$. Poincaré also introduced the half-space and hemisphere models, which we have discussed in the example of the heated plane. They can be considered as ‘first cousins’ of the disc model because they can be derived from one another by inversions [Thurston 97].

They can also be derived from one another by the linear fractional transformation,

$$z \mapsto \frac{1 - zi}{z - i}.$$

To see this [Stahl 08], consider a point, z = x + iy, in the interior of the unit disc, and another point w = u + iv in the upper half-plane. Then

$$w = \frac{1 - zi}{z - i},$$

is translated into

$$u + iv = \frac{1 - (x + iy)i}{x + iy - i}.$$

Separating real and imaginary terms, differentiating, and using the Cauchy–Riemann relations, we find that if there is a curve γ in the unit disc model then there will be another curve, Γ, in the upper half-plane such that

$$\int_\Gamma \frac{\sqrt{du^2 + dv^2}}{v} = \int_\gamma \frac{2\sqrt{dx^2 + dy^2}}{1 - x^2 - y^2}.$$

This relates the heated plane metric to the unit disc metric, which was first written down by Riemann.
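The equality of the two line elements can be tested point by point. The sketch below estimates dw/dz by a central difference (a numerical shortcut of ours, in place of the Cauchy–Riemann computation) and compares |dw|/v with 2|dz|/(1 − x² − y²) at a sample point of the disc.

```python
def disc_to_half_plane(z):
    """The linear fractional map w = (1 - i z)/(z - i)."""
    return (1 - 1j * z) / (z - 1j)

def half_plane_density(z, h=1e-6):
    """|dw|/v per unit |dz|, with dw/dz estimated by a central difference."""
    dw_dz = (disc_to_half_plane(z + h) - disc_to_half_plane(z - h)) / (2 * h)
    return abs(dw_dz) / disc_to_half_plane(z).imag

def disc_density(z):
    """2/(1 - x^2 - y^2), the unit-disc line element per unit |dz|."""
    return 2.0 / (1.0 - abs(z) ** 2)

z = 0.3 + 0.4j
w = disc_to_half_plane(z)
lhs = half_plane_density(z)
rhs = disc_density(z)
```

Since the two densities agree at every interior point, the integrals along corresponding curves agree as well; the center of the disc, z = 0, is sent to w = i.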

The geometric interpretation of all fractional linear transformations of the form (2.5.9) as length-preserving maps, or isometries of the hyperbolic plane, is just not possible because of the infinite number of them. Poincaré’s idea was to decompose the linear fractional transformations into individual motions and to express them as products of inversions. But what he failed to notice was the one with a = d and b = c, such that $b/a = \tanh\psi = z_1$ is an inner point of the unit circle. For then

$$z \mapsto \frac{z + z_1}{1 + zz_1}, \tag{2.5.10}$$

which maps the unit circle onto itself sending $-z_1$ into 0. And for $z = z_1$, (2.5.10) becomes the isomorphism that takes us from the Poincaré to the Klein, or projective, model. With the relative velocity given by b/a = tanh ψ, (2.5.9) preserves the unit circle $z\bar{z} = 1$, which expresses Lorentz invariance, and permits the identification of (2.5.10) as the Poincaré composition law for relativistic velocities, which is usually attributed to Einstein. Einstein would then not have needed to introduce his second postulate for it would be already incorporated in the statement that all transformations of (2.5.9), with the above definitions of the coefficients, leave the real unit circle invariant, and all interior points have velocities less than the speed of light.
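The three properties claimed for (2.5.10) lend themselves to a quick numerical check: it adds rapidities, it leaves the unit circle invariant, and for z = z1 it reduces to 2z/(1 + z²). The sketch assumes velocities measured in units of c and a real z1; the function names are ours.

```python
import cmath
import math

def compose(z, z1):
    """Eq. (2.5.10): z -> (z + z1)/(1 + z z1), velocities in units of c."""
    return (z + z1) / (1 + z * z1)

# Composition of velocities tanh(psi) amounts to addition of rapidities psi
psi1, psi2 = 0.8, 1.1
composed = compose(math.tanh(psi1), math.tanh(psi2))

# A point of the unit circle stays on the unit circle (z1 real)
on_circle = compose(cmath.exp(0.7j), 0.5)

def poincare_to_klein(z):
    """The special case z = z1 of (2.5.10), i.e. 2z/(1 + z^2)."""
    return compose(z, z)

klein = poincare_to_klein(math.tanh(0.5 * psi1))
```

The last line doubles the rapidity, tanh(ψ/2) going into tanh ψ, which is the factor of two that separates Poincaré's measure of distance from Cayley's.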

Poincaré would have been able to study them by studying tessellations, which would be curvilinear triangles that make up right-angled polyhedra in relativistic velocity space, referred to as honeycombs [Coxeter 99]. The tessellations do not have the same size but are mapped onto themselves by the Lorentz transform. And since Möbius transforms preserve the cross-ratio, they can be used, according to Poincaré’s wonderful idea, to define a new concept of length under which the cells of the tessellation are all equal. The geometry to which it gives rise is hyperbolic geometry, and it is this geometry that is used by Poincarites. A fortiori, if some of the Poincarites lived in the half-plane and others in the unit disc, and they could communicate between themselves, there would be nothing that they could do or measure to tell their worlds apart. Only to us Euclideans would differences in their two worlds become noticeable. This is the origin of all relativistic effects. In this the cross-ratio has a fundamental role. And since the cross-ratio is none other than a product of longitudinal Doppler shifts, we must look to the Doppler shift for the origin of all relativistic phenomena.

The story of hyperbolic geometry does not end here, but for us it does. New advances involve the definition of a global manifold, for which no satisfactory definition existed throughout the nineteenth century. A chronological account of all later developments can be found in Milnor [82].

References

[Archbold 70] J. W. Archbold, Algebra, 4th ed. (Pitman, London, 1970), p. 91.

[Barankin 42] E. W. Barankin, “Heat flow and non-Euclidean geometry,” Amer. Math. Monthly (1942) 4–14.

[Born & Wolf 59] M. Born and E. Wolf, The Principles of Optics (Macmillan, New York, 1959), pp. 146–148.

[Buseman & Kelly 53] H. Busemann and P. J. Kelly, Projective Geometry (Academic Press, New York, 1953), pp. 157–158.

[Coxeter 99] H. S. M. Coxeter, “Regular honeycombs in hyperbolic space,” in The Beauty of Geometry: Twelve Essays (Dover, New York, 1999), pp. 199–214; see also, C. Criado and N. Alamo, “Relativistic kinematic honeycombs,” Found. Phys. Lett. 15 (2002) 345–358.

[Einstein 22] A. Einstein, “Geometry and experience,” in Sidelights on Relativity, transl. by G. B. Jeffrey and W. Perrett (E. P. Dutton, New York, 1922), p. 34.

[Klein 71] F. Klein, “On the so-called non-Euclidean geometry,” Math. Ann. 4 (1871) 573–625; translated in J. Stillwell, Sources of Hyperbolic Geometry (Am. Math. Soc., Providence RI, 1991), pp. 69–110, especially Sec. 8.

[Kulczycki 61] S. Kulczycki, Non-Euclidean Geometry (Pergamon, Oxford, 1961), p. 138.

[Kullback 59] S. Kullback, Information Theory and Statistics (Wiley, New York, 1959), p. 6.

[Lavenda 09] B. H. Lavenda, A New Perspective on Thermodynamics (Springer, New York, 2009), Sec. 6.11.

[Milnor 82] J. Milnor, “Hyperbolic geometry: The first 150 years,” Bull. Amer. Math. Soc. 6 (1982) 9–24.

[Needham 97] T. Needham, Visual Complex Analysis (Clarendon Press, Oxford, 1997).

[O’Neill 66] B. O’Neill, Elementary Differential Geometry (Academic Press, New York, 1966), p. 314.

[Poincaré 68] H. Poincaré, La Science et l’Hypothèse (Flammarion, Paris, 1968).

[Schwerdtfeger 62] H. Schwerdtfeger, Geometry of Complex Numbers (U. Toronto Press, Toronto, 1962), p. 121.

[Stahl 08] S. Stahl, A Gateway to Modern Geometry: The Poincaré Half-Plane, 2nd ed. (Jones and Bartlett, Sudbury MA, 2008), pp. 193–194.

[Stillwell 91] J. Stillwell, Sources of Hyperbolic Geometry (Am. Math. Soc., Providence RI, 1991), pp. 63–68.

[Thurston 97] W. P. Thurston, Three-Dimensional Geometry and Topology (Princeton U. P., Princeton NJ, 1997), p. 53.


Chapter 3

A Brief History of Light, Electromagnetism and Gravity

Much ado about null results.

3.1 The Drag Coefficient: A Clash Between Absolute and Relative Velocities

Most, if not all, books on the special theory use the famous drag coefficient of Fizeau as an example of the new kinematics that special relativity preaches. That is, the drag coefficient comes out naturally from the velocity addition law, (2.5.10). According to the special theory, the parallelogram rule for the addition of velocities is only approximately true. For two bodies moving at speeds u and u′ in opposite directions, the velocity that an observer would register traveling along with the second body is not u − u′, but

$$ w = \frac{u - u'}{1 - uu'/c^2}. $$

Now let us apply this modification of the parallelogram law to Fizeau’s experiment.

Larmor [00] tells us that the phenomenon to be accounted for is the observation that the motion of the Earth does not affect reflection and refraction of light. With the corpuscular theory then still in vogue, Arago reasoned that insofar as the velocity of light is different in air than it is in glass, the aberration of its path due to the motion of the Earth would also be different in the two media, depending upon the direction of the Earth’s motion. Arago did not find any effect at all, and asked Fresnel whether he could analyze this null result from the point of view of wave theory.


Fresnel replied that the lack of an effect could be explained by assuming that the surrounding aether is carried along with the motion of the Earth. But, if this is true, it goes against the grain of stellar aberrational measurements which could only be explained if the aether was immobile, or stagnant. According to this hypothesis, the velocity of light in vacuo would retain its normal value, but it would change in the body arising from its motion through the aether so as to make refraction and reflection the same as for the matter at rest. Arago had not considered such a possibility.

What Fresnel was able to show was that there would be no change if the velocities in vacuo and the prism were c and c′, respectively, when at rest, while when in motion they were c − u and $c' - u/\eta^2$, where η is the ordinary index of refraction, η = c/c′. That is, the absolute velocity of light in the moving prism would be

$$ c = c' + u(1 - \eta^{-2}). \tag{3.1.1} $$

Fresnel believed that aether permeates moving matter and is partially convected by matter, being absorbed at the front surface and readmitted at the rear surface. In this way the paths of the light rays in moving bodies would be unaltered, and at the same time the known facts of aberration would not be contradicted.

The experimental confirmation of Fresnel’s formula had to wait thirty-three years, until Fizeau performed his famous experiment. His arrangement, shown in Fig. 3.1, is essentially an optical interferometer. Rays from a light source are split into two by a half-silvered mirror. The two beams are made to travel around the same closed circuit, by reflection from mirrors, but in opposite directions. When they arrive back at the half-silvered mirror one beam carries on while the other beam is reflected, and both are brought together in a telescope. Any given fringe represents an optical path difference between the interfering beams. If δ is the distance that a ray traverses in a medium of refractive index η, the optical path is simply η · δ.

Since water is flowing through the circuit, the rays going against the direction of the water should be retarded, and this produces a dragging effect. The optical path difference concerns only what goes on in the tubes, so that if each tube is of length δ and the water is flowing at speed u, it is

$$ \Delta t = \frac{2\delta}{c' - \gamma u} - \frac{2\delta}{c' + \gamma u}, \tag{3.1.2} $$


Fig. 3.1. Fizeau’s aether-drag apparatus with mirrors placed on corners to reflect light.

where γ is the so-called drag coefficient, and recall that c′ = c/η. This gives a first-order effect of amount

$$ \Delta t \approx \frac{4\gamma\delta}{c'} \cdot \frac{u}{c'}. $$

The optical path length difference is cΔt, and expressing this as the number of interference lines f times the wavelength, λ, of the monochromatic light used, gives the expression

$$ f = \frac{4\gamma\delta}{\lambda} \cdot \frac{\eta u}{c'}. $$

Typical values of Fizeau parameters are: δ = 1.5 m, u = 7 m/sec, $\lambda = 5.3 \times 10^{-7}$ m, η = 1.33, and f = 0.23 of a fringe. This gives the observational value of the drag coefficient as γ = 0.48, while its calculated value from the index of refraction of water alone is $\gamma = 1 - 1/\eta^2 = 0.43$. Thus, to within an error of approximately 10%, Fizeau confirmed Fresnel’s formula for the drag coefficient, but the experiment did not explain it. All it could do was to reinforce the belief that the motion of the aether has no effect on the properties of moving objects, just as it has none on stellar aberration.
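As a rough numerical check of the figures just quoted, the following sketch inverts the first-order relation between fringe shift and drag coefficient. It assumes the expansion of (3.1.2) to first order in u, Δt ≈ 4γδu/c′², together with f = cΔt/λ; the function names and the value used for c are illustrative choices, not taken from the text.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s (assumed value)


def fresnel_coefficient(eta):
    """Fresnel's predicted drag coefficient, 1 - 1/eta**2."""
    return 1.0 - 1.0 / eta**2


def time_difference(gamma, delta, u, eta):
    """Exact time difference (3.1.2) for tubes of length delta."""
    c_prime = C / eta
    return 2 * delta / (c_prime - gamma * u) - 2 * delta / (c_prime + gamma * u)


def drag_from_fringes(f, delta, u, lam, eta):
    """Invert f = C*dt/lam using the first-order dt = 4*gamma*delta*u/c'**2."""
    c_prime = C / eta
    return f * lam * c_prime**2 / (4 * delta * u * C)


# Parameters quoted in the text for Fizeau's experiment.
gamma_obs = drag_from_fringes(f=0.23, delta=1.5, u=7.0, lam=5.3e-7, eta=1.33)
gamma_fresnel = fresnel_coefficient(1.33)
# Fringe shift recovered from the exact expression (3.1.2).
f_back = C * time_difference(gamma_obs, 1.5, 7.0, 1.33) / 5.3e-7

print(f"observed gamma ~ {gamma_obs:.2f}, Fresnel gamma ~ {gamma_fresnel:.2f}")
```

Under these assumptions the observed coefficient comes out near 0.49, close to the 0.48 quoted above, against roughly 0.43 from Fresnel's formula.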

However, it is claimed that the Fizeau experiment can be explained by the kinematics of special relativity. What special relativity says is that the velocity of light in the direction of the flow of water with velocity u is not c′ + u, but, rather

$$ c = \frac{c' + u}{1 + c'u/c^2} \approx c' + u\left(1 - \frac{c'^2}{c^2}\right). $$

Since η = c/c′, the velocity composition law, “without any extra assumptions,” gives to lowest-order the Fresnel drag expression, which “other aether theorists had to explain in terms of a partial dragging of light by the medium” [French 68]. However, whereas c′ + u was an absolute velocity in the determination of the fringes, it has now, miraculously, become the sum of two velocities, to which the relativistic law of addition applies! Moreover, the relativistic velocity composition law applies only to relative velocities, and not to absolute and relative velocities.
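The lowest-order statement can be checked numerically. The sketch below composes c′ = c/η with the water speed u and compares the result with Fresnel's expression (3.1.1); the names `compose` and `fresnel_speed` are illustrative, and the sample values are the Fizeau parameters quoted earlier.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s (assumed value)


def compose(v1, v2):
    """Relativistic composition of two collinear velocities."""
    return (v1 + v2) / (1.0 + v1 * v2 / C**2)


def fresnel_speed(eta, u):
    """Fresnel's expression (3.1.1): c' + u*(1 - 1/eta**2)."""
    return C / eta + u * (1.0 - 1.0 / eta**2)


eta, u = 1.33, 7.0
exact = compose(C / eta, u)
approx = fresnel_speed(eta, u)

# Effective drag coefficient implied by the composition law.
effective_drag = (exact - C / eta) / u
print(effective_drag, 1.0 - 1.0 / eta**2)
```

For laboratory speeds the two expressions agree to well within any measurable precision, since the neglected terms are of relative order c′u/c².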

But, if it is true that we are talking about the sum of two velocities, c′ and u, we should take this into account in determining the time difference, (3.1.2). If this is done we get

$$ \Delta t = \frac{4\delta\gamma u(1 - 1/\eta^2)}{c'^2 - (\gamma u)^2}, \tag{3.1.3} $$

and this decreases both the fringes and the drag coefficient, i.e. f = 0.1 fringe and γ = 0.21. The latter is nowhere near the calculated value of 0.43.

Thus, there is a fundamental incapability of using special relativity to deal with the classical experiments that led to the demise of the aether. The only observations which do not give a null effect are those in which absolute velocities are involved, as in the Fizeau experiment.

3.2 Michelson–Morley Null Result: Is Contraction Real?

The experimental set-up shown in Fig. 3.2 appeared in the ‘classic’ 1887 paper of Michelson and Morley. Light is emitted from a source and is reflected by a mirror at distance $\ell_1$. The time it takes for the forward and backward journeys can be found in any text,

$$ t_1 = \frac{\ell_1}{c + u} + \frac{\ell_1}{c - u} = \frac{2\ell_1/c}{1 - u^2/c^2}, \tag{3.2.1} $$


Fig. 3.2. Monochromatic, yellow light is split by a mirror into two beams. These beams cover equal distances to mirrors b and c where they are reflected back to a, and then combined to produce interference fringes. The path lengths ab = ac = 11 m. A rotation of the apparatus by 90 degrees gave none of the displacement of the interference fringes that was expected on the grounds that the beam traveling along ab is parallel to the Earth’s motion through the aether while the ray along ac is normal to it.

where u is the speed at which Michelson’s apparatus is moving with respect to the inertial frame defined by the aether. To calculate the time of transit of the light ray perpendicular to the ‘aether wind,’ the Galilean composition of velocities is used. If c is the hypotenuse with base u, the reduced velocity between the interferometer and the second mirror is

$$ \sqrt{c^2 - u^2}. $$

The time of transit of the outward and backward journeys from the second mirror is

$$ t_2 = \frac{2\ell_2}{\sqrt{c^2 - u^2}}. \tag{3.2.2} $$

From (3.2.1) and (3.2.2) Michelson determined the time difference. This time difference was compared to that which results when the whole apparatus is rotated by 90°, which interchanges $\ell_1$ with $\ell_2$. The difference between the time differences would produce a shift in the interference pattern, which is calculated in terms of the number of fringes, just as in the Fizeau experiment. Although the expected effect was predicted to be small, Michelson failed to observe anything. Michelson was led to conclude that

The interpretation of these results is that there is no displacement of the interference bands. The result of the hypothesis of a stationary aether is shown to be incorrect.

The difference in times of the two journeys is

$$ \Delta t = \frac{2}{c(1 - u^2/c^2)}\left\{\ell_1 - \ell_2\sqrt{1 - u^2/c^2}\right\}. \tag{3.2.3} $$

Because no differences were found, FitzGerald and Lorentz independently postulated that there must be a contraction of the arm in the direction of motion by the amount $\sqrt{1 - u^2/c^2}$ so that with equal arm-lengths, (3.2.3) would vanish. Oliver Lodge claimed that this ‘contraction’ hypothesis [cf. (3.2.5) below] was proposed by FitzGerald in a conversation he had with him in order to explain the null result. It was also put forward by Lorentz a short time afterwards, and has come to be known as the FitzGerald–Lorentz contraction, or more succinctly as the Lorentz contraction. However, O’Rahilly [38] warns us: “It is to be gravely doubted that the FitzGerald–Lorentz contraction really does explain Michelson’s null-result.”
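To get a feeling for the size of the effect, the sketch below evaluates (3.2.1) and (3.2.2) for equal 11 m arms and then with the parallel arm shortened by the contraction factor. The speed u = 30 km/s (roughly the Earth's orbital speed) is an assumption introduced here, as are the function names.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s (assumed value)


def t_parallel(l1, u):
    """Round-trip time along the aether wind, Eq. (3.2.1)."""
    return (2 * l1 / C) / (1 - u**2 / C**2)


def t_transverse(l2, u):
    """Round-trip time across the aether wind, Eq. (3.2.2)."""
    return 2 * l2 / math.sqrt(C**2 - u**2)


def time_difference(l1, l2, u):
    """Difference of the two round-trip times, Eq. (3.2.3)."""
    return t_parallel(l1, u) - t_transverse(l2, u)


u = 3.0e4    # assumed speed through the aether, m/s
arm = 11.0   # arm length from the caption of Fig. 3.2, m

dt_equal = time_difference(arm, arm, u)
dt_contracted = time_difference(arm * math.sqrt(1 - u**2 / C**2), arm, u)

print(dt_equal, dt_contracted)
```

With equal arms the difference is of order ℓu²/c³, a few times 10⁻¹⁶ s; with the contracted arm it vanishes to within rounding error, which is all that the contraction hypothesis was designed to achieve.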

If the arms’ lengths are equal, $\ell_1 = \ell_2$, dividing (3.2.1) by (3.2.2) results in

$$ t_1 = \frac{t_2}{\sqrt{1 - u^2/c^2}}. $$

This can be interpreted as a time dilatation, where clocks in motion supposedly go slower than clocks at rest. According to Larmor, “the change of the time variable, in comparison of radiations in the fixed and moving systems, involves the Doppler effect on the wavelength.”

The letter that Maxwell sent to D. P. Todd to thank him for the astronomical tables was read by a young scientist by the name of Michelson, who in the previous year had already performed a measurement of the speed of light. Although we will discuss this in greater detail in Sec. 4.1.3, we mention it here because of the effect it had upon Michelson. Michelson did not rule out the possibility of detecting the motion of the aether as a second-order effect by the amount predicted by Maxwell [cf. Eq. (4.1.12)]. His first attempt was in 1881, and a more precise experiment carried out with Morley, described above, was done in 1887. According to French [66] “this refined version of the experiment. . .has long been regarded as one of the main experimental pillars of special relativity.”

However, Maxwell was using absolute velocities. If you use relative velocities, you will find $t_1 = 2\ell/c$. There is no effect at all. Since the time difference, $t_1 - t_2$, must be positive or vanish, we must also find $t_2 = 2\ell/c$ so that the aether is stagnant, u = 0. One could also query the validity of the use of the Galilean law of the composition of velocities. Light emitted transversely to the direction of the motion can be obtained by an average of measurements on radiation emitted forward and backward with respect to the direction of the motion, as Ives and Stilwell so well realized [cf. Sec. 3.4],

$$ \frac{1}{2}\left\{\left(\frac{c + u}{c - u}\right)^{1/2} + \left(\frac{c - u}{c + u}\right)^{1/2}\right\} = \frac{1}{\sqrt{1 - u^2/c^2}}. $$

So, there is nothing wrong with the use of the Galilean composition law of velocities. If there is no change in $t_1$ there can be no change in $t_2$ since $t_2$ cannot be greater than $t_1$. It can only be equal to it when the aether is stagnant. Pillar or not, it is no wonder that some of the best experimentalists of their day, like Herbert Ives and Louis Essen, were staunch anti-relativists.
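The averaging identity used here follows from putting the two square roots over a common denominator. The intermediate step, spelled out as a sketch for the reader, is:

```latex
\frac{1}{2}\left\{\left(\frac{c+u}{c-u}\right)^{1/2}
                 +\left(\frac{c-u}{c+u}\right)^{1/2}\right\}
  = \frac{1}{2}\,\frac{(c+u)+(c-u)}{\sqrt{(c+u)(c-u)}}
  = \frac{c}{\sqrt{c^{2}-u^{2}}}
  = \frac{1}{\sqrt{1-u^{2}/c^{2}}} .
```

The first-order terms in u/c cancel in the sum, which is why only the second-order factor survives the averaging.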

Therefore, Michelson, like Fresnel, Fizeau, and Airy before him, could only conclude that the aether remains completely undisturbed by the Earth’s motion. Moreover,

the FitzGerald–Lorentz contraction, and time dilatation arise from considering the velocities to be absolute, and not relative. The null result is in perfect conformity with the relativistic addition law for velocities, and the existence of a stationary aether.

On the basis of Maxwell’s (and Michelson’s) analysis u is absolute, so that if the lengths are the same then there is time dilatation,

$$ t_1 = \frac{t_2}{\sqrt{1 - u^2/c^2}}, \tag{3.2.4} $$

while if the epochs are the same, then there is length contraction,

$$ \ell_1 = \ell_2\sqrt{1 - u^2/c^2}. \tag{3.2.5} $$

Alternatively, if u is relative there is no effect at all since an absolute speed c cannot be combined with a relative speed u through the relativistic law of the composition of speeds. Here, we find complete agreement with O’Rahilly [38] who contends

the electrodynamic contraction is based on the assumption of an Earth-convected aether and involves an ordinary measurable u relative to the laboratory. If this assumption is correct, the FitzGerald–Lorentz contradiction disappears; and the null result of the Michelson–Morley experiment becomes self-evident.

When dealing with absolute velocities, there is no way that we can avoid having a speed greater than c on one leg if the speed is less than c on the other leg. To see this, do not specify what the speeds are for the outward and return journeys; then

$$ t = \frac{\ell}{c'} + \frac{\ell}{c''}, $$

where the two unknown speeds are c′ and c″. But, if light is to travel at the same speed no matter which way we are going then

$$ t = \frac{2\ell}{c}. $$

Comparing the two expressions we arrive at the conclusion that

$$ \frac{c}{\sqrt{c'c''}} = \frac{\sqrt{c'c''}}{\tfrac{1}{2}(c' + c'')} \le 1, $$

on the basis of the arithmetic-geometric mean inequality. The equality sign holds if and only if the two speeds coincide with the speed of light. Otherwise, one of the two speeds will necessarily be greater than the speed of light.
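The inequality can be made concrete with a small numerical sketch. Given an outward speed below c, the function below solves ℓ/c′ + ℓ/c″ = 2ℓ/c for the return speed; the function name and the sample value 0.9c are illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s (assumed value)


def return_speed(c_out):
    """Return-leg speed c'' such that l/c_out + l/c'' = 2l/C."""
    if c_out <= C / 2:
        raise ValueError("no finite return speed exists for c_out <= C/2")
    return C * c_out / (2 * c_out - C)


c_out = 0.9 * C
c_back = return_speed(c_out)

geometric = math.sqrt(c_out * c_back)
arithmetic = 0.5 * (c_out + c_back)
ratio = geometric / arithmetic  # equals C / geometric, and is <= 1

print(c_back / C, ratio)
```

With the outward speed set to 0.9c the return speed comes out at 1.125c, so the round trip can only average to c if one leg is superluminal, as stated above.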

We should not leave this section with the impression that relative velocities will have no effect in delaying the round trip times. We must only specify that one of the velocities in the relativistic composition law is not the velocity of light. This we can do by immersing the entire apparatus in a medium whose index of refraction η > 1. For then the time to make a round trip parallel and perpendicular to the aether wind will be

$$ t_1 = \frac{\ell}{c}\left(\frac{\eta + u/c}{1 + u\eta/c} + \frac{\eta - u/c}{1 - u\eta/c}\right) = \frac{2\ell\eta}{c}\left(\frac{1 - u^2/c^2}{1 - u^2\eta^2/c^2}\right) \approx \frac{2\ell\eta}{c}\left[1 + \frac{u^2}{c^2}(\eta^2 - 1)\right], $$

$$ t_2 = \frac{2\ell\eta}{c}\,\frac{1}{\sqrt{1 - u^2\eta^2/c^2}} \approx \frac{2\ell\eta}{c}\left(1 + \frac{u^2\eta^2}{2c^2}\right). $$

The magnitude and sign of the time difference will depend upon the magnitude of η.
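The dependence on η can be seen directly by evaluating the two round-trip times. The sketch below works in units of ℓ/c with β = u/c (a shorthand introduced here) and compares the exact difference with the second-order estimate 2ηβ²(η²/2 − 1) obtained by subtracting the two approximations above; the sample values of η and β are assumptions.

```python
import math


def t_parallel(eta, beta):
    """Round trip along the wind, in units of l/c (sum of the two legs)."""
    return (eta + beta) / (1 + eta * beta) + (eta - beta) / (1 - eta * beta)


def t_parallel_closed(eta, beta):
    """Closed form of the same quantity."""
    return 2 * eta * (1 - beta**2) / (1 - (eta * beta) ** 2)


def t_transverse(eta, beta):
    """Round trip across the wind, in units of l/c."""
    return 2 * eta / math.sqrt(1 - (eta * beta) ** 2)


def second_order_difference(eta, beta):
    """t1 - t2 to order beta**2, from the two approximations in the text."""
    return 2 * eta * beta**2 * (eta**2 / 2 - 1)


beta = 1e-4
diff_water = t_parallel(1.33, beta) - t_transverse(1.33, beta)  # eta < sqrt(2)
diff_dense = t_parallel(1.60, beta) - t_transverse(1.60, beta)  # eta > sqrt(2)

print(diff_water, diff_dense)
```

To this order the difference changes sign at η = √2: it is negative for water (η = 1.33) and positive for a denser medium such as the assumed η = 1.6.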

3.3 Radar Signaling versus Continuous Frequencies

The Doppler shift measures the change in frequency due to relative motion. Anyone standing on a platform listening to a train go by will have noticed that the pitch of the whistle of a train is higher when the train is approaching than when it is receding. Only one frequency is recorded; it has nothing to do with the exchange of signals between two different inertial platforms.

For the time measurement of the exchange of signals from two different inertial platforms we need two clocks. Consider one clock in motion and the other stationary; the latter sends out a signal which is picked up by the former in time T, according to the stationary clock. Now, the new feature is that a clock in motion has a different ticking rate than the clock at rest, so to the observer with his clock in motion it would appear that the time for the signal to reach him will be KT, where K is some factor that has to be determined. Every time the signal bounces back it increases by a factor K, so the signal that was sent out in time T will be reflected back in time $K^2T$. The moment at which the light is reflected should be the arithmetic mean of these two times: $\tfrac{1}{2}(T + K^2T)$. During this time interval, the distance that the clock in motion covers is half the difference of these two times multiplied by c: $\tfrac{1}{2}(K^2 - 1)cT$. The average velocity is therefore

$$ \frac{u}{c} = \frac{\tfrac{1}{2}(K^2 - 1)T}{\tfrac{1}{2}(K^2 + 1)T}, $$

from which K can be determined as

$$ K = \left(\frac{c + u}{c - u}\right)^{1/2} = e^{\bar{u}/c}, \quad \text{say}. \tag{3.3.1} $$

The left-hand side is just the longitudinal Doppler effect, but what is the right-hand side? Taking the logarithm of both sides of (3.3.1) gives

$$ \bar{u} = \frac{c}{2}\ln\left(\frac{1 + u/c}{1 - u/c}\right) = c\tanh^{-1}(u/c). \tag{3.3.2} $$

For K ≠ 1, this new feature that a moving clock has its rate changed has catapulted us into a new world, where if u is approaching the speed of light, the new measure of speed, $\bar{u}$ in (3.3.2), approaches infinity. In this new world there is no limit on the speed at which objects can travel! It depends on the exponent of the longitudinal Doppler shift. There is nothing classical about the velocity $\bar{u}$. We will return to a discussion of how radar signaling is performed in Sec. 8.4.
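A short numerical sketch may help fix ideas about (3.3.1) and (3.3.2). It works with β = u/c (a shorthand introduced here), computes the factor K and the logarithmic measure of speed, and illustrates, as an aside not spelled out in this section, that multiplying two K factors reproduces the relativistic composition of the corresponding speeds. All function names are illustrative.

```python
import math


def k_factor(beta):
    """Longitudinal Doppler factor K of Eq. (3.3.1), with beta = u/c."""
    return math.sqrt((1 + beta) / (1 - beta))


def hyperbolic_speed(beta):
    """The measure of speed in Eq. (3.3.2), in units of c: atanh(beta) = ln K."""
    return math.atanh(beta)


def beta_from_k(k):
    """Invert the relation u/c = (K**2 - 1)/(K**2 + 1)."""
    return (k**2 - 1) / (k**2 + 1)


b1, b2 = 0.6, 0.7
k12 = k_factor(b1) * k_factor(b2)   # product of Doppler factors
composed = beta_from_k(k12)         # speed corresponding to the product

print(k_factor(b1), composed, hyperbolic_speed(0.999999))
```

The hyperbolic speeds simply add when the K factors multiply, while the ordinary speed stays below c; as β approaches 1 the hyperbolic measure grows without bound.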

Maxwell realized that radiation causes pressure. It was also known that this pressure was Doppler-shifted, or to use Heaviside’s quaint terminology ‘dopplerized,’ when it hits a moving mirror. These are the pre-relativity days when one could think of the “number of waves occupying c.” If stationary it is just c, while if the source is moving forward at velocity u, the number of waves are “crushed into c − u, or if moving away are lengthened by an amount c + u” [Poynting 10].

If the mirror is moving toward the source at speed u, then the wavelength of the incident beam λ will be proportional to c + u, while the reflected radiation at wavelength λ′ will be lessened to c − u due to the compression of the spring which is likened to the electromagnetic wave. Thus,

$$ \frac{\lambda'}{\lambda} = \frac{c - u}{c + u}, \tag{3.3.3} $$

so that the wavelength becomes dopplerized on reflection due to the motion of the mirror.

3.4 Ives–Stilwell Non-Null Result: Variation of Clock Rate with Motion

According to Ives [51], the spectrum of relativity theory is bounded at “one end by the Michelson–Morley experiment with its null result . . . and at the other by the Ives–Stilwell experiment, which demonstrated a positive result, the variation of the clock rate with motion.”

The classical Doppler shift can be derived by considering the change in frequency of a wave that impinges on a mirror moving at speed u. The incoming wave, which impinges on the mirror at an angle ϑ,

$$ A\cos\omega\left(t - \frac{x\cos\vartheta + y\sin\vartheta}{c} + \varphi\right), $$

of amplitude A and phase ϕ, is reflected by the mirror, thereby producing an outgoing wave of the form

$$ A'\cos\omega'\left(t + \frac{x\cos\vartheta' - y\sin\vartheta'}{c} + \varphi'\right). $$

At the surface of the mirror, x = ut, there must hold a certain relation between the incoming and outgoing waves that is valid for all times. This is possible only if the coefficients of t in the arguments are the same, implying

$$ \omega' = \omega\left(\frac{1 - (u/c)\cos\vartheta}{1 + (u/c)\cos\vartheta'}\right). \tag{3.4.1} $$

This is the ordinary, oblique, Doppler shift.

Relativity changes this by relating the two times and two coordinates in space through the transformation

$$ \begin{aligned} t' &= t\cosh\psi + \frac{x}{c}\sinh\psi, \\ x' &= x\cosh\psi + ct\sinh\psi, \\ y' &= y, \end{aligned} $$

where space and time have been rotated through an ‘imaginary’ angle, $\psi = \bar{u}/c$ (so that $\tanh\psi = u/c$). Although this is commonly referred to as the Lorentz transformation, it was first derived by W. Voigt in 1887. Only as late as 1909 does Lorentz realize that “to my regret [Voigt’s paper of 1887] has escaped my notice all these years.” Lorentz continues, “The idea of the transformations . . . might therefore have been borrowed from Voigt, and the proof that it does not alter the equations for the free aether is contained in that paper” [O’Rahilly 38]. But, the eponym is too engraved in the literature, and we will continue to adhere to it. This is yet another example of Stigler’s law of eponymy.


Now, the incoming wave,

$$ A\cos\omega\left(t - \frac{x\cos\vartheta + y\sin\vartheta}{c} + \varphi\right), $$

will be converted into the outgoing wave,

$$ A'\cos\omega'\left(t' + \frac{x'\cos\vartheta' - y'\sin\vartheta'}{c} + \varphi'\right), $$

but the phase is a physical invariant, and if there is no change in energy, the condition for the agreement of the cosines at the surface of the mirror is:

$$ \omega t\left(1 - \frac{x}{ct}\cos\vartheta\right) = \omega' t'\left(1 + \frac{x'}{ct'}\cos\vartheta'\right). \tag{3.4.2} $$

Let us consider the motion in the unprimed system from the origin of the primed system. Then x′ = 0, which is equivalent to considering the transverse shift, ϑ′ = π/2. From the second equation of the Lorentz transform we find x/t = −c tanh ψ = −u, and inserting this into the first equation gives t′ = t sech ψ. Thus, condition (3.4.2) is equivalent to:

$$ \omega' = \omega\left(\frac{1 + (u/c)\cos\vartheta}{\sqrt{1 - u^2/c^2}}\right). \tag{3.4.3} $$

To test (3.4.3), Einstein in 1907 predicted that it might be possible to observe the transverse shift, ϑ = π/2, by examining the light emitted by canal rays in hydrogen, on which Stark had published a paper the year before. No general confirmation was possible, and it had to wait more than thirty years until Ives and Stilwell performed their famous experiment.

Realizing that it would be almost impossible to get a direct measurement of the transverse Doppler shift, they resorted to an averaging of the forward and backward radiation that ions would emit by accelerating them through a given voltage. Solving (3.4.3) for the corresponding wavelengths, we get the longitudinal Doppler shifts for ϑ = π, 0, and developing these expressions in a series in powers of u/c gives

$$ \lambda'(\pi) = \lambda\left(\frac{1 + u/c}{1 - u/c}\right)^{1/2} = \lambda\left(1 + \frac{u}{c} + \frac{1}{2}\frac{u^2}{c^2} + \cdots\right), \tag{3.4.4} $$

and

$$ \lambda'(0) = \lambda\left(\frac{1 - u/c}{1 + u/c}\right)^{1/2} = \lambda\left(1 - \frac{u}{c} + \frac{1}{2}\frac{u^2}{c^2} - \cdots\right). \tag{3.4.5} $$

Normally, square and higher powers in the relative velocity are so much smaller than the linear term that they can safely be neglected. This will give a first-order Doppler shift of ±(u/c)λ. However, by averaging the forward and backward wavelengths of the emitted radiation, the first-order terms cancel, and to lowest-order there results

$$ \Delta\lambda_2 := \frac{1}{2}\left(\lambda'(\pi) + \lambda'(0)\right) - \lambda \approx \frac{1}{2}\frac{u^2}{c^2}\lambda. \tag{3.4.6} $$

In comparison, the first-order Doppler shift is $\Delta\lambda_1 = u\lambda/c$, so plotting the second-order shift (3.4.6) in terms of the first-order one should result in a parabolic plot. This is exactly what was found by Ives and Stilwell, as shown in Fig. 3.3, in which a hydrogen discharge tube was the source of $\mathrm{H}_2^+$ and $\mathrm{H}_3^+$ ions.

Fig. 3.3. Second-order wavelength shifts plotted as a function of first-order shifts.
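The parabolic relation between the two shifts can be reproduced from (3.4.4)–(3.4.6) with a few lines of code. In the sketch below the rest wavelength (486.1 nm) and the ion speed (0.5% of c) are assumed sample values, and the function names are illustrative.

```python
import math


def doppler_wavelengths(lam, beta):
    """Exact longitudinal wavelengths of (3.4.4) and (3.4.5), with beta = u/c."""
    backward = lam * math.sqrt((1 + beta) / (1 - beta))  # theta = pi
    forward = lam * math.sqrt((1 - beta) / (1 + beta))   # theta = 0
    return backward, forward


def shifts(lam, beta):
    """First-order shift (half the splitting) and second-order shift (3.4.6)."""
    backward, forward = doppler_wavelengths(lam, beta)
    first = 0.5 * (backward - forward)
    second = 0.5 * (backward + forward) - lam
    return first, second


lam = 486.1   # assumed rest wavelength in nm
beta = 0.005  # assumed ion speed in units of c

first, second = shifts(lam, beta)
parabola = first**2 / (2 * lam)  # second-order shift predicted from the first

print(first, second, parabola)
```

Plotting `second` against `first` for a range of β traces out the parabola Δλ₂ ≈ Δλ₁²/(2λ), which is the form of the curve in Fig. 3.3.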


3.5 The Legacy of Nineteenth Century English Physics

The English teach Mechanics as an experimental science. On the continent it is always presented more or less as a deductive science and a priori. The English are right, needless to say. . . On the other hand, if the principles of Mechanics have no other sources than experiments, they are therefore, only approximate and temporary. New experiments may lead us some day to modify or even abandon them [Poincaré 02]

We are at the end of the nineteenth century. It is known that the energy of motion is proportional to the square of the velocity, i.e. the kinetic energy. Also [Poynting 10]

. . . waves contain energy. If we compress them into a shorter length we have to put more energy into them, somewhat as we have to put more energy into the spiral spring when we crush it up.

Since the spring is shortened, its ends will exert a pressure on any surface which it comes into contact with. It was assumed that the energy density was proportional to the inverse square of the wavelength. This assumption is somewhat prophetic since it establishes an inverse dependency of the speed on the wavelength. Introducing the mass, the constant of proportionality had to be an action, and the only action that was around was Planck’s quantum of action. So, de Broglie’s relation could have been derived almost a quarter of a century before he did by people of the likes of Poynting!

3.5.1 Pressure of radiation

With an inverse square dependency of the energy density upon the wavelength, (3.3.3) gives the ratio of the energies reflected and incident on the mirror as

$$ \frac{\varepsilon'}{\varepsilon} = \left(\frac{c + u}{c - u}\right)^2. $$

The net rate of energy flow into the mirror is the difference,

$$ \varepsilon'(c - u) - \varepsilon(c + u) = 2u\varepsilon\left(\frac{c + u}{c - u}\right), $$

and this is the work done against the mirror.


The work is of a compressional nature so it is logical to set it equal to P′u, where P′ is the pressure. When this is done, it is seen that the pressure has been dopplerized by the amount,

$$ P' = P\left(\frac{c + u}{c - u}\right), \tag{3.5.1} $$

where P = 2ε is the pressure if the mirror were at rest. In Sec. 6.6, we will derive (3.5.1) from electromagnetic theory.

Moreover, we can write the Doppler shift, (3.5.1), in the suggestive form

$$ P' = \frac{c^2 - u^2}{c^2 + u^2}(\varepsilon + \varepsilon') = \sqrt{1 - B^2}\,(\varepsilon + \varepsilon'), \tag{3.5.2} $$

in terms of the total energy density, where

$$ B = \frac{2u/c}{1 + u^2/c^2}. \tag{3.5.3} $$

Relation (3.5.2) expresses the pressure as the Lorentz contraction of the total energy density, where the relative velocity, (3.5.3), is in a frame in which the mirror is initially stationary. It is also the relativistic composition law for collinear velocities.
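The chain of relations from the energy ratio to (3.5.2) can be verified numerically. The sketch below uses units in which c = 1, so that u stands for u/c, and takes ε = 1; these normalizations, the sample speed and the function names are assumptions made for illustration.

```python
import math


def reflected_energy(eps, u):
    """Reflected energy density: eps * ((1 + u)/(1 - u))**2, with c = 1."""
    return eps * ((1 + u) / (1 - u)) ** 2


def pressure(eps, u):
    """Dopplerized pressure of Eq. (3.5.1), with P = 2*eps and c = 1."""
    return 2 * eps * (1 + u) / (1 - u)


def composed_speed(u):
    """Eq. (3.5.3): B = 2u/(1 + u**2), the composition of u with itself."""
    return 2 * u / (1 + u**2)


eps, u = 1.0, 0.3
eps_reflected = reflected_energy(eps, u)

# Pressure from the work argument: net energy flow rate divided by u.
p_from_work = (eps_reflected * (1 - u) - eps * (1 + u)) / u

# Pressure from Eq. (3.5.2): contraction factor times total energy density.
B = composed_speed(u)
p_from_total = math.sqrt(1 - B**2) * (eps + eps_reflected)

print(pressure(eps, u), p_from_work, p_from_total)
```

All three routes give the same pressure, and the factor √(1 − B²) equals (1 − u²)/(1 + u²), as the two forms of (3.5.2) require.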

3.5.2 Poynting’s derivation of $E = mc^2$

An even more profound relation can be obtained following Poynting’s reasoning on light pressure and its relation to corpuscular theory. “Let,” says Poynting,

a beam of light, supposed to consist of corpuscles moving with velocity c, be incident perpendicularly on a completely absorbing, that is, a quite black surface. Let m be the mass of the corpuscles in a cubic centimeter. Then the mass coming up to and entering a square centimeter of the surface in one second is that in a column of c centimeters long and 1 sq. cm. cross section. The total mass is therefore mc. As it has velocity c the momentum entering per second is $mc^2$. But this momentum entering per second is the pressure P′ per sq. cm.

Poynting thus equates $P' = mc^2$.


Introducing this into (3.5.2) results in

$$ (\varepsilon + \varepsilon')\sqrt{1 - B^2} = mc^2. \tag{3.5.4} $$

We can therefore think of the total energy as given by

$$ \varepsilon + \varepsilon' = \frac{mc^2}{\sqrt{1 - B^2}} = m'c^2. \tag{3.5.5} $$

Relation (3.5.5) is a clear indication that mass should have a dependency on the speed. This clearly shows that the English physicists, at the turn of the twentieth century, had all the necessary elements to determine the relativistic mass–energy relationship (3.5.5) before their colleagues on the continent.

3.5.3 Larmor’s attempt at the velocity composition law via Fresnel’s drag

Another contender for relativistic glory was Joseph Larmor, who came within a hair’s breadth of the relativistic law of the composition of velocities some time prior to 1900.

Bradley’s work on stellar aberration had been known for almost a century when it occurred to Arago that, inasmuch as the velocity of light is different in glass than in air, the aberration of its path caused by the motion of the Earth would also be different in glass. The optical deviation caused by a glass prism would be different depending on whether the light rays are in the direction of the Earth’s motion or in the opposite direction. Arago found that the laws of reflection and refraction are not affected by the motion of the Earth. Since the velocity of light in a motionless medium had its normal value, a way was sought to decrease its velocity of propagation in a medium which was in motion. If η = c/u is the index of refraction in the medium then a decrease of the velocity u by the factor $1 - \eta^{-2}$ would do the trick. This supposition on the part of Fresnel left the rays relative to the moving medium unaltered, which at the same time did not contradict any known facts about aberration.

In 1871 Sir George Airy, who was a bitter critic of Faraday’s lines of force and Maxwell’s electromagnetic theory, devised an experiment to determine whether the motion of the Earth through the aether could be revealed. A telescope is aimed at a star which is directly overhead [cf. Fig. 5.1]. Denote by α the angle of aberration, and u the speed of the Earth through the aether. Now, fill the telescope full of water which has a refractive index η = 1.33. According to the wave theory of light, light will travel more slowly in water than in air. It will appear that the length of the telescope will be lengthened by a factor of η. In order to keep the star in sight, the telescope will have to be tilted still further, say to an angle β. By measuring this angle we might hope to find the speed u. We must also take into account the refraction that occurs at the objective lens; on one side we have air and on the other side, water.

Using Snell’s law of refraction,

$$\eta = \frac{\sin\beta}{\sin\gamma} \approx \frac{\beta}{\gamma}, \tag{3.5.6}$$

where γ is the angle of refraction in water. We would expect that γ ≈ ηα, where α ≈ u/c, the angle of aberration. Hence, the difference

$$\beta - \alpha \approx (\eta^2 - 1)\,u/c.$$

Measurement of the angles allows u, the speed of the Earth through the aether, to be determined. Again, a null result was obtained: there is no difference between the aberration angles α and β.
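To get a feeling for the size of the effect Airy was looking for, the following is a minimal numerical sketch of the two small-angle formulas above. The orbital speed of the Earth used here (about 29.8 km/s) and the function names are assumptions introduced for illustration; they do not come from the text.

```python
import math

C = 299_792_458.0    # speed of light in vacuum, m/s
U_EARTH = 29_780.0   # assumed mean orbital speed of the Earth, m/s

def aberration_angle(u, c=C):
    """Small-angle aberration, alpha ~ u/c, in radians."""
    return u / c

def expected_extra_tilt(u, eta, c=C):
    """Extra tilt beta - alpha ~ (eta**2 - 1) * u / c, in radians,
    that a water-filled telescope would need if there were no drag."""
    return (eta**2 - 1.0) * u / c

def to_arcsec(rad):
    """Convert radians to seconds of arc."""
    return math.degrees(rad) * 3600.0

alpha = aberration_angle(U_EARTH)
extra = expected_extra_tilt(U_EARTH, eta=1.33)

print(f"alpha        = {to_arcsec(alpha):.1f} arcsec")
print(f"beta - alpha = {to_arcsec(extra):.1f} arcsec (expected, but not observed)")
```

With η = 1.33 the expected extra tilt is a sizeable fraction of the aberration angle itself, so its absence could hardly be attributed to a lack of sensitivity.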

Now, since α and β are the same, the angle of refraction is smaller than either of these two, having the value α/η. The telescope has length ℓ, and the time it takes for light to pass down is t = ηℓ/c in the presence of water. The displacement of the light entering the top of the telescope, as measured from the position of the eyepiece, is the sum of refraction, γℓ, and the dragging of the water, fut, where if there were no drag it would just be ut, the distance that the telescope moves. Hence, this distance can be thought of as the sum of refraction and drag, viz.

$$ut = \gamma\ell + fut.$$

Recalling that ℓ = ct/η and γ = u/cη gives Fresnel’s drag coefficient,

$$f = 1 - \frac{1}{\eta^2}. \tag{3.5.7}$$
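The substitution leading to (3.5.7) can be written out in one line. The following is only a worked restatement of the step above, with ℓ denoting the length of the telescope; the numerical value for water is an illustration using the index η = 1.33 quoted earlier.

```latex
\begin{aligned}
\gamma\ell &= \frac{u}{c\eta}\cdot\frac{ct}{\eta} = \frac{ut}{\eta^{2}}, \\
ut &= \frac{ut}{\eta^{2}} + f\,ut
\quad\Longrightarrow\quad
f = 1 - \frac{1}{\eta^{2}}, \\
\eta &= 1.33 \quad\Longrightarrow\quad f \approx 1 - 0.565 = 0.435 .
\end{aligned}
```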


Who would have thought that (3.5.7) would come from something else, and only be an approximation to order $\eta^{-2}$? That person was Larmor. Larmor [00] was fully aware of the FitzGerald–Lorentz contraction hypothesis for which “the dimensions of the moving system are contracted in comparison with the fixed system in the ratio $\sqrt{1 - u^2/c^2}$.”

Larmor reasoned that the particle should move with its own velocity v and under a convective velocity u of the medium in the x-direction by the amount $x = v(t - ux/c^2)$, or

$$w = \frac{v}{1 + uv/c^2},$$

where w = x/t is the net velocity. He also took into account the FitzGerald–Lorentz contraction, so that the speed is reduced to

$$w = v\left(1 - \frac{u^2}{c^2}\right)^{1/2} \Big/ \left(1 + \frac{uv}{c^2}\right).$$

Then to lowest-order Larmor found

$$w \approx v\left(1 - \frac{1}{\eta^2}\right),$$

and identified the coefficient of v with the Fresnel drag coefficient, (3.5.7).

Had Larmor simply added the term ut to his displacement, i.e. $x = v(t - ux/c^2) + ut$, he would have come out with

$$w = \frac{x}{t} = \frac{u + v}{1 + uv/c^2}. \tag{3.5.8}$$

This has the advantage of symmetrizing the convective velocity u and the particle’s velocity v so that Larmor could equally well have written $x = u(t - vx/c^2) + vt$, and arrived at (3.5.8).
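The symmetric law (3.5.8) and its connection with Fresnel's coefficient can be checked numerically. The sketch below assumes the usual textbook reading in which light moves at v = c/η relative to a medium that itself moves at speed u, so that to first order in u/c the composed speed is c/η + u(1 − 1/η²). The function names and the flow speed of 10 m/s are illustrative assumptions, not taken from the text.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def compose(u, v, c=C):
    """Collinear composition of speeds, w = (u + v) / (1 + u*v/c**2)."""
    return (u + v) / (1.0 + u * v / c**2)

def fresnel_first_order(u, eta, c=C):
    """First-order result: c/eta plus the dragged fraction (1 - 1/eta**2) of u."""
    return c / eta + u * (1.0 - 1.0 / eta**2)

eta = 1.33   # refractive index of water
u = 10.0     # hypothetical speed of the moving water, m/s

exact = compose(u, C / eta)
approx = fresnel_first_order(u, eta)

print(f"exact composition : {exact:.6f} m/s")
print(f"Fresnel, 1st order: {approx:.6f} m/s")
print(f"difference        : {exact - approx:.2e} m/s")
```

The difference is of second order in u/c, and the composition is manifestly symmetric in u and v, which is the property the displacement x = v(t − ux/c²) alone does not have.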

The important point is that to lowest-order both u and v must contribute to the speed of the particle, w. This is what Larmor missed, and in so doing, failed to derive the collinear addition law for speeds, (3.5.8). Whittaker [53] succinctly sums up Larmor’s contribution:

We are now in a position to show the connection between the Lorentz transformation and FitzGerald’s hypothesis of contraction; this connection was first established by Larmor [00] for his approximate form of the Lorentz transformation, which is accurate only to the second order in (w/c), but the extension to the full Lorentz transform is easy.


3.6 Gone with the Aether

3.6.1 Elastic solid versus Maxwell’s equations

It may be said that whereas nineteenth century physicists tried to explain Nature, twentieth century physicists had the more modest goal of attempting to describe her. Kelvin once remarked to his friend Tait, “If you tell me what electricity is, I will tell you everything else.” Unfortunately, today this list has to be lengthened considerably. So after a century and a half of Maxwell, do we know what electricity is? And how best to describe it, whether in terms of fields or their potentials? This was the purpose of the aether: to account for Maxwell’s equations. According to Heaviside,

Let it not be forgotten that Maxwell’s theory is only a first step towards a full theory of the aether; and, moreover, that no theory of the aether can be complete that does not fully account for the omnipresent force of gravitation.

It should be said “Let Maxwell be and all is light,” instead of referring to Newton. In one bold sweep, Maxwell introduced the displacement current to produce a magnetic field, and thereby obtained a wave solution to his equations. Moreover, his equations predicted the velocity at which light should travel. It is given by the ratio of the electromagnetic unit of current to the electrostatic unit of current. By measuring the current and voltage in the lab, and aided with a balance that indicated the equality between electrostatic and magnetic forces, Maxwell found, around 1865, that the numerical value of c was slightly less than $3 \times 10^{10}$ cm/sec. So if electromagnetic waves really did exist they should have a speed of 300,000 km/sec, a mind-boggling number in those days.

Waves must travel through something, so what is that ‘something’? In Maxwell’s time, it was believed that visible light was the wave motion of a luminiferous aether that, though weightless, had remarkable elastic properties that could give rise to transverse waves. Maxwell’s equations only reiterated this belief that electromagnetic waves were disturbances of the luminiferous aether. And so Maxwell was led to believe that light was merely a manifestation of an electromagnetic wave.

Even earlier, Faraday had speculated upon this possibility, reasoning that whereas one all-pervasive and infinite aether was heavy on the mind, all the more so would be the coexistence of two aethers, one for light and the other for electricity. On this basis alone, Faraday reasoned that light was an electromagnetic phenomenon.

Maxwell’s equations describe the unified electromagnetic and luminiferous aether, which is the “seat and transmitter of not only of electric and magnetic energy, but that of light.” Is this the most general aether there is? Helmholtz devised a more general aether which would generate longitudinal as well as transverse waves, but would do so independently [cf. Sec. 11.5.5]. It was Boltzmann who showed how this could be done by introducing certain substitutions. Moreover, Boltzmann’s substitutions showed how Helmholtz’s aether could be reduced to that of Maxwell. However, all attempts to find light phenomena that are governed by longitudinal waves proved in vain, and so, too, the need to generalize Maxwell’s equations. Even today it is felt [Skilling 42]

safest to avoid the embarrassing question of the character of the medium in which such waves are transmitted. The best we can do is to follow Faraday’s “shadow of a speculation” and “dismiss the aether but keep the vibrations.”

Elastic solid theories of light were mechanical and made precise statements of what light is: light is the vibration of an elastic solid. Simple refraction could be explained by changes either in the rigidity of the solid or its density. However, it is known that a longitudinal wave is created in the reflection of a transverse wave polarized in the plane of incidence whereas one oscillating perpendicularly to the plane of incidence does not create a longitudinal wave. The existence of a longitudinal wave would then generate another longitudinal wave as well as a transverse wave in the plane of incidence.

In other words, if the elastic solid was a viable model for light then the splitting of light beams at interfaces would necessarily bring in longitudinal waves, which we know from experience not to exist. The splitting of light waves at surfaces separating two different media was known as double refraction of light, and no longitudinal waves have ever been observed. Maxwell’s theory predicted there would be none.

If we accept that Maxwell’s theory is a theory of propagation through the aether then the only vestiges are the aetheral constants, the permittivity and inductivity. The luminiferous aether has a long and glorious past, but came to a tragic end when it was interpreted as vacuous by the Michelson–Morley experiment. Why were so many of the great physicists of the nineteenth century prey to it? And why was it felt necessary to get rid of the longitudinal waves and what seemed an absence of boundary conditions? Maxwell’s integrals of densities were over all space.

In a continuous, elastic medium (Green’s aether) with a compressibility λ, and shear modulus, n, measuring its ‘rigidity’ to deformation, there are two kinds of waves that can propagate through it: longitudinal waves in which the medium wiggles back and forth along the direction of propagation, and transversal waves in which the medium waves back and forth in directions normal to the direction of propagation. These waves can propagate independently of one another, and at different speeds: transversal waves travel at a speed,

$$v_T = \sqrt{\left(\frac{n}{\rho}\right)}, \tag{3.6.1}$$

while longitudinal waves have a speed,

$$v_L = \sqrt{\left(\frac{\lambda + \tfrac{4}{3}n}{\rho}\right)}, \tag{3.6.2}$$

in a medium of density ρ. We can understand the rigidity and density of a material body, but how do these properties relate to the electromagnetic field? In other words, what values must we substitute for these quantities so that the speed of light comes out?
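The two elastic wave speeds, transversal √(n/ρ) and longitudinal √((λ + 4n/3)/ρ), are easy to evaluate, and the question just posed can be turned around: for a given density, what rigidity would make the transversal speed equal to c? The following is a rough sketch under stated assumptions. The steel-like moduli and density are illustrative values only, λ enters as a modulus exactly as it appears in (3.6.2), and the helper `rigidity_for_speed` is a hypothetical name introduced here.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def transverse_speed(n, rho):
    """Transversal wave speed sqrt(n / rho) for shear modulus n and density rho."""
    return math.sqrt(n / rho)

def longitudinal_speed(lam, n, rho):
    """Longitudinal wave speed sqrt((lam + 4n/3) / rho)."""
    return math.sqrt((lam + 4.0 * n / 3.0) / rho)

def rigidity_for_speed(v, rho):
    """Shear modulus needed for transversal waves of speed v at density rho."""
    return rho * v**2

# Illustrative, steel-like values (SI units); not taken from the text.
lam, n, rho = 160e9, 80e9, 7800.0
print(f"v_T = {transverse_speed(n, rho):.0f} m/s")
print(f"v_L = {longitudinal_speed(lam, n, rho):.0f} m/s")

# Rigidity an 'aether' of the same density would need for v_T = c.
print(f"n for v_T = c: {rigidity_for_speed(C, rho):.3e} Pa")
```

Since λ + 4n/3 exceeds n whenever λ is non-negative, the longitudinal wave always outruns the transversal one in such a medium, which is what made the missing longitudinal waves so awkward for the elastic solid picture.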

With the discovery of polarization at the beginning of the nineteenth century by Malus, Arago and Fresnel, there was almost unanimous consensus that luminiferous vibrations had to be transversal. The nagging question was how to get rid of the longitudinal waves, if they had to be gotten rid of at all. In fact, their observation was a long awaited event. Initially, it was thought Röntgen’s rays, which he discovered in 1895, were the long awaited aetheral waves. The English physicist Silvanus Thompson called them ‘ultra-violet sound’ due to their very short wavelengths. The new longitudinal radiation was considered as the missing link to the understanding of gravity. It even provoked a reaction in Kelvin who put together a paper entitled “On the generation of longitudinal waves in the aether.” Since Maxwell’s equations only allowed transverse wave propagation, Larmor voiced his opinion that Maxwell’s equations “required some modification” that would allow for longitudinal disturbances to propagate. Even Helmholtz’s electrodynamic theory allowed for longitudinal waves, and Helmholtz was championed by none other than Boltzmann.

It is rather interesting to note how genius is limited to a certain area of expertise. Boltzmann was no match for Heaviside in electromagnetism, but Heaviside was no match for Planck in the arena of thermodynamics. Gibbs, though, held his own in both areas: Kelvin’s claim that Röntgen waves were “condensational waves of the luminiferous aether” was followed by his idea on how to test experimentally these waves of superluminal speed, which according to (3.6.2) would need a large compressibility factor, λ. Gibbs solved the electromagnetic equations under the conditions cited by Kelvin and showed only transverse waves resulted. Both Boltzmann and Lodge supported Kelvin in the hope that the newly discovered radiation would be longitudinal because it would make the aether much more palatable. Even Rayleigh made his bloomers, as in his paper “The apparent failure of the usual electromagnetic equations” in which he sided with the American Barus that Maxwell’s equations could be interpreted as to make the waves run ‘backwards.’

Ten years were to pass after Röntgen’s discovery before Barkla proved conclusively that these new X-rays were transversal. What Barkla did was to assume that the rays were transversal and use his experiment to confirm it. Incident light makes dipoles vibrate in the direction of its electric vector. The scattered light will be linearly polarized if the incident light is linearly polarized. The intensity of scattered light will be zero in the direction of the electric vector and maximum at the direction making a right angle to both it and the direction of propagation of the incident wave. Alternatively, if the incident light is unpolarized, the light scattered in the direction normal to the direction of propagation will be linearly polarized. The components of the electric vectors of the unpolarized light that are normal to both these directions produce scattered light in the direction perpendicular to the direction of propagation of the incident light. In Barkla’s experiment the scatterers were spheres of paraffin; had they been heavier materials, the radiation produced on scattering might have given wrong results.

Yet, none of these superstars would have ever questioned the existence of the aether. Whereas aether theories had to have longitudinal waves, Maxwell’s theory just did not include them. As Heaviside asserted:

There are no ‘longitudinal’ waves in Maxwell’s theory analogous to sound waves. Maxwell took good care that there should not be any.

The vectors E and H oscillate in planes normal to the direction of propagation. All, and that is a pretty big prescription, one must know are the permittivity ε and permeability µ. In contrast, the theories of the aether attributed a difference in the optical properties of two bodies either to their difference in density ρ or to their difference in rigidity, n.

Associating either the electric force or the magnetic force with the velocity of the medium serves to reduce an electromagnetic problem to one of an elastic solid which is easier to picture. Experience has taught that nearly all transparent bodies (i.e. non-conducting) have equal permeabilities, µ ≈ 1, and places the blame of optical differences squarely on the shoulders of the permittivities, ε. This is true at least for extremely short electromagnetic waves. Variations in the density are therefore due to variations in the permittivity which for different dielectrics has widely differing values. This means that the electric energy density is a kinetic energy density, and since both are quadratic, the electric field is proportional to the velocity. The magnetic induction is represented by the rotation of the aether particles and the electric vector is proportional to their velocity. This is the way Fresnel pictured light.

Although Heaviside would agree entirely with such an analogy, it is not without its difficulties. For an ordinary conductor with constant charge, it would mean that there is a permanent flow of aether. However, following the analogy, the vector potential would be the spatial displacement, instead of the ‘electrokinetic momentum’ designated by Maxwell, so that the displacement current would play the role of momentum, the magnetic force that of a torque, and the magnetic energy would be the potential energy due to rotation.

Rather, if inertia is now attributed to the permeability which should occur in the opposite, long wavelength limit, then the magnetic force would play the role of the velocity of the aether particles and magnetic induction their momentum. The continuous streaming of the aether near a magnet is circuitous, which is not difficult to imagine since it leads to steady flow. Thus, whenever there is a magnetic force, there will be an aetheral velocity, and the dielectric displacement is the rotation due to that velocity.

This is the way Neumann construed light to behave, where the magnetic force lies in the plane of polarization. There is nothing in electromagnetic theory to indicate which force lies in the plane of polarization. Historically, the magnetic vector is called the direction of polarization, and the plane containing this vector and the direction of propagation is referred to as the plane of polarization. But, one could equally choose the plane of polarization to be that containing the electric vector and the direction of propagation. In the case of the reflection of light at the boundary of a transparent dielectric, if we assume that the displacement coincides with the electric displacement, we get Fresnel’s formula for the ratio of the reflected to incident wave thereby indicating that the magnetic vector lies in the plane of polarization.

Even though the magnetic force plays a role only when the particle is in motion, Heaviside, while admitting the difficulties in associating the electric force with the velocity of the medium in that a charge must be continually “emitting fluid in all directions,” considered the case far worse if the magnetic force is the velocity,

for an impossibility is involved. The electric force becomes rotation or proportional thereto, and the impossibility is that we need to have E both circuital and polar roundabout an isolated charge! Dr. Larmor’s determined attempt to make the rotational aether go, with H as the velocity, labors under this apparently incurable defect.

The situation is not so clear-cut as Heaviside would have us believe, and even he wavered. In the first volume of his Electromagnetic Theory he championed associating a velocity with the magnetic force. By the time he got to the second volume he had changed his mind completely. In the first volume he argues

I have shown that when impressed electric forces act it is the curl or rotation of the electric force which is to be considered as the source of the resulting disturbances. Now, on the assumption that the magnetic force is the velocity of the elastic solid, we find that the curl of the impressed electric force is represented simply by [the] impressed mechanical force of the ordinary Newtonian type. This is very convenient.

If we are considering the increase in inertia due to a charge set in motion, the natural thing would be to associate H with the velocity, as we shall do in Sec. 5.4.3, while if we want to discuss the possibility of compressional waves then E is the more likely candidate for the velocity, as we shall see in Sec. 7.3.

Heaviside oscillated between euphoria

Aether is a wonderful thing. It may exist in the imagination of the wise, being invented and endowed with properties to suit their purposes; but we cannot do without it. . . But admitting the aether to propagate gravity instantaneously, it must have wonderful properties, unlike anything we know.

and depression

The actual constitution of the aether is unknown. It can never be known.

3.6.2 The index of refraction

The aether had its purpose and maybe it still has. It got the likes of Maxwell to reason in terms of bodily distortions and how they could be modeled by two field equations, even though he preferred his ‘electrokinetic momentum’ to the actual fields. As we shall see in Sec. 7.3, the decomposition of the wave equation into the field equations provides an understanding of what is being propagated and how it is being propagated.

Moreover, the product of the two parameters in Maxwell’s theory allows the introduction of a spatially varying index of refraction. Until Hertz’s famous experiment, the index of refraction was a gateway to experimental confirmation. In a transparent isotropic medium with dielectric constant ε, and showing but a negligible difference in µ, Maxwell predicted the speed of light to be $c/\sqrt{\varepsilon}$. The refractive index of the dielectric medium is simply $\sqrt{\varepsilon}$. Maxwell extrapolated the index of refraction for infinitely long waves so that it would approach a quasi-static process which would facilitate measurements of the dielectric constant. For solid paraffin, whose square-root of the dielectric constant was 1.405, he determined an index of refraction of 1.422. The difference was more than experimental error would allow and did not confirm conclusively his theory. Although Maxwell admitted that the dielectric constant was not the sole contribution to the index of refraction, it was its major contributor. Maxwell expected better agreement when “the grain structure of the medium in question will be taken into account.” Sadly, he died before his theory was confirmed by Hertz.


Much of the sequel depends on the form of the index of refraction, and for mass to enter we must consider a dispersive medium. In this respect we may follow Schrödinger’s derivation of the group velocity. He considered the equivalence between Hamilton’s principle,

$$\int 2T\,dt = \text{extremum}, \tag{3.6.3}$$

and Fermat’s principle of least time,

$$\int \frac{ds}{v} = \text{extremum}, \tag{3.6.4}$$

where 2T is twice the kinetic energy, ds is an element of path, and v is the phase velocity. The integrands of the two principles should be proportional to one another.

Schrödinger then wrote the kinetic energy as

$$2T = m\left(\frac{ds}{dt}\right)^2 = 2(W - V) = \sqrt{2m(W - V)}\,\frac{ds}{dt}.$$

Inserting this into (3.6.3) gave him

$$\int \sqrt{2m(W - V)}\,ds = \text{extremum}. \tag{3.6.5}$$

The integrands of (3.6.4) and (3.6.5) must be proportional to one another, i.e.

$$v = \frac{C}{\sqrt{2m(W - V)}}, \tag{3.6.6}$$

where the constant of proportionality, C, must be independent of the space coordinates upon which the potential V depends. Hence, it can, at most, be a function of the total energy, W.

For a nonrelativistic particle, Schrödinger identified the group velocity u from the expression of the momentum

$$u = \frac{\sqrt{2m(W - V)}}{m}. \tag{3.6.7}$$

If ω is the angular frequency and κ the wavenumber, then this must coincide with the definition of the group velocity as

$$\frac{1}{u} = \frac{d\kappa}{d\omega} = \frac{d}{d\omega}\left(\frac{\omega}{\omega/\kappa}\right) = \frac{d}{dW}\left(\frac{W\sqrt{2m(W - V)}}{C}\right),$$

where ω/κ is the phase velocity, and he availed himself of Planck’s relation between energy and frequency. Upon differentiation, Schrödinger found C = W.
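The differentiation can be made explicit. What follows is a consistency check rather than a proof of uniqueness: it shows that the choice C = W, inserted into the group-velocity condition, reproduces the reciprocal of the group velocity u = √[2m(W − V)]/m of (3.6.7). Planck's constant is set to unity, as the identification of W with frequency in the text suggests.

```latex
\begin{aligned}
\frac{1}{u}
  &= \frac{d}{dW}\left(\frac{W\sqrt{2m(W-V)}}{C}\right)\Bigg|_{C=W}
   = \frac{d}{dW}\sqrt{2m(W-V)} \\
  &= \frac{m}{\sqrt{2m(W-V)}}
   = \left(\frac{\sqrt{2m(W-V)}}{m}\right)^{-1}.
\end{aligned}
```

With this choice the phase velocity becomes v = W/√[2m(W − V)], so that the product of the phase and group velocities is W/m.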

For an electromagnetic wave, the phase velocity is c/η; if the system is inhomogeneous, the index of refraction will depend upon the spatial coordinates. The product of the group, (3.6.7), and phase, (3.6.6), velocities is

$$uv = \frac{c}{\eta}\cdot\frac{\sqrt{2m(W - V)}}{m} = \frac{W}{m},$$

from which the expression,

$$\eta = \frac{c\sqrt{2m(W - V)}}{W}, \tag{3.6.8}$$

for the index of refraction follows. Interpreting W as frequency we have

$$\kappa^2 := \left(\frac{\eta\omega}{c}\right)^2 = 2m(W - V),$$

and the reduced, or Helmholtz, equation becomes

$$\nabla^2 \mathbf{E} + \kappa^2 \mathbf{E} = 0, \tag{3.6.9}$$

which will have a prominent role in our gravitational studies in Chapter 7.

Schrödinger then opted to describe the vibratory motion of an electron in the hydrogen atom by finding the “possible movements of an elastic body.” Realizing that this is complicated by the existence of both longitudinal and transverse waves, he decided to “avoid this complication,” and consider only longitudinal waves, thereby missing out on the discovery of spin.

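Rather, we consider the wave equation obtained from the non-dispersive Maxwell equations

$$\nabla \times \mathbf{H} = \varepsilon\dot{\mathbf{E}},$$

$$-\nabla \times \mathbf{E} = \mu\dot{\mathbf{H}},$$

where the dot denotes differentiation with respect to time.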

We look for a solution where the electric and magnetic forces have the form

$$X(x, y, z)\,e^{-i\omega[t - S(x, y, z)]},$$

where S is known as the eikonal. In the high frequency limit, or short wavelength limit, the wave propagates in the direction ∇S which is perpendicular to both E and H, where

$$(\nabla S)^2 = \mu\varepsilon = \frac{1}{v^2} = \frac{\eta^2}{c^2}.$$

Combining this with (3.6.8) gives the expression for the product of the two parameters appearing in Maxwell’s theory as

$$\varepsilon\mu = \frac{2m(\omega - V)}{\omega^2}, \tag{3.6.10}$$

in units of action.

Unfortunately, there is no known material that has a constitutive relation of the form (3.6.10). It neither reduces to Cauchy’s law of dispersion in an electromagnetic system where $\eta = \sqrt{\varepsilon}$ and µ = 1, nor in an electrostatic system where $\eta = \sqrt{\mu}$ and ε = 1. However, it does include the phenomenon of total reflection, where ω < V. Total reflection occurs when the first medium from which light arrives is optically denser than the second medium into which it enters. Take glass and the vacuum, and use primes to indicate the internal incidence within the glass, and the external refraction in the vacuum, β′ and γ′. In applying Snell’s law, (3.5.6), the external angle of refraction γ′ corresponds to our former β, while β′ is equal to our former γ. Total internal reflection occurs when $\beta' > \sin^{-1}(1/\eta)$, where the latter is referred to as the critical angle. Beyond this angle, Snell’s law yields values of sin γ′ which are preposterous because

$$\sin\gamma' = \eta\sin\beta' > 1,$$

and, consequently, values of cos γ′ that are imaginary. The Fresnel reflection coefficient from the boundary becomes complex, predicting total reflection, but, more surprisingly, it predicts the correct phase jump that occurs at total reflection.

However, even beyond the critical angle there still exists a refracted wave, whose amplitude, however, will decay exponentially, the more rapidly the greater the difference $\beta' - \sin^{-1}(1/\eta)$ becomes. The reason for this transmission is exactly the same as for electron waves: In a region of imaginary index of refraction the waves penetrate in an exponentially decaying manner. The inequality ω < V implies (3.6.8) is imaginary.
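The behavior at and beyond the critical angle can be illustrated numerically. The sketch below is a minimal example under stated assumptions: it uses the standard Fresnel coefficient for the electric vector perpendicular to the plane of incidence, takes glass with η = 1.5 against vacuum, and introduces the function names and the 500 nm wavelength purely for illustration. The sign of the imaginary root, and hence of the phase, depends on the time convention adopted.

```python
import cmath
import math

def critical_angle(eta):
    """Critical angle asin(1/eta) for light going from index eta into vacuum."""
    return math.asin(1.0 / eta)

def refracted_cosine(eta, beta_in):
    """cos(gamma') from Snell's law; purely imaginary beyond the critical angle."""
    s = eta * math.sin(beta_in)       # sin(gamma') = eta * sin(beta')
    return cmath.sqrt(1.0 - s * s)

def fresnel_r_perp(eta, beta_in):
    """Fresnel amplitude reflection coefficient, E normal to the plane of incidence."""
    cos_i = math.cos(beta_in)
    cos_t = refracted_cosine(eta, beta_in)
    return (eta * cos_i - cos_t) / (eta * cos_i + cos_t)

def decay_constant(eta, beta_in, wavelength):
    """Spatial decay rate of the refracted amplitude beyond the critical angle."""
    s = eta * math.sin(beta_in)
    if s <= 1.0:
        return 0.0                    # ordinary refraction, no exponential decay
    return (2.0 * math.pi / wavelength) * math.sqrt(s * s - 1.0)

eta = 1.5
print(f"critical angle = {math.degrees(critical_angle(eta)):.2f} deg")
for deg in (30.0, 50.0, 70.0):
    r = fresnel_r_perp(eta, math.radians(deg))
    k = decay_constant(eta, math.radians(deg), wavelength=500e-9)
    print(f"beta' = {deg:4.1f} deg  |r| = {abs(r):.4f}  "
          f"phase = {math.degrees(cmath.phase(r)):8.2f} deg  decay = {k:.3e} 1/m")
```

Below the critical angle the coefficient is real and smaller than one in magnitude; beyond it the magnitude is exactly one while the phase varies with angle, and the decay constant grows as the angle of incidence moves further past the critical angle.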


3.7 Motion Causes Bodily Distortion

Even before Larmor, Lorentz was contemplating what effects motion has on the change in configuration of bodies. According to Lorentz, the shape and charge distribution of an electron at rest must be correlated with one in motion. In Sec. 5.4.4, we will see that Lorentz assumed that when an electron, with a spherical shape and uniform charge distribution on its surface, is set in motion it will become an oblate spheroid, whose semiminor to semimajor axes are related by the FitzGerald–Lorentz contraction factor. Although Lorentz does not pretend to offer even a partial explanation, he has gone a long way in driving home the point that motion causes distortion. There is no better way of summarizing this than to repeat Cunningham’s [14] words:

. . . we are bound to recognize the possibility of changes in the shape and properties of material bodies when their velocity is altered, and also that the Newtonian conception of a rigid body as one having a permanent configuration independent of its velocity is one which is not even approximately realized unless that velocity is very small compared with that of light.

3.7.1 Optical effect: Double diffraction experiments

Michelson’s attempt to determine the speed of the Earth relative to the aether was the first of its kind at determining a second-order effect. Even earlier, Mascart [74] sought a first-order effect, which was repeated later by Lord Rayleigh [02], and, still later, by Brace [04] to an even higher degree of precision.

Rayleigh reasoned that if an isotropic transparent body actually did contract as a result of its motion through the aether, then its optical properties must be modified. An isotropic, transparent body would no longer be isotropic, and, as a consequence, a beam of light passing through it in an oblique direction would undergo birefringence, or double refraction. However, he could find no trace of such a change. Brace repeated Rayleigh’s experiment to a precision that, if detected, would be one-fiftieth of what would be produced through a mechanical contraction due to pressure.

The null result of Rayleigh’s attempt was fully expected by Larmor when his paper was read before the British Association, meeting in Belfast in September 1902. Whereas “optical measurements are usually made by the null method of adjusting the apparatus so that the disturbance vanishes,” Larmor [00] contended that the “result carries the general absence of the effect of the Earth’s motion in optical experiments, up to second-order of small quantities”; that is, up to terms of order $(u/c)^2$.

This hardly led to credence in the FitzGerald–Lorentz hypothesis of contraction in the Michelson–Morley experiment. For if there were factors of ‘compensation’ they should have surely been at work in that experiment. However, Lorentz [16] was so attached to his contraction hypothesis that he offered compensating mechanisms, one being the compensation that would ensue if there would be differences in the effective mass of an electron when it vibrates in different directions.

3.7.2 Trouton–Noble null mechanical effect

There is absolutely no evidence of a couple acting on two rigidly connected charges due to the motion of the Earth through the aether.

Trouton, in collaboration with Noble [03], carried on the search for an effect of the motion of the Earth through the aether that had been begun by his recently deceased mentor FitzGerald. They set up the apparatus shown in Fig. 3.4 to measure a mechanical effect.

Take two equal and opposite charges and set them in motion with the same velocity in parallel directions. The angle between the charges and the direction of the motion should neither be a right angle nor zero. Suppose the charges are moving in the x-direction in the xy-plane. The positive charge will produce a magnetic field at the negative charge in the z-direction. The negative charge will feel a force in the direction of the y-axis and the positive charge will feel an equal force in the −y direction. The charges were then mounted on a platform which could turn freely about its center. The forces acting on the charges would then be transmitted to the platform to provide a couple that would tend to set it in motion at right angles to the motion of the charges. If it exists, the effect would be to the second-order. The greatest deflection that was measured was 0.36 cm, while the deflection expected theoretically was 6.8 cm.

Fig. 3.4. (a) Schematic arrangement of their experimental set-up. A parallel plate capacitor AB was hung from a 37 cm long phosphor bronze strip PA. The capacitor was charged to voltages up to 3000 V. A mirror attached to the capacitor was viewed through a telescope to see if there were any oscillations. In (b) E is the capacitor’s electric field and v is the direction of the Earth’s motion through the aether. The angle is between the line connecting the opposite charges and the direction of their motion.

This experiment extended the null results found in optical-electrodynamic phenomena to mechanical-electrodynamic effects. Mysterious compensating factors were at work that led one to believe that there was some unknown mechanism upon which the mechanical, optical, and electromagnetic properties of matter depend. This unknown mechanism was whisked away by Poincaré even before many of the experiments had been performed! In his optics course at the Sorbonne in 1899, Poincaré discarded the possibility of ever finding an effect that would provide evidence of the motion of the Earth through the aether, either to first- or second-order in the coefficient of aberration, u/c. In his own words,

I regard it as very probable that optical phenomena depend only on the relative motions of material bodies, luminous sources, and optical apparatus concerned, and that is true not merely as far as quantities of the order of the square of the aberration, but rigorously.

Whittaker [53] drives this point home by mentioning two other occasions where Poincaré lays down his ‘principle of relativity,’ even going so far as to liken it to the second law of thermodynamics, in that it would fall if a single example could be found where absolute motion was detectable. Apparently, Whittaker was anticipating similar claims made by Einstein in his electromagnetic paper of 1905 in which he “set forth the relativity theory of Poincaré and Lorentz with some amplifications, and which attracted much attention.”

3.7.3 Anisotropy of mass

Be that as it may, everyone, including Lorentz and Poincaré, was talking about charged particles, and, in particular, the electron. But, as Thomson [28] notes:

Einstein has shown that to conform with the principles of Relativity mass must vary with velocity according to the law $m_0/\sqrt{1 - u^2/c^2}$. This is a test imposed by Relativity on any theory of mass. We see that it is satisfied by the conception that the whole of the mass is electrical in origin, and this conception is the only one yet advanced which gives a physical explanation of the dependence of the mass on velocity.

Einstein [23a] wrote in his 1905 paper on electrodynamics:

We remark that these results as to the mass are valid for ponderable material points, because a ponderable material point can be made into an electron (in our sense of the word) by the addition of an electric charge, no matter how small.

This statement makes no sense in itself, except for expressing the desire to incorporate all of matter into a theory of relativity, for which a case could only be made for electrons. Einstein does not even broach the problem of the distinction between the electrostatic and electromagnetic masses. As such, it can be considered nothing less than a leap of faith.

Even up to the early 1940s, there was no example of the mass of an uncharged particle varying with velocity. Stranathan [42] is quick to point out that while relativity theory and the earlier theory of Lorentz predicted the same dependency of the mass on speed, there was one important difference:

Lorentz’s theory treated only electromagnetic mass, or mass attributable to a charged particle because of the energy represented by the fields about it. The relativity theory treats mass in general, with no specification of what may be responsible for the property. It is true that Lorentz presented arguments that all mass is probably electromagnetic in character. If these arguments are accepted, his theory would lead one to suppose that all mass varies with velocity. The relativity theory predicts this directly.

But, from all the experiments to date, all one could deduce is e/m, the ratio of the charge to the mass. So, if there were no charge, there would be no deflection, and hence no dependency of the mass on the speed! Thus, one had to go to extra-terrestrial objects to show that mass depends on speed. The example picked by Stranathan is the advance of the perihelion of Mercury, where, after taking into account all known perturbing forces on the orbit, there still remains a residual quantity that must be explained, that of some 43′′ of arc per century. According to Stranathan,

There was no logical interpretation of this residual rotation before the advent of relativity theory. If, however, account is taken of the variation of the mass of the planet with velocity in its orbit, it turns out that one would expect this mass variation to produce a rotation almost exactly the residual observed. This is rather convincing evidence of the change in mass with velocity for ordinary matter. No one doubts that all mass changes in exactly the same way.

If this were correct, there would have been no need to invent general relativity. In Sec. 7.6.3 we will see that the advance of the perihelion is due to factors other than the increase in mass with speed. It just seems astounding that recourse had to be made to a phenomenon which has nothing to do with the increase in mass when a charged particle undergoes deflection in an electromagnetic field.

Then it is a question of mass and its relation to energy and stress. According to the Newtonian viewpoint, the momentum of a particle would always coincide with its direction of motion. That is, the mass, or better the ‘rest’ mass, is a scalar quantity which is necessarily isotropic. However, since the inertia of a body depends on its heat content [cf. Sec. 6.1], there is a part of the mass that depends on the stress. Only in the case where the stress degenerates into the ordinary, scalar, pressure will the mass be isotropic. We may then say that inertia is polarized by its motion. This will occupy our attention in Chapter 11.


The simplest example is a person walking a wheel. If its center of inertia is at rest with respect to the observer, he will see a circle. However, if the wheel is in motion with respect to the observer, he will see an ellipse. The part of the wheel touching the ground will not seem contracted, while the uppermost part of the wheel will move with double the velocity, and thus appear contracted. Consequently, there will be a greater mass density at the upper part of the wheel than at the bottom. A greater amount of inertia above the center of inertia, coinciding with the center of the ellipse, means that inertia has been polarized by the motion of the wheel [Fokker 65].

Momentum will be parallel to the velocity only when the particle moves along one of its principal axes of stress. That is, the torque exerted on the particle will vanish only when its constant velocity coincides with one of its principal stress-axes. Stress entangles translational and rotational motion so that there will be two differently directed vectors, and, therefore, the mass will be neither a vector nor a scalar but a combination of the two: a quaternion.

3.7.3.1 Quaternionic mass

From the Lorentz transform on momentum and energy, it emerges that the mass is given by the quaternion [Silberstein 14]

$$M = \frac{W}{c^2} + \frac{V}{c^2\gamma}\,\epsilon\mathbf{P}, \tag{3.7.1}$$

where V is the volume at rest, and P is the vector operator, the stress, which is a mixed second-order tensor. The so-called longitudinal stretching vector, which is a Lorentz transform coefficient without rotation, $\epsilon = \gamma\,\mathbf{ii} + \mathbf{jj} + \mathbf{kk}$, where $\gamma = 1/\sqrt{1 - u^2/c^2}$, stretches vectors parallel to the motion in the ratio γ : 1, while not affecting those normal to the motion. Since i is the versor for the velocity u, i.e. $\mathbf{i} = \mathbf{u}/u$, the stretching factor can be written as

$$\epsilon = I + (\gamma - 1)\frac{\mathbf{uu}}{u^2},$$

where $I = \mathbf{ii} + \mathbf{jj} + \mathbf{kk}$ is the idemfactor. Inserting this into the mass (3.7.1) leads to

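$$Mc^2 = W + V\left[\frac{(\mathbf{P}\cdot\mathbf{u})\mathbf{u}}{u^2} + \gamma^{-1}\left(\mathbf{P} - \frac{(\mathbf{P}\cdot\mathbf{u})\mathbf{u}}{u^2}\right)\right]. \tag{3.7.2}$$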
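The step from (3.7.1) to (3.7.2) can be checked numerically. The sketch below is only illustrative: it assumes that P may be treated as an ordinary three-vector, takes c = 1, and uses made-up component values; the function names are hypothetical. It applies the stretching factor ε = I + (γ − 1)uu/u² to P, divides by γ, and compares the result with the bracket of (3.7.2).

```python
import math

def gamma(u, c=1.0):
    """Lorentz factor for a velocity vector u (list of three components)."""
    speed2 = sum(x * x for x in u)
    return 1.0 / math.sqrt(1.0 - speed2 / c**2)

def stretch(P, u, c=1.0):
    """Apply eps = I + (gamma - 1) u u / u^2 to the vector P."""
    g = gamma(u, c)
    u2 = sum(x * x for x in u)
    Pu = sum(p * x for p, x in zip(P, u))
    return [p + (g - 1.0) * Pu * x / u2 for p, x in zip(P, u)]

def bracket_372(P, u, c=1.0):
    """Bracket of (3.7.2): longitudinal part plus transverse part over gamma."""
    g = gamma(u, c)
    u2 = sum(x * x for x in u)
    Pu = sum(p * x for p, x in zip(P, u))
    longitudinal = [Pu * x / u2 for x in u]
    return [l + (p - l) / g for l, p in zip(longitudinal, P)]

# Illustrative values only (assumed, not taken from the text).
P = [0.3, -1.2, 0.7]
u = [0.5, 0.2, -0.1]

lhs = [x / gamma(u) for x in stretch(P, u)]   # (1/gamma) * eps * P
rhs = bracket_372(P, u)
max_difference = max(abs(a - b) for a, b in zip(lhs, rhs))
print(max_difference)
```

The longitudinal part of P passes through unchanged, while the transverse part is reduced by the factor 1/γ, which is the contraction referred to in the text.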


In general, the mass will have both scalar and vector components. The vector components are further broken down into the longitudinal and transverse components; the latter, according to (3.7.2), undergoes a FitzGerald–Lorentz contraction.

If the linear vector operator P degenerates into a simple scalar, the pressure is purely normal and equal in all directions. The mass operator, (3.5.5), degenerates into the scalar rest-mass,

$$M = \frac{W + PV}{c^2} = \frac{H}{c^2}, \tag{3.7.3}$$

where P is the isotropic, or ‘hydrostatic,’ pressure, and H is known as the heat function, or enthalpy. This expression for the energetic equivalence of mass was first advanced by Planck in 1907 [cf. Sec. 6.4].

The principal axes of stress are those for which the pressure becomes purely normal. The stress will have three mutually perpendicular principal axes with the corresponding principal pressures that are represented by the scalars $P_i$, $P_j$ and $P_k$. With the definition of the absolute value of the pressure as:

$$|P| = \sqrt{\sum_{i=1}^{3} P_i^2},$$

the mass operator can be written as:

$$M = \frac{1}{c^2}\begin{pmatrix} W & -|P|V \\ |P|V & W \end{pmatrix} = \frac{1}{c^2}\left(W\mathbf{1} + \mathbf{i}|P|V\right), \tag{3.7.4}$$

where

$$\mathbf{1} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \mathbf{i} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$$

The squared absolute value of the mass is

$$|M|^2c^4 = W^2 + |P|^2V^2 = W^2 + V^2\sum_{i=1}^{3} P_i^2, \tag{3.7.5}$$

and not as expected from Planck’s formula (3.7.3).


The second term in (3.7.4) is responsible for the inertia of polarization. Although we will come back to this in Chapter 11, let it suffice to say here that the inertia of polarization destroys the decomposition of the mass into ‘longitudinal’ and ‘transverse’ components in the equation of motion,

$$\frac{d\mathbf{G}}{dt} = \mathbf{F},$$

since the inertial rest mass is no longer a scalar mass.

Instead of the matrix representation of the complex mass, (3.7.4), we can form the quaternion,

$$M = [W + (P_1\mathbf{i} + P_2\mathbf{j} + P_3\mathbf{k})V]/c^2. \tag{3.7.6}$$

To this quaternion, there corresponds the matrix:

$$A = \frac{1}{c^2}\begin{pmatrix} W + iP_1V & (P_2 + iP_3)V \\ -(P_2 - iP_3)V & W - iP_1V \end{pmatrix}. \tag{3.7.7}$$

Thus, if we multiply the pressure vector by −i, which is equivalent to rotating it through −π/2, the matrix (3.7.7) becomes

$$A = \frac{1}{c^2}\begin{pmatrix} W + P_1V & (P_3 - iP_2)V \\ (P_3 + iP_2)V & W - P_1V \end{pmatrix}.$$

The determinant of (3.7.7) is the invariant form (3.7.5). Note, there is no mass component for each principal value of the stress [Silberstein 14], for that would involve an energy component in addition. Expression (3.7.6) generalizes Planck’s expression for the scalar rest mass, (3.7.3), and gives a general relation between the principal stresses and the rest mass.
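The determinant of the matrix (3.7.7) can be evaluated directly. The following sketch, with illustrative numbers, c = 1, and hypothetical function names, builds the 2 × 2 complex matrix from W, V and the three principal pressures and compares its determinant with the quadratic form W² + V²(P₁² + P₂² + P₃²), divided by c⁴.

```python
def mass_matrix(W, P, V, c=1.0):
    """Matrix (3.7.7) for energy W, principal pressures P = (P1, P2, P3), volume V."""
    P1, P2, P3 = P
    return [
        [(W + 1j * P1 * V) / c**2, (P2 + 1j * P3) * V / c**2],
        [-(P2 - 1j * P3) * V / c**2, (W - 1j * P1 * V) / c**2],
    ]

def det2(A):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

# Illustrative values only (assumed, not taken from the text).
W, V, P = 2.0, 0.5, (0.3, -0.4, 1.2)

determinant = det2(mass_matrix(W, P, V))
quadratic_form = W**2 + V**2 * sum(p * p for p in P)
print(determinant, quadratic_form)
```

The imaginary parts cancel in the product, so the determinant is real and is a sum of squares of the energy and the three stress contributions.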

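Another interesting point is this. Apart from the energy terms, the matrix (3.7.7) can be decomposed into three components,

$$A^{(i)} = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \qquad A^{(j)} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad A^{(k)} = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix},$$

which are related to the Pauli spin matrices by

$$\sigma_x = -iA^{(k)}, \qquad \sigma_y = -iA^{(j)}, \qquad \sigma_z = -iA^{(i)}.$$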
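As a quick check of the relation to the Pauli matrices, the sketch below (plain Python with complex numbers; the helper names are hypothetical) multiplies each component matrix by −i and also verifies that the three matrices obey the quaternion multiplication rules ij = k, jk = i, ki = j, with each squaring to minus the unit matrix.

```python
def scale(z, A):
    """Multiply every entry of the 2 x 2 matrix A by the scalar z."""
    return [[z * a for a in row] for row in A]

def matmul(A, B):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

# Component matrices of (3.7.7), apart from the energy terms.
A_i = [[1j, 0], [0, -1j]]
A_j = [[0, 1], [-1, 0]]
A_k = [[0, 1j], [1j, 0]]

# Pauli matrices obtained by multiplying with -i.
sigma_x = scale(-1j, A_k)
sigma_y = scale(-1j, A_j)
sigma_z = scale(-1j, A_i)

minus_identity = [[-1, 0], [0, -1]]
quaternion_rules_hold = (
    matmul(A_i, A_j) == A_k
    and matmul(A_j, A_k) == A_i
    and matmul(A_k, A_i) == A_j
    and all(matmul(A, A) == minus_identity for A in (A_i, A_j, A_k))
)
print(sigma_x, sigma_y, sigma_z, quaternion_rules_hold)
```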


If we rotate the imaginary component of the quaternion (3.7.6) by −π/2, then instead of obtaining the equation of a circle, or three-sphere, (3.7.5), we will get a hyperbola, or hyperboloid. It is this rotation that transforms an elliptical quaternion into a hyperbolic Stokes representation when mass is invariant. If energy is invariant then the Stokes representation coincides with an elliptical quaternion known as the Poincaré representation. These topics will be discussed in much greater detail in Secs. 11.2 and 11.3, respectively.

To test for the possibility of mass anisotropy, Poynting and Gray [22] looked at natural crystals which can harbor enormously large ‘latent stresses.’ They attempted to measure whether a quartz crystal sphere had any directive action on another crystal sphere in its proximity. They did so by attempting to measure any difference in the work it would take to rotate from a configuration where their axes were parallel with one another to the opposite configuration where they would be crossed.

It would seem reasonable to believe that the crystals exerted greater attraction to one another when their axes were parallel. Start with the configuration where both crystals lie in the same plane but with their axes perpendicular to one another. To split them apart will take work. When they are out of the range of attraction, turn one of the crystals around till its axis becomes parallel to the other one. If the process is done quasi-statically it will involve no work. Bringing the crystals together again will show that less work will be needed than on the outgoing journey. When the crystals have come to within the same distance they started out in, rotate one of the crystals so that its axis again becomes normal to the other. Work must be done in order to turn one of the crystals, for, if not, then the cycle could be carried out again always yielding a surplus of energy. This would be tantamount to a perpetual motion machine whereby energy without limit can be drawn from it. So the work that is necessary to rotate the crystal axis must be dissipated either in the cooling of the crystal or in a diminution of its mass. Since neither of these possibilities is feasible, the only conclusion we can come to is that it requires work to rotate the crystal from where its axis is parallel to the other crystal to one where it is normal to it. Poynting and Gray failed to find what they called any ‘directive action in gravitation.’ The matter was dropped at that.


3.7.3.2 Vectorial mass

Another way to define the electromagnetic mass is to consider it as the ratio of the momentum to speed [Schott 12],

$$\mathbf{G} = \mathbf{m}u. \tag{3.7.8}$$

This definition makes mass a vectorial quantity. We will refer to (3.7.8) as transverse mass, in contrast to longitudinal mass,

$$\mathbf{m}' = \frac{\partial\mathbf{G}}{\partial u} = \mathbf{m} + u\frac{\partial\mathbf{m}}{\partial u}, \tag{3.7.9}$$

which is also a vector.

The different components of the mass will be related to the external force, defined by

$$\frac{d\mathbf{G}}{dt} = \frac{\partial\mathbf{G}}{\partial t} + \boldsymbol{\omega}\times\mathbf{G}. \tag{3.7.10}$$

We consider a mass in motion, which, at any instant, has a set of three mutually orthogonal axes, (ξ, η, ζ). These axes are the tangent, principal normal, and binormal to the path that the electron traces out, as shown in Fig. 3.5. The partial derivative in (3.7.10) refers to the differentiation in time relative to the axes which rotate at an angular velocity, ω. Since the electromagnetic momentum, G, is not an explicit function of time,

$$\frac{\partial\mathbf{G}}{\partial t} = \dot{u}\frac{\partial\mathbf{G}}{\partial u},$$

and the angular velocity will have components (u/τ, 0, u/ρ), where ρ and τ are the radii of curvature and torsion, respectively.

Fig. 3.5. Planes formed from a moving trihedron.

The force, (3.7.10), will have the following components:

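$$\begin{aligned}
F_\xi &= m'_\xi\dot{u} - m_\eta\frac{u^2}{\rho},\\
F_\eta &= m'_\eta\dot{u} + m_\xi\frac{u^2}{\rho} - m_\zeta\frac{u^2}{\tau},\\
F_\zeta &= m'_\zeta\dot{u} + m_\eta\frac{u^2}{\tau},
\end{aligned} \tag{3.7.11}$$

implying three different actions [Schott 12]: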

(i) A quasi-longitudinal force component, $\mathbf{m}'\dot{u}$, proportional to the acceleration, $\dot{u}$, in the direction of m′.

(ii) A quasi-transversal force component, $(-m_\eta, m_\xi, 0)u^2/\rho$, proportional to the centripetal acceleration, $u^2/\rho$, in the osculating plane normal to the transverse mass vector, m.

(iii) A torsional force component, $(0, -m_\zeta, m_\eta)u^2/\tau$, proportional to the torsion, 1/τ, in the normal plane perpendicular to the mass vector, m.
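The three actions listed above follow from writing out (3.7.10) in the moving axes. As a minimal sketch (function names and numerical values are illustrative assumptions), the code below forms the force as m′u̇ + ω × G, with G = mu and ω = (u/τ, 0, u/ρ), and compares it with the explicit components of (3.7.11).

```python
def cross(a, b):
    """Cross product of two three-component vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def force_from_rotation(m, m_prime, u, u_dot, rho, tau):
    """dG/dt = m' * u_dot + omega x G, with G = m * u in the (xi, eta, zeta) axes."""
    G = [component * u for component in m]
    omega = [u / tau, 0.0, u / rho]
    relative_rate = [component * u_dot for component in m_prime]
    return [a + b for a, b in zip(relative_rate, cross(omega, G))]

def force_explicit(m, m_prime, u, u_dot, rho, tau):
    """Components written out as in (3.7.11)."""
    m_xi, m_eta, m_zeta = m
    mp_xi, mp_eta, mp_zeta = m_prime
    return [mp_xi * u_dot - m_eta * u**2 / rho,
            mp_eta * u_dot + m_xi * u**2 / rho - m_zeta * u**2 / tau,
            mp_zeta * u_dot + m_eta * u**2 / tau]

# Illustrative values only (assumed, not taken from the text).
m = (1.0, 0.2, -0.1)
m_prime = (1.5, 0.3, 0.05)
args = (m, m_prime, 0.4, 0.02, 2.0, 5.0)

difference = max(abs(a - b) for a, b in
                 zip(force_from_rotation(*args), force_explicit(*args)))
print(difference)
```

Setting the η and ζ components of both mass vectors to zero in this sketch leaves only the longitudinal term along ξ and the centripetal term along η.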

For a symmetrical electron, the torsion vanishes, so the first and second components reduce to the longitudinal and transverse mass components, respectively. If the electron is not symmetrical we need:

(i) components $m'_\eta\dot{u}$ and $m'_\zeta\dot{u}$, in addition to the longitudinal component, $m'_\xi\dot{u}$, in order to keep the electron moving in a straight line, whereas

(ii) in addition to the centripetal component, $m_\xi u^2/\rho$, the longitudinal component, $-m_\eta u^2/\rho$, is necessary to keep the electron moving in a circle of radius ρ at uniform speed, u.

It thus becomes clear that the transverse mass component is related to rotational motion, while the longitudinal mass is involved in rectilinear motion. This gives credence to the statement that the transverse mass arises when the force is perpendicular to the velocity, while the longitudinal mass occurs when the force is parallel to the velocity [Okun 89]. But it is never mentioned that it applies to uniform circular motion.


The momentum of the Lorentz electron,

$$\mathbf{G} = \frac{m_0\mathbf{u}}{\sqrt{1 - \beta^2}}, \tag{3.7.12}$$

where β = u/c, implies that G is in the direction of the motion, which need not always be the case. According to (3.7.8) and (3.7.9) this will imply that the two mass vectors, m′ and m, point in the direction of motion. Consequently, $m' = m'_\xi$ and $m = m_\xi$, and all the other components vanish. The mass dependencies are given by $m = m_0\gamma$ and $m' = m_0\gamma^3$, where, according to Lorentz, $m_0 = 4e^2/5c^2a$, with a the radius of the sphere. To the relativists, $m_0$ is just the ‘rest’ mass, although the force will still have both a longitudinal component, $m_0\gamma^3\dot{u}$, and a centripetal component, $m_0\gamma u^2/\rho$.
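The γ³ dependence of the longitudinal mass follows from the definition (3.7.9) applied to a transverse mass proportional to γ. A small numerical sketch, with m₀ = c = 1 and a central finite difference standing in for the derivative (both choices are assumptions made for illustration, as are the function names), shows the agreement.

```python
import math

def gamma(u, c=1.0):
    """Lorentz factor for speed u."""
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)

def transverse_mass(u, m0=1.0, c=1.0):
    """m = m0 * gamma, the ratio of momentum to speed."""
    return m0 * gamma(u, c)

def longitudinal_mass(u, m0=1.0, c=1.0, h=1e-6):
    """m' = m + u dm/du from (3.7.9), with dm/du taken by central difference."""
    dm_du = (transverse_mass(u + h, m0, c) - transverse_mass(u - h, m0, c)) / (2 * h)
    return transverse_mass(u, m0, c) + u * dm_du

for speed in (0.1, 0.5, 0.9):
    print(speed, longitudinal_mass(speed), gamma(speed) ** 3)
```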

Therefore, the total force acting on a Lorentz electron will be

$$F = m'\dot{u} + mu_\rho, \tag{3.7.13}$$

where $u_\rho = u^2/\rho$ is the centripetal acceleration. However, this is not what the relativists tell us today. The longitudinal component did not comply with the early e/m measurements, and was quickly swept under the carpet. The transverse component has nothing to do with orbital motion so that the force acting on the electron is $m\dot{\mathbf{u}}$, but it still requires $\mathbf{u} \perp \mathbf{F}$!

The rate of energy loss,

$$Fu = m'u\dot{u} + mu^3/\rho, \tag{3.7.14}$$

shows that part of the energy is stored as

$$W = \int m'u\,du,$$

and part is lost at the rate $mu^3/\rho$ due to its tendency to rotate.ᵃ We will now see that the early measurements on the electron were specifically designed to measure the centripetal force in (3.7.13).
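For the Lorentz electron, with m′ = m₀γ³, the stored energy integral can be evaluated numerically. The sketch below uses a simple midpoint rule with m₀ = c = 1 (illustrative choices, with hypothetical function names) and compares the result with the closed form m₀c²(γ − 1), which is the standard value of this integral and is quoted here as a check rather than taken from the text.

```python
import math

def gamma(u, c=1.0):
    """Lorentz factor for speed u."""
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)

def stored_energy(u_final, m0=1.0, c=1.0, steps=20000):
    """Midpoint-rule estimate of W = integral of m' u du from 0 to u_final."""
    h = u_final / steps
    total = 0.0
    for n in range(steps):
        u = (n + 0.5) * h
        total += m0 * gamma(u, c) ** 3 * u * h
    return total

u_final = 0.8
numerical = stored_energy(u_final)
closed_form = gamma(u_final) - 1.0   # m0 c^2 (gamma - 1) with m0 = c = 1
print(numerical, closed_form)
```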

ᵃ Since an accelerating electron radiates, (3.7.14) would have to be supplemented by two other terms: the rate of energy loss due to radiation, and a small term involving the pressure of radiation. We shall consider these terms in Sec. 4.3.


3.7.4 e/m measurements of the transverse mass

It is the purpose of this section to show that the early experiments to measure the ratio e/m were specifically designed to measure the transverse, or centripetal, term in the force law, (3.7.13). There were two general methods by which the ratio e/m could be measured: the so-called deflection methods, where one observes the bending of an electron beam in electric and magnetic fields, and the spectroscopic method, which measures frequencies in the spectral lines of a radiating atom which depend, among other things, on the mass of the nucleus and the strength of an externally applied magnetic field. We will discuss only the former class of measurements.

3.7.4.1 Thomson’s method

Rather than aiming at a numerical precision of the charge-to-mass ratio, e/m, Thomson was interested in gaining information regarding the nature of the particles in beams that could be deflected by electric and magnetic fields. Thomson carried out his experiments in 1897 before it became the fad to discriminate among the different theories that attempted to account for an increase in inertia with speed. These will be discussed in Sec. 5.4.1.

To Thomson’s surprise, he found that a great number of different sources possessed the same type of particle, which had been baptized the ‘electron’ by G. J. Stoney, who was FitzGerald’s uncle, in 1891. Although Thomson only made a sure identification of the electron in 1897, from his work on cathode rays, the term ‘electron’ was well amalgamated into the literature by then, having been adopted by Lorentz in 1892, and by Larmor in 1894.

Thomson’s apparatus, shown in Fig. 3.6, has electrons produced at the cathode C passing through slits A and B which then hit a fluorescent screen. Between plates D and E an electric field could be applied, and through a current flowing in two external coils, a magnetic field could be produced that would be perpendicular to the electric field. The coils were arranged so that the particles would experience both electric and magnetic fields simultaneously. The fields were oriented in such a way that, when one field acts alone, the beam would be deflected either upward or downward.


Fig. 3.6. Thomson’s apparatus for determining the ratio e/m for cathode rays.

The fields were so adjusted that the net deflection on the electrons was zero. In this way, Thomson’s pet annoyance with the lack of action and reaction in the Lorentz force could be avoided for there would result

$$\frac{u}{c} = \frac{E}{H}. \tag{3.7.15}$$

The beam of electrons was then deflected by the magnetic field in which mechanical equilibrium is achieved by balancing it with the centripetal force,

$$\frac{eHu}{c} = \frac{mu^2}{R},$$

where we will determine R subsequently [cf. following (3.7.24) below]. No matter what dependency the transverse mass has upon speed, it clearly selects out the centripetal component of the force in (3.7.13). Eliminating the velocity between these two equations results in

$$\frac{e}{m} = \frac{c^2E}{H^2R}. \tag{3.7.16}$$

Since all the quantities on the right-hand side of (3.7.16) could be measured experimentally, the ratio of e/m could therefore be determined.
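The two steps of this method, balancing the fields and then measuring the radius in the magnetic field alone, can be sketched as a round trip. Everything numerical below is an assumption chosen for illustration (Gaussian units, a beam at one-tenth the speed of light, a 10 gauss field, and a nominal charge-to-mass ratio); the function names are hypothetical. The code generates the field and radius that such a beam would give and then recovers the speed from (3.7.15) and e/m from (3.7.16).

```python
C = 2.99792458e10  # speed of light in cm/s (Gaussian units assumed)

def thomson_velocity(E, H, c=C):
    """Beam speed from the balance condition (3.7.15): u/c = E/H."""
    return c * E / H

def thomson_e_over_m(E, H, R, c=C):
    """Charge-to-mass ratio from (3.7.16): e/m = c^2 E / (H^2 R)."""
    return c**2 * E / (H**2 * R)

# Illustrative inputs only (assumed values, not data from the text).
e_over_m_assumed = 5.27e17      # esu per gram
u_assumed = 0.1 * C             # beam speed
H = 10.0                        # magnetic field, gauss

# Forward model: field that balances the beam, and radius in H alone.
E = H * u_assumed / C
R = u_assumed * C / (e_over_m_assumed * H)

u_recovered = thomson_velocity(E, H)
e_over_m_recovered = thomson_e_over_m(E, H, R)
print(u_recovered / C, R, e_over_m_recovered)
```

Only the centripetal balance enters the forward model, so whatever mass appears in the recovered ratio is the transverse one.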

Thomson later modified his apparatus so that after the cathode rays passed through the slit they entered a ‘Faraday chamber.’ A Faraday chamber is an insulated conductor which the electrons in the beam hit after passing through a small aperture in the chamber. If there are N electrons in the chamber, the total charge accumulated will be

$$Q = Ne.$$

A beam with the same number of electrons was then aimed at a small thermocouple of known heat capacity. The energy, W, transferred to the thermocouple could be measured by a rise in its temperature. This energy must be entirely kinetic so that

$$W = \tfrac{1}{2}Nmu^2.$$

The electron beam was then bent by a magnetic field, again selecting the centripetal force in (3.7.13), so that

$$Heu = \frac{mu^2}{\rho}.$$

Eliminating the velocities between these three equations results in

$$\frac{e}{m} = \frac{2W}{H^2\rho^2Q}.$$

The conclusion that “since all quantities on the right can be measured, the ratio e/m can be obtained” [Stranathan 42] is not accurate.

The first objection is that Q depends on e so that the ratio is e²/m, and the second concerns the validity of W as the correct expression for the kinetic energy. In fact, Thomson measured the velocity of the cathode rays to be one-tenth that of light, where relativistic effects cannot be neglected. But what Thomson did succeed in determining is that the ratio e/m was independent of the gas used, or the composition of the metal electrodes.
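The elimination leading to the heating formula can be retraced with a short forward-and-back calculation. The sketch below uses arbitrary illustrative numbers, follows the text in writing the magnetic balance without a factor of c, and uses hypothetical function names. A second helper estimates how far the Newtonian kinetic energy ½mu² departs from the relativistic value m₀c²(γ − 1) at a given β; the latter expression is the standard one and is an addition made here for comparison.

```python
import math

def e_over_m_from_heating(W, H, rho, Q):
    """e/m = 2 W / (H^2 rho^2 Q), from Q = N e, W = N m u^2 / 2, H e u = m u^2 / rho."""
    return 2.0 * W / (H**2 * rho**2 * Q)

def kinetic_energy_error(beta):
    """Relative shortfall of beta^2 / 2 with respect to gamma - 1."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return ((g - 1.0) - 0.5 * beta**2) / (g - 1.0)

# Illustrative inputs only (assumed values in arbitrary consistent units).
e, m, N, H, u = 1.0, 2.0, 1000, 3.0, 0.05

Q = N * e                   # charge collected in the Faraday chamber
W = 0.5 * N * m * u**2      # energy delivered to the thermocouple
rho = m * u / (H * e)       # radius of the circular arc in the field H

recovered = e_over_m_from_heating(W, H, rho, Q)
print(recovered, e / m, kinetic_energy_error(0.1))
```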

3.7.4.2 Kaufmann’s method

Kaufmann’s experiments show that the real constant mass of the electron is negligible compared with the apparent [electrodynamic] mass; it can be considered as zero, so that if it is mass which constitutes matter we can almost say that matter no longer exists. . . There are merely holes in the aether.

Poincaré

At the turn of the century, Kaufmann came onto the scene with a new method of determining the ratio. Kaufmann found β-rays expelled from radioactive substances to be faster and better suited than cathode rays. The electric and magnetic fields were now parallel so that the deflections they produced were perpendicular to one another. The deviation of the electron from its line of flight, on account of its deflection in an arc of a circle, is inversely proportional to the radius of the circle, provided the deflection is small.

Again, Kaufmann selects the centripetal force in (3.7.13), and arrives at similar equations to those of Thomson, except the radii of the circular arcs are different, i.e.

$$Ee = \frac{mu^2}{\rho_E}, \qquad \frac{eH}{c} = \frac{mu}{\rho_H}.$$

These radii can be transformed to Cartesian coordinates, $x = k/\rho_H$ and $y = k/\rho_E$, where k depends upon the specific nature of the geometry. Eliminating the velocity between the two, and replacing the radii by the Cartesian coordinates, Kaufmann obtains the equation of a parabola,

$$y = \frac{c^2E}{kH^2}\frac{m}{e}x^2. \tag{3.7.17}$$

The ratio,

$$\frac{y}{x} = \frac{c}{u}\frac{E}{H}, \tag{3.7.18}$$

is, therefore, a measure of the deviation of the Lorentz force from zero [cf. Eq. (3.7.15) above]. The position of the origin is that of the undeflected beam about to enter the fields. The full parabola, for negative values of x, can be obtained by reversing the direction of the magnetic field, as shown in Fig. 3.7.

Electrons of different speeds fall at different points along the parabola so that at any point Kaufmann obtained a specific value of e/m at a given velocity. Kaufmann was the first to obtain results that correlated the ratio e/m with definite electron velocities. Kaufmann found that e/m varied slightly with velocity, decreasing with increasing velocities.
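How each point of the trace yields its own value of e/m can be illustrated with a small model. The sketch below takes x as the magnetic deflection and y as the electric one, the assignment that reproduces (3.7.17) and (3.7.18); the units (c = 1), the field values, the geometric constant k, and the function names are all assumptions made for illustration. With a constant mass every speed returns the same e/m; with a transverse mass growing as γ the recovered ratio falls with speed.

```python
import math

def deflections(u, E, H, k, e_over_m0, c=1.0, relativistic=False):
    """Return (x, y): magnetic and electric deflections for an electron of speed u."""
    g = 1.0 / math.sqrt(1.0 - (u / c) ** 2) if relativistic else 1.0
    e_over_m = e_over_m0 / g            # transverse mass m = m0 * gamma if relativistic
    x = k * e_over_m * H / (c * u)      # x = k / rho_H, from eH/c = m u / rho_H
    y = k * e_over_m * E / u**2         # y = k / rho_E, from eE = m u^2 / rho_E
    return x, y

def e_over_m_from_point(x, y, E, H, k, c=1.0):
    """Invert the parabola (3.7.17) at a single point of the trace."""
    return c**2 * E * x**2 / (k * H**2 * y)

# Illustrative inputs only (assumed values in arbitrary consistent units).
E, H, k, e_over_m0 = 0.2, 1.0, 0.5, 3.0
speeds = [0.3, 0.5, 0.7, 0.9]

constant_mass = [e_over_m_from_point(*deflections(u, E, H, k, e_over_m0), E, H, k)
                 for u in speeds]
variable_mass = [e_over_m_from_point(
                     *deflections(u, E, H, k, e_over_m0, relativistic=True), E, H, k)
                 for u in speeds]
print(constant_mass)
print(variable_mass)
```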

Fig. 3.7. The points on the parabola refer to electrons deflected by parallel and anti-parallel (left side) fields.


Ritz [08] queried the invariancy of Kaufmann’s parabola, (3.7.17), for it will only have that form if m is independent of the speed. If the mass did increase with velocity, say, at some power greater than one, the parabola would tend to a straight line as that power increases. Instead of (3.7.18) Ritz triedᵇ

$$\frac{y}{x} = \frac{\sqrt{c^2 - u^2}}{u}\frac{E}{H}, \tag{3.7.19}$$

giving rise to a parabola,

$$y = \frac{mc^2E}{ekH^2}x^2 + k\frac{eE}{mc^2}, \tag{3.7.20}$$

shifted upward at the origin. The mass is now a constant, independent of the parameter u. The differences between observed and calculated values of y could, according to Ritz, come well within the limits of experimental errors. The surprising aspect occurs at the origin where x vanishes for u = c giving a zero magnetic deviation, while the electric deviation is proportional to the ratio of the electric to rest mass energies [cf. Fig. 3.7]. Ritz concluded that there is a large leeway for hypotheses, and Kaufmann’s experiments can be interpreted equally well by keeping the mass constant and modifying the Lorentz force. Moreover, all radiation effects of the accelerated electrons have been neglected.

The latter conclusion probably was attractive to Ritz on account of his modification of Ampère’s law to contain more terms in his Taylor expansion as the velocity of the electron increases, and leaving invariant the charges in that expression. The transverse mass enters only when we establish a mechanical equilibrium between the magnetic part of Lorentz’s force and the centripetal force acting on the electron. Ritz, therefore, was of the opinion that there were no general laws at great speeds, but the phenomena occurring at these speeds could be accounted for by an appropriate series expansion of his law of force in powers of 1/c.

Ritz was also correct in observing that it is only the transverse mass that “comes into play.”

ᵇ Actually, Ritz wrote 2c for c, but that does not change anything “in a theory which considers only relative velocities.”


Deflection experiments were designed to measure the transverse mass, where the acceleration is not the longitudinal acceleration, but, rather, the centripetal acceleration which occurs in the osculating plane perpendicular to m. The longitudinal force component is in the direction of m′. These conclusions hold for whatever model one may choose for the vectorial masses, m′ and m.

Let us now turn to a microscopic interpretation of the deflection methods to measure the ratio, e/m.

3.7.4.3 Microscopic interpretation of the deflection methods

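Let us resolve the Lorentz force,

$$\mathbf{F} = e\left(\mathbf{E} + \frac{\mathbf{u}}{c}\times\mathbf{H}\right), \tag{3.7.21}$$

into tangent, principal normal drawn to the center of curvature, and binormal erected to form a right-handed system of moving axes (ξ, η, ζ). Accomplishing this we get

$$\begin{aligned}
F_\xi &= eE_\xi,\\
F_\eta &= e\left(E_\eta - \frac{u}{c}H_\zeta\right),\\
F_\zeta &= e\left(E_\zeta + \frac{u}{c}H_\eta\right),
\end{aligned} \tag{3.7.22}$$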

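which we equate to the mechanical force components, (3.7.11), acting on the electron. We then obtain

$$\begin{aligned}
eE_\xi &= m'_\xi\dot{u} - m_\eta\frac{u^2}{\rho}, & H_\xi &= 0,\\
eE_\eta &= m'_\eta\dot{u} + m_\xi\frac{u^2}{\rho}, & \frac{e}{c}H_\eta &= m_\eta\frac{u}{\tau},\\
eE_\zeta &= m'_\zeta\dot{u}, & \frac{e}{c}H_\zeta &= m_\zeta\frac{u}{\tau}.
\end{aligned} \tag{3.7.23}$$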

We first note that the longitudinal components of the force do not enter into the mechanical conditions of equilibrium. Second, we observe that the magnetic field components stand in the same ratio as their masses,

$$\frac{H_\eta}{H_\zeta} = \frac{m_\eta}{m_\zeta}.$$

If we consider the motion along the tangential, ξ-axis, Thomson’s arrangement has the electric and magnetic fields normal to this axis. From (3.7.22) it follows that the electric field is pointing in the η-direction, and the magnetic field in the ζ-direction. Eliminating the speed parameter between the two leads to

$$\frac{e}{m_\zeta^2/m_\xi} = \frac{c^2E_\eta}{H_\zeta^2}\frac{\rho}{\tau^2}. \tag{3.7.24}$$

Comparing Thomson’s result, (3.7.16), with (3.7.24) we conclude that the transverse mass components, $m_\xi$ and $m_\zeta$, should be equal, and $R = \tau^2/\rho$. The distinction of these two radii of curvature will become evident in Kaufmann’s set-up.

In Kaufmann’s apparatus the electric and magnetic fields are parallel to one another. If the motion is in the ξ-direction, the only possibility is to eliminate the speed parameter between $E_\eta$ and $H_\eta$. We then get

$$\frac{1}{\rho} = \frac{c^2E_\eta}{H_\eta^2}\frac{m_\eta^2/m_\xi}{e}\frac{1}{\tau^2}. \tag{3.7.25}$$

In comparison with (3.7.17), we identify the x and y deflections with the torsion, 1/τ, and curvature, 1/ρ, respectively. Moreover, the two transverse mass components, $m_\eta$ and $m_\xi$, must be equal. This implies that the transverse mass vector, m, cannot be in the direction of the motion for, otherwise, $m_\xi = m$ and $m_\eta = m_\zeta = 0$. Ritz’s formula, (3.7.20), would have a non-vanishing value of the curvature at zero torsion, which is entirely reasonable for a symmetrical electron.
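The two eliminations can be retraced numerically. The sketch below assumes steady motion (no tangential acceleration) and takes the balance relations in the form eE_η = m_ξu²/ρ and (e/c)H = mu/τ for the relevant magnetic component, which is what equating (3.7.22) with (3.7.11) gives; all numerical values and function names are illustrative assumptions. It then checks (3.7.24) for Thomson’s crossed fields and (3.7.25) for Kaufmann’s parallel fields.

```python
def thomson_sides(e, c, m_xi, m_zeta, u, rho, tau):
    """Both sides of (3.7.24) for fields E_eta and H_zeta fixed by the balance relations."""
    E_eta = m_xi * u**2 / (rho * e)
    H_zeta = c * m_zeta * u / (tau * e)
    lhs = e / (m_zeta**2 / m_xi)
    rhs = c**2 * E_eta * rho / (H_zeta**2 * tau**2)
    return lhs, rhs

def kaufmann_sides(e, c, m_xi, m_eta, u, rho, tau):
    """Curvature 1/rho and the right-hand side of (3.7.25) for fields E_eta and H_eta."""
    E_eta = m_xi * u**2 / (rho * e)
    H_eta = c * m_eta * u / (tau * e)
    lhs = 1.0 / rho
    rhs = c**2 * E_eta / H_eta**2 * (m_eta**2 / m_xi) / e / tau**2
    return lhs, rhs

# Illustrative inputs only (assumed values in arbitrary consistent units).
thomson = thomson_sides(e=1.3, c=1.0, m_xi=2.0, m_zeta=1.7, u=0.4, rho=3.0, tau=5.0)
kaufmann = kaufmann_sides(e=1.3, c=1.0, m_xi=2.0, m_eta=1.1, u=0.4, rho=3.0, tau=5.0)
print(thomson, kaufmann)
```

In the Kaufmann case the curvature scales as the square of the torsion, which is the parabolic relation between the two deflections.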

In some ways, Ritz’s approach is similar to Boltzmann’s view of statistical mechanics where elegance should be left to tailors and cobblers. Granted its lack of uniqueness in that there are arbitrary constants that have to be chosen to get the right numerical results, Ritz’s explanation of the gravitational phenomenon of the advance of the perihelion of Mercury was extremely successful, as well as other gravitational phenomena falling outside the domain of Newtonian mechanics. We now turn to a field versus force approach to relativistic gravitational phenomena.


3.8 Modeling Gravitation

3.8.1 Maxwellian gravitation

The analogy between the inverse square laws of Newtonian and Coulomb forces led to many attempts to describe gravity in electromagnetic terms. The big problem, and the adjective ‘big’ cannot be over-exaggerated, was the perplexing problem of the localization of gravitational energy. For when one brings two like charges together, energy is necessary to overcome repulsion, and it is this energy that “comes from the field,” which implies that space has a positive field energy. However, masses always attract one another so that it takes energy to keep them apart. This would mean that the surrounding field had negative, instead of positive, energy. At first sight one would think that the same would apply to unlike charges which attract one another. Given a universe which is electrically neutral, these two charges would have to be separated from two other charges, and when all the electromagnetic interactions are accounted for, a negative electromagnetic field energy never arises.

The negative gravitational field energy so vexed Maxwell that he abandoned any hope of constructing a gravitational theory that would mimic his electromagnetic theory, and claimed that such a theory was beyond nineteenth century physics. In his own words, Maxwell admits: “Since it is impossible for me to understand how a medium could possess such properties, I cannot pursue research, in this vein, into the cause of gravitation.”

Specifically, what Maxwell was referring to was this: Call F the gravitational force. In a static regime, Maxwell’s equations reduce to:

∇ · F = −ρ,

∇ × F = 0,

where the negative sign denotes convergence [-divergence], and ρ is the mass density. Now, the gravitational energy will be:

$$W = -\int F^2\,dV + \text{const},$$

where the negative sign denotes attraction. Maxwell reasoned that since the energy is essentially positive, the integration constant must be so large that W is positive for whatever value the force F can assume. The energy will be a maximum when the force vanishes. For elastic systems, however, the energy is a minimum when the deformations vanish. Hence, the gravitating system will always be in a state of unstable equilibrium, and, therefore, gravity does not fall under the jurisdiction of field equations.

But this did not deter Heaviside, and, in 1893, he had a go at constructing a gravitational theory by modifying Maxwell’s equations of the electromagnetic field. And like Maxwell, Heaviside based his analogy on the presence of an aether, invoking Newton’s authority by referring to a letter written in 1693 to Richard Bentley, the then Bishop of Worcester, in which Newton writes:

That gravity should be innate, inherent and essential to Matter, so that one body may act upon another at a Distance thro’ a Vacuum, without the Mediation of anything else, by and through which their Action and Force may be conveyed from one to another, is to me so great an Absurdity, that I believe no Man who has in philosophical Matters a competent Faculty of thinking, can ever fall into it. Gravity must be caused by an Agent acting constantly according to certain Laws; but whether this Agent be material or immaterial, I have left to the consideration of my Readers.

Nothing had changed in the two hundred years since Newton wrote those words, and Heaviside [94] concluded that “It is incredible now as it was in Newton’s time that gravitative influence can be exerted without a medium; and, granting a medium, we may as well consider that it propagates in time, although immensely fast.”

Heaviside first states the analogy between electricity and gravitation: both are inverse square laws so that the localization of electromagnetic energy should be the same as gravitational energy. However, there is one big difference: bringing two like charges together requires energy, and it is this energy that ‘goes into the field.’ In contrast, two masses attract one another so that energy is needed to keep them apart. Thus, the field energy density is negative. Negative energy densities are worrisome today, but were even more so to nineteenth century physicists, and their abhorrence, as we have stated, led Maxwell to drop any attempt at grappling with gravitation along the same lines as his electromagnetic theory, but not his protégé, Heaviside!

Charge neutrality requires the two charges to be separated from two other opposite charges so there will be a positive energy field density over all. Not so with mass, for if all the mass in the universe were positive, we would expect to find a negative energy field density. Negative mass must still be relegated to science fiction, but should it exist, the analogy with electromagnetism would be one step nearer.

Heaviside reasoned that if e is the intensity of the gravitational force and when “matter ρ enters any region through its boundary, there is a simultaneous convergence of the gravitational force into that region proportional to ρ.” In other words, the difference between the two currents,

$$\rho u - \dot{e}/G,$$

must be divergence-free, where ρu is the flux of matter. The gravitational constant G is seen to play the role of the inverse permittivity, showing that the electric field has more to do with gravitation than the permeability of the magnetic field that is necessary to close the circuit. This can only be the case if the difference is proportional to the curl of a vector, say h,

$$\nabla\times h = \rho u - \dot{e}/G. \tag{3.8.1}$$

The divergence of (3.8.1) vanishes, and this must be equivalent to the continuity equation. It will be if the intensity of the force satisfies a Gaussian law,

$$\nabla\cdot e = -G\rho, \tag{3.8.2}$$

so that the divergence of (3.8.1) results in the continuity equation,

$$\nabla\cdot J + \dot{\rho} = 0, \tag{3.8.3}$$

where J = ρu is the flux.

According to Heaviside, if there is instantaneous action,

∇ × e = 0,

because “the gravitational force is exactly dependent on the configuration of matter,” meaning that it is given by the gradient of the Newton potential. Nothing moves, and all interactions occur by the hideous action at a distance. If, instead, e is propagated at a finite speed, v, then e must satisfy the wave equation,

$$v^2\nabla^2 e = \ddot{e}. \tag{3.8.4}$$


Then, since

∇2 = ∇(∇ · ) − ∇ × (∇ × ), (3.8.5)

it follows that

$$-v^2\,\nabla\times(\nabla\times e) = \ddot{e},$$

in space free of matter.

But we also have by (3.8.1) that

$$\nabla\times h = -\dot{e}/G, \tag{3.8.6}$$

in the absence of matter. Differentiating (3.8.6) with respect to time gives

$$\ddot{e} = -G\,\nabla\times\dot{h},$$

and by (3.8.4) this becomes

$$(v^2/G)\,\nabla\times e = \dot{h}. \tag{3.8.7}$$

Then calling µ = G/v², which is analogous to the magnetic permeability, (3.8.7) can be written as

$$\nabla\times e = \mu\dot{h}. \tag{3.8.8}$$

Heaviside concludes that the second circuital law (3.8.8) is a consequence of a finite speed of propagation, which could have been inferred straight-off from the analogy with electromagnetism.
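As an illustrative check, not part of Heaviside's own argument, the two circuital laws can be tried on a transverse plane wave. The sketch below assumes that (3.8.6) and (3.8.8) carry time derivatives, ∇ × h = −(∂e/∂t)/G and ∇ × e = µ ∂h/∂t with µ = G/v², and takes the trial fields e = (0, cos(kx − ωt), 0) and h = (0, 0, h₀ cos(kx − ωt)). The function name, the sample point, and the numerical values of G and v are assumptions made only for this example. Both laws hold together only when ω = vk.

```python
import math

G = 6.674e-11   # gravitational constant in SI units (assumed value)
V = 3.0e8       # assumed propagation speed; Heaviside leaves v unspecified
MU = G / V**2   # Heaviside's gravitational analogue of the permeability


def circuital_residuals(k, omega, x=0.3, t=0.0):
    """Relative mismatch of the two circuital laws for a trial plane wave.

    Trial fields: e = (0, cos(kx - wt), 0), h = (0, 0, h0*cos(kx - wt)).
    h0 is fixed by the first law, curl h = -(de/dt)/G, so its residual is
    zero by construction; the second law, curl e = mu*(dh/dt), then holds
    only if omega = V*k.
    """
    h0 = -omega / (G * k)
    s = math.sin(k * x - omega * t)
    # y-component of the first law: -(dh_z/dx) against -(de_y/dt)/G
    lhs1, rhs1 = h0 * k * s, -omega * s / G
    # z-component of the second law: de_y/dx against mu*(dh_z/dt)
    lhs2, rhs2 = -k * s, MU * h0 * omega * s
    return abs(lhs1 - rhs1) / abs(rhs1), abs(lhs2 - rhs2) / abs(lhs2)


on_shell = circuital_residuals(k=1.0, omega=V * 1.0)
off_shell = circuital_residuals(k=1.0, omega=2.0 * V * 1.0)
print(on_shell, off_shell)
```

With ω = vk both residuals vanish to rounding error, whereas doubling ω leaves the first law satisfied and breaks the second, which is the sense in which the second circuital law encodes the finite speed v.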

The subsidiary condition, (3.8.3), and

$$\dot{J} = -v^2\nabla\rho, \tag{3.8.9}$$

imply that the mass density, ρ, propagates at speed v, since ρ too satisfies a wave equation,

$$\ddot{\rho} = v^2\nabla^2\rho.$$

If we introduce b, the induction field, according to b = µh, the circuital laws become:

$$v^2\,\nabla\times b = \dot{e} - GJ, \qquad \nabla\times e = -\dot{b}. \tag{3.8.10}$$


The Maxwell gravitational equations (3.8.10) are equivalent to the wave equation, (3.8.4).

Now, the point is this:

The wave equation (3.8.4) is valid even if b = 0, since

$$\ddot{e} = v^2\nabla(\nabla\cdot e) = v^2\nabla^2 e, \tag{3.8.11}$$

on account of the fact that ∇ × e = 0. This means that e is the gradient of some scalar potential, so that Heaviside was wrong to conclude that the vanishing curl of e means ‘instantaneous action.’ Since polar e is propagated at speed v, without magnetic forces, the electric waves are longitudinal.

Moreover, barring rigidity, a generalized displacement g can be written as

$$\rho\,\ddot{g} = \lambda\nabla(\nabla\cdot g) - \nu\,\nabla\times(\nabla\times g),$$

where λ and ν are the elastic constants related to compression and rotation. With either constant equal to zero we still obtain a wave equation. Heaviside identifies the velocity $\dot{g}$ with e. In the case ν = 0, the vibrations are longitudinal and propagate at a velocity v = √(λ/ρ), whereas if λ = 0, the vibrations are transversal and propagate at a speed v = √(ν/ρ). In the latter case, we need two circuital laws with a medium ‘waving’ in directions normal to the direction of wave propagation. In the former case of longitudinal propagation, by contrast, all that is necessary is the ‘back-and-forth’ compression and rarefaction motions of the medium. Finally, in the case λ = ν, the vector identity (3.8.5) shows that a wave equation will result with both longitudinal and transverse wave motion. However, polarization occurs only with transversal waves, as Young and Fresnel showed back in 1817, but what is polarized?

Nothing is new under the sun, and Carstoiu [69] rediscovered Heaviside’s gravitational equations over three-quarters of a century later. In his nomenclature, h is the gravitational ‘vortex,’ J the gravitational current, and e the (vector) gravitational field. If Carstoiu’s gravitational vortex, h, has any role, it must be related to the curl of the velocity field. The reason why Heaviside introduced a gravitational vortex stemmed from the fact that the difference between the conduction current ρu and the displacement current $\varepsilon\dot{e}$ should be divergenceless, i.e. the divergence of their difference should give the continuity equation (3.8.3) when Gauss’s law is introduced.

Without an h field, there is no reason why Einstein should have put electromagnetism and gravitation on the same footing, and propagating with a common speed, c.

Einstein does not make any distinction between the velocity of propagation of electromagnetic and gravity waves, although the latter have yet to be observed. Thus, the Einstein equations should be reduced to the circuital equations of Maxwell, indicating that they are transverse waves and polarizable. It is well-known that Einstein’s equations,

$$R_{\alpha\beta} - \frac{1}{2}g_{\alpha\beta}R = -\frac{8\pi G}{c^4}\,T_{\alpha\beta},$$

where $R_{\alpha\beta}$ is the Ricci tensor, $g_{\alpha\beta}$ are the components of the metric tensor, R is the scalar curvature, and $T_{\alpha\beta}$ is the energy–momentum tensor, can be linearized to read

$$c^2\nabla^2\phi_{\alpha\beta} - \frac{\partial^2}{\partial t^2}\phi_{\alpha\beta} = -16\pi G\,T_{\alpha\beta}. \tag{3.8.12}$$

Equations (3.8.12) predict that the velocity of propagation of the gravitational potential $\phi_{\alpha\beta}$ is the same as that of light. To zeroth-order in v/c, $T_{00} = \rho c^2$, and to first-order in v/c, $T_{\alpha 0} = -\rho c^2(v_\alpha/c)$, where $v_\alpha$ is the particle velocity, and v that of the ‘aether.’ᶜ Thus, $\phi_{00}$ corresponds to the scalar, and $\phi_{\alpha 0}$ to the vector, potentials of electromagnetism and can be expressed in terms of their respective density, ρ, and current, $j_\alpha = \rho v_\alpha$, as

$$\frac{c^2}{4}\,\phi_{00}(P,t) = \frac{1}{4\pi\varepsilon}\int_V \frac{\rho(P',t')}{r}\,dV,$$

and

$$c\,\phi_{\alpha 0}(P,t) = -\mu\int_V \frac{j_\alpha(P',t')}{r}\,dV,$$

where r is the distance between the point P and the location P′ of the volume element, dV, and t′ = t − r/c, meaning that the potentials are retarded. In analogy with electromagnetism, the permittivity is defined as ε = 1/4πG, and the gravitational permeability is µ = 4πG/c² [Forward 61], such that their product is the inverse of the square of the velocity of light, c.

ᶜ For a criticism of the energy–momentum tensor see Sec. 6.7.1.
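As a small numerical illustration of the gravitational permittivity and permeability just defined, the sketch below evaluates ε = 1/4πG and µ = 4πG/c² in SI units and recovers the propagation speed from their product. The function name and the numerical value used for G are assumptions made for the example.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2 (assumed value)
C = 299_792_458.0    # speed of light, m/s


def gravitational_constants(g=G, c=C):
    """Return (epsilon, mu) as defined in the text: 1/(4 pi G), 4 pi G/c^2."""
    epsilon = 1.0 / (4.0 * math.pi * g)
    mu = 4.0 * math.pi * g / c**2
    return epsilon, mu


eps_g, mu_g = gravitational_constants()
wave_speed = 1.0 / math.sqrt(eps_g * mu_g)
print(eps_g, mu_g, wave_speed)
```

The recovered speed equals c by construction, which makes explicit that the common speed of propagation is put in through the choice of µ and is not derived.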

But this is much more committal than Heaviside ever wanted to be. By multiplying (3.8.8) through by h, and subtracting e times (3.8.1), he obtains what he refers to as the ‘equation of activity,’

$$-\nabla\cdot(e\times h) = -\frac{1}{2}\frac{\partial}{\partial t}\left(\mu h^2 + \frac{1}{G}e^2\right) + F\cdot u, \tag{3.8.13}$$

which we would refer to today as a power equation. The first two terms are the rates of decrease of the rotational and kinetic energies, while the last term represents the power of an impressed field where F = ρe is the force whose intensity is e. The vector e × h (“found by Poynting and myself” [Heaviside]) describes the flux of gravitational energy, just as it describes the flux of electromagnetic energy.

However, Heaviside quickly realizes that the gravitational flux of energy is pointing in the wrong direction! Or, at least, it is pointing in the direction opposite to that in electromagnetism. For draw a sphere around a charged particle whose axis of spin coincides with the direction of motion. The positive pole is placed in the forward direction and the negative pole in the back. The magnetic intensity rotates with the lines of latitude in the direction of rotation, while the electric intensity points radially outward. Then, the flux of electromagnetic energy corresponds to lines of longitude from the negative to the positive pole. The only change in the gravitational case is the direction of the electric intensity; it points radially inward! Thus, there is a reversal of direction of the gravitational flux, according to Heaviside, given that “all matter being alike and attractive.” If nothing is being ‘radiated’ away, why then should there be a depletion of energy in (3.8.13)?

Carstoiu’s gravitational vortex, which is supposedly analogous to the magnetic field, is parallel to the gravitational force, which is supposed to play the role of the electric field. This is based on the analogy that there is a magnetic charge, $Q_m$, which is the source of a scalar potential in an analogous way ordinary charge is the source of the scalar potential. If $Q_m$ existed, there would be no need to introduce a vector potential whose curl is the magnetic force. The two fields differ by a mere 1/c. This fact destroys his circuital equations, and his gravitational vortex is not divergence free! Since the gravitational and vortex potentials are parallel, Poynting’s vector vanishes, and so too, the energy flow.

Moreover, u does not give rise to a source term in the energy balance equation (3.8.13). It can either be solenoidal, or what Heaviside calls ‘circuital,’ or it is irrotational, or ‘polar’ in Heaviside’s terminology. In the former case it adds a contribution to the rate of increase in stored energy, while in the latter case it adds contributions to both the energy flux and energy rates of change. Consider, first, the circuital case. Then we can set u = a, where a plays the role of the vector potential intensity that satisfies $e + \dot{a}/v = 0$. Inserting these two terms into the last term in (3.8.13) gives the additional contribution of $\tfrac{1}{2}(\rho/v)\,da^2/dt$ to the rate of storage of energy.

Now take u to be polar, i.e. u = −∇(∇ · a). Introducing this into (3.8.13) and noting that

$$-\frac{\rho}{v}\,\dot{a}\cdot\nabla(\nabla\cdot a) = -\frac{\rho}{v}\,\nabla\cdot(\dot{a}\,\nabla\cdot a) + \frac{1}{2}\frac{\rho}{v}\frac{\partial}{\partial t}(\nabla\cdot a)^2,$$

shows that the negative of the first term on the right-side contributes to the flux of energy, while the second term contributes to the rate of change of stored energy. In other words, −∇ · a plays the role of a hydrostatic pressure, and the term which contributes to the energy flux is analogous to a term which would contribute to the energy flux of sound waves. The quadratic term in the energy, on the other hand, which is proportional to (∇ · a)², is proportional to the energy of compression.

The two cases correspond to transverse and longitudinal waves. The former can give rise to polarization, the latter not. So, Heaviside made no commitments to whether the waves he was talking about were transverse or longitudinal, or a combination of both. In addition, there is no statement that the gravitational waves travel at the speed of light, just that they have a finite speed of propagation, v.

3.8.2 Ritzian gravitation

One of the most extraordinary physicists of the early twentieth century was a young Swiss scientist, Walther Ritz. O’Rahilly [38] sums up his career most succinctly:

And when, in spite of his acknowledged researches in spectroscopy and elasticity, the Swiss physicist, Walther Ritz, expressed heterodox views on electromagnetics in 1908, shortly before his death, his ideas were received with a chill of silence and have ever since been systematically boycotted. He was out of tune with the music, and out of step with the crowd.

In this section we are going to show that young Ritz had created a theory of gravity that not only explained the advance of the perihelia of planets, but also could have accounted for the deflection of light by a massive body and other tests put to Einstein’s general relativity. The only difference is that Ritz preceded Einstein by a good seven years!

Since the middle of the nineteenth century, the anomalous advance of the planetary perihelia had been a thorn in the side of the Newtonian theory of gravitation. As we have seen in the last section, the analogy between electricity and gravitation posed too much of an opportunity to be bypassed easily. In 1864 the German astronomer, Seegers, proposed to treat the advance of the planets as a gravitational force in the same way that Weber’s force law holds for electrical attraction between particles. The advance of the perihelion would thus be due to the motion of the other planets, and it would be necessary to take into account their relative velocities, scaled down by the inverse square of the velocity of light, and their accelerations. That is, they were entirely cognizant of the fact that they were looking for second-order effects. By applying Weber’s law, Scheibner found a secular variation of 6.73″, while Tisserand found the value of 6.28″ for the secular variation of Mercury in 1872.

In 1890, Lévy noticed that a combination of the Weber, (4.1.5), and Riemann, (4.1.6), potentials could serve as a potential for gravitation.ᵈ Furthermore, Ritz observed that his expression for the force, (4.1.7), could be written as a linear combination of Weber’s (W) and Riemann’s (R) forces,

$$F_x = \frac{1}{2}(1-\lambda)F^W_x + \frac{1}{2}(1+\lambda)F^R_x, \tag{3.8.14}$$

in the absence of accelerations, in the x-direction, where λ is an undetermined parameter of mixing. Its value will be chosen so that theory corresponds to observation. Ritz’s force (3.8.14) can be derived from the usual Euler–Lagrange equations, (4.1.3), with

$$L = \frac{1}{2}(1-\lambda)L_W + \frac{1}{2}(1+\lambda)L_R, \tag{3.8.15}$$

as the Lagrangian.

ᵈ The reader is asked to accept the formulas on faith; they will be developed fully in the next chapter.

Since gravity is a much weaker force than electricity, its effects will be much harder to determine. At least two cases were known where gravity acted in a way which did not comply with Newtonian mechanics: the deflection of light, and the excess rate of turning of the long axis of Mercury’s orbit. The former constituted a prediction made by Johann Söldner [04] in 1801, who calculated that a star viewed near the sun would be shifted by 0.85″. The latter was observed by the French astronomer, Le Verrier, who noticed that the major axis of the elliptical orbit was turning slightly faster than expected from perturbations exerted by the sun and by neighboring planets, as shown in Fig. 3.8.

It was realized that a correction to the Newtonian gravitational force,

$$\frac{Mm}{r^2} + \frac{C}{r^4},$$

was needed to explain the excess rate of turning. The unknown parameter, C, had to be determined by the fact that the excess rate of turning amounted to some 41 arc seconds per century. This was its value at the beginning of the twentieth century; more recent data fix it at 43.1″ per century.

Fig. 3.8. Elliptical orbit of Mercury showing the excess rotation of the major axis.


The determination of C is usually accredited to Einstein’s general theory of relativity. The first time we hear of Einstein’s attempt to determine the excess turning of Mercury is in 1912 when he enlisted help from his friend Besso. He did not publish his result until the end of 1915 when he revised his general theory.

Yet, in 1908, Ritz published his calculation of the advance of perihelion of Mercury, long before the world heard of Einstein’s general relativity. But, sadly, the world did not hear of Ritz’s achievement. In addition, we shall show that he had in his grasp all the tests that are usually attributed to the confirmation of Einstein’s general relativity, except for the prediction of the gravitational shift of spectral lines.

Unlike his earlier work, in which he cites Laplace for predicting that the propagation of gravity is some 10⁷ times faster than the speed of light, which would lead to first-order corrections in the relative velocity, Ritz now assumes that gravity propagates at the speed of light, which would introduce only second-order corrections.

Ritz’s starting point is his second-order force equation in the absence of acceleration terms. He sets the components of the force, $F_x$ and $F_y$, equal to $m\ddot{x}$ and $m\ddot{y}$, respectively. He replaces the product of the charges, ee′, by GMm, where M is the central mass and m the peripheral mass, which cancels out, and obtains

$$\ddot{x} = -\frac{GMx}{r^3}\left\{1 + \frac{(3-\lambda)}{4c^2}(\dot{x}^2+\dot{y}^2) - \frac{3(1-\lambda)}{4}\frac{\dot{r}^2}{c^2}\right\} + \frac{GM(1+\lambda)}{2c^2r^2}\,\dot{x}\dot{r},$$

$$\ddot{y} = -\frac{GMy}{r^3}\left\{1 + \frac{(3-\lambda)}{4c^2}(\dot{x}^2+\dot{y}^2) - \frac{3(1-\lambda)}{4}\frac{\dot{r}^2}{c^2}\right\} + \frac{GM(1+\lambda)}{2c^2r^2}\,\dot{y}\dot{r},$$

where λ is an arbitrary constant. We have made the identifications $u_r = \dot{r}$, $u_x = \dot{x}$, $u^2 = \dot{x}^2+\dot{y}^2$ in his force law, (4.1.7). Multiplying the first equation by y and subtracting x times the second equation from it, Ritz gets

$$\frac{d}{dt}\ell_z = \frac{GM(1+\lambda)}{2c^2r^2}\,\dot{r}\,\ell_z,$$

where $\ell_z = x\dot{y} - y\dot{x} = r^2\dot{\varphi}$ is the z-component of the angular momentum (relative to unit mass) of the planet, and (r, ϕ) are polar coordinates.


Since the force equations contain a component parallel to the velocity, the areal velocity is not conserved. This is also the origin of the longitudinal component of the mass in special relativity. The lack of conservation of the angular momentum is also found in general relativity, as we shall see in Sec. 7.4.

Integrating, Ritz gets

$$r^2\dot{\varphi} = \ell_0\,e^{-GM(1+\lambda)/2c^2r} \approx \ell_0\left(1 - \frac{GM(1+\lambda)}{2c^2r}\right), \tag{3.8.16}$$

where he identifies the constant of integration, $\ell_0$, as the conserved angular momentum. Conservation of angular momentum implies $r^2\dot{\varphi} = \ell_0 = \text{const}$, and not (3.8.16). The lack of conservation of momentum is the consequence of choosing a particular hyperbolic stereographic inner product, as we shall see in Secs. 7.4 and 9.6. In any case, the lack of conservation of momentum, (3.8.16), should have disturbed Ritz, but he remained silent.

Now what Ritz needs is the equation of the trajectory, which he must subsequently solve. He removes the quantities

$$\frac{GM}{2c^2r}(1+\lambda)\ddot{x} + \frac{GMx}{2c^2r^2}(1-\lambda)\left(\frac{x}{r}\ddot{x} + \frac{y}{r}\ddot{y}\right),$$

from the first of his force equations, and

$$\frac{GM}{2c^2r}(1+\lambda)\ddot{y} + \frac{GMy}{2c^2r^2}(1-\lambda)\left(\frac{x}{r}\ddot{x} + \frac{y}{r}\ddot{y}\right),$$

from the second, and replaces them by the same expressions with the exception that the accelerations $\ddot{x}$ and $\ddot{y}$ are replaced by $-GMx/r^3$ and $-GMy/r^3$, respectively. He claims that in the final analysis these changes are equivalent to introducing terms of fourth-order, and “completely negligible.”

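Then, multiplying the first of the modified equations by $\dot{x}$ and the second by $\dot{y}$, adding and integrating, he gets his equation of energy conservation,

$$\frac{1}{2}u^2 - \frac{1}{2}(1+\lambda)\frac{GM}{r}\left(1 - \frac{u^2}{2c^2}\right) - \frac{1}{2}(1-\lambda)\frac{GM}{r}\left(1 - \frac{\dot{r}^2}{2c^2}\right) - \frac{G^2M^2}{2c^2r^2} = W = \text{const}. \tag{3.8.17}$$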


Next Ritz transforms (3.8.17) to polar coordinates, where $r^2 = x^2+y^2$ and $u^2 = \dot{x}^2+\dot{y}^2 = \dot{r}^2 + r^2\dot{\varphi}^2$. Employing (3.8.16) Ritz arrives at:

$$\frac{\ell_0^2}{r^4}\left(\frac{dr}{d\varphi}\right)^2 = \left[1 + \frac{1}{4}(\lambda-1)\frac{\alpha}{r}\right]\left\{2W - \frac{\ell_0^2}{r^2} + \frac{2GM}{r} + \frac{1}{2}(\lambda+1)W\frac{\alpha}{r} + \frac{1}{4}(\lambda+2)\left(\frac{\alpha c}{r}\right)^2\right\}, \tag{3.8.18}$$

where α := 2GM/c² is the so-called Schwarzschild radius. Schwarzschild would not come across it till 1916, so that this radius should bear the name of Ritz; it is another example of Stigler’s law of eponymy.

3.8.2.1 Mass from the gravitational field

Consider the energy conservation equation (3.8.17) in a state at rest,

$$W = -\frac{GM}{r}\left(1 + \frac{1}{4}\frac{\alpha}{r}\right). \tag{3.8.19}$$

As a result of the gravitational field of the sun, a planet orbiting about it will feel a greater attraction than that of the solar mass M. We may reason from the analogy with electrostatics [Sexl & Sexl 79] where the electric field is the gradient of e/4πr, and the electrostatic energy is $W_e = \tfrac{1}{2}\varepsilon E^2$. Heaviside tells us to replace the dielectric constant, ε, by 1/4πG, and substituting GM for the charge e gives the energy of the static gravitational field as $W_g = -GM^2/8\pi r^4$. The negative sign denotes attraction, whereas the plus sign in the electrostatic energy indicates that like charges repel. This is what Maxwell found so unattractive about the gravitational energy.

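The gravitational energy $W_g$ will contribute to the solar mass by an amount $W_g/c^2$ so that its total mass will be

$$M_{\text{tot}} = M - \frac{1}{c^2}\int W_g\,dV = M + \frac{4\pi}{8\pi c^2}\int_r^\infty \frac{GM^2}{r^4}\,r^2\,dr = M\left(1 + \frac{1}{4}\frac{\alpha}{r}\right).$$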

Remarkable as it is, this is just what Ritz could have predicted from (3.8.19)!
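The field-energy integral can be checked numerically. The following sketch assumes the energy density $W_g = -GM^2/8\pi r^4$ quoted above, integrates it over the volume outside a radius r with a trapezoidal rule, and compares the resulting mass correction with the closed form Mα/4r. The solar values, the function name, and the choice of the solar radius as lower limit are assumptions made for the example.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (assumed value)
C = 2.998e8        # speed of light, m/s
M_SUN = 1.989e30   # solar mass, kg (assumed value)
R_SUN = 6.957e8    # solar radius, m (assumed value)


def field_mass_correction(mass, r, n=4000, y_max=40.0):
    """Return -(1/c^2) * integral of W_g over the volume outside radius r.

    With W_g = -G M^2 / (8 pi r'^4) and the substitution r' = r*exp(y),
    the radial integral becomes (G M^2 / (2 c^2 r)) * int_0^inf exp(-y) dy,
    evaluated here with the trapezoidal rule on [0, y_max].
    """
    h = y_max / n
    total = 0.5 * (1.0 + math.exp(-y_max))
    for i in range(1, n):
        total += math.exp(-i * h)
    total *= h
    return G * mass**2 / (2.0 * C**2 * r) * total


alpha = 2.0 * G * M_SUN / C**2
numeric = field_mass_correction(M_SUN, R_SUN)
closed_form = M_SUN * alpha / (4.0 * R_SUN)
print(numeric / closed_form)
```

The ratio comes out equal to one within the accuracy of the quadrature, so the factor α/4r in (3.8.19) and the field-energy estimate agree term for term.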


3.8.2.2 Advance of the perihelion

Now, this is what Ritz did do. Ritz took the positive square root of (3.8.18), and rearranged it to read

$$\varphi - \varphi_0 = \int \frac{\left(1 - \frac{1}{8}\alpha(\lambda-1)\rho\right)d\rho}{\sqrt{A + B\rho - C^2\rho^2}},$$

where ρ = 1/r,

$$A = 2W/\ell_0^2, \qquad B = \left(2GM + (\lambda+1)\alpha W/2\right)/\ell_0^2, \qquad C^2 = 1 - \alpha^2c^2(\lambda+2)/4\ell_0^2,$$

and $\varphi_0$ is a constant of integration. Performing the integration Ritz found

$$\varphi - \varphi_0 = \left[1 + \left(\frac{\alpha c}{4\ell_0}\right)^2(\lambda+5)\right]\arcsin\frac{2C^2\rho - B}{\sqrt{B^2+4AC^2}} + \frac{\alpha(\lambda-1)}{8}\sqrt{-C^2\rho^2 + B\rho + A}.$$

Hence, the difference in ϕ between two successive perihelia is

$$2\pi\left[1 + \left(\frac{\alpha c}{4\ell_0}\right)^2(\lambda+5)\right].$$

This differs from 2π, which is what we would get if there were no advance of the perihelion. The correction,

$$\Delta\varphi = \frac{\pi}{8}\left(\frac{\alpha c}{\ell_0}\right)^2(\lambda+5), \tag{3.8.20}$$

is a very small quantity that will make the elliptic orbit turn in its plane. Introducing the relation,

$$\frac{GM}{\ell_0^2} = \frac{1}{l}, \tag{3.8.21}$$

known from the elementary theory of elliptical motion [Born 60], where l is the semi-latus rectum, a(1 − ε²), a and ε being the semi-major axis and eccentricity of the ellipse, respectively, into (3.8.20) results in

$$\Delta\varphi = \frac{\pi\alpha(\lambda+5)}{4l}. \tag{3.8.22}$$


As we will see in Sec. 7.6.3, general relativity predicts a shift of Δϕ = 3πα/a. In order for (3.8.22) to produce such a shift, we must set λ ≈ 7. Now let us continue to see what Ritz missed.
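To see what (3.8.22) gives numerically, the sketch below evaluates the advance per revolution for Mercury with λ = 7, for which the formula reduces to 3πα/l, and converts it to arc seconds per century. The orbital elements, solar mass, and function name are assumed values chosen for the illustration and do not come from the text.

```python
import math

G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2 (assumed value)
C = 2.998e8            # speed of light, m/s
M_SUN = 1.989e30       # solar mass, kg (assumed value)
A_MERCURY = 5.791e10   # semi-major axis of Mercury, m (assumed value)
ECC_MERCURY = 0.2056   # eccentricity of Mercury's orbit (assumed value)
PERIOD_DAYS = 87.969   # orbital period of Mercury, days (assumed value)


def ritz_advance_per_orbit(lam, alpha, semi_latus):
    """Perihelion advance per revolution from (3.8.22), in radians."""
    return math.pi * alpha * (lam + 5.0) / (4.0 * semi_latus)


alpha = 2.0 * G * M_SUN / C**2
semi_latus = A_MERCURY * (1.0 - ECC_MERCURY**2)
per_orbit = ritz_advance_per_orbit(7.0, alpha, semi_latus)
orbits_per_century = 36525.0 / PERIOD_DAYS
arcsec_per_century = math.degrees(per_orbit * orbits_per_century) * 3600.0
print(round(arcsec_per_century, 1))
```

With these inputs the result is about 43 arc seconds per century, the figure quoted above for recent data; any other choice of λ rescales it by (λ + 5)/12.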

3.8.2.3 Deflection of light

The estimate of the advance of the perihelion was the only prediction that Ritz could make. His equation for the trajectory of the motion, (3.8.18), contains a wealth of information that sadly Ritz’s short life did not allow him to uncover. If we retain only linear terms in G, the equation for the trajectory becomes

$$\frac{\ell_0^2}{r^4}\left(\frac{dr}{d\varphi}\right)^2 = 2W - \frac{\ell_0^2}{r^2} + \frac{2GM}{r}\left(1 + \frac{\lambda}{2}\frac{W}{c^2}\right) - \frac{(\lambda-1)}{4}\frac{\alpha\ell_0^2}{r^3}. \tag{3.8.23}$$

For λ = 0, (3.8.23) has a potential,

$$\Phi_S = -\frac{GM}{r} + \frac{\ell_0^2}{2r^2} - \frac{\alpha\ell_0^2}{8r^3}, \tag{3.8.24}$$

which, apart from a trivial numerical factor in the last term, is often referred to as the Schwarzschild potential, because it is this potential which results from the general-relativistic problem of a gravitating body that Schwarzschild solved [cf. Secs. 9.10.3 and 9.10.4]. Just as Coulomb’s law had to be modified in the presence of charges in motion, so too does Newton’s law of gravitation in the presence of orbiting bodies. It is the last term in (3.8.24), which is a coupling between Newton’s radial attraction and centrifugal repulsion, that is responsible for the advance of the perihelion and the deflection of light.

In fact, the deflection of light is due entirely to the presence of the coupling term in (3.8.24) in the absence of the gravitational attraction. We therefore set λ = −2, and W = c², thereby obtaining the equation for the orbit of a photon about a massive body as:

$$\left(\frac{d\rho}{d\varphi}\right)^2 = \Delta^{-2} - \rho^2 + \frac{3}{4}\alpha\rho^3. \tag{3.8.25}$$

We have again set r = 1/ρ, and $\Delta = \ell_0/\sqrt{2}\,c$ has the dimensions of a collision parameter. It is a function of the conserved angular momentum, $\ell_0$, and can take on any value we assign to it, so it does not appear as a distance of closest approach. But general relativity gets the same result. Curiously, the conserved angular momentum is divided by Weber’s constant.

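By a change of variable, $\sigma = \Delta\left(1 - \frac{3}{4}\alpha\rho\right)^{1/2}\rho$, we can write the positive square root of (3.8.25) as $d\rho/d\varphi = \sqrt{1-\sigma^2}/\Delta$, which upon integration gives, to first order in α:

$$\varphi = \int_0^\sigma \frac{d\sigma\left(1 + \frac{3}{4}\alpha\sigma/\Delta\right)}{\sqrt{1-\sigma^2}} = \arcsin\sigma - \frac{3}{4}\frac{\alpha}{\Delta}\sqrt{1-\sigma^2} + \frac{3}{4}\frac{\alpha}{\Delta}.$$

At maximum ρ, or the distance of closest approach, dρ/dϕ = 0, σ = 1. The angle at which this occurs is:

$$\varphi_m = \frac{\pi}{2} + \frac{3}{4}\frac{\alpha}{\Delta}.$$

The total deflection will, therefore, be:

$$2\left(\varphi_m - \frac{\pi}{2}\right) = \frac{3}{2}\frac{\alpha}{\Delta}. \tag{3.8.26}$$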

The value found from general relativity is 2α/Δ, where Δ is taken as the radius of the Sun. But is it the case?
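For orientation, the magnitudes involved can be put into numbers. The sketch below takes Δ equal to the solar radius, which is exactly the identification questioned in this section, and compares three coefficients of α/Δ: 1 (the ratio of Schwarzschild radius to solar radius), 3/2 (assuming the total deflection in (3.8.26) reads 3α/2Δ, which is what integrating (3.8.25) to first order in α gives), and 2 (general relativity). The solar values and the function name are assumptions made for the example.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (assumed value)
C = 2.998e8        # speed of light, m/s
M_SUN = 1.989e30   # solar mass, kg (assumed value)
R_SUN = 6.957e8    # solar radius, m (assumed value)
RAD_TO_ARCSEC = math.degrees(1.0) * 3600.0


def deflection_arcsec(coefficient, alpha, delta):
    """Deflection angle coefficient*alpha/delta, converted to arc seconds."""
    return coefficient * alpha / delta * RAD_TO_ARCSEC


alpha = 2.0 * G * M_SUN / C**2
soldner = deflection_arcsec(1.0, alpha, R_SUN)       # alpha/Delta
ritz_type = deflection_arcsec(1.5, alpha, R_SUN)     # 3*alpha/(2*Delta)
general_rel = deflection_arcsec(2.0, alpha, R_SUN)   # 2*alpha/Delta
print(round(soldner, 2), round(ritz_type, 2), round(general_rel, 2))
```

The first value lies close to the 0.85″ attributed to Söldner and the last to the 1.75″ of general relativity, with the λ = −2 trajectory falling in between; all three presuppose that Δ may be set equal to the solar radius.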

The only place where the ratio $\ell_0^2/GM$ appears is in the expression for the semi-latus rectum of an orbital ellipse [cf. (3.8.21)]. It has absolutely nothing to do with the problem of the deflection of a light ray by a massive body. The conserved angular momentum has nothing whatsoever to do with the radius of the Sun! Just because it has the right dimension does not mean that it has the right physical interpretation, and, yet, this is what general relativity asserts [Møller 52].ᵉ This was not the way Einstein originally derived his expression for the deflection, as we shall now see.

The ratio of the Schwarzschild radius to the solar radius was predicted to be the magnitude of the deflection of light by the Sun by Söldner [04] back in 1801. Surely, Ritz could not have been aware of that prediction. And excluding Söldner, he would have preceded by three years Einstein, who obtained the same result as Söldner in 1911. Einstein was unaware of it, and it was only brought to his attention in 1921 by Lenard in an effort to discredit him and his theory of relativity.

ᵉ General relativity modifies (3.8.24) by changing the 1/8 to 1/2. The circular orbit where the potential $\Phi_S$ is a minimum [cf. Fig. 7.6] determines $\Delta = \ell_0/\sqrt{2}\,c = \sqrt{[\alpha/2(2r-3\alpha)]}\,r \approx \sqrt{\alpha r}$. The Schwarzschild radius of the Sun is 3 × 10³ m, and the Sun has a radius of 7 × 10⁸ m. If Δ were to have this value, it would fix the radius of the orbit r at 10¹⁴ m, which is the Schwarzschild radius of a galaxy.

That Einstein’s result could not have been verified at the time was heralded as a fortuitous circumstance, since his general theory doubled the value. Observations were made during various solar eclipses with results ranging from 1.5″ to 2.2″. The latter value was found by the Freundlich eclipse expedition, which summarily remarked: “There appears to be no further doubt possible that our series of measurements is not compatible with the value 1.75″ asserted by theory.”

Numerical values do not make or break a theory, but their predictions do. Ritz could never have obtained an exact numerical result because he always had an undetermined parameter, λ, at his disposal.

In closing this chapter, perhaps some speculative remarks would be in order. As we have mentioned, Ritz [09] published a joint paper with Einstein in 1909, shortly before his death. Apparently, Ritz was not aware of the deflection of light by a massive object. But was Einstein aware of it, and if so, when? Probably as early as 1907, for as Einstein [23b] writes in his 1911 paper,

In a memoir published four years ago I tried to answer the question whether the propagation of light is influenced by gravitation. I return to this theme, because my previous presentation of the subject does not satisfy me, and for the stronger reason, because I now see that one of the most important consequences of my former treatment is capable of being tested experimentally. For it follows from the theory here to be brought forward, that rays of light, passing close to the Sun, are deflected by its gravitational field, so that the angular distance between the Sun and a fixed star appearing near to it is apparently increased by nearly a second of arc.

The really surprising aspect of Einstein’s 1911 calculation of the deflection of light by a massive body is that it employed a non-constant speed of light, as we have mentioned in Sec. 1.1.1.1. Einstein first draws the ‘analogy’ between the frequency shift caused by the (classical) Doppler shift,

\[ \nu = \nu_0\left(1 - u/c\right), \tag{3.8.27} \]

and that which could occur in a gravitational field, Φ = −GM/r. To ‘derive’ the expression for the shift, he replaces u by gt, where g is the constant acceleration on the surface of the Earth. With the time t = h/c, where h is the height that a photon falls from where it is emitted to where it is absorbed, and Φ = −gh, the shift in frequency becomes

\[ \nu = \nu_0\left(1 + \frac{\Phi}{c^2}\right). \tag{3.8.28} \]

Supposedly, (3.8.28) is valid everywhere, and not just for constant acceleration. In fact, it is static, and its only effect is to slow down clocks in a gravitational field, Φ = −GM/r. The Doppler shift, (3.8.27), can also lead to an increase in the frequency by reversing the sign of the velocity. There is no such possibility in (3.8.28). Then there is the nagging question of what causes the Doppler shift: uniform velocity or uniform acceleration? Obviously it is the former. Finally, we know that (3.8.27) is classical, so there ought to be a relativistic generalization of (3.8.28). No one has ever mentioned it.
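To get a feeling for the size of the effect in (3.8.28), the following minimal sketch evaluates the fractional shift Φ/c² = −gh/c² for a photon falling through a height h. The 22.5 m height and the rounded value of g are illustrative assumptions of ours, not figures taken from Einstein's paper.

```python
# Fractional frequency shift from (3.8.28), with Phi = -g*h as in the text.
C = 299_792_458.0   # speed of light in m/s
G_SURFACE = 9.81    # assumed acceleration at the Earth's surface in m/s^2


def fractional_shift(height_m, g=G_SURFACE, c=C):
    """Return (nu - nu0)/nu0 = Phi/c^2 with Phi = -g*h."""
    phi = -g * height_m
    return phi / c**2


if __name__ == "__main__":
    # Hypothetical 22.5 m fall, chosen only for illustration.
    shift = fractional_shift(22.5)
    print(f"(nu - nu0)/nu0 = {shift:.3e}")
```

The shift is of order 10⁻¹⁵ for laboratory heights, and its sign is fixed by the sign of Φ, in line with the remark that (3.8.28), unlike (3.8.27), has no velocity to reverse.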

Einstein continues:

For measuring time at a place which, relatively to the origin of the coordinates, has the gravitational potential Φ, we must employ a clock which, when removed to the origin of the coordinates, goes (1 + Φ/c²) times more slowly than the clock used for measuring time at the origin of the coordinates. If we call the velocity of light at the origin of coordinates c_0, then the velocity of light c at the place where the gravitational potential is Φ will be given by the relation

\[ c = c_0\left(1 + \frac{\Phi}{c^2}\right). \tag{3.8.29} \]

Here, Einstein has to admit that

The principle of constancy of the velocity of light holds good according to this theory in a different form from that which usually underlies the ordinary theory of relativity.

Which form he does not say. Moreover, this appears to be a second-order effect, and not the usual Doppler shift, which is linear in the relative velocity, u/c. In addition, (3.8.29) presents itself as a cubic equation for determining the speed of light as a function of a static gravitational potential.

According to Pais [82], “Einstein restored sanity, but at a price.” On the contrary, Einstein did not restore sanity, but did pay the price! The prescription that “we must use clocks of different constitution for measuring the time at places with different gravitational potential” leaves us with devices that do not seem like clocks at all, for there can be no consensus on the time measured, nor a law telling us how to calculate the differences, because the clocks are all of “different constitutions.” But let this not prevent us from continuing.

Einstein then claims that

From the proposition which has just been proved, that the velocity of light in a gravitational field is a function of place, we may easily infer, by means of Huygens’s principle, that light-rays propagated across a gravitational field undergo deflection.

Leaving aside the ‘proof’ of the proposition, Einstein asserts that the rate of change of the velocity with respect to the normal of the wavefront gives the angle of deflection and equates ∂c/∂n′ with (c_0/c²)∂Φ/∂n′, on the strength of (3.8.29), where n′ is the normal to the surface. Einstein thus proposes a deflection, “on the side directed toward the heavenly body, of magnitude”

\[ a = \frac{1}{c^2}\int_{\theta=-\pi/2}^{\theta=\pi/2} \frac{GM}{r^2}\cos\theta\, ds = 2\,\frac{GM}{r c_0^2} = 4\times 10^{-6} = 0.83'', \tag{3.8.30} \]

where the arc ds = r dθ.

Now the standpoints of Ritz and Einstein are even more curious, for Ritz contends that the speed of a photon relative to its emitter is c, but relative to an inertial observer is c ± u, where u is the radial speed between the source and observer, while Einstein argues that a photon’s speed is always c, independent of the speed of its emitter. This debate took place in 1909 [cf. Sec. 4.1.3]. Just two years later, Einstein admits that c is no longer an absolute velocity!

It is commonly assumed that the difference between the special and general theories is that the latter supplies a factor of two in (3.8.30). It is not the factor of 2 which is the important point. The corrections to Newtonian physics are all of the magnitude of the ratio of the Schwarzschild radius to the radius of the object under investigation. This is why all of the so-called tests of the general theory can so easily be gotten without any of the heavy machinery of tensorial analysis [Sexl & Sexl 79]. This will be the object of our study in Chapter 7.
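The statement that (3.8.30) is simply the ratio of the Schwarzschild radius to the solar radius can be checked with a few lines of arithmetic. The sketch below is ours; the rounded values of G, the solar mass and the solar radius are assumptions supplied for illustration, and the function names are hypothetical. With these inputs the ratio comes out near 4.2 × 10⁻⁶, or roughly 0.87′′, slightly above the 0.83′′ quoted in (3.8.30); doubling it gives approximately the 1.75′′ of the general theory.

```python
import math

G = 6.674e-11        # gravitational constant in m^3 kg^-1 s^-2 (rounded)
C = 299_792_458.0    # speed of light in m/s
M_SUN = 1.989e30     # assumed solar mass in kg
R_SUN = 6.96e8       # assumed solar radius in m


def schwarzschild_radius(mass, g=G, c=C):
    """Return 2GM/c^2 in metres."""
    return 2.0 * g * mass / c**2


def deflection_1911(mass, radius):
    """Return 2GM/(r c^2) in radians, i.e. Schwarzschild radius over radius."""
    return schwarzschild_radius(mass) / radius


def to_arcsec(radians):
    """Convert an angle in radians to seconds of arc."""
    return math.degrees(radians) * 3600.0


if __name__ == "__main__":
    a = deflection_1911(M_SUN, R_SUN)
    print(f"Schwarzschild radius of the Sun: {schwarzschild_radius(M_SUN):.0f} m")
    print(f"deflection, as in (3.8.30):      {a:.2e} rad = {to_arcsec(a):.2f} arcsec")
    print(f"doubled value:                   {to_arcsec(2 * a):.2f} arcsec")
```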

References

[Born 60] M. Born, The Mechanics of the Atom (Frederick Ungar, New York, 1960), p. 141.

[Brace 04] D. B. Brace, “On double refraction in matter moving through the aether,” Phil. Mag. 7 (1904) 317–329.


[Carstoiu 69] J. Carstoiu, “Les deux champs de gravitation et propagation des ondes gravifiques,” Comptes Rendus 268 série A (1969) 201–204; “Nouvelles remarques sur les deux champs de gravitation et propagation des ondes gravifiques,” ibid. 268 (1969) 261–264.

[Cunningham 14] E. Cunningham, The Principle of Relativity (Cambridge U. P., Cambridge, 1914).

[Einstein 23a] A. Einstein, “On the electrodynamics of moving bodies,” translated by W. Perrett and G. B. Jeffery from Ann. der Physik 17 (1905) in The Principle of Relativity (Methuen, London, 1923), p. 63.

[Einstein 23b] A. Einstein, “On the influence of gravitation on the propagation of light,” Ann. der Physik 35 (1911) 898–908; translated by W. Perrett and G. B. Jeffery in The Principle of Relativity (Methuen, London, 1923), p. 99.

[Feynman 64] R. P. Feynman, The Feynman Lectures on Physics, Vol. II (Addison-Wesley, Reading MA, 1964), p. 28–11.

[Fokker 65] A. D. Fokker, Time and Space, Weight and Inertia (Pergamon Press, Oxford, 1965).

[Forward 61] R. L. Forward, “General relativity for the experimentalist,” Proc. IRE 49 (1961) 892–904.

[French 68] See, for example, A. P. French, Special Relativity (Van Nostrand Reinhold, London, 1968).

[Heaviside 94] O. Heaviside, Electromagnetic Theory, Vol. I (The Electrician, London, 1894), Appendix B.

[Ives 51] H. E. Ives, “Revisions of the Lorentz transformations,” Proc. Am. Philos. Soc. 95 (1951) 125–131.

[Klein 71] F. Klein, “On the so-called noneuclidean geometry,” Mathematische Annalen 4 (1871) 573–625; transl. in J. Stillwell, Sources of Hyperbolic Geometry (Am. Math. Soc., Providence RI, 1991), pp. 69–111.

[Larmor 00] J. Larmor, Aether and Matter: A Development of the Dynamical Relations of the Aether to Material Systems on the Basis of the Constitution of Matter (Cambridge U. P., Cambridge, 1900).

[Lavenda 09] B. H. Lavenda, A New Perspective on Thermodynamics (Springer, New York, 2009).

[Lorentz 16] H. A. Lorentz, Theory of Electrons (G. E. Stechert, New York, 1916), p. 217.

[Mascart 74] E. E. N. Mascart, “Sur les modifications qu’éprouve la lumière par suite du mouvement de la source lumineuse et du mouvement de l’observateur,” Annales de l’École Normale 3 (1874) 157–214.

[Møller 52] C. Møller, The Theory of Relativity (Oxford U. P., London, 1952), p. 354.

[Okun 89] L. B. Okun, “The concept of mass,” Physics Today June (1989) 31–36.

[O’Rahilly 38] A. O’Rahilly, Electromagnetics (Longmans, Green & Co., London, 1938), p. 325.

[Pais 82] A. Pais, Subtle is the Lord (Oxford U. P., Oxford, 1982), p. 199.

[Poincaré 02] H. Poincaré, La Science et l’Hypothèse (Flammarion, Paris, 1902).

[Poynting 10] J. H. Poynting, The Pressure of Light (Society for Promoting Christian Knowledge, London, 1910).

[Poynting & Thomson 22] J. H. Poynting and J. J. Thomson, Text-book of Physics: Properties of Matter (Charles Griffin, London, 1922), pp. 48–52.


[Rayleigh 02] Lord Rayleigh, “Does motion through the aether cause double refraction,” Phil. Mag. 4 (1902) 678–683.

[Ritz 08] W. Ritz, Gesammelte Werke–Oeuvres (Gauthier Villars, Paris, 1911); see also http://www.datasync.com/rsf1/crit2/1908-2p.htm.

[Ritz & Einstein, 09] W. Ritz and A. Einstein, “Zum gegenwärtigen Stande des Strahlungsproblems,” Phys. Z. 10 (1909) 323–324.

[Schott 12] G. A. Schott, Electromagnetic Radiation (Cambridge U. P., Cambridge, 1912), p. 250. Note that the direction of the third component of the force should be m′, and not m.

[Sexl & Sexl 79] R. Sexl and H. Sexl, White Dwarfs–Black Holes: An Introduction to Relativistic Astrophysics (Academic Press, New York, 1979), Ch. 2.

[Silberstein 14] L. Silberstein, The Theory of Relativity (MacMillan, London, 1914).

[Skilling 42] H. H. Skilling, Fundamentals of Electric Waves (Wiley, New York, 1942).

[Söldner 04] J. G. v. Söldner, “Über die Ablenkung eines Lichtstrahls von seiner geradlinigen Bewegung, durch die Attraktion eines Weltkörpers, an welchem er nahe vorbei geht,” Berliner Astronomisches Jahrbuch (1804) 161–172; reprinted in P. Lenard, “Über die Ablenkung eines Lichtstrahls von seiner geradlinigen Bewegung durch die Attraktion eines Weltkörpers, an welchem er nahe vorbeigeht von J. Söldner, 1801,” Ann. d. Phys. 65 (1921) 593–604.

[Stranathan 42] J. D. Stranathan, The “Particles” of Modern Physics (Blakiston, Philadelphia, 1942).

[Thomson 28] J. J. Thomson and G. P. Thomson, Conduction of Electricity Through Gases, 3rd ed. (Cambridge U. P., Cambridge, 1928), p. 262.

[Trouton & Noble 03] F. T. Trouton and H. R. Noble, “The mechanical forces acting on a charged electric condenser moving through space,” Philos. Trans. R. Soc. London 202 (1903) 165–181.

[Whittaker 53] E. Whittaker, A History of the Theories of Aether and Electricity, Vol. 2 (Nelson, London, 1953).


Chapter 4

Electromagnetic Radiation

4.1 Spooky Actions-at-a-Distance versus Wiggly Continuous Fields

The early decades of the nineteenth century saw a very diverse set of actors on the stage of electrodynamics. This stage rapidly turned into a battleground between the proponents of actio in distans, Gauss, Clausius, Weber, Riemann and Ritz, and the advocates of continuous action through a medium, Faraday, Maxwell, Lorentz, Heaviside, and Hertz. Maxwell’s idea that “we can scarcely avoid the inference that light consists in the transverse undulation of the same medium which is the cause of electric and magnetic phenomena” is counterpoised by Gauss’s assertion “two elements of electricity in relative motion repel or attract one another differently when in motion than when at rest.” So was it the wiggly nature of elastic undulations, or the ballistic nature of electrified particles that interact through action at a distance, that held the day?

In an 1845 letter to Wilhelm Weber, Gauss admits:

I would doubtless have long since published my researches, were it not that at the time I gave them up I had failed to find what I regarded as the keystone: namely, the derivation of the additional forces (to be added to the mutual action of electrical particles at rest when they are in mutual motion) from the action which is propagated not instantaneously but in time as is the case with light.

So it is not only in the discovery of hyperbolic geometry where Gauss lost priority due to his extremely conservative nature. Weber, remembered today only for his “absolute units of measurements of electrical quantities,” is not given credit for his ideas that electricity has an atomic structure, that electrical currents consist in streams of particles, that Coulomb’s law needs to be modified for charges in motion, that Ampère’s law acts directly between the charges and not between the conductors, that there is a limiting speed at which the force of attraction vanishes, and that their interaction is not instantaneous, as Gauss affirmed.

Actually, Ampère’s law was quite revolutionary in its day because it overturned the apple cart of Newton’s universal inverse square law. Although Ampère’s law reduces to that of Coulomb for charges at rest, it took into account not only the separation between the charges but also the angular direction of their currents. Ampère also alluded to the creation of magnetism through the motion of charges. Notwithstanding his revolutionary ideas, his work would have fallen into oblivion had it not been for the intervention of Gauss, who, with the collaboration of a young physicist, Wilhelm Weber, sought experimental verification of Ampère’s hypothesis. The gap began to widen between the Maxwellian continuity of the electric and magnetic fields and the Weberian belief that forces between charged particles depended on their relative velocities and accelerations, in which Ampère’s angular dependency diminishes the force of attraction when the particles are in relative motion.

Weber, and all those who agreed with him, like Rudolf Clausius, were lambasted by the English school, personified by Peter Guthrie Tait. In the first edition of his Sketch of Thermodynamics, Tait refers to

Weber’s inadmissible theory of the forces exerted on each other by moving electric charges, for which the conservation of energy is not true; while Maxwell’s result is in perfect consistence with that great principle.

Maxwell, in his Treatise, later admitted that the non-conservation of energy “does not apply to the formula of Weber.” Why this retraction on the part of Maxwell?

According to O’Rahilly [38], it is due to the presence of an acceleration term, r̈, in Weber’s force law,

\[ F_x = \frac{ee'}{r^2}\cos(rx)\left(1 + r\ddot{r} - \tfrac{1}{2}\dot{r}^2\right), \tag{4.1.1} \]

where the argument of the cosine denotes the angle formed between r and x; the acceleration term accounts for the energy lost through radiation. In the second edition of Sketch, Tait admits to his mistake but now faults Weber on the fact that his “potential involves relative velocities as well as relative positions, and cannot therefore be called potential energy.” Tait is here saying that only absolute velocities, or velocities with respect to the aether, can be used in electrodynamics! Next, he attacks Weber on the nature of the force law; any true law of force must only be a function of the mutual distance between the charges e and e′, and cannot involve velocity, either relative or absolute. This is nothing but unfettered authoritarianism.

Even the ‘man who believed in atoms,’ Ludwig Boltzmann, acquiesced to the continuum model of Maxwell, saying

The hypothesis of electric fluids was brought to high perfection by Wilhelm Weber, and the general recognition given to his work in Germany stood in the way of the study of Maxwell’s theory. . . It is certainly useful if Weber’s theory is held up for ever as a warning example that we should always preserve the required mental elasticity.

In essence, Boltzmann is negating his entire life’s work! But, to Boltzmann, discreteness, or atomism, was a mere way of counting. He always took the continuum limit at the end of his calculations, something Planck was unable to do.

Weber’s force law (4.1.1) can be derived from the so-called electrokinetic potential,

\[ L = e\left(\phi - \mathbf{u}\cdot\mathbf{A}/c\right), \tag{4.1.2} \]

where φ and A are the scalar and vector potentials. The idea is to introduce expressions for these potentials, such as:

\[
\begin{aligned}
\phi &= \int \frac{\rho}{r}\,dV, && \text{(Poisson–Gauss)}\\
\mathbf{A} &= \frac{1}{c}\int \frac{\mathbf{j}}{r}\,dV, && \text{(Ampère)}\\
\phi &= \int \frac{\rho_{t-r}}{r}\,dV, && \text{(Riemann–Lorenz)}\\
\mathbf{A} &= \frac{1}{c}\int \frac{\mathbf{j}_{t-r}}{r}\,dV, && \text{(Riemann–Lorenz)}\\
\phi &= \frac{e}{r - \mathbf{u}\cdot\mathbf{r}/c}, && \text{(Liénard–Wiechert)}\\
\mathbf{A} &= \frac{e\mathbf{u}/c}{r - \mathbf{u}\cdot\mathbf{r}/c}, && \text{(Liénard–Wiechert)}
\end{aligned}
\]

where the subscripts on the charge density, ρ, and current density, j, mean to evaluate them at the earlier time t′ = t − r/c, and r is the radius vector, taken from the point where the charge is located to the point where it is observed. Then (4.1.2) could be treated as a mechanical Lagrangian so that the force is given by the variational expression

\[ F_x = -\frac{\partial L}{\partial x} + \frac{d}{dt}\frac{\partial L}{\partial u_x}. \tag{4.1.3} \]

Clausius’s Lagrangian,

\[ L = \frac{ee'}{r}\left(1 - \sum u_x u'_x/c^2\right), \tag{4.1.4} \]

where the sum is over all particles, introduces absolute velocities, u and u′, and not relative ones. Unwittingly, Clausius joined electron theory to the aether, and undoubtedly, his expression (4.1.4) is similar to his virial, where the charges are replaced by the masses.

Earlier expressions for the force, derived from Weber’s,

\[ L_W = \frac{ee'}{r}\left(1 + u_r^2/2c^2\right), \tag{4.1.5} \]

and Riemann’s,

\[ L_R = \frac{ee'}{r}\left(1 + u^2/2c^2\right), \tag{4.1.6} \]

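Lagrangians contained only relative velocities, where u_r is the radial velocity. Ritz showed that his second-order force relation for the interaction of two charges in uniform motion,

\[
\begin{aligned}
F_x &= eE_x + \frac{1}{c}(\mathbf{u}\times\mathbf{H})_x\\
&= \frac{ee'}{c^2r^2}\cos(rx)\left[c^2 + \tfrac{1}{4}(3-\lambda)u^2 - \tfrac{3}{4}(1-\lambda)u_r^2\right]\\
&\quad - \frac{(1+\lambda)ee'}{2r^2c^2}\,u'_x u_r - \frac{ee'}{2c^2r}\left[\dot{u}'_x + \dot{u}'_r\cos(rx)\right],
\end{aligned}
\tag{4.1.7}
\]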

can be expressed as a linear combination of the Weber and Riemann forces, and, hence, as a linear combination of their Lagrangians, (4.1.5) and (4.1.6) [cf. 3.8.15]. Moreover, for λ = 1, Ritz’s force law reduces to Liénard’s [98] expression,

\[ F_x = \frac{ee'}{c^2r^2}\cos(rx)\left(c^2 + \tfrac{1}{2}u^2\right) - \frac{ee'}{r^2c^2}\,u'_x u_r - \frac{ee'}{2c^2r}\left[\dot{u}'_x + \dot{u}'_r\cos(rx)\right], \tag{4.1.8} \]

where the acceleration terms have been included.


4.1.1 Irreversibility from a reversible theory

Not only was Riemann a great mathematician, but, like the prince of mathematicians, Gauss, he also contributed to electrodynamics. Riemann observed:

I have found that the electrodynamic actions of galvanic currents may be explained by assuming that the action of one electrical mass on the rest is not instantaneous, but is propagated to them with a constant velocity which, within the limits of observation, is equal to that of light.

Not only in the field of thermodynamics was there a rivalry between the English and Germans in the nineteenth century [Lavenda 09], but that rivalry spilled over into electrodynamics. Maxwell was quick to object to the notion that “potential is propagated like light” and refers to Clausius’s criticisms of Riemann to the effect that Riemann fails to obtain the known laws of electrodynamics. Clausius here was in no way as successful as he was on the thermodynamics front.

The finite time of propagation, referred to by Riemann, could either mean the potential is retarded insofar as the observation of the charge or current is made at a later date, or there is a finite time involved in the interaction of the charges, i.e. there is no action at a distance. Maxwell deals with the first possibility and claims that “the electrical potential, which is the analogue of temperature, is a mere scientific concept.” This is in sharp contrast to his A, the “electrokinetic momentum,” which “may even be called the fundamental quantity in the theory of electromagnetism.” Verily, Maxwell was not aware of the four-vector status of the potentials.

The explicit introduction of retarded potentials was made by Ludvig Lorenz in 1867. Lorenz wrote,

\[ \phi = \int \frac{[\rho]}{r}\,dV, \qquad \mathbf{A} = \frac{1}{c}\int \frac{[\mathbf{j}]}{r}\,dV, \tag{4.1.9} \]

where the square brackets indicate that the charge, ρ, and current, j = ρu, densities are to be evaluated at an earlier time, t − r/c. Maxwell reacted more kindly to Lorenz than he did to Riemann, saying, in his Treatise, that “his conclusions are similar to those in this chapter, though obtained by an entirely different method.” Maxwell then claims priority over Lorenz.


Fig. 4.1. The configuration for calculating the retarded scalar potential.

The form of the retarded potential in use today was first given by Liénard in 1898, and slightly later by Wiechert in 1900. Liénard’s proof that the scalar potential is given by

\[ \phi = \frac{e}{r\left(1 - v_r/c\right)}, \tag{4.1.10} \]

and a similar expression for the vector potential, consists in considering the total charge e in a small volume V. Suppose that V is so small that the velocity v_r can be considered constant throughout. A sphere of radius r cuts the volume in αβ, which can be considered a plane in Fig. 4.1. An increase in the radius r, by an infinitesimal amount dr, leads to an increase in volume swept out by αβ by an amount αβ × dr. However, the volume swept out with respect to the original volume is αβ × (dr + v_r dt), where v_r is the rate of change of the radius. The increment in the radial coordinate is dr = −c dt, looking backwards, so that the volume V expands by the amount V/(1 − v_r/c), and, consequently, the scalar potential where the observer is located, (4.1.10), is greater than its value where the charge is located.^a
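The retardation condition t′ = t − r/c and the Liénard–Wiechert form φ = e/(r − u · r/c) lend themselves to a short numerical illustration. The sketch below is ours and rests on stated assumptions: units with c = 1, a point charge whose trajectory is supplied by the caller, and hypothetical function names. The retarded time is found by fixed-point iteration, which converges whenever the charge moves slower than light.

```python
import math


def retarded_time(t, x_obs, position, c=1.0, tol=1e-12, max_iter=500):
    """Solve t' = t - |x_obs - position(t')|/c by fixed-point iteration.

    Each step contracts the error by at most |u|/c, so the iteration
    converges for source speeds below c.
    """
    t_ret = t
    for _ in range(max_iter):
        t_next = t - math.dist(x_obs, position(t_ret)) / c
        if abs(t_next - t_ret) < tol:
            return t_next
        t_ret = t_next
    raise RuntimeError("retarded time did not converge")


def lienard_wiechert_phi(t, x_obs, position, velocity, e=1.0, c=1.0):
    """Scalar potential e / (r - u.r/c), evaluated at the retarded time."""
    t_ret = retarded_time(t, x_obs, position, c)
    # r points from the retarded position of the charge to the observer.
    r_vec = [xo - xq for xo, xq in zip(x_obs, position(t_ret))]
    r = math.hypot(*r_vec)
    u_dot_r = sum(ui * ri for ui, ri in zip(velocity(t_ret), r_vec))
    return e / (r - u_dot_r / c)


if __name__ == "__main__":
    u = 0.5  # hypothetical speed of the charge, in units of c

    def position(t):
        return (u * t, 0.0, 0.0)

    def velocity(t):
        return (u, 0.0, 0.0)

    observer = (10.0, 0.0, 0.0)
    print("retarded time:", retarded_time(0.0, observer, position))
    print("phi:", lienard_wiechert_phi(0.0, observer, position, velocity))
```

In this example the charge approaches the observer head-on. At the retarded time it is at a distance r = 20 with v_r/c = 1/2, so (4.1.10) gives φ = e/10, which is twice the naive value e/r at the retarded distance and coincides with the static Coulomb value at the present distance of 10.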

^a Curiously enough, Heaviside attributes the retarded potentials to his friend FitzGerald, “who first brought the progressive A and φ into electromagnetics . . . But his potentials were not dopplerized. . . .” How Heaviside could have confused retarded propagation with the Doppler effect is, indeed, a mystery. Even correct equations can sometimes be obtained through false physical reasoning.


O’Rahilly [38] makes a good point that whereas Lorenz would write u, Maxwell would write u + Ė. In so doing, Maxwell shows the dichotomy of the current: a current carrying charged particles, and a more abstract displacement current, so named because Maxwell was under the impression that such a current was the result of polarization, or the displacement of charges in a dielectric. The two are united through the continuity equation and Gauss’s law. However, the real magic of the displacement current was that, when substituted into the second circuital equation, it closed the equations and led to a wave equation for the propagation of the fields at the velocity of light!

Although we do not agree with O’Rahilly’s claim that “the view of Lorenz is accepted universally today, while there appears to be little or no realization of the elementary inference that Maxwell’s displacement current is thereby rendered unnecessary,” and that “Maxwell’s so-called displacement current is merely a mathematical equivalent expression without physical significance,” the retarded potential picks out the arrow of time which is lacking in the rest of electrodynamics, be it Maxwell’s or Lorentz’s theory.

Whereas the retarded potential corresponds to waves diverging in all directions from electric charges, the advanced potential corresponds to waves coming in from infinity and converging on the electric charges. As Ritz pointed out over a century ago: retarded potentials depend on previous states, advanced potentials depend on future states. He concluded:

Experience shows, and Lorentz admits, that only [retarded] waves can exist, and, furthermore, contrary hypotheses would involve inadmissible consequences, such as the possibility of perpetual motion.

For if we admit the existence of advanced potentials, charges would no longer be the sources of the field. Ritz uses the argument that since advanced, as well as retarded, potentials satisfy Maxwell’s equations, these equations are unable to distinguish between the two, and this is yet another reason for preferring the “formulas of elementary actions.”

It is rather interesting to see how Einstein vacillated back and forth. Any theory which seeks to unite Maxwell’s equations with the corpuscular theory of light must lead to inconsistencies. But Einstein divided them into two papers in 1905, one “On the electrodynamics of moving bodies,” and the other “On a heuristic point of view concerning the production and transformation of light.” According to Planck’s interpretation of Einstein, if light waves have a corpuscular constitution then Maxwell’s equations have to be abandoned. Einstein supported this by saying

According to the usual theory an oscillating ion generates a divergent spherical wave. The reverse process does not exist as an elementary process. . . the elementary process of light emission has not as such the character of reversibility. . . Hence the constitution of radiation appears to be different from that deduced by our wave theory.

Undoubtedly swayed by the enormous popularity that special relativity received in the years that followed, Einstein, writing in Maxwell’s Commemoration Volume, again reverted to his previous stance:

Since Maxwell’s time physical reality has been thought of as represented by continuous fields, governed by partial differential equations and not capable of any mechanical interpretation.

And so he was to remain for the rest of his life.

4.1.2 From fields to particles

Maxwell took his theory of electromagnetism up to the point of specifying the molecular constituency of matter. For he wrote:

Here we may introduce once for all the common phrase ‘the electric fluid’ for the purpose of warning our readers against it. It is one of those phrases which, having been at one time used to denote an observed fact, was immediately taken up by the public to connote a whole system of imaginary knowledge. As long as we do not know whether positive electricity or negative or both should be called a substance or the absence of a substance. . . we must avoid speaking of the electric fluid.

Although Maxwell’s original goal was “to discover a method of forming a mechanical conception of this electrotonic state,” he conceded defeat in that

we have made only one step in the theory of the action of the medium. We have supposed it to be in a state of stress, but we have not in any way accounted for this stress or explained how it is maintained. . . I have not been able to make the next step, namely, to account by mechanical considerations for these stresses in the dielectric.

That next step was taken by Hendrik Lorentz, and it was a step backwards to Weber, who considered the action to depend “directly on the relative velocities of the particles, and to Riemann and Lorenz who envisioned a gradual propagation of something, whether potential or force, from one particle to another.” Lorentz’s force,

\[ \mathbf{F} = e\left(\mathbf{E} + \frac{\mathbf{u}}{c}\times\mathbf{H}\right), \tag{4.1.11} \]

is a concoction of Coulomb’s law and J. J. Thomson’s force (1881), which is experienced by a charge e as it moves through a magnetic field, H. Lorentz’s synthesis (1892) is this:

It is got by generalizing the results of electromagnetic experiments. The first term represents the force acting on an electron in an electrostatic field [F1 = eE]. . . the part of the force expressed by the second term may be derived from the law according to which an element of wire carrying a current is acted on by a magnetic field [F2 = j ds × H/c]. . . simplifying . . . [to] only one kind of moving electron with equal charges and a common velocity. . . [j ds = eu]. . . we now combine the two in the way shown by the equation. . .

O’Rahilly [38] points out that the two forces are incompatible, for how can an electron be both stationary and moving? The word ‘combine’ leaves something to be desired. Moreover, as Thomson was the first to point out, (4.1.11) violates the law of action and reaction without the presence of the aether. Lorentz’s law is derivable from the ‘electrokinetic’ potential of Schwarzschild, the ‘electrodynamic’ potential of Clausius, or the ‘convection’ potential of Searle, (4.1.2), since, as Schwarzschild showed, (4.1.11) follows from the classical variational equation, (4.1.3).

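Another criticism that can be lodged against Lorentz’s synthesis is that the electric field due to a charge e′ moving with relative velocity u′ on the charge e traveling at relative velocity u, at a distance r apart, is

\[
\begin{aligned}
\mathbf{E} &= -\nabla\phi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} = -\nabla\frac{e'}{r} - \frac{1}{c}\frac{\partial}{\partial t}\frac{e'\mathbf{u}'}{r}\\
&= \frac{e'}{r^2}\,\hat{\mathbf{r}} - \frac{1}{c}\frac{e'}{r}\,\dot{\mathbf{u}}' + \frac{1}{c}\frac{e'u'_r}{r^2}\,\mathbf{u}',
\end{aligned}
\]

where r̂ is a unit normal in the direction of the motion. This shows that Lorentz’s assumption will hold if we neglect accelerations and second-order terms in the relative velocities.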


Ritz remarks:

This remarkable result, due to Schwarzschild, shows that Lorentz’s theory resembles the older theories much more than we could at first sight believe.

He contends that it is not the fields that determine the force in (4.1.11), but, rather, “we only know F. . . where there is electrified matter, and deduce E and H by reasoning (which is not always so simple when we have to consider absolute motion).” Moreover, when it is realized that H = e′∇ × u′/cr, it becomes apparent that the velocities, u and u′, enter (4.1.11) “in a non-symmetrical manner which clearly shows the inequality of action and reaction, even when the accelerations are supposedly negligible and there is no radiation.” Ritz concludes by saying “the inequality of action and reaction constitutes, therefore, a serious objection to Lorentz’s theory.” Without the aether, Lorentz’s theory appears lopsided.

4.1.3 Absolute versus relative motion

We return to our discussion in Sec. 3.2 of Maxwell’s role in determining motion through the aether. In the year of his death, Maxwell thanked D. P. Todd, of the U. S. Nautical Almanac Office, for astronomical tables he had sent him. In that letter he brings up the possibility of measuring the velocity of the solar system through the aether. He thought it could be done by measuring the eclipses of Jupiter’s moons over half the period of Jupiter’s orbit about the Sun. That is, by observing the apparent time of the eclipses with the Earth at diametrically opposite ends of its orbit, it would be possible to infer the time it [aether] would take to travel the diameter of the Earth’s orbit about the Sun.

If the whole solar system is moving at speed u, and the diameter of the Earth’s orbit is d, the time it takes light to travel this distance is

$$t_1 = \frac{d}{c+u}, \quad\text{or}\quad t_2 = \frac{d}{c-u},$$

depending on whether the aether is flowing towards, or away from, the Earth. There will be a time difference of

$$\Delta t = t_2 - t_1 = \frac{2du}{c^2-u^2} \approx \frac{2du}{c^2}.$$


Maxwell emphasized that this would be superior to terrestrial measurements, which relied on the completion of a round trip. In such a case, one would have to determine, not the time difference, but the total time for the ‘out and return’ trip,

$$t = \frac{d}{c+u} + \frac{d}{c-u} = \frac{2cd}{c^2-u^2} \approx \frac{2d}{c}\left(1 + \frac{u^2}{c^2}\right),$$

where the increase in time due to the motion of the aether is

$$\Delta t \approx \frac{2d}{c}\,\frac{u^2}{c^2}. \tag{4.1.12}$$

In contrast to the extraterrestrial measurement, which is first-order in the ratio u/c, we now have a second-order effect. Maxwell realized that a terrestrial effect would be too small to measure; taking u to be the velocity of the Earth in its orbit, the ratio u/c is of order 10⁻⁴, so that the second-order term is of order 10⁻⁸.
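The contrast between the first-order and second-order effects can be checked numerically. The following is a minimal Python sketch; the values of c, d and u are rounded, illustrative assumptions (they are not taken from Maxwell's letter), and the function names are hypothetical.

```python
C = 2.9979e8   # speed of light in m/s (rounded)
D = 2.99e11    # assumed diameter of the Earth's orbit in m (about 2 AU)
U = 3.0e4      # assumed speed through the aether in m/s (Earth's orbital speed)

def one_way_difference(d, u, c=C):
    """Exact one-way time difference t2 - t1 = d/(c - u) - d/(c + u)."""
    return d / (c - u) - d / (c + u)

def round_trip_excess(d, u, c=C):
    """Exact excess of the round-trip time d/(c + u) + d/(c - u) over 2d/c."""
    return d / (c + u) + d / (c - u) - 2.0 * d / c

first_order = one_way_difference(D, U)    # close to 2du/c^2
second_order = round_trip_excess(D, U)    # close to (2d/c)(u^2/c^2)
ratio = second_order / first_order        # analytically equal to u/c

print(first_order, second_order, ratio)
```

Under these assumptions the round-trip excess is smaller than the one-way difference by the factor u/c, which is why the terrestrial measurement is so much harder.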

To save the principle of action and reaction, Ritz created a new form of emission theory. The question is whether radiation is transmitted according to Poincaré’s analogy between an artillery cannon and the force that a body experiences when it emits radiation, or whether radiant energy is transmitted by a medium-like disturbance that can be described by a wave equation.

The validity of Maxwell’s equations rests on there being an absolute velocity, the velocity of light. But Maxwell’s equations are silent on the existence of advanced potentials, which according to Ritz are unphysical. Ritz assumed that the velocity of a photon relative to its emitter was still c, so that Maxwell’s equations are preserved, but the velocity of the light with respect to an observer would be c + u_r, where u_r is the velocity of the source in the direction of the vector joining the source to the point where it is being observed. As we saw in Sec. 3.2, this automatically accounts for the null result in the Michelson–Morley interferometer experiments. It is only when we retain Einstein’s assumption that the velocity of light is constant in whatever frame we are in that we need a hypothesis of a contraction in the direction of motion, like that invented almost contemporaneously by FitzGerald and Lorentz.

What Ritz had in mind was modifying the retarded potentials, (4.1.9), so that they would read:

$$\phi = \int \frac{\rho_{t-r/(c\pm u_r)}}{r}\,dV, \qquad A = \frac{1}{c}\int \frac{j_{t-r/(c\pm u_r)}}{r}\,dV, \tag{4.1.13}$$


where the ± indicates whether the source is receding or approaching the observer. The new fields, (4.1.13), depend on the motion of the source, and this will increase or decrease depending on where they are being observed.

Ritz’s emission theory was not in contradiction with terrestrial monitoring since it differs from a theory with a constant velocity of light only in terms of second order in u/c. If the source is traveling at a (non-relative) speed u, the time it would take light to make a return journey, traveling a distance l and back, is

$$t = \frac{l}{c+u} + \frac{l}{c-u} = \frac{2lc}{c^2-u^2}. \tag{4.1.14}$$

If the velocity of light were constant, the transit time would be 2l/c. We came across (4.1.14) in the explanation of the null result in the Michelson–Morley experiment [cf. (3.2.1)]. But, if Ritz’s theory held sway there would have been no need to invent contraction factors.

Ritz’s emission theory put Einstein in a tight squeeze [Ritz and Einstein 09]. On the one hand, Einstein could not negate the validity of Maxwell’s equations for he had recently based his special theory upon them. On the other hand, he could not accept speeds greater than the velocity of light for his explanation of the Fizeau drag coefficient from the relativistic velocity addition law would be evanescent, as we shall see in the next section. In a futile attempt to fend off Ritz’s allegations that advanced potentials would violate the second law, all Einstein could say was “the irreversibility rests exclusively upon the grounds of probability.”b

The inverse process of a charge absorbing its own radiation was not considered as an elementary process in 1909, though it had to be admitted that it was a solution to Maxwell’s equations. To Einstein the inverse process consisted of an enormous collection of radiating particles that somehow could concentrate all of their radiation in a single point. In order to remove this asymmetry in Maxwell’s equations, Einstein suggested following a corpuscular theory with no corpuscles of light ever exceeding c. But that would leave his construct, special relativity, without a firm foundation on Maxwell’s theory.

bThis is seen here as a retraction of his adage that “God does not play dice.”


Although Maxwell’s calculation is in conformity with Ritz’s, it is in conflict with special relativity. Ritz considers

two points A and B which move with constant absolute velocity v in the direction AB. A luminous wave starting from A at instant t will arrive at B at instant t′. It will have to travel the distance AB + v(t′ − t) with speed c; we then have

$$t' - t = \frac{AB + v(t'-t)}{c}, \quad\text{or}\quad t' - t = \frac{AB}{c-v}.$$

He then goes on to say that there will be a first-order correction, (AB/c)v/c, to the ‘true’ time to get what he calls the ‘local’ time. Not so for terrestrial measurements. Although Ritz is following Maxwell, he quotes Lorentz. For terrestrial measurements of the velocity of light,

we are obliged to make it over a closed path which brings it back to its starting point; thus eliminating first-order terms. So, in the example considered, if the wave emitted at A is reflected at B, it will arrive at A after a time

$$t' - t = AB\left(\frac{1}{c-v} + \frac{1}{c+v}\right) = \frac{2AB}{c}\left(1 + \frac{v^2}{c^2} + \cdots\right).$$

4.1.4 Faster than the speed of light

In the following chapters, we will see that hyperbolic geometry allows for velocities greater than light. Not to encroach on material of later chapters, suffice it to say here that Felix Klein’s [71] definition of distance is the following:

The distance between an element z and the element z1 is the logarithm of the quotient, z/z1, divided by the constant log λ.

Here, λ is the scale factor; it is the unit of measure into which the interval is divided. If z = c + u and z1 = c − u, the ‘distance’ between them is

$$c + u - (c - u) = 2u = c\ln\left(\frac{c+u}{c-u}\right), \tag{4.1.15}$$

where we have set ln λ = 1/c. If we expand the logarithm to first-order we get 2u = 2u, but the presence of higher-order terms in the expansion means that the ‘u’ on the left-hand side of (4.1.15) is not the same as the ‘u’ on the right-hand side. Whereas the velocity on the right-hand side is the Euclidean measure, the velocity on the left-hand side is the hyperbolic measure of the velocity.


Furthermore, the composition law,

$$u' = \frac{u - v}{1 - uv/c^2},$$

implies that the difference of their hyperbolic counterparts, which we will indicate with a bar, is

$$\bar{u} - \bar{v} = \frac{c}{2}\ln\left(\frac{c+u'}{c-u'}\right) = \frac{c}{2}\ln\left(\frac{c+u}{c-u}\cdot\frac{c-v}{c+v}\right) = \frac{c}{2}\ln\{u, v\,|\,{-c}, c\}, \tag{4.1.16}$$

where the argument of the last logarithm is the cross-ratio. We cannot over-stress the facts that

whereas hyperbolic velocities are absolute, the absolute constant being the velocity of light, their Euclidean counterparts are relative. And whereas the former are strictly additive the latter satisfy the relativistic velocity composition law. Multiple longitudinal Doppler shifts give the cross-ratio so that its logarithm, (4.1.16), is a true measure of hyperbolic distance. The velocity of light determines the absolute limit of the Euclidean velocity while it determines the unit of measurement with respect to its hyperbolic measure.

To say that the meter is the distance traveled by light in vacuum during a time interval of 1/(2.9979 × 10⁸) of a second is a convention, and depends on how a second is defined. If all meter sticks and cesium clocks were to disappear from the face of the Earth, there would be no way for us to define a meter nor measure a second. We can thus say that velocities are relative in Euclidean space. Angles, on the other hand, are absolute, and, in non-Euclidean geometries, a triangle is determined by its three angles. This is completely foreign to Euclidean geometry.

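The angle between the relative velocities $\mathbf{u}_1$ and $\mathbf{u}_2$ is

$$\cos(\mathbf{u}_1, \mathbf{u}_2) = \frac{\mathbf{u}_1\cdot\mathbf{u}_2}{u_1 u_2}.$$

If we put our system on a platform moving at constant velocity $\mathbf{v}$, the expression for the angle becomes

$$\cos\theta = \frac{(\mathbf{u}_1-\mathbf{v})\cdot(\mathbf{u}_2-\mathbf{v}) - (\mathbf{u}_1\times\mathbf{v})\cdot(\mathbf{u}_2\times\mathbf{v})/c^2}{\sqrt{(\mathbf{u}_1-\mathbf{v})^2 - (\mathbf{u}_1\times\mathbf{v})^2/c^2}\,\sqrt{(\mathbf{u}_2-\mathbf{v})^2 - (\mathbf{u}_2\times\mathbf{v})^2/c^2}}, \tag{4.1.17}$$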


where θ is the angle at v in the triangle formed from the three vertices, v, u1, and u2. Expression (4.1.17) is the cosine of the angle in a hyperbolic triangle.

In Chapter 8, we will see that the relative velocity is related to the corresponding segment of a hyperbolic straight line by $|u| = c\tanh(|\bar{u}|/c)$. The relative velocity between u1 and u2 will correspond to the difference in the hyperbolic lengths $\bar{u}_2$ and $\bar{u}_1$. With $\bar{u}_2 > \bar{u}_1$, this distance is:

$$v = c\tanh\left(\frac{\bar{u}_2 - \bar{u}_1}{c}\right) = c\,\frac{\tanh(\bar{u}_2/c) - \tanh(\bar{u}_1/c)}{1 - \tanh(\bar{u}_1/c)\tanh(\bar{u}_2/c)},$$

which is none other than the relativistic law for the composition of velocities,

$$v = \frac{u_2 - u_1}{1 - u_1u_2/c^2},$$

as was first pointed out by Arnold Sommerfeld in 1909. The fact that $\bar{u} \to \infty$ implies $u \to c$ means that c loses its primacy in hyperbolic space, and is relegated to an absolutely determined constant, analogous to the radius of curvature, whose numerical value will depend on the choice of units.
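The additivity of the hyperbolic measure can be illustrated with a short numerical check. This is a sketch only: it works in units where c = 1, the sample velocities are arbitrary, and the function names are hypothetical. It uses the logarithmic definition of the hyperbolic measure from (4.1.16) and its inverse, the hyperbolic tangent relation quoted above.

```python
import math

C = 1.0  # assumption: units in which the speed of light is 1

def hyperbolic(u, c=C):
    """Hyperbolic measure of a Euclidean velocity: (c/2) ln((c + u)/(c - u))."""
    return 0.5 * c * math.log((c + u) / (c - u))

def euclidean(u_bar, c=C):
    """Inverse map back to the Euclidean measure: u = c tanh(u_bar / c)."""
    return c * math.tanh(u_bar / c)

def compose(u2, u1, c=C):
    """Relativistic relative velocity (u2 - u1) / (1 - u1 u2 / c^2)."""
    return (u2 - u1) / (1.0 - u1 * u2 / c**2)

u1, u2 = 0.5, 0.9                      # arbitrary sample velocities, both below c
v = compose(u2, u1)                    # Euclidean composition law
v_from_difference = euclidean(hyperbolic(u2) - hyperbolic(u1))

print(v, v_from_difference)            # the two values agree to rounding error
print(hyperbolic(0.999999))            # hyperbolic measure exceeds c as u approaches c
```

Subtracting hyperbolic measures and mapping back gives the same number as the composition law, while the hyperbolic measure itself grows without bound as the Euclidean velocity approaches c.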

Tolman asserted the necessity of an experimental test to decide between Ritz’s hypothesis that, to an observer, the velocity of emission depends on the motion of its source, and Einstein’s claim that it does not. Ehrenfest’s intervention brought with it an amusing paradox in that the believers in the aether would have to join ranks with the relativists in supporting Einstein’s hypothesis.

The hopes of the emission proponents were (temporarily) dashed by de Sitter’s observations of the light emitted from eclipsing binary stars. If the velocity of light depended additively on the velocity of the source, the time for light to reach Earth from an approaching star would be smaller than that from the receding member of the doublet. From the laws of mechanics, de Sitter concluded that the effect, if it existed, would introduce a spurious eccentricity into the orbit. No such peculiarities were observed.

However, the propagation of light through a medium, no matter how rare it is, involves a continual process of absorption and reemission of light as secondary radiation. This would have the effect of erasing any ‘memory’ that light has of the original source. This phenomenon, referred to as extinction and presented as a theorem of Ewald and Oseen, gives light emerging from a dispersive medium the velocity characteristic of the medium after a single ‘extinction length.’ In the case of interstellar space, the dispersive medium would be a permeating gas, and this invalidates the conclusion de Sitter drew from the lack of peculiarities.

Consider the passage of light through a medium of index of refraction η. The speed of light in such a medium would be c/η, where η > 1. If the source were moving at speed u in the direction away from an observer, Ritz would contend that the observer measures the speed of light, $c'$, as

$$c' = c/\eta + u,$$

where u is in the line of sight between the source and the observer, while Einstein would contend that they are separate velocities, and that they ‘add’ according to the velocity addition law,

$$c' = \frac{c/\eta + u}{1 + u/c\eta}.$$

Assuming slow motion, $u \ll c/\eta$, the denominator can be expanded in powers of u/c, and to first-order Einstein obtained

$$c' \approx \frac{c}{\eta} + \left(1 - \frac{1}{\eta^2}\right)u.$$

This is precisely Fresnel’s result who, as we saw in Sec. 3.1, used it to explain the partial ‘dragging’ of light by the medium. It is the main reason why Einstein could not believe in emission theories, for they would invalidate the composition law of velocities. This law shows that no matter how you combine velocities, and no matter how many you take in the combination, the velocity of light can never be exceeded. This is, however, valid only in Euclidean space.
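The three expressions for the observed speed of light in a moving medium can be compared numerically. The sketch below is illustrative only: it uses units with c = 1, an assumed refractive index of 1.333 (roughly that of water), and hypothetical function names.

```python
def ritz_speed(u, eta, c=1.0):
    """Simple (emission-theory) addition: c/eta + u."""
    return c / eta + u

def einstein_speed(u, eta, c=1.0):
    """Relativistic addition: (c/eta + u) / (1 + u/(c*eta))."""
    return (c / eta + u) / (1.0 + u / (c * eta))

def fresnel_speed(u, eta, c=1.0):
    """First-order expansion with the drag coefficient 1 - 1/eta^2."""
    return c / eta + (1.0 - 1.0 / eta**2) * u

ETA = 1.333          # assumed index of refraction

slow = 1.0e-3        # slow motion: u much smaller than c/eta
gap = abs(einstein_speed(slow, ETA) - fresnel_speed(slow, ETA))
print(gap)           # of second order in u, far smaller than u itself

fast = 0.99          # fast motion: the two addition rules now differ sharply
print(ritz_speed(fast, ETA), einstein_speed(fast, ETA))
```

For slow motion the relativistic law and the Fresnel formula differ only at second order in u, whereas for fast motion the simple sum exceeds c and the relativistic sum does not.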

4.2 Relativistic Mass

It is fair to say that Einstein made Lorentz’s theory of the electron “applicable to ‘material points’ without charge” [O’Rahilly 38]. Even Abraham was into generalizing all electromagnetic results into those valid for all particles. Referring to the expressions for the transverse and longitudinal masses he claims that these formulas

agree with those deduced for the Lorentz electron. We have derived them here, without making any assumption whatever concerning the configuration or the charge-distribution, solely from the theorem on the momentum of the energy stream. . . The same equations must hold for the ‘particle’ of elementary mechanics.

Abraham assumes that W = mc² holds for any radiation process whatsoever. The associated momentum is G = W/c if the particle is traveling at the speed of light, or G = W/u if it is traveling at a speed u < c. Then the force is

$$F = \frac{dG}{dt} = \frac{d}{dt}\,mu,$$

by definition. The increment in the work necessary to keep the system in motion can only come from a change in the energy,

$$dW = c^2\,dm = Fu\,dt = u^2\,dm + mu\,du,$$

or upon rearranging,

$$\frac{dm}{m} = \frac{u\,du}{c^2 - u^2}.$$

Integrating leads to

$$m = \frac{m_0 c}{\sqrt{c^2 - u^2}} = \frac{m_0}{\sqrt{1 - u^2/c^2}},$$

as the expression for the transverse mass at constant speed, as we have derived it in Sec. 1.1.3. This derivation was even accepted by Lenard, an arch enemy of relativity, who attributed it to Hasenöhrl.
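The integration step can be checked numerically. The sketch below integrates dm/m = u du/(c² − u²) from rest with a simple midpoint rule and compares the result with m0/√(1 − u²/c²); the units (c = 1, m0 = 1), the step count, and the function names are assumptions made for illustration.

```python
import math

def integrate_mass(u_final, m0=1.0, c=1.0, steps=100_000):
    """Integrate dm/m = u du / (c^2 - u^2) from 0 to u_final (midpoint rule)."""
    du = u_final / steps
    log_m = math.log(m0)
    for k in range(steps):
        u = (k + 0.5) * du                 # midpoint of the k-th interval
        log_m += u * du / (c**2 - u**2)    # accumulate d(ln m)
    return math.exp(log_m)

def closed_form(u, m0=1.0, c=1.0):
    """Closed-form result of the integration: m0 / sqrt(1 - u^2/c^2)."""
    return m0 / math.sqrt(1.0 - u**2 / c**2)

numeric = integrate_mass(0.8)
exact = closed_form(0.8)
print(numeric, exact)   # the two agree closely
```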

Over the years there have been raging controversies over the validity of the original ‘proof’ of Einstein’s mass–energy equivalence. There is a general consensus that the criticism of Einstein’s [35] proof rests with Ives [52], but Einstein himself found it necessary to offer another proof of the famous relation some thirty years after the original one. And that one was not so original because Poincaré came up with it five years before Einstein, as we saw in Sec. 1.2.2.2.

On the strength of Ives’s criticism, Jammer [61] writes:

It is a curious incident in the history of scientific thought that Einstein’s own derivation of the formula E = mc², as published in his article in the Annalen der Physik, was basically fallacious. In fact, what for the layman is known as “the most famous mathematical formula ever projected,” in science was but the result of a petitio principii. . .

Arzeliès [66] uses Einstein’s derivation as a bulwark for other seedy relations in physics when he states

It might appear rather piquant that a relation of this importance was introduced into physics by this expedient. In fact, it is just one example among others of the slight importance of logic in physical research.

Let us now turn to what Einstein did, or rather did not do.

4.2.1 Gedanken experiments

Einstein [05] considers a body at rest which emits a ‘light wave’ of total energy L into two opposite directions of equal magnitude. The body, which certainly cannot be an electron, conserves momentum but loses energy. If it had energy E0 before the emission took place and E1 after, the conservation of energy demands

$$L = E_0 - E_1.$$

Now consider, says Einstein, the same process in another inertial system moving at speed u relative to the former. The same emission will lead to the energy conservation

$$\frac{L}{\sqrt{1 - u^2/c^2}} = H_0 - H_1,$$

where H0 and H1 are the energies before and after radiation. This result, he claims, follows from the law of transformation of energy from one inertial frame to another that he derived previously. Subtracting the former from the latter gives

$$(H_0 - E_0) - (H_1 - E_1) = L\left(\frac{1}{\sqrt{1 - u^2/c^2}} - 1\right). \tag{4.2.1}$$

Einstein continues

H and E are the energy values of the same body, referred to two coordinate systems in motion relative to each other, in one of which . . . the body is at rest. It is therefore clear that the difference H − E can differ from the kinetic energy K of the body with respect to the other system . . . only by an additive constant C which depends on the choice of the arbitrary additive constants in the energies H and E.


Thus he sets

$$\begin{aligned} H_0 - E_0 &= K_0 + C, \\ H_1 - E_1 &= K_1 + C, \end{aligned} \tag{4.2.2}$$

where the constant C will not be altered by the emission of radiation. This is obvious, for, otherwise, it would not be an arbitrary constant. Then replacing the energy difference in (4.2.1) by these expressions he comes out with

$$K_0 - K_1 = L\left(\frac{1}{\sqrt{1 - u^2/c^2}} - 1\right),$$

where K0 and K1 are “the initial and final kinetic energies of the body with respect to the inertial frame in which its velocity is u” [Stachel & Torretti 82]. However, if u is the relative velocity of frame 1 with respect to frame 0, it follows that K0 = 0, for the latter is at rest with respect to the former.

What Ives [52] does is to get rid of the difference on the right-hand side of (4.2.1) by introducing the expression for the kinetic energies,

$$K_i = m_i c^2\left(\frac{1}{\sqrt{1 - u^2/c^2}} - 1\right), \quad i = 0, 1, \tag{4.2.3}$$

thereby obtaining

$$(H_0 - E_0) - (H_1 - E_1) = \frac{L}{(m_0 - m_1)c^2}\,(K_0 - K_1).$$

But, if K0 = 0 it follows that m0 will not appear because what multiplies it in (4.2.3) is zero! Notwithstanding, Ives separates this equation into two equations

$$\begin{aligned} H_0 - E_0 &= \frac{L}{(m_0 - m_1)c^2}\,(K_0 + C), \\ H_1 - E_1 &= \frac{L}{(m_0 - m_1)c^2}\,(K_1 + C), \end{aligned} \tag{4.2.4}$$

and claims that they are not (4.2.2) because they differ from them by the multiplicative factor, L/(m0 − m1)c². So, claims Ives, they will become them when we set the multiplicative factor equal to 1. This is the petitio principii, or the begging of the question, to which Jammer refers. Ives concludes: “The relation [E = Δmc²] was not derived by Einstein.” Although the latter is correct, it was not because Ives pin-pointed the error.


The criticism [Riseman & Young 53] lodged against (4.2.4) is that the two equations are not independent. Stated slightly differently, the first equation knows ahead of time that there is going to be an emission in energy leading to a decrease in mass.c Even worse, as we have already mentioned, no motion in O means no K0 and, therefore, the mass m0 will not appear in the equation so that there is no decrease in mass on account of the radiation.

Stachel and Torretti [82] emphasize that “though ‘clear’ to Einstein, their validity [referring to (4.2.3)] has not been quite so evident to others.” They hope to remedy the situation by the following considerations.

Consider, they say, a body in a rest frame whose internal state is characterized by a set of state parameters, S. By the ‘relativity principle’ it must be possible to have the same state in motion with the same state parameters, for, otherwise, we could distinguish motion from rest, or have absolute motion simply by noting a change in S. Thus, the energy E can only be a function of the velocity and set of state parameters, E = E(u, S).

Now comes the crux of their argument:

The kinetic energy of the body, by definition, is equal to the work necessary to bring the body from the state of rest to uniform motion with velocity u. But, by conservation of energy, this must be equal to the difference between its energy for the state S and speed u and its energy for the same internal state when at rest:

K = E(u, S) − E(0, S).

They then deduce that

$$E(u, S_0) - E(0, S_0) - [E(u, S_1) - E(0, S_1)] = K(u, S_0) - K(u, S_1) = L\left(\frac{1}{\sqrt{1 - u^2/c^2}} - 1\right).$$

We beg to differ with the statement that the “kinetic energy, by definition, is equal to the work. . .” since in order for the system to be brought from a state of rest into one of uniform motion it will have to undergo acceleration, i.e. a change in its velocity. So it is not the kinetic energy Stachel and Torretti are referring to, but, rather, the work, G · u, necessary to keep the system in a state of uniform motion, regardless of the manner it got there, i.e. by accelerating it.

cThis sounds like an advanced potential.


This is to say that the differences in energies of the two states are

$$\begin{aligned} E(u, S_0) - E(0, S_0) &= G_0 u, \\ E(u, S_1) - E(0, S_1) &= G_1 u, \end{aligned}$$

and subtracting the latter from the former gives

$$E(u, S_0) - E(0, S_0) - [E(u, S_1) - E(0, S_1)] = (G_0 - G_1)u = \frac{(m_0 - m_1)u^2}{\sqrt{1 - u^2/c^2}}.$$

This is certainly not the difference in the kinetic energies of the two states; in particular, for $u/c \ll 1$ it reduces to twice the difference in kinetic energies.

According to Jammer, after ‘correctly’ proving (4.2.1), Einstein “mistakenly put” H − E equal to the kinetic energy. Stachel and Torretti rejoin by saying “And yet it is hard to see what else one could mean by kinetic energy of a body with internal state S and speed u.” Rather, the difference between the total energy, what they refer to as E, and the internal energy, which can depend only on S, is the work necessary to keep the system in uniform motion [Lavenda 02].

The authors then claim that

Einstein had proved (4.2.3) for the kinetic energy of an ‘electron,’ i.e. a charged structureless particle in his first relativity paper; but he studiously avoided using it in the derivation of the mass–energy equivalence, even though. . . it would have simplified his task.

Surely, Stachel and Torretti know that bringing the body from a state of rest to uniform motion involves acceleration, and accelerating electrons radiate; this would carry us way beyond the limits of special relativity.

4.2.2 From Weber to Einstein

The birth of electrodynamics came in 1826 with the publication of Ampère’s memoir on the interaction of two small currents of electricity. Ampère claimed that the force acting between two elements of current was not just proportional to the inverse square of their distance, but also to the angles which these small elements made with the line connecting their centers.

In Ampère’s own words:

it is no longer contradictory to admit that from the actions proportional to the inverse square of the distance which each molecule exerts, there can result between two elements of conducting wires a force which depends not only on their distance but also on the direction of the two elements. . .

Ampère’s memoir caught the eye of Gauss who, in 1835, came to the conclusion that “two elements of electricity in relative motion repel or attract one another differently when in motion and when in relative rest.” We have already quoted Gauss’s letter to his assistant Weber, in Sec. 4.1, where he expressed his view that these interactions are not instantaneous, but propagated “as with the case of light.”

At the beginning of the nineteenth century discoveries were seemingly unconnected, for as Fechner wrote in 1845, “Faraday’s phenomena of induction and the electrodynamic phenomena of Ampère have been related only by an empirical rule.” According to Fechner, and subsequently Weber, all electric currents are currents of convection; that is, they are due to the motion of electricity. The corpuscular interpretation of electricity was elaborated upon by Weber himself. He accepted Fechner’s hypothesis that both positive and negative elements of electricity move at equal velocities in opposite directions. On the strength of Coulomb’s law these interactions should cancel out and there should be no motion. But, this contradicted the fundamental experiments conducted by Ampère who demonstrated beyond any doubt that a motion is produced between wires. Therefore, there must be a force not contained in Coulomb’s law.

Weber went on to derive a force existing between charged particles that took into account their relative velocities as well as accelerations. If mass is basically electromagnetic in nature, the velocities contribute to the force, and not to any variation of mass with speed. Weber’s ideas held sway on the continent largely due to the authority of Helmholtz.

However, Helmholtz found what he believed to be a flaw in Weber’s force law: it could lead to infinite work arising from finite motion of electrical particles. Weber replied that in order for this to happen, particles would have to move at enormous speeds surpassing his constant c, the only parameter that appears in his equation. When the motion of particles reaches this value, the force between the electrical particles vanishes. In the Weber–Kohlrausch experiment, performed in 1854, the constant was determined to have the same value as the product of the speed of light, in vacuo, with the square root of two. Riemann, who was present during the experiment, was impressed by the very profound connection between electricity in motion and light.

In England a different sort of synthesis was brewing. Maxwell attempted a different sort of amalgamation between a tentative explanation of electrical actions through mechanical properties of the aether, and a purely phenomenological description in terms of two fields that satisfied a set of partial differential equations. Gone were the forces by which electrical charges interact whether in motion or at rest.

Through the introduction of the displacement current, as an auxiliary means by which charges move, Maxwell was able to close his set of equations into a single wave equation with a velocity of propagation equal to that of light. It took thirty years from Maxwell’s first publication, “On Faraday’s lines of force,” in 1856 to get even a hearing for his theory on the continent. However, the experiments of Hertz clinched the success of Maxwell’s theory.

Maxwell’s theory, or Hertz’s generalization to bodies in motion, does not agree with well-known optical phenomena of aberration and experiments like the Fizeau drag and the Eichenwald experiment on the magnetic field produced by the rotation of a dielectric in an electric field. Another step was needed and it was provided by Lorentz who brought together two seemingly disconnected laws into a single force law. In doing so, Lorentz filled the huge abyss between Maxwell’s field equations and the mechanical theories that saw forces resulting from attraction and repulsion, and from motion.

It is commonly accepted that Einstein [49] tore down the scaffolding of the Maxwell–Lorentz theory and replaced it by two postulates, which in his own words are

The insight fundamental for the special theory of relativity is this: The assumptions relativity and light speed invariance are compatible if relations of a new type (“Lorentz transformation”) are postulated for the conversion of coordinates and times of events... The universal principle of the special theory of relativity is contained in the postulate: The laws of physics are invariant with respect to Lorentz transformations (for the transition from one inertial system to any other arbitrarily chosen inertial system). This is a restricting principle for natural laws.

So Einstein accepted the continuous field concept encapsulated in Maxwell’s equations, and, with them, the existence of an absolute velocity. Both Maxwell’s equations and the Lorentz force are time-symmetric, and, therefore, cannot explain phenomena involving radiation which is a clearly irreversible process. This was Ritz’s contention which was formulated into the “Ritz–Einstein Agreement to Disagree.”

Ritz argued that Maxwell’s equations do not discriminate between retarded and advanced potentials, and thus cannot explain phenomena involving radiation. Einstein, wishing to preserve the central role of Maxwell’s equations, insisted that Maxwell’s equations can be solved, in principle, using the end state, instead of the initial state, with the aid of the advanced potential. According to Lanczos [74] “Ritz took strong objection to this view, and Einstein admitted his mistake.” This is rather curious since neither author admitted making a mistake. It could be that Lanczos was referring to Einstein’s opinion in his later years.

Einstein goes on to sustain that irreversibility is a statistical effect like fluctuations in blackbody radiation and Brownian motion. However, there is nothing statistical about solving a partial differential equation, and this shows Einstein’s tenacious effort to maintain the unadulterated status of Maxwell’s equations and a constant, limiting speed.

However, with age, Einstein had growing disillusionments that differential equations are the correct setting for a unified theory. To Pais [82] he remarked “he was not sure whether differential geometry was to be the right framework for further progress,” while to Besso he wrote the year before his death, “I consider it quite possible that physics cannot be based upon the field concept, i.e. on continuous structures. In that case, nothing remains of my entire castle in the air, gravitation theory included, [and of] the rest of modern physics.”

4.2.3 Maxwell on Gauss and Weber

In the last chapter of his treatise, Maxwell [91] compares his notion of the transmission of radiant energy from one particle to another with those of Gauss and Weber. He titles his chapter “Theories of Action at a Distance,” which is a forewarning of the critical attitude he is to take.

Typical texts on electricity and magnetism use the 1820 formulation of Biot and Savart, while Gauss and Weber based their theory on that given by Ampère in 1825. The two expressions differ in their prediction of the force acting between two current elements in an open circuit, but coincide in a closed circuit because all angle dependent terms disappear.

Fig. 4.2. Orientation of two circuit elements ds and ds′.

If ds and ds′ are two lengths of wire located at a distance r apart, and carrying currents I and I′, respectively, then Ampère showed that the force exerted on ds by ds′ is

$$\kappa'\,\frac{II'\,ds\,ds'}{r^2}\,(2\cos\varepsilon - 3\cos\varphi\cdot\cos\varphi'), \tag{4.2.5}$$

where ε is the angle between the two elements, ds and ds′, and ϕ and ϕ′ are the angles formed between the radial vector r connecting them and their orientations, as shown in Fig. 4.2.

The constant κ′ is determined by the units. Another constant, κ, again determined by the units, appears in Coulomb’s law for the force acting between two charges e and e′ at a distance r apart,

$$\kappa\,\frac{ee'}{r^2}. \tag{4.2.6}$$

Once one of the constants is chosen to define a unit of charge, the other must be related to the speed of light, $\sqrt{\kappa/\kappa'}$. This was determined by Weber in 1852, who fixed the constant at $\sqrt{2}$ times larger than the speed of light.

The idea is to consider only the relative motion of the two particles.d If v and v′ are their speeds, then the square of their relative speed is

$$u^2 = v^2 - 2vv'\cos\varepsilon + v'^2.$$

dThe true relativists belonged to the nineteenth century, not the twentieth!

The rate of change of the distance between the two particles is

$$\dot r = \frac{dr}{dt} = v\,\frac{\partial r}{\partial s} + v'\,\frac{\partial r}{\partial s'}.$$

To get rid of the absolute square of the velocities, recourse was made to Fechner’s hypothesis, in which the electric current consists of positively and negatively charged particles traveling in opposite directions with velocities equal in magnitude. Introducing these expressions into an equivalent form of Ampère’s law,

$$-\frac{ee'}{r^2}\left(\frac{\partial r}{\partial s}\frac{\partial r}{\partial s'} - 2r\,\frac{\partial^2 r}{\partial s\,\partial s'}\right), \tag{4.2.7}$$

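with charges replacing current elements, Gauss and Weber finally came out with

$$\frac{ee'}{r^2}\left[1 + \frac{1}{c^2}\left(u^2 - \frac{3}{2}\dot r^2\right)\right], \tag{4.2.8}$$

$$\frac{ee'}{r^2}\left[1 + \frac{1}{c^2}\left(r\ddot r - \frac{1}{2}\dot r^2\right)\right]. \tag{4.2.9}$$

Both expressions reduce to Coulomb’s law in the static limit. The constant c appearing in these equations is $\sqrt{\kappa/\kappa'}$.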

Maxwell attributes the first of these expressions, (4.2.8), to Gauss, found posthumously in his notebooks dated July 1835, and the second to Weber, published in 1867 under the title “Determinations of electrodynamic measure.” According to Gauss, “two elements of electricity in a state of relative motion attract or repel one another, but not in the same way as if they are in a state of relative rest.”

Actually, both expressions (4.2.8) and (4.2.9) appear on the same page in Gauss’s notebooks, and Maxwell probably wanted to distinguish the two expressions by giving them two different names, since the latter is what appears in Weber’s book. But it is on (4.2.8) that he based his view of the non-instantaneous, ballistic transmission of energy, and it is the one which Ritz will re-derive in the absence of accelerations and for a specific value of his parameter, λ = −1. So, as O’Rahilly [38] remarks, Ritz’s paternity is directly traceable to Gauss. We will come to Ritz’s emission theory in the next section, but first we want to see what Maxwell had to say about (4.2.8) and (4.2.9), and how he ultimately distanced himself from them.

While Maxwell admits that the two expressions give the same mechanical force between two currents and, in this sense, are identical to Ampère’s law, (4.2.7), they differ when considered as laws of nature. In particular, do they obey the conservation of energy? Here, Maxwell relies on conventional wisdom by stating that the proof applies only to forces acting between particles that are functions of the distance alone, and not to those depending upon “the time, or the velocities of the particles, [for which] the proof would not hold.” This rings of Tait’s earlier claim near the start of Sec. 4.1. Maxwell concludes that

a law of electrical action, involving the velocity of the particles, has sometimes been supposed to be inconsistent with the principle of the conservation of energy.

In reality, Maxwell is not willing to go that far and, in the next paragraph, says that only Gauss’s formula, (4.2.8), is inconsistent with the conservation of energy “and therefore must be abandoned.” What supposedly saves Weber’s force from the same fate is that it is derivable from a potential,

$$L = \frac{ee'}{r}\left(1 - \frac{1}{c^2}\dot r^2\right). \tag{4.2.10}$$

The work involved in moving the particle from the beginning to the end of any path segment is ψ1 − ψ0, where ψ is Maxwell’s symbol for this potential. Now, says Maxwell, ψ depends only on the distance r and its rate of change, $\dot r$, so that when a particle is moved in a closed path, the potential will be the same as when it started. That is, “no work will be done on the whole during the cycle of operations.”

Notwithstanding this, Maxwell goes on to cite Helmholtz’s criticism that Weber’s force can become the seat of a perpetual motion machine by the fact

that two electrified particles, which move according to Weber’s law, may have at first finite velocities, and yet, while still at a finite distance from each other, they may acquire an infinite kinetic energy, and may perform an infinite amount of work.

So it appears that Sadi Carnot’s use of perpetual motion to outlaw certain forms of cyclic motion lasted well into the nineteenth century. But here we are not talking about a machine that can operate over a cycle and still have energy to burn, but one of instantaneous motion. Weber’s rebuttal consisted in saying that any such particle would have a velocity greater than c, and that the distances at which the particles’ energies would be infinite would be so small as to go unperceived.

The former would have been a sufficient answer to Helmholtz, but the latter gave Helmholtz space to maneuver. Consider, says Helmholtz, a non-conducting sphere of radius a and uniform surface charge σ. A particle of mass m, carrying a charge e and moving at a speed v, will have a potential, according to Weber, given by

$$4\pi\sigma a e\left(1 - \frac{v^2}{6c^2}\right),$$

which is completely independent of the position of the particle within the sphere. Adding this to the kinetic energy, $\tfrac{1}{2}mv^2$, of the particle, and to whatever other potential energies, V, may be present, the conservation of energy requires

$$\frac{1}{2}\left(m - \frac{4}{3}\frac{\pi\sigma a e}{c^2}\right)v^2 + 4\pi\sigma a e + V = \text{const.} \tag{4.2.11}$$

Interestingly, a mass term of the form of the second term in (4.2.11), but with opposite sign, will be found by Thomson in 1881 for a slowly moving charge (Sec. 5.9). But, with the sign as indicated in the formula, Helmholtz argued that the second term in the coefficient of v² could be increased indefinitely by increasing a, while keeping the surface density, σ, constant, so that the coefficient of v² can become negative. A negative kinetic energy could further be made more negative by frictional terms, which ordinarily oppose the motion, thereby decreasing the kinetic energy, and would thereby lead to a perpetual motion machine when carried over a cycle. Any potential which introduces a negative sign in the coefficient of v² would lead to the same conclusion.
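Helmholtz’s argument turns on the sign of the coefficient of v² in (4.2.11), so it may help to see where that sign flips. The following is only a minimal sketch: the function names and the unit-free numbers are illustrative assumptions, not values taken from Helmholtz or Maxwell.

```python
import math

def v2_coefficient(m, sigma, a, e, c):
    """Bracketed coefficient of v^2/2 in (4.2.11): m - (4/3) pi sigma a e / c^2."""
    return m - (4.0 / 3.0) * math.pi * sigma * a * e / c**2

def critical_radius(m, sigma, e, c):
    """Sphere radius at which that coefficient vanishes (hypothetical helper)."""
    return 3.0 * m * c**2 / (4.0 * math.pi * sigma * e)

# Illustrative, unit-free numbers chosen only to exercise the formula.
m, sigma, e, c = 1.0, 0.5, 2.0, 10.0
a_crit = critical_radius(m, sigma, e, c)

for a in (0.5 * a_crit, a_crit, 2.0 * a_crit):
    coeff = v2_coefficient(m, sigma, a, e, c)
    print(f"a = {a:8.3f}   coefficient of v^2/2 = {coeff:+.3f}")
```

Below the critical radius the coefficient is positive; above it the coefficient is negative, which is the regime Helmholtz exploited.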

Now, Maxwell insists that Weber’s law is “consistent with the principle of the conservation of energy insofar that a potential exists,” while Gauss’s law, (4.2.8), or what was supposedly attributed to him, is not. However, introducing [O’Rahilly 38, p. 525]

$$\dot r = v_r - v'_r = u_r,$$

$$\ddot r = \frac{d}{dt}\sum_x\frac{(v_x - v'_x)(x - x')}{r} = \frac{u^2 - u_r^2}{r} + a_r - a'_r,$$

since

$$u^2 = \sum_x (v_x - v'_x)^2,\qquad v_r - v'_r = \sum_x\frac{(v_x - v'_x)(x - x')}{r},\qquad a_r - a'_r = \sum_x\frac{(\dot v_x - \dot v'_x)(x - x')}{r},$$

into Weber’s formula, (4.2.9), gives precisely Gauss’s law, (4.2.8), when the accelerations, $a_r$ and $a'_r$, are omitted! So the whole argument lodged against Gauss’s formula, (4.2.8), is entirely unjustified. To make matters worse, Helmholtz criticized Weber’s formula, (4.2.9), precisely because of the presence of acceleration, which does not apply to Gauss’s (4.2.8). Thus Maxwell cannot maintain that, since Gauss’s law is inconsistent with the principle of the conservation of energy, it will not explain all the laws of induction and, hence, is unacceptable.
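The reduction of Weber’s bracket to Gauss’s can be checked numerically. The sketch below assumes two particles in uniformly accelerated motion and obtains $\ddot r$ by a finite difference, so that the identity quoted from O’Rahilly is tested rather than assumed. The helper names and the sample numbers are hypothetical.

```python
import math

def separation(t, p, q):
    """Distance between particles p and q at time t.

    Each particle is a tuple (x0, v0, a0) of 3-component lists describing
    uniformly accelerated motion x(t) = x0 + v0 t + a0 t^2 / 2.
    """
    total = 0.0
    for k in range(3):
        xp = p[0][k] + p[1][k] * t + 0.5 * p[2][k] * t * t
        xq = q[0][k] + q[1][k] * t + 0.5 * q[2][k] * t * t
        total += (xp - xq) ** 2
    return math.sqrt(total)

def brackets(p, q, c, h=1e-4):
    """Return (Gauss bracket, Weber bracket, r (a_r - a'_r) / c^2) at t = 0."""
    d = [p[0][k] - q[0][k] for k in range(3)]    # relative position
    dv = [p[1][k] - q[1][k] for k in range(3)]   # relative velocity
    da = [p[2][k] - q[2][k] for k in range(3)]   # relative acceleration
    r = math.sqrt(sum(x * x for x in d))
    u2 = sum(x * x for x in dv)
    r_dot = sum(dv[k] * d[k] for k in range(3)) / r
    da_r = sum(da[k] * d[k] for k in range(3)) / r
    # Second derivative of r by central difference, independent of the identity.
    r_ddot = (separation(h, p, q) - 2.0 * r + separation(-h, p, q)) / h**2
    gauss = 1.0 + (u2 - 1.5 * r_dot**2) / c**2           # bracket of (4.2.8)
    weber = 1.0 + (r * r_ddot - 0.5 * r_dot**2) / c**2   # bracket of (4.2.9)
    return gauss, weber, r * da_r / c**2

c = 10.0
zero = [0.0, 0.0, 0.0]
p_free = ([1.0, 0.0, 0.0], [0.3, 0.2, 0.0], zero)
q_free = ([0.0, 0.5, 0.2], [-0.1, 0.4, 0.1], zero)
p_acc = (p_free[0], p_free[1], [0.05, -0.02, 0.01])
q_acc = (q_free[0], q_free[1], [-0.03, 0.04, 0.0])

g0, w0, _ = brackets(p_free, q_free, c)
g1, w1, corr = brackets(p_acc, q_acc, c)
print("no accelerations:   Weber - Gauss =", w0 - g0)
print("with accelerations: Weber - Gauss =", w1 - g1, " expected", corr)
```

With the accelerations switched off the two brackets agree to within the finite-difference error; with them switched on, the difference is the radial acceleration term alone.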

To aggravate matters further, there is a sign error in (4.2.10), which should read [cf. (4.1.5)]

$$L = \frac{ee'}{r}\left(1 + \frac{u_r^2}{2c^2}\right). \tag{4.2.12}$$

Now, we are free to choose r and $u_r$ as the independent variables, as well as x and $v_x$. In the former case we get

$$F_r = -\frac{\partial L}{\partial r} + \frac{d}{dt}\frac{\partial L}{\partial u_r}, \tag{4.2.13}$$

which is Weber’s law, (4.2.9), or Gauss’s, (4.2.8), if the accelerations are negligible. In the second case, we must bear in mind that r is not independent of x, i.e.

$$\frac{\partial u_r}{\partial x} = \frac{\partial}{\partial x}\sum\frac{(v_x - v'_x)(x - x')}{r} \tag{4.2.14}$$

$$= \frac{v_x - v'_x}{r} - \sum\frac{(v_x - v'_x)(x - x')}{r^2}\cos(rx). \tag{4.2.15}$$

We thus get the force in the x-direction,

$$F_x = -\frac{\partial L}{\partial x} + \frac{d}{dt}\frac{\partial L}{\partial v_x},$$

which introduces a cos (rx) into the force law, reducing to (4.2.12) when r lies along the x-axis.
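The statement that (4.2.13) reproduces Weber’s law can be checked in a few lines. The following is a sketch of the intermediate steps, assuming only (4.2.12) and $u_r = \dot r$:

```latex
\begin{align*}
-\frac{\partial L}{\partial r}
  &= \frac{ee'}{r^2}\left(1 + \frac{\dot r^2}{2c^2}\right), \\
\frac{d}{dt}\frac{\partial L}{\partial u_r}
  &= \frac{d}{dt}\left(\frac{ee'}{c^2}\,\frac{\dot r}{r}\right)
   = \frac{ee'}{c^2}\left(\frac{\ddot r}{r} - \frac{\dot r^2}{r^2}\right), \\
F_r
  &= \frac{ee'}{r^2}\left[1 + \frac{1}{c^2}\left(r\ddot r - \frac{1}{2}\dot r^2\right)\right].
\end{align*}
```

The last line is (4.2.9). With the negative sign of (4.2.10) in place of the positive one, the same steps would not return Weber’s bracket, which is the point of the correction made in (4.2.12).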

Consequently, we must change the negative sign to a positive one in (4.2.11). We must now inquire into what it means. Recalling the definition of the electrokinetic potential, (4.1.2), or $L = e(\phi - (u_r/c)A_r)$, the induced emf is

$$E = \frac{d}{dt}\int\frac{\partial L}{\partial u_r}\,dr. \tag{4.2.16}$$

Introducing Weber’s expression for the electrokinetic potential, setting e = e′, and considering a spherical conductor of radius 3a, we come out with

$$E = \frac{e^2}{3a}\,\frac{u_r^2}{c^2}. \tag{4.2.17}$$

This particular choice allows us to identify it with Thomson’s expression in Sec. 5.9 for a medium of unit permittivity. There will be an additional contribution to the kinetic energy, which is now

$$\frac{1}{2}\left(m + \frac{2}{3}\frac{e^2}{ac^2}\right). \tag{4.2.18}$$

Now, Thomson [20] attributes the increase in the mass to the magnetic field surrounding the charge, which has been created by its motion. Here, we appreciate it as due to magnetic induction, which is a term of order $u_r^2/c^2$ that cannot be neglected. So it is, in fact, in April of 1872 that Helmholtz unwittingly discovered the inertia of energy due to a circulating electric charge. Hindsight has indeed twenty-twenty vision!

Interestingly, the criticism lodged against Weber by Helmholtz can be used against Clausius’s expression,

$$L = \frac{ee'}{r}\left(1 - \sum v_x v'_x/c^2\right),$$

because, in the words of Maxwell, “This impossible result [acceleration causing a decrease in the kinetic energy] is a necessary consequence of assuming any formula for the potential which introduces negative terms into the coefficient of v².” But it cannot be used against Riemann, whose expression is

$$L = \frac{ee'}{r}\left(1 + u^2/c^2\right),$$

and whose formulation Clausius so viciously attacked.

Maxwell tells us[e]

The mathematical investigation given by Riemann has been examined by Clausius, who does not admit the soundness of the mathematical processes, and shews that the hypothesis that potential is propagated like light does not lead either to the formula of Weber, or to the known laws of electrodynamics.

What Riemann did was to derive phenomena related to induction from a modified form of Poisson’s law,

$$\nabla^2 V + 4\pi\rho = \frac{1}{\alpha^2}\frac{\partial^2 V}{\partial t^2},$$

where V is the electrostatic potential, ρ the charge density, and α the velocity of propagation. Riemann was coming too close for comfort, and the only exception Maxwell could take is that he avoids “making explicit mention of any medium through which the propagation takes place.” To modern relativists, this is no exception at all, and it smacks of the petty differences drawn to distinguish Poincaré’s principle of relativity from that of Einstein.

Coming closer to the true motivation behind Maxwell’s criticism, Maxwell refers to the 1845 letter of Gauss to Weber in which he considers the action between charged particles not to be instantaneous (action at a distance), but, rather, “propagated in time, in a similar manner to that of light.” A finite-time propagation mechanism would undoubtedly involve wave propagation, but Gauss probably did not think it was that simple, given the complicated angular dependencies of Ampère’s law, (4.2.5).

But Maxwell found it suitable for his needs to claim that Clausius showed that the hypothesis that “potential is propagated like light does not lead either to the formula of Weber, or to the known laws of electrodynamics.” Be that as it may, the exact same criticism can be leveled against Maxwell’s potentials and fields! Maxwell, however, did not take either Weber’s force or Ampère’s law as a criterion that had to be fulfilled. This is clearly seen in the following question posed by Maxwell.

[e] Riemann presented his article to the Royal Society of Göttingen in 1858 but had to withdraw it because of Clausius’s opposition to its publication. It was later published in the Annalen, posthumously, in 1867. This is another example of unjust and biased refereeing where Clausius missed his mark.

“If something is transmitted from one particle to another at a distance,” queries Maxwell, “what is its condition after it has left the one particle and before it has reached the other? If this something is in the form of potential energy how then does it exist in the intervening time, after it left one particle and before it reached its destination?” According to Maxwell, there ought to be a medium which can house and transport this energy, and it is on this medium that we must focus our attention. And, “this has been my constant aim in this treatise.”

We have seen in Sec. 3.8.1 that this type of reasoning led Maxwell to drop the whole idea of finding an explanation of gravity as a field effect, and witnessed in Sec. 3.8.2 that Ritz’s approach was far superior in that it could explain all the deviations from Newtonian theory known at that time. What was so fruitful to Maxwell in the derivation of his equations became a deterrent to further progress.

4.2.4 Ritz’s electrodynamic theory of emission

4.2.4.1 Absolute versus relative velocities

What is not intuitive at all is Einstein’s second postulate, which says that all observers will always measure the same speed of light regardless of their motion. It abolishes the parallelogram addition of velocities and replaces it by a composition law which makes sure that the speed of light is never exceeded no matter how many velocities are added together. Einstein needed this composition law because it explained the first-order correction of the Fizeau experiment [cf. Sec. 3.1], but Ritz is quoted as saying “it would be deplorable for our economy of thought if we had to accept such complications.”

Walther Ritz was a young Swiss physicist who is best noted for his combination principle and the Rayleigh–Ritz perturbation technique. As we have seen, he did much more: he predicted the advance of the perihelion of Mercury (Sec. 3.8.2) from his force equation, and he formulated the only serious alternative to special relativity and Maxwell’s electrodynamics.

Ritz [08] asked the question “Do [Maxwell’s] equations really deserve such extreme confidence?” His immediate response was “The answer to this question is decisively no.” His most damning criticism was that Maxwell’s field equations admit an infinite number of solutions, many of which are unphysical. To eliminate such solutions we must invoke retarded potentials. And if we are to deal directly with the scalar and vector potentials, what good are the field equations which determine the evolution of the electric and magnetic fields?

Moreover, we never observe the fields themselves, but, rather, deduce them from the (Lorentz) force that acts upon charged particles. This force is reversible in time: when we reverse the velocities we also have to reverse the magnetic field. However, radiation is clearly an irreversible phenomenon, so Maxwell’s equations together with Lorentz’s force are unable to cope with the real nature of irreversible radiative phenomena. We will see that Ritz obtained the self-reaction of the electron upon itself when it radiates without the support of either Maxwell or Lorentz.

Ritz assumed that all charged particles constantly emit ‘fictitious’ charged particles which are infinitely small. If the charges are in motion, then the velocity of these particles would be the vector sum of the velocity of the charges, at the instant of emission, and the velocity of light. Ritz insisted that his was not a true theory, but only one example where Lorentz invariance is not part and parcel of every relativistic phenomenon. Oddly enough, Lorentz invariance, or better the invariance of the cross-ratio, is at the very heart of the additivity of the velocities when the velocities are hyperbolic ones.

Consider a circle of radius c whose center is located at the origin, and let us calculate the cross-ratio for the collinear points (−c, 0, u, c). Writing $\bar u$ for the hyperbolic velocity corresponding to u, as we know from Sec. 2.2.4 it is given by

$$\frac{2}{c}\,\bar u = \ln\left(\frac{c}{c-u}\cdot\frac{c+u}{c}\right) = \ln\left(\frac{c+u}{c-u}\right) = \ln\{0, u\,|\,c, -c\},$$

where c is the absolute constant which determines the scale. The Maclaurin expansion of the logarithm shows that for small velocities $\bar u \approx u$, while at large velocities $\bar u \to \infty$ as $u \to c$.

Now if we want to add two hyperbolic velocities, $\bar u$ and $\bar v$, we get

$$\bar u + \bar v = \frac{c}{2}\ln\left(\frac{c+u}{c-u}\cdot\frac{c+v}{c-v}\right) = \frac{c}{2}\ln\left(\frac{c+w}{c-w}\right), \tag{4.2.19}$$

where

$$w = \frac{u+v}{1+uv/c^2},$$

which is precisely the relativistic composition law of velocities. And it is precisely this law that guarantees the additivity of their hyperbolic counterparts! In other words,

Lorentz invariance in Euclidean space guarantees the additivity of the velocities in hyperbolic space. Whereas Euclidean velocities cannot exceed c, there are no restrictions placed upon hyperbolic velocities, which are, in fact, additive.
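The additivity stated above is easy to confirm numerically. The sketch below takes the hyperbolic velocity to be (c/2) ln[(c + u)/(c − u)], as in the cross-ratio expression, and compares the sum of two hyperbolic velocities with the hyperbolic velocity of the relativistically composed w. The function names and sample speeds are illustrative assumptions.

```python
import math

def hyperbolic_velocity(u, c):
    """(c/2) ln[(c + u)/(c - u)], the hyperbolic measure of the Euclidean speed u."""
    return 0.5 * c * math.log((c + u) / (c - u))

def compose(u, v, c):
    """Relativistic composition law w = (u + v) / (1 + u v / c^2)."""
    return (u + v) / (1.0 + u * v / c**2)

c = 1.0
u, v = 0.6, 0.7
w = compose(u, v, c)

lhs = hyperbolic_velocity(u, c) + hyperbolic_velocity(v, c)
rhs = hyperbolic_velocity(w, c)
print("w =", w)                      # stays below c
print("sum of hyperbolic velocities:", lhs)
print("hyperbolic velocity of w:    ", rhs)
```

The parallelogram sum u + v = 1.3 would exceed c, whereas w does not; the hyperbolic velocities simply add.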

Thus, the violation of the parallelogram law for the addition of (Euclidean) relativistic velocities should be an indication of the hyperbolic nature of the velocity space. Ritz died in 1909, and his ‘ballistic,’ or c + u, model of radiation seems to have met a similar fate in 1913, when de Sitter’s binary star observations failed to show the c + u effect. Basically, de Sitter argued that if the velocity of light emitted by a binary star were additive with the velocity at which the star is moving, then certain effects should be observable (which he did not observe). In other words, there would be an interaction between the light that is emitted at the slower velocity, c − u, when the star is on the side of the orbit moving away from the observer, and the light emitted at the faster velocity, c + u, one-half orbit later when it is approaching the observer. At the point where the faster light overtakes the slower light emitted one-half orbit earlier, the light from the star should be observed in two different parts of its orbit simultaneously [Fox 62].

If ℓ is the distance between the star and the observer when it is receding from him, the time it will take for slow light to arrive is

$$t_s = \frac{\ell}{c-u},$$

while the time it takes for fast light to reach the observer is

$$t_f = \tau + \frac{\ell}{c+u},$$

where τ is the time for the star to complete one-half of its orbit. At the exact time where $t_s = t_f$, the time to complete one-half of its orbit will be

$$\tau = \ell\left(\frac{1}{c-u} - \frac{1}{c+u}\right) = \frac{2\ell u}{c^2-u^2},$$

which is the same reasoning used in the Michelson–Morley experiment, in Sec. 3.2, except that there the sum, (3.2.1), instead of the difference was taken. This is further evidence that the Michelson–Morley interpretation is a ballistic one. From the above equation one solves for ℓ and obtains approximately τc²/2u.
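To get a feeling for the size of the overtaking distance, the relation between τ and ℓ can be solved exactly and compared with the approximation τc²/2u. The sketch below uses made-up orbital numbers purely for illustration; they are not de Sitter’s data.

```python
def overtaking_distance(tau, u, c):
    """Exact ell from tau = 2 ell u / (c^2 - u^2)."""
    return tau * (c**2 - u**2) / (2.0 * u)

def overtaking_distance_approx(tau, u, c):
    """Leading-order estimate tau c^2 / (2 u), valid for u much smaller than c."""
    return tau * c**2 / (2.0 * u)

# Hypothetical binary: orbital speed 30 km/s, half-period 1e6 s.
c = 299792.458   # km/s
u = 30.0         # km/s
tau = 1.0e6      # s

ell = overtaking_distance(tau, u, c)
ell_approx = overtaking_distance_approx(tau, u, c)

t_slow = ell / (c - u)          # light emitted while receding
t_fast = tau + ell / (c + u)    # light emitted half an orbit later
print("ell (km):", ell)
print("relative error of the approximation:", (ell_approx - ell) / ell)
print("t_slow - t_fast (s):", t_slow - t_fast)
```

The relative difference between the exact and approximate distances is of order u²/c², which is why the approximate form suffices in the text.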

The speed of light is the absolute constant of hyperbolic velocity space, and it makes no sense to add to it the hyperbolic velocity $\bar u$. We can, however, consider a Euclidean velocity c/η, where η is the index of refraction of the medium that light is propagating in. In fact, the so-called ‘extinction theorem’ is an argument in favor of Ritz’s ballistic theory. The theorem, supposedly put forth by Ewald and Oseen [Born & Wolf 59],

shows how an external electromagnetic disturbance traveling with the velocity of light in vacuum is exactly canceled out and replaced in a substance by the secondary disturbance traveling with an appropriately smaller speed.

Terrestrial extinction occurs on the surface of a dipole field, which has an extinction length of a thin surface layer of $10^{-4}$ cm, whereas interstellar extinction lengths due to gaseous envelopes surrounding binary stars are still small in comparison to the distance τc²/2u.

Granted, there is no Doppler shift for c + u because c is the absolute constant. But suppose that the emitter traveling at speed u is Doppler-shifted along with the emitted radiation in a medium whose index of refraction is η. Setting c/η = c′, Ritz’s addition law, written for hyperbolic velocities (denoted by bars), would be

$$\bar c' + \bar u' = c\ln\left(\frac{c+u'}{c-u'}\right) = c\ln\left(\frac{c+u}{c-u}\cdot\frac{\eta+1}{\eta-1}\right), \tag{4.2.20}$$

where

$$u' = \frac{c/\eta + u}{1 + u/\eta c}.$$

Although c > u′ > c′, the left-hand side of (4.2.20) can be of unlimited magnitude the closer u′ is to c, or the closer η is to 1.

Many criticisms have been lodged against emission theories, notably by Pauli [58]. Pauli questions whether one measures a change in frequency or a change in wavelength, since emission theory does not require their product to be constant, i.e. λν = c. If a star approaching the Earth at a speed u emits radiation at a frequency ν, it will be seen by an earthling at frequency ν′ = ν(1 + u/c), but with no change in wavelength. According to Ritz, this is due to the fact that the emitter, moving at constant speed, emits concentric spherical wavefronts, each one of which is centered on the source as it moves. Thus, the wavefronts will diverge with no interference. Rather, if there is an intervening medium present which re-emits radiation at a speed c, then there will be a decrease in wavelength by the amount λ/(1 + u/c).

A change in the speed of light would also occur when light from outer space enters the atmosphere with an index of refraction different from unity. The extra-terrestrial Michelson experiment was performed by Tomaschek in 1924, giving the same null result found in the terrestrial experiment. This provided “telling evidence against the Ritz theory” [Fox 65]. But, as we know from Sec. 3.2, the explanation of the null experiment rests on the supposition that light is traveling at two different speeds, c + u if the emitter is approaching the observer and c − u if it is receding from the observer.

Pauli also criticized emission theories in that if a light wave were traveling at velocity c + u, it could not interact with particles scattering radiation at a velocity c. Fox [65] rebuffs this criticism on the basis that the interaction can only occur when their frequencies, and not their velocities, are equal. Scattering of radiation will indeed change the velocities of the waves, and there will be a tendency to ‘localize’ their velocities about c with an obliteration of their phase differences, in much the same way that the extinction theory predicts.

4.3 Radiation by an Accelerating Electron

4.3.1 What does the radiation reaction force measure?

The radiation reaction force,

$$\mathbf F_{\rm rad} = \frac{2e^2}{3c^3}\,\gamma^2\left[\ddot{\mathbf u} + \frac{\gamma^2}{c^2}\,\mathbf u(\mathbf u\cdot\ddot{\mathbf u}) + \frac{3\gamma^2}{c^2}\,\dot{\mathbf u}(\mathbf u\cdot\dot{\mathbf u}) + \frac{3\gamma^4}{c^4}\,\mathbf u(\mathbf u\cdot\dot{\mathbf u})^2\right], \tag{4.3.1}$$

was first derived by Abraham [05] in 1905, by taking the time-derivative of his ‘electromagnetic momentum,’ and deduced directly by Schott [12], who called it a ‘radiation pressure.’ Schott “dismissed it very briefly” by claiming that

It is a small quantity of the order zero, and depends only on the magnitude of the charge and its mean motion, not at all on its configuration nor on the relative motion of its parts. It is not difficult to prove that the expression for [Frad] which is in question, can be obtained by means of the Lorentz–Einstein transformation from the well known expression for the radiation pressure on an electric charge e, which is vibrating with high frequency but small amplitude, so that its velocity is always very small while its acceleration is finite. The radiation pressure on such a charge is equal to

$$\frac{2e^2}{3c^3}\,\frac{d^3\mathbf r}{dt^3},$$

where r denotes its radius vector and t the time. If we regard this system as moving relative to a fixed system with velocity v, considered as constant for the time being, and transformed by the method of Lorentz and Einstein, we obtain the expression [(4.3.1)] for [Frad], for a charge e moving with the velocity v, now regarded as variable.

The transformation from an inertial frame to a non-inertial one, after a Lorentz transform has been performed, is, indeed, miraculous. We prefer to refer to (4.3.1) as a radiation reaction force rather than as a radiation pressure, for it represents the self-interaction of the electron caused by its own radiation.

The radiation reaction force, (4.3.1), can also be derived by an expansion in inverse powers of c either from the Liénard, (4.1.8), or the Ritz, (4.1.7), expression for the force. The first term in (4.3.1) is the force exerted by a charge on itself, which is independent of the dimensions of the body, and represents a sort of ‘friction’ due to loss of energy. The radiation force (4.3.1), unlike the Lorentz force, (4.1.11), is clearly not invariant under time-reversal.

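However, if we go back to the original source [Abraham 05], we see that Abraham began with the Liénard expression for the rate of loss of energy due to radiation,

$$\int_{t_1}^{t_2} dt\, W_{\rm rad} = -\frac{2}{3}\frac{e^2}{c^3}\int_{t_1}^{t_2} dt\,\gamma^2\left\{\dot{\mathbf u}^2\gamma^2 + (\mathbf u\cdot\dot{\mathbf u})^2\gamma^4/c^2\right\}, \tag{4.3.2}$$

and observed that the reaction force was $\mathbf u/c^2$ times this quantity, i.e.

$$\int_{t_1}^{t_2} dt\,\mathbf F_{\rm rad} = -\frac{2}{3}\frac{e^2}{c^5}\int_{t_1}^{t_2} dt\,\gamma^2\,\mathbf u\left\{\dot{\mathbf u}^2\gamma^2 + (\mathbf u\cdot\dot{\mathbf u})^2\gamma^4/c^2\right\}. \tag{4.3.3}$$

Then, considering the time intervals of (4.3.2) and (4.3.3) to be short, and observing the following integrations by parts,

$$\int_{t_1}^{t_2} dt\,\gamma^2\ddot{\mathbf u} = \dot{\mathbf u}\gamma^2\Big|_{t_1}^{t_2} - \int_{t_1}^{t_2} dt\, 2\gamma^4\dot{\mathbf u}(\mathbf u\cdot\dot{\mathbf u})/c^2,$$

$$\int_{t_1}^{t_2} dt\,\gamma^4\mathbf u(\mathbf u\cdot\ddot{\mathbf u})/c^2 = \mathbf u(\mathbf u\cdot\dot{\mathbf u})\frac{\gamma^4}{c^2}\Big|_{t_1}^{t_2} - \int_{t_1}^{t_2} dt\,\frac{\gamma^4}{c^2}\left\{\dot{\mathbf u}^2\mathbf u + \dot{\mathbf u}(\mathbf u\cdot\dot{\mathbf u}) + 4(\mathbf u\cdot\dot{\mathbf u})^2\mathbf u\,\gamma^2/c^2\right\},$$

because

$$\frac{d}{dt}\gamma = \gamma^3\,\frac{\mathbf u\cdot\dot{\mathbf u}}{c^2}.$$

Then, assuming uniform acceleration, he sums the two sides to obtain

$$\int_{t_1}^{t_2} dt\,\gamma^2\left\{\ddot{\mathbf u} + \mathbf u(\mathbf u\cdot\ddot{\mathbf u})\gamma^2/c^2\right\} = -\int_{t_1}^{t_2} dt\,\frac{\gamma^4}{c^2}\left\{\mathbf u\,\dot{\mathbf u}^2 + 3\dot{\mathbf u}(\mathbf u\cdot\dot{\mathbf u}) + 4\mathbf u(\mathbf u\cdot\dot{\mathbf u})^2\frac{\gamma^2}{c^2}\right\}.$$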

If the radiation reaction force is given by (4.3.3), Abraham gets back (4.3.1).

But how is it possible that the integrand of (4.3.3) equals (4.3.1)? It does not! What is equal to (4.3.1) is

$$\mathbf F_{\rm rad} = \frac{2e^2}{3c^3}\frac{d}{dt}\left[\dot{\mathbf u}\gamma^2 + \mathbf u(\mathbf u\cdot\dot{\mathbf u})\frac{\gamma^4}{c^2}\right] + \frac{\mathbf u}{c^2}W_{\rm rad}. \tag{4.3.4}$$

However, under the condition of uniform acceleration the terms involving the total derivative vanish, so this cannot be right. Actually, under the condition of uniform acceleration the entire radiation reaction force, (4.3.1), vanishes!
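The equality between (4.3.4) and (4.3.1) can be tested numerically along an arbitrary trajectory. The sketch below works in units with c = 1 and with the prefactor 2e²/3c³ set to 1, takes the first term inside the total derivative to be the vector $\dot{\mathbf u}\gamma^2$, and evaluates that derivative by a central difference. The sample velocity history and all helper names are hypothetical.

```python
import math

C = 1.0  # units with c = 1; the prefactor 2e^2/(3c^3) is also set to 1

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def add(*vectors):
    return tuple(sum(components) for components in zip(*vectors))

def scale(s, a):
    return tuple(s * x for x in a)

def kinematics(t):
    """An arbitrary smooth velocity history u(t) with its first two derivatives."""
    u = (0.3 * math.sin(t), 0.2 * math.cos(2 * t), 0.1 * t)
    ud = (0.3 * math.cos(t), -0.4 * math.sin(2 * t), 0.1)
    udd = (-0.3 * math.sin(t), -0.8 * math.cos(2 * t), 0.0)
    return u, ud, udd

def gamma2(u):
    return 1.0 / (1.0 - dot(u, u) / C**2)

def force_431(t):
    """Right-hand side of (4.3.1)."""
    u, ud, udd = kinematics(t)
    g2 = gamma2(u)
    bracket = add(
        udd,
        scale(g2 * dot(u, udd) / C**2, u),
        scale(3.0 * g2 * dot(u, ud) / C**2, ud),
        scale(3.0 * g2**2 * dot(u, ud) ** 2 / C**4, u),
    )
    return scale(g2, bracket)

def bracket_434(t):
    """Vector inside the total derivative of (4.3.4)."""
    u, ud, _ = kinematics(t)
    g2 = gamma2(u)
    return add(scale(g2, ud), scale(g2**2 * dot(u, ud) / C**2, u))

def w_rad(t):
    """Integrand of (4.3.2), sign included."""
    u, ud, _ = kinematics(t)
    g2 = gamma2(u)
    return -g2 * (g2 * dot(ud, ud) + g2**2 * dot(u, ud) ** 2 / C**2)

def force_434(t, h=1e-5):
    """Right-hand side of (4.3.4), with d/dt taken by central difference."""
    diff = add(bracket_434(t + h), scale(-1.0, bracket_434(t - h)))
    u, _, _ = kinematics(t)
    return add(scale(1.0 / (2.0 * h), diff), scale(w_rad(t) / C**2, u))

t0 = 0.7
f_direct = force_431(t0)
f_split = force_434(t0)
print("(4.3.1):", f_direct)
print("(4.3.4):", f_split)
```

The two vectors agree to within the finite-difference error, so the reaction force consists of a total time derivative plus the term $(\mathbf u/c^2)W_{\rm rad}$, and not of the latter alone.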

To this already confusing situation Rohrlich [65] adds more. He defines the “energy rate of radiation by a charge,” R, to be the negative of (4.3.2). However, the rate of energy loss by radiation is not (4.3.2), but, rather,

$$W_{\rm rad} = \frac{2e^2}{3c^3}\left\{\frac{d}{dt}\left[(\mathbf u\cdot\dot{\mathbf u})\gamma^2\right] - \dot{\mathbf u}^2\gamma^4 + (\mathbf u\cdot\dot{\mathbf u})^2\frac{\gamma^6}{c^2}\right\}. \tag{4.3.5}$$

Supposedly, the terms in the total differential, when evaluated over the ends of the time interval, will vanish under the assumption of uniform acceleration. But, under this condition, (4.3.5) vanishes altogether.

Rohrlich considers R to be “a Lorentz invariant and constitutes the relativistic generalization of the famous nonrelativistic Larmor formula, $(2e^2/3c^3)\dot{\mathbf u}^2$,” probably not realizing that it was derived by Liénard before the advent of relativity. He then goes on to say that one[f]

is tempted to identify the Abraham four-vector [(4.3.1) in vector notation] with the radiation reaction. This, however, leads to various difficulties: it is possible that at a particular instant [$\dot u = 0$] but [$\ddot u \neq 0$]; this results in no radiation emission, R = 0, but a nonvanishing “radiation reaction,” [(4.3.1)]. Conversely, it is possible that [(4.3.1) = 0] but that radiation is being emitted, R. This is the case whenever

$$\ddot u^\mu - \frac{1}{c^2}\,\dot u^\nu\dot u_\nu\,u^\mu = 0,\qquad \dot u^\nu\dot u_\nu \neq 0. \tag{4.3.6}$$

We recognize this equation as the condition for uniform acceleration. Thus, in uniformly accelerated motion radiation is emitted while the ‘radiation reaction’ [Frad] vanishes. For these reasons the interpretation of [Frad] as a radiation reaction force is to be rejected. Obviously, the radiation reaction $-Ru^\mu$ vanishes if and only if no radiation is emitted, R = 0.

However, if (4.3.1) vanishes, so, too, will (4.3.5) vanish, and this does not appear to be compatible with the fact that energy is being radiated! Such a radiation loss would surely affect the particle’s motion, but it is nowhere to be found in the Lorentz–Dirac equation [Rohrlich 90, p. 171, formula preceding (6–80)], making that equation extremely dubious, to the point where its existence is called into question. This would eliminate, at one stroke, problems related to pre-acceleration, lack of causality, and self-accelerating or run-away solutions.

Denoting $u^\nu$ as the velocity four-vector, Abraham’s radiation reaction force (4.3.1) can be written as

$$F^\mu = \frac{2e^2}{3c^3}\left(\ddot u^\mu - \dot u^\nu\dot u_\nu u^\mu\right) = \frac{2e^2}{3c^3}\left\{\ddot{\mathbf u}\gamma^2 + \frac{3}{2}\frac{d\gamma^2}{dt}\,\dot{\mathbf u} + \left[3\dot\gamma^2 + (\mathbf u\cdot\ddot{\mathbf u})\frac{\gamma^4}{c^2}\right]\mathbf u\right\}, \tag{4.3.7}$$

which is the same as (4.3.1), where [Corben 68]

$$\dot u^\nu\dot u_\nu = R = \gamma^4\left(\dot{\mathbf u}^2 + (\mathbf u\cdot\dot{\mathbf u})^2\frac{\gamma^2}{c^2}\right). \tag{4.3.8}$$

[f] The term $u^\mu$ is a velocity four-vector, see (4.3.8) below.

Then, in order for (4.3.7) to be the radiation reaction force (4.3.1), it is necessary that

$$\ddot u^\mu = \frac{d}{dt}\left(\dot{\mathbf u}\gamma^2 + \mathbf u(\mathbf u\cdot\dot{\mathbf u})\frac{\gamma^4}{c^2}\right), \tag{4.3.9}$$

and the condition that this term equal Liénard’s radiation rate is precisely the vanishing of (4.3.1). But it is not Liénard’s radiation rate that remains constant in time [Rohrlich 90, p. 121 formula (5–42) and p. 169 formula (6–114)], but, rather, $\gamma^{-2}$ times that expression, or the Beltrami metric, (4.3.16) below.

With the rate of energy loss being given by

$$W_{\rm rad} = \frac{2e^2}{3c^3}\left[c^2\gamma\ddot\gamma - \gamma^4\dot{\mathbf u}^2\right] = \frac{2e^2}{3c^3}\left(c^2\frac{d}{dt}(\gamma\dot\gamma) - \dot u^\nu\dot u_\nu\right), \tag{4.3.10}$$

and G being identified as the radiation reaction force, (4.3.1), we would have, for small radiation damping and assumed periodic motion at a constant angular velocity ω [Corben 68],

$$W = -\frac{2e^2}{3c^3}\,\omega^2 u^2\gamma^4,$$

$$\mathbf G = -\frac{2e^2}{3c^3}\,\omega^2\mathbf u\,\gamma^4.$$

These equations imply

$$\mathbf u\cdot\mathbf G = W,$$

which is a mechanical relation, rather than a radiative one. This equation is not of the form

$$\mathbf F_{\rm rad} = \mathbf G \neq \mathbf u\,W_{\rm rad}/c^2 = \dot m\,\mathbf u, \tag{4.3.11}$$

as the analogy with special relativity would lead us to believe, with m as the mass equivalent of radiation.

Thus, it makes no sense to consider

$$m\dot{\mathbf u} = \mathbf F_{\rm rad} + \cdots \tag{4.3.12}$$

as an equation of motion for the electron, where the dots indicate terms like the Lorentz force. To see this, it suffices to consider uniform acceleration, where $\mathbf F_{\rm rad} = 0$, and non-constant G. Equation (4.3.12) is often referred to as the Lorentz–Dirac equation, which is a third-order equation, manifesting unphysical ‘run-away’ solutions, pre-acceleration, and the like.

We can, however, identify the acceleration in (4.3.2) with the Lorentz force, in which case it becomes [Landau & Lifshitz 75]

$$W_{\rm rad} = -\frac{2e^2}{3m^2c^3}\,\gamma^2\left\{\left(\mathbf E + \frac{\mathbf u}{c}\times\mathbf H\right)^2 + (\mathbf u\cdot\mathbf E)^2\frac{\gamma^2}{c^2}\right\}.$$

The radiation reaction force, according to Abraham, would be $\mathbf u/c^2$ times this value. Hence, there is no logic to adding the Lorentz force onto an equation which already contains it.

4.3.2 Constant rate of energy loss in hyperbolic velocity space

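There is a much more elegant, and intuitive, way to proceed. We want to generalize Larmor’s formula,

\[
W_{\mathrm{rad}} = -\frac{2}{3}\frac{e^2}{m^2c^3}\left(\frac{d\mathbf{G}}{dt}\cdot\frac{d\mathbf{G}}{dt}\right), \tag{4.3.13}
\]

where $\mathbf{G} = m\mathbf{u}$. The Lorentz-invariant generalization of (4.3.13) is [Jackson 75]

\[
W_{\mathrm{rad}} = -\frac{2}{3}\frac{e^2}{m^2c^3}\left(\frac{dG_{\mu}}{d\tau}\frac{dG^{\mu}}{d\tau}\right), \tag{4.3.14}
\]

where $d\tau = dt/\gamma$ is the proper time element, $G^{\mu}$ is the charged particle’s momentum–energy vector, and the four-vector scalar product is

\[
-\frac{dG_{\mu}}{d\tau}\frac{dG^{\mu}}{d\tau} = \left(\frac{d\mathbf{G}}{d\tau}\right)^2 - \frac{1}{c^2}\left(\frac{dW}{d\tau}\right)^2.
\]

Introducing $W = \gamma mc^2$ and $\mathbf{G} = \gamma m\mathbf{u}$ into the four-vector scalar product gives Liénard’s expression

\[
W_{\mathrm{rad}} = \frac{2}{3}\frac{e^2}{c^3}\,\gamma^6\left\{\dot{\mathbf{u}}^2 - (\mathbf{u}\times\dot{\mathbf{u}})^2/c^2\right\},
\]

or, equivalently,

\[
\frac{2}{3}\frac{e^2}{c^3}\,\gamma^4\left\{\dot{\mathbf{u}}^2 + \gamma^2(\mathbf{u}\cdot\dot{\mathbf{u}})^2/c^2\right\}. \tag{4.3.15}
\]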
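The equivalence of the two bracketed forms of Liénard’s expression can be checked numerically. The following is a minimal sketch, assuming units in which c = 1 and dropping the common prefactor 2e²/3c³; the function name `lienard_forms` and the sample velocity and acceleration vectors are illustrative choices, not taken from the text.

```python
import math


def lienard_forms(u, udot, c=1.0):
    """Return the cross-product and dot-product forms of Lienard's bracket.

    The common prefactor 2e^2/3c^3 is omitted. `u` and `udot` are 3-tuples
    holding the velocity and the coordinate-time acceleration.
    """
    u2 = sum(x * x for x in u)
    gamma = 1.0 / math.sqrt(1.0 - u2 / c**2)
    udot2 = sum(x * x for x in udot)
    u_dot_udot = sum(a * b for a, b in zip(u, udot))
    cross = (
        u[1] * udot[2] - u[2] * udot[1],
        u[2] * udot[0] - u[0] * udot[2],
        u[0] * udot[1] - u[1] * udot[0],
    )
    cross2 = sum(x * x for x in cross)
    form_cross = gamma**6 * (udot2 - cross2 / c**2)                 # form preceding (4.3.15)
    form_dot = gamma**4 * (udot2 + gamma**2 * u_dot_udot**2 / c**2)  # form (4.3.15)
    return form_cross, form_dot


# Hypothetical sample values with |u| < c.
form_cross, form_dot = lienard_forms((0.3, -0.4, 0.5), (1.2, 0.7, -0.9))
print(form_cross, form_dot)
```

The agreement rests on the identity (u × u̇)² = u²u̇² − (u · u̇)², which turns one bracket into the other.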

Now the remarkable thing about (4.3.15) is that if we evaluate the time-derivatives in the co-moving frame using proper time,

\[
W_{\mathrm{rad}} = \frac{2}{3}\frac{e^2}{c^3}\,\gamma^2\left\{\left(\frac{d\mathbf{u}}{d\tau}\right)^2 + \gamma^2\left(\mathbf{u}\cdot\frac{d\mathbf{u}}{d\tau}\right)^2/c^2\right\}, \tag{4.3.16}
\]

we clearly see that it is none other than the Beltrami metric! So the Beltrami metric is the rate of energy loss in a frame which is moving with the electron. It is apparent that the rate of energy loss will be decreased in the co-moving frame by an amount $\gamma^{-2}$.

It is also prodigious that the Beltrami metric,

\[
\left(\frac{ds}{d\tau}\right)^2 = \frac{\gamma^2}{c^2}\left\{\dot{\mathbf{u}}^2 + (\mathbf{u}\cdot\dot{\mathbf{u}})^2\frac{\gamma^2}{c^2}\right\}, \tag{4.3.17}
\]

is an electrokinetic potential from which the Euler–Lagrange equations can be derived. Taking the positive square root of (4.3.17) we obtain the shortest arc length of a curve traced out by the system between times $\tau_1$ and $\tau_2$ from the action principle

\[
s = c\int_{\tau_1}^{\tau_2}\sqrt{2L}\,d\tau, \tag{4.3.18}
\]

where

\[
L = \frac{\gamma^2}{2c^2}\left\{\dot{\mathbf{u}}^2 + (\mathbf{u}\cdot\dot{\mathbf{u}})^2\frac{\gamma^2}{c^2}\right\}, \tag{4.3.19}
\]

and the dot will now stand for the proper time-derivative. The shortest curve that is traced out in the time interval between $\tau_1$ and $\tau_2$ will be that for which the variation of the integral, (4.3.18), vanishes. The Lagrangian, (4.3.19), bears an uncanny similarity to Liénard’s expression for the rate of energy loss due to radiation, (4.3.2). In fact, it is just smaller by a factor of $\gamma^{-2}$!

The Lagrangian, (4.3.19), is a second-order homogeneous function of the accelerations,$^{g}$

\[
\dot{\mathbf{u}}\cdot\frac{\partial L}{\partial\dot{\mathbf{u}}} = 2L,
\]

and we will now see that the Euler–Lagrange equations require that it be a first integral of the motion.

The variational equations (in velocity space!) are

\[
\frac{d}{d\tau}\left(\frac{1}{\sqrt{2L}}\frac{\partial L}{\partial\dot{\mathbf{u}}}\right) - \frac{1}{\sqrt{2L}}\frac{\partial L}{\partial\mathbf{u}} = 0. \tag{4.3.20}
\]

The condition that (4.3.20) becomes the Euler–Lagrange equations is that $L$ must be an integral of the motion. That is to say, the condition that $dL/d\tau = 0$ is

\[
\ddot{\mathbf{u}} = -2\,\frac{(\mathbf{u}\cdot\dot{\mathbf{u}})}{c^2}\,\gamma^2\dot{\mathbf{u}}. \tag{4.3.21}
\]

For then (4.3.20) becomes

\[
\frac{d}{d\tau}\frac{\partial L}{\partial\dot{\mathbf{u}}} - \frac{\partial L}{\partial\mathbf{u}} = 0,
\]

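which is

\[
\gamma^2\left\{\ddot{\mathbf{u}} + \left(\frac{\gamma}{c}\right)^2(\mathbf{u}\cdot\ddot{\mathbf{u}})\,\mathbf{u} + 2\left(\frac{\gamma}{c}\right)^2(\mathbf{u}\cdot\dot{\mathbf{u}})\,\dot{\mathbf{u}} + 2\left(\frac{\gamma}{c}\right)^4(\mathbf{u}\cdot\dot{\mathbf{u}})^2\,\mathbf{u}\right\} = 0. \tag{4.3.22}
\]

The variational equations (4.3.22) will not be satisfied if there is an external force, $\mathbf{F}_{\mathrm{ext}}$, acting on the system. In the co-moving frame, $\gamma\mathbf{F}_{\mathrm{ext}}$ must be added to the right-hand side of (4.3.22), viz.

\[
\begin{aligned}
\mathbf{F}_{\mathrm{ext}} &= \alpha\gamma\left\{\ddot{\mathbf{u}} + \left(\frac{\gamma}{c}\right)^2(\mathbf{u}\cdot\ddot{\mathbf{u}})\,\mathbf{u} + 2\left(\frac{\gamma}{c}\right)^2(\mathbf{u}\cdot\dot{\mathbf{u}})\,\dot{\mathbf{u}} + 2\left(\frac{\gamma}{c}\right)^4(\mathbf{u}\cdot\dot{\mathbf{u}})^2\,\mathbf{u}\right\} \\
&= \alpha\left\{\frac{d}{d\tau}\left[\dot{\mathbf{u}}\gamma + (\mathbf{u}\cdot\dot{\mathbf{u}})\,\mathbf{u}\,\gamma^3/c^2\right] - \frac{\gamma^3}{c^2}\left[\dot{\mathbf{u}}^2 + (\mathbf{u}\cdot\dot{\mathbf{u}})^2\frac{\gamma^2}{c^2}\right]\mathbf{u}\right\},
\end{aligned} \tag{4.3.23}
\]

where α is a constant of proportionality, yet to be determined.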

$^{g}$ $\partial L/\partial\dot{\mathbf{u}}$ is a symbolic representation of the vector whose components are the derivatives of $L$ with respect to the corresponding components of $\dot{\mathbf{u}}$.

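Integrating (4.3.23) over a proper time interval, the first term vanishes, leaving

\[
\int_{\tau_1}^{\tau_2}\mathbf{F}_{\mathrm{ext}}\,d\tau = -\alpha\left\{\int_{\tau_1}^{\tau_2}d\tau\,\frac{\gamma^3}{c^2}\left[\dot{\mathbf{u}}^2 + (\mathbf{u}\cdot\dot{\mathbf{u}})^2\frac{\gamma^2}{c^2}\right]\mathbf{u}\right\}. \tag{4.3.24}
\]

If we set $\alpha = 2e^2/3c^3$, (4.3.24) is the Abraham radiation force, (4.3.3), in the co-moving frame.

Introducing the acceleration [Fock 59],

\[
\mathbf{a} = \dot{\mathbf{u}}\,\gamma^2, \tag{4.3.25}
\]

we can write the Euler–Lagrange equations in the form

\[
\gamma\mathbf{F}_{\mathrm{rad}} = \frac{2e^2}{3c^3}\left(\dot{\mathbf{a}} + (\mathbf{u}\cdot\dot{\mathbf{a}})\,\mathbf{u}\,\frac{\gamma^2}{c^2}\right). \tag{4.3.26}
\]

In a state of uniform acceleration, $\mathbf{a} = \text{const.}$, and in a frame co-moving with the radiating electron the radiation reaction force, (4.3.26), vanishes. There will be a constant rate of energy loss,

\[
W'_{\mathrm{rad}} = -\frac{2e^2}{3c^3}\,2L = -\frac{2e^2}{3c^2}\,\gamma^2\left(\dot{\mathbf{u}}^2 + (\mathbf{u}\cdot\dot{\mathbf{u}})^2\frac{\gamma^2}{c^2}\right) = \text{const.}, \tag{4.3.27}
\]

in the co-moving frame, which is smaller by a factor of $\gamma^{-2}$ than Liénard’s formula, (4.3.15), which is what a stationary observer would measure.

We will see in Sec. 9.6 that the Beltrami metric is the metric for a uniformly rotating disc. Give it a charge and its power loss due to radiation will be given by (4.3.16).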

4.3.3 Radiation at uniform acceleration

The condition for uniform acceleration, (4.3.21), can be generalized to any form of hyperbolic motion,

\[
\ddot{\mathbf{u}} = -n\,\frac{(\mathbf{u}\cdot\dot{\mathbf{u}})}{c^2}\,\gamma^2\dot{\mathbf{u}}, \tag{4.3.28}
\]

for integer n. Abraham’s case is n = 3, while the Beltrami metric corresponds to n = 2. We will begin with the latter.

Consider a uniformly accelerated particle moving in the x-direction, where u is the only component of the velocity in this direction. Criterion (4.3.21), or what is the same, $d(\dot{u}\gamma^2)/d\tau = 0$, can be integrated to give

\[
\frac{\dot{u}}{1 - u^2/c^2} = g, \tag{4.3.29}
\]

where g is the particle’s uniform acceleration. Equation (4.3.29) will easily be recognized as the equation for the velocity of a body falling under the force of gravity, −g, or under the force of electrical attraction. Multiplying both sides by $u/c^2$ gives

\[
\frac{u\dot{u}}{c^2 - u^2} = gu/c^2. \tag{4.3.30}
\]

The negative of (4.3.30) represents the rate at which a homogeneous compression parallel to the direction of motion is taking place [Schott 12, p. 174].

Where does this strain come from? Obviously, we need to consider the electron as finite, say with a bounding surface S(x, y, z, t) = 0 that remains stationary in time, i.e.

\[
\frac{dS}{dt} = \frac{\partial S}{\partial t} + u_x\frac{\partial S}{\partial x} + u_y\frac{\partial S}{\partial y} + u_z\frac{\partial S}{\partial z} = 0. \tag{4.3.31}
\]

We assume that the velocity components are linear functions of the coordinates, viz.

\[
u_x = \sigma_{11}x + \sigma_{12}y + \sigma_{13}z,
\]

and similar expressions for $u_y$ and $u_z$.

Now suppose that the electron undergoes a FitzGerald–Lorentz contraction in the direction of motion, which is the x-direction, so that

\[
x = x_0\sqrt{1 - \beta^2}, \qquad y = y_0, \qquad z = z_0,
\]

where the coordinates $(x_0, y_0, z_0)$ are the coordinates of any point on the electron at rest while (x, y, z) are those of the same point when the electron is in motion, and β = u/c. Consequently, the velocity components are

\[
u_x = -\frac{\beta\dot{\beta}}{1 - \beta^2}\,x, \qquad u_y = 0, \qquad u_z = 0.
\]

Hence, the electron undergoes a pure strain,

\[
\sigma_{11} = -\frac{\beta\dot{\beta}}{1 - \beta^2},
\]

which is what (4.3.30) says.

The condition that the bounding surface remains stationary,

\[
\frac{\partial S}{\partial t} - \frac{\beta\dot{\beta}}{1 - \beta^2}\,x\,\frac{\partial S}{\partial x} = 0,
\]

can be solved for its characteristics by means of Lagrange’s method,

\[
\frac{dx}{-\dfrac{\beta\dot{\beta}}{1 - \beta^2}\,x} = \frac{dy}{0} = \frac{dz}{0} = dt.
\]

The three independent integrals are:

\[
\frac{x}{\sqrt{1 - \beta^2}} = \text{const.}, \qquad y = \text{const.}, \qquad z = \text{const.},
\]

so that the equation of the surface is

\[
S\left(\frac{x}{\sqrt{1 - \beta^2}},\, y,\, z\right) = 0,
\]

when the electron is in motion, and reduces to S(x, y, z) = 0 when at rest.

Returning to (4.3.29), and assuming that the velocity is zero at time t = 0, we get by integration

\[
u = c\tanh(g\tau/c), \tag{4.3.32}
\]

where gτ is the hyperbolic measure of the velocity. Equation (4.3.32) expresses the relative velocity in terms of a segment of a Lobachevsky straight line. If we further assume that x = 0 at τ = 0, we obtain

\[
x = \frac{c^2}{g}\ln\cosh(g\tau/c).
\]

We have set the arbitrary integration constant equal to zero because, for $g\tau \ll c$, it must reduce to the nonrelativistic relation, $x = \tfrac{1}{2}gt^2$, for the motion of a particle with constant acceleration since there is no longer any distinction between proper and coordinate times.

The rate of energy loss due to the uniform acceleration of the electron through radiation is

\[
W_{\mathrm{rad}} = -\frac{2e^2}{3c^3}\,g^2\gamma^2\,\mathrm{sech}^2(g\tau/c) = -\frac{2e^2}{3c^3}\,g^2, \tag{4.3.33}
\]

which is precisely Larmor’s formula, (4.3.13), for constant acceleration. The rate of energy loss, (4.3.33), remains constant in time.
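The n = 2 solution can be checked numerically. The sketch below assumes units with c = 1 and an arbitrary illustrative value of g; the function names are hypothetical. It verifies by finite differences that u = c tanh(gτ/c) satisfies (4.3.29) in the form u̇γ² = g, and that the bracket g²γ² sech²(gτ/c) of (4.3.33) stays equal to g².

```python
import math

C = 1.0   # speed of light (units with c = 1, an assumption)
G = 0.5   # uniform acceleration g (arbitrary illustrative value)


def velocity(tau):
    """u = c tanh(g tau / c), Eq. (4.3.32)."""
    return C * math.tanh(G * tau / C)


def position(tau):
    """x = (c^2 / g) ln cosh(g tau / c)."""
    return (C**2 / G) * math.log(math.cosh(G * tau / C))


def gamma(tau):
    u = velocity(tau)
    return 1.0 / math.sqrt(1.0 - u**2 / C**2)


def du_dtau(tau, h=1e-6):
    """Central-difference estimate of the proper-time derivative of u."""
    return (velocity(tau + h) - velocity(tau - h)) / (2.0 * h)


def rate_bracket(tau):
    """g^2 gamma^2 sech^2(g tau / c): Eq. (4.3.33) without -2e^2/3c^3."""
    return G**2 * gamma(tau)**2 / math.cosh(G * tau / C)**2


for tau in (0.1, 1.0, 3.0):
    print(tau, du_dtau(tau) * gamma(tau)**2, rate_bracket(tau))
```

The constancy of the rate follows because γ = cosh(gτ/c) along this trajectory, so the factors γ² and sech² cancel.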

The case n = 3 is well known and was discovered in the early days of special relativity. For a particle under the influence of a constant gravitational acceleration,

\[
g = \frac{d}{dt}\left(\frac{u}{\sqrt{1 - u^2/c^2}}\right), \tag{4.3.34}
\]

will be constant so that integration gives simply

\[
gt = \frac{u}{\sqrt{1 - u^2/c^2}} = c\sinh(\bar{u}/c). \tag{4.3.35}
\]

Now, the velocity can be written as

\[
u = c\tanh(\bar{u}/c) = \frac{gt}{\sqrt{1 + (gt)^2/c^2}} = \frac{dx}{dt}. \tag{4.3.36}
\]

If we further assume that x = 0 at t = 0, we get a second integral

\[
x = \frac{c^2}{g}\left(\sqrt{1 + (gt)^2/c^2} - 1\right) = \frac{c^2}{g}\left(\cosh(\bar{u}/c) - 1\right). \tag{4.3.37}
\]

This is the one-dimensional hyperbolic motion found by Born [09] in 1909, and by Sommerfeld one year later. It will be our prototype of a one-dimensional system at constant acceleration.

The measure of the hyperbolic velocity $\bar{u}$ can be obtained from time dilatation. The infinitesimal time increment dτ in the inertial frame moving with velocity u and the increment dt in the frame at rest are related by

\[
d\tau = dt/\gamma = \sqrt{1 - u^2/c^2}\,dt. \tag{4.3.38}
\]

Integrating (4.3.38),

\[
\tau = \int_0^t \frac{dt}{\sqrt{1 + (gt)^2/c^2}} = \frac{c}{g}\sinh^{-1}(gt/c),
\]

or

\[
gt = c\sinh(g\tau/c), \tag{4.3.39}
\]

we obtain the time interval indicated by the moving clock when the elapsed time according to the clock at rest is t. In the derivation of (4.3.39) we have used (4.3.36). On the strength of (4.3.35), we find the hyperbolic measure of the velocity as $\bar{u} = g\tau$. As t → ∞, τ will increase more slowly than t.

Dividing (4.3.37) by (4.3.39) we find

\[
u = \frac{x}{t} = c\left(\frac{\cosh(g\tau/c) - 1}{\sinh(g\tau/c)}\right) = c\tanh(g\tau/2c),
\]

which is, again, the line segment in Lobachevsky space. The rate at which energy is lost by a uniformly accelerating electron,

\[
W_{\mathrm{rad}} = -\frac{e^2}{6c^3}\,g^2, \tag{4.3.40}
\]

is, again, constant in time, but only one-quarter as large as Larmor’s formula, (4.3.33).
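The relations of the n = 3 case, (4.3.36), (4.3.37) and (4.3.39), can be cross-checked numerically. This is a minimal sketch assuming units with c = 1 and an arbitrary illustrative g; the function names are hypothetical. It confirms that the instantaneous velocity equals c tanh(gτ/c), while the ratio x/t equals c tanh(gτ/2c).

```python
import math

C = 1.0   # speed of light (units with c = 1, an assumption)
G = 0.5   # constant acceleration g (arbitrary illustrative value)


def coordinate_time(tau):
    """t from g t = c sinh(g tau / c), Eq. (4.3.39)."""
    return (C / G) * math.sinh(G * tau / C)


def born_velocity(t):
    """u = g t / sqrt(1 + (g t)^2 / c^2), Eq. (4.3.36)."""
    gt = G * t
    return gt / math.sqrt(1.0 + gt**2 / C**2)


def born_position(t):
    """x = (c^2 / g) (sqrt(1 + (g t)^2 / c^2) - 1), Eq. (4.3.37)."""
    return (C**2 / G) * (math.sqrt(1.0 + (G * t)**2 / C**2) - 1.0)


for tau in (0.2, 1.0, 2.5):
    t = coordinate_time(tau)
    print(tau, born_velocity(t), born_position(t) / t)
```

The half-angle in x/t comes from the identity (cosh θ − 1)/sinh θ = tanh(θ/2) with θ = gτ/c.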

Finally, in the case where the acceleration is perpendicular to the velocity, which can be the case of a charge moving in a circle of given radius and given angular velocity, the rate of energy loss will be given by (4.3.16),

\[
W_{\mathrm{rad}} = -\frac{2e^2}{3c^3}\,\dot{u}^2\gamma^2. \tag{4.3.41}
\]

Expression (4.3.41) is again smaller by a factor of $\gamma^{-2}$ than the rate of energy loss reported in standard texts [Panofsky & Phillips 55], because our frame of reference coincides with that of the electron.

One can argue that in the frame where the electron seems at rest, there would be no magnetic field, and, therefore, Poynting’s vector would vanish, indicating that there is no flow of energy. Hence, we will not see any radiation. However, in the transformation to proper time, (4.3.38), the velocity is the electron’s momentary velocity, and since the acceleration is non-vanishing, it will be constantly changing. The very fact that the acceleration is non-vanishing attests to the fact that the electron must be radiating.

To summarize, we may say that the existence of the metric (4.3.17) shows that there is no meaning to solving a third-order equation, referred to as the Lorentz–Dirac equation. It appears that no one ever thought of Liénard’s formula for the rate of energy loss by an accelerating electron as something to be rendered an extremum. We have shown that it is not Liénard’s formula that is a first integral of the motion, but, rather, his expression transformed to the frame of the instantaneous velocity of the electron. The paths which render (4.3.19) an extremum are those of hyperbolic motion that is followed by a uniformly accelerating charge. Along such paths the radiation reaction force vanishes. The radiation reaction force is not to be rejected outright [Rohrlich 90], but, rather, has to be interpreted as measuring the deviation from hyperbolic motion executed by a uniformly accelerated charge. Such a situation occurs when the curvature is no longer constant.

4.3.4 Curvatures: Turning and twisting

The decomposition of the force into its curvature components of turning and twisting was first carried out by Schott [12]. At any point in the three-dimensional particle trajectory a mutually orthogonal frame can be erected with unit vectors q, n, and b, as shown in Fig. 4.3. As the particle traces out its trajectory, these unit vectors change direction but always remain orthogonal to one another.

Denote by r the radius vector to the given point along the curve and let s stand for the arc length. Call q = dr/ds the unit vector tangent to the curve, and denote ρ = ds/dϕ as the radius of curvature. Since the angular frequency is $\dot{\varphi} = d\varphi/dt$,

\[
\frac{d\mathbf{q}}{ds} = \mathbf{n}/\rho, \quad\text{or}\quad \dot{\mathbf{q}} = \mathbf{n}\,\omega_{\rho}, \tag{4.3.42}
\]

where $\omega_{\rho} = u/\rho$ is the component of the angular velocity, and n is normal to q, the tangent to the curve, and is directed toward the center of curvature.$^{h}$

Fig. 4.3. Frenet frame field for a trajectory of the motion.

Finally, denote by b the binormal, b = q × n. Since db/ds is normal to b we can take it along n, i.e.

\[
\frac{d\mathbf{b}}{ds} = -\mathbf{n}/\tau, \quad\text{or}\quad \dot{\mathbf{b}} = -\mathbf{n}\,\omega_{\tau}, \tag{4.3.43}
\]

where $\omega_{\tau} = u/\tau$, and the minus sign is traditional. The function τ is the radius of torsion, or torsion for short, and unlike ρ, it may be negative, or even zero at points along the curve. Moreover, since n = b × q, it follows that

\[
\frac{d\mathbf{n}}{ds} = -\mathbf{n}\times\mathbf{q}/\tau + \mathbf{b}\times\mathbf{n}/\rho = \mathbf{b}/\tau - \mathbf{q}/\rho,
\]

or equivalently,

\[
\dot{\mathbf{n}} = \mathbf{b}\,\omega_{\tau} - \mathbf{q}\,\omega_{\rho}. \tag{4.3.44}
\]

Equations (4.3.42), (4.3.43), and (4.3.44) are known as the Frenet–Serret equations.
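As an illustration, the arc-length forms of (4.3.42), (4.3.43) and (4.3.44) can be verified on a circular helix, for which q, n and b are known in closed form. The helix, its parameters `R` and `H`, and all function names below are assumptions made for this sketch; the derivatives are taken by central differences.

```python
import math

R, H = 2.0, 0.5            # helix radius and pitch parameter (illustrative values)
L = math.hypot(R, H)       # scale factor making s the arc length
RHO = L**2 / R             # radius of curvature of the helix
TAU = L**2 / H             # radius of torsion of the helix


def tangent(s):
    """q = dr/ds for r(s) = (R cos(s/L), R sin(s/L), H s / L)."""
    return (-R * math.sin(s / L) / L, R * math.cos(s / L) / L, H / L)


def normal(s):
    """Principal normal n, pointing toward the helix axis."""
    return (-math.cos(s / L), -math.sin(s / L), 0.0)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def binormal(s):
    """b = q x n."""
    return cross(tangent(s), normal(s))


def derivative(f, s, h=1e-5):
    """Central-difference derivative of a vector-valued function of s."""
    return tuple((p - m) / (2.0 * h) for p, m in zip(f(s + h), f(s - h)))


def frenet_residuals(s):
    """Largest component-wise violation of each Frenet-Serret equation."""
    q, n, b = tangent(s), normal(s), binormal(s)
    dq, dn, db = derivative(tangent, s), derivative(normal, s), derivative(binormal, s)
    r_q = max(abs(dq[i] - n[i] / RHO) for i in range(3))                   # dq/ds = n/rho
    r_b = max(abs(db[i] + n[i] / TAU) for i in range(3))                   # db/ds = -n/tau
    r_n = max(abs(dn[i] - (b[i] / TAU - q[i] / RHO)) for i in range(3))    # dn/ds = b/tau - q/rho
    return r_q, r_b, r_n


print(frenet_residuals(0.7))
```

Multiplying each arc-length equation by the speed u = ds/dt gives the dotted forms with ω_ρ = u/ρ and ω_τ = u/τ.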

Now, in order to write the radiation reaction force, (4.3.1), in component form, we introduce the four-vector,

\[
u^{\nu} = (\gamma\boldsymbol{\beta}, \gamma) = (\mathbf{q}\sinh\bar{\beta}, \cosh\bar{\beta}),
\]

since $u^{\nu}u_{\nu} = -1$, where $\boldsymbol{\beta} = \mathbf{u}/c$, and $\bar{\beta} = \bar{u}/c$ is the hyperbolic measure of the relative velocity. Employing the Frenet equations we calculate its derivatives as

\[
\begin{aligned}
\dot{u}^{\nu} &= \left(\mathbf{q}\,\dot{\bar{\beta}}\cosh\bar{\beta} + \omega_{\rho}\mathbf{n}\sinh\bar{\beta},\ \dot{\bar{\beta}}\sinh\bar{\beta}\right), \\
\ddot{u}^{\nu} &= \Big((\omega_{\rho}\omega_{\tau}\sinh\bar{\beta})\,\mathbf{b} + (\dot{\omega}_{\rho}\sinh\bar{\beta} + \omega_{\rho}\dot{\bar{\beta}}\cosh\bar{\beta})\,\mathbf{n} \\
&\qquad + (\ddot{\bar{\beta}}\cosh\bar{\beta} + (\dot{\bar{\beta}}^2 - \omega_{\rho}^2)\sinh\bar{\beta})\,\mathbf{q},\ \ddot{\bar{\beta}}\sinh\bar{\beta} + \dot{\bar{\beta}}^2\cosh\bar{\beta}\Big).
\end{aligned} \tag{4.3.45}
\]

The square of the first equation in (4.3.45) is

\[
\dot{u}^{\nu}\dot{u}_{\nu} = \dot{\bar{\beta}}^2 + \dot{\varphi}^2\sinh^2\bar{\beta}, \tag{4.3.46}
\]

where we have substituted $\dot{\varphi}$ for $\omega_{\rho}$. Expression (4.3.46) is none other than the Beltrami metric, (4.3.17). Apart from a multiplicative factor, (4.3.46) is what has been referred to as an ‘invariant radiation rate’ [Rohrlich 90].

$^{h}$ The dot denotes the derivative with respect to coordinate time, and not proper time as in Rohrlich [65].

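The radiation reaction force has three components:

(i) the tangential component,

\[
F_q = \frac{2e^2}{3c}\,\gamma^3\left[\ddot{\bar{\beta}} + \left(\dot{\bar{\beta}}^2 - \omega_{\rho}^2\right)\beta + 2\dot{\bar{\beta}}^2\beta^2\gamma^4\right](1 + \beta^2\gamma^4), \tag{4.3.47}
\]

(ii) the normal component,

\[
F_n = \frac{2e^2}{3c}\,\gamma^3\left[\dot{\omega}_{\rho}\beta + \omega_{\rho}\dot{\bar{\beta}}\left(1 + 2\beta^2\gamma^4\right)\right], \tag{4.3.48}
\]

and

(iii) the binormal component,

\[
F_b = \frac{2e^2}{3c}\,\gamma\,\omega_{\rho}\omega_{\tau}, \tag{4.3.49}
\]

where we substituted γ for $\cosh\bar{\beta}$ and γβ for $\sinh\bar{\beta}$.

The condition for the vanishing of the radiation reaction force, (4.3.21), gives two conditions:

(i) one in the direction of the principal normal vector field of the trajectory,

\[
\dot{\omega}_{\rho}\beta + \omega_{\rho}\dot{\bar{\beta}}\left(1 + 2\beta^2\gamma^4\right) = 0, \tag{4.3.50}
\]

and

(ii) one along the tangent vector field,

\[
\ddot{\bar{\beta}} + \left(\dot{\bar{\beta}}^2 - \omega_{\rho}^2\right)\beta = -2\dot{\bar{\beta}}^2\beta^2\gamma^4. \tag{4.3.51}
\]

Condition (4.3.50) makes (4.3.48) vanish, while (4.3.51) makes (4.3.47) zero. The last remaining component of the self-force, (4.3.49), requires us to set $\omega_{\tau} = 0$. Since $\omega_{\tau}$ measures the twisting, or torsion, of the trajectory, its vanishing is a necessary condition for the radiation reactive force to vanish.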

The rate at which energy is radiated can be determined from (4.3.46); it is

\[
W_{\mathrm{rad}} = -\frac{2e^2}{3c^3}\,\gamma^2\left(\omega_{\rho}^2\beta^2 + \dot{\beta}^2\gamma^2\right). \tag{4.3.52}
\]

From (4.3.52) we can conclude that, unlike the torsion, the curvature function will be involved in the rate at which energy is lost through radiation. Since the last term in (4.3.50) is small, it can be neglected; then integrating the equation we get

\[
\omega_{\rho} = c_1\,\frac{\sqrt{1 - \beta^2}}{\beta},
\]

where $c_1$ is a constant of integration. Introducing $\omega_{\rho} = u/\rho$ we come out with

\[
u\,\omega_{\rho} = \frac{u^2}{\rho} = c_2\sqrt{1 - \beta^2},
\]

where $c_2 = c\,c_1$. Consequently, the centripetal acceleration will decrease as the velocity increases. The decrease in the centripetal acceleration, corresponding to an increase in the radius of curvature ρ, is due to energy loss through radiation. The particle begins to spiral outwards with the consequence that the angular velocity $\omega_{\rho} = u/\rho$ decreases.

It is rather peculiar that no mention has been made of the decrease in the rest mass due to radiation, and, in fact, no mention has been made of mass at all. Where then is the equivalence of mass loss and radiated energy? Corben [68] associates $W_{\mathrm{rad}}$ with the energy of the particle, and the radiation reaction force $\mathbf{F}_{\mathrm{rad}}$ with the change in momentum of the particle, G. Moreover, he considers the case of gyroscopic motion, where $\omega = u/r = mc^2/s$, for a particle moving in a circle of radius r in a plane normal to the spin s, whose magnitude is s. As a result of radiation, circular motion is converted into helical motion. However, it all hangs on the association of rest mass with $s\omega/c^2$ that allows him to conclude “the rest-energy and total energy become progressively smaller, corresponding to the fact that electromagnetic energy is being radiated away from the particle.”

4.3.5 Advanced potentials as perpetual motion machines

Much use and abuse has been made of retarded and advanced potentials in the self-interaction of the electron. The following discussion has been used by Feynman [64] to show the utility of introducing advanced, as well as retarded, potentials. Although it is undoubtedly motivated by the facts that the different forces add, and that the inertial term is proportional to the acceleration, while the Schott term, or the self-reaction of the electron due to its radiation, is proportional to the rate of change of the acceleration, it can find no justification in the original derivations of the force expressions given by Liénard (1898), Heaviside (1902), Schwarzschild (1903), nor Ritz (1908). All of them began with the Lorentz force, (4.1.11), and used the notion of retarded potentials to cast it in a form

\[
F_x = \frac{ee'}{r^2}\left[A\cos(rx) - B\,\frac{u'_x}{c} - C\,\frac{u_x}{c^2}\right],
\]

for the x-component, where the coefficients A, B, and C depend on the velocities and accelerations, but not upon higher time-derivatives.

However, according to Feynman, the self-interaction of the electron is described by a force which begins with the acceleration and contains higher-order derivatives,

\[
F_x = -m'\ddot{x} - \frac{2}{3}\frac{e^2}{c^3}\,\dddot{x} + \frac{e^2a}{c^4}\,\ddddot{x} + \cdots, \tag{4.3.53}
\]

where $m' = \tfrac{2}{3}e^2/ac^2$ is the electrostatic mass, which we will meet in Sec. 5.4.3, and a is the ‘classical’ electron radius. In the limit as a → 0, the third term in (4.3.53) will go to zero, while the first term becomes infinite, i.e. an infinite mass. The second term is the radiation damping, and it is independent of a. This term describes the action of the electron on itself, and we want to extract this term from the others in (4.3.53).

Let us try changing c into −c, for it will change the sign of the second term in (4.3.53). The resulting force from such an operation would be

\[
F^{\mathrm{adv}}_x = -m'\ddot{x} + \frac{2}{3}\frac{e^2}{c^3}\,\dddot{x} + \frac{e^2a}{c^4}\,\ddddot{x} + \cdots. \tag{4.3.54}
\]

But changing the sign of c changes a retarded into an advanced potential. Instead of viewing the charge at the previous time, t − r/c, we now view it at a later time, t + r/c. Since it is the second term that we want, Dirac said that the electron acts on itself half the time as a retarded field, and half the time as an advanced field [Feynman 64]. The net force is half the difference of the forces, (4.3.53) and (4.3.54),

\[
\frac{1}{2}\left(F_x - F^{\mathrm{adv}}_x\right) = -\frac{2}{3}\frac{e^2}{c^3}\,\dddot{x}, \tag{4.3.55}
\]

since any higher-order terms that persist will go to zero with a. So simply by continually running time forward and backward, we will have simultaneous diverging and converging waves acting on the electron, which will feel a force given by (4.3.55).

The electron will, according to Ritz, be excited by its own radiation by the advanced wave, and will radiate by the retarded wave. This can continue indefinitely and constitutes a perpetuum mobile of the second kind. As such it is outlawed by the second law. Although there is nothing to exclude advanced potentials, or linear combinations with retarded potentials, such solutions to Maxwell’s equations should be outlawed on the basis that they violate causality and irreversibility, and that they constitute elements for constructing perpetual motion machines.

Expanding Liénard’s force, (4.1.8), to terms in $1/c^3$ gives

\[
\mathbf{F} = -m'\mathbf{a} + \frac{2e^2}{3c^3}\,\dot{\mathbf{a}}. \tag{4.3.56}
\]

The last term is the so-called Schott [12] radiation term, because it was Schott who first brought out its significance as the friction due to loss of energy by radiation, or the self-reaction of the charge on itself due to radiation. If it could change sign, as Dirac supposes, what was radiation loss would be radiation gain, and an unlimited energy supply would exist leading to perpetual motion of the second kind. However, before we jump to conclusions, let us take a closer look at the origin of this term.

Frenkel [26] tells us that the acceleration at time t is not a, but, rather, there is a component coming from an earlier time t′ = t − r/c, viz.

\[
\mathbf{a}' = \mathbf{a}(t) + (t' - t)\,\dot{\mathbf{a}}.
\]

This introduces an extra term into the force,

\[
\frac{de\,de'}{2c^3}\left(\dot{a}_x + \dot{a}_r\cos^2(rx)\right).
\]

If we take the acceleration a along the x-axis so that $\dot{a}_r = \dot{a}_x = \dot{a}$, then performing the integration gives

\[
\frac{e^2}{2c^3}\,\dot{a}\left(1 + \frac{1}{3}\right) = \frac{2e^2}{3c^3}\,\dot{a}.
\]

We know that the charge accelerates because we can see its radiation. A change in sign of the Schott term would mean that there is a contribution to the acceleration at a later time, t′′ = t + r/c. We are consequently viewing something that will happen in the future, and this destroys causality! We can see something that happened in the past, like the stars that glow at night, because it takes light a finite time to reach us. But to see something in the future would be to have a crystal ball at our disposal. Since there are no crystal balls, there can be no so-called pre-acceleration [Rohrlich 90, p. 151]. Hence, (4.3.54) and (4.3.55) have no meaning since an advanced potential is meaningless.

The rate at which energy is lost is obtained by taking the scalar product of (4.3.56) and u; we then obtain

\[
\mathbf{F}\cdot\mathbf{u} = -\frac{m'}{2}\frac{d}{dt}u^2 + \frac{2e^2}{3c^3}\frac{d}{dt}(\mathbf{a}\cdot\mathbf{u}) - \frac{2e^2}{3c^3}\,a^2. \tag{4.3.57}
\]

If the charge oscillates back and forth, the time averages of the first two terms in (4.3.57) vanish, thereby leaving a single term that was found by Larmor in 1897.
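The time-averaging argument can be illustrated for a charge in simple harmonic motion. The sketch below assumes x(t) = A cos(ωt) with arbitrary illustrative values of A and ω, and drops the constant prefactors m′/2 and 2e²/3c³; all names are hypothetical. Averaged over one period, the two total-derivative terms of (4.3.57) vanish while the a² term survives with the value A²ω⁴/2.

```python
import math

A, OMEGA = 1.0, 2.0   # amplitude and angular frequency (illustrative values)


def u(t):
    """Velocity of x(t) = A cos(omega t)."""
    return -A * OMEGA * math.sin(OMEGA * t)


def a(t):
    """Acceleration."""
    return -A * OMEGA**2 * math.cos(OMEGA * t)


def adot(t):
    """Rate of change of the acceleration."""
    return A * OMEGA**3 * math.sin(OMEGA * t)


def period_average(f, steps=10000):
    """Midpoint-rule average of f over one period 2 pi / omega."""
    period = 2.0 * math.pi / OMEGA
    dt = period / steps
    return sum(f((k + 0.5) * dt) for k in range(steps)) * dt / period


avg_kinetic = period_average(lambda t: 2.0 * u(t) * a(t))           # d(u^2)/dt
avg_schott = period_average(lambda t: adot(t) * u(t) + a(t)**2)     # d(a.u)/dt
avg_larmor = period_average(lambda t: a(t)**2)                      # a^2

print(avg_kinetic, avg_schott, avg_larmor)
```

Any periodic motion gives the same outcome for the first two averages, since they are total time derivatives of periodic functions.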

However, we cannot use Maxwell’s method of showing that a term in the Lagrangian of the form $(2e^2/3c^3)(\mathbf{a}\cdot\mathbf{u})$ would lead to an indeterminate sign in the expression for the kinetic energy, since this term has no effect upon the Euler–Lagrange equations,

\[
F_x = -\frac{\partial L}{\partial x} + \frac{d}{dt}\frac{\partial L}{\partial u_x} - \frac{d^2}{dt^2}\frac{\partial L}{\partial a_x}.
\]

The last two terms would lead to an exact cancellation.

In closing this chapter, we might mention the anecdote that it was none other than Einstein who, during his Princeton years, brought Ritz’s emission theory to the attention of Wheeler and Feynman [45, 49] in 1941, who were working on a time-symmetric absorber theory which used a combination of retarded and advanced potentials. Was he still carrying on his debate with Ritz, and having second thoughts about emission theories, as Lanczos seemed to feel? As for their theory, it has all but been forgotten, for a time-symmetric absorber theory risks being branded as a perpetual motion machine.

References

[Abraham 05] M. Abraham, Theorie der Strahlung (Leipzig, 1905), p. 123, Eq. (85); 1st ed. Theorie der Elektrizität, Vol. 2, 5th ed. (Teubner, Leipzig, 1923), p. 115.

[Arzeliès 66] H. Arzeliès, in Rayonnement et Dynamique du Corpuscule Chargé Fortement Accéléré (Gauthier-Villars, Paris, 1966), pp. 74–79.

[Born 09] M. Born, “Die Theorie des starren Elektrons in der Kinematik des Relativitätsprinzips,” Ann. der Physik 30 (1909) 1–56; A. Sommerfeld, “Über die Zusammensetzung der Geschwindigkeiten in der Relativtheorie,” Verh. der DPG 21 (1909) 577–582.

[Born & Wolf 59] M. Born and E. Wolf, Principles of Optics (Pergamon Press, New York, 1959), p. 70.

[Corben 68] H. C. Corben, Classical and Quantum Theories of Spinning Particles (Holden-Day, San Francisco, 1968), Sec. 11.

[Einstein 05] A. Einstein, “Ist die Trägheit eines Körpers von seinem Energieinhalt abhängig?,” Ann. Phys. 18 (1905) 639–641.

[Einstein 35] A. Einstein, “Elementary derivation of the equivalence of mass and energy,” Bull. Am. Math. Soc. 22 (1935) 223–230.

[Einstein 49] A. Einstein, Autobiographical notes, 1949.

[Feynman 64] R. P. Feynman, The Feynman Lectures on Physics, Vol. II (Addison-Wesley, Reading MA, 1964), Ch. 28, p. 11.

[Fock 59] V. Fock, The Theory of Space, Time, and Gravitation (Pergamon Press, New York, 1959), p. 41.

[Fox 62] J. G. Fox, “Experimental evidence for the second postulate of special relativity,” Am. J. Phys. 30 (1962) 297–300.

[Fox 65] J. G. Fox, “Evidence against emission theories,” Am. J. Phys. 33 (1965) 1–17.

[Frenkel 26] J. Frenkel, Lehrbuch der Elektrodynamik, Vol. 1 (Berlin, 1926), pp. 208–211.

[Ives 52] H. E. Ives, “Derivation of the mass-energy relation,” J. Opt. Soc. Am. 42 (1952) 540–543.

[Jackson 75] J. D. Jackson, Classical Electrodynamics, 2nd ed. (Wiley, New York, 1975), p. 660.

[Jammer 61] M. Jammer, Concepts of Mass and Modern Physics (Harvard U, Cambridge MA, 1961).

[Lanczos 74] C. Lanczos, The Einstein Decade (1905–1915) (Elek Science, London, 1974), p. 161.

[Landau & Lifshitz 75] L. D. Landau and E. M. Lifshitz, The Classical Theory of Fields (Pergamon Press, Oxford, 1975), p. 195. In (73.7) there is a sign difference and the absence of γ² in the second term.

[Lavenda 02] B. H. Lavenda, “Does the inertia of a body depend on its ‘heat’ content?,” Naturwissenschaften 89 (2002) 329.

[Lavenda 09] B. H. Lavenda, A New Perspective on Thermodynamics (Springer, New York, 2009).

[Liénard 98] A. Liénard, “Champ électrique et magnétique produit par une charge électrique concentrée en un point et animée d’un mouvement quelconque,” L’éclairage électrique 16 (1898) pp. 5, 53, 106.

[Maxwell 91] J. Clerk Maxwell, A Treatise on Electricity and Magnetism, 3rd ed. (Clarendon Press, London, 1891), Ch. 23.

[O’Rahilly 38] A. O’Rahilly, Electromagnetics (Longman, Green & Co., London, 1938).

[Pais 82] A. Pais, Subtle is the Lord (Oxford U. P., Oxford, 1982), p. 467.

[Panofsky & Phillips 55] W. K. H. Panofsky and M. Phillips, Classical Electricity and Magnetism (Addison-Wesley, Reading MA, 1955), p. 307, Eqs. (19) (35).

[Pauli 58] W. Pauli, Theory of Relativity (Pergamon Press, New York, 1958), pp. 5–9.

[Riseman & Young 53] J. Riseman and I. G. Young, “Mass-energy relationship,” J. Opt. Soc. Am. 43 (1953) 618; H. E. Ives, “Note on ‘Mass-energy relationship,’ ” ibid 43 (1953) 618–619.

[Ritz 08] W. Ritz, “Recherches critiques sur l’Électrodynamique Générale,” Ann. Chimie et Physique, 8th series, XIII (1908) 145–275; translated and commented upon by W. Hovgaard, “Ritz’s Electrodynamic Theory,” J. Math. Phys. 11 (1932) 218–254.

[Ritz and Einstein 09] W. Ritz and A. Einstein, “Zum gegenwärtigen Stande des Strahlungsproblems,” Phys. Z. 10 (1909) 323–324.

[Rohrlich 90] F. Rohrlich, Classical Charged Particles: Foundations of Their Theory (Addison-Wesley, Reading, MA, 1990).

[Schott 12] G. A. Schott, Electromagnetic Radiation (Cambridge U. P., Cambridge, 1912), p. 246.

[Stachel & Torretti 82] J. Stachel and R. Torretti, “Einstein’s first derivation of the mass–energy equivalence,” Am. J. Phys. 50 (1982) 760–763.

[Thomson 21] J. J. Thomson, Elements of the Mathematical Theory of Electricity and Magnetism (Cambridge U. P., Cambridge, 1921), p. 388.

[Wheeler & Feynman 45] J. A. Wheeler and R. P. Feynman, “Interaction with the absorber as the mechanism of radiation,” Rev. Mod. Phys. 17 (1945) 157–181.

[Wheeler & Feynman 49] J. A. Wheeler and R. P. Feynman, “Classical electrodynamics in terms of direct interparticle interaction,” Rev. Mod. Phys. 21 (1949) 425–433.


Chapter 5

The Origins of Mass

5.1 Introduction

The hallmark of the potential theory of a long rod is that the attraction of an infinitely long rod for a particle which is at a distance r from it is inversely proportional to r, and not to the inverse of its square. This gives rise to logarithmic potentials and the connection with inverse hyperbolic functions. We may then supplant the long rod by constant spheroidal level layers in which there appears the eccentricity of a meridian section, which is the intersection of a surface of revolution, in this case a prolate spheroid, with a plane that contains the axis of revolution. The axis of revolution coincides with the direction of the rod. The eccentricity, or the ratio between the distance from a point on the conic to the focus and the distance from that point to the directrix, will have the exact same role as the relative velocity in aberration. Not only will this allow us to draw the parallelism between the motional distortion of stellar aberration and the eccentricity of the potential for a prolate ellipsoid, but, moreover, it will pave the way to determining the mass dependence on the speed once we introduce the relativistic expression that relates the energy, or potential, to the momentum.

In this way we will appreciate that the two models of an electron based on the deformation of a sphere into prolate and oblate spheroids are two sides of the same coin. The fact that the eccentricity of the meridian section of the spheroid is identified as the relative speed implies that the deformation of the spherical electron at rest into a spheroid in motion is caused by the FitzGerald–Lorentz contraction, just like Abraham and Lorentz conceived it to be. However, there is no reason to pass summary judgment on the two models and declare Lorentz the winner. For it will turn out that these models are related to one another as elliptic geometry is related to hyperbolic geometry! And it is our premise that if a phenomenon occurs in one of the two non-Euclidean geometries it will almost certainly occur in the other.

5.2 From Motional to Static Deformation

Consider two incoming light signals that make angles $\phi_1$ and $\phi_2$ with respect to the z-axes in two frames, the frame in (b) traveling at a relative velocity u with respect to the one in (a), as shown in Fig. 5.1.

If the velocities of the two outgoing signals are $u_1 = \cos\phi_1$ and $u_2 = \cos\phi_2$, the velocity composition law in the z-direction is

$$\cos\phi_2 = \frac{\cos\phi_1 - u}{1 - u\cos\phi_1},$$

while, in one of the perpendicular planes,

$$\sin\phi_2 = \frac{\sqrt{1-u^2}\,\sin\phi_1}{1 - u\cos\phi_1}.$$

Fig. 5.1. Stellar aberration: (a) A telescope at rest, and (b) a telescope aimed at the same star but in relative motion.


With the aid of the trigonometric identity, $\tan(\phi/2) = \sin\phi/(1+\cos\phi)$, we obtain

$$\tan(\phi_2/2) = \left(\frac{1+u}{1-u}\right)^{1/2}\tan(\phi_1/2). \tag{5.2.1}$$

Equation (5.2.1) says that the tangents of the half angles in the two inertial frames are related by the longitudinal Doppler shift. We will now see how aberration arises in potential theory when the relative velocity is replaced by the eccentricity. Then we will see how mass depends upon eccentricity. Reverting to relative velocities shows how relativistic mass acquires a dependence on them.
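The step from the composition law to (5.2.1) can be checked numerically. The following is a minimal sketch, assuming units with c = 1 and angles in the open interval (0, π); the function names are illustrative and not taken from the text.

```python
import math

def aberrated_angle(phi1: float, u: float) -> float:
    """Angle phi2 obtained from the composition law for cos and sin (c = 1)."""
    denom = 1.0 - u * math.cos(phi1)
    cos2 = (math.cos(phi1) - u) / denom
    sin2 = math.sqrt(1.0 - u * u) * math.sin(phi1) / denom
    return math.atan2(sin2, cos2)

def doppler_factor(u: float) -> float:
    """Longitudinal Doppler factor appearing in (5.2.1)."""
    return math.sqrt((1.0 + u) / (1.0 - u))

def half_angle_residual(phi1: float, u: float) -> float:
    """tan(phi2/2) - sqrt((1+u)/(1-u)) * tan(phi1/2); should vanish."""
    phi2 = aberrated_angle(phi1, u)
    return math.tan(phi2 / 2.0) - doppler_factor(u) * math.tan(phi1 / 2.0)

if __name__ == "__main__":
    for u in (0.1, 0.5, 0.9):
        for phi1 in (0.3, 1.0, 2.5):
            print(f"u={u}, phi1={phi1}, residual={half_angle_residual(phi1, u):.2e}")
```

The residuals are at the level of floating-point round-off, which is what the half-angle identity predicts.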

5.2.1 Potential theory

Consider a rod of length $2\ell$ pointing in the z-direction. It will have a constant, linear mass density $\rho = m_0/2\ell$. If we consider the infinitesimal section dz, measured from the center of the rod, it will have an infinitesimal mass of

$$dm = \rho\,dz,$$

as shown in Fig. 5.2. The potential $\Phi$ that a particle will feel at a point P at a distance r from the rod is determined from the fact that for a rod the attraction is proportional to the inverse of the distance, instead of the inverse square of the distance, so that

$$\Phi(r) = G\int\frac{dm}{r} = G\rho\int_{-\ell}^{+\ell}\frac{dz}{r},$$

where G is Newton's gravitational constant.

If $\phi$ is the angle subtended by the rod and the line to the point P from the rod, then the infinitesimal arc length that is swept out when we move from the origin to a distance dz up the rod is

$$r\,d\phi = \sin\phi\,dz.$$

This allows us to express the potential as

$$\Phi = G\rho\int_{\phi_1}^{\phi_2}\frac{d\phi}{\sin\phi} = G\rho\ln\left(\frac{\tan(\phi_2/2)}{\tan(\phi_1/2)}\right). \tag{5.2.2}$$


Fig. 5.2. The potential of a homogeneous rod.

If we use the half angle formula, $\tan(\phi/2) = \sqrt{(1-\cos\phi)/(1+\cos\phi)}$, we can write the potential (5.2.2) as the logarithm of the cross-ratio

$$\Phi = \tfrac{1}{2}G\rho\ln\left(\frac{1+\cos\phi_1}{1-\cos\phi_1}\cdot\frac{1-\cos\phi_2}{1+\cos\phi_2}\right) = \tfrac{1}{2}G\rho\ln\{\cos\phi_1, \cos\phi_2\,|\,-1, 1\}. \tag{5.2.3}$$

So from very simple geometrical arguments, we have created for ourselves a hyperbolic space equipped with a cross-ratio, whose logarithm is a measure of hyperbolic distance. The potential (5.2.3) is related to hyperbolic distance. In fact, it is the difference of two hyperbolic lengths,

$$\Phi = G\rho\,[\tanh^{-1}(\cos\phi_1) - \tanh^{-1}(\cos\phi_2)].$$

The potential vanishes for equal hyperbolic lengths. And for $\phi_2 = \pi/2$, $\phi_1$ becomes the angle of parallelism, which is a function only of the distance, $(\Phi/Gm_0)\,2\ell$.


If x is the normal distance between the rod and the point P then

$$\tan\phi_1 = \frac{x}{z+\ell}, \qquad \tan\phi_2 = \frac{x}{z-\ell}, \tag{5.2.4}$$

and using the half-angle formula for the tangent, (5.2.2) becomes

$$\Phi = G\rho\ln\left(\frac{\sqrt{(z-\ell)^2+x^2} - (z-\ell)}{\sqrt{(z+\ell)^2+x^2} - (z+\ell)}\right).$$

Now, if $r_1$ and $r_2$ are the distances between the ends of the rod and the point P, as shown in Fig. 5.3, viz.

$$r_1^2 = x^2 + (z+\ell)^2, \qquad r_2^2 = x^2 + (z-\ell)^2, \tag{5.2.5}$$

we can write the potential as

$$\Phi = G\rho\ln\left(\frac{r_2 - z + \ell}{r_1 - z - \ell}\right).$$

Fig. 5.3. A rod AB has length $2\ell$ with O as its center. The attracted point P is at a distance r from an element of mass dm. $r_1$ and $r_2$ are the lines joining P to the ends of the rod at A and B.

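The difference between the two expressions in (5.2.5) is

$$4\ell z = r_1^2 - r_2^2, \tag{5.2.6}$$

and so,

$$\ell + z = \frac{r_1^2 - r_2^2 + 4\ell^2}{4\ell}, \qquad \ell - z = \frac{4\ell^2 - r_1^2 + r_2^2}{4\ell}.$$

The potential can thus be brought into the form

$$\Phi = G\rho\ln\left(\frac{4\ell r_2 + r_2^2 + 4\ell^2 - r_1^2}{4\ell r_1 - r_1^2 + r_2^2 - 4\ell^2}\right) = G\rho\ln\frac{(r_1 + r_2 + 2\ell)(r_2 - r_1 + 2\ell)}{(r_2 + r_1 - 2\ell)(r_2 - r_1 + 2\ell)} = G\rho\ln\left(\frac{r_1 + r_2 + 2\ell}{r_1 + r_2 - 2\ell}\right). \tag{5.2.7}$$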
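As a sanity check on (5.2.7), the defining integral of the rod potential can be evaluated numerically and compared with the closed form. This is only a sketch: the Simpson integrator, the function names, and the choice $G\rho = 1$ are assumptions made for illustration.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n is rounded up to an even number."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def rod_potential_numeric(x, z, ell, G_rho=1.0):
    """G*rho * integral over the rod of ds / distance, for a field point (x, z)."""
    return G_rho * simpson(lambda s: 1.0 / math.hypot(x, z - s), -ell, ell)

def rod_potential_closed(x, z, ell, G_rho=1.0):
    """Closed form (5.2.7) in terms of the distances to the two ends of the rod."""
    r1 = math.hypot(x, z + ell)
    r2 = math.hypot(x, z - ell)
    return G_rho * math.log((r1 + r2 + 2 * ell) / (r1 + r2 - 2 * ell))

if __name__ == "__main__":
    for x, z in ((0.5, 0.3), (2.0, -1.5), (1.0, 0.0)):
        print(x, z, rod_potential_numeric(x, z, 1.0), rod_potential_closed(x, z, 1.0))
```

The field point must lie off the rod (x ≠ 0 when |z| ≤ ℓ), since the integrand is singular on the rod itself.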

Surfaces of constant potential, or level surfaces, are defined by the relation

$$r_1 + r_2 = 2a, \tag{5.2.8}$$

for which $\Phi = \text{const}$. The level surfaces are prolate ellipsoids with a major axis 2a, and foci that are located at the ends of the rod. Therefore, the rod can be supplanted by an infinite number of spheroidal level layers. Introducing the eccentricity, ε, of a meridian section of the level surface by $\ell = \varepsilon a$, into (5.2.7) leads to

$$\Phi = G\rho\ln\left(\frac{1+\varepsilon}{1-\varepsilon}\right) = 2G\rho\tanh^{-1}\varepsilon. \tag{5.2.9}$$

When viewed far from the rod, a is large and ε must be small, if the rod is to have constant length. The equipotential surfaces appear nearly spherical. By contrast, for points on the rod itself $\ell = a$, and with ε = 1, $\Phi$ is infinite. For points in the neighborhood of the rod, ε is just below 1 and $\Phi$ is very large. At large distances both ε and $\Phi$ tend to zero.
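The statement that every point of a level surface carries the same potential, given by (5.2.9), can be illustrated by sampling a meridian ellipse. This is a minimal sketch with $G\rho = 1$; the parametrization of the ellipse and all function names are assumptions introduced here.

```python
import math

def point_on_meridian_ellipse(a, ell, t):
    """Point (x, z) on the ellipse with semi-major axis a and foci at z = +/- ell."""
    b = math.sqrt(a * a - ell * ell)  # semi-minor axis
    return b * math.sin(t), a * math.cos(t)

def potential_at(x, z, ell, G_rho=1.0):
    """Potential (5.2.7) from the distances r1, r2 to the ends of the rod."""
    r1 = math.hypot(x, z + ell)
    r2 = math.hypot(x, z - ell)
    return G_rho * math.log((r1 + r2 + 2 * ell) / (r1 + r2 - 2 * ell))

def potential_from_eccentricity(eps, G_rho=1.0):
    """Potential (5.2.9) written in terms of the eccentricity eps = ell / a."""
    return 2.0 * G_rho * math.atanh(eps)

if __name__ == "__main__":
    a, ell = 2.0, 1.0
    for t in (0.2, 1.0, 1.5, 2.8):
        x, z = point_on_meridian_ellipse(a, ell, t)
        print(t, potential_at(x, z, ell), potential_from_eccentricity(ell / a))
```

Letting a approach ℓ drives ε toward 1 and the potential toward infinity, while a much larger than ℓ gives a small ε and a nearly spherical surface.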


Comparing (5.2.9) with (5.2.2) we obtain the condition for equipotential surfaces, where $\Phi$ is constant, as

$$\tan(\phi_1/2) = D^{-1}\tan(\phi_2/2), \tag{5.2.10}$$

where

$$D = \left(\frac{1+\varepsilon}{1-\varepsilon}\right)^{1/2}. \tag{5.2.11}$$

On an equipotential surface, $\Phi$ is constant and

$$D = e^{\Phi/G\rho} = e^{\bar\varepsilon},$$

where $\bar\varepsilon$ is the hyperbolic measure of the eccentricity, which is no longer limited to the interval [0, 1].

But (5.2.10) will be identical to (5.2.1) when we identify the eccentricity, ε, with the relative speed, u/c. Both create large distortions for values in the neighborhood of unity. As the eccentricity is varied a sphere is transformed into an infinitely long rod. Of all the shapes that the system can pass through, the sphere has the minimum volume. We can pass through constant level surfaces by varying ε just as we can by transforming from one inertial frame to another. Gravitational attraction is supplanted by electrostatic attraction. Inside the spheroid, a particle is attracted equally in all directions so the net attraction vanishes. This means that the potential is constant inside the spheroid and if its shell is infinitely thin, the surface is an equipotential surface. At the points exterior to the shell, the potential is the same as though the mass (or charge) were uniformly distributed over the surface. The shell attracts an exterior particle just as the rod does.

Following MacMillan [30] we use the double angle formula,

$$\frac{2}{\tan\phi_2} = \frac{1}{\tan(\phi_2/2)} - \tan(\phi_2/2), \tag{5.2.12}$$

and a similar expression for $\phi_1$. Into the latter expression we introduce the level layer condition (5.2.10) to get:

$$\frac{2}{\tan\phi_1} = \frac{D}{\tan(\phi_2/2)} - \frac{\tan(\phi_2/2)}{D}. \tag{5.2.13}$$

Multiplying (5.2.13) by D and subtracting (5.2.12), and then multiplying (5.2.12) by D and subtracting (5.2.13) result in the pair of equations

$$2\left(\frac{D}{\tan\phi_1} - \frac{1}{\tan\phi_2}\right) = \frac{D^2-1}{\tan(\phi_2/2)},$$

$$2\left(\frac{D}{\tan\phi_2} - \frac{1}{\tan\phi_1}\right) = -\frac{D^2-1}{D}\tan(\phi_2/2).$$

Multiplying the two equations together eliminates the half angle terms and results in

$$\left(\frac{1}{\tan\phi_2} - \frac{D}{\tan\phi_1}\right)\left(\frac{D}{\tan\phi_2} - \frac{1}{\tan\phi_1}\right) = \frac{(D^2-1)^2}{4D}.$$

Finally, introducing (5.2.4) results in the equation for equipotential surfaces in two dimensions:

$$\frac{(D-1)^2}{(D+1)^2}\frac{z^2}{\ell^2} + \frac{(D-1)^2}{4D}\frac{x^2}{\ell^2} = 1, \tag{5.2.14}$$

which are a family of confocal ellipses. Introducing

$$\frac{4D\ell^2}{(D-1)^2} = s, \qquad \frac{(D+1)^2}{(D-1)^2}\ell^2 = \ell^2 + s,$$

in (5.2.14) we get confocal conics,

$$\frac{z^2}{\ell^2+s} + \frac{x^2}{s} = 1,$$

shown in Fig. 5.4. The hyperbola,

$$\frac{z^2}{\ell^2-s} - \frac{x^2}{s} = 1, \qquad 0 < s < \ell^2,$$

represents the lines of force which are always normal to curves of equipotential. In three dimensions, we get the equipotential surfaces,

$$\frac{z^2}{\ell^2+s} + \frac{x^2+y^2}{s} = 1,$$

which are prolate spheroids.


Fig. 5.4. Family of ellipses and orthogonal confocal hyperbolas.

5.3 Gravitational Mass

5.3.1 Attraction of a rod: Increase in mass with broadside motion

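In order to obtain expressions for the gravitational mass, we calculate the force of attraction in a plane normal to the rod, $F_x$, and the attractive force in the plane parallel to the rod, $F_z$ [MacMillan 30]. These forces are defined by

$$F_x = -\frac{\partial\Phi}{\partial\varepsilon}\frac{\partial\varepsilon}{\partial x}, \qquad F_z = -\frac{\partial\Phi}{\partial\varepsilon}\frac{\partial\varepsilon}{\partial z},$$

where

$$\varepsilon = \frac{2\ell}{r_1 + r_2},$$

and $r_1$ and $r_2$ are given by (5.2.5). With the aid of the expressions,

$$\frac{\partial\Phi}{\partial\varepsilon} = \frac{2G\rho}{1-\varepsilon^2},$$

$$\frac{\partial\varepsilon}{\partial r_1} = \frac{\partial\varepsilon}{\partial r_2} = -\frac{2\ell}{(r_1+r_2)^2} = -\frac{\varepsilon}{2a},$$

$$\frac{\partial r_1}{\partial x} = \frac{x}{r_1}, \qquad \frac{\partial r_2}{\partial x} = \frac{x}{r_2}, \qquad \frac{\partial r_1}{\partial z} = \frac{z+\ell}{r_1}, \qquad \frac{\partial r_2}{\partial z} = \frac{z-\ell}{r_2},$$

we find the normal and parallel components of the force as:

$$F_x = \frac{G\rho\varepsilon x}{a(1-\varepsilon^2)}\left(\frac{1}{r_1} + \frac{1}{r_2}\right), \qquad F_z = \frac{G\rho\varepsilon}{a(1-\varepsilon^2)}\left(\frac{z+\ell}{r_1} + \frac{z-\ell}{r_2}\right).$$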

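From (5.2.6), (5.2.8) and $\ell = a\varepsilon$ we have $r_1 = a + \varepsilon z$ and $r_2 = a - \varepsilon z$. These enable the normal and parallel components of the force to be written as

$$F_x = \frac{G\rho}{1-\varepsilon^2}\cdot\frac{2\varepsilon x}{a^2 - \varepsilon^2z^2}, \qquad F_z = \frac{2G\rho\varepsilon z}{a^2 - \varepsilon^2z^2}. \tag{5.3.1}$$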
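The components (5.3.1) can be compared against a direct numerical derivative of the potential $\Phi = 2G\rho\tanh^{-1}\varepsilon$ with $\varepsilon = 2\ell/(r_1+r_2)$. The sketch below assumes $G\rho = 1$ and uses central differences; the step size and function names are illustrative choices, not part of the text.

```python
import math

def eccentricity(x, z, ell):
    """eps = 2*ell / (r1 + r2) for the level surface through the point (x, z)."""
    r1 = math.hypot(x, z + ell)
    r2 = math.hypot(x, z - ell)
    return 2.0 * ell / (r1 + r2)

def potential(x, z, ell, G_rho=1.0):
    """Phi = 2 G rho artanh(eps), Eq. (5.2.9)."""
    return 2.0 * G_rho * math.atanh(eccentricity(x, z, ell))

def force_formula(x, z, ell, G_rho=1.0):
    """Normal and parallel components as written in (5.3.1)."""
    eps = eccentricity(x, z, ell)
    a = ell / eps
    denom = a * a - eps * eps * z * z
    fx = G_rho / (1.0 - eps * eps) * 2.0 * eps * x / denom
    fz = 2.0 * G_rho * eps * z / denom
    return fx, fz

def force_numeric(x, z, ell, G_rho=1.0, h=1e-6):
    """Minus the gradient of the potential by central differences."""
    fx = -(potential(x + h, z, ell, G_rho) - potential(x - h, z, ell, G_rho)) / (2 * h)
    fz = -(potential(x, z + h, ell, G_rho) - potential(x, z - h, ell, G_rho)) / (2 * h)
    return fx, fz

if __name__ == "__main__":
    print(force_formula(0.7, 0.4, 1.0))
    print(force_numeric(0.7, 0.4, 1.0))
```

Both routes give the same pair of numbers to within the finite-difference error, which confirms the chain-rule bookkeeping leading to (5.3.1).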

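Eliminating x in the normal component of the force through the equation of the ellipse,

$$\frac{x^2}{a^2(1-\varepsilon^2)} + \frac{z^2}{a^2} = 1,$$

gives

$$F_x = \pm\frac{2G\rho\varepsilon^3}{\sqrt{1-\varepsilon^2}}\cdot\frac{\sqrt{\ell^2 - \varepsilon^2z^2}}{\ell^2 - \varepsilon^4z^2}, \qquad F_z = \frac{2G\rho\varepsilon^3z}{\ell^2 - \varepsilon^4z^2}. \tag{5.3.2}$$

Assuming z is fixed and small, such that $|z| \ll \ell$, and recalling the definition of the mass, $m_0 = 2\rho\ell$, the normal and parallel components of the force become

$$F_x = \frac{Gm_x}{\ell^2}, \qquad F_z = \frac{Gm_zz}{\ell^3}, \tag{5.3.3}$$

where the masses are

$$m_x = \frac{m_0\varepsilon^3}{\sqrt{1-\varepsilon^2}}, \qquad m_z = m_0\varepsilon^3. \tag{5.3.4}$$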

Replacing the eccentricity ε by the relative velocity, (5.3.4) gives the increase in inertia due to the motion. As the relative velocity tends to zero, the rod shrinks to a point particle. In analogy with the motion of Faraday tubes, broadside motion of the rod should lead to a greater inertial mass because more of the surrounding 'aether' is dragged with it than when it performs frontal movement. This prediction has been corroborated by the masses (5.3.4).


5.3.2 Attraction of a spheroid on a point in its axis of revolution: Forces of attraction as minimal curves of convex bodies

As a preliminary, we treat the attraction of a circular disc on a point on the axis of the disc through its center. This will aid us in the subsequent determination of the attraction of a spheroid on a point in its axis of revolution.

Consider a disc of radius R and surface density ρ. Place the origin of our coordinate system at the center O of the disc and introduce the polar coordinates, r and θ. The attracted point P is a distance z from the center of the disc, as shown in Fig. 5.5.

The element of mass of the disc is

$$dm = \rho\,r\,dr\,d\theta,$$

and the distance from this element of the disc to the point P is given by the Pythagorean theorem

$$h = \sqrt{r^2 + z^2}.$$

Fig. 5.5. Attraction of a circular disc on its axis.


The Newtonian law of attraction is therefore given by

$$\frac{G\,dm}{h^2} = \frac{G\rho\,r\,dr\,d\theta}{r^2 + z^2}.$$

If φ is the angle at P, then the component of this force along the axis is

$$\frac{G\rho\,r\,dr\,d\theta}{r^2 + z^2}\cos\varphi = \frac{G\rho\,r\,dr\,d\theta}{r^2 + z^2}\,\frac{z}{h} = \frac{G\rho z\,r\,dr\,d\theta}{(r^2 + z^2)^{3/2}}.$$

The total force of attraction is obtained by integrating from the center to the rim of the disc and integrating around the entire disc, viz.

$$F_z = G\rho z\int_0^R\!\!\int_0^{2\pi}\frac{r\,dr\,d\theta}{(r^2 + z^2)^{3/2}} = 2\pi G\rho z\int_0^R\frac{r\,dr}{(r^2 + z^2)^{3/2}} = -2\pi G\rho\left(\frac{z}{\sqrt{z^2 + R^2}} - 1\right).$$
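The radial integral for the disc can be checked numerically against the closed form just obtained. This is a sketch for z > 0 only, with $G\rho = 1$; the Simpson integrator and the function names are assumptions introduced for illustration.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n is rounded up to an even number."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def disc_force_numeric(z, R, G_rho=1.0):
    """2*pi*G*rho*z * integral_0^R r dr / (r^2 + z^2)^(3/2), for z > 0."""
    radial = simpson(lambda r: r / (r * r + z * z) ** 1.5, 0.0, R)
    return 2.0 * math.pi * G_rho * z * radial

def disc_force_closed(z, R, G_rho=1.0):
    """Closed form -2*pi*G*rho*(z/sqrt(z^2 + R^2) - 1), valid for z > 0."""
    return -2.0 * math.pi * G_rho * (z / math.sqrt(z * z + R * R) - 1.0)

if __name__ == "__main__":
    for z in (0.5, 1.0, 3.0):
        print(z, disc_force_numeric(z, 2.0), disc_force_closed(z, 2.0))
```

As R grows with z held fixed, the closed form tends to the constant $2\pi G\rho$, which is the uniform field of an infinite plane.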

As the force (5.3.5) changes sign when z passes from positive to negative values, but does not vanish with it, the attractive force possesses a finite discontinuity at z = 0. The force undergoes a finite jump, as z passes from negative to positive values, equal to $4\pi G\rho$. Now, if R were to tend to infinity, as the Earth was once thought of as a flat, infinite disc, the acceleration due to gravity would, indeed, be constant everywhere.

We now let z be the axis of revolution of an oblate ellipsoid, $Z_0Z$, with coordinates ξ, η, ζ as in Fig. 5.6. For a system of particles that form a continuous density, the attractive force in the z direction is proportional to the difference in the masses averaged over the surface of the ellipsoid

$$F_z = G\rho\iiint_{Z}\frac{\zeta - z}{R^3}\,d\xi\,d\eta\,d\zeta.$$

Fig. 5.6. A figure of revolution.

Now, we can use our result of the disc, (5.3.5), to avoid two integrations. To do so, we consider a thin cross-section dζ at a distance ζ from the origin $Z_0$. We must now interpret ρ as a volume density, rather than a surface density, and integrating from $Z_0$ to Z, the limits of the ellipsoid along ζ, we find

$$F_z = 2\pi G\rho\int_{Z_0}^{Z}\left(\frac{z-\zeta}{\sqrt{(z-\zeta)^2 + R^2}} - 1\right)d\zeta, \tag{5.3.5}$$

for the total force of attraction at a point z > Z.

The surface of an oblate ellipsoid with a = b > c, shown in Fig. 5.9 (a), is given by the equation,

$$\frac{\xi^2 + \eta^2}{a^2} + \frac{\zeta^2}{c^2} = 1.$$

Introducing the radius of the cross-section at a distance ζ from the origin,

$$R^2 = a^2\left(1 - \frac{\zeta^2}{c^2}\right),$$

into (5.3.5) gives

$$F_z = 2\pi G\rho\left\{\int_{-c}^{c}\frac{(z-\zeta)\,d\zeta}{\sqrt{(z-\zeta)^2 + a^2 - \dfrac{a^2}{c^2}\zeta^2}} - 2c\right\}.$$

The rest is a technical matter of integration, which can be found in MacMillan [30].

The final result is

$$F_z = -\frac{3GM}{a^2 - c^2}\left[1 - \frac{z}{\sqrt{a^2 - c^2}}\tan^{-1}\frac{\sqrt{a^2 - c^2}}{z}\right], \tag{5.3.6}$$

where M is the mass of the ellipsoid,

$$M = \tfrac{4}{3}\pi\rho a^2c.$$

It is apparent from the expansion,

$$\tan^{-1}x = x - \tfrac{1}{3}x^3 + \cdots,$$

that

$$F_z = -GM\left[\frac{1}{z^2} - \frac{3}{5}\frac{a^2 - c^2}{z^4} + \cdots\right].$$
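The integration deferred to MacMillan can at least be checked numerically: the one-dimensional integral over the cross-sections should reproduce (5.3.6) for any exterior point on the axis. The sketch below assumes $G\rho = 1$, an oblate spheroid with a > c, and z > c; the helper names are illustrative.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n is rounded up to an even number."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def spheroid_force_numeric(z, a, c, G_rho=1.0):
    """Stack of discs: 2*pi*G*rho*(integral over zeta - 2c), for z > c."""
    def integrand(zeta):
        radius_sq = a * a * (1.0 - zeta * zeta / (c * c))  # R^2 of the slice
        return (z - zeta) / math.sqrt((z - zeta) ** 2 + radius_sq)
    return 2.0 * math.pi * G_rho * (simpson(integrand, -c, c) - 2.0 * c)

def spheroid_force_closed(z, a, c, G_rho=1.0):
    """Closed form (5.3.6) with G*M = (4/3)*pi*G*rho*a^2*c."""
    gm = 4.0 / 3.0 * math.pi * G_rho * a * a * c
    e = math.sqrt(a * a - c * c)
    return -3.0 * gm / (a * a - c * c) * (1.0 - z / e * math.atan(e / z))

def spheroid_force_series(z, a, c, G_rho=1.0):
    """First two terms of the large-z expansion quoted in the text."""
    gm = 4.0 / 3.0 * math.pi * G_rho * a * a * c
    return -gm * (1.0 / z**2 - 0.6 * (a * a - c * c) / z**4)

if __name__ == "__main__":
    print(spheroid_force_numeric(3.0, 2.0, 1.0), spheroid_force_closed(3.0, 2.0, 1.0))
    print(spheroid_force_closed(50.0, 2.0, 1.0), spheroid_force_series(50.0, 2.0, 1.0))
```

The second line of output shows how quickly the two-term series approaches the closed form once z is large compared with the focal distance.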

If the body were spherical, only the first term would subsist so that we may conclude that the force of attraction of an oblate spheroid at a point of its axis is less than it would be for a sphere at the same distance.

When (5.3.6) is evaluated at the surface z = a of the ellipsoid there results

$$F_{z=a} = -\frac{3GM}{(a^2 - c^2)^{3/2}}\left\{\sqrt{a^2 - c^2} - a\tan^{-1}\sqrt{1 - c^2/a^2}\right\}. \tag{5.3.7}$$

The terms in the braces of (5.3.7) represent the minimal distance of a convex curve in elliptic space, and are related to the phase of the asymptotic Hankel function whose argument is greater than its order. We will elaborate on this in Sec. 5.4.4. Moreover, the ratio of the two terms will be shown to give the capacitance of an oblate ellipsoid [cf. (5.4.34) below]. Finally, we can express (5.3.7) in terms of the eccentricity $\varepsilon = \sqrt{1 - c^2/a^2}$,^a

$$F_{z=c} = -\frac{3GM}{a^2\varepsilon^3}\left(\varepsilon - \tan^{-1}\varepsilon\right).$$

In contrast, for a prolate ellipsoid (c > a = b) whose surface is defined by the equation:

$$\frac{\xi^2}{c^2} + \frac{\eta^2 + \zeta^2}{a^2} = 1,$$

the force of attraction of a point on the surface x = c is

$$F_{x=c} = -\frac{3GM}{c^2\varepsilon^3}\left(\tanh^{-1}\varepsilon - \varepsilon\right) = -\frac{3GM}{2c^2\varepsilon^3}\left\{\ln\left(\frac{1+\varepsilon}{1-\varepsilon}\right) - 2\varepsilon\right\}, \tag{5.3.8}$$

expressed in terms of the eccentricity, $\varepsilon = \sqrt{1 - a^2/c^2}$. In Sec. 5.4.4 we will see that the distortion of the sphere into an oblate, or prolate, spheroid is attributed to the motion. This means setting the eccentricity equal to the relative velocity, for when the latter vanishes so, too, will the former. It will then be appreciated that the terms in the braces of (5.3.8) represent the difference between total electric and magnetic energies of a charged spheroid and the electrostatic energy it would have had it remained a sphere [Bucherer 04]. The terms also happen to be the difference between hyperbolic and Euclidean distances in velocity space. Consequently, the force is a measure of the deviation from Euclidean geometry due to deformation caused by the motion. For small values of the eccentricity, (5.3.8) reduces to the Newtonian law, $F_{x=c} = -GM/c^2$.

^a This corrects an error in Landau and Lifshitz [60] who give the eccentricity as $\varepsilon = \sqrt{a^2/c^2 - 1}$. This is obviously incorrect since the eccentricity must vary between 0 and 1. Their expression for the 'depolarization coefficient' of an oblate ellipsoid (4.34) should be replaced by

$$n^{(z)} = \frac{1}{\varepsilon^3}\left[\varepsilon - \sqrt{1-\varepsilon^2}\,\cos^{-1}\sqrt{1-\varepsilon^2}\right].$$

In the following, we will replace the uniform mass distribution over the spheroid by a uniform charge distribution, and the eccentricity by the relative velocity. In so doing we will derive the mass dependence on the relative velocity from the relativistic expression for the momentum.

5.4 Electromagnetic Mass

. . . the whole idea of electromagnetic mass is based on the view that the forces between point-charges do not obey the principle of action-reaction [O'Rahilly 38]

It may be said that electromagnetism was clarified by relativity whereas mechanics was transformed by it. The idea was that quantities like the Poynting vector could be used in defining mass since it is proportional to the momentum. Relativistic motion of matter led to inconsistencies with Newtonian dynamics, and only increased the need to place classical mechanics on an electrodynamic foundation in order to reconcile its divergence with classical theory.

The desire for symmetry in the natural laws led Maxwell and his followers to consider a magnetic pole on the same footing as electric charge. However, it was the discovery of the electron and the failure to find the magnetic monopole that led the mass concept to be associated with the electron and its electric charge. For the sake of symmetry alone, we could restore this equivalence by writing alongside the Lorentz force,

$$\mathbf{F}_e = e\left(\mathbf{E} + \frac{\mathbf{v}}{c}\times\mathbf{B}\right), \tag{5.4.1}$$

the Lorentz force for the magnetic pole,

$$\mathbf{F}_q = q\left(\mathbf{H} - \frac{\mathbf{v}}{c}\times\mathbf{D}\right), \tag{5.4.2}$$

where E and H are the electric and magnetic field intensities, B and D the magnetic and electric flux densities, and e and q the electric charge and magnetic pole strength, respectively.

The Lorentz force (5.4.1) is what the observer at rest would measure on an electron in motion with a velocity v. It can be decomposed into components that are parallel,

$$\mathbf{F}_\parallel = e\mathbf{E}_\parallel$$

and perpendicular,

$$\mathbf{F}_\perp = e\left(\mathbf{E}_\perp + \frac{\mathbf{v}}{c}\times\mathbf{B}\right),$$

to the velocity. If we were to apply Newton's second law, like the early practitioners of relativity, we would come out with two masses instead of one. These masses were baptized the 'transverse' and 'longitudinal' masses. At relativistic speeds, the longitudinal mass was much larger than the transverse mass, and since it did not fit the experimental measurements made by Kaufmann in the early part of last century it was swept under the rug. This is just one example where electromagnetism raised havoc with mechanics.

Now, let us consider the mass of a charged conductor at rest. The energy density due to the electrostatic field is

$$W_e = \frac{\varepsilon_0}{2}|\mathbf{E}|^2.$$

If, for the sake of simplicity, we assume the conductor to be a sphere of radius R, with a uniform surface charge density $\rho = e/4\pi R^2$, then the electric field density is

$$\mathbf{E} = \frac{e}{4\pi\varepsilon_0r^2}\hat{\mathbf{r}}, \tag{5.4.3}$$

for r ≥ R, where $\hat{\mathbf{r}}$ is the unit vector and $\varepsilon_0$ is the dielectric constant in free space. The total energy over all space is

$$\varepsilon_0\int_R^\infty\tfrac{1}{2}|\mathbf{E}|^2\,4\pi r^2\,dr = \frac{e^2}{8\pi\varepsilon_0R}. \tag{5.4.4}$$


Equating this with the rest energy, $mc^2$, gives the expression

$$m_{el} = \frac{e^2}{8\pi\varepsilon_0Rc^2}, \tag{5.4.5}$$

for the electrostatic mass.

Consider, now, the charge to be moving with a uniform velocity u. At low speeds a charge generates an electric field intensity, E, and a magnetic field density, B, whose magnitudes are related by

$$B(r) = |\mathbf{B}(r)| = |\mathbf{u}\times\mathbf{E}| = \frac{e\mu_0|\mathbf{u}\times\hat{\mathbf{r}}|}{4\pi r^2} = \frac{e\mu_0u\sin\theta}{4\pi r^2}, \tag{5.4.6}$$

for r > R, and 0 for r < R, where we introduced Coulomb's law (5.4.3), and θ is the angle between r and u. Equation (5.4.6) is known as the Biot–Savart law, named after its discoverers, where $\mu_0$ is the magnetic permeability of free space.

Thomson now considers the magnetic field energy as the kinetic energy since that has been generated by the motion of the charge. Thus, according to Thomson, the kinetic energy is

$$\tfrac{1}{2}m'u^2 = \frac{1}{2\mu_0}\int_R^\infty\!\!\int_0^\pi B^2(r)\,2\pi r^2\sin\theta\,d\theta\,dr = \tfrac{1}{2}\,\frac{\mu_0e^2}{6\pi R}u^2.$$

A comparison with (5.4.5), using $\mu_0\varepsilon_0c^2 = 1$, readily gives

$$m' = \tfrac{4}{3}m_{el} = m_{em}, \tag{5.4.7}$$

which has been called the electromagnetic mass. From its derivation it would appear that expression (5.4.7) is valid only for small velocities.
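The factor 4/3 comes entirely from the angular integral of $\sin^3\theta$; the radial integrals of the electric and magnetic energies are the same. A short numerical sketch makes this explicit. The constants are standard SI values, the radius passed in is an arbitrary illustrative number, and the function names are assumptions made here.

```python
import math

EPS0 = 8.8541878128e-12          # vacuum permittivity, F/m
C = 299792458.0                  # speed of light, m/s
MU0 = 1.0 / (EPS0 * C * C)       # defined so that mu0 * eps0 * c^2 = 1 exactly
E_CHARGE = 1.602176634e-19       # elementary charge, C

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n is rounded up to an even number."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def electrostatic_mass(R):
    """m_el = e^2 / (8 pi eps0 R c^2), Eq. (5.4.5)."""
    return E_CHARGE**2 / (8.0 * math.pi * EPS0 * R * C * C)

def thomson_mass(R):
    """m' from (1/2) m' u^2 = magnetic field energy of the slowly moving charge."""
    angular = simpson(lambda th: math.sin(th) ** 3, 0.0, math.pi)  # equals 4/3
    radial = 1.0 / R  # integral of dr / r^2 from R to infinity, done in closed form
    # (1/(2 mu0)) * (e mu0 u / (4 pi))^2 * 2 pi * angular * radial, divided by u^2 / 2
    return MU0 * E_CHARGE**2 / (8.0 * math.pi) * angular * radial

if __name__ == "__main__":
    R = 1.0e-15  # illustrative radius in metres
    print(thomson_mass(R) / electrostatic_mass(R))
```

The ratio does not depend on R or on the charge, so any positive radius gives the same number.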

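Instead of electrostatic considerations, we can begin with the Liénard force law of electron theory [cf. Eq. (4.1.8)],

$$F_x = \frac{e^2\cos\theta}{4\pi\varepsilon_0c^2r^2}\left[c^2 + u^2 - 3u_r^2 - \sum_j u_ju'_j\right] + \frac{e^2}{4\pi\varepsilon_0r^2}u_xu_r - \frac{e^2}{8\pi\varepsilon_0r}\left(\dot u_x + \dot u_r\right), \tag{5.4.8}$$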

where θ is the angle between r and x. If we take u to be in the x-direction, with $u_r = u\cos\theta$, $u_x = u$, and $\dot u_r = \dot u\cos\theta$, all terms in the velocities will contain odd powers in cos θ, and, hence, average to zero, whereas there will be a finite contribution coming from the acceleration terms. Then averaging

$$F_x = -\frac{\dot u\,e^2}{8\pi\varepsilon_0r}\left(1 + \cos^2\theta\right),$$

gives

$$F_x = -m_{el}\dot u\left(1 + \tfrac{1}{3}\right) = -m_{em}\dot u. \tag{5.4.9}$$

There is consensus that [Yaghjian 92]

Lorentz and Abraham were also unconcerned with the electromagnetic mass $m_{em}$ equaling 4/3 the electrostatic mass $m_{el}$, defined as the energy of formation of a spherical charge . . . because they derived the equation of motion before Einstein's 1905 papers on relativistic electrodynamics . . .

Nothing could be further from the truth, and Einstein's papers do not provide one iota of insight into the 4/3 factor.

Although it does not resolve the 4/3 factor, we have given two diametrically opposite proofs of the electromagnetic mass being 4/3 the electrostatic mass. In the first proof we have assumed a continuous distribution of charge, and the mass derived applies to a state of uniform motion. In the second derivation the spherical symmetry of the electron eliminates any dependency on the velocity in the force law, leaving only the acceleration terms. The origin of the mass is placed squarely on the interaction of the two charges, and the resulting force does not obey Newton's third law. So it seems that here again two proofs with incompatible assumptions lead to the same result.

O'Rahilly [38] thinks that the error lies in the definition of the 'kinetic' energy, T. It should be clear from the expression of the Lagrangian what is the kinetic energy. Starting with the Liénard electrokinetic potential,

$$L = \frac{ee'}{4\pi\varepsilon_0c^2r}\left[c^2 + u^2 - u_r^2 - \sum_j u_ju'_j - r\dot u_r\right], \tag{5.4.10}$$

from which the equation of motion (5.4.8) follows via the Euler–Lagrange equations,

$$F_x = -\frac{\partial L}{\partial r} + \frac{d}{dt}\frac{\partial L}{\partial u_r},$$

he claims that Thomson's procedure, “still reproduced in text-books,” consists in setting u = u′, e = e′, taking the velocity along the x-direction so that $u = u_x = u_r\cos\theta$ and is constant, where θ is the angle between r and x. In this case the Liénard electrokinetic potential reduces to

$$L = 2W_0 - W_0(1 + \cos^2\theta).$$

Averaging over θ gives

$$2W_0 - L = \tfrac{4}{3}W_0 = \tfrac{4}{3}m_{el}u^2 = m_{em}u^2.$$

This would identify the left-hand side as 2T which makes no sense.

Furthermore, the definition of the electrostatic mass, and consequently the electromagnetic mass, depends on our conception of what the electron looks like. For if we consider the charge to be uniformly distributed over the surface, $\sigma = e/4\pi\varepsilon_0R^2$, then the field inside the surface is zero and outside is (5.4.3). This gives an electrostatic energy (5.4.4), corresponding to an electromagnetic mass (5.4.7). Alternatively, if there is a volume distribution of charge with a charge density $\rho = 3e/4\pi\varepsilon_0R^3$, there will be a contribution to the electrostatic energy for r < R, since there is a non-vanishing electric field $E = \tfrac{1}{3}\rho r = er/4\pi\varepsilon_0R^3$. Thus the total electrostatic energy is

$$W_0 = \frac{1}{2}\int_0^R\frac{e^2}{4\pi\varepsilon_0R^6}r^4\,dr + \frac{1}{2}\int_R^\infty\frac{e^2}{4\pi\varepsilon_0r^2}\,dr = \frac{6}{5}\left(\frac{e^2}{8\pi\varepsilon_0R}\right), \tag{5.4.11}$$

which is the same as the average gravitational energy, and gives an electromagnetic mass, $\tfrac{6}{5}\cdot\tfrac{4}{3}m_{el} = \tfrac{6}{5}m_{em}$, greater than (5.4.7) due to the volume contribution to the energy in (5.4.11). There is nothing stranger in considering like charges distributed over a surface than to consider their density distribution in a finite volume. Both make no physical sense since like charges repel one another, and to consider a Poincaré pressure acting on a surface or throughout a volume has no physical relevance.
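The factor 6/5 in (5.4.11) is simply the ratio of the interior-plus-exterior field energy to the exterior energy alone. A minimal numerical sketch, in units where $e^2/4\pi\varepsilon_0 = 1$ (an assumption made here for convenience, as are the function names), is:

```python
def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b]; n is rounded up to an even number."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def interior_energy(R=1.0, k=1.0):
    """(1/2) * integral_0^R of k r^4 / R^6 dr, with k = e^2 / (4 pi eps0)."""
    return 0.5 * simpson(lambda r: k * r**4 / R**6, 0.0, R)

def exterior_energy(R=1.0, k=1.0):
    """(1/2) * integral_R^infinity of k / r^2 dr, mapped to a finite range."""
    # With s = 1/r one has dr / r^2 = -ds, so the integrand in s is the constant k
    # and the range becomes 0 <= s <= 1/R.
    return 0.5 * simpson(lambda s: k, 0.0, 1.0 / R)

def volume_to_surface_ratio(R=1.0):
    """Energy of the volume-charged sphere over that of the surface-charged one."""
    surface = exterior_energy(R)                    # no field inside the shell
    volume = interior_energy(R) + exterior_energy(R)
    return volume / surface

if __name__ == "__main__":
    print(volume_to_surface_ratio(1.0))
```

The ratio is independent of R, so the same enhancement carries over to the electromagnetic mass built on either charge distribution.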

Everything is fine so long as we treat the energy of interaction of two point charges. When we attempt to give the electron 'body and shape,' we run into trouble. The energy integrals extend over all of space, where the aether lives, but the electron can only be of finite extension. This is echoed in J. J. Thomson's view where

the mass has been increased by the charge; and since the increase is due to the magnetic force in the space around the charge, the increased mass is in this space and not in the charged sphere.


In the force law interpretation, the electric charges retain their point-like character and in defining the electrostatic mass there is no integral over all the volume. But, it suffers from resulting in an expansion in the relative velocity to order $1/c^2$. According to Ritz's treatment, higher-order terms in the expansion give the next correction to the force as

$$F_x^{(2)} = \frac{e^2}{8\pi\varepsilon_0c^3}\left(\ddot u_x + \ddot u_r\cos\theta\right).$$

Taking u along x, and $\ddot u_r = \ddot u\cos\theta$, and then averaging give

$$F_x^{(2)} = \frac{e^2}{6\pi\varepsilon_0c^3}\ddot u. \tag{5.4.12}$$

Expression (5.4.12) is independent of the electron's classical radius, and represents dissipation due to radiation.

According to Larmor [00], who discovered this effect in 1897, the expression for the magnetic field must be modified to read

$$B = \frac{\mu_0e\sin\theta}{4\pi r}\left(\frac{u}{r} + \frac{\dot u}{c}\right), \tag{5.4.13}$$

where for periodic motion the second term would average out to zero, but “will be preponderant in the integral of $B^2$ across the shell [of radiation] when r is great.” In fact, the integral over the radial coordinate from R to infinity will diverge. But, since the integral is independent of r, the “energy of the expanding shell is conserved as it moves.”

Averaging the power density,

$$\mathbf{F}\cdot\mathbf{u} = -\frac{d}{dt}\left(\tfrac{1}{2}m_{em}u^2 - \frac{e^2}{6\pi\varepsilon_0c^3}u\dot u\right) - \frac{e^2}{6\pi\varepsilon_0c^3}\dot u^2, \tag{5.4.14}$$

over the 'time of motion' leaves only the last term in (5.4.14) which, according to Larmor [00],

represents the amount of energy per unit time that travels away and is lost to the system, the velocity of the electron being as usual taken to be of a lower order than that of radiation.

This being so, then the neglect of such terms in the electromagnetic mass would mean that such mass cannot be converted into radiation because the loss term is absent. That is, if we expand the force,

$$F_x = \frac{ee'}{4\pi\varepsilon_0r^2}\left(A\cos(rx) - B\frac{u_x}{c} - C\frac{r\dot u_x}{c^2} - D\frac{r^2\ddot u_x}{c^3} - \cdots\right),$$

in powers of 1/c, where the coefficients are functions of the velocities and accelerations, it becomes clear that the radiation terms will enter at higher order. Thus, we come to the conclusion that it is not the electrostatic or electromagnetic mass which can be converted into radiation, but only the contributions coming from the higher-order terms.

It would therefore appear that without acceleration, mass cannot be converted into radiation. What then does the equivalence of mass and energy mean? According to O'Rahilly [38] the Lorentz–Einstein theory “has nothing . . . to say about the general interconvertibility of mass and energy.” We may say that $h\nu/c^2$ is the mass equivalent of radiation, but it cannot be identified with the mass of an electron.

5.4.1 What does the ratio e/m measure?

At the beginning of the twentieth century, experiments were devised to measure the charge-to-mass ratio in the hope of discovering the true origin of mass. Associating the kinetic energy of a particle with the energy of the magnetic field produced by an electric charge in motion, J. J. Thomson reasoned that it would take more energy to start or stop an electric charge than if it were neutral. This extra energy could be attributed to the inertia of a particle so that it would appear to have an additional mass due to its charge. This idea was first expressed by Thomson in April 1881, to which we will return in Sec. 5.4.3, but it did not attract much interest until 1897 when he found that cathode rays were negatively charged electrons traveling along the cathode ray tube at very high speeds.

A few years later, it was observed that rays emitted from radium salts behaved in electromagnetic fields as if they were composed of negatively charged particles. Experiments, involving the deflection of these rays in electric and magnetic fields, showed that the charge-to-mass ratio was of the same order as that of cathode rays, $\sim 10^7$. They were christened β-rays by Rutherford, and the term was used interchangeably with (high speed) electrons. The β-rays were deflected less in a magnetic field than cathode rays and could thus reach higher speeds, even exceeding 0.9 that of light.

β-rays were therefore better candidates to determine the variation of the ratio of e/m with velocity than cathode rays, which did not register any variation and which could only be accelerated up to 0.3 the speed of light. The experimental variation with velocity could then be confronted with that predicted by theory. There were two main contenders which viewed the electron in motion as prolate and oblate spheroids, whose axes in the direction of motion were shortened as a result of the FitzGerald–Lorentz contraction.

We will discuss these models in greater detail in Sec. 5.4.4; let it suffice here to give their expressions. Whereas the Abraham model predicted the mass should vary as

$$m = \frac{3}{4}\frac{m_0}{\beta^2}\left\{\frac{1+\beta^2}{2\beta}\ln\left(\frac{1+\beta}{1-\beta}\right) - 1\right\},$$

the Lorentz model predicted

$$m = \frac{m_0}{\sqrt{1-\beta^2}},$$

where $m_0$ is the mass of the electron in a state of rest, and β = u/c is the relative velocity.
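To see how close the two predictions are at low speed, and how far apart they drift near β = 0.9, the two expressions can be tabulated. This is a small illustrative sketch; the function names are assumptions, and the small-β limits quoted in the comments follow from expanding the logarithm.

```python
import math

def abraham_mass_ratio(beta: float) -> float:
    """m / m0 for the Abraham expression quoted above (0 <= beta < 1)."""
    if beta == 0.0:
        return 1.0  # limiting value; the formula itself is 0/0 at beta = 0
    log_term = math.log((1.0 + beta) / (1.0 - beta))
    return 0.75 / beta**2 * ((1.0 + beta**2) / (2.0 * beta) * log_term - 1.0)

def lorentz_mass_ratio(beta: float) -> float:
    """m / m0 for the Lorentz expression, 1 / sqrt(1 - beta^2)."""
    return 1.0 / math.sqrt(1.0 - beta**2)

if __name__ == "__main__":
    # For small beta: Abraham ~ 1 + (2/5) beta^2, Lorentz ~ 1 + (1/2) beta^2.
    for beta in (0.01, 0.3, 0.6, 0.9):
        print(beta, abraham_mass_ratio(beta), lorentz_mass_ratio(beta))
```

At β = 0.3, the upper limit quoted for cathode rays, the two ratios differ only in the third significant figure, which gives a sense of the accuracy the early experiments needed.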

The initial experiments were not of sufficient accuracy to distinguish between these two expressions for the mass, but they did show a very marked change of e/m with the relative speed. This led Thomson [28] to the conclusion that this is “consistent with the view that all the mass is electrical.” For if the mass were not entirely electrical in origin, “a constant term would have to be added [to the mass expressions] to represent the non-electrical mass.”

Experiments carried out over a span of more than a decade showed [Thomson & Thomson 28]

$e/m\sqrt{1-\beta^2}$ plotted against u, so the points should lie on a line parallel to the axis if the Lorentz formula is true, and it will be seen that they do so within the errors of experiment.

But, in Fig. 5.7 we see that it is the ratio e/m which is plotted against u/c. The experiments carried out by Bucherer and others use crossed fields.


Fig. 5.7. The ratio of charge to mass as a function of the relative velocity. The sloping curve is the ratio determined by Abraham while the horizontal curve results from Lorentz's formula.

β-rays are generated between plates of a condenser, and an outer solenoid applies a magnetic field. On emerging from the condenser the electrons strike a photographic plate at a distance δ from the condenser. The electric field has a sole component, $E_y$, perpendicular to the direction of motion of the electrons along the x-axis. The magnetic field has two components, $H_x = H\cos\vartheta$, and $H_z = H\sin\vartheta$. The configuration is shown in Fig. 5.8. There is no acceleration along the x- and z-directions, and the acceleration along the y-direction will vanish when the Lorentz force vanishes, viz.

$$E_y = \beta H_z = \beta H\sin\vartheta.$$

This is the condition for the rectilinear motion of the electrons so thatthey will be able to pass through the narrow gap in the condenser plates. Onemerging from the plates they are acted on by the magnetic field causingan acceleration, ay, in the y-direction,

euH sin θ = may. (5.4.15)

The deviation in the y-direction, for a particle moving under constant accel-eration, is given by

y = 12

ayτ2,


Fig. 5.8. The orientation of the fields in Bucherer’s experiment.

where $\tau$ is the time of flight. Using it to eliminate the acceleration in (5.4.15) gives the charge-to-mass ratio as

$$ \frac{e}{m} = \frac{2u^2y}{E_y\delta^2}, \tag{5.4.16} $$

where $\tau = \delta/u$ has been used to eliminate the time of flight. However, (5.4.16) does not agree with the experimental results.

Rather, if we modify the right-hand side of (5.4.15) by multiplying it by $1/\sqrt{1-\beta^2}$, (5.4.16) becomes

$$ \frac{e}{m} = \frac{2u^2y}{E_y\delta^2\sqrt{1-\beta^2}}, \tag{5.4.17} $$

whose constancy does agree with what is observed experimentally. Thus, it is not what Thomson claims that is held constant. Moreover, O’Rahilly [1965b] claims that no Lorentz type of “ad hoc modification” is required in Ritz’s theory since, as far as second-order terms in the relative velocity are concerned, there is no change in $H$ but $E_y$ will undergo an increase by the amount $1/\sqrt{1-\beta^2}$. This does not give the same result as (5.4.17). Finally, if we use the Liénard–Wiechert expression for the field produced by a charge moving at a uniform velocity $u$ [cf. (5.4.21) below], we come out with (5.4.16) when it is introduced into (5.4.17), or if it is introduced into (5.4.16), we come out with a ratio, $e/m$, decreasing as $u$ increases, as that predicted by Abraham’s model in Fig. 5.7!

Returning to Thomson, he goes further and claims that

This result might be supposed to prove that the whole of the mass of the electron is electrical. If this is so, and the electron is assumed spherical, its radius $a$ can be found from the equation $m_0 = \frac{2}{3}e^2/a$.

If we set such a rest mass in motion we should find that it acquires inertia of the order $1/\sqrt{1-\beta^2}$. But then what is there to be distinguished in the ratio $e/m$? Putting a charge in motion should also increase its inertia by the same amount. We will return to this shortly.

What is even more provoking is Thomson’s affirmation that

Einstein has shown that to conform with the principles of Relativity mass must vary with velocity according to the law $m_0/\sqrt{(1-u^2)}$. This is a test imposed by Relativity on any theory of mass. We see that it is satisfied by the conception that the whole of the mass is electrical in origin, and this conception is the only one yet advanced which gives a physical dependency of mass on velocity.

General principles can never be used to determine specific formulas. If the mass is completely electrical then it should have the increased inertia predicted from the relativistic formula, although the distinction between mass and charge has all but disappeared. Moreover, what Thomson is saying is that neutral mass should not manifest any variation with speed, for a neutral particle in motion does not create a magnetic field. How then does relativity distinguish between charged and neutral matter?

Kaufmann is slightly more cautious and writes the total mass as the sum of the mechanical (‘real’) mass and the electromagnetic (‘apparent’) mass. It is the latter that contains all the dependency on the speed. If this were the case the ratio $e/m$ would contain the charge in both the numerator and in the denominator so that at zero speed it would reduce to

$$ \frac{e}{m + e^2/6\pi\varepsilon_0ac^2}. $$

When the particle is set in motion why should only the second term in the denominator acquire a dependency on its speed?

According to Millikan,

since an electric current, by virtue of the property called self-induction, opposes any attempt to increase or diminish its magnitude, it is clear that an electric charge as such possesses properties of inertia . . . It is clear then that theoretically that an electrically charged pith ball must possess more mass than the same pith ball uncharged.

Until mass is defined in a way which does not employ charge the distinction between the two is more than precarious. As far back as 1911, More suggested that the ratio,

$$ e/m = (e/m)_0\sqrt{1-\beta^2}, \tag{5.4.18} $$

can either be interpreted as Lorentz does, $e = e_0$ and $m_0 = m\sqrt{1-\beta^2}$, or $m = m_0$ and $e = e_0\sqrt{1-\beta^2}$. Mass will increase with speed at constant charge, or charge will diminish with speed at constant mass. According to Bridgman, “the operations do not exist by which unique meaning can be given to the question of whether the magnitude of a charge is a function of its velocity.”

If we opt for the second choice, we can expect modifications to Coulomb’s law when charges are set into motion. Such corrections were known prior to the advent of the special theory and Lorentz transforms. They were derived by Liénard and Wiechert in 1898 and 1900, respectively.

They found the expressions for the vector and scalar potentials as [cf. (4.1.10) with the difference that we are now using rationalized units],

$$ \mathbf{a} = \frac{e\boldsymbol{\beta}}{4\pi\varepsilon_0(r - \boldsymbol{\beta}\cdot\mathbf{r})}, \qquad \phi = \frac{e}{4\pi\varepsilon_0(r - \boldsymbol{\beta}\cdot\mathbf{r})}, \tag{5.4.19} $$

where $\mathbf{u}$ is the velocity of the charge, and $\mathbf{r}$ is the radius vector, taken from where the charge is located to where it is observed. The terms on the right-hand sides of (5.4.19) must be evaluated at the earlier time where the charge was located. The negative of the electric field is obtained by taking the sum of the gradient of the scalar potential and the time-derivative of the vector potential,

$$ \mathbf{E} = -\frac{1}{c}\dot{\mathbf{a}} - \nabla\phi. $$


In an inertial frame, the same expression is obtained from a Lorentz transform, viz.

$$ \mathbf{E} = \frac{e\mathbf{r}}{4\pi\varepsilon_0r^3}\,\frac{1-\beta^2}{(1-\beta^2\sin^2\vartheta)^{3/2}}, \tag{5.4.20} $$

where $\vartheta$ is the angle between the radius vector $\mathbf{r}$ and the direction of motion of the charge. If the two should coincide, (5.4.20) reduces to

$$ E_\parallel = \frac{e}{4\pi\varepsilon_0r^2}(1-\beta^2), $$

while if the two directions are perpendicular to one another,

$$ E_\perp = \frac{1}{4\pi\varepsilon_0r^2}\,\frac{e}{\sqrt{1-\beta^2}}. \tag{5.4.21} $$
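The two limiting cases can be checked directly from the angular factor in (5.4.20). The sketch below is only an illustration; the function name `field_factor` is ours, and it returns the field in units of the Coulomb value $e/4\pi\varepsilon_0r^2$.

```python
import math

def field_factor(beta, theta):
    """Ratio of the field (5.4.20) to the Coulomb field e/(4*pi*eps0*r^2)."""
    return (1.0 - beta**2) / (1.0 - beta**2 * math.sin(theta) ** 2) ** 1.5

beta = 0.8
parallel = field_factor(beta, 0.0)                 # along the line of motion
perpendicular = field_factor(beta, math.pi / 2.0)  # normal to the motion

print(parallel, 1.0 - beta**2)                       # field is reduced
print(perpendicular, 1.0 / math.sqrt(1.0 - beta**2)) # field is enhanced
```

The parallel component is reduced by $1 - \beta^2$ while the perpendicular component grows by $1/\sqrt{1-\beta^2}$, the same factor that appears in the Lorentz mass.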

As the speed of a charge increases it will have opposing effects on the components of the electric field. The component of the field in the direction of the motion is contracted like that of a sphere into a spheroid, while the normal component is increased: like the mass? or the electric charge? This has not gone unnoticed, for Bridgman claims that Bush has “shown that there are advantages in supposing the charge of an electron to change when it is set in motion.” The electric field entering in Lorentz’s law is (5.4.20), if the acceleration terms are omitted as they must be in an inertial state. Such a state cannot radiate electromagnetic waves, but such a state is precisely that in which the ratio e/m has been determined!

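The electrokinetic potential for deriving the force is

$$ L = e(\phi - \boldsymbol{\beta}\cdot\mathbf{a}) \tag{5.4.22} $$

$$ \phantom{L} = \frac{e(1-\beta^2)}{4\pi\varepsilon_0(r - \boldsymbol{\beta}\cdot\mathbf{r})} = \frac{e(1-\beta^2)}{4\pi\varepsilon_0r'\sqrt{1-\beta^2\sin^2\vartheta'}}, \tag{5.4.23} $$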

where $r'$ is the distance from the charge to the observer at the exact moment he observes the charge. The last equality in (5.4.22) follows from $\mathbf{r}' = \mathbf{r} - \mathbf{u}r/c$, from which it follows that the magnitudes of the two distances, ‘then’ and ‘now,’ are in the inverse ratio of their respective sines, i.e.

$$ \frac{r'}{r} = \frac{\sin\vartheta}{\sin\vartheta'}, $$

where $\vartheta'$ is the angle between $\mathbf{r}'$ and $\mathbf{u}$.

The potential (5.4.22) can be traced all the way back to Clausius, who introduced it as the ‘electrodynamic’ potential in 1857. Later it was rediscovered by Schwarzschild, who named it the ‘electrokinetic’ potential, while Searle referred to it as the ‘convection’ potential for a charge moving at a constant velocity. The retarded potentials (5.4.19), or for that matter the potentials themselves, do not enter Maxwell’s equations. Heaviside showed dispraise for them, referring to “the metaphysical nature of the propagation of the potentials,” but they are necessary when Maxwell’s equations are to accommodate mass, as we shall see in Sec. 11.5.2.

The force components follow just as in mechanics, viz.

$$ F_x = -\frac{\partial L}{\partial x} + \frac{d}{dt}\frac{\partial L}{\partial u_x}. $$

On this expression, Ritz remarked:

This expression reduces, in first approximation, to the law of the inverse square of the distance; we can therefore call it the law of Newton generalized . . . In these formulas the notion of field does not intervene . . . This remarkable result, due to Schwarzschild, shows that Lorentz’s theory resembles the older theories much more than we could at first sight believe.

The ‘older’ theories Ritz is referring to are those of action at a distance in which the charged particles retained their corpuscular character and were not something belonging to a region of a continuum, the aether, which would increase or decrease, appear or disappear in a continuous fashion.

5.4.2 Models of the electron

We do not have the remotest idea of how the single electron holds together. It ought to be one of the most explosive and unstable things in physics; yet it behaves as a permanent existence in defiance of every known physical law. [Soddy 32]

Thomson [81, 88], as early as 1881, reasoned that a mass should be heavier when it is charged than when it is uncharged. Although this would be considered relativistic heresy today, the reason he gave was essentially that of self-induction: Just as starting a current creates an instantaneous electromotive force opposing it, so a charge set in motion creates an electric field, together with a changing magnetic field, that act on the charge to retard its motion. Likewise, when the particle decelerates, the electric field produced by the changing magnetic field acts in the direction of motion of the charge, thereby, again, increasing its inertia. Thus, it appears that a charged mass has a greater inertia than when it is uncharged.

This idea was further developed by Abraham [03] who transformed it into a model of the ‘rigid’ electron, and thus constructed the first theoretical model of a subatomic particle. Its generality [Cushing 81] and complete absence of ad hoc assumptions make it all the more incredible that it did not correspond to experiment. Today, it is almost forgotten, being surpassed by the Lorentz model, although it was the first field model of an elementary particle. Abraham’s idea that the mass of the electron could be accounted for by the electrodynamic fields meant that its energy and momentum could be determined from these fields in the case where deviations from a spherical form could be explained as a distortion due to the motion. So Abraham’s model is not as ‘rigid’ as it is made out to be.

Abraham based his model on a prolate ellipsoid, whose expression for the capacitance, and, hence, the electrostatic energy he found given in Maxwell’s Treatise on Electricity and Magnetism. Lorentz, on the other hand, considered that the motion distorts the spherical electron into a ‘Heaviside’ ellipsoid, as it was referred to in old literature, or what we commonly know today as an oblate ellipsoid.

Our aim will be to show that the expressions for the energies of the oblate and prolate ellipsoids are related by ‘analytic continuation,’ $R \to iR$, that converts elliptic into hyperbolic geometry, respectively.

5.4.3 Thomson’s relation between charges in motion and their mass

The calculation of the additional mass in the small motion limit,

$$ m' = \frac{e^2}{6\pi\varepsilon_0ac^2}, \tag{5.4.24} $$

where $a$ is the radius of a small sphere, had been made by equating the energy in the magnetic field (equal to the energy in the electric field) to the kinetic energy $\frac{1}{2}m'u^2$. The kinetic energy per unit volume is

$$ \frac{1}{2\mu_0}B^2 = \frac{\mu_0e^2u^2\sin^2\theta}{32\pi^2r^4}, $$

where $\theta$ is the angle between the direction of motion of the charge and a point which is a distance $r$ from the center of the sphere. The kinetic energy density is to be integrated over the volume with an element of a ring whose axis is in the line of motion of the charge with a cross section $r\,dr\,d\theta$ and perimeter $2\pi r\sin\theta$, where $r\sin\theta$ is the radius. This gives the element of volume as $2\pi r^2\sin\theta\,d\theta\,dr$. Hence,

$$ \frac{1}{2\mu_0}\int B^2\,dV = \frac{\mu_0e^2u^2}{16\pi}\int_0^\pi\sin^3\theta\,d\theta\int_a^\infty\frac{dr}{r^2} = \frac{\mu_0e^2}{12\pi a}u^2, \tag{5.4.25} $$

is the kinetic energy outside of the sphere of radius $a$.

Thus, there is an additional mass which is given by (5.4.24), in the small motion limit. In other words, if $m$ is the mass of the uncharged sphere, when it is charged and set into motion at a velocity $u$, it will have a kinetic energy given by

$$ \frac{1}{2}\left(m + \frac{\mu_0e^2}{6\pi a}\right)u^2, $$

where the second term in the parenthesis is seen to be $m'$ given by (5.4.24), observing that $\mu_0 = 1/\varepsilon_0c^2$. Thus, Thomson [21] concludes that “when a sphere moves through a liquid it behaves as if its mass were $m + m'$, where $m$ is the mass of the sphere, and (5.4.24) the mass of the liquid displaced by it.”
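The integration leading to (5.4.25) can be verified numerically. The following is a minimal sketch, not part of Thomson’s calculation: the angular integral is done by a midpoint rule, the radial integral analytically, and the numerical values of the charge, radius and speed are purely illustrative assumptions.

```python
import math

MU0 = 4.0e-7 * math.pi            # vacuum permeability in H/m (classical defined value)
C_LIGHT = 299_792_458.0           # speed of light in m/s
EPS0 = 1.0 / (MU0 * C_LIGHT**2)   # enforces mu0 = 1/(eps0 c^2), as used in the text

def magnetic_energy_outside(e, u, a, n_theta=2000):
    """Evaluate the double integral in (5.4.25) outside a sphere of radius a."""
    h = math.pi / n_theta
    # Midpoint rule for the integral of sin^3(theta) over [0, pi]; exact value is 4/3.
    angular = h * sum(math.sin((i + 0.5) * h) ** 3 for i in range(n_theta))
    radial = 1.0 / a  # integral of dr/r^2 from a to infinity
    return MU0 * e**2 * u**2 / (16.0 * math.pi) * angular * radial

def additional_mass(e, a):
    """Additional mass m' of (5.4.24)."""
    return e**2 / (6.0 * math.pi * EPS0 * a * C_LIGHT**2)

e, a, u = 1.602e-19, 1.0e-15, 1.0e6   # illustrative values only
energy = magnetic_energy_outside(e, u, a)
print(energy, 0.5 * additional_mass(e, a) * u**2)
```

The two printed numbers agree, confirming that the field energy outside the sphere equals $\frac{1}{2}m'u^2$ with $m'$ given by (5.4.24).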

Thomson then goes on to replace the ‘liquid’ by the ‘aether,’ saying that it is necessary for the conservation of momentum. In fact, the entire analogy with self-induction is inappropriate since accelerative motion has not been taken into account, nor have transient effects. According to classical theory, an accelerated charge must necessarily radiate, and, hence, mass will be lost. Moreover, (5.4.24) applies to “indefinitely slow motion.”

According to a respectable text [Richtmyer & Kennard 42], for finite motion, relativistic effects must be included in which “the sphere becomes contracted in the direction of the motion in the ratio $\sqrt{1-\beta^2} : 1$.” Hence, there is a further increase in the mass bringing it to

$$ m' = \frac{e^2}{6\pi\varepsilon_0ac^2}\cdot\frac{1}{\sqrt{1-\beta^2}}. \tag{5.4.26} $$

But, this makes no sense since $m'$ is the additional mass produced by the accompanying electromagnetic fields during the motion. Lorentz’s factor in (5.4.26), therefore, appears as an ad hoc factor appended onto a result which has already taken into account the motion of the mass. The dilatation (5.4.26) is no more fundamental than the hypotheses of Lorentz and Abraham.

In fact, Lorentz [52] admitted Thomson’s priority, but his calculation was considered by him to be “somewhat different from that to which one is led in the modern theory of electrons.” In effect, it had nothing to do with it, and Thomson stood by his results well into the mid 1920’s even after the advent of special relativity.

5.4.4 Oblate versus prolate spheroids

The shortening of the electron ‘is true but not really true’
Eddington

Lorentz and Abraham both devised models of the electron as a sphere suffering from (FitzGerald–Lorentz) contraction when in motion. Although Abraham’s model seemed initially the more promising one, it was Lorentz’s model that finally won the almost ten year long battle. Abraham’s model went to oblivion, and even he does not refer to it in the later editions of his second volume of Theorie der Elektrizität. Many an author considered [O’Rahilly 38]

Abraham’s results are largely of merely historical interest. We shall therefore turn to Lorentz’s ‘contractile electron.’

Our path is not to follow the historical evolution of these two models and the particular assumptions that went into their evaluation [for that see Cushing [81]], but, rather, to show they were two sides of the same coin and when flipped they turned into one another by the addition of $i$, or its removal. Nevertheless, as late as 1938, the matter of deciding between the Abraham and Lorentz models was far from settled. According to Zahn and Spees [38]


So far as is known to the authors, it appears that, at least for higher velocities, no very satisfactory experimental distinction between the two types of electron has as yet been made by direct electric and magnetic deflections. In view of the fundamental importance of such experiments it seems that much is left to be desired.

They isolated the problem in the so-called ‘10% effect.’ Bucherer and Neumann designed their experiments to make a quantitative measurement of the variation of the electron mass with velocity. In so doing they hoped to distinguish between the Abraham and Lorentz electrons. From Neumann’s data, it appeared that the ratio $e/m_0$ calculated from Lorentz’s theory remained practically constant, while in Abraham’s theory it varied about 10% [cf. Fig. 5.7]. The very large spreads, focusing effects, and scattering could have resulted in errors of 10% in such a way that masked the variation in Lorentz values while causing an observable variation in those of Abraham.

Notwithstanding the merits of the experiments, the Lorentz and Abraham electrons are, in fact, models of elliptic and hyperbolic geometry, respectively, and come out very simply from two different spheroids. Had Abraham only known his was a model of hyperbolic geometry he could have found allies in Varicak and Silberstein. But, everyone was focused on the expression for the mass and how it compared with the experimental measurements of the ratio e/m, made by Kaufmann, and later Bucherer and Neumann. Actually the Bucherer and Neumann experiments do little more than Kaufmann’s, which only succeeded in indicating a large, qualitative, increase in mass with velocity. Again, through the prejudice of wanting the relativistic electron to succeed, and along with it all relativistic matter whether charged or not, we see another golden opportunity wasted of investigating the geometrical structure of the two electron models.

The equation of an ellipsoid,

$$ \frac{x^2}{a^2+s} + \frac{y^2}{b^2+s} + \frac{z^2}{c^2+s} = 1, \qquad (a > b > c) \tag{5.4.27} $$

is a cubic equation in $s$, having three different real roots lying in the following ranges,

$$ s_1 \ge -c^2, \qquad -c^2 \ge s_2 \ge -b^2, \qquad -b^2 \ge s_3 \ge -a^2. $$

These roots are coordinates of a point $(x, y, z)$; surfaces of constant $s_1$, $s_2$, and $s_3$ are, respectively, ellipsoids, hyperboloids of one sheet, and hyperboloids of two sheets, all of which are confocal with the ellipsoid,

$$ \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1. $$

The problem of finding the electric field of a charged ellipsoid reduces to solving Laplace’s equation [Landau & Lifshitz 60],

$$ \frac{d}{ds}\left(R(s)\frac{d\Phi}{ds}\right) = 0, \tag{5.4.28} $$

where surfaces of the constant ellipsoidal coordinate, $s$, are equipotential surfaces. In particular, $s = 0$ represents the surface of the ellipsoid itself, and

$$ R(s) = \sqrt{(s+a^2)(s+b^2)(s+c^2)}, $$

where $s \ge a^2 > b^2 > c^2$.

The solution to Laplace’s equation,

$$ \Phi(s) = \frac{e}{16\pi\varepsilon_0}\int_s^\infty\frac{ds}{R(s)} = \frac{e}{16\pi\varepsilon_0}\int_s^\infty\frac{ds}{\sqrt{(s+a^2)(s+b^2)(s+c^2)}}, $$

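can be simplified by the change of variable,

$$ z = \left(\frac{a^2-c^2}{a^2+s}\right)^{1/2}. $$

For then we have the known integral,

$$ \frac{\sqrt{a^2-c^2}}{2}\int_s^\infty\frac{ds}{\sqrt{(s+a^2)(s+b^2)(s+c^2)}} = \int_0^{\sqrt{\frac{a^2-c^2}{a^2+s}}}\frac{dz}{\sqrt{(1-z^2)(1-\kappa^2z^2)}}, \tag{5.4.29} $$

where

$$ \kappa = \left(\frac{a^2-b^2}{a^2-c^2}\right)^{1/2} \tag{5.4.30} $$

is the modulus. A further change of variable, $z = \sin\varphi$, reduces (5.4.29) to

$$ \int_0^{\sqrt{\frac{a^2-c^2}{a^2+s}}}\frac{dz}{\sqrt{(1-z^2)(1-\kappa^2z^2)}} = \int_0^{\phi}\frac{d\varphi}{\sqrt{1-\kappa^2\sin^2\varphi}} = F(\phi,\kappa), $$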


where

$$ \sin\phi = \left(\frac{a^2-c^2}{a^2+s}\right)^{1/2}. $$

The integral $F(\phi,\kappa)$ is Legendre’s elliptic integral of the first kind. Putting the pieces together, we find the expression,

$$ \Phi = \frac{e}{8\pi\varepsilon_0\sqrt{a^2-c^2}}F(\phi,\kappa). \tag{5.4.31} $$

The capacity of the conductor is thus

$$ C_0^{-1} = \Phi/e = \frac{1}{8\pi\varepsilon_0\sqrt{a^2-c^2}}F(\phi,\kappa). \tag{5.4.32} $$

Neither (5.4.31) nor (5.4.32) can be found in closed form.

But the situation changes when any two semi-axes become equal, for then the ellipsoid degenerates into a spheroid. A spheroid is a surface of revolution obtained by revolving an ellipse about one of its axes. When the axis of revolution is the major axis the ellipsoid is ‘cigar-shaped,’ or prolate, while, if it is the minor axis, the ellipsoid is ‘pancake-shaped,’ or oblate. These spheroids are shown in Fig. 5.9.
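Although the general expressions are not available in closed form, the reduction (5.4.29) is easy to confirm numerically. The sketch below is only an illustration under our own choices: a plain midpoint rule, a substitution $s' = s + w^2$ with $w = x/(1-x)$ to bring the infinite range onto $[0, 1)$, and hypothetical function names.

```python
import math

def midpoint(f, lo, hi, n=20000):
    """Composite midpoint rule for the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def elliptic_f(phi, kappa):
    """Legendre's incomplete elliptic integral of the first kind, F(phi, kappa)."""
    return midpoint(lambda t: 1.0 / math.sqrt(1.0 - (kappa * math.sin(t)) ** 2), 0.0, phi)

def lhs_5_4_29(a, b, c, s=0.0):
    """Left-hand side of (5.4.29), with s' = s + w^2 and w = x/(1-x)."""
    def integrand(x):
        w = x / (1.0 - x)
        sp = s + w * w
        jacobian = 2.0 * w / (1.0 - x) ** 2   # ds'/dx
        return jacobian / math.sqrt((sp + a * a) * (sp + b * b) * (sp + c * c))
    return 0.5 * math.sqrt(a * a - c * c) * midpoint(integrand, 0.0, 1.0)

def rhs_5_4_29(a, b, c, s=0.0):
    """Right-hand side of (5.4.29): F(phi, kappa) with the modulus (5.4.30)."""
    kappa = math.sqrt((a * a - b * b) / (a * a - c * c))
    phi = math.asin(math.sqrt((a * a - c * c) / (a * a + s)))
    return elliptic_f(phi, kappa)

print(lhs_5_4_29(3.0, 2.0, 1.0), rhs_5_4_29(3.0, 2.0, 1.0))   # triaxial ellipsoid
print(lhs_5_4_29(1.0, 1.0, 0.6), math.acos(0.6))              # oblate case, a = b
```

In the oblate case the modulus vanishes and $F(\phi, 0) = \phi = \cos^{-1}(c/a)$, which is the closed form that the spheroid makes available.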

Fig. 5.9. (a) Oblate ellipsoid with a = b > c; (b) prolate ellipsoid with a = b < c.


The reason for the distortion of the sphere into spheroids is that motion will, in general, distort objects. Whether this is real or apparent is another matter, and depends on the geometric space that the observer is in. For an oblate ellipsoid $a = b > c$, so that we may think of the semi-minor axis $c$ as being due to a FitzGerald–Lorentz contraction,

$$ c = \sqrt{1-\beta^2}\,a. $$

Since the modulus (5.4.30) of the elliptic integral of the first kind $F$ vanishes, the integral in (5.4.32) can easily be performed. We then get

$$ C_o = \frac{8\pi\varepsilon_0\sqrt{a^2-c^2}}{\cos^{-1}(c/a)}. \tag{5.4.33} $$

Consequently,

$$ \Phi_o = \frac{e}{C_o} = \frac{e\cos^{-1}(c/a)}{8\pi\varepsilon_0\sqrt{a^2-c^2}}, \tag{5.4.34} $$

is the field at the surface of the oblate ellipsoid, whose energy is $e$ times (5.4.34),

$$ W_o = e\Phi_o = \frac{e^2}{8\pi\varepsilon_0a}\,\frac{\cos^{-1}\sqrt{1-\beta^2}}{\beta}. \tag{5.4.35} $$

The momentum, $G_o$, associated with the total energy $W_o$ is$^b$

$$ G_o = uW_o/c^2 = m_{el}c\cos^{-1}\sqrt{1-\beta^2}, $$

where we have set $m_{el} = e^2/8\pi\varepsilon_0ac^2$ [cf. Eq. (5.4.5)]. Actually, $m_{el}$ represents the electrostatic mass. Moreover, if we use our definition of mass as

$$ m = \frac{\partial G_o}{\partial u} = m_{el}\gamma, \tag{5.4.36} $$

we obtain Lorentz’s expression for the ‘transverse’ mass, where $\gamma = 1/\sqrt{1-u^2/c^2}$. Now by the definition of the force,

$$ \mathbf{F} = \frac{d\mathbf{G}_o}{dt} = \frac{d}{dt}m\gamma\mathbf{u} = m\gamma\dot{\mathbf{u}} + m\gamma^3(\mathbf{u}\cdot\dot{\mathbf{u}})\mathbf{u}/c^2, \tag{5.4.37} $$

there will be a component of the force in the direction of the velocity. This is non-Newtonian, and leads to a rate of working

$$ \mathbf{F}\cdot\mathbf{u} = m\gamma(\mathbf{u}\cdot\dot{\mathbf{u}})\left[1 + \frac{u^2}{c^2}\gamma^2\right] = m(\mathbf{u}\cdot\dot{\mathbf{u}})\gamma^3. \tag{5.4.38} $$

Introducing this into the last term in (5.4.37) leads to

$$ \dot{\mathbf{u}} = \frac{\mathbf{F} - (\mathbf{F}\cdot\mathbf{u})\,\mathbf{u}/c^2}{m_{el}\gamma}. \tag{5.4.39} $$

If the velocity is parallel to the force, (5.4.39) becomes

$$ \dot{\mathbf{u}} = \frac{\mathbf{F}}{m_{el}\gamma^3}, \tag{5.4.40} $$

where $m_l = m_{el}\gamma^3$ is referred to as the longitudinal mass, while if the force is perpendicular to the velocity there results the transverse mass. In Sec. 3.7.3.2 we remarked that such a situation corresponds to a uniformly rotating disc where the velocity is tangent to the disc while the centripetal force is directed inward toward the center of the disc. This can hardly describe the rectilinear motion of an electron.

$^b$ Confusion should not arise between the velocity of light and the c-axis of the ellipsoid.
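That (5.4.39) really inverts (5.4.37) can be checked with a few lines of arithmetic. The sketch below works in units with $c = 1$ and uses plain tuples for vectors; the function names and the numerical inputs are illustrative assumptions.

```python
import math

C = 1.0  # units in which the speed of light is unity

def dot(p, q):
    """Euclidean scalar product of two 3-vectors given as tuples."""
    return sum(pi * qi for pi, qi in zip(p, q))

def gamma_of(u):
    return 1.0 / math.sqrt(1.0 - dot(u, u) / C**2)

def force(m, u, udot):
    """Force from (5.4.37): m*gamma*udot + m*gamma^3*(u.udot)*u/c^2."""
    g = gamma_of(u)
    k = m * g**3 * dot(u, udot) / C**2
    return tuple(m * g * ad + k * ui for ui, ad in zip(u, udot))

def acceleration(m, u, f):
    """Acceleration from (5.4.39): (F - (F.u)u/c^2) / (m*gamma)."""
    g = gamma_of(u)
    fu = dot(f, u) / C**2
    return tuple((fi - fu * ui) / (m * g) for fi, ui in zip(f, u))

u, udot = (0.3, 0.4, 0.0), (1.0, -2.0, 0.5)
print(acceleration(2.0, u, force(2.0, u, udot)))   # recovers udot

# Velocity parallel to the acceleration: the force is m*gamma^3*udot, as in (5.4.40).
print(force(1.0, (0.6, 0.0, 0.0), (1.0, 0.0, 0.0))[0], 1.25**3)
```

The round trip returns the original acceleration, and the parallel case reproduces the factor $\gamma^3$ of the longitudinal mass.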

Since $m_l$ is not what Kaufmann’s experiments predicted, the longitudinal mass was quickly forgotten. The vanishing of (5.4.38) gives the condition for the absence of the longitudinal mass: Either the velocity is constant, or the velocity is orthogonal to the acceleration. For radiation phenomena neither of these two conditions is met.

If the ellipsoid of revolution is prolate, (5.4.34) becomes imaginary in form, since $c > a = b$, though not in reality. The potential at any distance $r$ from the surface is:

$$ \Phi_p(r) = \frac{e}{8\pi\varepsilon_0\sqrt{c^2-a^2}}\ln\frac{\sqrt{r+c^2}+\sqrt{c^2-a^2}}{\sqrt{r+c^2}-\sqrt{c^2-a^2}} = \frac{e}{4\pi\varepsilon_0\sqrt{c^2-a^2}}\tanh^{-1}\left(\frac{\sqrt{c^2-a^2}}{\sqrt{r+c^2}}\right). $$

The eccentricity of the prolate ellipsoid is $\varepsilon = \sqrt{1-a^2/c^2}$, which as $a \to 0$ degenerates into a long thin rod of length $\ell = \varepsilon c$. As we have seen in Sec. 5.3, the potential $\Phi_p$ becomes infinite as $\varepsilon \to 1$. Instead, as $\varepsilon \to 0$, $c \approx a$, and the equipotential surfaces are nearly spheres. We are then far from the rod, and deformations due to motion, or, for that matter, attraction, are hardly perceptible.

Bearing in mind that $\sqrt{a^2-c^2} = i\sqrt{c^2-a^2}$, and $\cos^{-1}(c/a) = i\cosh^{-1}(c/a)$, the capacity (5.4.33) becomes

$$ C_p = \frac{8\pi\varepsilon_0\sqrt{c^2-a^2}}{\cosh^{-1}(c/a)}, $$

so that the electrostatic field at the surface of a prolate ellipsoid is

$$ \Phi_p = \frac{e}{8\pi\varepsilon_0}\,\frac{\cosh^{-1}(c/a)}{\sqrt{c^2-a^2}} = \frac{e}{8\pi\varepsilon_0c}\,\frac{\tanh^{-1}\varepsilon}{\varepsilon}. $$

The energy at the surface of the prolate ellipsoid in terms of the relative velocity $\beta$,

$$ W_p = e\Phi_p = m_{el}c^2\,\frac{\tanh^{-1}\beta}{\beta} = \frac{m_{el}c^2}{2\beta}\ln\left(\frac{1+\beta}{1-\beta}\right), \tag{5.4.41} $$

gives a momentum,

$$ G_p = uW_p/c^2 = m_{el}c\tanh^{-1}\beta. \tag{5.4.42} $$

Abraham would not have arrived at this expression for the velocity had he not introduced another contraction in the direction of the motion, and not used the definition of the momentum as the derivative of the Lagrangian with respect to the relative velocity. His definition of the Lagrangian as the negative of the energy which has been Lorentz-contracted is inaccurate.

Using the definition of mass as (5.4.36), we now find

$$ m = \frac{dG_p}{du} = \frac{m_{el}}{1-\beta^2}. \tag{5.4.43} $$
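The two mass formulas follow from differentiating the respective momenta with respect to the velocity, which a central finite difference makes explicit. This is a sketch under our own conventions: units with $m_{el} = c = 1$, and function names that are not from the text.

```python
import math

def g_oblate(beta, m_el=1.0, c=1.0):
    """Momentum of the oblate (Lorentz) model: m_el * c * arccos(sqrt(1 - beta^2))."""
    return m_el * c * math.acos(math.sqrt(1.0 - beta**2))

def g_prolate(beta, m_el=1.0, c=1.0):
    """Momentum of the prolate model, (5.4.42): m_el * c * artanh(beta)."""
    return m_el * c * math.atanh(beta)

def mass_from_momentum(g, beta, h=1e-6, c=1.0):
    """Central-difference estimate of dG/du at u = beta*c."""
    return (g(beta + h) - g(beta - h)) / (2.0 * h * c)

beta = 0.6
print(mass_from_momentum(g_oblate, beta), 1.0 / math.sqrt(1.0 - beta**2))  # gamma
print(mass_from_momentum(g_prolate, beta), 1.0 / (1.0 - beta**2))          # gamma squared
```

The oblate momentum yields $m_{el}\gamma$, as in (5.4.36), while the prolate momentum yields $m_{el}/(1-\beta^2)$, as in (5.4.43).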

It was precisely this mass that Lewis and Tolman [09], and Wilson and Lewis [12], found when the units of mass and length vary with a change of axes. The latter even used it to fault Minkowski’s definition which coincided with that of Lorentz, (5.4.36). It may be thought that the electrostatic mass undergoes a Lorentz contraction,

$$ m_{el} = m_0\sqrt{1-\beta^2}, $$

where $m_0$ is the invariant mass. Then, since $G_pu$ is the work necessary to keep the electron in a state of constant motion, the total mass would be

$$ \frac{m_0}{\sqrt{1-\beta^2}} = m_0\left[\frac{\beta^2}{\sqrt{1-\beta^2}} + \sqrt{1-\beta^2}\right]. $$

Lewis and Tolman consider the reflection of light off a mirror located on a platform in motion at velocity $u$. They find that the calculated back-and-forth path of light is “greater in the ratio $1/(1-\beta^2)$.” But, this seems to contradict their previous finding that the length should be lengthened only by $1/\sqrt{1-\beta^2}$. Here is how they patch things up:

Now the velocity of light must seem the same to the observer, whether he is at rest or in motion. His measurements of velocity depend upon his units of length and time. We have already seen that a second on a moving clock is lengthened in the ratio $1/\sqrt{1-\beta^2}$, and therefore if the path of the beam of light were also greater in this same ratio, we should expect that the moving observer would find no discrepancy in his determination of the velocity of light. From the point of view of a person considered at rest, however, we have just seen that the path is increased by the larger ratio $1/(1-\beta^2)$. In order to account for this larger difference, we must assume that the unit of length in the moving system has been shortened in the ratio $\sqrt{1-\beta^2}/1$.

Lewis and Tolman fail to appreciate that the to-and-fro motion of light bouncing off a mirror is equivalent to an inelastic collision between two particles each traveling at the same velocity in opposite directions. If their speed be $u$, we can always transfer to a frame in which one of the bodies is stationary and the other moves with speed $U = 2u/(1+\beta^2)$. If $m_0$ is the stationary mass, its energy would have appeared to increase by

$$ E = \frac{m_0c^2}{\sqrt{1-B^2}} = \left(\frac{1+\beta^2}{1-\beta^2}\right)m_0c^2 = m(B)c^2, $$

where $B = U/c$. Now the total mass,

$$ m_0 + m(B) = M, $$

would have seemed to increase more than the sum of the stationary masses by the factor $1/(1-\beta^2)$, since

$$ M = \frac{2m_0}{1-\beta^2}. \tag{5.4.44} $$
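The algebra behind (5.4.44) is short enough to check numerically. The following sketch assumes nothing beyond the formulas just quoted; the function names and the sample value $\beta = 0.5$ are ours.

```python
import math

def boosted_speed(beta):
    """Relative speed B = U/c seen from the rest frame of one particle."""
    return 2.0 * beta / (1.0 + beta**2)

def moving_mass(beta, m0=1.0):
    """m(B) = m0 / sqrt(1 - B^2), which equals m0 (1 + beta^2)/(1 - beta^2)."""
    big_b = boosted_speed(beta)
    return m0 / math.sqrt(1.0 - big_b**2)

def total_mass(beta, m0=1.0):
    """Total mass M = m0 + m(B) of (5.4.44)."""
    return m0 + moving_mass(beta, m0)

beta = 0.5
print(moving_mass(beta), (1.0 + beta**2) / (1.0 - beta**2))
print(total_mass(beta), 2.0 / (1.0 - beta**2))
```

For $\beta = 0.5$ the relative speed is $B = 0.8$, and the total mass is $2m_0/(1-\beta^2) = 8m_0/3$, in agreement with (5.4.44).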


Similar considerations apply to the lengthening of the path of light upon reflection from a moving mirror so there is no need to assume a mass or a length contraction by a factor of $\sqrt{1-\beta^2}$! It is precisely the mass increase (5.4.44) that the prolate model of an ellipsoid predicts, (5.4.43). This is just Poynting’s derivation of the mass–energy relationship that we discussed in Sec. 3.5.2.

The electrostatic energies (5.4.35) and (5.4.41) have already been seen to be related by an imaginary factor. We can show that they are, in fact, related to the phase of the Bessel function in the bright and shadow regions. In order to do so, we place the uncharged conducting ellipsoid in a uniform external electric field which coincides with the major axis of the ellipsoid. The expression for the electrostatic potential is the potential of an electric dipole which is proportional to the electric field. The coefficients of proportionality are the depolarization coefficients, which, if the coordinate axes do not coincide with the spheroid, form a symmetric tensor of rank two.

Consider an electric field directed along the x-axis which coincides with the major axis of the spheroid, the a-axis of the oblate and the c-axis of the prolate. The constant and parallel field to the major axis will induce a non-uniform charge distribution on the surface of the spheroid whose potential is

$$ \Phi'(s) = \frac{e}{8\pi\varepsilon_0}\int_s^\infty\frac{ds}{(s+a^2)R(s)} \approx \frac{e}{8\pi\varepsilon_0}\int_{r^2}^\infty\frac{ds}{s^{5/2}} = \frac{e}{12\pi\varepsilon_0r^3}. $$

At large $r$, $s \sim r^2$ and the potential of the induced charge $\Phi'(r) \sim e/12\pi\varepsilon_0r^3$ is that of a dipole. Rather, at the surface of the sphere, the potential of the induced charge is

$$ \Phi'_o(0) = \frac{1}{8\pi\varepsilon_0c(a^2-c^2)^{3/2}}\left\{\sqrt{a^2-c^2} - c\cos^{-1}(c/a)\right\}. \tag{5.4.45} $$

Apart from a constant factor, (5.4.45) is identical to the attractive gravitational force at the surface of an oblate spheroid, (5.3.7). In the next section we will show how it is related to the minimal curve of a convex body in elliptic space.

For a prolate ellipsoid ($c > a = b$) the field is aligned with the z-axis which is parallel to the major axis $c$. The potential of the induced distribution at the surface is

$$ \Phi'_p(0) = \frac{e}{8\pi\varepsilon_0c\varepsilon^3}\left\{\frac{1}{2}\ln\left(\frac{1+\varepsilon}{1-\varepsilon}\right) - \varepsilon\right\} = \frac{e}{8\pi\varepsilon_0c\varepsilon^3}\left\{\tanh^{-1}\varepsilon - \varepsilon\right\} = \frac{e}{8\pi\varepsilon_0c(1-a^2/c^2)^{3/2}}\left\{\cosh^{-1}(c/a) - \sqrt{1-a^2/c^2}\right\}, \tag{5.4.46} $$

where $\varepsilon = \sqrt{1-a^2/c^2}$ is the eccentricity, which is not to be confused with the dielectric constant, $\varepsilon_0$, in free space. Again, apart from a constant factor (5.4.46) is identical to the attractive, gravitational force of a prolate ellipsoid, (5.3.8). This, too, will be related to a minimal curve of a convex body in the next section, but one occurring in hyperbolic space.

Moreover, expression (5.4.45) is the phase of the Debye asymptotic form of the Hankel function for $a > c$ [Babic & Buldyrev 91], or for one whose argument is greater than its order. Hankel functions describe the periodic propagation of a wave field whose wavefronts, $\Phi' = \text{const.}$, are involutes to the circle of radius $a = c$. The rays are half-lines tangent to the circle $a = c$, as shown in Fig. 5.10.

In the shadow region $a < c$, where the rays do not penetrate, the imaginary phase of the Debye asymptotic form of a Hankel function, whose order is greater than its argument, is given by (5.4.46). The Hankel function describes a wave field that decays exponentially in space. Separating bright and shadow regions is the caustic circle of radius $c$.

Fig. 5.10. The caustic circle of radius c separates the bright (periodic) region a > c from the shadow (exponential) region, a < c.

Even more can be said, and here is the hook-up with hyperbolic geometry. For variable minor axis, $a = x$ say, the terms in the parenthesis of (5.4.46) can be written as

$$ y = c\ln\frac{c+\sqrt{c^2-x^2}}{x} - \sqrt{c^2-x^2}. \tag{5.4.47} $$

Equation (5.4.47) is the equation of a tractrix shown in Fig. 2.1.9. Revolving this curve about its asymptote we obtain a surface of revolution, which we discussed in Sec. 2.5. The surface is none other than the pseudosphere which has a constant, negative curvature $-1/c^2$.

5.5 Minimal Curves for Convex Bodies in Elliptic and Hyperbolic Spaces

A convex body is characterized by its area A, perimeter L, diameter D, and thickness, E. The diameter and thickness are the maximum and minimum widths of the convex body, respectively. Inequalities involving pairs of these quantities are [Sholander 52]

$$
E \le L/\pi \le D. \tag{5.5.1}
$$

The reason for the inequalities is that a circle of perimeter L has diameter L/π. Since a circle has largest area among curves of a given perimeter, the largest value of L/π is D. In any event, L/π cannot be smaller than the minimum width of the convex body. Inequalities involving more than two quantities are

$$
2\sqrt{D^2 - E^2} + 2E\sin^{-1}(E/D) \le L, \tag{5.5.2a}
$$

$$
L \le 2\sqrt{D^2 - E^2} + 2D\sin^{-1}(E/D). \tag{5.5.2b}
$$

Simply combining the two inequalities gives the first and third inequalities in (5.5.1).

However, inequalities (5.5.2a) and (5.5.2b) tell us far more. To see this, we use the relation between the inverse trigonometric functions to write them as

$$
\sqrt{D^2 - E^2} - E\cos^{-1}(E/D) \le \frac{1}{2}(L - \pi E), \tag{5.5.3a}
$$

$$
\frac{1}{2}(\pi D - L) \ge D\cos^{-1}(E/D) - \sqrt{D^2 - E^2}. \tag{5.5.3b}
$$

Introducing the angle,

$$
r = \cos^{-1}(E/D), \tag{5.5.4}
$$

measured in radians, enables (5.5.3a) and (5.5.3b) to be written as

$$
0 \le \tan r - r \le \frac{1}{2}(L/E - \pi), \tag{5.5.5a}
$$

$$
0 \le r - \sin r \le \frac{1}{2}(\pi - L/D). \tag{5.5.5b}
$$

Whereas the left-hand inequalities are well-known elementary trigonometric inequalities, those on the right-hand side are not. That is, the first inequality in (5.5.5a) guarantees that the isoperimetric quotient of any regular polygon is less than that of a circle. The inequality on the left-hand side of (5.5.5b) guarantees that the area of any sector inscribed in a right triangle is less than the area of the right triangle.

Rather, the right-hand inequalities in (5.5.5a) and (5.5.5b) say something different. Adding (5.5.5a) and (5.5.5b) results in

$$
2D\sin r = 2\sqrt{D^2 - E^2} \le L. \tag{5.5.6}
$$

This inequality is stronger than the inequality for segments, which states 2D ≤ L [Sholander 52]. If E is the radius of a circle, and D is a line from the center to any point outside that circle, then the half-lines emanating from the point outside the circle that are tangent to the circle, plus the arc length on the circle connecting the points where the half-lines touch the circle, form the perimeter L of the shaded area in Fig. 5.11. Then inequality (5.5.6) states that the length of the perimeter cannot be less than the two half-lines that are tangent to the circle.
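As a numerical illustration of (5.5.1), (5.5.2a), (5.5.2b), (5.5.5a), (5.5.5b) and (5.5.6), one may take an ellipse as the convex body. This is a sketch under assumptions not made in the text: the ellipse with semi-axes p = 2 and q = 1, for which D = 2p and E = 2q, and a midpoint-rule estimate of its perimeter are choices made here purely for illustration.

```python
import math

def ellipse_perimeter(p, q, steps=20000):
    """Perimeter of the ellipse x = p cos t, y = q sin t by the midpoint rule."""
    h = 2 * math.pi / steps
    return sum(
        math.hypot(p * math.sin((k + 0.5) * h), q * math.cos((k + 0.5) * h)) * h
        for k in range(steps)
    )

def sholander_bounds(D, E):
    """Left side of (5.5.2a) and right side of (5.5.2b)."""
    root = 2 * math.sqrt(D**2 - E**2)
    angle = math.asin(E / D)
    return root + 2 * E * angle, root + 2 * D * angle

p, q = 2.0, 1.0            # illustrative semi-axes
D, E = 2 * p, 2 * q        # maximum and minimum widths of the ellipse
L = ellipse_perimeter(p, q)
lower, upper = sholander_bounds(D, E)
r = math.acos(E / D)       # the angle of (5.5.4)

print(f"(5.5.1):  {E:.4f} <= {L / math.pi:.4f} <= {D:.4f}")
print(f"(5.5.2):  {lower:.4f} <= {L:.4f} <= {upper:.4f}")
print(f"(5.5.5a): {math.tan(r) - r:.4f} <= {0.5 * (L / E - math.pi):.4f}")
print(f"(5.5.5b): {r - math.sin(r):.4f} <= {0.5 * (math.pi - L / D):.4f}")
print(f"(5.5.6):  {2 * D * math.sin(r):.4f} <= {L:.4f}")
```

For a circle, D = E, the two bounds in (5.5.2a) and (5.5.2b) coincide with L = πD.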

Fig. 5.11. The perimeter L consists of the two half-lines that are tangent to the circle and the arc length between them.

Inequalities (5.5.5a) and (5.5.5b) refer to elliptic geometry. In order to demonstrate this we consider a regular n-gon, where L = L′/n is the length of a side. For an n-gon circumscribed about a circle of elliptic radius R, we add inequalities (5.5.5a) and (5.5.5b) to obtain

$$
\tan r \le \frac{1}{2}\frac{L'}{nE}. \tag{5.5.7}
$$

We lose no generality by assuming the width E = 1 [Sholander 52]. Consider the right triangle BAO with central angle π/n in Fig. 5.12. From spherical geometry, tan (π/n) = tan r/ sin R, and introducing this into (5.5.7) gives

$$
2\pi(n/\pi)\tan(\pi/n)\sin R \le L'.
$$

Now, taking the limit as n → ∞ shows that the spherical length of the circumference of a spherical circle is the lower bound to the perimeter of a regular n-gon

$$
2\pi\sin R \le L'. \tag{5.5.8}
$$

If we had, instead, written the sum of (5.5.5a) and (5.5.5b) as the inequality

$$
\sin r \le \frac{1}{2}\frac{L'}{nD},
$$

we would have considered a regular n-gon inscribed in a circle of radius R. This would again lead to (5.5.8).

Fig. 5.12. A circle inscribed in an n-gon.

The trigonometry of the Euclidean plane is transformed into the trigonometry of the hyperbolic plane simply by allowing the absolute constant to become imaginary. However, inequalities must be inverted

$$
\frac{\vartheta}{\tan\vartheta} < 1, \qquad \frac{\vartheta}{\sin\vartheta} > 1,
$$

$$
\frac{i\vartheta}{\tan(i\vartheta)} = \frac{i\vartheta}{i\tanh\vartheta} = \frac{\vartheta}{\tanh\vartheta} > 1, \tag{5.5.9}
$$

$$
\frac{i\vartheta}{\sin(i\vartheta)} = \frac{i\vartheta}{i\sinh\vartheta} = \frac{\vartheta}{\sinh\vartheta} < 1.
$$

Thus, any self-contradiction that may arise in hyperbolic geometry must also necessarily arise in Euclidean geometry. The second equality, (5.5.9), applies to a tractrix, and shows that it is the result of treating the relative velocity as purely imaginary, just like the transformation from elliptic to hyperbolic geometry by treating the arc length as purely imaginary.

The transition to the hyperbolic realm is easily made by allowing D to become smaller than E. Then, instead of (5.5.1) we now have

$$
E \ge L/\pi \ge D.
$$

Moreover, inequalities (5.5.3a) and (5.5.3b) become

$$
E\cosh^{-1}(E/D) - \sqrt{E^2 - D^2} \ge \frac{1}{2}(\pi E - L), \tag{5.5.10a}
$$

$$
\sqrt{E^2 - D^2} - D\cosh^{-1}(E/D) \ge \frac{1}{2}(L - \pi D). \tag{5.5.10b}
$$

Defining the ‘angle,’

$$
r = \cosh^{-1}(E/D),
$$

inequalities (5.5.10a) and (5.5.10b) can be written as

$$
r - \tanh r \ge \frac{1}{2}\left(\pi - \frac{L}{E}\right), \tag{5.5.11a}
$$

$$
\sinh r - r \ge \frac{1}{2}\left(\frac{L}{D} - \pi\right). \tag{5.5.11b}
$$

Adding (5.5.11a) and (5.5.11b) gives

$$
2D\sinh r = 2\sqrt{E^2 - D^2} \ge L. \tag{5.5.12}
$$

This says that twice the straight-line segments of rays in a caustic, D ≤ E, cannot be less than the length, L.

In order to demonstrate that we are in hyperbolic space consider a regular n-gon of length L = L′/n inscribed in a circle of hyperbolic radius R. Introducing this length into (5.5.12) gives

$$
\sinh r \ge \frac{1}{2}\frac{L'}{nD}. \tag{5.5.13}
$$

Again we lose no generality in assuming the minimum width D = 1. Consider the right triangle BAO in Fig. 5.13 with central angle π/n. Using the hyperbolic right angle formula, sinh r = sinh R sin (π/n), inequality (5.5.13) becomes

$$
2\pi(n/\pi)\sin(\pi/n)\sinh R \ge L'.
$$

Proceeding to the limit as n → ∞, we get

$$
2\pi\sinh R \ge L', \tag{5.5.14}
$$

showing that the perimeter of a polygon is bounded from above by the hyperbolic circumference of a circle. This could have immediately been obtained from the elliptic inequalities by allowing their arguments to become imaginary and inverting the inequalities. Rather, had we written (5.5.13) as

$$
\tanh r \ge \frac{1}{2}\frac{L'}{nE},
$$

we would have been led to consider a regular n-gon circumscribed about a circle of hyperbolic radius R.

Fig. 5.13. A regular n-gon inscribed in a circle.

Inequalities (5.5.8) and (5.5.14) show that the perimeter of an n-gon in the limit where n → ∞ is bounded from below and above by the elliptic and hyperbolic circumferences of a circle, respectively.
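The approach to the two limits can be followed numerically. The sketch below assumes the expressions 2n tan (π/n) sin R and 2n sin (π/n) sinh R that precede (5.5.8) and (5.5.14); the function names and the sample radius R = 0.7 are illustrative choices, not taken from the text.

```python
import math

def elliptic_polygon_bound(n, R):
    """2n tan(pi/n) sin R, the expression preceding (5.5.8)."""
    return 2 * n * math.tan(math.pi / n) * math.sin(R)

def hyperbolic_polygon_bound(n, R):
    """2n sin(pi/n) sinh R, the expression preceding (5.5.14)."""
    return 2 * n * math.sin(math.pi / n) * math.sinh(R)

R = 0.7  # illustrative radius
for n in (4, 16, 256, 4096):
    print(n, elliptic_polygon_bound(n, R), hyperbolic_polygon_bound(n, R))

# Limiting circumferences of the circle in the two geometries
print("2 pi sin R  =", 2 * math.pi * math.sin(R))
print("2 pi sinh R =", 2 * math.pi * math.sinh(R))
```

The elliptic expression decreases toward 2π sin R because n tan (π/n) decreases to π, while the hyperbolic expression increases toward 2π sinh R because n sin (π/n) increases to π.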

5.6 The Tractrix

Since the elliptic plane can be represented on a sphere without distortion, it is natural to inquire whether there exists a ‘pseudo’-sphere upon which the hyperbolic plane can be developed. Such a surface in Euclidean space would have all its distances measured on its surface related to distances measured on the hyperbolic plane. The only distance we have is the hyperbolic measure of the relative velocity (in natural units)

$$
\bar{u} = \frac{1}{2}\ln\left(\frac{1 + u}{1 - u}\right) = \tanh^{-1} u. \tag{5.6.1}
$$

These lines in the hyperbolic plane are geodesics, like the great circles on a sphere. The ‘smoothness,’ or point-to-point homogeneity, of the hyperbolic plane requires a constant, negative, specific curvature, K, for the surface.

In 1827 Gauss proved that the total curvature of any triangle, with angles A, B, and C, formed by geodesics on a surface S is given by

$$
\iint K\,dS = K\iint dS = K\mathcal{A} = A + B + C - \pi,
$$

where 𝒜 is the area of the triangle. Since the area must be positive, 𝒜 = (A + B + C − π)/K > 0, the angle ‘excess’ implies K > 0, while an angle ‘defect’ necessitates K < 0. Consequently, an angle excess implies a surface of positive constant curvature, K > 0, while an angle defect implies a surface of negative constant curvature, K < 0. An example of the former is a triangle on the surface of a sphere, while that of the latter is a triangle on a pseudosphere. They can also be pictured as ‘fitting errors,’ as in Fig. 9.7.
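A small numerical example may make the sign rule concrete. The sketch below is illustrative only: the function names are hypothetical, and the two triangles (an octant of the unit sphere, and a triangle with three angles of π/4 on a surface with K = −1) are chosen here as examples.

```python
import math

def angle_excess(A, B, C):
    """A + B + C - pi: positive for an excess, negative for a defect."""
    return A + B + C - math.pi

def geodesic_triangle_area(A, B, C, K):
    """Area of a geodesic triangle on a surface of constant curvature K."""
    return angle_excess(A, B, C) / K

# Octant of the unit sphere (K = +1): three right angles, one eighth of 4 pi
octant_area = geodesic_triangle_area(math.pi / 2, math.pi / 2, math.pi / 2, K=1.0)

# Triangle with three angles of pi/4 on a surface with K = -1: angle defect
defect_area = geodesic_triangle_area(math.pi / 4, math.pi / 4, math.pi / 4, K=-1.0)

print(octant_area, defect_area)
```

In both cases the area comes out positive, as required: the excess is divided by a positive K and the defect by a negative K.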

The simplest example of a pseudosphere was given by Minding in 1839 [cf. Sec. 2.5]. It is a bugle-shaped tractroid, shown in Fig. 2.19, formed by revolving the tractrix around the z-axis. It is the surface of revolution for which the cylindrical coordinates r = √(x² + y²) and z are expressible in terms of the parameter ū, defined in (5.6.1); specifically, we have

$$
r = a = c\,\mathrm{sech}\,\bar{u}, \qquad z = c(\bar{u} - \tanh\bar{u}).
$$

The latter shows that the hyperbolic measure of the velocity can never be less than its Euclidean measure.

There are two reasons why this model of hyperbolic geometry is inferior to that of elliptic geometry [Coxeter 65]. First, the maximum and minimum normal curvatures at any point on the pseudosphere have a constant product K = −1, though they individually vary from point to point on the surface. By contrast, the maximum and minimum normal curvatures of the sphere are constant everywhere. Second, the tractroid does not represent the entire hyperbolic plane since the bugle has a ‘rim’ of length 2πc at ū = 0, though the total area is finite. Beltrami suspected, and Hilbert proved, that there is no smooth surface that can cover the whole hyperbolic plane.

Fig. 5.14. Newton’s tractrix.

By considering the tractrix itself we can make connection with the depolarization coefficient, introduced earlier, and, consequently, with the phase of the asymptotic form of the Bessel function. The equation for the tractrix, shown in Fig. 5.14, in the xy-plane is

$$
\frac{ds}{dy} = -\frac{c}{y}, \tag{5.6.2}
$$

where ds = √(dx² + dy²) is the arc length, and c is the constant slope. Huygens interpreted the curve as the ‘path’ of a stone pulled by a rope of length c.

Introducing the arc length into (5.6.2) leads to

$$
\sqrt{1 + \left(\frac{dx}{dy}\right)^2} = -\frac{c}{y}.
$$

Squaring, rearranging, and taking the negative square root give

$$
dx = -\frac{\sqrt{c^2 - y^2}}{y}\,dy,
$$

and integrating results in

$$
x = c\cosh^{-1}(c/y) - \sqrt{c^2 - y^2}. \tag{5.6.3}
$$

Comparing (5.6.3) with the last line in (5.4.46) shows that x would correspond to the depolarization coefficient n(x), and y to the semi-minor axis of the ellipsoid, a (cf. Eq. (4.2) of [Landau & Lifshitz 60]). The semi-major axis c would correspond to the slope of the tractrix.
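The parametric form r = c sech ū, z = c(ū − tanh ū) and the explicit form (5.6.3) can be checked against each other, together with Huygens’ picture of a rope of fixed length c. The following is a sketch under assumptions: the function names are hypothetical, ū denotes the hyperbolic measure of (5.6.1), and the tangent is estimated by a central finite difference rather than computed analytically.

```python
import math

def tractrix_point(u_bar, c=1.0):
    """Parametric point (r, z) = (c sech u, c (u - tanh u))."""
    return c / math.cosh(u_bar), c * (u_bar - math.tanh(u_bar))

def tractrix_explicit(y, c=1.0):
    """Right-hand side of (5.6.3): c arcosh(c/y) - sqrt(c^2 - y^2)."""
    return c * math.acosh(c / y) - math.sqrt(c**2 - y**2)

def tangent_length(u_bar, c=1.0, h=1e-6):
    """Length of the tangent segment from the curve to its asymptote r = 0."""
    r0, z0 = tractrix_point(u_bar - h, c)
    r1, z1 = tractrix_point(u_bar + h, c)
    dr, dz = r1 - r0, z1 - z0          # finite-difference tangent direction
    r, _ = tractrix_point(u_bar, c)
    # Follow the tangent until r reaches zero; scale the direction accordingly
    return abs(r / dr) * math.hypot(dr, dz)

c = 2.0  # illustrative rope length
for u_bar in (0.5, 1.0, 2.0):
    r, z = tractrix_point(u_bar, c)
    print(u_bar, z, tractrix_explicit(r, c), tangent_length(u_bar, c))
```

With y identified with r and x with z, the two descriptions agree, and the tangent segment keeps the constant length c along the curve.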


Now, the integral of (5.6.2),

$$
y/c = e^{-s/c} = \tan\frac{\Pi(s)}{2} = \sqrt{1 - u^2}, \tag{5.6.4}
$$

identifies c with the absolute constant of the hyperbolic space, and Π is the angle of parallelism. The second equality follows from the Bolyai–Lobachevsky formula, and the last equality follows from identifying y with a = c sech ū. For a system at rest, Π = π/2, and the geometry is Euclidean, while for Π < π/2 and u < 1 it is definitely hyperbolic.

Since the left-hand side of (5.6.4) satisfies the functional relation

$$
f(s_1)\cdot f(s_2) = f(s_1 + s_2),
$$

it follows that the right-hand side of (5.6.4) is equivalent to the hyperbolic Pythagorean theorem

$$
\cosh\bar{u}_1\cdot\cosh\bar{u}_2 = \cosh\bar{u}, \tag{5.6.5}
$$

for a right angle hyperbolic triangle with ū1 ⊥ ū2 [Silberstein 14]. Since sech ū = sin Π, (5.6.5) is equivalent to

$$
\sin\Pi_1\cdot\sin\Pi_2 = \sin\Pi.
$$

5.7 Rigid Motions: Hyperbolic Lorentz Transforms and Elliptic Rotations

It is no mere coincidence that we repeatedly had to deal with double angle formulas of trigonometric and hyperbolic functions. Any motion in Euclidean geometry can be reduced to a pair of reflections [Sommerville 58]. In general, a displacement of an object is equivalent to a pair of inversions in two circles which cut perpendicularly a given circle. And, in any general displacement, there are always two points which are left unaltered.

Rigid motions, characterizing different geometries, correspond to different inertial frames. The most important transformation is an involution, which is a projectivity of period two. It is a non-trivial projectivity which is its own inverse. The term involution was coined by the French geometer Desargues, and literally denotes the twisted state of young leaves [Rosenfeld 88]. If the two fixed points are real, the involution is hyperbolic, whereas, if the fixed points are imaginary, the involution is elliptic. The parabolic case corresponds to a coincidence of the two fixed points.

The ‘canonical’ form of an involution is

$$
xx' \pm 1 = 0,
$$

where the plus and minus signs refer to elliptic and hyperbolic involutions, respectively. That is, an elliptic involution, xx′ + 1 = 0, has the fixed points ±i, while the hyperbolic involution, xx′ − 1 = 0, has the real fixed points ±1. For unequal fixed points, a and b, the involution becomes

$$
xx' - \frac{1}{2}(a + b)(x + x') + ab = 0. \tag{5.7.1}
$$

In particular, if x = x′, then (x − a)(x − b) = 0, while, if we let a → ∞,

$$
x + x' = 2b, \tag{5.7.2}
$$

which is a reflection.

By a symmetry argument, it can be shown that the conjugate non-real fixed points of an elliptic involution are u and −1/u [Schwerdtfeger 62]. The involution (5.7.1) becomes

$$
xx' - \frac{1}{2}(u - u^{-1})(x + x') - 1 = 0, \tag{5.7.3}
$$

which is a reflection. If we let u → 0, we get the reflection in the origin [cf. (5.7.2)]

$$
x + x' = 0. \tag{5.7.4}
$$

Combining the two reflections, (5.7.3) and (5.7.4), gives the translation [Coxeter 65],

$$
xx' - \frac{1}{2}(u - u^{-1})(x - x') + 1 = 0, \tag{5.7.5}
$$

which can be brought into the form of the addition law for tangent,

$$
\frac{x - x'}{1 + xx'} = \frac{2u}{1 - u^2}. \tag{5.7.6}
$$

If u = tan ½ϑ, and x = tan ϕ and x′ = tan ϕ′, then tan (ϕ − ϕ′) = tan ϑ. And in terms of ϑ, the translation (5.7.5) becomes

$$
xx' - \cot\vartheta\,(x - x') + 1 = 0, \tag{5.7.7}
$$

which is a camouflaged way of writing the clockwise rigid rotation,

$$
M(x) = x' = \frac{\cos\vartheta\,x + \sin\vartheta}{-\sin\vartheta\,x + \cos\vartheta}. \tag{5.7.8}
$$

The Möbius transform (5.7.8) is a product of rotations, each through an angle θ = ½ϑ. For the counter-clockwise rotation we have

$$
M(x) = \frac{x - \tan\vartheta}{\tan\vartheta\,x + 1}, \tag{5.7.9}
$$

which is the addition law for tangent. Because M(tan ϑ) = 0, its conjugate non-real fixed point is −cot ϑ, since M(−cot ϑ) = ∞. In just the same way that the cross-ratio of fixed points and their conjugates gives the double angle formulas for hyperbolic functions, so the cross-ratio of the non-real fixed points and their conjugates gives the double angle formulas for trigonometric functions.
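The action of the two Möbius maps on x = tan ϕ can be checked directly. This is a sketch with hypothetical function names and illustrative angles; it verifies only that (5.7.8) sends tan ϕ to tan (ϕ + ϑ), that (5.7.9) sends it to tan (ϕ − ϑ) and satisfies the relation (5.7.7), and that the two maps are inverse to one another.

```python
import math

def rotation_578(x, theta):
    """Mobius map (5.7.8): (cos(theta) x + sin(theta)) / (-sin(theta) x + cos(theta))."""
    return (math.cos(theta) * x + math.sin(theta)) / (
        -math.sin(theta) * x + math.cos(theta)
    )

def rotation_579(x, theta):
    """Mobius map (5.7.9): (x - tan(theta)) / (tan(theta) x + 1)."""
    return (x - math.tan(theta)) / (math.tan(theta) * x + 1)

def relation_577(x, x_prime, theta):
    """Left-hand side of (5.7.7); it should vanish for a matching pair (x, x')."""
    return x * x_prime - (x - x_prime) / math.tan(theta) + 1

phi, theta = 0.3, 0.5            # illustrative angles in radians
x = math.tan(phi)
u = math.tan(theta / 2)          # half-angle parameter

print(rotation_578(x, theta), math.tan(phi + theta))
print(rotation_579(x, theta), math.tan(phi - theta))
print(relation_577(x, rotation_579(x, theta), theta))
print(2 * u / (1 - u**2), math.tan(theta))
```

The last line is the double angle formula that underlies (5.7.6): with u = tan ½ϑ, the ratio 2u/(1 − u²) equals tan ϑ.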

For consider the points tanh u1 and tanh u2 whose conjugate points are coth u1 and coth u2. The distance between u1 and u2 is the positive square root of the cross-ratio:

$$
\{\tanh u_1, \coth u_1 \,|\, \tanh u_2, \coth u_2\} = \tanh^2(u_1 - u_2).
$$

In an analogous way, a segment whose ends are u1 and u2 has non-real conjugate points at u1 ± π/2 and u2 ± π/2. The fixed and conjugate non-real points are tan u1, tan u2, and −cot u1 and −cot u2. Their cross-ratio,

$$
\{\tan u_1, -\cot u_2 \,|\, \tan u_2, -\cot u_1\} = \tan^2(u_1 - u_2),
$$

gives the double angle formula for the tangent. Switching the second and fourth member would make it negative without changing its magnitude.

In contrast, if the fixed points u and 1/u are real, the involution is

$$
xx' - \frac{1}{2}(u + u^{-1})(x + x') + 1 = 0. \tag{5.7.10}
$$

This is a general reflection, which if x = x′ becomes

$$
(x - u)(x - u^{-1}) = 0.
$$

Combining the reflection (5.7.10) with the reflection through the origin,

$$
x' + x = 0, \tag{5.7.11}
$$

by allowing u → 0, gives

$$
x'x + \frac{1}{2}(u + u^{-1})(x' - x) - 1 = 0, \tag{5.7.12}
$$

or, equivalently,

$$
\frac{x' - x}{1 - xx'} = \frac{2u}{1 + u^2} = \alpha. \tag{5.7.13}
$$

Expression (5.7.13) can be simplified by a hyperbolic substitution. Setting x = tanh ū and x′ = tanh ū′ with u = tanh ½ᾱ, we get tanh (ū − ū′) = tanh ᾱ, which can only be the case if

$$
\bar{u} + \bar{u}' = 0, \tag{5.7.14}
$$

so that ᾱ = 2ū. This shows that the relativistic velocity composition law, (5.7.13), is the addition law for equal and opposite velocities, where the relative speed of the two systems is

$$
\alpha = \frac{2u}{1 + u^2} = \tanh(2\bar{u}) = \tanh\bar{\alpha}.
$$

The Möbius transformation,

$$
M_{\bar{u}}(x) = \frac{x + \tanh\bar{u}}{\tanh\bar{u}\,x + 1}, \tag{5.7.15}
$$

will readily be appreciated as the Lorentz transform in homogeneous coordinates. In fact, it is a product of Lorentz transforms at relative speeds u = tanh ū. From this we may safely conclude that

space and time are not separate entities, but enter only through their ratio x/t, as a homogeneous coordinate.
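That the Möbius map (5.7.15) composes like a Lorentz transform can be seen numerically: applying it twice, with hyperbolic parameters ū1 and ū2, is the same as applying it once with ū1 + ū2. The sketch below uses a hypothetical function name and illustrative values, with velocities in natural units.

```python
import math

def lorentz_mobius(x, u_bar):
    """Mobius map (5.7.15) acting on the homogeneous coordinate x = x/t."""
    t = math.tanh(u_bar)
    return (x + t) / (t * x + 1)

x = 0.3                   # illustrative velocity in natural units
u1_bar, u2_bar = 0.4, 0.9  # illustrative hyperbolic parameters

# Composition adds the hyperbolic measures, not the Euclidean velocities
twice = lorentz_mobius(lorentz_mobius(x, u1_bar), u2_bar)
once = lorentz_mobius(x, u1_bar + u2_bar)
print(twice, once)

# Equal and opposite velocities, (5.7.13): 2u/(1 + u^2) = tanh(2 u_bar)
u = math.tanh(u1_bar)
print(2 * u / (1 + u**2), math.tanh(2 * u1_bar))

# The real fixed points x = +1 and x = -1 (the speed of light) are left unaltered
print(lorentz_mobius(1.0, u1_bar), lorentz_mobius(-1.0, u1_bar))
```

Starting from rest, x = 0, the map returns tanh ū, the Euclidean measure of the velocity.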

Thus, the Möbius transforms (5.7.8) and (5.7.15) may be combined to read

$$
M(x) = \frac{x + \tan(\hat{u}\sqrt{\kappa})/\sqrt{\kappa}}{x\tan(\hat{u}\sqrt{\kappa})/\sqrt{\kappa} + 1}, \tag{5.7.16}
$$

where κ = ±1 in the elliptic and hyperbolic cases, for which û = ũ and û = ū, respectively. But, whereas the hyperbolic measure of distance is

$$
\bar{u} = \tanh^{-1} u = \frac{1}{2}\ln\left(\frac{1 + u}{1 - u}\right), \tag{5.7.17}
$$

the elliptic measure of distance is

$$
\tilde{u} = \sin^{-1} u. \tag{5.7.18}
$$

Noting the symmetry between (5.7.18) and sinh⁻¹ (u/√(1 − u²)) = ū, and that between (5.7.17) and tan⁻¹ (u/√(1 − u²)) = ũ, these definitions of hyperbolic and elliptic measures of distance become apparent.

5.8 The Elliptic Geometry of an Oblate Spheroid

In this section we show that the motion of an oblate spheroid describes the phenomenon of aberration within the realm of elliptic geometry. That is, we will derive the expression for the distance,

$$
\vartheta' = \cos^{-1}\sqrt{1 - \alpha^2}, \tag{5.8.1}
$$

whose argument for an oblate ellipsoid is the ratio of the semi-axes, c/a = √(1 − α²), from the phenomenon of aberration. This presupposes that if there is a physical phenomenon in hyperbolic space there must be a corresponding one in elliptic space.

The minimum distance, ϑ′ = 0, occurs for a stationary system for which α = 0. The maximum distance, on the other hand, ϑ′ = π/2, occurs in the ultrarelativistic case α = 1, since α is always proportional to the relative velocity. For motion in the x-direction, the composition laws for the velocities in the x- and y-directions are

$$
u'_x = \frac{u_x - u}{1 - uu_x}, \qquad u'_y = \frac{u_y\sqrt{1 - u^2}}{1 - uu_x}. \tag{5.8.2}
$$

This imposes the constraint that we measure the incoming rays as straight lines inclined to the vertical, instead of the usual convention of expressing their altitude with respect to the plane. Thus, ux = − sin ϑ, and u′x = − sin ϑ′ so that the composition laws become

$$
\sin\vartheta' = \frac{\sin\vartheta + u}{1 + u\sin\vartheta}, \qquad \cos\vartheta' = \frac{\cos\vartheta\sqrt{1 - u^2}}{1 + u\sin\vartheta}.
$$

Their ratio,

$$
\tan\vartheta' = \frac{\sin\vartheta + u}{\cos\vartheta\sqrt{1 - u^2}}, \tag{5.8.3}
$$

describes a plane wave changing its direction on transition from one inertial frame to another. If light propagates along the normal to the direction in which the frame is moving at uniform velocity, ϑ = 0, and

$$
\tan\vartheta' = \frac{u}{\sqrt{1 - u^2}} = \sinh\bar{u},
$$

so that ϑ′ is the angle of aberration.

The condition ux = 0 applied to the second equation in (5.8.2) shows that the transverse velocity component becomes contracted in the prime frame. Another important special case occurs when ux = −u, implying that in the prime frame the longitudinal component of the velocity reduces (5.8.3) to the double angle formula,

$$
\tan\vartheta' = \frac{\alpha}{\sqrt{1 - \alpha^2}} = \tan(2\vartheta) = \frac{2u}{1 - u^2} =: \lambda_e, \tag{5.8.4}
$$

where

$$
\sin\vartheta' = \sin(2\vartheta) = \frac{2u}{1 + u^2} =: \lambda_h, \tag{5.8.5}
$$

and

$$
\cos\vartheta' = \cos(2\vartheta) = \frac{1 - u^2}{1 + u^2}. \tag{5.8.6}
$$

Finally, equating (5.8.6) with the inverse of (5.8.1) gives α = λh. As ϑ′ → 0 so, too, does u, while as ϑ′ → π/2, u → 1. Since aberration occurs in elliptic geometry, it has a size effect associated with it. An observer at the north pole of a sphere will see an object moving toward the equator decrease in size until it reaches it at π/2. Then it will tend to increase in size until it reaches the south pole at π. Klein fixed the maximum distance in elliptic space as π/2, so that an observer will always see an object which is moving away diminishing in size. Since aberration was thought to live in hyperbolic space, no size effect was ever predicted. But, as we have seen, there is a size effect to aberration when it resides in elliptic space.
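The aberration formulas of this section lend themselves to a quick consistency check. The sketch below is illustrative: the function name and the sample values are assumptions, velocities are in natural units, and in the double angle formulas u is taken to be tan ϑ, which is what (5.8.4) to (5.8.6) require.

```python
import math

def aberrated_sin_cos(theta, u):
    """sin and cos of the aberrated angle, from the composition laws (5.8.2)."""
    denom = 1 + u * math.sin(theta)
    sin_p = (math.sin(theta) + u) / denom
    cos_p = math.cos(theta) * math.sqrt(1 - u**2) / denom
    return sin_p, cos_p

# The transformed direction is still a unit vector
s, c = aberrated_sin_cos(theta=0.4, u=0.6)
print(s**2 + c**2)

# Normal incidence, theta = 0: tan(theta') = u / sqrt(1 - u^2) = sinh(u_bar)
u = 0.6
s0, c0 = aberrated_sin_cos(0.0, u)
print(s0 / c0, math.sinh(math.atanh(u)))

# Double angle forms (5.8.4)-(5.8.6) with u = tan(theta)
theta = 0.3
t = math.tan(theta)
lambda_e = 2 * t / (1 - t**2)    # tan(2 theta)
lambda_h = 2 * t / (1 + t**2)    # sin(2 theta)
cos_2theta = (1 - t**2) / (1 + t**2)
print(lambda_e, lambda_h, cos_2theta)

# With alpha = lambda_h, sqrt(1 - alpha^2) reproduces (5.8.6)
print(math.sqrt(1 - lambda_h**2), cos_2theta)
```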


5.9 Matter and Energy

As shown in Sec. 5.4.3, the idea that the inertia of a charged body is increased by its motion can be traced all the way back to a little-known paper by J. J. Thomson [81] entitled “On the effects produced by the motion of electrified bodies,” published in April 1881. Thomson suggested that the electrification of particles would affect their inertia in such a way that it would increase with velocity. Back in 1881 no one knew what the carriers of electricity were, only that the ‘fluid’ appeared to be incompressible.

Ironically, Hall’s 1879 experiment, in which he attempted to validate the electric fluid model by showing that a current-carrying conductor sets up a difference of potential perpendicular to the direction of the current when placed in a magnetic field, actually sounded its death knell. Hall’s supervisor Rowland, slightly earlier, established that a moving charge creates a magnetic field just like a current in a wire. And what Thomson set out to do was to determine “what is the magnetic force produced by such a moving body.”

Thomson [81] claims

I have shown that the kinetic energy of a small sphere of mass m charged with a quantity of electricity e and moving with velocity v is

$$
\left\{\frac{1}{2}m + \frac{2}{15}\frac{\mu e^2}{a}\right\}v^2,
$$

where a is the radius of the sphere and µ the magnetic permeability of the dielectric surrounding it.

The existence in the kinetic energy of this term, which is due to the “displacement currents” started in the surrounding dielectric by the motion of the electrification on the sphere, shows that electricity behaves in some respects very much as if it had mass.

From the above relation Thomson concluded that the electromagnetic field created by the moving charge produces a reaction on the charge, thereby increasing its mass by the amount 4µe²/15a. On account of the conservation of momentum the increase in the mass must be “impulsively diminished.”

For historical accuracy, it was Thomson, and not Lorentz, who wrote the force on a charge e moving at velocity v as ½e(v/c × H). The removal of the incorrect factor of a half was made by Heaviside eight years later. It was still another three years before Lorentz wrote down the total force as e(E + v/c × H), but that is still no excuse for not giving Thomson the credit for the new term. And it is not a banality to say that the total force is the vector sum of a static one and a motional one. Again Stigler’s law of eponymy has been verified! Moreover, from the vanishing of the total force, Thomson gave a procedure for determining not only the velocity of the particle, but, in addition, the ratio e/m [cf. Sec. (3.7.4.1), Eq. (3.7.15)].

Parenthetically, we might mention another historical inaccuracy. Everyone is familiar with Einstein’s derivation of Avogadro’s number from a dynamic equilibrium between opposing forces, say between a gravitational force dragging particles down and an opposing concentration gradient pushing them up [Lavenda 84]. This was in 1905. But, even earlier, Townsend [Thomson 04, pp. 79–83] showed that the charge on a gaseous ion is equal to that on the ion of hydrogen in electrolysis, by measuring the coefficient of diffusion and comparing it with the velocity that the ion has when acted upon by an impressed electric force. The dynamic equilibrium is between the flux tending to drive the particles and the velocity acquired by the particles under the action of the electric force, or pressure gradient divided by the viscosity. The latter is given by the ratio Ee/v. Then, relating the pressure and number of ions of the gas to the atmospheric pressure and Avogadro’s number, the latter can be determined if we know the charge on the ion, or conversely, we can determine the charge on an ion if we know Avogadro’s number. So, Einstein’s gedanken experiment was well-known in scientific circles at the time he incorporated it into his theory of Brownian motion.

At the turn of the twentieth century, the only elementary particle that was known was the electron, a name coined by Stoney in 1891, and found by Thomson six years later. In any charged system, according to the Victorians, part of the mass will be that of the aether. The aether was the storage place for energy derived from the magnetic and electric fields. The stored energy increases from the mechanical motion of the sphere. The motion of the sphere is met with resistance, and it is through this resistance that mechanical energy is converted into electromagnetic energy. As Thomson argued

. . .[it] must correspond to the resistance theoretically experienced by a solid moving through a perfect fluid. In other words, it must be equivalent to an increase in the mass of the charged moving sphere.


Thomson carried the analogy further and concluded that the capacity of a condenser in motion will not be the same as that when it is at rest, “but the difference depends on the square of the ratio of the velocity of the condenser to the velocity of light which will be exceedingly small.” Experiments on cathode rays showed that the carriers of electric charge must be considered as particles with very small mass. Thus, the idea evolved that the electromagnetic inertia of a charged particle was comparable to the entire inertia of the particle. This was actually shown by Kaufmann in Sec. 3.7.4.2, who, experimenting with β-rays emitted from radium, showed that the apparent mass of these particles increased in a regular way with velocity [Cuttingham 14].

In his Yale University lectures, given in May 1903, Thomson [04] returned to the problem of mass due to the motion of electric charges. He reasoned that a mass in motion creates a magnetic field, H. Wherever there is a magnetic field there are µ0H²/8π units of energy in Gaussian units. Averaging the energy over all points exterior to the sphere gives an additional amount of energy µ0e²v²/3a, where µ0 is the magnetic permeability of the medium surrounding the sphere.[c] This corrects the numerical factor he found in 1881. Thus, he concludes that the whole of the kinetic energy is ½(m + ⅔µ0e²/a)v², and that there is a contribution to the mass due to its charge which in motion creates a magnetic field. Notice that Thomson does not claim that all the mass is electromagnetic in origin, just its kinetic part is.

Though he leaves out the c² in the denominator of his ‘extra’ mass, a proof of E = mc² can be found in these lectures. Thomson relied on imagery where the lines of force between negatively charged particles were called ‘tubes of force,’ or simply ‘Faraday tubes.’ The Faraday tubes have the same direction as the electric force. The difference between the number of Faraday tubes which leave a closed surface and those which enter it is equal to the number of charges inside the surface. This is what Maxwell referred to as the electric displacement through the surface.

[c] This differs from (5.4.25) by a factor of 1/4π, occurring on the transition from Gauss to rationalized units. According to Heaviside, “the effect of changing from irrational to rational units is to introduce 4π … For the unnatural suppression of the 4π in the formulae of the central force, where it has a right to be, drives it into the blood, there to multiply itself, and afterwards break out all over the body of electromagnetic theory.”


Now, if there are N Faraday tubes passing through a unit area at right angles to their direction, and B is the magnetic induction, then the momentum per unit volume will be NB sin θ, where θ is the angle between the magnetic induction and the Faraday tubes. The direction of the momentum is normal to both the magnetic induction and the Faraday tubes, and parallel to Poynting’s vector, which determines the direction of energy flow in the field.

If v is the velocity of the sphere then the magnetic induction is 4πµNv, where µ is the magnetic permeability of the medium surrounding the sphere. Consequently, the momentum per unit volume is 4πµN²v sin θ. If the Faraday tubes were to move in the direction normal to their length they would carry a mass of the surrounding medium equal to M = 4πµN² with them, just like a cylinder being dragged broadside in a viscous liquid [cf. Sec. 5.3.1]. This, according to Thomson, was the mass of the bound aether. The aether was necessary in order not to violate Newton’s third law, so as to provide the missing link in the conservation of momentum. The momentum given to the Faraday tubes must be equal and opposite to the momentum lost in the bound aether.

Thomson then claims that

It is a very suggestive fact that the electrostatic energy E is proportional to M, the mass of the bound aether in that volume.

He offers the following proof. The electrostatic energy is E = 2πN²/ε, where ε is the specific inductive capacity of the medium (i.e. its dielectric constant). Introducing the mass, M = 4πµN², he comes out with

$$
E = \frac{1}{2}\frac{M}{\mu\varepsilon}.
$$

But, from Maxwell’s theory he deduces that c² = 1/µε, and so

$$
E = \frac{1}{2}Mc^2, \tag{5.9.1}
$$

the ½ being the last vestige of the nonrelativistic kinetic energy. Thomson concludes “E is equal to the kinetic energy possessed by the bound mass when moving with the velocity of light.” This is comparable to Poynting’s derivation given in Sec. 3.5.2.
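Thomson’s short proof is pure algebra and can be replayed with numbers. In the sketch below the function names and the values chosen for N, µ and ε are arbitrary illustrative assumptions; only the relations E = 2πN²/ε, M = 4πµN² and c² = 1/µε are taken from the text.

```python
import math

def bound_aether_mass(N, mu):
    """Mass per unit volume carried by the Faraday tubes, M = 4 pi mu N^2."""
    return 4 * math.pi * mu * N**2

def electrostatic_energy(N, eps):
    """Electrostatic energy per unit volume, E = 2 pi N^2 / eps."""
    return 2 * math.pi * N**2 / eps

# Arbitrary illustrative values, not physical constants
N, mu, eps = 3.0, 0.25, 0.4
c_squared = 1 / (mu * eps)       # Maxwell's relation c^2 = 1/(mu eps)

M = bound_aether_mass(N, mu)
E = electrostatic_energy(N, eps)
print(E, 0.5 * M * c_squared)    # the two sides of (5.9.1)
```

The number of tubes N drops out of the ratio E/M, which is why the result depends on the medium only through the product µε.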

We should emphasize that there is no need to postulate a limit-ing velocity for it comes out directly from Maxwell’s theory, although

Page 320: A New Perspective on Relativity

Aug. 26, 2011 11:16 SPI-B1197 A New Perspective on Relativity b1197-ch05

The Origins of Mass 293

Heaviside would not agree. According to Heaviside [99], “If [the electro-magnetic laws] are valid at any speed, then there is nothing to preventspeeds of motion greater than light.” This is very unrelativistic, but theremust be some very special properties of the aether that makes the productof the specific inductive capacity and the permeability to always come outto be the same constant, independent of the specific nature of the aether.Thomson also established a proportionality between the tension in a Fara-day tube and its mass per unit length of the string. This is the first utteranceof the anisotropy of matter.

Okun [89], writing in Physics Today, comments that the “rest energy was one of Einstein’s great discoveries.” We have seen in Sec. 1.2.2.2 that Poincaré [00] also arrived at it by considering a pulse of light like a cannon ball shot from an artillery piece. According to Poynting’s theorem in Sec. 3.5.2, it will carry a momentum $G = E/c$. At the same time, the momentum is by definition $G = Mv$, and using $G = vE/c^2$, he arrived at twice (5.9.1).

Although Okun [89] is correct in saying that Thomson’s increase in mass due to motion is velocity-independent, it was Heaviside who extended Thomson’s analysis of a slowly moving charge to one moving at any speed, and even speeds faster than light! He did so by considering Faraday tubes on the surface of a sphere, representing a charged particle. When a Faraday tube is in the equatorial region it imprisons more aether than when it is near the polar regions. The equatorial plane passes through the center of the sphere normal to the direction of motion.

If we remember that Faraday tubes repel one another, the crowding together in the equatorial plane would give rise to a pressure that re-establishes the uniform distribution of the tubes. The actual distribution is a balance between the opposing forces. The excess density of tubes packed into the equatorial region, resulting from the increasing speed at which the charged particle is moving, means that additional aether is imprisoned, and this leads to an additional increase in mass.

What Heaviside succeeded in doing was to show that, while the projection of the tubes on this plane is the same as that for the uniform distribution of tubes, the distance of every point in the tube from the equatorial plane is reduced by the factor $\sqrt{c^2 - u^2}$. From this result, Heaviside concluded that it is only when the velocity of the charged body is comparable to that of light that the distortion in the distribution of Faraday tubes becomes appreciable.

All derivations of the mass variation and energy rely on the conservation of momentum [Ives 52, Rohrlich 90], although motion need not be involved. Some of these derivations we have discussed in Sec. 4.2.1. The idea is to consider a body suspended in the interior of an enclosure in which the system is stationary with respect to the medium transmitting the radiation. The body will emit radiation symmetrically in the forward and aft directions, with energy $\tfrac{1}{2}E$ in each of these directions. The momenta are equal and opposite, so no change in the state of the body will be observed.

Now let the whole system be set in motion at a constant velocity $v$. In a medium, such as water, waves travel at a speed $v$. If the source emitting waves travels at a speed $u$, which is less than the wave speed, the observed frequency will be different from the source frequency. Depending on whether the source is approaching or receding from the observer there will be a change in frequency by the amount $(1 \pm u/v)$. This is the well-known Doppler effect.

However, if the observer is moving at a speed $u$ toward a stationary source emitting sound waves at a speed $v$, he will see wavefronts approaching him that are separated by a constant wavelength at a relative speed $v + u$. What changes is the frequency at which the observer sees the wavefronts. But if the source is a light source, we cannot have a speed $c + u$, for that would imply an emission theory. Both frequency and wavelength must shift in order to keep their product, the velocity of light, constant. This is often explained by observing that light waves are not vibrations of the ‘aether’, but, rather, self-maintained oscillations of the electromagnetic fields.

So long as the observer’s velocity is small compared with that of light, the linear approximation to the Doppler shift holds. But when his velocity becomes comparable with $c$, we must use the full-fledged relativistic formula. It is often said that this follows from the assumed mass dependency on the velocity. Rather, we would like to believe that it comes from the hyperbolic measure of velocity. Thus, the energy emitted in opposite directions will be given by

$$\tfrac{1}{2}E\left(\frac{1+u/c}{1-u/c}\right)^{1/2} \quad\text{and}\quad \tfrac{1}{2}E\left(\frac{1-u/c}{1+u/c}\right)^{1/2}.$$

We have already seen in Sec. 1.2.2.2 that Poincaré [00], as early as 1900, established that the momentum lost by radiation should be $1/c$ times the energy of the body $E$. In other words, Poincaré considered electromagnetic radiation as a ‘fluide fictif’ that has a density $E/c^2$. What Poincaré could not convince himself of was that the mass decrease was due to the loss by radiation. In other words, the mass of such a fluid would be destructible, being able to reappear in other forms. Since such a mass seemed to have little to do with mass as Poincaré knew it, he referred to it as a ‘fictitious fluid,’ rather than a real one.

The corresponding momentum of this ‘fictitious fluid’ would be the density times $c$. Thus, the momentum from forward and backward radiation would be

$$\frac{E}{2c}\left(\frac{1+u/c}{1-u/c}\right)^{1/2} \quad\text{and}\quad \frac{E}{2c}\left(\frac{1-u/c}{1+u/c}\right)^{1/2},$$

so that the net momentum would be the difference of the two,

$$\frac{E}{c^2}\,\frac{u}{\sqrt{1-(u/c)^2}}.$$

Although this is a momentum, because $Eu/c^2$ has the units of momentum, there has been no hint of inertial mass.

If $u'$ is the velocity of the body prior to the emission of radiation, when it had mass $m'$, the conservation of momentum demands

$$\frac{m'u'}{\sqrt{1-(u'/c)^2}} = \frac{mu}{\sqrt{1-(u/c)^2}} + \frac{(E/c^2)\,u}{\sqrt{1-(u/c)^2}}. \tag{5.9.2}$$

A change in velocity, $u' \neq u$, would mean that we could observe the motion of the system with respect to that of its enclosure. Relativity forbids this by claiming that $u' = u$, and so the conservation of momentum gives as its condition

$$m' - m = E/c^2,$$

a result that is entirely independent of whether the system and enclosure are in motion or not!
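The two steps above, the net momentum of the forward and aft radiation and the mass difference that follows from (5.9.2) when $u' = u$, can be checked numerically. The sketch below works in units with $c = 1$; the function names and the sample values of $E$, $m$ and $u$ are assumptions made only for illustration.

```python
import math

C = 1.0  # speed of light; natural units are assumed for this sketch

def gamma(u, c=C):
    """Factor 1/sqrt(1 - u^2/c^2)."""
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)

def net_radiation_momentum(E, u, c=C):
    """Forward minus backward momentum of the two Doppler-shifted pulses."""
    k = math.sqrt((1.0 + u / c) / (1.0 - u / c))
    forward = E / (2.0 * c) * k
    backward = E / (2.0 * c) / k
    return forward - backward

def mass_before_emission(m_after, E, u, c=C):
    """Solve the momentum balance (5.9.2) for m' under the condition u' = u."""
    g = gamma(u, c)
    total_momentum = m_after * u * g + net_radiation_momentum(E, u, c)
    return total_momentum / (u * g)

# Illustrative values only.
E, m, u = 2.0, 5.0, 0.6
p_net = net_radiation_momentum(E, u)
m_prime = mass_before_emission(m, E, u)

print(p_net, E * u * gamma(u) / C**2)  # the two should agree
print(m_prime - m, E / C**2)           # the two should agree
```

Repeating the calculation with a different $u$ leaves $m' - m$ unchanged, which is the velocity independence remarked on in the text.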

This ‘proof’ is what Ives [52] attributes to Poincaré’s ‘principle of relativity,’ which he formulated in 1904, to the effect that it is impossible by observation on a body to detect its uniform translational motion. Now suppose that we know only classical mechanics and the Planck relation, and have no knowledge that momentum conservation looks anything like (5.9.2).

A radiation source is mounted on a railroad car which is traveling at a constant velocity $u$. The radiation source emits two pulses of radiation, both at frequency $\nu$, in the forward and aft directions with regard to the moving car. The frequency in the forward direction will be Doppler-shifted toward the blue by an amount

$$\nu' = \tfrac{1}{2}\nu\left(\frac{1+u/c}{1-u/c}\right)^{1/2}, \tag{5.9.3}$$

while that emitted downstream will be subjected to a redshift by an amount

$$\nu'' = \tfrac{1}{2}\nu\left(\frac{1-u/c}{1+u/c}\right)^{1/2}. \tag{5.9.4}$$

Classically, we can measure only differences in energy and momentum. The change in energy, $\Delta E = h(\nu' + \nu'')$, and the change in momentum, $\Delta G = h(\nu' - \nu'')/c$, will be given by [Steck & Route 83]

$$\Delta E = \frac{h\nu}{\sqrt{1-u^2/c^2}}, \tag{5.9.5a}$$

$$\Delta G = \frac{E}{c^2}\,\frac{u}{\sqrt{1-u^2/c^2}}, \tag{5.9.5b}$$

with $E = h\nu$. Dividing (5.9.5a) by (5.9.5b) results in

$$\frac{\Delta E}{c^2}\,u = \Delta G = \Delta m\,u, \tag{5.9.6}$$

because the only way that the momentum can change at constant velocity is for the mass to change. This is the famous Einstein equivalence between mass and energy, which he derived using the relativistic expression for the kinetic energy. Here, we have employed only classical physics.
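The chain from the shifted frequencies (5.9.3) and (5.9.4) to (5.9.6) can be followed numerically. This is a minimal sketch under stated assumptions: the emitted frequency and the car's speed are arbitrary illustrative values, and the function names are introduced here for the example only.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 299_792_458.0    # speed of light, m/s

def shifted_frequencies(nu, u, c=C):
    """Forward (blue) and aft (red) frequencies, as in (5.9.3) and (5.9.4)."""
    beta = u / c
    k = math.sqrt((1.0 + beta) / (1.0 - beta))
    return 0.5 * nu * k, 0.5 * nu / k

def energy_momentum_change(nu, u, c=C, h=H):
    """Delta E = h(nu' + nu'') and Delta G = h(nu' - nu'')/c."""
    nu_fwd, nu_aft = shifted_frequencies(nu, u, c)
    return h * (nu_fwd + nu_aft), h * (nu_fwd - nu_aft) / c

# Illustrative values: visible light, car moving at 0.3 c.
nu, u = 5.0e14, 0.3 * C
dE, dG = energy_momentum_change(nu, u)

# At constant velocity the momentum change is attributed to a mass change.
dm = dG / u
print(dm, dE / C**2)  # the two should agree, as in (5.9.6)
```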

We have used the longitudinal Doppler shift to ‘derive’ the result that a change in mass is measured as a change in the energy of a body. It is surprising that both space contraction and time dilatation do not depend upon whether we are moving toward or away from the source.

For slowly moving inertial systems, Einstein predicts that “the time marked by the moving clock, viewed in the stationary system, is slowed by . . . $\tfrac{1}{2}u^2/c^2$ per second,” where $u$ is the relative velocity between the clocks. To lowest order, the Doppler effect gives a correction proportional to $u/c$, which is larger than that predicted by Einstein.

Rather, it is only when we take the sum and difference of the two frequencies, (5.9.3) and (5.9.4), that the first-order terms cancel, giving a result that is independent of the direction of motion. This is exactly what was done in the 1938 experiment performed by Ives and Stilwell, which we discussed in Sec. 3.4. They wanted to measure the second-order Doppler effect where light is emitted in the direction transverse to the motion. Instead of measuring this, which is nearly impossible, they measured radiation in the forward and backward directions normal to the transverse direction. By averaging the two they obtained a second-order Doppler effect, (3.4.6), instead of the usual first-order effect, which predicts a shift in wavelength, $\Delta\lambda \approx (u/c)\lambda$.
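The cancellation of the first-order terms on averaging can be illustrated numerically. The sketch below assumes the standard relativistic longitudinal Doppler factors (the same square-root factors that appear in (5.9.3) and (5.9.4), without the factor $\tfrac{1}{2}$). The wavelength and the value of $u/c$ are illustrative choices, not the data of the 1938 experiment, and the function names are hypothetical.

```python
import math

def longitudinal_wavelengths(lam, beta):
    """Wavelengths seen from an approaching and a receding source."""
    k = math.sqrt((1.0 + beta) / (1.0 - beta))
    return lam / k, lam * k   # blue-shifted, red-shifted

def mean_shift(lam, beta):
    """Shift of the average of the two wavelengths from the rest value."""
    blue, red = longitudinal_wavelengths(lam, beta)
    return 0.5 * (blue + red) - lam

# Illustrative values: a visible line and a source moving at 0.5% of c.
lam, beta = 486.1e-9, 0.005

first_order = beta * lam             # usual first-order Doppler shift
second_order = 0.5 * beta**2 * lam   # leading term that survives averaging
shift = mean_shift(lam, beta)

print(first_order, second_order, shift)
```

The averaged shift is smaller than the first-order shift by a factor of order $u/c$ and agrees with $\tfrac{1}{2}(u/c)^2\lambda$ up to terms of fourth order.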

It is rather ironic that French [66] in his Special Relativity adds the following anecdote to the Ives and Stilwell experiment:

It is a curious sidelight on this experiment that its authors did not (even as late as 1938) accept special relativity. In their view the results simply demonstrated that a moving clock runs slow (as Larmor and Lorentz suggested) by just the same factor, and in just as real a way, as a moving rod was believed to be contracted if it pointed along its direction of absolute motion through the aether. Old and cherished ideas die hard.

According to Whitrow [80], Einstein would argue that the contraction of a rod or the dilatation of time are only apparent changes, not involving any real change in the constituent of matter. Or would he? Just recall the incident with Varicak [11], who argued that the contraction was “so to speak, only a psychological and not a physical fact.” Although Einstein requested Ehrenfest to respond to Varicak, he could not let something like this go unanswered. Now, Einstein argued that “contraction was completely real,” and subject to physical measurement by an observer not moving with the contracting body [Klein 70]. However, according to the experiment of Ives and Stilwell, contraction of the rod should occur in the transverse direction of the “absolute motion through the aether.”

When considering the motion of one clock with respect to another, in either frame the clock will register fewer ticks from the other clock than from its own. It does not depend on whether the observer is approaching or receding from the other clock, which means that a first-order Doppler effect is not involved. This seems strange at first sight.

Moreover, since the clock will register fewer ticks from the moving clock than from its own clock, some ticks will have gotten lost [Essen 78]. On a round trip the two clocks are compared and the moving one is seen to have gone slower, i.e. the so-called twin paradox. But when the clocks are finally compared there is no relative velocity between them, so what is being compared? The existence of a second-order time dilatation follows from the Lorentz transformation, “but it is now a real physical effect just as in the Lorentz theory from which Einstein started” [Essen 78]. Do we actually know what is going on? And why should time, unlike frequency, undergo a second-order slowing down when the latter undergoes a first-order shift?

Blame it on accelerations! A twin leaves Earth, where his brother remains, and makes a round trip only to find that his brother has aged more than he has. Accelerations and decelerations are needed to complete a round trip, and, somehow, these have shortened time. According to Bondi, this is nonsense since each brother has measured ‘his’ time. But how do we compare the times? If one brother remains inertial, the only way is to have the other brother undergo acceleration and deceleration. Again, according to Bondi, “the time taken by such an observer is less than the time recorded by the inertial observer.” But the two clocks are not symmetrical, for acceleration and deceleration have intervened. Unfortunately, acceleration does not enter into special relativity so it can have no effect whatsoever.

Yet, the relation is independent of the motion, and so applies to a stationary system. Hence, the conservation of momentum does not apply, and a valid derivation of the relation should be completely ‘at rest.’

In electrodynamics, the four-force can be derived from the divergence of a stress tensor. In the early days of relativity it was believed that any force can be reduced to the electrodynamic force, apart from the gravitational force [Pauli 58]. It is the symmetry property of the energy–momentum tensor that asserts a proportionality between the momentum density and the energy flux density. This was first proposed by Planck [07], who set the density of inertial mass equal to the heat content. We have reviewed its present status [Lavenda 02]. This can be considered as a generalization of the equivalence of mass and energy. However, it makes a statement as to the localization of momentum and energy. By a mere integration over the system’s finite volume, the total momentum and energy are recovered.

References

[Abraham 03] M. Abraham, “Prinzipien der Dynamik des Elektrons,” Ann. Phys. 10 (1903) 105–179.

[Babic & Buldyrev 91] V. M. Babic and V. S. Buldyrev, Short-Wavelength Diffraction Theory (Springer-Verlag, Berlin, 1991), Ch. 3.

[Bondi 64] H. Bondi, Relativity and Common Sense: A New Approach to Einstein (Doubleday, New York, 1964).

[Bridgman 27] P. W. Bridgman, The Logic of Modern Physics (Macmillan, New York, 1927).

[Bucherer 04] A. H. Bucherer, Mathematische Einführung in die Elektronentheorie (B. G. Teubner, Leipzig, 1904), p. 50, Eq. (91a).

[Coxeter 65] H. S. M. Coxeter, Non-Euclidean Geometry, 5th ed. (U. Toronto Press, Toronto, 1965).

[Cushing 81] J. T. Cushing, “Electromagnetic mass, relativity, and the Kaufmann experiments,” Am. J. Phys. 49 (1981) 1133–1149.

[Cuttingham 14] E. Cuttingham, The Principle of Relativity (Cambridge U. P., Cambridge, 1914), p. 152.

[Einstein 05] A. Einstein, “Ist die Trägheit eines Körpers von seinem Energieinhalt abhängig?,” Ann. Phys. 18 (1905) 639–641.

[Essen 78] L. Essen, “Relativity – joke or swindle?,” Wireless World, October 1978, 44–45.

[French 66] A. P. French, Special Relativity (van Nostrand Reinhold, London, 1966).

[Heaviside 99] O. Heaviside, Electromagnetic Theory, Vol. II (The Electrician, London, 1899), pp. 533–534.

[Ives 52] H. E. Ives, “Derivation of the mass–energy relation,” J. Opt. Soc. Am. 42 (1952) 540–543.

[Klein 70] M. J. Klein, Paul Ehrenfest (North-Holland, Amsterdam, 1970).

[Landau & Lifshitz 60] L. D. Landau and E. M. Lifshitz, Electrodynamics of Continuous Media (Pergamon, Oxford, 1960), Sec. 4, Eq. (4.32).

[Larmor 00] J. Larmor, Aether and Matter (Cambridge U. P., Cambridge, 1900), pp. 227–229.

[Lavenda 84] See, for instance, B. H. Lavenda, Nonequilibrium Statistical Thermodynamics (Wiley-Interscience, Chichester, 1985), pp. 22–24.

[Lavenda 02] B. H. Lavenda, “Does the inertia of a body depend on its ‘heat’ content?,” Naturwissenschaften 89 (2002) 329–337.

[Lewis & Tolman 09] G. N. Lewis and R. C. Tolman, “The principle of relativity and non-Newtonian mechanics,” Phil. Mag. 18 (1909) 510–523.

[Liénard 98] A. Liénard, “Champ électrique et magnétique produit par une charge électrique concentrée en un point et animée d’un mouvement quelconque,” L’Éclairage Électrique 16 (1898) 5, 53, 106.

[Lorentz 52] H. A. Lorentz, The Theory of Electrons, 2nd ed. (Dover, New York, 1952), p. 212.

[MacMillan 30] W. D. MacMillan, Theory of the Potential (McGraw-Hill, New York, 1930), pp. 17–18.

[Milne 48] E. A. Milne, Kinematic Relativity (Clarendon Press, London, 1948).

[Okun 89] L. B. Okun, “The Concept of Mass,” Physics Today, June 1989, 31–36.

[O’Rahilly 38] A. O’Rahilly, Electromagnetics (Longman, Green & Co., London, 1938).

[Pauli 58] W. Pauli, Theory of Relativity (Pergamon Press, London, 1958).

[Planck 07] M. Planck, “Zur Dynamik bewegter Systeme,” Berl. Ber. 13 (June, 1907) 542–570; also in Ann. der Phys. Lpz. 76 (1908) 1–34.

[Poincaré 00] H. Poincaré, “The theory of Lorentz and the principle of reaction,” Arch. Nederland 5 (1900) 252–278.

[Richtmyer & Kennard 42] F. K. Richtmyer and E. H. Kennard, Introduction to Modern Physics, 3rd ed. (McGraw-Hill, New York, 1942), pp. 80–82.

[Ritz 08] W. Ritz, “Recherches critiques sur l’Électrodynamique Générale,” Ann. Chimie et Physique, 8th series, XIII (1908) 145–275; translated and commented upon by W. Hovgaard, “Ritz’s electrodynamic theory,” J. Math. Phys. 11 (1932) 218–254.

[Rohrlich 90] F. Rohrlich, “An elementary derivation of E = mc2,” Am. J. Phys. 58 (1990) 348–349.

[Rosenfeld 88] B. A. Rosenfeld, A History of Non-Euclidean Geometry (Springer, New York, 1988).

[Schott 12] G. A. Schott, Electromagnetic Radiation (Cambridge U. P., Cambridge, 1912), p. 246.

[Schwerdtfeger 62] H. Schwerdtfeger, Geometry of Complex Numbers (U. Toronto Press, Toronto, 1962).

[Sholander 52] M. Sholander, “On certain minimum problems in the theory of convex curves,” Trans. Amer. Math. Soc. 32 (1952) 139–173.

[Silberstein 14] L. Silberstein, The Theory of Relativity (Macmillan, London, 1914).

[Soddy 32] F. Soddy, Interpretation of the Atom (John Murray, London, 1932).

[Sommerville 58] D. M. Y. Sommerville, The Elements of Non-Euclidean Geometry (Dover, New York, 1958).

[Thomson 81] J. J. Thomson, “On the effects produced by the motion of electrified bodies,” Phil. Mag. 11 (1881) 229–249.

[Thomson 88] J. J. Thomson, Applications of Dynamics to Physics and Chemistry (Dawsons, London, 1888).

[Thomson 04] J. J. Thomson, Electricity and Matter (Yale U. P., New Haven, 1904).

[Thomson 21] J. J. Thomson, Elements of the Mathematical Theory of Electricity and Magnetism, 5th ed. (Cambridge U. P., London, 1921), p. 388.

[Thomson & Thomson 28] J. J. Thomson and G. P. Thomson, Conduction of Electricity through Gases, Vol. I, 3rd ed. (Cambridge U. P., Cambridge, 1928), Sec. 70.

[Varicak 11] V. Varicak, “Zum Ehrenfestschen Paradoxon,” Phys. Z. 12 (1911) 169.

[Whitrow 80] G. J. Whitrow, The Natural Philosophy of Time, 2nd ed. (Clarendon Press, Oxford, 1980).

[Wilson & Lewis 12] E. B. Wilson and G. N. Lewis, “The space-time manifold of relativity. The non-Euclidean geometry of mechanics and electromagnetics,” Proc. Nat. Acad. Sci. 48 (1912) 387–507.

[Yaghjian 92] A. D. Yaghjian, Relativistic Dynamics of a Charged Sphere: Updating the Lorentz-Abraham Model (Springer-Verlag, Berlin, 1992), p. 11.

[Zahn & Spees 38] C. T. Zahn and A. H. Spees, “A critical analysis of the classical experiments on the relativistic variation of the electron mass,” Phys. Rev. 53 (1938) 511–521.

Chapter 6

Thermodynamics of Relativity

6.1 Does the Inertia of a Body Depend on its Heat Content?

Not only could electromagnetic energy increase the mass, but it was pointed out by Hasenöhrl [04] in 1904 that heat energy could also increase the ‘mechanical’ mass of a body. Shortly after Planck’s analysis of blackbody radiation, his students studied the problem of a radiation cavity set in motion and traveling at constant velocity. Building on Hasenöhrl’s [04] result that “to the mechanical mass of our system must be added an apparent mass $m = 8E/3c^2$,” which he later corrected to $m = 4E/3c^2$, Planck’s doctoral student, Mosengeil [07], found thermodynamic expressions for the dynamics of moving systems. These results were later generalized by his mentor, Planck [07].

The nineteenth century saw the equivalence between heat and work, while the twentieth century witnessed the equivalence between heat content and mass.

According to Planck,

through every absorption or emission of heat the inertial mass of a body alters, and the increment of mass is always equal to the quantity of heat . . . divided by the square of the velocity of light in vacuo.

The absorption or emission of radiant energy results in the production of heat. The heat is the average kinetic energy of the particles which is related to the change in mass of the particles. The radiating body loses mass, and the condition must be independent of the velocity at which it is traveling. It is the aim of this chapter to give mathematical substance to this statement.

Einstein’s famous relation,

$$\Delta E/c^2 = \Delta m, \tag{6.1.1}$$

asserts that the mass of a body is a measure of its energy content. As we saw in Sec. 4.2.1, Einstein’s heuristic derivation used a gedanken experiment in which the emission of radiation from a body causes it to lose mass. It is well known that, under well-defined conditions, external forces can heat a body, thereby increasing its rest mass. In fact, Planck pointed out that the stresses acting on the surface of a body also contribute to its apparent increase in mass, so that the left-hand side of (6.1.1) should be the heat content, or enthalpy, and not the change in energy, divided by the square of the velocity of light in vacuo.

Moreover, the difference between the electromagnetic mass, (5.4.7), and the electrostatic mass, (5.4.5), of an electron has never been cleared up in a completely satisfactory way, which, like so many other things, has been swept under the carpet in the course of time. The electrostatic mass is defined as the energy of formation of a spherical charge divided by $c^2$. There was a missing factor of $\tfrac{1}{3}$ separating the two magnitudes. So Lorentz’s conjecture that the origin of the mass of an electron was entirely electromagnetic had to be abandoned because there was something missing that was of a non-electromagnetic nature.

The conventional argument, as given by von Laue [19], contends that since the system is not closed, the energy–momentum vector will not transform as a four-vector, as it should. By closing the system with the addition of mechanical components to the energies and momenta of the electron, the correct pre-factor of $\tfrac{4}{3}$ could be achieved, but the price paid would be extremely high. For this procedure would introduce a negative pressure, known as the Poincaré stress after its author, which is related to the binding potential, or the work done by the internal binding forces as the spherical charge distribution undergoes distortion due to its motion. As discussed in Sec. 5.4.4, the FitzGerald–Lorentz contraction is used to transform the stationary spherical form of the electron into an oblate ellipsoid when in motion. However, the electron must accelerate in order to achieve a finite velocity, and this has nothing to do with the FitzGerald–Lorentz contraction, which requires an inertial regime. But the work required to accelerate the electron cannot come from “a constant external pressure acting on a deformable and compressible electron, whose work is proportional to the variation in the volume of the electron,” as Poincaré [06] imagined.

Slowly the Lorentzian electromagnetic world picture of an electron gave way to a thermodynamic one which was supposedly completely general, independent of any model chosen for an electron. As early as 1907 Planck showed that it was the enthalpy and not the energy that transformed correctly under a Lorentz transformation. Einstein had earlier referred to a ‘strange’ result by remarking

If a rigid body on which originally no forces are acting is subject to the influence of forces that do not impart acceleration to the body, then these forces, observed from a coordinate system that is moving relative to the body, perform an amount of work dE on the body that depends only on the final distribution of force and the translation velocity.

Mechanical forces that do not impart acceleration to a body are indeed strange. The strange result is that the energy does not transform as it should under a Lorentz transform. There is something left over which, when combined with the work done by compressional forces, yields a Lorentz invariant, namely, the enthalpy. But a Lorentz transform implies that the system is inertial, and this excludes all forces which create accelerations. This is also the limit of a thermodynamic formulation, to which we now turn our attention.

6.2 Poincaré Stress and the Missing Mass

We must return to Lorentz’s theory, but, in order to maintain this free from unacceptable contradictions, a special force must be invoked to account both for the contraction and for the constancy of two of the axes. I have attempted to determine this force, and have found that it can be regarded as a constant external pressure acting upon an electron capable of deformation and compression, the work done being proportional to the change in the volume of the electron.

Henri Poincaré, 23 July 1905

As we have mentioned in Sec. 5.4.4, Kaufmann’s experiments on β-rays showed clearly that mass increases with velocity in a regular way. Abraham concluded that the mass–velocity relation found by Kaufmann is exactly the same as the electromagnetic inertia which should vary according to his theory. A natural conclusion, which Abraham drew, is that the mass of an electron is entirely of electromagnetic origin.

Abraham’s model views an electron as a sphere, of radius $a$, with a uniform distribution of charge over its surface, as if it were a conductor. The problem of the distribution of the energy of a spherical conductor had been solved earlier, in 1897, by George Searle [96], a close collaborator and friend of Heaviside. So it was only a matter of writing down the field energy,

$$E = \frac{e^2}{8\pi\varepsilon_0 a}\left\{\frac{c}{u}\ln\frac{c+u}{c-u} - 1\right\},$$

and its momentum,

$$G = \frac{e^2}{16\pi\varepsilon_0 ac}\left\{\frac{c^2+u^2}{u^2}\ln\frac{c+u}{c-u} - \frac{2c}{u}\right\},$$

which is in the same direction as the velocity. The expression of the mass follows directly from it; that is, the mass is a concept derived from the definition of the momentum.

In contrast, according to Lorentz, the momentum is that of a uniform charge,

$$G = \frac{\gamma u e^2}{6\pi\varepsilon_0 ac^2},$$

where $\gamma^{-1}$ is the FitzGerald–Lorentz contraction factor. Lorentz derives this expression from the Poynting vector for energy flow, $\mathbf{E}\times\mathbf{H}/c$, integrated over all space. If the motion is along the $x$-axis, Poynting’s vector will be $(\gamma u/c^2)(E_y^2 + E_z^2)$, where $E_y$ ($E_z$) is the $y$- ($z$-)component of the electric field due to a spherical electron at rest. By symmetry, the integrals of the squares of these components integrated over all space are equal to $\tfrac{2}{3}E_{el}$, where $E_{el} = e^2/8\pi\varepsilon_0 a$ is the electrostatic energy of the electron at rest for a uniform distribution of charge over the surface. In this way Lorentz finds

$$G = \gamma\,\tfrac{4}{3}\,uE_{el}/c^2. \tag{6.2.1}$$

This is the origin of the famous $\tfrac{4}{3}$ factor.
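The Abraham and Lorentz momenta quoted above can be compared numerically. The sketch below expresses both in units of $e^2/16\pi\varepsilon_0 ac$ as functions of $\beta = u/c$, so that the common low-velocity limit $\tfrac{4}{3}E_{el}u/c^2$ becomes $\tfrac{8}{3}\beta$. The choice of units and the function names are assumptions made for this illustration.

```python
import math

def abraham_momentum(beta):
    """Abraham's momentum in units of e^2 / (16 pi eps0 a c)."""
    log_term = math.log((1.0 + beta) / (1.0 - beta))
    return (1.0 + beta**2) / beta**2 * log_term - 2.0 / beta

def lorentz_momentum(beta):
    """Lorentz's momentum (6.2.1) in the same units: (8/3) gamma beta."""
    return (8.0 / 3.0) * beta / math.sqrt(1.0 - beta**2)

def low_velocity_limit(beta):
    """(4/3) E_el u / c^2 in the same units."""
    return (8.0 / 3.0) * beta

for beta in (0.01, 0.5, 0.9):
    print(beta,
          abraham_momentum(beta) / low_velocity_limit(beta),
          lorentz_momentum(beta) / low_velocity_limit(beta))
```

Both ratios tend to unity as $\beta \to 0$, so the two models share the factor $\tfrac{4}{3}$ at low velocity; they part company only at higher order in $\beta$.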

Likewise, Lorentz determines the energy as

$$E = \frac{\gamma e^2}{6\pi\varepsilon_0 a}\left(1 + \frac{1}{3}\frac{u^2}{c^2}\right). \tag{6.2.2}$$

Abraham was quick to criticize Lorentz’s model on the grounds that the rate of working,

$$u\cdot\frac{dG}{dt} = \frac{ue^2}{6\pi\varepsilon_0 ac^2}\,\frac{d}{dt}\left(\frac{u}{\sqrt{1-(u/c)^2}}\right),$$

does not equal the time-derivative of (6.2.2). Either energy conservation fails, or the electron cannot be a purely electromagnetic entity!
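Abraham's objection can be made concrete with a small numerical experiment. The sketch below keeps only the velocity dependence of (6.2.1) and (6.2.2), namely $G \propto \gamma\beta$ and $E \propto \gamma(1 + \tfrac{1}{3}\beta^2)$, in units with $c = 1$. The normalizations, the function names and the central-difference step are assumptions of this example. Since both time derivatives share the factor $d\beta/dt$, it suffices to compare derivatives with respect to $\beta$.

```python
import math

def gamma(beta):
    return 1.0 / math.sqrt(1.0 - beta * beta)

def lorentz_momentum(beta):
    """Velocity dependence of (6.2.1), in units of E_el / c."""
    return (4.0 / 3.0) * gamma(beta) * beta

def lorentz_energy(beta):
    """Velocity dependence of (6.2.2), normalized to unity at rest."""
    return gamma(beta) * (1.0 + beta * beta / 3.0)

def derivative(f, x, h=1e-6):
    """Central finite-difference estimate of df/dx."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def power_ratio(beta):
    """Ratio of the rate of working, u dG/dt, to dE/dt."""
    return beta * derivative(lorentz_momentum, beta) / derivative(lorentz_energy, beta)

for beta in (0.1, 0.5, 0.9):
    print(beta, power_ratio(beta))
```

With these normalizations the ratio works out analytically to $4/(5 - \beta^2)$. Because it varies with velocity, no change in the overall constant multiplying the energy can bring the two rates into agreement.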

Now, if the momentum were not given by

$$G' = \frac{u}{c^2}\gamma E,$$

in the frame moving with velocity $u$ with respect to a frame at rest, but, rather, by

$$G' = \frac{u}{c^2}\gamma(E + PV),$$

and the equation of state for photons,

$$P = \tfrac{1}{3}E/V, \tag{6.2.3}$$

were used (though we are talking about electrons!), then we would, indeed, find (6.2.1),

$$G' = \frac{4}{3}\frac{u}{c^2}\gamma E. \tag{6.2.4}$$

Rather than being valid for electrons, it has been claimed [Landau & Lifshitz 75] that the equation of state (6.2.3) is valid for the electromagnetic interactions between electrons. Be that as it may, Planck was later to identify the pressure, $P$, as a Lorentz invariant, and this would necessarily imply that $E/V$ would be the density of internal energy, and not the density of the total energy of an electron.

According to Lorentz, the energy transforms as (6.2.2), which we can write as

$$E' = \gamma(E + \beta^2 PV), \tag{6.2.5}$$

on the strength of (6.2.3). This is what Einstein referred to as a “strange result.” Somehow, the $\beta^2$ had to disappear from (6.2.5), and Poincaré [06] set himself the task of making it disappear.

Though Poincaré was much more mathematically than physically minded, he quite ingeniously split the momentum and energy into field, $f$, and mechanical, $m$, components,

$$X' = X'_f + X'_m,$$

where $X$ stands either for $G$ or $E$. Because $G'_f$ gave the correct result, it was necessary that $G'_m = 0$. But, because of the numerical discrepancy in the value of the mass, both components of the energy were required to be non-vanishing. The mechanical component of the energy was set at $E_m = \tfrac{1}{3}m_{el}c^2 = -P_mV$, where $m_{el} = e^2/8\pi\varepsilon_0 ac^2$ is the electrostatic mass, (5.4.5).

Poincaré then added the pressure $P_m$ to the field pressure,

$$P = P_f + P_m,$$

in order that it be annulled through the balance $P_f = -P_m$. The negative pressure $P_m$ was referred to as the Poincaré stress by Lorentz.ᵃ Then, since the mechanical component of the energy satisfies the same Lorentz transformation,

$$E'_m = \gamma\left(E_m + \beta^2 P_mV\right),$$

as the field energy (6.2.5), the energy resulting from the application of the Poincaré stress must be given by

$$E'_m = \tfrac{1}{3}\gamma^{-1}m_{el}c^2. \tag{6.2.6}$$

ᵃ As Whittaker notes, for relativity Lorentz and Poincaré swapped roles: Lorentz became the mathematician, while Poincaré became the physicist.

This is supposedly the non-electromagnetic contribution to the energy that was required to make the energy transform,

$$E' = \gamma m_{el}c^2\left(1 + \tfrac{1}{3}\beta^2\right) + \tfrac{1}{3}\gamma^{-1}m_{el}c^2,$$

come out just like the Lorentz transform for the momentum, (6.2.4).

Even today, (6.2.6) is associated with the work done by the binding forces as a spherical distribution accelerates and contracts [Yaghjian 92]. How (6.2.6) causes accelerations is not broached. But the binding energy is proportional to the volume, and this explains the contraction factor, $\gamma^{-1}$, in (6.2.6). The non-electromagnetic origin of the Poincaré stress sounded the death knell for a purely electromagnetic explanation of the mass of the electron.

This is a classic case where prejudice overruled logic, and force was applied to make a preconceived notion come out as desired. However, Planck was not ruled by any of these prejudices. Transferring attention from energy to heat content, which, as we know from the Joule–Thomson process, is conserved in an adiabatic process, we find

$$\begin{aligned} H' &= E' + PV' \\ &= \gamma^{-1}(E + PV) + \beta^2\gamma(E + PV) \\ &= \tfrac{4}{3}\gamma E. \end{aligned} \tag{6.2.7}$$
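Both routes to the factor $\tfrac{4}{3}\gamma$, Poincaré's added mechanical energy (6.2.6) and Planck's enthalpy (6.2.7), can be verified numerically. The sketch below uses units in which $m_{el}c^2 = E_{el} = 1$, and treats $P$ as invariant with $V' = V/\gamma$, as stated in the text. The function names and the value of $\beta$ are assumptions made for illustration.

```python
import math

def gamma(beta):
    return 1.0 / math.sqrt(1.0 - beta**2)

def transformed_energy(E, PV, beta):
    """(6.2.5): E' = gamma (E + beta^2 P V)."""
    return gamma(beta) * (E + beta**2 * PV)

def transformed_enthalpy(E, PV, beta):
    """H' = E' + P V', with P invariant and V' = V / gamma."""
    return transformed_energy(E, PV, beta) + PV / gamma(beta)

E_el = 1.0                    # rest field energy, units with m_el c^2 = 1
PV_field = E_el / 3.0         # photon equation of state (6.2.3)
E_mech = E_el / 3.0           # Poincare's mechanical energy
PV_mech = -E_el / 3.0         # the negative Poincare stress times V

beta = 0.8
g = gamma(beta)

# Poincare: field plus mechanical energies.
E_mech_moving = transformed_energy(E_mech, PV_mech, beta)
E_total = transformed_energy(E_el, PV_field, beta) + E_mech_moving

# Planck: enthalpy of the field alone.
H_moving = transformed_enthalpy(E_el, PV_field, beta)

print(E_mech_moving, E_el / (3.0 * g))      # (6.2.6)
print(E_total, H_moving, 4.0 / 3.0 * g)     # all three should agree
```

The agreement of the last three numbers shows that the term Poincaré had to add by hand is the $PV'$ term that the enthalpy carries automatically.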

Dividing through by the contracted volume $V' = \gamma^{-1}V$ in (6.2.7) gives the total enthalpy density,

$$h' = h + u\cdot g' = \gamma^2\rho_{el}c^2,$$

where $g' = \gamma^2\rho_{el}u$ is the momentum density, and

$$\rho = (\varepsilon + P)/c^2 = h/c^2 = \tfrac{4}{3}\varepsilon/c^2, \tag{6.2.8}$$

is the Lorentz-invariant mass density given in terms of the enthalpy density, $h$. This is precisely Planck’s result!

Planck, in a talk given in Köln on 23 September 1908, referred to (6.2.8) as the law of inertia of energy: the corresponding momentum density $hu/c^2$ was in the same direction as the Poynting vector, $hu$, for the flow of energy. It is as it should be: mass is related to a conserved quantity in an adiabatic process.

To paraphrase Planck, consider a ponderable flux of energy under a pressure P through a surface element dA normal to the velocity u. In a time dt the mechanical work performed will be P · dA · u dt. The accompanying energy transferred is dA · ε · u dt, where ε is the energy density in (6.2.8). The momentum density then will be their sum, $u(\varepsilon + P)/c^{2}$, referred to a unit of volume, where $(\varepsilon + P)/c^{2}$ can be considered the density of mass in (6.2.8), which “is a well-known relation in relativity theory,” but has subsequently been forgotten [Lavenda 02].

Although the presence of the PV term in the heat content means that an electron cannot be compressed to a point particle, no structure has been given to it to date. Equating the electrostatic energy to the rest energy gives a radius of $10^{-15}$ m, known as the classical radius of the electron. However, modern experiments put the electron radius at much less than $10^{-15}$ m. But, because energy does not transform into energy under a Lorentz transform, it is necessary to consider the enthalpy with the additional PV term. Moreover, it is entirely reasonable to associate relativistic mass with the heat content, and not with the energy, because only the former is conserved in adiabatic processes where the volume is altered. So even if the electron’s size is extremely small, thermodynamics still demands that it can be attributed a volume.

6.3 Lorentz Transforms from the Velocity Composition Law

It may turn out that we shall be compelled to create a totally new mechanics, which we can imagine only vaguely, a mechanics in which inertia would increase with velocity, while the speed of light would be an insuperable limit.

Henri Poincaré, January 1904

In Victorian times, work was associated with controllable coordinates, and heat with uncontrollable ones [Thomson 68]. This decomposition can be extended to the velocity components of a particle. Consider the velocity at which a particle is moving to be comprised of two components, a uniform velocity component, u, and a component w due to random thermal motion. Since the latter will require averaging, only two components of w will be required: the components in the direction of motion and in the opposite direction to the motion.

The composition of collinear velocities,^b

\[ v_{\pm} = \frac{u \pm w}{1 \pm uw/c^{2}}, \tag{6.3.1} \]

for the addition of the causal, u, and random, w, velocity components is equivalent to the addition formula [Sommerfeld 09],

\[ \tanh(\theta' \pm \theta) = \frac{\tanh\theta' \pm \tanh\theta}{1 \pm \tanh\theta'\tanh\theta}, \tag{6.3.2} \]

for the hyperbolic tangent, with u/c = tanh θ′ and w/c = tanh θ. From this observation derives Robb’s [11] definition of ‘rapidity.’ We will consider the motion in a single dimension, indicating where necessary the generalization to higher dimensions.
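The equivalence of (6.3.1) and (6.3.2) can be checked numerically. The following is a minimal sketch, not part of the original text: the function names `compose` and `rapidity` and the sample speeds are illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def compose(u, w, sign=+1):
    """Collinear composition (6.3.1): v = (u +/- w) / (1 +/- u*w/c^2)."""
    return (u + sign * w) / (1.0 + sign * u * w / C**2)

def rapidity(v):
    """Rapidity theta defined through v/c = tanh(theta)."""
    return math.atanh(v / C)

# Illustrative values only: a drift speed u and a thermal speed w.
u, w = 0.6 * C, 0.3 * C
v_plus, v_minus = compose(u, w, +1), compose(u, w, -1)

# Velocities compose nonlinearly, but their rapidities simply add or subtract.
print(rapidity(v_plus), rapidity(u) + rapidity(w))
print(rapidity(v_minus), rapidity(u) - rapidity(w))
```

Each printed pair should agree to rounding error, which is the content of the hyperbolic-tangent addition formula.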

Given the collinear velocities (6.3.1), the total energy of N non-interacting particles of rest mass m is

\[ E = \frac{1}{2}\frac{Nmc^{2}}{\sqrt{1 - v_{+}^{2}/c^{2}}} + \frac{1}{2}\frac{Nmc^{2}}{\sqrt{1 - v_{-}^{2}/c^{2}}}. \]

Introducing the composition law (6.3.1) results in

\[ E = \frac{\left(mN + K(w)/c^{2}\right)c^{2}}{\sqrt{1 - u^{2}/c^{2}}} = Nmc^{2}\cosh\theta\cosh\theta', \tag{6.3.3} \]

where

\[ K(w) = Nmc^{2}\left[\frac{1}{\sqrt{1 - w^{2}/c^{2}}} - 1\right] = Nmc^{2}(\cosh\theta - 1), \tag{6.3.4} \]

is the random kinetic energy of the thermal motion of N particles.

b The original idea of using a stochastic component of the velocity is unknown. Ives [44] used it, and it post-dates Becker [33], whom he referenced, but the trail seems to stop there.


Analogously, the momenta of N non-interacting particles,

\[ G = \frac{1}{2}\frac{Nmv_{+}}{\sqrt{1 - (v_{+}/c)^{2}}} + \frac{1}{2}\frac{Nmv_{-}}{\sqrt{1 - (v_{-}/c)^{2}}}, \]

becomes

\[ G = \frac{\left(mN + K(w)/c^{2}\right)u}{\sqrt{1 - u^{2}/c^{2}}} = Nmc\sinh\theta'\cosh\theta, \tag{6.3.5} \]

under the composition law, (6.3.1). The energy, (6.3.3), and momentum, (6.3.5), will not form a two-vector unless the random kinetic energy, (6.3.4), is constant because

\[ E^{2}/c^{2} - G^{2} = (Nmc)^{2}\cosh^{2}\theta. \]

In fact, averaging will be required since w is the random thermal component of the velocity. Notwithstanding this, the total energy and momentum form a two-vector without any averaging.

Now consider the differences in energy and momentum. They are

\[ \Delta E = \frac{1}{2}Nmc^{2}\left(\frac{1}{\sqrt{1 - v_{+}^{2}/c^{2}}} - \frac{1}{\sqrt{1 - v_{-}^{2}/c^{2}}}\right) = \frac{Nm\,u\,w}{\sqrt{1 - u^{2}/c^{2}}\sqrt{1 - w^{2}/c^{2}}} = Nmc^{2}\sinh\theta'\sinh\theta, \tag{6.3.6} \]

and

\[ \Delta G = \frac{1}{2}Nm\left(\frac{v_{+}}{\sqrt{1 - v_{+}^{2}/c^{2}}} - \frac{v_{-}}{\sqrt{1 - v_{-}^{2}/c^{2}}}\right) = \frac{\left(mN + K(w)/c^{2}\right)w}{\sqrt{1 - u^{2}/c^{2}}} = Nmc\sinh\theta\cosh\theta', \tag{6.3.7} \]

respectively. Adding (6.3.3) and (6.3.6) gives the total energy,

\[ E' = E + \Delta E = Nmc^{2}\cosh(\theta' + \theta), \tag{6.3.8} \]

while adding (6.3.5) and (6.3.7) gives the total momenta,

\[ G' = G + \Delta G = Nmc\sinh(\theta' + \theta). \tag{6.3.9} \]


It is now evident that (6.3.8) and (6.3.9) form a two-vector,

\[ E'^{2}/c^{2} - G'^{2} = (Nmc)^{2}. \tag{6.3.10} \]

Unwittingly, and most surprisingly, (6.3.8) and (6.3.9) constitute Lorentz transforms for the energy and momentum! All we have to do is to expand the angle-addition formulas, and note that $G_{0} = Nm_{0}c\sinh\theta$ and $E_{0} = Nm_{0}c^{2}\cosh\theta$, with γ = cosh θ′. Implicit in this is the definition of rapidity, u/c = tanh θ′, for then the Lorentz transformation can be written in the suggestive form as a rotation from θ to θ + θ′:

\[ E' = \gamma(E + uG) = E + \Delta E, \tag{6.3.11} \]

\[ G' = \gamma\left(G + \frac{u}{c^{2}}E\right) = G + \Delta G, \tag{6.3.12} \]

which are none other than the addition formulas for the hyperbolic sine and cosine. We could equally well have assumed the Lorentz transforms and worked our way back to derive the composition law for the velocities. The procedure is completely reversible.
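The chain from the two-stream sums to the Lorentz transforms can be verified numerically. The sketch below is illustrative only: it assumes units with N = m = c = 1, fixed (unaveraged) rapidities, and a hypothetical helper name `stream_sums`.

```python
import math

def stream_sums(theta_p, theta, N=1.0, m=1.0, c=1.0):
    """Half-sums and half-differences over the two streams v+ and v-.

    theta_p is the rapidity of the drift u, theta that of the thermal speed w.
    """
    def energy(th):
        return N * m * c**2 * math.cosh(th)

    def momentum(th):
        return N * m * c * math.sinh(th)

    plus, minus = theta_p + theta, theta_p - theta
    E = 0.5 * (energy(plus) + energy(minus))        # (6.3.3)
    G = 0.5 * (momentum(plus) + momentum(minus))    # (6.3.5)
    dE = 0.5 * (energy(plus) - energy(minus))       # (6.3.6)
    dG = 0.5 * (momentum(plus) - momentum(minus))   # (6.3.7)
    return E, G, dE, dG

theta_p, theta = 0.8, 0.3            # arbitrary sample rapidities
E, G, dE, dG = stream_sums(theta_p, theta)

# Rest-frame values and boost parameters (N = m = c = 1 assumed).
gamma, u = math.cosh(theta_p), math.tanh(theta_p)
E0, G0 = math.cosh(theta), math.sinh(theta)
lorentz_E = gamma * (E0 + u * G0)    # right-hand side of (6.3.11)
lorentz_G = gamma * (G0 + u * E0)    # right-hand side of (6.3.12)

print(E + dE, lorentz_E)
print(G + dG, lorentz_G)
print((E + dE)**2 - (G + dG)**2)     # invariant (6.3.10), equal to 1 here
```

The half-sums alone give E² − G² = cosh² θ, which depends on the thermal rapidity; only the totals E + ΔE and G + ΔG reproduce the invariant (6.3.10).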

These equations should be compared to

\[ E' = \int T'_{tt}\,dV = \gamma\int\left(T_{tt} + \beta^{2}T_{xx}\right)dV_{0}, \tag{6.3.13} \]

\[ G' = \frac{i}{c}\int T'_{xt}\,dV = \gamma\frac{u}{c^{2}}\int\left(T_{tt} + T_{xx}\right)dV_{0}, \tag{6.3.14} \]

for a frame moving at constant velocity u in the x-direction. $T_{tt}$ and $T_{xx}$ are the energy density and the energy flux density of the stress tensor T, respectively, in the frame at rest. According to Planck’s hypothesis [Becker 33], which Pauli [58] refers to as a theorem: To each energy flux density, S, there corresponds a momentum density, $g = G_{0}/V_{0}$,

\[ g = S\,u/c^{2}. \tag{6.3.15} \]

However, (6.3.15) had been derived seven years before by Poincaré [00], as we saw in Sec. 1.2.2.2. This is yet another example of Stigler’s law of eponymy.

Regardless of who discovered it, (6.3.15) is correct and will satisfy the energy balance equation. However, we will show that it is not correct to relate the energy flux with space components of the stress tensor T, such that

\[ S = u \cdot T. \tag{6.3.16} \]

We have to thank (6.3.16) for all the inconsistencies in dealing with the resulting balance equations.

Considering the body to be isotropic, $T_{xx} = T_{yy} = T_{zz} = P$, where P is the pressure, the Lorentz transforms (6.3.13) and (6.3.14) would result from (6.3.11) and (6.3.12) by setting

\[ G = \frac{u}{c^{2}}PV, \tag{6.3.17} \]

so as to give

\[ E' = \gamma\left(E + \beta^{2}PV\right), \tag{6.3.18a} \]

\[ G' = \gamma\frac{u}{c^{2}}(E + PV). \tag{6.3.18b} \]

But, we are not permitted to write $G' = uPV'/c^{2}$ because the volume undergoes a FitzGerald–Lorentz contraction,

\[ V' = V\gamma^{-1}. \tag{6.3.19} \]

The Lorentz transforms (6.3.18a) and (6.3.18b) were first given by Planck [07] by assuming that his kinetic potential, K = PV, was a function of the velocity, as well as the temperature and volume.

Adding PV′ to both sides of (6.3.18a), and using (6.3.19), results in

\[ H' = E' + PV' = \gamma(E + PV). \tag{6.3.20} \]

This is the enthalpy, and introducing it into (6.3.18b) gives

\[ G' = \frac{u}{c^{2}}H', \tag{6.3.21} \]

which can no longer be considered a spatial component of the stress, as Planck intended in (6.3.16).

The problem is that (6.3.18a) and (6.3.18b) are not Lorentz transforms, but, rather, definitions of the total energy and momentum. It was Planck [07] who first wrote his “gesamte energie” (total energy) as

\[ E'_{tot} = E' + uG' = E/\gamma + \gamma\beta^{2}H = \gamma\left(E + \beta^{2}PV\right), \tag{6.3.22} \]

where

\[ E = TS - PV, \tag{6.3.23} \]

is the internal energy, and S is the entropy.

The transformation in (6.3.22) is the result of the FitzGerald–Lorentz contraction on the volume, (6.3.19), and the fact that objects get cooler as they travel at greater speeds^c

\[ T' = T\gamma^{-1}. \tag{6.3.24} \]

c Over the years controversies have arisen as to what gets colder when in movement. In 1963 Ott supposedly demonstrated the inverse of Planck’s transformation laws. Following this, Arzelies independently arrived at the same conclusions. Landsberg dissented from both choices and made temperature an invariant. And somewhat later van Kampen made both the temperature and heat Lorentz invariants. A history of thermodynamic Lorentz invariants is given by Callen and Horowitz [71] who sided with Landsberg, and claimed that enthalpy, and not the energy, is the natural potential for relativistically confined systems. All these proposals overlooked the fact that the entropy must be a Lorentz invariant for, otherwise, we could distinguish rest from motion by the degree of disorder of the system. So heat and temperature must transform the same way, and both decrease when in motion.

Under the Lorentz transformation, the momentum transforms from (6.3.17) to (6.3.21). The energy transforms from the internal energy (6.3.23) to the total energy (6.3.22). In thermodynamics, it is the enthalpy H = E + PV, and not the internal energy, E, that transforms correctly under a Lorentz transformation,

\[ H' = \gamma H, \tag{6.3.25} \]

which together with the momentum,

\[ G' = \gamma\frac{u}{c^{2}}H, \tag{6.3.26} \]

are Lorentz-invariant. This is to say that

\[ H'^{2}/c^{2} - G'^{2} = H^{2}/c^{2}, \tag{6.3.27} \]

like (6.3.10), is a transformation from a state at rest to one in relative motion. Introducing (6.3.25) into (6.3.26) gives Planck’s relation,

\[ G = \frac{uH}{c^{2}}, \tag{6.3.28} \]

showing that the ratio $H/c^{2}$ behaves as the mass.
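A short numerical check of (6.3.18a) to (6.3.27) makes the role of the PV term concrete. This is a sketch under stated assumptions: units with c = 1 by default, an invariant pressure P, arbitrary sample values, and a hypothetical function name `boost_thermo`.

```python
import math

def boost_thermo(E, P, V, beta, c=1.0):
    """Transform energy, momentum, volume and enthalpy to a frame moving at u = beta*c."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    u = beta * c
    E_p = gamma * (E + beta**2 * P * V)     # (6.3.18a)
    G_p = gamma * u / c**2 * (E + P * V)    # (6.3.18b)
    V_p = V / gamma                         # (6.3.19)
    H_p = E_p + P * V_p                     # (6.3.20), with P unchanged
    return E_p, G_p, V_p, H_p, gamma

E, P, V, beta = 3.0, 1.0, 1.0, 0.6          # sample rest-frame values
E_p, G_p, V_p, H_p, gamma = boost_thermo(E, P, V, beta)
H = E + P * V

print(H_p, gamma * H)                       # (6.3.25): both 5.0 for these inputs
print(H_p**2 - G_p**2, H**2)                # (6.3.27): both 16.0
print(E_p**2 - G_p**2, E**2)                # not equal: energy alone is not part of a two-vector
```

For these inputs the enthalpy pair satisfies the invariant exactly, while the energy pair gives 8.64 against 9, which is the point of replacing the energy by the heat content.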

The heat content of a body is a measure of its inertia: the mass, in general, will be a function both of the temperature and volume.

Planck also derived his relation (6.3.28) on thermodynamic grounds on the assumption that his kinetic potential, K, is a homogeneous function of T, V, and β, i.e.

\[ K = \frac{\partial K}{\partial T}T + \frac{\partial K}{\partial V}V - \frac{(1 - \beta^{2})}{\beta}\frac{\partial K}{\partial\beta}. \tag{6.3.29} \]

If we introduce the total energy,

\[ E_{tot} = uG + TS - K, \]

into (6.3.29) we come out with

\[ E_{tot} + PV = \frac{c^{2}}{u}G, \]

implying that $H_{tot}/c^{2}$ is the mass, where $H_{tot} = E_{tot} + PV$ is the total enthalpy since $E_{tot} = E + uG$ is the total energy. The internal energy contracts when in motion, but the work necessary to keep the system in a state of uniform motion ensures that the total energy will dilate when in motion.

As Planck [07] rightly emphasized, the stresses acting on the surface of a particle also contribute to the mass–energy of the particle so it is really a mass–enthalpy relationship. In the present formulation, it is the kinetic energy, (6.3.4), resulting from the heat motion that contributes to the mass of the particle. But, this motion even in a state at rest, u = 0, should not vanish, as (6.3.10) would have us believe. The reason is that (6.3.4) is defined stochastically so that without suitable averaging it has no meaning.


If we average (6.3.3) and (6.3.5), $(E/c^{2}, G)$ does, indeed, form a two-vector since

\[ E^{2}/c^{2} - G^{2} = (Nm'c)^{2}, \]

where

\[ m' = m + K(w)/Nc^{2}. \tag{6.3.30} \]

The average random kinetic energy, resulting from random thermal motions, contributes to the increase in the inertial mass.

6.4 Density Transformations and the Field Picture

We have considered how energy and momentum transform, as well as other thermodynamic quantities. Now we consider how their densities transform in one dimension. The total energy density can be written as

\[ \varepsilon' = \frac{1}{2}mnc^{2}\left(\frac{1}{1 - v_{+}^{2}/c^{2}} + \frac{1}{1 - v_{-}^{2}/c^{2}}\right), \]

where n is the number density. With the aid of the composition law for the velocities, (6.3.1), we find

\[ \varepsilon' = mnc^{2}\left[\frac{1 + (uw/c^{2})^{2}}{(1 - u^{2}/c^{2})(1 - w^{2}/c^{2})}\right]. \tag{6.4.1} \]

Because the velocities u and w enter symmetrically, the FitzGerald–Lorentz contraction of the volume should be given by

\[ V' = V\sqrt{1 - u^{2}/c^{2}}\,\sqrt{1 - w^{2}/c^{2}}. \tag{6.4.2} \]

Since (6.4.2) is not (6.3.19) we can expect that the densities will not give back the transformation laws for their macroscopic counterparts on multiplication by the volume in the moving frame. This can be seen by multiplying both sides of (6.4.1) by (6.4.2), and averaging; it does not yield (6.3.18a), but, rather,

\[ E' = \gamma\left(Nmc^{2} + K(w) + \beta^{2}PV\right), \tag{6.4.3} \]


where

\[ P = \frac{nmw^{2}}{\sqrt{1 - w^{2}/c^{2}}} \tag{6.4.4} \]

is the pressure in one dimension. The first term in the parenthesis in (6.4.3) is our definition of the total energy, (6.3.3), in a state of rest. Thus, (6.4.3) can be considered as the Lorentz transform applied to the energy to take it from a state of rest to one of uniform velocity, (6.3.18a).

Consider now the momentum density,

\[ g = \frac{1}{2}nm\left\{\frac{v_{+}}{1 - v_{+}^{2}/c^{2}} + \frac{v_{-}}{1 - v_{-}^{2}/c^{2}}\right\}. \]

Again making use of the velocity composition law results in

\[ g = \frac{mnu}{1 - u^{2}/c^{2}}\left(\frac{1 + w^{2}/c^{2}}{1 - w^{2}/c^{2}}\right). \tag{6.4.5} \]

Multiplying both sides of (6.4.5) by (6.4.2), and averaging, gives

\[ G' = \gamma\frac{u}{c^{2}}\left(Nmc^{2} + K(w) + PV\right). \tag{6.4.6} \]

The first term in the parenthesis is, again, the energy (6.3.3) in a state of rest. Thus, (6.4.6) can be considered as the Lorentz transform on the momentum, (6.3.18b).
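The passage from the densities (6.4.1) and (6.4.5) to the extensive forms (6.4.3) and (6.4.6) can be reproduced numerically for a single fixed thermal speed. The sketch below is illustrative only: it assumes c = 1, n = m = V = 1, and it skips the averaging over w that the text calls for.

```python
import math

def gamma(v):
    """Lorentz factor for a speed v in units of c."""
    return 1.0 / math.sqrt(1.0 - v * v)

def compose(u, w, sign):
    """Collinear composition law (6.3.1) with c = 1."""
    return (u + sign * w) / (1.0 + sign * u * w)

u, w = 0.6, 0.3                      # sample drift and thermal speeds
n = m = V = 1.0                      # rest-frame number density, mass, volume
N = n * V
vp, vm = compose(u, w, +1), compose(u, w, -1)

# Densities built from the two streams.
eps = 0.5 * m * n * (gamma(vp)**2 + gamma(vm)**2)
g = 0.5 * m * n * (vp * gamma(vp)**2 + vm * gamma(vm)**2)

# Closed forms (6.4.1) and (6.4.5).
eps_closed = m * n * (1 + (u * w)**2) * gamma(u)**2 * gamma(w)**2
g_closed = m * n * u * gamma(u)**2 * (1 + w * w) * gamma(w)**2

# Extensive quantities via the doubly contracted volume (6.4.2).
V_prime = V / (gamma(u) * gamma(w))
K = N * m * (gamma(w) - 1.0)         # random kinetic energy (6.3.4)
P = n * m * w * w * gamma(w)         # one-dimensional pressure (6.4.4)
E_prime = gamma(u) * (N * m + K + u * u * P * V)   # (6.4.3)
G_prime = gamma(u) * u * (N * m + K + P * V)       # (6.4.6)

print(eps * V_prime, E_prime)
print(g * V_prime, G_prime)
```

Each printed pair should coincide, confirming that the second contraction factor in (6.4.2) is what converts the squared Lorentz factors of the densities into the single factors appearing in K(w) and P.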

The work necessary to keep the platform moving at a constant speed u is u · G′. The increment in the heat in a frame moving at this velocity is

\[ dQ' = dE' - dL', \]

according to the first law, where the increment in the work is

\[ dL' = -P\,dV' + u \cdot dG'. \]

Thus,

\[ dQ' = dE' + P\,dV' - u \cdot dG' = \gamma\left[dE + \beta^{2}P\,dV\right] + \gamma^{-1}P\,dV - \gamma\beta^{2}\left[dE + P\,dV\right] = \gamma^{-1}\left[dE + P\,dV\right] = \gamma^{-1}dQ, \tag{6.4.7} \]

showing that a moving body loses heat because energy must be spent to keep it in motion. This is entirely reasonable, since there are no free lunches. If objects became hotter the faster they travel, they would constitute a perpetuum mobile, since they could partially transform their heat into work to be made to go still faster.
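The cancellation in (6.4.7) is easy to confirm with numbers. A minimal sketch, assuming c = 1, constant P and u, and arbitrary sample increments; the function name `heat_increment` is illustrative and not from the text.

```python
import math

def heat_increment(dE, P, dV, beta):
    """Evaluate dQ' = dE' + P dV' - u dG' term by term, following (6.4.7)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    dE_p = gamma * (dE + beta**2 * P * dV)       # increment of (6.3.18a)
    dV_p = dV / gamma                            # increment of (6.3.19)
    u_dG_p = gamma * beta**2 * (dE + P * dV)     # u times the increment of (6.3.18b)
    return dE_p + P * dV_p - u_dG_p, gamma

dE, P, dV, beta = 2.0, 1.0, 0.5, 0.6
dQ_p, gamma = heat_increment(dE, P, dV, beta)
dQ = dE + P * dV

print(dQ_p, dQ / gamma)   # both 2.0 for these inputs
```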

The product of the momentum G and the deterministic velocity u is the work necessary to keep the system in a state of steady motion, while the average of the product of the change in momentum, ΔG, in a state of rest, and the random velocity w is proportional to the static pressure.

Equations (6.4.7) make another point: We used the volume contraction law (6.3.19), and not (6.4.2). Although the random thermal velocity was crucial in (6.4.5) in defining the pressure according to (6.4.4), it has henceforth disappeared on the macroscopic scale. Planck did not need (6.4.4) to find that the pressure was a relativistic invariant, the same in all inertial frames. Hence, there is a loss of information as we transform from densities to extensive quantities, and the volume contraction (6.4.2) is the last vestige of the actions of random thermal motions. We shall come back to this point in the next section.

The field picture can be derived from a modified version of Planck’s derivation. We introduce the kinetic potential K as a function of x, T, and V. The kinetic potential K, which is PV, transforms as

\[ K' = K\sqrt{\frac{\partial x'}{\partial x}}, \tag{6.4.8} \]

where

\[ \sqrt{\frac{\partial x'}{\partial x}} = \frac{1 - x'u/c^{2}}{\sqrt{1 - \beta^{2}}} = \frac{\sqrt{1 - \beta^{2}}}{1 + xu/c^{2}}, \tag{6.4.9} \]

and the second equality follows from

\[ 1 - \beta^{2} = \left(1 + \frac{xu}{c^{2}}\right)\left(1 - \frac{x'u}{c^{2}}\right), \tag{6.4.10} \]

i.e.

\[ u = \frac{x' - x}{1 - xx'/c^{2}}. \]

It is apparent that the kinetic potential undergoes a FitzGerald–Lorentz contraction when transferred to the frame traveling at a relative velocity u greater than the other frame if that frame is at rest, x = 0, implying that x′ = u. This we will do at the end of the calculation.

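Now we inquire as to how the momentum,

\[ G = \frac{\partial K}{\partial x}, \tag{6.4.11} \]

transforms. From (6.4.8) we have

\[ G' = \frac{\partial K'}{\partial x'} = \frac{\partial}{\partial x}\left[K\sqrt{\frac{\partial x'}{\partial x}}\right]\frac{\partial x}{\partial x'} + \left\{\frac{\partial K}{\partial T}\frac{\partial T}{\partial x'} + \frac{\partial K}{\partial V}\frac{\partial V}{\partial x'}\right\}\sqrt{\frac{\partial x'}{\partial x}}. \tag{6.4.12} \]

Both the temperature and volume transform as

\[ X' = \gamma\left(1 - \frac{x'u}{c^{2}}\right)X, \tag{6.4.13} \]

which becomes the FitzGerald–Lorentz contraction when x′ = u. Hence, (6.4.12) is given explicitly by

\[ G' = \gamma\left\{G + \frac{u}{c^{2}}(E + PV)\right\}, \tag{6.4.14} \]

when x′ = u, and use has been made of the Euler relation for the internal energy, E = TS − PV, and K = PV.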

According to Planck, the total energy is given as a double Legendre transform of the kinetic potential,

\[ E = T\frac{\partial K}{\partial T} + V\frac{\partial K}{\partial V} - K, \tag{6.4.15} \]

which makes K a function of the internal state variables T and V, and also of $\gamma^{-1}$, where

\[ S = \frac{\partial K}{\partial T}, \qquad P = \frac{\partial K}{\partial V}, \qquad G = \frac{1}{c}\frac{\partial K}{\partial\beta}. \]

To find the functional dependence of K we take its differential, and use

\[ dE_{tot} = u\,dG + T\,dS - P\,dV, \]

to obtain

\[ dK = S\,dT + P\,dV + G\,du. \]

Finally, if we introduce the Euler relation for $E_{tot}$ into (6.4.15) we find K = PV.

That K is a function of V, T, u, means that P = k(T, u), where k is the density of the kinetic potential. As such this contradicts Planck’s finding that P is an invariant. This is precisely what Abraham [20] found, who then goes on to treat blackbody radiation in a moving cavity. There he finds $P = \mathrm{const.} \times T^{4}\gamma^{2}$, which on the basis of (6.3.24) makes the pressure an invariant. But, the pressure is only a function of u through its dependency on T. And if (6.3.24) holds, then P is invariant.

This can be readily seen by considering radiation being reflected off a mirror, which would behave as a monochromatic piece of charcoal in a blackbody. If θ is the angle that the incoming radiation makes with the velocity vector of the moving blackbody, and θ′ the angle of reflection with respect to the direction of motion, then using the Doppler shift and Wien’s displacement law together with Stefan’s law, we can go a very long way in determining the correct velocity dependencies on the thermodynamic quantities.

Stefan’s law says that the intensity of radiation, I(θ, β), varies as $T^{4}$. Wien’s displacement law says that the product of the frequency ν, where the intensity of radiation peaks as a function of frequency, and absolute temperature T is constant. Thus, the ratio of the incoming and reflected radiation is given by the Doppler shift,

\[ \frac{I(\theta, \beta)}{I(\theta', \beta)} = \left(\frac{\nu}{\nu'}\right)^{4} = \left(\frac{1 - \beta\cos\theta}{1 - \beta\cos\theta'}\right)^{-4}. \]

If we observe the radiation in the direction normal to the motion of the blackbody, then θ′ = π/2, and the above expression reduces to

\[ I(\theta, \beta) = \frac{I(\pi/2, \beta)}{(1 - \beta\cos\theta)^{4}}. \tag{6.4.16} \]

Integrating the intensity over the element of solid angle 2π sin θ dθ gives the energy density,

\[ \varepsilon = \frac{2\pi}{c}\int_{0}^{\pi}\sin\theta\, I(\theta, \beta)\,d\theta. \tag{6.4.17} \]


Moreover, the component of the momentum is less than in the case that the incident ray were parallel to the velocity in proportion to cos θ. Consequently, with I(θ, β) the intensity of radiation, the momentum density is

\[ g = \frac{2\pi}{c^{2}}\int_{0}^{\pi}\sin\theta\cos\theta\, I(\theta, \beta)\,d\theta. \tag{6.4.18} \]

These relations were first derived by Mosengeil [07] in his dissertation, and were published posthumously by Planck, and generalized by him. Quite remarkably the relation,

\[ P\beta = gc - \beta\varepsilon = \frac{2\pi}{c}\int_{0}^{\pi}(\cos\theta - \beta)\sin\theta\, I(\theta, \beta)\,d\theta, \tag{6.4.19} \]

is another way of writing (6.3.28) in density form. The first published prediction that radiation ‘carries’ mass was made by Hasenöhrl [04, 05] in the years 1904 and 1905.

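Introducing (6.4.16) into (6.4.17), (6.4.18), and (6.4.19), and performing the integration, with I(π/2, β) the intensity normal to the motion, leads to

\[ \varepsilon = \frac{2\pi}{c}I(\pi/2, \beta)\int_{0}^{\pi}\frac{\sin\theta\,d\theta}{(1 - \beta\cos\theta)^{4}} = \frac{4\pi}{c}I(\pi/2, \beta)\cdot\frac{1 + \tfrac{1}{3}\beta^{2}}{(1 - \beta^{2})^{3}}, \]

\[ g = \frac{2\pi}{c^{2}}I(\pi/2, \beta)\int_{0}^{\pi}\frac{\sin\theta\cos\theta\,d\theta}{(1 - \beta\cos\theta)^{4}} = \frac{16\pi}{3c^{2}}I(\pi/2, \beta)\cdot\frac{\beta}{(1 - \beta^{2})^{3}}, \]

\[ P = \frac{4\pi}{3c}I(\pi/2, \beta)\cdot(1 - \beta^{2})^{-2}. \]

Taking ratios of the terms eliminates the unknown I(π/2, β) and leads to

\[ gc = \varepsilon\cdot\frac{4\beta}{3 + \beta^{2}} = \frac{\partial k}{\partial\beta}, \]

\[ P = \varepsilon\cdot\frac{1 - \beta^{2}}{3 + \beta^{2}} = k. \]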

These two relations lead to the differential equation,

\[ \frac{\partial k}{\partial\beta} = k\cdot\frac{4\beta}{1 - \beta^{2}}, \]

which can be integrated to give

\[ k = \frac{\Phi(T)}{(1 - \beta^{2})^{2}}, \]

where the constant of integration, Φ(T), can be a function only of T. Inserting this expression for k into (6.4.15) leads to the differential equation for the unknown Φ, viz.

\[ T\frac{d\Phi}{dT} = 4\Phi. \]

Integration immediately yields Stefan’s law, $\Phi(T) = \tfrac{\sigma}{3}T^{4}$, where σ is the radiation constant. Hence, the pressure, energy, momentum, and entropy are found to be

\[ P = \frac{\sigma}{3}\left(\frac{T}{\sqrt{1 - \beta^{2}}}\right)^{4}, \]

\[ E = \frac{\sigma}{3}T^{4}V\,\frac{3 + \beta^{2}}{(1 - \beta^{2})^{3}}, \]

\[ G = \frac{4}{3}\frac{\sigma}{c}T^{4}V\,\frac{\beta}{(1 - \beta^{2})^{3}}, \]

\[ S = \frac{4}{3}\sigma\left(\frac{T}{\sqrt{1 - \beta^{2}}}\right)^{3}\cdot\frac{V}{\sqrt{1 - \beta^{2}}}. \]

It follows from (6.3.19) and (6.3.24) that the pressure and entropy are relativistic invariants.
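The angular integrals behind the moving-cavity results can be checked by direct quadrature. The sketch below is illustrative: it evaluates only the dimensionless integrals (the common prefactor involving the normal intensity cancels in the ratios), uses a hand-written Simpson rule to stay self-contained, and picks β = 0.5 arbitrarily.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def moments(beta):
    """Angular integrals of (6.4.17) and (6.4.18) after inserting (6.4.16)."""
    def weight(th):
        return math.sin(th) / (1.0 - beta * math.cos(th)) ** 4

    i_eps = simpson(weight, 0.0, math.pi)
    i_g = simpson(lambda th: math.cos(th) * weight(th), 0.0, math.pi)
    return i_eps, i_g

beta = 0.5
i_eps, i_g = moments(beta)

# Closed forms quoted in the text, stripped of the common prefactor.
closed_eps = 2.0 * (1.0 + beta**2 / 3.0) / (1.0 - beta**2) ** 3
closed_g = (8.0 / 3.0) * beta / (1.0 - beta**2) ** 3

ratio_g = i_g / i_eps                              # g c / epsilon
ratio_P = (i_g - beta * i_eps) / (beta * i_eps)    # P / epsilon, from (6.4.19)

print(i_eps, closed_eps)
print(i_g, closed_g)
print(ratio_g, 4 * beta / (3 + beta**2))
print(ratio_P, (1 - beta**2) / (3 + beta**2))
```

In the limit β → 0 the last ratio tends to 1/3, the familiar radiation-pressure relation for a cavity at rest.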

We have thus shown that the pressure depends on the velocity only through its dependence on the temperature; since the temperature of a body in motion is lower than when it is at rest, the pressure is the same in every inertial frame.

Although the twenty-one year old Pauli [58] writing in Encyklopädie der Mathematischen Wissenschaften “was still a student at the time, he was not only familiar with the most subtle arguments of the Theory of Relativity through his own research work, but was also fully conversant with the literature on the subject.” The quote is taken from Sommerfeld’s preface of the special German edition. However, when dealing with blackbody radiation in a moving cavity (Sec. 49) Pauli says that “by means of the formulas of Sec. 46” the above relativistic expressions follow. The formulas he is referring to are (6.4.6), (6.4.3), and the volume contraction, (6.4.2), without the stochastic velocity, etc., and are just the Lorentz transforms. They cannot provide the relativistic expressions for cavity radiation. Rather, it is the spectral distribution in the moving cavity argument that he subsequently develops which yields the relativistic thermodynamic expressions. It is not as he claims that these formulas agree with his previous results. The spectral distribution analysis is the only way of arriving at them.

Pauli also misses the relation between inertia and heat, (6.3.28). He finishes with the conclusion

Because of the extreme smallness of the expected effects it seems unlikely that the inertia of radiation energy could be demonstrated experimentally.

This can also be said of relativity in general. But, as Pauli points out, all results were derived before “the theory of relativity had been formulated,” alluding to the fact that there is more than one way to skin a cat.

The transformation of the total energy can be found from the Euler relation,

\[ E'_{tot} = uG' + ST' - PV'. \tag{6.4.20} \]

Introducing (6.4.14) and (6.4.13) gives

\[ E'_{tot} = \gamma\left(uG + E + \beta^{2}PV\right), \tag{6.4.21} \]

on setting x = 0. The transformation law (6.4.21) shows how the energy necessary to keep the frame at a constant speed u, uG′, transforms the internal energy into the total energy,

\[ E'_{tot} - uG' = \gamma^{-1}E. \]

The transformations (6.4.14) and (6.4.21), with PV added to both sides, can thus be written as

\[ G' = \gamma\left(G + \frac{u}{c^{2}}H\right), \]

\[ H' = \gamma(uG + H). \]

These Lorentz transforms attest to the invariance of

\[ H'^{2} - (cG')^{2} = H^{2} - (cG)^{2}, \tag{6.4.22} \]

which agrees with (6.3.27). We will now investigate the kinetic origins of the pressure using the relativistic virial theorem.


6.5 Relativistic Virial

The usual derivation of the pressure from the virial is to set the pressure equal to one-third the average of twice the kinetic energy density since the pressure will be isotropic in each of the three directions [Clausius 70]. Averaging is carried out over all the momenta with a given probability density function. Why should the deterministic momentum be averaged with respect to a probability distribution? The answer that would be given is that the particles are moving with a distribution of momenta [Einbinder 48a]. But such a distribution cannot be the work of external fields. So appeal is implicitly being made to the action of random thermal motion, and not to a velocity that is determined by external fields. The distinction between the two is never made.

So pressure must be defined in terms of the momenta due to random thermal motion. However, if we define pressure in terms of the product of the stochastic change in the momentum, (6.3.7), times the stochastic velocity, it must be for the state of rest, u = 0. For otherwise it would introduce a dependency of the pressure on the velocity which no averaging can annul, and so destroy its Lorentz invariance. Thus, we define the pressure as

\[ P = \frac{n}{3}\,\Delta G(w)\,w = \frac{n}{3}\frac{mw^{2}}{\sqrt{1 - w^{2}/c^{2}}}, \tag{6.5.1} \]

where the change in the stochastic momentum, ΔG, is given by (6.3.7) in a state at rest, u = 0.

The average pressure, (6.5.1), can be written as the virial

\[ 3PV = \left(K(w) + L(w)\right), \tag{6.5.2} \]

where K(w) is the average of the random kinetic energy, (6.3.4), and

\[ L(w) = Nmc^{2}\left[1 - \sqrt{1 - (w/c)^{2}}\right] \]

is the average Lagrangian due to random thermal motion. This is because

\[ \Delta G\,w = \left(K(w)/c^{2} + Nm\right)w^{2} = K(w) + Nmc^{2}\left[1 - \sqrt{1 - w^{2}/c^{2}}\right], \]

gives back the definition of the random kinetic energy, (6.3.4). Moreover, it shows that the virial must be defined in a state of rest, for, otherwise, the pressure would not be a Lorentz invariant. We will see that whether the system is in motion or at rest is determined by the ideal gas law [cf. (6.7.9) below]. For in that law there are two quantities, (6.3.19) and (6.3.24), that contract when in motion, but compensate one another so that the gas law always remains valid no matter what inertial frame we are in.

The virial (6.5.2) can be written entirely in terms of the average kinetic energy by eliminating the square root using the definition (6.3.4) of the random kinetic energy. We then obtain

\[ 3PV = \frac{K(w)\left(K(w) + 2Nmc^{2}\right)}{K(w) + Nmc^{2}} = K(w) + Nmc^{2}\,\frac{K(w)}{K(w) + Nmc^{2}}. \tag{6.5.3} \]

Equation (6.5.3) will only reduce to linear equations of state in the ultra- and non-relativistic limits, or in the small and large mass limits, respectively. Since K(w) is a random function, it makes no sense to say that it is much larger or smaller than the rest energy, $Nmc^{2}$ [Einbinder 48a]. In the former limit we get

\[ K(w) = 3PV, \tag{6.5.4a} \]

while in the latter limit,

\[ 2K(w) = 3PV, \tag{6.5.4b} \]

which are well-known.

However, if we appeal to thermodynamics, which is insensitive to fluctuations, an average of a function will be equal to a function of its average. Consequently, thermodynamics interprets (6.5.3) as the Grüneisen equation of state,

\[ PV = sK(w), \tag{6.5.5} \]

where s is the so-called Grüneisen parameter. On thermodynamic grounds, it is a phenomenological parameter ranging from s = 2/3 for a perfect, material gas, to s = 1/3 for a photon gas in three dimensions. A comparison of (6.5.3) with (6.5.5) gives

\[ s = \frac{1}{3}\left(1 + \frac{Nmc^{2}}{K(w) + Nmc^{2}}\right), \tag{6.5.6} \]

for the Grüneisen parameter. In d dimensions it will be given by

\[ s = \frac{1}{d}\left(1 + \frac{Nmc^{2}}{K(w) + Nmc^{2}}\right). \]

In 1908 Grüneisen enunciated his empirical law as “the ratio of the coefficient of expansion of an isotropic solid to its specific heat is independent of the temperature.”

The Grüneisen parameter, (6.5.6), will be independent of the average kinetic energy, or equivalently, the absolute temperature, in the extreme nonrelativistic limit, where $Nmc^{2} \gg K(w)$, in which case s = 2/3, and in the ultra-relativistic limit, where $Nmc^{2} \ll K(w)$, s = 1/3 in three dimensions.
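The interpolation between the two limits of (6.5.6) can be tabulated directly. This is a minimal sketch, assuming the rest energy Nmc² is used as the unit of energy; the function names are illustrative and not from the text.

```python
def virial_3pv(K, rest_energy):
    """Right-hand side of (6.5.3): 3PV as a function of the kinetic energy K."""
    return K * (K + 2.0 * rest_energy) / (K + rest_energy)

def gruneisen(K, rest_energy, d=3):
    """Grüneisen parameter (6.5.6), generalized to d dimensions."""
    return (1.0 + rest_energy / (K + rest_energy)) / d

rest = 1.0   # N m c^2 taken as the energy unit (assumption)
for K in (1e-6, 1.0, 1e6):
    # The second column is s from (6.5.6); the third is PV/K from (6.5.3).
    print(K, gruneisen(K, rest), virial_3pv(K, rest) / (3.0 * K))
```

The two columns agree for every K, and they run from 2/3 at small K to 1/3 at large K, so the equation of state is linear only at the two ends of the range.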

Equation (6.4.2) predicts a thermal contraction of the volume, in addition to the mechanical contraction of FitzGerald–Lorentz type. Since w is a random speed, only the average of (6.4.2) has meaning. On the basis of the definition of the random kinetic energy, (6.3.4), it is equivalent to

\[ V(w)' = \left(\frac{Nmc^{2}}{Nmc^{2} + K(w)}\right)V\gamma^{-1} = (3s - 1)V\gamma^{-1}, \tag{6.5.7} \]

the last equality follows in the absence of fluctuations, where an average of a function is equal to a function of the average. This shows that in the nonrelativistic limit, where $Nmc^{2} \gg K(w)$, there will be no thermal effect on the volume contraction. The same volume contraction is obtained as in the absence of random thermal velocities. However, as we move into the relativistic regime, we expect the effect to be ever increasing, leading to smaller and smaller volumes, which are proportional to the ratio, $Nmc^{2}/K(w)$. At the end of Sec. 6.7.2, we will provide more quantitative limits.

6.6 Which Pressure?

Various pressures have appeared in the literature. There is a pressure associated with the propagation of electromagnetic waves [cf. Sec. 3.5.1], another pressure invented by Poincaré that was supposed to hold the charge to the surface when an electron is in motion [cf. Sec. 6.2], and another kinetic gas pressure that appears in special relativity [cf. Sec. 6.4].

In Sec. 3.5.1 we derived the pressure of radiation from the Doppler effect applied to radiation reflected off a mirror. Here we derive the radiation pressure from electromagnetism by considering the Lorentz transformations:

\[ E_{x} = E'_{x}, \]

\[ E_{y} = \gamma\left(E'_{y} - \frac{u}{c}H'_{z}\right), \]

\[ E_{z} = \gamma\left(E'_{z} + \frac{u}{c}H'_{y}\right), \]

for the components of the electric, E, and magnetic, H, field strengths. For a plane wave of monochromatic light traveling in the x-direction, $E'_{y} = H'_{z}$, the radiation pressure is [cf. (3.5.1)]

\[ P = \frac{1}{4\pi}\left(E_{y}^{2} + H_{z}^{2}\right) = \left(\frac{1 - u/c}{1 + u/c}\right)P', \tag{6.6.1} \]

where the pressure in the stationary frame is twice the energy density of the incoming wave, $P' = E'^{2}_{y}/2\pi$, as in (3.5.2) when the mirror is at rest. The radiation pressure (6.6.1) is dependent on the velocity u, and is, therefore, not an invariant.
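The Doppler factor in (6.6.1) follows from the field transformations in a few lines. The sketch below is an illustration under stated assumptions: Gaussian units, unit field amplitudes in the primed frame, and the standard companion transformation for $H_{z}$, which the text does not write out.

```python
import math

def boosted_fields(Ey_p, Hz_p, beta):
    """Transform the transverse plane-wave fields out of the primed frame."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    Ey = gamma * (Ey_p - beta * Hz_p)   # as given in the text
    Hz = gamma * (Hz_p - beta * Ey_p)   # assumed standard companion relation
    return Ey, Hz

beta = 0.6
Ey_p = Hz_p = 1.0                        # plane wave: E'_y = H'_z
Ey, Hz = boosted_fields(Ey_p, Hz_p, beta)

P = (Ey**2 + Hz**2) / (4.0 * math.pi)    # left-hand side of (6.6.1)
P_prime = Ey_p**2 / (2.0 * math.pi)      # pressure with the mirror at rest
print(P / P_prime, (1 - beta) / (1 + beta))   # both 0.25 for beta = 0.6
```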

Then there is the Poincaré stress of Sec. 6.2, which, oddly enough, Einstein [07] attempted to associate with the kinetic gas pressure (6.5.1) in three dimensions. In the xy-plane, the electron in a state of rest will appear as a circle which becomes an ellipse when set in motion. The components of the force per unit charge are

\[ F_{x} = E'_{x}, \]

\[ F_{y} = E'_{y}/\gamma. \]

The magnitude of the force normal to the surface of an electron of radius a is

\[ F = \sqrt{F_{x}^{2} + F_{y}^{2}} = \sqrt{E'^{2}_{x} + E'^{2}_{y}/\gamma^{2}} = \frac{e^{2}}{4\pi a^{2}}\sqrt{\cos^{2}\theta' + \sin^{2}\theta'/\gamma^{2}}. \tag{6.6.2} \]


According to an argument by Page and Adams [40], the surface ele-ment in motion will be reduced by the same amount as the force, (6.6.2),

σ = 4πa2√ (

cos2 θ′ + sin2 θ′

γ2

), (6.6.3)

which is supposedly the origin of the invariant pressure.Dividing (6.6.2) by (6.6.3) gives twice the pressure acting on the surface

of the electron,

F/σ = e2

16π2a4 = 2P. (6.6.4)

Page and Adams fail to realize that the surface element is in the hyperbolicplane, and not in the Euclidean plane, so that there is no need to invokecharge conservation. We will rectify their situation in Sec. 9.8.

The electron’s surface can be considered as a perfect reflector. If the pressure is transmitted by waves [Poynting 10], the reflected waves push back just as much as the waves impinging on the electron’s surface so that the pressure is doubled [cf. Sec. 3.5.1]. Not surprisingly, the work,

$$\frac{4}{3}\pi a^3 P = \frac{e^2}{24\pi a} = (m_{em} - m_{el})c^2,$$

is the difference in energy between the electromagnetic rest energy, $m_{em}c^2 = e^2/6\pi\varepsilon_0 a$, and the electrostatic rest energy, $m_{es}c^2 = e^2/8\pi\varepsilon_0 a$. It is this mechanical tension that Poincaré had to invent so that $m_{em} = \frac{4}{3}m_{el}$ [cf. (5.4.7)]. This invariant pressure Einstein [07] rederived in his 1907 paper. Such a pressure is a pure constant, independent of the state of the body. Hence, it has nothing whatsoever to do with the kinetic gas pressure (6.5.1) that results from the random thermal motions.
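The arithmetic behind the work term can be verified with exact fractions. The sketch below tracks only the numerical coefficients of $e^2/\pi a$, and it assumes units in which $\varepsilon_0 = 1$ so that the displayed work and the quoted rest energies can be compared directly; the variable names are illustrative.

```python
from fractions import Fraction

# Coefficients of e^2/(pi a), in units where epsilon_0 = 1 (an assumption).
pressure_coeff = Fraction(1, 32)        # P = e^2 / (32 pi^2 a^4), half of (6.6.4)
work = Fraction(4, 3) * pressure_coeff  # (4/3) pi a^3 P as a multiple of e^2/(pi a)
electromagnetic = Fraction(1, 6)        # m_em c^2 = e^2 / (6 pi a)
electrostatic = Fraction(1, 8)          # m_es c^2 = e^2 / (8 pi a)

print(work, electromagnetic - electrostatic)   # 1/24 1/24
print(electromagnetic / electrostatic)         # 4/3
```

Both routes give the coefficient 1/24, and the ratio of the two rest energies is the factor 4/3 that the Poincaré tension was introduced to accommodate.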

6.7 Thermodynamics from Bessel Functions

Since the relativistic energy is related to the hyperbolic function cosh θ, and the momentum to sinh θ, the former enters into the exponential Boltzmann factor and a finite power of the latter is proportional to the density of states. The modified Bessel function is related to the relativistic partition function, which is a generating function with the only exception that the dummy variable is given a thermodynamic significance, i.e. the inverse temperature. Therefore, the modified Bessel functions can be used to determine thermodynamic equations of state that are valid in the relativistic regime. This approach dates back to 1911 with the work of Jüttner.

6.7.1 Boltzmann’s law via modified Bessel functions

We use the representation of the modified Bessel function of the second kind of order $r > \frac{1}{2}$,

$$K_r(x) = \frac{x^r}{(2r-1)!!}\int_0^\infty e^{-x\cosh\theta}\sinh^{2r}\theta\,d\theta, \tag{6.7.1}$$

where !! is the double factorial, or semi-factorial, function, i.e. $(2r-1)!! = 1\cdot 3\cdot 5\cdots(2r-1)$. An integration by parts yields

$$K_r(x) = \frac{x^{r-1}}{(2r-3)!!}\int_0^\infty e^{-x\cosh\theta}\sinh^{2(r-1)}\theta\cosh\theta\,d\theta. \tag{6.7.2}$$

Continuing to integrate by parts generates the well-known recursion relations that the $K_r(x)$ satisfy.

The asymptotic forms of the modified Bessel functions of the second kind are important since only in these limits will closed expressions for the probability densities exist. In the large x-limit, the asymptotic form of the modified Bessel function is

$$\lim_{x\to\infty}\frac{K_r(x)}{\sqrt{\pi/2x}}\,e^{x} = 1, \tag{6.7.3}$$

for any order, r. In the opposite limit of small x, the asymptotic form of the modified Bessel function is

$$K_r(x) = \frac{1}{2}(r-1)!\left(\frac{2}{x}\right)^r. \tag{6.7.4}$$

The parameter, x, will now be shown to be inversely proportional to the absolute temperature so that (6.7.3) and (6.7.4) will be the low and high temperature limits, respectively.
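The representation (6.7.1) and the two asymptotic forms can be checked numerically. The following is a minimal sketch using only the standard library and Simpson's rule; the helper names, the cut-off of the integration range, and the number of steps are illustrative choices. The small-x form is taken here as $\frac{1}{2}(r-1)!\,(2/x)^r$, the standard leading term, which reads $2/x^2$ for r = 2.

```python
import math

def double_factorial(n):
    """Return n!! for odd n, with (-1)!! = 1!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

def bessel_k(r, x, steps=20000):
    """K_r(x) for integer r >= 1 from (6.7.1), integrated by Simpson's rule."""
    theta_max = math.acosh(1.0 + 50.0 / x)   # integrand is negligible beyond this
    h = theta_max / steps
    def integrand(theta):
        return math.exp(-x * math.cosh(theta)) * math.sinh(theta) ** (2 * r)
    total = integrand(0.0) + integrand(theta_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return x ** r / double_factorial(2 * r - 1) * total * h / 3.0

def large_x_ratio(r, x):
    """K_r(x) e^x / sqrt(pi/2x), which (6.7.3) says tends to 1."""
    return bessel_k(r, x) * math.exp(x) / math.sqrt(math.pi / (2.0 * x))

def small_x_ratio(r, x):
    """K_r(x) divided by (1/2)(r-1)!(2/x)^r, which tends to 1 as x -> 0."""
    return bessel_k(r, x) / (0.5 * math.factorial(r - 1) * (2.0 / x) ** r)

print(bessel_k(2, 1.0))          # tabulated K_2(1) is about 1.62484
print(large_x_ratio(2, 200.0))   # close to 1
print(small_x_ratio(2, 0.01))    # close to 1
```

At x = 200 the large-x ratio still differs from 1 by about one percent, a reminder that (6.7.3) is only the leading term of an asymptotic series in 1/x.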

In order to show that these are the extreme temperature limits, we must ask ourselves: What does a modified Bessel function have to do with relativistic statistical physics? As we have mentioned in the introduction to this section, the kinetic energy is $E = mc^2(\cosh\theta - 1)$, and the momentum is $G = mc\sinh\theta$, so that the density of states for r = 2 is

$$dN(G) = 2\,\frac{4\pi V}{h^3}G^2\,dG = \frac{8\pi V}{\lambda_c^3}\sinh^2\theta\cosh\theta\,d\theta, \tag{6.7.5}$$

where $\lambda_c = h/mc$ is the Compton wavelength.

The cube of the Compton wavelength is proportional to the smallest volume of phase space in which a particle can be localized. So (6.7.5) is the ratio of the actual phase volume to that of the smallest possible phase volume. However, it is not as Chandrasekhar [58] claims that the factor of 2 in (6.7.5) is due to the fact that “the Dirac equation has two (or no) linearly independent solutions according as [the relativistic conservation of energy] is satisfied (or not).” Planck got the same factor in his expression for the density of states, and he never heard of the Dirac equation. The factor is due to the two independent directions of polarization, and it comes out of classical electrodynamics without any recourse to quantum mechanics, or relativity. Maxwell’s equations contain all the necessary ingredients for characterizing the two states of polarization of a photon as we shall appreciate in Sec. 11.5.1. And any microscopic explanation of a macroscopic phenomenon does not constitute a new phenomenon, as Einstein maintained in his letter to Seelig which we cited in Sec. 1.1.1.3.

We can thus calculate the total number of particles,

$$N = \frac{8\pi V}{\lambda_c^3}e^{x}\int_0^\infty e^{-x\cosh\theta}\sinh^2\theta\cosh\theta\,d\theta, \tag{6.7.6}$$

and the kinetic energy,

$$K(\theta) = \frac{8\pi V}{\lambda_c^3}mc^2e^{x}\int_0^\infty e^{-x\cosh\theta}\sinh^2\theta\cosh\theta\,(\cosh\theta - 1)\,d\theta, \tag{6.7.7}$$

as integrals over all values of θ, where $x = mc^2/T$ is the ‘modulus,’ in Gibbs’s terminology, in energy units where Boltzmann’s constant is unity. Both the total number of particles, (6.7.6), and the average kinetic energy, (6.7.7), are Lorentz-invariant since they do not depend on γ. This will not be true of the total energy (6.3.3).

In the case where the number of particles is conserved, the chemical potential is a function of temperature [Lavenda 91]. However, we have set the chemical potential equal to zero since we will deal only with ratios of thermodynamic variates. In the case of non-conservation, the particle number (6.7.6) will be a function of the temperature.

Rather than relying on a specific ensemble for the expression of the pressure, which in the case of a non-constant particle number would be the grand canonical ensemble, we can apply a homogeneity argument and assume that the energy is a homogeneous function of order 3s in the momenta, where s is the Grüneisen parameter that we have introduced earlier in (6.5.6). That is, we can write the virial as

$$PV = \frac{8\pi V}{3h^3}e^{x}\int_0^\infty e^{-x\cosh\theta}\,G\frac{\partial E}{\partial G}\,G^2\,dG = \frac{8\pi V mc^2}{3\lambda_c^3}e^{x}\int_0^\infty e^{-x\cosh\theta}\sinh^4\theta\,d\theta = sE. \tag{6.7.8}$$

The Grüneisen parameter is related to the order of homogeneity of the energy with respect to the momentum. In a d-dimensional momentum space, the energy is a homogeneous function of order d · s with respect to the momentum.

By an integration by parts in the last integral in (6.7.8), we can write the virial as the ideal gas law:

$$PV = -\frac{8\pi V}{3\lambda_c^3}\frac{mc^2}{x}e^{x}\int_0^\infty \sinh^3\theta\,d e^{-x\cosh\theta} = T\,\frac{8\pi V}{\lambda_c^3}e^{x}\int_0^\infty e^{-x\cosh\theta}\sinh^2\theta\cosh\theta\,d\theta = NT. \tag{6.7.9}$$

Equation (6.7.9) has the appearance of Mariotte’s law, but appearances can be deceiving, especially when the particle number can be a function of the temperature! The validity of Mariotte’s law was proved explicitly by Jüttner [11], but it was implicit in Planck’s papers of 1907. It was, however, certainly not appreciated that N was not a conserved quantity, nor that what was being dealt with was not a material ideal gas. One would have to wait more than a decade until Einstein, prodded by Bose’s ‘misinterpretation’ of Boltzmann’s counting procedure, would realize that one was dealing with a degenerate gas, or one that does not conserve the particle number. Hence, the name ‘Bose–Einstein’ statistics.
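The integration by parts that turns the virial (6.7.8) into (6.7.9) can be confirmed numerically. In the sketch below the common prefactor $8\pi V e^{x}/\lambda_c^3$ is set to one and energies are measured in units of $mc^2$, so that $T = 1/x$; these unit choices and the function names are assumptions made for illustration.

```python
import math

def integrate(weight, x, steps=20000):
    """Simpson's rule for the integral of exp(-x cosh t) * weight(t) over t >= 0."""
    t_max = math.acosh(1.0 + 50.0 / x)   # the Boltzmann factor is negligible beyond this
    h = t_max / steps
    def f(t):
        return math.exp(-x * math.cosh(t)) * weight(t)
    total = f(0.0) + f(t_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0

def virial_and_nt(x):
    """Return (PV, NT) in units where mc^2 = 1 and the common prefactor is 1."""
    temperature = 1.0 / x                                           # T = mc^2 / x
    pv = integrate(lambda t: math.sinh(t) ** 4, x) / 3.0            # virial, (6.7.8)
    n = integrate(lambda t: math.sinh(t) ** 2 * math.cosh(t), x)    # particle number, (6.7.6)
    return pv, n * temperature

for x in (0.5, 2.0, 10.0):
    print(x, virial_and_nt(x))
```

The two entries of each pair coincide at every temperature tried, while both change with x. That is the point made in the text: the form PV = NT holds even though N itself depends on the temperature.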

In order to determine the average kinetic energy, we consider the average

$$\overline{\cosh\theta} = \frac{x^{r-1}}{(2r-3)!!}\int_0^\infty \frac{e^{-x\cosh\theta}}{K_r(x)}\sinh^{2(r-1)}\theta\left(1 + \sinh^2\theta\right)d\theta = \frac{K_{r-1}(x)}{K_r(x)} + \frac{2r-1}{x}. \tag{6.7.10}$$

In three dimensions r = 2, and the average kinetic energy is

$$K(\theta) = 3NT + Nmc^2\left(\frac{K_1(x)}{K_2(x)} - 1\right). \tag{6.7.11}$$

On the strength of Mariotte’s law (6.7.9), (6.7.11) can be written as

$$K(\theta) - 3PV = Nmc^2\left(\frac{K_1(x)}{K_2(x)} - 1\right) \tag{6.7.12a}$$

or

$$K(\theta)/PV = 3 + x\left(\frac{K_1(x)}{K_2(x)} - 1\right) = 1/s. \tag{6.7.12b}$$

The expression for the kinetic energy applies to a state at rest as well as a state in uniform motion; it is a Lorentz invariant. The uniform velocity u does not enter into the average (6.7.10). It is only when we form the total energy by multiplying (6.7.10) by cosh θ′ that we must change x to x′ in the average because it now applies to a state of uniform motion.

Since $K_1(x) < K_2(x)$ it follows from (6.7.12a) that

$$K(\theta) - 3PV \le 0. \tag{6.7.13}$$

We can verify this in the extreme cases. In the low temperature, nonrelativistic limit, $x \gg 1$, and

$$\frac{K_1(x)}{K_2(x)} \simeq \frac{1 + 3/8x}{1 + 15/8x} \simeq 1 - \frac{3}{2x}, \tag{6.7.14}$$

so that $s = \frac{2}{3}$ in (6.7.12b). In the high temperature limit, or the $x \ll 1$ limit,

$$\frac{K_1(x)}{K_2(x)} = \frac{x}{2},$$

and (6.7.12b) becomes

$$K(\theta)/PV - 3 = -x(1 - x/2), \tag{6.7.15}$$

with $s = \frac{1}{3}$ in the extreme ultrarelativistic limit, showing that inequality (6.7.13) still holds.
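To see how s interpolates between these two limits, (6.7.12b) can be evaluated directly. The sketch below computes $K_\nu(x)$ from the standard integral representation $\int_0^\infty e^{-x\cosh t}\cosh\nu t\,dt$ with Simpson's rule; the function names, the integration cut-off, and the sample values of x are illustrative assumptions.

```python
import math

def bessel_k(nu, x, steps=20000):
    """K_nu(x) as the integral of exp(-x cosh t) cosh(nu t) over t >= 0."""
    t_max = math.acosh(1.0 + 50.0 / x)   # integrand is negligible beyond this
    h = t_max / steps
    def f(t):
        return math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    total = f(0.0) + f(t_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0

def grueneisen(x):
    """Grueneisen parameter s from (6.7.12b): 1/s = 3 + x (K_1/K_2 - 1)."""
    ratio = bessel_k(1, x) / bessel_k(2, x)
    return 1.0 / (3.0 + x * (ratio - 1.0))

for x in (0.01, 0.1, 1.0, 10.0, 200.0):
    print(x, grueneisen(x))
```

The printed values rise from near 1/3 at small x (high temperature) to near 2/3 at large x (low temperature), staying between the two limits throughout.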

In summary, the average kinetic energy has the limiting expressions

$$K(\theta) = 3NT + Nmc^2\left(\frac{K_1(x)}{K_2(x)} - 1\right) \to \begin{cases}\frac{3}{2}NT, & x \gg 1,\\[2pt] 3NT, & x \ll 1,\end{cases}$$

in the nonrelativistic and relativistic limits, which coincide with the low and high temperature limits, respectively.

Introducing the kinetic expressions for the average of the random kinetic energy, (6.3.4), and the pressure (6.5.1) in d = 3 dimensions into (6.7.12a) gives

$$K(\theta)/V - 3P = nmc^2\left(\frac{1}{\sqrt{1 - w^2/c^2}} - 1\right) - 3\,\frac{nm}{3}\frac{w^2}{\sqrt{1 - w^2/c^2}} = nmc^2\left(\sqrt{1 - \frac{w^2}{c^2}} - 1\right) = nmc^2\left(\frac{K_1(x)}{K_2(x)} - 1\right) \le 0, \tag{6.7.16}$$

where n is again the number density. In the nonrelativistic limit, (6.7.14) shows that the ratio $K_1(x)/K_2(x)$ in (6.7.16) tends to 1, while in the ultrarelativistic limit, (6.7.15) shows that it tends to 0 as it should.

In Classical Theory of Fields, Landau and Lifshitz [75] claim inequality (6.7.13) should be written as

$$\varepsilon - 3P = nmc^2\sqrt{1 - u^2/c^2} \ge 0, \tag{6.7.17}$$

which is the trace $T^i_i$ of their energy–momentum tensor. This they claim is the result of the kinetic expressions for the energy density and pressure,

$$\varepsilon = \frac{nmc^2}{\sqrt{1 - u^2/c^2}}, \qquad P = \frac{nm}{3}\frac{u^2}{\sqrt{1 - u^2/c^2}}.$$

If this is true then there should appear an average on the right-hand side of (6.7.17). But these averages, we learn, are over a “certain time interval,” and are not statistical averages.

If we evaluate Mariotte’s law (6.7.9) in the prime system we will have PV′ = NT′, since P and N are relativistic invariants. That is, Mariotte’s law guarantees, and foresaw, the correct relativistic transform of the variables entering into it. Then, in view of the definition of the total energy, (6.3.3), we multiply (6.7.10) through by $nmc^2\cosh\theta'$ to get [cf. second equality in (6.5.1)],

$$\varepsilon - 3nT'\gamma = \varepsilon - 3P = nmc^2\gamma\,\frac{K_1(x)}{K_2(x)} \ge 0, \tag{6.7.18}$$

where we have used (6.3.24). Inequality (6.7.18) is not inequality (6.7.17). The pressure is a Lorentz-invariant, and, thus, can only be compared with another Lorentz-invariant. The total energy, (6.3.3), is not Lorentz-invariant and cannot appear in the expression for the energy–momentum tensor, (6.7.17).

Consequently, if the energy–momentum tensor of relativistic mechanics has any meaning at all, it must be the average kinetic energy density, and not the total energy density, that appears in their expressions. Since the right-hand side of (6.7.18) contains the factor γ, the equation of state in the ultrarelativistic limit is not ε = 3P, as Landau and Lifshitz would have us believe, since it diverges as (6.7.18) clearly shows. The correct equation of state in the ultrarelativistic limit is (6.5.4a), with the average kinetic energy, and not the total energy, for it cannot depend upon γ in its definition.

A degenerate gas can either condense, where the pressure is a sole function of the temperature, independent of the volume, or have a repulsive zero-point energy [Einbinder 48b]. In the latter case, the average kinetic energy, which is the thermodynamic internal energy, is a sole function of the volume, $K(\theta) = CV^{-s}$. Thermodynamically, the pressure is defined as

$$P = -\frac{d}{dV}K(\theta) = sCV^{-(1+s)} = sK(\theta)/V,$$

which is the Grüneisen equation of state, (6.5.5). Then on the strength of (6.7.13),

$$3PV - K(\theta) = (3s - 1)K(\theta) \ge 0,$$

requires $s \ge \frac{1}{3}$.

6.7.2 Asymptotic probability densities

We can form a probability distribution function from the modified Bessel functions as

$$f(\theta|x)\,d\theta = \frac{x^{r-1}}{(2r-3)!!}\,\frac{e^{-x(\cosh\theta - 1)}}{e^{x}K_r(x)}\sinh^{2(r-1)}\theta\cosh\theta\,d\theta, \tag{6.7.19}$$

where f is the probability density function, normalized to unity by virtue of (6.7.2). In this subsection we will see how exact, and well-known, probability density functions arise in the asymptotic limits of large and small x.

In the large x limit,

$$f(u|x) = \sqrt{\frac{2}{\pi}}\left(\frac{mc^2}{T}\right)^{3/2} e^{-x\left(1/\sqrt{1-u^2/c^2}\,-1\right)}\,\frac{u^2/c^3}{(1 - u^2/c^2)^2}, \tag{6.7.20}$$

where the contribution of the Jacobian, $1/c(1 - u^2/c^2)$, is included in the last term in (6.7.20). In the limit $u \ll c$, (6.7.20) becomes

$$f(u|T) = \sqrt{\frac{2}{\pi}}\left(\frac{m}{T}\right)^{3/2} e^{-mu^2/2T}\,u^2, \tag{6.7.21}$$

which is exactly Maxwell’s speed distribution. The speed of light has completely disappeared in (6.7.21), implying that it is not valid in the relativistic region, for which we must return to (6.7.20).
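A short numerical comparison shows where (6.7.21) stops being a good stand-in for (6.7.20). The sketch works in units with c = 1, so that u is the speed as a fraction of c and $m/T = x$. The prefactor $\sqrt{2/\pi}$ is taken to be the one that normalizes Maxwell's speed distribution; that value, the function names, and the sample parameters are assumptions made for illustration.

```python
import math

def relativistic_density(u, x):
    """Density (6.7.20) with c = 1; x = mc^2/T and 0 <= u < 1."""
    gamma = 1.0 / math.sqrt(1.0 - u * u)
    prefactor = math.sqrt(2.0 / math.pi) * x ** 1.5
    return prefactor * math.exp(-x * (gamma - 1.0)) * u * u / (1.0 - u * u) ** 2

def maxwell_density(u, x):
    """Maxwell speed distribution (6.7.21) with c = 1, so that m/T = x."""
    prefactor = math.sqrt(2.0 / math.pi) * x ** 1.5
    return prefactor * math.exp(-0.5 * x * u * u) * u * u

# Cold gas: typical speeds are far below c, and the two densities agree.
print(relativistic_density(0.01, 1.0e4) / maxwell_density(0.01, 1.0e4))
# Hot gas: at u = c/2 the relativistic corrections are no longer small.
print(relativistic_density(0.5, 10.0) / maxwell_density(0.5, 10.0))
```

The first ratio is within a fraction of a percent of 1, while the second departs from 1 by tens of percent.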

In the opposite limit of small x, we have the vibrancy condition given by the transverse Doppler effect,

$$\omega/\omega_0 = \cosh\theta, \tag{6.7.22}$$

where the proper frequency of the source $\omega_0$ must be given. No matter what it is, it appears from (6.7.22) to be the lower cut-off to the angular velocity, ω. In special relativity the frequency, like the total energy, is increased by the motion.

Introducing the vibrancy condition, (6.7.22), into (6.7.19) and taking note of (6.7.4) result in

$$f(\omega|\omega_0, x) = \frac{1}{2}x^3 e^{-x\omega/\omega_0}\,\frac{\omega}{\omega_0^2}\sqrt{\frac{\omega^2}{\omega_0^2} - 1}, \tag{6.7.23}$$

for r = 2, where the Jacobian, $1/\omega_0\sqrt{\omega^2/\omega_0^2 - 1}$, has been included. If we set $\omega_0 = mc^2/\hbar$, we come out with, in the limit $\omega \gg \omega_0$, the well-known law of ultraviolet blackbody radiation

$$f(\omega|\omega_0, T) = \frac{1}{2}\left(\frac{\hbar}{T}\right)^3 e^{-\hbar\omega/T}\omega^2 = \frac{V}{N}\,e^{-\hbar\omega/T}\,\frac{\omega^2}{c^3}, \tag{6.7.24}$$

where the expression for the total, non-conserved, particle number, $N = 2V(T/\hbar c)^3$, has been introduced into the second line. Wien proposed his distribution, (6.7.24), on an apparent analogy with Maxwell’s speed distribution for monochromatic radiation.

We can also use the vibrancy condition [Wilkins & Willams 01] of the longitudinal Doppler effect,

$$\theta = \ln\left(\omega/\omega_0\right). \tag{6.7.25}$$

It is quite remarkable that (6.7.25) gives the representation of the modified Bessel function as:

$$\int_0^\infty z^{\nu-1}\exp\left[-\tfrac{1}{2}x\,(z + 1/z)\right]dz = 2K_\nu(x). \tag{6.7.26}$$
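The representation (6.7.26) can be checked by making the substitution $z = e^{\theta}$ suggested by (6.7.25), which turns the left-hand side into the integral of $e^{\nu\theta - x\cosh\theta}$ over all real θ. The sketch below evaluates that integral with Simpson's rule and compares it with twice the tabulated values of $K_\nu$; the function name, the cut-off, and the step count are illustrative assumptions.

```python
import math

def doppler_integral(nu, x, steps=40000):
    """Left-hand side of (6.7.26) after substituting z = exp(t)."""
    t_max = math.acosh(1.0 + 50.0 / x)     # integrand is negligible outside [-t_max, t_max]
    h = 2.0 * t_max / steps
    def f(t):
        # z^(nu-1) dz = exp(nu t) dt and (z + 1/z)/2 = cosh t
        return math.exp(nu * t - x * math.cosh(t))
    total = f(-t_max) + f(t_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(-t_max + i * h)
    return total * h / 3.0

# Tabulated values: K_1(1) = 0.6019072302, K_2(1) = 1.6248388986, K_0(2) = 0.1138938727
print(doppler_integral(1, 1.0), 2 * 0.6019072302)
print(doppler_integral(2, 1.0), 2 * 1.6248388986)
print(doppler_integral(0, 2.0), 2 * 0.1138938727)
```

Because the integrand is symmetric under θ → −θ apart from the factor $e^{\nu\theta}$, the result is twice the half-line integral of $e^{-x\cosh\theta}\cosh\nu\theta$, which is where the factor 2 in (6.7.26) comes from.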

From (6.7.26) it is apparent that $\exp[\frac{1}{2}x(z + 1/z)]$ is the generating function of the modified Bessel function.

In one dimension, the density of states, cosh θ, and the longitudinal Doppler effect, (6.7.25), give the probability density function,

$$f_1(\omega|x) = \frac{\omega/\omega_0 + \omega_0/\omega}{2\omega K_1(x)}\,e^{-(x/2)(\omega/\omega_0 + \omega_0/\omega)}, \tag{6.7.27}$$

and the moment equation,

$$\frac{1}{2}\,\overline{\left(\omega/\omega_0 + \omega_0/\omega\right)} = -\frac{d\ln K_1(x)}{dx}.$$

However, there is no maximum likelihood estimate, which estimates intensive thermodynamic parameters in terms of samples of extensive thermodynamic variables [Lavenda 91], because there is an insufficient number of states.

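Things are different in three dimensions, with a density of states, $\sinh^2\theta\cosh\theta$, and the longitudinal Doppler effect giving a probability density function,

$$f_3(\omega|\omega_0, x) = \frac{1}{8}\,\frac{x}{\omega K_2(x)e^{x}}\left\{\left(\frac{\omega}{\omega_0}\right)^3 + \left(\frac{\omega_0}{\omega}\right)^3 - \frac{\omega}{\omega_0} - \frac{\omega_0}{\omega}\right\}e^{-\frac{1}{2}x(\omega/\omega_0 + \omega_0/\omega - 2)}, \tag{6.7.28}$$

where we multiplied and divided by $e^{x}$, and the factor 1/ω is the Jacobian of (6.7.25). In the large x-limit (6.7.28) becomes

$$f_3(\Delta\omega|\omega_0, x) = \sqrt{\frac{2x}{\pi}}\,\frac{x}{\omega}\left\{\frac{1}{8}\frac{(\Delta\omega)^6}{\omega^3\omega_0^3} + \frac{3}{4}\frac{(\Delta\omega)^4}{\omega^2\omega_0^2} + \frac{(\Delta\omega)^2}{\omega\omega_0}\right\}e^{-\frac{1}{2}x(\Delta\omega)^2/\omega\omega_0},$$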

where $\Delta\omega = \omega - \omega_0$ is the frequency shift. Now, in the limit of small shifts this becomes the probability density function,

$$f_3(\Delta\omega|\omega_0, x) = \sqrt{\frac{2}{\pi}}\left(\frac{m\lambda^2}{T}\right)^{3/2}(\Delta\omega)^2\,e^{-m(\lambda\Delta\omega)^2/2T}, \tag{6.7.29}$$

where $\lambda = c/\sqrt{\omega\omega_0} \approx c/\omega_0$ is the wavelength corresponding to maximum intensity of the line (center) at $\omega \sim \omega_0$. The probability density function (6.7.29) is again the three-dimensional Maxwell speed distribution with the random speed $w = \lambda\Delta\omega$. The exponential in (6.7.29) was derived by Lord Rayleigh [89] for the distribution of intensities caused by the Doppler shift, where m is the mass of the emitter.

By way of contrast, in the small x-limit, and for high frequencies, the probability density function (6.7.28) becomes

$$f(\omega|\omega_0, x) = \frac{1}{2}\left(\frac{x}{2}\right)^3\frac{\omega^2}{\omega_0^3}\,e^{-(x/2)(\omega/\omega_0)}. \tag{6.7.30}$$

For a lower cut-off half as great as in the case where the vibrancy condition was given by the transverse Doppler effect (6.7.22), viz. $\omega_0 = mc^2/2\hbar$, the probability density function (6.7.30) becomes precisely that of the Wien distribution, (6.7.24).

Finally, we can use the probability density function (6.7.19) to corroborate, and quantify, our results on the thermal volume, (6.5.7). To this end, we average the thermal contraction factor, $\sqrt{1 - w^2/c^2}$, where $w = c\tanh\theta$ is the stochastic rapidity, with respect to the modified Bessel pdf (6.7.19). We then obtain for r = 2:

$$\overline{\sqrt{1 - w^2/c^2}} = \frac{x}{K_2(x)}\int_0^\infty e^{-x\cosh\theta}\sinh^2\theta\,d\theta.$$

If we use the double angle formula, $\sinh^2\theta = \frac{1}{2}(\cosh 2\theta - 1)$, then we can use the representation of the Bessel function as:

$$\int_0^\infty e^{-x\cosh\theta}\cosh\nu\theta\,d\theta = K_\nu(x).$$

We then find

$$\overline{\sqrt{1 - w^2/c^2}} = \frac{x}{2}\left(\frac{K_2(x) - K_0(x)}{K_2(x)}\right),$$

which is none other than (6.7.16) when the recursion relation,

$$K_{n+1}(x) = K_{n-1}(x) + \frac{2n}{x}K_n(x),$$

with n = 1 is introduced. The limits can be displayed as

$$\overline{\sqrt{1 - w^2/c^2}} = \frac{K_1(x)}{K_2(x)} = \begin{cases}1, & x \gg 1,\\[2pt] mc^2/2T, & x \ll 1,\end{cases}$$

where the lower limit goes to zero in the ultrarelativistic limit.

We can now estimate the error that was committed by exchanging the average of a function for the function of the average. If we do this in (6.5.7) then

$$\sqrt{1 - w^2/c^2} = \frac{Nmc^2}{Nmc^2 + K(\theta)} = \begin{cases}1, & x \gg 1,\\[2pt] Nmc^2/K(\theta), & x \ll 1.\end{cases}$$
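The size of that error can be made concrete by evaluating both expressions. The sketch below compares the averaged contraction factor $K_1(x)/K_2(x)$ with the "function of the average", $Nmc^2/(Nmc^2 + K(\theta))$, using (6.7.11) for the kinetic energy. The Bessel functions are computed from the integral representation $\int_0^\infty e^{-x\cosh t}\cosh\nu t\,dt$; the function names and sample values of x are illustrative assumptions.

```python
import math

def bessel_k(nu, x, steps=20000):
    """K_nu(x) as the integral of exp(-x cosh t) cosh(nu t) over t >= 0."""
    t_max = math.acosh(1.0 + 50.0 / x)   # integrand is negligible beyond this
    h = t_max / steps
    def f(t):
        return math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    total = f(0.0) + f(t_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0

def contraction_average(x):
    """Average of sqrt(1 - w^2/c^2) over the pdf (6.7.19): K_1/K_2."""
    return bessel_k(1, x) / bessel_k(2, x)

def contraction_of_average(x):
    """N mc^2 / (N mc^2 + K), with K/(N mc^2) = 3/x + K_1/K_2 - 1 from (6.7.11)."""
    return 1.0 / (3.0 / x + contraction_average(x))

for x in (0.01, 1.0, 200.0):
    print(x, contraction_average(x), contraction_of_average(x))
```

At large x the two agree closely and both approach 1. At small x they behave as x/2 and x/3 respectively, so the shortcut changes the numerical factor but not the proportionality to 1/T.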

In the small x (ultrarelativistic) limit, (6.7.11) gives $K(\theta) = 3NT(1 - mc^2/3T)$ whereas (6.7.4) would give 2NT. Although off by a numerical factor, in the small x-limit, the conclusion remains the same: The thermal volume contraction increases inversely to the absolute temperature. In the nonrelativistic limit, there is no thermal volume contraction,

$$V(\theta) = V\gamma^{-1}, \qquad x \gg 1,$$

while in the ultrarelativistic limit,

$$V(\theta) = \frac{1}{2}V\gamma^{-1}x, \qquad x \ll 1,$$

the volume tends to zero with x. In the relativistic regime, the temperature and averaged volume are no longer independent variables since the average thermal contraction factor depends on the temperature.

References

[Abraham 20] M. Abraham, Theorie der Elektrizität, Vol. 2 (Teubner, Leipzig, 1920), p. 347.

[Becker 33] R. Becker, Theorie der Elektrizität, Vol. 2 (B. G. Teubner, Leipzig, 1933), p. 348.

[Callen & Horowitz 71] H. B. Callen and G. Horowitz, “Relativistic thermodynamics,” Am. J. Phys. 39 (1971) 938–947.

[Chandrasekhar 58] S. Chandrasekhar, An Introduction to the Study of Stellar Structure (Dover, New York, 1958), pp. 394–397.

[Clausius 70] R. Clausius, “On a mechanical law applicable to heat,” Poggendorffs Ann 141 (1870) 124–130.

[Efimov 80] N. V. Efimov, Higher Geometry (Mir, Moscow, 1980), p. 490.

[Einbinder 48a] H. Einbinder, “Generalized virial theorems,” Phys. Rev. 74 (1948) 803–805.

[Einbinder 48b] H. Einbinder, “Quantum statistics and the ℵ theorem,” Phys. Rev. 74 (1948) 805–808.

[Einstein 07] A. Einstein, “Relativitätsprinzip und die aus demselben gezogenen Folgerungen,” Jahrbuch der Radioaktivität und Elektronik 4 (1907) 411–462; 5 98–99 (Berichtigung).

[Fock 66] V. Fock, The Theory of Space Time and Gravitation, 2nd ed. (Pergamon Press, Oxford, 1966), p. 50.

[Hasenöhrl 04] F. Hasenöhrl, “Zur theorie der Strahlung in bewegten Körpen,” Ann. Phys. 15 (1904) 344–370.

[Hasenöhrl 05] F. Hasenöhrl, “Zur theorie der Strahlung in bewegten Körpen, Berichtigung,” Ann. Phys. 4 (1905) 4, 16.

[Ives 44] H. E. Ives, “Impact of a wave packet on an absorbing particle,” J. Opt. Soc. Am. 34 (1944) 222–228.

[Jüttner 11] F. Jüttner, “Das Maxwellsche Gesetz der Geschwindigkeitsverteilung in der Relativtheorie,” Ann. d. Physik 34 (1911) 856–882; “Die Dynamik eines bewegten Gases in der Relativtheorie,” ibid, 35 (1911) 145–161.

[Landau & Lifshitz 75] L. D. Landau and E. M. Lifshitz, The Classical Theory of Fields (Pergamon, Oxford, 1975), Sec. 34.

[Laue 19] M. von Laue, Die Relativitätstheorie (Vierweg, Braunschweig, 1919).

[Lavenda 91] B. H. Lavenda, Statistical Physics: A Probabilistic Approach (Wiley-Interscience, New York, 1991).

[Lavenda 95] B. H. Lavenda, Thermodynamics of Extremes (Horwood, Chichester, 1995), p. 39.

[Lavenda 00] B. H. Lavenda, “Special relativity via modified Bessel functions,” Z. Naturforsch. 55a (2000) 745–753.

[Lavenda 02] B. H. Lavenda, “Does the inertia of a body depend on its heat content?,” Naturwissenschaften 89 (2002) 329–337.

[Mosengeil 07] K. V. Mosengeil, Dissertation, Berlin, 1906; “Theorie der stationären Strahlung,” Ann. der Phys. (Leipzig) 22 (1907) 867–906.

[Page & Adams 40] L. Page and N. I. Adams, Jr., Electrodynamics (Van Nostrand, New York, 1940), p. 267.

[Pauli 58] W. Pauli, Theory of Relativity (Pergamon Press, New York, 1958).

[Planck 07] M. Planck, “Zur Dynamik bewegter Systeme,” Berliner Sitzungsberichte, Erster Halbband 29 (1907) 542–570; Ann. der Phys. Lpz. 76 (1908) 1.

[Poincaré 00] H. Poincaré, “The theory of Lorentz and the principle of reaction,” Arch. Néderland. Sci. 5 (1900) 252–278.

[Poincaré 06] H. Poincaré, “Sur la dynamique de l’Électron,” Rend. Circ. Mat. Palermo 21 (1906) 129–176.

[Poynting 10] J. H. Poynting, The Pressure of Light (Soc. Promotion Christian Knowledge, London, 1910), p. 32.

[Robb 11] A. A. Robb, Optical Geometry of Motion (W. Heffer & Sons, Cambridge, 1911).

[Searle 96] G. F. C. Searle, “Problems in electric convection,” Phil. Trans. A 187 (1896) 675–713.

[Sommerfeld 09] A. Sommerfeld, “Über die Zusammensetzung der Geschwindigkeiten in der Relativtheorie,” Verh. der DPG 21 (1909) 577–582; “On the Composition of Velocities in the Theory of Relativity,” Wikisource translation at en.wikisource.org/wiki/Portal:Relativity.

[Steck & Roue 83] D. J. Steck and F. Roux, “An elementary development of mass–energy equivalence,” Am. J. Phys. 51 (1983) 461–462.

[Thomson 68] J. J. Thomson, Application of Dynamics to Physics and Chemistry (Dawsons, London, 1968).

[Watson 44] G. N. Watson, Bessel Functions, 2nd ed. (Cambridge U. P., Cambridge, 1944), p. 79.

[Wilkins & Willams 01] D. Wilkins and D. Williams, “From rapidity to vibrancy (logarithmic vibrancy),” Am. J. Phys. 69 (2001) 158.

[Yaghjian 92] A. D. Yaghjian, Relativistic Dynamics of a Charged Sphere: Updating the Lorentz–Abraham Model (Springer, Berlin, 1992).


Chapter 7

General Relativity in a Non-Euclidean Geometrical Setting

As an older friend I must advise against it. . . In the first place you won’t succeed; and even if you succeed, no one will believe you.
Planck’s advice to Einstein against trying to formulate a general theory of relativity.

7.1 Centrifugal versus Gravitational Forces

General relativity is based on the notion that gravity, rather than being a force acting between masses, is the curvature of space-time itself. The source of the (positive) curvature is mass itself, just like electric charge is the source of the electromagnetic field. And just as free particles follow straight lines in flat space-time, they follow geodesics in the curved space-time of a gravitational field.

Calculations of the gravitational redshift, the time-delay of radar echoes from planets, the bending of light near a massive object, and the geodesic effect all offer their support to general relativity. Not infrequently, new insights can be gained by looking at old, established results from a new point of view. The old quantum theory of the hydrogen atom, which combined a mixture of continuous and discrete conditions, was subsequently reinterpreted by wave mechanics. And wave mechanics was found to be able to go way beyond the hydrogen atom in explaining the atomic constitution of matter. It was the transition from particle to wave–particle duality that opened up our ability to explore the world of atoms.

So too the addition of the wave nature to the study of gravitation will allow us to explain the well-known relativistic effects of the time-delay in radar sounding, the deflection of light and the advance of the perihelion of Mercury. All this can be accomplished by widening our mechanical view by allowing for the occurrence of optico-gravitational phenomena. And we will not need the entire optical spectrum, for all the tests of general relativity fall within the short-wavelength limit.

No recourse to general relativity is necessary. And since these effects are static, no assumption need be made as to how gravitational interactions propagate, with the exception of the gravitational redshift that does not use the spatial components of the metric. The trajectory of a light ray will be determined in the same way as in an inhomogeneous refractive medium. Fermat’s principle of least time, in Sec. 2.2.3, relates the length and orientation of a light ray to the time for light to propagate along a path of the ray. The analogy between the index of refraction and the square root of twice the difference between the total and potential energies is also well-known from the quantum mechanical explanation of tunneling. It is precisely this separation of the metric and mechanical properties that are described by the potential energy that can be used to distinguish between the centrifugal and gravitational fields which cause acceleration. We shall return to this lack of equivalence between gravitational and centrifugal forces which cause acceleration in Chapter 9, but we already have seen it in action in Sec. 2.2.3 where Maxwell used this strategy in his classic treatment of inversion in elliptic space.

Taking a flat space metric in the plane and a constant index of refraction, we will demonstrate that Fermat’s principle is capable of yielding the phase of the oscillation of a Bessel function of the first kind in the periodic domain in the asymptotic short-wavelength limit. This is identical to the WKB result, and it allows us to associate a wave phenomenon with a geodesic trajectory. The only potential present is the repulsive centrifugal potential, and that keeps the trajectory open.

In contrast to the generalization of special relativity, where gravity is considered to be the curvature of space-time instead of a bona fide force, the centrifugal force is built into the phase of the Bessel function in the periodic domain where the trajectory consists of straight-line segments that are tangent to a caustic circle, whose radius is determined by the magnitude of the angular momentum, and the arc segment joining the points of tangency. Whereas all forces causing acceleration are on the same footing in general relativity, it is the gravitational force, and not the centrifugal force, that has to be introduced by allowing for a variable index of refraction, implying that the medium through which the light ray propagates is inhomogeneous. In other words, the effect of gravitation is to make the medium optically denser in the neighborhood of a massive body, while the centrifugal force has no effect upon the optical properties of the medium.

Centrifugal, Coriolis and gravitational forces are usually considered to be fictitious insofar as they can be eliminated by a change of frame. The centrifugal and Coriolis forces can be transformed away by transforming to a non-rotating frame, while the fictitious force of gravity is transformed away by transforming from a non-free falling frame to a free-falling one. We can easily appreciate that the force of gravity will affect the optical properties of the medium while the centrifugal force affects the geometrical properties by determining the radius of the caustic circle separating bright and shadow regions.

All the well-known results of general relativity can be analyzed from this perspective. The presence of a Newtonian potential will cause the modification of the phase of the Bessel function and determine whether the orbit is periodic (bright zone) or aperiodic (shadow zone) depending on whether the total energy is negative or positive, respectively.

The bending of light requires a coupling of the gravitational and centripetal potentials that is commonly referred to as Schwarzschild’s potential [cf. Fig. 7.6]. Here, it arises when we determine the extremum condition for Fermat’s principle. It shows that, like gravitational radiation, the interaction between a light ray and massive body is a quadrupole interaction, without any appeal being made to general relativity.

In contrast, we shall find that the advance of the perihelion requires both the gravitational potential and the quadrupole interaction. In the general theory, the quadrupole interaction appears as a relativistic correction to the square of the transverse velocity in the conservation of energy. Whereas Newton’s potential is the cause of the closed elliptical orbit, the quadrupole causes the perihelion to slowly rotate in a rosette orbit. A dipole moment would have been sufficient to cause the advance of the perihelion, but since there is conservation of momentum, the center of mass of the system cannot accelerate and so neither can the mass dipole moment.

7.2 Gravitational Effects on the Propagation of Light

7.2.1 From Doppler to gravitational shifts

That motion causes changes in the frequency and/or wavelength has been known since the middle of the nineteenth century when J. C. Doppler discovered it. If the relative speed of the waves is not the same for a moving observer as for an observer at rest, there is a frequency shift even though the wavelength remains the same. Not so for light waves, where the speed of light is the same for all observers, whether they be at rest or traveling nine-tenths the speed of light. The constancy of the speed of light imposes that both frequency and wavelength change such that their product, the speed of light, remains c. The distinction between the two is that where frequency and wavelength change independently the vibrations belong to that of the medium, but where they are negatively correlated the self-contained vibrations are of the electromagnetic fields, and not the media through which they propagate, for there may even be none.

In Sec. 2.5 we have described how Einstein [11] considers c as thegravitational potential, thereby negating his principle of the constancy ofthe speed of light. Instead of inertial frames, he considers one frame to beuniformly accelerated with respect to the other. A light signal of frequencyν0 is emitted from a source in a frame at rest. If light has traveled a distanceh at a speed c then it will have acquired a speed gh/c which is the productof the (constant) gravitational acceleration, g, and the time, h/c. Thus, thefirst-order Doppler shift,

\[ \nu = \nu_0\left(1 + \frac{u}{c}\right), \]

is converted into

\[ \nu = \nu_0\left(1 + \frac{gh}{c^2}\right). \]
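To get a feeling for the size of the first-order shift $gh/c^2$, the following minimal Python sketch evaluates it for a laboratory-scale height. The numerical values of $g$ and $c$, the function name `first_order_shift`, and the height of 22.5 m are assumptions chosen purely for illustration; they do not come from the text.

```python
# Illustrative constants (SI units); rounded values assumed for this sketch.
G_ACCEL = 9.81    # gravitational acceleration at the Earth's surface, m/s^2
C = 2.998e8       # speed of light in vacuo, m/s


def first_order_shift(height_m: float, g: float = G_ACCEL) -> float:
    """Return the fractional frequency shift (nu - nu0)/nu0 = g*h/c^2."""
    return g * height_m / C**2


if __name__ == "__main__":
    # A height of 22.5 m is a hypothetical laboratory-scale example.
    print(f"fractional shift over 22.5 m: {first_order_shift(22.5):.3e}")
```

The shift is a few parts in $10^{15}$ for heights of tens of metres, which is why the text turns to clocks in aircraft or satellites.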

Now, on the strength of the equivalence principle, acceleration is equivalent to a gravitational field, and $gh$ can be replaced by the gravitational potential, $\Phi$, to get

\[ \nu = \nu_0\left(1 + \frac{\Phi}{c^2}\right), \tag{7.2.1} \]


or even,

\[ c = c_0\left(1 + \frac{GM}{rc^2}\right), \tag{7.2.2} \]

where $c_0$ is the velocity of light in vacuo. Einstein has made the following reduction:

relative motion −→ uniform acceleration −→ static gravitational field.

Thus, the speed of light is like the speed of water waves: it is different for stationary and moving observers, and, even more, it depends on the position at which one is located in the gravitational field. From (7.2.2) Einstein deduces that a change in the velocity normal to a wavefront is proportional to the change in the gravitational potential. This makes $c$ a potential for gravitation, and the latter is responsible for the change in the rate at which a clock ticks. What do the experiments say?

To get an effect on the change of rate of a clock in the gravitational field, one needs to consider a clock on Earth and another in a satellite orbiting the Earth. Comparison of the two clocks will require ‘back-and-forth’ light signals, so in addition to the gravitational effect, (7.2.2), there will be the effect of time dilatation. A round-the-world trip was accepted as the second best substitute to satellite experiments, and in 1971 four atomic clocks were compared with a stationary observer after they had made round-the-world trips in eastward and westward directions.

The transported clocks feel a smaller gravitational attraction to the Earth than the grounded clock so that they will appear to go faster. The inertial system of the ground is that it moves with the Earth about the Sun but it does not partake in the Earth’s rotation. Because gravity and time dilatation supposedly interfere destructively on the eastward journey the traveling clock should be slowed down, while on the westward journey they complement one another so that there is a gain. The results were heralded as a great triumph for special relativity [Hafele & Keating 72].

Not only was the difference in time between the westward and eastward journeys of the same order as the difference between the individual clocks [Essen 78], but how does one go about decomposing time dilatation from gravitational effects? The latter certainly does not belong to the realm of special relativity.


7.2.2 Shapiro effect via Fermat’s principle

We will now calculate the increased travel time of light in a gravitational field, known as the Shapiro effect. We will not treat it from general relativity, nor by (7.2.2), but, rather, as an example of Fermat’s principle of least time.

Fermat’s principle asserts that the ray path connecting two arbitrary points makes the optical path length,

\[ I = c\tau = \int \eta\sqrt{2T}\, dt, \tag{7.2.3} \]

stationary, where, as before, $\eta$ is the index of refraction, $c$ the velocity of light, $\tau$ the propagation time, and $T$ the kinetic energy per unit mass.

As a first application of Fermat’s principle (7.2.3) we will determine the time-delay in radar sounding. We will later see, in Sec. 9.10.3, how general relativity accounts for this time-delay by evaluating the Schwarzschild metric on the null geodesic when all the angular dependencies are ignored.

If a light signal is sent from Earth, located on the $x$-axis at $-x_E$, to Venus say, which is located behind the Sun at $x_V$, as shown in Fig. 7.1, the light ray will be bent as it passes the gravitational field of the Sun. Clocks will thus be slowed down, and the time it takes the ray to bounce off the surface of Venus and return to Earth will be longer than if the Sun were not present.

The simplest mechanical analog of the index of refraction is $\eta = \sqrt{1 - 4\Phi/c^2}$, where $\Phi$ is the potential energy per unit mass. We will justify this choice shortly. Since the gravitational field of the Sun makes the medium optically denser, $\Phi$ can be identified as the gravitational potential, $-GM/r$, where $G$ is Newton’s gravitational constant, $M$ the mass of the Sun, and $r = \sqrt{R^2 + x^2}$ is the distance from the center of the Sun to Venus, with $R$ as the Sun’s radius.

Fig. 7.1. The set-up for the Shapiro effect.


Now, according to Fermat’s principle, (7.2.3), the propagation time $\tau$ along a ray connecting the two endpoints $-x_E$ and $x_V$ is given by

\[ \tau = \int_{-x_E}^{x_V} \frac{\eta(r)}{c}\, dx = \int_{-x_E}^{x_V} \frac{\sqrt{1 + 2\alpha/r}}{c}\, dx, \]

where $\alpha := 2GM/c^2$ is commonly referred to as the Schwarzschild radius. Every mass has its accompanying Schwarzschild radius: a human with a mass of $10^2$ kg and a radius of 1 m has a Schwarzschild radius of $10^{-25}$ m, while the Sun, with a mass of $2 \times 10^{30}$ kg and a radius of $7 \times 10^8$ m, has a Schwarzschild radius of $3 \times 10^3$ m.
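The two orders of magnitude quoted for the Schwarzschild radius can be checked with a few lines of Python. This is only a sketch: the rounded values of $G$ and $c$ and the function name `schwarzschild_radius` are assumptions introduced here, while the masses are those given in the text.

```python
# Illustrative constants (SI units); rounded values assumed for this sketch.
G = 6.674e-11     # Newton's gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8       # speed of light in vacuo, m/s


def schwarzschild_radius(mass_kg: float) -> float:
    """Return alpha = 2GM/c^2 in metres for a body of the given mass."""
    return 2.0 * G * mass_kg / C**2


if __name__ == "__main__":
    # Masses as quoted in the text: a human of 10^2 kg and the Sun of 2e30 kg.
    for name, mass in (("human", 1.0e2), ("Sun", 2.0e30)):
        print(f"{name}: alpha = {schwarzschild_radius(mass):.2e} m")
```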

The slowing down of clocks in a gravitational field will result in an apparent reduction in the speed of light. Light will therefore travel at the phase velocity $u(r) = c/\eta(r)$, rather than $c$ as it does in vacuo.

The gravitational potential enters through the index of refraction to modify the speed of light, and not through any putative connection with the Doppler effect.

Consequently, the travel time will be

\[ \tau + \Delta\tau \approx \tau_N + \frac{\alpha}{c}\int_{-x_E}^{x_V} \frac{dx}{r} = \tau_N + \frac{\alpha}{c}\int_{-x_E}^{x_V} \frac{dx}{\sqrt{R^2 + x^2}}, \]

where $\tau_N = (x_E + x_V)/c$ is the Newtonian travel time, and we have used the approximation $\sqrt{1 + x} \approx 1 + x/2$. The second term is half the time-delay for a signal to bounce off Venus and return to Earth.

Fermat’s principle thus predicts a time dilatation,

\[ 2\Delta\tau = \frac{2\alpha}{c}\ln\left(\frac{x_E + \sqrt{R^2 + x_E^2}}{-x_V + \sqrt{R^2 + x_V^2}}\right) \approx \frac{2\alpha}{c}\ln\left(\frac{4x_E x_V}{R^2}\right) = 2.4 \times 10^{-4}\ \text{s}, \]

where the square roots have been expanded to lowest order, $\sqrt{R^2 + x_E^2} \approx x_E$ and $\sqrt{R^2 + x_V^2} \approx x_V + R^2/2x_V$, since $R \ll x_V, x_E$. The factor 2 is due to the fact that the signal must make a round trip.
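The logarithmic expression for the round-trip delay and its lowest-order approximation can be compared numerically. The Python sketch below does this under stated assumptions: the solar mass and radius and the mean orbital radii of Earth and Venus are rounded reference values supplied here, not taken from the text, and the function names are hypothetical.

```python
import math

# Illustrative constants (SI units); rounded values assumed for this sketch.
G = 6.674e-11        # Newton's gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light in vacuo, m/s
M_SUN = 1.989e30     # solar mass, kg
R_SUN = 6.96e8       # solar radius, taken as the distance of closest approach, m
X_EARTH = 1.496e11   # mean Sun-Earth distance, m
X_VENUS = 1.082e11   # mean Sun-Venus distance, m

ALPHA = 2.0 * G * M_SUN / C**2   # Schwarzschild radius of the Sun, m


def round_trip_delay_exact(x_e: float, x_v: float, r: float) -> float:
    """2*Delta(tau) from the logarithmic expression, before any expansion."""
    numerator = x_e + math.sqrt(r * r + x_e * x_e)
    denominator = -x_v + math.sqrt(r * r + x_v * x_v)
    return 2.0 * ALPHA / C * math.log(numerator / denominator)


def round_trip_delay_approx(x_e: float, x_v: float, r: float) -> float:
    """2*Delta(tau) with the square roots expanded to lowest order in r/x."""
    return 2.0 * ALPHA / C * math.log(4.0 * x_e * x_v / (r * r))


if __name__ == "__main__":
    exact = round_trip_delay_exact(X_EARTH, X_VENUS, R_SUN)
    approx = round_trip_delay_approx(X_EARTH, X_VENUS, R_SUN)
    print(f"exact  : {exact:.3e} s")
    print(f"approx : {approx:.3e} s")
    print(f"apparent one-way extra distance: {0.5 * C * exact / 1e3:.1f} km")
```

With these inputs both expressions give a delay of a few times $10^{-4}$ s, of the same size as the figure quoted above; the precise value depends on the distances adopted for Earth and Venus.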


This is known as the Shapiro effect, and it has been calculated without general relativity. The increased travel time corresponds to an apparent increase in distance of 36 km from Venus to Earth. In a simplified demonstration [Sexl & Sexl 79], the time-delay of radar due to the presence of a massive body is given by $\tau = \int dx/c_{\text{eff}}$, where the effective velocity of light is given by Einstein’s expression (7.2.2), which supposedly accounts for both time dilatation and the shrinking of measuring rods in a gravitational field. The final expression for $\tau$ is valid to first-order in $\alpha/r$. Both these factors have been incorporated into the stretching of the line element by the index of refraction $\eta = \sqrt{1 + \alpha/r}$. Rather, if the effective velocity $c_{\text{eff}}$ were to be identified with the phase velocity, there would be only half of the effect which is within 3% of the experimental uncertainty.

7.3 Optico-gravitational Phenomena

The wave equation for a wave of definite angular frequency $\omega$ may be written from (3.8.4) as

\[ \nabla^2 A + \frac{\omega^2}{c^2}\eta^2 A = 0, \tag{7.3.1} \]

in terms of the vector potential intensity $A$, or what Maxwell referred to as the ‘electrokinetic momentum’ intensity. This is the equation of light in a medium of index of refraction

\[ \eta^2 = 2m\left[W - \Phi(r)\right]\frac{c^2}{\omega^2}, \tag{7.3.2} \]

where $W$ is the total energy and $\Phi$ is the potential energy, which is supposed to be a function only of the radial coordinate $r$. The index of refraction is usually positive, but, under certain circumstances like the total internal reflection of light, it can be imaginary. In this case there is an exponential penetration of light from a denser to a less dense medium.

To see what this implies, take the scalar product of the subsidiary condition (3.8.9) with $J$; we then obtain

\[ -\frac{J\cdot\nabla\rho}{J^2} = \frac{\eta^2}{c^2}. \]


Whereas in the normal case where the refractive index is real, the rate of change of the gravitation current is in the opposite direction to the density gradient, the case of an imaginary index of refraction would imply that the two are in the same direction.

Although (7.3.1) is what we have derived from the circuital equations, and the definition of the index of refraction, we will find that it is insufficient to account for relativistic gravitational phenomena. This is due to the fact that $\eta$ accounts for the potentials, like the gravitational potential, $-GM/r$, but it cannot account for the metric dependent terms. This will become clear from Fermat’s principle.

Expression (7.3.2) can be shown to agree with Cauchy’s expression for the index of refraction; in this case $\Phi$ represents the gravitational potential. In 1836 Cauchy proposed a simple formula for the variation of the index of optical glasses with wavelength. Cauchy’s formula depends only on two empirical constants, $C_1$ and $C_2$,

\[ \eta = C_1 + C_2\omega^2. \tag{7.3.3} \]

For gases in which $\eta$ differs little from unity, we can replace $2\eta$ by $\eta^2 + 1$, and write (7.3.3) as

\[ \eta^2 = 2C_1 - 1 + 2C_2\omega^2. \tag{7.3.4} \]
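The replacement of $2\eta$ by $\eta^2 + 1$ can be made explicit. The following worked step is only a sketch of the intermediate algebra; the small quantity $\delta = \eta - 1$ is a symbol introduced here for illustration and does not appear in the text.

```latex
\[
\eta^2 + 1 = (1 + \delta)^2 + 1 = 2(1 + \delta) + \delta^2 = 2\eta + \delta^2 \approx 2\eta ,
\]
\[
\eta^2 \approx 2\eta - 1 = 2\left(C_1 + C_2\omega^2\right) - 1 = 2C_1 - 1 + 2C_2\omega^2 .
\]
```

The neglected term is of second order in $\eta - 1$, which is why the step is restricted to gases, where the index differs little from unity.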

Now consider (7.3.2) in the case where the density $\rho = 3M/4\pi r^3$ is constant. We then obtain

\[ \eta^2 = -A + \frac{8\pi}{3}G\rho r^2, \tag{7.3.5} \]

where $A$ is an arbitrary constant. But $G\rho$ is proportional to the square of the frequency of free-fall, so that we can write (7.3.5) as

\[ \eta^2 = -A + B(\omega r)^2, \tag{7.3.6} \]

which is tantamount to Cauchy’s formula, (7.3.4), where $B$ is another arbitrary constant. Comparison of our original formula (7.3.2) with (7.3.6) embodies the equivalence relation,

\[ \frac{GM}{r} = \omega^2 r^2. \tag{7.3.7} \]

In the remainder of this section we will use natural units where G = c = 1.


In the Euclidean plane where $\vartheta = \pi/2$, the kinetic energy, expressed in polar coordinates, is

\[ 2T = \dot r^2 + r^2\dot\varphi^2, \]

where $\varphi$ is the azimuthal angle. However, we need not restrict ourselves to such a simple form of the kinetic energy. Rather, we can consider the expression

\[ 2T = E\dot r^2 + G\dot\varphi^2, \tag{7.3.8} \]

where $E$ and $G$ are the coefficients of the first fundamental form, which can be functions only of $r$.

Introducing (7.3.8) into (7.2.3) we get

\[ I = \int \eta\sqrt{E\dot r^2 + G\dot\varphi^2}\, dt = \int \eta\sqrt{E + G\varphi'^2}\, dr, \tag{7.3.9} \]

where the prime indicates the derivative with respect to $r$. The true ray path connecting any two arbitrary points will make (7.3.9) stationary. Observing that $\varphi$ is a cyclic coordinate, inasmuch as it is not present in the integrand while its derivative with respect to $r$, $\varphi'$, is, we know that a first integral to the motion exists. Calling the integrand $\mathcal{L}$, it is

\[ \frac{\partial\mathcal{L}}{\partial\varphi'} = \frac{\eta G\varphi'}{\sqrt{E + G\varphi'^2}} = \ell, \]

which is a constant, regardless of whether the medium is homogeneous or not. Recall that in an inhomogeneous medium, the refractive index will be a function of $r$. For the moment, we will assume, for simplicity, that it is a constant. The constant $\ell$ will be identified as the angular momentum in the natural units we are working in.

Solving for $\varphi'$, we obtain the equation of the orbit as

\[ \frac{d\varphi}{dr} = \frac{\dot\varphi}{\dot r} = \pm\frac{\ell\sqrt{E}}{\sqrt{G}\sqrt{\eta^2 G - \ell^2}}. \tag{7.3.10} \]

What we have derived in (7.3.10) is the famous Clairaut parametrization, which is an orthogonal parametrization in which both parameters of the first fundamental form, $E$ and $G$, are only functions of $r$. Solutions to (7.3.10) are geodesics of constant speed and zero geodesic curvature. They are referred to as ‘pre-geodesics,’ and the geometrical interpretation of the angular momentum, $\ell$, is the slant of the curve.
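The step from the first integral to the orbit equation (7.3.10) can be spelled out. What follows is only a sketch of the intermediate algebra, using the symbols already defined: $\eta$, $E$, $G$, $\varphi' = d\varphi/dr$, and the constant angular momentum $\ell$.

```latex
\[
\eta G\varphi' = \ell\sqrt{E + G\varphi'^2}
\;\Longrightarrow\;
\eta^2 G^2\varphi'^2 = \ell^2 E + \ell^2 G\varphi'^2
\;\Longrightarrow\;
\varphi'^2 = \frac{\ell^2 E}{G\left(\eta^2 G - \ell^2\right)} .
\]
```

Taking the square root gives both signs in (7.3.10). As a check, in the Euclidean case $E = 1$, $G = r^2$ with constant $\eta$, the orbit equation becomes $d\varphi/dr = \pm\ell/\bigl(r\sqrt{\eta^2 r^2 - \ell^2}\bigr)$, which integrates to the straight line $r\cos(\varphi - \varphi_0) = \ell/\eta$, where $\varphi_0$ is a constant of integration introduced here for illustration.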

The geodesic equation (7.3.10) contains all the information we need. It may be decomposed into two equations: the definition of the angular momentum,

\[ \ell = \frac{G}{\sqrt{E}}\dot\varphi, \tag{7.3.11} \]

and the radial equation,

\[ \dot r = \pm\frac{\sqrt{\eta^2 G - \ell^2}}{\sqrt{G}}. \tag{7.3.12} \]

In Euclidean space, where $G = r^2$ and $E = 1$, (7.3.11) reduces to

\[ \ell = r^2\dot\varphi, \tag{7.3.13} \]

which is Kepler’s law of equal areas in equal times, while (7.3.12) determines two families of curves in the $rt$-plane, the so-called characteristic curves.

Instead of (7.3.1) we now have

\[ \nabla^2 A + \left(\omega^2\eta^2 - \Omega\right)A = 0, \tag{7.3.14} \]

where $\Omega$ is the centrifugal energy per unit mass,

\[ 2\Omega = \frac{\ell^2}{r^2}. \tag{7.3.15} \]

Although we have derived (7.3.14) as a geodesic from Fermat’s principle, it would be instructive to look for its electromagnetic origin, which might shed some light on the appearance of the last term in (7.3.14).

Whereas (7.3.1) follows immediately from the circuital equations,

\[ \nabla\times E = -\eta\dot H, \qquad \nabla\times H = \eta\dot E, \tag{7.3.16} \]

simply by introducing $H = \nabla\times A$, and the subsidiary equations,

\[ E + \eta\dot A + \nabla\phi = 0, \qquad \nabla\cdot A + \eta\dot\phi = 0, \tag{7.3.17} \]

(7.3.14) needs an Ampère current,

\[ J = \Omega A. \tag{7.3.18} \]

This converts the reduced wave equation,

\[ \nabla(\nabla\cdot A) + \omega^2\eta^2 A = J, \]

into (7.3.14). Equation (7.3.14) requires the divergence of (7.3.18) to vanish, which is not guaranteed solely by the vanishing of the divergence of the vector potential since $\Omega$ is a function of $r$. An auxiliary condition is needed that requires $r\cdot A = 0$, or that the vector potential be normal to the direction of propagation, which is the case of a transverse wave. A current proportional to the vector potential is a hallmark of superconductivity where the coefficient of the vector potential is proportional to the mass. We will return to this in Sec. 11.5.2.

In his studies of the transmission of electromagnetic waves along cylindrical cables, Heaviside [94] came across (7.3.14) with the centrifugal energy (7.3.15) in which the electric and magnetic fields were zero- and first-order Bessel functions when there were no angular variations. This necessitated considering two components of the electric field vector, $E$ and $F$. Let us first consider the spherically symmetric case.

Let $z$ be the axis of the cable of radius $r_0$, and $r$ the distance from it. Then either $E$ or $H$ will be circular about this symmetry axis. In either case the electric field will have an additional, or radial, component $F$. If $H$ is circular, and $E$ longitudinal, the circuital equations are

\[ \frac{1}{r}\frac{\partial (rH)}{\partial r} = \varepsilon\dot E, \qquad -\frac{\partial H}{\partial z} = \varepsilon\dot F, \qquad \frac{\partial E}{\partial r} - \frac{\partial F}{\partial z} = \mu\dot H, \tag{7.3.19} \]

where $(1/r)(\partial/\partial r)r$ denotes the curl, which operates on a solenoidal field. In contrast, $\partial/\partial r$ is the gradient, and it operates on an irrotational field. Differentiating the last equation in time and substituting in the first two equations leads to a Bessel equation for $H$ whose solution is a Bessel function of order one,

\[ H'' + \frac{1}{r}H' + \left(s^2 - \frac{1}{r^2}\right)H = 0, \]

where the prime stands for differentiation with respect to $r$, and

\[ s^2 = \frac{\partial^2}{\partial z^2} - \eta^2\frac{\partial^2}{\partial t^2}, \]

while the solution of the equation for $E$ is a Bessel function of order zero.

Rather, if $H$ is longitudinal and $E$ circular, the circuital equations are

\[ \frac{\partial H}{\partial r} = \varepsilon\dot E, \qquad -\frac{\partial H}{\partial z} = \varepsilon\dot F, \qquad \frac{1}{r}\frac{\partial (rE)}{\partial r} - \frac{\partial F}{\partial z} = \mu\dot H. \]

Now, $H$ satisfies

\[ \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial H}{\partial r}\right) + \frac{\partial^2 H}{\partial z^2} = \eta^2\ddot H, \tag{7.3.20} \]

whose solution is a zero-order Bessel function, while $E$ is a first-order Bessel function.

Finally, if both $E$ and $H$ are longitudinal, we get

\[ \frac{\partial^2 H}{\partial r^2} + \frac{\partial^2 H}{\partial z^2} = \eta^2\frac{\partial^2 H}{\partial t^2}. \tag{7.3.21} \]

Furthermore, if $H$ has a periodic dependency on $z$, i.e. $e^{iz/\lambda}$, (7.3.21) becomes the Klein–Gordon equation when the wavelength $\lambda$ is identified as the Compton wavelength. We will have much more to say about (7.3.21) in Sec. 9.5, but it should be borne in mind, even at this stage, that the Klein–Gordon equation involves only irrotational fields. Hence, Schrödinger’s conclusion that

Plane waves have only two possible states of polarization, not three, as would be expected for a vector wave (e.g. an elastic wave; remember the historical dilemma concerning the ‘elastic properties of the aether’).

As mass is associated with the longitudinal (helicity zero) state (cf. Sec. 9.6), we see that it is compatible with irrotational fields alone and does not arise from some spontaneous symmetry breaking in which the disappearance of a transverse degree of freedom makes its appearance as a longitudinal mode of vibration of the electromagnetic field.

Heaviside noted that when there are no angular variations the only Bessel functions are $J_0$ and $J_1$. Since the generalization to include angular dependencies, leading to higher-order Bessel functions, is “so easily made that it would be inexcusable to overlook it,” we, along with Heaviside, consider it.

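Take $H$ longitudinal and the electric field with two components, one circular, $E$, and one radial component, $F$. The circuital equations are

\[ -\frac{\partial H}{\partial r} = \varepsilon\dot E, \qquad \frac{\partial H}{\partial\vartheta} = \varepsilon\dot F, \qquad \frac{1}{r}\left(\frac{\partial (rE)}{\partial r} - \frac{\partial F}{\partial\vartheta}\right) = -\mu\dot H, \tag{7.3.22} \]

at $z = \text{const}$. Assuming $H$ to be a periodic function of both angle and time, the resulting equation,

\[ \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial H}{\partial r}\right) + \left(\omega^2\eta^2 - \frac{m^2}{r^2}\right)H = 0, \tag{7.3.23} \]

is Bessel’s equation of order $m$.

By defining the analytic function $G = E + iF$, we can write the circuit equations (7.3.22) as

\[ \left(\frac{\partial}{\partial r} - i\frac{\partial}{\partial\vartheta}\right)H = -\varepsilon\frac{\partial G}{\partial t}, \qquad \frac{1}{r}\left(\frac{\partial}{\partial r}r + i\frac{\partial}{\partial\vartheta}\right)G = -\mu\frac{\partial H}{\partial t}. \tag{7.3.24} \]

Differentiating the second equation in time gives

\[ \frac{1}{r}\left(\frac{\partial}{\partial r}r + i\frac{\partial}{\partial\vartheta}\right)\left(\frac{\partial}{\partial r} - i\frac{\partial}{\partial\vartheta}\right)H = \eta^2\ddot H. \]

Since we have considered all fields to be real, the real and imaginary parts of this equation must be equated to zero separately. The real part gives the equation for the Bessel function of order $m$, (7.3.23), while the imaginary part, when equated to zero,

\[ -\frac{\partial E}{\partial\vartheta} = \frac{\partial (rF)}{\partial r}, \]

is one of the Cauchy–Riemann conditions for the existence of an analytic function. This just says that the mixed second derivatives of $G$ are equal.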

Ironically, Heaviside loathed complex variables, going so far as to write to Bromwich “I could never stomach your complex integral method.” Two years after Heaviside’s death, Jeffreys was to show “that many of Heaviside’s solutions could be obtained easily by workers without his amazing skill in manipulation, by using the theory of the complex variable.”

To understand the nature of the individual terms in (7.3.22) we may avail ourselves of Heaviside’s construction of the power equation, or what he referred to as ‘activity.’ Again, Heaviside missed out on the discovery of the energy flux by a matter of months, and seeing that Poynting used Maxwell’s original notation, it appears almost miraculous that he could have made the discovery. Heaviside’s derivation was just two pages!

Multiplying the first, second and third equations in (7.3.22) by $E$, $F$ and $H$, respectively, and combining them, we obtain

\[ \frac{1}{r}\frac{\partial (rEH)}{\partial r} - \frac{1}{r}\frac{\partial (FH)}{\partial\vartheta} = -\frac{1}{2}\frac{\partial}{\partial t}\left\{\varepsilon(E^2 + F^2) + \mu H^2\right\}. \]

The right-hand side is just the decrease in the total energy due to the flow of energy toward the outside of the cylinder, which is represented by the left-hand side. The total flow towards the outside, which is responsible for the subsequent decrease in energy, is obtained by multiplying both sides by $r\,dr\,d\vartheta$ and integrating from $r = 0$ to $r = r_0$ and from $\vartheta = 0$ to $\vartheta = 2\pi$. We then obtain

\[ 2\pi r_0 H(r_0)E(r_0) = -\int_0^{r_0}\int_0^{2\pi}\frac{d}{dt}\left\{\varepsilon(E^2 + F^2) + \mu H^2\right\} r\,d\vartheta\,dr, \tag{7.3.25} \]

since the term in the derivative with respect to $\vartheta$ averages out to zero. The first term in (7.3.25) is the Poynting flux directed radially outward.$^a$ It involves only the circuital component of the electric field, and vanishes when $r_0$ is a zero of the Bessel function.

By contrast, in the case of the Klein–Gordon equation, (7.3.21), we would get an unphysical source term,

\[ -2\pi\int_0^{r_0} EH\,dr, \]

from the necessity of performing an integration by parts. Hence, if the fields are entirely irrotational there will be no energy flux.

$^a$ This agrees with Heaviside except for a minus sign.

The case of spherical symmetry can be handled analogously, and in the presence of a gravitational potential it leads to the gravitational analog of the nonrelativistic hydrogen atom. Again following Heaviside, the simplest spherical waves are those for which the lines of $H$ are circles of equal latitude, centered on an axis from which the polar angle, $\vartheta$, is measured. Here $r$ is the distance from the origin, and the azimuthal angle will have no role in what follows. $H$ is circuital while $E$ will have two components: a circuital component, $E$, that coincides with a line of longitude, and a radial component $F$.

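Heaviside writes down directly the wave equation for a spherical harmonic. However, it is of interest to see its origin in the circuital equations. The electric field, $E$, has two components: a radial component $E$, and a latitudinal circular component $F$. The magnetic field has but one component and is longitudinally circular. The circuital laws are thus

\[ \mu\dot H = -\operatorname{curl}_\varphi E = -\frac{1}{r}\left(\frac{\partial (rF)}{\partial r} - \frac{\partial E}{\partial\vartheta}\right), \tag{EM} \]

\[ \varepsilon\dot F = \operatorname{curl}_\vartheta H = -\frac{1}{r}\frac{\partial (rH)}{\partial r}, \]

\[ \varepsilon\dot E = \operatorname{curl}_r H = \frac{1}{r\sin\vartheta}\frac{\partial (\sin\vartheta\, H)}{\partial\vartheta}. \]

The circuital equations [EM] can then be combined into a wave equation, or what Heaviside referred to as the ‘characteristic’ equation

\[ \varepsilon\mu\ddot H = \frac{1}{r}\frac{\partial^2 (rH)}{\partial r^2} + \frac{1}{r^2}\frac{\partial}{\partial\vartheta}\frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\, H\right). \tag{7.3.26} \]

The second term on the right-hand side of (7.3.26) just misses being the $\vartheta$ term in the Laplacian in spherical coordinates by the term $1/r^2\sin^2\vartheta$, viz.

\[ \frac{\partial}{\partial\vartheta}\frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\sin\vartheta = \frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\frac{\partial}{\partial\vartheta}\right) - \frac{1}{\sin^2\vartheta}. \]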

This term is analogous to the $m = \pm 1$ term for the magnetic quantum number of an electron. It represents the projection of the angular momentum on the preferred $z$-axis. For if we call

\[ \Lambda = \frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\sin\vartheta\frac{\partial}{\partial\vartheta} + \frac{1}{\sin^2\vartheta}\frac{\partial^2}{\partial\varphi^2}, \]

the ‘surface harmonic,’ $Y_\ell(\vartheta, \varphi)$, satisfies the differential equation

\[ \Lambda Y_\ell + \lambda Y_\ell = 0, \tag{7.3.27} \]

where $\lambda$ is an eigenvalue which we seek to determine.

Now $Y_\ell$ depends only on the angles, or equivalently the ratios $x/r$, $y/r$, and $z/r$, say to the power $\ell$. We can thus consider an $\ell$-th degree homogeneous function,

\[ \Psi_\ell = r^\ell Y_\ell, \]

where $\Psi_\ell$ satisfies Laplace’s equation, $\nabla^2\Psi_\ell = 0$, which in spherical coordinates reads

\[ \left(\frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r}\right)r^\ell Y_\ell + \frac{\Lambda}{r^2}r^\ell Y_\ell = r^{\ell-2}\left\{\Lambda + \ell(\ell + 1)\right\}Y_\ell = 0. \tag{7.3.28} \]

In view of (7.3.27) we find $\lambda = \ell(\ell + 1)$. There are $2\ell + 1$ values of $m$ for each value of $\ell$, and in this case, $\ell = 1$, there should be three values, $m = 0, \pm 1$. The $m = 0$ is missing.
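The statement that Heaviside's angular operator carries the eigenvalue $\ell(\ell + 1) = 2$ with $m = \pm 1$, while the usual polar Laplacian gives the same eigenvalue with $m = 0$, can be checked numerically. The Python sketch below applies both operators by central differences; the function names and the step size are assumptions made for illustration, and the test functions $\sin\vartheta$ and $\cos\vartheta$ are the standard $\ell = 1$ angular dependences.

```python
import math


def heaviside_operator(f, theta: float, h: float = 1e-4) -> float:
    """Apply d/dtheta[(1/sin theta) d/dtheta(sin theta * f)] by central differences."""
    def inner(t: float) -> float:
        # (1/sin t) * d/dt ( sin t * f(t) )
        def g(s: float) -> float:
            return math.sin(s) * f(s)
        return (g(t + h) - g(t - h)) / (2.0 * h) / math.sin(t)
    return (inner(theta + h) - inner(theta - h)) / (2.0 * h)


def polar_laplacian(f, theta: float, h: float = 1e-4) -> float:
    """Apply (1/sin theta) d/dtheta(sin theta * df/dtheta), the m = 0 angular operator."""
    def flux(t: float) -> float:
        # sin t * df/dt
        return math.sin(t) * (f(t + h) - f(t - h)) / (2.0 * h)
    return (flux(theta + h) - flux(theta + -h)) / (2.0 * h) / math.sin(theta)


if __name__ == "__main__":
    theta = 1.0  # an arbitrary polar angle away from the poles
    # sin(theta) is the l = 1, |m| = 1 dependence; cos(theta) is the l = 1, m = 0 one.
    ratio_heaviside = heaviside_operator(math.sin, theta) / math.sin(theta)
    ratio_polar = polar_laplacian(math.cos, theta) / math.cos(theta)
    print(f"Heaviside operator on sin: ratio = {ratio_heaviside:.6f}")
    print(f"polar Laplacian on cos  : ratio = {ratio_polar:.6f}")
```

Both ratios should come out close to $-\ell(\ell + 1) = -2$, illustrating that the two operators differ only by the $1/\sin^2\vartheta$ term that distinguishes $m = \pm 1$ from $m = 0$.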

We can find the $m = 0$ value if we consider the propagation of the electric field. To find this equation we take the time-derivative of the third equation in [EM], introduce the first equation, and use the second and third equations to express the circular component of the electric field in terms of the radial component. We then obtain

\[ \mu\varepsilon\ddot E = \frac{1}{r}\frac{\partial^2 (rE)}{\partial r^2} + \frac{1}{r^2\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\frac{\partial E}{\partial\vartheta}\right), \tag{7.3.29} \]

which has the correct form for the Laplacian in spherical coordinates with $m = 0$. In Sec. 11.5.2 we will show that (7.3.29) can support dispersion. There, we will also provide an interpretation for $\ell$ and $m$ in terms of the properties of photons.

Returning to our main theme, and assuming a periodic time dependence, (7.3.26) reduces to

\[ \left(\frac{d^2}{dr^2} - \frac{\ell(\ell + 1)}{r^2} + \eta^2\omega^2\right)rH = 0, \tag{7.3.30} \]

which is Schrödinger’s equation for the hydrogen atom. Moreover, since Coulomb’s law has the same form as Newton’s law, (7.3.30) can be solved in the exact same way by replacing $e^2$ with $GM$ in the expression for the index of refraction,

\[ \eta^2 = \frac{2m}{\omega^2}\left[W - \Phi(r)\right] = \frac{2m}{\omega^2}\left(W + \frac{GM}{r}\right). \]

The square of the momentum is written to show that it has been derived from the square of the curl, and thus polarization has already been included in the solution to Schrödinger’s equation without ever being realized. This is clear from the circuital equations [EM].

These equations describe the propagation of electromagnetic waves. For photons, the projection of the total angular momentum along the polar axis is replaced by the projection of its spin along the direction of motion. This is called helicity, and for a photon the helicity is $\pm 1$, but never zero. It is remarkable that this information is included in the classical circuital laws, as is made evident from (7.3.28). The absence of the helicity-0 state guarantees that a photon can never come to rest, i.e. it has zero rest mass.

Although we will return to the problem of introducing a longitudinal mode in Sec. 11.5.5, we briefly indicate how this can be done. If $G$ is a generalized displacement, the force acting on it will be made up of shear, compression, and rotation. The force per unit volume that arises from the stress of the medium which tends to oppose a finite elastic resistance to shear, compression and rotation is

\[ F = n\left[\nabla^2 G + \tfrac{1}{3}\nabla(\nabla\cdot G)\right] + \lambda\nabla(\nabla\cdot G) - \nu\nabla\times\nabla\times G, \tag{7.3.31} \]

where the elastic constants related to shear, compression, and rotation are $n$, $\lambda$, and $\nu$, respectively. The first term in (7.3.31) is the rigidity, and when there is frictional resistance, its time-derivative is the frictional resistance to distortion. We shall neglect such effects here and concentrate on the other two terms. The second term represents a uniform tension, and its negative represents the hydrostatic pressure. When the material is incompressible, $\lambda$ becomes infinite, and $\nabla\cdot G = 0$, but their product remains finite. The last term is the force that tends to rotate the body as a whole.

As we discussed in Sec. 3.8.1, there are times when it is more convenient to associate $H$ to the velocity than $E$. Yet, it was crucial to our derivation of the electromagnetic mass in Sec. 5.4.3 that we associate $H$ with the velocity. Such will prove to be the case here. So with $\dot G = H$, and

\[ \nu\nabla\times G = E, \]

we take the time-derivative of the latter and identify $\nu^{-1} = \varepsilon$, to get the second circuital equation

\[ \varepsilon\dot E = \nabla\times H. \]

Introducing these definitions into (7.3.31), and associating the permeability with the density, we get the equation of motion,

\[ \mu\dot H = -\frac{1}{\varepsilon}\nabla\times E + \lambda\nabla(\nabla\cdot p^{-1}H), \tag{7.3.32} \]

where we also set the rotational constant equal to the inverse of the dielectric constant, and $p^{-1}$ is Heaviside’s notation for the inverse of the time-derivative.

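With respect to the first equation in [EM], there is an additional term in (7.3.32) which normally should be zero because

\[ \nabla\cdot H = \frac{1}{r\sin\vartheta}\frac{\partial H}{\partial\varphi} = 0, \tag{7.3.33} \]

since $H$ consists of circles on a sphere of constant latitude and has nothing to do with $\varphi$, which measures longitude. It is also required that $H$ be divergenceless. Then, taking the time-derivative of (7.3.32), we get

\[ \varepsilon\mu\ddot H = \frac{1}{r}\frac{\partial^2 (rH)}{\partial r^2} + \frac{1}{r^2}\frac{\partial}{\partial\vartheta}\frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\, H\right) - \frac{1}{r^2\sin^2\vartheta}\frac{\partial^2 H}{\partial\varphi^2}. \tag{7.3.34} \]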

The two numbers in quantum mechanics that describe angular momentum are $\ell$ and $m$, the angular momentum and its projection on the $z$-axis, respectively. For photons as well as elementary particles it is more convenient to consider $\ell$ as the spin, and $m$ as the helicity. The photon is a spin $\ell = 1$ particle, and a spin-one particle has $2\ell + 1$ states with $m = -1, 0, 1$. But not for a photon, since a massless particle cannot have helicity $m = 0$. This is guaranteed by (7.3.33). But, if (7.3.33) does not hold, and $H$ is periodic in $\varphi$, i.e. $e^{im\varphi}$, then for $m = \pm 1$, the last term in (7.3.34) cancels the term in the second expression that prevents it from being the Laplacian. In that event (7.3.34) becomes

\[ \varepsilon\mu\ddot H = \frac{1}{r}\frac{\partial^2 (rH)}{\partial r^2} + \frac{1}{r^2\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\frac{\partial H}{\partial\vartheta}\right), \tag{7.3.35} \]

for the $m = 0$ longitudinal mode.

Now, the right-hand side is not the expression for the Laplacian in spherical coordinates, which for the $m = 0$ state would read

\[ \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right). \]

But, it is the free-particle spatial part of the Schrödinger equation, for the same $m = 0$ state, which uses the radial part of the Laplacian in cylindrical coordinates. This has been derived from the circuital equations [EM], and not from the operator $p_r = -i\hbar\,\partial/\partial r$ for the radial momentum. Moreover, the $m = 0$ mode corresponds to the spherical harmonic

\[ \ell(\ell + 1) = -\frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\sin\vartheta\frac{\partial}{\partial\vartheta}, \]

with eigenvalue $\ell = 1$, which is the value of its spin, while

\[ \ell(\ell + 1) = -\frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\sin\vartheta\frac{\partial}{\partial\vartheta} + \frac{1}{\sin^2\vartheta}, \]

corresponds to the same eigenvalue, but with $m = \pm 1$. It is only the former which is compatible with a space varying index of refraction.

By shifting the quantum numbers $\ell$ and $m$ from angular and azimuthal quantum numbers to spin and helicity, we have obtained the equations governing photons and the corresponding $m = 0$ longitudinal mode. Heaviside certainly could not have foreseen all this, but he should have noted that he did not come out with the correct spherical harmonic in terms of Legendre polynomials, but, rather, with their first derivatives. There is a paucity of equations in physics, owing to the frugality of nature that requires their reinterpretation under different physical circumstances.

Heaviside began his investigation by trying to obtain condensation waves from the rapid oscillations of plus and minus charges in a conductor. Obviously, since plane waves are incapable of longitudinal vibrations, he was led to consider the next in the order of simplicity: spherical waves. If electricity can be likened to a fluid, and if it were compressible, then it would naturally give rise to a condensation wave. This would make it unnecessary to consider an electric current as the motion through a space of something, since the vibrations of a lattice would do equally well. For a constant index of refraction, Heaviside did not find any condensation waves.

If (7.3.14) has anything to do with gravity, it will be the Helmholtz equation for the wave function that determines the gravitational field once the additional potential $\Omega$ and index of refraction are specified. This is in contrast to Poisson’s equation where the gravitational potential is determined by the mass density.

7.4 The Models

Our three basic models are:

(i) The Flat Model, where

E = 1, G = r2,

for which the wave equation is:

(d2

dr2 + ω2η2 − �2

r2

)u = 0, (7.4.1)

where, without loss of generality, we consider a reduced scalar waveequation. Equation (7.4.1) bears a remarkable resemblance to the nor-mal form of Bessel’s equation, where a periodic solution exists forr > �/ηω, and an exponentially damped solution for the reverse ofthe inequality.

Even more can be said when we introduce another equivalence principle

$$\frac{\ell^2}{r^2} = \omega^2 r^2, \tag{7.4.2}$$

which takes us from constant angular momentum, implying equal areas in equal times, to one of constant angular frequency. This has the effect of converting Bessel’s equation (7.4.1) into the differential equation for Hermite polynomials

$$\left(\frac{d^2}{dr^2} + \eta^2 - \omega^2 r^2\right)u = 0. \tag{7.4.3}$$

A periodic solution exists if $r$ stays within the limits $\pm\eta/\omega$.

(ii) The Beltrami, or Projective, Model, where

$$E = \frac{1}{(1 - r^2/R^2)^2}, \qquad G = \frac{r^2}{1 - r^2/R^2}, \tag{7.4.4}$$

with $R$ the absolute constant, or radius of curvature. The wave equation is

$$\left\{\frac{d^2}{dr^2} + \eta^2 - \frac{\ell^2}{r^2}\left(1 - \frac{r^2}{R^2}\right)\right\}u = 0. \tag{7.4.5}$$

If the radius of curvature is $R = 1/\sqrt{\rho}$ [i.e. $R = c/\sqrt{G\rho}$], and the total mass, $M$, is constant, rather than the mass density, $\rho$, (7.4.5) becomes

$$\left(\frac{d^2}{dr^2} + \eta^2 - \frac{\ell^2}{r^2}\left(1 - \frac{2M}{r}\right)\right)u = 0. \tag{7.4.6}$$

For a constant index of refraction, (7.4.6) will give the deflection of light about a massive body, $M$, while in the case where the index of refraction is given by (7.3.2), it will describe the advance of the perihelion.

(iii) The Stereographic Inner Product Model, where

$$E = \frac{1}{(1 - r^2/R^2)^2}, \qquad G = \frac{r^2}{(1 - r^2/R^2)^2},$$

for which the wave equation is

$$\left\{\frac{d^2}{dr^2} + \eta^2 - \frac{\ell^2}{r^2}\left(1 - \frac{r^2}{R^2}\right)^2\right\}u = 0. \tag{7.4.7}$$

However, since the Gaussian curvature is no longer constant, this will cause a modification of the conservation of the angular momentum from its Euclidean expression (7.3.13). It will now be given by

$$\ell = \frac{\dot\varphi r^2}{1 - r^2/R^2} = \frac{\dot\varphi r^2}{1 - 2M/r}, \tag{7.4.8}$$

where we transferred from one of constant density, $\rho$, to one of constant mass, $M$, since the absolute constant,

$$R = \sqrt{\frac{3}{8\pi\rho}}. \tag{7.4.9}$$

The absolute constant (7.4.9) is proportional to the free fall time, $\rho^{-1/2}$.

We did this so that (7.4.8) would correspond to Møller’s [52] expression for the angular momentum that he got in his treatment of the perihelion shift. His conclusion was that the right-hand side of (7.4.8) “cannot in general be interpreted as angular momentum, since the notion of a ‘radius vector’ occurring in the definition of angular momentum has an unambiguous meaning only in a Euclidean space.” This space should rather be a space of constant curvature. We will return to this discussion in Sec. 9.6.

Transferring our attention to the phase $S = -i\ln u + \text{const.}$, we get

$$S' = \pm\sqrt{\eta^2 - \Phi}, \tag{7.4.10}$$

in the optico-geometric limit where $\nabla^2 S \ll 1$. Equation (7.4.10) can be considered as the generalization of the Poisson equation of the short-wavelength theory of gravitation. The phase $S$ plays the role of an action, and is known as the eikonal in geometrical optics. It is completely determined by the gravitational potential, in the case that the index of refraction is varying, and by the generalized centrifugal potential $\Phi$. Gravitational effects will be reflected in deviations from the Flat Model for $S$, which is given by ($\eta = 1$)

$$\begin{aligned}
S_\pm(r,\varphi) &= \int \dot r\,dr = \pm\int \frac{\sqrt{r^2 - \ell^2}}{r}\,dr \\
&= \pm\left(\sqrt{r^2 - \ell^2} - \ell\cos^{-1}\frac{\ell}{r} + \ell\varphi\right) \\
&= \pm\ell\int \tan^2\varphi\,d\varphi = \pm\ell(\tan\varphi - \varphi).
\end{aligned} \tag{7.4.11}$$

The second line of (7.4.11) describes wavefronts characterized by the condition that $S_\pm = \text{const.}$, which are involutes of the circle $r = \ell$. Now, a caustic is an envelope of a family of rays. The wavefronts are normal to the rays PA and PB, and the caustic curve coincides with the envelope AB of this family of normals in Fig. 7.2. The wavefronts are the involute to the curve AB. The envelope of the normals to this curve is called the evolute. The rays corresponding to the eikonal $S_-$ are half-lines tangent to the circle $r = \ell$. The first term in $S_-$ represents the length PA, $-\sqrt{r^2 - \ell^2}$, while the second term is the arc length of AC. The negative sign in the former means that the direction of the ray is from P to A. An analogous explanation holds for $S_+$.

Fig. 7.2. Rays tangent to a circular caustic of radius $\ell$.

$\ell/2$ times $S_+$ in the last line of (7.4.11) represents the difference in areas of the triangle $\triangle AOB$, which is $\tfrac12\ell^2\tan\varphi$, and the area of the sector COB, $\tfrac12\ell^2\varphi$, shown in Fig. 7.3. Thus, $S_+ > 0$ on the strength of the trigonometric inequality $\tan\varphi - \varphi > 0$, and represents distance.
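As a numerical illustration of the last line of (7.4.11), the following minimal sketch compares a direct quadrature of $\int_\ell^r \sqrt{q^2-\ell^2}/q\,dq$ with the closed form $\ell(\tan\varphi - \varphi)$, taking $\cos\varphi = \ell/r$ and dropping the $\ell\varphi$ angular term. The function names and the midpoint rule are illustrative choices, not part of the text.

```python
import math

def eikonal_numeric(r, ell, n=20000):
    """Midpoint-rule estimate of the integral of sqrt(q^2 - ell^2)/q for q in [ell, r]."""
    h = (r - ell) / n
    total = 0.0
    for k in range(n):
        q = ell + (k + 0.5) * h
        total += math.sqrt(q * q - ell * ell) / q
    return total * h

def eikonal_closed(r, ell):
    """Closed form ell*(tan(phi) - phi), with cos(phi) = ell/r (assumed parametrization)."""
    phi = math.acos(ell / r)
    return ell * (math.tan(phi) - phi)

def area_difference(r, ell):
    """Area of the triangle (ell^2 tan(phi) / 2) minus area of the sector (ell^2 phi / 2)."""
    phi = math.acos(ell / r)
    return 0.5 * ell * ell * (math.tan(phi) - phi)
```

For any $r > \ell$ the two evaluations of the action should agree to within the quadrature error, and `area_difference` should equal $\ell/2$ times the closed-form action.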

Rays do not penetrate into the shadow region, which is the interior of the caustic, $r < \ell$. In this region, the eikonal, (7.4.11), becomes completely imaginary,

$$S^\dagger_\pm = i\left\{\ell\cosh^{-1}\left(\frac{\ell}{r}\right) - \sqrt{\ell^2 - r^2}\right\}. \tag{7.4.12}$$

Since the ‘shadow’ intensities vanish exponentially, they are usually ignored. But, because the matching conditions between periodic and aperiodic zones furnish the quantum conditions in quantum mechanics, they must have some physical meaning. In fact, we will see that the shadow region lies in hyperbolic space.

Fig. 7.3. Sector inscribed in a triangle.

In the shadow zone we are no longer restricted to the relativistic assumption that nothing travels faster than the speed of light. Thus it is entirely feasible to have angular velocity $r\dot\varphi > 1$. The phase and group velocities are also problematic in quantum mechanics. Since their product is 1, if the group velocity is less than 1, the phase velocity must be greater than 1. But, this could be rationalized only by excluding their use in signal transmission since no optical effect could propagate faster than the speed of light.

Boundary conditions in general relativity usually require space-time to be asymptotically flat. But, in rotating systems, we will learn in Chapter 9 that a cut-off must be introduced for, otherwise, distances $r > 1/\dot\varphi$ would make the time component of the metric tensor negative. Distances greater than $1/\dot\varphi$ bring us into the caustic region, and since the region is hyperbolic, it does not lead to the Landau and Lifshitz [75] conclusion that “such a system cannot be made up of real bodies.”

The action (7.4.12) can also be derived from Fermat’s principle using an indefinite metric. The principle now asserts that

$$I = \int \eta\sqrt{r^2\dot\varphi^2 - \dot r^2}\,dt = \int \eta\sqrt{r^2\varphi'^2 - 1}\,dr \tag{7.4.13}$$

be stationary. Following the same procedure as given above, we find the trajectory is now given by

$$\varphi - \varphi_0 = \cosh^{-1}\left(\frac{\ell}{\eta r}\right),$$

for a constant index of refraction. We can always arrange that the integration constant $\varphi_0 = 0$ by fixing the initial point of the measurement of arc length. The extremum of (7.4.13) is just the negative of the hyperbolic distance

$$I = -\sqrt{\ell^2 - (\eta r)^2} = -\ell\tanh\varphi. \tag{7.4.14}$$

The extremum of the length of the ray is given by nothing less than the corresponding segment of the Lobachevsky straight line!

Canonical parametrization, where we set the radius of the caustic circle equal to $\ell$, and the arc length $s = \sinh\varphi$, enables the profile curve to be written as:

$$\beta(s) = \big(g(s), h(s)\big) = \left(\sinh^{-1}s - \frac{s}{\sqrt{1+s^2}},\ \frac{1}{\sqrt{1+s^2}}\right) = (\varphi - \tanh\varphi,\ \operatorname{sech}\varphi).$$

The term $g(s)$ measures the distance along the axis of revolution, and $h(s)$ measures the distance from the axis of revolution.

The action (7.4.12) is merely the distance along the axis of revolution, $S^\dagger = g(s)$, which in terms of $\varphi$ is

$$S^\dagger = \ell\int\tanh^2\varphi\,d\varphi = \ell(\varphi - \tanh\varphi). \tag{7.4.15}$$

Equation (7.4.15) is the parametric equation for a tractrix, whose surface of revolution resembles a bugle, having constant negative Gaussian curvature $-1/\ell^2$. The segment of the tangent to the tractrix between the point of contact and the x-axis has the constant length $\ell$, as shown in Fig. 7.4. The distance from the origin to the point where the tangent meets the x-axis is $\ell\varphi$. The point on the tractrix whose tangent intercepts the x-axis there is located at a distance $\ell(\varphi - \tanh\varphi)$ along the x-axis. This is precisely the action (7.4.15).
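The constant tangent length can be checked numerically. The sketch below assumes the parametrization $(x, y) = \ell(\varphi - \tanh\varphi,\ \operatorname{sech}\varphi)$ suggested by the profile curve and by (7.4.15), estimates the tangent direction by a central difference, and follows the tangent down to the x-axis; the helper names are hypothetical.

```python
import math

def tractrix_point(phi, ell=1.0):
    """Point on the tractrix: x along the axis of revolution, y the distance from it."""
    return ell * (phi - math.tanh(phi)), ell / math.cosh(phi)

def tangent_to_axis(phi, ell=1.0, dphi=1e-6):
    """Return (length of the tangent segment down to y = 0, x-intercept of the tangent)."""
    x, y = tractrix_point(phi, ell)
    x_fwd, y_fwd = tractrix_point(phi + dphi, ell)
    x_bwd, y_bwd = tractrix_point(phi - dphi, ell)
    dx, dy = x_fwd - x_bwd, y_fwd - y_bwd   # central-difference tangent direction
    t = -y / dy                             # parameter at which the tangent reaches y = 0
    return math.hypot(t * dx, t * dy), x + t * dx
```

For any $\varphi > 0$ the first returned value should be close to $\ell$ and the second close to $\ell\varphi$, in line with the two distances quoted in the text.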

In contrast to the bright zone, where the action is the difference in areas between a triangle and the sector inscribed in it, in the shadow zone it is the difference between the distance from the origin to the point of intersection of the tangent of the tractrix with the x-axis and the distance from the origin to the point of intersection of the normal, at the point of contact of the tractrix and its tangent, with the x-axis, as shown in Fig. 7.4. Newton defined the tractrix as the curve for which the length of its tangent from the point of contact to the x-axis is constant. Huygens pointed out that this curve could be interpreted as the path of a dog which is pulled by a leash of length $\ell$. Its primary importance is the role that it plays in hyperbolic geometry where its surface of revolution is the pseudosphere, as we have seen in Fig. 2.19.

Fig. 7.4. Newton’s tractrix again.

We recall that the pseudosphere is the negative curvature counterpart of a sphere. One may wonder whether there is a closed figure like a sphere which exhibits negative constant curvature. It was proved by Hilbert at the turn of the twentieth century that there is no smooth unbounded surface of constant negative curvature in ordinary space. Nevertheless, a plane of negative curvature can be obtained by introducing an indefinite metric.

7.5 General Relativity versus Non-Euclidean Metrics

. . . we have no proof of the need for a curved universe (space plus time) and the physical meaning of this theory is very confusing [Brillouin 70]

We may generalize the foregoing analysis to non-Euclidean geometries by writing Fermat’s principle using the first fundamental form,

$$I = \int \eta\sqrt{E\,dr^2 + G\,d\varphi^2} = \text{extremum}.$$

The fundamental form is a Clairaut parametrization, which is an orthogonal parametrization where $E$ and $G$ depend only on $r$. Then since $\varphi$ is a cyclic coordinate, we obtain the pre-geodesics, or curves of zero geodesic curvature, as

$$\frac{d\varphi}{dr} = \pm\frac{\ell\sqrt{E}}{\sqrt{G}\,\sqrt{\eta^2 G - \ell^2}}, \tag{7.5.1}$$

regardless of whether the index of refraction is a constant or not. As we have already said, the angular momentum $\ell$ gives the slant of the pre-geodesic curve.

We will determine $E$ and $G$ from the projective metric of a spherical distance on a sphere of radius $R$, and then let $R$ become imaginary. This will give us a hyperbolic metric of constant curvature, which is obviously the Beltrami metric. Then, as a final step we will consider the mass as constant and not its density. This will have the effect of going to a metric of non-constant curvature [cf. (7.4.9)].

The line element on a sphere of radius $R$ is

$$ds^2 = R^2\left(d\vartheta^2 + \sin^2\vartheta\,d\varphi^2\right). \tag{7.5.2}$$

As is shown in Fig. 7.5, the point P at $(R, \vartheta, \varphi)$, where the azimuthal angle, $\varphi$, is measured around the vertical axis, is projected stereographically onto the plane at point Q with coordinates $(r, \varphi)$. Thus,

$$\vartheta = 2\tan^{-1}\frac{r}{2R},$$

whose differential is

$$d\vartheta = \frac{dr/R}{1 + r^2/4R^2}.$$

Fig. 7.5. The stereographic projection of a point on the sphere P onto the plane at point Q.


Introducing this into the expression for the line element (7.5.2) results in

$$ds^2 = \frac{dr^2}{(1 + r^2/4R^2)^2} + \frac{r^2\,d\varphi^2}{1 + r^2/4R^2}, \tag{7.5.3}$$

where we have used

$$\sin\vartheta = 2\sin\tfrac12\vartheta\,\cos\tfrac12\vartheta = \frac{r/R}{1 + r^2/4R^2}.$$

Expression (7.5.3) is the metric of the sphere as measured by the coordinates $(r, \varphi)$ in the plane

$$\tan\tfrac12\vartheta = \frac{r}{2R}.$$
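The half-angle relations used in the stereographic projection are easy to confirm numerically. The following is a small sketch, with illustrative function names, that evaluates $\vartheta = 2\tan^{-1}(r/2R)$ and compares $\sin\vartheta$ and $d\vartheta/dr$ with their closed forms; note that the derivative evaluates to $(1/R)/(1 + r^2/4R^2)$, so that $R\,d\vartheta$ reproduces the first term of (7.5.3).

```python
import math

def polar_angle(r, R):
    """Polar angle of the sphere point projected onto plane radius r."""
    return 2.0 * math.atan(r / (2.0 * R))

def sin_theta_closed(r, R):
    """Closed form (r/R) / (1 + r^2 / 4R^2) for sin(theta)."""
    return (r / R) / (1.0 + r * r / (4.0 * R * R))

def dtheta_dr_numeric(r, R, h=1e-6):
    """Central-difference estimate of d(theta)/dr."""
    return (polar_angle(r + h, R) - polar_angle(r - h, R)) / (2.0 * h)

def dtheta_dr_closed(r, R):
    """Closed form (1/R) / (1 + r^2 / 4R^2) for d(theta)/dr."""
    return (1.0 / R) / (1.0 + r * r / (4.0 * R * R))
```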

Now, letting the radius become imaginary, $R \to iR$, the metric for a sphere, (7.5.3), transforms into

$$ds^2 = \frac{dr^2}{(1 - r^2/4R^2)^2} + \frac{r^2\,d\varphi^2}{1 - r^2/4R^2}, \tag{7.5.4}$$

which identifies the coefficients $E$ and $G$ in the orthogonal, first fundamental form. This is none other than the Beltrami metric that we first met in Sec. 2.4.

So the metric for a sphere in elliptic space becomes the metric for a pseudosphere in hyperbolic space when $R \to iR$.

Finally, as far as gravity is concerned, (7.5.4) ties the radius of curvature to the density through (7.4.9). The Gaussian curvature, which has the dimension of inverse square length, is thus proportional to the density. But, (7.4.9) says more.

Of the three tests of general relativity, it is the gravitational redshift which is a consequence solely of the equivalence principle, and not of the gravitational field equations. It says that clocks will be slowed down in a gravitational field by an amount

$$t' = \sqrt{1 - u^2}\,t = \sqrt{1 - \omega^2 r^2}\,t = \sqrt{1 - \frac{2M}{r}}\,t,$$

where $t'$ is the observer’s proper time, and $\omega$ is the angular velocity of rotation. If the density, and not the mass, is constant, the last equality states

$$\omega = \sqrt{\frac{8\pi\rho}{3}},$$

where $\rho^{-1/2}$ is proportional to the Newtonian free fall time. If the equivalence principle,

$$2M = \omega^2 r^3, \tag{7.5.5}$$

holds, all the results we get with $M = \text{const.}$ should follow with $\rho = \text{const.}$ With the total mass $M = \text{const.}$, but not $\rho = \text{const.}$, (7.5.4) becomes

$$ds^2 = \frac{dr^2}{(1 - 2M/r)^2} + \frac{r^2\,d\varphi^2}{1 - 2M/r}, \tag{7.5.6}$$

where we used $M = 4\pi\rho r^3/3$. The Gaussian curvature of the surface,

$$K = -\frac{M}{r^3}\left(2 - 3\frac{M}{r}\right), \tag{7.5.7}$$

becomes constant, in the large $r$ limit, when the density, $\rho$, does. Since $K = \kappa_1\kappa_2 = LN/EG$, where $\kappa_1$ and $\kappa_2$ are the principal curvatures, and $L$ and $N$ are the orthogonal coefficients of the second fundamental form,

$$L\,dr^2 + N\,d\varphi^2,$$

the mean curvature is

$$H = \frac12(\kappa_1 + \kappa_2) = \frac12\left(\frac{N}{G} + \frac{L}{E}\right) = -\frac{1}{r}\left(1 - \frac{2M}{r}\right), \tag{7.5.8}$$

for $L = -(2 - 3M/r)/[r(1 - 2M/r)^2]$ and $N = M/(1 - 2M/r)$. We now compare this with the Schwarzschild metric for the exterior and interior solutions.

The planar line elements for the Schwarzschild metric are

$$ds^2_{\text{ext}} = \frac{dr^2}{1 - 2M/r} + r^2\,d\varphi^2, \tag{7.5.9a}$$

$$ds^2_{\text{int}} = \frac{dr^2}{1 - r^2/R^2} + r^2\,d\varphi^2, \tag{7.5.9b}$$

for the exterior and interior regions, respectively. It would appear that (7.5.9a) and (7.5.9b) are related by the equivalence relation (7.3.7). However, the exterior metric has Gaussian curvature $K_{\text{ext}} = -M/r^3$, while the interior metric has Gaussian curvature $K_{\text{int}} = 1/R^2$. Thus,

by going from the exterior to the interior metric what was a surface of negative, non-constant curvature has become one of positive, constant curvature! Now it is well-known that a simple coordinate transformation can eliminate the singularity, $r = 2M$ in (7.5.9a), and the solution can be continued all the way to $r = 0$. But, in no way would we expect that the curvature of the surface changes as we go from the exterior metric, (7.5.9a), to the interior metric, (7.5.9b). This does not happen in the Beltrami metric for when the principle of equivalence is applied, the curvature becomes $K = -1/R^2$.
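The curvatures quoted for these line elements can be cross-checked with the standard formula for an orthogonal metric $E(r)\,dr^2 + G(r)\,d\varphi^2$, namely $K = -\frac{1}{\sqrt{EG}}\frac{d}{dr}\left(\frac{1}{\sqrt{E}}\frac{d\sqrt{G}}{dr}\right)$. The sketch below evaluates it by nested central differences; the function name, step size and sample values of $M$, $R$ and $r$ are illustrative assumptions.

```python
import math

def gaussian_curvature(E, G, r, h=1e-4):
    """Gaussian curvature of E(r) dr^2 + G(r) dphi^2 by nested central differences."""
    def flux(q):
        # (1/sqrt(E)) * d(sqrt(G))/dr evaluated at q
        d_sqrt_g = (math.sqrt(G(q + h)) - math.sqrt(G(q - h))) / (2.0 * h)
        return d_sqrt_g / math.sqrt(E(q))
    d_flux = (flux(r + h) - flux(r - h)) / (2.0 * h)
    return -d_flux / math.sqrt(E(r) * G(r))

M, R, r = 1.0, 20.0, 10.0   # sample values with 2M < r < R

# Schwarzschild exterior (7.5.9a): expected K = -M/r^3
k_ext = gaussian_curvature(lambda q: 1.0 / (1.0 - 2.0 * M / q), lambda q: q * q, r)

# Schwarzschild interior (7.5.9b): expected K = 1/R^2
k_int = gaussian_curvature(lambda q: 1.0 / (1.0 - q * q / (R * R)), lambda q: q * q, r)

# Beltrami metric at constant mass (7.5.6): expected K = -(M/r^3)(2 - 3M/r)
k_bel = gaussian_curvature(lambda q: 1.0 / (1.0 - 2.0 * M / q) ** 2,
                           lambda q: q * q / (1.0 - 2.0 * M / q), r)
```

With these sample values the three results should be close to $-10^{-3}$, $2.5\times10^{-3}$ and $-1.7\times10^{-3}$, respectively.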

The radial equation obtained from (7.5.1) for the Beltrami metric is

$$r' = \pm\frac{r^2}{\ell}\sqrt{\eta^2 - \frac{\ell^2}{r^2}\left(1 - \frac{2M}{r}\right)}, \tag{7.5.10}$$

and the angular momentum is conserved, $\ell = r^2\dot\varphi$. In the case of the deflection of light by a massive body $\eta = \text{const.}$, while in the advance of the perihelion, the refractive index is given by (7.6.1) where $W < 0$ would correspond to a complex index of refraction.

We can read off from (7.5.10) the potential,

$$2\Phi_S = -\frac{2M}{r} + \frac{\ell^2}{r^2}\left(1 - \frac{2M}{r}\right), \tag{7.5.11}$$

which we shall refer to as Schwarzschild’s, and which is shown in Fig. 7.6 (b). The corresponding Newtonian potential, with $\ell = 0$, is shown in Fig. 7.6 (a). For $W < 0$, the Newtonian orbits are elliptic and the line $W = \Phi_N$ in Fig. 7.6 (a) gives the maximum and minimum distances of the particle from the central mass. By contrast, for sufficiently large values of the angular momentum, the Schwarzschild potential has a maximum positive value, as shown in Fig. 7.6 (b). Particles with energy less than this value will not reach the origin. The minimum of $\Phi_S$ corresponds to a circular orbit.


Fig. 7.6. Comparison of the Newtonian potential (a) with that of the Schwarzschild potential (b). The former is obtained from the latter by setting $\ell = 0$.

The radial equation of the Beltrami metric with non-constant curvature, i.e. constant mass, gives the coupling between rotational repulsion and gravitational attraction that allows the exact calculation of the perihelion shift.

General relativity gives the non-conserved angular momentum (7.4.8) and the equation for the radial coordinate as [Møller 52, p. 349]

$$\dot r = \pm\sqrt{A + \frac{2M}{r} - r^2\dot\varphi^2},$$

which upon inserting (7.4.8) becomes

$$\dot r = \pm\left[A + \frac{2M}{r} - \frac{\ell^2}{r^2}\left(1 - \frac{2M}{r}\right)^2\right]^{1/2}, \tag{7.5.12}$$

where $A$ is a constant. By contrast, the Beltrami metric conserves the angular momentum, $\ell = r^2\dot\varphi$, and gives a radial equation,

$$\dot r = \pm\sqrt{\eta^2 - \frac{\ell^2}{r^2}\left(1 - \frac{2M}{r}\right)}, \tag{7.5.13}$$

which differs from (7.5.12) by the last term. But, since we are usually working at weak fields, we may take $(1 - 2M/r)^2 \approx 1 - 4M/r$, and in the same approximation neglect the second term in the denominator of (7.4.8). But, general relativity comes out with the equation of the trajectory as [Møller 52, Eqn. (28) p. 350]

$$\frac{1}{r^4}\left(\frac{dr}{d\varphi}\right)^2 = 2A + \frac{2M}{r} - \frac{\ell^2}{r^2} + \frac{2M\ell^2}{r^3}, \tag{7.5.14}$$

which is what we would get from dividing (7.5.13) by $r^2\dot\varphi$, where the index of refraction $\eta = \sqrt{2A + 2M/r}$. Notwithstanding the final equation that general relativity uses, there should have been a discrepancy of a factor of 2 in the last term of (7.5.14).

So we may ask where do the general relativistic results come from? Consider a generalization of the inner product. It was Riemann’s idea to generalize the ordinary dot product of two tangent vectors, $v \cdot w$, to the inner product,

$$v \circ w = \frac{v \cdot w}{g^2}.$$

In the xy-plane, the inner product is blown up by the factor

$$g = 1 - \frac{x^2 + y^2}{R^2},$$

where $R$ is the radius of the disc lying in the xy-plane. This geometric surface is the hyperbolic plane, and, necessarily, we must have $r = \sqrt{x^2 + y^2} < R$. Transforming to polar coordinates, the line element is given by

$$ds^2 = \frac{dr^2 + r^2\,d\varphi^2}{(1 - r^2/R^2)^2}, \tag{7.5.15}$$

which just misses being the stereographic projection given by the Beltrami metric (7.5.4) by being a factor of $g$ too high in the denominator of the second term.

On the strength of the equivalence principle, (7.5.5), we can write (7.5.15) as

$$ds^2 = \frac{dr^2 + r^2\,d\varphi^2}{(1 - 2M/r)^2}. \tag{7.5.16}$$

The equation for the pre-geodesic is

$$\frac{d\varphi}{dr} = \frac{\dot\varphi}{\dot r} = \pm\frac{\ell(1 - 2M/r)}{r^2\sqrt{\eta^2 - \dfrac{\ell^2}{r^2}(1 - 2M/r)^2}}. \tag{7.5.17}$$

Although this equation splits up into the general relativistic expression for the angular momentum, (7.4.8), and the radial equation, the latter will give twice the value for the rotational-gravitational coupling that (7.5.12) gives in the weak field approximation. This is on account of the fact that it is the same $G$ coefficient in the fundamental form that appears in (7.4.8) and the radial equation.

The Schwarzschild exterior solution corresponds exactly to the stereographic metric, (7.5.15), under the transformation (7.5.5).

This is the reason for the inability to integrate their equation for the pre-geodesics. Remarkably, (7.5.17), under the equivalence principle, (7.5.5), is one of the few cases which can be integrated in closed form. For a constant index of refraction, and setting $R = 2$, it becomes

$$\frac{d\varphi}{dr} = \pm\frac{\ell(1 - r^2/4)/r^2}{\sqrt{1 - \left(\ell(1 - r^2/4)/r\right)^2}}.$$

In order to perform the integration we set [O’Neill 66]

$$u = \frac{b}{r}\left(1 + \frac14 r^2\right), \qquad\text{where}\qquad b = \frac{\ell}{\sqrt{1 + \ell^2}}.$$

This reduces the pre-geodesic to

$$d\varphi = \pm\frac{du}{\sqrt{1 - u^2}},$$

so that the solution is

$$r^2 + r_0^2 - 2r_0 r\cos(\varphi - \varphi_0) = r_1^2,$$

where $\varphi_0$ is a constant of integration. With $r_0^2 = r_1^2 + 4$, this is a Euclidean circle. The center of the circle, O, having coordinates $(r_0, \varphi_0)$, lies outside the disc since $r_0 > 2$. The circle centered at O has an arc that cuts the rim orthogonally, as shown in Fig. 7.7.

Fig. 7.7. Geodesic curves that cut the rim of the hyperbolic plane orthogonally are arcs of a circle whose center O lies outside the disc.

All geodesics of the hyperbolic plane are either curved arcs that cut the rim of the disc orthogonally or straight lines through the center. If we reinstate the constants, the condition that $\ell$ be real is $\omega r_0 > c$, and such a condition violates the limiting value of the speed of light. The fact that the curved geodesics are fixed by coordinates outside of the hyperbolic plane appears to resemble Mach’s principle whereby rotation should be reckoned by the distribution of all the masses that make up the universe.
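A quick numerical check of this closed-form geodesic is sketched below. It assumes the integrated form $u = \cos(\varphi - \varphi_0)$ of the reduced pre-geodesic, solves $u = (b/r)(1 + r^2/4)$ for the root inside the disc of radius 2, and evaluates the residual of the circle equation with $r_0 = 2/b$ and $r_1^2 = r_0^2 - 4$. The identification $r_0 = 2/b$ follows from comparing the two equations and, like the function names, is not stated in the text.

```python
import math

def geodesic_radius(phi, ell, phi0=0.0):
    """Root r < 2 of (b/r)(1 + r^2/4) = cos(phi - phi0), with b = ell/sqrt(1 + ell^2)."""
    b = ell / math.sqrt(1.0 + ell * ell)
    c = math.cos(phi - phi0)
    # (b/4) r^2 - c r + b = 0; the roots multiply to 4, so the smaller one lies inside the disc
    return (2.0 / b) * (c - math.sqrt(c * c - b * b))

def circle_residual(phi, ell, phi0=0.0):
    """Residual of r^2 + r0^2 - 2 r0 r cos(phi - phi0) - r1^2 along the geodesic."""
    b = ell / math.sqrt(1.0 + ell * ell)
    r0 = 2.0 / b                  # distance of the circle's center from the origin
    r1_sq = r0 * r0 - 4.0         # orthogonality to the rim r = 2
    r = geodesic_radius(phi, ell, phi0)
    return r * r + r0 * r0 - 2.0 * r0 * r * math.cos(phi - phi0) - r1_sq
```

The residual should vanish to rounding error for any angle with $\cos(\varphi - \varphi_0) \ge b$, and $r_0 = 2/b$ always exceeds 2 because $b < 1$.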

7.6 The Mechanics of Diffraction

Still considering a flat metric where coefficients of the first fundamental form are $E = 1$ and $G = r^2$, the simplest generalization is to consider a varying index of refraction,

$$\eta = \sqrt{2W + \frac{2M}{r}}, \tag{7.6.1}$$

for a closed orbit having negative total energy, $W < 0$. In the short-wavelength limit, the gravitational potential in (7.6.1) will have the effect of converting the phase of the Bessel function into a Laguerre function. The equation for the pre-geodesic is

$$\frac{d\varphi}{dr} = \pm\frac{\ell}{r^2\sqrt{-2|W| + 2M/r - \ell^2/r^2}}. \tag{7.6.2}$$

With the conservation of the angular momentum, the equation for the radial coordinate is

$$\frac{dS}{dr} = \dot r = \pm\sqrt{-2|W| + \frac{2M}{r} - \frac{\ell^2}{r^2}},$$

by definition of the eikonal, $S$. Now, introduce the change of variable, $x = 2r\sqrt{|W|}$, to obtain

$$\frac{dS}{dx} = \pm\sqrt{-\frac14 + \frac{\lambda}{x} - \frac{\ell^2}{x^2}},$$

where $\lambda = M/\sqrt{|W|}$. Its square can be approximated as

$$\left(\frac{dS}{dx}\right)^2 = \frac{\lambda}{x}\left(1 - \frac{\ell^2}{\lambda x}\right) - \frac14 \approx \frac{\lambda}{x(1 + \ell^2/\lambda x)} - \frac14.$$

With another change of variable, $x = 4\lambda\cos^2\vartheta - \ell^2/\lambda$, we get

$$\frac{dS}{dx} = \pm\frac12\tan\vartheta. \tag{7.6.3}$$

Noting that $dx = -8\lambda\cos\vartheta\sin\vartheta\,d\vartheta$, we integrate to obtain

$$S = \pm\lambda(2\vartheta - \sin 2\vartheta). \tag{7.6.4}$$
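The step from (7.6.3) to (7.6.4) can be verified numerically. The sketch below takes the upper sign of (7.6.4), $S = \lambda(2\vartheta - \sin 2\vartheta)$, and the substitution $x = 4\lambda\cos^2\vartheta - \ell^2/\lambda$, and forms the ratio $dS/dx$ from central differences in $\vartheta$. Since $x$ decreases as $\vartheta$ increases, this branch gives $-\tfrac12\tan\vartheta$, which is the lower sign of (7.6.3). Function names and sample values are illustrative.

```python
import math

def x_of(theta, lam, ell):
    """Change of variable x = 4*lam*cos(theta)^2 - ell^2/lam."""
    return 4.0 * lam * math.cos(theta) ** 2 - ell * ell / lam

def action(theta, lam):
    """Upper-sign branch of (7.6.4): S = lam*(2*theta - sin(2*theta))."""
    return lam * (2.0 * theta - math.sin(2.0 * theta))

def slope_numeric(theta, lam, ell, h=1e-6):
    """Central-difference estimate of dS/dx along the curve parametrized by theta."""
    d_action = action(theta + h, lam) - action(theta - h, lam)
    d_x = x_of(theta + h, lam, ell) - x_of(theta - h, lam, ell)
    return d_action / d_x
```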

With $\lambda^{1/3}$ as the absolute constant, $\pi$ times (7.6.4) is precisely the volume of a sphere of radius $\rho = \lambda^{1/3}\vartheta$ in elliptic geometry. The surface distance is proportional to the angle $\vartheta$ subtended at the center so that $\vartheta$ can be used as a measure of surface distance. As $\vartheta$ increases the object you are viewing decreases in apparent size until it arrives at the equator, $\vartheta = \pi/2$. This is the maximum distance in elliptic geometry but not in spherical geometry. Increasing $\vartheta$ still further, the object now seems to be approaching and growing in size until it reaches $\vartheta = \pi$. On completion of the round trip, you return to the north pole but with your head facing in the opposite direction. The elliptic plane is thus said to have only one side [Thurston 97]!

The finiteness of lines is the most novel feature of spherical geometry, and no line can be longer than $\pi$. Felix Klein suggested not to consider the whole sphere, but only half of it. This suggestion was made in order to eliminate the one, trivial, blemish of spherical geometry: any two great circles on a sphere meet not just in one point, but in two diametrically opposite points, the so-called antipodes. By restricting the distance to $\pi/2$ the halves of great circles meet only once. Consider two identical twins placed at antipodal points on each of the hemispheres. The twins regard themselves as a single entity and are not aware of the split. When any one of the twins moves, the other will also move to keep them diametrically separated. Each twin necessarily regards a pair of antipodal points on the sphere as a single location so that to him every pair of great circles intersects at a single point, while to us it looks like two different points. When one twin traces out a triangle we see two antipodal triangles being traced out. To us Euclideans everything seems to double! This spherical flatland is called elliptic space, and so is superior to spherical geometry.

Quite remarkably and unexpectedly, by quantizing,

$$\lambda = n\hbar,$$

we come out with the energy levels of the hydrogen atom,

$$W = -\left(\frac{M}{n\hbar}\right)^2,$$

albeit mass replacing charge. Moreover, the Clairaut equation (7.6.2) is easily integrated to give

$$r = \frac{\ell^2/M}{1 - (\Delta/M)\sin\vartheta}, \tag{7.6.5}$$

where $\Delta = \sqrt{M^2 - 2|W|\ell^2}$. This is the equation for an ellipse since the eccentricity,

$$\varepsilon = \left(1 - \frac{2|W|\ell^2}{M^2}\right)^{1/2} = \left[1 - 2\left(\frac{\ell}{n\hbar}\right)^2\right]^{1/2} < 1. \tag{7.6.6}$$

The lengths of the major and minor axes, $M$ and $\sqrt{2|W|}\,\ell$, clearly show the competing forces of gravitational attraction and centrifugal repulsion.

The transition point occurs where $S'(x_0) = 0$, which is

$$x_0 = \frac{(2n\hbar)^2 - \ell^2}{n\hbar}.$$

For $x > x_0$, or equivalently,

$$r > \frac{(2n\hbar)^2 - \ell^2}{2M}, \tag{7.6.7}$$

there is a transition to the exponential region, where (7.6.4) transforms into

$$S^\dagger_\pm = \pm\lambda(\sinh 2\vartheta - 2\vartheta). \tag{7.6.8}$$

With the same absolute constant $\lambda^{1/3}$, $\pi$ times (7.6.8) is the volume of a sphere with hyperbolic radius $\lambda^{1/3}\vartheta$. But, this is exactly what we should have expected since the transformation from spherical to hyperbolic geometry is achieved by letting the radius become imaginary, $\vartheta \to i\vartheta$.

It is intriguing to know how the hydrogen atom would behave in a hyperbolic plane. Rearranging inequality (7.6.7) we get

$$\frac{2M}{r} + \frac{\ell^2}{r^2} > \left(\frac{2n\hbar}{r}\right)^2,$$

showing that gravitation and centrifugal forces no longer oppose one another. The gravitational potential would become repulsive, and for the hydrogen atom it would mean a repulsive Coulomb potential; the atom would fly apart.

All effects that we have so far discussed are nonrelativistic because $c$ has not made its appearance. Their small corrections to celestial mechanics are fully accounted for by the Beltrami metric of hyperbolic geometry, as we shall now go on to show.

7.6.1 Gravitational shift of spectral lines

In Sec. 3.8.2.3 we saw how Einstein replaced the Doppler expression by a new one which predicted the effect that gravity would have upon spectral lines. That is, he replaced the nonrelativistic expression for the Doppler shift,

$$\frac{\nu - \nu_0}{\nu_0} = -u/c, \tag{7.6.9}$$

by

$$\frac{\nu - \nu_0}{\nu_0} = -\alpha/2r. \tag{7.6.10}$$

The confirmation of (7.6.10) is indeed miraculous, given its derivation. We know that motion causes a shift in the frequency, so the right-hand side should be $u/c$, the relative velocity. But, if light from an unaccelerated frame is emitted, it will arrive at the accelerated frame, whose acceleration is being caused by gravity, in time $h/c$. So we want to replace $u/c$ by $gh/c^2$, where $g$ is the uniform acceleration on the Earth’s surface. Finally, we want to replace this scenario with one governed by a (Newtonian) gravitational potential which depends on the mass and has an inverse dependency on distance. The factor of one-half in (7.6.10) is meant to show that we are in the non-relativistic region. So, we have gone from the Doppler effect which depends only on the speed to one in which mass and distance have entered. The Doppler shift, (7.6.9), can also lead to a blueshift if the object is approaching. The gravitational shift cannot, for what was attraction would then become repulsion.

7.6.2 The deflection of light

Gravity makes the medium optically denser in the vicinity of a large mass, like the Sun, than it would be in its absence. As a result, light rays will be bent toward the Sun rather than being straight lines. As we know from Sec. 3.8.2 this effect was originally predicted by Söldner as far back as 1801; it was rediscovered by Einstein [52] in 1911 by redefining the Doppler effect.

The remarkable thing is that the equation for the trajectory of a light ray in a gravitational field can be derived directly from the Beltrami metric once we transform from a constant free-fall time, or frequency, to one of constant mass. This is not to be taken as a physical equivalence, but, rather, a mathematical one. However, the transformed metric of non-constant curvature is as ‘physical’ as the original Beltrami metric of constant curvature.

The Beltrami metric at constant curvature describes a uniformly rotating disc, while the same metric at non-constant curvature describes the deflection of light in a gravitational field. The two are related by the mathematical statement of the equivalence principle.

Nothing could be so simple nor so beautiful. The rest of the analysis is standard; but, for completeness’ sake we reproduce it here.

We first introduce the change of variable, $r = \rho^{-1}$, in the pre-geodesic for the Beltrami metric, (7.5.10), to obtain

$$\frac{d\rho}{d\varphi} = \pm\sqrt{\ell^{-2} - \rho^2 + 2M\rho^3}. \tag{7.6.11}$$

This is identical to the general relativistic equation for the trajectory (7.5.14) with $A = 0$. Second, we introduce another new variable, $\sigma = \rho\ell\sqrt{1 - 2M\rho}$, and neglecting the small term, $2M\rho^3$, we get

$$\varphi = \int_0^\rho \frac{\ell\,d\rho}{\sqrt{1 - \ell^2\rho^2}} = \sin^{-1}\ell\rho.$$

The geodesic $r = \ell/\sin\varphi$, obtained by setting the constant of integration, $\varphi_0 = \pi/2$, is a straight line which passes the origin at a distance $\ell$ when $\varphi = \pi/2$, and goes to infinity for $\varphi \to 0, \pi$.

The exact equation (7.6.11) may be cast in the form,

$$\frac{d\rho}{d\varphi} = \frac{\sqrt{1 - \sigma^2}}{\ell}, \tag{7.6.12}$$

where $\sigma = \ell\rho\sqrt{1 - 2M\rho}$. Since $2M\rho$ is a small quantity, we can use the approximations:

$$\sigma = \ell\rho(1 - M\rho), \qquad \ell\rho = \sigma(1 + M\rho) = \sigma(1 + M\sigma/\ell),$$

which are valid to first-order. Differentiating we find $\ell\,d\rho = (1 + 2M\sigma/\ell)\,d\sigma$. Introducing these approximations into (7.6.12), and integrating we get

$$\varphi = \int_0^\rho \frac{\ell\,d\rho}{\sqrt{1 - \sigma^2}} = \int_0^\sigma \frac{(1 + 2M\sigma/\ell)\,d\sigma}{\sqrt{1 - \sigma^2}} = \arcsin\sigma - \frac{2M}{\ell}\sqrt{1 - \sigma^2} + \frac{2M}{\ell}.$$

It is apparent from (7.6.12) that $\varphi$ will have an extremum when $\sigma = 1$. This value corresponds to the closest approach of the ray to the Sun. At this distance the angle $\varphi$ will be $\varphi_m = \pi/2 + 2M/\ell$. Reinstating all constants we find the total deflection to be

$$2\varphi_m - \pi = \frac{4GM}{\ell c}, \tag{7.6.13}$$

which is twice that obtained by treating the interaction through a Newtonian potential, as in Fig. 7.6 (a). The ratio of angular momentum to the speed of light, $\ell/c$, plays the role of a collision parameter, or the distance of closest approach. As expected, (7.6.13) is the ratio of the Schwarzschild radius to a characteristic length, here the collision parameter.
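The first-order result can be compared with a direct quadrature of the exact equation (7.6.11). The sketch below works in the text's units ($G = c = 1$), finds the closest approach $\rho_m$ as the root of $\ell^{-2} - \rho^2 + 2M\rho^3$ near $1/\ell$ by Newton iteration, and removes the endpoint singularity with the substitution $\rho = \rho_m\sin\theta$, a choice made here for numerical convenience rather than taken from the text. Function names are illustrative.

```python
import math

def closest_approach(M, ell):
    """Root of 1/ell^2 - rho^2 + 2 M rho^3 near rho = 1/ell (Newton iteration)."""
    rho = 1.0 / ell
    for _ in range(50):
        f = 1.0 / (ell * ell) - rho * rho + 2.0 * M * rho ** 3
        df = -2.0 * rho + 6.0 * M * rho * rho
        rho -= f / df
    return rho

def deflection(M, ell, n=2000):
    """Total deflection 2*phi_m - pi from Simpson quadrature of the exact equation."""
    rho_m = closest_approach(M, ell)

    def integrand(theta):
        # With rho = rho_m sin(theta) the factor (rho_m - rho) cancels analytically
        rho = rho_m * math.sin(theta)
        s = rho_m + rho
        return math.sqrt(s / (s - 2.0 * M * (rho_m ** 2 + rho_m * rho + rho ** 2)))

    h = (math.pi / 2.0) / n          # n must be even for Simpson's rule
    total = integrand(0.0) + integrand(math.pi / 2.0)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * integrand(k * h)
    phi_m = total * h / 3.0
    return 2.0 * phi_m - math.pi
```

For $M/\ell \ll 1$ the returned value should approach $4M/\ell$, with the difference being of second order in $M/\ell$.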

Page 408: A New Perspective on Relativity

Aug. 26, 2011 11:16 SPI-B1197 A New Perspective on Relativity b1197-ch07

General Relativity in a Non-Euclidean Geometrical Setting 381

We can now show that the time of propagation along a ray will be lengthened in the presence of a massive body. Since the angular momentum has its Euclidean value, $\ell = \dot\phi r^2$, in the Beltrami model, the action can be read off from (7.5.13) as

\[ S_\pm = \pm\int \frac{\sqrt{r^2 - \ell^2(1 - 2M/r)}}{r}\,dr, \]

for $\eta = 1$. Treating the last term in the numerator as a small quantity we have

\[ S_\pm \approx \pm\int \frac{\sqrt{r^2-\ell^2}}{r}\left[1 + \frac{M\ell^2}{r(r^2-\ell^2)}\right]dr = \pm\left\{\sqrt{r^2-\ell^2}\left(1 + \frac{M}{r}\right) - \ell\cos^{-1}\frac{\ell}{r}\right\}. \]

The length and the time it takes to propagate along a ray will be lengthened by the amount $M/r$, for small values of the gravitational potential. The deflection of light attests to this lengthening by curving the ray.
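The closed form of the approximate action can be checked by differentiating it numerically and comparing with the integrand. The following is only a sketch: the sample values of $\ell$, $M$ and $r$ are arbitrary, and the helper names are hypothetical.

```python
import math


def integrand(r, ell, M):
    """First-order integrand: sqrt(r^2 - ell^2)/r * [1 + M ell^2 / (r (r^2 - ell^2))]."""
    root = math.sqrt(r * r - ell * ell)
    return (root / r) * (1 + M * ell**2 / (r * (r * r - ell * ell)))


def action(r, ell, M):
    """Closed form: sqrt(r^2 - ell^2) (1 + M/r) - ell * arccos(ell/r)."""
    root = math.sqrt(r * r - ell * ell)
    return root * (1 + M / r) - ell * math.acos(ell / r)


def central_difference(f, x, h=1e-5):
    """Symmetric finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


ell, M = 1.0, 1e-3   # arbitrary illustrative values
max_err = max(
    abs(central_difference(lambda r: action(r, ell, M), r) - integrand(r, ell, M))
    for r in (1.5, 2.0, 5.0, 20.0)
)
print(f"largest mismatch between dS/dr and the integrand: {max_err:.2e}")
```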

7.6.3 Advance of the perihelion

We treated the advance of the perihelion in Sec. 3.8.2 by the elegant method devised by Ritz. The only blemish is that the arbitrary parameter appearing in his force law had to be chosen to give the observational result. In this section we determine the advance by our optico-gravitational approach. By doing so we go against the adage of Boltzmann who said that elegance should be left to tailors and cobblers.

Taking into account both the gravitational potential and the quadrupole interaction, the index of refraction can be written as

\[ \eta = \left[-2W + \frac{2M}{r}\left(1 + \frac{\ell^2}{r^2}\right)\right]^{1/2}. \]

In the unperturbed state, where the quadrupole interaction is absent, the gravitational potential must be large enough to produce a real eccentricity, (7.6.6). The corresponding action over a period of the motion,

\[ S(r,\ell) = \oint \frac{\sqrt{\eta^2 r^2 - \ell^2}}{r}\,dr = \oint\left[-2W + \frac{2M}{r} - \frac{\ell^2}{r^2} + \frac{2M\ell^2}{r^3}\right]^{1/2} dr, \tag{7.6.14} \]

shows that a closed trajectory will result from a dynamic balance between gravitational and centrifugal forces.

The quadrupole interaction requires $r > 3M$ in order that the angular momentum be real at the extremum of the potential, $d\Delta S/dr = 0$,

\[ \ell^2 = \frac{Mr^2}{r - 3M}, \]

as seen in Fig. 7.6(b). Introducing this fact into (7.6.14) by writing $r' = r + 3M$, and retaining only those terms that are at most quadratic in $M$, gives

\[ S(r,\ell) = \oint\left[-2W + \frac{2M}{r}\left(1 - \frac{3M}{r}\right) - \frac{\ell^2}{r^2}\left(1 - \frac{6M}{r}\right) - \frac{2M\ell^2}{r^3} + \cdots\right]^{1/2} dr, \]

where for brevity we have dropped the prime on $r$. Expanding the integrand in powers of the small correction terms results in

\[ S = S^{(0)} - 3M^2 S^{(1)}, \tag{7.6.15} \]

where the unperturbed action is

\[ S^{(0)} = \oint \sqrt{-2W + \frac{2M}{r} - \frac{\ell^2}{r^2}}\;dr = 2\pi\left(\frac{M}{\sqrt{2W}} - \ell\right) \]

and the first-order correction is

\[ S^{(1)} = \oint \frac{dr}{r\sqrt{-2Wr^2 + 2Mr - \ell^2}} = -\frac{2\pi}{\ell}. \]

Since the trajectory is defined by the equation,

\[ \phi + \frac{\partial S}{\partial \ell} = \text{const.}, \]

the change in the angle $\phi$ over one revolution is

\[ \Delta\phi = -\frac{\partial \Delta S}{\partial \ell} = 2\pi + 3M^2\frac{\partial S^{(1)}}{\partial \ell} = 2\pi\left(1 + \frac{3M^2}{\ell^2}\right). \tag{7.6.16} \]

General relativity gives the shift as $6\pi M/l$, where $l$ is the semi-latus rectum. From the equation of the ellipse, we find $l = \ell^2/M$ so that the excess over $2\pi$ in (7.6.16), $6\pi M^2/\ell^2$, is exactly the expression found in general relativity. It can also be compared with Ritz’s result (3.8.22), which we will do in a moment.

The rotation of the perihelion of Mercury per revolution amounts to 0.104′′. The dimensionless total energy constant is $W = 2.59 \times 10^{-8}$, and the mean motion is $\omega = \ell/ab = (2W)^{3/2}/M = 8.34 \times 10^{-7}$ s$^{-1}$, where $a = M/2W$ and $b = \ell/\sqrt{2W}$ are the semi-major and semi-minor axes. The calculated period of Mercury is $\tau = 2\pi/\omega = 87.25$ days, which is close to the actual value of $\tau = 88$ days. The frequency of rotation of the perihelion is $\Delta\omega = \omega\,\Delta\phi_1 = 4.25 \times 10^{-13}$ s$^{-1}$.
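The quoted shift per revolution can be reproduced, at least roughly, from $6\pi M/l$. The sketch below assumes standard orbital elements for Mercury and a solar gravitational parameter; none of these inputs are given in the text, and the function name is hypothetical.

```python
import math

C = 299_792_458.0        # speed of light, m/s
GM_SUN = 1.32712e20      # assumed solar gravitational parameter, m^3/s^2
A_MERCURY = 5.7909e10    # assumed semi-major axis of Mercury, m
E_MERCURY = 0.2056       # assumed eccentricity of Mercury
PERIOD_DAYS = 87.969     # assumed orbital period, days


def perihelion_advance_per_orbit(GM, a, e, c=C):
    """Advance 6*pi*M/l in radians, with M = GM/c^2 and l = a(1 - e^2)."""
    M = GM / c**2              # gravitational length of the Sun
    l = a * (1 - e**2)         # semi-latus rectum
    return 6 * math.pi * M / l


dphi = perihelion_advance_per_orbit(GM_SUN, A_MERCURY, E_MERCURY)
arcsec_per_orbit = math.degrees(dphi) * 3600
arcsec_per_century = arcsec_per_orbit * 36525 / PERIOD_DAYS
print(f"{arcsec_per_orbit:.4f} arcsec per revolution")
print(f"{arcsec_per_century:.1f} arcsec per century")
```

Under these assumptions the advance per revolution comes out near 0.104′′, which accumulates to roughly 43′′ per century.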

This approach can be compared to Ritz’s calculation of the advance of the perihelion in Sec. 3.8.2. Ritz begins with his law of force, and finds it necessary to consider that angular momentum is not conserved. This he shares with general relativity. By contrast, the Beltrami metric does not lead to any violation of the conservation of angular momentum. The transformation from a metric of negative constant curvature to non-constant curvature uses (7.5.5). General relativity would use half that because the metric it uses is the stereographic metric, (7.5.15). The factor of 2 enters when the quadratic term is approximated by a linear one.

It is not the numerical factors which make or break a theory. In non-Euclidean geometries the unit of measurement is left to one’s discretion. This allays any suspicion that new metrics must be invented to account for the optico-gravitational phenomena.

References

[Brillouin 70] L. Brillouin, Relativity Reexamined (Academic Press, New York, 1970).

[Carstoiu 69] J. Carstoiu, “Les deux champs de gravitation et propagation des ondes gravifiques,” Compt. Rend. 268 (1969) 201–263; J. Carstoiu, “Nouvelles remarques sur les deux champs de gravitation et propagation des ondes gravifiques,” Compt. Rend. 268 (1969) 261–264.

[Einstein 11] A. Einstein, “On the influence of gravitation on the propagation of light,” Ann. der Phys. 35 (1911); translated in W. Perrett and G. B. Jeffrey, The Principle of Relativity (Methuen, London, 1923), pp. 99–108.

[Einstein 52] A. Einstein, The Principle of Relativity (Dover, New York, 1952), pp. 99, 111.

[Essen 78] L. Essen, “Relativity and time signals,” Wireless World, October 1978, pp. 44, 45.

[Hafele & Keating 72] J. C. Hafele and R. E. Keating, “Around the world atomic clocks: predicted relativistic time gains,” Science 177 (1972) 166–168.

[Heaviside 88] O. Heaviside, “On electromagnetic waves, especially in relation to the vorticity of the impressed forces; and the forced vibrations of electromagnetic systems,” Phil. Mag. May 1888, 380–449.

[Heaviside 93] O. Heaviside, Electromagnetic Theory, Vol. I (The Electrician, London, 1893), Appendix B.

[Heaviside 94] O. Heaviside, Electrical Papers, Vol. 2 (Macmillan, New York, 1894), p. 444.

[Landau & Lifshitz 75] L. D. Landau and E. M. Lifshitz, The Classical Theory of Fields (Pergamon, Oxford, 1975).

[Møller 52] C. Møller, The Theory of Relativity (Oxford U. P., London, 1952), p. 355.

[Sexl & Sexl 79] R. Sexl and H. Sexl, White Dwarfs–Black Holes (Academic Press, New York, 1979), p. 43.

[Sommerfeld 23] A. Sommerfeld, Atomic Structure and Spectral Lines (E. P. Dutton, New York, 1923), p. 466.

[Thurston 97] W. P. Thurston, Three-dimensional Geometry and Topology (Princeton U. P., Princeton, NJ, 1997), p. 34.


Chapter 8

Relativity of Hyperbolic Space

Of course, since Einstein, we do not use hyperbolic geometry to model the geometry of the universe. [Greenberg 93]

8.1 Hyperbolic Geometry and the Birth of Relativity

Almost immediately after the birth of special relativity, Sommerfeld [09] made the interesting observation that the relativistic composition laws of velocities are “no longer the formulas of the plane but those of spherical trigonometry (with imaginary sides)”, that is, trigonometrical formulas obtained by replacing the real argument by an imaginary one. Spheres of imaginary radius had been known for a long time, as this identity was pointed out by Lobaschevsky [98] himself. The first explicit connection of Lobaschevsky geometry to relativity was made by Varicak [10]. The hyperbolic geometry of relativity represents the velocity addition law as a triangle on the surface of a pseudosphere, a surface of revolution looking like a bugle [cf. Fig. 2.19], and the angle of parallelism [cf. Fig. 2.12], which measures the deviation from Euclidean space. As the relative velocity approaches unity, the angle of parallelism approaches zero. The fact that the angle of parallelism provides a unique relation between circular and hyperbolic functions can be found in the early textbook on relativity by Silberstein [14], written in 1914. These developments had no follow-up, and no place for hyperbolic geometry could be found in the relativity textbooks that followed. Undoubtedly, this was due to the influence of Einstein’s general relativity, which is based upon Riemann geometry, where matter and geometry are woven together.

Yet, even more astonishing is that Poincaré missed all of this! Over fifty years after the discovery of the hyperbolic geometry, Poincaré developed two models of hyperbolic geometry: the upper half-plane and disc models that we discussed in Sec. 2.4. The inhabitants of the hyperbolic plane, Poincarites, consider geodesics as straight lines, while to us Euclideans they would appear as circular arcs meeting the boundary orthogonally in the disc model; we would also see Poincarites shrink as they approach the real axis from the upper half-plane. Poincarites would not be able to measure their shrinkage because the rulers they use shrink along with them.

These distortions are created by motion, and Poincaré was well aware of the contraction that bodies undergo in the direction of the Earth’s motion, of an amount proportional to the square of the aberration. This is the famous FitzGerald–Lorentz contraction that was first postulated independently by FitzGerald and Lorentz, as an explanation of the Michelson–Morley null result, which we discussed in Sec. 3.2. Poincaré [05] was also aware of the relativistic velocity composition law, since it was he who discovered it. Yet, he did not recognize that the longitudinal Doppler law is the invariant cross-ratio if the velocity in that law is that resulting from the relativistic subtraction law.

Measurements in relativity consist in sending and receiving light signals. Distances are measured in terms of time differences. At each step, the time at which a signal is received is the time at which it was sent multiplied by a factor which turns out to be the longitudinal Doppler shift [Whitrow 33]. The space-time transformations from one inertial frame to another, involving Doppler shifts, combine to give the Lorentz transformation [Milne 48]. Poincaré defined hyperbolic distance in terms of the logarithm of the cross-ratio, which, for the distance between any two velocity points on a vertical half-line with an endpoint at infinity, is proportional to the logarithm of the longitudinal Doppler shift. Had he realized this, he could have carried over the battery of concepts and tools he developed some twenty years before relativity, without distinguishing between the ‘special’ and ‘general’ theories of Einstein. This we plan to do in this chapter.

The chapter is organized as follows. We discuss in Sec. 8.2 the connection between geometrical rigid motions and their relations to particular inertial frames of reference. Compounding Doppler shifts at different velocities yields the Poincaré composition law, which, in terms of homogeneous coordinates, shows that the Lorentz transform is a unique Möbius automorphism that exchanges an inertial frame of equal and opposite velocities with the state at rest. We will also appreciate it as the isomorphism that converts the Poincaré model to the Klein, or projective, model of the hyperbolic plane, as well as establishing the limit for hyperbolic rotations in terms of the angle of parallelism.

We then discuss in Sec. 8.3 the relativistic phenomenon of aberration and show that it conforms to the hyperbolic law of sines. In terms of a right triangle inscribed in a unit disc, angular deformations of the non-central angle and contractions of the side of the triangle perpendicular to the motion will respectively be related to the facts that the sum of the angles of a hyperbolic triangle is less than π, and that a FitzGerald–Lorentz contraction occurs in the direction normal to the motion, making it look more like a rotation than a contraction.

The hyperbolic contraction is in the direction normal to the motion, and not in the direction of the motion. It is exactly the same second-order Doppler effect that Ives and Stilwell measured back in 1938, as we have described in Sec. 3.4. Ives considered it a demonstration that clocks in motion run slower, and held that it has nothing to do with a relativistic time dilatation.

The radar method of sending and receiving light signals to measure elapses in time and distance will then be used in Sec. 8.4 to contrast states of uniform motion and uniform acceleration. We confirm Whitrow’s [80] conclusion that acceleration does, indeed, affect the rate of a clock, and convert his inequality for time dilatation into an equality for systems in uniform acceleration. The characteristic means that we find for times of reflection for systems in uniform motion and uniform acceleration imply different temporal scales.

Applications to general relativity and cosmology follow. Beltrami coordinates and logarithmic time are used to derive a metric first investigated by Friedmann which corresponds to ‘dust-like’ matter at zero pressure in terms of Einstein’s energy–momentum tensor. A comparison with the ‘general’ relativity then follows in Sec. 8.5. In Sec. 8.6 we unveil the hyperbolic geometry of relativity, and generalize it to multi-dimensional velocity space in Sec. 8.7. The Friedmann–Lobaschevsky space finds confirmation in Hubble’s discovery of the redshift in the spectra of galaxies: the greater the shift the more distant the galaxy. This, we show in Sec. 8.9, is a consequence of the hyperbolic measure of the velocity, and its relation to the logarithmic scale of time through Hubble’s law.

8.2 Doppler Generation of Möbius Transformations

It has long been known [Silberstein 14] that the relativistic composition of velocities obeys hyperbolic geometry. Robb [11] proposed to call the Euclidean measure of the velocity,

\[ \bar u = \tanh u, \tag{8.2.1} \]

the ‘rapidity,’ where $u$ is the relative velocity, having set the absolute constant $c = 1$. Here and below an overbar marks the Euclidean measure of a velocity. We can invert (8.2.1) to find the expression

\[ u = \frac{1}{2}\ln\left(\frac{1+\bar u}{1-\bar u}\right) = \frac{1}{2}\ln\{\bar u, 0\,|\,{-1}, 1\}, \tag{8.2.2} \]

for the hyperbolic measure of the relative velocity $u$, which unlike its Euclidean counterpart $\bar u$ is not confined to the closed interval $[-1, 1]$.

Recall from Sec. 2.2.4 that the curly brackets represent the cross-ratio of the distance between $x$ and $y$ in the closed interval $[a, b]$,

\[ \{a, b\,|\,x, y\} = \frac{(a-x)(b-y)}{(a-y)(b-x)}. \tag{8.2.3} \]

Exponentiating both sides of (8.2.2) shows that the exponential of the hyperbolic length is given by the longitudinal Doppler shift,

\[ e^{u} = \left(\frac{1+\bar u}{1-\bar u}\right)^{1/2} =: K. \tag{8.2.4} \]
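Relations (8.2.1), (8.2.2) and (8.2.4) are easy to exercise numerically. In the sketch below, which is only illustrative, `u_bar` stands for the Euclidean measure of the velocity and `u` for its hyperbolic measure; the function names are hypothetical.

```python
import math


def hyperbolic_measure(u_bar):
    """Eq. (8.2.2): u = (1/2) ln[(1 + u_bar)/(1 - u_bar)], valid for |u_bar| < 1."""
    return 0.5 * math.log((1 + u_bar) / (1 - u_bar))


def doppler_factor(u_bar):
    """Eq. (8.2.4): K = [(1 + u_bar)/(1 - u_bar)]^(1/2)."""
    return math.sqrt((1 + u_bar) / (1 - u_bar))


u_bar = 0.6                      # sample velocity in units of c
u = hyperbolic_measure(u_bar)    # coincides with atanh(u_bar)
K = doppler_factor(u_bar)        # coincides with exp(u)
print(u, K)
```

For the sample value 0.6 the Doppler factor is 2, so the hyperbolic measure is ln 2.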

We will now show that compounding Doppler shifts generates Möbius transforms, thereby relating geometric rigid motions with specific inertial frames.

And just as Poincaré found, these linear fractional functions can be used to define the concept of length under which the polygons of the respective tessellation are of equal size.


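Preliminarily, compounding the Doppler shift at velocity $\bar u$ with itself gives

\[ \frac{1+\bar u}{1-\bar u} = \left(\frac{1+\bar\lambda}{1-\bar\lambda}\right)^{1/2}, \]

where $\bar\lambda$ is the relative velocity in an inertial frame comprised of equal and opposite velocities

\[ \bar\lambda = \frac{2\bar u}{1+\bar u^2} = \tanh(2u). \tag{8.2.5} \]

Next, consider the cross-ratio,

\[ \{\bar v, \bar\lambda\,|\,{-1}, 1\} = \frac{1+\bar v}{1-\bar v}\cdot\frac{1-\bar\lambda}{1+\bar\lambda} = \frac{1+\bar v'}{1-\bar v'}. \]

The ‘new’ relative velocity $\bar v'$ is given in terms of the ‘old’ one by

\[ \bar v' = \frac{\bar v - \bar\lambda}{1 - \bar\lambda\bar v}, \tag{8.2.6} \]

which will be easily recognized as the relativistic subtraction law of the velocities.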
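The way Doppler factors compound into (8.2.5) and (8.2.6) can be illustrated with a few lines of arithmetic. This is a sketch with arbitrary sample velocities; the helper names are introduced here for illustration only.

```python
import math


def doppler_factor(u_bar):
    """K = [(1 + u_bar)/(1 - u_bar)]^(1/2) for a Euclidean velocity u_bar."""
    return math.sqrt((1 + u_bar) / (1 - u_bar))


def compose_with_itself(u_bar):
    """Eq. (8.2.5): lambda_bar = 2 u_bar / (1 + u_bar^2)."""
    return 2 * u_bar / (1 + u_bar**2)


def subtract(v_bar, lam_bar):
    """Eq. (8.2.6): relativistic subtraction of lam_bar from v_bar."""
    return (v_bar - lam_bar) / (1 - lam_bar * v_bar)


u_bar = 0.5
lam_bar = compose_with_itself(u_bar)      # 0.8 for u_bar = 0.5

# Compounding the shift at u_bar with itself gives the shift at lam_bar.
lhs_self = doppler_factor(u_bar) ** 2
rhs_self = doppler_factor(lam_bar)

# The cross-ratio {v, lam | -1, 1} equals (1 + v')/(1 - v').
v_bar = 0.9
v_prime = subtract(v_bar, lam_bar)
cross_ratio = (1 + v_bar) / (1 - v_bar) * (1 - lam_bar) / (1 + lam_bar)
print(lam_bar, v_prime, cross_ratio)
```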

Introducing the second equality in (8.2.5) into (8.2.6) gives the familiar Lorentz ‘rotation,’

\[ \bar v = \frac{\bar v'\cosh(2u) + \sinh(2u)}{\bar v'\sinh(2u) + \cosh(2u)}, \]

in terms of the homogeneous coordinates $\bar v$ and $\bar v'$. Multiplying (8.2.6) out gives

\[ \bar v'\bar v + \bar\lambda^{-1}(\bar v - \bar v') - 1 = 0. \tag{8.2.7} \]

Two cases are of interest: If the relative velocities are equal, $\bar v = \bar v'$, (8.2.7) becomes the simplest hyperbolic involution with conjugate points at $\pm 1$ [cf. Sec. 2.2.5.2]. This is not of physical interest; rather, what is of physical interest is when the velocities are equal and opposite, $\bar v = -\bar v'$. For then (8.2.7) reduces to the quadratic form,

\[ \bar v^2 - 2\bar\lambda^{-1}\bar v + 1 = 0, \]

which has two real roots

\[ \bar v_\pm = \frac{1 \pm \sqrt{1-\bar\lambda^2}}{\bar\lambda}. \]

Since the relative velocity $\bar\lambda$ is given by (8.2.5), the two roots are $\bar v_+ = 1/\bar u$ and $\bar v_- = \bar u$.

The negative of (8.2.6),

\[ M_\lambda(\bar v) = \frac{\bar v - \bar\lambda}{\bar\lambda\bar v - 1}, \tag{8.2.8} \]

is the unique Möbius automorphism which exchanges $\bar\lambda$ and 0, viz. $M_\lambda(\bar\lambda) = 0$ and $M_\lambda(0) = \bar\lambda$. This can be recognized as a special case of the property that $M_\lambda$ is involutory: $M_\lambda \circ M_\lambda = I$, the identity.
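The stated properties of (8.2.8) can be confirmed on sample numbers. The following sketch assumes real velocities only and uses hypothetical helper names; it checks the exchange of $\bar\lambda$ with 0, the involutory property, and the two fixed points.

```python
import math


def mobius(v_bar, lam_bar):
    """Eq. (8.2.8): M_lambda(v) = (v - lam)/(lam*v - 1)."""
    return (v_bar - lam_bar) / (lam_bar * v_bar - 1)


def fixed_points(lam_bar):
    """Roots of v^2 - 2 v/lam + 1 = 0, returned as (v_minus, v_plus)."""
    root = math.sqrt(1 - lam_bar**2)
    return (1 - root) / lam_bar, (1 + root) / lam_bar


u_bar = 0.5
lam_bar = 2 * u_bar / (1 + u_bar**2)          # Eq. (8.2.5)
v_minus, v_plus = fixed_points(lam_bar)       # expected: u_bar and 1/u_bar

exchanged = (mobius(lam_bar, lam_bar), mobius(0.0, lam_bar))
twice = mobius(mobius(0.3, lam_bar), lam_bar)  # involution: returns 0.3
print(v_minus, v_plus, exchanged, twice)
```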

A Möbius transform takes any triplet $(\bar v_1, \bar v_2, \bar v_3)$ into any other triplet. The demonstration of the existence of such a transform rests on showing that there is a Möbius transform for which $M(\bar v_1) = 0$, $M(\bar v_2) = 1$ and $M(\bar v_3) = \infty$. The values are not only the range of the conjugate points, $\bar u$ and $1/\bar u$, but, moreover, allow the Möbius transform to be written as the invariant cross-ratio

\[ M(\bar v) = \{\bar v, \bar v_2\,|\,\bar v_1, \bar v_3\} = \frac{\bar v - \bar v_1}{\bar v - \bar v_3}\cdot\frac{\bar v_2 - \bar v_3}{\bar v_2 - \bar v_1}. \]

Thus, by the construction of the cross-ratio, we have $M(\bar v_1) = 0$, $M(\bar v_2) = 1$, and $M(\bar v_3) = \infty$.

If we identify $M$ with (8.2.8) then the triplet $(0, 1, \infty)$ occurs when $\bar v_1 = \bar\lambda$, $\bar v_2 = -1$, and $\bar v_3 = 1/\bar\lambda$, since $M_\lambda(-1) = 1$. The latter would imply that $\bar v$ is unbounded, which contradicts special relativity. It is the symmetry of the hyperbolic involution which implies that if $\bar u$ is a solution, then so is $1/\bar u$. We must therefore show that the conjugate point $\bar v_+$ is a repulsive fixed point, implying that repeated mappings of $M_\lambda$ repel it away from $\bar v_+$.

The Möbius transform $F$ conjugating the normalized $M_\lambda$ to its standard form sends $\bar v_+$ to zero and $\bar v_-$ to infinity,

\[ F(\bar v) = \frac{\bar v - \bar v_+}{\bar v - \bar v_-}, \]

with

\[ F^{-1}(\bar v) = \frac{-\bar v_-\bar v + \bar v_+}{-\bar v + 1}. \]

Rather than calculate the composition $F \circ M_\lambda \circ F^{-1}(\bar v)$ to find the standard form of $M_\lambda$, it suffices to calculate it at a point: $F^{-1}(1) = \infty$, so that $M_\lambda(\infty) = 1/\bar\lambda$, and $F(1/\bar\lambda) = -1$. Thus,

\[ F \circ M_\lambda \circ F^{-1}(\bar v) = e^{i\pi}\bar v, \tag{8.2.9} \]

is the standard form of $M_\lambda$, and represents a rotation about the fixed point $\bar v_-$ which interchanges $\bar\lambda$ with $O$.

The Möbius transformation (8.2.8) is therefore elliptic. Several properties are immediate: $\bar v_+$ is the inverse of $\bar v_-$ with respect to the circle of inversion $C_1$ in Fig. 8.1. The fixed point $\bar v_-$ lies inside the unit disc and the other fixed point $\bar v_+$ lies outside except when $\bar v_- = \bar v_+$, and then they both lie on the circle of inversion. Fixed points closer and closer to the center send their conjugate points further and further away. Inversion takes circles orthogonal to the original one into themselves. Two such orthogonal circles, $C_1$ and $C_2$, are shown in Fig. 8.1. The line $\ell$ joining their centers has the fixed points diametrically opposite lying on the circumference of the circle $C_2$ whose center is $O_2 = \bar\lambda^{-1}$. The arcs of $C_2$ lying in $C_1$, and orthogonal to it at the points of contact, correspond to geodesics in the Poincaré model.

Fig. 8.1. Circles of inversion. The circle $C_1$ cuts the circle of inversion $C_2$ at right angles at P and Q. A line from the origin of $C_1$ intersects $C_2$ at two points: $\bar v_-$ and its inverse $\bar v_+$, which are fixed points of the Lorentz transform. The relative velocity $\bar\lambda$ lies at an equal hyperbolic distance from $\bar v_+$ that $\bar v_-$ lies from the origin. A hyperbolic rotation of $\pi$ occurs about $\bar v_-$ which exchanges the state of uniform velocity $\bar\lambda$ and the state of rest at $O_1$.

Fig. 8.2. A more detailed description of the circle containing the fixed points $\bar v_1$ and $\bar\lambda$ which are uniform states of motion at relative velocities $\bar u$ and $2\bar u/(1 + \bar u^2)$. The Möbius automorphism of the disc may be considered as a composition of two hyperbolic rotations: a rotation of $\pi$ about the hyperbolic midpoint between the origin and $\bar\lambda$, and a rotation about the origin. The maximum angle $\varphi$ is determined by the angle of parallelism, $\Pi$, beyond which no motion can occur.

The intersection of the arc PQ with the line $\ell$, shown in Fig. 8.2, occurs at the hyperbolic midpoint, $\bar v_- = \bar u$, between 0 and $\bar\lambda$. This is a direct consequence of the definition of hyperbolic distances: Whereas the hyperbolic distance from 0 to $\bar\lambda$ is

\[ h(0, \bar\lambda) := \ln\{-1, 1\,|\,\bar\lambda, 0\} = \ln\left(\frac{1+\bar\lambda}{1-\bar\lambda}\right) = 2\ln\left(\frac{1+\bar u}{1-\bar u}\right), \]

the distance from 0 to $\bar v_-$ is half as great

\[ h(0, \bar u) = \ln\{-1, 1\,|\,\bar u, 0\} = \ln\left(\frac{1+\bar u}{1-\bar u}\right). \]

The arc PQ is itself the perpendicular bisector of $O\bar\lambda$. The Möbius transformation (8.2.8) is the composition of two hyperbolic reflections in perpendicular lines through $\bar v_-$. Moreover, it is the unique Möbius automorphism that exchanges $O$ and $\bar\lambda$ by a rotation through $\pi$ about the hyperbolic midpoint $\bar v_-$ of the hyperbolic line segment.

Expression (8.2.5) is the isomorphism that takes the Poincaré model to the Klein model. Both are hyperbolic models of the unit disc, but whereas the Poincaré model is conformal the Klein model is not. The price to be paid is that geodesics in the Poincaré model are arcs of circles that cut the unit disc orthogonally, while the geodesics in the Klein model are straight lines. The isomorphism $\bar\lambda$ maps the arc with ends P and Q onto the open chord with the same endpoints. Since $\bar v_-$ is the point where line $\ell$ cuts the circumference of the orthogonal circle $C_2$, $\bar\lambda(\bar v_-) = \bar\lambda(\bar u)$ is the point at which the line $\ell$ intersects the chord PQ. But this is precisely the definition (8.2.5) of $\bar\lambda$.

When the orthogonal arcs intersect, there occurs a hyperbolic rotation. At the limit when they are asymptotic, there is a limit rotation, while when they become ultraparallel there is a hyperbolic translation. However, hyperbolic translations in hyperbolic velocity space would contradict the fact that the limiting velocity is that of light. Consider the right triangle $O\bar\lambda Q$ formed by the intersection of lines $\ell$ and $\ell_c$ in Fig. 8.2. The angle at the origin, $\Pi$, is the limiting angle, or the angle of parallelism that we introduced in Sec. 2.5. For angles less than $\varphi$, the angle of parallelism will not be reached. Consequently, $\ell_c$ is the limiting line for hyperbolic rotations. According to the right triangle, $\cos\Pi = \bar\lambda$, while according to the Bolyai–Lobaschevsky formula for the radian measure of the angle of parallelism,

\[ \Pi(\lambda) = 2\tan^{-1} e^{-\lambda}. \tag{8.2.10} \]

The angle $\Pi(\lambda)$ is a sole function of the hyperbolic distance $\lambda$. The closer it is to $\pi/2$, the less pronounced the hyperbolic distortions become. Thus,

\[ \cos\Pi(\lambda) = \tanh\lambda = \frac{2\bar v_-}{1+\bar v_-^2} \]

is just the isomorphism of the Poincaré model onto the Klein model. Since $\Pi$ must necessarily be acute, hyperbolic translations are ruled out, and the limiting rotation occurs for asymptotes.
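The link between the Bolyai–Lobaschevsky formula (8.2.10) and the Poincaré-to-Klein map (8.2.5) can be checked for a sample velocity. The sketch below takes the hyperbolic distance to be $\lambda = 2u$ with $u = \tanh^{-1}\bar u$; the function names are assumptions of this example.

```python
import math


def angle_of_parallelism(distance):
    """Eq. (8.2.10): Pi = 2 arctan(exp(-distance)) for a hyperbolic distance."""
    return 2 * math.atan(math.exp(-distance))


def poincare_to_klein(v_bar):
    """Eq. (8.2.5) read as a map of the unit disc: v -> 2v/(1 + v^2)."""
    return 2 * v_bar / (1 + v_bar**2)


u_bar = 0.5                       # Euclidean velocity, the fixed point v_minus
lam = 2 * math.atanh(u_bar)       # hyperbolic distance from 0 to lambda_bar
Pi = angle_of_parallelism(lam)
print(math.degrees(Pi), math.cos(Pi), poincare_to_klein(u_bar))
```

For this sample value both sides give 0.8, and the angle of parallelism is acute, about 36.9 degrees.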

8.3 Geometry of Doppler and Aberration Phenomena

Fig. 8.3. Extension of hyperbolic trigonometry to general triangles.

Consider the triangle in Fig. 8.3 with sides $\alpha$ and $\delta$, and base $u$. The altitude $h$ cuts the base into two parts $\varepsilon$ and $u - \varepsilon$. The angles formed from the sides and the base are $\vartheta$ and $\varphi$. The sines of these angles are

\[ \sin\vartheta = \bar h/\bar\alpha = \frac{\tanh h\,\operatorname{sech}\varepsilon}{\tanh\alpha} = \frac{\sinh h}{\sinh\alpha}, \]

and

\[ \sin\varphi = \bar h/\bar\delta = \frac{\tanh h\,\operatorname{sech}(u-\varepsilon)}{\tanh\delta} = \frac{\sinh h}{\sinh\delta}, \]

since deformation only occurs normal to the direction of motion, i.e. $u$. Introducing the hyperbolic Pythagorean theorem of the first triangle,

\[ \cosh\alpha = \cosh\varepsilon\cosh h, \]

into the hyperbolic Pythagorean theorem for the second triangle,

\[ \cosh\delta = \cosh h\cosh(u-\varepsilon) = \cosh h\,(\cosh u\cosh\varepsilon - \sinh\varepsilon\sinh u), \tag{8.3.1} \]

results in

\[ \cosh\delta = \cosh\alpha\cosh u - \tanh\varepsilon\sinh u\cosh\alpha. \tag{8.3.2} \]

Finally, introducing $\cos\vartheta = \tanh\varepsilon/\tanh\alpha$ gives the hyperbolic law of cosines

\[ \cosh\delta = \cosh u\cosh\alpha - \sinh\alpha\sinh u\cos\vartheta. \tag{8.3.3} \]

In an exactly analogous way we find

\[ \cosh\alpha = \cosh\delta\cosh u - \sinh\delta\sinh u\cos\varphi. \tag{8.3.4} \]

Now, introducing $\cos\vartheta = \tanh\varepsilon/\tanh\alpha$ into $\cos\varphi = \tanh(u-\varepsilon)/\tanh\delta$ results in

\[ \tanh\delta\cos\varphi = \frac{\tanh u - \tanh\alpha\cos\vartheta}{1 - \tanh u\tanh\alpha\cos\vartheta}. \tag{8.3.5} \]

But, this should be a velocity composition law [cf. Fig. 8.4 where the triangle has to be fitted on a surface of a pseudosphere rather than the flat Euclidean plane], and it will become one when we introduce the velocity components $\bar u_1 = \bar\alpha = \tanh\alpha$, and $\bar u_2 = \bar\delta = \tanh\delta$.

Fig. 8.4. Hyperbolic velocity triangle.

Inserting these definitions into (8.3.5) gives

\[ \bar u_2\cos\varphi = \frac{\bar u - \bar u_1\cos\vartheta}{1 - \bar u\bar u_1\cos\vartheta}. \tag{8.3.6} \]

Equation (8.3.6) is the equation of aberration in the direction of the motion. In the limit as $\alpha, \delta \to \infty$, $\bar u_1, \bar u_2 \to 1$, and they become light signals.

The hyperbolic cosine law, (8.3.3), can be written as

\[ \frac{\sinh\delta}{\sinh\alpha} = \frac{\tanh\delta}{\tanh\alpha}\cosh u\,(1 - \tanh\alpha\tanh u\cos\vartheta) = \frac{\sin\vartheta}{\sin\varphi}, \tag{8.3.7} \]

which is the hyperbolic law of sines,

\[ \frac{\sin\vartheta}{\sinh\delta} = \frac{\sin\varphi}{\sinh\alpha}, \tag{8.3.8} \]

showing that sides can be expressed in terms of angles in hyperbolic geometry [Buseman & Kelly 53]. The hyperbolic law of sines, (8.3.8), happens also to be the equation of aberration normal to the motion,

\[ \bar u_2\sin\varphi = \frac{\bar u_1\sin\vartheta\,\sqrt{1-\bar u^2}}{1 - \bar u\bar u_1\cos\vartheta}. \tag{8.3.9} \]
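The chain from the law of cosines to the two aberration equations can be verified on a sample triangle. The sketch below solves a hyperbolic triangle from two sides and the included angle; the side lengths and the helper name `solve_triangle` are arbitrary choices made for illustration.

```python
import math


def solve_triangle(u, alpha, theta):
    """Given hyperbolic sides u, alpha and included angle theta, return (delta, phi)."""
    # Law of cosines (8.3.3) gives the third side delta.
    cosh_delta = (math.cosh(u) * math.cosh(alpha)
                  - math.sinh(alpha) * math.sinh(u) * math.cos(theta))
    delta = math.acosh(cosh_delta)
    # Law of cosines (8.3.4) gives the angle phi between delta and the base u.
    cos_phi = ((math.cosh(delta) * math.cosh(u) - math.cosh(alpha))
               / (math.sinh(delta) * math.sinh(u)))
    return delta, math.acos(cos_phi)


u, alpha, theta = 1.0, 0.8, 0.7          # arbitrary sample triangle
delta, phi = solve_triangle(u, alpha, theta)
u_bar, u1_bar, u2_bar = math.tanh(u), math.tanh(alpha), math.tanh(delta)
denom = 1 - u_bar * u1_bar * math.cos(theta)

# Residuals of (8.3.6), (8.3.9) and the law of sines (8.3.8).
res_parallel = u2_bar * math.cos(phi) - (u_bar - u1_bar * math.cos(theta)) / denom
res_normal = (u2_bar * math.sin(phi)
              - u1_bar * math.sin(theta) * math.sqrt(1 - u_bar**2) / denom)
res_sines = math.sin(theta) / math.sinh(delta) - math.sin(phi) / math.sinh(alpha)
print(res_parallel, res_normal, res_sines)
```

All three residuals vanish to rounding error, which is what the derivation above leads one to expect.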

Taking the differential of (8.3.6),

\[ -\sin\varphi\,d\varphi = \frac{\bar u_1}{\bar u_2}\,\frac{\gamma^{-2}\sin\vartheta}{(1 - \bar u\bar u_1\cos\vartheta)^2}\,d\vartheta, \tag{8.3.10} \]

where $\gamma^{-2} = 1 - \bar u^2$, and introducing (8.3.9) result in

\[ d\varphi = -\frac{\sqrt{1-\bar u^2}}{1 - \bar u\tanh\varepsilon}\,d\vartheta, \tag{8.3.11} \]

where we used $\tanh\alpha\cdot\cos\vartheta = \tanh\varepsilon$. Dividing both sides by the time increment gives the Doppler shift as

\[ \nu = K\nu_0, \tag{8.3.12} \]

where

\[ K = \frac{\sqrt{1-\bar u^2}}{1 - \bar u\bar u_1\cos\vartheta} \tag{8.3.13} \]

is the Doppler factor. A moving object emits a signal at frequency $\nu_0 = d\vartheta/dt$ with velocity $\bar u_1$, and $\nu = -d\varphi/dt$ is the frequency that the observer at rest registers.

If the signal is emitted at the velocity of light, $\bar u_1 = 1$, implying that $\alpha \to \infty$, and $\vartheta \to \pi/2$, or equivalently $\varepsilon \to 0$, it follows from (8.3.3) that $\delta \to \infty$ such that the difference $\delta - \alpha$ remains finite

\[ e^{(\delta-\alpha)} = \cosh u = \frac{1}{\sqrt{1-\bar u^2}}. \tag{8.3.14} \]

The Doppler shift (8.3.12) then becomes the exponential Doppler shift,

\[ \nu = e^{-(\delta-\alpha)}\nu_0. \tag{8.3.15} \]

An exponential law for the longitudinal Doppler shift is known [Prohovnik 67], but not for the transverse redshift. The exponential law would imply that the frequency and energy of light received from receding galaxies depend on an exponential factor which would decrease the luminosity and explain things like Olbers’ paradox, where the decrease in the intensity of the source (inverse square of the distance) is exactly balanced by the increase in the number of sources (square of the distance).

The linear approximation,

\[ \frac{\Delta\nu}{\nu_0} = -(\delta - \alpha) = -\frac{1}{2}\bar u^2, \tag{8.3.16} \]

with $\Delta\nu = \nu - \nu_0$, equates the redshift with the distance, $\delta - \alpha$, and with the second-order relative velocity, obtained by expanding (8.3.14) to first order. If $\bar u$ is the Earth’s orbital velocity ($3 \times 10^4$ m/sec), (8.3.16) would predict a frequency shift of $(\delta - \alpha) = 5 \times 10^{-9}$.

Ordinarily, one writes the Doppler factor (8.3.13) with $\bar u_1 = 1$ without realizing that it requires the limit $\alpha \to \infty$, which, in turn, requires that it be normal to the motion. That is, the Doppler shift (8.3.13) is

\[ K = (\cosh u - \sinh u\cos\vartheta)^{-1}. \]

In the line of sight, we get the longitudinal shift $K = e^{u}$, while normal to our sight it becomes (8.3.15). Thus, there is a shift even in the normal direction. However, for $\vartheta \neq 0$, the shift will not be exponential, and consequently the ‘distance’ will not be given by the Lobaschevsky straight line. In this sense, the Lobaschevsky straight line is the ‘shortest distance.’
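The two limiting cases of the Doppler factor, and the size of the second-order shift for the Earth's orbital motion, can be evaluated directly. In this sketch `u` is the hyperbolic measure of the velocity and the orbital speed is the round figure quoted in the text; the function name is hypothetical.

```python
import math

C = 299_792_458.0        # speed of light, m/s
V_EARTH = 3.0e4          # Earth's orbital speed as quoted in the text, m/s


def doppler_factor(u, theta):
    """K = 1/(cosh u - sinh u cos theta) for a light signal (u1_bar = 1)."""
    return 1.0 / (math.cosh(u) - math.sinh(u) * math.cos(theta))


u = math.atanh(0.6)                              # sample hyperbolic velocity
K_longitudinal = doppler_factor(u, 0.0)          # expected: exp(u)
K_transverse = doppler_factor(u, math.pi / 2)    # expected: 1/cosh(u)

u_bar_earth = V_EARTH / C
second_order_shift = 0.5 * u_bar_earth**2        # delta - alpha in (8.3.16)
print(K_longitudinal, K_transverse, second_order_shift)
```

The second-order shift comes out close to the $5 \times 10^{-9}$ stated above.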

Aberration equations (8.3.6) and (8.3.9) can be combined in the half-angle formula, $\tan\varphi/2 = \sin\varphi/(1 + \cos\varphi)$, to read

\[ \tan\varphi/2 = \left(\frac{1-\bar u}{1+\bar u}\right)^{1/2}\cot\vartheta/2 = e^{-u}\cot\vartheta/2. \tag{8.3.17} \]

Equation (8.3.17) will give the well-known expression for the angle of parallelism: The ratio of concentric limiting arcs between two radii is the exponential distance between the arcs divided by the radius of curvature. Something very strange occurs at $\varphi = \pi/2$. For we then obtain from (8.3.17),

\[ \tan\vartheta/2 = e^{-u}, \tag{8.3.18} \]

showing that $\vartheta$ is the angle of parallelism. The angle becomes a sole function of the hyperbolic distance, $u$. An observer in the frame in which the object is at rest will see it rotated by an amount $\sin\vartheta = \sqrt{1-\bar u^2}$, exactly equal to the FitzGerald–Lorentz contraction, as we will see in Sec. 10.5. Since the angle of parallelism provides a unique link between circular and hyperbolic functions, a rotation and contraction can only be related at the angle of parallelism, if the geometry is indeed hyperbolic. The equivalence of rotations and contractions was first discussed by Terrell [59], but his analysis cannot be extended to the situation where the angle is not acute; the angle of parallelism must always be acute, tending to a right angle only in the limit of Euclidean geometry, as the velocity of light, $c \to \infty$.
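The half-angle form (8.3.17) can be compared against the two aberration equations for light signals ($\bar u_1 = \bar u_2 = 1$). The sketch below uses an arbitrary sample velocity and a hypothetical helper `aberrated_angle`; it also checks the special case $\varphi = \pi/2$.

```python
import math


def aberrated_angle(u_bar, theta):
    """Angle phi from (8.3.6) and (8.3.9) with u1_bar = u2_bar = 1."""
    denom = 1 - u_bar * math.cos(theta)
    cos_phi = (u_bar - math.cos(theta)) / denom
    sin_phi = math.sin(theta) * math.sqrt(1 - u_bar**2) / denom
    return math.atan2(sin_phi, cos_phi)


u_bar = 0.6
u = math.atanh(u_bar)                      # hyperbolic measure, exp(-u) = 0.5

# General angle: tan(phi/2) should equal exp(-u) * cot(theta/2).
theta = 1.0
phi = aberrated_angle(u_bar, theta)
half_angle_residual = math.tan(phi / 2) - math.exp(-u) / math.tan(theta / 2)

# Angle of parallelism (8.3.18): theta_par is carried into phi = pi/2.
theta_par = 2 * math.atan(math.exp(-u))
phi_par = aberrated_angle(u_bar, theta_par)
print(half_angle_residual, phi_par, math.sin(theta_par))
```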


8.4 Kinematics: The Radar Method of Signaling

With the realization that there is no such thing as a rigid body in relativity, Whitrow [33] went on to develop a radar method, or what he called a ‘signal-function method,’ where light signals are transmitted between different inertial frames and non-inertial ones. It was afterward referred to as the ‘K-calculus technique’ by Bondi [60] without a hint of where the original idea came from, although Milne [48] had used it extensively in his research before him. In fact, the original idea and the definition of rapidity can be traced all the way back to Robb [11] in 1911.

8.4.1 Constant relative velocity: Geometric-arithmetic mean inequality

As in kinematic relativity [Milne 48], time measurements are much more fundamental than distance measurements, and the latter are deducible from the former. In other words, distances are measured by the lapse of time. This has been criticized by Born [43] as being impractical since no one has ever received light signals from nebulae beyond the horizon. However, it is far superior to the usual method in general relativity that uses a metric, or rigid ruler, to measure distance. So what was discarded in special relativity made its comeback in general relativity.

The most ideal situation would be to introduce into the fabric of the theory distances measured in brightness, or the difference between apparent and absolute brightness. However, no one has ever succeeded in doing so and we will base all distance measurements on the so-called radar method, where a light signal is sent out and reflected at a later time. All that is needed is that at each reflection a certain retardation factor, $K$, comes in, determined by the clock in the frame that is sending out the light pulse.

Consider two observers, A and B, where observer A sends out a light signal in his time $t_1^A$ which is received by observer B in his time $t_2^B$. In terms of A’s time, B will receive it in time $Kt_1^A$, where $K$ is some constant factor that is a function only of the relative velocity of the two inertial frames. The signal that arrives at B in time $t_2^B$ will be reflected at some later time. The reflected signal leaves B in time $t_3^B$ which arrives at observer A in time $t_4^A$, where $t_4^A = Kt_3^B$. From this it is apparent that both observers will call the reflection time,

\[ t_r = \sqrt{t_1^A t_4^A} = \sqrt{t_2^B t_3^B}, \tag{8.4.1} \]

the geometric mean of the time intervals, and it is an invariant indepen-dent of the frame. The appearance of the geometric mean, as opposed tothe arithmetic mean, implies implicitly the existence of another time scale,namely, a logarithmic one [cf. Eq. (8.4.30) below]. So the ‘signal-functionmethod’ of Whitrow singles out the geometric mean as the time of reflec-tion.
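The invariance of the reflection time (8.4.1) can be checked numerically. The following is only a sketch: it assumes units with c = 1, takes K from (8.4.7), and the function names `k_factor` and `radar_times` are illustrative, not taken from the text.

```python
import math

def k_factor(u):
    """Retardation (Doppler) factor K of (8.4.7) for relative velocity u, with c = 1."""
    return math.sqrt((1.0 + u) / (1.0 - u))

def radar_times(t_a1, u, delay=0.0):
    """Return (t_a1, t_b2, t_b3, t_a4) for a signal sent by A at time t_a1.

    B receives the signal at t_b2 = K * t_a1, waits `delay` on his own clock,
    and A receives the echo at t_a4 = K * t_b3.
    """
    k = k_factor(u)
    t_b2 = k * t_a1
    t_b3 = t_b2 + delay
    t_a4 = k * t_b3
    return t_a1, t_b2, t_b3, t_a4

t_a1, t_b2, t_b3, t_a4 = radar_times(t_a1=1.0, u=0.6, delay=0.5)
reflection_time_a = math.sqrt(t_a1 * t_a4)   # geometric mean on A's clock
reflection_time_b = math.sqrt(t_b2 * t_b3)   # geometric mean on B's clock
print(reflection_time_a, reflection_time_b)  # the two values agree
```

The agreement does not depend on the chosen delay, since the factor K cancels in the product of A's two times.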

The reading shown by a synchronous (stationary) clock at the event should be midway between the observer’s time, $t^A_1$, of sending out the signal, and the time he receives its reflection, $t^A_4$,

\[ t = \tfrac{1}{2}(t^A_1 + t^A_4). \tag{8.4.2} \]

This was Einstein’s choice, but it is by no means the only choice. The measure of the space interval is the difference between the ‘average’ for the light-signaling process, (8.4.2), and the time the signal was sent out,

\[ r = t - t^A_1 = \tfrac{1}{2}(t^A_4 - t^A_1). \tag{8.4.3} \]

In terms of B’s coordinates, he will measure a time interval

\[ t' = \tfrac{1}{2}(t^B_2 + t^B_3), \tag{8.4.4} \]

and a space interval

\[ r' = \tfrac{1}{2}(t^B_3 - t^B_2), \tag{8.4.5} \]

separating the event from where he is located. The two systems of inertial coordinates $(t, r)$ and $(t', r')$ are related by

\[ t^B_2 = t' - r' = K(t - r) = Kt^A_1, \tag{8.4.6a} \]

\[ t^B_3 = t' + r' = K^{-1}(t + r) = K^{-1}t^A_4. \tag{8.4.6b} \]

The time $t^B_2$ is the time on B’s clock when the signal is received, and $t^B_3$ is the moment on B’s clock when it is sent back.

If B reflects the signal instantaneously so that $t^B_2 = t^B_3$ then from the second equalities in (8.4.6a) and (8.4.6b) it follows that

\[ K = \left(\frac{1 + r/t}{1 - r/t}\right)^{1/2} = \left(\frac{1 + u}{1 - u}\right)^{1/2}, \tag{8.4.7} \]

upon taking the positive square root, and setting $r/t = u$.

Now suppose we place ourselves at the origin of B’s frame, $r' = 0$. Then summing (8.4.6a) and (8.4.6b) gives

\[ t = \tfrac{1}{2}(t^A_1 + t^A_4) = \tfrac{1}{2}(K + K^{-1})\,t' = \frac{t'}{\sqrt{1 - u^2}}, \tag{8.4.8} \]

showing that a clock traveling at a uniform velocity goes slower than one at rest. This expression for time dilatation only holds for frames moving at a constant velocity u [cf. Eq. (8.4.27) below].
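As a small numerical illustration of (8.4.8), the arithmetic mean of A's emission and reception times can be computed from K alone and compared with the usual dilatation factor. This is a sketch with c = 1; the helper names are hypothetical.

```python
import math

def k_factor(u):
    """K of (8.4.7) for relative velocity u, with c = 1."""
    return math.sqrt((1.0 + u) / (1.0 - u))

def dilated_time(t_prime, u):
    """Arithmetic-mean time t that A assigns to an event at B's origin (r' = 0)."""
    k = k_factor(u)
    t_a1 = t_prime / k      # emission time on A's clock, from (8.4.6a)
    t_a4 = k * t_prime      # reception time on A's clock, from (8.4.6b)
    return 0.5 * (t_a1 + t_a4)

u, t_prime = 0.8, 3.0
t = dilated_time(t_prime, u)
print(t, t_prime / math.sqrt(1.0 - u * u))   # both expressions of (8.4.8)
```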

In terms of the longitudinal Doppler shift, (8.2.4), the two systems of coordinates are related by

\[ t = \tfrac{1}{2}(t^A_1 + t^A_4) = \tfrac{1}{2}(Kt^B_3 + K^{-1}t^B_2) = \tfrac{1}{2}\{(K + K^{-1})t' + (K - K^{-1})r'\} = t'\cosh\bar{u} + r'\sinh\bar{u}, \tag{8.4.9} \]

and

\[ r = \tfrac{1}{2}(t^A_4 - t^A_1) = \tfrac{1}{2}(Kt^B_3 - K^{-1}t^B_2) = \tfrac{1}{2}\{(K - K^{-1})t' + (K + K^{-1})r'\} = t'\sinh\bar{u} + r'\cosh\bar{u}. \tag{8.4.10} \]

These are none other than the well-known Lorentz transformations. Taking their differentials and forming the difference of their squares show that the hyperbolic distance,

\[ dt^2 - dr^2 = dt'^2 - dr'^2, \tag{8.4.11} \]

is invariant.

Now, we ask what happens when the light signal is reflected when it arrives at B. In this case, $t^B_2 = t^B_3 \equiv t_r$ is the time of reflection, and it occurs at the same point in space for B so that $r' = 0$. The Lorentz transformations, (8.4.9) and (8.4.10), reduce to

\[ t = t_r\cosh\bar{u}, \tag{8.4.12a} \]

\[ r = t_r\sinh\bar{u}. \tag{8.4.12b} \]

Equation (8.4.12a) is a statement of the arithmetic-geometric mean inequality: The arithmetic mean t can never be inferior to the geometric mean $t_r$ since $\cosh\bar{u} \ge 1$. Adding and subtracting (8.4.12a) and (8.4.12b) give

\[ t + r = Kt_r, \tag{8.4.13a} \]

\[ t - r = K^{-1}t_r. \tag{8.4.13b} \]

Taking the differentials of (8.4.13a) and (8.4.13b), and then the product of the two, without requiring that K be constant, result in

\[ dt^2 - dr^2 = dt_r^2 - t_r^2\,d\bar{u}^2. \tag{8.4.14} \]

A space-time interval has been transformed into a velocity space-time interval.
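The route from the radar relations (8.4.6a) and (8.4.6b) to the Lorentz transformations (8.4.9) and (8.4.10) can be traced numerically. The sketch below assumes c = 1 and that the hyperbolic velocity is the inverse hyperbolic tangent of u; the function names are illustrative only.

```python
import math

def lorentz_via_k(t_prime, r_prime, u):
    """Map B's radar coordinates (t', r') to A's (t, r) using only the factor K."""
    k = math.sqrt((1.0 + u) / (1.0 - u))
    t_b2 = t_prime - r_prime          # reception time on B's clock
    t_b3 = t_prime + r_prime          # re-emission time on B's clock
    t_a1 = t_b2 / k                   # (8.4.6a) solved for A's emission time
    t_a4 = k * t_b3                   # (8.4.6b) solved for A's reception time
    return 0.5 * (t_a1 + t_a4), 0.5 * (t_a4 - t_a1)

def lorentz_via_rapidity(t_prime, r_prime, u):
    """The same map written with hyperbolic functions, as in (8.4.9) and (8.4.10)."""
    ubar = math.atanh(u)              # assumed relation between u and its hyperbolic measure
    t = t_prime * math.cosh(ubar) + r_prime * math.sinh(ubar)
    r = t_prime * math.sinh(ubar) + r_prime * math.cosh(ubar)
    return t, r

t, r = lorentz_via_k(2.0, 0.5, 0.6)
print((t, r), lorentz_via_rapidity(2.0, 0.5, 0.6))   # the two routes agree
print(t * t - r * r, 2.0 ** 2 - 0.5 ** 2)            # the hyperbolic distance is unchanged
```

Setting `r_prime = 0` reproduces (8.4.12a), in which the arithmetic mean t exceeds the geometric mean by the factor cosh of the hyperbolic velocity.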

8.4.2 Constant relative acceleration

There is general consensus [Møller 52] that acceleration has no effect on the rate of a clock, and that the expression for time dilatation (8.4.8) can be used in its infinitesimal form whether or not u is constant. However, according to Einstein’s equivalence principle uniform acceleration is equivalent to, or indistinguishable from, a uniform gravitational field. It has been shown from the gravitational redshift that the latter, indeed, has an effect on the rate of a clock. This contradiction has been clearly pointed out by Whitrow [80], who shows that the time dilatation is greater when the velocity is varying with time than when it is constant. We convert his inequality into an equality.

Consider two observers receding from one another with an average velocity r/t. Their identical clocks were synchronized at $t^A = t^B = 0$ when they were at the same point. At time $t^A_1$, A emits a signal which is picked up and immediately reflected by B at time $t^B_r$, and then received back at A at time $t^A_3$. The space interval is

\[ t - t^A_1 = t^A_3 - t = \tfrac{1}{2}(t^A_3 - t^A_1) = r. \]

From this it follows that

\[ t^A_1 = t - r, \tag{8.4.15a} \]

\[ t^A_3 = t + r, \tag{8.4.15b} \]

and consequently,

\[ t^A_r = \sqrt{1 - (r/t)^2}\;t. \tag{8.4.15c} \]

Since the Doppler shift is now given by

\[ K = \left(\frac{1 + u}{1 - u}\right)^{1/2} = \frac{1 + r/t}{1 - r/t} \tag{8.4.16} \]

and not by (8.4.7), we can express (8.4.15a) and (8.4.15b) as $t^A_3 = Kt^A_1$, or

\[ t^B_r = K^{1/2}t^A_1, \tag{8.4.17a} \]

\[ t^A_3 = K^{1/2}t^B_r. \tag{8.4.17b} \]

But, from (8.4.15c) it is apparent that $t^B_r = t^A_r$ so that the clocks remain synchronized, and we can drop the superscripts on the time.

Expressing r and t in terms of $t_1$ and $t_3$ we find [Page 36]

\[ g = \frac{2r}{t_r^2} = \frac{1}{t_1} - \frac{1}{t_3}, \tag{8.4.18} \]

where g is the uniform acceleration due to gravity. Employing (8.4.17a) and (8.4.17b) we write (8.4.18) as

\[ g = \frac{K^{1/2} - K^{-1/2}}{t_r}. \tag{8.4.19} \]

Equation (8.4.18) enables us to express the Doppler shift, K, in terms of the ratio of the time the signal was received back to that when it was sent out

\[ t_3/t_1 = K. \tag{8.4.20} \]

Taking the logarithms of both sides of (8.4.20), and then differentiating with respect to t, give

\[ \frac{d\ln t_3}{dt} - \frac{d\ln t_1}{dt} = \frac{1 + u}{t_3} - \frac{1 - u}{t_1} = \frac{1}{1 - u^2}\frac{du}{dt}, \]

where we have used the differentials of (8.4.15a) and (8.4.15b). Dividing both sides by $\sqrt{1 - u^2}$ results in

\[ \frac{K}{t_3} - \frac{K^{-1}}{t_1} = \frac{1}{(1 - u^2)^{3/2}}\frac{du}{dt} = g, \tag{8.4.21} \]

which is identical to (8.4.18). If the ratio (8.4.20) had been proportional to the square of the Doppler shift, we would have found that (8.4.21) vanishes.

If we consider $t'$ to be the time of reflection on B’s clock, we can write (8.4.18) as

\[ \frac{1}{t_1} = \frac{1}{t'} + g/2, \tag{8.4.22a} \]

\[ \frac{1}{t_3} = \frac{1}{t'} - g/2, \tag{8.4.22b} \]

which can easily be seen by subtraction. Adding (8.4.22a) and (8.4.22b) instead gives

\[ \frac{1}{t'} = \frac{1}{2}\left(\frac{1}{t_1} + \frac{1}{t_3}\right). \tag{8.4.23} \]

The time of reflection on B’s clock is the harmonic mean for uniform acceleration, in contrast with the geometric mean as the time of reflection for uniform motion.

Writing $t_1 = t - r$ and $t_3 = t + r$ in (8.4.23) clearly shows that the space-time interval is not invariant

\[ t^2 - r^2 = t't = t_r^2, \]

unless we require the reflection times to be the same, meaning that clocks A and B are synchronous [Prohovnik 67].

Multiplying the left- and right-hand sides of (8.4.22a) and (8.4.22b), rearranging, taking the square roots, and using $gt = \sinh\bar{u}$, give

\[ t' = \operatorname{sech}(\bar{u}/2)\,t_r = \operatorname{sech}^2(\bar{u}/2)\,t, \tag{8.4.24} \]

where the second equality follows from (8.4.15c). The equalities in (8.4.24) express quantitatively that the harmonic mean is always smaller than the geometric mean which is smaller than the arithmetic mean, because the equality of times can never apply. The first equality in Eq. (8.4.24) states physically that the time of reflection on B’s clock is always less than on A’s clock.
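The chain of means in (8.4.24) is easy to verify for sample values. The sketch below assumes that the hyperbolic velocity of this subsection is the logarithm of K = t3/t1, which is what (8.4.20) and (8.4.26) together suggest; `accelerated_radar` is a hypothetical helper.

```python
import math

def accelerated_radar(t, r):
    """Quantities of Sec. 8.4.2 from A's arithmetic-mean time t and radar distance r."""
    t1, t3 = t - r, t + r                    # (8.4.15a), (8.4.15b)
    t_r = math.sqrt(t1 * t3)                 # geometric mean, (8.4.15c)
    t_prime = 2.0 / (1.0 / t1 + 1.0 / t3)    # harmonic mean, (8.4.23)
    g = 1.0 / t1 - 1.0 / t3                  # (8.4.18)
    ubar = math.log(t3 / t1)                 # ln K with K = t3/t1 (assumed reading of (8.4.20))
    return t_r, t_prime, g, ubar

t, r = 5.0, 3.0
t_r, t_prime, g, ubar = accelerated_radar(t, r)
sech_half = 1.0 / math.cosh(ubar / 2.0)
print(t_prime, sech_half * t_r, sech_half ** 2 * t)   # three equal values, (8.4.24)
print(t_prime <= t_r <= t)                            # harmonic <= geometric <= arithmetic
print(g * t, math.sinh(ubar))                         # the relation g t = sinh(ubar)
print(r / t, math.tanh(ubar / 2.0))                   # (8.4.26)
```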

Taking the ratio of (8.4.22a) and (8.4.22b) we get

\[ K = \frac{t_3}{t_1} = \frac{1 + gt'/2}{1 - gt'/2}. \tag{8.4.25} \]

Instead of (8.4.7), we now have

\[ \tfrac{1}{2}gt' = r/t = \tanh(\bar{u}/2). \tag{8.4.26} \]

Differentiating with respect to the arithmetic time average gives

\[ g\,dt' = \operatorname{sech}^2(\bar{u}/2)\,\frac{d\bar{u}}{dt}\,dt = \frac{\operatorname{sech}^2(\bar{u}/2)}{1 - u^2}\,\frac{du}{dt}\,dt \]

and now using (8.4.21), gives

\[ dt = \frac{\cosh^2(\bar{u}/2)\,dt'}{\sqrt{1 - u^2}}. \tag{8.4.27} \]

Comparing (8.4.27) with the expression for time dilatation for uniform motion, (8.4.8), leads to the inescapable conclusion that

clocks will run even slower in a uniformly accelerating frame than in an inertial frame when viewed from a stationary frame. The consensus of opinion that acceleration will have no effect upon the apparent rate of clocks is inaccurate.

The transformation laws (8.4.17a) and (8.4.17b) can be expressed as

\[ t + r = K^{1/2}t_r, \tag{8.4.28a} \]

\[ t - r = K^{-1/2}t_r. \tag{8.4.28b} \]

Taking the differentials of (8.4.28a) and (8.4.28b), and then their product, result in

\[ ds^2 := dt^2 - dr^2 = dt_r^2 - \tfrac{1}{4}t_r^2(d\ln K)^2 = dt_r^2 - \tfrac{1}{4}t_r^2\,d\bar{u}^2. \tag{8.4.29} \]

The appearance of $t_r^2$ in the velocity space component of the metric (8.4.29) implies uniform expansion, and solicits the introduction of logarithmic time,

\[ \tau = 2\tau_0\ln(t_r/\tau_0), \tag{8.4.30} \]

where $\tau_0$ is an absolute constant, into (8.4.29) so that

\[ ds^2 = dt^2 - dr^2 = \frac{e^{\tau/\tau_0}}{4}\{d\tau^2 - \tau_0^2\,d\bar{u}^2\}. \tag{8.4.31} \]

Thus, the formulas of the transformation of coordinates (8.4.28a) and (8.4.28b) can be written as

\[ t + r = \tau_0 e^{(\tau/\tau_0 + \bar{u})/2}, \tag{8.4.32a} \]

\[ t - r = \tau_0 e^{(\tau/\tau_0 - \bar{u})/2}. \tag{8.4.32b} \]

These equations clearly show that logarithmic time, (8.4.30), is hyperbolic time. At constant $t_r$, a surface of revolution is obtained by rotating the hyperbola $t^2 - r^2 = t_r^2$ around the t axis to give a bowl-shaped form. As it should, the relative velocity, u, vanishes in the metric.

The equivalence relations (8.4.32a) and (8.4.32b) can be combined to read

\[ t + r = K(t - r), \tag{8.4.33} \]

which is the square of (8.4.7) for uniform motion. In fact, at constant velocity,

\[ t + r = K(t' - r'), \tag{8.4.34a} \]

\[ t - r = K^{-1}(t' + r'), \tag{8.4.34b} \]

we conclude that whereas (8.4.32a) and (8.4.32b) do not retain the invariant hyperbolic form, the latter pair does,

\[ t^2 - r^2 = t'^2 - r'^2. \tag{8.4.35} \]

Adding and subtracting the equations yield the well-known Lorentz transformations (8.4.9) and (8.4.10), from which it can be concluded that the Lorentz transformations leave invariant the hyperbolic ‘distance’ (8.4.35).

In terms of radar signaling, (8.4.33) involves a single observer: A light pulse is emitted at time $t_1$, and observed by him at a later time $t_2 = Kt_1$. Alternatively, in the case of constant velocity, (8.4.34a) says a light signal is emitted at time $t'_1$, in the primed inertial frame, and observed in the unprimed frame at a later time $t_2 = Kt'_1$, whereas (8.4.34b) says that if a signal is emitted at time $t_1$, it will be observed at time $t'_2$ in the primed inertial frame.

For uniform motion the geometric mean time remains invariant,

\[ \sqrt{t_1 t_2} = \sqrt{t'_1 t'_2}. \]

This is the same as requiring the hyperbolic line element (8.4.35) to be invariant, while, for uniform acceleration, the time of reflection in the B frame is the harmonic mean of the A frame. In his analysis of uniform acceleration, Page [36] attempted to show that the space-time interval between neighboring points is not constant. His analysis replaces (8.4.22b) by

\[ \frac{1}{t_3} = \frac{1}{t''} - g/2. \tag{8.4.36} \]

This condition would necessarily imply that the harmonic means in the two frames are equal. Solving (8.4.22a) and (8.4.36) for the times $t'$ and $t''$, with $t'' > t'$, we get

\[ \bar{t} := \tfrac{1}{2}(t' + t'') = t\operatorname{sech}^2(\bar{u}/2), \tag{8.4.37a} \]

\[ \bar{r} := \tfrac{1}{2}(t'' - t') = [r - t\tanh(\bar{u}/2)]\operatorname{sech}^2(\bar{u}/2). \tag{8.4.37b} \]

However, (8.4.37b) vanishes by the definition of the Lobaschevsky line segment [cf. (8.4.26)], and hence $\bar{r} = 0$. The time of reflection is given by the harmonic mean (8.4.23). Therefore, for uniformly accelerating systems the point of reflection must occur at the origin of B’s frame, whose time is given by the harmonic mean of A’s clock.

8.5 Comparison with General Relativity

We are convinced that purely mathematical reasoning can never yield physical results, that if anything comes out of mathematics it must have been put in in another form. Our problem is to find out where the physics got into the general theory.
Bridgman

Einstein’s theory of relativity essentially consists of two principles [Fock 69, p. 233]: The unification of space and time into a four-dimensional space with an indefinite metric, and the relation of the curvature of the space to the presence of matter. Einstein also proposed an ‘equivalence’ principle between inertia and gravitational mass, or between acceleration and gravitation. The latter has been criticized by Fock [69, pp. 232, 233]. With a constant index of refraction, gravitational considerations appear only in the specification of the absolute constant, which is related to the constant, negative curvature of the hyperbolic space.

A centrifugal potential appears explicitly in the flat metric whereas the gravitational potential does not.

In the general case of non-uniform motion the relevant space is the Lobaschevsky–Friedmann velocity space [Fock 69, Sec. 94], which can, and will in Sec. 8.6, be derived without any appeal to Einstein’s equations, and the unphysical assumption that matter must be ‘dust-like’ at zero pressure. Even dust exerts pressure! We will see in Sec. 8.7 that velocity components are related to the sides of a Lambert quadrilateral whose Weierstrass coordinates of the point of the acute angle show that the geometric mean time enters as a magnification of these coordinates, and not as a separate entity. Most importantly, by avoiding the ‘rigid scaffolding’ employed by Einstein, which is applicable to inertial frames of reference only [Fock 69], acceleration has been accounted for as changes in velocity space, where the independence of the ‘coordinates’ and time has disappeared. But before embarking on our journey into hyperbolic space let us pause to see how general relativity accounts for a state of constant acceleration.

The general theory is based on the significance of an indefinite metric tensor, which sets gravity propagating at the speed of light or beyond [cf. below]. Space-time is labeled by a four-dimensional set of coordinates $x^i$ for which three of the coordinates label the position of the event in space, while the fourth one fixes the time coordinate. In general relativity these labels have no physical significance. The proper distances are fixed by the metric tensor,

\[ ds^2 = g_{ij}\,dx^i\,dx^j, \tag{8.5.1} \]

where the Einstein convention of summing over repeated indices has been used. The metric tensor, (8.5.1), says that if $x^i$ and $x^i + dx^i$ are labels for two events in space-time then ds will be the proper distance between the two events. If it turns out that $ds^2 > 0$, then the two labels are separated by a time-like interval whereas if $ds^2 < 0$, the labels are separated by a space-like interval.

The metric tensor determines the geodesics of a freely moving particle. They can be derived from Fermat’s principle of least time. If we consider the proper time interval $[\tau_1, \tau_2]$, then the variational principle states

\[ \delta\int_{\tau_1}^{\tau_2} ds = \delta\int_{\tau_1}^{\tau_2}\sqrt{g_{ij}\frac{dx^i}{d\tau}\frac{dx^j}{d\tau}}\;d\tau, \]

or equivalently,

\[ \int_{\tau_1}^{\tau_2}\left[\frac{d}{d\tau}\left(g_{ik}\frac{dx^k}{d\tau}\right) - \frac{1}{2}\frac{\partial g_{kl}}{\partial x^i}\frac{dx^k}{d\tau}\frac{dx^l}{d\tau}\right]\delta x^i\,d\tau = 0. \]

This must be zero for all virtual variations $\delta x^i$ which vanish at the endpoints. That means that the Euler–Lagrange equations,

\[ \frac{d}{d\tau}\left(g_{ij}\frac{dx^j}{d\tau}\right) - \frac{1}{2}\frac{\partial g_{kl}}{\partial x^i}\frac{dx^k}{d\tau}\frac{dx^l}{d\tau} = 0, \tag{8.5.2} \]

must be satisfied at each instant of proper time. In the case of constant gravitational acceleration along the x-axis, the coefficients of the metric tensor are:

\[ g_{11} = g_{22} = g_{33} = 1, \qquad g_{44} = -c^2(1 + gx/c^2)^2, \tag{8.5.3} \]

and all others are zero. In the xτ-direction, the contracted Ricci tensor vanishes, because each of the components, $R_{11} = R_{44} = 0$, and so too does the Gaussian curvature [cf. Sec. 9.10.3 for details].

It is one thing to assume that gravity propagates at the speed of light, but another to assume that it travels at $c(1 + gx/c^2)$ [Møller 52, p. 257], which is what (8.5.3) claims. Thus, the Euler–Lagrange equations (8.5.2) reduce to a single equation,

\[ \frac{d^2x}{d\tau^2} = \frac{1}{2}\frac{\partial g_{44}}{\partial x}\left(\frac{dt}{d\tau}\right)^2 = -g(1 + gx/c^2)\left(\frac{dt}{d\tau}\right)^2, \tag{8.5.4} \]

where i = 1 and k = l = 4. The increment in proper time is related to coordinate time by [Møller 52, p. 247]

\[ d\tau = dt\,\sqrt{1 + \frac{2}{c^2}\chi - \frac{u^2}{c^2}}, \tag{8.5.5} \]

where

\[ \chi = gx\left(1 + \frac{gx}{2c^2}\right), \]

is the so-called scalar gravitational potential [Møller 52, p. 255], and $u = |dx/dt|$ is the speed of the particle measured from the rest frame.

If we use coordinate time, the equation of motion (8.5.4), with the help of (8.5.5), becomes

\[ \frac{d^2x}{dt^2} - \frac{2g/c^2}{1 + gx/c^2}\left(\frac{dx}{dt}\right)^2 + g(1 + gx/c^2) = 0. \]

The solution to this equation,

\[ x = \frac{c^2}{g}\left[\left(1 + \frac{gx_0}{c^2}\right)\operatorname{sech}(gt/c) - 1\right], \tag{8.5.6} \]

for the initial conditions $x = x_0$, and $dx/dt = 0$, clashes with (4.3.37), which, in the present case, is

\[ x = \frac{c^2}{g}\left(\cosh\frac{g\tau}{c} - 1\right). \]

In fact, differentiating with respect to proper time gives

\[ \frac{d^2x}{d\tau^2} = g\cosh(g\tau/c) = \frac{d\chi}{dx} = g(1 + gx/c^2), \tag{8.5.7} \]

which is simply Newton’s second law!

There is no physical reason why the speed,

\[ u = |dx/dt| = c(1 + gx_0/c^2)\tanh(gt/c)\operatorname{sech}(gt/c), \]

should show a maximum in time when the particle is under constant acceleration, nor is there any reason to call $c(1 + gx/c^2)$ the velocity of light when it is apparent that this velocity becomes infinite as $x \to \infty$. There is also no reason to consider (8.5.5) a valid relation between proper and coordinate velocities when the particle is under acceleration. Moreover, it is also apparent from (8.5.6) that proper (hyperbolic) and coordinate times have been confused. This is evident from Møller’s conclusion:

For $t \to \infty$ the particle approaches the singular wall $x = -c^2/g$ of our system of coordinates. At this place also the velocity of light tends to zero, and no signals of any kind will ever reach the boundary plane.

Thus, general relativity has a ‘black hole’ at the boundary of a system in which the particle is merely undergoing uniform acceleration!
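The properties of the trajectory (8.5.6) discussed above can be inspected numerically. The sketch below uses illustrative values for g and x0 in units with c = 1 (these numbers are assumptions, not values from the text), checks by central differences that (8.5.6) satisfies the coordinate-time equation of motion, and locates the maximum of the speed.

```python
import math

C = 1.0    # units with c = 1 (assumption, for readability)
G = 0.5    # illustrative value of the constant acceleration g
X0 = 0.2   # illustrative initial position x0

def x_of_t(t):
    """Trajectory (8.5.6)."""
    return (C**2 / G) * ((1.0 + G * X0 / C**2) / math.cosh(G * t / C) - 1.0)

def speed(t):
    """|dx/dt| as quoted in the text."""
    a = G * t / C
    return C * (1.0 + G * X0 / C**2) * math.tanh(a) / math.cosh(a)

def ode_residual(t, h=1e-4):
    """Left-hand side of the coordinate-time equation of motion, by central differences."""
    x = x_of_t(t)
    v = (x_of_t(t + h) - x_of_t(t - h)) / (2.0 * h)
    acc = (x_of_t(t + h) - 2.0 * x + x_of_t(t - h)) / h**2
    y = 1.0 + G * x / C**2
    return acc - (2.0 * G / C**2) * v * v / y + G * y

print([ode_residual(t) for t in (0.5, 1.0, 2.0)])   # each is close to zero
t_peak = (C / G) * math.asinh(1.0)                  # tanh*sech is largest where sinh(gt/c) = 1
print(t_peak, speed(t_peak))                        # peak speed is half of c(1 + g x0 / c^2)
print(x_of_t(50.0), -C**2 / G)                      # approach to the wall x = -c^2/g
```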

We are thus forced to conclude that the geodesic equation (8.5.2) does not fix the motion of a comoving observer relative to another. By ‘comoving’ it is meant that the observer is at rest relative to the matter placed at his position, but the observer is allowed to move freely with no forces acting on him. This also puts into serious doubt whether the curvature of space, as contemplated from the metric tensor (8.5.1), really describes the action of gravity!

8.6 Hyperbolic Geometry of Relativity

All the relations of relativity can be derived directly from the hyperbolic differential of arc length, or the Beltrami metric, in velocity space. Consider a velocity vector with two components, $u_1$ and $u_2$. Moreover, consider a disc of radius c in which a line passing through the center is cut by another line forming an angle θ with it. The Euclidean distance from the center of the circle to the point of intersection is $u/c = \sqrt{u_1^2 + u_2^2}/c$. In Euclidean space, distances are relative. Similar triangles of different sizes can all have the same angles. Not so in Lobaschevsky geometry where, as we have seen, the angles determine the sides of the triangle. Hence, lengths are also absolute.

The Euclidean length of the increment in the velocity, $du = \sqrt{du_1^2 + du_2^2}$, will be related to the hyperbolic measure $d\bar{u}$ by

\[ d\bar{u} = \text{�}\,du, \]

where

\[ \text{�} = \frac{\sqrt{1 - u^2\sin^2\theta/c^2}}{1 - u^2/c^2}. \tag{8.6.1} \]

From the definition of the cross product,

\[ |u \times du| = u\,du\sin\theta, \]

we may write the hyperbolic line element in terms of $u_1$ and $u_2$ as

\[ d\bar{u} = \frac{\sqrt{du^2 - (u \times du)^2/c^2}}{1 - u^2/c^2}. \tag{8.6.2} \]

Now, if $du = u - w$, the hyperbolic measure of their difference will be

\[ \frac{\sqrt{(u - w)^2 - (u \times w)^2/c^2}}{1 - u\cdot(w + du)/c^2} = \frac{\sqrt{(u - w)^2 - (u \times w)^2/c^2}}{1 - u\cdot w/c^2} + O(du), \tag{8.6.3} \]

on the strength of (8.6.2), where $O(du)$ is an infinitesimally small quantity of higher order than $\sqrt{(u - w)^2 - (u \times w)^2/c^2}$.

If the two velocity vectors are parallel, (8.6.3) reduces to (6.3.1). If not, the composition of Lorentz transforms in different directions involves rotations, as Einstein knew in 1905.

If we want to determine the hyperbolic length between $u_2$ and $u_1$, we write the velocity vector in parametric form

\[ u = u_1 + \lambda(t)(u_2 - u_1), \qquad 0 \le \lambda(t) \le 1, \]

and introduce it into the expression [Fock 69]

\[ \bar{u} = \int_0^1 \frac{\sqrt{\dot{u}^2 - (u \times \dot{u})^2/c^2}}{1 - u^2/c^2}\,dt, \]

to get

\[ \bar{u} = c\int_0^1 \frac{\sqrt{(u_2 - u_1)^2 - (u_1 \times u_2)^2/c^2}}{1 - u^2/c^2}\,d\lambda. \tag{8.6.4} \]

If we set

\[ \lambda = \frac{c^2 - u^2}{2a}\ln\left(\frac{b + ax}{b - ax}\right), \]

where the constants

\[ a = \sqrt{c^2(u_2 - u_1)^2 - (u_1 \times u_2)^2}, \]

and

\[ b = c^2 - u_1\cdot u_2, \]

the integral (8.6.4) becomes

\[ \bar{u} = c\int_0^1 \frac{ab\,dx}{b^2 - a^2x^2} = \frac{c}{2}\ln\left(\frac{b + a}{b - a}\right), \tag{8.6.5} \]

which is the hyperbolic measure of distance.

Using the well-known relation for inverse hyperbolic functions, (8.6.5) can be written as

\[ \frac{a}{b} = \tanh(\bar{u}/c). \tag{8.6.6} \]

Squaring both sides of (8.6.6) and multiplying by $c^2$,

\[ \frac{(u_2 - u_1)^2 - (u_1 \times u_2)^2/c^2}{(1 - u_1\cdot u_2/c^2)^2} = c^2\tanh^2(\bar{u}/c), \]

shows that the left side of (8.6.6) is the ratio of the relative velocity to c. Whereas the Euclidean measure of the relative velocity is bounded from above by c, there is no limit on its hyperbolic measure $\bar{u}$.
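The closed forms (8.6.5) and (8.6.6) can be evaluated directly for two three-component velocities. The following is a sketch only, with hypothetical helper names and c = 1 by default.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def hyperbolic_distance(u1, u2, c=1.0):
    """Hyperbolic measure (8.6.5) between two velocities: (c/2) ln((b + a)/(b - a))."""
    diff = tuple(y - x for x, y in zip(u1, u2))
    x = cross(u1, u2)
    a = math.sqrt(c**2 * dot(diff, diff) - dot(x, x))
    b = c**2 - dot(u1, u2)
    return 0.5 * c * math.log((b + a) / (b - a))

def relative_speed(u1, u2, c=1.0):
    """Euclidean relative speed c * tanh(ubar/c), from (8.6.6)."""
    return c * math.tanh(hyperbolic_distance(u1, u2, c) / c)

u1 = (0.5, 0.0, 0.0)
u2 = (0.0, 0.5, 0.0)
print(hyperbolic_distance(u1, u2), relative_speed(u1, u2))   # perpendicular velocities
print(relative_speed((0.5, 0.0, 0.0), (0.8, 0.0, 0.0)))      # parallel case
```

For parallel velocities the cross product vanishes and the relative speed reduces to the one-dimensional composition law.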

To see the relation between the hyperbolic measure of distance, (8.6.5), and the Lobaschevskian arc length, (8.6.2), we write the former in the form

\[ \bar{u} = \frac{c}{2}\ln\left[1 + \frac{2\sqrt{c^2(u_2 - u_1)^2 - (u_1 \times u_2)^2}}{c^2 - u_1\cdot u_2 - \sqrt{c^2(u_2 - u_1)^2 - (u_1 \times u_2)^2}}\right]. \]

Considering the second term in the argument of the logarithm as small, we approximate $\ln(1 + x) \simeq x$ so that

\[ \bar{u} \simeq \frac{c\sqrt{c^2(u_2 - u_1)^2 - (u_1 \times u_2)^2}}{c^2 - u_1\cdot u_2 - \sqrt{c^2(u_2 - u_1)^2 - (u_1 \times u_2)^2}} \simeq \frac{\sqrt{(u_2 - u_1)^2 - (u_1 \times u_2)^2/c^2}}{1 - u_1\cdot u_2/c^2} + O(|u_2 - u_1|), \tag{8.6.7} \]

where the remainder in (8.6.7) contains higher-order terms in the velocity difference. In the case of parallel vectors, (8.6.7) reduces to the velocity composition law, (6.3.1).

If we specialize to the case of parallel vectors, (8.6.5) reduces to

\[ \bar{u} = \frac{c}{2}\ln\left(\frac{c^2 - u_1\cdot u_2 + c(u_2 - u_1)}{c^2 - u_1\cdot u_2 - c(u_2 - u_1)}\right) = \frac{c}{2}\ln\left(\frac{c - u_1}{c - u_2}\cdot\frac{c + u_2}{c + u_1}\right) = \frac{c}{2}\ln\{u_2, u_1 \,|\, {-c}, c\}. \tag{8.6.8} \]

The argument of the logarithm is our old friend the cross-ratio, and underscores its fundamental role in defining distance in hyperbolic space. This shows that the Lobaschevsky segment is really the shortest distance between two points on the hyperbolic plane.

We recall from Sec. 2.2.4 that projections are transformations that change lengths and angles. No property of three points can be invariant because any three points can be transformed into any other three points on the line. Four collinear points are needed $(c, u_2, u_1, -c)$ for the cross-ratio (8.6.8). Connecting these points by lines emanating from a common point, the invariance of the cross-ratio can be shown as an invariance of the ratio of the angle formed from this vertex, as we have done in Sec. 2.2.4.

As $u_2$ moves toward c, the cross-ratio, as well as its logarithm, increases indefinitely. If, instead, $u_2$ is found between $u_1$ and $-c$, the cross-ratio will be between 0 and 1. This makes the hyperbolic distance (8.6.8) negative. And as $u_2$ moves along the line toward $-c$, the hyperbolic length will decrease indefinitely. Hence, the points $\pm c$ are ‘infinitely distant.’ Infinitely distant points are upper and lower bounds on the Euclidean measures of distance, in this case velocities, since we are working in velocity space. The same can be said about configuration space, where the points at infinity will be given by $\pm c/\omega$, with the frequency to be defined, for example, in considering gravitational collapse, whose free-fall frequency is $\omega = \sqrt{G\rho}$, where G is the Newtonian gravitational constant and ρ is the density of matter.

We can also write the hyperbolic distance (8.6.8) as

\[ \bar{u} = \frac{c}{2}\ln\left(\frac{1}{c - u}\cdot\frac{c + u}{1}\right), \tag{8.6.9} \]

by setting $u_1 = 0$ and putting $u_2 = u$. Alternatively, we can equate (8.6.9) with (8.6.8) without any prior conditions and find the velocity subtraction law,

\[ u = \frac{u_2 - u_1}{1 - u_1\cdot u_2/c^2}. \tag{8.6.10} \]

The subtraction, and not the addition, law for velocities is required for the construction of the cross-ratio.
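The link between the cross-ratio (8.6.8), the single-velocity form (8.6.9), and the subtraction law (8.6.10) can be made concrete with a few lines of arithmetic. This is a sketch for collinear velocities with c = 1 by default; the function names are illustrative.

```python
import math

def cross_ratio(u2, u1, c=1.0):
    """Cross-ratio {u2, u1 | -c, c} for collinear velocities, as written in (8.6.8)."""
    return ((c - u1) / (c - u2)) * ((c + u2) / (c + u1))

def hyperbolic_separation(u2, u1, c=1.0):
    """Hyperbolic distance (c/2) ln{u2, u1 | -c, c}."""
    return 0.5 * c * math.log(cross_ratio(u2, u1, c))

def subtract_velocities(u2, u1, c=1.0):
    """Velocity subtraction law (8.6.10)."""
    return (u2 - u1) / (1.0 - u1 * u2 / c**2)

u1, u2 = 0.5, 0.8
u = subtract_velocities(u2, u1)
print(u)                                                           # relative velocity
print(hyperbolic_separation(u2, u1), hyperbolic_separation(u, 0.0))  # (8.6.8) equals (8.6.9)
print(hyperbolic_separation(u1, u2))                               # negative when u2 lies below u1
```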

It is also quite remarkable that the triangle defect of hyperbolic geometry corresponds exactly with the angle of aberration. When a moving observer, like us on Earth, tries to determine the position of another object, say a star, the directions to the star will differ by aberration. That is, consider an incoming ray in the xy-plane. With respect to two platforms S and S′ moving at a relative velocity v with respect to one another, an incoming light signal will make angles α and α′, respectively.

The velocity components in the x and y directions will be related by

\[ u'_x = \frac{u_x - v}{1 - u_xv/c^2} \quad\text{and}\quad u'_y = \frac{u_y}{\gamma(1 - u_xv/c^2)}. \]

The velocity components of the incoming light signal will be $u_x = -c\cos\alpha$ and $u_y = -c\sin\alpha$, with analogous expressions in the primed platform. Hence, the formulas for aberration are [cf. (8.3.6)]

\[ \cos\alpha' = \frac{\cos\alpha + v/c}{1 + (v/c)\cos\alpha}, \tag{8.6.11a} \]

and [cf. (8.3.9)]

\[ \sin\alpha' = \frac{\sin\alpha}{\gamma(1 + (v/c)\cos\alpha)}. \tag{8.6.11b} \]

With the aid of the half-angle formula,

\[ \tan\tfrac{1}{2}\alpha' = \frac{\sin\alpha'}{1 + \cos\alpha'}, \]

the formulas for aberration, (8.6.11a) and (8.6.11b), can be combined to yield [cf. (8.3.17)]

\[ \tan\tfrac{1}{2}\alpha' = K\tan\tfrac{1}{2}\alpha. \tag{8.6.12} \]
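The combination of (8.6.11a) and (8.6.11b) into a single half-angle relation can be checked numerically. In the sketch below, β = v/c and the function name is hypothetical. With the signs exactly as written in (8.6.11a) and (8.6.11b), the ratio of the half-angle tangents comes out as the square root of (1 − β)/(1 + β); whether that factor is labeled K or its inverse depends on the sign convention for v adopted in (8.3.17), which is not reproduced in this section.

```python
import math

def aberrated_angle(alpha, beta):
    """Angle alpha' in S' from (8.6.11a) and (8.6.11b), with beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    denom = 1.0 + beta * math.cos(alpha)
    cos_p = (math.cos(alpha) + beta) / denom
    sin_p = math.sin(alpha) / (gamma * denom)
    return math.atan2(sin_p, cos_p)

alpha, beta = math.radians(60.0), 0.5
alpha_p = aberrated_angle(alpha, beta)
ratio = math.tan(alpha_p / 2.0) / math.tan(alpha / 2.0)
doppler = math.sqrt((1.0 - beta) / (1.0 + beta))
print(ratio, doppler)   # the half-angle tangents differ by a pure Doppler factor
```

The ratio is independent of α, which is the content of the half-angle form: aberration acts on tan(α/2) as a simple rescaling.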

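Consider two types of displacements from u, $d\bar{u}$ and $\delta\bar{u}$, where

\[ d\bar{u}^2 = c^2\,\frac{c^2(du)^2 - (u \times du)^2}{(c^2 - u^2)^2}, \]

and

\[ \delta\bar{u}^2 = c^2\,\frac{c^2(\delta u)^2 - (u \times \delta u)^2}{(c^2 - u^2)^2}. \]

By the definition of the cosine of the angle between the two displacements [Fock 69],

\[ d\bar{u}\cdot\delta\bar{u} = d\bar{u}\,\delta\bar{u}\cos\alpha = c^2\,\frac{c^2\,du\cdot\delta u - (u \times du)\cdot(u \times \delta u)}{(c^2 - u^2)^2}, \]

we have

\[ \cos\alpha = \frac{c^2\,du\cdot\delta u - (u \times du)\cdot(u \times \delta u)}{\sqrt{c^2(du)^2 - (u \times du)^2}\cdot\sqrt{c^2(\delta u)^2 - (u \times \delta u)^2}}, \tag{8.6.13} \]

as the expression for the cosine of the angle between the relative velocities of two bodies.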

8.7 Coordinates in the Hyperbolic Plane

Consider an orthogonal system through the origin O in Fig. 8.5. Let U and V be the points on the axes where the perpendicular projections from a point P not lying on the axes meet the axes. We then have a Lambert quadrilateral OVPU. If the angle at P were a right angle, we would have Euclidean geometry; if it is acute we have hyperbolic geometry. Though named after Lambert, it was known to Ibn al-Haytham almost seven hundred years earlier [Rosenfeld 88]. This is yet another example of Stigler’s law of eponymy.

Fig. 8.5. A Lambert quadrilateral in velocity space consisting of three right angles and one acute angle.

The distances to the points U and V are

\[ \bar{u} = \tanh^{-1}u, \quad\text{and}\quad \bar{v} = \tanh^{-1}v. \tag{8.7.1} \]

The Weierstrass coordinates can now be introduced as

\[ X = uT, \quad Y = vT, \quad\text{and}\quad T = \cosh\bar{u}\cosh\bar{w}, \tag{8.7.2} \]

where $\bar{u}$, $\bar{v}$, $\bar{w}$, and $\bar{z}$ are four sides of a Lambert quadrilateral, shown in Fig. 8.5, consisting of three right angles and one acute angle between $\bar{w}$ and $\bar{z}$.

The condition that the two sides will intersect to form an acute angle is

\[ 1 - \tanh^2\bar{u} - \tanh^2\bar{v} = 1 - u^2 - v^2 > 0. \tag{8.7.3} \]

The Euclidean measures of the two sides are

\[ w = \frac{u}{\sqrt{1 - v^2}} = \tanh\bar{u}\cosh\bar{v}, \]

\[ z = \frac{v}{\sqrt{1 - u^2}} = \tanh\bar{v}\cosh\bar{u}. \]

By giving to each point the triple (X, Y, T) of Weierstrass coordinates, the hyperbolic plane is mapped onto the locus

\[ T^2 - X^2 - Y^2 = 1, \]

which is one of two sheets of a hyperboloid in Cartesian three-dimensions. The infinitesimal metric,

\[ d\sigma^2 = dX^2 + dY^2 - dT^2 = d\bar{w}^2 + \cosh^2\bar{w}\,d\bar{u}^2 = \frac{(1 - v^2)du^2 + 2uv\,du\,dv + (1 - u^2)dv^2}{(1 - u^2 - v^2)^2}, \tag{8.7.4} \]

is the spatial component of the Lobaschevsky velocity space metric. If we want a full time-velocity space metric we must magnify the T coordinate, $t_r > 0$ times, viz. $T = t_r/\sqrt{1 - u^2 - v^2}$. This results in a time-like, indefinite metric,

\[ d\tau^2 = dT^2 - dX^2 - dY^2 = dt_r^2 - t_r^2\,d\sigma^2, \tag{8.7.5} \]

which we will meet again in our discussion of cosmological models in Sec. 9.11.
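A short numerical sketch shows how the Weierstrass coordinates place a velocity point on the hyperboloid and how the magnification by the geometric mean time enters. It assumes c = 1 and uses T = t_r/√(1 − u² − v²) as given in the paragraph above; the function name is hypothetical.

```python
import math

def weierstrass(u, v, t_r=1.0):
    """Weierstrass coordinates (X, Y, T) of a velocity point (u, v), with c = 1.

    t_r is the magnification leading to (8.7.5); t_r = 1 gives the unit hyperboloid.
    """
    s = 1.0 - u * u - v * v
    if s <= 0.0:
        raise ValueError("need u^2 + v^2 < 1, condition (8.7.3)")
    T = t_r / math.sqrt(s)
    return u * T, v * T, T

X, Y, T = weierstrass(0.3, 0.4)
print(T * T - X * X - Y * Y)              # 1 on the unit hyperboloid
X, Y, T = weierstrass(0.3, 0.4, t_r=2.5)
print(math.sqrt(T * T - X * X - Y * Y))   # recovers the magnification t_r
```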

A time-velocity metric, similar to (8.7.5), was derived by Friedmann in 1922 using the Einstein equations to relate the coordinates to the Lagrangian variables, $u_i$. It was derived under the condition that matter was ‘dust-like’ exerting zero pressure, which is certainly a dubious assumption and does not correspond to anything physical. It was also assumed that the velocity variables $u_i$ are constants relating the spatial coordinates $x_i$ to time, but, subsequently, they were differentiated to obtain the Lobaschevsky–Friedmann metric (8.7.5) [Fock 69].

As can be seen from the definition of the Weierstrass coordinates, (8.7.2), each of the coordinates becomes magnified $t_r$ times [Fock 69, Eq. (94.47)]. In a multi-dimensional velocity space, the spatial part of the metric (8.7.4) can be written as

\[ d\sigma^2 = \frac{(du)^2 - (u \times du)^2}{(1 - u^2 - v^2)^2}, \]

by introducing the coordinates $X_i = u_it_r/\sqrt{1 - u^2 - v^2}$ into the infinitesimal metric,

\[ d\tau^2 = dT^2 - \sum_i dX_i^2. \]

In relation to the Robertson–Walker metric, which we will come across in Sec. 9.11, the scale factor R(t) multiplying the spatial part of the metric is just t, which implies uniform expansion. Introducing logarithmic, or hyperbolic, time according to

\[ t = t_0\ln(t_r/t_0), \tag{8.7.6} \]

where $t_0$ is the age of the system on the $t_r$ scale, (and not (8.4.30)) into (8.7.5) gives

\[ d\tau^2 = e^{2t/t_0}\{dt^2 - t_0^2(d\bar{w}^2 + \cosh^2\bar{w}\,d\bar{u}^2)\}. \tag{8.7.7} \]

The proper time interval is the quantity $\tau_0$ determined at constant velocity by the equation

\[ \tau_0 = \int_0^t e^{t/t_0}\,dt = t_0\left(e^{t/t_0} - 1\right). \]

This law could have been anticipated because $t_r$ is the geometric mean. Only for $t \ll t_0$ will the proper time coincide with t. The exponential variable scale factor multiplies both time and velocity increments, and testifies to the fact that they are not independent, but are related by the Beltrami coordinates and logarithmic time.

For fixed $t_0$ the velocity space line element is

$$ d\sigma^2 = t_0^2\, e^{2t/t_0}\left(dw^2 + \cosh^2 w\, du^2\right). \tag{8.7.8} $$

The terms in the parentheses have the metric form of a pseudosphere in velocity space, with constant negative curvature, −1, that we introduced in Sec. 2.5 and met on many an occasion. The scale factor is the same exponential that appears in the proper time increment. The hallmark of a pseudosphere is that lines which do not intersect are, nevertheless, not parallel. Along a light track (8.7.7) vanishes, resulting in

$$ dt = t_0\,\frac{\sqrt{(1 - u^2)^2\,dw^2 + (1 - w^2)\,du^2}}{(1 - u^2)(1 - w^2)}. $$


This is a generalization of the well-known one-dimensional expression, whose integral identifies (8.2.1) as the length of the corresponding segment of a Lobaschevsky straight line.

8.8 Limiting Case of a Lambert Quadrilateral: Uniform Acceleration

A limiting case arises when inequality (8.7.3) reduces to an equality

$$ u^2 + v^2 = 1, \tag{8.8.1} $$

or $v = \sqrt{1 - u^2} =: u^*$. The velocities $u$ and $u^*$ are said to be complementary [Greenberg 93]. The defining relation for uniform acceleration is (8.4.16), which upon resolving for the velocity gives, with $\bar{u}$ denoting the hyperbolic measure of the velocity,

$$ u = \tanh\bar{u} = \frac{2(r/t)}{1 + (r/t)^2}. \tag{8.8.2} $$

The ratio r/t represents the Euclidean ‘length,’ while $\bar{u}$ is the length that Poincaré used; the two being related by

$$ e^{\bar{u}} = \frac{1 + (r/t)}{1 - (r/t)}. \tag{8.8.3} $$

The complementary velocity is found to be

$$ u^* = \frac{1 - (r/t)^2}{1 + (r/t)^2} = \operatorname{sech}\bar{u} = \sqrt{1 - u^2}, $$

which verifies (8.8.1). The angle of parallelism, (8.2.10),

$$ \Pi(\bar{u}^*) = 2\tan^{-1} e^{-\bar{u}^*}, \tag{8.8.4} $$

is defined solely in terms of the ‘distance’ $\bar{u}^*$ from the foot of the perpendicular to the angle of parallelism. The angle of parallelism is the lower bound for the angle of parallax.

It was Bernoulli who first showed that

$$ 2\tan^{-1} e^{-\bar{u}^*} = \frac{1}{i}\ln\left(\frac{1 + ie^{-\bar{u}^*}}{1 - ie^{-\bar{u}^*}}\right). \tag{8.8.5} $$

This is because

$$ \Pi(\bar{u}^*) = \frac{2}{i}\tanh^{-1}\left(ie^{-\bar{u}^*}\right), $$

which is equal to (8.8.4). In particular,

$$ r/t = \tanh(\bar{u}/2) = \tan[\Pi(\bar{u}^*)/2] = e^{-\bar{u}^*} \tag{8.8.6} $$

shows that the closer the complementary velocity $\bar{u}^*$ is to zero, the closer $\Pi$ is to being a right angle. For large $\bar{u}^*$, or nonrelativistic velocities, the angle of parallelism is practically zero. For the Earth’s orbital motion, (8.8.6) is $10^{-4}$, giving an angle of parallelism $\Pi = 89^\circ 59' 39.4''$. The deviation from Euclidean space is only $20.6''$. However, things change drastically as the velocity of light is approached: for a relative velocity 0.95, the angle of parallelism drops to $18^\circ 12'$, and vanishes in the limit [Silberstein 14].

The double angle formula,

$$ \tan\Pi(\bar{u}^*) = \frac{e^{-\bar{u}^*} + e^{-\bar{u}^*}}{1 - e^{-2\bar{u}^*}} = 1/\sinh\bar{u}^* = \sinh\bar{u}, \tag{8.8.7} $$

shows that $\Pi$ provides the link between circular and hyperbolic functions. In particular, (8.8.7) relates the angle of parallelism to the particle velocity.

Consider a Lambert quadrilateral with three right angles and an ideal point in Fig. 8.6. Equation (8.8.7) implies that the complementary velocities, $\bar{u}$ and $\bar{u}^*$, which are adjacent to the two angles of parallelism, are related by

$$ e^{\bar{u}^*} = \frac{1 + \cosh\bar{u}}{\sinh\bar{u}} = \coth(\bar{u}/2) = \frac{1 + e^{-\bar{u}}}{1 - e^{-\bar{u}}}, \tag{8.8.8} $$

and an identical expression for $\bar{u}$ in terms of $\bar{u}^*$. Equation (8.8.3) implies the addition law for the hyperbolic measure of the complementary velocities, which, in turn, implies the product law for the average velocities.

Fig. 8.6. A Lambert quadrilateral comprised of complementary segments where the ‘fourth vertex’ is an ideal point.

Rather, if $\bar{u}_1$ and $\bar{u}_2$ are the components of the hyperbolic measure of the velocity $u$, their composition follows the velocity composition law:

$$ e^{-\bar{u}} = \frac{e^{-\bar{u}_1} + e^{-\bar{u}_2}}{1 + e^{-\bar{u}_1 - \bar{u}_2}}. \tag{8.8.9} $$

Finally, the generalization of (8.8.6) to n components,

$$ \prod_{i=1}^{n} \tan[\Pi(\bar{u}_i^*)/2] = e^{-\sum_{i=1}^{n}\bar{u}_i^*} = G^n = \tanh(\bar{u}/2), \tag{8.8.10} $$

selects out the geometric mean $G = \left(e^{-\bar{u}_1^*} e^{-\bar{u}_2^*} \cdots e^{-\bar{u}_n^*}\right)^{1/n}$, and the last equality follows from (8.8.8), which is valid for a single complementary velocity or a sum of n velocities.

The opposite right angle is divided into two angles of parallelism such that

$$ \Pi(\bar{u}) + \Pi(\bar{u}^*) = \pi/2. \tag{8.8.11} $$
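The numerical values quoted in this section for the Earth’s orbital motion and for a relative velocity of 0.95 can be checked with a few lines of code. This is only a sketch under stated assumptions: the function names are hypothetical, velocities are in units of c, and the quoted angles are taken to be the angle of parallelism evaluated at the hyperbolic measure of the velocity itself, $\tanh^{-1}u$, which is an assumption that happens to reproduce the figures. The last column checks the complementarity relation (8.8.11), using $e^{-\bar{u}^*} = \tanh(\bar{u}/2)$ from (8.8.6).

```python
import math

def parallelism_angle(u_bar):
    """Angle of parallelism 2*atan(exp(-u_bar)) for a hyperbolic distance u_bar."""
    return 2.0 * math.atan(math.exp(-u_bar))

def complementary_measure(u_bar):
    """Complementary hyperbolic measure, from exp(-u_bar*) = tanh(u_bar / 2)."""
    return -math.log(math.tanh(u_bar / 2.0))

def deficit_arcsec(u):
    """Departure of the angle of parallelism from a right angle, in arcseconds."""
    angle = parallelism_angle(math.atanh(u))
    return math.degrees(math.pi / 2.0 - angle) * 3600.0

for u in (1e-4, 0.95):  # velocities in units of c
    u_bar = math.atanh(u)
    angle = parallelism_angle(u_bar)
    total = angle + parallelism_angle(complementary_measure(u_bar))
    print(f"u = {u}: angle = {math.degrees(angle):.4f} deg, "
          f"deficit = {deficit_arcsec(u):.1f} arcsec, "
          f"sum of complementary angles = {total:.6f} rad")
```

For u = 10⁻⁴ the deficit comes out close to the 20.6″ quoted above, and for u = 0.95 the angle is close to 18°12′.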

8.9 Additivity of the Recession and Distance in Hubble’s Law

The fact that the shift $z = \delta\lambda/\lambda_0$ for lines in the spectrum of a given galaxy is independent of the wavelength is a necessary, but not a sufficient, condition that the redshift is due to motion. It was Edwin Hubble who interpreted these redshifts as Doppler shifts, which are indicative of recessional motion. In so doing he obtained a linear relation between the velocity of recession, u, and radial distance, r, with a constant of proportionality that is the same for all galaxies. We will show that both these quantities are additive.


There will be a redshift if the detected wavelength, $\lambda$, is greater than the emitted wavelength, $\lambda_0$, in

$$ 1 + z = \frac{\lambda}{\lambda_0} = K = e^{u}. \tag{8.9.1} $$

Now, K is the ratio of the received, $t_2$, to the emitted time, $t_1$. We can therefore define a hyperbolic measure of the time interval as [Milne 48]

$$ t/t_0 = \ln\frac{t_2}{t_1} = \ln K. \tag{8.9.2} $$

A comparison of (8.9.1) and (8.9.2) results in

$$ u = t/t_0 = H r, \tag{8.9.3} $$

where u is the hyperbolic measure of the velocity, $H = t_0^{-1}$, the Hubble parameter, and r = t is the hyperbolic measure of distance in natural units. Hubble’s law (8.9.3) could have also been derived by setting the one-dimensional velocity space metric (8.4.29) equal to zero, and introducing the logarithmic time (8.7.6).

Consequently, (8.9.1) is the exponential law [Prohovnik 67]

$$ 1 + z = e^{Ht}. \tag{8.9.4} $$

Only when $Ht \ll 1$ can we neglect powers of Ht greater than first so that (8.9.4) reduces to the relation [Hoyle et al. 00]

$$ z = Ht. \tag{8.9.5} $$

The exponential law (8.9.4) implies that when there is more than one redshift, it is their geometric mean which should be taken. For example, the cluster Group II has n = 21 redshifts, in which case (8.9.1) generalizes to

$$ \prod_{i=1}^{n} (1 + z_i) = \prod_{i=1}^{n} \frac{\lambda_i}{\lambda_{0i}} = \prod_{i=1}^{n} K(u_i) = \exp\left(\sum_{i=1}^{n} u_i\right). \tag{8.9.6} $$

Hyperbolic velocities, $u_i$, like the hyperbolic distances $r_i$, are therefore additive. The average wavelength is the geometric mean wavelength. This is implied by the exponential law (8.9.6).
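The additivity expressed by (8.9.6) can be illustrated with a short computation. This is a sketch only: the redshift values are invented for illustration and are not the Group II data mentioned above, and the function name is hypothetical.

```python
import math

def hyperbolic_velocity(z):
    """u = ln(1 + z), obtained by inverting 1 + z = exp(u) in (8.9.1)."""
    return math.log1p(z)

# Hypothetical sample of redshifts (illustrative values, not data from the text).
redshifts = [0.010, 0.012, 0.015, 0.020]
n = len(redshifts)

product = math.prod(1.0 + z for z in redshifts)            # left-hand side of (8.9.6)
u_sum = sum(hyperbolic_velocity(z) for z in redshifts)     # sum of hyperbolic velocities

geometric_mean = product ** (1.0 / n)      # geometric mean of the factors 1 + z
arithmetic_mean = 1.0 + sum(redshifts) / n  # arithmetic mean, for comparison

print(f"product of (1 + z_i)  = {product:.8f}")
print(f"exp(sum of u_i)       = {math.exp(u_sum):.8f}")
print(f"geometric mean        = {geometric_mean:.8f}")
print(f"arithmetic mean       = {arithmetic_mean:.8f}")
```

The first two printed values agree, which is the content of (8.9.6); the geometric mean never exceeds the arithmetic mean, and the two differ appreciably only when the redshifts are not small.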


References

[Bondi 60] H. Bondi, Cosmology (Cambridge U. P., London, 1960).
[Born 09] M. Born, “Die Theorie des starren Elektrons in der Kinematik des Relativitätsprinzips,” Ann. der Phys. 30 (1909) 1–56.
[Born 43] M. Born, Experiment and Theory in Physics (Cambridge U. P., London, 1943).
[Buseman & Kelly 53] H. Busemann and P. J. Kelly, Projective Geometry and Projective Metrics (Academic Press, New York, 1953).
[Fock 69] V. Fock, The Theory of Space, Time and Gravitation, 2nd ed. (Pergamon Press, Oxford, 1969).
[Greenberg 93] M. J. Greenberg, Euclidean and Non-Euclidean Geometries: Development and History, 3rd ed. (W. H. Freeman, New York, 1993).
[Hoyle et al. 00] F. Hoyle, G. Burbidge and J. V. Narlikar, A Different Approach to Cosmology (Cambridge U. P., Cambridge, 2000).
[Lobaschevsky 98] N. I. Lobaschevsky, Zwei Geometrische Abhandlungen (Leipzig, 1898).
[Milne 48] E. A. Milne, Kinematical Relativity (Oxford U. P., London, 1948).
[Møller 52] C. Møller, Theory of Relativity (Oxford U. P., London, 1952).
[Needham 97] T. Needham, Visual Complex Analysis (Clarendon Press, Oxford, 1997).
[Page 36] L. Page, “A new relativity,” Phys. Rev. 49 (1936) 254–268.
[Poincaré 05] H. Poincaré, “Sur la dynamique de l’électron,” Comptes Rendus Hebdomadaires des séances de l’Académie des Sciences 140 (1905) 1504–1508; extended version in Rend. Circ. Mat. Palermo 21 (1906) 129–176.
[Prohovnik 67] S. J. Prohovnik, The Logic of Special Relativity (Cambridge U. P., London, 1967).
[Rosenfeld 88] B. A. Rosenfeld, A History of Non-Euclidean Geometry (Springer, New York, 1988), pp. 59–64.
[Robb 11] A. A. Robb, Optical Geometry of Motion (W. Heffer & Sons, Cambridge, 1911).
[Silberstein 14] L. Silberstein, The Theory of Relativity (MacMillan, London, 1914).
[Sommerfeld 09] A. Sommerfeld, “Über die Zusammensetzung der Geschwindigkeiten in der Relativtheorie,” Verh. Deutsch. Phys. Ges. XI (1909) 577–582; “On the composition of velocities in the theory of relativity,” Wikisource translation.
[Terrell 59] J. Terrell, “Invisibility of the Lorentz contraction,” Phys. Rev. 116 (1959) 1041–1045.
[Whitrow 33] G. J. Whitrow, “A derivation of the Lorentz formulae,” Quart. Jour. Math (Oxford) 4 (1933) 161–172.
[Whitrow 80] G. J. Whitrow, The Natural Philosophy of Time, 2nd ed. (Clarendon Press, Oxford, 1980).


Chapter 9

Nonequivalence of Gravitation and Acceleration

The treatment of the uniformly rotating rigid body seems to me to be very important because of an extension of the relativity principle to uniformly rotating systems by trains of thought which I attempted to pursue for uniformly accelerated translation in the last section of . . . my paper (of 1907).
Einstein in a letter to Sommerfeld dated 29/9/1909.ᵃ

9.1 The Uniformly Rotating Disc in Einstein’s Development of General Relativity

According to John Stachel [89], the rigidly rotating disc is the “missing link” in Einstein’s formulation of general relativity for it made him aware of the need for a non-flat metric in a relativistic treatment of the gravitational field. To Einstein, gravitation is a form of acceleration insofar as it can be nullified by another accelerating frame. Einstein’s conclusion that Euclidean geometry does not apply to a reference frame in uniform rotation stems from his belief that a

measuring rod applied to the periphery undergoes Lorentz contraction, while the one applied along the radius does not. Hence, Euclidean geometry does not apply [to a system of coordinates in uniform rotation].

In a private letter, dated August 19, 1919, to a then well-known philosopher, Joseph Petzoldt, Einstein goes into more detail:

Let $U_0$ be the circumference, $r_0$ the radius of the rotating disc, considered from the standpoint of $K_0$ [the rest frame]; then on account of ordinary

ᵃ “That isolated remark, important as it is, does not change my opinion that Einstein was concentrating in other directions during this period (12/1907–06/1911).” A. Pais, in Subtle is the Lord.


Euclidean geometry

$$ U_0 = 2\pi r_0. \tag{9.1.1} $$

$U_0$ and $r_0$ naturally are to be thought of as measured with non-rotating measuring rods, i.e. at rest relative to $K_0$.

Now let me imagine co-rotating measuring rods of rest length l laid out on the rotating disc, both along the radius as well as the circumference. How long are these, considered from $K_0$? Let us imagine, in order to make this clearer to ourselves, a ‘snapshot’ taken from $K_0$ (definite time $t_0$). On this snapshot the radial measuring rods have the length l, the tangential ones, however, the length $\sqrt{1 - v^2/c^2}$. The ‘circumference’ of the circular disc (considered from K) is nothing but the number of tangential measuring rods that are present in the snapshot along the circumference, whose length considered from $K_0$ is $U_0$. Therefore,

$$ U = U_0/\sqrt{1 - v^2/c^2}. \tag{9.1.2} $$

On the other hand, obviously

$$ r = r_0 \tag{9.1.3} $$

(since the snapshot of the radial unit measuring rod is just as long as that of a measuring rod at rest relative to $K_0$). Therefore, from (9.1.2) (9.1.3),

$$ \frac{U}{r} = \frac{U_0}{r_0\sqrt{1 - v^2/c^2}}, \tag{*} $$

or, on account of (9.1.1),

$$ \frac{U}{r} = \frac{2\pi}{\sqrt{1 - v^2/c^2}}. \tag{**} $$

We pause to see the speciousness of Einstein’s reasoning. Instead of setting the disc in uniform rotation, we set it in motion at a constant velocity v. We would then expect the radius to contract under the FitzGerald–Lorentz contraction,

$$ r = r_0\sqrt{1 - v^2/c^2}, \tag{9.1.4} $$

and expect that the circumference should remain unchanged under uniform motion,

$$ U = U_0. \tag{9.1.5} $$

The ratio of (9.1.5) to (9.1.4) is (∗), and with (9.1.1), gives (∗∗). So, Einstein was begging the question!

He also contended that clocks go slower at the periphery of the rotating disc because clocks in motion go slower than ones at rest:

The rotating observer notes very well that, of his two equivalent clocks, that placed on the circumference runs slower than that placed at the center.


This is supposedly a result of special relativity and has absolutely nothing to do with the fact that the disc is accelerating uniformly. From these observations he concluded that

In general relativity, space and time cannot be defined in such a way that the differences of the spatial coordinates be directly measured by the unit measuring rod, or difference in the time coordinate by a standard clock.

Let us recall from Sec. 1.1.1.1 that the gravitational form of time dilatation is based upon Einstein’s so-called equivalence principle, and has nothing to do with general relativity!

All during the period when the uniformly rotating disc held Einstein’s attention, probably between mid-July and mid-October of 1912, Einstein expressed his confusion about the relationship between coordinates and measurements with rods and clocks. In Einstein’s own words

One sees already from the previously treated highly special case of the gravitation of masses at rest that the space-time coordinates lose their simple physical interpretation; and it still cannot be foreseen what form the general space-time transformation equations may take. I should like to ask all colleagues to have a try at this important problem!

As Stachel correctly observes

Curiously enough, a paper actually deriving the metric of the rotating disc had been published two years earlier by Theodor Kaluza (1910). The paper was to have been delivered by Kaluza at the 1910 Naturforscherversammlung in Königsberg, where he was then working; but he took sick and only the published version appeared under the title “Zur Relativitätstheorie,” which gave no idea of its contents. I have found no evidence that Einstein, or anyone else in the long history of the rotating-disc problem for that matter, was aware of the existence of Kaluza’s work.

Kaluza was right to conclude that “On closer examination, the geometry of a rotating disc is non-Euclidean, specifically Lobachevskian geometry,” but he gets the analysis wrong. Instead of deriving Einstein’s result,

$$ \int_0^{2\pi} \frac{r}{\sqrt{1 - r^2}}\,d\phi = \frac{2\pi r}{\sqrt{1 - r^2}}, \tag{9.1.6} $$

where r and ϕ are polar coordinates, he writes:

$$ \int \frac{r^2}{\sqrt{1 \pm r^2}}\,d\phi. $$


The square in the numerator is undoubtedly a typo, but the ± in the denominator is perhaps his uncertainty of which non-Euclidean geometry to use.

Kaluza’s expressions for the arc length,

$$ \left[1 + \frac{r^2}{1 \pm r^2}\left(\frac{d\phi}{dr}\right)^2\right]^{1/2} dr, \tag{9.1.7} $$

and phase,

$$ \phi - \phi_0 = \cos^{-1}\left(\frac{r_0}{r}\right) \pm r_0\sqrt{r^2 - r_0^2}, \tag{9.1.8} $$

also suffer from the indeterminacy. Moreover they are wrong: Beginning with his expression for the arc length, (9.1.7), he would have found the equation of a straight line,

$$ \phi - \phi_0 = \cos^{-1}\left(\frac{r_0}{r}\right), \tag{9.1.9} $$

in polar coordinates, where $r_0$ is the distance from the line to the origin. He would have obtained the same result with a Euclidean metric. Placing the $r_0$ in front of the first term on the right-hand side and deleting it from the second term in (9.1.8), and choosing the negative sign, would give the eikonal of geometrical optics, (7.4.11). Kaluza’s result, (9.1.6), is however more interesting, and puts to rest Einstein’s red herring about the lack of physical significance of coordinates.

Consider a sphere of radius R. The Euclidean distance on the sphere, r, has the elliptical counterpart $R\tan(\bar{r}/R)$, where $\bar{r}$ is the elliptic measure of the distance on the sphere. The distance from (r, ϕ) to (r + dr, ϕ) has the elliptic separation

$$ d\bar{r} = \frac{dr}{1 + r^2/R^2}. $$

The transformation from elliptic to hyperbolic geometry consists in replacing the radius R by iR so that the hyperbolic separation is

$$ d\bar{r} = \frac{dr}{1 - r^2/R^2}, \tag{9.1.10} $$

and gives the distance of a hyperbolic straight line segment, $iR\tan(\bar{r}/iR) = R\tanh(\bar{r}/R)$.

Likewise, to determine the circumference of a circle in elliptic geometry we introduce the spherical coordinates

$$ x = R\cos\phi\,\sin(\bar{r}/R), \qquad y = R\sin\phi\,\sin(\bar{r}/R), \qquad z = R\cos(\bar{r}/R). $$

Then the distance from (r, ϕ) to (r, ϕ + dϕ) is

$$ ds^2 = dx^2 + dy^2 + dz^2 = R^2\sin^2(\bar{r}/R)\,d\phi^2, \tag{9.1.11} $$

and, consequently, the circumference of the circle will be

$$ R\int_0^{2\pi} \sin(\bar{r}/R)\,d\phi = 2\pi R\sin(\bar{r}/R). $$

Again making the transition from elliptic to hyperbolic geometry, R → iR, and noting that sin (ix) = i sinh x, we get the hyperbolic circumference as

$$ U = 2\pi R\sinh(\bar{r}/R) = \frac{2\pi r}{\sqrt{1 - r^2/R^2}}. \tag{9.1.12} $$

Moreover, from (9.1.11) we find that the hyperbolic distance between (r, ϕ) and (r, ϕ + dϕ) is

$$ ds = R\sinh(\bar{r}/R)\,d\phi, $$

and combining it with (9.1.10), the hyperbolic separation ds of the points (r, ϕ) and (r + dr, ϕ + dϕ) is

$$ ds^2 = \frac{dr^2}{(1 - r^2/R^2)^2} + \frac{r^2\,d\phi^2}{1 - r^2/R^2}, \tag{9.1.13} $$

which, from what we know in Sec. 2.5, is what Beltrami had found back in 1868! Kaluza’s error was to use the Euclidean measure of the radial term instead of its hyperbolic measure.
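The two forms of the hyperbolic circumference in (9.1.12) can be compared numerically. The sketch below is illustrative only: the function names and the value of R are hypothetical, and the Euclidean radius is taken to be R tanh of the hyperbolic measure divided by R, as stated after (9.1.10).

```python
import math

def euclidean_radius(r_bar, R):
    """Euclidean radius r = R * tanh(r_bar / R) for a hyperbolic measure r_bar."""
    return R * math.tanh(r_bar / R)

def circumference_hyperbolic(r_bar, R):
    """First form of (9.1.12): U = 2*pi*R*sinh(r_bar / R)."""
    return 2.0 * math.pi * R * math.sinh(r_bar / R)

def circumference_from_r(r, R):
    """Second form of (9.1.12): U = 2*pi*r / sqrt(1 - r^2/R^2)."""
    return 2.0 * math.pi * r / math.sqrt(1.0 - (r / R) ** 2)

R = 1.0  # hypothetical radius of curvature, arbitrary units
for r_bar in (0.1, 1.0, 3.0):
    r = euclidean_radius(r_bar, R)
    print(f"r_bar = {r_bar:4.1f}  r = {r:.6f}  "
          f"U(sinh form) = {circumference_hyperbolic(r_bar, R):.6f}  "
          f"U(r form) = {circumference_from_r(r, R):.6f}  "
          f"Euclidean 2*pi*r = {2.0 * math.pi * r:.6f}")
```

The two hyperbolic forms agree for every radius, while the Euclidean value 2πr falls increasingly short of them as r approaches R.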

Thus we see that there is no truth to Einstein’s assertion that “the radial measuring rods have length l; the tangential ones, however, have length $\sqrt{1 - v^2/c^2}$.” Nor can the observer distinguish his position on the disc by how slow his clock goes with respect to the clock at the center. This “misunderstanding is quite fundamental,” to use Einstein’s own words. The rulers are not shorter, nor the clocks slower in the hyperbolic metric. Recall Poincaré’s beautiful discovery of a distance function that made all his tessellations the same size in Fig. 1.1.


Let us reiterate what we said about the difference in size of the bug’s legs in Sec. 2.1.1.

There is a fine line when talking about the metric, length, distance and time. We want to look at the picture in our Euclidean metric, and say “lengths are shorter and time goes more slowly as we proceed further out on our disc.” However, the person who lives there sees no difference, nor can he measure one.

Einstein constructed his general theory as a generalization of the flat metric of Minkowski to a non-flat metric. According to Einstein, the uniformly rotating disc was of “decisive importance” because it showed that a gravitational field (equivalent, to him, to a centrifugal field) causes “non-Euclidean arrangements of measuring rods, and thus compels a generalization of Euclidean space.” The four-dimensional formulation of Minkowski, and its non-flat generalization, required Einstein to go beyond Gauss’s two-dimensional theory of surfaces. In his words:

I first had the decisive idea of the analogy of mathematical problems connected with the theory and Gauss’s theory of surfaces in 1912 after my return to Zurich without knowing at that time Riemann’s and Ricci’s, or Levi-Civita’s, work.

But once the equation of the phase trajectory is obtained, which is given in terms of the coefficients of the first fundamental form, (7.5.1), imposing the condition of the conservation of angular momentum, or its non-Euclidean generalization, gives the equation of motion, (7.5.10). Integration of the latter gives the relation between time and space.

In the Euclidean case, the equation of motion,

$$ \dot{r} = \pm\frac{\sqrt{r^2 - r_0^2}}{r}, \tag{9.1.14} $$

can be integrated at once to give

$$ r^2 - t^2 = r_0^2. \tag{9.1.15} $$
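The integration leading from the Euclidean equation of motion to (9.1.15) can be sketched by separating variables. The choice of the time origin at the point of closest approach, $r(0) = r_0$, is an assumption made here for illustration:

```latex
\frac{r\,dr}{\sqrt{r^2 - r_0^2}} = \pm\,dt
\quad\Longrightarrow\quad
\sqrt{r^2 - r_0^2} = \pm\,t
\quad\Longrightarrow\quad
r^2 - t^2 = r_0^2 .
```

Either sign gives the same quadratic relation, so the two branches of the equation of motion describe the approach to and the recession from $r_0$.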

In terms of the Minkowski line element, this would be a space-like interval, where no ‘signal’ can be transmitted from the point r to the point $r_0$ in time t between the events that take place at those points. Since the velocity is greater than that of light, there can be no causal relation between the two events. However, and this is a big however, (9.1.15) is not a statement


of the invariance of two inertial frames of reference in which in the $K_0$ frame of reference the two events happen simultaneously. Yet, it gives a relation that is contrary to Einstein’s assumption that general relativity should be a generalization of the flat Minkowski metric to a non-flat metric.

Any non-Euclidean generalization of the equation of motion will contain small corrections to the equation of motion (9.1.14), and, thus, cannot change the qualitative nature of the solution, (9.1.15). Hence, there was no need for Einstein to look beyond Gauss’s theory of curvature to look for a four-dimensional generalization to include time.

At least in hindsight, Einstein saw the need for introducing a non-Euclidean metric, motivated by the uniformly rotating disc, as resting upon three restrictions:

(i) The special theory should hold in a ‘global inertial frame,’ where no gravitational field exists.

(ii) Small measuring rods do not change their length in any gravitational field, and the acceleration of a clock has no influence on its rate.

(iii) Any coordinate system may be used, since

The general laws of nature are to be expressed by equations which hold good for all systems of coordinates, that is, are covariant with respect to any substitutions whatever (generally covariant).

The first assumption is that special relativity should be a limiting case when the gravitational field vanishes. By the principle of equivalence, uniform acceleration is indistinguishable from a uniform gravitational field, and so special relativity should be recovered in the limit where all accelerations vanish. However, the second assumption, that rulers and clocks are not affected by acceleration, is contrary to Einstein’s own finding in Sec. 3.8.2.3, and to our conclusions in Sec. 8.4.2.

The paradox, which is not a paradox at all, lies in the principle of equivalence which states that

$$ (\omega r)^2 = \frac{2GM}{r}, \tag{9.1.16} $$

where $\omega = \sqrt{G\rho}$, which is the inverse of the free-fall time. The left-hand side of (9.1.16) is the square of (constant) angular velocity of rotation, while the right-hand side is the gravitational potential. The passage is from one of


constant density, ρ, to one of constant mass, M. Although mathematically equivalent, they are not physically one and the same.

Moreover, a constant gravitational field cannot be annihilated in an accelerating frame of reference, such as a free-falling elevator, as we shall see in the next section. Centrifugal forces are not the same as gravitational forces, as we have realized in Sec. 7.5. How rulers become distorted, and clocks vary in rates, depends on the frame of reference: to the inhabitants of the rotating disc there are no noticeable changes depending on where they are on the disc. It is to us Euclideans that the changes are perceptible because we are using Euclidean rulers whereas the inhabitants are employing hyperbolic ones. So the assumption that the rulers be small enough has no meaning, just as the breaking up of a rigid circular disc when set into motion “on account of the Lorentz contraction of the tangential fibers and the non-contraction of the radial ones” has none.

The theme of this chapter is the nonequivalence of gravitation and acceleration, so the claim that, when the latter vanishes, all the results of special relativity should be recovered is a non sequitur. Special relativity attributes to gravity the same form as electromagnetism, and there is no justification in this. We have Maxwell’s equations to show that electromagnetic waves travel at the velocity of light, but what equations are there to show that gravitational waves also propagate at the same velocity? All we have to do is to remind ourselves of the unsatisfactory formulation of a Maxwellian theory of gravitation that we discussed in Sec. 3.8.1.

The irony of it all is that Einstein sought mathematical help from his friend Grossmann, who was an expert not only in tensor calculus, but also in non-Euclidean geometries. Grossmann is usually criticized for having led Einstein astray on the Ricci tensor, but what he should have suggested was looking at hyperbolic geometry as a setting for a theory of gravity. And this was also brought to Einstein’s attention by Varicak, who not only questioned the reality of the Lorentz contraction, but who also introduced Lobachevsky geometry into special relativity. Therefore, the true limit of a relativistic theory of gravitation should be the flat metric of Euclidean space, and not the flat metric of Minkowski.

All relativistic corrections originate in the negative curvature of space. Consider the conservation of energy:

$$ \dot{r}^2 + r^2\dot{\phi}^2 + 2\Phi(r) = 2W, \tag{9.1.17} $$

where W is the total energy, and $\Phi$ is some scalar potential, per unit mass. The conservation of angular momentum is that

$$ r^2\dot{\phi} = \ell, \tag{9.1.18} $$

be constant, again per unit mass. Inserting (9.1.18) into (9.1.17) results in:

$$ \dot{r} = \pm\sqrt{2W - \frac{\ell^2}{r^2} - 2\Phi(r)}. \tag{9.1.19} $$

Suppose further that $\Phi$ has the Beltrami form in (7.5.11),

$$ \Phi(r) = -\frac{GM}{r}\left(1 + \frac{\bar{\ell}^2}{r^2}\right). \tag{9.1.20} $$

Now, start from the (normalized) equation for the trajectory,

$$ \frac{d\phi}{dr} = \pm\frac{\bar{\ell}\sqrt{E}}{\sqrt{G}\sqrt{G\eta^2 - \bar{\ell}^2}}, \tag{9.1.21} $$

where $\bar{\ell} = \ell/c$ is also the ‘distance of nearest approach,’ or collision parameter, in scattering theory.

For the Beltrami metric, (9.1.13), the coefficients of the fundamental form are [cf. (7.4.4)]

$$ E = \frac{1}{(1 - \omega^2r^2/c^2)^2}, \qquad G = \frac{r^2}{1 - \omega^2r^2/c^2}. $$

Introducing the ‘principle of equivalence,’ (9.1.16), the equation of the pre-geodesic (9.1.21) becomes

$$ \frac{d\phi}{dr} = \frac{\dot{\phi}}{\dot{r}} = \pm\frac{\bar{\ell}}{r\sqrt{\eta^2r^2 - \bar{\ell}^2(1 - \alpha/r)}}, \tag{9.1.22} $$

where $\alpha := 2GM/c^2$ is the Schwarzschild radius. Imposing conservation of angular momentum, (9.1.18), in (9.1.22) gives the radial equation,

$$ \dot{r} = \pm\sqrt{1 - \frac{\bar{\ell}^2}{r^2}\left(1 - \frac{\alpha}{r}\right)}, \tag{9.1.23} $$

for a constant index of refraction. For small α, (9.1.23) has the solution,

$$ \frac{r^2}{1 - \alpha/r} - t^2 = \bar{\ell}^2. \tag{9.1.24} $$

This can be verified as follows. To first-order in α, (9.1.24) is equivalent to

$$ r^2\left(1 + \frac{\alpha}{r}\right) - \bar{\ell}^2 = t^2. $$

Taking the square root and differentiating give

$$ \begin{aligned}
\dot{r} &= \pm\frac{\sqrt{r^2 + \alpha r - \bar{\ell}^2}}{r + \alpha/2}
= \pm\left[1 + \frac{\alpha}{r} - \frac{\bar{\ell}^2}{r^2}\right]^{1/2}\cdot\left(1 - \frac{\alpha}{2r}\right) \\
&= \pm\left[\left(1 + \frac{\alpha}{r} - \frac{\bar{\ell}^2}{r^2}\right)\cdot\left(1 - \frac{\alpha}{r}\right)\right]^{1/2}
= \pm\sqrt{1 - \frac{\bar{\ell}^2}{r^2} + \frac{\alpha\bar{\ell}^2}{r^3}},
\end{aligned} $$

which is (9.1.23). Expression (9.1.24) has a striking resemblance to the exterior solution of the Schwarzschild metric that we will investigate in Sec. 9.10.3.
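The first-order character of this verification can also be checked numerically. The sketch below is an illustration under stated assumptions: the function names and sample values are hypothetical, units are such that c = 1 and the index of refraction is unity, and only the positive branch is used. It differentiates the solution (9.1.24) exactly and compares the result with the right-hand side of (9.1.23); if the agreement holds only to first order in α, the mismatch should scale like α².

```python
import math

def rdot_from_solution(r, ell, alpha):
    """dr/dt from exact differentiation of t^2 = r^2/(1 - alpha/r) - ell^2."""
    t = math.sqrt(r**3 / (r - alpha) - ell**2)
    dt2_dr = r**2 * (2.0 * r - 3.0 * alpha) / (r - alpha) ** 2  # d(t^2)/dr
    return 2.0 * t / dt2_dr

def rdot_radial_equation(r, ell, alpha):
    """Right-hand side of the radial equation (9.1.23), positive branch."""
    return math.sqrt(1.0 - (ell / r) ** 2 * (1.0 - alpha / r))

def mismatch(r, ell, alpha):
    """Absolute difference between the two expressions for dr/dt."""
    return abs(rdot_from_solution(r, ell, alpha) - rdot_radial_equation(r, ell, alpha))

r, ell = 10.0, 2.0  # hypothetical radius and collision parameter
for alpha in (0.0, 0.01, 0.02, 0.04):
    print(f"alpha = {alpha:5.2f}   mismatch = {mismatch(r, ell, alpha):.3e}")
```

Doubling α roughly quadruples the mismatch, which is the behavior expected of a solution that is valid only for small α.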

9.2 The Sagnac Effect

Whereas special relativity teaches us that all inertial frames are equivalent, all accelerative frames are not. That is, we can determine the absolute motion in a uniformly accelerated frame. This has been known since the early days of relativity, and is referred to as the Sagnac effect [13], after the French scientist who discovered the effect. Actually, he claimed that it proved “the reality of the luminiferous aether by the experiment with a rotating interferometer,” which is the title of one of the two publications dealing with his interferometer.

The Sagnac effect was used as an experimental test of whether light can propagate with the velocity c on a moving platform. A non-null effect was found, and this was used as an argument to discredit special relativity. It is treated by emission theory, similar to that of the Michelson–Morley experiment in Sec. 3.2, where light in the direction of the aether wind has


velocity c + u, while light in the opposite direction travels at velocity c − u, where u is the relative velocity of the aether wind. On a disc of radius r, rotating at angular velocity ω, the velocity of light in the direction of rotation is c + rω, whereas light in the opposite direction travels at velocity c − rω.

The Sagnac effect occurs in ring interferometry where a beam of light is split into two beams that are made to follow opposite trajectories. The light beams travel in circles, or rings, that enclose a given area. Upon returning to the initial point on the ring, the light beams are allowed to interact in such a way that they produce an interference pattern. The position of the interference fringes depends on the angular velocity ω of the rotating disc.

When the platform is in motion one of the two beams will cover less distance than the other. This produces a shift in the interference pattern. The original schematic representation of the Sagnac interferometer is shown in Fig. 9.1. The Sagnac interferometer has been likened to a gyroscope: a compass points in the same direction after spinning up. Thus, just like a gyroscope, it measures its own angular velocity and can be used in an inertial guidance system. However, whereas a gyroscope conserves angular momentum, the interferometer does not. This fact will be used explicitly in the following.

The shift in fringes can be considered as a simple statement that light travels different distances in the direction of motion and in the direction opposite to the rotating disc. The times it takes light to travel around the disc in the direction of motion and in the opposite direction are:

$$ ct_\pm = 2\pi r \pm \Delta L_\pm, $$

where $\Delta L_\pm = \omega r t_\pm$ is the change in the distance that light covers when it travels in the direction and in the opposite direction to the rotating disc. Hence, the time difference is:

$$ \Delta t = t_+ - t_- = \frac{2\pi r}{c - \omega r} - \frac{2\pi r}{c + r\omega} = \frac{4\pi r^2\omega}{c^2 - \omega^2r^2}. \tag{9.2.1} $$

This result is typical of emission theory, where one of the velocities is greater than the speed of light. The prediction made by Sagnac was that the difference in time traveled by the beams of light to the screen for an


interference pattern to form depends on the area of the ring, $\pi r^2$, and the angular velocity of rotation, ω. The shift in the fringes is $\Delta\phi = 2\pi\delta = \omega_\lambda\Delta t$, where $\omega_\lambda$ is the frequency of the light used.

Fig. 9.1. The Sagnac Interferometer as originally depicted in his 1913 article. The horizontally rotating plate has a light source at O, which is a lamp with a horizontal metal filament. The objective of the microscope $C_0$ projects the image of the filament through the Nicol prism N which then falls on a reflecting mirror m. Two beams traveling in opposite directions are reflected on four mirrors M which complete a closed circuit, $a_1a_2a_3a_4$.
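The time difference (9.2.1) and the resulting fringe shift can be evaluated with a short script. This is a sketch under stated assumptions: the function names, disc radius, rotation rate and wavelength are hypothetical illustration values (not Sagnac’s apparatus), and $\omega_\lambda$ is taken to be the angular frequency 2πc/λ of the light.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def sagnac_time_difference(r, omega, c=C):
    """Closed form (9.2.1): 4*pi*r^2*omega / (c^2 - omega^2 * r^2)."""
    return 4.0 * math.pi * r**2 * omega / (c**2 - (omega * r) ** 2)

def sagnac_time_difference_direct(r, omega, c=C):
    """Difference of the two transit times t+ and t-, written out separately."""
    t_plus = 2.0 * math.pi * r / (c - omega * r)
    t_minus = 2.0 * math.pi * r / (c + omega * r)
    return t_plus - t_minus

def fringe_shift(r, omega, wavelength, c=C):
    """delta = (phase shift) / (2*pi), with phase shift = omega_lambda * (time difference)."""
    omega_lambda = 2.0 * math.pi * c / wavelength  # angular frequency of the light
    return omega_lambda * sagnac_time_difference(r, omega, c) / (2.0 * math.pi)

# Hypothetical setup: disc of radius 0.5 m turning twice per second, blue light.
r, omega, wavelength = 0.5, 2.0 * math.pi * 2.0, 436e-9
print(f"time difference (closed form) = {sagnac_time_difference(r, omega):.4e} s")
print(f"time difference (direct)      = {sagnac_time_difference_direct(r, omega):.4e} s")
print(f"fringe shift                  = {fringe_shift(r, omega, wavelength):.4f}")
```

Because ωr is minute compared with c in any laboratory arrangement, the denominator is indistinguishable from c², and the shift is in effect proportional to the enclosed area times the angular velocity.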

All relativistic frequency shifts depend on the ratio of the perti-nent energy involved to the rest energy. For example, the gravitationalredshift is proportional to the ratio of gravitational energy to the restenergy. Here, the ratio will be proportional to the kinetic energy ofrotation,

�ϕ = 4π�ω/c2, (9.2.2)

to the rest energy, where, as usual, � is the angular momentum (relativeunit mass).

Page 464: A New Perspective on Relativity

Aug. 26, 2011 11:17 SPI-B1197 A New Perspective on Relativity b1197-ch09

Nonequivalence of Gravitation and Acceleration 437

All treatments of the Sagnac effect use the incorrect general relativistic expression for the line element,

$$ds^2 = dr^2 + \frac{r^2\,d\varphi^2}{1 - \omega^2 r^2/c^2}, \tag{9.2.3}$$

which we have pointed out, on numerous occasions, is inconsistent with a hyperbolic metric of constant negative curvature. If we use the stereographic model of Sec. 7.4, the metric will be given by

$$ds^2 = \frac{dr^2 + r^2\,d\varphi^2}{(1 - \omega^2 r^2/c^2)^2}.$$

This orthogonal metric gives rise to the equation of the geodesic,

$$\frac{d\varphi}{dr} = \frac{\dot\varphi}{\dot r} = \pm\frac{\ell}{r^2}\,\frac{1 - \omega^2 r^2/c^2}{\sqrt{1 - (\ell^2/c^2 r^2)(1 - \omega^2 r^2/c^2)^2}}. \tag{9.2.4}$$

In order that $r$ be a periodic function of time, i.e. $r\omega_\lambda = c\sin(\omega_\lambda t + \vartheta)$, where $\vartheta$ is an arbitrary phase, the angular momentum must be given by [cf. (7.4.8)]

$$\ell = \frac{\omega r^2}{1 - \omega^2 r^2/c^2}. \tag{9.2.5}$$

Inserting (9.2.5) into (9.2.2) gives Sagnac’s expression (9.2.1),

$$\Delta\varphi = \frac{4\pi\omega_\lambda\,\omega r^2}{c^2 - \omega^2 r^2} = \omega_\lambda\Delta t,$$

for the difference in time traveled by the light beams to the screen in order to have an interference pattern.
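The substitution can be verified with a few lines of arithmetic. This sketch is not from the original text; it assumes that the frequency multiplying $\ell$ is the light frequency $\omega_\lambda$, as in the final expression above, and all function names are hypothetical.

```python
import math

def angular_momentum(r, omega, c):
    """Angular momentum per unit mass, eq. (9.2.5)."""
    return omega * r**2 / (1.0 - (omega * r / c) ** 2)

def phase_from_angular_momentum(r, omega, omega_light, c):
    """4*pi*l*omega_light/c^2 (assumption: the light frequency multiplies l)."""
    return 4 * math.pi * angular_momentum(r, omega, c) * omega_light / c**2

def phase_from_delay(r, omega, omega_light, c):
    """omega_light * Delta t, with Delta t taken from (9.2.1)."""
    delta_t = 4 * math.pi * r**2 * omega / (c**2 - omega**2 * r**2)
    return omega_light * delta_t

# Illustrative values only (units with c = 1).
print(phase_from_angular_momentum(1.0, 0.3, 5.0, 1.0),
      phase_from_delay(1.0, 0.3, 5.0, 1.0))
```

The two routes agree because dividing (9.2.5) by $c^2$ turns the denominator $1 - \omega^2 r^2/c^2$ into $c^2 - \omega^2 r^2$.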

Under the principle of equivalence, (9.1.16), the same expression for the angular momentum is obtained in general relativity for the advance of the perihelion, (7.4.8), where it is claimed that (9.2.5) [Møller 52]

cannot in general be interpreted as angular momentum, since the notion of a ‘radius vector’ occurring in the definition of the angular momentum has an unambiguous meaning only in Euclidean space.

The non-conservation of the angular momentum is now due to the presence of gravity, and vanishes in the absence of mass. This is a summary dismissal if ever there was one!


Introducing the expression for the angular momentum, (9.2.5), into the equation of the trajectory, (9.2.4), gives the rate equation,

$$\dot r = c\sqrt{1 - \frac{\omega^2 r^2}{c^2}}, \tag{9.2.6}$$

where we have chosen the positive sign for simplicity. The rate equation (9.2.6) is the momentum per unit mass, and so it is related to the action $S$ by

$$\frac{\partial S}{\partial r} = \dot r.$$

Integrating we find the action,

$$S = c\int\sqrt{1 - \frac{\omega^2 r^2}{c^2}}\,dr = \frac{c}{2}\left\{r\sqrt{1 - \frac{\omega^2 r^2}{c^2}} + \frac{c}{\omega}\sin^{-1}\left(\frac{\omega r}{c}\right)\right\}. \tag{9.2.7}$$

Expression (9.2.7) is the phase of the Hermite polynomials in the short-wavelength (WKB) limit. So as long as $r < c/\omega$ we get a periodic solution of a harmonic oscillator which can be quantized,

$$\oint\dot r\,dr = \frac{c^2}{\omega}\int_0^{2\pi}\cos^2\vartheta\,d\vartheta = \pi\frac{c^2}{\omega}, \tag{9.2.8}$$

where we have used the change of variable, $r = (c/\omega)\sin\vartheta$. Thus, the quantum conditions require the right-hand side of (9.2.8) to be semi-integral multiples of Planck’s constant. No quantum condition exists for the angular momentum (9.2.5) because it is not conserved.
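As a sanity check on (9.2.7) and (9.2.8), the closed-form action can be compared with a direct quadrature of $c\sqrt{1 - \omega^2 r^2/c^2}$. The sketch below is illustrative only; the midpoint rule, the step count and the function names are assumptions made here, not the author's method.

```python
import math

def action_closed_form(r, omega, c):
    """Eq. (9.2.7): (c/2) * [ r*sqrt(1 - w^2 r^2/c^2) + (c/w)*asin(w*r/c) ]."""
    x = omega * r / c
    return 0.5 * c * (r * math.sqrt(1.0 - x * x) + (c / omega) * math.asin(x))

def action_numeric(r, omega, c, n=20_000):
    """Midpoint-rule estimate of c * integral from 0 to r of sqrt(1 - w^2 s^2/c^2) ds."""
    h = r / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += math.sqrt(1.0 - (omega * s / c) ** 2)
    return c * h * total

def loop_integral(omega, c):
    """Full-cycle value: four quarter cycles, each running from r = 0 to r = c/omega."""
    return 4.0 * action_closed_form(c / omega, omega, c)

print(action_closed_form(0.8, 1.0, 1.0), action_numeric(0.8, 1.0, 1.0))
print(loop_integral(4.0, 2.0), math.pi * 2.0**2 / 4.0)
```

The second line of output compares the loop integral with $\pi c^2/\omega$, the right-hand side of (9.2.8).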

What about radii for which $r > c/\omega$? According to special relativity, “for all points with $r < c/\omega$ the rotating system of reference may be represented by a uniformly rotating material,” while if the inequality is reversed, “no absolutely rigid body can exist, since they would provide a means of transmitting signals with velocities greater than $c$” [Møller 52].

The action (9.2.7) becomes imaginary in form, though not in reality, for we now have

$$S = c\int\sqrt{\omega^2 r^2/c^2 - 1}\,dr,$$

and introducing the substitution $r = (c/\omega)\cosh\vartheta$ there results

$$\begin{aligned} S &= \frac{c^2}{\omega}\int\sinh^2\vartheta\,d\vartheta = \frac{c^2}{4\omega}(\sinh 2\vartheta - 2\vartheta) \\ &= \frac{c}{2}\left\{r\sqrt{\omega^2 r^2/c^2 - 1} - \frac{c}{\omega}\cosh^{-1}\left(\frac{\omega r}{c}\right)\right\}. \end{aligned} \tag{9.2.9}$$

In the same way that we transformed from an oblate to a prolate spheroid in Sec. 5.4.4, we have transformed from a periodic to an exponential solution by reversing the inequality $r < c/\omega$. The hyperbolic action, (9.2.9), will be finite because the terms in the parentheses of the second line are proportional to the volume of the pseudosphere of radius $\vartheta$, which we know to be finite.

Likewise, if we had used the substitution $r = (c/\omega)\cos\vartheta$ in (9.2.7), we would have obtained the action,

$$S = \frac{c^2}{4\omega}(2\vartheta - \sin 2\vartheta), \tag{9.2.10}$$

which is proportional to the volume of a sphere in elliptic geometry. Consequently, the transition from a uniformly rotating disc with $r < c/\omega$ to one where $r > c/\omega$ is one from elliptic to hyperbolic geometry.
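The hyperbolic branch can be checked in the same spirit. The following sketch compares the form $(c^2/4\omega)(\sinh 2\vartheta - 2\vartheta)$ of (9.2.9) with a numerical integral of $c\sqrt{\omega^2 r^2/c^2 - 1}$ taken from $r = c/\omega$; the lower limit, the midpoint rule and the function names are assumptions introduced for illustration.

```python
import math

def hyperbolic_action(r, omega, c):
    """(c^2 / 4w) * (sinh 2t - 2t) with r = (c/w) * cosh t, as in (9.2.9)."""
    t = math.acosh(omega * r / c)
    return c**2 / (4.0 * omega) * (math.sinh(2.0 * t) - 2.0 * t)

def hyperbolic_action_numeric(r, omega, c, n=100_000):
    """Midpoint estimate of c * integral from c/w to r of sqrt(w^2 s^2/c^2 - 1) ds."""
    a = c / omega
    h = (r - a) / n
    total = 0.0
    for i in range(n):
        s = a + (i + 0.5) * h
        total += math.sqrt((omega * s / c) ** 2 - 1.0)
    return c * h * total

# Illustrative values only (units with c = 1, omega = 1, so the rim sits at r = 1).
print(hyperbolic_action(3.0, 1.0, 1.0), hyperbolic_action_numeric(3.0, 1.0, 1.0))
```

Agreement here also confirms that the inverse hyperbolic cosine term must carry the factor $c/\omega$ once the result is rewritten in terms of $r$.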

9.3 Generalizations of the Sagnac Effect

The relativistic Sagnac effect considers the magnitude of the shift in the phase of two light beams as the ratio of the potential energy of the centrifugal force to the rest energy. We can also consider the shift in phase caused by the magnetic energy associated with the angular momentum about the polar axis. In the same way that the angular momentum $\ell$ is not conserved in the Sagnac effect, so we expect that $m = r^2\sin^2\vartheta\,\dot\varphi$ will not be conserved.

Instead of a disc, we take a hemisphere with a disc on it that is determined by the angle $\vartheta$ with respect to the normal from the center $O$ of the hemisphere, as shown in Fig. 9.2. The disc will have a perimeter $2\pi\sin\vartheta$. We now set the hemisphere rotating at a constant angular velocity $\omega$, and again determine the time difference for light to propagate in the forward and reverse directions.


Fig. 9.2. Disc cut out of hemisphere at an angle ϑ.

Light traveling with the hemisphere will cover more than one circumference around the disc and hit the light source from behind in time,

$$ct_1 = 2\pi r\sin\vartheta + \omega r t_1\sin\vartheta,$$

while light traveling in the opposite direction of rotation will travel less than the perimeter before colliding with the light source from the front side. The time it takes is

$$ct_2 = 2\pi r\sin\vartheta - \omega r t_2\sin\vartheta.$$

Solving for the times $t_1$ and $t_2$, and forming their difference, gives

$$\Delta t = t_1 - t_2 = \frac{4\pi\omega r^2\sin^2\vartheta}{c^2 - r^2\omega^2\sin^2\vartheta}.$$

The accompanying phase shift is

$$\Delta\varphi = \omega_\lambda\Delta t = \frac{4\pi r^2\omega\omega_\lambda\sin^2\vartheta}{c^2 - r^2\omega^2\sin^2\vartheta},$$

which we shall now show is

$$\Delta\varphi = 4\pi m\omega_\lambda/c^2. \tag{9.3.1}$$

Consider the line element that is generalized by the stereographic inner product,

$$ds^2 = \frac{d\vartheta^2 + \sin^2\vartheta\,d\varphi^2}{(1 - \kappa^2\sin^2\vartheta)^2},$$

where $\kappa$ is still arbitrary, but which we know to be related to the absolute constant. Because $\varphi$ is cyclic we immediately have a first integral,

$$\frac{\sin^2\vartheta\,\varphi'}{(1 - \kappa^2\sin^2\vartheta)\sqrt{1 + \sin^2\vartheta\,\varphi'^2}} = \frac{m}{\ell} = \sin\vartheta_0,$$

where the prime stands for differentiation with respect to $\vartheta$, and $\vartheta_0$ is the minimum value of the colatitude. The magnetic quantum number, $m$, represents the projection of the angular momentum onto the vertical axis.

Rearranging we get

$$\frac{d\varphi}{d\vartheta} = \frac{\dot\varphi}{\dot\vartheta} = \pm\frac{m}{\sin^2\vartheta}\,\frac{1 - \kappa^2\sin^2\vartheta}{\sqrt{\ell^2 - (m^2/\sin^2\vartheta)(1 - \kappa^2\sin^2\vartheta)^2}}.$$

For $\kappa = 0$ this reduces to the well-known equations of classical mechanics,

$$m = r^2\sin^2\vartheta\,\dot\varphi, \qquad r^2\dot\vartheta = \sqrt{\ell^2 - \frac{m^2}{\sin^2\vartheta}}. \tag{9.3.2}$$

Since $m$, as well as $\ell$, is conserved, (9.3.2) can be immediately integrated to give:

$$\int\frac{r^2\,d\vartheta}{\sqrt{\ell^2 - m^2/\sin^2\vartheta}} = \frac{r^2}{\ell}\cos^{-1}\left(\frac{\cos\vartheta}{\cos\vartheta_0}\right) = t - t_0. \tag{9.3.3}$$

This represents the angular distance of a moving point on a great circle from the line of nodes, measured on the orbital plane, that occurs in time $(t - t_0)$, where $t_0$ has appeared as an arbitrary constant of integration. Inversion gives

$$\cos\vartheta = \cos\vartheta_0\cdot\cos\frac{\ell(t - t_0)}{r^2}, \tag{9.3.4}$$

showing that the moving particle will complete a great circle in period $r^2/\ell$. We now compare these with their generalizations in which $m$ is not conserved.
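That (9.3.4) actually solves the second equation of (9.3.2) can be confirmed by finite differences. In this sketch the relation $m = \ell\sin\vartheta_0$ is taken from the first integral quoted earlier; the step size and function names are illustrative assumptions.

```python
import math

def colatitude(t, ell, r, theta0, t0=0.0):
    """Eq. (9.3.4): cos(theta) = cos(theta0) * cos(ell*(t - t0)/r^2)."""
    return math.acos(math.cos(theta0) * math.cos(ell * (t - t0) / r**2))

def residual(t, ell, r, theta0, dt=1e-6):
    """(r^2 * dtheta/dt)^2 - (ell^2 - m^2/sin^2(theta)), using m = ell*sin(theta0)."""
    theta = colatitude(t, ell, r, theta0)
    # Central difference for the time derivative of the colatitude.
    theta_dot = (colatitude(t + dt, ell, r, theta0)
                 - colatitude(t - dt, ell, r, theta0)) / (2.0 * dt)
    m = ell * math.sin(theta0)
    return (r**2 * theta_dot) ** 2 - (ell**2 - m**2 / math.sin(theta) ** 2)

# Illustrative values only; the residual should be close to zero.
print(residual(0.7, 2.0, 1.5, 0.4))
```

The squared form of the equation is used so that the check does not depend on which sign of the root applies at the chosen instant.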


Their generalization for $\kappa \neq 0$ must be

$$m = \frac{r^2\sin^2\vartheta\,\dot\varphi}{1 - \kappa^2\sin^2\vartheta}, \tag{9.3.5}$$

$$r^2\dot\vartheta = \pm\sqrt{\ell^2 - \frac{m^2}{\sin^2\vartheta}(1 - \kappa^2\sin^2\vartheta)^2}. \tag{9.3.6}$$

If $m$ in (9.3.5) is to be the same as in (9.3.1), it requires setting $\kappa = r\omega/c$, and $\dot\varphi = \omega$, so that the angular momentum in the polar direction will not be conserved. Introducing (9.3.5) into (9.3.6) results in

$$r^2\dot\vartheta = \pm\sqrt{\ell^2 - \ell_0^2\sin^2\vartheta}, \tag{9.3.7}$$

which has a different form than (9.3.2). In fact, it will lead to another periodic function whose period depends on the ratio of the conserved, $\ell_0 = r^2\omega$, to the non-conserved, $\ell$, angular momentum.
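Both claims in this passage, that (9.3.5) with $\kappa = r\omega/c$ and $\dot\varphi = \omega$ reproduces the phase shift (9.3.1), and that it reduces (9.3.6) to (9.3.7), are simple algebra and can be spot-checked numerically. The sketch below is illustrative; the function names and sample values are assumptions.

```python
import math

def m_generalized(r, theta, omega, c):
    """Eq. (9.3.5) with kappa = r*omega/c and phi_dot = omega."""
    kappa = r * omega / c
    s2 = math.sin(theta) ** 2
    return r**2 * s2 * omega / (1.0 - kappa**2 * s2)

def phase_from_m(r, theta, omega, omega_light, c):
    """Eq. (9.3.1): 4*pi*m*omega_light/c^2."""
    return 4.0 * math.pi * m_generalized(r, theta, omega, c) * omega_light / c**2

def phase_from_delay(r, theta, omega, omega_light, c):
    """omega_light * Delta t for the disc cut out of the hemisphere."""
    s2 = math.sin(theta) ** 2
    return 4.0 * math.pi * r**2 * omega * omega_light * s2 / (c**2 - r**2 * omega**2 * s2)

def radicand_936(ell, r, theta, omega, c):
    """Expression under the root in (9.3.6)."""
    kappa = r * omega / c
    s2 = math.sin(theta) ** 2
    m = m_generalized(r, theta, omega, c)
    return ell**2 - (m**2 / s2) * (1.0 - kappa**2 * s2) ** 2

def radicand_937(ell, r, theta, omega):
    """Expression under the root in (9.3.7), with ell_0 = r^2 * omega."""
    return ell**2 - (r**2 * omega) ** 2 * math.sin(theta) ** 2

print(phase_from_m(1.0, 0.7, 0.4, 5.0, 1.0), phase_from_delay(1.0, 0.7, 0.4, 5.0, 1.0))
print(radicand_936(3.0, 1.0, 0.7, 0.4, 1.0), radicand_937(3.0, 1.0, 0.7, 0.4))
```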

Calling that ratio $\lambda$, and integrating (9.3.7), now leads to

$$\int_0^{\vartheta}\frac{d\vartheta}{\sqrt{1 - \lambda^2\sin^2\vartheta}} = \int_0^z\frac{dz}{\sqrt{(1 - z^2)(1 - \lambda^2 z^2)}} = \frac{\ell}{r^2}(t - t_0). \tag{9.3.8}$$

Like the integral for arcsine, inverting (9.3.8) gives

$$z = \sin\vartheta = \frac{\ell_0}{\ell}\,\mathrm{sn}\left(\frac{\ell(t - t_0)}{r^2}\right), \tag{9.3.9}$$

where sn is an elliptic function, coming from the elliptic integral of the first kind, (9.3.8). For $\lambda = \ell_0/\ell < 1$, the period is real and finite.

The period $T$ satisfies $\ell T/r^2 = 2K$, where $K$ is the complete elliptic integral of the first kind. It decreases with increasing values of the angular momentum. The particle on the great circle makes complete revolutions; the motion is a libration. Observe that $\dot\vartheta$ never changes sign. Its square is analogous to the kinetic energy of the particle so that the particle can never come to rest as in the case of a plane pendulum whose total energy is greater than the potential energy. And like the plane pendulum, the period is not independent of the amplitude so that the motion is not isochronous.
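The dependence of the period on the angular momentum can be made concrete by evaluating $K$ numerically. The sketch below uses the arithmetic-geometric mean, a standard way of computing the complete elliptic integral of the first kind; it is a technique chosen here for illustration, and the function names and sample values are assumptions.

```python
import math

def complete_elliptic_k(lam, tol=1e-14):
    """K(lam) = pi / (2 * AGM(1, sqrt(1 - lam^2))), valid for 0 <= lam < 1."""
    a, b = 1.0, math.sqrt(1.0 - lam * lam)
    while abs(a - b) > tol * a:
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)

def period(ell, ell0, r):
    """T from ell*T/r^2 = 2*K(lam), with lam = ell0/ell < 1, as stated in the text."""
    lam = ell0 / ell
    return 2.0 * complete_elliptic_k(lam) * r**2 / ell

# Illustrative values: the period shrinks as the angular momentum ell grows.
for ell in (1.5, 2.0, 4.0):
    print(ell, period(ell, 1.0, 1.0))
```

For $\lambda \to 0$ the integral tends to $\pi/2$, so the modulus dependence is what makes the motion non-isochronous.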


9.4 The Principle of Equivalence

The principle of equivalence asserts that, in some sense, a field of acceleration is equivalent to a gravitational field [Fock 66]. That is to say, by transforming to an accelerated frame of reference the gravitational field can be made to disappear. As a consequence, a gravitational field can be mimicked by a field of acceleration, and both can be made to disappear by a transformation to a local inertial frame where material particles behave as if they were ‘free’ of gravitational or centrifugal forces.

The principle of equivalence has become part of folklore. Gamow [62] tells us that any departure from uniform motion, like a moving car hitting a railing, “will be painfully noticeable.” To explain what is happening, Einstein is said to have devised a gedanken experiment in which he is found in a rocket ship, as shown in Fig. 9.3, far away from any masses which would influence the outcome of any experiment he may perform.

Fig. 9.3. Gamow’s [62] depiction of Einstein’s gedanken experiment showing the equivalence between acceleration and gravity.


Before the rockets are fired, all unattached objects would float freely. However, once the rockets have been turned on, everything that is unattached will suddenly be slammed against the side where the rockets are operating. A gravitational field has been produced. An accelerated rocket ship has remarkably created a gravitational field, and if we can tune the rockets to the same acceleration experienced on Earth, the inhabitants of the spacecraft would believe that they are still on Earth.

Now Einstein performs an experiment in which he releases two balls of different materials. While the balls are in his hand, they are accelerating along with him and the spacecraft. But, once released there will be no force upon them and they will move at the constant velocity they had when they were released. In other words, they will be in a state of uniform motion. But, the spaceship is still accelerating so that at some point the accelerating floor will come crashing into them simultaneously. However, to Einstein it will appear that the balls are falling under the influence of gravity and hit the floor at the same time. So an accelerating reference frame can mimic a gravitational field, or even annul one.
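The kinematics of the thought experiment can be written out in an inertial frame. This is only an illustrative sketch under Newtonian assumptions (constant cabin acceleration, release height measured from the floor); the function names and numbers are hypothetical.

```python
def separation(t, height, accel):
    """Gap between a released ball and the cabin floor, seen from an inertial frame.

    Ball:  x = height + v0*t          (no force acts after release)
    Floor: x = v0*t + accel*t**2/2    (the cabin keeps accelerating)
    The common release velocity v0 cancels, and no mass or material constant appears.
    """
    return height - 0.5 * accel * t * t

def impact_time(height, accel):
    """Time at which the floor reaches the ball: sqrt(2*height/accel)."""
    return (2.0 * height / accel) ** 0.5

# Two balls of different materials released from the same height share one impact time.
print(impact_time(1.5, 9.81), impact_time(1.5, 9.81))
```

Because nothing about the ball enters the expression, the simultaneous arrival follows without any appeal to gravity.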

As Einstein concludes:

The implementation of the general theory of relativity [for velocity and for acceleration] leads directly to a theory of gravitation; because we can ‘produce’ a gravitational field by a mere change in the coordinate systems.

In fact, it has been claimed [Stachel 89] that the seeming equivalence between a field of gravity and a non-inertial frame of motion, like a rotating disc, was behind Einstein’s search for a geometrical theory of gravitation. Drawing on the putative analogy between the properties of measuring rods and clocks on a rotating disc with gravity, Einstein [20] writes

In the general theory of relativity space and time cannot be defined in such a way that differences in the spatial coordinates can be directly measured by the unit measuring rod, or differences in the time coordinate by a standard clock.

Einstein was troubled with the relationship between measuring rods and clocks, on the one hand, and the appropriate coordinates to describe accelerating frames, on the other hand. Although he realized that the observer located at the center of a rotating disc will admit that the outer edge of the disc requires a non-Euclidean geometry, he did not indicate which non-Euclidean geometry is required [Gray 07]. What we will do here is to show that a uniformly rotating disc can be described by the Beltrami metric of hyperbolic geometry.

Einstein [89] considered the analogy with a rotating disc in two papers published in 1912. The first is an attempt to deal with a stationary, uniform, gravitational field by allowing the speed of light to vary. So he was led to consider an emission theory where a source emits light traveling at a velocity $c$, but to a stationary observer who views the source traveling at a relative velocity $u$, the speed of light will appear to be $c + u$. As we have seen in Sec. 3.2, emission theory is used to analyze the Michelson–Morley interferometer experiment, but is at odds with the Fresnel dragging coefficient in Fizeau’s experiment, which requires the relativistic addition of velocities. Oddly enough this was after he made his ‘agreement’ to disagree with Ritz, which we discussed in Sec. 4.2.2.

Einstein then went on to consider spatially inhomogeneous, but again time-independent, gravitational fields. Here, he drew on the analogy with a rotating disc for which he had to downgrade his equivalence principle to infinitesimal regions, or what mathematicians would call tangent spaces. To Einstein gravity acts as a non-uniform force, affecting all bodies equally, but varying from point to point. In his own words:

Let us consider a space-time domain in which no gravitational field exists relative to a reference body K whose state of motion has been suitably chosen. . . Let us suppose the same domain referred to a second body of reference K′, which is rotating uniformly with respect to K. In order to fix our ideas, we shall imagine K′ to be in the form of a plane circular disc, which rotates uniformly in its own plane about its center. An observer who is sitting eccentrically on the disc K′ is sensible to a force which acts outwards in a radial direction, and which he would interpret as an effect of inertia (centrifugal force) by an observer who was at rest with respect to the original reference body K. But, the observer on the disc may regard his disc as a reference body which is ‘at rest’; on the basis of the general principle of relativity he is justified in doing this. The force acting on himself, and in fact on all other bodies which are at rest relative to the disc, he regards as the effect of a gravitational field. Nevertheless, the space distribution of this gravitational field is of a kind that would not be possible in Newton’s theory of gravitation. (The field disappears at the center of the disc and increases proportionally to the distance from the center as we proceed outwards.) But since the observer believes in the general theory of relativity, this does not disturb him, he is quite in the right when he believes that a general law of gravitation can be formulated by a law which not only explains the motion of the stars correctly, but also the field of force experienced by himself.

So what Einstein is saying is that if we can solve the uniformly rotating disc, we have solved the problem of an inhomogeneous gravitational field. The confusion that existed in Einstein’s mind is exemplified by the following passage:

. . . at this stage the definition of space coordinates also represents insurmountable difficulties. If the observer applies his standard measuring rod tangentially to the edge of the disc, then, as judged from the Galilean system, the length of this rod will be less than 1, since, moving bodies suffer a shortening in the direction of motion. On the other hand, the measuring rod will not experience a shortening in length, as judged from K, if it is applied to the disc in the direction of the radius. If, then, the observer first measures the circumference of the disc with his measuring rod and then the diameter of the disc, on dividing one by the other, he will not obtain as quotient the familiar number π = 3.14 . . ., but a larger number, whereas, of course, for a disc at rest with respect to K, this operation would yield π exactly. This proves that the propositions of Euclidean geometry cannot hold exactly on a rotating disc, nor in general in a gravitational field, at least if we attribute the length 1 to the rod in all positions and in every orientation. Hence the idea of a straight line also loses meaning. We are therefore not in a position to define exactly the coordinates x, y, z relative to the disc by means of the method used in discussing the special theory, and as long as the coordinates and times of events have not been defined, we cannot assign an exact meaning to the natural laws in which these occur.

Granted geodesics are no longer straight lines, but this does not mean that the uniformly rotating disc does not possess a well-defined metric. The distinction between the uniformly rotating disc and gravity will come not from the definition of the metric, but from the non-constancy of the curvature of the metric, as we will appreciate in Sec. 9.6.

Rather, Poincaré offers a concrete model in his unevenly heated disc that we discussed in Sec. 2.1.1. Length, as it would appear to us outside of the disc, will become distorted, and how much it will be distorted will depend on the radius $R$ of the disc, which is the absolute constant. Distance, Poincaré now defines as $d\bar r = dr/(1 - r^2/R^2)$, which for a finite tract is $\bar r = R\tanh^{-1}(r/R)$.
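The finite distance quoted for Poincaré's disc follows from integrating $dr/(1 - r^2/R^2)$ from the center, which a short numerical sketch can confirm. The function names and the midpoint rule are illustrative assumptions, not part of the original text.

```python
import math

def hyperbolic_distance(r, R):
    """Distance from the centre as measured by the shrinking rulers: R * atanh(r/R)."""
    return R * math.atanh(r / R)

def hyperbolic_distance_numeric(r, R, n=20_000):
    """Midpoint sum of ds / (1 - s^2/R^2) from s = 0 to s = r, with 0 <= r < R."""
    h = r / n
    return h * sum(1.0 / (1.0 - ((i + 0.5) * h / R) ** 2) for i in range(n))

# The measured distance grows without bound as the Euclidean radius approaches R.
for r in (0.5, 0.9, 0.99, 0.999):
    print(r, hyperbolic_distance(r, 1.0))
```

The unbounded growth near $r = R$ is the sense in which the rim is infinitely far away for the inhabitants of the disc.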

Einstein [20] had to be familiar with Poincaré’s ideas about space because he uses the unevenly heated slab as an illustration in his presentation of his general theory as he did about time measurements in the special theory by bouncing off light signals between observers in different inertial frames. Einstein now considers a grid of little squares on a marble slab as constituting a “Euclidean continuum with respect to a little rod, which has been used as a ‘distance’ (line-interval).” He then considers the thermal deformation of the rod:

We shall suppose that the rods ‘expand’ by an amount proportional to the increase in temperature. We heat the central part of the marble slab, but not the periphery, in which two of our little rods can still be brought into coincidence at every position on the table. But, our construction of squares must necessarily come into disorder during the heating, because the little rods on the central region of the table expand, whereas those on the outer part do not. With reference to our little rods, defined as unit lengths, the marble slab is no longer a Euclidean continuum, and we are also no longer in the position of defining Cartesian coordinates directly with their aid, since the above construction can no longer be carried out. . . The method of Cartesian coordinates must then be discarded, and replaced by another which does not assume the validity of Euclidean geometry for rigid bodies. The reader will notice that the situation depicted here corresponds to the one brought out by the general postulate of relativity.

If Euclidean geometry is to be discarded, then what must take its place? Einstein goes on to tell us:

Gauss indicated the principles according to which we can treat the geometrical relationships in the surface, and thus pointed out the way to the method of Riemann of treating multi-dimensional, non-Euclidean continua. Thus, it is that the mathematicians long ago solved the formal problems to which we are led by the general postulate of relativity.

But, we know Gauss never published anything on non-Euclidean geometry, apart from the occasional correspondence. So it was not Gauss who pointed out the way to Einstein through Riemannian geometry. In other words, Einstein finds the necessity of creating a different edifice than the one which has already been constructed. He chose not to avail himself of the existing non-Euclidean geometries of constant curvature, but chose a path that would appear to be a generalization of his special theory of relativity of space-time, and referred to it as ‘the general principle of relativity.’

Einstein’s assumption that there are no privileged systems of coordinates, which seemingly appears as a generalization of the covariant form of his special relativity, is a red herring. The principle of relativity, which makes equivalent the observer and what he is observing, relies on inertial reference frames. In the presence of gravitation these frames simply do not exist. The inability to distinguish between cause and effect, or the reciprocity of phenomena in electrodynamics, like the observation

during the relative motion of a magnet with respect to a conducting circuit, an electric current is induced in the latter. It is all the same whether the magnet is moved or the conductor; only the relative motion counts.

The observation was made by the sixteen-year-old Einstein, but has no place in gravitation. In the words of Fock [66] “in the ‘General Theory of Relativity’ there is less relativity and not more than in the ‘special theory’.”


No such dilemma occurs if we can show that the metric of a uniformly rotating disc corresponds exactly to the Beltrami metric. Whether or not it applies to gravity is another matter.

As a result of the deformations caused by an accelerated frame of reference, the principle of equivalence had to undergo qualifications and restrictions. First, and foremost, the equivalence was a local one, at a single point in space [Fock 66]. Second, the deformations on the body and on the measuring sticks used in measurement must be small enough so that the notion of a rigid body retains meaning [Møller 52]. Thus, an equivalence between an accelerated frame of reference, caused by a rigid uniformly rotating system and a gravitational field, should hold if the motion of the former is slow enough and the field created by the latter is weak enough.

Deformations, whether large or small, cause deviations from Euclidean geometry. If the measuring rods are contracted on a rotating disc in the direction of rotation, we cannot expect the Pythagorean theorem to hold in its Euclidean form for any inscribed right triangle. Thus, what might be locally Euclidean may very well deviate to other geometries when stresses are present. Not long after the advent of special relativity Robb [11] noticed that the Euclidean triangle of velocities must be replaced by a Lobachevsky triangle for large velocities.

Distortions which require a geometry different from Euclidean geometry also produce optical effects [Robb 11]. When light encounters a change in density of the medium, the rays bend in such a way that they minimize their propagation time between ray end-points. According to Huygens’s [62] principle, objects are not where they appear to be, but are slightly displaced due to the curvature of the rays. In the same way that a non-constant index of refraction relates the Euclidean distance to the optical path length, a metric density relates the Euclidean distance to the hyperbolic distance.

The hyperbolic geometry that describes a uniformly rotating disc also describes the bending of light by a massive body when the transition is made from a constant to a non-constant surface of negative curvature. The transition occurs by relating the absolute constant of the hyperbolic geometry to a free-fall time and then transferring from a system of constant density to one of constant mass. A uniformly rotating disc has constant density, while, in the deflection of light, the mass is constant. In free-falling frames, the laws of physics are locally the same as in inertial frames, so that this transformation does not introduce anything that would have an effect on non-inertial, or gravitational, forces. We will thus come to appreciate that not all forms of accelerative motion are equivalent.

9.5 Fermat’s Principle of Least Time and Hyperbolic Geometry

As we know from Secs. 2.2.3 and 7.2.2, Fermat’s principle of least time states that light propagates between any two points in such a way as to minimize its travel time. Fermat knew that light travels more slowly in denser materials, but he did not know whether or not light travels at a finite speed or infinitely fast.

The index of refraction, $\eta$, takes into account the inhomogeneities through which light propagates. Over scales in which the Earth appears as a flat surface $y = 0$, $\eta$ is a function only of the height $y$. The optical path length, $I$, or the product of the propagation time and the velocity of light connecting two points, $(x_1, y_1)$ and $(x_2, y_2)$, in a plane extending above and normal to the surface, is

$$I = \int_{x_1}^{x_2}\eta(y)\sqrt{1 + y'^2}\,dx, \tag{9.5.1}$$

where the prime stands for differentiation with respect to $x$. Moreover, if we assume that the index of refraction decreases with height by making it inversely proportional to its distance $y$ from the $x$-axis, Fermat’s principle of least time, (9.5.1), becomes the Poincaré upper half-plane model of hyperbolic geometry. Poincarites from the heated plane model of Sec. 2.1.1 find that their rulers shrink as they do when approaching the $x$-axis, so that the boundary appears infinitely far away.

The geodesics look much different than straight lines connecting two points in Euclidean geometry. The geodesics can be found from the condition that Fermat’s principle (9.5.1) be an extremum. With $\eta(y) = 1/y$, the Euler equation for the extremality of (9.5.1) is

$$\frac{d}{dx}\frac{\partial\Phi}{\partial y'} = \frac{\partial\Phi}{\partial y}, \tag{9.5.2}$$

where $\Phi = \sqrt{1 + y'^2}/y$ is the integrand of (9.5.1). The solution to the resulting differential equation,

$$y'' + (1 + y'^2)/y = 0,$$

is the family of circles,

$$(x - a)^2 + y^2 = b^2,$$

where $a$ and $b$ are two constants of integration. These circles are centered on the $x$-axis, and since we are considering only the upper half-plane, $y > 0$, the half-circumferences will be the geodesics of our space. As $x_1 \to x_2$ the geodesics straighten out into lines parallel to the $y$-axis.
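That semicircles centered on the $x$-axis satisfy the geodesic equation can be checked by finite differences. The sketch below is illustrative; the step size, the sample circle and the function names are assumptions.

```python
import math

def semicircle(x, a, b):
    """Upper half of the circle (x - a)^2 + y^2 = b^2."""
    return math.sqrt(b * b - (x - a) ** 2)

def geodesic_residual(x, a, b, h=1e-4):
    """y'' + (1 + y'^2)/y evaluated with central differences; should be near zero."""
    y0 = semicircle(x, a, b)
    y_minus = semicircle(x - h, a, b)
    y_plus = semicircle(x + h, a, b)
    y1 = (y_plus - y_minus) / (2.0 * h)            # first derivative
    y2 = (y_plus - 2.0 * y0 + y_minus) / (h * h)   # second derivative
    return y2 + (1.0 + y1 * y1) / y0

# Illustrative circle of radius 2 centred at x = 1.
print(geodesic_residual(1.3, 1.0, 2.0))
```

Differentiating the circle equation twice gives $1 + y'^2 + yy'' = 0$ directly, which is the same statement in closed form.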

Employing polar coordinates, the arc length, $\gamma$, between $(r_1, \theta_1)$ and $(r_2, \theta_2)$ cannot be less than [cf. (2.3.8)]

$$h(\gamma) = \kappa\int_\gamma\frac{\sqrt{dx^2 + dy^2}}{y} = \kappa\int_{\theta_1}^{\theta_2}\frac{\sqrt{r'^2 + r^2}}{r\sin\theta}\,d\theta \geq \kappa\int_{\theta_1}^{\theta_2}\frac{d\theta}{\sin\theta} = \kappa\ln\left(\frac{\csc\theta_2 - \cot\theta_2}{\csc\theta_1 - \cot\theta_1}\right),$$

where $\kappa$, the radius of curvature, is the absolute constant of the hyperbolic geometry. Different hyperbolic geometries with different values of $\kappa$ are not congruent [Busemann and Kelly 53]. In the limit as $\theta_2 \to \pi/2$, the angle $\theta_1$ becomes the angle of parallelism,

$$\gamma = \kappa\ln\cot[\Pi(\gamma)/2]. \tag{9.5.3}$$

This is still another way of deriving the Bolyai–Lobachevsky formula that expresses the angle of parallelism, $\Pi$, as a sole function of the hyperbolic arc length $\gamma$, which is shown in Fig. 9.4 to be the shortest distance connecting the two bounding parallels. The angle of parallelism enters in the analysis of the Terrell–Weinstein effect which relates the FitzGerald–Lorentz contraction to a rotation [cf. Sec. 9.9].
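The limit leading to (9.5.3) can be illustrated numerically: setting $\theta_2 = \pi/2$ in the logarithmic bound and inverting (9.5.3) should return the starting angle $\theta_1$. This is a sketch under the identity $\csc\theta - \cot\theta = \tan(\theta/2)$; the function names are hypothetical.

```python
import math

def arc_length_bound(theta1, theta2, kappa=1.0):
    """kappa * ln[(csc t2 - cot t2) / (csc t1 - cot t1)], the lower bound in the text."""
    def f(t):
        return 1.0 / math.sin(t) - math.cos(t) / math.sin(t)
    return kappa * math.log(f(theta2) / f(theta1))

def angle_of_parallelism(gamma, kappa=1.0):
    """Invert (9.5.3): Pi = 2 * atan(exp(-gamma / kappa))."""
    return 2.0 * math.atan(math.exp(-gamma / kappa))

theta1 = 0.6
gamma = arc_length_bound(theta1, math.pi / 2.0)
print(gamma, angle_of_parallelism(gamma))  # second value should reproduce theta1
```

At zero distance the angle of parallelism is a right angle, and it decreases toward zero as the distance grows, which is the distinctly non-Euclidean content of the formula.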

We have already remarked in Sec. 2.1.1 that Poincaré originally conceived of an unevenly heated slab where the $x$-axis is infinitely cold. As the Poincarites approach the $x$-axis, the drop in temperature causes them and their rulers to contract in exact proportion as they do. We also know from Sec. 2.5 that Poincaré also considered a disc model of hyperbolic geometry. If the Poincarites living in the half-plane and disc could communicate with one another there would be nothing that would allow them to distinguish between these two worlds.

Fig. 9.4. The angle of parallelism between two bounding parallels connected by the geodesic curve $\gamma$.

We can think of the Poincaré sphere, in three dimensions, as possessing an index of refraction that varies from the center of the sphere to its surface as being proportional to $(R^2 - r^2)$, where $r$ is the Euclidean distance from the sphere’s center, and $R$ is the radius of the sphere. From the foregoing quote, we know that Einstein was familiar with this model; just when, we do not know.

If $R$ happens to be a star’s radius its temperature at its surface would be infinitely cold, just like the $x$-axis in the upper half-plane model. By rescaling to a unit radius, the Poincaré unit disc model has a hyperbolic length of a curve $\gamma$ given by

$$\gamma = \kappa\int_\gamma\frac{\sqrt{dx^2 + dy^2}}{1 - x^2 - y^2}, \tag{9.5.4}$$

where the ‘stereographic’ inner product of the hyperbolic plane, $1 - x^2 - y^2$, plays the same role as the inverse of the index of refraction in (9.5.1). The hyperbolic length (9.5.4) is the metric that gives the interior of the unit disc its hyperbolic structure. It can be derived by mapping the entire half-plane into the unit disc by means of an inversion [Needham 97], but that will not concern us here since it involves entering the complex plane [cf. Sec. 2.4].

The absolute constant of hyperbolic geometry determines whetherwe are in configuration or velocity space, and we will have the occasion to

Page 479: A New Perspective on Relativity

Aug. 26, 2011 11:17 SPI-B1197 A New Perspective on Relativity b1197-ch09

452 A New Perspective on Relativity

switch back and forth. The radii of the Poincaré discs will set the limitationsimposed by relativity: Whereas in velocity space the disc will have radiusc, the radius will be c/ω in the configuration space of a uniformly rotatingdisc, where ω is the constant angular speed of rotation. The curvature ofthe space is negative and constant [cf. (9.6.17) below]. The transition tonon-constant curvature consists in replacing the angular velocity by thefree-fall frequency, thereby transferring a system at constant density to oneof constant mass. From this we conclude that

it is either configuration space or velocity space which determines themetrical properties, and not space-time. Time enters in the magnificationof the Beltrami coordinates in velocity space.

To the Poincarites living in the κ-disc their world would appear infinite because their rulers shrink along with them as they approach the rim, κ. This can be seen by introducing the hyperbolic polar coordinates, x = κ tanh(r/κ) cos ϑ and y = κ tanh(r/κ) sin ϑ, so that the hyperbolic metric (9.5.4) becomes

$$d\gamma^2 = dr^2 + \kappa^2 \sinh^2(r/\kappa)\, d\vartheta^2. \tag{9.5.5}$$

In this polar geodesic parametrization, E = 1 and G > 0, where $\sqrt{G}$ is the measure at which the radial geodesics are spreading out from the origin [O’Neill 66]. Because sinh x > x for all x > 0, the rate at which the geodesics spread out, κ sinh(r/κ), will be greater in the hyperbolic plane than the Euclidean plane, where the rate of spreading is r, which is what the hyperbolic rate tends to in the limit as κ → ∞. Consequently, distances become larger, or equivalently, measuring sticks shrink as the rim is approached. This shrinkage causes the geodesics to bend in such a way that they cut the rim orthogonally.
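The comparison of spreading rates can be illustrated numerically. The following is only a minimal Python sketch; the function name `hyperbolic_spread` and the sample values of r and κ are illustrative assumptions, not part of the text.

```python
import math

def hyperbolic_spread(r: float, kappa: float) -> float:
    """Spreading rate sqrt(G) = kappa * sinh(r / kappa) read off from (9.5.5)."""
    return kappa * math.sinh(r / kappa)

r = 1.0  # illustrative radial distance
for kappa in (1.0, 10.0, 1000.0):
    rate = hyperbolic_spread(r, kappa)
    # The ratio rate / r exceeds 1 and approaches 1 as kappa grows (Euclidean limit).
    print(f"kappa={kappa:8.1f}  hyperbolic rate={rate:.9f}  ratio to Euclidean={rate / r:.9f}")
```

As κ is increased the printed ratio falls toward unity, which is the Euclidean limit quoted above.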

The Euclidean parallel postulate, that if a point is not on a given line then there is a unique line through this point that does not meet that line, is invalidated in the hyperbolic plane. In fact, there are an infinite number of geodesics that pass through any given point that do not meet another geodesic.

The geodesics still appear as straight lines to the Poincarites, whereas, to us Euclideans, they appear to be bent, and things vary in size depending on where we look. We also see things like the bending of light and the shifting of frequencies in a gravitational field.

9.6 The Rotating Disc

Consider a rotating κ-disc, where κ is the relativistic limit that is placed on the radius vector. At the center of the disc we have an inertial system which is described by Euclidean geometry. A clock located anywhere else on the disc will have a velocity rω relative to the inertial system, and consensus has it that its clock will be retarded by the amount,

$$\tau = t\sqrt{1 - \frac{r^2\omega^2}{c^2}}. \tag{9.6.1}$$

This fixes the absolute constant, or the disc radius, at κ = c/ω.

Now, it is argued [Møller 52] that any rod in motion should undergo a FitzGerald–Lorentz contraction. This means that a measuring rod connecting any two points on the disc that are at a distance r from the center, say, (r, ϑ) and (r, ϑ + dϑ), is shortened with respect to the length of the rod in the inertial frame, $dr_0$, by an amount,

$$r\, d\vartheta = dr_0 \sqrt{1 - r^2/\kappa^2}.$$

From our earlier discussion, we expect the geodesics to be either bow-shaped or straight lines if they pass through the origin. We will now give a geometric explanation of why successive Doppler shifts occur with rotations. In order to do so, we must determine the ratio of the hyperbolic to Euclidean lengths.

Consider two variable points, u and v, on the interval $(x_1, x_2)$. The hyperbolic distance between u and v is given by the cross-ratio, defined in Sec. 2.2.4,

$$h(u, v) = \frac{\kappa}{2}\ln\left[\frac{e(x_1, u)}{e(x_1, v)} \cdot \frac{e(x_2, v)}{e(x_2, u)}\right] = \frac{\kappa}{2}\ln\left[1 + \frac{e(u, v)}{e(v, x_1)}\right] + \frac{\kappa}{2}\ln\left[1 + \frac{e(u, v)}{e(u, x_2)}\right],$$

where e(u, v) is the Euclidean distance between u and v, and $e(x_1, x_2) = e(x_1, v) + e(v, x_2)$. Since we will let u and v tend to a common limit, p, we can expand the logarithms in series and retain only the lowest order to obtain, in the limit, the metric density [Busemann and Kelly 53],

$$\lim_{u,v \to p} \frac{h(u, v)}{e(u, v)} = \frac{\kappa}{2}\left[\frac{1}{e(p, x_1)} + \frac{1}{e(p, x_2)}\right] =: \varrho(p), \tag{9.6.2}$$

which is the inverse of the harmonic mean of the two distances.

In respect to Fermat’s principle of least time, (9.5.1), the metric density (9.6.2) can be associated with a non-constant index of refraction, for it converts the Euclidean distance, de, into a hyperbolic distance, dh. Just as the index of refraction varies with height, by causing light to arch its path upwards in order to minimize its propagation time between given endpoints, acceleration, in general, creates distortion causing objects to vary in size and not be where they seem to be [Huygens 62].
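The limit defining the metric density (9.6.2) can be checked numerically on a single chord. The sketch below is written in Python and assumes collinear points ordered as x1 < v < u < x2 on the real line; the function names and the sample numbers are hypothetical choices made only for illustration.

```python
import math

def hyperbolic_distance(u, v, x1, x2, kappa=1.0):
    """(kappa/2) * ln of the cross-ratio for collinear points x1 < v < u < x2."""
    return 0.5 * kappa * math.log(((u - x1) / (v - x1)) * ((x2 - v) / (x2 - u)))

def metric_density(p, x1, x2, kappa=1.0):
    """Right-hand side of (9.6.2): (kappa/2) * [1/e(p, x1) + 1/e(p, x2)]."""
    return 0.5 * kappa * (1.0 / (p - x1) + 1.0 / (x2 - p))

x1, x2, p = -1.0, 1.0, 0.3  # chord end points and the common limit point (illustrative)
for eps in (1e-1, 1e-3, 1e-5):
    u, v = p + eps, p - eps
    ratio = hyperbolic_distance(u, v, x1, x2) / (u - v)
    # The ratio of hyperbolic to Euclidean length tends to the metric density.
    print(f"eps={eps:.0e}  h/e={ratio:.10f}  density={metric_density(p, x1, x2):.10f}")
```

Shrinking the separation of u and v about p drives the ratio h/e toward the value given by (9.6.2).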

In order to obtain an explicit expression for the metric density, (9.6.2), we use two elementary facts about circles:

(i) All chords passing through an interior fixed point are divided into two parts whose lengths have a constant product,

$$e(p, x_1)\, e(p, x_2) = (1 + r/\kappa)(1 - r/\kappa) = 1 - r^2/\kappa^2,$$

and

(ii) the length of a chord is twice the square root of the difference between the squares of the radius and of the perpendicular distance from the center to the chord,

$$e(p, x_1) + e(p, x_2) = e(x_1, x_2) = 2\sqrt{1 - \frac{r^2}{\kappa^2}\sin^2\phi}.$$

Combining these two geometrical facts gives

$$\varrho(u_1, u_2) = \kappa\, \frac{\sqrt{1 - (r^2/\kappa^2)\sin^2\phi}}{1 - r^2/\kappa^2}, \tag{9.6.3}$$

where φ is the angle formed by the intersection of lines $\ell_1$ and $\ell_2$ in Fig. 9.5. The lines intersect at a point p in the κ-disc. The polar coordinates are

$$u_1 = (r/\kappa)\cos\vartheta, \qquad u_2 = (r/\kappa)\sin\vartheta, \tag{9.6.4}$$

in either velocity or configuration space, where the radius of curvature κ has the values c and c/ω, respectively.
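The two facts about chords, and the way they combine into (9.6.3), can be verified directly in a unit disc. The Python sketch below assumes that s = r/κ is the dimensionless distance of p from the center and that φ is the angle between the chord and the radius through p; the helper names are hypothetical.

```python
import math

def chord_segments(s, phi):
    """Lengths from an interior point p (distance s < 1 from the center of the
    unit disc) to the two ends of the chord that makes angle phi with the
    radius through p."""
    half_chord = math.sqrt(1.0 - (s * math.sin(phi)) ** 2)
    return half_chord - s * math.cos(phi), half_chord + s * math.cos(phi)

def metric_density(s, phi, kappa=1.0):
    """(9.6.2) evaluated with the two chord segments."""
    a, b = chord_segments(s, phi)
    return 0.5 * kappa * (1.0 / a + 1.0 / b)

def closed_form(s, phi, kappa=1.0):
    """(9.6.3) with r / kappa written as s."""
    return kappa * math.sqrt(1.0 - (s * math.sin(phi)) ** 2) / (1.0 - s * s)

s, phi = 0.6, 0.7  # illustrative values
a, b = chord_segments(s, phi)
print("product of segments:", a * b, "expected:", 1.0 - s * s)
print("chord length:", a + b, "expected:", 2.0 * math.sqrt(1.0 - (s * math.sin(phi)) ** 2))
print("metric density:", metric_density(s, phi), "closed form:", closed_form(s, phi))
```

The two special cases φ = 0 and φ = π/2 reproduce the radial and tangential densities that are used in the remainder of the section.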


Fig. 9.5. Geometric characterization of the metric density.

Whereas lengths are relative in Euclidean geometry, and angles are absolute, the relation between lengths and angles in hyperbolic geometry makes lengths, as well as angles, absolute.

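Denoting χ as the angle of inclination of the tangent line $\ell_1$ we have

$$\tan\chi = \frac{du_2}{du_1} = \frac{r'\sin\vartheta + r\cos\vartheta}{r'\cos\vartheta - r\sin\vartheta}. \tag{9.6.5}$$

Moreover, since χ = ϑ + π − φ, we get

$$\tan\chi = \frac{\tan\vartheta - \tan\phi}{1 + \tan\vartheta\tan\phi}. \tag{9.6.6}$$

Equating the two expressions (9.6.5) and (9.6.6) we find

$$\tan\phi = -r/r' = -r\vartheta', \tag{9.6.7a}$$

$$-\sin\phi = \frac{u_1\, du_2 - u_2\, du_1}{r\sqrt{du_1^2 + du_2^2}} = \frac{r\vartheta'}{\sqrt{1 + r^2\vartheta'^2}}, \tag{9.6.7b}$$

$$\cos\phi = \frac{u_1\, du_1 + u_2\, du_2}{r\sqrt{du_1^2 + du_2^2}} = \frac{1}{\sqrt{1 + r^2\vartheta'^2}}. \tag{9.6.7c}$$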


Consequently,$^{b}$

$$dh^2(u_1, u_2) = \kappa^2\, \frac{du_1^2 + du_2^2 - (u_1\, du_2 - u_2\, du_1)^2}{(1 - u_1^2 - u_2^2)^2} = \kappa^2\, \frac{dr^2 + r^2\, d\vartheta^2 (1 - r^2/\kappa^2)}{(1 - r^2/\kappa^2)^2} = E\, dr^2 + G\, d\vartheta^2, \tag{9.6.8}$$

is the square of the hyperbolic line element expressed in terms of $u_1$ and $u_2$, and polar coordinates, r and ϑ. The metric coefficients in the fundamental form are

$$E = \frac{\kappa^2}{(1 - r^2/\kappa^2)^2}, \tag{9.6.9a}$$

$$G = \frac{\kappa^2 r^2}{1 - r^2/\kappa^2}. \tag{9.6.9b}$$

The coefficients of the fundamental form determine the equation of the trajectory by requiring that the integrand in Fermat’s principle,

$$\Lambda = \sqrt{E + G\vartheta'^2},$$

be an extremum. Since the fundamental coefficients will, in general, not contain the variable ϑ, it will be a cyclic coordinate, meaning that there exists a first integral of the motion,

$$\frac{\partial\Lambda}{\partial\vartheta'} = \frac{G\vartheta'}{\sqrt{E + G\vartheta'^2}} = \bar\ell = \text{const.},$$

where $\bar\ell = \ell/c$ is, again, the collision parameter, or distance of closest approach [cf. Eq. (9.1.21)]. Solving for ϑ′ gives the equation of the trajectory

$$\vartheta' = \pm\frac{\bar\ell\sqrt{E}}{\sqrt{G}\sqrt{G - \bar\ell^2}}. \tag{9.6.10}$$

$^{b}$ General relativity proposes a metric [Møller 52, Eq. (7) on p. 224]

$$dh = \sqrt{dr^2 + \frac{r^2\, d\vartheta^2}{1 - r^2\omega^2/c^2}},$$

where the first term, at ϑ = const., does not integrate to give the hyperbolic measure of distance, but, rather, gives its Euclidean measure, r.


This is also known as the equation for the geodesic in the Clairaut parametrization [O’Neill 66].

The first two terms in the numerator of (9.6.8) are twice the Euclidean kinetic energy,

$$2T = \dot r^2 + r^2\dot\vartheta^2, \tag{9.6.11}$$

in the Euclidean limit κ → ∞, when the metric is divided through by $dt^2$. The second term in (9.6.11) can be written as $\ell_e^2/r^2$, where

$$\ell_e = r^2\dot\vartheta \tag{9.6.12}$$

is the angular momentum per unit mass. As we have seen, the Beltrami metric (9.6.8) conserves the angular momentum, (9.6.12), while the stereographic inner product model does not. This is a consequence of the factor $1 - r^2/\kappa^2$ in the numerator of (9.6.8).

For uniform radial motion, ϑ = const., (9.6.8) reduces to

$$dh = \varrho\, dr = \frac{dr}{1 - r^2/\kappa^2}, \tag{9.6.13}$$

whose integral is

$$h = \kappa\tanh^{-1}(r/\kappa). \tag{9.6.14}$$

Alternatively, for uniform circular motion, r = const., (9.6.8) reduces to

$$dh = \frac{r\, d\vartheta}{\sqrt{1 - r^2/\kappa^2}}. \tag{9.6.15}$$

Integrating over a period,

$$h = \frac{2\pi r}{\sqrt{1 - r^2/\kappa^2}} > 2\pi r, \tag{9.6.16}$$

shows that the length of a hyperbolic circle of radius sinh(h/κ) is greater than that of a Euclidean circle having a radius r [= κ tanh(h/κ)]. This is none other than Einstein’s old result!


We can further justify inequality (9.6.16) by considering the circumference of a hyperbolic circle with center O and radius (9.6.14) in Fig. 9.5, which is determined by observing that every point p on this circle has φ = π/2. Thus, $\varrho = \kappa/\sqrt{1 - r^2/\kappa^2} = \kappa\cosh(h/\kappa)$, and this value multiplied by 2πr/κ = 2π tanh(h/κ), gives the hyperbolic circumference 2πκ sinh(h/κ).$^{c}$
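Both the radial integral (9.6.14) and the two forms of the hyperbolic circumference lend themselves to a quick numerical check. The Python sketch below uses a simple midpoint rule; the function name `radial_length`, the step count, and the sample values κ = 2 and r = 1.2 are illustrative assumptions.

```python
import math

def radial_length(r, kappa, n=20000):
    """Midpoint-rule integral of dr / (1 - r^2 / kappa^2) from 0 to r, cf. (9.6.13)."""
    step = r / n
    return sum(step / (1.0 - ((i + 0.5) * step / kappa) ** 2) for i in range(n))

kappa, r = 2.0, 1.2  # illustrative values with r < kappa
h = kappa * math.atanh(r / kappa)                                      # (9.6.14)
circumference_a = 2.0 * math.pi * r / math.sqrt(1.0 - (r / kappa) ** 2)  # (9.6.16)
circumference_b = 2.0 * math.pi * kappa * math.sinh(h / kappa)

print("numerical integral:", radial_length(r, kappa), "closed form:", h)
print("circumference from (9.6.16):", circumference_a)
print("circumference 2*pi*kappa*sinh(h/kappa):", circumference_b)
print("Euclidean circumference:", 2.0 * math.pi * r)
```

The two hyperbolic circumferences agree, and both exceed the Euclidean value 2πr.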

The metric coefficients in (9.6.8), (9.6.9a) and (9.6.9b), determine the constant, Gaussian curvature of

$$K = -\frac{1}{2\sqrt{EG}}\frac{d}{dr}\left(\frac{G_r}{\sqrt{EG}}\right) = -1/\kappa^2, \tag{9.6.17}$$
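The constancy of the curvature in (9.6.17) can be confirmed with finite differences. The Python sketch below rests on one assumption that should be stated plainly: the radial variable in (9.6.9a) and (9.6.9b) is read as the dimensionless coordinate u = r/κ of (9.6.4), which is the reading under which the formula returns −1/κ². The function names and step size are hypothetical.

```python
import math

def coeff_E(u, kappa):
    """(9.6.9a) with the radial variable taken as the dimensionless u = r / kappa."""
    return kappa ** 2 / (1.0 - u * u) ** 2

def coeff_G(u, kappa):
    """(9.6.9b) with the same dimensionless radial variable."""
    return kappa ** 2 * u * u / (1.0 - u * u)

def derivative(f, x, h=1e-4):
    """Central finite difference."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def gaussian_curvature(u, kappa):
    """K = -(1 / (2 sqrt(EG))) d/du (G_u / sqrt(EG)), cf. (9.6.17)."""
    def root(x):
        return math.sqrt(coeff_E(x, kappa) * coeff_G(x, kappa))

    def inner(x):
        return derivative(lambda y: coeff_G(y, kappa), x) / root(x)

    return -derivative(inner, u) / (2.0 * root(u))

kappa = 2.0
for u in (0.2, 0.5, 0.8):
    print(f"u={u:.1f}  K={gaussian_curvature(u, kappa):.8f}  -1/kappa^2={-1.0 / kappa ** 2:.8f}")
```

The printed curvature is the same at every radius, to within the finite-difference error.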

where the subscript denotes the derivative. The metric coefficients also determine the geodesics from (9.6.10),

$$\vartheta' = \pm\frac{\bar\ell}{r^2\sqrt{1 - (\bar\ell/r)^2(1 - r^2/\kappa^2)}}. \tag{9.6.18}$$

The geodesics are straight lines,

$$r\cos(\vartheta - \vartheta_0) = \gamma, \tag{9.6.19}$$

where $\vartheta_0$ is the angle that the normal to the line r = γ makes with the polar axis, as shown in Fig. 9.5. We already know that the geodesics must pass through the origin, and these have an inclination $\vartheta = \vartheta_0$ with respect to the polar axis. In physical terms, there is nothing to counter the centrifugal force that would allow for the formation of a closed orbit.

We now inquire into the physical meaning of φ whose tangent is related to the equation of the geodesics according to (9.6.7a). The equations of aberration, (8.3.6) and (8.3.9), are here given by

$$u\cos\phi' = \frac{u_1\cos\phi - u_2}{1 - u_1 u_2\cos\phi/c^2}, \tag{9.6.20a}$$

$$u\sin\phi' = \frac{u_1\sin\phi\,\sqrt{1 - u_2^2/c^2}}{1 - u_1 u_2\cos\phi/c^2}. \tag{9.6.20b}$$

$^{c}$ This is what the relativists confuse with the expansion factor of the universe, as we shall see in Sec. 9.11.


They can be used to derive the most general composition law of velocities. Squaring (9.6.20a) and (9.6.20b), and adding, results in

$$u^2 = \frac{(\mathbf{u}_1 - \mathbf{u}_2)^2 - (\mathbf{u}_1 \times \mathbf{u}_2)^2/c^2}{(1 - \mathbf{u}_1\cdot\mathbf{u}_2/c^2)^2} = \frac{u_1^2 + u_2^2 - 2u_1 u_2\cos\phi - (u_1 u_2\sin\phi)^2/c^2}{(1 - u_1 u_2\cos\phi/c^2)^2}. \tag{9.6.21}$$

Dividing (9.6.20b) by (9.6.20a) gives

$$\tan\phi' = \frac{u_1\sin\phi\,\sqrt{1 - u_2^2/c^2}}{u_1\cos\phi - u_2}. \tag{9.6.22}$$

Whereas the composition law of velocities, (9.6.21), invalidates the law of cosines,

$$u^2 = u_1^2 + u_2^2 - 2u_1 u_2\cos\phi, \tag{9.6.23}$$

(9.6.22) invalidates the law of sines. We can appreciate this by considering the phenomenon of stellar aberration, where $u_1 = c$ and $u_2$ equals the Earth’s velocity. Equation (9.6.22) becomes

$$\tan\phi' = \frac{\sin\phi\,\sqrt{1 - \beta^2}}{\cos\phi - \beta}, \tag{9.6.24}$$

where $\beta = u_2/c$.

Fig. 9.6. Geometric set-up for stellar aberration.


In Fig. 9.6, L represents the telescope’s lens, and O the eye of the observer at the moment a light ray arrives at L from a star S. OE indicates the direction in which the Earth is orbiting about the Sun. In the time τ that it takes for the light ray to pass through the telescope, the Earth will have moved a distance $u_2\tau$ to position O′. The distance between the lens and the new position of the Earth is τ. If the Earth were stationary then the telescope would be pointed along O′L, but, because of the Earth’s motion, it is pointed in direction OL. Drawing O′L′ and LL′ completes the parallelogram. Denote ∠LO′E by φ′ and ∠L′O′E by φ. The difference φ′ − φ is attributed to stellar aberration, which displaces the star’s actual position toward the direction OE in the plane SO′E.

The law of sines for the triangle LO′L′ is

$$\frac{\sin\angle LO'L'}{LL'} = \frac{\sin\angle LL'O'}{LO'}. \tag{9.6.25}$$

Introducing the facts that $LL' = OO' = u_2\tau$, and LO′ = τ, we get

$$\sin(\phi' - \phi) = \beta\sin\phi. \tag{9.6.26}$$

Since the Earth’s relative velocity, $\beta := u_2/c = 10^{-4}$, is small, we may replace the sine by its argument to get the formula,

$$\Delta\phi := \phi' - \phi = \beta\sin\phi, \tag{9.6.27}$$

which is commonly used to calculate aberration [Smart 60], where β is called the constant of aberration. This Euclidean approximation is equivalent to approximating the square root in (9.6.22) by unity, and neglecting terms of higher power than first in β so that (9.6.22) will reduce to

$$\tan\phi' \simeq \tan\phi\,(1 + \beta\sec\phi).$$

Then performing the expansion of the trigonometric functions to first order in the difference Δφ results in (9.6.27).
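How closely the first-order formula (9.6.27) tracks the exact expression (9.6.24) at β = 10⁻⁴ can be seen numerically. The Python sketch below is illustrative only; the function name `aberrated_angle` is hypothetical, and `atan2` is used so that the angle stays in the correct quadrant for 0 < φ < π.

```python
import math

def aberrated_angle(phi, beta):
    """phi' from (9.6.24): tan(phi') = sin(phi) * sqrt(1 - beta^2) / (cos(phi) - beta)."""
    return math.atan2(math.sin(phi) * math.sqrt(1.0 - beta * beta), math.cos(phi) - beta)

beta = 1e-4  # Earth's orbital speed in units of c, as quoted in the text
for phi_deg in (30, 60, 90):
    phi = math.radians(phi_deg)
    exact = aberrated_angle(phi, beta) - phi
    first_order = beta * math.sin(phi)  # (9.6.27)
    print(f"phi={phi_deg:3d} deg  exact={exact:.12e}  first order={first_order:.12e}")
```

The two columns differ only at second order in β, which is why the Euclidean formula suffices for stellar aberration while (9.6.22) is needed at relativistic speeds.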

The violations of the laws of cosines and sines, (9.6.23) and (9.6.25), mean that the addition law of velocities cannot be represented as a triangle in the flat Euclidean plane, but, rather, as a distorted triangle on the surface of a pseudosphere, a surface of revolution with constant negative curvature resembling a bugle surface in Fig. 2.19. The bugle has a rim, so that it occupies only a finite region of the hyperbolic plane, and it obeys the hyperbolic axiom that for any given line ℓ and point p not on ℓ, there are at least two lines through p that do not intersect ℓ. The angle defect of the triangle is a direct consequence of this axiom, and the area of the triangle is proportional to its defect, as we have seen in Sec. 2.1.1.

Furthermore, the parallelogram rule for the addition of velocities in Newtonian kinematics is no longer valid, and we are left only with the triangle rule. Thus, (9.6.22) can be considered as the relativistic generalization of the law of aberration that would be applicable to relativistic velocities.

The plane of orbit that the Sun traces out in a year is called the ecliptic plane. The great circle in which this plane intersects the celestial sphere, at whose center the Earth is found, is called the ecliptic. If the fixed star is at the pole of the ecliptic, φ = π/2 all along the Earth’s orbit. The aberrational orbit will be a circle about the pole of the ecliptic with radius β. This corresponds to the circle C with center O and Euclidean radius r in Fig. 9.5. For every point p on the locus of points at a given distance from the center, φ = π/2, and $\varrho = \kappa/\sqrt{1 - r^2/\kappa^2} = \kappa\cosh(\bar r/\kappa)$, where $\bar r$ is the hyperbolic measure of the radius. Multiplying this by the Euclidean circumference, $2\pi r/\kappa = 2\pi\tanh(\bar r/\kappa)$, gives the circumference of the hyperbolic circle, $2\pi\kappa\sinh(\bar r/\kappa)$, a result we found earlier.

For stars that lie in the ecliptic, φ varies between ±π/2 and 0 [Sommerfeld 64]. A hyperbolic motion which takes O to p transforms a circle C centered at O into an ellipse E centered at p, shown in Fig. 9.5. At the fixed point p, the function ϱ reaches its maximum value $\kappa/(1 - r^2/\kappa^2)$ at φ = 0, and its minimum value $\kappa/\sqrt{1 - r^2/\kappa^2}$ at φ = π/2. Therefore the semi-major and semi-minor axes of the ellipse are $\sqrt{1 - r^2/\kappa^2}/\kappa$ and $(1 - r^2/\kappa^2)/\kappa$, which are the inverse of the values of ϱ at φ = π/2 and φ = 0, respectively. The Euclidean area of the ellipse is

$$\frac{\pi(1 - r^2/\kappa^2)^{3/2}}{\kappa^2} = \frac{\pi}{\kappa^2}\,\mathrm{sech}^3(\bar r/\kappa).$$

If we use the hyperbolic definition of angular momentum for the stereographic inner product model that we found previously in (9.2.5),

$$\ell_h = \frac{r^2\dot\vartheta}{1 - r^2/\kappa^2}, \tag{9.6.28}$$

then we must modify the equation for the radius of the trajectory,

$$\dot r = \pm c\left(1 - \frac{r^2}{\kappa^2}\right)\sqrt{1 - \frac{\bar\ell^2}{r^2}\left(1 - \frac{r^2}{\kappa^2}\right)}, \tag{9.6.29}$$

if and only if $\bar\ell = \ell_e/c$, where $\ell_e = r^2\dot\vartheta$, the Euclidean expression for the angular momentum.

For the Beltrami metric, (9.6.28) is an option since (9.6.29) must also be modified. However, for the stereographic hyperbolic metric, (9.6.28) is no longer an option, it is a must. Møller [52] claims that the notion of a ‘radius vector’ in the definition of angular momentum can only be defined unambiguously in the Euclidean plane, and assumes that $(1 - r^2/\kappa^2)^{-1}$ is a small correction to the angular momentum, so that there is only a slight violation of the conservation law of angular momentum. Slight, or not, it is still a violation of a conservation law! Rather, (9.6.28) is to be considered as the hyperbolic conservation law for angular momentum.

General relativity agrees with (9.6.28), but not with (9.6.29). The radial equation must then be

$$\dot r = \pm c\sqrt{1 - \frac{\bar\ell^2}{r^2}\left(1 - \frac{r^2}{\kappa^2}\right)^2}, \tag{9.6.30}$$

if it is to correspond to the stereographic hyperbolic metric. Consequently, it leads to an equation of the trajectory of the form

$$\vartheta' = \pm\frac{\bar\ell\,(1 - r^2/\kappa^2)/r^2}{\sqrt{1 - (\bar\ell/r)^2(1 - r^2/\kappa^2)^2}}, \tag{9.6.31}$$

whose solutions are bowed geodesics whose centers lie outside the disc, as we have shown in Sec. 7.5.

The bending of the geodesic is attributed to the rotation of the disc [Grøn 04]. But if the stereographic hyperbolic metric were applicable it would invalidate Einstein’s old result (9.6.16), and it would also be in conflict with his general theory. There is no way out for (9.6.28) to hold, and yet come out with inequality (9.6.16).

To calculate the hyperbolic distance between points $r_1$ and $r_2$,

$$h(r_1, r_2) = \int_{r_1}^{r_2} \frac{\sqrt{dr^2 + r^2\, d\vartheta^2 (1 - r^2/\kappa^2)}}{1 - r^2/\kappa^2}, \tag{9.6.32}$$

we introduce the ‘effective’ centrifugal potential,

$$\Phi_c(r) = \frac{\ell^2}{2r^2}\left(1 - \frac{r^2}{\kappa^2}\right), \tag{9.6.33}$$

where relativistic effects are accounted for in the second term. Squaring (9.6.29) gives the hyperbolic energy conservation law,

$$\dot r^2 + 2\Phi_c(r) = c^2, \tag{9.6.34}$$

if $\ell = \ell_e$ in (9.6.33) so that the factor in front of the square root in (9.6.29) does not belong there.

In terms of the effective potential, we can write the hyperbolic distance (9.6.32) as the logarithm of the cross-ratio,

$$h(r_1, r_2) = \frac{\kappa}{2}\ln\left[\frac{\kappa + r_2\sqrt{1 - 2\Phi_c(r_2)/c^2}}{\kappa - r_2\sqrt{1 - 2\Phi_c(r_2)/c^2}} \cdot \frac{\kappa - r_1\sqrt{1 - 2\Phi_c(r_1)/c^2}}{\kappa + r_1\sqrt{1 - 2\Phi_c(r_1)/c^2}}\right]. \tag{9.6.35}$$

This clearly shows that it is the cross-ratio that determines the hyperbolic distance between any two points. At low angular momentum, the cross-ratio in (9.6.35) simplifies to

$$h(r_1, r_2) = \frac{\kappa}{2}\ln\left[\left(\frac{\kappa + r_2}{\kappa - r_2}\right)\cdot\left(\frac{\kappa - r_1}{\kappa + r_1}\right)\right] = \frac{\kappa}{2}\ln\{r_1, r_2 \,|\, \kappa, -\kappa\} \tag{9.6.36}$$

between the ordered points $(\kappa, r_2, r_1, -\kappa)$. The hyperbolic distance (9.6.36) vanishes when $r_1 = r_2$ and tends to infinity when either $r_2 \uparrow \kappa$ or $r_1 \downarrow -\kappa$. To the Poincarites, it would seem like the rim is infinitely far away.
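The properties claimed for the low angular momentum distance (9.6.36) can be illustrated in a few lines of Python. This is a sketch only; the function name `hyperbolic_distance` and the sample points are hypothetical. It shows that the distance is additive along the diameter, coincides with a difference of inverse hyperbolic tangents in agreement with (9.6.14), and grows without bound toward the rim.

```python
import math

def hyperbolic_distance(r1, r2, kappa):
    """(9.6.36): (kappa/2) * ln[(kappa + r2)/(kappa - r2) * (kappa - r1)/(kappa + r1)]."""
    return 0.5 * kappa * math.log((kappa + r2) / (kappa - r2) * (kappa - r1) / (kappa + r1))

kappa = 1.0
a, b, c = -0.4, 0.1, 0.7  # three illustrative points on a diameter

print("h(a,b) + h(b,c):", hyperbolic_distance(a, b, kappa) + hyperbolic_distance(b, c, kappa))
print("h(a,c):         ", hyperbolic_distance(a, c, kappa))
print("atanh form:     ", kappa * (math.atanh(c / kappa) - math.atanh(a / kappa)))

for r2 in (0.9, 0.999, 0.999999):
    # The distance from the center to r2 diverges as r2 approaches the rim.
    print(f"r2={r2}  h(0, r2)={hyperbolic_distance(0.0, r2, kappa):.4f}")
```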

If the uniform acceleration is caused by gravity, this will fix the radius of curvature as

$$\kappa = \sqrt{\frac{3}{4\pi G\rho}}, \tag{9.6.37}$$

for a mass of constant density ρ, where G is the Newtonian gravitational constant. The factor κ appears as a free-fall time. Free-falling objects know no restrictions placed on their velocities, like the restriction to regions of the rotating disc where r < c/ω. The frequency $\omega = \sqrt{4\pi G\rho/3}$ is the minimum frequency at which an object must rotate in order to avoid gravitational collapse. But if we apply the free-fall frequency to the same condition as that of the disc, we get

$$r < \sqrt{\frac{3}{8\pi G\rho}}\; c, \tag{9.6.38}$$

which is the density formulation of the Schwarzschild inequality α/r < 1.


With inequality (9.6.38) a constant free-fall time is thus compatible with a uniformly rotating system. This choice of the absolute constant sets the Gaussian curvature (9.6.17) directly proportional to the constant mass density, viz.,

$$K = -1/\kappa^2 = -\tfrac{4}{3}\pi G\rho.$$
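The statement that (9.6.38) is the density form of the Schwarzschild inequality can be checked with a short Python sketch. It assumes, since the text does not restate them here, that α = 2GM/c² and that M = (4/3)πρr³ is the mass of a uniform sphere; the SI constants are rounded, the sample density is arbitrary, and all function names are hypothetical.

```python
import math

G_NEWTON = 6.674e-11  # m^3 kg^-1 s^-2 (rounded)
C_LIGHT = 2.998e8     # m/s (rounded)

def max_radius(rho):
    """Right-hand side of (9.6.38): sqrt(3 / (8 pi G rho)) * c."""
    return math.sqrt(3.0 / (8.0 * math.pi * G_NEWTON * rho)) * C_LIGHT

def schwarzschild_ratio(r, rho):
    """alpha / r, assuming alpha = 2GM/c^2 and M = (4/3) pi rho r^3."""
    mass = 4.0 / 3.0 * math.pi * rho * r ** 3
    return 2.0 * G_NEWTON * mass / (C_LIGHT ** 2 * r)

def curvature(rho):
    """K = -(4/3) pi G rho from the displayed relation; units of s^-2."""
    return -4.0 / 3.0 * math.pi * G_NEWTON * rho

rho = 1000.0  # kg/m^3, an arbitrary illustrative density
r_max = max_radius(rho)
print("limiting radius (m):", r_max)
print("alpha/r at the limiting radius:", schwarzschild_ratio(r_max, rho))
print("alpha/r at half the limiting radius:", schwarzschild_ratio(0.5 * r_max, rho))
print("curvature K (s^-2):", curvature(rho))
```

Under these assumptions the ratio α/r equals unity exactly at the bound in (9.6.38) and falls below it inside.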

In the hyperbolic space of constant negative curvature, a gravitational potential cannot be appended onto the energy conservation law (9.6.34) as a separate entity. However, if mass, rather than density, is constant, the curvature will no longer be constant so that gravitational acceleration will not be equivalent to a uniformly rotating disc. Although the curvature will no longer be constant, it will show that gravitational effects enter, not only through a potential in energy conservation, but also in the specification of the absolute constant κ that determines the point at infinity, or the ideal point, where two parallel lines intersect. We will come back to this case in Sec. 9.10.

9.7 The FitzGerald–Lorentz Contraction via the Triangle Defect

The fact that a hyperbolic triangle must be fitted onto a pseudosphere in hyperbolic space, rather than lying flatly in the Euclidean plane, causes an angle defect where the sum of the angles of the triangle is less than two right angles. The curvature of space implies a ‘fitting error.’ If the curvature is like a cylinder of a hat, it will produce an angle defect as shown in the picture on the left in Fig. 9.7. Moreover, since the angles of a hyperbolic triangle determine the sides, we can expect the defect to shorten the length of at least one of its sides.

Likewise, a positive curvature also produces a fitting error by trying to place a flat object on a sphere in the picture on the right in Fig. 9.7. This time there will be an angle excess so that the sum of angles of a triangle will add to more than π. Consequently we can expect a lengthening of the sides of the triangle, which physically corresponds to a space dilatation. Since the triangles lie in velocity space we can expect that the angle defect will be related to the FitzGerald–Lorentz contraction, and the angle excess to the opposite effect of space dilatation. Whereas the contraction is well-known (but not in the hyperbolic context to be described here), the dilatation is unknown, and we will defer its discussion until Sec. 11.3.

Fig. 9.7. Fokker’s [65] visualization of fitting errors when objects are placed on curved surfaces. The left and right sides correspond to negative and positive curvature, respectively.

In the early days of relativity, Ehrenfest [09] arrived at the paradoxical conclusion that the circumference of a rotating disc should be shorter than 2πr due to the FitzGerald–Lorentz contraction. For, according to Ehrenfest, the periphery of a cylinder when set into motion will “show a contraction compared to its state of rest: 2πr′ < 2πr, because each element of the periphery is moving in its own direction with instantaneous velocity r′ω.” As we have seen, Einstein [Stachel 89] came to the opposite conclusion that the circumference had to be greater than 2πr, claiming that it was necessary to keep the longitudinal FitzGerald–Lorentz contraction as distinct from the shortening of the tangential components of the measuring rods on the disc by a factor of $\sqrt{1 - r^2\omega^2/c^2}$.

And because you need more tangential measuring rods than when the disc is at rest, its circumference should be increased by the factor $1/\sqrt{1 - r^2\omega^2/c^2}$. He concluded that a “rigid disc must break up if it is set into motion, on account of the Lorentz contraction of the tangential fibers and the non-contraction of the radial ones” [Stachel 89]. Einstein’s argument, that the measuring rods contract so that more are needed to measure the circumference of the disc, fails to answer the question of why the periphery of the disc also does not contract when set into motion. And the contraction should be greater the faster the disc rotates!

Fig. 9.8. Hyperbolic right triangle inscribed in a unit disc.

Consider a unit disc$^{d}$ in velocity space with a right triangle inscribed in it, as shown in Fig. 9.8. In view of the Euclidean expression for the kinetic energy, (9.6.11), we may consider that the Euclidean measures of the sides are $\beta = \dot r$ and $\alpha = r\dot\vartheta$, and the hypotenuse γ < 1, which ensures that the rotating system may be “represented by a uniformly rotating ‘material’ disc since nothing, in Euclidean space, can surpass the velocity of light” [Møller 52]. The angle A at the center of the disc will not be distorted so that it will obey Euclidean geometry. Thus, its Euclidean measure A will coincide with its hyperbolic measure $\bar A$,

$$\cos A = \cos\bar A = \beta/\gamma = \tanh\bar\beta/\tanh\bar\gamma. \tag{9.7.1}$$

However, because the angle B is non-central, its Euclidean measure, B, will not coincide with its hyperbolic measure, $\bar B$. The logarithm of the cross-ratio is the hyperbolic distance,

$$\bar\alpha = \frac{1}{2}\ln\left[\frac{e(c, u)}{e(b, u)} \cdot \frac{e(b, v)}{e(c, v)}\right] = \frac{1}{2}\ln\left[\frac{\sqrt{1 - \beta^2}}{\sqrt{1 - \beta^2} - \alpha} \cdot \frac{\sqrt{1 - \beta^2} + \alpha}{\sqrt{1 - \beta^2}}\right].$$

$^{d}$ This implies we are using natural units where the velocity of light c = 1.


Exponentiating both sides and solving for α give

$$\alpha = \tanh\bar\alpha \cdot \mathrm{sech}\,\bar\beta. \tag{9.7.2}$$

This shows that the Lobachevsky straight line segment in hyperbolic space, $\tanh\bar\alpha$, has been shortened by the amount, $\mathrm{sech}\,\bar\beta$, which is the FitzGerald–Lorentz contraction factor. The contraction, $\sqrt{1 - \beta^2} = \mathrm{sech}\,\bar\beta$, implies

$$\beta = \tanh\bar\beta. \tag{9.7.3}$$

These are the set of Beltrami coordinates. In terms of these coordinates the Beltrami metric form is

$$dh^2 = d\bar\alpha^2 + \cosh^2\bar\alpha\, d\bar\beta^2.$$

Moreover, the Euclidean Pythagorean theorem, $\gamma^2 = \alpha^2 + \beta^2$, or the Euclidean kinetic energy, $\gamma^2 = 2T$ in (9.6.11), which asserts that $\tanh^2\bar\gamma = \tanh^2\bar\alpha \cdot \mathrm{sech}^2\bar\beta + \tanh^2\bar\beta$, gives way to the hyperbolic Pythagorean theorem

$$\cosh\bar\gamma = \cosh\bar\alpha \cdot \cosh\bar\beta. \tag{9.7.4}$$

Now, the cosine of the hyperbolic measure of the angle B,

$$\cos\bar B = \tanh\bar\alpha/\tanh\bar\gamma,$$

or the ratio of the adjacent to the hypotenuse, will be related to the cosine of its Euclidean measure by

$$\cos B = \frac{\alpha}{\gamma} = \frac{\tanh\bar\alpha\ \mathrm{sech}\,\bar\beta}{\tanh\bar\gamma} = \cos\bar B\ \mathrm{sech}\,\bar\beta = \cos\bar B\,\sqrt{1 - \beta^2}. \tag{9.7.5}$$

Consequently,

$$\cos\bar B > \cos B,$$

and since the cosine decreases monotonically on the open interval (0, π) it follows that $B > \bar B$. Since B = π/2 − A, the sum $A + \bar B < \pi/2$, so that the three angles of the hyperbolic triangle add up to less than π, which is the well-known angle defect. And since the angles of a hyperbolic triangle determine their sides, the side α will appear smaller than its hyperbolic measure, $\bar\alpha$, by an amount given precisely by the FitzGerald–Lorentz contraction factor, $\sqrt{1 - \beta^2}$.
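The chain of relations in this section can be traced numerically for one sample triangle. The Python sketch below assumes a unit disc (c = 1), a right triangle whose vertex A sits at the center so that the hypotenuse lies on a diameter, and barred quantities denoting hyperbolic measures; the variable names and the sample side lengths are hypothetical.

```python
import math

def euclidean_sides(alpha_bar, beta_bar):
    """Euclidean measures of the sides from (9.7.2) and (9.7.3)."""
    beta = math.tanh(beta_bar)
    alpha = math.tanh(alpha_bar) / math.cosh(beta_bar)  # tanh(alpha_bar) * sech(beta_bar)
    gamma = math.hypot(alpha, beta)                     # Euclidean Pythagorean theorem
    return alpha, beta, gamma

alpha_bar, beta_bar = 0.8, 0.5                # illustrative hyperbolic side lengths
alpha, beta, gamma = euclidean_sides(alpha_bar, beta_bar)
gamma_bar = math.atanh(gamma)                 # hypotenuse passes through the center

angle_A = math.acos(beta / gamma)             # central angle, undistorted
angle_B = math.acos(alpha / gamma)            # Euclidean measure of B
angle_B_bar = math.acos(math.tanh(alpha_bar) / math.tanh(gamma_bar))  # hyperbolic measure
angle_sum = angle_A + angle_B_bar + math.pi / 2.0

print("cosh(gamma_bar):", math.cosh(gamma_bar))
print("cosh(alpha_bar) * cosh(beta_bar):", math.cosh(alpha_bar) * math.cosh(beta_bar))
print("B (Euclidean):", angle_B, " B (hyperbolic):", angle_B_bar)
print("angle defect:", math.pi - angle_sum)
```

The first two printed values agree, which is (9.7.4), and the defect is positive, as the argument above requires.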


The origin of the FitzGerald–Lorentz contraction is to be found in the hyperbolic angle defect since the sides of a hyperbolic triangle are determined by their angles. The exact same contraction factor is found for the normal, or second-order, Doppler shift.

Due to uniform acceleration, the Euclidean measure of α will not be $\tanh\bar\alpha$, but it will be decreased by the factor $\sqrt{1 - \beta^2}$. It will appear to us Euclideans that the disc is rotating at a slower rate than it would to the Poincarites who measure a length $\bar\alpha = \tanh^{-1}\alpha$. By writing (9.6.16) as

$$2\pi r = h\ \mathrm{sech}\,\bar\beta,$$

we can attribute our slower rate to the smaller perimeter, 2πr, that we have to cover, compared with the Poincarites who have to cover the larger perimeter, h.

9.8 Hyperbolic Nature of the Electromagnetic Field and the Poincaré Stress

Hyperbolic geometry also applies to Maxwell’s equations, and, in this section, we will show how it can be used to calculate the Poincaré stress that we analyzed in Sec. 6.2. It will make the assumption of charge conservation on the surface of the electron completely superfluous.

Consider a charge moving in the x-direction at a constant, relative speed β. The laws of transformation of the electromagnetic fields, E and H, are

$$E'_x = E_x, \qquad H'_x = H_x, \tag{9.8.1a}$$

$$E'_y = \Gamma(E_y - \beta H_z), \qquad H'_y = \Gamma(H_y + \beta E_z), \tag{9.8.1b}$$

$$E'_z = \Gamma(E_z + \beta H_y), \qquad H'_z = \Gamma(H_z - \beta E_y), \tag{9.8.1c}$$

where we denote $\Gamma = 1/\sqrt{1 - \beta^2}$ so as not to be confused with γ, the hypotenuse of the triangle. We consider the primed inertial system to be at rest in the xy-plane.

From the last section, we know that the sides of a triangle may be expressed in terms of the angles of the triangle. Consequently, the first two transformation laws, (9.8.1a) and (9.8.1b), can be stated as

$$\cos A = \cos\bar A = \beta/\gamma = \tanh\bar\beta/\tanh\bar\gamma, \tag{9.8.2a}$$

$$\cos B = \alpha/\gamma = \frac{\tanh\bar\alpha}{\tanh\bar\gamma}\,\mathrm{sech}\,\bar\beta = \cos\bar B\ \mathrm{sech}\,\bar\beta, \tag{9.8.2b}$$

where the latter is the hyperbolic Pythagorean theorem (9.7.4).

The components of the force are obtained by multiplying their projections in the x, y, and z planes by the factors 1, $\Gamma^{-1}$, and $\Gamma^{-1}$, respectively [Lorentz 16]. In the xy-plane the force components will be given by

$$F_x = \frac{e}{4\pi a^2}E_x = \frac{e}{4\pi a^2}\cos A, \tag{9.8.3a}$$

$$F_y = \frac{e}{4\pi a^2}(E_y - \beta H_z) = \frac{e}{4\pi a^2}E'_y/\Gamma = \frac{e}{4\pi a^2}\cos\bar B\ \mathrm{sech}\,\bar\beta, \tag{9.8.3b}$$

where a is the radius of the sphere of charge e. The magnitude of the force is

$$F = \sqrt{F_x^2 + F_y^2} = \frac{e}{4\pi a^2}\sqrt{\cos^2 A + \cos^2 B}. \tag{9.8.4}$$

Without realizing that

$$\cos^2 A + \cos^2 B = \frac{1 - \mathrm{sech}^2\bar\beta + (1 - \mathrm{sech}^2\bar\alpha)\,\mathrm{sech}^2\bar\beta}{\tanh^2\bar\gamma} = 1,$$

which follows directly from the hyperbolic Pythagorean theorem, (9.7.4), Page and Adams [40] invent a charge conservation condition, per unit area on the surface of the electron, ρ dσ = ρ′dσ′, where $\rho' = e/4\pi a^2$. The surface elements, dσ and dσ′, are supposedly related by

$$d\sigma = d\sigma'\sqrt{\cos^2 A + \cos^2 B},$$

so that upon solving for the unknown charge density, ρ, the square root in (9.8.4) is eliminated.
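That the square root in (9.8.4) is identically unity can be confirmed for arbitrary hyperbolic side lengths. The Python sketch below assumes, as in Sec. 9.7, that barred quantities are hyperbolic measures, that B in (9.8.4) is the Euclidean measure, and that the hypotenuse is fixed by (9.7.4); the function name is hypothetical.

```python
import math

def cos2A_plus_cos2B(alpha_bar, beta_bar):
    """cos^2 A + cos^2 B of (9.8.4), expressed through hyperbolic side lengths."""
    gamma_bar = math.acosh(math.cosh(alpha_bar) * math.cosh(beta_bar))       # (9.7.4)
    cos_a = math.tanh(beta_bar) / math.tanh(gamma_bar)                        # (9.8.2a)
    cos_b = math.tanh(alpha_bar) / (math.tanh(gamma_bar) * math.cosh(beta_bar))  # (9.8.2b)
    return cos_a ** 2 + cos_b ** 2

for alpha_bar, beta_bar in ((0.3, 0.4), (0.8, 0.5), (1.5, 2.0)):
    print(alpha_bar, beta_bar, cos2A_plus_cos2B(alpha_bar, beta_bar))
```

Each line prints unity to within rounding error, so no auxiliary condition on the surface charge is needed to remove the square root.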


Since the electromagnetic field vanishes inside the electron, the stress acting on the surface,

$$S = \tfrac{1}{2}F^2 = \frac{e^2}{32\pi^2 a^4}, \tag{9.8.5}$$

is the well-known Poincaré stress that was needed to reduce the 4/3 factor in the expression for the energy of an electron to unity, leading to the conclusion that the mass of an electron is not totally electromagnetic in origin. This corrects the derivation given in Sec. 6.2.

There is no need to invoke a hypothetical charge conservation on the surface of the electron when it is realized that the surface element is in the hyperbolic plane, and not the Euclidean plane.

9.9 The Terrell–Weinstein Effect and the Angle of Parallelism

If we want to determine the size of a rod traveling at a relative velocity β, we have to take into account that the photons we observe emanating from the ends of the rod will arrive at different times. Terrell [59] showed that one can interpret what is usually viewed as a FitzGerald–Lorentz contraction as a distortion due to the rotation of the rod. Weinstein [60] claimed, about the same time, that the length of a rod can appear infinite, which he claimed cannot be due to a mere rotation. Here, we will show it to be due to a phenomenon analogous to stellar parallax, one that involves the angle of parallelism in hyperbolic geometry.

In the limiting case we have the Euclidean distance β = cos A [cf. Fig. 9.8], since the maximum length of the hypotenuse of the inscribed right triangle in a unit disc is 1. According to the definition of the angle of parallelism, (9.5.3), the hyperbolic measure of the velocity, $\bar\beta$, whose Euclidean value satisfies β < 1, is

$$\bar\beta = \frac{1}{2}\ln\left(\frac{1 + \beta}{1 - \beta}\right) = \frac{1}{2}\ln\left(\frac{1 + \cos A}{1 - \cos A}\right) = \frac{1}{2}\ln\left(\frac{1 + \cos A}{\sin A}\right)^2 = \ln\cot(A/2), \tag{9.9.1}$$


where we have used a half-angle trigonometric formula in writing down the third equality. Exponentiating both sides of (9.9.1) results in the Bolyai–Lobachevsky formula [cf. Eq. (9.5.3)]

$$\cot[\Pi(\bar\beta)/2] = e^{\bar\beta}, \tag{9.9.2}$$

where $A = \Pi(\bar\beta)$ is the angle of parallelism.

Consider a rod moving with relative velocity β along the r axis. Light from the trailing and leading edges must travel over different distances, and, hence, arrive at different times. Suppose the distance covered by photons emanating from the trailing edge is $d_1$, while that from the leading edge is $d_2$; they are also their respective times in natural units. The light that the observer sees at time t will have emanated from the trailing and leading edges at $t - d_1$ and $t - d_2$, respectively, because of the finite propagation of light.

The Lorentz transformations for the space coordinates will then be

$$r_1' = \Gamma[r_1 - \beta(t - d_1)], \tag{9.9.3a}$$

$$r_2' = \Gamma[r_2 - \beta(t - d_2)], \tag{9.9.3b}$$

where, again, $\Gamma = 1/\sqrt{1 - \beta^2}$.

The difference between (9.9.3a) and (9.9.3b) provides a relation between the lengths of the rod in the system traveling at the velocity β. At rest, the difference is $\ell = r_2 - r_1$, while in motion, $\ell' = r_2' - r_1'$, where

$$\ell' = \Gamma[\ell + \beta(d_2 - d_1)].$$

Now, the difference in the distances traveled by the photons from the leading and trailing edges is just the length of the rod in the system at rest, so that the length of the rod in motion will be Doppler-shifted by an amount,

$$\ell' = \ell\left(\frac{1+\beta}{1-\beta}\right)^{1/2}, \tag{9.9.4}$$

if the rod is approaching the stationary observer. For a rod receding from the observer, the signs in the numerator and denominator must be exchanged because β → −β. We could have arrived at (9.9.4) directly by observing that, in addition to the usual Doppler effect, there is a time dilatation between observers located on the moving and stationary frames.
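As a numerical sanity check, the following Python sketch evaluates the difference of (9.9.3a) and (9.9.3b) under the stated condition that d2 − d1 equals the rest length, and compares it with the Doppler factor of (9.9.4). The function names are illustrative only and do not appear in the text; units with c = 1 are assumed.

```python
import math

def length_from_lorentz(rest_length, beta):
    """Difference of (9.9.3a) and (9.9.3b) with d2 - d1 set equal to the rest length."""
    lorentz_factor = 1.0 / math.sqrt(1.0 - beta * beta)
    return lorentz_factor * (rest_length + beta * rest_length)

def length_from_doppler(rest_length, beta):
    """Closed form (9.9.4): rest length times sqrt((1 + beta) / (1 - beta))."""
    return rest_length * math.sqrt((1.0 + beta) / (1.0 - beta))

if __name__ == "__main__":
    # A negative beta models a receding rod (beta -> -beta in the text).
    for beta in (0.1, 0.5, 0.9, -0.5):
        print(beta, length_from_lorentz(1.0, beta), length_from_doppler(1.0, beta))
```

Both functions should agree for any |β| < 1, since Γ(1 + β) reduces algebraically to the square root in (9.9.4).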


Setting the Euclidean measure of the relative speed, β, equal to the cosine of the angle of parallelism in the Doppler expression (9.9.4) gives

$$\ell'/\ell = \cot[\Pi(\bar\beta)/2] = e^{\bar\beta}, \tag{9.9.5}$$

if the rod is approaching, while

$$\ell'/\ell = \tan[\Pi(\bar\beta)/2] = e^{-\bar\beta}, \tag{9.9.6}$$

if it is receding.

In general, the angle of parallelism, Π, must be greater than A, and the larger the hyperbolic measure of the relative velocity, $\bar\beta$, the smaller will be the angle A. Thus, we would expect to see a large expansion of the object as it approaches us, and a correspondingly large contraction as it recedes from us. These are the conclusions that a single observer would make, and not those of two observers, as in the usual explanation of the FitzGerald–Lorentz contraction. Weinstein came to the same conclusions by plotting the exponent of the hyperbolic arctangent, rather than the tangent, because he did not go to the limit where β = cos A, which then defines the angle of parallelism.
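The chain from the Euclidean speed to the angle of parallelism and on to the apparent length ratio can be traced numerically. The sketch below is only an illustration of (9.9.1), (9.9.2), (9.9.5) and (9.9.6); the function names are assumptions made for this example.

```python
import math

def hyperbolic_velocity(beta):
    """Hyperbolic measure of the velocity, (9.9.1): (1/2) ln((1 + beta) / (1 - beta))."""
    return 0.5 * math.log((1.0 + beta) / (1.0 - beta))

def angle_of_parallelism(beta_bar):
    """Solve the Bolyai-Lobachevsky formula (9.9.2), cot(Pi/2) = exp(beta_bar), for Pi."""
    return 2.0 * math.atan(math.exp(-beta_bar))

def apparent_length_ratio(beta, approaching=True):
    """Ratio l'/l from (9.9.5) for an approaching rod, or (9.9.6) for a receding one."""
    half_angle = 0.5 * angle_of_parallelism(hyperbolic_velocity(beta))
    return 1.0 / math.tan(half_angle) if approaching else math.tan(half_angle)

if __name__ == "__main__":
    beta = 0.6
    angle = angle_of_parallelism(hyperbolic_velocity(beta))
    print("cos of angle of parallelism:", math.cos(angle))
    print("approaching:", apparent_length_ratio(beta, True))
    print("receding:  ", apparent_length_ratio(beta, False))
```

In the limiting case the cosine of the computed angle returns the Euclidean speed β, and the approaching and receding ratios are reciprocals of one another.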

However, unlike the astronomical phenomenon of parallax, where the radius of curvature is so large and the parallax angle so small as to thwart all attempts to date at measuring a positive defect, the distortions predicted by (9.9.5) and (9.9.6) are actually more dramatic, precisely because of the finite speed of light. Since

π − (π/2 + A + B) < π/2 − A = φ,

the defect is smaller than the complementary angle to A, known as the parallax angle, φ, in astronomy.

Moreover, since A (= π/2 − φ) ≤ Π, or φ ≥ π/2 − Π, there exists a lower bound for the parallax of stars, if space is, indeed, hyperbolic. Since φ is the upper bound of the defect, the latter may stand a greater chance of being measured. Although no astronomical lower bound for the parallax angle has been found to date, the non-Euclidean nature of light rays may be easier to access because the finite velocity of light is not a constraint on the hyperbolic measure of the velocity.


9.10 Hyperbolic Geometries with Non-Constant Curvature

9.10.1 The heated disc revisited

We return to the heated disc that was discussed in Sec. 2.1.1. There, it provided us with an example of a two-dimensional geometry which is acted upon by thermal stresses that tend to warp it and, in so doing, modify its Euclidean geometry. Our intention was to determine the consequences for the geometry of assuming different physical laws for the transport of heat.

Rather than considering the temperature as a correction factor in the physical law, we consider it to determine the law itself through the metric [Robertson 50],

$$d\bar r^{2} = \frac{dx^{2} + dy^{2}}{T^{2}(x, y)}, \tag{9.10.1}$$

with the stereographic inner product T > 0, but, otherwise, unknown. Under the assumption that the transport of heat is directed radially outward from the center of the disc, (9.10.1) becomes

$$d\bar r^{2} = \frac{dr^{2} + r^{2}\,d\varphi^{2}}{T^{2}(r)}, \tag{9.10.2}$$

under a change to polar coordinates.

If there is a heat source, of intensity σ, located at the center of the disc, the law of stationary heat conduction will be given by Poisson’s law,

$$k\,\frac{1}{r}\frac{d}{dr}\left(r\frac{dT}{dr}\right) = -\sigma, \tag{9.10.3}$$

where k is the thermal conductivity. The solution to (9.10.3) is

$$T = T_0 - \frac{\sigma r^{2}}{4k}, \tag{9.10.4}$$

where the constant of integration, $T_0$, is the temperature at the center of the disc. The first constant of integration would have led to an infinite temperature at the center of the disc, and, so, has been set equal to zero.


The line element for going from r to r + dr and ϕ to ϕ + dϕ, (9.10.2), is now explicitly given by

$$d\bar r^{2} = \frac{dr^{2} + r^{2}\,d\varphi^{2}}{(T_0 - \sigma r^{2}/4k)^{2}}. \tag{9.10.5}$$

The Gaussian curvature,

$$K = -\frac{(T_0 - \sigma r^{2}/4k)^{2}}{r}\,\frac{d}{dr}\left(\frac{T_0 + \sigma r^{2}/4k}{T_0 - \sigma r^{2}/4k}\right) = -\frac{\sigma T_0}{k},$$

is constant, and negative if there is a heat source at the center, σ > 0, or positive if it is a sink, σ < 0.
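The constancy of this curvature can be checked numerically. The sketch below evaluates the curvature expression quoted above for the temperature profile (9.10.4), using a central finite difference for the radial derivative; the parameter values and function names are arbitrary choices made for illustration.

```python
def temperature(r, T0, sigma, k):
    """Stationary profile (9.10.4): T = T0 - sigma r^2 / (4k)."""
    return T0 - sigma * r**2 / (4.0 * k)

def curvature(r, T0, sigma, k, h=1e-5):
    """K = -(T^2 / r) d/dr[(T0 + q) / (T0 - q)], q = sigma r^2 / (4k), by central differences."""
    def ratio(x):
        q = sigma * x**2 / (4.0 * k)
        return (T0 + q) / (T0 - q)
    d_ratio = (ratio(r + h) - ratio(r - h)) / (2.0 * h)
    return -temperature(r, T0, sigma, k)**2 / r * d_ratio

if __name__ == "__main__":
    T0, sigma, k = 2.0, 3.0, 1.5      # illustrative values only
    for r in (0.2, 1.0, 1.5):
        print(r, curvature(r, T0, sigma, k), -sigma * T0 / k)
```

The printed curvature should be independent of r and equal to −σT0/k, changing sign when the source (σ > 0) is replaced by a sink (σ < 0).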

Outside the disc of radius $r_1$, which is at temperature $T_1$, T will behave as a logarithmic potential,

$$T = T_1\ln\left(\frac{r}{r_1} + 1\right), \tag{9.10.6}$$

because it satisfies Laplace’s equation,

$$\frac{1}{r}\frac{d}{dr}\left(r\frac{dT}{dr}\right) = 0,$$

in two dimensions. The Gaussian curvature,

$$K = -\frac{T_0^{2}}{r^{2}},$$

is still negative, but is no longer constant. This appears to contradict the fact that thermal stresses distort what would otherwise be flat, Euclidean geometry.

If, instead of considering heat sources or sinks, we were to consider the rate of heating, we would have the diffusion equation for heat conduction. In two dimensions it reads

$$\frac{a^{2}}{r}\frac{\partial}{\partial r}\left(r\frac{\partial T}{\partial r}\right) - \frac{\partial T}{\partial t} = 0, \tag{9.10.7}$$

where $a^{2} = k/\rho c$ with c as the specific heat. Eliminating time by looking for a solution whose temporal dependency is exponentially decaying, $e^{-a^{2}\mu^{2}t}$,


where µ is a completely arbitrary constant, (9.10.7) reduces to

$$T'' + \frac{1}{r}T' + \mu^{2}T = 0, \tag{9.10.8}$$

where the prime denotes differentiation with respect to r. It will be immediately appreciated that (9.10.8) is the equation for a Bessel function of order zero, $J_0$.

Changing our perspective, we now assume that the temperature is finite at the center of the disc, $T_0$, and vanishes at the rim, which is infinitely cold. In this case the solution to (9.10.8) is

$$T = T_0 J_0(\mu r).$$

To take into account that the temperature vanishes on the rim, $r_1$, we set $\mu r_1 = \lambda$, a zero of the Bessel function, i.e. $J_0(\mu r_1) = 0$, and eliminate µ in the argument. The solution can now be written as

$$T = T_0 J_0\left(\frac{\lambda r}{r_1}\right). \tag{9.10.9}$$

The line element (9.10.2) is now given explicitly as

$$ds^{2} = \frac{dr^{2} + r^{2}\,d\varphi^{2}}{T_0^{2} J_0^{2}(\lambda r/r_1)}. \tag{9.10.10}$$

Since $J_0$ has the infinite power series,

$$J_0(x) = 1 - \frac{x^{2}}{4} + \frac{x^{4}}{64} - \cdots,$$

it is clear that for discs of large radius, the curvature will be negative and constant. In general, the curvature is

$$K = -T_0^{2}\left(\frac{\lambda^{2}}{r_1^{2}}J_0^{2} + J_0'^{\,2}\right). \tag{9.10.11}$$

The first term in (9.10.11) represents the heat source, while the second term is proportional to the square of the heat flux, since by Fourier’s law of heat conduction, the heat flux is proportional to the negative of the temperature gradient. For extremely large discs, the first term vanishes, and the negative curvature becomes proportional to the square of the heat flux. In other words, the transport of heat curves space as do heat sources.


Moreover, the curvature (9.10.11) will remain finite at the rim of the disc, which is the coldest possible, because $J_0$ and $J_1 = -J_0'$ have no common zero. Even at zero temperature, there is finite curvature!
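The claim that the curvature stays finite and nonzero at the rim can be illustrated with a few lines of Python. The sketch below sums the power series of the Bessel functions, locates the first zero λ of J0 by bisection, and evaluates (9.10.11) with the prime read as d/dr, so that the derivative of J0(λr/r1) is −(λ/r1) J1(λr/r1). The helper names, the bracketing interval [2, 3] for the first zero, and the unit values of T0 and r1 are assumptions made for this example.

```python
import math

def bessel_j(order, x, terms=40):
    """Power series of the Bessel function J_n(x) for integer n >= 0 (adequate for small x)."""
    return sum((-1)**m / (math.factorial(m) * math.factorial(m + order))
               * (x / 2.0)**(2 * m + order) for m in range(terms))

def first_zero_of_j0(lo=2.0, hi=3.0, tol=1e-12):
    """Bisection for the first zero of J0, which changes sign between 2 and 3."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bessel_j(0, lo) * bessel_j(0, mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def curvature(r, T0, r1, lam):
    """Eq. (9.10.11): K = -T0^2 [ (lam/r1)^2 J0^2 + (dJ0/dr)^2 ]."""
    mu = lam / r1
    j0 = bessel_j(0, mu * r)
    dj0_dr = -mu * bessel_j(1, mu * r)
    return -T0**2 * (mu**2 * j0**2 + dj0_dr**2)

if __name__ == "__main__":
    lam = first_zero_of_j0()
    print("first zero of J0:", lam)
    print("K at center:", curvature(0.0, 1.0, 1.0, lam))
    print("K at rim:   ", curvature(1.0, 1.0, 1.0, lam))
```

At the rim the first term of (9.10.11) vanishes with the temperature, and the whole curvature is carried by the flux term.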

The curvature (9.10.11) cannot distinguish between the diffusion equation (9.10.7) and a wave equation. This is contained in the multiplicative factor to (9.10.9) that would make it the complete solution. However, if it makes any sense to introduce a time component to the line element (9.10.10), the wave equation would have to be compatible with a hyperbolic-invariant form of the metric. Nevertheless, both the parabolic and hyperbolic equations of motion give the spatial component of the metric. This fact leaves much to be desired in assuming a hyperbolic-invariant form, implying the existence of thermal waves, as opposed to thermal diffusion.

In an analogous way that we went from a space of negative to positive curvature by exchanging a source for a sink, if we make the substitution µ → iµ, the Bessel function, $J_0(ix) = I_0(x)$, becomes a modified Bessel function. Since

$$I_0(x) = 1 + \frac{x^{2}}{4} + \frac{x^{4}}{64} + \cdots,$$

the line element,

$$ds^{2} = \frac{dr^{2} + r^{2}\,d\varphi^{2}}{T_0^{2} I_0^{2}(\lambda r/r_1)},$$

will have positive constant curvature for large disc radii. This is, indeed, surprising inasmuch as one would think that circular and ordinary Bessel functions would apply to bounded, positive curvature, while hyperbolic and modified Bessel functions would be compatible with unbounded, negative curvature.

9.10.2 A matter of curvature

Geometries which are both homogeneous and isotropic have constant (Gaussian) curvature. Gaussian curvature is a measure of a surface’s intrinsic geometry, or the invariance of a surface to bending without stretching. As we know, there are three distinct simply connected isotropic geometries


in any dimension: Euclidean with zero curvature, elliptic with positive curvature, and hyperbolic with negative curvature. Homogeneity implies that there is at least one isometry that takes one point to another so that the points appear to be indistinguishable. Isotropy implies that all directions appear the same.

The appearance of non-Euclidean geometries with constant curvature is rather rare because homogeneity and isotropy are very strong conditions which are seldom met with in cosmology. The fact that the inertial mass of a rotating system can be handled within the confines of constant negative curvature, while gravitational mass cannot, leads us to believe that the fields of acceleration of uniform rotation and gravitation are not equivalent.

To transform the exterior solution of the Schwarzschild [16] metric into the interior one, possessing constant (negative) curvature, it is necessary to assume that the mass is not a function of the radius r. Then for objects in which the density, ρ, is essentially uniform, $M = (4\pi/3)\rho r^{3}$; introducing this into the Schwarzschild metric renders it equivalent to the hyperbolic metric with constant Gaussian curvature, (9.6.17), where the absolute constant is given by (9.6.37). Gaussian curvature appears here as a relativistic effect, vanishing in the nonrelativistic limit as the speed of light increases without limit. Flatness can be achieved not only in the limit of a vanishing density, but also in the case where relativistic effects become negligible.

The two cases of constant density and constant mass are distinguishable by the different slopes of the curve of the velocity of rotation of galaxies as a function of their distance from the galactic center. For distances less than $r_c = 2 \times 10^{4}$ light years the curve rises with a constant positive slope. That means, if the centrifugal and gravitational forces just balance one another, the rotational velocity is proportional to the density, which remains essentially uniform.

For distances greater than this value, the curve slopes downward, where the rotational velocity is now proportional to the inverse square root of the distance from the galactic center. This implies that the galactic mass is confined to a region whose volume has a radius less than $r_c$, for once outside this volume it appears that the mass is independent of the radius.


The transformation from constant density to one of constant mass necessitates replacing the metric coefficients (9.6.9a) and (9.6.9b) by

$$E = \frac{\kappa^{2}}{(1 - \alpha/r)^{2}}, \tag{9.10.12a}$$

$$G = \frac{\kappa^{2} r^{2}}{1 - \alpha/r}, \tag{9.10.12b}$$

respectively. The Gaussian curvature (9.6.17),

$$K = -\frac{\alpha}{\kappa^{2} r^{3}}\left(1 - \frac{3\alpha}{4r}\right), \tag{9.10.13}$$

will be negative provided $r > \frac{3}{4}\alpha$. Although this distance is less than the Schwarzschild radius, the inequality means that the singularity cannot be approached without a change in the sign of curvature. The coexistence of elliptic and hyperbolic spaces depending on the distance from the singularity does seem rather surprising. However, distances less than α invalidate the positive definiteness of the stereographic inner product, and thus ensure the negativeness of the Gaussian curvature, (9.10.13).

Under this transformation, the metric (9.6.8) transforms into

$$dh^{2} = \frac{dr^{2} + (r\,d\vartheta)^{2}(1 - \alpha/r)}{(1 - \alpha/r)^{2}},$$

where 1 − α/r is the stereographic inner product for a non-constant, negative, curvature for r > α.
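The curvature (9.10.13) can be recovered numerically from the coefficients (9.10.12a) and (9.10.12b). The sketch below assumes the standard expression for the Gaussian curvature of an orthogonal metric whose coefficients depend on r alone, K = −(EG)^(−1/2) d/dr[(d√G/dr)/√E], and evaluates it with nested central differences. The values of α and κ and all function names are illustrative assumptions.

```python
import math

ALPHA, KAPPA = 1.0, 2.0   # illustrative values only

def E(r):
    """Metric coefficient (9.10.12a)."""
    return KAPPA**2 / (1.0 - ALPHA / r)**2

def G(r):
    """Metric coefficient (9.10.12b)."""
    return KAPPA**2 * r**2 / (1.0 - ALPHA / r)

def gaussian_curvature(r, h=1e-4):
    """K = -(1/sqrt(EG)) d/dr[ (d sqrt(G)/dr) / sqrt(E) ], via central differences."""
    def inner(x):
        d_sqrt_g = (math.sqrt(G(x + h)) - math.sqrt(G(x - h))) / (2.0 * h)
        return d_sqrt_g / math.sqrt(E(x))
    d_inner = (inner(r + h) - inner(r - h)) / (2.0 * h)
    return -d_inner / math.sqrt(E(r) * G(r))

def closed_form(r):
    """Eq. (9.10.13): K = -(alpha / (kappa^2 r^3)) (1 - 3 alpha / (4 r))."""
    return -ALPHA / (KAPPA**2 * r**3) * (1.0 - 3.0 * ALPHA / (4.0 * r))

if __name__ == "__main__":
    for r in (2.0, 3.0, 5.0):
        print(r, gaussian_curvature(r), closed_form(r))
```

The closed form changes sign at r = 3α/4, which is the threshold quoted in the text.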

9.10.3 Schwarzschild’s metric: How a nobody became a one-body

Surely there has never been a more ludicrous attempt to prove a conclusion in physical science than this arbitrary fixation of a constant, with equal justification, might have been given any value we please.e

O’Rahilly [38]

The transition from a system of constant mass to one of constant density is exemplified by the exterior and interior solutions to the Schwarzschild

e O’Rahilly’s comment about the rest energy applies equally to the constant of integration in Schwarzschild’s metric.


metric. Schwarzschild [16] studied a static spherically symmetric field produced by a spherically symmetric body at rest. The static condition does not mean that dt = 0, but, rather, that the coefficients of the fundamental form do not depend upon time.

In spherical coordinates, the line element is

$$ds^{2} = E\,dr^{2} + F r^{2}\,d\sigma^{2} - G\,dt^{2}, \tag{9.10.14}$$

where $d\sigma^{2}$ is given by

$$d\sigma^{2} = d\vartheta^{2} + \sin^{2}\vartheta\,d\varphi^{2}. \tag{9.10.15}$$

Since spherical symmetry is invoked, the coefficients of the fundamental form, E, G, and F, can, at most, be functions of the radial coordinate.

Einstein’s condition for empty space is that the Ricci tensor should vanish,

$$R_{\mu\nu} = 0. \tag{9.10.16}$$

According to Dirac [75], this

constitutes a law of gravitation. ‘Empty’ here means that there is no matter present and no physical fields except the gravitational field. The gravitational field does not disturb the emptiness. Other fields do.

We repeat our statement made in the Introduction (Chapter 1): That gravity acts where matter and radiation are not does not seem credible.

Since we are looking for a spherically symmetric solution, we need not consider the angular dependency in the metric. If we set F = 0, we do not find that the Ricci tensor components vanish, but only the contracted scalar curvature.

The unknowns are determined by Einstein’s equations that involve the contracted Ricci tensor,

$$R_{\mu\nu} = \Gamma^{\lambda}_{\mu\lambda,\nu} - \Gamma^{\lambda}_{\mu\nu,\lambda} + \Gamma^{\rho}_{\mu\lambda}\Gamma^{\lambda}_{\nu\rho} - \Gamma^{\rho}_{\mu\nu}\Gamma^{\lambda}_{\rho\lambda}, \tag{9.10.17}$$

where the Einstein convention of summing over repeated suffixes is used, and the comma in the subscript indicates differentiation with respect to the coordinate that follows. The only nonvanishing Christoffel symbols of the


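second kind are:

$$\Gamma^{1}_{11} = \frac{E_r}{2E}, \qquad \Gamma^{1}_{22} = \frac{G_r}{2E}, \qquad \Gamma^{2}_{12} = \frac{G_r}{2G}.$$

These expressions are to be substituted into (9.10.17). Since the only surviving components of the Ricci tensor are

$$R_{11} = \Gamma^{2}_{12,1} - \Gamma^{1}_{11}\Gamma^{2}_{12} + \left(\Gamma^{2}_{12}\right)^{2},$$

$$R_{22} = -\Gamma^{1}_{22,1} - \Gamma^{1}_{22}\Gamma^{1}_{11} + \Gamma^{2}_{21}\Gamma^{1}_{22},$$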

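where 1 = r and 2 = t, we get

$$R_{11} = \left(\frac{G_r}{2G}\right)_r - \frac{E_r G_r}{4EG} + \frac{G_r^{2}}{4G^{2}} = \frac{\alpha r - \frac{3}{4}\alpha^{2}}{r^{4}(1 - \alpha/r)^{2}}, \tag{9.10.18}$$

$$R_{22} = -\left(\frac{G_r}{2E}\right)_r + \frac{G_r^{2}}{4EG} - \frac{G_r E_r}{4E^{2}} = -\frac{\alpha r - \frac{3}{4}\alpha^{2}}{r^{2}(1 - \alpha/r)}. \tag{9.10.19}$$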

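It is apparent that neither (9.10.18) nor (9.10.19) vanishes. But upon dividing (9.10.18) by E and (9.10.19) by G, their sum, or total scalar curvature, R, does vanish, i.e.

$$R = \frac{R_{11}}{E} + \frac{R_{22}}{G} = \frac{1}{E}\left(\frac{G_r}{2G}\right)_r - \frac{1}{G}\left(\frac{G_r}{2E}\right)_r - \frac{E_r G_r}{2E^{2}G} + \frac{G_r^{2}}{2G^{2}E} = \frac{\alpha r - \frac{3}{4}\alpha^{2}}{\kappa^{2} r^{4}} - \frac{\alpha r - \frac{3}{4}\alpha^{2}}{\kappa^{2} r^{4}} = 0. \tag{9.10.20}$$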

So for a spherically symmetric solution, F = 0, (9.10.18) and (9.10.19) do not vanish. It is only when we set F = 1 in (9.10.14) that they vanish separately. The big question is: why should the angular dependence make a difference when gravity acts radially? At least we can say Newtonian gravity acts radially, and if there are angle dependencies, like those in Ampère’s law, these angular dependencies should be universal and follow some law. That the Ricci tensor $R_{\nu\mu} \neq 0$ for the spherically symmetric solution, while $R_{\nu\mu} = 0$ when the angle dependencies are included, makes the criterion for empty space, (9.10.16), extremely dubious.
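The statement that (9.10.18) and (9.10.19) are separately nonzero while the combination (9.10.20) vanishes can be checked numerically. The sketch below evaluates the left-hand expressions of (9.10.18) and (9.10.19) by central differences for the coefficients (9.10.12a) and (9.10.12b); the choice α = κ = 1, the step size, and the function names are assumptions made for illustration.

```python
ALPHA, KAPPA = 1.0, 1.0   # illustrative values only

def E(r):
    """Metric coefficient (9.10.12a)."""
    return KAPPA**2 / (1.0 - ALPHA / r)**2

def G(r):
    """Metric coefficient (9.10.12b)."""
    return KAPPA**2 * r**2 / (1.0 - ALPHA / r)

def derivative(f, x, h=1e-4):
    """Central finite difference."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def ricci_components(r):
    """Left-hand sides of (9.10.18) and (9.10.19), with subscript r meaning d/dr."""
    E_r, G_r = derivative(E, r), derivative(G, r)
    R11 = (derivative(lambda x: derivative(G, x) / (2.0 * G(x)), r)
           - E_r * G_r / (4.0 * E(r) * G(r))
           + G_r**2 / (4.0 * G(r)**2))
    R22 = (-derivative(lambda x: derivative(G, x) / (2.0 * E(x)), r)
           + G_r**2 / (4.0 * E(r) * G(r))
           - G_r * E_r / (4.0 * E(r)**2))
    return R11, R22

def scalar_curvature(r):
    """Eq. (9.10.20): R = R11 / E + R22 / G."""
    R11, R22 = ricci_components(r)
    return R11 / E(r) + R22 / G(r)

if __name__ == "__main__":
    r = 3.0
    print("R11, R22:", ricci_components(r))
    print("R:", scalar_curvature(r))
```

For α = κ = 1 and r = 3, the right-hand sides of (9.10.18) and (9.10.19) evaluate to 0.0625 and −0.375, which the finite-difference values should reproduce to within the discretization error.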


We want now to compare criterion (9.10.16) with what we know from differential geometry. However, the metric (9.10.14) is an indefinite form. It can be made definite by substituting τ for $it$ as the independent variable. Then, since the coefficients of the fundamental form depend only on the radial coordinate, we can set F = 0 and obtain

$$ds^{2} = E(r)\,dr^{2} + G(r)\,d\tau^{2}.$$

In the Schwarzschild solution F is set equal to unity, and all the radial dependencies fall on the coefficients E and G of the fundamental form. This has the effect of changing the sign of $\Gamma^{1}_{22}$ so that $R_{22}$ changes sign, but does not vanish. The total curvature is

$$R = \frac{G_{rr}}{GE} - \frac{G_r E_r}{2GE^{2}} - \frac{G_r^{2}}{2EG^{2}} = \frac{2}{\sqrt{EG}}\left(\frac{(\sqrt{G})_r}{\sqrt{E}}\right)_r = -2K,$$

where K is the Gaussian curvature, (9.6.17). This is a particular form of the general relation,

$$K = -\frac{1}{n(n-1)}R,$$

for n = 2 dimensions.

However, we cannot attach any significance to the imaginary time variable, τ, in determining the curvature of space-time, for the time has no significance in terms of curvature. Clocks may run slower in a gravitational field, but that is the effect of the gravitational field on time keepers, and not the effect that time has on the field. And the reason why clocks do run slower in a gravitational field certainly has nothing to do with a shift in frequency due to the Doppler effect because velocities do not enter at all.

Even more can be said about the outer solution, when we try to match the two conditions at the radius, $r_1$, of the sphere, i.e.

$$1 - \frac{\alpha}{r_1} = 1 - \frac{r_1^{2}}{R^{2}}. \tag{9.10.21}$$

If such a relation were valid for any generic r, we might try to set

$$1 - r^{2}/R^{2} = \sqrt{1 - \alpha/r} \simeq 1 - \alpha/2r, \tag{9.10.22}$$


for relatively weak fields. But, if we are to replace this in the spatial metric for the outer solution,

$$dl^{2} = \frac{dr^{2}}{1 - \alpha/r} + r^{2}\,d\varphi^{2}, \tag{9.10.23}$$

the angular term must also change. For if we want to replace $(1 - \alpha/r)^{2}$ by $(1 - 2r/R) \simeq (1 - r^{2}/R^{2})^{2}$, for large R, we must recall that the hyperbolic line element,

$$dl^{2} = dx^{2} + dy^{2} - dz^{2},$$

becomes the Beltrami metric,

$$dl^{2} = \frac{dr^{2}}{(1 - r^{2}/R^{2})^{2}} + \frac{r^{2}\,d\varphi^{2}}{1 - r^{2}/R^{2}},$$

or, equivalently,

$$dl^{2} = d\bar r^{2} + R^{2}\sinh^{2}(\bar r/R)\,d\varphi^{2}, \tag{9.10.24}$$

since $\bar r = R\tanh^{-1}(r/R)$, under the pseudospherical coordinates, R, $\bar r$, and ϕ, for which

$$\begin{aligned} z &= R\cosh(\bar r/R), \\ x &= R\sinh(\bar r/R)\cos\varphi, \\ y &= R\sinh(\bar r/R)\sin\varphi, \end{aligned} \tag{9.10.25}$$

where $0 \le \bar r < \infty$, and 0 ≤ ϕ < 2π.

Thus, for weak fields, α, that imply a large absolute constant, R, according to (9.10.22), the Beltrami metric, (9.10.24), can be used for the space part of the outer Schwarzschild metric, (9.10.23). We will now show that the transition that occurs at the surface $r_1$ is one from a hyperbolic metric, (9.10.24), for $r > r_1$ to an elliptic metric [cf. (9.10.29) below] for $r < r_1$.
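The equivalence of the Beltrami form and the pseudospherical form (9.10.24) can be verified numerically. The sketch below compares the metric coefficients after substituting the hyperbolic radius R tanh⁻¹(r/R), and also checks that the coordinates (9.10.25) lie on the hyperboloid x² + y² − z² = −R². All function names and sample values are assumptions made for this illustration.

```python
import math

def hyperbolic_radius(r, R):
    """Hyperbolic measure of the Euclidean radius r: R * atanh(r / R)."""
    return R * math.atanh(r / R)

def beltrami_coefficients(r, R):
    """Coefficients of dr^2 and dphi^2 in the Beltrami metric."""
    u = 1.0 - (r / R)**2
    return 1.0 / u**2, r**2 / u

def pseudospherical_point(r_bar, phi, R):
    """Coordinates (x, y, z) of (9.10.25)."""
    s = R * math.sinh(r_bar / R)
    return s * math.cos(phi), s * math.sin(phi), R * math.cosh(r_bar / R)

if __name__ == "__main__":
    R, r, h = 2.0, 1.2, 1e-6
    r_bar = hyperbolic_radius(r, R)
    E, G = beltrami_coefficients(r, R)
    # (d r_bar / d r)^2 should reproduce the radial coefficient E.
    dr_bar = (hyperbolic_radius(r + h, R) - hyperbolic_radius(r - h, R)) / (2.0 * h)
    print("radial:", E, dr_bar**2)
    print("angular:", G, (R * math.sinh(r_bar / R))**2)
    x, y, z = pseudospherical_point(r_bar, 0.7, R)
    print("x^2 + y^2 - z^2 =", x * x + y * y - z * z)
```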

9.10.4 Schwarzschild’s metric: The inside story

Landau and Lifshitz [75] would contest the existence of the inner solution.

For a field in the interior of a spherical cavity in a centrally symmetric distribution, we must have [E = G = 1], since otherwise the metric would be singular at r = 0. Thus the metric inside such a cavity is automatically Galilean, i.e. there is no gravitational field in the interior of the cavity (just as in Newtonian theory).


It is incomprehensible why the boundaries of the disc would pose such a problem as to warrant reducing the geometry on its interior to a Euclidean one.

The coefficients of the fundamental form, (9.10.14), are [Møller 52]

$$E(r) = \frac{1}{1 - 2M/r - \lambda r^{2}/3}, \tag{9.10.26}$$

where M and λ are constants, F = 1, and G = 0. For the exterior solution, λ is set equal to zero, while for the interior solution, M = 0 [Møller 52]. However, if we set λ = 8πρ, this can be seen as a transition from one of constant mass, M, to one of constant density, ρ. But why should they not be mutually exclusive in (9.10.26)? Or is it an artifice to transfer from the outer to the inner solution?

Then the line element to consider is

$$dl^{2} = \frac{dr^{2}}{1 - 2M/r - \lambda r^{2}/3} + r^{2}\,d\varphi^{2}, \tag{9.10.27}$$

and to simplify matters still further we have set ϑ = π/2, placing us in the plane. The exterior solution, where λ = 0, falls outside the domain of non-Euclidean geometries of constant curvature, but the interior solution, where M = 0, certainly does come under their jurisdiction. For then (9.10.27) becomes the line element of elliptic space,

$$dl^{2} = \frac{dr^{2}}{1 - r^{2}/R^{2}} + r^{2}\,d\varphi^{2}, \tag{9.10.28}$$

or, equivalently,

$$dl^{2} = d\bar r^{2} + R^{2}\sin^{2}(\bar r/R)\,d\varphi^{2}, \tag{9.10.29}$$

of positive, constant curvature, $1/R^{2}$, where

$$\bar r = R\sin^{-1}(r/R), \tag{9.10.30}$$

and $R = \sqrt{3/\lambda}$ is the absolute constant.

Now comes the crux of the matter: If we hold r constant in (9.10.27) with M = 0, we obtain the periphery of a circle with length 2πr. Thus comes the conclusion that “the geometry of the surface $r = r_1 = \text{const.}$ is the same as on a sphere of radius $r_1$ in Euclidean space” [Møller 52]. This is inaccurate since once the radial part of the metric (9.10.28) is given, the angular part is


that in (9.10.29). And at constant $\bar r$, the angular part can be integrated from 0 to 2π to give

$$R\int_0^{2\pi}\sin(\bar r/R)\,d\varphi = 2\pi R\sin(\bar r/R) < 2\pi\bar r.$$

We would indeed measure a larger circumference than what the Poincarites would measure. We would say that our rulers have undergone a space dilatation.

In contrast to what Møller purported, that we would see no difference in the circumference, our standard rulers do not give $r_1$ as the distance from the origin, r = 0, but, rather,

$$\bar r_1 = \int_0^{r_1}\frac{dr}{\sqrt{1 - r^{2}/R^{2}}} = R\sin^{-1}\frac{r_1}{R},$$

which is noticeably larger than $r_1$ because our ‘standard’ rulers are Euclidean rulers!
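A short numerical illustration of these two statements follows. The sketch integrates the radial part of (9.10.28) with a midpoint rule, compares it with the closed form R sin⁻¹(r1/R), and evaluates the circumference 2πR sin(r̄/R) at the resulting radius. The function names, the number of quadrature points, and the sample values are assumptions made for this example.

```python
import math

def proper_radius_numeric(r1, R, n=20000):
    """Midpoint-rule integral of dr / sqrt(1 - r^2 / R^2) from 0 to r1."""
    h = r1 / n
    return sum(h / math.sqrt(1.0 - ((i + 0.5) * h / R)**2) for i in range(n))

def proper_radius_closed(r1, R):
    """Closed form of the same integral: R * asin(r1 / R)."""
    return R * math.asin(r1 / R)

def circumference(r_bar, R):
    """Circumference at elliptic radius r_bar: 2 pi R sin(r_bar / R)."""
    return 2.0 * math.pi * R * math.sin(r_bar / R)

if __name__ == "__main__":
    r1, R = 0.5, 1.0
    r_bar = proper_radius_closed(r1, R)
    print("numeric:", proper_radius_numeric(r1, R), "closed:", r_bar)
    print("circumference:", circumference(r_bar, R), "2 pi r_bar:", 2.0 * math.pi * r_bar)
```

The circumference equals 2πr1 in terms of the Euclidean coordinate, yet falls short of 2π times the measured radius r̄1, which is the inequality displayed above.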

Thus, the internal solution to Schwarzschild’s problem is an example of elliptic geometry with constant curvature. It appears as the antithesis of the uniformly rotating disc. Rulers measuring the circumference of the disc appear stretched. The Schwarzschild problem is not a single problem for it entails transiting from a metric where λ = 0 and α ≠ 0 to one of α = 0 and λ ≠ 0.

9.11 Cosmological Models

9.11.1 The general projective metric in the plane

Surprisingly, cosmological models with constant densities would correspond to the Schwarzschild inner solution, but with more options available. Everyone, or almost everyone, begins with the Friedmann–Lemaître–Robertson–Walker metric,

$$ds^{2} = -dt^{2} + R^{2}(t)\left[\frac{dr^{2}}{1 - kr^{2}} + r^{2}\,d\sigma^{2}\right], \tag{9.11.1}$$

where the parameter k determines the spatial curvature, t is ‘cosmic’ time (whatever that is), and R(t) is the scale factor. For k = +1 the spatial sections correspond to a sphere, or one of higher dimensions; for k = −1 the spatial


sections correspond to a universe with negatively curved sections, and finally for k = 0, the spatial sections are flat.

As we know, we can write (9.11.1) equally well as

$$ds^{2} = -dt^{2} + R^{2}(t)\left[d\chi^{2} + k^{-1}\sin^{2}\chi\,d\sigma^{2}\right], \tag{9.11.2}$$

where we introduce the ‘angle,’ χ, in place of the coordinate r according to $\chi = \sin^{-1}(\sqrt{k}\,r)$. Here, the parameter k has turned up elsewhere, and that elsewhere is subsequently set equal to zero on the basis of isotropy, so that the only way to go from a closed to an open model is through the transformation χ → iχ. Furthermore, by the transformation [Rindler 77],

$$r = \frac{\rho}{1 + \frac{1}{4}k\rho^{2}}, \tag{9.11.3}$$

(9.11.1) can be written as

$$ds^{2} = -dt^{2} + R^{2}(t)\left[\frac{d\rho^{2} + \rho^{2}\,d\sigma^{2}}{(1 + \frac{1}{4}k\rho^{2})^{2}}\right], \tag{9.11.4}$$

whose spatial part is still determined by the sign of k.

We recognize the terms in the square brackets of (9.11.4) as the stereographic inner product metric; for k > 0 it is elliptic while for k < 0 it is hyperbolic. The latter occupied our attention in Sec. 7.4. If ρ is the hyperbolic distance, the r in the transform (9.11.3) cannot be.

While it is true that multiplying the metric by a factor $R^{2}$ will decrease the curvature to $k/R^{2}$, this says nothing about the size of the disc itself. In both the elliptic and hyperbolic cases the radius,

$$r_0 = 1/\sqrt{k}, \tag{9.11.5}$$

is a constant! So any scale factor, R(t), will have no effect upon the plane where the Poincarites live. It is usually argued that the volume in elliptic space is independent of k, an absolute constant. The volume of a cone of length $r_1$ and solid angle Ω is [Rindler 77]

$$V(t) = \Omega R^{3}(t)\int_0^{r_1}\frac{r^{2}\,dr}{\sqrt{1 - kr^{2}}} = \Omega R^{3}(t)\int_0^{r_1} r^{2}\left(1 + \frac{1}{2}kr^{2} + \cdots\right)dr,$$

which to lowest-order is independent of k.


A closer, and more precise, derivation of the expression for the volume in elliptic space shows that this is not true, but holds only approximately in the Euclidean limit. For the elliptic line element (9.11.2), the volume element is

$$dV(t) = R^{3}(t)\,k^{-1}\sin^{2}(\sqrt{k}\,r)\,\sin\vartheta\,dr\,d\vartheta\,d\varphi.$$

For a sphere of radius $r < \pi/2\sqrt{k}$, the volume is then [cf. (9.2.10)]

$$R^{3}(t)\,k^{-1}\int_0^{2\pi}d\varphi\int_0^{\pi}\sin\vartheta\,d\vartheta\int_0^{r}\sin^{2}(\sqrt{k}\,t)\,dt = 2\pi\left(\frac{R(t)}{\sqrt{k}}\right)^{3}\left[\sqrt{k}\,r - \sin(\sqrt{k}\,r)\cos(\sqrt{k}\,r)\right]. \tag{9.11.6}$$

As (9.11.6) clearly shows, only in the limit as k → 0 does the term in the brackets tend to $\frac{2}{3}k^{3/2}r^{3}$. This is the Euclidean limit of an infinite radius of curvature in which k disappears from the expression for the volume.
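The Euclidean limit of (9.11.6) can be illustrated numerically. Expanding the bracket for small √k r gives a leading term (2/3)k^(3/2)r³, which, multiplied by the prefactor, reproduces the Euclidean volume (4π/3)(Rr)³. The sketch below compares the two volumes for a small and a large value of k; the function names and sample values are assumptions made for this example.

```python
import math

def elliptic_ball_volume(r, k, scale=1.0):
    """Right-hand side of (9.11.6) for scale factor R(t) = scale and curvature k > 0."""
    x = math.sqrt(k) * r
    return 2.0 * math.pi * (scale / math.sqrt(k))**3 * (x - math.sin(x) * math.cos(x))

def euclidean_ball_volume(r, scale=1.0):
    """Euclidean volume (4 pi / 3) (scale * r)^3."""
    return 4.0 * math.pi / 3.0 * (scale * r)**3

if __name__ == "__main__":
    for k in (1e-6, 1e-2, 1.0):
        ratio = elliptic_ball_volume(1.0, k) / euclidean_ball_volume(1.0)
        print("k =", k, "volume ratio =", ratio)
```

The ratio approaches unity only as k tends to zero; for k of order one the dependence on k is plainly visible.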

So if r is not the radial coordinate in the non-Euclidean plane then just what is it? For the case k = 1, it is given in Fig. 9.9. It supposedly represents a ‘geodesic plane’ through the origin O obtained by setting ϑ = π/2 in the metrics (9.11.1) or (9.11.4). The disc of radius 1 is just the elliptic plane where rulers get longer as they move further from O. Points close to the rim have very small stereographic arc lengths since they correspond to circles about the north pole on the sphere.

Fig. 9.9. Interpretation of the variables of the two metrics which are the radii of the elliptic plane.


Admittedly, in the hyperbolic case, “no such simple interpretation of ρ and r exist. . . Light propagates along geodesics, e.g. along great circles on the sphere and straight lines in the plane” [Rindler 77]. But what does light propagate along in hyperbolic geometry? This answer we already know: along geodesics on the pseudosphere. Let us recall, from Sec. 2.5, what Beltrami did back in 1868. He mapped a negatively curved surface onto a unit disc. He took the disc as the plane, and lines within the disc as a measure of the distance between preimage points on the negatively curved surface. The distance between any two points is thus meaningful for all points in the unit disc. As one of the points tends to the rim of the disc, the distance tends to infinity so that the plane, and the lines in it, are indeed infinite. There is nothing beyond infinity and it makes no sense to consider expansion factors greater than unity.

Beyond the rim no stereographic projection takes place so R(t) > 1 would lie outside the elliptic plane, whose geometry is unknown, but certainly not that of the elliptic plane.

In Sec. 111 of Landau and Lifshitz [75] it is observed that since the radius of curvature in the ‘closed’ universe metric is (9.10.29), the way to cross over to negative curvature is “by replacing [R] by [iR].” For then

$$dl^{2} = \frac{dr^{2}}{1 + kr^{2}} + r^{2}\,d\sigma^{2}, \tag{9.11.7}$$

or what should amount to the same thing,

$$d\ell^{2} = R^{2}\{d\chi^{2} + \sinh^{2}\chi\,d\sigma^{2}\}, \tag{9.11.8}$$

where $k = -1/R^{2} > 0$, $\chi = \sinh^{-1}(\sqrt{k}\,r)$, and the ‘angle,’ χ, can go from 0 to ∞.

It cannot be over-emphasized that (9.11.7) is not a hyperbolic metric! Landau and Lifshitz, as well as all previous authors, should have realized that $\chi = \tanh^{-1}(\sqrt{k}\,r)$, and not $\chi = \sinh^{-1}(\sqrt{k}\,r)$, is the hyperbolic radius.

Landau and Lifshitz also fail to get the surface area of the sphere correctly. The Euclidean radius, $(1/\sqrt{k})\tanh\chi$, has to be multiplied by the ratio of the


hyperbolic to the Euclidean lengths of arc, $1/\sqrt{1 - kr^{2}} = \cosh\chi$, and this value times 2πr gives the length of the circumference of a hyperbolic circle of radius χ as $(2\pi/\sqrt{k})\sinh\chi$. The Euclidean element of area is

$$\sqrt{EG - F^{2}}\,dr\,d\varphi, \tag{9.11.9}$$

for ϑ = π/2. Thus, Landau and Lifshitz would obtain

$$\int_0^{2\pi}\!\!\int_0^{r}\frac{r}{\sqrt{1 + kr^{2}}}\,dr\,d\varphi = \frac{2\pi}{k}\left[\sqrt{1 + kr^{2}} - 1\right] = \frac{2\pi}{k}\left[\cosh\chi - 1\right] = \frac{4\pi}{k}\sinh^{2}(\chi/2), \tag{9.11.10}$$

which is correct, but for the wrong reason. The surface area, $(4\pi/k)\sinh^{2}\chi$, they claim, arises because the radius is $(1/\sqrt{k})\sinh\chi$.

The correct expression is obtained by inserting the coefficients of the fundamental form,

$$E = \frac{1}{(1 - kr^{2})^{2}}, \qquad G = \frac{r^{2}}{1 - kr^{2}},$$

into expression (9.11.9). Integration then gives

$$\int_0^{2\pi}\!\!\int_0^{r}\frac{r\,dr\,d\varphi}{(1 - kr^{2})^{3/2}} = \frac{2\pi}{k}\left[\frac{1}{\sqrt{1 - kr^{2}}} - 1\right] = \frac{2\pi}{k}(\cosh\chi - 1) = \frac{4\pi}{k}\sinh^{2}(\chi/2), \tag{9.11.11}$$

which is the same expression as Landau and Lifshitz would have found, (9.11.10), but, again, for the wrong reasons.
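Both area integrals can be checked by direct quadrature. The sketch below integrates the integrands of (9.11.10) and (9.11.11) with a midpoint rule and compares each with (4π/k) sinh²(χ/2), taking χ = sinh⁻¹(√k r) in the first case and χ = tanh⁻¹(√k r) in the second, as the respective identifications of cosh χ require. The helper names, the quadrature resolution, and the sample values are assumptions made for this example.

```python
import math

def midpoint_integral(f, a, b, n=20000):
    """Composite midpoint rule on [a, b] with n panels."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def area_from_9_11_10(r, k):
    """Left-hand side of (9.11.10)."""
    return 2.0 * math.pi * midpoint_integral(lambda s: s / math.sqrt(1.0 + k * s * s), 0.0, r)

def area_from_9_11_11(r, k):
    """Left-hand side of (9.11.11), valid for sqrt(k) * r < 1."""
    return 2.0 * math.pi * midpoint_integral(lambda s: s / (1.0 - k * s * s)**1.5, 0.0, r)

def closed_form(chi, k):
    """(4 pi / k) sinh^2(chi / 2)."""
    return 4.0 * math.pi / k * math.sinh(0.5 * chi)**2

if __name__ == "__main__":
    r, k = 0.5, 1.0
    x = math.sqrt(k) * r
    print(area_from_9_11_10(r, k), closed_form(math.asinh(x), k))
    print(area_from_9_11_11(r, k), closed_form(math.atanh(x), k))
```

The two closed forms share the same functional dependence on χ, but for a given coordinate r they are evaluated at different values of χ and therefore give different areas.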

The metric (9.11.7) can be derived on a geometrical analogy by considering the geometry of an isotropic three-dimensional surface embedded in a fictitious four-dimensional space, where the fourth coordinate, so we are told, has nothing to do with time. Then, by extending our imagination as well as the Pythagorean theorem, we require

$$R^{2} = x^{2} + y^{2} + z^{2} + w^{2}$$

to be constant, where R represents the radius of the hypersphere. The surface of this hypersphere will then be identified with our universe. Set

$$r^{2} = x^{2} + y^{2} + z^{2}$$


to be the square of the radius of our three-dimensional universe, so thatR2 = r2 + w2 = const. Differentiating we get r dr = −w dw, squaring andeliminating w2 in favor of R2, give

dw2 = r2

R2 − r2 dr2.

Adding this term to the metric dr2 + r2dσ2 gives

dl2 = R2

R2 − r2 dr2 + r2 dσ2,

and the transformation R → iR reproduces the spatial metric (9.11.7), withk = 1/R2.

How does the spatial part of the Robertson–Walker metric (9.11.1) stand up against the most general projective metric for the plane? All we have to do is to consider Beltrami’s derivation of his metric in Sec. 2.4, and from whose paper we took the title of this section. Consider the equation for the fundamental conic section $\Omega = 0$. Then consider two points x and y. We will then have three expressions $\Omega_{xx}$, $\Omega_{xy}$, and $\Omega_{yy}$. The cross-ratio of the two points to the two points where the line connecting them meets the conic is given by the quotient of the roots, $\lambda_+/\lambda_-$, of the quadratic equation [cf. Sec. 2.5],

$$\Omega_{xx}\lambda^2 - 2\lambda\,\Omega_{xy} + \Omega_{yy} = 0.$$

The cross-ratio is

$$\frac{\lambda_+}{\lambda_-} = \frac{\Omega_{xy} + \sqrt{\Omega_{xy}^2 - \Omega_{xx}\Omega_{yy}}}{\Omega_{xy} - \sqrt{\Omega_{xy}^2 - \Omega_{xx}\Omega_{yy}}},$$

and its logarithm

$$k \ln\left(\frac{\Omega_{xy} + \sqrt{\Omega_{xy}^2 - \Omega_{xx}\Omega_{yy}}}{\Omega_{xy} - \sqrt{\Omega_{xy}^2 - \Omega_{xx}\Omega_{yy}}}\right),$$

is the distance between the two points. The distance depends on the unit of measurement which is given by the absolute constant, k. Since the logarithm is twice $\tanh^{-1}$, we can express the distance between x and y as

$$2k\cosh^{-1}\frac{\Omega_{xy}}{\sqrt{\Omega_{xx}\Omega_{yy}}} = \pm 2ki\cos^{-1}\frac{\Omega_{xy}}{\sqrt{\Omega_{xx}\Omega_{yy}}}.$$
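The equivalence of the logarithmic, inverse hyperbolic tangent, and inverse hyperbolic cosine forms of the distance can be checked numerically. The following is a minimal sketch; the argument names `o_xx`, `o_xy`, `o_yy` and the sample values are hypothetical, chosen positive with `o_xy**2 > o_xx * o_yy` so that every expression is real.

```python
import math

def distance_from_log(o_xx, o_xy, o_yy, k=1.0):
    """k times the logarithm of the cross-ratio lambda_+/lambda_-."""
    root = math.sqrt(o_xy * o_xy - o_xx * o_yy)
    return k * math.log((o_xy + root) / (o_xy - root))

def distance_from_atanh(o_xx, o_xy, o_yy, k=1.0):
    """Same distance written as twice the inverse hyperbolic tangent."""
    root = math.sqrt(o_xy * o_xy - o_xx * o_yy)
    return 2.0 * k * math.atanh(root / o_xy)

def distance_from_acosh(o_xx, o_xy, o_yy, k=1.0):
    """Same distance written as 2k * acosh(o_xy / sqrt(o_xx * o_yy))."""
    return 2.0 * k * math.acosh(o_xy / math.sqrt(o_xx * o_yy))

# Hypothetical sample values of the three conic expressions
sample = (2.0, 4.0, 3.0)
print(distance_from_log(*sample),
      distance_from_atanh(*sample),
      distance_from_acosh(*sample))
```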


If the two points are infinitesimally close together, y = x + dx, we can approximate the distance,

$$2ik\sin^{-1}\frac{\sqrt{\Omega_{xx}\Omega_{dx\,dx} - \Omega_{x\,dx}^2}}{\Omega_{xx}},$$

by the argument itself, and come out with the square of the arc length as

$$d\ell^2 = 4k^2\,\frac{\Omega_{x\,dx}^2 - \Omega_{xx}\Omega_{dx\,dx}}{\Omega_{xx}^2}.$$

Taking the fundamental conic as a circle of radius 2k,

$$x^2 + y^2 = 4k^2,$$

we get

$$\Omega_{xx} = x^2 + y^2 - 4k^2, \qquad \Omega_{dx\,dx} = dx^2 + dy^2, \qquad \Omega_{x\,dx} = x\,dx + y\,dy.$$

Thus, the square of the line element can be brought into the form

$$d\ell^2 = 4k^2\,\frac{4k^2(dx^2 + dy^2) - (y\,dx - x\,dy)^2}{(4k^2 - x^2 - y^2)^2}. \tag{9.11.12}$$

Furthermore, if we introduce the polar coordinates, $x = r\cos\sigma$ and $y = r\sin\sigma$, then (9.11.12) becomes

$$d\ell^2 = \frac{dr^2}{\left(1 - \tfrac{1}{4}r^2/k^2\right)^2} + \frac{r^2\,d\sigma^2}{1 - \tfrac{1}{4}r^2/k^2}. \tag{9.11.13}$$

As (9.11.13) does not correspond to the spatial component of the Robertson–Walker metric, (9.11.1), the latter has led to the confusion of identifying the Euclidean measure of distance, r = 2k tanh (r/2k), with 2π sinh (r/2k), the circumference of a hyperbolic circle of radius r.
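The passage from the Cartesian form (9.11.12) to the polar form (9.11.13) can be checked by evaluating both quadratic forms on the same displacement. This is a sketch with illustrative function names and sample values; the displacement is converted with the exact differentials of $x = r\cos\sigma$, $y = r\sin\sigma$.

```python
import math

def dl2_cartesian(x, y, dx, dy, k):
    """Line element (9.11.12) for the conic x^2 + y^2 = 4k^2."""
    num = 4.0 * k * k * (dx * dx + dy * dy) - (y * dx - x * dy) ** 2
    return 4.0 * k * k * num / (4.0 * k * k - x * x - y * y) ** 2

def dl2_polar(r, dr, dsigma, k):
    """Line element (9.11.13) in polar coordinates."""
    f = 1.0 - 0.25 * r * r / (k * k)
    return dr * dr / f ** 2 + r * r * dsigma * dsigma / f

def polar_to_cartesian(r, sigma, dr, dsigma):
    """Point and displacement in Cartesian form (exact differentials)."""
    x, y = r * math.cos(sigma), r * math.sin(sigma)
    dx = math.cos(sigma) * dr - r * math.sin(sigma) * dsigma
    dy = math.sin(sigma) * dr + r * math.cos(sigma) * dsigma
    return x, y, dx, dy

# Illustrative point inside the conic (r < 2k) and an arbitrary displacement
k, r, sigma, dr, dsigma = 1.5, 1.1, 0.7, 0.3, -0.2
x, y, dx, dy = polar_to_cartesian(r, sigma, dr, dsigma)
print(dl2_cartesian(x, y, dx, dy, k), dl2_polar(r, dr, dsigma, k))
```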

9.11.2 The expanding Minkowski universe

The Minkowski metric,

$$ds^2 = d\tau^2 - d\rho^2 - \rho^2\,d\sigma^2, \tag{9.11.14}$$

may be transformed by

$$\rho = t\sinh\chi, \tag{9.11.15a}$$

$$\tau = t\cosh\chi, \tag{9.11.15b}$$

into the metric,

$$ds^2 = dt^2 - t^2\left[d\chi^2 + \sinh^2\chi\,d\sigma^2\right]. \tag{9.11.16}$$
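A small sketch can confirm that the substitution (9.11.15a,b) carries (9.11.14) into (9.11.16), by evaluating both quadratic forms on the same displacement. The helper names and sample numbers below are illustrative, and units with c = 1 are assumed.

```python
import math

def to_minkowski(t, chi, dt, dchi):
    """Apply rho = t sinh(chi), tau = t cosh(chi) and their exact differentials."""
    rho, tau = t * math.sinh(chi), t * math.cosh(chi)
    drho = math.sinh(chi) * dt + t * math.cosh(chi) * dchi
    dtau = math.cosh(chi) * dt + t * math.sinh(chi) * dchi
    return rho, tau, drho, dtau

def ds2_minkowski(dtau, drho, rho, dsigma):
    """Flat metric (9.11.14)."""
    return dtau ** 2 - drho ** 2 - rho ** 2 * dsigma ** 2

def ds2_expanding(dt, t, dchi, chi, dsigma):
    """Metric (9.11.16) with expansion factor R(t) = t."""
    return dt ** 2 - t ** 2 * (dchi ** 2 + math.sinh(chi) ** 2 * dsigma ** 2)

# Illustrative event and displacement
t, chi, dt, dchi, dsigma = 2.0, 0.8, 0.5, 0.1, 0.3
rho, tau, drho, dtau = to_minkowski(t, chi, dt, dchi)
print(ds2_minkowski(dtau, drho, rho, dsigma),
      ds2_expanding(dt, t, dchi, chi, dsigma))

# The ratio rho/tau is tanh(chi), and tau = t / sqrt(1 - v^2)
v = rho / tau
print(v, math.tanh(chi), tau, t / math.sqrt(1.0 - v * v))
```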

The metric (9.11.16) appears to have an expansion factor R(t) = t that the flat metric, (9.11.14), does not [cf. Eq. (8.7.5) and following discussion]. The space part is the metric of the hyperbolic plane, and

$$v = \rho/\tau = \tanh\chi,$$

would be associated with a recessional velocity of a galaxy ‘at coordinate distance,’ χ. But, from (9.11.15a) it would appear that χ is the ‘distance’ $\sinh^{-1}(\rho/t)$. For if this were the case, the rim would not be infinitely far away! In contrast, distance is defined in elliptic geometry by showing that it satisfies the triangle inequality, as we shall do in Sec. 9.11.3.

Miraculously, we have converted a flat space metric, (9.11.14), into a hyperbolic metric, (9.11.16), which has larger circumferences, areas and volumes than its Euclidean counterparts. We would also observe that (9.11.15b) is the time dilatation of the special theory, i.e.

$$\tau = t/\sqrt{1 - v^2}.$$

All this we have obtained from a seemingly innocuous transformation, (9.11.15a). But is it really innocuous?

Recall a similar transformation of the pseudospherical coordinates, (9.10.25). There the hyperbolic invariancy condition was

$$z^2 - x^2 - y^2 = R^2 = \text{const}.$$

Now, the similar condition on the transformations (9.11.15a) and (9.11.15b) gives

$$\tau^2 - \rho^2 = t^2 = \text{const}.$$

We might have imagined this when we were unable to express the metric, (9.11.16), in terms of its radial coordinates instead of its ‘angle,’ χ.


Fig. 9.10. The three possible scenarios of closed, flat and open universes. The freckles are the galaxies which are more or less evenly distributed.

9.11.3 Event horizons

Consider the first model, k > 0, of a spherical, closed universe in Fig. 9.10. If the sphere is being blown up like a rubber balloon, there will be photons that will never have the chance to reach us. Our own galaxy has been circled and the solid line is the geodesic that a photon would take to reach us. The dashed line separates those photons that reach us from those that do not in time $t = t_0$. In the three-dimensional model this would be represented as a light front called the ‘event horizon.’

The rays, or null geodesics, are determined from (9.11.1) by setting ds = 0, and since we can avail ourselves of spherical symmetry, we can put dσ = 0, leaving

$$\chi(\rho_0) = \int_0^{\rho_0}\frac{d\rho}{\sqrt{1 - k\rho^2}} = \int_0^{t_0}\frac{dt}{R(t)}, \tag{9.11.17}$$

as the definition of the ‘coordinate’ horizon, corresponding to the distance, in comoving coordinates, that a photon has traveled to arrive at an observer at $t_0$ when it started at the beginning of the universe. The ‘proper’ distance to the event horizon is defined as

$$\rho(\rho_0) = R(t_0)\chi(\rho_0) = R(t_0)\int_0^{\rho_0}\frac{d\rho}{\sqrt{1 - k\rho^2}} = R(t_0)\int_0^{t_0}\frac{c\,dt}{R(t)}. \tag{9.11.18}$$

Now for each of the scenarios depicted in Fig. 9.10, the coordinate horizon will be given by

$$\chi(\rho_0) = \int_0^{\rho_0}\frac{d\rho}{\sqrt{1 - k\rho^2}} = \begin{cases} \sin^{-1}\rho_0 & (k = 1),\\ \rho_0 & (k = 0),\\ \sinh^{-1}\rho_0 & (k = -1). \end{cases} \tag{9.11.19}$$
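The three closed forms in (9.11.19) can be compared against a direct quadrature of the integral. The sketch below uses a simple midpoint rule; the function names and the sample value of $\rho_0$ are illustrative only.

```python
import math

def chi_numeric(rho0, k, n=20000):
    """Midpoint-rule value of integral_0^rho0 d(rho) / sqrt(1 - k rho^2)."""
    h = rho0 / n
    total = 0.0
    for i in range(n):
        rho = (i + 0.5) * h
        total += 1.0 / math.sqrt(1.0 - k * rho * rho)
    return total * h

def chi_closed(rho0, k):
    """Closed forms of (9.11.19) for k = 1, 0, -1."""
    if k == 1:
        return math.asin(rho0)
    if k == 0:
        return rho0
    if k == -1:
        return math.asinh(rho0)
    raise ValueError("k must be 1, 0 or -1")

rho0 = 0.5  # illustrative; must satisfy rho0 < 1 for k = 1
for k in (1, 0, -1):
    print(k, chi_numeric(rho0, k), chi_closed(rho0, k))
```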


But, there are two solutions to $ds^2 = 0$, so we can have, equally well, the solutions

$$\chi(\rho_0) = -\int_0^{\rho_0}\frac{d\rho}{\sqrt{1 - k\rho^2}} = \begin{cases} \cos^{-1}\rho_0 & (k = 1),\\ -\rho_0 & (k = 0),\\ \cosh^{-1}\rho_0 & (k = -1). \end{cases} \tag{9.11.20}$$

As we have already mentioned, Landau and Lifshitz [75] always consider the spatial metric written in the form:

$$d\ell^2 = \frac{d\rho_0^2}{1 - \rho_0^2} + \rho_0^2\,d\sigma^2. \tag{9.11.21}$$

For the closed and flat universes, $\rho_0$ can be considered as the Euclidean distance from the origin, which can be chosen anywhere we please. This is because $\cos^{-1}\rho$ is a distance. Imagine a creature X at the north pole, and another creature Y creeping away from it. As Y heads towards the equator his image as seen by X becomes smaller and smaller. Once south of the equator, Y’s image, as viewed by X, begins to grow again until he reaches the south pole. On completing his trip around the world, Y returns to the north pole only to find his head pointing in the opposite direction. The elliptic plane has only one side!

It is easy to show that $\cos^{-1}\rho$ is a distance because it satisfies the triangle inequality [Busemann and Kelly 53, p. 213]. The triangle inequality states that the sum of the lengths of two sides of a triangle can never be less than the length of the third. Consider the normalized coordinates $(x_1, x_2)$ and $(y_1, y_2)$ such that $x_1^2 + x_2^2 = 1$ and $y_1^2 + y_2^2 = 1$, with $x_2, y_2 > 0$. Then the triangle inequality requires

$$\cos^{-1}x_2 + \cos^{-1}y_2 \ge \cos^{-1}(x_1y_1 + x_2y_2). \tag{9.11.22}$$

Since the cosine is a monotonically decreasing function on the interval (0, π), if we take the cosine of both sides of (9.11.22) we have to reverse the inequality. We then obtain

$$x_2y_2 - \sqrt{1 - x_2^2}\cdot\sqrt{1 - y_2^2} = x_2y_2 - |x_1y_1| \le |x_1y_1 + x_2y_2|.$$

No such relation holds for $\sinh^{-1}\rho_0$ in the third case in (9.11.19), so it cannot be considered as the distance from the origin. However, as we have seen in Sec. 2.2.4, thanks to the cross-ratio inequality (2.2.17), $\tanh^{-1}\rho$ satisfies the triangle inequality (2.2.18).
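The inequality (9.11.22) can be spot-checked on random unit vectors. The sketch below reads $\cos^{-1}x_2$ and $\cos^{-1}y_2$ as the angular distances of the two points from the pole (0, 1), which is an interpretation assumed here; the function names, the seed, and the tolerance are illustrative choices.

```python
import math
import random

def angle(a, b):
    """Angular distance between two unit vectors (clamped against rounding)."""
    dot = a[0] * b[0] + a[1] * b[1]
    return math.acos(max(-1.0, min(1.0, dot)))

def triangle_inequality_holds(trials=1000, seed=0, tol=1e-6):
    """Check acos(x2) + acos(y2) >= acos(x1*y1 + x2*y2) on random samples."""
    rng = random.Random(seed)
    pole = (0.0, 1.0)
    for _ in range(trials):
        a = rng.uniform(0.0, math.pi)  # sin(a) > 0 keeps the second component positive
        b = rng.uniform(0.0, math.pi)
        x = (math.cos(a), math.sin(a))
        y = (math.cos(b), math.sin(b))
        if angle(x, pole) + angle(pole, y) < angle(x, y) - tol:
            return False
    return True

print(triangle_inequality_holds())
```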


Even before Beltrami’s time,

$$d\ell^2 = d\chi^2 + k^2\sinh^2(\chi/k)\left(\sin^2\vartheta\,d\phi^2 + d\vartheta^2\right) \tag{9.11.23}$$

had been known as the line element of a pseudospherical surface. For ϑ = π/2 we have, in Beltrami’s [68] own words,

. . . the variable ϕ is taken as the longitude variable of the variable meridian, and consequently the radius of the parallel corresponding to the meridian is sinh χ. The variation of the radius is therefore cosh χ dχ, which is > dχ, and this is absurd, because the variation in question is the projection of dχ onto the plane containing the parallel.

Again in a letter, Gauss tells of a fundamental discovery. In a letter dated 12 July 1831 to Schumacher, Gauss states that the semi-perimeter of a non-Euclidean circle of radius χ has the value

$$\tfrac{1}{2}\pi k\left(e^{\chi/k} - e^{-\chi/k}\right), \tag{9.11.24}$$

where k is a constant. It is this constant that Gauss says may perhaps be detectable by measurements over very large distances, and Beltrami made it the radius of his pseudosphere.

A fortiori (9.11.23) can be transformed into the metric given in the Preface that was first written down by Riemann in his Habilitation. In order to do so, we write the angular component as $d\Lambda^2 = \sum_j d\lambda_j^2$, where the quantities $\lambda_j$ determine the direction of the radius vector [Beltrami 68],

$$x_j = r\lambda_j = 2k\tanh(\chi/2k)\,\lambda_j, \tag{9.11.25}$$

such that $\sum_j \lambda_j^2 = 1$. Taking the differential of (9.11.25) and rearranging, Beltrami gets

$$\cosh^2(\chi/2k)\,dx_j = \lambda_j\,d\chi + k\sinh(\chi/k)\,d\lambda_j, \tag{9.11.26}$$

where we used the double angle formula for sinh χ/k. Now, by definition,

$$\cosh^2(\chi/2k) = \frac{1}{1 - \frac{1}{4k^2}\sum_j x_j^2},$$

so that squaring and summing (9.11.26) give

$$\left[\frac{\sqrt{\sum_j dx_j^2}}{1 - \frac{1}{4k^2}\sum_j x_j^2}\right]^2 = d\chi^2 + k^2\sinh^2(\chi/k)\,d\Lambda^2,$$

which is precisely Riemann’s formula, (R), in the Preface. It has been derived from the fact that the inverse hyperbolic tangent, and not the inverse hyperbolic sine, is the hyperbolic length.
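The identity obtained by squaring and summing (9.11.26) can be checked numerically in two dimensions. In the sketch below the direction is parametrized as $\lambda = (\cos a, \sin a)$, an assumption made only for illustration; the cross term drops out because $\sum_j \lambda_j\,d\lambda_j = 0$. All names and sample values are hypothetical.

```python
import math

def riemann_form(chi, a, dchi, da, k):
    """Sum of dx_j^2 divided by (1 - sum x_j^2 / (4 k^2))^2, via (9.11.25)."""
    lam = (math.cos(a), math.sin(a))
    dlam = (-math.sin(a) * da, math.cos(a) * da)
    r = 2.0 * k * math.tanh(chi / (2.0 * k))
    dr = dchi / math.cosh(chi / (2.0 * k)) ** 2
    x = [r * l for l in lam]
    dx = [dr * l + r * dl for l, dl in zip(lam, dlam)]
    denom = 1.0 - sum(xj * xj for xj in x) / (4.0 * k * k)
    return sum(d * d for d in dx) / denom ** 2

def pseudospherical_form(chi, dchi, da, k):
    """d(chi)^2 + k^2 sinh^2(chi/k) d(Lambda)^2, with d(Lambda) = da in 2D."""
    return dchi ** 2 + (k * math.sinh(chi / k)) ** 2 * da ** 2

# Illustrative sample point and displacement
k, chi, a, dchi, da = 1.5, 0.8, 0.3, 0.2, -0.1
print(riemann_form(chi, a, dchi, da, k), pseudospherical_form(chi, dchi, da, k))
```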

Now turn to the condition imposed by setting $ds^2 = 0$ to get one of the two roots in (9.11.17). Two quantities depending on separate variables can be equal only if both are equal to a constant. The expansion factor R(t) must be determined from other considerations. Those considerations entail the Einstein or Friedmann equations.

Instead of (9.11.19) we might be tempted to try

$$\chi(\rho_0) = \int_0^{\rho_0}\frac{d\rho}{1 + k\rho^2} = \begin{cases} \tan^{-1}\rho_0 & (k = 1),\\ \rho_0 & (k = 0),\\ \tanh^{-1}\rho_0 & (k = -1). \end{cases} \tag{9.11.27}$$

Now, χ will have the meaning of distance only in the open universe, since $\tanh^{-1}\rho_0$ is a hyperbolic distance thanks to its logarithmic representation. That is, the hyperbolic distance between two elements ρ and ρ′ is

$$\tfrac{1}{2}k\ln\frac{\rho}{\rho'};$$

the distance from an element to itself,

$$\tfrac{1}{2}k\ln\frac{\rho}{\rho} = 0;$$

and, finally, the additivity of distances,

$$\tfrac{1}{2}k\ln\frac{\rho}{\rho''} = \tfrac{1}{2}k\ln\frac{\rho}{\rho'} + \tfrac{1}{2}k\ln\frac{\rho'}{\rho''}.$$

The ratio, ρ/ρ′, can be expressed as the cross-ratio {ρ, ρ′|0, ∞}, and the distance is the logarithm of the cross-ratio.


Hence, there is not a single metric that will encompass all three scenarios, which are distinguished by the three k values, and whose expressions are distances in their respective geometries.

9.11.4 Newtonian dynamics discovers the ‘big bang’

There is a general consensus that the Robertson–Walker metric has two undetermined ‘constants’: k and R(t). We now see how the field equations ‘impose’ conditions on these two ‘elements’ [Rindler 77].

Newton’s second law, applied to the universe, reads

$$m\ddot{R} = -\frac{mM}{R^2}, \tag{9.11.28}$$

where the expansion factor, R(t), is confused with the radial coordinate separating the masses m and M. If the total mass of the universe is constant, then

$$M = \frac{4\pi}{3}\rho(t)R^3(t) = \frac{4\pi}{3}\rho(t_0)R^3(t_0) = C/2 = \text{const}. \tag{9.11.29}$$

Thus, the equation of motion is

$$2\ddot{R} + \frac{C}{R^2} = 0.$$

Multiplying through by $\dot{R}$, we get the first integral of motion,

$$\dot{R}^2 - \frac{C}{R} + k = 0, \tag{9.11.30}$$

where k is an arbitrary constant of integration. Equation (9.11.30) is known as the Friedmann equation in Newtonian cosmology, which, with the exception of the cosmological constant, holds also in general relativity.
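That (9.11.30) is a first integral of the equation of motion can be illustrated by integrating $\ddot{R} = -C/(2R^2)$ numerically and watching $k = C/R - \dot{R}^2$ stay constant. The sketch below uses a hand-written fourth-order Runge–Kutta step; the initial data, step size, and function names are illustrative assumptions.

```python
def accel(R, C):
    """Right-hand side of the equation of motion: d2R/dt2 = -C / (2 R^2)."""
    return -C / (2.0 * R * R)

def rk4_step(R, V, h, C):
    """One classical Runge-Kutta step for the pair (R, V = dR/dt)."""
    k1r, k1v = V, accel(R, C)
    k2r, k2v = V + 0.5 * h * k1v, accel(R + 0.5 * h * k1r, C)
    k3r, k3v = V + 0.5 * h * k2v, accel(R + 0.5 * h * k2r, C)
    k4r, k4v = V + h * k3v, accel(R + h * k3r, C)
    R_new = R + h * (k1r + 2.0 * k2r + 2.0 * k3r + k4r) / 6.0
    V_new = V + h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return R_new, V_new

def energy_index(R, V, C):
    """The constant k of (9.11.30): k = C/R - (dR/dt)^2."""
    return C / R - V * V

def integrate(R0, V0, C, h=1e-3, steps=2000):
    """Advance (R, V) from the initial data over steps * h units of time."""
    R, V = R0, V0
    for _ in range(steps):
        R, V = rk4_step(R, V, h, C)
    return R, V

# Illustrative initial data giving k < 0 (unbounded expansion)
C, R0, V0 = 1.0, 1.0, 1.2
R1, V1 = integrate(R0, V0, C)
print(energy_index(R0, V0, C), energy_index(R1, V1, C))
```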

Just as in the Schwarzschild outer solution, (9.10.23), we are going to pack a lot of physics into an arbitrary constant of integration. The constant k is known as the ‘energy index,’ and it represents the total energy density of the universe. The only meaning that k can acquire from (9.11.30) is that of a negative energy density. But, if we want to associate it with the absolute constant in the Robertson–Walker metric, (9.11.1), we have to be prepared to assume that there are the possibilities for a positive energy density, as well as a zero energy density. Never have such exaggerations taken place in physics.

Even more can be, and is, said when (9.11.30) is integrated in time. Taking the positive square root in (9.11.30) and integrating give

$$\int_0^{R}\frac{d\rho}{\sqrt{C/\rho - k}} = t. \tag{9.11.31}$$

It is immediately apparent that the integral of the Friedmann equation, (9.11.31), is not the same as (9.11.18), for, otherwise, it would specify the expansion factor, R = 1. In fact, the solutions, (9.11.19), have nothing in common with the solutions

$$\frac{t}{C} = \begin{cases} \dfrac{1}{k}\left\{\dfrac{C}{\sqrt{k}}\tan^{-1}\sqrt{\dfrac{C - kR}{kR}} - \sqrt{R(C - kR)}\right\} & (k > 0),\\[2ex] \dfrac{2}{3}\left(\dfrac{R}{C}\right)^{3/2} & (k = 0),\\[2ex] \dfrac{1}{|k|}\left\{\sqrt{R(C + |k|R)} - \dfrac{C}{\sqrt{|k|}}\sinh^{-1}\sqrt{\dfrac{|k|R}{C}}\right\} & (k < 0). \end{cases} \tag{9.11.32}$$

The conventional interpretation of all this is: Plotting R(t) versus t, as in Fig. 9.11, all three curves coalesce at time t = 0. This supposedly represents the explosive birth of the cosmos, warmly referred to as the ‘big bang,’ a name coined by Fred Hoyle. At this point in time the universe started as a primordial fireball with infinite density and no size. As time passed, (9.11.32) predicts that the universe has three possibilities open to it: It can expand forever, with a positive energy index k < 0, or a zero energy index, k = 0, or it can fall back on itself with a big ‘crunch’ if it has a negative energy density, with k > 0. In the latter case, the energy acquired in the primordial fireball was not sufficient to sustain continual expansion.

Fig. 9.11. The fates of the universe.

The Friedmann equation, (9.11.30), can be written as the metric,

$$ds^2 = dt^2 - \frac{dR^2}{C/R - k}. \tag{9.11.33}$$

In order for the metric to be hyperbolic it is necessary that R < C/k, which substantially limits the evolution of the universe. Moreover, (9.11.33) assumes spherical symmetry so it will not hurt to add a term $-R^2\,d\phi^2$ to it in order to determine the space-like curvature. In other words, we consider a space-like slice of the present galactic time, and determine the Gaussian curvature as

$$K = \frac{C}{2R^3} = \frac{4\pi}{3}\,\frac{\rho(t_0)R_0^3}{R^3(t)} = \frac{4\pi}{3}\rho(t), \tag{9.11.34}$$

on the strength of the conservation of total mass in the universe, (9.11.29). Equation (9.11.34) shows that the Gaussian curvature is positive if the density of matter is positive. However, with negative energy density we would also have to make leeway for negative densities of matter, even though the total mass of the universe is positive and constant. It should also be borne in mind that the Gaussian curvature, (9.11.34), is independent of the energy index, k. This alone makes the whole scenario dubious, for, by basing it on his gravitational law and his second law, Newton himself could have arrived at the big bang scenario of the universe! But (9.11.28) only has the exterior appearance of Newton’s law, in which R is the distance between the two masses m and M, and not the expansion factor, R(t). So we cannot blame Newton for these scenarios!

References

[Beltrami 68] E. Beltrami, “Teoria fondamentale degli spazi di curvatura costante,” Annali di Matematica Pura ed Applicata, series II (1868) 232–255; translated in J. Stillwell (ed.), Sources of Hyperbolic Geometry (Amer. Math. Soc., Providence RI, 1996), pp. 41–62.

[Busemann and Kelly 53] H. Busemann and P. J. Kelly, Projective Geometry and Projective Metrics (Academic Press, New York, 1953).

[Ehrenfest 09] P. Ehrenfest, “Gleichförmige Rotation starrer Körper und Relativitätstheorie,” Phys. Z. 10 (1909) 918; “Uniform rotation of rigid bodies and the theory of relativity,” translated by Wikisource.

[Einstein 20] A. Einstein, Relativity: The Special and General Theory (Methuen, London, 1920).

[Einstein 55] A. Einstein, The Meaning of Relativity (Princeton U. P., Princeton, 1955).

[Einstein 89] A. Einstein, “The speed of light and the statics of the gravitational field,” in The Collected Papers of Albert Einstein: The Swiss Years, Vol. 4 (Princeton U. P., Princeton, 1989), pp. 95–106; “On the theory of the static gravitational field,” ibid pp. 107–120.

[Fock 66] V. Fock, The Theory of Space, Time and Gravitation, 2nd ed. (Pergamon Press, Oxford, 1966).

[Fokker 65] A. D. Fokker, Time and Space, Weight and Inertia (Pergamon Press, Oxford, 1965), p. 139.

[Gamow 62] G. Gamow, Gravity (Anchor Books, New York, 1962).

[Gray 07] J. Gray, Worlds Out of Nothing (Springer, New York, 2007), p. 318.

[Grøn 04] Ø. Grøn, “Space geometry in rotating reference frames: A historical appraisal,” in [Rizzi & Ruggiero 04].

[Huygens 62] C. Huygens, Treatise on Light (Dover, New York, 1962).

[Landau & Lifshitz 75] L. D. Landau and E. M. Lifshitz, The Classical Theory of Fields (Pergamon Press, Oxford, 1975), p. 362.

[Langevin 35] P. Langevin, “Remarques au sujet de la Note de Prunier,” Comptes Rendus 200 (1935) 48–51.

[Lorentz 16] H. A. Lorentz, The Theory of Electrons, 2nd ed. (B. G. Teubner, Leipzig, 1916).

[Møller 52] C. Møller, The Theory of Relativity (Oxford U. P., Oxford, 1952).

[Needham 97] T. Needham, Visual Complex Analysis (Clarendon, Oxford, 1997).

[O’Neill 66] B. O’Neill, Elementary Differential Geometry (Academic Press, New York, 1966).

[Page & Adams 40] L. Page and N. I. Adams, Jr, Electrodynamics (Van Nostrand, New York, 1940).

[Rindler 77] W. Rindler, Essential Relativity (Springer-Verlag, New York, 1970), p. 208.

[Rizzi & Ruggiero 04] G. Rizzi and M. L. Ruggiero, Relativity in Rotating Frames (Kluwer, Dordrecht, 2004).

[Robb 11] A. A. Robb, Optical Geometry of Motion (W. Heffer and Sons, Cambridge, 1911).

[Robb 36] A. A. Robb, The Geometry of Time and Space (Cambridge U. P., Cambridge, 1936).

[Robertson 50] H. P. Robertson, “The geometries of the thermal and gravitational fields,” Am. Math. Monthly 57 (1950) 232–245.

[Sagnac 13] G. Sagnac, “Sur la preuve de la réalité de l’éther lumineux par l’expérience de l’interférographe tournant,” Comptes Rendus 157 (1913) 1410–1413.

[Schwarzschild 16] K. Schwarzschild, “Über das Gravitationsfeld einer Kugel aus inkompressibler Flüssigkeit,” Sitzber. Preuss. Akad. Wiss. (1916) 424–434 (presented at the meeting of 24 February 1916).

[Smart 60] W. M. Smart, Textbook on Spherical Astronomy, 4th ed. (Cambridge U. P., Cambridge, 1960).

[Sommerfeld 09] A. Sommerfeld, “Über die Zusammensetzung der Geschwindigkeiten in der Relativtheorie,” Physikalische Zeitschrift 10 (1909) 826–829.

[Sommerfeld 64] A. Sommerfeld, Optics (Academic Press, New York, 1964).

[Stachel 89] J. Stachel, “The rigidly rotating disc as the ‘missing link’ in the history of general relativity,” in eds. D. Howard and J. Stachel, Einstein and the History of General Relativity (Birkhäuser, Basel, 1989).

[Terrell 59] J. Terrell, “Invisibility of the Lorentz contraction,” Phys. Rev. 116 (1959) 1043.

[Weinstein 60] R. Weinstein, “Observation of length by a single observer,” Am. J. Phys. 28 (1960) 607.


Chapter 10

Aberration and Radiation Pressure in the Klein and Poincaré Models

The hyperbolic distance in the Klein model “differs from the formula in the Poincaré disc model by a mere factor of two!” [Needham 97]

10.1 Angular Defect and its Relation to Aberration and Thomas Precession

The angular defect concerns both aberration and parallax, although the two phenomena are quite distinct from each other. In fact, Bradley discovered aberration in 1728 while looking for parallax. Although both phenomena cause the locus of a star to trace out an ellipse, the direction and magnitude of the angular deviation in aberration are quite different from those caused by parallax. The crucial difference is that the magnitude of deviation caused by aberration is independent of the distance to the star, and is much greater than for parallax. In Sec. 9.9 we found the angle of parallax is greater than the defect, and, moreover, the angle of parallax is greater than the complementary angle of parallelism, which is a function of distance alone. In the Klein model, we will appreciate that the angle of parallelism is a limiting angle, while the angular defect is always present.

It has also been shown that the angular defect of a hyperbolic triangle is related to the upper bound on the Euclidean measure of relativistic velocities using the conformal Poincaré disc model [Criado & Alamo 01]. On the other hand, if the Klein model is used, which is not conformal, one would find Lorentz contraction in the direction normal to the motion, as we have seen in Sec. 9.7.

The angular defect in the hyperbolic triangle, which is proportional to the area, has also been implicated in the determination of the rotation of axes in successive Lorentz transformations in different planes [Sard 70]. It came as a curious surprise that the composition of successive Lorentz transformations, or ‘boosts’ as they are now referred to, is not simply another boost, but one that involves a rotation. In physics, the angle of rotation is known as Wigner’s angle, and is the kinematic factor underlying Thomas precession.

However, what we refer to as the ‘Thomas’ precession falls under Stigler’s law of eponymy because it was actually discovered by Émile Borel [13], a doctoral student of Poincaré. During his exploration of what he referred to as ‘kinematic’ space, Borel discovered that a system whose accelerations are rectilinear for observers in that frame will appear to be rotated with respect to inertial observers. Borel observed that a vector transported parallel to itself over a closed path on the surface of a sphere will be viewed as a change in orientation by an observer at the center of the sphere. The amount of change in orientation is proportional to the enclosed area for the inertial observer whose velocity is equal to the initial and final velocity of the accelerating system. Borel predicted that for a circular orbit of radius R and angular velocity ω, the precession of the orbit would be of the order of $\beta^2 := (\omega R/c)^2$, and that its rate would be $\omega\beta^2$. While he attributed this effect to be a direct consequence of the nature of Lorentz transformations, he failed to apply it to any known physical phenomenon, and, undoubtedly, this is why he lost out to Llewellyn Thomas, whose Christmas holiday calculation was done in 1925. Borel’s priority in the Thomas precession has recently been pointed out by Stachel [95].

If u and v are two velocities we know from Sec. 9.6 that the most general composition law is

$$w = \frac{\sqrt{(\mathbf{u} - \mathbf{v})^2 - (\mathbf{u}\times\mathbf{v})^2/c^2}}{1 - \mathbf{u}\cdot\mathbf{v}/c^2}. \tag{10.1.1}$$

The non-planar aspects of the composition law can be clearly seen in the second term of the numerator of (10.1.1). Expression (10.1.1) can also be derived by differentiating the Lorentz transformations at constant, relative velocity [Fock 66, pp. 46–47]. Then, introducing $\mathbf{v} = \mathbf{u} + d\mathbf{u}$ into (10.1.1), and dividing through by dt, the law of acceleration is obtained as

$$\dot{w} = \frac{\sqrt{\dot{\mathbf{u}}^2 - (\mathbf{u}\times\dot{\mathbf{u}})^2/c^2}}{1 - u^2/c^2}. \tag{10.1.2}$$

This decomposes the acceleration into longitudinal ($\mathbf{u} \parallel \dot{\mathbf{u}}$) and transverse ($\mathbf{u} \perp \dot{\mathbf{u}}$) components, analogous to the longitudinal and transverse masses. Taking the inner product of u with (5.4.39) yields

$$\mathbf{u}\cdot\dot{\mathbf{u}} = \frac{\mathbf{F}\cdot\mathbf{u}}{m_{el}}\left(1 - u^2/c^2\right)^{3/2}. \tag{10.1.3}$$

When they are parallel to each other, (10.1.3) gives the longitudinal mass, and when they are perpendicular $\mathbf{F}\cdot\mathbf{u} = 0$, and (5.4.39) gives the transverse mass.
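To see how the composition law (10.1.1) behaves, the following sketch evaluates the relative speed for collinear and for perpendicular velocities. The vector helpers and sample velocities are illustrative, and units with c = 1 are assumed by default.

```python
import math

def dot(a, b):
    """Euclidean inner product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def relative_speed(u, v, c=1.0):
    """Magnitude w of the relative velocity according to (10.1.1)."""
    d = tuple(x - y for x, y in zip(u, v))
    cr = cross(u, v)
    numerator = math.sqrt(dot(d, d) - dot(cr, cr) / c ** 2)
    return numerator / (1.0 - dot(u, v) / c ** 2)

# Collinear case: reduces to (u - v) / (1 - u v / c^2)
print(relative_speed((0.8, 0.0, 0.0), (0.5, 0.0, 0.0)))

# Perpendicular case: the cross-product term keeps w below c
print(relative_speed((0.6, 0.0, 0.0), (0.0, 0.6, 0.0)))
```

In the perpendicular case the result satisfies $w^2 = u^2 + v^2 - u^2v^2/c^2$, which is where the non-planar term in the numerator shows up.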

It is the second term in the numerator of (10.1.2) that is related to the Thomas precession: the rotation of the electron’s velocity vector,

$$d\vartheta = (\mathbf{u}\times\dot{\mathbf{u}})\,dt/u^2, \tag{10.1.4}$$

caused by the acceleration, $\dot{\mathbf{u}}$, in time, dt. Then, as the velocity turns by dϑ along the orbit, the spin projection turns in the opposite direction by an amount equal to the angular defect of the hyperbolic triangle whose vertices are the velocities in three different inertial frames in pure translation with respect to one another.

The defect caused by aberration can be readily calculated. Consider the triangle formed by three vertices $\mathbf{u}_1$, $\mathbf{u}_2$, and $\mathbf{u}_3$ in velocity space. By setting $\mathbf{u}_3 = \mathbf{n}c$, where n is the unit normal in the direction of the light source, we are considering an ideal, or ‘improper,’ triangle [Kulczycki 61], which shares many properties of ordinary triangles, but has the property that the sum of its angles is less than two right angles, its so-called defect. Consequently, there will be two parallel lines forming an ideal vertex $\mathbf{u}_3$ whose angle is zero so that $\cos\vartheta_3 = 1$.

The cosines of the angles are given by the inner products [Busemann & Kelly 53]

$$\cos\vartheta_i = \frac{(\mathbf{u}_k - \mathbf{u}_i)\cdot(\mathbf{u}_j - \mathbf{u}_i) - (\mathbf{u}_k\times\mathbf{u}_i)\cdot(\mathbf{u}_j\times\mathbf{u}_i)/c^2}{\Delta_{ik}\,\Delta_{ij}}, \tag{10.1.5}$$

where $\Delta_{ik} = \sqrt{(\mathbf{u}_k - \mathbf{u}_i)^2 - (\mathbf{u}_k\times\mathbf{u}_i)^2/c^2}$, and a similar expression for $\Delta_{ij}$. All three angles can be calculated by permuting cyclically the indices, and it is easy to see that $\vartheta_3 = 0$. By choosing a frame where the velocities are equal and opposite in direction, $\mathbf{u}_1 = -\mathbf{u}_2$, we are, in fact, considering a ‘two-way’ Doppler shift. The relative velocity is$^a$

$$\gamma = \frac{2\beta}{1 + \beta^2}, \tag{10.1.6}$$

where β = u/c, and $u = |\mathbf{u}_1| = |\mathbf{u}_2|$. The projection of the velocity onto the normal of the wavefront is

$$\mathbf{n}\cdot\mathbf{u}_1 = -\mathbf{n}\cdot\mathbf{u}_2 = u\cos\vartheta \ge 0. \tag{10.1.7}$$

The cosine law (10.1.5) for angles $\vartheta_1$ and $\vartheta_2$ can be written as

$$\cos\vartheta_i = \frac{u^2 - c\,\mathbf{n}\cdot\mathbf{u}_i}{u(c - \mathbf{n}\cdot\mathbf{u}_i)}, \qquad i = 1, 2. \tag{10.1.8}$$

On account of (10.1.7), the cosine of the first angle,

$$\cos\vartheta_1 = \frac{\beta - \cos\vartheta}{1 - \beta\cos\vartheta}, \tag{10.1.9}$$

represents the usual formula for aberration, except for the negative sign which implies reflection and guarantees that the angle of parallelism is acute. The second equation of aberration for the first angle is:

$$\frac{\sin\vartheta_1}{\lambda_1} = \frac{\sin\vartheta}{\lambda}. \tag{10.1.10}$$

The expression for the ratio of the wavelengths,

$$\frac{\lambda_1}{\lambda} = \frac{\sqrt{1 - \beta^2}}{1 - \beta\cos\vartheta}, \tag{10.1.11}$$

is Doppler’s principle. For the second angle we have

$$\cos\vartheta_2 = \frac{\beta + \cos\vartheta}{1 + \beta\cos\vartheta}, \tag{10.1.12}$$

again on account of (10.1.7), and, hence,

$$\sin\vartheta_2 = \frac{\sqrt{1 - \beta^2}}{1 + \beta\cos\vartheta}\sin\vartheta. \tag{10.1.13}$$

Finally, by (10.1.7), we find the relation,

$$\cos\vartheta_2 = \frac{\gamma - \cos\vartheta_1}{1 - \gamma\cos\vartheta_1}, \tag{10.1.14}$$

between the two cosines, where γ is the relative speed given by (10.1.6).
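The aberration relations (10.1.9)–(10.1.14) hang together algebraically, and a short numerical sketch makes this easy to confirm. The function names and the sample values of β and ϑ below are illustrative only.

```python
import math

def cos_theta1(beta, theta):
    """First aberration formula (10.1.9)."""
    return (beta - math.cos(theta)) / (1.0 - beta * math.cos(theta))

def cos_theta2(beta, theta):
    """Second aberration formula (10.1.12)."""
    return (beta + math.cos(theta)) / (1.0 + beta * math.cos(theta))

def wavelength_ratio(beta, theta):
    """Doppler ratio lambda_1 / lambda of (10.1.11)."""
    return math.sqrt(1.0 - beta * beta) / (1.0 - beta * math.cos(theta))

def relative_speed(beta):
    """Relative speed (10.1.6) of two frames moving with +beta and -beta."""
    return 2.0 * beta / (1.0 + beta * beta)

beta, theta = 0.6, 1.0  # illustrative values with cos(theta) >= 0, as in (10.1.7)
c1, c2 = cos_theta1(beta, theta), cos_theta2(beta, theta)
g = relative_speed(beta)

# (10.1.10): sin(theta_1) / sin(theta) equals lambda_1 / lambda
print(math.sqrt(1.0 - c1 * c1) / math.sin(theta), wavelength_ratio(beta, theta))

# (10.1.14): cos(theta_2) expressed through cos(theta_1) and the relative speed
print(c2, (g - c1) / (1.0 - g * c1))
```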

$^a$ It should be clear from the context when γ denotes the relative speed of the two systems, and when it denotes the Lorentz factor.


The aberration formula (10.1.10) has the identical form of the law of reflection for a moving mirror. For a stationary mirror, $\lambda_1 = \lambda$ and $\vartheta_1 = \vartheta$, where the angles are subtended by the incoming and outgoing rays, and the surface of the mirror. However, it must be borne in mind that the angles are at the vertices in velocity space so that an angle of ϑ = π/2 is parallel to the wavefront, or perpendicular to the motion.

The first, (10.1.9) and (10.1.10), and second, (10.1.12) and (10.1.13), pairs of aberration equations can be combined to read

$$\tan(\vartheta_1/2) = \left(\frac{1 - \beta}{1 + \beta}\right)^{1/2}\cot(\vartheta/2), \tag{10.1.15a}$$

$$\tan(\vartheta_2/2) = \left(\frac{1 - \beta}{1 + \beta}\right)^{1/2}\tan(\vartheta/2), \tag{10.1.15b}$$

respectively. Expression (10.1.15b) is the usual formula given for aberration, and (10.1.15a) is what we found in (8.3.17). By letting the third vertex be the speed of light, we have formed an ideal triangle. In hyperbolic geometry, a transversal which cuts the two parallel lines forms angles in the direction of parallelism such that the sum of the angles is less than two right angles.

Two important cases arise:

(i) when the vertices, $\mathbf{u}_1$ and $\mathbf{u}_2$, of the ideal triangle are on the same limiting curve, or horocycle, H, whose center is at infinity, Ω [shown in Fig. 10.1], and

(ii) when the transversal is perpendicular to one of the parallel lines [shown in Fig. 10.2].

Fig. 10.1. A segment H of a horocycle with center Ω at infinity with angles of parallelism Π.

Fig. 10.2. Angle of parallelism with transversal perpendicular to one of the parallel lines.

In the first case, ϑ = π/2, $\vartheta_1 = \vartheta_2 = \Pi$, where Π, the angle of parallelism, is given by

$$\tan(\Pi(u)/2) = \left(\frac{1 - \beta}{1 + \beta}\right)^{1/2} = e^{-u/c}. \tag{10.1.16}$$

Π is a function only of ‘distance’ u. From (10.1.9) and (10.1.12) we find $\cos\vartheta_1 = \cos\vartheta_2 = \beta$, which is the Euclidean measure of distance in velocity space.

In the second case, one of the angles is π/2, and the other is necessarily acute, being the angle of parallelism. In other words, lines with a common normal cannot be parallel so that Π must be acute. It is readily seen from (10.1.15b) that $\vartheta_2$ cannot become a right angle because that would imply $\cos^{-1}(-\beta) = \vartheta > \pi/2$, and so violate (10.1.7). Again β is the Euclidean measure of length, which is equal to the hyperbolic tangent of its hyperbolic measure [cf. Eq. (10.3.2) below]. Negative values are ruled out in hyperbolic geometry: “the hyperbolic tangent is a function that assumes all values between 0 and 1” [Kulczycki 61, p. 163]. In other words, the angle of parallelism must be an acute angle, for, otherwise, the lines would be divergent. The formation of an ideal triangle is related to the fact that c is the limiting speed. We will return to this point in Sec. 10.3.1.

Rather, if $\vartheta_1 = \pi/2$, and (10.1.15a) is introduced into (10.1.15b), we get

$$\tan(\Pi(2u)/2) = \frac{1 - \beta}{1 + \beta} = e^{-2u/c}, \tag{10.1.17}$$

where the angle of parallelism, Π, is a function of twice the ‘distance’ u. From (10.1.14) we find the new hyperbolic measure of distance as $\cos\vartheta_2 = \gamma$, which again is related to the hyperbolic tangent through (10.5.5) below. In contrast to (10.1.16), the hyperbolic measure has become twice as great in (10.1.17). This, as we shall see, is the same as performing a ‘two-way’ Doppler shift.

The defect, $\eta = \pi - \vartheta_1 - \vartheta_2 > 0$, is expressed in terms of the relative speed β and the angle ϑ subtended by the direction of the light source and the line of sight of the observer, i.e.

$$\tan(\eta/2) = \frac{\beta}{\sqrt{1 - \beta^2}}\sin\vartheta. \tag{10.1.18}$$
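The defect formula (10.1.18) follows from the half-angle forms (10.1.15a) and (10.1.15b), and the step can be checked numerically. The sketch below computes both angles from the half-angle relations, forms the defect, and compares it with the closed expression; the function names and sample values are illustrative.

```python
import math

def half_angle_factor(beta):
    """The factor ((1 - beta)/(1 + beta))^(1/2) common to (10.1.15a,b)."""
    return math.sqrt((1.0 - beta) / (1.0 + beta))

def theta1(beta, theta):
    """Angle from (10.1.15a): tan(theta_1/2) = factor * cot(theta/2)."""
    return 2.0 * math.atan(half_angle_factor(beta) / math.tan(theta / 2.0))

def theta2(beta, theta):
    """Angle from (10.1.15b): tan(theta_2/2) = factor * tan(theta/2)."""
    return 2.0 * math.atan(half_angle_factor(beta) * math.tan(theta / 2.0))

def defect(beta, theta):
    """Defect eta = pi - theta_1 - theta_2 of the ideal triangle."""
    return math.pi - theta1(beta, theta) - theta2(beta, theta)

def defect_closed(beta, theta):
    """Closed form from (10.1.18): tan(eta/2) = beta sin(theta) / sqrt(1 - beta^2)."""
    return 2.0 * math.atan(beta * math.sin(theta) / math.sqrt(1.0 - beta * beta))

beta, theta = 0.6, 1.0  # illustrative values
print(defect(beta, theta), defect_closed(beta, theta))

# Horocycle case theta = pi/2: both angles coincide and their cosine is beta
print(math.cos(theta1(beta, math.pi / 2.0)), math.cos(theta2(beta, math.pi / 2.0)))
```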

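In the Thomas precession, the velocity turns along the orbit by an amount ϑ, while the spin projection in the orbital plane turns in the opposite direction by the amount η = −dϕ, the hyperbolic defect, where dϕ is the change in the angle that the spin projection makes with the velocity vector in time dt. To calculate this change we consider a triangle with sides $\beta_1$, $\beta_2$, and $\beta_3$ and corresponding angles $\vartheta_1$, $\vartheta_2$, and $\vartheta_3$. According to Gauss’s equation [Greenberg 93]

$$\begin{aligned} \sin(\eta/2) &= \cos\tfrac{1}{2}(\vartheta_1 + \vartheta_2 + \vartheta_3)\\ &= \cos\tfrac{1}{2}(\vartheta_1 + \vartheta_2)\cos(\vartheta_3/2) - \sin\tfrac{1}{2}(\vartheta_1 + \vartheta_2)\sin(\vartheta_3/2)\\ &= \frac{\cosh\tfrac{1}{2}(\beta_1 + \beta_2) - \cosh\tfrac{1}{2}(\beta_1 - \beta_2)}{2\cosh\tfrac{1}{2}\beta_3}\sin\vartheta_3\\ &= \frac{\sinh\tfrac{1}{2}\beta_1\cdot\sinh\tfrac{1}{2}\beta_2}{\cosh\tfrac{1}{2}\beta_3}\sin\vartheta_3\\ &= \frac{\sqrt{\cosh\beta_1 - 1}\cdot\sqrt{\cosh\beta_2 - 1}}{\sqrt{2}\cdot\sqrt{\cosh\beta_3 + 1}}\sin\vartheta_3. \end{aligned} \tag{10.1.19}$$

Now letting $\beta_1 \to \beta_2$ and $\beta_3 \to 0$, with $\vartheta_3 = d\vartheta$, there results [Sard 70]

$$\sin(\eta/2) = \tfrac{1}{2}(\gamma - 1)\sin(d\vartheta), \tag{10.1.20}$$

where $\gamma = 1/\sqrt{1 - \beta^2}$ is the Lorentz factor. For an infinitesimal time interval,

$$-d\phi = \eta = (\gamma - 1)\,d\vartheta = \frac{\gamma - 1}{\beta^2}\,\frac{|\mathbf{u}\times\dot{\mathbf{u}}|}{c^2}\,dt,$$

where we have introduced (10.1.4). The angular velocity of the Thomas precession is thus given as

$$\omega_T = \dot{\phi} = -\frac{\gamma^2}{1 + \gamma}\,\frac{|\mathbf{u}\times\dot{\mathbf{u}}|}{c^2}. \tag{10.1.21}$$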

However, if the relative speed is that of the two systems β = γ, given in (10.1.6), (10.1.20) is replaced by

$$\sin(\eta/2) = \tfrac{1}{2}(\Gamma-1)\sin(d\vartheta) = (\gamma^2-1)\sin(d\vartheta),$$

with Γ = (1 + β²)/(1 − β²). The angular velocity of the Thomas precession would then be

$$\omega_T = -2\gamma^2\,\frac{|\mathbf{u}\times\dot{\mathbf{u}}|}{c^2}. \tag{10.1.22}$$

At low speeds, γ ≈ 1, and (10.1.22) would be four times as large as (10.1.21).

Comparison can be made with general relativity by associating the acceleration with that of Newtonian gravity [Schiff 60],

$$\dot{\mathbf{u}} = -\frac{GM}{r^3}\,\mathbf{r},$$

where if M is the mass of the earth, r would be the radial vector connecting the center of the earth to an orbiting satellite. The satellite would precess in the plane of the orbit at a rate,

$$\omega_T = n\,\frac{\alpha}{r}\,\frac{|\mathbf{u}\times\mathbf{r}|}{r^2},$$

where α = 2GM/c² is Schwarzschild’s radius. If the relative speed is β, then n = 1/4, while for the relative speed of γ, we have n = 1. Apart from predicting a precession frequency in the opposite direction, general relativity claims that n = 3/4 [Schiff 60].
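The numerical factors quoted above can be checked with a short sketch. It assumes only what the text states: the prefactors of |u × u̇|/c² in (10.1.21) and (10.1.22), the Newtonian acceleration, and α = 2GM/c². The function names are illustrative and not part of the original.

```python
import math

def lorentz_factor(beta):
    """Lorentz factor for a Euclidean speed beta = u/c, with 0 <= beta < 1."""
    return 1.0 / math.sqrt(1.0 - beta**2)

def thomas_prefactor_one_way(beta):
    """Coefficient of |u x du/dt| / c^2 in (10.1.21), written as (gamma - 1) / beta^2."""
    return (lorentz_factor(beta) - 1.0) / beta**2

def thomas_prefactor_two_way(beta):
    """Coefficient of |u x du/dt| / c^2 in (10.1.22), i.e. 2 * gamma^2."""
    return 2.0 * lorentz_factor(beta)**2

def schiff_n(prefactor):
    """Convert a prefactor k in omega = k |u x du/dt| / c^2 into the n of
    omega = n (alpha / r) |u x r| / r^2.

    With du/dt = -G M r / r^3 and alpha = 2 G M / c^2, one has
    k G M / c^2 = (k / 2) alpha, so n = k / 2.
    """
    return prefactor / 2.0
```

At low speeds the one-way prefactor tends to 1/2 and the two-way prefactor to 2, which is where n = 1/4 and n = 1 come from.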

We can therefore conclude that anytime a component of the acceleration exists normal to the velocity, “for whatever reason, then there is a Thomas precession, independent of other effects” [Jackson 75], including relativistic ones. This kinematic effect is amplified by the compounding of Doppler shifts, and this will be a recurrent theme throughout this chapter.

The fundamental connection between hyperbolic geometry and optical phenomena in general, and relativity in particular, is that compounding longitudinal Doppler shifts gives the cross-ratio, whose logarithm is the hyperbolic distance. As we know from Sec. 2.2.4, the cross-ratio is a projective invariant of four points. This is the smallest number of points that is invariant, since three points on a line may be projected to any other three. For consider two relative velocities, β1 and β2. Compounding their longitudinal Doppler shifts gives

$$\left(\frac{1+\beta_1}{1-\beta_1}\right)^{1/2}\left(\frac{1-\beta_2}{1+\beta_2}\right)^{1/2} = \left(\frac{1+(\beta_1-\beta_2)/(1-\beta_1\beta_2)}{1-(\beta_1-\beta_2)/(1-\beta_1\beta_2)}\right)^{1/2} = \{\beta_1, \beta_2\,|\,{-1}, 1\}^{1/2}, \tag{10.1.23}$$

whose logarithm is precisely the hyperbolic distance with an absolute constant of unity. If we had considered a velocity addition law rather than subtraction law, one of the velocities in the cross-ratio would be negative.
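As a quick numerical illustration of (10.1.23), the sketch below compounds two longitudinal Doppler factors and compares the result with the Doppler factor of the relativistically subtracted velocity and with the square root of the cross-ratio. The function names are hypothetical, and the cross-ratio is taken in the explicit form read off from the left-hand side of (10.1.23).

```python
import math

def doppler_factor(beta):
    """Longitudinal Doppler factor sqrt((1 + beta) / (1 - beta))."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))

def relative_velocity(beta1, beta2):
    """Relativistic subtraction law (beta1 - beta2) / (1 - beta1 * beta2)."""
    return (beta1 - beta2) / (1.0 - beta1 * beta2)

def cross_ratio(beta1, beta2):
    """Cross-ratio {beta1, beta2 | -1, 1} in the explicit form used in (10.1.23)."""
    return ((1.0 + beta1) / (1.0 - beta1)) * ((1.0 - beta2) / (1.0 + beta2))

def hyperbolic_distance(beta1, beta2):
    """Logarithm of the square root of the cross-ratio (absolute constant of unity)."""
    return 0.5 * math.log(cross_ratio(beta1, beta2))
```

The hyperbolic distance obtained this way is the difference of the inverse hyperbolic tangents of the two velocities, which is why it adds where the velocities themselves do not.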

We know from Sec. 2.4 that the projective or Klein disc is not conformal, except at the origin of the hyperbolic plane, while the Poincaré disc is. The disc models also differ in how hyperbolic distance is measured: the hyperbolic distance is twice as great in the Poincaré disc as it is in the Klein disc. The factor of two is not a mere numerical factor, since it is indicative of reflections and the way velocities are compounded and distances measured. Moreover, it will change the dependencies of energy, momentum, and consequently, mass, on the relative speed.

Another possibility of vindicating hyperbolic geometry consists in the distinction between aberration and the pressure of radiation against a moving mirror. Early in the development of the ‘special’ theory, the Lorentz transform and its inverse were used to determine the pressure of radiation on a moving mirror [Abraham 04, Einstein 98]. It is still common to use aberration to determine the radiation pressure, even though Einstein calculated the difference between the energy density after reflection from the mirror and the initial energy density in order to determine the radiation pressure. A ‘two-way’ Doppler shift is involved, and not a one-way Doppler shift [Terrell 61]. This we will show to be the same distinction as that between the Klein and the Poincaré models of hyperbolic geometry. Moreover, it will turn out that the second-order Doppler effect predicted by the two-way Doppler shift is an experimental test for the angle of parallelism [cf. Sec. 10.6 below].

10.2 From the Klein to the Poincaré Model

The relativistic velocity addition law for two systems moving at equal and opposite speeds, (10.1.6), is the isomorphism from the Klein model of hyperbolic geometry onto the Poincaré disc and upper half-plane models. In the Poincaré disc model, points of the hyperbolic plane are represented by points interior to a fixed Euclidean circle. Lines not passing through the center of the circle are represented by open arcs of circles which cut the fixed circle orthogonally at P and Q in Fig. 2.27. Points lying on the real axis in the half-plane model, called ideal points or points at infinity, become points on the unit circle in the Poincaré disc model, whose locus represents a circle at infinity, or a ‘horizon’.

Not only did Beltrami discover the Poincaré disc model, some fourteen years before Poincaré rediscovered it, he also constructed the Klein, or projective model, by projecting a hemisphere vertically downwards onto the complex plane. We have discussed the Poincaré disc model in Sec. 2.5, where we placed a sphere whose south pole is centered at the origin of the disc in Fig. 2.26, and having the same radius as the disc. We can also place the Beltrami disc on the equator of the sphere as shown in Fig. 10.3. A chord on the disc, PQ, is projected vertically downwards into the southern hemisphere. This chord becomes a semicircular arc dangling vertically downward from the equator. A stereographic projection from the north pole N transforms the semicircular arc into an arc of a circle that cuts the disc normally or a straight line through the center of the equator. Stereographic projection is conformal so that the hanging semicircular arc will produce a circular arc that cuts the equator at right angles, and it projects circles onto circles or straight lines. Thus, what was a non-Euclidean geodesic straight line in the Beltrami–Klein model has become a circular arc in the Poincaré model.

Although the projection of a small circle on the hemisphere becomes an ellipse on the disc, so that the Klein model is not conformal, the redeeming feature of the model is that the vertical sections of the hemisphere are projected into Euclidean straight lines as shown in Fig. 10.4. In other words, the hyperbolic lines of the Klein model are Euclidean chords of the unit circle.

Fig. 10.3. Poincaré’s projections of the Beltrami model vertically into the southern hemisphere and stereographically back onto the equator.

Fig. 10.4. Klein model where vertical sections of the hemisphere are projected into straight lines. Geodesics retain their straightness at the cost of not being conformal.

We discussed the cross-ratio in Sec. 2.2.4. Here, we motivate its logarithm as a measure of hyperbolic distance. The original idea was Cayley’s, in which he started with projective geometry and then introduced the notion of Euclidean distance. But it was Klein who realized the potency and generality of the idea. Let A, B, and C be ordinary points inside the circle, and let P and Q be the ends of the chord through A, B, and C. Recall from Sec. 2.2.4 that the cross-ratio of the four points P, A, B, Q is

$$\{A, B|P, Q\} = \frac{e(AP)}{e(AQ)}\cdot\frac{e(BQ)}{e(BP)}.$$

Likewise, the cross-ratio of the four points P, B, C, Q is

$$\{B, C|P, Q\} = \frac{e(BP)}{e(BQ)}\cdot\frac{e(CQ)}{e(CP)}.$$

Their product,

$$\{A, B|P, Q\}\cdot\{B, C|P, Q\} = \frac{e(AP)}{e(AQ)}\cdot\frac{e(CQ)}{e(CP)} = \{A, C|P, Q\},$$

has eliminated the intermediate point B [cf. (2.2.16)].

This motivates Klein’s definition of the length of the segment AC as

$$h(AC) = \tfrac{1}{2}\,|\ln\{A, C|P, Q\}|, \tag{10.2.1}$$


since the distances add,

h(AB) + h(BC) = h(AC).

But we also know from (2.2.17) that if we shorten the interval we lengthen the distance between two intermediary points so that (10.2.1) satisfies the triangle inequality. If P = 1, Q = −1, A = 0, and B = b, then Klein’s distance is

$$h(AB) = \tfrac{1}{2}\ln\left(\frac{1}{1-b}\cdot\frac{1+b}{1}\right) = \tanh^{-1} b.$$

So a point at Euclidean distance b from the origin lies at the hyperbolic distance b̄ = tanh⁻¹ b. As b varies from 0 to 1, b̄ varies from 0 to ∞.

Poincaré, on the other hand, determines the distance of the arc from A to B as twice Klein’s distance, viz.

$$h'(AB) = |\ln\{A, B|P, Q\}|. \tag{10.2.2}$$

Again, let the ends of the chord at P and Q be 1 and −1. If A and B have coordinates x and y then the cross-ratio is

$$\{A, B|P, Q\} = \frac{1-x}{1+x}\cdot\frac{1+y}{1-y}.$$

If A′ = γ(x) and B′ = γ(y), it follows that h(A′B′) = h′(AB), since

$$\frac{1-\gamma(x)}{1+\gamma(x)} = \left(\frac{1-x}{1+x}\right)^2.$$

Hence, γ, given by (10.1.6), is an isomorphism that makes the lengths of the Klein and Poincaré models coincide.
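The doubling of the distance can be illustrated numerically. The sketch below assumes that (10.1.6), which is not reproduced in this section, has the form γ(x) = 2x/(1 + x²); this is the composition of equal and opposite speeds, and it is the map for which the squared-ratio identity above holds. The function names are hypothetical.

```python
import math

def klein_distance(x, y):
    """Klein length (10.2.1) between coordinates x and y on the chord with ends P = 1, Q = -1."""
    cross_ratio = ((1.0 - x) / (1.0 + x)) * ((1.0 + y) / (1.0 - y))
    return 0.5 * abs(math.log(cross_ratio))

def poincare_distance(x, y):
    """Poincare length (10.2.2): the same logarithm without the factor 1/2."""
    return 2.0 * klein_distance(x, y)

def gamma_map(x):
    """Assumed form of (10.1.6): equal and opposite speeds x compose to 2x / (1 + x^2)."""
    return 2.0 * x / (1.0 + x * x)
```

Measuring the images γ(x) and γ(y) with the Klein length reproduces the Poincaré length of x and y, which is the sense in which γ makes the two models agree.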

10.3 Aberration versus Radiation Pressure on a Moving Mirror

10.3.1 Aberration and the angle of parallelism

Having derived the formulas for aberration in Sec. 8.6 and Sec. 10.1, we now consider, in greater detail, the limiting forms (10.1.16) and (10.1.17) which are the Bolyai–Lobachevsky formulas for the angle of parallelism. Although there has been no mention of hyperbolic geometry, this situation has been widely discussed in the literature [Terrell 59, Weisskopf 60], and without any mention of an angle of parallelism.


For ϑ1 = π/2 in (10.1.9), the observer in a frame in which the object is at rest will see the object rotated by an amount sin ϑ = √(1 − β²), just equal to the FitzGerald–Lorentz contraction. The angle of parallelism, ϑ = cos⁻¹ β, provides the link between circular and hyperbolic functions. Only at the angle of parallelism can a rotation be equated with a FitzGerald–Lorentz contraction.

Terrell [59] also considers the opposite case where ϑ2 = π/2 in (10.1.14) and ϑ = cos⁻¹(−β). He concludes that to the stationary observer, the object appears “to be rotating about its line of motion in such a way as to appear broadside at ϑ = cos⁻¹(−β), and to present a view of its rear end from that time on.” However, the stationary observer will not see any motion of this sort performed by the moving object because the angle of parallelism, linking circular and (positive) hyperbolic functions, must be acute; otherwise, the hyperbolic measure of distance would turn out to be negative!

Terrell’s [59] analysis cannot therefore be extended to angles of parallelism greater than π/2, for such angles do not exist. In other words,

the observer must make his observation of the object in the same inertial frame of the object, and the condition ϑ1 = π/2 makes ϑ an angle of parallelism via the equation of aberration, (10.1.9).

We recall from Sec. 1.1 that it was the Serbian mathematician, Varicak [11], who dared to question the reality of the Lorentz contraction, and provoked Einstein’s [11] summary response:ᵇ

The question of whether the Lorentz contraction is real or not is misleading. It is not ‘real’ insofar as it does not exist for an observer moving with the object.

We will analyze the angle of parallelism further in terms of the projective disc model, showing that it leads to Lorentz contraction in a direction normal to the motion [cf. Eq. (10.5.2) below]. This will add further support to a contraction normal to the motion that we found using the angle defect in Sec. 9.7. In the next section, we will relate it to the vanishing of the radiation pressure on a moving mirror.

ᵇ It is ironic that Varicak’s works [10,11,12] on hyperbolic geometry went almost completely unnoticed, yet his small note on whether the Lorentz contraction was real or not caused a great deal of commotion and confusion [Miller 81]. Einstein delegated Ehrenfest to answer Varicak, but, then, realizing that it might rock the boat of relativity, decided to answer himself.


The angle of parallelism in (10.1.16) is a sole function of the ‘distance’ β̄. The latter is the hyperbolic measure of distance in velocity space,

$$\bar{\beta} = \tfrac{1}{2}\ln\left(\frac{1+\beta}{1}\cdot\frac{1}{1-\beta}\right) = \tanh^{-1}\beta, \tag{10.3.1}$$

whose Euclidean measure is β.

More precisely, (10.3.1) is the Klein length of the velocity segment. On the basis of (10.3.1), we get the basic relation for the measure of a straight line segment in Lobachevsky space,

$$\beta = \tanh\bar{\beta} = \cos\vartheta(\bar{\beta}), \tag{10.3.2}$$

and

$$\cosh\bar{\beta} = \frac{1}{\sqrt{1-\beta^2}}, \qquad \sinh\bar{\beta} = \frac{\beta}{\sqrt{1-\beta^2}}. \tag{10.3.3}$$

Whereas the first equality in (10.3.2) and those in (10.3.3) hold for all one-way Doppler shifts, the second equality in (10.3.2) is valid only at the angle of parallelism, where ϑ(β̄) is a function only of β̄.
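A small numerical check may help fix these relations. The sketch evaluates (10.3.1) to (10.3.3) for a sample speed and confirms that the angle defined by cos ϑ = β satisfies tan(ϑ/2) = exp(−β̄), the usual Bolyai–Lobachevsky form of the angle of parallelism. The function names are hypothetical; only the formulas come from the text.

```python
import math

def hyperbolic_measure(beta):
    """Klein length (10.3.1) of a velocity segment with Euclidean measure beta."""
    return 0.5 * math.log((1.0 + beta) / (1.0 - beta))

def angle_of_parallelism(beta):
    """Angle defined by the second equality in (10.3.2), cos(theta) = beta."""
    return math.acos(beta)
```

For β = 0.6 the hyperbolic measure is ln 2, so cosh β̄ = 1.25 and sinh β̄ = 0.75, and the angle of parallelism is acute for every positive β.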

10.3.2 Reflection from a moving mirror

If ϑ is the angle that a ray makes with the surface of a mirror, and ϑ′′ the angle of the reflected ray with respect to the surface of the mirror, then the law of reflection states that ϑ = ϑ′′. This changes when the mirror is in motion. As we know from Sec. 3.5.1, radiation pressure has a long history since Maxwell first predicted it. It also constituted one of the early testing grounds of relativity.

If the mirror is receding from the radiating source, the ratio of the wavelengths of impinging and reflected radiation is

$$\frac{\lambda}{\lambda''} = \frac{\cos\vartheta+\beta}{\cos\vartheta''-\beta}, \tag{10.3.4}$$

because the wavelength is lengthened in the forward direction and shortened in the backward direction. The angle of reflection is referred to the frame in which the source is at rest,

$$\cos\vartheta'' = \frac{\cos\vartheta+\gamma}{1+\gamma\cos\vartheta}, \tag{10.3.5}$$

where γ, given by (10.1.6), is the isomorphism from the Klein to the Poincaré models. It involves a two-step process for carrying a point β in the Poincaré disc to the corresponding point γ in the Klein model.


Introducing (10.3.5) into the ratio (10.3.4), where ϑ1 = ϑ′′ and λ1 = λ′′, leads to

$$\frac{\lambda}{\lambda''} = \left(\frac{1+\beta^2}{1-\beta^2}\right)(1+\gamma\cos\vartheta), \tag{10.3.6}$$

showing clearly that the wavelength of the reflected radiation, λ′′, has been shortened with respect to the wavelength of the incoming radiation, λ. In fact, expression (10.3.6) is Doppler’s principle, (10.1.11), obtained by replacing the relative velocity β by −γ.

Introducing (10.3.6) into the aberration equation (10.1.10), which just happens to have the same form as the law of reflection from a moving mirror, results in

$$\tan(\vartheta''/2) = \frac{\sin\vartheta''}{1+\cos\vartheta''} = \left(\frac{1-\beta}{1+\beta}\right)\tan(\vartheta/2). \tag{10.3.7}$$

The ratio of the tangents is the square of that for aberration, (10.1.15b)!
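The step from (10.3.4) and (10.3.5) to (10.3.6) and (10.3.7) can be checked numerically. The sketch below assumes that γ of (10.1.6), which is not reproduced here, equals 2β/(1 + β²); everything else is taken from the equations above. The function names are hypothetical.

```python
import math

def two_way_speed(beta):
    """Assumed form of (10.1.6): gamma = 2 beta / (1 + beta^2)."""
    return 2.0 * beta / (1.0 + beta**2)

def reflected_cosine(cos_theta, beta):
    """Cosine of the reflection angle from (10.3.5)."""
    g = two_way_speed(beta)
    return (cos_theta + g) / (1.0 + g * cos_theta)

def wavelength_ratio(cos_theta, beta):
    """lambda / lambda'' from (10.3.4), using the reflected angle of (10.3.5)."""
    return (cos_theta + beta) / (reflected_cosine(cos_theta, beta) - beta)
```

With this γ, the half-angle tangent scales by (1 − β)/(1 + β), the square of the one-way factor, because (1 − γ)/(1 + γ) = ((1 − β)/(1 + β))².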

10.4 Electromagnetic Radiation Pressure

Let us briefly summarize our results in Secs. 3.5.1, 6.6 and 9.8. Maxwell showed that the pressure exerted on a square centimeter by a beam of light is numerically equal to the energy in a cubic centimeter of the beam. Consider a plane wave of monochromatic light traveling in the x-direction. The Lorentz transformations of the relevant components of the electric, E, and magnetic, H, fields are

$$E'_x = E_x, \qquad E'_y = \gamma\,(E_y - \beta H_z), \qquad H'_z = \gamma\,(H_z - \beta E_y),$$

where for a plane wave propagating in the x-direction, E_y = H_z. The radiation pressure, P′, in the frame moving at velocity, u, is related to the pressure in the stationary frame, P, according to (3.5.1)ᶜ

$$P' = \frac{1}{2\pi}E'^{\,2}_y = P\left(\frac{1-\beta}{1+\beta}\right), \tag{10.4.1}$$

where P = (1/2π)E_y² is Maxwell’s prescription of associating the pressure acting on a square centimeter of surface with the energy density in a cubic centimeter of the beam.

ᶜ This expression also appeared in Abraham’s [23] work, but we have not been able to establish a priority claim with respect to the analysis of Poynting which we discussed in Sec. 3.5.1.

Let us remind ourselves that the relativistic Doppler shift in the frequency ν′, from its stationary value, ν,

$$K := \frac{\nu'}{\nu} = \gamma\,(1-\beta\cos\vartheta), \tag{10.4.2}$$

combines the ordinary Doppler shift with the relativistic time dilatation factor. Of course, (10.4.2) can be derived from the Lorentz transformation; it can also be derived, however, in more general terms from the relative velocity, w, of the corresponding segment s of the Lobachevsky straight line (10.1.1), where w = c tanh s. Expression (10.1.1) spans the entire gamut: from a single velocity, β = c tanh (u/c) [the first equality in Eq. (10.3.2)], to equal and opposite velocities, γ = c tanh γ̄ [Eq. (10.4.12) below].

If the energy increases with speed w asᵈ

$$E' = \frac{E_0}{\sqrt{1-w^2/c^2}},$$

then

$$E' = E\,\frac{1-\mathbf{u}\cdot\mathbf{v}/c^2}{\sqrt{1-u^2/c^2}},$$

where E/E0 = 1/√(1 − v²/c²). For v = c cos ϑ we get

$$E' = E\,\frac{1-\beta\cos\vartheta}{\sqrt{1-\beta^2}}. \tag{10.4.3}$$

ᵈ As we discussed in Sec. 5.4.4, Abraham’s [04] model was an early contender to take into account the electron’s energy dependency upon speed, in which he took Searle’s [97] expression for the total energy of a spherical body of radius r with a uniform distribution of charge, e, in motion with a uniform speed w,

$$E = \frac{e^2}{2r}\left\{\frac{c}{w}\ln\left(\frac{1+w/c}{1-w/c}\right) - 1\right\},$$

for the energy of an electron. It is commonly believed that Abraham’s model distinguishes itself insofar as the electron remains rigid both in the state of rest and in the state of relative motion. If this were true, its energy would not be a function of the relative velocity. Abraham, as we have seen, obtains this dependency on invoking a dilatation of the semimajor axis that depends on the relative velocity through the Lorentz factor.

His expression for the energy shows that the energy is proportional to the difference in the hyperbolic and Euclidean measures of the speed, (w̄ − w), where Poincaré’s hyperbolic measure is given by the logarithm of the cross-ratio,

$$\bar{w} = c\,\ln\left(\frac{1+w/c}{1-w/c}\right).$$

It demonstrates that the body’s energy, and hence its mass, increases as a result of the motion, and shows that such a dependency is tied to the deviation from Euclidean geometry.

The energy, (10.4.3), and amplitude [cf. Eq. (10.4.9) below], transform in the same way as the frequency, (10.4.2). This was stressed by Einstein [98] as being of particular relevance since, according to him, Wien’s distribution is related to it. Because the volume transforms as the inverse of the frequency, the energy density, ε, will transform as the square of the frequency,

$$\varepsilon' = K^2\varepsilon. \tag{10.4.4}$$

But this is none other than what Poynting claimed in Sec. 3.5!

Observing the motion in the line of sight, (10.4.4) reduces to Poynting’s expression (10.4.1) for the energy densities [cf. first equation in Sec. 3.5.1]. In the general case, the radiation falls obliquely on the mirror, making an angle ϑ with the normal, as in Fig. 10.5. The energy that falls on a unit area normal to the rays (CB in Fig. 10.5) is spread over an area of magnitude 1/cos ϑ′ on the surface AB. In addition, the component of the momentum is reduced by a factor of cos ϑ′ relative to what it would be if it were directed normal to the surface. Consequently, the momentum per unit area is decreased by a factor of cos² ϑ′, and the energy density must be multiplied by this factor when calculating the pressure.

We thus obtain

$$P' = 2\varepsilon'\cos^2\vartheta' = 2\varepsilon\,\frac{(\cos\vartheta-\beta)^2}{1-\beta^2}, \tag{10.4.5}$$

for the radiation pressure, where the pre-factor 2 comes from the fact that, upon reflection, the mirror receives Poynting’s ‘double dose’ of momentum, and the pressure is, consequently, doubled.

Fig. 10.5. Radiation falling obliquely on a mirror of length AB.


Whereas the derivation of the radiation pressure on a moving mirror based on aberration is conceptually incoherent [Terrell 61], Einstein’s [98] original derivation is coherent. From his two-way Doppler shift, and his requirement to calculate the reflected energy in the same frame as the incident energy, he could have deduced many of the results presented here, together with the realization of the intimate relationship between special relativity and hyperbolic geometry that, as we have seen in Chapter 9, applies to relativity in general. We shall now show that whereas (10.1.15b) is related to one-way aberration, its square, (10.3.7), relates to the change in wavelength on reflection from a moving mirror.

Although Einstein [98] gets the same result as (10.4.5), he does so by using energy conservation and by transforming to the mirror’s moving frame, reflecting, and transforming back to the stationary frame. The first step would have yielded half the pressure, as shown below, but is more enlightening than the method used above, since it brings out the fact that it is a second-order relativistic effect.

Einstein obtains the frequency shift,ᵉ

$$\nu'' = \nu\left(\frac{1+\beta^2-2\beta\cos\vartheta}{1-\beta^2}\right), \tag{10.4.6}$$

upon reflection. In addition, he gives the law of the transformation of the cosine of the angle,

$$\cos\vartheta'' = \frac{(1+\beta^2)\cos\vartheta-2\beta}{1+\beta^2-2\beta\cos\vartheta}, \tag{10.4.7}$$

which is not the aberration formula (10.1.5), but, rather, (10.3.5) with β → −β. Had Einstein used the above procedure to calculate the radiation pressure, he would have obtained

$$P' = 2\varepsilon\left(\frac{1+\beta^2-2\beta\cos\vartheta}{1-\beta^2}\right)^2\left(\frac{(1+\beta^2)\cos\vartheta-2\beta}{1+\beta^2-2\beta\cos\vartheta}\right)^2 = 2\varepsilon\left(\frac{1+\beta^2}{1-\beta^2}\right)^2(\cos\vartheta-\gamma)^2, \tag{10.4.8}$$

which is certainly not (10.4.5). This is the radiation pressure that a mirror feels when it moves at constant relative speed γ.

ᵉ Einstein later corrects the denominator to read as in expression (10.4.6).

Pauli [58] uses the fact that the amplitudes, A′ and A, transform as the frequencies, i.e.

$$A' = A\,\frac{1-\beta\cos\vartheta}{\sqrt{1-\beta^2}}, \tag{10.4.9}$$

to claim that the radiation pressure,

$$P = 2A^2\,\frac{(\cos\vartheta-\beta)^2}{1-\beta^2} = 2A'^2\cos^2\vartheta' = P', \tag{10.4.10}$$

is invariant. This is not, however, what one would conclude from (10.4.1). Recall from Sec. 6.4 that the invariance of the pressure was first established by Planck [08] by studying how thermodynamic densities transform under the Lorentz transformation. Since 1 ≥ cos ϑ ≥ β, we average (10.4.10) over the solid angle with the given limits and get

$$P'_{\rm tot}(\beta) = \frac{1}{4\pi}\int_0^{\cos^{-1}\beta} P'\cdot 2\pi\sin\vartheta\,d\vartheta = \frac{\varepsilon}{1-\beta^2}\int_0^{1-\beta}x^2\,dx = \frac{\varepsilon}{3}\,\frac{(1-\beta)^2}{1+\beta}. \tag{10.4.11}$$

This result differs from Terrell [61] in the limits of integration, and it is also at variance with Rindler and Sciama [61], and Schlegel [60]. The total radiation pressure, (10.4.11), tends to its classical value of ε/3 in the limit as β → 0, and vanishes in the limit as β → 1. The former is the blackbody radiation limit, and the latter is completely comprehensible since light waves cannot exert a pressure on an object which is traveling at the same speed. Whereas Terrell finds the same classical limit for the radiation pressure, he concludes that “it becomes infinite for β = 1,” which is (10.4.11) under β → −β, i.e. the mirror is approaching the radiation source. Curiously, the arithmetic average of forward and backward pressures,

$$\tfrac{1}{2}\left[P_{\rm tot}(\beta)+P_{\rm tot}(-\beta)\right] = \tfrac{1}{3}\,\varepsilon\,\frac{1+3\beta^2}{1-\beta^2},$$

is precisely what von Laue [19] finds for the xx-component of the stress tensor in an inertial frame moving in the x-direction.
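The angular average leading to (10.4.11), and the forward and backward average quoted above, can be checked with a simple quadrature. The sketch assumes the solid-angle normalization 1/4π and the limits 0 to cos⁻¹β described in the text, builds the pressure (10.4.5) from the Doppler factor (10.4.2) and the aberrated cosine, and uses a plain midpoint rule. The function names and the step count are illustrative choices.

```python
import math

def pressure(cos_theta, beta, eps=1.0):
    """Radiation pressure (10.4.5), built as 2 * eps' * cos^2(theta')."""
    doppler = (1.0 - beta * cos_theta) / math.sqrt(1.0 - beta**2)  # K of (10.4.2)
    cos_prime = (cos_theta - beta) / (1.0 - beta * cos_theta)      # aberrated cosine
    return 2.0 * eps * doppler**2 * cos_prime**2

def total_pressure_numeric(beta, eps=1.0, steps=20000):
    """Midpoint-rule estimate of (1/4pi) * integral of P' * 2pi sin(theta) over 0..arccos(beta)."""
    upper = math.acos(beta)
    h = upper / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        total += pressure(math.cos(theta), beta, eps) * 0.5 * math.sin(theta) * h
    return total

def total_pressure_closed(beta, eps=1.0):
    """Closed form (10.4.11)."""
    return eps * (1.0 - beta)**2 / (3.0 * (1.0 + beta))
```

Evaluating the closed form at β and −β and averaging reproduces the expression (ε/3)(1 + 3β²)/(1 − β²).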


The process of reflection changes the Euclidean measure of the relative speed, β, into

$$\gamma = \tanh\bar{\gamma}, \tag{10.4.12}$$

which is now the corresponding segment of the Lobachevsky straight line in velocity space. When (10.4.12) approaches cos ϑ, the angle of parallelism is reached and the radiation pressure will vanish [cf. Eq. (10.3.2)]. The relations between first- and second-order relativistic effects are

$$\cosh 2(u/c) = \cosh^2(u/c)+\sinh^2(u/c) = \cosh\bar{\gamma} = \frac{1+\beta^2}{1-\beta^2} = \Gamma, \tag{10.4.13}$$

$$\sinh 2(u/c) = 2\sinh(u/c)\cosh(u/c) = \sinh\bar{\gamma} = \frac{2\beta}{1-\beta^2}, \tag{10.4.14}$$

which were originally derived by Varicak [10] way back in 1910!

With β = tanh(γ̄/2) as the relative speed, and cos ϑ = tanh δ̄, the conservation of energy demands

$$P'\beta = \varepsilon(\cos\vartheta-\beta) - \varepsilon'(\cos\vartheta'+\beta), \tag{10.4.15}$$

which is given explicitly by

$$2\varepsilon\,\frac{\sinh^2(\bar{\delta}-\bar{\gamma}/2)}{\cosh^2\bar{\delta}}\,\tanh(\bar{\gamma}/2) = \varepsilon\left\{\tanh\bar{\delta} - \tanh(\bar{\gamma}/2) - \frac{\cosh^2(\bar{\delta}-\bar{\gamma})}{\cosh^2\bar{\delta}}\left[\tanh(\bar{\delta}-\bar{\gamma}) + \tanh(\bar{\gamma}/2)\right]\right\}.$$

Equation (10.4.15) expresses the fact that the difference in energy is equal to the work, P′β. In the limit as δ̄ → ∞, we get the line of sight relation,

$$P' = 2\varepsilon e^{-\bar{\gamma}} = 2\varepsilon\left(\frac{1-\beta}{1+\beta}\right),$$

which is (10.4.1).

When radiation impinges on a forward moving mirror, the wavelength of incident radiation is shortened by the amount proportional to 1 − β, while the reflected radiation is elongated by an amount proportional to 1 + β.ᶠ In fact, (10.4.15) is the negative of what Pauli [58] considers as Einstein’s expression for the radiation pressure. P′β is the work that is required to move the mirror backwards.
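The explicit hyperbolic form of the energy balance can be tested numerically. The sketch below assumes that the reflected energy density is ε cosh²(δ − γ)/cosh²δ and that the cosine of the reflected ray is tanh(δ − γ), the value for which the two sides agree; here δ and γ denote the hyperbolic arguments, with β = tanh(γ/2) and cos ϑ = tanh δ. The function names are hypothetical.

```python
import math

def work_two_way(delta, gamma_bar, eps=1.0):
    """Left-hand side: 2 eps sinh^2(delta - gamma/2) / cosh^2(delta) * tanh(gamma/2)."""
    half = 0.5 * gamma_bar
    return 2.0 * eps * math.sinh(delta - half)**2 / math.cosh(delta)**2 * math.tanh(half)

def energy_difference_two_way(delta, gamma_bar, eps=1.0):
    """Right-hand side: incident minus reflected energy flux in hyperbolic form."""
    half = 0.5 * gamma_bar
    reflected_density = math.cosh(delta - gamma_bar)**2 / math.cosh(delta)**2
    reflected_cosine = math.tanh(delta - gamma_bar)  # assumed two-way reflected cosine
    return eps * (math.tanh(delta) - math.tanh(half)
                  - reflected_density * (reflected_cosine + math.tanh(half)))
```

For large δ the work divided by β approaches 2ε(1 − β)/(1 + β), the line of sight relation.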

Poynting [10] very vividly describes pressure absorption as the ceasing of wave motion at a black surface, where the waves deliver up all their momentum. Since the waves

press against [the black surface] as much as they pressed against [the source] in being emitted. . . the pressure against [the black surface] is therefore equal to the energy density per cubic centimeter in the beam.

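If the source is moving forward at constant relative speed β the work, P′β, is determined by the one-way Doppler shift; that is, the difference between the incident energy density per unit area per unit time, ε(cos ϑ − β), and the energy absorbed by the black surface, ε′ cos ϑ′, viz.

$$\begin{aligned}
P'\beta &= \varepsilon(\cos\vartheta-\beta) - \varepsilon'\cos\vartheta' \\
&= \varepsilon\left\{\cos\vartheta - \beta - \frac{(1-\beta\cos\vartheta)^2}{1-\beta^2}\left(\frac{\cos\vartheta-\beta}{1-\beta\cos\vartheta}\right)\right\} \\
&= \varepsilon\left\{\tanh\bar{\delta} - \tanh(\bar{\gamma}/2) - \frac{\cosh^2(\bar{\delta}-\bar{\gamma}/2)}{\cosh^2\bar{\delta}}\cdot\tanh(\bar{\delta}-\bar{\gamma}/2)\right\}.
\end{aligned}$$

This gives a radiation pressure,

$$P' = \varepsilon\,\frac{\sinh^2(\bar{\delta}-\bar{\gamma}/2)}{\cosh^2\bar{\delta}} = \varepsilon\,\frac{(\cos\vartheta-\beta)^2}{1-\beta^2},$$

that is exactly half of (10.4.5). It has the same form as (10.4.8), since the latter can be expressed as

$$P' = 2\varepsilon\,\frac{\sinh^2(\bar{\delta}-\bar{\gamma})}{\cosh^2\bar{\delta}}.$$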

The fact that the wavelength at which the radiation is absorbed is greater than that at which it is emitted by the source, i.e.

$$\lambda'/\lambda = (1-\beta\cos\vartheta)^{-1},$$

means that less energy is absorbed than was emitted. This is true also for the two-way shift. The factor of 2 has led to the confusion of whether to consider the radiation pressure reflected by a moving mirror as being a one- or two-way Doppler shift, or equivalently, as belonging to the Klein or Poincaré model of the hyperbolic plane.

ᶠ No appeal is being made to emission theory since the frequency varies inversely to the wavelength in order to maintain the speed of light constant. Electromagnetic vibrations are self-contained, and are not those of the medium.

10.5 Angle of Parallelism and the Vanishing of the Radiation Pressure

Consider again a unit disc with center O and some hyperbolic distance β̄ whose value is (10.3.1). This defines the ‘distance’ β̄ in terms of the logarithm of the cross-ratio. Now consider the right triangle that has an angle A at the origin, as shown in Fig. 9.8. We recall from Sec. 9.7 that since the angle A is located at the origin, the hyperbolic measure of A will be the same as its Euclidean measure. Also recall that hyperbolic tangents correspond to straight lines in Lobachevsky space, so that the cosine of the angle will be the ratio of the adjacent side to the hypotenuse, cos A = cos Ā = tanh β̄/tanh γ̄, where A is the Euclidean measure of the angle, and we have set the absolute constant equal to one.

Now, the Euclidean length of the opposite side, α, can be calculated from the cross-ratio, and what we found was

$$\alpha = \tanh\bar{\alpha}\;\mathrm{sech}\,\bar{\beta}, \tag{10.5.1}$$

or (9.7.2). Expression (10.5.1) represents the ratio of the Euclidean to hyperbolic arc lengths, which is progressively smaller than 1 the larger the Euclidean distance in velocity space. This is the origin of the Lorentz contraction in the direction normal to the motion that we found in Sec. 9.7. Had one endpoint of the line segment been located at the origin O, there would have been no distortion of the angle and therefore the line segment would have been tanh ᾱ. As we have seen in Sec. 9.7, its non-central location is what is responsible for the angle defect that we observe.

In the projective model, the hyperbolic measures of all other angles will be different from their Euclidean counterparts. For the angle B,

$$\cos B = \frac{\alpha}{\gamma} = \frac{\tanh\bar{\alpha}}{\tanh\bar{\gamma}}\;\mathrm{sech}\,\bar{\beta} = \cos\bar{B}\,\sqrt{1-\beta^2}.$$

Since the last term is less than unity, cos B̄ > cos B. And since cosine is a decreasing function over the open interval (0, π), it follows that B̄ < B, so that its hyperbolic sum will be less than its Euclidean sum, π. Thus, the angular defect is the origin of the FitzGerald–Lorentz contraction in the direction normal to the motion, just like the second-order Doppler shift.

The first-order, longitudinal, Doppler shift plays a fundamental role in hyperbolic geometry. It determines the velocity composition law and the cross-ratio, and hence the hyperbolic distance. The second-order, lateral, Doppler shift is the ratio of the Euclidean to hyperbolic line segments and determines the angle defect.

Now, the largest value of α occurs when it reaches the chord PQ. Its hyperbolic measure becomes infinite, and the angle B tends to zero for β = 0. However, the Euclidean measure of α is

$$\alpha = \sin A(\bar{\beta}_{\max}) = \sqrt{1-\beta_{\max}^2} = \mathrm{sech}\,\bar{\beta}_{\max}, \tag{10.5.2}$$

where the maximum relative velocity is δ in Fig. 9.8. Only at the angle of parallelism can rotation be linked to hyperbolic contraction, and this is precisely what happens when A = cos⁻¹ βmax.

Since the Euclidean measure of the hypotenuse, γ = 1, (10.3.1) gives

$$\bar{\beta}_{\max} = \tfrac{1}{2}\ln\left(\frac{1+\cos A}{1-\cos A}\right) = \tfrac{1}{2}\ln\left(\frac{1+\cos A}{\sin A}\right)^2 = \ln\cot(A/2),$$

where A is the angle of parallelism, which is a function of the length β̄max at βmax = cos A in Fig. 9.8.
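The chain of equalities above is easy to confirm numerically. The sketch evaluates the hyperbolic length (10.3.1) at the limiting speed cos A and compares it with ln cot(A/2), and checks that sin A equals the hyperbolic secant of that length as in (10.5.2). The function names are hypothetical.

```python
import math

def limiting_distance(angle):
    """Hyperbolic length (10.3.1) evaluated at the limiting speed cos(angle)."""
    beta_max = math.cos(angle)
    return 0.5 * math.log((1.0 + beta_max) / (1.0 - beta_max))

def euclidean_opposite_side(angle):
    """Euclidean measure (10.5.2) of the opposite side at the angle of parallelism."""
    return math.sin(angle)
```

The same check shows tan(A/2) = exp(−distance), so the angle shrinks toward zero as the limiting distance grows.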

We recall that expression (10.5.2) is what Terrell [59] finds for the rotation of an object that an observer will see in the same frame as the moving object when the stationary observer’s view is in the direction normal to the motion. And since this is a limiting form of aberration it does not depend upon the distance between the observer and the object that is being observed.

Associating the angle A with ϑ, the radiation pressure (10.4.5) vanishes for the one-way shift at the critical angle ϑ = cos⁻¹ βmax, whereas for a two-way shift (10.4.8) vanishes at its critical angle, ϑ = cos⁻¹ γmax. At these critical angles, the waves have ceased to press against the mirror, and, consequently, the radiation pressure vanishes. This is what the Klein disc model predicts.

Fig. 10.6. The Poincaré half-plane model of measuring distances.

Now, let us see what the Poincaré disc has to say about two-wayDoppler shifts. Since the model is conformal there is no need to distinguishbetween Euclidean and hyperbolic measures of the angles. Consider thehyperbolic arc length, γ , from A to B in the Poincaré half-plane, in Fig. 10.6,of a semi-circle of radius 1. Its length is determined by the logarithm of thecross-ratio, {A′, B′|P, Q}, where the primes denote the projections of A andB onto the x-axis.

Using the Klein definition of hyperbolic distance, (10.2.1), we have

$$\bar{\gamma} = \frac{1}{2}\ln\{A', B'|P, Q\} = \frac{1}{2}\ln\left(\frac{1+\cos\vartheta}{1-\cos\vartheta}\cdot\frac{1-\cos\vartheta''}{1+\cos\vartheta''}\right), \tag{10.5.3}$$

where ϑ = ∠BOQ and ϑ′′ = ∠AOQ. Hence, the Euclidean length of γ̄ is

$$\tanh\bar{\gamma} = \frac{\cos\vartheta - \cos\vartheta''}{1 - \cos\vartheta\cos\vartheta''}, \tag{10.5.4}$$

and when γ̄ becomes the hyperbolic length RB, ϑ′′ = π/2, (10.5.4) reduces to [cf. Eq. (10.3.2)]

$$\gamma_{\max} = \tanh\bar{\gamma} = \cos\vartheta(\gamma_{\max}). \tag{10.5.5}$$
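The step from the cross-ratio (10.5.3) to the Euclidean length (10.5.4), and its reduction to (10.5.5) at ϑ′′ = π/2, can be verified with a short computation. This is only a sketch: the function names are ours, and the angles are assumed to satisfy 0 < ϑ < ϑ′′ < π so that the logarithm is real and positive.

```python
import math

def hyperbolic_arc(theta, theta2):
    """Half the logarithm of the cross-ratio, as in (10.5.3)."""
    c, c2 = math.cos(theta), math.cos(theta2)
    return 0.5 * math.log((1 + c) / (1 - c) * (1 - c2) / (1 + c2))

def euclidean_length(theta, theta2):
    """Right-hand side of (10.5.4)."""
    c, c2 = math.cos(theta), math.cos(theta2)
    return (c - c2) / (1 - c * c2)

if __name__ == "__main__":
    theta, theta2 = 0.7, 1.2
    # tanh of the hyperbolic arc should reproduce (10.5.4)
    print(math.tanh(hyperbolic_arc(theta, theta2)), euclidean_length(theta, theta2))
    # with theta'' = pi/2 the Euclidean length reduces to cos(theta), as in (10.5.5)
    print(euclidean_length(theta, math.pi / 2), math.cos(theta))
```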


And since

$$\bar{\gamma} = \tanh^{-1}\gamma = \frac{1}{2}\ln\left(\frac{1+\gamma}{1-\gamma}\right) = \ln\left(\frac{1+\beta}{1-\beta}\right) = 2\bar{\beta}, \tag{10.5.6}$$

we, in effect, are dealing with Poincaré’s definition of hyperbolic distance [cf. Eq. (10.3.1)].

Thus, ϑ becomes the angle of parallelism, which is a function solely of the arc length, γ̄. This is so because BR is perpendicular to the line h whose bounding parallel through B is h′. Hence, the angle between BR and h′ is also equal to ϑ.

10.6 Transverse Doppler Shifts as Experimental Evidence for the Angle of Parallelism

The one-way Doppler shift, (10.4.2), predicts a small ‘blueshift’ when ϑ = π/2,

$$\nu' = \nu/\sqrt{1-\beta^{2}}. \tag{10.6.1}$$

As we know from Sec. 3.4, Ives and Stilwell [38] were the first to test time dilatation by measuring the difference in the Doppler shift of spectral lines emitted in the forward and backward directions by a uniformly moving beam of hydrogen atoms.

It might be more advantageous to consider the two-way Doppler shift, where (10.3.6) gives the frequency shift

$$\nu'' = \nu\,\frac{1+\gamma\cos\vartheta}{\sqrt{1-\gamma^{2}}}. \tag{10.6.2}$$

The two-way aberration formula,

$$\sin\vartheta'' = \frac{\sqrt{1-\gamma^{2}}}{1+\gamma\cos\vartheta}\,\sin\vartheta,$$

together with (10.6.2) leads immediately to the law of sines, (10.1.10), for a moving mirror, which, as we have pointed out, also happens to be the formula for aberration.
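Multiplying (10.6.2) by the two-way aberration formula cancels the velocity-dependent factors and leaves ν′′ sin ϑ′′ = ν sin ϑ. We take this to be the content of the law of sines (10.1.10); since that equation lies outside this section, the identification is an assumption. The sketch below checks the product numerically, with γ the Euclidean two-way relative velocity in units of c and with function names of our own choosing.

```python
import math

def two_way_shift(nu, gamma, theta):
    """Two-way Doppler-shifted frequency, as in (10.6.2)."""
    return nu * (1 + gamma * math.cos(theta)) / math.sqrt(1 - gamma**2)

def two_way_aberration_sine(gamma, theta):
    """sin(theta'') from the two-way aberration formula."""
    return math.sqrt(1 - gamma**2) / (1 + gamma * math.cos(theta)) * math.sin(theta)

if __name__ == "__main__":
    nu, gamma, theta = 1.0, 0.5, 0.9
    lhs = two_way_shift(nu, gamma, theta) * two_way_aberration_sine(gamma, theta)
    print(lhs, nu * math.sin(theta))  # the two numbers should coincide
```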


The two-way Doppler shift, (10.6.2), like its one-way counterpart (10.4.2), predicts a ‘redshift’ as either the transmitter, or receiver, recedes from the other. However, for ϑ = π/2, a blueshift would remain. The shifted frequency would be

$$\nu'' = \left(\frac{1+\beta^{2}}{1-\beta^{2}}\right)\nu = \nu/\sin\vartheta''. \tag{10.6.3}$$

In this limit, (10.3.7) reduces to the angle of parallelism:

$$\tan(\vartheta''/2) = \left(\frac{1-\beta}{1+\beta}\right) = e^{-\bar{\gamma}}, \tag{10.6.4}$$

which follows from (10.5.6). The angle ϑ′′ is, indeed, acute, and γ = tanh γ̄ = cos ϑ′′. Therefore, a second-order shift predicted by (10.6.3) would be a direct confirmation that relativity operates in hyperbolic velocity space. At the present time, the experimental evidence is not conclusive. Light pulses reflected from a rotating mirror have not shown relativistic frequency shifts [Davies & Jennison 75], nor have those from dual disks rotating at equal speeds in opposite directions operating in the microwave region [Thim 03]. However, a positive result has been reported by measuring the Mössbauer effect with source and absorber mounted on a rotating disk [Champeney et al. 64].
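The relations among (10.6.3), (10.6.4) and (10.5.6) can be checked numerically. The sketch below assumes, as (10.5.6) implies, that the two-way velocity is γ = tanh(2 tanh⁻¹ β) = 2β/(1 + β²); the function names are illustrative only.

```python
import math

def compose_equal_velocities(beta):
    """Two-way velocity gamma = tanh(2 artanh(beta)) = 2 beta / (1 + beta^2)."""
    return 2 * beta / (1 + beta**2)

def transverse_two_way(beta):
    """Return (gamma, nu''/nu, theta'') for the transverse case theta = pi/2."""
    gamma = compose_equal_velocities(beta)
    ratio = (1 + beta**2) / (1 - beta**2)             # frequency ratio in (10.6.3)
    theta2 = 2 * math.atan((1 - beta) / (1 + beta))   # angle of parallelism (10.6.4)
    return gamma, ratio, theta2

if __name__ == "__main__":
    gamma, ratio, theta2 = transverse_two_way(0.3)
    print(ratio, 1 / math.sin(theta2), 1 / math.sqrt(1 - gamma**2))
    print(gamma, math.cos(theta2))
```

For small β the frequency ratio behaves as 1 + 2β², which is the second-order character of the shift that the text refers to.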

The null results can possibly be explained by a confusion between one-way and two-way Doppler shifts. In [Thim 03], the one-way, (10.6.1), and two-way, (10.6.3), shifts were placed on equal footing because both predict a frequency shift proportional to β². Hence, it is not clear to experimenters that what they should be looking for is a two-way, second-order Doppler shift, and not a first-order one.

References

[Abraham 04] M. Abraham, Boltzmann-Festschrift (1904), p. 85; “Zur Theorie der Strahlung und des Strahlungsdruckes,” Ann. der Phys. 14 (1904) 236–287.

[Abraham 23] M. Abraham, Theorie der Elektrizität, 5th ed. (Teubner, Leipzig, 1923), p. 316.

[Borel 13] E. Borel, “La théorie de la relativité et la cinématique,” Comptes Rendus des séances de l’Académie des Sciences 156 (1913) 215–217.

[Busemann & Kelly 53] H. Busemann and P. J. Kelly, Projective Geometry and Projective Metrics (Academic Press, New York, 1953), p. 186.


[Champeney et al. 64] D. C. Champeney, G. R. Isaak, and A. M. Khan, “A time dilatation experiment based on the Mössbauer effect,” Proc. Phys. Soc. 85 (1964) 583–593.

[Criado & Alamo 01] C. Criado and N. Alamo, “A link between the bounds on relativistic velocities and the areas of hyperbolic triangles,” Am. J. Phys. 69 (2001) 306–310. The formula for the metric on page 307 in this article is inaccurate. It is given correctly in footnote 10, where r is replaced by v.

[Davies & Jennison 75] P. A. Davies and R. C. Jennison, “Experiments involving mirror transponders in rotating frames,” J. Phys. A 8 (1975) 1390.

[Einstein 11] A. Einstein, “Zum Ehrenfestschen Paradoxon,” Phys. Z. 12 (1911) 509–510.

[Einstein 98] A. Einstein, “On the electrodynamics of moving bodies,” in Einstein’s Miraculous Year, ed. J. Stachel (Princeton U. P., Princeton NJ, 1998), pp. 123–160.

[Fock 66] V. Fock, The Theory of Space, Time, and Gravitation, 2nd ed. (Pergamon Press, Oxford, 1966), pp. 375–383.

[Friedmann 22] A. Friedmann, “Über die Krümmung des Raumes,” Z. Phys. 10 (1922) 377–386.

[Greenberg 93] M. J. Greenberg, Euclidean and Non-Euclidean Geometries, 3rd edn. (W. H. Freeman, New York, 1993), p. 434.

[Ives & Stilwell 38] H. E. Ives and G. R. Stilwell, “An experimental study of rapidly moving objects,” J. Opt. Soc. Amer. 28 (1938) 215–226.

[Jackson 75] J. D. Jackson, Classical Electrodynamics, 3rd ed. (Wiley, New York 1975), p. 546.

[Kulczycki 61] S. Kulczycki, Non-Euclidean Geometry (Pergamon Press, Oxford, 1961), p. 77.

[Larmor 00] J. Larmor, Aether and Matter (Cambridge U. P., London, 1900), pp. 177–179.

[Laue 19] M. von Laue, Die Relativitätstheorie, Vol. 1 (Vieweg, Braunschweig, 1919), p. 205, first formula in the third line of equation (XXVIII).

[Miller 81] A. Miller, Albert Einstein’s Special Theory of Relativity (Addison-Wesley, Reading MA, 1981), pp. 249–253.

[Needham 97] T. Needham, Visual Complex Analysis (Clarendon Press, Oxford, 2005), p. 307.

[Pauli 58] W. Pauli, Theory of Relativity, (Dover, New York, 1958), p. 97.

[Planck 08] M. Planck, “Zur Dynamik Bewegter Systeme,” Ann. d. Phys. 26 (1908) 1–34.

[Poynting 10] J. H. Poynting, The Pressure of Light (Soc. Promoting Christian Knowledge, London, 1910), p. 85.

[Rindler & Sciama 61] W. Rindler and D. W. Sciama, “Radiation pressure on a rapidly moving surface,” Am. J. Phys. 29 (1961) 643.

[Rindler 82] W. Rindler, Introduction to Special Relativity (Clarendon Press, Oxford, 1982), p. 48.

[Sard 70] R. D. Sard, Relativistic Mechanics (Benjamin, New York, 1970), p. 289.

[Searle 97] G. F. C. Searle, “On the motion of an electrified ellipsoid,” Phil. Mag. 44 (1897) 329–341.


[Schiff 60] L. I. Schiff, “Motion of a gyroscope according to Einstein’s theory of gravitation,” Proc. Natl. Acad. Sci. 46 (1960) 871–882.

[Schlegel 60] R. Schlegel, “Radiation pressure on a rapidly moving surface,” Am. J. Phys. 28 (1960) 687–694.

[Stachel 95] J. J. Stachel, “History of relativity,” in Twentieth Century Physics, eds. L. M. Brown et al. (AIP, New York, 1995), Vol. 1, 249–356.

[Terrell 59] J. Terrell, “Invisibility of the Lorentz contraction,” Phys. Rev. 116 (1959) 1041–1045.

[Terrell 61] J. Terrell, “Radiation pressure on a relativistically moving mirror,” Am. J. Phys. 29 (1961) 644.

[Thim 03] W. H. Thim, “Absence of relativistic transverse Doppler effect at microwave frequencies,” IEEE Trans. Instrum. Meas. 52 (2003) 1660–1664.

[Varicak 10] V. Varicak, “Die Reflexion des Lichtes an Bewegten Spiegeln,” Physik Zeitschr. XI (1910) 586–587.

[Varicak 11] V. Varicak, “Zum Ehrenfestschen Paradoxon,” Phys. Z. 12 (1911) 169.

[Varicak 12] V. Varicak, “Über die nichteuklidische interpretation der relativtheorie,” Jber. Dtsch. Mat. Ver. 21 (1912) 103–127.

[Weisskopf 60] V. F. Weisskopf, “The visual appearance of rapidly moving objects,” Phys. Today Sept. 1960, 24–27.


Chapter 11

The Inertia of Polarization

Special relativity killed the classical dream of using the energy–momentum–velocity relations of a particle as a means of probing the dynamic origins of mass. [Pais 82]

11.1 Polarization and Relativity

Polarization is a property of the orientation of oscillators that produces transverse wave motion. In the case of light waves that travel without obstruction, the polarization is always normal to the direction of propagation. The medium ‘wiggles’ back and forth in a direction perpendicular to the direction of motion. For plane waves of electromagnetic origin, the transversality condition demands that the electric and magnetic fields be perpendicular to the direction of propagation, and perpendicular to each other. Traditionally, the electric field vector has been used to describe polarization, since the magnetic field vector is both proportional, and perpendicular, to it. When the wave is polarized, the electric field remains constant both in amplitude and phase; it can be oriented in a single direction, in which case we speak about linear polarization, or it can rotate as the wave progresses, which may either be circular or elliptical polarization.

11.1.1 A history of polarization and some of its physical consequences

The concept that light waves are transverse is due to Thomas Young and it was used by him to explain the phenomena of polarized light. Up until that time, light was thought to be constituted of longitudinal waves. This seemed to be compatible with Huygens’s principle which can be applied to a wavefront as it expands from a point source through the aether. The aether was deemed necessary as the medium which supported the propagation of waves. At some point a parent wavefront will disappear instantaneously, leaving in its wake a myriad of daughter wavelets which again expand as spherical waves in the aether. Now, the gist of Huygens’s principle is that the disturbances of the daughter wavelets will be only observed on their common forward envelope. Thus, Huygens envisaged both a fission and fusion of wave forms: A fission from a parent into daughter wavefronts, and a fusion of these wavelet motions along a common envelope forming a single wavefront at a later time.

However, it was not until the discovery of polarization that a choice had to be made between transverse and longitudinal propagation of the waves. In fact, polarization was used initially to support the corpuscular theory of light. Bartholinus discovered the effect of double refraction in 1669 which occurred when light passed through crystals of calcite, then known as Icelandic spar. Somewhat later, Huygens discovered the phenomenon of polarization by passing light in series through two calcite crystals. Although, as we have seen, Huygens relied on wave theory, the phenomenon of polarization was used in favor of a corpuscular theory of light. Earlier Newton had suggested that double refraction and partial reflection could be explained by assuming that the particles of light were asymmetric. Malus took this idea one step further by assuming that the particles of light were initially disoriented, and only when they pass through a double diffracting crystal become ordered like those of magnetic bodies. Carrying the analogy further he assumed that light particles had poles, and that the oriented light be called polarized light.

Among the initial proponents of a vibration theory of light were Euler and Young. They argued that if light were composed of particles, the particles would have to be exceedingly small so that when two light beams cross each other they do not interfere with one another. Even more important, there is no ‘dissipation’ of light when it travels over great distances. If light were composed of particles, the particles making up the rays of light would interfere with one another making the image fuzzy, in contrast to the sharp images that are observed. According to their vibration theory, there would have to be a medium in which the vibrations propagate like that of sound waves. When sound waves propagate in a solid medium, the polarization of the waves is in the direction of the shear stress in the plane normal to the direction of propagation. In gases and liquids, sound waves are longitudinal with their oscillations in the direction of the motion.

Every vibratory source would require a different medium or aether. As Maxwell lamented,

Aethers were invented for the planets to swim in, to constitute electric atmosphere and magnetic effluvia, and so on to convey sensations from one part of our body to another, and so on, till all space had been filled three or four times over with aethers.

For light transmission the medium was referred to as a ‘luminiferous aether,’ as opposed to an ‘electric aether’ that was required for the propagation of electrical disturbances.

To Fresnel we owe the idea that light waves are completely transverse which was a revolutionary idea since transverse elastic waves in solids were completely unknown at that time. However, this luminiferous aether was no ordinary aether since the theory of elastic waves in solids leads to the conclusion that longitudinal waves are always present in the reflection and refraction of elastic waves. However, the boundary conditions introduced by Fresnel were not the boundary conditions of an elastic medium, but they did account for phenomena associated with the propagation of light.

Diffraction phenomena could be explained by vibration theory in which light was seen to be a longitudinal vibration like those of sound waves, or wave theory which advanced the transverse vibrations of the aether. The death knell of corpuscular theory came with the Fizeau and Foucault measurements of the velocity of light which clearly showed that the velocity of light was smaller in liquids than in air, contrary to what Newton predicted. Thereafter the wave theory triumphed, but there was another merger to be made. Maxwell advanced the idea that light was really an electromagnetic wave on the basis of a common velocity of propagation. But, Maxwell did not live to see his idea become reality when Hertz, in 1888, showed that electromagnetic waves stemming from oscillating electric currents can exhibit reflection, diffraction, refraction, interference, and, last but not least, polarization.

Polarization is described by two perpendicular components normal to the direction of wave propagation. Plane waves of any polarization can be obtained by combining any two orthogonally polarized waves. So what does this have to do with relativity?


Lorentz introduced two masses according to the relation between the acceleration a and the force F [cf. (5.4.39)],a

$$a = \sqrt{1-u^{2}}\;\frac{F - (F\cdot u)\,u}{m}. \tag{11.1.1}$$

If the force is in the direction of the motion, we get the so-called ‘longitudinal’ mass, while if the force is normal to the motion the ‘transverse’ mass results. Since it was the latter which coincided with the mass determined from Kaufmann’s deflection experiments on negatively charged particles that we discussed in Sec. 5.4.1, the former was subsequently forgotten.
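The two masses can be read off from (11.1.1) by applying a unit force first along, and then across, the velocity. The sketch below is illustrative only: it uses natural units (c = 1), represents three-vectors as plain tuples, and uses a function name of our own.

```python
import math

def acceleration(F, u, m):
    """Eq. (11.1.1) with c = 1: a = sqrt(1 - u^2) [F - (F.u) u] / m."""
    u2 = sum(x * x for x in u)
    Fu = sum(f * x for f, x in zip(F, u))
    k = math.sqrt(1 - u2) / m
    return tuple(k * (f - Fu * x) for f, x in zip(F, u))

if __name__ == "__main__":
    m, u = 1.0, (0.6, 0.0, 0.0)
    a_par = acceleration((1.0, 0.0, 0.0), u, m)[0]   # force along the motion
    a_perp = acceleration((0.0, 1.0, 0.0), u, m)[1]  # force normal to the motion
    # ratios force/acceleration: m/(1-u^2)^(3/2) and m/(1-u^2)^(1/2) respectively
    print(1.0 / a_par, m / (1 - 0.36) ** 1.5)
    print(1.0 / a_perp, m / math.sqrt(1 - 0.36))
```

The longitudinal ratio exceeds the transverse one by the factor 1/(1 − u²), which is why the two definitions give distinct masses for the same particle.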

Furthermore, in the old definition of the electromagnetic mass that we discussed in Sec. 3.7.3.2, the mass was defined as the ratio of the electromagnetic momentum, p, to the speed, u, viz.

p = mu,

in the case where the electron is asymmetrical. This definition made the mass a vector quantity when the momentum was not in the direction of the velocity. Likewise, the longitudinal mass also became a vector

$$m' = \frac{\partial p}{\partial u} = m + u\,\frac{\partial m}{\partial u}.$$

In an asymmetrical electron both masses have transverse components. However, in the models set forth by Abraham and Lorentz in Sec. 5.4.4, both masses are in the direction of the motion, and their moduli reduce to the normal transverse and longitudinal components.

J. J. Thomson criticized the Lorentz force, which is what a charged particle experiences as it traverses a magnetic field, for violating Newton’s third law of motion. According to him, this could be rectified by considering the existence of momentum in the electric field. In his words, “the loss of momentum in the pulse should be equal to the gain of momentum by the body.” The amount of momentum in the field Thomson found to be proportional to the number of ‘Faraday tubes’ passing through a unit area drawn at right angles to their direction. It was also shown to be proportional to the magnetic induction, a quantity which Larmor associated with the velocity of the aether. Then, the direction of momentum would be at right angles to both the magnetic induction and the Faraday tubes. Since the momentum is proportional to the Poynting vector, the Faraday tubes would be a materialization of the electric field.b

a In this chapter we use natural units in which ħ = c = 1. In natural units length and time have the same dimensions while mass has the dimensions of inverse length, e.g. Compton length. Also in this chapter we will use p as the momentum since G is reserved for a generalized displacement.

As the Faraday tubes move through the aether, the motion of these cylinders normal to their lengths would necessarily lead to an increase in their mass for they would drag the aether with them. We have corroborated in Sec. 5.3.1 that the broadside motion of a rod increases its mass over that of its frontal motion. A moving charge creates a magnetic field whose energy we have shown to be proportional to the kinetic energy in Sec. 5.4.3. So if the geometry of the mass is considered to be a sphere when at rest, there will be an additional increase in its momentum when set in motion. According to Thomson, the additional momentum does not reside in the sphere, but, rather, in the aether surrounding it.

A third type of polarization is well-known in hadron colliders. If pl is the momentum along the beam direction, the experimental particle physicist’s definition of rapidity is

$$y = \frac{1}{2}\ln\left(\frac{W + p_l}{W - p_l}\right), \tag{11.1.2}$$

b Again there is a potential priority dispute between Poynting and Heaviside for the discovery of the ‘Poynting’ vector. While it is true that Poynting’s paper was received by the Royal Society on December 17, 1883, and read on January 10 of the following year, it carries a footnote that was subsequently added by Poynting on the 19th of June leading one to believe that it did not appear in print until after that date [Nahin 88]. While in the June 21st edition of The Electrician, Heaviside wrote

The direction of maximum transference is therefore perpendicular to the plane containing the magnetic force and the current directions, and its amount per second proportional to their strengths and the sine of the angle between their directions.

And it was not until the following year, January 10, 1885 to be precise, that Heaviside actually published a proof of this theorem. It was a two step proof, thanks to his vector calculus, and not a many page one containing infinite triple integrals like the one given by Poynting. Moreover, it was Heaviside who got the direction of Poynting’s vector right, as we shall see in Sec. 11.5.1.


where W is the total energy

$$W^{2} = p_l^{2} + m_t^{2}. \tag{11.1.3}$$

The transverse momentum, pt, is related to the transverse mass mt by

$$m_t = \sqrt{m^{2} + p_t^{2}}. \tag{11.1.4}$$

Expression (11.1.2) differs from the usual definition of rapidity insofar as it replaces the modulus, |p|, by pl. In hadron collider physics, this modification is justified by the fact that particle production is a constant function of the rapidity in (11.1.2). We will soon appreciate that the difference between mass and momentum polarizations lies in which quantity is being held constant: For mass polarization it is the total energy, while for momentum polarization it is the mass that is the invariant.
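Definitions (11.1.2) to (11.1.4) imply W = mt cosh y and pl = mt sinh y, so the transverse mass plays the role of a rest mass for motion along the beam. A minimal numerical sketch, in natural units and with illustrative function names and input values, is:

```python
import math

def transverse_mass(m, pt):
    """Transverse mass, as in (11.1.4)."""
    return math.hypot(m, pt)

def total_energy(m, pl, pt):
    """Total energy, as in (11.1.3)."""
    return math.hypot(pl, transverse_mass(m, pt))

def rapidity(m, pl, pt):
    """Rapidity along the beam, as in (11.1.2)."""
    W = total_energy(m, pl, pt)
    return 0.5 * math.log((W + pl) / (W - pl))

if __name__ == "__main__":
    m, pl, pt = 1.0, 2.0, 0.5   # arbitrary sample values
    y, mt = rapidity(m, pl, pt), transverse_mass(m, pt)
    print(mt * math.cosh(y), total_energy(m, pl, pt))  # both give W
    print(mt * math.sinh(y), pl)                       # both give p_l
```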

In 1852 Sir George Gabriel Stokes [52] showed that a partially polarized light beam could be characterized by four parameters now bearing his name. In substance, Stokes demonstrated that when any two beams of light are superimposed incoherently, the Stokes parameters are additive. Moreover, any arbitrary light beam may be considered as a superposition of an unpolarized beam and an elliptically polarized one. Because of their operational forms, the Stokes parameters have been related to quantities that appear in the quantum-mechanical treatment of light. For example, the equivalence between the Stokes parameters and the components of the density matrix has also been noticed by Perrin [42], and Falkoff and Macdonald [51].

We plan to reinterpret the Stokes parameters to give the relativistic invariant forms

$$W^{2} - p^{2} = m^{2}, \qquad p^{2} + m^{2} = W^{2},$$

in the cases where the mass m, or the total energy W, is invariant. The polarization arises by designating the orthogonal components of the mass, momentum, or mass and momentum. In the case of the hadron collider, the latter is realized which involves the orthogonal components of the total mass and the transverse momentum, i.e. mt = pt + im with modulus

$$\sqrt{m_t m_t^{*}} = \sqrt{p_t^{2} + m^{2}}.$$


The beam momentum and the transverse mass can be represented by the spherical coordinates

$$\begin{aligned}
p_l &= W\cos 2\vartheta = W\cos 2\chi\cos 2\psi,\\
p_t &= W\sin 2\vartheta\cos 2\phi = W\cos 2\chi\sin 2\psi,\\
m &= W\sin 2\vartheta\sin 2\phi = W\sin 2\chi,
\end{aligned} \tag{I}$$

in the case of complete polarization. This shows that the total energy, and not the transverse mass [Jackson 05], is the conserved quantity. The second equality in the first line of (I) is none other than the Pythagorean theorem for elliptic geometry.
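Both parametrizations in (I) can be compared numerically. In the sketch below the angles χ and ψ are obtained from ϑ and ϕ by sin 2χ = sin 2ϑ sin 2ϕ and tan 2ψ = tan 2ϑ cos 2ϕ; these two relations are our reading of the spherical right triangle rather than formulas quoted from the text, and the function names are illustrative.

```python
import math

def scheme_I(W, theta, phi):
    """First parametrization in (I): returns (p_l, p_t, m)."""
    pl = W * math.cos(2 * theta)
    pt = W * math.sin(2 * theta) * math.cos(2 * phi)
    m = W * math.sin(2 * theta) * math.sin(2 * phi)
    return pl, pt, m

def latitude_longitude(theta, phi):
    """Angles (chi, psi) of the second parametrization (assumed relations)."""
    two_chi = math.asin(math.sin(2 * theta) * math.sin(2 * phi))
    two_psi = math.atan2(math.sin(2 * theta) * math.cos(2 * phi),
                         math.cos(2 * theta))
    return two_chi / 2, two_psi / 2

if __name__ == "__main__":
    W, theta, phi = 2.0, 0.4, 0.3
    pl, pt, m = scheme_I(W, theta, phi)
    chi, psi = latitude_longitude(theta, phi)
    print(pl**2 + pt**2 + m**2, W**2)   # the total energy is the conserved radius
    # spherical Pythagorean theorem: cos 2theta = cos 2chi cos 2psi
    print(math.cos(2 * theta), math.cos(2 * chi) * math.cos(2 * psi))
```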

In experimental particle physics, the rapidity (11.1.2) is replaced by the so-called ‘pseudorapidity.’ If ϑ is the angle between the particle momentum p and the direction of the momentum of the beam then cos ϑ = pl/|p|. In the limit m ≪ |p|, the rapidity (11.1.2) can be replaced by the pseudorapidity

$$\eta = \frac{1}{2}\ln\left(\frac{|p| + p_l}{|p| - p_l}\right) = \frac{1}{2}\ln\left(\frac{1+\cos\vartheta}{1-\cos\vartheta}\right) = -\ln\tan(\vartheta/2). \tag{11.1.5}$$

This identifies the angle ϑ in the limit m ≪ |p| with the Bolyai–Lobachevsky angle of parallelism. In the Euclidean limit the pseudorapidity vanishes, while as the angle of parallelism decreases, the pseudorapidity increases. Since the transverse momentum is related to ‘missing,’ or ‘invisible,’ mass in collider particle production, the infinite limit of the pseudorapidity would be related to the limit where all the masses are accounted for, in which case the particle momentum is directed along the beam momentum.
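The approach of the rapidity (11.1.2) to the pseudorapidity (11.1.5) as m becomes negligible compared with |p| can be illustrated numerically. This is a sketch in natural units; the function names and sample values are our own, and the total energy is taken as W = √(m² + |p|²), which agrees with (11.1.3) and (11.1.4).

```python
import math

def rapidity(m, p, theta):
    """Rapidity (11.1.2) for momentum modulus p at angle theta to the beam."""
    W = math.hypot(m, p)
    pl = p * math.cos(theta)
    return 0.5 * math.log((W + pl) / (W - pl))

def pseudorapidity(theta):
    """Pseudorapidity (11.1.5)."""
    return -math.log(math.tan(theta / 2))

if __name__ == "__main__":
    theta, p = 0.5, 10.0
    for m in (5.0, 1.0, 0.1, 1e-3):
        # the difference shrinks as m/p decreases
        print(m, rapidity(m, p, theta) - pseudorapidity(theta))
```

The rapidity always lies below the pseudorapidity in magnitude, since a nonzero mass makes W exceed |p|.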

The pseudorapidity (11.1.5) provides a unique link between hyperbolic and circular functions. We saw in Sec. 2.5 that hyperbolic geometry depends on an absolute constant, k, such that the area of any triangle △ABC is

$$\operatorname{area}(\triangle ABC) = \frac{\pi}{180}\,k^{2}\times\operatorname{defect}(\triangle ABC).$$

Since the defect, or the (positive) difference between 180° and the sum of the angles of the triangle, measured in degrees, is minutely small on a terrestrial scale, while the area is finite, the constant k² must be immensely large. As we have seen in Sec. 2.5, the parallaxes of fixed stars serve as lower bounds to k².


In (11.1.5), we have implicitly assumed k = 1 since we are considering natural units. Recall in Sec. 2.4 we showed this to be equivalent to the choice of our unit of length so that the ratio of corresponding arcs on concentric horocycles is equal to e when the distance between horocycles is 1. This choice is analogous to the choice of the unit of angular measure so that a right angle will have a radian measure of π/2. It makes the area of a triangle equal to its defect, provided the defect is now measured in radians instead of degrees. Using double angle formulas, we may express the angle of parallelism, ϑ(η), measured in radians, in terms of the pseudorapidity as

tan ϑ(η) = 1/sinh η, cos ϑ(η) = tanh η, sin ϑ(η) = 1/cosh η.
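The double-angle step can be spelled out as follows. This is a sketch of the omitted algebra, starting from (11.1.5) and assuming 0 < ϑ < π/2:

```latex
\begin{align*}
e^{-\eta} &= \tan(\vartheta/2), \\
\sinh\eta &= \tfrac{1}{2}\left[\cot(\vartheta/2) - \tan(\vartheta/2)\right]
           = \frac{\cos^{2}(\vartheta/2) - \sin^{2}(\vartheta/2)}
                  {2\sin(\vartheta/2)\cos(\vartheta/2)}
           = \cot\vartheta, \\
\cosh\eta &= \tfrac{1}{2}\left[\cot(\vartheta/2) + \tan(\vartheta/2)\right]
           = \frac{1}{2\sin(\vartheta/2)\cos(\vartheta/2)}
           = \frac{1}{\sin\vartheta}, \\
\tanh\eta &= \frac{\sinh\eta}{\cosh\eta} = \cos\vartheta .
\end{align*}
```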

The distinction between the invariancy of the total energy, W, or the total mass, m, is geometrically related to the distinction between elliptic and hyperbolic geometries, and optically connected to the difference between birefringence and dichroism.

Dichroism is related to the unequal absorption of two orthogonally polarized light components, while birefringence is the unequal retardation of orthogonal components.

If we are considering processes which conserve the total energy, there can occur mass polarization. Denoting by 2ϑ and 2ϕ the polar and azimuth angles, respectively, and choosing the momentum to form an angle 2ϑ with the z-axis, we can write the momentum p and mass components ml and mt in terms of these angles through spherical coordinates

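$$\begin{aligned}
p &= W\varepsilon\cos 2\vartheta,\\
m_l &= W\varepsilon\sin 2\vartheta\cos 2\phi,\\
m_t &= W\varepsilon\sin 2\vartheta\sin 2\phi.
\end{aligned}$$

()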

Since the degree of polarization, ε ≤ 1, is constant, the equality of the differences,

$$\frac{W^{2} - p^{2} - m^{2}}{W^{2}} = \frac{W'^{2} - p'^{2} - m'^{2}}{W'^{2}} = 1 - \varepsilon^{2}, \tag{11.1.6}$$

will always hold no matter what frame we are working in, where the mass m = √(ml² + mt²). This was first commented on by Paul Soleillet in 1929 who also devised 4 × 4 matrices that act on these four-vectors, and can be applied to the description of the polarization of Compton scattering [Fano 49].c
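That the ratio in (11.1.6) depends only on the degree of polarization, and not on the energy or the angles, can be seen from a short computation. The sketch below uses the spherical-coordinate scheme with the factor ε as given in the text; the function names and sample values are illustrative.

```python
import math

def scheme_with_epsilon(W, eps, theta, phi):
    """Momentum and mass components for degree of polarization eps."""
    p = W * eps * math.cos(2 * theta)
    ml = W * eps * math.sin(2 * theta) * math.cos(2 * phi)
    mt = W * eps * math.sin(2 * theta) * math.sin(2 * phi)
    return p, ml, mt

def polarization_deficit(W, p, ml, mt):
    """Left-hand side of (11.1.6), with m^2 = ml^2 + mt^2."""
    return (W**2 - p**2 - ml**2 - mt**2) / W**2

if __name__ == "__main__":
    eps = 0.8
    for W, theta, phi in ((2.0, 0.3, 0.5), (7.5, 1.1, 0.2)):
        p, ml, mt = scheme_with_epsilon(W, eps, theta, phi)
        print(polarization_deficit(W, p, ml, mt), 1 - eps**2)
```

Setting ε = 1 makes the deficit vanish, which is the on-shell case treated in the remainder of the section.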

Complete polarization, ε = 1, corresponds to ‘on mass shell,’ where the ‘mass shell,’ or ‘mass hyperboloid,’ refers to solutions of

$$W^{2} - p^{2} = m^{2},$$

describing the combinations of momentum, p, and energy, W, that are allowed for a relativistic particle of mass m. ‘Virtual’ particles may be ‘off shell,’ or partially polarized. In the sequel we will always treat the ‘on shell’ case, or that of complete polarization, ε = 1. It is apparent from these equations that W and p are invariant for a rotation of the axes through the azimuthal angle. But, ml and mt change with the axes, and are related to one another through a rotation about this angle.

In birefringent media there is a phase delay. Polarizers exploit the birefringent properties of crystals like quartz and calcite. An ideal birefringent crystal transforms the polarization state of an electromagnetic wave without loss of energy. The crystal has an optical axis for which light has a different index of refraction for light polarized parallel and perpendicular to this axis. A beam of unpolarized light is split by refraction at the surface of these crystals into two rays: light rays polarized parallel to the optic axis are known as the ‘ordinary’ rays, while light rays polarized normal to the optic axis are called ‘extraordinary’ waves. Only the former obey Snell’s law, (3.5.6).

A Nicol prism, which was an early prototype of a birefringent polarizer, can be used to measure the degree of plane polarization with respect to two arbitrary orthogonal axes, and the degree of plane polarization with respect to a set of axes oriented at 45° to the right of the previous ones. The measurement of the degree of circular polarization requires a quarter-wave plate. A quarter-wave plate is a phase retarder that can be used to transform circularly polarized light into linearly polarized light or vice versa.

c Again Stigler’s law of eponymy is borne out in that spectroscopists refer to the calculus where light is represented by a vector which is operated on by an optical element as the Jones and Mueller calculus [Kliger et al. 90], and not to its rightful discoverer, Soleillet [29], who more than a decade earlier than Jones [41], and almost two decades earlier than Mueller [48], discovered it.


Fig. 11.1. Spherical right triangle for scheme (II).

The elliptic nature of the phase changes is made apparent by considering the spherical right triangle shown in Fig. 11.1, with coordinates

$$\begin{aligned}
p &= W\cos 2\vartheta = W\cos 2\psi\cos 2\chi,\\
m_l &= W\cos 2\phi\sin 2\vartheta = W\sin 2\psi\cos 2\chi,\\
m_t &= W\sin 2\vartheta\sin 2\phi = W\sin 2\chi.
\end{aligned} \tag{II}$$

According to this spherical right triangle, scheme () corresponds to the Poincaré sphere, shown in Fig. 11.7 below.

Electromagnetic waves may be characterized by their electric vectors which can be decomposed into orthogonal components that encounter different propagation effects in the media through which they pass. We have already discussed phase lags between the two components giving rise to birefringence, which can be characterized by a rotation of 2ϕ in the plane perpendicular to the direction of momentum propagation. The rotation matrices are unitary. However, it may occur that the amplitude of one of the orthogonal components of the electric vector gets reduced in dichroic media. Radiation filters serve to block all the radiation in one of the modes, and are known as polarizers. In terms of the parameters describing the polarized state, the total intensity is reduced. Translated into relativistic terms, the total energy will no longer be a conserved quantity, and transformations from one inertial frame to another involve a Lorentz boost of 2β = tanh 2ϑ in the direction of propagation. Such transformations are described by Hermitian matrices.

We are now dealing with momentum polarization, where the momentum is decomposed into orthogonal components pl and pt, such that the momentum is p = pl + ipt, with modulus √(pp*) = √(pl² + pt²). In terms of the polar and azimuthal angles, 2ϑ and 2ϕ, the energy and momentum are given by

$$\begin{aligned}
W &= m\cosh 2\vartheta = m\cosh 2\chi\cosh 2\psi,\\
p_l &= m\sinh 2\vartheta\cos 2\phi = m\cosh 2\chi\sinh 2\psi,\\
p_t &= m\sinh 2\vartheta\sin 2\phi = m\sinh 2\chi.
\end{aligned} \tag{III}$$

Fig. 11.2. Hyperbolic right triangle related to the scheme (III).

The second equalities in (III) are deduced by considering the hyperbolic right triangle in Fig. 11.2. In particular, the second equality in the first line of (III) will be recognized as the Pythagorean theorem for a hyperbolic right triangle.
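The hyperbolic scheme (III) can be checked in the same way as its elliptic counterpart. In the sketch below the legs χ and ψ are fixed by sinh 2χ = sinh 2ϑ sin 2ϕ and tanh 2ψ = tanh 2ϑ cos 2ϕ; these relations are our reading of the hyperbolic right triangle rather than formulas quoted from the text, and the function names are illustrative.

```python
import math

def scheme_III(m, theta, phi):
    """First parametrization in (III): returns (W, p_l, p_t)."""
    W = m * math.cosh(2 * theta)
    pl = m * math.sinh(2 * theta) * math.cos(2 * phi)
    pt = m * math.sinh(2 * theta) * math.sin(2 * phi)
    return W, pl, pt

def hyperbolic_legs(theta, phi):
    """Legs (chi, psi) of the hyperbolic right triangle (assumed relations)."""
    two_chi = math.asinh(math.sinh(2 * theta) * math.sin(2 * phi))
    two_psi = math.atanh(math.tanh(2 * theta) * math.cos(2 * phi))
    return two_chi / 2, two_psi / 2

if __name__ == "__main__":
    m, theta, phi = 1.5, 0.4, 0.3
    W, pl, pt = scheme_III(m, theta, phi)
    chi, psi = hyperbolic_legs(theta, phi)
    print(W**2 - pl**2 - pt**2, m**2)   # the mass is the invariant
    # hyperbolic Pythagorean theorem: cosh 2theta = cosh 2chi cosh 2psi
    print(math.cosh(2 * theta), math.cosh(2 * chi) * math.cosh(2 * psi))
```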

Thus, whereas birefringence involves phase changes of the orthogonal components of the electric vector and belongs to elliptic space, dichroism involves the reduction in total intensity and lives in hyperbolic space.

In comparison to (II), scheme (III) is obtained by allowing the polar angle ϑ to become imaginary. This is analogous to the transition from elliptic to hyperbolic geometry, which is effected by allowing the radius of a sphere to become imaginary, thus transforming the sphere into the ‘pseudosphere’ that we discussed in Sec. 2.5. In Sec. 11.2 we will show how the Stokes parameters can be written in terms of the density matrix, which, in turn, can be expressed in terms of the mass, momentum, and energy, or in terms of the components of angular momentum, since all can be expressed in terms of a conserved four-vector. In terms of mass, momentum and energy, the density matrix,

\[
\rho = \frac{1}{2}
\begin{pmatrix}
W + m & p_l - ip_t \\
p_l + ip_t & W - m
\end{pmatrix},
\tag{11.1.7}
\]

has the total energy as its trace, and has a vanishing determinant.

In analogy with the three components of linear momentum, we write the generators of rotation as a scheme (II) type

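\[
\begin{aligned}
p_x &= W\sin 2\vartheta\,\cos 2\varphi = m\cos 2\varphi,\\
p_y &= W\sin 2\vartheta\,\sin 2\varphi = m\sin 2\varphi,\\
p_z &= W\cos 2\vartheta.
\end{aligned}
\tag{II′}
\]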

The relativistic conservation of energy is

\[
W^2 = p_x^2 + p_y^2 + p_z^2 = m^2 + p_z^2 =: p^2. \tag{11.1.8}
\]

This reduces to a scheme (I) type when the transverse momentum and mass become zero, as would be the case for a particle of zero mass.

11.1.2 Spin

Not long after the proposal of ‘spin’ as an additional degree of freedom of the electron, experimenters came to believe that there should be an analogy between the behavior of linearly polarized light and the asymmetric orientation of spins in an electron beam [Farago 71].

A spin-1 particle, with a well-defined momentum, p, can have a spin along the direction of the motion, opposed to the direction of motion, or normal to that direction, h = ±1, 0. The new property, h, known as the particle’s helicity, is not confined to spin-1 particles. However, if the particle happens to be a photon its transverse wave property excludes the value 0. This value has been associated with the longitudinal mode, and the presence of mass, in the electroweak interaction [Gottfried & Weisskopf 86]. The orientation of spin along, or opposite to, the direction of propagation n can have only the values h = ±1. Moreover, since the orbital angular momentum, L, vanishes in the direction of propagation n, the helicity is defined as the projection of the total angular momentum J in the direction of motion, i.e.

J · n = (L + S) · n = S · n = h,

where S is the spin of the photon.

To add to the confusion, instead of helicity referring to the orientation of spin with respect to the axis of quantization, states of definite helicity are related to left- and right-handed circular polarization, as opposed to linear polarization. Additional confusion is incurred by the close formal analogy between spin-$\tfrac{1}{2}$ particles and spin-1 photons. Since there are only two helicity states h = ±1, these states can be represented as spinors, just like electrons!

We will see that the properties of light can be fully determined by the density matrix (11.1.7), where the Stokes optical parameters will replace the mechanical parameters [cf. (11.2.2) below]. The diagonal elements give the probability of finding a photon in the beam in one of the two helicity states. By allowing the beam to pass through various polarization filters, information can be obtained about its polarization. Since the intensity must be real, the off-diagonal elements must be complex conjugates of one another, i.e. the density matrix must be Hermitian. This reduces the total number of independent parameters to four, if the total intensity of the beam is included. It is quite remarkable that Stokes came to the exact same conclusions way back in 1852, with absolutely no knowledge of the quantum nature of light, or even Maxwell’s theory predicting the transverse nature of wave propagation!

Parenthetically, we may add that associating the longitudinal mode of the state of helicity |0〉 with mass is not without its problems. For spin-1 particles there need not be a direction in which the spins are pointing, either up or down. Although there can be a preferred direction, it is not possible to specify a projection along this axis so as to obtain helicity. The spin vector of |0〉 can be thought of as precessing in the direction perpendicular to the motion. Thus, quantities that characterize particles of spin-1 must not depend upon the preferred direction, and a vector of polarization is not applicable. These quantities must be at least quadratic in the spin components, or second-order tensors.

To see that the polarization is insufficient to characterize such states, we could equally well produce the state of zero polarization by all the particles in the state |0〉, or by an equal mixture of the states |1〉 and |−1〉. It would then be necessary to construct a monopole, vector, and second-rank tensor in order to obtain a complete characterization of the state of polarization. Thus, the association of a longitudinal mode with the |0〉 state would mean a complete overhaul of the properties of polarization.

The very fact that electrons share both undulatory and corpuscular properties of light, and do have mass, would tend to rule out that a completely new mechanism be added to treat polarization once massless particles acquire mass in the electroweak theory. It would be far simpler to assume that the acquisition of mass is a breakdown in the pure helicity of the state due to dispersion.

Stokes’s analysis of the polarization of electromagnetic radiation has been gaining increasing interest in other branches of physics due, undoubtedly, to its similarity to a rotation in a four-dimensional Minkowski space [Soleillet 29]. This is a consequence of the realization that the Stokes parameters are the components of a conserved four-vector. This column vector can be scattered into a new column vector, with the same conservation properties, by matrices which change the state of the polarization of light.

The transformation matrix appears as a generalized matrix of rotation in which two components are rotated through an imaginary angle, and the other two components are rotated through a real angle. The rotation through an imaginary angle is a ‘rotation’ of the total energy and linear angular momentum by a Lorentz transform, while the rotation of the other two components through a real angle is analogous to the introduction of a phase difference between the components of vibration of the electric vectors along mutually perpendicular axes, and is the origin of mass polarization.

The former case provides a physical example of hyperbolic geometry in which there is a contraction of rulers as the boundary of the space is approached as seen, of course, from a Euclidean perspective. An additional representation of the Stokes parameters, proposed by Poincaré in 1892, and which we will discuss in Sec. 11.2, is physically equivalent to a light beam being rotated through an angle around its direction of propagation. It is related to Rayleigh scattering, where the outgoing linear polarization vector is rotated away from the incoming linear polarization vector, and constitutes an elliptical geometric distortion effect.

Just as there are two independent states of light polarization, the density matrix can be represented as the sum of the identity matrix and the inner product of the Stokes parameters and the Pauli spin matrices [Fano 49]. An identical treatment can be given to the weak interaction where the proton and neutron are a ‘charge doublet’ of the nucleon. This doublet can only be distinguished by the weak interaction in which a free neutron decays into a proton, an electron, and an antineutrino. We will return to this shortly in Sec. 11.1.7.
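The decomposition into the identity and Pauli matrices can be made concrete for the mechanical form (11.1.7) of the density matrix. The sketch below is an illustration under stated assumptions: the pairing of p_l, p_t and m with σ_x, σ_y and σ_z is chosen here simply because it reproduces (11.1.7), and the helper names and sample values are hypothetical.

```python
# Pauli matrices written as nested tuples (plain Python, no external libraries)
IDENTITY = ((1, 0), (0, 1))
SIGMA_X = ((0, 1), (1, 0))
SIGMA_Y = ((0, -1j), (1j, 0))
SIGMA_Z = ((1, 0), (0, -1))

def density_matrix(W, p_l, p_t, m):
    """rho = (W*I + p_l*sigma_x + p_t*sigma_y + m*sigma_z) / 2, cf. (11.1.7)."""
    terms = ((W, IDENTITY), (p_l, SIGMA_X), (p_t, SIGMA_Y), (m, SIGMA_Z))
    return [[0.5 * sum(c * mat[i][j] for c, mat in terms) for j in range(2)]
            for i in range(2)]

def trace(rho):
    return rho[0][0] + rho[1][1]

def determinant(rho):
    return rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]

# Illustrative values satisfying W^2 = m^2 + p_l^2 + p_t^2
m, p_l, p_t = 1.0, 2.0, 2.0
W = (m**2 + p_l**2 + p_t**2) ** 0.5
rho = density_matrix(W, p_l, p_t, m)
```

With these values the trace returns W, the determinant vanishes, and the off-diagonal elements are complex conjugates of one another, which are the three properties the text attributes to the density matrix.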

Now, the two-dimensional unitary modular group, SU(2), can be represented by the three 2 × 2 Pauli spin matrices so that the ordinary spin multiplets of particles like electrons can be derived from this group. It was Heisenberg’s foresight that led him to apply the same group of transforms to the neutron-proton charge doublet, or what has become known as ‘isospin.’

11.1.3 Angular momentum

The Stokes parameters also bear an intimate tie with the angular momentum operators. In exactly the same way that each state can be chosen to be a simultaneous eigenfunction of the square of the total angular momentum, J², and its projection on the direction of momentum, Jz, which we will take as the z-axis, we can conclude a priori that the eigenvalues of J² will be of the form J(J + 1), where J is either an integer or half-integer, and that each multiplet will consist of 2J + 1 states with eigenvalue jz of the operator Jz, varying in unit steps from −J to +J.

In this analogy with angular momentum, the total energy, W, corresponds to the total angular momentum, J. Its projection onto the z-axis, Jz, corresponds to the operator of linear momentum, pz. It will turn out that pz, or Jz, is proportional to the difference between the populations of the two states of isospin, or helicity, or any other two mutually exclusive states. When the populations of the two states become equal the particle’s velocity goes to zero.

The remaining two angular momentum operators, J± = Jx ± iJy, are known as ‘ladder’ operators, since they cause jumps up and down in a multiplet, creating or destroying a particle as they go. To these operators we will associate the ‘mass’ operators in longitudinal, ml, and transverse, mt, directions of momentum. They can be considered as the last vestiges of the ‘transverse’ and ‘longitudinal’ masses that, as discussed in Sec. 5.4.4, were introduced early in relativity theory, and then quickly forgotten when it was found that the transverse mass was the mass measured in the e/m experiments described in Sec. 5.4.1.

In group theory jargon, we are saying that the Stokes parameters are the operators that generate the SU(2) algebra. Since two helicity, spin, or isospin, states are involved, the Stokes parameters can be represented by the creation of a spin up (down), $a_+^\dagger$ ($a_-^\dagger$), or the annihilation of one, $a_+$ ($a_-$). In terms of these second quantized operators, the total energy W becomes the total number of particles operator,

\[
W = a_+^\dagger a_+ + a_-^\dagger a_-,
\]

and the operators of the mass components and momentum are

\[
m_l = a_+^\dagger a_-, \qquad
m_t = a_-^\dagger a_+, \qquad
p_z = \tfrac{1}{2}\bigl(a_+^\dagger a_+ - a_-^\dagger a_-\bigr). \tag{11.1.9}
\]

The conservation of angular momentum,

\[
W^2 = \tfrac{1}{2}(m_l m_t + m_t m_l) + p_z^2, \tag{11.1.10}
\]

is also the square of the total quasi-spin operator of isospin, and it gives rise to the eigenvalue equation

\[
\langle W^2 \rangle = W(W + 1).
\]
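The operator identity (11.1.10) can be checked on small matrices. The sketch below is a hedged illustration, not the author's construction: it assumes the standard oscillator matrix elements for the two modes and restricts attention to states with a fixed total number of quanta, and all function names are hypothetical. In this check the right-hand side of (11.1.10) comes out as j(j + 1) times the identity, with j equal to half the number of quanta, so the form W(W + 1) is recovered when W is read as that label; this reading is an assumption of the sketch.

```python
import math

def quasi_spin_operators(n_total):
    """Matrices of m_l, m_t and p_z of (11.1.9) for a fixed number of quanta.

    Basis state k has n_plus = k and n_minus = n_total - k.
    """
    dim = n_total + 1
    m_l = [[0.0] * dim for _ in range(dim)]
    for n_plus in range(n_total):
        n_minus = n_total - n_plus
        # a_+^dagger a_- |n+, n-> = sqrt((n+ + 1) * n-) |n+ + 1, n- - 1>
        m_l[n_plus + 1][n_plus] = math.sqrt((n_plus + 1) * n_minus)
    # m_t = a_-^dagger a_+ is the adjoint of m_l (a real matrix, so the transpose)
    m_t = [[m_l[j][i] for j in range(dim)] for i in range(dim)]
    p_z = [[0.5 * (2 * i - n_total) if i == j else 0.0 for j in range(dim)]
           for i in range(dim)]
    return m_l, m_t, p_z

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def casimir(n_total):
    """(m_l m_t + m_t m_l)/2 + p_z^2, the right-hand side of (11.1.10)."""
    m_l, m_t, p_z = quasi_spin_operators(n_total)
    lt, tl, zz = matmul(m_l, m_t), matmul(m_t, m_l), matmul(p_z, p_z)
    n = len(p_z)
    return [[0.5 * (lt[i][j] + tl[i][j]) + zz[i][j] for j in range(n)]
            for i in range(n)]

for n_total in range(1, 5):
    j = 0.5 * n_total
    diagonal = [casimir(n_total)[i][i] for i in range(n_total + 1)]
    print(n_total, diagonal, j * (j + 1))
```

Each multiplet in this sketch has n_total + 1 = 2j + 1 states, and p_z runs from −j to +j in unit steps.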

The momentum operator (11.1.9) is the difference in the number of the two spin states. In Sec. 11.1.7 we will show that it is proportional to the relative velocity of an electron, which is found to be equal to its longitudinal polarization in the electroweak interaction.

Spin states can be classified into multiplets, each characterized by an eigenvalue of the operator $W^2$. The significance of the statement that each state can be chosen to be a simultaneous eigenfunction of both $p_z$ and $W^2$ is that the difference $W^2 - p_z^2$ is invariant under a Lorentz transformation.

What one would think of as the space of ‘space-time’ [Dirac 47] is really spanned by the Stokes parameters, and their mechanical counterparts of mass, momentum, and energy.

The Stokes parameters play a fundamental role in the characterization of polarized relativistic systems by separating the energy and momentum, which evolve according to Lorentz transformations, from the polarized mass components, which undergo rotational transformations.

No matter how enticing the analogy between the Stokes parameters and angular momentum operators may be, we have to realize that it is less than perfect because the former vary continuously, while the latter are discrete. Particles with zero rest mass can have only two states of polarization, ±W, while particles of finite mass have 2W + 1 states of polarization. The eigenvalues of the operator pz will have 2W + 1 values of the multiplet from W to −W for massive particles that are aligned parallel, anti-parallel, or normal to the direction of momentum.

11.1.4 Elastic strain

The distinction between vibratory motion in the direction of wave propagation and vibratory motions in directions normal to it can be understood by considering the nature of strain upon an elastic body.

If a displacement G satisfies the condition

∇ × G = 0, (11.1.11)

throughout a strained body, then no element in that body experiences rotation. Such a strain is said to be irrotational, or longitudinal. Alternatively, if the displacement G satisfies

∇ · G = 0, (11.1.12)

then no element in the strained body undergoes a change in volume. Such a strain is said to be solenoidal, circuital, or transversal.

Moreover, any vector field may be decomposed into purely longitudinal and transverse parts so that the most general type of strain is a superposition of the two. It is also possible to treat the two types of strains separately. It will then be found that the two types of disturbances will be propagated at different velocities so that if a single source emits both types of disturbances one will travel faster than the other.

Any wave equation with a single speed of propagation must, therefore, contain a single type of disturbance: either longitudinal or transversal.

If condition (11.1.11) is met everywhere in the body, the displacement can be represented as the gradient of a scalar potential, φ, viz.

G = ∇φ. (11.1.13)

The displacement will therefore occur in the direction normal to the surfaces φ = const., and if n is the unit normal we may write (11.1.13) as

\[
G = n\,\frac{\partial\varphi}{\partial n}.
\]

We will restrict our attention to infinitesimal strains, or those for which the squares and products of the derivatives ∇x = ∂/∂x, ∇y = ∂/∂y, and ∇z = ∂/∂z, of the displacement G will be negligible in comparison with the linear terms. Then, the principal elongations will be

λx = ∇xGx, λy = ∇yGy, λz = ∇zGz,

so that the cubic dilatation is simply their sum,

c = λx + λy + λz = ∇ · G = ∇ · ∇φ = ∇2φ. (11.1.14)

In other words, the cubic dilatation is equal to the divergence of the displacement, or to the Laplacian of the field, φ.

Neighboring values of φ are potential surfaces which split the body into a series of infinitely thin surfaces. If the displacement remains constant on any one of the surfaces, say x = const., and changes only when passing from one plane to the next, then the cubic dilatation reduces to ∇xGx. Now, the fact that the curl vanishes, (11.1.11), means that ∇xGy = ∇xGz = 0, so that Gy and Gz are constant. That is, the transverse components of the displacement are arbitrary constants, which, without any loss of generality, we may take as zero. Hence, only the longitudinal displacement Gx remains finite, indicating, for instance, that all the molecules of a lattice vibrate in the direction of wave propagation.

The lines corresponding to the principal axes are replaced by rotational ones for transverse displacements, where the lines are in the direction of the axis of rotation. The cubic dilatation, (11.1.14), vanishes, indicating that the volume of any portion of the body remains the same so that (11.1.12) applies. Thus, the generalized displacement can be represented as the curl of a vector A,

G = ∇ × A,

in contrast to longitudinal strain, (11.1.13). The vector potential, A, plays an analogous role to the scalar potential φ of longitudinal strain. Instead of the cubic dilatation in terms of the Laplacian, we now have the curl as the indicator of the intensity of rotational motion,

J = ∇ × ∇ × A = ∇ × G. (11.1.15)

Since

∇ × ∇ × = ∇(∇·) − ∇²,

if we introduce the auxiliary condition that the vector field is sourceless, ∇ · A = 0, we can write the rotation, or vortex, J as

J = −∇²A,

which is entirely analogous to (11.1.14) for cubic dilatation.

Equations (11.1.14) and (11.1.15) are the well-known Poisson equations, where if dV is an element of volume in which the rotation J does not vanish, (11.1.15) has the solution,

\[
A = \int \frac{J}{r}\,dV, \tag{11.1.16}
\]

for the vector potential, while, in the exactly analogous way, (11.1.14) has the solution,

\[
\varphi = -\int \frac{c}{r}\,dV, \tag{11.1.17}
\]

for the scalar potential, if the cubic dilatation, c, does not vanish in the volume element dV. In electrodynamics, the vector J represents the current density, and the scalar c stands for the charge density.

Again assume that the displacement G depends only on the x-coordinate. Since (11.1.12) holds,

\[
\nabla_x G_x = \frac{\partial G_x}{\partial x} = 0.
\]

This means that Gx is constant, which we can conveniently take to be zero. The displacement is therefore normal to the x-axis, lying in the yz-plane, and consisting of two non-zero components. The strain is said to be transversal, where, for instance, the particles ‘wiggle’ in the directions normal to the direction of propagation of the wave disturbance. As an easy reminder, we may say that a longitudinal disturbance needs two components to vanish on account of (11.1.11), whereas a transversal disturbance needs only one component to vanish on account of (11.1.12).
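The counting in this reminder can be written out explicitly. The following is a short worked form, assuming only that the displacement G depends on the x-coordinate alone, as in the two cases treated above:

```latex
% Displacement depending on x only: G = (G_x(x), G_y(x), G_z(x))
\[
\nabla \times G = \Bigl(0,\; -\frac{\partial G_z}{\partial x},\; \frac{\partial G_y}{\partial x}\Bigr),
\qquad
\nabla \cdot G = \frac{\partial G_x}{\partial x}.
\]
% Longitudinal strain, (11.1.11): two conditions,
%   \partial G_y/\partial x = \partial G_z/\partial x = 0,
% so G_y and G_z are constants and only G_x varies.
% Transversal strain, (11.1.12): one condition,
%   \partial G_x/\partial x = 0,
% so G_x is constant and G_y, G_z are free to vary.
```

The curl has two non-trivial components and the divergence only one, which is the origin of the "two components versus one component" rule.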

The foregoing discussion elicits an interpretation of longitudinal and transverse wave motion in terms of the underlying medium. It is the reason why the concept of an aether was so well received and widely accepted before relativity. Heaviside’s opinion sums up the tendency of the period to welcome the aether with open arms:

Aether is a wonderful thing. It may exist only in the imagination of the wise, being invented or endowed with properties to suit their hypotheses; but we cannot do without it. . . But admitting the aether to propagate gravity instantaneously, it must have wonderful properties, unlike anything we know.

So the aether was the deus ex machina upon which physical theories were built. No matter how unsuccessful were the experiments to “set the aether in motion,” it served both as a guide and crutch upon which to build physical theories so that its demise cannot be entirely looked upon as a positive move. Whether it exists between particles, or within them, it provided the trunk whose branches bore fruit. In particular, it led Maxwell to add on a new current to Ampere’s law, called by him the displacement current, and in so doing ‘closed the circuit,’ and allowed for electromagnetic wave propagation.

The aether and its conservation did play a role. It was said that Kelvin could not understand a phenomenon until he made a mental picture of the aether to which it corresponded. And the seminal idea of Kelvin, back in 1853, that energy can be stored in the field implied that there was a medium in which it could be stored. It also led to Maxwell’s abandoning a theory of gravity, as we saw in Sec. 3.8.1, and declaring that such a theory was beyond nineteenth-century physics, for it would imply that, due to the attractive nature of masses, the aether must store ‘negative’ energy!

So the aether was the medium in which energy could be stored, like a stretched rubber band that could give up energy upon request. But, since the abolishment of the aether, “we don’t have this invisible, convenient storage vault to make the field energy easier to ‘visualize.’ The field energy is, in this sense, a greater mystery for us today than it was for the Victorians” [Nahin 88].

According to Maxwell [65], the total field energy density,

\[
W = \frac{1}{8\pi}\bigl(\varepsilon_0 E^2 + \mu_0 H^2\bigr), \tag{11.1.18}
\]

is localized in space, but it can be far from any material whose dielectric constant and permeability are ε0 and µ0, respectively. The only remnant of the ‘material’ body lies in their product, $1/\sqrt{\varepsilon_0\mu_0} = c$, the speed of light in vacuo.

In many ways, the present-day vacuum in quantum field theory plays the role of the deceased aether. The essential assumption in the Higgs mechanism is that the ground state, or vacuum, is asymmetric, notwithstanding the fact that the Lagrangian is symmetric. The Higgs mechanism with its nonvanishing vacuum expectation value plays the role of the vector potential in an apparent analogy to spontaneous magnetization in ferromagnetism when the temperature is lowered below its critical value [cf. Sec. 11.5.2 below]. So in many ways the vacuum has replaced the aether. There may be many roads to a discovery that destroy the uniqueness of a single theory like that of general relativity [cf. Chapter 7]. Modern day tendencies are to replace aethers by field Lagrangians and let their symmetry, or better symmetry-breaking, be their deus ex machina.

It would not be inappropriate to recall the words of Heaviside [12] concerning the Lagrangians and the principle of least action:

Whether good mathematicians, when they die, go to Cambridge, I do not know. But it is well known that a large number of men go there when they are young for the purpose of being converted into senior wranglers and Smith’s prizemen. Now at Cambridge, . . . there is a golden or brazen idol called the Principle of Least Action. Its exact locality is kept secret, but numerous copies have been made and distributed amongst the mathematical tutors and lecturers at Cambridge, who make young men fall down and worship the idol.

How times have changed — and how times may yet change again!

11.1.5 Plane waves

Since electron spin appears as the counterpart of the polarization of light we might be inclined to use the Stokes parameters to characterize the polarization of elementary particles [Jauch & Rohrlich 55].

Consider a plane wave propagating in the positive z-direction with wave number κ and angular velocity ω, which is completely polarized. In optics it is necessary to consider four ‘amplitudes,’ Ex, Ey, Hx and Hy, each of which satisfies the wave equation. Rather than using the two components of the magnetic field, H, we may use the vector potential A which is related to it by A = (∇ × H)/κ². Since H = ∇ × A, this implies that H satisfies the reduced wave, or Helmholtz, equation,

\[
\nabla^2 H = -\nabla \times \nabla \times H = -\kappa^2 H,
\]

since ∇ · H = 0.

The non-vanishing components of the vector potential,

\[
A_x = \frac{a}{\kappa}\sin(\kappa z - \omega t), \qquad
A_y = \frac{b}{\kappa}\sin(\kappa z - \omega t + \delta),
\]

have amplitudes a and b, and phase δ. Since $E = -\dot{A}$, because φ ≡ 0, the non-vanishing components of the electric field are

Ex = a cos (κz − ωt), (11.1.19a)

Ey = b cos (κz − ωt + δ). (11.1.19b)

If the electric vector, E = Ex + Ey, rotates counter-clockwise when the observer is facing into the oncoming wave, such a wave is said to be left-circularly polarized. In the jargon of elementary particle physics it means that the particle has positive helicity, or that the spin of the particle is in the direction of the momentum. In contrast, if the rotation of the electric vector is clockwise when looking into the wave, the wave is said to be right-circularly polarized, or, equivalently, that the particle has negative helicity, in which case the spin is in the opposite direction to the momentum of the particle.

The stresses, formed from the products of (11.1.19a) and (11.1.19b), and averaged over a period of the motion, are the way Maxwell accounted for the ponderomotive forces of the electric field. The normal stress,

\[
J_z = \overline{E_x^2} - \overline{E_y^2} = \tfrac{1}{2}\bigl(a^2 - b^2\bigr), \tag{11.1.20}
\]

is related to the (radiation) pressure, while the tangential stress,

\[
J_x = 2\,\overline{E_x E_y} = ab\cos\delta, \tag{11.1.21}
\]

is the stress due to shearing. We emphasize that it is precisely through the Maxwell stresses, such as (11.1.20) and (11.1.21), that we can account for the actions of inertia in a theory which is otherwise completely devoid of it. Then you ask, stress on what? And here we return to the aether, not as a luminiferous, gaseous, aether, but an elastic, or ‘jelly-like,’ solid as Stokes liked to think of it.

The third component,

Jy = ab sin δ, (11.1.22)

is related to the projection of the angular momentum on the z-axis. It is the spin component of the angular momentum [cf. (11.5.47) below],

\[
S = \frac{1}{4\pi}\,E \times A = \frac{k}{4\pi\omega}\,ab\sin\delta, \tag{11.1.23}
\]

where k is the unit vector pointing in the z-direction. Finally, the fourth component, J, is related to the total energy,

\[
W = \frac{1}{8\pi}\bigl(\overline{E^2} + \overline{H^2}\bigr) = \frac{1}{8\pi}\bigl(a^2 + b^2\bigr). \tag{11.1.24}
\]
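The four quantities just introduced can be reproduced by averaging the plane-wave fields numerically over one period. The sketch below is illustrative only: the function name, the sampling scheme and the choice κ = ω = 1 are assumptions, and the overall factors 1/4π and 1/8π are dropped so that the fourth component is simply ½(a² + b²).

```python
import math

def averaged_stresses(a, b, delta, samples=360):
    """Period averages built from E_x = a cos(theta), E_y = b cos(theta + delta).

    theta stands for the phase kappa*z - omega*t; kappa = omega = 1 is assumed.
    """
    ex2 = ey2 = exey = cross = 0.0
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        ex = a * math.cos(theta)
        ey = b * math.cos(theta + delta)
        ax = a * math.sin(theta)           # kappa * A_x
        ay = b * math.sin(theta + delta)   # kappa * A_y
        ex2 += ex * ex
        ey2 += ey * ey
        exey += ex * ey
        cross += ex * ay - ey * ax         # z-component of E x A (times kappa)
    ex2, ey2 = ex2 / samples, ey2 / samples
    exey, cross = exey / samples, cross / samples
    j_z = ex2 - ey2      # normal stress, cf. (11.1.20)
    j_x = 2.0 * exey     # tangential stress, cf. (11.1.21)
    j_y = cross          # spin component, cf. (11.1.22) and (11.1.23)
    j_0 = ex2 + ey2      # proportional to the total energy, cf. (11.1.24)
    return j_0, j_x, j_y, j_z

j_0, j_x, j_y, j_z = averaged_stresses(2.0, 1.0, 0.7)
```

For a completely polarized wave the four numbers satisfy j_0² = j_x² + j_y² + j_z², the analogue of the conservation law (11.1.8).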

The direction of the flow of energy is determined by Poynting’s vector, which can arguably be also associated with the name of Heaviside,

\[
P = \frac{1}{4\pi}\,E \times H, \tag{11.1.25}
\]

as we explained in footnote 2 of this chapter. The magnetic intensity H = ∇ × A has components

Hx = −b cos (κz − ωt + δ), Hy = a cos (κz − ωt).

How are the two vectors (11.1.23) and (11.1.25) related?

The latter represents the linear momentum of the electromagnetic field per unit volume. The moment of the linear momentum density is the total angular momentum density,

\[
J = r \times P = \frac{1}{4\pi}\,(r \times E \times H).
\]

Expressing the magnetic force in terms of the vector potential, and using the vector identity

E × ∇ × A = ∇A · E − E · ∇A,

we get

r × P = r × ∇A · E + E · ∇A × r.

The first term is analogous to the orbital angular momentum density [Rohrlich 65],

L = r × ∇A · E, (11.1.26)

while the second term can be rewritten as

E · ∇A × r = ∇ · (EA × r) + r × A∇ · E + E · ∇r × A.

When integrated over the volume, the first term vanishes under the assumption that the fields vanish sufficiently fast at infinity (Maxwell’s infinite integrals to get finite quantities!), the second term vanishes in the absence of charges, and the third term is (11.1.23) since ∇r = 1 is the unit dyadic.

If (11.1.23) and (11.1.26) are to apply to a photon, they need to be reinterpreted. The spin of a photon is usually assumed to be 1. But, what does this mean in terms of the decomposition of the total angular momentum into its orbital and spin components? If we interpret (11.1.26) as spin itself, then (11.1.23) can be taken as the projection of the spin on a preferred direction, or the two components of the helicity of a photon.

Dividing (11.1.23) by (11.1.24) gives

\[
\frac{S}{W} = k\,\frac{2ab}{a^2 + b^2}\,\frac{\sin\delta}{\omega}.
\]

This was first derived by Abraham in the special case of circular polarization, and later generalized to a spherical wave by Sommerfeld [34]. The presence of the vector product in (11.1.23) implies that the spin is different from zero if the wave is other than linearly polarized.

The kind of polarization is determined by the phase, δ. A linearly polarized wave has δ = 0, and consequently, the intrinsic spin of the particle vanishes. By contrast, for δ = π/2 and a = b, the polarization ellipse degenerates into a circle resulting in a state of right-circular polarization, where the above ratio reaches a maximum of 1/ω. For the same condition on the amplitudes, but with a phase δ = −π/2, a state of left-circular polarization results.
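The dependence of the ratio S/W on the amplitudes and the phase is easy to tabulate. The following is a minimal sketch, with a hypothetical function name and an arbitrary illustrative value of ω, that evaluates the component of S/W along the direction of propagation for the cases just described.

```python
import math

def spin_to_energy_ratio(a, b, delta, omega):
    """Component of S/W along k: 2ab sin(delta) / ((a^2 + b^2) * omega)."""
    return 2.0 * a * b * math.sin(delta) / ((a * a + b * b) * omega)

omega = 2.0  # illustrative angular frequency in natural units

linear = spin_to_energy_ratio(1.0, 1.0, 0.0, omega)                   # delta = 0
circular_plus = spin_to_energy_ratio(1.0, 1.0, math.pi / 2, omega)    # a = b, delta = +pi/2
circular_minus = spin_to_energy_ratio(1.0, 1.0, -math.pi / 2, omega)  # a = b, delta = -pi/2
elliptical = spin_to_energy_ratio(2.0, 1.0, math.pi / 2, omega)       # a != b
```

The linear case gives zero, the two circular cases give ±1/ω, and any elliptical state falls strictly between these limits because 2ab ≤ a² + b².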

Finally, introducing Planck’s relation, W = ω, in natural units, gives the spin states S = ±k. These angular momenta correspond to helicities h = ±1, since there is no photon state with h = 0, because electromagnetic waves have only transverse fields. In other words, spin orthogonal to the direction of propagation for photons, as well as for all massless particles, does not exist.

11.1.6 Spherical waves

Next in line after plane waves, in regard to their simplicity, are spherical waves. They were originally thought to produce condensation waves. Kelvin suggested that the rapid charging of two conducting spheres connected to an alternating dynamo would produce waves of compression, just as the rapid back-and-forth actions of a piston in a cylindrical cavity would do. Only here, it would be the rapid alternating charging that would be the seat of compressional waves.

Compressional waves in electromagnetism were loathsome to Heaviside, and he rejected them outright:

There are no ‘longitudinal’ waves in Maxwell’s theory analogous to sound waves. Maxwell took good care that there should not be any.

The ability of a changing electric field to induce a magnetic field, and of a changing magnetic field to induce an electric field, creates radiation and prohibits the formation of condensational waves. The radiation components of the electric and magnetic fields are not dependent upon charge and current, respectively. Rather, they are cut loose from these sources so that electric and magnetic variations influence one another and enable radiation to travel unlimited distances for unlimited amounts of time. This attests to the absence of mass of the photon.

Maxwell’s circuital equations inevitably lead to a wave equation. This was an oversimplification for Gauss, but sufficient for Maxwell’s needs [cf. Sec. 1.2.1]. In the spherically symmetric case to be treated in Sec. 11.5.1, the wave equation has the solution of the product of a spherical Bessel function of order ℓ = 1, and a spherical harmonic in which m = −1, 0, 1. But, from its derivation from the circuital equations, the state m = 0 is missing, for if it did exist it would correspond to the state of zero helicity. Yet, if we allow for a new current, which is indicative of compressional waves, the state m = 0 will make its appearance. In Sec. 11.5.2 we will appreciate that a current proportional to the vector potential A introduces mass by introducing dispersion, whereas a term proportional to −∇ · A is analogous to a hydrostatic pressure, which, by itself, is not related to either incompressible or compressible fluid flow [Landau & Lifshitz 59].

In the standard theory of the electroweak interaction, the appearance of the longitudinal mode with h = 0 occurs as a result of the breaking of gauge invariance. In so doing the gauge fields acquire mass, but cannot propagate unless their frequencies exceed the mass created. If our interpretation of longitudinal modes in Maxwell’s equation is correct, the appearance of mass has absolutely nothing to do with the appearance of a longitudinal mode with h = 0. In Sec. 11.5.2 we will analyze the putative analogy between the superconducting state in the Meissner effect and the vacuum state of the Higgs field in the symmetry breaking mechanism of electroweak theory. We will conclude that the analogy is evanescent.

11.1.7 β-decay and parity violation

Another example of the relation between the relative velocity and the normal stress (11.1.20) is afforded by parity violation in the weak interaction.

Fig. 11.3. Weak β-decay of the neutron. In Fermi’s theory this occurs at a single point where the emission of an electron-antineutrino pair is analogous to electromagnetic photon emission.

Weak interactions first made their appearance in nuclear β-decay. Fermi’s theory models β-decay as analogous to an electromagnetic transition of an excited atom. However, instead of ejecting a photon, an electron-antineutrino pair is ejected. The most elementary example of β-decay is neutron decay, shown in Fig. 11.3, where a neutron, n, decays into a proton, p, an electron, e, and an antineutrino, $\bar{\nu}_e$,

\[
n \longrightarrow p + e + \bar{\nu}_e.
\]

The inverse reaction,

\[
p \longrightarrow n + \bar{e} + \nu_e,
\]

where $\bar{e}$ is the positron and $\nu_e$ the neutrino, cannot be observed outside of the nucleus because the proton is lighter than the neutron. Inside the nucleus, the proton can ‘borrow’ the necessary energy from the rest of the nucleus.

To investigate parity violation one studies the β-decay of an unstable nucleus with a large spin that can be polarized so that it points in a specified direction. A good candidate is Co⁶⁰, which is polarized so that its spin points in the direction of an applied magnetic field, B, as shown in Fig. 11.4.

When the nucleus decays it emits an electron with momentum p. The experiment consists in determining the directional distribution of this momentum. The emission probability per unit solid angle, dP/dΩ, is a 2 × 2 matrix, whose most general form,

\[ \frac{dP}{d\Omega} = A\,I + B\,\mathbf{s}\cdot\mathbf{p}, \]

contains arbitrary, but positive, constants, A and B, where I is the unit matrix, and the spin is defined in terms of the Pauli matrices as s = σ/2. More electrons will be emitted into one of the hemispheres, either above or below the xy-plane.

Fig. 11.4. The decay of polarized cobalt.

This is a violation of parity inversion. For if the coordinate axes are inverted, the momentum p being a polar vector will change sign, but the spin s does not because it is an axial vector like angular momentum. Hence, under parity inversion the probability per unit solid angle will be AI − Bs · p, and is not an invariant.

Another experimental possibility is to measure the polarization of electrons that are emitted from unpolarized nuclei. In the case of Co⁶⁰, β-decay would yield

\[ \mathrm{Co}^{60} \longrightarrow \mathrm{Ni}^{60} + e + \bar{\nu}_e. \]

For there to be conservation of momentum, the recoil of the Ni⁶⁰ nucleus must be such that

\[ \mathbf{p}_{\mathrm{Ni}} + \mathbf{p}_{e} + \mathbf{p}_{\bar{\nu}_e} = 0. \]

The momenta define a plane called the decay plane which is shown in Fig. 11.5.

Fig. 11.5. The decay plane of cobalt 60.

Suppose that the initial state of Co⁶⁰ is unpolarized so that it has no preferential direction. Neither do the linear momenta so that leaves only the spin of the electron. Being an axial vector, reflection through the origin will have no effect on it, but a rotation about the n-axis will, so that if parity is conserved, the electron spin must be pointing in the direction n, normal to the decay plane. This means that there can be no polarization of the electron along the direction of its momentum.

However, parity conservation was found broken, and the electron has a longitudinal polarization equal to −u, the negative of the relative speed, u. This means that the state of helicity h = −1/2 is more populated than the state of helicity h = +1/2. If a² denotes the number of electrons with helicity h = −1/2, and b² those with helicity h = +1/2, what is experimentally open to measurement is the relative velocity,

\[ u = \frac{a^2 - b^2}{a^2 + b^2} = \frac{J_z}{W}, \tag{11.1.27} \]

where Jz and W are given by (11.1.20) and (11.1.24), respectively. Similar experiments involving the conversion of a proton to a neutron show that positrons are also longitudinally polarized, but with opposite polarization, +u.

Now, if we solve (11.1.27) for the square of the ratio, b/a, we easily find

\[ \frac{b^2}{a^2} = \left(\frac{1 - u}{1 + u}\right), \tag{11.1.28} \]

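which is the square of the longitudinal Doppler shift, a result to be expected. Furthermore, if we decompose the wave function into orthogonal components of the spin up |u⟩ and spin down |d⟩,

\[ \psi = a\,|d\rangle + b\,|u\rangle, \]

we can determine the orientation of spin in relation to the z-axis, say, by solving the eigenvalue equation,

\[ \mathbf{s}\cdot\mathbf{p} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} p_z & p_x - i p_y \\ p_x + i p_y & -p_z \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = W \begin{pmatrix} a \\ b \end{pmatrix}. \]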

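The ratio,

\[ \frac{b}{a} = \frac{W - p_z}{p_x + i p_y} = \frac{p_x - i p_y}{W + p_z}, \tag{11.1.29} \]

can be evaluated by introducing the spherical coordinates

\[ p_x = W \sin\vartheta \cos\varphi, \qquad p_y = W \sin\vartheta \sin\varphi, \qquad p_z = W \cos\vartheta. \tag{‡} \]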

And when this is done, we get the stereographic projection formula,

\[ \begin{aligned} \frac{b}{a} &= \frac{1 - \cos\vartheta}{\sin\vartheta}\, e^{-i\varphi} = \frac{\sin\vartheta}{1 + \cos\vartheta}\, e^{-i\varphi} \\ &= \tan(\vartheta/2)\, e^{-i\varphi} = \left(\frac{1 - \cos\vartheta}{1 + \cos\vartheta}\right)^{1/2} e^{-i\varphi}, \end{aligned} \tag{11.1.30} \]

for the orientation of the spin with respect to the z-axis, shown in Fig. 11.6.

Fig. 11.6. The spherical coordinates used to describe the orientation of spin.

The expressions in the first line of (11.1.30) are not only the half-angle formulas for the tangent, but also transcriptions of (11.1.29). They show that the total energy,

\[ W^2 = p_z^2 + p_x^2 + p_y^2, \tag{11.1.31} \]

is that of a relativistic, massless, particle. But, wait, appearances can be deceiving. The second line of (11.1.30) is the formula for stereographic projection. We know from Sec. 2.2.3 that stereographic projection is a conformal map of the surface. In regard to Fig. 7.5, a would be the diameter of the sphere, 2R, and b would be the point on the plane where the projection is made.

Comparing the last expression in (11.1.30) with the square root of (11.1.28) identifies

\[ u = \cos\vartheta. \tag{11.1.32} \]

So the last equation in (‡) is Wu = pz. However, we know that Wu = p, so we have to identify the total momentum, p, with the momentum in the z-direction, pz. This is obvious because we arranged our axes so that the momentum will be pointing in the z-direction. Then what are the remaining two terms in (11.1.31)?

The electron cannot be ultrarelativistic because u < 1, and because it is an electron it must have mass. We are therefore led to conclude that the last two terms in (11.1.31), if they are non-zero, must be related to the mass of the electron. If we set px² + py² = m², (11.1.31) becomes the relativistic expression for the energy of a massive particle whose momentum is pz = p. This is the origin of mass polarization. Consequently, our transformation to spherical coordinates becomes

\[ p_x = W\sqrt{1 - u^2}\,\cos\varphi = m\cos\varphi, \]

\[ p_y = W\sqrt{1 - u^2}\,\sin\varphi = m\sin\varphi, \]

\[ p_z = Wu = p. \]

Furthermore, the last equality in (11.1.30) with the identification (11.1.32) and the definition of hyperbolic distance enable it to be written as

\[ \frac{b}{a} = \left(\frac{1 - u}{1 + u}\right)^{1/2} e^{-i\varphi} = e^{-(\bar{u} + i\varphi)}, \tag{11.1.33} \]

where ū is the hyperbolic measure of the velocity in a velocity space with absolute constant unity.

It also makes

\[ \vartheta = \cos^{-1} u \]

the angle of parallelism, and has converted the formula for stereographic projection, (11.1.30), into the Bolyai–Lobachevsky formula for the angle of parallelism,

\[ \tan[\vartheta(\bar{u})/2] = e^{-\bar{u}}, \]

where the angle ϑ is a sole function of the hyperbolic velocity, ū.
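The chain from (11.1.28) to the angle of parallelism can be checked numerically. The following Python sketch assumes a relative velocity 0 < u < 1 in units where the absolute constant is unity, and takes the hyperbolic measure of the velocity to be tanh⁻¹ u; the function names are illustrative and do not come from the text.

```python
import math

def rapidity(u):
    """Hyperbolic measure of the velocity, artanh(u), for |u| < 1."""
    return math.atanh(u)

def angle_of_parallelism(u):
    """Angle theta with cos(theta) = u, as in (11.1.32)."""
    return math.acos(u)

def helicity_ratio_squared(u):
    """b^2 / a^2 from (11.1.28)."""
    return (1.0 - u) / (1.0 + u)

u = 0.6  # illustrative value, not taken from the text
theta = angle_of_parallelism(u)
u_bar = rapidity(u)

# The two half-angle forms in the first line of (11.1.30), phase factor omitted
first = (1.0 - math.cos(theta)) / math.sin(theta)
second = math.sin(theta) / (1.0 + math.cos(theta))

# Bolyai-Lobachevsky formula: tan(theta / 2) = exp(-u_bar)
lhs = math.tan(theta / 2.0)
rhs = math.exp(-u_bar)

print(first, second, lhs, rhs, math.sqrt(helicity_ratio_squared(u)))
```

For u = 0.6 all five printed values agree with 0.5 to rounding error, which is the modulus of b/a in (11.1.33).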

Moreover, if the ratio is real, ϕ = 0 and we are dealing with plane polarization; if it is imaginary, u = 0 and W² = mm, and the polarization is circular with ±i for right- and left-circular polarization; and finally if it is complex we are dealing with elliptic polarization.

11.2 Stokes Parameters and Their Physical Interpretations

Unwittingly we have derived expressions for the famous Stokes parameters in Sec. 11.1.5. For the derivation of these parameters it is sufficient to consider only the components of the electric field since the effect of light on molecules is to cause a redistribution of static charges. Before it was known that light was an electromagnetic phenomenon, Stokes considered that, for linearly polarized light, the electric vector is oriented along the polarization axis of the light. For light propagating along the z-axis, (11.1.19a) and (11.1.19b) describe right-linearly polarized light along the x- and y-axes, respectively. The relative magnitudes of these two components determine the orientation of the polarization axis.

There are various forms of light polarization, and all can be represented as linear combinations of the orthogonal components, (11.1.19a) and (11.1.19b). The extreme cases are linearly polarized light, where one of the components vanishes, and circularly polarized light, where they become equal. In general, light will be elliptically polarized, and the square root of (11.1.20) will be proportional to the eccentricity of the ellipse. According to Stokes’s definition, (11.1.20) represents the difference in intensities between horizontal and vertical linearly polarized components. Stokes interpreted (11.1.21) as the difference in intensities between linearly polarized components oriented at angles ±45°. What we have referred to as spin, (11.1.22), to Stokes was the difference in intensities between right- and left-circularly polarized components. Finally, the total energy, (11.1.24), is proportional to the total intensity.

The following table summarizes the Stokes parameters:

J                    total intensity
Q ≡ Jh − Jv          difference in horizontal and vertical polarized light intensities
U ≡ J+45 − J−45      difference in intensities of linearly polarized components oriented at ±45°
V ≡ Jr − Jl          difference in right- and left-circularly polarized light intensities

The three Stokes parameters therefore measure the ‘preference’ of the light wave to be horizontal, linearly-polarized at an angle +45°, and right-circularly polarized [Shurcliff 62]. All components can be obtained by combining the orthogonal components of the electric vector. The not so obvious one is (11.1.20), for it appears to require the vector potential. Actually, it represents the difference in intensities of right- and left-circularly polarized light. In this way it makes spin point in the direction of the momentum of a particle and spin point in the opposite direction of a particle, or what is referred to as ‘helicity,’ synonymous to right-circular and left-circular polarization, respectively.

Consider the linear combinations,

\[ C_1 = (E_x - iE_y)/\sqrt{2}, \qquad C_2 = (E_x + iE_y)/\sqrt{2}. \]

The intensities of right- and left-circular components are Jr = C1C1* and Jl = C2C2*. Introducing Ex = a and Ey = be^{iδ} results in

\[ J_r = ab\sin\delta + W, \]

\[ J_l = -ab\sin\delta + W. \]
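The intermediate step can be written out explicitly. The following is a sketch that assumes a and b are real amplitudes and that the starred quantity is the complex conjugate; the identification of W with ½(a² + b²) is read off from the result rather than quoted from (11.1.24).

```latex
\begin{aligned}
J_r &= C_1 C_1^{*}
     = \tfrac{1}{2}\,\bigl(a - i b\,e^{i\delta}\bigr)\bigl(a + i b\,e^{-i\delta}\bigr) \\
    &= \tfrac{1}{2}\,\bigl[a^2 + b^2 + i a b\,\bigl(e^{-i\delta} - e^{i\delta}\bigr)\bigr]
     = \tfrac{1}{2}\,(a^2 + b^2) + ab\sin\delta , \\
J_l &= C_2 C_2^{*}
     = \tfrac{1}{2}\,\bigl(a + i b\,e^{i\delta}\bigr)\bigl(a - i b\,e^{-i\delta}\bigr)
     = \tfrac{1}{2}\,(a^2 + b^2) - ab\sin\delta .
\end{aligned}
```

Under these assumptions the common term is W = ½(a² + b²), and the two intensities differ only in the sign of ab sin δ.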

One-half of their difference, ½(Jr − Jl), is (11.1.22), and one-half their sum is (11.1.24). Actually, the Stokes vector is defined as twice this value.

For unpolarized light, the polarization-dependent terms (11.1.20), (11.1.21), and (11.1.22) will all vanish, while for partially polarized light,

\[ J^2 \ge J_x^2 + J_y^2 + J_z^2, \]

which can be understood when one considers partially polarized light as consisting of two beams, one which is completely polarized and the other unpolarized. The contribution of each of these beams to the magnitude of the total beam determines the degree of polarization. The equality sign applies to the state of complete polarization, where J can be looked upon as a radius vector of a sphere with coordinates (Jx, Jy, Jz). Points on this sphere will correspond to specific states of polarization.

Linearly and circularly polarized light can be converted into one another through the use of retarders, such as a quarter-wave plate. A quarter-wave plate increases the phase of one linear component by 90° with respect to the other. Retardation is caused by the refractive index of a material. When light passes from vacuum into matter, the speed of light is reduced in proportion to the inverse of the refractive index. This we saw was the determining factor in accepting the wave theory of light over the corpuscular theory. Since the frequency remains the same, the phase angle changes more rapidly with position inside the body than it does in the vacuum. The increase in the phase of light as it traverses the body appears as a retardation of light. The Poincaré sphere was designed by its discoverer to calculate the effects of the retarder on polarized light.


Fig. 11.7. The Poincaré sphere is the parametrization of the Stokes parameters in elliptic geometry.

According to Poincaré [92],ᵈ a state of polarization can be represented by a point on a sphere whose radius is given by the intensity εJ, where J is the total intensity, and ε is the degree of polarization. The Poincaré sphere is shown in Fig. 11.7, where each point on the sphere denotes a specific type of polarization. The polarization is specified by the azimuth ψ, ellipticity, and handedness, either left or right. On the sphere this is given by the angles 2ψ and 2χ, which are the longitude and latitude, respectively. The factor 2 in the longitude indicates that any polarization ellipse is indistinguishable from one rotated by π radians. The azimuth, ψ, is the inclination of the semimajor axis of the polarization ellipse with respect to the x-axis, as seen in Fig. 11.8. The factor of 2 multiplying the latitude, χ, indicates that the same polarization ellipse can be obtained by interchanging the semimajor and semiminor axes, and rotating it through π/2 radians.

ᵈ Poincaré became interested in optics as a result of the lectures he gave at the Sorbonne during the years 1888, 1889, and again in 1899. It seems like each time he taught a new course new discoveries were to be made.

The four Stokes parameters are denoted by J, Q, U, and V, and determine a polarized state on the surface of the sphere according to

\[ J = a^2 + b^2, \tag{11.2.1a} \]

\[ Q = J\varepsilon \cos 2\chi \cos 2\psi = a^2 - b^2, \tag{11.2.1b} \]

\[ U = J\varepsilon \cos 2\chi \sin 2\psi = 2ab\cos\delta, \tag{11.2.1c} \]

\[ V = J\varepsilon \sin 2\chi = 2ab\sin\delta. \tag{11.2.1d} \]

Fig. 11.8. The polarization ellipse swept out by the electric field vector which is enclosed by a rectangle of sides 2a and 2b. The transformation to new electric vector components E′x and E′y consists in a counter-clockwise rotation through the angle ψ.

The first set of equalities are spherical coordinates of latitude 2χ and longitude 2ψ, as shown in Fig. 11.7. Any two diametrically opposite points on the sphere represent an orthogonal pair of polarization forms. There is a direct correlation between any point on the sphere and the form of polarization.

The second set of equalities express the Stokes parameters in terms of the horizontal and vertical components, a and b, of the electric vector and the phase angle, δ, between them. Expression (11.2.1a) is just the total intensity, expressed as the sum of the squares of a and b. If the electric vibration is horizontal, (11.2.1b), normalized by J, becomes 1, while if the vibration is totally vertical it becomes −1. It vanishes for circular polarization, a = b, and for elliptical polarization with a major axis at ±π/4. It thus expresses the preference for a horizontal, as compared to a vertical, vibration. Expression (11.2.1c) expresses the preference for +π/4 vibration, while (11.2.1d) that of right circular polarization.

Alternatively, they can be given the density matrix representation,

\[ \rho = \frac{1}{2} \begin{pmatrix} J + Q & U + iV \\ U - iV & J - Q \end{pmatrix}, \tag{11.2.2} \]

whose trace is the total intensity J, and whose determinant is ¼(J² − V² − Q² − U²) ≥ 0, where the equality sign applies to the case of complete polarization. The property that the Stokes parameters form an invariant four-vector will be of great usefulness to our development.
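The relations (11.2.1a)–(11.2.1d) and the matrix (11.2.2) are easy to exercise numerically. The sketch below assumes completely polarized light (ε = 1) with real amplitudes a, b and phase δ; the helper names `stokes`, `density_matrix`, and `trace_and_det` are hypothetical and chosen only for illustration.

```python
import math

def stokes(a, b, delta):
    """Stokes parameters (J, Q, U, V) of a fully polarized wave, per (11.2.1a)-(11.2.1d)."""
    J = a * a + b * b
    Q = a * a - b * b
    U = 2.0 * a * b * math.cos(delta)
    V = 2.0 * a * b * math.sin(delta)
    return J, Q, U, V

def density_matrix(J, Q, U, V):
    """The 2x2 matrix of (11.2.2) as nested lists."""
    return [[0.5 * (J + Q), 0.5 * complex(U, V)],
            [0.5 * complex(U, -V), 0.5 * (J - Q)]]

def trace_and_det(rho):
    """Trace and (real part of the) determinant of a 2x2 matrix."""
    trace = rho[0][0] + rho[1][1]
    det = rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]
    return trace, det.real

# Illustrative amplitudes and phase (assumed values, not from the text)
J, Q, U, V = stokes(2.0, 1.0, math.pi / 3)
trace, det = trace_and_det(density_matrix(J, Q, U, V))
print(J, Q, U, V)
print(trace, det)
```

For these inputs the trace reproduces J and the determinant vanishes to rounding error, as expected for complete polarization, where J² = Q² + U² + V².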

The Poincaré sphere is constructed by projecting points that define a light vector in the complex plane onto a real three-dimensional sphere. As we have seen in Sec. 11.1.5, every type of polarization can be described by orthogonal components of the electric vector. If we represent the ratio of the two components as

\[ \frac{E_y}{E_x} = \frac{b}{a}\, e^{i\delta}, \tag{11.2.3} \]

we can then map the ratio Ey/Ex onto the complex plane consisting of axes u and v, where

\[ E_y/E_x = (b/a)(\cos\delta + i\sin\delta) := u + iv. \]

Every possible form of polarization is represented in the uv-plane. Any circle whose center is at the origin has radius b/a. Moreover, the phase δ is the angle between the u-axis and the radius vector [cf. Fig. 11.9 below]. All possible values of b/a are obtained by considering an infinite number of concentric circles about the origin, and all possible values of δ are realized by sweeping the radius vector around each of the concentric circles.

Whereas the origin has b = 0, and therefore represents linearly horizontal polarized light, values of u and v which are infinite require a = 0, and therefore represent vertically polarized light. All states in the upper half-plane, v > 0, represent phase differences between 0 and π, and are right-handed polarizations, while all states in the lower half-plane correspond to phase differences between −π and 0, and so represent left-handed polarizations. In regard to the Stokes parameters V and U, the intersection of the unit circle (a = b) with the v-axis represents states of right-circularly (north pole) and left-circularly (south pole) polarized light, while intersections of the unit circle with the u-axis represent +π/4 (east) and −π/4 (west) linearly polarized light, as shown in Fig. 11.9. Thus, any polarized state can be identified with a point in the complex plane.

We know from Sec. 2.2.3 that by a stereographic projection any polarized state can be projected onto a Riemann sphere, only in this case it is called the Poincaré sphere! Circles in the uv-plane whose centers lie on the u- and v-axes project into lines of longitude or lines of latitude, respectively, on the Poincaré sphere. Consider the former case first. A circle whose center is (u0, 0) cuts points (0, 1) and (0, −1), which represent right- and left-circularly polarized light projected onto the north and south poles of the sphere. Every circle, or longitudinal line, will be characterized by two values of the azimuth ψ of the characterizing polarization ellipse. One value represents points in the right hemisphere, while the other represents points in the left hemisphere.

Fig. 11.9. Complex plane representation of polarized states.

To find the value of u0, which we guess will be given by the formula for stereographic projection, we must solve the equation for a circle whose radius is √(u0² + 1), i.e.

\[ (u - u_0)^2 + v^2 = u_0^2 + 1. \]

Introducing the definitions of u = (b/a) cos δ and v = (b/a) sin δ into this formula for a circle results in

\[ \frac{b^2 - a^2}{2ab\cos\delta} = -\frac{Q}{U} = -\cot 2\psi = u_0. \]


Since 2ψ lives in the semi-open interval (−π, π], the two values of the azimuths are ψ = −½ cot⁻¹ u0, and ψ = −½(cot⁻¹ u0 ± π), where the + (−) sign applies to u0 > 0 (u0 < 0).

Therefore, the longitude onto which each circle centered on the u-axis in the complex plane at (−cot 2ψ, 0) projects is 2ψ, and the circle has a radius √(cot² 2ψ + 1) = csc 2ψ.

Now consider the second case where circles centered on the v-axis at (0, v0) project into parallels of latitude on the Poincaré sphere where |v0| > 1. The radius of each circle is r = √(v0² − 1) so that the equation of the circle is

\[ u^2 + v^2 - 2v_0 v = -1. \]

Again introducing the definitions of u and v in terms of the polarizing ellipse leads to a² + b² = 2ab sin δ v0, or

\[ \frac{a^2 + b^2}{2ab\sin\delta} = \frac{J}{V} = \csc 2\chi = v_0. \]

The radius of the circle in the uv-plane is r = √(csc² 2χ − 1) = cot 2χ.

Therefore, the latitude to which a circle of radius r = cot 2χ, centered at (0, csc 2χ), is projected onto the Poincaré sphere is given by the angle 2χ.
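Both circle equations can be verified for an arbitrary polarization state. The following Python sketch assumes real amplitudes a, b and a phase δ for which neither U nor V vanishes; the residual functions are illustrative names and simply evaluate the two circle equations with u0 = −Q/U and v0 = J/V.

```python
import math

def uv_point(a, b, delta):
    """Point u + iv = (b/a) e^{i delta} representing Ey/Ex in the complex plane."""
    return (b / a) * math.cos(delta), (b / a) * math.sin(delta)

def longitude_circle_residual(a, b, delta):
    """Residual of (u - u0)^2 + v^2 = u0^2 + 1 with u0 = -Q/U."""
    u, v = uv_point(a, b, delta)
    Q = a * a - b * b
    U = 2.0 * a * b * math.cos(delta)
    u0 = -Q / U
    return (u - u0) ** 2 + v ** 2 - (u0 ** 2 + 1.0)

def latitude_circle_residual(a, b, delta):
    """Residual of u^2 + v^2 - 2 v0 v = -1 with v0 = J/V."""
    u, v = uv_point(a, b, delta)
    J = a * a + b * b
    V = 2.0 * a * b * math.sin(delta)
    v0 = J / V
    return u * u + v * v - 2.0 * v0 * v + 1.0

# Illustrative state (assumed values): both residuals should vanish to rounding error
res_long = longitude_circle_residual(2.0, 1.0, math.pi / 3)
res_lat = latitude_circle_residual(2.0, 1.0, math.pi / 3)
print(res_long, res_lat)
```

Every state therefore lies at the intersection of one circle of each family, which is what allows the pair (2ψ, 2χ) to serve as longitude and latitude.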

The stereographic projection of points on the uv-plane onto the Poincaré sphere is shown in Fig. 11.10. The complex plane bisects the sphere in such a way that its center coincides with that of the sphere. The orientation of the sphere is such that the +y axis of the sphere coincides with the +u-axis, and the +z-axis with the +v-axis. A point P on the plane is projected to a point P′ on the sphere by extending the line connecting P and V, where V denotes vertically polarized light, and H, horizontally polarized light. Thus, the latitude of the point P′ is given by the angle 2χ formed from the vector from O to P′ and the projection of this vector onto the xy-plane, where positive angles are measured for increasing z.


Fig. 11.10. Stereographic projection of the complex plane onto the Poincaré sphere.

11.3 Poincaré’s Representation and Spherical Geometry

The mixing of (11.1.21) and (11.1.22) does not reflect their original definitions as Maxwell stresses and (11.1.20) as the momentum. Rather, if we introduce the Poincaré representation with the angle variables (2χ, 2ψ), which are related to (2ϑ, 2ϕ) by the right-spherical triangle shown in Fig. 11.1, we will get

\[ J_z = W \cos 2\chi \cdot \cos 2\psi = \frac{1}{2}\left(a_+^{\dagger} a_+ - a_-^{\dagger} a_-\right), \tag{11.3.1a} \]

\[ J_x = W \cos 2\chi \cdot \sin 2\psi = \frac{1}{2}\left(a_+^{\dagger} a_- + a_-^{\dagger} a_+\right), \tag{11.3.1b} \]

\[ J_y = W \sin 2\chi = -\frac{1}{2}\left(a_+^{\dagger} a_- - a_-^{\dagger} a_+\right), \tag{11.3.1c} \]

in place of (II′). Whereas expression (11.3.1a) corresponds to the xx-component of the Maxwell stress,

\[ \sigma_{xx} = \frac{1}{8\pi}\left(E_y^2 - E_x^2\right), \]

(11.3.1b) is the amount of x-momentum that flows in the y direction,

\[ \sigma_{xy} = \frac{1}{4\pi} E_x E_y. \]

Due to symmetry, this is equal to the amount of y-momentum that flows in the x direction. Finally, expression (11.3.1c) is the angular momentum operator. The same configuration repeats itself by mixing the tangential and normal Maxwell stresses in the plane normal to the invariant momentum operator.

The mixing of the normal and tangential stresses with the spin and total energy corresponds to the rotation of light through an angle 2ψ about its direction of propagation. For example, light can be passed through a crystal plate with simple rotatory power [Perrin 42], where

\[ J' = J, \]

\[ J'_z = J_z \cos 2\psi - J_x \sin 2\psi, \tag{11.3.2a} \]

\[ J'_x = J_z \sin 2\psi + J_x \cos 2\psi, \tag{11.3.2b} \]

\[ J'_y = J_y. \]

The rotations of the normal and tangential stresses are quite different from those predicted by relativity [McCrea 47]. Whereas (II′) elicits a mechanical interpretation, (11.3.1a)–(11.3.1c) requires an electromagnetic interpretation. In the former, the axis normal to the mixing of the two mass components was the momentum, whereas, in the latter, it is the two stress components that lie in a plane normal to the spin.

Whereas the square of the orthogonal vectors corresponds to the relativistic mass relation, the sum of the squares of the normal and tangential Maxwell stress components is

\[ J_z^2 + J_x^2 = W^2 \cos^2 2\chi = W^2\left(\cos^2 2\vartheta \sin^2 2\varphi + \cos^2 2\varphi\right). \tag{11.3.3} \]

Averaging (11.3.3) over all directions of polarization, by integrating over all ϕ, where

\[ \overline{\cos^2 2\varphi} = \overline{\sin^2 2\varphi} = \frac{1}{2}, \]

gives

\[ J_x^2 + J_z^2 = W^2\left(\frac{1 + \cos^2 2\vartheta}{2}\right). \tag{11.3.4} \]

If 2ϑ is interpreted as the angle of scattering with respect to the direction of propagation of the primary beam whose intensity is W², then (11.3.4) gives the intensity of a scattered beam. Observing that

\[ J_y^2 = \frac{1}{2} W^2 \sin^2 2\vartheta, \]

we obtain Rayleigh’s expression [Born & Wolf 59],

\[ \varepsilon = \frac{J_y^2}{J_x^2 + J_z^2} = \frac{\sin^2 2\vartheta}{1 + \cos^2 2\vartheta}, \tag{11.3.5} \]

for the degree of polarization, although Rayleigh derived it in a different way.
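The averaging that leads from (11.3.3) to (11.3.5) can be reproduced with a simple uniform sum over the polarization angle. The sketch below assumes W = 1 by default and a uniform grid in 2ϕ over a full period; `averaged_intensities` is an illustrative name, and Jy² is obtained as W² − (Jx² + Jz²), which follows from (11.3.1a)–(11.3.1c).

```python
import math

def averaged_intensities(two_theta, W=1.0, n=3600):
    """Average Jy^2 and Jx^2 + Jz^2 over 2*phi using the right-hand side of (11.3.3)."""
    jy2 = 0.0
    jxz2 = 0.0
    for k in range(n):
        two_phi = 2.0 * math.pi * k / n
        cos2chi_sq = (math.cos(two_theta) ** 2 * math.sin(two_phi) ** 2
                      + math.cos(two_phi) ** 2)
        jxz2 += W * W * cos2chi_sq          # Jx^2 + Jz^2 = W^2 cos^2(2 chi)
        jy2 += W * W * (1.0 - cos2chi_sq)   # Jy^2 = W^2 sin^2(2 chi)
    return jy2 / n, jxz2 / n

def rayleigh(two_theta):
    """Closed form (11.3.5) for the degree of polarization."""
    return math.sin(two_theta) ** 2 / (1.0 + math.cos(two_theta) ** 2)

two_theta = math.pi / 3  # illustrative scattering angle
jy2, jxz2 = averaged_intensities(two_theta)
print(jy2 / jxz2, rayleigh(two_theta))
```

For a scattering angle of π/3 both numbers come out as 0.6 to rounding error, since sin² = 3/4 and 1 + cos² = 5/4.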

Moreover, by a change of coordinates we have gone from a hyperbolic to an elliptic space. Consider again the elliptic velocity right triangle in Fig. 11.1. The angle 2ϕ at the origin will have the same elliptic measure as the Euclidean measure. The cosine of the angle is cos 2ϕ = tan 2ψ/tan 2ϑ. The same, however, is not true for the non-central angle 2ω for it will undergo distortion, and so, too, the side 2χ.

Its cosine will be given by

\[ \cos 2\omega = \frac{\tan 2\chi}{\tan 2\vartheta} \sec 2\psi = \frac{\sin 2\chi}{\sin 2\vartheta}, \]

where the last relation follows from the elliptic Pythagorean theorem, cos 2ϑ = cos 2ψ · cos 2χ. For the elliptic angle, ω̄, there is no distortion so that cos 2ω̄ = tan 2χ/tan 2ϑ. It therefore follows that the relation between the two measures of the angle is

\[ \cos 2\omega = \cos 2\bar{\omega} \cdot \sec 2\psi. \]

Since sec 2ψ ≥ 1, cos 2ω > cos 2ω̄, and because the cosine is a decreasing function on the interval (0, π/2), ω < ω̄. This is the origin of the angle excess in elliptic geometry. And just like hyperbolic geometry, the angles of an elliptic triangle also determine the sides.


It will appear to us that the side 2χ will undergo a dilatation by the amount

\[ \sec 2\psi = \frac{\cos 2\chi}{\cos 2\vartheta} = \frac{\sqrt{u^2 + (1 - u^2)\cos^2 2\varphi}}{u} \ge 1, \tag{11.3.6} \]

where, again, the first equality is the elliptic Pythagorean theorem, and the inequality, cos 2ϑ < cos 2χ, implies that ϑ > χ. The space dilatation depends upon the polarization which is determined by the relative phase 2ϕ. For a linearly polarized wave, 2ϕ = 0, π, the stretching is maximum, 1/u, while for left- (right-) circular polarization, 2ϕ = −π/2 (2ϕ = +π/2), it vanishes. This occurs when the amplitudes of the orthogonal components of the electric vector become equal. Intermediary, elliptic, polarization occurs in the interval 1 ≤ sec 2ψ ≤ 1/u.
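The limiting cases of (11.3.6) can be checked directly. In this sketch u is taken to stand for cos 2ϑ, as the second equality of (11.3.6) implies, with 0 < u ≤ 1; the function name `dilatation` and the sample value u = 0.5 are assumptions made for illustration.

```python
import math

def dilatation(u, two_phi):
    """sec(2 psi) from (11.3.6): sqrt(u^2 + (1 - u^2) cos^2(2 phi)) / u."""
    return math.sqrt(u * u + (1.0 - u * u) * math.cos(two_phi) ** 2) / u

u = 0.5
linear = dilatation(u, 0.0)                 # linear polarization: maximum stretching 1/u
circular = dilatation(u, math.pi / 2.0)     # circular polarization: factor 1, no stretching
elliptic = dilatation(u, math.pi / 4.0)     # an intermediate, elliptic case
print(linear, circular, elliptic)
```

The intermediate value lies between 1 and 1/u, in line with the interval quoted for elliptic polarization.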

We have underscored the analogy between the Stokes parameters and the operators of SU(2). What can we say about the strong interaction which supposedly uses SU(3) whose states are the color charges? Since there are supposedly three ‘colors’ for each of the six quark species, a 3 × 3 matrix is required. This means that there will be eight generators, replacing the three Pauli matrices of SU(2). These are known as the Gell-Mann matrices, named after their inventor. Instead of a single (Casimir) invariant of SU(2), there will be three. But, for a compact group these Casimir invariants can always be written as a sum of squares of generators [cf. (11.3.7) below]. This would imply that the SU(3) group is not elementary, but, rather, the different SU(2) subgroups of SU(3) can be used. Within each subgroup the operators would be those of the ordinary angular momentum algebra [Lipkin 66]. In other words, any two components of the triplet can define isospin leaving the third component invariant. The couplings of these subgroups would be related to the additivity of the Stokes parameters when there is a superposition of two independent light beams. Additivity reflects the lack of interference, or the lack of correlation of the amplitudes and phases. It is this additivity principle that makes the scattering parameters of the emergent beam a linear homogeneous function of those of the incident beam, from which the Lorentz and rotational transformations are immediate consequences.


11.3.1 Isospin and the electroweak interaction

The distinction between elliptic and hyperbolic geometries can be translated into the language of Lie groups. A ‘compact’ Lie group is associated with elliptic geometry, where the parameters of the group can assume values over a closed interval. The group U(1) for the electromagnetic interaction is compact because it is characterized by a unique angle that can take on values in the closed interval [0, 2π]. It is said that U(1) applies to electromagnetic interactions because it represents phase changes; the electromagnetic four-vector potential, Aµ, is determined up to the four-gradient of an arbitrary function.

However, even classically, it is known that when a circularly polarized light beam is directed at a target it sets the electrons in the target into circular motion in response to the rotating electric field. Hence, a relationship is suggested between circularly polarized light and photons in a definite state of angular momentum. A fortiori photons have definite states of helicity which are related to states of left- and right-handed circular polarization. Thus, the photon is not a singlet, but a doublet, just like the three doublets of leptons, the electron, muon, and tau, all with their own neutrinos. The doublet structure that defines an SU(2) symmetry for the weak force arises from the lepton’s behavior with respect to weak decays, like the β-decay discussed in Sec. 11.1.7. And just as each doublet belongs to a fundamental representation of weak SU(2) symmetry so, too, the photon has a doublet structure. Analogous to the three weak isotopic spin components of the local gauge, the three elements of the electromagnetic interaction are the Stokes parameters.

The compactness of the group ensures that the group is unitary, or that it has a unitary representation. Non-compact groups have parameters that are not restricted to a finite interval. An example is the Lorentz group, where the ‘boosts,’ or transforms from one inertial frame to another, are represented by non-unitary matrices. In fact, the ‘boost’ parameters are nothing but rapidities,

\[ \bar{u} = \tanh^{-1} u, \]

which are not restricted to finite intervals. As we know, these belong to hyperbolic geometries.


The distinction is also represented in the quantities that are conserved: Compact Lie groups conserve the total energy, or total momentum, while non-compact ones will conserve mass. Thus far, non-compact Lie groups have not found their way into gauge theory since internal quantum numbers, like isotopic spin, appear to be associated with compact symmetry groups. But, by all of what we have said about the transformation from elliptic to hyperbolic geometry, and back, we expect non-compact groups to find their way into gauge theory, or something more fundamental than it.

Generalizing to n-dimensions, the unitary group U(n) is represented by n × n unitary matrices. Those with determinant equal to +1 define the special unitary or modular group SU(n). The elements of SU(n) have n² − 1 independent parameters. Examples of such groups are the SU(2) group of isotopic spin and the SU(3) group associated with color.

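The unitary transformations of SU(2) are given by

\[ U = e^{-i\boldsymbol{\sigma}\cdot\boldsymbol{\alpha}}, \]

where σ consists of three generators, which are the Pauli spin matrices, and the components of α are the three weak isotopic spin components of the local gauge,

\[ \boldsymbol{\sigma}\cdot\boldsymbol{\alpha} = \begin{pmatrix} \alpha_3 & \alpha_1 - i\alpha_2 \\ \alpha_1 + i\alpha_2 & -\alpha_3 \end{pmatrix}. \]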

The αi form a linear space, known as the Lie algebra, in which there are both vector and scalar products. The fact that the operators do not commute leads to a form of vector multiplication, or Lie product, while the scalar product, or the negative of the determinant,

\[ \alpha_3^2 + \alpha_1^2 + \alpha_2^2 = \text{const.}, \tag{11.3.7} \]

expresses the conservation of something like total angular momentum, energy, or intensity. This invariant commutes with all the generators.
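A small numerical sketch makes the unitarity and the invariant (11.3.7) concrete. It relies on the standard closed form exp(−iσ·α) = cos|α| I − i sin|α| (σ·α)/|α|, which is not stated in the text and is used here as an assumption; the function names and the sample components of α are likewise illustrative.

```python
import math

def sigma_dot_alpha(a1, a2, a3):
    """The 2x2 matrix sigma.alpha built from the Pauli matrices."""
    return [[complex(a3, 0.0), complex(a1, -a2)],
            [complex(a1, a2), complex(-a3, 0.0)]]

def su2_element(a1, a2, a3):
    """U = exp(-i sigma.alpha), evaluated with the closed form for Pauli matrices."""
    norm = math.sqrt(a1 * a1 + a2 * a2 + a3 * a3)
    if norm == 0.0:
        return [[1 + 0j, 0j], [0j, 1 + 0j]]
    c, s = math.cos(norm), math.sin(norm)
    m = sigma_dot_alpha(a1, a2, a3)
    return [[(c if i == j else 0.0) - 1j * s * m[i][j] / norm for j in range(2)]
            for i in range(2)]

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def is_unitary(m, tol=1e-12):
    """Check that m times its conjugate transpose is the identity."""
    for i in range(2):
        for j in range(2):
            entry = sum(m[i][k] * m[j][k].conjugate() for k in range(2))
            if abs(entry - (1.0 if i == j else 0.0)) > tol:
                return False
    return True

alpha = (0.3, -0.4, 1.2)  # illustrative components, not from the text
U_mat = su2_element(*alpha)
invariant = -det2(sigma_dot_alpha(*alpha)).real  # alpha_1^2 + alpha_2^2 + alpha_3^2
print(is_unitary(U_mat), det2(U_mat), invariant)
```

The determinant of U comes out as unity and the negative determinant of σ·α reproduces the sum of squares in (11.3.7), which for these sample values is 1.69.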

The doublet structure of quarks and leptons that defines an SU(2) symmetry for the weak nuclear force follows from their behavior with respect to weak decay, such as the β-decay discussed in Sec. 11.1.7. Electrons, muons, and tau particles each have their own neutrinos and form three doublets. This carries over to quarks, which again form three distinct doublets.


According to the ‘standard’ theory, the mediators of the weak SU(2), or spin-1 gauge particles, are by definition massless. However, it has been known since the early 1930’s that the force between nucleons has an extremely short range, and this is what led Yukawa to propose his short range potential. It implies that the masses of the spin-1 vector mesons that mediate the weak interaction, or the W-bosons, are anything but zero.

So what is done is to ‘mix’ the electromagnetic U(1) symmetry with the SU(2) symmetry. To the masses of the charged bosons $W^1$ and $W^2$, one adjoins a third component $W^3$, corresponding to the third (diagonal) Pauli matrix, $\sigma_3$. The new $W^3$ component would have the same coupling strength as the

$$W^{\pm} = W^1 \pm iW^2, \tag{11.3.8}$$

bosons, but it would be neutral. Being neutral, $W^3$ would imply a new class of interactions for both the electron and neutrino. These ‘neutral interactions’ were unknown at the time they were predicted, and earlier gauge theories were built to exclude the possibility of such neutral currents.

The problem then was to couple the new field $W^3$ to a physical gauge field. This was taken to be the four-vector potential $A_\mu = (\varphi, A)$ itself. But this required something more than SU(2). So the combined weak and electromagnetic interactions would be ‘unified’ in the larger gauge group SU(2) × U(1). In the absence of $W^3$, the force between two electrons would be given exactly by Coulomb’s law, while, in its presence, Coulomb’s law must be modified. What was charge and vector potential in electromagnetism must now be modified to contain a touch of the new weak interaction.

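The simplest way was to consider a linear combination of the two,

$$\begin{pmatrix} A'_\mu \\ Z^0_\mu \end{pmatrix} = \begin{pmatrix} \cos\vartheta_w & \sin\vartheta_w \\ -\sin\vartheta_w & \cos\vartheta_w \end{pmatrix} \begin{pmatrix} A_\mu \\ W^3_\mu \end{pmatrix}, \tag{11.3.9}$$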

where $\vartheta_w$ is the so-called Weinberg angle that is defined in terms of the ‘coupling’ constants of (hyper-) charge and the weak isotopic charge, so as to produce a ‘new’ four-vector potential, $A'_\mu$, in respect to the ‘old’ four-vector potential $A_\mu$, and a new weak field, $Z^0_\mu$. This was a newly hypothesized neutral weak boson that forms the SU(2) triplet of weak bosons together with the original W bosons, (11.3.8). The reason why the $W^3_\mu$-field was ousted was the definition of the new four-vector potential, $A'_\mu$. This meant that $W^3_\mu$ cannot be considered to be purely weak, but also contains an electromagnetic contribution. Then, $Z^0_\mu$ would be the ‘physical’ neutral weak field. However, since (11.3.9) is invertible, the roles of $W^3$ and $Z^0$ can be interchanged. As it stands, (11.3.8) and the $Z^0$ boson must satisfy a conservation relation of the form (11.3.7). The $Z^0$ boson emitted in neutrino-antineutrino scattering decays into the $W^+$ and $W^-$ bosons shown in Fig. 11.11.

Fig. 11.11. The scattering of a neutrino and antineutrino emits a $Z^0$ boson which decays into W bosons.
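That the mixing (11.3.9) is an ordinary (elliptic) rotation, and hence invertible, can be illustrated with a short sketch. The function names `mix` and `unmix`, the field amplitudes, and the value sin²ϑw = 0.23 are illustrative assumptions and are not taken from the text.

```python
import math

def mix(a, w3, theta_w):
    """Rotate (A, W3) into (A', Z0) as in (11.3.9)."""
    c, s = math.cos(theta_w), math.sin(theta_w)
    return c * a + s * w3, -s * a + c * w3

def unmix(a_new, z0, theta_w):
    """Inverse rotation: recover (A, W3) from (A', Z0)."""
    c, s = math.cos(theta_w), math.sin(theta_w)
    return c * a_new - s * z0, s * a_new + c * z0

theta_w = math.asin(math.sqrt(0.23))   # illustrative value only (assumption)
a_new, z0 = mix(1.0, 0.4, theta_w)     # sample amplitudes for A and W3 (assumption)
a_back, w3_back = unmix(a_new, z0, theta_w)
```

The round trip returns the original pair, and the sum of squares `a_new**2 + z0**2` equals that of the original amplitudes, which is the elliptic type of invariant (11.3.7).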

In contrast, the components of the four-vector $A_\mu$ transform according to Lorentz,

$$A_0' = \frac{A_0 + uA_1}{\sqrt{1-u^2}}, \quad A_1' = \frac{A_1 + uA_0}{\sqrt{1-u^2}}, \quad A_2' = A_2, \quad A_3' = A_3, \tag{11.3.10}$$

which leaves the square magnitude,

$$A_0^2 - A_1^2 - A_2^2 - A_3^2, \tag{11.3.11}$$

invariant, where $A_0 = \varphi$.
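As a check that (11.3.10) leaves (11.3.11) unchanged, consider the following minimal sketch. The function names and the sample components are our own assumptions, and natural units (c = 1) are used, as in the text.

```python
import math

def boost(a, u):
    """Apply (11.3.10) to a = (A0, A1, A2, A3) for a relative velocity |u| < 1."""
    a0, a1, a2, a3 = a
    g = 1.0 / math.sqrt(1.0 - u * u)
    return (g * (a0 + u * a1), g * (a1 + u * a0), a2, a3)

def hyperbolic_norm(a):
    """Square magnitude (11.3.11)."""
    a0, a1, a2, a3 = a
    return a0 * a0 - a1 * a1 - a2 * a2 - a3 * a3

a = (2.0, 0.5, -1.0, 0.3)      # arbitrary sample four-vector (assumption)
a_boosted = boost(a, 0.6)
```

Here `hyperbolic_norm(a_boosted)` agrees with `hyperbolic_norm(a)` to rounding error, whereas the sum of squares of the components does not.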

The hyperbolic nature of the four-vector $A_\mu$ makes it transform through an imaginary angle in (11.3.10), and leads to a different type of invariant. Any other field which is coupled to it must transform in the same way in order to be compatible with it. In other words, the invariant (11.3.11) is not the same as the invariant (11.3.7), for the former is hyperbolic while the latter is elliptic. We recall that the invariant in hyperbolic space is the mass, whereas the total energy is the invariant in elliptic space. It is hard to believe that Nature is such an improviser of mixing hyperbolic with elliptic elements, which would be like gluing two incompatible pieces together.

The story is still not over. The charged bosons (11.3.8) weigh in at about 80 GeV, roughly 86 times the mass of a proton, and the neutral $Z^0$ boson is slightly heavier at about 91 GeV, roughly 97 times that of a proton. The problem was to get mass out of a theory which apparently forbids it. It was required that the Lagrangian, leading to correct equations of motion, must be gauge-invariant, and this prevented mass appearing explicitly in the Lagrangian through a term of the form $mA_\mu A^\mu$. The rabbit was pulled from the hat by introducing a spin-0 field, together with its accompanying particle, known as the Higgs field and particle, after Peter Higgs who invented them. Then by introducing a potential of the field which undergoes a second-order phase transition, mass would suddenly appear at the onset of the phase transition.

Therefore, it was claimed that some new physics is called for, such as spontaneous symmetry-breaking, where the Higgs field allows quarks and electrons to acquire mass. The postulated, but unproven, Higgs field is analogous to Cooper pairs in superconductivity, and like Cooper pairs, is massive.$^{e}$ This is analogous to the Dirac equation where mass is introduced ‘by hand,’ in order to get it to satisfy the relativistic conservation of energy.

The mysterious Higgs field has been likened to an aether [Moriyasu 83]. Have we come more than a century after its demise just to return to the aether that was found so useful in electromagnetic theory? The Higgs field of a superconductor was the ensemble of electrons bound into Cooper pairs. Does the Higgs field represent a new binding force that has a range much smaller than the weak interaction, or is it just a figment of the imagination?

It is argued that purely transverse waves cannot describe mass because Maxwell’s equations are both transverse and massless. Any and all attempts to destroy the transverse property of the electric and magnetic fields have met with disastrous consequences [Heaviside 99]. We will analyze those consequences of introducing mass into Maxwell’s equations in Sec. 11.5.

$^{e}$ Whether or not the spontaneous breakdown of SU(2) × U(1) to the U(1) of electromagnetism occurs depends on the open question of whether the Higgs field actually exists. It is claimed that if the Higgs field is mathematical, rather than physical, then there must be some new physics lying around that makes the spontaneous symmetry-breaking such a good description of elementary particles down to distances of the order of $10^{-16}$ cm [Georgi 09]. Though symmetry describes the mechanism, it cannot supply the physics.

11.4 Polarization of Mass

11.4.1 Mass and momentum

The Stokes characterization of the two independent states of light polarization is mathematically identical to the orientation of a spin-$\tfrac{1}{2}$ particle [Fano 54]. We have seen that a completely polarized beam of light has an electrical vibration which may be represented by its components along two rectangular axes, (11.1.19a) and (11.1.19b). Electromagnetic vibrations change irregularly and erratically. Yet, for elliptically polarized light the irregular vibrations must be such that the ratio of the amplitudes, together with the phase difference, must be absolute constants. Hence, no averaging is required.

The average energy and momentum of vibrations are

$$W = a^2 + b^2, \qquad p = a^2 - b^2. \tag{A}$$

In spherical coordinates of a vector of length W, longitude $2\phi$, and co-latitude $2\vartheta$, the Stokes parameters are given by scheme (II). These three quantities determine elliptic vibrations, apart from their phase.

Dirac [47] made the distinction between the way W and p transform by rotation through a hyperbolic angle, and the rotation of $m_l$ and $m_t$ through a real angle, but thought that the former applies to the space and time coordinates, x and t, while the latter to the space coordinates y and z. This is unfortunate since it has led to the introduction of space-time invariance which has nothing to do with the theory.

From (II) it is at once apparent that W and p are invariant under a rotation of axes, while the mass components change with the axes. If $m_l'$ and $m_t'$ are the values of $m_l$ and $m_t$ after a rotation of axes through an angle $2\phi$ in the clockwise direction,

$$\begin{aligned} m_l' &= m_l\cos 2\phi + m_t\sin 2\phi, \\ m_t' &= -m_l\sin 2\phi + m_t\cos 2\phi, \end{aligned} \tag{a}$$

while $W' = W$, $p' = p$. The rotation (a) can be thought of as a rotation of two nucleons, or any mixture of the two, in isospin space. From these equations it follows that

$$W'^2 - p'^2 - m_l'^2 - m_t'^2 = W^2 - p^2 - m_l^2 - m_t^2 \tag{11.4.1}$$

is invariant under a rotation of the axes. This has nothing to do with its invariance under a Lorentz transform!

As Dirac pointed out, we can satisfy (11.4.1) when $W \neq W'$ and $p \neq p'$, but with invariant mass, by rotating the axes through a hyperbolic angle. The hyperbolic measure of the relative velocity, $\bar{u}$, is related to the Euclidean measure u, by the usual form of the rapidity,

$$u = \tanh\bar{u}, \tag{11.4.2}$$

or equivalently,

$$\bar{u} = \tanh^{-1} u = \frac{1}{2}\ln\left(\frac{1+u}{1-u}\right). \tag{11.4.3}$$

Since p = Wu, it follows that

$$W = a^2 + b^2 = m\cosh\bar{u}, \qquad p = a^2 - b^2 = m\sinh\bar{u}. \tag{A'}$$

Rotating (A') through the hyperbolic angle $\bar{v}$ results in

$$\begin{aligned} W' &= W\cosh\bar{v} + p\sinh\bar{v}, \\ p' &= W\sinh\bar{v} + p\cosh\bar{v}, \end{aligned} \tag{b}$$

while the total mass,

$$m = \sqrt{m_l^2 + m_t^2}, \tag{11.4.4}$$

is invariant because each of its components remains invariant, $m_l' = m_l$ and $m_t' = m_t$. The pair of equations (b) is the Lorentz transform in the plane p, W, and not in the xt-plane as Dirac [47] would have us believe.
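The content of (11.4.3), (A') and (b) can be made concrete with a few lines of code. This is a sketch only: the function names and the sample mass and velocities are assumptions of ours, and natural units are used.

```python
import math

def rapidity(u):
    """Hyperbolic measure of a Euclidean velocity u, eq. (11.4.3)."""
    return 0.5 * math.log((1.0 + u) / (1.0 - u))

def energy_momentum(m, u):
    """(A'): W = m cosh(ubar), p = m sinh(ubar)."""
    ubar = rapidity(u)
    return m * math.cosh(ubar), m * math.sinh(ubar)

def hyperbolic_rotation(w, p, vbar):
    """(b): rotate (W, p) through the hyperbolic angle vbar."""
    return (w * math.cosh(vbar) + p * math.sinh(vbar),
            w * math.sinh(vbar) + p * math.cosh(vbar))

m, u, v = 1.5, 0.6, 0.5                      # sample values (assumption)
W, p = energy_momentum(m, u)
W_new, p_new = hyperbolic_rotation(W, p, rapidity(v))
```

Both W and p change under (b), yet `W_new**2 - p_new**2` still equals `m**2`. The new ratio `p_new / W_new` is tanh(ū + v̄), that is, the relativistic composition (u + v)/(1 + uv) of the two Euclidean velocities.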


If we insist on the invariance of a four-vector, we can always choose our axes so that one points in the direction of the momentum, thus leaving two slots vacant in the four-vector that need to be filled. On the strength of energy conservation, they can be filled only by the components of the mass such that (11.4.4) holds. And once it is recognized that p is the momentum in the direction of the motion so that the momentum is not given by its three Cartesian components, Dirac’s theory becomes equivalent to Stokes’s formulation with the transformation (b).

The ratio of the semiminor to the semimajor axis of the electric ellipse is [cf. (11.4.13) below]

$$\left|\frac{b}{a}\right| = \left(\frac{1-\cos 2\vartheta}{1+\cos 2\vartheta}\right)^{1/2} = \tan\vartheta. \tag{11.4.5}$$

The numerical value of $\tan\vartheta$ represents the ratio of the sides of the rectangle, of area ab, which encloses the ellipse that the point of the vibrating electric vector traces out in Fig. 11.8. Now, from the relation p = Wu and the relation between Euclidean and hyperbolic measures of the relative velocities, (11.4.3), we find the same ratio of the axes of the ellipse to be [cf. (11.1.30)],

$$\left|\frac{b}{a}\right| = e^{-\bar{u}} = \left(\frac{1-u}{1+u}\right)^{1/2}. \tag{11.4.6}$$

Finally, comparing (11.4.5) with (11.4.6), and noting (11.4.2), we find

$$u = \cos 2\vartheta = \tanh\bar{u}, \tag{11.4.7}$$

which again identifies $2\vartheta$ with the Bolyai–Lobachevsky angle of parallelism. This angle provides the link between hyperbolic and circular functions, as (11.4.7) clearly shows. The ratio of the momentum to the energy,

$$\frac{p}{W} = \frac{a^2-b^2}{a^2+b^2} = \cos^2\vartheta - \sin^2\vartheta = \cos 2\vartheta, \tag{11.4.8}$$

is precisely (11.4.7).

Fermi’s original formulation of β-decay took into account five different interactions, called scalar (S), vector (V), tensor (T), axial vector (A), and pseudo-scalar (P). These interactions are distinguished by the way they transform under Lorentz transformations [Lipkin 62]. Consider Fermi transitions where S and V interactions contribute, as in the case where a left-handed neutrino is ejected. The V interaction will give right-handed electrons, while the S interaction gives left-handed electrons in the extreme relativistic limit.

In the helicity plane, the V axis will align itself with the vertical, while the S-axis will align itself with the horizontal axis. These two axes are symmetric about the axis that makes a 45° angle which occurs when u = 0, and is an even parity s-state. As the electron slows down, these vectors rotate toward one another until they coincide in the zero velocity state at 45° in Fig. 11.12. At some velocity u, the V interaction will have an average helicity +u, while the S interaction will have a mean helicity −u. The mean helicity is given by (11.4.8), where the angle ϑ represents the angle between the vertical axis and the vector V. The electron state corresponding to the S interaction makes the same angle between the horizontal and the S vector. These vectors play the role analogous to the vibrating electric vector in optics, whose components must always remain orthogonal to one another because photons can only travel at the speed of light.

Fig. 11.12. V and S interactions rotate toward one another as the electron velocity decreases.

The helicity states become orthogonal only in the extreme relativistic limit. In general, the decay probability will not reduce to the sum of the squares of the different helicity states, except in the extreme relativistic limit. If we characterize the decay according to S and V interactions, the decay probability will not consist of independent contributions, except in the ultrarelativistic limit, and the contributions become identical in the nonrelativistic limit. In general, therefore, the S and V interactions will contain energy-dependent cross terms in the decay probability spectrum, which would vanish in the extreme relativistic limit where the S and V states become orthogonal. These energy-dependent cross terms in the decay probability are known as Fierz interference terms.

The simplest type of Fierz interference occurs between two channels with opposite electron helicity, and the same values for all other quantum numbers. The wave functions corresponding to these two orthogonal states are entirely analogous to the orthogonal states of longitudinal spin whose spin direction is given by stereographic projection. In the plane represented by the orthogonal axes of positive and negative electron helicity, a vector at an angle ϑ with respect to the vertical ($h_e = +1$) represents a mixture of states having both positive and negative helicities that have amplitudes proportional to $\cos\vartheta$ and $\sin\vartheta$, respectively. The mean helicity of such a state, $\langle h_e\rangle$, is (11.4.8) [Lipkin 62]. So parity non-conservation is written into the Stokes parameters when we identify the rotated Poincaré representation (II) with the Minkowski representation (A’).

In fact, (11.4.8) is the parity violation law of weak interactions [Omnès 70], as we have seen in Sec. 11.1.7. Because particles and antiparticles are oppositely polarized, charge conjugation symmetry has also to be abandoned. In all cases, the degree of polarization is found to be equal to the relative Euclidean velocity, u. But since the angle ϑ is the angle of parallelism, it will be a function of the hyperbolic velocity, $\bar{u}$. It is entirely reasonable that longitudinal polarization should tend to zero with the velocity since, in the limiting case of zero momentum, there can be no longitudinal polarization. A limiting case occurs in the polarization of muons, where the muon and anti-muon are 100% polarized because they travel at the speed of light.

For an electron, the state of helicity $-\tfrac{1}{2}$ is more heavily populated than the state of helicity $+\tfrac{1}{2}$. Calling $a^2$ the number of electrons found with helicity $-\tfrac{1}{2}$ and $b^2$ the number of electrons with helicity $+\tfrac{1}{2}$, we immediately find (11.4.8). In other words, if the mean spin of a particle is $\pm\tfrac{1}{2}$ in the direction of polarization, its mean value in any other direction will be its projection,

$$\tfrac{1}{2}\cos 2\vartheta = \tfrac{1}{2}(P_+ - P_-),$$

where the probabilities for finding the particle with spins $\pm\tfrac{1}{2}$ are $P_+ = \cos^2\vartheta$ and $P_- = \sin^2\vartheta$, which conserve probability, $P_+ + P_- = 1$.

In β-decay there are two types of measurements made on leptons. The first consists of polarization measurements that determine the mean helicities of the particles with respect to a specified axis, and the second determines the angular distribution of the emission of a particle with respect to a specified axis. Whereas angular momenta are restricted to discrete values, linear momenta are not, and are known to have a $1 \pm \cos 2\vartheta$ distribution, or more generally $1 + A\cos 2\vartheta$, where A, the asymmetry parameter, is a mean value that is determined by the projection of angular momenta on a preferential axis, or the two possible states of helicity of the electron and neutrino [Lipkin 62]. Since A is proportional to ±1, the probability for any interaction will be $\tfrac{1}{2}(1 \pm u)$, depending on whether the helicities of the electron and neutrino are equal or opposite.
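The chain (11.4.5) to (11.4.8), together with the probabilities P±, may be verified numerically. In the sketch below the function names are hypothetical, and the angle ϑ is obtained from tan ϑ = e^{−ū} with ū = tanh⁻¹ u.

```python
import math

def parallelism_half_angle(u):
    """Return theta with tan(theta) = exp(-ubar), ubar = atanh(u), eq. (11.4.6)."""
    return math.atan(math.exp(-math.atanh(u)))

def helicity_probabilities(u):
    """P+ = cos^2(theta) and P- = sin^2(theta) for Euclidean velocity u."""
    theta = parallelism_half_angle(u)
    return math.cos(theta) ** 2, math.sin(theta) ** 2

u = 0.6                                   # sample velocity (assumption)
theta = parallelism_half_angle(u)
p_plus, p_minus = helicity_probabilities(u)
```

For u = 0.6 one finds cos 2ϑ = u, P+ = ½(1 + u) and P− = ½(1 − u), so that P+ − P− = u. At u = 0 the angle ϑ is 45° and the two helicities are equally probable.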

When there is a difference in phase $2\phi'$ between components of vibration along orthogonal axes, we have a counter-clockwise rotation

$$m_l' = W\sin 2\vartheta\cos 2(\phi + \phi') = m_l\cos 2\phi' - m_t\sin 2\phi', \tag{11.4.9a}$$

$$m_t' = W\sin 2\vartheta\sin 2(\phi + \phi') = m_l\sin 2\phi' + m_t\cos 2\phi', \tag{11.4.9b}$$

together with the invariancy of $p_z' = p_z$, and $W' = W$. This says that there are two polarization states of mass, both normal to the direction of momentum. The phase $2\phi'$ ‘mixes’ these components according to (11.4.9a) and (11.4.9b). This is somewhat analogous to the early days of relativity where distinction was made between the ‘transverse’ mass, for which the force is normal to the velocity, and the ‘longitudinal’ mass, where the force is parallel to the velocity [cf. Sec. 11.1.1].


Just as in the Trouton–Noble experiment, described in Sec. 3.7.2, the charge on the electron would feel a couple whose axis is perpendicular to the plane formed from the velocity and direction of its motion. The only difference is that the velocity is not due to the Earth’s motion through the aether, but to the mass in motion. This could be the origin of mass polarization at the elementary particle level. Whereas Lorentz provided the bridge from the bulk to the atomic level, this would provide a bridge from the atomic level to that of its elementary particle constituents.

We will now show that $\pm m_t$ describe right- (left-) circular polarization, and $m_l$ the polarization at 45° from the two orthogonal components of the electric vector. Squaring (11.4.9a) and (11.4.9b), and adding, gives

$$m'^2 = m^2 = W^2 - p^2 = W^2\sin^2 2\vartheta, \tag{11.4.10}$$

on account of (11.4.4) and (11.4.8). In view of (11.4.7), (11.4.10) asserts that

$$W\sqrt{1-\beta^2} = \text{const.}, \tag{11.4.11}$$

under rotations. Equation (11.4.11) is none other than invariancy of the total mass m, and explains the increase of energy with speed. From (11.4.10) we may say that mass is identified with transverse momenta, which in jets provides an estimate of the masses of resonance states. They are invariant under rotation. Resonance states live longer than the time of their creation, and have masses comparable to twice their transverse momenta [Heisenberg 66].
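The mixing (11.4.9a,b) and the invariant (11.4.10) can be illustrated as follows. The sketch assumes, as (11.4.8) and (11.4.9a,b) imply, that p = W cos 2ϑ, m_l = W sin 2ϑ cos 2ϕ and m_t = W sin 2ϑ sin 2ϕ; the function names and sample angles are our own.

```python
import math

def stokes_scheme(w, theta, phi):
    """Return (p, m_l, m_t) for length W, co-latitude 2*theta, longitude 2*phi."""
    return (w * math.cos(2 * theta),
            w * math.sin(2 * theta) * math.cos(2 * phi),
            w * math.sin(2 * theta) * math.sin(2 * phi))

def phase_rotation(ml, mt, dphi):
    """(11.4.9a,b): mix m_l and m_t through the phase difference 2*dphi."""
    c, s = math.cos(2 * dphi), math.sin(2 * dphi)
    return ml * c - mt * s, ml * s + mt * c

w, theta, phi, dphi = 2.0, 0.35, 0.2, 0.5      # sample values (assumption)
p, ml, mt = stokes_scheme(w, theta, phi)
ml_new, mt_new = phase_rotation(ml, mt, dphi)
mass_squared = ml_new ** 2 + mt_new ** 2
```

The rotated components coincide with those obtained by advancing the longitude from 2ϕ to 2(ϕ + ϕ′), while `mass_squared` stays equal to W² − p² = W² sin² 2ϑ.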

A plane wave will be polarized along the z-axis, either in the positive or negative direction. A more general treatment of polarization in any direction is to consider II′ as direction cosines so that polarization in any direction will be given by

$$\sigma\cdot p = \sigma_x p_x + \sigma_y p_y + \sigma_z p_z = \begin{pmatrix} p_z & p_x - ip_y \\ p_x + ip_y & -p_z \end{pmatrix},$$

where

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},$$

are the Pauli spin matrices, all of which have eigenvalues ±1.

Thus, if the wave function,

$$\psi = a\psi_+ + b\psi_-, \tag{11.4.12}$$

is a linear combination of $\psi_+$ and $\psi_-$, which represent states of spin $\pm\tfrac{1}{2}$ in the positive and negative z-directions, the relative weights are given by

$$\sigma\cdot p\begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} a \\ b \end{pmatrix},$$

or

$$\frac{a}{b} = \left(\frac{1+\cos 2\vartheta}{1-\cos 2\vartheta}\right)^{1/2} e^{i\phi} = \cot\vartheta\, e^{i\phi}. \tag{11.4.13}$$

Again we find a stereographic projection of an arbitrary spin direction onto the z-axis.

The equation for the Pauli spinor Φ, analogous to the Dirac equation, is

$$(\sigma\cdot p)\,\Phi(p) = W\Phi(p). \tag{11.4.14}$$

Spin in Stokes’s momentum space (II′) requires a Pauli spinor, and its negative energy solutions double that number. There will be non-zero solutions to (11.4.14) if and only if the determinant,

$$W^2\begin{vmatrix} 1 + p_z/W & (p_x - ip_y)/W \\ (p_x + ip_y)/W & 1 - p_z/W \end{vmatrix} = W^2 - p_z^2 - p_x^2 - p_y^2, \tag{11.4.15}$$

of the pair of linear homogeneous equations (11.4.14) vanishes. The vanishing of (11.4.15) is precisely the condition for complete polarization, ε = 1, and if the four Stokes parameters satisfy this condition they may be considered the polarization parameters of the light beam.
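The solvability condition can be seen at work in a small numerical sketch. The names below are hypothetical, the momentum components are sample values, and the trial spinor (W + p_z, p_x + ip_y) is our own choice of an unnormalized solution of (11.4.14) when W² = p².

```python
import math

def sigma_dot_p(px, py, pz):
    """The 2x2 matrix sigma . p."""
    return ((complex(pz), px - 1j * py), (px + 1j * py, complex(-pz)))

def secular_determinant(w, px, py, pz):
    """The determinant (11.4.15), equal to W^2 - px^2 - py^2 - pz^2."""
    return ((w + pz) * (w - pz) - (px - 1j * py) * (px + 1j * py)).real

def apply(mat, vec):
    """Multiply a 2x2 matrix into a two-component spinor."""
    return tuple(sum(mat[i][k] * vec[k] for k in range(2)) for i in range(2))

px, py, pz = 0.3, 0.4, 1.2                       # sample momentum (assumption)
w = math.sqrt(px * px + py * py + pz * pz)       # energy that makes (11.4.15) vanish
spinor = (w + pz, px + 1j * py)                  # unnormalized trial solution
lhs = apply(sigma_dot_p(px, py, pz), spinor)
```

With W equal to the magnitude of p the determinant vanishes and `lhs` equals W times `spinor`; for any other W the determinant is non-zero and only the trivial solution survives.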


The terms in the matrix corresponding to (11.4.15) are related to those of the density matrix [McMaster 54], where $p_z/W = \pm 1$ is plane polarization along $h_e = \pm 1$,$^{f}$ right or left-handed helicity (u = 1), $p_x/W = \pm 1$ is plane polarization at π/4, or equal left- and right-handed helicity u = 0, and $p_y/W = \pm 1$ is right- or left-handed polarization, respectively. The former pair gives the probabilities for propagating in the direction of the z-momentum or opposite to it, $\tfrac{1}{2}(1 \pm u)$, or $\tfrac{1}{2}(1 \pm \cos 2\vartheta)$, while the latter pair is proportional to the mass times the phase, and gives the probability for a turn in a ‘space-time path.’

11.4.2 Relativistic space-time paths: An example of mass polarization

Feynman [65], in his visualization of the “space-time paths for the one-dimensional Dirac particle,” wrote the propagator as

$$K_{+-} = \sum_{\text{zig-zag paths}}^{N} (i\varepsilon m)^R, \tag{11.4.16}$$

for an N-segment trajectory in time t with R reversals. In his notes [Schweber 86], Feynman writes that “each turn to + gives $+i\varepsilon$, each turn to − gives $-i\varepsilon$,” where $\varepsilon := t/N$ is the infinitesimally small time interval. Gersh [81] claims that the minus sign should be present in (11.4.16) in order “to get the correct nonrelativistic limit.” In fact, both signs should be present in (11.4.16), which still is only part of the propagator.

Feynman associates the probability amplitude for a reversal with mass. But mass does not enter in the way it enters the Dirac equation in configuration space because there it enters in the diagonal terms, and not in the off-diagonal ones. In a one-dimensional stochastic model [Gaveau et al. 84] of an electron shuttling back-and-forth at the speed of light, the energy conservation equation (11.1.8), implying a wave equation, is waived in favor of the telegraph equation, where mass enters through the dissipative term. This is also inaccurate, and is only salvaged formally by an analytic continuation of time. But, this does not explain why an electron should shuttle back and forth at the speed of light. So where does mass enter in the expression for the probability amplitude, and at what speed will the electron travel?

$^{f}$ The helical states stand in for orthogonal components, $E_x$ and $E_y$, of the electric vector, E.

Feynman was correct to associate the probability amplitude for a path reversal with the mass, but this is only part of the story. The total propagator,$^{g}$

$$K(p) := \sigma\cdot p = \begin{pmatrix} p_z & p_x - ip_y \\ p_x + ip_y & -p_z \end{pmatrix}, \tag{11.4.17}$$

is the same as the matrix of the local weak SU(2) gauge transformation. If (11.4.17) is to reflect Feynman’s rule for a path reversal, it must be given by

$$\sigma\cdot p = \begin{pmatrix} p & me^{-i\phi} \\ me^{i\phi} & -p \end{pmatrix}, \tag{11.4.18}$$

which is (slightly!) more general than Feynman’s prescription, (11.4.16), since it allows for phases other than ϕ = π/2.

Then, the propagator for a path of length N and energy W that will be traversed in time t such that $W = N/t =: 1/\varepsilon$, in natural units, is $e^{i\sigma\cdot p\,\varepsilon}$ for each segment. This propagator propagates the Pauli spinor Φ(p, t) to

$$\Phi(p, t+\varepsilon) = e^{i\sigma\cdot p\,\varepsilon}\,\Phi(p, t), \tag{11.4.19}$$

in time ε.

The matrix exponential function is defined by the infinite series,

$$e^{i\sigma\cdot p\,\varepsilon} = I + i\sigma\cdot p\,\varepsilon + \cdots + i^n(\sigma\cdot p)^n\varepsilon^n/n! + \cdots,$$

where I is the unit matrix. For small time intervals, this permits us to write (11.4.19) as

$$\Phi(p, t+\varepsilon) = (I + i\sigma\cdot p\,\varepsilon)\,\Phi(p, t). \tag{11.4.20}$$
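How good the first-order truncation (11.4.20) is can be gauged with a short sketch. It compares a partial sum of the series with the closed form cos(Wε) I + i sin(Wε) (σ·p)/W, which follows from (σ·p)² = (p² + m²) I for the matrix (11.4.18). All function names and the sample values of p, m, ϕ and ε are assumptions made for illustration.

```python
import cmath
import math

IDENTITY = ((1.0, 0.0), (0.0, 1.0))

def sigma_dot_p(p, m, phi):
    """Matrix (11.4.18): diagonal +p, -p; off-diagonal m exp(-i phi), m exp(+i phi)."""
    return ((p, m * cmath.exp(-1j * phi)), (m * cmath.exp(1j * phi), -p))

def matmul(a, b):
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def scale(c, a):
    return tuple(tuple(c * x for x in row) for row in a)

def add(a, b):
    return tuple(tuple(x + y for x, y in zip(ra, rb)) for ra, rb in zip(a, b))

def exp_series(a, terms=30):
    """Partial sum of the exponential series for a 2x2 matrix a."""
    total, term = IDENTITY, IDENTITY
    for n in range(1, terms):
        term = scale(1.0 / n, matmul(term, a))   # a^n / n!
        total = add(total, term)
    return total

def exp_closed(p, m, phi, eps):
    """cos(W eps) I + i sin(W eps) (sigma.p)/W, with W^2 = p^2 + m^2."""
    w = math.hypot(p, m)
    return add(scale(math.cos(w * eps), IDENTITY),
               scale(1j * math.sin(w * eps) / w, sigma_dot_p(p, m, phi)))

def max_diff(a, b):
    """Largest absolute difference between corresponding matrix elements."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

p, m, phi, eps = 0.8, 0.6, math.pi / 2, 1e-3     # sample values (assumption)
generator = scale(1j * eps, sigma_dot_p(p, m, phi))
series = exp_series(generator)
closed = exp_closed(p, m, phi, eps)
first_order = add(IDENTITY, generator)           # right-hand side of (11.4.20)
```

The series and the closed form agree to rounding error, while the first-order expression differs from them only at order ε², which is what licenses the passage to the limit ε → 0.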

Then, proceeding to the limit as ε → 0 gives

$$-i\begin{pmatrix} \dot{\Phi}_- \\ \dot{\Phi}_+ \end{pmatrix} = \begin{pmatrix} p & m\,e^{-i\phi} \\ m\,e^{i\phi} & -p \end{pmatrix}\begin{pmatrix} \Phi_- \\ \Phi_+ \end{pmatrix}, \tag{11.4.21}$$

where the dot denotes differentiation with respect to time. For ϕ = π/2, (11.4.21) becomes the Dirac equation in momentum space,

$$W\Phi = (\sigma_z p + \beta m)\Phi,$$

where $p = p_z$, and

$$\beta = \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix}.$$

$^{g}$ This identifies the Stokes parameters (II′) with the three weak isotopic spin components of the local gauge.

In (11.4.21) we have introduced the mass according to the first two equations in scheme (II′), i.e. $p_x \pm ip_y = me^{\pm i\phi}$. For ϕ = 0, π mass is longitudinal corresponding to linear polarization, while for ϕ = ±π/2, the mass is transverse corresponding to right-(left-) circular polarization. The trace of (11.4.21) vanishes as every SU(2) representation must be symmetrical about 0; the spin varies from −j to +j. The elimination of either component in (11.4.21) by increasing its order gives back the Klein–Gordon equation,

$$\ddot{\Phi} = -(p^2 + m^2)\,\Phi,$$

where Φ represents either component of the spinor.

The propagator (11.4.17) is related to the density matrix,

$$\rho = \frac{1}{2}\begin{pmatrix} W + p & m_l - im_t \\ m_l + im_t & W - p \end{pmatrix}. \tag{11.4.22}$$

Photons show only longitudinal polarization: spins parallel or anti-parallel to the direction of propagation. For photons u = ±1, and there are two helicities h = ±1, corresponding to the $J_z$ component of angular momentum, (11.1.20). Since both $m_t$ and $m_l$ vanish, there is no state of helicity h = 0. In other words, there are no ladder operators, $J_x \pm iJ_y$, with a multiplet of 2J + 1 degenerate states, −J, −J + 1, . . . , 0, . . . , J − 1, J.

For complete polarization the determinant of the density matrix, (11.4.22), vanishes. The diagonal terms are the probability for an electron to propagate with its helicity in the direction of the momentum,

$$P_+ = \frac{1}{2}\left(\frac{W+p}{W}\right) = \frac{1}{2}(1+u) = \cos^2\vartheta,$$

and the probability that it will propagate with its helicity in the anti-parallel direction with respect to the direction of its momentum,

$$P_- = \frac{1}{2}\left(\frac{W-p}{W}\right) = \frac{1}{2}(1-u) = \sin^2\vartheta.$$

The difference between the two probabilities is the relative velocity, u = p/W. This is none other than the parity violation in weak decay, (11.4.8). The helicity goes to zero as its momentum goes to zero, a fact that is well known. In optics, $P_\pm = 1$ would correspond to plane polarization along $h_e = \pm 1$.
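A numerical illustration of the density matrix (11.4.22) and the probabilities P± is given below. It is a sketch under stated assumptions: the function names are ours, the mass components are taken as m_l = m cos ϕ and m_t = m sin ϕ, and the sample values are chosen so that W² = p² + m².

```python
import math

def density_matrix(w, p, ml, mt):
    """The matrix (11.4.22)."""
    return ((0.5 * (w + p), 0.5 * (ml - 1j * mt)),
            (0.5 * (ml + 1j * mt), 0.5 * (w - p)))

def det2(r):
    """Real part of the determinant of a 2x2 matrix."""
    return (r[0][0] * r[1][1] - r[0][1] * r[1][0]).real

def propagation_probabilities(w, p):
    """P+ and P- from the normalized diagonal terms of (11.4.22)."""
    return 0.5 * (w + p) / w, 0.5 * (w - p) / w

w, p, m, phi = 1.25, 0.75, 1.0, 0.4          # sample values with W^2 = p^2 + m^2
rho = density_matrix(w, p, m * math.cos(phi), m * math.sin(phi))
p_plus, p_minus = propagation_probabilities(w, p)
```

Here `det2(rho)` vanishes, which is the condition of complete polarization, and `p_plus - p_minus` equals u = p/W = 0.6. Reducing m at fixed W and p makes the determinant positive, corresponding to partial polarization.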

The mass in Feynman’s formula, (11.4.16), corresponds to the off-diagonal terms, $me^{\pm i\phi}$ in (11.4.17) for a phase, ϕ = π/2. Feynman, with his chess-board approach to the zig-zag motion of the electron, was considering the transverse mass, $m_t = m\sin\phi$ for a phase ϕ = π/2. Consequently, Feynman’s formula (11.4.16) is only part of the propagator (11.4.17), consisting of the off-diagonal terms for a phase, ϕ = π/2. Two dimensions are vital in order to account for the electron’s spin.

Comparing (11.4.22) with (11.2.2), the transverse mass corresponds to the Stokes parameter U, the difference in the intensities of the linearly polarized components oriented at ±π/4. Analogously, the longitudinal mass, $m_l = m\cos\phi$, corresponds to Q, the difference in horizontal and vertical polarized light intensities.

The wave field is transverse and this explains why the density matrix (11.4.22) is two-dimensional in any frame of reference. Mass is normally taken into account in a field which is longitudinal, like sound waves, where the velocity of propagation is inversely proportional to the square root of the density. However, by the fact that helicity can either be in the direction of momentum or in the opposite direction, the difference in the number of particles of opposite helicities, which is proportional to the electron’s relative velocity, possesses inertia.

Feynman’s image of an electron shuttling back-and-forth at the speed of light is to be replaced by the electron’s helicity, or its spin axis, that is doing the shuttling in the direction of the electron’s momentum, or in the opposite direction, thereby oscillating between a left and a right-handed screw. And because helicity is proportional to the velocity of an electron, the shuttling is done at that velocity, and not the velocity of light. Helicity must therefore have inertial properties.

The conservation equation, (11.1.8), leads to two values of the energy, W = ±p. To interpret negative values of W, the analogy with the generators of angular momentum, rather than the Stokes parameters where W > 0, is the more pertinent one. To the total energy W, there corresponds a multiplet of 2W + 1 states of the same eigenvalue W but with the projection onto the direction of angular momentum, $p_z$, taking on values between +W and −W. These states are identified as states of helicities h = 1 and h = −1, respectively. The extreme states apply to massless particles which are either right or left-handed. This would apply to an ultrarelativistic electron, where the rest mass of the electron can be neglected since it behaves essentially as a zero mass particle.

As the velocity of an electron decreases from its ultrarelativistic value to smaller values, the average helicity, $|\langle h\rangle| = u$, would also decrease. The helicity would, therefore, vary in a continuous manner, and not as changes in discrete values, −s, −s + 1, . . . , +s, where s is the spin of the particle [Schweber 61, p. 113]. This would imply that the rest mass varies as

$$m = W\sin 2\vartheta = W\sqrt{1-u^2}, \tag{11.4.23}$$

and, since W = const., the mass becomes increasingly smaller as the relative velocity u → 1. The usual relativistic result, where m = const, and W becomes infinite in the same limit, does not apply. It would also apply to a left-handed neutrino whose spin is anti-parallel to its momentum, and right-handed antineutrino whose spin is parallel to its momentum.

In the hole picture, the antineutrino would have a momentum anti-parallel to the momentum of the negative energy state which has been vacated [Schweber 61]. But, there is no need to consider a Dirac ‘sea’ filled with negative energy states, which can never be neutralized [Oppenheimer 30]. Rather, the anti-particle of the electron is right-handed in the ultrarelativistic limit, corresponding to the eigenvalue −W of the total energy, while the left-handed neutrino would have an energy eigenvalue +W.

Mass, therefore, is a measure of the correlation between states of positive and negative helicity. The energy eigenvalues, $W_\pm = \pm p$, are analogous to those of a spin system where $W_+$ and $W_-$ are the energies for the spin to align in the direction of momentum and in the direction anti-parallel to momentum, respectively. The amplitude of the correlation,

$$\mu = |\mu|e^{i2\phi} = \frac{me^{i2\phi}}{\sqrt{(W - p_z)}\cdot\sqrt{(W + p_z)}}, \tag{11.4.24}$$

is always less than, or equal to, unity by Schwarz’s inequality. It is the square-root of the ratio of the product of off-diagonal to the product of diagonal terms in the density matrix. The inequality |µ| < 1 accounts for ‘off mass-shell,’ or virtual processes, which do not conserve energy, (11.1.8). Expression (11.4.24) is a measure of correlation between states of helicity +1 and −1. The amplitude |µ| is a measure of their ‘degree of coherence,’ while $2\phi$ is a measure of their ‘effective phase difference.’

11.5 Mass in Maxwell’s Theory and Beyond

In this section we seek to generalize Maxwell’s electromagnetic theory in three directions:

(i) the introduction of the state of helicity h = 0, and how Maxwell’s equations exclude it,

(ii) the introduction of mass into these equations, and

(iii) whether a generalization of these equations can support compressional waves.

In this way we carve out a precise domain of validity of Maxwell’s relations. We begin with a simple radiation mechanism and show that even though there is an h = 0 helicity state it cannot propagate.

11.5.1 A model of radiation

As a simple model of radiation [Skilling 42] we consider a short wire of length ℓ carrying a current I sin ωt. The wire is placed at the origin such that ℓ/2 points up from the equator in the z-direction and −ℓ/2 points down in the opposite direction, as shown in Fig. 11.13. The vector potential is given as the retarded potential integrated along the wire, viz.

\[ A_z = \int_{-\ell/2}^{\ell/2} \frac{I \sin\omega(t - r)}{r}\, dz. \]


Fig. 11.13. A short vertical antenna.

Since current is flowing in the z-direction, there will be only one component of the vector potential.

If the distance at which the vector potential is measured is much greater than the length of the wire, the denominator in the integrand will remain sensibly constant during the integration. Moreover, if the length of the wire is small compared to the wavelength of the radiation, then so too will the numerator. Consequently, the integral can easily be performed with the result

\[ A_z = \frac{a}{r} \sin\omega(t - r), \]

where a = Iℓ. Transforming to spherical coordinates, this component of the vector potential will have two components: one radial,

\[ A_r = \frac{a}{r} \sin\omega(t - r)\cos\vartheta, \tag{11.5.1a} \]

and one tangential,

\[ A_\vartheta = -\frac{a}{r} \sin\omega(t - r)\sin\vartheta, \tag{11.5.1b} \]

as shown in Fig. 11.13, with a vanishing third component Aϕ = 0.


The curl of A gives the circular magnetic field, which consists of circles of constant latitude,

\[ H_\phi = \mathrm{curl}_\phi \mathbf{A} = -\frac{a\kappa}{r}\sin\vartheta\left[\frac{\sin\kappa(r - t)}{\kappa r} - \cos\kappa(r - t)\right], \tag{11.5.2} \]

where κ is the wave number. We can appreciate that the circular magnetic field, (11.5.2), is a spherical Bessel function of order 1.

The radial and tangential components of the electric field can be determined from

\[ \mathbf{E} = -\dot{\mathbf{A}} + \nabla\int (\nabla\cdot\mathbf{A})\, dt. \tag{11.5.3} \]

These components are given explicitly by

\[ E_r = -\frac{2\kappa a\cos\vartheta}{r}\left[\frac{1}{(\kappa r)^2}\cos\kappa(r - t) + \frac{1}{\kappa r}\sin\kappa(r - t)\right], \tag{11.5.4a} \]

\[ E_\vartheta = \frac{a\kappa\sin\vartheta}{r}\left[\left(1 - \frac{1}{(\kappa r)^2}\right)\cos\kappa(r - t) - \frac{1}{\kappa r}\sin\kappa(r - t)\right], \tag{11.5.4b} \]

and Eϕ = 0. The relative magnitudes of the terms in (11.5.4b) can be gleaned from their dependence on the inverse powers of r. For instance, if the second term in the first expression in (11.5.4b) is small compared to unity, the tangential component of the electric field will have the same form as that of the magnetic field, i.e. considered as a function of r, it will be a spherical Bessel function of order 1. This is precisely what Maxwell’s equations predict.

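Maxwell’s equations, in spherical coordinates, are

\[ \begin{aligned}
\dot{E}_r &= \mathrm{curl}_r \mathbf{H} = \frac{1}{r\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\, H_\phi\right), \\
\dot{E}_\vartheta &= \mathrm{curl}_\vartheta \mathbf{H} = -\frac{1}{r}\frac{\partial}{\partial r}\left(r H_\phi\right), \\
\dot{H}_\phi &= -\mathrm{curl}_\phi \mathbf{E} = -\frac{1}{r}\left(\frac{\partial}{\partial r}\left(r E_\vartheta\right) - \frac{\partial E_r}{\partial\vartheta}\right).
\end{aligned} \tag{SM} \]

Differentiating the third equation with respect to time and introducing the first two equations lead to the wave equation,

\[ \ddot{H}_\phi = \frac{1}{r}\frac{\partial^2}{\partial r^2}\left(r H_\phi\right) + \frac{1}{r^2}\frac{\partial}{\partial\vartheta}\left[\frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\, H_\phi\right)\right], \tag{11.5.5} \]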


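where the right-hand side is not exactly the Laplacian in spherical coordinates. It can be brought into that form by using Legendre’s equation for m = ±1,

\[ \frac{d}{d\vartheta}\left[\frac{1}{\sin\vartheta}\frac{d}{d\vartheta}\left(\sin\vartheta\,\Theta\right)\right] = \frac{1}{\sin\vartheta}\frac{d}{d\vartheta}\left(\sin\vartheta\frac{d\Theta}{d\vartheta}\right) - \frac{1}{\sin^2\vartheta}\Theta = -\ell(\ell + 1)\Theta, \tag{11.5.6} \]

whose solution is the spherical harmonic, Θ = Y_ℓ^{±1}. In light of Legendre’s equation, the second term in (11.5.5) can be replaced by −2/r², since ℓ = 1. For any m in the range −ℓ ≤ m ≤ ℓ, Legendre’s equation is

\[ \frac{1}{\sin\vartheta}\frac{d}{d\vartheta}\left(\sin\vartheta\frac{d\Theta}{d\vartheta}\right) + \left[\ell(\ell + 1) - \frac{m^2}{\sin^2\vartheta}\right]\Theta = 0. \tag{11.5.7} \]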
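As a quick numerical sanity check of the two forms of Legendre’s equation, the sketch below evaluates the residuals of (11.5.6) and (11.5.7) by central finite differences. It assumes the angular functions are Θ = sin ϑ for m = ±1 and Θ = cos ϑ for m = 0 (with ℓ = 1); the function names, step size and sampling grid are illustrative choices, not part of the text.

```python
import math

def derivative(f, x, h=1e-4):
    """Central finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def residual_11_5_6(theta_fn, th, ell):
    """Left side of (11.5.6) plus ell(ell+1)*Theta; zero for a solution."""
    def inner(t):
        return derivative(lambda s: math.sin(s) * theta_fn(s), t) / math.sin(t)
    return derivative(inner, th) + ell * (ell + 1) * theta_fn(th)

def residual_11_5_7(theta_fn, th, ell, m):
    """Left side of (11.5.7); zero for a solution."""
    def flux(t):
        return math.sin(t) * derivative(theta_fn, t)
    return (derivative(flux, th) / math.sin(th)
            + (ell * (ell + 1) - m * m / math.sin(th) ** 2) * theta_fn(th))

def max_residual(residual, n=40):
    """Largest |residual| on a grid that stays away from the poles."""
    grid = [0.2 + i * (math.pi - 0.4) / (n - 1) for i in range(n)]
    return max(abs(residual(th)) for th in grid)

if __name__ == "__main__":
    # sin(theta) solves the m = +-1 equation, cos(theta) the m = 0 equation.
    print(max_residual(lambda th: residual_11_5_6(math.sin, th, 1)))
    print(max_residual(lambda th: residual_11_5_7(math.cos, th, 1, 0)))
```

Both residuals come out at the level of the finite-difference error, whereas pairing cos ϑ with m = 1 does not satisfy (11.5.7).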

Maxwell’s equations contain the fact that there are only two helicities: parallel and anti-parallel to the momentum. It is for this reason that Planck was able to obtain the correct density of states of his harmonic oscillators in his study of blackbody radiation, without any knowledge of the polarization of the photon.

Assuming the circular magnetic field component, Hϕ, varies periodically in time, (11.5.5) for ℓ = 1 becomes the ‘spherical Bessel differential equation,’

\[ \left(\frac{d^2}{dr^2} + \frac{2}{r}\frac{d}{dr} - \frac{2}{r^2} + \omega^2\right)H_\phi = 0, \tag{11.5.8} \]

where the dispersion relation is κ = ω. It is important to observe that because of (11.5.6) we do not have to specify the value of m. There are two linearly independent solutions to (11.5.8). They are the spherical Bessel and Neumann functions,

\[ H_\phi(r, t) = -a\omega^2\left(j_1(\kappa r)\cos\omega t + n_1(\kappa r)\sin\omega t\right)\sin\vartheta, \tag{11.5.9} \]

where

\[ j_1(x) = \frac{\sin x}{x^2} - \frac{\cos x}{x}, \tag{11.5.10a} \]

and

\[ n_1(x) = -\frac{\cos x}{x^2} - \frac{\sin x}{x}, \tag{11.5.10b} \]

are spherical Bessel and Neumann functions of order 1, respectively.

In contrast, the radial component of the electric intensity, E_r, satisfies the reduced wave equation,

\[ \left(\frac{1}{r}\frac{\partial^2}{\partial r^2}r + \frac{1}{r^2\sin\vartheta}\frac{\partial}{\partial\vartheta}\sin\vartheta\frac{\partial}{\partial\vartheta} + \omega^2\right)E_r = 0. \tag{11.5.11} \]

On the strength of Legendre’s equation, (11.5.7) for ℓ = 1, we must choose m = 0 in order to come out with the same spherical Bessel differential equation, (11.5.8), whose solution,

\[ E_r = \frac{2a\omega}{r}\left(n_1(\kappa r)\cos\omega t - j_1(\kappa r)\sin\omega t\right)\cos\vartheta, \tag{11.5.12} \]

is again given in terms of the spherical Bessel, (11.5.10a), and Neumann, (11.5.10b), functions of order 1. In comparison to (11.5.9) it is a power higher in 1/r.
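The equivalence between the retarded-time expressions (11.5.2), (11.5.4a) and their Bessel-function forms (11.5.9), (11.5.12) can be checked numerically. The following is a minimal sketch, assuming κ = ω as in the text; the function names and the finite-difference step are illustrative, and the last helper only estimates the residual of (11.5.8).

```python
import math

def j1(x):
    """Spherical Bessel function of order 1, (11.5.10a)."""
    return math.sin(x) / x**2 - math.cos(x) / x

def n1(x):
    """Spherical Neumann function of order 1, (11.5.10b)."""
    return -math.cos(x) / x**2 - math.sin(x) / x

def h_phi_closed(a, omega, r, t, theta):
    """Circular magnetic field in the form (11.5.2), with kappa = omega."""
    k = omega
    return -(a * k / r) * math.sin(theta) * (
        math.sin(k * (r - t)) / (k * r) - math.cos(k * (r - t)))

def h_phi_bessel(a, omega, r, t, theta):
    """Circular magnetic field in the form (11.5.9)."""
    x = omega * r
    return -a * omega**2 * math.sin(theta) * (
        j1(x) * math.cos(omega * t) + n1(x) * math.sin(omega * t))

def e_r_closed(a, omega, r, t, theta):
    """Radial electric field in the form (11.5.4a), with kappa = omega."""
    k = omega
    return -(2.0 * k * a * math.cos(theta) / r) * (
        math.cos(k * (r - t)) / (k * r) ** 2 + math.sin(k * (r - t)) / (k * r))

def e_r_bessel(a, omega, r, t, theta):
    """Radial electric field in the form (11.5.12)."""
    x = omega * r
    return (2.0 * a * omega / r) * math.cos(theta) * (
        n1(x) * math.cos(omega * t) - j1(x) * math.sin(omega * t))

def bessel_residual(f, omega, r, h=1e-3):
    """Residual of (11.5.8) for the radial profile f(omega * r)."""
    g = lambda s: f(omega * s)
    d1 = (g(r + h) - g(r - h)) / (2.0 * h)
    d2 = (g(r + h) - 2.0 * g(r) + g(r - h)) / h**2
    return d2 + 2.0 * d1 / r - 2.0 * g(r) / r**2 + omega**2 * g(r)
```

The two forms agree to rounding error at any sample point, and both j1 and n1 leave a residual in (11.5.8) that is no larger than the discretization error.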

For the magnetic field component, Hϕ, we did not have to specify m; that is done by sin ϑ in (11.5.9), which makes it proportional to either the Legendre polynomial P_1^1 = −sin ϑ, or P_1^{−1} = ½ sin ϑ. Rather, for the electric field component, E_r, we had to specify the value m = 0 in order that it satisfies the spherical Bessel differential equation, (11.5.8), and this is substantiated by the fact that its solution, (11.5.12), is proportional to the Legendre polynomial P_1^0 = cos ϑ. In other words, the circular magnetic field gives the two longitudinal helicity states, m = ±1, parallel and anti-parallel to the direction of motion, while the radial component of the electric field gives the transverse helicity state, m = 0. We will now show that the radial component of the electric field cannot propagate!

Two regions need be considered: one in which λ ≫ r, and the other λ ≪ r, where λ = κ⁻¹. The former occurs near the radiating antenna. The highest-order terms in (11.5.4b) dominate, which describe an oscillating doublet, or dipole. This is the source of electromagnetic radiation. However, in the latter region where λ/r is small all higher powers of λ/r may be neglected so that, in this region far from the oscillating doublet, E_r vanishes and E_ϑ is given in terms of P_1^{±1}, just like Hϕ. These are the longitudinal helicity states of the photon. This is, yet, another example where a transverse helicity state is not related to a massive vector field.

Fig. 11.14. The configuration of electric and magnetic fields on the surface of a sphere. P is Poynting’s vector showing the direction of radiation. In any small portion, a spherical wave cannot be distinguished from a plane wave.

The tangential electric and circular magnetic force components are given by the common expression,

\[ \left.\begin{matrix} E_\vartheta \\ H_\phi \end{matrix}\right\} = \frac{a\omega}{r}\cos\kappa(r - t)\sin\vartheta, \tag{11.5.13} \]

and are mutually perpendicular, as shown in Fig. 11.14, with E_r = 0 together with Eϕ = H_r = H_ϑ = 0. Each vector varies as the inverse of the wavelength, κ = ω, and the solution describes a spherically symmetric wave traveling outward. Both components are inversely proportional to the radius, and, thus, become weaker and weaker as they travel further from the source. The lines of the circular magnetic field are parallels of constant latitude on any sphere of radius r, while those of the tangential electric field are the meridians.

The time rate of radiation is obtained by integrating Poynting’s vector over the surface S of the sphere

\[ \begin{aligned}
\int \mathbf{P}\cdot d\mathbf{S} &= \frac{1}{4\pi}\int \mathbf{E}\times\mathbf{H}\cdot d\mathbf{S} \\
&= \frac{1}{4\pi}\int_0^\pi \left[\frac{\omega a}{r}\cos\kappa(r - t)\sin\vartheta\right]^2 2\pi r^2\sin\vartheta\, d\vartheta \\
&= \frac{2}{3}\omega^2 a^2\cos^2\kappa(r - t).
\end{aligned} \tag{11.5.14} \]
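The surface integral in (11.5.14) and its time average can be reproduced numerically. The sketch below is only an illustration: it assumes κ = ω, the far-zone fields (11.5.13), and a composite Simpson rule of my own choosing; none of the function names come from the text.

```python
import math

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0

def radiated_power(a, omega, r, t):
    """Integrand of (11.5.14) integrated over the sphere of radius r."""
    amplitude = omega * a / r * math.cos(omega * (r - t))
    integrand = lambda th: ((amplitude * math.sin(th)) ** 2
                            * 2.0 * math.pi * r * r * math.sin(th))
    return simpson(integrand, 0.0, math.pi) / (4.0 * math.pi)

def time_averaged_power(a, omega, r):
    """Average of radiated_power over one period 2*pi/omega."""
    period = 2.0 * math.pi / omega
    return simpson(lambda t: radiated_power(a, omega, r, t),
                   0.0, period, n=200) / period
```

The radius drops out of `radiated_power`, which reduces to (2/3)ω²a² cos² κ(r − t), and the time average is (1/3)ω²a², the averaged value quoted for a short antenna with a = Iℓ.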


In the region where only the radiation components (11.5.13) subsist, the Poynting vector is radially directed, and since it is in the direction of E × H it is pointed outward, as shown in Fig. 11.14. The average power radiated from a small antenna with uniform current distribution, ⅓ω²I²ℓ² = ⅓ω²e²u², is identical to Larmor’s formula (4.3.13) when averaged.

However, Poynting, as well as many other authors, believed that the electric field is always parallel to the wire, assuming it is parallel to the vector potential. This belief is based on Ohm’s law J = σE, where σ is the conductivity. Although this is true inside the wire, it is not true outside the wire, where the field is nearly perpendicular to the wire [Nahin 88]. We have Heaviside [92] to thank for this observation:

. . . the transfer [of energy] . . . takes place, in the vicinity of the wire, very nearly parallel to it, with a slight slope towards the wire. . . Prof. Poynting, on the other hand (Royal Society Transactions, February 12, 1885), holds a different view, representing the transfer as nearly perpendicular to a wire, i.e. with a slight departure from the vertical. This difference of a quadrant can, I think, only arise from what seems as a misconception on his part as to the nature of the electric field in the vicinity of a wire supporting electric current.

The lines of force are nearly perpendicular to the wire. The departure from perpendicularity is usually so small that I have sometimes spoken of them as being perpendicular to it, as they practically are, before I recognized the great physical importance of the slight departure. It causes the convergence of energy into the wire.

The electric vector lies in the plane of the wire and the radius vector r. The magnetic vector is perpendicular to this plane. Because the electric vector is proportional to sin ϑ, there is no radiation in the direction of oscillation; that is, E cannot be parallel to A. Poynting would have his vector pointing inward, which compensates energy dissipation through Joule heating, but it would raise havoc with the radiation of radio waves. Although this energy compensation is true of a very small part of Poynting’s vector, the remainder of E × H is parallel to the wire outside the wire. Hence, energy propagation along the wire occurs outside the wire.

The electric and magnetic fields in (11.5.13) are both proportional to sin ϑ. This means that no radiation is emitted in the direction of oscillation, while there is maximum radiation in the direction perpendicular to the oscillating dipole. Radiation in any other direction is proportional to the sine of the angle that direction makes with the vertical z-axis along which the electric charge is oscillating. The radial component of the electric force, E_r, is proportional to cos ϑ, which corresponds to the spin normal to the direction of propagation. However, it cannot propagate.


Fig. 11.15. The polar plots of the spherical harmonics. Maxwell’s equations prohibit the middle radiation pattern.

The polar plots of the spherical harmonics Y_1^m, for m = 1, 0, −1, are shown in Fig. 11.15. If radiation could occur in the z direction, there would be maximum radiation in the direction of oscillation and zero at right angles to it. This would be the hallmark of compressible longitudinal waves. Maxwell took great care that his equations should describe an incompressible fluid, and, thus, they are incapable of describing the inductive zone. The inductive zone can be distinguished from the radiation zone, because the former varies as the inverse cube of the radius while the latter varies as the inverse of the radius. In the inductive region, the electric field and magnetic field components are due to electric charges and current, respectively. The current is encircled by the stationary magnetic field. It is a region in which electromagnetic statics applies.

A short wire connecting two metal spheres, acting as a condenser, is visualized as a dipole. The current carried along the wire alternately charges and discharges their capacitance. So in the inductive zone, where the antenna appears as a dipole, and at a distance that is short compared to a wavelength of radiation, there is a radial component of the electric field. But due to its short range it cannot propagate into the radiation zone. In the inductive zone there are large amounts of energy that are continually transforming back-and-forth between the electric and magnetic fields. The fields are strongest at the equator of any imaginary sphere of radius r, and vanish at the poles.

In the intermediary zone, where both electric field components of induction and radiation are present, they are out of phase. Only in the radiation zone are the electric and magnetic fields in phase with one another, precisely as Maxwell’s equations predict!


We now want to modify Maxwell’s equations to take into account the possibility of the existence of a longitudinal mode with m = 0. In the absence of shear, a generalized force will be given by

\[ \rho\ddot{\mathbf{G}} = \eta\nabla(\nabla\cdot\mathbf{G}) - \nu\nabla\times\nabla\times\mathbf{G}, \]

where G is any spatial displacement, η and ν are the elastic constants related to compression and rotation, respectively, and ρ is a density. We now set G equal to what Maxwell called the ‘electrokinetic momentum,’ A. Heaviside argued in favor of setting the electric force, E, equal to the velocity Ġ. Then, with the simplifications, ρ = η = ν = 1, we get the first set of generalized Maxwell’s circuital equations,

\[ \ddot{\mathbf{A}} = \nabla(\nabla\cdot\mathbf{A}) - \nabla\times\mathbf{H}. \tag{11.5.15} \]

From the second circuital equation,

\[ \nabla\times\mathbf{E} = -\dot{\mathbf{H}} = -\nabla\times\dot{\mathbf{A}}, \tag{11.5.16} \]

we see that this relation will be satisfied by E = −Ȧ.

Thus, from (11.5.1a) and (11.5.1b), we find the components of the electric vector are

\[ E_r = -\dot{A}_r = -\frac{a\omega}{r}\cos\omega(t - r)\cos\vartheta, \tag{11.5.17a} \]

and

\[ E_\vartheta = -\dot{A}_\vartheta = \frac{a\omega}{r}\cos\omega(t - r)\sin\vartheta. \tag{11.5.17b} \]

Expressing (11.5.17a) in terms of spherical Bessel and Neumann functions, we get

\[ E_r = a\omega^2\left\{n_0(\kappa r)\cos\omega t - j_0(\kappa r)\sin\omega t\right\}P_1^0, \tag{11.5.18} \]

where

\[ j_0(x) = \frac{\sin x}{x}, \tag{11.5.19a} \]

and

\[ n_0(x) = -\frac{\cos x}{x}, \tag{11.5.19b} \]

are spherical Bessel and Neumann functions of order 0.


In contrast to (11.5.12), (11.5.18) falls off as inverse distance and when squared and integrated over a surface will be a constant. Thus, (11.5.18) can propagate! From the middle diagram of Fig. 11.15, we see that radiation is being emitted in the direction of the oscillating charges. This was strictly forbidden by Maxwell’s equations, which permit only the first and third configurations, i.e. normal to the direction of the oscillating charges. However, the mismatch on the indices is sufficient to indicate that this is an artificial propagation.

Consider the power equation,

\[ \frac{1}{2}\frac{d}{dt}\left[E^2 + H^2 + (\nabla\cdot\mathbf{A})^2\right] = -\nabla\cdot\left[\mathbf{E}\times\mathbf{H} + \mathbf{E}(\nabla\cdot\mathbf{A})\right], \tag{11.5.20} \]

which is obtained by multiplying (11.5.15) by E and (11.5.16) by H, and adding them. The new term on the left-hand side of (11.5.20), ½(∇ · A)², is the energy of compression, while the new term on the right-hand side, E(∇ · A), is the momentum it creates. From the expression of the orbital angular momentum, (11.1.26), we see that E · ∇A is its corresponding linear momentum just as E(∇ · A) is the momentum due to compression by the action of a hydrostatic pressure, −∇ · A.

The radially outward Poynting vector, E × H, now has another contribution coming from the radial component of the electric vector, E_r∇ · A, since E_ϑ∇ · A vanishes on integrating over a spherical surface. Noting that

\[ \nabla\cdot\mathbf{A} = a\omega^2\left\{j_1(\kappa r)\cos\omega t + n_1(\kappa r)\sin\omega t\right\}P_1^0, \]

the additional power will be

\[ \frac{1}{4\pi}\int_0^\pi E_r\,\nabla\cdot\mathbf{A}\,(2\pi r^2\sin\vartheta)\, d\vartheta = \frac{1}{3}\omega^2 a^2\cos^2\kappa(r - t). \tag{11.5.21} \]

The power due to longitudinal waves of compression and expansion, (11.5.21), is exactly half of Poynting’s value, (11.5.14). Longitudinal wave propagation is, therefore, a less efficient means of power radiation than transverse wave propagation.
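The factor of one half comes entirely from the angular integrals. As a sketch, substituting µ = cos ϑ turns both integrals into polynomial integrals over [−1, 1] that can be done exactly with rational arithmetic. The code assumes the far-zone forms only: E_ϑH_ϕ ∝ sin² ϑ from (11.5.13), and E_r∇ · A ∝ cos² ϑ with the leading 1/r behavior of ∇ · A; the helper names are hypothetical.

```python
from fractions import Fraction

def integrate_poly(coeffs):
    """Exact integral over mu in [-1, 1] of sum(coeffs[k] * mu**k)."""
    total = Fraction(0)
    for k, c in enumerate(coeffs):
        if k % 2 == 0:  # odd powers of mu integrate to zero
            total += Fraction(c) * Fraction(2, k + 1)
    return total

def power_coefficient(angular_poly):
    """Coefficient of omega^2 a^2 cos^2 kappa(r - t).

    The prefactor (1/4pi) * 2pi = 1/2 multiplies the angular integral;
    the r^2 of the surface element cancels the 1/r^2 of the fields.
    """
    return Fraction(1, 2) * integrate_poly(angular_poly)

TRANSVERSE = [1, 0, -1]    # sin^2(theta) = 1 - mu^2
LONGITUDINAL = [0, 0, 1]   # cos^2(theta) = mu^2

if __name__ == "__main__":
    print(power_coefficient(TRANSVERSE), power_coefficient(LONGITUDINAL))
```

The transverse pattern gives the coefficient 2/3 of (11.5.14) and the longitudinal pattern the coefficient 1/3 of (11.5.21).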

One final point: The reason for splitting the wire into equal halves, one above and one below the equatorial plane in Fig. 11.13, which can be a conducting sheet, is that everything below this plane of symmetry can be eliminated. The actual antenna and conducting plane can be replaced by an isolated antenna of double length without changing the electromagnetic fields [Skilling 42]. The effect of the antenna above the conducting plane is the same as that below the plane with the current and charges being equal and opposite.

11.5.2 Enter mass: Proca’s equations

In the 1930s the Romanian physicist Alexandru Proca developed a vector meson theory of nuclear forces that was subsequently used by Yukawa to obtain a Nobel Prize for himself. What Proca did was to modify Maxwell’s equations so that they would admit a non-vanishing photon mass through the appearance of the Compton wavelength, λ_c = m⁻¹. Proca’s equations read:

\[ \begin{aligned}
\dot{\mathbf{E}} &= \nabla\times\mathbf{H} + m\mathbf{A}, & \dot{\mathbf{H}} &= -\nabla\times\mathbf{E}, \\
\nabla\cdot\mathbf{E} &= -m\varphi, & \nabla\cdot\mathbf{H} &= 0,
\end{aligned} \tag{P} \]

together with the auxiliary conditions,

\[ \begin{aligned}
\nabla\cdot\mathbf{A} &= -\dot{\varphi}, \qquad \nabla\times\mathbf{A} = m\mathbf{H}, \\
m\mathbf{E} &+ \dot{\mathbf{A}} + \nabla\varphi = 0.
\end{aligned} \tag{P′} \]

The last equation follows from taking the time derivative of the first equation on the left, introducing the first equation on the right, and observing that any field will satisfy the Klein–Gordon equation,

\[ \ddot{\mathbf{E}} = \nabla(\nabla\cdot\mathbf{E}) - \nabla\times(\nabla\times\mathbf{E}) - m^2\mathbf{E}. \tag{11.5.22} \]

The first equation in the second set is the transversality condition, which in terms of the four-vector potential, A_µ, can be expressed as ∂_µA^µ = 0.

That mass requires the presence of the potentials, φ and A, means that the energy densities and momentum will also require them. Scalar multiplication of the first of Proca’s equations by E and the second by H leads to the power density equation,

\[ \nabla\cdot(\mathbf{E}\times\mathbf{H}) + \frac{1}{2}\frac{\partial}{\partial t}(E^2 + H^2) = -\nabla\cdot(\varphi\mathbf{A}) - \frac{1}{2}\frac{\partial}{\partial t}(\varphi^2 + A^2). \tag{11.5.23} \]


Surprisingly, mass will not appear in energetic considerations, as (11.5.23) testifies. Expression (11.5.23) shows that the fields and potentials can be conserved independently,

\[ \frac{1}{2}\frac{\partial}{\partial t}(\varphi^2 + A^2) + \nabla\cdot(\varphi\mathbf{A}) = 0, \]

or if not, can be combined into the energy density,

\[ \frac{1}{8\pi}(E^2 + H^2 + \varphi^2 + A^2), \tag{11.5.24} \]

and energy flux,

\[ \frac{1}{4\pi}(\mathbf{E}\times\mathbf{H} + \varphi\mathbf{A}). \tag{11.5.25} \]

On the basis of (11.5.24) and (11.5.25) Bass and Schrödinger [55] were able to discriminate between transverse and longitudinal waves. Maxwell called A the ‘electrokinetic’ momentum, and rightly so. For a transverse wave, the contribution from A will be negligibly small so that it can be considered as an (E, H)-wave, while for a longitudinal wave, the momentum is in the direction of A almost entirely so that it can be considered a (φ, A)-wave.

Longitudinal waves are to be associated with the potentials while transverse waves with the fields.

If the fields were to vanish altogether, the components of the four-vector potential would be gradients in the direction of motion and would be ineffective to sustain wave motion since there is no longer induction. Keeping a small, but finite, electric field shows that both the scalar and vector potentials satisfy the Klein–Gordon equation, (11.5.22), which now reduces to

\[ \nabla\cdot\nabla\varphi - \ddot{\varphi} = m^2\varphi. \tag{11.5.26} \]

Although Proca’s equations are self-consistent insofar as the four-vector potential satisfies the Klein–Gordon equation, (11.5.26), they are not covariant gauge-invariant. If we introduce the definition of the magnetic field into Faraday’s equation, we can write it as

\[ \nabla\times(\mathbf{E} + m^{-1}\dot{\mathbf{A}}) = 0. \]

This can be satisfied identically by

\[ \mathbf{E} = -m^{-1}\dot{\mathbf{A}} - m^{-1}\nabla\varphi, \tag{11.5.27} \]

which is the last equation in (P′).

Now, if we change the potentials in such a way that

\[ \mathbf{A} \to \mathbf{A}' = \mathbf{A} - \nabla\Lambda, \qquad \varphi \to \varphi' = \varphi + \dot{\Lambda}, \]

for arbitrary Λ, there will be no change in (11.5.27). Moreover, if we require Λ to satisfy the wave equation,

\[ \nabla^2\Lambda = \ddot{\Lambda}, \]

then

\[ \nabla\cdot\mathbf{A} + \dot{\varphi} = \nabla\cdot\mathbf{A}' + \dot{\varphi}'. \tag{11.5.28} \]
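The step leading to (11.5.28) can be spelled out. The following is a sketch that writes the gauge function as Λ and assumes only the change of potentials A′ = A − ∇Λ, φ′ = φ + Λ̇ and the wave equation ∇²Λ = Λ̈ stated above:

```latex
\begin{aligned}
\nabla\cdot\mathbf{A}' + \dot{\varphi}'
  &= \nabla\cdot(\mathbf{A} - \nabla\Lambda) + \frac{\partial}{\partial t}\left(\varphi + \dot{\Lambda}\right) \\
  &= \nabla\cdot\mathbf{A} + \dot{\varphi} - \left(\nabla^2\Lambda - \ddot{\Lambda}\right) \\
  &= \nabla\cdot\mathbf{A} + \dot{\varphi}, \\[4pt]
\dot{\mathbf{A}}' + \nabla\varphi'
  &= \dot{\mathbf{A}} - \nabla\dot{\Lambda} + \nabla\varphi + \nabla\dot{\Lambda}
   = \dot{\mathbf{A}} + \nabla\varphi .
\end{aligned}
```

The second line shows why (11.5.27) is unchanged for arbitrary Λ, while the first requires the wave equation for Λ.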

Things which are equal but have nothing in common can only be equal if they are equal to a constant. We are free to choose this constant equal to zero, and (11.5.28) becomes the first equation in (P′), which is the Lorentz gauge. The only blemish on the Proca equations is that the gauge potential Λ does not satisfy the same wave equation as the four-vector potential. This will be remedied in Sec. 11.5.3.

If we choose a non-covariant gauge, ∇²Λ = 0, this would necessarily imply that

\[ \nabla\cdot\mathbf{A} = \nabla\cdot\mathbf{A}', \]

or that ∇ · A = 0, which is the Coulomb, or radiation, gauge. This would be counter-productive since it would ensure transverse waves. Also, Λ would not satisfy the same wave equation as the four-vector potential.

Proca’s equations, (P) and (P′), preserve the transversality condition, and, thus, can support only transverse waves. It is very enticing, and not new to electroweak theory, to associate the unused ‘third degree of freedom’ with a longitudinal mode, and with mass. Bass and Schrödinger [55] did precisely this back in 1955. While admitting that

Plane waves have only two possible states of polarization, not three, as would be expected for a vector wave (e.g. an elastic wave; remember the historical dilemma concerning the ‘elastic properties of the ether’),

they contend that

a third state of polarization, namely, a longitudinal wave, is possible for any two Maxwellian transversal waves with the same wave normal. The third wave is propagated with the same velocity; it is perfectly respectable, and remains so, however small a value we adopt for the rest-mass.

Thus, they associate the third, unused, degree of freedom of light with a longitudinal mode that would be a massive field. However, there is no reason to believe that the longitudinal and transverse modes will propagate at the same velocity since the mechanism of wave generation is completely different, and so too is what is being propagated.

Bass and Schrödinger use Proca’s equations to support their assertions. However, the transversality condition has not been affected by the introduction of mass so that a longitudinal mode of propagation is immediately ruled out. The transverse waves easily follow from the second line of (P): ∇ · H = 0 means that our plane wave traveling in the z direction has H_z = 0. The same will be true of the electric field if φ = 0 so that the Proca equations reduce to

\[ \begin{aligned}
\dot{\mathbf{E}} &= \nabla\times\mathbf{H} + m\mathbf{A}, & \dot{\mathbf{H}} &= -\nabla\times\mathbf{E}, \\
\nabla\cdot\mathbf{E} &= 0, & \nabla\cdot\mathbf{H} &= 0,
\end{aligned} \tag{PT} \]

with the auxiliary conditions,

\[ \begin{aligned}
\nabla\cdot\mathbf{A} &= 0, \qquad \nabla\times\mathbf{A} = m\mathbf{H}, \\
m\mathbf{E} &+ \dot{\mathbf{A}} = 0.
\end{aligned} \tag{P′T} \]

From the first equation in (P′T), we know that A_z = 0, so that A can either be parallel to E or H. But because of the last equation in (P′T) we set E ‖ A. We know from Sec. 11.5.1 that this cannot be the case outside of a conducting wire.

Taking the time derivative of the last equation in (P′T), and eliminating the time derivative of the first term by using the first equation in (PT) and the second equation in (P′T), give

\[ \ddot{\mathbf{A}} = -\nabla\times\nabla\times\mathbf{A} - m^2\mathbf{A}. \tag{11.5.29} \]

This is the equation of motion of an incompressible elastic solid which shows resistance to both translational and rotational motion.

To see this in greater detail, we form the power equation. Multiplying (11.5.29) through by Ȧ, we get

\[ \frac{1}{2}\frac{d}{dt}\left\{\dot{A}^2 + m^2A^2 + (\nabla\times\mathbf{A})^2\right\} = \nabla\cdot(\dot{\mathbf{A}}\times\nabla\times\mathbf{A}), \tag{11.5.30} \]

on the strength of the vector identity,

\[ \nabla\cdot(\mathbf{X}\times\mathbf{Y}) = \mathbf{Y}\cdot\mathrm{curl}\,\mathbf{X} - \mathbf{X}\cdot\mathrm{curl}\,\mathbf{Y}, \]

for any two vectors X and Y. The terms in (11.5.30) have the following significances: ½Ȧ² is the kinetic energy, ½m²A² is the potential energy,^h and ½(∇ × A)² is the energy of rotation.^i Consequently, (11.5.29) is the equation of motion of a transverse wave.

Next, Bass and Schrödinger consider longitudinal waves. Since the magnetic field will always be solenoidal, they set it equal to zero, H = 0, and the rotational energy vanishes. Now we should expect some form of compressional motion just like in (11.5.15). Let’s see. Proca’s equations, (P) and (P′), then reduce to

\[ \dot{\mathbf{E}} = m\mathbf{A}, \qquad \nabla\times\mathbf{E} = 0, \qquad \nabla\cdot\mathbf{E} = -m\varphi, \tag{PL} \]

with the auxiliary conditions,

\[ \begin{aligned}
\nabla\cdot\mathbf{A} &= -\dot{\varphi}, \qquad \nabla\times\mathbf{A} = 0, \\
m\mathbf{E} &+ \dot{\mathbf{A}} + \nabla\varphi = 0.
\end{aligned} \tag{P′L} \]

^h This is the term by which mass is introduced in the Lagrangian. But because this term is not invariant under a gauge transformation it will introduce additional terms that are linear in the four-vector that are not canceled out in the transformation of the wave function. It is for this reason that such a term is banned from the Yang–Mills Lagrangian, and recourse is made to gauge symmetry-breaking.

^i Recall that in (11.5.20) we found the compressional energy, ½(∇ · A)², due to a static pressure, −∇ · A. Now we have rotational energy ½(∇ × A)² due to angular momentum, ∇ × A.

Again, taking the time derivative of the last equation in (P′L), and eliminating the time derivatives of the first and third terms now result in

\[ \ddot{\mathbf{A}} = \nabla(\nabla\cdot\mathbf{A}) - m^2\mathbf{A}. \tag{11.5.31} \]

This is the equation of a compressible elastic solid which offers resistance to translation. That is, A, and also E, are polar, which are propagated without magnetic force. So we conclude, along with Heaviside: “This makes longitudinal electric waves.”

The scalar potential, φ, satisfies the same equation as (11.5.31), except that the first term on the right-hand side is ∇²φ, so this does not tell us anything about the nature of the elastic solid. Again we form the power equation by multiplying (11.5.31) by Ȧ. We then obtain

\[ \frac{1}{2}\frac{d}{dt}\left\{\dot{A}^2 + m^2A^2 + (\nabla\cdot\mathbf{A})^2\right\} = \nabla\cdot\left[\dot{\mathbf{A}}(\nabla\cdot\mathbf{A})\right], \tag{11.5.32} \]

on the strength of the vector identity,

\[ \mathrm{div}(c\mathbf{X}) = c\,\mathrm{div}\,\mathbf{X} + \mathbf{X}\cdot\mathrm{grad}\,c, \]

for any scalar c and vector X. From (11.5.32) it is clear that ½Ȧ² is the kinetic energy, ½m²A² is the potential energy, ½(∇ · A)² is the energy of compression, and −Ȧ(∇ · A) is the energy flux density just like in (11.5.20). Hence, (11.5.31) is a longitudinal wave, but we did not need Proca’s equations to get it.

If we do not set φ = 0 to get transverse waves, or H = 0 to get longitudinal waves, the entire set of Proca’s equations gives the Klein–Gordon equation, (11.5.22). The power equation will contain both the energies of rotation and compression, but with equal elastic constants. Equal elastic coefficients allow the first two terms on the right-hand side of (11.5.22) to be combined into a single term, the Laplacian. And the resulting wave is still transverse, exactly as Maxwell predicts. The vanishing of the scalar field converts the Lorentz gauge into the Coulomb gauge and ensures that A will be solenoidal. Alternatively, if H is polar, or vanishes, the circuital equations are broken, and only longitudinal waves persist.


All these conclusions hold independently of whether the mass m = 0 or not! For both types of waves, the effect of the presence of mass is to introduce a potential energy term, with m playing the role of a spring constant. In regard to the wave equations, (11.5.29) and (11.5.31), the effect of this term is to introduce dispersion so that the group and phase velocities will not be equal.
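To make the dispersion concrete, the sketch below assumes a plane wave A ∝ exp i(k · x − ωt) in units where the speed of light is 1. Substituted into (11.5.29) with k · A = 0, or into (11.5.31) with A parallel to k, it gives the same relation ω² = k² + m². The function names are illustrative only.

```python
import math

def omega_of_k(k, m):
    """Angular frequency from the assumed relation omega^2 = k^2 + m^2."""
    return math.sqrt(k * k + m * m)

def phase_velocity(k, m):
    """omega / k, which exceeds 1 whenever m is non-zero."""
    return omega_of_k(k, m) / k

def group_velocity(k, m):
    """d(omega)/dk = k / omega, which is below 1 whenever m is non-zero."""
    return k / omega_of_k(k, m)

if __name__ == "__main__":
    for m in (0.0, 1.5):
        print(m, phase_velocity(2.0, m), group_velocity(2.0, m))
```

For m = 0 the two velocities coincide; for m ≠ 0 they separate while their product stays equal to 1, and the relation is the same for both wave types, in line with the statement that mass does not single out the longitudinal mode.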

The presence of mass has nothing to do with the existence of longitudinal waves. However, in contrast to Maxwell’s equations, the presence of mass requires the potentials, and not just the fields. If mass required a longitudinal mode it could not be polarized; only transverse waves are polarizable.

The weak point in the Proca equations, (P), is the expression for the divergence of the electric field. Instead of setting it equal to the charge density, as Gauss would have done, it is set equal to the scalar field. When the latter vanishes, it makes both E and A solenoidal. The electric field will nevertheless be solenoidal anyway far from electric charges. The presence of φ is required when we create longitudinal waves.

From what has been said in Sec. 11.1.4, we have no reason to believe that transverse and longitudinal waves will propagate at the same speed. Since we know that the transverse waves propagate at the speed of light, longitudinal waves will either propagate slower or faster. If G is any generalized displacement, the most general form of the force due to shear, compression, and rotation is

\[ \mathbf{F} = \xi\left[\nabla^2\mathbf{G} + \tfrac{1}{3}\nabla(\nabla\cdot\mathbf{G})\right] + \eta\nabla(\nabla\cdot\mathbf{G}) - \nu\nabla\times(\nabla\times\mathbf{G}), \tag{11.5.33} \]

where ξ is the rigidity, η the compressive resistivity, and ν the elastic constant related to rotation. Using the vector identity, curl² = ∇div − ∇², the force may be written as

\[ \mathbf{F} = (\xi + \nu)\nabla^2\mathbf{G} + \left(\eta + \tfrac{1}{3}\xi - \nu\right)\nabla(\nabla\cdot\mathbf{G}). \]
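The reduction of (11.5.33) can be checked on a plane wave. The sketch below assumes G = g exp(i k · x), so that each ∇ becomes i k, and takes unit density; it compares the two forms of the force and extracts the squared wave speed for a given polarization g. All helper names are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def scale(c, u):
    return tuple(c * a for a in u)

def add(*vectors):
    return tuple(sum(parts) for parts in zip(*vectors))

def force_11_5_33(xi, eta, nu, k, g):
    """Amplitude of (11.5.33) for G = g exp(i k.x)."""
    lap = scale(-dot(k, k), g)                      # nabla^2 G
    grad_div = scale(-dot(k, g), k)                 # grad(div G)
    curl_curl = scale(-1.0, cross(k, cross(k, g)))  # curl(curl G)
    return add(scale(xi, add(lap, scale(1.0 / 3.0, grad_div))),
               scale(eta, grad_div),
               scale(-nu, curl_curl))

def force_reduced(xi, eta, nu, k, g):
    """Amplitude of the reduced form that follows (11.5.33)."""
    lap = scale(-dot(k, k), g)
    grad_div = scale(-dot(k, g), k)
    return add(scale(xi + nu, lap), scale(eta + xi / 3.0 - nu, grad_div))

def squared_speed(xi, eta, nu, k, g):
    """omega^2 / k^2 for polarization g, assuming unit density."""
    f = force_reduced(xi, eta, nu, k, g)
    return -dot(f, g) / (dot(k, k) * dot(g, g))
```

With shear neglected (ξ = 0), a polarization along k gives a squared speed of η and a polarization normal to k gives ν, so the longitudinal wave is the faster one whenever η > ν.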

Neglecting shear, and with ν = η the compressibility vanishes, bringing us back to Maxwell’s theory. In general, η can take on all values from 0 to ∞, and since the speed of propagation will be proportional to √η, longitudinal waves will, in general, propagate faster than transverse waves. This is the conclusion Heaviside reached, and he was no stranger to tachyons, or particles that travel faster than the speed of light.

Writing in 1898 (7 years before special relativity, if you mark its birth with Einstein’s 1905 paper) Heaviside [99, Appendix G] remarked that Searle, J. J. Thomson, and FitzGerald all considered that no charged body can travel faster than the speed of light. This is because the energy of a charged body is infinite at the speed of light, “and since this energy must be derived from an external source, an infinite amount of work must be done, that is, an infinite resistance will be experienced.” One way of proving this a ‘fallacy’ is to consider two oppositely charged bodies, both moving at the speed of light, so that “the infinity disappears, and there you are, with finite energy when moving at the speed of light.” Heaviside was considering electromagnetic energy, and not the total mechanical energy, which relativity theory asserts is true for charged, as well as uncharged, matter.

The lack of distinction between the two energies would have troubled Maxwell deeply. For Maxwell reasoned, as we have seen in Sec. 3.8.1, that it takes energy to overcome the repulsion when two like charges are brought together. This energy goes “into the field,” giving it a positive energy density. But two neutral masses attract one another so that it takes energy to keep them apart, and this would mean that there would be a negative energy density in the field. This so worried Maxwell that he gave up all hope of including gravity as a field theory. This, too, troubled Heaviside, for he wrote in July 1893:

To form any notion at all of the flux of gravitational energy, we must first localize the energy. In this respect it resembles the legendary hare in the cookery book. Whether this notion will turn out to be useful is a matter for subsequent discovery. For this, also, there is a well-known gastronomical analogy.

By making all matter obey relativity, any reasoning of this type becomes completely sterile, together with the notion of how energy is stored in the field.

11.5.3 Proca’s approach to superconductivity

We can remedy the fact that any arbitrary gauge in Proca’s equations satisfies the wave equation instead of the Klein–Gordon equation by replacing A by A − ∇�, and φ by φ + ∂�/∂t everywhere in Proca’s equations. We then obtain

$$\dot{E} = \nabla\times H + m(A - \nabla �), \qquad \dot{H} = -\nabla\times E,$$
$$\nabla\cdot E = -m(\phi + \partial_t �), \qquad \nabla\cdot H = 0, \tag{L}$$

with the auxiliary conditions,

$$\nabla\cdot A = -\dot{\phi} + m^2 �, \qquad \nabla\times A = mH, \qquad mE + \dot{A} + \nabla\phi = 0. \tag{L′}$$

Now all potentials and fields will satisfy the Klein–Gordon equation, (11.5.22).

The current,

$$J = m(\nabla � - A), \tag{11.5.34}$$

and charge density,

$$\rho = -m(\phi + \partial_t �), \tag{11.5.35}$$

satisfy the continuity equation

$$\dot{\rho} = -\nabla\cdot J, \tag{11.5.36}$$

which is none other than the first equation in (L′). The new potential, �, in this gauge has the significance of an internal, as opposed to the external potential, A, in the Meissner effect. Quantum mechanics gets projected onto the macroscopic stage when electrons interacting with the lattice produce attractive forces between themselves. When the electron energies are sufficiently small, this attractive force induced by lattice interactions is sufficient to overcome their Coulomb repulsion. Pairs of electrons with their spins in opposite directions lock together to form a spin-0 boson with double negative charge. These ‘Cooper’ pairs have an enormously large effective size, about 10⁻⁴ cm, due to their very weak binding. Hence, these Cooper pairs will overlap with other Cooper pairs producing a state of coherence due to the locking together of the phases of their wave functions. Instead of dealing with 10⁶ pairs, the current of the superconductor acts as if it were a single, free particle.

The Meissner effect results when an external magnetic field interacts with the Cooper pairs. When the magnetic field penetrates into the superconductor it will create a current resulting in the flow of Cooper pairs. This current, in turn, will generate its own magnetic field so as to oppose the external field. However, since the nullification is not exact, there will be a small magnetic field that seeps into the superconductor, decreasing exponentially with distance.

When the quantum mechanical flux,

$$J = \frac{1}{2mi}\left(\psi^{*}\nabla\psi - \psi\nabla\psi^{*}\right) - \frac{e^2}{m}\,\psi^{*}\psi A,$$

is evaluated by a wave function of the form ψ ∼ exp(iα�), where α = e², the fine-structure constant, we get a current density of the form (11.5.34),

$$J = \frac{e^2}{m}(\nabla � - A), \tag{11.5.37}$$

except that the coefficient is inversely proportional to the mass, whereas it is proportional to it in (11.5.34). Since the Meissner effect is time-independent, the Coulomb gauge, ∇ · A = 0, is applied to the continuity equation, ∇ · J = 0.ʲ Applying this to (11.5.37) requires ∇²� = 0, which means ∇� is constant. If we take this constant to be zero, we come out with London’s equation,

$$J = -mA, \tag{11.5.38}$$

which is the hallmark of superconductivity.

The phase, �, is an internal phase that depends only on the properties of the superconductor. On the contrary, the magnetic field is an applied external field. The flux, (11.5.37), is the difference between the internal momentum, ∇�, and the external momentum A. The vanishing of the internal momentum, ∇� = 0, is precisely the condition for the onset of superconductivity, described by London’s equation, (11.5.38). Hence,

ʲ It is amusing that applying the Coulomb gauge in (L′) gives φ̇ = m²�. Introducing this into the expression for the charge density, (11.5.35), results in a field equation for a forced harmonic oscillator, φ̈ + m²φ = −mρ. So it would appear that the charge density makes the gravitational field oscillate at a frequency inverse to the Compton length.


� in the modified Proca equations (L) and (L′) has the physical significance of an internal field that is created when the external electromagnetic fields act on a continuous medium, like a composite system of electrons interacting with a lattice so that they are bound together in Cooper pairs.

This medium has many properties of the aether, even at relatively short distances. From now on we will omit the internal phase �.

There is a great deal of folklore connecting the Meissner effect with spontaneous symmetry-breaking in the electroweak interaction [Gottfried & Weisskopf 86]. The claims are:

(i) Within a superconductor the frequency of any disturbance must exceed a certain threshold and thus correspond to a quantum of energy ω₀ having a finite mass.

(ii) Not only transverse waves, but also longitudinal ones can propagate in a superconductor, and it is the longitudinal waves that carry mass.

(iii) The existence of a threshold, ω₀, and longitudinal fields are both related to the helicity state h = 0.

We will now address these points.

Although the definition of the magnetic field in terms of the vector potential leads to the perplexity of its depending upon a finite mass, as we have already mentioned, the really suspicious equation is the modified Gauss law, which is given on the left-hand side of the second line in (P). It is analogous to the potential term in the first set of Maxwell’s equations that was introduced by Helmholtz, as we will discuss in Sec. 11.5.5 below.

For if we equate it to Gauss’s law, we get

$$\rho = \nabla\cdot E = -m\phi. \tag{11.5.39}$$

This is certainly not the solution (11.1.17) to Poisson’s equation, (11.1.14), with c = ρ. Moreover, if we take the divergence of the last equation in (P′), and use the Coulomb gauge we get

$$(\nabla^2 - m^2)\phi = 0, \tag{11.5.40}$$


which cannot admit a plane wave solution. This is because the four-vector, (k, ω), is time-like, ω ≥ |k|. We can transform the space part to zero, but not the time part.

If we take the divergence of the first equation on the left-hand side in (P), and use Gauss’s law, the first equality in (11.5.39), we come out with the continuity equation (11.5.36), where the flux is given by London’s equation, (11.5.38). Taking the curl of the first equation in (P), we get

$$\nabla^2 H - \ddot{H} = m^2 H, \tag{11.5.41}$$

on the strength of the last equation in (P), and the second equation on the first line of (P′). This shows that the magnetic field satisfies the Klein–Gordon equation. But, (11.5.41) does not describe the static Meissner effect: Static magnetic fields cannot penetrate into a superconductor beyond a layer of thickness ∼ m⁻¹ = λ_c, the Compton wavelength. The persistent current, (11.5.38), is also confined to this layer.
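To make the penetration layer concrete, the following sketch takes the static limit of (11.5.41) for a field that varies only with the depth x into the superconductor, H″ = m²H, and checks numerically that H₀e^{−mx} solves it. The one-dimensional geometry, the function names, and the value of m are assumptions chosen for illustration.

```python
import math

def static_field(x, h0=1.0, m=1.0):
    """Field at depth x inside a half-space superconductor: h0 * exp(-m x)."""
    return h0 * math.exp(-m * x)

def residual(x, m=1.0, dx=1e-3):
    """Central-difference estimate of H'' - m^2 H, which should vanish."""
    second = (static_field(x + dx, m=m) - 2.0 * static_field(x, m=m)
              + static_field(x - dx, m=m)) / dx**2
    return second - m**2 * static_field(x, m=m)

m = 2.0                             # hypothetical inverse penetration depth
print(residual(0.5, m=m))           # close to zero, up to discretization error
print(static_field(1.0 / m, m=m))   # field has dropped to 1/e at depth 1/m
```

The field falls to 1/e of its surface value at the depth 1/m, which is the sense in which the layer has thickness ∼ m⁻¹.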

Can we convert (11.5.41) into the stationary equation,

$$(\nabla^2 - m^2)H = 0? \tag{11.5.42}$$

The dispersion relation corresponding to (11.5.41),

$$\omega^2 = \kappa^2 + m^2, \tag{11.5.43}$$

says that the four-vector (κ, ω) is time-like. Hence, its space part κ can be transformed to zero giving us a quantum of mass ω = m. But, its time part cannot be transformed to zero. However, if H remains steady in time, E is polar from the second equation on the first line of (P). If H remains steady in time, so too must A, and nothing can propagate. With H solenoidal, ∇²H = −curl²H, but with E polar, ∇²E = grad div E, the latter too will satisfy (11.5.42), so that both fields will decay exponentially, as we now show.
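The statement that the space part of (κ, ω) can be transformed away, while the time part cannot, may be checked with a one-dimensional Lorentz boost. The sketch below is only an illustration: units with c = 1, the helper `boost`, and the numerical values are assumptions.

```python
import math

def boost(kappa, omega, v):
    """Lorentz boost (c = 1) of the wave four-vector (kappa, omega) along kappa."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (kappa - v * omega), gamma * (omega - v * kappa)

m, kappa = 1.0, 0.75                  # hypothetical mass and wave number
omega = math.sqrt(kappa**2 + m**2)    # dispersion relation (11.5.43)
v_group = kappa / omega               # d(omega)/d(kappa), always below 1

# Boosting with the group velocity removes the space part and leaves omega = m.
kappa_rest, omega_rest = boost(kappa, omega, v_group)
print(kappa_rest, omega_rest)
```

Because ω > |κ| for any m ≠ 0, the combination ω − vκ stays positive for every admissible boost |v| < 1, so no frame exists in which the time part vanishes.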

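Rayleigh, in his second volume of The Theory of Sound, tells us how to solve this equation. Expressing ∇² in polar coordinates, he finds:

$$\nabla^2\,\frac{e^{-mr}}{r} = \left(\frac{d^2}{dr^2} + \frac{2}{r}\frac{d}{dr}\right)\frac{e^{-mr}}{r} = \frac{1}{r}\frac{d^2}{dr^2}\left(r\cdot\frac{e^{-mr}}{r}\right) = m^2\,\frac{e^{-mr}}{r}.$$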
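Rayleigh’s identity can be confirmed numerically away from the origin. The sketch below evaluates the radial Laplacian in the form (1/r) d²(rf)/dr² by central differences; the helper names and the sample values of m and r are assumptions.

```python
import math

def yukawa(r, m):
    """The kernel exp(-m r) / r."""
    return math.exp(-m * r) / r

def radial_laplacian(f, r, h=1e-3):
    """Radial Laplacian (1/r) d^2(r f)/dr^2 by central differences, for r > 0."""
    g = lambda s: s * f(s)
    return (g(r + h) - 2.0 * g(r) + g(r - h)) / (h * h * r)

m, r = 1.5, 2.0   # hypothetical mass and radius
lhs = radial_laplacian(lambda s: yukawa(s, m), r)
rhs = m**2 * yukawa(r, m)
print(lhs, rhs)   # the two agree up to discretization error
```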


If J is an impressed electric current in a conductor, the wave equation (11.5.42) becomes

$$(\nabla^2 - m^2)H = -\nabla\times J,$$

whose solution can be written as

$$H = \int \frac{e^{-mr}\,\nabla\times J}{r}\,dV. \tag{11.5.44}$$

Such solutions were well-known long before Yukawa applied them to limit the range of the nuclear binding forces that are mediated by the exchange of a new quantum, which he drew from the analogy with the photon. However, the range of the electromagnetic force is infinite, making the photon massless, while nuclear forces were known to be short range, and in order for their range to be less than one fermi, the mass of the mediating particle had to be greater than 200 MeV, which is not far from the 140 MeV of the known π-meson.
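The order of magnitude follows from identifying the range with the Compton length ħ/mc of the mediator. A minimal sketch, assuming ħc ≈ 197.327 MeV fm and using function names invented for this illustration:

```python
HBAR_C_MEV_FM = 197.327  # hbar * c in MeV fm

def mediator_mass_mev(r_fm):
    """Rest energy m c^2 (MeV) of a mediator whose range hbar / (m c) is r_fm."""
    return HBAR_C_MEV_FM / r_fm

def range_fm(mass_mev):
    """Range hbar / (m c) in fermi for a mediator of rest energy mass_mev."""
    return HBAR_C_MEV_FM / mass_mev

print(mediator_mass_mev(1.0))  # about 197 MeV for a range of one fermi
print(range_fm(140.0))         # about 1.4 fm for a 140 MeV pion
```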

Heaviside referred to (11.1.17) as ‘pot,’ and (11.5.44) as ‘pan,’ although he was not referring to pots and pans. In his own words

. . . pot means “potential,” or “the potential of,” and has no more to do with kettle than the trigonometrical sin has to do with the unmentionable one.

According to Heaviside, m⁻¹ would not be related to the range of the potential; rather, its inverse would represent the space derivative, d/d(ut), which transforms J(t) to J(t − r/u), thereby making (11.5.44) a retarded potential.

Spontaneous symmetry-breaking in the electroweak interaction now exploits Lorentz invariance to show that H must satisfy the Klein–Gordon equation. Admittedly, the constraint of Lorentz invariance is too simplistic for a superconductor since the ions in the conductor supposedly select out a preferred, time-independent, frame. But, it is argued [Gottfried & Weisskopf 86], that since electroweak theory must be Lorentz-invariant, so the electric and magnetic fields must propagate, albeit only above a threshold frequency, because, now, the photons have ‘acquired mass.’

Since the dispersion equation of the Klein–Gordon equation is (11.5.43), the quanta of the field have mass m. Gottfried and Weisskopf argue that

As these fields are vectorial, these quanta are conventional spin-1 bosons with helicities h = ±1 and 0. The “lost” degree of freedom [the phase] has reappeared as the longitudinal mode with h = 0 . . . The phase is not an independent degree of freedom of the electrons; in the superconducting state it is the longitudinal degree of freedom of the electromagnetic field.

This, however, is only wishful thinking for if E and H both satisfy the Klein–Gordon equation, the waves are transverse.

Quanta with finite mass can propagate transversally as well as longitudinally. But, for the latter to take place H must remain steady in time, or vanish altogether. Only in this case will E be polar, and there will be longitudinal electric waves. Arguments related to Lorentz invariance are irrelevant since Lorentz invariance applies to all fields or to none.

For a phase to represent a longitudinal mode requires an act of faith! It is as Heaviside says:

What happens in an unbounded non-conducting uniform medium is that the circuital E and H make Maxwellian waves which go out to infinity, whilst the polar part of E makes longitudinal waves, which also go out to infinity. Nothing is left behind.

Nothing is left behind if H = 0, but if H is steady and (11.5.42) applies, then there is a fixed, permanent magnetic field. But, this would imply longitudinal electric waves, which no one has ever seen.

There is the perennial argument as to which pair of fields (A, φ) or (E, H) is more fundamental. Maxwell referred to A as the ‘electrokinetic momentum,’ which “may even be called the fundamental quantity in the theory of electromagnetism.” Hertz and Heaviside disagreed. And Heaviside even went so far as to express his desire to “murder” Maxwell’s “monster.” We have seen that Proca’s equations lead at once to London’s equation, (11.5.38), and the Yukawa equation (11.5.42), provided the magnetic field remains steady. Otherwise, (11.5.41) will not reduce to it.

If we consider the pair (A, φ) as fundamental, in order to get transverse waves φ must be steady, while in order to get longitudinal waves H must vanish. This is contained in the first two equations of (P′). Rather, if we consider the pair (E, H) to be fundamental, we need φ to vanish, while in order to get longitudinal waves H must remain steady. Thus, the question of whether the waves are transverse or longitudinal lies in the nature of φ and H. The crucial point, therefore, is not which pair is more fundamental, but, rather, the nature of the magnetic field.

If H propagates along with E we have induction and the waves are transverse; if H is steady, or vanishes, E is left to propagate alone and the waves are necessarily longitudinal.

Maxwell’s equations, on the other hand, are less clear-cut, for they allow the coexistence of ∇ · A and ∇ × A. But (and this is a big ‘but’) they must contribute equally so that their difference is the Laplacian. If we can get rid of the potentials, then (1/8π)(E² + H²) and (1/4π)(E × H) are the electromagnetic energy density and momentum, respectively. If we get rid of the fields, then ½(φ² + A²) and φA are the energy density and momentum, respectively. However, if the Proca equations (P) and (P′) hold, the field energy density,

$$\frac{1}{8\pi}\left\{E^2 + m^{-2}(\nabla\cdot E)^2 + A^2 + m^{-2}(\nabla\times A)^2\right\}, \tag{11.5.45}$$

and momentum density,

$$E\times(\nabla\times A) = (\nabla A)\cdot E - E\cdot(\nabla A), \tag{11.5.46}$$

contain only the fields E and A. The energy density of E has a compressional contribution, while that of A has a rotational contribution.

The linear momentum, (11.5.46), may not look like much, but its moment, or the (total) angular momentum, is most suggestive. Taking the moment of the terms gives

$$r\times(\nabla A)\cdot E + E\cdot(\nabla A)\times r = r\times(\nabla A)\cdot E + \nabla\cdot(EA\times r) + E\cdot(\nabla r)\times A + r\times A(\nabla\cdot E).$$

Now, whatever the form of the divergence of E, whether it be given by Gauss’s law, or vanish for a solenoidal field, or be given by Proca’s equation in terms of the scalar potential, it cancels out in the above formula. Since ∇r = 1, the unit dyadic, and we can neglect the divergence since its integral vanishes by assumption, the total angular momentum density is found to be

$$J = r\times(\nabla A)\cdot E + E\times A \equiv L + S, \tag{11.5.47}$$

where L and S are the ‘orbital’ and ‘spin’ angular momenta densities, respectively. We have come across the latter in (11.1.23). The total angular momentum, (11.5.47), is expressed solely in terms of E and A.

We may thus consider the pair E and A as fundamental, which are parallel in free space, but out of phase, and the other pair, φ and H, as the conditioning potential and field. Moreover, given (11.5.41) we can easily transform the space part to zero, so that (11.5.43) reduces to ω = ±m. This is on account of the fact that (11.5.43) guarantees that (κ, ω) is a time-like four-vector. More drastic action is required to transform the time part to equal zero, i.e. assume H remains steady so as to produce longitudinal waves.

By adding a vector source to (11.5.41),

$$(\nabla^2 - \kappa^2)H = -\nabla\times J, \tag{11.5.48}$$

where κ² = −(∂²/∂t² + m²), we can again write the solution in the form of Yukawa’s potential, (11.5.44); only this time we have

$$H = \operatorname{pan}\,\operatorname{curl} J = \int \frac{e^{-\kappa r}\,\nabla\times J}{r}\,dV. \tag{11.5.49}$$

Independent of what κ represents, (11.5.49) says that the magnetic force is the pan-potential of the curl of the impressed current. Since pan curl = curl pan we can write (11.5.49) in terms of the vector potential,

$$A = \int \frac{e^{-\kappa r}\,J}{r}\,dV. \tag{11.5.50}$$

If the propagation is isotropic and dispersionless, m = 0, and the value of the source J is taken to be the value of the impressed current at time t − r, then (11.5.50) is the retarded potential. This was already appreciated by Heaviside before the turn of the last century.ᵏ But, what Heaviside failed to appreciate was that in the presence of dispersion, m ≠ 0, the range of the retarded potential is curtailed, as a comparison of (11.1.16) and (11.5.50) readily shows. That was Yukawa’s contribution: Massive mediators of vector meson interactions have a finite range.

ᵏ As we know from Sec. 4.1.1, Eq. (4.1.10), Liénard and slightly later Wiechert also introduced retarded potentials about the same time.

Although a threshold is needed, the effects of such symmetry-breaking are irrelevant if the threshold is high enough. And although this is the universally accepted interpretation, it would mean we expect a symmetry-breaking anytime there is a threshold that separates thermal from non-thermal radiation. Should such a transition be accompanied by symmetry-breaking or by a longitudinal component? Bass and Schrödinger [55] expressed this in the following terms:

If these L-[longitudinal] waves contributed to the heating and pressure effects of black-body, we should expect the constant of Stefan’s law, the constant in front of Planck’s formula, and the measured radiation pressure to be 3/2 times the values we actually find for them. Our actual findings might thus be construed to indicate that we are faced with the limiting case of rest-mass zero.

But this would be a poor and, so we believe, a wrong solution to the dilemma. In a reasonable theory we cannot admit even hypothetically that a certain type of modification of Maxwell’s equations, however small, would produce the above grossly discontinuous changes. Even if we had it ‘from the horse’s mouth’ that in Nature the limiting case is realized, we should still feel the urge to adumbrate a theory which agrees with experience on approaching to the limit, not by a sudden jump at the limit.
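The factor 3/2 in the passage just quoted is simply the ratio of three polarization states to two. As a hedged illustration, the sketch below writes Stefan’s constant as g π²k⁴/(120 ħ³c²) for g polarization states in the massless limit; the function name and the rounded value of ħ are assumptions of this sketch.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
HBAR = 1.054571817e-34   # reduced Planck constant, J s (rounded)
C = 299792458.0          # speed of light, m/s

def stefan_constant(polarizations):
    """Stefan's constant, W m^-2 K^-4, for the given number of polarization states."""
    return polarizations * math.pi**2 * K_B**4 / (120.0 * HBAR**3 * C**2)

two = stefan_constant(2)    # transverse modes only: the measured value
three = stefan_constant(3)  # transverse plus longitudinal modes
print(two, three / two)
```

With g = 2 the expression reproduces the measured constant, about 5.67 × 10⁻⁸ W m⁻² K⁻⁴, and the jump to g = 3 is discontinuous no matter how small the rest mass, which is the point of the objection.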

There are echoes here of how Schrödinger took the continuous limit of a finite-difference equation to arrive at his continuous wave equation [Lavenda 00], although, fourteen years before his discovery, he had argued that Nature never goes to the limit [cf. footnote n of this chapter]. So Schrödinger would not look too kindly on a mechanism in which there is a “sudden jump at the limit.” There is more than one way to ‘skin a cat,’ and the analogy with the Meissner effect in order to generate mass is not the most aesthetically pleasing one.

In the early 1970’s it was believed that the dispersion relation (11.5.43) could place bounds on the mass of the photon [Jackson 75]. If we write it as

$$\omega^2 = \omega_0^2 + m^2, \tag{11.5.51}$$

where ω₀ is the frequency of a lumped LC circuit, then the smaller the frequency ω₀, the larger will be the fractional difference between ω and ω₀, so that it would provide a limit for the photon mass. The idea was to measure the resonant frequencies of a series of circuits whose frequencies ω₀ are in known ratios. If the observed frequencies ω were not in the same ratio, it would be evidence for a finite m in (11.5.51).

Two circuits were compared: one with an inductance, L, and capacitance, C, and another circuit with the same L, but two capacitances, C, in parallel. In the first circuit, ω₀² = 1/LC, while in the second ω₀² = 1/2LC, so that the squares of the observed frequencies of the two circuits would be in the ratio 2 : 1 to within experimental error, having corrected for resistance effects. Thus, an upper limit to the photon mass could be inferred.
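The logic of the proposed test can be sketched numerically, on the assumption that each observed frequency obeys (11.5.51). The function name and the values of L, C and m (in arbitrary consistent units) are hypothetical.

```python
def observed_ratio(L, C, m):
    """Ratio of squared observed frequencies, circuit (L, C) over circuit (L, 2C).

    Each observed frequency is assumed to obey (11.5.51):
    omega^2 = omega0^2 + m^2, with omega0^2 = 1 / (L * C_total).
    """
    w1_sq = 1.0 / (L * C) + m**2
    w2_sq = 1.0 / (L * 2.0 * C) + m**2
    return w1_sq / w2_sq

print(observed_ratio(1.0, 1.0, 0.0))   # the ideal ratio 2 when m = 0
print(observed_ratio(1.0, 1.0, 0.1))   # falls below 2 when m != 0
```

A measured ratio equal to 2 within experimental error would thus bound m from above, provided (11.5.51) really applied to a lumped circuit.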

The fly in the ointment is the observation that any lumped circuit is incapable of setting limits on the photon mass [Jackson 75]. A two-terminal box has a current, I, at one terminal and a voltage, V, between the terminals given by I = C dV/dt. A lumped inductance two-terminal box has a voltage V = −L dI/dt. When two such boxes are connected they have a common I and V so that inserting the latter into the former gives I = −LC d²I/dt². The only thing that can be surmised from the combined system is the resonant frequency ω₀ = 1/√(LC), and nothing more.

11.5.4 Phase and mass

Mass can also creep into Maxwell’s equations via phase relations. We return to the Proca equations in the case of transverse waves. Setting φ = 0 and assuming that all fields can be expressed as plane waves with frequency ω and wave number κ, the last relation in (P′) is

$$E = -\frac{i\omega}{m}A. \tag{11.5.52}$$

Now, by Ohm’s law, E is proportional to the current J, at least if the space is isotropic. A is parallel to J because that is what creates it. So E ‖ A, at least in empty space [cf. Sec. 11.5.1 where this is not true]. But, (11.5.52) tells us something more. Because of the i, the two fields will be out of phase by π/2. When this is substituted into Proca’s modification of the first circuital equation, we get

$$\varepsilon_0\dot{E} = \nabla\times H + \frac{im^2}{\omega}E. \tag{11.5.53}$$


Were it not for the imaginary factor, the last term in (11.5.53) might be construed as a constitutive relation for the field current, J, e.g. a form of ohmic relation. Phase information is obtained from curls and cross products. Multiplication by i simply means that a right-hand rotation has been performed, a rotation of x → y about z, say. And since the curl and cross product are subject to the right-hand screw rule, we can interpret X × Y as the complex number ±iXY.

It would, therefore, make more sense to write the current as an additional curl of the magnetic field. In this way mass would enter as phase information. However, this would make Maxwell’s equations lopsided. If a current is added to the second of Maxwell’s equations, viz.

$$\mu_0\dot{H} = -\nabla\times E - J_m, \tag{11.5.54}$$

it would require the auxiliary relation µ₀∇ · H = ρ_m in order that the continuity equation be satisfied.

But, what would happen if the permeability, µ₀, were to vary in space? The permeability is analogous to inertia in Maxwell’s equations, and if it were to vary it would be equivalent to a varying index of refraction. Maintaining the solenoidal character of H, the divergence of (11.5.54) would lead to the continuity equation,

$$\dot{\rho}_m = -\nabla\cdot J_m,$$

where the magnetic charge density is ρ_m = ∇µ₀ · H, corresponding to magnetic charges, or magnetons in Heaviside’s terminology.

The restoration of symmetry in Maxwell’s circuit equations comes, however, at the cost of introducing a magnetic charge density, ρ_m, and its accompanying current density, J_m. Heaviside even went so far as to attribute the name ‘duplex equations’ to the symmetrized set, and said that ρ_m = 0 was merely an ‘experimental input.’ The concept of a magnetic pole is not repugnant in itself; Maxwell and both his predecessors and followers used it freely. It is like the Faraday tube, where it was its relation to the aether that led to its demise. Recall from Sec. 5.4.3 that the Faraday tube was essential to J. J. Thomson’s conclusion that motion increases inertia. The demise of the free magnetic pole, on the other hand, was ushered in by the discovery of the electron, and the failure of experiments to discover the corresponding magnetic counterpart.


Observing that

$$\frac{|H|}{|E|} = \frac{|\kappa|}{\omega} < 1,$$

where the inequality follows from the dispersion relation, (11.5.51), we can transform (11.5.53) and (11.5.54) into

$$\varepsilon_0\dot{E} = \nabla\times H + imH, \tag{11.5.55a}$$

$$\mu_0\dot{H} = -\nabla\times E + imE, \tag{11.5.55b}$$

by setting κ = m. Equations (11.5.55a) and (11.5.55b) combine to give the Klein–Gordon equation (11.5.41), for either E or H, e.g.

$$\ddot{E} = -\nabla\times(\nabla\times E) - m^2 E,$$

where we set the product ε₀µ₀ = 1. The phase terms are what is necessary to ‘couple’ the two relations, as we will now show.

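If we define a complex vector,

$$F^{(\pm)} = E \pm iH,$$

Maxwell’s equations can be written in the compact form,

$$\frac{\partial F^{(\pm)}}{\partial t} \pm i\nabla\times F^{(\pm)} = 0. \tag{11.5.56}$$

Furthermore, if we include the phase relations contained in (11.5.55a) and (11.5.55b), we get the coupled set of equations,

$$\dot{F}^{(\pm)} \pm L\cdot\nabla F^{(\pm)} - mF^{(\mp)} = 0, \tag{11.5.57}$$

where the j = 1 angular momentum matrices are

$$L_x = i\begin{pmatrix} 0 & 0 & 0\\ 0 & 0 & -1\\ 0 & 1 & 0 \end{pmatrix}, \qquad L_y = i\begin{pmatrix} 0 & 0 & 1\\ 0 & 0 & 0\\ -1 & 0 & 0 \end{pmatrix}, \qquad L_z = i\begin{pmatrix} 0 & -1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 0 \end{pmatrix}.$$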
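That these matrices turn the curl in (11.5.56) into the L · ∇ of (11.5.57) can be checked directly: they obey [Lx, Ly] = iLz, and for any two vectors (a · L)b = i a × b, so that L · ∇F = i∇ × F. The sketch below verifies both statements with plain nested lists; the helper names and the two test vectors are assumptions made for this illustration.

```python
# j = 1 matrices transcribed from the text; entrywise (L_a)_{jk} = -i * epsilon_{ajk}.
LX = [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]]
LY = [[0, 0, 1j], [0, 0, 0], [-1j, 0, 0]]
LZ = [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

def cross(a, b):
    """Ordinary vector product a x b."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def l_dot(a, b):
    """Apply the matrix a . L = a_x Lx + a_y Ly + a_z Lz to the vector b."""
    return [sum((a[0] * LX[i][j] + a[1] * LY[i][j] + a[2] * LZ[i][j]) * b[j]
                for j in range(3)) for i in range(3)]

# Check 1: [Lx, Ly] = i Lz.
comm_ok = commutator(LX, LY) == [[1j * x for x in row] for row in LZ]

# Check 2: (a . L) b = i (a x b) for two arbitrary (hypothetical) vectors.
a, b = [1.0, 2.0, 3.0], [-1.0, 0.5, 4.0]
curl_ok = l_dot(a, b) == [1j * c for c in cross(a, b)]

print(comm_ok, curl_ok)
```

Replacing the vector a by the gradient operator gives L · ∇F = i∇ × F, which is how the curl term of (11.5.56) reappears in (11.5.57).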

The rationale behind (11.5.57) is analogous to Weyl’s equations, which couple the left- and right-handed portions of the Dirac wave functions [cf. Sec. 11.6.1 below]. If the fields were uncoupled, then with the mass, m ≠ 0, the particle would propagate at a velocity less than c. By performing a Lorentz boost, we can overtake the particle and therefore invert its helicity. The helicity would not then be a property of the particle. The coupling would then be proportional to the mass, and this is what (11.5.57) says.

11.5.5 Compressional electromagnetic waves: Helmholtz’s theory

It is well-known that plane waves cannot support compressional waves, so the addition of an extra current in Maxwell’s second circuit equation has not really been instructive. Long before Proca modified Maxwell’s equations, our old friend Heaviside, after listing all the reasons why Maxwell should not have introduced longitudinal waves into his theory, asked “Why, then, should he spoil his work by introducing longitudinal waves?” Since competing theories to Maxwell, like Hermann Helmholtz’s theory, did include longitudinal waves, Heaviside was prompted to look into the consequences of such waves, if for nothing else than to find arguments with which to criticize Helmholtz’s theory.

Helmholtz’s theory was purported to be a generalization of Maxwell’s theory to include longitudinal waves, if the need ever arose. Many of the nineteenth century ‘giants’ thought that longitudinal waves would explain the recently discovered X-rays. Among those was Ludwig Boltzmann who claimed that “Whether the longitudinal oscillations and the other generalizations, which Helmholtz had added to Maxwell’s theory, are of great importance or not, is a question that the present stage of science is unable to decide.” Regardless of whether or not Nature actually did admit longitudinal waves in electromagnetism, Heaviside denied their existence outright: “No one has the right to trifle with Maxwell’s equations this way.” To distinguish between Helmholtz’s and Boltzmann’s interpretation of Maxwell, Heaviside referred to his interpretation as “my Maxwell.” Nature did bow to Maxwell, and it was Heaviside who carried the day, but not so in other areas outside his field of expertise.ˡ

ˡ Having settled his squabble with Boltzmann, Heaviside was ready to take on another soon-to-be-famous German, Max Planck. But, this time it was thermodynamics and not electrodynamics that would be the area of contention, and, in particular, the notion of entropy. Here, Heaviside intervened in the nasty debate between Perry and Swinburne, past and present Presidents of the Institution of Electrical Engineers, the latter asserting the textbook definition of entropy to be “fundamentally wrong.” [Nahin 88] Planck was called on to adjudicate, and he wholly sided with Swinburne. Planck had a go at Heaviside’s ‘ghostly’ reference to the entropy’s tendency to increase. Heaviside rather fashioned Kelvin’s concept of the “universal dissipation of energy,” tending to a minimum as opposed to Boltzmann’s universal tendency for the entropy to increase. Needless to say, Planck had the upper hand, and left Heaviside, on a rare occasion, speechless. This goes to show that once outside their area of expertise, these geniuses of science were prosaic.

A force moving through the aether would meet with resistance arising from the stresses in the medium: shear, rotation and compression. Disregarding shear, the force, ρG̈, where ρ is the density of the aether, would be given in terms of the generalized spatial displacement, G, by

$$\rho\ddot{G} = \eta\nabla(\nabla\cdot G) - \nu\nabla\times(\nabla\times G). \tag{11.5.58}$$

The elastic constants, η and ν, are those of compression and rotation, just as in (11.5.33) with F = ρG̈.

In contrast to J. J. Thomson’s argument of the increase in inertia due to motion in Sec. 5.4.3, where a charge in motion creates a magnetic field which is proportional to the velocity, so that the magnetic energy is proportional to the kinetic energy at low speeds, we follow Heaviside and consider E to be the velocity of the aether, Ġ.

Had we chosen H to represent the velocity, as Larmor insisted, there would be no discussion between transversal and longitudinal waves. Only the former would exist since there are no magnetic charges (poles). Their nonexistence did not stop the likes of Heaviside from including them, if for nothing else than for the sake of symmetry. Whether or not they exist is just a matter of experimental input, like the laws of electrodynamics: “If they are valid at any speed, then there is nothing to prevent speeds of motion greater than light.” However, Heaviside criticized this choice of the velocity, for

if H is velocity, the case is far worse, for an impossibility is involved. The electric force becomes rotation or proportional thereto, and the impossibility is that we need to have E both circuital and polar at the same time roundabout an isolated charge!

With E as velocity,

$$-\nu\nabla\times G = H, \tag{11.5.59}$$

(11.5.58) would give the first circuital equation, while the time derivative of (11.5.59) would give the second one,

$$-\nabla\times E = \mu_0\dot{H}, \tag{11.5.60}$$

if ν = µ₀⁻¹. Using the third auxiliary relation in (P′), we can write the first circuital relation in terms of A alone, viz.

$$\ddot{A} = \eta\nabla(\nabla\cdot A) - \nu(\nabla\times\nabla\times A), \tag{11.5.61}$$

provided the scalar potential satisfies the wave equation,

$$\ddot{\phi} = \eta\nabla^2\phi.$$

The scalar potential, φ, would propagate at a speed √η, which could be greater than that of light.

The presence of a convergence (= −divergence) in (11.5.61), although analogous to a hydrostatic pressure, does not necessarily mean that there is a longitudinal mode. For an incompressible fluid that satisfies Euler’s equation, the vanishing of the divergence of the velocity is the condition that the fluid velocity is everywhere normal to the wave vector, and, consequently, the wave is transverse. Such a term is in the Lorentz gauge, which is the first equation in (P′).

In fact, combining the Lorentz gauge with

$$\dot{A} + E + \nabla\phi = 0, \tag{11.5.62}$$

results in

$$\ddot{A} = \nabla(\nabla\cdot A) - \nabla\times(\nabla\times A). \tag{11.5.63}$$

This is precisely (11.5.61) with η = ν = 1 and our (11.5.15), so Maxwell’s equations do not require a vanishing coefficient of compression! Recalling our discussion in Sec. 11.5.3, folklore has it that the transition to superconductivity occurs with the creation of a longitudinal mode with the simultaneous appearance of a finite mass. If what we have said in Sec. 11.1.4 about two velocities of propagation is correct, then there can be no coexistence of transverse and longitudinal modes of propagation at a single velocity. As in the case of a sound wave, the fluid velocity is in the direction of propagation, and, consequently, sound waves are longitudinal. Furthermore, they need a medium to propagate in. If E is the velocity, then the only way to get longitudinal propagation is to have H polar or vanish, as Heaviside clearly realized.

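Multiplying (11.5.61) through by $\dot{\mathbf{A}}$ and performing several integrations by parts we get

$$\frac{1}{2}\frac{d}{dt}\left\{\dot{\mathbf{A}}^2 + \nu(\nabla \times \mathbf{A})^2 + \eta(\nabla \cdot \mathbf{A})^2\right\} = -\nabla \cdot \left\{\nu(\nabla \times \mathbf{A}) \times \dot{\mathbf{A}} - \eta\dot{\mathbf{A}}(\nabla \cdot \mathbf{A})\right\}. \qquad (11.5.64)$$

The term $\frac{1}{2}\dot{\mathbf{A}}^2$ is the kinetic energy density, $\frac{1}{2}(\nabla \times \mathbf{A})^2$ is the rotational energy density, while $\frac{1}{2}(\nabla \cdot \mathbf{A})^2$ is the energy density of compression [cf. (11.5.20)]. In fact, as we have already mentioned, $-\nabla \cdot \mathbf{A}$ can be likened to a static pressure. And a pressure exists both when the fluid is incompressible and when it is compressible.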

Moreover, Maxwell’s theory predicts a pressure due to radiation. The condition that the flow be incompressible is

$$\nabla \cdot \mathbf{E} = 0,$$

because we have taken $\mathbf{E}$ to be the velocity. Thus, there exists a velocity potential whose gradient is $\mathbf{E}$, and which satisfies Laplace’s equation. In view of condition (11.5.62) this means that $\nabla^2\varphi = -\nabla \cdot \dot{\mathbf{A}} = 0$, and, hence, the third term on the left-hand side of (11.5.64) drops out. That is, there is no rate of change of the compressional energy so for all intents and purposes it does not exist. According to Heaviside, when the aether is incompressible, $\eta$ becomes infinite and $\nabla \cdot \mathbf{A}$ tends to zero in such a manner that the pressure remains finite. However, $\mathbf{A}$ is not $\mathbf{E}$ so that the same restriction placed on the latter does not necessarily apply to the former.

Moreover, if E is a gradient, the second term on the left-hand side of (11.5.64) is also zero, because H “remains steady in time and place.” E cannot be incompressible flow for, otherwise, the second circuital law, (11.5.60), would vanish. So Maxwell’s aether is not incompressible, but one in which the elastic constants of rotation and compression just happen to be equal.

What if we go to the other extreme where the motion is entirely compressible? Equation (11.5.58) becomes

$$\ddot{\mathbf{A}} = \eta \nabla^2 \mathbf{A}, \qquad (11.5.65)$$

if $\nabla \times \mathbf{A} = 0$ [cf. Sec. 11.1.4], so there is no magnetic field. With $\varphi = 0$, (11.5.65) can be split up into the pair of equations:

$$\dot{\mathbf{E}} + \eta\nabla(\nabla \cdot \mathbf{A}) = 0, \qquad \partial_t(\nabla \cdot \mathbf{A}) + \nabla \cdot \mathbf{E} = 0. \qquad \text{(Z)}$$

The second equation follows from (11.5.62) upon taking its divergence.

The strains are most easily characterized by what are known as ‘surfaces of discontinuity,’ where a displacement, or its derivatives, may experience a discontinuity, or jump. Using Christoffel’s notation we will indicate a discontinuity in $\mathbf{G}$ by $[\mathbf{G}]$. If the body is not torn apart by the strain, we may well suppose that the normal component of the displacement be continuous, or that $[\mathbf{G} \cdot \mathbf{n}] = 0$, where $\mathbf{n}$ is the unit normal to the surface, $S$. In contrast, the displacement tangential to the surface may undergo a discontinuity, $[\mathbf{T}] \neq 0$, where

$$\mathbf{T} = \mathbf{G} - (\mathbf{G} \cdot \mathbf{n})\mathbf{n}.$$

Let $\mathbf{E}$ and $\nabla \cdot \mathbf{A}$ be continuous across the surface $S$, whereas the acceleration of the displacement, $\mathbf{G}$, and the gradient of $\nabla \cdot \mathbf{A}$ are discontinuous. Writing the first equation in (Z) on both sides of the surface $S$ and subtracting one from the other results in

$$\left[\frac{d\mathbf{E}}{dt}\right] = -\eta\,[\nabla(\nabla \cdot \mathbf{A})].$$

Since $[\mathbf{E}] = 0$, its divergence

$$[\nabla \cdot \mathbf{E}] = \mathbf{m} \cdot \mathbf{n},$$

where the vector $\mathbf{m}$ characterizes the discontinuity. Also since the pressure is continuous across the discontinuity $[\nabla \cdot \mathbf{A}] = 0$, but its gradient is not,

$$[\nabla(\nabla \cdot \mathbf{A})] = \alpha\mathbf{n},$$

where $\alpha$ is a scalar which we will shortly determine. The kinematical condition of compatibility, upon taking the time derivative of both sides yields

$$\left[\frac{d}{dt}(\nabla \cdot \mathbf{A})\right] = -\alpha u,$$

where $u$ is the speed of propagation of the wave. Now, by the discontinuity in the second equation in (Z),

$$\left[\frac{d}{dt}(\nabla \cdot \mathbf{A})\right] = -[\nabla \cdot \mathbf{E}],$$

there results

$$\alpha = \frac{\mathbf{m} \cdot \mathbf{n}}{u},$$

and, consequently,

$$[\nabla(\nabla \cdot \mathbf{A})] = \frac{\mathbf{m} \cdot \mathbf{n}}{u}\,\mathbf{n}.$$

Introducing this last condition into the first equation in (Z), together with the additional kinematic condition of compatibility,

$$\left[\frac{d\mathbf{E}}{dt}\right] = -u\mathbf{m},$$

we obtain the condition

$$u^2\mathbf{m} = \eta(\mathbf{m} \cdot \mathbf{n})\mathbf{n}.$$

There are only two ways that this condition can be satisfied:

(i) The vector $\mathbf{m}$ is normal to $S$ so that the speed is

$$u = \sqrt{\eta}, \qquad (11.5.66)$$

or
(ii) $\mathbf{m}$ is tangent to the surface so that $\mathbf{m} \cdot \mathbf{n} = 0$ requires $u = 0$.

This proves a theorem of Hugoniot:

There are only two kinds of discontinuities in a compressible, non-viscous fluid: longitudinal discontinuities which propagate with speed (11.5.66), and transversal discontinuities which do not propagate at all.
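The two cases of the compatibility condition u²m = η(m · n)n can be checked numerically. The following is a minimal sketch, not part of the original derivation: the helper names `dot` and `allowed_speed` and the sample vectors are illustrative assumptions. It solves the condition for u component by component and reports that no speed exists when the jump vector is oblique to the surface.

```python
import math

def dot(a, b):
    """Euclidean dot product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))

def allowed_speed(m, n, eta, tol=1e-12):
    """Solve u^2 m = eta (m.n) n for u >= 0; return None if no u exists.

    m   : jump vector characterizing the discontinuity
    n   : unit normal to the surface S
    eta : compression constant
    """
    rhs = [eta * dot(m, n) * ni for ni in n]
    m2 = dot(m, m)
    if m2 < tol:
        return None  # no discontinuity at all
    # Best-fit value of u^2, then verify that it solves every component.
    u2 = dot(rhs, m) / m2
    residual = [u2 * mi - ri for mi, ri in zip(m, rhs)]
    if u2 < -tol or math.sqrt(dot(residual, residual)) > tol * max(1.0, eta):
        return None
    return math.sqrt(max(u2, 0.0))

n = (0.0, 0.0, 1.0)   # hypothetical unit normal
eta = 4.0             # hypothetical compression constant

print(allowed_speed((0.0, 0.0, 2.5), n, eta))   # m normal to S: sqrt(eta)
print(allowed_speed((1.0, -3.0, 0.0), n, eta))  # m tangent to S: no propagation
print(allowed_speed((1.0, 0.0, 1.0), n, eta))   # oblique m: no admissible speed
```

The oblique case returns `None` because the normal component would demand u² = η while the tangential component would demand u² = 0.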

Thus, by eliminating the magnetic field, and demanding that $\mathbf{A}$ be irrotational we have made longitudinal electrical waves. The wave equation (11.5.65) is expressed in terms of the vector potential, $\mathbf{A}$. Now, if $\mathbf{A}$ were solenoidal we could write (11.5.65) as

$$\eta\nabla \times \nabla \times \mathbf{E} = -\ddot{\mathbf{E}}.$$

This equation can further be split into two equations of first order

$$-\nabla \times \mathbf{E} = \dot{\mathbf{H}}, \qquad \eta\nabla \times \mathbf{H} = \dot{\mathbf{E}}.$$

The curl suffers a discontinuity, $[\nabla \times \mathbf{v}] = \mathbf{n} \times \mathbf{m}$. Thus, in the first case where $\mathbf{m}$ is normal to the surface, $[\nabla \times \mathbf{v}] = 0$, while in the second case where $\mathbf{m}$ is tangent to the surface, $[\nabla \times \mathbf{v}] \neq 0$. Hence, for transverse waves, the vortex velocity $\frac{1}{2}(\nabla \times \mathbf{v})$ jumps, with $\mathbf{m}$ tangent to $S$, while $\nabla \cdot \mathbf{v}$ is continuous. Whereas for longitudinal discontinuities the opposite occurs. In the former case, the longitudinal discontinuities are not propagated at all. We can, therefore, have either transverse or longitudinal discontinuities, but not both at the same time.

In Helmholtz’s theory, the first circuital law appears in the form,

$$\nabla \times \mathbf{H} = \varepsilon_1\dot{\mathbf{E}} - \varepsilon_0\nabla\dot{\varphi}, \qquad (11.5.67)$$

where, in general, $\varepsilon_1$ and $\varepsilon_0$ are two components of a dielectric constant $\varepsilon$, i.e. $\varepsilon_1 + \varepsilon_0 = \varepsilon$. The dielectric constant, $\varepsilon_1$, is a property of matter, while $\varepsilon_0$ belongs to the aether. As Heaviside was quick to point out, the potential term allows the electric current to have divergence. It also allows for the establishment of longitudinal waves without the contradiction that the current be stationary.

That is, if we introduce (11.5.62) and use the Lorentz gauge,

$$\varepsilon\dot{\varphi} = -\nabla \cdot \mathbf{A}, \qquad (11.5.68)$$

we could get (11.5.63) without the second term if $\mathbf{A}$ were irrotational. The second term in (11.5.67) avoids this contradiction, and gives a wave equation,

$$\nabla \times \nabla \times \mathbf{A} = -\varepsilon_1\ddot{\mathbf{A}} - \varepsilon\nabla\dot{\varphi}, \qquad (11.5.69)$$

where we used $\varepsilon_1 + \varepsilon_0 = \varepsilon$. Imposing the Lorentz gauge, (11.5.68), on (11.5.69) gives

$$\nabla^2\mathbf{A} = \varepsilon_1\ddot{\mathbf{A}}. \qquad (11.5.70)$$

Such a wave travels at superluminal speeds since $1/\sqrt{\varepsilon_1} > 1/\sqrt{\varepsilon}$. So depending on whether $\mathbf{A}$ is irrotational or solenoidal, (11.5.67) gives rise to longitudinal or transverse waves, with the former traveling faster than the latter. Since the former are at variance with Maxwell’s theory we are forced to set $\varepsilon_0 = 0$, and with it goes Helmholtz’s generalization of Maxwell.

11.5.6 Directed electromagnetic waves

Mass can also creep in very simply in the propagation of electromagnetic waves along a cable through the assumption that the fields are periodic both in time and along the axis of propagation. We choose the z-axis as the axis of symmetry along the cable. The radial coordinate, r, measures the distance to any point from the center of the cable. We have discussed such an example in Sec. 11.1.6, but consider it here from a different perspective.

There are two components of the electric field: $E$ is along the axis of symmetry, and $F$ points in the radial direction. This fixes the magnetic field to be circular about the symmetry axis. Maxwell’s equations are thus given by

$$\dot{E} = \mathrm{curl}_z H = \frac{1}{r}\,\partial_r(rH), \qquad \dot{F} = \mathrm{curl}_r H = -\frac{\partial H}{\partial z}, \qquad -\dot{H} = \mathrm{curl}_\phi E = \frac{\partial F}{\partial z} - \frac{\partial E}{\partial r}. \qquad \text{(C)}$$

These equations can give wave equations by increasing the order and eliminating one of the variables. For instance, the longitudinal component of the electric field will satisfy

$$\frac{1}{r}\,\partial_r\!\left(r\frac{\partial E}{\partial r}\right) + \frac{\partial^2 E}{\partial z^2} = \ddot{E}. \qquad (11.5.71)$$

Now assume, as is usually done, that the electric and magnetic fields are simply periodic with respect to $z$ and $t$ so that each will have the form of a plane wave

$$e^{i(\omega t - mz)}.$$

The wave equation (11.5.71) will then reduce to Bessel’s equation of order zero,

$$\frac{1}{r}\frac{d}{dr}\left(r\frac{dE}{dr}\right) + \left(\lambda^{-2} - m^2\right)E = 0, \qquad (11.5.72)$$

where $E$ depends only on $r$, $\lambda = \omega^{-1}$, and we have chosen the wavelength along the $z$-axis to be the Compton wavelength, $m^{-1}$. There are two options:

(i) If $m^{-1} > \lambda$, the solution will oscillate as a Bessel function of order zero. If the waves were not controlled by the wire we should have $\lambda = m^{-1}$ in the dielectric.
(ii) If $\lambda > m^{-1}$, the disturbance decays exponentially in the radial direction. The solution to (11.5.72) is then a modified Bessel function of order zero.
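The two options can be illustrated with a short numerical check. This sketch is an illustration under stated assumptions rather than the author's procedure: the function names are hypothetical, and the power series used gives the solutions that are regular on the axis, J0 for case (i) and the modified function I0 for case (ii). The radially decaying modified solution (the function usually written K0) obeys the same equation but is not computed here. The residual of (11.5.72) is evaluated by central differences and should be close to zero in both cases.

```python
import math

def bessel_series(x, sign):
    """Power series sum_k sign^k (x^2/4)^k / (k!)^2.

    sign = -1 gives J0(x); sign = +1 gives the modified function I0(x).
    """
    term, total = 1.0, 1.0
    q = sign * x * x / 4.0
    for k in range(1, 60):
        term *= q / (k * k)
        total += term
    return total

def radial_residual(E, r, coeff, h=1e-3):
    """Central-difference residual of (1/r) d/dr (r dE/dr) + coeff * E."""
    d1 = (E(r + h) - E(r - h)) / (2.0 * h)
    d2 = (E(r + h) - 2.0 * E(r) + E(r - h)) / (h * h)
    return d2 + d1 / r + coeff * E(r)

def residual_for(omega, m, r):
    """Residual of (11.5.72) for the series solution matching the sign of
    lambda^{-2} - m^2, with lambda = 1/omega."""
    coeff = omega ** 2 - m ** 2
    scale = math.sqrt(abs(coeff))
    sign = -1 if coeff > 0 else +1   # oscillatory versus modified solution
    return radial_residual(lambda x: bessel_series(scale * x, sign), r, coeff)

print(residual_for(omega=2.0, m=1.0, r=1.7))  # case (i):  m^{-1} > lambda
print(residual_for(omega=1.0, m=2.0, r=1.7))  # case (ii): lambda > m^{-1}
```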

Mass has entered through the assumption that the longitudinal component of the electric field is periodic along the axis of symmetry with a wavelength given by the Compton wavelength. This does not destroy the transverse nature of Maxwell’s equations. The Compton wavelength is the smallest wavelength that the cable is capable of supporting, and its relation to that of the electromagnetic radiation will determine the nature of the propagation of radial disturbances.

Even more can be said. Instead of introducing the Compton wavelength, we use the wave number $\kappa$. Then, (11.5.72) becomes

$$\frac{1}{r}\frac{d}{dr}\left(r\frac{dE}{dr}\right) + (\omega^2 - \kappa^2)E = 0. \qquad (11.5.73)$$

We know that the relativistic dispersion equation (11.5.43) applies. This would identify $\omega$ in the second expression of (11.5.73) with the inverse Compton wavelength, and hence with the mass, while $\kappa$ is related to the momentum through the de Broglie relation. But, by the very definition of the operator of angular momentum, the first term should be proportional to the square of the momentum.$^{m}$

That momentum and mass can be construed in two different ways makes it reasonable to consider mass to have tensorial components while momentum can be treated as a scalar, instead of the other way round.

We write the coefficient of the second term in (11.5.73) as

$$K^2 = (\omega^2 - \kappa^2) = \omega^2\left(1 - \frac{1}{v^2}\right), \qquad (11.5.74)$$

by the definition of the phase velocity $v = \omega/\kappa$. For a given mode of vibration there is the lowest possible frequency, corresponding to infinite wavelength, $\kappa = 0$, viz.

$$\omega_{\mathrm{crit}} = K,$$

which defines $K$ in (11.5.74). Now, rearranging (11.5.74) to read:

$$\omega^2 = (\kappa^2 + K^2),$$

differentiating with respect to $\kappa$, and using the definition of the group velocity, $u = d\omega/d\kappa$, we find $1 = uv$. Thus, if $u < 1$, we come out with the inevitable conclusion that

phase velocity > velocity of light,

which is a well-known property of waves on wires [Brillouin 60].

Now, $u < v$ implies $m^{-1} > \lambda$. In a letter to Niels Bohr in March 1930, Werner Heisenberg outlined his picture of a ‘lattice world’ [Gitterwelt]. It was Heisenberg’s idea to introduce a fundamental length, $m^{-1}$, where he supposed the mass was that of the proton, for lack of another elementary particle known at that time. Space, according to Heisenberg, is chopped up

$^{m}$Dirac [47, p. 153] notes a problem on transforming from Cartesian to spherical coordinates in that the commutator “makes $p_r$ like the momentum conjugate to the $r$ coordinate, but it is not exactly equal to this momentum because it is not real. . .. Thus $p_r - i\hbar r^{-1}$ is real and is the true momentum conjugate to $r$.” Yet, he goes on to use $p_r = -i\hbar\,\partial/\partial r$ as if it were the true momentum. Actually, the momentum operator will be different in cylindrical and spherical coordinates.

into lengths $m^{-1}$ so that (11.5.72) actually stands for the finite difference equation,

$$(\omega^2 - m^2)u_n - \frac{u_{n+\ell} - 2u_n + u_{n-\ell}}{\ell^2} = 0.$$

Since $\omega > m$, or $\lambda < m^{-1}$, the spacing $\ell < m^{-1}$ is required. But a particle cannot be localized to within dimensions smaller than the Compton length so Heisenberg set $\ell$ equal to the Compton wavelength of the proton. Over distances smaller than $\ell$ causality breaks down and we have no knowledge of what goes on over such distances. Electrons with wavelengths greater than $\ell$ can propagate along the linear lattice. However, what is good in one dimension is not necessarily good in three dimensions. In fact, Heisenberg found that in three dimensions, the lattice picture violates energy and momentum conservation, and dropped the whole idea of a lattice world picture.

We have subsequently resuscitated Heisenberg’s Gitterwelt [Lavenda 00], and have shown that there is more than meets the eye, even in a single dimension. The recursion relation for a modified Bessel function gives rise to a diffusion process. The transformation from a real diffusion process to quantum mechanics consists in replacing the real probability of transition by the probability amplitude for a reversal of the path, just as in Feynman’s formulation in Sec. 11.4.2. It is the phase factor $e^{i\pi/2}$ that guides the motion of the particle, and converts a modified Bessel function into an ordinary Bessel function. The transformation from a probability to a probability amplitude converts the recursion relation into a finite space-time equation which transforms into Schrödinger’s equation in the limit that the spacing between the lattice points tends to zero.$^{n}$ However, there are good reasons for keeping the spacing finite, since it avoids all problems of

$^{n}$It is truly ironic that the discoverer of the nonrelativistic wave equation would some fourteen years prior be uttering these words:

It is so to speak part of the creed of the atomist that all partial differential equations of mathematical physics. . . are incorrect in a strictly mathematical sense. For the mathematical symbol of the differential quotient describes the transition in the limit to arbitrary small spatial variations, while we are convinced that in forming such ‘physical’ differential quotients we must stop at ‘physically infinitely small’ regions. . .

non-locality, and it provides relativistic corrections to Schrödinger’s equation, like the Sommerfeld relativistic correction to the Balmer term due to fine-structure.
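Returning to the dispersion relation obtained from (11.5.74), the statement that the product of group and phase velocities is unity can be checked numerically. The sketch below is only an illustration: the function names and the sample wave numbers are assumptions, the group velocity is estimated by a central difference rather than analytically, and the velocity of light is taken as 1 as in the natural units of the text.

```python
import math

def omega_of_kappa(kappa, K):
    """Dispersion relation omega^2 = kappa^2 + K^2 (natural units)."""
    return math.sqrt(kappa ** 2 + K ** 2)

def phase_and_group(kappa, K, h=1e-6):
    """Return (v, u): phase velocity omega/kappa and a finite-difference
    estimate of the group velocity d(omega)/d(kappa)."""
    v = omega_of_kappa(kappa, K) / kappa
    u = (omega_of_kappa(kappa + h, K) - omega_of_kappa(kappa - h, K)) / (2.0 * h)
    return v, u

for kappa in (0.1, 1.0, 10.0):
    v, u = phase_and_group(kappa, K=1.0)
    # u * v stays at 1 while v exceeds 1 and u stays below it
    print(kappa, v, u, u * v)
```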

11.6 Relativistic Stokes Parameters

In this section we show that the Stokes parameters are analogous to the four-vector of energy–momentum in the case of complete polarization. The density matrix representation of the Stokes parameters will give us insight into the transformation from the Weyl to the Dirac equations, and show what options are available in the interpretation of the components of the density matrix. One of these options will allow us to account for the microwave Lamb shift through the circularly polarized component of the mass, which the Dirac equation is unable to account for. Moreover, all hydrogen-like splittings, such as the hyperfine and fine-structure splittings, occur from negative to positive $k = \pm(j + \frac{1}{2})$ values, and are characterized by left-hand elliptical polarization.

11.6.1 Weyl and Dirac versus Stokes

It is well-known that Weyl’s equations can be derived from splitting the relativistic conservation of energy,

$$W^2 - p^2 = m^2,$$

into

$$(W - \boldsymbol{\sigma} \cdot \mathbf{p})(W + \boldsymbol{\sigma} \cdot \mathbf{p})\psi = (me^{-i\phi})(me^{i\phi})\psi,$$

since $(\boldsymbol{\sigma} \cdot \mathbf{p})(\boldsymbol{\sigma} \cdot \mathbf{p}) = p^2$, where $\psi$ is a wave function with two components. The second-order differential equation may be factored, and since the two resulting first-order equations must give a single second-order equation,

From what has been said earlier, Schrödinger was infatuated with the question of whether there is a ‘jump’ at the approach of continuous physical laws. Luckily for physics he had a change of heart, and went to the continuous limit in 1926.

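they must be coupled, viz.

$$\begin{pmatrix} W - V + \boldsymbol{\sigma} \cdot \mathbf{p} & me^{-i\phi} \\ me^{i\phi} & W - V - \boldsymbol{\sigma} \cdot \mathbf{p} \end{pmatrix}\begin{pmatrix} \psi_a \\ \psi_b \end{pmatrix} = 0. \qquad (11.6.1)$$

These are the Weyl equations. The potential $V$ makes $W > W_0 = \sqrt{p^2 + m^2}$, which is analogous to the case of partially polarized light.

The matrix in (11.6.1) is the density matrix representation of the Stokes parameters, where the helicity operator is

$$\boldsymbol{\sigma} \cdot \mathbf{p} = \frac{\boldsymbol{\sigma} \cdot \mathbf{r}}{r^2}(\boldsymbol{\sigma} \cdot \mathbf{r})(\boldsymbol{\sigma} \cdot \mathbf{p}) = \frac{\boldsymbol{\sigma} \cdot \mathbf{r}}{r^2}\left(\mathbf{r} \cdot \mathbf{p} + i\boldsymbol{\sigma} \cdot \mathbf{r} \times \mathbf{p}\right). \qquad (11.6.2)$$

The last term in (11.6.2) is the spin orbit interaction, $\boldsymbol{\sigma} \cdot \mathbf{L}$.

We are primarily interested in bounded solutions. To this end we consider the spinor,

$$\psi = \begin{pmatrix} f(r)\,Y^{k}_{jm} \\ ig(r)\,Y^{-k}_{jm} \end{pmatrix}, \qquad (11.6.3)$$

where $Y^{k}_{jm}$ is a generalized spherical harmonic.$^{o}$ For each $j$ there are two $Y$’s. These have orbital angular momenta $\ell$ equal to $j - \frac{1}{2}$ and $j + \frac{1}{2}$. The parity of the spherical harmonics is determined by whether $\ell$ is even or odd. Specifying the parity and $j$ uniquely determines $\ell$. Instead of determining these states by parity it will prove more convenient to introduce a new quantum number $k$ defined as $k = \pm(j + \frac{1}{2})$. If $k$ is positive then $k = \ell = j + \frac{1}{2}$, while if it is negative then $k = -(\ell + 1)$, where $\ell = j - \frac{1}{2}$. The spin-orbit interaction acts on the generalized spherical harmonics to give

$$\boldsymbol{\sigma} \cdot \mathbf{L}\,Y^{k}_{jm} = -(k + 1)\,Y^{k}_{jm},$$

and

$$\boldsymbol{\sigma} \cdot \hat{\mathbf{r}}\,Y^{k}_{jm} = -Y^{-k}_{jm},$$

where $\hat{\mathbf{r}}$ is the unit vector.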
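The bookkeeping between k, j and ℓ described above, together with the eigenvalue −(k + 1) of σ · L, can be tabulated in a few lines. This is a minimal sketch with hypothetical helper names. It assumes the standard identity σ · L = J² − L² − S² (with ħ = 1 and S = σ/2) and uses exact rational arithmetic so that the comparison is exact.

```python
from fractions import Fraction

def orbital_l(k):
    """Orbital angular momentum for a non-zero integer k = +/-(j + 1/2)."""
    if k == 0:
        raise ValueError("k must be a non-zero integer")
    return k if k > 0 else -k - 1

def total_j(k):
    """Total angular momentum j = |k| - 1/2."""
    return Fraction(abs(k)) - Fraction(1, 2)

def sigma_dot_L(k):
    """Eigenvalue of sigma.L = J^2 - L^2 - S^2 for the state labeled by k."""
    j, l = total_j(k), orbital_l(k)
    return j * (j + 1) - l * (l + 1) - Fraction(3, 4)

for k in (-3, -2, -1, 1, 2, 3):
    # columns: k, j, l, sigma.L eigenvalue, -(k + 1)
    print(k, total_j(k), orbital_l(k), sigma_dot_L(k), -(k + 1))
```

The last two columns agree for every k, for both signs.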

$^{o}$There should be no confusion between the magnetic quantum number and the mass.

With the spinor wave function given by (11.6.3), the Weyl equations (11.6.1) become

$$\left[W + \frac{\alpha}{r} - \frac{(1 - k)}{r}\right]g - \frac{dg}{dr} + mf = 0, \qquad (11.6.4)$$

$$\left[W + \frac{\alpha}{r} + \frac{(1 + k)}{r}\right]f + \frac{df}{dr} + mg = 0, \qquad (11.6.5)$$

in the presence of the Coulomb potential, if we are willing to lose all phase information; $\alpha = 1/137$ is the fine-structure constant in the natural units we are using. The loss of phase information will also plague the Dirac equation. We can get rid of the $1/r$ terms by writing $g = G/r$ and $f = F/r$; for then there results

$$\left[W + \frac{\alpha}{r} + \frac{k}{r}\right]G - \frac{dG}{dr} + mF = 0,$$

$$\left[W + \frac{\alpha}{r} + \frac{k}{r}\right]F + \frac{dF}{dr} + mG = 0.$$

For a bounded solution to exist, both $F, G \to 0$ as $r \to \infty$. When $r \to \infty$, the coupled equations reduce to

$$\frac{dG}{dr} = WG + mF, \qquad \frac{dF}{dr} = -WF - mG \quad\Rightarrow\quad \frac{d^2G}{dr^2} - \left(W^2 - m^2\right)G = 0.$$

There can be no bound state solutions for $W < m$. Moreover, as $r \to 0$ the coupled equations reduce to

$$\frac{dG}{dr} - \frac{(\alpha + k)}{r}G = 0, \qquad \frac{dF}{dr} + \frac{(\alpha + k)}{r}F = 0,$$

again showing that there is no solution vanishing at the origin.

It is often said that Dirac’s equation is equivalent to Weyl’s equation, but this is simply not true. By adding and subtracting the two Weyl equations, (11.6.1), Dirac obtained

$$\begin{pmatrix} -W & p \\ -p & W \end{pmatrix}\begin{pmatrix} \varphi_+ \\ \varphi_- \end{pmatrix} = -m\begin{pmatrix} \varphi_+ \\ \varphi_- \end{pmatrix}, \qquad (11.6.6)$$

where $\varphi_+ = \psi_b + \psi_a$ and $\varphi_- = \psi_b - \psi_a$. The density matrix determines the evolution of the wave function. It is equivalent to the coupled set of equations

$$\frac{dF}{dr} + \frac{k}{r}F - \left(W + m + \frac{Z\alpha}{r}\right)G = 0,$$

$$\frac{dG}{dr} - \frac{k}{r}G + \left(W - m + \frac{Z\alpha}{r}\right)F = 0.$$

In (11.6.6) the mass now appears in the diagonal terms, instead of in the off-diagonal terms as in Weyl’s equation, (11.6.1). As such it provides no coupling between the spinor components, and goes against the grain of Feynman’s idea that the probability amplitude for a reversal depends on the mass [cf. (11.4.16)]. In Dirac’s theory, like the Klein–Gordon equation, mass is appended onto the energy, and is not an intrinsic part of the field.

Unlike the Weyl equations, (11.6.6) admits a bound state solution; for in the limit as $r \to \infty$, the pair of coupled equations reduce to

$$\frac{dG}{dr} + (W - m)F = 0, \qquad \frac{dF}{dr} - (W + m)G = 0 \quad\Rightarrow\quad \frac{d^2F}{dr^2} + \left(W^2 - m^2\right)F = 0.$$

Hence, a bounded solution exists for $W < m$. But, this is surprising since we always have $W^2 \ge p^2 + m^2$. While this is true for optical phenomena and repulsive mechanical systems, it is not true for attractive potentials which can form orbits. There is no optical analog for such problems. Hence for bounded states, both $F$ and $G$ tend to zero with $e^{-\sqrt{m^2 - W^2}\,r}$ as $r \to \infty$. This calls for the coordinate change $\xi = \sqrt{m^2 - W^2}\,r$, and the pair of coupled equations reduce to

$$\frac{dF}{d\xi} + \frac{k}{\xi}F - \left(\frac{1}{\eta} + \frac{Z\alpha}{\xi}\right)G = 0,$$

$$\frac{dG}{d\xi} - \frac{k}{\xi}G - \left(\eta - \frac{Z\alpha}{\xi}\right)F = 0,$$

where

$$\eta = \sqrt{\frac{m - W}{m + W}}.$$

It is clear from the smallness of $Z\alpha$ that $\eta \ll 1$.

The tried and true method of solving the coupled differential equations is to look for a power series solution which we can cut off at a certain point. The series solutions we are looking for are:

$$F = \xi^{\nu}\sum_{n=0}^{\infty} A_n\xi^n e^{-\xi},$$

$$G = \xi^{\nu}\sum_{n=0}^{\infty} B_n\xi^n e^{-\xi},$$

where $\nu$ will be determined by the indicial equation which guarantees that the wave function goes to zero as some power of $\xi$.

Introducing the power series into the coupled equations results in

$$(\nu + n + k)A_n - A_{n-1} - \frac{1}{\eta}B_{n-1} - Z\alpha B_n = 0, \qquad (11.6.7)$$

$$(\nu + n - k)B_n - B_{n-1} - \eta A_{n-1} + Z\alpha A_n = 0. \qquad (11.6.8)$$

The indicial equation tells us how the wave function behaves as $r \to 0$. Setting $n = 0$ and observing that the coefficients are zero for negative indices, we get

$$(\nu + k)A_0 - Z\alpha B_0 = 0,$$

$$Z\alpha A_0 + (\nu - k)B_0 = 0.$$

A non-trivial solution demands the determinant of the coupled equations to vanish, i.e.

$$\nu = \pm\sqrt{k^2 - (Z\alpha)^2}. \qquad (11.6.9)$$

We must choose the positive solution in order to avoid singularities at the origin.

Multiplying (11.6.7) by $\eta$, and subtracting (11.6.8) from it, gives

$$B_n = \frac{\eta(\nu + n + k) - Z\alpha}{\nu + n - k + Z\alpha\eta}\,A_n.$$

Substituting this into the recursion relations (11.6.8) yields

$$\frac{A_{n+1}}{A_n} = \frac{(\nu + n + 1 - k + Z\alpha\eta)\left(2\nu + 2n + Z\alpha(\eta - 1/\eta)\right)}{(\nu + n - k + Z\alpha\eta)\left((\nu + n + 1)^2 - k^2 + (Z\alpha)^2\right)}.$$

The asymptotic behavior will be $e^{\xi}$ unless the series is forced to terminate, and this produces an eigenvalue condition. One of the terms in the numerator must be zero, and this determines the integer $n = N$. It is not difficult to see that it is the second term in the numerator,

$$2(\nu + N) - \frac{Z\alpha}{\eta}\left(1 - \eta^2\right) = 0,$$

or, in terms of the original variables,

$$\nu + N - \frac{Z\alpha W}{\sqrt{m^2 - W^2}} = 0. \qquad (11.6.10)$$

Introducing the value for $\nu$ found in (11.6.9) into (11.6.10), and after rearranging, we get

$$W = m\left[1 - \frac{Z^2\alpha^2}{(N + |k|)^2 + 2N\left(\sqrt{k^2 - (Z\alpha)^2} - |k|\right)}\right]^{1/2}. \qquad (11.6.11)$$
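The chain from the termination of the series to (11.6.11) can be spot-checked numerically. The sketch below is a hedged illustration, not the author's calculation: the function names are hypothetical, m is set to 1, and α = 1/137 is the value quoted in the text. It evaluates W from (11.6.11), then confirms that the eigenvalue condition (11.6.10) is satisfied and that the second numerator factor of A_{n+1}/A_n vanishes at n = N, while for other n it equals 2(n − N). The excluded case N = 0, k > 0 is not examined.

```python
import math

ALPHA = 1.0 / 137.0  # fine-structure constant as quoted in the text

def nu_exponent(k, Za):
    """Positive root of the indicial equation (11.6.9)."""
    return math.sqrt(k * k - Za * Za)

def dirac_energy(N, k, Za, m=1.0):
    """Energy W from (11.6.11) for radial number N and non-zero integer k."""
    nu = nu_exponent(k, Za)
    denom = (N + abs(k)) ** 2 + 2 * N * (nu - abs(k))
    return m * math.sqrt(1.0 - Za * Za / denom)

def eigenvalue_condition(N, k, Za, m=1.0):
    """Left-hand side of (11.6.10); should vanish at the energy above."""
    W = dirac_energy(N, k, Za, m)
    return nu_exponent(k, Za) + N - Za * W / math.sqrt(m * m - W * W)

def termination_factor(n, N, k, Za, m=1.0):
    """Second numerator factor of A_{n+1}/A_n: 2(nu + n) + Za (eta - 1/eta)."""
    W = dirac_energy(N, k, Za, m)
    eta = math.sqrt((m - W) / (m + W))
    return 2.0 * (nu_exponent(k, Za) + n) + Za * (eta - 1.0 / eta)

N, k, Z = 2, -1, 1   # illustrative choice of state
print(dirac_energy(N, k, Z * ALPHA))
print(eigenvalue_condition(N, k, Z * ALPHA))
print([termination_factor(n, N, k, Z * ALPHA) for n in range(N + 1)])
```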

The formulation does not allow a solution for $N = 0$, $k > 0$. In fact, by setting $n = N + |k| \ge 1$, where $-n \le k < n$, and $|k| = j + \frac{1}{2}$, the energy levels of the Dirac atom can be cast in the form

$$W = m\left[1 - \frac{(Z\alpha)^2}{n^2 + 2\left(n - (j + \frac{1}{2})\right)\left[\sqrt{(j + \frac{1}{2})^2 - (Z\alpha)^2} - (j + \frac{1}{2})\right]}\right]^{1/2}.$$

The energy levels are only a function of two quantum numbers, $n$, the nonrelativistic principal quantum number, and $j$, the sum of the orbital angular momentum and spin. Since the Lamb shift deals with the splitting of different $\ell$’s with the same $j$, it is beyond the Dirac solution to tackle. Finally, expanding the square root in powers of $\alpha$, the Sommerfeld relativistic correction to the Schrödinger equation

$$W - m = -\frac{m(Z\alpha)^2}{2n^2} - \frac{m(Z\alpha)^4}{2n^4}\left(\frac{n}{j + \frac{1}{2}} - \frac{3}{4}\right) + \cdots$$

is recovered that is valid to $(Z\alpha)^4$.

An essentially identical expression was first given by Sommerfeld in 1916, well before Schrödinger mechanics and Dirac relativistic mechanics were invented. The expression Sommerfeld found was:

$$W + m = m\bigg/\left[1 + \frac{Z^2\alpha^2}{\left(n_r + \sqrt{n_\phi^2 - Z^2\alpha^2}\right)^2}\right]^{1/2},$$

where $W$ denotes the energy of the bound electron after deducting the rest energy $m$, $n_r$ is the radial quantum number, and $n_\phi = j + \frac{1}{2}$ is Sommerfeld’s notation for the Bohr azimuthal quantum number. It corresponds to $\ell + 1$ in Schrödinger’s scheme. Two terms of different $\ell$, but with the same $j$, always coincide so that relativity and spin partly compensate each other. The principal quantum number is the sum $n = n_r + n_\phi$. Thus, Sommerfeld found

$$W = -\frac{RZ^2}{n^2}\left\{1 + \frac{(Z\alpha)^2}{n^2}\left(\frac{n}{n_\phi} - \frac{3}{4}\right) + \cdots\right\},$$

upon expanding in powers of $\alpha$, where $R = m\alpha^2/2$ is the Rydberg constant. The Balmer term, $-R/n^2$, undergoes a relativistic modification that depends on $j$ and on the fine-structure constant $\alpha$.
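As a numerical companion to the formulas above, the Dirac levels, Sommerfeld’s closed form and the expansion through order (Zα)⁴ can be compared directly. This is a sketch under stated assumptions: the function names are hypothetical, m = 1, and the identifications n_r = n − (j + ½) and n_φ = j + ½ follow the text. The closed forms should agree to rounding error, while the expansion should differ from them only at order (Zα)⁶.

```python
import math

def dirac_level(n, j, Za, m=1.0):
    """Total energy W of the Dirac atom as a function of n and j."""
    kk = j + 0.5
    nu = math.sqrt(kk ** 2 - Za ** 2)
    denom = n ** 2 + 2.0 * (n - kk) * (nu - kk)
    return m * math.sqrt(1.0 - Za ** 2 / denom)

def sommerfeld_1916(n_r, n_phi, Za, m=1.0):
    """Rest energy plus binding energy from Sommerfeld's closed form."""
    root = n_r + math.sqrt(n_phi ** 2 - Za ** 2)
    return m / math.sqrt(1.0 + Za ** 2 / root ** 2)

def expansion(n, j, Za, m=1.0):
    """Binding energy W - m through order (Z alpha)^4."""
    balmer = -m * Za ** 2 / (2.0 * n ** 2)
    fine = -m * Za ** 4 / (2.0 * n ** 4) * (n / (j + 0.5) - 0.75)
    return balmer + fine

n, j, Za = 2, 0.5, 1.0 / 137.0   # illustrative state with Z = 1
W = dirac_level(n, j, Za)
print(W - sommerfeld_1916(n - (j + 0.5), j + 0.5, Za))  # closed forms agree
print((W - 1.0) - expansion(n, j, Za))                  # remainder of order (Z alpha)^6
```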

What Dirac did, essentially, was to transform Weyl’s equations,

$$\begin{pmatrix} W + Q & U - iV \\ U + iV & W - Q \end{pmatrix}\begin{pmatrix} \psi_a \\ \psi_b \end{pmatrix} = 0,$$

into

$$\begin{pmatrix} W + U & Q + iV \\ Q - iV & W - U \end{pmatrix}\begin{pmatrix} \varphi_+ \\ \varphi_- \end{pmatrix} = 0, \qquad (11.6.12)$$

where $\varphi_\pm = \psi_a \pm \psi_b$. The Stokes parameters are $U = m\cos\phi$, $V = m\sin\phi$, and $Q = \boldsymbol{\sigma} \cdot \mathbf{p}$. We get Dirac’s equation on setting $\phi = 0, \pi$, which is linear polarization. The presence of a left-handed elliptical polarized component of the mass will be shown in Sec. 11.6.3 to be related to the microwave Lamb shift, which distinguishes between different $\ell$ values, something the Dirac equation does not do.
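The exchange of Stokes parameters between the two matrices leaves the determinant unchanged, which is why both forms encode the same condition W² = Q² + U² + V². A minimal sketch follows, with one simplifying assumption stated loudly: the operator Q = σ · p is replaced by a scalar eigenvalue p. The function names are hypothetical.

```python
import math

def weyl_matrix(W, p, m, phi):
    """Weyl form: Q on the diagonal, U -/+ iV off the diagonal."""
    U, V = m * math.cos(phi), m * math.sin(phi)
    return [[W + p, complex(U, -V)], [complex(U, V), W - p]]

def dirac_like_matrix(W, p, m, phi):
    """Form (11.6.12): U on the diagonal, Q +/- iV off the diagonal."""
    U, V = m * math.cos(phi), m * math.sin(phi)
    return [[W + U, complex(p, V)], [complex(p, -V), W - U]]

def det2(M):
    """Determinant of a 2x2 matrix stored as nested lists."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

p, m = 3.0, 4.0
W = math.hypot(p, m)   # W^2 = p^2 + m^2, the analog of complete polarization
for phi in (0.0, 0.3, math.pi / 2):
    # both determinants vanish, whatever the mass angle phi
    print(phi, det2(weyl_matrix(W, p, m, phi)), det2(dirac_like_matrix(W, p, m, phi)))
```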

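In its most general form, (11.6.12) becomes

$$\begin{pmatrix} W + \dfrac{Z\alpha}{r} + m\cos\phi & \boldsymbol{\sigma} \cdot \mathbf{p} + im\sin\phi \\ \boldsymbol{\sigma} \cdot \mathbf{p} - im\sin\phi & W + \dfrac{Z\alpha}{r} - m\cos\phi \end{pmatrix}\begin{pmatrix} \varphi_+ \\ \varphi_- \end{pmatrix} = 0,$$

or, equivalently,

$$\begin{pmatrix} W + \dfrac{Z\alpha}{r} + m\cos\phi & i\left[\boldsymbol{\sigma} \cdot \hat{\mathbf{r}}\left(-\dfrac{\partial}{\partial r} + \dfrac{\boldsymbol{\sigma} \cdot \mathbf{L}}{r}\right) + m\sin\phi\right] \\ i\left[\boldsymbol{\sigma} \cdot \hat{\mathbf{r}}\left(-\dfrac{\partial}{\partial r} + \dfrac{\boldsymbol{\sigma} \cdot \mathbf{L}}{r}\right) - m\sin\phi\right] & W + \dfrac{Z\alpha}{r} - m\cos\phi \end{pmatrix}\begin{pmatrix} \varphi_+ \\ \varphi_- \end{pmatrix} = 0,$$

where we have introduced

$$\boldsymbol{\sigma} \cdot \mathbf{p} = -i\boldsymbol{\sigma} \cdot \nabla = -i\boldsymbol{\sigma} \cdot \hat{\mathbf{r}}\,\frac{\partial}{\partial r} + i\boldsymbol{\sigma} \cdot \hat{\mathbf{r}}\,\frac{\boldsymbol{\sigma} \cdot \mathbf{L}}{r}.$$

The non-zero integer, $k$, determines whether the spin is parallel ($k < 0$), or anti-parallel ($k > 0$), to the momentum in the nonrelativistic limit. It is related to the spin-orbit coupling $\boldsymbol{\sigma} \cdot \mathbf{L}$ by

$$k^2 = (\boldsymbol{\sigma} \cdot \mathbf{L} + 1)^2 = L^2 + \boldsymbol{\sigma} \cdot \mathbf{L} + 1,$$

since $(\boldsymbol{\sigma} \cdot \mathbf{L})^2 = L^2 - \boldsymbol{\sigma} \cdot \mathbf{L}$. Now, the total angular momentum is

$$J^2 = L^2 + \boldsymbol{\sigma} \cdot \mathbf{L} + S^2,$$

where the eigenvalue of the square of the spin is $\frac{1}{2}(\frac{1}{2} + 1) = \frac{3}{4}$. Thus, the operator, $K$, whose eigenvalue is $k$, is $K^2 = J^2 + \frac{1}{4}$. Consequently, $k^2 = j(j + 1) + \frac{1}{4} = (j + \frac{1}{2})^2$, and so

$$k = \pm\left(j + \frac{1}{2}\right).$$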

The coupled set of first-order differential equations is equivalent to the second-order wave equation:

$$\frac{d^2\psi}{dr^2} - \left[\left(m\sin\phi + \frac{k}{r}\right)^2 - \left(W + \frac{Z\alpha}{r}\right)^2 + m^2\cos^2\phi\right]\psi = 0,$$

or, by expanding terms,

$$\frac{d^2\psi}{dr^2} - \left(a^2 - \frac{2(Z\alpha W - km\sin\phi)}{r} + \frac{\nu^2}{r^2}\right)\psi = 0, \qquad (11.6.13)$$

where $a^2 = m^2 - W^2 > 0$, and $\nu^2 = k^2 - (Z\alpha)^2$. We will employ the old quantum condition that the action, when evaluated over a closed orbit, be an integral value, $N$, of Planck’s constant,

$$\oint p\,dr = \oint\sqrt{-a^2 + \frac{2(Z\alpha W - km\sin\phi)}{r} - \frac{\nu^2}{r^2}}\;dr = 2\pi N, \qquad (11.6.14)$$

in natural units.

In the method of complex integration [Born 60], $r$ is a line in the complex plane where the integrand is pictured on a Riemann surface of two sheets with branch points at the roots $e_1$ and $e_2$ of the radicand with $e_1 > e_2$. The path of integration is taken around the line joining the two roots. In the sheet of the Riemann surface where the root is positive it goes from $e_2 \to e_1$ with $dr > 0$, while in the sheet with the negative root, the path goes from $e_1 \to e_2$ with $dr < 0$, as shown in Fig. 11.16.

Fig. 11.16. The diagrams of the original and deformed paths of integration withthe pole at r = ∞ as if it were at a finite distance.

In order to evaluate the integral, we distort the path so that it separates into individual contours, each of which encloses one pole of the function. The poles are located at $r = 0$ and $r = \infty$. With the direction of rotation given in Fig. 11.16, the value of $N$ is just the negative sum of the residues of the integrand in these poles, where the residue is $2\pi i$ times the coefficient of $1/(r - r_0)$ in the Laurent expansion in the neighborhood of the pole $r_0$.

As a check we set $\phi = 0$, and find

$$-\nu + \frac{Z\alpha W}{\sqrt{m^2 - W^2}} = N,$$

which is precisely (11.6.10). Now, turning to (11.6.14), a necessary condition that the integrand have real roots is that $Z\alpha W > |k|m\sin\phi$. The eigenvalue condition is now

$$-\nu + \frac{Z\alpha W - km\sin\phi}{\sqrt{m^2 - W^2}} = N, \qquad (11.6.15)$$

which differs from (11.6.10) by the second term in the numerator. Since $W < m$ and $Z\alpha < |k|$, we expect the angle will be very small. Just how small it is, the Lamb shift will tell us in Sec. 11.6.3.
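The step from the phase integral (11.6.14) to the eigenvalue condition (11.6.15) can be verified without contour integration. The following sketch is an assumption-laden illustration, not the method of the text: it integrates along the real axis between the two turning points, doubles the result for the closed orbit, and uses the substitution r = c − d cos θ to remove the square-root endpoints. The function names and the sample numbers are hypothetical and do not describe a physical atom.

```python
import math

def radial_action(a, B, nu, steps=20000):
    """Closed-orbit integral of sqrt(-a^2 + 2B/r - nu^2/r^2) dr (midpoint rule)."""
    disc = B * B - a * a * nu * nu
    if a <= 0 or disc <= 0:
        raise ValueError("need a > 0 and B > a*nu for two real turning points")
    r1 = (B - math.sqrt(disc)) / (a * a)   # inner turning point
    r2 = (B + math.sqrt(disc)) / (a * a)   # outer turning point
    c, d = 0.5 * (r1 + r2), 0.5 * (r2 - r1)
    # With r = c - d cos(theta): sqrt((r - r1)(r2 - r)) = d sin(theta)
    # and dr = d sin(theta) dtheta, so the integrand becomes smooth.
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        total += (d * math.sin(theta)) ** 2 / (c - d * math.cos(theta))
    return 2.0 * a * total * h             # out and back: twice r1 -> r2

def quantum_number(W, k, Za, phi, m=1.0):
    """N implied by (11.6.14) for a trial energy W and mass angle phi."""
    a = math.sqrt(m * m - W * W)
    B = Za * W - k * m * math.sin(phi)
    nu = math.sqrt(k * k - Za * Za)
    return radial_action(a, B, nu) / (2.0 * math.pi)

def closed_form(W, k, Za, phi, m=1.0):
    """Left-hand side of (11.6.15)."""
    nu = math.sqrt(k * k - Za * Za)
    return -nu + (Za * W - k * m * math.sin(phi)) / math.sqrt(m * m - W * W)

W, k, Za = 0.9, -1, 0.5   # illustrative numbers only
for phi in (0.0, 0.05):
    print(phi, quantum_number(W, k, Za, phi), closed_form(W, k, Za, phi))
```

With φ = 0 the two columns reproduce the combination in (11.6.10); a non-zero φ shifts both by the same amount.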

11.6.2 Origin of the zero helicity state

The Dirac equations in the presence of a central force are

$$(W - m - V(r))\psi_A + \frac{d\psi_B}{dr} + \frac{1 - k}{r}\psi_B = 0,$$

$$-(W + m - V(r))\psi_B + \frac{d\psi_A}{dr} + \frac{1 + k}{r}\psi_A = 0.$$

The two-component wave functions, $\psi_A$ and $\psi_B$, will be shown to have opposite parities. These equations are comparable to the Weyl equations (11.6.4) and (11.6.5). Whereas Weyl’s equations are coupled through the cross terms involving mass, the Dirac equations are coupled through the diagonal terms of the density matrix involving energy as well as mass. Terms have been exchanged in the density matrix as a result of taking linear combinations of the original spinor component wave functions [cf. (11.6.6)].

Eliminating one of these wave functions results in:

$$[(W - V)^2 - m^2]\psi_A - \frac{k(k + 1)}{r^2}\psi_A + \frac{2}{r}\frac{d\psi_A}{dr} + \frac{d^2\psi_A}{dr^2} = 0. \qquad (11.6.16)$$

The constant potential solutions to (11.6.16) are spherical Bessel functions [cf. (11.5.10a) and (11.5.19a)]. The hydrogen-like solutions to this equation, as we have seen, are to be sought in a power series expansion which must terminate, and, thus, provide an eigenvalue condition.

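In momentum space, the Dirac equation for a central force is

$$(W - V(r))\psi_B = (\boldsymbol{\sigma} \cdot \mathbf{p})\psi_A, \qquad (11.6.17)$$

where

$$\boldsymbol{\sigma} \cdot \mathbf{p} = \begin{pmatrix} p_z & p_x - ip_y \\ p_x + ip_y & -p_z \end{pmatrix} = -i\begin{pmatrix} \dfrac{\partial}{\partial z} & \dfrac{\partial}{\partial x} - i\dfrac{\partial}{\partial y} \\ \dfrac{\partial}{\partial x} + i\dfrac{\partial}{\partial y} & -\dfrac{\partial}{\partial z} \end{pmatrix}. \qquad (11.6.18)$$

The mass, which would appear missing in (11.6.17), has been incorporated into the diagonal terms in (11.6.18) since

$$p_z = W\cos\vartheta = W\sqrt{1 - u^2} = m,$$

$$p_y = W\sin\vartheta\sin\phi = Wu\sin\phi = p\sin\phi,$$

$$p_x = W\sin\vartheta\cos\phi = Wu\cos\phi = p\cos\phi.$$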
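The parametrization above can be made concrete with a few lines of arithmetic. This is only an illustrative sketch with hypothetical names, assuming sin ϑ = u and natural units: the three components built from W, u and ϕ reproduce W² = p² + m², with the z-component playing the role of the mass.

```python
import math

def momentum_components(W, u, phi):
    """Return (px, py, pz) for energy W, speed u = sin(theta) and azimuth phi."""
    theta = math.asin(u)
    pz = W * math.cos(theta)                  # identified with the mass m
    px = W * math.sin(theta) * math.cos(phi)  # p cos(phi), with p = W u
    py = W * math.sin(theta) * math.sin(phi)  # p sin(phi)
    return px, py, pz

W, u, phi = 5.0, 0.6, 0.4   # illustrative values
px, py, pz = momentum_components(W, u, phi)
print(pz, math.hypot(px, py))         # mass m and momentum magnitude p = W u
print(px ** 2 + py ** 2 + pz ** 2)    # equals W^2
```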

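If we take $\psi_A$ to be an $s_{1/2}$ state wave function with spin up,

$$\psi_A(W, \mathbf{p}) = R(r)\begin{pmatrix} 1 \\ 0 \end{pmatrix}e^{i\mathbf{p} \cdot \mathbf{r} - iWt},$$

then for $\psi_B$ we find

$$\begin{aligned}
\psi_B &= -\frac{i}{W - V + m}\begin{pmatrix} \dfrac{\partial}{\partial z} & \dfrac{\partial}{\partial x} - i\dfrac{\partial}{\partial y} \\ \dfrac{\partial}{\partial x} + i\dfrac{\partial}{\partial y} & -\dfrac{\partial}{\partial z} \end{pmatrix}\begin{pmatrix} R \\ 0 \end{pmatrix}e^{i(\mathbf{p} \cdot \mathbf{r} - Wt)} \\
&= -\frac{i}{W - V(r) + m}\,\frac{1}{r}\frac{dR}{dr}\begin{pmatrix} z & x - iy \\ x + iy & -z \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix}e^{i(\mathbf{p} \cdot \mathbf{r} - Wt)} \\
&= -\frac{i}{W - V(r) + m}\,\frac{dR}{dr}\begin{pmatrix} \cos\vartheta & \sin\vartheta\,e^{-i\phi} \\ \sin\vartheta\,e^{i\phi} & -\cos\vartheta \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix}e^{i(\mathbf{p} \cdot \mathbf{r} - Wt)} \\
&= -\frac{i}{W - V(r) + m}\,\frac{dR}{dr}\sqrt{\frac{4\pi}{3}}\left[Y_1^0\begin{pmatrix} 1 \\ 0 \end{pmatrix} - \sqrt{2}\,Y_1^1\begin{pmatrix} 0 \\ 1 \end{pmatrix}\right]e^{i(\mathbf{p} \cdot \mathbf{r} - Wt)},
\end{aligned}$$

where the spherical harmonics are

$$Y_1^0 = \sqrt{\frac{3}{4\pi}}\cos\vartheta,$$

$$Y_1^{\pm 1} = \mp\sqrt{\frac{3}{8\pi}}\sin\vartheta\,e^{\pm i\phi}.$$

The $p_{1/2}$ state is a linear combination of these spherical harmonics so that $\psi_A$ and $\psi_B$ have opposite parities.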

Observe that mass does not couple the components of the spinor making it impossible to use a Lorentz transform to an inertial frame moving faster than the particle so that its helicity would be reversed. Moreover, if $m = 0$, the rotation lies in the $p_xp_y$-plane so that the presence of mass tilts the plane of rotation toward the north pole. If the mass were considered separately, as in Dirac’s formulation, it could not be connected to the helicity $h = 0$. Even if $m = 0$, as for a neutrino, we have no way of eliminating the helicity $h = 0$ if the angular momentum is not exactly parallel to the $z$-axis so that $\boldsymbol{\sigma} \cdot \mathbf{p}$ would be the helicity operator with helicities $h = 0, \pm\frac{1}{2}$.

According to the Dirac equation it is only when $p_z = 0$, and rotation occurs in the momentum $p_xp_y$-plane that we get $h = \pm\frac{1}{2}$, whether or not the mass is finite. Thus, finite mass cannot be associated with $h = 0$.


With the aid of Legendre’s equation, (11.5.7), (11.6.16) can be written as

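\[
\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi_A}{\partial r}\right)
+ \frac{1}{r^2\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\frac{\partial\psi_A}{\partial\vartheta}\right)
+ \frac{1}{r^2\sin^2\vartheta}\frac{\partial^2\psi_A}{\partial\phi^2} + q^2\psi_A = 0,
\]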

where $q^2 = (W - V)^2 - m^2$. The solutions to the radial equation are spherical Bessel functions. When the quantum conditions are applied, we will find that the magnetic quantum number satisfies $|m| \le \ell$. Therefore, we now turn our attention to the solution of the reduced wave equation in the short-wavelength diffraction limit.

We will again employ Fermat's principle of least time, just as we did in Sec. 7.2.2, only now constrained to motions on a sphere of radius $\ell$. Fermat's principle asserts that a ray will follow the path from $(\vartheta_0, \phi_0)$ to $(\vartheta, \phi)$ such that the optical path length,

\[
I = \ell\int\sqrt{d\vartheta^2 + \sin^2\vartheta\,d\phi^2}, \tag{11.6.19}
\]

is an extremum. If the potential V is slowly varying, as we assume it is, we may consider the index of refraction to be constant. Choosing ϑ to be the independent variable, (11.6.19) can be written as

\[
I = \ell\int_{\vartheta_0}^{\vartheta}\sqrt{1 + \sin^2\vartheta\,\phi'^2}\;d\vartheta, \tag{11.6.20}
\]

where the prime stands for differentiation with respect to ϑ. Calling Λ the integrand of (11.6.20), we have, on account of the cyclic nature of ϕ, that

\[
\frac{\partial\Lambda}{\partial\phi'} = \frac{\sin^2\vartheta\,\phi'}{\sqrt{1 + \sin^2\vartheta\,\phi'^2}} = \text{const.} \tag{11.6.21}
\]

is a first integral of the motion.

We set the constant equal to $\sin\vartheta_0$ so that Snell's law is satisfied, and proceed to solve for ϕ′. We then obtain

\[
d\phi = \frac{\sin\vartheta_0\,d\vartheta}{\sin^2\vartheta\,\sqrt{1 - \sin^2\vartheta_0/\sin^2\vartheta}},
\]

which requires that ϑ > ϑ0 because the sine is an increasing function on the interval (0, π/2). The integration can be carried out straightforwardly, giving the difference in longitude,

\[
\phi(\vartheta) - \phi_0 = \arccos\left(\frac{\tan\vartheta_0}{\tan\vartheta}\right), \tag{11.6.22}
\]

of two great circle arcs, ϑ0 and ϑ. The integration constant, ϕ0, has been chosen such that ϕ = ϕ0 when ϑ = ϑ0.

Fig. 11.17. A right-spherical triangle.
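The closed form (11.6.22) can be checked against the differential relation for dϕ without doing the integral: differentiating the arccosine numerically should reproduce the integrand. The sketch below uses only the Python standard library; the helper names and the sample angles are ours and are chosen purely for illustration.

```python
import math


def dphi_dtheta(theta, theta0):
    """Right-hand side of the relation for d(phi)/d(theta) preceding (11.6.22)."""
    s0, s = math.sin(theta0), math.sin(theta)
    return s0 / (s ** 2 * math.sqrt(1 - s0 ** 2 / s ** 2))


def longitude_difference(theta, theta0):
    """Closed form (11.6.22): phi - phi0 = arccos(tan(theta0) / tan(theta))."""
    return math.acos(math.tan(theta0) / math.tan(theta))


def central_difference(f, x, h=1e-6):
    """Symmetric finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


if __name__ == "__main__":
    theta0 = 0.4  # sample starting colatitude in radians (illustrative)
    for theta in (0.6, 0.9, 1.3):
        slope = central_difference(lambda t: longitude_difference(t, theta0), theta)
        print(theta, slope, dphi_dtheta(theta, theta0))
```

For each sample angle the finite-difference slope and the integrand agree to within the step-size error, and the longitude difference vanishes at ϑ = ϑ0, as the choice of integration constant requires.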

Now, (11.6.22) is related to the expressions for the sine and cosine laws of the right-spherical triangle shown in Fig. 11.17, where, for instance,

\[
\cos(\phi - \phi_0) = \cos\Theta\cdot\sin\Psi = \frac{\cos\vartheta}{\cos\vartheta_0}\sin\Psi = \frac{\cos\vartheta}{\cos\vartheta_0}\,\frac{\sin\vartheta_0}{\sin\vartheta},
\]

on account of the elliptic form of the Pythagorean theorem,

\[
\cos\vartheta = \cos\vartheta_0\cdot\cos\Theta, \tag{11.6.23}
\]

and the law of sines,

\[
\frac{\sin\vartheta_0}{\sin\Psi} = \sin\vartheta,
\]

for a right triangle.

Introducing the extremum condition (11.6.22) into Fermat's principle (11.6.20) gives the extremum optical path as

\[
I_{\text{ext}} = \int_{\vartheta_0}^{\vartheta}\frac{d\vartheta}{\sqrt{1 - \sin^2\vartheta_0/\sin^2\vartheta}} = \arccos\left(\frac{\cos\vartheta}{\cos\vartheta_0}\right) = \Theta(\vartheta, \vartheta_0). \tag{11.6.24}
\]

It is quite remarkable that the extremum of the optical path should give the elliptic version of the Pythagorean theorem. Θ is the shortest optical path connecting the latitudes ϑ and ϑ0, all three being the sides of a right-spherical triangle.


Now, the law of cosines is

\[
\cos\vartheta = \cos\vartheta_0\cos\Theta + \sin\vartheta_0\sin\Theta\cos\xi.
\]

If it turns out that the angle ξ = π/2, we are talking about a right-spherical triangle, and the law of cosines reduces to the elliptic Pythagorean theorem, (11.6.23). If we set $\sin\vartheta_0$ and $\sin\Theta$ equal to the relative velocities $u_0$ and $u$, respectively, the condition that the spherical triangle be a right triangle can be expressed as $\mathbf{u}\cdot\mathbf{u}_0 = 0$. Then, the relativistic addition of complementary velocities,

\[
\sqrt{1 - w^2} = \frac{\sqrt{1 - u_0^2}\,\sqrt{1 - u^2}}{1 - \mathbf{u}_0\cdot\mathbf{u}}, \tag{11.6.25}
\]

yields the Pythagorean theorem for spherical trigonometry, (11.6.23), where $w = \sin\vartheta$.
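The correspondence between (11.6.25) and (11.6.23) can be illustrated numerically. The sketch below assumes units with c = 1 and uses the standard expression for the magnitude of the relative velocity of two velocity vectors, $w^2 = [(\mathbf{u} - \mathbf{u}_0)^2 - (\mathbf{u}\times\mathbf{u}_0)^2]/(1 - \mathbf{u}\cdot\mathbf{u}_0)^2$. The function names and sample values are ours.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def relative_speed(u, u0):
    """Magnitude of the relativistic relative velocity of u and u0 (c = 1)."""
    diff = [x - y for x, y in zip(u, u0)]
    c = cross(u, u0)
    return math.sqrt(dot(diff, diff) - dot(c, c)) / (1 - dot(u, u0))


if __name__ == "__main__":
    theta0, arc = 0.4, 0.7                 # legs of the right-spherical triangle
    u0 = (math.sin(theta0), 0.0, 0.0)      # u0 = sin(theta0)
    u = (0.0, math.sin(arc), 0.0)          # u = sin(Theta), orthogonal to u0
    w = relative_speed(u, u0)
    theta = math.acos(math.cos(theta0) * math.cos(arc))   # (11.6.23)
    print(w, math.sin(theta))
```

For orthogonal velocities the composed speed w coincides with sin ϑ, where ϑ is the hypotenuse fixed by (11.6.23). For non-orthogonal velocities the same routine reproduces the general relation (11.6.25).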

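The phase, or eikonal, will be a certain linear combination of (11.6.22) and (11.6.24). Just as in Sec. 7.2.2 we define the eikonal, S, as the integral of the Legendre transform of Λ with respect to ϕ′ [cf. 7.4.11], viz.

\[
\begin{aligned}
S &= \ell\int\left(\Lambda - \frac{\partial\Lambda}{\partial\phi'}\,\phi'\right)d\vartheta \\
&= \ell\left\{\int\frac{d\vartheta}{\sqrt{1 - \sin^2\vartheta_0/\sin^2\vartheta}} - \int\frac{\sin^2\vartheta_0\,d\vartheta}{\sin^2\vartheta\,\sqrt{1 - \sin^2\vartheta_0/\sin^2\vartheta}}\right\} \\
&= \ell\int\sqrt{1 - \frac{\sin^2\vartheta_0}{\sin^2\vartheta}}\;d\vartheta \\
&= \ell\left\{\arccos\left(\frac{\cos\vartheta}{\cos\vartheta_0}\right) - \sin\vartheta_0\arccos\left(\frac{\tan\vartheta_0}{\tan\vartheta}\right)\right\} \\
&= \ell\,\Theta - m(\phi - \phi_0),
\end{aligned} \tag{11.6.26}
\]

where

\[
m = \ell\sin\vartheta_0 \tag{11.6.27}
\]

is the azimuthal quantum number.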
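The chain of equalities in (11.6.26) can be tested by comparing a direct quadrature of the integrand $\sqrt{1 - \sin^2\vartheta_0/\sin^2\vartheta}$ with the closed form. The sketch below uses a plain midpoint rule from the standard library. The step count, the sample angles and the function names are our own illustrative choices, and both angles are assumed to lie in (0, π/2) with ϑ > ϑ0.

```python
import math


def eikonal_closed_form(theta, theta0, ell=1.0):
    """Last two lines of (11.6.26): ell * arc - m * (phi - phi0)."""
    arc = math.acos(math.cos(theta) / math.cos(theta0))      # (11.6.24)
    dphi = math.acos(math.tan(theta0) / math.tan(theta))     # (11.6.22)
    m = ell * math.sin(theta0)                               # (11.6.27)
    return ell * arc - m * dphi


def eikonal_quadrature(theta, theta0, ell=1.0, steps=200_000):
    """Midpoint-rule integral of sqrt(1 - sin^2(theta0) / sin^2(t)) from theta0 to theta."""
    h = (theta - theta0) / steps
    s0_sq = math.sin(theta0) ** 2
    total = 0.0
    for i in range(steps):
        t = theta0 + (i + 0.5) * h
        total += math.sqrt(1.0 - s0_sq / math.sin(t) ** 2)
    return ell * total * h


if __name__ == "__main__":
    print(eikonal_closed_form(1.2, 0.4), eikonal_quadrature(1.2, 0.4))
```

The midpoint rule avoids evaluating the integrand at the turning point ϑ = ϑ0, where it vanishes with infinite slope. The two numbers agree to well within the quadrature error.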


Fig. 11.18. A right-spherical triangle traced out by an orbiting electron, where "co-" denotes the complement of the angle.

The spherical harmonics can be expressed in terms of their corresponding associated Legendre polynomials,

\[
Y_\ell^{\pm m}(\vartheta, \phi) \sim P_\ell^{m}(\cos\vartheta)\,e^{\pm im\phi}.
\]

The associated Legendre polynomials, $P_\ell^{\pm m}$, have the asymptotic form

\[
P_\ell^{\pm m}(\cos\vartheta) \sim e^{\pm iS(\vartheta, \vartheta_0)},
\]

where S is given by (11.6.26).

The complement of (11.6.24) is the angular distance of the orbiting electron from the line of nodes, measured on the orbital plane, and (11.6.22) is the projection of this angular distance onto the equator. This is shown in Fig. 11.18, which is a right-spherical triangle equivalent to Fig. 11.17. The cosine of the angle of inclination, i, is given by the ratio of the elliptic measures of the adjacent to the hypotenuse,

\[
\cos i = \frac{\tan[\pi/2 - (\phi - \phi_0)]}{\tan(\pi/2 - \Theta)} = \frac{\tan\Theta}{\tan(\phi - \phi_0)},
\]


while for the sine of the angle, we have

\[
\sin i = \frac{\sin(\pi/2 - \vartheta)}{\sin(\pi/2 - \Theta)} = \frac{\cos\vartheta}{\cos\Theta} = \cos\vartheta_0.
\]

The latter implies that the elliptic dilatation factor,

\[
\Gamma = \frac{\sqrt{1 - m^2/\ell^2}}{\sqrt{1 - m^2/(\ell^2\sin^2\vartheta)}} \ge 1,
\]

becomes larger the closer ϑ approaches ϑ0. The dilatation factor reaches its smallest value for m = 0.
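The behaviour of the dilatation factor is easy to tabulate. The following sketch evaluates the ratio above for a fixed pair (m, ℓ) at several colatitudes. The function name and the sample values are ours, and ϑ is assumed to lie between ϑ0 = arcsin(m/ℓ) and π/2.

```python
import math


def dilatation_factor(m, ell, theta):
    """sqrt(1 - m^2/ell^2) / sqrt(1 - m^2 / (ell^2 sin^2(theta)))."""
    ratio_sq = (m / ell) ** 2
    return math.sqrt(1 - ratio_sq) / math.sqrt(1 - ratio_sq / math.sin(theta) ** 2)


if __name__ == "__main__":
    m, ell = 1, 3
    theta0 = math.asin(m / ell)            # turning point from (11.6.27)
    for theta in (1.5, 1.0, 0.6, theta0 + 0.01):
        print(round(theta, 3), dilatation_factor(m, ell, theta))
    print(dilatation_factor(0, ell, 1.0))  # m = 0 gives the minimum value 1
```

The printed values increase as ϑ is lowered toward ϑ0 and never fall below unity, while m = 0 returns exactly 1.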

Even before the advent of quantum theory, the Dutch physicist Zeeman (1896) found a splitting of the spectral lines when atoms are placed in a magnetic field. Zeeman observed that when the magnetic field is placed perpendicularly to the light path, the spectral lines split into three, while when the field is parallel to the light path they split into two.

His countryman, Lorentz, showed that a charged oscillator in a magnetic field could explain these splittings. Lorentz accomplished this by decomposing the motion of the oscillator into two opposite circular motions normal to the field, and one linear motion parallel to the field. When an impressed magnetic field is applied to a rotating electron, it makes it precess about the direction of the field. The frequency of precession is known as the Larmor frequency, $\omega_L = eH/2m$, where H is the applied magnetic field. Lorentz's theory predicted that the frequency of one of the circular motions is increased exactly by $\omega_L$, while the other frequency of the circular motion is diminished by exactly the same amount. This is referred to as the 'normal' Zeeman effect to distinguish it from the 'anomalous' effect in which the splitting pattern consists of a greater number than three lines (or doublet).
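To get a feeling for the size of the Larmor frequency, the following sketch evaluates it in SI units, where the expression reads $\omega_L = eB/2m_e$ with B in tesla. This is an assumption of the example, since the text works in units where the field is written H. The constants are CODATA values, and the unperturbed line frequency is an arbitrary illustrative number.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19   # coulomb
ELECTRON_MASS = 9.1093837015e-31      # kilogram


def larmor_angular_frequency(field_tesla):
    """omega_L = e B / (2 m_e) in rad/s (SI form of eH/2m)."""
    return ELEMENTARY_CHARGE * field_tesla / (2 * ELECTRON_MASS)


def normal_zeeman_triplet(omega0, field_tesla):
    """Angular frequencies of the three lines predicted by Lorentz's theory."""
    omega_l = larmor_angular_frequency(field_tesla)
    return (omega0 - omega_l, omega0, omega0 + omega_l)


if __name__ == "__main__":
    omega_l = larmor_angular_frequency(1.0)
    print(omega_l, omega_l / (2 * math.pi))        # rad/s and Hz for B = 1 T
    print(normal_zeeman_triplet(3.0e15, 1.0))      # hypothetical optical line
```

For a field of one tesla the shift is of order 10^10 Hz, minute compared with optical frequencies of order 10^14 to 10^15 Hz, which is why the splitting demands high spectral resolution.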

In terms of the azimuthal quantum number, m, the above analysis implies that it can change by −1, 0, 1. Transitions that leave m invariant correspond to linear polarization in the direction of the field. For transitions m ± 1 → m, the radiation is circularly polarized about the direction of the field.

Even in the absence of an applied magnetic field, a revolving electron produces its own magnetic field. An increase (decrease) in m corresponds to right- (left-) circularly polarized light. Observations that are made longitudinal to the magnetic field result in a doublet with frequencies displaced in both directions of the central frequency, and the disappearance of the central line. In contrast, transverse observations result in a triplet where the central frequency is plane-polarized parallel to the field, while the outer lines are polarized in a perpendicular direction, as shown in Fig. 11.19. This is yet another example of the myriad of physical meanings that the index m on the associated Legendre polynomials can take on.

Fig. 11.19. Zeeman splitting: light path parallel (perpendicular) to field results in a doublet (triplet).

11.6.3 Lamb shift and left-hand elliptical polarization

In the late 1940s Lamb and Retherford were able to detect transitions between the very close $2s_{1/2}$ and $2p_{1/2}$ levels of hydrogen induced by perturbations using microwaves at frequencies of the order of 1000 Mc/sec. The shift can be explained as arising from the interaction of the atom with its own radiation field. To obtain the observed shift, Hans Bethe had to subtract off the infinite electromagnetic energy of the electron, and then introduce a plausible cut-off on the integral over the energy.

The original back-of-the-envelope calculation of Bethe was elaborated on by Welton [48], who considered fluctuations of the zero-point oscillations of the electromagnetic field. The energy of such oscillations is proportional to

\[
\overline{E^2} = \overline{H^2} = \frac{2}{\pi}\int_0^\infty\kappa^3\,d\kappa. \tag{11.6.28}
\]

This integral is clearly infinite. Its integrand is the energy, ω = κ, times the number of Planck oscillators between the frequencies κ and κ + dκ, which is proportional to κ² dκ. At absolute zero the integral over all modes is infinite, and this will require some type of cut-off.

A free electron, acted upon by an electric field, will have an equation of motion,

\[
m\omega^2\Delta q = -eE, \tag{11.6.29}
\]

where it is supposed that the electron's motion is periodic, and Δq is its displacement from its equilibrium position. Squaring (11.6.29), and using (11.6.28), we find the mean-square displacement is (at least heuristically) given by

\[
\overline{(\Delta q)^2} = \frac{2}{\pi}\,\frac{e^2}{m^2}\int_{\kappa_0}^{\kappa_1}\frac{d\kappa}{\kappa}, \tag{11.6.30}
\]

which diverges 'only' logarithmically if the limits are zero and infinity. The equation of motion (11.6.29) fails to take into account processes regarding the radiation of the electron. The longitudinal recoil of the electron will have to be added to the transverse oscillations which have been included. Recoil will become important for wavelengths smaller than the Compton wavelength, which is the smallest wavelength that can be used to localize the electron. Therefore, to exclude recoil, we set the upper limit at $\kappa_1 = \kappa_c$, the inverse of the Compton wavelength.

The lower limit was more problematic, since the divergence arises from oscillations of very large amplitude but very low frequency. Electron binding will help eliminate such oscillations, so the lower limit $\kappa_0$ was taken by Bethe to be slightly greater than the energy of ionization of the hydrogen atom, 13.6 eV, or 1 Rydberg. Bethe [47] needed 17.8 times this amount to obtain approximate agreement with the experimental result of Lamb and Retherford [47]. This alone attests to the ad hoc nature of Bethe's procedure.


Due to the zero-point fluctuations, the Coulomb potential,

\[
V(q) = -\frac{e^2}{r},
\]

will be perturbed so that we may develop the perturbed potential in a Taylor series expansion in the displacement about this value. We then obtain

\[
V(q + \Delta q) = \left(1 + \Delta q\cdot\nabla + \tfrac{1}{2}(\Delta q\cdot\nabla)^2 + \cdots\right)V(q).
\]

What is important is its average value, and since the spatial distribution is isotropic, it will be given by

\[
\overline{V(q + \Delta q)} = \left(1 + \tfrac{1}{6}\,\overline{(\Delta q)^2}\,\nabla^2 + \cdots\right)V(q).
\]

Although this series fails to converge in the case of the Coulomb potential, this should not prevent us from retaining only the lowest non-vanishing term, for which we have the expression (11.6.30). Multiplying the result by the probability density, $|\psi_{100}(0)|^2$, where $\psi_{100}$ is the ground state wave function of the hydrogen atom, gives an energy shift of

\[
\Delta W = \frac{4}{3}\,\frac{e^2\alpha}{\kappa_c^2}\,\ln\left(\frac{\kappa_c}{\kappa_0}\right)|\psi_{100}(0)|^2. \tag{11.6.31}
\]

Quantum electrodynamics corroborates (11.6.31), while providing a cosmetic touch-up by adding small terms to the logarithm [Sakurai 67]. This would seem like a great achievement on account of the ad hoc nature of the cut-offs.
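The order of magnitude implied by (11.6.31) can be estimated in a few lines. The sketch below works in natural units (ħ = c = 1, e² = α, $\kappa_c$ = m) and inserts Bethe's lower cut-off of 17.8 Rydberg. It assumes, as our own choice and not as part of (11.6.31) as written, that the density is evaluated for the 2s state, $|\psi_{200}(0)|^2 = (\alpha m)^3/8\pi$, since the measured shift concerns the n = 2 levels. The constants are rounded CODATA values.

```python
import math

ALPHA = 1 / 137.035999          # fine-structure constant
ELECTRON_MASS_EV = 510_998.95   # electron rest energy in eV
RYDBERG_EV = 13.605693          # hydrogen ionization energy in eV
PLANCK_EV_S = 4.135667696e-15   # Planck constant in eV s


def s_state_density_at_origin(n):
    """|psi_n00(0)|^2 = (alpha m)^3 / (pi n^3) with hbar = c = 1, in eV^3."""
    return (ALPHA * ELECTRON_MASS_EV) ** 3 / (math.pi * n ** 3)


def welton_shift_ev(n, lower_cutoff_ev, upper_cutoff_ev=ELECTRON_MASS_EV):
    """Energy shift of the form (11.6.31) with e^2 = alpha and kappa_c = m."""
    prefactor = (4 / 3) * ALPHA ** 2 / upper_cutoff_ev ** 2
    logarithm = math.log(upper_cutoff_ev / lower_cutoff_ev)
    return prefactor * logarithm * s_state_density_at_origin(n)


if __name__ == "__main__":
    shift_ev = welton_shift_ev(2, 17.8 * RYDBERG_EV)
    shift_mhz = shift_ev / PLANCK_EV_S / 1e6
    print(shift_ev, shift_mhz)
```

With these inputs the estimate comes out at a few times 10⁻⁶ eV, that is, of the order of 1000 Mc/sec, which is the scale of the measured shift. The result depends only logarithmically on the cut-offs, which is why their ad hoc choice is less damaging than it might appear.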

Quantum field theory offers still another explanation of the Lamb shift. The quantization of the radiation field leads to fluctuating field strengths in empty space, which can be viewed as a shielding effect on the electron, whereby the surrounding cloud of virtual electron-positron pairs shields its charge so that it appears smaller than what it actually is at large distances. The virtual electron-positron pairs, which are produced by the electric field acting on the vacuum (aether), are attracted to the single electron of the hydrogen atom by polarizing the pairs, so that the positrons are attracted slightly closer to the electron while the electrons, in the virtual pairs, are repulsed. Thus, the space surrounding the electron appears as a polarized dielectric, as depicted in Fig. 11.20. Rather than invoking the radiation field, we will attempt to explain the Lamb shift as the effect of the electric field acting to circularly polarize the electron mass.

Fig. 11.20. The conventional explanation of the Lamb shift as the shielding of the electron's charge by virtual electron-positron pairs that are produced by the vacuum when acted upon by an electric field.

According to Dirac, the energy depends only on n and the magnitude of k, so that there is no lifting of the degeneracy of states with the same j values, but with different ℓ values. However, (11.6.15) clearly shows that with a non-vanishing azimuthal angle ϕ ≠ 0, the energy will also depend on the sign of k, and consequently levels with different values of k but with the same modulus will be split. This will lift the degeneracy and account for the Lamb shift.

Solving (11.6.15) for the energy we get

\[
\begin{aligned}
W ={}& m\sqrt{1 - \frac{(Z\alpha)^2}{(N + |k|)^2 + 2N\left(\sqrt{k^2 - (Z\alpha)^2} - |k|\right)}} \\
&\times\sqrt{1 - \frac{k^2\sin^2\phi}{(N + |k|)^2 + 2N\left(\sqrt{k^2 - (Z\alpha)^2} - |k|\right)}} \\
&+ \frac{Z\alpha\,km\sin\phi}{(N + |k|)^2 + 2N\left(\sqrt{k^2 - (Z\alpha)^2} - |k|\right)},
\end{aligned} \tag{11.6.32}
\]

which reduces to Dirac's result, (11.6.11), when ϕ = 0. We may verify (11.6.32) by squaring both sides to obtain

\[
W^2q - 2Z\alpha W\,km\sin\phi = m^2q - m^2\left[(Z\alpha)^2 + k^2\sin^2\phi\right], \tag{11.6.33}
\]

where

\[
q = (N + |k|)^2 + 2N(|\nu| - |k|),
\]

and we have canceled the common terms $(Z\alpha\,km\sin\phi)^2$. Now adding $(Z\alpha)^2W^2$ to both sides of (11.6.33) we find

\[
\left(Z\alpha W - km\sin\phi\right)^2 = \left(m^2 - W^2\right)\left(q - (Z\alpha)^2\right). \tag{11.6.34}
\]

Observing that $q - (Z\alpha)^2 = (N + |\nu|)^2$, and taking the positive square root of (11.6.34), gives us back the eigenvalue condition, (11.6.15).
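The algebra connecting (11.6.32) and (11.6.34) can be confirmed numerically. The sketch below evaluates W from (11.6.32) and checks that it satisfies (11.6.34), together with the identity $q - (Z\alpha)^2 = (N + |\nu|)^2$. It takes $|\nu| = \sqrt{k^2 - (Z\alpha)^2}$, as in (11.6.32), and sets m = 1. The function names and the sample quantum numbers are ours.

```python
import math


def nu_abs(k, z_alpha):
    """|nu| = sqrt(k^2 - (Z alpha)^2), as it appears in (11.6.32)."""
    return math.sqrt(k * k - z_alpha ** 2)


def q_factor(n_radial, k, z_alpha):
    """q = (N + |k|)^2 + 2N(|nu| - |k|)."""
    return (n_radial + abs(k)) ** 2 + 2 * n_radial * (nu_abs(k, z_alpha) - abs(k))


def energy(n_radial, k, z_alpha, phi, mass=1.0):
    """W from (11.6.32)."""
    q = q_factor(n_radial, k, z_alpha)
    s = math.sin(phi)
    root = math.sqrt(1 - z_alpha ** 2 / q) * math.sqrt(1 - (k * s) ** 2 / q)
    return mass * root + z_alpha * k * mass * s / q


def residual_11_6_34(n_radial, k, z_alpha, phi, mass=1.0):
    """Left side minus right side of (11.6.34); zero if (11.6.32) is a solution."""
    q = q_factor(n_radial, k, z_alpha)
    w = energy(n_radial, k, z_alpha, phi, mass)
    lhs = (z_alpha * w - k * mass * math.sin(phi)) ** 2
    rhs = (mass ** 2 - w ** 2) * (q - z_alpha ** 2)
    return lhs - rhs


if __name__ == "__main__":
    z_alpha = 1 / 137.035999
    for n_radial in (0, 1, 2):
        for k in (-2, -1, 1, 2):
            print(n_radial, k, residual_11_6_34(n_radial, k, z_alpha, 0.3))
```

All residuals vanish to rounding error for either sign of k, which confirms that (11.6.32) is the positive-root solution of the quadratic (11.6.33).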

Expanding (11.6.32) in powers of Zα gives

\[
W - m = -m\left\{\frac{(Z\alpha - k\sin\phi)^2}{2n^2} + \frac{(Z\alpha)^2}{2n^4}\,(Z\alpha - k\sin\phi)^2\left(\frac{n}{|k|} - \frac{3}{4}\right) - \cdots\right\}, \tag{11.6.35}
\]

where n = N + |k|. The nonrelativistic energy is now dependent on two quantum numbers, n and k, and on the value of the azimuthal angle, ϕ. The fact that ϕ ≠ 0 represents a preference for right circular polarization. We must now determine this angle.

Observe that for ϕ = 0, (11.6.35) depends on n and |k|, but not on the sign of k. This implies that energy levels with the same j and same n are degenerate. This degeneracy is peculiar to the Coulomb field. In non-spherically symmetric potentials, as in many-electron atoms, the level with lower ℓ lies above that with higher ℓ, due to screening. Even in hydrogen, where there is no screening, there is still a splitting between the atomic levels with the same total angular momentum, but with different orbital angular momenta. This tiny splitting is between the $2s_{1/2}$ (k = −1, ℓ = 0) and $2p_{1/2}$ (k = +1, ℓ = 1) levels, with the 2s state being the higher of the two, as shown in Fig. 11.21. It is known as the Lamb shift, and was predicted by S. Pasternack even before the 1940s.

Fig. 11.21. Splitting of energy levels of a hydrogen-like atom (not drawn to scale). All shifts are left-hand elliptical polarizations.

Because (11.6.35) is unable to account for the Lamb shift with ϕ = 0, since it only depends on the modulus of k and not on its sign, recourse had to be made to quantum electrodynamics. However, for ϕ ≠ 0, (11.6.35) predicts a splitting. The splitting where the $2s_{1/2}$ level lies above the $2p_{1/2}$ level is left-hand elliptical polarization. Moreover, even in the nonrelativistic limit, where we retain only the leading term in (11.6.35), we find the shift in energy between the $2s_{1/2}$ and $2p_{1/2}$ states to be

\[
\Delta W = W_{2s_{1/2}} - W_{2p_{1/2}} = -\tfrac{1}{2}\,mZ\alpha\sin\phi, \tag{11.6.36}
\]

where Z = 1 for hydrogen.

The degeneracy in the Dirac equation has been lifted. The measured frequency shift is 1057 megacycles per second, corresponding to 0.035 cm⁻¹. To get an idea of how small this is, just compare it to the ionization energy of the n = 2 level, 2.7 × 10⁴ cm⁻¹. The shift corresponds to an energy of 4.34 × 10⁻⁶ eV. With ½αm = 1.863 × 10³ eV in (11.6.36), a negative phase angle of ϕ = −0.024′′ is given, indicating left-handed elliptical polarization.

All the energy levels of the splittings of a hydrogen-like atom are shown in Fig. 11.21. All splittings occur with a negative difference in the k values. For instance, the fine-structure splitting, which is roughly ten times that of the Lamb shift, has a negative phase angle, ϕ = −0.16′′. All allowed splittings show left-handed elliptical polarizations, which can be considered a selection rule, and are obtained from (11.6.35) without any recourse to quantum electrodynamics.
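The unit conversions quoted for the measured shift, and the step from the leading term of (11.6.35) to (11.6.36), can be reproduced with a short calculation. The sketch below uses rounded CODATA constants. The function names are ours, and the angle passed to the leading-term check is an arbitrary illustrative value, not a fitted one.

```python
import math

SPEED_OF_LIGHT_CM_S = 2.99792458e10   # cm/s
PLANCK_EV_S = 4.135667696e-15         # eV s
ALPHA = 1 / 137.035999                # fine-structure constant
ELECTRON_MASS_EV = 510_998.95         # electron rest energy in eV


def wavenumber_per_cm(frequency_hz):
    """Convert a frequency to a wavenumber in cm^-1."""
    return frequency_hz / SPEED_OF_LIGHT_CM_S


def photon_energy_ev(frequency_hz):
    """Convert a frequency to an energy in eV."""
    return PLANCK_EV_S * frequency_hz


def leading_term_ev(n, k, phi, z=1):
    """Leading term of (11.6.35): -m (Z alpha - k sin(phi))^2 / (2 n^2)."""
    return -ELECTRON_MASS_EV * (z * ALPHA - k * math.sin(phi)) ** 2 / (2 * n ** 2)


if __name__ == "__main__":
    shift_hz = 1057e6
    print(wavenumber_per_cm(shift_hz), photon_energy_ev(shift_hz))
    print(0.5 * ALPHA * ELECTRON_MASS_EV)          # the combination (1/2) alpha m in eV
    phi = 1e-3                                     # illustrative angle in radians
    splitting = leading_term_ev(2, -1, phi) - leading_term_ev(2, +1, phi)
    print(splitting, -0.5 * ELECTRON_MASS_EV * ALPHA * math.sin(phi))
```

The conversion returns about 0.035 cm⁻¹ and an energy of a few times 10⁻⁶ eV, close to the figures quoted above. The difference of the two leading terms with k = −1 and k = +1 at n = 2 reproduces the right-hand side of (11.6.36).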

References

[Bass & Schrödinger 55] L. Bass and E. Schrödinger, "Must the photon mass be zero?" Proc. Roy. Soc. A 232 (1955) 1–6.

[Bethe 47] H. A. Bethe, "The electromagnetic shift of energy levels," Phys. Rev. 72 (1947) 339–341.

[Born & Wolf 59] M. Born and E. Wolf, Principles of Optics (Pergamon, Oxford, 1959), p. 653.

[Born 60] M. Born, The Mechanics of the Atom (Frederick Ungar Pub. Co., New York, 1960), Appendix II.

[Brillouin 60] L. Brillouin, Wave Propagation and Group Velocity (Academic Press, New York, 1960), p. 143.

[Dirac 47] P. A. M. Dirac, The Principles of Quantum Mechanics, 2nd ed. (Oxford U. P., London, 1947), pp. 257–258.

[Falkoff & MacDonald 51] D. L. Falkoff and J. E. MacDonald, "On the Stokes parameters for polarized radiation," J. Opt. Soc. Am. 41 (1951) 861–862.

[Fano 49] U. Fano, "Remarks on the classical and quantum-mechanical treatment of partial polarization," J. Opt. Soc. Am. 39 (1949) 859–863.

[Fano 54] U. Fano, "A Stokes-parameter technique for the treatment of polarization in quantum mechanics," Phys. Rev. 93 (1954) 121–123.

[Farago 71] P. S. Farago, "Electron spin polarization," Rep. Prog. Phys. 34 (1971) 1055.

[Feynman & Hibbs 65] R. P. Feynman and A. R. Hibbs, Quantum Mechanics and Path Integrals (McGraw-Hill, New York, 1965), p. 35.

[Gaveau et al. 84] B. Gaveau et al., "Relativistic extension of the analogy between quantum mechanics and Brownian motion," Phys. Rev. Lett. 53 (1984) 419–422.

[Georgi 09] H. Georgi, Lie Algebras in Particle Physics, 2nd ed. (Levant Books, Kolkata, India, 2009).

[Gersh 81] H. A. Gersh, "Feynman's relativistic chessboard as an Ising model," Int. J. Theor. Phys. 20 (1981) 491–501.

[Gottfried & Weisskopf 86] K. Gottfried and V. Weisskopf, Concepts of Particle Physics, Vol. II (Oxford U. P., New York, 1986).


[Heaviside 92] O. Heaviside, Electrical Papers, Vol. II (MacMillan, London, 1892), pp. 94–5.

[Heaviside 99] O. Heaviside, Electromagnetic Theory, Vol. II (The Electrician, London, 1899), Appendix D.

[Heaviside 12] O. Heaviside, Electromagnetic Theory, Vol. III (The Electrician, London, 1912), Sec. 514.

[Heisenberg 66] W. Heisenberg, Introduction to the Unified Field Theory of Elementary Particles (Wiley, New York, 1966), p. 131.

[Jackson 05] J. D. Jackson, "Kinematics," http://pdg.lbl.gov/2005/reviews/kinemarpp.pdf.

[Jackson 75] J. D. Jackson, Classical Electrodynamics, 2nd ed. (Wiley, New York, 1975), Sec. 12.9.

[Jacobson & Schulman 84] T. Jacobson and L. S. Schulman, "Quantum stochastics: The passage from a relativistic to a non-relativistic path integral," J. Phys. A: Math. Gen. 17 (1984) 375–383.

[Jauch & Rohrlich 55] See, for example, J. M. Jauch and F. Rohrlich, Theory of Photons and Electrons (Addison-Wesley, Reading MA, 1955), Sec. 2.8.

[Jones 41] R. Clark Jones, "A new calculus for the treatment of optical systems. I. Description and discussion of the calculus," J. Opt. Soc. Am. 31 (1941) 488–493.

[Kliger et al. 90] D. S. Kliger, J. W. Lewis, and C. E. Randall, Polarized Light in Optics and Spectroscopy (Academic Press, Boston, 1990).

[Lamb & Retherford 47] W. E. Lamb, Jr. and R. C. Retherford, "Fine structure of the hydrogen atom by a microwave method," Phys. Rev. 72 (1947) 241–243.

[Landau & Lifshitz 59] L. D. Landau and E. M. Lifshitz, Fluid Mechanics (Pergamon Press, Oxford, 1959).

[Lavenda 00] B. H. Lavenda, "Heisenberg's Gitterwelt revisited," Nuovo Cimento B 115 (2000) 1385–1395.

[Lipkin 62] H. Lipkin, Beta Decay for Pedestrians (North-Holland, Amsterdam, 1962), p. 96.

[Lipkin 66] H. J. Lipkin, Lie Groups for Pedestrians (North-Holland, 1966), p. 138.

[Maxwell 65] J. C. Maxwell, "A dynamical theory of the electromagnetic field," Phil. Trans. 155 (1865) 459.

[McCrea 47] W. H. McCrea, Relativity Physics, 2nd ed. (Methuen, London, 1947), p. 59.

[McMaster 54] W. H. McMaster, "Polarization and the Stokes parameters," Am. J. Phys. 22 (1954) 351–362.

[Moriyasu 83] K. Moriyasu, An Elementary Primer for Gauge Theory (World Scientific, Singapore, 1983), p. 120.

[Mueller 48] H. Mueller, "The foundations of optics," J. Opt. Soc. Am. 37 (1948) 661.

[Nahin 88] P. J. Nahin, Oliver Heaviside: Sage in Solitude (IEEE Press, New York, 1988).

[Omnès 70] R. Omnès, Introduction to Particle Physics (Wiley-Interscience, London, 1970), pp. 81ff.

[Oppenheimer 30] J. R. Oppenheimer, "On the theory of electrons and protons," Phys. Rev. 35 (1930) 562.


[Perrin 42] F. Perrin, "Polarization of light scattered by isotropic opalescent media," J. Chem. Phys. 10 (1942) 415–427.

[Poincaré 92] H. Poincaré, Théorie Mathématique de la Lumière (Georges Carré, Paris, 1892), Vol. 2, Ch. 12.

[Rohrlich 65] F. Rohrlich, Classical Charged Particles (Addison-Wesley, Reading MA, 1965).

[Sakurai 67] J. J. Sakurai, Advanced Quantum Mechanics (Addison-Wesley, Reading MA, 1967), p. 293.

[Schweber 61] S. S. Schweber, An Introduction to Relativistic Quantum Field Theory (Harper & Row, New York, 1961), p. 110.

[Schweber 86] S. S. Schweber, "Feynman and the visualization of space-time processes," Rev. Mod. Phys. 58 (1986) 449–507.

[Shurcliff 62] W. A. Shurcliff, Polarized Light: Production and Use (Harvard U. P., Cambridge MA, 1962).

[Skilling 42] H. H. Skilling, Fundamentals of Electric Waves (Wiley, New York, 1942), Ch. XI.

[Soleillet 29] P. Soleillet, "Sur les paramètres caractérisant la polarisation partielle de la lumière dans les phénomènes de fluorescence," Ann. de Phys. 12 (1929) 23–97.

[Sommerfeld 34] A. Sommerfeld, Atombau und Spektrallinien, 5th ed. (Teubner, Berlin, 1934), p. 688.

[Stokes 52] G. G. Stokes, "On the composition and resolution of streams of polarized light from different sources," Trans. Cambridge Phil. Soc. 9 (1852) 399; Mathematical and Physical Papers, Vol. 3 (Cambridge U. P., Cambridge, 1901), p. 233.

[Welton 48] T. A. Welton, "Some observable effects of the quantum-mechanical fluctuations of the electromagnetic field," Phys. Rev. 74 (1948) 1157–1167.

Page 684: A New Perspective on Relativity

Aug. 26, 2011 11:17 SPI-B1197 A New Perspective on Relativity b1197-Index

Index

aberration, 288, 414, 458, 501, 504angle, 414constant, 460in direction of motion, 395normal to the motion, 395stellar, 89, 111, 124, 459two-way, 526

Abraham’s model, 263, 304absolute constant, see curvature,

radiusabsolute instrument, 61acceleration, 408

centripetal, 147, 148, 228effect on clock rate, 387, 401, 404longitudinal and transverse, 503role in twin paradox, 298uniform, 406, 419, 431

accelerativeframe, 448motion, nonequivalent, 449

action and reaction law, 150, 185violated by Lorentz’s force, 186

action at a distance, 158, 177, 181, 200,207, 262

action principle, 218activity, see power densityadditivity principle, 571adiabatic process, 308

conservation of enthalpy, 308aether, 548, 618

absolute motion, 297conservation, 548density, 621dielectric constant, 626

Green’s, 129incompressible, 623increase in mass, 293jelly-like, 551luminiferous, 128, 551

in contrast to electric, 531mass, 292Maxwell’s equality of

compression and rotation, 623motion, 187necessity, 264, 292restores action and reaction, 186rotation, 131storage for energy, 290velocity, 161, 621wind, 113, 434

Ampère’s law, 153, 201, 207relation to Biot and Savart, 201

analytic continuation, 585angle defect, 281, 387, 414, 464, 467,

472, 503, 507cause of Lorentz contraction, 523caused by aberration, 503upper limit on, 89

angle excess, 281, 464, 570angle of parallelism, 82, 238, 283, 393,

397, 419, 420, 471, 472, 506, 512, 526,579

Bolyai–Lobaschevsky formulafor, 85, 450

for β-decay, 560from aberration, 397from pseudorapidity, 535

657

Page 685: A New Perspective on Relativity

Aug. 26, 2011 11:17 SPI-B1197 A New Perspective on Relativity b1197-Index

658 A New Perspective on Relativity

link between hyperbolic andcircular functions, 420

lower bound for angle ofparallax, 419

angle, critical, 136angular momentum

Euclidean conservation, 433non-conservation, 166, 363, 437,

442, 462operators, 543orbital, 552, 615spin, 551, 615total, 541, 615, 639

antineutrino, 589asymptotic lines, 82

Balmer series, relativistic correction to,see Sommerfeld’s relativisticcorrection

Beltramicoordinates, 467flat model, 100metric, 87, 95, 218, 368, 410, 429,

433, 448, 467, 482, 494for a uniformly rotating

disc, 445pre-geodesic, 371relation to Liénard’s

radiation loss, 224model, 91, 98, 362, 523

deflection of light, 381Bessel function, 352, 475

modified, 476determining

thermodynamicproperties, 328

generating function for, 335spherical, 593, 598

β-decay, 555, 582Fermi’s theory, 555, 580

big bang, 497Biot–Savart law, 251birefrigent polarizer, 537birefringence, 137, 536, 539black hole, 410blackbody radiation, 319, 616blueshift, 526

Boltzmann’s law, 328Bolyai–Lobaschevsky formula, see

angle of parallelismBoyle’s law, see Mariotte’s law

Casimir invariant, 571Cauchy’s formula, 349caustic, 342, 343, 364, 366centrifugal

force, 343, 432, 439potential, effective, 462

charge conjugation symmetry,violation, 581

circle at infinity, 81, 509circuital equations, see Maxwell’s

equationsClairaut representation, 350, 367collision parameter, 380, 433color charges, 571compressional waves

hallmark of, 597Kelvin’s electromagnetic analogy,

553Compton wavelength, 329, 353Cooper pairs, 576, 608

as spin-0 bosons, 608coordinates

comoving, 410, 492controllable and uncontrollable,

308cyclic, 350homogeneous, 97, 389pseudospherical, 482Weierstrass, 97, 98, 416

Coriolis force, 343Coulomb’s

gauge, 602, 605, 610law, 185, 198, 201, 251, 357, 633,

650breakdown of, 574

covariance, Einstein’s principle, 431,447

cross-ratio, 67, 68, 70, 73, 80, 86, 190,285, 389, 413, 453, 463, 508, 522, 524

Cayley’s definition, 104, 511Poincaré’s definition, 103, 104,

512, 516

Page 686: A New Perspective on Relativity

Aug. 26, 2011 11:17 SPI-B1197 A New Perspective on Relativity b1197-Index

Index 659

relation to Doppler shifts, 190relation to hyperbolic distance,

466cubic dilatation, 546curvature, 476

as fitting error, 464constant, 477Gaussian, 362, 366, 369, 370, 409,

458, 464, 474–476, 478, 498Einstein’s need to go

beyond, 430relation to scalar curvature,

481geodesic, 351mean, 370negative, 97, 464, 477

relation to source, 474non-constant, 464positive, 476, 477

relation to sink, 474versus negative, 465

principal, 370radius, 147, 191, 225, 228scalar, 161, 479, 481

as a criterion for emptiness,480

de Broglie’s relation, 122, 628density matrix, 540, 541, 585, 641

reduced, 587representation of Stokes

parameters, 564density of states, 329depolarization, 248, 282dichroism, 536, 539diffraction

short-wavelength limit, 342diffraction, short-wavelength limit,

643diffusion process, 630dipole radiation, 597Dirac’s equation

hydrogen-like solutions to, 641in presence of central force, 640momentum space, 641

nonequivalence to Weyl’sequations, 633

dispersion, 134, 192, 615displacement current, 131, 183, 199,

548dissipation, 585, 596

Kelvin’s principle, 621Doppler shift, 107, 117, 119, 182, 296,

319, 336, 344, 386, 421, 504, 515, 525compounding, 388, 508connection with hyperbolic

length, 190, 508exponential, 396first-order, 297for water waves, 294general, 396in β-decay, 558lack of in emission theories, 211longitudinal, 296, 335, 388, 400,

471, 558ordinary, 516relativistic, 516second-order, 121, 297

related to space contraction,468

successive occurrence withrotations, 453

transverse, 334two-way, 504, 507, 518, 526

as experimental test for theangle of parallelism, 509

prediction of redshift, 526used to distinguish Klein

and Poincaré models, 509duplex equations, 618dynamic equilibrium, 293

ecliptic plane, 461Ehrenfest’s paradox, 465eikonal, 136, 363, 428, 645Einstein’s

addition law, see velocitycomposition law

equations, 161, 417, 479, 495postulates, 2, 4, 208

Page 687: A New Perspective on Relativity

Aug. 26, 2011 11:17 SPI-B1197 A New Perspective on Relativity b1197-Index

660 A New Perspective on Relativity

elastic constants
    compressional, 358, 606, 621
    equality, 622
    rigidity, 358, 606
    rotational, 358, 606, 621
electric
    displacement, 291
    waves, longitudinal, 605, 613, 625
electrokinetic
    momentum, 131, 133, 181, 348, 547, 550, 591, 598, 601
    potential, 179, 261, 262
electromotive force, 206
electron
    classical radius, 308
    models, 262
    spin, 550
electroweak interaction, 540, 554, 573, 574, 610
    Lagrangian, 576
    mediators, 616
    theory, 542
ellipsoid
    oblate, 247, 273, 302
    prolate, 242, 248, 270
elliptic
    circumference of circle, 429
    distance, 428
    geometry, 95, 287, 377, 570
        aberration in, 288
        transformation to hyperbolic, 428
        volume of a sphere, 376
    metric, for Schwarzschild inner solution, 483
    plane, 67, 376
    space dilatation, 465, 647
    volume, 486
emission theory, 187, 192, 294, 434, 445
    Ritz’s, 188, 209
energy
    binding, 307
    blackbody radiation, 321
    conservation, 432, 520, 540
    density
        compressional, 605, 623
        electromagnetic, 549
        negative, 496, 607
        positive, 497
        rate of change of kinetic, 623
    flux, 311
    gravitational, 253
    hyperbolic conservation, 463
    index, 496
    inertia, 141
    internal, 313
    kinetic
        due to a magnetic field, 264
        Euclidean, 457, 467
        negative, 204
        random thermal motion, 309
        relativistic, 331
        resulting from random thermal motion, 314
    levels of Dirac atom, 637
    negative, 549, 589
    rotation, 604
    radiant, 301
    relativistic, 534
        conservation, 631
    total, 314
energy, relativistic, 576
energy–momentum tensor, 161, 311, 332
enthalpy, 143, 302, 303, 312, 313
    density, 307
    total, 314
entropy, 313, 620
    blackbody radiation, 321
    relativistic invariant, 321
equivalence principle, 427, 431, 433, 437, 443
    downgraded by Einstein, 445
Euclid’s fifth postulate, see Euclidean, parallel postulate
Euclidean
    distance, 410
    geometry, 283, 397, 430
    length, 419, 522
    metric, 428
    parallel postulate, 79, 452
        violation, 88

Page 688: A New Perspective on Relativity


Euler’s
    equation, 409, 622
    relation, 318, 322
event horizon, 492
extinction theorem, 192, 211

Faraday’s
    equation, 602
    tubes, 244, 291, 293, 533, 618
        crowding, 293
Fechner’s hypothesis, 202
Fermat’s principle, 61, 134, 342, 346, 408, 449, 454, 643, 644
Fermi transitions, 580
fields
    conditioning, 615
    fundamental, 613
    solenoidal and irrotational, 352
Fierz interference, 581
first law, 316
FitzGerald–Lorentz contraction, 116, 302, 312, 313, 315, 317, 387, 426, 450, 464
    as a rotation, 397
    shortening of circumference of rotating disc, 465
Fizeau’s drag coefficient, 109
flat model, 361
flux, quantum mechanical, 609
forces
    centrifugal and gravitational, 477
    centripetal, 150
    gravitational, 298
    quasi-longitudinal and transversal, 147
    radiation reaction, 213, 217
        Abraham’s, 215, 220
    self-reaction, 209
    torsional, 147
four-vector, 574, 575
    Abraham’s, 215
    dispersion relation, 611
        time-like, 611, 615
    energy–momentum, 631
    invariance of, 579
    of electrodynamics, 181
    potential, 575, 600–602
        of electroweak field, 574
    velocity, 226
Frenet–Serret equations, 226
Fresnel’s dragging coefficient, 110, 111, 125, 192, 445
Friedmann’s equation, 496
fundamental form, 483
    first, 95, 350, 367, 430, 433, 479, 481
    second, 370

gas
    degenerate, 330, 333
        pressure of, 333
    ideal, 330
gauge
    invariance, 554
    theories, 574
    transformation, 586
Gauss’s law, 606, 611, 614
    for gravitation, 158
geodesics, 99, 449, 487
    null, 492
    spreading of, 452
geodetic projection, 94
Gitterwelt, 629, 630
gravitation
    Einstein’s theory of, 407
    Maxwellian theory of, 156
    Ritz’s theory of, 163
gravitational
    collapse, 463
    current, 349
    energy flux, 162
    field, 432
        inhomogeneous, 445
        produced by acceleration, 444
    potential
        Newton’s, 381, 431
        scalar, 409
        Schwarzschild’s, 371
    radiation, 343
    redshift, 369
    vortex, 162
    waves, 163
        propagation speed, 432

Page 689: A New Perspective on Relativity


        traveling at the speed of light, 161, 163
Grüneisen
    equation of state, 324, 333
    parameter, 330
gyroscopic motion, 228

Hamilton’s principle, 134
heat, 316
    conduction, 473
    diffusion equation, 475
    Fourier’s law, 475
heat content, see enthalpy
Heaviside ellipsoid, 263
helicity, 358, 540, 541, 543, 553, 588
    inertial properties, 589
    longitudinal mode, 541
    operator, 632
    positive and negative, 550, 581
    reversal, 620, 642
    synonymous to circular polarization, 562
    zero, 554
Helmholtz’s
    electromagnetic theory, 128, 620, 626
    equation, 135, 361, 550
Higgs
    field, 576
        analogous to aether, 576
        nonexistence, 554
    mechanism, 549
horizon, see circle at infinity
horocycle, 82, 85, 505, 536
Hubble’s
    law, 388, 421, 422
    parameter, 422
Hugoniot’s theorem, 625
Huygens’s principle, 174, 448, 529
hydrogen atom, 649, 650, 652
    energy level splittings, 654
    gravitational analog, 377
hypercharge, 574
hyperbolic
    circumference of circle, 429
    distance, 80, 86, 412, 495, 511
        as shortest, 397
        Klein’s definition, 511
        Poincaré’s definition, 512
        triangle inequality for, 72
    geodesics, 452
    geometry, 91, 100, 535, 572
        hallmark, 80
        transformation to elliptic, 570
        volume of a sphere, 378
    involution, 389, 390
    law of cosines, 394
    law of sines, 387, 395
    length, 412
    line element, 406
    motion, 223
    parallel axiom, 461
    rotation, 393
    transition to elliptic, 96

ideal gas law, see Mariotte’s law
ideal point, see point, at infinity
ideal triangle, 505
incompressible fluid
    condition for, 622, 623
induction, 262, 554, 601
inductive zone, 597
inertia of energy
    Heaviside’s law, 132
    Planck’s law, 307
    Thomson’s law, 149, 255
inversion, 58, 62
involute, 363
involution, 283
involutory, 75
isoperimetric quotient, 276
isospin, 543, 571, 578, 586

jets, 583
Jones and Mueller calculus, 537
Joule
    –Thomson process, 307
    heating, 596

K-calculus, 398
Kepler’s law, 351

Page 690: A New Perspective on Relativity


kinematic
    condition of compatibility, 624
    relativity, 398
kinetic potential, 317
Klein model, see projective, model
Klein–Gordon equation, 353, 587, 600, 608, 611, 612, 619

Lagrangian
    of random thermal motion, 323
Lamb shift, 638, 640, 653
    inability of Dirac’s equation to account for, 637
Laplace’s equation, 357, 474, 623
Larmor
    formula, 223, 254
        generalization, 217
        reduction, 224
    frequency, 647
    radiation term, 231
LC lumped circuit, 617
least action, principle, 549
Legendre’s equation, 593, 643
libration, 442
Lie
    algebra, 573
    group
        compact and non-compact, 572
        relation to non-Euclidean geometries, 572
    product, 573
Liénard’s
    force, 180, 213–215, 230
    formula, 225
    potential, 252
    rate of energy loss, 216
        relation to Beltrami metric, 216
    retarded potential, 182
limit circle, 98
limit cycle, see horocycle
limiting curve, see horocycle
Liouville–Beltrami half-plane model, 91, 103
Lobaschevskian geometry, see hyperbolic geometry
Lobaschevsky–Friedmann
    metric, 417
    space, 407
London’s equation, 609
    hallmark, 609
Lorentz
    –Dirac equation, 215
    boost, 502, 538, 572, 620
    electron, 148
    force, 154, 185, 217, 249
        for magnetic charge, 250
        modifications, 153
        Thomson’s derivation, 290
        vanishing, 150
        violation of third law, 150, 532
    gauge, 602, 622, 626
    invariance, 106, 612, 613
    model, 263, 304
        Abraham’s criticism, 305
    transformation, 93, 311, 313, 391, 400, 471
        derivation, 93
        for energy and momentum, 311

Mach’s principle, 375
magnetic charge, 162, 249, 621
    density, 618
        continuity equation, 618
magnetic field, internal, 609
magnetic monopoles, 618
magnetons, see magnetic monopoles
Mariotte’s law, 330, 331
mass
    anisotropy, 293
    as a vector, 146, 532
    density
        Lorentz invariant, 307
    dipole moment, 343
    electromagnetic, 251, 302, 532
    electrostatic, 251, 269

Page 691: A New Perspective on Relativity


    elliptically polarized component, 638
    enthalpy equivalence, 314
    increase by heat, 301
    invariancy, 583
    longitudinal, 146, 270
    measure of correlation of helicity states, 589
    missing in collider production, 535
    operators, 143, 544
    polarization, 536, 542, 559, 583
    rest, 143, 144, 589
        components, 148
    shell, 537, 590
    transverse, 146, 153, 193, 269, 534, 535, 588
        and longitudinal, 250, 532, 544, 582
        as vectors, 532
mass and energy, 289, 296
    based on conservation of momentum, 293
    Einstein’s equivalence, 193, 296
    Poynting’s equivalence, 124, 273
    Thomson’s equivalence, 291, 292
mass and heat content, 314
    Planck’s equivalence, 298
maximum likelihood estimate, 336
Maxwell’s
    fish-eye, 61
    speed distribution, 334, 336
    theorem, 61
Maxwell’s equations, 128, 351, 619
    along a cable, 627
    and radiation pressure, 515
    defect, 183, 200
    spherical waves, 592
    relation to hyperbolic geometry, 468
    transverse, 129
        and massless, 576
Meissner effect, 554, 608
Minkowski’s metric, 93, 430, 490, 542
Möbius
    automorphism, 392
    transform, 57, 72, 104, 285, 286, 392
Mössbauer effect, 526
momentum
    blackbody radiation, 321
    conservation, 295
        role of aether, 292
    electrokinetic, see electrokinetic momentum
    polarization, 538
    transverse, 534

neutral currents, 574
Newton’s law, 132, 410
    Ampère’s contradiction, 178
    applied to the universe, 496
    correction to, 165
nonrelativistic limit, 638

Ohm’s law, 596, 617, 618
Olbers’ paradox, 396
optical path length, 346, 448, 449, 643
optico-gravitational phenomena, 342

parallax, 88, 472, 501, 535
    angle, 472
parity inversion, see β-decay
parity violation, law, 581
particle number, 329
    non-conservation, 330
partition function, relativistic, 328
Pauli
    equation, 584
    spin matrices, 543, 573, 584
    spinor, 586
        equation, 584
perihelion, advance, 363, 381, 437
    Ritz’s priority, 166
perpetual motion, 183, 204, 230, 317
    as used by Carnot, 203
    as used by Helmholtz, 203
    as used by Poynting, 145
    as used by Ritz, 183, 230
perspectivity, 69–71
phase transition, second-order, see spontaneous symmetry-breaking

Page 692: A New Perspective on Relativity


photon spin, 552
Planck’s
    hypothesis, 311
    relation, 314, 553
Poincaré
    definition of hyperbolic distance, 525
    disc model, 65, 81, 100, 101, 392, 451, 508, 524
        Beltrami’s discovery, 91
        onto Klein model, 393
    half-plane model, 79, 80, 100, 449, 450
    principle of relativity, 140
    representation, 145, 568
    sphere, 538, 562, 563
    stress, 253, 302, 306, 326, 468, 470
        non-electromagnetic origin of, 307
Poincaré, pressure, see Poincaré, stress
point
    antipodal, 94, 377
    at infinity, 69, 70, 79, 91, 97, 98, 464
    conjugate, 391
    fixed, 73, 284, 391
        repulsive, 390
Poisson’s
    equation, 361, 610
        generalized, 363
        Riemann’s modification, 207
    equations, 547
    law for thermal conduction, 473
polarization, 129, 160, 163, 329, 531
    circular, 537, 553, 587, 647
    complete, 535, 587
        condition, 584
    degree, 581
    elementary particles, 550
    ellipse, 563
    elliptic, 577
    inertia, 144
    linear, 541, 647
    longitudinal, 581, 587
    plane, 588
potential
    advanced, 196, 200
        nonexistence, 183
    chemical, 330
    conditioning, 615
    dipole, 273
    four-vector of electromagnetic field, 574
    internal, 608
    irrotational, 625
    kinetic, 312, 314
        Legendre transform, 318
    logarithmic, 235, 474
    Newton’s, 371
    retarded, 162, 181, 209, 229, 590, 615
        FitzGerald’s, 182
        Heaviside’s interpretation, 612
        irreversibility, 183
        Riemann’s first introduction, xlvii
        Ritz’s modification, 187
        transformation into advanced, 229
    scalar, 547, 622
    solenoidal, see Coulomb gauge
    vector, see electrokinetic, momentum
    velocity, 623
power density, 254, 270, 599, 600, 604
Poynting’s vector, 249, 292, 304, 307, 533, 551
    gravitational, 162
pre-acceleration, 231
pre-geodesic, 368, 375
pressure, 312, 330, 624
    blackbody radiation, 321
    due to crowding of Faraday tubes, 293
    hydrostatic, 143, 358, 622
        analogy to, 554
        electromagnetic analog, 163, 623
    Lorentz-invariant, 305
    negative, 302

Page 693: A New Perspective on Relativity


    radiation, 118, 212, 326, 509, 515, 521, 551, 623
        critical angle, 524
        Maxwell’s prescription, 515
    relativistic invariant, 321, 323, 327
principal elongations, 546
probability amplitude for path reversal, 585
Proca’s equations, 600
    modified, 608
projective
    correspondence, 68
    geometry, 80, 91, 94, 97
    invariant, see cross-ratio
    model, 96, 106, 362, 392, 508, 509, 523, 524
    plane, 71
projectivity, 283
propagator, 587
pseudorapidity, 535
pseudosphere, 82, 84, 90, 95, 96, 367, 394, 418, 460, 464, 494, 539
    as a surface of revolution of the tractrix, 275
    volume, 439
Pythagorean theorem
    elliptic, 535, 570, 644, 645
    Euclidean, 448, 467, 488
    hyperbolic, 283, 394, 467, 469, 539

quadrilateral, Lambert, 416, 420
quadrupole interaction, 343, 381, 382
quantization condition, 639
quantum number, 632
    angular momentum, 357
    azimuthal, 356, 637, 645, 647
    helicity, 359
    magnetic, see azimuthal
    orbital, 637
    principal, 637
    radial, 637
    spin, 359

radar method, see K-calculus
radiation, 294, 554
    by an accelerated charge, 264
    damping, 229
    gauge, see Coulomb’s gauge
    model, 590
    non-thermal, 616
    reaction force, 212, 220, 226
        components, 227
        vanishing, 225, 227
    zone, 597
radiation, pressure, see pressure, radiation
radius of curvature, 84, 463, 472
    as the absolute constant, 450
rapidity, 309, 311, 388, 578
    example of a non-compact Lie group, 572
    in hadron colliders, 533
    stochastic, 337
Rayleigh
    distribution, 336
    scattering, 543, 570
redshift, 421
    exponential, 422
    gravitational, 436
    longitudinal and transverse, 396
reflection
    law of in motion, 514
    total, 136
refraction, 128
    index, 61, 76, 133, 192, 211, 342, 348, 358, 562, 618
        Cauchy’s expression, 349
        elliptic space, 61
        mechanical analog, 346
refraction, double, see birefringence
relativity
    and spin, 637
    Einstein’s principle, 445
    Poincaré’s principle, 295
Ricci tensor, 161, 432
    contracted, 409, 479
Riemann’s metric, for non-Euclidean geometries, 90, 495
rigid body, 137, 448
Ritz’s force, 153, 180, 254
    gravitational, 166
    reduction to Liénard’s, 180
Robertson–Walker metric, 418, 484
rotating disc, 453, 484

Page 694: A New Perspective on Relativity


    Einstein’s analogy with gravitation, 445
Rydberg constant, 637

Schott radiation term, 230
Schrödinger’s equation, 357, 630
Schwarzschild
    electrokinetic potential, 185
    force, 229
    metric, 346, 370, 434, 477, 484
        exterior solution, 434, 479, 481
        interior solution, 477, 484
            example of elliptic geometry, 484
    radius, 168, 171, 347, 433, 478
        in terms of density, 463
second law, 230
Shapiro effect, 348
singularity, 478
Snell’s law, 77, 125, 136, 643
Sommerfeld’s relativistic correction, 637
    due to fine-structure, 631
space-time paths, relativistic, 585
spectral lines, gravitational shift of, 378
spin, 135, 541
    –orbit interaction, 632, 639
    multiplets, 544
spinors, 541
spontaneous symmetry-breaking, 576, 612
Sagnac effect, 434
    explained by emission theory, 435
    generalized, 439
    non-conservation of angular momentum, 435
standard theory, 574
Stefan’s law, 319, 321
stereographic
    arclengths, 486
    inner product, 440, 473
        distortion, 64, 451
        for non-constant curvature, 478
        positive definiteness, 478
    model, 362, 437
    plane, 64
    projection, 65, 91, 96, 369, 509
        of spin, 558, 581, 584
        of the complex plane onto the Poincaré sphere, 565, 567
Stigler’s law of eponymy, 100, 119, 168, 290, 311, 416, 502, 537
Stokes momentum space, 584
Stokes parameters, 145, 534, 560, 638
    and spin, 577
    analogy with angular momentum, 545
    and polarization of elementary particles, 550
    and SU(2), 544
    as a four-vector, 542, 565
    density matrix representation, 632
    the Dirac equation, 638
strain, 222, 545
    irrotational, 545
    longitudinal, 548
    solenoidal, 545
    transversal, 548
stress, 142
    Maxwell, 551, 568, 569
    shear, 358, 531
    tangential, 551
    tensor, 311
    thermal, 474
strong interaction, 571
superconductivity, 576, 609, 622
    hallmark, 352, 609
superluminal speeds, 9, 293, 430, 627, 629
surfaces of discontinuity, 624

tachyons, see superluminal speeds
telegraph equation, 585
temperature, 328, 329
    limits, 328
    relativistic variation, 313
tessellations, 106
thermal conductivity, 473
Thomas precession, 502, 507
    Borel’s priority, 502

Page 695: A New Perspective on Relativity


time
    coordinate, 409
    curvature, lack of significance, 481
    dilatation, 114, 115, 223, 296, 347, 400, 427, 491, 516, 525
        second-order, 298
    free-fall, 349, 363, 370, 431, 448, 463
    geometric mean, 406, 407, 418
    local, 189
    logarithmic, 405, 418, 422
    proper, 217, 409
    reflection, 399
time, hyperbolic, see time, logarithmic
torsion, radius of, 147, 226, 227
tractrix, 90, 282, 366
transversality condition, 602
triangle inequality, 58, 71, 72
    cross-ratio, 71
    inverse cosine, 493
twin paradox, 297

ultraparallels, 82
universe
    closed, 492
    flat, 492
    open, 485, 492

vacuum, 549
    relation to aether, 650
velocity
    absolute, 180, 189, 199
        versus relative, 115
    complementary, 419
    composition law, 106, 190, 192, 208, 210, 286, 309, 389, 414, 421, 459, 502, 645
        argument against emission theories, 192
        as the isomorphism from Klein to Poincaré models, 509
        collinear, 126
        Lorentz transforms from, 308
    group, 134, 629
    hyperbolic measure, 189, 560
    phase, 134, 629
    random, 308
    relative, 191
vibrancy condition
    longitudinal Doppler shift, 335
    transverse Doppler shift, 334, 336
virial, 323, 330
    Clausius’s, 180
    theorem, 323
vortex, 547
    velocity, 626

W-bosons, 574
wave
    compressional, see wave, longitudinal
    condensation, 361, 553
    longitudinal, 128, 358, 553, 605, 614
        electric, 160
        model of hydrogen atom, 135
    transverse, 604
    transverse and longitudinal, 129, 163
weak isotopic charge, 574
Weber’s force, 178
    applied to gravity, 164
Weinberg angle, 574
Weyl’s equations, 619, 631–633
    Dirac’s transformation, 638
Wien’s
    displacement law, 319
    distribution, 335, 336, 517
Wigner angle, 502

Yukawa’s potential, 574, 612, 615

Z-boson, 576
Zeeman effect, 647