
Abstract

This is an introduction to spacetime algebra (STA) as a unified mathematical language for physics. STA simplifies, extends, and integrates the mathematical methods of classical, relativistic, and quantum physics while elucidating the geometric structure of the theory. For example, STA provides a single, matrix-free spinor method for rotational dynamics with applications from classical rigid body mechanics to relativistic quantum theory, thus significantly reducing the mathematical and conceptual barriers between classical and quantum mechanics. The entire physics curriculum can be unified and simplified by adopting STA as the standard mathematical language. This would enable early infusion of spacetime physics and give it the prominent place it deserves in the curriculum.
Spacetime Physics with Geometric Algebra¹
David Hestenes
Department of Physics and Astronomy
Arizona State University, Tempe, Arizona 85287-1504
I. Introduction
Einstein’s Special Theory of Relativity has been incorporated into the founda-
tions of theoretical physics for the better part of a century, yet it is still treated
as an add-on in the physics curriculum. Even today, a student can get a PhD in
physics with only a superficial knowledge of Relativity Theory and its import.
I submit that this sorry state of affairs is due, in large part, to serious language
barriers. The standard tensor algebra of relativity theory so differs from ordi-
nary vector algebra that it amounts to a new language for students to learn.
Moreover, it is not adequate for relativistic quantum theory, which introduces
a whole new language to deal with spin and quantization. The learning curve
for this language is so steep that only graduate students in theoretical physics
ordinarily attempt it. Thus, most physicists are effectively barred from a work-
ing knowledge of what is purported to be the most fundamental part of physics.
Little wonder that the majority is content with the nonrelativistic domain for
their research and teaching.
Beyond the daunting language barrier, tensor algebra has certain practical
limitations as a conceptual tool. Aside from its inability to deal with spinors,
standard tensor algebra is coordinate-based in an essential way, so much time
must be devoted to proving covariance of physical quantities and equations. This
reinforces reliance on coordinates in the physics curriculum, and it obscures the
fundamental role of geometric invariants in physics. We can do better, much
better!
This is the second in a series of articles introducing geometric algebra (GA)
as a unified mathematical language for physics. The first article [1] (hereafter
referred to as GA1) shows how GA simplifies and unifies the mathematical
methods of classical physics and nonrelativistic quantum mechanics. This article
extends that unification to spacetime physics by developing a spacetime algebra
(STA) expressly designed for that purpose. A third article is planned to present
a profound and surprising extension of the language to incorporate General
Relativity [2].
¹ Published in Am. J. Phys. 71 (6), June 2003.
Although this article provides a self-contained introduction to STA, the seri-
ous reader is advised to study GA1 first for background and motivation. This
is not a primer on relativity and quantum mechanics. Readers are expected
to be familiar with those subjects so they can make their own comparisons of
standard approaches to the topics treated here. Topics have been selected to
showcase unique advantages of STA rather than for balanced coverage of every
subject. Nevertheless, topics are developed in sufficient detail to make STA
useful in instruction and research, at least after some practice and consultation
with the literature. The general objectives of each Section in the article can be
summarized as follows:
Section II presents the defining grammar for STA and introduces basic defi-
nitions and theorems needed for coordinate-free formulation and application of
spacetime geometry to physics.
Section III distinguishes between proper (invariant) and relative formulations
of physics. It introduces a simple algebraic device called the spacetime split
to relate proper descriptions of physical properties to relative descriptions with
respect to inertial systems. This provides a seamless connection of STA to the
GA of classical physics in GA1.
Section IV extends the treatment of rotations and reflections in GA1 to a
coordinate-free treatment of Lorentz transformations on spacetime. The method
is more versatile than standard methods, because it applies to spinors as well
as vectors, and it reduces the composition of Lorentz transformations to the
geometric product.
Lorentz invariant physics with STA obviates any need for the passive Lorentz
transformations between coordinate systems that are required by standard co-
variant formulations. Instead, Section V uses the spinor form of an active
Lorentz transformation to characterize change of state along world lines. This
generalizes the spinor treatment of classical rigid body mechanics in GA1, so it
articulates smoothly with nonrelativistic theory. It has the dual advantages of
simplifying solutions of the classical Lorentz force equation while generalizing
it to a classical model of an electron with spin that is shown to be a classical
limit of the Dirac equation in Section VIII.
Section VI shows how STA simplifies electromagnetic field theory, including
reduction of Maxwell’s equations to a single invertible field equation. It is most
notable that this simplification comes from recognizing that the famous “Dirac
operator” is just the STA derivative with respect to a spacetime point, so it is
as significant for Maxwell’s equation as for Dirac’s equation.
Section VII reformulates Dirac’s famous equation for the electron in terms
of the real STA, thereby showing that complex numbers are superfluous in
relativistic quantum theory. STA reveals geometric structure in the Dirac wave
function that has long gone unrecognized in the standard matrix theory. That
structure is explicated and analyzed at length to ascertain implications for the
interpretation of quantum theory.
Section VIII discusses alternatives to the Copenhagen interpretation of quan-
tum mechanics that are motivated by geometric analysis of the Dirac theory.
The questions raised by this analysis may be more important than the conclu-
sions. My own view is that the Copenhagen interpretation cannot account for
the structure of the Dirac theory, but a fully satisfactory alternative remains to
be found.
Finally, Section IX outlines how STA can streamline the physics curriculum
to give the powerful ideas of relativistic field theory and quantum mechanics
roles that are commensurate with their importance.
II. Spacetime Algebra
The standard model for spacetime is a real 4D Minkowski vector space $\mathcal{M}^4$ called
Minkowski spacetime or (by suppressing the distinction between the model and
the physical reality it is supposed to represent) simply spacetime. With vector
addition and scalar multiplication taken for granted, we impose the geometry
of spacetime on $\mathcal{M}^4$ by defining the geometric product $uv$ for vectors $u, v, w$ by
the following rules:
$$(uv)w = u(vw), \qquad \text{associative} \tag{1}$$
$$u(v + w) = uv + uw, \qquad \text{left distributive} \tag{2}$$
$$(v + w)u = vu + wu, \qquad \text{right distributive} \tag{3}$$
$$v^2 = \epsilon_v |v|^2, \qquad \text{contraction} \tag{4}$$
where $\epsilon_v$ is the signature of $v$ and the magnitude $|v|$ is a real non-negative scalar. As
usual in spacetime physics, we say that $v$ is timelike if its signature is positive
($\epsilon_v = 1$), spacelike if its signature is negative ($\epsilon_v = -1$), or lightlike if $|v| = 0$,
which is equivalent to null signature ($\epsilon_v = 0$).
It should be noted that these are the same rules defining the “classical ge-
ometric algebra” in GA1, except for the signature in the contraction rule (4)
that allows vectors to have negative or null square. (This modification was the
great innovation of Minkowski that we honor by invoking his name!)
Spacetime vectors are denoted by italic letters to distinguish them from the
3D vectors denoted by boldface letters in GA1. This convention is especially
helpful when we formulate relations between the two kinds of vector in Section
III.
By successive multiplications and additions, the vectors of $\mathcal{M}^4$ generate a
geometric algebra $\mathcal{G}_4 = \mathcal{G}(\mathcal{M}^4)$ called spacetime algebra (STA). As usual in a
geometric algebra, the elements of $\mathcal{G}_4$ are called multivectors. The above rules
defining the geometric product are the basic grammar rules of STA.
In reviewing its manifold applications to physics, one can see that STA derives
astounding power and versatility from
• the simplicity of its grammar,
• the geometric meaning of multiplication,
• the way geometry links the algebra to the physical world.
As we have seen before, the geometric product $uv$ can be decomposed into a
symmetric inner product
$$u \cdot v = \tfrac{1}{2}(uv + vu) = v \cdot u, \tag{5}$$
and an antisymmetric outer product
$$u \wedge v = \tfrac{1}{2}(uv - vu) = -v \wedge u, \tag{6}$$
so that
$$uv = u \cdot v + u \wedge v. \tag{7}$$
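The grammar rules (1) to (4) and the decomposition (5) to (7) can be checked numerically. The following is a minimal sketch, not taken from the article. It assumes an orthonormal frame of four vectors with squares (+1, −1, −1, −1), stores a multivector as a dictionary from basis-blade bitmasks to coefficients (bit k set means the frame vector with index k is a factor), and uses hypothetical helper names (`blade_mul`, `gp`, `add`, `vec`) chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

SIG = (1, -1, -1, -1)  # assumed squares of the frame vectors gamma_0..gamma_3


def blade_mul(a, b):
    """Multiply basis blades given as bitmasks; return (sign, bitmask)."""
    sign, t = 1, a >> 1
    while t:  # parity of the swaps needed to sort the vector factors
        if bin(t & b).count("1") % 2:
            sign = -sign
        t >>= 1
    for k in range(4):  # each repeated vector contracts to its square
        if (a & b) >> k & 1:
            sign *= SIG[k]
    return sign, a ^ b


def gp(x, y):
    """Geometric product of multivectors stored as {bitmask: coefficient}."""
    out = {}
    for (a, ca), (b, cb) in product(x.items(), y.items()):
        s, c = blade_mul(a, b)
        out[c] = out.get(c, 0) + s * ca * cb
    return {k: v for k, v in out.items() if v != 0}


def add(x, y, scale=1):
    """Return x + scale * y."""
    out = dict(x)
    for k, v in y.items():
        out[k] = out.get(k, 0) + scale * v
    return {k: v for k, v in out.items() if v != 0}


def vec(c0, c1, c2, c3):
    """Vector c0*gamma_0 + c1*gamma_1 + c2*gamma_2 + c3*gamma_3."""
    return {1 << k: c for k, c in enumerate((c0, c1, c2, c3)) if c != 0}


u, v = vec(2, 1, 0, 3), vec(1, 4, -2, 0)  # arbitrary sample vectors
uv, vu = gp(u, v), gp(v, u)
inner = {k: Fraction(c, 2) for k, c in add(uv, vu).items()}      # eq. (5)
outer = {k: Fraction(c, 2) for k, c in add(uv, vu, -1).items()}  # eq. (6)

print("u^2 =", gp(u, u))   # a pure scalar, as the contraction rule (4) requires
print("u.v =", inner)      # only the scalar blade (bitmask 0) survives
print("grades of u^v:", {bin(k).count("1") for k in outer})  # only grade 2
print("uv == u.v + u^v:", add(inner, outer) == uv)           # eq. (7)
```

Exact integer and `Fraction` arithmetic is used so that cancelled terms vanish identically; with floating-point coefficients the zero filter in `gp` would need a tolerance.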
To facilitate coordinate-free manipulations in STA, it is useful to generalize
the inner and outer products of vectors to arbitrary multivectors. We define the
outer product along with the notion of k-vector iteratively as follows: Scalars are
defined to be 0-vectors, vectors are 1-vectors, and bivectors, such as $u \wedge v$, are
2-vectors. For a given k-vector $K$, the integer $k$ is called the step (or grade) of $K$.
For $k \geq 1$, the outer product of a vector $v$ with a k-vector $K$ is a (k + 1)-vector
defined in terms of the geometric product by
$$v \wedge K = \tfrac{1}{2}\left(vK + (-1)^k Kv\right) = (-1)^k K \wedge v. \tag{8}$$
The corresponding inner product is defined by
$$v \cdot K = \tfrac{1}{2}\left(vK + (-1)^{k+1} Kv\right) = (-1)^{k+1} K \cdot v, \tag{9}$$
and it can be proved that the result is a (k − 1)-vector. Adding (8) and (9) we
obtain
$$vK = v \cdot K + v \wedge K, \tag{10}$$
which obviously generalizes (7). The important thing about (10) is that it
decomposes $vK$ into (k − 1)-vector and (k + 1)-vector parts.
A basis for STA can be generated by a standard frame $\{\gamma_\mu;\ \mu = 0, 1, 2, 3\}$ of
orthonormal vectors, with timelike vector $\gamma_0$ in the forward light cone and
components $g_{\mu\nu}$ of the usual metric tensor given by
$$g_{\mu\nu} = \gamma_\mu \cdot \gamma_\nu = \tfrac{1}{2}(\gamma_\mu\gamma_\nu + \gamma_\nu\gamma_\mu). \tag{11}$$
(We use c = 1 so spacelike and timelike intervals are measured in the same
unit.) The $\gamma_\mu$ determine a unique righthanded unit pseudoscalar
$$i = \gamma_0\gamma_1\gamma_2\gamma_3 = \gamma_0 \wedge \gamma_1 \wedge \gamma_2 \wedge \gamma_3. \tag{12}$$
It follows that
$$i^2 = -1, \quad \text{and} \quad \gamma_\mu i = -i\gamma_\mu. \tag{13}$$
Thus, $i$ is a geometrical $\sqrt{-1}$, but it anticommutes with all spacetime vectors.
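Both relations in (13) follow from the frame properties alone. As a worked sketch (assuming only that distinct frame vectors anticommute and that $\gamma_0^2 = 1$, $\gamma_k^2 = -1$ for k = 1, 2, 3, as the signature requires), the square of the pseudoscalar is obtained by moving each repeated factor leftward until it meets its partner:

```latex
\begin{aligned}
i^2 &= \gamma_0\gamma_1\gamma_2\gamma_3\,\gamma_0\gamma_1\gamma_2\gamma_3
     = (-1)^3\,\gamma_0^2\;\gamma_1\gamma_2\gamma_3\,\gamma_1\gamma_2\gamma_3
     = -\,\gamma_1\gamma_2\gamma_3\,\gamma_1\gamma_2\gamma_3 \\
    &= -(-1)^2\,\gamma_1^2\;\gamma_2\gamma_3\,\gamma_2\gamma_3
     = \gamma_2\gamma_3\,\gamma_2\gamma_3
     = -\,\gamma_2^2\,\gamma_3^2
     = -1 .
\end{aligned}
```

The anticommutation rule follows the same way: each $\gamma_\mu$ commutes with itself and anticommutes with the other three factors of $i$, so moving it through $i$ costs a factor $(-1)^3 = -1$.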
By forming all distinct products of the $\gamma_\mu$ we obtain a complete basis for the
STA $\mathcal{G}_4$ consisting of the $2^4 = 16$ linearly independent elements
$$1, \quad \gamma_\mu, \quad \gamma_\mu \wedge \gamma_\nu, \quad \gamma_\mu i, \quad i. \tag{14}$$
To facilitate algebraic manipulations it is convenient to introduce the
reciprocal frame $\{\gamma^\mu\}$ defined by the equations
$$\gamma^\mu = g^{\mu\nu}\gamma_\nu \quad \text{or} \quad \gamma_\mu \cdot \gamma^\nu = \delta_\mu^\nu. \tag{15}$$
(Summation convention in force!) Now, any multivector can be expressed as a
linear combination of the basis elements (14). For example, a bivector $F$ has
the expansion
$$F = \tfrac{1}{2} F^{\mu\nu}\, \gamma_\mu \wedge \gamma_\nu, \tag{16}$$
with its “scalar components” $F^{\mu\nu}$ given by
$$F^{\mu\nu} = \gamma^\mu \cdot F \cdot \gamma^\nu = \gamma^\nu \cdot (\gamma^\mu \cdot F) = (\gamma^\nu \wedge \gamma^\mu) \cdot F. \tag{17}$$
Note that the two inner products in the second form can be performed in either
order, so a parenthesis is not needed.
The entire spacetime algebra is obtained by taking linear combinations of
basis k-vectors in (14).
A generic element $M$ of the STA, called a multivector, can therefore be written
in the expanded form
$$M = \alpha + a + F + bi + \beta i, \tag{18}$$
where $\alpha$ and $\beta$ are scalars, $a$ and $b$ are vectors, and $F$ is a bivector. This is a
decomposition of $M$ into its k-vector parts, with k = 0, 1, 2, 3, 4, as is expressed
more explicitly by putting (18) in the form
$$M = \sum_{k=0}^{4} M_{(k)}, \tag{19}$$
where the subscript (k) means “k-vector part.” Of course, $M_{(0)} = \alpha$, $M_{(1)} = a$,
$M_{(2)} = F$, $M_{(3)} = bi$, $M_{(4)} = \beta i$. Alternative notations include
$M_S = \langle M \rangle = M_{(0)}$ for the scalar part of a multivector. The scalar part of a
product behaves much like the “trace” in matrix algebra. For example, we have the
very useful theorem $\langle MN \rangle = \langle NM \rangle$ for arbitrary $M$ and $N$.
Computations are also facilitated by the operation of reversion, the name
indicating reversal in the order of geometric products. For $M$ in the expanded
form (18) the reverse $\tilde{M}$ can be defined by
$$\tilde{M} = \alpha + a - F - bi + \beta i. \tag{20}$$
Note, in particular, the effect of reversion on the various k-vector parts:
$$\tilde{\alpha} = \alpha, \quad \tilde{a} = a, \quad \tilde{F} = -F, \quad \tilde{i} = i. \tag{21}$$
It is not difficult to prove that
$$(MN)^{\sim} = \tilde{N}\tilde{M}, \tag{22}$$
for arbitrary $M$ and $N$. For example, in (20) we have $(bi)^{\sim} = ib = -bi$, where
the last sign follows from (13).
A positive definite magnitude $|M|$ for any multivector $M$ can now be defined
by
$$|M|^2 = |\langle M\tilde{M} \rangle|. \tag{23}$$
Any multivector $M$ can be decomposed into the sum of an even part $M_+$ and
an odd part $M_-$ defined in terms of the expanded form (18) by
$$M_+ = \alpha + F + \beta i, \tag{24}$$
$$M_- = a + bi, \tag{25}$$
or, equivalently, by
$$M_\pm = \tfrac{1}{2}(M \mp iMi). \tag{26}$$
The set $\{M_+\}$ of all even multivectors forms an important subalgebra of STA
called the even subalgebra.
If $\psi$ is an even multivector, then $\psi\tilde{\psi}$ is also even, but its bivector part must
vanish according to (20), since $(\psi\tilde{\psi})^{\sim} = \psi\tilde{\psi}$. Therefore, $\psi\tilde{\psi}$ has only scalar and
pseudoscalar parts, as expressed by writing
$$\psi\tilde{\psi} = \rho e^{i\beta} = \rho(\cos\beta + i\sin\beta), \tag{27}$$
where $\rho \geq 0$ and $\beta$ are scalars. If $\rho \neq 0$ we can derive from $\psi$ an even multivector
$R = \psi(\psi\tilde{\psi})^{-1/2}$ satisfying
$$R\tilde{R} = \tilde{R}R = 1. \tag{28}$$
Then $\psi$ can be put in the canonical form
$$\psi = (\rho e^{i\beta})^{1/2} R. \tag{29}$$
We shall see that this invariant decomposition has a fundamental physical
significance in the Dirac Theory.
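The claim behind (27), that the product of an even multivector with its reverse has only scalar and pseudoscalar parts, can be spot-checked on a concrete example. This is a minimal sketch under stated assumptions, not the article's method: multivectors are stored as dictionaries from basis-blade bitmasks to coefficients, the frame has squares (+1, −1, −1, −1), and the names `blade_mul`, `gp`, and `reverse` are hypothetical helpers introduced only for this illustration.

```python
import math
from itertools import product

SIG = (1, -1, -1, -1)  # assumed squares of gamma_0..gamma_3
I = 0b1111             # bitmask of the pseudoscalar i = gamma_0 gamma_1 gamma_2 gamma_3


def blade_mul(a, b):
    """Multiply basis blades given as bitmasks; return (sign, bitmask)."""
    sign, t = 1, a >> 1
    while t:  # parity of the swaps needed to sort the vector factors
        if bin(t & b).count("1") % 2:
            sign = -sign
        t >>= 1
    for k in range(4):  # each repeated vector contracts to its square
        if (a & b) >> k & 1:
            sign *= SIG[k]
    return sign, a ^ b


def gp(x, y):
    """Geometric product of multivectors stored as {bitmask: coefficient}."""
    out = {}
    for (a, ca), (b, cb) in product(x.items(), y.items()):
        s, c = blade_mul(a, b)
        out[c] = out.get(c, 0) + s * ca * cb
    return {k: v for k, v in out.items() if v != 0}


def reverse(x):
    """Reversion, eq. (20): grades 2 and 3 change sign, grades 0, 1, 4 do not."""
    out = {}
    for blade, c in x.items():
        k = bin(blade).count("1")
        out[blade] = -c if (k * (k - 1) // 2) % 2 else c
    return out


# An arbitrary even multivector: psi = 2 + gamma_0 gamma_1 + 3 gamma_1 gamma_2 + i
psi = {0: 2, 0b0011: 1, 0b0110: 3, I: 1}
psi_rev = gp(psi, reverse(psi))

# Read off rho and beta from psi psi~ = rho (cos beta + i sin beta), eq. (27)
rho = math.hypot(psi_rev.get(0, 0), psi_rev.get(I, 0))
beta = math.atan2(psi_rev.get(I, 0), psi_rev.get(0, 0))

print("blades present in psi psi~:", sorted(psi_rev))  # only 0 and 15 appear
print("rho =", rho, " beta =", beta)
```

For this particular ψ the product works out to 11 + 4i, so the bivector terms cancel exactly as the reversion argument predicts.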
An important special case of the decomposition (29) is its application to a
bivector $F$, for which it is convenient to replace $\beta/2$ by $\beta + \pi/2$ and write
$f = \rho^{1/2} R i$. Thus, for any bivector $F$ that is not null ($F^2 \neq 0$) we have the
invariant canonical form
$$F = f e^{i\beta} = f(\cos\beta + i\sin\beta), \tag{30}$$
where $f^2 = -f\tilde{f} = |f|^2$, so $f$ is said to be a timelike bivector with magnitude
$|f|$. Similarly, the dual $if$ is said to be a spacelike bivector, since $(if)^2 = -|f|^2$.
Thus the right side of (30) is the unique decomposition of $F$ into a sum of
mutually commuting timelike and spacelike parts.
When $F^2 = 0$, $F$ is said to be a lightlike bivector, and it can still be written
in the form (30) with
$$f = k \wedge e = ke, \tag{31}$$
where $k$ is a null vector and $e$ is a spacelike vector orthogonal to $k$. In this
case, the decomposition is not unique, and the exponential factor can always be
absorbed in the definition of $f$.
To extend spacetime algebra into a complete spacetime calculus, suitable
definitions for derivatives and integrals are required. Though that can be done in a
completely coordinate-free way [6], it is more expedient here to exploit one’s prior
knowledge about coordinates.
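For each spacetime point $x$ a standard frame $\{\gamma_\mu\}$ determines a set of
“rectangular coordinates” $\{x^\mu\}$ given by
$$x^\mu = \gamma^\mu \cdot x \quad \text{and} \quad x = x^\mu \gamma_\mu. \tag{32}$$
In terms of these coordinates the derivative with respect to a spacetime point
$x$ is an operator $\nabla \equiv \partial_x$ that can be defined by
$$\nabla = \gamma^\mu \partial_\mu, \tag{33}$$
where $\partial_\mu$ is given by
$$\partial_\mu = \frac{\partial}{\partial x^\mu} = \gamma_\mu \cdot \nabla. \tag{34}$$
The square of $\nabla$ is the usual d’Alembertian
$$\nabla^2 = g^{\mu\nu}\partial_\mu\partial_\nu \quad \text{where} \quad g^{\mu\nu} = \gamma^\mu \cdot \gamma^\nu. \tag{35}$$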
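The step from (33) to (35) takes one line once the product of frame vectors is split into symmetric and antisymmetric parts. As a worked sketch (assuming a constant frame, so that the $\gamma^\mu$ pass through the derivatives, and using the fact that partial derivatives commute, so that only the symmetric part of $\gamma^\mu\gamma^\nu$ contributes):

```latex
\begin{aligned}
\nabla^2 &= \gamma^\mu\gamma^\nu\,\partial_\mu\partial_\nu
          = \tfrac{1}{2}\left(\gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu\right)\partial_\mu\partial_\nu
          = \left(\gamma^\mu\cdot\gamma^\nu\right)\partial_\mu\partial_\nu
          = g^{\mu\nu}\,\partial_\mu\partial_\nu \\
         &= \partial_0^{\,2} - \partial_1^{\,2} - \partial_2^{\,2} - \partial_3^{\,2} .
\end{aligned}
```

The last line uses the signature (+, −, −, −) of the standard frame with $x^0 = t$ and c = 1.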
The matrix representation of the vector derivative $\nabla$ can be recognized as the
so-called “Dirac operator,” originally discovered by Dirac by seeking a “square
root” of the d’Alembertian (35) in order to find a first order “relativistically
invariant” wave equation for the electron. In STA however, where the $\gamma_\mu$ are
vectors rather than matrices, it is clear that $\nabla$ is a vector operator; indeed, it
provides an appropriate definition for the derivative with respect to any
spacetime vector variable.
Contrary to the impression given by conventional accounts of relativistic
quantum theory, the operator $\nabla$ is not specially adapted to spin-$\tfrac{1}{2}$ wave equations.
It is equally apt for electromagnetic field equations, as seen in Section VI.
This is a good point to describe the relation of STA to the standard Dirac
algebra. The Dirac matrices are representations of the vectors $\gamma_\mu$ in STA by
4 × 4 matrices, and to emphasize this correspondence the vectors here are
denoted with the same symbols $\gamma_\mu$ ordinarily used to represent the Dirac matrices.
In view of what we know about STA, this correspondence reveals the physical
significance of the Dirac matrices, appearing so mysteriously in relativistic
quantum mechanics: The Dirac matrices are no more and no less than matrix
representations of an orthonormal frame of spacetime vectors and thereby they
characterize spacetime geometry. But how can this be? Dirac never said any
such thing! And physicists today regard the set $\{\gamma_\mu\}$ as a single vector with
matrices for components. Nevertheless, their practice shows that the “frame
interpretation” is the correct one, though we shall see later that the “component
interpretation” is actually equivalent to it in certain circumstances. The
correct interpretation was actually inherent in Dirac’s argument to derive the
matrices in the first place: First he put the $\gamma_\mu$ in one-to-one correspondence
with orthogonal directions in spacetime by indexing them. Second, he related
the $\gamma_\mu$ to the metric tensor by imposing the “peculiar condition” (11) on the
matrices for formal algebraic reasons. But we see in (11) that this condition
has a clear geometric meaning in STA as the inner product of vectors in the
frame. Finally, Dirac introduced associativity automatically by employing matrix
algebra, without realizing that it has a geometric meaning in this context.
If indeed the physical significance of the Dirac matrices derives entirely from
their interpretation as a frame of vectors, then their specific matrix properties
must be irrelevant to physics. That is proved in Section VII by dispensing with
matrices altogether and formulating the Dirac theory entirely in terms of STA.
In relativistic quantum mechanics one often encounters the notation
$\gamma \cdot p = \gamma_\mu p^\mu$, where $\gamma$ is regarded formally as a vector with matrices $\gamma_\mu$ as components
and $p$ is an ordinary vector. Likewise, the Dirac operator is denoted by
$\gamma \cdot \partial = \gamma^\mu \partial_\mu$ without recognizing it as a generic vector derivative with components $\partial_\mu$.
The notation $\gamma \cdot p$ has the same deficiencies as the notation $\boldsymbol{\sigma} \cdot \mathbf{a}$ criticized in
GA1. In STA it is inconsistent with identification of $\{\gamma_\mu\}$ as an orthonormal
frame.
III. Proper Physics and Spacetime Splits
STA makes it possible to formulate and analyze conventional relativistic physics
in invariant form without reference to a coordinate system. To emphasize the
distinctive features of this formulation, I like to call it “proper physics.” From the
proper point of view, the term “relativistic mechanics” is a misnomer, because
the theory is less rather than more relativistic than the so-called “nonrelativis-
tic” mechanics of Newton. The equations describing a particle in Newtonian
mechanics depend on the motion of the particle relative to some observer; in
Einstein’s mechanics they do not. Einstein originally formulated his mechanics
in terms of “relative variables” (such as the position and velocity of a particle
relative to a given observer), but he eliminated dependence of the equations on
the observer’s motion by the “relativity postulate,” which requires that the form
of the equations be invariant under a change of relative variables from those of
one inertial observer to those of another. Despite the taint of misnomer, the
terms “relativistic” and “nonrelativistic” are so ensconced in the literature that
it is awkward to avoid them.
Minkowski’s covariant formulation of Einstein’s theory replaced the explicit
use of variables relative to inertial observers by components relative to an
arbitrary coordinate system for spacetime. The “proper formulation” given here
takes another step to move from covariance to invariance by relating particle
motion directly to Minkowski’s “absolute spacetime” without reference to any
coordinate system. Minkowski had the great idea of interpreting Einstein’s
theory of relativity as a prescription for fusing space and time into a single entity
“spacetime” [5]. The straightforward algebraic characterization of “Minkowski
spacetime” by spacetime algebra makes a proper formulation of physics possible.
The history or world line of a material particle is a timelike curve $x = x(\tau)$ in
spacetime. Particle conservation is expressed by assuming that the function $x(\tau)$
is single-valued and continuous except possibly at discrete points where particle
creation and/or annihilation occurs. Only differentiable particle histories are
considered here, and τ always refers to the proper time (arc length) of a particle
history. After a unit of length (say centimeters) has been chosen, the physical
significance of the spacetime metric is fixed by the assumption that the proper
time of a material particle is equal to the time (in centimeters) recorded on a
(perhaps hypothetical) clock traveling with the particle.
The unit tangent $v = v(\tau) = dx/d\tau \equiv \dot{x}$ of a particle history will be called
the (proper) velocity of the particle. By the definition of proper time, we have
$d\tau = |dx| = |(dx)^2|^{1/2}$, and
$$v^2 = 1. \tag{36}$$
The term “proper velocity,” is preferable to the alternative terms “world veloc-
ity,” “invariant velocity,” and “four velocity.” The adjective “proper” is used to
emphasize that the velocity v describes an intrinsic property of the particle, in-
dependent of any observer or coordinate system. The adjective “absolute” would
do the same, but it may not be free from undesirable connotations. Moreover,
the word “proper” is shorter and has already been used in a similar sense in
the terms “proper mass” and “proper time.” The adjective “invariant” is inap-
propriate, because no coordinates or transformation group has been introduced.
The velocity should not be called a “4-vector,” because that term means pseu-
doscalar in STA; besides, there is no need to refer to four components of the
velocity.
Though STA enables us to describe physical processes by proper equations,
observations and measurements are often expressed in terms of variables tied
to a particular inertial system, so we need to know how to reformulate proper
equations in terms of those variables. STA provides a very simple way to do
that called a spacetime split.
In STA a given inertial system is completely characterized by a single
future-pointing, timelike unit vector. Refer to the inertial system characterized by the
vector $\gamma_0$ as the $\gamma_0$-system. The vector $\gamma_0$ is tangent to the world line of an
observer at rest in the $\gamma_0$-system, so it is convenient to use $\gamma_0$ as a name for
the observer. The observer $\gamma_0$ is represented algebraically in STA in the same
way as any other physical system, and the spacetime split amounts to no more
than comparing the motion of a given system (the observer) to other physical
systems. Indeed, the world line of an inertial observer is the straight world
line of a free particle, so inertial frames can be characterized by free particles
without the anthropomorphic reference to observers.
An inertial observer $\gamma_0$ determines a unique mapping of spacetime into the
even subalgebra of STA. For each spacetime point (or event) $x$ the mapping is
specified by
$$x\gamma_0 = t + \mathbf{x}, \tag{37}$$
where
$$t = x \cdot \gamma_0 \tag{38}$$
and
$$\mathbf{x} = x \wedge \gamma_0. \tag{39}$$
This defines the $\gamma_0$-split of spacetime. Equation (38) assigns a unique time $t$
to every event $x$; indeed, (38) is the equation for a one-parameter family of
spacelike hyperplanes with normal $\gamma_0$.
Equation (39) assigns to each event $x$ a unique position vector $\mathbf{x}$ in the
$\gamma_0$-system. Thus, to each event $x$ the single equation (37) assigns a unique time
and position in the $\gamma_0$-system. Note that the reverse of (37) is
$$\gamma_0 x = \gamma_0 \cdot x + \gamma_0 \wedge x = t - \mathbf{x}, \tag{40}$$
so, since $\gamma_0^2 = 1$,
$$x^2 = (x\gamma_0)(\gamma_0 x) = (t + \mathbf{x})(t - \mathbf{x}) = t^2 - \mathbf{x}^2. \tag{41}$$
The form and value of this equation are independent of the chosen observer; thus
we have proved that the expression $t^2 - \mathbf{x}^2$ is Lorentz invariant without even
mentioning a Lorentz transformation. Thus, the term “Lorentz invariant” can
be construed as meaning “independent of a chosen spacetime split.” In contrast
to (41), equation (37) is not Lorentz invariant; indeed, for a different observer
$\gamma_0'$ we get the split
$$x\gamma_0' = t' + \mathbf{x}'. \tag{42}$$
Mostly we shall work with manifestly Lorentz invariant equations, which are
independent of even an indirect reference to an inertial system.
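The spacetime split (37) to (41) can be traced numerically for a single event. The following is a minimal sketch, not taken from the article: it assumes a frame with squares (+1, −1, −1, −1), stores multivectors as dictionaries from basis-blade bitmasks to coefficients, and uses hypothetical helper names (`blade_mul`, `gp`, `vec`). The sample event is arbitrary.

```python
from itertools import product

SIG = (1, -1, -1, -1)  # assumed squares of gamma_0..gamma_3


def blade_mul(a, b):
    """Multiply basis blades given as bitmasks; return (sign, bitmask)."""
    sign, t = 1, a >> 1
    while t:  # parity of the swaps needed to sort the vector factors
        if bin(t & b).count("1") % 2:
            sign = -sign
        t >>= 1
    for k in range(4):  # each repeated vector contracts to its square
        if (a & b) >> k & 1:
            sign *= SIG[k]
    return sign, a ^ b


def gp(x, y):
    """Geometric product of multivectors stored as {bitmask: coefficient}."""
    out = {}
    for (a, ca), (b, cb) in product(x.items(), y.items()):
        s, c = blade_mul(a, b)
        out[c] = out.get(c, 0) + s * ca * cb
    return {k: v for k, v in out.items() if v != 0}


def vec(c0, c1, c2, c3):
    """Vector c0*gamma_0 + c1*gamma_1 + c2*gamma_2 + c3*gamma_3."""
    return {1 << k: c for k, c in enumerate((c0, c1, c2, c3)) if c != 0}


g0 = vec(1, 0, 0, 0)  # the observer gamma_0
x = vec(5, 1, 2, 2)   # sample event x = 5 gamma_0 + gamma_1 + 2 gamma_2 + 2 gamma_3

xg0 = gp(x, g0)       # eq. (37): x gamma_0 = t + (relative position vector)
t = xg0.get(0, 0)     # eq. (38): the scalar part is t = x . gamma_0

# eq. (39): components of x ^ gamma_0 on the relative basis gamma_k gamma_0.
# The blade with bits {0, k} stores gamma_0 gamma_k = -gamma_k gamma_0,
# hence the sign flip when reading the components off.
rel = [-xg0.get(1 | (1 << k), 0) for k in (1, 2, 3)]

interval = t * t - sum(c * c for c in rel)  # right side of eq. (41)

print("t =", t, " relative position =", rel)
print("t^2 - x^2 =", interval, " x^2 =", gp(x, x))
print("(x g0)(g0 x) =", gp(xg0, gp(g0, x)))
```

Replacing `g0` by another unit timelike vector changes `t` and `rel` but leaves `interval` equal to the scalar `gp(x, x)`, which is the content of the invariance statement above.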
The set of all position vectors (39) is the 3-dimensional position space of the
observer $\gamma_0$, which we designate by $\mathcal{P}^3 = \mathcal{P}^3(\gamma_0) = \{\mathbf{x} = x \wedge \gamma_0\}$. Note that $\mathcal{P}^3$
consists of all bivectors in STA with $\gamma_0$ as a common factor. In agreement with
common parlance, we refer to the elements of $\mathcal{P}^3$ as vectors. Thus, we have two
kinds of vectors, those in $\mathcal{M}^4$ and those in $\mathcal{P}^3$. To distinguish between them,
we refer to elements of $\mathcal{M}^4$ as proper vectors and to elements of $\mathcal{P}^3$ as relative
vectors (relative to $\gamma_0$, of course!). To keep the discussion clear, relative vectors
are designated in boldface, while proper vectors are not.
By the geometric product and sum, the vectors in $\mathcal{P}^3$ generate the entire even
subalgebra of STA as the geometric algebra $\mathcal{G}_3 = \mathcal{G}(\mathcal{P}^3)$ employed for classical
physics in GA1. This is made obvious by constructing a basis. Corresponding
to a standard basis $\{\gamma_\mu\}$ for $\mathcal{M}^4$, we have a standard basis $\{\boldsymbol{\sigma}_k;\ k = 1, 2, 3\}$ for
$\mathcal{P}^3$, where
$$\boldsymbol{\sigma}_k = \gamma_k \wedge \gamma_0 = \gamma_k\gamma_0. \tag{43}$$
These generate a basis for the relative bivectors:
$$\boldsymbol{\sigma}_i \wedge \boldsymbol{\sigma}_j = \boldsymbol{\sigma}_i\boldsymbol{\sigma}_j = i\boldsymbol{\sigma}_k = \gamma_j\gamma_i, \tag{44}$$
where the allowed values of the indices {i, j, k} are cyclic permutations of 1, 2, 3,
and the wedge is the outer product of relative vectors (not to be confused with
the outer product of proper vectors as in (43)). The right sides of (43) and (44)
show how the bivectors for spacetime are split into vectors and bivectors for $\mathcal{P}^3$.
Comparison with (14) shows that the $\boldsymbol{\sigma}_k$ generate the entire even subalgebra,
which can therefore be identified with $\mathcal{G}_3 = \mathcal{G}(\mathcal{P}^3)$. Remarkably, the
right-handed pseudoscalar for $\mathcal{P}^3$ is identical to that for $\mathcal{M}^4$, that is,
$$\boldsymbol{\sigma}_1\boldsymbol{\sigma}_2\boldsymbol{\sigma}_3 = i = \gamma_0\gamma_1\gamma_2\gamma_3. \tag{45}$$
To be consistent with the operation of reversion defined in GA1 for the algebra
$\mathcal{G}_3$ we require
$$\boldsymbol{\sigma}_k^\dagger = \boldsymbol{\sigma}_k \quad \text{and} \quad (\boldsymbol{\sigma}_i\boldsymbol{\sigma}_j)^\dagger = \boldsymbol{\sigma}_j\boldsymbol{\sigma}_i. \tag{46}$$
This can be extended to the entire STA by defining
$$M^\dagger \equiv \gamma_0\tilde{M}\gamma_0 \tag{47}$$
for an arbitrary multivector $M$. The explicit appearance of the timelike vector
$\gamma_0$ here shows the dependence of $M^\dagger$ on a particular spacetime split. The
definitions in this paragraph guarantee smooth articulation of proper physics
with physical descriptions relative to inertial frames.
Now let us rapidly survey the spacetime splits of some important physical
quantities. Let $x = x(\tau)$ be the history of a particle with proper time $\tau$ and
proper velocity $v = dx/d\tau$. The spacetime split of $v$ is obtained by differentiating
(37); whence
$$v\gamma_0 = v_0(1 + \mathbf{v}), \tag{48}$$
where
$$v_0 = v \cdot \gamma_0 = \frac{dt}{d\tau} = \left(1 - \mathbf{v}^2\right)^{-1/2} \tag{49}$$
is the “time dilation” factor, and
$$\mathbf{v} = \frac{d\mathbf{x}}{dt} = \frac{d\tau}{dt}\frac{d\mathbf{x}}{d\tau} = \frac{v \wedge \gamma_0}{v \cdot \gamma_0} \tag{50}$$
is the relative velocity in the $\gamma_0$-system. The last equality in (49) was obtained
from
$$1 = v^2 = (v\gamma_0)(\gamma_0 v) = v_0(1 + \mathbf{v})\,v_0(1 - \mathbf{v}) = v_0^2\left(1 - \mathbf{v}^2\right). \tag{51}$$
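Equations (48) and (49) translate directly into a small numerical recipe for passing from a relative velocity to the proper velocity. This is a minimal sketch in units with c = 1; the function names `time_dilation` and `proper_velocity` are hypothetical, and the component form used in the second function assumes the standard frame, where multiplying (48) on the right by the unit vector of the observer gives v = v_0 (gamma_0 + v^k gamma_k).

```python
import math


def time_dilation(speed):
    """Return v_0 = (1 - v^2)^(-1/2), eq. (49), for a relative speed below 1."""
    if not 0 <= speed < 1:
        raise ValueError("relative speed must satisfy 0 <= |v| < 1 (c = 1)")
    return 1.0 / math.sqrt(1.0 - speed * speed)


def proper_velocity(vx, vy, vz):
    """Components (v^0, v^1, v^2, v^3) of the proper velocity.

    Uses v = v_0 (gamma_0 + v^k gamma_k), which follows from eq. (48).
    """
    v0 = time_dilation(math.sqrt(vx * vx + vy * vy + vz * vz))
    return (v0, v0 * vx, v0 * vy, v0 * vz)


def minkowski_square(c):
    """Square of a proper vector from its frame components, signature (+,-,-,-)."""
    return c[0] ** 2 - c[1] ** 2 - c[2] ** 2 - c[3] ** 2


v = proper_velocity(0.6, 0.0, 0.0)
print("time dilation factor:", v[0])
print("v^2 =", minkowski_square(v))  # close to 1, as eq. (36) requires
```

The check on the last line is eq. (51) read backwards: whatever relative velocity is supplied, the resulting proper velocity has unit square up to rounding.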
Let $p$ be the proper momentum (i.e., energy-momentum vector) of a particle.
The spacetime split of $p$ into energy (or relative mass) $E$ and relative momentum
$\mathbf{p}$ is given by
$$p\gamma_0 = E + \mathbf{p}, \tag{52}$$
where
$$E = p \cdot \gamma_0 \quad \text{and} \quad \mathbf{p} = p \wedge \gamma_0. \tag{53}$$
Of course
$$p^2 = (E + \mathbf{p})(E - \mathbf{p}) = E^2 - \mathbf{p}^2 = m^2, \tag{54}$$
where $m$ is the proper mass of the particle.
The proper angular momentum of a particle relates its proper momentum $p$
to its location at a spacetime point $x$. Performing the splits as before, we find
$$px = (E + \mathbf{p})(t - \mathbf{x}) = Et + \mathbf{p}t - E\mathbf{x} - \mathbf{p}\mathbf{x}. \tag{55}$$
The scalar part of this gives the familiar split
$$p \cdot x = Et - \mathbf{p} \cdot \mathbf{x}, \tag{56}$$
so often employed in the phase of a wave function. The bivector part gives us
the proper angular momentum
$$p \wedge x = \mathbf{p}t - E\mathbf{x} + i(\mathbf{x} \times \mathbf{p}), \tag{57}$$
where, as explained in GA1, $\mathbf{x} \times \mathbf{p}$ is the standard vector cross product.
An electromagnetic field is a bivector-valued function $F = F(x)$ on spacetime.
An observer $\gamma_0$ splits it into an electric (relative vector) part $\mathbf{E}$ and a magnetic
(relative bivector) part $i\mathbf{B}$; thus
$$F = \mathbf{E} + i\mathbf{B}, \tag{58}$$
where
$$\mathbf{E} = (F \cdot \gamma_0)\gamma_0 = \tfrac{1}{2}(F + F^\dagger) \tag{59}$$
is the part of $F$ that anticommutes with $\gamma_0$, and
$$i\mathbf{B} = (F \wedge \gamma_0)\gamma_0 = \tfrac{1}{2}(F - F^\dagger) \tag{60}$$
is the part that commutes. Also, in accordance with (47), $F^\dagger = \mathbf{E} - i\mathbf{B}$. Note
that the split of the electromagnetic field in (58) corresponds exactly to the split
of the angular momentum (57) into relative vector and bivector parts.
A different kind of spacetime split is most appropriate for Lorentz transfor-
mations, as explained in the next Section.
IV. Lorentz Transformations
Orthogonal transformations on spacetime are called Lorentz transformations.
With due attention to the indefinite signature of spacetime (11), geometric al-
gebra enables us to treat Lorentz transformations by the same coordinate-free
methods used in GA1 for 3D rotations and reflections. Again, the method has
the great advantage of reducing the composition of transformations to simple
versor multiplication. The method is developed here in complete generality to
include space and time inversion, but the emphasis is on rotors and rotations as
a foundation for classical spinor mechanics in the next Section and subsequent
connection to relativistic quantum mechanics in Section VIII.
The main theorem is that any Lorentz transformation of a spacetime vector
$a$ can be expressed in the canonical form
$$\underline{L}\,a = \epsilon_L\, L a L^{-1}, \tag{61}$$
where $\epsilon_L = 1$ if versor $L$ is an even multivector and $\epsilon_L = -1$ if $L$ is odd. The
condition
$$L L^{-1} = 1 \tag{62}$$
allows $L$ to have any nonzero magnitude, but normalization to $|L| = 1$ is often
convenient. The Lorentz transformation $\underline{L}$ is said to be proper if $\epsilon_L = 1$, and
improper if $\epsilon_L = -1$. It is said to be orthochronous if, for any timelike vector $v$,
$$v \cdot \underline{L}(v) > 0. \tag{63}$$
A proper, orthochronous Lorentz transformation is called a Lorentz rotation (or
a restricted Lorentz transformation). For a Lorentz rotation $\underline{R}$ the canonical
form can be written
$$\underline{R}(a) = R a \tilde{R}, \tag{64}$$
where the even multivector $R$ is called a rotor and is normalized by the condition
$$R\tilde{R} = 1. \tag{65}$$
The rotors form a multiplicative group called the rotor group, which is a
double-valued representation of the Lorentz rotation group (also called the restricted
Lorentz group).
As in the 3D case, the canonical form (61) simplifies the whole treatment of
Lorentz transformations. In particular, its main advantage is that it reduces
the composition law for Lorentz transformations,
$$\underline{L}_2\, \underline{L}_1 = \underline{L}_3 \tag{66}$$
to the versor product
$$L_2 L_1 = L_3. \tag{67}$$
It follows from the rotor form (64) that, for any vectors $a$ and $b$,
$$(\underline{R}a)(\underline{R}b) = R\,ab\,\tilde{R} = \underline{R}(ab). \tag{68}$$
Thus, Lorentz rotations preserve the geometric product. This implies that the
Lorentz rotation (64) can be extended to any multivector $M$ as
$$\underline{R}M = RM\tilde{R}. \tag{69}$$
The most elementary kind of Lorentz transformation is a reflection $\underline{n}$ by a
(non-null) vector $n$, according to
$$\underline{n}(a) = -n a n^{-1}. \tag{70}$$
This is a reflection with respect to a hyperplane with normal $n$. Even if $n$ is
normalized to $|n| = 1$, if it is spacelike we need $n^{-1} = -n$ in (70) to account
for its negative signature. A reflection
$$\underline{v}(a) = -v a v \tag{71}$$
with respect to a timelike vector $v = v^{-1}$ is called a time reflection. Let
$n_1, n_2, n_3$ be spacelike vectors that compose the trivector
$$n_3 n_2 n_1 = iv. \tag{72}$$
A space inversion $\underline{v}_s$ can then be defined as the composite of reflections with
respect to these three vectors, so it can be written
$$\underline{v}_s(a) = n_3 n_2 n_1\, a\, n_1 n_2 n_3 = iv\,a\,vi = vav. \tag{73}$$
Note the difference in sign between the right sides of (71) and (73). The
composite of the time reflection (71) with the space inversion (73) is the spacetime
inversion
$$\underline{v}_{st}(a) = \underline{v}_s \underline{v}(a) = i a i^{-1} = -a, \tag{74}$$
which is represented by the pseudoscalar $i$. Note that spacetime inversion is
proper but not orthochronous, so it is not a rotation despite the fact that $i$ is
even.
Two basic types of Lorentz rotation can be obtained from the product of two
reflections, namely timelike rotations (or boosts) and spacelike rotations. For a boost
$$\underline{L}(a) = L a \widetilde{L}, \tag{75}$$
the rotor $L$ can be factored into a product
$$L = v_2 v_1 \tag{76}$$
of two unit timelike vectors $v_1$ and $v_2$. The boost is a rotation in the timelike plane containing $v_1$ and $v_2$. The factorization (76) is not unique. Indeed, for a given $L$ any timelike vector in the plane can be chosen as $v_1$, and $v_2$ can then be computed from (76). Similarly, for a spacelike rotation
$$\underline{U}(a) = U a \widetilde{U}, \tag{77}$$
the rotor $U$ can be factored into a product
$$U = n_2 n_1 \tag{78}$$
of two unit spacelike vectors in the spacelike plane of the rotation. Note that the product, say $n_2 v_1$, of a spacelike vector with a timelike vector is not a rotor, because the corresponding Lorentz transformation is not orthochronous. Likewise, the pseudoscalar $i$ is not a rotor, even though it can be expressed as the product of two bivectors, for it does not satisfy the rotor condition $R\widetilde{R} = 1$.
The Lorentz rotation (64) can be applied to a standard frame $\{\gamma_\mu\}$, transforming it into a new frame of vectors $\{e_\mu\}$ given by
$$e_\mu = R\gamma_\mu\widetilde{R}. \tag{79}$$
A spacetime rotor split of this Lorentz rotation is accomplished by a split of the
rotor R into the product
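$$R = LU, \tag{80}$$
where
$$U^\dagger = \gamma_0\widetilde{U}\gamma_0 = \widetilde{U} \quad\text{or}\quad U\gamma_0\widetilde{U} = \gamma_0 \tag{81}$$
and
$$L^\dagger = \gamma_0\widetilde{L}\gamma_0 = L \quad\text{or}\quad \gamma_0\widetilde{L} = L\gamma_0. \tag{82}$$
This determines a split of (79) into a sequence of two Lorentz rotations determined by $U$ and $L$ respectively; thus,
$$e_\mu = R\gamma_\mu\widetilde{R} = L(U\gamma_\mu\widetilde{U})\widetilde{L}. \tag{83}$$
In particular, by (81) and (82),
$$e_0 = R\gamma_0\widetilde{R} = L\gamma_0\widetilde{L} = L^2\gamma_0. \tag{84}$$
Hence,
$$L^2 = e_0\gamma_0. \tag{85}$$
This determines $L$ uniquely in terms of the timelike vectors $e_0$ and $\gamma_0$, which, in turn, uniquely determines the split (80) of $R$, since $U$ can be computed from $U = \widetilde{L}R$.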
It is essential to note that the “spacetime rotor split” (80) is quite different
from the “spacetime split” introduced in the preceding section, for example in
(58). The terminology is motivated by the expression of rotors U and L in terms
of relative vectors, to which we now turn.
Equation (81) for variable $U$ defines the “little group” of Lorentz rotations that leave $\gamma_0$ invariant. This is the group of “spatial rotations” in the $\gamma_0$-system. Each such rotation takes a frame of proper vectors $\gamma_k$ (for $k = 1, 2, 3$) into a new frame of vectors $U\gamma_k\widetilde{U}$ in the $\gamma_0$-system. Multiplication by $\gamma_0$ expresses this as a rotation of relative vectors $\boldsymbol{\sigma}_k = \gamma_k\gamma_0$ into relative vectors $\mathbf{e}_k$; thus, we get
$$\mathbf{e}_k = U\boldsymbol{\sigma}_k U^\dagger = U\boldsymbol{\sigma}_k\widetilde{U}, \tag{86}$$
in exact agreement with the equation for 3D rotations in GA1.
Equation (84) can be solved for $L$, in particular, for the case where $e_0 = v$ is the proper velocity of a particle of mass $m$. Then (48) enables us to write (85) in the alternative forms
$$L^2 = v\gamma_0 = \frac{p\gamma_0}{m} = \frac{E + \mathbf{p}}{m}. \tag{87}$$
It is easily verified that this has the solution
$$L = (v\gamma_0)^{1/2} = \frac{1 + v\gamma_0}{[2(1 + v\cdot\gamma_0)]^{1/2}} = \frac{m + p\gamma_0}{[2m(m + p\cdot\gamma_0)]^{1/2}} = \frac{m + E + \mathbf{p}}{[2m(m + E)]^{1/2}}. \tag{88}$$
This displays $L$ as a boost of a particle from rest in the $\gamma_0$-system to a relative momentum $\mathbf{p}$.
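The claim that (88) solves (87) is easy to confirm numerically. The sketch below again assumes the Dirac matrices as a matrix model of $\{\gamma_\mu\}$; the sample relative velocity `u`, the mass `m` and the helper names are illustrative choices, not values from the paper.

```python
import numpy as np

# Assumption: Dirac matrices are used as a matrix model of the frame {gamma_mu}.
pauli = [np.array(m_, dtype=complex) for m_ in
         ([[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])]
Z = np.zeros((2, 2), dtype=complex)
g0 = np.diag([1, 1, -1, -1]).astype(complex)
g = [g0] + [np.block([[Z, s], [-s, Z]]) for s in pauli]
ONE = np.eye(4, dtype=complex)

def rev(M):
    """Reversion, realized in this matrix model as g0 M^dagger g0."""
    return g0 @ M.conj().T @ g0

def scalar_part(M):
    """Scalar (grade-0) part, realized as the normalized trace."""
    return (np.trace(M) / 4).real

def boost_rotor_from_velocity(v):
    """Second form of (88): L = (1 + v g0) / sqrt(2 (1 + v.g0))."""
    v0 = scalar_part(v @ g0)
    return (ONE + v @ g0) / np.sqrt(2 * (1 + v0))

u = np.array([0.3, -0.4, 0.5])             # sample relative velocity, |u| < 1
v0 = 1 / np.sqrt(1 - u @ u)                 # time dilation factor
v = v0 * (g0 + sum(uk * gk for uk, gk in zip(u, g[1:])))   # proper velocity
L = boost_rotor_from_velocity(v)

m = 2.0                                     # sample mass
p = m * v                                   # proper momentum
E = scalar_part(p @ g0)                     # energy E = p.g0
p_rel = p @ g0 - E * ONE                    # relative momentum p ^ g0
L_alt = ((m + E) * ONE + p_rel) / np.sqrt(2 * m * (m + E))  # last form of (88)

print("max |L^2 - v g0| =", np.abs(L @ L - v @ g0).max())
```

The rotor is built without any square root of a multivector: only the scalar normalization requires a root, which is what makes (88) practical.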
Generalizing the treatment of rotating frames in GA1, the Lorentz rotation
of a frame (79) can be related to the standard matrix form by writing
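$$e_\mu = R\gamma_\mu\widetilde{R} = \alpha_\mu^\nu\gamma_\nu. \tag{89}$$
As in GA1, this can be solved for the matrix elements
$$\alpha_\mu^\nu = e_\mu\cdot\gamma^\nu = (\gamma^\nu R\gamma_\mu\widetilde{R})_{(0)}. \tag{90}$$
Or it can be solved for the rotor,[7] with the result
$$R = \pm(A\widetilde{A})^{-1/2}A, \tag{91}$$
where
$$A \equiv e_\mu\gamma^\mu = \alpha_\mu^\nu\gamma_\nu\gamma^\mu. \tag{92}$$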
Equation (89) can be used to describe a change of coordinate frames.
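For readers who want to see (89) and (90) in action, here is a minimal sketch that extracts the matrix $\alpha_\mu^\nu$ from a rotor and checks that it is a proper, orthochronous Lorentz matrix. The Dirac-matrix model of $\{\gamma_\mu\}$, the sample rotor and the names `alpha`, `eta` and `rebuilt` are assumptions made for illustration.

```python
import numpy as np

# Assumption: Dirac matrices are used as a matrix model of the frame {gamma_mu}.
pauli = [np.array(m, dtype=complex) for m in
         ([[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])]
Z = np.zeros((2, 2), dtype=complex)
g0 = np.diag([1, 1, -1, -1]).astype(complex)
g = [g0] + [np.block([[Z, s], [-s, Z]]) for s in pauli]
ONE = np.eye(4, dtype=complex)
eta = np.diag([1.0, -1.0, -1.0, -1.0])      # metric gamma_mu . gamma_nu

def rev(M):
    """Reversion, realized in this matrix model as g0 M^dagger g0."""
    return g0 @ M.conj().T @ g0

def scalar_part(M):
    """Scalar (grade-0) part, realized as the normalized trace."""
    return (np.trace(M) / 4).real

# Sample rotor R = L U: a boost along gamma_3 times a rotation in the 1-2 plane.
L = np.cosh(0.4) * ONE + np.sinh(0.4) * (g[3] @ g[0])
U = np.cos(0.6) * ONE + np.sin(0.6) * (g[1] @ g[2])
R = L @ U

e = [R @ gm @ rev(R) for gm in g]           # rotated frame, eq. (89)

# alpha[mu, nu] = e_mu . gamma^nu, with gamma^nu = eta^{nu nu} gamma_nu, eq. (90)
alpha = np.array([[eta[nu, nu] * scalar_part(e[mu] @ g[nu]) for nu in range(4)]
                  for mu in range(4)])

# Rebuild each e_mu from its components to confirm the expansion in (89).
rebuilt = [sum(alpha[mu, nu] * g[nu] for nu in range(4)) for mu in range(4)]

print(np.round(alpha, 4))
```

Going the other way, from a given Lorentz matrix to a rotor, is what (91) and (92) provide; that direction is not exercised in this sketch.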
In the tensorial approach to Lorentz rotations, the coordinates $x^\mu = \gamma^\mu\cdot x$ of a point $x$ transform according to
$$x^\mu \rightarrow x'^\mu = \alpha^\mu_\nu x^\nu, \quad\text{with}\quad \alpha^\mu_\nu\alpha^\nu_\lambda = \delta^\mu_\lambda \tag{93}$$
as the orthogonality condition on the transformation. This can be interpreted either as a passive or an active transformation. In the passive case, it is accompanied by a (usually implicit) transformation of coordinate frame:
$$\gamma_\mu \rightarrow \gamma'_\mu = \alpha^\lambda_\mu\gamma_\lambda, \tag{94}$$
so that each spacetime point $x = x'^\mu\gamma'_\mu = x^\mu\gamma_\mu$ is left unchanged. In the active case, each spacetime point $x = x^\mu\gamma_\mu$ is mapped to a new spacetime point
$$x' = x'^\mu\gamma_\mu = x^\mu\gamma'_\mu = Rx\widetilde{R}, \tag{95}$$
where the last form was obtained by identifying $\gamma'_\mu$ with $e_\mu$ in (89). This shows
that STA enables us to dispense with coordinates entirely in the treatment of
Lorentz transformations. Consequently, we deal with active Lorentz transfor-
mations only in the coordinate-free form (64) or (61), and we dispense with
passive transformations entirely.
If all this seems rather obvious, just turn to any textbook on relativistic quantum theory,[8] where the $\gamma_\mu$ are matrices and (89) is introduced as a change in matrix representation to prove relativistic invariance of the “Dirac operator” $\gamma^\mu\partial_\mu = \gamma'^\mu\partial'_\mu$. In STA this is recognized as a passive Lorentz transformation,
so it is superfluous. Consequently, this aspect of Lorentz invariance need not be
mentioned in our treatment of the Dirac equation in Section VII.
V. Spinor Particle Mechanics
Now we are prepared to exploit the unique advantages of STA with a spinor for-
mulation of relativistic (or proper) mechanics. This approach has three major
benefits. First, it articulates perfectly with the rotor formulation of nonrelativis-
tic rigid body mechanics in GA1. Second, it articulates perfectly with Dirac’s
quantum theory of the electron, providing it with an informative and useful
classical limit that includes a natural classical explanation for the gyromagnetic
ratio g = 2. Indeed, the spinor used here for particle mechanics is an obvi-
ous special case of the real Dirac spinor introduced in Section VII. Finally, the
spinor formulation simplifies the solution of problems in relativistic mechanics
and automatically generalizes particle mechanics to include spin precession.
The rotor equation for a frame
$$e_\mu = R\gamma_\mu\widetilde{R} \tag{96}$$
can be used to describe the relativistic kinematics of a rigid body (with negligible dimensions) traversing a world line $x = x(\tau)$ with proper time $\tau$, provided we identify $e_0$ with the proper velocity $v$ of the body, so that
$$\frac{dx}{d\tau} = \dot{x} = v = e_0 = R\gamma_0\widetilde{R}. \tag{97}$$
Then $\{e_\mu = e_\mu(\tau);\ \mu = 0, 1, 2, 3\}$ is a comoving frame traversing the world line along with the particle, and the rotor $R$ must also be a function of proper time, so that, at each time $\tau$, equation (96) describes a Lorentz rotation of some arbitrarily chosen fixed frame $\{\gamma_\mu\}$ into the comoving frame $\{e_\mu = e_\mu(\tau)\}$. Thus, we have a rotor-valued function of proper time $R = R(\tau)$ determining a 1-parameter family of Lorentz rotations $e_\mu(\tau) = R(\tau)\gamma_\mu\widetilde{R}(\tau)$. The rotor $R$ is a unimodular spinor, as it satisfies the unimodular condition $R\widetilde{R} = 1$.
The spacelike vectors $e_k = R\gamma_k\widetilde{R}$ (for $k = 1, 2, 3$) can be identified with the principal axes of the body. But the same equations can be used for modeling a particle with an intrinsic angular momentum or spin, where $e_3$ is identified with the spin direction $\hat{s}$; so we write
$$\hat{s} = e_3 = R\gamma_3\widetilde{R}. \tag{98}$$
Later we see that this corresponds exactly to the spin vector in the Dirac theory where the magnitude of the spin has the constant value $|s| = \hbar/2$.
The rotor equation of motion for $R = R(\tau)$ has the form
$$\dot{R} = \tfrac{1}{2}\Omega R, \tag{99}$$
where $\Omega = \Omega(\tau)$ is a bivector-valued function. The fact that $\Omega = 2\dot{R}\widetilde{R} = -\widetilde{\Omega}$ is necessarily a bivector is easily proved by differentiating $R\widetilde{R} = 1$. Differentiating (96) and using (99), we see that the equations of motion for the comoving frame have the form
$$\dot{e}_\mu = \Omega\cdot e_\mu. \tag{100}$$
Clearly $\Omega$ can be interpreted as a generalized rotational velocity of the comoving frame.
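The two steps just mentioned can be spelled out as follows. This is a sketch that uses only $R\widetilde{R} = 1$, the evenness of $R$, and the standard fact that reversion leaves scalars and pseudoscalars unchanged while reversing the sign of bivectors; the grade-projection notation $\langle\cdot\rangle_2$ is introduced here for convenience.

```latex
% Step 1: Omega is a bivector.
\frac{d}{d\tau}\bigl(R\widetilde{R}\bigr)
  = \dot{R}\widetilde{R} + R\dot{\widetilde{R}} = 0
\quad\Longrightarrow\quad
\dot{R}\widetilde{R} = -\bigl(\dot{R}\widetilde{R}\bigr)^{\sim}.
% R even => \dot{R}\widetilde{R} = scalar + bivector + pseudoscalar,
% and only the bivector part changes sign under reversion, hence
\Omega \equiv 2\dot{R}\widetilde{R} = \langle\Omega\rangle_{2},
\qquad \dot{R} = \tfrac{1}{2}\Omega R .

% Step 2: frame equations (100) from (96) and (99).
\dot{e}_\mu
  = \dot{R}\gamma_\mu\widetilde{R} + R\gamma_\mu\dot{\widetilde{R}}
  = \tfrac{1}{2}\Omega e_\mu + \tfrac{1}{2} e_\mu\widetilde{\Omega}
  = \tfrac{1}{2}\bigl(\Omega e_\mu - e_\mu\Omega\bigr)
  = \Omega\cdot e_\mu .
```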
The dynamics of the rigid body, that is, the effect of external forces and
torques on the body, is completely characterized by specifying $\Omega$ as a definite
function of proper time. The single rotor equation (99) is equivalent to the set
of four frame equations (100). Besides the theoretical advantage of being closely
related to the Dirac equation, as we shall see, it has the practical advantage of
being simpler and easier to solve than the set of frame equations (100). The
corresponding nonrelativistic rotor equation for a spinning body was introduced
in GA1. It should be noted that the nonrelativistic rotor equation describes only
rotational motion, while its relativistic generalization (99) describes rotational
and translational motion together.
For a classical particle with mass m and charge e in an electromagnetic field
F , the dynamics is specified by
$$\Omega = \frac{e}{m}F. \tag{101}$$
So (100) gives the particle equation of motion
$$m\dot{v} = eF\cdot v. \tag{102}$$
This may be recognized as the classical Lorentz force with tensor components $m\dot{v}^\mu = eF^{\mu\nu}v_\nu$, but note that tensor theory does not admit the more powerful rotor equation of motion (99).
As demonstrated in specific examples that follow, even if one is interested in
the motion of a structureless point charge, the rotor equation (99) is easier to
solve than the Lorentz force equation (102). However, if one wants to extend
the model to an electron with spin, the same solution automatically describes
the electron’s spin precession. The result is physically meaningful too, for, as
we see later, the classical model of an electron with proper rotational veloc-
ity (101) proportional to the field F gives the same gyromagnetic ratio as the
Dirac equation. Indeed, it is a well-defined classical limit of the Dirac equation,
though Planck’s constant remains in the magnitude of the spin. This role of the
electromagnetic field F as a rotational velocity is so simple and natural that it
deserves a name. I propose to dub the relation (101) the Lorentz Torque, since
it is a straightforward generalization of the Lorentz Force (102). It is notewor-
thy that this idea, which is so natural in STA, seems never to have occurred
to physicists using tensor theory. This is one more example of the influence of
mathematical language on physical theory.
A. Motion in constant electric and magnetic fields.
If $F$ is a uniform field on spacetime, then $\dot{\Omega} = 0$ and (99) has the solution
$$R = e^{\frac{1}{2}\Omega\tau}R_0, \tag{103}$$
where $R_0 = R(0)$ specifies the initial conditions. When this is substituted into (97) we get the explicit $\tau$ dependence of the proper velocity $v$. The integration of (97) for the history $x(\tau)$ is most simply accomplished in the general case of arbitrary non-null $F$ by exploiting the invariant decomposition $F = fe^{i\varphi}$ determined in (30). This separates $\Omega$ into mutually commuting parts $\Omega_1 = (e/m)f\cos\varphi$ and $\Omega_2 = (e/m)if\sin\varphi$, so
$$e^{\frac{1}{2}\Omega\tau} = e^{\frac{1}{2}(\Omega_1 + \Omega_2)\tau} = e^{\frac{1}{2}\Omega_1\tau}e^{\frac{1}{2}\Omega_2\tau}. \tag{104}$$
It also determines an invariant decomposition of the initial velocity $v(0)$ into a component $v_1$ in the $f$-plane and a component $v_2$ orthogonal to the $f$-plane; thus,
$$v(0) = f^{-1}(f\cdot v(0)) + f^{-1}(f\wedge v(0)) = v_1 + v_2. \tag{105}$$
When this is substituted in (97) and (104) is used, we get
$$\frac{dx}{d\tau} = v = e^{\Omega_1\tau}v_1 + e^{\Omega_2\tau}v_2. \tag{106}$$
Note that this is an invariant decomposition of the motion into “electriclike” and “magneticlike” components. It integrates easily to give the history
$$x(\tau) - x(0) = (e^{\Omega_1\tau} - 1)\Omega_1^{-1}v_1 + (e^{\Omega_2\tau} - 1)\Omega_2^{-1}v_2. \tag{107}$$
This general result, which applies for arbitrary initial conditions and arbitrary
uniform electric and magnetic fields, has such a simple form because it is ex-
pressed in terms of invariants. It looks far more complicated when subjected to
a space-time split and expressed directly as a function of “laboratory fields” in
an inertial system. Details are given in my mechanics book.[3]
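The uniform-field solution (103) can be tested numerically against the Lorentz force equation (102). The sketch below assumes the Dirac-matrix model of $\{\gamma_\mu\}$, units with $e/m = 1$, initial rotor $R_0 = 1$, and an arbitrarily chosen field with an electric part along $\boldsymbol{\sigma}_1$ and a magnetic part along $\boldsymbol{\sigma}_3$. The truncated power series `exp_series` is an illustrative stand-in for a matrix exponential, not a method taken from the paper.

```python
import numpy as np

# Assumption: Dirac matrices are used as a matrix model of the frame {gamma_mu}.
pauli = [np.array(m, dtype=complex) for m in
         ([[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])]
Z = np.zeros((2, 2), dtype=complex)
g0 = np.diag([1, 1, -1, -1]).astype(complex)
g = [g0] + [np.block([[Z, s], [-s, Z]]) for s in pauli]
ONE = np.eye(4, dtype=complex)

def rev(M):
    """Reversion, realized in this matrix model as g0 M^dagger g0."""
    return g0 @ M.conj().T @ g0

def scalar_part(M):
    """Scalar (grade-0) part, realized as the normalized trace."""
    return (np.trace(M) / 4).real

def exp_series(M, terms=40):
    """Truncated power series for exp(M); adequate for the small norms used here."""
    result, term = ONE.copy(), ONE.copy()
    for n in range(1, terms):
        term = term @ M / n
        result = result + term
    return result

# Sample uniform field F = E sigma_1 + i B sigma_3, where i sigma_3 = g2 g1.
F = 0.8 * (g[1] @ g[0]) + 1.3 * (g[2] @ g[1])
Omega = 1.0 * F                      # Lorentz torque (101) with e/m = 1 (assumed)

def velocity(tau):
    R = exp_series(0.5 * tau * Omega)    # rotor solution (103) with R_0 = 1
    return R @ g0 @ rev(R)               # proper velocity v = R g0 R~

tau, h = 0.9, 1e-5
v = velocity(tau)
v_dot = (velocity(tau + h) - velocity(tau - h)) / (2 * h)   # central difference
force = 0.5 * (Omega @ v - v @ Omega)                        # Omega . v

print("max |v_dot - Omega.v| =", np.abs(v_dot - force).max())
print("v^2 =", scalar_part(v @ v))
```

A single exponential delivers the velocity at any proper time, so no step-by-step integration of (102) is needed to obtain it.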
B. Electron in the field of a plane wave.
As a second example with important applications, we integrate the rotor equation for a “classical test charge” in an electromagnetic plane wave.[9] This is useful for describing the interaction of electrons with lasers. As explained at the end of Section VI in GA1, any plane wave field $F = F(x)$ with proper propagation vector $k$ can be written in the canonical form
$$F = fz, \tag{108}$$
where $f$ is a constant null bivector ($f^2 = 0$), and the $x$-dependence of $F$ is exhibited explicitly by
$$z(k\cdot x) = \alpha_+ e^{i(k\cdot x)} + \alpha_- e^{-i(k\cdot x)}, \tag{109}$$
with
$$\alpha_\pm = \rho_\pm e^{\pm i\delta_\pm}, \tag{110}$$
where $\delta_\pm$ and $\rho_\pm \geq 0$ are scalars. It is crucial to note that the “imaginary” $i$ here is the unit pseudoscalar, because it endows these solutions with geometrical properties not possessed by conventional “complex solutions.” Indeed, as noted in GA1, the pseudoscalar property of $i$ implies that the two terms on the right side of (109) describe right and left circular polarizations. Thus, the orientation of $i$ determines handedness of the solutions.
For the plane wave (108), Maxwell’s equation reduces to the algebraic condition,
$$kf = 0. \tag{111}$$
This implies $k^2 = 0$ as well as $f^2 = 0$. To integrate the rotor equation of motion
$$\dot{R} = \frac{e}{2m}FR, \tag{112}$$
it is necessary to express $F$ as a function of $\tau$. This can be done by using special properties of $F$ to find constants of motion. Multiplying (112) by $k$ and using (111) we find immediately that $kR$ is a constant of the motion. So, with the initial condition $R(0) = 1$, we obtain $k = kR = Rk = k\widetilde{R}$; whence
$$Rk\widetilde{R} = k. \tag{113}$$
Thus, the one parameter family of Lorentz rotations represented by $R = R(\tau)$ lies in the little group of the lightlike vector $k$. Multiplying (113) by (96), we find the constants of motion $k\cdot e_\mu = k\cdot\gamma_\mu$. This includes the constant
$$\omega = k\cdot v, \tag{114}$$
which can be interpreted as the frequency of the plane wave “seen by the particle.” Since $v = dx/d\tau$, we can integrate (114) immediately to get
$$k\cdot(x(\tau) - x(0)) = \omega\tau. \tag{115}$$
Inserting this into (109) and absorbing $k\cdot x(0)$ in the phase factor, we get $z(k\cdot x) = z(\omega\tau)$, expressing the desired $\tau$ dependence of $F$. Equation (112) can now be integrated directly, with the result
$$R = \exp(efz_1/2m) = 1 + \frac{e}{2m}fz_1, \tag{116}$$
where
$$z_1 = \frac{2}{\omega}\sin(\omega\tau/2)\left[\alpha_+ e^{i\omega\tau/2} + \alpha_- e^{-i\omega\tau/2}\right]. \tag{117}$$
This gives the velocity $v$ and, by integrating (97), the complete particle history. Details are given elsewhere.[9] It is of practical interest to know that this solution is equivalent to the “Volkov solution” of the Dirac equation for an electron in a plane wave field.[10] In this case, the quantum mechanical solution is equivalent to its classical limit. The solution has practical applications to the interaction of electrons with laser fields.[13]
The problem of motion in a Coulomb field has been solved by the same spinor method,[11] but no other exact solutions of the rotor equation (99) with Lorentz torque have been published.
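The closed form (116) with (117) can be checked in a few lines. The sketch below assumes only what is stated above: $f^2 = 0$, $F = fz$ with $z = z(\omega\tau)$, and the fact that $z$ and $z_1$, being scalar-plus-pseudoscalar combinations, commute with the bivector $f$.

```latex
% z_1 is the proper-time integral of z, which reproduces (117):
z_1(\tau) = \int_0^{\tau} z(\omega\tau')\,d\tau'
  = \alpha_+\frac{e^{i\omega\tau} - 1}{i\omega}
  + \alpha_-\frac{e^{-i\omega\tau} - 1}{-i\omega}
  = \frac{2}{\omega}\sin(\omega\tau/2)
    \left[\alpha_+ e^{i\omega\tau/2} + \alpha_- e^{-i\omega\tau/2}\right].

% The exponential series terminates because (f z_1)^2 = f^2 z_1^2 = 0:
R = \exp\!\left(\frac{e}{2m} f z_1\right) = 1 + \frac{e}{2m} f z_1 .

% Check of (112), using \dot{z}_1 = z and f z f = f^2 z = 0:
\dot{R} = \frac{e}{2m} f z ,
\qquad
\frac{e}{2m} F R
  = \frac{e}{2m} f z \left(1 + \frac{e}{2m} f z_1\right)
  = \frac{e}{2m} f z .
```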
C. Spin precession.
We have established that specification of kinematics by the rotor equation
(99) and dynamics by Ω = (e/m)F is a geometrically perspicuous and analyti-
cally efficient means of characterizing the motion of a classical charged particle,
and noted that it automatically provides us with a classical model of spin preces-
sion. Now let us take a more general approach to modeling and analyzing spin
precession. Any dynamics of spin precession can be characterized by specifying
a functional form for $\Omega$. That includes gravitational precession[12] and electron spin precession in the Dirac theory. To facilitate the analysis for any given
dynamical model, we first carry the analysis as far as possible for arbitrary Ω.
Then we give a specific application to measurement of the g-factor for a Dirac
particle.
The rotor equation of motion (99) determines both translational and rota-
tional motions of the comoving frame (96), whatever the frame models physi-
cally. It is of interest to separate translational and rotational modes, though
they are generally coupled. This can be done by a spacetime split by the particle
velocity $v$ or by the reference vector $\gamma_0$. We consider both ways and how they
are related.
D. Larmor and Thomas precession.
To split the rotational velocity $\Omega$ by the velocity $v$, we write
$$\Omega = \Omega v^2 = (\Omega\cdot v)v + (\Omega\wedge v)v. \tag{118}$$
This produces the split
$$\Omega = \Omega_+ + \Omega_-, \tag{119}$$
where
$$\Omega_+ = \tfrac{1}{2}(\Omega + v\widetilde{\Omega}v) = (\Omega\cdot v)v = \dot{v}v, \tag{120}$$
and
$$\Omega_- = \tfrac{1}{2}(\Omega - v\widetilde{\Omega}v) = (\Omega\wedge v)v. \tag{121}$$
Note that $\Omega\cdot v = \dot{v}$ was used in (120) to express $\Omega_+$ entirely in terms of the proper acceleration $\dot{v}$ and velocity $v$. This split has exactly the same form as the split (58) of the electromagnetic bivector into electric and magnetic parts corresponding here to $\Omega_+$ and $\Omega_-$ respectively. However, it is a split with respect to the instantaneous “rest frame” of the particle rather than a fixed inertial frame. In the rest frame the relative velocity of the particle itself vanishes, of course, so the particle’s acceleration is entirely determined by the “electriclike” part $\Omega_+$, as (120) shows explicitly. The “magneticlike” part $\Omega_-$ is completely independent of the particle motion; it is the Larmor precession (frequency) of the spin for a particle with a magnetic moment, so let us refer to it as the Larmor precession in the general case.
Unfortunately, (119) does not completely decouple precession from translation because $\Omega_+$ contributes to both. Also, we need a way to compare precessions at different points on the particle history. These difficulties can be resolved by adopting the $\gamma_0$-split
$$R = LU, \tag{122}$$
exactly as defined by (80) and subsequent equations. At every time $\tau$, this split determines a “deboost” of relative vectors $e_k e_0 = R\gamma_k\gamma_0\widetilde{R} = R\boldsymbol{\sigma}_k\widetilde{R}$ ($k = 1, 2, 3$) into relative vectors
$$\mathbf{e}_k = \widetilde{L}(e_k e_0)L = U\boldsymbol{\sigma}_k\widetilde{U} \tag{123}$$
in the fixed reference system of $\gamma_0$. The particle is brought to rest, so to speak, so we can watch it precess (or spin) in one place. The precession is described by an equation of the form
$$\frac{dU}{dt} = -\tfrac{1}{2}i\boldsymbol{\omega}U, \tag{124}$$
so, as already shown in GA1, differentiation of (123) yields the familiar equations for a rotating frame:
$$\frac{d\mathbf{e}_k}{dt} = \boldsymbol{\omega}\times\mathbf{e}_k. \tag{125}$$
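The problem now is to express $\boldsymbol{\omega}$ in terms of the given $\Omega$ and determine the relative contributions of the parts $\Omega_+$ and $\Omega_-$. To do that, we use the time dilation factor $v_0 = v\cdot\gamma_0 = dt/d\tau$ to change the time variable in (124) and write
$$\omega = -i\boldsymbol{\omega}v_0 \tag{126}$$
so (124) becomes $\dot{U} = \frac{1}{2}\omega U$. Then differentiation of (122) and use of (99) gives
$$\Omega = 2\dot{R}\widetilde{R} = 2\dot{L}\widetilde{L} + L\omega\widetilde{L}. \tag{127}$$
Solving for $\omega$ and using the split (119), we get
$$\omega = \widetilde{L}\Omega_-L + \widetilde{L}\dot{v}vL - 2\widetilde{L}\dot{L}. \tag{128}$$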
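Differentiation of (87) leads to
$$\widetilde{L}(\dot{v}v)L = \widetilde{L}\dot{L} + \dot{L}\widetilde{L}, \tag{129}$$
while differentiation of (88) gives
$$2\dot{L}\widetilde{L} = \frac{\dot{v}\wedge(v + \gamma_0)}{1 + v\cdot\gamma_0}. \tag{130}$$
These terms combine to give the well-known Thomas precession frequency
$$\omega_T = \big((2\dot{L}\widetilde{L})\wedge\gamma_0\big)\gamma_0 = \dot{L}\widetilde{L} - \widetilde{L}\dot{L} = \frac{(\dot{v}\wedge v\wedge\gamma_0)\gamma_0}{1 + v\cdot\gamma_0} = i\,\frac{v_0^2}{1 + v_0}\,\mathbf{v}\times\dot{\mathbf{v}}. \tag{131}$$
The last step here, expressing the proper vectors in terms of relative vectors, was derived from the split
$$\dot{v}v = \dot{v}\wedge v = v_0^2\big(\dot{\mathbf{v}} + i(\mathbf{v}\times\dot{\mathbf{v}})\big). \tag{132}$$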
Finally, writing
$$\omega_L = \widetilde{L}\Omega_-L \tag{133}$$
for the transformed Larmor precession, we have the desired result
$$\omega = \omega_T + \omega_L. \tag{134}$$
The Thomas term describes the effect of motion on the precession explicitly and completely.
E. The g-factor in spin precession.
Now let us apply the rotor approach to a practical problem of spin precession.
In general, for a charged particle with an intrinsic magnetic moment in a uniform electromagnetic field $F = F_+ + F_-$,
$$\Omega = \frac{e}{mc}\left[F_+ + \frac{g}{2}F_-\right] = \frac{e}{mc}\left[F + \tfrac{1}{2}(g - 2)F_-\right], \tag{135}$$
where as defined by (121), $F_-$ is the magnetic field in the instantaneous rest frame of the particle, and $g$ is the usual gyromagnetic ratio. This yields the classical equation of motion (102) for the velocity, but by (98) and (100) the equation of motion for the spin is
$$\dot{s} = \frac{e}{m}\left[F + \tfrac{1}{2}(g - 2)F_-\right]\cdot s. \tag{136}$$
This is the well-known Bargmann-Michel-Telegdi (BMT) equation, which is used
in high precision measurements of the g-factor for the electron and muon.
To apply the BMT equation, it must be solved for the rate of spin precession.
The general solution for an arbitrary combination F = E+iB of uniform electric
and magnetic fields is most easily found by replacing the BMT equation by the
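rotor equation
$$\dot{R} = \frac{e}{2m}FR + R\,\tfrac{1}{2}(g - 2)\left(\frac{e}{2m}\right)i\mathbf{B}_0, \tag{137}$$
where
$$i\mathbf{B}_0 = \widetilde{R}F_-R = \tfrac{1}{2}\left[\widetilde{R}FR - (\widetilde{R}FR)^\dagger\right] \tag{138}$$
is an “effective magnetic field” in the “rest system” of the particle. With initial conditions $R(0) = L_0$, $U(0) = 1$, for a boost without spatial rotation, a solution of (137) is
$$R = \exp\left[\frac{e}{2m}F\tau\right]L_0\exp\left[\tfrac{1}{2}(g - 2)\left(\frac{e}{2m}\right)i\mathbf{B}_0\tau\right], \tag{139}$$
where $\mathbf{B}_0$ is defined by
$$\mathbf{B}_0 = \frac{1}{2i}\left[\widetilde{L}_0FL_0 - (\widetilde{L}_0FL_0)^\dagger\right] = \mathbf{B} + \frac{v_{00}^2}{1 + v_{00}}\,\mathbf{v}_0\times(\mathbf{B}\times\mathbf{v}_0) + v_{00}\,\mathbf{E}\times\mathbf{v}_0, \tag{140}$$
where $v_{00} = v(0)\cdot\gamma_0 = (1 - \mathbf{v}_0^2)^{-1/2}$. The first factor in (139) has the same effect on both the velocity $v$ and the spin $s$, so the last factor gives directly the change in the relative directions of the relative velocity $\mathbf{v}$ and the spin $\mathbf{s}$. This can be measured experimentally.[3]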
To conclude this section, some general remarks about the description of spin
will be helpful in applications and in comparisons with more conventional approaches. We have represented the spin by the proper vector $s = |s|e_3$ defined by (98) and alternatively by the relative vector $\boldsymbol{\sigma} = |s|\mathbf{e}_3$, where $\mathbf{e}_3$ is defined by (123). For a particle with proper velocity $v = L^2\gamma_0$, these two representations are related by
$$sv = L\boldsymbol{\sigma}\widetilde{L} \tag{141}$$
or, equivalently, by
$$\boldsymbol{\sigma} = \widetilde{L}(sv)L = \widetilde{L}sL\gamma_0. \tag{142}$$
A straightforward spacetime split of the proper spin vector $s$, like (48) for the velocity vector, gives
$$s\gamma_0 = s_0 + \mathbf{s}, \tag{143}$$
where
$$\mathbf{s} = s\wedge\gamma_0 \tag{144}$$
is the relative spin vector, and $s\cdot v = 0$ implies that
$$v_0s_0 = \mathbf{v}\cdot\mathbf{s}. \tag{145}$$
From (141) and (143), the relation of $\mathbf{s}$ to $\boldsymbol{\sigma}$ is found to be
$$\mathbf{s} = \boldsymbol{\sigma} + (v_0 - 1)(\boldsymbol{\sigma}\cdot\hat{\mathbf{v}})\hat{\mathbf{v}}, \tag{146}$$
where $v_0 = v\cdot\gamma_0$ and $\hat{\mathbf{v}} = \mathbf{v}/|\mathbf{v}|$. Both vectors $\mathbf{s}$ and $\boldsymbol{\sigma}$ are sometimes used
in the literature, and some confusion results from a failure to recognize that
they come from two different kinds of spacetime split. Of course either one can
be used, since one determines the other, but σ is usually simpler because its
magnitude is constant. Note from (146) that they are indistinguishable in the
non-relativistic approximation.
VI. Electromagnetic Field Theory
In STA an electromagnetic field is represented by a bivector-valued function
F = F (x) on spacetime. The field produced by a source with proper current
density J = J(x) is determined by Maxwell’s Equation
$$\nabla F = J. \tag{147}$$
As explained in Section II, the differential operator $\nabla = \partial_x$ in STA is regarded as the (vector) derivative with respect to a spacetime point $x$.
Since $\nabla$ is a vector operator the expansion (10) applies, so we can write
$$\nabla F = \nabla\cdot F + \nabla\wedge F, \tag{148}$$
where $\nabla\cdot F$ is the divergence of $F$ and $\nabla\wedge F$ is the curl. We can accordingly separate (147) into vector and trivector parts:
$$\nabla\cdot F = J, \tag{149}$$
$$\nabla\wedge F = 0. \tag{150}$$
This is the coordinate-free form for the two covariant tensor equations for the
electromagnetic field in standard relativistic theory.
As a pedagogical point, it is worth noting that the decomposition (148) into
divergence and curl is a straightforward generalization of the 3D vectorial de-
composition introduced in GA1. Also note that, as standard SI units are not
well suited for spacetime physics, we choose a system of units that minimizes
the number of constants in basic equations. The reader can infer the choice
from the spacetime split of Maxwell’s equation given below.
The reduction of the two Maxwell equations (149) and (150) to a single “Maxwell’s Equation” (147) brings many simplifications to electromagnetic theory. For example, the operator $\nabla$ has an inverse so (147) can be solved for
$$F = \nabla^{-1}J. \tag{151}$$
Of course, $\nabla^{-1}$ is an integral operator that depends on boundary conditions on $F$ for the region on which it is defined, so (151) is an integral form of Maxwell’s equation. However, if the “current” $J = J(x)$ is the sole source of $F$, then (151) provides the unique solution to (147).
Next we survey other simplifications to the formulation and analysis of electromagnetic equations. Differentiating (147) we obtain
$$\nabla^2F = \nabla J = \nabla\cdot J + \nabla\wedge J, \tag{152}$$
where $\nabla^2$ is the d’Alembertian (35). Separately equating scalar and bivector parts of (152), we obtain the charge conservation law
$$\nabla\cdot J = 0 \tag{153}$$
and an alternative equation for the E-M field
$$\nabla^2F = \nabla\wedge J. \tag{154}$$
A. Electromagnetic Potentials.
A different field equation is obtained by using the fact that, under general con-
ditions, any continuous bivector field F = F (x) can be expressed as a derivative
with the specific form
$$F = \nabla(A + Bi), \tag{155}$$
where $A = A(x)$ and $B = B(x)$ are vector fields, so $F$ has a “vector potential” $A$ and a “trivector potential” $Bi$. This is a generalization of the well-known “Helmholtz theorem” in vector analysis.[4] Since $\nabla A = \nabla\cdot A + \nabla\wedge A$ with a similar equation for $\nabla B$, the bivector part of (155) can be written
$$F = \nabla\wedge A + (\nabla\wedge B)i, \tag{156}$$
while the scalar and pseudoscalar parts yield the so-called “Lorenz condition”
$$\nabla\cdot A = \nabla\cdot B = 0. \tag{157}$$
Inserting (155) into Maxwell’s equation (147) and separating vector and trivector parts, we obtain the usual wave equation for the vector potential
$$\nabla^2A = J, \tag{158}$$
as well as
$$\nabla^2Bi = 0. \tag{159}$$
The last equation shows that $B$ is independent of the source $J$, so it can be set to zero in (155). However, in a theory with magnetic charges, Maxwell’s equation takes the form
$$\nabla F = J + iK, \tag{160}$$
where $K = K(x)$ is a vector field, the “magnetic current density.” On substituting (155) into (160) we obtain in place of (159),
$$\nabla^2Bi = iK. \tag{161}$$
The pseudoscalar $i$ can be factored out to make (161) appear symmetrical with (158), but this symmetry between the roles of electric and magnetic currents is deceptive, because one is vectorial while the other is actually trivectorial.
The separation of the generalized Maxwell’s equation (160) into parts with electric and magnetic sources can be achieved by again using (148) and again getting (149) for the vector part but getting
$$\nabla\wedge F = iK \tag{162}$$
for the trivector part. This equation can be made to look similar to (149) by duality to put it in the form
$$\nabla\cdot(Fi) = K. \tag{163}$$
Note that the dual $Fi$ of the bivector $F$ is also a bivector. Hereafter we restrict our attention to the “physical case” $K = 0$.
B. Maxwell’s equation for material media.
Sometimes the source current $J$ can be decomposed into a conduction current $J_C$ and a magnetization current $\nabla\cdot M$, where the generalized magnetization $M = M(x)$ is a bivector field; thus
$$J = J_C + \nabla\cdot M. \tag{164}$$
The Gordon decomposition of the Dirac current is of this ilk. Because of the mathematical identity $\nabla\cdot(\nabla\cdot M) = (\nabla\wedge\nabla)\cdot M = 0$, the conservation law $\nabla\cdot J = 0$ implies also that $\nabla\cdot J_C = 0$. Using (164), equation (149) can be put in the form
$$\nabla\cdot G = J_C, \tag{165}$$
where we have defined a new field
$$G = F - M. \tag{166}$$
A disadvantage of this approach is that it mixes up physically different kinds of entities, an E-M field $F$ and a matter field $M$. However, in most materials $M$ is a function of the field $F$, so when a “constitutive equation” $M = M(F)$ is known (165) becomes a well defined equation for $F$.
C. Energy-momentum tensor.
STA enables us to write the usual Maxwell energy-momentum tensor $T(n) = T(n(x), x)$ for the electromagnetic field in the compact form
$$T(n) = \tfrac{1}{2}Fn\widetilde{F} = -\tfrac{1}{2}FnF. \tag{167}$$
Recall that the tensor field $T(n)$ is a vector-valued linear function on the tangent space at each spacetime point $x$ describing the flow of energy-momentum through a surface with normal $n = n(x)$. By linearity $T(n) = n_\mu T^\mu$, where $n_\mu = n\cdot\gamma_\mu$ and
$$T^\mu \equiv T(\gamma^\mu) = -\tfrac{1}{2}F\gamma^\mu F. \tag{168}$$
The divergence of $T(n)$ can be evaluated by using Maxwell’s equation (147), with the result
$$\partial_\mu T^\mu = T(\nabla) = J\cdot F. \tag{169}$$
Its value is the negative of the Lorentz force (density) F · J, which is the rate
of energy-momentum transfer from the source J to the field F .
D. Eigenvectors of the Maxwell Tensor.
The compact, invariant form (167) enables us to solve easily the eigenvector
problem for the Maxwell energy-momentum tensor. If F is not a null field, it
has the invariant decomposition $F = fe^{i\varphi}$ given by (30), which, when inserted in (167), gives
$$T(n) = -\tfrac{1}{2}fnf. \tag{170}$$
This is simpler than (167) because $f$ is simpler than $F$. Note also that it implies that all fields differing only by an arbitrary “duality factor” $e^{i\varphi}$ have the same energy-momentum tensor. The eigenvalues can be found from (170) by inspection. The bivector $f$ determines a timelike plane. Any vector $n$ in that plane satisfies $n\wedge f = 0$, or equivalently, $nf = -fn$. On the other hand, if $n$ is orthogonal to the plane, then $n\cdot f = 0$ and $nf = fn$. For these two cases, (170) gives us
$$T(n) = \pm\tfrac{1}{2}f^2n. \tag{171}$$
Thus $T(n)$ has a pair of doubly degenerate eigenvalues $\pm\frac{1}{2}f^2$ corresponding to “eigenbivectors” $f$ and $if$, all expressible in terms of $F$ by inverting (30). This
approach should be compared with conventional matrix methods to appreciate
the simplifications achieved by STA.
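The eigenvector argument around (170) and (171) can be verified numerically for a generic non-null field. The sketch below assumes the Dirac-matrix model of $\{\gamma_\mu\}$ and arbitrarily chosen field components `E` and `B`. The recipe used to recover $f$ and $\varphi$ from $F^2 = f^2e^{2i\varphi}$ is one straightforward way to invert the decomposition and is not necessarily the route taken in (30); the probe vector is likewise arbitrary.

```python
import numpy as np

# Assumption: Dirac matrices are used as a matrix model of the frame {gamma_mu}.
pauli = [np.array(m, dtype=complex) for m in
         ([[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])]
Z = np.zeros((2, 2), dtype=complex)
g0 = np.diag([1, 1, -1, -1]).astype(complex)
g = [g0] + [np.block([[Z, s], [-s, Z]]) for s in pauli]
ONE = np.eye(4, dtype=complex)
ps = g[0] @ g[1] @ g[2] @ g[3]              # unit pseudoscalar i
sigma = [gk @ g0 for gk in g[1:]]           # relative vectors sigma_k

def scalar_part(M):
    """Scalar (grade-0) part, realized as the normalized trace."""
    return (np.trace(M) / 4).real

def T(field, n):
    """Energy-momentum tensor in the form (167): T(n) = -1/2 F n F."""
    return -0.5 * field @ n @ field

E = np.array([0.3, -1.2, 0.5])              # sample field components (assumed)
B = np.array([0.7, 0.4, -0.9])
F = sum(Ek * sk for Ek, sk in zip(E, sigma)) + ps @ sum(Bk * sk for Bk, sk in zip(B, sigma))

# F^2 = a + i b = f^2 exp(2 i phi): read off phi, then strip the duality factor.
F2 = F @ F
a, b = scalar_part(F2), scalar_part(-ps @ F2)
phi = 0.5 * np.arctan2(b, a)
f = F @ (np.cos(phi) * ONE - np.sin(phi) * ps)
f2 = scalar_part(f @ f)                     # positive scalar f^2

probe = g0 + 0.3 * g[1]                     # arbitrary vector used to pick directions
n_in = 0.5 * (f @ probe - probe @ f)        # f . probe lies in the f-plane
dual = ps @ f
n_out = 0.5 * (dual @ probe - probe @ dual) # (if) . probe is orthogonal to the f-plane

print("f^2 =", f2)
print("in-plane residual:   ", np.abs(T(F, n_in) - 0.5 * f2 * n_in).max())
print("orthogonal residual: ", np.abs(T(F, n_out) + 0.5 * f2 * n_out).max())
```

Because `T(F, n)` and `T(f, n)` agree for every `n`, the same run also illustrates the duality invariance noted after (170).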
E. Relation to tensor formulations.
The versatility of STA is also illustrated by the ease with which the above
invariant formulation of “Maxwell theory” can be related to more conventional
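formulations. The tensor components $F^{\mu\nu}$ of the E-M field $F$ are given by (17), whence, using (34), we find
$$\partial_\mu F^{\mu\nu} = J\cdot\gamma^\nu = J^\nu \tag{172}$$
for the tensor components of Maxwell’s equation (149). Similarly, the tensor components of (163) are
$$\partial_{[\nu}F_{\alpha\beta]} = K^\mu\epsilon_{\mu\nu\alpha\beta}, \tag{173}$$
where the brackets indicate antisymmetrization and $\epsilon_{\mu\nu\alpha\beta} = i^{-1}\cdot(\gamma_\mu\wedge\gamma_\nu\wedge\gamma_\alpha\wedge\gamma_\beta)$. The tensor components of the energy-momentum tensor (168) are
$$T^{\mu\nu} = \gamma^\mu\cdot T^\nu = -\tfrac{1}{2}(\gamma^\mu F\gamma^\nu F)_{(0)} = (\gamma^\mu\cdot F)\cdot(F\cdot\gamma^\nu) - \tfrac{1}{2}\gamma^\mu\cdot\gamma^\nu(F^2)_{(0)} = F^{\mu\alpha}F^{\nu}{}_{\alpha} - \tfrac{1}{2}g^{\mu\nu}F_{\alpha\beta}F^{\alpha\beta}. \tag{174}$$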
F. Spacetime splits in E-M theory.
To demonstrate how smoothly the proper formulation of E-M theory articu-
lates with the relative formulation, we quickly survey several spacetime splits.
A spacetime split of Maxwell’s equation (147) puts it in the standard relative
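vector form for an inertial system. Thus, following the procedure in Section 4,
$$J\gamma_0 = J_0 + \mathbf{J} \tag{175}$$
splits the current $J$ into a charge density $J_0 = J\cdot\gamma_0$ and a relative current $\mathbf{J} = J\wedge\gamma_0$ in the $\gamma_0$-system. Similarly,
$$\gamma_0\nabla = \partial_t + \boldsymbol{\nabla} \tag{176}$$
splits $\nabla = \partial_x$ into a time derivative $\partial_t = \gamma_0\cdot\nabla$ and spatial derivative $\boldsymbol{\nabla} = \gamma_0\wedge\nabla = \partial_{\mathbf{x}}$ with respect to the relative position vector $\mathbf{x} = x\wedge\gamma_0$. Combining this with the split of $F$ into electric and magnetic parts, we get Maxwell’s equation (147) in the split form
$$(\partial_t + \boldsymbol{\nabla})(\mathbf{E} + i\mathbf{B}) = J_0 - \mathbf{J}, \tag{177}$$
in agreement with the formulation in GA1.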
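To make the comparison with the familiar vector form explicit, the left side of (177) can be expanded grade by grade. The sketch below assumes the GA1 relation $\boldsymbol{\nabla}\mathbf{E} = \boldsymbol{\nabla}\cdot\mathbf{E} + i\,\boldsymbol{\nabla}\times\mathbf{E}$ for relative vectors and the units implied by (177).

```latex
(\partial_t + \boldsymbol{\nabla})(\mathbf{E} + i\mathbf{B})
  = \boldsymbol{\nabla}\cdot\mathbf{E}
  + \bigl(\partial_t\mathbf{E} - \boldsymbol{\nabla}\times\mathbf{B}\bigr)
  + i\bigl(\partial_t\mathbf{B} + \boldsymbol{\nabla}\times\mathbf{E}\bigr)
  + i\,\boldsymbol{\nabla}\cdot\mathbf{B}
  = J_0 - \mathbf{J}.

% Equating scalar, vector, bivector and pseudoscalar parts separately:
\boldsymbol{\nabla}\cdot\mathbf{E} = J_0, \qquad
\boldsymbol{\nabla}\times\mathbf{B} - \partial_t\mathbf{E} = \mathbf{J}, \qquad
\partial_t\mathbf{B} + \boldsymbol{\nabla}\times\mathbf{E} = 0, \qquad
\boldsymbol{\nabla}\cdot\mathbf{B} = 0 .
```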
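Note that (176) splits the d’Alembertian into
$$\nabla^2 = (\nabla\gamma_0)(\gamma_0\nabla) = (\partial_t - \boldsymbol{\nabla})(\partial_t + \boldsymbol{\nabla}) = \partial_t^2 - \boldsymbol{\nabla}^2. \tag{178}$$
The vector field $T_0 = T(\gamma_0) = T(\gamma^0)$ is the energy-momentum density in the $\gamma_0$-system. The split
$$T_0\gamma_0 = T^0\gamma^0 = T^{00} + \mathbf{T}^0 \tag{179}$$
separates it into an energy density $T^{00} = T^0\cdot\gamma^0$ and a momentum density $\mathbf{T}^0 = T^0\wedge\gamma^0$. Using the fact that $\gamma_0$ anticommutes with relative vectors, from (168) we obtain
$$T^0\gamma^0 = \tfrac{1}{2}FF^\dagger = \tfrac{1}{2}(\mathbf{E}^2 + \mathbf{B}^2) + \mathbf{E}\times\mathbf{B}, \tag{180}$$
in agreement with GA1.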
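Equation (180) lends itself to a direct numerical check. The sketch below assumes the Dirac-matrix model of $\{\gamma_\mu\}$, arbitrarily chosen components `E` and `B`, and the convention $F^\dagger = \mathbf{E} - i\mathbf{B}$ for the field; the variable names are illustrative.

```python
import numpy as np

# Assumption: Dirac matrices are used as a matrix model of the frame {gamma_mu}.
pauli = [np.array(m, dtype=complex) for m in
         ([[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])]
Z = np.zeros((2, 2), dtype=complex)
g0 = np.diag([1, 1, -1, -1]).astype(complex)
g = [g0] + [np.block([[Z, s], [-s, Z]]) for s in pauli]
ONE = np.eye(4, dtype=complex)
ps = g[0] @ g[1] @ g[2] @ g[3]              # unit pseudoscalar i
sigma = [gk @ g0 for gk in g[1:]]           # relative vectors sigma_k

def scalar_part(M):
    """Scalar (grade-0) part, realized as the normalized trace."""
    return (np.trace(M) / 4).real

def rel_vector(components):
    """Relative vector sum_k c_k sigma_k."""
    return sum(c * sk for c, sk in zip(components, sigma))

E = np.array([0.3, -1.2, 0.5])              # sample field components (assumed)
B = np.array([0.7, 0.4, -0.9])
F = rel_vector(E) + ps @ rel_vector(B)      # F = E + iB
F_dagger = rel_vector(E) - ps @ rel_vector(B)

T0 = -0.5 * F @ g0 @ F                      # T(gamma_0) from (167)
split = T0 @ g0                             # spacetime split (179)

energy_density = scalar_part(split)
expected = 0.5 * (E @ E + B @ B) * ONE + rel_vector(np.cross(E, B))

print("energy density:", energy_density)
print("residual vs (180):", np.abs(split - expected).max())
```

The scalar part of the split is the energy density and the relative-vector part is the Poynting vector, both obtained from one geometric product.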
The spacetime split helps us with physical interpretation. Corresponding to
the split F = E + iB, the magnetization field M splits into
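$$M = -\mathbf{P} + i\mathbf{M}, \tag{181}$$
where $\mathbf{P}$ is the electric polarization density and $\mathbf{M}$ is the magnetic moment density. Writing
$$G = \mathbf{D} + i\mathbf{H}, \tag{182}$$
we see that (166) gives us the familiar relations
$$\mathbf{D} = \mathbf{E} + \mathbf{P}, \tag{183}$$
$$\mathbf{H} = \mathbf{B} - \mathbf{M}. \tag{184}$$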
Insertion of (182) into (165) with a spacetime split yields the usual set of
Maxwell’s equations for a material medium.
VII. Real Relativistic Quantum Theory
The Dirac equation is the cornerstone of relativistic quantum theory, if not the
single most important equation in all of quantum physics. This Section shows
how STA simplifies the entire Dirac theory, reveals hidden geometric structure
with implications for physical interpretation, and provides a common spinor
method for classical and quantum physics with a more direct and transparent
classical limit of the Dirac equation.
First, we show how to reformulate the standard matrix version of Dirac theory
in terms of the real STA. As this reformulation eliminates superfluous complex
numbers and matrices from the standard version, I call it the real Dirac theory.
Next we provide the real Dirac wave function with a geometric interpretation
by relating it to local observables. The term “local observable” is non-standard
but the concept is not unprecedented. It refers to assignment of physical inter-
pretation to some local quantity such as energy or charge density rather than
to global quantities such as expectation values. It serves as a device for describ-
ing local geometric structure of the theory quite apart from claims of objective
reality. Its bearing on the interpretation of quantum mechanics is discussed in
the next Section.
For reference purposes, I provide a complete catalog of relations between
local observables in the real theory and the so-called “bilinear covariants” in the
matrix theory. This facilitates translation between the two formulations. It will
be noted that the real version is substantially simpler, and the complexities of
translation can be avoided by sticking to the real theory alone.
Finally, I provide a thorough analysis of local conservation laws in the real
Dirac theory to ascertain further what STA can tell us about geometric structure
and physical interpretation. The analysis is much more complete than any
treatment in textbooks that I know.
This account is limited to the single particle Dirac theory. The tendency in
textbooks is to forego a thorough study of single particle theory and leap at
once to the second quantized many particle theory. I leave it to the reader to
decide what might be lost by that practice.
Space does not permit an adequate account of “real solutions” of the Dirac
equation in this article. Partial treatments are given elsewhere [15, 16], but
it is worth mentioning here that in some respects the real Dirac equation is
easier to solve and analyze than the Schroedinger equation.
A. Derivation of the real Dirac theory.
Derivation of the real STA version of the Dirac theory from the standard
matrix version is essentially the same as for the Pauli theory, but the differences
are sufficient to justify a quick review. To find a representation of the Dirac
theory in terms of STA, we begin with a Dirac spinor Ψ, a column matrix of 4
complex numbers. Let $u$ be a fixed spinor with the properties
$$u^\dagger u = 1, \tag{185}$$
$$\gamma_0 u = u, \tag{186}$$
$$\gamma_2\gamma_1 u = i'u. \tag{187}$$
In writing this we regard the $\gamma_\mu$, for the moment, as 4 × 4 Dirac
matrices, and $i'$ as the unit imaginary in the complex number field of the
Dirac algebra. Now, we can write any Dirac spinor
$$\Psi = \psi u, \tag{188}$$
where $\psi$ is a matrix that can be expressed as a polynomial in the
$\gamma_\mu$. The coefficients in this polynomial can be taken as real, for if
there is a term with an imaginary coefficient, then (187) enables us to make it
real without altering (188) by replacing $i'$ in the term by $\gamma_2\gamma_1$
on the right of the term. Furthermore, the polynomial can be taken to be an
even multivector, for if any term is odd, then (186) allows us to make it even
by multiplying on the right by $\gamma_0$. Thus, in (188) we may assume that
$\psi$ is a real even multivector, so we can reinterpret the $\gamma_\mu$ in
$\psi$ as vectors in STA instead of matrices. Thus, we have established a
correspondence between Dirac spinors and even multivectors in STA. The
correspondence must be one-to-one, because the space of even multivectors (like
the space of Dirac spinors) is exactly 8-dimensional, with 1 scalar,
1 pseudoscalar and 6 bivector dimensions.
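The counting argument above can be checked numerically. The following is a
minimal sketch, assuming the conventional Dirac representation of the gamma
matrices and the choice u = (1, 0, 0, 0)^T; neither choice is fixed by the text,
and the helper names (`matmul`, `apply`, `rank`) are illustrative only. It
verifies properties (185)-(187) for this u and checks that the eight even basis
elements, acting on u, give spinors that are linearly independent over the reals.

```python
from itertools import combinations

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(m, v):
    """Matrix m acting on column vector v."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def rank(rows, tol=1e-12):
    """Rank of a real matrix by Gaussian elimination."""
    rows = [list(map(float, r)) for r in rows]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if abs(rows[i][c]) > tol), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

# Dirac representation of the gamma matrices (an assumed, conventional choice).
g = [
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]],
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]],
    [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]],
    [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]],
]
identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
u = [1, 0, 0, 0]  # assumed fixed spinor

norm_u = sum(abs(c) ** 2 for c in u)     # left side of (185)
g0_u = apply(g[0], u)                    # left side of (186)
g2g1_u = apply(matmul(g[2], g[1]), u)    # left side of (187)

# Even basis: 1 scalar, 6 bivectors, 1 pseudoscalar.
even_basis = [identity]
even_basis += [matmul(g[a], g[b]) for a, b in combinations(range(4), 2)]
even_basis.append(matmul(matmul(g[0], g[1]), matmul(g[2], g[3])))

# Each spinor (basis element)*u is flattened to 8 real numbers (Re parts, Im parts).
real_rows = []
for element in even_basis:
    spinor = [complex(c) for c in apply(element, u)]
    real_rows.append([c.real for c in spinor] + [c.imag for c in spinor])

even_rank = rank(real_rows)
print(norm_u, g0_u, g2g1_u, even_rank)
```

A rank equal to the number of even basis elements means that distinct real even
multivectors give distinct spinors ψu in this representation.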
Finally, it should be noted that by eliminating the ungeometrical imaginary
$i'$ from the base field we reduce the degrees of freedom in the Dirac theory
by half, with consequent simplification of the theory that shows up in the real
version. The Dirac algebra is generated by the Dirac matrices over the base
field of complex numbers, so it has $2^4 \times 2 = 32$ degrees of freedom and
can be identified with the algebra of 4 × 4 complex matrices. From (14) we see
that STA has $2^4 = 16$ degrees of freedom.
One immediate simplification brought by STA appears in the spacetime split.
To write his equation in hamiltonian form, Dirac defined 4 × 4 matrices
$$\alpha_k = \gamma_k\gamma_0 \tag{189}$$
for $k = 1, 2, 3$. This is, in fact, a representation of the 2 × 2 Pauli
matrices by 4 × 4 matrices. STA eliminates this awkward and irrelevant
distinction between matrix representations of different dimension, so the
$\alpha_k$ can be identified with the $\boldsymbol{\sigma}_k$, as we have
already done in the spacetime split (43).
There are several ways to represent a Dirac spinor in STA [18], but all
representations are, of course, mathematically equivalent. The representation
chosen here has the advantages of simplicity and, as we shall see, ease of
interpretation.
To distinguish a spinor ψ in STA from its matrix representation Ψ in the
Dirac algebra, let us call it a real spinor to emphasize the elimination of the
ungeometrical imaginary $i'$. Alternatively, we might refer to ψ as the operator
representation of a Dirac spinor, because, as shown below, it plays the role of
an operator generating observables in the theory.
In terms of the real wave function $\psi$, the Dirac equation for an electron
can be written in the form
$$\gamma^\mu(\partial_\mu\psi\,\gamma_2\gamma_1\hbar - eA_\mu\psi) = m\psi\gamma_0, \tag{190}$$
where $m$ is the mass and $e = -|e|$ is the charge of the electron, while the
$A_\mu = A\cdot\gamma_\mu$ are components of the electromagnetic vector
potential. To prove that this is equivalent to the standard matrix form of the
Dirac equation [8], we simply interpret the $\gamma_\mu$ as matrices, multiply
by $u$ on the right and use (186), (187) and (188) to get the standard form
$$\gamma^\mu(i'\hbar\partial_\mu - eA_\mu)\Psi = m\Psi. \tag{191}$$
This completes the proof. Alternative proofs are given elsewhere [17–19]. The
original converse derivation of (190) from (191) was much more indirect [14].
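The step from (190) to (191) can be written out explicitly. The sketch below
assumes only that $u$ is constant (so it passes through the derivative) and
that the scalars $i'$ and $\hbar$ commute with everything; $i'$ denotes the
unit imaginary of the matrix algebra.

```latex
\begin{align*}
\gamma^\mu(\partial_\mu\psi\,\gamma_2\gamma_1\hbar - eA_\mu\psi)\,u
  &= \gamma^\mu\big(\hbar\,\partial_\mu\psi\,(\gamma_2\gamma_1 u) - eA_\mu\,\psi u\big) \\
  &= \gamma^\mu\big(i'\hbar\,\partial_\mu(\psi u) - eA_\mu\,\psi u\big)
     && \text{by } \gamma_2\gamma_1 u = i'u \\
  &= \gamma^\mu(i'\hbar\,\partial_\mu - eA_\mu)\Psi
     && \text{by } \Psi = \psi u, \\
m\psi\gamma_0 u &= m\psi u = m\Psi
     && \text{by } \gamma_0 u = u .
\end{align*}
```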
Henceforth, we can work with the real Dirac equation (190) without reference
to its matrix representation (191). We know from previous Sections that
computations in STA can be carried out without introducing a basis, and we
recognize the so-called “Dirac operator” $\nabla = \gamma^\mu\partial_\mu$ as
the vector derivative with respect to a spacetime point, so let us write the
real Dirac equation in the coordinate-free form
$$\nabla\psi\mathbf{i}\hbar - eA\psi = m\psi\gamma_0, \tag{192}$$
where $A = A_\mu\gamma^\mu$ is the electromagnetic vector potential, and the
notation
$$\mathbf{i} \equiv \gamma_2\gamma_1 = i\gamma_3\gamma_0 = i\boldsymbol{\sigma}_3 \tag{193}$$
emphasizes that this bivector plays the role of the imaginary $i'$ that appears
explicitly in the matrix form (191) of the Dirac equation. To interpret the
theory, it is crucial to note that the bivector $\mathbf{i}$ has a definite
geometrical interpretation while $i'$ does not.
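The algebraic content of (193) depends only on the anticommutation relations of
the frame vectors, so it can be spot-checked with any matrices that satisfy
them. The following sketch assumes the conventional Dirac representation purely
as a computational stand-in (in STA proper the γ's are vectors, not matrices),
and the names `pseudoscalar` and `bold_i` are illustrative. It checks that
γ2γ1 equals the pseudoscalar i = γ0γ1γ2γ3 times γ3γ0, squares to minus one, and
commutes with γ0.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(s, a):
    """Multiply every matrix entry by the number s."""
    return [[s * x for x in row] for row in a]

# Dirac representation (assumed); any set with g0^2 = 1, gk^2 = -1 that
# mutually anticommutes would serve equally well.
g0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
g1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]]
g2 = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]]
g3 = [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]]
identity = [[1 if r == c else 0 for c in range(4)] for r in range(4)]

# Unit pseudoscalar i = g0 g1 g2 g3 (a matrix here, not the scalar imaginary).
pseudoscalar = matmul(matmul(g0, g1), matmul(g2, g3))

bold_i = matmul(g2, g1)                           # the bivector of (193)
i_sigma3 = matmul(pseudoscalar, matmul(g3, g0))   # i * sigma_3, sigma_3 = g3 g0

same_bivector = bold_i == i_sigma3
squares_to_minus_one = matmul(bold_i, bold_i) == scale(-1, identity)
commutes_with_g0 = matmul(bold_i, g0) == matmul(g0, bold_i)

print(same_bivector, squares_to_minus_one, commutes_with_g0)
```

The entries involved are only 0, ±1 and ±i, so exact equality comparison is
safe here; a tolerance would be needed for general floating-point matrices.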
B. Lorentz invariance.
Equation (192) is Lorentz invariant, despite the explicit appearance of the
constants $\gamma_0$ and $\mathbf{i} = \gamma_2\gamma_1$ in it. These constants
need not be associated with vectors in a particular reference frame, though it
is often convenient to do so. It is only required that $\gamma_0$ be a fixed,
future-pointing, timelike unit vector while $\mathbf{i}$ is a spacelike unit
bivector that commutes with $\gamma_0$. The constants can be changed by a
Lorentz rotation
$$\gamma_\mu \to \gamma'_\mu = C\gamma_\mu\widetilde{C}, \tag{194}$$
where $C$ is a constant rotor, so $C\widetilde{C} = 1$,
$$\gamma'_0 = C\gamma_0\widetilde{C} \quad\text{and}\quad \mathbf{i}' = C\mathbf{i}\widetilde{C}. \tag{195}$$
A corresponding change in the wave function,
$$\psi \to \psi' = \psi\widetilde{C}, \tag{196}$$
induces a mapping of the Dirac equation (192) into an equation of the same
form:
$$\nabla\psi'\mathbf{i}'\hbar - eA\psi' = m\psi'\gamma'_0. \tag{197}$$
This transformation is