Chapter 13

Minkowski Spacetime

I. Introduction

We pause in this chapter to further digest the implications of the last two chapters. It is certainly clear that the developments which culminated in the 1905 Special Theory of Relativity have forced us to reconsider several central concepts of the space-time models that we have discussed which predate Einstein's 1905 paper. One of the most important implications of Einstein's theory is the non-existence of an observer-independent "universal time". The relational structures on the set of events E must be reorganized so that they are consistent with the two postulates of special relativity as well as all implications of the postulates. The space-time model that emerges from this reorganization is called Minkowski space-time, so named because it was initially devised by the mathematician Hermann Minkowski. As we shall see it is a natural geometrical scheme which incorporates the postulates in an economical way.

Minkowski space-time is important for another reason which bears heavily on the subsequent history of space-time models. Prior to Minkowski's invention of this space-time model in 1908, Einstein and others who thought about these matters considered the Special Theory of Relativity to be primarily a physical theory. That is to say, they did not focus on the geometrical implications of the new physics. The theory of Special Relativity, if valid, presented severe problems for Newtonian mechanics. The physicists of the day were predominantly concerned with how to devise a new mechanics consistent with the postulates (and indirectly with the laws of electricity and magnetism). It was Minkowski who focussed attention on the mathematical structure of the new theory and its geometrical implications.

As with all such reformulations, Minkowski's geometrical approach could have turned out to be merely a peculiar representation which neatly summarized known effects. However, after Einstein learned about the Minkowskian approach he became convinced that the economy and simplicity of the approach could not be accidental. This "geometrical" approach turned out to be the key to Einstein's development, after a ten-year search, of a theory which incorporated both the special theory and the effects of gravitation. We will turn to the question of how to combine gravity and Special Relativity in the next two chapters. For now we just want to make the reader aware of the potential significance of the geometrical approach, initiated by Minkowski in 1908, which culminated in 1915 with Einstein's formulation of the General Theory of Relativity.

II. Incompatibility of the Postulates of SRT and Galilean Relativity

How should we proceed to reorganize our grand set E of all events so that the structures imposed on it reflect the constraints that follow from accepting Postulates I and II? In Chapter 9 we discovered that, in Galilean Spacetime, any pair of observers moving uniformly (at constant velocity) with respect to one another were fundamentally equivalent. In Galilean Spacetime there is no upper limit to allowed velocities, so the observers can be moving with respect to one another with any speed, no matter how large. From the geometrical point of view, such equivalent observers are represented on a spacetime diagram as a set of straight worldlines oriented in the forward time direction. Since the angle between any two worldlines is a measure of the relative velocity of the two observers, it follows that any angle less than 180° is allowable between them. Such observers are equivalent in the sense that any mechanical experiment will lead to the same result no matter which of these observers performs it. The range of worldlines for these equivalent observers is illustrated in the space-time diagram in Fig. 13.1.

In the special theory, there is also a set of equivalent observers. The question is: is the set of equivalent observers in Galilean space-time the same set as the set of equivalent observers stipulated in Postulate I of the special theory? The geometrical approach allows us to see clearly that the two sets of observers are different.

The crucial difference is that Postulate II puts a constraint on the set of allowable worldlines.

Postulate II, which holds that "the speed of light is a universal constant independent of the motion of the source", entails that the vacuum speed of light, c, is a constant for all observers. The argument is simple. Choose some source of light S. Find an observer O who is at rest with respect to S. Postulate II entails that O will measure the speed of light emitted by S to be equal to c. Now consider some other observer O'. O' is either at rest with respect to O or moving with respect to O. Consider, first, that O' is at rest with respect to O. In this case it is clear that O' is at rest with respect to S, and Postulate II entails that O' measures the speed of light emitted by S to be c' = c.

Consider, next, an observer O'' who is moving with some constant velocity V with respect to O. Since S is at rest with respect to O, O'' is moving with velocity V with respect to S. But if O'' is moving with velocity V with respect to S, then O'' can take himself to be at rest and consider S as moving with velocity -V with respect to him. Postulate II says that the speed of light is independent of the motion of the source. Since O observes the velocity of light emitted by S to be c, O'' observes the velocity of light emitted by S to be c'' = c. Therefore, the vacuum speed of light is a constant for all inertial observers.

In addition, we want to use the fact that light is the fastest signal that can be sent between two points. We elevate an empirical result and a theoretical consideration to the status of a Postulate. The empirical fact is that no massive body travels at or beyond the speed of light, and experiments indicate that attempts to push a massive object faster than the speed of light meet with greater and greater (insurmountable) resistance as the object approaches the speed of light "barrier". The theoretical considerations come from Maxwell's Electromagnetic Theory. The transformations which connect observers who "see" the same electric and magnetic phenomena (the "Maxwellian equivalent observers") suggest that the speed of light is a fixed limit on signal velocities. We shall, therefore, feel free to use the limiting character of the speed of light in our exploration of the geometry of Minkowski Spacetime.

Let an observer O who takes himself to be at rest represent his worldline as a line perpendicular to the bottom of the page. Then, given that the speed of light is taken to be equal to 1 in our system of units, no straight worldline passing through an event on O's worldline can come closer than 45° to the positive or negative x-axis. We show in the following diagram the observer O and light signals sent respectively along the positive and negative x-axes.

Fig. 13.2

Thus the rule "particles cannot have space-time tilts greater than 45°" (measured from the time axis) reflects two facts: the speed of light is the fastest allowable physical velocity, and this velocity is the same for all observers. Postulate II is thus incompatible with Postulate I if the set of invariant (equivalent) observers specified by Postulate I is taken to be the set of Galilean invariant observers.

Therefore, if Postulates I and II are to hold conjointly, the principle of relativity stipulated by Postulate I cannot be Galilean relativity but must take some other form. We want, then, to discover what the structure of space-time will be if it is to be consistent with both postulates.
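The 45° rule can be checked numerically. The following is a minimal sketch in Python, assuming units in which c = 1 and a space-time diagram drawn with time vertical, so that a uniformly moving observer with velocity v has a worldline tilted by arctan(v) from the time axis. The function names are ours, chosen for illustration only.

```python
import math

C = 1.0  # speed of light in the chapter's units (assumption: c = 1)


def tilt_degrees(velocity):
    """Angle, in degrees, between the worldline x = v*t and the time axis."""
    return math.degrees(math.atan(velocity / C))


def allowed_in_minkowski(velocity):
    """A massive observer must move slower than light: |v| < c."""
    return abs(velocity) < C


# Galilean relativity admits every one of these velocities;
# the special theory admits only those with a tilt below 45 degrees.
for v in (0.0, 0.5, 0.99, 1.0, 3.0):
    print(f"v = {v:4.2f}  tilt = {tilt_degrees(v):5.1f} deg  "
          f"allowed: {allowed_in_minkowski(v)}")
```

A velocity of 3 (three times the speed of light) is a perfectly good Galilean observer, but its tilt exceeds 45° and it is therefore excluded once Postulate II is accepted.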

III. The lightcone structure.

Consider the worldline of a single observer O. Since light is the fastest signal, the best way for the observer to communicate information to other space-time event locations is to send (and receive) light signals. The sending of such a light signal from an event C* on O's worldline is shown in Fig. 13.3. This defines a "light cone" at C*. Let us assume that the signal sent from C* will be received by other observers. These other observers are emitting their own light signals. The net effect is to define a light cone structure at every event in the set of events E. This situation is illustrated in Fig. 13.4.

Postulate II entails not only that there is a "light cone" defined at every space-time point but also that, because the speed of light is a universal constant for all observers, the "tilt" of the cones is everywhere the same, independent of the velocity of an observer at any space-time point. We can thus think of Postulate II as endowing the set of all events E with a preferred "light-cone" structure that is globally (in E) the same. This cone structure has enormous consequences for the causal structure of E. Any events which lie outside the light cone generated by O's signal at C*, e.g., cannot be causally influenced by O's actions at C*. Those events which can be causally linked in E are called "causally connectible". Intuitively, event $e_1$ is causally connectible to event $e_2$ if a signal can be sent from $e_1$ to $e_2$, or conversely. The invariance of the velocity of light, as we shall see, means that if our observer O sees $e_1$ and $e_2$ as causally connected, then any inertial observer will agree with him.

This new causal structure can be partly understood by considering the set of events accessible to a given observer. To see this we refer the reader to Fig. 13.5.

The set of events labelled FT ("future time-like") are those events that occur after event C* and that can be causally influenced by a signal emanating from C*. The set of events PT ("past time-like") are those events that occur before event C* and which can causally influence C* by means of a signal sent to C*. Those labelled FL and PL are respectively "future light-like" and "past light-like" relative to C*; they are determined by light signals emanating from or coming to C*. Finally, the events labelled S are "space-like related" to C*.

The designation FT means that any freely moving observer who goes through C* could find a velocity, less than light velocity, such that the event C* and any given event $e_f$ in FT both fall on his worldline. Relative to this observer the two events necessarily have the temporal relation $t_c < t_{e_f}$. The set of events in PT is characterized similarly: an observer O' who goes through any event $e_p$ in PT could find some velocity V, less than light velocity, such that C* will fall on his worldline. For such an observer $t_{e_p} < t_c$. Generally, we see that the FT events are the events, relative to C*, which can be subsequently affected by C*, while the PT events are those that could have affected an observer who passes through event C*, so long as the observer's velocity does not exceed the speed of light either subsequent to C* or prior to C*. For such a sub-luminal observer, we see that the world of spatial and temporal points is only partly accessible at a given instant on the observer's local clock. Those events in the region S are not accessible to an observer passing through C* whose velocity is less than light velocity. FT is the subset of events in E which can be reached by an observer who passes through C* with a speed less than the speed of light. PT is the subset of events in E which can affect C*.
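The division of E into the regions FT, PT, FL, PL and S relative to C* can be written as a short rule. The sketch below assumes one space dimension, units with c = 1, and events given as (t, x) pairs in the coordinates of some inertial observer. The function name `classify` and the tolerance `tol` are ours, introduced only for illustration.

```python
def classify(event, origin=(0.0, 0.0), tol=1e-12):
    """Return the region (FT, PT, FL, PL or S) of `event` relative to `origin`.

    Events are (t, x) pairs and the speed of light is 1. `origin` plays the
    role of the event C*.
    """
    dt = event[0] - origin[0]
    dx = event[1] - origin[1]
    separation = dt * dt - dx * dx  # positive inside the cone, negative outside

    if abs(separation) <= tol:      # on the light cone (or C* itself)
        if dt > 0:
            return "FL"
        if dt < 0:
            return "PL"
        return "C*"
    if separation > 0:              # reachable at less than light velocity
        return "FT" if dt > 0 else "PT"
    return "S"                      # would require a faster-than-light signal


for e in [(2.0, 1.0), (-2.0, 1.0), (1.0, 1.0), (-1.0, -1.0), (1.0, 3.0)]:
    print(e, classify(e))
```

For an event in FT the velocity that carries an observer from C* to that event is dx/dt, which is less than 1 precisely because the event lies inside the cone.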

Any observer passing through C* can communicate with events in S only if he travels faster than the speed of light or can send a superluminal signal. But this is impossible. That is not to say that O is forever unable to know about events in S. He must, however, wait for a signal sent from an event in S to arrive at some time later than $t_c$ on his local clock. If O wants to send information to events in S, he must send it at a time earlier than $t_c$. Let $e_s$ be such an event in S. In order for O to receive information from $e_s$, O must wait a non-zero time interval after C*, at which time the light signal from $e_s$ arrives at O's worldline. This can easily be seen by inspecting the following space-time diagram, Fig. 13.6(a). In Fig. 13.6(b) we show what must be the case if O wishes to send a signal to $e_s$. It is clear that the event $e_B$ must precede the event C*.

This is the state of affairs forced upon us by Postulate II.
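The waiting times described above are easy to compute in a simple case. The following sketch assumes that O is at rest at x = 0, that C* is the origin (t = 0, x = 0) of O's coordinates, that c = 1, and that there is one space dimension. The function names are ours.

```python
def reception_time(event):
    """Time on O's clock at which a light signal emitted at `event` reaches x = 0."""
    t, x = event
    return t + abs(x)   # light needs a time |x| to cover the distance |x|


def emission_time(event):
    """Time on O's clock at which O must emit a light signal that arrives at `event`."""
    t, x = event
    return t - abs(x)


e_s = (1.0, 3.0)        # space-like related to C*, since |x| > |t|
print("O learns of e_s at t =", reception_time(e_s))   # after C*
print("O must signal e_s at t =", emission_time(e_s))  # before C*
```

For any event with |x| > |t| the reception time is positive and the emission time is negative, which is the content of Figs. 13.6(a) and 13.6(b).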

IV. Implications of the light cone structure.

How does the light cone structure induced by the postulates of the special theory affect the conceptions of space and time? One effect is that not every allowable Galilean inertial observer is an allowable relativistic inertial observer. Consider again the set of all Galilean inertial observers. The tilt between the worldlines of any two such observers could be anything up to 180°. The constraints induced by Postulate II, however, forbid any two inertial observers from having worldlines which tilt by more than 90° with respect to one another. Since the set of space-time "tilts" for the Galilean space-time includes observers whose worldlines differ by more than 90°, we see that Galilean relativity will never be consistent with Postulate II. Thus, consider the event C* and construct the set of all Galilean observers who experience C*. This is shown in Fig. 13.7.

In this figure we have indicated these observers by drawing tilted straight worldlines whose directions lie anywhere between straight up and 90° to that direction. The arrows are meant to indicate the temporal sense. No arrows are directed backwards in "universal" time since backward time travel is not allowed in the Galilean world-view. Those Galilean observers who are travelling precisely at the speed of light are indicated by the heavy lines. The dashed worldlines in Fig. 13.7 are the worldlines of Galilean observers forbidden by the special theory.

The other major effect is to relativize a number of concepts which in earlier models were observer independent.

(1) causal connectibility

In Galilean space-time any two events are causally connectible. In the special theory of relativity (Minkowski space-time) only events which are such that one lies within or on the future light cone of the other are causally connectible.

(2) simultaneity

In Galilean space-time the simultaneity of events is independent of the relative motion of inertial observers. However, in the special theory of relativity events which are simultaneous for one inertial observer will not, in general, be simultaneous for another inertial observer.

(3) temporal intervals

In Galilean space-time there exists a universal time function. The temporal interval between arbitrary events is well-defined and unique, irrespective of the relative motion of inertial observers. In the special theory of relativity time intervals are relativized to an observer. The time interval between two events as measured by one inertial observer will not, in general, agree with the time interval measured by another inertial observer.

(4) spatial intervals

In Galilean space-time the spatial interval is undefined for events that lie on different time slices. For events which lie on the same time slice the value of the spatial interval is independent of the relative motion of observers who cross the same time slice. The spatial interval between events in the special theory of relativity is observer-dependent. The spatial interval between events on the same simultaneity slice is dependent on the relative state of motion of different observers.
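Items (2) to (4) can be illustrated with a small numerical example. The sketch below uses the standard Lorentz transformation between two inertial observers, written in units with c = 1 and one space dimension. That transformation is not derived in this section, so its use here is an assumption brought in for illustration, and the function name `boost` is ours.

```python
import math


def boost(event, v):
    """Coordinates (t', x') that an observer moving with velocity v assigns to `event`.

    Standard Lorentz transformation with c = 1; `event` is the pair (t, x)
    in the coordinates of the first observer.
    """
    t, x = event
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (t - v * x), gamma * (x - v * t)


# Two events that the first observer judges to be simultaneous (same t).
e1 = (0.0, 0.0)
e2 = (0.0, 1.0)
v = 0.6

t1p, x1p = boost(e1, v)
t2p, x2p = boost(e2, v)

print("first observer:  dt =", e2[0] - e1[0], " dx =", e2[1] - e1[1])
print("second observer: dt =", t2p - t1p, " dx =", x2p - x1p)
```

With v = 0.6 the second observer finds a non-zero time separation between the two events and a different spatial separation, so neither simultaneity nor the separate time and space intervals are shared by the two observers.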

Are there, then, no absolute concepts left? The answer is "not quite".

While neither the time intervals nor the spatial intervals between events are observer-independent, there is a quantity in Minkowski space-time which is the same for all inertial observers: the "Interval" between two events. This quantity may be defined by making use of the basic ideas already introduced in Chapter 12.

Let there be two observers A and B with B moving relative to A with some velocity along B's x-axis.

See Fig. 13.8

Suppose A and B wish to observe one and the same event e in our Minkowski space-time. As we recall, the procedure is that A and B send light signals to e and receive light signals from e in return. The emission and reception event times are recorded on each observer's local clock. The situation is shown on a space-time diagram in Fig. 13.9, where we have labelled the proper times of emission and reception of the probe light signals.

These proper times are related as follows:

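$$ t_{B1} = k\,t_{A1}, \qquad t_{B2} = \frac{t_{A2}}{k} $$

Here the subscripts 1 and 2 label the emission and reception times, respectively. Recall that $k = \sqrt{(1+v)/(1-v)}$, where v is the velocity of B relative to A in units with c = 1. What can we learn from this? A few minutes of symbol manipulation leads one to the following realization:

$$ t_{B1}\,t_{B2} = (k\,t_{A1})(k^{-1}\,t_{A2}) = t_{A1}\,t_{A2} $$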

What is the meaning of the above relation? It tells us that there is a number which has the same value for both A and B. The sum of the emission and reception times (which is related to the time interval) and the difference of the emission and reception times (which is related to the spatial distance) are not observer invariant. The product of the send and receive times, however, is observer invariant. Since this is such an important object, we give it a special name, "the Interval."
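The invariance of the product can be seen in a few lines of arithmetic. The sketch below assumes that the factor relating the two observers' clock readings is the usual Doppler factor sqrt((1+v)/(1-v)) with c = 1, written `k` here, and that B's emission and reception times are obtained from A's by multiplying and dividing by it. The symbol and the function names are ours.

```python
import math


def k_factor(v):
    """Factor relating the proper times of A and B for relative velocity v (c = 1)."""
    return math.sqrt((1.0 + v) / (1.0 - v))


def b_times(t_a_send, t_a_receive, v):
    """B's emission and reception times for the event that A probes."""
    k = k_factor(v)
    return k * t_a_send, t_a_receive / k


t_a_send, t_a_receive = 2.0, 8.0
print("A: sum =", t_a_send + t_a_receive,
      " difference =", t_a_receive - t_a_send,
      " product =", t_a_send * t_a_receive)

for v in (0.3, 0.6, 0.9):
    t_b_send, t_b_receive = b_times(t_a_send, t_a_receive, v)
    print(f"B (v = {v}): sum = {t_b_send + t_b_receive:.4f}"
          f"  difference = {t_b_receive - t_b_send:.4f}"
          f"  product = {t_b_send * t_b_receive:.4f}")
```

The sum and the difference change with v, while the product stays at the value found by A.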

There is another useful form of the invariance of I. This is obtained by using the relation between the send and receive times $t_{A1}$ and $t_{A2}$, the time $t_A$ (the time of occurrence of e as measured by A) and the spatial distance $d_A$ (the distance of the event e from A). They are related via:

$$ d_A = \tfrac{1}{2}\,(t_{A2} - t_{A1}) $$

$$ t_A = \tfrac{1}{2}\,(t_{A2} + t_{A1}) $$

These measures are shown on the following diagram.

It is a simple matter to solve for $t_{A1}$ and $t_{A2}$ in terms of $d_A$ and $t_A$. The result is:

$$ t_{A1} = t_A - d_A, \qquad t_{A2} = t_A + d_A $$

Now that we have expressed the proper times in terms of $d_A$ and $t_A$, it is a simple matter to calculate the Interval I:

$$ I = t_{A1}\,t_{A2} = (t_A - d_A)(t_A + d_A) = t_A^2 - d_A^2 $$

This result states that the difference between the squares of $t_A$ and $d_A$ is invariant. This means that the reference to A can be dropped, i.e. $t^2 - d^2$ is an invariant for all observers.
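As a worked check, the sketch below converts emission and reception times into a time and a distance and evaluates the difference of squares for two observers. The numbers for B are those that follow from A's numbers when the clock-rate factor between the observers equals 2 (relative velocity 0.6 in units with c = 1). These values, and the function names, are illustrative assumptions.

```python
def radar_coordinates(t_send, t_receive):
    """Time t and distance d of an event from its emission and reception times (c = 1)."""
    t = 0.5 * (t_receive + t_send)
    d = 0.5 * (t_receive - t_send)
    return t, d


def interval_from_radar(t_send, t_receive):
    """The Interval t^2 - d^2 computed from the radar coordinates."""
    t, d = radar_coordinates(t_send, t_receive)
    return t * t - d * d


observers = {"A": (2.0, 8.0), "B": (4.0, 4.0)}  # (send, receive) times
for name, (t1, t2) in observers.items():
    t, d = radar_coordinates(t1, t2)
    print(name, " t =", t, " d =", d,
          " t^2 - d^2 =", interval_from_radar(t1, t2),
          " product =", t1 * t2)
```

A assigns the event a time of 5 and a distance of 3, while B assigns it a time of 4 and a distance of 0 (in this example B's worldline passes through the event), yet both obtain the same value, 16, for the Interval.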

This result implies that the "geometry" of the Minkowski space-time is not Euclidean geometry, because if the time and space coordinates (t,x) were coordinates of events in a Euclidean space-time we would expect the "distance" between event locations to be given by a formula of the type

$$ (\text{distance})^2 = t^2 + x^2 $$

That is, it is the sum of squares that is expected in a geometrically Euclidean space-time. Consider the structure of the surface of simultaneity t = 0. From the definition of the Interval we have, for t = 0,

$$ I|_{t=0} = 0^2 - x^2 - y^2 - z^2 $$

$$ -I|_{t=0} = x^2 + y^2 + z^2 $$

Thus $-I|_{t=0}$ is the square of the ordinary Euclidean distance between the observer's origin and the event (0,x,y,z). Consequently these t = 0 "slices" are Euclidean with the natural Euclidean (distance) function $\sqrt{-I|_{t=0}}$.

However, the space-time interval is not the square of a Euclidean distance.


This discovery that the difference rather than the "ordinary" sum of squares is invariant may be used as a point of departure for the study of a new type of four-dimensional geometry: the non-Euclidean Minkowskian space-time. Focussing on the geometrical implications of the special theory of relativity turned out to be an enormously fruitful approach. It was the road taken by Hermann Minkowski in 1908 and by Einstein afterwards. In Minkowski space-time the interval between "slightly" different events is identified with the "distance" measure. If an event p has space-time coordinates $(t_p, x_p, y_p, z_p)$ and a "slightly" different event q has space-time coordinates $(t_q, x_q, y_q, z_q)$, where

$$ t_q = t_p + dt, \quad x_q = x_p + dx, \quad y_q = y_p + dy, \quad z_q = z_p + dz $$

then the natural "distance" of this space-time is given by

$$ ds^2 = I = (dt)^2 - (dx)^2 - (dy)^2 - (dz)^2 $$
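The four-dimensional form of the Interval, and its reduction to ordinary Euclidean distance on a t = 0 slice, can be sketched as follows. Units with c = 1 are assumed, and the function names are ours.

```python
import math


def interval(dt, dx, dy, dz):
    """Minkowski Interval between two events with coordinate differences dt, dx, dy, dz."""
    return dt**2 - dx**2 - dy**2 - dz**2


def distance_on_slice(dx, dy, dz):
    """Euclidean distance between two events on the same simultaneity slice (dt = 0)."""
    return math.sqrt(-interval(0.0, dx, dy, dz))


print(interval(1.0, 0.0, 0.0, 0.0))      # positive: separation purely in time
print(interval(1.0, 1.0, 0.0, 0.0))      # zero: the events lie on a light signal
print(interval(0.0, 3.0, 4.0, 0.0))      # negative: separation purely in space
print(distance_on_slice(3.0, 4.0, 0.0))  # ordinary Pythagorean distance
```

Unlike a Euclidean squared distance, the Interval can be positive, zero or negative, and it vanishes for distinct events connected by a light signal.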

With this reorganization of the set of events we have come to another milestone in the history of space-time models. The Minkowski space-time is a truly new object, born from the ashes of the absolute structures in the Aristotelian, Galilean, and Newtonian space-time models. Its properties are tied in a direct way to the experimentally verified postulates of Einstein's 1905 theory of Special Relativity.