We have done considerable mountain climbing. Now we are in the rarefied atmosphere of theories of excessive beauty and we are nearing a high plateau on which geometry, optics, mechanics, and wave mechanics meet on common ground. Only concentrated thinking, and a considerable amount of re–creation, will reveal the beauty of our subject in which the last word has not been spoken.

Cornelius Lanczos, *The Variational Principles of Mechanics* [29], p. 229

One way to simplify the analysis of a problem is to express it in a form in which the solution has a simple representation. However, it may not be easy to formulate the problem in such a way initially. It is often useful to start by formulating the problem in one way, and then transform it. For example, the formulation of the problem of the motion of a number of gravitating bodies is simple in rectangular coordinates, but it is easier to understand aspects of the motion in terms of orbital elements, such as the semimajor axes, eccentricities, and inclinations of the orbits. The semimajor axis and eccentricity of an orbit depend on both the configuration and the velocity of the body. Such transformations are more general than those that express changes in configuration coordinates. Here we investigate transformations of phase-space coordinates that involve both the generalized coordinates and the generalized momenta.

Suppose we have two different Hamiltonian systems, and suppose the trajectories of the two systems are in one-to-one correspondence. In this case both Hamiltonian systems can be mathematical models of the same physical system. Some questions about the physical system may be easier to answer by reference to one model and others may be easier to answer in the other model. For example, it may be easier to formulate the physical system in one model and to discover a conserved quantity in the other. Canonical transformations are maps between Hamiltonian systems that preserve the dynamics.

A *canonical transformation* is a phase-space coordinate transformation and an associated transformation of the Hamiltonian such that the dynamics given by Hamilton's equations in the two representations describe the same evolution of the system.

A *point transformation* is a canonical transformation that extends a possibly time-dependent transformation of the configuration coordinates to a phase-space transformation. For example, one might want to reexpress motion in terms of polar coordinates, given a description in terms of rectangular coordinates. In order to extend a transformation of the configuration coordinates to a phase-space transformation we must specify how the momenta and Hamiltonian are transformed.

We have already seen how coordinate transformations can be carried out in the Lagrangian formulation (see section 1.6.1). In that case, we found that if the Lagrangian transforms by composition with the coordinate transformation, then the Lagrange equations are equivalent.

Lagrangians that differ by the addition of a total time derivative have the same Lagrange equations, but may have different momenta conjugate to the generalized coordinates. So there is more than one way to make a canonical extension of a coordinate transformation.

Here, we find the particular canonical extension of a coordinate transformation for which the Lagrangians transform by composition with the transformation, with no extra total time derivative terms added to the Lagrangian.

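Let *L* be a Lagrangian for a system. Consider the coordinate transformation *q* = *F*(*t*, *q*′). The velocities transform by

$$v = \partial_0 F(t, q') + \partial_1 F(t, q')\, v'. \tag{5.1}$$

We obtain a Lagrangian *L*′ in the transformed coordinates by composition of *L* with the coordinate transformation. We require that *L*′(*t*, *q*′, *v*′) = *L*(*t*, *q*, *v*), so:

$$L'(t, q', v') = L\bigl(t,\, F(t, q'),\, \partial_0 F(t, q') + \partial_1 F(t, q')\, v'\bigr).$$

The momentum conjugate to *q*′ is

$$p' = \partial_2 L'(t, q', v') = \partial_2 L\bigl(t,\, F(t, q'),\, \partial_0 F(t, q') + \partial_1 F(t, q')\, v'\bigr)\, \partial_1 F(t, q') = p\, \partial_1 F(t, q'), \tag{5.3}$$

where we have used

$$p = \partial_2 L(t, q, v) = \partial_2 L\bigl(t,\, F(t, q'),\, \partial_0 F(t, q') + \partial_1 F(t, q')\, v'\bigr).$$

So, from equation (5.3),^{1}

$$p = p'\, \bigl(\partial_1 F(t, q')\bigr)^{-1}. \tag{5.5}$$

We can collect these results to define a canonical phase-space transformation *C*_{H}:^{2}

$$(t, q, p) = C_H(t, q', p') = \bigl(t,\, F(t, q'),\, p'\, (\partial_1 F(t, q'))^{-1}\bigr).$$

The Hamiltonian is obtained by the Legendre transform

$$\begin{aligned}
H'(t, q', p') &= p' v' - L'(t, q', v') \\
&= \bigl(p\, \partial_1 F(t, q')\bigr)\bigl((\partial_1 F(t, q'))^{-1} (v - \partial_0 F(t, q'))\bigr) - L(t, q, v) \\
&= p v - L(t, q, v) - p\, \partial_0 F(t, q') \\
&= H(t, q, p) - p\, \partial_0 F(t, q'),
\end{aligned} \tag{5.7}$$

using relations (5.1) and (5.3) in the second step. Fully expressed in terms of the transformed coordinates and momenta, the transformed Hamiltonian is

$$H'(t, q', p') = H\bigl(t,\, F(t, q'),\, p'\, (\partial_1 F(t, q'))^{-1}\bigr) - p'\, (\partial_1 F(t, q'))^{-1}\, \partial_0 F(t, q'). \tag{5.8}$$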

The Hamiltonians *H*′ and *H* are equivalent because *L* and *L*′ have the same value for a given dynamical state and so have the same paths of stationary action. In general *H* and *H*′ do not have the same values for a given dynamical state, but differ by a term that depends on the coordinate transformation.

For time-independent transformations, ∂_{0}*F* = 0, there are a number of simplifications. The relationship of the velocities (5.1) becomes

$$v = \partial_1 F(t, q')\, v'.$$

Comparing this to the relation (5.5) between the momenta, we see that in this case the momenta transform “oppositely” to the velocities^{3}

$$p v = p'\, (\partial_1 F(t, q'))^{-1}\, \partial_1 F(t, q')\, v' = p' v', \tag{5.10}$$

so the product of the momenta and the velocities is not changed by the transformation. This, combined with the fact that by construction *L*(*t*, *q*, *v*) = *L*′(*t*, *q*′, *v*′), shows that

$$H(t, q, p) = p v - L(t, q, v) = p' v' - L'(t, q', v') = H'(t, q', p').$$

For time-independent coordinate transformations the Hamiltonian transforms by composition with the associated phase-space transformation. We can also see this from the general relationship (5.7) between the Hamiltonians.

The procedure F->CH takes a procedure F implementing a transformation of configuration coordinates and returns a procedure implementing a transformation of phase-space coordinates:^{4}

Consider a particle moving in a central field. In rectangular coordinates a Hamiltonian is

$$H(t; x, y; p_x, p_y) = \frac{p_x^2 + p_y^2}{2m} + V\bigl(\sqrt{x^2 + y^2}\bigr).$$

Let's look at this Hamiltonian in polar coordinates. The phase-space transformation is obtained by applying F->CH to the procedure p->r that takes a time and a polar tuple and returns a tuple of rectangular coordinates (see section 1.6.1). The transformation is time independent so the Hamiltonian transforms by composition. In polar coordinates the Hamiltonian is

$$H'(t; r, \varphi; p_r, p_\varphi) = V(r) + \frac{p_r^2}{2m} + \frac{p_\varphi^2}{2 m r^2}.$$

There are three terms. There is the potential energy, which depends on the radius, there is the kinetic energy due to radial motion, and there is the kinetic energy due to tangential motion. As expected, the angle *φ* does not appear and thus the angular momentum is a conserved quantity. By going to polar coordinates we have decoupled one of the two degrees of freedom in the problem.
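The composition rule for this time-independent case can be checked numerically. The following is a minimal Python sketch, not the book's Scheme procedure: the names `polar_CH`, `H_rect`, and `H_polar`, the mass value, and the potential `V` are all assumptions made for illustration. It builds the canonical extension of the polar-to-rectangular map by hand, using *p* = *p*′(∂_{1}*F*)^{−1}, and compares the rectangular Hamiltonian composed with that map against the three-term polar form.

```python
import math

M = 1.5  # particle mass (arbitrary illustrative value)

def V(r):
    # Hypothetical central potential, chosen only for this check.
    return -1.0 / r

def H_rect(t, q, p):
    x, y = q
    px, py = p
    return (px * px + py * py) / (2 * M) + V(math.hypot(x, y))

def H_polar(t, q, p):
    r, phi = q
    pr, pphi = p
    return V(r) + pr * pr / (2 * M) + pphi * pphi / (2 * M * r * r)

def polar_CH(t, q, p):
    """Canonical extension of (r, phi) -> (x, y), with p = p' (d1 F)^-1."""
    r, phi = q
    pr, pphi = p
    c, s = math.cos(phi), math.sin(phi)
    # d1 F = [[c, -r s], [s, r c]]; its inverse is [[c, s], [-s/r, c/r]].
    px = pr * c - pphi * s / r
    py = pr * s + pphi * c / r
    return t, (r * c, r * s), (px, py)

state = (0.0, (2.0, 0.7), (0.3, -1.1))  # (t, (r, phi), (p_r, p_phi))
difference = H_rect(*polar_CH(*state)) - H_polar(*state)
```

Under these assumptions `difference` vanishes to rounding error at any state with *r* > 0, which is the statement that the Hamiltonian transforms by composition.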

If the transformation is time varying the Hamiltonian must be adjusted by adding a correction to the composition of the Hamiltonian and the transformation (see equation 5.8):

$$H' = H \circ C_H + K.$$

The correction is computed by

For example, consider a transformation to coordinates translating with velocity *v*:

We compute the additive adjustment required for the Hamiltonian:

Notice that this is the negation of the inner product of the momentum and the velocity of the coordinate system.
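As a rough numerical illustration of this adjustment, here is a Python sketch for two degrees of freedom. The names `F_to_K` and `translating` are hypothetical stand-ins chosen for this example, and the derivatives are taken by central differences rather than symbolically, so this is only an approximation of the idea *K* = −*p*′(∂_{1}*F*)^{−1}∂_{0}*F* read off from equation (5.8).

```python
def F_to_K(F, h=1e-5):
    """Return K(t, q', p') = -p' (d1 F)^-1 d0 F for a 2-DOF point transformation F."""
    def K(t, q, p):
        # d0 F: partial derivative of F with respect to time.
        d0 = [(a - b) / (2 * h) for a, b in zip(F(t + h, q), F(t - h, q))]
        # grads[j][i] = d F_i / d q'_j (one row per primed coordinate).
        grads = []
        for j in range(2):
            qp, qm = list(q), list(q)
            qp[j] += h
            qm[j] -= h
            grads.append([(a - b) / (2 * h) for a, b in zip(F(t, qp), F(t, qm))])
        # Solve p_old . d1F = p' by Cramer's rule.
        (a, b), (c, d) = grads
        det = a * d - b * c
        p_old = [(p[0] * d - b * p[1]) / det, (a * p[1] - c * p[0]) / det]
        return -(p_old[0] * d0[0] + p_old[1] * d0[1])
    return K

def translating(v):
    # Coordinates translating with constant velocity v: q = q' + v t.
    return lambda t, q: (q[0] + v[0] * t, q[1] + v[1] * t)

v = (0.5, -2.0)
p_prime = (1.2, 0.4)
K_value = F_to_K(translating(v))(0.3, (1.0, 2.0), p_prime)
minus_p_dot_v = -(p_prime[0] * v[0] + p_prime[1] * v[1])
```

For this translation ∂_{1}*F* is the identity, so `K_value` should agree with `minus_p_dot_v` up to finite-difference error.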

Let's see how a simple free-particle Hamiltonian is transformed:

The transformed Hamiltonian is:

**Exercise 5.1: Galilean invariance**

Is this result what you expected? Let's investigate.

Recall that in exercise 1.29 we showed that if the kinetic energy is

Let *C*_{H} be the phase space extension of the translation transformation, and *C* be the local tuple extension. The transformed Hamiltonian is *H*′ = *H* ∘ *C*_{H} + *K*; the transformed Lagrangian is *L*′ = *L* ∘ *C*.

**a.** Derive the relationship between *p* and *p*′ both from *C*_{H} and from the Lagrangians. Are they the same? Derive the relationship between *v* and *v*′ by taking the derivative of the Hamiltonians with respect to the momenta (Hamilton's equation). Show that the Legendre transform of *L*′ gives the same *H*′.

**b.** We have shown that *L* and *L*′ differ by a total time derivative. So for any uniformly moving coordinate system we can write the Lagrangian as *p*^{2}/(2*m*). Show that this differs from *H*′ by a total time derivative in the corresponding Lagrangians.

**Exercise 5.2: Rotations**

Let *q* and *q*′ be rectangular coordinates that are related by a rotation *R*: *q* = *Rq*′. The Lagrangian for the system is *L*(*t*, *q*, *v*) = ½*mv*^{2} − *V*(*q*). Find the corresponding phase-space transformation *C*_{H}. Compare the transformation equations for the rectangular components of the momenta to those for the rectangular components of the velocities. Are you surprised, considering equation (5.10)?

Although we have shown how to extend any coordinate transformation of the configuration space to a canonical transformation, there are other ways to construct canonical transformations. How do we know if we have a canonical transformation? To test if a transformation is canonical we may use the fact that if the transformation is canonical, then Hamilton's equations of motion for the transformed system and the original system will be equivalent.

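Consider a Hamiltonian *H* and a phase-space transformation *C*_{H}. Let *D _{s}* be the function that takes a Hamiltonian and gives the Hamiltonian state-space derivative:

$$D_s H(t, q, p) = \bigl(1,\, \partial_2 H(t, q, p),\, -\partial_1 H(t, q, p)\bigr).$$

Hamilton's equations are

$$D\sigma = D_s H \circ \sigma, \tag{5.14}$$

for any realizable phase-space path *σ*.

The transformation *C*_{H} transforms the phase-space path *σ*′(*t*) = (*t*, *q*′(*t*), *p*′(*t*)) into *σ*(*t*) = (*t*, *q*(*t*), *p*(*t*)):

$$\sigma = C_H \circ \sigma'.$$

The rates of change of the phase-space coordinates are transformed by the derivative of the transformation

$$D\sigma = D(C_H \circ \sigma') = (DC_H \circ \sigma')\, D\sigma'.$$

The transformation is canonical if the equations of motion obtained from the new Hamiltonian are the same as those that could be obtained by transforming the equations of motion derived from the original Hamiltonian to the new coordinates:

$$D\sigma = (DC_H \circ \sigma')\, D\sigma' = (DC_H \circ \sigma') \cdot (D_s H' \circ \sigma').$$

Using equation (5.14), we see that

$$D_s H \circ \sigma = (DC_H \circ \sigma') \cdot (D_s H' \circ \sigma').$$

With *σ* = *C*_{H} ∘ *σ*′, we find

$$D_s H \circ C_H \circ \sigma' = (DC_H \circ \sigma') \cdot (D_s H' \circ \sigma').$$

This condition must hold for any realizable phase-space path *σ*′. Certainly this is true if the following condition holds for every phase-space point:^{6}

$$D_s H \circ C_H = DC_H \cdot D_s H'. \tag{5.20}$$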

Any transformation that satisfies equation (5.20) is a canonical transformation among phase-space representations of a dynamical system. In one phase-space representation the system's dynamics is characterized by the Hamiltonian *H*′ and in the other by *H*. The idea behind this equation is illustrated in figure 5.1.

We can formalize this test as a program:

where Hamiltonian->state-derivative, which was introduced in chapter 3, implements *D _{s}*. The transformation is canonical if these residuals are zero.

For time-independent point transformations an appropriate Hamiltonian can be formed by composition with the corresponding phase-space transformation. We will see that the same holds for more general canonical transformations: if a transformation is independent of time, a suitable Hamiltonian for the transformed system is *H*′ = *H* ∘ *C*_{H}. In this case the canonical condition (5.20) takes a more specific form:

$$D_s H \circ C_H = DC_H \cdot D_s (H \circ C_H).$$

The analysis of the harmonic oscillator illustrates the use of a general canonical transformation in the solution of a problem. The harmonic oscillator is a mathematical model of a simple spring-mass system. A Hamiltonian for a spring-mass system with mass *m* and spring constant *k* is

$$H(t, x, p_x) = \frac{p_x^2}{2m} + \frac{1}{2} k x^2.$$

Hamilton's equations of motion are

$$Dx = \frac{p_x}{m}, \qquad Dp_x = -k x,$$

giving the second-order system

$$m D^2 x + k x = 0.$$

The solution is

$$x(t) = A \sin(\omega t + \varphi),$$

where

$$\omega = \sqrt{k/m}$$

and where *A* and *φ* are determined by initial conditions.

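We use the polar-canonical transformation:

$$(t, x, p_x) = C_\alpha(t, \theta, I), \tag{5.27}$$

where

$$x = \sqrt{\frac{2I}{\alpha}}\, \sin\theta, \tag{5.28}$$

$$p_x = \sqrt{2 I \alpha}\, \cos\theta. \tag{5.29}$$

Here *α* is an arbitrary parameter. We define: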

And now we just run our test:

So the transformation is canonical for the harmonic oscillator.^{7}

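Let's use our polar-canonical transformation *C _{α}* to help us solve the harmonic oscillator. We substitute expressions (5.28) and (5.29) for *x* and *p _{x}* in the Hamiltonian, getting our new Hamiltonian:

$$H'(t, \theta, I) = \frac{\alpha}{m}\, I \cos^2\theta + \frac{k}{\alpha}\, I \sin^2\theta.$$

If we choose *α* = √(*km*), then we obtain

$$H'(t, \theta, I) = \sqrt{k/m}\; I = \omega I,$$

and the new Hamiltonian no longer depends on the coordinate. Hamilton's equation for *I* is

$$DI(t) = -\partial_1 H'(t, \theta(t), I(t)) = 0,$$

so *I* is constant. The equation for *θ* is

$$D\theta(t) = \partial_2 H'(t, \theta(t), I(t)) = \omega,$$

so

$$\theta(t) = \omega t + \theta_0.$$

In the original variables,

$$x(t) = \sqrt{\frac{2 I(t)}{\alpha}}\, \sin\theta(t) = A \sin(\omega t + \theta_0),$$

with the constant

$$A = \sqrt{2I/\alpha}.$$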
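The solution obtained through the action-angle form *H*′ = *ωI* can be sanity-checked numerically. The sketch below is an illustration in Python with arbitrarily chosen values of *m*, *k*, *I*, and *θ*_{0}; it assumes the polar-canonical form *x* = √(2*I*/*α*) sin *θ*, *p _{x}* = √(2*Iα*) cos *θ* with *α* = √(*km*), and tests Hamilton's equations by central differences.

```python
import math

m, k = 2.0, 8.0                 # illustrative mass and spring constant
omega = math.sqrt(k / m)        # angular frequency
alpha = math.sqrt(k * m)        # the choice that removes theta from H'
I0, theta0 = 0.75, 0.3          # constant action and initial angle

def x(t):
    return math.sqrt(2 * I0 / alpha) * math.sin(omega * t + theta0)

def p(t):
    return math.sqrt(2 * I0 * alpha) * math.cos(omega * t + theta0)

def H(t):
    return p(t) ** 2 / (2 * m) + 0.5 * k * x(t) ** 2

def D(f, t, h=1e-5):
    # Central-difference approximation to the time derivative.
    return (f(t + h) - f(t - h)) / (2 * h)

t = 1.234
residual_x = D(x, t) - p(t) / m     # Hamilton's equation Dx = p/m
residual_p = D(p, t) + k * x(t)     # Hamilton's equation Dp = -k x
energy_gap = H(t) - omega * I0      # value of H should equal omega * I
```

All three quantities should be zero up to discretization error, consistent with *I* being conserved and *θ* advancing uniformly at rate *ω*.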

**Exercise 5.3: Trouble in Lagrangian world**

Is there a Lagrangian *L*′ that corresponds to the harmonic oscillator Hamiltonian *H*′(*t*, *θ*, *I*) = *ωI*? What could this possibly mean?

**Exercise 5.4: Group properties**

If we say that *C*_{H} is canonical with respect to Hamiltonians *H* and *H*′ if and only if *D*_{s}*H* ∘ *C*_{H} = *DC*_{H} · *D*_{s}*H*′, then:

**a.** Show that the composition of canonical transformations is canonical.

**b.** Show that composition of canonical transformations is associative.

**c.** Show that the identity transformation is canonical.

**d.** Show that there is an inverse for a canonical transformation and the inverse is canonical.

We have seen that for time-dependent point transformations the Hamiltonian appropriate for the transformed system is the original Hamiltonian composed with the transformation and augmented with an additive correction. Here we find a similar decomposition for general time-dependent canonical transformations.

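The key to this decomposition is to separate the time part and the phase-space part of the Hamiltonian state derivative:^{8}

$$D_s H = T + \widetilde{D}H,$$

where

$$T(t, q, p) = (1, 0, 0), \qquad \widetilde{D}H(t, q, p) = \bigl(0,\, \partial_2 H(t, q, p),\, -\partial_1 H(t, q, p)\bigr),$$

as code:^{9}

If we assume that *H*′ = *H* ∘ *C*_{H} + *K*, then the canonical condition (5.20) becomes

$$D_s H \circ C_H = DC_H \cdot D_s (H \circ C_H + K).$$

Expanding the state derivative, the canonical condition is

$$T \circ C_H + \widetilde{D}H \circ C_H = DC_H \cdot T + DC_H \cdot \widetilde{D}(H \circ C_H) + DC_H \cdot \widetilde{D}K. \tag{5.40}$$

Equation (5.40) is satisfied if the following conditions are met:

$$\widetilde{D}H \circ C_H = DC_H \cdot \widetilde{D}(H \circ C_H), \tag{5.41}$$

$$T \circ C_H = DC_H \cdot T + DC_H \cdot \widetilde{D}K. \tag{5.42}$$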

The value of *T* ∘ *C*_{H} does not depend on *C*_{H}, so this term is really very simple. Notice that equation (5.41) does not depend upon *K* and that equation (5.42) does not depend upon *H*.

These can be implemented as follows:

Consider a time-dependent transformation to uniformly rotating coordinates:^{10}

with components

As a program this is

The extension of this transformation to a phase-space transformation is

We first verify that this time-dependent transformation satisfies equation (5.41). We will try it for an arbitrary Hamiltonian with three degrees of freedom:

And it works. Note that this result did not depend on any details of the Hamiltonian, suggesting that we might be able to make a test that does not require a Hamiltonian. We will see that shortly.

Since we have a point transformation, we can compute the required adjustment to the Hamiltonian:

So, for this transformation an appropriate correction to the Hamiltonian is

which is minus the rate of rotation of the coordinate system multiplied by the angular momentum. We implement *K* as a procedure

and apply the test. We find:

The residuals are zero so this *K* correctly completes the canonical transformation.
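The claim that the correction is minus the rotation rate times the angular momentum can be illustrated in the plane. This Python sketch is an assumption-laden reduction of the text's three-dimensional example to two coordinates, with hypothetical names `rotating_K` and `OMEGA`; it evaluates *K* = −*p*′(∂_{1}*F*)^{−1}∂_{0}*F* in closed form for *F*(*t*, *q*′) = *R*(Ω*t*)*q*′.

```python
import math

OMEGA = 0.7  # rotation rate of the coordinate system (illustrative)

def rotating_K(t, q, p):
    """K = -p' (d1 F)^-1 d0 F for F(t, q') = R(OMEGA t) q' in the plane."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    xp, yp = q
    # d0 F: time derivative of the rotated coordinates at fixed q'.
    d0 = (OMEGA * (-xp * s - yp * c), OMEGA * (xp * c - yp * s))
    # d1 F is the rotation matrix R, so p' R^-1 = p' R^T.
    px = c * p[0] - s * p[1]
    py = s * p[0] + c * p[1]
    return -(px * d0[0] + py * d0[1])

t, q, p = 1.9, (0.8, -1.3), (0.25, 2.0)
K_value = rotating_K(t, q, p)
angular_momentum = q[0] * p[1] - q[1] * p[0]
```

With these definitions `K_value` should equal `-OMEGA * angular_momentum` at every time, since the angular momentum about the rotation axis has the same value in both coordinate systems.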

We just saw that for the case of rotating coordinates the truth of equation (5.41) did not depend on the details of the Hamiltonian. If *C*_{H} satisfies equation (5.41) for any *H* then we can derive a condition on *C*_{H} that is independent of *H*.

Let's start with an expanded version of equation (5.41):

using the chain rule.

We introduce a shuffle function:

The argument to

Let *J* be the multiplier corresponding to the constant linear function

where *s*^{⋆} is an arbitrary argument, shaped like *DH*(*s*), that is compatible for multiplication with *s*. The value of *s*^{⋆} is irrelevant because *D*

We can move the *DC*_{H}(*s*′) to the left of *DH*(*C*_{H}(*s*′)) by taking its transpose:^{11}

Since

This is true for any *H* if

$$J = DC_H(s') \cdot J \cdot \bigl(DC_H(s')\bigr)^{\mathsf{T}}. \tag{5.52}$$

As a program, this is^{12,13}

This condition, equation (5.52), on *C*_{H}, called the *canonical condition*, does not depend on the details of *H*. This is a remarkable result: we can decide whether a phase-space transformation preserves the dynamics of Hamilton's equations without further reference to the details of the dynamical system. If the transformation is time dependent we can add a correction to the Hamiltonian to make it canonical.

The polar-canonical transformation satisfies the canonical condition:

But not every transformation we might try satisfies the canonical condition. For example, we might try *x* = *p* sin *θ* and *p _{x}* = *p* cos *θ*:

So this transformation does not satisfy the canonical condition.

The canonical condition can be written simply in terms of Poisson brackets.

The Poisson bracket can be written in terms of

as can be seen by writing out the components.

We break the transformation *C*_{H} into position and momentum parts:

In terms of the individual component functions, the canonical condition (5.52) is

where *δ*^{*i*}_{*j*} is one if *i* = *j* and zero otherwise. These equations are called the *fundamental Poisson brackets*. If a transformation satisfies these Poisson bracket relations then it satisfies the canonical condition.
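For one degree of freedom the fundamental brackets reduce to the single nontrivial requirement that the bracket of the new expression for the coordinate with the new expression for the momentum equals one (the other two brackets vanish by antisymmetry). The following Python sketch estimates that bracket by central differences. The function names are hypothetical, the value of `ALPHA` is arbitrary, and the second map is the trial transformation *x* = *p* sin *θ*, *p _{x}* = *p* cos *θ*, assumed here as an example of a failure.

```python
import math

ALPHA = 1.3  # arbitrary parameter of the polar-canonical map

def polar_canonical(theta, I):
    return (math.sqrt(2 * I / ALPHA) * math.sin(theta),
            math.sqrt(2 * I * ALPHA) * math.cos(theta))

def non_canonical(theta, p):
    # Trial map: ordinary polar coordinates in the (x, p_x) plane.
    return (p * math.sin(theta), p * math.cos(theta))

def bracket_x_px(C, q, p, h=1e-6):
    """Finite-difference {x, p_x} = d_q x * d_p p_x - d_p x * d_q p_x."""
    x_q = (C(q + h, p)[0] - C(q - h, p)[0]) / (2 * h)
    x_p = (C(q, p + h)[0] - C(q, p - h)[0]) / (2 * h)
    px_q = (C(q + h, p)[1] - C(q - h, p)[1]) / (2 * h)
    px_p = (C(q, p + h)[1] - C(q, p - h)[1]) / (2 * h)
    return x_q * px_p - x_p * px_q

good = bracket_x_px(polar_canonical, 0.4, 2.0)
bad = bracket_x_px(non_canonical, 0.4, 2.0)
```

Here `good` comes out as one, while `bad` equals the value of the momentum argument (2.0 at this point), so the trial map fails the bracket condition everywhere except on the circle *p* = 1.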

We have found that a transformation is canonical if its position-momentum part satisfies the canonical condition, but for a time-dependent transformation we may have to modify the Hamiltonian by the addition of a suitable *K*. We can rewrite these conditions in terms of Poisson brackets. If the Hamiltonian is

the transformation will be canonical if the coordinate-momentum transformation satisfies the fundamental Poisson brackets, and *K* satisfies:

**Exercise 5.5: Poisson bracket conditions**

Fill in the details to show that the canonical condition (5.52) is equivalent to the fundamental Poisson brackets (5.56) and that the condition on *K* (5.42) is equivalent to the Poisson bracket condition on *K* (5.58).

It is convenient to reformulate the canonical condition in terms of matrices. We can obtain a matrix representation of a structure with the utility s->m that takes a structure that represents a multiplier of a linear transformation and returns a matrix representation of the multiplier. The procedure s->m takes three arguments: (s->m ls A rs). The ls and rs specify the shapes of objects that multiply A on the left and right to give a numerical value. These specify the basis. So, the matrix representation of the multiplier corresponding to

This matrix, **J**, is useful, so we supply a procedure J-matrix so that (J-matrix n) gives this matrix for an *n* degree-of-freedom system.

We can now reexpress the canonical condition (5.52) as a matrix equation:

There is a further simplification available. The elements of the first row and the first column of the matrix representation of

Consider transformations for which the time does not depend on the coordinates or momenta^{14}

For this kind of transformation the first row and the first column of the residuals of the canonical-transform? test are identically zero:

But for C-general these are not zero. Since the transformations we are considering at most shift time, we need to consider only the submatrix associated with the coordinates and the momenta.

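The *qp* submatrix^{15} of dimension 2*n* × 2*n* of the matrix **J** is called the *symplectic unit* for *n* degrees of freedom:

$$\mathbf{J}_n = \begin{pmatrix} \mathbf{0}_{n \times n} & \mathbf{1}_{n \times n} \\ -\mathbf{1}_{n \times n} & \mathbf{0}_{n \times n} \end{pmatrix}.$$

The matrix **J**_{*n*} satisfies the following identities:

$$\mathbf{J}_n^{\mathsf{T}} = \mathbf{J}_n^{-1} = -\mathbf{J}_n.$$

A 2*n* × 2*n* matrix **A** that satisfies the relation

$$\mathbf{J}_n = \mathbf{A}\, \mathbf{J}_n\, \mathbf{A}^{\mathsf{T}}$$

is called a *symplectic matrix*. We can determine whether a matrix is symplectic: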

An appropriate symplectic unit matrix of a given size is produced by the procedure symplectic-unit.
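A plain-Python sketch of such a test follows. It is not the book's procedure: `symplectic_unit`, `symplectic_residual`, and the helper names are chosen for this illustration, matrices are nested lists, and the test used is the residual of **J**_{*n*} − **A** **J**_{*n*} **A**^{T}, assuming the block form of **J**_{*n*} with the identity in the upper-right block.

```python
def symplectic_unit(n):
    """2n x 2n matrix with blocks [[0, 1], [-1, 0]]."""
    J = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        J[i][n + i] = 1.0
        J[n + i][i] = -1.0
    return J

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def symplectic_residual(A):
    """Largest entry of |J - A J A^T|; zero when A is symplectic."""
    size = len(A)
    J = symplectic_unit(size // 2)
    AJAt = matmul(matmul(A, J), transpose(A))
    return max(abs(J[i][j] - AJAt[i][j])
               for i in range(size) for j in range(size))

shear = [[1.0, 0.5], [0.0, 1.0]]   # determinant 1
scale = [[2.0, 0.0], [0.0, 1.0]]   # determinant 2
shear_residual = symplectic_residual(shear)
scale_residual = symplectic_residual(scale)
```

For a 2 × 2 matrix the product **A** **J**_{1} **A**^{T} is det(**A**) times **J**_{1}, so the shear passes with zero residual and the scaling fails with residual one.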

If the matrix representation of the derivative of a transformation is a symplectic matrix the transformation is a *symplectic transformation*. Here is a test for whether a transformation is symplectic:^{16}

The procedure symplectic-transform? returns a zero matrix if and only if the transformation being tested passes the symplectic matrix test.

For example, the point transformations are symplectic. We can show this for a general possibly time-dependent two-degree-of-freedom point transformation:

More generally, the phase-space part of the canonical condition is equivalent to the symplectic condition (for two degrees of freedom) even in the case of an unrestricted phase-space transformation.

**Exercise 5.6: Symplectic matrices**

Let **A** be a symplectic matrix: **J**_{*n*} = **A** **J**_{*n*} **A**^{T}. Show that **A**^{T} and **A**^{−1} are symplectic.

**Exercise 5.7: Polar-canonical transformations**

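Let *x*, *p* and *θ*, *I* be two sets of canonically conjugate variables. Consider transformations of the form *x* = *βI ^{α}* sin *θ* and *p* = *βI ^{α}* cos *θ*. Determine all *α* and *β* for which this transformation is symplectic.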

**Exercise 5.8: Standard map**

Is the standard map a symplectic transformation? Recall that the standard map is: *I*′ = *I* + *K* sin *θ*, with *θ*′ = *θ* + *I*′, both modulo 2*π*.

**Exercise 5.9: Whittaker transform**

Shew that the transformation *q* = log ((sin *p*′)/*q*′) with *p* = *q*′ cot *p*′ is symplectic.

Canonical transformations allow us to change the phase-space coordinate system that we use to express a problem, preserving the form of Hamilton's equations. If we solve Hamilton's equations in one phase-space coordinate system we can use the transformation to carry the solution to the other coordinate system. What other properties are preserved by a canonical transformation?

We noted in equation (5.10) that point transformations that are canonical extensions of time-independent coordinate transformations preserve the value of *pv*. This does not hold for more general canonical transformations. We can illustrate this with the polar-canonical transformation. Along corresponding paths *x*, *p _{x}* and

and so *Dx* is

The difference of *pv* and the transformed *p*′*v*′ is

In general this is not zero. So the product *pv* is not necessarily invariant under general canonical transformations.
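A concrete special case makes the non-invariance visible. The Python sketch below assumes the harmonic-oscillator solution in polar-canonical variables (constant *I*, with *Dθ* = *ω* and *α* = √(*km*)), with illustrative parameter values; the function name `gap` is hypothetical. It evaluates *p _{x}* *Dx* − *I* *Dθ* along that path.

```python
import math

m, k = 1.0, 4.0
omega = math.sqrt(k / m)
alpha = math.sqrt(k * m)
I0, theta0 = 0.5, 0.0

def gap(t):
    """p_x * Dx - I * Dtheta along the oscillator path with constant I."""
    theta = omega * t + theta0
    Dx = math.sqrt(2 * I0 / alpha) * math.cos(theta) * omega
    px = math.sqrt(2 * I0 * alpha) * math.cos(theta)
    return px * Dx - I0 * omega

gap_at_start = gap(0.0)
gap_later = gap(0.3)
```

Along this particular path the difference works out to *Iω* cos 2*θ*, which oscillates and vanishes only at isolated times.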

Here is a remarkable fact: the composition of the Poisson bracket of two phase-space state functions with a canonical transformation is the same as the Poisson bracket of each of the two functions composed with the transformation separately. Loosely speaking, the Poisson bracket is invariant under canonical phase-space transformations.

Let *f* and *g* be two phase-space state functions. Using the

where the fact that *C*_{H} satisfies equation (5.41) was used in the middle. This is

Consider a canonical transformation *C*_{H}. Let *Ĉ _{t}* be a function with parameter

where *C*_{H} is symplectic then the determinant of *DĈ _{t}* is one (see section 4.2.3), so

Thus, phase-space volume is preserved by symplectic transformations.

Liouville's theorem shows that time evolution preserves phase-space volume. Here we see that canonical transformations also preserve phase volumes. Later, we will find that time evolution actually generates a canonical transformation.

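Define

$$\omega(\zeta_1, \zeta_2) = P(\zeta_2)\, Q(\zeta_1) - P(\zeta_1)\, Q(\zeta_2), \tag{5.70}$$

where *Q* = *I*_{1} and *P* = *I*_{2} are the coordinate and momentum selectors, respectively. The arguments *ζ*_{1} and *ζ*_{2} are incremental phase-space states with zero time components.

The *ω* form can also be written as a sum over degrees of freedom:

$$\omega(\zeta_1, \zeta_2) = \sum_i \bigl( P_i(\zeta_2)\, Q^i(\zeta_1) - P_i(\zeta_1)\, Q^i(\zeta_2) \bigr).$$

Notice that the contributions for each *i* do not mix components from different degrees of freedom.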

This bilinear form is closely related to the symplectic 2-form of differential geometry. It differs in that the symplectic 2-form is formally a function of the phase-space point as well as the incremental vectors.

Under a canonical transformation *s* = *C*_{H}(*s*′), incremental states transform with the derivative

We will show that the 2-form is invariant under this transformation

if the time components of the

We have shown that condition (5.41) does not depend on the details of the Hamiltonian *H*. So if a transformation satisfies the canonical condition we can use condition (5.41) with *H* replaced by an arbitrary function *f* of phase-space states:

In terms of *ω*, the Poisson bracket is

as can be seen by writing out the components. We use the fact that Poisson brackets are invariant under canonical transformations:

Using the relation (5.74) to expand the left-hand side of equation (5.76) we obtain:

The right-hand side of equation (5.76) is

Now the left-hand side must equal the right-hand side for any *f* and *g*, so the equation must also be true for arbitrary

So the

We have proven that

for canonical *C*_{H} and incremental states

Thus the bilinear antisymmetric function *ω* is invariant under even time-varying canonical transformations if the increments are restricted to have zero time component.

As a program, *ω* is

On page 356 we showed that point transformations are symplectic. Here we can see that the 2-form is preserved under these transformations for two degrees of freedom:

Alternatively, let **z**_{1} and **z**_{2} be the matrix representations of the *qp* parts of *ζ*_{1} and *ζ*_{2}. The matrix representation of *ω* is

Let **A** be the matrix representation of the *qp* part of *DC _{H}*(

But this is true if

which is equivalent to the condition that **A** is symplectic. (If a matrix is symplectic then its transpose is symplectic. See exercise 5.6).

The symplectic condition is symmetrical in that if **A** is symplectic then **A**^{T} is symplectic. The condition

$$J = DC_H(s')\; J\; \bigl(DC_H(s')\bigr)^{\mathsf{T}}$$

is satisfied by time-varying canonical transformations, and time-varying canonical transformations are symplectic. But if the transformation is time varying then

$$J = \bigl(DC_H(s')\bigr)^{\mathsf{T}}\; J\; DC_H(s') \tag{5.86}$$

is not satisfied because **J** is not invertible. Equation (5.86) is satisfied, however, for time-independent transformations.

The invariance of the symplectic 2-form under canonical transformations has a simple interpretation. Consider how the area of an incremental parallelogram in phase space transforms under canonical transformation. Let (Δ*q*, Δ*p*) and (*δq*, *δp*) be small increments in phase space, originating at (*q*, *p*). Consider the incremental parallelogram with vertex at (*q*, *p*) with these two phase-space increments as edges. The sum of the areas of the canonical projections of this incremental parallelogram can be written

$$\sum_i \bigl( \Delta q^i\, \delta p_i - \Delta p_i\, \delta q^i \bigr).$$

The right-hand side is the sum of the areas on the canonical planes;^{17} for each *i* the area of a parallelogram is computed from the components of the vectors defining its adjacent sides. Let *ζ*_{1} = (0, Δ*q,* Δ*p*) and *ζ*_{2} = (0, *δq, δp*); then the sum of the areas of the incremental parallelograms is just

$$\omega(\zeta_1, \zeta_2),$$

where *ω* is the bilinear antisymmetric function introduced in equation (5.70). The function *ω* is invariant under canonical transformations, so the sum of the areas of the incremental parallelograms is invariant under canonical transformations.
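The invariance of the incremental area can be illustrated for one degree of freedom. In this Python sketch the names `DC_qp`, `omega_form`, and `push` are hypothetical, `ALPHA` is arbitrary, and the transformation is assumed to be the polar-canonical map *x* = √(2*I*/*α*) sin *θ*, *p _{x}* = √(2*Iα*) cos *θ*, whose *qp* derivative block is written out by hand.

```python
import math

ALPHA = 0.8  # arbitrary parameter of the polar-canonical map

def DC_qp(theta, I):
    """qp block of the derivative of (theta, I) -> (x, p_x)."""
    return [[math.sqrt(2 * I / ALPHA) * math.cos(theta),
             math.sin(theta) / math.sqrt(2 * I * ALPHA)],
            [-math.sqrt(2 * I * ALPHA) * math.sin(theta),
             math.sqrt(ALPHA / (2 * I)) * math.cos(theta)]]

def omega_form(z1, z2):
    # Increments are (dq, dp) pairs: omega = dq1 * dp2 - dp1 * dq2.
    return z1[0] * z2[1] - z1[1] * z2[0]

def push(A, z):
    # Transform an increment with the derivative of the map.
    return (A[0][0] * z[0] + A[0][1] * z[1],
            A[1][0] * z[0] + A[1][1] * z[1])

A = DC_qp(1.1, 0.6)
z1, z2 = (0.3, -0.2), (0.1, 0.4)
area_before = omega_form(z1, z2)
area_after = omega_form(push(A, z1), push(A, z2))
```

The two areas agree because the determinant of this derivative block is one at every point with *I* > 0.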

There is an integral version of this differential relation. Consider the oriented area of a region *R*′ in phase space (see figure 5.2). Suppose we make a canonical transformation from coordinates (*q*′, *p*′) to (*q*, *p*) taking region *R*′ to region *R*. The boundary of the region in the transformed coordinates is just the image under the canonical transformation of the original boundary. Let *R* onto the *q ^{i}*,

The area of an arbitrary region is just the limit of the sum of the areas of incremental parallelograms that cover the region, so the sum of oriented areas is preserved by canonical transformations:

That is, the sum of the projected areas on the canonical planes is preserved by canonical transformations. Another way to say this is

The equality-of-areas relation (5.90) can also be written as an equality of line integrals using Stokes's theorem, for simply-connected regions

The canonical planes are disjoint except at the origin, so the projected areas intersect in at most one point. Thus we may independently accumulate the line integrals around the boundaries of the individual projections of the region onto the canonical planes into a line integral around the unprojected region:

**Exercise 5.10: Watch out**

Consider the canonical transformation *C*_{H}:

**a.** Show that the transformation is symplectic for any *a*.

**b.** Show that equation (5.92) is not generally satisfied for the region enclosed by a curve of constant *J*.

We have considered a number of properties of general canonical transformations without having a method for coming up with them. Here we introduce the method of *generating functions*. The generating function is a real-valued function that compactly specifies a canonical transformation through its partial derivatives, as follows.

Consider a real-valued function *F*_{1}(*t*, *q*, *q*′) mapping configurations expressed in two coordinate systems to the reals. We will use *F*_{1} to construct a canonical transformation from one coordinate system to the other. We will show that the following relations among the coordinates, the momenta, and the Hamiltonians specify a canonical transformation:

$$p = \partial_1 F_1(t, q, q'), \tag{5.93}$$

$$p' = -\partial_2 F_1(t, q, q'), \tag{5.94}$$

$$H'(t, q', p') - H(t, q, p) = \partial_0 F_1(t, q, q').$$

The transformation will then be explicitly given by solving for one set of variables in terms of the others. To obtain the primed variables in terms of the unprimed ones, let *A* be the inverse of ∂_{1}*F*_{1} with respect to the third argument,

$$q' = A\bigl(t, q, \partial_1 F_1(t, q, q')\bigr);$$

then

$$q' = A(t, q, p), \qquad p' = -\partial_2 F_1\bigl(t, q, A(t, q, p)\bigr).$$

Let *B* be the coordinate part of the phase-space transformation *q* = *B*(*t*, *q*′, *p*′). This *B* is an inverse function of ∂_{2}*F*_{1}, satisfying

$$q = B\bigl(t, q', -\partial_2 F_1(t, q, q')\bigr).$$

Using *B*, we have

$$q = B(t, q', p'), \qquad p = \partial_1 F_1\bigl(t, B(t, q', p'), q'\bigr).$$

To put the transformation in explicit form requires that the inverse functions *A* and *B* exist.

We can use the above relations to verify that some given transformation from one set of phase-space coordinates (*q*, *p*) with Hamiltonian function *H*(*t*, *q*, *p*) to another set (*q*′, *p*′) with Hamiltonian function *H*′(*t*, *q*′, *p*′) is canonical by finding an *F*_{1}(*t*, *q*, *q*′) such that the above relations are satisfied. We can also use arbitrarily chosen generating functions of type *F*_{1} to generate new canonical transformations.

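The polar-canonical transformation (5.27) from coordinate and momentum (*x*, *p _{x}*) to new coordinate and new momentum (*θ*, *I*),

$$x = \sqrt{\frac{2I}{\alpha}}\, \sin\theta, \tag{5.102}$$

$$p_x = \sqrt{2 I \alpha}\, \cos\theta, \tag{5.103}$$

introduced earlier, is canonical. This can also be demonstrated by finding a suitable *F*_{1} generating function. The generating function satisfies a set of partial differential equations, (5.93) and (5.94):

$$p_x = \partial_1 F_1(t, x, \theta), \tag{5.104}$$

$$I = -\partial_2 F_1(t, x, \theta). \tag{5.105}$$

Using relations (5.102) and (5.103), which specify the transformation, equation (5.104) can be rewritten

$$p_x = \alpha\, x \cot\theta = \partial_1 F_1(t, x, \theta),$$

which is easily integrated to yield

$$F_1(t, x, \theta) = \frac{\alpha}{2}\, x^2 \cot\theta + \varphi(t, \theta),$$

where *φ* is some integration “constant” with respect to the first integration. Substituting this form for *F*_{1} into the second partial differential equation (5.105), we find

$$I = -\partial_2 F_1(t, x, \theta) = \frac{\alpha}{2}\, \frac{x^2}{\sin^2\theta} - \partial_1 \varphi(t, \theta),$$

but if we set *φ* = 0 the desired relations are recovered. So the generating function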

generates the polar-canonical transformation. This shows that this transformation is canonical.
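The generating-function relations can be spot-checked numerically. This Python sketch assumes the generating function has the form (*α*/2)*x*^{2} cot *θ* and that the transformation is *x* = √(2*I*/*α*) sin *θ*, *p _{x}* = √(2*Iα*) cos *θ*; the variable names and parameter values are illustrative. It differentiates *F*_{1} by central differences and compares with the momentum and the action.

```python
import math

ALPHA = 1.7  # arbitrary parameter of the polar-canonical map

def F1(x, theta):
    # Assumed generating function: (alpha / 2) x^2 cot(theta).
    return 0.5 * ALPHA * x * x / math.tan(theta)

theta, I = 0.9, 1.4
x = math.sqrt(2 * I / ALPHA) * math.sin(theta)
px_expected = math.sqrt(2 * I * ALPHA) * math.cos(theta)

h = 1e-6
# p_x = d F1 / d x  and  I = - d F1 / d theta.
px_from_F1 = (F1(x + h, theta) - F1(x - h, theta)) / (2 * h)
I_from_F1 = -(F1(x, theta + h) - F1(x, theta - h)) / (2 * h)
```

Both derivatives reproduce the expected values to within finite-difference error, provided *θ* is not a multiple of *π*, where the cotangent is singular.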

We can prove directly that the transformation generated by an *F*_{1} is canonical by showing that if Hamilton's equations are satisfied in one set of coordinates then they will be satisfied in the other set of coordinates. Let *F*_{1} take arguments (*t*, *x*, *y*). The relations among the coordinates are

and the Hamiltonians are related by

Substituting the generating function relations (5.110) into this equation, we have

Take the partial derivatives of this equality of expressions with respect to the variables *x* and *y*:^{18}

where the arguments are unambiguous and have been suppressed. On solution paths we can use Hamilton's equations for the (*x*, *p _{x}*) system to replace the partial derivatives of

Now compute the derivatives of *p _{x}* and

Using the fact that elementary partials commute, (∂_{2}(∂_{1}*F*_{1})* _{i}*)

Provided that ∂_{2}∂_{1}*F*_{1} is nonsingular,^{19} we have derived one of Hamilton's equations for the (*y*, *p _{y}*) system:

Hamilton's other equation,

can be derived in a similar way. So the generating function relations indeed specify a canonical transformation.

Generating functions can be used to specify a canonical transformation by the prescription given above. Here we show how to get a generating function from a canonical transformation, and derive the generating function rules.

The generating function representation of canonical transformations can be derived from the Poincaré integral invariants, as follows. We first show that, given a canonical transformation, the integral invariants imply the existence of a function of phase-space coordinates that can be written as a path-independent line integral. Then we show that partial derivatives of this function, represented in mixed coordinates, give the generating function relations between the old and new coordinates. We need to do this only for time-independent transformations because time-dependent transformations become time independent in the extended phase space (see section 5.5).

Let *C* be a time-independent canonical transformation, and let *C _{t}* be the

where *R*′ is a two-dimensional region in (*q*′, *p*′) coordinates at time *t*, *R* = *C _{t}*(

where *γ*′ is a curve in phase-space coordinates that begins at *γ*′(1) = (*q*′, *p*′), and *γ* is its image under *C _{t}*.

Let

and let

So the value of *G _{t}*(

Let

where *γ*′ is any path from *q*′, *p*′. Changing the initial point from *Ḡ* by a constant:

If we define *F* so that

then

demonstrating equation (5.120).

The phase-space point (*q*, *p*) in unprimed variables corresponds to (*q*′, *p*′) in primed variables, at an arbitrary time *t*. Both *p* and *q* are determined given *q*′ and *p*′. In general, given any two of these four quantities, we can solve for the other two. If we can solve for the momenta in terms of the positions we get a particular class of generating functions.^{20} We introduce the functions

that solve the transformation equations (*t*, *q*, *p*) = *C*(*t*, *q*′, *p*′) for the momenta in terms of the coordinates at a specified time. With these we introduce a function *F*_{1}(*t*, *q*, *q*′) such that

The function *F*_{1} has the same value as *F* but has different arguments. We will show that this *F*_{1} is in fact the generating function for canonical transformations introduced in section 5.4. Let's be explicit about the definition of *F*_{1} in terms of a line integral:

The two line integrals can be combined into this one because they are both expressed as integrals along a curve in (*q*, *q*′).

We can use the path independence of *F*_{1} to compute the partial derivatives of *F*_{1} with respect to particular components and consequently derive the generating function relations for the momenta.^{21} So we conclude that

These are just the configuration and momentum parts of the generating function relations for canonical transformation. So starting with a canonical transformation, we can find a generating function that gives the coordinate–momentum part of the transformation through its derivatives.

Starting from a general canonical transformation, we have constructed an *F*_{1} generating function from which the canonical transformation may be rederived. So we expect there is a generating function for every canonical transformation.^{22}

Point transformations were excluded from the previous argument because we could not deduce the momenta from the coordinates. However, a similar derivation allows us to make a generating function for this case. The integral invariants give us an equality of area integrals. There are other ways of writing the equality-of-areas relation (5.90) as a line integral. We can also write

The minus sign arises because by flipping the axes we are traversing the area in the opposite sense. Repeating the argument just given, we can define a function

that is independent of the path *γ*′. If we can solve for *q*′ and *p* in terms of *q* and *p*′ we can define the functions

and define

Then the canonical transformation is given as partial derivatives of *F*_{2}:

and

For canonical transformations that can be described by both an *F*_{1} and an *F*_{2}, there must be a relation between them. The alternative line integral expressions for the area integral are related. Consider the difference

The functions *F* and *F*′ are related by an integrated term

as are *F*_{1} and *F*_{2}:

The generating functions *F*_{1} and *F*_{2} are related by a Legendre transform:

We have passive variables *q* and *t*:

But *p* = ∂_{1}*F*_{1}(*t*, *q*, *q*′) from the first transformation, so

Furthermore, since *H*′(*t*, *q*′, *p*′) − *H*(*t*, *q*, *p*) = ∂_{0}*F*_{1}(*t*, *q*, *q*′) we can conclude that

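We have used generating functions of the form *F*_{1}(*t*, *q*, *q*′) to construct canonical transformations:

*p* = ∂_{1}*F*_{1}(*t*, *q*, *q*′)

*p*′ = −∂_{2}*F*_{1}(*t*, *q*, *q*′)

*H*′(*t*, *q*′, *p*′) − *H*(*t*, *q*, *p*) = ∂_{0}*F*_{1}(*t*, *q*, *q*′)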

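We can also construct canonical transformations with generating functions of the form *F*_{2}(*t*, *q*, *p*′), where the third argument of *F*_{2} is the momentum in the primed system.^{23}

*p* = ∂_{1}*F*_{2}(*t*, *q*, *p*′)

*q*′ = ∂_{2}*F*_{2}(*t*, *q*, *p*′)

*H*′(*t*, *q*′, *p*′) − *H*(*t*, *q*, *p*) = ∂_{0}*F*_{2}(*t*, *q*, *p*′)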

As in the *F*_{1} case, to put the transformation in explicit form requires that appropriate inverse functions be constructed to allow the solution of the equations.

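Similarly, we can construct two other forms for generating functions, named mnemonically enough *F*_{3} and *F*_{4}:

*q* = −∂_{1}*F*_{3}(*t*, *p*, *q*′)

*p*′ = −∂_{2}*F*_{3}(*t*, *p*, *q*′)

*H*′(*t*, *q*′, *p*′) − *H*(*t*, *q*, *p*) = ∂_{0}*F*_{3}(*t*, *p*, *q*′)

and

*q* = −∂_{1}*F*_{4}(*t*, *p*, *p*′)

*q*′ = ∂_{2}*F*_{4}(*t*, *p*, *p*′)

*H*′(*t*, *q*′, *p*′) − *H*(*t*, *q*, *p*) = ∂_{0}*F*_{4}(*t*, *p*, *p*′)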

These four classes of generating functions are called *mixed-variable generating functions* because the canonical transformations they generate give a mixture of old and new variables in terms of a mixture of old and new variables.

In every case, if the generating function does not depend explicitly on time then the Hamiltonians are obtained from one another purely by composition with the appropriate canonical transformation. If the generating function depends on time, then there are additional terms.
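A minimal numerical sketch of how an *F*_{1}-type generating function specifies a transformation may help here. It assumes the relations *p* = ∂_{1}*F*_{1} and *p*′ = −∂_{2}*F*_{1}, and uses an illustrative time-independent choice, *F*_{1}(*q*, *q*′) = ½ *q*² cot *q*′, which is not taken from the text. All function names are hypothetical. The sketch checks that the relations are solved by *q* = √(2*p*′) sin *q*′, *p* = √(2*p*′) cos *q*′, that this map preserves phase-space area, and that the oscillator Hamiltonian (*p*² + *q*²)/2 becomes *p*′ by composition alone.

```python
import math

def F1(q, q_new):
    """Illustrative generating function F1(q, q') = (1/2) q^2 cot(q')."""
    return 0.5 * q * q / math.tan(q_new)

def diff(f, x, h=1e-6):
    """Central-difference estimate of the derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def momenta_from_F1(q, q_new):
    """Generating-function relations: p = d1 F1 and p' = -d2 F1."""
    p = diff(lambda a: F1(a, q_new), q)
    p_new = -diff(lambda b: F1(q, b), q_new)
    return p, p_new

def old_from_new(q_new, p_new):
    """The same relations solved explicitly for (q, p) given (q', p')."""
    amplitude = math.sqrt(2.0 * p_new)
    return amplitude * math.sin(q_new), amplitude * math.cos(q_new)

def jacobian_determinant(q_new, p_new):
    """det d(q, p)/d(q', p'); it equals 1 when area is preserved."""
    dq_dqn = diff(lambda a: old_from_new(a, p_new)[0], q_new)
    dq_dpn = diff(lambda b: old_from_new(q_new, b)[0], p_new)
    dp_dqn = diff(lambda a: old_from_new(a, p_new)[1], q_new)
    dp_dpn = diff(lambda b: old_from_new(q_new, b)[1], p_new)
    return dq_dqn * dp_dpn - dq_dpn * dp_dqn

def H_oscillator(q, p):
    """Unit-mass, unit-frequency harmonic oscillator (illustrative)."""
    return 0.5 * (p * p + q * q)

def H_new(q_new, p_new):
    """F1 has no time dependence, so H' is H composed with the map."""
    return H_oscillator(*old_from_new(q_new, p_new))
```

Because this *F*_{1} does not depend on time, no extra term is added to the Hamiltonian; a time-dependent generating function would contribute its derivative with respect to the time argument.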

The generating functions presented each treat the coordinates and momenta collectively. One could define more complicated generating functions for which the transformations of different degrees of freedom are specified by generating functions of different types.

Point transformations can be represented in terms of a generating function of type *F*_{2}. Equations (5.6), which define a canonical point transformation derived from a coordinate transformation *F*, are

Let *S* be the inverse transformation of *F* with respect to the second argument

so that *q*′ = *S*(*t*, *F* (*t*, *q*′)). The momentum transformation that accompanies this coordinate transformation is

We can find the generating function *F*_{2} that gives this transformation by integrating equation (5.152) to get

Substituting this into equation (5.151), we get

We do not need the freedom provided by *φ*, so we can set it equal to zero:

with

So this *F*_{2} gives the canonical transformation of equations (5.161) and (5.162).

The canonical transformation for the coordinate transformation *S* is the inverse of the canonical transformation for *F*. By design *F* and *S* are inverses on the coordinate arguments. The identity function is *q*′ = *I*(*q*′) = *S*(*t*, *F*(*t*, *q*′)). Differentiating yields

so

Using this, the relation between the momenta (5.166) is

showing that *F*_{2} gives a point transformation equivalent to the point transformation (5.160). So from this other point of view the point transformation is canonical.

The *F*_{1} that corresponds to the *F*_{2} for a point transformation is

*F*_{1}(*t*, *q*, *q*′) = *F*_{2}(*t*, *q*, *p*′) − *q*′*p*′ = *p*′*S*(*t*, *q*) − *q*′*p*′ = 0

This is why we could not use generating functions of type *F*_{1} to construct point transformations.
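The point-transformation construction can be sketched numerically. The sketch below assumes the generating function has the form *F*_{2}(*t*, *q*, *p*′) = *p*′*S*(*t*, *q*) with *φ* = 0, and that *F*_{1} and *F*_{2} are related by *F*_{1} = *F*_{2} − *q*′*p*′. The coordinate map `S` and every function name are hypothetical choices for illustration only.

```python
import math

def S(t, q):
    """Hypothetical invertible coordinate map q' = S(t, q)."""
    return math.sinh(q) + 0.5 * t

def F2(t, q, p_new):
    """Assumed point-transformation generator F2(t, q, p') = p' S(t, q)."""
    return p_new * S(t, q)

def diff(f, x, h=1e-6):
    """Central-difference estimate of the derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def transform(t, q, p_new):
    """Return (q', p) from q' = d2 F2 and p = d1 F2."""
    q_new = diff(lambda b: F2(t, q, b), p_new)
    p_old = diff(lambda a: F2(t, a, p_new), q)
    return q_new, p_old

def hamiltonian_shift(t, q, p_new):
    """H' - H = d0 F2, nonzero here because S depends on time."""
    return diff(lambda s: F2(s, q, p_new), t)

def F1_value(t, q, p_new):
    """F1 = F2 - q' p' evaluated on the transformation generated by F2."""
    q_new, _ = transform(t, q, p_new)
    return F2(t, q, p_new) - q_new * p_new
```

The value returned by `F1_value` vanishes for every input, which is the numerical counterpart of the statement that an *F*_{1} carries no information about a point transformation.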

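A commonly required point transformation is the transition between polar coordinates and rectangular coordinates:

*x* = *r* cos *θ*

*y* = *r* sin *θ*

Using the formula for the generating function of a point transformation just derived, we find:

*F*_{2}(*t*; *r*, *θ*; *p*_{x}, *p*_{y}) = *p*_{x} *r* cos *θ* + *p*_{y} *r* sin *θ*

So the full transformation is derived:

*x* = *r* cos *θ*

*y* = *r* sin *θ*

*p*_{r} = *p*_{x} cos *θ* + *p*_{y} sin *θ*

*p*_{θ} = −*p*_{x} *r* sin *θ* + *p*_{y} *r* cos *θ*

We can isolate the rectangular coordinates to one side of the transformation and the polar coordinates to the other:

*x* = *r* cos *θ*

*y* = *r* sin *θ*

*p*_{x} = *p*_{r} cos *θ* − (*p*_{θ}/*r*) sin *θ*

*p*_{y} = *p*_{r} sin *θ* + (*p*_{θ}/*r*) cos *θ*

So, interpreted in terms of Newtonian vectors, *p*_{r} is the component of the linear momentum along the radial direction, and *p*_{θ} = *x* *p*_{y} − *y* *p*_{x} is the component of the angular momentum normal to the plane of motion.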

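A useful time-dependent point transformation is the transition to a rotating coordinate system. This is most easily accomplished in polar coordinates. Here we have

*r*′ = *r*

*θ*′ = *θ* − Ω*t*

where Ω is the angular velocity of the rotating coordinate system. The generating function is

*F*_{2}(*t*; *r*, *θ*; *p*′_{r}, *p*′_{θ}) = *p*′_{r} *r* + *p*′_{θ} (*θ* − Ω*t*)

This yields the transformation equations

*r*′ = *r*

*θ*′ = *θ* − Ω*t*

*p*_{r} = *p*′_{r}

*p*_{θ} = *p*′_{θ}

which show that the momenta are the same in both coordinate systems. However, here the Hamiltonian is not a simple composition:

*H*′(*t*; *r*′, *θ*′; *p*′_{r}, *p*′_{θ}) = *H*(*t*; *r*′, *θ*′ + Ω*t*; *p*′_{r}, *p*′_{θ}) − Ω*p*′_{θ}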

The Hamiltonians differ by the derivative of the generating function with respect to the time argument. In transforming to rotating coordinates, the values of the Hamiltonians differ by the product of the angular momentum and the angular velocity of the coordinate system. Notice that this addition to the Hamiltonian is the same as was found earlier (5.45).
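The rotating-frame transformation can be checked with a short numerical sketch. It assumes the generating function *F*_{2}(*t*; *r*, *θ*; *p*′_{r}, *p*′_{θ}) = *p*′_{r} *r* + *p*′_{θ}(*θ* − Ω*t*) and the *F*_{2} relations *p* = ∂_{1}*F*_{2}, *q*′ = ∂_{2}*F*_{2}, *H*′ − *H* = ∂_{0}*F*_{2}. The mass, the value of Ω, the potential ½*r*², and all names are illustrative assumptions.

```python
OMEGA = 0.4   # assumed angular velocity of the rotating frame
MASS = 2.0    # assumed particle mass

def F2(t, r, theta, pr_new, ptheta_new):
    """Assumed generator of the transition to rotating polar coordinates."""
    return pr_new * r + ptheta_new * (theta - OMEGA * t)

def partial(f, args, i, h=1e-4):
    """Central-difference partial derivative of f in argument slot i."""
    up, dn = list(args), list(args)
    up[i] += h
    dn[i] -= h
    return (f(*up) - f(*dn)) / (2.0 * h)

def H(t, r, theta, pr, ptheta):
    """Planar particle in polar coordinates, illustrative V(r) = r^2 / 2."""
    return (pr * pr + (ptheta / r) ** 2) / (2.0 * MASS) + 0.5 * r * r

def H_rotating(t, r_new, theta_new, pr_new, ptheta_new):
    """H' = H composed with the transformation, plus d0 F2."""
    r, theta = r_new, theta_new + OMEGA * t
    dF2_dt = partial(F2, (t, r, theta, pr_new, ptheta_new), 0)
    return H(t, r, theta, pr_new, ptheta_new) + dF2_dt
```

Since *F*_{2} is linear in each argument, the finite differences are exact up to roundoff, and the difference `H_rotating - H` reproduces −Ω*p*_{θ}, the product of the angular momentum and the angular velocity of the frame.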

In this example we illustrate how canonical transformations can be used to eliminate some of the degrees of freedom, leaving a problem with fewer degrees of freedom.

Suppose that only certain combinations of the coordinates appear in the Hamiltonian. We make a canonical transformation to a new set of phase-space coordinates such that these combinations of the old phase-space coordinates are some of the new phase-space coordinates. We choose other independent combinations of the coordinates to complete the set. The advantage is that these other independent coordinates do not appear in the new Hamiltonian, so the momenta conjugate to them are conserved quantities.

Let's see how this idea enables us to reduce the problem of two gravitating bodies to the simpler problem of the relative motion of the two bodies. In the process we will discover that the momentum of the center of mass is conserved. This simpler problem is an instance of the Kepler problem. The Kepler problem is also encountered in the formulation of the more general *n*-body problem.

Consider the motion of two masses *m*_{1} and *m*_{2}, subject only to a mutual gravitational attraction described by the potential *V* (*r*). This problem has six degrees of freedom. The rectangular coordinates of the particles are *x*_{1} and *x*_{2}, with conjugate momenta *p*_{1} and *p*_{2}. Each of these is a structure of the three rectangular components. The distance between the particles is *r* = ‖*x*_{1} − *x*_{2}‖. The Hamiltonian for the two-body problem is

The gravitational potential energy depends only on the relative positions of the two bodies. We do not need to specify *V* further at this point.

Since the only combination of coordinates that appears in the Hamiltonian is *x*_{2} − *x*_{1}, we choose new coordinates so that one of the new coordinates is this combination:

To complete the set of new coordinates we choose another to be some independent linear combination

where *a* and *b* are to be determined. We can use an *F*_{2}-type generating function

where *p* and *P* will be the new momenta conjugate to *x* and *X*, respectively. We deduce

We can solve these for the new momenta:

The generating function is not time dependent, so the new Hamiltonian is the old Hamiltonian composed with the transformation:

with the definitions

and

We recognize *m* as the “reduced mass.”

Notice that if the term proportional to *pP* were not present then the *x* and *X* degrees of freedom would not be coupled at all, and furthermore, the *X* part of the Hamiltonian would be just the Hamiltonian of a free particle, which is easy to solve. The condition that the “cross terms” disappear is

which is satisfied by

for any *c*. For a transformation to be defined, *c* must be nonzero. So with this choice the Hamiltonian becomes

with

and

The reduced mass is the same as before, and now

Notice that, without further specifying *c*, the problem has been separated into the problem of determining the relative motion of the two masses, and the problem of the other degrees of freedom. We did not need a priori knowledge that the center of mass might be important; in fact, only for a particular choice of *c* = (*m*_{1} + *m*_{2})^{−1} does *X* become the center of mass.
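The algebra of this reduction can be sketched with exact rational arithmetic. The sketch assumes the generating function *F*_{2} = (*x*_{2} − *x*_{1})*p* + (*a* *x*_{1} + *b* *x*_{2})*P*, so that *p*_{1} = −*p* + *aP* and *p*_{2} = *p* + *bP*, and it treats a single rectangular component of each momentum. The function names are hypothetical.

```python
from fractions import Fraction

def old_momenta(p, P, a, b):
    """p1 = dF2/dx1 and p2 = dF2/dx2 for the assumed generating function."""
    return -p + a * P, p + b * P

def kinetic_energy_old(p1, p2, m1, m2):
    """Kinetic part of the two-body Hamiltonian (one component shown)."""
    return p1 * p1 / (2 * m1) + p2 * p2 / (2 * m2)

def kinetic_coefficients(m1, m2, a, b):
    """Coefficients of p^2, P^2 and p*P after substituting old_momenta."""
    m1, m2, a, b = (Fraction(v) for v in (m1, m2, a, b))
    coeff_p2 = 1 / (2 * m1) + 1 / (2 * m2)          # equals 1/(2m)
    coeff_P2 = a * a / (2 * m1) + b * b / (2 * m2)  # equals 1/(2M)
    coeff_pP = b / m2 - a / m1                      # the cross term
    return coeff_p2, coeff_P2, coeff_pP

def decoupling_choice(m1, m2, c):
    """a = c m1, b = c m2 makes the cross term vanish for any nonzero c."""
    return Fraction(c) * m1, Fraction(c) * m2
```

With `c` equal to 1/(*m*_{1} + *m*_{2}) the coefficient of *P*² corresponds to *M* = *m*_{1} + *m*_{2}, which is the center-of-mass case mentioned above; other nonzero values of `c` decouple the problem equally well but rescale *X* and *M*.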

It is often useful to compose a sequence of canonical transformations to make up the transformation we need for any particular mechanical problem. The transformations we have supplied are especially useful as components in these computations.

We will illustrate the use of canonical transformations to learn about planar motion in a central field. The strategy will be to consider perturbations of circular motion in the central field. The analysis will proceed by transforming to a rotating coordinate system that rides on a circular reference orbit, and then making approximations that restrict the analysis to orbits that differ from the circular orbit only slightly.

In rectangular coordinates we can easily write a Hamiltonian for the motion of a particle of mass *m* in a field defined by a potential energy that is a function only of the distance from the origin as follows:

In this coordinate system Hamilton's equations are easy, and they are exactly what is needed to develop trajectories by numerical integration, but the expressions are not very illuminating:

We can learn more by converting to polar coordinates centered on the source of our field:

This coordinate system explicitly incorporates the geometrical symmetry of the potential energy. Extending this coordinate transformation to a point transformation, we can write the new Hamiltonian as:

We can now write Hamilton's equations in these new coordinates, and they are much more illuminating than the equations expressed in rectangular coordinates:

The angular momentum *p _{φ}* is conserved, and we are free to choose its constant value, so

It is instructive to consider how orbits that are close to the circular orbit differ from the circular orbit. This is best done in rotating coordinates in which a body moving in the circular orbit is a stationary point at the origin. We can do this by converting to coordinates that are rotating with the circular orbit and centered on the orbiting body. We proceed in three stages. First we will transform to a polar coordinate system that is rotating at angular velocity Ω. Then we will return to rectangular coordinates, and finally, we will shift the coordinates so that the origin is on the reference circular orbit.

We start by examining the system in rotating polar coordinates. This is a time-dependent coordinate transformation:

Using equation (5.178), we can write the new Hamiltonian directly:

*H*″ is not time dependent, and therefore it is conserved. It is not the sum of the potential energy and the kinetic energy. Energy is not conserved in the moving coordinate system, but what is conserved here is a new quantity, the *Jacobi constant*, that combines the energy with the product of the angular momentum of the particle in the new coordinate and the angular velocity of the coordinate system. We will want to keep track of this term.

Next, we return to rectangular coordinates, but they are rotating with the reference circular orbit:

The Hamiltonian is

With one more quick manipulation we shift the coordinate system so that the origin is out on our circular orbit. We define new rectangular coordinates *ξ* and *η* with the following simple canonical transformation of coordinates and momenta:

In this final coordinate system the Hamiltonian is

and Hamilton's equations are uselessly complicated, but the next step is to consider only trajectories for which the coordinates *ξ* and *η* are small compared with *R*_{0}. Under this assumption we will be able to construct approximate equations of motion for these trajectories that are linear in the coordinates, thus yielding simple analyzable motion. To this point we have made no approximations. The equations above are perfectly accurate for any trajectories in a central field.

The idea is to expand the potential-energy term in the Hamiltonian as a series and to discard any term higher than second-order in the coordinates, thus giving us first-order-accurate Hamilton's equations:

So the (negated) generalized forces are

With this expansion we obtain the linearized Hamilton's equations:

Of course, once we have linear equations we know how to solve them exactly. Because the linearized Hamiltonian is conserved we cannot get exponential expansion or collapse, so the possible solutions are quite limited. It is instructive to convert these equations into a second-order system. We use Ω^{2} = *DV*(*R*_{0})/(*mR*_{0}), equation (5.207), to eliminate the *DV* terms:

Combining these, we find

where

Thus we have a simple harmonic oscillator with frequency *ω* as one of the components of the solution. The general solution has three parts:

where

The constants *η*_{0}, *ξ*_{0}, *C*_{0}, and *φ*_{0} are determined by the initial conditions. If *C*_{0} = 0, the particle of interest is on a circular trajectory, but not necessarily the same one as the reference trajectory. If *C*_{0} = 0 and *ξ*_{0} = 0, we have a “fellow traveler,” a particle in the same circular orbit as the reference orbit but with different phase. If *C*_{0} = 0 and *η*_{0} = 0, we have a particle in a circular orbit that is interior or exterior to the reference orbit and shearing away from the reference orbit. The shearing is due to the fact that the angular velocity for a circular orbit varies with the radius. The constant *A* gives the rate of shearing at each radius. If both *η*_{0} = 0 and *ξ*_{0} = 0 but *C*_{0} ≠ 0, then we have “epicyclic motion.” A particle in a nearly circular orbit may be seen to move in an ellipse around the circular reference orbit. The ellipse will be elongated in the direction of circular motion by the factor 2Ω/*ω*, and it will rotate in the direction opposite to the direction of the circular motion. The initial phase of the epicycle is *φ*_{0}. Of course, any combination of these solutions may exist.

The epicyclic frequency *ω* and the shearing rate *A* are determined by the force law (the radial derivative of the potential energy). For a force law proportional to a power of the radius,

the epicyclic frequency is related to the orbital frequency by

and the shearing rate is

For a few particular integer force laws we see:

We can get some insight into the kinds of orbits produced by the epicyclic approximation by looking at a few examples. For some force laws we have integer ratios of epicyclic frequency to orbital frequency. In those cases we have closed orbits. For an inverse-square force law (*n* = 3) we get elliptical orbits with the center of the field at a focus of the ellipse. Figure 5.3 shows how an approximation to such an orbit can be constructed by superposition of the motion on an elliptical epicycle with the motion of the same frequency on a circle. If the force is proportional to the radius (*n* = 0) we get a two-dimensional harmonic oscillator. Here the epicyclic frequency is twice the orbital frequency. Figure 5.4 shows how this yields elliptical orbits that are centered on the source of the central force. An orbit is closed when *ω*/Ω is a rational fraction. If the force is proportional to the −3/4 power of the radius, the epicyclic frequency is 3/2 the orbital frequency. This yields the three-lobed pattern seen in figure 5.5. For other force laws the orbits predicted by this analysis are multi-lobed patterns produced by precessing approximate ellipses. Most of the cases have incommensurate epicyclic and orbital frequencies, leading to orbits that do not close in finite time.
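The frequency ratios quoted here can be reproduced with a small sketch. It assumes the force-law convention implied by the examples (radial force of magnitude proportional to *r*^{1−*n*}, so that *n* = 3 is inverse-square and *n* = 0 is linear), the relation Ω² = *DV*(*R*_{0})/(*mR*_{0}), and the standard epicyclic result *ω*² = *D*²*V*(*R*_{0})/*m* + 3Ω². The function name and the default parameter values are illustrative.

```python
import math

def epicyclic_ratio(n, R0=1.3, k=0.7, m=2.0, h=1e-5):
    """Return omega/Omega for DV(r) = k r^(1 - n), valid for n < 4."""
    def DV(r):
        # Radial derivative of the potential energy (attractive force law).
        return k * r ** (1.0 - n)

    D2V = (DV(R0 + h) - DV(R0 - h)) / (2.0 * h)  # second derivative of V
    Omega2 = DV(R0) / (m * R0)                   # circular-orbit frequency^2
    omega2 = D2V / m + 3.0 * Omega2              # assumed epicyclic relation
    return math.sqrt(omega2 / Omega2)

def closed_form_ratio(n):
    """For a power law the same quantities reduce to sqrt(4 - n)."""
    return math.sqrt(4.0 - n)
```

Under these assumptions the ratio is independent of *R*_{0}, *k*, and *m*, and it is rational only for special exponents, which is why most force laws give rosettes rather than closed curves.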

The epicyclic approximation gives a very good idea of what actual orbits look like. Figure 5.6, drawn by numerically integrating the original rectangular equations of motion for a particle in the field, shows the rosette-type picture characteristic of incommensurate epicyclic and orbital frequencies for an *F* = −*r*^{−2.3} force law.

We can directly compare a numerically integrated system with one of our epicyclic approximations. For example, the result of numerically integrating our *F* ∝ *r*^{−3/4} system is very similar to the picture we obtained by epicycles. (See figure 5.7 and compare it with figure 5.5.)

**Exercise 5.11: Collapsing orbits**

What exactly happens as the force law becomes steeper? Investigate this by sketching the contours of the Hamiltonian in *r*, *p _{r}* space for various values of the force-law exponent,

The addition of a total time derivative to a Lagrangian leads to the same Lagrange equations. However, the two Lagrangians have different momenta, and they lead to different Hamilton's equations. Here we find out how to represent the corresponding canonical transformation with a generating function.

Let's restate the result about total time derivatives and Lagrangians from the first chapter. Consider some function *G*(*t*, *q*) of time and coordinates. We have shown that if *L* and *L*′ are related by

then the Lagrange equations of motion are the same. The generalized coordinates used in the two Lagrangians are the same, but the momenta conjugate to the coordinates are different. In the usual way, define

and

So we have

Evaluated on a trajectory, we have

This transformation is a special case of an *F*_{2}-type transformation. Let

then the associated transformation is

Explicitly, the new Hamiltonian is

where we have used the fact that *q* = *q*′. The transformation is interesting in that the coordinate transformation is the identity transformation, but the new and old momenta are not the same, even in the case in which *G* has no explicit time dependence. Suppose we have a Hamiltonian of the form

then the transformed Hamiltonian is

We see that this transformation may be used to modify terms in the Hamiltonian that are linear in the momenta. Starting from *H*, the transformation introduces linear momentum terms; starting from *H*′, the transformation eliminates the linear terms.

We illustrate the use of this transformation with the driven pendulum. The Hamiltonian for the driven pendulum derived from the *T* − *V* Lagrangian (see section 1.6.2) is

where *y _{s}* is the drive function. The Hamiltonian is rather messy, and includes a term that is linear in the angular momentum with a coefficient that depends on both the angular coordinate and the time. Let's see what happens if we apply our transformation to the problem to eliminate the linear term. We can identify the transformation function

The transformed momentum is

and the transformed Hamiltonian is

Dropping the last two terms, which do not affect the equations of motion, we find

So we have found, by a straightforward canonical transformation, a Hamiltonian for the driven pendulum with the rather simple form of a pendulum with gravitational acceleration that is modified by the acceleration of the pivot. It is, in fact, the Hamiltonian that corresponds to the alternative form of the Lagrangian for the driven pendulum that we found earlier by inspection (see equation 1.120). Here the derivation is by a simple canonical transformation, motivated by a desire to eliminate unwanted terms that are linear in the momentum.
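The elimination of the linear term can be checked numerically. The sketch below assumes the generating function *F*_{2}(*t*, *q*, *p*′) = *qp*′ − *G*(*t*, *q*), which gives *q*′ = *q*, *p* = *p*′ − ∂_{1}*G*, and *H*′ = *H* − ∂_{0}*G*. It also assumes a specific form of the *T* − *V* Hamiltonian (derived from a pendulum of length *l* whose pivot moves vertically as *y*_{s}(*t*)) and the choice *G*(*t*, *θ*) = *m l* *Dy*_{s}(*t*) cos *θ*. The drive function, parameter values, and names are illustrative.

```python
import math

M, L, G_ACC = 1.5, 0.8, 9.8   # assumed mass, length, gravity
AMP, W = 0.1, 3.0             # assumed drive amplitude and frequency

def ys(t):
    return AMP * math.cos(W * t)

def Dys(t):
    return -AMP * W * math.sin(W * t)

def D2ys(t):
    return -AMP * W * W * math.cos(W * t)

def H(t, theta, p):
    """Assumed T - V Hamiltonian, with a term linear in the momentum."""
    return (p * p / (2 * M * L * L)
            - (p / L) * math.sin(theta) * Dys(t)
            - 0.5 * M * (math.cos(theta) * Dys(t)) ** 2
            - M * G_ACC * L * math.cos(theta)
            + M * G_ACC * ys(t))

def H_transformed(t, theta, p_new):
    """H'(t, q', p') = H(t, q', p' - d1 G) - d0 G for G = m l Dys cos(theta)."""
    d1G = -M * L * Dys(t) * math.sin(theta)
    d0G = M * L * D2ys(t) * math.cos(theta)
    return H(t, theta, p_new - d1G) - d0G

def H_simple(t, theta, p_new):
    """Pendulum with gravity modified by the acceleration of the pivot."""
    return p_new ** 2 / (2 * M * L * L) - M * L * (G_ACC + D2ys(t)) * math.cos(theta)

def dropped_terms(t):
    """Terms depending on time only; they do not affect the equations of motion."""
    return -0.5 * M * Dys(t) ** 2 + M * G_ACC * ys(t)
```

The transformed Hamiltonian equals `H_simple` plus `dropped_terms`, so the linear momentum term is gone and the only remaining drive dependence in the dynamics is the shift of *g* by *D*²*y*_{s}.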

**Exercise 5.12: Construction of generating functions**

Suppose that canonical transformations

are generated by two *F*_{1}-type generating functions, *F*_{1a}(*t*, *q*, *q*′) and *F*_{1b}(*t*, *q*′, *q*″).

**a.** Show that the generating function for the inverse transformation of *C _{a}* is

**b.** Define a new kind of generating function,

*F _{x}*(

We see that

*p* = ∂_{1}*F*_{x}(*t*, *q*, *q*′, *q*″) = ∂_{1}*F*_{1a}(*t*, *q*, *q*′)

*p*″ = −∂_{3}*F _{x}*(

Show that ∂_{2}*F*_{x} = 0, allowing a solution to eliminate *q*′.

**c.** Using the formulas for *p* and *p*″ above, and the result from part **b**, show that *F*_{x} is an appropriate generating function for the composition transformation

**Exercise 5.13: Linear canonical transformations**

We consider systems with two degrees of freedom and transformations for which the Hamiltonian transforms by composition.

**a.** Consider the linear canonical transformations that are generated by

Show that these transformations are just the point transformations, and that the corresponding *F*_{1} is zero.

**b.** Other linear canonical transformations can be generated by

Surely we can make even more generators by constructing *F*_{3}- and *F*_{4}-type transformations analogously. Are all of the linear canonical transformations obtainable in this way? If not, show one that cannot be so generated.

**c.** Can all linear canonical transformations be generated by compositions of transformations generated by the functions shown in parts **a** and **b** above?

**d.** How many independent parameters are necessary to specify all possible linear canonical transformations for systems with two degrees of freedom?

**Exercise 5.14: Integral invariants**

Consider the linear canonical transformation for a system with two degrees of freedom generated by the function

and the general parallelogram with a vertex at the origin and with adjacent sides starting at the origin and extending to the phase-space points (*x*_{1a}, *x*_{2a}, *p*_{1a}, *p*_{2a}) and (*x*_{1b}, *x*_{2b}, *p*_{1b}, *p*_{2b}).

**a.** Find the area of the given parallelogram and the area of the target parallelogram under the canonical transformation. Notice that the area of the parallelogram is not preserved.

**b.** Find the areas of the projections of the given parallelogram and the areas of the projections of the target under canonical transformation. Show that the sum of the areas of the projections on the action-like planes is preserved.

**Exercise 5.15: Standard-map generating function**

Find a generating function for the standard map (see exercise 5.8 on page 357).

In this section we show that we can treat time as just another coordinate if we wish. Systems described by a time-dependent Hamiltonian may be recast in terms of a time-independent Hamiltonian with an extra degree of freedom. An advantage of this view is that what was a time-dependent canonical transformation can be treated as a time-independent transformation, where there are no additional conditions for adjusting the Hamiltonian.

Suppose that we have some system characterized by a time-dependent Hamiltonian, for example, a periodically driven pendulum. We may imagine that there is some extremely massive oscillator, unperturbed by the motion of the relatively massless pendulum, that produces the drive. Indeed, we may think of time itself as the coordinate of an infinitely massive particle moving uniformly and driving everything else. We often consider the rotation of the Earth as exactly such a stable time reference when performing short-time experiments in the laboratory.

More formally, consider a dynamical system with *n* degrees of freedom, whose behavior is described by a possibly time-dependent Lagrangian *L* with corresponding Hamiltonian *H*. We make a new dynamical system with *n* + 1 degrees of freedom by extending the generalized coordinates to include time and introducing a new independent variable. We also extend the generalized velocities to include a velocity for the time coordinate. In this new *extended state space* the coordinates are redundant, so there is a constraint relating the time coordinate to the new independent variable.

We relate the original dynamical system to the extended dynamical system as follows: Let *q* be a coordinate path. Let (*q _{e}*,

We can find a Lagrangian for the extended system by requiring that the value of the action be unchanged. Introduce the extended Lagrangian action

with

We have

The extended system is subject to a constraint that relates the time to the new independent variable. We assume the constraint is of the form *φ*(*τ*; *q _{e}*,

The Lagrange equations of *q _{e}* are satisfied for the paths

The momenta conjugate to the coordinates are

So the extended momenta have the same values as the original momenta at the corresponding states. The momentum conjugate to the time coordinate is the negation of the energy plus *v _{λ}*. The momentum conjugate to

Next we carry out the transformation to the corresponding Hamiltonian formulation. First, note that the Lagrangian *L _{e}* is a homogeneous form of degree one in the velocities. Thus, by Euler's theorem,

The

So the Hamiltonian

We have used the fact that at corresponding states the momenta have the same values, so on paths *p _{e}* =

The extended Hamiltonian does not depend on *λ*, so we deduce that *p*_{λ} is constant. In fact,

This extended Hamiltonian governs the evolution of the extended system, for arbitrary *f*.^{25}

Hamilton's equations reduce to

The second equation gives the required relation between *t* and *τ*. The first and third equations are equivalent to Hamilton's equations in the original coordinates, as we can see by using *q _{e}* =

Using *Dt*(*τ*) = *Df*(*τ*) and dividing these factors out, we recover Hamilton's equations.^{26}

Now consider the special case for which the time is the same as the independent variable: *f*(*τ*) = *τ*, *Df*(*τ*) = 1. In this case *q* = *q _{e}* and

Hamilton's equation for *t* becomes *Dt*(*τ*) = 1, restating the constraint. Hamilton's equations for *Dq _{e}* and

The extended Hamiltonian (5.274) does not depend on the independent variable, so it is a conserved quantity. Thus, up to an additive constant *p _{t}* is equal to minus the energy. The Hamilton's equation for

The extension transformation is canonical in the sense that the two sets of equations of motion describe equivalent dynamics. However, the transformation is not symplectic; in fact, it does not even have the same number of input and output variables.
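A numerical sketch of the special case *t* = *τ* may make the conservation statement concrete. It assumes the extended Hamiltonian takes the form *p*_{t} + *H*(*t*, *q*, *p*) in that case, and it uses an illustrative time-dependent oscillator for *H*. The integrator is a plain fourth-order Runge-Kutta step; all names and parameter values are hypothetical.

```python
import math

EPS = 0.3  # assumed strength of the time-dependent modulation

def H(t, q, p):
    """Illustrative time-dependent Hamiltonian."""
    return 0.5 * p * p + 0.5 * (1.0 + EPS * math.cos(t)) * q * q

def H_extended(state):
    """Assumed extended Hamiltonian for t = tau: p_t + H(t, q, p)."""
    q, t, p, p_t = state
    return p_t + H(t, q, p)

def rhs(state):
    """Hamilton's equations in the extended phase space (tau derivative)."""
    q, t, p, p_t = state
    dq = p                                        # dH_e/dp
    dt = 1.0                                      # dH_e/dp_t
    dp = -(1.0 + EPS * math.cos(t)) * q           # -dH_e/dq
    dp_t = 0.5 * EPS * math.sin(t) * q * q        # -dH_e/dt
    return (dq, dt, dp, dp_t)

def rk4_step(state, h):
    def shifted(s, k, f):
        return tuple(si + f * ki for si, ki in zip(s, k))
    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, h / 2))
    k3 = rhs(shifted(state, k2, h / 2))
    k4 = rhs(shifted(state, k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(state, tau_end, steps):
    h = tau_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state
```

Starting with *p*_{t} = −*H* makes the extended Hamiltonian zero, and it stays zero along the path to integrator accuracy, so −*p*_{t} tracks the changing energy of the original system.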

**Exercise 5.16: Homogeneous extended Lagrangian**

Verify that *L _{e}* is homogeneous of degree one in the velocities.

**Exercise 5.17: Lagrange equations**

**a.** Verify that the Lagrange equations for *q _{e}* are satisfied for exactly the same trajectories that satisfy the original Lagrange equations for

**b.** Verify that the Lagrange equation for *t* relates the rate of change of energy to ∂_{0}*L*.

**Exercise 5.18: Lorentz transformations**

Investigate Lorentz transformations as point transformations in the extended phase space.

An example that shows the utility of reformulating a problem in the extended phase space is the restricted three-body problem: the motion of a low-mass particle subject to the gravitational attraction of two other massive bodies that move in some fixed orbit. The problem is an idealization of the situation where a body with very small mass moves in the presence of two bodies with much larger masses. Any effects of the smaller body on the larger bodies are neglected. In the simplest version, the motion of all three bodies is assumed to be in the same plane, and the orbits of the two massive bodies are circular.

The motion of the bodies with larger masses is not influenced by the small mass, so we model this situation as the small body moving in a time-varying field of the larger bodies undergoing a prescribed motion. This situation can be captured as a time-dependent Hamiltonian:

where *r*_{1}(*t*) and *r*_{2}(*t*) are the distances of the small body to the larger bodies, *m* is the mass of the small body, and *m*_{1} and *m*_{2} are the masses of the larger bodies. Note that *r*_{1}(*t*) and *r*_{2}(*t*) are quantities that depend both on the position of the small particle and the time-varying position of the massive particles.

The massive bodies are in circular orbits and maintain constant distance from the center of mass. Let *a*_{1} and *a*_{2} be the distances to the center of mass; then the distances satisfy *m*_{1}*a*_{1} = *m*_{2}*a*_{2}. The angular frequency is Ω = √(*G*(*m*_{1} + *m*_{2})/*a*^{3}), where *G* is the gravitational constant and *a* is the distance between the masses.

In polar coordinates, with the center of mass of the subsystem of massive particles at the origin and with *r* and *θ* describing the position of the low-mass particle, the positions of the two massive bodies are *a*_{2} = *m*_{1}*a*/(*m*_{1}+*m*_{2}) with *θ*_{2} = Ω*t*, *a*_{1} = *m*_{2}*a*/(*m*_{1}+*m*_{2}) with *θ*_{1} = Ω*t* + *π*. The distances to the point masses are

In polar coordinates, the Hamiltonian is

The Hamiltonian can be written in terms of some function *f* such that

The essential feature is that *θ* and *t* appear in the Hamiltonian only in the combination *θ* − Ω*t*.

One way to get rid of the time dependence is to choose a new set of variables with one coordinate equal to this combination *θ* − Ω*t*, by making a point transformation to a rotating coordinate system. We have shown that

with

is a canonical transformation. The new Hamiltonian, which is not the energy, is conserved because there is no explicit time dependence. It is a useful conserved quantity—the Jacobi constant.^{27}
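The conservation of the Jacobi constant can be illustrated by integrating the time-dependent problem directly. The sketch below works in inertial rectangular coordinates, where the angular momentum is *x* *p*_{y} − *y* *p*_{x}, and monitors *H* − Ω*p*_{θ}. The units (*G* = 1), the masses, the initial state, and all names are assumptions chosen for illustration, and the Runge-Kutta integrator is a generic choice rather than anything prescribed by the text.

```python
import math

G = 1.0                 # gravitational constant in assumed units
M1, M2 = 0.99, 0.01     # assumed masses of the two massive bodies
A = 1.0                 # assumed separation of the massive bodies
M = 1.0                 # mass of the small body (it cancels from the motion)
OMEGA = math.sqrt(G * (M1 + M2) / A ** 3)
A1 = M2 * A / (M1 + M2)  # distance of body 1 from the center of mass
A2 = M1 * A / (M1 + M2)  # distance of body 2 from the center of mass

def massive_positions(t):
    """Body 2 at angle OMEGA t, body 1 at angle OMEGA t + pi."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return (-A1 * c, -A1 * s), (A2 * c, A2 * s)

def hamiltonian(t, x, y, px, py):
    (x1, y1), (x2, y2) = massive_positions(t)
    r1 = math.hypot(x - x1, y - y1)
    r2 = math.hypot(x - x2, y - y2)
    return (px * px + py * py) / (2 * M) - G * M * M1 / r1 - G * M * M2 / r2

def jacobi_constant(t, x, y, px, py):
    """Energy minus OMEGA times the angular momentum about the origin."""
    return hamiltonian(t, x, y, px, py) - OMEGA * (x * py - y * px)

def rhs(t, state):
    x, y, px, py = state
    (x1, y1), (x2, y2) = massive_positions(t)
    r1 = math.hypot(x - x1, y - y1)
    r2 = math.hypot(x - x2, y - y2)
    fx = -G * M * M1 * (x - x1) / r1 ** 3 - G * M * M2 * (x - x2) / r2 ** 3
    fy = -G * M * M1 * (y - y1) / r1 ** 3 - G * M * M2 * (y - y2) / r2 ** 3
    return (px / M, py / M, fx, fy)

def rk4_step(t, state, h):
    def shifted(s, k, f):
        return tuple(si + f * ki for si, ki in zip(s, k))
    k1 = rhs(t, state)
    k2 = rhs(t + h / 2, shifted(state, k1, h / 2))
    k3 = rhs(t + h / 2, shifted(state, k2, h / 2))
    k4 = rhs(t + h, shifted(state, k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(state, t0, t1, steps):
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        state = rk4_step(t, state, h)
        t += h
    return state
```

The energy `hamiltonian` is not expected to stay fixed along such a trajectory because the field is time dependent, while `jacobi_constant` should remain constant up to integration error.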

We can also eliminate the dependence on the independent time-like variable from the Hamiltonian for the restricted problem by going to the extended phase space, choosing *t* = *τ*. The Hamiltonian

is autonomous and is consequently a conserved quantity. Again, we see that *θ* and *t* occur only in the combination *θ* − Ω*t*, which suggests a point transformation to a new coordinate *θ*′ = *θ* − Ω*t*. This point transformation is independent of the new independent variable *τ*. The transformation is specified in equations (5.280–5.283), augmented by relations specifying how the time coordinate and its conjugate momentum are handled:

The new Hamiltonian is obtained by composing the old Hamiltonian with the transformation:

We recognize that the new Hamiltonian in the extended phase space, which has the same value as the original Hamiltonian in the extended phase space, is just the Jacobi constant plus *p*′_{t}, so

**Exercise 5.19: Transformations in the extended phase space**

In section 5.2.1 we found that time-dependent transformations for which the derivative of the coordinate–momentum part is symplectic are canonical only if the Hamiltonian is modified by adding a function *K* subject to certain constraints (equation 5.42). Show that the constraints on *K* follow from the symplectic condition in the extended phase space, using the choice *t* = *τ*.

The Poincaré invariant (section 5.3) is especially useful in the extended phase space with *t* = *τ*. In the extended phase space the extended Hamiltonian does not depend on the independent variable. In the extended phase space canonical transformations are symplectic and the Hamiltonian transforms by composition.

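For the special choice of *t* = *τ*, equation (5.90) can be rephrased in an interesting way. Let *E* be the value of the Hamiltonian in the original unextended phase space. Using *q^{n}* = *t* and *p_{n}* = −*E*, we can write

$$\sum_{i=0}^{n-1} \int_{R_i} dq^i\, dp_i - \int_{R_n} dt\, dE = \sum_{i=0}^{n-1} \int_{R'_i} dq'^i\, dp'_i - \int_{R'_n} dt'\, dE' \tag{5.289}$$

and

$$\oint_{\partial R} \Big( \sum_{i=0}^{n-1} p_i\, dq^i - E\, dt \Big) = \oint_{\partial R'} \Big( \sum_{i=0}^{n-1} p'_i\, dq'^i - E'\, dt' \Big). \tag{5.290}$$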

The relations (5.289) and (5.290) are two formulations of the *Poincaré–Cartan integral invariant*.

Suppose we have a system with *n*+1 degrees of freedom described by a time-independent Hamiltonian in a (2*n* + 2)-dimensional phase space. Here we can play the converse game: we can choose any generalized coordinate to play the role of “time” and the negation of its conjugate momentum to play the role of a new *n*-degree-of-freedom time-dependent Hamiltonian in a *reduced phase space* of 2*n* dimensions.
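As a small illustration before the general construction, consider a free particle of mass *m* moving in the plane. This example is not from the original text: the label $H_r$ for the reduced Hamiltonian and the restriction to the branch $p_x > 0$ are assumptions made for the sketch. Choosing *x* to play the role of time and −*p_x* to play the role of the Hamiltonian gives

```latex
\begin{align*}
H(t; y, x; p_y, p_x) &= \frac{p_x^2 + p_y^2}{2m} = E, \\
H_r(x; y; p_y) &= -p_x = -\sqrt{2mE - p_y^2}, \\
\frac{dy}{dx} &= \frac{\partial H_r}{\partial p_y}
              = \frac{p_y}{\sqrt{2mE - p_y^2}} = \frac{p_y}{p_x}, \\
\frac{dp_y}{dx} &= -\frac{\partial H_r}{\partial y} = 0.
\end{align*}
```

The reduced equations say that the path is a straight line of slope $p_y/p_x$, described as a function of *x*, with no reference to how fast it is traversed.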

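More precisely, let

$$q = (q^0, \ldots, q^n), \qquad p = (p_0, \ldots, p_n),$$

and suppose we have a system described by a time-independent Hamiltonian

$$H(t, q, p) = f(q, p).$$

For each solution path there is a conserved quantity *E*, the value of *f* along the path. Let's choose a coordinate *q^{n}* to be the time in a reduced phase space. We define the dynamical variables for the *n*-degree-of-freedom reduced phase space:

$$q_r = (q^0, \ldots, q^{n-1}), \qquad p_r = (p_0, \ldots, p_{n-1}).$$

In the original phase space a coordinate such as *q^{n}* maps time to a coordinate. In the formulation of the reduced phase space we will have to use the inverse function $\tau = (q^n)^{-1}$ to map the coordinate to the time, giving the new coordinates and momenta in terms of the new time,

$$q_r^i = q^i \circ \tau, \qquad p^r_i = p_i \circ \tau,$$

and thus

$$Dq_r^i = D(q^i \circ \tau) = (Dq^i \circ \tau)\, D\tau = \frac{Dq^i \circ \tau}{Dq^n \circ \tau}, \qquad Dp^r_i = \frac{Dp_i \circ \tau}{Dq^n \circ \tau}.$$

We propose that a Hamiltonian in the reduced phase space is the negative of the inverse of *f*(*q*^{0}, …, *q^{n}*; *p*_{0}, …, *p_{n}*) = *E* with respect to the *p_{n}* argument:

$$H_r(x, q_r, p_r) = -\big(\text{the } y \text{ such that } f(q_r, x;\, p_r, y) = E\big).$$

Note that in the reduced phase space we will have indices for the structured variables in the range 0 … *n*−1, whereas in the original phase space the indices are in the range 0 … *n*. We will show that *H_{r}* is an appropriate Hamiltonian for the given dynamical system in the reduced phase space. To compute Hamilton's equations we must expand the implicit definition of *H_{r}*. We define an auxiliary function

$$g(x, q_r, p_r) = f(q_r, x;\, p_r, -H_r(x, q_r, p_r)).$$

Note that *by construction* this function is identically a constant *g* = *E*. Thus all of its partial derivatives are zero:

$$\partial_0 g = (\partial_0 f)_n - (\partial_1 f)^n\, \partial_0 H_r = 0,$$
$$(\partial_1 g)_i = (\partial_0 f)_i - (\partial_1 f)^n\, (\partial_1 H_r)_i = 0,$$
$$(\partial_2 g)^i = (\partial_1 f)^i - (\partial_1 f)^n\, (\partial_2 H_r)^i = 0,$$

where we have suppressed the arguments. Solving for partials of *H_{r}*, we get

$$(\partial_1 H_r)_i = \frac{(\partial_0 f)_i}{(\partial_1 f)^n} = \frac{(\partial_1 H)_i}{(\partial_2 H)^n}, \qquad (\partial_2 H_r)^i = \frac{(\partial_1 f)^i}{(\partial_1 f)^n} = \frac{(\partial_2 H)^i}{(\partial_2 H)^n}.$$

Using these relations, we can deduce Hamilton's equations in the reduced phase space from Hamilton's equations in the original phase space:

$$Dq_r^i(x) = \frac{Dq^i(\tau(x))}{Dq^n(\tau(x))} = \frac{(\partial_2 H)^i}{(\partial_2 H)^n} = (\partial_2 H_r)^i(x, q_r(x), p_r(x)),$$
$$Dp^r_i(x) = \frac{Dp_i(\tau(x))}{Dq^n(\tau(x))} = -\frac{(\partial_1 H)_i}{(\partial_2 H)^n} = -(\partial_1 H_r)_i(x, q_r(x), p_r(x)).$$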

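Consider planar motion in a central field. We have already seen this expressed in polar coordinates in equation (3.100):

$$H(t; r, \varphi; p_r, p_\varphi) = \frac{p_r^2}{2m} + \frac{p_\varphi^2}{2mr^2} + V(r).$$

There are two degrees of freedom and the Hamiltonian is time independent. Thus the energy, the value of the Hamiltonian, is conserved on realizable paths. Let's forget about time and reparameterize this system in terms of the orbital radius *r*.^{28} To do this we solve

$$H(t; r, \varphi; p_r, p_\varphi) = E$$

for *p_{r}*, obtaining

$$H_r(r; \varphi; p_\varphi) = -p_r = -\left(2m(E - V(r)) - \frac{p_\varphi^2}{r^2}\right)^{1/2},$$

which is the Hamiltonian in the reduced phase space.

Hamilton's equations are now quite simple:

$$\frac{d\varphi}{dr} = \frac{\partial H_r}{\partial p_\varphi} = \frac{p_\varphi}{r^2}\left(2m(E - V(r)) - \frac{p_\varphi^2}{r^2}\right)^{-1/2}, \qquad \frac{dp_\varphi}{dr} = -\frac{\partial H_r}{\partial \varphi} = 0.$$

The momentum *p_{φ}* is independent of *r* (as it was of *t*), so for any particular orbit we may define a constant angular momentum *L*. The problem then reduces to a quadrature:

$$\varphi(r) = \int^r \frac{L}{r^2}\left(2m(E - V(r)) - \frac{L^2}{r^2}\right)^{-1/2} dr + \varphi_0.$$

To see the utility of this procedure, we continue our example with a definite potential energy, that of a gravitating point mass:

$$V(r) = -\frac{GMm}{r}.$$

When we substitute this into equation (5.307) we obtain a mess that can be simplified to

$$\varphi(r) = L \int^r \frac{dr}{r\sqrt{2mEr^2 + 2GMm^2 r - L^2}} + \varphi_0.$$

Integrating this, we obtain another mess, which can be simplified and rearranged to obtain the following:

$$\frac{1}{r} = \frac{GMm^2}{L^2}\left(1 - \sqrt{1 + \frac{2EL^2}{G^2M^2m^3}}\, \sin(\varphi(r) - \varphi_0)\right).$$

This can be recognized as the polar-coordinate form of the equation of a conic section with eccentricity *e* and parameter *p*:

$$\frac{1}{r} = \frac{1 + e\cos\theta}{p},$$

where

$$e = \sqrt{1 + \frac{2EL^2}{G^2M^2m^3}}, \qquad p = \frac{L^2}{GMm^2}, \qquad \theta = \varphi_0 - \varphi(r) - \frac{\pi}{2}.$$

In fact, if the orbit is an ellipse with semimajor axis *a*, we have

$$p = a(1 - e^2),$$

and so we can identify the role of energy and angular momentum in shaping the ellipse:

$$E = -\frac{GMm}{2a}, \qquad L = m\sqrt{GMa(1 - e^2)}.$$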

What we get from analysis in the reduced phase space is the geometry of the trajectory, but we lose the time-domain behavior. The reduction is often worth the price.
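The agreement between the reduced Hamilton's equation and the conic section can be spot-checked numerically. The following is a minimal sketch, assuming the inverse-square potential *V*(*r*) = −*GMm*/*r* and the standard relations *E* = −*GMm*/(2*a*), *L* = *m*√(*GMa*(1 − *e*²)) for an ellipse; the unit values and the names `GM`, `MASS`, `A_SEMI`, `ECC` are illustrative assumptions. It compares a finite-difference slope of the conic's *φ*(*r*) with the right-hand side *dφ*/*dr* = ∂*H_r*/∂*p_φ*.

```python
import math

GM, MASS = 1.0, 1.0        # gravitational parameter and particle mass (illustrative units)
A_SEMI, ECC = 1.0, 0.5     # semimajor axis and eccentricity chosen for the check

E = -GM * MASS / (2 * A_SEMI)                       # orbital energy
L = MASS * math.sqrt(GM * A_SEMI * (1 - ECC ** 2))  # angular momentum
P = L ** 2 / (GM * MASS ** 2)                       # conic parameter

def dphi_dr(r):
    """Reduced Hamilton's equation: dphi/dr = dH_r/dp_phi, evaluated at p_phi = L."""
    return (L / r ** 2) / math.sqrt(2 * MASS * (E + GM * MASS / r) - L ** 2 / r ** 2)

def phi_conic(r, phi0=0.0):
    """phi(r) solved from 1/r = (1 - e sin(phi - phi0)) / p, on the outgoing branch."""
    return phi0 + math.asin((1 - P / r) / ECC)

def max_relative_mismatch(h=1e-6):
    """Compare a central-difference slope of the conic with dphi_dr at sample radii."""
    worst = 0.0
    for k in range(9):
        r = 0.6 + 0.1 * k  # radii strictly between periapsis 0.5 and apoapsis 1.5
        slope = (phi_conic(r + h) - phi_conic(r - h)) / (2 * h)
        worst = max(worst, abs(slope - dphi_dr(r)) / dphi_dr(r))
    return worst
```

The check only covers the branch on which *p_r* > 0 (radius increasing); the incoming half of the orbit corresponds to the other sign of the square root.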

Although we have treated time in a special way so far, we have found that time is not special. It can be included in the coordinates to make a driven system autonomous. And it can be eliminated from any autonomous system in favor of any other coordinate. This leads to numerous strategies for simplifying problems, by removing time variation and then performing canonical transforms on the resulting conservative autonomous system to make a nice coordinate that we can then dump back into the role of time.

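We can represent canonical transformations with mixed-variable generating functions. We can extend these to represent transformations in the extended phase space. Let *F*_{2} be a generating function with arguments (*t*, *q*, *p*′). Then, the corresponding generating function *F*_{2}^{e} in the extended phase space can be taken to be

$$F_2^e(\tau; q, t; p', p'_t) = t\, p'_t + F_2(t, q, p').$$

The relations between the coordinates and the momenta are the same as before. We also have

$$p_t = (\partial_{1,1} F_2^e)(\tau; q, t; p', p'_t) = p'_t + \partial_0 F_2(t, q, p'),$$
$$t' = (\partial_{2,1} F_2^e)(\tau; q, t; p', p'_t) = t.$$

Because the momentum conjugate to time takes the value of the negation of the Hamiltonian in each representation, the first equation gives the relationship between the original Hamiltonians:

$$H'(t, q', p') = H(t, q, p) + \partial_0 F_2(t, q, p'),$$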

as required. Time-independent canonical transformations, where *H*′ = *H* ∘ *C*_{H}, have symplectic *qp* part. The generating-function representation of a time-dependent transformation does not depend on the independent variable in the extended phase space. So, in extended phase space the *qp* part of the transformation, which includes the time and the momentum conjugate to time, is symplectic.
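A minimal worked example may help fix the bookkeeping. It assumes that the extended generating function has the form $t\,p'_t + F_2(t, q, p')$ and that the momentum conjugate to time takes the value of minus the Hamiltonian. The uniformly translating frame with constant speed *v* is a hypothetical choice made for illustration, not an example from the text.

```latex
\begin{align*}
F_2(t, q, p') &= (q - vt)\, p', \\
q' &= \partial_2 F_2 = q - vt, \qquad p = \partial_1 F_2 = p', \\
F_2^e(\tau; q, t; p', p'_t) &= t\, p'_t + (q - vt)\, p', \\
p_t &= p'_t - v\, p', \qquad t' = t, \\
H'(t, q', p') &= H(t, q' + vt, p') - v\, p'.
\end{align*}
```

The extra term $-v\,p'$ is exactly $\partial_0 F_2$, and it arises here purely from how the momentum conjugate to time transforms.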

**Exercise 5.20: Rotating coordinates in extended phase space**

In the extended phase space the time is one of the coordinates. Carry out the transformation to rotating coordinates using an *F*_{2}-type generating function in the extended phase space. Compare Hamiltonian (5.178) to the Hamiltonian obtained by composition with the transformation.

Canonical transformations can be used to reformulate a problem in coordinates that are easier to understand or that expose some symmetry of a problem.

In this chapter we have investigated different representations of a dynamical system. We have found that different representations will be equivalent if the coordinate–momentum part of the transformation has a symplectic derivative, and if the Hamiltonian transforms in a specified way. If the phase-space transformation is time independent, then the Hamiltonian transforms by composition with the phase-space transformation. The symplectic condition can be equivalently expressed in terms of the fundamental Poisson brackets. The Poisson bracket and the *ω* function are invariant under canonical transformations. The invariance of *ω* implies that the sum of the areas of the projections onto fundamental coordinate–momentum planes is preserved (Poincaré integral invariant) by canonical transformations.

A generating function is a real-valued function of the phase-space coordinates and time that represents a canonical transformation through its partial derivatives. We found that every canonical transformation can be represented by a generating function. The proof depends on the Poincaré integral invariant.

We can formulate an extended phase space in which time is treated as another coordinate. Time-dependent transformations are simple in the extended phase space. In the extended phase space the Poincaré integral invariant is the Poincaré–Cartan integral invariant. We can also reformulate a time-independent problem as a time-dependent problem with fewer degrees of freedom, with one of the original coordinates taking on the role of time; this is the reduced phase space.

**Exercise 5.21: Hierarchical Jacobi coordinates**

A Hamiltonian for the *n*-body problem is

with

and

where *x _{i}* is the tuple of rectangular coordinates for body

The potential energy of the system depends only on the relative positions of the bodies, so the relative motion decouples from the center of mass motion. In this problem we explore canonical transformations that achieve this decoupling.

**a.** Canonical heliocentric coordinates. The coordinates transform as follows:

$$x'_0 = X,$$

where *X* is the center of mass of the system, and

$$x'_i = x_i - x_0$$

for *i* > 0, the differences of the position of body *i* and the body with index 0 (which might be the Sun). Find the associated canonical momenta using an *F*_{2}-type generating function. Show that the potential energy can be written solely in terms of the coordinates for *i* > 0. Show that the kinetic energy is not in the form of a sum of squares of momenta divided by mass constants.

**b.** Jacobi coordinates. The Jacobi coordinates isolate the center of mass motion, without spoiling the usual diagonal quadratic form of the kinetic energy. Define *X_{i}* to be the center of mass of the bodies with indices less than or equal to *i*:

$$X_i = \frac{\sum_{j=0}^{i} m_j x_j}{\sum_{j=0}^{i} m_j}.$$

The Jacobi coordinates are defined by

for 0 < *i* < *n*, and

The coordinates *i* < *n* are the difference of the position of body *i* − 1 and the center of mass of bodies with lower indices; the coordinate *F*_{2}-type generating function. Show that the kinetic energy can still be written in the form

for some constants *V* can be written solely in terms of the Jacobi coordinates *i* > 0.

**c.** Hierarchical Jacobi coordinates. Define a “body” as a tuple of a mass and a rectangular position tuple. An *n*-body “system” is a tuple of *n* bodies: (*b*_{0}, *b*_{1}, …, *b*_{n−1}). Define a “linking” transformation *j* and *k* that takes an *n*-body system and returns a new linked system:

The bodies in the new system are the same as the bodies in the old system *j* and *k*:

This is a transformation to relative coordinates and center of mass for bodies *j* and *k*. Extend this transformation to phase space and show that it preserves the form of the kinetic energy

Show that the transformation to Jacobi coordinates of part **b** is generated by a composition of linking transformations:

Interpret the coordinate transformation produced by such a succession of linking transformations; why do we call this a “linking” transformation? What requirement has to be satisfied for a composition of linking transformations to isolate the center of mass of the system (make it one of the coordinates)? Taking this constraint into account, find hierarchical Jacobi coordinates for a system with six bodies, arranged as two triple systems, each of which is a binary plus a third body. Verify that one of the coordinates is the center of mass of the system, and that the kinetic energy remains a sum of squares of the momenta divided by an appropriate mass constant.

^{1} Solving for *p* in terms of *p*′ involves multiplying equation (5.3) on the right by (∂_{1}*F*(*t*, *q*′))^{−1}. This inverse is the structure that when multiplying ∂_{1}*F*(*t*, *q*′) on the right gives an identity structure. Structures representing linear transformations may be represented in terms of matrices. In this case, the matrix representation of the inverse structure is the inverse of the matrix representing the given structure.

^{2}In chapter 1 the transformation *C* takes a local tuple in one coordinate system and gives a local tuple in another coordinate system. In this chapter *C*_{H} is a phase-space transformation.

^{3}The velocities and the momenta are dual geometric objects with respect to time-independent point transformations. The velocities are coordinates of a vector field on the configuration manifold, and the momenta are coordinates of a covector field on the configuration manifold. The invariance of the inner product *pv* under time-independent point transformations provides a motivation for our use of superscripts for velocity components and subscripts for momentum components.

^{4}The procedure solve-linear-right multiplies its first argument by the inverse of its second argument on the right. So, if *u* = *vM* then *v* = *uM*^{−1}; (solve-linear-right u M) produces v.

^{5}*D_{s}* is not a derivative operator. It is not linear because the time component is a nonzero constant.

^{6}Sometimes we use a center dot to indicate multiplication, to avoid the ambiguity of the use of juxtaposition to indicate both multiplication and function application. This is not to be interpreted as a vector dot product.

^{7}Actually, for *I* = 0 the transform is not well defined and so it is not canonical for that value. This transformation is “locally canonical” in that it is canonical for nonzero values of *I*. We will ignore this essentially topological problem.

^{8}Unlike *D _{s}*,

^{9}The procedure zero-like produces a structure of zeros with the shape of its argument.

^{10}This is just a rearrangement of the arguments of *R _{z}*:

^{11}For each linear transformation *T* : *A* → *A* of incremental phase-space states there is a unique linear transformation *transpose* of *T*, such that for every real-valued linear function *g* : *A* → **R** of incremental phase-space states, and for every *a* ∈ *A* we have *a* this is *DT* (*a*) is *DC*_{H}(*s*′), and *Dg*(*a*) is *DH*(*C*_{H}(*s*′)).

^{12}The procedure compatible-shape takes any structure and produces another structure that is guaranteed to multiply with the given structure to produce a numerical quantity. For example, the shape of *DH*(*s*) is a compatible shape to the shape of *s*: if they are multiplied the result is a numerical quantity. This is the *s*^{⋆} that appears in equation (5.48).

^{13}The procedure transpose is simply defined for traditional matrices, but because structures that specify linear transformations may have arbitrary substructure, the procedure needs to be supplied with a template that specifies this structure. So the procedure transpose takes two arguments: (transpose ms rs), where ms is the structure to be transposed and the template rs is a structure that is appropriate for multiplication with ms on the right.

^{14}Actually, this is more interesting: we allow transformations that arbitrarily distort time, as tau is an arbitrary literal function. The canonical condition is concerned only with the possibly time-dependent transformation of coordinates and momenta.

^{15}The *qp* submatrix of a square matrix of dimension 2*n* + 1 is the 2*n*-dimensional matrix obtained by deleting the first row and the first column of the given matrix. This can be computed by:

^{16}The procedure D-as-matrix is defined as:

^{17}The *q ^{i}*,

^{18}The structure ∂_{2}∂_{1}*F*_{1} is a down of downs, so it is compatible for contraction with an up on either side. But it is not symmetrical, so the associations must be specified. To solve this problem we use index notation (ugh!).

So we use indices to select particular components of structured objects. If an index symbol appears both as a superscript and as a subscript in an expression, the value of the expression is the sum over all possible values of the index symbol of the designated components (Einstein summation convention). Thus, for example, if *p* are of dimension *n* then the indicated product

^{19}A structure is nonsingular if the determinant of the matrix representation of the structure is nonzero.

^{20}Point transformations are not in this class: we cannot solve for the momenta in terms of the positions for point transformations, because for a point transformation the primed and unprimed coordinates can be deduced from each other, so there is not enough information in the coordinates to deduce the momenta.

^{21}Let *F* be defined as the path-independent line integral

then ∂* _{i}F*(

^{22}There may be some singular cases and topological problems that prevent this from being rigorously true.

^{23}The various generating functions are traditionally known by the names *F*_{1}, *F*_{2}, *F*_{3}, and *F*_{4}. Please don't blame us.

^{24}We augment the Lagrangian with the total time derivative of the constraint so that the Legendre transform will be well defined.

^{25}Once we have made this reduction, taking *p _{λ}* to be zero, we can no longer perform a Legendre transform back to the extended Lagrangian system; we cannot solve for

^{26}If *f* is strictly increasing then *Df* is never zero.

^{27}Actually, the traditional Jacobi constant is *C* = −2*H*′.

^{28}We could have chosen to reparameterize in terms of *φ*, but then both *p _{r}* and