5 Canonical Transformations

We have done considerable mountain climbing. Now we are in the rarefied atmosphere of theories of excessive beauty and we are nearing a high plateau on which geometry, optics, mechanics, and wave mechanics meet on common ground. Only concentrated thinking, and a considerable amount of re-creation, will reveal the beauty of our subject in which the last word has not been spoken.

Cornelius Lanczos, The Variational Principles of Mechanics [29], p. 229

One way to simplify the analysis of a problem is to express it in a form in which the solution has a simple representation. However, it may not be easy to formulate the problem in such a way initially. It is often useful to start by formulating the problem in one way, and then transform it. For example, the formulation of the problem of the motion of a number of gravitating bodies is simple in rectangular coordinates, but it is easier to understand aspects of the motion in terms of orbital elements, such as the semimajor axes, eccentricities, and inclinations of the orbits. The semimajor axis and eccentricity of an orbit depend on both the configuration and the velocity of the body. Such transformations are more general than those that express changes in configuration coordinates. Here we investigate transformations of phase-space coordinates that involve both the generalized coordinates and the generalized momenta.

Suppose we have two different Hamiltonian systems, and suppose the trajectories of the two systems are in one-to-one correspondence. In this case both Hamiltonian systems can be mathematical models of the same physical system. Some questions about the physical system may be easier to answer by reference to one model and others may be easier to answer in the other model. For example, it may be easier to formulate the physical system in one model and to discover a conserved quantity in the other. Canonical transformations are maps between Hamiltonian systems that preserve the dynamics.

A canonical transformation is a phase-space coordinate transformation and an associated transformation of the Hamiltonian such that the dynamics given by Hamilton's equations in the two representations describe the same evolution of the system.

5.1 Point Transformations

A point transformation is a canonical transformation that extends a possibly time-dependent transformation of the configuration coordinates to a phase-space transformation. For example, one might want to reexpress motion in terms of polar coordinates, given a description in terms of rectangular coordinates. In order to extend a transformation of the configuration coordinates to a phase-space transformation we must specify how the momenta and Hamiltonian are transformed.

We have already seen how coordinate transformations can be carried out in the Lagrangian formulation (see section 1.6.1). In that case, we found that if the Lagrangian transforms by composition with the coordinate transformation, then the Lagrange equations are equivalent.

Lagrangians that differ by the addition of a total time derivative have the same Lagrange equations, but may have different momenta conjugate to the generalized coordinates. So there is more than one way to make a canonical extension of a coordinate transformation.

Here, we find the particular canonical extension of a coordinate transformation for which the Lagrangians transform by composition with the transformation, with no extra total time derivative terms added to the Lagrangian.

Let L be a Lagrangian for a system. Consider the coordinate transformation q = F (t, q′). The velocities transform by

$\begin{matrix} v = \partial_{0} F (t, q') + \partial_{1} F (t, q') v' . & (5.1) \end{matrix}$

We obtain a Lagrangian L′ in the transformed coordinates by composition of L with the coordinate transformation. We require that L′(t, q′, v′) = L(t, q, v), so:

$\begin{matrix} L' (t, q', v') = L (t, F (t, q'), \partial_{0} F (t, q') + \partial_{1} F (t, q') v') . & (5.2) \end{matrix}$

The momentum conjugate to q′ is

$\begin{array}{rll} p' & = \partial_{2} L' (t, q', v') & \\ & = \partial_{2} L (t, F (t, q'), \partial_{0} F (t, q') + \partial_{1} F (t, q') v') \partial_{1} F (t, q') & \\ & = p \partial_{1} F (t, q'), & (5.3) \end{array}$

where we have used

$\begin{array}{rll} p & = \partial_{2} L (t, q, v) & \\ & = \partial_{2} L (t, F (t, q'), \partial_{0} F (t, q') + \partial_{1} F (t, q') v') . & (5.4) \end{array}$

So, from equation (5.3),¹

$\begin{array}{l} p = p' {(\partial_{1} F (t, q'))}^{- 1} . & (5.5) \end{array}$

We can collect these results to define a canonical phase-space transformation C_H:²

$\begin{array}{rll} (t, q, p) & = C_{H} (t, q', p') & \\ & = (t, F (t, q'), p' {(\partial_{1} F (t, q'))}^{- 1}) . & (5.6) \end{array}$

The Hamiltonian is obtained by the Legendre transform

$\begin{array}{rll} H' (t, q', p') & = p' v' - L' (t, q', v') & \\ & = (p \partial_{1} F (t, q')) ({(\partial_{1} F (t, q'))}^{- 1} (v - \partial_{0} F (t, q'))) - L (t, q, v) & \\ & = p v - L (t, q, v) - p \partial_{0} F (t, q') & \\ & = H (t, q, p) - p \partial_{0} F (t, q'), & (5.7) \end{array}$

using relations (5.1) and (5.3) in the second step. Fully expressed in terms of the transformed coordinates and momenta, the transformed Hamiltonian is

$\begin{array}{rll} H' (t, q', p') & = H (t, F (t, q'), p' {(\partial_{1} F (t, q'))}^{- 1}) & \\ & \quad - (p' {(\partial_{1} F (t, q'))}^{- 1}) \partial_{0} F (t, q') . & (5.8) \end{array}$

The Hamiltonians H′ and H are equivalent because L and L′ have the same value for a given dynamical state and so have the same paths of stationary action. In general H and H′ do not have the same values for a given dynamical state, but differ by a term that depends on the coordinate transformation.

For time-independent transformations, ∂₀F = 0, there are a number of simplifications. The relationship of the velocities (5.1) becomes

$\begin{array}{l} v = \partial_{1} F (t, q') v' . & (5.9) \end{array}$

Comparing this to the relation (5.5) between the momenta, we see that in this case the momenta transform “oppositely” to the velocities³

$\begin{array}{l} p v = p' {(\partial_{1} F (t, q'))}^{- 1} \partial_{1} F (t, q') v' = p' v', & (5.10) \end{array}$

so the product of the momenta and the velocities is not changed by the transformation. This, combined with the fact that by construction L(t, q, v) = L′(t, q′, v′), shows that

$\begin{array}{rll} H (t, q, p) & = p v - L (t, q, v) & \\ & = p' v' - L' (t, q', v') & \\ & = H' (t, q', p') . & (5.11) \end{array}$

For time-independent coordinate transformations the Hamiltonian transforms by composition with the associated phase-space transformation. We can also see this from the general relationship (5.7) between the Hamiltonians.
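As an illustration of equations (5.3) and (5.10), consider the time-independent transformation from polar to rectangular coordinates, $F (t, (r, φ)) = (r \cos φ, r \sin φ)$. This worked case is a sketch added for concreteness. It assumes that the momenta are row (down) tuples, so that $p' = p \partial_{1} F$ is a row vector multiplying the matrix of partial derivatives, and it writes $v_{x}, v_{y}, v_{r}, v_{φ}$ for the velocity components:

$\begin{array}{l} \partial_{1} F (t, (r, φ)) = \begin{pmatrix} \cos φ & - r \sin φ \\ \sin φ & r \cos φ \end{pmatrix} \\ p_{r} = p_{x} \cos φ + p_{y} \sin φ \\ p_{φ} = r (p_{y} \cos φ - p_{x} \sin φ) = x p_{y} - y p_{x} \\ p_{r} v_{r} + p_{φ} v_{φ} = p_{x} (v_{r} \cos φ - r v_{φ} \sin φ) + p_{y} (v_{r} \sin φ + r v_{φ} \cos φ) = p_{x} v_{x} + p_{y} v_{y} \end{array}$

So the momentum conjugate to φ is the angular momentum, and the product of momenta and velocities is unchanged, as equation (5.10) requires.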

Implementing point transformations

The procedure F->CH takes a procedure F implementing a transformation of configuration coordinates and returns a procedure implementing a transformation of phase-space coordinates:⁴

(define ((F->CH F) state)
  (up (time state)
      (F state)
      (solve-linear-right
        (momentum state)
        (((partial 1) F) state))))

Consider a particle moving in a central field. In rectangular coordinates a Hamiltonian is

(define ((H-central m V) state)
  (let ((x (coordinate state))
        (p (momentum state)))
    (+ (/ (square p) (* 2 m))
       (V (sqrt (square x))))))

Let's look at this Hamiltonian in polar coordinates. The phase-space transformation is obtained by applying F->CH to the procedure p->r that takes a time and a polar tuple and returns a tuple of rectangular coordinates (see section 1.6.1). The transformation is time independent so the Hamiltonian transforms by composition. In polar coordinates the Hamiltonian is

(show-expression
  ((compose
     (H-central 'm (literal-function 'V))
     (F->CH p->r))
   (up 't (up 'r 'phi) (down 'p_r 'p_phi))))

$V (r) + \frac{\frac{1}{2} p_{r}^{2}}{m} + \frac{\frac{1}{2} p_{φ}^{2}}{m r^{2}}$

There are three terms. There is the potential energy, which depends on the radius, there is the kinetic energy due to radial motion, and there is the kinetic energy due to tangential motion. As expected, the angle φ does not appear and thus the angular momentum is a conserved quantity. By going to polar coordinates we have decoupled one of the two degrees of freedom in the problem.
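The polar form of the Hamiltonian can also be cross-checked numerically outside scmutils. The following is a minimal plain-Python sketch, not the book's implementation: the names `p_to_r`, `d1_p_to_r`, `solve_linear_right`, `F_to_CH`, `H_central`, and `H_polar` are hypothetical stand-ins for the procedures above, the Jacobian is written out by hand, and the potential and sample state are arbitrary assumptions.

```python
import math

def p_to_r(t, q):
    """Polar (r, phi) -> rectangular (x, y); independent of time."""
    r, phi = q
    return (r * math.cos(phi), r * math.sin(phi))

def d1_p_to_r(t, q):
    """Hand-written Jacobian of p_to_r with respect to (r, phi)."""
    r, phi = q
    return ((math.cos(phi), -r * math.sin(phi)),
            (math.sin(phi),  r * math.cos(phi)))

def solve_linear_right(p_prime, M):
    """Row vector p satisfying p M = p_prime, for a 2x2 matrix M."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return ((p_prime[0] * d - p_prime[1] * c) / det,
            (-p_prime[0] * b + p_prime[1] * a) / det)

def F_to_CH(F, d1F):
    """Extend a coordinate transformation to phase space, as in (5.6)."""
    def CH(state):
        t, q_prime, p_prime = state
        return (t, F(t, q_prime),
                solve_linear_right(p_prime, d1F(t, q_prime)))
    return CH

def H_central(m, V):
    def H(state):
        _, (x, y), (px, py) = state
        return (px * px + py * py) / (2 * m) + V(math.hypot(x, y))
    return H

def H_polar(m, V):
    """The three-term expression displayed above."""
    def H(state):
        _, (r, phi), (p_r, p_phi) = state
        return V(r) + p_r**2 / (2 * m) + p_phi**2 / (2 * m * r**2)
    return H

m = 1.5
V = lambda r: -1.0 / r                     # arbitrary sample potential
state = (0.0, (2.0, 0.7), (0.3, -1.1))     # (t, (r, phi), (p_r, p_phi))
lhs = H_central(m, V)(F_to_CH(p_to_r, d1_p_to_r)(state))
rhs = H_polar(m, V)(state)
```

For this sample state the two values `lhs` and `rhs` agree to rounding error, which is what composition with the phase-space transformation predicts.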

If the transformation is time varying the Hamiltonian must be adjusted by adding a correction to the composition of the Hamiltonian and the transformation (see equation 5.8):

$\begin{array}{l} H' = H \circ C_{H} + K & (5.12) \end{array}$

The correction is computed by

(define ((F->K F) state)
  (- (* (solve-linear-right
          (momentum state)
          (((partial 1) F) state))
        (((partial 0) F) state))))

For example, consider a transformation to coordinates translating with velocity v:

(define ((translating v) state)
  (+ (coordinates state) (* v (time state))))

We compute the additive adjustment required for the Hamiltonian:

((F->K (translating (up 'v^x 'v^y 'v^z)))
 (up 't (up 'x 'y 'z) (down 'p_x 'p_y 'p_z)))

(+ (* -1 p_x v^x) (* -1 p_y v^y) (* -1 p_z v^z))

Notice that this is the negation of the inner product of the momentum and the velocity of the coordinate system.
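This can be confirmed by hand from equation (5.8). As a short worked sketch, for the translation $F (t, q') = q' + v t$, where $v$ is the constant velocity of the coordinate system rather than a generalized velocity, the derivatives are simple:

$\begin{array}{l} \partial_{1} F (t, q') = 1, \qquad \partial_{0} F (t, q') = v \\ K (t, q', p') = - (p' {(\partial_{1} F (t, q'))}^{- 1}) \partial_{0} F (t, q') = - p' v = - (p'_{x} v^{x} + p'_{y} v^{y} + p'_{z} v^{z}) \end{array}$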

Let's see how a simple free-particle Hamiltonian is transformed:

(define ((H-free m) s)
  (/ (square (momentum s)) (* 2 m)))

The transformed Hamiltonian is:

(define H-prime
  (+ (compose (H-free 'm)
              (F->CH (translating (up 'v^x 'v^y 'v^z))))
     (F->K (translating (up 'v^x 'v^y 'v^z)))))

(H-prime
  (up 't
      (up 'xprime 'yprime 'zprime)
      (down 'pprime_x 'pprime_y 'pprime_z)))

(+ (* -1 pprime_x v^x)
   (* -1 pprime_y v^y)
   (* -1 pprime_z v^z)
   (/ (* 1/2 (expt pprime_x 2)) m)
   (/ (* 1/2 (expt pprime_y 2)) m)
   (/ (* 1/2 (expt pprime_z 2)) m))

Exercise 5.1: Galilean invariance

Is this result what you expected? Let's investigate.

Recall that in exercise 1.29 we showed that if the kinetic energy is $\frac{1}{2} m v^{2}$ then the translation to a uniformly moving coordinate system introduces extra terms that can be identified as a total time derivative. Since these terms do not affect the Lagrange equations, we can take the kinetic energy in the transformed coordinates to also be $\frac{1}{2} m {(v')}^{2}$ .

Let C_H be the phase space extension of the translation transformation, and C be the local tuple extension. The transformed Hamiltonian is H′ = H ∘ C_H + K; the transformed Lagrangian is L′ = L ∘ C.

a. Derive the relationship between p and p′ both from C_H and from the Lagrangians. Are they the same? Derive the relationship between v and v′ by taking the derivative of the Hamiltonians with respect to the momenta (Hamilton's equation). Show that the Legendre transform of L′ gives the same H′.

b. We have shown that L and L′ differ by a total time derivative. So for any uniformly moving coordinate system we can write the Lagrangian as $\frac{1}{2} m v^{2}$ . Similarly, we would expect to always be able to write the Hamiltonian as p²/(2m). Show that this differs from H′ by a total time derivative in the corresponding Lagrangians.

Exercise 5.2: Rotations

Let q and q′ be rectangular coordinates that are related by a rotation R: q = Rq′. The Lagrangian for the system is $L (t, q, v) = \frac{1}{2} m v^{2} - V (q)$ . Find the corresponding phase-space transformation C_H. Compare the transformation equations for the rectangular components of the momenta to those for the rectangular components of the velocities. Are you surprised, considering equation (5.10)?

5.2 General Canonical Transformations

Although we have shown how to extend any coordinate transformation of the configuration space to a canonical transformation, there are other ways to construct canonical transformations. How do we know if we have a canonical transformation? To test if a transformation is canonical we may use the fact that if the transformation is canonical, then Hamilton's equations of motion for the transformed system and the original system will be equivalent.

Consider a Hamiltonian H and a phase-space transformation C_H. Let D_s be the function that takes a Hamiltonian and gives the Hamiltonian state-space derivative:⁵

$\begin{array}{l} D_{s} H (t, q, p) = (1, \partial_{2} H (t, q, p), - \partial_{1} H (t, q, p)) . & (5.13) \end{array}$

Hamilton's equations are

$\begin{array}{l} D σ = D_{s} H \circ σ, & (5.14) \end{array}$

for any realizable phase-space path σ.

The transformation C_H transforms the phase-space path σ′ (t) = (t, q′ (t), p′ (t)) into σ(t) = (t, q(t), p(t)):

$\begin{array}{l} σ = C_{H} \circ σ' . & (5.15) \end{array}$

The rates of change of the phase-space coordinates are transformed by the derivative of the transformation

$\begin{array}{l} D σ = D (C_{H} \circ σ') = (D C_{H} \circ σ') D σ' . & (5.16) \end{array}$

The transformation is canonical if the equations of motion obtained from the new Hamiltonian are the same as those that could be obtained by transforming the equations of motion derived from the original Hamiltonian to the new coordinates:

$\begin{array}{l} D σ = (D C_{H} \circ σ') D σ' = (D C_{H} \circ σ') (D_{s} H' \circ σ') . & (5.17) \end{array}$

Using equation (5.14), we see that

$\begin{array}{l} D_{s} H \circ σ = (D C_{H} \circ σ') (D_{s} H' \circ σ') . & (5.18) \end{array}$

Figure 5.1: A canonical transformation C_H relates the descriptions of a dynamical system in two phase-space coordinate systems. The transformation shows how Hamilton's equations in one coordinate system may be derived from Hamilton's equations in the other coordinate system.

With σ = C_H ∘ σ′, we find

$\begin{array}{l} D_{s} H \circ C_{H} \circ σ' = (D C_{H} \circ σ') (D_{s} H' \circ σ') . & (5.19) \end{array}$

This condition must hold for any realizable phase-space path σ′. Certainly this is true if the following condition holds for every phase-space point:⁶

$\begin{array}{l} D_{s} H \circ C_{H} = D C_{H} \cdot D_{s} H' . & (5.20) \end{array}$

Any transformation that satisfies equation (5.20) is a canonical transformation among phase-space representations of a dynamical system. In one phase-space representation the system's dynamics is characterized by the Hamiltonian H′ and in the other by H. The idea behind this equation is illustrated in figure 5.1.

We can formalize this test as a program:

(define (canonical? C H Hprime)
  (- (compose (Hamiltonian->state-derivative H) C)
     (* (D C) (Hamiltonian->state-derivative Hprime))))

where Hamiltonian->state-derivative, which was introduced in chapter 3, implements D_s. The transformation is canonical if these residuals are zero.

For time-independent point transformations an appropriate Hamiltonian can be formed by composition with the corresponding phase-space transformation. For more general canonical transformations, we will see that if a transformation is independent of time, a suitable Hamiltonian for the transformed system can be obtained by composing the Hamiltonian with the phase-space transformation. In this case we obtain a more specific formula:

$\begin{array}{l} D_{s} H \circ C_{H} = D C_{H} \cdot D_{s} (H \circ C_{H}) . & (5.21) \end{array}$

Polar-canonical transformation

The analysis of the harmonic oscillator illustrates the use of a general canonical transformation in the solution of a problem. The harmonic oscillator is a mathematical model of a simple spring-mass system. A Hamiltonian for a spring-mass system with mass m and spring constant k is

$\begin{array}{l} H (t, x, p_{x}) = \frac{p_{x}^{2}}{2 m} + \frac{1}{2} k x^{2} . & (5.22) \end{array}$

Hamilton's equations of motion are

$\begin{array}{l} D x = p_{x} / m \\ D p_{x} = - k x, & (5.23) \end{array}$

giving the second-order system

$\begin{array}{l} m D^{2} x + k x = 0. & (5.24) \end{array}$

The solution is

$\begin{array}{l} x (t) = A \sin (ω t + φ), & (5.25) \end{array}$

where

$\begin{array}{l} ω = \sqrt{k / m} & (5.26) \end{array}$

and where A and φ are determined by initial conditions.

We use the polar-canonical transformation:

$\begin{array}{l} (t, x, p_{x}) = C_{α} (t, θ, I) & (5.27) \end{array}$

where

$\begin{array}{l} x = \sqrt{\frac{2 I}{α}} \sin θ & (5.28) \end{array}$

$\begin{array}{l} p_{x} = \sqrt{2 α I} \cos θ . & (5.29) \end{array}$

Here α is an arbitrary parameter. We define:

(define ((polar-canonical alpha) state)
  (let ((t (time state))
        (theta (coordinate state))
        (I (momentum state)))
    (let ((x (* (sqrt (/ (* 2 I) alpha)) (sin theta)))
          (p_x (* (sqrt (* 2 alpha I)) (cos theta))))
      (up t x p_x))))

And now we just run our test:

(define ((H-harmonic m k) s)
  (+ (/ (square (momentum s)) (* 2 m))
     (* 1/2 k (square (coordinate s)))))

((canonical?
   (polar-canonical 'alpha)
   (H-harmonic 'm 'k)
   (compose (H-harmonic 'm 'k)
            (polar-canonical 'alpha)))
 (up 't 'theta 'I))

(up 0 0 0)

So the transformation is canonical for the harmonic oscillator.⁷
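The same residual can be estimated without symbolic algebra. The sketch below is a plain-Python illustration under stated assumptions, not the scmutils procedure: it treats a one-degree-of-freedom state as a flat tuple `(t, q, p)`, approximates all derivatives by central differences (the helper `partial` and the step `h` are hypothetical choices), and evaluates condition (5.20) at one arbitrary sample point.

```python
import math

def polar_canonical(alpha):
    def C(s):
        t, theta, I = s
        return (t,
                math.sqrt(2 * I / alpha) * math.sin(theta),
                math.sqrt(2 * alpha * I) * math.cos(theta))
    return C

def H_harmonic(m, k):
    def H(s):
        _, x, p = s
        return p * p / (2 * m) + 0.5 * k * x * x
    return H

def partial(f, i, s, h=1e-6):
    """Central-difference derivative of f with respect to slot i of s."""
    hi, lo = list(s), list(s)
    hi[i] += h
    lo[i] -= h
    f_hi, f_lo = f(tuple(hi)), f(tuple(lo))
    if isinstance(f_hi, tuple):
        return tuple((a - b) / (2 * h) for a, b in zip(f_hi, f_lo))
    return (f_hi - f_lo) / (2 * h)

def state_derivative(H):
    """D_s H = (1, dH/dp, -dH/dq), equation (5.13), for a flat state."""
    return lambda s: (1.0, partial(H, 2, s), -partial(H, 1, s))

def canonical_residual(C, H, H_prime, s):
    """(D_s H)(C(s)) - DC(s) . (D_s H')(s), the two sides of (5.20)."""
    lhs = state_derivative(H)(C(s))
    cols = [partial(C, j, s) for j in range(3)]   # cols[j][i] = dC_i/ds_j
    v = state_derivative(H_prime)(s)
    rhs = tuple(sum(cols[j][i] * v[j] for j in range(3)) for i in range(3))
    return tuple(a - b for a, b in zip(lhs, rhs))

C = polar_canonical(0.8)                 # arbitrary sample alpha
H = H_harmonic(2.0, 3.0)                 # arbitrary sample m and k
H_prime = lambda s: H(C(s))              # composition, as in the test above
residual = canonical_residual(C, H, H_prime, (0.0, 0.6, 1.3))
```

All three components of `residual` are zero up to finite-difference error. A numerical check at one point is of course weaker than the symbolic result, but it is a quick way to catch a sign or factor error in a candidate transformation.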

Let's use our polar-canonical transformation C_α to help us solve the harmonic oscillator. We substitute expressions (5.28) and (5.29) for x and p_x in the Hamiltonian, getting our new Hamiltonian:

$\begin{array}{l} H' (t, θ, I) = \frac{α I}{m} {(\cos θ)}^{2} + \frac{k I}{α} {(\sin θ)}^{2} . & (5.30) \end{array}$

If we choose $α = \sqrt{k m}$ then we obtain

$\begin{array}{l} H' (t, θ, I) = \sqrt{\frac{k}{m}} I = ω I, & (5.31) \end{array}$

and the new Hamiltonian no longer depends on the coordinate. Hamilton's equation for I is

$\begin{array}{l} D I (t) = - \partial_{1} H' (t, θ (t), I (t)) = 0, & (5.32) \end{array}$

so I is constant. The equation for θ is

$\begin{array}{l} D θ (t) = \partial_{2} H' (t, θ (t), I (t)) = ω, & (5.33) \end{array}$

$\begin{array}{l} θ (t) = ω t + φ . & (5.34) \end{array}$

In the original variables,

$\begin{array}{rll} x (t) & = \sqrt{2 I (t) / α} \sin θ (t) & \\ & = A \sin (ω t + φ), & (5.35) \end{array}$

with the constant $A = \sqrt{2 I (t) / α}$ . So we have found the solution to the problem by making a canonical transformation to new phase-space variables for which the solution is easy and then transforming the solutions back to the original variables.
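The round trip can be checked numerically. The following plain-Python sketch, with arbitrary sample values for m, k, I, and φ and hypothetical helper names, builds x(t) and p_x(t) from the constant I and the linearly advancing θ, then tests them against Hamilton's equations (5.23) and against the conserved value H′ = ωI.

```python
import math

m, k = 2.0, 8.0                    # arbitrary sample mass and spring constant
omega = math.sqrt(k / m)           # equation (5.26)
alpha = math.sqrt(k * m)           # the choice that removes theta from H'
I0, phi = 0.75, 0.3                # arbitrary constant action and initial phase

def original_state(t):
    """Map (theta(t), I) back to (x, p_x) with (5.28) and (5.29)."""
    theta = omega * t + phi        # equation (5.34)
    x = math.sqrt(2 * I0 / alpha) * math.sin(theta)
    p = math.sqrt(2 * alpha * I0) * math.cos(theta)
    return x, p

def energy(x, p):
    return p * p / (2 * m) + 0.5 * k * x * x

def rates(t, h=1e-6):
    """Central-difference estimate of (Dx, Dp_x) along the path."""
    (x1, p1), (x0, p0) = original_state(t + h), original_state(t - h)
    return (x1 - x0) / (2 * h), (p1 - p0) / (2 * h)

checks = []
for t in (0.0, 0.4, 1.7, 5.0):
    x, p = original_state(t)
    dx, dp = rates(t)
    # residuals of Dx = p/m, Dp = -k x, and H = omega I
    checks.append((dx - p / m, dp + k * x, energy(x, p) - omega * I0))
```

Every entry of `checks` vanishes to within the finite-difference error, so the path obtained in the (θ, I) variables does solve the original problem.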

Exercise 5.3: Trouble in Lagrangian world

Is there a Lagrangian L′ that corresponds to the harmonic oscillator Hamiltonian H′(t, θ, I) = ωI? What could this possibly mean?

Exercise 5.4: Group properties

If we say that C_H is canonical with respect to Hamiltonians H and H′ if and only if D_sH ∘ C_H = DC_H · D_sH′, then:

a. Show that the composition of canonical transformations is canonical.

b. Show that composition of canonical transformations is associative.

c. Show that the identity transformation is canonical.

d. Show that there is an inverse for a canonical transformation and the inverse is canonical.

5.2.1 Time-Dependent Transformations

We have seen that for time-dependent point transformations the Hamiltonian appropriate for the transformed system is the original Hamiltonian composed with the transformation and augmented with an additive correction. Here we find a similar decomposition for general time-dependent canonical transformations.

The key to this decomposition is to separate the time part and the phase-space part of the Hamiltonian state derivative:⁸

$\begin{array}{rll} D_{s} H (s) & = (1, + \partial_{2} H (s), - \partial_{1} H (s)) & \\ & = T (s) + D H (s) & (5.36) \end{array}$

where

$\begin{array}{l} T (s) = (1, 0, 0), & (5.37) \end{array}$

$\begin{array}{l} D H (s) = (0, + \partial_{2} H (s), - \partial_{1} H (s)), & (5.38) \end{array}$

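Here $D H$ denotes only the phase-space part of the state derivative, as defined in equation (5.38), and should not be confused with the ordinary derivative of H. As code:⁹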

(define (T-func s)
  (up 1
      (zero-like (coordinates s))
      (zero-like (momenta s))))

(define ((D-phase-space H) s)
  (up 0 (((partial 2) H) s) (- (((partial 1) H) s))))

If we assume that H′ = H ∘ C_H + K, then the canonical condition (5.20) becomes

$\begin{array}{l} D_{s} H \circ C_{H} = D C_{H} \cdot D_{s} (H \circ C_{H} + K) . & (5.39) \end{array}$

Expanding the state derivative, the canonical condition is

$\begin{array}{l} (T + D H) \circ C_{H} = D C_{H} \cdot (T + D (H \circ C_{H} + K)) . & (5.40) \end{array}$

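Because the phase-space derivative is linear in its argument, the term $D (H \circ C_{H} + K)$ splits into $D (H \circ C_{H}) + D K$, so equation (5.40) is satisfied if the following two conditions are met: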

$\begin{array}{l} D H \circ C_{H} = D C_{H} \cdot D (H \circ C_{H}) & (5.41) \end{array}$

$\begin{array}{l} T \circ C_{H} = D C_{H} \cdot (T + D K) . & (5.42) \end{array}$

The value of T ∘ C_H does not depend on C_H, so this term is really very simple. Notice that equation (5.41) does not depend upon K and that equation (5.42) does not depend upon H.

These can be implemented as follows:

(define (canonical-H? C H)
  (- (compose (D-phase-space H) C)
     (* (D C)
        (D-phase-space (compose H C)))))

(define (canonical-K? C K)
  (- (compose T-func C)
     (* (D C)
        (+ T-func (D-phase-space K)))))

Rotating coordinates

Consider a time-dependent transformation to uniformly rotating coordinates:¹⁰

$\begin{array}{l} q = R (Ω) (t, q'), & (5.43) \end{array}$

with components

$\begin{array}{l} x = x' \cos (Ω t) - y' \sin (Ω t) \\ y = x' \sin (Ω t) + y' \cos (Ω t) . & (5.44) \end{array}$

As a program this is

(define ((rotating Omega) state)
  (let ((t (time state)) (qp (coordinate state)))
    (let ((xp (ref qp 0)) (yp (ref qp 1)) (zp (ref qp 2)))
      (up (- (* (cos (* Omega t)) xp)
             (* (sin (* Omega t)) yp))
          (+ (* (sin (* Omega t)) xp)
             (* (cos (* Omega t)) yp))
          zp))))

The extension of this transformation to a phase-space transformation is

(define (C-rotating Omega) (F->CH (rotating Omega)))

We first verify that this time-dependent transformation satisfies equation (5.41). We will try it for an arbitrary Hamiltonian with three degrees of freedom:

(define H-arbitrary
  (literal-function 'H
                    (-> (UP Real (UP Real Real Real) (DOWN Real Real Real))
                        Real)))

((canonical-H? (C-rotating 'Omega) H-arbitrary)
 (up 't (up 'xp 'yp 'zp) (down 'pp_x 'pp_y 'pp_z)))

(up 0 (up 0 0 0) (down 0 0 0))

And it works. Note that this result did not depend on any details of the Hamiltonian, suggesting that we might be able to make a test that does not require a Hamiltonian. We will see that shortly.

Since we have a point transformation, we can compute the required adjustment to the Hamiltonian:

((F->K (rotating 'Omega))
 (up 't (up 'xp 'yp 'zp) (down 'pp_x 'pp_y 'pp_z)))

(+ (* Omega pp_x yp) (* -1 Omega pp_y xp))

So, for this transformation an appropriate correction to the Hamiltonian is

$\begin{array}{l} K (Ω) (t; x', y', z'; p_{x}^{'}, p_{y}^{'}, p_{z}^{'}) = - Ω (x' p_{y}^{'} - y' p_{x}^{'}), & (5.45) \end{array}$

which is minus the rate of rotation of the coordinate system multiplied by the angular momentum. We implement K as a procedure

(define ((K Omega) s)
  (let ((qp (coordinate s)) (pp (momentum s)))
    (let ((xp (ref qp 0)) (yp (ref qp 1))
          (ppx (ref pp 0)) (ppy (ref pp 1)))
      (* -1 Omega (- (* xp ppy) (* yp ppx))))))

and apply the test. We find:

((canonical-K? (C-rotating 'Omega) (K 'Omega))
 (up 't (up 'xp 'yp 'zp) (down 'pp_x 'pp_y 'pp_z)))

(up 0 (up 0 0 0) (down 0 0 0))

The residuals are zero so this K correctly completes the canonical transformation.
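The closed form (5.45) can be compared with the general point-transformation correction $- (p' {(\partial_{1} F)}^{- 1}) \partial_{0} F$ of equation (5.8) at a sample state. This is a plain-Python sketch with hypothetical helper names. It assumes the rotation of equation (5.44), and it uses the fact that the rotation matrix is orthogonal, so its inverse is its transpose.

```python
import math

def rotation(Omega, t):
    """d1 F for the rotating transformation: the matrix of (5.44)."""
    c, s = math.cos(Omega * t), math.sin(Omega * t)
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))

def d_rotation_dt(Omega, t):
    """Time derivative of the rotation matrix; d0 F is this times q'."""
    c, s = math.cos(Omega * t), math.sin(Omega * t)
    return ((-Omega * s, -Omega * c, 0.0),
            (Omega * c, -Omega * s, 0.0),
            (0.0, 0.0, 0.0))

def matvec(M, v):
    return tuple(sum(M[i][j] * v[j] for j in range(3)) for i in range(3))

def K_from_point_transformation(Omega, state):
    t, q_prime, p_prime = state
    R = rotation(Omega, t)
    # p = p' R^{-1} = p' R^T as a row vector, i.e. p_j = sum_i R[j][i] p'_i
    p = matvec(R, p_prime)
    d0F = matvec(d_rotation_dt(Omega, t), q_prime)
    return -sum(pi * vi for pi, vi in zip(p, d0F))

def K_closed_form(Omega, state):
    """Equation (5.45)."""
    _, (xp, yp, _zp), (ppx, ppy, _ppz) = state
    return -Omega * (xp * ppy - yp * ppx)

Omega = 0.6                                            # arbitrary sample rate
state = (1.3, (0.5, -1.2, 0.7), (0.9, 0.4, -0.3))      # arbitrary sample state
difference = (K_from_point_transformation(Omega, state)
              - K_closed_form(Omega, state))
```

The two expressions agree to rounding error, and the time t drops out of the result even though both the rotation matrix and its derivative depend on it.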

5.2.2 Abstracting the Canonical Condition

We just saw that for the case of rotating coordinates the truth of equation (5.41) did not depend on the details of the Hamiltonian. If C_H satisfies equation (5.41) for any H then we can derive a condition on C_H that is independent of H.

Let's start with an expanded version of equation (5.41):

$\begin{array}{l} D H \circ C_{H} = D C_{H} \cdot ((D H \circ C_{H}) \cdot D C_{H}), & (5.46) \end{array}$

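using the chain rule. On the left-hand side $D H$ is the phase-space derivative of equation (5.38); the parenthesized factor on the right is the ordinary derivative of $H \circ C_{H}$, expanded by the chain rule, whose components must still be rearranged into the form (5.38).

We introduce a shuffle function that performs this rearrangement:

$\begin{array}{l} \tilde{J} ([a, b, c]) = (0, c, - b) . & (5.47) \end{array}$

The argument to $\tilde{J}$ is a down tuple of components of the derivative of a Hamiltonian-like function. The shuffle function is linear. Using $\tilde{J}$ we can write the phase-space derivative of $H$ as $\tilde{J} \circ D H$, where $D H$ now denotes the ordinary derivative.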

Let J be the multiplier corresponding to the constant linear function $\tilde{J}$ :

$\begin{array}{l} J = (D \tilde{J}) (s^{⋆}), & (5.48) \end{array}$

where s^⋆ is an arbitrary argument, shaped like DH(s), that is compatible for multiplication with s. The value of s^⋆ is irrelevant because D $\tilde{J}$ is a constant function. Then we can rewrite equation (5.46) as

$\begin{array}{l} J \cdot D H (C_{H} (s')) = D C_{H} (s') \cdot J \cdot (D H (C_{H} (s')) \cdot D C_{H} (s')) . & (5.49) \end{array}$

We can move the DC_H(s′) to the left of DH(C_H(s′)) by taking its transpose:¹¹

$\begin{array}{l} J \cdot D H (C_{H} (s')) \\ = D C_{H} (s') \cdot J \cdot ({(D C_{H} (s'))}^{T} \cdot D H (C_{H} (s'))) . & (5.50) \end{array}$

Since ${(D C_{H} (s'))}^{T}$ is a linear transformation and multiplication is associative for the multipliers of linear transformations, we can write

$\begin{array}{l} J \cdot D H (C_{H} (s')) = D C_{H} (s') \cdot J \cdot {(D C_{H} (s'))}^{T} \cdot D H (C_{H} (s')) . & (5.51) \end{array}$

This is true for any H if

$\begin{array}{l} J = D C_{H} (s') \cdot J \cdot {(D C_{H} (s'))}^{T} . & (5.52) \end{array}$

As a program, this is:¹²,¹³

(define (J-func DHs)
  (up 0 (ref DHs 2) (- (ref DHs 1))))

(define ((canonical-transform? C) s)
  (let ((J ((D J-func) (compatible-shape s)))
        (DCs ((D C) s)))
    (- J (* DCs J (transpose DCs s)))))

This condition, equation (5.52), on C_H, called the canonical condition, does not depend on the details of H. This is a remarkable result: we can decide whether a phase-space transformation preserves the dynamics of Hamilton's equations without further reference to the details of the dynamical system. If the transformation is time dependent we can add a correction to the Hamiltonian to make it canonical.

Examples

The polar-canonical transformation satisfies the canonical condition:

((canonical-transform? (polar-canonical 'alpha))
 (up 't 'theta 'I))

(up (up 0 0 0) (up 0 0 0) (up 0 0 0))

But not every transformation we might try satisfies the canonical condition. For example, we might try x = p sin θ and p_x = p cos θ. The implementation is

(define (a-non-canonical-transform state)
  (let ((t (time state))
        (theta (coordinate state))
        (p (momentum state)))
    (let ((x (* p (sin theta)))
          (p_x (* p (cos theta))))
      (up t x p_x))))

((canonical-transform? a-non-canonical-transform)
 (up 't 'theta 'p))

(up (up 0 0 0) (up 0 0 (+ -1 p)) (up 0 (+ 1 (* -1 p)) 0))

So this transformation does not satisfy the canonical condition.
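Both results can be reproduced with ordinary 2 × 2 matrices. The sketch below is plain Python with hypothetical helper names and hand-written Jacobians. It keeps only the coordinate-momentum block of the derivative and stores it with rows indexed by the outputs (x, p_x) and columns by the inputs, so its entries appear transposed relative to the nested up structures printed above.

```python
import math

J = ((0.0, 1.0), (-1.0, 0.0))      # coordinate-momentum block of J

def matmul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def transpose(A):
    return tuple(zip(*A))

def canonical_residual(M):
    """J - M J M^T for a 2x2 Jacobian M, as in equation (5.52)."""
    MJMt = matmul(matmul(M, J), transpose(M))
    return tuple(tuple(J[i][j] - MJMt[i][j] for j in range(2))
                 for i in range(2))

def jac_polar_canonical(alpha, theta, I):
    """Partials of (x, p_x) in (5.28)-(5.29) with respect to (theta, I)."""
    return ((math.sqrt(2 * I / alpha) * math.cos(theta),
             math.sin(theta) / math.sqrt(2 * alpha * I)),
            (-math.sqrt(2 * alpha * I) * math.sin(theta),
             math.sqrt(alpha / (2 * I)) * math.cos(theta)))

def jac_non_canonical(theta, p):
    """Partials of (p sin(theta), p cos(theta)) with respect to (theta, p)."""
    return ((p * math.cos(theta), math.sin(theta)),
            (-p * math.sin(theta), math.cos(theta)))

good = canonical_residual(jac_polar_canonical(0.8, 0.6, 1.3))
bad = canonical_residual(jac_non_canonical(0.6, 1.3))
```

Here `good` is the zero matrix to rounding error, while `bad` has off-diagonal entries 1 − p and p − 1, matching the symbolic residual up to the transposition noted above. For one degree of freedom M J Mᵀ equals det(M) J, so the condition reduces to det(M) = 1; the second transformation has determinant p.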

Canonical condition and Poisson brackets

The canonical condition can be written simply in terms of Poisson brackets.

The Poisson bracket can be written in terms of $\tilde{J}$ :

$\begin{array}{ll} \{f, g\} = (D f) \cdot (\tilde{J} \circ (D g)) = (D f) \cdot J \cdot (D g), & (5.53) \end{array}$

as can be seen by writing out the components.

We break the transformation C_H into position and momentum parts:

$\begin{array}{l} q = A (t, q', p') & (5.54) \end{array}$

$\begin{array}{l} p = B (t, q', p') . & (5.55) \end{array}$

In terms of the individual component functions, the canonical condition (5.52) is

$\begin{array}{rll} δ_{j}^{i} & = \{A^{i}, B_{j}\} & \\ 0 & = \{A^{i}, A^{j}\} & \\ 0 & = \{B_{i}, B_{j}\} & (5.56) \end{array}$

where $δ_{j}^{i}$ is 1 if i = j and 0 otherwise. These equations are called the fundamental Poisson brackets. If a transformation satisfies these Poisson bracket relations then it satisfies the canonical condition.
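As a check of the first of these relations, the bracket can be worked out by hand for the polar-canonical transformation (5.28) and (5.29), with $A = \sqrt{2 I / α} \sin θ$ and $B = \sqrt{2 α I} \cos θ$. This is a sketch for one degree of freedom, assuming the convention $\{f, g\} = \partial_{1} f \, \partial_{2} g - \partial_{2} f \, \partial_{1} g$ with θ in the coordinate slot and I in the momentum slot:

$\begin{array}{rl} \{A, B\} & = \frac{\partial A}{\partial θ} \frac{\partial B}{\partial I} - \frac{\partial A}{\partial I} \frac{\partial B}{\partial θ} \\ & = (\sqrt{\frac{2 I}{α}} \cos θ) (\sqrt{\frac{α}{2 I}} \cos θ) - (\frac{\sin θ}{\sqrt{2 α I}}) (- \sqrt{2 α I} \sin θ) \\ & = {(\cos θ)}^{2} + {(\sin θ)}^{2} = 1 . \end{array}$

The remaining two brackets vanish for one degree of freedom because the Poisson bracket is antisymmetric.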

We have found that a transformation is canonical if its position-momentum part satisfies the canonical condition, but for a time-dependent transformation we may have to modify the Hamiltonian by the addition of a suitable K. We can rewrite these conditions in terms of Poisson brackets. If the Hamiltonian is

$\begin{array}{l} H' (t, q', p') = H (t, A (t, q', p'), B (t, q', p')) + K (t, q', p'), & (5.57) \end{array}$

the transformation will be canonical if the coordinate-momentum transformation satisfies the fundamental Poisson brackets, and K satisfies:

$\begin{array}{ll} \{A^{i}, K\} + \partial_{0} A^{i} = 0 & \\ \{B_{j}, K\} + \partial_{0} B_{j} = 0. & (5.58) \end{array}$
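For a concrete check of these conditions, take the translation to uniformly moving coordinates treated in section 5.1, for which $A (t, q', p') = q' + v t$, $B (t, q', p') = p'$, and $K (t, q', p') = - p' v$. This is a worked sketch in which $v$ is the constant velocity of the coordinate system:

$\begin{array}{l} \{A^{i}, K\} + \partial_{0} A^{i} = \sum_{k} \frac{\partial A^{i}}{\partial q'^{k}} \frac{\partial K}{\partial p'_{k}} + v^{i} = - v^{i} + v^{i} = 0 \\ \{B_{j}, K\} + \partial_{0} B_{j} = - \sum_{k} \frac{\partial B_{j}}{\partial p'_{k}} \frac{\partial K}{\partial q'^{k}} + 0 = 0 . \end{array}$

The terms omitted from each bracket vanish because A does not depend on p′ and B does not depend on q′.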

Exercise 5.5: Poisson bracket conditions

Fill in the details to show that the canonical condition (5.52) is equivalent to the fundamental Poisson brackets (5.56) and that the condition on K (5.42) is equivalent to the Poisson bracket condition on K (5.58).

Symplectic matrices

It is convenient to reformulate the canonical condition in terms of matrices. We can obtain a matrix representation of a structure with the utility s->m that takes a structure that represents a multiplier of a linear transformation and returns a matrix representation of the multiplier. The procedure s->m takes three arguments: (s->m ls A rs). The ls and rs specify the shapes of objects that multiply A on the left and right to give a numerical value. These specify the basis. So, the matrix representation of the multiplier corresponding to $\tilde{J}$ is

(let* ((s (up 't (up 'x 'y) (down 'px 'py)))
       (s* (compatible-shape s))
       (J ((D J-func) s*)))
  (s->m s* J s*))

(matrix-by-rows
  (list 0 0 0 0 0)
  (list 0 0 0 1 0)
  (list 0 0 0 0 1)
  (list 0 -1 0 0 0)
  (list 0 0 -1 0 0))

This matrix, J, is useful, so we supply a procedure J-matrix so that (J-matrix n) gives this matrix for an n degree-of-freedom system.

We can now reexpress the canonical condition (5.52) as a matrix equation:

$\begin{array}{l} J = D C_{H} (s') \cdot J \cdot {(D C_{H} (s'))}^{T} . & (5.59) \end{array}$

There is a further simplification available. The elements of the first row and the first column of the matrix representation of $\tilde{J}$ are all zeros. This has simplifying consequences. Consider a general transformation of phase-space states (for two degrees of freedom):

(define C-general
  (literal-function 'C
                    (-> (UP Real (UP Real Real) (DOWN Real Real))
                        (UP Real (UP Real Real) (DOWN Real Real)))))

Consider transformations for which the time does not depend on the coordinates or momenta¹⁴

(define (C-simple-time s)
  (let ((cs (C-general s)))
    (up ((literal-function 'tau) (time s))
        (coordinates cs)
        (momenta cs))))

For this kind of transformation the first row and the first column of the residuals of the canonical-transform? test are identically zero:

(let* ((s (up 't (up 'x 'y) (down 'p_x 'p_y)))
       (s* (compatible-shape s)))
  (m:nth-row
    (s->m s* ((canonical-transform? C-simple-time) s) s*)
    0))

(up 0 0 0 0 0)

(let* ((s (up 't (up 'x 'y) (down 'p_x 'p_y)))
       (s* (compatible-shape s)))
  (m:nth-col
    (s->m s* ((canonical-transform? C-simple-time) s) s*)
    0))

(up 0 0 0 0 0)

But for C-general these are not zero. Since the transformations we are considering at most shift time, we need to consider only the submatrix associated with the coordinates and the momenta.

The qp submatrix¹⁵ of dimension 2n × 2n of the matrix J is called the symplectic unit for n degrees of freedom:

$\begin{array}{ll} J_{n} = \begin{pmatrix} 0_{n \times n} & 1_{n \times n} \\ - 1_{n \times n} & 0_{n \times n} \end{pmatrix} . & (5.60) \end{array}$

The matrix J_n satisfies the following identities:

$\begin{array}{l} J_{n}^{T} = J_{n}^{- 1} = - J_{n} . & (5.61) \end{array}$

A 2n × 2n matrix A that satisfies the relation

$\begin{array}{l} J_{n} = A J_{n} A^{T} & (5.62) \end{array}$

is called a symplectic matrix. We can determine whether a matrix is symplectic:

(define (symplectic-matrix? M)
  (let ((2n (m:dimension M)))
    (let ((J (symplectic-unit (quotient 2n 2))))
      (- J (* M J (transpose M))))))

An appropriate symplectic unit matrix of a given size is produced by the procedure symplectic-unit.
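For readers who want to experiment outside scmutils, the same test is easy to sketch with nested lists. The code below is a plain-Python illustration, not the library's implementation. The names mirror the Scheme procedures but are hypothetical, and the tolerance `tol` is an added assumption to allow for floating-point rounding.

```python
def symplectic_unit(n):
    """The 2n x 2n matrix J_n of equation (5.60)."""
    size = 2 * n
    J = [[0.0] * size for _ in range(size)]
    for i in range(n):
        J[i][n + i] = 1.0        # upper-right identity block
        J[n + i][i] = -1.0       # lower-left negative identity block
    return J

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def symplectic_residual(M):
    """J_n - M J_n M^T, the residual of equation (5.62)."""
    size = len(M)
    J = symplectic_unit(size // 2)
    MJMt = matmul(matmul(M, J), transpose(M))
    return [[J[i][j] - MJMt[i][j] for j in range(size)]
            for i in range(size)]

def is_symplectic(M, tol=1e-12):
    return all(abs(v) <= tol for row in symplectic_residual(M) for v in row)

squeeze = [[2.0, 0.0], [0.0, 0.5]]   # determinant 1
scale = [[2.0, 0.0], [0.0, 2.0]]     # determinant 4
results = (is_symplectic(squeeze), is_symplectic(scale),
           is_symplectic(symplectic_unit(2)))
```

With these inputs `results` is `(True, False, True)`: the squeeze passes, the uniform scaling fails with residual −3 J₁, and J₂ passes because of the identities (5.61).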

If the matrix representation of the derivative of a transformation is a symplectic matrix the transformation is a symplectic transformation. Here is a test for whether a transformation is symplectic:¹⁶

(define ((symplectic-transform? C) s)
  (symplectic-matrix? (qp-submatrix ((D-as-matrix C) s))))

The procedure symplectic-transform? returns a zero matrix if and only if the transformation being tested passes the symplectic matrix test.

For example, the point transformations are symplectic. We can show this for a general possibly time-dependent two-degree-of-freedom point transformation:

(define (F s)
  ((literal-function 'F
                     (-> (X Real (UP Real Real)) (UP Real Real)))
   (time s) (coordinates s)))

((symplectic-transform? (F->CH F))
 (up 't (up 'x 'y) (down 'px 'py)))

(matrix-by-rows
  (list 0 0 0 0)
  (list 0 0 0 0)
  (list 0 0 0 0)
  (list 0 0 0 0))

More generally, the phase-space part of the canonical condition is equivalent to the symplectic condition (for two degrees of freedom) even in the case of an unrestricted phase-space transformation.

(let* ((s (up 't (up 'x 'y) (down 'p_x 'p_y)))
       (s* (compatible-shape s)))
  (- (qp-submatrix
       (s->m s* ((canonical-transform? C-general) s) s*))
     ((symplectic-transform? C-general) s)))

(matrix-by-rows
  (list 0 0 0 0)
  (list 0 0 0 0)
  (list 0 0 0 0)
  (list 0 0 0 0))

Exercise 5.6: Symplectic matrices

Let A be a symplectic matrix: $J_{n} = A J_{n} A^{T}$ . Show that $A^{T}$ and A⁻¹ are symplectic.

Exercise 5.7: Polar-canonical transformations

Let x, p and θ, I be two sets of canonically conjugate variables. Consider transformations of the form x = βI^α sin θ and p = βI^α cos θ. Determine all α and β for which this transformation is symplectic.

Exercise 5.8: Standard map

Is the standard map a symplectic transformation? Recall that the standard map is: I′ = I + K sin θ, with θ′ = θ + I′, both modulo 2π.

Exercise 5.9: Whittaker transform

Shew that the transformation q = log ((sin p′)/q′) with p = q′ cot p′ is symplectic.

5.3 Invariants of Canonical Transformations

Canonical transformations allow us to change the phase-space coordinate system that we use to express a problem, preserving the form of Hamilton's equations. If we solve Hamilton's equations in one phase-space coordinate system we can use the transformation to carry the solution to the other coordinate system. What other properties are preserved by a canonical transformation?

Noninvariance of pv

We noted in equation (5.10) that point transformations that are canonical extensions of time-independent coordinate transformations preserve the value of pv. This does not hold for more general canonical transformations. We can illustrate this with the polar-canonical transformation. Along corresponding paths x, p_x and θ, I

$\begin{array}{l} x (t) & = \sqrt{\frac{2 I (t)}{α}} \sin θ (t) \\ p_{x} (t) & = \sqrt{2 I (t) α} \cos θ (t), & (5.63) \end{array}$

and so Dx is

$\begin{array}{l} D x (t) = D θ (t) \sqrt{\frac{2 I (t)}{α}} \cos θ (t) + D I (t) \frac{1}{\sqrt{2 I (t) α}} \sin θ (t) . & (5.64) \end{array}$

The difference of pv and the transformed p′v′ is

$\begin{array}{l} p_{x} (t) D x (t) - I (t) D θ (t) \\ = I (t) D θ (t) (2 \cos^{2} θ (t) - 1) + D I (t) \sin θ (t) \cos θ (t) . & (5.65) \end{array}$

In general this is not zero. So the product pv is not necessarily invariant under general canonical transformations.
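A quick numerical spot check of equation (5.65) can be sketched in Python. The function names and the sample values of α, I, θ, DI, and Dθ below are our own assumptions, picked only for illustration.

```python
import math


def pv_difference(alpha, I, theta, DI, Dtheta):
    """Return p_x Dx - I Dtheta computed directly from (5.63) and (5.64)."""
    p_x = math.sqrt(2 * I * alpha) * math.cos(theta)
    Dx = (Dtheta * math.sqrt(2 * I / alpha) * math.cos(theta)
          + DI * math.sin(theta) / math.sqrt(2 * I * alpha))
    return p_x * Dx - I * Dtheta


def pv_difference_closed_form(I, theta, DI, Dtheta):
    """Right-hand side of (5.65)."""
    return (I * Dtheta * (2 * math.cos(theta) ** 2 - 1)
            + DI * math.sin(theta) * math.cos(theta))


# Arbitrary sample values (assumptions for illustration only).
alpha, I, theta, DI, Dtheta = 1.5, 2.0, 0.3, 0.1, 0.7
direct = pv_difference(alpha, I, theta, DI, Dtheta)
closed = pv_difference_closed_form(I, theta, DI, Dtheta)
print(direct, closed)   # the two agree and are clearly nonzero
```

The two numbers agree to rounding error, and neither is zero, which is the point of the example: the product pv changes under this canonical transformation.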

Invariance of Poisson brackets

Here is a remarkable fact: the composition of the Poisson bracket of two phase-space state functions with a canonical transformation is the same as the Poisson bracket of each of the two functions composed with the transformation separately. Loosely speaking, the Poisson bracket is invariant under canonical phase-space transformations.

Let f and g be two phase-space state functions. Using the $\tilde{J}$ representation of the Poisson bracket (see section 5.2.2), we deduce

$\begin{array}{l} \{f \circ C_{H}, g \circ C_{H}\} \\ = (D (f \circ C_{H})) \cdot (\tilde{J} \circ D (g \circ C_{H})) \\ = (D f \circ C_{H}) \cdot D C_{H} \cdot (\tilde{J} \circ ((D g \circ C_{H}) \cdot D C_{H})) \\ = (D f \circ C_{H}) \cdot (\tilde{J} \circ D g \circ C_{H}) \\ = (D f \cdot (\tilde{J} \circ D g)) \circ C_{H} \\ = \{f, g\} \circ C_{H}, & (5.66) \end{array}$

where the fact that C_H satisfies equation (5.41) was used in the middle. This is

$\begin{array}{l} \{f \circ C_{H}, g \circ C_{H}\} = \{f, g\} \circ C_{H} . & (5.67) \end{array}$
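The invariance (5.67) can be probed numerically for one degree of freedom. The following Python sketch uses the polar-canonical transformation as the canonical map and estimates the brackets with central differences. The sample functions `f` and `g`, the value of `ALPHA`, the step size, and the evaluation point are all assumptions made for this illustration.

```python
import math

ALPHA = 2.0   # sample parameter of the polar-canonical map
STEP = 1e-5   # finite-difference step


def C(theta, I):
    """Polar-canonical map from (theta, I) to (x, p)."""
    return (math.sqrt(2 * I / ALPHA) * math.sin(theta),
            math.sqrt(2 * I * ALPHA) * math.cos(theta))


def poisson(f, g, q, p):
    """Central-difference estimate of {f, g} = d_q f d_p g - d_p f d_q g."""
    def d_q(fn):
        return (fn(q + STEP, p) - fn(q - STEP, p)) / (2 * STEP)

    def d_p(fn):
        return (fn(q, p + STEP) - fn(q, p - STEP)) / (2 * STEP)

    return d_q(f) * d_p(g) - d_p(f) * d_q(g)


def f(x, p):
    return x * x * p


def g(x, p):
    return x + p * p


def f_of_C(theta, I):
    return f(*C(theta, I))


def g_of_C(theta, I):
    return g(*C(theta, I))


theta, I = 0.7, 1.3
lhs = poisson(f_of_C, g_of_C, theta, I)   # {f o C, g o C} at (theta, I)
rhs = poisson(f, g, *C(theta, I))         # ({f, g} o C) at (theta, I)
print(lhs, rhs)
```

Agreement to within the finite-difference error is evidence for, not a proof of, the invariance; a map that is not canonical would generally give two different numbers.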

Volume preservation

Consider a canonical transformation C_H. Let Ĉ_t be a function with parameter t such that (q, p) = Ĉ_t(q′, p′) if (t, q, p) = C_H(t, q′, p′). The function Ĉ_t maps phase-space coordinates to alternate phase-space coordinates at a given time. Consider regions R in (q, p) and R′ in (q′, p′) such that R = Ĉ_t(R′). The volume of region R is

$\begin{array}{l} V (R) = \int_{R} \hat{1} = \int_{R'} \det (D {\hat{C}}_{t}), & (5.68) \end{array}$

where $\hat{1}$ is the function whose value is one for every input. Now if C_H is symplectic then the determinant of DĈ_t is one (see section 4.2.3), so

$\begin{array}{l} V (R) = V (R') . & (5.69) \end{array}$

Thus, phase-space volume is preserved by symplectic transformations.
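As a concrete case, the derivative of the polar-canonical map (5.63) from (θ, I) to (x, p_x) has unit determinant. The Python sketch below evaluates the analytic Jacobian at a few arbitrarily chosen points. The names `jacobian` and `det2` and the sample values are assumptions for illustration.

```python
import math


def jacobian(alpha, theta, I):
    """Analytic derivative of (theta, I) -> (x, p_x) for the map (5.63)."""
    r = math.sqrt(2 * I / alpha)
    s = math.sqrt(2 * I * alpha)
    return [[r * math.cos(theta), math.sin(theta) / s],           # dx/dtheta, dx/dI
            [-s * math.sin(theta), alpha * math.cos(theta) / s]]  # dp/dtheta, dp/dI


def det2(m):
    """Determinant of a 2 x 2 nested list."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Sample points chosen arbitrarily; each determinant is 1 up to rounding.
for theta, I in [(0.2, 0.5), (1.1, 3.0), (2.5, 7.25)]:
    print(det2(jacobian(2.0, theta, I)))
```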

Liouville's theorem shows that time evolution preserves phase-space volume. Here we see that canonical transformations also preserve phase volumes. Later, we will find that time evolution actually generates a canonical transformation.

The symplectic 2-form

Define

$\begin{array}{l} ω (ζ_{1}, ζ_{2}) = P (ζ_{2}) Q (ζ_{1}) - P (ζ_{1}) Q (ζ_{2}), & (5.70) \end{array}$

where Q = I₁ and P = I₂ are the coordinate and momentum selectors, respectively. The arguments ζ₁ and ζ₂ are incremental phase-space states with zero time components.

The ω form can also be written as a sum over degrees of freedom:

$\begin{array}{l} ω (ζ_{1}, ζ_{2}) = \sum_{i} (P_{i} (ζ_{2}) Q^{i} (ζ_{1}) - P_{i} (ζ_{1}) Q^{i} (ζ_{2})) . & (5.71) \end{array}$

Notice that the contributions for each i do not mix components from different degrees of freedom.

This bilinear form is closely related to the symplectic 2-form of differential geometry. It differs in that the symplectic 2-form is formally a function of the phase-space point as well as the incremental vectors.

Under a canonical transformation s = C_H(s′), incremental states transform with the derivative

$\begin{array}{l} ζ_{i} = D C_{H} (s') ζ_{i}^{'} . & (5.72) \end{array}$

We will show that the 2-form is invariant under this transformation

$\begin{array}{l} ω (ζ_{1}, ζ_{2}) = ω (ζ_{1}^{'}, ζ_{2}^{'}), & (5.73) \end{array}$

if the time components of the $ζ_{i}^{'}$ are both zero.

We have shown that condition (5.41) does not depend on the details of the Hamiltonian H. So if a transformation satisfies the canonical condition we can use condition (5.41) with H replaced by an arbitrary function f of phase-space states:

$\begin{array}{l} D f (C_{H} (s')) = (D C_{H} (s')) \cdot (D (f \circ C_{H}) (s')) . & (5.74) \end{array}$

In terms of ω, the Poisson bracket is

$\begin{array}{l} \{f, g\} (s) = ω (D f (s), D g (s)) & (5.75) \end{array}$

as can be seen by writing out the components. We use the fact that Poisson brackets are invariant under canonical transformations:

$\begin{array}{l} (\{f, g\} \circ C_{H}) (s') = \{f \circ C_{H}, g \circ C_{H}\} (s') . & (5.76) \end{array}$

Using the relation (5.74) to expand the left-hand side of equation (5.76) we obtain:

$\begin{array}{l} (\{f, g\} \circ C_{H}) (s') \\ = ω ((D f \circ C_{H}) (s'), (D g \circ C_{H}) (s')) \\ = ω ((D C_{H} (s')) \cdot (D (f \circ C_{H}) (s')), \\ (D C_{H} (s')) \cdot (D (g \circ C_{H}) (s'))) . & (5.77) \end{array}$

The right-hand side of equation (5.76) is

$\begin{array}{l} \{f \circ C_{H}, g \circ C_{H}\} (s') = ω (D (f \circ C_{H}) (s'), D (g \circ C_{H}) (s')) . & (5.78) \end{array}$

Now the left-hand side must equal the right-hand side for any f and g, so the equation must also be true for arbitrary $ζ_{i}^{'}$ of the form

$\begin{array}{l} ζ_{1}^{'} = D (f \circ C_{H}) (s') \\ ζ_{2}^{'} = D (g \circ C_{H}) (s') . & (5.79) \end{array}$

So the $ζ_{i}^{'}$ are arbitrary incremental states with zero time components.

We have proven that

$\begin{array}{l} ω (ζ_{1}^{'}, ζ_{2}^{'}) = ω (D C_{H} (s') \cdot ζ_{1}^{'}, D C_{H} (s') \cdot ζ_{2}^{'}) . & (5.80) \end{array}$

for canonical C_H and incremental states $ζ_{i}^{'}$ with zero time components. Using equation (5.72), we have

$\begin{array}{l} ω (ζ_{1}^{'}, ζ_{2}^{'}) = ω (ζ_{1}, ζ_{2}) . & (5.81) \end{array}$

Thus the bilinear antisymmetric function ω is invariant under even time-varying canonical transformations if the increments are restricted to have zero time component.

As a program, ω is

(define (omega zeta1 zeta2)
  (- (* (momentum zeta2) (coordinate zeta1))
     (* (momentum zeta1) (coordinate zeta2))))

On page 356 we showed that point transformations are symplectic. Here we can see that the 2-form is preserved under these transformations for two degrees of freedom:

(define (F s)
  ((literal-function 'F
                     (-> (X Real (UP Real Real)) (UP Real Real)))
   (time s)
   (coordinates s)))

(let ((s (up 't (up 'x 'y) (down 'p_x 'p_y)))
      (zeta1 (up 0 (up 'dx1 'dy1) (down 'dp1_x 'dp1_y)))
      (zeta2 (up 0 (up 'dx2 'dy2) (down 'dp2_x 'dp2_y))))
  (let ((DCs ((D (F->CH F)) s)))
    (- (omega zeta1 zeta2)
       (omega (* DCs zeta1) (* DCs zeta2)))))

0

Alternatively, let z₁ and z₂ be the matrix representations of the qp parts of ζ₁ and ζ₂. The matrix representation of ω is

$\begin{array}{l} ω (ζ_{1}, ζ_{2}) = z_{1}^{T} \cdot J_{n} \cdot z_{2} . & (5.82) \end{array}$

Let A be the matrix representation of the qp part of DC_H(s′). Then the invariance of ω is equivalent to

$\begin{array}{l} z_{1}^{T} \cdot A^{T} \cdot J_{n} \cdot A \cdot z_{2} = z_{1}^{T} \cdot J_{n} \cdot z_{2} . & (5.83) \end{array}$

But this is true if

$\begin{array}{l} A^{T} \cdot J_{n} \cdot A = J_{n}, & (5.84) \end{array}$

which is equivalent to the condition that A is symplectic. (If a matrix is symplectic then its transpose is symplectic. See exercise 5.6).
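The invariance of ω can also be checked with exact integer arithmetic for a simple case. In the Python sketch below the increments are (dq, dp) pairs with the zero time component left out, and the transformation is a hypothetical linear, time-independent point transformation q = M q′, p = p′ M⁻¹. The matrix `M`, its inverse, and the sample increments are assumptions chosen so that every intermediate value is an integer.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def omega(zeta1, zeta2):
    """omega of (5.70) for increments given as (dq, dp) pairs of tuples."""
    (q1, p1), (q2, p2) = zeta1, zeta2
    return dot(p2, q1) - dot(p1, q2)


# Hypothetical linear point transformation with an exact integer inverse.
M = [[2, 1], [1, 1]]
M_INV = [[1, -1], [-1, 2]]


def transform(zeta):
    """Map (dq', dp') to (dq, dp): dq = M dq' and dp = dp' M^{-1}."""
    dq, dp = zeta
    new_q = tuple(dot(row, dq) for row in M)            # column vector times M
    new_p = tuple(dot(dp, col) for col in zip(*M_INV))  # row vector times M^{-1}
    return new_q, new_p


z1 = ((1, 2), (3, -1))
z2 = ((0, 5), (4, 7))
print(omega(z1, z2), omega(transform(z1), transform(z2)))
```

Both printed values are the same integer, as the invariance requires. The function `omega` is antisymmetric by construction, so swapping the arguments flips the sign.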

The symplectic condition is symmetrical in that if A is symplectic then $A^{T}$ is symplectic, because the symplectic unit is invertible. The canonical condition

$\begin{array}{l} J = D C_{H} (s') \cdot J \cdot {(D C_{H} (s'))}^{T}, & (5.85) \end{array}$

is satisfied by time-varying canonical transformations, and time-varying canonical transformations are symplectic. But if the transformation is time varying then

$\begin{array}{l} J = {(D C_{H} (s'))}^{T} \cdot J \cdot D C_{H} (s'), & (5.86) \end{array}$

is not satisfied because J is not invertible. Equation (5.86) is satisfied, however, for time-independent transformations.

Poincaré integral invariant

The invariance of the symplectic 2-form under canonical transformations has a simple interpretation. Consider how the area of an incremental parallelogram in phase space transforms under canonical transformation. Let (Δq, Δp) and (δq, δp) be small increments in phase space, originating at (q, p). Consider the incremental parallelogram with vertex at (q, p) with these two phase-space increments as edges. The sum of the areas of the canonical projections of this incremental parallelogram can be written

$\begin{array}{l} \sum_{i} Δ A_{i} = \sum_{i} (Δ q^{i} δ p_{i} - Δ p_{i} δ q^{i}) . & (5.87) \end{array}$

The right-hand side is the sum of the areas on the canonical planes;¹⁷ for each i the area of a parallelogram is computed from the components of the vectors defining its adjacent sides. Let ζ₁ = (0, Δq, Δp) and ζ₂ = (0, δq, δp); then the sum of the areas of the incremental parallelograms is just

$\begin{array}{l} \sum_{i} Δ A_{i} = ω (ζ_{1}, ζ_{2}), & (5.88) \end{array}$

where ω is the bilinear antisymmetric function introduced in equation (5.70). The function ω is invariant under canonical transformations, so the sum of the areas of the incremental parallelograms is invariant under canonical transformations.

There is an integral version of this differential relation. Consider the oriented area of a region R′ in phase space (see figure 5.2). Suppose we make a canonical transformation from coordinates (q′, p′) to (q, p) taking region R′ to region R. The boundary of the region in the transformed coordinates is just the image under the canonical transformation of the original boundary. Let $R_{q^{i}, p_{i}}$ be the projection of the region R onto the qⁱ, p_i plane of coordinate qⁱ and conjugate momentum p_i, and let A_i be its area. Similarly, let $R_{q'^{i}, p_{i}^{'}}^{'}$ be the projection of R′ onto the $q'^{i}, p_{i}^{'}$ plane, and let $A_{i}^{'}$ be its area.

Figure 5.2: A region R′ in phase space is mapped by a canonical transformation C_H to a region R. The projections of region R onto the planes formed by canonical basis pairs q_j, p_j are R_j. The projections of R′ are $R_{j}^{'}$ . In general, the areas of the regions R and R′ are not the same, but the sums of the areas of the canonical plane projections are the same.

The area of an arbitrary region is just the limit of the sum of the areas of incremental parallelograms that cover the region, so the sum of oriented areas is preserved by canonical transformations:

$\begin{array}{l} \sum_{i} A_{i} = \sum_{i} A_{i}^{'} . & (5.89) \end{array}$

That is, the sum of the projected areas on the canonical planes is preserved by canonical transformations. Another way to say this is

$\begin{array}{l} \sum_{i} \int_{R_{q^{i}, p_{i}}} d q^{i} d p_{i} = \sum_{i} \int_{R_{q'^{i}, p_{i}^{'}}^{'}} d q'^{i} d p_{i}^{'} . & (5.90) \end{array}$

The equality-of-areas relation (5.90) can also be written as an equality of line integrals using Stokes's theorem, for simply-connected regions $R_{q^{i}, p_{i}}$ and $R_{q'^{i}, p_{i}^{'}}^{'}$ :

$\begin{array}{l} \sum_{i} \oint_{\partial R_{q^{i}, p_{i}}} p_{i} d q^{i} = \sum_{i} \oint_{\partial R_{q'^{i}, p_{i}^{'}}^{'}} p_{i}^{'} d q'^{i} . & (5.91) \end{array}$

The canonical planes are disjoint except at the origin, so the projected areas intersect in at most one point. Thus we may independently accumulate the line integrals around the boundaries of the individual projections of the region onto the canonical planes into a line integral around the unprojected region:

$\begin{array}{l} \oint_{\partial R} \sum_{i} p_{i} d q^{i} = \oint_{\partial R'} \sum_{i} p_{i}^{'} d q'^{i} . & (5.92) \end{array}$
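For one degree of freedom the equality of line integrals can be checked numerically. The Python sketch below takes a small circle in the (θ, I) plane as the boundary ∂R′, maps it through the polar-canonical transformation (5.63), and accumulates both integrals with a uniform rule in the curve parameter. The center, radius, value of `ALPHA`, and number of sample points are assumptions. The circle is deliberately chosen so that it does not wrap around in θ.

```python
import math

ALPHA = 2.0                          # sample parameter of the map (5.63)
THETA0, I0, RADIUS = 1.0, 2.0, 0.5   # hypothetical closed curve in (theta, I)


def loop_integrals(n=2000):
    """Return (oint I dtheta, oint p dx) around the circle and its image."""
    h = 2 * math.pi / n
    primed = unprimed = 0.0
    for k in range(n):
        s = k * h
        theta = THETA0 + RADIUS * math.cos(s)
        I = I0 + RADIUS * math.sin(s)
        dtheta = -RADIUS * math.sin(s)   # d(theta)/ds
        dI = RADIUS * math.cos(s)        # dI/ds
        p = math.sqrt(2 * I * ALPHA) * math.cos(theta)
        # dx/ds by the chain rule applied to x = sqrt(2 I / alpha) sin(theta)
        dx = (math.sqrt(2 * I / ALPHA) * math.cos(theta) * dtheta
              + math.sin(theta) / math.sqrt(2 * I * ALPHA) * dI)
        primed += I * dtheta * h
        unprimed += p * dx * h
    return primed, unprimed


primed, unprimed = loop_integrals()
# Both integrals equal minus the enclosed area for this orientation.
print(primed, unprimed, -math.pi * RADIUS ** 2)
```

The sign reflects the orientation in which the curve is traversed; what matters for the invariant is that the two integrals agree.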

Exercise 5.10: Watch out

Consider the canonical transformation C_H:

$(t, x, p) = C_{H} (t, θ, J) = (t, \sqrt{2 (J + a)} \sin θ, \sqrt{2 (J + a)} \cos θ) .$

a. Show that the transformation is symplectic for any a.

b. Show that equation (5.92) is not generally satisfied for the region enclosed by a curve of constant J.

5.4 Generating Functions

We have considered a number of properties of general canonical transformations without having a method for coming up with them. Here we introduce the method of generating functions. The generating function is a real-valued function that compactly specifies a canonical transformation through its partial derivatives, as follows.

Consider a real-valued function F₁(t, q, q′) mapping configurations expressed in two coordinate systems to the reals. We will use F₁ to construct a canonical transformation from one coordinate system to the other. We will show that the following relations among the coordinates, the momenta, and the Hamiltonians specify a canonical transformation:

$\begin{array}{l} p = \partial_{1} F_{1} (t, q, q') & (5.93) \end{array}$

$\begin{array}{l} p' = - \partial_{2} F_{1} (t, q, q') & (5.94) \end{array}$

$\begin{array}{l} H' (t, q', p') - H (t, q, p) = \partial_{0} F_{1} (t, q, q') . & (5.95) \end{array}$

The transformation will then be explicitly given by solving for one set of variables in terms of the others: To obtain the primed variables in terms of the unprimed ones, let A be the inverse of ∂₁F₁ with respect to the third argument,

$\begin{array}{l} q' = A (t, q, \partial_{1} F_{1} (t, q, q')); & (5.96) \end{array}$

then

$\begin{array}{l} q' = A (t, q, p) & (5.97) \end{array}$

$\begin{array}{l} p' = - \partial_{2} F_{1} (t, q, A (t, q, p)) . & (5.98) \end{array}$

Let B be the coordinate part of the phase-space transformation q = B(t, q′, p′). This B is an inverse function of ∂₂F₁, satisfying

$\begin{array}{l} q = B (t, q', - \partial_{2} F_{1} (t, q, q')) . & (5.99) \end{array}$

Using B, we have

$\begin{array}{l} q = B (t, q', p') & (5.100) \end{array}$

$\begin{array}{l} p = \partial_{1} F_{1} (t, B (t, q', p'), q') . & (5.101) \end{array}$

To put the transformation in explicit form requires that the inverse functions A and B exist.
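To make the roles of A and B concrete, here is a Python sketch for a sample generating function of our own choosing, F₁(t, q, q′) = q q′ + q′³/3, for one degree of freedom. For this choice ∂₁F₁ = q′ and ∂₂F₁ = q + q′², so both inverses can be written down by hand. The function names are hypothetical.

```python
def F1_d1(t, q, qp):
    """partial_1 F_1 for the sample F_1(t, q, q') = q q' + q'^3 / 3."""
    return qp


def F1_d2(t, q, qp):
    """partial_2 F_1 for the same sample function."""
    return q + qp ** 2


def to_primed(t, q, p):
    """(q, p) -> (q', p') using A(t, q, p) = p, as in (5.97) and (5.98)."""
    qp = p                        # A inverts p = partial_1 F_1 = q'
    return qp, -F1_d2(t, q, qp)


def to_unprimed(t, qp, pp):
    """(q', p') -> (q, p) using B(t, q', p'), as in (5.100) and (5.101)."""
    q = -pp - qp ** 2             # B inverts p' = -(q + q'^2) for q
    return q, F1_d1(t, q, qp)


q, p = 0.75, -1.5
qp, pp = to_primed(0.0, q, p)
print((qp, pp), to_unprimed(0.0, qp, pp))   # the round trip recovers (q, p)
```

For a generating function whose derivatives cannot be inverted in closed form, A and B would have to be computed numerically, or might not exist at all.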

We can use the above relations to verify that some given transformation from one set of phase-space coordinates (q, p) with Hamiltonian function H(t, q, p) to another set (q′, p′) with Hamiltonian function H′(t, q′, p′) is canonical by finding an F₁(t, q, q′) such that the above relations are satisfied. We can also use arbitrarily chosen generating functions of type F₁ to generate new canonical transformations.

The polar-canonical transformation

The polar-canonical transformation (5.27) from coordinate and momentum (x, p_x) to new coordinate and new momentum (θ, I),

$\begin{matrix} x = \sqrt{\frac{2 I}{α}} \sin θ & (5.102) \end{matrix}$

$\begin{array}{l} p_{x} = \sqrt{2 I α} \cos θ, & (5.103) \end{array}$

introduced earlier, is canonical. This can also be demonstrated by finding a suitable F₁ generating function. The generating function satisfies a set of partial differential equations, (5.93) and (5.94):

$\begin{array}{l} p_{x} = \partial_{1} F_{1} (t, x, θ) & (5.104) \end{array}$

$\begin{array}{l} I = - \partial_{2} F_{1} (t, x, θ) . & (5.105) \end{array}$

Using relations (5.102) and (5.103), which specify the transformation, equation (5.104) can be rewritten

$\begin{array}{l} p_{x} = x α \cot θ = \partial_{1} F_{1} (t, x, θ), & (5.106) \end{array}$

which is easily integrated to yield

$\begin{array}{l} F_{1} (t, x, θ) = \frac{α}{2} x^{2} \cot θ + φ (t, θ), & (5.107) \end{array}$

where φ is some integration “constant” with respect to the first integration. Substituting this form for F₁ into the second partial differential equation (5.105), we find

$\begin{array}{l} I = - \partial_{2} F_{1} (t, x, θ) = \frac{α}{2} \frac{x^{2}}{{(\sin θ)}^{2}} - \partial_{1} φ (t, θ), & (5.108) \end{array}$

but if we set φ = 0 the desired relations are recovered. So the generating function

$\begin{array}{l} F_{1} (t, x, θ) = \frac{α}{2} x^{2} \cot θ & (5.109) \end{array}$

generates the polar-canonical transformation. This shows that this transformation is canonical.
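The relations (5.104) and (5.105) can be spot-checked numerically. The Python sketch below differentiates the generating function (5.109) by central differences and compares the result with the transformation (5.102) and (5.103). The value of `ALPHA`, the step size, and the sample point are assumptions for illustration.

```python
import math

ALPHA = 2.0   # sample value of the parameter alpha
STEP = 1e-6   # finite-difference step


def F1(x, theta):
    """Generating function (5.109); it has no explicit time dependence."""
    return 0.5 * ALPHA * x * x / math.tan(theta)


def momenta_from_F1(x, theta):
    """Return (p_x, I) from (5.104) and (5.105) by central differences."""
    p_x = (F1(x + STEP, theta) - F1(x - STEP, theta)) / (2 * STEP)
    I = -(F1(x, theta + STEP) - F1(x, theta - STEP)) / (2 * STEP)
    return p_x, I


theta, I = 0.9, 1.7
x = math.sqrt(2 * I / ALPHA) * math.sin(theta)            # (5.102)
p_expected = math.sqrt(2 * I * ALPHA) * math.cos(theta)   # (5.103)
print(momenta_from_F1(x, theta), (p_expected, I))
```

The sample point avoids θ values where sin θ vanishes, since cot θ is singular there.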

5.4.1 F₁ Generates Canonical Transformations

We can prove directly that the transformation generated by an F₁ is canonical by showing that if Hamilton's equations are satisfied in one set of coordinates then they will be satisfied in the other set of coordinates. Let F₁ take arguments (t, x, y). The relations among the coordinates are

$\begin{array}{l} p_{x} = \partial_{1} F_{1} (t, x, y) \\ p_{y} = - \partial_{2} F_{1} (t, x, y) & (5.110) \end{array}$

and the Hamiltonians are related by

$\begin{array}{l} H' (t, y, p_{y}) = H (t, x, p_{x}) + \partial_{0} F_{1} (t, x, y) . & (5.111) \end{array}$

Substituting the generating function relations (5.110) into this equation, we have

$\begin{array}{l} H' (t, y, - \partial_{2} F_{1} (t, x, y)) \\ = H (t, x, \partial_{1} F_{1} (t, x, y)) + \partial_{0} F_{1} (t, x, y) . & (5.112) \end{array}$

Take the partial derivatives of this equality of expressions with respect to the variables x and y:¹⁸

$\begin{array}{l} - {(\partial_{2} H')}^{j} {(\partial_{1} {(\partial_{2} F_{1})}_{j})}_{i} \\ = {(\partial_{1} H)}_{i} + {(\partial_{2} H)}^{j} {(\partial_{1} {(\partial_{1} F_{1})}_{j})}_{i} + {(\partial_{1} \partial_{0} F_{1})}_{i} \\ {(\partial_{1} H')}_{i} - {(\partial_{2} H')}^{j} {(\partial_{2} {(\partial_{2} F_{1})}_{j})}_{i} \\ = {(\partial_{2} H)}^{j} {(\partial_{2} {(\partial_{1} F_{1})}_{j})}_{i} + {(\partial_{2} \partial_{0} F_{1})}_{i} & (5.113) \end{array}$

where the arguments are unambiguous and have been suppressed. On solution paths we can use Hamilton's equations for the (x, p_x) system to replace the partial derivatives of H with derivatives of x and p_x, obtaining

$\begin{array}{l} - {(\partial_{2} H')}^{j} {(\partial_{1} {(\partial_{2} F_{1})}_{j})}_{i} \\ = - {(D p_{x})}_{i} + {(D x)}^{j} {(\partial_{1} {(\partial_{1} F_{1})}_{j})}_{i} + {(\partial_{1} \partial_{0} F_{1})}_{i} \\ {(\partial_{1} H')}_{i} - {(\partial_{2} H')}^{j} {(\partial_{2} {(\partial_{2} F_{1})}_{j})}_{i} \\ = {(D x)}^{j} {(\partial_{2} {(\partial_{1} F_{1})}_{j})}_{i} + {(\partial_{2} \partial_{0} F_{1})}_{i} . & (5.114) \end{array}$

Now compute the derivatives of p_x and p_y, from equations (5.110), along consistent paths:

$\begin{array}{l} {(D p_{x})}_{i} = {(\partial_{1} {(\partial_{1} F_{1})}_{i})}_{j} {(D x)}^{j} + {(\partial_{2} {(\partial_{1} F_{1})}_{i})}_{j} {(D y)}^{j} + \partial_{0} {(\partial_{1} F_{1})}_{i} \\ {(D p_{y})}_{i} = - {(\partial_{1} {(\partial_{2} F_{1})}_{i})}_{j} {(D x)}^{j} - {(\partial_{2} {(\partial_{2} F_{1})}_{i})}_{j} {(D y)}^{j} - \partial_{0} {(\partial_{2} F_{1})}_{i} . & (5.115) \end{array}$

Using the fact that elementary partials commute, (∂₂(∂₁F₁)_i)_j = (∂₁(∂₂F₁)_j)_i, and substituting this expression for (Dp_x)_i into the first of equations (5.114) yields

$\begin{array}{l} - {(\partial_{2} H')}^{j} {(\partial_{1} {(\partial_{2} F_{1})}_{j})}_{i} = - {(\partial_{1} {(\partial_{2} F_{1})}_{j})}_{i} {(D y)}^{j} . & (5.116) \end{array}$

Provided that ∂₂∂₁F₁ is nonsingular,¹⁹ we have derived one of Hamilton's equations for the (y, p_y) system:

$\begin{array}{l} D y (t) = \partial_{2} H' (t, y (t), p_{y} (t)) . & (5.117) \end{array}$

Hamilton's other equation,

$\begin{array}{l} D p_{y} (t) = - \partial_{1} H' (t, y (t), p_{y} (t)), & (5.118) \end{array}$

can be derived in a similar way. So the generating function relations indeed specify a canonical transformation.

5.4.2 Generating Functions and Integral Invariants

Generating functions can be used to specify a canonical transformation by the prescription given above. Here we show how to get a generating function from a canonical transformation, and derive the generating function rules.

The generating function representation of canonical transformations can be derived from the Poincaré integral invariants, as follows. We first show that, given a canonical transformation, the integral invariants imply the existence of a function of phase-space coordinates that can be written as a path-independent line integral. Then we show that partial derivatives of this function, represented in mixed coordinates, give the generating function relations between the old and new coordinates. We need to do this only for time-independent transformations because time-dependent transformations become time independent in the extended phase space (see section 5.5).

Generating functions of type F₁

Let C be a time-independent canonical transformation, and let C_t be the qp-part of the transformation. The transformation C_t preserves the integral invariant equation (5.90). One way to express the equality of areas is as a line integral (5.92):

$\begin{array}{l} \oint_{\partial R} \sum_{i} p_{i} d q^{i} = \oint_{\partial R'} \sum_{i} p_{i}^{'} d q'^{i}, & (5.119) \end{array}$

where R′ is a two-dimensional region in (q′, p′) coordinates at time t, R = C_t(R′) is the corresponding region in (q, p) coordinates, and ∂R indicates the boundary of the region R. This holds for any region and its boundary. We will show that this implies there is a function F (t, q′, p′) that can be defined in terms of line integrals

$\begin{array}{l} F (t, q', p') - F (t, q_{0}^{'}, p_{0}^{'}) \\ = \int_{γ = C_{t} (γ')} \sum_{i} p_{i} d q^{i} - \int_{γ'} \sum_{i} p_{i}^{'} d q'^{i}, & (5.120) \end{array}$

where γ′ is a curve in phase-space coordinates that begins at $γ' (0) = (q_{0}^{'}, p_{0}^{'})$ and ends at γ′(1) = (q′, p′), and γ is its image under C_t.

Let

$\begin{array}{l} G_{t} (γ') = \int_{γ = C_{t} (γ')} \sum_{i} p_{i} d q^{i} - \int_{γ'} \sum_{i} p_{i}^{'} d q'^{i}, & (5.121) \end{array}$

and let $γ_{1}^{'}$ and $γ_{2}^{'}$ be two paths with the same endpoints. Then

$\begin{array}{l} G_{t} (γ_{2}^{'}) - G_{t} (γ_{1}^{'}) & = \oint_{\partial R} \sum p_{i} d q^{i} - \oint_{\partial R'} \sum p_{i}^{'} d q'^{i} \\ = 0. & (5.122) \end{array}$

So the value of G_t(γ′) depends only on the endpoints of γ′.

Let

$\begin{array}{l} {\bar{G}}_{t, q_{0}^{'}, p_{0}^{'}} (q', p') = G_{t} (γ'), & (5.123) \end{array}$

where γ′ is any path from $q_{0}^{'}, p_{0}^{'}$ to q′, p′. Changing the initial point from $q_{0}^{'}, p_{0}^{'}$ to $q_{1}^{'}, p_{1}^{'}$ changes the value of Ḡ by a constant:

$\begin{array}{l} {\bar{G}}_{t, q_{1}^{'}, p_{1}^{'}} (q', p') - {\bar{G}}_{t, q_{0}^{'}, p_{0}^{'}} (q', p') = {\bar{G}}_{t, q_{1}^{'}, p_{1}^{'}} (q_{0}^{'}, p_{0}^{'}) . & (5.124) \end{array}$

If we define F so that

$\begin{array}{l} F (t, q', p') = {\bar{G}}_{t, q_{1}^{'}, p_{1}^{'}} (q', p'), & (5.125) \end{array}$

then

$\begin{array}{l} F (t, q', p') - F (t, q_{0}^{'}, p_{0}^{'}) = {\bar{G}}_{t, q_{0}^{'}, p_{0}^{'}} (q', p'), & (5.126) \end{array}$

demonstrating equation (5.120).

The phase-space point (q, p) in unprimed variables corresponds to (q′, p′) in primed variables, at an arbitrary time t. Both p and q are determined given q′ and p′. In general, given any two of these four quantities, we can solve for the other two. If we can solve for the momenta in terms of the positions we get a particular class of generating functions.²⁰ We introduce the functions

$\begin{array}{l} p & = f_{p} (t, q, q') \\ p' & = f_{p'} (t, q, q') & (5.127) \end{array}$

that solve the transformation equations (t, q, p) = C(t, q′, p′) for the momenta in terms of the coordinates at a specified time. With these we introduce a function F₁(t, q, q′) such that

$\begin{array}{l} F_{1} (t, q, q') = F (t, q', f_{p'} (t, q, q')) . & (5.128) \end{array}$

The function F₁ has the same value as F but has different arguments. We will show that this F₁ is in fact the generating function for canonical transformations introduced in section 5.4. Let's be explicit about the definition of F₁ in terms of a line integral:

$\begin{array}{l} F_{1} (t, q, q') - F_{1} (t, q_{0}, q_{0}^{'}) \\ = \int_{q_{0}, q_{0}^{'}}^{q, q'} (f_{p} (t, q, q') d q - f_{p'} (t, q, q') d q') . & (5.129) \end{array}$

The two line integrals can be combined into this one because they are both expressed as integrals along a curve in (q, q′).

We can use the path independence of F₁ to compute the partial derivatives of F₁ with respect to particular components and consequently derive the generating function relations for the momenta.²¹ So we conclude that

$\begin{array}{l} {(\partial_{1} F_{1} (t, q, q'))}_{i} = f_{p_{i}} (t, q, q') & (5.130) \end{array}$

$\begin{array}{l} {(\partial_{2} F_{1} (t, q, q'))}_{i} = - f_{p_{i}^{'}} (t, q, q') . & (5.131) \end{array}$

These are just the configuration and momentum parts of the generating function relations for canonical transformation. So starting with a canonical transformation, we can find a generating function that gives the coordinate–momentum part of the transformation through its derivatives.

Starting from a general canonical transformation, we have constructed an F₁ generating function from which the canonical transformation may be rederived. So we expect there is a generating function for every canonical transformation.²²

Generating functions of type F₂

Point transformations were excluded from the previous argument because we could not deduce the momenta from the coordinates. However, a similar derivation allows us to make a generating function for this case. The integral invariants give us an equality of area integrals. There are other ways of writing the equality-of-areas relation (5.90) as a line integral. We can also write

$\begin{array}{l} \oint_{\partial R} \sum_{i} p_{i} d q^{i} = - \oint_{\partial R'} \sum_{i} q'^{i} d p_{i}^{'} . & (5.132) \end{array}$

The minus sign arises because by flipping the axes we are traversing the area in the opposite sense. Repeating the argument just given, we can define a function

$\begin{array}{l} F' (t, q', p') - F' (t, q_{0}^{'}, p_{0}^{'}) \\ = \int_{γ = C (t, γ')} \sum_{i} p_{i} d q^{i} + \int_{γ'} \sum_{i} q'^{i} d p_{i}^{'} & (5.133) \end{array}$

that is independent of the path γ′. If we can solve for q′ and p in terms of q and p′ we can define the functions

$\begin{array}{l} q' = f_{q'}^{'} (t, q, p') \\ p = f_{p}^{'} (t, q, p') & (5.134) \end{array}$

and define

$\begin{array}{l} F_{2} (t, q, p') = F' (t, f_{q'}^{'} (t, q, p'), p') . & (5.135) \end{array}$

Then the canonical transformation is given as partial derivatives of F₂:

$\begin{array}{l} {(\partial_{1} F_{2} (t, q, p'))}_{i} = f_{p_{i}}^{'} (t, q, p') & (5.136) \end{array}$

and

$\begin{array}{l} {(\partial_{2} F_{2} (t, q, p'))}^{i} = f_{q_{i}^{'}}^{'} (t, q, p') . & (5.137) \end{array}$

Relationship between F₁ and F₂

For canonical transformations that can be described by both an F₁ and an F₂, there must be a relation between them. The alternative line integral expressions for the area integral are related. Consider the difference

$\begin{array}{l} (F' (t, q', p') - F' (t, q_{0}^{'}, p_{0}^{'})) - (F (t, q', p') - F (t, q_{0}^{'}, p_{0}^{'})) \\ = \int_{γ'} \sum_{i} p_{i}^{'} d q'^{i} + \int_{γ'} \sum_{i} q'^{i} d p_{i}^{'} \\ = \int_{γ'} \sum_{i} d (p_{i}^{'} q'^{i}) \\ = \sum_{i} (p')_{i} {(q')}^{i} - \sum_{i} {(p_{0}^{'})}_{i} {(q_{0}^{'})}^{i} . & (5.138) \end{array}$

The functions F and F′ are related by an integrated term

$\begin{array}{l} F' (t, q', p') - F (t, q', p') = p' q', & (5.139) \end{array}$

as are F₁ and F₂:

$\begin{array}{l} F_{2} (t, q, p') - F_{1} (t, q, q') = p' q' . & (5.140) \end{array}$

The generating functions F₁ and F₂ are related by a Legendre transform:

$\begin{array}{l} p' = - \partial_{2} F_{1} (t, q, q') & (5.141) \end{array}$

$\begin{array}{l} p' q' = - F_{1} (t, q, q') + F_{2} (t, q, p') & (5.142) \end{array}$

$\begin{array}{l} q' = \partial_{2} F_{2} (t, q, p') . & (5.143) \end{array}$

We have passive variables q and t:

$\begin{array}{l} - \partial_{1} F_{1} (t, q, q') + \partial_{1} F_{2} (t, q, p') = 0 & (5.144) \end{array}$

$\begin{array}{l} - \partial_{0} F_{1} (t, q, q') + \partial_{0} F_{2} (t, q, p') = 0. & (5.145) \end{array}$

But p = ∂₁F₁(t, q, q′) from the first transformation, so

$\begin{array}{l} p = \partial_{1} F_{2} (t, q, p') . & (5.146) \end{array}$

Furthermore, since H′(t, q′, p′) − H(t, q, p) = ∂₀F₁(t, q, q′) we can conclude that

$\begin{array}{l} H' (t, q', p') - H (t, q, p) = \partial_{0} F_{2} (t, q, p') . & (5.147) \end{array}$
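The Legendre-transform relation between F₁ and F₂ can be illustrated with a sample generating function of our own choosing, F₁(t, q, q′) = (q − q′)²/(2τ), for one degree of freedom. Solving p′ = −∂₂F₁ for q′ gives q′ = q − τp′, and then (5.142) gives F₂(t, q, p′) = p′q − τp′²/2. The Python sketch below checks this and the relations (5.143) and (5.146) numerically. The constant `TAU`, the function names, and the sample point are assumptions.

```python
TAU = 0.5   # hypothetical constant in the sample generating function


def F1(q, qp):
    """Sample type-1 generating function (q - q')^2 / (2 tau)."""
    return (q - qp) ** 2 / (2 * TAU)


def q_primed(q, pp):
    """Solve p' = -partial_2 F_1 = (q - q') / tau for q'."""
    return q - TAU * pp


def F2(q, pp):
    """F_2 from (5.142): F_2 = F_1 + p' q', with q' eliminated."""
    qp = q_primed(q, pp)
    return F1(q, qp) + pp * qp


def F2_closed_form(q, pp):
    """The same F_2 worked out by hand: p' q - tau p'^2 / 2."""
    return pp * q - TAU * pp ** 2 / 2


q, pp, h = 1.25, -0.5, 1e-6
p_from_F2 = (F2(q + h, pp) - F2(q - h, pp)) / (2 * h)    # (5.146)
qp_from_F2 = (F2(q, pp + h) - F2(q, pp - h)) / (2 * h)   # (5.143)
print(F2(q, pp), F2_closed_form(q, pp), p_from_F2, qp_from_F2)
```

Here p from ∂₁F₂ agrees with p = ∂₁F₁ = (q − q′)/τ, and q′ from ∂₂F₂ agrees with the value obtained by inverting the F₁ relation.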

5.4.3 Types of Generating Functions

We have used generating functions of the form F₁(t, q, q′) to construct canonical transformations:

$\begin{array}{l} p = \partial_{1} F_{1} (t, q, q') & (5.148) \end{array}$

$\begin{array}{l} p' = - \partial_{2} F_{1} (t, q, q') & (5.149) \end{array}$

$\begin{array}{l} H' (t, q', p') - H (t, q, p) = \partial_{0} F_{1} (t, q, q') . & (5.150) \end{array}$

We can also construct canonical transformations with generating functions of the form F₂(t, q, p′), where the third argument of F₂ is the momentum in the primed system.²³

$\begin{array}{l} p = \partial_{1} F_{2} (t, q, p') & (5.151) \end{array}$

$\begin{array}{l} q' = \partial_{2} F_{2} (t, q, p') & (5.152) \end{array}$

$\begin{array}{l} H' (t, q', p') - H (t, q, p) = \partial_{0} F_{2} (t, q, p') & (5.153) \end{array}$

As in the F₁ case, to put the transformation in explicit form requires that appropriate inverse functions be constructed to allow the solution of the equations.

Similarly, we can construct two other forms for generating functions, named mnemonically enough F₃ and F₄:

$\begin{array}{l} q = - \partial_{1} F_{3} (t, p, q') & (5.154) \end{array}$

$\begin{array}{l} p' = - \partial_{2} F_{3} (t, p, q') & (5.155) \end{array}$

$\begin{array}{l} H' (t, q', p') - H (t, q, p) = \partial_{0} F_{3} (t, p, q') & (5.156) \end{array}$

and

$\begin{array}{l} q = - \partial_{1} F_{4} (t, p, p') & (5.157) \end{array}$

$\begin{array}{l} q' = \partial_{2} F_{4} (t, p, p') & (5.158) \end{array}$

$\begin{matrix} H' (t, q', p') - H (t, q, p) = \partial_{0} F_{4} (t, p, p') & (5.159) \end{matrix}$

These four classes of generating functions are called mixed-variable generating functions because the canonical transformations they generate give a mixture of old and new variables in terms of a mixture of old and new variables.

In every case, if the generating function does not depend explicitly on time then the Hamiltonians are obtained from one another purely by composition with the appropriate canonical transformation. If the generating function depends on time, then there are additional terms.

The generating functions presented each treat the coordinates and momenta collectively. One could define more complicated generating functions for which the transformations of different degrees of freedom are specified by generating functions of different types.
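The recipe in equations (5.148) and (5.149) can be checked numerically on a small example. The sketch below assumes the familiar textbook generating function F₁(q, q′) = (α/2) q² cot q′, which is not taken from this section; the constant `ALPHA` and all function names are illustrative. It compares p and p′ with finite-difference derivatives of F₁ and confirms that the Poisson bracket of the resulting map is close to 1.

```python
import math

ALPHA = 2.0  # hypothetical positive constant in the example generating function

def F1(q, q_new):
    """Example type-1 generating function F1(q, q') = (ALPHA/2) q^2 cot(q')."""
    return 0.5 * ALPHA * q * q / math.tan(q_new)

def old_from_new(q_new, p_new):
    """Explicit map (q', p') -> (q, p), found by solving (5.148)-(5.149) by hand."""
    q = math.sqrt(2.0 * p_new / ALPHA) * math.sin(q_new)
    p = math.sqrt(2.0 * ALPHA * p_new) * math.cos(q_new)
    return q, p

def partial(f, args, i, h=1e-6):
    """Central-difference partial derivative of f with respect to argument i."""
    up, dn = list(args), list(args)
    up[i] += h
    dn[i] -= h
    return (f(*up) - f(*dn)) / (2.0 * h)

def poisson_bracket(q_new, p_new):
    """{q, p} taken with respect to (q', p'); equals 1 for a canonical map."""
    q_of = lambda a, b: old_from_new(a, b)[0]
    p_of = lambda a, b: old_from_new(a, b)[1]
    at = (q_new, p_new)
    return (partial(q_of, at, 0) * partial(p_of, at, 1)
            - partial(q_of, at, 1) * partial(p_of, at, 0))

q_new, p_new = 0.7, 1.3
q, p = old_from_new(q_new, p_new)
print(p, partial(F1, (q, q_new), 0))        # p  versus  d1 F1
print(p_new, -partial(F1, (q, q_new), 1))   # p' versus -d2 F1
print(poisson_bracket(q_new, p_new))        # should be close to 1
```

The explicit map had to be worked out by inverting the mixed-variable equations, which is the step the text warns may require constructing inverse functions.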

5.4.4 Point Transformations

Point transformations can be represented in terms of a generating function of type F₂. Equations (5.6), which define a canonical point transformation derived from a coordinate transformation F, are

$\begin{array}{l} (t, q, p) = C (t, q', p') = (t, F (t, q'), p' {(\partial_{1} F (t, q'))}^{- 1}) . & (5.160) \end{array}$

Let S be the inverse transformation of F with respect to the second argument

$\begin{array}{l} q' = S (t, q), & (5.161) \end{array}$

so that q′ = S(t, F (t, q′)). The momentum transformation that accompanies this coordinate transformation is

$\begin{array}{l} p' = p {(\partial_{1} S (t, q))}^{- 1} . & (5.162) \end{array}$

We can find the generating function F₂ that gives this transformation by integrating equation (5.152) to get

$\begin{array}{l} F_{2} (t, q, p') = p' S (t, q) + φ (t, q) . & (5.163) \end{array}$

Substituting this into equation (5.151), we get

$\begin{array}{l} p = p' \partial_{1} S (t, q) + \partial_{1} φ (t, q) . & (5.164) \end{array}$

We do not need the freedom provided by φ, so we can set it equal to zero:

$\begin{array}{l} F_{2} (t, q, p') = p' S (t, q), & (5.165) \end{array}$

with

$\begin{array}{l} p = p' \partial_{1} S (t, q) . & (5.166) \end{array}$

So this F₂ gives the canonical transformation of equations (5.161) and (5.162).

The canonical transformation for the coordinate transformation S is the inverse of the canonical transformation for F. By design F and S are inverses on the coordinate arguments. The identity function is q′ = I(q′) = S(t, F (t, q′)). Differentiating yields

$\begin{array}{l} 1 = \partial_{1} S (t, F (t, q')) \partial_{1} F (t, q'), & (5.167) \end{array}$

$\begin{array}{l} \partial_{1} F (t, q') = {(\partial_{1} S (t, F (t, q')))}^{- 1} . & (5.168) \end{array}$

Using this, the relation between the momenta (5.166) is

$\begin{array}{l} p = p' {(\partial_{1} F (t, q'))}^{- 1}, & (5.169) \end{array}$

showing that F₂ gives a point transformation equivalent to the point transformation (5.160). So from this other point of view the point transformation is canonical.

The F₁ that corresponds to the F₂ for a point transformation is

$\begin{array}{rll} F_{1} (t, q, q') & = F_{2} (t, q, p') - p' q' \\ & = p' S (t, q) - p' q' \\ & = 0. & (5.170) \end{array}$

This is why we could not use generating functions of type F₁ to construct point transformations.

Polar and rectangular coordinates

A commonly required point transformation is the transition between polar coordinates and rectangular coordinates:

$\begin{array}{l} x = r \cos θ & (5.171) \\ y = r \sin θ . \end{array}$

Using the formula for the generating function of a point transformation just derived, we find:

$\begin{array}{ll} F_{2} (t; r, θ; p_{x}, p_{y}) = \begin{bmatrix} p_{x} & p_{y} \end{bmatrix} \begin{pmatrix} r \cos θ \\ r \sin θ \end{pmatrix} . & (5.172) \end{array}$

So the full transformation is derived:

$\begin{array}{rll} (x, y) & = \partial_{2} F_{2} (t; r, θ; p_{x}, p_{y}) \\ & = (r \cos θ, r \sin θ) \\ [p_{r}, p_{θ}] & = \partial_{1} F_{2} (t; r, θ; p_{x}, p_{y}) \\ & = [p_{x} \cos θ + p_{y} \sin θ, - p_{x} r \sin θ + p_{y} r \cos θ] . & (5.173) \end{array}$

We can isolate the rectangular coordinates to one side of the transformation and the polar coordinates to the other:

$\begin{array}{l} p_{r} = \frac{1}{r} (p_{x} x + p_{y} y) \\ p_{θ} = - p_{x} y + p_{y} x . & (5.174) \end{array}$

So, interpreted in terms of Newtonian vectors, $p_{r} = \hat{r} \cdot \vec{p}$ is the radial component of the linear momentum and $p_{θ} = ‖ \vec{r} \times \vec{p} ‖$ is the magnitude of the angular momentum. The point transformation is time independent, so the Hamiltonian transforms by composition.
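The momentum relations can be checked numerically. The following minimal sketch differentiates the generating function (5.172) by central differences and compares the result with equation (5.174); the function names and the sample point are illustrative assumptions.

```python
import math

def F2(r, theta, px, py):
    """Generating function (5.172): [px, py] applied to (r cos(theta), r sin(theta))."""
    return px * r * math.cos(theta) + py * r * math.sin(theta)

def polar_momenta(r, theta, px, py, h=1e-6):
    """[p_r, p_theta] = d1 F2, estimated by central differences."""
    p_r = (F2(r + h, theta, px, py) - F2(r - h, theta, px, py)) / (2 * h)
    p_theta = (F2(r, theta + h, px, py) - F2(r, theta - h, px, py)) / (2 * h)
    return p_r, p_theta

r, theta, px, py = 2.0, 0.6, 0.3, -1.1
x, y = r * math.cos(theta), r * math.sin(theta)
print(polar_momenta(r, theta, px, py))               # from the generating function
print(((px * x + py * y) / r, -px * y + py * x))     # from equation (5.174)
```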

Rotating coordinates

A useful time-dependent point transformation is the transition to a rotating coordinate system. This is most easily accomplished in polar coordinates. Here we have

$\begin{array}{l} r' = r \\ θ' = θ - Ω t, & (5.175) \end{array}$

where Ω is the angular velocity of the rotating coordinate system. The generating function is

$\begin{array}{ll} F_{2} (t; r, θ; p_{r}^{'}, p_{θ}^{'}) = \begin{bmatrix} p_{r}^{'} & p_{θ}^{'} \end{bmatrix} \begin{pmatrix} r \\ θ - Ω t \end{pmatrix} . & (5.176) \end{array}$

This yields the transformation equations

$\begin{array}{l} r' & = r \\ θ' & = θ - Ω t \\ p_{r} & = p_{r}^{'} \\ p_{θ} & = p_{θ}^{'}, & (5.177) \end{array}$

which show that the momenta are the same in both coordinate systems. However, here the Hamiltonian is not a simple composition:

$\begin{array}{l} H' (t; r', θ'; p_{r}^{'}, p_{θ}^{'}) = H (t; r', θ' + Ω t; p_{r}^{'}, p_{θ}^{'}) - p_{θ}^{'} Ω . & (5.178) \end{array}$

The Hamiltonians differ by the derivative of the generating function with respect to the time argument. In transforming to rotating coordinates, the values of the Hamiltonians differ by the product of the angular momentum and the angular velocity of the coordinate system. Notice that this addition to the Hamiltonian is the same as was found earlier (5.45).
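One consequence of the extra term is that the angular rate in the rotating frame is lower than the inertial rate by exactly Ω. The sketch below illustrates this for an example Hamiltonian with V(r) = −1/r; the potential, the constants `M_PART` and `OMEGA`, and the function names are assumptions made only for this illustration.

```python
M_PART, OMEGA = 1.0, 0.5   # hypothetical particle mass and frame rotation rate

def H(t, r, theta, p_r, p_theta):
    """Example inertial-frame Hamiltonian in polar coordinates, V(r) = -1/r."""
    return p_r**2 / (2 * M_PART) + p_theta**2 / (2 * M_PART * r**2) - 1.0 / r

def H_rot(t, r, theta_rot, p_r, p_theta):
    """Rotating-frame Hamiltonian built from equation (5.178)."""
    return H(t, r, theta_rot + OMEGA * t, p_r, p_theta) - p_theta * OMEGA

def d_dp_theta(ham, t, r, theta, p_r, p_theta, h=1e-6):
    """Central-difference estimate of the angular rate d(ham)/d(p_theta)."""
    return (ham(t, r, theta, p_r, p_theta + h)
            - ham(t, r, theta, p_r, p_theta - h)) / (2 * h)

t, r, theta, p_r, p_theta = 0.4, 1.5, 0.2, 0.1, 1.2
rate_inertial = d_dp_theta(H, t, r, theta, p_r, p_theta)
rate_rotating = d_dp_theta(H_rot, t, r, theta - OMEGA * t, p_r, p_theta)
print(rate_inertial - rate_rotating)   # should be close to OMEGA
```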

Reducing the two-body problem to the one-body problem

In this example we illustrate how canonical transformations can be used to eliminate some of the degrees of freedom, leaving a problem with fewer degrees of freedom.

Suppose that only certain combinations of the coordinates appear in the Hamiltonian. We make a canonical transformation to a new set of phase-space coordinates such that these combinations of the old phase-space coordinates are some of the new phase-space coordinates. We choose other independent combinations of the coordinates to complete the set. The advantage is that these other independent coordinates do not appear in the new Hamiltonian, so the momenta conjugate to them are conserved quantities.

Let's see how this idea enables us to reduce the problem of two gravitating bodies to the simpler problem of the relative motion of the two bodies. In the process we will discover that the momentum of the center of mass is conserved. This simpler problem is an instance of the Kepler problem. The Kepler problem is also encountered in the formulation of the more general n-body problem.

Consider the motion of two masses m₁ and m₂, subject only to a mutual gravitational attraction described by the potential V (r). This problem has six degrees of freedom. The rectangular coordinates of the particles are x₁ and x₂, with conjugate momenta p₁ and p₂. Each of these is a structure of the three rectangular components. The distance between the particles is r = ‖x₁ − x₂‖. The Hamiltonian for the two-body problem is

$\begin{array}{l} H (t; x_{1}, x_{2}; p_{1}, p_{2}) = \frac{p_{1}^{2}}{2 m_{1}} + \frac{p_{2}^{2}}{2 m_{2}} + V (r) . & (5.179) \end{array}$

The gravitational potential energy depends only on the relative positions of the two bodies. We do not need to specify V further at this point.

Since the only combination of coordinates that appears in the Hamiltonian is x₂ − x₁, we choose new coordinates so that one of the new coordinates is this combination:

$\begin{array}{l} x = x_{2} - x_{1} . & (5.180) \end{array}$

To complete the set of new coordinates we choose another to be some independent linear combination

$\begin{array}{l} X = a x_{1} + b x_{2}, & (5.181) \end{array}$

where a and b are to be determined. We can use an F₂-type generating function

$\begin{array}{l} F_{2} (t; x_{1}, x_{2}; p, P) = (x_{2} - x_{1}) p + (a x_{1} + b x_{2}) P, & (5.182) \end{array}$

where p and P will be the new momenta conjugate to x and X, respectively. We deduce

$\begin{array}{l} (x, X) = \partial_{2} F_{2} (t; x_{1}, x_{2}; p, P) = (x_{2} - x_{1}, a x_{1} + b x_{2}) \\ [p_{1}, p_{2}] = \partial_{1} F_{2} (t; x_{1}, x_{2}; p, P) = [- p + a P, p + b P] . & (5.183) \end{array}$

We can solve these for the new momenta:

$\begin{array}{l} P = \frac{p_{1} + p_{2}}{a + b} & (5.184) \end{array}$

$\begin{array}{l} p = \frac{a p_{2} - b p_{1}}{a + b} . & (5.185) \end{array}$

The generating function is not time dependent, so the new Hamiltonian is the old Hamiltonian composed with the transformation:

$\begin{array}{rll} H' (t; x, X; p, P) & = \frac{{(- p + a P)}^{2}}{2 m_{1}} + \frac{{(p + b P)}^{2}}{2 m_{2}} + V (‖ x ‖) \\ & = \frac{p^{2}}{2 m} + \frac{P^{2}}{2 M} + V (‖ x ‖) + (\frac{b}{m_{2}} - \frac{a}{m_{1}}) p P, & (5.186) \end{array}$

with the definitions

$\begin{array}{l} \frac{1}{m} = \frac{1}{m_{1}} + \frac{1}{m_{2}} & (5.187) \end{array}$

and

$\begin{array}{l} \frac{1}{M} = \frac{a^{2}}{m_{1}} + \frac{b^{2}}{m_{2}} . & (5.188) \end{array}$

We recognize m as the “reduced mass.”

Notice that if the term proportional to pP were not present then the x and X degrees of freedom would not be coupled at all, and furthermore, the X part of the Hamiltonian would be just the Hamiltonian of a free particle, which is easy to solve. The condition that the “cross terms” disappear is

$\begin{array}{l} \frac{b}{m_{2}} - \frac{a}{m_{1}} = 0, & (5.189) \end{array}$

which is satisfied by

$\begin{array}{l} a = c m_{1} \\ b = c m_{2} & (5.190) \end{array}$

for any c. For a transformation to be defined, c must be nonzero. So with this choice the Hamiltonian becomes

$\begin{array}{l} H' (t; x, X; p, P) = H_{X} (t, X, P) + H_{x} (t, x, p) & (5.191) \end{array}$

with

$\begin{array}{l} H_{x} (t, x, p) = \frac{p^{2}}{2 m} + V (r) & (5.192) \end{array}$

and

$\begin{array}{l} H_{X} (t, X, P) = \frac{P^{2}}{2 M} . & (5.193) \end{array}$

The reduced mass is the same as before, and now

$\begin{array}{l} M = \frac{1}{c^{2} (m_{1} + m_{2})} . & (5.194) \end{array}$

Notice that, without further specifying c, the problem has been separated into the problem of determining the relative motion of the two masses, and the problem of the other degrees of freedom. We did not need a priori knowledge that the center of mass might be important; in fact, only for a particular choice of c = (m₁ + m₂)⁻¹ does X become the center of mass.
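The separation can be confirmed numerically. The sketch below applies equations (5.180) through (5.185) with a = c m₁ and b = c m₂ and checks that the kinetic energy splits into relative and X parts with no cross term; the function names and sample values are hypothetical.

```python
def reduce_two_body(m1, m2, x1, x2, p1, p2, c):
    """Apply (5.180)-(5.185) with a = c*m1, b = c*m2 to 3-component tuples."""
    a, b = c * m1, c * m2
    x = tuple(u2 - u1 for u1, u2 in zip(x1, x2))
    X = tuple(a * u1 + b * u2 for u1, u2 in zip(x1, x2))
    P = tuple((q1 + q2) / (a + b) for q1, q2 in zip(p1, p2))
    p = tuple((a * q2 - b * q1) / (a + b) for q1, q2 in zip(p1, p2))
    m = m1 * m2 / (m1 + m2)             # reduced mass, equation (5.187)
    M = 1.0 / (c * c * (m1 + m2))       # equation (5.194)
    return x, X, p, P, m, M

def square(v):
    """Squared length of a tuple of components."""
    return sum(u * u for u in v)

m1, m2, c = 3.0, 5.0, 0.37
x1, x2 = (0.1, -0.4, 2.0), (1.5, 0.3, -0.7)
p1, p2 = (0.2, 1.1, -0.5), (-0.9, 0.4, 0.8)
x, X, p, P, m, M = reduce_two_body(m1, m2, x1, x2, p1, p2, c)
kinetic_old = square(p1) / (2 * m1) + square(p2) / (2 * m2)
kinetic_new = square(p) / (2 * m) + square(P) / (2 * M)
print(kinetic_old, kinetic_new)   # should agree up to rounding
```

Calling `reduce_two_body` with `c = 1 / (m1 + m2)` returns the center of mass as `X`.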

Epicyclic motion

It is often useful to compose a sequence of canonical transformations to make up the transformation we need for any particular mechanical problem. The transformations we have supplied are especially useful as components in these computations.

We will illustrate the use of canonical transformations to learn about planar motion in a central field. The strategy will be to consider perturbations of circular motion in the central field. The analysis will proceed by transforming to a rotating coordinate system that rides on a circular reference orbit, and then making approximations that restrict the analysis to orbits that differ from the circular orbit only slightly.

In rectangular coordinates we can easily write a Hamiltonian for the motion of a particle of mass m in a field defined by a potential energy that is a function only of the distance from the origin as follows:

$\begin{matrix} H (t; x, y; p_{x}, p_{y}) = \frac{p_{x}^{2} + p_{y}^{2}}{2 m} + V (\sqrt{x^{2} + y^{2}}) . & (5.195) \end{matrix}$

In this coordinate system Hamilton's equations are easy, and they are exactly what is needed to develop trajectories by numerical integration, but the expressions are not very illuminating:

$D x \begin{matrix} = \frac{p_{x}}{m} & (5.196) \end{matrix}$

$D y \begin{matrix} = \frac{p_{y}}{m} & (5.197) \end{matrix}$

$D p_{x} \begin{matrix} = - D V (\sqrt{x^{2} + y^{2}}) \frac{x}{\sqrt{x^{2} + y^{2}}} & (5.198) \end{matrix}$

$D p_{y} \begin{matrix} = - D V (\sqrt{x^{2} + y^{2}}) \frac{y}{\sqrt{x^{2} + y^{2}}} . & (5.199) \end{matrix}$

We can learn more by converting to polar coordinates centered on the source of our field:

$\begin{matrix} x = r \cos φ & (5.200) \end{matrix}$

$\begin{matrix} y = r \sin φ . & (5.201) \end{matrix}$

This coordinate system explicitly incorporates the geometrical symmetry of the potential energy. Extending this coordinate transformation to a point transformation, we can write the new Hamiltonian as:

$\begin{matrix} H' (t; r, φ; p_{r}, p_{φ}) = \frac{p_{r}^{2}}{2 m} + \frac{p_{φ}^{2}}{2 m r^{2}} + V (r) . & (5.202) \end{matrix}$

We can now write Hamilton's equations in these new coordinates, and they are much more illuminating than the equations expressed in rectangular coordinates:

$\begin{matrix} D r = \frac{p_{r}}{m} & (5.203) \end{matrix}$

$\begin{matrix} D φ = \frac{p_{φ}}{m r^{2}} & (5.204) \end{matrix}$

$\begin{matrix} D p_{r} = \frac{p_{φ}^{2}}{m r^{3}} - D V (r) & (5.205) \end{matrix}$

$\begin{matrix} D p_{φ} = 0. & (5.206) \end{matrix}$

The angular momentum p_φ is conserved, and we are free to choose its constant value, so Dφ depends only on r. We also see that we can establish a circular orbit at any radius R₀: we choose p_φ = p_φ0 so that $p_{φ 0}^{2} / (m R_{0}^{3}) - D V (R_{0}) = 0$ . This will ensure that Dp_r = 0, and thus Dr = 0. The square of the angular velocity of this circular orbit is

$Ω^{2} \begin{matrix} = \frac{D V (R_{0})}{m R_{0}} . & (5.207) \end{matrix}$
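As a small worked case, the sketch below evaluates the circular-orbit condition for an assumed potential V(r) = −k/r, for which equation (5.207) reduces to Kepler's third law, Ω² R₀³ = k/m. The function name and the numerical values are illustrative only, and the derivative DV must be positive at R₀ for the square root to be real.

```python
import math

def circular_orbit(m, R0, DV):
    """Return (Omega, p_phi0) for a circular orbit of radius R0.

    DV is a function giving the derivative of the potential energy V.
    """
    Omega = math.sqrt(DV(R0) / (m * R0))     # equation (5.207)
    p_phi0 = m * R0**2 * Omega               # from D phi = p_phi / (m r^2)
    return Omega, p_phi0

k, m, R0 = 4.0, 2.0, 1.5                     # hypothetical values, V(r) = -k/r
DV = lambda r: k / r**2
Omega, p_phi0 = circular_orbit(m, R0, DV)
print(p_phi0**2 / (m * R0**3) - DV(R0))      # D p_r on the orbit, close to 0
print(Omega**2 * R0**3, k / m)               # the two values should agree
```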

It is instructive to consider how orbits that are close to the circular orbit differ from the circular orbit. This is best done in rotating coordinates in which a body moving in the circular orbit is a stationary point at the origin. We can do this by converting to coordinates that are rotating with the circular orbit and centered on the orbiting body. We proceed in three stages. First we will transform to a polar coordinate system that is rotating at angular velocity Ω. Then we will return to rectangular coordinates, and finally, we will shift the coordinates so that the origin is on the reference circular orbit.

We start by examining the system in rotating polar coordinates. This is a time-dependent coordinate transformation:

$r' \begin{matrix} = r & (5.208) \end{matrix}$

$φ' \begin{matrix} = φ - Ω t & (5.209) \end{matrix}$

$p_{r}^{'} \begin{matrix} = p_{r} & (5.210) \end{matrix}$

$p_{φ}^{'} \begin{matrix} = p_{φ} . & (5.211) \end{matrix}$

Using equation (5.178), we can write the new Hamiltonian directly:

$H ″ (t; r', φ'; p_{r}^{'}, p_{φ}^{'}) \begin{matrix} = \frac{p_{r}^{' 2}}{2 m} + \frac{p_{φ}^{' 2}}{2 m r'^{2}} + V (r') - p_{φ}^{'} Ω . & (5.212) \end{matrix}$

H″ is not time dependent, and therefore it is conserved. It is not the sum of the potential energy and the kinetic energy. Energy is not conserved in the moving coordinate system, but what is conserved here is a new quantity, the Jacobi constant, that combines the energy with the product of the angular momentum of the particle in the new coordinate and the angular velocity of the coordinate system. We will want to keep track of this term.

Next, we return to rectangular coordinates, but they are rotating with the reference circular orbit:

$\begin{matrix} x' = r' \cos φ' & (5.213) \end{matrix}$

$\begin{matrix} y' = r' \sin φ' & (5.214) \end{matrix}$

$\begin{matrix} p_{x}^{'} = p_{r}^{'} \cos φ' - \frac{p_{φ}^{'}}{r'} \sin φ' & (5.215) \end{matrix}$

$\begin{matrix} p_{y}^{'} = p_{r}^{'} \sin φ' + \frac{p_{φ}^{'}}{r'} \cos φ' . & (5.216) \end{matrix}$

The Hamiltonian is

$\begin{array}{l} H ‴ (t; x', y'; p_{x}^{'}, p_{y}^{'}) \\ = \frac{p_{x}^{' 2} + p_{y}^{' 2}}{2 m} + Ω (y' p_{x}^{'} - x' p_{y}^{'}) + V (\sqrt{x'^{2} + y'^{2}}) . & (5.217) \end{array}$

With one more quick manipulation we shift the coordinate system so that the origin is out on our circular orbit. We define new rectangular coordinates ξ and η with the following simple canonical transformation of coordinates and momenta:

$\begin{matrix} ξ = x' - R_{0} & (5.218) \end{matrix}$

$\begin{matrix} η = y' & (5.219) \end{matrix}$

$\begin{matrix} p_{ξ} = p_{x}^{'} & (5.220) \end{matrix}$

$\begin{matrix} p_{η} = p_{y}^{'} . & (5.221) \end{matrix}$

In this final coordinate system the Hamiltonian is

$\begin{array}{rll} H ″″ (t; ξ, η; p_{ξ}, p_{η}) & = \frac{p_{ξ}^{2} + p_{η}^{2}}{2 m} + Ω (η p_{ξ} - (ξ + R_{0}) p_{η}) \\ & \quad + V (\sqrt{{(ξ + R_{0})}^{2} + η^{2}}), & (5.222) \end{array}$

and Hamilton's equations are uselessly complicated, but the next step is to consider only trajectories for which the coordinates ξ and η are small compared with R₀. Under this assumption we will be able to construct approximate equations of motion for these trajectories that are linear in the coordinates, thus yielding simple analyzable motion. To this point we have made no approximations. The equations above are perfectly accurate for any trajectories in a central field.

The idea is to expand the potential-energy term in the Hamiltonian as a series and to discard any term higher than second-order in the coordinates, thus giving us first-order-accurate Hamilton's equations:

$\begin{matrix} U (ξ, η) = V (\sqrt{{(ξ + R_{0})}^{2} + η^{2}}) & (5.223) \end{matrix}$

$\begin{matrix} = V (R_{0} + ξ + \frac{η^{2}}{2 R_{0}} + ...) & (5.224) \end{matrix}$

$\begin{array}{l} = V (R_{0}) + D V (R_{0}) (ξ + \frac{η^{2}}{2 R_{0}}) \\ + D^{2} V (R_{0}) \frac{ξ^{2}}{2} + \dots . & (5.225) \end{array}$

So the (negated) generalized forces are

$\partial_{0} U (ξ, η) \begin{matrix} = D V (R_{0}) + D^{2} V (R_{0}) ξ + \dots & (5.226) \end{matrix}$

$\partial_{1} U (ξ, η) \begin{matrix} = D V (R_{0}) \frac{η}{R_{0}} + \dots . & (5.227) \end{matrix}$

With this expansion we obtain the linearized Hamilton's equations:

$\begin{matrix} D ξ = \frac{p_{ξ}}{m} + Ω η & (5.228) \end{matrix}$

$\begin{matrix} D η = \frac{p_{η}}{m} - Ω (ξ + R_{0}) & (5.229) \end{matrix}$

$\begin{matrix} D p_{ξ} = - D V (R_{0}) - D^{2} V (R_{0}) ξ + \dots + Ω p_{η} & (5.230) \end{matrix}$

$\begin{matrix} D p_{η} = - D V (R_{0}) \frac{η}{R_{0}} + ... - Ω p_{ξ} . & (5.231) \end{matrix}$

Of course, once we have linear equations we know how to solve them exactly. Because the linearized Hamiltonian is conserved we cannot get exponential expansion or collapse, so the possible solutions are quite limited. It is instructive to convert these equations into a second-order system. We use Ω² = DV(R₀)/(mR₀), equation (5.207), to eliminate the DV terms:

$\begin{matrix} D^{2} ξ - 2 Ω D η = (Ω^{2} - \frac{D^{2} V (R_{0})}{m}) ξ & (5.232) \end{matrix}$

$\begin{matrix} D^{2} η + 2 Ω D ξ = 0. & (5.233) \end{matrix}$

Combining these, we find

$\begin{matrix} D^{3} ξ + ω^{2} D ξ = 0, & (5.234) \end{matrix}$

where

$\begin{matrix} ω^{2} = 3 Ω^{2} + \frac{D^{2} V (R_{0})}{m} . & (5.235) \end{matrix}$

Thus we have a simple harmonic oscillator with frequency ω as one of the components of the solution. The general solution has three parts:

$\begin{matrix} (\begin{matrix} ξ (t) \\ η (t) \end{matrix}) = η_{0} (\begin{matrix} 0 \\ 1 \end{matrix}) & (5.236) \end{matrix}$

$\begin{matrix} + ξ_{0} (\begin{matrix} 1 \\ - 2 A t \end{matrix}) & (5.237) \end{matrix}$

$\begin{matrix} + C_{0} (\begin{matrix} \sin (ω t + φ_{0}) \\ \frac{2 Ω}{ω} \cos (ω t + φ_{0}) \end{matrix}) & (5.238) \end{matrix}$

where

$\begin{matrix} A = \frac{Ω^{2} m - D^{2} V (R_{0})}{4 Ω m} . & (5.239) \end{matrix}$

The constants η₀, ξ₀, C₀, and φ₀ are determined by the initial conditions. If C₀ = 0, the particle of interest is on a circular trajectory, but not necessarily the same one as the reference trajectory. If C₀ = 0 and ξ₀ = 0, we have a “fellow traveler,” a particle in the same circular orbit as the reference orbit but with different phase. If C₀ = 0 and η₀ = 0, we have a particle in a circular orbit that is interior or exterior to the reference orbit and shearing away from the reference orbit. The shearing is due to the fact that the angular velocity for a circular orbit varies with the radius. The constant A gives the rate of shearing at each radius. If both η₀ = 0 and ξ₀ = 0 but C₀ ≠ 0, then we have “epicyclic motion.” A particle in a nearly circular orbit may be seen to move in an ellipse around the circular reference orbit. The ellipse will be elongated in the direction of circular motion by the factor 2Ω/ω, and it will rotate in the direction opposite to the direction of the circular motion. The initial phase of the epicycle is φ₀. Of course, any combination of these solutions may exist.
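The general solution can be substituted back into the second-order system as a check. In the sketch below the parameter `kappa` stands for D²V(R₀)/m; that name, the function names, and the sample constants are assumptions of this illustration, and the code requires 3Ω² + kappa > 0 so that ω is real.

```python
import math

def epicycle(t, Omega, kappa, xi0, eta0, C0, phi0):
    """Solution (5.236)-(5.238) with its first and second time derivatives.

    kappa stands for D^2 V(R0) / m; the name is an assumption of this sketch.
    """
    omega = math.sqrt(3 * Omega**2 + kappa)        # equation (5.235)
    A = (Omega**2 - kappa) / (4 * Omega)           # equation (5.239)
    ratio = 2 * Omega / omega
    s, c = math.sin(omega * t + phi0), math.cos(omega * t + phi0)
    xi, eta = xi0 + C0 * s, eta0 - 2 * A * t * xi0 + C0 * ratio * c
    dxi, deta = C0 * omega * c, -2 * A * xi0 - C0 * ratio * omega * s
    d2xi, d2eta = -C0 * omega**2 * s, -C0 * ratio * omega**2 * c
    return xi, eta, dxi, deta, d2xi, d2eta

def residuals(t, Omega, kappa, *constants):
    """Left side minus right side of equations (5.232) and (5.233)."""
    xi, eta, dxi, deta, d2xi, d2eta = epicycle(t, Omega, kappa, *constants)
    r1 = d2xi - 2 * Omega * deta - (Omega**2 - kappa) * xi
    r2 = d2eta + 2 * Omega * dxi
    return r1, r2

# Inverse-square force with Omega = 1, so kappa = (1 - 3) * Omega**2 = -2.
print(residuals(0.8, 1.0, -2.0, 0.01, 0.02, 0.03, 0.4))   # both close to 0
```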

The epicyclic frequency ω and the shearing rate A are determined by the force law (the radial derivative of the potential energy). For a force law proportional to a power of the radius,

$\begin{matrix} F \propto r^{1 - n}, & (5.240) \end{matrix}$

the epicyclic frequency is related to the orbital frequency by

$\begin{matrix} \frac{ω}{Ω} = 2 \sqrt{1 - \frac{n}{4}} & (5.241) \end{matrix}$

and the shearing rate is

$\begin{matrix} \frac{A}{Ω} = \frac{n}{4} . & (5.242) \end{matrix}$
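Equations (5.241) and (5.242) are easy to tabulate. The sketch below is a minimal helper, with a hypothetical function name, that returns both ratios for a given n; it is restricted to n ≤ 4, where the epicyclic frequency is real.

```python
import math

def epicyclic_ratios(n):
    """Return (omega/Omega, A/Omega) for a force law F proportional to r**(1 - n)."""
    if n > 4:
        raise ValueError("omega is imaginary for n > 4")
    return 2 * math.sqrt(1 - n / 4), n / 4

for n in (0, 1, 1.75, 2, 3, 3.3, 4):
    ratio, shear = epicyclic_ratios(n)
    print(n, round(ratio, 5), shear)
```

The values n = 1.75 and n = 3.3 correspond to the force laws F ∝ r^−3/4 and F ∝ r^−2.3 used in the figures of this section.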

For a few particular integer force laws we see:

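$\begin{array}{cccc} n & F & ω / Ω & A / Ω \\ 0 & r & 2 & 0 \\ 1 & 1 & \sqrt{3} & \frac{1}{4} \\ 2 & r^{- 1} & \sqrt{2} & \frac{1}{2} \\ 3 & r^{- 2} & 1 & \frac{3}{4} \\ 4 & r^{- 3} & 0 & 1 \\ 5 & r^{- 4} & ± i & \frac{5}{4} \end{array}$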

We can get some insight into the kinds of orbits produced by the epicyclic approximation by looking at a few examples. For some force laws we have integer ratios of epicyclic frequency to orbital frequency. In those cases we have closed orbits. For an inverse-square force law (n = 3) we get elliptical orbits with the center of the field at a focus of the ellipse. Figure 5.3 shows how an approximation to such an orbit can be constructed by superposition of the motion on an elliptical epicycle with the motion of the same frequency on a circle. If the force is proportional to the radius (n = 0) we get a two-dimensional harmonic oscillator. Here the epicyclic frequency is twice the orbital frequency. Figure 5.4 shows how this yields elliptical orbits that are centered on the source of the central force. An orbit is closed when ω/Ω is a rational fraction. If the force is proportional to the −3/4 power of the radius, the epicyclic frequency is 3/2 the orbital frequency. This yields the three-lobed pattern seen in figure 5.5. For other force laws the orbits predicted by this analysis are multi-lobed patterns produced by precessing approximate ellipses. Most of the cases have incommensurate epicyclic and orbital frequencies, leading to orbits that do not close in finite time.

**Figure 5.3** Epicyclic construction of an approximate orbit for F ∝ r⁻². The large dotted circle is the reference circular orbit and the dotted ellipses are the epicycles. The epicycles are twice as long as they are wide. The solid ellipse is the approximate trajectory produced by a particle moving on the epicycles. The sense of orbital motion is counterclockwise, and the epicycles are rotating clockwise. The arrows represent the increment of velocity contributed by the epicycle to the circular reference orbit.

**Figure 5.4** Epicyclic construction of an approximate orbit for F ∝ r. The large dotted circle is the reference circular orbit and the small dotted circles are the epicycles. The solid ellipse is the approximate trajectory produced by a particle moving on the epicycles. The sense of orbital motion is counterclockwise, and the epicycles are rotating clockwise. The arrows represent the increment of velocity contributed by the epicycle to the circular reference orbit.

**Figure 5.5** Epicyclic construction of an approximate orbit for F ∝ r^−3/4. The large dotted circle is the reference circular orbit and the dotted ellipses are the epicycles. The epicycles have a 4:3 ratio of length to width. The solid trefoil is the approximate trajectory produced by a particle moving on the epicycles. The sense of orbital motion is counterclockwise, and the epicycles are rotating clockwise. The arrows represent the increment of velocity contributed by the epicycle to the circular reference orbit.

**Figure 5.6** The numerically integrated orbit of a particle with a force law F ∝ r^−2.3. For this law the ratio of the epicyclic frequency to the orbital frequency is about .83666—close to 5/6, but not quite. This is manifest in the nearly five-fold symmetry of the rosette-like shape and the fact that one must cross approximately six orbits to get from the inside to the outside of the rosette.

The epicyclic approximation gives a very good idea of what actual orbits look like. Figure 5.6, drawn by numerically integrating the original rectangular equations of motion for a particle in the field, shows the rosette-type picture characteristic of incommensurate epicyclic and orbital frequencies for an F = −r^−2.3 force law.

We can directly compare a numerically integrated system with one of our epicyclic approximations. For example, the result of numerically integrating our F ∝ r^−3/4 system is very similar to the picture we obtained by epicycles. (See figure 5.7 and compare it with figure 5.5.)
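A comparison of this kind can be sketched in a few lines. The code below integrates the rectangular equations (5.196) through (5.199) with a fixed-step fourth-order Runge-Kutta method, for m = 1 and DV(r) = r^exponent, starting from a slightly perturbed circular orbit of radius 1, and measures the time between successive radial maxima. The integrator, step size, size of the perturbation, and function names are all assumptions of this sketch, not the method used to draw the figures.

```python
import math

def deriv(state, exponent):
    """Hamilton's equations (5.196)-(5.199) with m = 1 and DV(r) = r**exponent."""
    x, y, px, py = state
    r = math.hypot(x, y)
    f = r ** exponent / r                      # DV(r) / r
    return (px, py, -f * x, -f * y)

def rk4_step(state, dt, exponent):
    """One fixed-size fourth-order Runge-Kutta step."""
    k1 = deriv(state, exponent)
    k2 = deriv(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), exponent)
    k3 = deriv(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), exponent)
    k4 = deriv(tuple(s + dt * k for s, k in zip(state, k3)), exponent)
    return tuple(s + dt * (a + 2 * b + 2 * c + d) / 6
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def radial_period(exponent, kick=1e-3, dt=1e-3, t_end=20.0):
    """Time between two successive maxima of r for a nearly circular orbit."""
    state = (1.0, 0.0, 0.0, 1.0 + kick)        # circular speed is 1 at r = 1
    t, maxima, rv_prev = 0.0, [], 0.0
    while t < t_end and len(maxima) < 2:
        state = rk4_step(state, dt, exponent)
        t += dt
        x, y, px, py = state
        rv = x * px + y * py                   # r times the radial velocity
        if rv_prev > 0.0 and rv <= 0.0:        # r has just passed a maximum
            maxima.append(t - dt * rv / (rv - rv_prev))
        rv_prev = rv
    return maxima[1] - maxima[0]

period = radial_period(-0.75)
print(period, 2 * math.pi / 1.5)   # measured versus epicyclic prediction
```

For F ∝ r^−3/4 the orbital frequency of the reference orbit is 1 in these units, so the epicyclic prediction for the radial period is 2π/(3/2); the measured value should differ from it only by a small amount that shrinks with the size of the perturbation.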

Exercise 5.11: Collapsing orbits

What exactly happens as the force law becomes steeper? Investigate this by sketching the contours of the Hamiltonian in r, p_r space for various values of the force-law exponent, n. For what values of n are there stable circular orbits? In the case that there are no stable circular orbits, what happens to circular and other noncircular orbits? How are these results consistent with Liouville's theorem and the nonexistence of attractors in Hamiltonian systems?

**Figure 5.7** The numerically integrated orbit of a particle with a force law F ∝ r^−3/4. For this law the ratio of the epicyclic frequency to the orbital frequency is exactly 3/2. This is manifest in the three-fold symmetry of the rosette-like shape and the fact that one must cross two orbits to get from the inside to the outside of the rosette.

5.4.5 Total Time Derivatives

The addition of a total time derivative to a Lagrangian leads to the same Lagrange equations. However, the two Lagrangians have different momenta, and they lead to different Hamilton's equations. Here we find out how to represent the corresponding canonical transformation with a generating function.

Let's restate the result about total time derivatives and Lagrangians from the first chapter. Consider some function G(t, q) of time and coordinates. We have shown that if L and L′ are related by

$\begin{matrix} L' (t, q, \dot{q}) = L (t, q, \dot{q}) + \partial_{0} G (t, q) + \partial_{1} G (t, q) \dot{q} & (5.243) \end{matrix}$

then the Lagrange equations of motion are the same. The generalized coordinates used in the two Lagrangians are the same, but the momenta conjugate to the coordinates are different. In the usual way, define

$P \begin{matrix} (t, q, \dot{q}) = \partial_{2} L (t, q, \dot{q}) & (5.244) \end{matrix}$

and

$P' \begin{matrix} (t, q, \dot{q}) = \partial_{2} L' (t, q, \dot{q}) . & (5.245) \end{matrix}$

So we have

$P' \begin{matrix} (t, q, \dot{q}) = P (t, q, \dot{q}) + \partial_{1} G (t, q) . & (5.246) \end{matrix}$

Evaluated on a trajectory, we have

$p' \begin{matrix} (t) = p (t) + \partial_{1} G (t, q (t)) . & (5.247) \end{matrix}$

This transformation is a special case of an F₂-type transformation. Let

$F_{2} \begin{matrix} (t, q, p') = q p' - G (t, q); & (5.248) \end{matrix}$

then the associated transformation is

$\begin{matrix} q' = \partial_{2} F_{2} (t, q, p') = q & (5.249) \end{matrix}$

$\begin{matrix} p = \partial_{1} F_{2} (t, q, p') = p' - \partial_{1} G (t, q) & (5.250) \end{matrix}$

$\begin{array}{rll} H' (t, q', p') & = H (t, q, p) + \partial_{0} F_{2} (t, q, p') \\ & = H (t, q, p) - \partial_{0} G (t, q) . & (5.251) \end{array}$

Explicitly, the new Hamiltonian is

$\begin{matrix} H' (t, q', p') = H (t, q', p' - \partial_{1} G (t, q')) - \partial_{0} G (t, q'), & (5.252) \end{matrix}$

where we have used the fact that q = q′. The transformation is interesting in that the coordinate transformation is the identity transformation, but the new and old momenta are not the same, even in the case in which G has no explicit time dependence. Suppose we have a Hamiltonian of the form

$\begin{matrix} H (t, x, p) = \frac{p^{2}}{2 m} + V (x); & (5.253) \end{matrix}$

then the transformed Hamiltonian is

$\begin{matrix} H' (t, x', p') = \frac{{(p' - \partial_{1} G (t, x'))}^{2}}{2 m} + V (x') - \partial_{0} G (t, x') . & (5.254) \end{matrix}$

We see that this transformation may be used to modify terms in the Hamiltonian that are linear in the momenta. Starting from H, the transformation introduces linear momentum terms; starting from H′, the transformation eliminates the linear terms.
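The claim that H and H′ describe the same motion can be illustrated numerically. The sketch below assumes a harmonic potential V(x) = K x²/2 and the function G(t, x) = A x t, both chosen only for illustration, and integrates Hamilton's equations for H of (5.253) and for H′ of (5.254) with a fixed-step Runge-Kutta method. The coordinates should agree, while the momenta should differ by ∂₁G = A t.

```python
M, K, A = 1.0, 1.0, 0.3   # hypothetical mass, spring constant, strength of G

def field_H(t, x, p):
    """Hamilton's equations for H = p^2/(2 M) + K x^2 / 2."""
    return p / M, -K * x

def field_Hprime(t, x, pp):
    """Hamilton's equations for H' of (5.254) with G(t, x) = A * x * t."""
    return (pp - A * t) / M, -K * x + A

def integrate(field, x, p, t_end, dt=1e-3):
    """Fixed-step fourth-order Runge-Kutta from t = 0 to t_end."""
    t = 0.0
    for _ in range(round(t_end / dt)):
        k1 = field(t, x, p)
        k2 = field(t + dt / 2, x + dt / 2 * k1[0], p + dt / 2 * k1[1])
        k3 = field(t + dt / 2, x + dt / 2 * k2[0], p + dt / 2 * k2[1])
        k4 = field(t + dt, x + dt * k3[0], p + dt * k3[1])
        x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        p += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        t += dt
    return x, p

T_END = 2.0
x_a, p_a = integrate(field_H, 1.0, 0.0, T_END)
x_b, pp_b = integrate(field_Hprime, 1.0, 0.0, T_END)   # p' = p at t = 0 here
print(x_a, x_b)                  # same coordinate
print(pp_b - p_a, A * T_END)     # momenta differ by d1 G = A * t
```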

Driven pendulum

We illustrate the use of this transformation with the driven pendulum. The Hamiltonian for the driven pendulum derived from the T − V Lagrangian (see section 1.6.2) is

$\begin{array}{l} H (t, θ, p_{θ}) \\ = \frac{p_{θ}^{2}}{2 m l^{2}} - g l m \cos θ \\ + g m y_{s} (t) - \frac{p_{θ}}{l} \sin θ D y_{s} (t) - \frac{m}{2} {(\cos θ)}^{2} {(D y_{s} (t))}^{2}, & (5.255) \end{array}$

where y_s is the drive function. The Hamiltonian is rather messy, and includes a term that is linear in the angular momentum with a coefficient that depends on both the angular coordinate and the time. Let's see what happens if we apply our transformation to the problem to eliminate the linear term. We can identify the transformation function G by requiring that the linear term in momentum be killed:

$\begin{matrix} G (t, θ) = m l \cos θ D y_{s} (t) . & (5.256) \end{matrix}$

The transformed momentum, from equation (5.247), is

$p_{θ}^{'} = p_{θ} \begin{matrix} - m l \sin θ D y_{s} (t), & (5.257) \end{matrix}$

and the transformed Hamiltonian is

$\begin{array}{rll} H' (t, θ, p_{θ}^{'}) & = \frac{{(p_{θ}^{'})}^{2}}{2 m l^{2}} - m l (g + D^{2} y_{s} (t)) \cos θ \\ & \quad + g m y_{s} (t) - \frac{m}{2} {(D y_{s} (t))}^{2} . & (5.258) \end{array}$

Dropping the last two terms, which do not affect the equations of motion, we find

$\begin{matrix} H' (t, θ, p_{θ}^{'}) = \frac{{(p_{θ}^{'})}^{2}}{2 m l^{2}} - m l (g + D^{2} y_{s}) \cos θ . & (5.259) \end{matrix}$

So we have found, by a straightforward canonical transformation, a Hamiltonian for the driven pendulum with the rather simple form of a pendulum with gravitational acceleration that is modified by the acceleration of the pivot. It is, in fact, the Hamiltonian that corresponds to the alternative form of the Lagrangian for the driven pendulum that we found earlier by inspection (see equation 1.120). Here the derivation is by a simple canonical transformation, motivated by a desire to eliminate unwanted terms that are linear in the momentum.
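The equivalence of the two Hamiltonians can be checked by integrating both sets of Hamilton's equations and comparing the angle. The sketch below assumes a drive y_s(t) = AMP cos(W t), chosen so that Dy_s(0) = 0 and the two momenta coincide at t = 0; the parameter values, the integrator, and the function names are illustrative assumptions.

```python
import math

M, L, G_ACC, AMP, W = 1.0, 1.0, 9.8, 0.1, 2.0   # hypothetical parameters

def dys(t):
    """D y_s for the assumed drive y_s(t) = AMP * cos(W * t)."""
    return -AMP * W * math.sin(W * t)

def d2ys(t):
    """D^2 y_s for the same drive."""
    return -AMP * W * W * math.cos(W * t)

def field_H(t, th, p):
    """Hamilton's equations derived from equation (5.255)."""
    dth = p / (M * L * L) - math.sin(th) * dys(t) / L
    dp = (-G_ACC * L * M * math.sin(th) + (p / L) * math.cos(th) * dys(t)
          - M * math.sin(th) * math.cos(th) * dys(t) ** 2)
    return dth, dp

def field_Hprime(t, th, pp):
    """Hamilton's equations derived from equation (5.259)."""
    return pp / (M * L * L), -M * L * (G_ACC + d2ys(t)) * math.sin(th)

def integrate(field, th, p, t_end, dt=1e-3):
    """Fixed-step fourth-order Runge-Kutta from t = 0 to t_end."""
    t = 0.0
    for _ in range(round(t_end / dt)):
        k1 = field(t, th, p)
        k2 = field(t + dt / 2, th + dt / 2 * k1[0], p + dt / 2 * k1[1])
        k3 = field(t + dt / 2, th + dt / 2 * k2[0], p + dt / 2 * k2[1])
        k4 = field(t + dt, th + dt * k3[0], p + dt * k3[1])
        th += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        p += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        t += dt
    return th, p

T_END = 3.0
th_a, p_a = integrate(field_H, 1.0, 0.0, T_END)
th_b, pp_b = integrate(field_Hprime, 1.0, 0.0, T_END)
print(th_a, th_b)   # the two angles should agree closely
```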

Exercise 5.12: Construction of generating functions

Suppose that canonical transformations

$(t, q, p) = C_{a} (t, q', p')$ and $(t, q', p') = C_{b} (t, q ″, p ″)$

are generated by two F₁-type generating functions, F_1a(t, q, q′) and F_1b(t, q′, q″).

a. Show that the generating function for the inverse transformation of C_a is F_1c(t, q′, q) = −F_1a(t, q, q′).

b. Define a new kind of generating function,

F_x(t, q, q′, q″) = F_1a(t, q, q′) + F_1b(t, q′, q″).

We see that

p = ∂₁F_x(t, q, q′, q″) = ∂₁F_1a(t, q, q′)

p″ = −∂₃F_x(t, q, q′, q″) = −∂₂F_1b(t, q′, q″)

Show that ∂₂F_x = 0, allowing a solution to eliminate q′.

c. Using the formulas for p and p″ above, and the result from part b, show that F_x is an appropriate generating function for the composition transformation C_a ∘ C_b.

Exercise 5.13: Linear canonical transformations

We consider systems with two degrees of freedom and transformations for which the Hamiltonian transforms by composition.

a. Consider the linear canonical transformations that are generated by

$F_{2} (t; x_{1}, x_{2}; p_{1}^{'}, p_{2}^{'}) = p_{1}^{'} a x_{1} + p_{1}^{'} b x_{2} + p_{2}^{'} c x_{1} + p_{2}^{'} d x_{2} .$

Show that these transformations are just the point transformations, and that the corresponding F₁ is zero.

b. Other linear canonical transformations can be generated by

$F_{1} (t; x_{1}, x_{2}; x_{1}^{'}, x_{2}^{'}) = x_{1}^{'} a x_{1} + x_{1}^{'} b x_{2} + x_{2}^{'} c x_{1} + x_{2}^{'} d x_{2} .$

Surely we can make even more generators by constructing F₃- and F₄-type transformations analogously. Are all of the linear canonical transformations obtainable in this way? If not, show one that cannot be so generated.

c. Can all linear canonical transformations be generated by compositions of transformations generated by the functions shown in parts a and b above?

d. How many independent parameters are necessary to specify all possible linear canonical transformations for systems with two degrees of freedom?

Exercise 5.14: Integral invariants

Consider the linear canonical transformation for a system with two degrees of freedom generated by the function

$F_{1} (t; x_{1}, x_{2}; x_{1}^{'}, x_{2}^{'}) = x_{1}^{'} a x_{1} + x_{1}^{'} b x_{2} + x_{2}^{'} c x_{1} + x_{2}^{'} d x_{2},$

and the general parallelogram with a vertex at the origin and with adjacent sides starting at the origin and extending to the phase-space points (x_1a, x_2a, p_1a, p_2a) and (x_1b, x_2b, p_1b, p_2b).

a. Find the area of the given parallelogram and the area of the target parallelogram under the canonical transformation. Notice that the area of the parallelogram is not preserved.

b. Find the areas of the projections of the given parallelogram and the areas of the projections of the target under canonical transformation. Show that the sum of the areas of the projections on the action-like planes is preserved.

Exercise 5.15: Standard-map generating function

Find a generating function for the standard map (see exercise 5.8 on page 357).

5.5 Extended Phase Space

In this section we show that we can treat time as just another coordinate if we wish. Systems described by a time-dependent Hamiltonian may be recast in terms of a time-independent Hamiltonian with an extra degree of freedom. An advantage of this view is that what was a time-dependent canonical transformation can be treated as a time-independent transformation, where there are no additional conditions for adjusting the Hamiltonian.

Suppose that we have some system characterized by a time-dependent Hamiltonian, for example, a periodically driven pendulum. We may imagine that there is some extremely massive oscillator, unperturbed by the motion of the relatively massless pendulum, that produces the drive. Indeed, we may think of time itself as the coordinate of an infinitely massive particle moving uniformly and driving everything else. We often consider the rotation of the Earth as exactly such a stable time reference when performing short-time experiments in the laboratory.

More formally, consider a dynamical system with n degrees of freedom, whose behavior is described by a possibly time-dependent Lagrangian L with corresponding Hamiltonian H. We make a new dynamical system with n + 1 degrees of freedom by extending the generalized coordinates to include time and introducing a new independent variable. We also extend the generalized velocities to include a velocity for the time coordinate. In this new extended state space the coordinates are redundant, so there is a constraint relating the time coordinate to the new independent variable.

We relate the original dynamical system to the extended dynamical system as follows: Let q be a coordinate path. Let (q_e, t) : τ ↦ (q_e(τ), t(τ)) be a coordinate path in the extended system where τ is the new independent variable. Then q_e = q ∘ t, or q_e(τ) = q(t(τ)). Consequently, if v = Dq is the velocity along a path then v_e(τ) = Dq_e(τ) = Dq(t(τ)) · Dt(τ) = v(t(τ)) · v_t(τ).

We can find a Lagrangian for the extended system by requiring that the value of the action be unchanged. Introduce the extended Lagrangian action

$\begin{matrix} S_{e} [q_{e}, t] (τ_{1}, τ_{2}) = \int_{τ_{1}}^{τ_{2}} (L_{e} \circ Γ [q_{e}, t]), & (5.260) \end{matrix}$

with

$\begin{matrix} L_{e} (τ; q_{e}, t; v_{e}, v_{t}) = L (t, q_{e}, v_{e} / v_{t}) v_{t} . & (5.261) \end{matrix}$

We have

$\begin{matrix} S [q] (t (τ_{1}), t (τ_{2})) = S_{e} [q \circ t, t] (τ_{1}, τ_{2}) . & (5.262) \end{matrix}$
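The equality (5.262) can be checked numerically. The sketch below is only an illustration: the Lagrangian (a driven oscillator), the path q (deliberately not a solution path), the time map t(τ) = τ + 0.3 sin τ, and the Simpson-rule helper are all assumptions introduced for the example.

```python
import math

def simpson(fn, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = fn(a) + fn(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * fn(a + k * h)
    return total * h / 3

def L(t, q, v):
    """Assumed example Lagrangian: a driven oscillator."""
    return 0.5 * v * v - 0.5 * q * q + q * math.cos(t)

def q(t):
    """An arbitrary path; it need not satisfy the Lagrange equations."""
    return math.sin(2 * t) + t * t

def Dq(t):
    return 2 * math.cos(2 * t) + 2 * t

def t_of(tau):
    """Assumed monotonic relation between t and tau."""
    return tau + 0.3 * math.sin(tau)

def Dt_of(tau):
    return 1 + 0.3 * math.cos(tau)

def L_e(tau, q_e, t, v_e, v_t):
    """Extended Lagrangian (5.261)."""
    return L(t, q_e, v_e / v_t) * v_t

def S(t1, t2):
    """Action of the original system along q."""
    return simpson(lambda t: L(t, q(t), Dq(t)), t1, t2)

def S_e(tau1, tau2):
    """Extended action along (q o t, t), using v_e = Dq(t(tau)) Dt(tau)."""
    def integrand(tau):
        t, v_t = t_of(tau), Dt_of(tau)
        return L_e(tau, q(t), t, Dq(t) * v_t, v_t)
    return simpson(integrand, tau1, tau2)
```

For any τ₁ and τ₂, `S(t_of(tau1), t_of(tau2))` and `S_e(tau1, tau2)` should agree to within the quadrature error.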

The extended system is subject to a constraint that relates the time to the new independent variable. We assume the constraint is of the form φ(τ; q_e, t; v_e, v_t) = t − f(τ) = 0. The constraint is a holonomic constraint involving the coordinates and time, so we can incorporate this constraint by augmenting the Lagrangian:²⁴

$\begin{array}{l} L_{e}^{'} (τ; q_{e}, t, λ; v_{e}, v_{t}, v_{λ}) \\ = L_{e} (τ; q_{e}, t; v_{e}, v_{t}) + v_{λ} (v_{t} - D f (τ)) \\ = L (t, q_{e}, v_{e} / v_{t}) v_{t} + v_{λ} (v_{t} - D f (τ)) . & (5.263) \end{array}$

The Lagrange equations of $L_{e}^{'}$ for q_e are satisfied for the paths q ∘ t where q is any path that satisfies the original Lagrange equations of L.

The momenta conjugate to the coordinates are

$\begin{array}{l} P_{e} (τ; q_{e}, t, λ; v_{e}, v_{t}, v_{λ}) \\ = \partial_{2, 0} L_{e}^{'} (τ; q_{e}, t, λ; v_{e}, v_{t}, v_{λ}) \\ = \partial_{2} L (t, q_{e}, v_{e} / v_{t}) \\ = P (t, q_{e}, v_{e} / v_{t}) & (5.264) \end{array}$

$\begin{array}{l} P_{t} (τ; q_{e}, t, λ; v_{e}, v_{t}, v_{λ}) \\ = \partial_{2, 1} L_{e}^{'} (τ; q_{e}, t, λ; v_{e}, v_{t}, v_{λ}) \\ = L (t, q_{e}, v_{e} / v_{t}) - \partial_{2} L (t, q_{e}, v_{e} / v_{t}) (v_{e} / v_{t}) + v_{λ} \\ = - ℰ (t, q_{e}, v_{e} / v_{t}) + v_{λ} & (5.265) \end{array}$

$\begin{array}{l} P_{λ} (τ; q_{e}, t, λ; v_{e}, v_{t}, v_{λ}) \\ = \partial_{2, 2} L_{e}^{'} (τ; q_{e}, t, λ; v_{e}, v_{t}, v_{λ}) \\ = v_{t} - D f (τ) . & (5.266) \end{array}$

So the extended momenta have the same values as the original momenta at the corresponding states. The momentum conjugate to the time coordinate is the negation of the energy plus v_λ. The momentum conjugate to λ is the constraint, which must be zero.
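The three momenta (5.264)–(5.266) can be spot-checked by differentiating the augmented Lagrangian (5.263) numerically. This is a minimal sketch under stated assumptions: the sample Lagrangian, the choice f(τ) = τ + 0.3 sin τ, and the helper names are hypothetical.

```python
import math

def L(t, q, v):
    """Assumed example Lagrangian: a driven oscillator."""
    return 0.5 * v * v - 0.5 * q * q + q * math.cos(t)

def P(t, q, v):
    """Momentum state function of the example Lagrangian."""
    return v

def energy(t, q, v):
    """Energy state function, P v - L."""
    return P(t, q, v) * v - L(t, q, v)

def Df(tau):
    """Derivative of the assumed f(tau) = tau + 0.3 sin(tau)."""
    return 1 + 0.3 * math.cos(tau)

def L_e_prime(tau, q_e, t, lam, v_e, v_t, v_lam):
    """Augmented extended Lagrangian (5.263)."""
    return L(t, q_e, v_e / v_t) * v_t + v_lam * (v_t - Df(tau))

def momenta(tau, q_e, t, lam, v_e, v_t, v_lam, h=1e-6):
    """Central-difference partials of L_e' with respect to the velocities."""
    def d(i):
        up = [v_e, v_t, v_lam]
        dn = [v_e, v_t, v_lam]
        up[i] += h
        dn[i] -= h
        return (L_e_prime(tau, q_e, t, lam, *up)
                - L_e_prime(tau, q_e, t, lam, *dn)) / (2 * h)
    return d(0), d(1), d(2)
```

At any state with v_t ≠ 0, the three returned values should match P(t, q_e, v_e/v_t), −ℰ(t, q_e, v_e/v_t) + v_λ, and v_t − Df(τ), respectively.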

Next we carry out the transformation to the corresponding Hamiltonian formulation. First, note that the Lagrangian L_e is a homogeneous form of degree one in the velocities. Thus, by Euler's theorem,

$\begin{array}{l} \partial_{2} L_{e} (τ; q_{e}, t; v_{e}, v_{t}) \cdot (v_{e}, v_{t}) = L_{e} (τ; q_{e}, t; v_{e}, v_{t}) . & (5.267) \end{array}$

The $p \dot{q}$ part of the Legendre transform of $L_{e}^{'}$ is

$\begin{array}{l} \partial_{2} L_{e}^{'} (τ; q_{e}, t, λ; v_{e}, v_{t}, v_{λ}) \cdot (v_{e}, v_{t}, v_{λ}) \\ = \partial_{2} L_{e} (τ; q_{e}, t; v_{e}, v_{t}) \cdot (v_{e}, v_{t}) + v_{λ} v_{t} + (v_{t} - D f (τ)) v_{λ} \\ = L_{e} (τ; q_{e}, t; v_{e}, v_{t}) + v_{λ} v_{t} + (v_{t} - D f (τ)) v_{λ} . & (5.268) \end{array}$

So the Hamiltonian $H_{e}^{'}$ corresponding to $L_{e}^{'}$ is

$\begin{array}{rll} H_{e}^{'} (τ; q_{e}, t, λ; p_{e}, p_{t}, p_{λ}) & = v_{λ} v_{t} & \\ & = (p_{t} + H (t, q_{e}, p_{e})) (p_{λ} + D f (τ)) . & (5.269) \end{array}$

We have used the fact that at corresponding states the momenta have the same values, so on paths p_e = p ∘ t, and

$\begin{array}{l} ℰ (t, q_{e}, v_{e} / v_{t}) = H (t, q_{e}, p_{e}) . & (5.270) \end{array}$

The Hamiltonian $H_{e}^{'}$ does not depend on λ so we deduce that p_λ is constant. In fact, p_λ must be given the value zero, because it is the constraint. When there is a cyclic coordinate we can form a reduced Hamiltonian for the remaining degrees of freedom by substituting the constant value of conserved momentum conjugate to the cyclic coordinate into the Hamiltonian. The resulting Hamiltonian is

$\begin{array}{l} H_{e} (τ; q_{e}, t; p_{e}, p_{t}) = (p_{t} + H (t, q_{e}, p_{e})) D f (τ) . & (5.271) \end{array}$

This extended Hamiltonian governs the evolution of the extended system, for arbitrary f.²⁵

Hamilton's equations reduce to

$\begin{array}{l} D q_{e} (τ) & = \partial_{2} H (t (τ), q_{e} (τ), p_{e} (τ)) D f (τ) \\ D t (τ) & = D f (τ) \\ D p_{e} (τ) & = - \partial_{1} H (t (τ), q_{e} (τ), p_{e} (τ)) D f (τ) \\ D p_{t} (τ) & = - \partial_{0} H (t (τ), q_{e} (τ), p_{e} (τ)) D f (τ) . & (5.272) \end{array}$

The second equation gives the required relation between t and τ. The first and third equations are equivalent to Hamilton's equations in the original coordinates, as we can see by using q_e = q ∘ t to rewrite them:

$\begin{array}{l} D q (t (τ)) D t (τ) = \partial_{2} H (t (τ), q (t (τ)), p (t (τ))) D f (τ) \\ D p (t (τ)) D t (τ) = - \partial_{1} H (t (τ), q (t (τ)), p (t (τ))) D f (τ) . & (5.273) \end{array}$

Using Dt(τ) = Df(τ) and dividing these factors out, we recover Hamilton's equations.²⁶
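To see equations (5.272) at work for a nontrivial f, we can integrate them numerically and compare with the known solution of the original system evaluated at t(τ). The sketch below assumes a harmonic oscillator H(t, q, p) = p²/2 + q²/2, the map f(τ) = τ + 0.3 sin τ, and a hand-written fourth-order Runge–Kutta step; all of these are illustrative choices.

```python
import math

Q0, P0 = 1.0, 0.5          # assumed initial state of the oscillator

def f(tau):
    """Assumed time map t = f(tau)."""
    return tau + 0.3 * math.sin(tau)

def Df(tau):
    return 1.0 + 0.3 * math.cos(tau)

def rhs(tau, state):
    """Equations (5.272) for the assumed H(t, q, p) = p^2/2 + q^2/2."""
    q_e, t, p_e, p_t = state
    s = Df(tau)
    return (p_e * s, s, -q_e * s, 0.0)

def rk4_step(tau, state, h):
    """One classical Runge-Kutta step in the independent variable tau."""
    def shifted(k, c):
        return tuple(x + c * dx for x, dx in zip(state, k))
    k1 = rhs(tau, state)
    k2 = rhs(tau + h / 2, shifted(k1, h / 2))
    k3 = rhs(tau + h / 2, shifted(k2, h / 2))
    k4 = rhs(tau + h, shifted(k3, h))
    return tuple(x + h / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))

def run(tau_end=5.0, steps=5000):
    """Integrate from tau = 0 and return (tau, (q_e, t, p_e, p_t))."""
    tau, h = 0.0, tau_end / steps
    state = (Q0, f(0.0), P0, -0.5 * (P0 * P0 + Q0 * Q0))
    for _ in range(steps):
        state = rk4_step(tau, state, h)
        tau += h
    return tau, state
```

After `run()`, the time coordinate should equal f(τ), and q_e and p_e should equal Q0 cos t + P0 sin t and −Q0 sin t + P0 cos t, which is the solution of the original system evaluated at t = t(τ).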

Now consider the special case for which the time is the same as the independent variable: f(τ) = τ, Df(τ) = 1. In this case q = q_e and p = p_e. The extended Hamiltonian becomes

$\begin{array}{l} H_{e} (τ; q_{e}, t; p_{e}, p_{t}) = p_{t} + H (t, q_{e}, p_{e}) . & (5.274) \end{array}$

Hamilton's equation for t becomes Dt(τ) = 1, restating the constraint. Hamilton's equations for Dq_e and Dp_e are directly Hamilton's equations:

$\begin{array}{l} D q (τ) = \partial_{2} H (τ, q (τ), p (τ)) \\ D p (τ) = - \partial_{1} H (τ, q (τ), p (τ)) . & (5.275) \end{array}$

The extended Hamiltonian (5.274) does not depend on the independent variable, so it is a conserved quantity. Thus, up to an additive constant p_t is equal to minus the energy. The Hamilton's equation for Dp_t relates the change of the energy to ∂₀H. Note that in the more general case, the momentum conjugate to the time is not the negation of the energy. This choice, t(τ) = τ, is useful for a number of applications.
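The conservation of the extended Hamiltonian (5.274) can be illustrated numerically. In this sketch the driven oscillator H(t, q, p) = p²/2 + q²/2 − q cos 2t, the initial state, and the integrator are assumptions made for the example; the point is only that p_t + H stays fixed while H itself does not.

```python
import math

def H(t, q, p):
    """Assumed example: an oscillator driven at frequency 2."""
    return 0.5 * p * p + 0.5 * q * q - q * math.cos(2 * t)

def H_e(state):
    """Extended Hamiltonian (5.274) with t = tau."""
    q, t, p, p_t = state
    return p_t + H(t, q, p)

def rhs(tau, state):
    """Hamilton's equations for H_e; the state is (q, t, p, p_t)."""
    q, t, p, p_t = state
    return (p,                          # D q   =  dH/dp
            1.0,                        # D t   =  dH_e/dp_t
            -q + math.cos(2 * t),       # D p   = -dH/dq
            -2 * q * math.sin(2 * t))   # D p_t = -dH/dt

def rk4_step(tau, state, h):
    def shifted(k, c):
        return tuple(x + c * dx for x, dx in zip(state, k))
    k1 = rhs(tau, state)
    k2 = rhs(tau + h / 2, shifted(k1, h / 2))
    k3 = rhs(tau + h / 2, shifted(k2, h / 2))
    k4 = rhs(tau + h, shifted(k3, h))
    return tuple(x + h / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))

def evolve(state, tau_end=2.0, steps=4000):
    """Advance the extended state from tau = 0 to tau_end."""
    tau, h = 0.0, tau_end / steps
    for _ in range(steps):
        state = rk4_step(tau, state, h)
        tau += h
    return state
```

Starting from `(1.0, 0.0, 0.0, 0.0)`, the value of `H_e` before and after `evolve` should agree to integration accuracy, whereas the value of `H` changes by an amount of order one; the change in p_t exactly compensates.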

The extension transformation is canonical in the sense that the two sets of equations of motion describe equivalent dynamics. However, the transformation is not symplectic; in fact, it does not even have the same number of input and output variables.

Exercise 5.16: Homogeneous extended Lagrangian

Verify that L_e is homogeneous of degree one in the velocities.

Exercise 5.17: Lagrange equations

a. Verify that the Lagrange equations for q_e are satisfied for exactly the same trajectories that satisfy the original Lagrange equations for q.

b. Verify that the Lagrange equation for t relates the rate of change of energy to ∂₀L.

Exercise 5.18: Lorentz transformations

Investigate Lorentz transformations as point transformations in the extended phase space.

Restricted three-body problem

An example that shows the utility of reformulating a problem in the extended phase space is the restricted three-body problem: the motion of a low-mass particle subject to the gravitational attraction of two other massive bodies that move in some fixed orbit. The problem is an idealization of the situation where a body with very small mass moves in the presence of two bodies with much larger masses. Any effects of the smaller body on the larger bodies are neglected. In the simplest version, the motion of all three bodies is assumed to be in the same plane, and the orbits of the two massive bodies are circular.

The motion of the bodies with larger masses is not influenced by the small mass, so we model this situation as the small body moving in a time-varying field of the larger bodies undergoing a prescribed motion. This situation can be captured as a time-dependent Hamiltonian:

$\begin{array}{l} H (t; x, y; p_{x}, p_{y}) = \frac{p_{x}^{2} + p_{y}^{2}}{2 m} - \frac{G m m_{1}}{r_{1} (t)} - \frac{G m m_{2}}{r_{2} (t)}, & (5.276) \end{array}$

where r₁(t) and r₂(t) are the distances of the small body to the larger bodies, m is the mass of the small body, and m₁ and m₂ are the masses of the larger bodies. Note that r₁(t) and r₂(t) are quantities that depend both on the position of the small particle and the time-varying position of the massive particles.

The massive bodies are in circular orbits and maintain constant distance from the center of mass. Let a₁ and a₂ be the distances to the center of mass; then the distances satisfy m₁a₁ = m₂a₂. The angular frequency is $Ω = \sqrt{G (m_{1} + m_{2}) / a^{3}}$ where a is the distance between the masses.

In polar coordinates, with the center of mass of the subsystem of massive particles at the origin and with r and θ describing the position of the low-mass particle, the positions of the two massive bodies are a₂ = m₁a/(m₁+m₂) with θ₂ = Ωt, a₁ = m₂a/(m₁+m₂) with θ₁ = Ωt + π. The distances to the point masses are

$\begin{array}{l} {(r_{2} (t))}^{2} = r^{2} + a_{2}^{2} - 2 a_{2} r \cos (θ - Ω t) \\ {(r_{1} (t))}^{2} = r^{2} + a_{1}^{2} - 2 a_{1} r \cos (θ - Ω t - π) . & (5.277) \end{array}$

In polar coordinates, the Hamiltonian is

$\begin{array}{l} H (t; r, θ; p_{r}, p_{θ}) = \frac{1}{2 m} (p_{r}^{2} + \frac{p_{θ}^{2}}{r^{2}}) - \frac{G m m_{1}}{r_{1} (t)} - \frac{G m m_{2}}{r_{2} (t)} . & (5.278) \end{array}$

The Hamiltonian can be written in terms of some function f such that

$\begin{array}{l} H (t; r, θ; p_{r}, p_{θ}) = f (r, θ - Ω t, p_{r}, p_{θ}) . & (5.279) \end{array}$

The essential feature is that θ and t appear in the Hamiltonian only in the combination θ − Ωt.

One way to get rid of the time dependence is to choose a new set of variables with one coordinate equal to this combination θ − Ωt, by making a point transformation to a rotating coordinate system. We have shown that

$\begin{array}{l} r' = r & (5.280) \end{array}$

$\begin{array}{l} θ' = θ - Ω t & (5.281) \end{array}$

$\begin{array}{l} p_{r}^{'} = p_{r} & (5.282) \end{array}$

$\begin{array}{l} p_{θ}^{'} = p_{θ} & (5.283) \end{array}$

with

$\begin{array}{rll} H' (t; r', θ'; p_{r}^{'}, p_{θ}^{'}) & = H (t; r', θ' + Ω t; p_{r}^{'}, p_{θ}^{'}) - Ω p_{θ}^{'} & \\ & = f (r', θ', p_{r}^{'}, p_{θ}^{'}) - Ω p_{θ}^{'} & (5.284) \end{array}$

is a canonical transformation. The new Hamiltonian, which is not the energy, is conserved because there is no explicit time dependence. It is a useful conserved quantity—the Jacobi constant.²⁷
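A short numerical experiment makes the distinction between the energy and the Jacobi constant concrete. The sketch below integrates the Hamiltonian (5.276) in rectangular coordinates and tracks both H and H − Ωp_θ, with p_θ = x p_y − y p_x. The units (G = 1, a = 1), the equal masses, the initial state, and the Runge–Kutta integrator are assumptions chosen for illustration.

```python
import math

# Assumed units and masses, chosen so that Omega = 1.
G, m, m1, m2, a = 1.0, 1.0, 0.5, 0.5, 1.0
Omega = math.sqrt(G * (m1 + m2) / a**3)
a1 = m2 * a / (m1 + m2)
a2 = m1 * a / (m1 + m2)

def massive_positions(t):
    """Body 2 at angle Omega t, body 1 at angle Omega t + pi."""
    c, s = math.cos(Omega * t), math.sin(Omega * t)
    return (-a1 * c, -a1 * s), (a2 * c, a2 * s)

def distances(t, x, y):
    (x1, y1), (x2, y2) = massive_positions(t)
    return math.hypot(x - x1, y - y1), math.hypot(x - x2, y - y2)

def H(t, x, y, px, py):
    """Hamiltonian (5.276)."""
    r1, r2 = distances(t, x, y)
    return (px * px + py * py) / (2 * m) - G * m * m1 / r1 - G * m * m2 / r2

def jacobi(t, x, y, px, py):
    """H - Omega p_theta, the value of the Hamiltonian (5.284)."""
    return H(t, x, y, px, py) - Omega * (x * py - y * px)

def rhs(t, s):
    x, y, px, py = s
    (x1, y1), (x2, y2) = massive_positions(t)
    r1, r2 = distances(t, x, y)
    fx = -G * m * (m1 * (x - x1) / r1**3 + m2 * (x - x2) / r2**3)
    fy = -G * m * (m1 * (y - y1) / r1**3 + m2 * (y - y2) / r2**3)
    return (px / m, py / m, fx, fy)

def rk4_step(t, s, h):
    def shifted(k, c):
        return tuple(u + c * du for u, du in zip(s, k))
    k1 = rhs(t, s)
    k2 = rhs(t + h / 2, shifted(k1, h / 2))
    k3 = rhs(t + h / 2, shifted(k2, h / 2))
    k4 = rhs(t + h, shifted(k3, h))
    return tuple(u + h / 6 * (p + 2 * q + 2 * r + w)
                 for u, p, q, r, w in zip(s, k1, k2, k3, k4))

def spreads(s, t_end=10.0, steps=2000):
    """Return (max - min) of H and of the Jacobi constant along the orbit."""
    t, h = 0.0, t_end / steps
    hs, js = [H(t, *s)], [jacobi(t, *s)]
    for _ in range(steps):
        s = rk4_step(t, s, h)
        t += h
        hs.append(H(t, *s))
        js.append(jacobi(t, *s))
    return max(hs) - min(hs), max(js) - min(js)
```

For a particle started at `(3.0, 0.0, 0.0, 0.577)`, well outside the orbit of the massive bodies, `spreads` should report a visible variation in H and a variation in the Jacobi constant that is at the level of the integration error.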

We can also eliminate the dependence on the independent time-like variable from the Hamiltonian for the restricted problem by going to the extended phase space, choosing t = τ. The Hamiltonian

$\begin{array}{rll} H_{e} (τ; r, θ, t; p_{r}, p_{θ}, p_{t}) & = H (t; r, θ; p_{r}, p_{θ}) + p_{t} & \\ & = f (r, θ - Ω t, p_{r}, p_{θ}) + p_{t} & (5.285) \end{array}$

is autonomous and is consequently a conserved quantity. Again, we see that θ and t occur only in the combination θ − Ωt, which suggests a point transformation to a new coordinate θ′ = θ − Ωt. This point transformation is independent of the new independent variable τ. The transformation is specified in equations (5.280–5.283), augmented by relations specifying how the time coordinate and its conjugate momentum are handled:

$\begin{array}{l} t = t' & (5.286) \end{array}$

$\begin{array}{l} p_{t} = - Ω p_{θ}^{'} + p_{t}^{'} . & (5.287) \end{array}$

The new Hamiltonian is obtained by composing the old Hamiltonian with the transformation:

$\begin{array}{l} H_{e}^{'} (τ; r', θ', t'; p_{r}^{'}, p_{θ}^{'}, p_{t}^{'}) \\ = H_{e} (τ; r', θ' + Ω t', t'; p_{r}^{'}, p_{θ}^{'}, p_{t}^{'} - Ω p_{θ}^{'}) \\ = f (r', θ', p_{r}^{'}, p_{θ}^{'}) + p_{t}^{'} - Ω p_{θ}^{'} . & (5.288) \end{array}$

We recognize that the new Hamiltonian in the extended phase space, which has the same value as the original Hamiltonian in the extended phase space, is just the Jacobi constant plus $p_{t}^{'}$ . The new Hamiltonian does not depend on t′, so $p_{t}^{'}$ is a constant of the motion. In fact, its value is irrelevant to the rest of the dynamical evolution, so we may set the value of $p_{t}^{'}$ to zero if we like. Thus, we have found that the Hamiltonian in the extended phase space, which is conserved, is just the Jacobi constant plus an additive arbitrary constant. We have two routes to the Jacobi constant: (1) transform the original system to a rotating coordinate system to eliminate the time dependence, but in the process add extra terms to the Hamiltonian, and (2) go to the extended phase space and immediately get a conserved quantity, and by going to rotating coordinates recognize that this Hamiltonian is the same as the Jacobi constant. So sometimes the Hamiltonian in the extended phase space is a useful conserved quantity.

Exercise 5.19: Transformations in the extended phase space

In section 5.2.1 we found that time-dependent transformations for which the derivative of the coordinate–momentum part is symplectic are canonical only if the Hamiltonian is modified by adding a function K subject to certain constraints (equation 5.42). Show that the constraints on K follow from the symplectic condition in the extended phase space, using the choice t = τ.

5.5.1 Poincaré–Cartan Integral Invariant

The Poincaré invariant (section 5.3) is especially useful in the extended phase space with t = τ. There the extended Hamiltonian does not depend on the independent variable, canonical transformations are symplectic, and the Hamiltonian transforms by composition.

For the special choice of t = τ, equation (5.90) can be rephrased in an interesting way. Let E be the value of the Hamiltonian in the original unextended phase space. Using qⁿ = t and p_n = p_t = −E, we can write

$\begin{array}{l} \sum_{i = 0}^{n - 1} \int_{R_{i}} d q^{i} d p_{i} - \int_{R_{n}} d t d E = \sum_{i = 0}^{n - 1} \int_{R_{i}^{'}} d q'^{i} d p_{i}^{'} - \int_{R_{n}^{'}} d t' d E' & (5.289) \end{array}$

and

$\begin{array}{l} \oint_{\partial R} (\sum_{i = 0}^{n - 1} p_{i} d q^{i} - E d t) = \oint_{\partial R'} (\sum_{i = 0}^{n - 1} p_{i}^{'} d q'^{i} - E' d t') . & (5.290) \end{array}$

The relations (5.289) and (5.290) are two formulations of the Poincaré–Cartan integral invariant.

5.6 Reduced Phase Space

Suppose we have a system with n+1 degrees of freedom described by a time-independent Hamiltonian in a (2n + 2)-dimensional phase space. Here we can play the converse game: we can choose any generalized coordinate to play the role of “time” and the negation of its conjugate momentum to play the role of a new n-degree-of-freedom time-dependent Hamiltonian in a reduced phase space of 2n dimensions.

More precisely, let

$\begin{array}{l} q = (q^{0}, \dots, q^{n}) \\ p = [p_{0}, \dots, p_{n}], & (5.291) \end{array}$

and suppose we have a system described by a time-independent Hamiltonian

$\begin{array}{l} H (t, q, p) = f (q, p) = E . & (5.292) \end{array}$

For each solution path there is a conserved quantity E. Let's choose a coordinate qⁿ to be the time in a reduced phase space. We define the dynamical variables for the n-degree-of-freedom reduced phase space:

$\begin{array}{l} q_{r} = (q_{r}^{0}, \dots, q_{r}^{n - 1}) \\ p^{r} = [p_{0}^{r}, \dots, p_{n - 1}^{r}] . & (5.293) \end{array}$

In the original phase space a coordinate such as qⁿ maps time to a coordinate. In the formulation of the reduced phase space we will have to use the inverse function τ = (qⁿ)⁻¹ to map the coordinate to the time, giving the new coordinates in terms of the new time

$\begin{array}{l} q_{r}^{i} = q^{i} \circ τ \\ p_{i}^{r} = p_{i} \circ τ, & (5.294) \end{array}$

and thus

$\begin{array}{l} D q_{r}^{i} = D (q^{i} \circ τ) = (D q^{i} \circ τ) (D τ) = (D q^{i} \circ τ) / (D q^{n} \circ τ) \\ D p_{i}^{r} = D (p_{i} \circ τ) = (D p_{i} \circ τ) (D τ) = (D p_{i} \circ τ) / (D q^{n} \circ τ) . & (5.295) \end{array}$

We propose that a Hamiltonian in the reduced phase space is the negative of the inverse of f(q⁰, …, qⁿ; p₀, …, p_n) = E with respect to the p_n argument:

$\begin{array}{ll} H_{r} (x, q_{r}, p^{r}) = - (\text{the } p_{x} \text{ such that } f (q_{r}, x; p^{r}, p_{x}) = E) . & (5.296) \end{array}$

Note that in the reduced phase space we will have indices for the structured variables in the range 0 … n−1, whereas in the original phase space the indices are in the range 0 … n. We will show that H_r is an appropriate Hamiltonian for the given dynamical system in the reduced phase space. To compute Hamilton's equations we must expand the implicit definition of H_r. We define an auxiliary function

$\begin{array}{l} g (x, q_{r}, p^{r}) = f (q_{r}, x; p^{r}, - H_{r} (x, q_{r}, p^{r})) . & (5.297) \end{array}$

Note that by construction this function is identically a constant g = E. Thus all of its partial derivatives are zero:

$\begin{array}{rll} \partial_{0} g & = {(\partial_{0} f)}_{n} - {(\partial_{1} f)}^{n} \partial_{0} H_{r} = 0 & \\ {(\partial_{1} g)}_{i} & = {(\partial_{0} f)}_{i} - {(\partial_{1} f)}^{n} {(\partial_{1} H_{r})}_{i} = 0 & \\ {(\partial_{2} g)}^{i} & = {(\partial_{1} f)}^{i} - {(\partial_{1} f)}^{n} {(\partial_{2} H_{r})}^{i} = 0, & (5.298) \end{array}$

where we have suppressed the arguments. Solving for partials of H_r, we get

$\begin{array}{l} {(\partial_{1} H_{r})}_{i} = {(\partial_{0} f)}_{i} / {(\partial_{1} f)}^{n} = {(\partial_{1} H)}_{i} / {(\partial_{2} H)}^{n} \\ {(\partial_{2} H_{r})}^{i} = {(\partial_{1} f)}^{i} / {(\partial_{1} f)}^{n} = {(\partial_{2} H)}^{i} / {(\partial_{2} H)}^{n} . & (5.299) \end{array}$

Using these relations, we can deduce Hamilton's equations in the reduced phase space from Hamilton's equations in the original phase space:

$\begin{array}{rll} D q_{r}^{i} (x) & = \frac{D q^{i} (τ (x))}{D q^{n} (τ (x))} & \\ & = \frac{{(\partial_{2} H (τ (x), q (τ (x)), p (τ (x))))}^{i}}{{(\partial_{2} H (τ (x), q (τ (x)), p (τ (x))))}^{n}} & \\ & = {(\partial_{2} H_{r} (x, q_{r} (x), p^{r} (x)))}^{i} & (5.300) \end{array}$

$\begin{array}{rll} D p_{i}^{r} (x) & = \frac{D p_{i} (τ (x))}{D q^{n} (τ (x))} & \\ & = \frac{- {(\partial_{1} H (τ (x), q (τ (x)), p (τ (x))))}_{i}}{{(\partial_{2} H (τ (x), q (τ (x)), p (τ (x))))}^{n}} & \\ & = - {(\partial_{1} H_{r} (x, q_{r} (x), p^{r} (x)))}_{i} . & (5.301) \end{array}$

Orbits in a central field

Consider planar motion in a central field. We have already seen this expressed in polar coordinates in equation (3.100):

$\begin{array}{l} H (t; r, φ; p_{r}, p_{φ}) = \frac{p_{r}^{2}}{2 m} + \frac{p_{φ}^{2}}{2 m r^{2}} + V (r) . & (5.302) \end{array}$

There are two degrees of freedom and the Hamiltonian is time independent. Thus the energy, the value of the Hamiltonian, is conserved on realizable paths. Let's forget about time and reparameterize this system in terms of the orbital radius r.²⁸ To do this we solve

$\begin{array}{l} H (t; r, φ; p_{r}, p_{φ}) = E & (5.303) \end{array}$

for p_r, obtaining

$\begin{array}{l} H' (r, φ, p_{φ}) = - p_{r} = - {(2 m (E - V (r)) - \frac{p_{φ}^{2}}{r^{2}})}^{\frac{1}{2}}, & (5.304) \end{array}$

which is the Hamiltonian in the reduced phase space.

Hamilton's equations are now quite simple:

$\begin{array}{l} \frac{d φ}{d r} = \frac{\partial H'}{\partial p_{φ}} = \frac{p_{φ}}{r^{2}} {(2 m (E - V (r)) - \frac{p_{φ}^{2}}{r^{2}})}^{- \frac{1}{2}} & (5.305) \end{array}$

$\begin{array}{l} \frac{d p_{φ}}{d r} = - \frac{\partial H'}{\partial φ} = 0. & (5.306) \end{array}$

The momentum p_φ is independent of r (as it was with t), so for any particular orbit we may define a constant angular momentum L. Thus our problem ends up as a simple quadrature:

$\begin{array}{l} φ (r) = \int_{}^{r} \frac{L}{r^{2}} {(2 m (E - V (r)) - \frac{L^{2}}{r^{2}})}^{- \frac{1}{2}} d r + φ_{0} . & (5.307) \end{array}$

To see the utility of this procedure, we continue our example with a definite potential energy—a gravitating point mass:

$\begin{array}{l} V (r) = - \frac{μ}{r} . & (5.308) \end{array}$

When we substitute this into equation (5.307) we obtain a mess that can be simplified to

$\begin{array}{l} φ (r) = L \int_{}^{r} \frac{d r}{r \sqrt{2 m E r^{2} + 2 m μ r - L^{2}}} + φ_{0} . & (5.309) \end{array}$

Integrating this, we obtain another mess, which can be simplified and rearranged to obtain the following:

$\begin{array}{l} \frac{1}{r} = \frac{m μ}{L^{2}} (1 - \sqrt{1 + \frac{2 E L^{2}}{m μ^{2}}} \sin (φ (r) - φ_{0})) . & (5.310) \end{array}$

This can be recognized as the polar-coordinate form of the equation of a conic section with eccentricity e and parameter p:

$\begin{array}{l} \frac{1}{r} = \frac{1 + e \cos θ}{p} & (5.311) \end{array}$

where

$\begin{array}{ll} e = \sqrt{1 + \frac{2 E L^{2}}{m μ^{2}}}, \quad p = \frac{L^{2}}{m μ}, \quad \text{and} \quad θ = φ_{0} - φ (r) - \frac{π}{2} . & (5.312) \end{array}$

In fact, if the orbit is an ellipse with semimajor axis a, we have

$\begin{array}{l} p = a (1 - e^{2}) & (5.313) \end{array}$

and so we can identify the role of energy and angular momentum in shaping the ellipse:

$\begin{array}{ll} E = - \frac{μ}{2 a} \quad \text{and} \quad L = \sqrt{m μ a (1 - e^{2})} . & (5.314) \end{array}$
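The step from the quadrature (5.309) to the conic (5.310) is easy to check numerically. The following sketch, which assumes units with m = μ = 1 and an ellipse with a = 1 and e = 0.5, integrates the integrand of (5.309) by Simpson's rule between two radii strictly inside the range from pericenter to apocenter and compares the result with the difference of φ(r) − φ₀ obtained by solving (5.310) for the sine on the principal branch of the arcsine. The helper names are hypothetical.

```python
import math

m, mu = 1.0, 1.0                             # assumed units
a, e = 1.0, 0.5                              # assumed ellipse
E = -mu / (2 * a)                            # energy from (5.314)
L_ang = math.sqrt(m * mu * a * (1 - e * e))  # angular momentum from (5.314)

def simpson(fn, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = fn(lo) + fn(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * fn(lo + k * h)
    return total * h / 3

def dphi_dr(r):
    """Integrand of (5.309)."""
    return L_ang / (r * math.sqrt(2 * m * E * r * r + 2 * m * mu * r - L_ang**2))

def eccentricity():
    """Eccentricity from (5.312)."""
    return math.sqrt(1 + 2 * E * L_ang**2 / (m * mu**2))

def phi_closed(r):
    """phi(r) - phi_0 from (5.310), principal branch of arcsin."""
    return math.asin((1 - L_ang**2 / (m * mu * r)) / eccentricity())
```

For example, `simpson(dphi_dr, 0.7, 1.3)` should agree with `phi_closed(1.3) - phi_closed(0.7)`, and `eccentricity()` should return the assumed e, which confirms that (5.312) and (5.314) are mutually consistent for this orbit.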

What we get from analysis in the reduced phase space is the geometry of the trajectory, but we lose the time-domain behavior. The reduction is often worth the price.

Although we have treated time in a special way so far, we have found that time is not special. It can be included in the coordinates to make a driven system autonomous. And it can be eliminated from any autonomous system in favor of any other coordinate. This leads to numerous strategies for simplifying problems, by removing time variation and then performing canonical transforms on the resulting conservative autonomous system to make a nice coordinate that we can then dump back into the role of time.

Generating functions in extended phase space

We can represent canonical transformations with mixed-variable generating functions. We can extend these to represent transformations in the extended phase space. Let F₂ be a generating function with arguments (t, q, p′). Then, the corresponding $F_{2}^{e}$ in the extended phase space can be taken to be

$\begin{array}{l} F_{2}^{e} (τ; q, t; p', p_{t}^{'}) = t p_{t}^{'} + F_{2} (t, q, p') . & (5.315) \end{array}$

The relations between the coordinates and the momenta are the same as before. We also have

$\begin{array}{l} p_{t} & = {(\partial_{1} F_{2}^{e})}_{n} (τ; q, t; p', p_{t}^{'}) = p_{t}^{'} + \partial_{0} F_{2} (t, q, p') \\ t' & = {(\partial_{2} F_{2}^{e})}^{n} (τ; q, t; p', p_{t}^{'}) = t . & (5.316) \end{array}$

The first equation gives the relationship between the original Hamiltonians:

$\begin{array}{l} H' (t, q', p') = H (t, q, p) + \partial_{0} F_{2} (t, q, p'), & (5.317) \end{array}$

as required. Time-independent canonical transformations, where H′ = H ∘ C_H, have symplectic qp part. The generating-function representation of a time-dependent transformation does not depend on the independent variable in the extended phase space. So, in extended phase space the qp part of the transformation, which includes the time and the momentum conjugate to time, is symplectic.
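As a concrete instance, consider the rotation to the coordinate θ′ = θ − Ωt used for the restricted three-body problem. A plausible F₂ for that point transformation is r p′_r + (θ − Ωt) p′_θ; this choice, the value of Ω, and the helper names below are assumptions made for illustration. The sketch builds the extended generating function with F₂ taking the arguments (t, q, p′), differentiates it numerically, and recovers the relations (5.286) and (5.287).

```python
Omega = 0.7   # assumed rotation rate

def F2(t, q, p_new):
    """Assumed F2 for rotating coordinates: r p_r' + (theta - Omega t) p_theta'."""
    r, theta = q
    pr_new, ptheta_new = p_new
    return r * pr_new + (theta - Omega * t) * ptheta_new

def F2e(tau, q_ext, p_ext_new):
    """Extended generating function: t p_t' + F2(t, q, p')."""
    r, theta, t = q_ext
    pr_new, ptheta_new, pt_new = p_ext_new
    return t * pt_new + F2(t, (r, theta), (pr_new, ptheta_new))

def partial(fn, args, i, h=1e-6):
    """Central-difference partial derivative of fn with respect to args[i]."""
    up, dn = list(args), list(args)
    up[i] += h
    dn[i] -= h
    return (fn(*up) - fn(*dn)) / (2 * h)

def transform(tau, q_ext, p_ext_new):
    """Return old momenta (p_r, p_theta, p_t) and new coordinates (r', theta', t')."""
    def flat(r, theta, t, pr_new, ptheta_new, pt_new):
        return F2e(tau, (r, theta, t), (pr_new, ptheta_new, pt_new))
    args = list(q_ext) + list(p_ext_new)
    p_old = tuple(partial(flat, args, i) for i in range(3))
    q_new = tuple(partial(flat, args, i) for i in range(3, 6))
    return p_old, q_new
```

The returned p_t should equal p′_t − Ωp′_θ and the returned t′ should equal t, while θ′ comes out as θ − Ωt. Nothing in `F2e` depends on τ, which is the sense in which the transformation is time independent in the extended phase space.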

Exercise 5.20: Rotating coordinates in extended phase space

In the extended phase space the time is one of the coordinates. Carry out the transformation to rotating coordinates using an F₂-type generating function in the extended phase space. Compare Hamiltonian (5.178) to the Hamiltonian obtained by composition with the transformation.

5.7 Summary

Canonical transformations can be used to reformulate a problem in coordinates that are easier to understand or that expose some symmetry of a problem.

In this chapter we have investigated different representations of a dynamical system. We have found that different representations will be equivalent if the coordinate–momentum part of the transformation has a symplectic derivative, and if the Hamiltonian transforms in a specified way. If the phase-space transformation is time independent, then the Hamiltonian transforms by composition with the phase-space transformation. The symplectic condition can be equivalently expressed in terms of the fundamental Poisson brackets. The Poisson bracket and the ω function are invariant under canonical transformations. The invariance of ω implies that the sum of the areas of the projections onto fundamental coordinate–momentum planes is preserved (Poincaré integral invariant) by canonical transformations.

A generating function is a real-valued function of the phase-space coordinates and time that represents a canonical transformation through its partial derivatives. We found that every canonical transformation can be represented by a generating function. The proof depends on the Poincaré integral invariant.

We can formulate an extended phase space in which time is treated as another coordinate. Time-dependent transformations are simple in the extended phase space. In the extended phase space the Poincaré integral invariant is the Poincaré–Cartan integral invariant. We can also reformulate a time-independent problem as a time-dependent problem with fewer degrees of freedom, with one of the original coordinates taking on the role of time; this is the reduced phase space.

5.8 Projects

Exercise 5.21: Hierarchical Jacobi coordinates

A Hamiltonian for the n-body problem is

$\begin{array}{l} H = T + V & (5.318) \end{array}$

with

$\begin{array}{l} T (t; x_{0}, x_{1}, \dots, x_{n - 1}; p_{0}, p_{1}, \dots, p_{n - 1}) = \sum_{i = 0}^{n - 1} \frac{p_{i}^{2}}{2 m_{i}} & (5.319) \end{array}$

and

$\begin{array}{l} V (t; x_{0}, x_{1}, \dots, x_{n - 1}; p_{0}, p_{1}, \dots, p_{n - 1}) = \sum_{i < j} f_{i j} (‖ x_{i} - x_{j} ‖), & (5.320) \end{array}$

where x_i is the tuple of rectangular coordinates for body i and p_i is the tuple of conjugate linear momenta for body i.

The potential energy of the system depends only on the relative positions of the bodies, so the relative motion decouples from the center of mass motion. In this problem we explore canonical transformations that achieve this decoupling.

a. Canonical heliocentric coordinates. The coordinates transform as follows:

$\begin{array}{l} x_{0}^{'} = X, & (5.321) \end{array}$

where X is the center of mass of the system, and

$\begin{array}{l} x_{i}^{'} = x_{i} - x_{0}, & (5.322) \end{array}$

for i > 0, the differences of the position of body i and the body with index 0 (which might be the Sun). Find the associated canonical momenta using an F₂-type generating function. Show that the potential energy can be written solely in terms of the coordinates for i > 0. Show that the kinetic energy is not in the form of a sum of squares of momenta divided by mass constants.

b. Jacobi coordinates. The Jacobi coordinates isolate the center of mass motion, without spoiling the usual diagonal quadratic form of the kinetic energy. Define X_i to be the center of mass of the bodies with indices less than or equal to i:

$\begin{array}{l} X_{i} = \frac{\sum_{j = 0}^{i} m_{j} x_{j}}{\sum_{j = 0}^{i} m_{j}} . & (5.323) \end{array}$

The Jacobi coordinates are defined by

$\begin{array}{l} x_{i - 1}^{'} = x_{i} - X_{i - 1}, & (5.324) \end{array}$

for 0 < i < n, and

$\begin{array}{l} x_{n - 1}^{'} = X_{n - 1} . & (5.325) \end{array}$

The coordinates $x_{i - 1}^{'}$ for 0 < i < n are the differences between the position of body i and the center of mass of the bodies with lower indices; the coordinate $x_{n - 1}^{'}$ is the center of mass of the system. Complete the canonical transformation by finding the conjugate momenta using an F₂-type generating function. Show that the kinetic energy can still be written in the form

$\begin{array}{l} T (t; x_{0}^{'}, x_{1}^{'}, \dots, x_{n - 1}^{'}; p_{0}^{'}, p_{1}^{'}, \dots, p_{n - 1}^{'}) = \sum_{i = 0}^{n - 1} \frac{{p_{i}^{'}}^{2}}{2 m_{i}^{'}}, & (5.326) \end{array}$

for some constants $m_{i}^{'}$ , and that the potential V can be written solely in terms of the Jacobi coordinates $x_{i}^{'}$ with indices i < n − 1.

c. Hierarchical Jacobi coordinates. Define a “body” as a tuple of a mass and a rectangular position tuple. An n-body “system” is a tuple of n bodies: (b₀, b₁, …, b_{n−1}). Define a “linking” transformation $ℒ_{j k}$ for bodies j and k that takes an n-body system and returns a new linked system:

$\begin{array}{l} (b_{0}^{'}, \dots, b_{n - 1}^{'}) = ℒ_{j k} (b_{0}, \dots, b_{n - 1}) . & (5.327) \end{array}$

The bodies in the new system are the same as the bodies in the old system, $b_{i}^{'} = b_{i}$, except for bodies j and k:

$\begin{array}{l} (m_{j}^{'}, x_{j}^{'}) = (m_{j} m_{k} / (m_{j} + m_{k}), x_{k} - x_{j}) \\ (m_{k}^{'}, x_{k}^{'}) = (m_{j} + m_{k}, (m_{j} x_{j} + m_{k} x_{k}) / (m_{j} + m_{k})) . & (5.328) \end{array}$

This is a transformation to relative coordinates and center of mass for bodies j and k. Extend this transformation to phase space and show that it preserves the form of the kinetic energy

$\begin{array}{l} \sum_{i} \frac{{(p_{i})}^{2}}{2 m_{i}} = \sum_{i} \frac{{(p_{i}^{'})}^{2}}{2 m_{i}^{'}} . & (5.329) \end{array}$
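Before attempting the proof, it can help to check the claim numerically. The following is a minimal sketch in Python (not the Scheme used elsewhere in this book), restricted to one spatial dimension. The function names `link` and `kinetic_energy` are hypothetical, and the momentum rule used here is an assumption of the sketch: it is the rule that an F₂-type generating function for a point transformation would be expected to give, namely the total momentum for the center-of-mass body and a mass-weighted difference of momenta for the relative body. A numerical agreement is a sanity check, not a derivation.

```python
from fractions import Fraction as F

def link(bodies, j, k):
    """Apply the linking transformation of (5.328) to bodies j and k.

    Each body is a tuple (mass, position, momentum) in one dimension.
    The momentum rule is an assumption of this sketch, not taken from the text.
    """
    (mj, xj, pj), (mk, xk, pk) = bodies[j], bodies[k]
    total = mj + mk
    new = list(bodies)
    # Body j becomes the relative coordinate, carrying the reduced mass.
    new[j] = (mj * mk / total, xk - xj, (mj * pk - mk * pj) / total)
    # Body k becomes the center of mass, carrying the total mass.
    new[k] = (total, (mj * xj + mk * xk) / total, pj + pk)
    return new

def kinetic_energy(bodies):
    """Sum of p^2 / (2 m) over all bodies."""
    return sum(p * p / (2 * m) for m, _, p in bodies)

# Arbitrary exact test data: (mass, position, momentum) for three bodies.
system = [(F(3), F(1), F(2)), (F(5), F(-2), F(-7)), (F(2), F(4), F(1))]
linked = link(system, 0, 1)

print(kinetic_energy(system) == kinetic_energy(linked))
```

Exact rational arithmetic is used so that the comparison is an equality rather than a floating-point approximation.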

Show that the transformation to Jacobi coordinates of part b is generated by a composition of linking transformations:

$\begin{array}{l} ℒ_{n - 2, n - 1} \circ \dots \circ ℒ_{1, 2} \circ ℒ_{0, 1} . & (5.330) \end{array}$
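The composition can be explored numerically before it is proved. The sketch below, in Python rather than the book's Scheme and in one spatial dimension, applies the linking transformations in the order of (5.330), rightmost first, and compares the resulting positions with the definitions (5.323) through (5.325). The names `link`, `jacobi_by_linking`, and `jacobi_direct` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction as F

def link(bodies, j, k):
    """Linking transformation of (5.328) on (mass, position) bodies."""
    (mj, xj), (mk, xk) = bodies[j], bodies[k]
    total = mj + mk
    new = list(bodies)
    new[j] = (mj * mk / total, xk - xj)               # relative coordinate
    new[k] = (total, (mj * xj + mk * xk) / total)     # center of mass
    return new

def jacobi_by_linking(bodies):
    """Apply L_{0,1} first, then L_{1,2}, ..., finally L_{n-2,n-1}."""
    for j in range(len(bodies) - 1):
        bodies = link(bodies, j, j + 1)
    return bodies

def jacobi_direct(bodies):
    """Positions from (5.324) and (5.325), using X_i from (5.323)."""
    n = len(bodies)

    def center(i):
        mass = sum(m for m, _ in bodies[: i + 1])
        return sum(m * x for m, x in bodies[: i + 1]) / mass

    relative = [bodies[i][1] - center(i - 1) for i in range(1, n)]
    return relative + [center(n - 1)]

# Arbitrary exact test data: (mass, position) for four bodies.
system = [(F(3), F(1)), (F(5), F(-2)), (F(2), F(4)), (F(7), F(0))]
linked = jacobi_by_linking(system)

print([x for _, x in linked] == jacobi_direct(system))
```

In this sketch the last body of the linked system carries the total mass and the center-of-mass position, which is a useful thing to watch when experimenting with other orderings of linking transformations.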

Interpret the coordinate transformation produced by such a succession of linking transformations; why do we call this a “linking” transformation? What requirement has to be satisfied for a composition of linking transformations to isolate the center of mass of the system (make it one of the coordinates)? Taking this constraint into account, find hierarchical Jacobi coordinates for a system with six bodies, arranged as two triple systems, each of which is a binary plus a third body. Verify that one of the coordinates is the center of mass of the system, and that the kinetic energy remains a sum of squares of the momenta divided by an appropriate mass constant.

¹Solving for p in terms of p′ involves multiplying equation (5.3) on the right by (∂₁F(t, q′))⁻¹. This inverse is the structure that, when it multiplies ∂₁F(t, q′) on the right, gives an identity structure. Structures representing linear transformations may be represented in terms of matrices. In this case, the matrix representation of the inverse structure is the inverse of the matrix representing the given structure.

²In chapter 1 the transformation C takes a local tuple in one coordinate system and gives a local tuple in another coordinate system. In this chapter C_H is a phase-space transformation.

³The velocities and the momenta are dual geometric objects with respect to time-independent point transformations. The velocities are coordinates of a vector field on the configuration manifold, and the momenta are coordinates of a covector field on the configuration manifold. The invariance of the inner product pv under time-independent point transformations provides a motivation for our use of superscripts for velocity components and subscripts for momentum components.

⁴The procedure solve-linear-right multiplies its first argument by the inverse of its second argument on the right. So, if u = vM then v = uM⁻¹; (solve-linear-right u M) produces v.
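The relation between u, v, and M can be illustrated with ordinary matrices. The following Python sketch is only an analogy for 2 × 2 matrices and row vectors; the name `solve_linear_right_2x2` and its helpers are hypothetical stand-ins and are not the actual procedure, which operates on general structures.

```python
from fractions import Fraction as F

def row_times_matrix(v, M):
    """Multiply the row vector v by the 2x2 matrix M on the right: v M."""
    return [v[0] * M[0][0] + v[1] * M[1][0],
            v[0] * M[0][1] + v[1] * M[1][1]]

def inverse_2x2(M):
    """Inverse of a nonsingular 2x2 matrix."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    if det == 0:
        raise ValueError("matrix is singular")
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

def solve_linear_right_2x2(u, M):
    """Return v such that v M = u, that is, v = u M^{-1}."""
    return row_times_matrix(u, inverse_2x2(M))

M = [[F(2), F(1)], [F(5), F(3)]]
v = [F(4), F(-1)]
u = row_times_matrix(v, M)              # u = v M

print(solve_linear_right_2x2(u, M) == v)
```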

⁵D_s is not a derivative operator. It is not linear because the time component is a nonzero constant.

⁶Sometimes we use a center dot to indicate multiplication, to avoid the ambiguity of the use of juxtaposition to indicate both multiplication and function application. This is not to be interpreted as a vector dot product.

⁷Actually, for I = 0 the transform is not well defined and so it is not canonical for that value. This transformation is “locally canonical” in that it is canonical for nonzero values of I. We will ignore this essentially topological problem.

⁸Unlike D_s, $D$ is linear and can be a derivative operator.

⁹The procedure zero-like produces a structure of zeros with the shape of its argument.

¹⁰This is just a rearrangement of the arguments of R_z: R(Ω)(t, q′) = R_z(Ωt)(q′).
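The rearrangement of arguments can be sketched with closures. The Python below is an illustration, not the book's Scheme; it assumes that R_z(θ) rotates a rectangular tuple counterclockwise about the z axis by the angle θ, which is a sign convention chosen for this sketch. The names `Rz` and `R` mirror the notation of the footnote.

```python
import math

def Rz(angle):
    """Return a function that rotates a tuple (x, y, z) about the z axis.

    The counterclockwise sign convention is an assumption of this sketch.
    """
    c, s = math.cos(angle), math.sin(angle)
    return lambda q: (c * q[0] - s * q[1], s * q[0] + c * q[1], q[2])

def R(Omega):
    """Time-dependent form: R(Omega)(t, q') = Rz(Omega * t)(q')."""
    return lambda t, q_prime: Rz(Omega * t)(q_prime)

q_prime = (1.0, 0.0, 2.0)
rotated = R(math.pi / 2)(1.0, q_prime)   # a quarter turn after unit time
print(rotated)
```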

¹¹For each linear transformation T : A → A of incremental phase-space states there is a unique linear transformation $T^{T} : A^{⋆} \to A^{⋆}$ of the dual space, called the transpose of T, such that for every real-valued linear function g : A → R of incremental phase-space states, and for every a ∈ A, we have $(T^{T} (g)) (a) = g (T (a))$. As linear multipliers, ${(D T (a))}^{T} \cdot D g (a) \cdot a = D g (a) \cdot D T (a) \cdot a$. Because this holds for arbitrary a, we have ${(D T (a))}^{T} \cdot D g (a) = D g (a) \cdot D T (a)$. In our application, DT(a) is DC_H(s′), and Dg(a) is DH(C_H(s′)).

¹²The procedure compatible-shape takes any structure and produces another structure that is guaranteed to multiply with the given structure to produce a numerical quantity. For example, the shape of DH(s) is a compatible shape to the shape of s: if they are multiplied the result is a numerical quantity. This is the s^⋆ that appears in equation (5.48).

¹³The procedure transpose is simply defined for traditional matrices, but because structures that specify linear transformations may have arbitrary substructure, the procedure needs to be supplied with a template that specifies this structure. So the procedure transpose takes two arguments: (transpose ms rs), where ms is the structure to be transposed and the template rs is a structure that is appropriate for multiplication with ms on the right.

¹⁴Actually, this is more interesting: we allow transformations that arbitrarily distort time, as tau is an arbitrary literal function. The canonical condition is concerned only with the possibly time-dependent transformation of coordinates and momenta.

¹⁵The qp submatrix of a square matrix of dimension 2n + 1 is the 2n-dimensional matrix obtained by deleting the first row and the first column of the given matrix. This can be computed by:

(define (qp-submatrix m)
  (m:submatrix m 1 (m:num-rows m) 1 (m:num-cols m)))
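For readers more familiar with list slicing, the same operation can be sketched in Python on a matrix stored as a list of rows. This is an analogy only; the name `qp_submatrix` is a hypothetical Python counterpart, and the sample matrix is arbitrary.

```python
def qp_submatrix(m):
    """Delete the first row and the first column of a matrix (list of rows)."""
    return [row[1:] for row in m[1:]]

# A 3 x 3 example, corresponding to n = 1 (dimension 2n + 1 = 3).
m = [[0, 1, 2],
     [3, 4, 5],
     [6, 7, 8]]

print(qp_submatrix(m))  # prints [[4, 5], [7, 8]]
```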

¹⁶The procedure D-as-matrix is defined as:

(define ((D-as-matrix F) s)
  (s->m (compatible-shape (F s)) ((D F) s) s))

¹⁷The qⁱ, p_i plane is the ith canonical plane in these phase-space variables.

¹⁸The structure ∂₂∂₁F₁ is a down of downs, so it is compatible for contraction with an up on either side. But it is not symmetrical, so the associations must be specified. To solve this problem we use index notation (ugh!).

So we use indices to select particular components of structured objects. If an index symbol appears both as a superscript and as a subscript in an expression, the value of the expression is the sum over all possible values of the index symbol of the designated components (Einstein summation convention). Thus, for example, if $\dot{q}$ and p are of dimension n then the indicated product $p_{i} {\dot{q}}^{i}$ is to be interpreted as $Σ_{i = 0}^{n - 1} p_{i} {\dot{q}}^{i}$ .
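As a concrete instance of the convention, with n = 3 chosen only for illustration, the repeated index i is summed over 0, 1, and 2:

$\begin{array}{l} p_{i} {\dot{q}}^{i} = p_{0} {\dot{q}}^{0} + p_{1} {\dot{q}}^{1} + p_{2} {\dot{q}}^{2} . \end{array}$

Here the superscripts on $\dot{q}$ are component indices, not powers.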

¹⁹A structure is nonsingular if the determinant of the matrix representation of the structure is nonzero.

²⁰Point transformations are not in this class: we cannot solve for the momenta in terms of the positions for point transformations, because for a point transformation the primed and unprimed coordinates can be deduced from each other, so there is not enough information in the coordinates to deduce the momenta.

²¹Let F be defined as the path-independent line integral

$F (x) = \int_{x_{0}}^{x} \sum_{i} f_{i} (x) d x^{i} + F (x_{0});$

then ∂_iF(x) = f_i(x).
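As a worked instance, take two dimensions with the illustrative choice $f_{0} (x) = x^{1}$ and $f_{1} (x) = x^{0}$, where the superscripts are component indices rather than powers. Then $∂_{1} f_{0} = ∂_{0} f_{1} = 1$, so the line integral is path independent. Assuming the base point $x_{0} = (0, 0)$ with $F (x_{0}) = 0$, and integrating along the straight path from the origin to x, parameterized by s with 0 ≤ s ≤ 1, we get

$\begin{array}{l} F (x) = \int_{0}^{1} (s x^{1} \cdot x^{0} + s x^{0} \cdot x^{1}) d s = x^{0} x^{1}, \end{array}$

so that $∂_{0} F (x) = x^{1} = f_{0} (x)$ and $∂_{1} F (x) = x^{0} = f_{1} (x)$.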

²²There may be some singular cases and topological problems that prevent this from being rigorously true.

²³The various generating functions are traditionally known by the names F₁, F₂, F₃, and F₄. Please don't blame us.

²⁴We augment the Lagrangian with the total time derivative of the constraint so that the Legendre transform will be well defined.

²⁵Once we have made this reduction, taking p_λ to be zero, we can no longer perform a Legendre transform back to the extended Lagrangian system; we cannot solve for p_t in terms of v_t. However, the Legendre transform in the extended system from $H_{e}^{'}$ to $L_{e}^{'}$ , with associated state variables, is well defined.

²⁶We take f to be strictly increasing with Df never zero. Strict increase alone allows Df to vanish at isolated points, as for f(t) = t³ at t = 0.

²⁷Actually, the traditional Jacobi constant is C = −2H′.

²⁸We could have chosen to reparameterize in terms of φ, but then both p_r and r would occur in the resulting time-independent Hamiltonian. The path we have chosen takes advantage of the fact that φ does not appear in our Hamiltonian, so p_φ is a constant of the motion. This structure suggests that to solve this kind of problem we need to look ahead, as in playing chess.
