6   Canonical Evolution

What, then, is time? I know well enough what it is, provided that nobody asks me; but if I am asked what it is and try to explain, I am baffled. All the same I can confidently say that I know that if nothing passed, there would be no past time; if nothing were going to happen, there would be no future time; and if nothing were, there would be no present time.

Augustine of Hippo, from Confessions, Book XI, Section 14. Translation by R.S. Pine-Coffin, 1961.

Time evolution generates a canonical transformation: if we consider all possible initial states of a Hamiltonian system and follow all the trajectories for the same time interval, then the map from the initial state to the final state of each trajectory is a canonical transformation. Hamilton–Jacobi theory gives a mixed-variable generating function that generates this time-evolution transformation. For the few integrable systems for which we can solve the Hamilton–Jacobi equation this transformation gets us action-angle coordinates, which form a starting point to study perturbations.

6.1   Hamilton–Jacobi Equation

If we could find a canonical transformation so that the transformed Hamiltonian was identically zero, then by Hamilton's equations the new coordinates and momenta would be constants. All of the time variation of the solution would be captured in the canonical transformation, and there would be nothing more to the solution. A mixed-variable generating function that does this job satisfies a partial differential equation called the Hamilton–Jacobi equation. In most cases, a Hamilton–Jacobi equation cannot be solved explicitly. When it can be solved, however, a Hamilton–Jacobi equation provides a means of reducing a problem to a useful simple form.

Recall the relations satisfied by an F2-type generating function:

$$q' = \partial_2 F_2(t, q, p') \qquad (6.1)$$

$$p = \partial_1 F_2(t, q, p') \qquad (6.2)$$

$$H'(t, q', p') = H(t, q, p) + \partial_0 F_2(t, q, p'). \qquad (6.3)$$

If we require the new Hamiltonian to be zero, then F2 must satisfy the equation

$$0 = H(t, q, \partial_1 F_2(t, q, p')) + \partial_0 F_2(t, q, p'). \qquad (6.4)$$

So the solution of the problem is “reduced” to the problem of solving an n-dimensional partial differential equation for F₂ with unspecified new (constant) momenta p′. This is a Hamilton–Jacobi equation, and in some cases we can solve it.

We can also attempt a somewhat less drastic method of solution. Rather than try to find an F2 that makes the new Hamiltonian identically zero, we can seek an F2-shaped function W that gives a new Hamiltonian that is solely a function of the new momenta. A system described by this form of Hamiltonian is also easy to solve. So if we set

$$H'(t, q', p') = H(t, q, \partial_1 W(t, q, p')) + \partial_0 W(t, q, p') = E(p') \qquad (6.5)$$

and are able to solve for W, then the problem is essentially solved. In this case, the primed momenta are all constant and the primed positions are linear in time. This is an alternate form of the Hamilton–Jacobi equation.

These forms are related. Suppose that we have a W that satisfies the second form of the Hamilton–Jacobi equation (6.5). Then the F2 constructed from W

$$F_2(t, q, p') = W(t, q, p') - E(p')\,t \qquad (6.6)$$

satisfies the first form of the Hamilton–Jacobi equation (6.4). Furthermore,

$$p = \partial_1 F_2(t, q, p') = \partial_1 W(t, q, p'), \qquad (6.7)$$

so the primed momenta are the same in the two formulations. But

$$q' = \partial_2 F_2(t, q, p') = \partial_2 W(t, q, p') - DE(p')\,t, \qquad (6.8)$$

so we see that the primed coordinates in the two formulations differ by a term that is linear in time—in the F₂ formulation both p′(t) = p′₀ and q′(t) = q′₀ are constant. Thus we can use either W or F₂ as the generating function, depending on the form of the new Hamiltonian we want.

Note that if H is time independent then we can often find a time-independent W that does the job. For time-independent W the Hamilton–Jacobi equation simplifies to

$$E(p') = H(t, q, \partial_1 W(t, q, p')). \qquad (6.9)$$

The corresponding F2 is then linear in time. Notice that an implicit requirement is that the energy can be written as a function of the new momenta alone. This excludes the possibility that the transformed phase-space coordinates q′ and p′ are simply initial conditions for q and p.

It turns out that there is flexibility in the choice of the function E. With an appropriate choice the phase-space coordinates obtained through the transformation generated by W are action-angle coordinates.

Exercise 6.1: Hamilton–Jacobi with F1

We have used an F2-type generating function to carry out the Hamilton–Jacobi transformations. Carry out the equivalent transformations with an F1-type generating function. Find the equations corresponding to equations (6.4), (6.5), and (6.9).

6.1.1 Harmonic Oscillator

Consider the familiar time-independent Hamiltonian

$$H(t, x, p) = \frac{p^2}{2m} + \frac{kx^2}{2}. \qquad (6.10)$$

We form the Hamilton–Jacobi equation for this problem:

$$0 = H(t, x, \partial_1 F_2(t, x, p')) + \partial_0 F_2(t, x, p'). \qquad (6.11)$$

Using F2(t, x, p′) = W (t, x, p′) − E(p′)t, we find

$$E(p') = H(t, x, \partial_1 W(t, x, p')). \qquad (6.12)$$

Writing this out explicitly yields

$$E(p') = \frac{(\partial_1 W(t, x, p'))^2}{2m} + \frac{kx^2}{2}, \qquad (6.13)$$

and solving for ∂1W gives

$$\partial_1 W(t, x, p') = \sqrt{2m\left(E(p') - \frac{kx^2}{2}\right)}. \qquad (6.14)$$

Integrating gives the desired W:

$$W(t, x, p') = \int^x \sqrt{2m\left(E(p') - \frac{kz^2}{2}\right)}\, dz. \qquad (6.15)$$
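We can check this reduction symbolically in our Scheme system. The following is a minimal sketch, taking E to be an unspecified literal function of the new momentum; the names dWdx and E are our own, and p stands for p′:

(define E (literal-function 'E))

;; The derivative of W with respect to x, from equation (6.14).
(define ((dWdx k m) x p)
  (sqrt (* 2 m (- (E p) (* 1/2 k (square x))))))

;; Substituting into the right-hand side of (6.13) and subtracting
;; E(p') should leave nothing.
(print-expression
 (- (+ (/ (square ((dWdx 'k 'm) 'x 'p)) (* 2 'm))
       (* 1/2 'k (square 'x)))
    (E 'p)))

which should print 0.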

We can use either W or the corresponding F2 as the generating function. First, take W to be the generating function. We obtain the coordinate transformation by differentiating:

$$x' = \partial_2 W(t, x, p') = \int^x \frac{m\, DE(p')}{\sqrt{2m\left(E(p') - \frac{kz^2}{2}\right)}}\, dz \qquad (6.16)$$

and then integrating to get

$$x' = \sqrt{\frac{m}{k}}\, DE(p') \arcsin\!\left(\sqrt{\frac{k}{2E(p')}}\, x\right) + C(p'), \qquad (6.17)$$

with some integration constant C(p′). Inverting this, we get the unprimed coordinate in terms of the primed coordinate and momentum:

$$x = \sqrt{\frac{2E(p')}{k}} \sin\!\left[\frac{1}{DE(p')}\sqrt{\frac{k}{m}}\,(x' - C(p'))\right]. \qquad (6.18)$$

The new Hamiltonian H′ depends only on the momentum:

$$H'(t, x', p') = E(p'). \qquad (6.19)$$

The equations of motion are just

$$Dx'(t) = \partial_2 H'(t, x'(t), p'(t)) = DE(p'), \qquad Dp'(t) = -\partial_1 H'(t, x'(t), p'(t)) = 0, \qquad (6.20)$$

with solution

$$x'(t) = DE(p')\,t + x'_0, \qquad p'(t) = p'_0 \qquad (6.21)$$

for initial conditions x′₀ and p′₀. If we plug these expressions for x′(t) and p′(t) into equation (6.18) we find

$$\begin{aligned} x(t) &= \sqrt{\frac{2E(p')}{k}} \sin\!\left[\frac{1}{DE(p')}\sqrt{\frac{k}{m}}\,(DE(p')\,t + x'_0 - C(p'))\right] \\ &= \sqrt{\frac{2E(p')}{k}} \sin\!\left[\sqrt{\frac{k}{m}}\,(t - t_0)\right] = A \sin(\omega t + \varphi), \end{aligned} \qquad (6.22)$$

where the angular frequency is ω = √(k/m), the amplitude is A = √(2E(p′)/k), and the phase is φ = −ωt₀ = ω(x′₀ − C(p′))/DE(p′).

We can also use F₂ = W − Et as the generating function. The new Hamiltonian is zero, so both x′ and p′ are constant, but the relationship between the old and new variables is

$$\begin{aligned} x' &= \partial_2 F_2(t, x, p') = \partial_2 W(t, x, p') - DE(p')\,t \\ &= \int^x \frac{m\, DE(p')}{\sqrt{2m\left(E(p') - \frac{kz^2}{2}\right)}}\, dz - DE(p')\,t \\ &= \sqrt{\frac{m}{k}}\, DE(p') \arcsin\!\left(\sqrt{\frac{k}{2E(p')}}\, x\right) + C(p') - DE(p')\,t. \end{aligned} \qquad (6.23)$$

Plugging in the solution x′ = x′₀ and p′ = p′₀ and solving for x, we find equation (6.22). So once again we see that the two approaches are equivalent.

It is interesting to note that the solution depends upon the constants E(p′) and DE(p′), but otherwise the motion is not dependent in any essential way on what the function E actually is. The momentum p′ is constant and the values of the constants are set by the initial conditions. Given a particular function E, the initial conditions determine p′, but the solution can be obtained without further specifying the E function.

If we choose particular functions E we can get particular canonical transformations. For example, a convenient choice is simply

$$E(p') = \alpha p', \qquad (6.24)$$

for some constant α that will be chosen later. We find

$$x = \sqrt{\frac{2\alpha p'}{k}} \sin\frac{\omega x'}{\alpha}. \qquad (6.25)$$

So we see that a convenient choice is α = ω = √(k/m), so

$$x = \sqrt{\frac{2p'}{\beta}} \sin x', \qquad (6.26)$$

with β = √(km). The new Hamiltonian is

$$H'(t, x', p') = E(p') = \omega p'. \qquad (6.27)$$

The solution is just x′ = ωt + x′₀ and p′ = p′₀. Substituting the expression for x in terms of x′ and p′ into H(t, x, p) = H′(t, x′, p′), we derive

$$p = \left[2m\left(p'\alpha - \frac{k}{2}x^2\right)\right]^{1/2} = \sqrt{2p'\beta} \cos x'. \qquad (6.28)$$

The two transformation equations (6.26) and (6.28) are what we have called the polar-canonical transformation (equation 5.29). We have already shown that this transformation is canonical and that it solves the harmonic oscillator, but there it appeared without derivation. Here we have derived it as a particular case of the solution of the Hamilton–Jacobi equation.
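We can also check directly that this transformation is symplectic. For one degree of freedom the condition is simply that the Poisson bracket of the two components, taken with respect to the new variables, is 1. A minimal sketch, with procedure names of our own choosing (xp and pp stand for x′ and p′):

;; The components of the transformation (6.26) and (6.28), as
;; functions of the new variables.
(define ((polar-x beta) xp pp)
  (* (sqrt (/ (* 2 pp) beta)) (sin xp)))

(define ((polar-p beta) xp pp)
  (* (sqrt (* 2 beta pp)) (cos xp)))

;; The two-dimensional Poisson bracket of the components.
(print-expression
 (- (* (((partial 0) (polar-x 'beta)) 'xp 'pp)
       (((partial 1) (polar-p 'beta)) 'xp 'pp))
    (* (((partial 1) (polar-x 'beta)) 'xp 'pp)
       (((partial 0) (polar-p 'beta)) 'xp 'pp))))

which should simplify to 1, confirming that the transformation is symplectic.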

We can also explore other choices for the E function. For example, we could choose

$$E(p') = \tfrac{1}{2}\alpha (p')^2. \qquad (6.29)$$

Following the same steps as before, we find

$$x = \sqrt{\frac{\alpha (p')^2}{k}} \sin\frac{\omega x'}{\alpha p'}. \qquad (6.30)$$

So a convenient choice is again α = ω, leaving

$$x = \frac{p'}{\beta} \sin\frac{x'}{p'}, \qquad p = \beta p' \cos\frac{x'}{p'}, \qquad (6.31)$$

with β = (km)^{1/4}. By construction, this transformation is also canonical and also brings the harmonic oscillator problem into an easily solvable form:

$$H'(t, x', p') = \tfrac{1}{2}\omega (p')^2. \qquad (6.32)$$

The harmonic oscillator Hamiltonian has been transformed to what looks a lot like the Hamiltonian for a free particle. This is very interesting. Notice that whereas Hamiltonian (6.27) does not have a well defined Legendre transform to an equivalent Lagrangian, the “free particle” harmonic oscillator has a well defined Legendre transform:

$$L'(t, x', \dot{x}') = \frac{(\dot{x}')^2}{2\omega}. \qquad (6.33)$$

Of course, there may be additional properties that make one choice more useful than others for particular applications.

Exercise 6.2: Pendulum

Formulate and solve a Hamilton–Jacobi equation for the pendulum; investigate both the circulating and oscillating regions of phase space. (Note: This is a long story and requires some knowledge of elliptic functions.)

6.1.2 Hamilton–Jacobi Solution of the Kepler Problem

We can use the Hamilton–Jacobi equation to find canonical coordinates that solve the Kepler problem. This is an essential first step in doing perturbation theory for orbital problems.

In rectangular coordinates (x, y, z), the Kepler Hamiltonian is

$$H_r(t; x, y, z; p_x, p_y, p_z) = \frac{p^2}{2m} - \frac{\mu}{r}, \qquad (6.34)$$

where $r^2 = x^2 + y^2 + z^2$ and $p^2 = p_x^2 + p_y^2 + p_z^2$.

We try a generating function of the form $W(t; x, y, z; p'_x, p'_y, p'_z)$. The Hamilton–Jacobi equation is then1

$$E(p') = \frac{1}{2m}\Big[(\partial_{1,0}W(t; x, y, z; p'_x, p'_y, p'_z))^2 + (\partial_{1,1}W(t; x, y, z; p'_x, p'_y, p'_z))^2 + (\partial_{1,2}W(t; x, y, z; p'_x, p'_y, p'_z))^2\Big] - \frac{\mu}{r}. \qquad (6.35)$$

This is a partial differential equation in the three partial derivatives of W. We stare at it a while and give up.

Next we try converting to spherical coordinates. This is motivated by the fact that the potential energy depends only on r. The Hamiltonian in spherical coordinates (r, θ, φ), where θ is the colatitude and φ is the longitude, is

$$H_s(t; r, \theta, \varphi; p_r, p_\theta, p_\varphi) = \frac{1}{2m}\left[p_r^2 + \frac{p_\theta^2}{r^2} + \frac{p_\varphi^2}{r^2 \sin^2\theta}\right] - \frac{\mu}{r}. \qquad (6.36)$$

The Hamilton–Jacobi equation is

$$E(p'_0, p'_1, p'_2) = \frac{1}{2m}\left[(\partial_{1,0}W)^2 + \frac{1}{r^2}(\partial_{1,1}W)^2 + \frac{1}{r^2\sin^2\theta}(\partial_{1,2}W)^2\right] - \frac{\mu}{r}, \qquad (6.37)$$

where each partial derivative of W is evaluated at $(t; r, \theta, \varphi; p'_0, p'_1, p'_2)$.

We can solve this Hamilton–Jacobi equation by successively isolating the dependence on the various variables. Looking first at the φ dependence, we see that, outside of W, φ appears only in one partial derivative. If we write

$$W(t; r, \theta, \varphi; p'_0, p'_1, p'_2) = f(r, \theta; p'_0, p'_1, p'_2) + p'_2\,\varphi, \qquad (6.38)$$

then $\partial_{1,2}W(t; r, \theta, \varphi; p'_0, p'_1, p'_2) = p'_2$, and then φ does not appear in the remaining equation for f:

$$E(p'_0, p'_1, p'_2) = \frac{1}{2m}\left\{(\partial_0 f(r, \theta; p'))^2 + \frac{1}{r^2}\left[(\partial_1 f(r, \theta; p'))^2 + \frac{(p'_2)^2}{\sin^2\theta}\right]\right\} - \frac{\mu}{r}. \qquad (6.39)$$

Any function of the $p'_i$ could have been used as the coefficient of φ in the generating function. This particular choice has the nice feature that $p'_2$ is the z component of the angular momentum.

We can eliminate the θ dependence if we choose

$$f(r, \theta; p'_0, p'_1, p'_2) = R(r; p'_0, p'_1, p'_2) + \Theta(\theta; p'_0, p'_1, p'_2) \qquad (6.40)$$

and require that Θ be a solution to

$$(\partial_0 \Theta(\theta; p'_0, p'_1, p'_2))^2 + \frac{(p'_2)^2}{\sin^2\theta} = (p'_1)^2. \qquad (6.41)$$

We are free to choose the right-hand side to be any function of the new momenta. This choice reflects the fact that the left-hand side is non-negative. It turns out that $p'_1$ is the total angular momentum. This equation for Θ can be solved by quadrature.

The remaining equation that determines R is

$$E(p'_0, p'_1, p'_2) = \frac{1}{2m}\left[(\partial_0 R(r; p'_0, p'_1, p'_2))^2 + \frac{(p'_1)^2}{r^2}\right] - \frac{\mu}{r}, \qquad (6.42)$$

which also can be solved by quadrature.

Altogether the solution of the Hamilton–Jacobi equation reads

$$W(t; r, \theta, \varphi; p'_0, p'_1, p'_2) = \int^r \left(2mE(p') + \frac{2m\mu}{r} - \frac{(p'_1)^2}{r^2}\right)^{1/2} dr + \int^\theta \left((p'_1)^2 - \frac{(p'_2)^2}{\sin^2\theta}\right)^{1/2} d\theta + p'_2\,\varphi. \qquad (6.43)$$

It is interesting that our solution to the Hamilton–Jacobi partial differential equation is of the form

$$W(t; r, \theta, \varphi; p'_0, p'_1, p'_2) = R(r; p'_0, p'_1, p'_2) + \Theta(\theta; p'_0, p'_1, p'_2) + \Phi(\varphi; p'_0, p'_1, p'_2). \qquad (6.44)$$

Thus we have a separation-of-variables technique that involves writing the solution as a sum of functions of the individual variables. This might be contrasted with the separation-of-variables technique encountered in elementary quantum mechanics and classical electrodynamics, which uses products of functions of individual variables.

The coordinates $q' = (q'^0, q'^1, q'^2)$ conjugate to the momenta $p' = [p'_0, p'_1, p'_2]$ are

$$\begin{aligned} q'^0 &= \partial_{2,0}W(t; r, \theta, \varphi; p'_0, p'_1, p'_2) \\ &= m\,\partial_0 E(p') \int^r \left(2mE(p') + \frac{2m\mu}{r} - \frac{(p'_1)^2}{r^2}\right)^{-1/2} dr \\ q'^1 &= \partial_{2,1}W(t; r, \theta, \varphi; p'_0, p'_1, p'_2) \\ &= p'_1 \int^\theta \left((p'_1)^2 - \frac{(p'_2)^2}{\sin^2\theta}\right)^{-1/2} d\theta + \int^r \left(m\,\partial_1 E(p') - \frac{p'_1}{r^2}\right)\left(2mE(p') + \frac{2m\mu}{r} - \frac{(p'_1)^2}{r^2}\right)^{-1/2} dr \\ q'^2 &= \partial_{2,2}W(t; r, \theta, \varphi; p'_0, p'_1, p'_2) \\ &= \varphi - \int^\theta \frac{p'_2}{\sin^2\theta}\left((p'_1)^2 - \frac{(p'_2)^2}{\sin^2\theta}\right)^{-1/2} d\theta + m\,\partial_2 E(p') \int^r \left(2mE(p') + \frac{2m\mu}{r} - \frac{(p'_1)^2}{r^2}\right)^{-1/2} dr. \end{aligned}$$

We are still free to choose the functional form of E. A convenient (and conventional) choice is

$$E(p'_0, p'_1, p'_2) = -\frac{m\mu^2}{2(p'_0)^2}. \qquad (6.45)$$

With this choice the momentum $p'_0$ has dimensions of angular momentum, and the conjugate coordinate is an angle.

The Hamiltonian for the Kepler problem is reduced to

$$H'(t, q', p') = E(p') = -\frac{m\mu^2}{2(p'_0)^2}. \qquad (6.46)$$

Thus

$$q'^0 = nt + \beta^0 \qquad (6.47)$$

$$q'^1 = \beta^1 \qquad (6.48)$$

$$q'^2 = \beta^2, \qquad (6.49)$$

where $n = m\mu^2/(p'_0)^3$ and where $\beta^0$, $\beta^1$, and $\beta^2$ are the initial values of the components of q′. Only one of the new variables changes with time.

The canonical phase-space coordinates can be written in terms of the parameters that specify an orbit. We merely summarize the results; for further explanation see [36] or [38].

Assume we have a bound orbit with semimajor axis a, eccentricity e, inclination i, longitude of ascending node Ω, argument of pericenter ω, and mean anomaly M. The three canonical momenta are $p'_0 = \sqrt{m\mu a}$, $p'_1 = \sqrt{m\mu a(1-e^2)}$, and $p'_2 = \sqrt{m\mu a(1-e^2)}\,\cos i$. The first momentum is related to the energy, the second momentum is the total angular momentum, and the third momentum is the component of the angular momentum in the z direction. The conjugate canonical coordinates are (q′)⁰ = M, (q′)¹ = ω, and (q′)² = Ω.

6.1.3 F2 and the Lagrangian

The solution to the Hamilton–Jacobi equation, the mixed-variable generating function that generates time evolution, is related to the action used in the variational principle. In particular, the time derivative of the generating function along realizable paths has the same value as the Lagrangian.

Let $\tilde{F}_2(t) = F_2(t, q(t), p'(t))$ be the value of F₂ along the paths q and p′ at time t. The derivative of $\tilde{F}_2$ is

$$D\tilde{F}_2(t) = \partial_0 F_2(t, q(t), p'(t)) + \partial_1 F_2(t, q(t), p'(t))\, Dq(t) + \partial_2 F_2(t, q(t), p'(t))\, Dp'(t). \qquad (6.50)$$

Using the Hamilton–Jacobi equation (6.4), this becomes

$$D\tilde{F}_2(t) = -H(t, q(t), \partial_1 F_2(t, q(t), p'(t))) + \partial_1 F_2(t, q(t), p'(t))\, Dq(t) + \partial_2 F_2(t, q(t), p'(t))\, Dp'(t). \qquad (6.51)$$

Now, using equation (6.2), we get

$$D\tilde{F}_2(t) = -H(t, q(t), p(t)) + p(t)\, Dq(t) + \partial_2 F_2(t, q(t), p'(t))\, Dp'(t). \qquad (6.52)$$

But $p(t)\,Dq(t) - H(t, q(t), p(t)) = L(t, q(t), Dq(t))$, so

$$D\tilde{F}_2(t) = L(t, q(t), Dq(t)) + \partial_2 F_2(t, q(t), p'(t))\, Dp'(t). \qquad (6.53)$$

On realizable paths we have Dp′(t) = 0, so along realizable paths the time derivative of F2 is the same as the Lagrangian along the path. The time integral of the Lagrangian along any path is the action along that path. This means that, up to an additive term that is constant on realizable paths but may be a function of the transformed phase-space coordinates q′ and p′, the F2 that solves the Hamilton–Jacobi equation has the same value as the Lagrangian action for realizable paths.

The same conclusion follows for the Hamilton–Jacobi equation formulated in terms of F1. Up to an additive term that is constant on realizable paths but may be a function of the transformed phase-space coordinates q′ and p′, the F1 that solves the corresponding Hamilton–Jacobi equation has the same value as the Lagrangian action for realizable paths.

Recall that a transformation given by an F2-type generating function is also given by an F1-type generating function related to it by a Legendre transform (see equation 5.142):

$$F_1(t, q, q') = F_2(t, q, p') - q'p', \qquad (6.54)$$

provided the transformations are nonsingular. In this case, both q′ and p′ are constant on realizable paths, so the additive constants that make F₁ and F₂ equal to the Lagrangian action differ by q′p′.

Exercise 6.3: Harmonic oscillator

Let's check this for the harmonic oscillator (of course).

a. Finish the integral (6.15):

$$W(t, x, p') = \int^x \sqrt{2m\left(E(p') - \frac{kz^2}{2}\right)}\, dz.$$

Write the result in terms of the amplitude $A = \sqrt{2E(p')/k}$.

b. Check that this generating function gives the transformation

$$x' = \partial_2 W(t, x, p') = \sqrt{\frac{m}{k}}\, DE(p') \arcsin\!\left(\frac{x}{\sqrt{2E(p')/k}}\right),$$

which is the same as equation (6.17) for a particular choice of the integration constant. The other part of the transformation is

$$p = \partial_1 W(t, x, p') = \sqrt{mk}\,\sqrt{A^2 - x^2},$$

with the same definition of A as before.

c. Compute the time derivative of the associated F2 along realizable paths (Dp′(t) = 0), and compare it to the Lagrangian along realizable paths.

6.1.4 The Action Generates Time Evolution

We define the function F¯(t1,q1,t2,q2) to be the value of the action for a realizable path q such that q(t1) = q1 and q(t2) = q2. So F¯ satisfies

$$\bar{F}(t_1, q(t_1), t_2, q(t_2)) = S[q](t_1, t_2) = \int_{t_1}^{t_2} L \circ \Gamma[q]. \qquad (6.55)$$

For variations η that are not necessarily zero at the end times and for realizable paths q, the variation of the action is

$$\delta_\eta S[q](t_1, t_2) = (\partial_2 L \circ \Gamma[q])\,\eta \Big|_{t_1}^{t_2} = p(t_2)\eta(t_2) - p(t_1)\eta(t_1). \qquad (6.56)$$

Alternatively, the variation of S[q] in equation (6.55) gives

$$\delta_\eta S[q](t_1, t_2) = \partial_1 \bar{F}(t_1, q(t_1), t_2, q(t_2))\,\eta(t_1) + \partial_3 \bar{F}(t_1, q(t_1), t_2, q(t_2))\,\eta(t_2). \qquad (6.57)$$

Comparing equations (6.56) and (6.57) and using the fact that the variation η is arbitrary, we find

$$\partial_1 \bar{F}(t_1, q(t_1), t_2, q(t_2)) = -p(t_1), \qquad \partial_3 \bar{F}(t_1, q(t_1), t_2, q(t_2)) = p(t_2). \qquad (6.58)$$

The partial derivatives of F¯ with respect to the coordinate arguments give the momenta. Abstracting off paths, we have

$$\partial_1 \bar{F}(t_1, q_1, t_2, q_2) = -p_1, \qquad \partial_3 \bar{F}(t_1, q_1, t_2, q_2) = p_2. \qquad (6.59)$$

This looks a bit like the F1-type generating function relations, but here there are two times. Solving equations (6.59) for q2 and p2 as functions of t2 and the initial state t1, q1, p1, we get the time evolution of the system in terms of F¯. The function F¯ generates time evolution.
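For example, for a free particle of mass m the Lagrangian action along the realizable path connecting the endpoints is

$$\bar{F}(t_1, q_1, t_2, q_2) = \frac{m(q_2 - q_1)^2}{2(t_2 - t_1)}.$$

Equations (6.59) give $p_1 = -\partial_1\bar{F}(t_1, q_1, t_2, q_2) = m(q_2 - q_1)/(t_2 - t_1)$ and $p_2 = \partial_3\bar{F}(t_1, q_1, t_2, q_2) = m(q_2 - q_1)/(t_2 - t_1)$. Solving the first relation for $q_2$ gives $q_2 = q_1 + (p_1/m)(t_2 - t_1)$, and then $p_2 = p_1$: exactly the time evolution of the free particle.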

If we vary the lower limit of the action integral we get

$$\partial_0 (S[q])(t_1, t_2) = -L(t_1, q(t_1), Dq(t_1)). \qquad (6.60)$$

Using equation (6.55), and given a realizable path q such that q(t1) = q1 and q(t2) = q2, we get the partial derivatives with respect to the time slots:

$$\partial_0 (S[q])(t_1, t_2) = \partial_0 \bar{F}(t_1, q_1, t_2, q_2) + \partial_1 \bar{F}(t_1, q_1, t_2, q_2)\, Dq(t_1) = \partial_0 \bar{F}(t_1, q_1, t_2, q_2) - p(t_1)\, Dq(t_1). \qquad (6.61)$$

Rearranging the terms of equation (6.61) and using equation (6.60) we get

$$\partial_0 \bar{F}(t_1, q_1, t_2, q_2) = H(t_1, q_1, p_1) = H(t_1, q_1, -\partial_1 \bar{F}(t_1, q_1, t_2, q_2)), \qquad (6.62)$$

and similarly

$$\partial_2 \bar{F}(t_1, q_1, t_2, q_2) = -H(t_2, q_2, p_2) = -H(t_2, q_2, \partial_3 \bar{F}(t_1, q_1, t_2, q_2)). \qquad (6.63)$$

These are a pair of Hamilton–Jacobi equations, computed at the endpoints of the path.

The function F̄ can be written in terms of an F₁ that satisfies a Hamilton–Jacobi equation for H. We can compute time evolution by using the F₁ solution of the Hamilton–Jacobi equation to express the state (t₁, q₁, p₁) in terms of the constants q′ and p′. Using the same solution we can then perform a subsequent transformation back from q′, p′ to the original state variables at a different time t₂, giving the state (t₂, q₂, p₂). The composition of these two canonical transformations is canonical (see exercise 5.12).

The generating function for the composition is the difference of the generating functions for each step:

$$\bar{F}_x(t_1, q_1, q', t_2, q_2) = F_1(t_2, q_2, q') - F_1(t_1, q_1, q'), \qquad (6.64)$$

with the condition

$$\partial_2 F_1(t_2, q_2, q') - \partial_2 F_1(t_1, q_1, q') = 0, \qquad (6.65)$$

which allows us to eliminate q′ in terms of t1, q1, t2, and q2. So we can write

$$\bar{F}(t_1, q_1, t_2, q_2) = F_1(t_2, q_2, q') - F_1(t_1, q_1, q'). \qquad (6.66)$$

Exercise 6.4: Uniform acceleration

a. Compute the Lagrangian action, as a function of the endpoints and times, for a uniformly accelerated particle. Use this to construct the canonical transformation for time evolution from a given initial state.

b. Solve the Hamilton–Jacobi equation for the uniformly accelerated particle, obtaining the F1 that makes the transformed Hamiltonian zero. Show that the Lagrangian action can be expressed as a difference of two applications of this F1.

6.2   Time Evolution is Canonical

We use time evolution to generate a transformation

$$(t, q, p) = C_\Delta(t', q', p') \qquad (6.67)$$

that is obtained in the following way. Let σ(t)=(t,q¯(t),p¯(t)) be a solution of Hamilton's equations. The transformation CΔ satisfies

$$C_\Delta(\sigma(t)) = \sigma(t + \Delta), \qquad (6.68)$$

or, equivalently,

$$C_\Delta(t, \bar{q}(t), \bar{p}(t)) = (t + \Delta, \bar{q}(t + \Delta), \bar{p}(t + \Delta)). \qquad (6.69)$$

Notice that CΔ changes the time component. This is the first transformation of this kind that we have considered.2

Given a state (t′, q′, p′), we find the phase-space path σ emanating from this state as an initial condition, satisfying

$$q' = \bar{q}(t'), \qquad p' = \bar{p}(t'). \qquad (6.70)$$

The value (t, q, p) of C_Δ(t′, q′, p′) is then (t′ + Δ, q̄(t′ + Δ), p̄(t′ + Δ)).

Time evolution is canonical if the transformation CΔ is symplectic and if the Hamiltonian transforms in an appropriate manner. The transformation CΔ is symplectic if the bilinear antisymmetric form ω is invariant (see equation 5.73) for a general pair of linearized state variations with zero time component.

Let ζ′ be an increment with zero time component of the state (t′, q′, p′). The linearized increment in the value of C_Δ(t′, q′, p′) is ζ = DC_Δ(t′, q′, p′)ζ′: the image of the increment is obtained by multiplying the increment by the derivative of the transformation. On the other hand, the transformation is obtained by time evolution, so the image of the increment can also be found by the time evolution of the linearized variational system. Let

$$\bar{\zeta}(t) = (0, \bar{\zeta}_q(t), \bar{\zeta}_p(t)), \qquad \bar{\zeta}'(t) = (0, \bar{\zeta}'_q(t), \bar{\zeta}'_p(t)) \qquad (6.71)$$

be variations of the state path σ(t)=(t,q¯(t),p¯(t)); then

$$\bar{\zeta}(t + \Delta) = DC_\Delta(t, \bar{q}(t), \bar{p}(t))\, \bar{\zeta}(t), \qquad \bar{\zeta}'(t + \Delta) = DC_\Delta(t, \bar{q}(t), \bar{p}(t))\, \bar{\zeta}'(t). \qquad (6.72)$$

The symplectic requirement is

$$\omega(\bar{\zeta}(t), \bar{\zeta}'(t)) = \omega(\bar{\zeta}(t + \Delta), \bar{\zeta}'(t + \Delta)). \qquad (6.73)$$

This must be true for arbitrary Δ, so it is satisfied if the following quantity is constant:

$$A(t) = \omega(\bar{\zeta}(t), \bar{\zeta}'(t)) = P(\bar{\zeta}'(t))\, Q(\bar{\zeta}(t)) - P(\bar{\zeta}(t))\, Q(\bar{\zeta}'(t)) = \bar{\zeta}'_p(t)\, \bar{\zeta}_q(t) - \bar{\zeta}_p(t)\, \bar{\zeta}'_q(t). \qquad (6.74)$$

We compute the derivative:

$$DA(t) = D\bar{\zeta}'_p(t)\, \bar{\zeta}_q(t) + \bar{\zeta}'_p(t)\, D\bar{\zeta}_q(t) - D\bar{\zeta}_p(t)\, \bar{\zeta}'_q(t) - \bar{\zeta}_p(t)\, D\bar{\zeta}'_q(t). \qquad (6.75)$$

With Hamilton's equations, the variations satisfy

$$D\bar{\zeta}_q(t) = \partial_1\partial_2 H(t, \bar{q}(t), \bar{p}(t))\, \bar{\zeta}_q(t) + \partial_2\partial_2 H(t, \bar{q}(t), \bar{p}(t))\, \bar{\zeta}_p(t),$$
$$D\bar{\zeta}_p(t) = -\partial_1\partial_1 H(t, \bar{q}(t), \bar{p}(t))\, \bar{\zeta}_q(t) - \partial_2\partial_1 H(t, \bar{q}(t), \bar{p}(t))\, \bar{\zeta}_p(t). \qquad (6.76)$$

Substituting these into DA and collecting terms, we find3

$$DA(t) = 0. \qquad (6.77)$$

We conclude that time evolution generates a phase-space transformation with symplectic derivative.
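We can illustrate this with the harmonic oscillator, whose time evolution is available in closed form. The following sketch (the procedure names are our own) implements the transformation C_Δ of equation (6.69); for one degree of freedom the qp part of its derivative is symplectic exactly when the bracket of the coordinate and momentum components with respect to the initial state is 1:

;; Closed-form time evolution of the harmonic oscillator; note
;; that the transformation advances the time component.
(define (((C-harmonic m k) delta-t) state)
  (let ((omega (sqrt (/ k m)))
        (t (time state))
        (x (coordinate state))
        (p (momentum state)))
    (up (+ t delta-t)
        (+ (* x (cos (* omega delta-t)))
           (* p (/ (sin (* omega delta-t)) (* m omega))))
        (- (* p (cos (* omega delta-t)))
           (* x m omega (sin (* omega delta-t)))))))

;; The evolved q and p as functions of the initial x and p.
(define (q-evolved x p)
  (coordinate (((C-harmonic 'm 'k) 'dt) (up 't x p))))

(define (p-evolved x p)
  (momentum (((C-harmonic 'm 'k) 'dt) (up 't x p))))

(print-expression
 (- (* (((partial 0) q-evolved) 'x 'p) (((partial 1) p-evolved) 'x 'p))
    (* (((partial 1) q-evolved) 'x 'p) (((partial 0) p-evolved) 'x 'p))))

which should simplify to 1, independent of dt.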

To make a canonical transformation we must specify how the Hamiltonian transforms. The same Hamiltonian describes the evolution of a state and a time-advanced state because the latter is just another state. Thus the transformed Hamiltonian is the same as the original Hamiltonian.

Liouville's theorem, again

We deduced that volumes in phase space are preserved by time evolution by showing that the divergence of the phase flow is zero, using the equations of motion (see section 3.8). We can also show that volumes in phase space are preserved by the evolution using the fact that time evolution is a canonical transformation.

We have shown that phase-space volume is preserved for symplectic transformations. Now we have shown that the transformation generated by time evolution is a symplectic transformation. Therefore, the transformation generated by time evolution preserves phase-space volume. This is an alternate proof of Liouville's theorem.

Another time-evolution transformation

There is another canonical transformation that can be constructed from time evolution. We define the transformation C′_Δ such that

$$C'_\Delta = C_\Delta \circ S_{-\Delta}, \qquad (6.78)$$

where SΔ(a, b, c) = (a + Δ, b, c) shifts the time of a phase-space state.4 More explicitly, given a state (t, q′, p′), we evolve the state that is obtained by subtracting Δ from t; that is, we take the state (t − Δ, q′, p′) as an initial state for evolution by Hamilton's equations. The state path σ satisfies

$$\sigma(t - \Delta) = (t - \Delta, \bar{q}(t - \Delta), \bar{p}(t - \Delta)) = (t - \Delta, q', p'). \qquad (6.79)$$

The output of the transformation is the state

$$(t, q, p) = \sigma(t) = (t, \bar{q}(t), \bar{p}(t)). \qquad (6.80)$$

The transformation satisfies

$$(t, \bar{q}(t), \bar{p}(t)) = C'_\Delta(t, \bar{q}(t - \Delta), \bar{p}(t - \Delta)). \qquad (6.81)$$

The arguments of C′_Δ are not a consistent phase-space state; the time argument must be decremented by Δ to obtain a consistent state. The transformation is completed by evolution of this consistent state.

Why is this a good idea? Our usual canonical transformations do not change the time component. The C_Δ transformation changes the time component, but C′_Δ does not. It is canonical and in the usual form:

$$(t, q, p) = C'_\Delta(t, q', p'). \qquad (6.82)$$

The C′_Δ transformation requires an adjustment of the Hamiltonian. The Hamiltonian H′_Δ that gives the correct Hamilton's equations at the transformed phase-space point is the original Hamiltonian composed with a function that decrements the independent variable by Δ:

$$H'_\Delta(t, q', p') = H(t - \Delta, q', p') \qquad (6.83)$$

or

$$H'_\Delta = H \circ S_{-\Delta}. \qquad (6.84)$$

Notice that if H is time independent then H′_Δ = H.

Assume we have a procedure C such that ((C delta-t) state) implements a time-evolution transformation C_Δ of the state state with time interval delta-t; then the procedure Cp such that ((Cp delta-t) state) implements C′_Δ of the same state and time interval can be derived from the procedure C by using the procedure

(define ((C->Cp C) delta-t)
  (compose (C delta-t) (shift-t (- delta-t))))

where shift-t implements SΔ:

(define ((shift-t delta-t) state)
  (up
    (+ (time state) delta-t)
    (coordinate state)
    (momentum state)))

To complete the canonical transformation we have a procedure that transforms the Hamiltonian:

(define ((H->Hp delta-t) H)
  (compose H (shift-t (- delta-t))))
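For a time-independent Hamiltonian the shift has no effect. A small check, with a hypothetical free-particle Hamiltonian of our own naming:

(define ((H-free m) state)
  (/ (square (momentum state)) (* 2 m)))

;; H-free ignores its time argument, so the shifted Hamiltonian
;; should be identical to the original.
(print-expression
 ((- ((H->Hp 'delta-t) (H-free 'm)) (H-free 'm))
  (up 't 'x 'p)))

which should print 0, confirming that H′_Δ = H in this case.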

So both C_Δ and C′_Δ can be used to make canonical transformations by specifying how the old and new Hamiltonians are related. For C_Δ the Hamiltonian is unchanged. For C′_Δ the Hamiltonian is time shifted.

Exercise 6.5: Verification

The condition (5.20) that Hamilton's equations are preserved for C_Δ is

$$D_sH \circ C_\Delta = DC_\Delta \cdot D_sH,$$

and the condition that Hamilton's equations are preserved for C′_Δ is

$$D_sH \circ C'_\Delta = DC'_\Delta \cdot D_sH'_\Delta.$$

Verify that these conditions are satisfied.

Exercise 6.6: Driven harmonic oscillator

We can use the simple driven harmonic oscillator to illustrate that time evolution yields a symplectic transformation that can be extended to be canonical in two ways. We use the driven harmonic oscillator because its solution can be compactly expressed in explicit form.

Suppose that we have a harmonic oscillator with natural frequency ω0 driven by a periodic sinusoidal drive of frequency ω and amplitude α. The Hamiltonian we will consider is

$$H(t, q, p) = \tfrac{1}{2}p^2 + \tfrac{1}{2}\omega_0^2 q^2 - \alpha q \cos\omega t.$$

The general solution for a given initial state (t0, q0, p0) evolved for a time Δ is

$$\begin{pmatrix} q(t_0 + \Delta) \\ p(t_0 + \Delta)/\omega_0 \end{pmatrix} = \begin{pmatrix} \cos\omega_0\Delta & \sin\omega_0\Delta \\ -\sin\omega_0\Delta & \cos\omega_0\Delta \end{pmatrix} \begin{pmatrix} q_0 - \alpha' \cos\omega t_0 \\ (1/\omega_0)(p_0 + \alpha'\omega \sin\omega t_0) \end{pmatrix} + \begin{pmatrix} \alpha' \cos\omega(t_0 + \Delta) \\ -\alpha'(\omega/\omega_0) \sin\omega(t_0 + \Delta) \end{pmatrix},$$

where $\alpha' = \alpha/(\omega_0^2 - \omega^2)$.

a. Fill in the details of the procedure

(define (((C* alpha omega omega0) delta-t) state)
  ... )

that implements the time-evolution transformation of the driven harmonic oscillator. Let C be (C* alpha omega omega0).

b. In terms of C*, the general solution emanating from a given state is

(define (((solution alpha omega omega0) state0) t)
  (((C* alpha omega omega0) (- t (time state0))) state0))

Check that the implementation of C* is correct by using it to construct the solution and verifying that the solution satisfies Hamilton's equations. Further check the solution by comparing to numerical integration.

c. We know that for any phase-space state function F the rate of change of that function along a solution path σ is

$$D(F \circ \sigma) = \partial_0 F \circ \sigma + \{F, H\} \circ \sigma.$$

Show, by writing a short program to test it, that this is true of the function implemented by (C delta) for the driven oscillator. Why is this interesting?

d. Use the procedure symplectic-transform? to show that both C and Cp are symplectic.

e. Use the procedure canonical? to verify that both C and Cp are canonical with the appropriate transformed Hamiltonian.

6.2.1 Another View of Time Evolution

We can also show that time evolution generates canonical transformations using the Poincaré–Cartan integral invariant.

Consider a two-dimensional region of phase-space coordinates, R′, at some particular time t′ (see figure 6.1). Let R be the image of this region at time t under time evolution for a time interval of Δ. The time evolution is governed by a Hamiltonian H. Let $\sum_i A_i$ be the sum of the oriented areas of the projections of R onto the fundamental canonical planes.5 Similarly, let $\sum_i A'_i$ be the sum of oriented projected areas for R′. We will show that $\sum_i A_i = \sum_i A'_i$, and thus the Poincaré integral invariant is preserved by time evolution. By showing that the Poincaré integral invariant is preserved, we will have shown that the qp part of the transformation generated by time evolution is symplectic. From this we can construct canonical transformations from time evolution as before.

In the extended phase space we see that the evolution sweeps out a cylindrical volume with endcaps R′ and R, each at a fixed time. Let R″ be the two-dimensional region swept out by the trajectories that map the boundary of region R′ to the boundary of region R. The regions R, R′, and R″ together form the boundary of a volume of phase-state space.

The Poincaré–Cartan integral invariant on the whole boundary is zero.6 Thus

$$\sum_{i=0}^{n} A_i - \sum_{i=0}^{n} A'_i + \sum_{i=0}^{n} A''_i = 0, \qquad (6.85)$$

where the n index indicates the t, p_t canonical plane. The second term is negative, because in the extended phase space we take the area to be positive if the normal to the surface is outward pointing.

We will show that the Poincaré–Cartan integral invariant for a region of phase space that is generated by time evolution is zero:

$$\sum_{i=0}^{n} A''_i = 0. \qquad (6.86)$$

This will allow us to conclude

$$\sum_{i=0}^{n} A_i - \sum_{i=0}^{n} A'_i = 0. \qquad (6.87)$$

The areas of the projection of R and R′ on the t, p_t plane are zero because R and R′ are at constant times, so for these regions the Poincaré–Cartan integral invariant is the same as the Poincaré integral invariant. Thus

$$\sum_{i=0}^{n-1} A_i = \sum_{i=0}^{n-1} A'_i. \qquad (6.88)$$

We are left with showing that the Poincaré–Cartan integral invariant for the region R″ is zero. This will be zero if the contribution from any small piece of R″ is zero. We will show this by showing that the ω form (see equation 5.70) on a small parallelogram in this region is zero. Let (0; q, t; p, p_t) be a vertex of this parallelogram. The parallelogram is specified by two edges ζ₁ and ζ₂ emanating from this vertex. For edge ζ₁ of the parallelogram, we take a constant-time phase-space increment with components Δq and Δp in the q and p directions. The first-order change in the Hamiltonian that corresponds to these changes is

Figure 6.1 All points in some two-dimensional region R′ in phase space at time t′ are evolved for some time interval Δ. At the time t the set of points define the two-dimensional region R. For example, the state labeled by the phase-space coordinates (t′, q′, p′) evolves to the state labeled by the coordinates (t, q, p).

$$\Delta H = \partial_1 H(t, q, p)\,\Delta q + \partial_2 H(t, q, p)\,\Delta p \qquad (6.89)$$

for constant time Δt = 0. The increment Δp_t is the negative of ΔH. So the extended phase-space increment is

$$\zeta_1 = (0;\ \Delta q, 0;\ \Delta p,\ -\partial_1 H(t, q, p)\Delta q - \partial_2 H(t, q, p)\Delta p). \qquad (6.90)$$

The edge ζ2 is obtained by time evolution of the vertex for a time interval Δt. Using Hamilton's equations, we obtain

$$\zeta_2 = (0;\ Dq(t)\Delta t, \Delta t;\ Dp(t)\Delta t,\ Dp_t(t)\Delta t) = (0;\ \partial_2 H(t, q, p)\Delta t, \Delta t;\ -\partial_1 H(t, q, p)\Delta t,\ -\partial_0 H(t, q, p)\Delta t). \qquad (6.91)$$

The ω form applied to these incremental states that form the edges of this parallelogram gives the area of the parallelogram:

$$\begin{aligned} \omega(\zeta_1, \zeta_2) &= Q(\zeta_1)\,P(\zeta_2) - P(\zeta_1)\,Q(\zeta_2) \\ &= (\Delta q, 0) \cdot (-\partial_1 H(t, q, p)\Delta t,\ -\partial_0 H(t, q, p)\Delta t) \\ &\quad - (\Delta p,\ -\partial_1 H(t, q, p)\Delta q - \partial_2 H(t, q, p)\Delta p) \cdot (\partial_2 H(t, q, p)\Delta t,\ \Delta t) \\ &= 0. \end{aligned} \qquad (6.92)$$

So we may conclude that the integral of this expression over the entire surface of the tube of trajectories is also zero. Thus the Poincaré–Cartan integral invariant is zero for any region that is generated by time evolution.

Having proven that the trajectory tube provides no contribution, we have shown that the Poincaré integral invariant of the two endcaps is the same. This proves that time evolution generates a symplectic qp transformation.

Area preservation of surfaces of section

We can use the Poincaré–Cartan invariant to prove that for autonomous two-degree-of-freedom systems, surfaces of section (constructed appropriately) preserve area.

To show this we consider a surface of section for one coordinate (say q2) equal to zero. We construct the section by accumulating the (q1, p1) pairs. We assume that all initial conditions have the same energy. We compute the sum of the areas of canonical projections in the extended phase space again. Because all initial conditions have the same q2 = 0 there is no area on the q2, p2 plane, and because all the trajectories have the same value of the Hamiltonian the area of the projection on the t, pt plane is also zero. So the sum of areas of the projections is just the area of the region on the surface of section. Now let each point on the surface of section evolve to the next section crossing. For each point on the section this may take a different amount of time. Compute the sum of the areas again for the mapped region. Again, all points of the mapped region have the same q2, so the area on the q2, p2 plane is zero, and they continue to have the same energy, so the area on the t, pt plane is zero. So the area of the mapped region is again just the area on the surface of section, the q1, p1 plane. Time evolution preserves the sum of areas, so the area on the surface of section is the same as the mapped area.

Thus surfaces of section preserve area provided that the section points are entirely on a canonical plane. For example, to make the Hénon–Heiles surfaces of section (see section 3.6.3) we plotted py versus y when x = 0 with px ≥ 0. So for all section points the x coordinate has the fixed value 0, the trajectories all have the same energy, and the points accumulated are entirely in the y, py canonical plane. So the Hénon–Heiles surfaces of section preserve area.

6.2.2 Yet Another View of Time Evolution

We can show directly from the action principle that time evolution generates a symplectic transformation.

Recall that the Lagrangian action S is

$$S[q](t_1, t_2) = \int_{t_1}^{t_2} L \circ \Gamma[q]. \qquad (6.93)$$

We computed the variation of the action in deriving the Lagrange equations. The variation is (see equation 1.33)

$$\delta_\eta S[q](t_1, t_2) = (\partial_2 L \circ \Gamma[q])\,\eta \Big|_{t_1}^{t_2} - \int_{t_1}^{t_2} (E[L] \circ \Gamma[q])\,\eta, \qquad (6.94)$$

rewritten in terms of the Euler–Lagrange operator E. In the derivation of the Lagrange equations we considered only variations that preserved the endpoints of the path being tested. However, equation (6.94) is true of arbitrary variations. Here we consider variations that are not zero at the endpoints around a realizable path q (one for which E [L] ∘ Γ[q] = 0). For these variations the variation of the action is just the integrated term:

$$\delta_\eta S[q](t_1, t_2) = (\partial_2 L \circ \Gamma[q])\,\eta \Big|_{t_1}^{t_2} = p(t_2)\eta(t_2) - p(t_1)\eta(t_1). \qquad (6.95)$$

Recall that p and η are structures, and the product implies a sum of products of components.

Consider a continuous family of realizable paths; the path for parameter s is q˜(s) and the coordinates of this path at time t are q˜(s)(t). We define η˜(s)=Dq˜(s); the variation of the path along the family is the derivative of the parametric path with respect to the parameter. Let

$$\tilde{S}(s) = S[\tilde{q}(s)](t_1, t_2) \qquad (6.96)$$

be the value of the action from t1 to t2 for path q˜(s). The derivative of the action along this parametric family of paths is7

$$D\tilde{S}(s) = \delta_{\tilde{\eta}(s)} S[\tilde{q}(s)](t_1, t_2) = (\partial_2 L \circ \Gamma[\tilde{q}(s)])\,\tilde{\eta}(s) \Big|_{t_1}^{t_2} - \int_{t_1}^{t_2} (E[L] \circ \Gamma[\tilde{q}(s)])\,\tilde{\eta}(s). \qquad (6.97)$$

Because q˜(s) is a realizable path, E[L]Γ[q˜(s)]=0. So

$$D\tilde{S}(s) = (\partial_2 L \circ \Gamma[\tilde{q}(s)])\,\tilde{\eta}(s) \Big|_{t_1}^{t_2} = \tilde{p}(s)(t_2)\,\tilde{\eta}(s)(t_2) - \tilde{p}(s)(t_1)\,\tilde{\eta}(s)(t_1), \qquad (6.98)$$

where p˜(s) is the momentum conjugate to q˜(s). The integral of DS˜ is

$$S[\tilde{q}(s_2)](t_1, t_2) - S[\tilde{q}(s_1)](t_1, t_2) = \int_{s_1}^{s_2} D\tilde{S} = \int_{s_1}^{s_2} (h(t_2) - h(t_1)), \qquad (6.99)$$

where

$$h(t)(s) = \tilde{p}(s)(t)\,\tilde{\eta}(s)(t) = \tilde{p}(s)(t)\, D\tilde{q}(s)(t). \qquad (6.100)$$

In conventional notation the latter line integral is written

$$\int_{\gamma_2} \sum_i p_i\, dq^i - \int_{\gamma_1} \sum_i p_i\, dq^i, \qquad (6.101)$$

where γ1(s)=q˜(s)(t1) and γ2(s)=q˜(s)(t2).

For a loop family of paths (such that q˜(s2)=q˜(s1)), the difference of actions at the endpoints vanishes, so we deduce

$$\int_{\gamma_2} \sum_i p_i\, dq^i = \int_{\gamma_1} \sum_i p_i\, dq^i, \qquad (6.102)$$

which is the line-integral version of the integral invariants.

In terms of area integrals, using Stokes's theorem, this is

$$\sum_i \iint_{R_2^i} dp_i\, dq^i = \sum_i \iint_{R_1^i} dp_i\, dq^i, \qquad (6.103)$$

where $R_j^i$ are the regions in the ith canonical plane. We have found that time evolution preserves the integral invariants, and thus time evolution generates a symplectic transformation.

6.3   Lie Transforms

The evolution of a system under any Hamiltonian generates a continuous family of canonical transformations. To study the behavior of some system governed by a Hamiltonian H, it is sometimes appropriate to use a canonical transformation generated by evolution governed by another Hamiltonian-like function W on the same phase space. Such a canonical transformation is called a Lie transform.

The functions H and W are both Hamiltonian-shaped functions defined on the same phase space. Time evolution for an interval Δ governed by H is a canonical transformation C_{Δ,H}. Evolution by W for an interval ϵ is a canonical transformation C′_{ϵ,W}:

$$(t, q', p') = C'_{\epsilon,W}(t, q, p). \qquad (6.104)$$

The independent variable in the H evolution is time, and the independent variable in the W evolution is an arbitrary parameter of the canonical transformation. We chose C′ for the W evolution so that the canonical transformation induced by W does not change the time in the system governed by H.

Figure 6.2 Time evolution of a trajectory started at the point (t0, q0, p0), governed by the Hamiltonian H, is transformed by the Lie transform governed by the generator W. The time evolution of the transformed trajectory is governed by the Hamiltonian H′.

Figure 6.2 shows how a Lie transform is used to transform a trajectory. We can see from the diagram that the canonical transformations obey the relation

$$C'_{\epsilon,W} \circ C_{\Delta,H} = C_{\Delta,H'} \circ C'_{\epsilon,W}. \qquad (6.105)$$

For generators W that do not depend on the independent variable, the resulting canonical transformation C′_{ϵ,W} is time independent and symplectic. A time-independent symplectic transformation is canonical if the Hamiltonian transforms by composition:8

$$H' = H \circ C'_{\epsilon,W}. \qquad (6.106)$$

We will use only Lie transforms that have generators that do not depend on the independent variable.

Lie transforms of functions

The value of a phase-space function F changes if its arguments change. We define the function Eϵ,W of a function F of phase-space coordinates (t, q, p) by

$$E_{\epsilon,W} F = F \circ C'_{\epsilon,W}. \qquad (6.107)$$

We say that Eϵ,WF is the Lie transform of the function F.

In particular, the Lie transform advances the coordinate and momentum selector functions Q = I1 and P = I2:

$$(E_{\epsilon,W} Q)(t, q, p) = (Q \circ C'_{\epsilon,W})(t, q, p) = Q(t, q', p') = q',$$
$$(E_{\epsilon,W} P)(t, q, p) = (P \circ C'_{\epsilon,W})(t, q, p) = P(t, q', p') = p'. \qquad (6.108)$$

So we may restate equation (6.107) as

$$(E_{\epsilon,W} F)(t, q, p) = F(t, (E_{\epsilon,W} Q)(t, q, p), (E_{\epsilon,W} P)(t, q, p)). \qquad (6.109)$$

More generally, Lie transforms descend into compositions:

$$E_{\epsilon,W}(F \circ G) = F \circ (E_{\epsilon,W} G). \qquad (6.110)$$

A corollary of the fact that Lie transforms descend into compositions is:

$$E_{\epsilon_1,W_1} E_{\epsilon_2,W_2} I = E_{\epsilon_1,W_1}(E_{\epsilon_2,W_2} I) = (E_{\epsilon_2,W_2} I) \circ (E_{\epsilon_1,W_1} I), \qquad (6.111)$$

where I is the phase-space identity function: I(t, q, p) = (t, q, p). So the order of application of the operators is reversed from the order of composition of the functions that result from applying the operators.

In terms of Eϵ,W we have the canonical transformation

$$q' = (E_{\epsilon,W} Q)(t, q, p), \qquad p' = (E_{\epsilon,W} P)(t, q, p), \qquad H' = E_{\epsilon,W} H. \qquad (6.112)$$

We can also say

$$(t, q', p') = (E_{\epsilon,W} I)(t, q, p). \qquad (6.113)$$

Note that Eϵ,W has the property:9

$$E_{\epsilon_1 + \epsilon_2, W} = E_{\epsilon_1,W} E_{\epsilon_2,W} = E_{\epsilon_2,W} E_{\epsilon_1,W}. \qquad (6.114)$$

The identity I is

$$I = E_{0,W}. \qquad (6.115)$$

We can define the inverse function

$$(E_{\epsilon,W})^{-1} = E_{-\epsilon,W} \qquad (6.116)$$

with the property

$$I = E_{\epsilon,W}(E_{\epsilon,W})^{-1} = (E_{\epsilon,W})^{-1} E_{\epsilon,W}. \qquad (6.117)$$

Simple Lie transforms

For example, suppose we are studying a system for which a rotation would be a helpful transformation. To concoct such a transformation we note that we intend a configuration coordinate to increase uniformly with a given rate. In this case we want an angle to be incremented. The Hamiltonian that consists solely of the momentum conjugate to that configuration coordinate always does the job. So the angular momentum is an appropriate generator for rotations.

The analysis is simple if we use polar coordinates r, θ with conjugate momenta pr, pθ. The generator W is just:

$$W(\tau; r, \theta; p_r, p_\theta) = p_\theta. \qquad (6.118)$$

The family of transformations satisfies Hamilton's equations:

$$Dr = 0, \qquad D\theta = 1, \qquad Dp_r = 0, \qquad Dp_\theta = 0. \qquad (6.119)$$

The only variable that appears in W is pθ, so θ is the only variable that varies as ϵ is varied. In fact, the family of canonical transformations is

$$r' = r, \qquad \theta' = \theta + \epsilon, \qquad p'_r = p_r, \qquad p'_\theta = p_\theta. \qquad (6.120)$$

So angular momentum is the generator of a canonical rotation.
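We can confirm the evolution equations (6.119) with a short computation. This is a sketch assuming the procedure Hamiltonian->state-derivative from chapter 3; the name W-rotation is our own:

;; The generator (6.118): the momentum conjugate to theta.
(define (W-rotation state)
  (ref (momentum state) 1))

(print-expression
 ((Hamiltonian->state-derivative W-rotation)
  (up 'tau (up 'r 'theta) (down 'p_r 'p_theta))))

which should print (up 1 (up 0 1) (down 0 0)): only θ varies as the parameter is varied.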

The example is simple, but it illustrates one important feature of Lie transformations—they give one set of variables entirely in terms of the other set of variables. This should be contrasted with the mixed-variable generating function transformations, which always give a mixture of old and new variables in terms of a mixture of new and old variables, and thus require an inversion to get one set of variables in terms of the other set of variables. This inverse can be written in closed form only for special cases. In general, there is considerable advantage in using a transformation rule that generates explicit transformations from the start. The Lie transformations are always explicit in the sense that they give one set of variables in terms of the other, but for there to be explicit expressions the evolution governed by the generator must be solvable.

Let's consider another example. This time consider a three-degree-of-freedom problem in rectangular coordinates, and take the generator of the transformation to be the z component of the angular momentum:

$$W(\tau; x, y, z; p_x, p_y, p_z) = xp_y - yp_x. \qquad (6.121)$$

The evolution equations are

$$Dx = -y, \qquad Dy = x, \qquad Dz = 0, \qquad Dp_x = -p_y, \qquad Dp_y = p_x, \qquad Dp_z = 0. \qquad (6.122)$$

We notice that z and pz are unchanged, and that the equations governing the evolution of x and y decouple from those of px and py. Each of these pairs of equations represents simple harmonic motion, as can be seen by writing them as second-order systems. The solutions are

$$x' = x\cos\epsilon - y\sin\epsilon, \qquad y' = x\sin\epsilon + y\cos\epsilon, \qquad z' = z, \qquad (6.123)$$

$$p'_x = p_x\cos\epsilon - p_y\sin\epsilon, \qquad p'_y = p_x\sin\epsilon + p_y\cos\epsilon, \qquad p'_z = p_z. \qquad (6.124)$$

So we see that again a component of the angular momentum generates a canonical rotation. There was nothing special about our choice of axes, so we can deduce that the component of angular momentum about any axis generates rotations about that axis.

Example

Suppose we have a system governed by the Hamiltonian

$$H(t; x, y; p_x, p_y) = \tfrac{1}{2}(p_x^2 + p_y^2) + \tfrac{1}{2}a(x - y)^2 + \tfrac{1}{2}b(x + y)^2. \qquad (6.125)$$

Hamilton's equations couple the motion of x and y:

$$Dx = p_x, \qquad Dy = p_y, \qquad Dp_x = -a(x - y) - b(x + y), \qquad Dp_y = a(x - y) - b(x + y). \qquad (6.126)$$

We can decouple the system by performing a coordinate rotation by π/4. This is generated by

$$W(\tau; x, y; p_x, p_y) = xp_y - yp_x, \qquad (6.127)$$

which is similar to the generator for the coordinate rotation above but without the z degree of freedom. Evolving (τ; x, y; px, py) by W for an interval of π/4 gives a canonical rotation:

$$x' = x\cos(\pi/4) - y\sin(\pi/4), \qquad y' = x\sin(\pi/4) + y\cos(\pi/4),$$
$$p'_x = p_x\cos(\pi/4) - p_y\sin(\pi/4), \qquad p'_y = p_x\sin(\pi/4) + p_y\cos(\pi/4). \qquad (6.128)$$

Composing the Hamiltonian H with this time-independent transformation gives the new Hamiltonian

$$H'(t; x', y'; p'_x, p'_y) = \left(\tfrac{1}{2}(p'_x)^2 + b(x')^2\right) + \left(\tfrac{1}{2}(p'_y)^2 + a(y')^2\right), \qquad (6.129)$$

which is a Hamiltonian for two uncoupled harmonic oscillators. So the original coupled problem has been transformed by a Lie transform to a new form for which the solution is easy.

6.4   Lie Series

A convenient way to compute a Lie transform is to approximate it with a series. We develop this technique by extending the idea of a Taylor series.

Taylor's theorem gives us a way of approximating the value of a nice enough function at a point near to a point where the value is known. If we know f and all of its derivatives at t then we can get the value of f(t + ϵ), for small enough ϵ, as follows:

$$f(t + \epsilon) = f(t) + \epsilon Df(t) + \tfrac{1}{2}\epsilon^2 D^2 f(t) + \cdots + \frac{1}{n!}\epsilon^n D^n f(t) + \cdots. \qquad (6.130)$$

We recall that the power series for the exponential function is

$$e^x = 1 + x + \tfrac{1}{2}x^2 + \cdots + \frac{1}{n!}x^n + \cdots. \qquad (6.131)$$

This suggests that we can formally construct a Taylor-series operator as the exponential of a differential operator10

$$e^{\epsilon D} = I + \epsilon D + \tfrac{1}{2}(\epsilon D)^2 + \cdots + \frac{1}{n!}(\epsilon D)^n + \cdots \qquad (6.132)$$

and write

$$f(t + \epsilon) = (e^{\epsilon D} f)(t). \qquad (6.133)$$

We have to be a bit careful here: $(\epsilon D)^2 = \epsilon D \epsilon D$. We can turn it into $\epsilon^2 D^2$ only because ϵ is a scalar constant, which commutes with every differential operator. But with this caveat in mind we can define the differential operator

$$(e^{\epsilon D} f)(t) = f(t) + \epsilon Df(t) + \tfrac{1}{2}\epsilon^2 D^2 f(t) + \cdots + \frac{1}{n!}\epsilon^n D^n f(t) + \cdots \qquad (6.134)$$

Before going on, it is interesting to compute with these a bit. In the code transcripts that follow we develop the series by exponentiation. We can examine the series incrementally by looking at successive elements of the (infinite) sequence of terms of the series. The procedure series:for-each is an incremental traverser that applies its first argument to successive elements of the series given as its second argument. The third argument (when given) specifies the number of terms to be traversed. In each of the following transcripts we print simplified expressions for the successive terms.

The first thing to look at is the general Taylor expansion for an unknown literal function, expanded around t, with increment ϵ. Understanding what we see in this simple problem will help us understand more complex problems later.

(series:for-each print-expression
                 (((exp (* 'epsilon D))
                   (literal-function 'f))
                  't)
                 6)
(f t)
(* ((D f) t) epsilon)
(* 1/2 (((expt D 2) f) t) (expt epsilon 2))
(* 1/6 (((expt D 3) f) t) (expt epsilon 3))
(* 1/24 (((expt D 4) f) t) (expt epsilon 4))
(* 1/120 (((expt D 5) f) t) (expt epsilon 5))
...

We can also look at the expansions of particular functions that we recognize, such as the expansion of sin around 0.

(series:for-each print-expression
                 (((exp (* 'epsilon D)) sin) 0)
                 6)
0
epsilon
0
(* -1/6 (expt epsilon 3))
0
(* 1/120 (expt epsilon 5))
...

It is often instructive to expand functions we usually don't remember, such as $f(x) = \sqrt{1 + x}$.

(series:for-each print-expression
                 (((exp (* 'epsilon D))
                   (lambda (x) (sqrt (+ x 1))))
                  0)
                 6)
1
(* 1/2 epsilon)
(* -1/8 (expt epsilon 2))
(* 1/16 (expt epsilon 3))
(* -5/128 (expt epsilon 4))
(* 7/256 (expt epsilon 5))
...

Exercise 6.7: Binomial series

Develop the binomial expansion of (1 + x)n as a Taylor expansion. Of course, it must be the case that for n a positive integer all of the coefficients except for the first n + 1 are zero. However, in the general case, for symbolic n, the coefficients are rather complicated polynomials in n. For example, you will find that the eighth term is

(+ (* 1/5040 (expt n 7))
   (* -1/240 (expt n 6))
   (* 5/144 (expt n 5))
   (* -7/48 (expt n 4))
   (* 29/90 (expt n 3))
   (* -7/20 (expt n 2))
   (* 1/7 n))

These terms must evaluate to the entries in Pascal's triangle. In particular, this polynomial must be zero for n < 7. How is this arranged?

Dynamics

Now, to play this game with dynamical functions we want to provide a derivative-like operator that we can exponentiate, which will give us the time-advance operator. The key idea is to write the derivative of the function in terms of the Poisson bracket. Equation (3.80) shows how to do this in general:

D(Fσ)=({F,H}+0F)σ.(6.135)

We define the operator DH by

DHF=0F+{F,H},(6.136)

so

DHFσ=D(Fσ),(6.137)

and iterates of this operator can be used to compute higher-order derivatives:

Dn(Fσ)=DHnFσ.(6.138)

We can express the advance of the path function f = Fσ for an interval ϵ with respect to H as a power series in the derivative operator DH applied to the phase-space function F and then composed with the path:

f(t+ϵ)=(eϵDf)(t)=(eϵDHF)σ(t).(6.139)

Indeed, we can implement the time-advance operator Eϵ,H with this series, when it converges.
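A minimal sketch of the operator D_H of equation (6.136) in our Scheme system (the name D-H is our own choice):

(define ((D-H H) F)
  ;; D_H F = the time partial of F plus the Poisson bracket of F and H.
  (+ ((partial 0) F) (Poisson-bracket F H)))

Iterating it, as ((D-H H) ((D-H H) F)), gives D_H²F and so on: the successive coefficients of the series (6.139).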

Exercise 6.8: Iterated derivatives

Show that equation (6.138) is correct.

Exercise 6.9: Lagrangian analog

Compare DH with the total time derivative operator. Recall that

$$D_t F \circ \Gamma[q] = D(F \circ \Gamma[q])$$

abstracts the derivative of a function of a path through state space to a function of the derivatives of the path. Define another derivative operator DL, analogous to DH, that would give the time derivative of functions along Lagrangian state paths that are solutions of Lagrange's equations for a given Lagrangian. How might this be useful?

For time-independent Hamiltonian H and time-independent state function F, we can simplify the computation of the advance of F. In this case we define the Lie derivative operator LH such that

$$L_H F = \{F, H\}, \qquad (6.140)$$

which reads “the Lie derivative of F with respect to H.”11 So

$$D_H = \partial_0 + L_H \qquad (6.141)$$

and for time-independent F

$$D(F \circ \sigma) = L_H F \circ \sigma. \qquad (6.142)$$

We can iterate this process to compute higher derivatives. So

$$L_H^2 F = \{\{F, H\}, H\}, \qquad (6.143)$$

and successively higher-order Poisson brackets of F with H give successively higher-order derivatives when evaluated on the trajectory.

Let f = F∘σ. We have

$$Df = (L_H F) \circ \sigma \qquad (6.144)$$

$$D^2 f = (L_H^2 F) \circ \sigma. \qquad (6.145)$$

Thus we can rewrite the advance of the path function f for an interval ϵ with respect to H as a power series in the Lie derivative operator applied to the phase-space function F and then composed with the path:

$$f(t + \epsilon) = (e^{\epsilon D} f)(t) = ((e^{\epsilon L_H} F) \circ \sigma)(t). \qquad (6.146)$$

We can implement the time-advance operator Eϵ,H with the Lie series $e^{\epsilon L_H} F$ when this series converges:

$$E_{\epsilon,H} F = e^{\epsilon L_H} F. \qquad (6.147)$$

We have shown that time evolution is canonical, so the series above are formal representations of canonical transformations as power series in the time. These series may not converge, even if the evolution governed by the Hamiltonian H is well defined.

Computing Lie series

We can use the Lie transform as a computational tool to examine the local evolution of dynamical systems. We define the Lie derivative of F as a derivative-like operator relative to the given Hamiltonian function, H:12

(define ((Lie-derivative H) F)
  (Poisson-bracket F H))

We also define a procedure to implement the Lie transform:13

(define (Lie-transform H t)
  (exp (* t (Lie-derivative H))))

Let's start by examining the beginning of the Lie series for the position of a simple harmonic oscillator of mass m and spring constant k. We can implement the Hamiltonian as

(define ((H-harmonic m k) state)
  (+ (/ (square (momentum state)) (* 2 m))
     (* 1/2 k (square (coordinate state)))))

We make the Lie transform (series) by passing the Lie-transform operator an appropriate Hamiltonian function and an interval to evolve for. The resulting operator is then given the coordinate procedure, which selects the position coordinates from the phase-space state. The Lie transform operator returns a procedure that, when given a phase-space state composed of a dummy time, a position x0, and a momentum p0, returns the position resulting from advancing that state by the interval dt.

(series:for-each print-expression
                 (((Lie-transform (H-harmonic 'm 'k) 'dt)
                   coordinate)
                  (up 0 'x0 'p0))
                 6)
x0
(/ (* dt p0) m)
(/ (* -1/2 (expt dt 2) k x0) m)
(/ (* -1/6 (expt dt 3) k p0) (expt m 2))
(/ (* 1/24 (expt dt 4) (expt k 2) x0) (expt m 2))
(/ (* 1/120 (expt dt 5) (expt k 2) p0) (expt m 3))
...

We should recognize the terms of this series. We start with the initial position x₀. The first-order correction (p₀/m) dt is due to the initial velocity. Next we find an acceleration term (−kx₀/(2m)) dt² due to the restoring force of the spring at the initial position.

The Lie transform is just as appropriate for showing us how the momentum evolves over the interval:

(series:for-each print-expression
                 (((Lie-transform (H-harmonic 'm 'k) 'dt)
                   momentum)
                  (up 0 'x0 'p0))
                 6)
p0
(* -1 dt k x0)
(/ (* -1/2 (expt dt 2) k p0) m)
(/ (* 1/6 (expt dt 3) (expt k 2) x0) m)
(/ (* 1/24 (expt dt 4) (expt k 2) p0) (expt m 2))
(/ (* -1/120 (expt dt 5) (expt k 3) x0) (expt m 2))
...

In this series we see how the initial momentum p0 is corrected by the effect of the restoring force −kx0dt, etc.

What is a bit more fun is to see how a more complex phase-space function is treated by the Lie series expansion. In the experiment below we examine the Lie series developed by advancing the harmonic-oscillator Hamiltonian, by means of the transform generated by the same harmonic-oscillator Hamiltonian:

(series:for-each print-expression
                 (((Lie-transform (H-harmonic 'm 'k) 'dt)
                   (H-harmonic 'm 'k))
                  (up 0 'x0 'p0))
                 6)
(/ (+ (* 1/2 k m (expt x0 2)) (* 1/2 (expt p0 2))) m)
0
0
0
0
0
...

As we would hope, the series shows us the original energy expression $\frac{k}{2}x_0^2 + \frac{1}{2m}p_0^2$ as the first term. Each subsequent correction term turns out to be zero—because the energy is conserved.

Of course, the Lie series can be used in situations where we want to see the expansion of the motion of a system characterized by a more complex Hamiltonian. The planar motion of a particle in a general central field (see equation 3.100) is a simple problem for which the Lie series is instructive. In the following transcript we can see how rapidly the series becomes complicated. It is worth one's while to try to interpret the additive parts of the third (acceleration) term shown below:

(series:for-each print-expression
                 (((Lie-transform
                     (H-central-polar 'm (literal-function 'U))
                     'dt)
                   coordinate)
                  (up 0
                      (up 'r_0 'phi_0)
                      (down 'p_r_0 'p_phi_0)))
                 4)
(up r_0 phi_0)
(up (/ (* dt p_r_0) m)
    (/ (* dt p_phi_0) (* m (expt r_0 2))))
(up
  (+ (/ (* -1/2 ((D U) r_0) (expt dt 2)) m)
     (/ (* 1/2 (expt dt 2) (expt p_phi_0 2))
        (* (expt m 2) (expt r_0 3))))
  (/ (* -1 (expt dt 2) p_phi_0 p_r_0)
     (* (expt m 2) (expt r_0 3))))
(up
  (+ (/ (* -1/6 (((expt D 2) U) r_0) (expt dt 3) p_r_0)
        (expt m 2))
     (/ (* -1/2 (expt dt 3) (expt p_phi_0 2) p_r_0)
        (* (expt m 3) (expt r_0 4))))
  (+ (/ (* 1/3 ((D U) r_0) (expt dt 3) p_phi_0)
        (* (expt m 2) (expt r_0 3)))
     (/ (* -1/3 (expt dt 3) (expt p_phi_0 3))
        (* (expt m 3) (expt r_0 6)))
     (/ (* (expt dt 3) p_phi_0 (expt p_r_0 2))
        (* (expt m 3) (expt r_0 4)))))
...

Of course, if we know the closed-form Lie transform it is probably a good idea to take advantage of it, but when we do not know the closed form the Lie series representation of it can come in handy.
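The Lie series also lets us revisit the claim of section 6.3 that a component of the angular momentum generates rotations. A sketch with two degrees of freedom (the generator procedure J-z is our own naming):

(define (J-z state)
  ;; The z component of the angular momentum, x p_y - y p_x.
  (let ((q (coordinate state))
        (p (momentum state)))
    (- (* (ref q 0) (ref p 1))
       (* (ref q 1) (ref p 0)))))

(series:for-each print-expression
                 (((Lie-transform J-z 'epsilon)
                   coordinate)
                  (up 0 (up 'x 'y) (down 'p_x 'p_y)))
                 4)

The terms that emerge should be (up x y), then the first-order corrections −ϵy and ϵx, and so on; the two component series sum to x cos ϵ − y sin ϵ and x sin ϵ + y cos ϵ, in agreement with equation (6.123).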

6.5   Exponential Identities

The composition of Lie transforms can be written as products of exponentials of Lie derivative operators. In general, Lie derivative operators do not commute. If A and B are non-commuting operators, then the exponents do not combine in the usual way:

$$e^A e^B \ne e^{A+B}. \qquad (6.148)$$

So it will be helpful to recall some results about exponentials of non-commuting operators.

We introduce the commutator

$$[A, B] = AB - BA. \qquad (6.149)$$

The commutator is bilinear and satisfies the Jacobi identity

$$[A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0, \qquad (6.150)$$

which is true for all A, B, and C.

We introduce a notation $\Delta_A$ for the commutator with respect to the operator A:

$$\Delta_A B = [A, B]. \qquad (6.151)$$

In terms of Δ the Jacobi identity is

$$[\Delta_A, \Delta_B] = \Delta_{[A,B]}. \qquad (6.152)$$

An important identity is

$$e^C A e^{-C} = e^{\Delta_C} A = A + [C, A] + \tfrac{1}{2}[C, [C, A]] + \cdots. \qquad (6.153)$$

We can check this term by term.
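For instance, expanding both exponentials through second order in C gives

$$\left(I + C + \tfrac{1}{2}C^2\right) A \left(I - C + \tfrac{1}{2}C^2\right) = A + (CA - AC) + \tfrac{1}{2}\left(C^2 A - 2CAC + AC^2\right) + \cdots,$$

and the second-order term is exactly $\tfrac{1}{2}[C, [C, A]] = \tfrac{1}{2}(C^2 A - 2CAC + AC^2)$, in agreement with the first three terms of the right-hand side of (6.153).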

We see that

$$e^C A^2 e^{-C} = e^C A e^{-C} e^C A e^{-C} = (e^C A e^{-C})^2, \qquad (6.154)$$

using $e^{-C} e^C = I$, the identity operator. Using the same trick, we find

$$e^C A^n e^{-C} = (e^C A e^{-C})^n. \qquad (6.155)$$

More generally, if f can be represented as a power series then

$$e^C f(A, B, \ldots) e^{-C} = f(e^C A e^{-C}, e^C B e^{-C}, \ldots). \qquad (6.156)$$

For instance, applying this to the exponential function yields

$$e^C e^A e^{-C} = e^{e^C A e^{-C}}. \qquad (6.157)$$

Using equation (6.153), we can rewrite this as

$$e^{\Delta_C} e^A = e^{e^{\Delta_C} A}. \qquad (6.158)$$

Exercise 6.10: Commutators of Lie derivatives

a. Let W and W′ be two phase-space state functions. Use the Poisson-bracket Jacobi identity (3.93) to show

$$[L_W, L_{W'}] = -L_{\{W, W'\}}. \qquad (6.159)$$

b. Consider the phase-space state functions that give the components of the angular momentum in terms of rectangular canonical coordinates

$$\begin{aligned} J_x(t; x, y, z; p_x, p_y, p_z) &= yp_z - zp_y \\ J_y(t; x, y, z; p_x, p_y, p_z) &= zp_x - xp_z \\ J_z(t; x, y, z; p_x, p_y, p_z) &= xp_y - yp_x. \end{aligned}$$

Show

$$[L_{J_x}, L_{J_y}] + L_{J_z} = 0. \qquad (6.160)$$

c. Relate the Jacobi identity for operators to the Poisson-bracket Jacobi identity.

Exercise 6.11: Baker–Campbell–Hausdorff

Derive the rule for combining exponentials of non-commuting operators:

$$e^A e^B = e^{A + B + \frac{1}{2}[A,B] + \cdots}. \qquad (6.161)$$

6.6   Summary

The time evolution of any Hamiltonian system induces a canonical transformation: if we consider all possible initial states of a Hamiltonian system and follow all of the trajectories for the same time interval, then the map from the initial state to the final state of each trajectory is a canonical transformation. This is true for any interval we choose, so time evolution generates a continuous family of canonical transformations.

We generalized this idea to generate continuous canonical transformations other than those generated by time evolution. Such transformations will be especially useful in support of perturbation theory.

In rare cases a canonical transformation can be made to a representation in which the problem is easily solvable: when all coordinates are cyclic and all the momenta are conserved. Here we investigated the Hamilton–Jacobi method for finding such canonical transformations. For problems for which the Hamilton–Jacobi method works, we find that the time evolution of the system is given as a canonical transformation.

6.7   Projects

Exercise 6.12: Symplectic integration

Consider a system for which the Hamiltonian H can be split into two parts, H0 and H1, each of which describes a system that can be efficiently evolved:

$$H = H_0 + H_1. \qquad (6.162)$$

Symplectic integrators construct approximate solutions for the Hamiltonian H from those of H0 and H1.

We construct a map of the phase space onto itself in the following way (see [47, 48, 49]). Define $\delta_{2\pi}(t)$ to be an infinite sum of Dirac delta functions, with interval 2π,

δ2π(t)=n=δ(t2πn),(6.163)

with representation as a Fourier series

$$2\pi\,\delta_{2\pi}(t) = \sum_{n=-\infty}^{\infty} \cos(nt). \qquad (6.164)$$

Recall that a δ function has the property that $\int_{-a}^{a} f\,\delta = f(0)$ for any positive $a$ and continuous real-valued function $f$. It is fruitful to think of the delta function as a limit of a function $\Delta_h$ that has the value $\Delta_h(t) = 1/h$ in the interval $-h/2 < t < h/2$ and zero otherwise. Now consider the mapping Hamiltonian

$$H_m(t, q, p) = H_0(t, q, p) + 2\pi\,\delta_{2\pi}(\Omega t)\, H_1(t, q, p). \qquad (6.165)$$

The evolution of the system between the delta functions is governed solely by H0. To understand how the system evolves across the delta functions, think of the delta functions in terms of Δh as h goes to zero. Hamilton's equations contain terms from H1 with the factor 1/h, which is large, and terms from H0 that are independent of h. So as h goes to zero, H0 makes a negligible contribution to the evolution across the delta function, and that evolution is governed solely by H1. The evolution of Hm is obtained by alternately evolving the system according to the Hamiltonian H0 for an interval Δt = 2π/Ω and then evolving the system according to the Hamiltonian H1 for the same time interval. The longer-term evolution of Hm is obtained by iterating this map of the phase space onto itself. Fill in the details to show this.
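
One way to start filling in the details: while a spike dominates, Hamilton's equations reduce to

$$Dq(t) = 2\pi\,\delta_{2\pi}(\Omega t)\,\partial_2 H_1, \qquad Dp(t) = -2\pi\,\delta_{2\pi}(\Omega t)\,\partial_1 H_1,$$

and the integral of $2\pi\,\delta_{2\pi}(\Omega t)$ across one spike is $2\pi/\Omega = \Delta t$. In the $\Delta_h$ picture the spike runs the $H_1$ flow at the rate $\Delta t/h$ for a duration $h$, which in the limit is exactly the $H_1$ evolution for the interval $\Delta t$.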

a. In terms of Lie series, the evolution of Hm for one delta-function cycle Δt is generated by

$$e^{\Delta t\, L_{H_m}} I = (e^{\Delta t\, L_{H_1}} I) \circ (e^{\Delta t\, L_{H_0}} I). \qquad (6.166)$$

The evolution of Hm approximates the evolution of H. Identify the non-commuting operator A with $L_{H_0}$ and B with $L_{H_1}$.

Use the Baker–Campbell–Hausdorff identity (equation 6.161) to deduce that the local truncation error (the error in the state after one step Δt) is proportional to (Δt)2. This mapping is a first-order integrator.
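
The shape of the calculation: $L_H$ is linear in $H$, so $L_{H_0} + L_{H_1} = L_H$, and the Baker–Campbell–Hausdorff rule gives

$$e^{\Delta t\, L_{H_1}}\, e^{\Delta t\, L_{H_0}} = e^{\Delta t\, L_H + \frac{(\Delta t)^2}{2}[L_{H_1},\, L_{H_0}] + \cdots},$$

so one step of the map differs from the true evolution operator $e^{\Delta t\, L_H}$ at order $(\Delta t)^2$ in the exponent.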

b. By merely changing the phase of the delta functions, we can reduce the truncation error of the map, and the map becomes a second-order integrator. Instead of making a map by alternating a full step Δt governed by H0 with a full step Δt governed by H1, we can make a map by evolving the system for a half step Δt/2 governed by H0, then for a full step Δt governed by H1, and then for another half step Δt/2 governed by H0. In terms of Lie series the second-order map is generated by

$$e^{\Delta t\, L_{H_m}} I = (e^{(\Delta t/2)\, L_{H_0}} I) \circ (e^{\Delta t\, L_{H_1}} I) \circ (e^{(\Delta t/2)\, L_{H_0}} I). \qquad (6.167)$$

Confirm that the Hamiltonian governing the evolution of this map is the same as the one above but with the phase of the delta functions shifted. Show that the truncation error of one step of this second-order map is indeed proportional to (Δt)3.

c. Consider the Hénon–Heiles system. We can split the Hamiltonian (equation 3.135 on page 252) into two solvable Hamiltonians in the following way:

$$H_0(t; x, y; p_x, p_y) = (p_x^2 + p_y^2)/2 + (x^2 + y^2)/2$$
$$H_1(t; x, y; p_x, p_y) = x^2 y - y^3/3. \qquad (6.168)$$

Hamiltonian H0 is the Hamiltonian of two uncoupled linear oscillators; Hamiltonian H1 is a nonlinear coupling. The trajectories of the systems described by each of these Hamiltonians can be expressed in closed form, so we do not need the Lie series for actually integrating each part. The Lie series expansions are used only to determine the order of the integrator.

Write programs that implement first-order and second-order maps for the Hénon–Heiles problem. Note that these maps cannot be of the same form as the Poincaré maps that we used to make surfaces of section, because these maps must take and return the entire state. (Why?) An appropriate template for such a map is (1st-order-map state dt). This procedure must return a state.
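
Here is one possible shape for these programs, a sketch rather than a definitive solution: the H0 flow is an exact rotation in each (coordinate, momentum) pair, and the H1 flow leaves the coordinates fixed while kicking the momenta. The helper names H0-evolve and H1-evolve are ours.

;; Exact evolution for time dt under H0: two uncoupled unit-frequency
;; oscillators, a rotation in each (q, p) plane.
(define ((H0-evolve dt) state)
  (let ((t (time state))
        (q (coordinate state))
        (p (momentum state))
        (c (cos dt))
        (s (sin dt)))
    (up (+ t dt)
        (up (+ (* c (ref q 0)) (* s (ref p 0)))
            (+ (* c (ref q 1)) (* s (ref p 1))))
        (down (- (* c (ref p 0)) (* s (ref q 0)))
              (- (* c (ref p 1)) (* s (ref q 1)))))))

;; Exact evolution for time dt under H1: the coordinates do not move,
;; so the momenta receive the kick -dt*DV(q), with V(x,y) = x^2 y - y^3/3.
(define ((H1-evolve dt) state)
  (let ((q (coordinate state))
        (p (momentum state)))
    (let ((x (ref q 0)) (y (ref q 1)))
      (up (time state)
          q
          (down (- (ref p 0) (* dt 2 x y))
                (- (ref p 1) (* dt (- (square x) (square y)))))))))

;; H0 step followed by H1 kick, as in equation (6.166).
(define (1st-order-map state dt)
  ((H1-evolve dt) ((H0-evolve dt) state)))

;; Half step of H0, full kick of H1, half step of H0: equation (6.167).
(define (2nd-order-map state dt)
  ((H0-evolve (/ dt 2))
   ((H1-evolve dt)
    ((H0-evolve (/ dt 2)) state))))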

d. Examine the evolution of the energy for both chaotic and quasiperiodic initial conditions. How does the magnitude of the energy error scale with the step size? Is this consistent with the order of the integrator deduced above? How does the energy error grow with time?
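
One possible scaffold for the energy diagnostic; the names HH-energy and energy-drift are ours, and step is a map procedure such as 2nd-order-map from part c.

;; The conserved energy of the full Hénon–Heiles system, H0 + H1.
(define (HH-energy state)
  (let ((q (coordinate state)) (p (momentum state)))
    (let ((x (ref q 0)) (y (ref q 1)))
      (+ (/ (+ (square (ref p 0)) (square (ref p 1))) 2)
         (/ (+ (square x) (square y)) 2)
         (* (square x) y)
         (* -1/3 (expt y 3))))))

;; Iterate a map n times with step size dt; return the energy error.
(define (energy-drift step state dt n)
  (let ((E0 (HH-energy state)))
    (let loop ((s state) (k 0))
      (if (= k n)
          (- (HH-energy s) E0)
          (loop (step s dt) (+ k 1))))))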

e. The generation of surfaces of section from these maps is complicated by the fact that these maps have to maintain their state even though a plotting point might be required between two samples. The maps you made in part c regularly sample the state with the integrator timestep. If we must plot a point between two steps we cannot restart the integrator at the state of the plotted point, because that would lose the phase of the integrator step. To make this work the map must plot points but keep its rhythm, so we have to work around the fact that explore-map restarts at each plotted point. Here is some code that can be used to construct a Poincaré-type map that can be used with the explorer:

(define ((HH-collector win advance E dt sec-eps n) x y done fail)
  (define (monitor last-crossing-state state) ; plot (y, p_y) at a crossing
    (plot-point win
                (ref (coordinate last-crossing-state) 1)
                (ref (momentum last-crossing-state) 1)))
  (define (pmap x y cont fail)          ; advance to the next section crossing
    (find-next-crossing y advance dt sec-eps cont))
  (define collector (default-collector monitor pmap n))
  (cond ((and (up? x) (up? y))          ; passed states: keep iterating
         (collector x y done fail))
        ((and (number? x) (number? y))  ; initialization from section coordinates
         (let ((initial-state (section->state E x y)))
           (if (not initial-state)
               (fail)
               (collector initial-state initial-state done fail))))
        (else (error "bad input to HH-collector" x y))))

You will notice that the iteration of the map and the plotting of the points are included in this collector, so the map that this produces must replace those parts of the explorer. The #f argument to explore-map allows us to replace the appropriate parts of the explorer with our combined map iterator and plotter HH-collector.

(explore-map win
             (HH-collector win 1st-order-map 0.125 0.1 1.e-10 1000)
             #f)

Generate surfaces of section using the second-order map. Does the map preserve the chaotic or quasiperiodic character of trajectories?

1Remember that $\partial_{1,0}$ means the derivative with respect to the first coordinate position.

2Our theorems about which transformations are canonical are still valid, because a shift of time does not affect the symplectic condition. See footnote 14 in Chapter 5.

3Partial derivatives of structured arguments do not generally commute, so this deduction is not as simple as it may appear.

4The transformation $S_\Delta$ is an identity on the qp components, so it is symplectic. Although it adjusts the time, it is not a time-dependent transformation in that the qp components do not depend upon the time. Thus, if we adjust the Hamiltonian by composition with $S_\Delta$ we have a canonical transformation.

5By Stokes's theorem we may compute the area of a region by a line integral around the boundary of the region. We define the positive sense of the area to be the area enclosed by a curve that is traversed in a counterclockwise direction, when drawn on a plane with the coordinate on the abscissa and the momentum on the ordinate.

6We can see this as follows. Let γ be any closed curve in the boundary. This curve divides the boundary into two regions. By Stokes's theorem the integral invariant over both of these pieces can be written as a line integral along this boundary, but they have opposite signs, because γ is traversed in opposite directions to keep the surface on the left. So we conclude that the integral invariant over the entire surface is zero.

7Let f be a path-dependent function, $\tilde{\eta}(s) = D\tilde{q}(s)$, and $g(s) = f[\tilde{q}(s)]$. The variation of f at $\tilde{q}(s)$ in the direction $\tilde{\eta}(s)$ is $\delta_{\tilde{\eta}(s)} f[\tilde{q}(s)] = Dg(s)$.

8In general, the generator W could depend on its independent variable. If so, it would be necessary to specify a rule that gives the initial value of the independent variable for the W evolution. This rule may or may not depend upon the time. If the specification of the independent variable for the W evolution does not depend on time, then the resulting canonical transformation $C_{\epsilon,W}$ is time independent and the Hamiltonians transform by composition. If the generator W depends on its independent variable and the rule for specifying its initial value depends on time, then the transformation $C_{\epsilon,W}$ is time dependent. In this case there may need to be an adjustment to the relation between the Hamiltonians H and H′. In the extended phase space all these complications disappear: There is only one case. We can assume all generators W are independent of the independent variable.

9The set of transformations $E_{\epsilon,W}$ with the operation composition and with parameter $\epsilon$ is a one-parameter Lie group.

10We are playing fast and loose with differential operators here. In a formal treatment it is essential to prove that these games are mathematically well defined and have appropriate convergence properties.

11Our $L_H$ is a special case of what is referred to as a Lie derivative in differential geometry. The more general idea is that a vector field defines a flow. The Lie derivative of an object with respect to a vector field gives the rate of change of the object as it is dragged along with the flow. In our case the flow is the evolution generated by Hamilton's equations, with Hamiltonian H.

12Actually, we define the Lie derivative slightly differently, as follows:

(define ((Lie-derivative-procedure H) F)
  (Poisson-bracket F H))
(define Lie-derivative
  (make-operator Lie-derivative-procedure 'Lie-derivative))

The reason is that we want Lie-derivative to be an operator, which is just like a function except that the product of operators is interpreted as composition, whereas the product of functions is the function computing the product of their values.

13The Lie-transform procedure here is also defined to be an operator, just like Lie-derivative.