Dirac-Feynman-Berezin amplitudes 4 - Action for fields

By Chris Austin. 29 November 2020.

An earlier version of this post was published on another website on 17 November 2012.

This is the fourth part of a ten-part post on the foundation of our understanding of high energy physics, which is Richard Feynman's functional integral. The first three parts are Action, Multiple Molecules, and Electromagnetism, and the following parts, which will appear at intervals of about a month, are Radiation in an Oven, Matrix Multiplication, The Functional Integral, Gauge Invariance, Photons, and Interactions.

I'm hoping this blog will be fun and useful for everyone with an interest in science, so although I'll pop up a few formulae again, I'll try as usual to keep them friendly by explaining all the pieces. Please feel free to ask a question in the Comments, if you think anything in the post is unclear.

The clue that led to the discovery of quantum mechanics, whose principles are summarized in Feynman's functional integral, came from the attempted application to electromagnetic radiation of discoveries about heat and temperature. We looked at those discoveries about heat and temperature in the second part of the post, and in the third part of the post, we looked at how James Clerk Maxwell, just after the middle of the nineteenth century, was able to identify light as waves of oscillating electric and magnetic fields, and to calculate the speed of light from measurements of electrical and magnetic effects. Feynman's functional integral for a physical system depends on a property of the system called its action, and in the first part of the post, we derived Sir Isaac Newton's second law of motion from Pierre-Louis de Maupertuis's principle of stationary action.

In the second part of the post, we studied the way heat and temperature arise from the random motions of large numbers of microscopic objects, and we found that the existence of a well-defined temperature, in a region that is in thermal equilibrium, is a consequence of the conservation of energy, which we derived in the first part of the post, here, for a collection of objects subject to Newton's laws of motion. The energy of a system is related to the formula for its action, and today we'll calculate the energy of electromagnetic fields and electrically charged particles moving in a vacuum. To do this, we'll use de Maupertuis's principle to derive, from an action, Maxwell's equations summarizing Coulomb's law and Ampère's law in terms of the voltage field $V$ and the vector potential field $A$, in the forms in which we derived them in the third part of the post, here and here. By de Maupertuis's principle from that same action, we'll also derive the electrostatic force on an electrically charged particle, as in the third part of the post, here, and the force on a moving charged particle from the magnetic induction field $B$, as in the third part of the post, here.

From the formula for the electric field strength $E$ in terms of the voltage field $V$ and the vector potential field $A$, in the third part of the post, here, the formula for the electrostatic force $F_{\mathrm{e.s.}}$ on a particle of electric charge $q$, here, becomes:

\begin{displaymath}\left( F_{\mathrm{e.s.}} \right)_a = - q \frac{\partial V}{\partial x_a} - 
q \frac{\partial A_a}{\partial t} . \end{displaymath}

And from the formula for the magnetic induction field $B$ in terms of the vector potential field $A$, in the third part of the post, here, the formula for the magnetic induction force $F_{\mathrm{m.i.}}$ on a particle of electric charge $q$ moving with velocity $v$, here, becomes:

\begin{displaymath}\left( F_{\mathrm{m.i.}} \right)_a = q \sum_{b, c} \epsilon_{a b c} v_b B_c = q \sum_{b, c, d, e} \epsilon_{a b c} \epsilon_{c d e} v_b 
\frac{\partial}{\partial x_d} A_e, \end{displaymath}

where as in the third part of the post, here, each index $a, b, c, d, e, \ldots$ from the start of the lower-case English alphabet can take values 1, 2, 3. From the formula in the third part of the post, here, and a calculation similar to the one in the third part of the post, here, this simplifies to:

\begin{displaymath}\left( F_{\mathrm{m.i.}} \right)_a = q \sum_b v_b \left( 
\frac{\partial}{\partial x_a} A_b - \frac{\partial}{\partial x_b} A_a 
\right) . \end{displaymath}
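
The simplification relies on the identity $\sum_c \epsilon_{a b c} \epsilon_{c d e} = \delta_{a d} \delta_{b e} - \delta_{a e} \delta_{b d}$. As a rough numerical check, the following sketch compares the two forms of the magnetic induction force for random inputs. The function names, and the array `dA` with `dA[d][e]` standing for $\frac{\partial A_e}{\partial x_d}$, are illustrative assumptions, and the indexes run over 0, 1, 2 rather than 1, 2, 3.

```python
import random


def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indexes taken from 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2


def force_double_epsilon(q, v, dA):
    """q * sum over b, c, d, e of eps_abc eps_cde v_b dA[d][e]."""
    idx = range(3)
    return [
        q * sum(
            levi_civita(a, b, c) * levi_civita(c, d, e) * v[b] * dA[d][e]
            for b in idx for c in idx for d in idx for e in idx
        )
        for a in idx
    ]


def force_simplified(q, v, dA):
    """q * sum over b of v_b (dA_b/dx_a - dA_a/dx_b)."""
    return [
        q * sum(v[b] * (dA[a][b] - dA[b][a]) for b in range(3))
        for a in range(3)
    ]


def max_difference(seed, q=2.0):
    """Largest componentwise gap between the two forms for random inputs."""
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(3)]
    dA = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(3)]
    full = force_double_epsilon(q, v, dA)
    short = force_simplified(q, v, dA)
    return max(abs(x - y) for x, y in zip(full, short))
```

Calling `max_difference(0)` should give a value at the level of floating-point rounding, if the two forms agree.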

The action of a collection of electrically charged point particles moving slowly compared to the speed of light in a vacuum, and the voltage field $V$ and the vector potential field $A$, is:

\begin{displaymath}S = \int_{t_1}^{t_2} T \mathrm{d} t + S_{\mathrm{e.m.}} + 
S_{\mathrm{{{int}}.}}, \end{displaymath}

where $T$ is the kinetic energy of the particles, as in the first part of the post, here. The electromagnetic action $S_{\mathrm{e.m.}}$ is:

\begin{displaymath}S_{\mathrm{e.m.}} = \int_{t_1}^{t_2} \int \frac{1}{2} \sum_a \left( \epsilon_0 E_a E_a - \frac{1}{\mu_0} B_a B_a \right) \mathrm{d} x_1 \mathrm{d} x_2 \mathrm{d} x_3 
\mathrm{d} t. \end{displaymath}

The integral $\int \ldots \mathrm{d} x_1 \mathrm{d} x_2 \mathrm{d} x_3$ is over all space, and is often abbreviated to $\int \ldots \mathrm{d}^3 x$. From the formulae for the electric field strength $E$ and the magnetic induction field $B$ in terms of the voltage field $V$ and the vector potential field $A$, as in the third part of the post, here and here, we find:

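\begin{displaymath}S_{\mathrm{e.m.}} = \int_{t_1}^{t_2} \int \frac{1}{2} \left( \epsilon_0 \sum_a \left( \frac{\partial V}{\partial x_a} + \frac{\partial A_a}{\partial t} \right)^2 - \frac{1}{\mu_0} \sum_{a, b, c, d, e} \epsilon_{a b c} \epsilon_{a d e} \frac{\partial A_c}{\partial x_b} \frac{\partial A_e}{\partial x_d} \right) \mathrm{d}^3 x \mathrm{d} t. \end{displaymath}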

From the formula in the third part of the post, here, and a calculation similar to the one in the third part of the post, here, we find:

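\begin{displaymath}S_{\mathrm{e.m.}} = \int_{t_1}^{t_2} \int \frac{1}{2} \left( \epsilon_0 \sum_a \left( \frac{\partial V}{\partial x_a} + \frac{\partial A_a}{\partial t} \right)^2 - \frac{1}{\mu_0} \sum_{b, c} \left( \frac{\partial A_c}{\partial x_b} \frac{\partial A_c}{\partial x_b} - \frac{\partial A_c}{\partial x_b} \frac{\partial A_b}{\partial x_c} 
\right) \right) \mathrm{d}^3 x \mathrm{d} t. \end{displaymath}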

The interaction action $S_{\mathrm{{{int}}.}}$ is:

\begin{displaymath}S_{\mathrm{{{int}}.}} = - \int_{t_1}^{t_2} \sum_I 
q_I \left( V \left( x_I, t \right) - \sum_a A_a \left( x_I, t \right) \frac{\mathrm{d} x_{a I}}{\mathrm{d} t} \right) \mathrm{d} t, \end{displaymath}

where the notation is the same as we used in the first part of the post, here, and $q$ now represents the list of the electric charges of all the particles, so that $q_I$ is the electric charge of the $I$'th particle.

It is convenient to let a single symbol, say $Y$, represent the entire collection of data that includes the position data $x$ of all the particles at all times, the value of the voltage field $V$ at all positions and times, and the values of the three components of the vector potential field $A$ at all positions and times. We'll let indexes $i, j, k, \ldots$ distinguish the quantities in the collection $Y$ at each time $t$. The possible values of each index $i, j, k, \ldots$ are $x_{a I}$, the $a$'th position coordinate of the $I$'th particle; $V_x$, the value of the voltage field at position $x$; and $A_{a x}$, the value of the $a$'th component of the vector potential field at position $x$. $Y_{i t}$ or $Y_i \left( t \right)$ means the value of the quantity $Y_i$ at time $t$, so $Y_{x_{a I}} \left( t \right) = x_{a I t} = 
x_{a I} \left( t \right)$; $Y_{V_x} \left( t \right) = V_{x t} = V_x \left( t 
\right)$; and $Y_{A_{a x}} \left( t \right) = A_{a x t} = A_{a x} \left( t 
\right)$. This sort of notation is called DeWitt's compact index notation, after Bryce DeWitt.

A sum such as $\sum_i$ means a sum over all possible values of the index $i$, and where the possible values are continuous, as in the position index of $V_x$ and $A_{a x}$, it means an integral over the possible values. The possible values of the index $i$ correspond to the dynamical quantities that can vary independently, which are called "degrees of freedom". Thus $\sum_i$ means a sum over the degrees of freedom.

To extend the definition of $\frac{\partial X}{\partial y_q}$, as in the first part of the post, here, to a quantity $X$ that depends on a collection of data such as $Y$, where the range of possible values of an index such as $i$ that distinguishes the quantities in the collection includes continuous ranges of values, I'll first restate the definition of $\frac{\partial X}{\partial y_q}$ in terms of the Kronecker delta $\delta_{p q}$, which I defined in the first part of the post, here. The range of possible values of the indexes $p$ and $q$ here can be any discrete range of values: the definition of the Kronecker delta $\delta_{a b}$ that I used in the third part of the post, here, where the indexes $a$ and $b$ can take values 1, 2, or 3, is a special case.

In the context of the notation where $y$ represents a collection of quantities, and the index $q$ distinguishes the quantities in the collection, the expression $\delta_p$ represents the collection of quantities such that $y_q = \delta_{p q}$. Thus the expression $y + \varepsilon \delta_p$ represents the collection of quantities such that $\left( y + \varepsilon 
\delta_p \right)_q = y_q$ for $q \neq p$, and $\left( y + \varepsilon \delta_p 
\right)_p = y_p + \varepsilon$. Thus we can restate the definition of $\frac{\partial X}{\partial y_q}$, as in the first part of the post, here, as:

\begin{displaymath}\frac{\partial X}{\partial y_q} = \mathrm{\lim}_{\varepsilon \rightarrow 0} \frac{X \left( y + \varepsilon \delta_q \right) - X \left( y 
\right)}{\varepsilon}, \end{displaymath}

where $\mathrm{\lim}_{\varepsilon \rightarrow 0}$ means the limit of what follows it, as $\varepsilon$ tends to 0. In words, $\frac{\partial X}{\partial y_q}$ is the rate of change of a quantity $X$ that depends on a collection of quantities $y$, as the quantity $y_q$ changes, while all the other quantities in $y$ have fixed values.

A quantity $X$ that depends on a discrete collection of quantities $y$, some of which can take continuous values, is sometimes said to be a function of $y$. For example $\mathrm{\cos} \left( \theta \right)$, which we studied in the first part of the post, here, is said to be a function of the angle $\theta$. A quantity $X$ that depends on a collection of quantities such as $Y$, where the range of possible values of an index such as $i$ that distinguishes the quantities in the collection $Y$ includes continuous ranges of values, is sometimes said to be a "functional" of $Y$. For example the action $S$, above, is a functional of the position data $x$ of the particles, the voltage field $V$, and the vector potential field $A$.

To extend the definition of $\frac{\partial X}{\partial y_q}$ for a function $X$ of $y$ to $\frac{\partial X}{\partial Y_i}$ for a functional $X$ of $Y$, we'll use the analogue of the Kronecker delta $\delta_{p q}$ for continuous indexes, which is called the Dirac delta after Paul Dirac. If $s$ is a quantity that can take continuous values, for example time or a position coordinate, then the Dirac delta $\delta \left( s \right)$ is the limit as $\varepsilon$ tends to 0 of a family of smooth functions $f_{\varepsilon}$ of $s$ that have a high peak at $s = 0$, such that $\int_{- \infty}^{\infty} 
f_{\varepsilon} \mathrm{d} s = 1$ for all $\varepsilon$, and $f_{\varepsilon} 
\left( s \right)$ tends to 0 as $\varepsilon$ tends to 0 for all $s \neq 0$. For example $f_{\varepsilon} 
\left( s \right)$ could be $\frac{1}{\sqrt{\pi 
\varepsilon}} \mathrm{e}^{- \frac{s^2}{\varepsilon}}$, since by a calculation similar to the one in the second part of the post, here, we find that $\int_{- \infty}^{\infty} \mathrm{e}^{- 
\frac{s^2}{\varepsilon}} \mathrm{d} s = \sqrt{\pi \varepsilon}$. The Dirac delta has the property that for any function $F$ of $s$:

\begin{displaymath}\int_{- \infty}^{\infty} \delta \left( s - s_{\left( 0 \right)} \right) F \left( s \right) \mathrm{d} s = F \left( s_{\left( 0 \right)} \right), \end{displaymath}

which is analogous to the property of the Kronecker delta we observed in the third part of the post, here.
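
As an informal illustration of this property, the sketch below uses the Gaussian family $f_{\varepsilon}$ from the text with a small but nonzero $\varepsilon$, and a simple midpoint-rule integral. The function names, the integration window, and the number of steps are assumptions made for the example. The grid spacing has to stay well below the width of the peak, which is roughly $\sqrt{\varepsilon}$.

```python
import math


def f_eps(s, eps):
    """Gaussian approximation to the Dirac delta, normalized to integrate to 1."""
    return math.exp(-s * s / eps) / math.sqrt(math.pi * eps)


def integrate(g, lo, hi, n=20000):
    """Midpoint-rule approximation to the integral of g from lo to hi."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


def sift(F, s0, eps):
    """Approximate the integral of delta(s - s0) F(s) over s, using f_eps."""
    # The peak is far narrower than the window, so the tails are negligible.
    return integrate(lambda s: f_eps(s - s0, eps) * F(s), s0 - 1.0, s0 + 1.0)
```

For example, `sift(math.cos, 0.3, 1e-4)` should come out close to `math.cos(0.3)`, and the agreement should improve as `eps` is reduced, provided `n` is increased to keep the peak well resolved.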

We'll now extend the definition of the Kronecker delta and the Dirac delta to the indexes $i, j, k, \ldots$ that distinguish the quantities in the collection $Y$ at each time $t$ in the appropriate way, using the Dirac delta where the ranges of possible values of the indexes are continuous. Thus we'll define $\delta_{x_{a I} x_{b J}} = \delta_{a b} \delta_{I J}$, $\delta_{x_{a I} V_x} = 0$, $\delta_{x_{a I} A_{b x}} = 0$, $\delta_{V_x V_y} 
= \delta \left( x_1 - y_1 \right) \delta \left( x_2 - y_2 \right) \delta 
\left( x_3 - y_3 \right)$, $\delta_{V_x A_{a y}} = 0$, and $\delta_{A_{a x} 
A_{b y}} = \delta_{a b} \delta \left( x_1 - y_1 \right) \delta \left( x_2 - 
y_2 \right) \delta \left( x_3 - y_3 \right)$. Here $y$ in the context $V_y$ or $A_{b y}$ is understood from its context to represent a position in space, like $x$ in the same context. With these definitions, $\delta_{i j}$ is 1 if $i$ and $j$ represent the same degree of freedom and 0 otherwise, and the Kronecker delta is used in contexts where the sum $\sum_i$ means a discrete sum, and the Dirac delta is used in contexts where it means an integral.

We can now define $\frac{\partial X}{\partial Y_i}$ for a functional $X$ of $Y$ as:

\begin{displaymath}\frac{\partial X}{\partial Y_i} = \mathrm{\lim}_{\varepsilon \rightarrow 0} \frac{X \left( Y + \varepsilon \delta_i \right) - X \left( Y 
\right)}{\varepsilon}, \end{displaymath}

where in the context of the notation where $Y$ represents a collection of quantities, and the index $j$ distinguishes the quantities in the collection, the expression $\delta_i$ represents the collection of quantities such that $Y_j = \delta_{i j}$.
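
To make this definition more concrete, here is a minimal sketch for a field sampled on a one-dimensional grid of spacing `h`. On such a grid, an integral becomes `h` times a sum, and the Dirac delta at grid point `i` is represented by a spike of height `1 / h` at that point, so that its integral is 1. The example functional $X = \frac{1}{2} \int Y^2 \mathrm{d} x$, and all the names used, are assumptions chosen for illustration; for this functional, the result should be close to the value of $Y$ at the chosen point.

```python
import math


def example_functional(Y, h):
    """Grid version of X = (1/2) * integral of Y(x)^2 dx."""
    return 0.5 * h * sum(y * y for y in Y)


def functional_derivative(X, Y, i, h, eps=1e-7):
    """Finite-eps estimate of dX/dY_i for a field Y sampled with spacing h."""
    bumped = list(Y)
    bumped[i] += eps / h  # eps times the grid stand-in for the Dirac delta
    return (X(bumped, h) - X(Y, h)) / eps


def sample_field(n=100, h=0.01):
    """A smooth test field, Y(x) = sin(x), sampled at x = 0, h, 2h, ..."""
    return [math.sin(k * h) for k in range(n)]
```

With `Y = sample_field()`, the value of `functional_derivative(example_functional, Y, 40, 0.01)` should be close to `Y[40]`.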

We now observe that the formula for the action $S$, above, can be written as:

\begin{displaymath}S = \int_{t_1}^{t_2} L \mathrm{d} t, \end{displaymath}

where

\begin{displaymath}L = T + L_{\mathrm{e.m.}} + L_{\mathrm{{{int}}.}}, 
\end{displaymath}

and $L_{\mathrm{e.m.}}$ and $L_{\mathrm{{{int}}.}}$ are the expressions that are integrated from $t_1$ to $t_2$ in the formulae for $S_{\mathrm{e.m.}}$, above, and $S_{\mathrm{{{int}}.}}$ above. The expression $L$ is called the Lagrangian, after Joseph-Louis Lagrange, who vigorously developed the applications of de Maupertuis's principle.

We observe, furthermore, that at each time $t$, the expressions $T$, $L_{\mathrm{e.m.}}$, and $L_{\mathrm{{{int}}.}}$, and consequently also $L$, depend on the dynamical quantities $Y_i$, as above, only through the values of $Y_i$ and $\frac{\partial Y_i}{\partial t}$ at that time $t$, or in other words, only through $Y_{i t}$ and $\left( \frac{\partial 
Y_i}{\partial t} \right)_t$. Here I am interpreting $\frac{\mathrm{d} x_{a 
I}}{\mathrm{d} t}$ as $\frac{\partial x_{a I}}{\partial t}$, since the symbol $\partial$ is an alternative notation for Leibniz's $\mathrm{d}$, as I explained in the first part of the post, here. By calculations similar to those we did in the first part of the post, here, we find that when an action is the time integral of a Lagrangian $L$, as above, such that $L_t$, the Lagrangian $L$ at time $t$, depends on the dynamical quantities $Y_i$ only through $Y_{i t}$ and $\left( \frac{\partial 
Y_i}{\partial t} \right)_t$, the equations that result from requiring that the action should be relatively unaltered by small changes in the values of the dynamical quantities $Y_i$ and their time dependence, in accordance with de Maupertuis's principle of stationary action, as above, are:

\begin{displaymath}- \frac{\mathrm{d}}{\mathrm{d} t} \left( \frac{\partial L}{\partial \frac{\partial Y_i}{\partial t}} \right) + \frac{\partial L}{\partial Y_i} 
= 0. \end{displaymath}

This is called Lagrange's equation. In this formula, the expressions $\frac{\partial L}{\partial \frac{\partial Y_i}{\partial t}}$ and $\frac{\partial L}{\partial Y_i}$ are defined by treating $Y_{i t}$ and $\left( \frac{\partial 
Y_i}{\partial t} \right)_t$ as completely independent quantities. The change to the action that results from the replacement of $Y$ by $Y + \varepsilon$, where $\varepsilon_{i t}$ is small for all $i$ and all $t$, is the time integral of the sum $\sum_i$ of $\varepsilon_i$ times the left-hand side of the above equation, plus terms that tend to 0 more rapidly than in proportion to $\varepsilon$, as $\varepsilon$ tends to 0.
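
The statement about the change to the action can be tried out numerically on a much simpler system than the one in this post. The sketch below assumes a single degree of freedom $x$ with Lagrangian $L = \frac{1}{2} m \left( \frac{\mathrm{d} x}{\mathrm{d} t} \right)^2 - \frac{1}{2} k x^2$, which is an illustrative choice and not the electromagnetic Lagrangian above. It discretizes the action, and estimates the part of the change to the action that is proportional to $\varepsilon$ when the path is shifted by $\varepsilon$ times a bump that vanishes at $t_1$ and $t_2$. For a path that satisfies Lagrange's equation, this should be close to 0.

```python
import math


def action(path, t1, t2, m=1.0, k=1.0):
    """Discretized time integral of L = m v^2 / 2 - k x^2 / 2 along the path."""
    n = len(path) - 1
    h = (t2 - t1) / n
    total = 0.0
    for j in range(n):
        velocity = (path[j + 1] - path[j]) / h
        midpoint = 0.5 * (path[j + 1] + path[j])
        total += h * (0.5 * m * velocity ** 2 - 0.5 * k * midpoint ** 2)
    return total


def first_order_change(x_of_t, t1=0.0, t2=1.0, n=1000, eps=1e-3):
    """Coefficient of eps in S(x + eps * bump) - S(x), by a central difference."""
    ts = [t1 + (t2 - t1) * j / n for j in range(n + 1)]
    bump = [math.sin(math.pi * (t - t1) / (t2 - t1)) for t in ts]
    plus = [x_of_t(t) + eps * b for t, b in zip(ts, bump)]
    minus = [x_of_t(t) - eps * b for t, b in zip(ts, bump)]
    return (action(plus, t1, t2) - action(minus, t1, t2)) / (2.0 * eps)
```

With $m = k = 1$, the path $x = \cos \left( t \right)$ satisfies Lagrange's equation, so `first_order_change(math.cos)` should be near 0, up to discretization error, while a path such as `lambda t: t * t` should give a clearly nonzero value.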

We'll now use Lagrange's equation, as above, to find the equations that result from the application of de Maupertuis's principle to the action $S$, above. For $Y_{x_{a I}} = x_{a I}$, we find from the formula for the kinetic energy $T$ of the particles, as in the first part of the post, here, that:

\begin{displaymath}\frac{\partial T}{\partial \frac{\mathrm{d} x_{a I}}{\mathrm{d} t}} = m_I 
\frac{\mathrm{d} x_{a I}}{\mathrm{d} t}, \end{displaymath}

where I used Leibniz's rule for the rate of change of a product, which we obtained in the first part of the post, here. Thus:

\begin{displaymath}- \frac{\mathrm{d}}{\mathrm{d} t} \left( \frac{\partial T}{\partial \frac{\mathrm{d} x_{a I}}{\mathrm{d} t}} \right) = - m_I \frac{\mathrm{d}}{\mathrm{d} t} \frac{\mathrm{d} x_{a I}}{\mathrm{d} t}, 
\end{displaymath}

which is in agreement with the result we found in the first part of the post, here. $T$ does not depend on $x_{a I}$ other than through $\frac{\mathrm{d} x_{a 
I}}{\mathrm{d} t}$, and from the formula for $S_{\mathrm{e.m.}}$, above, $L_{\mathrm{e.m.}}$ does not depend on $x_{a I}$ or $\frac{\mathrm{d} x_{a 
I}}{\mathrm{d} t}$. From the formula for $S_{\mathrm{{{int}}.}}$, above, we find:

\begin{displaymath}\frac{\partial L_{\mathrm{{{int}}.}}}{\partial x_{a 
I}} = - q_I \frac{\partial V}{\partial x_{a I}} + q_I \sum_b \frac{\partial A_b}{\partial x_{a I}} \frac{\mathrm{d} x_{b I}}{\mathrm{d} t}, \end{displaymath}

\begin{displaymath}\frac{\partial L_{\mathrm{{{int}}.}}}{\partial 
\frac{\mathrm{d} x_{a I}}{\mathrm{d} t}} = q_I A_a \left( x_I \right) . \end{displaymath}

For the first of the above two formulae, I rewrote the dummy index $a$ that is summed over by $\sum_a$ in the formula for $S_{\mathrm{{{int}}.}}$ as $b$, to avoid confusing it with the index $a$ on the quantity $x_{a I}$ we are evaluating the rate of change of $L_{\mathrm{{{int}}.}}$ with respect to.

To calculate $\frac{\mathrm{d}}{\mathrm{d} t} \left( \frac{\partial L_{\mathrm{{{int}}.}}}{\partial \frac{\mathrm{d} x_{a I}}{\mathrm{d} t}} \right) = q_I \frac{\mathrm{d}}{\mathrm{d} t} A_a \left( x_I \right)$, we observe that by the result we found in the first part of the post, here, with $X$ taken as $A_a$, and the $y_p$ taken as $x_{1 I}$, $x_{2 I}$, $x_{3 I}$, and $t$, we have:

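\begin{displaymath}A_a \left( x_I + \frac{\mathrm{d} x_I}{\mathrm{d} t} \mathrm{d} t, t + \mathrm{d} t \right) - A_a \left( x_I, t \right) \simeq \sum_b \frac{\partial A_a}{\partial x_{b I}} \frac{\mathrm{d} x_{b I}}{\mathrm{d} t} \mathrm{d} 
t + \frac{\partial A_a}{\partial t} \mathrm{d} t, \end{displaymath}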

where the error of this approximate representation tends to 0 more rapidly than in proportion to $\mathrm{d} t$, as $\mathrm{d} t$ tends to 0. For a formula such as this one, where we have not yet divided the change during time $\mathrm{d} t$ by $\mathrm{d} t$, we relax the rule that Leibniz's $\mathrm{d}$ means that the formula is to be taken in the limit where $\mathrm{d} t$ tends to 0, because the formula would otherwise give $0 = 0$. On dividing the above formula by $\mathrm{d} t$, and then taking the limit where $\mathrm{d} t$ tends to 0, we find:

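\begin{displaymath}- \frac{\mathrm{d}}{\mathrm{d} t} \left( \frac{\partial 
L_{\mathrm{{{int}}.}}}{\partial \frac{\mathrm{d} x_{a I}}{\mathrm{d} t}} \right) = - q_I \sum_b \frac{\partial A_a}{\partial x_{b I}} \frac{\mathrm{d} x_{b I}}{\mathrm{d} t} - q_I \frac{\partial A_a}{\partial 
t} . \end{displaymath}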

Thus from Lagrange's equation, above, we have:

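\begin{displaymath}- m_I \frac{\mathrm{d}}{\mathrm{d} t} \frac{\mathrm{d} x_{a I}}{\mathrm{d} t} - q_I \frac{\partial V}{\partial x_{a I}} + q_I \sum_b \frac{\partial A_b}{\partial x_{a I}} \frac{\mathrm{d} x_{b I}}{\mathrm{d} t} - q_I \sum_b \frac{\partial A_a}{\partial x_{b I}} \frac{\mathrm{d} x_{b 
I}}{\mathrm{d} t} - q_I \frac{\partial A_a}{\partial t} = 0. \end{displaymath}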

Rearranging this formula as:

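\begin{displaymath}m_I \frac{\mathrm{d}}{\mathrm{d} t} \frac{\mathrm{d} x_{a I}}{\mathrm{d} t} = - q_I \frac{\partial V}{\partial x_{a I}} - q_I \frac{\partial A_a}{\partial t} + q_I \sum_b \left( \frac{\partial A_b}{\partial x_{a I}} - \frac{\partial A_a}{\partial x_{b I}} \right) \frac{\mathrm{d} x_{b 
I}}{\mathrm{d} t}, \end{displaymath}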

we recognize it as:

\begin{displaymath}m_I \frac{\mathrm{d}}{\mathrm{d} t} \frac{\mathrm{d} x_{a I}}{\mathrm{d} t} = \left( F_{\mathrm{e.s.}} \right)_{a I} + \left( F_{\mathrm{m.i.}} 
\right)_{a I}, \end{displaymath}

where $\left( F_{\mathrm{e.s.}} \right)_I$ is the electrostatic force, as above, on the $I$'th particle, and $\left( F_{\mathrm{m.i.}} \right)_I$ is the magnetic induction force, as above, on the $I$'th particle.

For $Y_{V_x} = V_x$, $T$ gives no contribution. From the formula for $S_{\mathrm{e.m.}}$, above, we find:

\begin{displaymath}\frac{\partial L_{\mathrm{e.m.}}}{\partial V_x} = 
\mathrm{\lim}_{\varepsilon \rightarrow 0} \frac{L_{\mathrm{e.m.}} \left( V + \varepsilon \delta_x \right) - L_{\mathrm{e.m.}} \left( V 
\right)}{\varepsilon} \end{displaymath}

\begin{displaymath}= \mathrm{\lim}_{\varepsilon \rightarrow 0} \frac{1}{\varepsilon} \int \frac{\epsilon_0}{2} \sum_a \left( \left( \frac{\partial \left( V + \varepsilon \delta_x \right)}{\partial y_a} + \frac{\partial A_a}{\partial t} \right)^2 - \left( \frac{\partial V}{\partial y_a} + \frac{\partial A_a}{\partial t} \right)^2 \right) \mathrm{d}^3 y \end{displaymath}

\begin{displaymath}= \int \epsilon_0 \sum_a \left( \frac{\partial \delta_x}{\partial y_a} \right) \left( \frac{\partial V}{\partial y_a} + \frac{\partial 
A_a}{\partial t} \right) \mathrm{d}^3 y \end{displaymath}

\begin{displaymath}= - \int \epsilon_0 \sum_a \delta_x \frac{\partial}{\partial y_a} \left( \frac{\partial V}{\partial y_a} + \frac{\partial A_a}{\partial t} \right) 
\mathrm{d}^3 y \end{displaymath}

\begin{displaymath}= - \epsilon_0 \sum_a \frac{\partial}{\partial x_a} \left( \frac{\partial 
V}{\partial x_a} + \frac{\partial A_a}{\partial t} \right) . \end{displaymath}

In the second line here, I rewrote the dummy position index $x$ that is integrated over by $\int \ldots \mathrm{d}^3 x$ in the formula for $L_{\mathrm{e.m.}}$ as $y$, to avoid confusing it with the index $x$ on the quantity $V_x$ we are evaluating the rate of change of $L_{\mathrm{e.m.}}$ with respect to. $\delta_x$ means $\delta \left( x_1 - y_1 \right) \delta \left( x_2 - y_2 \right) \delta \left( x_3 - y_3 \right)$, in accordance with the definitions of $\delta_i$ and $\delta_{V_x V_y}$ above. The fourth line is obtained from the third line in a similar manner to the calculation in the first part of the post, here: we use Leibniz's rule for the rate of change of a product to calculate the rate of change of the product $\delta_x \left( \frac{\partial V}{\partial y_a} + \frac{\partial A_a}{\partial t} \right)$ with respect to $y_a$, then use the result that the integral $\int_{- \infty}^{\infty} \frac{\partial}{\partial y_a} \left( \delta_x \left( \frac{\partial V}{\partial y_a} + \frac{\partial A_a}{\partial t} \right) \right) \mathrm{d} y_a$ is the difference between the value of $\delta_x \left( \frac{\partial V}{\partial y_a} + \frac{\partial A_a}{\partial t} \right)$ as $y_a$ tends to $+ \infty$ and its value as $y_a$ tends to $- \infty$, which is 0, since $\delta \left( x_a - y_a \right)$ is 0 in both these limits. And the fifth line is obtained from the fourth line by using the result above for the Dirac delta, with $s$ taken as $y_b$ and $s_{\left( 0 \right)}$ taken as $x_b$, for $b = 1, 2, 3$.
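
The step that moves the derivative off the Dirac delta can be illustrated in one dimension: for a smooth function $g$, the integral of $\frac{\partial}{\partial y} \delta \left( x - y \right)$ times $g \left( y \right)$ over $y$ should equal $- \frac{\mathrm{d} g}{\mathrm{d} x}$ at the position $x$. The sketch below checks this approximately, assuming the Gaussian family from earlier in the post with a small nonzero $\varepsilon$ in place of the Dirac delta. The names and numerical parameters are illustrative assumptions.

```python
import math


def smooth_delta(u, eps):
    """Gaussian approximation to the Dirac delta, normalized to integrate to 1."""
    return math.exp(-u * u / eps) / math.sqrt(math.pi * eps)


def d_smooth_delta_dy(x, y, eps):
    """Rate of change of smooth_delta(x - y, eps) with respect to y."""
    return (2.0 * (x - y) / eps) * smooth_delta(x - y, eps)


def integral_against_delta_derivative(g, x, eps=1e-3, half_width=1.0, n=20000):
    """Midpoint-rule integral over y of (d/dy smooth_delta(x - y)) * g(y)."""
    h = 2.0 * half_width / n
    total = 0.0
    for k in range(n):
        y = x - half_width + (k + 0.5) * h
        total += d_smooth_delta_dy(x, y, eps) * g(y)
    return h * total
```

For instance, `integral_against_delta_derivative(math.sin, 0.4)` should be close to `-math.cos(0.4)`.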

From the formula for $S_{\mathrm{e.m.}}$, above, we also find:

\begin{displaymath}\frac{\partial L_{\mathrm{e.m.}}}{\partial \frac{\partial V_x}{\partial t}} 
= 0. \end{displaymath}

From the formula for $S_{\mathrm{{{int}}.}}$, above, we find:

\begin{displaymath}\frac{\partial L_{\mathrm{{{int}}.}}}{\partial V_x} 
= - \sum_I q_I \delta^3 \left( x - x_I \right), \end{displaymath}

where $\delta^3 \left( x - x_I \right) = \delta \left( x_1 - x_{1 I} \right) 
\delta \left( x_2 - x_{2 I} \right) \delta \left( x_3 - x_{3 I} \right)$, and:

\begin{displaymath}\frac{\partial L_{\mathrm{{{int}}.}}}{\partial 
\frac{\partial V_x}{\partial t}} = 0. \end{displaymath}

Thus from Lagrange's equation, above, we find:

\begin{displaymath}- \epsilon_0 \sum_a \frac{\partial}{\partial x_a} \left( \frac{\partial V}{\partial x_a} + \frac{\partial A_a}{\partial t} \right) - \sum_I q_I 
\delta^3 \left( x - x_I \right) = 0. \end{displaymath}

This is in agreement with Maxwell's equation summarizing Coulomb's law, as in the third part of this post, here, since for a collection of point particles at positions $x_I$ with electric charges $q_I$, the electric charge density $\rho$ is:

\begin{displaymath}\rho = \sum_I q_I \delta^3 \left( x - x_I \right) . \end{displaymath}

For $Y_{A_{a x}} = A_{a x}$, $T$ again gives no contribution. From the formula for $S_{\mathrm{e.m.}}$, above, two terms in $L_{\mathrm{e.m.}}$ give contributions to $\frac{\partial L_{\mathrm{e.m.}}}{\partial A_{a x}}$, namely the term involving $\frac{\partial A_c}{\partial x_b} \frac{\partial 
A_c}{\partial x_b}$, which I shall call $L_{\mathrm{e.m. 1}}$, and the term involving $\frac{\partial A_c}{\partial x_b} \frac{\partial A_b}{\partial 
x_c}$, which I shall call $L_{\mathrm{e.m. 2}}$. We find:

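\begin{displaymath}\frac{\partial L_{\mathrm{e.m. 1}}}{\partial A_{a x}} = 
\mathrm{\lim}_{\varepsilon \rightarrow 0} \frac{L_{\mathrm{e.m. 1}} \left( A + \varepsilon \delta_{A_{a x}} \right) - L_{\mathrm{e.m. 1}} \left( A 
\right)}{\varepsilon} \end{displaymath}

\begin{displaymath}= - \frac{1}{2 \mu_0} \mathrm{\lim}_{\varepsilon \rightarrow 0} \frac{1}{\varepsilon} \int \sum_{b, c} \left( \frac{\partial \left( A_c + \varepsilon \delta_{a c} \delta_x \right)}{\partial y_b} \frac{\partial \left( A_c + \varepsilon \delta_{a c} \delta_x \right)}{\partial y_b} - \frac{\partial A_c}{\partial y_b} \frac{\partial A_c}{\partial y_b} 
\right) \mathrm{d}^3 y \end{displaymath}

\begin{displaymath}= - \frac{1}{\mu_0} \int \sum_{b, c} \delta_{a c} \frac{\partial \delta_x}{\partial y_b} \frac{\partial A_c}{\partial y_b} \mathrm{d}^3 y 
\end{displaymath}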

\begin{displaymath}= \frac{1}{\mu_0} \int \delta_x \sum_b \frac{\partial}{\partial y_b} 
\frac{\partial A_a}{\partial y_b} \mathrm{d}^3 y \end{displaymath}

\begin{displaymath}= \frac{1}{\mu_0} \sum_b \frac{\partial}{\partial x_b} \frac{\partial 
A_a}{\partial x_b}, \end{displaymath}

where the successive steps are as in the corresponding calculation for $V_x$, as above, and in the fourth line I also used the result we observed in the third part of the post, here, for the Kronecker delta. And similarly:

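\begin{displaymath}\frac{\partial L_{\mathrm{e.m. 2}}}{\partial A_{a x}} = 
\mathrm{\lim}_{\varepsilon \rightarrow 0} \frac{L_{\mathrm{e.m. 2}} \left( A + \varepsilon \delta_{A_{a x}} \right) - L_{\mathrm{e.m. 2}} \left( A 
\right)}{\varepsilon} \end{displaymath}

\begin{displaymath}= - \frac{1}{2 \mu_0} \mathrm{\lim}_{\varepsilon \rightarrow 0} \frac{1}{\varepsilon} \int \sum_{b, c} \left( - \frac{\partial \left( A_c + \varepsilon \delta_{a c} \delta_x \right)}{\partial y_b} \frac{\partial \left( A_b + \varepsilon \delta_{a b} \delta_x \right)}{\partial y_c} + \frac{\partial A_c}{\partial y_b} \frac{\partial A_b}{\partial y_c} 
\right) \mathrm{d}^3 y \end{displaymath}

\begin{displaymath}= - \frac{1}{2 \mu_0} \int \sum_{b, c} \left( - \delta_{a c} \frac{\partial \delta_x}{\partial y_b} \frac{\partial A_b}{\partial y_c} - \delta_{a b} \frac{\partial A_c}{\partial y_b} \frac{\partial \delta_x}{\partial y_c} 
\right) \mathrm{d}^3 y \end{displaymath}

\begin{displaymath}= - \frac{1}{2 \mu_0} \int \left( \sum_b \delta_x \frac{\partial}{\partial y_b} \frac{\partial A_b}{\partial y_a} + \sum_c \delta_x \frac{\partial}{\partial y_c} \frac{\partial A_c}{\partial y_a} \right) 
\mathrm{d}^3 y \end{displaymath}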

\begin{displaymath}= - \frac{1}{\mu_0} \sum_b \frac{\partial}{\partial x_b} \frac{\partial 
A_b}{\partial x_a}, \end{displaymath}

where the successive steps are the same again, and in going from the fourth line to the fifth line, I rewrote the dummy index $c$ in the second term in the fourth line as $b$.

From the formula for $S_{\mathrm{e.m.}}$, above, we also find:

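\begin{displaymath}\frac{\partial L_{\mathrm{e.m.}}}{\partial \frac{\partial A_{a x}}{\partial t}} = \mathrm{\lim}_{\varepsilon \rightarrow 0} \frac{L_{\mathrm{e.m.}} \left( \frac{\partial A}{\partial t} + \varepsilon \delta_{A_{a x}} \right) - L_{\mathrm{e.m.}} \left( \frac{\partial A}{\partial t} 
\right)}{\varepsilon} \end{displaymath}

\begin{displaymath}= \mathrm{\lim}_{\varepsilon \rightarrow 0} \frac{1}{\varepsilon} \int \frac{\epsilon_0}{2} \sum_b \left( \left( \frac{\partial V}{\partial y_b} + \frac{\partial A_b}{\partial t} + \varepsilon \delta_{a b} \delta_x \right)^2 - \left( \frac{\partial V}{\partial y_b} + \frac{\partial A_b}{\partial t} \right)^2 \right) \mathrm{d}^3 y \end{displaymath}

\begin{displaymath}= \int \epsilon_0 \sum_b \delta_{a b} \delta_x \left( \frac{\partial V}{\partial y_b} + \frac{\partial A_b}{\partial t} \right) \mathrm{d}^3 y 
\end{displaymath}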

\begin{displaymath}= \epsilon_0 \left( \frac{\partial V}{\partial x_a} + \frac{\partial 
A_a}{\partial t} \right) . \end{displaymath}

Thus:

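\begin{displaymath}- \frac{\mathrm{d}}{\mathrm{d} t} \frac{\partial 
L_{\mathrm{e.m.}}}{\partial \frac{\partial A_{a x}}{\partial t}} = - \epsilon_0 \frac{\partial}{\partial t} \left( \frac{\partial V}{\partial 
x_a} + \frac{\partial A_a}{\partial t} \right) . \end{displaymath}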

From the formula for $S_{\mathrm{{{int}}.}}$, above, we find:

\begin{displaymath}\frac{\partial L_{\mathrm{{{int}}.}}}{\partial A_{a 
x}} = \sum_I q_I \delta^3 \left( x - x_I \right) \frac{\mathrm{d} x_{a 
I}}{\mathrm{d} t}, \end{displaymath}

\begin{displaymath}\frac{\partial L_{\mathrm{{{int}}.}}}{\partial 
\frac{\partial A_{a x}}{\partial t}} = 0. \end{displaymath}

Thus from Lagrange's equation, above, we find:

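\begin{displaymath}\frac{1}{\mu_0} \sum_b \frac{\partial}{\partial x_b} \left( \frac{\partial A_a}{\partial x_b} - \frac{\partial A_b}{\partial x_a} \right) - \epsilon_0 \frac{\partial}{\partial t} \left( \frac{\partial V}{\partial x_a} + \frac{\partial A_a}{\partial t} \right) + \sum_I q_I \delta^3 \left( x - 
x_I \right) \frac{\mathrm{d} x_{a I}}{\mathrm{d} t} = 0. \end{displaymath}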

This is in agreement with Maxwell's equation summarizing Ampère's law, as in the third part of this post, here, since for a collection of point particles at positions $x_I$ with electric charges $q_I$, the electric current density $J_a$ is:

\begin{displaymath}J_a = \sum_I q_I \delta^3 \left( x - x_I \right) \frac{\mathrm{d} x_{a 
I}}{\mathrm{d} t} . \end{displaymath}

As a check of this formula for $J_a$, we observe that for the electric charge density $\rho$ of a collection of point particles at positions $x_I$ with electric charges $q_I$, as above, the result we found in the first part of this post, here, with $X$ taken as $\delta^3 \left( x - x_I \right)$ and the $y_p$ taken as $x_a$, gives:

\begin{displaymath}\rho \left( x, t + \mathrm{d} t \right) - \rho \left( x, t \right) = \sum_I q_I \left( \delta^3 \left( x - \left( x_I + \frac{\mathrm{d} x_I}{\mathrm{d} t} \mathrm{d} t \right) \right) - \delta^3 \left( x - x_I 
\right) \right) \end{displaymath}

\begin{displaymath}\simeq \sum_I \sum_a q_I \left( \frac{\partial}{\partial x_a} \delta^3 \left( x - x_I \right) \right) \left( - \frac{\mathrm{d} x_{a 
I}}{\mathrm{d} t} \mathrm{d} t \right), \end{displaymath}

where the error of this approximate representation tends to 0 more rapidly than in proportion to $\mathrm{d} t$, as $\mathrm{d} t$ tends to 0. Thus:

\begin{displaymath}\frac{\partial \rho}{\partial t} = - \sum_{a, I} q_I \frac{\mathrm{d} x_{a I}}{\mathrm{d} t} \frac{\partial}{\partial x_a} \delta^3 \left( x - x_I 
\right) = - \sum_a \frac{\partial J_a}{\partial x_a}, \end{displaymath}

where the electric current density $J_a$ is as above. Thus the electric charge density $\rho$ of a collection of point particles at positions $x_I$ with electric charges $q_I$, as above, and the electric current density $J_a$ of those particles, as above, satisfy the equation expressing the conservation of electric charge, as in the third part of this post, here.
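
As a rough numerical illustration of this conservation equation, the sketch below considers a single particle in one dimension, with the Dirac delta replaced by a Gaussian of small but nonzero width so that ordinary finite differences can be used. The particle's motion, the width `eps`, and all the names are assumptions made for the example.

```python
import math


def smooth_delta(u, eps=0.1):
    """Gaussian stand-in for the Dirac delta, normalized to integrate to 1."""
    return math.exp(-u * u / eps) / math.sqrt(math.pi * eps)


def position(t):
    """Illustrative particle position as a function of time."""
    return math.sin(t)


def velocity(t):
    """Rate of change of position(t)."""
    return math.cos(t)


def rho(x, t, q=1.0):
    """Smeared charge density of the single particle."""
    return q * smooth_delta(x - position(t))


def current(x, t, q=1.0):
    """Smeared current density of the single particle."""
    return q * smooth_delta(x - position(t)) * velocity(t)


def continuity_residual(x, t, step=1e-5):
    """d(rho)/dt + d(J)/dx by central differences; should be close to 0."""
    drho_dt = (rho(x, t + step) - rho(x, t - step)) / (2.0 * step)
    dj_dx = (current(x + step, t) - current(x - step, t)) / (2.0 * step)
    return drho_dt + dj_dx
```

Evaluating `continuity_residual(0.5, 0.3)` should give a value near 0, limited only by the finite-difference error.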

Thus for a collection of electrically charged point particles moving slowly compared to the speed of light in a vacuum, the electrostatic force $F_{\mathrm{e.s.}}$ on those particles, as above, the magnetic induction force $F_{\mathrm{m.i.}}$ on those particles, as above, Maxwell's equation summarizing Coulomb's law, as in the third part of this post, here, and Maxwell's equation summarizing Ampère's law, as in the third part of this post, here, all follow from the application of de Maupertuis's principle of stationary action, as in the first part of the post, here, to the action $S$, above. And Maxwell's equation summarizing Faraday's measurements involving time-dependent magnetic fields, as in the third part of the post, here, and Maxwell's equation summarizing the non-observation of magnetic monopoles, as in the third part of the post, here, are automatically satisfied, due to the formulae expressing the electric field strength $E$ and the magnetic induction field $B$ in terms of the voltage field $V$ and the vector potential field $A$, as in the third part of the post, here.

We observe that the action $S$, above, is unaltered when the voltage field $V$ and the vector potential field $A$ are changed by a gauge transformation, as in the third part of this post, here, if the scalar field $f$ that defines the gauge transformation is 0 at the times $t_1$ and $t_2$ between which the action is calculated. For the first form of the formula for $S_{\mathrm{e.m.}}$, as above, shows that $S_{\mathrm{e.m.}}$ only depends on $V$ and $A$ through the electric field strength $E$ and the magnetic induction field $B$, which are unchanged by the gauge transformation, as we found in the third part of the post, here. And the change to $S_{\mathrm{{{int}}.}}$, above, that results from the gauge transformation, is:

\begin{displaymath}- \int_{t_1}^{t_2} \sum_I q_I \left( - \frac{\partial}{\partial t} f \left( x_I, t \right) - \sum_a \frac{\partial}{\partial x_{a I}} f \left( x_I, t \right) \frac{\mathrm{d} x_{a I}}{\mathrm{d} t} \right) \mathrm{d} t. \end{displaymath}

The result we found in the first part of the post, here, with $X$ taken as $f$, and the $y_p$ taken as the $x_{a I}$ and $t$, gives:

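\begin{displaymath}f \left( x_I + \frac{\mathrm{d} x_I}{\mathrm{d} t} \mathrm{d} t, t + \mathrm{d} t \right) - f \left( x_I, t \right) \simeq \frac{\partial}{\partial t} f \left( x_I, t \right) \mathrm{d} t + \sum_a \frac{\partial}{\partial x_{a I}} f \left( x_I, t \right) \frac{\mathrm{d} x_{a 
I}}{\mathrm{d} t} \mathrm{d} t, \end{displaymath}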

where the error of this approximate representation tends to 0 more rapidly than in proportion to $\mathrm{d} t$, as $\mathrm{d} t$ tends to 0. So:

\begin{displaymath}\frac{\mathrm{d}}{\mathrm{d} t} f \left( x_I, t \right) = 
\frac{\partial}{\partial t} f \left( x_I, t \right) + \sum_a \frac{\partial}{\partial x_{a I}} f \left( x_I, t \right) \frac{\mathrm{d} x_{a 
I}}{\mathrm{d} t} . \end{displaymath}

Thus the formula above for the change to $S_{\mathrm{{{int}}.}}$, that results from the gauge transformation, is:

\begin{displaymath}\int_{t_1}^{t_2} \sum_I q_I \frac{\mathrm{d}}{\mathrm{d} t} f \left( x_I, t 
\right) \mathrm{d} t. \end{displaymath}

So from the result we found in the first part of the post, here, that the integral of the rate of change of a quantity is equal to the net change of that quantity, the change to $S_{\mathrm{{{int}}.}}$, that results from the gauge transformation, is 0 if $f$ is 0 at $t_1$ and $t_2$. Thus since $T$, the kinetic energy of the particles, does not depend on $V$ or $A$, the action $S$ is unaltered by the gauge transformation, if $f$ is 0 at $t_1$ and $t_2$. This property of $S$ is called gauge invariance.
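The cancellation can also be checked numerically. The following is a minimal sketch, assuming a hypothetical one-dimensional example that is not part of the derivation: a single particle of charge `q` on an invented path `position(t)`, and an invented gauge function `f` chosen to be 0 at $t_1 = 0$ and $t_2 = 1$. It integrates the change to $S_{\mathrm{{{int}}.}}$ with the midpoint rule, and compares it with the net change of $q f$ between the end times.

```python
import math

# Hypothetical one-dimensional example: the gauge function f, the path
# position(t), and the charge q are all invented for illustration.

def f(x, t):
    # Gauge function chosen to vanish at t = 0 and t = 1.
    return math.sin(math.pi * t) * x * x

def df_dt(x, t):
    # Partial derivative of f with respect to t at fixed x.
    return math.pi * math.cos(math.pi * t) * x * x

def df_dx(x, t):
    # Partial derivative of f with respect to x at fixed t.
    return 2.0 * math.sin(math.pi * t) * x

def position(t):
    return 1.0 + 0.5 * math.sin(3.0 * t)

def velocity(t):
    # Time derivative of position(t).
    return 1.5 * math.cos(3.0 * t)

def change_in_s_int(q, t1, t2, steps=20000):
    """Midpoint-rule estimate of - int q (-df/dt - df/dx * dx/dt) dt."""
    h = (t2 - t1) / steps
    total = 0.0
    for k in range(steps):
        t = t1 + (k + 0.5) * h
        x = position(t)
        total += -q * (-df_dt(x, t) - df_dx(x, t) * velocity(t)) * h
    return total

q = 2.0
full = change_in_s_int(q, 0.0, 1.0)  # f is 0 at both end times
half = change_in_s_int(q, 0.0, 0.5)  # f is not 0 at t = 0.5
expected_half = q * (f(position(0.5), 0.5) - f(position(0.0), 0.0))
```

In this sketch `full` comes out as 0 up to the integration error, while `half` agrees with `expected_half`, the net change of $q f$ along the path, which is not 0 because the chosen $f$ does not vanish at $t = 0.5$.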

For any system whose action, $S$, can be expressed in terms of a Lagrangian, $L$, as above, such that $L_t$, the Lagrangian at time $t$, depends on the dynamical quantities $Y_i$ only through $Y_{i t}$ and $\left( \frac{\partial Y_i}{\partial t} \right)_t$, the values of $Y_i$ and $\frac{\partial Y_i}{\partial t}$ at the time $t$, so that de Maupertuis's principle of stationary action leads to Lagrange's equation, as above, the following expression:

\begin{displaymath}H = \left( \sum_i \frac{\partial Y_i}{\partial t} \frac{\partial 
L}{\partial \frac{\partial Y_i}{\partial t}} \right) - L \end{displaymath}

is automatically independent of time. For by Leibniz's rule for the rate of change of a product, which we obtained in the first part of the post, here:

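\begin{displaymath}\frac{\mathrm{d} H}{\mathrm{d} t} = \left( \sum_i \left( \left( \frac{\mathrm{d}}{\mathrm{d} t} \frac{\partial Y_i}{\partial t} \right) \frac{\partial L}{\partial \frac{\partial Y_i}{\partial t}} + \frac{\partial Y_i}{\partial t} \left( \frac{\mathrm{d}}{\mathrm{d} t} \frac{\partial L}{\partial \frac{\partial Y_i}{\partial t}} \right) \right) \right) - \frac{\mathrm{d} L}{\mathrm{d} t} \end{displaymath}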

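\begin{displaymath}= \left( \sum_i \left( \left( \frac{\mathrm{d}}{\mathrm{d} t} \frac{\partial Y_i}{\partial t} \right) \frac{\partial L}{\partial \frac{\partial Y_i}{\partial t}} + \frac{\partial Y_i}{\partial t} \frac{\partial L}{\partial Y_i} \right) \right) - \frac{\mathrm{d} L}{\mathrm{d} t}, \end{displaymath}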

where the second line follows from Lagrange's equation, above. And from the result we found in the first part of the post, here, with $X$ taken as $L$, and the quantities $y_p$ taken as the quantities $Y_i$ and $\frac{\partial Y_i}{\partial t}$, we have:

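\begin{displaymath}L \left( Y + \frac{\partial Y}{\partial t} \mathrm{d} t, \frac{\partial Y}{\partial t} + \left( \frac{\mathrm{d}}{\mathrm{d} t} \frac{\partial Y}{\partial t} \right) \mathrm{d} t \right) - L \left( Y, \frac{\partial Y}{\partial t} \right) \simeq \end{displaymath}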

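\begin{displaymath}\simeq \sum_i \left( \frac{\partial L}{\partial Y_i} \frac{\partial Y_i}{\partial t} \mathrm{d} t + \frac{\partial L}{\partial \frac{\partial Y_i}{\partial t}} \left( \frac{\mathrm{d}}{\mathrm{d} t} \frac{\partial Y_i}{\partial t} \right) \mathrm{d} t \right), \end{displaymath}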

where the error of this approximation tends to 0 more rapidly than in proportion to $\mathrm{d} t$, as $\mathrm{d} t$ tends to 0. Thus:

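\begin{displaymath}\frac{\mathrm{d} L}{\mathrm{d} t} = \sum_i \left( \frac{\partial L}{\partial Y_i} \frac{\partial Y_i}{\partial t} + \frac{\partial L}{\partial \frac{\partial Y_i}{\partial t}} \left( \frac{\mathrm{d}}{\mathrm{d} t} \frac{\partial Y_i}{\partial t} \right) \right) . \end{displaymath}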

So from the formula above:

\begin{displaymath}\frac{\mathrm{d} H}{\mathrm{d} t} = 0. \end{displaymath}
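As a small illustration of this result, here is a sketch for a hypothetical system with a single dynamical quantity $Y$: an oscillator with Lagrangian $L = \frac{1}{2} m \left( \frac{\partial Y}{\partial t} \right)^2 - \frac{1}{2} k Y^2$. The constants `m` and `k`, and the function names, are assumptions made for the example. The code builds $H$ from $L$ by the formula above, with the partial derivative taken by a central difference, and evaluates it along an exact solution of Lagrange's equation.

```python
import math

def lagrangian(y, v, m=1.0, k=4.0):
    # Hypothetical example: L = (1/2) m (dY/dt)^2 - (1/2) k Y^2,
    # where y stands for Y and v stands for dY/dt.
    return 0.5 * m * v * v - 0.5 * k * y * y

def hamiltonian(L, y, v, eps=1e-3):
    # H = (dY/dt) * dL/d(dY/dt) - L. The partial derivative is taken by a
    # central difference, which is exact up to rounding when L is
    # quadratic in dY/dt.
    dL_dv = (L(y, v + eps) - L(y, v - eps)) / (2.0 * eps)
    return v * dL_dv - L(y, v)

def solution(t, m=1.0, k=4.0):
    # Exact solution of Lagrange's equation m d2Y/dt2 = -k Y,
    # with Y = 1 and dY/dt = 0 at t = 0. Returns (Y, dY/dt).
    w = math.sqrt(k / m)
    return math.cos(w * t), -w * math.sin(w * t)

# H sampled at 50 times along the solution.
energies = [hamiltonian(lagrangian, *solution(0.1 * n)) for n in range(50)]
```

Every entry of `energies` has the same value, $\frac{1}{2} k = 2$ for the constants assumed here, up to rounding error, even though $Y$ and $\frac{\partial Y}{\partial t}$ both vary with time.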

For a system that includes a collection of point particles, so that $L$ includes a term that is the kinetic energy $T$ of those particles, as above, the formula for $\frac{\partial T}{\partial \frac{\mathrm{d} x_{a I}}{\mathrm{d} t}}$, as above, shows that $\sum_{a, I} \frac{\mathrm{d} x_{a I}}{\mathrm{d} t} \frac{\partial T}{\partial \frac{\mathrm{d} x_{a I}}{\mathrm{d} t}} = 2 T$, so that $H$ also includes a term $2 T - T = T$. Thus $H$ must be the total energy of the system, and the result that $\frac{\mathrm{d} H}{\mathrm{d} t} = 0$, so that the value of $H$ is independent of time, expresses the conservation of energy. $H$ is called the Hamiltonian of the system, after Sir William Rowan Hamilton.
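The factor of 2 can be spelled out as follows, assuming the usual kinetic energy of slowly moving particles, with $m_I$ used here to denote the mass of particle $I$ (the symbol is an assumption of this note):

\begin{displaymath}T = \sum_I \frac{1}{2} m_I \sum_a \left( \frac{\mathrm{d} x_{a I}}{\mathrm{d} t} \right)^2, \qquad \frac{\partial T}{\partial \frac{\mathrm{d} x_{a I}}{\mathrm{d} t}} = m_I \frac{\mathrm{d} x_{a I}}{\mathrm{d} t}, \end{displaymath}

so that:

\begin{displaymath}\sum_{a, I} \frac{\mathrm{d} x_{a I}}{\mathrm{d} t} \frac{\partial T}{\partial \frac{\mathrm{d} x_{a I}}{\mathrm{d} t}} = \sum_I m_I \sum_a \left( \frac{\mathrm{d} x_{a I}}{\mathrm{d} t} \right)^2 = 2 T. \end{displaymath}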

From the formulae for the action, $S$, as above, $S_{\mathrm{e.m.}}$, as above, $S_{\mathrm{{{int}}.}}$, as above, $\frac{\partial L_{\mathrm{{{int}}.}}}{\partial \frac{\mathrm{d} x_{a I}}{\mathrm{d} t}}$, as above, and $\frac{\partial L_{\mathrm{e.m.}}}{\partial \frac{\partial A_{a x}}{\partial t}}$, as above, we find that the Hamiltonian for a collection of electrically charged point particles moving slowly compared to the speed of light in a vacuum, and the voltage field $V$ and the vector potential field $A$, is:

\begin{displaymath}H = T + H_{\mathrm{e.m.}} + H_{\mathrm{{{int}}.}}, 
\end{displaymath}

where $T$ is the kinetic energy of the particles, as in the first part of the post, here,

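\begin{displaymath}H_{\mathrm{e.m.}} = \int \! \frac{1}{2} \left( \epsilon_0 \sum_a \left( \left( \frac{\partial A_a}{\partial t} \right)^2 - \left( \frac{\partial V}{\partial x_a} \right)^2 \right) + \frac{1}{\mu_0} \sum_{b, c} \frac{\partial A_c}{\partial x_b} \left( \frac{\partial A_c}{\partial x_b} - \frac{\partial A_b}{\partial x_c} \right) \right) \mathrm{d}^3 x, \end{displaymath}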

and

\begin{displaymath}H_{\mathrm{{{int}}.}} = \sum_I q_I V \left( x_I 
\right) . \end{displaymath}

The above formula for $H$ is not manifestly left unchanged by the gauge transformations that modify the voltage field $V$ and the vector potential field $A$, as in the third part of the post, here, but leave the electric field strength $E$ and the magnetic induction field $B$ unaltered, and thus have no experimentally observable consequences. However by the property of the Dirac delta we observed above, we can write $H_{\mathrm{{{int}}.}}$, above, as:

\begin{displaymath}H_{\mathrm{{{int}}.}} = \int V \left( x \right) 
\sum_I q_I \delta^3 \left( x - x_I \right) \mathrm{d}^3 x. 
\end{displaymath}

So when Maxwell's equation summarizing Coulomb's law, as above, is satisfied, we have:

\begin{displaymath}H_{\mathrm{{{int}}.}} = - \int V \left( x \right) \epsilon_0 \sum_a \frac{\partial}{\partial x_a} \left( \frac{\partial V}{\partial x_a} + \frac{\partial A_a}{\partial t} \right) \mathrm{d}^3 x. \end{displaymath}

When the number of electrically charged point particles is finite, $V$ and $A$ tend to 0 when any of the position coordinates $x_a$ tend to $\pm \infty$, so by a calculation similar to the way we obtained the fourth line from the third line in the calculation above, we find:

\begin{displaymath}H_{\mathrm{{{int}}.}} = \int \epsilon_0 \sum_a \left( \frac{\partial V}{\partial x_a} + \frac{\partial A_a}{\partial t} \right) \frac{\partial V}{\partial x_a} \mathrm{d}^3 x. \end{displaymath}

Thus:

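\begin{displaymath}H_{\mathrm{e.m.}} + H_{\mathrm{{{int}}.}} = \int \! \frac{1}{2} \left( \epsilon_0 \sum_a \left( \frac{\partial V}{\partial x_a} + \frac{\partial A_a}{\partial t} \right)^2 + \frac{1}{\mu_0} \sum_{b, c} \frac{\partial A_c}{\partial x_b} \left( \frac{\partial A_c}{\partial x_b} - \frac{\partial A_b}{\partial x_c} \right) \right) \mathrm{d}^3 x \end{displaymath}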

\begin{displaymath}= \int \! \frac{1}{2} \left( \epsilon_0 \sum_a E_a^2 + \frac{1}{\mu_0} \sum_a B_a^2 \right) \mathrm{d}^3 x = \int \! \frac{1}{2} \sum_a \left( E_a D_a + B_a H_a \right) \mathrm{d}^3 x, \end{displaymath}

where the second line here follows from reversing the steps that led from the first version of the formula for $S_{\mathrm{e.m.}}$, as above, to the final version, as above.
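The integration by parts that moved the derivative $\frac{\partial}{\partial x_a}$ off the bracket and onto $V$ in $H_{\mathrm{{{int}}.}}$ relies only on the fields tending to 0 at large distances. Here is a minimal one-dimensional sketch of that step, with invented stand-in functions: `v` plays the role of $V$, and `g` plays the role of one component of $\frac{\partial V}{\partial x_a} + \frac{\partial A_a}{\partial t}$. Neither function is taken from the derivation.

```python
import math

def v(x):
    # Stand-in for the voltage field V, tending to 0 at large |x|.
    return math.exp(-x * x)

def dv_dx(x):
    return -2.0 * x * math.exp(-x * x)

def g(x):
    # Stand-in for one component of dV/dx_a + dA_a/dt, also tending to 0.
    return x * math.exp(-0.5 * x * x)

def dg_dx(x):
    return (1.0 - x * x) * math.exp(-0.5 * x * x)

def integrate(func, a=-10.0, b=10.0, steps=4000):
    # Midpoint rule on [a, b]; the integrands are negligible outside it.
    h = (b - a) / steps
    return sum(func(a + (k + 0.5) * h) for k in range(steps)) * h

lhs = -integrate(lambda x: v(x) * dg_dx(x))  # - int V (dg/dx) dx
rhs = integrate(lambda x: g(x) * dv_dx(x))   #   int g (dV/dx) dx
```

The two numbers `lhs` and `rhs` agree to within rounding error, because the boundary term, which is the product of `v` and `g` at the ends of the range, is negligibly small.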

Thus when the Lagrange equation that follows from the application of de Maupertuis's principle to the action $S$, above, is satisfied, the Hamiltonian $H$, above, is equal to the gauge invariant Hamiltonian:

\begin{displaymath}H_{\mathrm{g.i.}} = T + \int \! \frac{1}{2} \sum_a \left( \epsilon_0 E_a^2 
+ \frac{1}{\mu_0} B^2_a \right) \mathrm{d}^3 x. \end{displaymath}

Thus since $\frac{\mathrm{d} H}{\mathrm{d} t} = 0$ when the Lagrange equation is satisfied, we also have $\frac{\mathrm{d} H_{\mathrm{g.i.}}}{\mathrm{d} t} = 0$ when the Lagrange equation is satisfied. $H_{\mathrm{g.i.}}$ is the manifestly gauge invariant formula for the total energy of a collection of electrically charged point particles moving slowly compared to the speed of light in a vacuum, and the electric field strength $E$ and the magnetic induction field $B$.
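To make the formula for $H_{\mathrm{g.i.}}$ concrete, here is a rough sketch that approximates the integral by a sum over small cells of equal volume, with $E$ and $B$ sampled once per cell. The function names, the list-based layout of the field samples, and the use of SI units are all assumptions made for this illustration.

```python
EPSILON_0 = 8.8541878128e-12  # permittivity of the vacuum, farads per metre
MU_0 = 1.25663706212e-6       # permeability of the vacuum, henries per metre

def kinetic_energy(masses, velocities):
    # T = sum over particles of (1/2) m |v|^2, for slowly moving particles.
    return sum(0.5 * m * sum(c * c for c in vel)
               for m, vel in zip(masses, velocities))

def field_energy(e_cells, b_cells, cell_volume):
    # Riemann-sum approximation to the integral of
    # (1/2) (epsilon_0 |E|^2 + |B|^2 / mu_0) over space.
    total = 0.0
    for e, b in zip(e_cells, b_cells):
        e2 = sum(c * c for c in e)
        b2 = sum(c * c for c in b)
        total += 0.5 * (EPSILON_0 * e2 + b2 / MU_0) * cell_volume
    return total

def h_gi(masses, velocities, e_cells, b_cells, cell_volume):
    return kinetic_energy(masses, velocities) + field_energy(
        e_cells, b_cells, cell_volume)

# Example: a uniform electric field of 1000 volts per metre filling one
# cubic metre, split into 8 cells, with no magnetic field and one particle.
e_cells = [(1000.0, 0.0, 0.0)] * 8
b_cells = [(0.0, 0.0, 0.0)] * 8
energy_fields = field_energy(e_cells, b_cells, 0.125)
energy_total = h_gi([2.0], [(3.0, 4.0, 0.0)], e_cells, b_cells, 0.125)
```

Only $E$ and $B$ enter the calculation, so the result is the same whichever gauge is used for $V$ and $A$.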

In the next part of this post, Radiation in an Oven, we'll use the above formula for $H_{\mathrm{g.i.}}$ to look at how the discoveries about heat and temperature that we looked at in the second part of this post, combined with the discoveries about electromagnetic radiation that we looked at in the third part of the post, lead to a seriously wrong conclusion about the properties of electromagnetic radiation in a hot oven. In the subsequent parts of the post, we'll look at how that problem has been resolved by the discovery of quantum mechanics and Feynman's functional integral, which started with the identification of a new fundamental constant of nature by Max Planck, in 1899.


Page last updated 29 November 2020. Copyright (c) Chris Austin 2012 - 2020.