
By Chris Austin. 30 November 2020.

*An earlier version of this post was published on another website on 1 January 2013.*

This is the fifth part of a ten-part post on the foundation of our understanding of high energy physics, which is Richard Feynman's functional integral. The first four parts are Action, Multiple Molecules, Electromagnetism, and Action For Fields, and the following parts, which will appear at intervals of about a month, are Matrix Multiplication, The Functional Integral, Gauge Invariance, Photons, and Interactions.

I'm hoping this blog will be fun and useful for everyone with an interest in science, so although I'll pop up a few formulae again, I'll try as usual to keep them friendly by explaining all the pieces. Please feel free to ask a question in the Comments, if you think anything in the post is unclear.

The clue that led to the discovery of quantum mechanics, whose principles are summarized in Feynman's functional integral, came from the attempted application to electromagnetic radiation of discoveries about heat and temperature. We looked at those discoveries about heat and temperature in the second part of the post, and in the third part of the post, we looked at how James Clerk Maxwell, just after the middle of the nineteenth century, was able to identify light as waves of oscillating electric and magnetic fields, and to calculate the speed of light from measurements of electrical and magnetic effects. Feynman's functional integral for a physical system depends on a property of the system called its action, and in the first part of the post, we derived Sir Isaac Newton's second law of motion from Pierre-Louis de Maupertuis's principle of stationary action. In the fourth part of the post, we calculated the energy of a system of electrically charged particles and electromagnetic fields, by deriving Maxwell's equations for the electromagnetic fields, and the forces exerted on the charged particles by the fields, by de Maupertuis's principle from an action.

Today I would like to put all these pieces together, and show you how they lead to a seriously wrong conclusion about the properties of electromagnetic radiation in a hot oven. In the subsequent parts of the post, we'll look at how that problem has been resolved by the discovery of quantum mechanics and Feynman's functional integral, which started with the identification of a new fundamental constant of nature by Max Planck, in 1899.

Let's now consider the electromagnetic fields in a box-shaped oven whose sides are aligned with the Cartesian coordinate directions, and whose internal dimensions in the Cartesian coordinate directions are $L_1$, $L_2$, and $L_3$. We'll assume that the internal faces of the oven walls are perfectly reflecting, and that the oven is empty apart from the electromagnetic fields. We found in the third part of the post that for any vector $\mathbf{k}$, any angle $\theta$, and any vector $\mathbf{a}$ perpendicular to $\mathbf{k}$, which means that $\mathbf{a} \cdot \mathbf{k} = 0$, a solution of the equations for the electromagnetic field in a vacuum, with no electric charges or electric currents present, is given by:

where $V$ is the voltage field and $\mathbf{A}$ is the vector potential field, and the electric field strength $\mathbf{E}$ and the magnetic induction field $\mathbf{B}$ are expressed in terms of $V$ and $\mathbf{A}$ by the formulae we found in the third part of the post. We found in the third part of the post that $1/\sqrt{\epsilon_0 \mu_0}$ is equal to the speed of light, $2.998 \times 10^8$ metres per second in a vacuum, so I'll now write $c$ instead of $1/\sqrt{\epsilon_0 \mu_0}$.

From calculations similar to the ones in the third part of the post, here, which confirmed that a wave of the above form satisfies the gauge condition on and in the third part of the post, here, and Maxwell's equations in a vacuum in terms of and assuming that gauge condition, as in the third part of the post, here, with no electric charges or electric currents present, we find that an arbitrary sum of waves of the above form also satisfies those equations. We'll assume that the electromagnetic fields inside the oven consist of a sum of waves of the above form, so that inside the oven.

Maxwell's equation summarizing Faraday's measurements involving time-dependent magnetic fields, as in the third part of the post, shows that the components of $\mathbf{E}$ tangential to an oven wall must be continuous at the oven wall, for if a component of $\mathbf{E}$ tangential to the oven wall changed discontinuously at the oven wall, the component of $\partial \mathbf{B}/\partial t$ in the perpendicular direction along the oven wall would have to be infinite at the oven wall. The assumption that the oven walls are perfectly reflecting means that $\mathbf{E}$ and $\mathbf{B}$ are 0 inside the material of the oven walls, and we'll assume that $V$ and $\mathbf{A}$ are also 0 inside the material of the oven walls. So from the formula for $\mathbf{E}$ in terms of $V$ and $\mathbf{A}$, as in the third part of the post, the tangential components of $\mathbf{A}$ must be continuous at the oven wall, and are thus 0 at the oven wall.

The requirement that the tangential components of $\mathbf{A}$ must be 0 at each wall of the oven is called a boundary condition, and restricts the possible wave vectors $\mathbf{k}$ of the electromagnetic waves inside the oven. We'll assume that the interior faces of the oven walls perpendicular to the Cartesian coordinate direction $i$ are at $x_i = 0$ and $x_i = L_i$, for $i = 1, 2, 3$. A single wave of the above form with a nonzero polarization vector $\mathbf{a}$ and a nonzero wave vector $\mathbf{k}$ cannot satisfy the boundary condition on any wall of the oven by itself, so if there is a wave present with particular values of $\mathbf{k}$, $\mathbf{a}$, and $\theta$, there must also be other waves present with different values of $\mathbf{k}$, $\mathbf{a}$, or $\theta$, such that the sum of these waves satisfies the boundary conditions.

Considering first the boundary at , if there is a wave present with

with nonzero and , then there must also be other waves present with different values of , , or , such that the sum of over these waves is 0 for all values of , and all values of in the range , and all values of in the range . Any waves present relevant to satisfying the boundary condition at will have the same values of , , and , and we can also require them to have the same value of , since from the first part of the post, here, the value of is unaltered by adding a whole number times to . Thus since , the only other relevant value of is . Thus to satisfy the boundary condition at , any wave present as above must be paired with another wave with the opposite value of , such that the sum of the two waves has the form:

since the 2 and 3 components of this are 0 at , and the polarization vector of the second wave is perpendicular to the wave vector of the second wave.

Considering, next, the boundary condition at , this must also be satisfied by the above sum of two waves, since any relevant waves have the same values of , , and , and can be chosen to have the same value of . Thus we require , , and to be such that the 2 and 3 components of the above sum of two waves are also 0 at , for all values of , and all values of in the range , and all values of in the range . To determine the values of , , and that satisfy this requirement, it is helpful to know about a formula that expresses in terms of , , , and , for arbitrary angles and .

From the definition of and , as in the first part of the post, here, are the Cartesian coordinates, in the 2-dimensional plane of Euclidean geometry, of a point that is moving along a circle of radius 1 centred at the point with Cartesian coordinates , such that the angle between the straight line from to and the straight line from to is , and is at . We'll now use the straight line from to as the first coordinate direction of a second set of Cartesian coordinates also centred at , such that the coordinate directions of the second set of Cartesian coordinates rotate into the original set as tends to 0. From the definition of Cartesian coordinates, as in the first part of the post, here, the second coordinate direction of the second set of Cartesian coordinates must be perpendicular to the first coordinate direction of the second set, so by the discussion in the third part of the post, here, it satisfies , whose solution of length 1 is . This is required to become as tends to 0, so the required solution is . A point whose coordinates are with respect to the first set of coordinates has coordinates with respect to the second set of coordinates, so we have:

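Thus we have:

$$\cos(\alpha + \beta) = \cos\alpha \cos\beta - \sin\alpha \sin\beta$$

and also:

$$\sin(\alpha + \beta) = \sin\alpha \cos\beta + \cos\alpha \sin\beta$$

for all angles $\alpha$ and all angles $\beta$.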
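If you'd like to check formulae of this kind numerically, here is a minimal Python sketch. It assumes the two formulae in question are the standard angle-addition formulae for the cosine and sine of a sum of two angles; the function names and the sample angles are made up for the illustration.

```python
import math

def cos_of_sum(alpha, beta):
    """cos(alpha + beta) built from the cosines and sines of the separate angles."""
    return math.cos(alpha) * math.cos(beta) - math.sin(alpha) * math.sin(beta)

def sin_of_sum(alpha, beta):
    """sin(alpha + beta) built from the cosines and sines of the separate angles."""
    return math.sin(alpha) * math.cos(beta) + math.cos(alpha) * math.sin(beta)

# Arbitrary sample angles in radians (hypothetical test values).
angles = [0.0, 0.3, 1.0, 2.5, -1.7]

# Largest disagreement between the formulae and a direct evaluation.
max_error = max(
    max(abs(cos_of_sum(a, b) - math.cos(a + b)),
        abs(sin_of_sum(a, b) - math.sin(a + b)))
    for a in angles
    for b in angles
)
print(max_error)  # only floating-point rounding error remains
```

A check like this is not a proof, of course, but it is a quick way to catch a sign error.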

The above sum of two waves is therefore equal to:

where I have defined the Greek letter to be . We observe from the definition of and , as in the first part of the post, here, that:

and:

for all , so the above sum of two waves is:

Thus the requirement that the 2 and 3 components of the sum of the two waves are 0 at , for all values of , and all values of in the range , and all values of in the range , or equivalently, at , for all values of , means that either , or . If , then and means that , while if , then from the definition of , as in the first part of the post, here, we have , for some whole number .
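As a small illustration of this restriction, here is a Python sketch. It assumes the condition takes the standard form $\sin(k_1 L_1) = 0$, so that the allowed values are $k_1 = \pi n_1 / L_1$ for a whole number $n_1$; the oven dimension of 0.5 metres and the function name are hypothetical choices for the example.

```python
import math

def allowed_wave_numbers(length, n_max):
    """Wave-vector components k = pi * n / length for n = 0, 1, ..., n_max."""
    return [math.pi * n / length for n in range(n_max + 1)]

L1 = 0.5  # hypothetical internal dimension of the oven, in metres
k_values = allowed_wave_numbers(L1, 5)

# For every allowed k, sin(k * L1) vanishes at the far wall, up to rounding.
residuals = [abs(math.sin(k * L1)) for k in k_values]

# A value halfway between two allowed ones does not satisfy the condition.
k_between = 1.5 * math.pi / L1
off_grid = abs(math.sin(k_between * L1))

print(k_values)
print(max(residuals), off_grid)
```

The allowed values are equally spaced, with spacing $\pi / L_1$, so a larger oven has more closely spaced modes.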

Considering the boundary conditions at the oven walls perpendicular to the 2 and 3 coordinate directions in the same way, we therefore find that if there is a wave present as above with nonzero and such that , then it must be part of a sum of waves of the form:

and the components of must be of the form , for some whole numbers , . Doing the sums over the signs , , and using the formulae above, we find:

From the definition of and , as in the first part of the post, here, we observe that , for all . Thus from the formula above, we have:

for all and all . Thus if we display the -dependence of the above sum of waves by representing it as , we have:

Thus for arbitrary , the above sum of waves is equal to a sum of waves of the above form with , plus a sum of waves of the above form with .

From the formulae in the third part of the post, here, the electric field strength and the magnetic induction field for the above sum of waves are:

When the oven is hot, we can expect that it will contain electromagnetic radiation in these possible modes of oscillation. Let's now consider a sum of the above possible modes of electromagnetic radiation in the oven for all the possible values of , as above, and the two independent values 0 and of . There is an independent polarization vector for each pair of a possible value of and one of the two independent values 0 and of , and I'll write this as , where are whole numbers such that , for . The symbol means, "greater than or equal to." The polarization vector satisfies . The 1 component of the electric field strength is now:

where means , and the other components of and the components of are now analogous sums of the components above.

From the gauge-invariant formula for the Hamiltonian for a collection of electrically charged point particles moving slowly compared to the speed of light in a vacuum, plus electric and magnetic fields, which we found in the fourth part of the post, here, the energy of the electromagnetic fields in the oven involves the integrals over the volume of the oven of the squares of the components of and . To calculate the contribution of , where is equal to a sum over the independent modes as above, it is convenient to use independent dummy indexes and for each of the two factors of , so that can be written as:

where stand for the remaining factors, and , for .

Considering a term in the above sum with specific values of , , , and , the integral over the volume of the oven factorizes into a product of integrals of the form or , there being one such factor for each of the three possible values of . From the result we found above, and the observations above, we have:

for all and all . Thus we have:

We now observe that if is a fixed number, and is a quantity that depends smoothly on a quantity , so in the terminology of the fourth part of the post, here, is a smooth function of , then:

So from the result we found in the first part of the post, here, we have:

Thus from the result we found in the first part of the post, here, that the integral of the rate of change of a quantity is equal to the net change of that quantity, we have:

if is a whole number , while for we have . Thus from the result above, we find that for whole numbers and :

And similarly:

Thus after doing the integral over the volume of the oven, the contribution from to the gauge-invariant formula for the total energy, , as in the fourth part of the post, here, is:

where now represents the volume of the oven, and . The expression is 1 if and 0 otherwise, in accordance with the definition of the Kronecker delta, in the first part of the post, here. And in the same way, we find that when is expressed as a sum over , and , analogous to the expression for above, the contributions from to with give 0 after doing the integral over the volume of the oven.
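The integrals over one side of the oven that make different modes drop out of the energy can be checked numerically. The Python sketch below assumes they are the standard integrals of products of $\sin(n \pi x / L)$ or $\cos(n \pi x / L)$ over the range from 0 to $L$; the function name, the side length, and the mode numbers are hypothetical.

```python
import math

def overlap(n, m, length, kind="sin", steps=20000):
    """Midpoint-rule estimate of the integral from 0 to length of
    f(n*pi*x/length) * f(m*pi*x/length), where f is sin or cos."""
    f = math.sin if kind == "sin" else math.cos
    dx = length / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx  # midpoint of the i-th slice
        total += f(n * math.pi * x / length) * f(m * math.pi * x / length)
    return total * dx

L = 2.0  # hypothetical side length, in metres
same = overlap(3, 3, L)                   # equal mode numbers: close to L / 2
different = overlap(3, 5, L)              # different mode numbers: close to 0
zero_mode = overlap(0, 0, L, kind="cos")  # n = 0 cosine: close to L

print(same, different, zero_mode)
```

The pattern is the one a Kronecker delta expresses: only pairs of factors with the same mode number survive the integration.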

We'll focus now on the contributions of modes with for all . From the formula above and the analogous formulae for the contributions from and , the sum of the contributions from , , and , for the modes with for all , is:

And from the formulae for the components of , above, the sum of the contributions from , , and , to , as in the fourth part of the post, here, for the modes with for all , is:

since, for example, , from the definition of the antisymmetric tensor , in the third part of the post, here. From the formula we found in the third part of the post, here, with the indexes and rewritten as and , and the dummy index rewritten as , and the property of the Kronecker delta we observed in the third part of the post, here, the above expression is equal to:

since .

We found in the third part of the post, here, that , the speed of light in a vacuum, is equal to , so the factor in the contribution from is equal to the factor in the contribution from . Thus for each , our observation above implies that the total contribution from terms where and , or vice versa, is proportional to:

where at the second step I used the observation that for all , which follows from the definition of , as in the first part of the post, here.

Thus since for all , the total contribution of the modes with for all to the energy of the electromagnetic radiation in the oven is:

where I used the formula for above.

We now observe that the arguments that led to the Boltzmann distribution, as in the second part of the post, for the most likely number of objects of a given type in a given position and momentum bin, when the range of possible positions and momenta of the microscopic objects in a system in thermal equilibrium at absolute temperature $T$ is divided up into tiny bins of equal size, can be adapted to the electromagnetic radiation subject to Maxwell's equations in a hot oven, for radiation of wavelengths very small compared to the dimensions $L_1$, $L_2$, and $L_3$ of the oven, in the following way.

We'll consider radiation modes whose wavelengths are sufficiently small compared to the dimensions of the oven that we can group the modes, described by their wave-number vector and or , into "types" of similar and equal , such that the number of modes of each type is large compared to 1, and the relative differences , , and , between the wave-number vectors and of any two modes of the same type are small compared to 1. For example for infra-red radiation of wavelength about metres in an oven of size about a metre cubed, we can assume that , , and are all at least about , and divide up the range of each into ranges to ; to ; and so on, and say that two wave-number vectors and are of the same type if is in the same range as , for , and 3. Then the number of wave-number vectors of each type is , and for and of the same type, is , for , and 3. This is consistent with the definition of the type of an object that I gave in the second part of the post, here, in the course of the derivation of the Boltzmann distribution.

Looking back at the derivation of the Boltzmann distribution, in the second part of the post, starting here, we observe that it depended on the assumption that the number of objects of each type is very large compared to 1, and that requirement will be satisfied for the radiation modes in the oven if we regard the modes of wave-number vector such that for , and 3 as "objects", and classify them into types as I just described.

The derivation of the Boltzmann distribution also depended on the conserved total energy being equal to the sum of the energies of the individual objects, so that the range of possible positions and momenta of an object could be divided into very small bins, such that the conserved total energy is to a very good approximation equal to a sum , as in the second part of the post, here, where is the number of objects of type in bin , and is the energy of an object of type at the centre of bin . However the derivation did not depend on any particular details of how depends on the type of object or the position and momentum of an object at the centre of bin , and it did not depend on the numbers associated with an object, that determine the contribution of that object to the total energy , being the position and momentum coordinates of the object, as opposed to some other quantities associated to the object that determine its contribution to .

Thus from the formula above for the contribution of the modes with for all to the energy of the electromagnetic radiation in the oven, which shows that the energy is the sum of a contribution from each mode, the argument works equally well if we regard the modes of wave-number vector such that for , and 3 as objects, and classify them into types as I described above, where the variable quantities associated with each object, that determine the contribution of that object to , are the components of the polarization vector of that mode. Two of the three components of can vary independently for each mode, due to the restriction that , as above.

The derivation of the Boltzmann distribution depended on the range of possible values of the position and momentum coordinates of an object being divided into bins of equal size, but although I used the same bins, for convenience, for each different type of object, the derivation did not depend on that, and neither the derivation nor the result, as in the second part of the post, here, depended on the actual sizes of the bins, other than through the requirements that they be small enough that the energies of all the objects of type in bin are equal to to a good accuracy, and that they be large enough, or that the total number of objects of type be large enough, that the total number of objects in each bin, or at least, in the bins with the largest numbers of objects of type , should be large compared to 1. The derivation of the formula , in the second part of the post, starting here, was carried out separately for each different type of object , and the value of for each different type of object, as in the formula in the second part of the post, here, or in the example of the ideal gas, which we studied in the second part of the post, starting here, automatically compensates for any change of the bin size for each different type of object.

From the formula above, the contribution of a mode to the energy of the electromagnetic radiation in the oven is proportional to the sum of the squares of the components of the polarization vector of the mode, so from the analogy to the kinetic energy of a particle, which is proportional to the sum of the squares of the components of the momentum vector of the particle, I shall assume that it is the range of possible polarization vectors of a mode that should be divided into equal size bins. The modes are grouped into types of approximately equal wave-number vector and equal as above, and for a type whose wave-number is , we'll choose two vectors and of length 1 that are perpendicular to and perpendicular to each other. We'll represent the components of the polarization vector in the directions and by and , so from the formula in the third part of the post, here:

for . The mutually perpendicular vectors , , and are the vectors of length 1 in the coordinate directions of an alternative system of Cartesian coordinates, so by Pythagoras:

since .

Thus from the formula above, the energy of a mode of type with independent polarization vector components and is:

So in a similar manner to the calculation for an ideal gas, in the second part of the post, starting here, we find from the result in the second part of the post, here, that the most likely number of modes of type in a polarization bin with edge sizes and is:

where is the total number of modes of type . From the result above, and the results we obtained in the second part of the post, here, and here, we find:

so:

In the same way as for the ideal gas example, in the second part of the post, here, we'll assume now that this most likely number of modes of type in each polarization bin is the actual number of modes in each polarization bin. So the total energy of the modes of type is:

From the result in the second part of the post, here, with , , this is equal to:

And from the calculation in the second part of the post, starting here, this is equal to:

Thus in thermal equilibrium at absolute temperature $T$, each of the modes of type $i$ has energy $k_B T$, where $k_B$ is Boltzmann's constant. Since this is independent of $\mathbf{k}$ and $\theta$, we thus find that for every mode for which all three components of $\mathbf{k}$ are sufficiently large, the energy of that mode, in thermal equilibrium at absolute temperature $T$, is $k_B T$. From the discussion above, this applies, in particular, for electromagnetic radiation of infra-red and all shorter wavelengths in an oven of size about a metre cubed or larger, for almost all directions of the wave-number vector $\mathbf{k}$.
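This result can be checked with a short Python sketch. It assumes that the energy of a mode has the form $C (a_1^2 + a_2^2)$, where $a_1$ and $a_2$ are the two independent polarization components and $C$ is a hypothetical stand-in for the mode-dependent prefactor, and that the number of modes in each equal-size bin is proportional to the Boltzmann factor. The function and variable names are made up for the example.

```python
import math

def mean_mode_energy(kT, C=1.0, steps=400):
    """Boltzmann-weighted mean of E = C * (a1**2 + a2**2) over equal-size bins.

    kT is Boltzmann's constant times the absolute temperature, and C is a
    hypothetical stand-in for the mode-dependent prefactor in the energy.
    """
    a_max = 12.0 * math.sqrt(kT / C)  # beyond this the Boltzmann factor is negligible
    da = 2.0 * a_max / steps          # edge length of each square bin
    weighted_energy = 0.0
    total_weight = 0.0
    for i in range(steps):
        a1 = -a_max + (i + 0.5) * da
        for j in range(steps):
            a2 = -a_max + (j + 0.5) * da
            energy = C * (a1 * a1 + a2 * a2)
            weight = math.exp(-energy / kT)
            weighted_energy += energy * weight
            total_weight += weight
    return weighted_energy / total_weight

print(mean_mode_energy(kT=2.5))          # close to 2.5
print(mean_mode_energy(kT=2.5, C=40.0))  # still close to 2.5
```

The mean energy comes out equal to the value of `kT` whatever value of `C` is used, which is why the result does not depend on the wave-number vector of the mode.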

Thus since there is no upper limit to the size of the components of the wave-number vector $\mathbf{k}$, we have arrived at a result in contradiction with everyday experience: if the absolute temperature $T$ of the oven is greater than 0, then the energy of the electromagnetic radiation in the oven is infinite, because there are an infinite number of modes, each of which has energy $k_B T$.

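At the end of the nineteenth century, the actual energy of the electromagnetic radiation in different frequency ranges in hot ovens was measured by Otto Lummer, Ferdinand Kurlbaum, and Heinrich Rubens. From the formulae for the electric and magnetic fields in the waves, above, the angular frequency $\omega$, where $\omega$ is the Greek letter omega, which is $2\pi$ times the number of cycles per unit time, is $\omega = c |\mathbf{k}|$, where the wavevector is $\mathbf{k} = (\pi n_1 / L_1, \pi n_2 / L_2, \pi n_3 / L_3)$, from above, and the speed of light is $c = 1/\sqrt{\epsilon_0 \mu_0}$, from the third part of the post. Each wave-number vector corresponds to 2 modes, one for each of the two independent values of $\theta$, so the number of modes per unit volume in the space of wavevectors is $2 L_1 L_2 L_3 / \pi^3 = 2V / \pi^3$, where $V = L_1 L_2 L_3$ is the volume of the oven. Each component of $\mathbf{k}$ is $\geq 0$, so from the value $4\pi$ we found in the third part of the post for the area of a sphere of radius 1, the total number of modes whose wavevector magnitude lies in the range from $|\mathbf{k}|$ to $|\mathbf{k}| + d|\mathbf{k}|$ is:

$$\frac{1}{8} \times 4\pi |\mathbf{k}|^2 \, d|\mathbf{k}| \times \frac{2V}{\pi^3} = \frac{V |\mathbf{k}|^2 \, d|\mathbf{k}|}{\pi^2}$$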

Thus the total number of modes whose angular frequency lies in the range from $\omega$ to $\omega + d\omega$ is $V \omega^2 \, d\omega / (\pi^2 c^3)$, so since each of these modes has energy $k_B T$ according to the result above, the total energy of electromagnetic radiation in the oven in modes whose angular frequency lies in the range from $\omega$ to $\omega + d\omega$ is:

$$\frac{V k_B T \, \omega^2 \, d\omega}{\pi^2 c^3}$$

This is called the Rayleigh-Jeans law, after Lord Rayleigh and Sir James Jeans.
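The mode counting behind this law can be tested by brute force. The Python sketch below assumes a cubical oven of side $L$, wavevector components of the form $\pi n_i / L$ with whole numbers $n_i \geq 1$, and 2 modes per wavevector, and compares a direct count with the smooth estimate obtained by integrating $V |\mathbf{k}|^2 \, d|\mathbf{k}| / \pi^2$ over a range of wavevector magnitudes. The function names and the chosen range are hypothetical.

```python
import math

def count_modes(L, k_low, k_high):
    """Count modes (2 per wavevector) with k_low <= |k| < k_high in a cube of side L,
    where each wavevector component is pi * n / L for a whole number n >= 1."""
    n_max = int(k_high * L / math.pi) + 1
    count = 0
    for n1 in range(1, n_max + 1):
        for n2 in range(1, n_max + 1):
            for n3 in range(1, n_max + 1):
                k = math.pi * math.sqrt(n1 * n1 + n2 * n2 + n3 * n3) / L
                if k_low <= k < k_high:
                    count += 2
    return count

def estimated_modes(L, k_low, k_high):
    """Integral of V * k**2 / pi**2 from k_low to k_high, with V = L**3."""
    return L**3 * (k_high**3 - k_low**3) / (3 * math.pi**2)

L = 1.0  # hypothetical side of the oven, in metres
k_low, k_high = 60 * math.pi, 65 * math.pi
counted = count_modes(L, k_low, k_high)
estimate = estimated_modes(L, k_low, k_high)
ratio = counted / estimate
print(counted, estimate, ratio)
```

The direct count comes out a few percent below the smooth estimate, because modes with a component of the wavevector equal to 0 are left out of the count, and the relative difference shrinks as the wavelength becomes smaller compared to the size of the oven.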

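Max Planck discovered in 1900 that the measurements of Lummer, Kurlbaum, and Rubens are instead represented accurately by a formula:

$$\frac{V \hbar \, \omega^3 \, d\omega}{\pi^2 c^3 \left( e^{\hbar \omega / (k_B T)} - 1 \right)}$$

where $\hbar$, which is pronounced "h bar", is a fundamental constant of nature that was previously unknown, whose value is:

$$\hbar = 1.0546 \times 10^{-34} \ \mathrm{joule\ seconds}$$

and $e$ is Napier's number, as in the second part of the post.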

From the formula in the second part of the post, here, the factor tends to as tends to 0, and is approximately equal to for all such that , so the Rayleigh-Jeans law, as above, is approximately valid for all such that , and becomes accurately valid when is substantially smaller than . However at such that , the energy in modes whose angular frequency lies in the range from to stops growing with as predicted by the Rayleigh-Jeans law, and at larger it decreases very rapidly with increasing , in complete contradiction with the Rayleigh-Jeans law.

If we define $x = \hbar \omega / (k_B T)$, then the Planck law, above, becomes $\frac{V (k_B T)^4}{\pi^2 c^3 \hbar^3} \frac{x^3 \, dx}{e^x - 1}$, and the Rayleigh-Jeans law, above, becomes $\frac{V (k_B T)^4}{\pi^2 c^3 \hbar^3} x^2 \, dx$. This graph shows the Planck factor, $x^3 / (e^x - 1)$, which is in agreement with observation, plotted in blue, and the Rayleigh-Jeans factor, $x^2$, plotted in red. We see that the Rayleigh-Jeans factor is already too large by about a factor of 2 at $x \approx 1.26$, and only agrees well with the correct Planck factor for $x$ less than about 0.4.
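The comparison between the two factors is easy to reproduce. The Python sketch below assumes that the Planck factor is $x^3 / (e^x - 1)$ and the Rayleigh-Jeans factor is $x^2$, with $x$ equal to $\hbar \omega$ divided by Boltzmann's constant times the absolute temperature; the function names and the bisection helper are made up for the example.

```python
import math

def planck_factor(x):
    """x**3 / (e**x - 1); expm1 keeps the result accurate for small x."""
    return x**3 / math.expm1(x)

def rayleigh_jeans_factor(x):
    """The classical prediction, x**2."""
    return x**2

def overestimate(x):
    """Ratio of the Rayleigh-Jeans factor to the Planck factor, (e**x - 1) / x."""
    return rayleigh_jeans_factor(x) / planck_factor(x)

def solve_overestimate(target, low=1e-6, high=10.0, iterations=60):
    """Bisection for the x at which the ratio equals target (the ratio grows with x)."""
    for _ in range(iterations):
        mid = 0.5 * (low + high)
        if overestimate(mid) < target:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)

x_double = solve_overestimate(2.0)
print(x_double)           # roughly 1.26
print(overestimate(0.4))  # roughly 1.23
print(overestimate(5.0))  # the classical factor is far too large here
```

The ratio keeps growing with $x$, roughly like $e^x / x$ at large $x$, which is the numerical face of the infinite total energy predicted by the Rayleigh-Jeans law.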

The quantity $h = 2\pi\hbar$ is known as Planck's constant. The dimensions of $\hbar$ are energy times time, which are the dimensions of de Maupertuis's action. Planck suggested that energy could be transferred between the oven walls and the electromagnetic radiation of angular frequency $\omega$ in the oven only in whole number multiples of a basic amount $\hbar\omega$. This was the start of the discovery of quantum mechanics and Feynman's functional integral, which among other things, has made possible the design and construction of the computer on which you are reading this blog post.

In the next part of this post, Matrix Multiplication, I would like to provide you with a little bit of background knowledge that is helpful for understanding and using Feynman's functional integral, and in the part after that, The Functional Integral, we'll take a first look at the functional integral. We'll see how the functional integral leads to de Maupertuis's principle, Newton's laws, and Maxwell's equations, in the circumstances where those results are valid, and in the final parts of the post, we'll see how the functional integral leads to Planck's law, as above, and we'll take a first look at how, with a suitable identification of the relevant fields, and their action, the functional integral is used to predict the results of experiments such as those being carried out using the Large Hadron Collider at CERN, the European center for high energy physics near Geneva.

The software on this website is licensed for use under the Free Software Foundation General Public License.

Page last updated 30 November 2020. Copyright (c) Chris Austin 2012 - 2020.