By Chris Austin. 30 November 2020.
An earlier version of this post was published on another website on 1 January 2013.
This is the fifth part of a ten-part post on the foundation of our understanding of high energy physics, which is Richard Feynman's functional integral. The first four parts are Action, Multiple Molecules, Electromagnetism, and Action For Fields, and the following parts, which will appear at intervals of about a month, are Matrix Multiplication, The Functional Integral, Gauge Invariance, Photons, and Interactions.
I'm hoping this blog will be fun and useful for everyone with an interest in science, so although I'll pop up a few formulae again, I'll try as usual to keep them friendly by explaining all the pieces. Please feel free to ask a question in the Comments, if you think anything in the post is unclear.
The clue that led to the discovery of quantum mechanics, whose principles are summarized in Feynman's functional integral, came from the attempted application to electromagnetic radiation of discoveries about heat and temperature. We looked at those discoveries about heat and temperature in the second part of the post, and in the third part of the post, we looked at how James Clerk Maxwell, just after the middle of the nineteenth century, was able to identify light as waves of oscillating electric and magnetic fields, and to calculate the speed of light from measurements of electrical and magnetic effects. Feynman's functional integral for a physical system depends on a property of the system called its action, and in the first part of the post, we derived Sir Isaac Newton's second law of motion from Pierre-Louis de Maupertuis's principle of stationary action. In the fourth part of the post, we calculated the energy of a system of electrically charged particles and electromagnetic fields, by deriving Maxwell's equations for the electromagnetic fields, and the forces exerted on the charged particles by the fields, by de Maupertuis's principle from an action.
Today I would like to put all these pieces together, and show you how they lead to a seriously wrong conclusion about the properties of electromagnetic radiation in a hot oven. In the subsequent parts of the post, we'll look at how that problem has been resolved by the discovery of quantum mechanics and Feynman's functional integral, which started with the identification of a new fundamental constant of nature by Max Planck, in 1899.
Let's now consider the electromagnetic fields in a box-shaped oven whose sides
are aligned with the Cartesian coordinate directions, and whose internal
dimensions in the Cartesian coordinate directions are $L_1$, $L_2$, and $L_3$.
We'll assume that the internal faces of the oven walls are perfectly
reflecting, and that the oven is empty apart from the electromagnetic fields.
We found in the third part of the post,
here,
that for any vector
, any angle
, and any vector
perpendicular to
, which means that
, a solution of
the equations for the electromagnetic field in a vacuum, with no electric
charges or electric currents present, is given by:
where $V$ is the voltage field and $A$ is the vector potential field, and the
electric field strength $E$ and the magnetic induction field $B$ are expressed
in terms of $V$ and $A$ by the formulae we found in the third part
of the post, here.
We found in the third part of the post, here, that
$\frac{1}{\sqrt{\epsilon_0 \mu_0}}$ is equal to the speed of light
$c = 299792458$ metres per second in a vacuum, so I'll now write $c$
instead of $\frac{1}{\sqrt{\epsilon_0 \mu_0}}$.
From calculations similar to the ones in the third part of the post,
here, which confirmed that a wave of
the above form satisfies the gauge condition on and
in the third part of the post,
here, and
Maxwell's equations in a vacuum in terms of
and
assuming that gauge
condition, as in the third part of the post,
here,
with no electric charges or electric currents present, we
find that an arbitrary sum of waves of the above form also satisfies those
equations. We'll assume that the electromagnetic fields inside the oven
consist of a sum of waves of the above form, so that
inside the oven.
Maxwell's equation summarizing Faraday's measurements involving time-dependent
magnetic fields, as in the third part of the post, here,
shows that the components of the electric field strength $E$ tangential to an
oven wall must be continuous at the oven wall, for if a component of $E$
tangential to the oven wall changed discontinuously at the oven wall, the
component of $\frac{\partial B}{\partial t}$ in the perpendicular direction
along the oven wall would have to be infinite at the oven wall. The
assumption that the oven walls are perfectly reflecting means that $E$
and $B$ are 0 inside the material of the oven walls, and we'll assume that $V$
and $A$ are also 0 inside the material of the oven walls. So from the formula for
$E$ in terms of $V$ and $A$, as in the third part of the post, here,
the tangential components of $A$ must be
continuous at the oven wall, and are thus 0 at the oven wall.
The requirement that the tangential components of must be 0 at each wall
of the oven is called a boundary condition, and restricts the possible wave
vectors
of the electromagnetic waves inside the oven. We'll assume that
the interior faces of the oven walls perpendicular to the
Cartesian
coordinate direction are at
and
, for
. A
single wave of the
above
form with a nonzero polarization vector
and a
nonzero wave vector
cannot satisfy the boundary condition on any wall of
the oven by itself, so if there is a wave present with particular values of
,
, and
, there must also be other waves present with different
values of
,
, or
, such that the sum of these waves satisfies
the boundary conditions.
Considering first the boundary at , if there is a wave present
with nonzero and
, then there must also be other waves present with
different values of
,
, or
, such that the sum of
over these waves is 0 for all values of
, and all values of
in the range
, and all values of
in the range
. Any waves present relevant to satisfying the boundary
condition at
will have the same values of
,
, and
, and we can also require them to have the same value of
,
since from the first part of the post,
here, the value of
is unaltered by adding a
whole number times
to
. Thus since
, the only other relevant value of
is
. Thus to satisfy the boundary condition at
, any wave present
as above must be paired with another wave with the opposite value of
,
such that the sum of the two waves has the form:
since the 2 and 3 components of this are 0 at , and the polarization
vector
of the second wave is perpendicular
to the wave vector
of the second wave.
Considering, next, the boundary condition at , this must also be
satisfied by the above sum of two waves, since any relevant waves have the
same values of
,
, and
, and can be chosen to have
the same value of
. Thus we require
,
, and
to be
such that the 2 and 3 components of the above sum of two waves are also 0 at
, for all values of
, and all values of
in the range
, and all values of
in the range
. To determine the values of
,
, and
that satisfy this
requirement, it is helpful to know about a formula that expresses
in terms of
,
,
, and
, for arbitrary angles
and
.
From the definition of
and
, as
in the first part of the post,
here,
are the
Cartesian coordinates, in the 2-dimensional plane of Euclidean geometry, of a
point that is moving along a circle of radius 1 centred at the point with
Cartesian coordinates
, such that the angle between the
straight line from
to
and the
straight line from
to
is
, and
is
at
. We'll now use the straight
line from
to
as the first
coordinate direction of a second set of Cartesian coordinates also centred at
, such that the coordinate directions of the second set
of Cartesian coordinates rotate into the original set as
tends to
0. From the definition of Cartesian coordinates, as in the first part of the post,
here, the second
coordinate direction
of the second set of Cartesian
coordinates must be perpendicular to the first coordinate direction of the
second set, so by the discussion in the third part of the post,
here, it satisfies
, whose
solution of length 1 is
. This is required to become
as
tends to 0, so
the required solution is
. A
point whose coordinates are
with
respect to the first set of coordinates has coordinates
with
respect to the second set of coordinates, so we have:
Thus we have:
and also:
for all angles and all angles
.
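The two angle-addition formulae used here can be checked numerically. The following is only a minimal sketch: the names `alpha` and `beta` are my own labels for the two arbitrary angles, and the formulae coded are the standard ones, cos(alpha + beta) = cos(alpha) cos(beta) - sin(alpha) sin(beta) and sin(alpha + beta) = sin(alpha) cos(beta) + cos(alpha) sin(beta).

```python
import math

def cos_sum(alpha, beta):
    """cos(alpha + beta) built from the sines and cosines of alpha and beta."""
    return math.cos(alpha) * math.cos(beta) - math.sin(alpha) * math.sin(beta)

def sin_sum(alpha, beta):
    """sin(alpha + beta) built from the sines and cosines of alpha and beta."""
    return math.sin(alpha) * math.cos(beta) + math.cos(alpha) * math.sin(beta)

def max_error(samples=50):
    """Largest mismatch with math.cos and math.sin over a grid of angle pairs."""
    worst = 0.0
    for i in range(samples):
        for j in range(samples):
            alpha = -math.pi + 2.0 * math.pi * i / samples
            beta = -math.pi + 2.0 * math.pi * j / samples
            worst = max(
                worst,
                abs(cos_sum(alpha, beta) - math.cos(alpha + beta)),
                abs(sin_sum(alpha, beta) - math.sin(alpha + beta)),
            )
    return worst

print(max_error())
```

The printed mismatch should be at the level of floating-point rounding, which is a sanity check on the formulae rather than a proof of them.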
The above sum of two waves is therefore equal to:
where I have defined the Greek letter to be
. We observe from the definition of
and
, as
in the first part of the post,
here, that:
and:
for all , so the above sum of two waves is:
Thus the requirement that the 2 and 3 components of the sum of the two waves
are 0 at , for all values of
, and all values of
in the
range
, and all values of
in the range
, or equivalently, at
, for all values of
, means
that either
, or
. If
, then
and
means that
, while if
, then from the
definition of
, as
in the first part of the post,
here, we have
, for some whole number
.
Considering the boundary conditions at the oven walls perpendicular to the 2
and 3 coordinate directions in the same way, we therefore find that if there
is a wave present as above with nonzero and
such that
, then it must be part of a sum of waves of the form:
and the components of must be of the form
, for
some whole numbers
,
. Doing the sums over the signs
,
, and
using the formulae
above, we find:
From the definition of
and
, as
in the first part of the post,
here, we observe that
, for all
. Thus from the formula
above, we have:
for all and all
. Thus if we display the
-dependence of the above sum of waves by representing it as
, we have:
Thus for arbitrary , the above sum of waves is equal to a sum of waves
of the above form with
, plus a sum of waves of the above form
with
.
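As a rough illustration of the boundary conditions just discussed, here is a small Python sketch. It assumes that each component of an allowed wave vector is a whole number times pi divided by the corresponding internal dimension of the oven, and that the relevant standing-wave factor in each direction has the form sin(k_i x_i). The oven dimensions and the function names are made up for the example.

```python
import math

def allowed_wave_vector(n, dims):
    """Wave vector with components n_i * pi / L_i, for whole numbers n_i."""
    return tuple(n_i * math.pi / L_i for n_i, L_i in zip(n, dims))

def vanishes_at_walls(k_i, L_i, tol=1e-9):
    """True if sin(k_i * x) is zero at both walls, x = 0 and x = L_i."""
    return abs(math.sin(0.0)) < tol and abs(math.sin(k_i * L_i)) < tol

dims = (1.0, 0.5, 0.8)  # hypothetical oven dimensions in metres
k = allowed_wave_vector((3, 1, 2), dims)

# Every component of an allowed wave vector satisfies the wall condition.
print([vanishes_at_walls(k_i, L_i) for k_i, L_i in zip(k, dims)])

# A component that is not a whole number times pi / L_i fails at x = L_i.
print(vanishes_at_walls(2.5 * math.pi / dims[0], dims[0]))
```

The second check shows why only a discrete set of wave vectors survives: a factor with two and a half half-wavelengths across the oven is nonzero at the far wall.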
From the formulae in the third part of the post,
here, the electric field strength and the magnetic
induction field
for the above sum of waves are:
When the oven is hot, we can expect that it will contain electromagnetic
radiation in these possible modes of oscillation. Let's now consider a sum
of the above possible modes of electromagnetic radiation in the oven for all
the possible values of , as
above,
and the two independent values 0 and
of
. There is an independent polarization vector
for each pair
of a possible value of
and one of
the two independent values 0 and
of
, and I'll write
this as
, where $n_1$, $n_2$, $n_3$ are whole numbers
such that $n_i \geq 0$, for $i = 1, 2, 3$. The symbol $\geq$
means, "greater than or equal to." The polarization vector
satisfies
. The
1 component of the electric field strength
is now:
where
means
, and the other components of
and the components of
are now analogous sums of the components
above.
From the gauge-invariant formula for the
Hamiltonian
for a collection of
electrically charged point particles moving slowly compared to the speed of
light in a vacuum, plus electric and magnetic fields, which we found in the fourth
part of the post,
here, the energy of the
electromagnetic fields in the oven involves the
integrals
over the volume of
the oven of the squares of the components of and
. To calculate the
contribution of
, where
is equal to a sum over the independent
modes as above, it is convenient to use independent dummy indexes
and
for each of the two factors of
, so that
can be
written as:
where stand for the remaining factors, and
, for
.
Considering a term in the above sum with specific values of ,
,
, and
, the integral over the volume of the oven factorizes
into a product of integrals of the form
or
, there being one such factor for each of the three possible
values of
. From the result we found
above,
and the observations
above,
we have:
for all and all
.
Thus we have:
We now observe that if is a fixed number, and
is a quantity that
depends smoothly on a quantity
, so in the terminology of the fourth part
of the post,
here,
is a
smooth function of
, then:
So from the result we found in the first part of the post, here, we have:
Thus from the result we found in the first part of the post, here, that the integral of the rate of change of a quantity is equal to the net change of that quantity, we have:
if is a whole number
, while for
we have
. Thus from the result
above,
we find that for whole numbers
and
:
And similarly:
Thus after doing the integral over the volume of the oven, the contribution
from to the gauge-invariant formula for the total energy,
,
as in the fourth part of the post,
here, is:
where $V$ now represents the volume $L_1 L_2 L_3$
of the oven, and
.
The expression
is 1 if
and 0 otherwise, in accordance with
the definition of the Kronecker delta, in the first part of the post,
here.
And in the same way, we find that when
is
expressed as a sum over
, and
, analogous to the
expression for
above,
the contributions from
to
with
give 0 after doing the integral over the
volume of the oven.
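The integrals over the oven that make these cross terms vanish can be checked numerically. The sketch below assumes standing-wave factors of the form sin(n pi x / L) and cos(n pi x / L) in one coordinate direction, with whole numbers n and m, and approximates the integral from 0 to L of a product of two such factors by the midpoint rule. The function name `overlap` and the test length are hypothetical.

```python
import math

def overlap(n, m, L, kind="sin", steps=2000):
    """Midpoint-rule estimate of the integral over 0..L of f(n pi x / L) * f(m pi x / L),
    where f is sin or cos according to `kind`."""
    f = math.sin if kind == "sin" else math.cos
    h = L / steps
    total = 0.0
    for j in range(steps):
        x = (j + 0.5) * h
        total += f(n * math.pi * x / L) * f(m * math.pi * x / L)
    return total * h

L = 1.5  # hypothetical oven dimension in metres
print(overlap(2, 2, L, "sin"))  # equal whole numbers: close to L / 2
print(overlap(2, 3, L, "sin"))  # different whole numbers: close to 0
print(overlap(0, 0, L, "cos"))  # both zero: cos factors are 1, so close to L
```

For equal nonzero whole numbers the result is half the length, for different whole numbers it is zero, and the case where both are zero has to be treated separately, which matches the pattern of a Kronecker delta times a length factor.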
We'll focus now on the contributions of modes with for all
. From the formula
above
and the analogous formulae for the
contributions from
and
, the sum of the contributions from
,
, and
, for the modes with
for all
, is:
And from the formulae for the components of ,
above,
the sum of the
contributions from
,
, and
, to
,
as in the fourth part of the post,
here, for the modes with
for all
, is:
since, for example,
, from the definition of the antisymmetric tensor
,
in the third part of the post,
here.
From the formula we found in the third part of the post,
here,
with the indexes
and
rewritten as
and
, and the dummy index
rewritten as
, and the
property of the
Kronecker delta
we observed in the third part of the post,
here,
the above expression is
equal to:
since
.
We found in the third part of the post,
here, that , the speed of light in a vacuum, is equal to
, so the factor
in the
contribution from
is equal to the factor
in the
contribution from
. Thus for each
, our observation
above implies
that the total contribution from terms where
and
, or vice versa, is proportional to:
where at the second step I used the observation that
for all
, which follows
from the definition of
, as
in the first part of the post,
here.
Thus since
for all
, the total
contribution of the modes with
for all
to the energy
of the electromagnetic radiation in the oven is:
where I used the formula for
above.
We now observe that the arguments that led to the Boltzmann distribution,
as in the second part of the post,
here,
for the most likely number of objects of a given type in a given
position and momentum bin, when the range of possible positions and momenta of
the microscopic objects in a system in thermal equilibrium at absolute
temperature $T$ is divided up into tiny bins of equal size, can be adapted to
the electromagnetic radiation subject to Maxwell's equations in a hot oven,
for radiation of wavelengths very small compared to the dimensions
$L_1$, $L_2$, and $L_3$ of the oven, in the following way.
We'll consider radiation modes whose wavelengths are sufficiently small
compared to the dimensions of the oven that we can group the modes, described
by their wave-number vector
and
or
, into "types" of similar
and equal
,
such that the number of modes of each type is large compared to 1, and the
relative differences
,
, and
, between the wave-number vectors
and
of any
two modes of the same type are small compared to 1. For example for
infra-red radiation of wavelength about
metres in an oven of size
about a metre cubed, we can assume that
,
, and
are all at
least about
, and divide up the range of each
into
ranges
to
;
to
; and so on, and say that two wave-number vectors
and
are of the
same type if
is in the same range as
, for
, and 3. Then the number of wave-number vectors of each type is
, and for
and
of the same type,
is
, for
, and 3. This is consistent with the definition of the type of an
object that I gave in the second part of the post,
here,
in the course of the derivation of the Boltzmann
distribution.
Looking back at the derivation of the Boltzmann distribution, in the second
part of the post, starting
here, we
observe that it depended on the assumption that the number of objects of each
type is very large compared to 1, and that requirement will be satisfied for
the radiation modes in the oven if we regard the modes of wave-number vector
such that
for
, and 3 as "objects", and
classify them into types as I just described.
The derivation of the Boltzmann distribution also depended on the conserved
total energy being equal to the sum of the energies of the individual objects,
so that the range of possible positions and momenta of an object could be
divided into very small bins, such that the conserved total energy is to a
very good approximation equal to a sum
, as
in the second part of the post,
here,
where
is the number of objects of type
in bin
, and
is the energy of an object of type
at the centre of bin
. However the derivation did not depend on any particular details of how
depends on the type of object
or the position and momentum of an
object at the centre of bin
, and it did not depend on the numbers
associated with an object, that determine the contribution of that object to
the total energy
, being the position and momentum coordinates of the
object, as opposed to some other quantities associated to the object that
determine its contribution to
.
Thus from the formula
above
for the contribution of the modes with
for all
to the energy of the electromagnetic radiation in the
oven, which shows that the energy is the sum of a contribution from each mode,
the argument works equally well if we regard the modes of wave-number vector
such that
for
, and 3 as objects, and classify
them into types as I described
above,
where the variable quantities associated
with each object, that determine the contribution of that object to
, are
the components of the polarization vector
of
that mode. Two of the three components of
can
vary independently for each mode, due to the restriction that
, as
above.
The derivation of the Boltzmann distribution depended on the range of possible
values of the position and momentum coordinates of an object being divided
into bins of equal size, but although I used the same bins, for convenience,
for each different type of object, the derivation did not depend on that, and
neither the derivation nor the result, as in the second part of the post,
here,
depended on the actual sizes
of the bins, other than through the requirements that they be small enough
that the energies of all the objects of type in bin
are equal to
to a good accuracy, and that they be large enough, or that the total
number
of objects of type
be large enough, that the total number of
objects in each bin, or at least, in the bins with the largest numbers of
objects of type
, should be large compared to 1. The derivation of the
formula
, in the second part of the post, starting
here,
was carried out separately for each different type of
object
, and the value of
for each different type
of object, as in the formula in the second part of the post,
here,
or in the example of the ideal gas, which we studied in the second part of the
post, starting
here,
automatically compensates for any change of the bin size for each different
type of object.
From the formula
above,
the contribution of a mode to the energy of the
electromagnetic radiation in the oven is proportional to the sum of the
squares of the components of the polarization vector of the mode, so from the
analogy to the kinetic energy of a particle, which is proportional to the sum
of the squares of the components of the momentum vector of the particle, I
shall assume that it is the range of possible polarization vectors of a mode
that should be divided into equal size bins. The modes are grouped into
types of approximately equal wave-number vector and equal
as
above,
and for a type whose wave-number is
, we'll choose two vectors
and
of length 1 that are perpendicular to
and perpendicular to each
other. We'll represent the components of the polarization vector
in the directions
and
by
and
, so from the formula
in the third part of the post,
here:
for . The mutually perpendicular vectors
,
, and
are the vectors of length 1 in the coordinate
directions of an alternative system of Cartesian coordinates, so by
Pythagoras:
since
.
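Here is a small sketch of this decomposition. It builds two unit vectors perpendicular to a given nonzero wave vector and to each other, forms a polarization vector from two components along them, and checks that the result is perpendicular to the wave vector and that its squared length is the sum of the squares of the two components. The helper names and the sample numbers are my own, and the particular way of choosing the two unit vectors is just one convenient option.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def normalize(u):
    length = math.sqrt(dot(u, u))
    return tuple(ui / length for ui in u)

def transverse_basis(k):
    """Two unit vectors perpendicular to the nonzero vector k and to each other."""
    axes = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    # Start from the coordinate axis least aligned with k, so the cross product is not tiny.
    helper = min(axes, key=lambda axis: abs(dot(axis, k)))
    e1 = normalize(cross(k, helper))
    e2 = normalize(cross(k, e1))
    return e1, e2

def polarization(k, a1, a2):
    """Polarization vector with components a1 and a2 along the two transverse directions."""
    e1, e2 = transverse_basis(k)
    return tuple(a1 * x + a2 * y for x, y in zip(e1, e2))

k = (1.0, 2.0, 2.0)  # hypothetical wave vector
a = polarization(k, 0.3, -0.7)
print(dot(a, k))                      # perpendicular to k, up to rounding
print(dot(a, a), 0.3**2 + 0.7**2)     # Pythagoras: the two numbers agree
```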
Thus from the formula
above,
the energy of a mode of type
with independent polarization vector components
and
is:
So in a similar manner to the calculation for an ideal gas,
in the second part of the post, starting
here,
we find
from the result in the second part of the post,
here,
that the most likely number of modes of type
in a polarization bin with edge sizes
and
is:
where is the total number of modes of type
. From the result
above,
and the results we obtained in the second part of the post,
here, and
here,
we find:
so:
In the same way as for the ideal gas example, in the second part of the post,
here,
we'll assume now that
this most likely number of modes of type
in each
polarization bin is the actual number of modes in each polarization bin. So
the total energy of the modes of type
is:
From the result in the second part of the post,
here, with
,
, this is equal to:
And from the calculation in the second part of the post, starting here, this is equal to:
Thus in thermal equilibrium at absolute temperature , each of the
modes of type
has energy
. Since this is independent of
and
, we thus find that for every
mode
for which all three components of
are
sufficiently large, the energy of that mode, in thermal equilibrium at
absolute temperature
, is
. From the discussion
above, this
applies, in particular, for electromagnetic radiation of infra-red and all
shorter wavelengths in an oven of size about a metre cubed or larger, for
almost all directions of the wave-number vector
.
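The fact that the energy per mode does not depend on the details of the mode can be checked with a short numerical sketch. It assumes that the energy of a mode is a positive coefficient times the sum of the squares of two independent polarization components, and that each component is distributed with a Boltzmann weight, the exponential of minus the energy divided by Boltzmann's constant times the temperature. The coefficient values and the function names are hypothetical, and `kT` stands for Boltzmann's constant times the absolute temperature in arbitrary energy units.

```python
import math

def mean_quadratic_energy(coeff, kT, steps=20000):
    """Boltzmann-weighted mean of coeff * a**2 for a single component a,
    estimated by the midpoint rule on a range wide enough to make the tails negligible."""
    cutoff = 12.0 * math.sqrt(kT / coeff)
    h = 2.0 * cutoff / steps
    numerator = 0.0
    denominator = 0.0
    for j in range(steps):
        a = -cutoff + (j + 0.5) * h
        energy = coeff * a * a
        weight = math.exp(-energy / kT)
        numerator += energy * weight
        denominator += weight
    return numerator / denominator

def mean_mode_energy(coeff, kT):
    """Two independent components, each contributing the same mean energy."""
    return 2.0 * mean_quadratic_energy(coeff, kT)

for coeff in (0.01, 3.7, 500.0):
    print(coeff, mean_mode_energy(coeff, kT=1.3))
```

Each component contributes half of `kT`, so the mean energy of the mode comes out as `kT` whatever the coefficient is, which is why the result is the same for every mode.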
Thus since there is no upper limit to the size of the components of the
wave-number vector $n$, we have arrived at a result in contradiction with
everyday experience: if the absolute temperature $T$ of the oven is greater
than 0, then the energy of the electromagnetic radiation in the oven is
infinite, because there are an infinite number of modes, each of which has
energy $k_B T$, where $k_B$ is Boltzmann's constant.
At the end of the nineteenth century, the actual energy of the electromagnetic
radiation in different frequency ranges in hot ovens was measured by
Otto Lummer,
Ferdinand Kurlbaum, and
Heinrich Rubens.
From the formulae for the electric
and magnetic fields in the waves,
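above, the angular
frequency $\omega$, where $\omega$ is the Greek letter omega, which is $2\pi$
times the number of cycles per unit time, is $\omega = c|k|$,
where the wavevector $k$ is
$\left(\frac{n_1 \pi}{L_1}, \frac{n_2 \pi}{L_2}, \frac{n_3 \pi}{L_3}\right)$, from
above, and the
speed of light is $c = \frac{1}{\sqrt{\epsilon_0 \mu_0}}$, from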
the third part of the post,
here. Each
wave-number vector
corresponds to 2 modes, one for each of the two values
0 and
of
, so the number of modes per unit volume in
the space of wavevectors is
.
Each component of
is
, so from the value we found in the
third part of the post,
here, for the
area of a sphere of radius 1, the total number of modes whose wavevector
magnitude lies in the range from
to
is:
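Thus the total number of modes whose angular frequency lies in the range from
$\omega$ to $\omega + d\omega$ is $\frac{V \omega^2}{\pi^2 c^3} d\omega$,
where $V$ is the volume of the oven, so since each of these modes has energy
$k_B T$ according to the result above,
the total energy of electromagnetic radiation
in the oven in modes whose angular frequency lies in the range from
$\omega$ to $\omega + d\omega$ is:

$$\frac{V \omega^2 k_B T}{\pi^2 c^3} \, d\omega$$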
This is called the Rayleigh-Jeans law, after Lord Rayleigh and Sir James Jeans.
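The mode counting behind this law can be checked by brute force. The sketch below assumes a cubic oven of side `L`, wave vectors whose components are whole numbers greater than or equal to 0 times pi / L, and 2 modes for each such wave vector. It counts the modes whose wave vector magnitude is at most `K` and compares the count with the continuum estimate, the volume times K cubed divided by 3 pi squared, which is what the mode density gives when summed from 0 up to `K`. The function names and the sample numbers are hypothetical, and the count makes no special allowance for triples in which one or more of the whole numbers is 0.

```python
import math

def count_modes(K, L):
    """Count modes with wave vector magnitude at most K in a cubic oven of side L,
    taking 2 modes for each triple of whole numbers n_i >= 0."""
    n_max = int(K * L / math.pi)
    count = 0
    for n1 in range(n_max + 1):
        for n2 in range(n_max + 1):
            for n3 in range(n_max + 1):
                k = math.pi * math.sqrt(n1 * n1 + n2 * n2 + n3 * n3) / L
                if k <= K:
                    count += 2
    return count

def continuum_estimate(K, L):
    """Volume * K**3 / (3 * pi**2): the smooth mode density summed from 0 to K."""
    return L**3 * K**3 / (3.0 * math.pi**2)

L = 1.0                  # hypothetical oven side in metres
K = 40.5 * math.pi / L   # hypothetical cutoff on the wave vector magnitude
ratio = count_modes(K, L) / continuum_estimate(K, L)
print(ratio)
```

The ratio comes out a little above 1, because the triples lying on the coordinate planes are counted in full, and it moves closer to 1 as the cutoff is raised, which is the regime of wavelengths small compared to the oven.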
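Max Planck discovered in 1900 that the measurements of Lummer, Kurlbaum, and Rubens are instead represented accurately by a formula:

$$\frac{V \hbar \omega^3}{\pi^2 c^3 \left(e^{\frac{\hbar \omega}{k_B T}} - 1\right)} \, d\omega$$

where $\hbar$, which is pronounced "h bar",
is a fundamental constant of nature
that was previously unknown, whose value is:

$$\hbar \approx 1.0546 \times 10^{-34} \text{ joule seconds}$$

and $e$
is Napier's number,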
as in the second part of the post,
here.
From the formula in the second part of the post,
here,
the factor
tends to
as
tends to 0, and is approximately equal to
for all
such that
, so the
Rayleigh-Jeans law, as
above,
is approximately valid for all
such
that
, and becomes accurately valid when
is substantially smaller than
. However at
such that
, the energy in modes whose
angular frequency lies in the range from
to
stops growing with
as predicted by the Rayleigh-Jeans law,
and at larger
it decreases very rapidly with increasing
, in
complete contradiction with the Rayleigh-Jeans law.
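If we define $x = \frac{\hbar \omega}{k_B T}$, then the Planck law,
above,
becomes $\frac{V (k_B T)^4}{\pi^2 c^3 \hbar^3} \frac{x^3}{e^x - 1} dx$, and the Rayleigh-Jeans law,
above,
becomes $\frac{V (k_B T)^4}{\pi^2 c^3 \hbar^3} x^2 dx$. This graph shows the Planck factor,
$\frac{x^3}{e^x - 1}$, which is in agreement with observation, plotted in blue, and the
Rayleigh-Jeans factor, $x^2$, plotted in red. We see that the Rayleigh-Jeans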
factor is already too large by about a factor of 2 at
, and only agrees
well with the correct Planck factor for
less than about 0.4.
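The comparison between the two factors is easy to reproduce. The following sketch takes the Planck factor to be x cubed divided by (e to the power x, minus 1), and the Rayleigh-Jeans factor to be x squared, where x is the angular frequency times h bar, divided by Boltzmann's constant times the temperature. It prints how many times too large the Rayleigh-Jeans factor is at a few sample values of x. The function names and the sample values are my own choices.

```python
import math

def planck_factor(x):
    """x**3 / (e**x - 1), using expm1 for accuracy at small x."""
    return x**3 / math.expm1(x)

def rayleigh_jeans_factor(x):
    """x**2, the small-x limit of the Planck factor."""
    return x**2

def overestimate(x):
    """Ratio of the Rayleigh-Jeans factor to the Planck factor at x > 0."""
    return rayleigh_jeans_factor(x) / planck_factor(x)

for x in (0.01, 0.1, 0.4, 1.0, 2.0, 5.0, 10.0):
    print(x, planck_factor(x), rayleigh_jeans_factor(x), overestimate(x))
```

The ratio is close to 1 for small x and grows without limit as x increases, since the Planck factor is cut off exponentially while the Rayleigh-Jeans factor keeps growing.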
The quantity $h = 2\pi\hbar$ is known as Planck's constant. The
dimensions of $h$ are energy times time, which are the dimensions of de
Maupertuis's action. Planck suggested that energy could be transferred
between the oven walls and the electromagnetic radiation of frequency
$\frac{\omega}{2\pi}$ in the oven only in whole number multiples of a basic amount
$h \frac{\omega}{2\pi} = \hbar\omega$. This was the start of the discovery of quantum mechanics and Feynman's
functional integral, which among other things, has made possible the design
and construction of the computer on which you are reading this blog post.
In the next part of this post, Matrix Multiplication, I would like to provide you with a little bit of background knowledge that is helpful for understanding and using Feynman's functional integral, and in the part after that, The Functional Integral, we'll take a first look at the functional integral. We'll see how the functional integral leads to de Maupertuis's principle, Newton's laws, and Maxwell's equations, in the circumstances where those results are valid, and in the final parts of the post, we'll see how the functional integral leads to Planck's law, as above, and we'll take a first look at how, with a suitable identification of the relevant fields, and their action, the functional integral is used to predict the results of experiments such as those being carried out using the Large Hadron Collider at CERN, the European center for high energy physics near Geneva.
The software on this website is licensed for use under the Free Software Foundation General Public License.
Page last updated 30 November 2020. Copyright (c) Chris Austin 2012 - 2020.