A. S. KOMPANEYETS


THEORETICAL 

PHYSICS 


TRANSLATED FROM THE RUSSIAN 


EDITED BY GEORGE YANKOVSKY


This translation has been read and approved 
by the author, Professor A. S. Kompaneyets 


Printed in the Union of Soviet Socialist Republics 


CONTENTS

From the Preface to the First Edition
Preface to the Second Edition

Part I. Mechanics
Sec. 1. Generalized Coordinates
Sec. 2. Lagrange’s Equation
Sec. 3. Examples of Lagrange’s Equations
Sec. 4. Conservation Laws
Sec. 5. Motion in a Central Field
Sec. 6. Collision of Particles
Sec. 7. Small Oscillations
Sec. 8. Rotating Coordinate Systems. Inertial Forces
Sec. 9. The Dynamics of a Rigid Body
Sec. 10. General Principles of Mechanics

Part II. Electrodynamics
Sec. 11. Vector Analysis
Sec. 12. The Electromagnetic Field. Maxwell’s Equations
Sec. 13. The Action Principle for the Electromagnetic Field
Sec. 14. The Electrostatics of Point Charges. Slowly Varying Fields
Sec. 15. The Magnetostatics of Point Charges
Sec. 16. Electrodynamics of Material Media
Sec. 17. Plane Electromagnetic Waves
Sec. 18. Transmission of Signals. Almost Plane Waves
Sec. 19. The Emission of Electromagnetic Waves
Sec. 20. The Theory of Relativity
Sec. 21. Relativistic Dynamics

Part III. Quantum Mechanics
Sec. 22. The Inadequacy of Classical Mechanics. The Analogy Between Mechanics and Geometrical Optics
Sec. 23. Electron Diffraction
Sec. 24. The Wave Equation
Sec. 25. Certain Problems of Quantum Mechanics
Sec. 26. Harmonic Oscillatory Motion in Quantum Mechanics (Linear Harmonic Oscillator)
Sec. 27. Quantization of the Electromagnetic Field
Sec. 28. Quasi-Classical Approximation
Sec. 29. Operators in Quantum Mechanics
Sec. 30. Expansions into Wave Functions
Sec. 31. Motion in a Central Field
Sec. 32. Electron Spin
Sec. 33. Many-Electron Systems
Sec. 34. The Quantum Theory of Radiation
Sec. 35. The Atom in a Constant External Field
Sec. 36. Quantum Theory of Dispersion
Sec. 37. Quantum Theory of Scattering
Sec. 38. The Relativistic Wave Equation for an Electron

Part IV. Statistical Physics
Sec. 39. The Equilibrium Distribution of Molecules in an Ideal Gas
Sec. 40. Boltzmann Statistics (Translational Motion of a Molecule. Gas in an External Field)
Sec. 41. Boltzmann Statistics (Vibrational and Rotational Molecular Motion)
Sec. 42. The Application of Statistics to the Electromagnetic Field and to Crystalline Bodies
Sec. 43. Bose Distribution
Sec. 44. Fermi Distribution
Sec. 45. Gibbs Statistics
Sec. 46. Thermodynamic Quantities
Sec. 47. The Thermodynamic Properties of Ideal Gases in Boltzmann Statistics
Sec. 48. Fluctuations
Sec. 49. Phase Equilibrium
Sec. 50. Weak Solutions
Sec. 51. Chemical Equilibria
Sec. 52. Surface Phenomena

Appendix
Bibliography
Subject Index


FROM THE PREFACE TO THE FIRST EDITION 

This book is intended for readers who are acquainted with the 
course of general physics and analysis of nonspecializing institutions 
of higher education. It is meant chiefly for engineer-physicists, though 
it may also be useful to specialists working in fields associated with 
physics—chemists, physical chemists, biophysicists, geophysicists, 
and astronomers. 

Like the natural sciences in general, physics is based primarily 
on experiment, and, what is more, on quantitative experiment. 
However, no series of experiments can constitute a theory until a 
rigorous logical relationship is established between them. Theory not 
only allows us to systematize the available experimental material, but 
also makes it possible to predict new facts which can be experimentally 
verified. 

All physical laws are expressed in the form of quantitative relationships.
In order to interrelate quantitative laws, theoretical physics
appeals to mathematics. The methods of theoretical physics, which 
are based on mathematics, can be fully mastered only by those who 
have acquired a very considerable volume of mathematical knowledge. 
Nevertheless, the basic ideas and results of theoretical physics are 
readily comprehensible to any reader who has an understanding of 
differential and integral calculus, and is acquainted with vector 
algebra. This is the minimum of mathematical knowledge required 
for an understanding of the text that follows. 

At the same time, the aim of this book is not only to give the reader 
an idea about what theoretical physics is, but also to furnish him 
with a working knowledge of the basic methods of theoretical physics. 
For this reason it has been necessary to adhere, as far as possible, 
to a rigorous exposition. The reader will more readily agree with 
the conclusions reached if their inevitability has been made obvious 
to him. In order to encourage active work by the student, some of the
applications of the theory have been shifted into the exercises, in 
which the line of reasoning is not so detailed as in the basic text. 

In compiling such a relatively small book as this one it has been 
necessary to cut down on the space devoted to certain important
sections of theoretical physics, and omit other branches entirely.
For instance, the mechanics of continuous media is not included at all
since to set out this branch, even in the same detail as the rest of
the text, would mean doubling the size of the book. A few results
from the mechanics of continuous media are included in the exercises
as illustrations in thermodynamics. At the same time, the mechanics
and electrodynamics of continuous media are less related to the fundamental,
epistemological problems of physics than microscopic electrodynamics,
quantum theory, and statistical physics. For this reason, very little
space is devoted to macroscopic electrodynamics: the material has
been selected in such a way as to show the reader how the transition
is made from microscopic electrodynamics to the theory of quasi-stationary
fields and the laws of the propagation of light in media.
It is assumed that the reader is familiar with these problems from 
courses of physics and electricity. 

On the whole, the book is mainly intended for the reader who is 
interested in the physics of elementary processes. These considerations 
have also dictated the choice of material; as in all nonencyclopaedic 
manuals, this choice is inevitably somewhat subjective. 

In compiling this book, I have made considerable use of the excellent 
course of theoretical physics of L. D. Landau and E. M. Lifshits. This 
comprehensive course can be recommended to all those who wish 
to obtain a profound understanding of theoretical physics. 

I should like to express my deep gratitude to my friends who have 
made important observations: Ya. B. Zeldovich, V. G. Levich, 
E. L. Feinberg, V. I. Kogan and V. I. Goldansky. 


A. Kompaneyets 


PREFACE TO THE SECOND EDITION 

In this second edition I have attempted to make the presentation 
more systematic and rigorous without adding any difficulties. In 
order to do this it has been especially necessary to revise Part III, 
to which I have added a special section (Sec. 30) setting out the general 
principles of quantum mechanics; radiation is now considered only 
with the aid of the quantum theory of the electromagnetic field, 
since the results obtained from the correspondence principle do not 
appear sufficiently justified. 

Gibbs’ statistics are included in this edition, which has made it 
necessary to divide Part IV into something in the nature of two 
cycles: Sec. 39-44, where only the results of combinatorial analysis
are set out, and Sec. 45-52, an introduction to Gibbs’ method,
which is used as background material for a discussion of thermodynamics.
A phenomenological approach to thermodynamics would
nowadays appear an anachronism in a course of theoretical physics. 

In order not to increase the size of the book overmuch, it has been 
necessary to omit the theory of beta decay, the variational properties 
of eigenvalues, and certain other problems included in the first 
edition. 

I am greatly indebted to A. F. Nikiforov and V. B. Uvarov for 
pointing out several inaccuracies in the first edition of the book. 


A. Kompaneyets


PART I 

MECHANICS 

Sec. 1. Generalized Coordinates

Frames of reference. In order to describe the motion of a mechanical
system, it is necessary to specify its position in space as a function 
of time. Obviously, it is only meaningful to speak of the relative 
position of any point. For instance, the position of a flying aircraft 
is given relative to some coordinate system fixed with respect to the 
earth; the motion of a charged particle in an accelerator is given 
relative to the accelerator, etc. The system, relative to which the 
motion is described, is called a frame of reference. 

Specification of time. As will be shown later (Sec. 20), specification 
of time in the general case is also connected with defining the frame
of reference in which it is given. The intuitive conception of a universal,
unique time, to which we are accustomed in everyday life,
is, to a certain extent, an approximation that is only true when the 
relative speeds of all material particles are small in comparison 
with the velocity of light. The mechanics of such slow movements 
is termed Newtonian, since Isaac Newton was the first to formulate 
its laws. 

Newton’s laws permit a determination of the position of a mechanical 
system at an arbitrary instant of time, if the positions and velocities 
of all points of the system are known at some initial instant, and also 
if the forces acting in the system are known. 

Degrees of freedom of a mechanical system. The number of independent
parameters defining the position of a mechanical system in
space is termed the number of its degrees of freedom. 

The position of a particle in space relative to other bodies is defined 
with the aid of three independent parameters, for example, its 
Cartesian coordinates. The position of a system consisting of N 
particles is determined, in general, by $3N$ independent parameters.

However, if the distribution of points is fixed in any way, then
the number of degrees of freedom may be less than $3N$. For example,
if two points are constrained by some form of rigid nondeformable
coupling, then, upon the six Cartesian coordinates of these points,
$x_1, y_1, z_1, x_2, y_2, z_2$, is imposed the condition

$$(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2 = R_{12}^2, \tag{1.1}$$

where $R_{12}$ is the given distance between the points. It follows that
the Cartesian coordinates are no longer independent parameters:
a relationship exists between them. Only five of the six values
$x_1, \dots, z_2$ are now independent. In other words, a system of two
particles, separated by a fixed distance, has five degrees of freedom. If
we consider three particles which are rigidly fixed in a triangle, then
the coordinates of the third particle must satisfy the two equations:

$$(x_3 - x_1)^2 + (y_3 - y_1)^2 + (z_3 - z_1)^2 = R_{31}^2, \tag{1.2}$$

$$(x_3 - x_2)^2 + (y_3 - y_2)^2 + (z_3 - z_2)^2 = R_{32}^2. \tag{1.3}$$

Thus, the nine coordinates of the vertices of the rigid triangle are 
defined by the three equations (1.1), (1.2) and (1.3), and hence only 
six of the nine quantities are independent. The triangle has six 
degrees of freedom. 

The position of a rigid body in space is defined by three points 
which do not lie on the same straight line. These three points, as we 
have just seen, have six degrees of freedom. It follows that any rigid 
body has six degrees of freedom. It should be noted that only such 
motions of the rigid body are considered as, for example, the rotation 
of a top, where no noticeable deformation occurs that can affect 
its motion. 
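The counting argument above can be sketched in a few lines of Python. This is only an illustration: the function name `degrees_of_freedom` is hypothetical, and the sketch assumes that every constraint equation is independent of the others, as is the case for equations (1.1)-(1.3).

```python
def degrees_of_freedom(n_particles: int, n_constraints: int) -> int:
    """Return 3N minus the number of independent constraint equations."""
    dof = 3 * n_particles - n_constraints
    if dof < 0:
        raise ValueError("more constraint equations than coordinates")
    return dof


print(degrees_of_freedom(1, 0))  # a free particle
print(degrees_of_freedom(2, 1))  # two points at a fixed distance, eq. (1.1)
print(degrees_of_freedom(3, 3))  # a rigid triangle, eqs. (1.1)-(1.3)
```

For a rigid body built on three noncollinear points, the same count gives the six degrees of freedom stated in the text.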

Generalized coordinates. It is not always convenient to describe 
the position of a system in Cartesian coordinates. As we have already
seen, when rigid constraints exist, Cartesian coordinates must satisfy 
supplementary equations. In addition, the choice of coordinate system 
is arbitrary and should be determined primarily on the basis of 
expediency. For instance, if the forces depend only on the distances 
between particles, it is reasonable to introduce these distances into 
dynamical equations explicitly and not by means of Cartesian 
coordinates. 

In other words, a mechanical system can be described by coordinates 
whose number is equal to the number of degrees of freedom of the 
system. These coordinates may sometimes coincide with the Cartesian 
coordinates of some of the particles. For example, in a system of two 
rigidly connected points, these coordinates can be chosen in the 
following way: the position of one of the points is given in Cartesian
coordinates, after which the other point will always be situated on 
a sphere whose centre is the first point. The position of the second 
point on the sphere may be given by its longitude and latitude.
Together with the three Cartesian coordinates of the first point,
the latitude and longitude of the second point completely define 
the position of such a system in space. 

For three rigidly bound points, it is necessary, in accordance with 
the method just described, to specify the position of one side of the 
triangle and the angle of rotation of the third vertex about that 
side. 

The independent parameters which define the position of a 
mechanical system in space are called its generalized coordinates. 
We will represent them by the symbols $q_\alpha$, where the subscript $\alpha$
signifies the number of the degree of freedom. 

As in the case of Cartesian coordinates, the choice of generalized 
coordinates is to a considerable extent arbitrary. They must be chosen
so that the dynamical laws of motion of the system can be formulated 
as conveniently as possible. 
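The example of two rigidly connected points can be made concrete with a short Python sketch. It maps five generalized coordinates (three Cartesian coordinates of the first point, plus the latitude and longitude of the second) to the six Cartesian coordinates, and checks that condition (1.1) then holds automatically. The function names and the particular convention for latitude and longitude are assumptions made for illustration only.

```python
import math


def two_point_positions(x1, y1, z1, latitude, longitude, R12):
    """Map five generalized coordinates to the six Cartesian coordinates.

    Assumed convention: latitude is measured from the plane z = z1,
    longitude from the x direction; R12 is the fixed distance.
    """
    x2 = x1 + R12 * math.cos(latitude) * math.cos(longitude)
    y2 = y1 + R12 * math.cos(latitude) * math.sin(longitude)
    z2 = z1 + R12 * math.sin(latitude)
    return (x1, y1, z1), (x2, y2, z2)


def distance(p, q):
    """Euclidean distance between two points given as coordinate tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


p1, p2 = two_point_positions(1.0, -2.0, 0.5, 0.3, 1.2, R12=2.0)
print(distance(p1, p2))  # equals R12 up to rounding, whatever the angles
```

The constraint never has to be imposed separately: it is built into the choice of coordinates, which is the point of the paragraph above.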


Sec. 2. Lagrange’s Equation 

In this section, equations of motion will be obtained in terms of 
arbitrary generalized coordinates. In such form they are especially 
convenient in theoretical physics. 

Newton’s Second Law. Motion in mechanics consists in changes in 
the mutual configuration of bodies in time. In other words, it is 
described in terms of the mutual distances, or lengths, and intervals 
of time. As was shown in the preceding section, all motion is relative; 
it can be specified only in relation to some definite frame of reference.

In accordance with the level of knowledge of his time, Newton 
regarded the concepts of length and time interval as absolute, which 
is to say that these quantities are the same in all frames of reference. 
As will be shown later, Newton’s assumption was an approximation 
(see Sec. 20). It holds when the relative speeds of all the particles
are small compared with the velocity of light; here Newtonian 
mechanics is based on a vast quantity of experimental facts. 

In formulating the laws of motion a very convenient concept is 
the material particle, that is, a body whose position is completely 
defined by three Cartesian coordinates. Strictly speaking, this 
idealization is not applicable to any body. Nevertheless, it is in 
every way reasonable when the motion of a body is sufficiently well 
defined by the displacement in space of any of its particles (for 
example, the centre of gravity of the body) and is independent of 
rotations or deformations of the body. 

If we start with the concept of a particle as the fundamental 
entity of mechanics, then the law of motion (Newton’s Second Law) 
is formulated thus:

$$m \frac{d^2 \mathbf{r}}{dt^2} = \mathbf{F}. \tag{2.1}$$

Here, $\mathbf{F}$ is the resultant of all the forces applied to the particle
(the vector sum of the forces); $\frac{d^2 \mathbf{r}}{dt^2}$ is the vector acceleration, the
Cartesian components of which are

$$\frac{d^2 x}{dt^2}, \quad \frac{d^2 y}{dt^2}, \quad \frac{d^2 z}{dt^2}.$$

The quantity m involved in equation (2.1) characterizes the particle 
and is called its mass. 

Force and mass. Equality (2.1) is the definition of force. However, 
it should not be regarded as a simple identity or designation, because
(2.1) establishes the form of the interaction between bodies
in mechanics and thereby actually describes a certain law of nature. 
The interaction is expressed in the form of a differential equation 
that includes only the second derivatives of the coordinates with 
respect to time (and not derivatives, say, of the fourth order). 

In addition, certain limiting assumptions are usually made in 
relation to the force. In Newtonian mechanics it is assumed that 
forces depend only on the mutual arrangement of the bodies at the 
instant to which the equality refers and do not depend on the configuration
of the bodies at previous times. As we shall see later (see
Part II), this supposition about the character of interaction forces
is valid only when the speeds of the bodies are small compared with
the velocity of light. 

The quantity m in equality (2.1) is a characteristic of the body, 
its mass. Mass may be determined by comparing the accelerations
which the same force imparts to different bodies; the greater the 
acceleration, the less the mass. In order to measure mass, some body 
must be regarded as a standard. The choice of a standard body is 
completely independent of the choice of standards of length and 
time. This is what makes the dimension (or unit of measurement) 
of mass a special dimension, not related to the dimensions of length 
and time. 

The properties of mass are established experimentally. Firstly, it
can be shown that the mass of two equal quantities of the same 
substance is equal to twice the mass of each quantity. For example, 
one can take two identical scale weights and note that a stretched 
spring gives them equal accelerations. If we join two such weights 
and subject them to the action of the same spring, which has been 
stretched by the same amount as for each weight separately, the 
acceleration will be found to be one half what it was. It follows that 
the overall mass of the weights is twice as great, since the force 
depends only on the tension of the spring and could not have changed.

Thus, mass is an additive quantity, that is, one in which the whole
is equal to the sum of the quantities of each part taken separately. 
Experiment shows that the principle of additivity of mass also 
applies to bodies consisting of different substances. 

In addition, in Newtonian mechanics, the mass of a body is a 
constant quantity which does not change with motion. 

It must not be forgotten that the additivity and constancy of 
masses are properties that follow only from experimental facts which 
relate to very specific forms of motion. For example, a very important 
law, that of the conservation of mass in chemical transformations 
involving rearrangement of the molecules and atoms of a body, 
was established by M. V. Lomonosov experimentally. 

Like all laws deduced from experiment, the principle of additivity 
of mass has a definite degree of precision. For such strong interactions 
as take place in the atomic nucleus, the breakdown of the additivity 
of mass is apparent (for more detail see Sec. 21). 

We may note that if instead of subjecting a body to the force of 
a stretched spring it were subjected to the action of gravity, then 
the acceleration of a body of double mass would be equal to the 
acceleration of each body separately. From this we conclude that 
the force of gravity is itself proportional to the mass of a body. 
Hence, in a vacuum, in the absence of air resistance, all bodies fall 
with the same acceleration. 

Inertial frames of reference. In equation (2.1) we have to do with 
the acceleration of a particle. There is no sense in talking about 
acceleration without stating to which frame of reference it is referred. 
For this reason there arises a difficulty in stating the cause of the 
acceleration. This cause may be either interaction between bodies 
or it may be due to some distinctive properties of the reference frame 
itself. For example, the jolt which a passenger experiences when a 
carriage suddenly stops is evidence that the carriage is in nonuniform
motion relative to the earth. 

Let us consider a set of bodies not affected by any other bodies, 
that is, one that is sufficiently far away from them. We can suppose 
that a frame of reference exists such that all accelerations of the 
set of bodies considered arise only as a result of the interaction between 
the bodies. This can be verified if the forces satisfy Newton’s Third 
Law, i.e., if they are equal and opposite in sign for any pair of particles 
(it is assumed that the forces occur instantaneously, and this is true 
only when the speeds of the particles are small compared with the 
speed of transmission of the interaction). 

A frame of reference for which the acceleration of a certain set of 
particles depends only on the interaction between these particles 
is called an inertial frame (or inertial coordinate system). A free 
particle, not subject to the action of any other body, moves, relative 
to such a reference frame, uniformly in a straight line or, in everyday
language, by its own momentum. If in a given frame of reference
Newton’s Third Law is not satisfied we can conclude that this is 
not an inertial system. 

Thus, a stone thrown directly downwards from a tall tower is 
deflected towards the east from the direction of the force of gravity. 
This direction can be independently established with the aid of a 
suspended weight. It follows that the stone has a component of 
acceleration which is not caused by the force of the earth’s attraction. 
From this we conclude that the frame of reference fixed in the earth 
is noninertial. The noninertiality is, in this case, due to the diurnal 
rotation of the earth. 

On the forces of friction. In everyday life we constantly observe
the action of forces that arise from direct contacts between bodies. 
The sliding and rolling of rigid bodies give rise to forces of friction. 
The action of these forces causes a transition of the macroscopic 
motion of the body as a whole into the microscopic motion of the 
constituent atoms and molecules. This is perceived as the generation 
of heat. Actually, when a body slides an extraordinarily complex
process of interaction occurs between the atoms in the surface layer.
A description of this interaction in the simple terms of frictional
forces is a very convenient idealization for the mechanics of macroscopic
motion, but, naturally, does not give us a full picture of the
process. The concept of frictional force arises as a result of a certain
averaging of all the elementary interactions which occur between 
bodies in contact. 

In this part, which is concerned only with elementary laws, we shall
not consider averaged interactions where motion is transferred to
the internal, microscopic, degrees of freedom of atoms and molecules.
Here, we will study only those interactions which can be completely
expressed with the aid of elementary laws of mechanics and which 
do not require an appeal to any statistical concepts connected with 
internal, thermal, motion. 

Ideal rigid constraints. Bodies in contact also give rise to forces 
of interaction which can be reduced to the kinematic properties of
rigid constraints. If rigid constraints act in a system they force
the particles to move on definite surfaces. Thus, in Sec. 1 we considered
the motion of a single particle on a sphere, at the centre of
which was another particle. 

This kind of interaction between particles does not cause a transition
of the motion to the internal, microscopic, degrees of freedom of
bodies. In other words, motion which is limited by rigid constraints
is completely described by its own macroscopic generalized coordinates
$q_\alpha$.

If the limitations imposed by the constraints distort the motion,
they thereby cause accelerations (curvilinear motion is always
accelerated motion since velocity is a vector quantity). This acceleration
can be formally attributed to forces which are called
reaction forces of rigid constraints.

Reaction forces change only the direction of velocity of a particle 
but not its magnitude. If they were to alter the magnitude of the 
velocity, this would produce a change also in the kinetic energy of 
the particle. According to the law of conservation of energy, heat 
would then be generated. But this was excluded from consideration 
from the very start. 

To summarize, the reaction forces of ideally rigid constraints do 
not change the kinetic energy of a system. In other words, they do 
not perform any work on it, since work performed on a system is
equivalent to changing its kinetic energy (if heat is not generated).

In order that a force should not perform work, it must be perpendicular
to the displacement. For this reason the reaction forces of
constraints are perpendicular to the direction of particle velocity 
at each given instant of time. 

However, in problems of mechanics, the reaction forces are not
initially given as functions of particle position. They are
determined by integrating equations (2.1), with account taken of
constraint conditions. Therefore, it is best to formulate the equations
of mechanics so as to exclude constraint reactions entirely. It turns
out that if we go over to generalized coordinates, the number of
which is equal to the number of degrees of freedom of the system,
then the constraint reactions disappear from the equations. In this
section we shall make such a transition and will obtain the equations
of mechanics in terms of the generalized coordinates of the system.

The transformation from rectangular to generalized coordinates.
We take a system with a total of $3N = n$ Cartesian coordinates of
which $\nu$ are independent. We will always denote Cartesian coordinates
by the same letter $x_i$, understanding by this symbol all the coordinates
$x, y, z$; this means that $i$ varies from 1 to $3N$, that is, from 1
to $n$. The generalized coordinates we denote by $q_\alpha$ ($1 \le \alpha \le \nu$). Since
the generalized coordinates completely specify the position of the
system, $x_i$ are their unique functions:

$$x_i = x_i(q_1, q_2, \dots, q_\nu). \tag{2.2}$$

From this it is easy to obtain an expression for the Cartesian components
of velocity. Differentiating the function of many variables
$x_i(\dots, q_\alpha, \dots)$ with respect to time, we have

$$\frac{dx_i}{dt} = \sum_{\alpha=1}^{\nu} \frac{\partial x_i}{\partial q_\alpha} \frac{dq_\alpha}{dt}. \tag{2.3}$$

In the subsequent derivation we shall often have to perform
summations with respect to all the generalized coordinates $q_\alpha$,
and double and triple sums will be encountered. In order to save
space we introduce the following summation convention.

If a Greek symbol is met twice on one side of an equation, it will be
understood as denoting a summation from unity to $\nu$, that is, over all
the generalized coordinates. (It is not convenient to use this convention
for the Latin characters which denote the Cartesian coordinates.)

Then the velocity can be rewritten thus:

$$\frac{dx_i}{dt} = \frac{\partial x_i}{\partial q_\alpha} \frac{dq_\alpha}{dt}.$$

Here the summation sign is omitted.

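The total derivative with respect to time is usually denoted by a
dot over the corresponding variable:

$$\dot{x}_i = \frac{dx_i}{dt}, \qquad \dot{q}_\alpha = \frac{dq_\alpha}{dt}. \tag{2.4}$$

In this notation, (2.3) is written in an even more abbreviated form:

$$\dot{x}_i = \frac{\partial x_i}{\partial q_\alpha} \dot{q}_\alpha. \tag{2.5}$$

Differentiating (2.5) with respect to time again, we obtain an
expression for the Cartesian components of acceleration:

$$\ddot{x}_i = \frac{d}{dt}\left(\frac{\partial x_i}{\partial q_\alpha}\right) \dot{q}_\alpha + \frac{\partial x_i}{\partial q_\alpha} \ddot{q}_\alpha.$$

The total derivative in the first term is written as usual:

$$\frac{d}{dt}\left(\frac{\partial x_i}{\partial q_\alpha}\right) = \frac{\partial^2 x_i}{\partial q_\beta \, \partial q_\alpha} \dot{q}_\beta.$$

The Greek symbol over which the summation is performed is denoted
by the letter $\beta$ to avoid confusion with the symbol $\alpha$, which denotes
the summation in the expression for velocity (2.5). Thus, we obtain
the desired expression for $\ddot{x}_i$:

$$\ddot{x}_i = \frac{\partial^2 x_i}{\partial q_\beta \, \partial q_\alpha} \dot{q}_\beta \dot{q}_\alpha + \frac{\partial x_i}{\partial q_\alpha} \ddot{q}_\alpha. \tag{2.6}$$

The first term on the right-hand side contains a double summation
with respect to $\alpha$ and $\beta$.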
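As a numerical check of formula (2.5), the following Python sketch takes plane polar coordinates as the generalized coordinates of a single particle and compares the velocity obtained from the partial derivatives with a finite-difference estimate. The choice of polar coordinates, the function names, and the step size are illustrative assumptions, not part of the text.

```python
import math


def cartesian(q):
    """x_i(q) for plane polar coordinates q = (r, phi)."""
    r, phi = q
    return (r * math.cos(phi), r * math.sin(phi))


def jacobian(q):
    """Partial derivatives dx_i/dq_alpha; rows are i, columns are alpha."""
    r, phi = q
    return ((math.cos(phi), -r * math.sin(phi)),
            (math.sin(phi), r * math.cos(phi)))


def velocity(q, q_dot):
    """Equation (2.5): x_dot_i = (dx_i/dq_alpha) q_dot_alpha, summed over alpha."""
    J = jacobian(q)
    return tuple(sum(J[i][a] * q_dot[a] for a in range(len(q)))
                 for i in range(len(J)))


def velocity_fd(q, q_dot, h=1e-6):
    """Central finite difference of x_i along the direction q_dot."""
    fwd = cartesian(tuple(qa + h * va for qa, va in zip(q, q_dot)))
    bwd = cartesian(tuple(qa - h * va for qa, va in zip(q, q_dot)))
    return tuple((f - b) / (2 * h) for f, b in zip(fwd, bwd))


q, q_dot = (2.0, 0.7), (0.3, -1.1)
print(velocity(q, q_dot))     # from the summation over alpha
print(velocity_fd(q, q_dot))  # numerical estimate, should agree closely
```

The inner `sum` over `a` is exactly the summation that the convention above leaves unwritten.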

Potential of a force. We now consider components of force. In
many cases, the three components of the vector of a force acting
on a particle can be expressed in terms of one scalar function $U$
according to the formula:

$$F_i = -\frac{\partial U}{\partial x_i}. \tag{2.7}$$

Such a function can always be chosen for the force of Newtonian
attraction, and for electrostatic and elastic forces. The function $U$
is called the potential of the force.

It is clear that by no means every system of forces can, in the general
case, be represented by a set of partial derivatives (2.7), since, if

$$F_i = -\frac{\partial U}{\partial x_i}, \qquad F_k = -\frac{\partial U}{\partial x_k},$$

then we must have the equality

$$\frac{\partial F_i}{\partial x_k} = \frac{\partial F_k}{\partial x_i} = -\frac{\partial^2 U}{\partial x_i \, \partial x_k}$$

for all $i$, $k$, which is not obvious in advance for arbitrary functions
$F_i$, $F_k$. The definite form of the potential, in those cases where it
exists, will be given below for various forces.
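The equality of the cross derivatives can be tested numerically for a given force. The Python sketch below is an illustration under assumed examples: an elastic force $F_i = -k x_i$, which has a potential, and a force with components $(-y, x, 0)$, which does not. The function names and the finite-difference step are hypothetical choices.

```python
def cross_derivative_gap(force, point, h=1e-5):
    """Largest |dF_i/dx_k - dF_k/dx_i| over all pairs i, k (central differences)."""
    n = len(point)

    def dF(i, k):
        up = list(point)
        dn = list(point)
        up[k] += h
        dn[k] -= h
        return (force(up)[i] - force(dn)[i]) / (2 * h)

    return max(abs(dF(i, k) - dF(k, i)) for i in range(n) for k in range(n))


def elastic(x, stiffness=3.0):
    """Elastic force F_i = -k x_i, derivable from U = k (x^2 + y^2 + z^2) / 2."""
    return [-stiffness * c for c in x]


def swirl(x):
    """Force (-y, x, 0): its cross derivatives differ, so no potential exists."""
    return [-x[1], x[0], 0.0]


point = [0.4, -1.3, 2.0]
print(cross_derivative_gap(elastic, point))  # close to zero
print(cross_derivative_gap(swirl, point))    # close to 2
```

A vanishing gap at one point is only a necessary check; the condition has to hold for all $i$, $k$ everywhere for a potential to exist.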

Expression (2.7) defines the potential function $U$ to the accuracy
of an arbitrary constant term. $U$ is also called the potential energy
of the system. For example, the gravitational force $F = -mg$,
while the potential energy of an elevated body is equal to $mgz$,
where $g \approx 980$ cm/sec$^2$ is the acceleration of a freely falling body
and $z$ is the height to which it has been raised. It can be measured
from any level, which in the given case corresponds to a determination
of $U$ to the accuracy of a constant term. A more precise expression
for the force of gravity than $F = -mg$ (with allowance
made for its dependence on height) also admits of a potential, which
we shall derive a little later [see (3.4)].

We denote the component reaction forces of rigid constraints by
$F_i'$. We now note that

$$\sum_{i=1}^{n} F_i' \, dx_i = 0, \tag{2.8}$$

if the displacements are compatible with the constraints. Indeed, 
(2.8) expresses precisely the work performed by the reaction forces 
for a certain possible displacement of the system; but this work 
has been shown to be equal to zero. 
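A minimal numerical illustration of (2.8), for the case of a particle constrained to a sphere as in Sec. 1, is sketched below in Python. It assumes that the reaction force of the ideal constraint is directed along the radius, that is, perpendicular to every displacement permitted by the constraint; the function names and the parametrization by the angles `theta` and `phi` are likewise assumptions.

```python
import math


def position(R, theta, phi):
    """Cartesian coordinates of a point on a sphere of radius R centred at the origin."""
    return (R * math.sin(theta) * math.cos(phi),
            R * math.sin(theta) * math.sin(phi),
            R * math.cos(theta))


def displacement(R, theta, phi, d_theta, d_phi):
    """First-order displacement dx_i = (dx_i/dtheta) d_theta + (dx_i/dphi) d_phi."""
    dx_dtheta = (R * math.cos(theta) * math.cos(phi),
                 R * math.cos(theta) * math.sin(phi),
                 -R * math.sin(theta))
    dx_dphi = (-R * math.sin(theta) * math.sin(phi),
               R * math.sin(theta) * math.cos(phi),
               0.0)
    return tuple(a * d_theta + b * d_phi for a, b in zip(dx_dtheta, dx_dphi))


def reaction_work(R, theta, phi, d_theta, d_phi, strength):
    """Sum over i of F'_i dx_i for a reaction force directed along the radius."""
    x = position(R, theta, phi)
    force = tuple(strength * c / R for c in x)  # assumed radial reaction force
    dx = displacement(R, theta, phi, d_theta, d_phi)
    return sum(f * d for f, d in zip(force, dx))


work = reaction_work(R=2.0, theta=0.9, phi=2.3,
                     d_theta=1e-3, d_phi=-2e-3, strength=5.0)
print(work)  # zero up to rounding error
```

The displacement is built only from derivatives with respect to the two angles, so it is compatible with the constraint by construction.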

Lagrange’s equations.* We will now write down the equations of
motion with the aid of (2.7) and (2.8) as

$$m_i \left( \frac{\partial^2 x_i}{\partial q_\beta \, \partial q_\alpha} \dot{q}_\beta \dot{q}_\alpha + \frac{\partial x_i}{\partial q_\alpha} \ddot{q}_\alpha \right) = -\frac{\partial U}{\partial x_i} + F_i'. \tag{2.9}$$

Here, of course, $m_1 = m_2 = m_3$ is equal to the mass of the first
particle, $m_4 = m_5 = m_6$ equals the mass of the second particle, etc.

* In the first reading, the subsequent derivation up to equation (2.18)
need not be studied in detail.

Let us multiply both sides of this equation by $\frac{\partial x_i}{\partial q_\gamma}$ and sum from 1
to $n$ over $i$.

Let us first consider the right-hand side. We obviously have

$$\sum_{i=1}^{n} \frac{\partial U}{\partial x_i} \frac{\partial x_i}{\partial q_\gamma} = \frac{\partial U}{\partial q_\gamma} \tag{2.10}$$

in accordance with the law for differentiating composite functions.
For the forces of reaction we obtain

$$\sum_{i=1}^{n} F_i' \frac{\partial x_i}{\partial q_\gamma} = 0, \tag{2.11}$$

since this equality is a special case of (2.8), in which the displacements
$dx_i$ are taken with all the $q$ held constant except $q_\gamma$; that is why we
retain the designation of the partial derivative $\frac{\partial x_i}{\partial q_\gamma}$. It is clear that
in such a special displacement the work done by the reaction forces
of the constraints is equal to zero, as in the case of a general displacement.

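After multiplication by $\frac{\partial x_i}{\partial q_\gamma}$ and summation, the left-hand side
of equality (2.9) can be written in a more compact form, without
resorting to explicit Cartesian coordinates. It is precisely the purpose
of this section to give such an improved notation. To do this we express
the kinetic energy in terms of generalized coordinates:

$$T = \frac{1}{2} \sum_{i=1}^{n} m_i \dot{x}_i^2. \tag{2.12}$$

Substituting the generalized velocities by using (2.5), we obtain

$$T = \frac{1}{2} \sum_{i=1}^{n} m_i \left( \frac{\partial x_i}{\partial q_\alpha} \dot{q}_\alpha \right) \left( \frac{\partial x_i}{\partial q_\beta} \dot{q}_\beta \right).$$

The summation indices for $q$ must, of course, be denoted by different
letters, since they independently take all values from 1 to $\nu$ inclusive.
Changing the order of summation for Cartesian and generalized coordinates
we have

$$T = \frac{1}{2} \dot{q}_\alpha \dot{q}_\beta \sum_{i=1}^{n} m_i \frac{\partial x_i}{\partial q_\alpha} \frac{\partial x_i}{\partial q_\beta}. \tag{2.13}$$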


Henceforth, T will have to be differentiated both with respect to
generalized coordinates $q_\alpha$ and generalized velocities $\dot{q}_\alpha$. The
coordinate $q_\alpha$ and its corresponding generalized velocity $\dot{q}_\alpha$ are
independent of each other since, in the given position in which the
coordinate has a given value $q_\alpha$, it is possible to impart to the system
an arbitrary velocity $\dot{q}_\alpha$ permitted by the constraints. Naturally,
$q_\alpha$ and $\dot{q}_\beta$ with $\beta \neq \alpha$ are also independent. It follows that in calculating
$\partial T / \partial \dot{q}_\alpha$ all the remaining velocities $\dot{q}_\beta$ ($\beta \neq \alpha$) and all the
coordinates, including $q_\alpha$, should be regarded as constant.

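Let us calculate the derivative $\partial T / \partial \dot{q}_\gamma$. In the double summation
(2.13), the quantity γ can be taken as the index α and also the index β,
so that we obtain

$$\frac{\partial T}{\partial \dot{q}_\gamma} = \frac{1}{2} \dot{q}_\beta \sum_{i=1}^{n} m_i \frac{\partial x_i}{\partial q_\gamma} \frac{\partial x_i}{\partial q_\beta} + \frac{1}{2} \dot{q}_\alpha \sum_{i=1}^{n} m_i \frac{\partial x_i}{\partial q_\alpha} \frac{\partial x_i}{\partial q_\gamma}.$$

Both these sums are the same except that in the first the index
is denoted by β and in the second by α. They can be combined,
replacing β by α in the first summation; naturally the value of the
sum does not change when the summation index is renamed. Then
we obtain

$$\frac{\partial T}{\partial \dot{q}_\gamma} = \dot{q}_\alpha \sum_{i=1}^{n} m_i \frac{\partial x_i}{\partial q_\alpha} \frac{\partial x_i}{\partial q_\gamma}. \tag{2.14}$$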


Let us calculate the total derivative of this quantity with respect 
to time: 


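$$\frac{d}{dt} \frac{\partial T}{\partial \dot{q}_\gamma} = \ddot{q}_\alpha \sum_{i=1}^{n} m_i \frac{\partial x_i}{\partial q_\alpha} \frac{\partial x_i}{\partial q_\gamma} + \dot{q}_\alpha \dot{q}_\beta \sum_{i=1}^{n} m_i \frac{\partial^2 x_i}{\partial q_\alpha \partial q_\beta} \frac{\partial x_i}{\partial q_\gamma} + \dot{q}_\alpha \dot{q}_\beta \sum_{i=1}^{n} m_i \frac{\partial x_i}{\partial q_\alpha} \frac{\partial^2 x_i}{\partial q_\gamma \partial q_\beta}. \tag{2.15}$$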


Here we have had to write down the derivatives of each of the three 
factors of all the terms in the summation (2.14) separately. 

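Now let us calculate the partial derivative $\partial T / \partial q_\gamma$. As has been
shown, $\dot{q}_\alpha$, $\dot{q}_\beta$ are regarded as constants. Like $\partial T / \partial \dot{q}_\gamma$, the derivative
$\partial T / \partial q_\gamma$ consists of two terms which may be amalgamated into one.
Differentiating (2.13), we obtain

$$\frac{\partial T}{\partial q_\gamma} = \dot{q}_\alpha \dot{q}_\beta \sum_{i=1}^{n} m_i \frac{\partial^2 x_i}{\partial q_\alpha \partial q_\gamma} \frac{\partial x_i}{\partial q_\beta}. \tag{2.16}$$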




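Subtracting (2.16) from (2.15), we see that (2.16) and the last term
of (2.15) cancel. As a result we obtain

$$\frac{d}{dt} \frac{\partial T}{\partial \dot{q}_\gamma} - \frac{\partial T}{\partial q_\gamma} = \ddot{q}_\alpha \sum_{i=1}^{n} m_i \frac{\partial x_i}{\partial q_\alpha} \frac{\partial x_i}{\partial q_\gamma} + \dot{q}_\alpha \dot{q}_\beta \sum_{i=1}^{n} m_i \frac{\partial^2 x_i}{\partial q_\alpha \partial q_\beta} \frac{\partial x_i}{\partial q_\gamma}. \tag{2.17}$$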


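However, the expression on the right-hand side of (2.17) can
also be obtained from (2.9) if we multiply its left-hand side by $\partial x_i / \partial q_\gamma$
and sum over i, since differentiating the velocities with respect to time gives
$\ddot{x}_i = \ddot{q}_\alpha \, \partial x_i / \partial q_\alpha + \dot{q}_\alpha \dot{q}_\beta \, \partial^2 x_i / \partial q_\alpha \partial q_\beta$.
For this reason, (2.17), in accordance with (2.10) and (2.11),
is equal to $-\partial U / \partial q_\gamma$. Thus we find

$$\frac{d}{dt} \frac{\partial T}{\partial \dot{q}_\gamma} - \frac{\partial T}{\partial q_\gamma} = -\frac{\partial U}{\partial q_\gamma}. \tag{2.18}$$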


In mechanics it is usual to consider interaction forces that are
independent of particle velocities. In this case U does not involve $\dot{q}_\alpha$,
so that (2.18) may be rewritten in the following form:

$$\frac{d}{dt} \frac{\partial (T - U)}{\partial \dot{q}_\gamma} - \frac{\partial (T - U)}{\partial q_\gamma} = 0. \tag{2.19}$$

The difference between the kinetic and potential energy is called
the Lagrangian function (or, simply, Lagrangian) and is denoted by
the letter L:

$$L = T - U. \tag{2.20}$$


Thus we have arrived at a system of ν equations with ν independent
quantities $q_\alpha$, the number of which is equal to the number of degrees
of freedom of the system:

$$\frac{d}{dt} \frac{\partial L}{\partial \dot{q}_\alpha} - \frac{\partial L}{\partial q_\alpha} = 0, \qquad 1 \le \alpha \le \nu. \tag{2.21}$$

These equations are called Lagrange’s equations. Naturally, in
(2.21) L is considered to be expressed solely in terms of $q_\alpha$ and $\dot{q}_\alpha$,
the Cartesian coordinates being excluded. It turns out that this
type of equation holds also in cases when the forces depend on the
velocities (see Sec. 21).*

The rules for forming Lagrange’s equations. Since the derivation
of equations (2.21) from Newton’s Second Law is not readily evident,
we will give the order of operations which, for a given system,
lead to the Lagrange equations.


* In this case, the Lagrangian function does not have the form of (2.20),
where U is a function of generalized coordinates only. However, the form of
equations (2.21) is still valid.




1) The Cartesian coordinates are expressed in terms of generalized
coordinates:

$$x_i = x_i (q_1, \ldots, q_\alpha, \ldots, q_\nu).$$

2) The Cartesian velocity components are expressed in terms of
generalized velocities:

$$\dot{x}_i = \frac{\partial x_i}{\partial q_\alpha} \dot{q}_\alpha.$$

3) The coordinates are substituted in the expression for potential
energy so that it is defined in relation to generalized coordinates:

$$U = U (q_1, \ldots, q_\alpha, \ldots, q_\nu).$$

4) The velocities are substituted in the expression for kinetic
energy

$$T = \frac{1}{2} \sum_{i=1}^{n} m_i \dot{x}_i^2,$$

which is now a function of $q_\alpha$ and $\dot{q}_\alpha$. It is essential that in generalized
coordinates, T is a function both of $q_\alpha$ and $\dot{q}_\alpha$.

5) The partial derivatives $\partial L / \partial \dot{q}_\alpha$ and $\partial L / \partial q_\alpha$ are found.

6) Lagrange’s equations (2.21) are formed according to the number
of degrees of freedom.

In the next section we will consider some examples in forming 
Lagrange’s equations. 
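
The rules above can be tried out numerically. The following is only a minimal sketch, assuming a system with one degree of freedom and a Lagrangian that does not depend explicitly on time; the function name `lagrange_residual`, the oscillator Lagrangian and the numerical values are illustrative assumptions and do not come from the text. The sketch approximates the left-hand side of (2.21) by central differences along a trial motion q(t); the result should be close to zero only when the trial motion actually satisfies Lagrange's equation.

```python
import math


def lagrange_residual(L, q, t, h=1e-3):
    """Estimate d/dt(dL/dq_dot) - dL/dq at time t along the trajectory q(t).

    L(q, q_dot) is the Lagrangian of a system with one degree of freedom;
    all derivatives are approximated by central differences of step h.
    """
    def q_dot(s):
        return (q(s + h) - q(s - h)) / (2 * h)

    def dL_dqdot(s):
        # Vary the velocity only; the coordinate is held constant.
        return (L(q(s), q_dot(s) + h) - L(q(s), q_dot(s) - h)) / (2 * h)

    def dL_dq(s):
        # Vary the coordinate only; the velocity is held constant.
        return (L(q(s) + h, q_dot(s)) - L(q(s) - h, q_dot(s))) / (2 * h)

    total_time_derivative = (dL_dqdot(t + h) - dL_dqdot(t - h)) / (2 * h)
    return total_time_derivative - dL_dq(t)


# Illustrative system (an assumption): L = m*q_dot**2/2 - k*q**2/2.
m, k = 2.0, 8.0
omega = math.sqrt(k / m)


def oscillator_lagrangian(q, q_dot):
    return 0.5 * m * q_dot**2 - 0.5 * k * q**2


def true_motion(t):
    # Satisfies m*q_ddot + k*q = 0.
    return math.cos(omega * t)


def wrong_motion(t):
    # Oscillates at the wrong frequency, so it is not a solution.
    return math.cos(1.5 * omega * t)


residual_true = lagrange_residual(oscillator_lagrangian, true_motion, 0.3)
residual_wrong = lagrange_residual(oscillator_lagrangian, wrong_motion, 0.3)
```

Here `residual_true` is small (limited by the finite-difference step), whereas `residual_wrong` is of order unity, which shows how (2.21) singles out the actual motion.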

Exercises 

1) Write down Lagrange’s equation, where the Lagrangian function has 
the form: 

L = —Vi — g* + gg • 

2) A point moves in a vertical plane along a given curve in a gravitational
field. The equation of the curve in parametric form is $x = x(a)$, $z = z(a)$. Write
down Lagrange’s equations.

The velocities are

$$\dot{x} = \frac{dx}{da} \dot{a} = x' \dot{a}, \qquad \dot{z} = \frac{dz}{da} \dot{a} = z' \dot{a}.$$

The Lagrangian has the form:

$$L = \frac{m}{2} (x'^2 + z'^2) \dot{a}^2 - m g z(a).$$

Lagrange’s equation is

$$\frac{d}{dt} \left[ m (x'^2 + z'^2) \dot{a} \right] - m \dot{a}^2 (x' x'' + z' z'') + m g z' = 0.$$




Sec. 3. Examples of Lagrange’s Equations

Central forces. Central forces is the name given to those whose
directions are along the lines joining the particles and which depend
only on the distances between them. Corresponding to such forces,
there is always a potential energy, U, dependent on these distances.
As an example, we consider the motion of a particle relative to a
fixed centre that attracts it according to Newton’s law. We shall
show how to find the potential energy in this case by proceeding
from the expression for a gravitational force.

Gravitational force is known to be inversely proportional to the
square of the distance between the particles and is directed along
the line joining them:

$$\mathbf{F} = -\frac{a}{r^2} \frac{\mathbf{r}}{r}. \tag{3.1}$$

Here a is the factor of proportionality which we will not define more
precisely at this point, r is the distance between the particles, and
$\mathbf{r}/r$ is a unit vector. The minus sign signifies that the particles attract
each other, so that the force is in the opposite direction to the radius
vector $\mathbf{r}$. According to (3.1), the attractive-force component along x
is equal to

$$F_x = -\frac{a}{r^2} \frac{x}{r} = -\frac{a x}{r^3}, \tag{3.2}$$

since x is a component of $\mathbf{r}$. But $r = \sqrt{x^2 + y^2 + z^2}$, so that

$$F_x = -\frac{a x}{(x^2 + y^2 + z^2)^{3/2}} = \frac{\partial}{\partial x} \frac{a}{\sqrt{x^2 + y^2 + z^2}}, \tag{3.3}$$

and similarly for the two other component forces. Comparing (3.3)
and (2.7), we see that in the given case

$$U = -\frac{a}{r}. \tag{3.4}$$

We note that the potential energy U is chosen here in such a way
that $U(\infty) = 0$ when the particles are separated by an infinite distance.
The choice of the arbitrary constant in the potential energy is called
its gauge. In this case it is convenient to choose this constant so that
the potential energy tends to zero at infinity.
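
The relation between (3.2) and (3.4) can be checked numerically. The sketch below is an illustration only: the function names, the test point and the value a = 1 are assumptions made for the example. It compares the force component (3.2) with minus the numerical derivative of the potential (3.4).

```python
import math


def potential(x, y, z, a=1.0):
    """Potential energy U = -a/r of (3.4), vanishing at infinity."""
    return -a / math.sqrt(x * x + y * y + z * z)


def force_x(x, y, z, a=1.0):
    """x-component of the attractive force, F_x = -a*x/r**3, as in (3.2)."""
    r = math.sqrt(x * x + y * y + z * z)
    return -a * x / r**3


def minus_dU_dx(x, y, z, a=1.0, h=1e-6):
    """Central-difference estimate of -dU/dx with y and z held constant."""
    return -(potential(x + h, y, z, a) - potential(x - h, y, z, a)) / (2 * h)


# Arbitrary test point with r = 3, so that F_x = -1/27 for a = 1.
fx_direct = force_x(1.0, 2.0, 2.0)
fx_from_potential = minus_dU_dx(1.0, 2.0, 2.0)
```

The two values agree to within the accuracy of the finite difference, as they must if U is the potential of the force.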

It is obvious that an expression similar to (3.4) is obtained for 
two electrically charged particles interacting in accordance with 
Coulomb’s law. 

Spherical coordinates. Formula (3.4) suggests that in this instance 
it is best to choose precisely r as a generalized coordinate. In other 
words, we must transform from Cartesian to spherical coordinates. 
The relationship between Cartesian and spherical coordinates is
shown in Fig. 1. The z-axis is called the polar axis of the spherical
coordinate system. The angle $\vartheta$ between the radius vector and the
polar axis is called the polar angle; it is complementary (to 90°)
to the “latitude.” Finally, the angle $\varphi$ is analogous to the “longitude”
and is called the azimuth. It measures the dihedral angle between
the plane zOx and the plane passing through the polar axis and the
given point.

Let us find the formulae for the transformation from Cartesian
to spherical coordinates. From Fig. 1 it is clear that

$$z = r \cos \vartheta. \tag{3.5}$$

The projection ρ of the radius vector onto the plane xOy is

$$\rho = r \sin \vartheta. \tag{3.6}$$

Whence,

$$x = \rho \cos \varphi = r \sin \vartheta \cos \varphi, \tag{3.7}$$

$$y = \rho \sin \varphi = r \sin \vartheta \sin \varphi. \tag{3.8}$$

We will now find an expression for the kinetic energy in spherical
coordinates. This can be done either by a simple geometrical
construction or by calculation according to the method of Sec. 2.


Fig. 2 


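Although the construction is simpler, let us first follow the computation
procedure in order to illustrate the general method. We have:

$$\dot{z} = \dot{r} \cos \vartheta - r \sin \vartheta \, \dot{\vartheta},$$

$$\dot{x} = \dot{r} \sin \vartheta \cos \varphi + r \cos \vartheta \cos \varphi \, \dot{\vartheta} - r \sin \vartheta \sin \varphi \, \dot{\varphi},$$

$$\dot{y} = \dot{r} \sin \vartheta \sin \varphi + r \cos \vartheta \sin \varphi \, \dot{\vartheta} + r \sin \vartheta \cos \varphi \, \dot{\varphi}.$$

Squaring these equations and adding, we obtain, after very simple
manipulations, the following:

$$T = \frac{m}{2} (\dot{x}^2 + \dot{y}^2 + \dot{z}^2) = \frac{m}{2} (\dot{r}^2 + r^2 \dot{\vartheta}^2 + r^2 \sin^2 \vartheta \, \dot{\varphi}^2). \tag{3.9}$$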


The same is clear from the construction shown in Fig. 2. An arbitrary
displacement of the point can be resolved into three mutually
perpendicular displacements: $dr$, $r \, d\vartheta$ and $\rho \, d\varphi = r \sin \vartheta \, d\varphi$. Whence

$$dl^2 = dr^2 + r^2 d\vartheta^2 + r^2 \sin^2 \vartheta \, d\varphi^2. \tag{3.10}$$

Since the square of the velocity is $v^2 = (dl/dt)^2$, (3.9) is obtained from
(3.10) simply by dividing by $(dt)^2$ and multiplying by $m/2$.

Hence, in spherical coordinates, the Lagrangian function is expressed
as

$$L = \frac{m}{2} (\dot{r}^2 + r^2 \sin^2 \vartheta \, \dot{\varphi}^2 + r^2 \dot{\vartheta}^2) - U(r). \tag{3.11}$$
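
As a numerical illustration of (3.9), the sketch below evaluates the Cartesian velocity components from the spherical coordinates and velocities and compares the two expressions for the kinetic energy. The function names and the sample values are assumptions chosen only for this example.

```python
import math


def cartesian_velocity(r, theta, phi, r_dot, theta_dot, phi_dot):
    """Velocity components obtained by differentiating (3.5), (3.7), (3.8)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    x_dot = r_dot * st * cp + r * ct * cp * theta_dot - r * st * sp * phi_dot
    y_dot = r_dot * st * sp + r * ct * sp * theta_dot + r * st * cp * phi_dot
    z_dot = r_dot * ct - r * st * theta_dot
    return x_dot, y_dot, z_dot


def kinetic_energy_cartesian(m, velocity):
    x_dot, y_dot, z_dot = velocity
    return 0.5 * m * (x_dot**2 + y_dot**2 + z_dot**2)


def kinetic_energy_spherical(m, r, theta, r_dot, theta_dot, phi_dot):
    """Right-hand side of (3.9)."""
    return 0.5 * m * (
        r_dot**2 + r**2 * theta_dot**2 + r**2 * math.sin(theta) ** 2 * phi_dot**2
    )


# Arbitrary sample state (an assumption, any values would do).
m, r, theta, phi = 1.5, 2.0, 0.7, 1.1
r_dot, theta_dot, phi_dot = 0.3, -0.4, 0.9

t_cartesian = kinetic_energy_cartesian(
    m, cartesian_velocity(r, theta, phi, r_dot, theta_dot, phi_dot)
)
t_spherical = kinetic_energy_spherical(m, r, theta, r_dot, theta_dot, phi_dot)
```

Both numbers coincide up to rounding, and the azimuth itself drops out of the result, in agreement with (3.9).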


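Now in order to write down Lagrange’s equations it is sufficient to
calculate the partial derivatives. We have:

$$\frac{\partial L}{\partial \dot{r}} = m \dot{r}, \qquad \frac{\partial L}{\partial \dot{\vartheta}} = m r^2 \dot{\vartheta}, \qquad \frac{\partial L}{\partial \dot{\varphi}} = m r^2 \sin^2 \vartheta \, \dot{\varphi};$$

$$\frac{\partial L}{\partial r} = m r \sin^2 \vartheta \, \dot{\varphi}^2 + m r \dot{\vartheta}^2 - \frac{\partial U}{\partial r},$$

$$\frac{\partial L}{\partial \vartheta} = m r^2 \sin \vartheta \cos \vartheta \, \dot{\varphi}^2, \qquad \frac{\partial L}{\partial \varphi} = 0.$$

These derivatives must be substituted into (2.21), which, however,
we will not now do since the motion we are considering actually reduces
to the plane case (see beginning of Sec. 5).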

Two-particle system. So far we have considered the centre of at¬ 
traction as stationary, which corresponds to the assumption of an 
infinitely large mass. In the motion of the earth around the sun, or 
of an electron in a nuclear field, the mass of the centre of attraction 
is indeed large compared with the mass of the attracted particle. But 
it may happen that both masses are similar or equal to each other (a 
binary star, a neutron-proton system, and the like). We shall show 
that the problem of the motion of two masses interacting only with 
one another can always be easily reduced to a problem of the motion 
of a single mass. 

Let the mass of the first particle be $m_1$ and of the second $m_2$. We
call the radius vectors of these particles, drawn from an arbitrary
origin, $\mathbf{r}_1$ and $\mathbf{r}_2$, respectively. The components of $\mathbf{r}_1$ are $x_1, y_1, z_1$;
the components of $\mathbf{r}_2$ are $x_2, y_2, z_2$. We now define the radius vector of
the centre of mass of these particles $\mathbf{R}$ by the following formula:

$$\mathbf{R} = \frac{m_1 \mathbf{r}_1 + m_2 \mathbf{r}_2}{m_1 + m_2}. \tag{3.12}$$


Synonymous terms for the “centre of mass” are the “centre of gravity” 
and the “centre of inertia.” 




In addition, let us introduce the radius vector of the relative position 
of the particles 

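$$\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2. \tag{3.13}$$

Let us now express the kinetic energy in terms of $\mathbf{R}$ and $\mathbf{r}$. From
(3.12) and (3.13) we have

$$\mathbf{r}_1 = \mathbf{R} + \frac{m_2 \mathbf{r}}{m_1 + m_2}, \tag{3.14}$$

$$\mathbf{r}_2 = \mathbf{R} - \frac{m_1 \mathbf{r}}{m_1 + m_2}. \tag{3.15}$$

The kinetic energy is equal to

$$T = \frac{m_1 \dot{\mathbf{r}}_1^2}{2} + \frac{m_2 \dot{\mathbf{r}}_2^2}{2}. \tag{3.16}$$

Differentiating (3.14) and (3.15) with respect to time and substituting
in (3.16), we obtain, after a simple rearrangement,

$$T = \frac{m_1 + m_2}{2} \dot{\mathbf{R}}^2 + \frac{m_1 m_2}{2 (m_1 + m_2)} \dot{\mathbf{r}}^2. \tag{3.17}$$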


If we introduce the Cartesian components of the vectors $\mathbf{R}$ (X, Y, Z)
and $\mathbf{r}$ (x, y, z), then we obtain an expression for the kinetic energy
in terms of Cartesian components of velocity.

Since no external forces act on the particles, the potential energy
can be a function only of their relative positions: $U = U(x, y, z)$.
Thus, the Lagrangian is

$$L = \frac{m_1 + m_2}{2} (\dot{X}^2 + \dot{Y}^2 + \dot{Z}^2) + \frac{m_1 m_2}{2 (m_1 + m_2)} (\dot{x}^2 + \dot{y}^2 + \dot{z}^2) - U(x, y, z).$$

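Transition to the centre-of-mass system. Let us write down Lagrange’s
equations for the coordinates of the centre of mass. We have

$$\frac{\partial L}{\partial \dot{X}} = (m_1 + m_2) \dot{X}, \qquad \frac{\partial L}{\partial \dot{Y}} = (m_1 + m_2) \dot{Y}, \qquad \frac{\partial L}{\partial \dot{Z}} = (m_1 + m_2) \dot{Z};$$

$$\frac{\partial L}{\partial X} = \frac{\partial L}{\partial Y} = \frac{\partial L}{\partial Z} = 0.$$

Hence, in accordance with (2.21)

$$\ddot{X} = \ddot{Y} = \ddot{Z} = 0.$$

These equations can be easily integrated:

$$X = \dot{X}_0 t + X_0, \qquad Y = \dot{Y}_0 t + Y_0, \qquad Z = \dot{Z}_0 t + Z_0, \tag{3.18}$$

where the letters with the index 0 signify the corresponding values
at the initial time.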




Combining the coordinate equations into one vector equation, we
obtain

$$\mathbf{R} = \dot{\mathbf{R}}_0 t + \mathbf{R}_0.$$

Thus, the centre of mass moves uniformly in a straight line quite 
independently of the relative motion of the particles. 

Reduced mass. If we now write down Lagrange’s equations for
relative motion in accordance with (2.21), the coordinates of the
centre of mass do not appear. It follows that the relative motion
occurs as if it were in accordance with the Lagrangian

$$L_{\text{rel}} = \frac{m_1 m_2}{2 (m_1 + m_2)} \dot{\mathbf{r}}^2 - U(r) \tag{3.19}$$

(where $\dot{\mathbf{r}}^2 = \dot{x}^2 + \dot{y}^2 + \dot{z}^2$), formed in exactly the same way as the
Lagrangian for a single mass m equal to

$$m = \frac{m_1 m_2}{m_1 + m_2}. \tag{3.20}$$

This mass is called the reduced mass.

The motion of the centre of mass does not affect the relative motion
of the masses. In particular, we can consider, simply, that the centre
of mass lies at the coordinate origin, $\mathbf{R} = 0$.

In the case of central forces (for example, Newtonian forces of attraction)
acting between the particles, the potential energy is simply
equal to $U(r)$ [this is taken into consideration in (3.19)], where
$r = \sqrt{x^2 + y^2 + z^2}$. Then, if we describe the relative motion in spherical
coordinates, the equations of motion will have the same form as for
a single particle moving relative to a fixed centre of attraction.

The centre of mass can now be considered as fixed, assuming $\mathbf{R} = 0$.
From this, in accordance with (3.14) and (3.15), we obtain the distance
of both masses from the centre of mass:

$$\mathbf{r}_1 = \frac{m_2 \mathbf{r}}{m_1 + m_2}, \qquad \mathbf{r}_2 = -\frac{m_1 \mathbf{r}}{m_1 + m_2}.$$

We see that if one mass is much smaller than the other, $m_2 \ll m_1$,
then $r_1 \ll r_2$, i.e., the centre of mass is close to the larger mass. This
is the case for a sun-planet system. At the same time the reduced
mass can also be written thus:

$$m = \frac{m_2}{1 + \dfrac{m_2}{m_1}}. \tag{3.21}$$


From here it can be seen that it is close to the smaller mass. That is 
why the motion of the earth around the sun can be approximately 
described as if the sun were stationary and the earth revolved about it 
with its own value of mass, independent of the mass of the sun. 
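
The separation (3.17) and the limiting behaviour of the reduced mass (3.20) can be illustrated with a short computation. This is a sketch under stated assumptions: the function names, the sample velocities and the mass ratio of roughly 333,000 used for a sun-earth-like pair are illustrative and are not taken from the text.

```python
def reduced_mass(m1, m2):
    """Reduced mass m = m1*m2/(m1 + m2), formula (3.20)."""
    return m1 * m2 / (m1 + m2)


def split_kinetic_energy(m1, m2, v1, v2):
    """Return (T_cm, T_rel), the two terms of (3.17), for velocities v1, v2."""
    total = m1 + m2
    # Velocity of the centre of mass and relative velocity r_dot = v1 - v2.
    v_cm = tuple((m1 * a + m2 * b) / total for a, b in zip(v1, v2))
    v_rel = tuple(a - b for a, b in zip(v1, v2))
    t_cm = 0.5 * total * sum(c * c for c in v_cm)
    t_rel = 0.5 * reduced_mass(m1, m2) * sum(c * c for c in v_rel)
    return t_cm, t_rel


def direct_kinetic_energy(m1, m2, v1, v2):
    """Kinetic energy written particle by particle, formula (3.16)."""
    return 0.5 * m1 * sum(c * c for c in v1) + 0.5 * m2 * sum(c * c for c in v2)


# Arbitrary sample data (assumptions for illustration).
m1, m2 = 3.0, 5.0
v1, v2 = (1.0, -2.0, 0.5), (0.3, 0.4, -1.2)
t_cm, t_rel = split_kinetic_energy(m1, m2, v1, v2)
t_direct = direct_kinetic_energy(m1, m2, v1, v2)

# Light body of unit mass bound to a body about 333,000 times heavier.
mu_light_heavy = reduced_mass(1.0, 333000.0)
```

The sum `t_cm + t_rel` reproduces `t_direct`, and `mu_light_heavy` differs from the smaller mass by only a few parts in a million.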



Simple and compound pendulums. In concluding this section we
shall derive the Lagrangian for simple and compound pendulums.

The simple plane pendulum is a mass suspended on a flat hinge at
a certain point by a weightless rod of length l. The hinge restricts the
swing of the pendulum. Let us assume that swinging occurs in the
plane of the paper (Fig. 3). It is clear that such a pendulum
has only one degree of freedom. We can take
the angle of deflection of the pendulum from the vertical
$\varphi$ as a generalized coordinate. Obviously the velocity
of particle m is equal to $l \dot{\varphi}$, so that the kinetic
energy is

$$T = \frac{m}{2} l^2 \dot{\varphi}^2.$$

The potential energy is determined by the height of
the mass above the mean position, $z = l (1 - \cos \varphi)$.

Whence, the Lagrangian for the pendulum is

$$L = \frac{m}{2} l^2 \dot{\varphi}^2 - m g l (1 - \cos \varphi). \tag{3.22}$$

A double pendulum can be described in the following way: in mass m
there is another hinge from which another pendulum, which is
forced to oscillate in the same plane (Fig. 4), is suspended. Let the
mass and length of the second pendulum be $m_1$ and $l_1$,
respectively, and its angle of deflection from the vertical,
$\psi$. The coordinates of the second particle are

$$x_1 = l \sin \varphi + l_1 \sin \psi,$$

$$z_1 = l (1 - \cos \varphi) + l_1 (1 - \cos \psi).$$

Whence we obtain its velocity components:

$$\dot{x}_1 = l \cos \varphi \, \dot{\varphi} + l_1 \cos \psi \, \dot{\psi},$$

$$\dot{z}_1 = l \sin \varphi \, \dot{\varphi} + l_1 \sin \psi \, \dot{\psi}.$$

Squaring and adding them we express the kinetic energy
of the second particle in terms of the generalized coordinates $\varphi$, $\psi$ and
the generalized velocities $\dot{\varphi}$, $\dot{\psi}$:

$$T_1 = \frac{m_1}{2} \left[ l^2 \dot{\varphi}^2 + l_1^2 \dot{\psi}^2 + 2 l l_1 \cos (\varphi - \psi) \, \dot{\varphi} \dot{\psi} \right].$$

The potential energy of the second particle is determined in terms of
$z_1$. Finally, we get an expression of the Lagrangian for a double pendulum
in the following form:

$$L = \frac{m + m_1}{2} l^2 \dot{\varphi}^2 + \frac{m_1}{2} l_1^2 \dot{\psi}^2 + m_1 l l_1 \cos (\varphi - \psi) \, \dot{\varphi} \dot{\psi} - (m + m_1) g l (1 - \cos \varphi) - m_1 g l_1 (1 - \cos \psi). \tag{3.23}$$

Fig. 4
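
The kinetic energy of the second particle can be checked by a direct computation. The sketch below is illustrative only: the function names and all numerical values are assumptions. It compares the kinetic energy built from the velocity components of the second particle with the closed expression containing $\cos(\varphi - \psi)$, and also evaluates the Lagrangian (3.23).

```python
import math


def second_bob_velocity(l, l1, phi, psi, phi_dot, psi_dot):
    """Velocity components of the second particle of the double pendulum."""
    x_dot = l * math.cos(phi) * phi_dot + l1 * math.cos(psi) * psi_dot
    z_dot = l * math.sin(phi) * phi_dot + l1 * math.sin(psi) * psi_dot
    return x_dot, z_dot


def second_bob_kinetic_energy(m1, l, l1, phi, psi, phi_dot, psi_dot):
    """Closed form with the cos(phi - psi) cross term."""
    return 0.5 * m1 * (
        l**2 * phi_dot**2
        + l1**2 * psi_dot**2
        + 2 * l * l1 * math.cos(phi - psi) * phi_dot * psi_dot
    )


def double_pendulum_lagrangian(m, m1, l, l1, g, phi, psi, phi_dot, psi_dot):
    """Lagrangian (3.23): kinetic energy of both masses minus potential energy."""
    kinetic = 0.5 * m * l**2 * phi_dot**2 + second_bob_kinetic_energy(
        m1, l, l1, phi, psi, phi_dot, psi_dot
    )
    potential = (m + m1) * g * l * (1 - math.cos(phi)) + m1 * g * l1 * (
        1 - math.cos(psi)
    )
    return kinetic - potential


# Arbitrary sample state (assumptions for illustration).
m, m1, l, l1, g = 1.0, 0.5, 1.2, 0.8, 9.81
phi, psi, phi_dot, psi_dot = 0.4, -0.9, 1.3, -0.7

x_dot, z_dot = second_bob_velocity(l, l1, phi, psi, phi_dot, psi_dot)
t1_from_components = 0.5 * m1 * (x_dot**2 + z_dot**2)
t1_closed_form = second_bob_kinetic_energy(m1, l, l1, phi, psi, phi_dot, psi_dot)
lagrangian_at_rest = double_pendulum_lagrangian(m, m1, l, l1, g, 0, 0, 0, 0)
```

The two values of the kinetic energy agree, and the Lagrangian vanishes for the pendulum hanging at rest, since both heights are measured from the lowest position.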




All the formulae for the Lagrangian functions (3.11), (3.22), and 
(3.23) will be required in the sections that follow. 


Exercises

1) Write down Lagrange’s equation for an elastically suspended pendulum.
For such a pendulum, the potential energy of the elastic force is
$U = \frac{k}{2} (l - l_0)^2$, where $l_0$ is the equivalent length of the unstretched
rod and k is a constant characteristic of its elasticity.

2) Write down the kinetic energy for a system of three particles with masses
$m_1$, $m_2$, and $m_3$ in the form of a sum of the kinetic energy of the centre of mass
and the kinetic energy of relative motion, using the following relative
coordinates:

P: “ fj, Pa “■ fj Tj. 


Sec. 4. Conservation Laws 


The problem of mechanics. If a mechanical system has ν degrees
of freedom, then its motion is described by ν Lagrangian equations.
Each of these equations is of the second order with respect to the time
derivatives $\ddot{q}_\alpha$ [see (2.17)]. From general theorems of analysis we
conclude that after integration of this system we obtain 2ν arbitrary
constants. The solution can be represented in the following form:

$$q_1 = q_1 (t; C_1, \ldots, C_{2\nu}),$$
$$q_2 = q_2 (t; C_1, \ldots, C_{2\nu}),$$
$$\ldots$$
$$q_\alpha = q_\alpha (t; C_1, \ldots, C_{2\nu}), \tag{4.1}$$
$$\ldots$$


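Differentiating these equations with respect to time, we obtain
expressions for the velocities:

$$\dot{q}_1 = \dot{q}_1 (t; C_1, \ldots, C_{2\nu}),$$
$$\dot{q}_2 = \dot{q}_2 (t; C_1, \ldots, C_{2\nu}),$$
$$\ldots$$
$$\dot{q}_\alpha = \dot{q}_\alpha (t; C_1, \ldots, C_{2\nu}), \tag{4.2}$$
$$\ldots$$

Suppose that equations (4.1) and (4.2) are solved with respect to the
constants $C_1, \ldots, C_{2\nu}$, so that these values are expressed in terms of t and
$q_1, \ldots, q_\nu$, $\dot{q}_1, \ldots, \dot{q}_\nu$. Then

$$C_1 = C_1 (t; q_1, q_2, \ldots, q_\nu; \dot{q}_1, \dot{q}_2, \ldots, \dot{q}_\nu),$$
$$C_2 = C_2 (t; q_1, q_2, \ldots, q_\nu; \dot{q}_1, \dot{q}_2, \ldots, \dot{q}_\nu),$$
$$\ldots$$
$$C_{2\nu} = C_{2\nu} (t; q_1, q_2, \ldots, q_\nu; \dot{q}_1, \dot{q}_2, \ldots, \dot{q}_\nu). \tag{4.3}$$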




From the equations (4.3) we see that in any mechanical system
described by ν second-order equations there must be 2ν functions of
generalized coordinates, velocities, and time, which remain constant
in the motion. These functions are called integrals of motion.

It is the main aim of mechanics to determine the integrals of
motion.

If the form of the function (4.3) is known for a given mechanical
system, then its numerical value can be determined from the initial
conditions, that is, according to the given values of generalized
coordinates and velocities at the initial instant.

In the preceding section we obtained the so-called centre-of-mass
integrals $\dot{\mathbf{R}}_0$ and $\mathbf{R}_0$ (3.18).

Naturally, Lagrange’s equations cannot be integrated in general
form for an arbitrary mechanical system. Therefore the problem of
determining the integrals of motion is usually very complicated. But
there are certain important integrals of motion which are given directly
by the form of the Lagrangian. We shall consider these integrals in
the present section.

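Energy. The quantity

$$\mathcal{E} = \dot{q}_\alpha \frac{\partial L}{\partial \dot{q}_\alpha} - L \tag{4.4}$$

is called the total energy of a system. Let us calculate its total derivative
with respect to time.

We have

$$\frac{d\mathcal{E}}{dt} = \ddot{q}_\alpha \frac{\partial L}{\partial \dot{q}_\alpha} + \dot{q}_\alpha \frac{d}{dt} \frac{\partial L}{\partial \dot{q}_\alpha} - \frac{\partial L}{\partial q_\alpha} \dot{q}_\alpha - \frac{\partial L}{\partial \dot{q}_\alpha} \ddot{q}_\alpha - \frac{\partial L}{\partial t}.$$

The last three terms on the right-hand side are the derivative of the
Lagrangian L, which, in the general form, depends on $q$, $\dot{q}$ and t.
In determining $\mathcal{E}$ and its derivative we have made use of the summation
convention.* The quantity $\frac{d}{dt} \frac{\partial L}{\partial \dot{q}_\alpha}$ in Lagrange’s equations
can be replaced by $\frac{\partial L}{\partial q_\alpha}$. The result is, therefore,

$$\frac{d\mathcal{E}}{dt} = -\frac{\partial L}{\partial t}. \tag{4.5}$$

Consequently, if the Lagrangian does not depend explicitly on time
($\partial L / \partial t = 0$), the energy is an integral of motion. Let us find the
conditions for which time does not appear explicitly in the Lagrangian.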

If the formulae expressing the generalized coordinates q in terms of
Cartesian coordinates x do not contain time explicitly (which corresponds
to constant, time-independent constraints) then the transformation
from x to q cannot introduce time into the Lagrangian.
Besides, in order that $\partial L / \partial t = 0$, the external forces must also be
independent of time. When these two conditions (constant constraints
and constant external forces) are fulfilled, the energy is an integral
of motion. To take a particular case, when no external forces act
on the system its energy is conserved. Such a system is called closed.

* α is summed from 1 to ν (see Sec. 2).

When frictional forces act inside a closed system, the energy of
macroscopic motion is transformed into the energy of molecular
microscopic motion. The total energy is conserved in this case, too,
though the Lagrangian, which involves only the generalized coordinates
of macroscopic motion of the system, no longer gives a complete
description of the motion of the system. The mechanical energy of
only macroscopic motion, determined by means of such a Lagrangian,
is not an integral. We will not consider such a system in this section.

Let us now consider the case when our definition of energy (4.4)
coincides with another definition, $\mathcal{E} = T + U$. Let the kinetic energy
be a homogeneous quadratic function of generalized velocities, as
expressed in equation (2.13). For this it is necessary that the constraints
should not involve time explicitly, otherwise equation (2.6)
would have the form

$$\dot{x}_i = \frac{\partial x_i}{\partial q_\alpha} \dot{q}_\alpha + \frac{\partial x_i}{\partial t},$$

where the partial derivative of the function (2.2) with respect to time
is taken for all $q_\alpha$ held constant. But then terms containing $\dot{q}_\alpha$ in the first
degree would appear in the expression for T.

Since we assume that the potential energy U does not depend on
velocity [see (2.18) and (2.19)], then

$$\frac{\partial L}{\partial \dot{q}_\alpha} = \frac{\partial T}{\partial \dot{q}_\alpha},$$

and the energy is

$$\mathcal{E} = \dot{q}_\alpha \frac{\partial T}{\partial \dot{q}_\alpha} - L. \tag{4.6}$$

But according to Euler’s theorem, the sum of partial derivatives of a
homogeneous quadratic function, multiplied by the corresponding
variables, is equal to twice the value of the function (this can easily
be verified from the function of two variables $a x^2 + 2 b x y + c y^2$).
Thus,

$$\mathcal{E} = 2T - T + U = T + U, \tag{4.7}$$

that is, the total energy is equal to the sum of the potential energy
and the kinetic energy, in agreement with the elementary definition.
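
The verification of Euler’s theorem suggested above for the function of two variables can be carried out explicitly. The following is a small sketch; the function names and the numerical coefficients are assumptions chosen for illustration.

```python
def quadratic_form(x, y, a, b, c):
    """Homogeneous quadratic function f = a*x**2 + 2*b*x*y + c*y**2."""
    return a * x**2 + 2 * b * x * y + c * y**2


def euler_sum(x, y, a, b, c):
    """x*df/dx + y*df/dy, with the partial derivatives written analytically."""
    df_dx = 2 * a * x + 2 * b * y
    df_dy = 2 * b * x + 2 * c * y
    return x * df_dx + y * df_dy


# Arbitrary integer sample values, so that the comparison is exact.
x, y, a, b, c = 3, -2, 5, 7, 11
f_value = quadratic_form(x, y, a, b, c)
euler_value = euler_sum(x, y, a, b, c)
```

For these values `euler_value` equals `2 * f_value`, which is the statement used in passing from (4.6) to (4.7).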

We note that the definition (4.4) is more useful and general also
in the case when the Lagrangian is not represented as the difference
$L = T - U$. Thus, in electrodynamics (Sec. 16) L contains a linear
term in velocity. For the energy integral to exist, only one condition
is necessary and sufficient: $\partial L / \partial t = 0$ (if, of course, there are no
frictional forces).

The application of the energy integral to systems with one degree
of freedom. The energy integral allows us, straightway, to reduce
problems of the motion of systems with one degree of freedom to those
of quadrature. Thus, in the pendulum problem considered in the
previous section we can, with the aid of (4.7), write down the energy
integral directly:

$$\mathcal{E} = \frac{m}{2} l^2 \dot{\varphi}^2 + m g l (1 - \cos \varphi). \tag{4.8}$$

The value $\mathcal{E}$ is determinable from the initial conditions. For example,
let the pendulum initially be deflected at an angle $\varphi_0$ and released
without any initial speed. It follows that $\dot{\varphi}_0 = 0$. Whence

$$\mathcal{E} = m g l (1 - \cos \varphi_0). \tag{4.9}$$

Substituting this in (4.8), we have

$$m g l (\cos \varphi - \cos \varphi_0) = \frac{m}{2} l^2 \dot{\varphi}^2. \tag{4.10}$$

From this, the relationship between the deflection angle and time
is determined by the quadrature (written here for the first descent
from $\varphi_0$, during which φ decreases)

$$t = \sqrt{\frac{l}{2g}} \int_{\varphi}^{\varphi_0} \frac{d\varphi'}{\sqrt{\cos \varphi' - \cos \varphi_0}}. \tag{4.11}$$

It is essential that the law governing the oscillation of a pendulum
depends only on the value of the ratio l/g and is independent of the
mass. The integral in (4.11) cannot be evaluated in terms of elementary
functions.
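
Although the quadrature cannot be expressed through elementary functions, it is easily evaluated numerically. The sketch below is not the author's procedure; it assumes the standard substitution $\sin(\varphi/2) = k \sin \xi$ with $k = \sin(\varphi_0/2)$, which turns four times the quarter-period integral into $T = 4 \sqrt{l/g} \int_0^{\pi/2} d\xi / \sqrt{1 - k^2 \sin^2 \xi}$, and then applies Simpson's rule. The function name and the sample values of l and g are assumptions made for illustration.

```python
import math


def pendulum_period(l, g, phi0, n=1000):
    """Period of a plane pendulum released from rest at the angle phi0 < pi.

    Uses T = 4*sqrt(l/g) * integral_0^{pi/2} d(xi)/sqrt(1 - k^2 sin^2 xi),
    k = sin(phi0/2), evaluated by Simpson's rule with n (even) intervals.
    """
    k = math.sin(phi0 / 2)

    def integrand(xi):
        return 1.0 / math.sqrt(1.0 - (k * math.sin(xi)) ** 2)

    step = (math.pi / 2) / n
    total = integrand(0.0) + integrand(math.pi / 2)
    for j in range(1, n):
        weight = 4 if j % 2 else 2
        total += weight * integrand(j * step)
    return 4 * math.sqrt(l / g) * total * step / 3


# Illustrative values (assumptions): l in metres, g in m/s^2.
l, g = 1.0, 9.81
small_angle_period = 2 * math.pi * math.sqrt(l / g)
period_small = pendulum_period(l, g, 1e-3)
period_large = pendulum_period(l, g, math.pi / 2)
```

The mass does not enter the function at all, in line with the remark above; for small amplitudes the result approaches $2\pi\sqrt{l/g}$, and it grows with the amplitude.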

A system in which mechanical energy is conserved is sometimes 
called a conservative system. Thus, the energy integral permits reducing 
to quadrature the problem of the motion of a conservative system 
with one degree of freedom. 

In a system with several degrees of freedom the energy integral
allows us to reduce the order of the system of differential equations
and, in this way, to simplify the problem of integration.

Generalized momentum. We shall now consider other integrals of
motion which can be found directly with the aid of the Lagrangian.
To do this we shall take advantage of the following, quite obvious,
consequence of Lagrange’s equations. If some coordinate $q_\alpha$ does not
appear explicitly in the Lagrangian ($\partial L / \partial q_\alpha = 0$), then in accordance
with Lagrange’s equations

$$\frac{d}{dt} \frac{\partial L}{\partial \dot{q}_\alpha} = 0. \tag{4.12}$$

But then

$$p_\alpha \equiv \frac{\partial L}{\partial \dot{q}_\alpha} = \text{const}, \tag{4.13}$$

i.e., it is an integral of motion. The quantity $p_\alpha$ is called the generalized
momentum corresponding to a generalized coordinate with index α.
This definition includes the momentum in Cartesian coordinates:
$p_x = m v_x = \partial L / \partial \dot{x}$. Summarizing, if a certain generalized coordinate does
not appear explicitly in the Lagrangian, the generalized momentum
corresponding to it is an integral of motion, i.e., it remains constant
for the motion.

In the preceding section we saw that the coordinates X, Y, Z of
the centre of gravity of a system of two particles, not subject to the
action of external forces, do not appear in the Lagrangian. From this
it is evident that

$$(m_1 + m_2) \dot{X} = P_x, \qquad (m_1 + m_2) \dot{Y} = P_y, \qquad (m_1 + m_2) \dot{Z} = P_z \tag{4.14}$$

are constants of motion.

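The momentum of a system of particles. The same thing is readily
shown also for a system of N particles. Indeed, for N particles we can
introduce the concept of the centre of mass and the velocity of the
centre of mass by means of the equations:

$$\mathbf{R} = \frac{\sum_i m_i \mathbf{r}_i}{\sum_i m_i}, \tag{4.15a}$$

$$\dot{\mathbf{R}} = \frac{\sum_i m_i \dot{\mathbf{r}}_i}{\sum_i m_i}. \tag{4.15b}$$

The velocity of the ith mass relative to the centre of mass is

$$\dot{\mathbf{r}}'_i = \dot{\mathbf{r}}_i - \dot{\mathbf{R}} \tag{4.16}$$

(by the theorem of the addition of velocities). The kinetic energy of
the system of particles is

$$T = \frac{1}{2} \sum_{i=1}^{N} m_i \dot{\mathbf{r}}_i^2 = \frac{1}{2} \sum_{i=1}^{N} m_i (\dot{\mathbf{R}} + \dot{\mathbf{r}}'_i)^2 = \frac{1}{2} \dot{\mathbf{R}}^2 \sum_{i=1}^{N} m_i + \dot{\mathbf{R}} \sum_{i=1}^{N} m_i \dot{\mathbf{r}}'_i + \frac{1}{2} \sum_{i=1}^{N} m_i \dot{\mathbf{r}}'^2_i. \tag{4.17}$$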




However, from (4.15b) and (4.16) it can immediately be seen that
$\sum_{i=1}^{N} m_i \dot{\mathbf{r}}'_i = 0$, by the definition of $\mathbf{r}'_i$ and $\mathbf{R}$. Therefore, the kinetic
energy of a system of particles can be divided into a sum of two terms:
the kinetic energy of motion of the centre of mass

$$T_0 = \frac{1}{2} \dot{\mathbf{R}}^2 \sum_{i=1}^{N} m_i$$

and the kinetic energy of motion of the masses relative to the centre
of mass

$$T_{\text{rel}} = \frac{1}{2} \sum_{i=1}^{N} m_i \dot{\mathbf{r}}'^2_i.$$

The vectors $\dot{\mathbf{r}}'_i$ are not independent; as has been shown, they are
governed by one vector equation $\sum_{i=1}^{N} m_i \dot{\mathbf{r}}'_i = 0$. Consequently, they can
be expressed in terms of N−1 independent quantities by determining
the relative positions of the ith and first masses. For this reason the
kinetic energy of N particles relative to the centre of mass is, in general,
the kinetic energy of their relative motion, and is expressed only in
terms of the relative velocities $\dot{\mathbf{r}}_i - \dot{\mathbf{r}}_1$. By definition, no external forces
can act on the masses in a closed system, and the interaction forces
inside the system can be determined only by the relative positions
$\mathbf{r}_i - \mathbf{r}_1$.

Thus, only $\dot{\mathbf{R}}$ appears in the Lagrangian, and $\mathbf{R}$ does not. Therefore,
the overall momentum is conserved:

$$\mathbf{P} = \frac{\partial L}{\partial \dot{\mathbf{R}}} = \text{const}. \tag{4.18}$$


Equality (4.18), which contains a derivative with respect to a vector, 
should be understood as an abbreviated form of three equations: 


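$$P_x = \frac{\partial L}{\partial \dot{X}}, \qquad P_y = \frac{\partial L}{\partial \dot{Y}}, \qquad P_z = \frac{\partial L}{\partial \dot{Z}}.$$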


For more detail about vector derivatives see Sec. 11. 

We have seen that the overall momentum of a mechanical system 
not subject to any external force is an integral of motion. It is impor¬ 
tant that it is what is known as an additive integral of motion, i.e., 
it is obtained by adding the momenta of separate particles. 




It may be noted that the momentum integral exists for any system 
in which only internal forces are operative, even though they may be 
frictional forces causing a conversion of mechanical energy into 
heat. 

If we integrate (4.18) with respect to time once again, the result will 
be the centre-of-mass integral similar to (3.18). This will be the so- 
called second integral (for it contains two constants); it contains only 
coordinates but not velocities. (3.18) and (4.11) are also second 
integrals. 

Properties of the vector product. The angular momentum of a particle
is defined as

$$\mathbf{M} = [\mathbf{r} \mathbf{p}]. \tag{4.19}$$

Here the brackets denote the vector product of the radius vector of
the particle and its momentum. We know that (4.19) takes the place
of three equations,

$$M_x = y p_z - z p_y, \qquad M_y = z p_x - x p_z, \qquad M_z = x p_y - y p_x,$$

for the Cartesian components of the vector $\mathbf{M}$.

Recall the geometrical definition for a vector product. We construct
a parallelogram on the vectors $\mathbf{r}$ and $\mathbf{p}$. Then $[\mathbf{r} \mathbf{p}]$ denotes a vector
numerically equal to the area of the parallelogram with direction
perpendicular to its plane. In order to specify the direction of $[\mathbf{r} \mathbf{p}]$
uniquely, we must agree on the way of tracing the parallelogram contour.
We shall agree always to traverse the contour beginning with the
first factor (in this case beginning with $\mathbf{r}$). Then that side of the plane
will be considered positive for which the direction is anticlockwise.
The vector $[\mathbf{r} \mathbf{p}]$ is along the normal to the positive side of the plane.
In still another way, if we rotate a corkscrew from $\mathbf{r}$ to $\mathbf{p}$, then it will
be displaced in the direction of $[\mathbf{r} \mathbf{p}]$. The direction of traverse changes
if we interchange the positions of $\mathbf{r}$ and $\mathbf{p}$. Therefore, unlike a conventional
product, the sign of the vector product changes if we interchange
the factors. This can also be seen from the definition of Cartesian
components of angular momentum.

The area of the parallelogram is $r p \sin \alpha$, where α is the angle between
$\mathbf{r}$ and $\mathbf{p}$. The product $r \sin \alpha$ is the length of a perpendicular drawn
from the origin of the coordinate system to the tangent to the trajectory
whose direction is the same as $\mathbf{p}$. This length is sometimes called
the “arm” of the moment.

The vector product possesses a distributive property, i.e.,

$$[\mathbf{a}, \mathbf{b} + \mathbf{c}] = [\mathbf{a} \mathbf{b}] + [\mathbf{a} \mathbf{c}].$$

Hence, a binomial product is calculated in the usual way, but the
order of the factors is taken into account:

$$[\mathbf{a} + \mathbf{d}, \mathbf{b} + \mathbf{c}] = [\mathbf{a} \mathbf{b}] + [\mathbf{a} \mathbf{c}] + [\mathbf{d} \mathbf{b}] + [\mathbf{d} \mathbf{c}].$$
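
The properties of the vector product listed above are easy to confirm on concrete vectors. The following sketch uses integer components so that the comparisons are exact; the function names and the sample vectors are assumptions made for the example.

```python
def cross(a, b):
    """Vector product [a b] in Cartesian components."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)


def add(a, b):
    """Component-wise sum of two vectors."""
    return tuple(p + q for p, q in zip(a, b))


def negate(a):
    return tuple(-p for p in a)


# Arbitrary integer sample vectors.
a, b, c, d = (1, 2, 3), (-4, 0, 5), (2, -1, 7), (0, 3, -2)

unit_z = cross((1, 0, 0), (0, 1, 0))            # corkscrew from x to y moves along z
antisymmetric = cross(a, b) == negate(cross(b, a))
distributive = cross(a, add(b, c)) == add(cross(a, b), cross(a, c))
binomial = cross(add(a, d), add(b, c)) == add(
    add(cross(a, b), cross(a, c)), add(cross(d, b), cross(d, c))
)
```

All three flags come out true, and `unit_z` is the unit vector along z, in agreement with the corkscrew rule.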




The angular momentum of a particle system. The angular momentum
of a system of particles is defined as the sum of the angular momenta
of all the particles taken separately. In doing so we must, of course,
take the radius vectors related to a coordinate origin common to all
the particles:

$$\mathbf{M} = \sum_{i=1}^{N} [\mathbf{r}_i \mathbf{p}_i]. \tag{4.20}$$

We shall show that the angular momentum of a system can be
separated into the angular momentum of the relative motion of the
particles and the angular momentum of the system as a whole, similar
to the way that it was done for the kinetic energy. To do this we must
represent the radius vector of each particle as the sum of the radius
vector of its position relative to the centre of mass and the radius
vector of the centre of mass; we must expand the expression for the
particle velocities in the same way. Then, the angular momentum can
be written in the form

$$\mathbf{M} = \sum_{i=1}^{N} [\mathbf{R} + \mathbf{r}'_i, \; m_i (\dot{\mathbf{R}} + \dot{\mathbf{r}}'_i)] = \sum_{i=1}^{N} m_i [\mathbf{R} \dot{\mathbf{R}}] + \sum_{i=1}^{N} m_i [\mathbf{R} \dot{\mathbf{r}}'_i] + \sum_{i=1}^{N} m_i [\mathbf{r}'_i \dot{\mathbf{R}}] + \sum_{i=1}^{N} [\mathbf{r}'_i \mathbf{p}'_i].$$

In the second and third sums, we can make use of the distributive
property of a vector product and introduce the summation sign
inside the product sign. However, both these sums are equal to zero,
by definition of the centre of mass. This was used in (4.17) for velocities.
Thus, the angular momentum is indeed equal to the sum of the
angular momenta of the centre of mass ($\mathbf{M}_0$) and the relative motion
($\mathbf{M}'$):

$$\mathbf{M} = [\mathbf{R} \mathbf{P}] + \sum_i [\mathbf{r}'_i \mathbf{p}'_i] = \mathbf{M}_0 + \mathbf{M}'. \tag{4.21}$$
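
The separation (4.21) can be verified on a small set of particles. The sketch below is only an illustration; the helper names and the masses, positions and velocities of the three sample particles are assumptions and carry no physical significance.

```python
import math


def cross(a, b):
    """Vector product [a b] in Cartesian components."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def vector_sum(vectors):
    """Component-wise sum of a list of 3-vectors."""
    return tuple(sum(components) for components in zip(*vectors))


def scale(s, a):
    return tuple(s * p for p in a)


def subtract(a, b):
    return tuple(p - q for p, q in zip(a, b))


def angular_momentum_parts(particles):
    """Return (M, M0, M_rel) for a list of (mass, position, velocity)."""
    total_mass = sum(m for m, _, _ in particles)
    R = scale(1 / total_mass, vector_sum([scale(m, r) for m, r, _ in particles]))
    V = scale(1 / total_mass, vector_sum([scale(m, v) for m, _, v in particles]))
    P = scale(total_mass, V)
    M = vector_sum([cross(r, scale(m, v)) for m, r, v in particles])
    M0 = cross(R, P)
    # Positions and momenta relative to the centre of mass.
    M_rel = vector_sum(
        [cross(subtract(r, R), scale(m, subtract(v, V))) for m, r, v in particles]
    )
    return M, M0, M_rel


# Three sample particles: (mass, position, velocity).
particles = [
    (1.0, (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)),
    (2.0, (0.0, 1.0, 1.0), (1.0, 0.0, -1.0)),
    (3.0, (-1.0, 2.0, 0.5), (0.5, -1.0, 2.0)),
]
M, M0, M_rel = angular_momentum_parts(particles)
```

Component by component, `M` equals the sum of `M0` and `M_rel` up to rounding errors, as (4.21) requires.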

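Let us perform these transformations for the special case of a
system of two particles. We substitute $\mathbf{r}_1$ and $\mathbf{r}_2$ expressed [from
(3.14) and (3.15)] in terms of $\mathbf{r}$ and $\mathbf{R}$. This gives

$$\mathbf{M} = [\mathbf{r}_1 \mathbf{p}_1] + [\mathbf{r}_2 \mathbf{p}_2] = [\mathbf{R}, \mathbf{p}_1 + \mathbf{p}_2] + \left( \frac{m_2}{m_1 + m_2} [\mathbf{r} \mathbf{p}_1] - \frac{m_1}{m_1 + m_2} [\mathbf{r} \mathbf{p}_2] \right).$$

Further, we replace $\mathbf{p}_1$ by $m_1 \dot{\mathbf{r}}_1$, $\mathbf{p}_2$ by $m_2 \dot{\mathbf{r}}_2$ and $\mathbf{p}_1 + \mathbf{p}_2$ by $\mathbf{P}$, after
which the angular momentum reduces to the required form:

$$\mathbf{M} = [\mathbf{R} \mathbf{P}] + \frac{m_1 m_2}{m_1 + m_2} [\mathbf{r} \dot{\mathbf{r}}] = [\mathbf{R} \mathbf{P}] + [\mathbf{r} \mathbf{p}]. \tag{4.22}$$

Here $\frac{m_1 m_2}{m_1 + m_2} \dot{\mathbf{r}} = m \dot{\mathbf{r}} = \mathbf{p}$ is the momentum of relative motion.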

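We shall now show that the determination of angular momentum
of relative motion does not depend on the choice of the origin. Indeed,
if we displace the origin, then all the quantities $\mathbf{r}_i'$ change by the same
amount: $\mathbf{r}_i' = \mathbf{r}_i'' + \mathbf{a}$.

Accordingly, angular momentum for relative motion will be

$$\mathbf{M}' = \sum_{i=1}^{N} [\mathbf{r}_i'\mathbf{p}_i'] = \sum_{i=1}^{N} [\mathbf{r}_i''\mathbf{p}_i'] + \Bigl[\mathbf{a}, \sum_{i=1}^{N}\mathbf{p}_i'\Bigr] = \sum_{i=1}^{N} [\mathbf{r}_i''\mathbf{p}_i'],$$

because

$$\sum_{i=1}^{N}\mathbf{p}_i' = 0.$$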

Thus, the determination of angular momentum for relative motion 
does not depend on the choice of the origin of the coordinate 
system. 
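The separation (4.21) and the independence of the relative angular momentum from the origin can be verified numerically. This is a sketch under assumed sample data; the three-particle configuration and all function names are hypothetical and chosen only for illustration.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def add(*vs):
    return tuple(sum(c) for c in zip(*vs))


def scale(k, v):
    return tuple(k * x for x in v)


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def total_momentum_moment(masses, positions, velocities):
    """M = sum of [r_i p_i] about the current origin."""
    terms = [cross(r, scale(m, v)) for m, r, v in zip(masses, positions, velocities)]
    return add(*terms)


def split_momentum_moment(masses, positions, velocities):
    """Return (M0, M_rel): centre-of-mass part [RP] and relative part."""
    total_mass = sum(masses)
    R = scale(1.0 / total_mass, add(*[scale(m, r) for m, r in zip(masses, positions)]))
    V = scale(1.0 / total_mass, add(*[scale(m, v) for m, v in zip(masses, velocities)]))
    M0 = cross(R, scale(total_mass, V))
    rel_terms = [cross(sub(r, R), scale(m, sub(v, V)))
                 for m, r, v in zip(masses, positions, velocities)]
    return M0, add(*rel_terms)


# Assumed sample system of three particles.
masses = [1.0, 2.0, 3.0]
positions = [(1.0, 0.0, 0.0), (0.0, 2.0, 1.0), (-1.0, 1.0, 3.0)]
velocities = [(0.0, 1.0, 0.0), (1.0, 0.0, -1.0), (0.5, 0.5, 2.0)]

M_total = total_momentum_moment(masses, positions, velocities)
M0, M_rel = split_momentum_moment(masses, positions, velocities)

# Displace the origin by an arbitrary vector: M_rel should not change.
shift = (5.0, -3.0, 2.0)
shifted = [sub(r, shift) for r in positions]
_, M_rel_shifted = split_momentum_moment(masses, shifted, velocities)
```

Only the centre-of-mass part `M0` depends on where the origin is placed; `M_rel` is the same for both origins.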

Conservation of angular momentum. We shall now show that the
angular momentum of a system of particles not acted upon by any
external forces is an integral of motion.

Let us begin with the angular momentum of the system as a whole.
Its time derivative is equal to zero:

$$\frac{d\mathbf{M}_0}{dt} = [\dot{\mathbf{R}}\mathbf{P}] + [\mathbf{R}\dot{\mathbf{P}}] = 0,$$

because $\dot{\mathbf{P}} = 0$ for any system not acted upon by external forces, and
$\dot{\mathbf{R}}$ is in the direction of $\mathbf{P}$, so that the vector product $[\dot{\mathbf{R}}\mathbf{P}] = 0$.

We shall now prove that angular momentum is conserved for
relative motion. The total derivative with respect to time is

$$\frac{d\mathbf{M}'}{dt} = \sum_{i=1}^{N} [\dot{\mathbf{r}}_i'\mathbf{p}_i'] + \sum_{i=1}^{N} [\mathbf{r}_i'\dot{\mathbf{p}}_i']. \tag{4.23}$$

Here, the first term on the right-hand side is equal to zero since $\dot{\mathbf{r}}_i'$
is in the direction of $\mathbf{p}_i'$. We consider the second term. Let us choose
the origin to be coincident with any particle, for example, the first.
As a result, $\mathbf{M}'$, as we have already seen, does not change. The potential
energy can only depend on the differences $\mathbf{r}_1 - \mathbf{r}_2, \ldots, \mathbf{r}_1 - \mathbf{r}_k, \ldots$



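The other differences are expressed in terms of these, for example,
$\mathbf{r}_k - \mathbf{r}_l = (\mathbf{r}_1 - \mathbf{r}_l) - (\mathbf{r}_1 - \mathbf{r}_k)$. We introduce the abbreviations

$$\boldsymbol{\rho}_1 = \mathbf{r}_1 - \mathbf{r}_2, \quad \boldsymbol{\rho}_2 = \mathbf{r}_1 - \mathbf{r}_3, \quad \ldots, \quad \boldsymbol{\rho}_{k-1} = \mathbf{r}_1 - \mathbf{r}_k, \quad \ldots$$

Then the derivatives $\frac{\partial U}{\partial \mathbf{r}_1}, \ldots, \frac{\partial U}{\partial \mathbf{r}_k}, \ldots$ will be expressed in terms
of the variables $\boldsymbol{\rho}_1, \ldots, \boldsymbol{\rho}_{k-1}, \ldots$ as follows:

$$\frac{\partial U}{\partial \mathbf{r}_1} = \sum_{k=1}^{N-1} \frac{\partial U}{\partial \boldsymbol{\rho}_k}, \qquad \frac{\partial U}{\partial \mathbf{r}_k} = -\frac{\partial U}{\partial \boldsymbol{\rho}_{k-1}} \quad (k = 2, \ldots, N).$$

Substituting these derivatives, together with $\dot{\mathbf{p}}_k' = -\frac{\partial U}{\partial \mathbf{r}_k}$, in (4.23)
we obtain

$$\frac{d\mathbf{M}'}{dt} = -\sum_{k=1}^{N} \Bigl[\mathbf{r}_k', \frac{\partial U}{\partial \mathbf{r}_k}\Bigr] = -\sum_{k=1}^{N-1} \Bigl[\boldsymbol{\rho}_k, \frac{\partial U}{\partial \boldsymbol{\rho}_k}\Bigr]. \tag{4.24}$$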

In this expression, only the relative coordinates $\boldsymbol{\rho}_1, \ldots, \boldsymbol{\rho}_{N-1}$
remain. We shall now show that, for a closed system, the right-hand
side is identically equal to zero. The potential energy is a scalar function
of coordinates. Hence, it can depend only on the scalar arguments
$\rho_k^2$, $\rho_l^2$, $(\boldsymbol{\rho}_k\boldsymbol{\rho}_l)$, totally irrespective of whether the initial expression
was a function only of the absolute values $|\mathbf{r}_i - \mathbf{r}_k|$, or whether it
also involved scalar products of the form $(\mathbf{r}_i - \mathbf{r}_k, \mathbf{r}_l - \mathbf{r}_n)$. An essential
point is that the system is closed (in accordance with the definition,
see page 32), and the forces in it are completely defined by the relative
positions of the points and by nothing else. Therefore, the potential
energy depends only on the quantities $\mathbf{r}_i - \mathbf{r}_k$, and only in scalar
combinations $(\mathbf{r}_i - \mathbf{r}_k, \mathbf{r}_l - \mathbf{r}_n)$ (in particular, the subscripts $i, l$; $k, n$
can even be the same; then the scalar product becomes the square
of the distance between the particles $i$, $k$).

To summarize, the potential energy $U$ depends on the following
arguments:

$$U = U\,[\rho_1^2, \rho_2^2, \ldots, \rho_k^2, \ldots, \rho_{N-1}^2;\; (\boldsymbol{\rho}_1\boldsymbol{\rho}_2), \ldots, (\boldsymbol{\rho}_k\boldsymbol{\rho}_l), \ldots].$$




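In order to save space we will, in future, perform the operation for
two vectors, though this operation can be directly generalized to any
arbitrary number. We obtain

$$\frac{\partial U}{\partial \boldsymbol{\rho}_1} = \frac{\partial U}{\partial(\rho_1^2)}\frac{\partial(\rho_1^2)}{\partial \boldsymbol{\rho}_1} + \frac{\partial U}{\partial(\boldsymbol{\rho}_1\boldsymbol{\rho}_2)}\frac{\partial(\boldsymbol{\rho}_1\boldsymbol{\rho}_2)}{\partial \boldsymbol{\rho}_1},$$

$$\frac{\partial U}{\partial \boldsymbol{\rho}_2} = \frac{\partial U}{\partial(\rho_2^2)}\frac{\partial(\rho_2^2)}{\partial \boldsymbol{\rho}_2} + \frac{\partial U}{\partial(\boldsymbol{\rho}_1\boldsymbol{\rho}_2)}\frac{\partial(\boldsymbol{\rho}_1\boldsymbol{\rho}_2)}{\partial \boldsymbol{\rho}_2}.$$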

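The partial derivatives of the scalar quantities $\rho_1^2$, $(\boldsymbol{\rho}_1\boldsymbol{\rho}_2)$ with respect
to the vector arguments are in the given case easily evaluated. Thus,

$$\frac{\partial(\rho_1^2)}{\partial \boldsymbol{\rho}_1} = 2\boldsymbol{\rho}_1, \qquad \frac{\partial(\boldsymbol{\rho}_1\boldsymbol{\rho}_2)}{\partial \boldsymbol{\rho}_1} = \boldsymbol{\rho}_2.$$

Each of these equations is a shortened form of three equations referring
to the components (the components of $\boldsymbol{\rho}_i$ are $\xi_i$, $\eta_i$, $\zeta_i$):

$$\frac{\partial}{\partial \xi_1}(\boldsymbol{\rho}_1\boldsymbol{\rho}_2) = \frac{\partial}{\partial \xi_1}(\xi_1\xi_2 + \eta_1\eta_2 + \zeta_1\zeta_2) = \xi_2; \qquad \frac{\partial}{\partial \eta_1}(\boldsymbol{\rho}_1\boldsymbol{\rho}_2) = \eta_2; \qquad \frac{\partial}{\partial \zeta_1}(\boldsymbol{\rho}_1\boldsymbol{\rho}_2) = \zeta_2.$$

Hence,

$$\frac{\partial U}{\partial \boldsymbol{\rho}_1} = 2\boldsymbol{\rho}_1\frac{\partial U}{\partial(\rho_1^2)} + \boldsymbol{\rho}_2\frac{\partial U}{\partial(\boldsymbol{\rho}_1\boldsymbol{\rho}_2)}, \qquad \frac{\partial U}{\partial \boldsymbol{\rho}_2} = 2\boldsymbol{\rho}_2\frac{\partial U}{\partial(\rho_2^2)} + \boldsymbol{\rho}_1\frac{\partial U}{\partial(\boldsymbol{\rho}_1\boldsymbol{\rho}_2)}. \tag{4.25}$$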


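Substituting (4.25) into (4.24) for the case of the two variables, we
obtain

$$\frac{d\mathbf{M}'}{dt} = -2[\boldsymbol{\rho}_1\boldsymbol{\rho}_1]\frac{\partial U}{\partial(\rho_1^2)} - 2[\boldsymbol{\rho}_2\boldsymbol{\rho}_2]\frac{\partial U}{\partial(\rho_2^2)} - \bigl([\boldsymbol{\rho}_1\boldsymbol{\rho}_2] + [\boldsymbol{\rho}_2\boldsymbol{\rho}_1]\bigr)\frac{\partial U}{\partial(\boldsymbol{\rho}_1\boldsymbol{\rho}_2)}.$$

But the sign of a vector product depends on the order of the factors:
$[\boldsymbol{\rho}_1\boldsymbol{\rho}_2] = -[\boldsymbol{\rho}_2\boldsymbol{\rho}_1]$. Hence it can also be seen that $[\boldsymbol{\rho}_1\boldsymbol{\rho}_1] = -[\boldsymbol{\rho}_1\boldsymbol{\rho}_1] = 0$
and $[\boldsymbol{\rho}_2\boldsymbol{\rho}_2] = 0$. Therefore, $\frac{d\mathbf{M}'}{dt} = 0$, as stated.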
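The argument above can be illustrated numerically: for a potential built only from the scalars $\rho_1^2$, $\rho_2^2$, $(\boldsymbol{\rho}_1\boldsymbol{\rho}_2)$, the sum of the vector products $[\boldsymbol{\rho}_k, \partial U/\partial\boldsymbol{\rho}_k]$ vanishes, while a term that singles out a fixed direction in space breaks this. The sketch below uses central-difference gradients; the particular form of `U_closed`, the added external term, and the sample vectors are assumptions made only for the demonstration.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def U_closed(r1, r2):
    """Assumed potential depending only on rho1^2, rho2^2 and (rho1 rho2)."""
    return -1.0 / dot(r1, r1) ** 0.5 + 0.5 * dot(r2, r2) + 0.3 * dot(r1, r2) ** 2


def U_external(r1, r2):
    """Same potential plus a term that picks out the z direction (not closed)."""
    return U_closed(r1, r2) + 0.7 * r1[2]


def gradient(f, r1, r2, which, h=1e-6):
    """Central-difference gradient of f with respect to r1 (which=0) or r2 (which=1)."""
    g = []
    for i in range(3):
        plus, minus = [list(r1), list(r2)], [list(r1), list(r2)]
        plus[which][i] += h
        minus[which][i] -= h
        g.append((f(plus[0], plus[1]) - f(minus[0], minus[1])) / (2.0 * h))
    return tuple(g)


def momentum_rate(f, r1, r2):
    """-( [rho1, dU/drho1] + [rho2, dU/drho2] ), the two-vector form of (4.24)."""
    t1 = cross(r1, gradient(f, r1, r2, 0))
    t2 = cross(r2, gradient(f, r1, r2, 1))
    return tuple(-(x + y) for x, y in zip(t1, t2))


rho1, rho2 = (1.0, 0.5, -0.3), (-0.4, 1.2, 0.8)
rate_closed = momentum_rate(U_closed, rho1, rho2)
rate_external = momentum_rate(U_external, rho1, rho2)
```

The first rate is zero to within the accuracy of the finite differences; the second is not, since the extra term depends on orientation in space.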

An integral like the angular momentum can also be formed when
the forces are determined not only by the relative position of the particles
but also by their relative velocities. This is the case, for example,
in a system of elementary currents interacting in accordance with the
Biot-Savart law.

Additive integrals of motion for closed systems. We have thus shown
that a closed system has the following first integrals of motion:
energy, three components of the momentum vector and three components
of the angular-momentum vector. Momentum and angular
momentum are always additive, while energy is additive only for the
noninteracting parts of a system.

All the other integrals of motion are found in a much more complicated
fashion and depend on the specific form of the system (in the
sense that one cannot give a general rule for their definition).




Exercises

Describe the motion of a point moving along a cycloid in a gravitational
field.

The equation of the cycloid in parametric form is

$$z = -R\cos s, \qquad x = Rs + R\sin s.$$

The kinetic energy of the point is

$$T = \frac{m}{2}(\dot{x}^2 + \dot{z}^2) = 2mR^2\cos^2\frac{s}{2}\,\dot{s}^2.$$

The potential energy is $U = -mgR\cos s$.

The total-energy integral is

$$\mathcal{E} = 2mR^2\cos^2\frac{s}{2}\,\dot{s}^2 - mgR\cos s = \text{const}.$$

The value $\mathcal{E}$ can be determined on the condition that the velocity $\dot{s}$ is equal
to zero when the deflection is maximum, $s = s_0$; the particle moves along the
cycloid from that position. Hence,

$$\mathcal{E} = -mgR\cos s_0.$$

After separating the variables and integrating, we obtain

$$t = \sqrt{\frac{R}{g}}\int \frac{\cos\frac{s}{2}\,ds}{\sqrt{\sin^2\frac{s_0}{2} - \sin^2\frac{s}{2}}}.$$

Calling $\sin\frac{s}{2} = u$, we rewrite the integral in the form:

$$t = 2\sqrt{\frac{R}{g}}\int_0^u \frac{du}{\sqrt{u_0^2 - u^2}} = 2\sqrt{\frac{R}{g}}\arcsin\frac{u}{u_0}.$$

In order to find the period of the motion, we must take the integral between
the limits $-u_0$ and $+u_0$ and double the result. This corresponds to the oscillation
of the particle within the limits $s = -s_0$ and $s = s_0$.

Thus the total period of oscillation is equal to $4\pi\sqrt{R/g}$.

Hence, as long as the particle moves on the cycloid, the period of its oscillation
does not depend on the oscillation amplitude (Huygens’ cycloidal pendulum).
The period of oscillation of an ordinary pendulum, describing an arc of a circle,
is known, in the general case, to depend on the amplitude [see (4.11)].
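The amplitude independence of the period can be checked by direct integration of the motion. The sketch below assumes the equation of motion $\ddot{s} = \tfrac{1}{2}\tan\tfrac{s}{2}\,(\dot{s}^2 - g/R)$, which follows from the Lagrangian $T - U$ written above, and integrates it with a standard fourth-order Runge-Kutta step; the function name, step size and sample amplitudes are illustrative choices.

```python
import math


def cycloid_period(s0, R=1.0, g=9.81, dt=1e-4):
    """Period of oscillation on the cycloid, started at rest from s = s0 (0 < s0 < pi)."""
    def acc(s, w):
        # Equation of motion derived from L = 2 m R^2 cos^2(s/2) w^2 + m g R cos s.
        return 0.5 * math.tan(0.5 * s) * (w * w - g / R)

    s, w, t = s0, 0.0, 0.0
    while True:
        # One RK4 step for the pair (s, w = ds/dt).
        k1s, k1w = w, acc(s, w)
        k2s, k2w = w + 0.5 * dt * k1w, acc(s + 0.5 * dt * k1s, w + 0.5 * dt * k1w)
        k3s, k3w = w + 0.5 * dt * k2w, acc(s + 0.5 * dt * k2s, w + 0.5 * dt * k2w)
        k4s, k4w = w + dt * k3w, acc(s + dt * k3s, w + dt * k3w)
        s_new = s + dt * (k1s + 2 * k2s + 2 * k3s + k4s) / 6.0
        w_new = w + dt * (k1w + 2 * k2w + 2 * k3w + k4w) / 6.0
        if s_new <= 0.0:
            # Lowest point reached: a quarter of the period has elapsed.
            quarter = t + dt * s / (s - s_new)
            return 4.0 * quarter
        s, w, t = s_new, w_new, t + dt


expected = 4.0 * math.pi * math.sqrt(1.0 / 9.81)
small = cycloid_period(0.3)
large = cycloid_period(2.5)
```

Both `small` and `large` should agree with `expected` to within the integration error, whereas for a circular pendulum the two amplitudes would give noticeably different periods.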


Sec. 5. Motion in a Central Field

The angular-momentum integral. We shall now consider the motion
of two bodies in a frame of reference fixed in the centre of mass. If
the origin coincides with the centre of mass, then $\mathbf{R} = 0$. As was shown
in the preceding section, the angular momentum of relative motion
is conserved in any closed system; specifically, it is also conserved in
a system of two particles. If the radius vector of the relative position
of the particles is $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$, and the momentum of relative motion is

$$\mathbf{p} = \frac{m_1 m_2}{m_1 + m_2}\mathbf{v} = m\mathbf{v}, \tag{5.1}$$

then the angular-momentum integral is reduced to the simple form:

$$\mathbf{M} = [\mathbf{r}\mathbf{p}] = \text{const}. \tag{5.2}$$

It follows that the velocity vector and the relative position vector 
all the time remain perpendicular to the constant vector M; in other 
words, the motion takes place in a plane perpendicular to M (Fig. 5). 

When transforming to a spherical coordinate system, it is advisable to
choose the polar axis along $\mathbf{M}$. Then the motion will take place in the
plane $xy$, or $\sin\vartheta = 1$, $\dot{\vartheta} = 0$.

The potential energy can depend only on the absolute value $r$, because
this is the only scalar quantity which can be derived solely from the
vector $\mathbf{r}$. In accordance with (3.9), the Lagrangian for plane motion,
with $\dot{\vartheta} = 0$, $\sin\vartheta = 1$, is

$$L = \frac{m}{2}(\dot{r}^2 + r^2\dot{\varphi}^2) - U(r), \tag{5.3}$$

where $m$ is the reduced mass.

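Angular momentum as a generalized momentum. We shall now show
that $M_z = M$ is nothing other than $p_\varphi = \frac{\partial L}{\partial \dot{\varphi}}$, i.e., the component of angular
momentum along the polar axis is a generalized momentum, provided
the angle of rotation $\varphi$ around that axis is a generalized coordinate.
Indeed, in accordance with (5.2), the angular momentum $M$ is

$$M = M_z = xp_y - yp_x = mr\cos\varphi\,(\dot{r}\sin\varphi + r\cos\varphi\,\dot{\varphi}) - mr\sin\varphi\,(\dot{r}\cos\varphi - r\sin\varphi\,\dot{\varphi}) = mr^2\dot{\varphi}\,(\cos^2\varphi + \sin^2\varphi) = mr^2\dot{\varphi}.$$

On the other hand, differentiating $L$ with respect to $\dot{\varphi}$ we see that

$$p_\varphi = \frac{\partial L}{\partial \dot{\varphi}} = mr^2\dot{\varphi} = M_z. \tag{5.4}$$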


The expression for angular momentum in polar coordinates can also
be derived geometrically (Fig. 6). In unit time, the radius vector $\mathbf{r}$
moves to the position shown in Fig. 6 by the dashed line. Twice the
area of the sector OAB, multiplied by the mass $m$, is by definition
equal to the angular momentum [cf. (5.2)]. But, to a first approximation
the area of the sector is equal to the product of the modulus
$r$ and $\frac{h}{2}$. The height $h$ is proportional to the angle of rotation in unit
time and to the radius itself, so that the area of the sector is $\frac{1}{2}r^2\dot{\varphi}$.
Thus, a doubled area multiplied by the mass $m$ is indeed equal to the
angular momentum.

The quantity $\frac{1}{2}r^2\dot{\varphi}$ is the so-called areal velocity, or the area
described by the radius vector in unit time. The law of conservation
of angular momentum, if interpreted geometrically,
expresses constancy of areal velocity (Kepler’s Second
Law).
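Constancy of the areal velocity can be observed in a simple numerical orbit. The sketch below integrates planar motion in an assumed potential $U = a/r$ with the velocity-Verlet scheme (a numerical method chosen here for convenience, not taken from the text) and records the area of the triangle swept by the radius vector during each time step; all names and initial values are illustrative.

```python
import math


def swept_areas(m=1.0, a=-1.0, pos=(1.0, 0.0), vel=(0.0, 0.8), dt=1e-3, steps=5000):
    """Return (areas, Mz_start, Mz_end) for planar motion in U = a / r."""
    def accel(px, py):
        r = math.hypot(px, py)
        # Force along the radius vector: F = a * r_vec / r^3 (attractive for a < 0).
        return a * px / (m * r ** 3), a * py / (m * r ** 3)

    x, y = pos
    vx, vy = vel
    ax, ay = accel(x, y)
    mz_start = m * (x * vy - y * vx)
    areas = []
    for _ in range(steps):
        nx = x + vx * dt + 0.5 * ax * dt * dt
        ny = y + vy * dt + 0.5 * ay * dt * dt
        areas.append(0.5 * (x * ny - y * nx))   # triangle swept in this step
        nax, nay = accel(nx, ny)
        vx += 0.5 * (ax + nax) * dt
        vy += 0.5 * (ay + nay) * dt
        x, y, ax, ay = nx, ny, nax, nay
    mz_end = m * (x * vy - y * vx)
    return areas, mz_start, mz_end


areas, mz_start, mz_end = swept_areas()
mean_area = sum(areas) / len(areas)
```

Each swept area equals $M\,dt/(2m)$, whether the particle is near the centre and fast or far away and slow.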

The central field. If one of two masses is very much
greater than the other, the centre of mass coincides with
the larger mass (see Sec. 3). In this case, the particle
with the smaller mass moves in the given central field of
the heavy particle. The potential energy depends only on
the distance between the particles and does not depend
on the angle $\varphi$. Then, in accordance with (4.12), $p_\varphi = M_z$
is the integral of motion. However, since one particle is
considered at rest, the origin should be chosen coincident
with that particle and not with some arbitrary point, as
in the case of the relative motion of two particles. In
the case of motion in a central field, angular momentum is conserved
only relative to the centre.

Elimination of the azimuthal velocity component. The angular-momentum
integral permits us to reduce the problem of two-particle
motion, or the problem of motion of a single particle in a central field,
to quadrature. To do this we must express $\dot{\varphi}$ in terms of angular momentum
and thus get rid of the superfluous variable, inasmuch as the
angle $\varphi$ itself does not appear in the Lagrangian. Such variables,
which do not appear in $L$, are termed cyclic.

In accordance with (4.7), we first of all have the energy integral

$$\mathcal{E} = \frac{m}{2}(\dot{r}^2 + r^2\dot{\varphi}^2) + U(r). \tag{5.5}$$

Fig. 6


Eliminating $\dot{\varphi}$ with the aid of (5.4) we obtain

$$\mathcal{E} = \frac{m\dot{r}^2}{2} + \frac{M^2}{2mr^2} + U(r). \tag{5.6}$$


This first-order differential equation (in $r$) is later on reduced to
quadrature. Before writing down the quadrature, let us examine it
graphically.

The dependence of the form of the path on the sign of the energy.
For such an examination, we must make certain assumptions about
the variation of potential energy.

From (2.7), force is connected with potential energy by the relation

$$F = -\frac{\partial U}{\partial r}, \qquad U(r) = \int_r F\,dr.$$

The upper limit in the integral can be chosen arbitrarily. If $F(r)$
tends to zero at infinity faster than $\frac{1}{r}$, then the integral $\int_r^\infty F\,dr$ is
convergent. Then we can put $U(r) = \int_r^\infty F\,dr$, or $U(\infty) = 0$. In other
words, the potential energy is considered zero at infinity. The choice
of an arbitrary constant in the expression for potential energy is called
its gauge.

In addition we shall consider that at $r \to 0$, $U(r)$ does not tend to
infinity more rapidly than $\frac{1}{r}$, as, for example, for Newtonian attraction,
where

$$U = \int_r^\infty \frac{a}{r^2}\,dr = \frac{a}{r}, \qquad a < 0.$$

Let us now write (5.6) as

$$\frac{m\dot{r}^2}{2} = \mathcal{E} - \frac{M^2}{2mr^2} - U(r). \tag{5.7}$$


The left-hand side of this equation is essentially positive. For $r = \infty$
the last two terms in (5.7) tend to zero. Thus, for the particles to be
able to recede from each other an infinite distance, the total energy
must be positive when the gauge of the potential energy satisfies
$U(\infty) = 0$.

Given a definite form of $U$, we can now plot the curve of the function

$$U_M(r) = \frac{M^2}{2mr^2} + U(r). \tag{5.8}$$


The index $M$ in $U_M$ denotes that the potential energy includes the
“centrifugal” energy $\frac{M^2}{2mr^2}$. The derivative of this quantity with
respect to $r$, taken with the opposite sign, is equal to $\frac{M^2}{mr^3}$. If we put
$M = mr^2\dot{\varphi}$, the result will be the usual expression for “centrifugal
force.” However, henceforth, we will call a mechanical quantity of
different origin the “centrifugal force” (see Sec. 8). Let $U < 0$ and
monotonic. Since $U(\infty) = 0$, we see that $U(r)$ is an increasing function
of $r$. It follows that the force has a negative sign (since $F = -\frac{\partial U}{\partial r}$),
i.e., it is an attractive force. Let us assume, in addition, that at infinity
$|U(r)| > \frac{M^2}{2mr^2}$. Let us summarize the assumptions that we have
made concerning $U_M(r)$:

1) $U_M(r)$ is positively infinite at zero, where the centrifugal term
is predominant.

2) at infinity, where $U(r)$ predominates, $U_M(r)$ tends to zero from
a negative direction.




Consequently, the curve $U_M(r)$ has the form shown in Fig. 7,
since we must go through a minimum in order to pass from a decrease
for small values of $r$ to an increase at large values of $r$.*

In this figure we can also plot the total energy $\mathcal{E}$. But since the total
energy is conserved, the curve of $\mathcal{E}$ must have the form of a horizontal
straight line lying above or below the abscissa, depending on the sign
of $\mathcal{E}$.

For positive values of energy, the line $\mathcal{E} = \text{const}$ lies above the curve
$U_M(r)$ everywhere to the right of point A. Hence the difference
$\mathcal{E} - U_M(r)$ to the right of A is positive. The particles can approach each
other from infinity and recede from each other to infinity. Such motion
is termed infinite. As we will see later in this section, in the
case of Newtonian attraction, we obtain hyperbolic orbits.

For $\mathcal{E} < 0$, but higher than the minimum of the curve $U_M(r)$, the difference
$\mathcal{E} - U_M(r) = \frac{m\dot{r}^2}{2}$ remains positive only between the points B and B'
(finite motion). Thus, between these values of the radius there is included
a physically possible region of motion, to which there correspond
elliptical orbits in the case of Newtonian attraction. In the case of
planetary motion around the sun, point B is called the perihelion and
point B' the aphelion.

For $\mathcal{E} = 0$ the motion is infinite (parabolic motion).

If $U(r) > 0$, which corresponds to repulsion, the curve $U_M(r)$ does
not possess a minimum. Then finite motion is clearly impossible.
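The graphical classification above can be turned into a short calculation for the particular case $U = a/r$ (an assumed form, with $a < 0$ for attraction and $a > 0$ for repulsion). The turning points are the roots of $\mathcal{E} = M^2/(2mr^2) + a/r$, i.e. of the quadratic $\mathcal{E}r^2 - ar - M^2/(2m) = 0$. The function name and return format are hypothetical, and $M \neq 0$ is assumed.

```python
import math


def classify_motion(E, M, m, a):
    """Classify radial motion for U(r) = a / r with nonzero angular momentum M.

    Returns (kind, turning_points), where turning_points is (r_min, r_max).
    """
    disc = a * a + 2.0 * E * M * M / m
    if E > 0.0:
        # One positive root: the distance of closest approach.
        return "infinite (hyperbolic)", ((a + math.sqrt(disc)) / (2.0 * E), math.inf)
    if E == 0.0:
        if a >= 0.0:
            return "impossible", ()
        return "infinite (parabolic)", (-M * M / (2.0 * m * a), math.inf)
    # E < 0: possible only for attraction and above the minimum of U_M.
    if a >= 0.0 or disc < 0.0:
        return "impossible", ()
    root = math.sqrt(disc)
    return "finite (elliptical)", ((a + root) / (2.0 * E), (a - root) / (2.0 * E))


bound_kind, (r_peri, r_apo) = classify_motion(E=-0.25, M=1.0, m=1.0, a=-1.0)
free_kind, _ = classify_motion(E=0.5, M=1.0, m=1.0, a=1.0)
```

For the bound example the two turning points play the role of the points B and B' in the figure; an energy below the minimum of $U_M$, or a negative energy with repulsion, admits no motion at all.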

Falling towards the centre. For Newtonian attraction, $U(0)$ tends
to infinity like $-1/r$. If we suppose that $U(0)$ tends to $-\infty$ more
rapidly than $-1/r^2$, then the curve $U_M(r)$ is negative for all $r$ close
to zero. Then, from (5.7), $\dot{r}^2$ is positive for infinitesimal values of $r$
and tends to infinity when $r$ tends to zero. If $\dot{r} < 0$ initially, then $\dot{r}$
does not change sign and the particles now begin to move towards
collision. In Newtonian attraction this is possible only when the
particles are directed towards each other; then “the arm” of the
angular momentum is equal to zero and, hence, the angular momentum
itself is obviously equal to zero, too, so that $U_M(r) = U(r)$. If an
initial “arm” exists within the distance of minimum approach, then
the angular momentum $M = mv\rho \neq 0$ ($\rho$ is the “arm”) and the motion
can in no way become radial.

* If $|U(r)| < \frac{M^2}{2mr^2}$ at infinity, then the curve approaches zero on the
positive side, and there can be a further small maximum after the minimum.
This form of $U_M(r)$ applies to the atoms of elements with medium and large
atomic weights.

In the case of Newtonian or Coulomb attraction for a particle with
angular momentum not equal to zero, there always exists a distance
$r_0$ for which $\frac{M^2}{2mr_0^2}$ becomes greater than $\mathcal{E} - U(r_0)$. This distance
determines the perihelion for the approaching particles.

However, if $U(r)$ tends to infinity more rapidly than $-\frac{1}{r^2}$, then,
as $r \to 0$, there will be no point at which $U_M(r)$ becomes zero. In place
of a hyperbolic orbit, as in the case of Newtonian attraction, a spiral
curve leading to one particle falling on the other results. The turns
of the spiral diminish, but the speed of rotation increases so that the
angular momentum is conserved, as it should be in any central field.
But the “centrifugal” repulsive force turns out to be less than the
forces of attraction, and the particles approach each other indefinitely.

Of course, the result is the same if the energy is negative (for example,
part of the energy is transferred to some third particle, which then
recedes). In the case of attractive forces increasing more rapidly than
$1/r^3$, no counterpart to elliptical orbits exists.

If three bodies in motion are subject to Newtonian attraction, two
of them may collide even if, initially, the motion of these particles
was not purely radial. Indeed, in the case of three bodies, only the
total angular momentum is conserved, and this does not exclude the
collision of two particles.

Reducing to quadrature. Let us now find the equation of the trajectory
in general form. To do this we must, in (5.6), change from
differentiation with respect to time to differentiation with respect to $\varphi$.
Using (5.4) we have

$$\dot{r} = \frac{dr}{d\varphi}\dot{\varphi} = \frac{M}{mr^2}\frac{dr}{d\varphi}. \tag{5.9}$$

Separating the variables and passing to $\varphi$ in (5.6) gives

$$\varphi = \int_{r_0}^{r} \frac{M}{mr^2}\,\frac{dr}{\sqrt{\frac{2}{m}\bigl(\mathcal{E} - U(r)\bigr) - \frac{M^2}{m^2r^2}}}. \tag{5.10}$$

Here, the lower limit of the integral corresponds to $\varphi = 0$. If we calculate
$\varphi$ with respect to the perihelion, then the corresponding value
$r = r_0$ can be easily found by noting that the radial component of
velocity $\dot{r}$ changes sign at perihelion ($r$ has a minimum, and so $dr = 0$).
From this we find the equation for the particle distance at perihelion:

$$\mathcal{E} = \frac{M^2}{2mr_0^2} + U(r_0). \tag{5.11}$$


Kepler’s problem. Thus, the problem of motion in a central field
is reduced to quadrature. The fact that the integral sometimes cannot
be solved in terms of elementary functions is no longer so essential.
Indeed, the solution of the problem in terms of definite integrals
contains all the initial data explicitly; if these data are known, the
integration can be performed in some way or other.

But, naturally, if the integral is expressed in the form of a well-known
function, the solution can be more easily examined in the general
form. In this sense an explicit solution is of particular interest.

Such a solution can be found in only a few cases. One of these is
the case of a central force diminishing inversely as the square of
distance. The forces of Newtonian attraction between point masses
(or bodies possessing spherical symmetry) are subject to this
law.

It will be recalled that the laws of motion in this case were found
empirically by Kepler before Newton deduced them from the equations
of mechanics and the law of gravitation. It was the agreement of
Newton’s results with Kepler’s laws that was the first verification of
the truth of Newtonian mechanics. The problem of the motion of a
particle in a field of force diminishing inversely with the square of the
distance from some fixed point is called Kepler’s problem. The problem
of the motion of two bodies with arbitrary masses always reduces
to the problem of a single body when passing to a frame of reference
fixed in the centre of mass.

The expression “Kepler’s problem” can also be applied to Coulomb
forces acting between point charges. These can either be forces of
attraction or repulsion. In all cases we shall write $U = \frac{a}{r}$, where $a < 0$
for attractive forces and $a > 0$ for repulsive forces.

If we replace $\frac{1}{r}$ in (5.10) by a new variable $x$, the integral in the
Kepler problem is reduced to the form

$$\varphi = -\int_{x_0}^{x} \frac{dx}{\sqrt{\frac{2m\mathcal{E}}{M^2} - \frac{2amx}{M^2} - x^2}} = \arccos\frac{x + \frac{am}{M^2}}{\sqrt{\frac{a^2m^2}{M^4} + \frac{2m\mathcal{E}}{M^2}}}\;\Bigg|_{x_0}^{x}.$$

At the lower limit, the expression inside the arc cos sign is equal
to unity [as will be seen from (5.11)], since the lower limit was chosen
on the condition that $dr = 0$. But $\arccos 1 = 0$. Rearranging the result
of integration and reverting to $r$, we obtain, after simple manipulations,

$$r = \frac{M^2/m}{-a + \sqrt{a^2 + \frac{2\mathcal{E}M^2}{m}}\,\cos\varphi}. \tag{5.12}$$




(5.12) represents the standard equation for a conic section, the
eccentricity being equal to $\sqrt{1 + \frac{2\mathcal{E}M^2}{ma^2}}$. As long as this expression
is less than unity, the denominator in (5.12) cannot become zero,
because $\cos\varphi \le 1$. But this is true for $-\frac{ma^2}{2M^2} < \mathcal{E} < 0$. Thus, when
$\mathcal{E} < 0$, the result is elliptical orbits. For this it is necessary that $a < 0$,
i.e., that there be an attraction, otherwise (5.12) would lead to $r < 0$,
which is senseless.

For $\mathcal{E} > 0$ the eccentricity is greater than unity and the denominator
in (5.12) becomes zero for a certain $\varphi = \varphi_\infty$. Thus, the orbit goes to
infinity (a hyperbola). The direction of the asymptote is obtained by
putting $r = \infty$ in (5.12). This requires that

$$\cos\varphi_\infty = \frac{a}{\sqrt{a^2 + \frac{2\mathcal{E}M^2}{m}}}.$$

The angle between the asymptotes is equal to $2\varphi_\infty$, when the particles
repel each other, and to $2(\pi - \varphi_\infty)$, when the particles attract. An
example of a trajectory, when the forces are repulsive, is shown in
Fig. 8, Sec. 6.
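A conic-section orbit of this kind can be checked against the energy integral (5.6). The sketch below assumes the orbit in the form $r(\varphi) = (M^2/m)\,/\,\bigl(-a + \sqrt{a^2 + 2\mathcal{E}M^2/m}\,\cos\varphi\bigr)$ for $U = a/r$, computes $\dot{r}$ from a numerical derivative $dr/d\varphi$ via (5.9), and recovers the energy at several points of the orbit. Function names and sample parameters are illustrative.

```python
import math


def kepler_r(phi, E, M, m, a):
    """Orbit r(phi) for U = a / r, with phi measured from perihelion."""
    root = math.sqrt(a * a + 2.0 * E * M * M / m)
    return (M * M / m) / (-a + root * math.cos(phi))


def eccentricity(E, M, m, a):
    return math.sqrt(1.0 + 2.0 * E * M * M / (m * a * a))


def asymptote_angle(E, M, m, a):
    """phi_infinity for E > 0, where the denominator of r(phi) vanishes."""
    return math.acos(a / math.sqrt(a * a + 2.0 * E * M * M / m))


def energy_on_orbit(phi, E, M, m, a, h=1e-6):
    """Recompute the energy (5.6) at angle phi from the orbit shape alone."""
    r = kepler_r(phi, E, M, m, a)
    dr_dphi = (kepler_r(phi + h, E, M, m, a) - kepler_r(phi - h, E, M, m, a)) / (2.0 * h)
    r_dot = dr_dphi * M / (m * r * r)          # relation (5.9)
    return 0.5 * m * r_dot ** 2 + M * M / (2.0 * m * r * r) + a / r


# Bound orbit under attraction, and an unbound orbit under repulsion.
ellipse = dict(E=-0.25, M=1.0, m=1.0, a=-1.0)
hyperbola = dict(E=0.5, M=1.0, m=1.0, a=1.0)

ellipse_energies = [energy_on_orbit(phi, **ellipse) for phi in (0.0, 1.0, 2.0, 3.0)]
hyperbola_energies = [energy_on_orbit(phi, **hyperbola) for phi in (0.0, 0.3, 0.6)]
```

For the repulsive example the asymptote angle is $\pi/4$, so the orbit exists only for $|\varphi| < \pi/4$ and the angle between the asymptotes is $\pi/2$.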

Exercise

Obtain the equation of the trajectory when $U = \frac{ar^2}{2}$, $a > 0$.

Sec. 6. Collision of Particles

The significance of collision problems. In order to determine the 
forces acting between particles, it is necessary to study the motion of 
particles caused by these forces. Thus, Newton’s gravitational law 
was established with the aid of Kepler’s laws. Here, the forces were 
determined from finite motion. However, infinite motion can also be 
used if one particle can, in some way, be accelerated to a definite 
velocity and then made to pass close to another particle. Such a process 
is termed “collision” of particles. It is not at all assumed, however,
that the particles actually come into contact in the sense of “collision” 
in everyday life. 

And neither is it necessary that the incident particle should be 
artificially accelerated in a machine: it may be obtained in ejection 
from a radioactive nucleus, or as the result of a nuclear reaction, or 
it may be a fast particle in cosmic radiation. 

Two approaches are possible to problems on particle collisions. 
Firstly, it may be only the velocities of the particles long before the 
collision (before they begin to interact) that are given, and the problem 
is to determine only their velocities (magnitude and direction) after 
they have ceased to interact. In other words, only the result of the 
collision is obtained without a detailed examination of the process. 
In this case, some knowledge of the final state must be available (or
specified) beforehand: it is not possible to determine, from the initial
velocities alone, all the integrals of motion which characterize the
collision, and, hence, it is likewise impossible to predict the final state.
With this approach to collision problems, only the momentum and
energy integrals are known.

However, another approach is possible: it is required to precalculate
the final state where the precise initial state is given.

Let us first consider collisions by the first method. It is clear that
if only the initial velocities of the particles are known, the collision is
not completely determined: it is not known at what distance the particles
were when they passed each other. This is why some quantity
relating to the final state of the system must be given. Usually the
problem is stated as follows: the initial velocities of the colliding particles
and also the direction of velocity of one of them after the collision
are specified. It is required to determine all the remaining quantities
after the collision. In such a form the problem is solved uniquely.
Six quantities are unknown, namely the six momentum components
of both particles after the collision. The conservation laws provide
four equalities: conservation of the scalar quantity (energy) and
conservation of the three components of the vector quantity (momentum).
Therefore, with six unknowns, it is necessary to specify two
quantities which refer to the final state. They are contained in the
determination of the unit vector which specifies the direction of the
velocity of one of the particles; an arbitrary vector is defined by three
quantities, but a unit vector, obviously, only by two. Actually,
only the angle of deflection of the particles after the collision need be
given, i.e., the angle which the velocity of the particle makes with
the initial direction of the incident particle. The orientation (in
space) of the plane passing through both velocity vectors is immaterial.

Elastic and inelastic collisions. A collision is termed elastic, if the 
initial kinetic energy is conserved when the particles separate after 
the collision at infinity, and inelastic, if, as a result of the collision, 
the kinetic energy changes at infinity. In nuclear physics, studies are 
very often made of collisions of a more general character, in which the 
nature of the colliding particles changes. These collisions are also 
inelastic. They are called nuclear reactions. 

The laboratory and centre-of-mass frames of reference. When collisions
are studied in the laboratory, one of the particles is usually at
rest prior to collision. The frame of reference fixed in this particle (and
in the laboratory) is termed the laboratory frame. However, it is more
convenient to perform calculations in a frame of reference, relative
to which the centre of mass of both particles is at rest. In accordance
with the law of conservation of the centre of mass (3.18), it will also
be at rest in its own frame after the collision. The velocity of the
centre of mass, relative to the laboratory frame of reference, is

$$\mathbf{V} = \frac{m_1\mathbf{v}_0}{m_1 + m_2}. \tag{6.1}$$

Here $\mathbf{v}_0$ is the velocity of the first particle (of mass $m_1$) relative to the
second (with mass $m_2$). In so far as the second particle is at rest in
the laboratory system, $\mathbf{v}_0$ is also the velocity of the first particle relative
to this system.

The general case of an inelastic collision. The velocity of the first
particle relative to the centre of mass is, according to the law of
addition of velocities, equal to

$$\mathbf{v}_{10} = \mathbf{v}_0 - \mathbf{V} = \frac{m_2\mathbf{v}_0}{m_1 + m_2}, \tag{6.2}$$

and in the same system, the velocity of the second particle is

$$\mathbf{v}_{20} = -\mathbf{V} = -\frac{m_1\mathbf{v}_0}{m_1 + m_2}. \tag{6.3}$$

Thus, $m_1\mathbf{v}_{10} + m_2\mathbf{v}_{20} = 0$, as it should be in the centre-of-mass
system.

In accordance with (3.17), the energy in the centre-of-mass system is

$$\mathcal{E} = \frac{m_0v_0^2}{2}. \tag{6.4}$$

Here, the reduced mass is indicated by a zero subscript, since in nuclear
reactions it may change.

Let the masses of the particles obtained as a result of the reaction
be $m_3$ and $m_4$, and the energy absorbed or emitted $Q$ (the so-called
“heat” of the reaction). If $Q$ is the energy released in radiation, then,
strictly speaking, one should take into account the radiated momentum
(see Sec. 13). But it is negligibly small in comparison with the
momenta of nuclear particles.

Thus, the law of conservation of energy must be written in the following
form:

$$\frac{m_0v_0^2}{2} + Q = \frac{mv^2}{2}. \tag{6.5}$$

Here, $m = \frac{m_3m_4}{m_3 + m_4}$ is the reduced mass of the particles produced in
the nuclear reaction, and $v$ is their relative velocity.

In order to specify the collision completely, we will consider that
the direction of $\mathbf{v}$ is known, since the value of $v$ is determined from
(6.5). Then the velocity of each particle separately will be

$$\mathbf{v}_{30} = \frac{m_4\mathbf{v}}{m_3 + m_4}, \qquad \mathbf{v}_{40} = -\frac{m_3\mathbf{v}}{m_3 + m_4}. \tag{6.6}$$


They satisfy the requirement $m_3\mathbf{v}_{30} + m_4\mathbf{v}_{40} = 0$, i.e., the law of
conservation of momentum in the centre-of-mass system, and give
the necessary value for the kinetic energy

$$\frac{m_3v_{30}^2}{2} + \frac{m_4v_{40}^2}{2} = \frac{mv^2}{2}.$$


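Now, it is not difficult to revert to the laboratory frame of reference.
The velocities of the particles in this system will be

$$\mathbf{v}_3 = \mathbf{v}_{30} + \mathbf{V} = \frac{m_4\mathbf{v}}{m_3 + m_4} + \frac{m_1\mathbf{v}_0}{m_1 + m_2}, \qquad \mathbf{v}_4 = \mathbf{v}_{40} + \mathbf{V} = -\frac{m_3\mathbf{v}}{m_3 + m_4} + \frac{m_1\mathbf{v}_0}{m_1 + m_2}. \tag{6.7}$$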


Equations (6.7) give a complete solution to the problem provided 
the direction of v is given. 
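The recipe of this paragraph can be followed step by step in a short calculation. The sketch below rests on two stated assumptions: that $Q$ enters the energy balance as $m_0v_0^2/2 + Q = mv^2/2$ (positive $Q$ meaning released energy), and that the total mass is unchanged, $m_1 + m_2 = m_3 + m_4$, so that the centre-of-mass velocity is the same before and after. The function name and the sample numbers are hypothetical.

```python
import math


def reaction_lab_velocities(m1, m2, m3, m4, v0, Q, direction):
    """Lab-frame velocities of the products.

    v0        : speed of particle 1 along x (particle 2 at rest in the lab)
    Q         : energy released (assumed sign convention, see lead-in)
    direction : unit vector of the relative velocity v after the reaction
    """
    m0 = m1 * m2 / (m1 + m2)          # reduced mass before
    m = m3 * m4 / (m3 + m4)           # reduced mass after
    v_sq = (m0 * v0 * v0 + 2.0 * Q) / m
    if v_sq < 0.0:
        raise ValueError("reaction is energetically forbidden")
    v = math.sqrt(v_sq)
    V = (m1 * v0 / (m1 + m2), 0.0, 0.0)   # centre-of-mass velocity
    v3 = tuple(m4 * v * n / (m3 + m4) + Vc for n, Vc in zip(direction, V))
    v4 = tuple(-m3 * v * n / (m3 + m4) + Vc for n, Vc in zip(direction, V))
    return v3, v4


m1, m2, m3, m4, v0, Q = 1.0, 3.0, 2.0, 2.0, 2.0, 0.5
n = (math.cos(0.7), math.sin(0.7), 0.0)
v3, v4 = reaction_lab_velocities(m1, m2, m3, m4, v0, Q, n)

momentum_after = tuple(m3 * a + m4 * b for a, b in zip(v3, v4))
kinetic_before = 0.5 * m1 * v0 ** 2
kinetic_after = 0.5 * m3 * sum(c * c for c in v3) + 0.5 * m4 * sum(c * c for c in v4)
```

Momentum along the beam is reproduced, the transverse momentum vanishes, and the kinetic energy in the laboratory frame changes by exactly $Q$.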

Elastic collisions. The computations are simplified if the collision is
elastic, for then $Q = 0$. It follows from (6.5) that $v = v_0$, i.e.,
the relative velocity changes only in direction and not in magnitude.
Let us suppose that its angle of deflection $\chi$ is given. We take the axis
$Ox$ along $\mathbf{v}_0$, and let the axis $Oy$ lie in the plane of the vectors $\mathbf{v}$ and $\mathbf{v}_0$
(which are equal in magnitude in the case of elastic collisions). Then

$$v_x = v_0\cos\chi, \qquad v_y = v_0\sin\chi.$$

From (6.6), the components of particle velocity in the centre-of-mass
system will, after collision, be correspondingly equal to

$$v_{10x} = \frac{m_2v_0\cos\chi}{m_1 + m_2}, \qquad v_{10y} = \frac{m_2v_0\sin\chi}{m_1 + m_2},$$

$$v_{20x} = -\frac{m_1v_0\cos\chi}{m_1 + m_2}, \qquad v_{20y} = -\frac{m_1v_0\sin\chi}{m_1 + m_2}.$$


Since the velocity of the centre of mass is in the direction of the axis
$Ox$, we obtain, from (6.7), the equations for the velocities in the
laboratory frame of reference:

$$v_{1x} = \frac{(m_1 + m_2\cos\chi)\,v_0}{m_1 + m_2}, \qquad v_{1y} = v_{10y} = \frac{m_2v_0\sin\chi}{m_1 + m_2},$$

$$v_{2x} = \frac{m_1(1 - \cos\chi)\,v_0}{m_1 + m_2}, \qquad v_{2y} = v_{20y} = -\frac{m_1v_0\sin\chi}{m_1 + m_2}.$$

By means of these equations, the deflection angle $\theta$ of the first
particle in the case of collision in the laboratory system can be related
to the angle $\chi$ (i.e., its deflection angle in a centre-of-mass system):

$$\tan\theta = \frac{v_{1y}}{v_{1x}} = \frac{m_2\sin\chi}{m_1 + m_2\cos\chi}. \tag{6.8}$$

The “recoil” angle of the second particle $\theta'$ is defined as

$$\tan\theta' = -\frac{v_{2y}}{v_{2x}} = \frac{\sin\chi}{1 - \cos\chi}. \tag{6.9}$$

The minus sign in the definition of $\tan\theta'$ is chosen because the signs
of $v_{1y}$ and $v_{2y}$ are opposite.

The case of equal masses. Equation (6.8) becomes still simpler if the
masses of the colliding particles are equal. This is approximately true
in the case of a collision between a neutron and proton. Then, from
(6.8),

$$\tan\theta = \tan\frac{\chi}{2}, \qquad \theta = \frac{\chi}{2}, \qquad \theta + \theta' = \frac{\pi}{2},$$

i.e., the particles fly off at right angles and the deflection angle of the
neutron in the laboratory system is equal to half the deflection angle
in the centre-of-mass system. Since the latter varies from 0 to 180°,
$\theta$ cannot exceed 90°. And, in addition, the velocity of the incident
particle is plotted as the “resultant” velocity of the diverging particles.
The collision of billiard balls resembles the collision considered here
of particles of equal mass, provided that the rotation of the balls
about their axes is neglected.

The energy transferred in an elastic collision. The energy received
by the second particle in a collision is

$$\mathcal{E}_2 = \frac{m_2v_2^2}{2} = \frac{m_2m_1^2(1 - \cos\chi)\,v_0^2}{(m_1 + m_2)^2}.$$

Its portion, relative to the initial energy of the first particle, is

$$\frac{\mathcal{E}_2}{\mathcal{E}_0} = \frac{2m_1m_2(1 - \cos\chi)}{(m_1 + m_2)^2}. \tag{6.10}$$

From this we obtain $\frac{\mathcal{E}_2}{\mathcal{E}_0} = \sin^2\frac{\chi}{2} = \sin^2\theta$ for particles of equal
mass. Accordingly, the portion of the energy retained by the first
particle is $\frac{\mathcal{E}_1}{\mathcal{E}_0} = \cos^2\theta$. In a “head-on” collision $\chi = 180°$, $\theta = 90°$.
The first particle comes to rest and the second continues to move
forward with the same velocity. This can easily be seen when billiard
balls collide.
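The relations for elastic collisions can be collected in one small routine and checked against the equal-mass results. This is a minimal sketch; the function name, the returned dictionary keys and the sample values are illustrative assumptions, with particle 2 initially at rest and $\chi$ the deflection angle in the centre-of-mass system.

```python
import math


def elastic_collision(m1, m2, v0, chi):
    """Lab-frame outcome of an elastic collision with particle 2 initially at rest."""
    total = m1 + m2
    v1 = ((m1 + m2 * math.cos(chi)) * v0 / total, m2 * v0 * math.sin(chi) / total)
    v2 = (m1 * (1.0 - math.cos(chi)) * v0 / total, -m1 * v0 * math.sin(chi) / total)
    return {
        "v1": v1,
        "v2": v2,
        "theta": math.atan2(v1[1], v1[0]),            # deflection of particle 1
        "theta_recoil": math.atan2(-v2[1], v2[0]),    # recoil angle of particle 2
        "fraction": 2.0 * m1 * m2 * (1.0 - math.cos(chi)) / total ** 2,
    }


chi = 1.2
equal = elastic_collision(1.0, 1.0, 3.0, chi)
unequal = elastic_collision(1.0, 4.0, 3.0, chi)

# Energy fraction recomputed directly from the recoil velocity, for comparison.
direct_fraction = 4.0 * sum(c * c for c in unequal["v2"]) / (1.0 * 3.0 ** 2)
```

For equal masses the routine returns $\theta = \chi/2$ and a right angle between the outgoing particles; for unequal masses the transferred fraction is smaller than $\sin^2(\chi/2)$.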


The problem of scattering. Let us now examine the problem of collision in more detail. We shall confine ourselves to the case of elastic collision and perform the calculations in the centre-of-mass system. The transformation to the laboratory system by equations (6.7) is elementary.

It is obvious that for a complete solution of the collision problem, one must know the potential energy of interaction between the particles $U(r)$ and specify the initial conditions, so that all the integrals of motion may be determined. The angular-momentum integral is found in the following way. Fig. 8, which refers to repulsive forces, represents the motion of the first particle relative to the second. The path at an infinite distance is linear, because no forces act between the particles at infinity. Since the path is linear at infinity, it possesses asymptotes. The asymptote for the part of the trajectory over which the particles are approaching is represented by the straight line AF, and FB is the asymptote for the part where they recede.

The collision parameter. The distance of an asymptote from the straight line OC, drawn through the second particle and parallel to the relative velocity of the particles at infinity, is called the collision parameter (“aiming distance”). It has been denoted by $\rho$, since, as can be seen from Fig. 8, $\rho$ is also the “arm” of the angular momentum.

If there were no interaction between the particles, they would pass each other at a distance $\rho$; this is why $\rho$ is called the collision parameter. But we know that the angular momentum is very simply expressed in terms of $\rho$. In the preceding section it was shown that it is equal to $mv\rho$. Let us draw the radius vector OA to some very distant point A. Then the angular momentum is

$$M = m v r \sin\alpha$$

(the angle $\alpha$ is shown on the diagram). But $r\sin\alpha = \rho$, so that

$$M = m v \rho. \tag{6.11}$$

Recall that here m is the reduced mass of the particles and v is their 
relative velocity at infinity. 

The energy integral is expressed in terms of the velocity at infinity thus:

$$\mathcal{E} = \frac{m v^2}{2}, \tag{6.12}$$

since $U(\infty) = 0$.

The deflection angle. The deflection angle $\chi$ is equal to $|\pi - 2\varphi_\infty|$, where $\varphi_\infty$ is half the angle between the asymptotes. The angle $\varphi_\infty$ corresponds to a rotation of the radius vector from the position OA, where it is infinite, to a position OF, where it is a minimum. Hence, from equation (5.10) the angle $\varphi_\infty$ is expressed as

$$\varphi_\infty = \int_{r_0}^{\infty} \frac{M\,dr}{r^2\sqrt{2m[\mathcal{E} - U(r)] - \dfrac{M^2}{r^2}}}. \tag{6.13}$$

$r_0$ is determined from (5.11). In place of $M$ and $\mathcal{E}$ we must substitute into (6.13) the expressions (6.11) and (6.12).




The differential effective scattering cross-section. Let us suppose that the integral (6.13) has been calculated. Then $\varphi_\infty$, and therefore $\chi$, are known as functions of the collision parameter $\rho$. Let this relationship be inverted, i.e., $\rho$ is determined as a function of the deflection angle:

$$\rho = \rho(\chi). \tag{6.14}$$

In collision experiments, the collision parameter is never defined in practice; a parallel beam of particles to be scattered is directed with identical velocity at some kind of substance, the atoms or nuclei of which are scatterers. The distribution of particles as to deflection angles $\chi$ (or, more exactly, as to angles $\theta$ in the laboratory system) is observed. Thus a scattering experiment is, as it were, performed very many times one after the other with the widest range of aiming distances.

Let one particle pass through each square centimetre of surface of the scattering substance. Then, in an annulus contained between $\rho$ and $\rho + d\rho$, there pass $2\pi\rho\,d\rho$ particles. We classify the collisions according to the aiming distances, similar to the way that it is done on a shooting target with the aid of a concentric system of rings. If $\rho$ is known in relation to $\chi$, then it may be stated that

$$d\sigma = 2\pi\rho\,d\rho = 2\pi\rho\left|\frac{d\rho}{d\chi}\right| d\chi$$

particles will be deflected at the angle between $\chi$ and $\chi + d\chi$.

Let us suppose that the scattered particles are in some way detected at a large distance from the scattering medium. Then the whole scatterer can be considered as a point and we can say that after scattering the particles move in straight lines from a common centre. Let us consider those particles which occupy the space between two cones that have the same apex and a common axis; the half-angle of the inner cone is equal to $\chi$, that of the external cone to $\chi + d\chi$. The space between the two cones is called a solid angle, similar to the way that the plane contained between two straight lines is called a plane angle. The measure of a plane angle is the arc of a circle of unit radius drawn about the vertex of the angle, while the measure of a solid angle is the area of a sphere of unit radius drawn about the vertex of the cone. An elementary solid angle is shown in Fig. 9 as that part of the surface of a sphere covered by an element of arc $d\chi$ when it is rotated about the radius OC. Since $OC = 1$, the radius of rotation of the element $d\chi$ is equal to $\sin\chi$. Therefore, the surface of the sphere which it covers is equal to $2\pi\sin\chi\,d\chi$. Thus, the elementary solid angle is

$$d\Omega = 2\pi\sin\chi\,d\chi. \tag{6.15}$$


The number of particles scattered in the element of solid angle is, thus,

$$d\sigma = \rho\left|\frac{d\rho}{d\chi}\right|\frac{d\Omega}{\sin\chi}. \tag{6.16}$$


The quantity $d\sigma$ has the dimensions of area. It is the area in which a particle must fall in order to be scattered within the solid-angle element $d\Omega$. It is called the effective differential scattering cross-section in the element of solid angle $d\Omega$.

Experimentally we determine just this value, because it is the angular distribution of the scattered particles that is dealt with [in (6.16) we consider that $\rho$ is given in relation to $\chi$]. If there are $n$ scatterers in unit volume of the scattering substance, then the attenuation of the primary beam $J$ in passing through unit thickness of the substance, due to scattering in an elementary solid angle $d\Omega$, is

$$dJ_\Omega = -J n\,d\sigma = -J n \rho\left|\frac{d\rho}{d\chi}\right|\frac{d\Omega}{\sin\chi} \ \text{particles/cm}.$$


If we examine $d\sigma$ as a function of $\chi$, we find a relationship between the collision parameter and the deflection angle. And this allows us to draw certain conclusions about the nature of the forces acting between the particle and the scattering centre.
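Formula (6.16) can be evaluated numerically for any known relation $\rho(\chi)$. The following is a minimal sketch using a central finite difference; the function name `dsigma_domega`, the step `h` and the radius value are illustrative assumptions. The test relation $\rho = r_0\cos(\chi/2)$ is the one quoted for the impermeable sphere in Exercise 1 of this section.

```python
import math

def dsigma_domega(rho_of_chi, chi, h=1e-5):
    """Eq. (6.16): d(sigma)/d(Omega) = rho * |d rho / d chi| / sin(chi).
    The derivative is approximated by a central difference of step h."""
    drho = (rho_of_chi(chi + h) - rho_of_chi(chi - h)) / (2.0 * h)
    # The absolute value is taken because rho decreases as chi grows.
    return rho_of_chi(chi) * abs(drho) / math.sin(chi)

r0 = 2.0  # sphere radius, arbitrary units (illustrative value)

def hard_sphere(chi):
    return r0 * math.cos(chi / 2.0)

# The result should be the same at every angle (isotropic scattering).
for degrees in (20, 90, 160):
    print(degrees, dsigma_domega(hard_sphere, math.radians(degrees)))
```

Each printed value should be close to $r_0^2/4$, independent of the angle.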

Rutherford's formula. A marvellous example of the determination of forces from the scattering law is given by the classical experiments of Rutherford with alpha particles. As was pointed out in Sec. 3, the Coulomb potential acting on particles decreases with distance according to a $1/r$ law, in the same way as the Newtonian potential. Consequently, the deflection angle can be calculated from the equations of Sec. 5. Let us first of all find the angle $\varphi_\infty$. It can be determined from equation (5.12) by putting $r = \infty$, $\alpha > 0$ (the charges on the nucleus and alpha particle are like charges). Hence,

$$\cos\varphi_\infty = \frac{1}{\sqrt{1 + \dfrac{2M^2\mathcal{E}}{m\alpha^2}}}, \qquad \tan\varphi_\infty = \frac{M}{\alpha}\sqrt{\frac{2\mathcal{E}}{m}}. \tag{6.17}$$

The integrals of motion $\mathcal{E}$ and $M$ are determined with the aid of (6.11) and (6.12). We therefore have

$$\rho = \frac{\alpha}{m v^2}\tan\varphi_\infty = \frac{\alpha}{m v^2}\cot\frac{\chi}{2}, \tag{6.18}$$

since $\varphi_\infty = \dfrac{\pi}{2} - \dfrac{\chi}{2}$. We now form the equation for the effective differential scattering cross-section in the centre-of-mass system with the aid of (6.16):

$$d\sigma = \frac{\alpha^2}{4 m^2 v^4}\,\frac{d\Omega}{\sin^4\dfrac{\chi}{2}}. \tag{6.19}$$


If the scattering nucleus is not too light, this equation, to a good approximation, also holds in the laboratory system.

Thus, the number of particles scattered in the elementary solid angle $d\Omega = 2\pi\sin\chi\,d\chi$ is inversely proportional to the fourth power of the sine of the deflection half-angle. This law is uniquely related to the Coulomb nature of the forces between the particles.
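The closed form (6.17) can be compared with a direct numerical evaluation of the orbit integral. The sketch below assumes a repulsive potential of the form $U(r) = \alpha/r$ with $\alpha > 0$, uses the variable $u = 1/r$, and integrates by the midpoint rule; all function names and numerical values are illustrative assumptions.

```python
import math

def phi_inf_closed(rho, alpha, m, v):
    """Eq. (6.17) with M = m*v*rho and E = m*v**2/2:
    tan(phi_inf) = m v^2 rho / alpha."""
    return math.atan(m * v * v * rho / alpha)

def phi_inf_numeric(rho, alpha, m, v, n=20000):
    """Midpoint-rule estimate of
        phi_inf = integral from 0 to u0 of rho du / sqrt(1 - k u - rho^2 u^2),
    where u = 1/r, k = 2 alpha / (m v^2) and u0 = 1/r0 is the positive
    root of the radicand. The substitution u = u0 (1 - s^2) removes the
    inverse-square-root singularity at u = u0."""
    k = 2.0 * alpha / (m * v * v)
    u0 = (-k + math.sqrt(k * k + 4.0 * rho * rho)) / (2.0 * rho * rho)
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        u = u0 * (1.0 - s * s)
        radicand = 1.0 - k * u - (rho * u) ** 2
        total += rho * 2.0 * u0 * s / math.sqrt(radicand) * h
    return total

def impact_parameter(chi, alpha, m, v):
    """Eq. (6.18): rho = alpha / (m v^2) * cot(chi / 2)."""
    return alpha / (m * v * v) / math.tan(chi / 2.0)

rho, alpha, m, v = 1.3, 0.7, 2.0, 1.5  # illustrative values, arbitrary units
print(phi_inf_closed(rho, alpha, m, v), phi_inf_numeric(rho, alpha, m, v))
```

The two printed numbers should agree closely, and inserting $\chi = \pi - 2\varphi_\infty$ into `impact_parameter` should return the original $\rho$.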

Studying the scattering of alpha particles by atoms, Rutherford showed that the law (6.19) is true for angles up to those corresponding to collision parameters less than $10^{-12}$ cm. It was thus experimentally proved that the whole mass of the atom is concentrated in an exceedingly small region (recall that the size of an atom is ~$10^{-8}$ cm). Thus, experiments on the scattering of alpha particles led to the discovery of the atomic nucleus and to an estimation of the order of magnitude of its dimensions.

Isotropic scattering. As may be seen from equation (6.19), the scattering has a pronounced maximum for small deflection angles. This maximum relates to large aiming distances, since particles passing each other at these distances are weakly deflected, while large distances predominate since they define a larger area. Thus, if the interaction force between the particles does not vanish identically at a finite distance, then, for small deflection angles, the expression for $d\sigma$ will always have a maximum. This maximum is the more pronounced, the more rapidly the interaction force decreases with distance, for in the case of a rapidly diminishing force, large aiming distances correspond to very small deflection angles.

However, particles that are very little deflected can in no way be detected experimentally as deflected particles. Indeed, the initial beam cannot be made ideally parallel. For this reason, when investigating a scattered beam, one must always neglect those angles which are comparable with the angular deviation of the particles in the initial beam from ideal parallelism.

For a sufficiently rapid attenuation of force with distance, the region
of the maximum of $d\sigma$ in relation to the angle $\chi$ can refer to such
small angles that the particles travelling within these angles will 
not be distinguished as being scattered because of their small deflection 
angles. On the other hand, the remaining particles will be the more 
uniformly distributed as to scattering angle, the more rapidly the 
forces fall off with distance. 

This can be seen in the example of particles scattered by an impermeable sphere (Exercise 1). Such a sphere may be regarded as the limiting case of a force centre repulsing particles according to the law $U(r) = U_0 (r_0/r)^n$ when $n$ tends to infinity: if $r < r_0$, then $U(r) \to \infty$, and if $r > r_0$, then $U(r) \to 0$. When $n = \infty$ the scattering is completely isotropic. If $n$ is large, the angular distribution of the particles is almost isotropic, and only for very small deflection angles has the distribution a sharp maximum. Hence, a scattering law that is almost isotropic indicates a rapid diminution of force with distance.

The scattering of neutrons by protons in the centre-of-mass system is isotropic up to energy values greater than 10 Mev (1 Mev equals $1.6 \times 10^{-6}$ erg). An analysis of the effective cross-section shows that nuclear forces are short-range forces; they are very great at close distances and rapidly diminish to zero at distances larger than $2 \times 10^{-13}$ cm. It must be mentioned, however, that a correct investigation of this case is only possible on the basis of the quantum theory of scattering (Sec. 37).

Exercises 

1) Find the differential effective scattering cross-section for particles scattered by an impermeable sphere of radius $r_0$.

The impermeable sphere can be represented by giving the potential energy in the form $U(r) = 0$ for $r > r_0$ (outside the sphere) and $U(r) = \infty$ for $r < r_0$ (inside the sphere). Then, whatever the kinetic energy of the particle, penetration into the region $r < r_0$ is impossible.

In reflection from the sphere, the tangential component of momentum is conserved and the normal component changes sign. The absolute value of the momentum is conserved since the scattering is elastic. A simple construction shows us that the collision parameter is related to the deflection angle by

$$\rho = r_0\cos\frac{\chi}{2},$$

if $\rho < r_0$. Hence the general equation gives

$$d\sigma = \frac{r_0^2}{4}\,d\Omega,$$

so that the scattering occurs isotropically for all angles. The total effective scattering cross-section $\sigma$ is equal, in this case, to $\pi r_0^2$, as expected. Note that if the interaction vanished not at a finite distance but at infinity, the total scattering cross-section would tend to infinity, since to any arbitrarily large approach distance $\rho$ there would correspond a certain deflection angle, and the integral $\int 2\pi\rho\,d\rho$ diverges. In the quantum theory of scattering, $\sigma$ is also finite when the forces diminish fast enough with distance.

2) The collision of particles with masses $m_1$ and $m_2$ is considered (the mass of the incident particle is $m_1$). As a result of the collision, particles with the same masses are obtained whose paths make certain angles $\varphi$ and $\psi$ with the initial flight direction of the particle of mass $m_1$. Determine the energy $Q$ which is absorbed or emitted in the collision.


Sec. 7. Small Oscillations 

In applications of mechanics, we very often meet a special form 
of motion known as small oscillations. We devote a separate section 
to the theory of small oscillations. 




The definition of small oscillations of a pendulum. In the problem of pendulum oscillations in Sec. 4 it was shown that the equation relating the deflection angle $\varphi$ to time led, in the general case, to a nonelementary (elliptic) integral (4.11). A simple graphical investigation shows that the function $\varphi(t)$ is periodic. Fig. 10 shows the curve $U(\varphi) = mgl(1 - \cos\varphi)$, which gives the relationship between potential energy and deflection angle. The horizontal straight line corresponds to a certain constant value of $\mathcal{E}$. If $\mathcal{E} < 2mgl$, the motion occurs periodically with time between the points $-\varphi_0$ and $\varphi_0$.

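The problem is greatly simplified if $\varphi_0 \ll 1$, i.e., the angle $\varphi_0$ is small in comparison with a radian. Then $\cos\varphi_0$ can be replaced by the expansion $1 - \varphi_0^2/2$. Since $|\varphi| < \varphi_0$, $\cos\varphi$ can also be replaced by $1 - \varphi^2/2$. After this the integral (4.11) can be easily evaluated:

$$t = -\sqrt{\frac{l}{g}}\int_{\varphi_0}^{\varphi}\frac{d\varphi}{\sqrt{\varphi_0^2 - \varphi^2}} = \sqrt{\frac{l}{g}}\,\arccos\frac{\varphi}{\varphi_0}. \tag{7.1}$$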


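Inverting relation (7.1), we get the angle as a function of time:

$$\varphi = \varphi_0\cos\left(\sqrt{\frac{g}{l}}\,t\right). \tag{7.2}$$

The result is a periodic function. As can be seen from (7.2), the period of $\varphi$ is equal to $2\pi\sqrt{l/g}$. The quantity $\sqrt{g/l}$ is called the frequency of oscillation:

$$\omega = \sqrt{\frac{g}{l}}. \tag{7.3}$$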


This quantity gives the number of radians by which the argument of the cosine in (7.2) changes in one second. Sometimes the term frequency denotes a quantity that is $2\pi$ times less, $\omega/2\pi$, equal to the number of oscillations performed by the pendulum in one second.

The inverse of the latter, $2\pi/\omega$, is the period of small oscillations of the pendulum. An important point is that the period and the frequency of small oscillations do not depend on the amplitude of oscillation $\varphi_0$.
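How good the small-angle approximation is can be seen by comparing $2\pi\sqrt{l/g}$ with the exact pendulum period. The sketch below relies on the standard result that the exact period equals $2\pi\sqrt{l/g}$ divided by the arithmetic-geometric mean of 1 and $\cos(\varphi_0/2)$; this formula is not derived in the text and is brought in here only for comparison. Function names and numerical values are illustrative.

```python
import math

def agm(a, b, iterations=40):
    """Arithmetic-geometric mean of two positive numbers."""
    for _ in range(iterations):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return a

def small_angle_period(l, g):
    """Period of small oscillations, 2*pi/omega with omega from (7.3)."""
    return 2.0 * math.pi * math.sqrt(l / g)

def exact_period(l, g, phi0):
    """Exact period of a pendulum released from rest at amplitude phi0 (radians)."""
    return small_angle_period(l, g) / agm(1.0, math.cos(phi0 / 2.0))

l, g = 1.0, 9.81  # metres and m/s^2 (illustrative values)
for phi0 in (0.1, 0.5, math.pi / 2.0):
    print(phi0, exact_period(l, g, phi0) / small_angle_period(l, g))
```

For an amplitude of 0.1 radian the ratio differs from unity by less than one part in a thousand, while for a 90° amplitude the exact period is about 18 per cent longer.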

The general problem of small oscillations with one degree of freedom. 
In order to solve the small-oscillation problem, we need not, initially, 
reduce to quadrature the problem of arbitrary oscillations; we can 
first perform an appropriate simplification of the Lagrangian. 




First of all, we note that any oscillations, both large and small, always 
occur about a position of equilibrium. Thus, a pendulum oscillates 
about a vertical. On deflection from the position of stable equilibrium, 
a restoring force acts on the system in the opposite direction to the 
deflection. In the equilibrium position, this force obviously becomes 
zero (by definition of the “equilibrium” concept). 

Force is equal to the derivative of potential energy with respect to the coordinate taken with opposite sign. The equilibrium condition written in terms of this derivative is

$$\frac{dU}{dq} = 0. \tag{7.4}$$


Let us denote the solution of this equation by $q = q_0$. We assume, initially, that the system has only one degree of freedom and expand $U$ in a Taylor series in the vicinity of the point $q_0$:

$$U(q) = U(q_0) + \left(\frac{dU}{dq}\right)_0 (q - q_0) + \frac{1}{2}\left(\frac{d^2U}{dq^2}\right)_0 (q - q_0)^2 + \dots \tag{7.5}$$

The linear term relative to $q - q_0$ vanishes in accordance with (7.4). We denote $\left(\dfrac{d^2U}{dq^2}\right)_0$ by the letter $\beta$. Then, confining ourselves to these terms of the series, we obtain

$$U(q) = U(q_0) + \frac{\beta}{2}(q - q_0)^2. \tag{7.6}$$


The force near the equilibrium position is

$$F = -\frac{dU}{dq} = -\beta(q - q_0). \tag{7.7}$$

For this force to be a restoring force (i.e., for it to act in the direction opposite to the deflection), the following inequality must hold:

$$\beta = \left(\frac{d^2U}{dq^2}\right)_0 > 0.$$

This is the stability condition for the equilibrium; the function $U(q)$ must increase on both sides of the point $q = q_0$. It follows that the potential energy at that point must be a minimum. This is shown in Fig. 10 at $\varphi = 0$.

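Let us now examine the expression for kinetic energy. If, in the general formula for the kinetic energy of a particle,

$$T = \frac{m}{2}(\dot x^2 + \dot y^2 + \dot z^2),$$

we put $x = x(q)$, $y = y(q)$, $z = z(q)$, then $T$ reduces to the form

$$T = \frac{m}{2}\left[\left(\frac{dx}{dq}\right)^2 + \left(\frac{dy}{dq}\right)^2 + \left(\frac{dz}{dq}\right)^2\right]\dot q^2.$$

The quantity in the brackets depends only on $q$; and so the kinetic energy of a particle can be represented in the form

$$T = \frac{1}{2}\alpha(q)\,\dot q^2. \tag{7.9}$$

Let us now expand the coefficient $\alpha(q)$ in a series, in terms of $q - q_0$, in the vicinity of the equilibrium position:

$$T = \frac{1}{2}\alpha(q_0)\,\dot q^2 + \frac{1}{2}\left(\frac{d\alpha}{dq}\right)_0 (q - q_0)\,\dot q^2 + \dots$$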

In order that the particle should not move far from the equilibrium position, its velocity must be small. In other words, the zero member of the kinetic energy expansion, $\frac{1}{2}\alpha(q_0)\dot q^2$, is already of the same order of smallness for small oscillations as the second term in the expansion of $U(q)$, i.e., $\frac{\beta}{2}(q - q_0)^2$. When $q = q_0$ all the energy of oscillation is kinetic, while for maximum deflection all the energy is potential. Therefore $\frac{1}{2}\alpha(q_0)\dot q^2$ and $\frac{\beta}{2}(q - q_0)^2$ are of the same order of magnitude, and the remaining terms in the series [including those containing $(q - q_0)\dot q^2$] can be neglected. We shall show that the mean values of both the quantities $\frac{1}{2}\alpha(q_0)\dot q^2$ and $\frac{\beta}{2}(q - q_0)^2$ are the same after we determine $q$ as an explicit function of time.

In future, the coordinate $q$ will be measured from the equilibrium position, i.e., we shall put $q_0 = 0$. Then [omitting $U(0)$, which does not affect the equations of motion] the Lagrangian can be written in the following form:

$$L = \frac{1}{2}\alpha(0)\,\dot q^2 - \frac{1}{2}\beta q^2. \tag{7.10}$$

Thus, Lagrange's equation will be written as

$$\alpha(0)\,\ddot q + \beta q = 0. \tag{7.11}$$

Denoting

$$\omega^2 = \frac{\beta}{\alpha(0)} = \frac{1}{\alpha(0)}\left(\frac{d^2U}{dq^2}\right)_0, \tag{7.12}$$

we reduce (7.11) to the general form for the oscillation equation:

$$\ddot q + \omega^2 q = 0. \tag{7.13}$$
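The recipe (7.12) translates directly into a short numerical routine: differentiate the potential twice at the equilibrium point and divide by the kinetic coefficient. The sketch below uses a central finite difference for the second derivative; the function name, the step size and the pendulum parameters are illustrative assumptions.

```python
import math

def small_oscillation_frequency(U, alpha0, q0=0.0, h=1e-4):
    """Eq. (7.12): omega = sqrt(beta / alpha(q0)), with
    beta = U''(q0) estimated by a central finite difference of step h."""
    beta = (U(q0 + h) - 2.0 * U(q0) + U(q0 - h)) / (h * h)
    if beta <= 0.0:
        raise ValueError("equilibrium is not stable: U''(q0) is not positive")
    return math.sqrt(beta / alpha0)

# Pendulum: U = m g l (1 - cos(phi)), T = (1/2) m l^2 phi_dot^2.
m, l, g = 1.0, 2.0, 9.81  # illustrative values

def U_pendulum(phi):
    return m * g * l * (1.0 - math.cos(phi))

omega = small_oscillation_frequency(U_pendulum, alpha0=m * l * l)
print(omega, math.sqrt(g / l))
```

For the pendulum the routine reproduces $\omega = \sqrt{g/l}$ of (7.3), and it raises an error at the inverted position $\varphi = \pi$, where the second derivative is negative.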


Various forms for the solutions of small-oscillation problems. The general solution of this equation, which contains two arbitrary constants, may be written in one of three forms:

$$q = C_1\cos\omega t + C_2\sin\omega t, \tag{7.14a}$$

$$q = C\cos(\omega t + \gamma), \tag{7.14b}$$

$$q = \mathrm{Re}\{C' e^{i\omega t}\}. \tag{7.14c}$$

The symbol Re{} signifies the real part of the expression inside the braces. The constant $C'$ inside the braces is complex: $C' = C_1 - iC_2$. The constants are chosen in accordance with the initial conditions. The constant $\gamma$ is called the initial phase, and $C$ is the amplitude.

If we are only interested in the frequency of small oscillations, and not the phase or amplitude, it is sufficient to use equation (7.12), verifying that the second derivative $\left(\dfrac{d^2U}{dq^2}\right)_0$ is positive.

A system which is described by equation (7.13) is called a linear 
harmonic oscillator. 

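It can be seen from equations (7.10), (7.12), and (7.14b) that the averages of the potential energy and kinetic energy of the oscillator during one period are the same, because the averages of the squares of a sine or cosine are equal to one half:

$$\overline{\sin^2(\omega t + \gamma)} = \frac{\omega}{2\pi}\int_0^{2\pi/\omega}\sin^2(\omega t + \gamma)\,dt = \frac{1}{2}; \qquad \overline{\cos^2(\omega t + \gamma)} = \frac{1}{2};$$

$$\overline{T} = \frac{1}{4}\alpha(0)\,\omega^2 C^2 = \frac{1}{4}\beta C^2 = \overline{U}.$$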


Small oscillations with two degrees of freedom. We shall now consider oscillations with several degrees of freedom. As an example, let us first take the double pendulum of Sec. 3. If we confine ourselves to small oscillations, we must consider that the deflections $\varphi$ and $\psi$ are close to zero (i.e., the pendulum is close to a vertical position). Then, on substituting the equilibrium values of the coordinates $\varphi$ and $\psi$, $\cos(\varphi - \psi)$ in the kinetic energy must be replaced by $\cos 0 = 1$, as in the problem of oscillations with one degree of freedom, whereas $\cos\varphi$ and $\cos\psi$, in the expression for potential energy, must be replaced by $1 - \varphi^2/2$ and $1 - \psi^2/2$. Then the Lagrangian will have the form

$$L = \frac{m + m_1}{2}\,l^2\dot\varphi^2 + \frac{m_1}{2}\,l_1^2\dot\psi^2 + m_1 l l_1\,\dot\varphi\dot\psi - \frac{m + m_1}{2}\,g l\,\varphi^2 - \frac{m_1}{2}\,g l_1\,\psi^2. \tag{7.15}$$

Let us examine this in a somewhat more general form:

$$L = \frac{1}{2}\left(\alpha_{11}\dot q_1^2 + 2\alpha_{12}\dot q_1\dot q_2 + \alpha_{22}\dot q_2^2\right) - U(0) - \frac{1}{2}\left(\beta_{11}q_1^2 + 2\beta_{12}q_1 q_2 + \beta_{22}q_2^2\right). \tag{7.16}$$




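Here, the coefficients $\alpha_{11}$, $\alpha_{12}$ and $\alpha_{22}$ are assumed to be constant numbers expressed in terms of the equilibrium values of $q_1$ and $q_2$. Comparing (7.15) and (7.16), we find that in the problem of the double pendulum

$$\alpha_{11} = (m + m_1)\,l^2, \quad \alpha_{12} = m_1 l l_1, \quad \alpha_{22} = m_1 l_1^2;$$

$$\beta_{11} = (m + m_1)\,g l, \quad \beta_{12} = 0, \quad \beta_{22} = m_1 g l_1.$$

In the general case, the coefficients $\beta_{11}$, $\beta_{12}$ and $\beta_{22}$ are expressed by the equations

$$\beta_{11} = \left(\frac{\partial^2 U}{\partial q_1^2}\right)_0, \quad \beta_{12} = \left(\frac{\partial^2 U}{\partial q_1\,\partial q_2}\right)_0, \quad \beta_{22} = \left(\frac{\partial^2 U}{\partial q_2^2}\right)_0,$$

where the derivatives are also taken in the equilibrium positions.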
For the equilibrium to be stable, we must demand that the following 
inequality be satisfied: 

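$$U(q) - U(0) = \frac{1}{2}\left(\beta_{11}q_1^2 + 2\beta_{12}q_1 q_2 + \beta_{22}q_2^2\right) > 0. \tag{7.17}$$

Under this condition, $U$ has a minimum at the point $q_1 = 0$, $q_2 = 0$. Let us rewrite the left-hand side of (7.17) in the identical form

$$\frac{1}{2}\left(\beta_{11}q_1^2 + 2\beta_{12}q_1 q_2 + \beta_{22}q_2^2\right) = \frac{\beta_{11}}{2}\left(q_1 + \frac{\beta_{12}}{\beta_{11}}q_2\right)^2 + \frac{\beta_{11}\beta_{22} - \beta_{12}^2}{2\beta_{11}}\,q_2^2.$$

This expression remains positive for all values of $q_1$ and $q_2$, provided the coefficients of both squares are greater than zero:

$$\beta_{11} > 0, \tag{7.18}$$

$$\beta_{11}\beta_{22} - \beta_{12}^2 > 0. \tag{7.19}$$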

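In future, we shall consider that the conditions (7.18) and (7.19), together with analogous conditions for the coefficients $\alpha_{11}$, $\alpha_{12}$, $\alpha_{22}$, are satisfied. We shall now write down Lagrange's equations. We have

$$\frac{\partial L}{\partial \dot q_1} = \alpha_{11}\dot q_1 + \alpha_{12}\dot q_2, \qquad \frac{\partial L}{\partial \dot q_2} = \alpha_{12}\dot q_1 + \alpha_{22}\dot q_2,$$

$$\frac{\partial U}{\partial q_1} = \beta_{11}q_1 + \beta_{12}q_2, \qquad \frac{\partial U}{\partial q_2} = \beta_{12}q_1 + \beta_{22}q_2.$$

Whence

$$\begin{aligned}\alpha_{11}\ddot q_1 + \alpha_{12}\ddot q_2 + \beta_{11}q_1 + \beta_{12}q_2 &= 0,\\ \alpha_{12}\ddot q_1 + \alpha_{22}\ddot q_2 + \beta_{12}q_1 + \beta_{22}q_2 &= 0.\end{aligned} \tag{7.20}$$

In order to satisfy these equations, we shall look for a solution in the form

$$q_1 = A_1 e^{i\omega t}, \qquad q_2 = A_2 e^{i\omega t}. \tag{7.21}$$

As in (7.14c), the real part of the solution (7.21) must be taken.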




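The equation for frequency. Substituting (7.21) in (7.20), we obtain equations relating $A_1$ and $A_2$:

$$\begin{aligned}(\beta_{11} - \alpha_{11}\omega^2)A_1 + (\beta_{12} - \alpha_{12}\omega^2)A_2 &= 0,\\ (\beta_{12} - \alpha_{12}\omega^2)A_1 + (\beta_{22} - \alpha_{22}\omega^2)A_2 &= 0.\end{aligned} \tag{7.22}$$

Transferring terms in $A_2$ to the right-hand side of the equations and dividing one equation by the other, we eliminate $A_1$ and $A_2$:

$$\frac{\beta_{11} - \alpha_{11}\omega^2}{\beta_{12} - \alpha_{12}\omega^2} = \frac{\beta_{12} - \alpha_{12}\omega^2}{\beta_{22} - \alpha_{22}\omega^2}. \tag{7.23}$$

Reducing (7.23) to a common denominator, we arrive at the biquadratic equation

$$(\alpha_{11}\alpha_{22} - \alpha_{12}^2)\,\omega^4 - (\beta_{11}\alpha_{22} + \beta_{22}\alpha_{11} - 2\alpha_{12}\beta_{12})\,\omega^2 + \beta_{11}\beta_{22} - \beta_{12}^2 = 0. \tag{7.24}$$

Substituting here the expressions for $\alpha_{ik}$ and $\beta_{ik}$ from (7.15), we obtain an equation for the frequencies of a double pendulum:

$$m m_1 l^2 l_1^2\,\omega^4 - (m + m_1)\,m_1 l l_1 (l + l_1)\,g\,\omega^2 + (m + m_1)\,m_1 l l_1 g^2 = 0.$$

If we introduce still another contraction in notation (for the given problem), $l_1/l = \lambda$, $m_1/m = \mu$, the expression for frequencies will be of the following form:

$$\omega^2 = \frac{g}{2\lambda l}\left[(1 + \mu)(1 + \lambda) \pm \sqrt{(1 + \mu)^2(1 + \lambda)^2 - 4\lambda(1 + \mu)}\right].$$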

It is easy to see that this expression yields only real values of the frequencies. However, we shall show this in more general form for equation (7.24). Let us assume that the following function is given:

$$F(\omega^2) = (\alpha_{11}\alpha_{22} - \alpha_{12}^2)\,\omega^4 - (\beta_{11}\alpha_{22} + \beta_{22}\alpha_{11} - 2\beta_{12}\alpha_{12})\,\omega^2 + \beta_{11}\beta_{22} - \beta_{12}^2,$$

which passes through zero for all values of $\omega$ that satisfy equation (7.24). $F(\omega^2)$ is positive for $\omega^2 = 0$ and for $\omega^2 = \infty$, since $\beta_{11}\beta_{22} - \beta_{12}^2 > 0$, $\alpha_{11}\alpha_{22} - \alpha_{12}^2 > 0$. Let us now substitute into this function the positive number $\omega^2 = \beta_{22}/\alpha_{22}$. After a simple rearrangement we obtain

$$F\left(\frac{\beta_{22}}{\alpha_{22}}\right) = -\frac{1}{\alpha_{22}^2}\left(\beta_{22}\alpha_{12} - \beta_{12}\alpha_{22}\right)^2 < 0.$$

Thus, as $\omega^2$ varies from 0 to $\infty$, $F(\omega^2)$ is first positive, then negative, and then again positive. Hence, it changes sign twice, so that equation (7.24) has two positive roots $\omega_1^2$, $\omega_2^2$ and, as was asserted, all the values for frequency are real.
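The biquadratic equation (7.24) is an ordinary quadratic in $\omega^2$ and can be solved directly. The following is a minimal sketch; the function name is illustrative, and the numerical coefficients are those of the double pendulum with $\mu = 3/4$, $\lambda = 5/7$ in units where $m = l = g = 1$ (the case treated in the exercise at the end of this section).

```python
import math

def normal_frequencies_squared(a11, a12, a22, b11, b12, b22):
    """Roots omega^2 of eq. (7.24), returned as (larger, smaller).
    a_ik are the kinetic coefficients alpha_ik, b_ik the potential
    coefficients beta_ik of the Lagrangian (7.16)."""
    A = a11 * a22 - a12 ** 2
    B = b11 * a22 + b22 * a11 - 2.0 * a12 * b12
    C = b11 * b22 - b12 ** 2
    root = math.sqrt(B * B - 4.0 * A * C)
    return (B + root) / (2.0 * A), (B - root) / (2.0 * A)

mu, lam = 3.0 / 4.0, 5.0 / 7.0
a11, a12, a22 = 1.0 + mu, mu * lam, mu * lam ** 2
b11, b12, b22 = 1.0 + mu, 0.0, mu * lam

w1_sq, w2_sq = normal_frequencies_squared(a11, a12, a22, b11, b12, b22)
print(w1_sq, w2_sq)  # in units of g/l
```

Both roots come out positive, in agreement with the sign argument for $F(\omega^2)$ given above.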




The quantity $\omega$ has four values, forming two pairs whose members are equal in absolute value and opposite in sign. If we represent the solution in the form (7.21), it is sufficient to take only positive $\omega$.


Normal coordinates. Let us put these roots in (7.22). To each of them there will correspond a definite ratio of the coefficients $A_2/A_1$. For $i = 1, 2$ we have

$$\xi_i \equiv \frac{A_2^{(i)}}{A_1^{(i)}} = -\frac{\beta_{11} - \alpha_{11}\omega_i^2}{\beta_{12} - \alpha_{12}\omega_i^2} \qquad (i = 1, 2). \tag{7.25}$$

According to (7.23), the same ratio is also obtained from the second equation of (7.22). For example, for the double pendulum

$$\xi_i = \frac{l\,\omega_i^2}{g - l_1\omega_i^2},$$

with $i$ equal to 1 or 2 depending on the sign in front of the root in the solution for $\omega^2$.

Each frequency $\omega_i$ defines one particular solution of the system (7.20). Since the system is linear, the general solution is the sum of these particular solutions. Let us write this as

$$q_1 = A_1^{(1)} e^{i\omega_1 t} + A_1^{(2)} e^{i\omega_2 t}, \qquad q_2 = \xi_1 A_1^{(1)} e^{i\omega_1 t} + \xi_2 A_1^{(2)} e^{i\omega_2 t}. \tag{7.26}$$

We must, of course, take only the real parts of the expressions on the right.

We now introduce the following notation:

$$Q_1 = A_1^{(1)} e^{i\omega_1 t}, \qquad Q_2 = A_1^{(2)} e^{i\omega_2 t}, \tag{7.27}$$

where $A_1^{(i)}$ is the amplitude of $q_1$ in the particular solution with frequency $\omega_i$. According to (7.27), the quantities $Q_1$ and $Q_2$ satisfy the differential equations

$$\ddot Q_1 + \omega_1^2 Q_1 = 0, \qquad \ddot Q_2 + \omega_2^2 Q_2 = 0. \tag{7.28}$$

Each of these equations can be obtained from the Lagrangian

$$L_i = \frac{1}{2}\dot Q_i^2 - \frac{1}{2}\omega_i^2 Q_i^2, \tag{7.29}$$

which describes oscillation with one degree of freedom.

Thus, in terms of the variables $Q_i$, the problem of two related oscillations with two degrees of freedom $q_1$, $q_2$ has been reduced to the problem of two independent harmonic oscillations with one degree of freedom, $Q_1$ and $Q_2$. The coordinates $Q_1$ and $Q_2$ are termed normal.

In equations (7.20), we cannot arbitrarily put $q_1 = 0$ or $q_2 = 0$: if the quantity $q_1$ oscillates, then it must cause $q_2$ to oscillate. In contrast, the oscillations of the quantities $Q_1$ and $Q_2$ are in no way related [as long as we limit ourselves to the expansion (7.16) for $L$].




From equations (7.26), we can express $Q_1$ and $Q_2$ in terms of $q_1$ and $q_2$:

$$Q_1 = \frac{\xi_2 q_1 - q_2}{\xi_2 - \xi_1}, \qquad Q_2 = \frac{\xi_1 q_1 - q_2}{\xi_1 - \xi_2}, \tag{7.30}$$

where $\xi_1$ and $\xi_2$ are the amplitude ratios $A_2/A_1$ of (7.25) for the two frequencies.

If, for example, we choose the initial values of $q$ and $\dot q$ so that at this instant $Q_1 = 0$ and $\dot Q_1 = 0$, then the oscillation with frequency $\omega_1$ will not occur at all. For this it is sufficient, at $t = 0$, to take the coordinates and velocities in a relationship such that $\xi_2 q_1 - q_2 = 0$ and $\xi_2\dot q_1 - \dot q_2 = 0$. In other words, only the frequency $\omega_2$ will occur, and the oscillations will be strictly periodic. When both frequencies $\omega_1$ and $\omega_2$ are excited (they are, generally speaking, incommensurable, i.e., their ratio cannot be expressed as a rational fraction), the oscillation $q$ is no longer periodic, since the sum of two periodic functions with incommensurable periods is not periodic.

Expressing energy in normal coordinates. From the form of the Lagrangian (7.29) it can be immediately concluded that the expression for energy in normal coordinates reduces to the form

$$\mathcal{E} = \frac{1}{2}\sum_i\left(\dot Q_i^2 + \omega_i^2 Q_i^2\right), \tag{7.31}$$

since $L = T - U$ and $\mathcal{E} = T + U$. This result is true for small oscillations with any number of degrees of freedom.

We must note that if the normal coordinates are expressed directly by equations (7.30), then the separate energy terms $\frac{1}{2}(\dot Q_i^2 + \omega_i^2 Q_i^2)$ will also be multiplied by certain numbers $a_i$. However, if we take $Q_i\sqrt{a_i}$ as the new coordinate in place of $Q_i$, then these numbers are eliminated from the expression for energy, which is then reduced to the form (7.31). An example of this procedure is given in the exercises.
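That the substitution $q_1 = Q_1 + Q_2$, $q_2 = \xi_1 Q_1 + \xi_2 Q_2$ removes the cross term from the kinetic energy can be verified numerically. The sketch below uses the double-pendulum coefficients with $\mu = 3/4$, $\lambda = 5/7$ in units $m = l = g = 1$ and the frequencies $\omega_1^2 = 7/2$, $\omega_2^2 = 7/10$ of the exercise at the end of this section; the function names are illustrative.

```python
def amplitude_ratio(a11, a12, b11, b12, w_sq):
    """Eq. (7.25): xi = A2/A1 for a root omega^2 = w_sq of (7.24).
    Assumes b12 - a12*w_sq is non-zero."""
    return -(b11 - a11 * w_sq) / (b12 - a12 * w_sq)

def kinetic_form_in_Q(a11, a12, a22, xi1, xi2):
    """Coefficients (c11, c12, c22) of
        2T = c11*Q1dot^2 + 2*c12*Q1dot*Q2dot + c22*Q2dot^2
    after substituting q1 = Q1 + Q2, q2 = xi1*Q1 + xi2*Q2 into
        2T = a11*q1dot^2 + 2*a12*q1dot*q2dot + a22*q2dot^2."""
    c11 = a11 + 2.0 * a12 * xi1 + a22 * xi1 ** 2
    c22 = a11 + 2.0 * a12 * xi2 + a22 * xi2 ** 2
    c12 = a11 + a12 * (xi1 + xi2) + a22 * xi1 * xi2
    return c11, c12, c22

mu, lam = 3.0 / 4.0, 5.0 / 7.0
a11, a12, a22 = 1.0 + mu, mu * lam, mu * lam ** 2
b11, b12 = 1.0 + mu, 0.0

xi1 = amplitude_ratio(a11, a12, b11, b12, 3.5)
xi2 = amplitude_ratio(a11, a12, b11, b12, 0.7)
print(xi1, xi2)
print(kinetic_form_in_Q(a11, a12, a22, xi1, xi2))
```

The cross coefficient `c12` vanishes to rounding accuracy, and the remaining diagonal coefficients are the numbers that a rescaling of $Q_1$ and $Q_2$ removes.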

Thus, the energy of any system performing small oscillations is 
reduced to the sum of the energies of separate, independent linear 
harmonic oscillators. As a result of this, consideration of oscillation 
problems is greatly simplified since the linear harmonic oscillator is, 
in many respects, one of the most simple mechanical systems. 

The reduction to normal coordinates turns out to be a very fruitful 
method in studies of the oscillations of polyatomic molecules, in 
the theory of crystals, and in electrodynamics. In addition, normal 
coordinates are useful in technical applications of oscillation theory. 

The case of equal frequencies. If the roots of equation (7.24) coincide, the general solution must not be written in the form (7.26), but somewhat differently, namely,

$$\begin{aligned} q_1 &= A\cos\omega t + B\sin\omega t,\\ q_2 &= A'\cos\omega t + B'\sin\omega t.\end{aligned} \tag{7.32}$$




Four arbitrary constants appear in this solution, and this is as 
it should be in a system with two degrees of freedom. 

An example of such a system is a pendulum suspended by a string 
instead of a hinge. In the approximation (7.32), it turns out that 
the pendulum describes an ellipse centred about the equilibrium 
position. Account taken of the subsequent terms in the expansion 
of the potential energy in powers of deflection shows that the axes 
of the ellipse do not remain stationary, but rotate. 


Exercise 


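Find the natural frequencies and normal oscillations of a double pendulum, taking the ratio of load masses $\mu = 3/4$ and the ratio of rod lengths $\lambda = 5/7$.

From the equation for the oscillation frequencies of a double pendulum, we obtain $\omega_1^2 = \dfrac{7}{2}\dfrac{g}{l}$, $\omega_2^2 = \dfrac{7}{10}\dfrac{g}{l}$. Further, the amplitude ratios (7.25) are $\xi_1 = -7/3$, $\xi_2 = 7/5$.

Let us now write down the expression for kinetic energy. For simplicity, we write $l = g = m = 1$ so that only the ratios $\lambda$ and $\mu$ will appear in all the equations. This gives $\alpha_{11} = 1 + \mu = 7/4$, $\alpha_{12} = \mu\lambda = 15/28$, $\alpha_{22} = \mu\lambda^2 = 75/196$; $\beta_{11} = 1 + \mu = 7/4$, $\beta_{12} = 0$, $\beta_{22} = \mu\lambda = 15/28$. Let us determine the coefficients $a_i$. To do this we must calculate the kinetic energy

$$2T = \frac{7}{4}(\dot Q_1 + \dot Q_2)^2 + \frac{15}{14}(\dot Q_1 + \dot Q_2)\left(-\frac{7}{3}\dot Q_1 + \frac{7}{5}\dot Q_2\right) + \frac{75}{196}\left(-\frac{7}{3}\dot Q_1 + \frac{7}{5}\dot Q_2\right)^2 = \frac{4}{3}\dot Q_1^2 + 4\dot Q_2^2.$$

Consequently, we must put $a_1 = \dfrac{\sqrt{3}}{2}$, $a_2 = 1/2$. Denoting $\dfrac{Q_1}{a_1}$ and $\dfrac{Q_2}{a_2}$ again by the letters $Q_1$ and $Q_2$, we have the expression for potential energy

$$U = \frac{1}{2}\left(\frac{7}{2}Q_1^2 + \frac{7}{10}Q_2^2\right),$$

as it should be according to (7.10). The generalized coordinates are related to the normal coordinates by

$$\varphi = \frac{\sqrt{3}}{2}Q_1 + \frac{1}{2}Q_2, \qquad \psi = -\frac{7\sqrt{3}}{6}Q_1 + \frac{7}{10}Q_2.$$

Thus, if $7\varphi = -3\psi$ and $7\dot\varphi = -3\dot\psi$ initially, then we have $Q_2 = 0$ for all time, so that both pendulums oscillate with one frequency $\omega_1$, with the constant relationship between the deflection angles $7\varphi = -3\psi$ holding all the time. Both pendulums are deflected to opposite sides of the vertical. The other normal oscillation, with frequency $\omega_2$, occurs for a constant angular relationship $7\varphi = 5\psi$.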

Sec. 8. Rotating Coordinate Systems. Inertial Forces 

The equivalence of inertial coordinate systems. The particular significance of inertial coordinate systems in mechanics was pointed out in Sec. 2. In such systems, all accelerations are produced by interaction between bodies. It is impossible to find a strictly inertial system in nature (any system is noninertial if the motions of bodies in it are observed over a sufficiently long period of time).

In the exercise at the end of this section we shall consider the 
Foucault pendulum, whose plane of oscillation rotates with a speed 
depending only on the geographical latitude of its location. This 
rotation cannot be explained by an interaction with the earth, because 
the gravitational force cannot make the pendulum rotate from east
to west instead of from west to east. * However, if we consider several 
oscillations, then the rotation of the plane is still insignificant and 
can be ignored. Then it is sufficient to consider that gravity alone 
is acting on the pendulum and that the coordinate system fixed 
in the earth is approximately inertial over a period of several oscil¬ 
lations. 

The concept of an inertial system is meaningful as an approximation 
and is a very convenient idealization in mechanics. In such a coordinate
system, the interaction forces are measured by the acceler¬
ations of the bodies. 

Let a coordinate system be defined for which it is known that, 
to the required degree of accuracy, it can be regarded as inertial. 
Then another coordinate system, moving uniformly relative to it, 
is also inertial within the same degree of accuracy. Indeed, if all 
the accelerations in the first system are due to interaction forces 
between bodies, then no additional accelerations can appear in the 
second system either. Therefore, both systems are inertial. Either 
of them may be considered at rest and the other moving, since motion 
is always relative. 

The principle of relativity. One of the basic principles of mechanics is that all laws of motion have an identical form in all inertial coordinate systems, since these systems are, physically, completely equivalent. This principle of the equivalence of all inertial systems is known as the relativity principle, for it is connected with the relativity of motion.

It should be noted that this in no way signifies that inertial and 
noninertial coordinate systems are equivalent: in the latter, not 
all the accelerations can be reduced to interaction forces, so that 
there is no physical equivalence between two such systems. 

Mathematically, the principle of relativity is expressed by the fact 
that equations of motion for one inertial system preserve their form 
after the variables have been transformed to another inertial system. 

The equations for the transformation from one inertial system 
to another can be obtained only on the basis of certain physical 

* In this case the plane of oscillation must pass through the vertical, since, otherwise, the pendulum would have an initial angular momentum relative to the vertical and would describe an ellipse whose semiaxes rotate (see end of Sec. 7).




assumptions. In Newtonian mechanics it is always taken that the interaction forces between bodies, in particular, gravitational forces, are transmitted instantaneously over any distance. Thus, the displacement of any body immediately transmits a certain momentum to any other body, no matter where it is located. As a result, a clock located in a certain inertial system can be instantaneously synchronized with a clock moving in another inertial system. Thus, in Newtonian mechanics, time is considered universal. In transforming from one inertial system to another (the latter with a velocity $V$ relative to the former) it is taken that the time $t$ is the same in both systems. Later on we shall see that this assumption is approximate and holds only when the relative velocity of the systems is considerably less than the velocity of light.

The Galilean transformation. Let us construct coordinate systems in two inertial frames of reference such that their abscissae are in the direction of the relative velocity $V$ and the other coordinate axes are also mutually parallel.

[Fig. 11]

Then from Fig. 11 it will be immediately seen that the abscissa $x$ of a point in the system which we shall call stationary is related to its abscissa $x'$ in the moving system by the simple relation


$$x = x' + Vt, \tag{8.1}$$


provided that the origins coincided at the instant $t = 0$. The coordinate construction does not impose any limitations on the generality of the transformation equations. The remaining transformations lead simply to the identities

$$y = y', \qquad z = z'. \tag{8.2}$$

The relationship $t = t'$ is a hypothesis which is correct only for values of $V$ considerably less than the velocity of light (Sec. 20).

Condition (8.1) is absolutely symmetrical with respect to both inertial systems: if we consider that the one in which the variables are primed is stationary and the other in motion, (8.1) retains the same form; one should, of course, replace $V$ by $-V$. In the given case, symmetry exists because $t' = t$. If $t' \neq t$, the transformation equations $x = x' + Vt$ and $x' = x - Vt'$ would contradict each other. But it would seem that equation (8.1) is obtained, quite obviously, from Fig. 11. Thus, if we do not consider time as identical for all inertial systems, the mathematical formulation of the relativity principle should be more complicated than that obtained on the basis of equation (8.1), and we must definitely give up this “obviousness,” which is so rooted in our everyday experience with velocities that are small compared with the velocity of light.

The equations of Newtonian mechanics involve, on the right-hand side, the forces of interaction between particles. These forces depend on the relative coordinates of the particles and, for this reason, they do not change with transformation (8.1), since $Vt$ is cancelled in the formation of differences between the coordinates of any pair of particles. The left-hand sides of the equations contain accelerations, i.e., the second derivatives of the coordinates with respect to time. But since time enters linearly in (8.1) and is the same in both systems, $\ddot{x} = \ddot{x}'$. Thus, the equations of mechanics are of identical form in any inertial frame of reference.
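The two facts used in this argument can be checked numerically. The following is only a minimal sketch: the two trajectories and the value of $V$ are arbitrary illustrative assumptions, not taken from the text, and the second derivative is estimated by a finite difference.

```python
def positions(t, V=0.0):
    """Abscissae of two particles in the frame related to the primed one by x = x' + V t (8.1).

    The primed trajectories are arbitrary smooth test functions (hypothetical).
    """
    x1_primed = 0.5 * t**2
    x2_primed = 1.0 + t**3
    return x1_primed + V * t, x2_primed + V * t


def second_difference(f, t, h=1e-3):
    """Central finite-difference estimate of the second time derivative of f at t."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / h**2


V = 7.0    # relative velocity of the two frames (illustrative value)
t0 = 2.0   # instant at which the comparison is made

# Relative coordinate of the pair: the term V t cancels in the difference.
rel_primed = positions(t0)[0] - positions(t0)[1]
rel_unprimed = positions(t0, V)[0] - positions(t0, V)[1]

# Acceleration of the first particle in both frames: V t is linear in t.
acc_primed = second_difference(lambda t: positions(t)[0], t0)
acc_unprimed = second_difference(lambda t: positions(t, V)[0], t0)

print(rel_primed, rel_unprimed)
print(acc_primed, acc_unprimed)
```

Both printed pairs agree to within rounding and finite-difference error, which is what the invariance of forces and accelerations under (8.1) requires.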

To summarize, the equations of mechanics do not change their form when the variables undergo transformations (8.1). In other words, it is common to say that the equations of mechanics are invariant to these transformations, which are usually called Galilean transformations.

The constancy (invariance) of mechanical laws under Galilean transformations is the essence of the relativity principle of Newtonian mechanics.

Here we must bear in mind that the relativity principle, which expresses the equivalence of all inertial coordinate systems, expresses a far more general law of nature than the approximate equations of transformation (8.1), (8.2). The extension of the relativity principle to electromagnetic phenomena involves the replacement of these equations by more general ones, which reduce to the former equations only when all velocities are much smaller than that of light.

Rotating coordinate systems. Several new terms appear in the equations of mechanics when transforming to rotating coordinate systems. Let us first obtain the equations for this transformation.

In Fig. 12 the axis of rotation is represented by a vertical line. The origin $O$ is on the axis of rotation. Let $\mathbf{r}$ be the radius vector of a point $A$ rotating around the axis. Then, for an angular velocity of rotation $\omega$ (radians per second) the linear speed of the point will be

$$v = \omega r \sin\alpha, \tag{8.3}$$

since the radius of rotation is $\rho = r \sin\alpha$ (see Fig. 12). Let the rotation be anticlockwise. If point $A$ lies in the plane of the paper, then the velocity $\mathbf{v}$ is perpendicular to the plane of the paper and directed towards the back of the paper. This permits us to obtain a relationship between the linear and angular velocities in vector form. We represent the angular velocity by a vector directed along the axis of rotation and associated with the direction of rotation by the corkscrew rule. Then, if the rotation occurs in an anticlockwise direction, the vector $\boldsymbol{\omega}$ is directed upwards along the axis. From this it follows that

$$\mathbf{v} = [\boldsymbol{\omega}\,\mathbf{r}]. \tag{8.4}$$

This expression ensures a correct magnitude and direction for the linear velocity of the point.

[Fig. 12]

Let us assume that point $A$, in addition to rotation, is somehow displaced relative to the origin $O$ with velocity $\mathbf{v}' = \dot{\mathbf{r}}$. The resultant velocity of the point relative to a nonrotating system will be represented as the sum $\mathbf{v}' + \mathbf{v}$. The kinetic energy of the point relative to the nonrotating system is $\frac{m}{2}(\mathbf{v} + \mathbf{v}')^2$, and the Lagrangian is

$$L = \frac{m}{2}(\mathbf{v} + \mathbf{v}')^2 - U(\mathbf{r}) = \frac{m}{2}(\mathbf{v}' + [\boldsymbol{\omega}\,\mathbf{r}])^2 - U(\mathbf{r}). \tag{8.5}$$

Let us now write down Lagrange’s equations for motion relative to a rotating system, i.e., considering $\mathbf{r}$ a generalized coordinate. In order to do this we must calculate the derivatives $\dfrac{\partial L}{\partial \dot{\mathbf{r}}}$ and $\dfrac{\partial L}{\partial \mathbf{r}}$; let it be noted that differentiation with respect to a vector denotes a shortened way of writing down the differentiation with respect to all of its three components. The general rules for such differentiations will be given in Sec. 11; here we shall calculate the derivatives for each component separately.

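Let $\boldsymbol{\omega}$ be along the direction of the $z$-axis. Then, in vector components, $L$ will be of the form

$$L = \frac{m}{2}\left[(\dot{x} - \omega y)^2 + (\dot{y} + \omega x)^2 + \dot{z}^2\right] - U(x, y, z). \tag{8.6}$$

Whence we obtain

$$\frac{\partial L}{\partial \dot{x}} = m(\dot{x} - \omega y), \qquad \frac{\partial L}{\partial \dot{y}} = m(\dot{y} + \omega x), \qquad \frac{\partial L}{\partial \dot{z}} = m\dot{z};$$

$$\frac{\partial L}{\partial x} = m\omega(\dot{y} + \omega x) - \frac{\partial U}{\partial x}, \qquad \frac{\partial L}{\partial y} = -m\omega(\dot{x} - \omega y) - \frac{\partial U}{\partial y}, \qquad \frac{\partial L}{\partial z} = -\frac{\partial U}{\partial z}.$$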

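Lagrange’s equations in component form appear thus:

$$m(\ddot{x} - \omega\dot{y}) - m\omega(\dot{y} + \omega x) - m\dot{\omega}y + \frac{\partial U}{\partial x} = 0,$$

$$m(\ddot{y} + \omega\dot{x}) + m\omega(\dot{x} - \omega y) + m\dot{\omega}x + \frac{\partial U}{\partial y} = 0,$$

$$m\ddot{z} + \frac{\partial U}{\partial z} = 0.$$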




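Let us leave on the left only the second derivatives and rewrite the last three equations as a single vector equation:

$$m\ddot{\mathbf{r}} = m[\mathbf{r}\,\dot{\boldsymbol{\omega}}] + 2m[\dot{\mathbf{r}}\,\boldsymbol{\omega}] + m[\boldsymbol{\omega}[\mathbf{r}\,\boldsymbol{\omega}]] - \frac{\partial U}{\partial \mathbf{r}}. \tag{8.7}$$

Expanding the double vector product on the right by means of the equation $[\mathbf{A}[\mathbf{B}\mathbf{C}]] = \mathbf{B}(\mathbf{A}\mathbf{C}) - \mathbf{C}(\mathbf{A}\mathbf{B})$, and transforming to components, we can see that (8.7) is equivalent to the preceding system of three equations. A direct differentiation with respect to the vectors $\mathbf{r}$ and $\dot{\mathbf{r}}$ would have led to (8.7), without the expression in terms of components.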
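The equivalence of (8.7) with the component equations can be spot-checked numerically. The sketch below is illustrative only: the mass, position, velocity, $\omega$ and $\dot{\omega}$ are arbitrary test values, the potential term is left out because it is the same in both forms, and `cross` is a small helper written for this check.

```python
def cross(a, b):
    """Vector product [a b] of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


m = 2.0
w, w_dot = 0.7, 0.3                 # omega and its time derivative, both along z
r = (1.0, -2.0, 0.5)                # arbitrary test position
r_dot = (0.4, 1.5, -0.8)            # arbitrary test relative velocity
omega = (0.0, 0.0, w)
omega_dot = (0.0, 0.0, w_dot)

# The three inertial terms on the right-hand side of (8.7).
term1 = cross(r, omega_dot)
term2 = cross(r_dot, omega)
term3 = cross(omega, cross(r, omega))
vector_form = tuple(m * a + 2.0 * m * b + m * c
                    for a, b, c in zip(term1, term2, term3))

# The same terms read off from the component equations, solved for m * acceleration.
x, y, z = r
xd, yd, zd = r_dot
component_form = (m * (2.0 * w * yd + w_dot * y + w * w * x),
                  m * (-2.0 * w * xd - w_dot * x + w * w * y),
                  0.0)

print(vector_form)
print(component_form)
```

The two tuples coincide, which confirms the signs of the three inertial terms for $\boldsymbol{\omega}$ along the $z$-axis.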

Inertial forces. The first three terms on the right in (8.7) essentially distinguish the equations of motion, written relative to a rotating coordinate system, from the equations written relative to a nonrotating system.

The use of a noninertial system is determined by the nature of 
the problem. For example, if the motion of terrestrial bodies is being 
studied, it is natural to choose the earth as the coordinate system, 
and not some other system related to the Galaxy (the aggregate 
of stars in the Milky Way). If we consider the reaction of a passenger 
to a train that suddenly stops, we must take the train as frame of 
reference and not the station platform. When the train is braked 
sharply, the passenger continues to move forwards “inertially” or, 
as we have agreed to say, he continues to move uniformly relative 
to an inertial system attached to the earth. Thus, relative to the 
carriage, it is the familiar jerk forward. At the same time it is obvious 
that the noninertial system is the train and not the earth, since 
no one experiences any jerk on the platform. 

The additional terms on the right of equation (8.7) have the same 
origin as the jerk when the train stopped; they are produced by 
noninertiality (in the given case, rotation) of the coordinate system. 
Naturally, the acceleration of a point caused by noninertiality of 
the system is absolutely real, relative to that system, in spite of the 
fact that there are other, inertial, systems relative to which this 
acceleration does not exist. In equation (8.7) this acceleration is 
written as if it were due to some additional forces. These forces 
are usually called inertial forces. In so far as the acceleration associated with them is in every way real, the discussion (which sometimes arises) about the reality of inertial forces themselves must be considered as aimless. It is only possible to talk about the difference between the forces of inertia and the forces of interaction between bodies.

But if we consider the force of Newtonian attraction, we cannot ignore the striking fact that, like the forces of inertia, it is proportional to the mass of the body. As a result of this, the equations of mechanics can be formulated in such a way that the difference between gravitational forces and inertial forces does not appear at all in the equations; all these forces turn out to be physically equivalent. However, this formulation is, of course, connected with a re-evaluation and a substantial revision of the basis of mechanics. It is the subject of Einstein’s general theory of relativity, which is discussed in somewhat more detail at the end of Sec. 20.

Coriolis force. Let us now consider in more detail the inertial forces appearing in (8.7), which are due to a rotating coordinate system.

The first term in (8.7) occurs as a result of nonconstancy of angular velocity. It will not interest us. The second term is called the Coriolis force. For a Coriolis force to appear, the velocity of a point relative to a rotating coordinate system must have a projection, other than zero, on a plane perpendicular to the axis of rotation. This velocity projection can, in turn, be separated into two components: one, perpendicular to the radius drawn from the axis of rotation to the moving point, and the other, directed along the radius. The most interesting, as to its action, is the component of the Coriolis force due to the radial component of velocity. It is perpendicular both to the radius and to the axis of rotation. If a body moves perpendicularly to a radius, then its Coriolis acceleration is radial, and therefore analogous in its action to the centripetal acceleration which will be considered a little further on.

We note that the Coriolis force cannot be related, even formally, to the gradient of a potential function $U$.

There are many examples of the deflecting action of the Coriolis force in nature. The water of rivers in the Northern Hemisphere which flow in the direction of the meridian, i.e., from north to south, or from south to north, experiences a deflection towards the right-hand bank (if we are looking in the direction of flow). This is why the right-hand bank of such rivers is steeper than the left. It is easy to form the corresponding component of the Coriolis force. The angular-velocity vector of the earth’s rotation is directed along the earth’s axis, “upwards” from the north pole. The waters of a river which flows southwards at the mean latitudes of the Northern Hemisphere have a velocity component perpendicular to the earth’s axis and directed away from the axis. This means that the Coriolis acceleration of the water, relative to the earth, is in a westerly direction or, relative to a river flowing southwards, to the right. If the river flows in a northerly direction, the deflection will be towards the east, i.e., again to the right. In the Southern Hemisphere the deflection occurs leftwards.

The warm Gulf Stream which flows northwards is deflected towards the east, which is of tremendous importance for the climate of Europe. In general, the Coriolis force considerably affects the motion of air and water masses on the earth, though when compared in magnitude with the gravitational force it is very insignificant. Indeed, the angular velocity of the earth, as it completes one rotation about its axis in 24 hours, is a little less than $10^{-4}$ rad/sec, while the velocity of a particle of water or air can be taken as having an order of magnitude of $10^{2}$ cm/sec. From this the Coriolis acceleration has an order of magnitude of $10^{-2}$ cm/sec$^2$, which is one hundred thousand times less than the acceleration caused by the force of gravity.
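The arithmetic behind this estimate can be reproduced in a few lines. This is only an order-of-magnitude sketch: the value $g = 981$ cm/sec$^2$ and the use of the maximum Coriolis acceleration $2\omega v$ (velocity perpendicular to the axis) are assumptions made here for illustration.

```python
import math

# Angular velocity of the earth for one rotation in 24 hours, in rad/sec.
omega = 2.0 * math.pi / (24 * 3600)

v = 1.0e2    # typical speed of water or air, cm/sec (order of magnitude from the text)
g = 981.0    # acceleration of gravity, cm/sec^2 (assumed standard value)

# Largest possible Coriolis acceleration, 2 * omega * v, in cm/sec^2.
a_coriolis = 2.0 * omega * v

# How many times smaller the Coriolis acceleration is than g.
ratio = g / a_coriolis

print(omega, a_coriolis, ratio)
```

With these inputs the angular velocity comes out somewhat below $10^{-4}$ rad/sec and the Coriolis acceleration close to $10^{-2}$ cm/sec$^2$, so the ratio to $g$ is of the order of $10^{5}$, in agreement with the figures quoted above.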

The Coriolis force also causes the rotation of the plane of oscillation of a Foucault pendulum. With the aid of the Foucault pendulum, we can prove the rotation of the earth about its axis without astronomical observations. In a nonrotating system, the plane of oscillation must be invariable in accordance with the law of conservation of angular momentum.

Centrifugal force. The third vector term in equation (8.7) is the usual centrifugal force. Indeed, it is perpendicular to the axis of rotation and, in absolute value, is equal to

$$\left|m[\boldsymbol{\omega}[\boldsymbol{\omega}\,\mathbf{r}]]\right| = m\omega\left|[\boldsymbol{\omega}\,\mathbf{r}]\right| = m\omega(\omega r \sin\alpha) = m\omega^2 r \sin\alpha. \tag{8.8}$$

Here, the first equality takes account of the fact that the vectors $\boldsymbol{\omega}$ and $[\boldsymbol{\omega}\,\mathbf{r}]$ are perpendicular to each other, so that the absolute value of the vector product is equal to the product of their absolute values.

But $r \sin\alpha$ is equal to the distance from the axis of rotation, so that this force satisfies the usual definition of a centrifugal force.

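Exercise

Let us consider the rotation of the plane of oscillation of a Foucault pendulum under the action of the earth’s rotation about its axis.

The axis $Ox$ at a given point on the earth is drawn in a northerly direction and the axis $Oy$ in an easterly direction. Then, if $\omega_3 = \omega \sin\theta$, where $\theta$ is the latitude of the locality, we have the equations of motion

$$\ddot{x} = -\omega_0^2 x - 2\dot{y}\,\omega_3, \qquad \ddot{y} = -\omega_0^2 y + 2\dot{x}\,\omega_3,$$

where $\omega_0$ denotes the frequency of the small oscillations of the pendulum. Multiplying the first equation by $y$ and the second by $x$ and then subtracting, we get

$$\frac{d}{dt}(y\dot{x} - x\dot{y}) = -\omega_3 \frac{d}{dt}(y^2 + x^2).$$

Integrating and transforming to polar coordinates ($x = r\cos\varphi$, $y = r\sin\varphi$), we obtain

$$r^2\dot{\varphi} = \omega_3 r^2$$

(the constant of integration is zero, since the pendulum passes through the position $r = 0$). Whence, after cancelling the $r^2$’s, we have

$$\dot{\varphi} = \omega_3 = \omega \sin\theta,$$

which gives the angular velocity of rotation of the plane of oscillation.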
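The result of the exercise translates directly into the time needed for the plane of oscillation to complete one full turn. The function below is a small illustrative sketch; its name and the default 24-hour day are assumptions made here, following the rounded figure used earlier in this section.

```python
import math


def foucault_rotation_period_hours(latitude_deg, day_hours=24.0):
    """Time in hours for the plane of oscillation to turn through 2*pi.

    Uses the rate omega * sin(latitude) obtained in the exercise, with
    omega = 2*pi / day_hours. At the equator the rate vanishes, so the
    period is returned as infinity.
    """
    omega = 2.0 * math.pi / day_hours                      # rad per hour
    rate = omega * math.sin(math.radians(latitude_deg))    # rad per hour
    if rate == 0.0:
        return math.inf
    return 2.0 * math.pi / abs(rate)


period_pole = foucault_rotation_period_hours(90.0)
period_30 = foucault_rotation_period_hours(30.0)
period_equator = foucault_rotation_period_hours(0.0)
print(period_pole, period_30, period_equator)
```

At the pole the plane turns once per day, at latitude 30° it takes twice as long, and at the equator the plane does not rotate at all.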

Sec. 9. The Dynamics of a Rigid Body

The dynamics of a rigid body is a large independent chapter of mechanics and is very rich in technical applications. Our aim is to give only a brief account of the basic concepts of this branch of mechanics inasmuch as it contains instructive examples of general laws. In addition, certain mechanical quantities that characterize a rigid body are necessary for an understanding of molecular spectra.

The kinetic energy of a rigid body. As was shown in Sec. 1, a rigid body has six degrees of freedom. Three of them relate to the translational motion of the centre of mass of a body in space. The remaining three degrees of freedom correspond to rotation (relative to this centre of mass).

In Sec. 4, it was shown that the kinetic energy of a system consists of the kinetic energy of the motion of the whole mass of the body concentrated at the centre of mass, and the kinetic energy of the relative motion of the separate particles of the system. In the case of a rigid body, relative motion reduces to rotation with the value of angular velocity $\boldsymbol{\omega}$ the same for all particles. Naturally, both the magnitude and the direction of $\boldsymbol{\omega}$ may vary with time.

Let us calculate the kinetic energy of rotation of a rigid body. In the general case, the density $\rho$ of the body may not be uniform over the whole volume of the body, and may depend on the coordinates: $\rho = \rho(x, y, z) = \rho(\mathbf{r})$. The mass of an element of volume $dV$ is equal to $dm = \rho\,dV$. The velocity of rotation $\mathbf{v}$ is, from (8.4), $[\boldsymbol{\omega}\,\mathbf{r}]$. Therefore, the kinetic energy of the volume element is equal to $\frac{1}{2}\rho[\boldsymbol{\omega}\,\mathbf{r}]^2\,dV$. The kinetic energy of the whole body is represented by the integral of this quantity with respect to the volume:

$$T = \frac{1}{2}\int \rho[\boldsymbol{\omega}\,\mathbf{r}]^2\,dV. \tag{9.1}$$

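Expressing the square of the vector product in terms of the components of $\boldsymbol{\omega}$, we have

$$[\boldsymbol{\omega}\,\mathbf{r}]^2 = \omega^2 r^2 \sin^2\alpha = \omega^2 r^2 - \omega^2 r^2 \cos^2\alpha = \omega^2 r^2 - (\boldsymbol{\omega}\mathbf{r})^2.$$

Here $\alpha$ is the angle between $\boldsymbol{\omega}$ and $\mathbf{r}$. But

$$\omega^2 = \omega_x^2 + \omega_y^2 + \omega_z^2,$$

$$(\boldsymbol{\omega}\mathbf{r})^2 = (\omega_x x + \omega_y y + \omega_z z)^2 = \omega_x^2 x^2 + \omega_y^2 y^2 + \omega_z^2 z^2 + 2\omega_x\omega_y xy + 2\omega_x\omega_z xz + 2\omega_y\omega_z yz.$$

Since the body is rigid, the components $\omega_x$, $\omega_y$, $\omega_z$ can be taken out of the volume integral. Combining terms which are similar in the components of $\boldsymbol{\omega}$, we obtain for $T$:

$$T = \frac{1}{2}\omega_x^2 \int \rho(y^2 + z^2)\,dV + \frac{1}{2}\omega_y^2 \int \rho(x^2 + z^2)\,dV + \frac{1}{2}\omega_z^2 \int \rho(x^2 + y^2)\,dV - \omega_x\omega_y \int \rho xy\,dV - \omega_x\omega_z \int \rho xz\,dV - \omega_y\omega_z \int \rho yz\,dV. \tag{9.2}$$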




Moments of inertia. All the integrals appearing in (9.2) depend only on the shape of the body and its density distribution, and do not depend on the motion of the body (in a coordinate system fixed in the body). We denote them as follows:

$$J_{xx} = \int \rho(y^2 + z^2)\,dV, \qquad J_{xy} = -\int \rho xy\,dV,$$

$$J_{yy} = \int \rho(x^2 + z^2)\,dV, \qquad J_{xz} = -\int \rho xz\,dV, \tag{9.3}$$

$$J_{zz} = \int \rho(x^2 + y^2)\,dV, \qquad J_{yz} = -\int \rho yz\,dV.$$

The quantities with the same indexes are called moments of inertia, while those with different indexes are called products of inertia.

In the notation of (9.3), the kinetic energy has the form

$$T = \frac{1}{2}\left(J_{xx}\omega_x^2 + J_{yy}\omega_y^2 + J_{zz}\omega_z^2 + 2J_{xy}\omega_x\omega_y + 2J_{xz}\omega_x\omega_z + 2J_{yz}\omega_y\omega_z\right). \tag{9.4}$$

With the aid of the summation convention used in Sec. 2 when evaluating Lagrange’s equations, the kinetic energy can be written in the following concise form:

$$T = \frac{1}{2}J_{\alpha\beta}\,\omega_\alpha\omega_\beta.$$
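The agreement between the concise form and the direct definition (9.1) can be illustrated for a body made of a few point masses, for which the volume integrals of (9.3) become sums. This is only a sketch under that assumption; the masses, positions and angular velocity are arbitrary test values, and `inertia_tensor` and `cross` are helper names introduced here.

```python
def cross(a, b):
    """Vector product [a b] of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def inertia_tensor(masses, points):
    """Moments and products of inertia (9.3), with integrals replaced by sums."""
    J = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, points):
        r2 = r[0] ** 2 + r[1] ** 2 + r[2] ** 2
        for a in range(3):
            for b in range(3):
                # Diagonal: m (r^2 - x_a^2); off-diagonal: -m x_a x_b.
                J[a][b] += m * ((r2 if a == b else 0.0) - r[a] * r[b])
    return J


masses = [1.0, 2.0, 0.5]
points = [(1.0, 0.0, 0.5), (-0.5, 1.5, 0.0), (0.2, -1.0, 2.0)]
omega = (0.3, -1.2, 0.8)

J = inertia_tensor(masses, points)

# Concise form: T = (1/2) J_ab omega_a omega_b, summed over both indexes.
T_tensor = 0.5 * sum(J[a][b] * omega[a] * omega[b]
                     for a in range(3) for b in range(3))

# Direct form (9.1): T = (1/2) sum of m [omega r]^2.
T_direct = 0.5 * sum(m * sum(c * c for c in cross(omega, r))
                     for m, r in zip(masses, points))

print(T_tensor, T_direct)
```

The two values coincide; the off-diagonal terms of the double sum account for the factors of 2 in (9.4).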

Principal axes of inertia. Let us suppose that $Oxyz$ is a coordinate system fixed in a body. In this system all the quantities $J_{xx}, \ldots, J_{yz}$ are constant. Let us take another coordinate system $Ox'y'z'$ which is also fixed in the body. The old coordinates of any point are expressed in terms of its new coordinates by the well-known formulae of analytical geometry:

$$x = x'\cos\angle(x', x) + y'\cos\angle(y', x) + z'\cos\angle(z', x),$$

$$y = x'\cos\angle(x', y) + y'\cos\angle(y', y) + z'\cos\angle(z', y),$$

$$z = x'\cos\angle(x', z) + y'\cos\angle(y', z) + z'\cos\angle(z', z),$$

or, if we denote $\cos\angle(x'_\alpha, x_\beta)$ by the symbol $A_{\alpha\beta}$ then, with the aid of the summation convention,*

$$x_\beta = x'_\alpha A_{\alpha\beta}.$$

* $x_1 = x$, $x_2 = y$, $x_3 = z$.

The same formulae are used to express also the components of any vector, and in particular $\omega_\beta$, relative to the old axes, in terms of the components $\omega_{\alpha'}$ relative to the new axes.

Let us substitute these expressions into the kinetic energy (9.4) and collect the terms containing the products $\omega_{x'}\omega_{y'}$, $\omega_{x'}\omega_{z'}$, $\omega_{y'}\omega_{z'}$ and the squares $\omega_{x'}^2$, $\omega_{y'}^2$, $\omega_{z'}^2$. We shall now show that we can always rotate the coordinate axes so that the coefficients of the new products $\omega_{x'}\omega_{y'}$, $\omega_{x'}\omega_{z'}$, $\omega_{y'}\omega_{z'}$ become zero. Indeed, any rotation of the coordinate system can be described with the aid of three independent parameters, for a coordinate system is like an imaginary rigid body and its position in space is defined by the three angles of rotation (see Sec. 1). These three angles can be chosen so that the sums of the products of the cosines of the angles between the axes, for $\omega_{x'}\omega_{y'}$, $\omega_{x'}\omega_{z'}$ and $\omega_{y'}\omega_{z'}$, become zero. The remaining expressions for $\omega_{x'}^2$, $\omega_{y'}^2$ and $\omega_{z'}^2$ will be called $J_1$, $J_2$, $J_3$, so that

$$J_1 = J_{\alpha\beta}A_{1\alpha}A_{1\beta}, \qquad J_2 = J_{\alpha\beta}A_{2\alpha}A_{2\beta}, \qquad J_3 = J_{\alpha\beta}A_{3\alpha}A_{3\beta}.$$

The kinetic energy is written in the following form in the new coordinate axes:

$$T = \frac{1}{2}\left(J_1\omega_1^2 + J_2\omega_2^2 + J_3\omega_3^2\right). \tag{9.5}$$

These axes are called the principal axes of inertia of the body; they can be defined relative to any point connected with the body. By definition, the products of inertia convert to zero in the principal axes of inertia. The moments of inertia in the principal axes are called principal moments of inertia. They are denoted by $J_1$, $J_2$, $J_3$.
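The vanishing of a product of inertia under a suitable rotation can be illustrated in the simplest situation, where the $z$-axis is already a principal axis and only one angle has to be chosen. The sketch below is restricted to that special case (the general case needs three angles, as stated above); the sample values of the moments and the helper names are assumptions made for illustration.

```python
import math


def rotate_tensor_about_z(J, angle):
    """Components J'_ab = A_ai A_bj J_ij for new axes turned by `angle` about z.

    Row a of A holds the cosines of the angles between new axis a and the old axes.
    """
    c, s = math.cos(angle), math.sin(angle)
    A = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]
    return [[sum(A[a][i] * J[i][j] * A[b][j]
                 for i in range(3) for j in range(3))
             for b in range(3)]
            for a in range(3)]


def principal_angle_about_z(J):
    """Angle that makes J'_xy vanish: tan(2 * angle) = 2 J_xy / (J_xx - J_yy)."""
    return 0.5 * math.atan2(2.0 * J[0][1], J[0][0] - J[1][1])


# Sample tensor with J_xz = J_yz = 0 (illustrative values only).
J = [[3.0, 1.0, 0.0],
     [1.0, 2.0, 0.0],
     [0.0, 0.0, 4.0]]

angle = principal_angle_about_z(J)
J_new = rotate_tensor_about_z(J, angle)

print(angle)
print(J_new[0][0], J_new[1][1], J_new[0][1])
```

After the rotation the product of inertia `J_new[0][1]` is zero to rounding accuracy, while the sum of the two in-plane moments is unchanged.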

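The angular momentum of a rigid body. Let us now calculate a projection of the angular momentum of a rigid body. From the definition of angular momentum we obtain

$$M_x = \int \rho[\mathbf{r}\,\mathbf{v}]_x\,dV = \int \rho[\mathbf{r}[\boldsymbol{\omega}\,\mathbf{r}]]_x\,dV = \int \rho\left(\omega_x r^2 - x(\boldsymbol{\omega}\mathbf{r})\right)dV = \omega_x \int \rho(y^2 + z^2)\,dV - \omega_y \int \rho xy\,dV - \omega_z \int \rho xz\,dV = J_{xx}\omega_x + J_{xy}\omega_y + J_{xz}\omega_z \tag{9.6}$$

or, in shortened form,

$$M_\alpha = J_{\alpha\beta}\,\omega_\beta.$$

Comparing (9.6) and (9.4), we see that

$$M_x = \frac{\partial T}{\partial \omega_x}. \tag{9.7}$$

$M_y$ and $M_z$ appear analogous. In vector form, we may write

$$\mathbf{M} = \frac{\partial T}{\partial \boldsymbol{\omega}}. \tag{9.8}$$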

Equations (9.7) and (9.8) again express the fact that the angular momentum is a generalized momentum related to rotation. In this sense, (9.7) corresponds to (5.4). The only difference is that the components of $\boldsymbol{\omega}$ are not total time derivatives of some quantities. This will be shown a little later in the present section. In that sense, $\omega_x$ in (9.7) is not altogether similar to $\dot{\varphi}$ in (5.4).

If the coordinate axes coincide with the principal axes of inertia, then the expression for angular momentum is even simpler than (9.6):

$$M_1 = J_1\omega_1, \tag{9.9}$$

and similarly for the other components.

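Moment of forces. Let us now find equations which describe the variation of angular momentum with time. The derivative of angular momentum of a particle is

$$\dot{\mathbf{M}} = \frac{d}{dt}[\mathbf{r}\,\mathbf{p}] = [\dot{\mathbf{r}}\,\mathbf{p}] + [\mathbf{r}\,\dot{\mathbf{p}}] = [\mathbf{r}\,\mathbf{F}],$$

where the first term becomes zero since $\dot{\mathbf{r}}$ and $\mathbf{p}$ are parallel. Integrating this equation over the volume of the rigid body and taking advantage of the additive property of angular momentum, we have

$$\dot{\mathbf{M}} = \int [\mathbf{r}\,\mathbf{F}]\,dV = \mathbf{K}. \tag{9.10}$$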

The right-hand side of (9.10), which we denote by $\mathbf{K}$, is called the resultant moment of the forces applied to the body. If $\mathbf{F}$ is the gravitational force (which occurs in the majority of cases) then $\mathbf{K}$ can also be written as

$$\mathbf{K} = -\int \rho g[\mathbf{r}\,\mathbf{z}_0]\,dV,$$

where $\mathbf{z}_0$ is the unit vector in a vertical direction. But since the vector $\mathbf{z}_0$ is a constant, it should be put outside the integration sign:

$$\mathbf{K} = \left[\mathbf{z}_0, \int \rho g\,\mathbf{r}\,dV\right].$$

If the body is supported at its centre of mass, then, by the definition of centre of mass, the integral $\int \rho\,\mathbf{r}\,dV$ will be zero for all three projections. Then $\mathbf{K} = 0$ and the total angular momentum will be conserved. This occurs in the case of a gyroscope.

For the conservation of angular momentum of a rigid body it is 
sufficient that K = 0; but for any arbitrary mechanical system, 
angular momentum is conserved only when there are no external 
forces. 

Euler’s equations. Equation (9.6) gives a relationship between $\mathbf{M}$ and $\boldsymbol{\omega}$. The quantities $J_{xx}, \ldots, J_{yz}$ are constant only in a coordinate system fixed in the rigid body itself. If we write equation (9.10) for a stationary coordinate system, then, differentiating $\mathbf{M}$ with respect to time, we must also find the derivatives of $J_{xx}, \ldots, J_{yz}$ with respect to time, which is very inconvenient. Therefore, it is preferable to transform the equation to a coordinate system fixed in the body, taking into account the accelerated motion of that system. The variation of the vector $\mathbf{M}$ relative to the moving axes consists of two components; one is due to the variation of the vector itself, while the other is due to the motion of the axes onto which it is projected. For the vector $\mathbf{M}$ this variation is equal to $[\boldsymbol{\omega}\,\mathbf{M}]$, similar to the way that it was equal to $[\boldsymbol{\omega}\,\mathbf{r}]$ for the radius vector $\mathbf{r}$ in Sec. 8. When the coordinate system is rotated, any vector varies like a radius vector.

Let the coordinate axes be taken in the direction of the principal axes of inertia. Obviously, the moments of inertia relative to these coordinates are constant. For this reason, the time derivative of $M_1$ is

$$\dot{M}_1 = J_1\dot{\omega}_1 + [\boldsymbol{\omega}\,\mathbf{M}]_1 = J_1\dot{\omega}_1 + \omega_2 M_3 - \omega_3 M_2 = J_1\dot{\omega}_1 + (J_3 - J_2)\,\omega_2\omega_3.$$

Equating this expression to the magnitude of the projection of the moment of force on the first axis of inertia, and doing the same for the other axes, we obtain the required system of equations

$$J_1\dot{\omega}_1 + (J_3 - J_2)\,\omega_2\omega_3 = K_1,$$

$$J_2\dot{\omega}_2 + (J_1 - J_3)\,\omega_3\omega_1 = K_2, \tag{9.11}$$

$$J_3\dot{\omega}_3 + (J_2 - J_1)\,\omega_1\omega_2 = K_3.$$

These equations were obtained by L. Euler and are named after him. They can be reduced to quadrature for any arbitrary values of integrals of motion in the following cases:

1) $K_1 = K_2 = K_3 = 0$ (point of support at the centre of mass) for arbitrary values of the moments of inertia;

2) $J_2 = J_3$ and the point of support lies on the axis of symmetry, relative to which two moments of inertia are equal. This is the so-called symmetrical top.

For more than a hundred years, no other case of a solution of system (9.11) by quadratures was known. Only in 1887 did S. V. Kovalevskaya find another example (see G. K. Suslov, Theoretical Mechanics, Gostekhizdat, 1944). Kovalevskaya showed that the three listed cases exhaust all the possibilities of integrating the system (9.11) by quadratures for arbitrary constants (integrals) of motion.

A free symmetrical top. All three cases, and in particular the Kovalevskaya case, are very complicated to integrate. Therefore, we shall only consider the simplified first case, when $J_2 = J_3$ (a free symmetrical top).

From the first equation of (9.11), it immediately follows that $\omega_1 = \text{const}$. For brevity, we write

$$\omega_1\left(\frac{J_1}{J_2} - 1\right) = \Omega. \tag{9.12}$$

The last two equations of (9.11) are written thus:

$$\dot{\omega}_2 + \Omega\omega_3 = 0, \qquad \dot{\omega}_3 - \Omega\omega_2 = 0. \tag{9.13}$$

Equations (9.13) are easily integrated if we represent the components $\omega_2$ and $\omega_3$ in the following form:

$$\omega_2 = \omega_\perp \cos\Omega t, \qquad \omega_3 = \omega_\perp \sin\Omega t. \tag{9.14}$$

Here, $\omega_2^2 + \omega_3^2 = \omega_\perp^2$ is a constant quantity. Thus, the angular-momentum projection on the axis of symmetry and the sum of the squares of the angular-momentum projections on the other two axes are conserved. This means that the angular-momentum vector rotates about the axis of symmetry, i.e., the first axis of inertia, with angular velocity $\Omega$; the vector makes with it a constant angle, the tangent of which is $J_2\omega_\perp / (J_1\omega_1)$. This is the situation in a system of moving axes.
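The solution (9.14) can be compared with a direct numerical integration of Euler’s equations (9.11) with zero moment of force. The following is a minimal sketch under stated assumptions: a hand-written fourth-order Runge-Kutta step is used as the integrator, and the moments of inertia and initial angular velocity are arbitrary illustrative values with $J_2 = J_3$.

```python
import math


def euler_rhs(w, J):
    """Time derivatives of (w1, w2, w3) from Euler's equations (9.11) with K = 0."""
    w1, w2, w3 = w
    J1, J2, J3 = J
    return (-(J3 - J2) * w2 * w3 / J1,
            -(J1 - J3) * w3 * w1 / J2,
            -(J2 - J1) * w1 * w2 / J3)


def rk4_step(w, J, dt):
    """One classical fourth-order Runge-Kutta step for the angular velocity."""
    def shifted(base, k, s):
        return tuple(x + s * y for x, y in zip(base, k))

    k1 = euler_rhs(w, J)
    k2 = euler_rhs(shifted(w, k1, dt / 2.0), J)
    k3 = euler_rhs(shifted(w, k2, dt / 2.0), J)
    k4 = euler_rhs(shifted(w, k3, dt), J)
    return tuple(x + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for x, a, b, c, d in zip(w, k1, k2, k3, k4))


J = (1.0, 2.0, 2.0)                  # symmetrical top: J2 = J3 (illustrative values)
w1, w_perp = 3.0, 0.5                # constant w1 and transverse amplitude
Omega = w1 * (J[0] / J[1] - 1.0)     # definition (9.12)

w = (w1, w_perp, 0.0)                # agrees with (9.14) at t = 0
dt, steps = 1.0e-3, 2000
for _ in range(steps):
    w = rk4_step(w, J, dt)

t = dt * steps
expected = (w1, w_perp * math.cos(Omega * t), w_perp * math.sin(Omega * t))
print(w)
print(expected)
```

The integrated components follow (9.14) closely, and $\omega_1$ stays constant throughout, as the first of equations (9.11) requires when $J_2 = J_3$.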

Of course, in a system of stationary axes, the total angular momentum is conserved in magnitude and direction, since the resultant moment of force is equal to zero. In this system, the axis of symmetry of the top rotates about the angular-momentum direction, making a constant angle with it. Such motion is called precession. Precessional motion is only stable for relatively small external perturbations. The stabilizing action of gyroscopes is based on this principle.

Eulerian angles. We shall now show how to describe the rotation of a rigid body with the aid of parameters which specify its position. Such parameters are the Eulerian angles shown in Fig. 13. The figure depicts two coordinate systems: a fixed system $Oxyz$ and a system $Ox'y'z'$ fixed in the rigid body. It is most convenient to take $x'$, $y'$, $z'$ along the principal axes of inertia through the point of support. Then the Eulerian angles are:

$\vartheta$ is the angle between the axes $z$ and $z'$,

$\varphi$ is the angle between the line $OK$ of intersection between the planes $xOy$ and $x'Oy'$ and the $x'$-axis,

$\psi$ is the angle between the line $OK$ and the $x$-axis.

[Fig. 13]

If the angle $\psi$ varies, then the angular-velocity vector $\dot{\psi}$ is directed along the axis $Oz$, since that vector is perpendicular to the plane of the angle of rotation $\psi$. Likewise, $\dot{\varphi}$ must be taken along the axis $Oz'$ and $\dot{\vartheta}$ along the line $OK$.

Let us now express the angular-velocity projections (i.e., $\omega_1$, $\omega_2$, $\omega_3$) onto the principal axes of inertia in terms of the generalized velocities $\dot{\psi}$, $\dot{\varphi}$, $\dot{\vartheta}$.




$\omega_3$ is the projection of the angular velocity on the axis $Oz'$ ($z'$ is the third axis). As was shown, $\dot{\varphi}$ is projected exclusively on this axis and the projection of $\dot{\psi}$ is equal to $\dot{\psi}\cos\vartheta$, since $\vartheta$ is the angle between the axes $Oz$ and $Oz'$. Hence,

$$\omega_3 = \dot{\varphi} + \dot{\psi}\cos\vartheta. \tag{9.15}$$

In order to find the projections of the angular velocity on the other two axes, we draw a line $OL$ which lies in the plane $x'Oy'$ and is perpendicular to $OK$.

From Fig. 13 it can be seen that

$$\angle LOx' = \frac{\pi}{2} - \varphi \quad\text{and}\quad \angle zOL = \frac{\pi}{2} + \vartheta,$$

since the straight line $OL$ lies in the plane $zz'$, as do all lines perpendicular to $OK$. The projection of $\dot{\psi}$ on $OL$ is equal to $-\dot{\psi}\sin\vartheta$, and the projection on $Ox'$ is equal to $-\dot{\psi}\sin\vartheta\cos\left(\frac{\pi}{2} - \varphi\right) = -\dot{\psi}\sin\vartheta\sin\varphi$. The projection of $\dot{\psi}$ on $Oy'$ is $\dot{\psi}\sin\vartheta\cos\varphi$. The projections of $\dot{\vartheta}$ on $Ox'$ and $Oy'$ can be directly found by means of the diagram; they are $\dot{\vartheta}\cos\varphi$ and $\dot{\vartheta}\sin\varphi$. The result is therefore

$$\omega_1 = \dot{\vartheta}\cos\varphi - \dot{\psi}\sin\vartheta\sin\varphi, \tag{9.16}$$

$$\omega_2 = \dot{\vartheta}\sin\varphi + \dot{\psi}\sin\vartheta\cos\varphi. \tag{9.17}$$

From equations (9.15), (9.16), and (9.17) it will be seen that $\omega_1$, $\omega_2$ and $\omega_3$ are not total time derivatives of any quantities and, in that sense, do not exactly agree with the usual notion of generalized velocities (as do $\dot{\varphi}$, $\dot{\psi}$, $\dot{\vartheta}$).

If we substitute into (9.5) the expressions for $\omega_1$, $\omega_2$, $\omega_3$ in terms of the Eulerian angles, we obtain the kinetic energy of a rigid body as a function of the generalized coordinates $\varphi$, $\psi$, $\vartheta$.

The symmetrical top in a gravitational field. We shall find the Lagrangian for a symmetrical top whose point of support lies on the axis of symmetry at a distance $l$ below the centre of mass. Then the height of the centre of mass above the point of support is $z = l\cos\vartheta$. Hence, the potential energy of the top is

$$U = mgz = mgl\cos\vartheta. \tag{9.18}$$

The kinetic energy of the top, expressed in terms of the Eulerian angles, is

$$T = \frac{1}{2}J_1(\omega_1^2 + \omega_2^2) + \frac{1}{2}J_3\omega_3^2 = \frac{1}{2}J_1(\dot{\vartheta}^2 + \dot{\psi}^2\sin^2\vartheta) + \frac{1}{2}J_3(\dot{\varphi} + \dot{\psi}\cos\vartheta)^2. \tag{9.19}$$

The difference between the quantities (9.19) and (9.18) gives the Lagrangian for a symmetrical top. The sum gives the total energy $\mathcal{E}$.
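The step from (9.16) and (9.17) to the first term of (9.19) can be written out explicitly. The lines below are only a worked check under the sign conventions adopted above for the Eulerian angles; the cross terms in $\dot{\vartheta}\dot{\psi}$ cancel between the two squares.

```latex
\begin{align*}
\omega_1^2 + \omega_2^2
  &= (\dot\vartheta\cos\varphi - \dot\psi\sin\vartheta\sin\varphi)^2
   + (\dot\vartheta\sin\varphi + \dot\psi\sin\vartheta\cos\varphi)^2 \\
  &= \dot\vartheta^2(\cos^2\varphi + \sin^2\varphi)
   + \dot\psi^2\sin^2\vartheta\,(\sin^2\varphi + \cos^2\varphi) \\
  &= \dot\vartheta^2 + \dot\psi^2\sin^2\vartheta .
\end{align*}
```

The angle $\varphi$ drops out of this combination, which is why it does not appear explicitly in the kinetic energy of a symmetrical top.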




Since $L$ does not contain time explicitly, the energy is an integral of motion:

$$\mathcal{E} = T + U = \text{const}. \tag{9.20}$$


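We can find two more integrals of motion, noting that the angles $\varphi$ and $\psi$ do not appear explicitly in $L$ ($\varphi$ is eliminated only in the case of a symmetrical top). These integrals of motion are

$$p_\varphi = \frac{\partial L}{\partial \dot{\varphi}} = J_3(\dot{\varphi} + \dot{\psi}\cos\vartheta) = \text{const}, \tag{9.21}$$

$$p_\psi = \frac{\partial L}{\partial \dot{\psi}} = J_1\sin^2\vartheta\,\dot{\psi} + J_3\cos\vartheta\,(\dot{\varphi} + \dot{\psi}\cos\vartheta) = \text{const}. \tag{9.22}$$

If we express $\dot{\varphi}$ and $\dot{\psi}$ from equations (9.21) and (9.22) and substitute them into the energy integral, the latter will contain only the variable $\vartheta$, which allows us to reduce the problem to quadrature. Substituting (9.21) in (9.22), we obtain

$$p_\psi = J_1\sin^2\vartheta\,\dot{\psi} + p_\varphi\cos\vartheta,$$

whence

$$\dot{\psi} = \frac{p_\psi - p_\varphi\cos\vartheta}{J_1\sin^2\vartheta}.$$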

The energy integral, after substituting p, and p^ is 


(P'j'—Pep cos a)* 


2 sin® a 


++ mgricos^-. (9.23) 


Thus, the problem is reduced to motion with one degree of freedom $\vartheta$, as it were. The corresponding "kinetic energy" is $\tfrac{1}{2}J_1\dot\vartheta^2$, and the "potential energy" is represented by those energy terms which depend on $\vartheta$. This potential energy becomes infinite for $\vartheta = 0$ and $\vartheta = \pi$. Hence, for $0 < \vartheta < \pi$ it has at least one minimum. If this minimum corresponds to $\vartheta < \pi/2$, then the rotation of the top, whose centre of mass is above the point of support, is stable. Small oscillations are possible near the potential energy minimum. These oscillations are superimposed on the precessional motion of the top which we have already noted. They are called nutations.
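The location of the minimum can be found numerically. The following is a minimal sketch, assuming the energy integral in the form (9.23) with the $\dot\vartheta^2$ term removed; the numerical values of the momenta, moments of inertia and $mgl$ are made up purely for illustration, and the golden-section search is one convenient choice, not a method taken from the text.

```python
import math

def effective_potential(theta, p_psi, p_phi, J1, J3, mgl):
    """Energy (9.23) without the J1*theta_dot**2/2 term."""
    s = math.sin(theta)
    return ((p_psi - p_phi * math.cos(theta)) ** 2 / (2.0 * J1 * s * s)
            + p_phi ** 2 / (2.0 * J3)
            + mgl * math.cos(theta))

def find_minimum(f, a, b, tol=1e-10):
    """Golden-section search; assumes f has a single minimum on [a, b]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    while abs(b - a) > tol:
        c, d = b - g * (b - a), a + g * (b - a)
        if f(c) < f(d):
            b = d   # the minimum lies in [a, d]
        else:
            a = c   # the minimum lies in [c, b]
    return 0.5 * (a + b)

# Illustrative (hypothetical) parameter values
PARAMS = dict(p_psi=2.0, p_phi=3.0, J1=1.0, J3=0.5, mgl=1.0)

def u(theta):
    return effective_potential(theta, **PARAMS)

# Stay slightly away from theta = 0 and theta = pi, where the potential diverges
theta_min = find_minimum(u, 1e-3, math.pi - 1e-3)
print("minimum of the effective potential at theta =", theta_min)
```

For these values the minimum lies below $\pi/2$, i.e. with the centre of mass above the point of support.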


Sec. 10. General Principles of Mechanics

In this part of the book, mechanics is explained mainly through the use of Newton's equations (2.1). Going over to generalized coordinates, we obtain from them Lagrange's equations and a series of further deductions. In this section it will be shown that the system of Lagrange's equations can be obtained not only from Newton's Second Law, but also from a very simple assertion about the value of the integral




of the Lagrangian taken with respect to time. The basic laws of mechanics thus formulated are usually called integral principles.

The particular importance of these principles is that they allow us to understand, in a unified manner, the laws relating to various areas of theoretical physics (mechanics and electrodynamics), thus opening up a field for broad generalizations.

Action. For a certain mechanical system, let it be possible to define the Lagrangian

$$L = L(q, \dot q, t), \tag{10.1}$$

as dependent on the generalized coordinates $q$, velocities $\dot q$, and the time $t$. We shall consider that all the coordinates and all the velocities are independent. Let us choose some continuous, but otherwise arbitrary, dependence of the coordinates upon the time $q(t)$. The functions $q(t)$ can be in complete disagreement with the actual law of motion. The only requirement imposed on the functions $q(t)$ is that they should be smooth, i.e., differentiable, and should be consistent with the rigid constraints present in the system.

The time integral of the Lagrangian is called the action of the system:

$$S = \int_{t_0}^{t_1} L(q, \dot q, t)\,dt. \tag{10.2}$$

The magnitude of this integral depends upon the law chosen for $q(t)$, and is, in that sense, arbitrary. In order to examine the relationship between the action and the function $q(t)$, it is convenient to calculate the change of $S$ for a transition from some arbitrary law $q(t)$ to another, infinitely close but also arbitrary, law $q'(t)$.
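The dependence of the action on the chosen law $q(t)$ can be illustrated numerically. The sketch below assumes a free particle of unit mass with $L = \dot q^2/2$, samples two laws with the same end points on a uniform time grid, and approximates (10.2) by the midpoint rule; the function names and the particular paths are illustrative choices only.

```python
def action(path, t0, t1, lagrangian):
    """Midpoint-rule approximation to S = integral of L(q, q_dot, t) dt.

    `path` holds the coordinate q at equally spaced times from t0 to t1.
    """
    n = len(path) - 1
    dt = (t1 - t0) / n
    total = 0.0
    for k in range(n):
        q_mid = 0.5 * (path[k] + path[k + 1])      # coordinate in the middle of the step
        v = (path[k + 1] - path[k]) / dt           # velocity on the step
        t_mid = t0 + (k + 0.5) * dt
        total += lagrangian(q_mid, v, t_mid) * dt
    return total

def free_particle(q, v, t, m=1.0):
    """Lagrangian of a free particle (unit mass assumed)."""
    return 0.5 * m * v * v

n = 1000
times = [k / n for k in range(n + 1)]
straight = [t for t in times]                      # q(t) = t
bent = [t + 0.1 * t * (1.0 - t) for t in times]    # q'(t): same end points, different law

S_straight = action(straight, 0.0, 1.0, free_particle)
S_bent = action(bent, 0.0, 1.0, free_particle)
print(S_straight, S_bent)
```

The two laws give different values of $S$, the uniform motion giving the smaller one.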

Variation. Fig. 14 shows two such conceivable paths. Time is taken along the abscissa, and one of the generalized coordinates $q$, representing the totality of generalized coordinates, is plotted on the ordinate axis.

To make the following operations definite, we shall consider that both paths pass through the same points, $q_0$ and $q_1$, at the initial and final instants of time.

Fig. 14

The vertical arrow shows the difference between two conceivable, infinitely close paths at some instant of time other than initial or final. This difference is usually called the variation of $q$ and is denoted by $\delta q$. The symbol $\delta$ emphasizes the difference between the variation and the differential $d$; the differential is taken for the same path at various instants of time, while the variation is taken for the same instant of time between different paths.




Since the neighbouring paths in Fig. 14 have different forms, the speed of motion along them will also differ. Together with the variation of the coordinate $\delta q$ between paths, we can also find the variation in velocity $\delta\dot q$. We shall show that $\delta\dot q = \frac{d}{dt}\delta q$. Indeed, $\delta q = q'(t) - q(t)$, where $q'$ and $q$ are values of the coordinates for neighbouring paths. But the derivative of the difference, $\frac{d}{dt}\delta q$, is equal to the difference of the derivatives, $\dot q'(t) - \dot q(t) = \delta\dot q$.

Let us now find the variation of the Lagrangian, i.e., the difference of the function for two adjacent paths. Since $L = L(q, \dot q, t)$ and the variation is taken at the same instant of time, i.e., $\delta t = 0$, we obtain

$$\delta L = \frac{\partial L}{\partial q}\,\delta q + \frac{\partial L}{\partial\dot q}\,\delta\dot q. \tag{10.3}$$


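Let us rearrange the second term. Taking advantage of the fact that $\delta\dot q = \frac{d}{dt}\delta q$, we can write it thus:

$$\frac{\partial L}{\partial\dot q}\,\delta\dot q = \frac{\partial L}{\partial\dot q}\,\frac{d}{dt}\,\delta q = \frac{d}{dt}\left(\frac{\partial L}{\partial\dot q}\,\delta q\right) - \delta q\,\frac{d}{dt}\frac{\partial L}{\partial\dot q}.$$

The last equation simply expresses a transformation by parts. Substituting it into (10.3), we find

$$\delta L = \frac{d}{dt}\left(\frac{\partial L}{\partial\dot q}\,\delta q\right) + \left(\frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial\dot q}\right)\delta q. \tag{10.4}$$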


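The integral of the variation of $L$ is equal to the variation of action $\delta S$, since the difference between integrals taken between the same limits is equal to the integral of the difference between the integrands.

The first term in (10.4) can be integrated with respect to time, because it is a total derivative. The variation of action is then reduced to the form

$$\delta S = \left.\frac{\partial L}{\partial\dot q}\,\delta q\right|_{t_0}^{t_1} + \int_{t_0}^{t_1}\left(\frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial\dot q}\right)\delta q\,dt. \tag{10.5}$$

We have agreed to consider only those paths which pass through the same points, $q_0$ and $q_1$, at the initial and final instants of time. Hence, at these instants the variation $\delta q$ becomes zero by convention, and the integrated term disappears. The expression $\delta S$ is reduced to the following integral:

$$\delta S = \int_{t_0}^{t_1}\left(\frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial\dot q}\right)\delta q\,dt. \tag{10.6}$$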

The extremal property of action. If the chosen path coincides with the actual path of motion, the coordinates satisfy Lagrange's equation:




$$\frac{d}{dt}\frac{\partial L}{\partial\dot q} - \frac{\partial L}{\partial q} = 0. \tag{10.7}$$

Substituting this in (10.6), we see that the variation of action vanishes close to the actual path. The first-order change of a quantity is equal to zero either close to an extremum, or close to a "stationary point" (for example, the function $y = x^3$ has such a point at $x = 0$, where $y' = 0$, $y'' = 0$). Three cases can, in general, be realized: a minimum, a maximum and a stationary point.
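That the first-order change of the action vanishes on an actual path can be checked numerically. The sketch below assumes a harmonic oscillator with unit mass and frequency, $L = \dot q^2/2 - q^2/2$, whose actual path $q = \sin t$ is perturbed by $\varepsilon\sin(\pi t/T)$, a variation that vanishes at both ends. If the term linear in $\varepsilon$ is absent, halving $\varepsilon$ should reduce the change of action roughly fourfold. The oscillator, the perturbation and the step count are illustrative assumptions.

```python
import math

def oscillator_action(q, T, n=2000):
    """Midpoint-rule action for L = q_dot**2/2 - q**2/2 on the interval [0, T]."""
    dt = T / n
    S = 0.0
    for k in range(n):
        qa, qb = q(k * dt), q((k + 1) * dt)
        v = (qb - qa) / dt            # velocity on the step
        qm = 0.5 * (qa + qb)          # coordinate in the middle of the step
        S += (0.5 * v * v - 0.5 * qm * qm) * dt
    return S

T = 1.0
actual = math.sin                     # satisfies q_ddot = -q

def varied(eps):
    """Neighbouring path; the added term vanishes at t = 0 and t = T."""
    return lambda t: math.sin(t) + eps * math.sin(math.pi * t / T)

S0 = oscillator_action(actual, T)
dS_1 = oscillator_action(varied(0.10), T) - S0
dS_2 = oscillator_action(varied(0.05), T) - S0
ratio = dS_1 / dS_2
print(dS_1, dS_2, ratio)
```

Both changes come out positive here, so for this short time interval the actual path gives a minimum.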

For example, let a point, not subject to the action of any forces other than constraint reactions, move freely on a sphere. Then its path will be an arc of a great circle. But through any two points on the sphere there pass two arcs of a great circle representing the largest and smallest sections of the circumference. One corresponds to a maximum, and the other to a minimum, of $S$. If the beginning and end of the path are diametrically opposite, the result is a stationary point.

The principle of least action. We have proven, on the basis of Lagrange's equations, that $\delta S = 0$. We can proceed in a different way: by asserting that close to the actual path passing between the given initial and final positions of the system the increment of action is equal to zero, we can derive Lagrange's equations. Ordinarily, the action on an actual path is minimal, and therefore the assertion we have made is called the principle of least action. Action was written in the form (10.2) by Hamilton. Much earlier, the principle of least action was mathematically formulated by Euler for the special case of paths corresponding to constant energy.

For us, it is not essential that the action should be a minimum, but that it should be stationary, $\delta S = 0$.

Lagrange's equations are derived from the principle of least action by contradiction. We assume the right-hand side of equation (10.6) to be zero, $\delta S = 0$, and the variation $\delta q$ to be arbitrary. Then, if the expression inside the parentheses is not equal to zero, the sign of the variation $\delta q$ can always be chosen to be the same as for the quantity

$$\frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial\dot q},$$

because the variation is arbitrary. If, for example, the sign of the quantity $\frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial\dot q}$ changes several times along the path of integration, then the sign of $\delta q$ must also be changed accordingly at those points so that the integrand of (10.6) should everywhere be non-negative. But the integral of a non-negative function cannot equal zero unless the function is equal to zero everywhere. Therefore, $\delta S = 0$ only when $\frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial\dot q}$ becomes zero along the whole path of integration, for otherwise the variation $\delta q$ can be so chosen that $\delta S > 0$. We have shown that if we proceed from the principle of least action as a requirement for the motion along an actual path, then that path must satisfy Lagrange's equations.




The advantages of using action. The principle of least action may at 
first sight appear artificial or, in any case, less obvious than Newton’s 
laws, to whose form we are accustomed. For this reason we shall try 
to explain where its advantages lie. 

First of all, let it be noted that Lagrange's or Newton's equations are always associated with some coordinates whose choice is, to a significant extent, arbitrary. In addition, the choice of coordinate system, relative to which the motion is described, is also arbitrary. Yet the motion of particles along actual paths in a mechanical system expresses a certain set of facts which cannot depend on the arbitrary manner of their description. For example, if the motion leads to a collision of particles, that fact must always be represented in any description of the system.

But it is precisely the integral principle that is especially useful in a formulation of laws of motion not related to any definite choice of coordinates, the value of the integral between the given limits being independent of the choice of integration variables. The extremal property of an integral cannot be changed by the way in which it is calculated.

The integral principle $\delta S = 0$ is equivalent, purely mathematically, to Lagrange's equations (2.21). But in order to apply it to any actual system, we must have an explicitly expressed Lagrangian. It may be found from those physical requirements which should be imposed on an invariant law of motion that is independent of the choice of coordinate axes and the frame of reference.

As a result of the invariance of the principle of least action, we can 
consider the laws of mechanics in a very general form, and this, 
therefore, opens the way for further generalizations. 

The determinacy of the Lagrangian. Before finding an explicit form for the Lagrangian, we must ask: is the function we are looking for uniquely determined? We shall show that if we add to it the total time derivative of any function of coordinates and time, $f(q, t)$, then Lagrange's equations remain unchanged. This can be verified either by simple substitution into (10.7), or directly from the integral principle. Writing

$$L = L' + \frac{d}{dt} f(q, t), \tag{10.8}$$

we see that

$$\int_{t_0}^{t_1} L\,dt = \int_{t_0}^{t_1} L'\,dt + \int_{t_0}^{t_1}\frac{df}{dt}\,dt = \int_{t_0}^{t_1} L'\,dt + f\,\Big|_{t_0}^{t_1}. \tag{10.9}$$

The variations of $f$ appear in the variation of $S$ only at the limits of integration. But since we have arranged that $f$ depends on the coordinates and time, but not on the velocities, the variation of $f$ is expressed linearly in terms of the variations of the coordinates, and is zero at the limits of integration. Therefore,

$$\delta\int_{t_0}^{t_1} L\,dt = \delta\int_{t_0}^{t_1} L'\,dt. \tag{10.10}$$

Hence, the Lagrangian is determined only to within the total time derivative of a function of coordinates and time.
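The statement that adding a total derivative changes the action only by a boundary term can be checked on an arbitrary, non-physical path. The sketch below assumes $L' = \dot q^2/2 - q^2/2$ and a hypothetical $f(q, t) = q^2 t$; the path, the function $f$ and the midpoint integration are illustrative choices.

```python
import math

def integrate(g, t0, t1, n=20000):
    """Midpoint rule for the integral of g(t) from t0 to t1."""
    dt = (t1 - t0) / n
    return sum(g(t0 + (k + 0.5) * dt) for k in range(n)) * dt

def q(t):                      # an arbitrary smooth path, not a solution of any equation
    return t * t + math.sin(3.0 * t)

def v(t):                      # its time derivative
    return 2.0 * t + 3.0 * math.cos(3.0 * t)

def f(q_, t):                  # hypothetical function of coordinate and time
    return q_ * q_ * t

def df_dt(q_, v_, t):          # total time derivative of f along the path
    return 2.0 * q_ * v_ * t + q_ * q_

def L_prime(q_, v_, t):
    return 0.5 * v_ * v_ - 0.5 * q_ * q_

def L(q_, v_, t):              # Lagrangian (10.8)
    return L_prime(q_, v_, t) + df_dt(q_, v_, t)

t0, t1 = 0.0, 2.0
S = integrate(lambda t: L(q(t), v(t), t), t0, t1)
S_prime = integrate(lambda t: L_prime(q(t), v(t), t), t0, t1)
boundary = f(q(t1), t1) - f(q(t0), t0)
print(S - S_prime, boundary)
```

Since the difference depends only on the end points, it drops out of the variation when the ends are held fixed.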

Defining the form of the Lagrangian. We shall now formulate in more detail those requirements which the integral principle expressing the laws of mechanics must satisfy.

First of all we note that the form of this principle must be the same for different inertial systems, since all such systems are equivalent. This statement follows from the relativity principle (see Sec. 8). The essence of the relativity principle consists in the fact that the choice of an inertial coordinate system is arbitrary, while the physical consequences of the equations of motion cannot be arbitrary.

Similarly arbitrary is the choice of the origin and the initial instant of time and also the orientation of the coordinate axes in space.

It must, of course, be borne in mind that the form of action is by no means determined by speculation; this form represents no less a generalization of physical experience than the laws of Newton. However, the principle of least action best expresses the invariance of physical laws with respect to the method of their formulation. Quite naturally, the form of the invariance (in relation to rotations, translations, reflections, etc.) is itself a certain, very broad, generalization of experience, and must by no means be considered as a priori.

Considering now the problem of finding the form of the Lagrangian, let us first of all determine the action of a free particle in an inertial coordinate system.* In such a system, the particle moves uniformly in a straight line, i.e., with constant velocity. (This statement is based on the experimental fact that inertial systems exist in nature.) Thus, the Lagrangian for a free particle in an inertial system cannot contain any coordinate derivatives other than velocity.

By definition, a free particle is very far away from any other bodies with which it could interact. Therefore, its Lagrangian must not change its form upon displacement of the origin to any arbitrary point fixed in the given inertial system. In other words, the Lagrangian of such a particle does not depend explicitly on the coordinates.

In the same way, one can conclude that the Lagrangian does not depend explicitly on time.


* See L. D. Landau and E. M. Lifshits, Mechanics, Fizmatgiz, 1958.




The orientation of the coordinate axes is arbitrary as well as the choice of the origin. For the Lagrangian to be independent of the orientation of coordinate axes, it must be a scalar quantity.

To summarize, then, the Lagrangian is a scalar that depends only on the velocity of the free particle relative to the given inertial system. The only scalar quantity which can be formed from a vector is the absolute value of the vector. Therefore,

$$L = L(v^2).$$

The form of this function can be found from the relativity principle, in accordance with which the Lagrangian must not change with the transformation from one inertial system to another. In Newtonian mechanics, this transformation is effected with the aid of equations (8.1), (8.2), i.e., Galilean transformations. The Galilean transformations lead to the law of addition of velocities:

$$\mathbf{v} = \mathbf{v}' + \mathbf{V},$$

where $\mathbf{V}$ is the relative velocity of the inertial systems. Therefore, the Lagrangian must remain invariant with respect to Galilean transformations.

Since the Lagrangian is determined to within a total derivative, it is sufficient (for its invariance) for the following equality to be satisfied:

$$L = L(v^2) = L[(\mathbf{v}' + \mathbf{V})^2] = L(v'^2) + \frac{df}{dt}, \tag{10.11}$$

where the functions $L(v^2)$ and $L(v'^2)$ have the same form in accordance with the principle of relativity.

Any transformation (8.1), (8.2), in which the relative velocity $\mathbf{V}$ is finite, can be obtained by a set of infinitely small transformations applied successively. It is, therefore, sufficient to consider a transformation in which the relative velocity of the inertial systems $V$ is very much smaller than the particle velocity $v$. Then, to a very good approximation, the quantity $(\mathbf{v}' + \mathbf{V})^2$ is equal to

$$(\mathbf{v}' + \mathbf{V})^2 = v'^2 + 2\mathbf{v}'\mathbf{V},$$

where the term of the second order of smallness is discarded.

Expanding $L[(\mathbf{v}' + \mathbf{V})^2]$ in a series, we obtain, to the same approximation,

$$L[(\mathbf{v}' + \mathbf{V})^2] = L(v'^2) + \frac{\partial L}{\partial(v'^2)}\,2\mathbf{v}'\mathbf{V}.$$

Comparing this with (10.11), we find:

$$\frac{\partial L}{\partial(v'^2)}\,2\mathbf{v}'\mathbf{V} = \frac{\partial L}{\partial(v'^2)}\,2\mathbf{V}\,\frac{d\mathbf{r}'}{dt} = \frac{df}{dt}.$$




However, the expression on the left-hand side of the equation can be a total derivative of a function of coordinates only if $\frac{\partial L}{\partial(v'^2)}$ is independent of velocity. Introducing the notation

$$\frac{\partial L}{\partial(v'^2)} = \frac{m}{2},$$

we obtain

$$\frac{df}{dt} = \frac{d}{dt}\,(m\mathbf{V}\mathbf{r}'),$$

for, otherwise, $\frac{\partial L}{\partial(v'^2)}$ could not be put inside the derivative sign.

In this way we have shown that the Lagrangian for a free particle is equal to

$$L = \frac{mv^2}{2}. \tag{10.12}$$


The Lagrangian for a system of noninteracting particles is equal to the sum of the Lagrangians of these particles taken independently, since it is the only sum of quadratic expressions of the type (10.12) that changes by a total derivative when $\mathbf{v}_i = \mathbf{v}_i' + \mathbf{V}$ (where $i$ is the particle number) is substituted.
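The change of the quadratic Lagrangian (10.12) under a Galilean transformation can be verified numerically for a finite relative velocity. The sketch below works in one dimension and assumes that for finite $V$ the function under the derivative sign is $f = mVr' + mV^2t/2$, the second term being the second-order quantity discarded in the infinitesimal argument above. The path, mass and velocity are made-up values.

```python
import math

m, V = 2.0, 0.7                # illustrative mass and relative velocity of the frames

def r_prime(t):                # an arbitrary smooth path in the primed frame
    return math.sin(t) + 0.3 * t * t

def v_prime(t):                # its velocity
    return math.cos(t) + 0.6 * t

def lagrangian(v):             # free-particle Lagrangian (10.12)
    return 0.5 * m * v * v

def f(t):                      # candidate function: L(v) - L(v') should equal df/dt
    return m * V * r_prime(t) + 0.5 * m * V * V * t

def integrate(g, t0, t1, n=10000):
    """Midpoint rule for the integral of g(t) from t0 to t1."""
    dt = (t1 - t0) / n
    return sum(g(t0 + (k + 0.5) * dt) for k in range(n)) * dt

t0, t1 = 0.0, 3.0
difference = integrate(
    lambda t: lagrangian(v_prime(t) + V) - lagrangian(v_prime(t)), t0, t1)
boundary = f(t1) - f(t0)
print(difference, boundary)
```

The integrated difference of the two Lagrangians coincides with the change of $f$ between the end points, whatever path is taken.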

In order to write down L for a system of interacting particles, we 
must, of course, make certain physical assumptions about the nature 
of the interaction. 

1) The interaction does not depend on the particle velocities. This assumption is justified for gravitational and electrostatic forces, and is not justified for electromagnetic forces. It should, however, be noted that electromagnetic interactions involve ratios of the particle velocities to the velocity of light $c$ and therefore, to the approximation of Newtonian mechanics, they must be considered as negligibly small. The Lagrangian of Newtonian mechanics is not universal and is applicable only to a limited group of phenomena, when all $v_i \ll c$.

2) The interaction does not change the masses of the particles.

3) The interaction is invariant with respect to Galilean transformations.

From these conditions it can be seen that the interaction appears in the Lagrangian in the form of a scalar function determined only by the relative distribution of the particles:

$$L = \sum_i \frac{m_i v_i^2}{2} - U(\ldots, \mathbf{r}_i - \mathbf{r}_k, \ldots). \tag{10.13}$$

From this expression, we can find the conservation laws for energy, 
linear momentum, and angular momentum (see Sec. 4). 

The Hamiltonian function. We shall now use the principle of least 
action in order to transform a system of equations of motion to other 
variables. Namely, in place of coordinates and velocities we shall 




employ coordinates and momenta. Let us assume that velocities are eliminated from the relations

$$p = \frac{\partial L}{\partial\dot q}. \tag{10.14}$$

Since the Lagrangian depends quadratically on the velocities, equations (10.14) are linear in the velocities and can always be solved. We shall obtain for coordinates and momenta a more symmetrical system of equations than Lagrange's equations.

The passing from velocities to momenta was performed to some extent when we substituted the integrals of motion in the expression for energy, for example, in (5.4), (9.21), (9.22).

Now, in place of the velocities we shall introduce into the energy the momenta for all the degrees of freedom (and not only for the cyclic ones, i.e., those whose coordinates do not appear explicitly in $L$). Energy expressed in terms of coordinates and momenta only is called the Hamiltonian function of the system or, for short, the Hamiltonian:

$$\mathcal{E}[q, \dot q(p)] \equiv \mathcal{H}(q, p) = \dot q p - L. \tag{10.15}$$


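Thus, for example, if we replace $\dot\vartheta$ by $p_\vartheta/J_1$ in (9.23), we obtain the Hamiltonian for a symmetrical top:

$$\mathcal{H} = \frac{p_\vartheta^2}{2J_1} + \frac{(p_\psi - p_\varphi\cos\vartheta)^2}{2J_1\sin^2\vartheta} + \frac{p_\varphi^2}{2J_3} + mgl\cos\vartheta. \tag{10.16}$$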


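Hamilton's equations. In order to derive the required system of equations, we write the expression for the principle of least action, expressing $L$ in terms of $\mathcal{H}$:

$$\delta S = \delta\int_{t_0}^{t_1}(p\dot q - \mathcal{H})\,dt = 0. \tag{10.17}$$

Here it is assumed that $\dot q$ is expressed in terms of $p$ and $q$.

Let us calculate the variation $\delta S$:

$$\delta S = \int_{t_0}^{t_1}\left(\delta p\,\dot q + p\,\delta\dot q - \frac{\partial\mathcal{H}}{\partial p}\,\delta p - \frac{\partial\mathcal{H}}{\partial q}\,\delta q\right)dt = 0.$$

The second term inside the parentheses can be integrated by parts, similar to the way that it was done in (10.5). This gives

$$\delta S = p\,\delta q\,\Big|_{t_0}^{t_1} + \int_{t_0}^{t_1}\left[\delta p\left(\dot q - \frac{\partial\mathcal{H}}{\partial p}\right) - \delta q\left(\dot p + \frac{\partial\mathcal{H}}{\partial q}\right)\right]dt.$$

The integrated part becomes zero when the limits of integration have been substituted. The independent variables are now $p$ and $q$. The variation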




of $p$, as well as the variation of $q$, is completely arbitrary in sign. For $\delta S$ to be equal to zero, the following equations must be satisfied:

$$\dot q = \frac{\partial\mathcal{H}}{\partial p}, \qquad \dot p = -\frac{\partial\mathcal{H}}{\partial q}. \tag{10.18}$$

This system of equations is more symmetrical than Lagrange's equations. Instead of $\nu$ second-order Lagrangian equations, we have $2\nu$ first-order equations (10.18). They are called Hamilton's equations.
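Because equations (10.18) are of first order, they lend themselves directly to step-by-step numerical integration. The following sketch assumes a one-dimensional harmonic oscillator, $\mathcal{H} = p^2/2m + kq^2/2$, with unit mass and stiffness, and uses the leapfrog scheme; both the example system and the integration scheme are illustrative choices, not taken from the text.

```python
import math

def dH_dq(q, p, k=1.0):
    """Partial derivative of H = p**2/(2m) + k*q**2/2 with respect to q."""
    return k * q

def dH_dp(q, p, m=1.0):
    """Partial derivative of H with respect to p."""
    return p / m

def hamiltonian(q, p, m=1.0, k=1.0):
    return p * p / (2.0 * m) + 0.5 * k * q * q

def leapfrog(q, p, dt, steps):
    """Integrate q_dot = dH/dp, p_dot = -dH/dq by the leapfrog scheme."""
    for _ in range(steps):
        p -= 0.5 * dt * dH_dq(q, p)   # half step in momentum
        q += dt * dH_dp(q, p)         # full step in coordinate
        p -= 0.5 * dt * dH_dq(q, p)   # second half step in momentum
    return q, p

q0, p0 = 1.0, 0.0
dt, steps = 1e-3, 5000                # integrate up to t = 5
q1, p1 = leapfrog(q0, p0, dt, steps)
print(q1, p1, hamiltonian(q1, p1) - hamiltonian(q0, p0))
```

With these initial conditions the exact solution is $q = \cos t$, $p = -\sin t$, and the value of $\mathcal{H}$ stays constant to the accuracy of the scheme.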

Reducing the order with the aid of the energy integral. If $\mathcal{H}$ does not depend on time, we can exclude time completely from the equations by dividing all the equations (10.18), except one, by the said equation. Then we have

$$\frac{dp}{dq} = -\frac{\partial\mathcal{H}/\partial q}{\partial\mathcal{H}/\partial p}. \tag{10.19}$$

Here, for simplicity, this operation has been performed for a system with one degree of freedom. The integration of (10.19) yields one constant. The second constant will be determined by quadrature from the equation

$$\frac{dt}{dq} = \frac{1}{\partial\mathcal{H}/\partial p}, \tag{10.20}$$

where $\partial\mathcal{H}/\partial p$ is a certain function of $q$ which can be obtained by integrating (10.19). The constant of integration in (10.20) is the initial instant of time.


The connection between momentum and action. We shall now show that if action is calculated for the actual paths of a system, then momentum can be very simply expressed in terms of this action. For this we shall consider the change in action when the ends of the integration interval are displaced along the actual paths. From (10.7), the expression under the integral sign in (10.6) is equal to zero on such paths. But the integrated part does not become zero; only the variations in it must be replaced by differentials, since we are considering the displacement of the ends of the integration interval along given paths. Therefore,

$$dS = \frac{\partial L}{\partial\dot q}\,dq - \left(\frac{\partial L}{\partial\dot q}\right)_0 dq_0 = p\,dq - p_0\,dq_0, \tag{10.21}$$

in agreement with the definition of momentum (4.13).

But action calculated along an actual path is uniquely determined by its initial and final points, $S = S(q_0, q)$. So

$$dS = \frac{\partial S}{\partial q_0}\,dq_0 + \frac{\partial S}{\partial q}\,dq. \tag{10.22}$$




Comparing (10.21) and (10.22), we obtain the very important relationship between momentum and action

$$p = \frac{\partial S}{\partial q},$$

which is very essential for the formulation of quantum mechanics.
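The relationship can be checked on the simplest case. For a free particle the actual path is uniform motion, so the action between $(q_0, t_0)$ and $(q, t)$ is $S = m(q - q_0)^2/2(t - t_0)$. The sketch below differentiates this expression numerically with respect to the end points; the numerical values and the finite-difference step are illustrative assumptions.

```python
def free_particle_action(q0, q, t0, t, m=1.0):
    """Action along the actual (uniform) path of a free particle from (q0, t0) to (q, t)."""
    return m * (q - q0) ** 2 / (2.0 * (t - t0))

m, q0, q, t0, t = 2.0, 0.5, 3.5, 0.0, 1.5
h = 1e-6                              # finite-difference step

# Central differences with respect to the final and the initial coordinate
dS_dq = (free_particle_action(q0, q + h, t0, t, m)
         - free_particle_action(q0, q - h, t0, t, m)) / (2.0 * h)
dS_dq0 = (free_particle_action(q0 + h, q, t0, t, m)
          - free_particle_action(q0 - h, q, t0, t, m)) / (2.0 * h)

momentum = m * (q - q0) / (t - t0)    # p = m v, the same at both ends for a free particle
print(dS_dq, dS_dq0, momentum)
```

The derivative with respect to the final coordinate reproduces $p$, and the derivative with respect to the initial coordinate reproduces $-p_0$, as (10.21) requires.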


Exercise 

Write down the Hamiltonian and Hamilton’s equations for a particle 
in a central field. 


PART II 
ELECTRODYNAMICS 

Sec. 11. Vector Analysis 

The equations of electrodynamics gain considerably in conciseness and vividness if they are written in vector notation. In vector notation, the arbitrariness associated with the choice of one or another coordinate system disappears, and the physical content of the equations becomes more apparent.

We have assumed that the reader is acquainted with the elements of vector algebra, such as the definition of a vector and the various forms of vector products. However, in electrodynamics, vector differential operations are also used. This section is devoted to a definition of vector differential operations and to proofs of their fundamental properties, which will be needed later.

The vector of an area. We first of all give a definition of the vector of an elementary area $d\mathbf{s}$. This is a vector in the direction of the normal to the area, numerically equal to its surface and related to the direction of traverse of the contour bounding the area (Fig. 15).

Fig. 15 Fig. 16

We shall make use of a right-handed coordinate system $x$, $y$, $z$, in which, if we look from the direction of the $z$-axis, the $x$-axis is rotated towards the $y$-axis in an anticlockwise sense (Fig. 16). In this system,



the vector area can be resolved into components which are expressed thus:

$$ds_x = dy\,dz, \quad ds_y = dz\,dx, \quad ds_z = dx\,dy.$$


Vector flux. Now suppose that a liquid of density 1 ("water") flows through the area, the flow velocity being represented by the vector $\mathbf{v}$. We shall call the angle between $\mathbf{v}$ and $d\mathbf{s}$, $\alpha$. Fig. 17 shows the flow lines of the liquid passing through $ds$.

They are parallel to the velocity $\mathbf{v}$. Let us calculate the amount of liquid that passes through the area $ds$ every second. Obviously, it is equal to the amount that passes through the area $ds'$, placed perpendicular to the flux and intersected by the same flow lines as pass through $ds$. This quantity is simply equal to $v\,ds'$, because every second a liquid cylinder of base $ds'$ and height $v$ passes through the area $ds'$. But $ds' = ds\cos\alpha$, whence the quantity of liquid we are concerned with is

$$dJ = v\,ds' = v\,ds\cos\alpha = \mathbf{v}\,d\mathbf{s}. \tag{11.1}$$

By analogy, the scalar product of any vector $\mathbf{A}$ (taken at the point of infinitesimal area) by $d\mathbf{s}$ is called the flux of the vector $\mathbf{A}$ across the area $d\mathbf{s}$. Similar to the way that the flow of liquid across a finite area $s$ is equal to the integral of $dJ$ with respect to the surface,

$$J = \int \mathbf{v}\,d\mathbf{s}, \tag{11.2}$$

the integral

$$J = \int \mathbf{A}\,d\mathbf{s} \tag{11.3}$$

is called the flow (flux) of the vector $\mathbf{A}$ across any area.

The area vector is introduced so that we can make use of the convenient coordinate-free notation of (11.3). The integrals appearing in (11.3) are double. In terms of projections, (11.3) can be written

$$J = \int \mathbf{A}\,d\mathbf{s} = \iint A_x\,dy\,dz + \iint A_y\,dz\,dx + \iint A_z\,dx\,dy,$$

where the limits of the double integrals are determined from the corresponding projections, onto the coordinate planes, of the contour bounding the surface.

The Gauss-Ostrogradsky theorem. Let us now calculate the vector flux through a closed surface. For this we shall consider, first of all, the infinitesimal closed surface of a parallelepiped (Fig. 18). We shall make the convention that the normal to the closed surface will always be taken outwards from the volume.




Let us calculate the flux of the vector $\mathbf{A}$ across the area $ABCD$ (the direction of traverse being in agreement with the direction of the normal). Since the flux is equal to the scalar product of $\mathbf{A}$ by the vector area $ABCD$, which is directed in the negative $x$-direction (and hence equal to $-dy\,dz$), we obtain for this infinitely small area

$$dJ_{ABCD} = -A_x(x)\,dy\,dz.$$

We get a similar expression for the area $A'B'C'D'$, only in this case the projection $ds_x$ is equal to $dy\,dz$, and $A_x$ is taken at the point $x + dx$ instead of $x$. And so

$$dJ_{A'B'C'D'} = A_x(x + dx)\,dy\,dz.$$

Thus the resultant flux through both areas, perpendicular to the $x$-axis, is

$$dJ_{A'B'C'D'} + dJ_{ABCD} = [A_x(x + dx) - A_x(x)]\,dy\,dz = \frac{\partial A_x}{\partial x}\,dx\,dy\,dz. \tag{11.4}$$

We have utilized the fact that $dx$ is an infinitely small quantity, and we have expanded $A_x(x + dx)$ in a series. The resultant fluxes across the boundaries perpendicular to the $y$ and $z$ axes are formed similarly. The resultant flux across the whole parallelepiped is

$$dJ = \left(\frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z}\right)dx\,dy\,dz. \tag{11.5}$$

A finite closed volume can be divided into small parallelepipeds, and the relationship (11.5) applied to each one of them separately. If we sum all the fluxes, the adjacent boundaries do not give any contribution, since the flux emerging from one parallelepiped enters the neighbouring one. Only the fluxes through the outer surface of the selected volume remain, since they are not cancelled by others. But the right-hand sides of (11.5) will be additive for all the elementary volumes $dV = dx\,dy\,dz$, yielding the very important integral theorem:

$$\oint \mathbf{A}\,d\mathbf{s} = \int\left(\frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z}\right)dV. \tag{11.6}$$

It is called the Gauss-Ostrogradsky theorem.
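The theorem can be verified numerically on a simple volume. The sketch below assumes a unit cube and a sample field $\mathbf{A} = (x^2,\ xy,\ yz)$ chosen purely for illustration; the surface integral is taken with outward normals by the midpoint rule, and the volume integrand is evaluated by central differences.

```python
def A(x, y, z):
    """A sample vector field (chosen only for illustration)."""
    return (x * x, x * y, y * z)

def divergence(F, x, y, z, h=1e-5):
    """Sum of dAx/dx, dAy/dy, dAz/dz evaluated by central differences."""
    dAx = (F(x + h, y, z)[0] - F(x - h, y, z)[0]) / (2 * h)
    dAy = (F(x, y + h, z)[1] - F(x, y - h, z)[1]) / (2 * h)
    dAz = (F(x, y, z + h)[2] - F(x, y, z - h)[2]) / (2 * h)
    return dAx + dAy + dAz

def flux_through_unit_cube(F, n=20):
    """Outward flux of F through the surface of the cube 0 <= x, y, z <= 1."""
    d = 1.0 / n
    mids = [(i + 0.5) * d for i in range(n)]
    total = 0.0
    for u in mids:
        for w in mids:
            total += (F(1.0, u, w)[0] - F(0.0, u, w)[0]) * d * d  # faces x = 1 and x = 0
            total += (F(u, 1.0, w)[1] - F(u, 0.0, w)[1]) * d * d  # faces y = 1 and y = 0
            total += (F(u, w, 1.0)[2] - F(u, w, 0.0)[2]) * d * d  # faces z = 1 and z = 0
    return total

def volume_integral_of_divergence(F, n=20):
    """Midpoint-rule integral of the divergence over the unit cube."""
    d = 1.0 / n
    mids = [(i + 0.5) * d for i in range(n)]
    return sum(divergence(F, x, y, z) for x in mids for y in mids for z in mids) * d ** 3

surface_side = flux_through_unit_cube(A)
volume_side = volume_integral_of_divergence(A)
print(surface_side, volume_side)
```

For this field both sides equal 2, as can also be confirmed by hand.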

The divergence of a vector. The expression appearing on the right-hand side under the integral sign can be written down in a much shorter form. We first of all notice that it is a scalar expression, since there is a scalar on the left-hand side in (11.6) and $dV$ is also




a scalar. This expression is called the divergence of the vector $\mathbf{A}$ and is written thus:

$$\operatorname{div}\mathbf{A} = \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z}. \tag{11.7}$$

The divergence can be defined independently of any coordinate system, if (11.6) is used. Indeed, from (11.6) the definition for divergence follows as

$$\operatorname{div}\mathbf{A} = \lim_{V\to 0}\frac{\oint \mathbf{A}\,d\mathbf{s}}{V}. \tag{11.8}$$

The divergence of a vector at a given point is equal to the limit of the ratio of the vector flux through the surface surrounding the point to the volume enveloped by the surface, when the surface is contracted into the point.

Let us suppose that the vector $\mathbf{A}$ denotes the velocity field of some fluid. Then, from the definition (11.8), it can be seen that the divergence of the vector $\mathbf{A}$ is a measure of the density of the sources of the fluid, for it is obvious that the more sources there are in unit volume, the more fluid will flow out of the closed volume. If $\operatorname{div}\mathbf{A}$ is negative, we can speak of the density of sinks. But it is more convenient to define the source density with arbitrary sign. We note that from (11.7) there follows the quantity

$$\operatorname{div}\mathbf{r} = \frac{\partial x}{\partial x} + \frac{\partial y}{\partial y} + \frac{\partial z}{\partial z} = 3, \tag{11.9}$$

since $\mathbf{r}$ has components $x$, $y$, $z$.

Contour integrals. We shall now consider the integral of a vector over a closed contour, having the following form:

$$C = \oint \mathbf{A}\,d\mathbf{l} = \oint (A_x\,dx + A_y\,dy + A_z\,dz). \tag{11.10}$$

This single integral is called the circulation of the vector over the given contour. For example, if $\mathbf{A}$ is the force acting on any particle, then $\mathbf{A}\,d\mathbf{l} = A\,dl\cos\alpha$ is the work done by the force on the contour element $d\mathbf{l}$, and $C$ is the work performed in covering the whole contour.

Stokes' theorem. We shall now prove that the circulation of the vector $\mathbf{A}$ around the contour can be replaced by the surface integral "pulled over" the contour.

Fig. 19

Let us consider the projection of an infinitely small rectangular contour onto the plane $yz$. Let this projection also have the form of a rectangle shown in Fig. 19. We shall calculate the circulation of $\mathbf{A}$ around




this rectangle. The side $AB$ contributes a component $A_y(z)\,dy$ and side $CD$ the component $-A_y(z + dz)\,dy$, where the minus sign must be written because the direction of the vector $CD$ is opposite to that of the vector $AB$. We obtain, for the sum due to the sides $AB$ and $CD$,

$$-A_y(z + dz)\,dy + A_y(z)\,dy = -\frac{\partial A_y}{\partial z}\,dy\,dz$$

(we have expanded $A_y(z + dz)$ in a series in $dz$), while for the sides $BC$ and $DA$,

$$A_z(y + dy)\,dz - A_z(y)\,dz = \frac{\partial A_z}{\partial y}\,dy\,dz.$$

The resultant value for circulation in the $yz$-plane is

$$dC = \left(\frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z}\right)dy\,dz = B_x\,ds_x. \tag{11.11}$$


The notation $B_x$ is clear from the equation. Let us now find out what meaning this expression has. From the definition of (11.10), circulation is a scalar quantity and, hence, on the right side of equation (11.11) there must also be a scalar quantity. If the contour lies in the plane $yz$, this quantity is of the form $dC = B_x\,ds_x$; consequently, for an arbitrary orientation of the contour, the relationship (11.11) must have the form of the scalar product

$$dC = B_x\,ds_x + B_y\,ds_y + B_z\,ds_z = \mathbf{B}\,d\mathbf{s}, \tag{11.12}$$

where $B_x$, $B_y$, $B_z$ must necessarily be the components of a vector, since, otherwise, $dC$ could not be a scalar. From (11.11),

$$B_x = \frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z}. \tag{11.13}$$


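In order to find the circulation for infinitely small contours in the $zx$ and $xy$ planes, it is sufficient to perform a cyclic permutation of the indices $x$, $y$, $z$. This permutation yields the components $B_y$, $B_z$:

$$B_y = \frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}, \tag{11.14}$$

$$B_z = \frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}. \tag{11.15}$$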


The vector B has a special name: it is called the rotation or curl 
of the vector A and is denoted thus: 


B=rot A. 

rot A is expressed in terms of unit vectors i, j, k, directed along 
the coordinate axes: 




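$$\mathbf{B} = \operatorname{rot}\mathbf{A} = \mathbf{i}\left(\frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z}\right) + \mathbf{j}\left(\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}\right) + \mathbf{k}\left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right). \tag{11.16}$$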

Changing to the notation (11.16), we see that the component of $\operatorname{rot}\mathbf{A}$ normal to the area appears in equation (11.11):

$$\oint \mathbf{A}\,d\mathbf{l} = \operatorname{rot}_n\mathbf{A}\,ds, \tag{11.17}$$

where the subscript $n$ of $\operatorname{rot}\mathbf{A}$ indicates that we must take the projection of $\operatorname{rot}\mathbf{A}$ normal to the area, i.e., coinciding with the vector $d\mathbf{s}$. (11.17) permits us to define $\operatorname{rot}\mathbf{A}$ in a coordinate-free manner, similar to the way that we defined $\operatorname{div}\mathbf{A}$ in (11.8), namely:

$$\operatorname{rot}_n\mathbf{A} = \lim_{s\to 0}\frac{\oint \mathbf{A}\,d\mathbf{l}}{s}, \tag{11.18}$$

or the projection of $\operatorname{rot}\mathbf{A}$, normal to the area at the given point, is the limit of the ratio of the circulation of $\mathbf{A}$, over the contour of the area, to the magnitude of the area when the contour is contracted into the point.

So that the integral $\oint \mathbf{A}\,d\mathbf{l}$ should not become zero, we must have closed vector lines, to some extent following the integration contour, which lines are similar to the closed lines of flow in a liquid during vortex motion. Hence the term curl, or rotation.

If the circulation is calculated from a finite contour then the contour 
can be broken up into infinitely small cells to form a grid. For the 
sides of adjacent cells, the circulations mutually cancel since each 
side is traversed twice in opposite directions; oidy the circulation 
along the external contour itself remains. The integral on the right- 
hand side of equation (11.17) gives the flux of rot A across the surface 
“pulled over” the contour. Thus, we obtain the desired integral 
theorem 

|'Adl=JrotAds, (11.19) 


which is called Stokes’ theorem. 
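Stokes’ theorem (11.19) can be checked numerically on a simple case. The sketch below is only an illustration: the test field A = (-y², xy, 0), the unit square in the xy plane, and the function names are our own choices, not taken from the text. For this field the normal component of rot A is 3y, so both sides should come out close to 3/2.

```python
def A(x, y):
    # Illustrative test field (an assumption of this sketch): A = (-y^2, x*y, 0).
    return (-y * y, x * y)

def rot_n(x, y):
    # Normal (z) component of rot A: dAy/dx - dAx/dy = y + 2y = 3y.
    return 3.0 * y

def circulation(n=2000):
    """Circulation of A counter-clockwise around the unit square (midpoint rule)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += A(t, 0.0)[0] * h   # bottom side, dl = (+h, 0)
        total += A(1.0, t)[1] * h   # right side,  dl = (0, +h)
        total -= A(t, 1.0)[0] * h   # top side,    dl = (-h, 0)
        total -= A(0.0, t)[1] * h   # left side,   dl = (0, -h)
    return total

def flux_of_rot(n=200):
    """Flux of rot A through the unit square (midpoint rule on an n x n grid)."""
    h = 1.0 / n
    return sum(rot_n((i + 0.5) * h, (j + 0.5) * h) * h * h
               for i in range(n) for j in range(n))

left_side = circulation()
right_side = flux_of_rot()
```

The grid of small cells used for the flux mirrors the argument above: the circulations over interior cell sides cancel, and only the outer contour contributes.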

Differentiation along a radius vector. The divergence and rotation 
of a vector are its derivatives with respect to the vector argument. 
They can be reduced to a unified notation by means of the following. 
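We introduce the vector symbol $\nabla$ (nabla*) with components

$$\nabla_x = \frac{\partial}{\partial x}, \qquad \nabla_y = \frac{\partial}{\partial y}, \qquad \nabla_z = \frac{\partial}{\partial z}. \qquad (11.20)$$

* Nabla is an ancient musical instrument of triangular shape. This symbol
is also called del.

Then, from (11.7), we obtain for the divergence of A:

$$\operatorname{div}\mathbf{A} = \nabla_x A_x + \nabla_y A_y + \nabla_z A_z = (\nabla\mathbf{A}), \qquad (11.21)$$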

i.e., a scalar product of nabla and A. 

From (11.16), we have for the rotation

$$\operatorname{rot}\mathbf{A} \equiv \mathbf{i}\,(\nabla_y A_z - \nabla_z A_y) + \mathbf{j}\,(\nabla_z A_x - \nabla_x A_z) + \mathbf{k}\,(\nabla_x A_y - \nabla_y A_x) \equiv [\nabla\mathbf{A}]. \qquad (11.22)$$

We use the identity symbol $\equiv$ here in order to emphasize the fact
that we are simply dealing with a new system of notation. We shall
see, however, that this system is very convenient in vector analysis.
We note, with reference to algebraic operations, that nabla is in
all cases similar to a conventional vector. We shall use the expression
“multiplication by nabla” if, when nabla operates on any expression,
that expression is differentiated. Sometimes, nabla is multiplied by
a vector without operating on it as a derivative. In that case it is
applied to another vector [see (11.30), (11.32)].

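Gradient. If we operate with $\nabla$ on a scalar $\varphi$, we obtain a vector
which is called the gradient of the scalar $\varphi$:

$$\operatorname{grad}\varphi \equiv \nabla\varphi = \mathbf{i}\,\frac{\partial\varphi}{\partial x} + \mathbf{j}\,\frac{\partial\varphi}{\partial y} + \mathbf{k}\,\frac{\partial\varphi}{\partial z}. \qquad (11.23)$$

Its components are:

$$\operatorname{grad}_x\varphi = \frac{\partial\varphi}{\partial x}, \qquad \operatorname{grad}_y\varphi = \frac{\partial\varphi}{\partial y}, \qquad \operatorname{grad}_z\varphi = \frac{\partial\varphi}{\partial z}. \qquad (11.24)$$

From equations (11.24), it can be seen that the vector $\nabla\varphi$ is
perpendicular to the surface $\varphi$ = const. Indeed, if we take a vector $d\mathbf{l}$
lying on this surface, then, in a displacement $d\mathbf{l}$, $\varphi$ does not change.
This is written as

$$d\varphi = \frac{\partial\varphi}{\partial x}\,dl_x + \frac{\partial\varphi}{\partial y}\,dl_y + \frac{\partial\varphi}{\partial z}\,dl_z = (\nabla\varphi\,d\mathbf{l}) = 0, \qquad (11.25)$$

i.e., $\nabla\varphi$ is perpendicular to any vector which lies in the plane tangential
to the surface $\varphi$ = const at the given point, which accords with our
assertion.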

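Differentiation of products. We now give the rules governing
differential operations with $\nabla$.

First of all, the gradient of the product of two scalars is calculated
as the derivative of a product:

$$\nabla(\varphi\psi) = \varphi\,\nabla\psi + \psi\,\nabla\varphi. \qquad (11.26)$$

The divergence of a product of a scalar with a vector is calculated
thus:

$$\operatorname{div}\varphi\mathbf{A} = (\nabla_\varphi\,\varphi\mathbf{A}) + (\nabla_A\,\varphi\mathbf{A}) = (\mathbf{A}\nabla\varphi) + \varphi\,(\nabla\mathbf{A}) = \mathbf{A}\operatorname{grad}\varphi + \varphi\operatorname{div}\mathbf{A}. \qquad (11.27)$$

Here the indices $\varphi$ and A attached to $\nabla$ show what $\nabla$ is applied to.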



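We find the rotation of $\varphi\mathbf{A}$ in a similar manner:

$$\operatorname{rot}\varphi\mathbf{A} = [\nabla_\varphi,\,\varphi\mathbf{A}] + [\nabla_A,\,\varphi\mathbf{A}] = [\operatorname{grad}\varphi,\,\mathbf{A}] + \varphi\operatorname{rot}\mathbf{A}. \qquad (11.28)$$

Now we shall operate with $\nabla$ on the product of two vectors:

$$\operatorname{div}[\mathbf{A}\mathbf{B}] = (\nabla[\mathbf{A}\mathbf{B}]) = (\nabla_A[\mathbf{A}\mathbf{B}]) + (\nabla_B[\mathbf{A}\mathbf{B}]).$$

We perform a cyclic permutation in both terms, since $\nabla$ can be
treated in the same way as an ordinary vector. In addition, we put B
after $\nabla_B$ in the second term, and here, as usual, we must change
the sign of the vector product. The result is

$$\operatorname{div}[\mathbf{A}\mathbf{B}] = (\mathbf{B}[\nabla_A\mathbf{A}]) - (\mathbf{A}[\nabla_B\mathbf{B}]) = \mathbf{B}\operatorname{rot}\mathbf{A} - \mathbf{A}\operatorname{rot}\mathbf{B}. \qquad (11.29)$$

Let us find the rotation of a vector product. Here we must use the
relationship [A[BC]] = B (AC) - C (AB):

$$\operatorname{rot}[\mathbf{A}\mathbf{B}] = [\nabla_A[\mathbf{A}\mathbf{B}]] + [\nabla_B[\mathbf{A}\mathbf{B}]] = (\nabla_A\mathbf{B})\,\mathbf{A} - (\nabla_A\mathbf{A})\,\mathbf{B} + (\nabla_B\mathbf{B})\,\mathbf{A} - (\mathbf{A}\nabla_B)\,\mathbf{B} = (\mathbf{B}\nabla)\,\mathbf{A} - \mathbf{B}\operatorname{div}\mathbf{A} + \mathbf{A}\operatorname{div}\mathbf{B} - (\mathbf{A}\nabla)\,\mathbf{B}. \qquad (11.30)$$

Here we note the new symbols $(\mathbf{B}\nabla)$ and $(\mathbf{A}\nabla)$ operating on the
vectors A and B. Obviously, $(\mathbf{A}\nabla)$ and $(\mathbf{B}\nabla)$ are symbolic scalars,
equal, by definition of $\nabla$, to

$$(\mathbf{A}\nabla) = A_x\nabla_x + A_y\nabla_y + A_z\nabla_z = A_x\frac{\partial}{\partial x} + A_y\frac{\partial}{\partial y} + A_z\frac{\partial}{\partial z}, \qquad (11.31)$$

and similarly for $(\mathbf{B}\nabla)$. Then, $(\mathbf{A}\nabla)\,\mathbf{B}$ is a vector which is obtained
by application of the operation (11.31) to all the components of B.
Of the operations of this kind, we have yet to calculate grad (AB):

$$\operatorname{grad}(\mathbf{A}\mathbf{B}) = \nabla_A(\mathbf{A}\mathbf{B}) + \nabla_B(\mathbf{A}\mathbf{B}).$$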

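We use the same transformation as in the preceding case:

$$\operatorname{grad}(\mathbf{A}\mathbf{B}) = (\mathbf{B}\nabla_A)\,\mathbf{A} + [\mathbf{B}[\nabla_A\mathbf{A}]] + (\mathbf{A}\nabla_B)\,\mathbf{B} + [\mathbf{A}[\nabla_B\mathbf{B}]] = (\mathbf{B}\nabla)\,\mathbf{A} + (\mathbf{A}\nabla)\,\mathbf{B} + [\mathbf{B}\operatorname{rot}\mathbf{A}] + [\mathbf{A}\operatorname{rot}\mathbf{B}]. \qquad (11.32)$$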
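The product rules (11.29) and (11.30) can be spot-checked with finite differences. The following is a minimal sketch under our own assumptions: the two polynomial fields, the sample point, and all helper names are hypothetical and serve only as an illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def partial(F, p, axis, h=1e-4):
    """Central-difference derivative of the vector field F along one axis."""
    hi, lo = list(p), list(p)
    hi[axis] += h
    lo[axis] -= h
    return tuple((a - b) / (2 * h) for a, b in zip(F(tuple(hi)), F(tuple(lo))))

def div(F, p):
    return sum(partial(F, p, i)[i] for i in range(3))

def rot(F, p):
    dx, dy, dz = (partial(F, p, i) for i in range(3))
    return (dy[2] - dz[1], dz[0] - dx[2], dx[1] - dy[0])

def along(v, F, p):
    """(v nabla) F: the operation (11.31) applied to every component of F."""
    d = [partial(F, p, i) for i in range(3)]
    return tuple(sum(v[i] * d[i][k] for i in range(3)) for k in range(3))

# Illustrative fields (assumptions of this sketch).
def A(p):
    x, y, z = p
    return (y * z, x * x, z * x)

def B(p):
    x, y, z = p
    return (x, y * z, x * y)

def AxB(p):
    return cross(A(p), B(p))

p = (0.3, -0.7, 1.1)

# (11.29): div [AB] = B rot A - A rot B
lhs_div = div(AxB, p)
rhs_div = dot(B(p), rot(A, p)) - dot(A(p), rot(B, p))

# (11.30): rot [AB] = (B nabla) A - B div A + A div B - (A nabla) B
lhs_rot = rot(AxB, p)
t1, t4 = along(B(p), A, p), along(A(p), B, p)
dA, dB = div(A, p), div(B, p)
rhs_rot = tuple(t1[k] - B(p)[k] * dA + A(p)[k] * dB - t4[k] for k in range(3))
```

Agreement at one point is of course not a proof, but it is a quick way to catch a sign error when applying these rules by hand.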


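Certain special formulae. We note certain essential cases of
operations involving $\nabla$.

Using the definition of divergence (11.7) together with (11.27)
and (11.9), we obtain

$$\operatorname{div}\frac{\mathbf{r}}{r^3} = \frac{1}{r^3}\operatorname{div}\mathbf{r} + \mathbf{r}\operatorname{grad}\frac{1}{r^3} = \frac{3}{r^3} - \frac{3}{r^3} = 0. \qquad (11.33)$$

Further,

$$\operatorname{rot}_x\mathbf{r} = \frac{\partial z}{\partial y} - \frac{\partial y}{\partial z} = 0,$$

and in general

$$\operatorname{rot}\mathbf{r} = 0. \qquad (11.34)$$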




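We now take

$$(\mathbf{A}\nabla)\,x = A_x\frac{\partial x}{\partial x} + A_y\frac{\partial x}{\partial y} + A_z\frac{\partial x}{\partial z} = A_x,$$

and for all components of r at once

$$(\mathbf{A}\nabla)\,\mathbf{r} = \mathbf{A}. \qquad (11.35)$$

In addition, we apply $\nabla$ to a vector depending only on the absolute
value of the radius vector. We note first of all that

$$\frac{\partial r}{\partial x} = \frac{\partial}{\partial x}\sqrt{x^2 + y^2 + z^2} = \frac{x}{\sqrt{x^2 + y^2 + z^2}} = \frac{x}{r}$$

[cf. (3.3), where 1/r is differentiated], so that

$$\nabla r = \frac{\mathbf{r}}{r}. \qquad (11.36)$$

Using the rule for differentiating a function of a function, we have

$$\operatorname{div}\mathbf{A}(r) = \left(\frac{d\mathbf{A}}{dr}\,\nabla r\right) = \frac{(\mathbf{r}\dot{\mathbf{A}})}{r}. \qquad (11.37)$$

Here $\dot{\mathbf{A}}$ is the total derivative of A (r) with respect to the argument r,
i.e., a vector whose components are the derivatives of the three
components of A (r) with respect to r: $\dot A_x$, $\dot A_y$, $\dot A_z$.

Further,

$$\operatorname{rot}\mathbf{A}(r) = [\nabla r,\,\dot{\mathbf{A}}] = \frac{[\mathbf{r}\dot{\mathbf{A}}]}{r}. \qquad (11.38)$$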

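Repeated differentiation. Let us investigate certain results concerning
repeated operations with $\nabla$.

The rotation of the gradient of any scalar is equal to zero:

$$\operatorname{rot}\operatorname{grad}\varphi = [\nabla,\,\nabla\varphi] = [\nabla\nabla]\,\varphi = 0, \qquad (11.39)$$

since the vector product of any vector (including $\nabla$) by itself is equal
to zero. This can also be seen by expanding rot grad $\varphi$ in terms of its
components. The divergence of a rotation is also equal to zero:

$$\operatorname{div}\operatorname{rot}\mathbf{A} = (\nabla[\nabla\mathbf{A}]) = ([\nabla\nabla]\,\mathbf{A}) = 0. \qquad (11.40)$$

Let us write down the divergence of the gradient of a scalar $\varphi$ in
component form. From equations (11.7) and (11.24) we have

$$\operatorname{div}\operatorname{grad}\varphi = (\nabla\nabla)\,\varphi = \frac{\partial^2\varphi}{\partial x^2} + \frac{\partial^2\varphi}{\partial y^2} + \frac{\partial^2\varphi}{\partial z^2} \equiv \Delta\varphi. \qquad (11.41)$$

Here $\Delta$ (delta) is the so-called Laplacian operator, or Laplacian:

$$\Delta \equiv \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}.$$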


Finally, the rotation of a rotation can be expanded as a double
vector product:

$$\operatorname{rot}\operatorname{rot}\mathbf{A} = [\nabla[\nabla\mathbf{A}]] = \nabla(\nabla\mathbf{A}) - (\nabla\nabla)\,\mathbf{A} = \operatorname{grad}\operatorname{div}\mathbf{A} - \Delta\mathbf{A}. \qquad (11.42)$$

The last equation can be regarded as a definition of $\Delta\mathbf{A}$. In curvilinear
coordinates, $\Delta\varphi$ and $\Delta\mathbf{A}$ are expressed differently.

Curvilinear coordinates. We shall further show how the gradient,
divergence, and rotation, as well as $\Delta$ of a scalar appear in curvilinear
coordinates.

Curvilinear coordinates $q_1$, $q_2$, $q_3$ are termed orthogonal if only the
quadratic terms $dq_1^2$, $dq_2^2$, $dq_3^2$ appear in the expression for the
element of length $dl^2$, and not the products $dq_1\,dq_2$, $dq_2\,dq_3$, $dq_3\,dq_1$,
similar to the way that $dl^2 = dx^2 + dy^2 + dz^2$ in rectangular coordinates.
In orthogonal coordinates

$$dl^2 = h_1^2\,dq_1^2 + h_2^2\,dq_2^2 + h_3^2\,dq_3^2. \qquad (11.43)$$

For example, in spherical coordinates $q_1 = r$, $q_2 = \vartheta$, $q_3 = \varphi$. The element
of length is

$$dl^2 = dr^2 + r^2\,d\vartheta^2 + r^2\sin^2\vartheta\,d\varphi^2,$$

so that

$$h_1 = 1, \qquad h_2 = r, \qquad h_3 = r\sin\vartheta.$$


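Let us construct an elementary parallelepiped (Fig. 20). Then the
components of the gradient will be

$$\operatorname{grad}_1\psi = \frac{1}{h_1}\frac{\partial\psi}{\partial q_1}, \qquad \operatorname{grad}_2\psi = \frac{1}{h_2}\frac{\partial\psi}{\partial q_2}, \qquad \operatorname{grad}_3\psi = \frac{1}{h_3}\frac{\partial\psi}{\partial q_3}. \qquad (11.44)$$

To find the divergence, we calculate the flux of the vector A across the
faces of the parallelepiped of Fig. 20. The area ADCB is equal to
$h_2 h_3\,dq_2\,dq_3$. The flux of vector A through it is

$$A_1(q_1)\,h_2 h_3\,dq_2\,dq_3.$$

Here, $h_2$ and $h_3$ are also taken for a definite value of $q_1$. The sum of the
fluxes through the areas ADCB and A'B'C'D' is

$$\frac{\partial}{\partial q_1}(h_2 h_3 A_1)\,dq_1\,dq_2\,dq_3,$$

where we have used the expansion of the quantity $h_2 h_3 A_1$ at the
point $q_1 + dq_1$ in terms of $dq_1$, in a way similar to (11.4). The total flux
across all the boundaries is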



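$$dJ = \left[\frac{\partial}{\partial q_1}(h_2 h_3 A_1) + \frac{\partial}{\partial q_2}(h_3 h_1 A_2) + \frac{\partial}{\partial q_3}(h_1 h_2 A_3)\right] dq_1\,dq_2\,dq_3.$$

Let us now take advantage of the definition of divergence (11.8):

$$dJ = \operatorname{div}\mathbf{A}\cdot h_1 h_2 h_3\,dq_1\,dq_2\,dq_3 = \operatorname{div}\mathbf{A}\,dV.$$

Hence,

$$\operatorname{div}\mathbf{A} = \frac{1}{h_1 h_2 h_3}\left[\frac{\partial}{\partial q_1}(h_2 h_3 A_1) + \frac{\partial}{\partial q_2}(h_3 h_1 A_2) + \frac{\partial}{\partial q_3}(h_1 h_2 A_3)\right]. \qquad (11.45)$$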

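If, instead of $A_1$, $A_2$, $A_3$, we substitute the expressions (11.44), the
result will be the Laplacian of a scalar in orthogonal curvilinear
coordinates. Thus, in spherical coordinates it is

$$\Delta\psi = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{1}{r^2\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\,\frac{\partial\psi}{\partial\vartheta}\right) + \frac{1}{r^2\sin^2\vartheta}\frac{\partial^2\psi}{\partial\varphi^2}. \qquad (11.46)$$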
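A finite-difference version of (11.46) gives a quick way to test it on functions whose Laplacian is known. This is only a sketch: the discretization (differences taken at half-steps so that the structure of (11.46) is kept), the step size, and the function name are our own assumptions.

```python
import math

def laplacian_spherical(psi, r, theta, phi, h=1e-3):
    """Finite-difference form of (11.46) for a function psi(r, theta, phi)."""
    p0 = psi(r, theta, phi)
    # (1/r^2) d/dr (r^2 dpsi/dr)
    radial = ((r + h / 2) ** 2 * (psi(r + h, theta, phi) - p0)
              - (r - h / 2) ** 2 * (p0 - psi(r - h, theta, phi))) / (h * h * r * r)
    # (1/(r^2 sin theta)) d/dtheta (sin theta dpsi/dtheta)
    polar = (math.sin(theta + h / 2) * (psi(r, theta + h, phi) - p0)
             - math.sin(theta - h / 2) * (p0 - psi(r, theta - h, phi))
             ) / (h * h * r * r * math.sin(theta))
    # (1/(r^2 sin^2 theta)) d^2 psi / dphi^2
    azimuthal = (psi(r, theta, phi + h) - 2 * p0 + psi(r, theta, phi - h)
                 ) / (h * h * (r * math.sin(theta)) ** 2)
    return radial + polar + azimuthal

# Known cases: Laplacian of 1/r is 0 (r != 0), of r^2 is 6, of x = r sin(theta) cos(phi) is 0.
lap_inverse_r = laplacian_spherical(lambda r, t, f: 1.0 / r, 2.0, 1.0, 0.5)
lap_r_squared = laplacian_spherical(lambda r, t, f: r * r, 2.0, 1.0, 0.5)
lap_x = laplacian_spherical(lambda r, t, f: r * math.sin(t) * math.cos(f), 2.0, 1.0, 0.5)
```

The first case is the result of exercise 1a below; the third checks that the angular terms combine correctly, since x is harmonic but depends on all three coordinates.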


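With the aid of Stokes’ theorem, we can also calculate the rotation
in curvilinear coordinates. We shall give it for reference without
proof:

$$\operatorname{rot}_1\mathbf{A} = \frac{1}{h_2 h_3}\left[\frac{\partial}{\partial q_2}(A_3 h_3) - \frac{\partial}{\partial q_3}(A_2 h_2)\right],$$

$$\operatorname{rot}_2\mathbf{A} = \frac{1}{h_3 h_1}\left[\frac{\partial}{\partial q_3}(A_1 h_1) - \frac{\partial}{\partial q_1}(A_3 h_3)\right],$$

$$\operatorname{rot}_3\mathbf{A} = \frac{1}{h_1 h_2}\left[\frac{\partial}{\partial q_1}(A_2 h_2) - \frac{\partial}{\partial q_2}(A_1 h_1)\right]. \qquad (11.47)$$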


Exercises 

Where (from the requirements of the problem) expression in terms of
coordinates is not demanded, it is recommended that only the vector equations
of the present section, (11.26)-(11.42), be used.


1) Calculate the expressions:

a) $\Delta\frac{1}{r}$ ($r \neq 0$).

b) div $\varphi(r)\,\mathbf{r}$, rot $\varphi(r)\,\mathbf{r}$.

c) $\nabla(\mathbf{A}\mathbf{r})$, where A = const.

d) $\nabla(\mathbf{A}(r)\,\mathbf{r})$.

e) div $\varphi(r)\,\mathbf{A}(r)$, rot $\varphi(r)\,\mathbf{A}(r)$.

f) div [r [Ar]], A = const.

g) rot [r [Ar]], A = const.

h) $\Delta\mathbf{A}(r)$ [see (11.42)].

i) $\nabla(\mathbf{A}(r)\,\mathbf{B}(r))$.

j) rot [Ar], A = const.

k) div [Ar], A = const.


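Answers:

a) $\Delta\frac{1}{r} = \operatorname{div}\operatorname{grad}\frac{1}{r} = -\operatorname{div}\frac{\mathbf{r}}{r^3} = 0$.

b) $3\varphi + r\dot\varphi$; 0.

c) $\mathbf{A}$.

d) $\mathbf{A} + \frac{\mathbf{r}}{r}(\mathbf{r}\dot{\mathbf{A}})$.

e) $\frac{\dot\varphi}{r}(\mathbf{r}\mathbf{A}) + \frac{\varphi}{r}(\mathbf{r}\dot{\mathbf{A}})$; $\frac{\dot\varphi}{r}[\mathbf{r}\mathbf{A}] + \frac{\varphi}{r}[\mathbf{r}\dot{\mathbf{A}}]$.

f) $-2\,(\mathbf{A}\mathbf{r})$.

g) $3\,[\mathbf{r}\mathbf{A}]$.

h) $\ddot{\mathbf{A}} + \frac{2}{r}\dot{\mathbf{A}}$.

i) $\frac{\mathbf{r}}{r}(\dot{\mathbf{A}}\mathbf{B}) + \frac{\mathbf{r}}{r}(\mathbf{A}\dot{\mathbf{B}})$.

j) $2\mathbf{A}$.

k) 0.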



2) Write down $\Delta\psi$ in cylindrical coordinates.

3) Write down the three components of $\Delta\mathbf{A}$ in spherical coordinates.

4) Two closed contours are given. The radius vector of points of the first
contour is $\mathbf{r}_1$, of the second contour, $\mathbf{r}_2$. The elements of length along each
contour are $d\mathbf{l}_1$ and $d\mathbf{l}_2$, respectively. Prove that the integral

$$a = \oint\oint\left(d\mathbf{l}_2\left[d\mathbf{l}_1\,\nabla_1\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|}\right]\right)$$

is equal to zero, $4\pi$, $8\pi$, and in general $4\pi n$, depending upon how many times the first
contour is wound round the second, linking up with the latter. $\nabla_1$ denotes
differentiation with respect to $\mathbf{r}_1$ (Ampere’s theorem).

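Changing the order of integration and performing a cyclic permutation of the
factors, we have

$$a = \oint\oint\left(d\mathbf{l}_1\left[\nabla_1\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|},\,d\mathbf{l}_2\right]\right).$$

We apply Stokes’ theorem (11.19) to the integral in $d\mathbf{l}_1$:

$$\oint\left(d\mathbf{l}_1\left[\nabla_1\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|},\,d\mathbf{l}_2\right]\right) = \int\left(\operatorname{rot}_1\left[\nabla_1\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|},\,d\mathbf{l}_2\right] d\mathbf{s}_1\right).$$

We use equation (11.30); $\operatorname{rot}_1$ denotes differentiation with respect to the
components of $\mathbf{r}_1$, and $d\mathbf{l}_2$ in such a differentiation may be regarded as a
constant vector:

$$\operatorname{rot}_1\left[\nabla_1\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|},\,d\mathbf{l}_2\right] = (d\mathbf{l}_2\nabla_1)\,\nabla_1\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|} - d\mathbf{l}_2\operatorname{div}_1\nabla_1\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|}.$$

In accordance with exercise 1a, the last term, containing $\Delta_1\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|}$, is equal
to zero. There remains, therefore,

$$\operatorname{rot}_1\left[\nabla_1\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|},\,d\mathbf{l}_2\right] = (d\mathbf{l}_2\nabla_1)\,\nabla_1\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|} = -(d\mathbf{l}_2\nabla_2)\,\nabla_1\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|},$$

since a function of the difference $\mathbf{r}_1 - \mathbf{r}_2$ is differentiated.

For short, we write $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$. Then the required integral will be

$$a = -\oint(d\mathbf{l}_2\nabla_2)\int d\mathbf{s}_1\,\nabla_1\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|} = -\oint(d\mathbf{l}_2\nabla_2)\int d\mathbf{s}_1\,\nabla\frac{1}{r}.$$

We shall now explain the geometrical sense of the second integrand, i.e.,
$d\mathbf{s}_1\nabla\frac{1}{r} = -\frac{(d\mathbf{s}_1\mathbf{r})}{r^3}$. The scalar product $\frac{(d\mathbf{s}_1\mathbf{r})}{r}$ is the projection of
an element of the surface $d\mathbf{s}_1$, pulled over the first contour, on the radius vector
r drawn from a point on the second contour. In other words, $\frac{(d\mathbf{s}_1\mathbf{r})}{r}$ is equal
to the projection of the area $d\mathbf{s}_1$ on a plane perpendicular to r. This projection,
divided by $r^2$, is equal to the solid angle $d\Omega$ subtended by the area $d\mathbf{s}_1$ at a
point $\mathbf{r}_2$ on the second contour. The integral $\int\frac{(d\mathbf{s}_1\mathbf{r})}{r^3}$ is, therefore, that solid angle
$\Omega$ which is obtained if a cone is drawn with vertex at the point $\mathbf{r}_2$, so that the
generating line of the cone follows the contour $l_1$.

The differential $(d\mathbf{l}_2\nabla_2)\,\Omega$ is the increment of solid angle $\Omega$ obtained in
shifting along the contour $l_2$ a distance $d\mathbf{l}_2$. Thus,

$$a = \oint(d\mathbf{l}_2\nabla_2)\,\Omega = \oint d\Omega.$$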


The integral of this quantity around a closed contour is equal to the total
change in solid angle in traversing the contour $l_2$. Let the initial point of
circumvention lie on the surface $s_1$. Then the solid angle subtended by the surface
at the initial point is $-2\pi$. If the contours are linked, then the solid angle will be
$2\pi$ after the circumvention, since the area is observed from a terminal point
on the other side. If the contours are not linked, then the solid angle is once
again its initial value, $-2\pi$, and the integral is equal to zero. Thus, when the
contours are linked n times, the integral in $d\mathbf{l}_2$ is equal to $4\pi n$.
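The result of exercise 4 can be illustrated numerically. The sketch below evaluates the double contour integral of the exercise for two circles, once linked and once far apart. The particular circles, the number of integration points, and the function names are assumptions made for this illustration; the sign of the result depends on the chosen directions of traversal, so only its absolute value is compared with 4π.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def circle(center, u, v, radius, n):
    """Points and line elements dl of a circle lying in the plane of unit vectors u, v."""
    dt = 2 * math.pi / n
    out = []
    for k in range(n):
        c, s = math.cos(k * dt), math.sin(k * dt)
        point = tuple(center[i] + radius * (c * u[i] + s * v[i]) for i in range(3))
        dl = tuple(radius * (-s * u[i] + c * v[i]) * dt for i in range(3))
        out.append((point, dl))
    return out

def linking_integral(contour1, contour2):
    """Sum approximating the double integral of (dl2 [dl1 nabla_1 (1/|r1 - r2|)])."""
    total = 0.0
    for r1, dl1 in contour1:
        for r2, dl2 in contour2:
            d = tuple(r1[i] - r2[i] for i in range(3))
            dist3 = dot(d, d) ** 1.5
            grad1 = tuple(-d[i] / dist3 for i in range(3))  # nabla_1 (1/|r1 - r2|)
            total += dot(dl2, cross(dl1, grad1))
    return total

n = 200
first = circle((0.0, 0.0, 0.0), (1, 0, 0), (0, 1, 0), 1.0, n)       # in the xy plane
linked = circle((1.0, 0.0, 0.0), (1, 0, 0), (0, 0, 1), 1.0, n)      # threads the first circle
not_linked = circle((3.0, 0.0, 0.0), (1, 0, 0), (0, 0, 1), 1.0, n)  # moved away

a_linked = linking_integral(first, linked)
a_not_linked = linking_integral(first, not_linked)
```

With the second circle threading the first once, the sum should be close to 4π in absolute value; once the circle is moved away, it should be close to zero.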


Sec. 12. The Electromagnetic Field. Maxwell’s Equations

Interaction in mechanics and in electrodynamics. The interaction
of charged bodies in electrodynamics is principally an interaction of
charges with an electromagnetic field. However, the physical concept
of the field in electrodynamics differs essentially from the field concept
in Newtonian mechanics.

We know that the space in which gravitational forces act is called a
gravitational field. The values of these forces at any point of the field
are determined, in Newtonian mechanics, by the instantaneous positions
of the gravitating bodies, no matter how far they are from the given
point. In electrodynamics, such a field representation is not
satisfactory: during the time that it takes an electromagnetic disturbance
to move from one charge to another, the latter can move a very great
distance. Elementary charges (electrons, protons, mesons) very often
have velocities close to the velocity of propagation of electromagnetic
disturbances.

Modern gravitational theory (the general theory of relativity, see
Sec. 20) shows that gravitational interaction, too, propagates with a
finite velocity. But since macroscopic bodies move considerably
slower, within the scale of the solar system, the finite velocity of
propagation of gravitational forces introduces only an insignificant
correction to the laws of motion of Newtonian mechanics.

In the electrodynamics of elementary charges, the finite velocity of
propagation of electromagnetic disturbances is of fundamental
significance. When speaking of point charges, the action of a field on the
charge is always determined only by the field at the point where the
charge is located, and only at the instant when the charge is at this
point. As opposed to the “action at a distance” of Newtonian
mechanics, such interactions are termed “short-range.”

If the energy or momentum of a charged particle is changed under
the action of a field, they can be imparted directly only to the
electromagnetic field, since a finite interval of time is necessary for the energy
and momentum of other particles to be changed. But this means that
the electromagnetic field itself possesses energy and momentum,
whereas in Newtonian mechanics it was sufficient to assume that only
the interacting particles possessed energy and momentum. It follows
from this that the electromagnetic field is itself a real physical entity
to exactly the same extent as the charged particles. The equations of
electrodynamics must describe directly the propagation of electromagnetic
disturbances in space and the interaction of charges with the
field.

Interaction between charges is effected through the electromagnetic 
field. Such laws as the Coulomb or Biot-Savart laws (in which only 
the instantaneous positions and the instantaneous velocities of the 
charges appear) are of an approximate nature and are valid only when 
the relative velocities of the charges are small compared with the 
propagation velocity of electromagnetic disturbances. 

It will be shown later that this velocity is a fundamental constant
which appears in the equations of electrodynamics. It is equal to the
velocity of light in vacuo and, to a high degree of precision, is
$3 \times 10^{10}$ cm/sec.

A field in the absence of charges. The independent reality of the 
electromagnetic field is particularly evident from the fact that electro¬ 
dynamic equations admit of a solution in the absence of charges. 
These solutions describe electromagnetic waves, in particular light 
waves, in free space. Thus, electrodynamics has shown that light is 
electromagnetic in nature. 

In the course of two centuries, the protagonists of the wave theory
of light considered that light waves were propagated by a special
elastic medium permeating all space, the so-called “ether.” In order
to represent the spread of oscillations it was, naturally, necessary to
have something oscillating. This “something” was called the ether.
Proceeding from an analogy with the propagation of sound waves
in a continuous medium, the ether was endowed with the properties
of a fluid, physical phenomena being explained simply by reducing
them to definite mechanical displacements of bodies. In particular,
light phenomena were regarded as displacements of particles of the
special medium, the ether.

In this, a peculiar “abhorrence of a vacuum” was apparent or, more 
exactly, a purely speculative representation of empty space where 
“nothing exists” and, hence, where nothing can occur. Physicists did 
not at once come to realize that the electromagnetic field itself was 
just as real as the more tangible “ponderable matter.” Electrodynamic 
laws are those elementary concepts, from which the interaction of 
atoms should be deduced, which interaction accounts for the proper¬ 
ties of real fluids that are incomparably more complicated than the 
properties of a field in “empty space,” i.e., in the absence of charges. 
There is no sense in reducing a field to an imaginary fluid merely in 
order to avoid the idea of “empty space.” Physical space is the carrier 
of the electromagnetic field and is, therefore, inseparable from the
state and motion of real objects. As regards the term “ether,” which 
still persists in the field of radio, it expresses nothing other than the 
electromagnetic field. 

The electromagnetic field. Let us now establish the basic equations
of electrodynamics. We shall proceed from certain elementary laws,
which we assume the reader knows from a general course of physics
or electricity. These laws will first be used in the absence of matter
consisting of atoms or, as is usually said in electrodynamics, in the
absence of a “material medium.” By this term we must not understand
any encroachment on the material nature of the electromagnetic
field itself. From the electrodynamic equations for free space we shall,
later on, derive the equations for an electromagnetic field in a medium
(a conductor or dielectric).

As is known, the electromagnetic field in a medium is described by
four vector quantities: the electric field, the electric induction, the
magnetic field, and the magnetic induction. The force acting on unit
electric charge at a given point in space is called the electric field
intensity. In future, instead of the field intensity, we shall simply
speak of the field at a given point in space. The magnetic field intensity
or, for short, the magnetic field is defined analogously. Separate
magnetic charges, unlike electric charges, do not exist in nature; however,
if we make a long permanent magnet in the form of a needle, then the
magnetic force acting at its ends will be the same as if there existed
point charges at the ends.

A rigorous definition of the electric and magnetic induction vectors
will be given in Sec. 16, where the field equations in a medium will be
derived from the equations for point charges in free space. It need
only be recalled that in free space there is no need to use four vectors
for a description of the electromagnetic field, only two vectors being
sufficient: the electric and magnetic fields.

System of units. We shall consider that all electromagnetic quantities
are expressed in the CGSE system, i.e., in the absolute electrostatic
system of units. In this system the dimensions of electric charge
are gm$^{1/2}$ · cm$^{3/2}$/sec and the dimensions of field intensity, both electric
and magnetic, are gm$^{1/2}$/cm$^{1/2}$ · sec. If we substitute charge, expressed
in this system of units, into the equation for Coulomb’s law, then the
interaction force between charges is expressed in dynes (gm · cm/sec$^2$).

Electromotive force. Let us recall the definition for electromotive
force in a circuit: this is the work performed by the forces of the
electric field when unit charge is taken along the given closed circuit.
And it is absolutely immaterial what the given circuit represents:
whether it is filled with a conductor or whether it is merely a closed
line drawn in space. Let us write down the expression for electromotive
force (abbreviated as e. m. f.) in the notation of Sec. 11. The force
acting on unit charge at a given point is the electric field E. The work
done by this force on an element of path $d\mathbf{l}$ is the scalar product $\mathbf{E}\,d\mathbf{l}$.
Then, the work done on the whole closed circuit, or the e. m. f., is
equal to the integral

$$\text{e.m.f.} = \oint \mathbf{E}\,d\mathbf{l}. \qquad (12.1)$$



Magnetic-field flux across a surface. Let us suppose that some
surface is bounded by the given circuit. We shall denote the magnetic
field by the letter H. The magnetic-field flux through an element of
the chosen surface is, by the definition given in Sec. 11, $d\Phi = \mathbf{H}\,d\mathbf{s}$.
The magnetic-field flux through the whole surface, bounded by the
circuit, is

$$\Phi = \int \mathbf{H}\,d\mathbf{s}. \qquad (12.2)$$

It can be conveniently represented thus. Let us consider a section
of the surface through which unit flux $\Delta\Phi = 1$ (in the CGSE system)
passes. We draw through this section of the surface a line tangential
to the direction of the field at some point on the surface. A line which
is tangential to the direction of the field at its points is called a magnetic
line of force. For this reason, the total flux is equal, by definition,
to the number of magnetic lines of force crossing the surface.

Magnetic lines of force are either closed or extended to infinity. 
Indeed, a magnetic line of force may begin or end only at a single charge, 
but separate magnetic charges do not exist in nature. In a permanent 
magnet the lines of force are completed inside the magnet. 

From this it follows that a magnetic flux through any surface,
bounded by a circuit, is the same at a given instant. Otherwise, a
number of the magnetic lines of force would have to begin or end in
the space between the surfaces through which different fluxes pass.
Consequently, at a given instant, a constant number of magnetic
lines of force, i.e., a constant magnetic field flux passes across any
surface bounded by the circuit. Therefore, the flux can be ascribed
to the circuit itself, irrespective of the surface for which it is calculated.

Faraday’s induction law. Faraday’s induction law is written in the
form of the following equation:

$$\text{e.m.f.} = -\frac{1}{c}\frac{\partial\Phi}{\partial t}. \qquad (12.3)$$

If all the quantities are expressed in the CGSE system, then the
constant of proportionality c is a universal constant with the
dimensions of velocity equal to $3 \times 10^{10}$ cm/sec.

Usually, Faraday’s law is applied to circuits of conductors; however,
e. m. f. is simply the quantity of work performed on unit charge in
moving along the circuit, and, for a given field flux through the
circuit, cannot depend upon the form of the circuit. The e. m. f. is
simply equal to the integral $\oint \mathbf{E}\,d\mathbf{l}$. In a conducting circuit, this work
can be dissipated in the generation of Joule heat (“an ohmic load”).
However, it is completely justifiable to consider the circuit in a vacuum
also. In this case, the work performed on the charge is spent in increasing
the kinetic energy of the charged particle, as, for instance, in the
case of an induction accelerator, the betatron.
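As a small worked illustration of (12.3), consider a circular contour of radius R in a uniform field H(t) = H₀ sin ωt perpendicular to its plane, a crude picture of a betatron orbit. The field shape, the numerical values, and the function names below are assumptions of this sketch; the value of c is the one quoted in the text.

```python
import math

C = 3.0e10  # cm/sec, the value of c quoted in the text

def flux(t, radius, h0, omega):
    """Flux of the uniform field H = h0*sin(omega*t) through a circle of the given radius."""
    return math.pi * radius ** 2 * h0 * math.sin(omega * t)

def emf(t, radius, h0, omega, dt=1e-7):
    """e.m.f. = -(1/c) dPhi/dt, with the time derivative taken by central differences."""
    dphi_dt = (flux(t + dt, radius, h0, omega) - flux(t - dt, radius, h0, omega)) / (2 * dt)
    return -dphi_dt / C

def tangential_field(t, radius, h0, omega):
    """Mean electric field along the circle: the e.m.f. divided by the circumference."""
    return emf(t, radius, h0, omega) / (2 * math.pi * radius)

omega = 2 * math.pi * 50.0
example_emf = emf(0.001, 10.0, 1000.0, omega)
```

Nothing in the calculation refers to a conductor: the contour is just a closed line in space, as the text emphasizes.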



Maxwell’s equation for rot E. Thus, equation (12.3) refers to any
arbitrary closed circuit. We substitute the definitions (12.1) and (12.2)
into this equation:

$$\oint \mathbf{E}\,d\mathbf{l} = -\frac{1}{c}\frac{\partial}{\partial t}\int \mathbf{H}\,d\mathbf{s}. \qquad (12.4)$$

The left-hand side of the equation can be transformed by the
Stokes theorem (11.19) and, on the right-hand side, the order of the
time differentiation and surface integration can be interchanged, since
they are performed for independent variables. In addition, taking
this integral over to the left-hand side, we obtain

$$\int\left(\operatorname{rot}\mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{H}}{\partial t}\right) d\mathbf{s} = 0. \qquad (12.5)$$

But, the initial circuit is completely arbitrary, i.e., it can have
arbitrary magnitude and shape. Let us assume that the integrand, in
parentheses, of (12.5) is not equal to zero. Then we can choose the
surface and the circuit that bounds it so that the integral (12.5) does not
become zero. Thus, in all cases, the following equation must be
satisfied:

$$\operatorname{rot}\mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{H}}{\partial t} = 0. \qquad (12.6)$$

In comparison with (12.3), this equation does not contain anything
new physically; it is the same induction law, but rewritten in differential
form for an infinitely small circuit (contour). In many applications
the differential form is more convenient than the integral form.

We shall see later that the constant c is equal to the velocity of
light in free space.

The equation for div H. As we have already said, magnetic lines of
force are either closed or go off to infinity. Hence, in any closed surface,
the same number of magnetic-field lines enter as leave. The magnetic-field
flux in free space, across any closed surface, is equal to zero:

$$\oint \mathbf{H}\,d\mathbf{s} = 0. \qquad (12.7)$$

Transforming this integral to a volume integral according to the
Gauss-Ostrogradsky theorem (11.6), we obtain

$$\int \operatorname{div}\mathbf{H}\,dV = 0. \qquad (12.8)$$

Due to the fact that the surface bounding the volume is completely
arbitrary, we can always choose this volume to be so small that the
integral is taken over the region in which div H has constant sign if
it is not equal to zero. But then, in spite of (12.7) and (12.8), $\int \operatorname{div}\mathbf{H}\,dV$
will not be equal to zero. For this reason, the divergence of H must
become zero:

$$\operatorname{div}\mathbf{H} = 0. \qquad (12.9)$$

(12.9) is the differential form of (12.7) for an infinitely small volume.
In Sec. 11 it was shown that the divergence of a vector is the density
of sources of a vector field. The sources of the field are free charges
from which the vector (force) magnetic-field lines originate. Thus,
(12.9) indicates the absence of free magnetic charges.

Equations (12.6) and (12.9) are together called the first pair of
Maxwell’s equations.

Let us now introduce the second pair.

The equation for div E. The electric-field flux through a closed surface
is not equal to zero, but to the total electric charge e inside the surface
multiplied by $4\pi$ (Gauss’ theorem):

$$\oint \mathbf{E}\,d\mathbf{s} = 4\pi e. \qquad (12.10)$$

This theorem is derived from Coulomb’s law for point charges.
The field due to a point charge e is expressed by the following equation:

$$\mathbf{E} = \frac{e\,\mathbf{r}}{r^3}.$$

Here, r is a radius vector drawn from the point situated at the charge
to the point where the field is defined. The field is inversely
proportional to $r^2$ and is directed along the radius vector.

Let us surround the charge by a spherical surface centred on the
charge. The element of surface for the sphere is $d\mathbf{s} = \frac{\mathbf{r}}{r}\,r^2\,d\Omega$, where
$d\Omega$ is an elementary solid angle and $\frac{\mathbf{r}}{r}$ indicates the direction of the
normal to the surface. The flux of the field across the surface element is

$$\mathbf{E}\,d\mathbf{s} = \frac{e\,\mathbf{r}}{r^3}\cdot\frac{\mathbf{r}}{r}\,r^2\,d\Omega = e\,d\Omega.$$

The flux across the whole surface of the sphere is $\int e\,d\Omega = e\int d\Omega = 4\pi e$.
But since lines of force begin only at a charge, the flux will
be the same through the sphere as through any closed surface around
the charge. Therefore, if there is an arbitrary charge distribution e
inside a closed surface, then equation (12.10) holds.
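The statement that the flux is the same through any closed surface around the charge can be illustrated numerically. The sketch below sums the flux of the Coulomb field through the six faces of a cube, with the charge placed off-centre inside it and then outside it. The cube, the charge positions, the grid size, and the function name are our own choices for the illustration.

```python
import math

def flux_through_cube(charge_pos, e=1.0, half=1.0, n=150):
    """Flux of E = e r / r^3 through the surface of the cube |x|, |y|, |z| <= half."""
    h = 2 * half / n
    total = 0.0
    for axis in range(3):
        for sign in (-1.0, 1.0):
            for i in range(n):
                for j in range(n):
                    # midpoint of a small square cell on the face
                    point = [0.0, 0.0, 0.0]
                    point[axis] = sign * half
                    point[(axis + 1) % 3] = -half + (i + 0.5) * h
                    point[(axis + 2) % 3] = -half + (j + 0.5) * h
                    d = [point[k] - charge_pos[k] for k in range(3)]
                    r3 = (d[0] ** 2 + d[1] ** 2 + d[2] ** 2) ** 1.5
                    # the outward normal is sign * (unit vector along `axis`)
                    total += e * d[axis] / r3 * sign * h * h
    return total

flux_inside = flux_through_cube((0.3, -0.2, 0.1))   # charge inside the cube
flux_outside = flux_through_cube((2.5, 0.0, 0.0))   # charge outside the cube
```

For unit charge inside, the sum should be close to 4π, in agreement with (12.10); for the charge outside, the lines of force that enter also leave and the sum should be close to zero.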

In order to rewrite this equation in differential form, we introduce
the concept of charge density. The charge density $\rho$ is the charge
contained in unit volume, so that the total charge in the volume is related
to the density by the following equation:

$$e = \int \rho\,dV. \qquad (12.11)$$

Hence, $\rho = \lim_{\Delta V \to 0}\frac{\Delta e}{\Delta V}$. Introducing the charge density in (12.10), we
obtain

$$\int(\operatorname{div}\mathbf{E} - 4\pi\rho)\,dV = 0. \qquad (12.12)$$

Repeating the same argument for this integral as we used for (12.8),
we have

$$\operatorname{div}\mathbf{E} = 4\pi\rho. \qquad (12.13)$$

According to (11.8) we can say that the density of sources of an
electric field is equal to the electric charge density multiplied by $4\pi$.

The density function for point charges. The density function for
point charges is obtained by a limiting process. Let us initially assume
that a finite quantity of charge is distributed in a small, but finite,
volume $\Delta V$. Then $\rho$ must be regarded as the ratio $\frac{\Delta e}{\Delta V}$.

If we let the volume $\Delta V$ tend to zero, then the density will have a
very peculiar form: it will turn out to be equal to zero everywhere
except at the place where the charge is situated, and at that point it
will become infinite, since the numerator of the fraction $\frac{\Delta e}{\Delta V}$ is
finite and the denominator is infinitely small. However, the integral
$\int \rho\,dV$ remains equal to the charge e itself. Thus, the concept of charge density
can also be used in the case of a point charge. In this case, $\rho$ is
understood to be a function which is equal to zero everywhere except at the
point of the charge. The volume integral of this function is either equal
to the charge e itself, if the charge is situated inside the integration
region, or zero, if the charge is outside the region of integration.

The charge conservation law. One of the most important laws of 
electrodynamics is the law of conservation of charge. The total charge 
of any system remains constant if no external charges are brought into 
it. In all charge transformations occurring in nature, the law of con¬ 
servation of charge is satisfied with extreme precision (while the law 
of conservation of mass is approximate!). 

In order to formulate the charge-conservation law in differential
form, we must introduce the concept of current density. This vector
quantity is defined as

$$\mathbf{j} = \rho\mathbf{v}, \qquad (12.14)$$

where v is the charge velocity at the point where the density $\rho$ is
defined. The dimensions of charge density are charge/cm$^3$, and of
current density, charge/cm$^2$ · sec (i.e., the dimensions of charge
passing in unit time through unit area). For a point charge, v denotes
its velocity and $\rho$ the density function defined above.

The total current emerging from an area is

$$I = \int \mathbf{j}\,d\mathbf{s} = \int \rho\,(\mathbf{v}\,d\mathbf{s}). \qquad (12.15)$$



It must be equal to the reduction of charge inside the surface in
unit time, i.e.,

$$I = -\frac{\partial e}{\partial t} \qquad (12.16)$$

(the charge-conservation law in integral form). Substituting e from
(12.11) and transforming I by the Gauss-Ostrogradsky theorem,
we obtain

$$\int\left(\frac{\partial\rho}{\partial t} + \operatorname{div}\mathbf{j}\right) dV = 0. \qquad (12.17)$$

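Since the volume over which the integration is performed is arbitrary,
the conservation law for charge in differential form follows from
(12.17):

$$\frac{\partial\rho}{\partial t} + \operatorname{div}\mathbf{j} = \frac{\partial\rho}{\partial t} + \operatorname{div}\rho\mathbf{v} = 0. \qquad (12.18)$$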
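A one-dimensional illustration of (12.18): a charge distribution that moves rigidly with velocity v, ρ(x, t) = f(x - vt), together with j = ρv from (12.14), should make the left-hand side of (12.18) vanish. The Gaussian profile, the velocity, and the function names are assumptions made only for this sketch.

```python
import math

V = 2.0  # velocity of the moving charge cloud (arbitrary units, an assumption)

def rho(x, t):
    """Charge density of a Gaussian cloud moving rigidly with velocity V."""
    return math.exp(-(x - V * t) ** 2)

def j(x, t):
    """Current density j = rho * v, as in (12.14), for motion along x."""
    return rho(x, t) * V

def continuity_residual(x, t, h=1e-5):
    """d(rho)/dt + d(j)/dx by central differences; this should be close to zero."""
    drho_dt = (rho(x, t + h) - rho(x, t - h)) / (2 * h)
    dj_dx = (j(x + h, t) - j(x - h, t)) / (2 * h)
    return drho_dt + dj_dx

residual = continuity_residual(0.7, 0.2)
```

Here the decrease of charge density at a point is exactly balanced by the divergence of the current carrying the charge away.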


Displacement current. From direct-current theory, it is known
that current lines are always closed. Indeed, open-circuited lines
indicate that there is either an accumulation or deficiency of charge
at their ends. But we can also define vector lines such that they
will always be closed (or will have to go off to infinity) in the case
of alternating currents. For this we substitute the derivative $\frac{\partial\rho}{\partial t}$,
according to (12.13), into the equation of the charge-conservation
law (12.18). This derivative is equal to $\frac{1}{4\pi}\operatorname{div}\frac{\partial\mathbf{E}}{\partial t}$. Hence, we
always have the relation

$$\operatorname{div}\left(\mathbf{j} + \frac{1}{4\pi}\frac{\partial\mathbf{E}}{\partial t}\right) = 0. \qquad (12.19)$$

Comparing (12.19) and (12.9), we see that the vector lines of
$\mathbf{j} + \frac{1}{4\pi}\frac{\partial\mathbf{E}}{\partial t}$ are always closed. The vector $\frac{1}{4\pi}\frac{\partial\mathbf{E}}{\partial t}$ is called the
displacement current density. Together with the charge-transport
current j, the displacement current forms a closed
system of current lines.
The displacement current can be more vividly
demonstrated in the following way. Fig. 21 shows
a capacitor whose plates are joined by a conductor.
The current $I$ flowing in the conductor is equal
to the change of charge on the plates in unit
time:

$$I = \frac{de}{dt}.$$

But the charge on the plates is related to the field in the capacitor
by the relationship

$$E = \frac{4\pi e}{f},$$

where $f$ is the area of the plates. Whence

$$I = \frac{f}{4\pi}\frac{dE}{dt}.$$

Fig. 21

Consequently, the quantity $\frac{1}{4\pi}\frac{\partial E}{\partial t}$
can be interpreted as the density of some current which completes the
conduction current, of density $j = I/f$. This corresponds to the more
general equation (12.19).
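The capacitor argument can be checked numerically. The sketch below is only an illustration in Gaussian units: the charging law e(t) = e0 t², the plate area and the function names are assumptions chosen for the example and do not come from the text. It compares the conduction current de/dt with the total displacement current (f/4π) dE/dt in the gap.

```python
import math


def plate_field(charge, area):
    """Field between the plates in Gaussian units: E = 4*pi*e/f."""
    return 4.0 * math.pi * charge / area


def plate_charge(time, e0=3.0):
    """Hypothetical charging law e(t) = e0 * t**2 (illustrative only)."""
    return e0 * time ** 2


area = 50.0    # plate area f, cm^2 (assumed value)
t = 2.0        # instant at which the currents are compared, s
h = 1e-6       # step of the central difference

# Conduction current in the wire: I = de/dt.
conduction = (plate_charge(t + h) - plate_charge(t - h)) / (2.0 * h)

# Rate of change of the field in the gap, dE/dt.
dE_dt = (plate_field(plate_charge(t + h), area)
         - plate_field(plate_charge(t - h), area)) / (2.0 * h)

# Total displacement current through the gap: (f / 4*pi) * dE/dt.
displacement = area / (4.0 * math.pi) * dE_dt

print(conduction, displacement)
```

The two numbers agree to rounding error, which is the statement that the displacement current continues the conduction current across the gap.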

The conception that the magnetic action of a displacement current
does not differ from the magnetic action of ordinary current is basic
to Maxwellian electrodynamics.

Magnetomotive force. By analogy with electromotive force
$\oint \mathbf{E}\,d\mathbf{l}$, we define the magnetomotive force
$\oint \mathbf{H}\,d\mathbf{l}$, where the integration
is performed over a closed circuit. Using the Biot-Savart law for
direct currents, it may be shown that the magnetomotive force in
a closed circuit is equal to the electric current, $I$, crossing a surface
bounded by the circuit, multiplied by $4\pi/c$. In other words,

$$\oint \mathbf{H}\,d\mathbf{l} = \frac{4\pi I}{c}. \tag{12.20}$$

This relationship can be shown most simply by assuming that
a direct current $I$ is flowing through an infinitely long straight-line
circuit. We shall calculate the magnetomotive force in a circuit of
circular form, the current line passing through the centre of the
circle perpendicular to its plane. The magnetic field is tangential
to the circle and equal to $H = \frac{2I}{cr}$ in accordance with the
Biot-Savart law, where $r$ is the radius of the circle. Thus, for the
circuit we have chosen, the absolute value of $H$ is constant and the
magnetomotive force is equal to $2\pi r H = \frac{4\pi I}{c}$.
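As a hedged numerical illustration, the same result can be checked for a contour that is not a circle. The sketch below assumes the straight-wire field H = 2I/(cr) quoted above and integrates H dl around two rectangles by the midpoint rule; the function names, the value of the current and the corner coordinates are illustrative assumptions.

```python
import math


def wire_field(x, y, current, c):
    """H of an infinite straight wire along z (Gaussian units), |H| = 2I/(c r)."""
    r2 = x * x + y * y
    return (-2.0 * current * y / (c * r2), 2.0 * current * x / (c * r2))


def magnetomotive_force(corners, current, c, steps=5000):
    """Midpoint-rule estimate of the closed line integral of H dl along a polygon."""
    total = 0.0
    n = len(corners)
    for k in range(n):
        (x0, y0), (x1, y1) = corners[k], corners[(k + 1) % n]
        dx, dy = (x1 - x0) / steps, (y1 - y0) / steps
        for s in range(steps):
            xm, ym = x0 + (s + 0.5) * dx, y0 + (s + 0.5) * dy
            hx, hy = wire_field(xm, ym, current, c)
            total += hx * dx + hy * dy
    return total


c = 2.998e10       # speed of light, cm/s
current = 3.0e9    # current in Gaussian units (assumed value)

# Rectangle traversed counter-clockwise with the wire (at the origin) inside it.
linked = magnetomotive_force([(-1, -2), (3, -2), (3, 1), (-1, 1)], current, c)
# Rectangle that does not enclose the wire.
unlinked = magnetomotive_force([(2, -1), (4, -1), (4, 1), (2, 1)], current, c)
expected = 4.0 * math.pi * current / c

print(linked, unlinked, expected)
```

The linked contour reproduces 4πI/c, while the contour that does not enclose the wire gives a value close to zero.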

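For a circuit of arbitrary form, we should use the Biot-Savart
law in differential form:

$$d\mathbf{H} = \frac{I}{c}\,\frac{[d\mathbf{l}'\,\mathbf{r}]}{r^{3}},$$

where $d\mathbf{l}'$ is an element of the current-carrying circuit and
$\mathbf{r}$ is the radius vector drawn from this element to the point
at which the field is determined. Then the magnetomotive force is
represented by the integral

$$\oint \mathbf{H}\,d\mathbf{l} = \frac{I}{c}\oint\!\oint \frac{[d\mathbf{l}'\,\mathbf{r}]\,d\mathbf{l}}{r^{3}}.$$

However, in accordance with Ampere's theorem (exercise 4, Sec. 11),
this integral is equal to $\frac{4\pi I}{c}$ if the circuit in which the
magnetomotive force is calculated is linked with the current-carrying
circuit.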



The equation for rot H. Following Maxwell, we shall assume that
equation (12.20) is also true for displacement current if the field
is variable. The current lines will then always be closed, and in
calculating magnetomotive force we can use all the arguments that
we have used for continuous current. In the case of varying fields
and currents, $I$, in the formula for magnetomotive force, denotes
the total current passing through the circuit, i.e., the sum
$\int \mathbf{j}\,d\mathbf{s} + \frac{1}{4\pi}\int \frac{\partial\mathbf{E}}{\partial t}\,d\mathbf{s}$.
Naturally, this assumption is not obvious beforehand
and is justified by the fact that Maxwell's equations provide for the
explanation or prediction of the entire assemblage of phenomena
relating to rapidly changing electromagnetic fields (the displacement
current is usually negligible for slowly varying fields).

Let us assume that the total current is formed by the current
produced by charge transport (of density equal to $\mathbf{j}$) combined
with a displacement current with density
$\frac{1}{4\pi}\frac{\partial\mathbf{E}}{\partial t}$. In accordance with
our assumption,

$$I = \int \left(\mathbf{j} + \frac{1}{4\pi}\frac{\partial\mathbf{E}}{\partial t}\right) d\mathbf{s}. \tag{12.21}$$

Transforming the left-hand side of (12.20) in accordance with
Stokes' theorem (11.19) and combining with $I$, we obtain

$$\int \left(\operatorname{rot}\mathbf{H} - \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} - \frac{4\pi\mathbf{j}}{c}\right) d\mathbf{s} = 0. \tag{12.22}$$

Applying the same argument to equation (12.22) as applied to
(12.5), we arrive at the differential equation

$$\operatorname{rot}\mathbf{H} - \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} - \frac{4\pi\mathbf{j}}{c} = 0. \tag{12.23}$$

It is easy to see that this equation agrees with the law of charge
conservation. Indeed, we operate on it by div. According to (11.40),
div rot $\mathbf{H}$ = 0, so that we are left with

$$\frac{1}{c}\frac{\partial}{\partial t}\operatorname{div}\mathbf{E} + \frac{4\pi}{c}\operatorname{div}\mathbf{j} = 0.$$

Substituting div $\mathbf{E}$ from (12.13), we arrive once again at
(12.18), i.e., the law of conservation of charge.

Equation (12.23) is not merely an expression for the Biot-Savart
law in differential form. In (12.23), we have introduced the
displacement current, which is not involved in the theory of continuous
currents.

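The Maxwell system of equations. Let us once again write down
the system of Maxwell's equations for free space.

The first pair:

$$\operatorname{rot}\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{H}}{\partial t}, \tag{12.24}$$

$$\operatorname{div}\mathbf{H} = 0. \tag{12.25}$$

The second pair:

$$\operatorname{rot}\mathbf{H} = \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} + \frac{4\pi\mathbf{j}}{c}, \tag{12.26}$$

$$\operatorname{div}\mathbf{E} = 4\pi\rho. \tag{12.27}$$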


In these equations we consider $\rho$ and $\mathbf{j}$, i.e., the charge
and current distributions in space, to be known. The unknowns, to be
determined, are the fields $\mathbf{E}$ and $\mathbf{H}$. Each of them has
three components.

In spite of the fact that both pairs form, together, eight equations,
only six of them are independent, according to the number of field
components. Indeed, the three components of each rotation are
constrained by div rot = 0 and, hence, are not independent of one
another.

Electromagnetic potentials. We can introduce new unknown quantities
such that each equation will contain only one unknown. In
this way the overall number of equations is reduced. These new
quantities are called electromagnetic potentials.

We choose the potentials so that the first pair of Maxwell's equations
are identically satisfied. In order to satisfy equation (12.25), it is
sufficient to put

$$\mathbf{H} = \operatorname{rot}\mathbf{A}, \tag{12.28}$$

where $\mathbf{A}$ is a vector called the vector potential. Then, according
to (11.40), the divergence of $\mathbf{H}$ will be equal to zero identically.
We shall look for the electric field in the form:

$$\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} - \nabla\varphi, \tag{12.29}$$

where $\varphi$ is a quantity called the scalar potential.

From (11.39) rot $\nabla\varphi$ = 0. Substituting (12.28) and (12.29) in
(12.24), we obtain the identity.

The determinacy of potentials. The electromagnetic fields $\mathbf{E}$
and $\mathbf{H}$ are physically determinate quantities since, through them,
the forces acting on charges and currents can be expressed. The fields
are expressed in terms of potential derivatives. Therefore, potentials are
determined only to the accuracy of the expressions that cancel in
differentiation. These expressions should be chosen so that the
potentials satisfy equations of the simplest form. We shall now find
the most general potential transformation which does not change the
fields.

From equation (12.28) it can be seen that if we add the gradient
of any arbitrary function to the vector potential, the magnetic
field will not change, since the rotation of a gradient is identically
equal to zero. Putting

$$\mathbf{A} = \mathbf{A}' + \nabla f(x, y, z, t), \tag{12.30}$$

we see that the magnetic field, expressed in terms of such a modified
potential, remains unchanged:

$$\mathbf{H} = \operatorname{rot}\mathbf{A} = \operatorname{rot}\mathbf{A}'.$$

In order that the addition of $\nabla f$ should not affect the electric
field, we must also change the scalar potential:

$$\varphi = \varphi' - \frac{1}{c}\frac{\partial f}{\partial t}, \tag{12.31}$$

where $f$ is the same function as in (12.30). Then, for the electric
field, we obtain

$$\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} - \nabla\varphi = -\frac{1}{c}\frac{\partial\mathbf{A}'}{\partial t} - \nabla\varphi'.$$

Consequently, the electric field does not change either. Thus,
the potentials are determined to the accuracy of the transformations
(12.30), (12.31), which are called gauge transformations.
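The invariance of the fields under (12.30) and (12.31) can also be checked numerically. The sketch below is only an illustration: the potentials A', φ' and the gauge function f are arbitrary smooth functions invented for the example, c is set to 2 in arbitrary units, and all derivatives are taken by central differences.

```python
import math

C = 2.0        # speed of light in arbitrary units (illustrative assumption)
STEP = 1e-3    # finite-difference step (illustrative assumption)


def partial(func, point, axis, step=STEP):
    """Central-difference derivative of func(x, y, z, t) along one argument."""
    forward, backward = list(point), list(point)
    forward[axis] += step
    backward[axis] -= step
    return (func(*forward) - func(*backward)) / (2.0 * step)


def component(vector_func, k):
    """The k-th component of a vector-valued function, as a scalar function."""
    return lambda x, y, z, t: vector_func(x, y, z, t)[k]


def fields(vec_pot, scal_pot, point):
    """E = -(1/c) dA/dt - grad(phi) and H = rot A, by finite differences."""
    a = [component(vec_pot, k) for k in range(3)]
    e_field = tuple(-partial(a[k], point, 3) / C - partial(scal_pot, point, k)
                    for k in range(3))
    h_field = (partial(a[2], point, 1) - partial(a[1], point, 2),
               partial(a[0], point, 2) - partial(a[2], point, 0),
               partial(a[1], point, 0) - partial(a[0], point, 1))
    return e_field, h_field


# Hypothetical primed potentials and gauge function (not from the text).
def a_prime(x, y, z, t):
    return (t * math.sin(y), x * z, math.cos(x) + t * t)


def phi_prime(x, y, z, t):
    return x * y * t + z * z


def gauge_f(x, y, z, t):
    return math.sin(x) * y * t * t + z * t


def a_new(x, y, z, t):
    """A = A' + grad f, equation (12.30)."""
    p = (x, y, z, t)
    return tuple(a + partial(gauge_f, p, k) for k, a in enumerate(a_prime(*p)))


def phi_new(x, y, z, t):
    """phi = phi' - (1/c) df/dt, equation (12.31)."""
    return phi_prime(x, y, z, t) - partial(gauge_f, (x, y, z, t), 3) / C


point = (0.3, -0.7, 1.1, 0.5)
e_old, h_old = fields(a_prime, phi_prime, point)
e_new, h_new = fields(a_new, phi_new, point)
max_diff = max(abs(u - v) for u, v in zip(e_old + h_old, e_new + h_new))
print(max_diff)
```

The largest difference between the two sets of field components is at the level of rounding error, although the potentials themselves differ appreciably.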

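The Lorentz condition. Let us now choose an arbitrary function $f$
such that the second pair of Maxwell's equations leads to equations
for the potentials of the simplest possible form. Substituting (12.28)
and (12.29) in (12.26) gives

$$\operatorname{rot}\operatorname{rot}\mathbf{A} = -\frac{1}{c^{2}}\frac{\partial^{2}\mathbf{A}}{\partial t^{2}} - \frac{1}{c}\nabla\frac{\partial\varphi}{\partial t} + \frac{4\pi\mathbf{j}}{c}. \tag{12.32}$$

We express rot rot $\mathbf{A}$ with the aid of (11.42). Then (12.32) is
reduced to the following form:

$$-\Delta\mathbf{A} + \frac{1}{c^{2}}\frac{\partial^{2}\mathbf{A}}{\partial t^{2}} + \nabla\left(\operatorname{div}\mathbf{A} + \frac{1}{c}\frac{\partial\varphi}{\partial t}\right) = \frac{4\pi\mathbf{j}}{c}. \tag{12.33}$$

We shall now try to eliminate the quantity inside the brackets.
We denote it, for brevity, by the letter $a$, and we perform the
transformations (12.30) and (12.31) on the potentials. Then the quantity
$a$ is reduced to the form

$$a = \operatorname{div}\mathbf{A} + \frac{1}{c}\frac{\partial\varphi}{\partial t} = \operatorname{div}\mathbf{A}' + \frac{1}{c}\frac{\partial\varphi'}{\partial t} + \Delta f - \frac{1}{c^{2}}\frac{\partial^{2} f}{\partial t^{2}}. \tag{12.34}$$

The function $f$ has, so far, remained arbitrary. Let us now assume
that it has been chosen so as to satisfy the equation

$$\Delta f - \frac{1}{c^{2}}\frac{\partial^{2} f}{\partial t^{2}} = a. \tag{12.35}$$

Then, from (12.34), it is obvious that the potentials will be subject
to the condition

$$\operatorname{div}\mathbf{A}' + \frac{1}{c}\frac{\partial\varphi'}{\partial t} = 0. \tag{12.36}$$

This is called the Lorentz condition.

As was shown, the expression of fields in terms of potentials is
not changed by a gauge transformation. For this reason we shall
always consider, in future, that this transformation is performed
so that the Lorentz condition is satisfied; the primes in the potentials
can then be omitted.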

The equations for potentials. From the Lorentz condition and
(12.33), we obtain the equation for a vector potential:

$$\Delta\mathbf{A} - \frac{1}{c^{2}}\frac{\partial^{2}\mathbf{A}}{\partial t^{2}} = -\frac{4\pi\mathbf{j}}{c}. \tag{12.37}$$

It is now also easy to obtain the equation for a scalar potential.
From (12.27) we have

$$\operatorname{div}\mathbf{E} = -\frac{1}{c}\frac{\partial}{\partial t}\operatorname{div}\mathbf{A} - \Delta\varphi = 4\pi\rho.$$

Substituting div $\mathbf{A}$ from the Lorentz condition (12.36), we obtain

$$\Delta\varphi - \frac{1}{c^{2}}\frac{\partial^{2}\varphi}{\partial t^{2}} = -4\pi\rho. \tag{12.38}$$

Equations (12.37) and (12.38) each contain only one unknown.
Therefore, each equation for potential does not depend on the rest
and can be solved separately.

The equations for potential are second order with respect to
coordinate and time derivatives. For a solution, it is necessary to give
not only the initial values of the potentials, but also the initial values
of their time derivatives.

Gauge invariance. As we shall see later, especially in the following
section and in Sec. 21, which is devoted to the motion of charges
in an electromagnetic field, it is necessary, in many cases, to use
equations involving potentials. But, since potentials are ambiguous,
we must take care that the form of any equation involving potentials
does not change under gauge transformations (12.30) and (12.31),*
since such transformations involve a completely arbitrary function
$f$ which can be chosen to be of any form. It is clear that no physical
result can depend on the choice of this function, i.e., on an arbitrary
gauge transformation. In other words, equations involving potentials
must be gauge invariant.


* This does not refer to equations (12.37) and (12.38), from which the
potentials are determined in accordance with the condition (12.36).




Sec. 13. The Action Principle for the Electromagnetic Field 

The variational principle for the electromagnetic field. In the first
part of this book it was shown that the equations of mechanics,
obtained from Newton's laws (Sec. 2), lead to the principle of least
action (Sec. 10). We obtained the equations of electrodynamics in
the preceding section by proceeding from certain simple physical
laws and the assumption about the magnetic effect due to
displacement current. In this section, Maxwell's equations will be reduced
to the variational principle, which is the principle of least action
for the electromagnetic field.

Electrodynamics is not equivalent to the mechanics of particle
systems or to the mechanics of liquids, which are based on Newton's
laws. All the same, to a very considerable extent, electrodynamical
laws are analogous to the laws of mechanics. This analogy can best
be seen from the principle of least action for the electromagnetic
field.

The variational formulation is the most convenient way to derive the
conservation laws for the electromagnetic field. The corresponding
integrals of motion for a field coincide with the well-known mechanical
integrals: energy, linear momentum, and angular momentum.
In a closed system consisting of charged particles and a field, the
total energy, total linear momentum, and total angular momentum
of the charges and field are conserved.

In this sense, electrodynamics is indeed "a dynamics" of the
electromagnetic field, though this by no means signifies that the laws
of electrodynamics can be obtained from Newton's laws. Both are
equivalent to certain integral variational principles, but the action
functions are, of course, of entirely different form.

It is a noteworthy fact that Maxwell at first tried to construct 
mechanical models of the ether, but in his later work he rejected 
them and obtained the general equations of electrodynamics by 
means of a generalization of known elementary laws of electro¬ 
magnetism. 

The Lagrangian function for a field. In order to formulate the
principle of least action it is necessary to have an expression for the
Lagrangian. The choice of Lagrangian in mechanics is determined
by considerations based on the relativity principle of Newtonian
mechanics, which is formulated with the aid of Galilean
transformations (Sec. 8). As will be explained in detail in Secs. 20 and 21,
Galilean transformations are not valid in electrodynamics and are
replaced by the more general Lorentz transformations, based on
the Einstein relativity principle. These transformations allow the
Lagrangian for the electromagnetic field to be uniquely found;
this will be done in Sec. 21. In this section, the choice of Lagrangian
is justified by the fact that the already familiar Maxwell equations
are obtained from it. Similarly, in Part I, the principle of least action
was formulated after Lagrange's equations had been obtained on
the basis of Newton's laws. This confirmed the truth of the integral
principle.

In finding the Lagrangian for a system of free particles, a summation
is performed over the coordinates of the particles. The
electromagnetic field, if we use the terminology of mechanics, is a system
with an infinite number of degrees of freedom because, for a complete
description of the field, we must know all its components at all points
of space where they differ from zero. But the points of space form
a nondenumerable set, i.e., they cannot be numbered in any order.
For this reason, for the electromagnetic field the summation in the
Lagrangian is replaced by an integration with respect to continuously
varying parameters, i.e., coordinates of points at which the field
is given. The point coordinates are analogous to the indices which
label the degrees of freedom of a mechanical system.

The equations of mechanics are second order in time with respect
to generalized coordinates $q_k$. The equations for potentials (12.37)
and (12.38) are also second order in time. Therefore, potential
quantities should be chosen as the generalized coordinates.

In other words, $\mathbf{A}(\mathbf{r}, t)$, $\varphi(\mathbf{r}, t)$
correspond to $q_k(t)$, where $\mathbf{A}$ and $\varphi$
are potentials which are generalized coordinates of an electromagnetic
field. The value of the radius vector $\mathbf{r}$ for the point at which
the potential is taken corresponds to the number of the generalized
coordinate $k$.

In order to write down the complete Lagrangian function, we
must first of all define it in an element of volume $dV$ and integrate
over the volume occupied by the field. It has already been mentioned
that in this section we will proceed immediately from a Lagrangian
that leads to correct Maxwell equations; the choice of this Lagrangian
as based on considerations related to the relativity principle will
be left to Sec. 21. The Lagrangian is of the following form:

$$L = \int \left(\frac{E^{2} - H^{2}}{8\pi} + \frac{\mathbf{A}\mathbf{j}}{c} - \rho\varphi\right) dV. \tag{13.1}$$

Since the potentials are to be the generalized coordinates of the
field, expression (13.1) should be rewritten thus:

$$L = \int \left\{\frac{1}{8\pi}\left[\left(\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} + \nabla\varphi\right)^{2} - (\operatorname{rot}\mathbf{A})^{2}\right] + \frac{\mathbf{A}\mathbf{j}}{c} - \rho\varphi\right\} dV. \tag{13.2}$$

Here, in place of a summation over the degrees of freedom, an
integration over the volume has been performed.

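The extremal property of action in electrodynamics. We shall now
show that action, i.e., $S = \int L\,dt$, possesses the same variational
property in electrodynamics as it does in mechanics; its variation
becomes zero if the field satisfies the correct equations of motion
(in this case, the Maxwell equations).

We shall begin with variation with respect to the scalar potential $\varphi$:

$$\delta_\varphi L = \int \left[\frac{1}{4\pi}\left(\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} + \nabla\varphi\right)\delta\nabla\varphi - \rho\,\delta\varphi\right] dV. \tag{13.3}$$

As was shown in Sec. 10, variation and differentiation are
commutative so that $\delta\nabla\varphi = \nabla\delta\varphi$. According to
(12.29) we replace the term
$\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} + \nabla\varphi$ by
$-\mathbf{E}$. Therefore,

$$\delta_\varphi L = -\int \left[\frac{1}{4\pi}\mathbf{E}\,\nabla\delta\varphi + \rho\,\delta\varphi\right] dV. \tag{13.4}$$

We shall now make use of equation (11.27), in accordance with
which

$$\mathbf{E}\,\nabla\delta\varphi = \operatorname{div}(\mathbf{E}\,\delta\varphi) - \delta\varphi\operatorname{div}\mathbf{E}. \tag{13.5}$$

We then obtain

$$\delta_\varphi L = -\int \left[\frac{1}{4\pi}\operatorname{div}(\mathbf{E}\,\delta\varphi) - \delta\varphi\left(\frac{\operatorname{div}\mathbf{E}}{4\pi} - \rho\right)\right] dV. \tag{13.6}$$

The first term in (13.6) can be transformed into a surface integral,
so that $\delta_\varphi L$ will have the form

$$\delta_\varphi L = -\frac{1}{4\pi}\int \delta\varphi\,\mathbf{E}\,d\mathbf{s} + \int \delta\varphi\left(\frac{\operatorname{div}\mathbf{E}}{4\pi} - \rho\right) dV. \tag{13.7}$$

We shall consider that the first integral is taken over a surface
on which $\delta\varphi$ becomes zero, similar to the way that, in Sec. 10,
$\delta q$ was equal to zero at the limits of integration (a surface is the
limit for a volume integral).

Therefore,

$$\delta_\varphi L = \int \delta\varphi\left(\frac{\operatorname{div}\mathbf{E}}{4\pi} - \rho\right) dV. \tag{13.8}$$

However, since

$$\operatorname{div}\mathbf{E} = 4\pi\rho \tag{13.9}$$

[see (12.27)], $\delta_\varphi L$, and hence $\delta_\varphi S$, becomes zero
as expected.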

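Let us now vary $L$ with respect to $\mathbf{A}$. This variation has the form

$$\delta_A L = \int \left[\frac{1}{4\pi c}\left(\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} + \nabla\varphi\right)\frac{\partial\,\delta\mathbf{A}}{\partial t} - \frac{1}{4\pi}\operatorname{rot}\mathbf{A}\;\delta\operatorname{rot}\mathbf{A} + \frac{\mathbf{j}\,\delta\mathbf{A}}{c}\right] dV. \tag{13.10}$$

Once again we interchange the differentiation and variation signs
and, where possible, we replace the potentials by fields after variation.
We obtain $\delta_A L$ in the following form:

$$\delta_A L = \int \left[-\frac{1}{4\pi c}\mathbf{E}\,\frac{\partial\,\delta\mathbf{A}}{\partial t} - \frac{1}{4\pi}\mathbf{H}\operatorname{rot}\delta\mathbf{A} + \frac{\mathbf{j}\,\delta\mathbf{A}}{c}\right] dV. \tag{13.11}$$

Let us write down the transformation by parts:

$$\mathbf{E}\,\frac{\partial\,\delta\mathbf{A}}{\partial t} = \frac{\partial}{\partial t}(\mathbf{E}\,\delta\mathbf{A}) - \delta\mathbf{A}\,\frac{\partial\mathbf{E}}{\partial t}, \tag{13.12}$$

$$\mathbf{H}\operatorname{rot}\delta\mathbf{A} = -\operatorname{div}[\mathbf{H}\,\delta\mathbf{A}] + \delta\mathbf{A}\operatorname{rot}\mathbf{H}. \tag{13.13}$$

The last equation follows from (11.29). To take advantage of
(13.12), we must write down the variation of the action $S$ instead
of the variation of $L$. Then the first term of (13.12) can be directly
integrated with respect to time, and $\delta_A S$ will be

$$\delta_A S = \int_{t_0}^{t_1} \delta_A L\,dt = -\frac{1}{4\pi c}\int \mathbf{E}\,\delta\mathbf{A}\,dV\,\Big|_{t_0}^{t_1} + \frac{1}{4\pi}\int_{t_0}^{t_1} dt \int [\mathbf{H}\,\delta\mathbf{A}]\,d\mathbf{s} + \int_{t_0}^{t_1} dt \int dV\,\delta\mathbf{A}\left(\frac{1}{4\pi c}\frac{\partial\mathbf{E}}{\partial t} - \frac{1}{4\pi}\operatorname{rot}\mathbf{H} + \frac{\mathbf{j}}{c}\right). \tag{13.14}$$

The variation $\delta\mathbf{A}$ is equal to zero at the initial and final
instants of time $t_0$ and $t_1$, as well as over the surface bounding the
field. Therefore,

$$\delta_A S = \frac{1}{4\pi}\int_{t_0}^{t_1} dt \int dV\,\delta\mathbf{A}\left(\frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} - \operatorname{rot}\mathbf{H} + \frac{4\pi\mathbf{j}}{c}\right), \tag{13.15}$$

and since the field satisfies the equation

$$\operatorname{rot}\mathbf{H} = \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} + \frac{4\pi\mathbf{j}}{c} \tag{13.16}$$

[see (12.26)], $\delta_A S$ is equal to zero.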

The first pair of Maxwell's equations is, of course, satisfied
identically if the fields are expressed in terms of potentials in accordance
with (12.28) and (12.29).

Thus, Maxwell's equations can be interpreted as equations of the
mechanics of an electromagnetic field. They could be obtained from
the variational method, starting from the Lagrangian (13.1) and the
requirement that the variation of action should be equal to zero
for any arbitrary variations of the scalar and vector potentials.
For this it is sufficient to repeat the arguments set out in Sec. 10,
as applied to integrals (13.8) and (13.15).

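The invariance of action with respect to a potential gauge
transformation. We shall now show that action is invariant under gauge
transformations (12.30) and (12.31), despite the fact that it involves
not only fields, but also potentials contained in the last two terms
of equation (13.1). We shall call the corresponding part of the action $S_1$:

$$S_1 = \int dt \int dV \left(\frac{\mathbf{A}\mathbf{j}}{c} - \rho\varphi\right). \tag{13.17}$$

Let us now apply gauge transformations (12.30) and (12.31) to
$\mathbf{A}$ and $\varphi$. This gives

$$S_1 = \int dt \int dV \left(\frac{\mathbf{A}'\mathbf{j}}{c} - \rho\varphi' + \frac{\mathbf{j}\,\nabla f}{c} + \frac{\rho}{c}\frac{\partial f}{\partial t}\right). \tag{13.18}$$

We transform by parts the terms containing $f$:

$$\mathbf{j}\,\nabla f = \operatorname{div}(f\mathbf{j}) - f\operatorname{div}\mathbf{j},$$

$$\rho\,\frac{\partial f}{\partial t} = \frac{\partial}{\partial t}(\rho f) - f\,\frac{\partial\rho}{\partial t}.$$

Substituting this in the integral $S_1$ and performing the integration,
as in (13.14), we have

$$S_1 = \frac{1}{c}\int_{t_0}^{t_1} dt \int f\,\mathbf{j}\,d\mathbf{s} + \frac{1}{c}\int dV\,\rho f\,\Big|_{t_0}^{t_1} + \int_{t_0}^{t_1} dt \int dV \left[\frac{\mathbf{A}'\mathbf{j}}{c} - \rho\varphi' - \frac{f}{c}\left(\operatorname{div}\mathbf{j} + \frac{\partial\rho}{\partial t}\right)\right]. \tag{13.19}$$

However, the integrated terms do not affect the Maxwell equations
since, when performing a variation of $S_1$, both $\delta_\varphi L$ and
$\delta_A L$ are equal to zero at the boundaries of the integration region. We
have already encountered this in (10.0). The term proportional
to $f$ under the integral sign is multiplied by the quantity
$\operatorname{div}\mathbf{j} + \frac{\partial\rho}{\partial t}$,
which is identically equal to zero according to the charge
conservation law (12.18). Thus, $S_1$ retains the form (13.17).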

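The energy of a field. Maxwell's equations also apply to a free
electromagnetic field not containing charges or currents. For this
it is sufficient to omit from them the terms $\frac{4\pi\mathbf{j}}{c}$ and
$4\pi\rho$. In accordance with (13.1) and (13.2), the Lagrangian for a
free field is

$$L_0 = \int \frac{E^{2} - H^{2}}{8\pi}\,dV = \frac{1}{8\pi}\int \left[\left(\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} + \nabla\varphi\right)^{2} - (\operatorname{rot}\mathbf{A})^{2}\right] dV. \tag{13.20}$$

We shall now determine the energy of an electromagnetic field
by proceeding from the general equation (4.4). First, let it be recalled
that the values of potentials at all points of space are generalized
coordinates. But then the derivatives
$\frac{\partial\mathbf{A}}{\partial t}$ are generalized velocities.
Consequently, the expression

$$\mathcal{E} = \sum_k \dot{q}_k\,\frac{\partial L}{\partial \dot{q}_k} - L$$

reduces to the form

$$\mathcal{E} = \frac{1}{4\pi c}\int \left(\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} + \nabla\varphi\right)\frac{\partial\mathbf{A}}{\partial t}\,dV - L_0$$

by means of the comparison

$$\dot{q}_k \rightarrow \frac{\partial\mathbf{A}}{\partial t}, \qquad \sum_k \rightarrow \int dV.$$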




We shall now show that energy is expressed only in terms of the
field, and not in terms of potentials. Using (12.29) we write down
the energy thus:

$$\mathcal{E} = \frac{1}{4\pi}\int \mathbf{E}\,(\mathbf{E} + \nabla\varphi)\,dV - \frac{1}{8\pi}\int (E^{2} - H^{2})\,dV.$$

This expression is not invariant with respect to a gauge transformation
and must be transformed.

Transforming the term $\mathbf{E}\,\nabla\varphi$ by parts, we have, from (11.27),

$$\mathbf{E}\,\nabla\varphi = \operatorname{div}(\mathbf{E}\varphi) - \varphi\operatorname{div}\mathbf{E} = \operatorname{div}\varphi\mathbf{E},$$

since div $\mathbf{E}$ = 0 for a field free of charges. The volume integral of
div $\varphi\mathbf{E}$ is transformed into a surface integral. However,
according to the meaning attached to $L$ and $\mathcal{E}$, the integration
should be performed over the whole region occupied by the field (this is
analogous to summation over all the degrees of freedom of the system).
At the boundary of this region, the field is equal to zero by definition,
so that the surface integral in the expression for energy also becomes
zero. From this we obtain the required expression for the energy
of an electromagnetic field in the absence of charges:

$$\mathcal{E} = \frac{1}{8\pi}\int (E^{2} + H^{2})\,dV. \tag{13.21}$$

Hence, the quantity

$$\frac{E^{2} + H^{2}}{8\pi} \tag{13.22}$$

may be interpreted as the energy density of the electromagnetic
field. It is invariant with respect to a gauge transformation of a
potential.

Conservation of the total energy of field and charges. We shall
now show that the energy $\mathcal{E}$ (13.21), together with the energy of
the charges contained in the field, is conserved, i.e., is the energy in the
usual, mechanical, sense of the word, and not some quantity which
is formally analogous to it only as regards its derivation from the
variational principle.

To do this we multiply equation (12.26) scalarly by $\mathbf{E}$ and (12.24)
scalarly by $\mathbf{H}$, and subtract the second from the first. This gives
the following relationship:

$$\frac{1}{c}\left(\mathbf{E}\,\frac{\partial\mathbf{E}}{\partial t} + \mathbf{H}\,\frac{\partial\mathbf{H}}{\partial t}\right) = \mathbf{E}\operatorname{rot}\mathbf{H} - \mathbf{H}\operatorname{rot}\mathbf{E} - \frac{4\pi\,\mathbf{j}\mathbf{E}}{c}.$$

Now taking advantage of equation (11.29), we reduce the equation
obtained to the form

$$\frac{\partial}{\partial t}\left(\frac{E^{2} + H^{2}}{8\pi}\right) = -\operatorname{div}\frac{c}{4\pi}[\mathbf{E}\,\mathbf{H}] - \rho\,\mathbf{v}\mathbf{E}. \tag{13.23}$$




Here we have put $\mathbf{j} = \rho\mathbf{v}$ by definition. We now
integrate (13.23) over some volume, though not necessarily the whole
volume occupied by the field, and transform the integral of div to a
surface integral:

$$\frac{\partial}{\partial t}\int \frac{E^{2} + H^{2}}{8\pi}\,dV = -\frac{c}{4\pi}\int [\mathbf{E}\,\mathbf{H}]\,d\mathbf{s} - \int \rho\,\mathbf{v}\mathbf{E}\,dV. \tag{13.24}$$

Let us first consider the second integral on the right. By definition,
the quantity $\rho\,dV$ is the charge element $de$. The product
$\mathbf{E}\,de$ is the electric force acting on this charge element. The
scalar product
$de\,\mathbf{E}\mathbf{v} = de\,\mathbf{E}\,\frac{d\mathbf{r}}{dt}$
is equal to the work done on the element
of charge in unit time or, put in another way, to the change in
kinetic energy $T$ of the charge in unit time. Later, we shall show
that the magnetic field does not perform work on charges (Sec. 21).
To summarize, equation (13.24) can also be written as follows [the
last integral in (13.24) will be represented in the form
$\frac{dT}{dt}$, i.e., the work done in unit time]:

$$\frac{\partial}{\partial t}(\mathcal{E} + T) = -\frac{c}{4\pi}\int [\mathbf{E}\,\mathbf{H}]\,d\mathbf{s}. \tag{13.25}$$


The Poynting vector. Thus, the decrease in energy, in unit time,
of an electromagnetic field and of the charged particles contained
therein is equal to the flux of the vector
$\frac{c}{4\pi}[\mathbf{E}\,\mathbf{H}]$ across the surface bounding
the field. If this surface is infinitely distant and the field on it
is equal to zero, what we have is simply the energy conservation
law for an electromagnetic field and for the charges within it.
Otherwise, if the volume is finite, the right-hand side of equation (13.25)
indicates what energy passes in unit time through the surface
bounding the volume. Hence, the quantity

$$\mathbf{U} = \frac{c}{4\pi}[\mathbf{E}\,\mathbf{H}] \tag{13.26}$$

represents the energy crossing unit area in unit time or, more simply,
the energy flux density vector (the Poynting vector).
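A small numerical sketch may help to fix the meaning of (13.22) and (13.26). It assumes, as an illustration that the text has not yet derived at this point, a field in which E and H are mutually perpendicular and equal in magnitude in Gaussian units, as in a plane wave in free space. The function names and field values are hypothetical.

```python
import math

C = 2.998e10  # speed of light, cm/s


def cross(a, b):
    """Vector product [a b] of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def poynting(e_field, h_field, c=C):
    """Energy flux density U = (c / 4*pi) [E H], equation (13.26)."""
    return tuple(c / (4.0 * math.pi) * comp for comp in cross(e_field, h_field))


def energy_density(e_field, h_field):
    """Energy density (E^2 + H^2) / (8*pi), equation (13.22)."""
    e2 = sum(v * v for v in e_field)
    h2 = sum(v * v for v in h_field)
    return (e2 + h2) / (8.0 * math.pi)


# Assumed field values: E along x, H along y, equal magnitudes.
e_field = (0.02, 0.0, 0.0)
h_field = (0.0, 0.02, 0.0)

flux = poynting(e_field, h_field)
density = energy_density(e_field, h_field)
print(flux, density, C * density)
```

For this particular configuration the flux points along z and its magnitude equals c times the energy density, i.e., the field energy is carried along with velocity c.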

Field momentum. Similar computations, which we shall not give,
show that an electromagnetic field possesses momentum. The
momentum of a field is given by the following integral:

$$\mathbf{p} = \frac{1}{4\pi c}\int [\mathbf{E}\,\mathbf{H}]\,dV. \tag{13.27}$$

If the electromagnetic field interacts with some obstacle, for
example, the walls of the enclosure in which it is contained, or with
a screen, then the momentum of the field is transmitted to the
obstacle. The momentum transmitted normally to unit area in unit
time is nothing other than the pressure (since momentum
transmitted in unit time is force). For this reason, electrodynamics
predicts that electromagnetic fields (and, as a particular case, light
waves) are capable of exerting a pressure on matter.

Angular momentum of a field. According to (13.27), the
field-momentum density is

$$\frac{1}{4\pi c}[\mathbf{E}\,\mathbf{H}].$$

From this it follows that the angular-momentum density is

$$\frac{1}{4\pi c}[\mathbf{r}\,[\mathbf{E}\,\mathbf{H}]],$$

and the total angular momentum of the field is

$$\mathbf{M} = \frac{1}{4\pi c}\int [\mathbf{r}\,[\mathbf{E}\,\mathbf{H}]]\,dV. \tag{13.28}$$

Linear momentum and angular momentum of a field satisfy the
conservation laws together with similar quantities for the charges
contained in the field. The value of the angular momentum of a
field is very essential in the quantum theory of radiation.


Sec. 14. The Electrostatics of Point Charges. Slowly Varying Fields

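An important class of approximate solutions of electrodynamical
equations comprises slowly varying fields, for which the terms
$\frac{1}{c}\frac{\partial\mathbf{E}}{\partial t}$ and
$\frac{1}{c}\frac{\partial\mathbf{H}}{\partial t}$ in Maxwell's equations
can be neglected. The remaining terms form two sets of equations, which
are entirely independent of each other:

$$\operatorname{div}\mathbf{E} = 4\pi\rho, \tag{14.1}$$

$$\operatorname{rot}\mathbf{E} = 0, \tag{14.2}$$

$$\operatorname{div}\mathbf{H} = 0, \tag{14.3}$$

$$\operatorname{rot}\mathbf{H} = \frac{4\pi\mathbf{j}}{c}. \tag{14.4}$$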


The first two equations contain only the electric field and the
density of the charge producing the field; the second two equations
involve only the magnetic field and current density, the right-hand
sides of the equations being regarded as known functions of coordinates
and time. Since there are no time derivatives in (14.1)-(14.4),
the time dependence of the electric field is the same as that of the
electric-charge densities, and the time dependence of the magnetic field
is the same as that of the current densities. Hence, to the approximation
of (14.1)-(14.4), the field is, as it were, established instantaneously,
in correspondence with the charge and current distribution that
generated it.

The fact is that any change in the field is transmitted in space
with the velocity of light $c$. If we consider the field at a distance $R$
from a charge, the electromagnetic disturbance will reach it in a
time $\frac{R}{c}$. The charge, of velocity $v$, will be displaced, during
that time, through a distance $v\frac{R}{c}$. The approximation
(14.1)-(14.4) can be applied only when the displacement $v\frac{R}{c}$
does not lead to any essential redistribution of the charge. For example,
let a system consist of two equal charges of opposite sign, which succeed
in changing places in a time $\frac{R}{c}$. Then, the electric field at a
distance $R$, at the instant $t = \frac{R}{c}$, will have a direction
opposite to the one it had during the instantaneous propagation at the
instant $t = 0$.

Hence, if the dimensions of the system of charges are $r$ and their
velocities $v$ ($r$ and $v$ determine the orders of magnitude), then
equations (14.1)-(14.4) can be used at the distance $R$ from the system,
for which the inequality $\frac{r}{v} \gg \frac{R}{c}$ or
$R \ll \frac{c}{v}\,r$ is satisfied.

We shall consider the limiting case, when $v \ll c$. Then the region
of applicability of our approximation will be very large.
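The criterion R ≪ (c/v) r can be turned into a rough estimate. The sketch below is only an order-of-magnitude aid: the function names are hypothetical, and the safety factor `margin`, which stands in for the sign "much less than", is an arbitrary assumption rather than anything prescribed by the text.

```python
C = 2.998e10  # speed of light, cm/s


def quasi_static_limit(size, speed, c=C):
    """Distance scale c*r/v; the approximation needs R much smaller than this."""
    return c * size / speed


def is_quasi_static(distance, size, speed, c=C, margin=100.0):
    """True if distance * margin <= c*r/v (margin is an assumed safety factor)."""
    return distance * margin <= quasi_static_limit(size, speed, c)


# Assumed example: charges moving at 1e5 cm/s in a system 10 cm across.
limit = quasi_static_limit(size=10.0, speed=1.0e5)
near = is_quasi_static(distance=100.0, size=10.0, speed=1.0e5)
far = is_quasi_static(distance=1.0e6, size=10.0, speed=1.0e5)
print(limit, near, far)
```

With these assumed numbers the limiting distance is about 3·10⁶ cm, so a point 1 m away is well within the region of applicability, while a point 10 km away is not.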

Equations (14.1), (14.2) are called the equations of electrostatics,
and (14.3) and (14.4), the equations of magnetostatics.

Scalar potential in electrostatics. In order to satisfy equation (14.2),
we put

$$\mathbf{E} = -\nabla\varphi. \tag{14.5}$$

According to (12.29), $\varphi$ is the scalar potential. The equation for the
scalar potential is obtained from (14.1):

$$\operatorname{div}\operatorname{grad}\varphi = \Delta\varphi = -4\pi\rho, \tag{14.6}$$

which also follows from (12.38) if we equate to zero the nonstatic
term $\frac{1}{c^{2}}\frac{\partial^{2}\varphi}{\partial t^{2}}$.

Let us find the solution to equation (14.6) for a point charge, i.e.,
we put $\rho$ equal to zero everywhere except at one point of space.
Let us put the origin at this point. Then $\varphi$ can depend only on the
distance from the origin $r$.

In Sec. 11 an expression for the Laplacian $\Delta$ was obtained in
spherical coordinates (11.46). In the special case, when the required
function depends only on $r$, we obtain from (11.46)

$$\frac{1}{r^{2}}\frac{d}{dr}\left(r^{2}\frac{d\varphi}{dr}\right) = -4\pi\rho. \tag{14.7}$$

Let us integrate this equation between $r_1$ and $r_2$, first multiplying
it by $r^{2}$. Since the region of integration does not contain the origin,
where the point charge is situated, the integral of the right-hand
side becomes zero. Hence,

$$r_2^{2}\left(\frac{d\varphi}{dr}\right)_{r_2} = r_1^{2}\left(\frac{d\varphi}{dr}\right)_{r_1} = \text{const}.$$

Therefore the potential is

$$\varphi = -\frac{A}{r} + B.$$

The constant $B$ is equal to zero if we take the potential to be equal
to zero at an infinite distance away from the charge. Let us now
determine the constant $A$. For this, we integrate equation (14.6)
over a certain sphere surrounding the origin. Since the Laplacian
$\Delta\varphi$ is div grad $\varphi$, the volume integral can be transformed
into an integral over the surface of the sphere. This integral is

$$\int \operatorname{grad}\varphi\,d\mathbf{s} = \int \frac{d\varphi}{dr}\,r^{2}\,d\Omega = 4\pi A.$$

On the right-hand side we have

$$-\int 4\pi\rho\,dV = -4\pi e,$$

since the integration region includes the point where the charge is
situated. Thus $A = -e$.

The potential of the point charge is

$$\varphi = \frac{e}{r}. \tag{14.8}$$


We obtain the same thing for a spherically symmetrical
volume-charge distribution, if the potential is calculated outside the volume
occupied by the charge. In other words, the potential of a charged
sphere at all external points is the same as the potential of an equal
point charge situated at the centre of the sphere. A similar result
is, of course, obtained for the gravitational potential. This fact is
used in most astronomical problems, where celestial bodies are
considered as gravitating points.

If the origin does not coincide with the charge, and the charge
coordinates are $x, y, z$ (radius vector $\mathbf{r}$), then the potential at
point $X, Y, Z$ (radius vector $\mathbf{R}$) is

$$\varphi = \frac{e}{|\mathbf{R} - \mathbf{r}|} = \frac{e}{\sqrt{(X - x)^{2} + (Y - y)^{2} + (Z - z)^{2}}}. \tag{14.9}$$




The potential of a system of charges. The potential due to several
charges $e_1, e_2, e_3, \ldots, e_i, \ldots$, whose positions are given by the
radius vectors $\mathbf{r}^1, \mathbf{r}^2, \ldots, \mathbf{r}^i, \ldots$, at
the point $\mathbf{R}$ is

$$\varphi = \sum_i \frac{e_i}{|\mathbf{R} - \mathbf{r}^i|} = \sum_i \frac{e_i}{\sqrt{(X - x^i)^{2} + (Y - y^i)^{2} + (Z - z^i)^{2}}}.$$

Using the summation convention, any radicand in this formula
can be rewritten in the form

$$(X_\lambda - x^i_\lambda)(X_\lambda - x^i_\lambda).$$

But, in order to save space, we shall use the notation
$(X_\lambda - x^i_\lambda)^{2}$ instead of
$(X_\lambda - x^i_\lambda)(X_\lambda - x^i_\lambda)$. Then the potential due
to a system of point charges is written as

$$\varphi = \sum_i \frac{e_i}{\sqrt{(X_\lambda - x^i_\lambda)^{2}}}. \tag{14.10}$$

But we must remember that inside the brackets is a summation
for $\lambda$ from 1 to 3.

Note also that the potentials due to separate charges at the point
$\mathbf{R}$ are additive, since equation (14.6) is linear in $\varphi$. And so
the full solution, due to all the charges, is equal to the sum of all the
partial solutions for each charge separately.

The potential due to a charge system at a large distance. Let us
now assume that the origin is situated somewhere inside the region
occupied by the charges (for example, at the centre of the smallest
sphere embracing all charges), and that all the radius vectors $\mathbf{r}^i$
satisfy the inequalities

$$|\mathbf{R}| \gg |\mathbf{r}^i|. \tag{14.11}$$

In other words, we shall look for the potential of a system of charges
at a great distance from it. Then the function (14.10) can be expanded
in a Taylor’s series in terms of $x^i, y^i, z^i$. We shall perform the expansion
up to the quadratic term inclusively, but we shall first write it only
for one term of the sum over all the charges, omitting the index
$i$. The expansion is of the form:


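$$\left[(X-x)^2+(Y-y)^2+(Z-z)^2\right]^{-1/2} = \left[(X_\lambda - x_\lambda)^2\right]^{-1/2} = \left[X_\lambda^2\right]^{-1/2} - x_\mu \frac{\partial}{\partial X_\mu}\left[X_\lambda^2\right]^{-1/2} + \frac{1}{2}\, x_\mu x_\nu \frac{\partial^2}{\partial X_\mu\,\partial X_\nu}\left[X_\lambda^2\right]^{-1/2}. \tag{14.12}$$

The summation convention permits writing in concise form the Taylor
series for a function of several variables. Since $X_\lambda^2 = R^2$, we obtain
the expression for the first derivative

$$\frac{\partial}{\partial X_\mu}\left[X_\lambda^2\right]^{-1/2} = \frac{\partial}{\partial X_\mu}\frac{1}{R} = \frac{\partial R}{\partial X_\mu}\,\frac{d}{dR}\frac{1}{R} = -\frac{X_\mu}{R^3}, \tag{14.13}$$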




where we have used equation (11.36) which, in the notation of this
section, is of the form $\dfrac{\partial R}{\partial X_\mu} = \dfrac{X_\mu}{R}$.

Thus, the term in the sum (14.12), which is linear in $x_\mu$, is equal to

$$\frac{xX + yY + zZ}{R^3} = \frac{(\mathbf{r}\mathbf{R})}{R^3}. \tag{14.14}$$


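It is somewhat more difficult to calculate the term which is quadratic
in $x_\mu$. We first write down the second derivative:

$$\frac{\partial^2}{\partial X_\mu\,\partial X_\nu}\frac{1}{R} = -\frac{\partial}{\partial X_\nu}\frac{X_\mu}{R^3} = -\frac{1}{R^3}\frac{\partial X_\mu}{\partial X_\nu} - X_\mu\frac{\partial}{\partial X_\nu}\frac{1}{R^3}.$$

The partial derivative $\dfrac{\partial X_\mu}{\partial X_\nu}$ is equal to zero for $\mu \neq \nu$ and to 1 for
$\mu = \nu$. Further,

$$\frac{\partial}{\partial X_\nu}\frac{1}{R^3} = \frac{\partial R}{\partial X_\nu}\,\frac{d}{dR}\frac{1}{R^3} = -\frac{X_\nu}{R}\cdot\frac{3}{R^4} = -\frac{3X_\nu}{R^5},$$

by the rule for differentiation of composite functions.
Thus, we obtain

$$\frac{\partial^2}{\partial X_\mu\,\partial X_\nu}\frac{1}{R} = -\frac{1}{R^3}\frac{\partial X_\mu}{\partial X_\nu} + \frac{3X_\mu X_\nu}{R^5}.$$

Hence, the required expansion of $\dfrac{1}{|\mathbf{R}-\mathbf{r}|}$ is of the form

$$\frac{1}{|\mathbf{R}-\mathbf{r}|} = \frac{1}{R} + \frac{(\mathbf{r}\mathbf{R})}{R^3} + \frac{1}{2}\, x_\mu x_\nu\left(\frac{3X_\mu X_\nu}{R^5} - \frac{1}{R^3}\frac{\partial X_\mu}{\partial X_\nu}\right). \tag{14.15}$$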


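We shall now subtract from the quadratic term a quantity equal
to zero:

$$\frac{1}{2}\cdot\frac{r^2}{3}\,\frac{\partial}{\partial X_\mu}\frac{\partial}{\partial X_\mu}\frac{1}{R} = \frac{r^2}{6}\,\Delta\frac{1}{R}$$

[$\Delta\dfrac{1}{R} = 0$ for $R \neq 0$; cf. the point-charge potential (14.8)]. Then the last term
in (14.15), written in terms of components, is (taking advantage of the fact that
$\dfrac{\partial X}{\partial X} = \dfrac{\partial Y}{\partial Y} = \dfrac{\partial Z}{\partial Z} = 1$, $\dfrac{\partial X}{\partial Y} = \dfrac{\partial X}{\partial Z} = \dfrac{\partial Y}{\partial Z} = 0$)

$$\frac{1}{2}\left[x^2\left(\frac{3X^2}{R^5}-\frac{1}{R^3}\right) + y^2\left(\frac{3Y^2}{R^5}-\frac{1}{R^3}\right) + z^2\left(\frac{3Z^2}{R^5}-\frac{1}{R^3}\right) + 2xy\,\frac{3XY}{R^5} + 2xz\,\frac{3XZ}{R^5} + 2yz\,\frac{3YZ}{R^5} - \frac{r^2}{3}\left(\frac{3(X^2+Y^2+Z^2)}{R^5} - \frac{3}{R^3}\right)\right].$$

Here, it is quite obvious that a term equal to zero has been subtracted,
for $X^2+Y^2+Z^2 = R^2$. Rearranging the terms, we have

$$\frac{1}{2}\left[\left(x^2-\frac{r^2}{3}\right)\left(\frac{3X^2}{R^5}-\frac{1}{R^3}\right) + \left(y^2-\frac{r^2}{3}\right)\left(\frac{3Y^2}{R^5}-\frac{1}{R^3}\right) + \left(z^2-\frac{r^2}{3}\right)\left(\frac{3Z^2}{R^5}-\frac{1}{R^3}\right) + 2xy\,\frac{3XY}{R^5} + 2xz\,\frac{3XZ}{R^5} + 2yz\,\frac{3YZ}{R^5}\right].$$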




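The expansion (14.15) must be substituted in the equation for
potential (14.10) and summed over all the charges. We introduce
the following abbreviated notation:

$$\mathbf{d} = \sum_i e_i\mathbf{r}^i, \tag{14.16}$$

$$q_{xx} = \sum_i e_i\left((x^i)^2 - \frac{(r^i)^2}{3}\right), \quad q_{yy} = \sum_i e_i\left((y^i)^2 - \frac{(r^i)^2}{3}\right), \quad q_{zz} = \sum_i e_i\left((z^i)^2 - \frac{(r^i)^2}{3}\right), \tag{14.17}$$

$$q_{xy} = \sum_i e_i x^i y^i, \quad q_{xz} = \sum_i e_i x^i z^i, \quad q_{yz} = \sum_i e_i y^i z^i. \tag{14.18}$$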


The vector $\mathbf{d}$ (i.e., the three quantities $d_x, d_y, d_z$) and the six
quantities $q_{xx}, q_{yy}, q_{zz}, q_{xy}, q_{xz}, q_{yz}$ depend only on the charge
distribution in the system, and not on the place at which the potential
is determined. In the notation of (14.16)-(14.18), the potential
at large distances away from the system is of the form

$$\varphi = \frac{\sum_i e_i}{R} + \frac{(\mathbf{d}\mathbf{R})}{R^3} + \frac{1}{2}\, q_{\mu\nu}\left(\frac{3X_\mu X_\nu}{R^5} - \frac{1}{R^3}\frac{\partial X_\mu}{\partial X_\nu}\right), \tag{14.19}$$

with the terms of different indices of the type $q_{xy}$ actually appearing
twice in the summation (for example, $q_{12}$ and the equal term $q_{21}$).

The vector $\mathbf{d}$ is called the dipole moment of the charge system.
The six quantities $q$ are called the quadrupole moment components.
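
As an illustrative numerical check of the expansion (14.19), the following Python sketch compares the exact potential of a few point charges with the sum of the zero, dipole and quadrupole terms. The charge values, their positions and the observation point are arbitrary assumptions made only for this test, Gaussian units are assumed, and the function names are hypothetical.

```python
import math

def exact_potential(charges, R):
    """Exact potential: sum of e_i / |R - r_i| (Gaussian units)."""
    return sum(e / math.dist(R, r) for e, r in charges)

def multipole_potential(charges, R, with_quadrupole=True):
    """Zero, dipole and (optionally) quadrupole terms in the form of (14.19)."""
    Rn = math.sqrt(sum(X * X for X in R))
    total = sum(e for e, _ in charges)                          # total charge
    d = [sum(e * r[k] for e, r in charges) for k in range(3)]   # dipole moment
    phi = total / Rn + sum(d[k] * R[k] for k in range(3)) / Rn**3
    if with_quadrupole:
        for m in range(3):
            for n in range(3):
                delta = 1.0 if m == n else 0.0
                # q_mn = sum_i e_i (x_m x_n - delta_mn r^2 / 3)
                q_mn = sum(e * (r[m] * r[n] - delta * sum(c * c for c in r) / 3.0)
                           for e, r in charges)
                phi += 0.5 * q_mn * (3.0 * R[m] * R[n] / Rn**5 - delta / Rn**3)
    return phi

# Hypothetical test system: three charges near the origin, observed at |R| = 13.
charges = [(1.0, (0.1, 0.0, 0.2)), (-2.0, (0.0, -0.1, 0.1)), (1.5, (-0.2, 0.1, 0.0))]
R = (3.0, 4.0, 12.0)
exact = exact_potential(charges, R)
err_dipole = abs(exact - multipole_potential(charges, R, with_quadrupole=False))
err_quadrupole = abs(exact - multipole_potential(charges, R))
```

For this configuration the discrepancy with the quadrupole term retained stays below $2\cdot10^{-6}$, while without it the discrepancy exceeds $3\cdot10^{-6}$; the remaining error is due to the omitted terms of higher order.
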
The dipole moment. We shall now examine the expression obtained
for potential. The zero term $\dfrac{\sum_i e_i}{R}$ corresponds to the approximation
according to which all the charge is considered to be concentrated
at the origin. In other words, it corresponds to a substitution of the
entire charge system by a single point charge. This approximation
is clearly insufficient when the system is neutral, i.e., if $\sum_i e_i = 0$.
This case is very usual, since atoms and molecules are neutral (their
electronic charge balances the charge on the nuclei).



Let us assume that the total charge is equal to zero and then
consider the first term of the expansion, involving the dipole moment.
This term decreases like $\dfrac{1}{R^2}$, i.e., more rapidly than the potential
of the charged system. Besides, it is proportional to the cosine of
the angle between $\mathbf{d}$ and $\mathbf{R}$. The simplest thing is to produce a neutral
system by taking two equal and opposite charges. Such a system
is called a dipole. Its moment is

$$\mathbf{d} = \sum_i e_i\mathbf{r}^i = e\,(\mathbf{r}^{+} - \mathbf{r}^{-}) \tag{14.20}$$

in accordance with the definition (used in general courses of physics)
that the dipole moment is the product of charge by the vector joining
the positive and negative charges.

It can be seen from equation (14.20) that the definition of dipole 
moment does not depend on the choice of coordinate origin, since 
it involves only the relative position of charges. We shall show that 
the dipole moment always possesses this property. 

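Indeed, if we displace the origin through some distance $\mathbf{a}$, then
the radius vectors of all the charges change thus:

$$\mathbf{r}^i = \mathbf{r}'^i + \mathbf{a}.$$

Substituting this in the expression for the dipole moment, we obtain

$$\mathbf{d} = \sum_i e_i\mathbf{r}^i = \sum_i e_i\mathbf{r}'^i + \mathbf{a}\sum_i e_i = \sum_i e_i\mathbf{r}'^i, \tag{14.21}$$

because $\sum_i e_i = 0$.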
But if the system is not neutral, then we choose $\mathbf{a}$ in the following
manner:

$$\mathbf{a} = \frac{\sum_i e_i\mathbf{r}^i}{\sum_i e_i}. \tag{14.22}$$

This choice is analogous to the choice of centre of mass for a system
of masses. Thus, we can say that the vector $\mathbf{a}$ determines the electrical
centre of a system of charges. For a neutral system it is impossible
to determine $\mathbf{a}$, since the denominator of (14.22) is equal to
zero. If for a charged system we choose $\mathbf{a}$ according to (14.22), then
$\sum_i e_i\mathbf{r}'^i = 0$, i.e., the dipole moment of a charged system, relative
to its electric centre, is equal to zero.

We thus have the following alternatives: either the system is neutral,
and then the expansion (14.19) begins with a dipole term independent
of the choice of coordinate origin, or the system is charged, and then
the dipole term in the expansion is equal to zero for a corresponding
choice of origin.
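
Both alternatives are easy to verify numerically. The sketch below is only an illustration with assumed charge values and positions and with hypothetical helper names: it computes the dipole moment of a neutral pair about two different origins, and the dipole moment of a charged system about its electric centre (14.22).

```python
def dipole_moment(charges, origin=(0.0, 0.0, 0.0)):
    """d = sum_i e_i (r_i - origin), i.e. the dipole moment about a given origin."""
    return tuple(sum(e * (r[k] - origin[k]) for e, r in charges) for k in range(3))

def electric_centre(charges):
    """a = sum_i e_i r_i / sum_i e_i, cf. (14.22); undefined for a neutral system."""
    total = sum(e for e, _ in charges)
    if total == 0:
        raise ValueError("neutral system: the electric centre is undefined")
    return tuple(sum(e * r[k] for e, r in charges) / total for k in range(3))

# Assumed example systems (arbitrary units).
neutral = [(1.0, (0.5, 0.0, 0.0)), (-1.0, (-0.5, 0.0, 0.0))]
charged = [(2.0, (1.0, 0.0, 0.0)), (1.0, (0.0, 3.0, 0.0)), (-1.0, (0.0, 0.0, 2.0))]

d_neutral_a = dipole_moment(neutral)                            # origin at (0, 0, 0)
d_neutral_b = dipole_moment(neutral, origin=(7.0, -2.0, 3.0))   # displaced origin
d_charged_centre = dipole_moment(charged, origin=electric_centre(charged))
```

The two values for the neutral pair coincide, and the moment of the charged system about its electric centre vanishes, as stated above.
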

Quadrupole moment. In the expansion (14.19), we now consider
the second term containing the quadrupole moment. A quadrupole
is a system of two dipoles of moment $\mathbf{d}$, which are equal in magnitude
and opposite in direction. It is clear that a potential expansion for
such a system will have neither a zero nor first term, so that equation
(14.19) contains only a second term on the right-hand side. The simplest
quadrupole can be formed by placing four charges at the vertices
of a parallelogram, where the charges are of equal magnitude but
with pairs of charges having opposite signs. The charges alternate
when we traverse the vertices of the parallelogram. Such a system
is neutral. However, a charged system, too, can have a quadrupole
moment. It indicates to what extent the charge distribution in the
system differs from spherical symmetry. Indeed, in this section it
was shown that the potential due to a spherically symmetrical charge
system decreases in strict accordance with a $\dfrac{1}{R}$ law, and the potential
due to a quadrupole follows a $\dfrac{1}{R^3}$ law. For this reason, the quadrupole
term in the potential expansion can arise only in the case of a nonspherical
charge distribution.

The principal axes of a quadrupole moment. Let us now determine
in what sense the quadrupole moment characterizes a nonspherical
distribution. In equation (14.22), an analogy was established between
the centre of inertia of a mass system and the electric centre of a
system of charges. In a similar way, equations (14.17), (14.18) allow
us to establish a certain correspondence between the components
of quadrupole moment and moments of inertia of a system of masses
$J_{xx}, \ldots, J_{yz}$ (see Sec. 9), defined in equations (9.3).

We can disregard the fact that a summation appears in equations
(14.17) and (14.18), while (9.3) involves integration. This difference
will not exist if we take a continuous charge distribution or a discrete
mass distribution (as for nuclei in a molecule). We shall take
the latter. In addition, we shall forget for a moment that the components
of the moment of inertia involve masses and not charges.
Then the relationship between the quadrupole moment and the
moment of inertia is of the form:

$$\begin{aligned}
q_{xx} &\cong -J_{xx} + \tfrac{1}{3}\,(J_{xx} + J_{yy} + J_{zz}), &\qquad q_{xy} &\cong -J_{xy},\\
q_{yy} &\cong -J_{yy} + \tfrac{1}{3}\,(J_{xx} + J_{yy} + J_{zz}), &\qquad q_{xz} &\cong -J_{xz},\\
q_{zz} &\cong -J_{zz} + \tfrac{1}{3}\,(J_{xx} + J_{yy} + J_{zz}), &\qquad q_{yz} &\cong -J_{yz}.
\end{aligned}$$




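The sign $\cong$ (a tilde above the equality symbol) indicates correspondence
between terms. Indeed, according to (9.3) the first line gives

$$-\sum_i e_i\left((y^i)^2 + (z^i)^2\right) + \frac{2}{3}\sum_i e_i\,(r^i)^2 = \sum_i e_i\left((x^i)^2 - \frac{(r^i)^2}{3}\right) = q_{xx}.$$

The relations in the second column are obvious.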

In Sec. 9 it was shown that moments of inertia can be reduced
to principal axes, i.e., a coordinate system can be found for which
the products of inertia are zero. But since the relations between $q$
and $J$ are true for any coordinate system, the components of the
quadrupole moment with different indices also become zero in these
same principal axes. In the principal axes, the quadrupole moment
is expressed, in terms of moments of inertia, as

$$q_1 \cong \tfrac{1}{3}\,(J_2 + J_3 - 2J_1), \qquad q_2 \cong \tfrac{1}{3}\,(J_1 + J_3 - 2J_2), \qquad q_3 \cong \tfrac{1}{3}\,(J_1 + J_2 - 2J_3). \tag{14.23}$$


If the system possesses spherical symmetry, then $J_1 = J_2 = J_3$,
so that $q_1 = q_2 = q_3 = 0$. Therefore, the presence of a quadrupole
moment in a system of charges indicates that the charge distribution
is not spherically symmetrical. However, a reverse assertion would
not be true; if the quadrupole moment is equal to zero, the system
of charges need not be spherically symmetrical. It will then be necessary,
in expansion (14.19), to take into account terms of higher order.

It will be noted that from (14.23) there follows directly the identity
$q_1 + q_2 + q_3 = 0$, i.e., only two of the three principal components of
a quadrupole moment are independent.

The relations (14.23) should be regarded literally if we are talking
about a gravitational potential. We know that the earth is not strictly
spherical, but is flattened at the poles. Therefore, the earth’s gravitational
force contains terms which are not governed by an inverse
square law. This affects the motion of the moon, and all the more
so that of artificial satellites moving closer to the earth.

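Quadrupole moment when axes of symmetry exist. Equations (14.23)
become simpler if two moments of inertia of a body are equal, for
example, $J_1 = J_2$. Then

$$q_1 \cong \tfrac{1}{3}\,(J_3 - J_1) \equiv -\frac{q}{2}, \qquad q_2 \cong \tfrac{1}{3}\,(J_3 - J_1) \equiv -\frac{q}{2}, \qquad q_3 \cong \tfrac{2}{3}\,(J_1 - J_3) \equiv q.$$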




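In this case, the quadrupole moment has only one independent
component $q$. Its sign is called the sign of the quadrupole moment.
The quantity $q = \sum_i e_i\left((z^i)^2 - \dfrac{(r^i)^2}{3}\right)$.

If the charges were distributed with spherical symmetry, we
would have the equality $\sum_i e_i (r^i)^2 = 3\sum_i e_i (z^i)^2$, for then
$\sum_i e_i (x^i)^2 = \sum_i e_i (y^i)^2 = \sum_i e_i (z^i)^2$, and $q$ would be equal to zero.

The positive sign of $q$ shows that $\sum_i e_i (z^i)^2 > \dfrac{1}{3}\sum_i e_i (r^i)^2$, i.e., it indicates
a charge distribution extending along the $z$-axis. From (14.19), the
potential due to such a quadrupole with one component $q$ is

$$\varphi = \frac{1}{2}\left[-\frac{q}{2}\left(\frac{3X^2}{R^5} - \frac{1}{R^3}\right) - \frac{q}{2}\left(\frac{3Y^2}{R^5} - \frac{1}{R^3}\right) + q\left(\frac{3Z^2}{R^5} - \frac{1}{R^3}\right)\right] = -\frac{3}{4}\,q\,\frac{X^2 + Y^2 - 2Z^2}{R^5} = -\frac{3}{4}\,q\,\frac{R^2 - 3Z^2}{R^5} = -\frac{3q}{4R^3}\,(1 - 3\cos^2\vartheta). \tag{14.24}$$

Thus, the potential of a quadrupole depends on the angle $\vartheta$ according
to the law $1 - 3\cos^2\vartheta$, where $\vartheta$ is the angle between the axis of
symmetry of the quadrupole and the radius vector of the point
at which the potential is determined.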
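
The angular law can be illustrated on the simplest axially symmetric case. The sketch below is an assumption made for illustration and is not taken from the text: charges $+e$ at $z = \pm a$ and $-2e$ at the origin, for which $q = \sum_i e_i\left((z^i)^2 - (r^i)^2/3\right) = \tfrac{4}{3}ea^2$. It compares the exact potential with the expression $-\dfrac{3q}{4R^3}(1 - 3\cos^2\vartheta)$; the function names and numerical values are hypothetical.

```python
import math

def exact_phi(R, theta, e=1.0, a=0.01):
    """Exact potential of charges +e at z = +a and z = -a and -2e at the origin."""
    X, Z = R * math.sin(theta), R * math.cos(theta)
    return e / math.hypot(X, Z - a) + e / math.hypot(X, Z + a) - 2.0 * e / R

def quadrupole_phi(R, theta, e=1.0, a=0.01):
    """Quadrupole term -(3q / 4R^3)(1 - 3 cos^2 theta) for the same system."""
    q = 2.0 * e * (a * a - a * a / 3.0)   # q = sum_i e_i (z_i^2 - r_i^2 / 3)
    return -3.0 * q / (4.0 * R**3) * (1.0 - 3.0 * math.cos(theta) ** 2)

# Compare the two expressions at several angles for R = 1 (so a / R = 0.01).
angles = [0.0, 0.4, math.pi / 4, 1.0, math.pi / 2]
max_diff = max(abs(exact_phi(1.0, th) - quadrupole_phi(1.0, th)) for th in angles)
```

Since the system is neutral and has no dipole moment, the difference between the two expressions comes only from terms of order $ea^4/R^5$ and higher.
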

Similar deviations from spherical symmetry have been found in 
the electrostatic potential of many nuclei. The quadrupole moments 
of nuclei give us an insight into their structure. 

The energy of a system of charges in an electrostatic field. We
shall now calculate the energy of a system of charges in an external
electric field. The potential energy of a charge in a field is equal
to $U = e\varphi$, because the force acting on the charge is equal to
$\mathbf{F} = -\nabla U = -e\nabla\varphi = e\mathbf{E}$. The energy of a system of charges is thus

$$U = \sum_i e_i\,\varphi(\mathbf{r}^i), \tag{14.25}$$

where $\mathbf{r}^i$ is the radius vector for the $i$th charge.

Let us suppose that the field does not change much over the space
occupied by the charges, so that the potential at the site of the $i$th
charge can be expanded in a Taylor’s series:

$$\varphi(\mathbf{r}) = \varphi(0) + x_\mu\left(\frac{\partial\varphi}{\partial x_\mu}\right)_0 + \frac{1}{2}\, x_\mu x_\nu\left(\frac{\partial^2\varphi}{\partial x_\mu\,\partial x_\nu}\right)_0. \tag{14.26}$$




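We shall transform the last term in the same way as in the expansion
(14.15); taking advantage of the fact that $\varphi$ is the potential of the
external field (and not the field produced by the given charges),
so that $\Delta\varphi = 0$, we subtract from $\varphi$ the quantity $\dfrac{r^2}{6}\Delta\varphi$ equal to zero.
Then, after summation over $i$, we obtain (all derivatives being taken at the origin)

$$U = \varphi(0)\sum_i e_i - (\mathbf{d}\mathbf{E}_0) + \frac{1}{2}\left(q_{xx}\frac{\partial^2\varphi}{\partial x^2} + q_{yy}\frac{\partial^2\varphi}{\partial y^2} + q_{zz}\frac{\partial^2\varphi}{\partial z^2} + 2q_{xy}\frac{\partial^2\varphi}{\partial x\,\partial y} + 2q_{xz}\frac{\partial^2\varphi}{\partial x\,\partial z} + 2q_{yz}\frac{\partial^2\varphi}{\partial y\,\partial z}\right) = \varphi(0)\sum_i e_i - (\mathbf{d}\mathbf{E}_0) + \frac{1}{2}\, q_{\mu\nu}\frac{\partial^2\varphi}{\partial x_\mu\,\partial x_\nu}. \tag{14.27}$$

Here, the value of the field $-(\nabla\varphi)_0 = \mathbf{E}_0$ at the origin has been
substituted into the term involving the dipole moment. Relating
equation (14.27) to the principal axes of the quadrupole moment, we get

$$U = \varphi(0)\sum_i e_i - (\mathbf{d}\mathbf{E}_0) + \frac{1}{2}\left(q_1\frac{\partial^2\varphi}{\partial x^2} + q_2\frac{\partial^2\varphi}{\partial y^2} + q_3\frac{\partial^2\varphi}{\partial z^2}\right). \tag{14.28}$$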

In the case of a neutral system, the term involving dipole moment
is especially important. The quadrupole term accounts for the extension
of the system, since it involves field derivatives. It is interesting
to note that if the system is spherically symmetrical, i.e., if it has
a quadrupole moment equal to zero, there is no correction for finite
dimensions. Higher order corrections are also absent, so that the
potential energy will always depend only on the value of the potential
at the centre. This is why spherical bodies not only attract, but
are also attracted, as points. Of course, these assertions are mutually
related by Newton’s Third Law which holds for electrostatics, since
the field is determined by the instantaneous configuration of charges.


Exercises 

1) Show that the mean value of the potential over a spherical surface is
equal to its value at the centre of the sphere, if the equation $\Delta\varphi = 0$ is satisfied
over the whole volume of the sphere. Relate this to the result obtained for
the potential energy of a spherically symmetrical system of charges in an
external field.

The potential should be expanded in a series in powers of the radius
of the sphere. In integration over the surface, all the terms containing $x$, $y$,
and $z$ an odd number of times become zero. The terms containing $x$, $y$, and $z$
an even number of times can be rearranged so that they are proportional
to $\Delta\varphi$, $\Delta\Delta\varphi$, and so on. There remains only the zero term of the expansion,
which proves the theorem.

2) Calculate the electric field of a dipole. 




Sec. 15. The Magnetostatics of Point Charges

The equations of magnetostatics. In the previous section it was shown
that if the velocities of the charges are small in comparison with
the velocity of light, then the magnetic field satisfies the following
system of equations:

$$\operatorname{div}\mathbf{H} = 0, \tag{15.1}$$

$$\operatorname{rot}\mathbf{H} = \frac{4\pi}{c}\,\mathbf{j}. \tag{15.2}$$

They are called the equations of magnetostatics. From equation
(15.2) it follows that

$$\operatorname{div}\operatorname{rot}\mathbf{H} = \frac{4\pi}{c}\operatorname{div}\mathbf{j} = 0. \tag{15.3}$$

Thus, for (15.2) to make strict sense, the currents must be subjected
to the condition $\operatorname{div}\mathbf{j} = 0$. But this condition is not directly satisfied
for point charges, and only the charge conservation equation (12.18)
holds.

Mean values. The condition $\operatorname{div}\mathbf{j} = 0$ for moving point charges
can only be satisfied for an average over some interval of time $t_0$.
We shall define the mean value of a certain function of coordinates
and velocities of charges $f(\mathbf{r}, \mathbf{v})$ in the following way:

$$\bar{f} = \frac{1}{t_0}\int_0^{t_0} f(\mathbf{r}, \mathbf{v})\,dt. \tag{15.4}$$

This averaging operation is commutative with differentiation of
the function with respect to coordinates, since it is performed with
respect to a fixed frame of reference and not to the coordinates of
the charges.

Let us now average equation (12.28):

$$\frac{1}{t_0}\int_0^{t_0}\operatorname{div}\mathbf{j}\,dt = \operatorname{div}\bar{\mathbf{j}} = -\frac{1}{t_0}\int_0^{t_0}\frac{\partial\rho}{\partial t}\,dt.$$

Integration of the right-hand side can be performed directly:

$$-\frac{1}{t_0}\int_0^{t_0}\frac{\partial\rho}{\partial t}\,dt = -\frac{\rho(t_0) - \rho(0)}{t_0}.$$

Let us now assume that the difference $\rho(t_0) - \rho(0)$ increases
more slowly than the time interval $t_0$ itself. Then, if we choose $t_0$
sufficiently large, the ratio $\dfrac{\rho(t_0) - \rho(0)}{t_0}$ can be as small as required.
Because of this, the mean value of the current may indeed satisfy
the equation



$$\operatorname{div}\bar{\mathbf{j}} = 0. \tag{15.5}$$

Consequently, equation (15.2) and all subsequent equations in
this section must be regarded as mean with respect to time; this
will be denoted by a bar over each quantity relating to the motion
of charges (no bar will be put over $\mathbf{H}$).

The definition of steady motion. Let us assume that the condition
$\lim\limits_{t_0\to\infty}\dfrac{f(t_0) - f(0)}{t_0} = 0$ is satisfied not only for charge density, but also for
any function relating to the motion of charges. Such motion is termed
stationary or steady.

A special case of stationary motion is periodic motion, for example,
cyclic motion. However, for a stationary state it is sufficient that the
charges remain all the time in a limited region of space, for the
difference $f(t_0) - f(0)$ then remains finite.

The equations of this section will relate to the stationary motion
of point charges.

The equations for vector potential. In order to satisfy equation (15.1),
we put, as in Sec. 12 [see (12.8)],

$$\mathbf{H} = \operatorname{rot}\mathbf{A}, \tag{15.6}$$

where $\mathbf{A}$ is the vector potential. Equation (15.6) does not fully determine
$\mathbf{A}$ since, if we add to $\mathbf{A}$ the gradient of any arbitrary function $f$,
as in (12.30), the expression for $\mathbf{H}$ will not change. Thus, an additional
condition must be imposed on $\mathbf{A}$. The Lorentz condition suggests
that we must have

$$\operatorname{div}\mathbf{A} = 0. \tag{15.7}$$

Then, substituting (15.6) in (15.2), we obtain

$$\operatorname{rot}\mathbf{H} = \operatorname{rot}\operatorname{rot}\mathbf{A} = \frac{4\pi}{c}\,\bar{\mathbf{j}}. \tag{15.8}$$

But according to (11.42)

$$\operatorname{rot}\operatorname{rot}\mathbf{A} = \operatorname{grad}\operatorname{div}\mathbf{A} - \Delta\mathbf{A} = -\Delta\mathbf{A}, \tag{15.9}$$

where we have used condition (15.7). Therefore, $\mathbf{A}$ satisfies the equation

$$\Delta\mathbf{A} = -\frac{4\pi}{c}\,\bar{\mathbf{j}}, \tag{15.10}$$

which is entirely analogous to equation (14.6) for the scalar potential.
Equation (15.10) can be obtained from (12.37) directly if we discard
the term $\dfrac{1}{c^2}\dfrac{\partial^2\mathbf{A}}{\partial t^2}$ which is superfluous in magnetostatics.

The vector potential for a point charge. The solution of (15.10)
appears exactly the same as the solution of (14.6) given by equation
(14.8) for a separate point charge: each component of $\mathbf{A}$ satisfies



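equation (14.6), the only difference being that on the right-hand
side there appear the functions $-\dfrac{4\pi}{c}\,\overline{\rho v_x}$, $-\dfrac{4\pi}{c}\,\overline{\rho v_y}$, $-\dfrac{4\pi}{c}\,\overline{\rho v_z}$. It follows
that the vector potential for a point charge is

$$\mathbf{A} = \overline{\frac{e\mathbf{v}}{c\,|\mathbf{R}-\mathbf{r}|}}. \tag{15.11}$$

We shall now show that $\mathbf{A}$ satisfies condition (15.7). The divergence
must be taken with respect to the radius vector $\mathbf{R}$ at the point at
which $\mathbf{A}$ is determined. But $\nabla_{\mathbf{R}}\dfrac{1}{|\mathbf{R}-\mathbf{r}|} = -\nabla_{\mathbf{r}}\dfrac{1}{|\mathbf{R}-\mathbf{r}|}$, so that

$$\operatorname{div}\mathbf{A} = -\frac{e}{c}\,\overline{\mathbf{v}\,\nabla_{\mathbf{r}}\frac{1}{|\mathbf{R}-\mathbf{r}|}} = -\frac{e}{c}\,\overline{\frac{d}{dt}\frac{1}{|\mathbf{R}-\mathbf{r}|}}. \tag{15.12}$$

The expression on the right-hand side of this equation is the mean of the total
time derivative of the quantity $\dfrac{1}{|\mathbf{R}-\mathbf{r}|}$. From the steady-state
condition, it is equal to zero.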

The Biot-Savart law. We shall now calculate the magnetic field
of a point charge. Using equation (11.28), we obtain

$$\mathbf{H} = \operatorname{rot}\mathbf{A} = \frac{e}{c}\,\overline{\frac{[\mathbf{v},\ \mathbf{R}-\mathbf{r}]}{|\mathbf{R}-\mathbf{r}|^3}}.$$

This equation refers, of course, only to steady motion. In particular,
it is applied to constant current.
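
The expression under the averaging bar is easily evaluated for a given instantaneous position and velocity. The following Python sketch does this for one assumed configuration; it is an illustration in Gaussian units, the time averaging over the steady motion is not performed, and the function names and numbers are hypothetical.

```python
import math

def cross(a, b):
    """Vector product [a b] of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def point_charge_field(e, v, r, R, c=2.998e10):
    """Instantaneous value of (e/c) [v, R - r] / |R - r|^3 for a slow charge."""
    s = tuple(R[k] - r[k] for k in range(3))      # vector from charge to field point
    dist = math.sqrt(sum(x * x for x in s))
    return tuple(e * x / (c * dist**3) for x in cross(v, s))

# Assumed example in units with e = c = 1: charge at the origin moving along z,
# field point on the x-axis at distance 3.
H = point_charge_field(e=1.0, v=(0.0, 0.0, 2.0), r=(0.0, 0.0, 0.0),
                       R=(3.0, 0.0, 0.0), c=1.0)
```

In this example the field is directed along the $y$-axis, perpendicular both to the velocity and to the line joining the charge and the field point, and its magnitude is $ev/cd^2 = 2/9$.
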

Vector potential at large distances from a system of stationary
currents. The vector potential for a system of point charges is equal
to the sum of vector potentials for each charge separately:

$$\mathbf{A} = \sum_i\overline{\frac{e_i\mathbf{v}^i}{c\,|\mathbf{R}-\mathbf{r}^i|}}. \tag{15.13}$$

We shall now obtain approximate formulae which are valid at large
distances from the system, similar to those obtained in electrostatics.
For this, we substitute an expansion of inverse distance in (15.13)
[see (14.14)]:

$$\frac{1}{|\mathbf{R}-\mathbf{r}^i|} \approx \frac{1}{R} + \frac{(\mathbf{R}\mathbf{r}^i)}{R^3}. \tag{15.14}$$

The quadratic term is not taken into account this time. The expression
for vector potential, to the approximation of (15.14), is
of the form




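$$\mathbf{A} = \frac{1}{cR}\sum_i e_i\,\overline{\mathbf{v}^i} + \frac{1}{cR^3}\sum_i e_i\,\overline{(\mathbf{R}\mathbf{r}^i)\,\mathbf{v}^i} = \frac{1}{cR}\,\overline{\frac{d}{dt}\sum_i e_i\mathbf{r}^i} + \frac{1}{cR^3}\sum_i e_i\,\overline{(\mathbf{R}\mathbf{r}^i)\,\dot{\mathbf{r}}^i}, \tag{15.15}$$

since $\dot{\mathbf{r}}^i = \mathbf{v}^i$.

The zero term of the expansion is a total derivative and, by the
steady-state condition, is equal to zero. We now transform the first
term of the expansion, using the identity

$$0 = \overline{\frac{d}{dt}\sum_i e_i\,(\mathbf{R}\mathbf{r}^i)\,\mathbf{r}^i} = \sum_i e_i\,\overline{(\mathbf{R}\mathbf{v}^i)\,\mathbf{r}^i} + \sum_i e_i\,\overline{(\mathbf{R}\mathbf{r}^i)\,\mathbf{v}^i}. \tag{15.16}$$

From this identity it follows that in (15.15) we can substitute
half the difference of the expressions on the right-hand side of (15.16).
Then the vector potential will be

$$\mathbf{A} = \frac{1}{2cR^3}\sum_i e_i\,\overline{\left[(\mathbf{R}\mathbf{r}^i)\,\mathbf{v}^i - (\mathbf{R}\mathbf{v}^i)\,\mathbf{r}^i\right]} = -\frac{1}{2cR^3}\sum_i e_i\,\overline{[\mathbf{R}\,[\mathbf{r}^i\mathbf{v}^i]]}. \tag{15.17}$$

We now interchange the signs of the summation and vector product:

$$\mathbf{A} = -\frac{1}{R^3}\left[\mathbf{R},\ \sum_i\frac{e_i\,\overline{[\mathbf{r}^i\mathbf{v}^i]}}{2c}\right]. \tag{15.18}$$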

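Magnetic moment. The sum appearing inside the brackets in (15.18)
is called the magnetic moment of the system of charges (or system
of currents). The mean magnetic moment is written thus:

$$\boldsymbol{\mu} = \sum_i\frac{e_i\,\overline{[\mathbf{r}^i\mathbf{v}^i]}}{2c}. \tag{15.19}$$

The equation for the vector potential (15.18) can be written, by
means of the magnetic moment, in the following form:

$$\mathbf{A} = \frac{[\boldsymbol{\mu}\mathbf{R}]}{R^3}. \tag{15.20}$$

The field of a magnetic dipole. Let us now calculate the magnetic
field. By definition

$$\mathbf{H} = \operatorname{rot}\mathbf{A} = \operatorname{rot}\frac{[\boldsymbol{\mu}\mathbf{R}]}{R^3}.$$

Since $\boldsymbol{\mu}$ is a constant vector, equation (11.30) gives

$$\mathbf{H} = -(\boldsymbol{\mu}\nabla)\frac{\mathbf{R}}{R^3} - \boldsymbol{\mu}\,\Delta\frac{1}{R} = -(\boldsymbol{\mu}\nabla)\frac{\mathbf{R}}{R^3},$$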




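because $\Delta\dfrac{1}{R} = 0$. Further, $(\boldsymbol{\mu}\nabla)\mathbf{R} = \boldsymbol{\mu}$ [see (11.36)]. In order to calculate
$(\boldsymbol{\mu}\nabla)\dfrac{1}{R^3}$ we use equation (11.36). This yields

$$(\boldsymbol{\mu}\nabla)\frac{1}{R^3} = \left(\boldsymbol{\mu}\,\nabla\frac{1}{R^3}\right) = -\frac{3}{R^4}\,(\boldsymbol{\mu}\,\nabla R) = -\frac{3(\boldsymbol{\mu}\mathbf{R})}{R^5}.$$

Finally, collecting both terms, we arrive at an equation for $\mathbf{H}$:

$$\mathbf{H} = \frac{3\mathbf{R}(\mathbf{R}\boldsymbol{\mu}) - R^2\boldsymbol{\mu}}{R^5}. \tag{15.21}$$

For comparison, we deduce the expression for the electric field
of a dipole:

$$\mathbf{E} = -\nabla\varphi = -\nabla\frac{(\mathbf{R}\mathbf{d})}{R^3} = \frac{3\mathbf{R}(\mathbf{R}\mathbf{d}) - R^2\mathbf{d}}{R^5}. \tag{15.22}$$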


Thus, both expressions for the field (both electric and magnetic)
are of entirely analogous form. The only difference is that, instead
of the electric moment, the equation for magnetic field involves
the magnetic moment. This explains its name.
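
The dipole-field formula can be checked against a direct numerical differentiation of the dipole potential $(\mathbf{d}\mathbf{R})/R^3$. The following Python sketch is such a check under assumed values of the moment and of the field point; the function names are hypothetical, and the same routine gives the magnetic field if $\mathbf{d}$ is replaced by $\boldsymbol{\mu}$.

```python
import math

def dipole_potential(d, R):
    """Dipole term of the potential, (d R) / R^3."""
    Rn = math.sqrt(sum(x * x for x in R))
    return sum(d[k] * R[k] for k in range(3)) / Rn**3

def dipole_field(d, R):
    """Closed form (3 R (R d) - R^2 d) / R^5 for the field of a dipole."""
    Rn2 = sum(x * x for x in R)
    Rd = sum(d[k] * R[k] for k in range(3))
    return tuple((3.0 * R[k] * Rd - Rn2 * d[k]) / Rn2**2.5 for k in range(3))

def minus_gradient(d, R, h=1e-5):
    """E = -grad(phi), with the gradient taken by central differences."""
    E = []
    for k in range(3):
        Rp, Rm = list(R), list(R)
        Rp[k] += h
        Rm[k] -= h
        E.append(-(dipole_potential(d, Rp) - dipole_potential(d, Rm)) / (2.0 * h))
    return tuple(E)

# Assumed test values (arbitrary units).
d = (0.3, -0.2, 1.0)
R = (1.0, 2.0, -1.5)
E_formula = dipole_field(d, R)
E_numeric = minus_gradient(d, R)
max_diff = max(abs(E_formula[k] - E_numeric[k]) for k in range(3))
```

The two results agree within the accuracy of the finite-difference derivative.
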

In the case of a charge moving in a flat closed orbit, the definition
of magnetic moment (15.19) coincides with the elementary definition
of moment in terms of “magnetic shell.” As was shown in Sec. 5
[see (5.2), (5.4)], the product $[\mathbf{r}\mathbf{v}]$ is twice the area swept out by the
radius vector of the charge in unit time, or $[\mathbf{r}\mathbf{v}] = 2\dfrac{d\mathbf{S}}{dt}$. By definition
of the mean value (15.4)

$$\boldsymbol{\mu} = \frac{1}{t_0}\int_0^{t_0}\frac{e}{2c}\,[\mathbf{r}\mathbf{v}]\,dt = \frac{1}{t_0}\int_0^{t_0}\frac{e}{c}\,\frac{d\mathbf{S}}{dt}\,dt = \frac{e\mathbf{S}}{ct_0}. \tag{15.23}$$

Here, $t_0$ is the time of orbital revolution of the charge. In this time,
the charge passes every point on the orbit once; hence the mean
current is equal to $I = \dfrac{e}{t_0}$. This yields the definition of magnetic
moment familiar from general physics:

$$\boldsymbol{\mu} = \frac{I\mathbf{S}}{c}. \tag{15.24}$$

The similarity of equations (15.21) and (15.22) shows the equivalence
of a closed current (i.e., a magnetic shell) and a fictitious dipole
with the same moment $\boldsymbol{\mu}$. The field at large distances from a system
of currents is produced, as it were, by the effective dipole.
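
The agreement between the averaged definition of the magnetic moment and the elementary expression $IS/c$ can be illustrated numerically. The sketch below assumes, purely for illustration, a charge moving on an ellipse $x = a\cos\omega t$, $y = b\sin\omega t$; the parameter values and the function name are hypothetical.

```python
import math

def mean_moment_z(e, c, a, b, period, steps=1000):
    """Time average of (e / 2c) [r v]_z over one revolution on an assumed ellipse."""
    omega = 2.0 * math.pi / period
    total = 0.0
    for n in range(steps):
        t = period * n / steps
        x, y = a * math.cos(omega * t), b * math.sin(omega * t)
        vx, vy = -a * omega * math.sin(omega * t), b * omega * math.cos(omega * t)
        total += e / (2.0 * c) * (x * vy - y * vx)   # z-component of (e/2c)[r v]
    return total / steps

# Assumed values in arbitrary units.
e, c, a, b, period = 1.0, 1.0, 2.0, 0.5, 5.0
mu = mean_moment_z(e, c, a, b, period)
current = e / period          # mean current I = e / t_0
area = math.pi * a * b        # area S enclosed by the orbit
```

Here `mu` coincides with `current * area / c`, i.e., with $IS/c$.
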

The relationship between magnetic and mechanical moments. An 
especially interesting case is that when all the charges of a system 
are of the same kind (for example, when they are all electrons). 




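Then the magnetic moment is proportional to the mechanical moment.
Indeed, for a system of charges with identical ratios $\dfrac{e}{m}$, we obtain

$$\boldsymbol{\mu} = \frac{1}{2c}\sum_i e_i\,\overline{[\mathbf{r}^i\mathbf{v}^i]} = \frac{e}{2mc}\sum_i\overline{[\mathbf{r}^i,\ m\mathbf{v}^i]} = \frac{e}{2mc}\sum_i\overline{[\mathbf{r}^i\mathbf{p}^i]} = \frac{e}{2mc}\,\bar{\mathbf{M}}. \tag{15.25}$$

Equation (15.25) has very important applications.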

A system of point charges in an external magnetic field. We now
consider the question of the interaction of a system of currents with
an external magnetic field. For this we must have an equation
describing the interaction of a point charge with the field. We obtained
equation (13.17) for the general spatial charge distribution. In this
equation, the transition to point charges is obtained by changing
the integral to a summation over the charges. The term obtained
for action is of the form

$$S_1 = \int\sum_i\left(\frac{e_i}{c}\,\mathbf{A}_i\mathbf{v}^i - e_i\varphi_i\right)dt, \tag{15.26}$$

where the indices $i$ in $\mathbf{A}$ and $\varphi$ denote that the potentials are taken
at the same point as the $i$th charge.

In magnetostatics, only slowly moving charges for which $v \ll c$
are studied. Newtonian mechanics can then be applied to their motion.
In the absence of a field, the action function of the particles is of the
form

$$S_0 = \int\sum_i\frac{m_i\,(v^i)^2}{2}\,dt. \tag{15.27}$$

It will be shown in Sec. 21 that this expression holds only when $v \ll c$.
In magnetostatics, where this condition is satisfied, the action function
of a system of charges in an external field is obtained by adding
(15.27) and (15.26):

$$S = \int\sum_i\left(\frac{m_i\,(v^i)^2}{2} + \frac{e_i}{c}\,\mathbf{A}_i\mathbf{v}^i - e_i\varphi_i\right)dt. \tag{15.28}$$

The field due to the charges themselves does not appear in this
equation. The expression for the integrand is the Lagrangian of the
system. It involves velocity linearly as well as quadratically (in
the expression for kinetic energy), and, for this reason, does not have
the form that we used in Part I, $L = T - U$.

However, the general relationships still hold. Therefore, from the
Lagrangian the expression for momentum is obtained in terms of
velocity

$$\mathbf{p}_i = \frac{\partial L}{\partial\mathbf{v}^i} = m_i\mathbf{v}^i + \frac{e_i}{c}\,\mathbf{A}_i. \tag{15.29}$$



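Let us determine the energy in terms of momentum using the basic
equation (4.4):

$$\mathcal{E} = \sum_i\mathbf{v}^i\mathbf{p}_i - L = \sum_i\left(m_i\,(v^i)^2 + \frac{e_i}{c}\,\mathbf{A}_i\mathbf{v}^i - \frac{m_i\,(v^i)^2}{2} - \frac{e_i}{c}\,\mathbf{A}_i\mathbf{v}^i + e_i\varphi_i\right) = \sum_i\left(\frac{m_i\,(v^i)^2}{2} + e_i\varphi_i\right), \tag{15.30}$$

so that the term which is linear with respect to velocity is eliminated
from the expression for energy in terms of velocity.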

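The Hamiltonian function for a system of charges in an external
magnetic field. The linear term in velocity of the Lagrangian affects
the form of the Hamiltonian. Let us write down $\mathcal{H}$ from its definition
(10.15). To do this it is necessary to substitute into the energy expression,
by means of equation (15.29), momenta instead of velocities:

$$\mathbf{v}^i = \frac{1}{m_i}\left(\mathbf{p}_i - \frac{e_i}{c}\,\mathbf{A}_i\right). \tag{15.31}$$

The Hamiltonian is

$$\mathcal{H} = \sum_i\left[\frac{1}{2m_i}\left(\mathbf{p}_i - \frac{e_i}{c}\,\mathbf{A}_i\right)^2 + e_i\varphi_i\right]. \tag{15.32}$$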


Let us assume that the magnetic field, in which the system is
situated, is weak and uniform (at least within the limits of the system).
The vector potential for a homogeneous field will be represented as

$$\mathbf{A}_i = \frac{1}{2}\,[\mathbf{H}\mathbf{r}^i]. \tag{15.33}$$

Indeed, then $\operatorname{rot}\mathbf{A} = \mathbf{H}$ [from (11.30), (11.35) and (11.33)]. And also
$\operatorname{div}\mathbf{A} = 0$ from (11.29) and (11.34).
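
Both properties of this vector potential can also be confirmed by a direct finite-difference computation. The following Python sketch is only an illustration: the field $\mathbf{H}$, the point at which the derivatives are taken and the helper names are assumptions.

```python
def cross(a, b):
    """Vector product [a b] of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def vector_potential(H, r):
    """A = (1/2) [H r] for a uniform field H."""
    return tuple(0.5 * x for x in cross(H, r))

def partial(field, r, i, k, h=1e-4):
    """Central-difference derivative of the i-th component of field along x_k."""
    rp, rm = list(r), list(r)
    rp[k] += h
    rm[k] -= h
    return (field(rp)[i] - field(rm)[i]) / (2.0 * h)

def curl(field, r):
    """rot of a vector field at the point r."""
    return (partial(field, r, 2, 1) - partial(field, r, 1, 2),
            partial(field, r, 0, 2) - partial(field, r, 2, 0),
            partial(field, r, 1, 0) - partial(field, r, 0, 1))

def divergence(field, r):
    """div of a vector field at the point r."""
    return sum(partial(field, r, k, k) for k in range(3))

H = (0.3, -1.2, 2.0)                        # assumed uniform field
A = lambda r: vector_potential(H, r)
rot_A = curl(A, (1.0, 2.0, 3.0))
div_A = divergence(A, (1.0, 2.0, 3.0))
```

Since $\mathbf{A}$ is linear in the coordinates, the central differences are exact up to rounding, and `rot_A` reproduces $\mathbf{H}$ while `div_A` vanishes.
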

Since the magnetic field is weak we can neglect in (15.32) the term
involving $A_i^2$. Then, substituting (15.33) in (15.32), we find an expression
for $\mathcal{H}$:

$$\mathcal{H} = \sum_i\left(\frac{p_i^2}{2m_i} + e_i\varphi_i\right) - \sum_i\frac{e_i}{2m_ic}\left(\mathbf{p}_i\,[\mathbf{H}\mathbf{r}^i]\right). \tag{15.34}$$

The last term in (15.34) gives the required addition to the
Hamiltonian. Since this term is proportional to $H$, we can replace $\mathbf{p}_i$
by $m_i\mathbf{v}^i$ in it to the same accuracy, i.e., neglecting terms of order $H^2$.
Performing a cyclic permutation of the factors in (15.34) and putting
the sum inside a vector product sign, we obtain an expression for
the addition to the Hamiltonian:

$$\mathcal{H}' = -\left(\mathbf{H}\sum_i\frac{e_i}{2c}\,[\mathbf{r}^i\mathbf{v}^i]\right) = -(\mathbf{H}\boldsymbol{\mu}). \tag{15.35}$$

This expression is very similar to the energy of a system of charges
in a homogeneous electric field which involves only the electric dipole
moment of the system of charges. Note that this term is of the form



$-(\mathbf{d}\mathbf{E})$ [see (14.28)]. This indicates a further similarity between
electric and magnetic moments.

Larmor’s theorem. Let us now compare the expression for the
momentum of a charge placed in a constant homogeneous magnetic
field with that for the momentum of a particle relative to a rotating
coordinate system. From (15.29) and (15.33), the former is

$$\mathbf{p} = m\mathbf{v} + \frac{e}{c}\,\mathbf{A} = m\mathbf{v} + \frac{e}{2c}\,[\mathbf{H}\mathbf{r}]; \tag{15.36}$$

the latter can be easily found from (8.5):

$$\mathbf{p} = m\mathbf{v}' + m\,[\boldsymbol{\omega}\mathbf{r}]. \tag{15.37}$$

We now consider a steadily moving system of identical charges
(for example, an atom or molecule); the nuclei, being heavier, are
regarded as fixed. Let us assume that in the absence of any external
magnetic field, the motion in the system is known. Then, comparing
equations (15.36) and (15.37), it is easy to see that if we consider the
motion of these charges relative to axes rotating with angular velocity

$$\boldsymbol{\omega} = \frac{e\mathbf{H}}{2mc}, \tag{15.38}$$

it will not differ from motion relative to fixed axes in the absence
of a magnetic field. The equations of motion relative to rotating
axes will have their usual form $\dot{\mathbf{p}}_i = \mathbf{F}_i$, where $\mathbf{F}_i$ is the force acting
on the $i$th charge in the absence of any external magnetic field,
because the correction to the momentum due to the angular velocity $\boldsymbol{\omega}$
[defined in accordance with (15.38)] will cancel with the correction
due to the magnetic field. The magnetic field must be sufficiently
weak so that the change in magnetic force with rotation can be
neglected.

We can say, therefore, that, with the application of a constant
and uniform weak external magnetic field, a system of charges with
identical $\dfrac{e}{m}$ ratios begins to rotate with a constant angular velocity
$|\boldsymbol{\omega}_L| = \dfrac{eH}{2mc}$. This statement is called Larmor’s theorem, and $\omega_L$
is the Larmor frequency.
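
To give an idea of the order of magnitude, the following Python sketch evaluates $eH/2mc$ for an electron. The constants are approximate Gaussian (CGS) values inserted here for illustration, the field strength is an assumed example, and the function name is hypothetical.

```python
E_CHARGE = 4.803e-10   # electron charge magnitude, esu (approximate)
E_MASS = 9.109e-28     # electron mass, g (approximate)
C_LIGHT = 2.998e10     # velocity of light, cm/s (approximate)

def larmor_frequency(H_gauss, charge=E_CHARGE, mass=E_MASS, c=C_LIGHT):
    """Magnitude of the Larmor angular frequency e H / (2 m c), in rad/s."""
    return charge * H_gauss / (2.0 * mass * c)

omega_L = larmor_frequency(1.0e4)   # assumed field of 10^4 gauss
```

For a field of $10^4$ gauss this gives about $8.8\cdot10^{10}$ rad/s.
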

Precession of the magnetic moment. If a system possesses a magnetic
moment $\boldsymbol{\mu}$ for motion which is undisturbed by a magnetic field,
then, when a magnetic field is superimposed, this moment will move
around the direction of the magnetic field, similar to the free top
in equations (9.14) (note that $\boldsymbol{\omega}$ is in the direction of the axis of
rotation, i.e., in the direction of the magnetic field). The precession
of magnetic moment about the field is called the Larmor precession.

Magnetic moment in an inhomogeneous field. Let us suppose that 
the magnetic field possesses a small inhomogeneity. Then, in the 
equations of motion, the term 




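$$\mathbf{F} = -\nabla\mathcal{H}' = \nabla(\mathbf{H}\boldsymbol{\mu}) \tag{15.39}$$

denotes the force acting on the moment and tending to move it
as a whole. Expanding (15.39) by (11.32), we obtain

$$\mathbf{F} = (\boldsymbol{\mu}\nabla)\,\mathbf{H} + [\boldsymbol{\mu}\,\operatorname{rot}\mathbf{H}].$$

But for an external field $\operatorname{rot}\mathbf{H}$ is equal to zero so that the force is

$$\mathbf{F} = (\boldsymbol{\mu}\nabla)\,\mathbf{H}. \tag{15.40}$$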

This is the well-known force of attraction to a magnet. It is 
maximum near the poles of the magnet where the inhomogeneity 
of the field is greatest. 


Exercise 


Study the magnetic moment $\boldsymbol{\mu}$ moving in a magnetic field given by the
components $H_z = -H_0$; $H_x = H_1\cos\omega t$; $H_y = H_1\sin\omega t$. Consider the
cases $\omega = \omega_0 = \dfrac{eH_0}{2mc}$, $\omega \to 0$.

The equation describing a vector rotating with angular velocity $\boldsymbol{\omega}_L$ is,
according to Secs. 8 and 9,

$$\frac{d\boldsymbol{\mu}}{dt} = [\boldsymbol{\omega}_L\,\boldsymbol{\mu}].$$

Whence we obtain an equation for the precession of magnetic moment in a
magnetic field

$$\frac{d\boldsymbol{\mu}}{dt} = \frac{e}{2mc}\,[\boldsymbol{\mu}\mathbf{H}].$$

By multiplying both sides of this equation by $\boldsymbol{\mu}$, we convince ourselves
that the absolute value of magnetic moment $\mu$ is conserved. It is, therefore,
sufficient to write the equations only for the components $\mu_x$ and $\mu_y$, replacing
$\mu_z$ by $\sqrt{\mu^2 - \mu_x^2 - \mu_y^2}$.

Using the abbreviated notation $\omega_0 = \dfrac{eH_0}{2mc}$, $\omega_1 = \dfrac{eH_1}{2mc}$ [cf. (15.38)], we
multiply the equation for $\mu_y$ by $\pm i$ and combine with the equation for $\mu_x$
to obtain

$$\frac{d}{dt}\,(\mu_x \pm i\mu_y) = \pm i\omega_0\,(\mu_x \pm i\mu_y) \pm i\omega_1 e^{\pm i\omega t}\sqrt{\mu^2 - \mu_x^2 - \mu_y^2}.$$

We seek the solution in the form $\mu_x \pm i\mu_y = A_\pm e^{\pm i\omega t}$, and get the
following equation for amplitudes $A_\pm$:

$$(\omega - \omega_0)\,A_\pm = \omega_1\sqrt{\mu^2 - A_+A_-}.$$

Multiplying these equations, we find

$$A_+A_- = \frac{\mu^2\omega_1^2}{(\omega - \omega_0)^2 + \omega_1^2}, \qquad A_\pm = \frac{\mu\,\omega_1}{\sqrt{(\omega - \omega_0)^2 + \omega_1^2}}.$$

When $\omega = \omega_0$ (“paramagnetic resonance”), the moment rotates in the
plane $xy$ with frequency $\omega_0$. When $\omega \to 0$, i.e., in the case of an infinitely
slow rotation of the field, the moment strictly follows the field, its direction
all the time being the same as that of the field.
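
The steady solution of this exercise can be checked numerically. The sketch below rests on assumptions that should be compared with the statement of the exercise: the precession equation is taken as $d\boldsymbol{\mu}/dt = \dfrac{e}{2mc}[\boldsymbol{\mu}\mathbf{H}]$ with $H_z = -H_0$, the amplitude is $A = \mu\omega_1/\sqrt{(\omega-\omega_0)^2+\omega_1^2}$, and $\mu_z$ is fixed by $(\omega - \omega_0)A = \omega_1\mu_z$. The function names and the numerical values are hypothetical.

```python
import math

def cross(a, b):
    """Vector product [a b] of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def steady_moment(t, mu, omega, omega0, omega1):
    """mu_x + i mu_y = A exp(i omega t), mu_z constant, with (omega - omega0) A = omega1 mu_z."""
    root = math.sqrt((omega - omega0) ** 2 + omega1 ** 2)
    A = mu * omega1 / root
    mu_z = mu * (omega - omega0) / root
    return (A * math.cos(omega * t), A * math.sin(omega * t), mu_z)

def field_rate(t, omega, omega0, omega1):
    """(e / 2mc) H for the assumed field H_z = -H_0, H_x = H_1 cos wt, H_y = H_1 sin wt."""
    return (omega1 * math.cos(omega * t), omega1 * math.sin(omega * t), -omega0)

def residual(t, mu, omega, omega0, omega1, h=1e-6):
    """Largest component of d(mu)/dt - [mu, (e/2mc) H], derivative by central difference."""
    plus = steady_moment(t + h, mu, omega, omega0, omega1)
    minus = steady_moment(t - h, mu, omega, omega0, omega1)
    rate = tuple((plus[k] - minus[k]) / (2.0 * h) for k in range(3))
    rhs = cross(steady_moment(t, mu, omega, omega0, omega1),
                field_rate(t, omega, omega0, omega1))
    return max(abs(rate[k] - rhs[k]) for k in range(3))

res = residual(0.3, mu=1.0, omega=0.7, omega0=1.0, omega1=0.2)
mu_z_at_resonance = steady_moment(0.0, 1.0, 1.0, 1.0, 0.2)[2]
```

Under these assumptions the residual vanishes to within the accuracy of the numerical derivative, and at $\omega = \omega_0$ the component $\mu_z$ is zero, so that the moment rotates in the plane $xy$.
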




Sec. 16. Electrodynamics of Material Media 


Field in a medium. We know that material media consist of nuclei
and electrons, i.e., of very small charges in very rapid motion. Therefore,
in a small region of a body, a region having atomic dimensions,
all electromagnetic quantities (field, charge density, and current)
change very rapidly with time. In two neighbouring small regions
these quantities may, at the very same instant, have completely
different values. Therefore, if we examine the field in a medium full
of charges in detail, then we will observe only a rapidly and irregularly
varying function of coordinates and time.

Mean values. The inhomogeneities of a field are of atomic dimensions.
However, such a detailed picture of the field is not usually of any
interest. As usual, in any description of macroscopic bodies, it is
essential to know mean values for a large number of atoms. For example,
in mechanics, mean density values are used. For this mean to
have any significance, we must isolate a volume of the body containing
a large number of atoms, determine its mass and divide by the
volume.

This volume must be so large that the microscopic atomic structure
of the substance cannot affect the mean value of the density. At the same
time, the mean macroscopic value must be constant over that volume.
This will be readily seen from the following. Let the volume be
arbitrarily divided into two equal parts. Then the mean for each part
should not differ from the mean for the whole volume.

Such a volume is termed physically infinitesimal. We shall call it
$V_0$. If we take all its dimensions to be large compared with atomic
dimensions, then the mean value should not depend on the shape of
the surface bounding the volume; the latter may be spherical,
cubical, etc.

Besides averaging over volume, it is also necessary to perform
averaging over time. The interval of time over which the average is
to be taken must be large compared with the times of atomic motions,
though still sufficiently small so that the mean values over two
semi-intervals do not differ from one another.

Let the volume have the form of a cube of side $a$. We shall denote
the coordinates of its centre by $x$, $y$, $z$. The time interval, over which
the averaging is performed, will be called $t_0$, and the instant
corresponding to the centre of the interval will be denoted by $t$. The
coordinates of any point inside the cube, relative to the centre, will be
called $\xi$, $\eta$, $\zeta$, and instants of time measured from $t$ will be
denoted by $\vartheta$. Thus, the limits of variation of the quantities are
given by the following inequalities:

$$-\frac{a}{2} \le \xi \le \frac{a}{2}, \quad
-\frac{a}{2} \le \eta \le \frac{a}{2}, \quad
-\frac{a}{2} \le \zeta \le \frac{a}{2}, \quad
-\frac{t_0}{2} \le \vartheta \le \frac{t_0}{2}.$$




The actual value of any quantity at a definite instant of time is
$f(x+\xi,\ y+\eta,\ z+\zeta,\ t+\vartheta)$. It is related to a mathematically
infinitely small volume $dV = d\xi\,d\eta\,d\zeta$ and an interval of time
$d\vartheta$. The average value, over a physically infinitesimal volume $V_0$
and interval of time $t_0$, is obtained if $f$ is integrated over
$dV\,d\vartheta$ and the integral divided by $V_0 t_0$, in accordance with the
usual definition of an average:

$$\bar f(x, y, z, t) = \frac{1}{V_0 t_0}
\int_{-a/2}^{a/2}\int_{-a/2}^{a/2}\int_{-a/2}^{a/2}\int_{-t_0/2}^{t_0/2}
f(x+\xi,\ y+\eta,\ z+\zeta,\ t+\vartheta)\,d\xi\,d\eta\,d\zeta\,d\vartheta. \tag{16.1}$$


This average value $\bar f(x, y, z, t)$ refers to the point $x$, $y$, $z$, and time $t$.
The electrodynamics of such mean values is termed macroscopic, as 
opposed to microscopic, which has to do with a field due to separate 
point charges and a field in free space. 

The mean thus determined is differentiable with respect to time and
coordinates. As parameters it involves the coordinates of the centre of
a physically infinitely small volume $x$, $y$, $z$, and the time $t$. Obviously,
we can differentiate with respect to these values, for example,

$$\frac{\partial}{\partial x}\bar f(x, y, z, t) = \frac{1}{V_0 t_0}
\int_{-a/2}^{a/2}\int_{-a/2}^{a/2}\int_{-a/2}^{a/2}\int_{-t_0/2}^{t_0/2}
\frac{\partial}{\partial x} f(x+\xi,\ y+\eta,\ z+\zeta,\ t+\vartheta)\,
d\xi\,d\eta\,d\zeta\,d\vartheta = \overline{\frac{\partial f}{\partial x}}. \tag{16.2}$$

In other words, the mean value of the derivative of a quantity is
equal to the derivative of its mean value.
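
The statement (16.2) can be checked numerically in one dimension. The
following is only a minimal sketch: the test function `f`, the window
width `a`, and the step sizes are illustrative assumptions, not quantities
taken from the text.

```python
import math

def f(x):
    # Hypothetical "microscopic" quantity: a smooth trend plus a fast ripple.
    return math.sin(x) + 0.3 * math.sin(40.0 * x)

def df(x):
    # Exact derivative of f.
    return math.cos(x) + 12.0 * math.cos(40.0 * x)

def mean_over_window(g, x, a, n=2000):
    # Midpoint-rule average of g over [x - a/2, x + a/2]:
    # the one-dimensional analogue of the average (16.1).
    h = a / n
    return sum(g(x - a / 2 + (k + 0.5) * h) for k in range(n)) * h / a

def derivative_of_mean(g, x, a, eps=1e-4):
    # Central difference of the averaged function with respect to the
    # coordinate x of the window centre.
    return (mean_over_window(g, x + eps, a)
            - mean_over_window(g, x - eps, a)) / (2 * eps)

x0, a = 0.7, 1.0
lhs = derivative_of_mean(f, x0, a)  # derivative of the mean value
rhs = mean_over_window(df, x0, a)   # mean value of the derivative
print(lhs, rhs)
```

Both numbers agree to within the discretization error, as (16.2) requires.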

Density of charge and current in a medium. Under the action of 
the electric and magnetic field, there occurs a redistribution of the 
charges and currents in any substance. When Maxwell’s equations 
are averaged, the mean density of the redistributed charge is $\rho$ and
that of the current is $\mathbf{j}$. We shall express $\rho$ and $\mathbf{j}$ in terms of other
values which will later make it possible to give the averaged Maxwell 
equations a very symmetrical form. 

We define the dipole-moment density in a substance by the following
formula:

$$\mathbf{d} = \int \mathbf{P}\,dV. \tag{16.3}$$

The dipole moment $\mathbf{P}$ in unit volume is called the electric polarization
of the medium. If the substance is completely neutral, its dipole
moment is uniquely determined as $\sum_i e_i \mathbf{r}^i$ [see (14.21)]. Going over
to a continuous charge distribution, we write




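$$\mathbf{d} = \int \rho\,\mathbf{r}\,dV. \tag{16.4}$$

Integral (16.3) can be identically written in the form:

$$\int \mathbf{P}\,dV = -\int \mathbf{r}\,\operatorname{div}\mathbf{P}\,dV. \tag{16.5}$$

This relationship can most simply be proved by writing in terms of
components, for example,

$$\iiint x\left(\frac{\partial P_x}{\partial x} + \frac{\partial P_y}{\partial y}
+ \frac{\partial P_z}{\partial z}\right)dx\,dy\,dz
= \iint (x P_x)\Big|\,dy\,dz + \iint x\,(P_y)\Big|\,dx\,dz
+ \iint x\,(P_z)\Big|\,dx\,dy - \iiint P_x\,dx\,dy\,dz.$$

The limits of integration are at the external boundaries of the medium,
where the values $P_x$, $P_y$, $P_z$ are zero. This proves (16.5). Comparing
(16.4) and (16.5) we obtain

$$\int \mathbf{r}\,(\operatorname{div}\mathbf{P} + \rho)\,dV = 0. \tag{16.6}$$

However, since the shape and dimensions of the body are arbitrary,
the integrand must be zero:

$$\operatorname{div}\mathbf{P} + \rho = 0. \tag{16.7}$$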

Thus, the mean density of a charge “induced” by the field is equal
to the divergence of the electric polarization vector taken with
opposite sign.
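
The relation between the induced charge and the polarization can be
illustrated by a one-dimensional numerical sketch. The polarization
profile `P` below is a hypothetical example chosen to vanish at the
boundaries of the body; it is not taken from the text.

```python
import math

def P(x):
    # Hypothetical polarization profile, zero at and beyond x = -1 and x = +1.
    return (1.0 - x * x) ** 2 if abs(x) < 1.0 else 0.0

def rho(x, eps=1e-5):
    # Induced charge density in one dimension, rho = -dP/dx, cf. (16.7).
    return -(P(x + eps) - P(x - eps)) / (2 * eps)

def integrate(g, lo, hi, n=4000):
    # Midpoint rule.
    h = (hi - lo) / n
    return sum(g(lo + (k + 0.5) * h) for k in range(n)) * h

# Dipole moment as the integral of P, cf. (16.3).
dipole_from_P = integrate(P, -1.0, 1.0)
# Dipole moment as the integral of rho * x, cf. (16.4).
dipole_from_rho = integrate(lambda x: x * rho(x), -1.0, 1.0)
# Net induced charge of the body.
total_charge = integrate(rho, -1.0, 1.0)
print(dipole_from_P, dipole_from_rho, total_charge)
```

The two dipole moments coincide and the net induced charge vanishes, as
expected for a neutral body.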

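In a similar way, we can express the mean density of induced
current. To do this we define the magnetic polarization vector, equal to
the magnetic-moment density, as

$$\boldsymbol{\mu} = \int \mathbf{M}\,dV. \tag{16.8}$$

But the magnetic moment, by definition, is expressed as

$$\boldsymbol{\mu} = \sum_i \frac{e_i\,[\mathbf{r}^i \mathbf{v}^i]}{2c}.$$

Applied to the current distribution, this gives

$$\boldsymbol{\mu} = \int \frac{[\mathbf{r}\,\mathbf{j}]}{2c}\,dV. \tag{16.9}$$

We shall now prove an identity analogous to (16.5):

$$\int \mathbf{M}\,dV = \frac{1}{2}\int [\mathbf{r}\,\operatorname{rot}\mathbf{M}]\,dV. \tag{16.10}$$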




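For this, it is simplest to go over to components:

$$\int [\mathbf{r}\,\operatorname{rot}\mathbf{M}]_x\,dV
= \int (y\,\operatorname{rot}_z\mathbf{M} - z\,\operatorname{rot}_y\mathbf{M})\,dV
= \int \left[ y\left(\frac{\partial M_y}{\partial x} - \frac{\partial M_x}{\partial y}\right)
- z\left(\frac{\partial M_x}{\partial z} - \frac{\partial M_z}{\partial x}\right)\right] dV.$$

The terms $-y\,\partial M_x/\partial y$ and $-z\,\partial M_x/\partial z$ are
integrated by parts. All the integrated quantities become zero when the
limits are inserted, so that, in agreement with (16.10), only $2M_x$ remains
in the integrand. Now, comparing (16.9) and (16.10) we obtain

$$\int [\mathbf{r}\,\mathbf{j}]\,dV = c\int [\mathbf{r}\,\operatorname{rot}\mathbf{M}]\,dV. \tag{16.11}$$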

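In order to determine $\mathbf{j}$ fully, we calculate its divergence and apply
the charge conservation law (12.18) written for mean values [see
(16.2)], together with (16.7):

$$\operatorname{div}\mathbf{j} = -\frac{\partial\rho}{\partial t}
= \frac{\partial}{\partial t}\operatorname{div}\mathbf{P}. \tag{16.12}$$

From (16.11) and (16.12), $\mathbf{j}$ is uniquely determined as

$$\mathbf{j} = \frac{\partial\mathbf{P}}{\partial t} + c\,\operatorname{rot}\mathbf{M}. \tag{16.13}$$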

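Indeed, this expression satisfies both equations. Finding the
divergence of both parts of (16.13), we arrive at (16.12), since
$\operatorname{div}\operatorname{rot}\mathbf{M} = 0$.

Further, substituting the quantity $\partial\mathbf{P}/\partial t$ in the
left-hand side of (16.11), we get

$$\int \left[\mathbf{r}\,\frac{\partial\mathbf{P}}{\partial t}\right] dV
= \frac{\partial}{\partial t}\int [\mathbf{r}\,\mathbf{P}]\,dV. \tag{16.14}$$

According to (16.3) and (16.4), $\mathbf{P}$ is replaced by $\rho\mathbf{r}$. But
$[\mathbf{r},\ \rho\mathbf{r}] = 0$, so that the term $\partial\mathbf{P}/\partial t$
does not contribute to equation (16.11). The identity (16.13) is thus proved.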

Averaging Maxwell’s equations. We shall now consider the averaging
of Maxwell’s equations. From (16.2), differentiation and averaging
are commutative, so that a bar can simply be put over the first pair
in order to denote that they have been averaged:

$$\operatorname{rot}\bar{\mathbf{E}} = -\frac{1}{c}\frac{\partial\bar{\mathbf{H}}}{\partial t}, \tag{16.15}$$

$$\operatorname{div}\bar{\mathbf{H}} = 0. \tag{16.16}$$

The mean value of an electric field $\bar{\mathbf{E}}$ is called the electric field in a
medium. We shall hereafter write it without the bar, which denotes
that it has been averaged, taking it for granted that only mean values
will always be taken in a medium. The mean value of the magnetic
field is called the magnetic induction and is denoted by the letter $\mathbf{B}$.
It is all the more unnecessary to write a bar over it because the concept
of induction, which is not equal to field, makes sense only for a medium.
The asymmetry in the terminology for electric and magnetic fields
will be explained later.

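In this notation, the first pair of Maxwell’s equations takes the
following form:

$$\operatorname{rot}\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{B}}{\partial t}, \tag{16.17}$$

$$\operatorname{div}\mathbf{B} = 0. \tag{16.18}$$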

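Now let us average the second pair of Maxwell’s equations:

$$\operatorname{rot}\bar{\mathbf{H}} = \frac{1}{c}\frac{\partial\bar{\mathbf{E}}}{\partial t}
+ \frac{4\pi}{c}\bar{\mathbf{j}}, \tag{16.19}$$

$$\operatorname{div}\bar{\mathbf{E}} = 4\pi\bar\rho. \tag{16.20}$$

We substitute the mean $\rho$ and $\mathbf{j}$ from (16.7) and (16.13) and rearrange
the terms somewhat, obtaining the two following equations (the bars are
again omitted):

$$\operatorname{rot}(\mathbf{B} - 4\pi\mathbf{M}) = \frac{1}{c}\frac{\partial}{\partial t}
(\mathbf{E} + 4\pi\mathbf{P}), \tag{16.21}$$

$$\operatorname{div}(\mathbf{E} + 4\pi\mathbf{P}) = 0. \tag{16.22}$$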
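
The rearrangement that leads to (16.21) and (16.22) can be written out
explicitly. The following is only a sketch of the intermediate step; it
assumes the forms (16.7) and (16.13) for the induced charge and current,
writes $\mathbf{B}$ for the mean magnetic field, and omits the bars.

```latex
\begin{align*}
\operatorname{rot}\mathbf{B}
  &= \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t}
   + \frac{4\pi}{c}\left(\frac{\partial\mathbf{P}}{\partial t}
   + c\,\operatorname{rot}\mathbf{M}\right)
  \;\Longrightarrow\;
  \operatorname{rot}(\mathbf{B} - 4\pi\mathbf{M})
   = \frac{1}{c}\frac{\partial}{\partial t}(\mathbf{E} + 4\pi\mathbf{P}), \\
\operatorname{div}\mathbf{E}
  &= 4\pi\rho = -4\pi\operatorname{div}\mathbf{P}
  \;\Longrightarrow\;
  \operatorname{div}(\mathbf{E} + 4\pi\mathbf{P}) = 0 .
\end{align*}
```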

We introduce the following new designations:

$$\mathbf{E} + 4\pi\mathbf{P} = \mathbf{D}. \tag{16.23}$$

$\mathbf{D}$ is called the electric induction.

Further,

$$\mathbf{B} - 4\pi\mathbf{M} = \mathbf{H}. \tag{16.24}$$

$\mathbf{H}$ is called the magnetic field in a medium, which, therefore, does not
equal the mean value of the magnetic field in a vacuum.

In the notations (16.23) and (16.24), the second pair is written similar
to the first pair:

$$\operatorname{rot}\mathbf{H} = \frac{1}{c}\frac{\partial\mathbf{D}}{\partial t}, \tag{16.25}$$

$$\operatorname{div}\mathbf{D} = 0. \tag{16.26}$$

The similarity between (16.26) and (16.18) explains why it was
convenient to call the mean value of the magnetic field the magnetic
induction: here, both electric and magnetic induction vectors have no
sources in the medium. The similarity between (16.25) and (16.17)
justifies the term magnetic field given to the vector $\mathbf{H}$ (16.24).

The incompleteness of the system of equations in a medium. Thus,
due to a suitable system of notation, the first and second pairs of
Maxwell’s equations in a medium have, as it were, become more
symmetrical than those in a vacuum. But we must not forget that this system
has now ceased to be complete: as before, there are eight equations
(of which only six are independent) and twelve unknowns $\mathbf{B}$, $\mathbf{E}$, $\mathbf{D}$, $\mathbf{H}$
(with three components for each vector). Consequently, the system
(16.17), (16.18), (16.25), and (16.26) cannot be solved until a relationship
is found between inductions and fields. This relationship cannot be
obtained without knowing the specific structure of the material
medium.

Dielectrics and conductors. We shall consider, first of all, how charges
behave in a medium in the presence of a constant electric field. The
field will displace the positive charges in one direction, and the
negative ones in another. As a result, polarization $\mathbf{P}$ will arise. Two
essentially different cases can occur here.

1) Under the action of the field inside the body, a certain finite
polarization $\mathbf{P}$, dependent on the field, is established. This
polarization may be represented vividly (though in very simplified fashion!)
as a displacement of charges from the equilibrium positions which they
occupied in the absence of the field to new equilibrium positions,
much like the way a load suspended on a spring is displaced in a
gravitational field. If a finite polarization (dependent on the field inside
the body) is established, that body is called a nonconductor or dielectric.

2) In a constant electric field acting inside the body, the charges do
not arrive at equilibrium, and a definite rate of polarization increase,
$\partial\mathbf{P}/\partial t$, is established. In this case, through every section
of the conductor perpendicular to the vector $\partial\mathbf{P}/\partial t$, there
pass electric charges or, what amounts to the same thing, a flow of current.
From equation (16.13), the derivative $\partial\mathbf{P}/\partial t$ may indeed be
interpreted as a current component.

As regards the second current component, $c\,\operatorname{rot}\mathbf{M}$, it relates to the
instantaneous value of quantities and cannot characterize the change
of anything with time. For this reason, the classification of bodies into
conductors and dielectrics is obtained from the behaviour of the
quantity $\partial\mathbf{P}/\partial t$.

The displacement of charges under the action of a field can roughly 
be likened to a load falling in a viscous medium with friction, when, as 
is known, a definite speed of fall is established. 

A medium, in which a constant electric field produces a constant 
electric current, is called a conductor. 

If a constant electric field is produced in free space and a conducting
body of finite dimensions is introduced into it (for example, a
conducting sphere or ellipsoid), the charges in the body will be displaced so
that a field equal to zero will be established inside the conductor. For
this, the mean charge density inside the body must also equal zero,
because the lines of force of the field originate and terminate at charges.
Under the action of such a field, the charges inside the conductor
would be displaced. This means that an equilibrium will be established
inside the conductor only when all the induced charges emerge to the
surface. They will be distributed on the surface of the body so that the
mean field inside the conductor is zero, and the lines of force outside
the conductor will arrive normal to every point of the surface.

Continuous current in a conductor. A continuous current can
flow in a conductor only along a closed conducting circuit. And the
electric field must always have a component along the direction
of the circuit. Then charges of given sign will always move in one
direction, thereby producing a closed current. The work performed
by unit charge in moving around the circuit is called the e.m.f. acting
in the circuit:

$$\text{e.m.f.} = \oint \mathbf{E}\,d\mathbf{l}. \tag{16.27}$$

This formula differs from (12.1) in that $\mathbf{E}$ denotes the field acting
inside the conductor.

External sources of e.m.f. In a conductor, a constant e.m.f. can
only exist at the expense of some external source of energy, for example,
a primary cell. When a current passes in the circuit of the cell, ions
are neutralized on the electrodes, thus yielding the source of energy
that maintains the e.m.f.

If, as usual, we put $\mathbf{E} = -\nabla\varphi$, then the expression for e.m.f.
will be

$$\text{e.m.f.} = -\oint \nabla\varphi\,d\mathbf{l}
= -\oint\left(\frac{\partial\varphi}{\partial x}dx + \frac{\partial\varphi}{\partial y}dy
+ \frac{\partial\varphi}{\partial z}dz\right)
= -\oint d\varphi = \varphi_1 - \varphi_2. \tag{16.28}$$

Therefore, the e.m.f. may be defined as the change in potential
in going round a closed path. Thus, the potential is not a unique
function of a point: for each traverse, it changes by the value of the
e.m.f. in the circuit.

The magnetic properties of bodies. We shall now consider the 
magnetic properties of bodies. In a constant magnetic field, a definite 
equilibrium state will always be established in the medium. Here 
we must distinguish between the following two cases. 

1) In the absence of a field, the atoms or molecules of a substance 
possess certain characteristic magnetic moments that differ from 
zero. 

As was shown in the previous section, the energy of every separate
elementary magnet in a magnetic field is $-\boldsymbol{\mu}\mathbf{H}$. Hence, the energy
of elementary magnets that have a positive moment projection
on the field is less than the energy of elementary magnets with a
negative moment projection. Atoms and molecules are in random
thermal motion. As a result of this motion and of the action of a
magnetic field, an advantageous energy state is established in which
positive moment projections on the field predominate. For more
detail about this equilibrium see Part IV.

It will be noted that the projection of an isolated magnetic moment
on the magnetic field is constant: it merely performs a Larmor
precession around the field. But the interaction between molecules
disturbs the motion of separate moments, and results in the
establishment of a state with a mean magnetic polarization other than zero.

2) The atoms or molecules of a body do not possess their own 
magnetic moments in the absence of a field. 

As was shown in the previous section, when an external magnetic 
field is applied, the motion of charges in atoms or molecules changes 
due to the Larmor precession. Indeed, a precession of angular velocity
$\omega = eH/2mc$ is superimposed on motion undisturbed by a magnetic
field. In exercise 4 of this section it will be shown that this precession
leads to the appearance of a magnetic moment in a system of charges. 
We shall only note here that the direction of the magnetic moment 
induced by the field must be in opposition to the direction of the 
magnetic field; this follows from Lenz’s induction law. Indeed, an 
induced current produces a magnetic field in a direction opposite 
to that of the inducing field. 

A substance in which an external magnetic field produces a resultant
moment in the same direction, is called paramagnetic. If the
magnetization is in the opposite direction to the field, the substance is
diamagnetic.

Ferromagnetism. There are crystalline bodies in which the magnetic 
moments are aligned spontaneously, i.e., in the absence of any external 
magnetic field. Such bodies are called ferromagnetic. The magnetic 
polarization of the body itself is related to the directions of the 
crystalline axes. For example, in iron, whose crystals have cubic 
symmetry, the intrinsic magnetization coincides with one of the 
sides of the cube. This direction is called the direction of ready
magnetization. In order to deflect the magnetic polarization from the direction
of ready magnetization, work must be performed. 

A single crystal of a ferromagnetic substance will be magnetized
so that the resultant energy is a minimum, since equilibrium always
corresponds to minimum energy. However, it does not necessarily
follow from this that all of the single crystal is magnetized in one
direction; in this case it will possess an external magnetic field whose
energy is $\frac{1}{8\pi}\int H^2\,dV$. This quantity is always positive and
increases the total energy. But if the single crystal is divided into regions
or layers whose magnetization alternates in direction, then the
external field can be eliminated since neighbouring layers (or, as
they are called, domains) produce fields of opposite sign. In the
transition region between domains, the polarization gradually turns from
the direction of ready magnetization in one domain to the reverse
direction in the other domain. Clearly, if a certain direction is the
direction of ready magnetization, then the directly opposite direction
also possesses this property. The structure of the transition region
has been studied theoretically by L. D. Landau and E. M. Lifshits.

The domain structure of crystals was later demonstrated
experimentally. If a very thin emulsion of particles of a ferromagnetic
substance is spread over the smooth surface of a ferromagnetic single
crystal, the particles will be distributed along lines where the
interfaces between domains intersect the surface of the crystal.

Since, between domains, the polarization is deflected from the
direction of ready magnetization, it is necessary to perform work
to establish the transition region. Summarizing, if the whole single
crystal consists of one domain, its energy increases at the expense
of work done in creating an external field, equal to
$\frac{1}{8\pi}\int H^2\,dV$; if, however, the crystal consists of many
domains, the energy increases at the expense of the additional energy of
the transition regions.
The equilibrium state will be that state in which the energy is least.
The energy of a field increases with the volume it occupies, that 
is, as the cube of the linear dimensions of the crystal. The energy 
of the transition regions increases in proportion to their total area. 
In a crystal of sufficiently small dimensions, there can exist only 
one transition region whose area is proportional to the square of the 
linear dimensions of the single crystal. Therefore, in such a small 
crystal, the volume energy changes according to a cubic law with 
respect to dimensions, while the surface energy varies according 
to a square law. In a sufficiently small crystal, the volume energy 
becomes less than the surface energy; such a crystal is not separated 
into domains but is magnetized as a whole. This has been experi¬ 
mentally established in crystals with dimensions of 10“*-10'® cm. 
The thickness of domains in large crystals of appropriate ferromagnet¬ 
ics is of the same order. 
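
The competition between the cubic and the square law can be made
concrete with a small sketch. The coefficients `u` and `s` below are
hypothetical placeholders chosen only for illustration; the text gives no
numerical values for them.

```python
def volume_energy(L, u):
    # Energy of the external field: grows as the cube of the linear size L.
    return u * L ** 3

def surface_energy(L, s):
    # Energy of the transition region: grows as the square of L.
    return s * L ** 2

def critical_size(u, s):
    # Size at which u * L**3 equals s * L**2.
    return s / u

# Hypothetical coefficients (erg/cm^3 and erg/cm^2), for illustration only.
u, s = 1.0e6, 1.0
L_c = critical_size(u, s)

for L in (0.5 * L_c, 2.0 * L_c):
    single_domain = volume_energy(L, u) < surface_energy(L, s)
    print(L, "single domain" if single_domain else "splits into domains")
```

Below the critical size the volume term is the smaller one, so that the
crystal is magnetized as a whole; above it, division into domains becomes
advantageous.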


An example of the shape of a domain as proposed by Landau
and Lifshits is shown in Fig. 22. The arrows denote the direction of
the polarization in each domain. The serrations at the boundary almost
completely destroy the external magnetic field; the lines of magnetic
induction inside the crystal are closed through them, and do not emerge.

Fig. 22


The magnetization of a ferromagnetic in an external field. If a
magnetic field is applied to a ferromagnetic crystal in the direction of
ready magnetization, then those regions, for which the polarization
is in opposition to the field, are contracted with displacement of
the interfaces and may disappear completely in a comparatively
small field. Then the crystal is magnetized to saturation. In order
to magnetize the crystal to saturation in a direction that is not
coincident with the initial direction of polarization in the domains,
considerably larger fields are required.

In a polycrystalline body, such as ordinary steel, the separate
single crystals are oriented more or less at random relative to one 
another. In any case, the directions of ready magnetization are not 
the same for the separate crystals. When an external magnetic field 
is applied, the different crystals are magnetized differently and the 
magnetization curve is not as steep as is possible in the case of a 
separate crystal. The magnetic interaction between separate crystals 
results in a definite magnetic polarization remaining after steel has 
been magnetized and the field subsequently removed. This is what 
is known as hysteresis. 

Magnetic interaction of atoms. Let it be noted, in addition, that
the magnetic interaction between separate elementary (atomic)
magnets is not at all adequate in explaining the cause of ferromagnetism.
The energy of interaction between two elementary magnetic moments
is of the order $10^{-16}$ erg, while the energy of thermal motion at room
temperature is about $10^{-14}$ erg (see Part IV). This is why random
thermal motion should destroy the orderly magnetization already
at a temperature of about 1° above absolute zero. Actually,
ferromagnetism of steel disappears in the neighbourhood of 1,000° above
absolute zero, thus corresponding to an interaction energy between
elementary magnets of the order of $10^{-13}$ erg.
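
The orders of magnitude in this estimate follow from comparing an
interaction energy with the thermal energy $kT$. A minimal sketch in CGS
units, assuming only the standard value of the Boltzmann constant (the
function names are illustrative):

```python
K_B = 1.380649e-16  # Boltzmann constant in erg per kelvin

def thermal_energy_erg(temperature_k):
    # Characteristic energy of thermal motion, kT, in erg.
    return K_B * temperature_k

def ordering_temperature_k(interaction_energy_erg):
    # Temperature at which kT becomes equal to the interaction energy.
    return interaction_energy_erg / K_B

print(thermal_energy_erg(300.0))        # room temperature
print(ordering_temperature_k(1.0e-16))  # purely magnetic interaction
print(ordering_temperature_k(1.0e-13))  # energy needed for ordering near 1,000 degrees
```

An interaction energy of the order $10^{-16}$ erg corresponds to about one
degree, and one of the order $10^{-13}$ erg to roughly a thousand degrees.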

Ferromagnetism is of quantum origin and cannot be explained
with the aid of classical analogues.

The relationship between fields and inductions. A substance is
always in equilibrium in a constant external magnetic field. To this
equilibrium there corresponds a very definite induction and
polarization. In a weak field, the relationship between the quantities is
linear. For this reason, the magnetic induction is expressed linearly
in terms of the magnetic field in a medium:

$$\mathbf{B} = \chi\mathbf{H}. \tag{16.29a}$$

In a dielectric, where a static equilibrium polarization corresponds
to a definite electric field, there is a similar relationship for weak
fields:

$$\mathbf{D} = \varepsilon\mathbf{E}. \tag{16.29b}$$

The quantity $\chi$ is called the permeability and $\varepsilon$ is the dielectric
constant.

It should be noted that in ferromagnetics the region for which
a linear law is applicable has an upper limit of not very large fields
(10®-10* CGSE), since saturation sets in; in diamagnetic and
paramagnetic substances at room temperatures a linear law applies
for all actually attainable fields.

The vector nature of electric and magnetic fields. The question 
may arise: Why is magnetic induction expressed linearly solely in 
terms of magnetic field, while electric induction is expressed solely 
in terms of electric field? 

In order to answer this question we must examine the vector 
properties of electromagnetic quantities in more detail. 

Two separate systems of rectangular coordinates exist in space; 
a right-hand system and a left-hand system. They are related to 
each other like left and right hands, if the thumbs are in the direction 
of the x-axes, the forefingers along the y-axes, and the middle 
fingers along the z-axes. It is obvious that no rotation in space can 
make these two systems coincide. However, one system transforms 
to the other if the signs of the coordinates in it are reversed. Of 
course, both coordinate systems are completely equivalent physically.
The choice of any one of them is completely arbitrary. Therefore,
the form of any equation expressing a law in electrodynamics should 
not change under a transformation from a right-hand to a left-hand 
system. 

Let us now take Maxwell’s equation (12.24). In order to perform 
a transformation to another coordinate system, it is sufficient to 
change the signs of the coordinates. This changes the sign of the 
vector operation rot, because this operation denotes a differentiation 
with respect to coordinates. What happens, then, to the electric 
field components ? Since only one of two vectors is differentiated with 
respect to the coordinates, namely E, the sign of one of them must 
change in order to retain the form of the equation. It is easy to see 
that vector $\mathbf{E}$ will change sign. Indeed, the right-hand side of equation
$\operatorname{div}\mathbf{E} = 4\pi\rho$ is a scalar and does not change in sign. On the
left-hand side, the sign of the div operation changes and, hence, the signs
of all the components of $\mathbf{E}$ must also change. Therefore, the
components of the magnetic field do not change sign in a transformation
from a left-hand to a right-hand system.

In a rotation of a coordinate system, the projections of any vector 
are transformed by the same equations of analytical geometry as 
the coordinates. As was shown, the change of sign for all three
coordinates is not equivalent to any rotation. It turns out that some vectors,
such as E, behave quite similarly to a radius vector r; when the signs 
of the components of the radius vector r are changed, the signs of 
all the components of E also change. Other vectors, such as H, behave 
like a radius vector under coordinate rotations, and not like a radius 
vector in the transformation from a right-hand to a left-hand system. 

Vectors that behave like E are called true or polar vectors, while 
those behaving like H are called pseudovectors or axial vectors. 




Velocity, force, acceleration, current density, and vector potential
are, in addition to the electric field, real vectors while magnetic
moment, angular momentum and angular velocity are pseudovectors.

The fact that angular momentum is a pseudovector can easily be
seen from its definition: $\mathbf{M} = [\mathbf{r}\,\mathbf{p}]$. Both factors of the vector
product, $\mathbf{r}$ and $\mathbf{p}$, change signs, so that $\mathbf{M}$ does not change in sign.
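
This behaviour is easy to verify directly. In the sketch below the
component values of `r` and `p` are arbitrary illustrative numbers, and
the helper names are not taken from the text.

```python
def cross(a, b):
    # Vector product [a b] in Cartesian components.
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def invert(v):
    # Change of sign of all three components, i.e. the transformation
    # from a right-hand to a left-hand coordinate system.
    return tuple(-c for c in v)

r = (1.0, 2.0, 3.0)    # radius vector (a real, or polar, vector)
p = (-0.5, 4.0, 1.5)   # momentum (a real, or polar, vector)

M = cross(r, p)                            # angular momentum
M_inverted = cross(invert(r), invert(p))   # the same product after inversion

print(M, M_inverted)
```

The components of `r` and `p` change sign under the inversion, while those
of their vector product do not.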

A pseudovector cannot be linearly related to a real vector in
electrodynamics because the sign in any such equality would depend on
the choice of coordinate system, which contradicts physical facts. 
For this reason the vectors B and H, D and E appear separately in 
the linear laws (16.29a) and (16.29b). 

The equations for conductors in a constant field. We shall now
consider the equations of electrodynamics of constant fields for
conductors. As has already been indicated, it is not a constant value
of polarization that is established in a conductor in a constant field,
but a constant rate of increase of polarization $\partial\mathbf{P}/\partial t$. This
quantity has the meaning of current density $\mathbf{j}'$. The derivative
$\partial\mathbf{D}/\partial t$ appearing on the right-hand side of Maxwell’s equations
may be replaced thus:

$$\frac{\partial\mathbf{D}}{\partial t} = 4\pi\frac{\partial\mathbf{P}}{\partial t} = 4\pi\mathbf{j}', \tag{16.30}$$

because the field is constant. Here, the current $\mathbf{j}'$ is also continuous.

The magnetic field is a pseudovector and cannot be linearly related
to the current density. We note that for metals the linear relationship
between field and current (Ohm’s law) does not break down, no
matter how strong the field.

The quantity $\sigma$, the coefficient in the linear relationship
$\mathbf{j}' = \sigma\mathbf{E}$ [cf. (16.32)], is called the specific conductance, or
conductivity. In the CGSE system its dimensions are inverse seconds. For
metals $\sigma$ is of the order $10^{17}$ sec$^{-1}$.

Slowly varying fields. So far, an electromagnetic field in a medium 
has been regarded as strictly constant with time. But if the field 
varies sufficiently slowly with time, it may also be considered as 
constant. Let us give a general criterion whereby we can say what 
field may be regarded as slowly varying. 

We assume that a constant field is switched on at some initial
instant of time $t = 0$. A stationary state is not established in the
medium at once but only after a certain interval of time $\theta$ has elapsed.
If, for example, the medium is a dielectric, then, in that time, a
definite polarization is established corresponding to the given field;
in a metal, $\theta$ characterizes the time taken for a constant current
to be established. $\theta$ is called the relaxation time. If, during the
relaxation time, the field changes by only a small fraction of its value
it can be regarded as constant within the accuracy of that small
fraction. In other words, the criterion of slowness of variation of a
field is this: within the relaxation time a stationary state,
corresponding to the given value of field, has time to establish itself in
the medium. Such fields are termed slowly varying. For them, the same values
of permeability, dielectric constant, and conductivity can be
substituted into Maxwell’s equations, as for constant fields.

Let us write down Maxwell’s equations for a slowly varying field in a
conductor. In the expression

$$\frac{\partial\mathbf{D}}{\partial t} = \frac{\partial\mathbf{E}}{\partial t}
+ 4\pi\frac{\partial\mathbf{P}}{\partial t}
= \frac{\partial\mathbf{E}}{\partial t} + 4\pi\sigma\mathbf{E} \tag{16.31}$$

the first term can be neglected in the majority of cases because it
in no way exceeds $E/\theta$; if, as occurs in metals, $\sigma$ is of the order of
$10^{17}$, then $\sigma E \gg E/\theta$. Whence Maxwell’s equations are obtained
for a slowly varying field in a conductor:


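$$\operatorname{rot}\mathbf{H} = \frac{4\pi\mathbf{j}'}{c} = \frac{4\pi\sigma\mathbf{E}}{c}, \tag{16.32}$$

$$\operatorname{rot}\mathbf{E} = -\frac{\chi}{c}\frac{\partial\mathbf{H}}{\partial t}, \tag{16.33}$$

$$\operatorname{div}\chi\mathbf{H} = 0. \tag{16.34}$$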


This system is complete and, together with the boundary
conditions (see exercises 1 and 5 in this section), is sufficient for the
determination of slowly varying fields in a conductor.
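
How safely the term $\partial\mathbf{E}/\partial t$ may be dropped in a metal can be
estimated numerically. The sketch below assumes, for illustration only,
a harmonic field of angular frequency $\omega$, so that
$|\partial\mathbf{E}/\partial t| = \omega E$; the conductivity value is the order
of magnitude quoted for metals, and the function name is hypothetical.

```python
import math

def conduction_to_displacement_ratio(sigma, omega):
    # Ratio |4*pi*sigma*E| / |dE/dt| for a harmonic field,
    # for which |dE/dt| = omega * |E| (Gaussian units).
    return 4.0 * math.pi * sigma / omega

SIGMA_METAL = 1.0e17  # conductivity of a metal in sec^-1, order of magnitude only

for frequency_hz in (50.0, 1.0e6, 1.0e10):
    omega = 2.0 * math.pi * frequency_hz
    print(frequency_hz, conduction_to_displacement_ratio(SIGMA_METAL, omega))
```

Even at $10^{10}$ cycles per second the conduction term exceeds the
neglected one by many orders of magnitude.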

Rapidly varying fields. Let us now consider the case of rapidly
varying fields, i.e., fields which change more rapidly than the
relaxation process or the establishment of a definite stationary state
in the medium. Then the state of the medium depends not only
on the instantaneous value of the field, but also on its values at
previous instants of time; in other words, it depends on the way
in which the field changes with time. Such a relationship is very
complicated in the general case. It is simplified if the field is weak;
then, at any rate, we may expect the relationship to be linear.

Expansion in harmonic components. Let us examine the general
form of the linear relationship. To do so we represent the field as
follows:

$$\mathbf{E}(t) = \sum_k \mathbf{E}_k \cos(\omega_k t + \varphi_k), \tag{16.35}$$

i.e., we expand it in harmonic components. The more values of the
amplitudes $\mathbf{E}_k$, frequencies $\omega_k$, and phases $\varphi_k$ we take, the better
the approximation for the variation of $\mathbf{E}$. However, if the relationship
between induction and field is linear, then the induction is also of
the form of a sum of harmonic components:
$\sum_k \mathbf{D}_k \cos(\omega_k t + \psi_k)$,
where (and this is most important) each term in the sum for induction
is determined by the term of the same frequency in (16.35).




This does not contradict the general statement that the induction
is determined by the entire time dependence of a rapidly varying
field; for a harmonic relationship between field and time, it is fully
given by its amplitude, phase, and frequency. And a component of
the field with a certain frequency can in no way give rise to induction
components with another frequency if the relationship between
field and induction is linear, for no linear relationship exists between
trigonometric functions of different arguments. Therefore, if we
write the functions even with the same frequency but with different
phases $\varphi_k$ and $\psi_k$, we do not directly obtain a linear relationship
either. However, if we use a complex form and express the field and
induction in terms of exponentials by means of the equations

$$\mathbf{E}(t) = \sum_k \left(\mathbf{E}_k e^{-i\omega_k t} + \mathbf{E}_k^* e^{i\omega_k t}\right),$$

$$\mathbf{D}(t) = \sum_k \left(\mathbf{D}_k e^{-i\omega_k t} + \mathbf{D}_k^* e^{i\omega_k t}\right) \tag{16.36}$$

(the star denotes a complex conjugate quantity), then a linear
relationship between field and induction can be written in the following
form:

$$\mathbf{D}_k = \varepsilon_k \mathbf{E}_k \quad\text{or}\quad
\mathbf{D}(\omega) = \varepsilon(\omega)\,\mathbf{E}(\omega). \tag{16.37}$$

The quantities $\mathbf{D}_k$ and $\mathbf{E}_k$ are complex. From a comparison of
(16.35) and (16.36) it can be seen that they differ from the real field
and induction amplitudes by the complex factors
$\tfrac{1}{2}e^{-i\varphi_k}$ and $\tfrac{1}{2}e^{-i\psi_k}$.
Thus, the dielectric constant $\varepsilon_k = \varepsilon(\omega_k)$ must also be a complex
quantity. A complex permeability $\chi(\omega)$ is similarly introduced.

Maxwell’s equations in complex form. Let us now write down
Maxwell’s equations for complex field components. It must be noted
that they are a function of time according to an $e^{-i\omega t}$ law. We shall
divide the equations by these factors and get the following system
of Maxwell’s equations for rapidly varying harmonic fields:

$$\operatorname{rot}\mathbf{H} = -\frac{i\omega}{c}\,\varepsilon(\omega)\,\mathbf{E}, \tag{16.38}$$

$$\operatorname{rot}\mathbf{E} = \frac{i\omega}{c}\,\chi(\omega)\,\mathbf{H}, \tag{16.39}$$

$$\operatorname{div}\,\varepsilon\mathbf{E} = 0, \tag{16.40}$$

$$\operatorname{div}\,\chi\mathbf{H} = 0. \tag{16.41}$$

The imaginary parts of the dielectric constant and permeability
lead to the energy of a rapidly varying field being spent on the
generation of heat in the substance (see exercise 18).


168 


ELECTROD y NAMICS 


[Part II 


We note, in addition, that for rapidly varying fields the division
of bodies into conductors and dielectrics is conditional and is
determined by the relationship between the imaginary and real parts
of the dielectric constant. A substance retains the character of a
conductor up to such frequencies of a rapidly varying field as satisfy
the inequality $\omega \ll \sigma$.

Exercises

1) Show that on the boundary between two media the tangential components
of the fields and the normal components of induction are continuous.

Integrate the equations for the inductions over a small flat cylinder,
and the equations for fields over a narrow quadrilateral bounding the
interface (Fig. 23a and b).

2) Show that a magnetic field varying sinusoidally with time is damped
with depth in the conductor ($\chi = 1$).

From equations (16.32)-(16.34) we have

$$\frac{dH_z}{dx} = \frac{4\pi\sigma}{c}E_y, \qquad \frac{dE_y}{dx} = \frac{i\omega}{c}H_z, \qquad \frac{d^2H_z}{dx^2} = \frac{4\pi i\sigma\omega}{c^2}H_z,$$

whence

$$H_z \sim e^{-(1+i)\frac{\sqrt{2\pi\sigma\omega}}{c}\,x},$$

where $x$ is a coordinate normal to the surface of a conductor.

3) Show that equations (16.32)-(16.34) are formally applicable to the
case of rapidly varying fields, if the relationship between field and time is
considered harmonic; in this case the conductivity $\sigma$ is proportional to the
imaginary part of $\varepsilon$, and the real part of $\varepsilon$ is equal to zero.


4) Calculate the permeability of a substance whose molecules do not possess
intrinsic magnetic moments.

The additional velocity of charges, when a magnetic field is applied, is
$\mathbf{v} = [\boldsymbol{\omega}\,\mathbf{r}]$, where $\boldsymbol{\omega}$ is given by Larmor's theorem (16.38); whence the magnetic
moment is determined from the general expression (15.19). The mean
projection of this moment on $\mathbf{H}$ is obtained by averaging over the angles between
$\mathbf{H}$ and $\mathbf{r}$. From this we find the magnetic polarization and, finally, $\chi$:

$$\chi = 1 - 4\pi N \sum_i \frac{e_i^2\,\overline{r_i^2}}{6 m_i c^2},$$

where $N$ is the number of molecules in unit volume and $\overline{r_i^2}$ is the mean
square of the radius of rotation of the $i$th charge.

5) Show that an electric field near the surface of a charged conductor is
equal to $4\pi\gamma$, where $\gamma$ is the surface density of static charge on the conductor.

We use the same method as in exercise 1 of integrating equation (12.27)
over a small flat cylinder bounding the conductor on both sides, with account
taken of the fact that the field inside the conductor is equal to zero. For
quasi-stationary fields $\dfrac{\partial\gamma}{\partial t} = j_n$, where $j_n$ is the projection of $\mathbf{j}$ on the external
normal.


6) Calculate the energy of a system of charged conductors in a vacuum.

We substitute $\mathbf{E} = -\nabla\varphi$ in the definition of energy $\mathcal{E}_{\text{electr}} = \displaystyle\int \frac{E^2}{8\pi}\,dV$
and integrate by parts, taking advantage of the fact that in a vacuum
$\Delta\varphi = 0$. Since, from exercise 5 in this section, the field close to a conductor
is equal to $4\pi\gamma$, and the surface of a conductor is equipotential, we reduce
$\mathcal{E}_{\text{electr}}$ to the form

$$\mathcal{E}_{\text{electr}} = \frac{1}{2}\sum_i e_i\varphi_i,$$

where $e_i$ is the charge on the $i$th conductor and $\varphi_i$ is its potential.

7) Determine how a constant uniform electric field changes if a conducting
sphere is introduced into it.

The field potential must be sought in the form $\varphi = -\mathbf{E}_0\cdot\mathbf{r} + \dfrac{\mathbf{d}\cdot\mathbf{r}}{r^3}$, where
the vector $\mathbf{d}$ is in the same direction as the initial uniform field $\mathbf{E}_0$. $\mathbf{d}$ is
determined from the condition that the tangential component of $\mathbf{E} = -\nabla\varphi$ on
the sphere is equal to zero.

8) Determine in what way a constant uniform electric field $\mathbf{E}_0$ varies if
a dielectric sphere of dielectric constant $\varepsilon$ is introduced into it.

The field potential outside the sphere must be sought in the same form
as in problem 7, but, inside the sphere, it must be sought in the form $-\mathbf{E}'\cdot\mathbf{r}$,
where $\mathbf{E}' = \text{const}$. Determine the vectors $\mathbf{d}$ and $\mathbf{E}'$ from the boundary
conditions derived in exercise 1.


9) Find the electric field which arises in space when a point charge $e$ is
brought to a distance $a$ from an infinite flat conducting surface.

We drop a perpendicular from the point at which the charge is situated
to the surface of the conductor and, at a distance $a$ inside the conductor, we
place an equal and opposite (fictitious) charge $-e$. Then, the field component
tangential to the surface becomes zero. The field outside the conductor is
everywhere equal to the vector sum of the fields due to the real and fictitious
charges.

10) Find the electric field when a point charge $e$ is brought to a distance
$a$ from an infinite flat surface of a dielectric with constant $\varepsilon$. The dielectric
is infinitely deep.

We make the same construction as in exercise 9. We look for the field
inside the dielectric of the form $\dfrac{e''\,\mathbf{r}_1}{r_1^3}$, where $\mathbf{r}_1$ is a radius vector from the point
at which the real charge is situated; the field outside the dielectric is of the
form $\dfrac{e\,\mathbf{r}_1}{r_1^3} + \dfrac{e'\,\mathbf{r}_2}{r_2^3}$, where $\mathbf{r}_2$ is a radius vector drawn from the “image” of the
charge. The constants $e'$ and $e''$ are determined from the boundary conditions
of exercise 1.


11) Assuming that $\chi = 1$, determine the magnetic energy of a system
of conductors carrying a steady current.

Starting from the equation $\mathcal{E}_{\text{mag}} = \dfrac{1}{8\pi}\displaystyle\int H^2\,dV$, we substitute $\mathbf{H} = \operatorname{rot}\mathbf{A}$
and integrate by parts using (11.29). The surface integral at infinity is equal
to zero. Then we use (16.32) and reduce $\mathcal{E}_{\text{mag}}$ to the form

$$\mathcal{E}_{\text{mag}} = \frac{1}{2c}\int \mathbf{A}\cdot\mathbf{j}\,dV.$$


12) Express the magnetic energy for a system of currents in terms of a
double integral over the volumes of the conductors.

We replace the summation over the charges, in equation (15.13), by a
volume integration. This gives

$$\mathcal{E}_{\text{mag}} = \frac{1}{2c^2}\iint \frac{\mathbf{j}(\mathbf{r})\cdot\mathbf{j}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,dV\,dV'.$$

For line conductors we can substitute $I\,d\mathbf{l}$ instead of $\mathbf{j}\,dV$, provided $dV$
and $dV'$ are volume elements of different conductors. Then the mutual magnetic
energy for two line conductors $i$, $k$ is

$$\mathcal{E}^{ik}_{\text{mag}} = \frac{I_i I_k}{c^2}\oint\oint \frac{d\mathbf{l}_i\cdot d\mathbf{l}_k}{|\mathbf{r}_i - \mathbf{r}_k|} = M_{ik} I_i I_k,$$

where $|\mathbf{r}_i - \mathbf{r}_k|$ is the distance between the elements of contours $d\mathbf{l}_i$ and
$d\mathbf{l}_k$. When $i = k$ we must regard the conductor as thin, though not infinitely
thin, otherwise the integral diverges. The intrinsic magnetic energy for one
conductor is

$$\mathcal{E}^{ii}_{\text{mag}} = \frac{1}{2} M_{ii} I_i^2.$$

$M_{ik}$ is called the mutual induction coefficient, and $M_{ii}$ is the self-induction
coefficient.

13) Write down the Lagrangian for a system of currents, assuming that
there is capacity coupling between the conductors by virtue of capacitors
connected in the circuit.

Because of the linearity of electrodynamical equations, the potential of
the $i$th conductor is expressed linearly in terms of the charges on all the
conductors:

$$\varphi_i = \sum_k C_{ik}\,e_k.$$

From exercise 6, we obtain the electric energy

$$\mathcal{E}_{\text{electr}} = \frac{1}{2}\sum_{i,k} C_{ik}\,e_i e_k.$$

The charge on the plates of a capacitor is related to the incoming current
by $\dot{e}_k = I_k$. With the aid of exercise 12 we obtain the magnetic energy

$$\mathcal{E}_{\text{mag}} = \frac{1}{2}\sum_{i,k} M_{ik}\,I_i I_k = \frac{1}{2}\sum_{i,k} M_{ik}\,\dot{e}_i\dot{e}_k.$$

From Sec. 13 the Lagrangian (neglecting sign) is $\dfrac{1}{8\pi}\displaystyle\int (H^2 - E^2)\,dV$, whence

$$L = \mathcal{E}_{\text{mag}} - \mathcal{E}_{\text{electr}} = \frac{1}{2}\sum_{i,k}\left(M_{ik}\,\dot{e}_i\dot{e}_k - C_{ik}\,e_i e_k\right)$$

[cf. (17.16)].




14) Determine the work performed in unit time by a varying electromagnetic
field in a medium.

We write equation (13.23) for the external space not occupied by the
substance, where $\rho = 0$. From the boundary conditions of exercise 1, it follows
that the normal component of the Poynting vector $\mathbf{U}$ is continuous at the
boundary of the body. From this, applying the same transformations to
equations (16.17) and (16.25), as lead to (13.23), we find

$$\frac{dA}{dt} = \frac{1}{4\pi}\int\left(\mathbf{E}\,\frac{\partial\mathbf{D}}{\partial t} + \mathbf{H}\,\frac{\partial\mathbf{B}}{\partial t}\right)dV,$$

where $\dfrac{dA}{dt}$ may be expressed in terms of the change in energy of the field in
the external space.

15) Calculate the energy transformed into heat in unit time in a conductor
situated in a constant field. Assume $\mathbf{H}$ to be a single-valued function of $\mathbf{B}$.

For such a body, where $\mathbf{H}$ and $\mathbf{B}$ are related uniquely, $\mathbf{H} = f(\mathbf{B})$,
where $f(\mathbf{B})$ is some function of $\mathbf{B}$. For example, for $\mathbf{B} = \chi\mathbf{H}$, $f = \mathbf{B}/\chi$. The
result of the preceding exercise gives


dA 

dt 


iL 

dt 


r/(»! 

J 4 5T 


dV = 


aB^dV , 


see (16.30) and (16.31). For a constant field, there is zero on the left-hand
side of the equation, while the right-hand side is an essentially positive quantity.
This energy must therefore be converted into heat according to the energy
conservation law.

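16) Write down Lagrange’s equation for a system of currents taking into
account the conversion of energy into heat.

From exercise 14, the heat generated in unit time may be written as
$\sum_i r_i I_i^2$, where $r_i$ is the resistance of the $i$th conductor. We search for
Lagrange’s equation with the right-hand side in the form

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{e}_i} - \frac{\partial L}{\partial e_i} = v_i.$$

From the definition (4.4) we find

$$\frac{d\mathcal{E}}{dt} = \sum_i \dot{e}_i\left(\frac{d}{dt}\frac{\partial L}{\partial \dot{e}_i} - \frac{\partial L}{\partial e_i}\right) = \sum_i \dot{e}_i v_i = -\sum_i r_i I_i^2.$$

Whence $v_i = -r_i I_i = -r_i \dot{e}_i$.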


17) Reasoning in the same way as for exercise 15, show that if $\mathbf{H}$ is a
double-valued function of $\mathbf{B}$, having one value for $\dfrac{\partial B}{\partial t} < 0$ and the other for $\dfrac{\partial B}{\partial t} > 0$,
then, for a periodic variation of $\mathbf{B}$, the heat generated in one period is equal
to $\displaystyle\oint \frac{\mathbf{H}\,d\mathbf{B}}{4\pi}$, where the integral is taken over one period.

18) Show that if $\varepsilon(\omega)$ and $\chi(\omega)$ possess imaginary parts, heat is generated
in a rapidly varying field.

The density of heat generation may be represented as the divergence of
the Poynting vector $\mathbf{U} = \dfrac{c}{4\pi}[\mathbf{E}\,\mathbf{H}]$. In forming quadratic complex
quantities, we must take into account their time dependence. For example, if we
take $\mathbf{E}$ and $\mathbf{H}$ with a factor $e^{-i\omega t}$, their product will be proportional to $e^{-2i\omega t}$.
After time averaging, this factor will yield zero. Therefore we must take only
products of the form $[\mathbf{E}\,\mathbf{H}^*] + [\mathbf{E}^*\mathbf{H}]$. Now, using equations (16.38) and
(16.39), we obtain

$$\frac{c}{4\pi}\operatorname{div}\left([\mathbf{E}^*\mathbf{H}] + [\mathbf{E}\,\mathbf{H}^*]\right) = \frac{i\omega}{4\pi}(\varepsilon - \varepsilon^*)\,\mathbf{E}\mathbf{E}^* + \frac{i\omega}{4\pi}(\chi - \chi^*)\,\mathbf{H}\mathbf{H}^*.$$

Here, both parts of the equation are real, and if $\varepsilon = \varepsilon_1 + i\varepsilon_2$ and $\chi = \chi_1 + i\chi_2$,
then on the right-hand side there is the expression $-\dfrac{\omega}{2\pi}\left(\varepsilon_2\,\mathbf{E}\mathbf{E}^* + \chi_2\,\mathbf{H}\mathbf{H}^*\right)$.
From this it can be seen that $\varepsilon_2 > 0$ and $\chi_2 > 0$, since the energy of the field
is absorbed by the medium.

19) Calculate the dielectric constant of a medium, considering that all
the charges in it are connected by elastic forces with the equilibrium positions.
The characteristic oscillation frequency of the charges is $\omega_0$ and the frequency
of the field is $\omega$.

The radius vector of a charge satisfies the differential equation

$$m\left(\ddot{\mathbf{r}} + \omega_0^2\,\mathbf{r}\right) = e\,\mathbf{E}_0\,e^{-i\omega t} = e\,\mathbf{E}.$$

Its solution has the frequency of the external field and may be written as

$$\mathbf{r} = \frac{e\,\mathbf{E}}{m\left(\omega_0^2 - \omega^2\right)}.$$

The polarization can be obtained from this by multiplying by the number
of charges in unit volume $N$ and by $e$. Since the induction $\mathbf{D}$ is equal to $\varepsilon\mathbf{E}$
or $\mathbf{E} + 4\pi\mathbf{P}$, we find that $\varepsilon = 1 + \dfrac{4\pi N e^2}{m\left(\omega_0^2 - \omega^2\right)}$. For $\omega \to 0$, we obtain the
static dielectric constant $\varepsilon_0 = 1 + \dfrac{4\pi N e^2}{m\,\omega_0^2}$; for very large frequencies $\omega \gg \omega_0$,
or for free charges, $\varepsilon(\omega)$ is obtained as $\varepsilon = 1 - \dfrac{4\pi N e^2}{m\,\omega^2}$.
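The limiting cases of exercise 19 can be checked numerically. The following is only a minimal sketch: the function name `epsilon`, the number density and the resonance frequency are illustrative assumptions, not values from the text; Gaussian units are used throughout.

```python
import math

def epsilon(omega, omega0, n_density, charge, mass):
    """Dielectric constant of elastically bound charges:
    1 + 4*pi*N*e^2 / (m * (omega0^2 - omega^2))."""
    return 1.0 + 4.0 * math.pi * n_density * charge**2 / (
        mass * (omega0**2 - omega**2)
    )

# Electron charge and mass in Gaussian units; N and OMEGA0 are assumed
# purely for illustration.
E_CHARGE = 4.803e-10   # esu
E_MASS = 9.109e-28     # g
N = 1.0e22             # cm^-3 (assumed)
OMEGA0 = 1.0e16        # rad/s (assumed)

# Static limit, omega -> 0.
eps_static = epsilon(0.0, OMEGA0, N, E_CHARGE, E_MASS)

# High-frequency limit compared with the free-charge formula.
omega_high = 100.0 * OMEGA0
eps_high = epsilon(omega_high, OMEGA0, N, E_CHARGE, E_MASS)
eps_free = 1.0 - 4.0 * math.pi * N * E_CHARGE**2 / (E_MASS * omega_high**2)
```

Below the resonance the model gives a dielectric constant above unity, while far above it the bound charges respond like free ones and the value drops slightly below unity.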


Sec. 17. Plane Electromagnetic Waves 

General equations. In this section we shall first consider the solutions 
of Maxwell’s equations for free space, i.e., in the absence of charges. 
These solutions, as we shall see, are of the form of travelling waves. 
Analogous solutions also exist for a nonabsorbing material medium. 
These solutions will also be found in the present section. 

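In the absence of charges or currents, the equations for scalar
and vector potentials are written thus:

$$\Delta\mathbf{A} - \frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} = 0, \tag{17.1}$$

$$\Delta\varphi - \frac{1}{c^2}\frac{\partial^2\varphi}{\partial t^2} = 0, \tag{17.2}$$

with the additional condition (12.36)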


Sec. 17] 


PLANE ELECTROMAGNETIC WAVES 


163 


$$\operatorname{div}\mathbf{A} + \frac{1}{c}\frac{\partial\varphi}{\partial t} = 0. \tag{17.3}$$


Equations of the form (17.1) or (17.2) are called wave equations. 

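The solution of a wave equation. We shall look for particular
solutions of equations (17.1) and (17.2) which depend only on one
coordinate (for example, $x$) and on time. Then the wave equations can be
rewritten in the following manner:

$$\frac{\partial^2\mathbf{A}}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} = 0, \tag{17.4}$$

$$\frac{\partial^2\varphi}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2\varphi}{\partial t^2} = 0, \tag{17.5}$$

and the supplementary condition takes the form

$$\frac{\partial A_x}{\partial x} + \frac{1}{c}\frac{\partial\varphi}{\partial t} = 0. \tag{17.6}$$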


We shall now find the solution of (17.4) or (17.5) without imposing 
any further restrictions. We shall temporarily introduce the following 
notation: 


$$x + ct = \xi, \qquad x - ct = \eta. \tag{17.7}$$

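We transform (17.5) to these independent variables. (17.5) can be
rewritten symbolically as

$$\left(\frac{\partial}{\partial x} + \frac{1}{c}\frac{\partial}{\partial t}\right)\left(\frac{\partial}{\partial x} - \frac{1}{c}\frac{\partial}{\partial t}\right)\varphi = 0.$$

Then

$$\frac{\partial\varphi}{\partial x} = \frac{\partial\varphi}{\partial\xi}\frac{\partial\xi}{\partial x} + \frac{\partial\varphi}{\partial\eta}\frac{\partial\eta}{\partial x} = \frac{\partial\varphi}{\partial\xi} + \frac{\partial\varphi}{\partial\eta},$$

$$\frac{1}{c}\frac{\partial\varphi}{\partial t} = \frac{\partial\varphi}{\partial\xi}\,\frac{1}{c}\frac{\partial\xi}{\partial t} + \frac{\partial\varphi}{\partial\eta}\,\frac{1}{c}\frac{\partial\eta}{\partial t} = \frac{\partial\varphi}{\partial\xi} - \frac{\partial\varphi}{\partial\eta}, \tag{17.8}$$

because, for constant $t$ (i.e., $dt = 0$), $\dfrac{\partial\xi}{\partial x} = 1$ and $\dfrac{\partial\eta}{\partial x} = 1$, while
for constant $x$ (i.e., $dx = 0$), $\dfrac{1}{c}\dfrac{\partial\xi}{\partial t} = 1$ and $\dfrac{1}{c}\dfrac{\partial\eta}{\partial t} = -1$ in accordance with
the same equations (17.7). Thus, symbolically

$$\frac{\partial}{\partial x} + \frac{1}{c}\frac{\partial}{\partial t} = 2\frac{\partial}{\partial\xi}, \qquad \frac{\partial}{\partial x} - \frac{1}{c}\frac{\partial}{\partial t} = 2\frac{\partial}{\partial\eta},$$

$$\left(\frac{\partial}{\partial x} + \frac{1}{c}\frac{\partial}{\partial t}\right)\left(\frac{\partial}{\partial x} - \frac{1}{c}\frac{\partial}{\partial t}\right) = 4\frac{\partial^2}{\partial\xi\,\partial\eta}.$$

Hence, wave equations (17.4) and (17.5) are written thus:

$$\frac{\partial^2\mathbf{A}}{\partial\xi\,\partial\eta} = 0, \qquad \frac{\partial^2\varphi}{\partial\xi\,\partial\eta} = 0. \tag{17.9}$$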




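Integrating any of them with respect to $\xi$ we obtain

$$\frac{\partial\mathbf{A}}{\partial\eta} = \mathbf{C}(\eta), \qquad \frac{\partial\varphi}{\partial\eta} = C'(\eta). \tag{17.10}$$

It is not difficult now to integrate with respect to $\eta$:

$$\mathbf{A} = \int \mathbf{C}(\eta)\,d\eta + \mathbf{C}_1(\xi), \qquad \varphi = \int C'(\eta)\,d\eta + C_1'(\xi).$$

Finally, the required solution is written as:

$$\mathbf{A} = \mathbf{A}_1(\eta) + \mathbf{A}_2(\xi), \qquad \varphi = \varphi_1(\eta) + \varphi_2(\xi), \tag{17.11}$$

since the substitution of (17.11) into (17.9) gives zero identically.
Passing to the variables $x$, $t$, we can write the solutions to (17.4),
(17.5):

$$\mathbf{A} = \mathbf{A}_1(x - ct) + \mathbf{A}_2(x + ct), \qquad \varphi = \varphi_1(x - ct) + \varphi_2(x + ct). \tag{17.12}$$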
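That a sum of one function of $x - ct$ and one function of $x + ct$ satisfies the one-dimensional wave equation can also be checked numerically. The sketch below uses central finite differences; the two profiles, the value of `C` and the step size are arbitrary choices made only for illustration.

```python
import math

C = 3.0  # wave speed in arbitrary units (assumed for the illustration)

def phi(x, t):
    """Sum of a right-moving and a left-moving profile, as in (17.12)."""
    eta = x - C * t
    xi = x + C * t
    return math.exp(-eta**2) + math.sin(xi)

def second_diff(f, u, h):
    """Central second difference of f at the point u with step h."""
    return (f(u + h) - 2.0 * f(u) + f(u - h)) / h**2

def wave_residual(x, t, h=1e-3):
    """phi_xx - phi_tt / c^2; vanishes up to discretisation error."""
    d2x = second_diff(lambda s: phi(s, t), x, h)
    d2t = second_diff(lambda s: phi(x, s), t, h)
    return d2x - d2t / C**2

residual = wave_residual(0.4, 0.25)
```

The residual is not exactly zero because the derivatives are approximated, but it shrinks as the step `h` is reduced (until round-off takes over), for any smooth choice of the two profiles.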

Plane travelling waves. The solution depending on $x + ct$ does
not depend on the solution whose argument is $x - ct$; these are
two linearly independent solutions. Therefore it is sufficient to
consider one of them:

$$\mathbf{A} = \mathbf{A}(x - ct), \tag{17.13}$$

$$\varphi = \varphi(x - ct). \tag{17.14}$$

In order to satisfy the supplementary condition (17.6), we perform
a gauge transformation:

$$\varphi(x - ct) = \varphi'(x - ct) - \frac{1}{c}\frac{\partial}{\partial t}f(x - ct) = \varphi' + \dot{f} \tag{17.15}$$

(the dot over $f$ denotes differentiation with respect to the whole
argument $\eta = x - ct$). But, if we put $\varphi' = -\dot{f}$, we obtain simply
$\varphi = 0$. Then, from (17.6), we also obtain $A_x = 0$. Thus, for a solution
of the form considered, depending on $x - ct$ only, the Lorenz
condition is satisfied most simply by substituting $\varphi = 0$, $A_x = 0$.

The electric field component $x$ is equal to zero:

$$E_x = -\frac{1}{c}\frac{\partial A_x}{\partial t} = 0. \tag{17.16}$$

From the general result of Sec. 12, this property of $E_x$ does not
depend on a potential gauge transformation.

The magnetic field component $x$ is also equal to zero:

$$H_x = \frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z} = 0. \tag{17.17}$$




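We find the remaining field components:

$$E_y = -\frac{1}{c}\frac{\partial A_y}{\partial t} = \dot{A}_y, \quad E_z = -\frac{1}{c}\frac{\partial A_z}{\partial t} = \dot{A}_z, \quad H_y = -\frac{\partial A_z}{\partial x} = -\dot{A}_z, \quad H_z = \frac{\partial A_y}{\partial x} = \dot{A}_y. \tag{17.18}$$

From this equation it follows that $\mathbf{E}$ and $\mathbf{H}$ are perpendicular,
because

$$\mathbf{E}\cdot\mathbf{H} = E_y H_y + E_z H_z = 0. \tag{17.19}$$

They are equal in absolute magnitude, since $E = H = \sqrt{\dot{A}_y^2 + \dot{A}_z^2}$.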

The solution of the form (17.13) has a simple physical meaning. 

Let us take the value of $\mathbf{E}$ at an instant of time $t = 0$ on the plane
$x = 0$. It is equal to $\mathbf{E}(0)$. It is clear that $\mathbf{E}$ will have the
same value at the instant of time $t$ on the plane $x = ct$, because
$\mathbf{E}(x - ct) = \mathbf{E}(0)$ on that plane. We can also say that the plane
on which the field $\mathbf{E}$ is equal to $\mathbf{E}(0)$ is translated in space through
a distance $ct$ in a time $t$, i.e., it moves with a velocity $c$. The same
applies to any plane $x = x_0$, for which there was some value of field
$\mathbf{E}(x_0)$ at the initial instant of time. To summarize, all planes with
the given value of field are propagated in space with velocity $c$.
Therefore, the solution $\mathbf{E}(x - ct)$ is called a travelling plane wave.

We note that the form of the wave does not change as it moves;
the distance between planes $x = x_1$ and $x = x_2$, for which $\mathbf{E}$ is equal
to $\mathbf{E}(x_1)$ and $\mathbf{E}(x_2)$, is constant. This result
holds for any arbitrary form of wave, provided
it is travelling in free space.

To repeat, the velocity of propagation
of a wave in empty space does not depend
on its shape or amplitude and it is equal to
a universal constant $c$.

The transverse nature of waves. The electric and magnetic fields, as we
have seen from (17.19), are perpendicular to the direction of wave
propagation, as well as to each other. This is why it is said that
electromagnetic waves are transverse (as opposed to longitudinal sound
waves in air, for which the oscillations occur in the direction of
propagation). The direction of propagation, the electric field, and the
magnetic field are shown in Fig. 24. In it, $\mathbf{n}$ is a unit vector along the $x$-axis.

In future it will be sufficient to take only one component of the 
electric field. For this it is necessary to take one of the coordinate 
axes, for example the y-axis, in the direction of the electric field, 
which in no way limits the generality. This is shown in Fig. 24. 




The $x$-coordinate will be written in the form $x = \mathbf{r}\cdot\mathbf{n}$, so that

$$E_y = \dot{A}_y(\mathbf{r}\cdot\mathbf{n} - ct). \tag{17.20}$$

But in this notation it is not necessary to relate the vector $\mathbf{n}$, in the
direction of propagation, to the $x$-axis. A solution with argument
of the form (17.20) is applicable to any direction of $\mathbf{n}$, provided,
naturally, that $\mathbf{n}$, $\mathbf{E}$, and $\mathbf{H}$ are mutually perpendicular.

The momentum density of the wave [see (13.27)] is equal to

$$\frac{1}{4\pi c}[\mathbf{E}\,\mathbf{H}] = \frac{1}{4\pi c}\,\dot{A}_y^2\,\mathbf{n}$$

and is directed along $\mathbf{n}$. The energy density is

$$\frac{E^2 + H^2}{8\pi} = \frac{1}{4\pi}\,\dot{A}_y^2.$$

It differs from the momentum density by the factor $c$. This, as
we shall see later, is very essential for the quantum theory of
light.

Pressure of light. If a wave falls on an absorbing obstacle, for
example, on a black wall, and is not reflected, then its momentum
is transmitted to the wall in accordance with the conservation law.
But momentum transmitted to a body in unit time is, by Newton’s
Second Law, nothing other than force. It follows that there is a force
of $\dfrac{E^2}{4\pi}$ for every square centimetre of the absorbing barrier, upon
which the wave is normally incident. Force referred to unit surface
is, by definition, the pressure of the electromagnetic wave on the
barrier. Consequently, electrodynamics predicts the existence of
light pressure. This was observed and measured by P. N. Lebedev.

Harmonic waves. A special interest is attached to travelling waves
for which the function $\mathbf{E}(x - ct)$ is harmonic. The most general
harmonic solution is of the following form:

$$\mathbf{E} = \mathrm{Re}\left\{\mathbf{F}\,e^{-i\omega\left(t - \frac{\mathbf{r}\cdot\mathbf{n}}{c}\right)}\right\}, \tag{17.21}$$

where the symbol Re {} denotes the real part of the expression
inside the braces, $\mathbf{F}$ is a complex vector of the form $\mathbf{F}_1 + i\mathbf{F}_2$ [cf.
(7.14c)], and $\omega$ is the wave frequency in the same sense as in equation
(7.3). $\omega$ is the number of radians per second by which the argument
of the exponential function changes.

The wave vector. The vector $\omega\dfrac{\mathbf{n}}{c}$ is called the wave vector. It is
denoted by the letter $\mathbf{k}$:

$$\mathbf{k} \equiv \omega\,\frac{\mathbf{n}}{c}. \tag{17.22}$$




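The geometric meaning of $\mathbf{k}$ is easy to explain. We define the
wavelength, i.e., the distance $\Delta\mathbf{r}\cdot\mathbf{n}$ in space at which $\mathbf{E}$ assumes the same
value. Let the required wavelength be $\lambda$. Then

$$e^{i\omega\frac{\lambda}{c}} = e^{i\omega\frac{\Delta\mathbf{r}\cdot\mathbf{n}}{c}} = e^{2\pi i}, \tag{17.23}$$

because the period of the function $e^{iz}$ is equal to $2\pi$. Hence,

$$\lambda = \frac{2\pi c}{\omega}. \tag{17.24}$$

Comparing the wavelength with the wave vector, we obtain

$$\mathbf{k} = \frac{2\pi}{\lambda}\,\mathbf{n}, \qquad \lambda = \frac{2\pi}{k}. \tag{17.25}$$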
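As a small numerical illustration of the relations between frequency, wavelength and wave number in free space, the sketch below evaluates them for one sample frequency. The helper names and the sample value of `OMEGA` are assumptions made for the example; the speed of light is taken in cm/s.

```python
import cmath
import math

C_LIGHT = 2.998e10  # speed of light, cm/s

def wavelength(omega):
    """Wavelength 2*pi*c/omega for a wave of angular frequency omega."""
    return 2.0 * math.pi * C_LIGHT / omega

def wave_number(omega):
    """Magnitude of the wave vector, omega/c."""
    return omega / C_LIGHT

OMEGA = 3.0e15  # rad/s, sample value (assumed)
lam = wavelength(OMEGA)
k = wave_number(OMEGA)

# Advancing by one wavelength changes the phase k * x by 2*pi,
# so the exponential factor returns to its starting value.
phase_factor = cmath.exp(1j * k * lam)
```

The product `k * lam` equals $2\pi$ whatever frequency is chosen, which is the content of the relation between wavelength and wave vector.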

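Polarization of a plane harmonic wave. Let us now study the
nature of the oscillations of an electric field. To do this, we write the
vector $\mathbf{F}$ in the form

$$\mathbf{F} = \mathbf{F}_1 + i\mathbf{F}_2 = (\mathbf{E}_1 - i\mathbf{E}_2)\,e^{i\alpha}. \tag{17.26}$$

We choose the phase $\alpha$ so that the vectors $\mathbf{E}_1$ and $\mathbf{E}_2$ are mutually
perpendicular. We multiply equation (17.26) by $e^{-i\alpha}$ and square.
Then we obtain

$$(\mathbf{E}_1 - i\mathbf{E}_2)^2 = E_1^2 - E_2^2 = \left(F_1^2 - F_2^2 + 2i\,\mathbf{F}_1\cdot\mathbf{F}_2\right)e^{-2i\alpha}. \tag{17.27}$$

We have taken advantage of the fact that $\mathbf{E}_1$ and $\mathbf{E}_2$ are
perpendicular. Because of this $(\mathbf{E}_1 - i\mathbf{E}_2)^2$ is a purely real quantity. Therefore,
the imaginary part of the right-hand side of expression (17.27) must
be put equal to zero. Representing $e^{-2i\alpha}$ as $\cos 2\alpha - i\sin 2\alpha$, we
obtain

$$-\left(F_1^2 - F_2^2\right)\sin 2\alpha + 2\,(\mathbf{F}_1\cdot\mathbf{F}_2)\cos 2\alpha = 0$$

or

$$\tan 2\alpha = \frac{2\,(\mathbf{F}_1\cdot\mathbf{F}_2)}{F_1^2 - F_2^2}, \tag{17.28}$$

whence the angle $\alpha$ is determined for the given solution (17.21).

It is now easy to express $\mathbf{E}_1$ and $\mathbf{E}_2$. Indeed, from (17.26),
$\mathbf{E}_1 - i\mathbf{E}_2 = (\mathbf{F}_1 + i\mathbf{F}_2)\,e^{-i\alpha} = \mathbf{F}_1\cos\alpha + \mathbf{F}_2\sin\alpha - i\,(\mathbf{F}_1\sin\alpha - \mathbf{F}_2\cos\alpha)$,
so that

$$\mathbf{E}_1 = \mathbf{F}_1\cos\alpha + \mathbf{F}_2\sin\alpha, \qquad \mathbf{E}_2 = \mathbf{F}_1\sin\alpha - \mathbf{F}_2\cos\alpha. \tag{17.29}$$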
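The choice of the phase $\alpha$ can be tried out on a concrete complex amplitude. The sketch below is only an illustration: the function `principal_axes` and the sample vectors `F1`, `F2` are assumptions for the example. It computes $\alpha$ from the tangent relation for $2\alpha$, forms $\mathbf{E}_1$ and $\mathbf{E}_2$ as above, and leaves it to the reader (or a test) to confirm that they are perpendicular.

```python
import math

def dot(a, b):
    """Scalar product of two real 3-vectors given as sequences."""
    return sum(x * y for x, y in zip(a, b))

def principal_axes(f1, f2):
    """Return (alpha, e1, e2) for the complex amplitude F = F1 + i*F2.

    alpha solves tan(2*alpha) = 2*(F1.F2) / (F1^2 - F2^2); atan2 is used
    so that the case F1^2 = F2^2 does not divide by zero.
    """
    alpha = 0.5 * math.atan2(2.0 * dot(f1, f2), dot(f1, f1) - dot(f2, f2))
    ca, sa = math.cos(alpha), math.sin(alpha)
    e1 = [ca * p + sa * q for p, q in zip(f1, f2)]  # F1 cos(a) + F2 sin(a)
    e2 = [sa * p - ca * q for p, q in zip(f1, f2)]  # F1 sin(a) - F2 cos(a)
    return alpha, e1, e2

# Sample amplitude, transverse to the x-axis (values assumed).
F1 = [0.0, 2.0, 1.0]
F2 = [0.0, 0.5, -1.5]
alpha, E1, E2 = principal_axes(F1, F2)

# Rebuild F1 and F2 from (E1 - i*E2) * exp(i*alpha) as a consistency check.
F1_back = [math.cos(alpha) * p + math.sin(alpha) * q for p, q in zip(E1, E2)]
F2_back = [math.sin(alpha) * p - math.cos(alpha) * q for p, q in zip(E1, E2)]
```

The tangent relation fixes $\alpha$ only up to multiples of $\pi/2$; the `atan2` branch used here is one admissible choice, and the others merely interchange or reverse the two axes.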


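We now include a constant phase in the exponent of (17.21) and, for
short, put

$$\psi = \alpha - \omega\left(t - \frac{\mathbf{r}\cdot\mathbf{n}}{c}\right). \tag{17.30}$$

Then, in the most general case, the electric field for a plane harmonic
wave will be

$$\mathbf{E} = \mathrm{Re}\left\{(\mathbf{E}_1 - i\mathbf{E}_2)\,e^{i\psi}\right\} = \mathbf{E}_1\cos\psi + \mathbf{E}_2\sin\psi. \tag{17.31}$$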

Here, the vectors $\mathbf{E}_1$ and $\mathbf{E}_2$ are defined as perpendicular.

Let us assume that a wave is propagated along the $x$-axis.
The $y$-axis is directed along $\mathbf{E}_1$, and the $z$-axis along $\mathbf{E}_2$. Hence, from
(17.31), we obtain

$$E_y = E_1\cos\psi, \qquad E_z = E_2\sin\psi. \tag{17.32}$$

Let us eliminate the phase $\psi$. We divide the first equation by $E_1$,
the second by $E_2$, square and add. Then the phase is eliminated and
an equation relating the field components remains:

$$\frac{E_y^2}{E_1^2} + \frac{E_z^2}{E_2^2} = 1. \tag{17.33}$$


It follows that the electric field vector describes an ellipse in the
$yz$-plane moving along the $x$-axis with velocity $c$, and passes round
the whole ellipse in one wavelength. Relative to a fixed coordinate
system, the electric field vector describes a helix wound on an elliptic
cylinder. The pitch of the helix is equal to the wavelength.

Such an electromagnetic wave is termed elliptically polarized. It
represents the most general form of a plane harmonic wave (17.21).

If one of the components is equal to zero, for example $E_1 = 0$ or
$E_2 = 0$, then the oscillations of $\mathbf{E}$ occur in one plane. Such a wave
is termed plane polarized.

When $E_1$ is equal to $E_2$ in magnitude, the vector $\mathbf{E}$ describes a circle in the
$yz$-plane. Depending on the sign of $E_z$, the rotation around the
circle occurs in a clockwise or anticlockwise direction. Accordingly,
the wave is termed right-handed or left-handed polarized. These
waves are shown in Fig. 25. For the same value of phase $\psi$, the
rotation occurs either in a clockwise or anticlockwise direction.

The sum of two waves of equal amplitude, which are circularly
polarized, gives a plane polarized wave. The relationship between
their phases determines the plane of polarization. Thus, if the waves
shown in Fig. 25 are added, the oscillations $\mathbf{E}_2$ and $-\mathbf{E}_2$ mutually
cancel and only the plane polarized oscillation $\mathbf{E}_1$ remains.

In turn, a circularly polarized oscillation is resolved into two
mutually perpendicular plane oscillations.

Certain crystals, for example tourmaline, are capable of polarizing
light.




Unpolarized light. In nature, it is most common to observe
unpolarized (natural) light. Naturally, such light cannot be strictly
monochromatic (i.e., possessing strictly one frequency $\omega$), for, as we
have just shown, monochromatic light is always polarized in some
way. But if we imagine that the components $\mathbf{E}_1$ and $\mathbf{E}_2$ in Fig. 25
are not related by a strict phase relationship (17.32), but randomly
change their relative phases, then the resultant vector will also
change its direction in a random manner. However, for this, it is
necessary that the oscillation frequencies should vary in time within
some interval $\Delta\omega$, since the difference of phase between two
oscillations of strictly constant and identical frequency is constant.

The propagation of light in a medium. We shall now consider the
question of the propagation of light in a material medium. At the
end of the preceding section we said that the quantities $\varepsilon$ and $\chi$
have meaning only for oscillations of a definite frequency $\omega$. To
simplify notation, we shall not use the symbol for a real part Re {},
remembering that the real part is always taken. Since all the
quantities depend on time according to an $e^{-i\omega t}$ law, the derivative $\dfrac{\partial}{\partial t}$
reduces to a multiplication by $-i\omega$. Then the system of Maxwell’s
equations can be written in the following form:

$$\operatorname{rot}\mathbf{H} = -\frac{i\omega}{c}\,\varepsilon\,\mathbf{E}, \tag{17.34}$$

$$\operatorname{div}\mathbf{E} = 0, \tag{17.35}$$

$$\operatorname{rot}\mathbf{E} = \frac{i\omega}{c}\,\chi\,\mathbf{H}, \tag{17.36}$$

$$\operatorname{div}\mathbf{H} = 0. \tag{17.37}$$

Once again we look for a solution in the form of a plane wave.
Since the time relationship is already eliminated, all the quantities
depend only on one coordinate, for example, upon $x$. From (17.35)
and (17.37), it follows that

$$\frac{\partial E_x}{\partial x} = 0, \qquad \frac{\partial H_x}{\partial x} = 0$$

or

$$E_x = 0, \qquad H_x = 0,$$

because a solution that is constant over all space does not represent
any wave. Thus, the waves are transverse. Equations (17.34) to
(17.37) are satisfied if we substitute $E_y = E(x)$, $E_z = 0$, $H_y = 0$,
$H_z = H(x)$, or, in other words, if the electric field is directed along
the $y$-axis and the magnetic field along the $z$-axis (a right-handed
system).




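Indeed, there then remain the following equations:

$$\frac{dH}{dx} = \frac{i\omega}{c}\,\varepsilon\,E, \tag{17.38}$$

$$\frac{dE}{dx} = \frac{i\omega}{c}\,\chi\,H. \tag{17.39}$$

Eliminating any of the quantities $E$ or $H$, we obtain equations
which are identical in form. For example

$$\frac{d^2E}{dx^2} = -\frac{\omega^2}{c^2}\,\varepsilon\chi\,E, \tag{17.40}$$

whence

$$E = F\,e^{-i\omega\left(t - \frac{\sqrt{\varepsilon\chi}}{c}\,x\right)}. \tag{17.41}$$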

If the wave is propagated in any arbitrary direction, and not
along the $x$-axis, then the solution (17.41) is rewritten thus [here,
the symbol Re {} is included for a comparison with (17.21)]:

$$\mathbf{E} = \mathrm{Re}\left\{\mathbf{F}\,e^{-i\omega\left(t - \frac{\sqrt{\varepsilon\chi}}{c}\,\mathbf{r}\cdot\mathbf{n}\right)}\right\}. \tag{17.42}$$

And so, compared with (17.21), the wave velocity has been
multiplied by $\dfrac{1}{\sqrt{\varepsilon\chi}}$. Accordingly, the wave vector will, instead of (17.22),
formally satisfy the equation

$$\mathbf{k} = \frac{\omega}{c}\sqrt{\varepsilon\chi}\;\mathbf{n}. \tag{17.43}$$

However, $\dfrac{c}{\sqrt{\varepsilon\chi}}$ is the velocity of light in a medium, and (17.43)
is the wave vector, provided $\varepsilon$ and $\chi$ are real numbers. Then equation
(17.43) will be fully analogous to (17.25). In this case, the solution
(17.42) is periodic in space and in time and describes a plane wave
travelling with velocity $\dfrac{c}{\sqrt{\varepsilon\chi}}$.

The ratio of the wave velocity in free space to that in a medium is
called the refractive index of the medium for waves of given frequency
$\omega$. We note that for visible-light frequencies, $\varepsilon(\omega)$ has nothing in
common with its static value. For example, water has a dielectric
constant of 81, so that $\sqrt{\varepsilon} = \sqrt{81} = 9$, while the refractive index in
the visible frequencies is approximately equal to 1.33 ($\chi$ can be
considered equal to 1).

Absorption of light in a medium. We now consider a more general
case of complex $\varepsilon = \varepsilon_1 + i\varepsilon_2$ (for simplicity we shall put $\chi = 1$). As was
shown in exercise 18, Sec. 16, the imaginary part of $\varepsilon$ accounts for
absorption of light. We shall denote the root of the complex dielectric
constant thus:

$$\sqrt{\varepsilon} = \sqrt{\varepsilon_1 + i\varepsilon_2} = \nu_1 + i\nu_2. \tag{17.44}$$

Let us substitute this expression into the exponent of equation
(17.42), putting $n_x = 1$, $n_y = 0$, $n_z = 0$. Since $i^2 = -1$, we obtain

$$\mathbf{E} = \mathrm{Re}\left\{\mathbf{F}\,e^{-i\omega\left(t - \frac{\nu_1 x}{c}\right) - \frac{\nu_2\omega}{c}\,x}\right\}. \tag{17.45}$$

Thus, the wave is damped in propagation.
Its amplitude diminishes $e$ times at a distance $\dfrac{c}{\nu_2\omega}$. A solution of the
form (17.45) cannot exist in a region which extends to infinity in all
directions because $x = -\infty$ substituted into (17.45) yields $E = \infty$.
A solution which is damped in space can be used, for example, when
an electromagnetic wave from free space is incident on an absorbing
medium. And the $x$-axis in (17.45) must be considered as directed into
the medium.
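The damping of a wave in an absorbing medium can be illustrated with a few lines of code. This is a sketch under stated assumptions: the complex dielectric constant `EPS` and the frequency `OMEGA` are invented sample values, $\chi = 1$ as in the text, and the function names are hypothetical.

```python
import cmath

C_LIGHT = 2.998e10  # speed of light, cm/s

def refractive_parts(eps):
    """Return (nu1, nu2) with sqrt(eps) = nu1 + i*nu2 (principal root)."""
    root = cmath.sqrt(eps)
    return root.real, root.imag

def attenuation_length(eps, omega):
    """Distance c / (nu2 * omega) over which the amplitude falls e times."""
    _, nu2 = refractive_parts(eps)
    return C_LIGHT / (nu2 * omega)

def amplitude_factor(eps, omega, x):
    """Modulus of exp(i*omega*sqrt(eps)*x/c), the spatial damping factor."""
    return abs(cmath.exp(1j * omega * cmath.sqrt(eps) * x / C_LIGHT))

EPS = complex(2.0, 0.1)  # sample complex dielectric constant (assumed)
OMEGA = 3.0e15           # rad/s (assumed)

nu1, nu2 = refractive_parts(EPS)
depth = attenuation_length(EPS, OMEGA)
factor_at_depth = amplitude_factor(EPS, OMEGA, depth)
```

For a positive imaginary part of the dielectric constant the principal square root has a positive imaginary part as well, so the amplitude falls off in the direction of propagation; at the distance `depth` it has dropped by the factor $e$.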

Exercises

1) Consider the reflection of a plane electromagnetic wave from the
interface between two transparent (nonabsorbing) media $a$ and $b$ with refractive
indexes $\nu_{1a} = \nu_a$ and $\nu_{1b} = \nu_b$ ($\nu_{2a} = \nu_{2b} = 0$). Solve the problem in two
cases: I) the electric vector lies in the plane drawn through the normal to the
interface and through the wave vector $\mathbf{k}$, II) the electric vector is parallel
to the interface between the two media. Calling the angle of incidence $\theta$ and
the angle of refraction $\vartheta$, find the ratio of the amplitudes of the incident and
reflected waves for both cases, I and II (Fig. 26).

At the interface, the normal components of the inductions and the tangential
components of the fields must be equal. In order to satisfy the conditions at
the boundary, it is necessary to introduce a third wave, which is reflected
from the interface. We shall take the equation of the interface to be $y = 0$.

The phase of the incident wave at the interface is $\dfrac{\omega}{c}\nu_a n_x x = \dfrac{\omega}{c}\nu_a x\sin\theta$, that
for the reflected wave $\dfrac{\omega}{c}\nu_a x\sin\theta_1$, and that for the refracted wave $\dfrac{\omega}{c}\nu_b x\sin\vartheta$.
All three phases must coincide over the whole interface, whence

$\nu_a\sin\theta = \nu_b\sin\vartheta$ (the law of refraction),

$\theta = \theta_1$ (the law of reflection).

Taking into account that $H = \sqrt{\dfrac{\varepsilon}{\chi}}\,E$ [this is easily obtained from (17.39)
and (17.41)], we write the boundary conditions (see exercise 1, Sec. 16):

$$\nu_a^2\,(E\sin\theta - E_1\sin\theta) = \nu_b^2\,E_2\sin\vartheta,$$

$$E\cos\theta + E_1\cos\theta = E_2\cos\vartheta,$$

$$\nu_a\,(E - E_1) = \nu_b\,E_2,$$

where $E$, $E_1$ and $E_2$ are the electric fields in the incident, reflected and
refracted waves. We can see that, by eliminating $E$, $E_1$, and $E_2$ from these conditions,
we again obtain the law of refraction. The ratio of the amplitudes is

$$\frac{E_1}{E} = \frac{\tan(\theta - \vartheta)}{\tan(\theta + \vartheta)}. \tag{I}$$

If $\theta + \vartheta = \dfrac{\pi}{2}$, then $E_1 = 0$ and reflection does not occur. (How can this
be verified by double reflection?)

In case II we must write down the boundary conditions and obtain the
equation

$$\frac{E_1}{E} = \frac{\sin(\theta - \vartheta)}{\sin(\theta + \vartheta)}. \tag{II}$$

(I) and (II) are called the Fresnel equations.

2) In the case $\dfrac{\nu_a}{\nu_b}\sin\theta > 1$ (total internal reflection), show that instead
of equations (I) and (II), a reflection coefficient of unity is obtained. Find
at what depth the wave, passing in the medium $b$, is attenuated $e$ times.

3) Find the frequency of electromagnetic oscillations in an infinite square
prism with perfectly reflecting walls, assuming a longitudinal electric field
constant along the length of the prism. Consider that the field inside the prism
does not become zero.

We must consider that the tangential component of the electric field at
the walls of the prism is equal to zero, so that the normal component of the
Poynting vector $\mathbf{U}$ should become zero. The solution to Maxwell’s equations
can be obtained from the potential $A_x = A_0\sin\dfrac{\pi y}{a}\sin\dfrac{\pi z}{a}\,e^{-i\omega t}$, $A_y = A_z = \varphi = 0$
(the $x$-coordinate is taken along the axis of the prism). It is of the form

$$E_x = E_0\sin\frac{\pi y}{a}\sin\frac{\pi z}{a}\,e^{-i\omega t},$$

$$H_y = H_0\cos\frac{\pi z}{a}\sin\frac{\pi y}{a}\,e^{-i\omega t},$$

$$H_z = -H_0\sin\frac{\pi z}{a}\cos\frac{\pi y}{a}\,e^{-i\omega t}$$

on the condition that

$$E_0 = i\sqrt{2}\,H_0 \quad \text{and} \quad \omega^2 = \frac{2\pi^2}{a^2}\,c^2$$

($a$ is the side of the square).

4) Solve the same problem for a travelling wave in the prism (a waveguide). 
The field in a w’aveguide is not zero anywhere except at the walla. 

The form of the vector potential in the previous problem suggests one of 
the following possible solutions: 


Sec. 18] 


TRANSMISSION OF SIGNALS. ALMOST PLANE W.AVES 


173 


Ax = A^e ' sin 

Ct Or 

= *4oye~' cos -^sin-^, 

-4z = <p = 0 . 

In order to satisfy the condition div A = 0, we must demand that

$$ikA_{0x} - \frac{\pi}{a}A_{0y} = 0.$$

The normal (to the walls) components of the vector U are again equal to zero
since the tangential component of E is zero at the walls.

We determine the frequency from equation (12.37):

$$\omega^2 = c^2k^2 + \frac{2\pi^2c^2}{a^2}.$$

From here it can be seen that the frequency obtained in exercise 3) is
the smallest wave frequency that can be propagated in the prism. This wave
corresponds to $\lambda = \infty$.

5) Show that when two waves, which are circularly polarized in opposite
directions, and which have equal amplitudes (but with frequencies differing
by a small quantity $\Delta\omega$) and are travelling in one direction, are combined,
a wave is obtained whose polarization vector rotates to an extent depending
on the distance of propagation.


Sec. 18. Transmission of Signals. Almost Plane Waves

The impossibility of transmitting a signal by means of a monochromatic
wave. A plane monochromatic wave (17.42) extends without
limit in all directions of space and in time. Nowhere, so to speak,
does it have a beginning or an end. What is more, its properties are
everywhere always the same; its frequency, amplitude and the distance
between two travelling crests (i.e., the wavelength $\lambda$) are always
constant. All this can be easily seen by considering a sinusoid or helix.

Let us now pose the problem of the possibility of transmitting an
electromagnetic signal over a distance. In order to transmit the signal,
an electromagnetic disturbance must be concentrated in a certain
volume. By propagation, this disturbance can reach another region of
space; detected by some means (for example, a radio receiver), it will
transmit to the point of reception a signal about an event occurring
at the point of transmission. Likewise, our visual perceptions are a
continuous recording of electromagnetic (light) disturbances
originating in surrounding objects.

A signal must somehow be bounded in time in order to give notice 
of the beginning and end of any event. 

In order to transmit a signal the amplitude of the wave must, for
a time, be somehow changed. For example, the amplitude of one of the
waves of a sinusoid must be increased and we must wait until this
increased amplitude arrives at the receiving device. A strictly
monochromatic wave, i.e., a sinusoid, has the same amplitude everywhere
and is therefore not suitable for the transmission of signals in time.
In the same way, an ideal plane wave with a given wave vector cannot
transmit the image of an object limited in space.

The propagation of a nonmonochromatic wave. We shall now consider
what can be done by superimposing several sinusoids upon one
another. Suppose that the frequencies of all these travelling waves
are included within an interval
$\omega_0 - \frac{\Delta\omega}{2} \le \omega \le \omega_0 + \frac{\Delta\omega}{2}$. We shall
consider that the frequency interval $\Delta\omega$ is considerably smaller than
the “carrier” frequency $\omega_0$. The amplitudes of all the waves $E_0(\omega)$
will be assumed to be identical for any frequency within the chosen
interval, and equal to zero outside that interval.

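Then the resultant oscillation will be represented by the integral
of all the partial oscillations:

$$E = \int\limits_{\omega_0 - \Delta\omega/2}^{\omega_0 + \Delta\omega/2} E_0(\omega)\,e^{-i(\omega t - kx)}\,d\omega = E_0 \int\limits_{\omega_0 - \Delta\omega/2}^{\omega_0 + \Delta\omega/2} e^{-i(\omega t - kx)}\,d\omega. \tag{18.1}$$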

In this equation not only the frequency is variable, but also the
absolute value of the wave vector k (the so-called wave number).

According to (17.22), it is equal to $\omega/c$ in free space and, in a material
medium, $k = \omega/v$, where $v$ in turn is a function of frequency. In future,
in this section, we shall always assume that there is no absorption.

Since the frequency lies within a small interval, k can be expanded
in a series in powers of $\omega - \omega_0$:

$$k(\omega) = k(\omega_0) + (\omega - \omega_0)\left(\frac{dk}{d\omega}\right)_{\omega_0}. \tag{18.2}$$

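Substituting (18.2) in the integral (18.1), we obtain the following
expression for the resultant field (here $k_0 = k(\omega_0)$):

$$E = E_0\,e^{-i(\omega_0 t - k_0 x)} \int\limits_{\omega_0 - \Delta\omega/2}^{\omega_0 + \Delta\omega/2} e^{-i(\omega - \omega_0)\left(t - \frac{dk}{d\omega}x\right)}\,d\omega. \tag{18.3}$$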

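We now introduce a new integration variable $\omega - \omega_0$. Then the
integration can be easily performed and the field reduces to the
following form:

$$E = E_0\,e^{-i(\omega_0 t - k_0 x)}\,\frac{2\sin\left[\frac{\Delta\omega}{2}\left(t - \frac{dk}{d\omega}x\right)\right]}{t - \frac{dk}{d\omega}x}. \tag{18.4}$$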




The shape of the signal. Let us now examine the expression obtained.
It consists of two factors. The first of them, $E_0\,e^{-i(\omega_0 t - k_0 x)}$,
represents a travelling wave homogeneous in space with a mean “carrier”
frequency $\omega_0$. However, the amplitude of the resultant wave is no
longer constant in space because of the second factor:

$$g(\psi) = \Delta\omega\,\frac{\sin\psi}{\psi}, \qquad \psi = \frac{\Delta\omega}{2}\left(t - \frac{dk}{d\omega}x\right),$$

where the designations $g$ and $\psi$ are obvious from the equations. This
factor has a greatest maximum at $\psi = 0$, i.e., when the argument
of the sine and the denominator are equal to zero. The other extremes
will be the less the greater their number (Fig. 27). The greatest
maximum is equal to $\Delta\omega$ (since $\frac{\sin\psi}{\psi} = 1$ for $\psi = 0$). This
maximum is not situated at a fixed place, but moves in space with a velocity

$$v = \frac{d\omega}{dk} \tag{18.5}$$

because, from the definition of the point of maximum $\psi = 0$, it follows that

$$x = \frac{d\omega}{dk}\,t = vt.$$

As we indicated at the beginning of this section, a signal can be 
transmitted from one point of space to another by means of a displace¬ 
ment of the maximum, since this maximum is distinguished from other 
maxima. 

A disturbance of this kind concentrated in space is called a wave 
packet. 
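The formation of a wave packet can be checked by direct summation of the partial waves. The sketch below assumes the simplest case k = ω/c (no dispersion), units with c = 1 and arbitrary illustrative values of the carrier frequency and band; it compares the absolute value of the sum with the factor Δω·sin ψ/ψ, where ψ = (Δω/2)(t − x·dk/dω).

```python
import cmath
import math

def packet(x, t, w0=100.0, dw=4.0, c=1.0, n=2000):
    """Midpoint-rule sum of exp(-i(w t - k x)) over the band, with k = w/c and E0 = 1."""
    h = dw / n
    total = 0j
    for j in range(n):
        w = w0 - dw / 2 + (j + 0.5) * h
        total += cmath.exp(-1j * (w * t - (w / c) * x)) * h
    return total

def envelope(x, t, dw=4.0, dk_dw=1.0):
    """g(psi) = dw * sin(psi) / psi with psi = (dw / 2) * (t - dk_dw * x)."""
    psi = 0.5 * dw * (t - dk_dw * x)
    return dw if psi == 0 else dw * math.sin(psi) / psi

# Away from the maximum the summed field follows the envelope ...
print(abs(packet(0.3, 1.0)), abs(envelope(0.3, 1.0)))
# ... and at x = v t (here v = c = 1) it reaches the greatest value, dw.
print(abs(packet(2.0, 2.0)))
```

With a dispersion law other than k = ω/c the same comparison holds only approximately, to the accuracy of the linear expansion of k(ω).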

The propagation of a signal of arbitrary shape. A wave packet need
not necessarily have the form shown in Fig. 27. By choosing a
relationship for $E_0(\omega)$ other than that in equation (18.1) (i.e., by choosing
not a constant amplitude in the interval $\Delta\omega$, but a more complicated
frequency function), the shape of $g(\psi)$ can be changed. For instance,
the resultant amplitude may have the shape of a rectangle, so that
the transmitted signal will resemble the dash in the Morse code. If
the frequency $\omega_0$ is within the radio-frequency range, then the signals
can follow upon one another within audio frequency, in this way
reproducing music or speech.

The frequency range and the duration of the signal. In order to
transmit a signal, it is always necessary to choose a range of frequencies.
Let us determine this range. Suppose that the receiving device is
situated at some point x = const. The width of the received signal
can be seen from Fig. 27. In units of $\psi$, it is equal to $\pi$ in order of
magnitude. Therefore, the duration of the signal is determined from the
equation

$$\Delta\psi = \frac{\Delta\omega}{2}\,\Delta t \sim \pi.$$

In other words, the duration of the signal $\Delta t$ is related to the frequency
interval $\Delta\omega$ necessary for its transmission by the expression

$$\Delta\omega \cdot \Delta t \sim 2\pi. \tag{18.6}$$

It should be noted that this estimate refers only to the order of
magnitude of $\Delta\omega$ and $\Delta t$. The determination of $\Delta\psi$ is, to some extent,
arbitrary. In certain cases $\Delta\omega \cdot \Delta t > 2\pi$, so that the estimate (18.6)
is a lower bound.

If a radio station is required to transmit sounds audible to the human
ear, then the quantity $\Delta t$ is not greater than $0.5 \times 10^{-4}$ sec, since
the limit of audibility is $2 \times 10^{4}$ oscillations per sec. From this
$\Delta\omega = 2\pi \cdot 2 \cdot 10^{4}$.

The range of $\Delta\omega$ is always less than the “carrier” frequency $\omega_0$
which, even for the longest-wave transmitting stations, is not less
than $1.5 \times 10^{5} \times 2\pi$. In practice, an interval of $\Delta\omega$ three or four times
less than the value given is quite sufficient, since clipping off the
very highest frequencies in music, singing or speech does not introduce
any essential distortion.
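The arithmetic of the estimate Δω·Δt ~ 2π can be written out in a few lines. This sketch only restates the numbers quoted above (a shortest audible detail of 0.5 × 10⁻⁴ sec and a lowest carrier of 1.5 × 10⁵ × 2π); the function name is an illustrative assumption.

```python
import math

def bandwidth(duration):
    """Order-of-magnitude estimate: delta_omega ~ 2 * pi / delta_t."""
    return 2 * math.pi / duration

dt_audio = 0.5e-4                   # sec, shortest detail to be transmitted
dw_audio = bandwidth(dt_audio)      # required interval of angular frequencies
carrier = 2 * math.pi * 1.5e5       # lowest carrier frequency quoted above

print(dw_audio / (2 * math.pi))     # band in oscillations per sec
print(dw_audio / carrier)           # band as a fraction of the carrier
```

The band comes out as a modest fraction of even the lowest carrier frequency, in agreement with the remark that Δω is always less than ω₀.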

Television transmissions require a considerably greater frequency
interval, because an image must be reproduced 25 times every second;
and, in turn, the image consists of tens of thousands of separate signals
(points). As a result, the carrier frequency is about $2\pi \times 6 \times 10^{7}$,
corresponding to the metric band of radio waves. Such waves are
propagated over a relatively small radius. They are screened by the
curvature of the earth’s surface like light. The relation (18.6) is always
correct in order of magnitude; therefore, for distant television
transmissions, it is necessary to have either relay stations, very high-placed
transmitters, or cable lines.

Phase and group velocity. We shall now consider in more detail the
velocity with which signals are transmitted. From (18.5), the velocity
of a wave packet is

$$v = \frac{d\omega}{dk}.$$

It differs from the propagation velocity of the constant phase
surface, which is expressed in terms of frequency and wave number as

$$u = \frac{\omega}{k}. \tag{18.7}$$

Indeed, the expression for a travelling monochromatic wave can be
written in the following form:

$$E = E_0\,e^{-i(\omega t - kx)} = E_0\,e^{ik\left(x - \frac{\omega}{k}t\right)}.$$

Comparing this formula with the general expression for a travelling
wave $E = E(x - ut)$, we arrive at (18.7). The velocity of the wave is u,
and not c, because (18.7) is by no means necessarily related to the
propagation of a plane wave in a vacuum.

$u = \frac{\omega}{k}$ is called the phase velocity of the wave; v is called the group
velocity of the wave packet obtained by superimposing a group of
waves. In a vacuum, v and u coincide because $\omega = ck$. However, if
there is dispersion, i.e., a dependence of the refractive index on the
frequency, then $\omega = \frac{c}{\sqrt{\varepsilon}}\,k$ so that $v \neq u$.

The group velocity may be regarded as the velocity of propagation 
of a signal only when it is less than the velocity of light in free space c. 
If the expression (18.5) formally gives $v > c$, we cannot avoid a more
careful analysis that takes into account absorption. As a result, it 
turns out that an electromagnetic signal in the form of a very weak 
precursor is propagated with a velocity c, but the major portion of the 
wave energy arrives at the point of reception with a lesser velocity 
(see A. Sommerfeld, Optik, Wiesbaden, 1950). 

As an example of the calculation of group velocity we shall take the
dependence of frequency on the wave vector in the form:

$$\omega^2 = \omega_0^2 + c^2k^2.$$

This form is obtained for a waveguide (see exercise 4, Sec. 17, or
exercise 19, Sec. 16, in the limiting case of extremely large frequencies).
Whence the group velocity is

$$v = \frac{d\omega}{dk} = \frac{c^2k}{\omega}$$

and since $ck < \omega$, we have $v < c$.

Here, the phase velocity proves to be greater than c:

$$u = \frac{\omega}{k} = c\,\frac{\omega}{ck} > c.$$

We note that $uv = c^2$.
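The relation uv = c² for a dispersion law of the form ω² = ω₀² + c²k² is easy to confirm numerically. The following sketch works in units with c = 1 and estimates dω/dk by a central difference; the values of k and ω₀ are arbitrary illustrative choices.

```python
import math

C = 1.0  # units with c = 1 (assumption made for the illustration)

def omega(k, w0):
    """Dispersion law omega^2 = w0^2 + c^2 k^2."""
    return math.sqrt(w0 ** 2 + (C * k) ** 2)

def phase_velocity(k, w0):
    """u = omega / k."""
    return omega(k, w0) / k

def group_velocity(k, w0, h=1e-6):
    """Central-difference estimate of v = d(omega)/dk."""
    return (omega(k + h, w0) - omega(k - h, w0)) / (2 * h)

k, w0 = 2.0, 1.5
u, v = phase_velocity(k, w0), group_velocity(k, w0)
print(u, v, u * v)  # u exceeds c, v is below c, and their product is c^2
```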

In vector form the group velocity is defined as follows:

$$\mathbf{v} = \frac{\partial\omega}{\partial\mathbf{k}}. \tag{18.8}$$


If we make use of a more accurate dispersion law (obtained in
exercise 19, Sec. 16), then for $\omega^2 \sim \omega_0^2$ there proves to be a frequency
region for which $\varepsilon(\omega)$ is negative. For such frequencies the refractive
index is purely imaginary and the expressions (18.5) and (18.8) become
meaningless.




The form of a wave in space and the range of the wave vectors. An
expression similar to (18.6) can also be obtained for the form of a wave
in space at a definite instant of time. For this we must put t = const
and then, once again taking $\Delta\psi \sim \pi$, we obtain

$$\Delta\psi = \frac{\Delta\omega}{2}\,\frac{dk}{d\omega}\,\Delta x = \frac{\Delta k}{2}\,\Delta x \sim \pi,$$

$$\Delta k \cdot \Delta x \sim 2\pi. \tag{18.9}$$

This means that if we want to limit the extent of an electromagnetic
disturbance to a region $\Delta x$, we must perform a superposition of
monochromatic waves in the interval of values k of order $\frac{2\pi}{\Delta x}$. In three
dimensions (18.9) is rewritten thus:

$$\Delta k_x \cdot \Delta x \sim 2\pi, \qquad \Delta k_y \cdot \Delta y \sim 2\pi, \qquad \Delta k_z \cdot \Delta z \sim 2\pi. \tag{18.10}$$

The limiting accuracy of radiolocation. We shall explain the relations
(18.10) by means of a graphic example. Let us suppose that an
electromagnetic wave has, in some way, to be bounded on the sides, as in the
case of a radiolocation (radar) beam. Let us find the greatest accuracy
with which the locator can register the position of an object at a
distance $l$. Obviously, this accuracy is given by the transverse diameter
of the beam d at a distance $l$ from the locator.

Let the frequency at which the locator works be equal to $\omega$, then
the corresponding wavelength is $\lambda = \frac{2\pi c}{\omega}$. If the electromagnetic wave
were to be propagated in unbounded space it would have an
accurately defined wave vector

$$\mathbf{k} = \frac{\omega}{c}\,\mathbf{n} \tag{18.11}$$

(n is a unit vector in the direction of the beam). If the wave has a
cross section d, then k can no longer be regarded as an accurately
defined vector along n. In order to write down an expression for the
electromagnetic wave at any point in space occupied by the beam,
it is necessary to take a group of plane waves whose vectors k lie inside
a cone described by a certain angle of flare. The maximum deviation
of the wave vectors of these plane waves from the mean vector k,
determined from (18.11), will be called $k_\perp$. Here, we do not have in
mind a cone with a sharply bounded surface, but only an estimate
of the angular flare of the beam. According to (18.10), $k_\perp$ is related to the
whole cross section of the beam by the following relation:

$$2k_\perp \cdot d \sim 2\pi. \tag{18.12}$$

Here, we have put $\Delta x = d$, $\Delta k = 2k_\perp$ because the inaccuracy
obtains on both sides of the axis of the beam.

The dimensions of the reflector of the locator itself can be ignored if 
the diameter of the beam is considered at a great distance; and this is 
of practical interest. In other words, d is determined only by the 
relationship (18.12) and is independent of the dimensions of the re¬ 
flector. 

The divergence of the beam of rays at every point is measured by
the ratio $\frac{k_\perp}{k}$. For this reason, the ratio of the cross section of the beam
d to the distance from the locator $l$ cannot be less than the quantity
$\frac{2k_\perp}{k}$:

$$\frac{d}{l} \geq \frac{2k_\perp}{k}. \tag{18.13}$$


This relationship is shown in Fig. 28 for the limiting case of the
equality. However, it must be borne in mind that it is not in reality
an equality but an estimate of order of magnitude. (18.12) is also
approximate and the symbol $\gtrsim$ must be written in it.

Thus, we have obtained two estimates for $k_\perp$: from (18.12),

$$k_\perp \gtrsim \frac{\pi}{d} \quad \text{(lower estimate)}$$

and, from (18.13),

$$k_\perp \lesssim \frac{kd}{2l} \quad \text{(upper estimate)}.$$

Eliminating $k_\perp$ from these estimates, we obtain

$$\frac{\lambda}{d} \lesssim \frac{d}{l}$$

or finally

$$d \gtrsim \sqrt{\lambda l}. \tag{18.14}$$


For example, if $l = 100$ km and $\lambda = 1$ m, then the position of the
object cannot be determined with an accuracy exceeding 320 m.
This is why the dimensions of the reflector could be neglected in the
estimate.
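The numerical example can be reproduced directly from the estimate d ~ √(λl). This is a minimal sketch; the function name is an illustrative assumption, and both lengths must be given in the same units.

```python
import math

def min_beam_width(wavelength, distance):
    """Order-of-magnitude estimate d ~ sqrt(lambda * l); same length units for both."""
    return math.sqrt(wavelength * distance)

# lambda = 1 m and l = 100 km, as in the example above
print(min_beam_width(1.0, 100e3))  # about 316 m, i.e. roughly 320 m
```

Since the estimate is one of order of magnitude only, the difference between 316 m and the rounded figure of 320 m has no significance.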

The limit of applicability of the concept of a ray. Equations (18.10)
indicate within what limits the concept of a ray is applicable in optics.
Obviously, one can talk about a ray in a definite direction only when

$$\Delta k \ll k, \tag{18.15}$$

i.e., when the transverse broadening of the wave vector is considerably
less than the wave vector itself. But $\Delta k \sim \frac{2\pi}{d}$ and $k = \frac{2\pi}{\lambda}$ so that
(18.15) is equivalent to the condition

$$d \gg \lambda. \tag{18.16}$$


In other words, the dimensions of the region in which the concept
of a light ray is defined must be considerably larger than the
wavelength of the light wave. For example, a small circle in the wall of a
camera-obscura of diameter, say, 1 mm is considerably greater than
the wavelength of visible light, which is of an order of magnitude
$0.6 \times 10^{-4}$ cm. Therefore, the image obtained in a camera-obscura is
formed with the aid of light rays.

The optics of light rays is called geometrical optics. A ray is defined 
only when its direction is given, i.e., the normal to the wave front. 
If we are given a beam of nonparallel (for example, converging) rays, 
then the wave front is curved. But the radius of its curvature at each 
point is considerably greater than the wavelength. Such a converging 
beam of rays represents a set of normals to an “almost plane” wave. 
The curvature of the wave front close to the focus of the rays may
become comparable with the wavelength, and then there arise
deviations from geometrical optics. Such deviations are called diffraction
effects. They are also observed when a light wave falls on some opaque
obstacle. In accordance with geometrical optics, we should have ob¬ 
tained a sharp shadow—a transition from a region where the field differs 
from zero to a region where it is equal to zero. But Maxwell’s equa¬ 
tions do not permit such solutions, which are discontinuous in free 
space (cf. the boundary conditions, exercise 1, Sec. 16). In actual fact, 
there always exists a transition zone between “light” and “shadow,” 
in which the wave amplitude changes in a complicated oscillatory 
way. 

Exercises 

1) Find the limiting dimensions of an object which may be observed in 
a microscope using light of wavelength X. 

Denoting the semiangle of the cone of rays, drawn from the microscope
objective to the object, by $\theta$, we have $\Delta k = k\sin\theta$. Whence

$$\Delta x \sim \frac{2\pi}{\Delta k} = \frac{2\pi}{k\sin\theta} = \frac{\lambda}{\sin\theta}.$$

It is therefore convenient to use a beam of rays with large solid angles
and small wavelengths.

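2) Show that if the dispersion law of exercise 19, Sec. 16, is used, then $v < c$.
We write down an expression for the inverse of v:

$$\frac{1}{v} = \frac{\partial k}{\partial\omega} = \frac{1}{c}\left(\sqrt{\varepsilon} + \omega\,\frac{\partial\sqrt{\varepsilon}}{\partial\omega}\right).$$

This leads to the inequality

$$\varepsilon + \frac{\omega}{2}\,\frac{\partial\varepsilon}{\partial\omega} > \sqrt{\varepsilon}.$$

The derivative $\frac{\partial\varepsilon}{\partial\omega}$ is everywhere positive so that, for $\varepsilon > 1$, the inequality
can be seen directly. When $\varepsilon < 1$ we have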


e + 


= 1 + 


2 Sm * ' — ’ 


Squaring both sides of the inequality, it is easy to see that this quantity is
greater than $\sqrt{\varepsilon}$.


Sec. 19. The Emission of Electromagnetic Waves


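Basic equations and boundary conditions. So far we have considered
electromagnetic waves irrespective of the charges producing them.
In this section we shall consider the emission of waves by point charges
moving in a vacuum. The basic system of equations in this case is
(12.37) and (12.38) together with the Lorentz condition (12.36). We
rewrite these equations anew:

$$\Delta\mathbf{A} - \frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} = -\frac{4\pi}{c}\,\mathbf{j}, \tag{19.1}$$

$$\Delta\varphi - \frac{1}{c^2}\frac{\partial^2\varphi}{\partial t^2} = -4\pi\rho, \tag{19.2}$$

$$\operatorname{div}\mathbf{A} + \frac{1}{c}\frac{\partial\varphi}{\partial t} = 0. \tag{19.3}$$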

We begin the solution with (19.2) proceeding in the following
manner. We assume that $\rho$ differs from zero only in an infinitesimal
volume $dV$. We find the potential $\varphi$ for such a “point” radiator. By virtue
of the linearity of equation (19.2), the potential of the entire spatial
distribution of charge appearing on the right-hand side of (19.2) is
equal to the integral of the potentials due to infinitesimally small
elements of charge $\delta e = \rho\,dV$.

In order to determine the solution uniquely, we must impose a cer¬ 
tain boundary condition. It is assumed that the charges are situated 
in infinite space, i.e., that there are no conductors or dielectrics any¬ 
where. 

In free space, a boundary condition can be imposed only at an infinite
distance away from the charges. In accordance with the posed problem
of radiation, it is natural to suppose that there was no field for an
infinitely large interval of time before the initiation of radiation at an
infinitely large distance from the radiator:

$$\varphi(t \to -\infty,\ r \to \infty) = 0, \qquad \mathbf{A}(t \to -\infty,\ r \to \infty) = 0. \tag{19.4}$$

If no boundary conditions are imposed on the solution of an
inhomogeneous equation, then any solution of the homogeneous equation can
always be added to it so that a unique answer cannot be obtained.

The radiation of a small element of charge. Let us begin with an
infinitesimal charge element $\delta e = \rho\,dV$. We place it at the coordinate origin. Then
the solution of (19.2) will possess spherical symmetry. In Sec. 11, an
expression for the Laplacian operator $\Delta$ was derived in spherical
coordinates (11.46). As in the case of a static charge [equation (14.7)], we
must retain only the term involving differentiation with respect to r
and, this time, obviously, we must also differentiate with respect to
time. For the time being we consider that the charge density at all
points, except the origin, is equal to zero. Therefore, for all points
for which $r \neq 0$ equation (19.2) is written thus:

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\varphi}{\partial r}\right) - \frac{1}{c^2}\frac{\partial^2\varphi}{\partial t^2} = 0. \tag{19.5}$$


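Temporarily, we put

$$\varphi = \frac{\Phi(r, t)}{r}.$$

Then

$$\frac{\partial\varphi}{\partial r} = \frac{1}{r}\frac{\partial\Phi}{\partial r} - \frac{\Phi}{r^2}, \qquad r^2\frac{\partial\varphi}{\partial r} = r\frac{\partial\Phi}{\partial r} - \Phi,$$

$$\frac{\partial}{\partial r}\left(r^2\frac{\partial\varphi}{\partial r}\right) = \frac{\partial\Phi}{\partial r} + r\frac{\partial^2\Phi}{\partial r^2} - \frac{\partial\Phi}{\partial r} = r\frac{\partial^2\Phi}{\partial r^2}. \tag{19.6}$$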


Substituting this in (19.5) and multiplying by r (by convention, r is
not equal to zero), we obtain

$$\frac{\partial^2\Phi}{\partial r^2} = \frac{1}{c^2}\frac{\partial^2\Phi}{\partial t^2}. \tag{19.7}$$


But this is the equation, of the form (17.6), for the propagation of a
wave. Its solution is similar to (17.12):

$$\Phi = \Phi_1\left(t + \frac{r}{c}\right) + \Phi_2\left(t - \frac{r}{c}\right). \tag{19.8}$$

The solution $\Phi_1$ depends on the argument $t + \frac{r}{c}$ and the solution
$\Phi_2$ depends on the argument $t - \frac{r}{c}$. The first of these arguments,
$t + \frac{r}{c}$, for $r \to \infty$, $t \to -\infty$ has a completely indeterminate form $\infty - \infty$,
i.e., it is equal to anything. From the condition (19.4), the function $\Phi$
becomes zero when $r \to \infty$, $t \to -\infty$. Therefore $\Phi_1$ becomes zero for
any value of the argument, i.e., it is equal to zero everywhere. (The
potential at infinity must tend to zero more strongly than $\frac{1}{r}$ so that
there should be no radiation; see below in this section.) For the
function $\Phi_2$, condition (19.4) denotes that $\Phi_2(-\infty) = 0$. In other
words, the function $\Phi_2$ tends to zero at minus infinity. It does not
follow from this, of course, that it is equal to zero everywhere. Thus,
omitting the index 2, we write the expression for $\varphi$ as follows:


$$\varphi = \frac{\Phi\left(t - \frac{r}{c}\right)}{r}. \tag{19.9}$$


The function $\Phi$ is not as yet determined. From the form of its
argument we conclude that it describes a travelling wave in the direction
of increasing radii (because $c > 0$). Such a wave is termed diverging.

Retarded potential. The value of the function at $r = 0$, $t = 0$ is shifted
to the point $r$ in a time $t = \frac{r}{c}$ or, in other words, the potential at the
point $r$ and time $t$ is determined by the charge, not at the instant of
time $t$, but at an earlier instant $t - \frac{r}{c}$. The term $\frac{r}{c}$ is a measure of the
retardation occurring as a result of the finite velocity of propagation
of the wave.
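That a function of the form Φ(t − r/c)/r satisfies the spherically symmetrical wave equation outside the origin can be verified by finite differences. The sketch below is only an illustration: the Gaussian profile chosen for Φ, the units with c = 1 and the step h are all assumptions.

```python
import math

C = 1.0  # units with c = 1 (assumption made for the illustration)

def big_phi(s):
    """An arbitrary smooth profile Phi(s); a Gaussian is chosen for illustration."""
    return math.exp(-s * s)

def phi(r, t):
    """Diverging wave: Phi(t - r/c) / r."""
    return big_phi(t - r / C) / r

def residual(r, t, h=1e-3):
    """Finite-difference value of (1/r^2) d/dr (r^2 dphi/dr) - (1/c^2) d2phi/dt2."""
    def flux(rr):
        # r^2 * dphi/dr, with the derivative taken as a central difference
        return rr * rr * (phi(rr + h, t) - phi(rr - h, t)) / (2 * h)
    radial = (flux(r + h) - flux(r - h)) / (2 * h) / (r * r)
    second_t = (phi(r, t + h) - 2 * phi(r, t) + phi(r, t - h)) / (h * h)
    return radial - second_t / C ** 2

print(residual(2.0, 1.5))  # close to zero, within the finite-difference error
print(phi(2.0, 2.0) * 2.0) # the crest Phi(0) = 1 reaches r = 2 at the time t = r/c = 2
```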

But when the retardation becomes a very small quantity, the
potential very close to the charge must be determined by the
instantaneous value of the charge $\delta e(t)$. We know from Sec. 14 that the
potential due to a point charge is equal to $\frac{e}{r}$ (14.8), whence

$$\frac{\Phi(t)}{r} = \frac{\delta e(t)}{r} = \frac{\rho(t)\,dV}{r}.$$

Therefore,

$$\Phi(t) = \rho(t)\,dV, \tag{19.10}$$

and the potential of the charge element is, in accordance
with (19.9) and (19.10), equal to

$$\delta\varphi = \frac{\rho\left(t - \frac{r}{c}\right)dV}{r}. \tag{19.11}$$


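Now, displacing the coordinate origin to another point, we obtain,
like (14.9),

$$\delta\varphi = \frac{\rho\left(\mathbf{r},\ t - \frac{|\mathbf{R} - \mathbf{r}|}{c}\right)}{|\mathbf{R} - \mathbf{r}|}\,dV. \tag{19.12}$$

Here it is assumed that the charge density is given at the point
r (x, y, z), and the potential is calculated at the point R (X, Y, Z),
thus introducing the explicit dependence of $\rho$ on the spatial argument r.
Finally, in order to obtain the complete solution to (19.2), we must
integrate (19.12) over all the volume elements, i.e., with respect to
$dV = dx\,dy\,dz$:

$$\varphi(\mathbf{R}, t) = \int \frac{\rho\left(\mathbf{r},\ t - \frac{|\mathbf{R} - \mathbf{r}|}{c}\right)}{|\mathbf{R} - \mathbf{r}|}\,dV. \tag{19.13}$$

For point charges, $\rho$ denotes the special function that was defined
in Sec. 12.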




Equation (19.1) has exactly the same form as (19.2) and its solution
satisfies the same boundary conditions. Therefore, the vector
potential is written quite analogously to (19.13):

$$\mathbf{A}(\mathbf{R}, t) = \frac{1}{c}\int \frac{\mathbf{j}\left(\mathbf{r},\ t - \frac{|\mathbf{R} - \mathbf{r}|}{c}\right)}{|\mathbf{R} - \mathbf{r}|}\,dV. \tag{19.14}$$

Comparing (19.14) with (15.10), which gives A for a stationary
current, we see that j depends on the argument r in two ways: first,
directly, in accordance with its spatial distribution and, secondly,
via the time argument; since the system of currents is not infinitely
small, but has finite dimensions, the retardation of a wave from
various points of the system is different.

Retarded potential at a large distance from a system of charges.
We shall now look for the form of the solutions (19.13) and (19.14)
at a great distance away from the radiating system. We note that the
integrand depends on the argument R in both integrals in two ways:
in the denominator and via the argument t. The function $\frac{1}{|\mathbf{R} - \mathbf{r}|}$ in the
denominator depends very smoothly on R. Its expansion in terms
of powers of $\frac{1}{R}$ yields terms which tend to zero like $\frac{1}{R^n}$ at large
distances away from the system. As will be shown later, they do not
add anything to the radiation (for $n > 1$). So we simply replace
$\frac{1}{|\mathbf{R} - \mathbf{r}|}$ by $\frac{1}{R}$. At large distances from the system, the term $|\mathbf{R} - \mathbf{r}|$,
appearing in the argument t of the numerator, looks like this:

$$|\mathbf{R} - \mathbf{r}| \approx R - \mathbf{r}\nabla R = R - \frac{\mathbf{r}\mathbf{R}}{R} = R - \mathbf{r}\mathbf{n}, \tag{19.15}$$

where n is a unit vector in the direction of R. The subsequent terms
of the expansion (19.15) contain R in the denominator and are
insignificant. Thus, at a large distance from the radiating system, the
potentials are:

$$\varphi = \frac{1}{R}\int \rho\left(\mathbf{r},\ t - \frac{R}{c} + \frac{\mathbf{r}\mathbf{n}}{c}\right)dV, \tag{19.16}$$

$$\mathbf{A} = \frac{1}{cR}\int \mathbf{j}\left(\mathbf{r},\ t - \frac{R}{c} + \frac{\mathbf{r}\mathbf{n}}{c}\right)dV. \tag{19.17}$$
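The quality of the approximation |R − r| ≈ R − rn at a large distance can be seen on a numerical example. The vectors below are arbitrary illustrative values; the neglected terms are of order r²/R.

```python
import math

def exact_distance(R_vec, r_vec):
    """|R - r| computed directly."""
    return math.dist(R_vec, r_vec)

def far_zone_distance(R_vec, r_vec):
    """Approximation |R - r| ~ R - (r . n), with n = R / |R|."""
    R = math.hypot(*R_vec)
    r_dot_n = sum(a * b for a, b in zip(r_vec, R_vec)) / R
    return R - r_dot_n

R_vec = (600.0, 800.0, 0.0)   # point of observation, |R| = 1000 (illustrative)
r_vec = (1.0, 2.0, 3.0)       # point inside the radiating system (illustrative)

print(exact_distance(R_vec, r_vec))
print(far_zone_distance(R_vec, r_vec))
```

The two values differ by a few thousandths at a distance of a thousand units, which illustrates why the subsequent terms of the expansion can be dropped.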

An estimate of the retardation inside a system. The term $\frac{\mathbf{r}\mathbf{n}}{c}$ in
the arguments of the integrands of (19.16) and (19.17) indicates by
how much an electromagnetic wave, coming from the more distant
parts of the radiating system, is retarded in comparison with a wave
radiated by the nearer parts of the system. In other words, the term
$\frac{\mathbf{r}\mathbf{n}}{c}$ determines the time that the electromagnetic wave takes in
passing through the system of charges. If the velocity of the charges
is equal to v, then, in that time, they are displaced through a distance
$v\,\frac{r}{c}$. The retardation inside the system is negligible when this
distance is small in comparison with the size of the system r. Therefore,
if $v\,\frac{r}{c} \ll r$ (or, more simply, $v \ll c$), then the charges do not have
time to change their positions noticeably during the time of
propagation of the wave in the system.

However, in order that nothing should really change in the system,
the charges must also maintain their velocities in that time, because
the vector potential depends on the currents, i.e., on the particle
velocities. This imposes a further condition which is formulated in the
following manner. Let the charges oscillate and radiate light of
frequency $\omega$. The wavelength of the light is equal to $\lambda = \frac{2\pi c}{\omega}$. In the time
$\frac{r}{c}$ the phase of the charge oscillations changes by $\omega\frac{r}{c}$. This change
must be small in comparison with $2\pi$, whence it follows that the size
of the system must be small compared with the wavelength of the
radiated light in order that the retardation inside the system should
be insignificant. Thus, the term $\frac{\mathbf{r}\mathbf{n}}{c}$ in the argument of the integrand
is small provided two inequalities are fulfilled: $v \ll c$, $r \ll \lambda$.

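The vector potential to a dipole approximation. Let us assume that
both the inequalities obtained have been fulfilled. We omit the term
$\frac{\mathbf{r}\mathbf{n}}{c}$ in the time argument in the expression for vector potential (19.17).
Then the whole integrand will refer to the same instant of time
$t - \frac{R}{c}$ and we obtain

$$\mathbf{A} = \frac{1}{cR}\int \mathbf{j}\left(\mathbf{r},\ t - \frac{R}{c}\right)dV. \tag{19.18}$$

Recall now that $\mathbf{j} = \rho\mathbf{v}$ and that the charges are point charges.
Then the integral (19.18) is reduced to a summation over separate
charges:

$$\mathbf{A} = \frac{1}{cR}\left(\sum_i e_i\mathbf{v}_i\right)_{t - \frac{R}{c}}. \tag{19.19}$$

Here, the lower index $t - \frac{R}{c}$ denotes that the whole sum must be
taken at that instant of time. But $\mathbf{v}_i = \frac{d\mathbf{r}_i}{dt}$, so that

$$\mathbf{A} = \frac{1}{cR}\,\frac{d}{dt}\sum_i e_i\mathbf{r}_i = \frac{\dot{\mathbf{d}}\left(t - \frac{R}{c}\right)}{cR}. \tag{19.20}$$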

Here we have used the definition for dipole moment (14.20). We
note that (19.20) involves only a time derivative $\dot{\mathbf{d}}$. Therefore, the
transformation (14.21), which corresponds to a change of coordinate
origin, does not change $\dot{\mathbf{d}}$ either for a charged system or for a neutral
system. In particular, (19.20) holds also for a single charge.

The approximation (19.20), in which A is expressed in terms of a
derivative of the dipole moment of the system as a whole, is termed a
dipole approximation.

The Lorentz condition to a dipole approximation. In Sec. 17, a poten¬ 
tial gauge transformation for travelling plane waves was chosen such 
that the scalar potential became zero. We shall make the same gauge 
transformation for diverging spherical waves. To do this we must 
subject the vector potential to the following condition: 

divA = 0, (19.21) 


which is obtained from (19.3) if we take <p = 0. 

In condition (19.21) we should not differentiate A with respect to 
R in the denominator: each such differentiation increases the degree 
of R by unity, while the potential is determined at a large distance 
from the radiating system. Only terms inversely proportional to R
contribute to the radiated energy (see below). The unit vector $\mathbf{n} = \frac{\mathbf{R}}{R}$,
which will appear in differentiation, need not be differentiated a second
time, since that would also give rise to superfluous degrees of R in the
denominator.

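We choose an arbitrary gauge function in the form

$$f = \frac{\mathbf{n}\,\mathbf{d}\left(t - \frac{R}{c}\right)}{R}. \tag{19.22}$$

Then, applying equation (11.37), it is easy to see that the condition
(19.21) is fulfilled:

$$\operatorname{div}(\mathbf{A} + \nabla f) = -\frac{\mathbf{n}\ddot{\mathbf{d}}}{Rc^2} - \operatorname{div}\frac{\mathbf{n}(\mathbf{n}\dot{\mathbf{d}})}{Rc} = -\frac{\mathbf{n}\ddot{\mathbf{d}}}{Rc^2} + \frac{\mathbf{n}\ddot{\mathbf{d}}}{Rc^2} = 0.$$

And the scalar potential is cancelled by $-\frac{1}{c}\frac{\partial f}{\partial t}$.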

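Field to a dipole approximation. Let us now calculate the
electromagnetic field. We need to differentiate only inside the argument
$t - \frac{R}{c}$. In calculating the magnetic field, we make use of (11.38) and
of the fact that rot grad f = 0:

$$\mathbf{H} = \operatorname{rot}\mathbf{A} = \operatorname{rot}\frac{\dot{\mathbf{d}}\left(t - \frac{R}{c}\right)}{cR} = -\frac{1}{Rc^2}\,[\mathbf{n}\ddot{\mathbf{d}}]. \tag{19.23}$$

The electric field is

$$\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} - \frac{1}{c}\frac{\partial}{\partial t}\nabla f = -\frac{\ddot{\mathbf{d}}}{Rc^2} + \frac{1}{Rc^2}\,\mathbf{n}(\mathbf{n}\ddot{\mathbf{d}}) = \frac{1}{Rc^2}\,[\mathbf{n}[\mathbf{n}\ddot{\mathbf{d}}]] = [\mathbf{H}\mathbf{n}]. \tag{19.24}$$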


From these equations it can be seen that the electric field, the
magnetic field, and the vector n are mutually perpendicular. In
addition, $H = E$, since $[\mathbf{H}\mathbf{n}]^2 = H^2 - (\mathbf{H}\mathbf{n})^2$ and $\mathbf{H}\mathbf{n} = 0$, since H and
n are perpendicular. Consequently, the wave at a point R at a great
distance away from a radiating system is of the nature of a plane
electromagnetic wave. This result was to be expected because the
field is calculated far away from charges, where the wave front may
be approximately regarded as plane and the solution becomes the
same as obtained in Sec. 17.

Fig. 29 gives a general picture of the field. We situate the vector $\ddot{\mathbf{d}}$
at the centre of a sphere of large radius R so that $\ddot{\mathbf{d}}$ coincides with the
polar axis or, in other words, is directed towards the “north pole.”
Let us draw the radius vector of some point. Through this point, we
draw the meridian and the parallel. Then the electric field is tangential
to the meridian and is directed “towards the south,” while the
magnetic field is tangential to the parallel and is directed “towards the
east.” It can be seen from equations (19.23) and (19.24) that the
field becomes zero at the poles and maximum on the equator, i.e., on
a plane perpendicular to $\ddot{\mathbf{d}}$. The field distribution in space does not
possess spherical symmetry. We note that the transverse field cannot
be spherically symmetrical for purely geometrical reasons.

The zone, in which the field is calculated according to equations
(19.23) and (19.24), is called a wave zone.
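The geometry of the field in the wave zone can be checked with elementary vector algebra. The sketch below assumes the wave-zone expressions in the form H = [d̈ n]/(Rc²) and E = [H n], with R = c = 1 by default; the helper names and the sample vectors are illustrative assumptions.

```python
def cross(a, b):
    """Vector product [a b] of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product (a b)."""
    return sum(x * y for x, y in zip(a, b))

def wave_zone_fields(d_ddot, n, R=1.0, c=1.0):
    """H = [d_ddot n] / (R c^2) and E = [H n] (assumed wave-zone form)."""
    H = tuple(x / (R * c * c) for x in cross(d_ddot, n))
    E = cross(H, n)
    return H, E

# d_ddot along the polar axis ("north"), point of observation on the equator
H, E = wave_zone_fields((0.0, 0.0, 1.0), (1.0, 0.0, 0.0))
print(H)  # along +y: tangential to the parallel, "towards the east"
print(E)  # along -z: tangential to the meridian, "towards the south"
```

For an arbitrary direction n the same functions give E, H and n mutually perpendicular and E equal to H in magnitude.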

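The intensity of dipole radiation. Let us now find the energy lost
by the system in radiation. We must calculate the energy flux crossing
an infinitely distant surface. The energy flux density, or the Poynting
vector, is

$$\mathbf{U} = \frac{c}{4\pi}\,[\mathbf{E}\mathbf{H}] = \frac{c}{4\pi}\,[[\mathbf{H}\mathbf{n}]\mathbf{H}] = \frac{c}{4\pi}\,\mathbf{n}H^2. \tag{19.25}$$

Hence, the energy flux is directed along a radius, as it should be in a
wave zone. The total energy crossing a sphere of radius R in unit time is

$$\frac{d\mathcal{E}}{dt} = -\int \mathbf{U}\,d\mathbf{s} = -\frac{c}{4\pi}\int H^2\,\mathbf{n}\,d\mathbf{s} = -\frac{c}{4\pi}\int H^2\,ds, \tag{19.26}$$

because the vector ds is directed along n. Further, $ds = R^2 \cdot 2\pi\sin\vartheta\,d\vartheta$
where $\vartheta$ is the polar angle. From (19.23)

$$H^2 = \frac{\ddot{d}^{\,2}}{R^2c^4}\,\sin^2\vartheta. \tag{19.27}$$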

By substituting (19.27) in (19.26), cancelling $R^2$, and integrating, we
obtain an expression for the energy radiated in one second:

$$\frac{d\mathcal{E}}{dt} = -\frac{2}{3}\,\frac{\ddot{d}^{\,2}}{c^3}. \tag{19.28}$$

We note that all the terms in the expression for fields containing R
in the denominator to a higher degree than the first would not
contribute anything to (19.28) for a sufficiently large R. It is for this reason
that only first degree terms in R have been retained in the denominator.
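The numerical factor 2/3 in the radiated energy comes from the integral of sin³ϑ over the sphere. The following sketch evaluates that integral by the midpoint rule and gives the magnitude of the radiated power in Gaussian units; the function names and the default value of c are illustrative assumptions.

```python
import math

def angular_integral(n=10000):
    """Midpoint-rule value of the integral of sin^3(theta) from 0 to pi."""
    h = math.pi / n
    return sum(math.sin((j + 0.5) * h) ** 3 for j in range(n)) * h

def radiated_power(d_ddot, c=2.998e10):
    """Magnitude of the energy radiated per second: (2/3) * d_ddot^2 / c^3 (Gaussian units)."""
    return 2.0 * d_ddot ** 2 / (3.0 * c ** 3)

# (c / 4 pi) * (1 / c^4) * 2 pi * integral = integral / (2 c^3)
print(angular_integral())        # close to 4/3
print(0.5 * angular_integral())  # close to 2/3, the factor multiplying d_ddot^2 / c^3
```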

The significance of equation (19.28). Equation (19.28) expresses a
result of fundamental importance: energy is radiated whenever a
charge is accelerated. Indeed, $\ddot{\mathbf{d}} = \sum_i e_i\ddot{\mathbf{r}}_i$. Hence it is necessary,
if $\ddot{\mathbf{d}}$ is to differ from zero, that the charges should be in
accelerated motion irrespective of the sign of the accelerations.

But then electrons moving in an atom should radiate energy continuously
and should fall into the nucleus; every electron is in accelerated
motion, otherwise its motion could not be finite (see Sec. 5).

In actual fact atoms are obviously stable and the electrons do not 
fall into the nucleus. 

Here, we realize that classical (Newtonian) mechanics can in no
way be applied to the motion of an electron in an atom. In the third
part of this book we shall explain the stability of atoms using quantum
mechanics, where the very concept of motion differs qualitatively
from that in classical mechanics.

Magnetic dipole and quadrupole radiation. We have indicated that
charges must be in accelerated motion to radiate. But a simple example
can be given in which equation (19.28) yields zero even for accelerated
charges. Let the system consist of two identical charged particles.
According to Newton’s Third Law their accelerations are equal and
opposite in sign, so that
$\ddot{\mathbf{d}} = \sum_i e_i\ddot{\mathbf{r}}_i = e(\ddot{\mathbf{r}}_1 + \ddot{\mathbf{r}}_2) = 0$.
In this case this law is applicable because, in the dipole approximation,
the retardation of electromagnetic interactions inside the system is
considered negligibly small and, hence, the interaction forces between
charges are regarded as instantaneous. But there is then no need to take
account of the momentum transmitted to the field, and the total momentum
of the particles is conserved, thereby leading to the condition
$\ddot{\mathbf{r}}_1 = -\ddot{\mathbf{r}}_2$. For this case, approximation (19.18)
does not describe the radiation and it becomes necessary to use
higher-order approximations.

If, in the expansion in powers of —, a further term is retained in 

addition to the zero term, then a radiation is obtained which depends 
on the change in the quadrupole and magnetic dipole moments of a system
of charges. It is essential that this expansion be not in terms of inverse
powers of R, as in electrostatics and magnetostatics, but in powers
of the retardation inside the system.

We have already mentioned that the retardation inside the system
is small when $v \ll c$ and $r \ll \lambda$. The ratio $\frac{v}{c}$ is involved
in the magnetic moment of the system. Therefore, those terms in the
expansion (involving powers of the retardation) which are proportional
to $\frac{v}{c}$ account for magnetic dipole radiation. The quadrupole moment
of the system involves an additional power of r compared with the
dipole moment, and so quadrupole radiation is related to those terms
of the expansion which are proportional to $\frac{r}{\lambda}$. Higher
approximations are important in those cases for which lower-order
approximations, for some reason or other, become zero, as in the case
of the two identical charges.

The field due to a magnetic dipole radiator is similar to the field
of a radiating electric dipole. Unlike the field represented in Fig. 29,
the magnetic field, for magnetic dipole radiation, lies in the plane of
$\ddot{\boldsymbol{\mu}}$ and $\mathbf{n}$ (i.e., it is along a meridian), while the
electric field is along a parallel. The equation for intensity is similar
to (19.28), though it involves $(\ddot{\boldsymbol{\mu}})^2$ instead of
$(\ddot{\mathbf{d}})^2$. Since the magnetic moment is proportional to $\frac{v}{c}$,
the intensity of magnetic dipole radiation is less than the intensity of
electric dipole radiation in the ratio $\left(\frac{v}{c}\right)^2$.

The field of a radiating electric quadrupole has a more complicated
configuration. The expression for the intensity of such radiation
involves the square of the third derivative of the quadrupole moment
of the system. In order of magnitude, the intensity of quadrupole
radiation is less than the intensity of dipole radiation in the ratio
$\left(\frac{r}{\lambda}\right)^2$.


Exercises 


1) Calculate the time that it takes a charge, moving in a circular orbit 
around a centre of attraction, to fall into the centre as a result of the radiation 
of electromagnetic waves. Regard the path as always approximately circular. 

2) A particle with charge e and mass m passes, with velocity v, a fixed
particle of charge $e_1$, at a distance $\rho$. Ignoring the distortion in the orbit of
the oncoming particle, calculate the energy that this particle loses in radiation.


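Answer:

$$\Delta\mathscr{E} = \frac{2}{3}\,\frac{e^4e_1^2}{m^2c^3}\int_{-\infty}^{\infty}\frac{dt}{(\rho^2+v^2t^2)^2} = \frac{\pi}{3}\,\frac{e^4e_1^2}{m^2c^3\rho^3v}.$$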
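
The answer to exercise 2 can be verified by integrating the radiated power (19.28) along the undistorted straight path. The sketch below is only an illustration under stated assumptions: the function names, the midpoint rule, and the cutoff of the time integral at a large multiple of $\rho/v$ are choices made here, not part of the exercise.

```python
import math

def energy_lost_closed_form(e, e1, m, c, rho, v):
    """Closed-form answer: π e⁴ e1² / (3 m² c³ ρ³ v)."""
    return math.pi * e**4 * e1**2 / (3.0 * m**2 * c**3 * rho**3 * v)

def energy_lost_numeric(e, e1, m, c, rho, v, span=200.0, n_steps=100000):
    """Integrate (2/3) (e * acceleration)² / c³ over time.

    The acceleration is e*e1 / (m r²) with r² = ρ² + v² t².
    The integral is truncated at |t| <= span * ρ / v (assumed cutoff).
    """
    t_max = span * rho / v
    h = 2.0 * t_max / n_steps
    total = 0.0
    for k in range(n_steps):
        t = -t_max + (k + 0.5) * h                 # midpoint rule
        accel = e * e1 / (m * (rho**2 + (v * t)**2))
        total += (2.0 / 3.0) * (e * accel)**2 / c**3 * h
    return total

if __name__ == "__main__":
    # Dimensionless illustrative values, not physical constants.
    args = (1.0, 1.0, 1.0, 1.0, 1.0, 0.1)
    print(energy_lost_numeric(*args), energy_lost_closed_form(*args))
```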


3) Why is it that when two identical particles collide ($e_1 = e_2$, $m_1 = m_2$)
magnetic dipole radiation does not result if the interaction is calculated according
to the Coulomb law? The intensity of magnetic dipole radiation is
$\frac{2}{3}\,\frac{\ddot{\boldsymbol{\mu}}^2}{c^3}$.




4) A plane light wave falls on a free electron causing it to oscillate. The
electron begins to radiate secondary waves, i.e., it scatters the radiation.
Find the effective scattering cross section, defined as the ratio of the energy
scattered in unit time to the flux density of the incident radiation.

We proceed from the fact that $\ddot{\mathbf{r}} = \frac{e\mathbf{E}}{m}$ and then determine
$\frac{d\mathscr{E}}{dt}$ from (19.28). Dividing by the energy flux $\frac{cE^2}{4\pi}$, we obtain

$$\sigma = \frac{8\pi}{3}\,\frac{e^4}{m^2c^4}.$$

Sec. 20. The Theory of Relativity

The law of addition of velocities and electrodynamics. In Sec. 15,
the interaction of charges with a magnetic and an electric field was
reviewed [see (15.34)]. But the motion of the charges was considered
slow; in other words, the velocity satisfied the inequality $v \ll c$.

Yet this inequality is by no means always satisfied. Electrons ob¬ 
tained in beta decay, particles in cosmic rays, and particles in accel¬ 
erators move with velocities close to that of light. Hence, it is necessary 
to obtain the laws of mechanics for these ultra-highspeed charged 
particles. 

If we attempt to apply Newtonian mechanical laws to these particles 
we will encounter an insurmountable contradiction—the law of 
addition of velocities cannot be applied in electrodynamics in its 
usual form (see Secs. 8 and 10). 

The equations of Newtonian mechanics are of the same form for 
all inertial systems moving uniformly relative to each other. In such 
systems there are no inertial forces. Naturally, under no circumstances 
can the principle of the equivalence of inertial systems be violated, 
otherwise we should have to assume that there existed a reference 
system at absolute rest. We must also consider that the equations of 
electrodynamics appear the same for all inertial systems in free space— 
in the form (12.24)-(12.27). It follows from these equations that the 
velocity of propagation of electromagnetic disturbances is equal to c 
and is the same for all directions in space. If it turned out that in 
some inertial systems the velocity of light depended upon the direction 
of its propagation, then these systems would not be equivalent to a 
system in which the velocity of light is the same in all directions. In 
this system, the electrodynamical equations would admit of a solution 
in the form of a spherical wave—a solution similar to the one that was 
obtained in the preceding section. In all other inertial systems, the 
velocity of light would depend on the direction of the wave normal. 

An analogy would arise with the propagation of sound in air: the
velocity of sound in a system at rest relative to the air does not depend
on direction, but in a system moving relative to the air the velocity
of sound is less in the direction of motion and more in the opposite
direction as a consequence of the law of velocity addition.

Formerly, it was considered that light is transmitted in an elastic
medium, “the ether,” and it was regarded as self-evident that the
velocity of light must be governed by the same law of velocity addition
as the velocity of sound in air. Then a reference system fixed in the
“ether” would have to be regarded as being at absolute rest, and all
the remaining systems as being in absolute motion. In these systems the
velocity of light would depend on its direction of propagation, in
accordance with the law

c' = c + v, (20.1) 

where, for simplicity, only that direction is taken which coincides with 
the relative velocity of the system. 

Michelson’s experiment. A direct experiment was performed which
showed that the velocity of light cannot be combined with any other
velocity and that, in all reference systems, it is equal to a universal
constant c. This was the famous Michelson experiment (1887), which we
shall describe in brief. A ray of light falls on a half-silvered mirror SS
(Fig. 30), where it is split up: a part of the light is reflected and falls
on mirror A while the other part is transmitted and falls on mirror B.
Let SA be perpendicular to the motion of the earth and let SB be
parallel to the earth’s motion. The light reflected from the mirrors A
and B returns to the plate SS; the ray BS is reflected from it and
falls on screen C while the ray AS is transmitted to the screen directly.
Thus, both rays are entirely equivalent as regards transmissions
and reflections, though in the sections AS and BS the light is propagated
differently relative to the earth’s motion.

Fig. 30

Let us assume now that the velocity of light is combined with the
motion of the earth according to the usual law of addition of velocities.
Then, along the path SB, the velocity of light relative to the
earth is equal to $c - V$, and $c + V$ along the return path, where V
is the velocity of the earth. The time light takes to travel along the
entire path SBS in both directions is

$$\frac{l}{c+V} + \frac{l}{c-V} = \frac{2lc}{c^2-V^2} \approx \frac{2l}{c} + \frac{2lV^2}{c^3},$$

where $l = SB$. We have used the fact that $V \ll c$. Along the section
SA the velocity of light and the earth’s velocity are perpendicular to
each other (in a reference system fixed in the apparatus). If again the
law of velocity addition holds, then the velocity of light relative to
the apparatus, along the section SA, is equal to $\sqrt{c^2-V^2}$ (c is the
hypotenuse of the triangle, V and $\sqrt{c^2-V^2}$ are the sides). The time
taken by the light to travel along the whole path SAS, equal to $2l$, is

$$\frac{2l}{\sqrt{c^2-V^2}} \approx \frac{2l}{c} + \frac{lV^2}{c^3}.$$

Thus, the difference in the passage times along the paths SBS
and SAS is equal to $\frac{lV^2}{c^3}$. By means of repeated reflections, the path
can be made sufficiently long (several tens of metres). Choosing it
properly, we can arrange that the proposed time difference for the paths
SAS and SBS is equal to a half-period of oscillation. The rays on the
screen C should then mutually cancel. In order to be certain that the
cancellation has occurred as a result of the combination of the velocity
of light with the velocity of the earth, and not by accident, it is
sufficient to rotate the apparatus through 45° so that the direction of the
earth’s velocity would be along the bisector of the angle ASB. Then
the difference in time taken by the rays to travel along the paths
SAS and SBS should at least become equal to zero; hence, if, in the
previous position, the rays mutually cancelled as a result of the
combination of the velocity of light with the earth’s velocity then, in the
new position, the rays would mutually reinforce each other. In other
words, the interference bands on the screen would be displaced by
half a wavelength.
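
To get a feeling for the size of the expected effect, the two travel times can be compared numerically. This is a minimal sketch based on the ordinary law of velocity addition assumed in the argument above; the function names and the sample values (arm length, wavelength, rounded values of c and of the earth's velocity in CGS units) are illustrative assumptions, not data from the text.

```python
import math

def time_parallel(l, c, V):
    """Travel time along SBS: l/(c+V) + l/(c-V)."""
    return l / (c + V) + l / (c - V)

def time_perpendicular(l, c, V):
    """Travel time along SAS: 2l / sqrt(c² - V²)."""
    return 2.0 * l / math.sqrt(c**2 - V**2)

def expected_delay(l, c, V):
    """Leading-order difference of the two times: l V² / c³."""
    return l * V**2 / c**3

def delay_in_periods(l, c, V, wavelength):
    """Expected delay expressed as a fraction of the oscillation period."""
    return expected_delay(l, c, V) * c / wavelength

if __name__ == "__main__":
    c = 3.0e10           # cm/s, rounded (assumed)
    V = 3.0e6            # cm/s, rounded orbital velocity of the earth (assumed)
    l = 3.0e3            # cm, "several tens of metres" (assumed)
    wavelength = 5.0e-5  # cm, visible light (assumed)
    print(time_parallel(l, c, V) - time_perpendicular(l, c, V))
    print(expected_delay(l, c, V))
    print(delay_in_periods(l, c, V, wavelength))
```

With values of this order the expected delay is a sizeable fraction of a period, which is why the absence of any shift on rotating the apparatus is significant.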

Actually, no change in the path difference between the rays occurs
when the apparatus is rotated, i.e., the expected effect is completely
absent. An addition of the velocity of light and the earth’s velocity
does not occur.

The negative result of Michelson’s experiment is completely under¬ 
standable if we reject the postulate of an “ether.” Nevertheless, 
at the time when the experiment was performed, no one as yet under¬ 
stood that electrodynamics does not require an “ether” in order to 
become as complete and clear a science as mechanics. The fact that 
the law of addition of velocities—a truism for physicists in the past— 
failed to hold, appeared as an inexplicable paradox. 

In addition, Michelson’s experiment apparently contradicted the
phenomenon of astronomical aberration of light and Fizeau’s experiment.
(These will be considered later in this section in the light of the
theory of relativity.)

Einstein’s relativity principle. We shall not give the history of the
painful attempts to explain this paradox but, instead, we will straightway
present the correct solution to the problem as given by A. Einstein
in 1905. It is known as the special theory of relativity. Despite the
delusion widespread among laymen, this term by no means expresses
the relativity of our physical knowledge. It expresses the mutual
equivalence of all inertial systems moving relative to one another.




The equations of electrodynamics do not imply the presence of any
elastic medium (“ether”) for the transmission of electromagnetic
disturbances. We have already discussed this in Sec. 12. The reality
is the electromagnetic field itself. For this reason, the equations of
electrodynamics are just as independent of the choice of inertial reference
system as the equations of mechanics. Both sets of equations describe 
motion, i. e., the change of state with time directly. Mechanics de¬ 
scribes the change of mass configuration, while electrodynamics de¬ 
scribes changes of the electromagnetic field. The forms of the equations 
of motion cannot change as a result of the choice of inertial system. 

This is why the result of Michelson’s experiment does not contradict 
the notion of relativity of motion, but confirms it. Michelson’s experi¬ 
ment shows that the velocity of light in free space is the same in all 
inertial systems. The velocity of propagation of interactions is a funda¬ 
mental constant in the equations of electrodynamics. These equations 
are invariant to a transformation from one inertial system to another 
only when the velocity of propagation of the interactions in both sys¬ 
tems is the same. And so the result of Michelson’s experiment contra¬ 
dicts only the law of addition of velocities, i. e., Galilean transforma¬ 
tions (8.1) and (8.2). This law of addition of velocities is confirmed 
experimentally only for relative velocities and for velocities of motion 
that are small compared with the velocity of light c. Obviously, it 
must be replaced by a more precise law for the region of high velocities. 
But this more precise law must also hold in mechanics for large par¬ 
ticle velocities. This can be seen from the following reasoning. 

Let the charges in a specific inertial system interact in some way
with the electromagnetic field producing certain events (for example, 
collisions between the charges). They may be precalculated on the 
basis of the equations of mechanics and electrodynamics. In trans¬ 
forming to another inertial system, the equations of mechanics and 
electrodynamics must retain their form, otherwise, other consequences 
will follow from the transformed equations taken together; in partic¬ 
ular, those events which were precalculated and occur in the first 
inertial system do not necessarily take place in another system. 
But events such as collisions, for example, are objective facts; they 
should be observed in all coordinate systems. Yet if we apply Galilean 
transformations (8.1) then the equations of Newtonian mechanics will 
not change, while the equations of electrodynamics will change, since 
the law of addition of velocities (10.12) is not applicable in electro¬ 
dynamics. Therefore, we must find transformations to replace the 
Galilean transformations such as would leave both the equations of 
mechanics and the equations of electrodynamics invariant. But then 
it would become necessary to make the laws of Newtonian mechanics 
more precise, since they are correct only for low particle velocities. 

A physical theory which cannot predict facts independently of the
mode of their description is imperfect and contradictory. It is this
that makes us reconsider the basic facts of mechanics, no matter how
self-evident they seem to be in our everyday experience, which has
to do with the motions of bodies at velocities that are small compared
with that of light.

The Lorentz transformations. We look for transformations of a more
general form than the Galilean transformations for passing from one
inertial system to another. Like the Galilean transformations, they
must satisfy certain requirements of a general nature. These requirements
may be expressed as follows.

1) The transformation equations are symmetrical with respect to
both systems. We shall denote the quantities that refer to one system
by letters without primes (x, y, z, t), while those that refer to the other
system will be primed (x', y', z', t'). We denote the velocity of the primed
system with respect to the unprimed system by V. Then the mathematical
form of the equations expressing unprimed quantities in
terms of primed quantities (and the velocity V) is the same as that
of the equations for the reverse transformation, if we change the sign
of the velocity in them. This requirement is necessary for the equivalence
of both systems.

2) The transformation must convert the finite points of one system
to the finite points of the other, i.e., if (x, y, z, t) are finite, then a
transformation with finite coefficients must leave (x', y', z', t') finite.

Condition (1) greatly restricts the possible form of the transformations.
For example, it can be seen that the transformation functions
cannot be quadratic, because the inversion of a quadratic function
leads to irrationality, as does that of a function of any degree other
than the first. A linear-fractional transformation (i.e., the quotient of
two linear expressions) may, under certain limitations imposed on the
coefficients, be inverted retaining the same form. For example,
for one variable the direct and the inverse linear-fractional functions
look like

$$x' = \frac{ax+b}{ex+f}, \qquad x = \frac{b - fx'}{ex' - a}.$$

But this function does not satisfy condition (2): if $x' = \frac{a}{e}$, x becomes
infinite. Therefore, a linear function is the only possible one.

3) When the relative velocity of two systems tends to zero, the
transformation equations yield an identity (x' = x, y' = y, z' = z, t' = t).

4) A law of the addition of velocities is obtained from the transfor¬ 
mation equations such that it leaves the velocity of light in free space 
invariant: c'=c. 

Summarizing, we can say that the transformation equations: 1) 
maintain their form when inverted, 2) are linear, 3) become identities 
for small relative velocities, 4) leave the velocity of light in free space 
unchanged. 




These four conditions are sufficient. The required equations can be
obtained most simply if one of the coordinate axes (for example, the
x-axis) is taken in the direction of the relative velocity. Then the other
axes will not be affected by the transformation.

We return to Fig. 11 (page 68), but we will not make the arbitrary
assumption that t = t' (experiment supports this only for small relative
velocities of both systems). Let us see what results from the conditions
(1)-(4). If the velocity is along the x-axis, then, as has just been found,
y' = y, z' = z. This can be seen simply from Fig. 11. In the most general
form, linear transformations of x and t appear thus:

$$x' = \alpha x + \beta t, \tag{20.2}$$

$$t' = \gamma x + \delta t. \tag{20.3}$$

The constant terms need not be written in these equations; they can 
be included in the definition of x or x' through choice of the coordinate 
origin. 

Let us apply equation (20.2) to the origin of the primed system,
x' = 0. This point moves with velocity V relative to the unprimed system.
Hence, x = Vt. Substituting x' = 0, x = Vt in (20.2), we obtain,
after eliminating t,

$$\alpha V + \beta = 0. \tag{20.4}$$


We shall solve equations (20.2) and (20.3) with respect to x and t.
Elementary algebraic computations give

$$x = \frac{\delta x' - \beta t'}{\alpha\delta - \beta\gamma}, \tag{20.5}$$

$$t = \frac{\gamma x' - \alpha t'}{\beta\gamma - \alpha\delta}. \tag{20.6}$$


Let us now apply condition (1). For this we note that the coefficients
$\beta$ and $\gamma$, which interrelate the coordinate and time, must change
sign together with the velocity V. Otherwise, if the x and x' axes
are turned in the opposite direction the equations will not preserve
their form, and this is impermissible. Thus, the equations for the
inverse transformation, which expresses the unprimed quantities in
terms of the primed ones, have the same form as (20.2) and (20.3):

$$x = \alpha x' - \beta t', \tag{20.7}$$

$$t = -\gamma x' + \delta t'. \tag{20.8}$$

Comparing (20.7) and (20.5), we obtain

$$\alpha = \frac{\delta}{\alpha\delta - \beta\gamma}, \tag{20.9}$$

$$\beta = \frac{\beta}{\alpha\delta - \beta\gamma}. \tag{20.10}$$



From (20.10) it follows that

$$\alpha\delta - \beta\gamma = 1. \tag{20.11}$$

Then, from (20.9), we obtain

$$\alpha = \delta. \tag{20.12}$$

No other relationships are obtained from the comparison of the 
direct equations with the inverse equations. 

We now use condition (4). We divide equation (20.2) by (20.3):

$$\frac{x'}{t'} = \frac{\alpha\frac{x}{t} + \beta}{\gamma\frac{x}{t} + \delta}. \tag{20.13}$$

Let x be a point occupied by a light signal emitted from the origin
of the unprimed system at an initial instant of time t = 0. Obviously,
$\frac{x}{t} = c$. But in accordance with condition (4), $\frac{x'}{t'} = c$. Hence,

$$c = \frac{\alpha c + \beta}{\gamma c + \delta}. \tag{20.14}$$


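We substitute the relations (20.4) and (20.12) into (20.14) in order
to eliminate $\beta$ and $\delta$. There remains a relation between $\alpha$ and $\gamma$:

$$\gamma c^2 + \alpha c = \alpha c - \alpha V,$$

whence

$$\gamma = -\alpha\frac{V}{c^2}. \tag{20.15}$$

We now substitute (20.15), (20.4) and (20.12) into (20.11) and obtain
an equation for $\alpha$:

$$\alpha^2\left(1 - \frac{V^2}{c^2}\right) = 1. \tag{20.16}$$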


In extracting the square root, we must take the positive sign
in accordance with condition (3), because then (20.3) becomes t' = t
for a small relative velocity. A minus sign would yield t' = -t,
which is meaningless.
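
The coefficients determined by (20.4), (20.12), (20.15) and (20.16) can be checked against the conditions they were derived from. The following Python sketch is an illustration only; the function names are chosen here for convenience, and units with c = 1 are used in the example call.

```python
import math

def lorentz_coefficients(V, c):
    """Return (alpha, beta, gamma, delta) for relative velocity V."""
    alpha = 1.0 / math.sqrt(1.0 - V**2 / c**2)   # (20.16), positive root
    beta = -alpha * V                             # (20.4)
    gamma = -alpha * V / c**2                     # (20.15)
    delta = alpha                                 # (20.12)
    return alpha, beta, gamma, delta

def transform(x, t, V, c):
    """Apply the linear transformation (20.2), (20.3)."""
    alpha, beta, gamma, delta = lorentz_coefficients(V, c)
    return alpha * x + beta * t, gamma * x + delta * t

if __name__ == "__main__":
    c, V = 1.0, 0.6
    a, b, g, d = lorentz_coefficients(V, c)
    print(a * d - b * g)                 # condition (20.11)
    print((a * c + b) / (g * c + d))     # light signal, condition (4)
    x1, t1 = transform(2.0, 3.0, V, c)
    print(transform(x1, t1, -V, c))      # inversion by V -> -V, condition (1)
```

Changing the sign of V reproduces the original pair (x, t), which is the symmetry demanded by condition (1).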

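Now expressing all the coefficients $\alpha$, $\beta$, $\gamma$, and $\delta$ in accordance
with equations (20.16), (20.4), (20.15), and (20.12), respectively, and
substituting them into (20.2) and (20.3), we arrive at the required
transformations:

$$x' = \frac{x - Vt}{\sqrt{1 - \frac{V^2}{c^2}}}, \tag{20.17}$$

$$t' = \frac{t - \frac{Vx}{c^2}}{\sqrt{1 - \frac{V^2}{c^2}}}. \tag{20.18}$$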

The inverse transformations are obtained, in accordance with condition (1),
by interchanging the primed and unprimed quantities and changing the sign of V:

$$x = \frac{x' + Vt'}{\sqrt{1 - \frac{V^2}{c^2}}}, \tag{20.19}$$

$$t = \frac{t' + \frac{Vx'}{c^2}}{\sqrt{1 - \frac{V^2}{c^2}}}. \tag{20.20}$$




In order to explain the meaning of these equations we shall apply
them to some special cases. Let a clock be situated at the origin
x' = 0 of the primed system. It indicates a time t'. Then, from equation
(20.20), it follows that

$$t = \frac{t'}{\sqrt{1 - \frac{V^2}{c^2}}}. \tag{20.21}$$


The clock which is at rest relative to its reference system we call
the observer’s clock. It can be seen from (20.21) that one observer,
comparing his clock with that of another observer, will always observe
that the latter clock is slow, i.e., that t' < t. If a clock is situated
at the origin of the unprimed system (i.e., at the point x = 0), the
transformation equation to the primed system is of the same form,
since, from (20.18), we now obtain $t' = \frac{t}{\sqrt{1 - V^2/c^2}}$. This not only
does not contradict (20.21), but expresses that very fact: a clock
moving relative to an observer is slow compared with his own clock.
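
The reciprocity of the effect can be illustrated numerically. In the sketch below (function names are illustrative, and the inverse transformation is taken to be the direct one with the sign of V reversed, as condition (1) requires), a clock at rest at the origin of either system is read from the other system, and the same slowing factor results in both cases.

```python
import math

def lorentz(x, t, V, c):
    """Direct transformation (20.17), (20.18)."""
    root = math.sqrt(1.0 - V**2 / c**2)
    return (x - V * t) / root, (t - V * x / c**2) / root

def time_in_unprimed_system(t_prime, V, c):
    """Clock at x' = 0 showing t'; returns the time t in the unprimed system."""
    _, t = lorentz(0.0, t_prime, -V, c)   # inverse transformation: V -> -V
    return t

def time_in_primed_system(t, V, c):
    """Clock at x = 0 showing t; returns the time t' in the primed system."""
    _, t_prime = lorentz(0.0, t, V, c)
    return t_prime

if __name__ == "__main__":
    c, V = 1.0, 0.6
    print(time_in_unprimed_system(1.0, V, c))
    print(time_in_primed_system(1.0, V, c))
```

Each observer finds that one unit of time on the other's clock corresponds to a longer interval on his own, which is the symmetry stated in the text.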

In the theory of relativity, a single universal time does not exist 
as in Newtonian mechanics. It is better to say that the absolute 
time of Newtonian mechanics is, in actual fact, an approximation, 
correct only for small relative velocities between clocks. The absolute¬ 
ness of Newtonian time has sometimes given cause to regard it as 
an a priori, logical category independent of moving matter. 

At any rate, Newton, by accepting instantaneous action at a
distance, naturally had to consider time as universal; if we formally
put $c = \infty$ in (20.18), we obtain t' = t. The instantaneous transmission
of signals would allow us to synchronize clocks in all inertial systems
independently of their relative velocities. In Newtonian mechanics,
gravitational forces played the part of such instantaneous signals.

It is sometimes thought that, knowing the velocity of light c,
we can introduce a correction into the readings of clocks in different
inertial systems such that the rate of time will everywhere be the
same. But it is precisely equation (20.21) that describes the relative
passage of time in both reference systems after a correction has been
introduced for the finite time of propagation of light. Time reduction,
as has already been shown, is completely reciprocal. Consequently,
it can in no way be accounted for by any change, resulting from
motion, in the properties of clocks. The time reduction effect is purely
kinematical.

It must also be added that in speaking about clocks we by no means 
necessarily have in mind clocks which have been made by human 
hands; any natural periodic process that gives a natural time scale 
will do as well, for example, the oscillations in a light wave. It is 
clear that the physical properties of a radiating atom cannot in the 
least depend upon the inertial system in which the atom is described. 
This is what gives us the right to assert that equation (20.21) refers 
to similar clocks. 

At the same time we must remember that it is impossible to define 
time without relation to some periodic process, i.e., irrespective of 
motion. 

Relativity and objectivity. The relativity of time by no means 
indicates a rejection of the objectivity of its measurement in any 
given system of reference. It is entirely of no consequence what 
observer is observing the clock. The relative character of time in 
inertial systems is the only thing that counts. We have all long since 
become accustomed to the relativity of the datum line of time measure¬ 
ment related to time zones (i.e., to the sphericity of the earth). The 
theory of relativity teaches us that the time scale is also relative. 

The fact that an objective concept may be relative can be seen
from the following example. In the Middle Ages it was thought
that even direction in space was absolute, and it was then thought
impossible to imagine that the earth was spherical since it would
then follow that our antipodes would have to walk upside down!
The concept of “up” or “down” was related not with the direction
of a plumb-line at a given point on the globe, but with certain other
categories characteristic of the ideology of the Middle Ages. The
vertical directions in Moscow and Vladivostok form a substantial
angle between each other, but nobody nowadays would think of
arguing about which of them is the more “vertical.” The concept
of verticality is completely objective at every point on the globe,
but is relative for different points. In the same way, time is objective
in each inertial system, but is relative between them.

Contraction of the length scale. We shall now consider the question
of the measurement of length. In order to find out the length of
a moving body (“its scale”), we must simultaneously plot the
coordinates of its ends in a fixed system. Obviously, a fixed observer
has no fundamentally different means of measurement, for, otherwise,
he would have to stop the motion of the scale (i.e., transfer it to
his reference system). If the ends of the scale are fixed by a stationary
observer at one time* we must put $\Delta t = 0$. From (20.17), there follows
a relation between the length $\Delta x'$ of the scale in the system in which
it is at rest and its length $\Delta x$ measured by a fixed observer:

$$\Delta x' = \frac{\Delta x}{\sqrt{1 - \frac{V^2}{c^2}}}. \tag{20.22}$$


Like (20.21), this equation has a symmetrical inversion. If a
“moving” observer measures a “stationary” scale, we must put
$\Delta t' = 0$; it then turns out that $\Delta x = \frac{\Delta x'}{\sqrt{1 - V^2/c^2}}$. We conclude from
(20.22) that a moving scale is shortened relative to a stationary
observer. The contraction occurs in the direction of motion.
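
The role of simultaneity in this measurement can be made explicit with a short computation. The sketch below is an illustration under the assumption that the scale is at rest in the primed system with its ends at x' = 0 and x' = rest_length; the function names are hypothetical.

```python
import math

def lorentz(x, t, V, c):
    """Direct transformation (20.17), (20.18)."""
    root = math.sqrt(1.0 - V**2 / c**2)
    return (x - V * t) / root, (t - V * x / c**2) / root

def end_position(x_prime, t, V, c):
    """Position x, at unprimed time t, of a point fixed at x' in the primed system.

    Obtained by solving (20.17) for x.
    """
    return x_prime * math.sqrt(1.0 - V**2 / c**2) + V * t

def measured_length(rest_length, V, c, t=0.0):
    """Length found by the fixed observer, who marks both ends at the same t."""
    return end_position(rest_length, t, V, c) - end_position(0.0, t, V, c)

if __name__ == "__main__":
    c, V = 1.0, 0.6
    print(measured_length(1.0, V, c))
    # The two marking events are simultaneous in the unprimed system only:
    print(lorentz(end_position(0.0, 0.0, V, c), 0.0, V, c))
    print(lorentz(end_position(1.0, 0.0, V, c), 0.0, V, c))
```

The two marking events have the same t but different t', so the primed observer does not regard them as simultaneous; this is how both observers can each find the other's scale shortened.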

Lorentz supposed that this scale compression does not appear to
both inertial systems, but, for some unknown reason, occurs when
the scale moves relative to the “ether.” Lorentz and others attempted
in this way to explain the negative result of Michelson’s experiment.
Yet the very symmetry of the direct and inverse Lorentz transformations
(20.17)-(20.20) (they were known before the advent of
the theory of relativity), from which the contraction of length follows
only as a special case, shows convincingly that there is no system
at absolute rest relative to an “ether.” It may be noted that by the
beginning of the twentieth century, the “ether,” which had been
introduced by Huygens as a medium that transmitted light oscillations,
remained in physics simply as a rudimentary concept. The discovery
and confirmation of the electromagnetic nature of light made the
hypothetical elastic medium quite superfluous (see Sec. 12). Only
the theory of relativity disclosed the real meaning of the Lorentz
transformations. But then, many concepts regarded as absolute in
Newtonian mechanics turned out to be related to the motion of
inertial systems.

The formula for addition of velocities. We shall now find an equation
for the addition of velocities arising from the Lorentz transformations.
Differentiating (20.17) and (20.18) and dividing one by the other,
we obtain

$$v'_x = \frac{dx'}{dt'} = \frac{\frac{dx}{dt} - V}{1 - \frac{V}{c^2}\frac{dx}{dt}} = \frac{v_x - V}{1 - \frac{Vv_x}{c^2}}. \tag{20.23}$$

* The idea of the simultaneity of two operations performed in the same
coordinate system may be uniquely defined with the aid of light signals. Indeed,
observers at rest relative to each other at a given distance can always check
their time with the aid of light signals by introducing a constant correction
for the known time of propagation.




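Noting that dy' = dy and dz' = dz, we have a transformation of
the velocity components perpendicular to V:

$$v'_y = \frac{dy'}{dt'} = \frac{\frac{dy}{dt}\sqrt{1 - \frac{V^2}{c^2}}}{1 - \frac{V}{c^2}\frac{dx}{dt}} = \frac{v_y\sqrt{1 - \frac{V^2}{c^2}}}{1 - \frac{Vv_x}{c^2}}, \qquad v'_z = \frac{v_z\sqrt{1 - \frac{V^2}{c^2}}}{1 - \frac{Vv_x}{c^2}}. \tag{20.24}$$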


For small velocities, (20.23) and (20.24) become the ordinary
equations for addition of velocities. This can be seen if we let c tend
to infinity, i.e., by putting $\frac{1}{c} = 0$.

It is easy to see that if $v = \sqrt{v_x^2 + v_y^2 + v_z^2} = c$, then likewise
v' = c, i.e., the absolute value of the velocity of light does not change
in passing from one inertial system to another. But the separate
components of the velocity of light, which are less than c, may of
course change; the direction of a light ray relative to different observers
differs, since there is no absolute direction in space.
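
Both statements, the invariance of the absolute value of the velocity of light and the transition to the ordinary law for small velocities, can be checked with a few lines of code. This is a minimal sketch of (20.23) and (20.24); the function name and the sample values are illustrative assumptions.

```python
import math

def add_velocities(vx, vy, vz, V, c):
    """Velocity components in the primed system, (20.23) and (20.24)."""
    denom = 1.0 - V * vx / c**2
    root = math.sqrt(1.0 - V**2 / c**2)
    return (vx - V) / denom, vy * root / denom, vz * root / denom

def speed(v):
    """Absolute value of a velocity given by its three components."""
    return math.sqrt(sum(component**2 for component in v))

if __name__ == "__main__":
    c = 1.0
    # A light ray inclined to the x-axis: the components change, the speed does not.
    ray = add_velocities(0.6 * c, 0.8 * c, 0.0, 0.5 * c, c)
    print(ray, speed(ray))
    # Velocities small compared with c: close to simple subtraction of V.
    print(add_velocities(1e-6, 2e-6, 0.0, 3e-6, c))
```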

Aberration of light. In this connection let us consider
the phenomenon of the aberration of light. Astronomical
aberration, or the deflection of light, consists in the
fact that stars describe ellipses in the sky in the course
of a year. Their origin is easy to explain: the velocity
of the earth, in annual motion, combines differently
with the velocity of the light emitted by the star
(Fig. 31). If the velocity vector of the starlight relative
to the sun is ES then the resultant direction of the
velocity, for one position of the earth, is $ET_1$ and, in
half a year’s time, $ET_2$. These directions are projected
on different points of the celestial sphere so that in
the course of a year a star describes a closed ellipse. In angular
units, the semi-major axis of the ellipse is always equal to $\frac{V}{c}$,
where V is the velocity of the earth: $\frac{V}{c} = 20''.25$.

Fig. 31

We may ask the question: Why does the velocity of light
in Michelson’s experiment not combine with the earth’s velocity but
remain equal to c, while the phenomenon of aberration shows that
velocities combine? (We note that Michelson’s experiment was also
performed with an extra-terrestrial source of light.) The explanation
is that in Michelson’s experiment it was the absolute value of the
velocity of light c that was measured (from the path difference of
the rays), while in the aberration of light there is a change in the
direction of the velocity of light as a result of the combination of
its components with the velocity of the earth. Considering that the
velocity of light relative to the sun is perpendicular to the plane
of the earth’s orbit, we must put $v_x = 0$, $v_y = c$, $v_z = 0$ into (20.23)
and (20.24). Then the components of the velocity of light relative
to the earth are

$$v'_x = -V, \qquad v'_y = c\sqrt{1 - \frac{V^2}{c^2}}.$$
And, in accordance with Michelson’s experiment, v'x^-\-Vy'^—c^. The 
direction of the projection of the velocity of light onto the plane 
of the earth’s orbit (ecliptic) is reversed in the course of half a year, 
which is the reason why aberration occurs. 
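
The statement that the magnitude of the velocity stays equal to $c$ while its direction tilts can be checked numerically. The sketch below assumes that (20.23) and (20.24) have the usual form of the relativistic velocity-addition law for a frame moving with speed V along x; the function name and the rounded orbital speed of the earth are illustrative assumptions, not values taken from the text.

```python
import math

C = 299_792_458.0      # speed of light in m/s
V_EARTH = 29.8e3       # orbital speed of the earth in m/s (rounded, assumed)


def transform_velocity(vx, vy, V, c=C):
    """Velocity components in a frame moving with speed V along x.

    Assumed form of the velocity-addition law referred to in the
    text as (20.23) and (20.24).
    """
    denom = 1.0 - vx * V / c**2
    vx_prime = (vx - V) / denom
    vy_prime = vy * math.sqrt(1.0 - (V / c) ** 2) / denom
    return vx_prime, vy_prime


# Starlight perpendicular to the plane of the earth's orbit: vx = 0, vy = c.
vx_p, vy_p = transform_velocity(0.0, C, V_EARTH)

speed = math.hypot(vx_p, vy_p)            # magnitude of the transformed velocity
tilt = math.atan2(abs(vx_p), vy_p)        # tilt of the ray, in radians
tilt_arcsec = math.degrees(tilt) * 3600.0

print(f"|v'| - c = {speed - C:.3e} m/s")
print(f"tilt of the ray = {tilt_arcsec:.2f} seconds of arc")
```

With these rounded inputs the tilt comes out at roughly twenty seconds of arc, the same order as the semimajor axis quoted above, while the magnitude of the velocity remains $c$ to within rounding error.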

Similar equations are obtained in the more complicated case when
the rays from the star are not perpendicular to the plane of the
ecliptic. They coincide with the equations that follow from a simple
addition of velocities if terms of the order $V^2/c^2$ are neglected.

Before the theory of relativity was put forward, it was wrongly
supposed that the aberration of light contradicted Michelson's
experiment.

Fizeau's experiment. Fizeau's experiment, which determined the
velocity of light in a moving medium, was also believed to contradict
Michelson's experiment. Fizeau's method was this. A beam of light
was divided into two parts using a half-silvered mirror (Fig. 32).
These beams were passed through tubes with flowing water, one
beam in the direction of flow and the other in the opposite direction.
For comparison, the same beams were passed through tubes in which
the water was at rest. By subsequent reflections the beams were once
again combined, and they cancelled each other when the path difference
between them was equal to an odd number of half wavelengths
(i.e., when they were in opposite phase). Coherence between them was
obtained due to the fact that they both came from the same source. In
still water, the path difference was chosen so that the rays were
reinforced, i.e., the path difference was equal to an even
number of half wavelengths. In flowing water the path difference
changed. Since the frequency of the light and the tube lengths remained
unchanged, the change in path difference indicated a change in the velocity
of light relative to the tubes.

Fig. 32

First of all, we note that the result of Fizeau’s experiment in no 
way contradicts the general ideas about the relativity of motion. 
A reference system fixed in flowing water is not equivalent to a 
system fixed in the tube, if we are studying the propagation of light 
in water. 




Since the velocity of light in water is equal to $c/\nu$, where $\nu$ is the
refractive index of the water, the general equation for the addition
of velocities (20.23) shows that $c/\nu$ does not remain a constant quantity
when passing to another coordinate system. At the same time, we
cannot use the simple velocity-addition equation, because the
denominator of equation (20.23) differs from unity by $V/(c\nu)$ ($V$ is the
velocity of the water). Considering that $V \ll c$ and expanding the
denominator in a series up to the linear term inclusive, we find the change
in the velocity of light in moving water (see exercise 1):

$$ \Delta\frac{c}{\nu} = \pm V\left(1 - \frac{1}{\nu^2}\right). $$

It was precisely this value, which differs from that given by the
simple velocity-addition law, that was obtained by Fizeau. Since
Michelson measured $c$ and Fizeau measured $c/\nu$, there is no
contradiction between them.
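
A short numerical comparison may make the size of the effect concrete. The sketch below contrasts the exact relativistic sum of $c/\nu$ and $V$, the linear approximation with the factor $1 - 1/\nu^2$, and the simple sum. The refractive index is called `n` in the code, and the values 1.33 and 7 m/s are illustrative assumptions rather than figures from the text.

```python
C = 299_792_458.0   # speed of light in m/s


def light_speed_in_moving_water(n, V, c=C):
    """Speed of light along the flow, relative to the tube.

    n is the refractive index, V the speed of the water.
    Returns (exact, linear, naive):
      exact  - relativistic addition of c/n and V,
      linear - c/n + V (1 - 1/n^2), the first-order expansion,
      naive  - simple addition c/n + V.
    """
    u_rest = c / n
    exact = (u_rest + V) / (1.0 + u_rest * V / c**2)
    linear = u_rest + V * (1.0 - 1.0 / n**2)
    naive = u_rest + V
    return exact, linear, naive


N_WATER = 1.33   # assumed refractive index of water
V_WATER = 7.0    # assumed flow speed in m/s

exact, linear, naive = light_speed_in_moving_water(N_WATER, V_WATER)
print(f"exact  - c/n = {exact - C / N_WATER:.6f} m/s")
print(f"linear - c/n = {linear - C / N_WATER:.6f} m/s")
print(f"naive  - c/n = {naive - C / N_WATER:.6f} m/s")
```

The exact and linear results agree far more closely than either agrees with the simple sum, which overestimates the change by about $V/\nu^2$.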

Interval. Despite the fact that $x$ and $t$ are changed separately by
the Lorentz transformations, we can construct a quantity which
remains invariant (unchanged). It is easy to verify that this property
is possessed by the difference $c^2t^2 - x^2$. Indeed,

$$ c^2t^2 - x^2 = \frac{c^2t'^2 + \frac{V^2}{c^2}x'^2 + 2Vx't' - x'^2 - V^2t'^2 - 2Vx't'}{1 - \frac{V^2}{c^2}} $$

or

$$ c^2t^2 - x^2 = c^2t'^2 - x'^2 \equiv s^2. \tag{20.25} $$

The quantity $s$ is called the interval between two events: one
which occurred at the coordinate origin $x = 0$ and initial time $t = 0$,
and another which occurred at the point $x$ and time $t$.

The word “event” may be understood in its ordinary
everyday sense provided that its coordinates and time can be defined.
If the first event is not related to the origin of the coordinate system
and the initial instant, then

$$ s^2 = c^2(t_2 - t_1)^2 - (x_2 - x_1)^2 = c^2(t'_2 - t'_1)^2 - (x'_2 - x'_1)^2. \tag{20.26} $$




Considerable importance is attached to the interval between two
infinitely close events:

$$ ds^2 = c^2dt^2 - dx^2. \tag{20.27} $$

It is not at all necessary to consider that both events occurred on
the abscissa. Since $dy' = dy$ and $dz' = dz$, the interval is always
invariant:

$$ ds^2 = c^2dt^2 - dx^2 - dy^2 - dz^2 = c^2dt^2 - dl^2 = c^2dt'^2 - dl'^2. \tag{20.28} $$

The interval, written in the form (20.28), is not related to any definite
direction of velocity.
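
The invariance expressed by (20.25) is easy to confirm numerically. The following sketch assumes the standard form of the Lorentz transformation along x and works in units with c = 1; the function names and the sample event are illustrative choices.

```python
import math


def lorentz(t, x, V, c=1.0):
    """Coordinates (t', x') of an event in a frame moving with speed V along x."""
    gamma = 1.0 / math.sqrt(1.0 - (V / c) ** 2)
    t_prime = gamma * (t - V * x / c**2)
    x_prime = gamma * (x - V * t)
    return t_prime, x_prime


def interval_squared(t, x, c=1.0):
    """Square of the interval between the event (t, x) and the origin."""
    return (c * t) ** 2 - x**2


event = (3.0, 2.0)   # t = 3, x = 2 in units with c = 1 (arbitrary sample)

for V in (0.1, 0.5, 0.9, -0.99):
    t_p, x_p = lorentz(*event, V)
    print(f"V = {V:+.2f}: t' = {t_p:.4f}, x' = {x_p:.4f}, "
          f"s^2 = {interval_squared(t_p, x_p):.10f}")
```

The coordinates $t'$ and $x'$ change from frame to frame, while the printed value of $s^2$ stays the same up to rounding error.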

Space and time intervals. The interval provides a very vivid
way of studying the various possible space-time
relationships between two events. Let the spatial
distance between the points at which the events
occurred be taken along the abscissa, and the
interval of time between them, along the ordinate
axis (Fig. 33). To begin with, let $ct > l$, so that

$$ s^2 = c^2t^2 - l^2 > 0. \tag{20.29} $$

We shall plot the values of $ct$ and $l$ corresponding
to two definite events measured in quite different
inertial systems. No matter what values are obtained
as a result of these measurements of $ct$ and
$l$, the interval $s$ (20.29) between the events is the
same. It follows that the locus of the points, for
all possible spatial distances $l$ and time intervals
$ct$, is an equilateral hyperbola $s^2 = c^2t^2 - l^2$. Two
branches of the hyperbola are possible: one lies in the past relative
to the event that occurred at $t = 0$, $x = 0$, while the other is completely
in the future. It is easy to see that such a relationship inevitably
results if the events are causally related. Let the events, in some
reference system, be known to occur in the same place, for example,
sowing and reaping. To this system there corresponds the point O
(sowing) and the point A (reaping). But since all points of the given
branch of the hyperbola lie at $t > 0$, the sowing in any reference
system must occur earlier than the reaping.

We can also proceed from causally related events which in our 
coordinate system do not occur at one point of space, such as firing 
and hitting the target (they occur at one point in a system fixed 
in the bullet). To the system fixed in the bullet there now corresponds 
a vertical section OA in Fig. 33, while to our system there corresponds 
some inclined line drawn from the origin to a point on the same
upper hyperbola. Thus, here too, the second event—that of hitting 
the target—occurs after the shot in any reference system. 




If the velocity of the bullet (or any material particle) is $v$, then
$s^2 = c^2t^2 - v^2t^2$. It necessarily follows from the inequality (20.29)
that $l = vt < ct$, or $v < c$, if a reference system exists in which both
events occurred at a single point in space. For this reason, the velocity
of any material particle can only be less than the velocity of light $c$.

The region above the first asymptote is called the “absolute future”
relative to the initial event.

Although the example of a bullet hitting a target may appear 
to be a special case, in actual fact the foregoing reasoning may be 
used for all cases when an effect is related to a cause by some material- 
transfer process, either in the form of a particle or in the form of 
a wave packet. A reference system may be related to moving matter, 
and therefore any velocity of material transport satisfies the inequality 
u<c. 

Consequently, the theory of relativity never contradicts the objective 
nature of causality. And it is precisely the sequence of cause and 
effect which determines the direction of time. 

We can also consider other pairs of events. For example, let one
event occur on the sun and the other, five minutes later, on the
earth. Light travels from the sun to the earth in eight minutes;
for the two given events $ct < l$, $s^2 = c^2t^2 - l^2 < 0$. For purely physical
reasons, such events can in no way be related causally because the
velocity of transport of matter does not exceed $c$. For such events,
the interval term $l$ is not equal to zero in any system of reference.
And so there is no coordinate system relative to which events related
by an imaginary ($s^2 < 0$) interval occur at a single point of space.
Yet, on the other hand, their time sequence has not been defined:
coordinate systems exist in which the first event occurs before the
second, and there are systems in which the second event occurs
before the first.* Thus, the theory of relativity denies the absolute
nature of the simultaneity of two events occurring at different points
in space and separated by an imaginary interval. O and B in Fig. 33
are such events. B lies on the hyperbola which, relative to O, belongs
partly to the future and partly to the past. But O and B can in no
way be related causally, since no interaction can arrive at B from O
instantaneously.

Hence, the relativity of simultaneity does not contradict the absolute 
nature of causality. 
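
The sun and earth example can be worked through numerically. The sketch below assumes the standard Lorentz transformation of a time difference, measures time in minutes and distance in light-minutes (so that c = 1), and takes the x axis to point from the sun to the earth; the function name is illustrative.

```python
import math


def time_separation(dt, dx, beta):
    """Time between two events in a frame moving with speed beta*c along x.

    dt is the time separation in minutes and dx the spatial separation
    in light-minutes, both measured in the original frame.
    """
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (dt - beta * dx)


DT = 5.0   # second event (on the earth) occurs five minutes after the first
DX = 8.0   # sun-to-earth distance: eight light-minutes

print(f"s^2 = {DT**2 - DX**2} (negative: imaginary interval)")
for beta in (0.0, 0.5, 0.625, 0.7, 0.9):
    print(f"beta = {beta:5.3f}: dt' = {time_separation(DT, DX, beta):+.3f} min")
```

For a frame moving along the line from the sun towards the earth, the sign of the time separation changes when the speed passes $\frac{5}{8}c$: below it the event on the sun comes first, above it the event on the earth comes first.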

The region between the asymptotes is called absolutely distant, 
relative to the coordinate origin. 

The light cone. The asymptotes of the hyperbola, $l = ct$, are of special
interest. For them, $s = 0$.

* For example, the second event occurred earlier than the first relative
to any system moving in the direction from the sun to the earth with a velocity
exceeding $\frac{5}{8}c$, as may easily be seen from (21.20).




The relationship $l = ct$ holds for two events related by an
electromagnetic signal, for example, for the emission and absorption of
a radio signal. For these two events, $s = 0$ in all reference systems,
because the velocity of light is invariant and we must always have
$l = ct$. Since the curve in Fig. 33 in actual fact corresponds to
3+1 dimensions (three spatial and one time) instead of a plane, the
locus of zero intervals is picturesquely called “a light cone.”

Proper time. The concept of proper time of a particle is closely
related to that of interval. This is the time measured in a coordinate
system fixed to a particle. The displacement of the particle relative
to this system is equal to zero by definition. Hence, the proper time
that has elapsed between any two positions of the particle is
proportional to the interval calculated for these two positions:

$$ dt_0 = \frac{ds}{c} = dt\sqrt{1 - \frac{v^2}{c^2}}. \tag{20.30} $$

Here, $v$ is the velocity of the particle relative to an arbitrarily chosen
reference system in which the interval of time is equal to $dt$. From
(20.30) and also (20.21), the proper time is always the shortest.
For finite time intervals

$$ t_0 = \int\limits_0^t \sqrt{1 - \frac{v^2}{c^2}}\,dt, \tag{20.31} $$

i.e.,

$$ t_0 < t. \tag{20.32} $$
From (20.32) there follows a consequence with which, at first
sight, it is difficult to agree. Proper time is the time which
determines the rhythm of life processes in the human organism. For
this reason, if an imaginary traveller leaves the earth with a velocity
close to that of light, and later returns to the earth, then, in accordance
with (20.32), he will have aged less than a person, initially
of the same age, who remained on the earth.

The asymmetry between the traveller and the person who remained
on the earth is explained by the fact that the traveller was not moving
inertially: he first travelled away and then returned. For this, it
was necessary for him to turn about in some way, i.e., to lose the
property of inertiality retained by observers on the earth. It may be
noted that the time spent in turning may make up an infinitesimally
small part of the time of travel, if the journey itself is sufficiently
long. This is why the turning operation cannot in any way re-establish
equality between $t$ and $t_0$. But this operation is necessary in order
that the comparison between the ages of both observers can be
performed, i.e., to return them to the same point of space and to
a single coordinate system. Thus, disregarding the “striking”
formulation of the traveller experiment, we may say that time in a
noninertial system may differ to any extent from the time in an inertial
system, even though the noninertial system may deviate from
inertiality for an extremely short time.

The case of the human traveller is, of course, purely imaginary,
technically speaking. But a relationship of the type (20.32) is observed
in the decay of mesons in cosmic rays. The mean life of a positive
π-meson, of mass 273 electronic masses, in decaying to a μ-meson,
of mass 207 electronic masses, together with a neutral particle, is
$2 \times 10^{-8}$ sec (the negative π-meson is most often captured by nuclei).
This time is measured for a π-meson stopped in the substance, i.e.,
it is proper time. The velocity of the meson, like that of any other
particle, does not exceed $c$. If the relationship (20.32), expressing
the relativity of time, did not exist, then a rapid π-meson with a
velocity of the order of $c$ would on the average travel through
$c \times 2 \times 10^{-8}$ sec = 600 cm of air. Actually, the mean path of a π-meson
is considerably greater due to the fact that its lifetime, in a
coordinate system fixed in the air, is considerably greater than its
proper lifetime.
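
The lengthening of the mean path can be estimated from (20.30): for uniform motion the lifetime in the system of the air is the proper lifetime divided by $\sqrt{1 - v^2/c^2}$. The sketch below uses the proper lifetime quoted in the text and the rounded value $c = 3 \times 10^{10}$ cm/sec; the sample velocities and the function name are illustrative assumptions.

```python
import math

C_CM_PER_S = 3.0e10          # speed of light in cm/sec (rounded)
PROPER_LIFETIME_S = 2.0e-8   # proper mean life of the meson, from the text


def mean_path_cm(beta, tau0=PROPER_LIFETIME_S, c=C_CM_PER_S):
    """Mean path of a particle moving with speed beta*c.

    The lifetime in the laboratory system is tau0 / sqrt(1 - beta^2).
    """
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return beta * c * gamma * tau0


print(f"without time dilation, v ~ c: {C_CM_PER_S * PROPER_LIFETIME_S:.0f} cm")
for beta in (0.9, 0.99, 0.999):
    print(f"beta = {beta}: mean path = {mean_path_cm(beta):.0f} cm")
```

The closer the velocity is to $c$, the more the mean path exceeds the 600 cm that would be expected if the lifetime were the same in all systems.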

Frequency and wave-vector transformations of electromagnetic 
waves. Proceeding from invariants, we may find out in what way 
the noninvariant quantities involved are transformed in passing 
from one reference system to another. We shall now show that the 
transformation properties of wave-vector components and frequency 
are those of coordinates and time. 

In order to prove this, it is sufficient to note that the phase of a 
wave is invariant. Indeed, the phase characterizes some event, for 
example, the fading of the electric and magnetic fields at a certain 
instant of time and at a certain point in space. If we examine this 
wave in another coordinate system, then the coordinates and time 
(corresponding to this event) will have other values, though the event 
itself cannot change, of course. This is easy to understand if we 
imagine that the electric and magnetic fields are measured by the 
readings of some inertialess device. Two such devices, situated at 
the same point of space at a certain instant, but in motion relative 
to each other, must together indicate the zero value of the field. 
Otherwise, the coordinate system in which the electromagnetic field 
is equal to zero will in some way be distinguished from the 
others. 

From (17.21) and (17.22), the expression for the phase of a wave is

$$ \psi = xk_x + yk_y + zk_z - \omega t. $$

This quantity must be invariant in transforming to another coordinate
system. Let us express $x'$ and $t'$ in accordance with (20.17) and (20.18)
and substitute them into the condition of invariance of phase:

$$ xk_x + yk_y + zk_z - \omega t = x'k'_x + y'k'_y + z'k'_z - \omega't' = \frac{x - Vt}{\sqrt{1 - \frac{V^2}{c^2}}}\,k'_x + yk'_y + zk'_z - \omega'\,\frac{t - \frac{Vx}{c^2}}{\sqrt{1 - \frac{V^2}{c^2}}}. $$

Now comparing the coefficients of $x$, $y$, $z$, and $t$, we obtain the
transformation equations for the wave-vector components and
frequency:

$$ k_x = \frac{k'_x + \frac{V\omega'}{c^2}}{\sqrt{1 - \frac{V^2}{c^2}}}, \tag{20.33} $$

$$ k_y = k'_y, \tag{20.34} $$

$$ k_z = k'_z, \tag{20.35} $$

$$ \omega = \frac{\omega' + k'_xV}{\sqrt{1 - \frac{V^2}{c^2}}}. \tag{20.36} $$

These are entirely analogous to equations (20.19) and (20.20).
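
The invariance of the phase can be verified numerically for a sample wave and a sample event. The sketch below assumes the standard forms of the Lorentz transformation for $x'$, $t'$ and of the transformation of $k_x$ and $\omega$ obtained from the phase condition, in units with c = 1; all function names and sample values are illustrative.

```python
import math


def gamma(V, c=1.0):
    return 1.0 / math.sqrt(1.0 - (V / c) ** 2)


def to_primed(t, x, V, c=1.0):
    """Time and coordinate of an event in the system moving with speed V."""
    g = gamma(V, c)
    return g * (t - V * x / c**2), g * (x - V * t)


def wave_to_unprimed(omega_p, kx_p, V, c=1.0):
    """Frequency and k_x in the unprimed system from the primed values."""
    g = gamma(V, c)
    omega = g * (omega_p + V * kx_p)
    kx = g * (kx_p + V * omega_p / c**2)
    return omega, kx


def phase(omega, kx, ky, kz, t, x, y, z):
    return kx * x + ky * y + kz * z - omega * t


V = 0.6
omega_p, kx_p, ky_p, kz_p = 2.0, 1.2, 1.6, 0.0   # |k'| = omega'/c
t, x, y, z = 1.5, 0.7, -0.3, 2.0                 # arbitrary event

omega, kx = wave_to_unprimed(omega_p, kx_p, V)
t_p, x_p = to_primed(t, x, V)

phase_unprimed = phase(omega, kx, ky_p, kz_p, t, x, y, z)
phase_primed = phase(omega_p, kx_p, ky_p, kz_p, t_p, x_p, y, z)
print(phase_unprimed, phase_primed)
```

Both systems give the same phase for the same event, and the transformed wave still satisfies $\omega^2 = c^2k^2$, as it must for light in free space.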
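
The longitudinal and transverse Doppler effect. If the frequency
of a source of light with respect to its own coordinate system is equal
to $\omega'$, and the angle between its velocity and the line of sight is
equal to $\vartheta'$, so that $k'_x = k'\cos\vartheta' = \frac{\omega'}{c}\cos\vartheta'$, then we obtain from
(20.36)

$$ \omega = \omega'\,\frac{1 + \frac{V}{c}\cos\vartheta'}{\sqrt{1 - \frac{V^2}{c^2}}}. \tag{20.37} $$

In particular, if the source moves along the line of sight (i.e., towards
the observer),

$$ \omega = \omega'\,\frac{1 + \frac{V}{c}}{\sqrt{1 - \frac{V^2}{c^2}}}. \tag{20.38} $$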

These equations describe the well-known Doppler effect, by means 
of which the radial velocities of stars are measured. The square root 
in the denominator gives a correction introduced by the theory of 
relativity into the formula usually used. 

If the velocity of the source is perpendicular to the ray, then,
from the requirement that $k_x = 0$, we also obtain a change in frequency,
although it is of second order with respect to $V/c$:

$$ \omega = \omega'\sqrt{1 - \frac{V^2}{c^2}}. \tag{20.39} $$


This transverse effect was observed by Ives in the radiation from
moving ions (in canal rays) when the ratio $V/c$ was sufficient to detect
the frequency displacement spectroscopically. This gives direct
experimental proof of the contraction of the time scale in relative
motion.
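
The longitudinal and transverse cases can be compared in a few lines. The sketch below assumes the general formula in the form $\omega = \omega'(1 + \frac{V}{c}\cos\vartheta')/\sqrt{1 - V^2/c^2}$, with the angle measured in the system of the source; the function names and the sample value V/c = 0.6 are illustrative.

```python
import math


def doppler(omega_source, beta, cos_theta_source):
    """Observed frequency for a source moving with speed beta*c.

    cos_theta_source is the cosine of the angle between the velocity and
    the line of sight, measured in the system of the source.
    """
    return omega_source * (1.0 + beta * cos_theta_source) / math.sqrt(1.0 - beta**2)


def transverse_doppler(omega_source, beta):
    """Observed frequency when the ray is perpendicular to the velocity
    in the system of the observer (k_x = 0)."""
    return omega_source * math.sqrt(1.0 - beta**2)


BETA = 0.6
print("towards the observer:", doppler(1.0, BETA, 1.0))
print("transverse:          ", transverse_doppler(1.0, BETA))
# k_x = 0 in the observer's system corresponds to cos(theta') = -V/c:
print("general formula, cos(theta') = -beta:", doppler(1.0, BETA, -BETA))
```

The last two lines agree, which shows that the transverse formula is the special case of the general one for $\cos\vartheta' = -V/c$. The transverse shift contains only $V^2/c^2$, whereas the longitudinal shift is already of first order in $V/c$.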

A comparison of inertial forces and the force of gravitation. Let 
us now investigate the transformation from inertial to noninertial 
systems. We define the latter as a system in which there are inertial 
forces. 

All inertial forces have the common property that they are
proportional to the mass of the body. Among the interaction forces,
only one force is known which possesses that property: this is the
force of Newtonian gravitation. The fact that gravitational force
is proportional to the mass of the body is very well known, though
very surprising all the same. The mass of a body may be defined
from Newton's Second Law when any kind of force (electric, magnetic,
elastic, etc.) acts on the body. It is therefore very difficult to
understand why the force of interaction between bodies, namely, the
force of gravitation, is proportional to that very mass involved in
the expression for Newton's law [see (2.1)]. As is well known, all
other interaction forces are independent of mass.

In addition, the very form of the gravitation law itself somewhat
contradicts our physical intuition: in accordance with this law,
gravitational forces are transmitted over any distance instantaneously.

Einstein called attention to the profound significance of the analogy
between inertial forces and the force of gravitation. In certain cases
these forces are indistinguishable in their action. For example, when
an aeroplane performs a turn and, in doing so, inclines the plane
of its wings, the passengers feel, as before, that the direction of gravity
acting on them is perpendicular to the floor of the cabin. In this case,
a resultant force consisting of gravity and a centrifugal force operates
like the force of gravity: they are both proportional to the mass
and act on all the bodies inside the aircraft in the same way. It is
physically impossible to separate these two forces without
considering objects outside the aircraft.

When a lift begins to rise the gravitational force is, as it were,
increased: the force of inertia due to the acceleration of the lift
is added to it.

In these examples (almost trivial), the inertial and gravitational
forces are equivalent “on a small scale,” that is, in certain small
regions of space. In large regions, there is a certain essential difference
between the behaviour of inertial forces and gravity. The latter
diminishes with the distance from the centre of attraction, while
inertial forces either remain constant or increase without limit. Thus,
centrifugal force increases in proportion to the distance from the
axis of rotation. The force of inertia in a coordinate system fixed
in an accelerating lift is the same at any distance away from the lift.

The general theory of relativity. The basic idea of Einstein's
gravitational theory is that motion in a gravitational field is the same
sort of inertial motion as the acceleration of passengers relative
to a braking carriage. It is precisely for this reason that acceleration
due to the action of gravity does not depend on the mass of the body.
In order to understand why the force of gravitation, as opposed
to the well-known inertial forces, becomes zero at an infinite distance
away from attracting bodies, we must assume that the space close
to the attracting bodies does not have the geometrical properties
of Euclidean space. In other words, we must take it that space and
time obey non-Euclidean geometrical laws in the sense of the ideas
first developed by Lobachevsky and later by Riemann. Free motion
in such non-Euclidean (Riemannian) space is curvilinear. However,
since it is precisely the properties of space itself that determine
the curvature, acceleration of bodies does not depend upon their
masses (if we can neglect the effect of the latter on the gravitational
field) in the same sense as the field of a falling stone does not affect
the gravitational field of the earth.

Thus, gravitational and inertial forces are indistinguishable in 
small regions of space. In such regions, a noninertial coordinate 
system is equivalent to an inertial system in which an additional 
gravitational field is operative, with the same acceleration of falling 
bodies which, in the noninertial system, is ascribed to inertial forces. 
For this reason, this theory of gravitation is also called the general 
theory of relativity in contrast to the special theory of relativity, 
which considers only inertial systems. 

Since the equations of motion in the general theory of relativity
(in the same way as all equations of motion) are formulated in
differential form, equivalence on a small scale is quite sufficient for writing
down the equations.

However, we must remember that a rotating coordinate system
is not, as a whole, equivalent to a gravitational field. Indeed, a rotating
system can, generally, only be determined for distances from the
axis of rotation for which the velocity of rotation is less than that
of light. This is why a rotating system is not equivalent to a
nonrotating system, which has meaning in infinite space, too.

The mechanics of Einstein's general theory of relativity is
considerably more complicated than Newtonian mechanics, which is
included in this theory as a limiting case. But Einstein's theory is
free from the gnosiological concepts, so alien to us, of the hypothesis
of action at a distance. The properties of space and time in Einstein's
theory are studied in inseparable unity with the motion of matter,
and not only as a requisite for the motion of matter. Abstract space
and time, which in Newtonian physics were sometimes regarded as
almost belonging to logical, a priori categories, do not exist in
Einstein's gravitational theory: in the general theory of relativity, space
and time are endowed with physical properties.

The consequences of Einstein’s gravitational theory. Einstein’s 
refined gravitational theory leads to a series of results that may be 
verified by astronomical observation. 

1) The perihelion of Mercury should rotate through 43" per century. 
This is in excellent agreement with astronomical facts. 

2) Rays of light from stars passing near the limb of the sun should 
be displaced towards it, since light is not propagated rectilinearly 
in non-Euclidean space. This result also agrees closely with accurate 
observations made during solar eclipses. 

3) Spectral lines in heavy stars should be shifted towards the red
end of the spectrum, and this, too, is found to be the case.

For the first time in the history of science, the general theory 
of relativity made it possible to pose the cosmological problem, i.e., 
the problem of the structure and development of the Universe. 
The present state of the cosmological problem is far from a solution
due to insufficient astronomical data and to the mathematical
difficulties associated with Einstein's gravitational equations. It should
be noted that before the general theory of relativity, the cosmological
problem was posed in a purely speculative way; Einstein's theory
indicated the path for scientific investigation and has led to a series
of important results.


Exercises 

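1) Calculate the change in the velocity of light propagated through flowing
water in Fizeau's experiment.

$$ u_\pm = \frac{\frac{c}{\nu} \pm V}{1 \pm \frac{V}{\nu c}} \approx \frac{c}{\nu} \pm V\left(1 - \frac{1}{\nu^2}\right). $$

Disregarding the theory of relativity, the result would be $\frac{c}{\nu} \pm V$.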


2) Obtain a precise equation for the aberration of light, with an arbitrary
inclination $\alpha$ of the ray to the ecliptic.

Answer:

$$ \cos\alpha' = \frac{\cos\alpha - \frac{V}{c}}{1 - \frac{V}{c}\cos\alpha}. $$


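3) Write down the equations for the Lorentz transformations for an arbitrary
direction of the velocity $\mathbf{V}$ relative to a coordinate system.

In our equations $x = \frac{\mathbf{r}\mathbf{V}}{V}$, $x' = \frac{\mathbf{r}'\mathbf{V}}{V}$. The component perpendicular to the
velocity is

$$ \mathbf{r} - \frac{\mathbf{V}(\mathbf{r}\mathbf{V})}{V^2} = \mathbf{r}' - \frac{\mathbf{V}(\mathbf{r}'\mathbf{V})}{V^2}. $$

From (20.17)

$$ \frac{\mathbf{r}'\mathbf{V}}{V} = \frac{\frac{\mathbf{r}\mathbf{V}}{V} - Vt}{\sqrt{1 - \frac{V^2}{c^2}}}. $$

Multiplying this equation by $\frac{\mathbf{V}}{V}$ and adding the equations, we obtain

$$ \mathbf{r}' = \mathbf{r} - \frac{\mathbf{V}(\mathbf{r}\mathbf{V})}{V^2} + \frac{\frac{\mathbf{V}(\mathbf{r}\mathbf{V})}{V^2} - \mathbf{V}t}{\sqrt{1 - \frac{V^2}{c^2}}}. $$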

4) Write down the Lorentz-transformation equations for the components
of acceleration.

5) Show that the “four-dimensional volume element” $dx\,dy\,dz\,dt$ is invariant
with respect to a Lorentz transformation.

$dx = dx'\sqrt{1 - \frac{V^2}{c^2}}$ from (20.22) and $dt = \frac{dt'}{\sqrt{1 - V^2/c^2}}$
from (20.21), whence the statement follows.

6) A light beam is within a solid-angle element $d\Omega$. Show that Lorentz
transformations leave the quantity $\omega^2\,d\Omega$ invariant.

Use the result of exercise 2: $d\Omega = -2\pi\,d\cos\vartheta$.

Sec. 21. Relativistic Dynamics 

Action for a particle in the theory of relativity. The adjective
relativistic denotes invariance with respect to Lorentz transformations,
which invariance satisfies the relativity principle. For example,
Maxwell's equations in free space are relativistic.

In effect, the Lorentz transformation is derived from the
requirement that the equations of electrodynamics remain invariant.
Therefore, the proof of the relativistic invariance of Maxwell's equations,
which proof will be given somewhat later in this section, is simply
in the nature of a confirmation.

The situation with mechanics is altogether different. Newtonian
mechanics satisfies only the Galilean relativity principle, which holds
for velocities small compared with $c$. Therefore, it is necessary to
find equations of mechanics such that they will be invariant with
respect to Lorentz transformations.

In Sec. 10 it was shown how to develop a mechanics by proceeding
from the principle of least action. And it was found possible to
determine the form of the Lagrangian of a free particle by proceeding
from two basic assumptions [see equations (10.11)-(10.13)]:




1) Action is invariant to Galilean transformations; 

2) The Lagrangian of a free particle depends only on the absolute 
value of velocity; the velocity vector v cannot be involved in it 
because, in the absence of an external field, there are no distinguishable 
directions (in space) relative to which the vector v can be given.

In relativistic mechanics the first condition is replaced by the
invariance to a Lorentz transformation, while the second condition
remains unchanged. Both conditions are satisfied by an action function
of the form

$$ S = \int a\,ds = \int ac\sqrt{1 - \frac{v^2}{c^2}}\,dt, \tag{21.1} $$

where we have used the relationship (20.30) between $ds$ and $dt$.
Agreement with the first condition can be seen from the fact that action
is expressed in terms of interval only, while agreement with the
second condition is obvious. No other invariant quantities can be
constructed from $dl$ and $dt$ except the interval, whence the uniqueness
of the choice (21.1).

The Lagrangian for a free particle. In order to define the constant $a$,
we examine the limiting form of (21.1) for a small particle velocity.
If $v \ll c$, then

$$ \sqrt{1 - \frac{v^2}{c^2}} \approx 1 - \frac{v^2}{2c^2}. \tag{21.2} $$

From the definition of the Lagrangian (10.2)

$$ S = \int L\,dt \tag{21.3} $$

it follows that the Lagrangian is

$$ L = ac\sqrt{1 - \frac{v^2}{c^2}} \approx ac - \frac{av^2}{2c}. \tag{21.4} $$

The first term in (21.4) is constant and can be omitted as not appearing
in Lagrange's equation [see (10.8)]. The second term should be
compared with the Lagrangian for a free particle in Newtonian
mechanics:

$$ L = \frac{mv^2}{2}. \tag{21.5} $$

Whence

$$ a = -mc. \tag{21.6} $$

The meaning of $m$ here is the mass of the particle measured in a
coordinate system in which the particle is at rest (or infinitely near
rest). Thus, by its very definition, the quantity $m$ is relativistically
invariant. Finally, we have the Lagrangian in the form

$$ L = -mc^2\sqrt{1 - \frac{v^2}{c^2}}. \tag{21.7} $$



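Momentum in relativistic mechanics. From (21.7), we immediately
obtain an expression for momentum in the theory of relativity:

$$ \mathbf{p} = \frac{\partial L}{\partial \mathbf{v}} = \frac{m\mathbf{v}}{\sqrt{1 - \frac{v^2}{c^2}}}. \tag{21.8} $$

As required, at small particle velocities it reduces to the Newtonian
expression $\mathbf{p} = m\mathbf{v}$.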
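
The step from the Lagrangian to the momentum can be checked numerically in one dimension. The sketch below assumes the forms $L = -mc^2\sqrt{1 - v^2/c^2}$ and $p = mv/\sqrt{1 - v^2/c^2}$, works in units with m = c = 1, and compares a central-difference derivative of $L$ with the closed expression for $p$; the helper names are illustrative.

```python
import math


def lagrangian(v, m=1.0, c=1.0):
    """Lagrangian of a free relativistic particle (one-dimensional motion)."""
    return -m * c**2 * math.sqrt(1.0 - (v / c) ** 2)


def momentum(v, m=1.0, c=1.0):
    """Relativistic momentum, p = m v / sqrt(1 - v^2/c^2)."""
    return m * v / math.sqrt(1.0 - (v / c) ** 2)


def numerical_derivative(f, x, h=1e-6):
    """Central-difference estimate of df/dx."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


for v in (1e-4, 0.1, 0.5, 0.9):
    dL_dv = numerical_derivative(lagrangian, v)
    print(f"v = {v}: dL/dv = {dL_dv:.8f}, p = {momentum(v):.8f}, m v = {v:.8f}")
```

At small velocities all three columns coincide, while at velocities comparable with $c$ the momentum exceeds $mv$ appreciably.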

Sometimes the quantity $\frac{m}{\sqrt{1 - v^2/c^2}}$ (i.e., the proportionality factor
between velocity and momentum) is called the mass of motion of
the particle, as opposed to the rest mass $m$. To avoid confusion we
will not use the expression “mass of motion,” and will take the
term mass to mean the quantity $m$ which is relativistically invariant
by definition.

The limiting nature of the velocity of light. The limiting character
of the velocity of light, about which we have already spoken in
Sec. 20, can be seen from equation (21.8). As the velocity of a particle
approaches the velocity of light, its momentum tends to infinity.
The only exception is a particle whose mass is equal to zero. Its
momentum, written in the form (21.8), gives the indeterminate form
0/0 for $v = c$ and can remain finite. But then the velocity of this
particle must always equal $c$. This property, as we know, is
relativistically invariant since the velocity of light is the same in all inertial
systems. The momentum of such a particle must be given
independently of its velocity [and not according to equation (21.8)],
since the velocity is already determined and is equal to $c$. A velocity
greater than $c$ is utterly meaningless because it involves an imaginary
quantity for momentum.

Energy in the theory of relativity. Let us now determine the energy
of a particle. In accordance with the general definition for energy (4.4),

$$ \mathcal{E} = \mathbf{v}\frac{\partial L}{\partial \mathbf{v}} - L = \frac{mv^2}{\sqrt{1 - \frac{v^2}{c^2}}} + mc^2\sqrt{1 - \frac{v^2}{c^2}} = \frac{mc^2}{\sqrt{1 - \frac{v^2}{c^2}}}. \tag{21.9} $$

Equation (21.9) once again confirms the limiting nature of the
velocity of light. When $v$ tends to $c$, the energy of the particle $\mathcal{E}$ tends to
infinity. In other words, an infinitely large quantity of work must be
performed in order to impart to the particle a velocity equal to that
of light.
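
The algebra leading to the energy can be confirmed numerically. The sketch below evaluates $vp - L$ and compares it with $mc^2/\sqrt{1 - v^2/c^2}$ in units with m = c = 1 for one-dimensional motion; the function names are illustrative.

```python
import math


def lagrangian(v, m=1.0, c=1.0):
    return -m * c**2 * math.sqrt(1.0 - (v / c) ** 2)


def momentum(v, m=1.0, c=1.0):
    return m * v / math.sqrt(1.0 - (v / c) ** 2)


def energy(v, m=1.0, c=1.0):
    """Energy of a free particle, m c^2 / sqrt(1 - v^2/c^2)."""
    return m * c**2 / math.sqrt(1.0 - (v / c) ** 2)


for v in (0.0, 0.5, 0.9, 0.99, 0.999999):
    from_definition = v * momentum(v) - lagrangian(v)   # v dL/dv - L
    print(f"v = {v}: v p - L = {from_definition:.6f}, "
          f"energy = {energy(v):.6f}")
```

The two columns agree for every velocity, the energy at rest equals $mc^2$, and the energy grows without limit as $v$ approaches $c$.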

Rest energy. From equation (21.9), the energy of a particle at rest
is equal to $mc^2$. Let us apply this equation to a complex particle
capable of spontaneously decaying into two or three particles. Many
atomic nuclei and also unstable particles (mesons) are capable of such
disintegration. In the disintegration, the energy must be conserved,

$$ \mathcal{E} = \mathcal{E}_1 + \mathcal{E}_2, \tag{21.10} $$

because disintegration is spontaneous, caused not by any external
interaction, but by some internal motion in the complex particle.
The Lagrangian for this motion is not known explicitly, but in any
case it cannot involve time. Therefore, the energy of a complex particle
before disintegration is equal to the energy of the two particles formed
after the disintegration, when there is no longer any interaction
between them.

The energy of all these particles is expressed in accordance with
equation (21.9), as applied to all free particles (whether simple or
complex) when their motion is considered as a whole. The only possible
form of the Lagrangian for such motion is (21.7), from which it follows
that the energy is in the form (21.9). Substituting this expression in
(21.10) and noting that the initial particle was at rest, we obtain

$$ mc^2 = \mathcal{E}_1 + \mathcal{E}_2. \tag{21.11} $$

But the terms $\mathcal{E}_1$ and $\mathcal{E}_2$ on the right are correspondingly greater than
$m_1c^2$ and $m_2c^2$, whence we obtain the fundamental inequality

$$ m > m_1 + m_2. \tag{21.12} $$

Hence the mass of a complex particle capable of spontaneous dis¬ 
integration is greater than the sum of the masses of its component 
particles. In Newtonian mechanics, the mass characterizing the motion 
of the system as a whole [see the last term of equation (4.17)] is equal 
to the sum of the masses of the component particles. 

If we define the difference

$$T = \mathcal{E} - mc^2 \tag{21.13}$$

as the kinetic energy of a particle (for small velocities it reduces to
$T = mv^2/2$) and call $mc^2$ the rest energy, then it can be seen from
the law of conservation of energy (21.11) that part of the rest energy
of a complex particle is converted into kinetic energy of the component
particles, and part is converted into their rest energy. Only the total
energies $\mathcal{E}$, and not the kinetic energies $T$, satisfy the
conservation law because the kinetic energy of a complex particle as a
whole is equal to zero before disintegration and cannot be equal to the
essentially positive kinetic energy of the disintegration products.
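
The relations (21.9), (21.12) and (21.13) can be tried out numerically. The following is only a minimal sketch: the function names and the CGS value chosen for the velocity of light are our own illustrative assumptions and do not come from the text.

```python
import math

C = 2.998e10  # velocity of light in cm/sec (CGS units); an assumed round value


def total_energy(m, v, c=C):
    """Total energy of a free particle, equation (21.9)."""
    return m * c**2 / math.sqrt(1.0 - (v / c) ** 2)


def kinetic_energy(m, v, c=C):
    """Kinetic energy (21.13): total energy minus rest energy."""
    return total_energy(m, v, c) - m * c**2


def released_kinetic_energy(m, m1, m2, c=C):
    """Sum T1 + T2 for a parent at rest that decays into two particles.

    Follows from (21.11) and (21.13): m c^2 = (m1 c^2 + T1) + (m2 c^2 + T2).
    """
    if m <= m1 + m2:
        raise ValueError("spontaneous decay requires m > m1 + m2, see (21.12)")
    return (m - m1 - m2) * c**2
```

For a velocity small compared with $c$, `kinetic_energy` should agree closely with $mv^2/2$, and `released_kinetic_energy` refuses masses that violate the inequality (21.12).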

In chemical reactions, the change in the rest masses of the reacting
substances occurs in the order of $10^{-9}$ (and less) of the total mass.




In nuclear reactions, where the particle velocities are of the order c/10, 
the change in mass may approach one per cent. 

When an electron and positron (a positive electron) are annihilated,
their energy, including rest energy, is totally converted into the energy
of electromagnetic radiation.

As we shall see from quantum theory, radiation is propagated in
space in the form of separate particles, so-called light quanta
(quantum mechanics teaches that this is compatible with the wave
properties of radiation!). The velocity of a light quantum is equal to
$c$ so that its mass is identically equal to zero. For this reason, the
total rest mass of the particles taking part in the annihilation process
is $2m$ before the annihilation and zero afterwards.

However, the change in the energy of the electromagnetic field is,
of course, equal to $2mc^2$, provided the electron and positron did not
have any additional kinetic energy. We could, by convention, call the
energy of an electromagnetic field, divided by $c^2$, its mass. With such
a definition of mass, the total “mass” would be conserved. But compared
with the law of conservation of energy, such a law of conservation
of “mass” does not contain anything new; it only repeats the law
of conservation of energy in other units.

It is precisely the rest mass that is best to use in describing nuclear 
reactions, for a change in rest mass determines the energy which may 
be generated as a result of the reaction (in the form of kinetic energy 
of the disintegration products, or in the form of radiated energy). 

There is no sense in calling the energy of a light quantum, divided by
the square of the velocity of light, its mass, because this quantity
does not in any way characterize light quanta. This quantity has one
value in one reference frame and another value in another frame,
because the energy of any particle depends upon the reference system
relative to which its motion is defined. Yet rest mass is a quantity
that characterizes the particle. For example, the rest mass of an
electron, involved in the expressions for all its mechanical integrals
of motion, is equal to $9 \times 10^{-28}$ gm. The corresponding
quantity for a quantum is identically equal to zero, and, in this sense,
characterizes a light quantum in the same way that the quantity
$9 \times 10^{-28}$ gm is characteristic of an electron.

The mass of a particle determines the relationship between the
momentum and velocity of the particle in accordance with equation
(21.8). It is impossible to determine the mass of a particle by its
momentum alone, since particles with the same momenta can have
quite different masses. For this reason, it is meaningless to state
(though this is sometimes done) that the existence of light pressure
(i.e., momentum of the electromagnetic field) proves that the light
quantum has a finite mass.

It is sometimes said that a mass of one gramme is capable of releasing
an energy of $9 \times 10^{20}$ ergs (i.e., $1 \cdot c^2$). However, if
the substance consists of atoms, the possibility of generating this
energy is still questionable, since up to now not a single process is
known in which the total quantity of protons and neutrons (collectively
called nucleons) is changed.* This is why the relative change in rest
mass in nuclear reactions is always measured in fractions of one per cent.

The possibilities of various reactions are also limited by the
conservation of total charge.

The Hamiltonian for a free particle. We shall now express energy in
terms of momentum. Squaring equation (21.9) and subtracting from it
equation (21.8), after it has been squared and multiplied by $c^2$, we
obtain

$$\mathcal{E}^2 - c^2p^2 = m^2c^4. \tag{21.14}$$

We have called the energy expressed in terms of momentum the
Hamiltonian [see (10.15)]. Hence,

$$\mathcal{E} \equiv \mathcal{H} = \sqrt{m^2c^4 + c^2p^2}. \tag{21.15}$$

Whence we obtain a relationship between the energy and momentum
of a particle that has no rest mass:

$$\mathcal{E} = cp. \tag{21.16}$$
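
As a quick numerical check of (21.14)-(21.16), the sketch below evaluates momentum and energy from the velocity and compares the result with the Hamiltonian. The function names are our own choice for illustration; units are arbitrary as long as they are consistent.

```python
import math


def momentum(m, v, c):
    """Momentum of a free particle, equation (21.8)."""
    return m * v / math.sqrt(1.0 - (v / c) ** 2)


def energy(m, v, c):
    """Energy of a free particle, equation (21.9)."""
    return m * c**2 / math.sqrt(1.0 - (v / c) ** 2)


def hamiltonian(m, p, c):
    """Energy expressed in terms of momentum, equation (21.15)."""
    return math.sqrt(m**2 * c**4 + c**2 * p**2)
```

Setting `m = 0` in `hamiltonian` reproduces the relation (21.16) for a particle without rest mass.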


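The Lorentz transformation for momentum and energy. We shall
now find out how energy and momentum behave with respect to a
Lorentz transformation. From equation (21.8) we get

$$p_x = \frac{mv_x}{\sqrt{1 - v^2/c^2}} = \frac{m\,dx}{dt\sqrt{1 - v^2/c^2}} = mc\frac{dx}{ds}, \quad p_y = mc\frac{dy}{ds}, \quad p_z = mc\frac{dz}{ds},$$

$$\mathcal{E} = \frac{mc^2}{\sqrt{1 - v^2/c^2}} = \frac{mc^2\,dt}{dt\sqrt{1 - v^2/c^2}} = mc^3\frac{dt}{ds}, \tag{21.17}$$

where $ds = c\,dt\sqrt{1 - v^2/c^2}$ is the interval.

The quantities $m$, $c$ and $ds$ are invariant. Hence, the components
$p_x$, $p_y$ and $p_z$ are transformed similar to $dx$, $dy$, and $dz$,
i.e., similar to $x$, $y$, and $z$. In accordance with the last equation,
energy transforms like time. We can make the following comparison:
$p_x \sim x$, $p_y \sim y$, $p_z \sim z$, $\mathcal{E} \sim c^2t$.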


* In order to annihilate the whole mass we would have to first prepare
“antimatter” (when ordinary matter interacts with antimatter they are mutually
annihilated, cf. Sec. 38). But this would require a like expenditure of energy.


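Now substituting momentum and energy in the Lorentz transformation
(20.17) and (20.18), we obtain

$$p_x' = \frac{p_x - V\mathcal{E}/c^2}{\sqrt{1 - V^2/c^2}}, \tag{21.18}$$

$$p_y' = p_y, \tag{21.19}$$

$$p_z' = p_z, \tag{21.20}$$

$$\mathcal{E}' = \frac{\mathcal{E} - Vp_x}{\sqrt{1 - V^2/c^2}}. \tag{21.21}$$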


We note that a correct transition from (21.18) to a nonrelativistic
equation for the transformation of momentum is obtained only when the
rest energy $mc^2$ is substituted in place of $\mathcal{E}$, for then
$p_x' = p_x - mV$ (i.e., $v_x' = v_x - V$) in agreement with the Galilean
law for addition of velocities.

Hence, if we demand that the Lorentz transformation yield the
correct limiting transition to a Galilean transformation, it is necessary
to include the rest energy of the particles in their total energy.
Conversely, the kinetic energy $T$ (21.13) does not give a correct
limiting transition.

Further, we note that if we form the expression $\mathcal{E}^2 - c^2p^2$
from equations (21.17) we obtain

$$\mathcal{E}^2 - c^2p^2 = m^2c^4\,\frac{c^2dt^2 - dx^2 - dy^2 - dz^2}{ds^2} = m^2c^4$$

in accordance with (21.14).
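
The transformation of energy and momentum can be sketched in a few lines of code. We assume here, as the limiting case $p_x' = p_x - mV$ discussed above suggests, that the primed system moves with velocity $V$ along the $x$-axis; the function names are illustrative only.

```python
import math


def boost(energy, p, V, c):
    """Energy and momentum in a system moving with velocity V along x.

    p is a tuple (px, py, pz); returns (energy', (px', py', pz')),
    following the form of (21.18)-(21.21) assumed in the lead-in.
    """
    px, py, pz = p
    gamma = 1.0 / math.sqrt(1.0 - (V / c) ** 2)
    energy_new = gamma * (energy - V * px)
    px_new = gamma * (px - V * energy / c**2)
    return energy_new, (px_new, py, pz)


def invariant(energy, p, c):
    """The combination E^2 - c^2 p^2, equal to m^2 c^4 by (21.14)."""
    return energy**2 - c**2 * sum(q * q for q in p)
```

Applying `boost` with any admissible `V` should leave `invariant` unchanged, and choosing $V = p_xc^2/\mathcal{E}$ should make the $x$-component of momentum vanish.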

The velocity of a system of particles in the theory of relativity. We
shall now show how to determine the velocity of a system of particles
in relativity theory. We shall consider two particles. Between the
velocity, momentum, and energy of each particle there exists the relation

$$\mathbf{v} = \frac{c^2\mathbf{p}}{\mathcal{E}}. \tag{21.22}$$

It is obtained if we divide (21.8) by (21.9). The same equation can
also be obtained somewhat differently. Let us determine, from (21.18),
the velocity $V$ of the coordinate system relative to which the momentum
of the particle is equal to zero. Putting $p_x' = 0$ on the left-hand side
of (21.18) we will have, on the right,

$$V = \frac{p_xc^2}{\mathcal{E}},$$

or, if the velocity is not directed along the $x$-axis,
$\mathbf{V} = \mathbf{p}c^2/\mathcal{E} = \mathbf{v}$ in accordance with
(21.22). As applied to a single particle, the statement
$\mathbf{v} = \mathbf{V}$ is trivial and simply denotes that the momentum
of the particle, relative to a coordinate system moving with the same
velocity as the particle itself, is equal to zero.

We now apply equation (21.18) to two particles in order to find the
velocity of the coordinate system relative to which their total momentum
is equal to zero. The total momentum in the primed system is
$\mathbf{p}_1' + \mathbf{p}_2' = \mathbf{p}'$, and the total energy is
$\mathcal{E}_1' + \mathcal{E}_2' = \mathcal{E}'$. Let us take the $x$-axis
along $\mathbf{p}'$. Since the Lorentz transformation is linear and
homogeneous, it has the same form for the sum of two quantities as for
each separately. Therefore, we immediately obtain an equation similar
to (21.22):

$$\mathbf{V} = \frac{(\mathbf{p}_1' + \mathbf{p}_2')c^2}{\mathcal{E}_1' + \mathcal{E}_2'}. \tag{21.23}$$

The primes may be omitted here. In order to obtain the limiting
transition to the velocity of the centre of mass in Newtonian mechanics
from (21.23), it is necessary to take $\mathbf{p}_1 = m_1\mathbf{v}_1$,
$\mathbf{p}_2 = m_2\mathbf{v}_2$, $\mathcal{E}_1 = m_1c^2$,
$\mathcal{E}_2 = m_2c^2$, i.e., the particle energies are replaced by
their rest energies.
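
A short sketch of (21.23) and of its Newtonian limit follows. The data layout (lists of tuples) and the function names are our own assumptions, chosen only for illustration.

```python
def zero_momentum_frame_velocity(particles, c):
    """Velocity (21.23) of the system in which the total momentum vanishes.

    particles is a list of (energy, (px, py, pz)) pairs given in one frame.
    """
    total_energy = sum(e for e, _ in particles)
    total_p = [sum(p[i] for _, p in particles) for i in range(3)]
    return tuple(c**2 * q / total_energy for q in total_p)


def newtonian_centre_of_mass_velocity(bodies):
    """Limit of (21.23) with energies replaced by m c^2 and momenta by m v.

    bodies is a list of (mass, (vx, vy, vz)) pairs.
    """
    total_mass = sum(m for m, _ in bodies)
    return tuple(sum(m * v[i] for m, v in bodies) / total_mass for i in range(3))
```

For two equal masses, one at rest and one moving with velocity $0.6c$, the relativistic value is $c/3$, whereas the Newtonian expression gives $0.3c$.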

The quantity $\mathbf{V}$, expressed in terms of the particle velocities
according to equations (21.8) and (21.9), does not have the form of a
total derivative of any quantity with respect to time. Therefore, it is
impossible in relativistic mechanics to determine the coordinates of the
centre of mass in terms of its velocity. It is better to say that if we
attempt to express the coordinates of the centre of mass by means of
a classical (or some other) equation it is impossible to represent
$\mathbf{V}$ in the form of a time derivative of these coordinates,
except in the trivial case when $\mathbf{v}_1$ and $\mathbf{v}_2$ are
constant. This is why the concept of centre of mass for particles
moving in accelerated motion cannot be used.

As regards relative velocity, $\mathbf{v}_1 - \mathbf{v}_2$, it is
meaningless in relativistic mechanics, since there is no simple law for
the combination of velocities.

Action for particles in an electromagnetic field. Let us now turn
to the equations of motion for a charged particle in an electromagnetic
field. We already know the part of the action function which describes
the interaction of charges and field. From equation (13.17), this is
$S_1$. Since the variation of $S_1$ leads to Maxwell’s equations, we can
be certain of the relativistic invariance of $S_1$. As applied to point
charges, we have already written $S_1$ in magnetostatics in equation
(15.26). But action for free charges in the next equation, (15.27), was
suitable only for small particle velocities.

We now know the Lagrangian for a fast particle in the absence of a
field (21.7). Thus, the Lagrangian in an external field is equal to the
sum of the relativistically invariant expressions (21.7) and (15.26):

$$L = -mc^2\sqrt{1 - \frac{v^2}{c^2}} + \frac{e}{c}\mathbf{A}\mathbf{v} - e\varphi. \tag{21.24}$$


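We now obtain an expression for momentum and energy. Momentum is

$$\mathbf{p} = \frac{\partial L}{\partial\mathbf{v}} = \frac{m\mathbf{v}}{\sqrt{1 - v^2/c^2}} + \frac{e}{c}\mathbf{A} = \mathbf{p}_0 + \frac{e}{c}\mathbf{A}. \tag{21.25}$$

Here, $\mathbf{p}_0$ denotes momentum in the absence of a field.
From (4.6), the energy is

$$\mathcal{E} = \mathbf{v}\frac{\partial L}{\partial\mathbf{v}} - L = \mathbf{v}\mathbf{p}_0 + \frac{e}{c}\mathbf{v}\mathbf{A} + mc^2\sqrt{1 - \frac{v^2}{c^2}} - \frac{e}{c}\mathbf{v}\mathbf{A} + e\varphi = \mathcal{E}_0 + e\varphi, \tag{21.26}$$

where $\mathcal{E}_0$ is the energy in the absence of an external field;
according to (21.9), it is equal to $mc^2/\sqrt{1 - v^2/c^2}$.

Thus, the linear term in velocity does not appear in the energy
expressed in terms of momentum. It will be seen here that the
Lagrangian is not of the form $T - U$ because it involves the linear
term $\mathbf{A}\mathbf{v}$.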

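The Hamiltonian for a charge in an external field. From (21.25)
we obtain

$$\mathbf{p}_0 = \mathbf{p} - \frac{e}{c}\mathbf{A}, \tag{21.27}$$

and, from (21.26),

$$\mathcal{E}_0 = \mathcal{E} - e\varphi. \tag{21.28}$$

But we already know the expression for $\mathcal{E}_0$ in terms of
$\mathbf{p}_0$ from equation (21.15), which relates to the energy and
momentum of a free particle. Substituting $\mathbf{p}_0$ and
$\mathcal{E}_0$ in (21.15) in accordance with the last equations, we
obtain the Hamiltonian of a charge in a field:

$$\mathcal{H} = \sqrt{m^2c^4 + c^2\left(\mathbf{p} - \frac{e}{c}\mathbf{A}\right)^2} + e\varphi. \tag{21.29}$$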

The equations of motion of a charge in an external field. From (21.29)
we can obtain the equations of motion for a charge in an external
field. However, it is simpler to make use of the Lagrangian (21.24).
We know that Lagrange’s equations are of the following form:

$$\frac{d}{dt}\frac{\partial L}{\partial\mathbf{v}} = \frac{\partial L}{\partial\mathbf{r}}, \tag{21.30}$$

where equation (21.30) replaces three equations of the form (2.21)
for the components of $\mathbf{v}$ and $\mathbf{r}$.


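The derivative $\frac{\partial L}{\partial\mathbf{v}}$ is equal to
$\mathbf{p} = \mathbf{p}_0 + \frac{e}{c}\mathbf{A}$, so that its total
time derivative is

$$\frac{d}{dt}\frac{\partial L}{\partial\mathbf{v}} = \frac{d\mathbf{p}_0}{dt} + \frac{e}{c}\frac{d\mathbf{A}}{dt}. \tag{21.31}$$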


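In order to expand the expression $\frac{d\mathbf{A}}{dt}$, we first
write it down for one component:

$$\frac{dA_x}{dt} = \frac{\partial A_x}{\partial t} + \frac{\partial A_x}{\partial x}\frac{dx}{dt} + \frac{\partial A_x}{\partial y}\frac{dy}{dt} + \frac{\partial A_x}{\partial z}\frac{dz}{dt} = \frac{\partial A_x}{\partial t} + (\mathbf{v}\nabla)A_x \tag{21.32}$$

[see (11.31)], whence, going to a vector equation and substituting
in (21.31), we have

$$\frac{d}{dt}\frac{\partial L}{\partial\mathbf{v}} = \frac{d\mathbf{p}_0}{dt} + \frac{e}{c}\frac{\partial\mathbf{A}}{\partial t} + \frac{e}{c}(\mathbf{v}\nabla)\mathbf{A}. \tag{21.33}$$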

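Let us now calculate the right-hand side of (21.30). Instead of
$\frac{\partial L}{\partial\mathbf{r}}$ we can write the completely
equivalent expression $\nabla L$:

$$\nabla L = \frac{e}{c}\nabla(\mathbf{A}\mathbf{v}) - e\nabla\varphi.$$

The gradient $\nabla(\mathbf{A}\mathbf{v})$ denotes coordinate
differentiation, where only $\mathbf{A}$ and not $\mathbf{v}$ depends
explicitly on the coordinates. Thus, applying equation (11.32), we find:

$$\nabla L = \frac{e}{c}(\mathbf{v}\nabla)\mathbf{A} + \frac{e}{c}[\mathbf{v}\,\mathrm{rot}\,\mathbf{A}] - e\nabla\varphi. \tag{21.34}$$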

Now, substituting (21.33) and (21.34) in (21.30) and taking all the
terms involving potentials to the right-hand side, we obtain

$$\frac{d\mathbf{p}_0}{dt} = e\left(-\nabla\varphi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}\right) + \frac{e}{c}[\mathbf{v}\,\mathrm{rot}\,\mathbf{A}]. \tag{21.35}$$

The right-hand side of (21.35) involves the electromagnetic fields in
accordance with their definition in terms of potentials (12.28) and
(12.20).

Hence, the equation of motion of a charge involves only the field
and not the potential, as follows from the condition of gauge invariance.
After substituting the fields the equation takes the form

$$\frac{d}{dt}\frac{m\mathbf{v}}{\sqrt{1 - v^2/c^2}} = e\mathbf{E} + \frac{e}{c}[\mathbf{v}\mathbf{H}]. \tag{21.36}$$

The right-hand side of (21.36) is called the Lorentz force. In addition
to the usual term, $e\mathbf{E}$, which we know from electrostatics, it
involves a term similar to the Coriolis force. It is related to the part
of the Lagrangian which is linear in velocity.

The magnetic part of the Lorentz force, $\frac{e}{c}[\mathbf{v}\mathbf{H}]$,
is very similar to the expression for the force acting on a current in an
external magnetic field and, naturally, can be obtained from it. We did
not have to use this method of derivation because the part of the
Lagrangian which describes the interaction between charges and field was
already known from Sec. 13. And besides, the relativistic invariance of
(21.36), which emerges obviously from its derivation from the invariant
Lagrangian function, is considerably more difficult to grasp from the
elementary definition of a magnetic force acting on a current.

The work performed by a field on a charge. From equation (21.36),
we can obtain an expression for the work done by an electromagnetic
field on a charge. We know by definition that the work is equal
to the change in kinetic energy. Let us multiply scalarly both parts
of (21.36) by $\mathbf{v}$. We shall then have the expression
$\mathbf{v}\frac{d\mathbf{p}_0}{dt}$ on the left-hand side. But
$\mathbf{v} = \frac{\partial\mathcal{E}_0}{\partial\mathbf{p}_0}$ in
accordance with Hamilton’s equation (10.18), so that

$$\mathbf{v}\frac{d\mathbf{p}_0}{dt} = \frac{\partial\mathcal{E}_0}{\partial\mathbf{p}_0}\frac{d\mathbf{p}_0}{dt} = \frac{d\mathcal{E}_0}{dt} = \frac{dT}{dt};$$

on the left-hand side what we have is the required quantity for the
change of kinetic energy in unit time. On the right-hand side the term
$\mathbf{v}[\mathbf{v}\mathbf{H}] = [\mathbf{v}\mathbf{v}]\mathbf{H} = 0$
and there remains only the work done by the electric force:

$$\frac{dT}{dt} = e\mathbf{E}\mathbf{v}. \tag{21.37}$$

As was to be expected, the magnetic force
$\frac{e}{c}[\mathbf{v}\mathbf{H}]$ does not perform work on the charge
because it is perpendicular to the charge velocity at every given
instant of time.

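The Lorentz transformation for the field components. From (21.37)
and equation (21.36), if we write it in terms of components, it is
easy to obtain the Lorentz transformation equations for the field
components. These equations must be written so that their form
does not change in passing from one coordinate system to another.
Let us take equation (21.36) for the component of momentum on
$x$, and multiply it by $\frac{dt}{ds}$. We shall also multiply equation
(21.37) by $\frac{dt}{ds}$ and also by $\frac{V}{c^2}$, where $V$ is the
relative velocity of the coordinate system. After this we subtract
(21.37) from (21.36). Then on the left-hand side we have

$$\frac{d}{ds}\left(p_{0x} - \frac{V}{c^2}\mathcal{E}_0\right) = \sqrt{1 - \frac{V^2}{c^2}}\,\frac{dp_{0x}'}{ds},$$

where the last step follows from (21.18). On the right-hand side we
will have the expression

$$eE_x\left(\frac{dt}{ds} - \frac{V}{c^2}\frac{dx}{ds}\right) + \frac{e}{c}\left(H_z - \frac{V}{c}E_y\right)\frac{dy}{ds} - \frac{e}{c}\left(H_y + \frac{V}{c}E_z\right)\frac{dz}{ds}.$$

But $ds$ is invariant. Therefore, the quantities on the right-hand
side must be transformed in accordance with the basic equations
(20.17) and (20.18). Differentiating these equations, we obtain

$$\frac{dt}{ds} - \frac{V}{c^2}\frac{dx}{ds} = \frac{dt'}{ds}\sqrt{1 - \frac{V^2}{c^2}}, \quad \frac{dy}{ds} = \frac{dy'}{ds}, \quad \frac{dz}{ds} = \frac{dz'}{ds}.$$

Now, dividing both sides of the equation by $\sqrt{1 - V^2/c^2}$ and
multiplying by $\frac{ds}{dt'}$, we will have an equation for the
$x$-component of momentum in the new coordinate system:

$$\frac{dp_{0x}'}{dt'} = eE_x + \frac{e}{c}\,\frac{H_z - \frac{V}{c}E_y}{\sqrt{1 - V^2/c^2}}\,\frac{dy'}{dt'} - \frac{e}{c}\,\frac{H_y + \frac{V}{c}E_z}{\sqrt{1 - V^2/c^2}}\,\frac{dz'}{dt'}.$$

In accordance with the principle of relativity, this equation must
be written in the same way as for the unprimed coordinate system:

$$\frac{dp_{0x}'}{dt'} = eE_x' + \frac{e}{c}H_z'\frac{dy'}{dt'} - \frac{e}{c}H_y'\frac{dz'}{dt'}.$$

Comparing the last two equations, we obtain the field transformation
equations:

$$E_x' = E_x, \tag{21.38}$$

$$H_y' = \frac{H_y + \frac{V}{c}E_z}{\sqrt{1 - V^2/c^2}}, \tag{21.39}$$

$$H_z' = \frac{H_z - \frac{V}{c}E_y}{\sqrt{1 - V^2/c^2}}. \tag{21.40}$$

In the same way, though from the other equations (21.36), it is easy
to find the other equations for field transformation:

$$H_x' = H_x, \tag{21.41}$$

$$E_y' = \frac{E_y - \frac{V}{c}H_z}{\sqrt{1 - V^2/c^2}}, \tag{21.42}$$

$$E_z' = \frac{E_z + \frac{V}{c}H_y}{\sqrt{1 - V^2/c^2}}. \tag{21.43}$$

Consequently, in contrast to coordinates, it is not the longitudinal
but the transverse components that are transformed in the field.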

The change in field, in passing from one coordinate system to another,
is verified to a nonrelativistic approximation (i.e., to the accuracy of
terms of the order $\frac{V}{c}$) in a unipolar induction experiment.
A diagram of the experiment is shown in Fig. 34. The magnet NS
rotates around its longitudinal axis. Two collectors connected by a
fixed conductor are joined to the centre of the magnet and to its axis.
When the magnet is rotated, an e.m.f. appears in the wire. This
experiment is frequently interpreted as meaning that when the magnet is
rotated the wire “cuts” its lines of force as if the lines were attached
to the magnet like brushes.

Fig. 34

Actually, unipolar induction must be understood as follows. There
is only a magnetic field $\mathbf{H}$ in the coordinate system attached
to the magnet, while the electric field is equal to zero. Hence, in a
system fixed in the wire, relative to which the magnet moves, an electric
field, too, should be observed in accordance with (21.42) or (21.43).
This field is of an order of magnitude $\frac{V}{c}H$ and produces the
e.m.f.

We note that a coordinate system is defined only when both the
electric and magnetic fields are specified. It is insufficient to specify
only one of them.

The invariants of an electromagnetic field. From equations (21.38)-
(21.43), it is easy to obtain the following two invariants:

$$E'^2 - H'^2 = E^2 - H^2, \tag{21.44}$$

$$\mathbf{E}'\mathbf{H}' = \mathbf{E}\mathbf{H}. \tag{21.45}$$

From the invariance of these expressions it follows that the
electromagnetic field of a plane wave appears similar in all systems.
Indeed, in a plane wave, $E = H$ or $E^2 - H^2 = 0$. This property is
invariant according to (21.44). Further, $\mathbf{E} \perp \mathbf{H}$,
so that $(\mathbf{E}\mathbf{H}) = 0$. This property is invariant
according to (21.45).
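
The invariance of both expressions can be checked numerically. In the sketch below the transformation is written in the usual Gaussian-unit form for a system moving with velocity $V$ along the $x$-axis, which we take to be the content of (21.38)-(21.43); the function names are hypothetical.

```python
import math


def transform_fields(E, H, V, c):
    """Fields observed in a system moving with velocity V along x.

    E and H are tuples (x, y, z) in Gaussian units. The longitudinal
    components are unchanged; the transverse ones mix E and H.
    """
    Ex, Ey, Ez = E
    Hx, Hy, Hz = H
    beta = V / c
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    E_new = (Ex, gamma * (Ey - beta * Hz), gamma * (Ez + beta * Hy))
    H_new = (Hx, gamma * (Hy + beta * Ez), gamma * (Hz - beta * Ey))
    return E_new, H_new


def invariants(E, H):
    """Return the pair (E^2 - H^2, E.H) appearing in (21.44) and (21.45)."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return dot(E, E) - dot(H, H), dot(E, H)
```

Starting from a purely magnetic field, `transform_fields` yields a transverse electric field of order $\frac{V}{c}H$, which is the situation of the unipolar induction experiment.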

The quantity $(\mathbf{E}\mathbf{H})$ is invariant with respect to a
Lorentz transformation. But with respect to a replacement of $x$, $y$,
$z$ by $-x$, $-y$, $-z$ (i.e., an inversion of the signs of the
coordinates), it is not invariant, because in this case $\mathbf{E}$
changes sign while $\mathbf{H}$ does not (see Sec. 16). The quantity
$E^2 - H^2$ is invariant even when the coordinate signs are inverted.
But this quantity is the Lagrangian for a free electromagnetic field.
Integrated over the invariant volume $dx\,dy\,dz\,dt$ (exercise 5,
Sec. 20), it yields invariant action, as required, while the quantity
$(\mathbf{E}\mathbf{H})$ does not give a real invariant.

The linearity of Maxwell’s equations with respect to field. A real
invariant can be formed from the quantity $\mathbf{E}\mathbf{H}$ merely
by squaring. It is, of course, not at all obvious beforehand why such a
quantity, as well as the square of the invariant $E^2 - H^2$, cannot
appear in the Lagrangian for an electromagnetic field. The same can be
said of higher-order terms which do not change sign in the substitution
of $x$ by $-x$, etc. But if some terms, other than quadratic with respect
to field, are left in the Lagrangian, then Maxwell’s equations will
contain nonlinear terms.

The essential difference between nonlinear and linear equations
is that the sum of two solutions of a nonlinear equation is, in general,
not a solution of it. Indeed, if two electromagnetic waves are propagated
in a vacuum they are simply combined, and in no way distort each
other. In nonlinear theory, the velocity is a function of the wave
amplitude, while in electrodynamics the velocity of light is a universal
constant.

For this reason, the choice of the Lagrangian in the simplest form
$E^2 - H^2$ expresses the experimental fact that the law of variation
of any electromagnetic field in space and in time is in no way dependent
upon whether another field is operative in that same charge-free
region of space.

Actually the quantum electromagnetic field theory indicates the
existence of certain nonlinear effects. In the range of phenomena
for which classical electrodynamics is applicable, these effects are
not essential.

Transformation of charge density and current density. From the
definition of charge density one can find the law of its transformation.
Since charge is an invariant quantity, we have

$$de = \rho\,dx\,dy\,dz = \rho_0\,dx_0\,dy_0\,dz_0, \tag{21.46}$$

where $\rho_0$ is the charge density in a system relative to which it
is at rest and, hence, the quantity is also invariant by definition.
Whence,

$$\rho = \rho_0\frac{dx_0\,dy_0\,dz_0}{dx\,dy\,dz} = \rho_0\frac{dt}{dt_0} = \rho_0c\frac{dt}{ds}, \tag{21.47}$$

where we have used (20.30) and exercise 5, Sec. 20. The current
density is

$$j_x = \rho\frac{dx}{dt} = \rho_0c\frac{dt}{ds}\frac{dx}{dt} = \rho_0c\frac{dx}{ds}, \quad j_y = \rho_0c\frac{dy}{ds}, \quad j_z = \rho_0c\frac{dz}{ds}. \tag{21.48}$$

From here it can be seen that the current components are transformed
like coordinates, while charge density is transformed like time.

Let us consider a conductor in which a current is flowing. In the
coordinate system in which the conductor is at rest, it remains neutral,
but in other systems a charge density must appear on it. This fact
does not contradict the invariance of total charge, but follows from
it in accordance with (21.46)-(21.48).
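
The appearance of a charge density on a neutral conductor can be illustrated with a short sketch. Since $j_x$ transforms like $x$ and $\rho$ like $t$, we assume the same form of the Lorentz transformation as for coordinates, with the new system moving with velocity $V$ along the $x$-axis; the function name is illustrative.

```python
import math


def transform_charge_and_current(rho, jx, V, c):
    """Charge density and x-component of current in a system moving along x.

    rho is transformed like t and jx like x; returns (rho', jx').
    """
    gamma = 1.0 / math.sqrt(1.0 - (V / c) ** 2)
    rho_new = gamma * (rho - V * jx / c**2)
    jx_new = gamma * (jx - V * rho)
    return rho_new, jx_new
```

With `rho = 0` and a nonzero `jx`, the returned density is different from zero, while the combination $c^2\rho^2 - j_x^2$ keeps its value.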

The invariance of action for the field. Let us now verify that the
action term (13.17) describing the interaction of field and charge
is invariant. It follows from (21.27) and (21.28) that the vector
potential transforms like momentum (i.e., like a radius vector),
while the scalar potential transforms like energy (i.e., like time).
For this reason the product

$$\frac{1}{c}\mathbf{A}\mathbf{j} - \varphi\rho$$

behaves, with respect to a Lorentz transformation, like an interval;
in other words, it remains invariant. Integrated over the invariant
“four-dimensional” volume $dx\,dy\,dz\,dt$, it yields the invariant
action term $S_1$. Hence, Maxwell’s equations are obtained from the
invariant action function $S$, so that they are also invariant
themselves. This could also have been verified from equations
(21.38)-(21.43).


Exercises 

1) Find the scalar and vector potentials of a freely moving charge.

In its own coordinate system, the scalar potential is
$\varphi_0 = \frac{e}{r_0}$ and the vector potential is equal to zero.
Hence, in a system relative to which the charge moves, its scalar
potential is

$$\varphi = \frac{\varphi_0}{\sqrt{1 - v^2/c^2}} = \frac{e}{r_0\sqrt{1 - v^2/c^2}},$$

and the vector potential is

$$\mathbf{A} = \frac{\mathbf{v}}{c}\,\frac{\varphi_0}{\sqrt{1 - v^2/c^2}}.$$

Further, $r_0$ must be expressed in terms of coordinates in the fixed
system:

$$r_0 = \sqrt{x_0^2 + y_0^2 + z_0^2} = \sqrt{\frac{(x - vt)^2}{1 - v^2/c^2} + y^2 + z^2}.$$

We can put $\xi$ instead of $vt$, i.e., the abscissa of the moving
charge. The electromagnetic disturbance arrives at the given point
$x$, $y$, $z$ from the point $\xi'$, where the charge was situated
earlier. We have

$$\frac{\xi - \xi'}{v} = \frac{\sqrt{(x - \xi')^2 + y^2 + z^2}}{c} = \frac{R'}{c}$$

from the definition of lag. Putting $\xi = vt$ in $r_0$, we obtain an
expression for $\varphi$ and $\mathbf{A}$ in terms of $R'$.

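2) Find the motion of a charge in a constant uniform magnetic field.

If the field is in the direction of the $z$-axis, the equations of
motion are of the following form:

$$\frac{dp_x}{dt} = \frac{e}{c}H\frac{dy}{dt}, \qquad \frac{dp_y}{dt} = -\frac{e}{c}H\frac{dx}{dt}.$$

Further, $p_z = \text{const}$, $p^2 = \text{const}$,
$p_x^2 + p_y^2 = \text{const}$, $\mathcal{E} = \text{const}$.

We look for the coordinates $x$ and $y$ in the form:

$$x = R\cos\omega t, \qquad y = R\sin\omega t.$$

For $R$ and $\omega$ the following expressions result:

$$\omega = -\frac{ecH}{\mathcal{E}}, \qquad R = \frac{c\sqrt{p_x^2 + p_y^2}}{eH}.$$

The particle moves along a helix. For small velocities, $\omega$ reduces
to the constant value $-\frac{eH}{mc}$.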
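
The result of this exercise is easy to evaluate numerically. The sketch below returns the magnitude of the angular frequency and the radius of the helix; the function name and argument layout are our own assumptions, and Gaussian units are implied.

```python
import math


def gyration(e, H, m, p_perp, p_z, c):
    """Magnitude of the angular frequency and radius of the helix.

    p_perp is the momentum component perpendicular to the field H,
    p_z the (constant) component along it.
    """
    energy = math.sqrt(m**2 * c**4 + c**2 * (p_perp**2 + p_z**2))
    omega = e * c * H / energy
    radius = c * p_perp / (e * H)
    return omega, radius
```

The product `omega * radius` should equal the transverse velocity $c^2p_\perp/\mathcal{E}$ given by (21.22), and for small momenta `omega` tends to $eH/mc$.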

3) Find the motion of a charge in a constant uniform electric field. The
equations of motion are

$$\frac{d\mathbf{p}}{dt} = e\mathbf{E}, \qquad \frac{d\mathcal{E}}{dt} = e\mathbf{E}\mathbf{v}.$$

From the last equation we obtain, taking the field along the $x$-axis,

$$\sqrt{m^2c^4 + c^2(p_x^2 + p_y^2 + p_z^2)} - \sqrt{m^2c^4 + c^2(p_{x0}^2 + p_{y0}^2 + p_{z0}^2)} = eEx.$$

From the first equation

$$p_x - p_{x0} = eEt, \quad p_y - p_{y0} = 0, \quad p_z - p_{z0} = 0.$$

These integrals together give $x$ as a function of $t$.

If $p_{z0} = 0$, then dividing $p_x$ by $p_y$, we have an expression for
$\frac{dx}{dy}$ in terms of $x$ (by eliminating $t$ from the energy
integral). The trajectory is of the form of a catenary.

4) Find the motion of a charge in a central attractive Coulomb field.
The energy integral is of the form

$$(\mathcal{E} - e\varphi)^2 - c^2\left(p_r^2 + \frac{M^2}{r^2}\right) = m^2c^4.$$

Further, denoting the azimuth by $\psi$ we obtain

$$M = \frac{\mathcal{E} - e\varphi}{c^2}\,r^2\frac{d\psi}{dt}, \qquad p_r = \frac{\mathcal{E} - e\varphi}{c^2}\,\frac{dr}{dt}.$$

Whence

$$\frac{1}{r^2}\frac{dr}{d\psi} = \frac{p_r}{M}.$$

Substitution of $p_r$ in the energy integral and separation of the
variables $r$ and $\psi$ leads to an elementary quadrature. The
trajectory for finite motion ($\mathcal{E} < mc^2$) is similar to an
ellipse, but with a rotating perihelion.

5) Examine the collision of a travelling particle of zero mass with a
particle of mass $m$ at rest. Determine the energy of the incident
particle after collision, if its angle of deflection $\vartheta$ is known.

Answer: $\mathcal{E}' = \dfrac{\mathcal{E}}{1 + \dfrac{\mathcal{E}}{mc^2}(1 - \cos\vartheta)}$.
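
The answer can be evaluated and checked against the conservation laws with a short sketch; the function name is our own.

```python
import math


def scattered_energy(energy, m, theta, c):
    """Energy of the zero-mass particle after deflection through theta.

    energy is its energy before the collision, m the mass of the
    particle initially at rest.
    """
    return energy / (1.0 + energy / (m * c**2) * (1.0 - math.cos(theta)))
```

For $\vartheta = 0$ the energy is unchanged. For any other angle the energy and momentum handed over to the particle of mass $m$ must again satisfy (21.14), which serves as a test of the formula.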

6) Find the motion and radiation of a charge connected elastically to
some point of space (with frequency $\omega_0$) and situated in a uniform
magnetic field $H_z = H$, $H_x = H_y = 0$.

The oscillations of the charge are governed by the following
nonrelativistic equations:

$$m\ddot{x} = -m\omega_0^2x + \frac{e}{c}H\dot{y},$$

$$m\ddot{y} = -m\omega_0^2y - \frac{e}{c}H\dot{x},$$

$$m\ddot{z} = -m\omega_0^2z.$$

The third equation does not depend on the first two. The first two
equations are easily solved if we put $x = ae^{i\omega t}$,
$y = be^{i\omega t}$. Then

$$a(\omega_0^2 - \omega^2) - i\omega\frac{eH}{mc}b = 0,$$

$$b(\omega_0^2 - \omega^2) + i\omega\frac{eH}{mc}a = 0.$$

Let us multiply the second equation by $i$ and first subtract it from
the first, and then add it. The combinations $a \pm ib$ then satisfy the
equations

$$(a \pm ib)(\omega_0^2 - \omega^2) \mp (a \pm ib)\frac{eH}{mc}\omega = 0.$$

Cancelling $a \pm ib$, we arrive at the equations for frequencies

$$\omega_0^2 - \omega^2 \mp \frac{eH\omega}{mc} = 0.$$

We regard $\frac{eH}{mc}$ as small compared with $\omega_0$, replace
$\omega$ by $\omega_0$ in the term $\frac{eH\omega}{mc}$, and represent
the difference $\omega_0^2 - \omega^2$ as
$-(\omega + \omega_0)(\omega - \omega_0)$, which is approximately
equal to $-2\omega_0(\omega - \omega_0)$. Then we obtain expressions for
the frequencies of both oscillations:

$$\omega = \omega_0 \mp \frac{eH}{2mc}.$$

They differ from the undisplaced frequency by $\frac{eH}{2mc}$, i.e., by
the Larmor frequency $\omega_L$.

From the equations for $a$ and $b$, after substituting
$\omega = \omega_0 \mp \omega_L$, we have, to the same degree of
approximation,

$$a = \pm ib.$$

If we represent the coordinates in real form, we obtain for both
oscillations

$$x = a\cos(\omega_0 \mp \omega_L)t, \qquad y = \pm a\sin(\omega_0 \mp \omega_L)t.$$

Thus, the radius vector of a particle performing oscillations with
frequency $\omega_0 + \omega_L$ rotates in a clockwise direction, while
for oscillations with frequency $\omega_0 - \omega_L$ it rotates in an
anticlockwise direction. Thus, in accordance with Larmor’s theorem, the
frequency $\omega_L$ is added to the frequency $\omega_0$ or subtracted
from it, depending upon the direction in which the charge rotates (we
note that the sign of $\omega_L$ is changed for a negative charge).
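
To see how good the approximation $\omega = \omega_0 \mp \omega_L$ is, we can compare it with the exact positive roots of the two quadratic equations $\omega_0^2 - \omega^2 \mp 2\omega_L\omega = 0$. The sketch below does this; the function name and the shape of the return value are illustrative assumptions.

```python
import math


def zeeman_frequencies(omega0, e, H, m, c):
    """Frequencies of the two circular oscillations in the field H.

    Returns ((exact_low, exact_high), (approx_low, approx_high)), where
    the exact values are the positive roots of
    omega0^2 - omega^2 -/+ 2 * omega_L * omega = 0.
    """
    larmor = e * H / (2.0 * m * c)
    root = math.sqrt(omega0**2 + larmor**2)
    exact = (root - larmor, root + larmor)
    approx = (omega0 - larmor, omega0 + larmor)
    return exact, approx
```

The two pairs differ only in the second order of the small ratio $\omega_L/\omega_0$, in agreement with the degree of approximation used above.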

Let us consider the radiation of such a charge in a magnetic field. We
know that the electric vector of the radiated electromagnetic wave lies
in the same plane as the charge displacement vector. If radiation is
observed to be due to the $z$-component of the dipole moment, its
electric vector is along the $z$-axis and is proportional to $z$. Thus,
the radiation is plane-polarized and is of frequency $\omega_0$. The
oscillation occurring along the field and having an undisplaced
frequency radiates electromagnetic waves which are polarized in the same
plane as the magnetic field. This oscillation does not radiate at all in
the direction of the magnetic field, but the oscillations with frequency
$\omega_0 \pm \omega_L$ radiate circularly-polarized waves, and the
electric-field vector rotates in the same direction as the charge
displacement vector.

All three frequencies radiate in a direction perpendicular to the
field. However, since the charge oscillations are viewed from one side
in this position in circular rotation, the vector of electric-field
oscillations lies in a plane perpendicular to the constant external
magnetic field, so that waves with frequencies $\omega_0 \pm \omega_L$
are now also plane-polarized. In observations that are not at right
angles to the field, we obtain elliptically-polarized oscillations and a
plane-polarized oscillation of frequency $\omega_0$.

The calculations set out here form the classical theory of the Zeeman
effect. The line splitting that is actually observed for various values
of magnetic field is correctly described only by quantum theory (Sec. 34).


PART III 
QUANTUM MECHANICS 

Sec. 22. The Inadequacy of Classical Mechanics. 

The Analogy Between Mechanics and Geometrical Optics 

The instability of the atom according to the classical view.
Rutherford’s experiments in 1910 established that the atom consists of
light negative electrons and a heavy positive nucleus of dimensions
very small compared to the atom itself (see Sec. 6). For such a system
to be stable, it is necessary that the electrons should revolve around
the nucleus like planets about the sun, for unlike charges at rest
would come together.

This stability condition of the atom is, nevertheless, insufficient.
In the case of motion in an orbit, electrons will experience centripetal
acceleration, but, as was shown in Sec. 19, a charged particle
undergoing acceleration radiates electromagnetic waves, thereby
transmitting its energy to the electromagnetic field. Thus, the energy
of an electron moving around a nucleus should continuously diminish
until the electron falls onto the nucleus. This statement is in striking
contradiction to the obvious fact of the stability of atoms.

The Bohr theory. In 1913, N. Bohr suggested a compromise as a
way out of this difficulty. According to Bohr, an atom has stable
orbits such that an electron moving in them does not radiate
electromagnetic waves. But in making a transition from an orbit of higher
energy to one with lower energy, an electron radiates; the frequency
of this radiation is related to the difference between the energies
of the electron in these two orbits by the equation

$$h\omega = \mathcal{E}_1 - \mathcal{E}_2,$$

where $h$ is a universal constant equal to $1.054 \times 10^{-27}$ erg-sec.

Both of Bohr’s principles were in the nature of postulates. But
it was possible with their aid to explain, in excellent agreement with
experiment, the observed spectrum of the hydrogen atom and also
the spectra of a series of atoms and ions similar to the hydrogen atom
(for example, the positive helium ion, which consists of a nucleus
and one electron). Despite the fact that both of these, essentially
quantum, postulates of Bohr were completely alien to classical
physics and could in no way be explained on the basis of classical
concepts, they represented an extraordinary step forward in the
theory of the atom.

Indeed, the first postulate contains the statement that not every 
state of the atom is stationary, but only certain states. This statement, 
as we now know, derives from quantum mechanics just as directly 
as elliptical planetary orbits derive from Newtonian mechanics. 

The Bohr theory was very successful in explaining the spectra
of single-electron atoms. But the very next step, a two-electron
atom such as the helium atom, did not yield to consistent calculation
by the Bohr theory. The theory was even less capable of explaining
the stability of the hydrogen molecule. For this reason, the situation
in physics, notwithstanding a number of brilliant results of the Bohr
theory, was completely unsatisfactory. Besides the particular
difficulties that we have noted here, the Bohr theory was, on the whole,
eclectic, since it was inconsistent in its combination of classical and
quantum concepts.

Light quanta. The inadequacy of classical ideas appeared most 
obvious in the problem of the stability of the atom. But earlier 
there were many facts which classical (i.e., nonquantum) physics 
failed to explain. A case in point is the theory of an electromagnetic 
field in equilibrium with matter (for more detail, see Sec. 42). Here, 
classical theory leads to an absurd result—the total energy of an 
electromagnetic field in equilibrium with radiating matter is expressed 
in the form of a divergent, i.e., infinite, integral. 

In order to give a satisfactory description of experimental facts, 
Planck, in 1900, postulated that sources of radiation emit and absorb 
energy of the electromagnetic field in finite amounts. These discrete 
quantities, or quanta, as they were called by Planck, are proportional 
to the frequency of the emitted or absorbed radiation. It is easy to 
see that the factor of proportionality must be the same as in Bohr’s 
second postulate (actually, Planck introduced a quantity $2\pi$ times 
greater, but used a frequency $\nu = \omega/2\pi$, equal to the number of 
oscillations per second). Bohr’s second postulate relates the properties 
of discreteness of stationary states of a radiating system (atom), 
occurring in line spectra, to the energy of the emitted quanta. Classically, 
it is just as impossible to explain this discreteness as it is to 
explain Planck’s initial hypothesis. 

The duality of electrodynamical concepts. At the beginning of 
the twentieth century, the classical theory of light also turned out 
to be incapable of explaining many facts without appealing to an 
additional hypothesis concerning light quanta. But at the same time, 


Sec. 22] THE ANALOGY BETWEEN MECHANICS AND GEOMETRICAL OPTICS 231 


there was a whole range of phenomena, such as diffraction and interference 
of light, which appeared to be intimately bound up with 
its wave nature. It did not seem possible to explain these phenomena 
in terms of classical corpuscular concepts. 

Another group of phenomena could be explained only on the basis 
of Planck’s hypothesis concerning light quanta, and was in most 
obvious contradiction to the classical wave conceptions. Let us note 
two such phenomena. 

I refer, firstly, to the so-called photoemissive effect, i.e., the emission 
of electrons from the surface of a metal in a vacuum when the metal 
is illuminated by ultraviolet rays. The energy of the photoelectrons 
depends only on the radiation frequency and is independent of its 
intensity. This can only be understood if we assume that the energy 
of electromagnetic radiation is absorbed in the form of quanta, $h\omega$. 
Then the kinetic energy of an electron will be equal to the energy 
of the quantum minus the electronic work function (the energy 
needed to remove the electron from the metal). 

Einstein, to whom this explanation of the laws of the photoelectric 
effect belongs, went further and assumed that electromagnetic 
radiation is not only absorbed and emitted in the form of quanta, 
but is also propagated in that form. 

Since the energy of a quantum is equal to $h\omega$, and its velocity 
is $c$, it should possess a momentum $h\omega/c$ (see Sec. 21). It follows that 
a quantum is a particle of zero mass. It will be noted that the energy 
and momentum of an electromagnetic wave are related in exactly 
this way (see Sec. 17). 

The second phenomenon which exhibited the quantum properties 
of radiation provided confirmation of Einstein’s hypothesis 
concerning the momentum of light quanta. In the scattering of 
X-radiation by electrons, the latter may be regarded as free, since 
the characteristic frequencies of their motion in a substance are very 
small compared with the frequency of the incident radiation (we 
have considered such scattering in exercise 4, Sec. 19). It is essential 
that in accordance with classical theory the scattered radiation 
must be of the same frequency as the incident radiation. But experiment 
showed that the frequency of the scattered radiation is less 
than the frequency of the incident radiation, and depends upon 
the angle at which the scattering is observed (Compton effect). The 
displaced frequency can be calculated in relation to the scattering 
angle, if it is assumed that the act of scattering occurs as the collision 
of two free particles: a moving quantum and an electron at 
rest. A collision of this sort was considered in exercise 5, Sec. 21, 
especially for the case of an incident particle with zero mass. The 
equation obtained there gives a perfectly correct description of the 
frequency shift in the Compton effect, if we consider that the energy 



of the quantum is equal to $\mathcal{E} = h\omega$ and its momentum $\mathbf{p} = h\mathbf{k}$ 
(or $h\omega/c$ in absolute magnitude). 
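
As a numerical illustration of the frequency shift just described, the following sketch evaluates the standard Compton formula $\omega' = \omega / [1 + (h\omega/mc^2)(1 - \cos\theta)]$, which is what the collision of a zero-mass quantum with an electron at rest gives. The equation of exercise 5, Sec. 21 is not reproduced in this section, so the form used here is an assumption, as are the rounded constants and the function names.

```python
import math

H = 1.054e-27    # quantum of action (the h of this book), erg*sec
M = 9.11e-28     # electron mass, gm (rounded, assumed value)
C = 2.998e10     # velocity of light, cm/sec (rounded, assumed value)


def scattered_frequency(omega, theta):
    """Angular frequency after scattering through the angle theta (radians)
    by a free electron at rest; omega is the incident angular frequency."""
    return omega / (1.0 + H * omega / (M * C ** 2) * (1.0 - math.cos(theta)))


def wavelength_shift(omega, theta):
    """Increase of the wavelength 2*pi*c/omega caused by the scattering, cm."""
    return 2.0 * math.pi * C * (1.0 / scattered_frequency(omega, theta) - 1.0 / omega)
```

For scattering through a right angle the wavelength shift computed in this way does not depend on the incident frequency and equals $2\pi h/mc$, about $2.4 \times 10^{-10}$ cm.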

Quantum mechanics. Thus, in the theory of light and the theory 
of the atom, a peculiar dualism arose: one and the same physical 
reality (the electromagnetic field or the atom) was described by 
two contradictory theories: classical and quantum. A way out of 
this situation was found in consistent quantum theory, where all 
motion possesses certain wave properties, which, however, cannot 
be detected in the motion of macroscopic bodies, but are essential 
in the description of the motion of such microscopic particles as 
quanta and electrons. The criterion to be used in order to determine 
whether it is necessary to take into account the wave properties 
of a given motion will be given in the following section. The only 
thing to note here is that it involves the constant $h$. 

The basic principles of quantum mechanics received direct experimental 
verification after the discovery of electron diffraction, whose 
laws are very similar to the diffraction laws of electromagnetic waves. 
All atomic phenomena are qualitatively and quantitatively fully 
accounted for by quantum mechanics. 

The present state of nuclear theory. Nuclear phenomena are somewhat 
more complex. At the present time, we do not know the laws 
governing the interaction of nuclear particles. These laws are closely 
related to the properties of special nuclear fields, which properties 
differ in many respects from those of the electromagnetic field. At 
the present time, we still do not know the theory of nuclear fields. 
It may also be that there is still insufficient experimental data for 
the development of such a theory. Therefore, nuclear theory is, to 
date, considerably less developed than atomic theory, in which all 
interactions are of an electromagnetic nature and well known. In 
any case, the difficulties experienced by modern nuclear theory 
lie outside the region in which nonrelativistic quantum mechanics 
can be applied, and in no way affect its basis. 

The correspondence between geometrical optics and classical 
mechanics. An essential role has been played in the formation of 
quantum theory by the analogy between classical mechanics and wave 
optics. A correspondence will be established between them in the 
present section. Since geometrical optics is a limiting case of wave 
optics, an analogy between geometrical optics and classical mechanics 
permits of transition to the wave equations of quantum mechanics 
by means of a generalization. Let us establish an analogy between 
the equations of mechanics and geometrical optics, quite formally 
for the time being. Its meaning will be given later. 

Surfaces of constant phase. Let us explain the significance of the 
wave phase in geometrical optics. To do this, we perform the limiting 
transition from wave optics to geometrical optics. We write the 
expression for the field in the form 




$$\mathbf{E} = \mathbf{E}_0(\mathbf{r}, t)\,\cos\frac{\chi(\mathbf{r}, t)}{\lambda}. \tag{22.1}$$

Here, $\lambda$ is the wavelength, which is regarded as small compared with 
the linear dimensions of the region occupied by the field. In the limiting 
case of a plane wave the phase is 

$$\varphi = -\omega t + \mathbf{k}\mathbf{r} \tag{22.2}$$

[Cf. (17.21), (17.22)]. Since $\omega = 2\pi u/\lambda$ and $k = 2\pi/\lambda$, where $u$ is the 
phase velocity, it is convenient to obtain a relationship containing $\lambda$ 
explicitly, describing the phase as $\varphi = \chi/\lambda$. 

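The expression for the field must be substituted into the wave 
equation 

$$\Delta\mathbf{E} - \frac{1}{u^2}\frac{\partial^2\mathbf{E}}{\partial t^2} = 0. \tag{22.3}$$

In differentiating with respect to $t$ and $\mathbf{r}$, we retain only those terms 
containing the highest degree of $\lambda$ in the denominator because $\lambda$ is 
a small quantity. Hence, 

$$\frac{\partial\mathbf{E}}{\partial t} = \frac{\partial\mathbf{E}_0}{\partial t}\cos\frac{\chi}{\lambda} - \frac{\mathbf{E}_0}{\lambda}\frac{\partial\chi}{\partial t}\sin\frac{\chi}{\lambda} \;\to\; -\frac{\mathbf{E}_0}{\lambda}\frac{\partial\chi}{\partial t}\sin\frac{\chi}{\lambda},$$

$$\frac{\partial^2\mathbf{E}}{\partial t^2} \;\to\; -\frac{\mathbf{E}_0}{\lambda}\frac{\partial^2\chi}{\partial t^2}\sin\frac{\chi}{\lambda} - \frac{\mathbf{E}_0}{\lambda^2}\left(\frac{\partial\chi}{\partial t}\right)^2\cos\frac{\chi}{\lambda}.$$

As stated, the first term in the last expression must be discarded 
when $\lambda \to 0$. Whence, 

$$\frac{\partial^2\mathbf{E}}{\partial t^2} = -\frac{\mathbf{E}_0}{\lambda^2}\left(\frac{\partial\chi}{\partial t}\right)^2\cos\frac{\chi}{\lambda}$$

and similarly 

$$\Delta\mathbf{E} = -\frac{\mathbf{E}_0}{\lambda^2}\left(\nabla\chi\right)^2\cos\frac{\chi}{\lambda}.$$

Substituting these expressions into the wave equation (22.3), we 
obtain a first-order differential equation for the phase $\varphi = \chi/\lambda$: 

$$\left(\nabla\varphi\right)^2 - \frac{1}{u^2}\left(\frac{\partial\varphi}{\partial t}\right)^2 = 0. \tag{22.4}$$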
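In the limiting case of a plane wave, it follows from (22.2) that 

$$\mathbf{k} = \frac{\partial\varphi}{\partial\mathbf{r}} \equiv \nabla\varphi \tag{22.5}$$

and 

$$\omega = -\frac{\partial\varphi}{\partial t}, \tag{22.6}$$

where 

$$k^2 - \frac{\omega^2}{u^2} = 0.$$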




But, according to (22.4), this same equation is satisfied also by the 
quantities $\partial\varphi/\partial\mathbf{r}$ and $\partial\varphi/\partial t$ in the expression of the almost 
plane wave (22.1). 

It follows that we can take equations (22.5) and (22.6) as definitions 
for the wave vector and frequency of an almost plane wave. 
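
The definitions $\mathbf{k} = \nabla\varphi$ and $\omega = -\partial\varphi/\partial t$ can be tried out numerically on a wave that is only approximately plane. The sketch below is an illustration under stated assumptions, not part of the original text: it takes the phase of a spherical wave far from its source, $\varphi = k_0 |\mathbf{r}| - \omega_0 t$, replaces the derivatives by central differences, and checks that the resulting $\mathbf{k}$ and $\omega$ satisfy the first-order phase equation $k^2 - \omega^2/u^2 = 0$. The units, the numerical values and the function names are arbitrary choices.

```python
import math

U = 1.0         # phase velocity (arbitrary units, assumed)
OMEGA0 = 50.0   # frequency of the source (assumed)
K0 = OMEGA0 / U


def phase(x, y, z, t):
    """Phase of a spherical wave far from its source: K0*|r| - OMEGA0*t."""
    return K0 * math.sqrt(x * x + y * y + z * z) - OMEGA0 * t


def wave_vector_and_frequency(x, y, z, t, step=1e-5):
    """Return ((kx, ky, kz), omega) from k = grad(phase), omega = -d(phase)/dt,
    with the derivatives replaced by central differences."""
    kx = (phase(x + step, y, z, t) - phase(x - step, y, z, t)) / (2 * step)
    ky = (phase(x, y + step, z, t) - phase(x, y - step, z, t)) / (2 * step)
    kz = (phase(x, y, z + step, t) - phase(x, y, z - step, t)) / (2 * step)
    omega = -(phase(x, y, z, t + step) - phase(x, y, z, t - step)) / (2 * step)
    return (kx, ky, kz), omega


k, omega = wave_vector_and_frequency(3.0, 4.0, 12.0, 0.0)
k_abs = math.sqrt(sum(c * c for c in k))
residual = k_abs ** 2 - omega ** 2 / U ** 2   # should vanish up to rounding
```

At the chosen point the computed wave vector points along the radius, i.e., along the normal to the spherical surface of constant phase, which is the direction of the light ray there.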

The wave vector is directed along the normal to a 
surface of constant phase $\varphi = \text{const}$, i.e., along a 
light ray at a given point of space. The propagation 
of an almost plane wave may be represented 
as the displacement in space of a family of surfaces 
of constant phase. In Fig. 35, these surfaces are 
shown in cross section by solid lines, and the 
light rays are dashed lines. 

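Fig. 35 

At various instants of time $t$, a surface of 
definite value $\varphi = \varphi_0$ occupies various positions in 
space corresponding to the equation $\varphi(\mathbf{r}, t) = \varphi_0$. We 
determine the propagation velocity of this surface. 
To do this, we proceed from the condition obtained by differentiating 
$\varphi = \text{const}$: 

$$d\varphi = \frac{\partial\varphi}{\partial t}\,dt + \frac{\partial\varphi}{\partial\mathbf{r}}\,d\mathbf{r} = 0.$$

Let $d\mathbf{r}$ be a vector in a direction normal to the surface. Then 
$|\partial\varphi/\partial\mathbf{r}|$ is the absolute value of $\mathbf{k}$. From (22.5) and (22.6), we obtain 

$$\frac{|d\mathbf{r}|}{dt} = \frac{\omega}{k} = u,$$

which coincides with the definition of phase velocity (18.7). 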

Phase and group velocities are, in general, different, since group 
velocity is equal to 

$$\mathbf{v} = \frac{\partial\omega}{\partial\mathbf{k}}. \tag{22.7}$$

It is essential that for an almost plane wave, $\omega$ may be expressed 
as a function of $\mathbf{k}$, just as is done in the case of a plane wave. 

Surfaces of constant action. We shall now consider a family of trajectories 
of identical particles moving in some force field. For example, 
these may be shrapnel particles formed when a shell is exploded 
(though not pieces of the shell itself having different masses!); it 
must be considered here that the shrapnel explosions occur at the 
same place continuously, so that the particles fly one after the other 
along each trajectory lagging only in time. It is not absolutely necessary 
to consider particles emerging from one point. Trajectories may 
be taken which are normal to some initial surface. Each particle emerging 
at $t = t_0$ has a definite trajectory depending on its initial coordinates 




and initial velocity. The value of the action $S$ for each particle may be 
calculated along these trajectories from the equation [see (10.2)] 

$$S = \int_{t_0}^{t} L\,dt. \tag{22.8}$$

Since at each instant $t$ the position of a particle is determined by 
its trajectory, and $L$ is a known function of coordinates, velocities, 
and time, $L\{q(t), \dot{q}(t), t\}$, the action along each trajectory is also 
known as a function of time. 

Let us join by surfaces the points of all the trajectories for which 
the value of $S$ is constant. The equation of this surface is $S(\mathbf{r}, t) = S_0$. 
In accordance with (22.8), this surface coincides at the initial instant 
of time with the surface from which the particles emerged. 

The relationship of momentum and energy with action. The surfaces 
of constant action are displaced in space and are orthogonal to the 
particle trajectories, because, from (10.23), 

$$p_x = \frac{\partial S}{\partial x}, \qquad p_y = \frac{\partial S}{\partial y}, \qquad p_z = \frac{\partial S}{\partial z},$$

and, in general, 

$$\mathbf{p} = \frac{\partial S}{\partial\mathbf{r}}. \tag{22.9}$$

The partial derivatives are proportional to the direction cosines of 
the normal to the surface $S = S_0$. 

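Let us calculate, in addition, the partial derivative of $S$ with respect 
to time. Since the action depends upon the coordinates and the time, 
its total derivative is equal to 

$$\frac{dS}{dt} = \frac{\partial S}{\partial t} + \frac{\partial S}{\partial\mathbf{r}}\,\frac{d\mathbf{r}}{dt}. \tag{22.10}$$

But from (22.8) 

$$\frac{dS}{dt} = L.$$

Substituting this into the left-hand side of (22.10), and $\mathbf{p} = \partial S/\partial\mathbf{r}$ into 
the right-hand side of the same equation, we obtain 

$$\frac{\partial S}{\partial t} = L - \mathbf{p}\mathbf{v} = -\mathcal{E}. \tag{22.11}$$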

Similar quantities in an optical-mechanical analogy. Comparing 
equations (22.9) and (22.11) with (22.5) and (22.6), we conclude 
that a surface of constant action of a system of particles is propagated 
similarly to a surface of constant phase: the momentum of particles is 
similar to the wave vector, while the particle energy is similar to the 



frequency. In accordance with Hamilton’s equation (10.18), the particle 
velocity $\mathbf{v}$ analogous to the group velocity of a wave is 

$$\mathbf{v} = \frac{\partial\mathcal{E}}{\partial\mathbf{p}}. \tag{22.12}$$

The velocity of a surface of constant action is 

$$u = \frac{\mathcal{E}}{p}. \tag{22.13}$$

This value by no means coincides with the particle velocity. Thus, for 
free particles 

$$\mathcal{E} = \frac{p^2}{2m}, \qquad v = \frac{p}{m},$$

so that $u = v/2$. The quantity $u$ is analogous to the phase velocity of the 
waves. 
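
The difference between the two velocities for a free particle is easy to check numerically. The sketch below is only an illustration: it assumes the nonrelativistic energy $\mathcal{E} = p^2/2m$, evaluates the particle velocity $v = \partial\mathcal{E}/\partial p$ by a central difference and the velocity of a surface of constant action as $u = \mathcal{E}/p$. The function names and the sample values of $m$ and $p$ are arbitrary.

```python
def energy(p, m):
    """Energy of a free nonrelativistic particle of momentum p and mass m."""
    return p * p / (2.0 * m)


def particle_velocity(p, m, dp=1e-6):
    """v = dE/dp (Hamilton's equation), evaluated by a central difference."""
    return (energy(p + dp, m) - energy(p - dp, m)) / (2.0 * dp)


def action_surface_velocity(p, m):
    """u = E/p, the velocity of a surface of constant action."""
    return energy(p, m) / p


m, p = 2.0, 3.0                     # arbitrary units (assumed values)
v = particle_velocity(p, m)         # close to p/m
u = action_surface_velocity(p, m)   # one half of v
```

With these values $v$ comes out equal to $p/m = 1.5$ and $u$ to $0.75$, so that the surface of constant action lags behind the particles by a factor of two in velocity.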

The expression for group velocity (22.7) corresponds to Hamilton’s 
equation (22.12). In Sec. 18 it was shown that the group velocity corresponds 
to the velocity of propagation for a wave packet, i.e., a disturbance 
concentrated within a certain region of space. Thus, the analogy 
between mechanics and geometrical optics establishes a correspondence 
between a particle and a wave packet. 

The transition to geometrical optics provides for representation of 
the solution of the wave equation, in a certain region of space, in the 
form of a plane wave; however, the quantities defining this wave, such 
as the frequency and the wave vector, are themselves slowly varying 
functions of coordinates and time. The relationship between frequency 
and wave vector will be of the form $\omega = \omega(\mathbf{k}, \mathbf{r})$, where the quantities 
$\mathbf{k}$ and $\mathbf{r}$ that describe the wave propagation satisfy the same Hamiltonian 
equations as $\mathbf{p}$ and $\mathbf{r}$ of a particle moving along a trajectory. It 
is essential here that the vector $\mathbf{k}$ should not change much in magnitude 
and direction over a distance of one wavelength, and that the 
frequency $\omega$ should not change greatly in one oscillation period. 

For a plane wave in free space, $\omega = ck$; this is completely analogous 
to the relationship (21.16) between the energy and momentum of a 
particle of zero mass. 

If light is propagated in an inhomogeneous medium, the phase 
velocity u, appearing in equation (22.4), is a variable quantity. For 
example, when light is refracted at the boundary of two media, u 
has different values on both sides of the boundary. The propagation 
of light in an inhomogeneous medium is similar to the motion of a 
particle in a medium of variable potential energy. 




The optical-mechanical analogy was established by Hamilton in 
i26. However, up to the time of the formation of quantum mechanics 
(i.e., up to 1926), the physical significance of this analogy was not 
understood. 

The law of transformation and the dimensions for similar quantities. 
The analogy between optical and mechanical quantities is relativistically 
invariant. Comparing formulae (20.33)-(20.36), for the transformation 
of wave vector and frequency, with (21.18)-(21.21), for 
the transformation of momentum and energy, we see that similar 
quantities are transformed in a similar manner. 

The optical and mechanical quantities differ only in dimensions. 
Thus, phase has zero dimensions while action has the dimensions of 
$\int L\,dt$, i.e., gm·cm²/sec. Accordingly, the wave vector and momentum, 
and the frequency and energy, likewise differ by the dimensions of 
action. As can be seen from a comparison of the Lorentz transformations 
for these quantities, the proportionality factor is an invariant 
quantity. 

In the following section we shall show that the analogy between 
mechanics and geometrical optics emerges as a limiting relation from 
the precise wave equation of quantum mechanics. 


Exercises 


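1) Formulate the equation of surfaces of constant action for a system 
of particles emerging from a single point in space $x = 0$, $z = 0$, in the plane 
$y = 0$ in a gravitational field. The absolute value of the particle velocities 
is $v_0$ and their direction is arbitrary. 

We have 

$$v_x = v_{0x}, \qquad p_x = m v_{0x}, \qquad x = v_{0x} t,$$

$$v_z = v_{0z} - gt, \qquad p_z = m v_{0z} - mgt, \qquad z = v_{0z} t - \frac{g t^2}{2}.$$

Eliminating the initial conditions for velocities, we obtain 

$$p_x = \frac{\partial S}{\partial x} = \frac{m x}{t}, \qquad p_z = \frac{\partial S}{\partial z} = \frac{m z}{t} - \frac{m g t}{2},$$

$$S = \frac{m}{2}\left(\frac{x^2 + z^2}{t} - g t z - \frac{g^2 t^3}{12}\right).$$

Indeed, from the expression for $S$ we obtain 

$$-\frac{\partial S}{\partial t} = \frac{m}{2}\left(\frac{x^2 + z^2}{t^2} + g z + \frac{g^2 t^2}{4}\right) = \frac{m v^2}{2} + m g z.$$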

2) Proceeding from the fact that phase is analogous to action, show that 
light of given frequency is propagated along trajectories for which the propagation 
time of constant phase is least (Fermat principle). 

At constant frequency 

$$\varphi = \int \mathbf{k}\,d\mathbf{r} = \omega \int \frac{\mathbf{n}\,d\mathbf{r}}{u}.$$

But the product $\mathbf{n}\,d\mathbf{r}$ is equal to the displacement of the surface in a direction 
normal to it. It follows that $\mathbf{n}\,d\mathbf{r}/u$ is the propagation time $dt$. In accordance 
with the variational principle, which governs phase as well as action, the time 
must be least. 
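
The Fermat principle can be illustrated on the refraction of light at a plane boundary between two media. In the sketch below (an illustration under stated assumptions, not part of the original exercise) light travels from the point $(0, a)$ in a medium with phase velocity $u_1$ to the point $(d, -b)$ in a medium with phase velocity $u_2$, crossing the boundary $y = 0$ at the point $x$; the least propagation time is found by a golden-section search. The geometry, the function names and the tolerance are arbitrary choices.

```python
import math


def travel_time(x, a, b, d, u1, u2):
    """Propagation time along the broken path (0, a) -> (x, 0) -> (d, -b)."""
    return math.hypot(x, a) / u1 + math.hypot(d - x, b) / u2


def crossing_point(a, b, d, u1, u2, tol=1e-10):
    """Golden-section search for the x in [0, d] giving the least time."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = 0.0, d
    while hi - lo > tol:
        x1 = hi - ratio * (hi - lo)
        x2 = lo + ratio * (hi - lo)
        if travel_time(x1, a, b, d, u1, u2) < travel_time(x2, a, b, d, u1, u2):
            hi = x2   # the minimum lies in [lo, x2]
        else:
            lo = x1   # the minimum lies in [x1, hi]
    return 0.5 * (lo + hi)


a, b, d, u1, u2 = 1.0, 1.0, 1.0, 1.0, 0.75
x = crossing_point(a, b, d, u1, u2)
sin_1 = x / math.hypot(x, a)            # sine of the angle of incidence
sin_2 = (d - x) / math.hypot(d - x, b)  # sine of the angle of refraction
```

At the crossing point found in this way, $\sin\theta_1/u_1$ and $\sin\theta_2/u_2$ agree to within the accuracy of the search, which is the usual law of refraction.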




Sec. 23. Electron Diffraction 

The essence of diffraction phenomena. Classical mechanics is analogous 
only to geometrical optics and by no means to wave optics. 
The difference between mechanics and wave optics is best of all illustrated 
by the example of diffraction phenomena. 

Let us consider the following experiment. Let there be a screen with 
two small apertures. Let us assume that the distance between the 
apertures is of the same order of magnitude as the apertures themselves. 
Temporarily, we cover one of the apertures and direct a light wave 
on the screen. We shall observe the wave passing through the aperture 
by the intensity distribution on a second screen situated behind 
the first. Let us now cover the second aperture. The intensity distribution 
will be changed. Now let us open both apertures at once. An 
intensity distribution will be obtained which will not in any way 
represent the sum of the intensities due to each aperture separately. 
At the points of the screen, at which the waves from both apertures 
arrive in opposite phase, they will mutually cancel, while at those 
points at which the phase for both apertures is the same, they will 
reinforce each other. In other words, it is not the intensities of light, 
i.e., the quadratic values, that are added, but the values of the fields 
themselves. 

This type of diffraction can occur only because the wave passes 
through both apertures. Only then are definite phase differences obtained 
at points of the second screen for rays passing through each 
aperture. 

We disregard here the diffraction effects associated with the passage 
through one aperture. These phenomena are due to the phase differences 
of the rays passing through various points of the aperture. 
Instead of examining such phase differences we consider that the 
phase of a wave passing through each aperture is constant, but we 
take into account the phase differences between waves passing through 
different apertures. Nothing is essentially changed by this simplification. 

X-ray diffraction. In order to observe the diffraction of X-rays which 
are of considerably shorter wavelength than visible light, they can be 
made to scatter by correctly arranged atoms in a crystal lattice. Waves, 
scattered by different planes of the lattice, have constant phase differences 
$2\pi$, $4\pi$, ..., etc., for definite scattering directions. The distance 
between planes of the lattice, the scattering angles at which maxima 
are observed, and the wavelength are related by a simple formula 
(the Wulff-Bragg condition). From this equation we can determine 
the wavelength of X-radiation. 
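
The text does not write the formula out. The sketch below assumes its standard form, $2d\sin\theta = n\lambda$, where $d$ is the distance between lattice planes, $\theta$ the glancing angle at which the maximum of order $n$ is observed, and $\lambda$ the wavelength. The function names and the sample numbers are illustrative and are not taken from the book.

```python
import math


def bragg_wavelength(d, theta, order=1):
    """Wavelength from the plane spacing d and the glancing angle theta
    (radians) of the maximum of the given order, assuming 2*d*sin(theta) = n*lambda."""
    return 2.0 * d * math.sin(theta) / order


def bragg_angles(d, wavelength):
    """Glancing angles (radians) of all maxima possible for a given wavelength."""
    angles = []
    n = 1
    while n * wavelength <= 2.0 * d:
        angles.append(math.asin(n * wavelength / (2.0 * d)))
        n += 1
    return angles


d = 2.8e-8            # plane spacing, cm (illustrative value)
wavelength = 1.5e-8   # X-ray wavelength, cm (illustrative value)
angles = bragg_angles(d, wavelength)
```

With these numbers three maxima are possible, and `bragg_wavelength(d, angles[1], order=2)` recovers the wavelength from the second of them.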

Diffraction by a crystal lattice is somewhat more complicated than 
in the experiment with two apertures, though, fundamentally, it 
occurs for the same reasons; the wave is scattered by all the atoms of 
the lattice, and the total amplitude of the scattered wave is the result 


Sec. 23] 


ELECTRON DIFFRACTION 


239 


of adding all the amplitudes of the waves scattered by all the atoms, 
with the path differences of the rays, i.e., the phase differences, taken 
into account. 

Electron diffraction. The very same phenomenon is observed when 
electrons (and also neutrons and other microparticles) are scattered 
by crystals. As we know, electrons act on a photographic plate or a 
luminescent screen in a way similar to X-rays; as a result, direct experiment 
shows that microparticles undergo diffraction that is governed 
by the same laws as the diffraction of electromagnetic waves. 

However, for this, each electron must be scattered by all the atoms 
of the lattice, because the electrons travel entirely independently of 
each other; there can be no coherence (i.e., a constant phase difference) 
between them, nor can any arise. They may even pass through the 
crystal singly (see below). Only light waves which originate from 
the same light source exhibit diffraction in such a manner; a stable 
diffraction pattern is obtained because the same wave passes through 
both apertures. If the waves passing through the apertures originated 
at different sources they could not cancel or reinforce each other at 
fixed points of the screen. The alternation of light and dark regions 
would depend on the relative phases with which the waves passed 
through the apertures; a constant phase difference cannot be maintained 
for light from different sources. 

Electron diffraction demonstrates that the laws of motion in the 
microworld are wave-like in character; to obtain the same diffraction 
pattern as for X-rays, each electron must be scattered by all the atoms of 
the lattice. This is clearly incompatible with the concept of a definite 
electron trajectory. 

Diffraction phenomena prove that electron motion is associated with 
a phase of a certain magnitude. 

The de Broglie wavelength. From diffraction experiments we can, 
without difficulty, determine the wavelength both for electrons and 
for X-rays. For electrons, it turns out to be very simply related to 
their velocities. Let us write down the expression for the wave vector 
obtained from experiment: 

$$\mathbf{k} = \frac{\mathbf{p}}{h}. \tag{23.1}$$

Here, $h$ is a universal constant having the dimensions of action spoken 
of in the preceding section (p. 229). It is equal to $1.054 \times 10^{-27}$ erg-sec. 
Earlier, it was much more common to use a constant $2\pi$ times bigger. 
The value for $h$ used in this book is frequently denoted by $\hbar$. 

The relation (23.1) was suggested in 1923 by L. de Broglie, before 
the first experiments on electron diffraction. The quantity 

$$\lambda = \frac{2\pi}{k} = \frac{2\pi h}{mv} \tag{23.2}$$

is called the de Broglie wavelength. 




Equation (23.2) shows that a certain wavelength can be ascribed 
to the motion of each body, but in the motion of macroscopic bodies 
it is extremely small as a result of the smallness of $h$ compared with 
the quantities which characterize the motion of macroscopic bodies, 
where the orders of magnitude correspond to the cgs system. Therefore, 
diffraction phenomena do not actually restrict the applicability 
of classical mechanics to macroscopic bodies. 
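
To give an idea of the orders of magnitude involved, the following sketch evaluates the de Broglie wavelength $\lambda = 2\pi h/mv$ in cgs units for an electron and for a small macroscopic body. The sample masses and velocities (an electron moving at $10^8$ cm/sec and a grain of one microgram moving at 1 cm/sec) and the function name are illustrative assumptions.

```python
import math

H = 1.054e-27   # quantum of action (the h of this book), erg*sec


def de_broglie_wavelength(mass, velocity):
    """Wavelength 2*pi*h/(m*v) in cm; mass in gm, velocity in cm/sec."""
    return 2.0 * math.pi * H / (mass * velocity)


electron = de_broglie_wavelength(9.11e-28, 1.0e8)   # of the order of 1e-7 cm
grain = de_broglie_wavelength(1.0e-6, 1.0)          # of the order of 1e-20 cm
```

The wavelength of the electron is comparable with the distances between atoms in a crystal, whereas that of the grain is many orders of magnitude smaller than any atomic dimension.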

The limits of applicability of classical concepts. The relationship 
between the quantities is entirely different when equation (23.2) 
is applied to the motion of an electron in an atom. The size of an atom 
is determined, for example, from experiments on X-ray diffraction or, 
more simply, by dividing the volume of one gram-atom of condensed 
material (solid or liquid) by the Avogadro number $N = 6.024 \times 10^{23}$. 
The atomic radius is of the order of $0.5 \times 10^{-8}$ cm. From this it is 
easy to evaluate the velocity of an electron by equating the “centrifugal 
force” to the force of attraction to the nucleus, equal to $e^2/r^2$ 
in a hydrogen atom. For the velocity, the value obtained is 

$$v = \frac{e}{\sqrt{mr}},$$

whence the wavelength (23.2) is 

$$\lambda = \frac{2\pi h}{e}\sqrt{\frac{r}{m}}.$$

Substituting the numerical values $m \approx 9 \times 10^{-28}$ gm, $e \approx 4.8 \times 10^{-10}$ CGSE, 
we convince ourselves that the wavelength is approximately six times 
larger than $r$. In other words, a distance of the order of an atomic 
diameter can accommodate one third of a wave: this corresponds to 
dimensions which are characteristic of diffraction phenomena and completely 
“smears out” the trajectory over the atom. 
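
The estimate above can be reproduced in a few lines. The sketch assumes the balance $mv^2/r = e^2/r^2$ for an electron in a hydrogen atom and uses rounded cgs values of the constants; the electron mass in particular is a standard rounded value supplied here as an assumption, and the function names are arbitrary.

```python
import math

H = 1.054e-27   # quantum of action (the h of this book), erg*sec
M = 9.11e-28    # electron mass, gm (rounded, assumed value)
E = 4.8e-10     # electron charge, CGSE
R = 0.5e-8      # atomic radius, cm


def orbital_velocity(r):
    """Velocity from m*v**2/r = e**2/r**2, i.e. v = e/sqrt(m*r)."""
    return E / math.sqrt(M * r)


def wavelength_in_atom(r):
    """De Broglie wavelength 2*pi*h/(m*v) of the electron at radius r."""
    return 2.0 * math.pi * H / (M * orbital_velocity(r))


ratio = wavelength_in_atom(R) / R   # between 6 and 7
```

The velocity comes out a little over $2 \times 10^8$ cm/sec and the wavelength about $3 \times 10^{-8}$ cm, roughly six times the atomic radius, in agreement with the estimate in the text.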

Hence the motion of an electron in an atom is wave motion. Just 
as the concept of a ray has no place in optics, if the light is propagated 
in a region comparable with the wavelength, so the concept of an electron 
trajectory becomes meaningless for the motion of an electron in 
an atom. 

The electron is not a wave, but a particle! It is necessary to warn 
the reader against certain common delusions. First of all, contrary to 
what is often written in the popular literature, the electron is never 
a wave even in quantum considerations. Without any doubt the electron 
remains a particle; for example, it is never possible to observe 
part of an electron. If a photographic plate replaces the second (rear) 
screen in the diffraction experiment, then at the point of incidence of 
each electron there will appear a single point of blackening. The point 
distribution characteristic of a diffraction pattern will also result when 




the electrons pass through a crystal singly. * Thus, it is not the electron 
that becomes a wave, but the laws of motion in the microworld that 
are wave-like in character. 

It is clear that a diffraction pattern can in no way be obtained from 
a single electron. Since each electron gives a single point on the screen, 
we must have very many separate points in order to obtain the correct 
alternation of light and dark regions on the plate. 

At the same time, diffraction would be utterly impossible if both 
screen apertures or all the crystal atoms did not actually participate 
in the passage of one and the same electron. In the diffraction experiment, 
electron trajectories simply do not exist. What is actually wave-like 
in the motion of a particle will be shown later. 

In all its properties, the electron is a particle. Its mass and charge 
always belong to it and are never divided in any diffraction experiment 
or any other experiment that we know. 

The incompatibility of trajectories and diffraction phenomena. 
Another common delusion is that the electron supposedly does possess 
a trajectory but that we are as yet unable to observe it due to imperfections 
in technical facilities, or to the inadequacy of our physical 
knowledge. In actual fact, the diffraction experiment shows that the 
electron definitely does not have a trajectory, just as diffracted light 
is not propagated in rays. To think that the development of physics 
will in future show the existence of an electron trajectory in the atom 
is just as unreasonable as to hope for the return of phlogiston in heat 
theory, or of a geocentric world system in astronomy. 

Statistical regularity and the individual experiment. The absence of 
trajectories by no means signifies that we have lost all regularity. On 
the contrary, an identical diffraction experiment, performed, of course, 
with a large number of separate electrons of definite velocity, always 
yields an identical diffraction pattern. Thus, causal regularity undoubtedly 
exists. However, it is statistical in character, appearing in a very 
large number of separate experiments, because each electron passage 
through a crystal may be regarded as a separate independent result. 

Diffraction phenomena lead to a regular distribution of points on a 
photographic plate in the same way that a large number of shots at 
a target is subject to a law of dispersion. However, as opposed to bullets 
which fly along trajectories and therefore give a smooth distribution 
curve for the places where they strike the target, the blackened grains 
on a photographic plate caused by electrons are produced in a more 
intricate manner characteristic of wave motion. The distribution of bullet 
hits is due to indeterminacy in the initial firing conditions and becomes 
less in the case of better aiming, while the random character of electron 

* In somewhat different form, this was shown by direct experiment by 
V. A. Fabrikant, N. G. Sushkin and L. M. Biberman, using currents of very 
low intensity. 




behaviour presents a perfectly regular diffraction pattern and, for a 
given electron velocity, can in no way be reduced. 

In addition, it may be noted that the statistical regularity in a 
diffraction experiment has nothing to do with the statistical regularities 
which govern the motion of a large ensemble of interacting particles. 
As has already been repeated several times, the same pattern 
is obtained completely independently of the way in which the electrons 
pass through the crystal, all at once or singly. A certain phase 
governing the motion exists only because each electron interferes with 
itself. 

Electron trajectories in a Wilson cloud chamber. It still remains to 
examine in more detail the question: In which cases do we, nevertheless, 
deal with the concept of an electron trajectory? In a cloud chamber, 
in a cathode-ray oscilloscope, and in many other instruments, 
electron trajectories can be precalculated very well from the laws of 
classical mechanics. In a cloud chamber, there even remains a cloud 
track along the line of motion of an electron.* 

First, recall that under certain conditions light is also propagated 
along definite trajectories (rays). Geometrical optics is applicable when 
the inaccuracy in defining the wave vector $\Delta k_x$, subject to the inequality 

$$\Delta k_x\,\Delta x \geq 2\pi \tag{23.3}$$

[see (18.9)], is small in comparison with $k$. Substituting $\Delta k_x$ for an 
electron from equation (23.1), we obtain the analogous expression of 
quantum theory: 

$$\Delta p_x\,\Delta x \sim 2\pi h.^{**} \tag{23.4}$$

This is the so-called uncertainty relation of quantum mechanics. The 
concept of an electron trajectory has reasonable meaning if the uncertainties 
of all three momentum components $\Delta p_x$, $\Delta p_y$, $\Delta p_z$ are 
small compared with the momentum itself: 

$$\Delta p_x \ll p_x, \qquad \Delta p_y \ll p_y, \qquad \Delta p_z \ll p_z. \tag{23.5}$$

It may be pointed out that we have all along been saying “electron” 
simply to be specific. The same applies to a proton, neutron, meson 
and the like. 

Let us suppose that the track of an electron in a cloud chamber is 
0.01 cm wide and the electron energy is equal to 1,000 ev $= 1.6 \times 10^{-9}$ 
erg (1 electron-volt $= \frac{4.8 \times 10^{-10}}{300} = 1.6 \times 10^{-12}$ erg). According to (23.4) 


* An electron passing through a gas ionizes the atoms in its path. The 
supersaturated vapour with which the chamber is filled condenses on the ions. 
Upon illumination, the droplets appear in the form of a cloud track. 

** Sometimes, $\Delta p_x$, $\Delta x$ are meant to signify not the “spreadings” $p_x$ and $x$ 
in themselves, but their mean square values. Then, $\Delta p_x\,\Delta x > h$. 




the component of momentum perpendicular to the trajectory has an 
indeterminacy 

$$\Delta p_x \sim \frac{6.6 \times 10^{-27}}{10^{-2}} = 6.6 \times 10^{-25},$$

and the momentum itself is 

$$p = \sqrt{2m\mathcal{E}} = \sqrt{2.9 \times 10^{-36}} = 1.7 \times 10^{-18}.$$

It follows that the relationships (23.5) are satisfied to an accuracy 
of up to four parts in ten million. The observation of a track in a 
cloud chamber does not allow us to determine the trajectory with 
accuracy sufficient to notice deviations (in electron motion) from the 
laws of classical mechanics. 
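
The arithmetic of this estimate is collected in the sketch below. It assumes the relation $\Delta p_x\,\Delta x \sim 2\pi h$ for the indeterminacy of the transverse momentum and the nonrelativistic expression $p = \sqrt{2m\mathcal{E}}$; the rounded electron mass and the function names are assumptions made for the illustration.

```python
import math

H = 1.054e-27          # quantum of action (the h of this book), erg*sec
M = 9.11e-28           # electron mass, gm (rounded, assumed value)
ERG_PER_EV = 1.6e-12   # one electron-volt in ergs


def momentum_indeterminacy(track_width):
    """Delta p of the order of 2*pi*h/(Delta x); width in cm, result in gm*cm/sec."""
    return 2.0 * math.pi * H / track_width


def momentum(energy_ev):
    """Nonrelativistic momentum sqrt(2*m*E) for an energy given in ev."""
    return math.sqrt(2.0 * M * energy_ev * ERG_PER_EV)


ratio = momentum_indeterminacy(0.01) / momentum(1000.0)   # about 4e-7
```

The ratio is about four parts in ten million, so that the inequalities (23.5) hold with a very large margin for such a track.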

The limitations of the concepts of classical mechanics. Thus, quantum 
mechanics does not abolish classical mechanics, but contains it as a 
limiting case, much the same as wave optics includes the geometrical 
optics of light rays as a limiting case. As we shall see later, quantum 
mechanics is concerned with the same quantities as classical mechanics, 
i.e., energy, momentum, coordinates, moment. But the finiteness 
of the quantum of action $h$ imposes a limitation on the applicability 
of any two classical concepts (for example, coordinates and momentum) 
for one and the same motion. 

The coordinate and momentum of an electron cannot simultaneously 
have precise values because the motion is wave-like. To attempt to 
define these precise values is just as meaningless physically as it is 
to seek precise trajectories for light rays in wave optics. In the same 
way that it is impossible to obtain, as a result of improvements in 
optical devices, a precise definition for light rays in wave optics, any 
progress in measuring techniques as applied to the electron will not 
allow us to determine its trajectory more precisely than indicated by 
the relation (23.4), since strictly speaking the trajectory does not exist. 

Attempts are sometimes made to interpret relation (23.4) erroneously.
It is taken that a trajectory cannot be determined because
the precision of the initial conditions does not exceed $\Delta p_x$
and $\Delta x$ connected by relation (23.4). This would mean that some
actual trajectory does exist but that it lies within a more or less
narrow region of space and within a certain range of momenta. The
“real” trajectory is likened to the imaginary trajectory from a gun to
a target before firing. The path of the bullet is not precisely known
beforehand, if only because strictly identical powder charges cannot
be obtained. But this inaccuracy in the initial conditions for the
bullet only leads to a smooth dispersion curve for the hits on the
target, while the distribution of electrons indicates diffraction
effects. The presence of diffraction shows that no
“real-though-unknown-to-us” trajectory exists. As a matter of fact,
relation (23.4) by no means indicates with what error certain
quantities may be measured simultaneously, but to what extent these
quantities have precise meaning in the given motion. It is this that
the uncertainty principle of quantum mechanics expresses. The term
“uncertainty” emphasizes the fact that what we are concerned with is
not accidental errors of measurement or the imperfection of physical
apparatus, but the fact that the momentum and coordinate of a particle
do not actually exist in one and the same state.


Sec. 24. The Wave Equation 

The wave function. Diffraction of light occurs because the wave
amplitudes are added. When the wave phases coincide the intensity
(which is proportional to the square of the resultant amplitude) is
maximum; when the phases are opposite the intensity is minimum.
In the diffraction of electrons, a quantity similar to intensity is
measured by the blackening of a plate, which blackening is proportional
to the number of electrons incident on unit area. The distribution of
the blackened grains on the plate obeys the same law as in the case
of the diffraction of X-rays (in the sense of alternation of maxima and
their relative positions). Thus, in order to explain the diffraction of
electrons we must assume that with their motion there can be
associated some wave function whose phase determines the diffraction
pattern.

At the end of this section, we will show in the general case that such 
a wave function must be complex, since a real wave function cannot 
correspond to just any type of motion. 

Probability density. Electrons move independently of one another
and pass through a crystal singly, as it were. Therefore, the number
of electrons in an element of volume dV is proportional to the
probability of the appearance of one electron. Probability (a quantity
similar to light intensity) must be quadratic with respect to the wave
function, in the same way that light intensity is quadratic in wave
amplitude. But since probability is a real quantity, it can only depend
on the square of the modulus of the wave function. Let us put

$$dw = |\psi(x, y, z, t)|^2\,dV. \tag{24.1}$$

Here dw is the probability of finding an electron in the volume dV
at the instant t; then $|\psi|^2$ is the probability referred to unit
volume or, otherwise, the probability density.

The linearity of the wave equation. As in optics, where the laws
of propagation of the wave itself are studied on the basis of Maxwell’s
equations, the intensity being found by squaring the wave amplitude,
it is necessary in quantum mechanics to find the equation governing
the quantity $\psi$ and not the probability density. This equation must
be linear. Indeed, two interfering waves, when combined, give a
resultant wave. In order to obtain the same interference pattern as in
optics, it is necessary to perform a simple algebraic addition of the
functions; both the summands and the sum must satisfy the same wave
equation. But only the solutions of linear equations satisfy this
requirement. The phases of the waves are very essential here since in
order to formulate the laws of diffraction it is necessary to know not
only the behaviour of the squares of the amplitudes but also that of
their phases.

In other words, we must have an equation for the wave function
itself of the particle $\psi(x, y, z, t)$.
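Why the amplitudes, and not the probabilities, must be added can be seen in a small numerical sketch. It assumes two one-dimensional waves of unit amplitude, $e^{ik_1x}$ and $e^{ik_2x}$; the function names are illustrative only.

```python
import cmath
import math

def plane_wave(k, x):
    """Complex wave of unit amplitude, exp(i k x)."""
    return cmath.exp(1j * k * x)

def intensity_of_sum(k1, k2, x):
    """|psi1 + psi2|^2: the amplitudes are added first, then squared."""
    return abs(plane_wave(k1, x) + plane_wave(k2, x)) ** 2

def sum_of_intensities(k1, k2, x):
    """|psi1|^2 + |psi2|^2: no interference term, no diffraction pattern."""
    return abs(plane_wave(k1, x)) ** 2 + abs(plane_wave(k2, x)) ** 2

# Two counter-propagating waves: maxima and minima alternate along x.
for x in (0.0, math.pi / 4, math.pi / 2):
    print(x, intensity_of_sum(1.0, -1.0, x), sum_of_intensities(1.0, -1.0, x))
```

The first quantity oscillates between 4 and 0, as a diffraction pattern does, while the second stays equal to 2 everywhere.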

The wave function of a free particle. Proceeding from the analogy
between geometrical optics and mechanics, it is easy to construct a
wave corresponding to a free particle not subject to the action of
external forces. We know that the state of a free particle is
characterized by its momentum $\mathbf{p}$. But, in accordance with
relationship (23.1), a wave with wave vector
$\mathbf{k} = \frac{\mathbf{p}}{h}$ corresponds to a particle with
momentum $\mathbf{p}$. It follows that the wave function of a free
particle depends on coordinates in the following way (we write it in
complex form):

$$\psi \sim e^{i\mathbf{k}\mathbf{r}} = e^{i\frac{\mathbf{p}\mathbf{r}}{h}}.$$

The time dependence of a wave function is also determined very
simply, if we recall that the frequency of a wave corresponds to the
particle energy (Sec. 22). The proportionality factor between them
has the dimensions of action. As was shown at the end of Sec. 22, the
factor of proportionality must be the same as between the wave vector
and momentum; this follows from the condition of relativistic
invariance of the correspondence between mechanical and optical
quantities.* Hence

$$\omega = \frac{\mathcal{E}}{h}. \tag{24.2}$$

Whence we obtain the wave function of a free particle

$$\psi = e^{-i\omega t + i\mathbf{k}\mathbf{r}} = e^{-i\frac{\mathcal{E}t}{h} + i\frac{\mathbf{p}\mathbf{r}}{h}}. \tag{24.3}$$

The group velocity of the waves is

$$\mathbf{v} = \frac{\partial\omega}{\partial\mathbf{k}} = \frac{\partial\mathcal{E}}{\partial\mathbf{p}}. \tag{24.4}$$

Hence, it coincides with the particle velocity as it should in accordance
with (22.12), thereby confirming (24.2).
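The statement about the group velocity can be checked numerically. The sketch below assumes the nonrelativistic relation $\mathcal{E} = p^2/2m$, so that $\omega(k) = hk^2/2m$, and uses rounded CGS values for the quantum of action and the electron mass; both values are assumptions of the sketch.

```python
H = 1.054e-27   # quantum of action h in erg*sec (rounded; assumed value)
M = 9.1e-28     # electron mass in g (rounded; assumed value)

def omega(k):
    """Frequency from (24.2) with E = p^2 / 2m and p = h k."""
    return H * k * k / (2.0 * M)

def group_velocity(k, rel_step=1e-6):
    """Central-difference estimate of d(omega)/dk."""
    dk = rel_step * k
    return (omega(k + dk) - omega(k - dk)) / (2.0 * dk)

k = 1.0e8                       # wave vector in 1/cm
v_group = group_velocity(k)
v_particle = H * k / M          # p / m
print(v_group, v_particle)
```

The two printed velocities agree to within the rounding error of the difference quotient.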

* Or at least from invariance to Galilean transformations.

The relation between wave function and action. We note that the
wave function (24.3) may be written in the form

$$\psi = e^{i\frac{S}{h}}, \tag{24.5}$$

where S is the action of the particle. Indeed, the action of a free
particle is

$$S = -\mathcal{E}t + \mathbf{p}\mathbf{r}, \tag{24.6}$$

because from this we obtain

$$\mathbf{p} = \frac{\partial S}{\partial\mathbf{r}} \equiv \nabla S, \qquad \mathcal{E} = -\frac{\partial S}{\partial t},$$

as should be the case according to equations (22.9) and (22.11).

Equation (24.6) confirms the relationship established also in Sec. 22
between the wave phase and the action of a particle.

The wave equation for a free particle. The analogy between mechanics
and optics by no means presupposes that the equations of mechanics
are written in a relativistically invariant form. It is sufficient to
recall that the analogy had already been established by Hamilton in
1825. The significance of the analogy consists in the fact that a
correspondence is established between quantities: momentum and wave
vector, energy and frequency, action and phase. In future, except in
those cases where it is specifically stated otherwise, we shall proceed
from the nonrelativistic form of the equations of mechanics.

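Let us now find the differential equation satisfied by the wave
function (24.3). We have

$$\frac{\partial\psi}{\partial t} = -\frac{i\mathcal{E}}{h}\,\psi, \tag{24.7}$$

$$\frac{\partial\psi}{\partial x} = \frac{i p_x}{h}\,\psi, \qquad \frac{\partial^2\psi}{\partial x^2} = -\frac{p_x^2}{h^2}\,\psi, \tag{24.8}$$

and similarly for the derivatives with respect to y and z.
From (24.7) and (24.8) we obtain

$$-\frac{h}{i}\frac{\partial\psi}{\partial t} = -\frac{h^2}{2m}\left(\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2}\right) = \frac{p^2}{2m}\,\psi \tag{24.9}$$

(here we are already using nonrelativistic expressions!), or in
abbreviated form

$$-\frac{h}{i}\frac{\partial\psi}{\partial t} = -\frac{h^2}{2m}\,\Delta\psi, \tag{24.10}$$

where $\Delta$ is the Laplacian operator. Equation (24.10) holds because
$\mathcal{E} = \frac{p^2}{2m}$, as can be seen from (24.9).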
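That the plane wave (24.3) satisfies (24.10) only when $\mathcal{E} = p^2/2m$ can be verified by finite differences. The following is a one-dimensional sketch in units where h = m = 1 (a choice made here for convenience); the step size and sample point are arbitrary.

```python
import cmath

H = 1.0   # quantum of action, set to 1 for this sketch
M = 1.0   # particle mass, set to 1 for this sketch

def psi(x, t, p, energy):
    """One-dimensional plane wave (24.3): exp(-i E t / h + i p x / h)."""
    return cmath.exp(1j * (p * x - energy * t) / H)

def residual(x, t, p, energy, step=1e-4):
    """|left side - right side| of (24.10), derivatives by central differences."""
    dpsi_dt = (psi(x, t + step, p, energy) - psi(x, t - step, p, energy)) / (2 * step)
    d2psi_dx2 = (psi(x + step, t, p, energy) - 2 * psi(x, t, p, energy)
                 + psi(x - step, t, p, energy)) / step ** 2
    left = -(H / 1j) * dpsi_dt
    right = -(H ** 2 / (2 * M)) * d2psi_dx2
    return abs(left - right)

p = 2.0
print(residual(0.3, 0.7, p, p * p / (2 * M)))   # E = p^2/2m: residual near zero
print(residual(0.3, 0.7, p, 3.0))               # any other E: residual of order unity
```

With the correct energy the residual is limited only by the discretization; with any other value the two sides differ by $|\mathcal{E} - p^2/2m|$.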

The Schrödinger equation. Let us now generalize equation (24.10)
to the case of a particle moving in an external potential field U. In
order to have a relation analogous to
$\mathcal{E} = \frac{p^2}{2m} + U$, since $\mathcal{E} = \frac{p^2}{2m}$
for a free particle in (24.9), we must put

$$-\frac{h}{i}\frac{\partial\psi}{\partial t} = -\frac{h^2}{2m}\,\Delta\psi + U\psi. \tag{24.11}$$

E. Schrödinger formulated this equation in 1925, generalizing the
de Broglie relations for free electrons to the case of bound electrons
(this was also before the discovery of electron diffraction).

Equation (24.11) directly follows from (24.10) for the simplest case
U = const, because then it is satisfied by the same substitution of
(24.3), though with the momentum value
$p = \sqrt{2m(\mathcal{E} - U)}$. From here it is only one step to
generalization to the case of variable potential energy.

But this generalization must in no way be regarded as the derivation
of the Schrödinger equation from the equations of prequantum
physics, for it expresses a new physical law.

Its relationship to classical physics can be seen in the limiting
transition, which is fully analogous to the transition from wave optics
to geometrical optics.

The limiting transition to classical mechanics. Let us substitute the
expression for the wave function (24.5) in (24.11). This expression must
hold in the above limiting transition because then the wave phase
surfaces $\varphi$ = const correspond to surfaces of constant action
S = const for particles:

$$\varphi = \frac{S}{h}. \tag{24.12}$$

Instead of the formal relationship considered in Sec. 22, we now
have an equality, since we have introduced a new, universal physical
constant h. Thus, we put

$$\psi = e^{i\frac{S}{h}}.$$

Whence

$$\frac{\partial\psi}{\partial t} = \frac{i}{h}\frac{\partial S}{\partial t}\,\psi, \qquad \frac{\partial\psi}{\partial x} = \frac{i}{h}\frac{\partial S}{\partial x}\,\psi,$$

$$\frac{\partial^2\psi}{\partial x^2} = \frac{i}{h}\frac{\partial^2 S}{\partial x^2}\,\psi - \frac{1}{h^2}\left(\frac{\partial S}{\partial x}\right)^2\psi.$$

Let us substitute these derivatives in (24.11). Then, after eliminating
$\psi$, we obtain the equality

$$-\frac{\partial S}{\partial t} = \frac{1}{2m}(\nabla S)^2 - \frac{ih}{2m}\,\Delta S + U. \tag{24.13}$$

The limiting transition from quantum to classical mechanics is
attained by considering the de Broglie wavelength very small compared
with the region in which motion occurs. Since the wavelength
is proportional to the quantum of action h, the same limiting transition
may be performed formally by considering that h tends to zero.
This means that all the quantities having the dimensions of action
are so large compared with the quantum h, that the latter can be
neglected. In (24.13), passing to the limit h = 0, we have

$$-\frac{\partial S}{\partial t} = \frac{(\nabla S)^2}{2m} + U, \tag{24.14}$$

or $\mathcal{E} = \frac{p^2}{2m} + U$ from (22.9) and (22.11).

The limiting transition that we have performed here almost completely
repeats the transition from wave optics to geometrical optics
carried out in Sec. 22.

The correspondence between classical and quantum theory. We
have seen that the Schrödinger wave equation does in fact give a
correct limiting transition. This equation is, as it were, a fourth
member in the following correspondence:

    geometrical optics  --->  classical mechanics
            |                          |
            v                          v
       wave optics      --->  quantum mechanics


The vertical arrows denote a transition from rays or trajectories 
to wave patterns, while the horizontal arrows denote a transition 
from waves to particles. The latter relates only to nonquantum 
electrodynamics because in the transition to quantum field equations 
the need arises for a corpuscular representation (see Sec. 27). Here, 
we consider only the analogy between quantum mechanics and 
classical wave optics. 

The range of application of various theories. The regions in which 
quantum mechanics and wave optics can be applied do not overlap 
anywhere; in wave optics or, what is just the same, in electrodynamics, 
the velocity of light c is regarded as finite but the quantum of action h 
is considered arbitrarily small. In nonrelativistic quantum mechanics, 
c is considered arbitrarily large while h has a finite value. A quantum 
theory of the electromagnetic field, in which both h and c have finite 
values (i.e., the velocity ranges are comparable with c, and quantities 
with the dimensions of action are comparable with h), has, in essentials, 
also been completed at the present time. At any rate, any concrete 
problem requiring the application of quantum electrodynamics, may 
be uniquely solved to any required degree of precision, and the results 
agree with experiment. The existence of a light quantum as an
independent particle is not a supplementary hypothesis which must
be made in order to formulate quantum electrodynamical equations. 
The consistent quantization of electromagnetic field equations 
necessarily leads to the corpuscular aspect of the theory (for more 
detail see Sec. 27). 




The nonrelativistic particle quantum mechanics (i.e., constructed
on the relation $\mathcal{E} = \frac{p^2}{2m} + U$) is, in the region
for which it is applicable, a theory which is just as consummate as
Newtonian mechanics.
Like the equations of Newtonian mechanics, the wave equation 
(24.11) is valid only for particle velocities small compared with the 
velocity of light. But still, in the region for which it is applicable, 
it is just as firmly established (in the same sense) as are the Newtonian 
laws for the motion of macroscopic bodies. 

The grounds for this are absolutely the same: both nonrelativistic
quantum theory and Newtonian mechanics agree with the widest 
range of experimental data, never contradicting them and providing 
for correct and unique predictions. In addition, they nowhere contain 
contradictory statements. The latter condition is, of course, not 
sufficient for a physical theory to be correct but it is at least necessary. 
The Bohr theory, or the old quantum mechanics (as it is otherwise 
known) did not satisfy this requirement; in addition to the classical 
concept of trajectory, it involved the quantum concept of discreteness 
of states. For this reason, it had always been clear that the Bohr 
formulation of quantum theory was not final and should be revised, 
no matter how wide the range of experimental facts that it explained. 

Quantum mechanics permitted the construction of a consistent 
theory of atomic structure. The actual calculation of wave functions 
for electrons in complex atoms is a problem of enormous mathematical 
complexity.* However, it is, of course, by no means the purpose 
of quantum mechanics to calculate the spectra of complex atoms; 
the essential point is that quantum mechanics allows us to systematize 
atomic and molecular states in such a way that the very nature 
of the spectra is understood, whereas classical mechanics could not 
explain even the stability of the atom. Thanks to quantum mechanics 
such fundamentally important facts as the chemical affinity of atoms 
or Mendeleyev’s periodic law are now understood. 

In its domain, quantum mechanics will, of course, perfect methods 
of approach to various concrete problems. The correctness of its 
general principles will serve as a basis for such refinement. 


* It is considerably simplified thanks to approximation methods suggested
by V. A. Fok.

The normalization condition for a wave function. Let us return
to the wave equation (24.11). We shall write it for a wave function
$\psi$ and a conjugate function $\psi^*$ (in the second equation we
have to replace $-i$ by $i$):

$$-\frac{h}{i}\frac{\partial\psi}{\partial t} = -\frac{h^2}{2m}\,\Delta\psi + U\psi,$$

$$\frac{h}{i}\frac{\partial\psi^*}{\partial t} = -\frac{h^2}{2m}\,\Delta\psi^* + U\psi^*.$$

Let us multiply the first equation by $\psi^*$ and the second by
$\psi$, and subtract the second from the first. The term
$U\psi^*\psi$ is eliminated and the remaining terms give

$$-\frac{h}{i}\left(\psi^*\frac{\partial\psi}{\partial t} + \psi\frac{\partial\psi^*}{\partial t}\right) = -\frac{h^2}{2m}\left(\psi^*\Delta\psi - \psi\Delta\psi^*\right). \tag{24.15}$$

The left-hand side of the last equality is transformed to the form

$$-\frac{h}{i}\frac{\partial}{\partial t}(\psi^*\psi) = -\frac{h}{i}\frac{\partial}{\partial t}|\psi|^2.$$

We can write the right-hand side more fully thus:

$$-\frac{h^2}{2m}(\psi^*\Delta\psi - \psi\Delta\psi^*) = -\frac{h^2}{2m}(\psi^*\,\mathrm{div}\,\mathrm{grad}\,\psi - \psi\,\mathrm{div}\,\mathrm{grad}\,\psi^*) = -\frac{h^2}{2m}\,\mathrm{div}\,(\psi^*\,\mathrm{grad}\,\psi - \psi\,\mathrm{grad}\,\psi^*)$$

[see (11.27)]. Finally, we represent the equality obtained in the
following form:

$$\frac{\partial}{\partial t}|\psi|^2 = -\mathrm{div}\left\{\frac{h}{2mi}(\psi^*\nabla\psi - \psi\nabla\psi^*)\right\}. \tag{24.16}$$

The left-hand side of this equality is the time derivative of the
probability density of finding a particle close to some point of space.
Let us integrate (24.16) over the whole volume in which the particle
might be situated. If this volume is finite then beyond its boundaries
$\psi$ and $\psi^*$ must be equal to zero. But then, from the
Gauss-Ostrogradsky theorem

$$\frac{\partial}{\partial t}\int |\psi|^2\,dV = 0. \tag{24.17}$$

It follows that the integral itself does not depend upon time. It is
easy to see that it must be equal to unity because this is the
probability of an electron being somewhere, i.e., the probability of a
certain event. The condition

$$\int |\psi|^2\,dV = 1 \tag{24.18}$$

is called the normalization condition of a wave function.

If the electron motion is infinite, i.e., $\psi$ nowhere becomes zero,
the normalization condition appears more complicated. However,
in practice, it is always possible to consider that the volume in which
the electron is situated is very large and finite, so that the condition
(24.18) can be used. The physical results, naturally, do not depend
upon the arbitrary choice of volume.

Probability flux density. If we integrate (24.16) over any arbitrary
volume, we obtain

$$\frac{\partial}{\partial t}\int |\psi|^2\,dV = -\int \mathrm{div}\left\{\frac{h}{2mi}(\psi^*\nabla\psi - \psi\nabla\psi^*)\right\}dV = -\oint \frac{h}{2mi}(\psi^*\nabla\psi - \psi\nabla\psi^*)\,d\mathbf{s}. \tag{24.19}$$

If on the left we have the change in the probability of finding
an electron inside the given volume, then on the right-hand side
we must have the flux of the probability of it passing through the
boundary surface of the volume. According to (24.19) the density
of the probability flux is equal to

$$\mathbf{j} = \frac{h}{2mi}(\psi^*\nabla\psi - \psi\nabla\psi^*). \tag{24.20}$$

It follows that a real wave function gives $\mathbf{j} = 0$, i.e., it
cannot be used for describing the current of an electron. Therefore, in
a general definition, the wave function $\psi$ must be a complex
quantity.
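The difference between a complex and a real wave function with respect to the flux can be illustrated numerically. The sketch below evaluates the one-dimensional form of (24.20), $j = \frac{h}{2mi}\left(\psi^*\frac{d\psi}{dx} - \psi\frac{d\psi^*}{dx}\right)$, by a central difference, in units where h = m = 1 (a choice made for this sketch).

```python
import cmath
import math

H = 1.0   # quantum of action, set to 1 for this sketch
M = 1.0   # particle mass, set to 1 for this sketch

def flux(psi, x, step=1e-5):
    """One-dimensional probability flux density (24.20) at the point x."""
    value = psi(x)
    derivative = (psi(x + step) - psi(x - step)) / (2 * step)
    j = (H / (2 * M * 1j)) * (value.conjugate() * derivative
                              - value * derivative.conjugate())
    return j.real   # the bracket is purely imaginary, so j is real

p = 3.0
travelling = lambda x: cmath.exp(1j * p * x / H)   # complex plane wave
standing = lambda x: math.sin(p * x / H)           # real wave function

print(flux(travelling, 0.4))   # close to p / m
print(flux(standing, 0.4))     # zero
```

For the plane wave the flux equals the velocity p/m times the (unit) probability density; for the real function the two terms in the bracket cancel identically.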

The equation for stationary states. Let us assume that potential
energy does not depend explicitly upon time. Then classical mechanics
leads to conservation of energy of the system. The action of such
a system involves a term $-\mathcal{E}t$. But since
$\psi = e^{i\frac{S}{h}}$, in quantum mechanics, too, we must seek a
wave function in the form

$$\psi = e^{-i\frac{\mathcal{E}t}{h}}\,\psi_0(x, y, z). \tag{24.21}$$

Substituting this in (24.11) and omitting the zero subscript, we obtain
the equation

$$-\frac{h^2}{2m}\,\Delta\psi + U\psi = \mathcal{E}\psi. \tag{24.22}$$

As we shall see in the next section, this equation does not have a
solution satisfying certain necessary conditions for every value
of $\mathcal{E}$. Thus, it turns out that, in contrast to the energy in
classical mechanics, the energy of a quantum system cannot always be
arbitrarily given.

Exercise

Prove that if there are two solutions of (24.22) for different values of energy
$\mathcal{E}$ and $\mathcal{E}'$, then

$$\int \psi^*(\mathbf{r}, \mathcal{E})\,\psi(\mathbf{r}, \mathcal{E}')\,dV = 0.$$

The functions $\psi^*(\mathbf{r}, \mathcal{E})$ and
$\psi(\mathbf{r}, \mathcal{E}')$ satisfy the equations

$$-\frac{h^2}{2m}\,\Delta\psi^* + U\psi^* = \mathcal{E}\psi^*, \qquad -\frac{h^2}{2m}\,\Delta\psi + U\psi = \mathcal{E}'\psi.$$

Let us multiply the first equation by $\psi$, the second by $\psi^*$, and subtract the
second from the first. Integrating over the whole volume, similar to (24.19),
and then transforming the volume integral on the left to a surface integral,
we obtain the equation

$$(\mathcal{E} - \mathcal{E}')\int \psi^*(\mathbf{r}, \mathcal{E})\,\psi(\mathbf{r}, \mathcal{E}')\,dV = 0.$$

It follows that if $\mathcal{E} \neq \mathcal{E}'$ then the second factor is equal to zero as required.

This is the so-called orthogonality property of wave functions. It will be shown
in more general form in Sec. 30 because it forms one of the most important
principles of quantum mechanics.


Sec. 25. Certain Problems of Quantum Mechanics 


In this section we shall obtain solutions to the wave equation 
for certain cases which are partly illustrative and partly auxiliary. 
Nevertheless, many important laws are explained from these examples. 

We have already obtained the solution of the wave equation for 
a free particle (24.3). We shall now examine the solutions for bound 
particles. 

A particle in a one-dimensional, infinitely deep potential well. Let
us suppose that a particle is constrained to move in one dimension
remaining in an interval of length a, so that 0 < x < a. We can imagine
that at the points x = 0 and x = a, there are absolutely impenetrable
walls which reflect the particle. A limitation of this type is
represented with the aid of the potential energy curve shown in
Fig. 36. $U = \infty$ at x < 0 and x > a. We put U = 0 at 0 < x < a;
this is the potential energy gauge. To leave the region 0 < x < a, a
particle would have to perform an infinitely large quantity of work.
Thus, the probability for the particle to be at x = 0 or x = a is equal
to zero. With the aid of (24.1), we obtain

$$\psi(0) = \psi(a) = 0. \tag{25.1}$$

These boundary conditions may also be justified by means of a limiting
transition from a well of finite depth. This will be done later.

Insofar as the potential energy is time independent, the wave
equation must be written in the form of (24.22). The motion is
one-dimensional and, therefore, we must take the total derivative
$\frac{d^2}{dx^2}$ in place of $\Delta$. From this we have

$$-\frac{h^2}{2m}\frac{d^2\psi}{dx^2} = \mathcal{E}\psi. \tag{25.2}$$

We introduce the shortened notation

$$\kappa^2 = \frac{2m\mathcal{E}}{h^2}, \tag{25.3}$$

so that the wave equation will be of the form

$$\frac{d^2\psi}{dx^2} = -\kappa^2\psi. \tag{25.4}$$

The solution to (25.4) is well known:

$$\psi = C_1\sin\kappa x + C_2\cos\kappa x. \tag{25.5}$$

But from (25.1) $\psi(0) = 0$ so that the cosine term must be omitted
by putting $C_2 = 0$. There remains

$$\psi = C_1\sin\kappa x. \tag{25.6}$$

We now substitute the second boundary condition

$$\psi(a) = C_1\sin\kappa a = 0. \tag{25.7}$$

This is an equation in $\kappa$. It has an infinite number of solutions:

$$\kappa a = (n + 1)\pi, \tag{25.8}$$

where n is any integer equal to or greater than zero:

$$0 \leq n < \infty. \tag{25.9}$$

We discard the value n = −1 because, for n = −1, the wave function
becomes zero everywhere, $\psi = \sin 0 = 0$. Hence, $|\psi|^2 = 0$, so
that the particle simply does not exist anywhere (a “trivial
solution”). Now substituting $\kappa$ from the definition (25.3) and
solving (25.8) with respect to energy, we find an expression for the
energy

$$\mathcal{E}_n = \frac{\pi^2 h^2 (n + 1)^2}{2ma^2}. \tag{25.10}$$
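To get a feeling for the magnitudes involved, the energy levels that follow from (25.3) and (25.8), $\mathcal{E}_n = \pi^2h^2(n+1)^2/2ma^2$, can be evaluated for an electron in a well of atomic width. The well width of $10^{-8}$ cm and the rounded CGS constants are assumptions chosen for this sketch, not values from the text.

```python
import math

H = 1.054e-27          # quantum of action h in erg*sec (rounded; assumed value)
M_ELECTRON = 9.1e-28   # electron mass in g (rounded; assumed value)
ERG_PER_EV = 1.6e-12   # one electron-volt in erg

def well_energy(n, width_cm, mass=M_ELECTRON):
    """Energy eigenvalue in erg for n = 0, 1, 2, ... (kappa*a = (n + 1)*pi)."""
    if n < 0:
        raise ValueError("n must be zero or a positive integer")
    return (math.pi * H * (n + 1)) ** 2 / (2.0 * mass * width_cm ** 2)

width = 1.0e-8   # cm, roughly an atomic dimension (assumed for illustration)
for n in range(4):
    print(n, well_energy(n, width) / ERG_PER_EV, "ev")
```

The levels grow as $(n+1)^2$, and for a well of atomic size the spacing is of the order of tens of electron-volts.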


Eigenvalues of energy and eigenfunctions. The boundary condition 
imposed on a wave function is, for a given problem, just as necessary 
as the wave equation itself. However, as can be seen from (25.10), 
the boundary condition is not satisfied for all values of the energy, 
but only for values belonging to a definite series of numbers by which 
the problem under consideration is given. It will be seen later on
that, depending upon the conditions, these numbers may form a 
discrete series or a continuous sequence. They are called eigenvalues 
of the energy of a quantum mechanical system. The wave functions 
belonging to the energy eigenvalues are called eigenfunctions. 

The foregoing example is the simplest one in which the energy
eigenvalues form a discrete series. The energy of a free particle forms
a continuous series of eigenvalues. Indeed, the only condition which
can be imposed on the wave function of a free particle consists in
the fact that it must be finite everywhere, because the square of
its modulus is the probability density of finding the particle at a
given point in space. But the function (24.3) remains finite for all
real values of $\mathbf{p}$ and $\mathcal{E}$.

The assembly of energy eigenvalues of a particle is termed its 
energy spectrum. The energy spectrum in an infinitely deep potential 
well is discrete, while the energy spectrum of a free particle is 
continuous. 

The solution of Schrödinger’s equation (24.22) for stationary states
is always associated with finding the energy spectrum. In contrast 
to the Bohr theory, where the discreteness of the states appeared 
as a necessary but foreign appendage to classical motion, in quantum 
mechanics the very nature of the motion determines the energy 
spectrum. This will be seen especially clearly in the examples to follow. 

The nodes or zeros of a wave function. The function $\psi$ becomes
zero n times in the interval 0 < x < a (except at its ends). The number
of zeros (“nodes”) of a wave function is equal to the subscript of the
energy eigenvalue.

This result is easily understood from the following considerations.
In the interval 0 < x < a there is one sinusoidal half-wave for n = 0;
for n = 1, there is one complete wave; for n = 2, there is one and a
half waves, etc. Thus, the greater n is, the less the de Broglie
wavelength $\lambda$. But energy is proportional to the square of the
momentum, i.e., it is inversely proportional to the square of
$\lambda$, in accordance with (23.2). Therefore, the less $\lambda$ is,
the greater the energy. This conclusion holds, of course, for wave
functions that are not of a purely sinusoidal shape, though not as a
general quantitative relationship but, instead, qualitatively: the more
zeros or “nodes” that the wave function has, the greater the energy.
The least energy state has no zeros anywhere except at the limits of
the interval x = 0 and x = a.
It is called the ground state, all the other states being termed excited.
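The node count can be confirmed by counting sign changes of $\sin\frac{(n+1)\pi x}{a}$ on a grid of interior points. This is only an illustrative sketch; the grid size is an arbitrary choice, taken prime so that no grid point coincides with a node.

```python
import math

def count_nodes(n, points=997):
    """Count sign changes of sin((n + 1)*pi*x/a) strictly inside 0 < x < a.

    The grid is x_i = a*i/points, i = 1 .. points - 1. Because `points` is
    prime, no grid point falls exactly on a zero as long as n + 1 < points.
    """
    values = [math.sin((n + 1) * math.pi * i / points) for i in range(1, points)]
    return sum(1 for left, right in zip(values, values[1:]) if left * right < 0)

for n in range(6):
    print(n, count_nodes(n))   # the second column repeats the first
```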

Normalization of a wave function. It remains to determine the
coefficient $C_1$ in order to define a wave function completely. We
shall find it from the normalization condition (24.18):

$$\int_0^a |\psi|^2\,dx = C_1^2\int_0^a \sin^2\kappa x\,dx = C_1^2\int_0^a \frac{1 - \cos 2\kappa x}{2}\,dx = C_1^2\left[\frac{x}{2} - \frac{\sin 2\kappa x}{4\kappa}\right]_0^a = \frac{C_1^2 a}{2} = 1. \tag{25.11}$$

The second term of the integrated expression becomes zero at both
limits in accordance with (25.8). Thus,

$$\psi = \sqrt{\frac{2}{a}}\,\sin\frac{(n + 1)\pi x}{a}. \tag{25.12}$$
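Both the normalization and the orthogonality property of the Exercise in Sec. 24 can be checked numerically for the functions $\sqrt{2/a}\,\sin\frac{(n+1)\pi x}{a}$. The sketch below uses a simple midpoint rule; the number of sample points and the widths tried are arbitrary choices.

```python
import math

def psi(n, x, a):
    """Normalized eigenfunction of the infinitely deep well, n = 0, 1, 2, ..."""
    return math.sqrt(2.0 / a) * math.sin((n + 1) * math.pi * x / a)

def overlap(n1, n2, a=1.0, samples=2000):
    """Midpoint-rule estimate of the integral of psi_n1 * psi_n2 over 0 < x < a."""
    dx = a / samples
    return sum(psi(n1, (i + 0.5) * dx, a) * psi(n2, (i + 0.5) * dx, a)
               for i in range(samples)) * dx

print(overlap(0, 0))          # normalization: close to 1
print(overlap(2, 2, a=3.0))   # independent of the width a
print(overlap(0, 1))          # orthogonality: close to 0
```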

Real wave functions. The wave function (25.12) is real. Therefore,
from (24.20), the current in this state is equal to zero. This can also
be seen in the following way. The wave function (25.12) can be
expanded into the sum of two exponentials. Each such exponential
represents, together with a time factor, the wave function of a free
particle (24.3), one of them corresponding to a momentum $p = h\kappa$
and the other, to the same momentum but with opposite sign. Thus,
a state with wave function (25.12) is represented as the superposition
of two states with opposite momenta, these states having equal
amplitudes. The mean momentum for a particle moving in a potential
well according to classical mechanics is equal to zero; it changes
sign for every reflection from the walls of the well. In this sense,
we can say that the mean momentum is also equal to zero for quantum
motion. The difference is that at every given instant classical
momentum possesses a definite value, while the quantum momentum
of a particle in a well never has a definite value; the wave function
involves states with momenta of both signs. This corresponds to
the uncertainty principle; since the particle coordinate is within
the limits 0 < x < a, the momentum cannot have an exact value.

In addition, we note that in this particular problem of a rectangular 
well the square of the momentum has a definite value since the 
uncertainty is extended only to the sign of the momentum. The 
square of the momentum in this case is proportional to the energy. 
The square of the momentum for a well of any arbitrary shape is 
also not fixed. 

A particle in a three-dimensional infinitely deep potential well.
Let us now suppose that a particle is contained in a box whose edges
are $a_1$, $a_2$, $a_3$. Generalizing the boundary conditions (25.1),
we conclude that the wave function becomes zero on all the sides of the
box:

$$\psi(0, y, z) = \psi(x, 0, z) = \psi(x, y, 0) = \psi(a_1, y, z) = \psi(x, a_2, z) = \psi(x, y, a_3) = 0. \tag{25.13}$$

The wave equation must now be written in three-dimensional form:

$$-\frac{h^2}{2m}\left(\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2}\right) = \mathcal{E}\psi. \tag{25.14}$$

It is convenient to write the solution as follows:

$$\psi = C\sin\kappa_1 x \cdot \sin\kappa_2 y \cdot \sin\kappa_3 z. \tag{25.15}$$

It is written only in terms of sines and not cosines so as to satisfy
the first line of the boundary conditions (25.13). We substitute (25.15)
in (25.14) and, utilizing the fact that for every factor of (25.15) there
exists an equality of the form

$$\frac{d^2}{dx^2}\sin\kappa_1 x = -\kappa_1^2\sin\kappa_1 x,$$

this gives

$$\Delta\psi = -(\kappa_1^2 + \kappa_2^2 + \kappa_3^2)\,\psi. \tag{25.16}$$

To satisfy equation (25.14) the energy must involve $\kappa_1$,
$\kappa_2$ and $\kappa_3$ in the following way:

$$\mathcal{E} = \frac{h^2}{2m}(\kappa_1^2 + \kappa_2^2 + \kappa_3^2). \tag{25.17}$$

The quantities $\kappa_1$, $\kappa_2$, $\kappa_3$ are determined from
the second line of the boundary conditions (25.13). The factors of
(25.15) convert to zero either at $x = a_1$, or $y = a_2$, or
$z = a_3$. In other words,

$$\sin\kappa_1 a_1 = 0, \quad \kappa_1 a_1 = n_1\pi;$$
$$\sin\kappa_2 a_2 = 0, \quad \kappa_2 a_2 = n_2\pi; \tag{25.18}$$
$$\sin\kappa_3 a_3 = 0, \quad \kappa_3 a_3 = n_3\pi.$$

Here $n_1$, $n_2$ and $n_3$ are integers of which none are equal to
zero (otherwise $\psi$ would be equal to zero over all the box).

Substituting $\kappa_1$, $\kappa_2$, $\kappa_3$ from (25.18) in
(25.17), we have the energy eigenvalues

$$\mathcal{E} = \frac{\pi^2 h^2}{2m}\left(\frac{n_1^2}{a_1^2} + \frac{n_2^2}{a_2^2} + \frac{n_3^2}{a_3^2}\right). \tag{25.19}$$

The least possible energy is

$$\mathcal{E}_{111} = \frac{\pi^2 h^2}{2m}\left(\frac{1}{a_1^2} + \frac{1}{a_2^2} + \frac{1}{a_3^2}\right). \tag{25.20}$$

It follows that the value $\mathcal{E} = 0$ is impossible.
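The eigenvalues (25.19) are easily tabulated. The sketch below measures energy in units of $\pi^2h^2/2m$ and, for the special case of a cube with unit edge (an assumption made only for this illustration), also counts how many triples $n_1, n_2, n_3$ give the same energy.

```python
from collections import Counter
from itertools import product

def box_energy(n1, n2, n3, a1, a2, a3):
    """Energy (25.19) in units of pi^2 h^2 / (2 m)."""
    if min(n1, n2, n3) < 1:
        raise ValueError("n1, n2, n3 must be positive integers")
    return (n1 / a1) ** 2 + (n2 / a2) ** 2 + (n3 / a3) ** 2

def cubic_degeneracy(n_max):
    """Cube with a1 = a2 = a3 = 1: number of triples for each n1^2 + n2^2 + n3^2.

    Only sums below (n_max + 1)^2 + 2 are counted completely, because
    triples containing a number larger than n_max are not generated.
    """
    return Counter(n1 * n1 + n2 * n2 + n3 * n3
                   for n1, n2, n3 in product(range(1, n_max + 1), repeat=3))

levels = cubic_degeneracy(3)
print(box_energy(1, 1, 1, 1.0, 1.0, 1.0))   # ground state (25.20) of the unit cube
print(sorted(levels.items()))               # (energy, number of states) pairs
```

In a cube several different states share one energy, whereas for edges of incommensurable length each triple gives its own eigenvalue.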

Calculating the number of possible states. To each value of the three
numbers $n_1$, $n_2$ and $n_3$, there corresponds a single particle
state. Let the numbers $n_1$, $n_2$ and $n_3$ be large in comparison
with unity. Such numbers may be differentiated: the differential
$dn_1$ denotes a number interval which is small compared with $n_1$,
but still including many separate integral values of $n_1$. Then it
stands to reason that there are exactly $dn_1$ possible integers
included within the interval $dn_1$ (and similarly within the intervals
$dn_2$ and $dn_3$). Let us plot $n_1$, $n_2$ and $n_3$ on coordinate
axes. In this space we construct an infinitely small parallelepiped of
volume $dn_1\,dn_2\,dn_3$. In accordance with the foregoing, there are
$dn_1\,dn_2\,dn_3$ groups of three integers $n_1$, $n_2$, $n_3$ in this
parallelepiped, each corresponding to one possible state of the
particle in the box. Altogether, the number of such states in the
examined interval of values $n_1$, $n_2$ and $n_3$ is

$$dN(n_1, n_2, n_3) = dn_1\,dn_2\,dn_3. \tag{25.21}$$

Substituting here $\kappa_1$, $\kappa_2$ and $\kappa_3$ from (25.18),
we obtain another expression for the number of states:

$$dN(\kappa_1, \kappa_2, \kappa_3) = \frac{a_1 a_2 a_3\,d\kappa_1\,d\kappa_2\,d\kappa_3}{\pi^3} = \frac{V\,d\kappa_1\,d\kappa_2\,d\kappa_3}{\pi^3}, \tag{25.22}$$

where $V = a_1 a_2 a_3$ is the volume of the box. The numbers
$\kappa_1$, $\kappa_2$ and $\kappa_3$ take only positive values.

It was pointed out above that to each value of $\kappa$ there
correspond two values of the momentum projection, which are equal in
magnitude and opposite in sign. Therefore, if we compare the number of
states included in the intervals $d\kappa_x$ and $\frac{dp_x}{h}$, then
there are half as many states for the latter. Correspondingly, the
number of states in the interval of values of momentum
$dp_x\,dp_y\,dp_z$ is

$$dN(p_x, p_y, p_z) = \frac{V\,dp_x\,dp_y\,dp_z}{(2\pi h)^3}, \tag{25.23}$$

where $p_x$, $p_y$ and $p_z$ assume all real values from $-\infty$ to
$\infty$.

Equation (25.23) agrees with the uncertainty relation (23.4). If
the motion is bounded along x by the interval $a_1$, then only those
states differ physically for which the momentum projections differ
by not less than $\frac{2\pi h}{a_1}$. Hence, there are
$\frac{a_1\,dp_x}{2\pi h}$ states within the interval $dp_x$.
Multiplying
$\frac{a_1\,dp_x}{2\pi h} \cdot \frac{a_2\,dp_y}{2\pi h} \cdot \frac{a_3\,dp_z}{2\pi h}$,
we arrive at (25.23). In order to ensure coincidence of numerical
coefficients with the results of rigorous derivation from the wave
equation when evaluating the number of states from the uncertainty
relation, the quantity $2\pi h$ was selected on the right-hand side
of (23.4) or $2\pi$ in (18.10).

We shall now consider the number of states after changing somewhat
the independent variables. We plot the quantities $\kappa_1$,
$\kappa_2$ and $\kappa_3$ on the coordinate axes (Fig. 37). Let us
construct in this “space” a sphere whose equation looks like

$$\kappa_1^2 + \kappa_2^2 + \kappa_3^2 = K^2.$$

The numbers $\kappa_1$, $\kappa_2$ and $\kappa_3$ are positive so that
we shall be interested only in one eighth of the sphere; this octant is
shown in Fig. 37. How many states are included between the octants of
two spheres with radii K and K + dK? The number of states is equal to
the integral of (25.22) over the whole volume between the octants, or

$$dN(K) = \int dN(\kappa_1, \kappa_2, \kappa_3) = \frac{V \cdot 4\pi K^2\,dK}{8\pi^3} = \frac{V K^2\,dK}{2\pi^2}. \tag{25.24}$$

This is evident simply from the fact that the volume is equal to the
surface of the octant multiplied by dK. But from (25.17) K
is very simply related to the energy of the particle:

$$K = \frac{\sqrt{2m\mathcal{E}}}{h}.$$

Whence

$$dN(\mathcal{E}) = \frac{V m^{3/2}\,\mathcal{E}^{1/2}\,d\mathcal{E}}{2^{1/2}\,\pi^2 h^3}. \tag{25.25}$$


Thus, the number of states included between $\mathcal{E}$ and $\mathcal{E} + d\mathcal{E}$ increases in direct proportion to $\sqrt{\mathcal{E}}$. In a one-dimensional potential well of width $a$ we would obtain $dN = dn = \frac{a\sqrt{2m}}{2\pi\hbar}\frac{d\mathcal{E}}{\sqrt{\mathcal{E}}}$. Equation (25.25) has great significance in all that is to follow. We observe that this formula involves only the volume of the vessel $V$, irrespective of the ratio of the edges $a_1$, $a_2$, and $a_3$. In mathematical physics courses it is shown that the result (25.25) holds for energy eigenvalues which are sufficiently large compared with the energy of the ground state. The number of states is proportional to the volume of the vessel and is independent of its shape.
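A direct count makes the last statement concrete. The Python sketch below is an illustration added here, not part of the original derivation: it assumes the box wave numbers $\kappa_i = \pi n_i / a_i$ with $n_i = 1, 2, \ldots$, counts all states with $\kappa_1^2 + \kappa_2^2 + \kappa_3^2 \le K^2$, and compares the count with the integral of (25.24) from $0$ to $K$, that is $V K^3 / (6\pi^2)$. The function names are hypothetical.

```python
import math

def count_states(a1, a2, a3, K):
    """Count triples n1, n2, n3 >= 1 with (pi*n1/a1)^2 + (pi*n2/a2)^2 + (pi*n3/a3)^2 <= K^2."""
    count = 0
    n1 = 1
    while (math.pi * n1 / a1) ** 2 <= K * K:
        n2 = 1
        while (math.pi * n1 / a1) ** 2 + (math.pi * n2 / a2) ** 2 <= K * K:
            # Remaining "budget" for the third wave number.
            rest = K * K - (math.pi * n1 / a1) ** 2 - (math.pi * n2 / a2) ** 2
            # Largest admissible n3 for this pair (n1, n2); zero if none fits.
            count += int(math.floor(a3 * math.sqrt(rest) / math.pi))
            n2 += 1
        n1 += 1
    return count

def smooth_estimate(a1, a2, a3, K):
    """Integral of (25.24) from 0 to K: V * K^3 / (6 * pi^2)."""
    return a1 * a2 * a3 * K ** 3 / (6 * math.pi ** 2)

if __name__ == "__main__":
    K = 40 * math.pi
    for edges in [(1.0, 1.0, 1.0), (2.0, 1.0, 0.5)]:   # same volume, different shape
        ratio = count_states(*edges, K) / smooth_estimate(*edges, K)
        print(edges, round(ratio, 3))
```

For both boxes the ratio comes out somewhat below unity, the deficit being a surface effect that shrinks as $K$ grows; this is consistent with the remark that (25.25) holds for energies large compared with the ground-state energy.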

A one-dimensional potential well of finite depth. We shall now consider a one-dimensional potential well of finite depth. We specify it in the following way: $U = \infty$ for $-\infty < x < 0$, $U = 0$ for $0 < x < a$ and $U = U_0$ for $a < x < \infty$. In other words, the potential energy for $x > 0$ is everywhere equal to $U_0$, except within a region of width $a$ near the coordinate origin, which region we call the well. For $x < 0$ the potential energy is infinite (see Fig. 38).*

[Fig. 38]

Since the solution will be of different analytical form inside and outside the well, we must find the conjugation conditions for the wave function at the boundary.

Let us take the wave equation

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + U\psi = \mathcal{E}\psi \tag{25.26}$$


* It was shown in Sec. 19 that the three-dimensional wave equation can 
be reduced to a single-dimensional one, with the difference that the variable r 
must be positive by its very meaning. This may be attained formally by situating 
an infinitely high potential wall at r = 0. Fig. 38 actually refers to a spherical 
potential well with an angular-momentum value equal to zero, when there 
is no “centrifugal” term in the potential energy [see (5.8) and (31.6)].


and integrate both sides over a narrow region $a - \delta \le x \le a + \delta$, including the point of discontinuity of the potential energy $x = a$. The integration gives

$$-\frac{\hbar^2}{2m}\left[\left(\frac{d\psi}{dx}\right)_{a+\delta} - \left(\frac{d\psi}{dx}\right)_{a-\delta}\right] = \int_{a-\delta}^{a+\delta} (\mathcal{E} - U)\,\psi\,dx. \tag{25.27}$$


Even though $U$ suffers a discontinuity at the boundary of the well, on the right it remains everywhere finite by arrangement. Therefore, when $\delta$ approaches zero, the integral on the right also approaches zero. It follows that the left-hand side of (25.27) is also zero. In other words,

$$\left(\frac{d\psi}{dx}\right)_{a+0} = \left(\frac{d\psi}{dx}\right)_{a-0}, \tag{25.28}$$

the limit of the derivative on the right is equal to its limit on the left.

This argument would not hold in the problem of an infinitely deep well because then the integral in (25.27) would be indeterminate.

Besides, we notice that the derivative is finite at the points $x = a \pm \delta$ simply because the only solutions of equation (25.26) are those with finite derivatives (exponential function, sine or cosine).

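We shall now show, by means of a limiting transition, that even the wave function itself does not suffer a discontinuity at the boundary. Let U initially have a finite discontinuity region of width $\delta$ and let the discontinuity of the function be $\Delta$. Before passing to the limit the derivative in the region of discontinuity is of the order of $\Delta/\delta$, so that when $\delta \to 0$, it diverges. Let us now multiply both sides of (25.26) by $\psi$ and perform a transformation by parts:

$$-\frac{\hbar^2}{2m}\left[\frac{d}{dx}\left(\psi\frac{d\psi}{dx}\right) - \left(\frac{d\psi}{dx}\right)^2\right] = (\mathcal{E} - U)\,\psi^2.$$

Let us integrate the transformed expression between $a - \delta$ and $a + \delta$. We then obtain

$$-\frac{\hbar^2}{2m}\left(\psi\frac{d\psi}{dx}\right)\Bigg|_{a-\delta}^{a+\delta} + \frac{\hbar^2}{2m}\int_{a-\delta}^{a+\delta}\left(\frac{d\psi}{dx}\right)^2 dx = \int_{a-\delta}^{a+\delta}(\mathcal{E} - U)\,\psi^2\,dx. \tag{25.29}$$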


We shall now carry out the foregoing limiting process as an indirect proof. We may write the integrated terms thus:

$$\left(\psi\frac{d\psi}{dx}\right)_{a+\delta} - \left(\psi\frac{d\psi}{dx}\right)_{a-\delta} = \Delta\,\frac{d\psi}{dx},$$

because the derivative $\frac{d\psi}{dx}$, as was shown, is not subject to a discontinuity. Within the assumed discontinuity region of the $\psi$-function, $\frac{d\psi}{dx}$ is of the order of $\Delta/\delta$, but at the boundaries of the region it reverts to values which are independent of $\delta$ and are therefore finite in the limit. Hence, the whole integrated part on the left in (25.29) is of the order of $\Delta\,\frac{d\psi}{dx}$, which is finite. The remaining integral is estimated as

$$\int_{a-\delta}^{a+\delta}\left(\frac{d\psi}{dx}\right)^2 dx \sim \left(\frac{\Delta}{\delta}\right)^2\delta = \frac{\Delta^2}{\delta}.$$

Hence it tends to infinity as $\delta$ tends to zero. The right-hand side of (25.29) is finite for $\delta \to 0$. Thus, by assuming that $\psi$ has a finite discontinuity $\Delta$ we have arrived at a contradiction. It follows that $\psi$ is continuous at the point $a$ together with its first derivative.

Solutions in two regions. The wave equation for the region $0 \le x \le a$ (inside the well) is of the form

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} = \mathcal{E}\psi.$$

We take its solution

$$\psi = C_1 \sin\kappa x, \tag{25.30}$$

where $\kappa$ is defined from (25.3). The solution involving the sine only is taken because at the left-hand edge of the well, where the potential energy suffers an infinite discontinuity, $\psi$ satisfies the boundary condition (25.1): $\psi(0) = 0$.

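The wave equation outside the well, when $x > a$, is

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} = (\mathcal{E} - U_0)\,\psi. \tag{25.31}$$

First of all let us take the case $\mathcal{E} > U_0$. Then, introducing the abbreviated notation

$$\kappa_1^2 = \frac{2m}{\hbar^2}\,(\mathcal{E} - U_0), \tag{25.32}$$

we obtain (25.31) in the standard form (25.4)

$$\frac{d^2\psi}{dx^2} = -\kappa_1^2\,\psi,$$

whence

$$\psi = C_2 \sin\kappa_1 x + C_3 \cos\kappa_1 x. \tag{25.33}$$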


We must now satisfy the boundary conditions on the right-hand edge of the potential well where $U$ suffers only a finite discontinuity. According to these conditions the wave function is itself continuous, i.e.,

$$C_1 \sin\kappa a = C_2 \sin\kappa_1 a + C_3 \cos\kappa_1 a, \tag{25.34}$$

and so is its derivative:

$$\kappa\,C_1 \cos\kappa a = \kappa_1 C_2 \cos\kappa_1 a - \kappa_1 C_3 \sin\kappa_1 a. \tag{25.35}$$


From these equations we can determine $C_2$ and $C_3$ in terms of $C_1$, i.e., completely express the solution outside the well in terms of the solution inside the well. The equations (25.34) and (25.35) are linear with respect to $C_2$ and $C_3$ and have solutions for all values of the coefficients:

$$C_2 = \frac{\kappa_1 \sin\kappa a\,\sin\kappa_1 a + \kappa\cos\kappa a\,\cos\kappa_1 a}{\kappa_1}\,C_1,$$

$$C_3 = \frac{\kappa_1 \sin\kappa a\,\cos\kappa_1 a - \kappa\cos\kappa a\,\sin\kappa_1 a}{\kappa_1}\,C_1.$$

Therefore, the boundary conditions may be satisfied for any real values of $\kappa$ and $\kappa_1$. Thus, Schrödinger's equation is solvable for all $\mathcal{E} > U_0$. There are no discrete energy eigenvalues for $\mathcal{E} > U_0$.

We could adjust the potential energy in this problem to zero at infinity, i.e., consider it equal to zero for $x > a$ and equal to $-U_0$ for $a > x \ge 0$. Then the case which we have just considered would correspond to positive eigenvalues of the total energy.
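The claim that the matching conditions can be met for any $\kappa$ and $\kappa_1$ is easy to check numerically. The following Python sketch is an added illustration with hypothetical function names: it solves the linear system (25.34), (25.35) for $C_2$ and $C_3$ in closed form and returns the residuals of both conditions, which should vanish to rounding error for arbitrary real inputs.

```python
import math

def outer_coefficients(kappa, kappa1, a, C1=1.0):
    """Solve (25.34), (25.35) for C2, C3 given the inner amplitude C1."""
    s, c = math.sin(kappa1 * a), math.cos(kappa1 * a)
    A = C1 * math.sin(kappa * a)                    # value of the inner solution at x = a
    B = C1 * kappa * math.cos(kappa * a) / kappa1   # its slope at x = a, divided by kappa1
    C2 = A * s + B * c
    C3 = A * c - B * s
    return C2, C3

def matching_residuals(kappa, kappa1, a, C1=1.0):
    """Left side minus right side of (25.34) and of (25.35)."""
    C2, C3 = outer_coefficients(kappa, kappa1, a, C1)
    s, c = math.sin(kappa1 * a), math.cos(kappa1 * a)
    r_value = C1 * math.sin(kappa * a) - (C2 * s + C3 * c)
    r_slope = kappa * C1 * math.cos(kappa * a) - (kappa1 * C2 * c - kappa1 * C3 * s)
    return r_value, r_slope

if __name__ == "__main__":
    print(matching_residuals(kappa=2.3, kappa1=0.7, a=1.5))
```

No condition on the energy appears anywhere in the solution, which is the numerical counterpart of the continuous spectrum for energies above $U_0$.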

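Now let $\mathcal{E} < U_0$. We introduce the quantity

$$\tilde{\kappa}^2 = \frac{2m}{\hbar^2}\,(U_0 - \mathcal{E}), \qquad \tilde{\kappa} > 0. \tag{25.36}$$

The wave equation is now written differently from that for $\mathcal{E} > U_0$, namely

$$\frac{d^2\psi}{dx^2} = \tilde{\kappa}^2\,\psi.$$

Its solution is expressed in terms of the exponential function

$$\psi = C_2 e^{-\tilde{\kappa}x} + C_3 e^{\tilde{\kappa}x}. \tag{25.37}$$

But the exponential $e^{\tilde{\kappa}x}$ tends to infinity as $x$ increases. For $x = \infty$ it would give an infinite probability for finding the particle, and no finite value could be assigned to the integral $\int_0^\infty |\psi|^2\,dx$. It follows that a physically meaningful solution exists only for $C_3 = 0$ and must be of the form

$$\psi = C_2 e^{-\tilde{\kappa}x}. \tag{25.38}$$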


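Let us again try to satisfy the boundary conditions at $x = a$. This time they appear as follows:

$$C_1 \sin\kappa a = C_2 e^{-\tilde{\kappa}a}, \tag{25.39}$$

$$\kappa\,C_1 \cos\kappa a = -\tilde{\kappa}\,C_2 e^{-\tilde{\kappa}a}. \tag{25.40}$$

Let us divide equation (25.40) by (25.39) in order to eliminate $C_1$ and $C_2$. We then obtain

$$\kappa \cot\kappa a = -\tilde{\kappa}. \tag{25.41}$$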


From this equation we find the expression for $\sin\kappa a$:

$$\sin\kappa a = \pm\frac{1}{\sqrt{1 + \cot^2\kappa a}} = \pm\frac{\kappa}{\sqrt{\kappa^2 + \tilde{\kappa}^2}}. \tag{25.42}$$

Let us reduce this equation to a more convenient form. From (25.3) and (25.36)

$$\kappa^2 + \tilde{\kappa}^2 = \frac{2mU_0}{\hbar^2},$$

so that

$$\sin\kappa a = \pm\frac{\hbar}{a\sqrt{2mU_0}}\,\kappa a. \tag{25.43}$$

Only those solutions should be chosen for which $\cot\kappa a$ is negative, in accordance with (25.41), i.e., $\kappa a$ must lie in the second, fourth, sixth, eighth, etc., quadrants.

We shall solve this equation graphically (Fig. 39). The left-hand side of equation (25.43), regarded as a function of $\kappa a$, is represented by a sinusoid, while the right-hand side is represented by two straight lines of slopes $\pm\frac{\hbar}{a\sqrt{2mU_0}}$. If the absolute value of the slopes of these lines is less than $2/\pi$, they have one or several common points with the sinusoid in the quadrants corresponding to the roots of (25.41). The trivial point of intersection $\kappa a = 0$ does not count because, for $\kappa = 0$, the wave function is zero everywhere. Thus, in a well of finite depth of the form considered, there are only several energy eigenvalues.

[Fig. 39]

If

$$\frac{\hbar}{a\sqrt{2mU_0}} > \frac{2}{\pi}, \quad \text{i.e.,} \quad U_0 < \frac{\pi^2\hbar^2}{8ma^2},$$

there are in general no points of intersection of the straight lines with the sinusoid corresponding to energy eigenvalues. In Fig. 39 the points of intersection in the even quadrants are marked by small circles.
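The graphical construction can also be carried out numerically. The sketch below is an added illustration, not the book's procedure: it works with the dimensionless quantities $z = \kappa a$ and $z_0 = a\sqrt{2mU_0}/\hbar$ (both names are introduced here), writes the root condition $\kappa\cot\kappa a = -\sqrt{2m(U_0 - \mathcal{E})}/\hbar$ as $z\cos z + \sqrt{z_0^2 - z^2}\,\sin z = 0$, and brackets one root in each even quadrant that begins below $z_0$.

```python
import math

def bound_state_roots(z0, tol=1e-12):
    """Roots z = kappa*a of z*cot(z) = -sqrt(z0**2 - z**2) with 0 < z < z0.

    z0 = a*sqrt(2*m*U0)/hbar measures the strength of the well.
    """
    def f(z):
        # The root condition multiplied through by sin(z) to avoid the poles of cot.
        return z * math.cos(z) + math.sqrt(max(z0 * z0 - z * z, 0.0)) * math.sin(z)

    roots = []
    j = 1
    # The j-th even quadrant is ((2j-1)*pi/2, j*pi); it matters only if it starts below z0.
    while (2 * j - 1) * math.pi / 2 < z0:
        lo = (2 * j - 1) * math.pi / 2
        hi = min(j * math.pi, z0)
        f_lo = f(lo)
        while hi - lo > tol:               # plain bisection on the sign change
            mid = 0.5 * (lo + hi)
            f_mid = f(mid)
            if (f_mid > 0) == (f_lo > 0):
                lo, f_lo = mid, f_mid
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
        j += 1
    return roots

if __name__ == "__main__":
    for z0 in (1.0, 2.0, 5.0):
        print(z0, bound_state_roots(z0))
```

For $z_0 < \pi/2$ the list is empty, which reproduces the statement that a sufficiently shallow well has no energy eigenvalues; each further half-period of the sinusoid that fits below $z_0$ adds one level.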




Finite and infinite motion. We shall now relate the shape of the energy spectrum to the type of motion. For $\mathcal{E} > U_0$ the solution outside the well is of the form (25.33). It remains finite also for an infinitely large $x$. Therefore, the integral $\int |\psi|^2\,dx$ taken over the region of the well is infinitesimal compared with the same integral taken over all space. In other words, there is nothing to prevent the particle going to infinity. Such motion was termed infinite in Sec. 5. For $\mathcal{E} < U_0$, the solution (25.38), if it exists, is exponentially damped at infinity. Hence, the probability of the particle receding an infinite distance from the origin is equal to zero: the particle remains at a finite distance from the well all the time. This motion was termed finite in Sec. 5.

Thus, infinite motion has a continuous energy spectrum while finite motion has a discrete spectrum consisting of separate values. If the depth of the well is very small, the finite motion may be absent. This has no counterpart in classical mechanics, where finite motion is always possible in a potential well if $|\mathcal{E}| < U$.

The result that we have obtained does not only refer to a rectangular potential well. Indeed, if the potential energy is taken to be zero at infinity, then the solution with positive total energy is of the form (25.33) for sufficiently large $x$, while the solution with negative total energy is of the form (25.38). The latter contains only one arbitrary constant while (25.33) contains two constants. Both solutions must be extended to the coordinate origin in order that the condition $\psi(0) = 0$ can be satisfied at the origin (we consider that $x$ is always greater than zero). Obviously, if we have two constants at our disposal we can always choose them so that the condition $\psi(0) = 0$ is satisfied.* Contrarily, a solution of the form (25.38) containing one constant becomes zero at the origin only for certain special values of the decay constant in the exponent, i.e., of the energy.

A continuous spectrum corresponding to infinite motion may be 
accounted for in the following way. A free particle moving in unbounded
space has a continuous spectrum. The wave function of the particle
in infinite motion differs from the wave function of a free particle 
only in the region of a potential well. But the probability of finding 
the particle in this region is infinitesimal if the whole region of motion 
is sufficiently large. Therefore, the wave function for infinite motion 
coincides with the wave function of a free particle in “almost” the 
entire space, i.e., in that region of space for which the probability 
of finding the particle is equal to unity, and the energy spectrum 
turns out to be the same as for a free particle. 

* If $\psi(0) = C_2\psi_1(0) + C_3\psi_2(0)$, where $\psi_1$ and $\psi_2$ are the two independent solutions, then $C_2/C_3 = -\psi_2(0)/\psi_1(0)$.

The wave function in a region where the potential energy is greater than the total energy. If $U_0$ tends to infinity the function outside the well very rapidly tends to zero. In the limit $U_0 \to \infty$, it tends to zero however close to the boundary $x = a$, thereby giving the boundary condition (25.1).

In the case of a finite $U_0$ the wave function outside the well does not become zero at once. Therefore a finite probability exists that the particle will be outside the well at a finite distance from it.

This would have been completely impossible in classical mechanics, as is obtained from (25.38) in the limiting transition $\hbar \to 0$, for which the quantity defined by (25.36) becomes infinite and $\psi$ is however small outside the well. This, naturally, should be the case: if the particle is situated outside the well its kinetic energy is $\mathcal{E} - U_0 < 0$. But the velocity of such a particle is an imaginary quantity. In classical mechanics it means that a given point of space is absolutely unattainable for the particle at the given value of its energy $\mathcal{E}$. In quantum mechanics, a coordinate and velocity never exist in the same states as precise quantities. Earlier, we interpreted this in terms of the uncertainty relation, i.e., we considered cases for which precision in the concept of velocity for a certain state was restricted by the limits $\Delta v \sim \frac{2\pi\hbar}{m\,\Delta x}$ [see (23.4)]. However, this is a lower limit and has to do with particles which are almost unaffected by forces. The appearance of an imaginary velocity in the equation for a bound particle shows that the very concept of velocity is not applicable to a region of space, however large, for which $U > \mathcal{E}$. We can express this differently by saying that, for $U > \mathcal{E}$, the uncertainty in the kinetic energy is always greater than the difference $U - \mathcal{E}$.

To summarize, in classical mechanics there is no counterpart to 
the motion of a bound particle outside a well. 

Exercises 

1) The potential energy is equal to zero for $x < 0$ and equal to $U_0$ for $x > 0$ (the potential threshold). Incident from the left are particles with energies $\mathcal{E} > U_0$. Find the reflection coefficient.

The wave function on the left is

$$C_1 e^{ipx/\hbar} + C_2 e^{-ipx/\hbar}, \quad \text{where } p = \sqrt{2m\mathcal{E}}.$$

On the right, above the threshold, the function is

$$C_3 e^{ip'x/\hbar}, \quad \text{where } p' = \sqrt{2m(\mathcal{E} - U_0)}.$$

Find the ratio $|C_2|^2/|C_1|^2$ from the boundary conditions at $x = 0$, i.e., the ratio of the squares of the amplitudes of the reflected and incident waves. The ratio is equal to unity for $\mathcal{E} < U_0$.

2) The potential energy is equal to zero for $x < 0$ and for $x > a$; $U = U_0$ for $0 < x < a$ (a potential barrier). Particles are incident from the left with energies less than $U_0$. Find the coefficient of reflection.

The wave function to the left of the barrier is equal to $e^{ikx} + Ce^{-ikx}$; under the barrier, i.e., for $0 < x < a$, the wave function is $C_1 e^{\kappa x} + C_2 e^{-\kappa x}$. We look for a wave function of the form $C_3 e^{ikx}$ beyond the barrier. Here $k = \sqrt{2m\mathcal{E}}/\hbar$ and $\kappa = \sqrt{2m(U_0 - \mathcal{E})}/\hbar$. This means that beyond the barrier the only wave is that travelling to the right (i.e., only the transmitted wave), while in front of the barrier we find both the incident wave and the reflected wave. The constants $C$, $C_1$, $C_2$, $C_3$ are determined from the continuity condition of the wave function and its first derivative at the boundaries of the barrier. The expressions for the constants $C$ and $C_3$ are as follows:

$$C = \frac{2(\kappa^2 + k^2)\sinh\kappa a}{(\kappa + ik)^2 e^{-\kappa a} - (\kappa - ik)^2 e^{\kappa a}}, \qquad C_3 = \frac{4i\kappa k\,e^{-ika}}{(\kappa + ik)^2 e^{-\kappa a} - (\kappa - ik)^2 e^{\kappa a}}.$$

The particle flux on the left and right of the barrier is, respectively,

$$j = \frac{\hbar k}{m}\left(1 - |C|^2\right), \qquad j = \frac{\hbar k}{m}\,|C_3|^2.$$

Substituting $C$ and $C_3$ it is easy to see that both the expressions for flux coincide as expected.

If $\kappa a \gg 1$, i.e., the barrier is transparent to a very small extent, we have (for $k \ll \kappa$)

$$C \approx -1, \qquad C_3 \approx -\frac{4ik}{\kappa}\,e^{-ika}e^{-\kappa a}.$$

Thus the flux diminishes exponentially with the thickness of the barrier. 

It will also be noted that the total particle flux through the barrier is proportional to the particle density in front of the barrier, because the boundary conditions are linear and homogeneous with respect to the wave functions. By specifying the amplitude of the wave function on the left we determined the density and flux of the particles.
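The equality of the two fluxes in exercise 2 can be checked numerically. The Python sketch below is an added illustration with hypothetical names. It assumes the amplitudes that follow from the four matching conditions, $C = 2(\kappa^2 + k^2)\sinh\kappa a / D$ and $C_3 = 4i\kappa k\,e^{-ika}/D$ with $D = (\kappa + ik)^2 e^{-\kappa a} - (\kappa - ik)^2 e^{\kappa a}$, and tests that $1 - |C|^2 = |C_3|^2$.

```python
import cmath
import math

def barrier_amplitudes(k, kappa, a):
    """Reflected amplitude C and transmitted amplitude C3 for unit incident amplitude."""
    D = (kappa + 1j * k) ** 2 * cmath.exp(-kappa * a) \
        - (kappa - 1j * k) ** 2 * cmath.exp(kappa * a)
    C = 2 * (kappa ** 2 + k ** 2) * math.sinh(kappa * a) / D
    C3 = 4j * kappa * k * cmath.exp(-1j * k * a) / D
    return C, C3

def flux_mismatch(k, kappa, a):
    """(1 - |C|^2) - |C3|^2; zero when the flux is conserved."""
    C, C3 = barrier_amplitudes(k, kappa, a)
    return (1 - abs(C) ** 2) - abs(C3) ** 2

if __name__ == "__main__":
    print(flux_mismatch(1.0, 1.0, 1.0))
    # Thick barrier with k << kappa: compare with -(4ik/kappa) e^{-ika} e^{-kappa a}.
    k, kappa, a = 0.01, 1.0, 10.0
    _, C3 = barrier_amplitudes(k, kappa, a)
    approx = -4j * k / kappa * cmath.exp(-1j * k * a) * math.exp(-kappa * a)
    print(abs(C3 - approx) / abs(approx))
```

The second print shows how close the thick-barrier approximation is for the chosen parameters; the agreement degrades as $k/\kappa$ grows, since the simplified coefficient neglects $k$ beside $\kappa$.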

3) Verify the orthogonality property for wave functions (the exercise 
in the preceding section) for a particle in a box of finite and infinite depth. 


Sec. 26. Harmonic Oscillatory Motion in Quantum Mechanics 
(Linear Harmonic Oscillator) 


The wave equation for an oscillator. In Sec. 7 we considered harmonic oscillations with one degree of freedom. The Hamiltonian function of this system, called a linear harmonic oscillator, is of the form

$$\mathcal{H} = \frac{p^2}{2m} + \frac{m\omega^2 x^2}{2}. \tag{26.1}$$

Forming Hamilton's equations, we obtain

$$\dot{p} = -\frac{\partial\mathcal{H}}{\partial x} = -m\omega^2 x, \qquad \dot{x} = \frac{\partial\mathcal{H}}{\partial p} = \frac{p}{m}.$$

Eliminating $p$, we arrive at the usual equation of harmonic oscillations (7.13):

$$\ddot{x} + \omega^2 x = 0.$$


In quantum mechanics the wave equation corresponding to this motion has the form [see (24.22)]

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + \frac{m\omega^2 x^2}{2}\,\psi = \mathcal{E}\psi. \tag{26.2}$$


Indeed, since the motion has only one degree of freedom, instead of the Laplacian $\Delta$, we must simply write the second derivative. The potential energy is equal to $\frac{m\omega^2 x^2}{2}$.

Let us now introduce other units of measurement; in particular, we shall take the unit of length equal to $\sqrt{\hbar/(m\omega)}$, so that

$$x = \sqrt{\frac{\hbar}{m\omega}}\,\xi. \tag{26.3}$$

The quantity $\xi$ is dimensionless. The derivative $\frac{d\psi}{dx}$ is equal to

$$\frac{d\psi}{dx} = \sqrt{\frac{m\omega}{\hbar}}\,\frac{d\psi}{d\xi}. \tag{26.4}$$

Further, we put

$$2\mathcal{E} = \varepsilon\cdot\hbar\omega. \tag{26.5}$$

In terms of these dimensionless variables, equation (26.2) assumes the form

$$-\frac{d^2\psi}{d\xi^2} + \xi^2\psi = \varepsilon\psi. \tag{26.6}$$


Equation (26.6) does not contain any parameters of the problem, i.e., $\omega$, $m$, and $\hbar$. For this reason the eigenvalue $\varepsilon$ can only be an abstract number. Comparing this with the expression for energy (26.5), we see that the energy eigenvalue of a harmonic oscillator is proportional to its frequency $\omega$.

The transition to another dependent variable. It appears convenient to introduce a new dependent variable $g(\xi)$:

$$\psi = e^{-\xi^2/2}\,g(\xi). \tag{26.7}$$

Whence

$$\frac{d^2\psi}{d\xi^2} = \xi^2 e^{-\xi^2/2} g(\xi) - e^{-\xi^2/2} g(\xi) - 2\xi e^{-\xi^2/2}\frac{dg}{d\xi} + e^{-\xi^2/2}\frac{d^2 g}{d\xi^2}. \tag{26.8}$$

We substitute (26.8) in (26.6) and perform the necessary rearrangement. The new dependent variable $g(\xi)$ having been introduced, the equation assumes the form

$$-\frac{d^2 g}{d\xi^2} + 2\xi\frac{dg}{d\xi} = (\varepsilon - 1)\,g. \tag{26.9}$$

Integration in the form of a series. It is possible to integrate equation (26.9) by expanding $g(\xi)$ in a power series of the form:

$$g(\xi) = g_0 + g_1\xi + g_2\xi^2 + g_3\xi^3 + \ldots = \sum_{n=0}^{\infty} g_n\xi^n. \tag{26.10}$$

In order to determine the coefficients of the expansion $g_n$, we must substitute the series (26.10) into equation (26.9), differentiate it term by term and compare the expressions for the same powers of $\xi$. The first derivative is

$$\frac{dg}{d\xi} = g_1 + 2g_2\xi + 3g_3\xi^2 + \ldots = \sum_{n=1}^{\infty} n g_n\xi^{n-1},$$

so that

$$2\xi\frac{dg}{d\xi} = 2g_1\xi + 4g_2\xi^2 + \ldots = \sum_{n=1}^{\infty} 2n g_n\xi^n. \tag{26.11}$$

The second derivative is

$$\frac{d^2 g}{d\xi^2} = 2g_2 + 6g_3\xi + \ldots = \sum_{k=2}^{\infty} k(k-1)\,g_k\xi^{k-2}. \tag{26.12}$$

In the last summation we changed the summation index, denoting it by the letter $k$. We shall now revert to $n$, assuming $k - 2 = n$, $k = n + 2$. Then

$$\frac{d^2 g}{d\xi^2} = \sum_{n=0}^{\infty} (n+2)(n+1)\,g_{n+2}\,\xi^n. \tag{26.13}$$

Now substituting (26.13) and (26.11) into equation (26.9) and collecting coefficients of $\xi^n$, we obtain

$$\sum_{n=0}^{\infty}\left[-(n+2)(n+1)\,g_{n+2} + 2n g_n - (\varepsilon - 1)\,g_n\right]\xi^n = 0. \tag{26.14}$$

We know that for a power series to be equal to zero, all its coefficients must vanish. Thus,

$$g_{n+2} = g_n\,\frac{2n + 1 - \varepsilon}{(n+2)(n+1)}. \tag{26.15}$$

In this way the expansion proceeds in powers of $\xi^2$ because the recurrence links the coefficients $g_n$ in steps of two.

Examining the series. Let us assume initially that $g_0 \neq 0$. Then, from equation (26.15), we find in turn $g_2$, $g_4$, ..., $g_{2k}$. Not a single odd coefficient will appear in the series if $g_1 = 0$. On the contrary, if $g_0 = 0$, $g_1 \neq 0$, then no even coefficients will appear in the series; for this reason it is sufficient to examine solutions containing only even or only odd powers of $\xi$. To be specific let us first take the series in even powers.

Let us examine the behaviour of the series (26.10) for large values of $\xi$. Terms involving high powers of $\xi$, i.e., large $n$, are then predominant. But if $n$ is a large number then, in equation (26.15), we can neglect all constant numbers where they appear in a sum or difference with $n$. If $n$ is large the equation will take on the form

$$g_{n+2} = \frac{2}{n}\,g_n. \tag{26.16}$$


Let $n = 2n'$ so that $n'$ now changes by unity only. Putting this in (26.16), we obtain

$$g'_{n'+1} = \frac{1}{n'}\,g'_{n'}, \tag{26.17}$$

where we have introduced the notation $g'_{n'} = g_{2n'}$; the $g'_{n'}$ are coefficients of the series in powers of $\xi^2$.

If we now take a function containing only odd powers of $\xi$, then the terms involving $g_{2n'+1}$ will not differ, for large $n'$, from the terms of a series in even powers of $\xi$ because unity can be neglected in comparison with $2n'$. Therefore, the form of the coefficients for large $n'$ is identical both for series in even and odd powers of $\xi$. From (26.17) we find an expression for $g'_{n'+1}$:

$$g'_{n'+1} = \frac{g'_1}{n'(n'-1)(n'-2)\cdots 1} = \frac{g'_1}{n'!}. \tag{26.18}$$

Whence the expression for $g(\xi)$ in the case of large $\xi$ is, apart from factors that vary slowly compared with the exponential, of the form:

$$g(\xi) \cong \sum_{n'=0}^{\infty} g'_{n'}\,\xi^{2n'} \sim \sum_{n'=0}^{\infty}\frac{\xi^{2n'}}{n'!} = e^{\xi^2}. \tag{26.19}$$

Thus, the asymptotic expression for $g(\xi)$ is the exponential function $e^{\xi^2}$. But then, in accordance with the definition of $g(\xi)$ (26.7), the asymptotic form of $\psi(\xi)$ for large $\xi$ is

$$\psi = e^{-\xi^2/2}\,e^{\xi^2} = e^{\xi^2/2}.$$

However, this form of $\psi$ is not acceptable: the wave function must remain finite at infinity because its square is a probability.
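The large-$n$ behaviour used in this argument can be seen directly from the recurrence. The short Python sketch below is an added illustration; it takes the recurrence in the form $g_{n+2}/g_n = (2n + 1 - \varepsilon)/[(n+2)(n+1)]$ and the function name is hypothetical. It shows that $n\,g_{n+2}/g_n$ approaches 2 for a value of $\varepsilon$ that is not an odd integer, and that the ratio vanishes exactly when $\varepsilon = 2n + 1$.

```python
def coefficient_ratio(n, eps):
    """Ratio g_{n+2} / g_n from the recurrence for the series coefficients."""
    return (2 * n + 1 - eps) / ((n + 2) * (n + 1))

if __name__ == "__main__":
    eps = 2.0                      # not of the form 2n + 1: the series never terminates
    for n in (10, 100, 1000):
        # n * ratio tends to 2, the behaviour of the coefficients of exp(xi**2)
        print(n, n * coefficient_ratio(n, eps))
    # For eps = 2n + 1 the ratio is exactly zero and the series breaks off at g_n.
    print(coefficient_ratio(4, 9))
```

A series whose coefficients fall off this slowly grows like $e^{\xi^2}$, which is why only the terminating cases give acceptable wave functions.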

The condition for eigenvalues. There is only one possibility for obtaining a finite value of $\psi$ at infinity. It is necessary that the series (26.10) should terminate at a certain $n$ and that all the subsequent coefficients $g_{n+2}$, $g_{n+4}$, etc., be identically equal to zero. It can be seen from equation (26.15) that $g_{n+2}$ becomes zero when

$$\varepsilon = 2n + 1, \tag{26.20}$$

where $n$ is any positive integer or zero. Since $g_{n+4}$ is linearly expressed in terms of $g_{n+2}$, it is sufficient for $g_{n+2}$ to vanish to have the series terminate at $g_n$. It follows that when $\varepsilon$ satisfies (26.20) the function $g(\xi)$ becomes a polynomial. The product of the polynomial $g(\xi)$ with the exponential $e^{-\xi^2/2}$ always tends to zero as $\xi$ tends to infinity. Hence $\psi(\infty) = 0$. As was pointed out at the end of the last section, such motion is finite in the same sense as in classical mechanics: the probability that the particle will recede an infinite distance is equal to zero. To finite motion there corresponds a discrete energy spectrum; from (26.5) and (26.20)

$$\mathcal{E}_n = \hbar\omega\left(n + \frac{1}{2}\right). \tag{26.21}$$

The least possible value of energy is $\mathcal{E}_0 = \frac{\hbar\omega}{2}$. As we have already said in Sec. 25, a state with the least energy is called the ground state. For this state the series $g(\xi)$ is already terminated at the zero term, because the number of the eigenvalue of energy determines the degree of the polynomial $g_n(\xi)$. The wave function of the ground state is of the especially simple form:

$$\psi_0(\xi) = g_0\,e^{-\xi^2/2}. \tag{26.22}$$

This function does not have any zeros at a finite distance from the coordinate origin, which must be the case in the ground state. It may be noted that the state with zero energy would correspond to a particle at rest at the origin. However, such a state is not compatible with the uncertainty principle, since it has, simultaneously, a definite coordinate and velocity.

Oscillator wave functions. Let us also find the eigenfunctions for the first and second excited states. In the first state $\mathcal{E}_1 = \hbar\omega\left(1 + \frac{1}{2}\right) = \frac{3}{2}\hbar\omega$. The series will be terminated if we assume $g_0 = 0$, $g_1 \neq 0$. Then $\varepsilon = 3$ from (26.20) and $g_3 = g_5 = \ldots = g_{2n+1} = 0$, etc., from (26.15). The even coefficients may be assumed at once equal to zero, for which purpose it is sufficient to take $g_0 = 0$.* In general, all functions with even $n$ turn out even, while those with odd $n$ are odd. In accordance with what has just been said, the wave function with $n = 1$ is

$$\psi_1 = g_1\,\xi\,e^{-\xi^2/2}. \tag{26.23}$$

This function becomes zero for $\xi = 0$, i.e., it has one node.

* If $g_0 \neq 0$ for $\varepsilon = 3$, then the series in even powers of $\xi$ would have extended to infinity, which, as was shown, is impossible.

In the same way it is easy to find $\psi_2$. Indeed, $\mathcal{E}_2 = \hbar\omega\left(2 + \frac{1}{2}\right) = \frac{5}{2}\hbar\omega$, $\varepsilon = 5$. From (26.15), the coefficient $g_2$ is

$$g_2 = g_0\,\frac{1 - 5}{2\cdot 1} = -2g_0, \tag{26.24}$$

so that

$$\psi_2 = g_0\,(1 - 2\xi^2)\,e^{-\xi^2/2}. \tag{26.25}$$

The nodes of this function are situated at the points $\xi = \pm\frac{1}{\sqrt{2}}$. In general, the function $\psi_n$ has $n$ nodes. The functions for several small values of $n$ are shown in Fig. 40.
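The same construction works for any $n$. The following Python sketch is an added illustration with a hypothetical function name: it builds the terminating polynomial $g_n(\xi)$ from the recurrence $g_{m+2} = g_m\,(2m + 1 - \varepsilon)/[(m+2)(m+1)]$ with $\varepsilon = 2n + 1$, normalising the lowest non-zero coefficient to unity and using exact rational arithmetic.

```python
from fractions import Fraction

def oscillator_polynomial(n):
    """Coefficients [g_0, g_1, ..., g_n] of the polynomial g_n(xi) for eps = 2n + 1.

    The lowest coefficient of the right parity (g_0 for even n, g_1 for odd n)
    is set to 1; the other parity is absent.
    """
    eps = 2 * n + 1
    g = [Fraction(0)] * (n + 1)
    start = n % 2
    g[start] = Fraction(1)
    for m in range(start, n - 1, 2):
        g[m + 2] = g[m] * Fraction(2 * m + 1 - eps, (m + 2) * (m + 1))
    return g

if __name__ == "__main__":
    for n in range(5):
        print(n, [str(c) for c in oscillator_polynomial(n)])
```

For $n = 2$ this gives the coefficients $1, 0, -2$, i.e., the factor $1 - 2\xi^2$ of the second excited state; up to a constant factor these polynomials are the Hermite polynomials.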

We show the eigenvalue distribution and the potential energy curve in Fig. 41. It is very curious that the eigenvalues are separated by equal intervals. The oscillator problem qualitatively resembles that of an infinitely deep rectangular well (Sec. 25), but the energy level in the well increases in proportion to the square of the number.

[Fig. 40] [Fig. 41]


Exercises 

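1) Show that, neglecting a constant factor, the function $g_n(\xi)$ may be written in the form

$$g_n(\xi) = e^{\xi^2}\left(\frac{d}{d\xi}\right)^n e^{-\xi^2}.$$

Verify this by substitution in equation (26.9), in which $\varepsilon = 2n + 1$.

2) Normalize the functions $\psi_0$ and $\psi_1$, taking advantage of the fact that

$$\int_{-\infty}^{\infty} e^{-\xi^2}\,d\xi = \sqrt{\pi}, \qquad \int_{-\infty}^{\infty} \xi^2 e^{-\xi^2}\,d\xi = \frac{\sqrt{\pi}}{2}$$

(see exercises of Sec. 39).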


Sec. 27. Quantization of the Electromagnetic Field

The electromagnetic field as a mechanical system. An electromagnetic 
field in a vacuum may be regarded as a mechanical system; this was 
shown in Sec. 13. It possesses a Lagrangian function, action, and so 
on. We are, therefore, justified in posing the problem of quantization 
of this system, i.e., applying quantum mechanics to it. 

The basic difference between electrodynamics and the mechanics of point masses is that the degrees of freedom of an electromagnetic field are distributed continuously: in order to specify the field at a given instant of time, we must define its value at every point of space. In this sense electrodynamics resembles the mechanics of liquids or elastic bodies, if one regards them as continuous media ignoring the atomic structure of the substance. The degrees of freedom of a field are labelled by the coordinates of points in space, while the amplitude values of the potential are generalized coordinates [see (13.2)]. Potentials are usually chosen as generalized coordinates because they satisfy second-order equations in time, as do generalized coordinates in mechanics.

The potentials satisfy the Lorentz condition, which reduces to 
div A = 0, provided the gauge transformations are chosen so as to 
eliminate the scalar potential. 

The electromagnetic field coordinates defined in this way are not independent of one another. Indeed, the equations of electrodynamics involve coordinate derivatives, i.e., differences of field values at infinitely close points. In this sense, field equations resemble the equations for coupled oscillations: they are linear, but each one involves several generalized coordinates instead of one. The equations for coupled oscillations can be reduced to normal coordinates which are mutually independent. The same can be done with the wave equations of electrodynamics, thus separating the dependent variables therein. This considerably simplifies the application of quantum mechanics to radiation.

Clearly shown here is the generality of the methods of analytical 
mechanics: they permit determining generalized coordinates and 
momenta in such manner that quantum laws can then be applied 
uniquely. 

The electromagnetic field in a closed volume. We must first of all represent the electromagnetic field as some kind of closed system, since quantum mechanics is most conveniently applied to such systems. We can assume, for example, that the radiation is contained in a box with mirror-type reflecting walls. At the walls of such an imaginary box ($x = 0$ or $x = a_1$, $y = 0$ or $y = a_2$, $z = 0$ or $z = a_3$) the normal components of the Poynting vector $\mathbf{U}$ become zero. However, it is simpler to suppose that the field is periodic in space, and the lengths of the periods in three perpendicular directions are equal to the dimensions of the box; the period of the field along $x$ is equal to $a_1$, that along $y$ is equal to $a_2$, and that along $z$ is equal to $a_3$. In other words,

$$\mathbf{A}(x, y, z) = \mathbf{A}(x + a_1, y, z) = \mathbf{A}(x, y + a_2, z) = \mathbf{A}(x, y, z + a_3). \tag{27.1}$$

We have, as it were, divided space into physically identical regions, 
after which it is sufficient to consider a single region. 

The solution of equations describing a harmonic field in free space was found in Sec. 17 [see (17.21)]. Introducing a time dependence into the amplitude factor, we represent the potential in the following form:

$$\mathbf{A}(\mathbf{k}, \mathbf{r}, t) = \mathbf{A}_{\mathbf{k}}(t)\,e^{i\mathbf{kr}} + \mathbf{A}^*_{\mathbf{k}}(t)\,e^{-i\mathbf{kr}}, \tag{27.2}$$

where its reality is shown explicitly.

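The potential satisfies the Lorentz condition which, for a plane wave, can be reduced to the form $\operatorname{div}\mathbf{A} = 0$ (since $\varphi = 0$); whence, by (11.27), we obtain

$$\operatorname{div}\mathbf{A}(\mathbf{k}, \mathbf{r}, t) = \operatorname{div}\left(\mathbf{A}_{\mathbf{k}}e^{i\mathbf{kr}}\right) + \operatorname{div}\left(\mathbf{A}^*_{\mathbf{k}}e^{-i\mathbf{kr}}\right) = \left(\mathbf{A}_{\mathbf{k}}\nabla e^{i\mathbf{kr}}\right) + \left(\mathbf{A}^*_{\mathbf{k}}\nabla e^{-i\mathbf{kr}}\right) = i(\mathbf{k}\mathbf{A}_{\mathbf{k}})\,e^{i\mathbf{kr}} - i(\mathbf{k}\mathbf{A}^*_{\mathbf{k}})\,e^{-i\mathbf{kr}} = 0.$$

In order that this equation be satisfied for all $\mathbf{r}$, the coefficient of each exponential term must vanish. In other words, the vectors $\mathbf{A}_{\mathbf{k}}$ and $\mathbf{A}^*_{\mathbf{k}}$ are perpendicular to the wave vector $\mathbf{k}$:

$$(\mathbf{k}\mathbf{A}_{\mathbf{k}}) = 0, \qquad (\mathbf{k}\mathbf{A}^*_{\mathbf{k}}) = 0. \tag{27.3}$$

For each $\mathbf{k}$ there exist two mutually perpendicular vectors $\mathbf{A}^{\sigma}_{\mathbf{k}}$ ($\sigma = 1, 2$) corresponding to two possible wave polarizations. Any vector in a plane perpendicular to $\mathbf{k}$ can be resolved into $\mathbf{A}^{1}_{\mathbf{k}}$ and $\mathbf{A}^{2}_{\mathbf{k}}$.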

We shall now apply the periodicity condition (27.1) to each term of (27.2) separately. For the first term we obtain

$$e^{i(k_x x + k_y y + k_z z)} = e^{i[k_x(x + a_1) + k_y y + k_z z]} = e^{i[k_x x + k_y(y + a_2) + k_z z]} = e^{i[k_x x + k_y y + k_z(z + a_3)]},$$

whence it follows that

$$e^{ik_x a_1} = e^{ik_y a_2} = e^{ik_z a_3} = 1.$$

Therefore the components of the wave vector should be

$$k_x = \frac{2\pi n_1}{a_1}, \qquad k_y = \frac{2\pi n_2}{a_2}, \qquad k_z = \frac{2\pi n_3}{a_3}, \tag{27.4}$$

where $n_1$, $n_2$, $n_3$ are integers of any sign.

Consequently, each harmonic oscillation is given by three integers $n_1$, $n_2$, $n_3$ and a polarization $\sigma$, which can take two values. As indicated in Sec. 13, $\mathbf{A}^{\sigma}_{\mathbf{k}}$ is a generalized coordinate. The number of such coordinates is infinite, but at least forms a denumerable set, and not a continuous set equivalent to the set of all points in space.

This, then, is the basic simplification introduced by the periodicity condition. This condition is, of course, only a mathematical convenience, there being no basic periods $a_1$, $a_2$, and $a_3$ in any final physical result.

An electromagnetic field is specified if its oscillation amplitudes are known for all values of $n_1$, $n_2$, $n_3$, and $\sigma$. The general solution is equal to the sum of partial solutions (27.2), due to the linearity of the electrodynamical equations:

$$\mathbf{A}(\mathbf{r}, t) = \sum_{\mathbf{k},\sigma}\mathbf{A}^{\sigma}(\mathbf{k}, \mathbf{r}, t) = \sum_{\mathbf{k},\sigma}\left(\mathbf{A}^{\sigma}_{\mathbf{k}}e^{i\mathbf{kr}} + \mathbf{A}^{\sigma *}_{\mathbf{k}}e^{-i\mathbf{kr}}\right). \tag{27.5}$$

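Energy of the field. An electric field is calculated in accordance with (12.29) and (27.5):

$$\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} = -\frac{1}{c}\sum_{\mathbf{k},\sigma}\left(\dot{\mathbf{A}}^{\sigma}_{\mathbf{k}}e^{i\mathbf{kr}} + \dot{\mathbf{A}}^{\sigma *}_{\mathbf{k}}e^{-i\mathbf{kr}}\right). \tag{27.6}$$

The amplitudes of the field depend harmonically upon time, so that

$$\dot{\mathbf{A}}^{\sigma}_{\mathbf{k}} = -i\omega_{\mathbf{k}}\mathbf{A}^{\sigma}_{\mathbf{k}}, \qquad \dot{\mathbf{A}}^{\sigma *}_{\mathbf{k}} = i\omega_{\mathbf{k}}\mathbf{A}^{\sigma *}_{\mathbf{k}}. \tag{27.7}$$

Therefore

$$\mathbf{E} = \frac{i}{c}\sum_{\mathbf{k},\sigma}\omega_{\mathbf{k}}\left(\mathbf{A}^{\sigma}_{\mathbf{k}}e^{i\mathbf{kr}} - \mathbf{A}^{\sigma *}_{\mathbf{k}}e^{-i\mathbf{kr}}\right). \tag{27.8}$$

The magnetic field is determined from (21.28) and (12.28):

$$\mathbf{H} = \operatorname{rot}\mathbf{A} = \sum_{\mathbf{k},\sigma}\left(\left[\nabla e^{i\mathbf{kr}}, \mathbf{A}^{\sigma}_{\mathbf{k}}\right] + \left[\nabla e^{-i\mathbf{kr}}, \mathbf{A}^{\sigma *}_{\mathbf{k}}\right]\right) = i\sum_{\mathbf{k},\sigma}\left(\left[\mathbf{k}\mathbf{A}^{\sigma}_{\mathbf{k}}\right]e^{i\mathbf{kr}} - \left[\mathbf{k}\mathbf{A}^{\sigma *}_{\mathbf{k}}\right]e^{-i\mathbf{kr}}\right). \tag{27.9}$$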

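Let us now calculate the field energy. According to (13.21)

$$\mathcal{E} = \frac{1}{8\pi}\int\left(E^2 + H^2\right)dV. \tag{27.10}$$

To obtain $E^2$ we perform summation over $\mathbf{k}$, $\mathbf{k}'$, $\sigma$, and $\sigma'$:

$$E^2 = -\frac{1}{c^2}\sum_{\mathbf{k},\mathbf{k}',\sigma,\sigma'}\omega_{\mathbf{k}}\omega_{\mathbf{k}'}\left(\mathbf{A}^{\sigma}_{\mathbf{k}}\mathbf{A}^{\sigma'}_{\mathbf{k}'}e^{i(\mathbf{k}+\mathbf{k}')\mathbf{r}} - \mathbf{A}^{\sigma}_{\mathbf{k}}\mathbf{A}^{\sigma' *}_{\mathbf{k}'}e^{i(\mathbf{k}-\mathbf{k}')\mathbf{r}} - \mathbf{A}^{\sigma *}_{\mathbf{k}}\mathbf{A}^{\sigma'}_{\mathbf{k}'}e^{-i(\mathbf{k}-\mathbf{k}')\mathbf{r}} + \mathbf{A}^{\sigma *}_{\mathbf{k}}\mathbf{A}^{\sigma' *}_{\mathbf{k}'}e^{-i(\mathbf{k}+\mathbf{k}')\mathbf{r}}\right). \tag{27.11}$$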


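It is expedient, when integrating $E^2$ over a volume, to change the order of summation and integration, each volume integral being reduced to the product of three integrals of the following type:

$$\int_0^{a_1} e^{2\pi i(n_1 + n_1')x/a_1}\,dx = \frac{a_1}{2\pi i(n_1 + n_1')}\left[e^{2\pi i(n_1 + n_1')} - 1\right] = 0 \tag{27.12}$$

for $n_1 + n_1' \neq 0$.

If $n_1 + n_1' = 0$ this integral is equal to $a_1$. Therefore the triple integral assumes one of two values:

$$\int e^{i(\mathbf{k}+\mathbf{k}')\mathbf{r}}\,dV = \begin{cases} 0, & \mathbf{k}' \neq -\mathbf{k}, \\ V, & \mathbf{k}' = -\mathbf{k}. \end{cases} \tag{27.13}$$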
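The one-dimensional integral behind this rule is simple enough to verify by direct quadrature. The Python sketch below is an added illustration with hypothetical names: it assumes wave-vector components of the form $2\pi n/a$ with integer $n$ and estimates the integral of $e^{2\pi i(n + n')x/a}$ over one period by the midpoint rule.

```python
import cmath
import math

def mode_overlap(n, n_prime, a, steps=2000):
    """Midpoint-rule estimate of the integral of exp(2*pi*i*(n + n')*x/a) over 0 <= x <= a."""
    h = a / steps
    total = 0j
    for j in range(steps):
        x = (j + 0.5) * h
        total += cmath.exp(2j * math.pi * (n + n_prime) * x / a)
    return total * h

if __name__ == "__main__":
    print(abs(mode_overlap(3, 2, 2.0)))    # n + n' != 0: the integral vanishes
    print(mode_overlap(3, -3, 2.0))        # n + n' = 0: the integral equals the period a
```

Taking the product of three such integrals, one for each axis, gives either zero or the volume $a_1 a_2 a_3$, which is the content of the rule for the triple integral.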


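It follows that in the expression for $\int E^2\,dV$ the double summation with respect to $\mathbf{k}$ and $\mathbf{k}'$ becomes a single one after integration, and we must replace $\mathbf{k}'$ by $-\mathbf{k}$ in the terms involving the product $\mathbf{A}^{\sigma}_{\mathbf{k}}\mathbf{A}^{\sigma'}_{\mathbf{k}'}$. In terms containing $\mathbf{A}^{\sigma}_{\mathbf{k}}\mathbf{A}^{\sigma' *}_{\mathbf{k}'}$ we replace $\mathbf{k}'$ by $\mathbf{k}$ due to the factor $e^{-i\mathbf{k}'\mathbf{r}}$. Thus,

$$\int E^2\,dV = -\frac{V}{c^2}\sum_{\mathbf{k},\sigma,\sigma'}\omega_{\mathbf{k}}^2\left(\mathbf{A}^{\sigma}_{\mathbf{k}}\mathbf{A}^{\sigma'}_{-\mathbf{k}} - \mathbf{A}^{\sigma}_{\mathbf{k}}\mathbf{A}^{\sigma' *}_{\mathbf{k}} - \mathbf{A}^{\sigma *}_{\mathbf{k}}\mathbf{A}^{\sigma'}_{\mathbf{k}} + \mathbf{A}^{\sigma *}_{\mathbf{k}}\mathbf{A}^{\sigma' *}_{-\mathbf{k}}\right). \tag{27.14}$$

But (given $\sigma \neq \sigma'$) $\mathbf{A}^{\sigma}_{\mathbf{k}}$ and $\mathbf{A}^{\sigma'}_{-\mathbf{k}}$, $\mathbf{A}^{\sigma}_{\mathbf{k}}$ and $\mathbf{A}^{\sigma' *}_{\mathbf{k}}$ are orthogonal vectors. Therefore, instead of the double summation with respect to $\sigma$ and $\sigma'$, there also remains a single sum with respect to $\sigma$:

$$\int E^2\,dV = -\frac{V}{c^2}\sum_{\mathbf{k},\sigma}\omega_{\mathbf{k}}^2\left(\mathbf{A}^{\sigma}_{\mathbf{k}}\mathbf{A}^{\sigma}_{-\mathbf{k}} + \mathbf{A}^{\sigma *}_{\mathbf{k}}\mathbf{A}^{\sigma *}_{-\mathbf{k}} - 2\mathbf{A}^{\sigma}_{\mathbf{k}}\mathbf{A}^{\sigma *}_{\mathbf{k}}\right). \tag{27.15}$$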

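When calculating the integral of the square of the magnetic field we make use of the rule (27.13). But since the product $[\mathbf{k}'\mathbf{A}_{\mathbf{k}'}^{\sigma'}]$ is replaced by $-[\mathbf{k}\mathbf{A}_{-\mathbf{k}}^{\sigma'}]$ if $\mathbf{k}'=-\mathbf{k}$, we get

$$\int\mathbf{H}^{2}\,dV=V\sum_{\mathbf{k},\sigma,\sigma'}\left([\mathbf{k}\mathbf{A}_{\mathbf{k}}^{\sigma}][\mathbf{k}\mathbf{A}_{-\mathbf{k}}^{\sigma'}]+[\mathbf{k}\mathbf{A}_{\mathbf{k}}^{\sigma*}][\mathbf{k}\mathbf{A}_{-\mathbf{k}}^{\sigma'*}]+2[\mathbf{k}\mathbf{A}_{\mathbf{k}}^{\sigma}][\mathbf{k}\mathbf{A}_{\mathbf{k}}^{\sigma'*}]\right). \tag{27.16}$$

The vector products may be expressed by known formulae:

$$[\mathbf{k}\mathbf{A}_{\mathbf{k}}^{\sigma}][\mathbf{k}\mathbf{A}_{\mathbf{k}}^{\sigma'*}]=k^{2}\mathbf{A}_{\mathbf{k}}^{\sigma}\mathbf{A}_{\mathbf{k}}^{\sigma'*}-(\mathbf{k}\mathbf{A}_{\mathbf{k}}^{\sigma})(\mathbf{k}\mathbf{A}_{\mathbf{k}}^{\sigma'*})=k^{2}\mathbf{A}_{\mathbf{k}}^{\sigma}\mathbf{A}_{\mathbf{k}}^{\sigma'*}, \tag{27.17}$$

where we have used the transversality condition (27.3). This expression becomes zero for $\sigma \neq \sigma'$. Thus,

$$\int\mathbf{H}^{2}\,dV=V\sum_{\mathbf{k},\sigma}k^{2}\left(\mathbf{A}_{\mathbf{k}}^{\sigma}\mathbf{A}_{-\mathbf{k}}^{\sigma}+\mathbf{A}_{\mathbf{k}}^{\sigma*}\mathbf{A}_{-\mathbf{k}}^{\sigma*}+2\mathbf{A}_{\mathbf{k}}^{\sigma}\mathbf{A}_{\mathbf{k}}^{\sigma*}\right). \tag{27.18}$$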

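Combining (27.15) and (27.18) and taking advantage of the fact that $\omega_{\mathbf{k}}^{2}=c^{2}k^{2}$, we find that the terms containing $\mathbf{A}_{\mathbf{k}}^{\sigma}\mathbf{A}_{-\mathbf{k}}^{\sigma}$ and $\mathbf{A}_{\mathbf{k}}^{\sigma*}\mathbf{A}_{-\mathbf{k}}^{\sigma*}$ cancel, so that

$$\mathscr{E}=\frac{1}{8\pi}\int\left(\mathbf{E}^{2}+\mathbf{H}^{2}\right)dV=\frac{V}{2\pi c^{2}}\sum_{\mathbf{k},\sigma}\omega_{\mathbf{k}}^{2}\,\mathbf{A}_{\mathbf{k}}^{\sigma}\mathbf{A}_{\mathbf{k}}^{\sigma*}. \tag{27.19}$$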


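Passing to real variables. In order to apply the usual equations of quantum mechanics to the electromagnetic field, it is convenient to pass to real variables

$$\mathbf{A}_{\mathbf{k}}^{\sigma}=\sqrt{\frac{\pi c^{2}}{V}}\left(Q_{\mathbf{k}}^{\sigma}+\frac{i}{\omega_{\mathbf{k}}}P_{\mathbf{k}}^{\sigma}\right)\mathbf{e}_{\mathbf{k}}^{\sigma}, \tag{27.20}$$

$$\mathbf{A}_{\mathbf{k}}^{\sigma*}=\sqrt{\frac{\pi c^{2}}{V}}\left(Q_{\mathbf{k}}^{\sigma}-\frac{i}{\omega_{\mathbf{k}}}P_{\mathbf{k}}^{\sigma}\right)\mathbf{e}_{\mathbf{k}}^{\sigma}, \tag{27.21}$$

where $\mathbf{e}_{\mathbf{k}}^{\sigma}$ is a unit vector in the direction of polarization of the wave. We get $\mathscr{E}$ expressed in terms of a sum of energies, or of the Hamiltonian functions for linear harmonic oscillators:

$$\mathscr{E}=\frac{1}{2}\sum_{\mathbf{k},\sigma}\left[\left(P_{\mathbf{k}}^{\sigma}\right)^{2}+\omega_{\mathbf{k}}^{2}\left(Q_{\mathbf{k}}^{\sigma}\right)^{2}\right]. \tag{27.22}$$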

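If we regard $P_{\mathbf{k}}^{\sigma}$ and $Q_{\mathbf{k}}^{\sigma}$ as ordinary classical dynamical variables, they satisfy the equations for a linear harmonic oscillator. Indeed, it follows from (27.22) and (10.18) that $\dot{P}_{\mathbf{k}}^{\sigma}=-\omega_{\mathbf{k}}^{2}Q_{\mathbf{k}}^{\sigma}$ and $\dot{Q}_{\mathbf{k}}^{\sigma}=P_{\mathbf{k}}^{\sigma}$. This agrees with the harmonic time dependence of the amplitudes $\mathbf{A}_{\mathbf{k}}^{\sigma}$, $\mathbf{A}_{\mathbf{k}}^{\sigma*}$.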
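The passage to real variables can be verified for a single mode. The sketch below assumes the substitution $A = \sqrt{\pi c^2/V}\,(Q + iP/\omega)$ for the scalar part of the amplitude and the single-mode energy $V\omega^2|A|^2/(2\pi c^2)$; the function names and the numerical values are purely illustrative.

```python
import cmath
import math


def amplitude(Q, P, omega, V=1.0, c=1.0):
    """Scalar part of the complex amplitude: sqrt(pi c^2 / V) * (Q + i P / omega)."""
    return math.sqrt(math.pi * c ** 2 / V) * complex(Q, P / omega)


def mode_energy_from_amplitude(A, omega, V=1.0, c=1.0):
    """Energy of one mode written through the amplitude: V omega^2 |A|^2 / (2 pi c^2)."""
    return V * omega ** 2 * abs(A) ** 2 / (2.0 * math.pi * c ** 2)


def oscillator_energy(Q, P, omega):
    """Hamiltonian function of a linear harmonic oscillator of unit mass."""
    return 0.5 * (P ** 2 + omega ** 2 * Q ** 2)


def evolve(Q0, P0, omega, t):
    """Exact solution of dQ/dt = P, dP/dt = -omega^2 Q."""
    Q = Q0 * math.cos(omega * t) + (P0 / omega) * math.sin(omega * t)
    P = P0 * math.cos(omega * t) - omega * Q0 * math.sin(omega * t)
    return Q, P


Q0, P0, omega = 0.3, 1.2, 2.0
A0 = amplitude(Q0, P0, omega, V=5.0, c=3.0)
print(mode_energy_from_amplitude(A0, omega, V=5.0, c=3.0), oscillator_energy(Q0, P0, omega))

# The oscillator equations reproduce the harmonic law A(t) = A(0) exp(-i omega t).
Qt, Pt = evolve(Q0, P0, omega, t=0.7)
print(amplitude(Qt, Pt, omega, V=5.0, c=3.0), A0 * cmath.exp(-1j * omega * 0.7))
```

Both printed pairs agree, which illustrates why the oscillator description and the field description are equivalent.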

Each separate oscillator is characterized by four integers $n_1$, $n_2$, $n_3$, and $\sigma$, which label the independent degrees of freedom of the electromagnetic field. The $Q_{\mathbf{k}}^{\sigma}$ are normal coordinates of the electromagnetic field [see (7.31)].

Quantization of the electromagnetic field. The result we have obtained (27.22) is of fundamental significance. It provides the most simple and vivid method of applying quantum mechanics to the electromagnetic field. Indeed, the equations of nonquantized oscillators are equivalent to the electrodynamical equations of a nonquantized electromagnetic field, the only difference being that they describe the field in other variables. But the oscillator problem in quantum mechanics having already been solved in Sec. 26, quantization can be performed in these variables as before. Quantization of the motion of oscillators representing a field in vacuo is just this quantization of the equations of electrodynamics performed in the appropriate system of variables. From (26.21), the energy of an oscillator in the $N$th quantum state was

$$\mathscr{E}=h\omega\left(N+\frac{1}{2}\right).$$

Therefore, with the aid of equation (27.22), the energy of the electromagnetic field is

$$\mathscr{E}=\sum_{n_{1},n_{2},n_{3},\sigma}h\omega_{\mathbf{k}}\left(N_{\mathbf{k},\sigma}+\frac{1}{2}\right)=\sum_{\mathbf{k},\sigma}h\omega_{\mathbf{k}}\left(N_{\mathbf{k},\sigma}+\frac{1}{2}\right). \tag{27.23}$$


The number $N_{\mathbf{k},\sigma}$ gives the number of the quantum state of the oscillator, classified by the numbers $n_1$, $n_2$, $n_3$, and the polarization $\sigma$.

Quanta. We see that the energy of an oscillator can experience increments equal to $h\omega_{\mathbf{k}}$. This quantity of energy is called the energy quantum of the electromagnetic field. Disregarding the zero energy $h\omega_{\mathbf{k}}/2$ for the time being, we see that the field energy is equal to the sum of the energies of its quanta $h\omega_{\mathbf{k}}$. Thus is found a quantum expression for the energy of an electromagnetic field. It will be shown in exercise 1 of this section that the momentum of a field is equal to the sum of the momenta of its quanta, while the momentum of each quantum is found to be related to its energy by the expression

$$\mathbf{p}=h\mathbf{k}=\frac{\mathscr{E}}{c}\,\frac{\mathbf{k}}{k}. \tag{27.24}$$


Thus, a quantum possesses the properties of a particle of zero mass. 
The possibility of the existence of such particles was elucidated in 
connection with equation (21.16). 
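A short numerical illustration of these relations may be useful. It is a sketch under stated assumptions: the constant written $h$ in the text is taken to be $1.0546 \times 10^{-27}$ erg·sec, CGS units are used throughout, and the helper names `quantum` and `field_energy` as well as the 500 nm wavelength are illustrative.

```python
import math

H_BAR = 1.0546e-27  # erg*sec; the constant denoted by h in the text (assumed value)
C = 2.9979e10       # cm/sec


def quantum(wavelength_cm):
    """Return (energy in erg, momentum in g*cm/sec) of one quantum."""
    k = 2.0 * math.pi / wavelength_cm  # wave number
    omega = C * k                      # dispersion law omega = c k
    return H_BAR * omega, H_BAR * k


def field_energy(modes, with_zero_point=True):
    """Sum h*omega*(N + 1/2) over an iterable of (omega, N) pairs."""
    shift = 0.5 if with_zero_point else 0.0
    return sum(H_BAR * omega * (n + shift) for omega, n in modes)


energy, momentum = quantum(5.0e-5)  # visible light, 500 nm
# The last number printed is the ratio energy / (c * momentum), equal to 1 for a zero-mass particle.
print(energy, momentum, energy / (C * momentum))
```

The ratio of energy to $c$ times momentum equals unity for every wavelength, which is the property of a particle of zero mass.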

Polarization of quanta. A quantum has one more, so to say, internal degree of freedom, that of polarization. This peculiar degree of freedom corresponds to the “coordinate” $\sigma$, taking only two values $\sigma = 1$ and $\sigma = 2$. The energy does not depend upon $\sigma$. But, of course, in order to fully define the oscillation corresponding to a given quantum, we must indicate the number $\sigma = 1, 2$ as well as the three numbers $n_1$, $n_2$, $n_3$. We observe that these quanta relate only to the transverse field of electromagnetic waves and do not describe the Coulomb field.

The classical approximation of quanta and electrons. A quantum should by no means be regarded as the outcome of some mathematical sleight of hand that led us to equation (27.23). The quantum is an elementary particle just as real as the electron. For example, when X-rays are scattered by electrons, the energy of each individual quantum $h\omega$ and its momentum $h\mathbf{k}$ are involved in the general law of conservation of energy and momentum in collisions in the same way as for any other particle. In scattering, the frequency of a quantum diminishes, in direct proportion to its energy.

The essential difference between the quantum and the electron consists in the fact that in classical theory there is nothing to which the quantum corresponds; its energy $\mathscr{E}=h\omega$ and its momentum $\mathbf{p}=h\mathbf{k}$ tend to zero when $h$ tends to zero. Yet the quantum mechanics of the electron admits of a classical approximation, since the quantum quantities for the electron become corresponding classical quantities, which do not tend to zero even when we take $h \to 0$. The relation (27.24) has a counterpart in classical electrodynamics; the energy density of an electromagnetic wave was shown in Sec. 17 to be related to its momentum density by means of the factor $c$. In the limiting transition to classical theory, the energy of each quantum is regarded as infinitely small while their number $N_{\mathbf{k},\sigma}$ is infinitely large, so that the wave amplitude remains finite.

Occupation numbers. Passing from Cartesian coordinates to new independent variables (the components of the wave vector $\mathbf{k}$), we renumbered the radiation degrees of freedom, the quantities $Q_{\mathbf{k}}^{\sigma}$ now being the generalized coordinates. The state of a field is specified if all the “occupation” numbers $N_{\mathbf{k},\sigma}$ are known, because the number $N_{\mathbf{k},\sigma}$ defines the quantum state the given harmonic oscillator is in, i.e., the number of quanta in the state $\mathbf{k}, \sigma$. The numbers $N_{\mathbf{k},\sigma}$ may be regarded as the quantum variables of an electromagnetic field. When a field interacts with a radiator (for example, an atom) these numbers change. For example, if the number $N_{\mathbf{k},\sigma}$ has increased by unity this means that a quantum of corresponding frequency, direction, and polarization has been emitted.

The ground state of an electromagnetic field. Let us now examine expression (27.23) when $N_{\mathbf{k},\sigma} = 0$. In other words, let us determine the ground state energy of the electromagnetic field. According to (27.23) it is

$$\mathscr{E}^{(0)}=\frac{1}{2}\sum_{n_{1},n_{2},n_{3},\sigma}h\omega_{\mathbf{k}}. \tag{27.25}$$

But since the numbers $n_1$, $n_2$, and $n_3$ run through an infinite set of values, the sum (27.25) is infinitely large. It must be said that in this case the theory is not fundamentally defective because the zero energy itself (27.25) does not appear in any expression; the field energy is always measured from the ground state.
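The divergence of the zero-point sum can be made tangible by cutting the sum off at a largest mode number and watching it grow. The sketch below assumes a cubic periodicity box of side `a` with wave-vector components $2\pi n_i/a$, $n_i = 0, \pm 1, \pm 2, \ldots$, and units in which $h = c = 1$; the cutoff `n_max` is an artificial device introduced only for this illustration.

```python
import math


def zero_point_energy(n_max, a=1.0, h=1.0, c=1.0):
    """Sum of h*omega/2 over all modes with |n_i| <= n_max and two polarizations."""
    total = 0.0
    for n1 in range(-n_max, n_max + 1):
        for n2 in range(-n_max, n_max + 1):
            for n3 in range(-n_max, n_max + 1):
                k = 2.0 * math.pi / a * math.sqrt(n1 * n1 + n2 * n2 + n3 * n3)
                total += 2 * 0.5 * h * c * k  # factor 2: two polarizations
    return total


# The sum grows roughly as the fourth power of the cutoff and has no finite limit.
for n_max in (2, 4, 8):
    print(n_max, zero_point_energy(n_max))
```

Each doubling of the cutoff multiplies the sum by more than ten, so no finite value is approached; only differences from the ground state are meaningful.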

At the same time, the ground state of the quantum oscillators of an electromagnetic field leads to actually observable effects because the amplitude of a harmonic oscillator in the ground state is not equal to zero. It takes on all possible values, and, in accordance with (26.22), the probability of a certain value of $Q$ is proportional to $|\psi_0(Q)|^2$.

In an electromagnetic field, the part of the oscillator coordinates is played by the generalized coordinates $Q$, in terms of which the field amplitudes are expressed linearly. Therefore, one can by no means assert that the field amplitudes are equal to zero in the ground state of an electromagnetic field (i.e., in the absence of quanta). The probabilities of definite amplitude values are given by the harmonic-oscillator wave functions. These functions are equal, up to a normalizing factor, to $e^{-\omega_{\mathbf{k}}Q^{2}/2h}$ in $Q$ coordinates.
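The spread of the field coordinate in the ground state can be estimated numerically. The sketch assumes the standard ground-state wave function of an oscillator of unit mass, $\psi_0 \propto e^{-\omega Q^2/2h}$, so that the probability density is proportional to $e^{-\omega Q^2/h}$; the function name and the integration parameters are illustrative.

```python
import math


def mean_square_Q(omega, h=1.0, half_width_sigmas=10.0, steps=20000):
    """Mean value of Q^2 for a probability density proportional to exp(-omega Q^2 / h)."""
    sigma = math.sqrt(h / (2.0 * omega))
    q_max = half_width_sigmas * sigma
    dq = 2.0 * q_max / steps
    norm = 0.0
    second_moment = 0.0
    for j in range(steps):
        q = -q_max + (j + 0.5) * dq
        weight = math.exp(-omega * q * q / h)
        norm += weight * dq
        second_moment += q * q * weight * dq
    return second_moment / norm


omega = 3.0
# Numerical value against the closed form h / (2 omega) for this Gaussian density.
print(mean_square_Q(omega), 1.0 / (2.0 * omega))
```

The mean square of $Q$ is $h/2\omega$, which is not zero: the field amplitudes fluctuate even when no quanta are present.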




Electron-level shift produced by the ground state of a field. The ground state of an electromagnetic field affects observable quantities. One of the most important effects of this type consists in the following.

Let an electron move in the potential field of a nucleus. The value of the electromagnetic potentials of a field acting on the electron is usually chosen as $\mathbf{A} = 0$, $\varphi = \dfrac{Ze}{r}$. Only the static Coulomb field has been taken into account. In actual fact, the potential of the radiating field must be added to the static potential, for example, in the form (27.5). As was indicated, this potential must not be considered equal to zero even when there are no quanta in the field. The field of radiation affects the energy eigenvalues of the electrons in an atom. They prove different from what they should be in a purely Coulomb nuclear field with $\mathbf{A} = 0$, $\varphi = \dfrac{Ze}{r}$.

The solution of the problem of finding the energy eigenvalues for an atom, with account taken of the radiation field, encounters considerable inherent difficulties. First of all, this problem does not lend itself to a precise solution by means of mathematical analysis. It becomes necessary to solve it approximately, taking the correction to the energy produced by the radiating field as a small quantity. In actuality, however, a direct calculation of this correction leads to divergent integrals, i.e., to infinite expressions.

Nevertheless, it is possible to redetermine this correction so that a finite expression is obtained. To do this, one has to consider the analogous correction for the energy of a “free” electron not influenced by the external field of the nucleus, and consider the difference of two infinite integrals. If in doing so we take great care to preserve the relativistic invariance of the expressions, the subtraction turns out to be a completely unique operation and does not contain any indeterminate quantity of the form $\infty - \infty$. The final correction does indeed turn out to be a small quantity compared with the binding energy of the electron in the atom in its ground state ($4 \times 10^{-6}$ ev for the hydrogen atom). The value of the correction is in excellent agreement with modern radiospectroscopic data.

Subtraction of infinities. The meaning of such a subtraction consists in the following. Physically, an electron is inseparable from its charge, i.e., from its radiation field. When we talk about a “free” electron, we always imply that the electron interacts with a radiation field, which cannot be regarded as equal to zero. The rest energy of an electron is equal to $mc^2$, where $m$ denotes the observed value of the mass. In reality, this quantity encompasses the energy of all interactions of the electron, including interaction with the field of radiation.

Thus, in calculating the energy of an electron in a Coulomb field, the mass of the electron must be redefined so that, in the absence of any static external field, all the energy has a finite value $mc^2$. This redefinition operation or, as it is called, “renormalization” of mass allows us to find a finite quantity for the energy eigenvalue of an electron in an atom. Concerning the levels in a hydrogen atom, see Secs. 31 and 38.

The renormalization operation consists in the fact that the mass which appears formally in the equations of mechanics, together with the mass due to the interaction of the electron with radiation, is regarded as the observed finite quantity $m = 0.9106 \times 10^{-27}$ gm. Only this known quantity appears in the final result for any calculated effect.

Difficulties of the theory. The appearance of divergent (i.e., infinite) expressions in quantum electrodynamics is a defect of the theory and indicates a certain internal contradiction. The modern form of the theory, as given by J. Schwinger, R. Feynman and F. Dyson, is apparently not yet final.

It is all the more remarkable that despite this imperfection, quantum electrodynamics is capable, with the aid of renormalization, of yielding correct and unique answers when calculating concrete quantities observed in experiment.


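Exercise

Calculate the electromagnetic field momentum in a vacuum in terms of the normal field coordinates.

In the expression (13.27)

$$\mathbf{P}=\frac{1}{4\pi c}\int[\mathbf{E}\mathbf{H}]\,dV$$

we substitute the electric and magnetic fields from (27.8) and (27.9). After integration over the volume we obtain by (27.13)

$$\mathbf{P}=\frac{V}{4\pi c^{2}}\sum_{\mathbf{k},\sigma,\sigma'}\omega_{\mathbf{k}}\left([\mathbf{A}_{\mathbf{k}}^{\sigma}[\mathbf{k}\mathbf{A}_{-\mathbf{k}}^{\sigma'}]]+[\mathbf{A}_{\mathbf{k}}^{\sigma}[\mathbf{k}\mathbf{A}_{\mathbf{k}}^{\sigma'*}]]+[\mathbf{A}_{\mathbf{k}}^{\sigma*}[\mathbf{k}\mathbf{A}_{\mathbf{k}}^{\sigma'}]]+[\mathbf{A}_{\mathbf{k}}^{\sigma*}[\mathbf{k}\mathbf{A}_{-\mathbf{k}}^{\sigma'*}]]\right).$$

We rearrange the double vector products:

$$\mathbf{P}=\frac{V}{4\pi c^{2}}\sum_{\mathbf{k},\sigma}\omega_{\mathbf{k}}\left(\mathbf{k}\,\mathbf{A}_{\mathbf{k}}^{\sigma}\mathbf{A}_{-\mathbf{k}}^{\sigma}+2\mathbf{k}\,\mathbf{A}_{\mathbf{k}}^{\sigma}\mathbf{A}_{\mathbf{k}}^{\sigma*}+\mathbf{k}\,\mathbf{A}_{\mathbf{k}}^{\sigma*}\mathbf{A}_{-\mathbf{k}}^{\sigma*}\right).$$

The quantities $\mathbf{k}\,\mathbf{A}_{\mathbf{k}}^{\sigma}\mathbf{A}_{-\mathbf{k}}^{\sigma}$ and $\mathbf{k}\,\mathbf{A}_{\mathbf{k}}^{\sigma*}\mathbf{A}_{-\mathbf{k}}^{\sigma*}$ are odd functions of $\mathbf{k}$, and disappear after summation over all $\mathbf{k}$. There remains only

$$\mathbf{P}=\frac{V}{2\pi c^{2}}\sum_{\mathbf{k},\sigma}\omega_{\mathbf{k}}\,\mathbf{k}\,\mathbf{A}_{\mathbf{k}}^{\sigma}\mathbf{A}_{\mathbf{k}}^{\sigma*}.$$

Substituting here the normal coordinates of the field from (27.20) and (27.21), we arrive at the expression

$$\mathbf{P}=\sum_{\mathbf{k},\sigma}\frac{\mathbf{k}}{\omega_{\mathbf{k}}}\cdot\frac{1}{2}\left[\left(P_{\mathbf{k}}^{\sigma}\right)^{2}+\omega_{\mathbf{k}}^{2}\left(Q_{\mathbf{k}}^{\sigma}\right)^{2}\right]=\sum_{\mathbf{k},\sigma}h\mathbf{k}\left(N_{\mathbf{k},\sigma}+\frac{1}{2}\right),$$

so that the momentum of each quantum is related to its energy by relation (27.24).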




Sec. 28. Quasi-Classical Approximation

The classical limit of a wave function. It was shown in Sec. 24 that the limiting transition from quantum mechanics to classical mechanics is performed by means of the substitution (24.5)

$$\psi=e^{\frac{i}{h}S}. \tag{28.1}$$

By substituting $\psi$ into (24.11) (i.e., into the Schrödinger equation), eliminating $e^{\frac{i}{h}S}$, and formally making $h$ tend to zero, we obtained the correct classical relation between energy and momentum. This limiting process signifies that the de Broglie wavelength $\lambda = 2\pi h/p$ is very small compared with the region in which the motion occurs.

It is sometimes useful not to carry the limiting process to its end, 
namely in those cases when the asymptotic form of the almost classi¬ 
cal wave function is important. If the wavelength is small compared 
with the region in which motion occurs, then the wave function has 
many nodes in this region and, as we know from Sec. 25, this corre¬ 
sponds to an energy eigenvalue with a large number. Thus, the passage 
from quantum laws of motion to classical laws is accomplished through 
a region in which the number of the eigenvalue is large compared with 
unity. If the energy eigenvalue is determined by several integers, as 
in the case with motion having several degrees of freedom (cf. the 
problem of the potential well), then all these numbers must be large 
compared with unity so that the motion should be close to the classical 
limit. The wave function in approximation (28.1) (where S is the classi¬ 
cally calculated action) thus allows us to determine eigenvalues with 
large numbers [see (28.18)]. 

Solutions with real values of the exponent. The wave function also does not vanish for real values of the exponent in (28.1), i.e., for imaginary values of $S$. Imaginary values can only occur in those regions of space into which, for a given energy, the trajectories of classical motion cannot enter, because the potential energy would be greater in that case than the total energy, corresponding to a negative kinetic energy or an imaginary velocity. In these cases, the square of the modulus of the wave function determines the probability of a particle penetrating into a classically unattainable region of motion. Naturally, this probability is nonzero only before the limiting transition, and not after it, when $h$ has already been eliminated from the equations. This is why approximation (28.1) is termed quasi-classical.

The quasi-classical approximation. Let us write the equations of transition to a quasi-classical approximation in a one-dimensional case. From (24.14), by substituting $\dfrac{\partial S}{\partial t}=-\mathscr{E}$ and $\nabla S=\dfrac{dS}{dx}$ (the one-dimensional case!) into it, we have

$$\frac{1}{2m}\left(\frac{dS}{dx}\right)^{2}+U=\mathscr{E}, \tag{28.2}$$

whence

$$S=\int\sqrt{2m(\mathscr{E}-U)}\,dx. \tag{28.3}$$

Unlike classical mechanics, equation (28.3) holds not only for $\mathscr{E} > U$, when the root is extracted from a positive quantity, but also when $\mathscr{E} < U$, when the action is imaginary.

In Sec. 25 we investigated a similar example of the precise wave equation for the problem of a potential well of finite depth. In the region of the well, the wave function was of the form $\psi = \sin\varkappa x$, while outside the well it approached zero exponentially. This resulted precisely when $\mathscr{E} < U$, i.e., in a classically unattainable region. Of course, an imaginary velocity signifies that a particle moving according to classical laws does not attain the given position. Therefore, the expression

$$\pm\frac{1}{h}\int\sqrt{2m(U-\mathscr{E})}\,dx \tag{28.4}$$

must no longer be understood as action, but simply as the exponent in the equation $\psi=e^{\frac{i}{h}S}$ extended to the region $\mathscr{E} < U$, if there is such a region. When $h \to 0$, the wave function will be damped in this region infinitely quickly, like $e^{-\infty}$; and this denotes the unattainability of points where for classical motion $\mathscr{E} < U$.

The potential barrier. In the potential well problem, the region for which $U > \mathscr{E}$ extended rightward to infinity. Therefore, the wave function became zero at infinity. Considerable interest is attached to another problem, in which the potential energy at a certain distance away from the well again becomes less than the total energy. This is shown in Fig. 42.

[Fig. 42]

$U > \mathscr{E}$ for the region $x_1 < x < x_2$. Therefore, in classical mechanics a particle situated to the left of $x = x_1$ cannot under any circumstances attain the region $x > x_2$, from which it is separated by a potential barrier. In quantum mechanics, the wave function does not become zero between $x_1$ and $x_2$, since this region is finite (see exercise 2, Sec. 25).

In the approximation (28.1), the action $S$ in the exponent is a real quantity when $x < x_1$. Therefore, the modulus of $\psi$ remains equal to unity:

$$|\psi|=\left|e^{\frac{i}{h}S}\right|=1.$$

On the other hand, between $x_1$ and $x_2$, the squared modulus of $\psi$ decreases according to the law

$$|\psi|^{2}=\left|e^{-\frac{1}{h}\int_{x_{1}}^{x}\sqrt{2m(U-\mathscr{E})}\,dx}\right|^{2}=e^{-\frac{2}{h}\int_{x_{1}}^{x}\sqrt{2m(U-\mathscr{E})}\,dx}. \tag{28.5}$$


At $x = x_2$, in comparison with the point $x = x_1$, the squared modulus diminishes in the ratio

$$B=e^{-\frac{2}{h}\int_{x_{1}}^{x_{2}}\sqrt{2m(U-\mathscr{E})}\,dx}, \tag{28.6}$$

after which it again stops changing, since $S$ becomes a real quantity.

Hence, the square of the modulus of the wave function diminishes between $x_1$ and $x_2$ according to the expression determined by the quantity $B$. A more precise theory provides a correction factor for $B$, though fundamentally, the function is determined by $B$ alone. $B$ is called the barrier factor. Somewhat later, it will be explained how the $B$ factor is related to the penetration probability through the barrier in unit time.

The quasi-classical approximation is feasible only when the order 
of magnitude for the action is large compared with h. This was men¬ 
tioned in Sec. 24, when the conditions for the limiting transition to 
classical mechanics were being determined. For this reason, equation 
(28.6) may be used only when the exponent is large compared with 
unity. If it is comparable with unity, the penetration probability 
through the barrier must be evaluated by means of precise wave func¬ 
tions. 
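The barrier factor is straightforward to evaluate numerically for any barrier shape. The sketch below takes $B = \exp\left(-\frac{2}{h}\int_{x_1}^{x_2}\sqrt{2m(U-\mathscr{E})}\,dx\right)$, i.e., $B$ understood as the ratio of squared moduli; the function name, the midpoint rule, the unit system ($m = h = 1$) and the two sample barriers are assumptions made for the illustration.

```python
import math


def barrier_factor(U, energy, x1, x2, m=1.0, h=1.0, steps=10000):
    """B = exp(-(2/h) * integral from x1 to x2 of sqrt(2 m (U(x) - energy)) dx)."""
    dx = (x2 - x1) / steps
    integral = 0.0
    for j in range(steps):
        x = x1 + (j + 0.5) * dx
        # max(..., 0) guards against tiny negative values at the turning points
        integral += math.sqrt(max(2.0 * m * (U(x) - energy), 0.0)) * dx
    return math.exp(-2.0 * integral / h)


# Rectangular barrier of height 5 and width 2, particle energy 1.
b_rect = barrier_factor(lambda x: 5.0, 1.0, 0.0, 2.0)

# Parabolic barrier U = U0 (1 - x^2 / a^2) with turning points at +-a sqrt(1 - E/U0).
U0, a, E = 5.0, 2.0, 1.0
turn = a * math.sqrt(1.0 - E / U0)
b_par = barrier_factor(lambda x: U0 * (1.0 - x * x / a * a / (a * a)) if False else U0 * (1.0 - (x / a) ** 2), E, -turn, turn)
print(b_rect, b_par)
```

In both cases the exponent is large compared with unity, which is the condition stated above for equation (28.6) to be applicable.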

The Mandelshtam analogy. In wave optics there is an analogy to 
the passage of a particle through a potential barrier. When a light
wave falls on the boundary of a medium of small refractive index from
a medium of larger refractive index at an angle whose sine is greater 
than the ratio of the indices of refraction, there occurs total internal 
reflection in accordance with the laws of geometrical optics. If the 
problem is solved in strict accord with wave optics, on the basis of 
Maxwell’s equations (see exercise 2, Sec. 17), then it turns out that the 
wave penetrates somewhat into the second medium, but dies out 
exponentially in it. L. I. Mandelshtam took notice of this analogy 
between quantum mechanics and wave optics. It can be applied in the 
following manner. 

Let us imagine two optically dense media separated by a layer optically less dense. Let a light ray fall on the interface of the media at an angle to the normal larger than the angle of total internal reflection. According to geometrical optics, the ray should be completely reflected by the layer, and it is absolutely immaterial whether or not there is a denser medium beyond the layer, or whether the reflection occurs from an infinitely thick, nondense medium. Similarly, a particle in classical mechanics is completely incapable of penetrating the barrier. According to the laws of wave optics, light penetrates into a nondense medium, but dies out in a thickness comparable with the wavelength. Therefore, if the second dense medium is situated closely enough, part of the light “seeps” into it.
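The statement that the light dies out in a thickness comparable with the wavelength can be illustrated with the standard wave-optics result for total internal reflection: in the rarer medium the field amplitude falls off as $e^{-\kappa z}$ with $\kappa = \frac{2\pi}{\lambda_0}\sqrt{n_1^2\sin^2\theta - n_2^2}$, where $\lambda_0$ is the wavelength in vacuum. This formula is not written out in the text and is brought in here as an assumption; the function names and the sample numbers (glass with $n_1 = 1.5$, an air gap, incidence at 45°) are illustrative.

```python
import math


def penetration_depth(n_dense, n_rare, angle_deg, vacuum_wavelength):
    """Depth over which the field amplitude falls e-fold in the rarer medium."""
    s = n_dense * math.sin(math.radians(angle_deg))
    if s <= n_rare:
        raise ValueError("angle is below the angle of total internal reflection")
    kappa = 2.0 * math.pi / vacuum_wavelength * math.sqrt(s * s - n_rare * n_rare)
    return 1.0 / kappa


def seepage_factor(gap, depth):
    """Order-of-magnitude intensity factor exp(-2 gap / depth) for a gap of given width."""
    return math.exp(-2.0 * gap / depth)


depth = penetration_depth(1.5, 1.0, 45.0, 500.0)  # lengths in nanometres
print(depth, seepage_factor(100.0, depth), seepage_factor(2000.0, depth))
```

For a gap much narrower than the wavelength an appreciable part of the light passes, while for a gap of several wavelengths the factor is negligibly small, in complete analogy with the barrier factor.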

The classical expression for the amplitude of a light wave may be regarded as the wave function in relation to a light quantum. The transition from quantum theory to classical electrodynamics consists in considering the occupation numbers $N_{\mathbf{k},\sigma}$ as large (see Sec. 27). Then the corresponding field amplitudes change to classical ones. For this reason the Mandelshtam analogy represents an example of quanta penetrating through a barrier. As has already been pointed out, the limiting transition for electrons occurs differently; it corresponds to the transition from wave optics to geometrical optics. Therefore, in the classical limit, electrons do not penetrate the barrier.

The existence of penetrations through a barrier clearly indicates 
that the concept of a trajectory is sometimes completely inapplicable 
in the case of quantum motion. A trajectory extended under the 
barrier would lead to imaginary velocity values. 

Alpha disintegration. Passage through a potential barrier enables us to explain one of the most important facts of nuclear physics, that of alpha disintegration. The nuclear masses of heavy elements with atomic numbers greater than that of lead satisfy an inequality of the form (21.12):

$$m(A,Z)>m(A-4,\,Z-2)+m(4,2). \tag{28.7}$$

Here $A$ is the atomic weight and $Z$ is the nuclear charge (i.e., the atomic number in the Mendeleyev table). Thus, $m(4,2)$ is the mass of a helium nucleus with atomic weight 4 and atomic number 2. Such a nucleus emitted during alpha disintegration is called an alpha particle.

All that can be seen from equations (21.12) and (28.7) is that the 
spontaneous decay of a nucleus of mass m {A, Z) is possible, though no 
indication is obtained about the time law of disintegration. The 
nuclei of certain elements have mean decay times of lO^® years while 
others have decay times of about 10~® sec, which is a difference of 
23 orders of magnitude. It will be noted that the energy of the alpha 
particles emitted differs here by a factor of only two. From experiment 
it turns out that the logarithm of the mean decay time of a nucleus 
is inversely proportional to the alpha-particle velocity. It is this 
logarithmic law that corresponds to the difference of 23 magnitudes. 
It is accounted for by the difference of barrier factors which depend 
exponentially upon the energy. 




The potential-energy curve. At large distances from a nucleus, an alpha particle experiences a repulsive force of potential energy

$$U=\frac{2(Z-2)e^{2}}{r} \tag{28.8}$$

[cf. (3.4)]. At small distances, attractive forces must act because, otherwise, the nucleus $(A, Z)$ could not exist at all. We do not know the force law (i.e., the shape of the potential-energy curve when the alpha particle is situated sufficiently close to the nucleus) and, therefore, in Fig. 43 we draw it at will. In this we must be guided by the following considerations. The special nuclear forces which hold the alpha particle in the nucleus before emission have a small radius of action, so the potential-energy curve has the form of a “potential well.” Motion inside the nucleus corresponds to motion inside such a well. The transition region from the well to the Coulomb curve is not very essential for final results, i.e., it little affects the exponent of the barrier factor.

The barrier factor for alpha disintegration. The energy of an alpha particle is positive at an infinite distance from the nucleus. It is this that signifies that the nucleus is capable of alpha disintegration, i.e., the alpha particle can move infinitely. In order to find the probability for alpha disintegration, we must calculate the barrier factor $B$ in accordance with (28.6). Because nuclear forces are short-range forces, the transition region is small and we can extrapolate the Coulomb law, without sensible error in the integral, up to the point $r = r_1$ where $\mathscr{E}$ becomes greater than $U$. Point $r_1$ is the effective nuclear radius determined from alpha disintegration. Other data concerning the nucleus lead to somewhat different values for the respectively determined effective radius. This is understandable since $r_1$ is obtained on the particular assumption that the Coulomb law is valid up to the region for which the potential energy curve is taken in the form of a well with steep sides.

And so we determine the barrier factor according to the equation

$$B=\exp\left\{-\frac{2}{h}\int_{r_{1}}^{2(Z-2)e^{2}/\mathscr{E}}\sqrt{2m\left(\frac{2(Z-2)e^{2}}{r}-\mathscr{E}\right)}\,dr\right\}. \tag{28.9}$$


The integral in the exponent can be easily calculated by the substitution

$$\frac{r\mathscr{E}}{2(Z-2)e^{2}}=\cos^{2}x. \tag{28.10}$$

Then, after elementary treatment, the exponent reduces to the form

$$-\frac{2}{h}\sqrt{\frac{2m}{\mathscr{E}}}\;2(Z-2)e^{2}\left(\arccos\sqrt{\frac{\mathscr{E}r_{1}}{2(Z-2)e^{2}}}-\sqrt{\frac{\mathscr{E}r_{1}}{2(Z-2)e^{2}}}\sqrt{1-\frac{\mathscr{E}r_{1}}{2(Z-2)e^{2}}}\right). \tag{28.11}$$


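The quantity $\dfrac{\mathscr{E}r_{1}}{2(Z-2)e^{2}}$ is the ratio of the alpha particle energy to the effective barrier height at the point $r_1$, taken according to equation (28.8). Let us evaluate this ratio. For heavy nuclei $2(Z-2) \approx 180$, $r_1 \approx 9 \times 10^{-13}$ cm; we shall take $\mathscr{E}$ to be equal to 6 Mev, $e^2 = 23 \times 10^{-20}$. Whence

$$\frac{\mathscr{E}r_{1}}{2(Z-2)e^{2}}=\frac{6\cdot1.6\cdot10^{-6}\cdot9\cdot10^{-13}}{180\cdot23\cdot10^{-20}}\approx\frac{1}{5}.$$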

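To a first approximation, we shall consider this quantity as small. Then, on the right-hand side of (28.11), we obtain the approximation

$$-\frac{2\sqrt{2m}}{h}\,\frac{2(Z-2)e^{2}}{\sqrt{\mathscr{E}}}\left(\frac{\pi}{2}-2\sqrt{\frac{\mathscr{E}r_{1}}{2(Z-2)e^{2}}}\right)=-\frac{2\pi e^{2}}{hv}\,2(Z-2)+\frac{8}{h}\sqrt{mr_{1}e^{2}(Z-2)}, \tag{28.12}$$

where $v$ is the velocity of the alpha particle, $\mathscr{E} = mv^2/2$.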


It is easy to check the correctness of this expansion by a direct numer¬ 
ical substitution. 
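Such a numerical substitution might look as follows. The sketch writes $\varepsilon$ for the ratio $\mathscr{E}r_1/2(Z-2)e^2$ and compares the exact bracket $\arccos\sqrt{\varepsilon} - \sqrt{\varepsilon(1-\varepsilon)}$ with its expansion $\pi/2 - 2\sqrt{\varepsilon}$; it also confirms the closed form by direct integration of the dimensionless integrand $\sqrt{1/u - 1}$ from $\varepsilon$ to 1. The symbol $\varepsilon$, the function names and the integration scheme are introduced here only for the illustration; the input numbers are those quoted in the text.

```python
import math


def exact_bracket(eps):
    """arccos(sqrt(eps)) - sqrt(eps (1 - eps)), the bracket of the exact exponent."""
    return math.acos(math.sqrt(eps)) - math.sqrt(eps * (1.0 - eps))


def approx_bracket(eps):
    """First terms of the expansion in powers of sqrt(eps): pi/2 - 2 sqrt(eps)."""
    return math.pi / 2.0 - 2.0 * math.sqrt(eps)


def integrated_bracket(eps, steps=20000):
    """Midpoint-rule integral of sqrt(1/u - 1) du from eps to 1, where u = r E / (2 (Z-2) e^2)."""
    du = (1.0 - eps) / steps
    return sum(math.sqrt(1.0 / (eps + (j + 0.5) * du) - 1.0) * du for j in range(steps))


MEV_IN_ERG = 1.6e-6
eps = 6 * MEV_IN_ERG * 9e-13 / (180 * 23e-20)  # ratio of the energy to the barrier height
print(eps)
print(exact_bracket(eps), integrated_bracket(eps), approx_bracket(eps))
```

The ratio indeed comes out close to 1/5, the closed form agrees with the direct integration, and the expansion differs from the exact bracket by only a few per cent.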

The time dependence of alpha disintegration. We shall now show how the expression for the barrier factor is related to the probability of alpha disintegration in unit time. In exercise 2, Sec. 25, it was found that the particle flux passing through a barrier is proportional to the particle density before the barrier. It is easy to see that the basic result obtained in the problem of a rectangular barrier coincides, in the limit, with the result of this section, if we go over to the quasi-classical approximation. Indeed, we, so to say, divide a barrier of arbitrary shape into separate, successive rectangular barriers. The penetration probability for each of them is

$$e^{-\frac{2}{h}\sqrt{2m(U-\mathscr{E})}\,\Delta x},$$

where $\Delta x$ is the width of the rectangular barrier. The total penetration probability will be determined by reduction of the amplitude of the wave function over the whole width of the barrier. In other words, it will be proportional to the product $\prod e^{-\frac{2}{h}\sqrt{2m(U-\mathscr{E})}\,\Delta x}$. This product can, obviously, also be represented like $e^{-\frac{2}{h}\sum\sqrt{2m(U-\mathscr{E})}\,\Delta x}$ or, in the limit, like $e^{-\frac{2}{h}\int\sqrt{2m(U-\mathscr{E})}\,dx}$, corresponding to (28.6). Thus, for any potential barrier, we can assert that the flux of transmitted particles is proportional to the particle density before the barrier multiplied by the barrier factor.

From this we can deduce the time dependence of alpha disintegra¬ 
tion. The probability of an alpha particle existing inside the nucleus 
is equal to the integral of the square of the wave-function modulus 
over the volume of the nucleus, i.e., over the region $r < r_1$. As has
just been indicated, the alpha-particle flux emitted from the nucleus 
is proportional to the probability density of their being in the nucleus, 
the constant of proportionality being basically determined by the 
barrier factor. It follows that the number of nuclei decaying in unit 
time is proportional to the total number of nuclei present that have 
not disintegrated by the given instant. The constant of proportionality 
depends upon the shape of the barrier and upon the state of the par¬ 
ticles inside the nucleus, but it cannot be a function of time, as may 
be seen, for example, from the equations obtained in exercise 2, Sec. 25. 
Indeed, this constant is obtained from the solution of the wave 
equation with the time dependence eliminated, i.e., (24.22). 

For this reason, the law for alpha disintegration is expressed by the equation

$$\frac{dN}{dt}=-\frac{\Gamma}{h}N,\qquad N=N_{0}e^{-\frac{\Gamma t}{h}}, \tag{28.13}$$

where $N$ is the number of nuclei that have not disintegrated at the given instant of time and $N_0$ is the initial number of nuclei. The quantity $\Gamma$ has the dimensions of energy for convenience of comparison with other quantities of the same dimension. Every nucleus has the same probability of decaying in unit time, no matter how long it has been in existence. This probability is $\Gamma/h$ and does not depend upon time.
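The decay law and its independence of the previous history of the nucleus can be illustrated as follows. The sketch integrates $dN/dt = -(\Gamma/h)N$ step by step and compares the result with the exponential solution; the function names, the rate value and the step count are illustrative.

```python
import math


def surviving(n0, rate, t):
    """Exponential solution N = N0 exp(-rate * t), where rate stands for Gamma / h."""
    return n0 * math.exp(-rate * t)


def euler_decay(n0, rate, t, steps=100000):
    """Step-by-step integration of dN/dt = -rate * N."""
    dt = t / steps
    n = n0
    for _ in range(steps):
        n -= rate * n * dt
    return n


rate = 0.5
print(euler_decay(1.0e6, rate, 4.0), surviving(1.0e6, rate, 4.0))

# The fraction surviving a further interval tau does not depend on the age of the nuclei.
tau = 1.0
for age in (0.0, 3.0, 10.0):
    print(age, surviving(1.0, rate, age + tau) / surviving(1.0, rate, age))
```

The last three printed ratios coincide: a nucleus that has already existed for a long time decays with the same probability in unit time as a newly formed one.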

Equations (28.13) and (28.12) confirm the experimentally deter¬ 
mined law which yields an inversely proportional relationship between 
the logarithm of the probability for alpha disintegration and the 
alpha-particle velocity v. 

The nuclear wave function before decay. It is easy to obtain a time dependence for the wave function of a nucleus which has not emitted an alpha particle. It looks like

$$\psi=e^{-\frac{\Gamma t}{2h}-\frac{i\mathscr{E}_{0}t}{h}}. \tag{28.14}$$

The first factor accounts for the exponential attenuation of amplitude according to a $e^{-\frac{\Gamma t}{2h}}$ law (since the probability, i.e., the square of the amplitude, diminishes according to $e^{-\frac{\Gamma t}{h}}$); the second factor is the usual wave-function time factor. Expression (28.14) is very similar to the well-known formula for damped oscillations, with the difference that in the given case it is the probability amplitude of the initial (not yet decayed) state of the nucleus that is damped.

The wave function (28.14) satisfies the initial condition $|\psi(0)| = 1$.
Since the wave-function equation is of the first order with respect 
to time, it is absolutely necessary to impose some initial condition. 
It was assumed that at the initial instant of time an alpha particle 
was definitely situated inside the nucleus. However, for this wave 
function, the right-hand side of equation (24.16) does not become 
zero: there is a finite flux for the probability of a particle being emitted 
from the nucleus. This is what leads to the exponential damping 
of the probability of the predecay state. 

All nuclei before disintegration are described by exactly the same 
wave function (28.14), if at the instant $t = 0$ they were in the initial
state. Therefore, they all have a perfectly identical probability of 
decaying in unit time, and it is impossible to predict which one of 
them will decay earlier and which later. In exactly the same way, 
in the diffraction experiment it is impossible to say which part of 
the photographic plate will be hit by a given electron. The decay 
law is purely statistical, in the same way as the law for diffraction 
pattern. 

In this sense, radioactive decay does not resemble the falling of 
ripe fruit from a tree—an alpha particle in a nucleus is always identi¬ 
cally “ripe” for emission. This is suggested by the homogeneity of 
the decay law (28.13) with respect to time. 

The indeterminacy of a nuclear energy level before decay. The wave 
function (28.14) does not belong to any eigenvalue of the energy 
$\mathscr{E}$, since the states with energy $\mathscr{E}$ have wave 
functions which are time-dependent according to $e^{-i\mathscr{E}t/h}$. 
Such states should exist for an unlimitedly long time because the 
amplitude of their probability does not fall off, while the probability 
for an undecayed nuclear state is reduced by $e$ times in a time 
$\Delta t = h/\Gamma$. We can define the time interval $\Delta t$ 
as the characteristic (or mean) attenuation time. 

Let us now suppose that the wave function (28.14) is represented 
as the sum of wave functions of states whose energies are determined 
accurately. We know that the time dependence of these functions 
is given by the factor $e^{-i\mathscr{E}t/h}$. In other words, we have 
to represent the “non-monochromatic” wave 
$e^{-\Gamma t/2h - i\mathscr{E}_0 t/h}$ as the sum of “monochromatic” 
waves* $e^{-i\mathscr{E}t/h}$. In which energy interval 
$\Delta\mathscr{E}$ (by order of magnitude) will the amplitude of these 
monochromatic waves differ noticeably from zero? 

We can answer this question by making use of relation (18.6). 
According to this relationship, a wave of duration $\Delta t$ is 
represented by a group of monochromatic waves whose frequencies lie 
in the interval $\Delta\omega \gtrsim \frac{2\pi}{\Delta t}$. 
Substituting, in place of $\Delta\omega$, the equivalent (in this case) 
quantity $\frac{\Delta\mathscr{E}}{h}$, we arrive at the following 
estimate: 

$\Delta\mathscr{E}\,\Delta t \gtrsim 2\pi h$. (28.15) 


This is the uncertainty relation for energy. The measure of 
uncertainty for the energy is the quantity $\Gamma$, i.e., a value 
proportional to the probability of decay in unit time (the lifetime 
$\Delta t$ being the inverse of that probability). (28.15) should be 
formulated thus: the energy of a state existing during a limited time 
interval $\Delta t$ is determined within an accuracy of the order of 
$\frac{2\pi h}{\Delta t}$. Only the energy of a state which exists an 
unlimitedly long time is fully determinate. 

The meaning of the uncertainty relation for energy. The meaning 
of the uncertainty relation for the coordinate and the momentum 
(23.4) is not analogous to the meaning of (28.15). The estimate (23.4) 
expresses the fact that the coordinate and momentum do not exist 
in the same state; (28.15) signifies that if a state of the system has 
a finite duration $\Delta t$, then its energy at each instant of time 
within the interval $\Delta t$ is not determined exactly, but is only 
contained within a region of the order of $\Gamma$. 

The quantity $\Gamma$ is termed the level width of the system. The 
concept of level width can be applied to any states of finite duration 
and not only to alpha disintegration. For example, the energy level of 
an atom in an excited state has a definite width, since an excited atom 
is capable of the spontaneous emission of a quantum. 

Explanation of the level width. We shall now show how the level 
width of a nucleus capable of alpha disintegration can be found 
by considering the wave function variation under a potential barrier. 

It was shown in Sec. 25 that infinite motion has a continuous 
spectrum. The motion of a system with a potential barrier is infinite 
because an alpha particle is capable of going to infinity. It follows, 
strictly speaking, that a nucleus capable of alpha disintegration 
should have a continuous spectrum. 


* A wave with definite frequency is termed “monochromatic.” A 
monochromatic wave corresponds to a single colour (chroma is the Greek 
for colour). 




Let us now evaluate the energy level widths $\Gamma$ for alpha-radioactive 
nuclei. From (28.15) it follows that even for a nucleus with a very 
short alpha-decay time (i~10-®sec) F^IO-®* erg~0.6 x 10“^®ev. 
How is it possible to combine a continuous spectrum with such a 
narrow energy interval ? 
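The order of magnitude of such a width can be checked with a few lines of arithmetic. The sketch below assumes the estimate $\Gamma \sim h/\Delta t$ (the factor $2\pi$ is dropped), takes $h$ to be the quantum of action used throughout the book, and uses lifetimes that are illustrative assumptions rather than values taken from the text.

```python
# Order-of-magnitude level width from the lifetime: Gamma ~ h / t.
# H_ERG_SEC is the quantum of action (reduced Planck constant) in CGS units.
H_ERG_SEC = 1.054e-27   # erg * sec
ERG_TO_EV = 6.242e11    # electron-volts per erg


def level_width(lifetime_sec):
    """Return (Gamma in erg, Gamma in ev) for a state with the given lifetime."""
    gamma_erg = H_ERG_SEC / lifetime_sec
    return gamma_erg, gamma_erg * ERG_TO_EV


# Illustrative lifetimes (assumed values, not taken from the text).
for t in (1e-9, 1e-6, 1.0):
    g_erg, g_ev = level_width(t)
    print(f"t = {t:.0e} sec: Gamma ~ {g_erg:.2e} erg ~ {g_ev:.2e} ev")
```

Even for the shortest lifetime in this list the width is many orders of magnitude smaller than nuclear energies of the order of millions of electron-volts, which is why the level looks practically discrete.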

The solution to the wave equation between $r_1$ and $r_2$ is of the 
following form: 

$\psi = C_1 e^{-\frac{1}{h}\int\sqrt{2m(U-\mathscr{E})}\,dr} + C_2 e^{\frac{1}{h}\int\sqrt{2m(U-\mathscr{E})}\,dr}$. (28.16) 

The first solution exponentially diminishes with $r$, while the second 
exponentially increases. It follows that if the barrier extended to 
infinity rightwards, a solution would exist only for $C_2 = 0$. The ratio 
$C_2/C_1$, determined from the boundary conditions at $r = r_1$, is a 
function of energy. It is the roots of the equation $C_2(\mathscr{E}) = 0$ 
that give the possible energy eigenvalues for finite motion. The energy 
of a particle in a well of finite depth is obtained in just this way. 
A particle behind a barrier differs from the particle in a well 
considered in Sec. 25 in that the barrier is of finite width. Therefore, 
the second solution, proportional to $C_2$, need not be strictly equal 
to zero, but may only be small compared with the first solution in a 
small interval of values of $\mathscr{E}$ close to a root of the equation 
$C_2(\mathscr{E}) = 0$. This region of values of $\mathscr{E}$ is what 
corresponds to the assumption that the modulus of the wave function 
outside the nucleus is small compared with the wave function inside the 
nucleus. In other words, if the energy of the nucleus is contained in a 
given region of values $\Gamma$, then we can say that the alpha particle 
is in some way bound in the nucleus, similar to the way that a particle 
can be bound in a real potential well. The higher or broader the 
potential barrier, the less the barrier factor $B$ and the less the 
decay probability proportional to it. But then $\Gamma$ is also 
correspondingly reduced, i.e., the continuous-spectrum state becomes 
closer to the discrete-spectrum state with an exact energy value 
$\mathscr{E}$. This is what explains the meaning of the uncertainty 
$\Gamma$: it indicates how close the state is to a bound one, with an 
infinitely long life-time. 

The uncertainty of $\mathscr{E}$ does not limit the applicability of the law 
of conservation of energy in any way; the total energy of a nucleus 
and alpha particle is constant. However, the state with a strictly 
defined energy relates simultaneously to disintegrated and non- 
disintegrated nuclei, while a nondisintegrated nucleus has an inexactly 
defined energy. 

Any state capable of a spontaneous transition to another state 
with the same energy possesses a certain energy width. The energy 
is not defined exactly in each of these states separately, but a precisely 
defined energy corresponds to both states at once. 

We can divide the total level width into partial widths related 
to the probabilities for various transitions. Thus, strongly excited 
nuclear states are capable of emitting neutrons of various energies 
and of radiating gamma quanta. Each possibility contributes its 
exponential in the term characterizing attenuation. The total 
attenuation is determined by the product of such exponentials. It 
follows that the total level width is equal to the sum of its widths in 
relation to all possibilities of disintegration. 
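The additivity of partial widths is just the rule that a product of exponentials has the sum of the exponents. A minimal sketch, assuming each decay channel contributes a factor $e^{-\Gamma_j t/h}$ to the probability of the undecayed state; the partial widths below are hypothetical numbers chosen only for illustration.

```python
import math

H = 1.054e-27  # erg * sec, the quantum of action in CGS units


def survival(t, widths):
    """Probability of the undecayed state at time t:
    one exponential factor exp(-Gamma_j * t / h) per decay channel."""
    p = 1.0
    for gamma in widths:
        p *= math.exp(-gamma * t / H)
    return p


partial_widths = [2.0e-19, 5.0e-19, 3.0e-19]  # hypothetical values, erg
total_width = sum(partial_widths)

t = 1.0e-9  # sec
# The product over channels equals a single exponential with the total width.
assert math.isclose(survival(t, partial_widths),
                    math.exp(-total_width * t / H))
```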

The Bohr quantum conditions. Let us now apply the quasi-classical 
approximation to the finite motion of a particle in a potential well 
and find the energy levels. From (28.3) the wave function is 

$\psi = \sin\left(\frac{1}{h}\int_{x_1}^{x} dx\,\sqrt{2m(\mathscr{E}-U)} + \gamma\right)$. (28.17) 

The real solution involving the sine is taken, since, in accordance 
with (24.20), it does not involve any particle flux outside the well, 
i.e., it corresponds to a stationary state. 

Here $x_1$ is the left edge of the well, for which $\mathscr{E} = U$. 
In changing $x$ from $x_1$ to the right-hand edge $x_2$, the phase of 
the wave function can change by a whole multiple of $\pi$ together with 
a certain addition, which we shall call $\beta$ for the time being. In 
the whole length of the well, the sine changes its sign for a given 
value of the particle energy by as many times as $\pi$ is added to the 
argument of the sine, i.e., the wave function (28.17) has as many nodes. 
However, the number of nodes is equal to the number of the energy 
eigenvalue $n$; therefore, an equality is obtained for the determination 
of $\mathscr{E}_n$: 

$\frac{1}{h}\int_{x_1}^{x_2}\sqrt{2m(\mathscr{E}_n - U)}\,dx = \pi n + \beta$. (28.18) 

If the edges of the well are not plumb at the points $x_1$ and $x_2$, 
then a very fine analysis shows that $\beta = \frac{\pi}{2}$. It turns 
out that $\beta = 0$ for plumb edges of the well, because the condition 
$\psi(x_1) = \psi(x_2) = 0$ is then imposed on the wave function, as in 
the problem of an infinitely deep rectangular well. In addition, we note 
that the points $x_1$ and $x_2$ are given by the result that 
$\mathscr{E} = U(x_1) = U(x_2)$, i.e., $x_1$ and $x_2$ depend upon 
$\mathscr{E}$. 

It should be noted that, from its very meaning, equation (28.18) 
holds only for large $n$, to a quasi-classical approximation. The 
integral is considerably greater than $h$ for large $n$, and this 
signifies that $S(x_2) - S(x_1) \gg h$, whence, to a quasi-classical 
approximation, we obtain (28.17). 

Equation (28.18) was postulated by Bohr in 1913 in determining 
the stationary orbits for a hydrogen atom ($\beta$ was considered to be 
equal to zero). Bohr supposed that electrons in such orbits do not 
radiate light and do not fall onto the nucleus, while radiation occurs 
only in the case of a transition from one orbit to another. 

Thus, it turns out that the Bohr quantum conditions emerge as 
a limiting case of quantum mechanics without any additional 
postulates. 
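The quantum condition can also be solved numerically for a given well. The sketch below is only an illustration under stated assumptions: units with $h = m = \omega = 1$, the oscillator potential $U = x^2/2$, a phase integral of the form $\frac{1}{h}\int\sqrt{2m(\mathscr{E}-U)}\,dx$ set equal to $\pi n$ plus an additive phase, and that phase taken as $\pi/2$ (the value for edges that are not plumb). The function names are hypothetical.

```python
import math


def phase_integral(energy, potential, x1, x2, steps=4000):
    """(1/h) * integral of sqrt(2m(E - U)) dx from x1 to x2,
    in units h = m = 1, evaluated by the midpoint rule."""
    dx = (x2 - x1) / steps
    total = 0.0
    for j in range(steps):
        x = x1 + (j + 0.5) * dx
        total += math.sqrt(max(0.0, 2.0 * (energy - potential(x)))) * dx
    return total


def oscillator(x):
    """U = m * w**2 * x**2 / 2 with m = w = 1."""
    return 0.5 * x * x


def level(n, phase=math.pi / 2):
    """Solve phase_integral(E) = pi * n + phase for E by bisection."""
    lo, hi = 1e-6, 100.0
    target = math.pi * n + phase
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        a = math.sqrt(2.0 * mid)  # turning points at x = -a and x = +a
        if phase_integral(mid, oscillator, -a, a) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for n in range(4):
    print(n, round(level(n), 4))  # close to n + 1/2 in units of h * w
```

With the assumed phase of $\pi/2$ the computed levels come out close to $n + \frac{1}{2}$, the known oscillator spectrum; with a phase of zero they would come out close to $n$.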

Exercises 


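1) Determine the energy levels for a linear harmonic oscillator from equation 
(28.18). 

$\frac{1}{h}\int_{-\sqrt{2\mathscr{E}/m\omega^2}}^{\sqrt{2\mathscr{E}/m\omega^2}}\sqrt{2m\left(\mathscr{E} - \frac{m\omega^2x^2}{2}\right)}\,dx = \frac{\pi\mathscr{E}}{h\omega} = \pi\left(n + \frac{1}{2}\right)$. 

From this $\mathscr{E}_n = h\omega\left(n + \frac{1}{2}\right)$, which is 
fortunately correct for all $n$ and not only for $n \gg 1$. 

2) Find the approximation that follows (28.3). 
We look for $S$ in the form $S = S_0 + hS_1$. Then, for 
$\psi = e^{iS/h}$ and neglecting terms of order $h^2$, 

$-\frac{h^2}{2m}\psi'' = \left(\frac{S_0'^2}{2m} + \frac{h}{m}S_0'S_1' - \frac{ih}{2m}S_0''\right)\psi = (\mathscr{E} - U)\psi$. 

The zero approximation gives $S_0$ from (28.3). The first approximation yields 

$S_1' = \frac{i}{2}\frac{S_0''}{S_0'}$, or $S_1 = i\ln\sqrt{S_0'}$, 

so that $\psi = \frac{1}{\sqrt{S_0'}}\,e^{iS_0/h}$. 

3) Find the factor $B$ for a barrier of the form $U = 0$ when $x < 0$, 
$U = U_0 - \alpha x$ when $x > 0$. 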

Sec. 29. Operators in Quantum Mechanics 

Momentum eigenvalues. In a number of cases we are able to determine 
energy eigenvalues from the wave equation (24.22). However, 
it is very important to find the eigenvalues of other quantities too: 
linear momentum, angular momentum, etc. To do this, it is convenient 
to proceed from the form of $\psi$ in the limiting transition to 
classical mechanics: 

$\psi = e^{iS/h}$. (29.1) 

Let us apply the operation $\frac{h}{i}\frac{\partial}{\partial x}$ to 
both sides of equation (29.1), i.e., we take the partial derivative with 
respect to $x$ and multiply by $\frac{h}{i}$: 

$\frac{h}{i}\frac{\partial\psi}{\partial x} = \frac{\partial S}{\partial x}\,e^{iS/h}$. (29.2) 

But in the classical limit $S$ becomes the action of the particle, while 
$\frac{\partial S}{\partial x}$ becomes the component of momentum $p_x$ 
[see (22.9)]. Therefore, the equation for the momentum eigenvalues that 
yields the correct transition to classical mechanics is of the 
following form: 

$\frac{h}{i}\frac{\partial\psi}{\partial x} = p_x\psi$, (29.3) 

where $p_x$ is the eigenvalue for the $x$ projection of the momentum. 
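The eigenvalue equation for the momentum can be checked numerically on a plane wave. The sketch below assumes the wave $e^{ipx/h}$ as the trial function, sets $h = 1$ in arbitrary units, and replaces the derivative by a central difference; the names are illustrative only.

```python
import cmath

H = 1.0  # quantum of action in arbitrary units (assumption)


def plane_wave(x, p):
    """Trial function exp(i * p * x / h)."""
    return cmath.exp(1j * p * x / H)


def momentum_op(psi, x, dx=1e-5):
    """Apply (h/i) d/dx to psi at the point x by a central difference."""
    return (H / 1j) * (psi(x + dx) - psi(x - dx)) / (2 * dx)


p = 2.5
x = 0.7
lhs = momentum_op(lambda s: plane_wave(s, p), x)  # operator acting on psi
rhs = p * plane_wave(x, p)                        # eigenvalue times psi
assert abs(lhs - rhs) < 1e-8
```

The same check fails for a function that is not a plane wave, for example a Gaussian, because the ratio of the two sides then depends on $x$.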

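Momentum and energy operators. Let us compare equation (29.3) 
with the wave equation (24.22): 

$\frac{1}{2m}\left[\left(\frac{h}{i}\frac{\partial}{\partial x}\right)^2 + \left(\frac{h}{i}\frac{\partial}{\partial y}\right)^2 + \left(\frac{h}{i}\frac{\partial}{\partial z}\right)^2\right]\psi + U\psi = \mathscr{E}\psi$. (29.4) 

Here, the symbol $\left(\frac{h}{i}\frac{\partial}{\partial x}\right)^2\psi$ 
denotes $-h^2\frac{\partial^2\psi}{\partial x^2}$, and similarly for 
$\left(\frac{h}{i}\frac{\partial}{\partial y}\right)^2\psi$ and 
$\left(\frac{h}{i}\frac{\partial}{\partial z}\right)^2\psi$. 

In order to find the energy and momentum eigenvalues we must 
perform a definite set of differential operations and multiplications 
by the function of coordinates in the left part of the equation. But 
these sets are connected in a very curious manner, as will now be 
shown. We shall call the symbol $\frac{\partial}{\partial x}$, 
multiplied by $\frac{h}{i}$, the momentum operator applied to a wave 
function. Instead of $\frac{h}{i}\frac{\partial}{\partial x}$ we will 
symbolically write $\hat p_x$. Then, it will be necessary to rewrite 
equation (29.3) as 

$\hat p_x\psi = p_x\psi$. (29.5) 

This equation denotes exactly the same as (29.3), though the symbolic 
notation $\hat p_x$ should emphasize that the corresponding operation 
is applied in order to find the momentum eigenvalues. 

The operation on the left-hand side of (29.4) we shall also 
symbolically call $\hat{\mathscr{H}}$. We write $\hat{\mathscr{H}}$ and 
not $\hat{\mathscr{E}}$ because the energy is assumed to be expressed in 
terms of momentum, similar to the Hamiltonian function $\mathscr{H}$. 
Then, in shorter notation, (29.4) appears as 

$\hat{\mathscr{H}}\psi = \mathscr{E}\psi$. (29.6) 

$\hat{\mathscr{H}}$ is called the Hamiltonian operator, or the energy 
operator. 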




Comparing (29.4) and (29.3), we see that the momentum and energy 
operators are related by the same equations as the corresponding 
quantities: 

$\hat{\mathscr{H}} = \frac{1}{2m}\left(\hat p_x^2 + \hat p_y^2 + \hat p_z^2\right) + \hat U$. (29.7) 

We have written $\hat U$ instead of simply $U$ in order to emphasize that 
in this equation the expression $\hat U$ is not regarded as an independent 
quantity but, instead, as an operator acting upon $\psi$, i.e., a 
multiplication operator of $\psi$ by $U$. Equation (29.7) is symbolic. 
It is understood that both sides are applied to $\psi$. 

The meaning of operator symbolism. The usefulness of an abbreviated 
operator notation in quantum mechanics consists in the fact that 
the equations thus become more expressive. The relation between 
quantum laws of motion and classical laws, which are limiting cases 
with respect to the quantum ones, can be seen best of all in operator 
notation. 

If in classical equations relating mechanical quantities we replace the 
momenta by their operators, we then obtain correct operator 
relationships of quantum mechanics. The limiting transition to classical 
mechanics restores the usual relationships between quantities. Indeed, 
in the limiting transition (29.1), the operator 
$\hat{\mathbf p} = \frac{h}{i}\nabla$ applied to $\psi$ yields 
$\nabla S\,\psi$. If we must perform the limiting transition for 
$\hat p^2$, then we need to differentiate only the exponential each 
time, because this yields the quantum of action in the denominator. For 
$h \to 0$, only terms with the highest degree of $h$ in the denominator 
remain, and it is these very terms which are obtained in replacing the 
operator $\hat{\mathbf p}$ by the quantity $\nabla S$ (i.e., by the 
classical momentum vector). We had an example of such a transition in 
Sec. 24 [see equations (24.13) and (24.14)]. 

The angular-momentum operator. It is now easy to define the 
angular-momentum operator. We shall begin with one component $M_z$. It 
is clear from Sec. 6 that the angular momentum $M_z$ is at the same time 
a generalized momentum corresponding to the angle of rotation about 
the $z$-axis, i.e., $M_z = p_\varphi$. Then, from (10.23) 

$M_z = p_\varphi = \frac{\partial S}{\partial\varphi}$. (29.8) 

Therefore, in quantum mechanics the operator $\hat p_\varphi$ must be 
of the form 

$\hat p_\varphi = \hat M_z = \frac{h}{i}\frac{\partial}{\partial\varphi}$. (29.9) 

At the same time, in accordance with classical mechanics, the 
projection $M_z$ is related to the momentum projections thus: 

$M_z = xp_y - yp_x$. (29.10) 

It follows that there must exist an operator relationship 

$\hat p_\varphi = \hat M_z = \frac{h}{i}\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right) = x\hat p_y - y\hat p_x$. (29.11) 

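Let us check to see that the definitions (29.9) and (29.11) do, indeed, 
coincide. Let us pass to cylindrical coordinates: 

$x = r\cos\varphi$, (29.12) 

$y = r\sin\varphi$. (29.13) 

From this we have 

$\frac{\partial\psi}{\partial x} = \frac{\partial\psi}{\partial r}\frac{\partial r}{\partial x} + \frac{\partial\psi}{\partial\varphi}\frac{\partial\varphi}{\partial x}$, (29.14) 

$\frac{\partial\psi}{\partial y} = \frac{\partial\psi}{\partial r}\frac{\partial r}{\partial y} + \frac{\partial\psi}{\partial\varphi}\frac{\partial\varphi}{\partial y}$. (29.15) 

Solving (29.12) and (29.13) for $r$ and $\varphi$ and differentiating 
with respect to $x$ and $y$, we have 

$r = \sqrt{x^2 + y^2}$, $\varphi = \arctan\frac{y}{x}$, 

$\frac{\partial r}{\partial x} = \frac{x}{r} = \cos\varphi$, $\frac{\partial\varphi}{\partial x} = -\frac{y}{x^2+y^2} = -\frac{\sin\varphi}{r}$, 

$\frac{\partial r}{\partial y} = \frac{y}{r} = \sin\varphi$, $\frac{\partial\varphi}{\partial y} = \frac{x}{x^2+y^2} = \frac{\cos\varphi}{r}$. 

Substituting all these expressions into equalities (29.14) and (29.15), 
and substituting the derivatives themselves into (29.11), we can see 
that both definitions for $\hat M_z$, (29.9) and (29.11), are identical. 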
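The identity of the two definitions can also be spot-checked numerically. The sketch below drops the common factor $h/i$ and compares $x\,\partial\psi/\partial y - y\,\partial\psi/\partial x$ with $\partial\psi/\partial\varphi$ at fixed $r$; the test function is an arbitrary smooth function chosen for illustration, not anything taken from the text.

```python
import math


def psi(x, y):
    """Arbitrary smooth test function (an assumption for illustration)."""
    return (x * x + 2.0 * y) * math.exp(-0.3 * (x * x + y * y)) + x * y


def cartesian_form(x, y, d=1e-5):
    """x * dpsi/dy - y * dpsi/dx by central differences."""
    dpsi_dx = (psi(x + d, y) - psi(x - d, y)) / (2 * d)
    dpsi_dy = (psi(x, y + d) - psi(x, y - d)) / (2 * d)
    return x * dpsi_dy - y * dpsi_dx


def polar_form(x, y, d=1e-5):
    """dpsi/dphi at fixed r by central differences in the angle."""
    r, phi = math.hypot(x, y), math.atan2(y, x)
    plus = psi(r * math.cos(phi + d), r * math.sin(phi + d))
    minus = psi(r * math.cos(phi - d), r * math.sin(phi - d))
    return (plus - minus) / (2 * d)


x0, y0 = 0.8, -1.3
assert abs(cartesian_form(x0, y0) - polar_form(x0, y0)) < 1e-6
```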

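Angular-momentum projection eigenvalues. Let us now find the 
eigenvalues for $M_z$. For this, it is necessary to solve the equation 

$\hat M_z\psi = M_z\psi$, (29.16) 

i.e., 

$\frac{h}{i}\frac{\partial\psi}{\partial\varphi} = M_z\psi$. (29.17) 

This equation is very simply integrated: 

$\psi = e^{iM_z\varphi/h}$. (29.18) 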
As we know, a wave function is the probability amplitude. Function 
(29.18) is the amplitude of the probability that the particle possesses 
an azimuth angle $\varphi$, if the $z$ component of its angular momentum 
is equal to $M_z$. It is essential that not only the absolute value of 
the wave function has physical significance, but also its phase; this is 
indicated, for example, by the phenomenon of electron diffraction. 
For the phase of the wave function (29.18) to be determined, it must 
either not change at all or change only by a multiple of $2\pi$ in 
rotating the coordinate system through 360°; this is because the 
position of the particle relative to such a system rotated through 360° 
does not change. If in this case the wave function were not returned to 
its initial value, it could not uniquely represent a probability 
amplitude. Thus, 

$\psi(\varphi + 2\pi) = \psi(\varphi)$, (29.19) 

i.e., 

$e^{iM_z(\varphi + 2\pi)/h} = e^{iM_z\varphi/h}$, 

so that 

$e^{2\pi iM_z/h} = 1 = e^{2\pi ik}$, (29.20) 

where $k$ is an integer of any sign or zero. Whence we obtain the 
eigenvalues of $M_z$: 

$M_z = hk$. (29.21) 

Mz is called the orbital angular-momentum projection for a particle. 
We shall see in Sec. 32 that a particle can have an angular momentum 
connected with its internal motion, which is not described by the wave 
function (29.18). Here, we have proven that the orbital angular- 
momentum components can only assume values which are whole 
multiples of h. 
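The single-valuedness argument is easy to try out numerically. The sketch below assumes a wave function of the form $e^{iM_z\varphi/h}$ with $h = 1$ in arbitrary units and measures how far it is from returning to itself after a rotation through $2\pi$; the function name is illustrative.

```python
import cmath
import math

H = 1.0  # quantum of action in arbitrary units (assumption)


def mismatch(m_z, phi=0.4):
    """|psi(phi + 2*pi) - psi(phi)| for psi = exp(i * M_z * phi / h)."""
    def psi(angle):
        return cmath.exp(1j * m_z * angle / H)
    return abs(psi(phi + 2 * math.pi) - psi(phi))


# Whole multiples of h give a single-valued function ...
assert mismatch(2 * H) < 1e-12
assert mismatch(-3 * H) < 1e-12
# ... while a half-integral multiple changes the sign of the function.
assert abs(mismatch(0.5 * H) - 2.0) < 1e-12
```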

The Stern-Gerlach experiment. The discreteness of the angular-momentum 
spectrum is confirmed by direct experiment. The idea 
of the experiment consists in the following: a direct relationship 
exists between the orbital angular-momentum projection and the 
magnetic moment projection [see (15.25)]: 

$\mu_z = \frac{e}{2mc}M_z$. (29.22) 

A narrow beam of vapour of the substance under investigation 
is passed between the poles of an electromagnet in a strongly 
inhomogeneous field; to achieve this, one of the poles may be made 
tapered. The particles (in the Stern-Gerlach experiment, they are atoms) 
enter the field parallel to the edge of the taper, i.e., they move in 
a direction perpendicular to the plane of the lines of force of the 
field. The plane of symmetry of the field passes through the edge 
of the taper and the initial direction of motion of the particle. We 
assume the $z$-axis to be perpendicular to the edge of the taper and to 
lie in the plane of symmetry of the field. If the mechanical moment 
of the electrons in the atoms has only discrete, integral projections 
on the $z$-axis, then the magnetic moment of the atoms is established 
in several definite ways. The deflecting force acting on the magnetic 
moment in a magnetic field is, by (15.40) 

$F_z = \mu_z\frac{\partial H}{\partial z} = \frac{eh}{2mc}\,k\,\frac{\partial H}{\partial z}$. (29.23) 

In the plane of symmetry of the field, $H$ is directed along the 
$z$-direction and depends only upon $z$. 

Since the angular momentum can only have a definite set of values, 
the deflecting force acting upon the atoms in the beam is also not 
arbitrary but has a very definite value for particles with respective 
angular-momentum projection $M_z = hk$. It can be seen from (29.23) 
that the force is a quantity which is a multiple of 
$\frac{eh}{2mc}\frac{\partial H}{\partial z}$. Therefore, 
the particles in the beam experience only those deflections in the 
magnetic field which correspond to the possible values of the force 
(29.23). In other words, the beam is split into several separate beams 
and does not proceed continuously, as would be the case for any 
nonintegral projections $M_z$. 
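To get a feeling for the size of the discrete forces, one can put in numbers. The sketch below assumes the orbital relation $\mu_z = \frac{e}{2mc}M_z$ in CGS units with $m$ the electron mass, so that the force is $k$ times $\frac{eh}{2mc}\frac{\partial H}{\partial z}$; the field gradient is a hypothetical value chosen only for illustration.

```python
# CGS constants
E_CHARGE = 4.803e-10   # esu
H = 1.054e-27          # erg * sec, quantum of action
M_ELECTRON = 9.109e-28 # g
C_LIGHT = 2.998e10     # cm / sec

# Unit of magnetic moment e*h / (2*m*c), in erg per gauss.
MOMENT_UNIT = E_CHARGE * H / (2 * M_ELECTRON * C_LIGHT)


def deflecting_force(k, field_gradient):
    """Force in dynes for the projection M_z = h * k,
    given dH/dz in gauss per cm."""
    return k * MOMENT_UNIT * field_gradient


GRADIENT = 1.0e5  # gauss / cm, hypothetical value
for k in (-1, 0, 1):
    print(k, deflecting_force(k, GRADIENT))
```

Only the values $k = 0, \pm 1, \pm 2, \ldots$ occur, so the forces form an equally spaced ladder rather than a continuous range.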

Where the beam is formed, each particle was given a certain angular 
momentum. Motion in the magnetic field makes it possible to measure 
the projection of this angular momentum Mz in the direction of the field. 

The impossibility of the simultaneous existence of two 
angular-momentum projections. From the fact that the angular-momentum 
projection on any axis is integral, it follows that the angular 
momentum does not have, simultaneously, projections on two axes 
in space. 

Indeed, in the Stern-Gerlach experiment, the $z$-axis is absolutely 
arbitrary. We could have measured the angular-momentum projection 
on some axis in space and then passed the same beams through a 
magnetic field making a very small angle with the field in which 
the first measurement was performed. Both measurements will give 
only integral projections of the angular momentum. One and 
the same vector cannot simultaneously have integral projections 
on infinitely close, but otherwise arbitrary, directions; when the 
first measurement was performed, the angular momentum had a 
projection only on the first direction of the field, and, correspondingly, 
in the second measurement, it had a projection only on the second 
direction of the field. 

Similar to the way that coordinate and momentum do not exist 
simultaneously, it turns out that two angular-momentum projections 
do not exist in the same state. 

The simultaneous existence of two physical quantities. We shall 
consider from a general point of view the question of which quantities 
of quantum mechanics can exist in the same state of a system. Let 
us suppose that in a certain state described by the wave function 
$\psi$ there simultaneously exist two physical quantities $\lambda$ and 
$\nu$. This means that the wave function $\psi$ is an eigenfunction of 
the two operators $\hat\lambda$ and $\hat\nu$. It satisfies two equations 

$\hat\lambda\psi = \lambda\psi$ (29.24) 

and 

$\hat\nu\psi = \nu\psi$. (29.25) 

$\hat\lambda$ and $\hat\nu$ are, speaking generally, differential 
operators; $\lambda$ and $\nu$ are numbers. 

Let us apply the operator $\hat\nu$ to (29.24). Since there is a number 
$\lambda$ on the right, it can be put on the left of the operator sign 
$\hat\nu$: 

$\hat\nu\hat\lambda\psi = \hat\nu\lambda\psi = \lambda\hat\nu\psi = \lambda\nu\psi$. (29.26a) 

In the last equation we made use of (29.25). We shall now apply 
the operator $\hat\lambda$ to (29.25): 

$\hat\lambda\hat\nu\psi = \hat\lambda\nu\psi = \nu\hat\lambda\psi = \nu\lambda\psi$. (29.26b) 

Let us subtract (29.26b) from (29.26a): 

$\hat\nu\hat\lambda\psi - \hat\lambda\hat\nu\psi = \lambda\nu\psi - \nu\lambda\psi = 0$. (29.27) 

(29.27) can be symbolically written as an equality between operations: 

$\hat\nu\hat\lambda = \hat\lambda\hat\nu$ or $\hat\nu\hat\lambda - \hat\lambda\hat\nu = 0$. (29.28) 

This symbolic equality means that the result of operating with 
$\hat\nu$ and $\hat\lambda$ should not depend upon the order of their 
actions, otherwise equations (29.24) and (29.25) cannot have a common 
solution. 

We can also prove the inverse theorem: if two operators are 
commutative, i.e., the result of their action does not depend on their 
order, then it is possible to form a common eigenfunction $\psi$ 
satisfying equations (29.24) and (29.25). 
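A finite-dimensional toy model makes the statement concrete. In the sketch below two small symmetric matrices stand in for the two operators (an assumption made purely for illustration; the operators of the text are differential operators), and a column of numbers stands in for the wave function.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def apply(a, v):
    """Matrix a acting on the column vector v."""
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]


lam = [[2, 1], [1, 2]]  # stands in for the first operator
nu = [[0, 1], [1, 0]]   # stands in for the second operator

# The two matrices commute: nu*lam - lam*nu is the zero matrix.
commutator = [[p - q for p, q in zip(row1, row2)]
              for row1, row2 in zip(matmul(nu, lam), matmul(lam, nu))]
assert commutator == [[0, 0], [0, 0]]

# A common eigenvector exists: lam*psi = 3*psi and nu*psi = 1*psi.
psi = [1, 1]
assert apply(lam, psi) == [3, 3]
assert apply(nu, psi) == [1, 1]
```

The second common eigenvector is the column (1, -1), with eigenvalues 1 and -1 respectively.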

Commutations of certain operators. Let us now apply the obtained 
result to two quantities which definitely do not exist in the same 
state: the coordinate $x$ and momentum $p_x$. We must calculate the 
commutation $\hat p_x x - x\hat p_x$. 

Changing from symbolic notation to the usual one, we obtain 

$\frac{h}{i}\frac{\partial}{\partial x}(x\psi) - x\frac{h}{i}\frac{\partial\psi}{\partial x} = \frac{h}{i}\psi + x\frac{h}{i}\frac{\partial\psi}{\partial x} - x\frac{h}{i}\frac{\partial\psi}{\partial x} = \frac{h}{i}\psi$. (29.29) 

Reverting to symbolic notation, we represent (29.29) in the following 
form: 

$\hat p_x x - x\hat p_x = \frac{h}{i}$. (29.30) 

Thus, the result of operating with $\hat p_x$ and $x$ depends upon the 
order of their action; $\hat p_x$ and $x$ are noncommutative. And this 
was to be expected because the quantities $x$ and $p_x$ do not exist 
simultaneously. 

The eigenfunction of the operator $x$ satisfies the equation 
$(x - x')\psi = 0$. Consequently, it is equal to zero over the whole 
region where the coordinate $x$ is not equal to the chosen eigenvalue 
$x'$. This function differs from zero only at one point $x = x'$. The 
eigenfunction of the momentum operator which satisfies equation (29.3) 
is $e^{ip_xx/h}$; it differs from zero over all the space. This example 
shows how great is the difference between the eigenfunctions of 
noncommuting operators. 
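The commutation of the momentum and the coordinate can be verified numerically on any smooth function. The sketch below assumes $h = 1$ in arbitrary units, takes the momentum operator as $\frac{h}{i}\frac{d}{dx}$ approximated by a central difference, and uses an arbitrary complex test function; all names are illustrative.

```python
import cmath

H = 1.0  # quantum of action in arbitrary units (assumption)


def psi(x):
    """Arbitrary smooth complex test function (chosen for illustration)."""
    return cmath.exp(-x * x) * (1.0 + 0.5j * x)


def p_op(f, x, dx=1e-5):
    """Apply (h/i) d/dx to f at the point x by a central difference."""
    return (H / 1j) * (f(x + dx) - f(x - dx)) / (2 * dx)


def commutator_on_psi(x):
    """(p x - x p) psi evaluated at the point x."""
    px_psi = p_op(lambda s: s * psi(s), x)  # first multiply by x, then apply p
    xp_psi = x * p_op(psi, x)               # first apply p, then multiply by x
    return px_psi - xp_psi


x0 = 0.6
# The result is (h/i) * psi, independent of the particular test function.
assert abs(commutator_on_psi(x0) - (H / 1j) * psi(x0)) < 1e-8
```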




The abbreviated notation of (29.30) is a convenient representation 
of (29.29). In the more complex cases that will be examined later 
the convenience of this abbreviated notation is obvious. It must 
be borne in mind that the operator notation is simply a rationalization 
of mathematical symbolism, and there is nothing incomprehensible 
in the result that $\hat p_x x - x\hat p_x \neq 0$; such is the property 
of operator symbols. It will be recalled that in vector algebra there 
also exists noncommutative multiplication; and, what is more, not of 
symbols, but of quantities. In quantum mechanics the operator symbolism 
is most expedient. 

The various momentum components are commutative: 

$\hat p_x\hat p_y - \hat p_y\hat p_x = 0$ (29.31) 

(the word “operators” will be frequently omitted in future as being 
self-evident). The commutative relation (29.31) is obtained simply 
from the fact that the result of applying two partial derivatives 
does not depend upon the order of differentiation. 

It is also obvious that 

$\hat p_y x - x\hat p_y = 0$. (29.32) 


We now calculate the commutation of any two angular-momentum 
components. Let us take $\hat M_x$ and $\hat M_y$: 

$\hat M_x = y\hat p_z - z\hat p_y$, 

$\hat M_y = z\hat p_x - x\hat p_z$. 

Let us first of all write the commutation without using the rules 
(29.30)-(29.32): 

$\hat M_x\hat M_y - \hat M_y\hat M_x = (y\hat p_z - z\hat p_y)(z\hat p_x - x\hat p_z) - (z\hat p_x - x\hat p_z)(y\hat p_z - z\hat p_y)$. 

We now group the terms here so that the order of coordinates and 
corresponding momenta is not disturbed: 

$\hat M_x\hat M_y - \hat M_y\hat M_x = y\hat p_x(\hat p_z z - z\hat p_z) - x\hat p_y(\hat p_z z - z\hat p_z)$. 

We substitute the commutation relation 
$\hat p_z z - z\hat p_z = \frac{h}{i}$ and then find the required result: 

$\hat M_x\hat M_y - \hat M_y\hat M_x = ih(x\hat p_y - y\hat p_x) = ih\hat M_z$. (29.33) 

Now changing the indices $x$, $y$, $z$ cyclically, we obtain the remaining 
commutation relations: 

$\hat M_y\hat M_z - \hat M_z\hat M_y = ih\hat M_x$, (29.34) 

$\hat M_z\hat M_x - \hat M_x\hat M_z = ih\hat M_y$. (29.35) 

All three commutation relations can be easily remembered if we 
write them in contracted form thus: 

$[\hat{\mathbf M}\hat{\mathbf M}] = ih\hat{\mathbf M}$. (29.36) 

Expanding this equality in components, we once again arrive at 
(29.33)-(29.35). 

It will be noted that the vector product of an operator by itself 
can differ from zero only when the operator components of the vector 
are noncommuting (for commuting components, for example, 
$[\hat{\mathbf p}\hat{\mathbf p}] = 0$). 

We have shown that there does not exist a state in which a system 
would possess two angular-momentum projections. Angular momentum 
has a projection only on one axis, in agreement with the 
Stern-Gerlach experiment. The only exception is when all three 
angular-momentum projections are equal to zero. The eigenfunction 
of such a state does not depend upon any angles at all [see (29.18)]. 
Therefore, as a result of applying differential operations of the type 
(29.9), where the differentiation is performed with respect to the 
angle of rotation about any arbitrary axis, this eigenfunction is 
multiplied by zero. The action of the operators of the angular-momentum 
components on such a function is commutative. This does 
not contradict the Stern-Gerlach experiment, because a vector can 
have a zero projection on two infinitely close, though arbitrarily 
orientated, axes, provided the vector itself is equal to zero. But if 
only one of the angular-momentum projections is not equal to zero, 
then the two others do not have definite values because, otherwise, 
there would be a contradiction in equality (29.36). 

The square of the angular momentum. Let us now examine further 
properties of angular momentum. We shall show that even though 
two angular-momentum projections do not exist, a single 
angular-momentum projection exists together with its square 

$\hat M^2 = \hat M_x^2 + \hat M_y^2 + \hat M_z^2$. (29.37) 

We shall verify this: 

$\hat M^2\hat M_z - \hat M_z\hat M^2 = \hat M_x^2\hat M_z - \hat M_z\hat M_x^2 + \hat M_y^2\hat M_z - \hat M_z\hat M_y^2$, 

because $\hat M_z$ and $\hat M_z^2$ are, of course, commutative. Let us 
add to the right-hand side of the last equality, and subtract from it, 
the combinations $\hat M_x\hat M_z\hat M_x$ and $\hat M_y\hat M_z\hat M_y$; 
we take $\hat M_x$ and $\hat M_y$ outside the brackets, once on the right 
and another time on the left. Then we obtain 

$\hat M^2\hat M_z - \hat M_z\hat M^2 = \hat M_x(\hat M_x\hat M_z - \hat M_z\hat M_x) + (\hat M_x\hat M_z - \hat M_z\hat M_x)\hat M_x + \hat M_y(\hat M_y\hat M_z - \hat M_z\hat M_y) + (\hat M_y\hat M_z - \hat M_z\hat M_y)\hat M_y =$ 

$= -ih\hat M_x\hat M_y - ih\hat M_y\hat M_x + ih\hat M_y\hat M_x + ih\hat M_x\hat M_y = 0$. (29.38) 

Here we have made use of the commutation rules (29.33)-(29.36). 
The eigenvalue of $\hat M^2$ will be found in the following section. 
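Both results, the commutation rule for two components and the commutativity of the square with one component, can be checked on a small matrix model. The sketch below is an illustration only: it assumes the 3x3 matrices with elements $-ih\,\epsilon_{ijk}$ (a standard matrix set obeying the angular-momentum commutation rules, not something introduced in the text) and $h = 1$ in arbitrary units.

```python
H = 1.0  # quantum of action in arbitrary units (assumption)


def eps(i, j, k):
    """Levi-Civita symbol for indices taken from 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


def component(i):
    """3x3 matrix with elements -i * h * eps(i, j, k)."""
    return [[-1j * H * eps(i, j, k) for k in range(3)] for j in range(3)]


def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * p for p in row] for row in a]


def is_zero(a, tol=1e-12):
    return all(abs(p) < tol for row in a for p in row)


Mx, My, Mz = component(0), component(1), component(2)

# Mx My - My Mx = i h Mz
comm_xy = sub(matmul(Mx, My), matmul(My, Mx))
assert is_zero(sub(comm_xy, scale(1j * H, Mz)))

# M^2 = Mx^2 + My^2 + Mz^2 commutes with Mz
M2 = add(add(matmul(Mx, Mx), matmul(My, My)), matmul(Mz, Mz))
assert is_zero(sub(matmul(M2, Mz), matmul(Mz, M2)))
```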




Exercises 


1) Find the commutations of 

9 Vx 9 9 'Py 9 ^X 9 Vz 9 

Mx 9 ®; Mx 9 ’y ; , 2; 

M\ Px-9 M^9 P^-9 X-, 

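2) Write down the Cartesian projections of momentum in spherical 
coordinates. 

Spherical coordinates are expressed in terms of Cartesian coordinates in 
the following manner [see (3.5), (3.7), (3.8)]: 

$r = \sqrt{x^2 + y^2 + z^2}$, $\vartheta = \arccos\frac{z}{\sqrt{x^2 + y^2 + z^2}}$, $\varphi = \arctan\frac{y}{x}$. 

Whence we obtain the partial derivatives 

$\frac{\partial r}{\partial x} = \frac{x}{r} = \sin\vartheta\cos\varphi$, $\frac{\partial r}{\partial y} = \frac{y}{r} = \sin\vartheta\sin\varphi$, $\frac{\partial r}{\partial z} = \frac{z}{r} = \cos\vartheta$, 

$\frac{\partial\vartheta}{\partial x} = \frac{xz}{r^2\sqrt{x^2 + y^2}} = \frac{\cos\vartheta\cos\varphi}{r}$, $\frac{\partial\vartheta}{\partial y} = \frac{yz}{r^2\sqrt{x^2 + y^2}} = \frac{\cos\vartheta\sin\varphi}{r}$, $\frac{\partial\vartheta}{\partial z} = -\frac{\sqrt{x^2 + y^2}}{r^2} = -\frac{\sin\vartheta}{r}$, 

$\frac{\partial\varphi}{\partial x} = -\frac{y}{x^2 + y^2} = -\frac{\sin\varphi}{r\sin\vartheta}$, $\frac{\partial\varphi}{\partial y} = \frac{x}{x^2 + y^2} = \frac{\cos\varphi}{r\sin\vartheta}$, $\frac{\partial\varphi}{\partial z} = 0$. 

Further, 

$\frac{i}{h}\hat p_x = \frac{\partial}{\partial x} = \frac{\partial r}{\partial x}\frac{\partial}{\partial r} + \frac{\partial\vartheta}{\partial x}\frac{\partial}{\partial\vartheta} + \frac{\partial\varphi}{\partial x}\frac{\partial}{\partial\varphi} = \sin\vartheta\cos\varphi\frac{\partial}{\partial r} + \frac{\cos\vartheta\cos\varphi}{r}\frac{\partial}{\partial\vartheta} - \frac{\sin\varphi}{r\sin\vartheta}\frac{\partial}{\partial\varphi}$, 

$\frac{i}{h}\hat p_y = \frac{\partial}{\partial y} = \frac{\partial r}{\partial y}\frac{\partial}{\partial r} + \frac{\partial\vartheta}{\partial y}\frac{\partial}{\partial\vartheta} + \frac{\partial\varphi}{\partial y}\frac{\partial}{\partial\varphi} = \sin\vartheta\sin\varphi\frac{\partial}{\partial r} + \frac{\cos\vartheta\sin\varphi}{r}\frac{\partial}{\partial\vartheta} + \frac{\cos\varphi}{r\sin\vartheta}\frac{\partial}{\partial\varphi}$, 

$\frac{i}{h}\hat p_z = \frac{\partial}{\partial z} = \frac{\partial r}{\partial z}\frac{\partial}{\partial r} + \frac{\partial\vartheta}{\partial z}\frac{\partial}{\partial\vartheta} = \cos\vartheta\frac{\partial}{\partial r} - \frac{\sin\vartheta}{r}\frac{\partial}{\partial\vartheta}$. 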

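3) Write the angular-momentum projections on Cartesian axes in terms 
of spherical coordinates. 

$\hat M_x = y\hat p_z - z\hat p_y = \frac{h}{i}\left(-\sin\varphi\frac{\partial}{\partial\vartheta} - \cot\vartheta\cos\varphi\frac{\partial}{\partial\varphi}\right)$, 

$\hat M_y = z\hat p_x - x\hat p_z = \frac{h}{i}\left(\cos\varphi\frac{\partial}{\partial\vartheta} - \cot\vartheta\sin\varphi\frac{\partial}{\partial\varphi}\right)$, 

$\hat M_z = x\hat p_y - y\hat p_x = \frac{h}{i}\frac{\partial}{\partial\varphi}$. 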

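4) Write the expression for the square of the angular momentum in spherical 
coordinates: 

$\hat M^2 = \hat M_x^2 + \hat M_y^2 + \hat M_z^2 = (\hat M_x + i\hat M_y)(\hat M_x - i\hat M_y) - i(\hat M_y\hat M_x - \hat M_x\hat M_y) + \hat M_z^2 =$ 

$= (\hat M_x + i\hat M_y)(\hat M_x - i\hat M_y) - h\hat M_z + \hat M_z^2$. 

From exercise 3 we have 

$\hat M_x + i\hat M_y = \frac{h}{i}\left(ie^{i\varphi}\frac{\partial}{\partial\vartheta} - \cot\vartheta\,e^{i\varphi}\frac{\partial}{\partial\varphi}\right)$, 

$\hat M_x - i\hat M_y = \frac{h}{i}\left(-ie^{-i\varphi}\frac{\partial}{\partial\vartheta} - \cot\vartheta\,e^{-i\varphi}\frac{\partial}{\partial\varphi}\right)$. 

Applying $\hat M_x + i\hat M_y$ to $\hat M_x - i\hat M_y$, we must observe 
the order of the “factors” $\frac{\partial}{\partial\varphi}$ and 
$e^{-i\varphi}$, $\cot\vartheta$ and $\frac{\partial}{\partial\vartheta}$. 
We obtain 

$(\hat M_x + i\hat M_y)(\hat M_x - i\hat M_y) = -h^2\left(\frac{\partial^2}{\partial\vartheta^2} - i\frac{\partial}{\partial\vartheta}\cot\vartheta\frac{\partial}{\partial\varphi} + ie^{i\varphi}\cot\vartheta\frac{\partial}{\partial\varphi}e^{-i\varphi}\frac{\partial}{\partial\vartheta} + e^{i\varphi}\cot^2\vartheta\frac{\partial}{\partial\varphi}e^{-i\varphi}\frac{\partial}{\partial\varphi}\right) =$ 

$= -h^2\left(\frac{\partial^2}{\partial\vartheta^2} + \cot\vartheta\frac{\partial}{\partial\vartheta} + \cot^2\vartheta\frac{\partial^2}{\partial\varphi^2} + i\frac{\partial}{\partial\varphi}\right)$. 

Finally, 

$\hat M^2 = -h^2\left[\frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\frac{\partial}{\partial\vartheta}\right) + \frac{1}{\sin^2\vartheta}\frac{\partial^2}{\partial\varphi^2}\right]$. 

This expression is obviously commutative with 
$\hat M_z = \frac{h}{i}\frac{\partial}{\partial\varphi}$. This was 
shown in the present section in another way. 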


Sec. 30. Expansions into Wave Functions 

The superposition principle. One of the most fundamental ideas 
of quantum mechanics consists in the fact that its equations are linear 
with respect to the wave function $\psi$. This result proceeds from the 
whole set of facts that confirm the correctness of quantum mechanics, 
in the same way as an analogous result in classical electrodynamics 
(see Sec. 21), which is also a generalization of experience. 

For example, the diffraction of electrons shows that the amplitudes 
of wave functions are combined in the same simple way as the 
amplitudes of waves in optics; diffraction maxima and minima are situated 
at the same positions, which are determined only by the phase 
relationships, independently of the wave intensities. All this points 
to the linearity of wave equations; the solutions of nonlinear equations 
behave in an entirely different manner. 

The sum of two solutions of a linear equation again satisfies the same 
equation. It follows from this that any solution of a wave equation 
can be represented in the form of a certain set of standard solutions, 
similar to the way that, in Sec. 18, a travelling nonperiodic wave 
was represented by a set of travelling harmonic waves (18.1). 

The statement concerning the possibility of representing a single 
wave function in terms of the sum of other wave functions is called 
the superposition principle. 

The Hermitian property of operators. Wave functions are usually 
represented with the aid of the sum of eigenfunctions of certain 
quantum-mechanical operators. In the present section it will be 
shown how such expansions are performed. First of all, however, 
it is necessary to establish certain general properties of the operators 
whose eigenvalues are physical quantities. 

Obviously, these eigenvalues must be real numbers, although the 
operators themselves may depend explicitly upon $i = \sqrt{-1}$ [see 
(29.3), (29.10)]. We shall consider the equations for the eigenfunctions 
of the operator $\hat\lambda$ and another equation involving its 
conjugate: 

$\hat\lambda\psi = \lambda\psi$, (30.1a) 

$\hat\lambda^*\psi^* = \lambda^*\psi^*$. (30.1b) 

We must find the condition for which the eigenvalues of the operator 
are real numbers, $\lambda^* = \lambda$. 

To do this, we multiply (30.1a) by $\psi^*$ and (30.1b) by $\psi$, 
integrate over the whole range of the variables $x$ (upon which the 
operator $\hat\lambda$ depends), and subtract one from the other. Then 
we obtain 

$\int(\psi^*\hat\lambda\psi - \psi\hat\lambda^*\psi^*)\,dx = (\lambda - \lambda^*)\int\psi^*\psi\,dx$. 

But the integral of $\psi^*\psi$ cannot be equal to zero, since 
$\psi^*\psi = |\psi|^2$ is an essentially positive quantity. 

The eigenvalue $\lambda$ is, by definition, real, i.e., 
$\lambda = \lambda^*$; therefore we arrive at the relation 

$\int(\psi^*\hat\lambda\psi - \psi\hat\lambda^*\psi^*)\,dx = 0$. (30.2) 

Equation (30.2) can be regarded as a condition imposed upon the 
operator $\hat\lambda$. In fact, however, we must demand that the 
operator $\hat\lambda$ satisfy the equation (30.2) not only for its own 
eigenfunctions $\psi^*(\lambda, x)$ and $\psi(\lambda, x)$, but also for 
any pair of functions $\chi^*(x)$ and $\psi(x)$, provided these functions 
satisfy the same conditions of being finite, continuous, and 
single-valued as the eigenfunctions $\psi(\lambda, x)$: 

$\int(\chi^*\hat\lambda\psi - \psi\hat\lambda^*\chi^*)\,dx = 0$. (30.3) 

The necessity of such a condition will be explained later in this 
section. An operator for which equality (30.3) is satisfied is termed 
Hermitian. 

In equation (30.3), $dx$ is an abbreviated notation for $dV=dx\,dy\,dz$,
if the integration is performed over a volume (for example, for
$\hat{\lambda}=\hat{p}_x$), or for $d\varphi$, if $\hat{\lambda}=\hat{M}_z$, etc.

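The Hermitian nature of the operators $\hat{p}_x$, $\hat{M}_z$, ... is easily
verified by integrating by parts. For example,

$$\int_0^{2\pi}\chi^{*}\hat{M}_z\psi\,d\varphi = \frac{h}{i}\int_0^{2\pi}\chi^{*}\frac{\partial\psi}{\partial\varphi}\,d\varphi = \frac{h}{i}\,\chi^{*}\psi\,\Big|_0^{2\pi} - \frac{h}{i}\int_0^{2\pi}\psi\,\frac{\partial\chi^{*}}{\partial\varphi}\,d\varphi.$$

The eigenfunctions of the operator $\hat{M}_z$ must satisfy the requirement
of uniqueness (29.19); hence we take $\chi^{*}(0)=\chi^{*}(2\pi)$,
$\psi(0)=\psi(2\pi)$, similar to the eigenfunctions of the operator
$\hat{M}_z$. Therefore, the integrated quantity becomes zero. The operator
$-\frac{h}{i}\frac{\partial}{\partial\varphi}$ is $\hat{M}_z^{*}$, so that

$$\int_0^{2\pi}\chi^{*}\hat{M}_z\psi\,d\varphi = \int_0^{2\pi}\psi\,\hat{M}_z^{*}\chi^{*}\,d\varphi$$

in accordance with the general requirement (30.3).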

The Hermitian nature of $\hat{\mathcal{H}}$ and $\hat{M}^{2}$ is proven by a
double integration by parts.
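The Hermitian condition (30.3) can also be checked numerically for the angular-momentum projection operator. The following is only a sketch under stated assumptions: the two test functions `chi` and `psi` are hypothetical single-valued functions chosen for illustration, the quantum of action is set to 1, the operator is taken as (h/i) d/dφ, and the derivative is approximated by a finite difference.

```python
import cmath
import math

H = 1.0  # quantum of action h, set to 1 for the illustration (assumption)

def chi(phi):
    # hypothetical single-valued test function: chi(0) == chi(2*pi)
    return cmath.exp(1j * phi) + 0.5 * cmath.exp(-2j * phi)

def psi(phi):
    # another hypothetical single-valued test function
    return cmath.exp(2j * phi) + (0.3 + 0.4j) * cmath.exp(1j * phi)

def derivative(f, phi, step=1e-5):
    # central finite difference for d f / d phi
    return (f(phi + step) - f(phi - step)) / (2 * step)

def integrate(g, n=2000):
    # rectangle rule over one period (accurate for smooth periodic integrands)
    d = 2 * math.pi / n
    return sum(g(i * d) for i in range(n)) * d

def hermitian_sides():
    # left side: integral of chi* (M_z psi), with M_z = (h/i) d/dphi
    left = integrate(
        lambda p: chi(p).conjugate() * (H / 1j) * derivative(psi, p))
    # right side: integral of psi (M_z* chi*), with M_z* = -(h/i) d/dphi
    right = integrate(
        lambda p: psi(p) * (-(H / 1j))
        * derivative(lambda q: chi(q).conjugate(), p))
    return left, right

if __name__ == "__main__":
    left, right = hermitian_sides()
    print(abs(left - right))  # close to zero if (30.3) holds
```

Both integrals should agree to within the finite-difference error; the boundary term vanishes precisely because the test functions are periodic.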

The orthogonality of eigenfunctions. An important property of 
eigenfunctions follows from the Hermitian nature of operators. Let 
us consider the equations for two eigenvalues of the same operator
$\hat{\lambda}$:

$$\hat{\lambda}\psi(\lambda,x) = \lambda\psi(\lambda,x), \tag{30.4a}$$

$$\hat{\lambda}^{*}\psi^{*}(\lambda',x) = \lambda'\psi^{*}(\lambda',x). \tag{30.4b}$$

We multiply (30.4a) by $\psi^{*}(\lambda',x)$ and (30.4b) by $\psi(\lambda,x)$,
integrate with respect to $dx$, and subtract one from the other:

$$\int[\psi^{*}(\lambda',x)\hat{\lambda}\psi(\lambda,x) - \psi(\lambda,x)\hat{\lambda}^{*}\psi^{*}(\lambda',x)]\,dx = (\lambda-\lambda')\int\psi^{*}(\lambda',x)\psi(\lambda,x)\,dx. \tag{30.5}$$

The left-hand side of this equation becomes zero in accordance with
the general requirement for Hermitian operators (30.3). Therefore, if
$\lambda'\neq\lambda$, the following integral must equal zero:

$$\int\psi^{*}(\lambda',x)\psi(\lambda,x)\,dx = 0 \quad \text{for } \lambda\neq\lambda'. \tag{30.6}$$

This property was proved in exercise 1, Sec. 24, as applied to the
eigenfunctions of the energy operator. It is called the property of
orthogonality.
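As a concrete illustration of orthogonality, the overlap integral can be evaluated numerically. This sketch assumes, purely for illustration, the normalized functions exp(ikφ)/√(2π) on the interval from 0 to 2π (eigenfunctions of the angular-momentum projection); the helper names are hypothetical.

```python
import cmath
import math

def psi_k(k, phi):
    # assumed normalized eigenfunction exp(i k phi) / sqrt(2 pi)
    return cmath.exp(1j * k * phi) / math.sqrt(2 * math.pi)

def overlap(k1, k2, n=1000):
    # rectangle-rule approximation of the integral of psi*(k1) psi(k2)
    d = 2 * math.pi / n
    return sum(psi_k(k1, i * d).conjugate() * psi_k(k2, i * d)
               for i in range(n)) * d

if __name__ == "__main__":
    print(abs(overlap(2, 2)))   # same eigenvalue: normalization integral
    print(abs(overlap(2, -1)))  # different eigenvalues: orthogonality
```

The first overlap is the normalization integral and the second vanishes to within rounding error.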

Several quantities, $\lambda$, $\nu$, etc., may sometimes exist in the same
state. For this it is necessary that the operators $\hat{\lambda}$,
$\hat{\nu}$, ... should be commutative. For example, for free motion there
exist $p_x$, $p_y$, and $p_z$. Then we may form functions, which are
eigenfunctions with respect to all the operators simultaneously:

$$\hat{\lambda}\psi(\lambda,\nu;x) = \lambda\psi(\lambda,\nu;x),$$

$$\hat{\nu}\psi(\lambda,\nu;x) = \nu\psi(\lambda,\nu;x);$$

the orthogonality condition for such functions is directly generalized
in the form

$$\int\psi^{*}(\lambda',\nu';x)\psi(\lambda,\nu;x)\,dx = 0, \tag{30.7}$$

if $\lambda'\neq\lambda$ or $\nu'\neq\nu$.

Expansion in eigenfunctions. Let us suppose that the eigenfunctions
of a certain operator $\hat{\lambda}$ are known. These functions always satisfy
(in addition to the equation $\hat{\lambda}\psi=\lambda\psi$) certain requirements:
they are finite, continuous, single-valued, and so forth. Then, in
accordance with the superposition principle, any function $\psi(x)$ which
satisfies the same requirements may be represented as the sum of the
eigenfunctions of the operator $\hat{\lambda}$:

$$\psi(x) = \sum_{\lambda'} c_{\lambda'}\psi(\lambda',x). \tag{30.8}$$

We shall show how to determine the expansion coefficients $c_{\lambda}$.
To do this we multiply both sides of the equation by $\psi^{*}(\lambda,x)$ and
integrate with respect to $dx$:

$$\int\psi^{*}(\lambda,x)\psi(x)\,dx = \sum_{\lambda'} c_{\lambda'}\int\psi^{*}(\lambda,x)\psi(\lambda',x)\,dx. \tag{30.9}$$

In accordance with the orthogonality condition all the integrals
on the right-hand side of the equality (30.9) become zero except those
for which $\lambda'=\lambda$. Consequently, there remains the equation

$$\int\psi^{*}(\lambda,x)\psi(x)\,dx = c_{\lambda}\int\psi^{*}(\lambda,x)\psi(\lambda,x)\,dx = c_{\lambda}\int|\psi(\lambda,x)|^{2}\,dx. \tag{30.10}$$

We shall consider that the eigenfunctions $\psi(\lambda,x)$ are normalized to
unity, i.e., $\int|\psi|^{2}\,dx=1$ [see (24.18)]. Then the expansion
coefficient is

$$c_{\lambda} = \int\psi^{*}(\lambda,x)\psi(x)\,dx. \tag{30.11}$$


In the case when we have a system of commutative operators $\hat{\lambda}$,
$\hat{\nu}$, equation (30.11) is directly generalized:

$$c_{\lambda,\nu} = \int\psi^{*}(\lambda,\nu;x)\psi(x)\,dx. \tag{30.12}$$

The meaning of the expansion coefficients. We have seen that a
state $\psi(x)$ is represented as a superposition of states with definite
values of the quantity $\lambda$. The component of the wave function
which corresponds to a given value of $\lambda$ is

$$c_{\lambda}\psi(\lambda,x).$$

It represents the probability amplitude for the given value of the
quantity $\lambda$ in the state $\psi(x)$. In order to find the probability itself,
$w_{\lambda}$, of the occurrence of quantity $\lambda$, we must eliminate the
coordinate dependence, since $\lambda$ and $x$ do not exist in the same state.

To do this, let us integrate the probability density of the state with
a given $\lambda$, i.e., $|c_{\lambda}\psi(\lambda,x)|^{2}$, over all $x$. From the
normalization condition for eigenfunctions we obtain

$$w_{\lambda} = \int|c_{\lambda}\psi(\lambda,x)|^{2}\,dx = |c_{\lambda}|^{2}. \tag{30.13}$$

The quantities $w_{\lambda}=|c_{\lambda}|^{2}$ have a basic property of probability:
their sum is equal to unity, provided the function $\psi(x)$ itself satisfies
the condition of normalization (24.18). Indeed,

$$1 = \int|\psi(x)|^{2}\,dx = \int\Big|\sum_{\lambda}c_{\lambda}\psi(\lambda,x)\Big|^{2}dx = \sum_{\lambda}|c_{\lambda}|^{2}\int|\psi(\lambda,x)|^{2}\,dx + \sum_{\lambda}\sum_{\lambda'\neq\lambda}c_{\lambda}c_{\lambda'}^{*}\int\psi^{*}(\lambda',x)\psi(\lambda,x)\,dx.$$

But by the orthogonality condition (30.6) the double summation is
equal to zero. From this, in accordance with (24.18), it follows that

$$\sum_{\lambda}|c_{\lambda}|^{2} = \sum_{\lambda}w_{\lambda} = 1. \tag{30.14}$$

Thus, the coefficient $c_{\lambda}$ should be regarded as the probability
amplitude, similar to $\psi(x)$. But $|\psi(x)|^{2}$ is the probability of
detecting a particle with coordinate $x$ independently of $\lambda$, while
$|c_{\lambda}|^{2}$ is the probability of finding it with a given value of the
quantity $\lambda$ independently of $x$.

Expansion in angular-momentum projection eigenfunctions. The
atomic beam in the Stern-Gerlach experiment is split into a certain
number of separate beams, corresponding to the number of angular-
momentum components along the magnetic field direction, $M_z=hk$.
Let us denote the largest eigenvalue $k$ by the letter $l$. Then
it is obvious that

$$l \geq k \geq -l, \tag{30.15}$$

i.e., $k$ takes on $2l+1$ values.

The eigenfunction corresponding to $M_z=hk$ is

$$\psi(k) = \frac{1}{\sqrt{2\pi}}\,e^{ik\varphi} \tag{30.16}$$

(the factor $\frac{1}{\sqrt{2\pi}}$ is introduced for normalization,
$\int_0^{2\pi}|\psi|^{2}\,d\varphi=1$).

If each of the separate beams is once again passed through a magnetic
field parallel to the z-axis, there is no further splitting; this is
because $M_z$ in these beams has a single definite value and not the
whole set of values in the range $hl \geq M_z \geq -hl$, as was the case in
the initial beam. From this the meaning of the orthogonality of
eigenfunctions is very well seen. If a particle is found in a beam
corresponding to a given value of $k$, then the probability of finding it in a
beam with a different value of the projection $M_z=hk'\neq hk$ is equal
to zero. From the general rule, the probability is equal to the square
of the modulus of the coefficient $c_{k'}$ of the expansion of the function
$\psi(k)$ in terms of the functions $\psi(k')$, i.e., in accordance with the
general expression (30.11),

$$c_{k'} = \int_0^{2\pi}\psi^{*}(k')\psi(k)\,d\varphi.$$

From the orthogonality condition (30.6), the integral is naturally
equal to zero if $k'\neq k$. Therefore the orthogonality condition is a
necessary condition of particles being found in states with definite values
of $M_z$ or, as in the case of an arbitrary operator $\hat{\lambda}$, in states with
definite values of $\lambda$. But the orthogonality condition follows directly from
the Hermitian nature of operators (30.3), while equation (30.2),
concerning the functions with equal values of $\lambda$, is insufficient.

The Hermitian condition implies the reality of eigenvalues together
with the possibility of "pure states," i.e., states with definite
eigenvalues of quantities.

If the second magnetic field is along the x-axis, then splitting will
again occur due to the component of angular momentum $M_x$, which
does not exist simultaneously with $M_z$. The number of splitting
components is again equal to $2l+1$, since it is determined by the maximum
angular-momentum projection $l$. This quantity cannot depend upon
the direction of the magnetic field, and is related only to the atomic
states in the original beam.

The eigenfunctions of $\hat{M}_x$ are

$$\psi(k_1) = \frac{1}{\sqrt{2\pi}}\,e^{ik_1\omega}, \tag{30.17}$$

where $l \geq k_1 \geq -l$, and $\omega$ is the angle of rotation about the x-axis.

Functions (30.16) and (30.17) do not coincide, which is a natural
consequence of their being eigenfunctions of noncommutative operators.

As a result of magnetic splitting in a field directed along the x-axis,
a beam with given value of $k$ is split into $2l+1$ beams with definite
values $k_1$. Hence, the function (30.16) will be represented as the
superposition of functions (30.17):

$$\psi(k) = \sum_{k_1=-l}^{l} c_{k_1}\psi(k_1). \tag{30.18}$$

The square of the modulus $|c_{k_1}|^{2}$ is proportional to the intensity
of the beam of the given projection $M_x=hk_1$ obtained as a result of
the secondary splitting of the beam with a given $M_z$.

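Averages in quantum mechanics. Let us now find the average of $\lambda$
in a state given by the wave function $\psi(x)$, represented in the form
of a sum (30.8). By definition the mean value is

$$\bar{\lambda} = \sum_{\lambda}\lambda w_{\lambda}, \tag{30.19}$$

i.e., the sum of possible values of $\lambda$ multiplied by the corresponding
probabilities. Let us substitute here $w_{\lambda}$ from (30.13) and $c_{\lambda}$
from (30.11):

$$\bar{\lambda} = \sum_{\lambda}\lambda|c_{\lambda}|^{2} = \sum_{\lambda}\lambda c_{\lambda}^{*}c_{\lambda} = \sum_{\lambda}\lambda c_{\lambda}^{*}\int\psi(x)\psi^{*}(\lambda,x)\,dx. \tag{30.20}$$

We shall now replace the product $\lambda\psi^{*}(\lambda,x)$ by
$\hat{\lambda}^{*}\psi^{*}(\lambda,x)$ and we first sum and then integrate. Then we
obtain

$$\bar{\lambda} = \int\psi(x)\sum_{\lambda}c_{\lambda}^{*}\hat{\lambda}^{*}\psi^{*}(\lambda,x)\,dx. \tag{30.21}$$

But the operator $\hat{\lambda}^{*}$ does not depend upon any definite value of
$\lambda$ [for example, if $\hat{\lambda}=\hat{p}_x$, then
$\hat{\lambda}^{*}=-\frac{h}{i}\frac{\partial}{\partial x}$].
Therefore, $\hat{\lambda}^{*}$ stands outside the sign of the summation:

$$\bar{\lambda} = \int\psi(x)\hat{\lambda}^{*}\sum_{\lambda}c_{\lambda}^{*}\psi^{*}(\lambda,x)\,dx. \tag{30.22}$$

The sum $\sum_{\lambda}c_{\lambda}^{*}\psi^{*}(\lambda,x)=\psi^{*}(x)$, since this is the
equation which is the complex conjugate of (30.8). Therefore,

$$\bar{\lambda} = \int\psi(x)\hat{\lambda}^{*}\psi^{*}(x)\,dx \tag{30.23}$$

or, from the Hermitian condition for the operator $\hat{\lambda}$ (30.2), (30.3),

$$\bar{\lambda} = \int\psi^{*}(x)\hat{\lambda}\psi(x)\,dx. \tag{30.24}$$

Thus, in order to calculate the mean value of $\lambda$ in a state $\psi(x)$,
it is not necessary to know the eigenvalues of $\hat{\lambda}$, since it is
sufficient to calculate the integral (30.24).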
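The equivalence of the two ways of computing a mean value, the probability-weighted sum (30.19) and the integral (30.24), can be illustrated numerically. The sketch below is an assumption-laden example, not part of the original derivation: it uses the angular-momentum projection in units of h, the eigenfunctions exp(ikφ)/√(2π), and a hypothetical set of amplitudes `COEFFS`.

```python
import cmath
import math

# hypothetical amplitudes c_k of a normalized state: |0.6|^2 + |0.8j|^2 = 1
COEFFS = {-1: 0.6, 0: 0.0, 1: 0.8j}

def psi_k(k, phi):
    # normalized eigenfunction of the projection operator, cf. (30.16)
    return cmath.exp(1j * k * phi) / math.sqrt(2 * math.pi)

def psi(phi):
    # superposition (30.8)
    return sum(c * psi_k(k, phi) for k, c in COEFFS.items())

def mz_psi(phi):
    # (1/i) d/dphi applied term by term (h = 1): each term is multiplied by k
    return sum(c * k * psi_k(k, phi) for k, c in COEFFS.items())

def mean_from_probabilities():
    # (30.19): sum of eigenvalues times probabilities |c_k|^2
    return sum(k * abs(c) ** 2 for k, c in COEFFS.items())

def mean_from_integral(n=1000):
    # (30.24): integral of psi* (operator psi) over one period
    d = 2 * math.pi / n
    return sum(psi(i * d).conjugate() * mz_psi(i * d) for i in range(n)) * d

if __name__ == "__main__":
    print(mean_from_probabilities(), mean_from_integral())
```

For these amplitudes both routes give a mean projection of 0.28 in units of h.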

The eigenvalues of the square of the angular momentum. If $\psi(x)$
is one of the eigenfunctions of the operator $\hat{\lambda}$, then the mean value
$\bar{\lambda}$ is simply reduced to this eigenvalue. Indeed, then

$$\bar{\lambda} = \int\psi^{*}(\lambda,x)\hat{\lambda}\psi(\lambda,x)\,dx = \lambda\int|\psi(\lambda,x)|^{2}\,dx = \lambda.$$


Taking advantage of the foregoing remark, it is easy to calculate 
the mean value of the square of the angular momentum. 

First of all it may be noted that in the Stern-Gerlach experiment
the mean values of the squares of all three angular-momentum
projections must be the same, because it is absolutely immaterial what
the notation of the coordinate axis is along which the magnetic field
is directed:

$$\overline{M_x^{2}} = \overline{M_y^{2}} = \overline{M_z^{2}}. \tag{30.25}$$

It follows that the mean value $\overline{M^{2}}$ is equal to three times the
mean value $\overline{M_z^{2}}$:

$$\overline{M^{2}} = \overline{M_x^{2}} + \overline{M_y^{2}} + \overline{M_z^{2}} = 3\,\overline{M_z^{2}}. \tag{30.26}$$

In the original beam all values of $M_z=hk$ from $-hl$ to $hl$ are equally
probable. This means that $\overline{M_z^{2}}$ is equal to

$$\overline{M_z^{2}} = h^{2}\,\frac{\sum_{k=-l}^{l}k^{2}}{2l+1} = h^{2}\,\frac{l(l+1)(2l+1)}{3(2l+1)} = \frac{h^{2}l(l+1)}{3}, \tag{30.27}$$

whence

$$\overline{M^{2}} = h^{2}l(l+1). \tag{30.28}$$

But it was shown in Sec. 29 that $\hat{M}^{2}$ is commutative with $\hat{M}_z$, so
that $M^{2}$ and $M_z$ exist in one and the same state. In the Stern-Gerlach
experiment the atoms in the beam occur predominantly in the ground
state. This state is characterized by a certain absolute value of the
angular momentum. Therefore, the mean value of the square of the angular
momentum in such a state is equal to its eigenvalue:

$$M^{2} = \overline{M^{2}} = h^{2}l(l+1). \tag{30.29}$$
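The arithmetic behind (30.27) and (30.28) can be verified with exact fractions. This is a small sketch restricted to integer values of l (an assumption of the example); the function names are hypothetical and the results are in units of h squared.

```python
from fractions import Fraction

def mean_mz_squared(l):
    # average of k^2 over the 2l+1 equally probable values k = -l, ..., l
    return Fraction(sum(k * k for k in range(-l, l + 1)), 2 * l + 1)

def mean_m_squared(l):
    # three equivalent axes, cf. (30.26)
    return 3 * mean_mz_squared(l)

if __name__ == "__main__":
    for l in range(4):
        print(l, mean_mz_squared(l), mean_m_squared(l))
```

For every integer l tried, the result coincides with l(l+1), in agreement with (30.28).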


The result (30.29) may appear somewhat surprising because the
eigenvalue of the square of the angular momentum is equal not to the
square of its greatest projection but to some greater amount.
However, if $M^{2}$ were equal to $h^{2}l^{2}$, i.e., to the greatest value of
$M_z^{2}$, then for the remaining projections there would remain an identical
zero. The other projections cannot have any definite values, including
zero values, at the same time as $M_z\neq 0$. Therefore, the square of the
angular momentum is somewhat greater than the square of the maximum
value of any of its projections. The only exception is when all
three projections are equal to zero (see Sec. 29).

Composition of angular momenta. Knowing the absolute value of
the angular momentum, we can now indicate a rule for the composition
of the angular momenta of two mechanical systems. Let the greatest
angular-momentum projection of one system equal $hl_1$ and that of
the other system $hl_2$; and also let $l_1 \geq l_2$. Then the projection of the
smaller angular momentum in the direction of the larger one is
contained between $hl_2$ and $-hl_2$, which, when added to the larger angular
momentum, yields values ranging from $h(l_1+l_2)$ to $h(l_1-l_2)$. It
follows that the greatest projection of the resultant angular momentum
upon any arbitrary direction in space is equal, in units of $h$, to

$$l = l_1+l_2,\ l_1+l_2-1,\ l_1+l_2-2,\ \ldots,\ l_1-l_2. \tag{30.30}$$

The eigenvalues of the square of the sum of the angular momenta are

$$h^{2}(l_1+l_2)(l_1+l_2+1),\quad h^{2}(l_1+l_2-1)(l_1+l_2),\quad \ldots,\quad h^{2}(l_1-l_2)(l_1-l_2+1).$$

The rule for composition of angular momenta formulated here agrees 
with the result that the value of a vector sum is contained between the 
sum and the difference of the absolute values of the vectors. 
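The composition rule (30.30) is easy to enumerate. The sketch below, with hypothetical function names and integer inputs assumed, lists the resultant values of l and checks that the total number of states, the sum of 2l+1 over the resultant values, equals the product (2l₁+1)(2l₂+1) of the numbers of states of the two systems.

```python
def resultant_l_values(l1, l2):
    # values l1+l2, l1+l2-1, ..., |l1-l2| according to (30.30)
    big, small = max(l1, l2), min(l1, l2)
    return list(range(big + small, big - small - 1, -1))

def count_states(l1, l2):
    # each resultant l contributes 2l+1 projections
    return sum(2 * l + 1 for l in resultant_l_values(l1, l2))

def squared_momentum_eigenvalues(l1, l2):
    # eigenvalues of the square of the total momentum in units of h^2
    return [l * (l + 1) for l in resultant_l_values(l1, l2)]

if __name__ == "__main__":
    print(resultant_l_values(2, 1))            # [3, 2, 1]
    print(squared_momentum_eigenvalues(2, 1))  # [12, 6, 2]
    print(count_states(2, 1))                  # 15
```

The state count serves as a consistency check: no states are lost or gained when the two momenta are combined.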

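Quantum equations of motion. Let us suppose that a certain operator
$\hat{\lambda}$ is given. It is required to find the operator form of its total
time derivative, i.e., $\hat{\dot{\lambda}}$. We shall first of all determine the total
derivative of the mean value $\bar{\lambda}$. In accordance with (30.24), for any
state with wave function $\psi$, this derivative is

$$\dot{\bar{\lambda}} = \int\frac{\partial\psi^{*}}{\partial t}\hat{\lambda}\psi\,dx + \int\psi^{*}\frac{\partial\hat{\lambda}}{\partial t}\psi\,dx + \int\psi^{*}\hat{\lambda}\frac{\partial\psi}{\partial t}\,dx.$$

Let us substitute here the derivatives $\frac{\partial\psi^{*}}{\partial t}$ and
$\frac{\partial\psi}{\partial t}$ from the Schrödinger equation (24.11), whose
right-hand side we shall represent as $\hat{\mathcal{H}}\psi$, where
$\hat{\mathcal{H}}$ is the Hamiltonian operator [see (29.7)], so that
$-\frac{h}{i}\frac{\partial\psi}{\partial t}=\hat{\mathcal{H}}\psi$. From this,

$$\dot{\bar{\lambda}} = \frac{i}{h}\int(\hat{\mathcal{H}}^{*}\psi^{*})\hat{\lambda}\psi\,dx + \int\psi^{*}\frac{\partial\hat{\lambda}}{\partial t}\psi\,dx - \frac{i}{h}\int\psi^{*}\hat{\lambda}\hat{\mathcal{H}}\psi\,dx.$$

We transform the first integral on the right-hand side in accordance
with the Hermitian condition for $\hat{\mathcal{H}}$, namely

$$\int(\hat{\mathcal{H}}^{*}\psi^{*})(\hat{\lambda}\psi)\,dx = \int\psi^{*}\hat{\mathcal{H}}\hat{\lambda}\psi\,dx.$$

We now combine all three integrals and obtain

$$\dot{\bar{\lambda}} = \int\psi^{*}\left(\frac{\partial\hat{\lambda}}{\partial t} + \frac{i}{h}\,[\hat{\mathcal{H}}\hat{\lambda} - \hat{\lambda}\hat{\mathcal{H}}]\right)\psi\,dx. \tag{30.31}$$

If we now define the operator $\hat{\dot{\lambda}}$ by the equality

$$\dot{\bar{\lambda}} = \int\psi^{*}\hat{\dot{\lambda}}\psi\,dx, \tag{30.32}$$

then we obtain the equation

$$\hat{\dot{\lambda}} = \frac{\partial\hat{\lambda}}{\partial t} + \frac{i}{h}\,[\hat{\mathcal{H}}\hat{\lambda} - \hat{\lambda}\hat{\mathcal{H}}]. \tag{30.33}$$

The operators of linear momentum, angular momentum, and coordinate
that have been employed up till now do not depend upon time
explicitly. For them, only the second term of (30.33) remains:

$$\hat{\dot{\lambda}} = \frac{i}{h}\,[\hat{\mathcal{H}}\hat{\lambda} - \hat{\lambda}\hat{\mathcal{H}}]. \tag{30.34}$$

Thus, if a given operator commutes with the Hamiltonian operator
$\hat{\mathcal{H}}$, then $\hat{\dot{\lambda}}=0$. It is then natural to call the quantity
$\lambda$ a quantum integral of motion. In accordance with the general result of
Sec. 29, quantum integrals of motion have a common state with energy, since
their operators are commutative.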

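We shall now find the equations of motion for the $\hat{x}$ and $\hat{p}_x$
operators. From (29.7), the energy operator is equal to
$\frac{\hat{p}^{2}}{2m}+U$. Here, only $\hat{p}_x^{2}$ is noncommutative with $x$;
for $\hat{p}_x^{2}$ we find

$$\hat{p}_x^{2}x - x\hat{p}_x^{2} = \hat{p}_x^{2}x - \hat{p}_x x\hat{p}_x + \hat{p}_x x\hat{p}_x - x\hat{p}_x^{2} = \hat{p}_x(\hat{p}_x x - x\hat{p}_x) + (\hat{p}_x x - x\hat{p}_x)\hat{p}_x = \frac{2h}{i}\,\hat{p}_x.$$

It follows that

$$\hat{\dot{x}} = \frac{\hat{p}_x}{m}, \tag{30.35}$$

i.e., the operator $\hat{\dot{x}}$ is related to the operator $\hat{p}_x$ by the same
expression as the quantities $\dot{x}=v_x$ and $p_x$ in classical mechanics.

Let us now find $\hat{\dot{p}}_x$. $\hat{p}_x$ does not commute with $U$. The
commutator of $U$ and $\hat{p}_x$ is easily evaluated:

$$(U\hat{p}_x - \hat{p}_x U)\psi = \frac{h}{i}\left(U\frac{\partial\psi}{\partial x} - \frac{\partial}{\partial x}(U\psi)\right) = -\frac{h}{i}\frac{\partial U}{\partial x}\psi,$$

whence, symbolically,

$$U\hat{p}_x - \hat{p}_x U = -\frac{h}{i}\frac{\partial U}{\partial x}. \tag{30.36}$$

Hence,

$$\hat{\dot{p}}_x = -\frac{\partial U}{\partial x}, \tag{30.37}$$

which is completely analogous to the classical relationship between
the momentum derivative and the force.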

The quantum equations of motion (30.34) were the starting point 
for W. Heisenberg, who arrived at quantum mechanics independently 
of Schrodinger. The equivalence of both approaches was shown some¬ 
what later. 
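The two commutators used in deriving (30.35) and (30.37) can be checked by letting the operators act on polynomials. The sketch below is an illustration under stated assumptions: polynomials are stored as coefficient lists `[a0, a1, a2, ...]`, the quantum of action is set to 1 so that the momentum operator is (1/i) d/dx, and all helper names are hypothetical.

```python
def derivative(poly):
    # d/dx of a polynomial given as a coefficient list [a0, a1, a2, ...]
    return [j * poly[j] for j in range(1, len(poly))] or [0]

def times_x(poly):
    # multiplication by x shifts every coefficient up by one power
    return [0] + list(poly)

def multiply(a, b):
    # product of two polynomials
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def subtract(a, b):
    # difference of two polynomials, padding the shorter one with zeros
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return [x - y for x, y in zip(a, b)]

def p_op(poly):
    # momentum operator (h/i) d/dx with h = 1
    return [c / 1j for c in derivative(poly)]

def is_zero(poly, tol=1e-12):
    return all(abs(c) < tol for c in poly)

def check_x_commutator(f):
    # (p^2 x - x p^2) f should equal (2h/i) p f
    left = subtract(p_op(p_op(times_x(f))), times_x(p_op(p_op(f))))
    right = [2 * c / 1j for c in p_op(f)]
    return is_zero(subtract(left, right))

def check_u_commutator(u, f):
    # (U p - p U) f should equal -(h/i) (dU/dx) f, cf. (30.36)
    left = subtract(multiply(u, p_op(f)), p_op(multiply(u, f)))
    right = [-c / 1j for c in multiply(derivative(u), f)]
    return is_zero(subtract(left, right))

if __name__ == "__main__":
    f = [1, 2, 3, 4]   # 1 + 2x + 3x^2 + 4x^3
    u = [0, 0, 1]      # U = x^2, a hypothetical potential
    print(check_x_commutator(f), check_u_commutator(u, f))
```

Polynomials are used only because differentiation and multiplication by x are exact on them; the operator identities themselves do not depend on this choice.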

The wave function and measurement of quantities. The probability
amplitudes characterize the properties of a system in relation to the
results of measuring certain quantities. If a system occurs in a state
with wave function $\psi(x)$, and the quantity $\lambda$ is measured, then the
probability of obtaining the given value of $\lambda$ is [see (30.11), (30.13)]

$$|c_{\lambda}|^{2} = \left|\int\psi^{*}(\lambda,x)\psi(x)\,dx\right|^{2}.$$

For example, in the Stern-Gerlach experiment, the particles in the
original beam have all angular-momentum projections between $-hl$
and $hl$. Measurement results yield $2l+1$ beams, each of them
corresponding to the z-th angular-momentum projection given by a definite
value $hk$. However, the same measurement in a field directed along the
x-axis of the original system would split the beam according to the
x-th angular-momentum projections. Both angular-momentum
projections do not exist simultaneously, and the initial states of the
particles in the beam were identical. It follows that, as a result of
measurement, the particles occur either with a definite z-th, or with a
definite x-th angular-momentum projection.

A measurement of a microscopic entity essentially changes the state 
of the latter. This is the fundamental difference between the concepts 
of measurement in classical and in quantum physics; a classical meas¬ 
urement has an infinitesimally small effect on the object being 
measured. 

As a result of measurement, the angular-momentum projections in
the original beam acquire $2l+1$ values, no matter how the
measurement is performed. The state of these particles after measurement is
essentially different and depends upon how the measurement was per¬ 
formed. But by performing measurements of a large number of iden¬ 
tical entities, we can find out in what state they were before measure¬ 
ment, quite independently of the method of measurement. For this 
reason, a quantum measurement yields physical results which are 
just as objective as those given by a classical measurement though, 
obviously, within the limits permitted by the uncertainty principle. 


Thus, in the Stern-Gerlach experiment it appears that the particles
had an absolute angular-momentum value $M^{2}=h^{2}l(l+1)$, while
the direction of the angular momentum in space was arbitrary (a
nonpolarized beam).

The repeated measurement of the z-th angular-momentum projection,
in the beams which had passed earlier through a field directed
along the z-axis, gives a definite value of $M^{2}=h^{2}l(l+1)$ and a definite
value of $M_z=hk$.

Exercises

1) Expand the function $\psi=\frac{1}{\sqrt{a}}$, in an infinitely deep
rectangular potential well, in terms of functions (25.12).

$$c_n = \int\psi\,\psi_n\,dx = \frac{\sqrt{2}}{a}\int_0^{a}\sin\frac{\pi(n+1)x}{a}\,dx = \frac{\sqrt{2}}{\pi(n+1)}\,[-\cos\pi(n+1)+1] = \frac{\sqrt{2}}{\pi(n+1)}\,[1-(-1)^{n+1}].$$
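The coefficients of exercise 1 can be cross-checked numerically, together with the normalization property (30.14). The sketch assumes that the functions (25.12) have the form √(2/a) sin(π(n+1)x/a) on the interval from 0 to a, which is what the prefactor of the result above implies; the function names and the well width a = 1 are hypothetical.

```python
import math

def c_formula(n):
    # closed form obtained in exercise 1
    return math.sqrt(2) / (math.pi * (n + 1)) * (1 - (-1) ** (n + 1))

def c_numeric(n, a=1.0, steps=20000):
    # midpoint rule for the integral of psi(x) * psi_n(x) over the well
    dx = a / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        psi = 1 / math.sqrt(a)
        psi_n = math.sqrt(2 / a) * math.sin(math.pi * (n + 1) * x / a)
        total += psi * psi_n
    return total * dx

def partial_norm(n_max):
    # partial sum of |c_n|^2, which should approach unity by (30.14)
    return sum(c_formula(n) ** 2 for n in range(n_max + 1))

if __name__ == "__main__":
    for n in range(4):
        print(n, c_formula(n), c_numeric(n))
    print(partial_norm(10000))
```

Only even n contribute, and the partial sums of the squared coefficients approach unity slowly because the constant function does not vanish at the walls of the well.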


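2) Find the energy eigenvalues for a symmetrical quantum top. The energy
of the symmetrical top is

$$\mathcal{E} = \frac{1}{2J_1}(M_1^{2}+M_2^{2}) + \frac{M_3^{2}}{2J_3}.$$

Introducing $M^{2}$, we have

$$\mathcal{E} = \frac{M^{2}}{2J_1} + \left(\frac{1}{2J_3}-\frac{1}{2J_1}\right)M_3^{2}.$$

Substituting the eigenvalues for the angular momentum and its projections,
we at last find

$$\mathcal{E} = \frac{h^{2}}{2}\left[\frac{l(l+1)}{J_1} + k^{2}\left(\frac{1}{J_3}-\frac{1}{J_1}\right)\right].$$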


Sec. 31. Motion in a Central Field 

The motion of an electron in a central attractive field is the princi¬ 
pal problem in the quantum mechanics of the atom. And it is not 
necessary to regard the field as strictly Coulomb in character. For 
example, in alkali-metal atoms, an outer electron which is bound 
relatively weakly to the nucleus moves in the field of the nucleus and 
the so-called atomic residue (i. e., all the other electrons). The charge- 
density distribution for these electrons possesses spherical symmetry 
and therefore produces a central field. We shall suppose that the poten¬ 
tial energy of the electron is equal to U (r), where r is the distance 
from the nucleus. 

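The energy operator and the angular-momentum integral. The
equation for the energy eigenvalues of an atom (24.22) is, as usual,

$$-\frac{h^{2}}{2m}\Delta\psi + U(r)\psi = \mathcal{E}\psi. \tag{31.1}$$

Here, $m$ is the reduced mass of the nucleus and the electron, which
mass is very close to the mass of the electron. Since the field is central
we must pass to spherical coordinates. The Laplacian operator in
spherical coordinates was obtained in Sec. 11 (11.46). Using this
expression, we rewrite (31.1) explicitly:

$$-\frac{h^{2}}{2m}\left[\frac{1}{r^{2}}\frac{\partial}{\partial r}r^{2}\frac{\partial\psi}{\partial r} + \frac{1}{r^{2}\sin\vartheta}\frac{\partial}{\partial\vartheta}\sin\vartheta\frac{\partial\psi}{\partial\vartheta} + \frac{1}{r^{2}\sin^{2}\vartheta}\frac{\partial^{2}\psi}{\partial\varphi^{2}}\right] + U(r)\psi = \mathcal{E}\psi. \tag{31.2}$$

The operator involving angular differentiation is simply the square
of the angular momentum introduced by us in exercise 4, Sec. 29.
Therefore, equation (31.2) can also be rewritten as

$$-\frac{h^{2}}{2m}\frac{1}{r^{2}}\frac{\partial}{\partial r}r^{2}\frac{\partial\psi}{\partial r} + \frac{\hat{M}^{2}\psi}{2mr^{2}} + U(r)\psi = \mathcal{E}\psi. \tag{31.3}$$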

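It follows that the Hamiltonian operator $\hat{\mathcal{H}}$ [see (29.6)] is
related to the angular-momentum operator in the following way:

$$\hat{\mathcal{H}} = -\frac{h^{2}}{2m}\frac{1}{r^{2}}\frac{\partial}{\partial r}r^{2}\frac{\partial}{\partial r} + \frac{\hat{M}^{2}}{2mr^{2}} + U(r). \tag{31.4}$$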


Reducing to an ordinary differential equation. The operator $\hat{M}^{2}$
involves only the angles $\vartheta$ and $\varphi$ and derivatives with respect to
them. All the derivatives with respect to angles in the operator
$\hat{\mathcal{H}}$ are contained in the one term $\hat{M}^{2}$, while all the remaining
terms involve only $r$ and the derivative with respect to $r$. Consequently,
the operators $\hat{\mathcal{H}}$ and $\hat{M}^{2}$ are commutative, since $\hat{M}^{2}$
commutes with any function of $r$ and, of course, with itself. Commutative
operators have eigenvalues in the same state. Therefore, in a central field,
the square of the angular momentum and one of its projections have
(together with energy) eigenvalues, which, in accordance with (30.34), are
quantum integrals of motion. All the other quantities which are not
integrals of motion do not exist in the same energy state (in classical
mechanics they, naturally, exist but are not conserved).

Thus, in equations (31.3) and (31.4), we can substitute in place of
$\hat{M}^{2}$ its eigenvalue $h^{2}l(l+1)$ from (30.29). Then any angular dependence
will be eliminated from equation (31.3) and, in place of the partial
derivative with respect to $r$, we will get the total derivative:

$$-\frac{h^{2}}{2m}\frac{1}{r^{2}}\frac{d}{dr}r^{2}\frac{d\psi}{dr} + \frac{h^{2}l(l+1)}{2mr^{2}}\psi + U\psi = \mathcal{E}\psi. \tag{31.5}$$

It is considerably simpler to solve this equation than the partial
differential equation (31.2). The form of (31.5) corresponds to (6.6)
in classical mechanics, where it was also possible to eliminate all
variables except $r$ with the aid of the angular-momentum integral.


Reduction to one-dimensional form. It is convenient to reduce
equation (31.5) to a one-dimensional form. To do this (the treatment
is similar to that used in the problem of the propagation of spherical
waves [cf. (19.6)]) we introduce the function

$$\psi = \frac{\chi}{r}. \tag{31.6}$$

Without repeating the computations by means of which the
one-dimensional form (19.6) was obtained, we write down the analogous
equation for $\chi$:

$$-\frac{h^{2}}{2m}\frac{d^{2}\chi}{dr^{2}} + \frac{h^{2}l(l+1)}{2mr^{2}}\chi + U\chi = \mathcal{E}\chi. \tag{31.7}$$


The wave function at large and small distances from the nucleus. As 
long as the form of U (r) has not yet been made definite, we can con¬ 
sider (31.7) only in two limiting cases: for very large and for very 
small distances from the nucleus. 

The field of the atomic residue is not effective at very small
distances from the nucleus, and there remains only the Coulomb
relationship $U=-\frac{Ze^{2}}{r}$ ($Z$ is the atomic number of the element).
However, if $r$ is very small then the term $\frac{h^{2}l(l+1)}{2mr^{2}}\chi$ is, in
any case,* larger than the term $U\chi$, which involves $r$ in the denominator
only in the first degree, and all the more greater than $\mathcal{E}\chi$. Hence,
in direct proximity to the nucleus, the wave equation is of very simple form:

$$\frac{d^{2}\chi}{dr^{2}} = \frac{l(l+1)}{r^{2}}\chi. \tag{31.8}$$

In this form it is solved by the substitution

$$\chi = r^{\alpha}, \tag{31.9}$$

so that

$$\alpha(\alpha-1) = l(l+1). \tag{31.10}$$

This equation has two roots:

$$\alpha = l+1 \quad \text{and} \quad \alpha = -l. \tag{31.11}$$

But the second root gives $\psi=r^{-l-1}$ from (31.6); at the point $r=0$,
this function becomes infinite for all $l$. Therefore, we must discard
the root $\alpha=-l$ and take the relationship between $\psi$ and $r$ for small $r$
in the form

$$\psi = r^{l}. \tag{31.12}$$

* The result (31.12) is true for $l=0$ as well, even though the term
$\frac{h^{2}l(l+1)}{2mr^{2}}$ in this case does not exist at all and cannot
exceed $U(r)$.


The greater the angular momentum, the higher the order of the
wave-function zero at the coordinate origin. Only for $l=0$ does it
remain finite close to the nucleus. This can be understood by analogy
with classical mechanics: angular momentum is the product of
momentum by the "arm," i.e., by the distance from the origin; $l=0$
corresponds to a zero "arm" and a zero angular momentum. Therefore,
there is a nonzero probability of finding the electron at the
origin. In the old version of quantum mechanics (due to Bohr), the
electron orbit with zero angular momentum passed through the
nucleus. The larger angular-momentum values correspond to larger
"arms" and, correspondingly, in quantum mechanics, to a smaller
probability of finding an electron close to the nucleus.

The behaviour of the wave function close to the origin can also be
explained as follows. A centrifugal repulsive force acts on the particle;
to this force there corresponds an effective potential energy
$\frac{h^{2}l(l+1)}{2mr^{2}}$. This energy limits the classically possible region of
motion for small $r$. In quantum mechanics the particle penetrates the
centrifugal barrier, though more weakly the greater $l$, i.e., the higher the
barrier. There is no barrier for $l=0$ and there is nothing to prevent finding
the particle at the origin.

The terms $\frac{h^{2}l(l+1)}{2mr^{2}}\chi$ and $U\chi$ must be discarded for large
$r$ in the wave equation, because $U(r)$ is assumed to be zero at infinity,
$U(\infty)=0$. Then the equation is also greatly simplified:

$$\frac{d^{2}\chi}{dr^{2}} = -\frac{2m\mathcal{E}}{h^{2}}\chi. \tag{31.13}$$

Its general solution appears thus:

$$\chi = C_1 e^{\frac{\sqrt{-2m\mathcal{E}}}{h}r} + C_2 e^{-\frac{\sqrt{-2m\mathcal{E}}}{h}r}. \tag{31.14}$$

Positive and negative energy values. We consider two cases. Let the
energy be positive, $\mathcal{E}>0$. Here, $\chi$ appears as follows:

$$\chi = C_1 e^{\frac{ir\sqrt{2m\mathcal{E}}}{h}} + C_2 e^{-\frac{ir\sqrt{2m\mathcal{E}}}{h}}. \tag{31.15}$$

Both terms remain finite for any value of $r$. Therefore, two constants,
$C_1$ and $C_2$, must be retained in the solution. We came across the same
situation in considering the solution of wave equation (25.33) for a
potential well of finite depth.

Any general solution of a second-order differential equation involves
two arbitrary constants. Let us suppose that the solution (31.12),
which holds for small $r$ only, is continued into the region of large $r$,
where it is not of the simple form $r^{l}$, but nevertheless satisfies the
precise equation (31.7). A certain integral curve is obtained for this
equation. But any integral curve can be represented by properly
choosing the constants in the general solution. As $r$ tends to infinity
this solution acquires its asymptotic form (31.15) if $\mathcal{E}>0$. The
expression (31.15) remains finite when $r\to\infty$ for any constants $C_1$ and
$C_2$. It follows that, for a positive energy, the wave equation always
has a finite solution for any values of $r$. Therefore, the values
$\mathcal{E}>0$ correspond to a continuous energy spectrum, since the wave
function satisfies the required conditions at zero and at infinity for any
$\mathcal{E}>0$. In accordance with (31.15), the probability of finding an
electron at infinity, for $r\to\infty$, does not become zero; i.e., this case
corresponds to infinite motion, as in the classical problem considered in
Sec. 6 (see also Sec. 25).

Thus, the general rule has been confirmed that infinite motion
possesses a continuous energy spectrum.

Now let $\mathcal{E}<0$, or $\mathcal{E}=-|\mathcal{E}|$. Then (31.14) must be
represented thus:

$$\chi = C_1 e^{\frac{r\sqrt{2m|\mathcal{E}|}}{h}} + C_2 e^{-\frac{r\sqrt{2m|\mathcal{E}|}}{h}}. \tag{31.16}$$

Here the first solution tends to infinity together with $r$ and we must
therefore put $C_1=0$, so that $\chi$ will involve one instead of two arbitrary
constants:

$$\chi = C_2 e^{-\frac{r\sqrt{2m|\mathcal{E}|}}{h}}. \tag{31.17}$$

The condition for eigenvalues. If we now draw an integral curve
from the coordinate origin, proceeding from (31.12), then, as a rule,
for large $r$ it will not be reduced to the form (31.17). For all negative
energy values, except certain ones, the integral curve is represented in
the form (31.16) at infinity when $\mathcal{E}<0$ and, hence, does not satisfy
the boundary condition imposed on the wave function. Only for those
energy values for which it turns out that

$$C_1(\mathcal{E}) = 0 \tag{31.18}$$

does the wave equation have a solution. This corresponds to a discrete
energy spectrum. At the same time, $\chi(\infty)$ becomes zero, so that the
finite motion has a discrete energy spectrum, as expected.
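The argument of drawing an integral curve from the origin and demanding that the growing exponential be absent can be imitated numerically by a "shooting" calculation. The sketch below is an illustration under stated assumptions and is not the method of the text: it takes a Coulomb field with Z = 1, sets h = m = e = 1, writes the negative energy as −eps, starts from the small-distance behaviour (31.12) with one correction term, integrates the radial equation outward with a Runge-Kutta scheme, and bisects on the sign of the solution at a large distance. All function names, the step size and the cut-off distance are hypothetical choices.

```python
def shoot(eps, Z=1, l=0, xi_max=20.0, h=0.01):
    """Integrate chi'' = [l(l+1)/xi^2 - 2Z/xi + 2*eps] chi outward from xi = h
    and return chi(xi_max). The energy is -eps in units with h = m = e = 1."""
    xi = h
    # start from chi ~ xi^(l+1) (1 - Z xi/(l+1)), regular at the origin
    chi = xi ** (l + 1) * (1 - Z * xi / (l + 1))
    dchi = (l + 1) * xi ** l - (l + 2) * Z * xi ** (l + 1) / (l + 1)

    def acc(x, y):
        return (l * (l + 1) / x ** 2 - 2 * Z / x + 2 * eps) * y

    steps = int(round((xi_max - xi) / h))
    for _ in range(steps):
        # classical fourth-order Runge-Kutta step for y' = v, v' = acc(x, y)
        k1y, k1v = dchi, acc(xi, chi)
        k2y, k2v = dchi + 0.5 * h * k1v, acc(xi + 0.5 * h, chi + 0.5 * h * k1y)
        k3y, k3v = dchi + 0.5 * h * k2v, acc(xi + 0.5 * h, chi + 0.5 * h * k2y)
        k4y, k4v = dchi + h * k3v, acc(xi + h, chi + h * k3y)
        chi += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        dchi += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        xi += h
    return chi

def find_level(lo, hi, **kwargs):
    """Bisect on eps until the growing exponential changes sign, i.e. C1 = 0."""
    f_lo = shoot(lo, **kwargs)
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        f_mid = shoot(mid, **kwargs)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    print(find_level(0.3, 0.7))  # lowest level for Z = 1, l = 0
```

For almost every trial energy the curve diverges at large distances, with a sign that flips as the energy passes through an allowed value; the bisection locates that value, which for the chosen units should lie close to eps = 1/2.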

The Coulomb field. The transition to atomic units. We shall now
find this spectrum for an electron in a purely Coulomb field:

$$U = -\frac{Ze^{2}}{r}. \tag{31.19}$$

This occurs in a hydrogen atom (though not in a molecule!), in
singly ionized helium, doubly ionized lithium, etc. $Z$, as usual, denotes
the atomic number of the nucleus.


The wave equation (31.7) is now written as

$$-\frac{h^{2}}{2m}\frac{d^{2}\chi}{dr^{2}} + \frac{h^{2}l(l+1)}{2mr^{2}}\chi - \frac{Ze^{2}}{r}\chi = -|\mathcal{E}|\chi. \tag{31.20}$$

We have straightway taken the case of negative energies that leads
to a discrete spectrum.

It is convenient here to change the units of length and energy
similar to the way it was done in the problem of the harmonic oscillator
(Sec. 26). In place of the CGS system (where the basic units are the
arbitrary quantities centimetre, gram, second) we take the following
units: the elementary charge $e$, the mass of the electron $m$, and the
quantum of action $h$. From these quantities we form the unit of length

$$\frac{h^{2}}{me^{2}} = 5.2917\times10^{-9}\ \text{cm}$$

and the unit of energy

$$\frac{me^{4}}{h^{2}}.$$

Hence, if we put $e=1$, $m=1$, $h=1$ in equation (31.20), then length
and energy will be measured in these units. Let us call this length $\xi$
and energy $\varepsilon$:

$$r = \frac{h^{2}}{me^{2}}\,\xi, \tag{31.21}$$

$$|\mathcal{E}| = \frac{me^{4}}{h^{2}}\,\varepsilon, \tag{31.22}$$

so that, of the constants, the wave equation will involve only the
atomic number $Z$:

$$\frac{d^{2}\chi}{d\xi^{2}} - \frac{l(l+1)}{\xi^{2}}\chi + \frac{2Z}{\xi}\chi = 2\varepsilon\chi. \tag{31.23}$$

Solution by the series-expansion method. We look for the solution
of this equation in the form of a series expansion. We shall proceed
here from the solutions obtained for large and small values of $\xi$
(i.e., $r$).

In accordance with equations (31.12) and (31.17), we write $\chi$ in the
following form:

$$\chi = \xi^{l+1}e^{-\xi\sqrt{2\varepsilon}}\sum_{n=0}^{\infty}\chi_n\xi^{n} = e^{-\xi\sqrt{2\varepsilon}}\sum_{n=0}^{\infty}\chi_n\xi^{n+l+1}. \tag{31.24}$$

The first factor determines the form of $\chi$ for $\xi\to0$, the second factor
should basically correspond to the form of $\chi$ for large $\xi$, and the series
interpolates, as it were, between the limiting values.


Differentiating (31.24) twice, we obtain

$$\frac{d^{2}\chi}{d\xi^{2}} = 2\varepsilon\,e^{-\xi\sqrt{2\varepsilon}}\sum_{n=0}^{\infty}\chi_n\xi^{n+l+1} - 2\sqrt{2\varepsilon}\,e^{-\xi\sqrt{2\varepsilon}}\sum_{n=0}^{\infty}(n+l+1)\chi_n\xi^{n+l} + e^{-\xi\sqrt{2\varepsilon}}\sum_{n=0}^{\infty}(n+l+1)(n+l)\chi_n\xi^{n+l-1}. \tag{31.25}$$

The first term on the right is simply $2\varepsilon\chi$. Hence, it cancels with the
same term in (31.23). We group the remaining terms so that in one of
them the degree of $\xi$ is everywhere less by unity than in (31.24) and,
in the other, less by two units. In addition we eliminate the common
factor $e^{-\xi\sqrt{2\varepsilon}}$. We shall now have an equality between two such
series:

$$\sum_{n=0}^{\infty}[l(l+1)-(n+l+1)(n+l)]\,\chi_n\xi^{n+l-1} = \sum_{n=0}^{\infty}[2Z-2\sqrt{2\varepsilon}\,(n+l+1)]\,\chi_n\xi^{n+l}. \tag{31.26}$$

An equality between series is possible only when the coefficients of
the same powers of $\xi$ coincide. On the left-hand side the power $\xi^{n+l}$
will have a coefficient involving $\chi_{n+1}$.
Hence

$$\chi_{n+1} = \chi_n\,\frac{2[Z-(n+l+1)\sqrt{2\varepsilon}]}{l(l+1)-(n+l+1)(n+l+2)}. \tag{31.27}$$


Examining the series and the condition for eigenvalues. From the 
relationship (31.27), all the coefficients X" ar® determined consecu¬ 
tively. We must neglect the constant numbers I and Z in equation 
(31.27) when n are large; there then remains the limit 


X"+i “ 


2-v/2e Xn 
n 


(31.28) 


We met with a similar expression in the problem of the harmonic 
oscillator (26.16). In the case of large ^ it reduces the whole series to 
an exponential form: 


^ X”5" = ^ 


(31.29) 


But such a series cannot give a correct solution to the wave equation
because, if we substitute (31.29) in (31.24), we obtain $\psi(\infty) = \infty$
despite the boundary condition. However, if all the coefficients become
zero from a certain $\chi_{n+1}$ onwards, the series (31.29) degenerates to a
polynomial. Then, being multiplied by $e^{-\xi\sqrt{2\varepsilon}}$, it gives $\psi(\infty) = 0$,
as expected. It can be seen from (31.27) that $\chi_{n+1}$ is equal to
zero if

$$Z - (n+l+1)\sqrt{2\varepsilon} = 0, \tag{31.30}$$

i.e.,

$$\varepsilon = \frac{Z^2}{2(n+l+1)^2}. \tag{31.31}$$

Finally, going over to conventional units and taking into account
the sign of the energy, we obtain the required spectrum:

$$E = -\frac{Z^2 m e^4}{2h^2 (n+l+1)^2}. \tag{31.32}$$
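
The termination argument can be checked directly. The following is a minimal sketch, not part of the original text: it iterates the recurrence (31.27) in atomic units with exact fractions, starting from $\chi_0 = 1$ (a normalization assumed here only for illustration). The function names are hypothetical, `kappa` stands for $\sqrt{2\varepsilon}$, and `n_r` is the degree of the polynomial (the number $n$ of (31.30)).

```python
from fractions import Fraction


def series_coefficients(Z, l, kappa, n_terms):
    """Coefficients chi_0..chi_{n_terms-1} from recurrence (31.27), chi_0 = 1.

    kappa plays the role of sqrt(2*eps) in atomic units.
    """
    chi = [Fraction(1)]
    for n in range(n_terms - 1):
        numerator = 2 * (Z - (n + l + 1) * kappa)
        # The denominator is never zero for n >= 0.
        denominator = l * (l + 1) - (n + l + 1) * (n + l + 2)
        chi.append(chi[-1] * numerator / denominator)
    return chi


def eigen_kappa(Z, n_r, l):
    """Value of sqrt(2*eps) that satisfies condition (31.30)."""
    return Fraction(Z, n_r + l + 1)


def energy_atomic_units(Z, n_r, l):
    """Energy -eps from (31.31), in atomic units."""
    return -Fraction(Z * Z, 2 * (n_r + l + 1) ** 2)


if __name__ == "__main__":
    # Z = 1, l = 0, polynomial of degree 2: the series stops after chi_2.
    print(series_coefficients(1, 0, eigen_kappa(1, 2, 0), 6))
    # A value of kappa that violates (31.30): no coefficient vanishes.
    print(series_coefficients(1, 0, Fraction(2, 5), 6))
```

With a value of `kappa` taken from (31.30) every coefficient beyond the degree of the polynomial vanishes exactly; for any other value the series does not terminate.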


Quantum numbers. The number $n$ is the degree of the polynomial
$\sum_n \chi_n \xi^{n}$ (called the Sonin-Laguerre polynomial). A more detailed
analysis shows that this polynomial becomes zero exactly $n$ times,
corresponding to its degree. Therefore, if we examine the dependence
of the wave function on radius, it has $n$ zeros or “nodes,” not counting
the zero at $r = \infty$ and at $r = 0$, which all functions with $l \neq 0$ have.
The term node instead of zero is given by analogy with the nodes of a
vibrating string fixed at both ends. In future we shall call $n_r$ the degree
of the polynomial and denote by the letter $n$ the whole sum

$$n = n_r + l + 1. \tag{31.33}$$

It is convenient to use these quantities also in the more complex
cases of many-electron atoms. Even though the energy in such a case
does not have the simple form (31.32), the numbers $n$, $n_r$, and $l$ are
convenient for classification of the states.

$l$ is called the azimuthal quantum number. As we know, it defines
the angular momentum of an electron. The following system of notation
is used in spectroscopy: the electron state with $l = 0$ is called the
$s$-state and, corresponding to $l = 1, 2, 3$, we have the $p$-, $d$- and $f$-states.
There are no greater values of $l$ in nonexcited atoms. Combining the
angular momenta of separate electrons according to the rule of vector
addition (30.30), we obtain the angular momentum $L$ of the atom as a
whole. The states with $L = 0, 1, 2, 3$ are termed $S$, $P$, $D$, $F$, while
states with greater $L$ are named by subsequent letters of the Latin
alphabet.

$k$ [see Eq. (29.21)], i.e., the angular-momentum projection on some
axis in units of $h$, is called the magnetic quantum number, since the
external magnetic field is usually directed along this axis.

$n_r$ is the number of wave-function zeros as related to the radius
(not counting those at $r = 0$ and $r = \infty$) and is called the radial quantum number.

Finally, the sum (31.33) is called the principal quantum number.
In accordance with (31.32), the binding energy of an electron in a
hydrogen atom is

$$\frac{me^4}{2h^2 n^2} = \frac{13.6}{n^2}\ \text{ev}. \tag{31.34}$$


An analogous expression is obtained also for the positive helium
ion. Apart from the difference of $Z^2 = 4$ times, there is a more subtle
difference due to the fact that the reduced mass of the helium atom 
differs somewhat from the reduced mass of a hydrogen atom as a result 
of a difference in the nuclear masses. 
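
As a rough numerical illustration, the sketch below evaluates (31.34) in CGS units and compares hydrogen with the positive helium ion, including the reduced-mass correction mentioned above. The constants are rounded present-day values supplied here as assumptions (they are not taken from the text), the function name is hypothetical, and `H_REDUCED` is the constant written $h$ in the text.

```python
# Rounded CGS constants (assumed values, not from the text).
M_ELECTRON = 9.1093837e-28   # electron mass, g
E_CHARGE = 4.80320471e-10    # elementary charge, esu
H_REDUCED = 1.054571817e-27  # the constant written h in the text, erg*s
ERG_PER_EV = 1.602176634e-12
M_PROTON = 1.67262192e-24    # g
M_ALPHA = 6.6446573e-24      # helium-4 nucleus, g


def binding_energy_ev(Z, n, nuclear_mass=None):
    """Binding energy Z^2 m e^4 / (2 h^2 n^2) in electron-volts.

    If nuclear_mass is given, the electron mass is replaced by the
    reduced mass m*M/(m + M).
    """
    mass = M_ELECTRON
    if nuclear_mass is not None:
        mass = M_ELECTRON * nuclear_mass / (M_ELECTRON + nuclear_mass)
    energy_erg = Z**2 * mass * E_CHARGE**4 / (2.0 * H_REDUCED**2 * n**2)
    return energy_erg / ERG_PER_EV


if __name__ == "__main__":
    hydrogen = binding_energy_ev(1, 1, M_PROTON)
    helium_ion = binding_energy_ev(2, 1, M_ALPHA)
    print(hydrogen, helium_ion, helium_ion / hydrogen)
```

The ratio of the two ground-state energies comes out slightly greater than 4, which is the reduced-mass effect.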

The state with $n = 1$ is the ground state. The atom cannot emit
light in this state because it is impossible to make a transition to a 
lower state. For more detail about radiation, see Sec. 34. 

The parity of a state. The state of an electron in an atom is character¬ 
ized by one more property, which (as opposed to energy and angular 
momentum) does not correspond to any classical analogue. This is 
the parity of a wave function with respect to coordinates. 

To begin with let us consider the wave function of a separate electron.
The wave equation (31.1) does not change its form if we substitute

$$x = -x', \quad y = -y', \quad z = -z'. \tag{31.35}$$

This transformation is termed inversion: it transforms a right-handed
coordinate system to a left-handed one. No rotation in space can make
these systems coincide (like left-hand and right-hand gloves) (see
Sec. 16).

The wave equation (31.1) is linear. Therefore, if it has not changed
its form, then its solution (determined by the boundary conditions
within the accuracy of the constant factor) can acquire only a certain
additional factor:

$$\psi(x, y, z) = C\,\psi(x', y', z'). \tag{31.36}$$

But, in principle, the primed left-handed system differs in no way
from the unprimed, right-handed system. For this reason, the
transformation of inversion must involve the same transformation factor $C$:

$$\psi(x', y', z') = C\,\psi(x, y, z). \tag{31.37}$$

Substituting this in (31.36), we obtain

$$\psi(x, y, z) = C^2\,\psi(x, y, z),$$

whence

$$C^2 = 1, \quad C = \pm 1. \tag{31.38}$$

The function is termed even for $C = 1$ and odd for $C = -1$. The
eigenfunctions of a linear harmonic oscillator possessed an analogous
property; here the energy operator was also even, $\hat{H}(x) = \hat{H}(-x)$,
while the wave functions alternated depending upon the eigenvalue
number $n$ (i.e., they were either even or odd).

Parity and orbital angular momentum. Let us now find out what
it is that determines the parity of a wave function in a central field.
To do this, it is convenient to utilize its form near the coordinate
origin:

$$\psi \sim r^{l}. \tag{31.39}$$

In order to find the angular dependence of the wave function as
well, it is sufficient to investigate it to the approximation that yields
equation (31.39), since the terms with $U$ and the energy, which do not
depend upon the angles, are thereby discarded. The angular dependence of the
solutions of the precise and shortened equation is the same. This
shortened equation is, obviously, simply the Laplace equation

$$\Delta\psi = 0. \tag{31.40}$$

Equation (31.40) is satisfied by a homogeneous polynomial in $x$, $y$, $z$,

$$\psi = a\,x^{l} + \ldots + b\,y^{l} + \ldots, \tag{31.41}$$

of degree $l$ for certain relationships between its coefficients $a$, ...,
$b$, ... . It is clear that the degree $l$ of this polynomial is equal to the
degree $l$ in equation (31.39). But the degree $l$ of (31.41) defines the even
or odd nature of $\psi$ with respect to the inversion (31.35). It follows that
wave functions with even orbital angular momenta $l$ are even, and
those with odd orbital angular momenta $l$ are odd. In a multi-electron
atom, the total parity of the wave function is determined by the parity
of all the wave functions for the separate electrons (this by no means
signifies that the wave function of an atom is equal to the product of
the wave functions of the separate electrons!). Therefore, the parity
of the total wave function is equal to the parity of the number $\sum_i l_i$,
where $l_i$ are the orbital quantum numbers of the electrons. As we know,
the total angular momentum of the atom is equal to the vector sum
of the angular momenta of its electrons.
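
The connection between the degree $l$ and parity can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the four polynomials are simple examples of homogeneous solutions of the Laplace equation chosen here (they are not taken from the text), the sample point and step are arbitrary, and the function names are hypothetical. It checks that each polynomial satisfies (31.40) by finite differences and that inversion multiplies it by $(-1)^l$.

```python
def laplacian(f, point, step=1e-3):
    """Central-difference estimate of the Laplacian of f at a point."""
    x, y, z = point
    total = (f(x + step, y, z) + f(x - step, y, z)
             + f(x, y + step, z) + f(x, y - step, z)
             + f(x, y, z + step) + f(x, y, z - step)
             - 6.0 * f(x, y, z))
    return total / step**2


def inversion_factor(f, point):
    """The factor C = psi(-x, -y, -z) / psi(x, y, z) of (31.36)."""
    x, y, z = point
    return f(-x, -y, -z) / f(x, y, z)


# Example homogeneous harmonic polynomials of degree l = 0, 1, 2, 3.
HARMONIC = {
    0: lambda x, y, z: 1.0,
    1: lambda x, y, z: z,
    2: lambda x, y, z: x * x - y * y,
    3: lambda x, y, z: x**3 - 3.0 * x * y * y,
}

POINT = (0.3, 0.5, 0.7)  # arbitrary point where none of the examples vanish

if __name__ == "__main__":
    for l, psi in HARMONIC.items():
        print(l, laplacian(psi, POINT), inversion_factor(psi, POINT))
```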

Parity as an integral of motion. We shall explain the significance of
the parity of a wave function. To begin with, let us point out that the
inversion (31.35) can be represented by an operator $\hat{C}$ such that

$$\hat{C}\,\psi(x, y, z) = \psi(-x, -y, -z). \tag{31.42}$$

Since the Hamiltonian operator in an atom is an even function of
coordinates, we can write

$$\hat{C}\hat{H}\psi = \hat{H}\hat{C}\psi. \tag{31.43}$$

Whence it follows that the parity operator is commutative with the
Hamiltonian operator:

$$\hat{C}\hat{H} - \hat{H}\hat{C} = 0. \tag{31.44}$$

The eigenvalues of the operator $\hat{C}$ are the numbers $C = \pm 1$ (31.38),
because

$$\hat{C}\psi = \psi(-x, -y, -z) = C\psi. \tag{31.45}$$




According to (31.44) and (29.28) these numbers exist simultaneously 
with the energy eigenvalues. 

We shall now consider what limitations can be imposed, by the 
law of conservation of parity, on possible transitions in the atom. 
Suppose we have an excited multi-electron atom with total angular
momentum $L = 0$, i.e., in the $S$-state. Then let there be in this atom
$s$-electrons and an odd number of $p$-electrons. Consequently, the atom
is in an odd state. Let the excitation energy be sufficient for the atom
to emit one of the $p$-electrons, so that after the rearrangement of the
electron cloud the atom remains in the $S$-state with $L = 0$. Since angular
momenta are combined vectorially, such a state may result both
for an odd and an even number of $p$-electrons. According to the law
of conservation of total angular momentum, an electron may be emitted 
only with an angular momentum equal to zero because, according to 
assumption, the angular momentum of the rest of the system is equal 
to zero before and after the transition. It follows that the electron can 
be emitted only in an even s-state. 

After the emission of the electron, an even number of p-electrons 
remains in the ion, and the emitted electron is also found to be in an 
even state. But this is impossible since the initial state was odd and 
the final state was even, the total energy being constant. Hence, the 
laws of conservation of parity and angular momentum may exclude 
transitions which are permissible energetically. We have considered 
a typical case of a transition which is “forbidden” by parity selection 
rules ($L = 0$ into $L = 0$ with changed parity).
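
The parity bookkeeping of this example can be written out as a short sketch. The function names and the particular configuration (two $s$-electrons and three $p$-electrons) are hypothetical choices made for illustration; the code checks only the parity rule, the requirement that the electron leave in an $s$-state being taken from the angular-momentum argument above.

```python
def configuration_parity(orbital_numbers):
    """Parity +1 or -1 of a configuration: the parity of the sum of l_i."""
    return 1 if sum(orbital_numbers) % 2 == 0 else -1


def parity_allows(initial_ls, final_ls):
    """True if the initial and final configurations have the same parity."""
    return configuration_parity(initial_ls) == configuration_parity(final_ls)


if __name__ == "__main__":
    # Initial atom: two s-electrons (l = 0) and three p-electrons (l = 1).
    initial = [0, 0, 1, 1, 1]
    # Final state: the ion keeps two p-electrons, the emitted electron has l = 0.
    final_s_emission = [0, 0, 1, 1] + [0]
    print(configuration_parity(initial))             # odd initial state
    print(parity_allows(initial, final_s_emission))  # transition forbidden
```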

The law of conservation of parity by no means follows from the 
law of conservation of angular momentum, since parity depends upon 
the arithmetic sum of $l$ while total angular momentum depends upon
the vector sum. 

The law of conservation of angular momentum in quantum mechan¬ 
ics must always be used together with the law of conservation of 
parity. In origin, these laws have a common basis: they both follow 
from the invariance of equations with respect to the orientation of 
coordinate axes in space. But the possible orientations are not all exhausted
by axis rotations alone: an additional transformation is inversion, which
is not reduced to any rotation. It is this that yields the parity conser¬ 
vation law in addition to the law of conservation of angular momentum. 

In this form the parity conservation law can be unconditionally 
applied to those systems in which electromagnetic interactions occur. 

The considerably weaker interactions which occur in certain ele¬ 
mentary-particle conversions probably satisfy a modified parity con¬ 
servation law (see Sec. 38). 

Hydrogen-like atoms. Alkali-metal atoms somewhat resemble the 
hydrogen atom. The outer electron in these atoms is relatively weakly 
bound to the atomic residue, which consists of the nucleus and all the 
remaining electrons. The wave functions for electrons of the atomic
residue differ from zero at smaller distances from the nucleus than the 
wave function for the outer electron, so that the residue screens, as it 
were, the nuclear charge. The field in which the outer electron moves 
is approximately Coulomb, provided only that it is not situated in 
the region of the residue. It is for this reason that the spectra of alkali- 
metal atoms resemble the hydrogen-atom spectrum. The energy levels 
of these atoms, which are due to excitation of the outer electron, are 
given by the equation 

$$E_{nl} = -\frac{me^4}{2h^2\,[n + \Delta(l)]^2}, \tag{31.46}$$

where the correction $\Delta(l)$ depends upon the azimuthal quantum
number. It accounts for the deviation of the field from a purely
Coulomb one at small distances from the nucleus.

Thus, the energy levels in alkali metals, like the energy levels of
all atoms, depend upon $n$ and $l$. An exception is the hydrogen atom,
where the energy depends only upon $n$; this is a special property of a
purely Coulomb field. For example, when $n = 2$, the azimuthal quantum
number can take on two values: $l = 0$ and $l = 1$, while the
corresponding energy levels of the hydrogen atom are close to each other
(the splitting of these levels is due to relativistic corrections to the wave
equation).
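
The counting of states that share one hydrogen level can be sketched as follows. This is an illustration only, with a hypothetical function name; it lists the triples $(n_r, l, k)$ compatible with (31.33) for a given principal quantum number, spin not being considered.

```python
def states(n):
    """All triples (n_r, l, k) with n = n_r + l + 1 and -l <= k <= l."""
    result = []
    for l in range(n):            # l runs from 0 to n - 1
        n_r = n - l - 1           # radial quantum number from (31.33)
        for k in range(-l, l + 1):
            result.append((n_r, l, k))
    return result


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, len(states(n)), states(n))
```

For each $n$ the number of such states is $n^2$; in a purely Coulomb field they all belong to the same energy (31.34).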


Exercise 


Construct and normalize the wave functions in a hydrogen atom with
$l = 0, 1, 2$ and $n = 1, 2, 3$. Take advantage of the fact that
$\int_0^{\infty} e^{-x} x^{n}\,dx = n!$


Sec. 32. Electron Spin 

The insufficiency of three quantum numbers for the electron in an
atom. From equation (31.34) the ground state of a hydrogen atom has
a principal quantum number $n$ equal to unity. For $n = 1$ the azimuthal
quantum number $l$ and the radial quantum number $n_r$ must be equal
to zero, since $n = n_r + l + 1$, and $n_r$ and $l$ can in no way be less than
zero. The ground state of a hydrogen atom is the $s$-state. The orbital
motion of an $s$-electron does not produce a magnetic moment because
the magnetic moment is proportional to the mechanical moment.
Yet, if the Stern-Gerlach experiment is performed for atomic hydrogen,
the atomic beam will split, but only into two parts. However,
when $l = 0$, as we have already said, there should be no splitting due
to orbital angular momentum, while for $l = 1$, the beam should split
into $2l + 1 = 3$ beams corresponding to the number of projections $k$
of the angular momentum ($k = -1, 0, 1$).

The same results if, instead of hydrogen, we take an alkali metal.
The electron cloud of any alkali metal consists of an atomic residue in
an $S$-state, i.e., one lacking an orbital angular momentum, and one
electron in the $s$-state. In this sense, alkali-metal atoms resemble the
hydrogen atom. For this reason, the state of the atom is not described
by the three quantum numbers $n$, $l$, and $k$.

Intrinsic angular momentum or electron spin. Splitting into two
beams can be accounted for only by an angular momentum whose
greatest projection is equal to $h/2$. Then it has only two projections,
$h/2$ and $-h/2$.

The Stern-Gerlach experiment was given only as an example. In
fact, not only this experiment, but the whole enormous aggregate of
knowledge about the atom indicates that the electron possesses a
mechanical moment $h/2$ that is not related to its orbital motion. This
mechanical moment is termed the spin. It can be said that an electron
is somewhat reminiscent of a planet which has an angular momentum
due not only to its revolution about the sun, but also to rotation on its
own axis.

The analogy with a planet is not far-reaching since the angular
momentum of a rotating rigid body can be made equal to any value, while
the spin of an electron always has projections $\pm h/2$ and no others.
Therefore, spin is a purely quantum property of the electron; in the 
limiting transition to classical mechanics it becomes zero. We must not 
take the word “spin” too literally, for the electron actually does not 
resemble a rigid body like a top or a spindle. 

Spin degree of freedom. The analogy between an electron and a top
consists in the fact that their motion is not described by their position 
in space alone, but possesses an internal rotational degree of freedom. 

There is a certain analogy between the electron and the light 
quantum. As was shown in Sec. 28, in addition to its wave vector, 
the state of a quantum is described by a polarization variable which 
takes on two values. Similarly, the electron has, in addition to its
spatial coordinates, a spin variable $\sigma$ which assumes two values (since
spin has only two projections).

Spin operators. When we write $\psi(x)$ we have in mind the whole group
of values of the wave function for all $x$, i.e., $\psi$ at all points of space.
The action of an operator on $\psi(x)$ denotes a linear transformation
of $\psi$ in the whole space, since, in accordance with the superposition
principle, all operators in quantum mechanics are linear. Taking into
account the spin variable, one has to write $\psi(x, \sigma)$, where $\sigma$ takes on
only two values. The action of the spin operator on $\psi(x, \sigma)$ denotes
the replacement of $\psi(x, 1)$ by some linear combination of $\psi(x, 1)$
and $\psi(x, 2)$; the action of the operator on $\psi(x, 2)$ is determined
analogously. Linear operators depending upon $\sigma$ can denote nothing
other than a linear substitution as applied to a function of “two
points” $\sigma = 1$ and $\sigma = 2$.

We shall try to determine, in explicit form, how spin angular-momentum
projection operators should act upon functions of the
spin variable $\sigma$. The following requirements are to be imposed
on them.

1) The eigenvalues of all three spin projections must be equal to
$\pm h/2$.

2) The same commutation rules (29.33)-(29.36) must exist for
them as for the components of orbital angular momentum, otherwise
the sum of the orbital and spin angular-momentum operators will not 
possess the property of angular momentum. 

3) For the same reason we must require that the spin-projection 
operators should be Hermitian. 

4) In coordinate-system rotations, spin-projection operators must 
behave in the same way as vector components so that the commuta¬ 
tion rules for these operators, in a rotated system, should not differ 
from the rules of the original system in which the operators were de¬ 
fined. 

Corresponding to these requirements, W. Pauli found the required 
operators, which we shall now form. 

We shall write the group of functions $\psi(x, \sigma)$ in columns instead of
rows; the meaning of $\psi(x, \sigma)$ does not thereby change, of course.
In addition, for brevity, we shall omit the coordinate dependence
contained in the argument $x$. Thus, $\psi(\sigma)$ denotes the column

$$\psi(\sigma) = \begin{pmatrix} \psi(1) \\ \psi(2) \end{pmatrix}. \tag{32.1}$$

Here, each component satisfies Schrödinger's coordinate equation
(24.22). In the most general case the action of a linear operator on the
function (32.1) reduces it to the form

$$\begin{pmatrix} \alpha\psi(1) + \beta\psi(2) \\ \gamma\psi(1) + \delta\psi(2) \end{pmatrix}.$$

As we know, one of the angular-momentum projections can always
exist together with the square of the total angular momentum, since
the commutation rules for spin components are the same as for orbital
angular momentum (condition 2). To be specific, we shall consider
that there exists the $z$th projection $\sigma_z$. If the operator $\hat{\sigma}_z$ has an
eigenvalue in the given state, its application leads to the multiplication of
the function (32.1) by some number without mixing of the components.
This number is equal to $\pm h/2$, depending upon what the sign of
the spin projection $\sigma_z$ is in the given state. Let the function $\psi(1)$
be multiplied by $+h/2$ and the function $\psi(2)$ by $-h/2$, so that

$$\hat{\sigma}_z \begin{pmatrix} \psi(1) \\ \psi(2) \end{pmatrix} = \frac{h}{2} \begin{pmatrix} \psi(1) \\ -\psi(2) \end{pmatrix}. \tag{32.2}$$



The equality sign between columns denotes a line by line equality
of the expressions, i.e., $\hat{\sigma}_z\psi(1) = \frac{h}{2}\psi(1)$, $\hat{\sigma}_z\psi(2) = -\frac{h}{2}\psi(2)$. The
form of the functions corresponding to various spin projections can
immediately be seen: to the projection $h/2$ there corresponds a function
$\begin{pmatrix} \psi(1) \\ 0 \end{pmatrix}$, while $\begin{pmatrix} 0 \\ \psi(2) \end{pmatrix}$ corresponds to the projection $-h/2$.

The first of them, if we substitute it in (32.2), is entirely multiplied
by $h/2$, while the second is multiplied by $-h/2$, since the change of
sign for the zero component of the function does not signify anything.

The operators $\hat{\sigma}_x$ and $\hat{\sigma}_y$ cannot have eigenvalues in these states.
It follows that they must in any case also interchange the components
of the wave function that we have defined, and not merely multiply them
by numbers. Simple multiplication operators would be commutative
with $\hat{\sigma}_z$. Once the form of $\hat{\sigma}_z$ is given, we can also determine the operators
for the other two components.

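Let us temporarily go over to atomic units (see Sec. 31), i.e., we
put $h = 1$. We shall look for the operator $\hat{\sigma}_x$ (acting on the two functions)
in the most general possible form*:

$$\hat{\sigma}_x \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} = \begin{pmatrix} \alpha\psi_1 + \beta\psi_2 \\ \gamma\psi_1 + \delta\psi_2 \end{pmatrix}. \tag{32.3}$$

In other words, we suppose that it replaces $\psi_1$ by $\alpha\psi_1 + \beta\psi_2$ and
$\psi_2$ by $\gamma\psi_1 + \delta\psi_2$. We act on $\hat{\sigma}_x\psi$ with $\hat{\sigma}_z$. Then, by definition of $\hat{\sigma}_z$
(32.2), we must change the sign in the lower row and divide both
rows by two:

$$\hat{\sigma}_z\hat{\sigma}_x\psi = \frac{1}{2} \begin{pmatrix} \alpha\psi_1 + \beta\psi_2 \\ -\gamma\psi_1 - \delta\psi_2 \end{pmatrix}.$$

If we act upon $\hat{\sigma}_z\psi$ with $\hat{\sigma}_x$, then we must first of all put a minus sign
in front of $\psi_2$ and divide both components by two, and then substitute
them in (32.3). This will yield

$$\hat{\sigma}_x\hat{\sigma}_z\psi = \frac{1}{2} \begin{pmatrix} \alpha\psi_1 - \beta\psi_2 \\ \gamma\psi_1 - \delta\psi_2 \end{pmatrix}.$$

From (29.36), the difference $\hat{\sigma}_z\hat{\sigma}_x - \hat{\sigma}_x\hat{\sigma}_z$ must be equal to $i\hat{\sigma}_y$ in
atomic units:

$$(\hat{\sigma}_z\hat{\sigma}_x - \hat{\sigma}_x\hat{\sigma}_z)\psi = \begin{pmatrix} \beta\psi_2 \\ -\gamma\psi_1 \end{pmatrix} = i\hat{\sigma}_y\psi. \tag{32.4}$$

Thus, the operator $\hat{\sigma}_y$ interchanges the functions $\psi_1$ and $\psi_2$ and
multiplies them by $\beta$ and $-\gamma$ (apart from the common factor $1/i$), where
$\beta$ and $\gamma$ appeared in the definition (32.3) for $\hat{\sigma}_x$.


* The argument $\sigma = 1$ and $\sigma = 2$ will in future be replaced by the index.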




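But if we proceed from $\hat{\sigma}_y$, defining it analogously to (32.3), it will
turn out that $\hat{\sigma}_x$ also interchanges functions, i.e., it does not contain
the coefficients $\alpha$ and $\delta$. Therefore we obtain

$$\hat{\sigma}_x\psi = \begin{pmatrix} \beta\psi_2 \\ \gamma\psi_1 \end{pmatrix}, \quad \hat{\sigma}_y\psi = -i \begin{pmatrix} \beta\psi_2 \\ -\gamma\psi_1 \end{pmatrix}.$$

We now form the difference $\hat{\sigma}_x\hat{\sigma}_y - \hat{\sigma}_y\hat{\sigma}_x$. First acting with $\hat{\sigma}_x$ upon
$\hat{\sigma}_y\psi$ and then with $\hat{\sigma}_y$ upon $\hat{\sigma}_x\psi$, we equate their difference to $i\hat{\sigma}_z\psi$ [see
(29.33)]:

$$(\hat{\sigma}_x\hat{\sigma}_y - \hat{\sigma}_y\hat{\sigma}_x)\psi = \begin{pmatrix} 2i\beta\gamma\,\psi_1 \\ -2i\beta\gamma\,\psi_2 \end{pmatrix} = \frac{i}{2} \begin{pmatrix} \psi_1 \\ -\psi_2 \end{pmatrix},$$

whence it follows that

$$\beta\gamma = \frac{1}{4}. \tag{32.5}$$

Hence, conditions (29.33)-(29.36) lead to the following form for
$\hat{\sigma}_x$ and $\hat{\sigma}_y$:

$$\hat{\sigma}_x\psi = \begin{pmatrix} \beta\psi_2 \\ \dfrac{1}{4\beta}\psi_1 \end{pmatrix}, \quad \hat{\sigma}_y\psi = -i \begin{pmatrix} \beta\psi_2 \\ -\dfrac{1}{4\beta}\psi_1 \end{pmatrix}. \tag{32.6}$$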


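The operators $\hat{\sigma}_x$, $\hat{\sigma}_y$ and $\hat{\sigma}_z$ must be Hermitian. We once again
deduce the Hermitian condition (30.3) for the operator $\hat{\sigma}_x$, insofar as
the result of Sec. 30 related to a continuous variable $x$, while here
we consider a discrete variable $\sigma$. Let us write similarly to (30.1a)
and (30.1b), with spin dependence in explicit form:

$$\beta\psi_2 = \sigma_x\psi_1, \quad \frac{1}{4\beta}\psi_1 = \sigma_x\psi_2; \qquad \beta^*\psi_2^* = \sigma_x\psi_1^*, \quad \frac{1}{4\beta^*}\psi_1^* = \sigma_x\psi_2^*.$$

Summation with respect to $\sigma$ now corresponds to integration with
respect to $x$. We multiply the first two equations by $\psi_1^*$ and $\psi_2^*$,
respectively, and the second two by $\psi_1$ and $\psi_2$, sum with respect to $\sigma$, and
equate the results utilizing the fact that $\sigma_x$ is a real number:

$$\psi_1^*\,\beta\,\psi_2 + \psi_2^*\,\frac{1}{4\beta}\,\psi_1 = \psi_1\,\beta^*\,\psi_2^* + \psi_2\,\frac{1}{4\beta^*}\,\psi_1^*.$$

As was shown in Sec. 30, this condition must be identically satisfied
for any two functions $\chi^*$ and $\psi$. From this we obtain

$$\chi_1^*\,\beta\,\psi_2 + \chi_2^*\,\frac{1}{4\beta}\,\psi_1 = \psi_1\,\beta^*\,\chi_2^* + \psi_2\,\frac{1}{4\beta^*}\,\chi_1^*. \tag{32.7}$$

But this equation can hold only if

$$\beta = \frac{1}{4\beta^*},$$

i.e., $\beta = \frac{1}{2}e^{i\varphi}$. There remains an arbitrary phase factor $e^{i\varphi}$ which
we choose equal to unity. Thus,

$$\hat{\sigma}_x \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} \psi_2 \\ \psi_1 \end{pmatrix}, \tag{32.8}$$

$$\hat{\sigma}_y \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} = \frac{i}{2} \begin{pmatrix} -\psi_2 \\ \psi_1 \end{pmatrix}. \tag{32.9}$$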

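We note three operator relations

$$\hat{\sigma}_x\hat{\sigma}_y = -\hat{\sigma}_y\hat{\sigma}_x = \frac{i}{2}\hat{\sigma}_z, \quad \hat{\sigma}_y\hat{\sigma}_z = -\hat{\sigma}_z\hat{\sigma}_y = \frac{i}{2}\hat{\sigma}_x, \quad \hat{\sigma}_z\hat{\sigma}_x = -\hat{\sigma}_x\hat{\sigma}_z = \frac{i}{2}\hat{\sigma}_y, \tag{32.10}$$

which are directly verified by substitution, and also expressions for
the squares $\hat{\sigma}_x^2$, $\hat{\sigma}_y^2$, and $\hat{\sigma}_z^2$, which are obtained when they act twice
upon $\psi$:

$$\hat{\sigma}_x^2 = \hat{\sigma}_y^2 = \hat{\sigma}_z^2 = \frac{1}{4}. \tag{32.11}$$

Hence, in accordance with condition (1), the eigenvalues of $\hat{\sigma}_x^2$,
$\hat{\sigma}_y^2$, and $\hat{\sigma}_z^2$ are $1/4$. Since each operator is commutative with its square,
we see that the eigenvalues of $\hat{\sigma}_x$, $\hat{\sigma}_y$, and $\hat{\sigma}_z$ should equal the square
roots of the eigenvalues of their squares, i.e., $\pm 1/2$. Naturally,
these eigenvalues of $\hat{\sigma}_x$, $\hat{\sigma}_y$, and $\hat{\sigma}_z$ only exist separately and by no
means simultaneously.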
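
The relations (32.10) and (32.11) can be verified by direct substitution, as the text says. The sketch below does this numerically in atomic units ($h = 1$), writing each operator as the 2×2 matrix that produces the columns (32.2), (32.8) and (32.9). The matrix helpers are written out by hand and their names are hypothetical; nothing here goes beyond a check of the formulas.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def combine(a, b, ca=1, cb=1):
    """Linear combination ca*a + cb*b of two 2x2 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(2)] for i in range(2)]


def close(a, b, tol=1e-12):
    """Element-wise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


def dagger(a):
    """Hermitian conjugate (transpose and complex conjugate)."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def commutator(a, b):
    return combine(matmul(a, b), matmul(b, a), 1, -1)


def anticommutator(a, b):
    return combine(matmul(a, b), matmul(b, a), 1, 1)


def scaled(a, factor):
    return combine(a, a, factor, 0)


# Spin operators in units of h, acting on the column (psi_1, psi_2).
SX = [[0, 0.5], [0.5, 0]]
SY = [[0, -0.5j], [0.5j, 0]]
SZ = [[0.5, 0], [0, -0.5]]
QUARTER = [[0.25, 0], [0, 0.25]]
ZERO = [[0, 0], [0, 0]]

if __name__ == "__main__":
    print(close(commutator(SX, SY), scaled(SZ, 1j)))   # commutation rule
    print(close(matmul(SX, SY), scaled(SZ, 0.5j)))     # first relation of (32.10)
    print(close(matmul(SY, SY), QUARTER))              # (32.11)
    print(close(dagger(SY), SY))                       # Hermitian
```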

The vector properties of spin operators. In order to prove finally
that the operators $\hat{\sigma}_x$, $\hat{\sigma}_y$, and $\hat{\sigma}_z$ possess the properties of
angular-momentum components, we must be convinced that in coordinate
rotations they transform like vector projections, i.e., we must verify
that condition (4) is satisfied.

Let us suppose that a rotation occurs around the $z$-axis through
an angle $\omega$. Then we must prove that the operators

$$\hat{\sigma}_x' = \hat{\sigma}_x\cos\omega + \hat{\sigma}_y\sin\omega, \quad \hat{\sigma}_y' = -\hat{\sigma}_x\sin\omega + \hat{\sigma}_y\cos\omega, \tag{32.12}$$

formed by analogy with the vector projections on rotated coordinate
axes possess the same properties (1) and (2) as the original operators
$\hat{\sigma}_x$ and $\hat{\sigma}_y$. First of all we have

$$\hat{\sigma}_x'^{\,2} = (\hat{\sigma}_x\cos\omega + \hat{\sigma}_y\sin\omega)(\hat{\sigma}_x\cos\omega + \hat{\sigma}_y\sin\omega) = \hat{\sigma}_x^2\cos^2\omega + \hat{\sigma}_y^2\sin^2\omega + (\hat{\sigma}_x\hat{\sigma}_y + \hat{\sigma}_y\hat{\sigma}_x)\sin\omega\cos\omega.$$



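It can be seen from (32.10) that

$$\hat{\sigma}_x\hat{\sigma}_y + \hat{\sigma}_y\hat{\sigma}_x = 0 \tag{32.13}$$

(and analogously for any pair of components). Further, $\hat{\sigma}_x^2$ and $\hat{\sigma}_y^2$,
operating on $\psi$-functions, act like numbers, i.e., they simply multiply
them by $1/4$. But then this also means that
$\hat{\sigma}_x'^{\,2} = \frac{1}{4}(\cos^2\omega + \sin^2\omega) = \frac{1}{4}$.
Thus, the first property of $\hat{\sigma}_x$ and $\hat{\sigma}_y$ is retained under the rotation
of the components.

We now form the difference $\hat{\sigma}_x'\hat{\sigma}_y' - \hat{\sigma}_y'\hat{\sigma}_x'$:

$$\hat{\sigma}_x'\hat{\sigma}_y' - \hat{\sigma}_y'\hat{\sigma}_x' = -\hat{\sigma}_x^2\cos\omega\sin\omega + \hat{\sigma}_y^2\sin\omega\cos\omega + \hat{\sigma}_x\hat{\sigma}_y\cos^2\omega - \hat{\sigma}_y\hat{\sigma}_x\sin^2\omega + \hat{\sigma}_x^2\cos\omega\sin\omega - \hat{\sigma}_y^2\sin\omega\cos\omega + \hat{\sigma}_x\hat{\sigma}_y\sin^2\omega - \hat{\sigma}_y\hat{\sigma}_x\cos^2\omega = \hat{\sigma}_x\hat{\sigma}_y - \hat{\sigma}_y\hat{\sigma}_x = i\hat{\sigma}_z. \tag{32.14}$$

But $\hat{\sigma}_z' = \hat{\sigma}_z$ since the rotation occurs about the $z$-axis.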

Any rotation in space may be obtained by successive rotations 
about three axes. Therefore, it was sufficient to show that the basic 
properties of the operators are preserved under rotations about any 
one of the three axes. 
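
The invariance under rotation about the $z$-axis can also be checked numerically. The sketch below forms the rotated operators of (32.12) for a few arbitrarily chosen angles and confirms that their squares and their commutator are unchanged. It works in atomic units, uses the 2×2 matrices corresponding to (32.2), (32.8) and (32.9), and all helper names are hypothetical.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def combine(a, b, ca=1, cb=1):
    """Linear combination ca*a + cb*b of two 2x2 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(2)] for i in range(2)]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


SX = [[0, 0.5], [0.5, 0]]
SY = [[0, -0.5j], [0.5j, 0]]
SZ = [[0.5, 0], [0, -0.5]]
QUARTER = [[0.25, 0], [0, 0.25]]
I_TIMES_SZ = [[0.5j, 0], [0, -0.5j]]


def rotated_pair(omega):
    """Operators sigma_x', sigma_y' of (32.12) for a rotation angle omega."""
    c, s = math.cos(omega), math.sin(omega)
    return combine(SX, SY, c, s), combine(SX, SY, -s, c)


def properties_preserved(omega):
    """Check the squares and the commutator of the rotated operators."""
    sx_rot, sy_rot = rotated_pair(omega)
    commutator = combine(matmul(sx_rot, sy_rot), matmul(sy_rot, sx_rot), 1, -1)
    return (close(matmul(sx_rot, sx_rot), QUARTER)
            and close(matmul(sy_rot, sy_rot), QUARTER)
            and close(commutator, I_TIMES_SZ))


if __name__ == "__main__":
    for omega in (0.0, 0.4, math.pi / 3, 2.5):
        print(omega, properties_preserved(omega))
```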

The total angular-momentum operator. If we now form the sum
of the operators

$$\hat{j}_x = \hat{M}_x + \hat{\sigma}_x; \quad \hat{j}_y = \hat{M}_y + \hat{\sigma}_y; \quad \hat{j}_z = \hat{M}_z + \hat{\sigma}_z, \tag{32.15}$$

then it will possess all the properties of an angular-momentum operator.
Naturally, we could not have added the components of $\sigma$ to the
components of $M$ if they both did not transform identically under
rotations of the coordinate system, since, otherwise, equations (32.15)
would be noninvariant with respect to the choice of the system of axes.

The vector $j$ is called the total angular momentum of an electron.
If the orbital angular momentum of the electron has a greatest
projection $l$, then the greatest projection of $j$ can equal $l + \frac{1}{2}$ or
$l - \frac{1}{2}$. In the first case, we say that the spin and the orbital angular
momentum are parallel; in the second case, we say that they are
antiparallel.
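
The possible values of $j$ and of its projections can be listed with exact fractions. The sketch below is an illustration with hypothetical function names; it assumes, as usual, that for $l = 0$ only $j = \frac{1}{2}$ occurs, and that the projections of an angular momentum with greatest projection $j$ run from $-j$ to $j$ in unit steps (in units of $h$).

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def total_momenta(l):
    """Possible greatest projections j for an electron with orbital number l."""
    if l == 0:
        return [HALF]              # only the spin contributes
    return [l - HALF, l + HALF]    # antiparallel and parallel cases


def projections(j):
    """All projections -j, -j + 1, ..., j of an angular momentum j."""
    j = Fraction(j)
    count = int(2 * j) + 1
    return [-j + step for step in range(count)]


if __name__ == "__main__":
    print(projections(HALF))       # two values: the two Stern-Gerlach beams
    print(projections(1))          # three values for l = 1
    for l in range(3):
        print(l, total_momenta(l))
```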

Spin magnetic moment. The spin of an electron, similar to its orbital
angular momentum, is associated with a definite magnetic moment.
But experiment shows that the ratio of spin magnetic moment to
mechanical moment is twice as great as for orbital angular momentum.
There is nothing paradoxical in this because the result (15.25) cannot
be applied to spin. At the same time we can deduce the spin magnetic
moment from the Dirac relativistic wave equation for an electron
(Sec. 38); in agreement with experiment, it is found to be

$$\boldsymbol{\mu}_\sigma = \frac{e}{mc}\,\boldsymbol{\sigma}. \tag{32.16}$$

Hence, the projection of $\mu_\sigma$ on any axis is

$$(\mu_\sigma)_z = \pm\frac{eh}{2mc} = \pm\mu_0. \tag{32.17}$$

The quantity $\mu_0$ is termed the Bohr magneton. This is a natural
unit of magnetic moment.
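
For orientation, the size of the Bohr magneton $eh/2mc$ can be evaluated in CGS units. The sketch below is only a numerical illustration: the constants are rounded present-day values supplied here as assumptions (they are not taken from the text), `H_REDUCED` is the constant written $h$ in the text, and the function names are hypothetical.

```python
# Rounded CGS constants (assumed values, not from the text).
E_CHARGE = 4.80320471e-10    # elementary charge, esu
H_REDUCED = 1.054571817e-27  # the constant written h in the text, erg*s
M_ELECTRON = 9.1093837e-28   # electron mass, g
C_LIGHT = 2.99792458e10      # velocity of light, cm/s


def bohr_magneton():
    """Bohr magneton e*h/(2*m*c) in erg per gauss."""
    return E_CHARGE * H_REDUCED / (2.0 * M_ELECTRON * C_LIGHT)


def gyromagnetic_ratio(kind):
    """Ratio of magnetic to mechanical moment: e/2mc (orbital), e/mc (spin)."""
    orbital = E_CHARGE / (2.0 * M_ELECTRON * C_LIGHT)
    if kind == "orbital":
        return orbital
    if kind == "spin":
        return 2.0 * orbital
    raise ValueError("kind must be 'orbital' or 'spin'")


if __name__ == "__main__":
    print(bohr_magneton())
    print(gyromagnetic_ratio("spin") / gyromagnetic_ratio("orbital"))
```

The spin projection $h/2$ multiplied by the spin ratio $e/mc$ gives the same magneton as a unit orbital projection $h$ multiplied by $e/2mc$.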

The ratio $\mu_\sigma/\sigma = e/mc$ is called the spin gyromagnetic ratio. It was
first discovered in determining the mechanical moment caused by
magnetization of iron rods (the Einstein-de Haas experiment). Spin
was not known at that time and it appeared strange that the
gyromagnetic ratio was not equal to $e/2mc$, as follows from (15.25). It
is now known that magnetism in iron is connected with the spin
of certain of its electrons.

The fine structure of atomic levels. Spin magnetic moment interacts
with the magnetic moment of orbital motion and with the spin
angular momenta of other electrons, if the atom is of the
multi-electron type. This interaction is proportional to the magnitude of
both magnetic moments, i.e., it involves the product of gyromagnetic
ratios. The latter is inversely proportional to $c^2$ and, hence, is an
essentially relativistic effect.

Electron velocities in atoms are everywhere small compared with
the velocity of light, with the exception of the internal regions of
the atoms of heavy elements. Therefore, a quantity involving $c^2$
in the denominator is usually small compared with other quantities
on the atomic scale; the interaction energy for magnetic moments
is less than the distance between energy levels that is due to
electrostatic interaction. As a result of the interaction between spin and orbital
angular momenta, the energy level of a separate electron corresponding
to a total electron angular momentum $j = l + \frac{1}{2}$* differs a little (when
placed in a central field) from a level with a total angular momentum
$j = l - \frac{1}{2}$; this is because the angular momenta are parallel in
the first case and antiparallel in the second. But the energies of
two parallel and antiparallel angular momenta differ.

The only scalar quantity which is linear with respect to each of
two pseudovectors $\mu_1$ and $\mu_2$ is $\mu_1\mu_2$. Therefore, to the lowest
approximation, the interaction energy of two magnetic moments is
proportional to $\mu_1\mu_2$.

The spacing between the levels $j = l + \frac{1}{2}$ and $j = l - \frac{1}{2}$ is small
compared with that between electron levels with different $l$. Therefore,
a magnetic interaction contributes only a small splitting of the
electron level with a given $l$ into two levels. This splitting is called
the fine structure of the level.

* The angular momentum is determined by means of its greatest projection.

We note that such a simple splitting into two levels takes place 
for a separate electron in a central field, for example, for the outer 
electron in an alkali-metal atom. 

Isotopic (isobaric) spin. The splitting of an atomic level into two
levels with $j = l + 1/2$ and $j = l - 1/2$ is due to weak magnetic
interactions between spin and orbital magnetic moments. Since each
magnetic moment contains $c$ in the denominator [see (32.16)], such
interaction is relativistic in nature and must vanish if the electrostatic
forces alone are taken into account. This means that the energies 
of two states with spins parallel and antiparallel to the magnetic 
moment coincide if magnetic forces are completely neglected. 

An analogous situation exists in the domain of nuclear interaction. 
The nuclear forces which hold nuclear particles (neutrons and protons) 
together are not of electromagnetic origin. At least we have no
indications that both types of force, nuclear and electromagnetic, can
be deduced in a unique manner from some first principle. As yet,
no experiment suggests that such derivation is at all possible. On 
the contrary, there are many facts proving that nuclear interactions 
are independent of the electrical properties of particles. 

First, we have the so-called mirror nuclei. These are pairs of nuclei
which have all the neutrons interchanged with all the protons, and
vice versa. For example, $\mathrm{H}^3$ consists of one proton and two neutrons,
and $\mathrm{He}^3$, of two protons and one neutron. All the main properties
of such nuclei are similar both qualitatively and quantitatively, 
and the small differences that still do exist can readily be explained 
by the difference in charge and magnetic moment of the neutron 
and proton. We can therefore say that the substitution, in a nucleus,
of all protons by neutrons and all neutrons by protons leaves the
nuclear interactions invariant, i.e., the nuclear interaction between
two protons is the same as that between two neutrons if we neglect
electromagnetic forces.

Second, the scattering of neutrons and protons on protons indi¬ 
cates that the elementary nuclear interactions neutron-proton and 
proton-proton are also the same. This is a stronger statement than 
the previous one, because the interaction between two unlike particles 
is also taken into account. 

Comparing this situation with that in the atom, we can say that 
there is no splitting of nuclear states if the strongest interactions 
alone are considered; the actual splitting is due to the much weaker 
electromagnetic interactions. 

Let us, therefore, neglect for a time the weakest interactions. 
We can then consider the neutron and the proton as two states of 
a single particle—the nucleon. These states do not differ in energy 
like those of an electron with two different spin projections in the
absence of a magnetic field. If such a field is switched off, both states of 
the electron fall together in energy; if all the electromagnetic interactions 
are switched off, certain states of nucleon pairs fall together, too. 

We have said that the spin can be considered as an internal degree 
of freedom of the electron. It is reasonable to say that the electric 
charge is the internal degree of freedom in the nucleon. Both degrees 
of freedom assume only two values with a dichotomic variable cor¬ 
responding to them. It will be shown that there exists a far-reaching 
formal analogy between these degrees of freedom. Let us say that 
the nucleon possesses (besides its usual, nucleon, spin) another “spin” 
variable, which defines its “charge state.” Like mechanical spin, 
this variable assumes only two values. It is called the isotopic spin 
(sometimes, and more consistently, the isobaric spin). We shall say 
that the projection on some imaginable axis of isotopic spin is equal
to $+1/2$, which corresponds to a proton, the opposite projection
corresponding to a neutron. Some years ago, the reverse convention
was used, but this is immaterial. Now let us consider three nucleon 
pairs: proton-proton, neutron-proton, and neutron-neutron. According 
to what we have already said, the first pair corresponds to a resultant 
projection 1 of the isotopic spin, the second pair to a projection 0, 
and the third, to —1. In the absence of electromagnetic forces, none 
of the three projections split in energy. 
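
The bookkeeping of isotopic-spin projections for small groups of nucleons can be sketched as follows. The convention is the one adopted in the text ($+\frac{1}{2}$ for the proton, $-\frac{1}{2}$ for the neutron); the relation used for the charge, each nucleon contributing its projection plus $\frac{1}{2}$ in units of $e$, is the standard one for nucleons and is stated here as an assumption, and the names are hypothetical.

```python
from fractions import Fraction

# Isotopic-spin projection of a single nucleon (convention of the text).
PROJECTION = {"p": Fraction(1, 2), "n": Fraction(-1, 2)}


def total_projection(nucleons):
    """Resultant isotopic-spin projection of a string such as 'pp' or 'np'."""
    return sum(PROJECTION[kind] for kind in nucleons)


def charge(nucleons):
    """Total charge in units of e: projection + 1/2 for every nucleon."""
    return sum(PROJECTION[kind] + Fraction(1, 2) for kind in nucleons)


if __name__ == "__main__":
    for pair in ("pp", "np", "nn"):
        print(pair, total_projection(pair), charge(pair))
    # Mirror nuclei H-3 (one proton, two neutrons) and He-3 (two protons, one neutron).
    print(total_projection("pnn"), total_projection("ppn"))
```

Mirror nuclei differ only in the sign of the resultant projection, which is why they are a natural test of the independence of nuclear forces from charge.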

But if these states coincide in energy they can be considered as 
having a resultant spin 1 and differing in their projections only. 
Spin angular momentum 1 can assume just three projections; and 
here we can say that the resultant isotopic spin angular momentum 1 
has three different projections on some imaginable z-axis. In the 
absence of electromagnetic forces, the physical choice of such a 
“z-axis” is unimportant. Note, for comparison, that if no magnetic
field is applied to the electron, any direction in space may be chosen
as the axis of quantization (for example, the z-axis).

Changes in projections of ordinary spin can be due simply to
rotations of the coordinate frame. If no preferred directions in space
exist, such rotations are unlimited. Now we can consider different
isotopic spin projections as due to “rotation” of some frame also.
But this rotation is purely formal in nature and has nothing in common
with the rotation of geometrical space, except their mathematical
expressions. If the isotopic spin vector $\boldsymbol{\tau}$ with components
$\tau_x$, $\tau_y$, $\tau_z$ is introduced, then its rotations are described
by exactly the same formulae as (32.12). The corresponding angle of rotation
has no more geometrical meaning than the axes which rotate.

The formulae for isotopic spin rotation are deducible from the 
dichotomic nature of that variable and from the similarity of three 
different two-nucleon states, so there is no reason to abolish the 
vivid geometrical terminology of “projections” and “rotations.” 




Let us now formulate the situation in a quantum-mechanical
fashion. For several nucleons, it is possible to define their resulting
isotopic spin operator:

$$\hat{\boldsymbol{\tau}} = \sum_i \hat{\boldsymbol{\tau}}_i. \qquad (32.18)$$

Its different components do not commute. But its square, $\hat{\tau}^2$, commutes
with one projection, say $\hat{\tau}_z$, which defines the resulting charge of the
given system. The Hamiltonian of the nuclear interactions (with
electromagnetic interactions neglected) commutes with both $\hat{\tau}^2$ and
$\hat{\tau}_z$, just as the Hamiltonian for an electron in a central field
commutes (to a nonrelativistic approximation) with $\hat{\sigma}^2$ and $\hat{\sigma}_z$. It
follows that $\tau^2$ and $\tau_z$ have definite values in nuclear states with a
given energy. In other words, nuclear states can be distinguished by their
$\tau^2$ and $\tau_z$ values. This ascribing of $\tau^2$, $\tau_z$ to nuclear levels is
approximate, like the distinguishing of atomic levels by n, l, k, the difference
being that no account is taken of the magnetic properties of spin.

In heavy nuclei, electrostatic interactions are very important 
because they increase in proportion to the square of the atomic 
number. Nuclear interactions increase linearly with the number of 
nucleons, as the mass defect of nuclei does. So in heavy nuclei both 
types of interaction are of an equal order of magnitude and the neglect 
of electrical interaction has no meaning. No definite isotopic spin 
values can be attributed even approximately to the levels of heavy 
nuclei. 

The isotopic spin variables are very important in the classifying 
of elementary particles. 

Exercises 

1) Write down the transformation of $\hat{\sigma}_x$, $\hat{\sigma}_y$, $\hat{\sigma}_z$ for an arbitrary rotation
in space and prove that the properties of the operators do not change.

The general expression for the transformation of vector components is
the following (see Sec. 9):

$$\hat{\sigma}_i' = \alpha_{ik}\,\hat{\sigma}_k; \qquad \alpha_{ik} = \cos(\angle\, x_i', x_k),$$

where the coefficients $\alpha_{ik}$ satisfy the conditions

$$\alpha_{in}\,\alpha_{kn} = \begin{cases} 0 & \text{for } i \neq k, \\ 1 & \text{for } i = k, \end{cases}$$

where, according to the summation convention, n runs from 1 to 3.

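2) Find the eigenvalues of the scalar product $\hat{\boldsymbol{\sigma}}_1\hat{\boldsymbol{\sigma}}_2$ of two electrons with
parallel and antiparallel spins.

We begin with the equation

$$(\hat{\boldsymbol{\sigma}}_1 + \hat{\boldsymbol{\sigma}}_2)^2 = \hat{\sigma}_1^2 + \hat{\sigma}_2^2 + 2\,\hat{\boldsymbol{\sigma}}_1\hat{\boldsymbol{\sigma}}_2.$$

But $\hat{\sigma}_1^2 = \hat{\sigma}_2^2 = \hat{\sigma}_x^2 + \hat{\sigma}_y^2 + \hat{\sigma}_z^2 = \frac{3}{4}$ [see (32.11)]; the maximum projection of
$\hat{\boldsymbol{\sigma}}_1 + \hat{\boldsymbol{\sigma}}_2$ is equal to zero for antiparallel spins, and is equal to unity for parallel
spins. From this, $(\hat{\boldsymbol{\sigma}}_1 + \hat{\boldsymbol{\sigma}}_2)^2 = 0$ in the first case and is equal to $1 \cdot 2 = 2$ in the
second case. This gives

$$\hat{\boldsymbol{\sigma}}_1\hat{\boldsymbol{\sigma}}_2 = -\frac{1}{2}\cdot\frac{3}{2} = -\frac{3}{4} \quad \text{(antiparallel spins)},$$

$$\hat{\boldsymbol{\sigma}}_1\hat{\boldsymbol{\sigma}}_2 = 1 - \frac{3}{4} = \frac{1}{4} \quad \text{(parallel spins)}.$$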
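
The result of exercise 2 can be checked by direct matrix multiplication. The following is only a minimal sketch: it assumes the usual representation in which the spin operators are one half of the Pauli matrices, and orders the two-electron basis as (up, up), (up, down), (down, up), (down, down). The helper names `kron`, `apply` and `eigenvalue` are illustrative and do not appear in the text.

```python
# Spin-1/2 operators in units of the quantum of action (one half of the Pauli matrices).
SX = [[0, 0.5], [0.5, 0]]
SY = [[0, -0.5j], [0.5j, 0]]
SZ = [[0.5, 0], [0, -0.5]]


def kron(a, b):
    """Kronecker product of two 2x2 matrices, giving a 4x4 matrix."""
    return [[a[i][j] * b[k][m] for j in range(2) for m in range(2)]
            for i in range(2) for k in range(2)]


def apply(matrix, vector):
    """Multiply a square matrix by a column vector."""
    n = len(vector)
    return [sum(matrix[r][c] * vector[c] for c in range(n)) for r in range(n)]


def eigenvalue(matrix, vector):
    """Return the eigenvalue if `vector` is an eigenvector of `matrix`."""
    image = apply(matrix, vector)
    lam = next(image[i] / vector[i] for i in range(len(vector)) if vector[i] != 0)
    if any(image[i] != lam * vector[i] for i in range(len(vector))):
        raise ValueError("not an eigenvector")
    return lam.real


# Scalar product sigma_1 . sigma_2 acting on the two-spin space.
terms = [kron(s, s) for s in (SX, SY, SZ)]
S1S2 = [[sum(t[r][c] for t in terms) for c in range(4)] for r in range(4)]

parallel = eigenvalue(S1S2, [1, 0, 0, 0])           # both spins up
parallel_zero = eigenvalue(S1S2, [0, 1, 1, 0])      # total spin 1, projection 0
antiparallel = eigenvalue(S1S2, [0, 1, -1, 0])      # total spin 0
print(parallel, parallel_zero, antiparallel)
```

The symmetric combination with zero projection belongs to the same resultant spin 1 as the state with both spins up, which is why it gives the same eigenvalue as the parallel case.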

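3) Write down the eigenfunctions for $\hat{\sigma}_x$, $\hat{\sigma}_y$, and $\hat{\sigma}_z$.

In the representation where $\hat{\sigma}_z$ is diagonal, and up to a normalization factor,

for $\sigma_x = \frac{1}{2}$, $\psi = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$; for $\sigma_x = -\frac{1}{2}$, $\psi = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$;

for $\sigma_y = \frac{1}{2}$, $\psi = \begin{pmatrix} 1 \\ i \end{pmatrix}$; for $\sigma_y = -\frac{1}{2}$, $\psi = \begin{pmatrix} 1 \\ -i \end{pmatrix}$;

for $\sigma_z = \frac{1}{2}$, $\psi = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$; for $\sigma_z = -\frac{1}{2}$, $\psi = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$.

Thus, the eigenfunctions of all three noncommutative operators differ.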

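4) Express the scalar product $\hat{\mathbf{j}}_1\hat{\mathbf{j}}_2$ in terms of the resultant angular
momentum j, and of $j_1$ and $j_2$.

By definition, $\hat{\mathbf{j}}^2 = \hat{\mathbf{j}}_1^2 + \hat{\mathbf{j}}_2^2 + 2\,\hat{\mathbf{j}}_1\hat{\mathbf{j}}_2$. Substituting here the squares of the
angular momenta, we obtain

$$\hat{\mathbf{j}}_1\hat{\mathbf{j}}_2 = \frac{1}{2}\,\{\, j(j+1) - j_1(j_1+1) - j_2(j_2+1) \,\}.$$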

Sec. 33. Many-Electron Systems 

The Mendeleyev periodic law. Long before the atom became an 
object of physical study its properties were investigated in chemistry. 
And chemistry discovered and studied such properties of the atom 
as were utterly alien to prequantum physics. In this category belongs, 
first of all, valency or the chemical affinity of atoms. On the basis of 
a vast quantity of experimental material accumulated in chemistry, 
Mendeleyev constructed a generalizing and systematic periodic law. 
This was a new law that allowed Mendeleyev to predict the existence 
of many elements, which were discovered later. And what is more, 
the basic chemical and many physical properties of these elements 
were correctly predicted. At the present time, too, the Mendeleyev 
law guides scientific investigation into the study of the periodic 
structure of nuclear shells.* 

* We have in mind the quantum-mechanical theory of nuclear shells,
and not the rather widespread speculative constructions which are mainly
based on the arithmetic relationships between the atomic numbers and atomic
weights of the elements.

The Pauli principle. The wave equation for a single particle is
inadequate for an explanation of Mendeleyev’s periodic law. It is
necessary to introduce a new principle concerning many-electron
systems, the so-called Pauli principle. We shall first of all formulate
it in such a way that it can be conveniently used to investigate the
electron shells of an atom, to wit: an atom cannot have more than one
electron with a given group of four quantum numbers: the principal
quantum number n, the azimuthal quantum number l, the magnetic
quantum number $k_l$, and the spin quantum number $k_\sigma$. The spin
quantum number is a measure of the spin projection onto the same
axis onto which the orbital angular momentum is projected.

The Pauli principle is substantiated by relativistic quantum mechanics
(see Sec. 38). Here we shall simply use it as a supplementary
principle of quantum mechanics.

The addition of angular momenta of two electrons with identical n
and l. We shall first of all show how the Pauli principle is applied in
adding the angular momenta of two electrons for which the principal
and azimuthal quantum numbers are the same. The accepted practice
is to say that these electrons belong to the same shell. Usually (though
not always) electrons in different shells possess quite different
energies, a fact which justifies the classification by shells.

Let us take the simplest case when n = 1. Then, in accordance with
the definition of n (31.33), l = 0. But, for l equal to zero, the magnetic
quantum number $k_l$ is also equal to zero. Hence, three quantum
numbers are the same for the electrons and, according to the Pauli
principle, the fourth number $k_\sigma$ must differ. However, $k_\sigma$ can only
have two values, +1/2 and −1/2, and each of its values can only have
one electron for given n, l, and $k_l$. Thus, an atom can have only two
electrons with n = 1. Their spins are antiparallel and therefore the
resultant spin S is equal to zero. The resultant orbital angular
momentum L is also equal to zero.

Let us now take two electrons in the p-state, i.e., with l = 1 and with
the same principal quantum numbers. Either the magnetic or the spin
quantum numbers, or both, must differ. The p-electron can be in six
states, which we list writing the magnetic quantum number first and
the spin projection second:

A: 1, 1/2;   B: 0, 1/2;   C: −1, 1/2;   D: 1, −1/2;   E: 0, −1/2;   F: −1, −1/2.

It follows that two electrons can occupy any two different states of
the six. As is known, the number of combinations of six things, two
at a time, is equal to $C_6^2 = \frac{6 \times 5}{1 \times 2} = 15$. These fifteen states differ by
the total orbital angular momentum L and the total spin S, as well as
by their projections. The latter depend upon the choice of coordinate
axes and will interest us only insofar as they characterize the relative
directions of L and S.




We shall first of all find those states which correspond to the greatest
projections of L and S, because they determine the possible
eigenvalues of L and S. In any case, of the fifteen states only those must be
taken for which the total spin and orbital angular-momentum
projections are non-negative, since it is obvious that negative projections
cannot be greatest in relative value. The states with non-negative
projections number eight out of fifteen, and from these eight we take
only those which possess the greatest projections. We rewrite all eight
states, giving the orbital projection first and the spin projection second:

AD: 2, 0;  BD: 1, 0;  CD: 0, 0;  AB: 1, 1;  AC: 0, 1;  AE: 1, 0;  BE: 0, 0;
AF: 0, 0.

The state with maximum orbital angular-momentum projection is
AD. Hence, a state exists for which the orbital angular momentum is
equal to two and the spin angular momentum is zero (here, and in
future, the angular momentum is characterized by the greatest
projection). The indicated state also yields projections 1, 0 and 0, 0.
Such states are, for example, the BD and CD states; therefore they
need not be considered, since they do not define the vector sum. The
AB state has the maximum spin angular momentum. It follows that
a state exists for which the orbital and spin angular momenta are
equal to unity. Their projections can be 1, 0; 0, 1; and 0, 0. These are
AE, AC, and BE, which, like BD and CD, no longer interest us. There
remains one more state with projections 0, 0.

Thus, only three states are possible:

L = 2, S = 0;   L = 1, S = 1;   L = 0, S = 0.

Composition of angular momenta for three electrons with identical
n and l. For three p-electrons we obtain the following seven states
with non-negative projections:

ABC: 0, 3/2;  ACE: 0, 1/2;  ABD: 2, 1/2;  ABE: 1, 1/2;  ABF: 0, 1/2;
ACD: 1, 1/2;  BCD: 0, 1/2.

The maximum spin projection is 3/2 for a zero orbital angular-momentum
projection. The maximum orbital angular-momentum projection
is 2 with a spin projection 1/2. These two states, together with
their projections, are listed in order from ABC to ABF. ACD and
BCD remain, to which there correspond L = 1 and S = 1/2. In all, we
have L = 0, S = 3/2; L = 2, S = 1/2; L = 1, S = 1/2.

Normal coupling. States with different values of the total orbital
and spin angular momenta L and S, and with the same principal
quantum numbers for the electrons, differ in energy. This difference
occurs as a result of the electrostatic, and not magnetic, interaction
between electrons. In order to explain why the resultant orbital
angular momentum affects the interaction energy, we examine two
p-electrons. The sum of their orbital angular momenta can yield two, unity
or zero. If two is obtained, then the angular dependence for the wave
functions of both electrons is the same (not only do the azimuthal
quantum numbers coincide, but also the magnetic quantum numbers,
which is why at least the spin projections must differ). We shall call the
wave functions of both electrons $\psi_{1,1}(\mathbf{r}_1)$ and $\psi_{1,1}(\mathbf{r}_2)$, where $\mathbf{r}_1$ and $\mathbf{r}_2$
are the radius vectors of both electrons and the indices refer to the
quantum numbers l and $k_l$.

The interaction energy between electrons is approximately

$$e^2 \int \frac{|\psi_{1,1}(\mathbf{r}_1)|^2\, |\psi_{1,1}(\mathbf{r}_2)|^2}{|\mathbf{r}_1 - \mathbf{r}_2|}\, dV_1\, dV_2,$$

because $e\,|\psi_{1,1}(\mathbf{r}_1)|^2$ and $e\,|\psi_{1,1}(\mathbf{r}_2)|^2$ represent the densities of the
charge distribution. The approximation consists in the fact that the
effect of the interaction on the wave functions and the so-called
“exchange” [see (33.32)] have not been taken into account.

If the resultant moment is unity, we correspondingly obtain the
other estimate:

$$e^2 \int \frac{|\psi_{1,1}(\mathbf{r}_1)|^2\, |\psi_{1,0}(\mathbf{r}_2)|^2}{|\mathbf{r}_1 - \mathbf{r}_2|}\, dV_1\, dV_2.$$

Here the magnetic quantum numbers are equal to zero or
unity.

This integral is clearly different from the previous one. Thus, there
appears to be interaction between the orbital angular momenta when
they are to form the total angular momentum; this interaction does
not involve $c^2$ in the denominator, i.e., it is electrostatic in
character.

In multi-electron atoms, Pauli’s principle imposes definite conditions
on the choice of spatial wave functions for given spins. As an example,
let us consider the state with spin 3/2, which, as was just established,
is possible in a system with three p-electrons. In accordance with the
Pauli principle, three different spatial wave functions for the separate
electrons having $k_l$ = 1, 0 and −1 correspond to this state. The
corresponding electron densities coincide in space less than, for example,
in a state with magnetic quantum numbers 1, 1, 0, to which, according
to the Pauli principle, there must correspond spin projections 1/2,
−1/2, 1/2 in order that all three pairs of $k_l$, $k_\sigma$ should be different. But the less
the electron wave functions coincide in space, the less the Coulomb
repulsion energy between the electrons, because the mean distance
between like charges is greater. For this reason, the state to which the
Pauli principle assigns the greatest possible spin possesses the least
repulsion energy.

There are three p-electrons in the ground state of nitrogen. As was
just indicated, the state with the least energy occurs when all three
spins are parallel. The next state, for which the orbital angular
momentum is equal to 2 and the spin is equal to 1/2, lies approximately
2.2 eV higher, while the state with orbital angular momentum 1 and
spin 1/2 lies 3.8 eV higher.

We can explain why a lesser energy corresponds to a greater
resultant orbital angular momentum. Wave functions for which the
orbital angular-momentum projections differ only in sign are closer
to each other than functions for which the angular-momentum
projections differ in absolute value. But those functions which correspond
to a closer spatial electron density distribution lead to a larger
repulsion energy, while angular momenta in opposite directions, when
summed, yield a lesser resultant angular momentum than the angular
momenta whose projections differ in magnitude also. Thus, the state
with the greatest spin possesses the least energy and, for a given spin,
it is the state with the greatest orbital angular momentum that has
the least energy (Hund’s first rule).

This is the way the orbital and spin angular momenta are
combined. In calculating the electrostatic energy only, the state of the
atom is defined by the absolute values of L and S. But a magnetic
interaction takes place between the resultant orbital angular
momentum and the resultant spin angular momentum of a system of
electrons, analogous to that of a separate electron (see Sec. 32). To a
first approximation, this interaction is described by the scalar product
$A\,(\boldsymbol{\mu}_L \boldsymbol{\mu}_S)$, where $\boldsymbol{\mu}_L$ and $\boldsymbol{\mu}_S$ are the magnetic moments for the orbital
and spin motion of the system, and A is a factor of proportionality.

The scalar product of two angular momenta assumes as many
values as are possessed by the resultant angular momentum for a
given absolute value of the component angular momenta (see exercise
4, Sec. 32). This is clearly shown with the aid of a so-called vector
model: a triangle is constructed on the vectors L, S and J = L + S.
In accordance with the law of composition of angular momenta
(30.30), the side J can equal L + S, L + S − 1, ..., |L − S|. The
energy level of an atom with given values of L and S is split into as
many fine-structure levels as can be assumed by J, i.e., 2S + 1 levels
if S is less than L, and 2L + 1 levels if L is less than S.
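
The rule for the number of fine-structure levels can be illustrated with a few lines of code. This is a minimal sketch; the function name `j_values` is chosen here for illustration and is not part of the text.

```python
from fractions import Fraction


def j_values(L, S):
    """List the values J = L + S, L + S - 1, ..., |L - S| of the vector model."""
    L, S = Fraction(L), Fraction(S)
    top, bottom = L + S, abs(L - S)
    return [top - step for step in range(int(top - bottom) + 1)]


# L = 1, S = 1: three levels, since here 2S + 1 = 2L + 1 = 3.
triplet_p = j_values(1, 1)
# L = 2, S = 1/2: S is less than L, so there are 2S + 1 = 2 levels.
doublet_d = j_values(2, Fraction(1, 2))
# L = 0, S = 3/2: L is less than S, so there is 2L + 1 = 1 level.
quartet_s = j_values(0, Fraction(3, 2))
print(triplet_p, doublet_d, quartet_s)
```

In every case the length of the list equals twice the smaller of L and S plus one, in agreement with the statement above.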

The system of levels described here occurs with the so-called
Russell-Saunders normal coupling (of orbital and spin angular momenta);
the energy states with different L and S differ considerably more than
the energy states with given L and S but different J. The group of
energy levels differing only in total angular momentum J is termed a
multiplet.

In heavy elements, where the spin-orbital interaction for separate
electrons is great, the spin of each electron in a shell is combined with
its orbital angular momentum to form a resultant angular momentum
j [see (32.15)]; only then do the angular momenta j of separate
electrons combine. This may be accounted for by the fact that the
relativistic effect of magnetic-moment interactions is not small
compared with the energy of electrostatic repulsion between electrons in
the inner regions of the atoms of heavy elements, where the electron
velocity is close to the velocity of light. The type of coupling which
occurs when the j of separate electrons are added is termed j-j
coupling. j-j coupling also occurs between nuclear particles as a
result of the large spin-orbital interaction characteristic of nuclear
forces.

The spectroscopic notation for levels. In general form, the
spectroscopic notation for the resultant state of an atom is written thus:

$${}^{2S+1}L_J^{\,g,u}.$$

The main symbol is L, i.e., the letters S, P, D, F, etc., depending
upon what L is equal to: 0, 1, 2, 3, .... As a left superscript we put
2S + 1. As a right subscript we put J, i.e., the vector sum of L and
S from the number of the fine-structure components. Finally, the
right superscripts denote an odd (u) or even (g) state,
respectively.

For example, the ground state of a nitrogen atom has L = 0, S = 3/2
and is formed by three p-electrons. Hence, its spectroscopic designation
is ${}^4S^u_{3/2}$, because the total angular momentum can only equal the
spin angular momentum (L = 0), and $\sum l = 3$ is an odd number.

The notations for the next two states of nitrogen are

${}^2D^u$ and ${}^2P^u$,

or, if the multiplet splitting is taken into account, then

${}^2D^u_{5/2}$ or ${}^2D^u_{3/2}$, and ${}^2P^u_{3/2}$ or ${}^2P^u_{1/2}$,

depending upon the resultant angular momentum J.

If the ground state of the atom has L and S not equal to zero, the
resultant angular momentum is determined by Hund’s second
(empirical) rule: when there are fewer than half the possible number of
electrons in a shell, the least energy corresponds to a multiplet level for
which J = |L − S|, and to that of J = L + S when there are more
than half the possible number. Since the electron angular momentum
l can have 2l + 1 projections, and there are two values $k_\sigma$ for each
projection $k_l$, there can be in all 2(2l + 1) electrons in a shell
with given values of l and n. The total number of electrons in an
atom with a given principal quantum number n is

$$\sum_{l=0}^{n-1} 2\,(2l+1) = 2n^2. \qquad (33.1)$$

The electron configuration corresponding to the least energy occurs 
in the ground state. It is determined by Hund’s first and second 
rules. 

The dependence of energy on the azimuthal quantum number.
Before we can go over to a description of the Mendeleyev periodic
system we have to remark on the dependence of the energy of an
electron on the azimuthal quantum number. The energy of an electron
in all atoms, except the hydrogen atom, depends upon l as well as
upon n. For large l the electron is situated comparatively far away from
the nucleus; in other words, it is more weakly bound to the nucleus
than for small l. For a given n, the energy of an electron is greater, the
larger l. When the field greatly differs from a Coulomb field, the
dependence of energy upon l is so strong that an increase in the
principal quantum number n, with a simultaneous decrease in l, leads to a
smaller energy increase than the increase of l for a given n. In other
words, the state with quantum numbers n + 1, 0 can have a lower
energy than the state with quantum numbers n, l. This will become
clear in the later examples.

Filling the first shells. As was mentioned, the shell with n = 1 is
filled by two electrons in the 1s-state (the 1 in front denotes the
quantity n). Hydrogen has one electron in this shell and helium has two.
The helium shell is completely filled and has a ${}^1S_0$ state. The electron
configuration for the ground state of a helium atom is so stable that
if any other atom approaches close to it the total energy can only
increase, so that repulsion forces are produced. Helium is completely
inert chemically. The forces between helium atoms are small as a
result of the symmetry and stability of their electron shells. Therefore,
helium gas is liquefied at an extremely low temperature.*

* The condensation of helium into a liquid at low temperatures is due
to the so-called Van der Waals forces, which arise out of the mutual
electrostatic polarization of approaching atoms. These forces act at larger distances
than the forces of chemical affinity, and are very small compared with them.

After helium, the shell structure with n = 2 begins. The first electron
of this shell, i.e., a 2s-electron, appears in lithium. The two inner
1s-electrons occurring in the helium configuration strongly screen
the nuclear charge and, consequently, the outer electron is weakly
bound. Such is the alkali-metal electron configuration in the case
of lithium, and analogous electron configurations subsequently result
each time (Na, K, Rb, Cs) from the addition of an s-electron to a nucleus
surrounded by a noble-gas electronic cloud. The next 2s-electron
has an energy which is comparatively close to the energy of a
2p-electron: the energy of the electron is still weakly dependent on the
azimuthal quantum number since the field is approximately Coulomb.
A large energy is needed for an electron to go from a 1s-shell to a
2s- or a 2p-shell, while a small energy is needed for the transition
from a 2s- to a 2p-shell. For this reason, the beryllium electronic
configuration, having two 2s-electrons, is not very stable with respect
to an electron transition to the 2p-shell. In other words, filling the
2s-shell does not give the electron configuration of a noble gas. Indeed,
as we know, beryllium is a metal.

After beryllium, the 2p-shell fills up, and is completely filled for the
noble gas neon. Neon follows fluorine, which requires one electron for
the shell to be filled. The energy released when an electron is added
to the fluorine 2p-shell, completing the shell of neon, is large. This explains
the chemical activity of fluorine and the other halogens, which are
similarly situated with respect to the noble gases.

There can be eight electrons in a shell for which n = 2. This is the
first group of the Mendeleyev system. The shell with n = 3 is then
filled, though initially only the first two subshells: 3s and 3p. The
elements of the second group have an outer electron-shell structure
similar to the elements of the first group. The chemical properties of
atoms are basically determined by the outer shells. This explains the
similarity of chemical properties, on the basis of which Mendeleyev
formulated his law. Argon has a filled shell, i.e., still another group
of eight elements is completed. The noble-gas configuration is obtained
for argon because the 3p-state, on the one hand, and the 3d and 4s
states, on the other hand, differ considerably in energy.

By considering the possible states of shells which, to be filled, lack 
less than half the possible number of electrons, we can consider that 
unfilled states behave like electrons. For example, if there are two of 
the six electrons wanting in a 2 p-shell, then we can combine the 
states of the two “holes,” similar to the way that the states of two 
2 p-electrons were combined at the beginning of this section. In doing 
so, correct results are always obtained, provided that Hund’s second 
rule is used in finding the total angular momentum J of the ground 
state, i.e., that we take J=L+S. It is easy to see that four electrons 
in the shell are equivalent to two holes by applying the Pauli principle 
first to an electron and then to a “hole,” (see exercise 2). 

Let us now give, in one table, the scheme for building up the first 
eighteen places in the periodic system of elements; this table shows 
the number of electrons having given quantum numbers. 


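Element   n=1, l=0   n=2, l=0   n=2, l=1   n=3, l=0   n=3, l=1   Ground state

H            1                                                   $^2S_{1/2}$
He           2                                                   $^1S_0$
Li           2          1                                        $^2S_{1/2}$
Be           2          2                                        $^1S_0$
B            2          2          1                             $^2P_{1/2}$
C            2          2          2                             $^3P_0$
N            2          2          3                             $^4S_{3/2}$
O            2          2          4                             $^3P_2$
F            2          2          5                             $^2P_{3/2}$
Ne           2          2          6                             $^1S_0$
Na           2          2          6          1                  $^2S_{1/2}$
Mg           2          2          6          2                  $^1S_0$
Al           2          2          6          2          1       $^2P_{1/2}$
Si           2          2          6          2          2       $^3P_0$
P            2          2          6          2          3       $^4S_{3/2}$
S            2          2          6          2          4       $^3P_2$
Cl           2          2          6          2          5       $^2P_{3/2}$
Ar           2          2          6          2          6       $^1S_0$
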

The filling order after the 3p-shell. After argon, the 4s-shell begins
to fill instead of the 3d-shell. The new group begins with the alkali
metal, potassium. The sum n + l is the same for the 3p- and 4s-shells
and is equal to 4, while it is already greater by unity in the 3d-shell.
The 4p-shell is filled after the 3d-shell, with the same value of the
sum n + l = 5, and then the 5s-shell. It is seen that this rule is observed
later on, too; the filling of the shells with the same sum n + l
proceeds in order of increasing n. But there are certain deviations
from this rule during the filling of the d- and f-shells.

In the shells with n = 1, 2, 3, there are altogether 2·1² + 2·2² +
+ 2·3² = 2 + 8 + 18 = 28 electrons. There are a further eight electrons
in the 4s- and 4p-states, and another two electrons in the 5s-state.
The 5s-state is followed by electrons with n + l = 6, where we begin
with the least n, i.e., with 4d. There are 2 (4 + 1) = 10 more of these
electrons. The 4d-electrons are followed by 5p-electrons, of which
there are six, and then by the same simple rule we get the 6s-state.
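
The filling rule just described (increasing sum n + l, and increasing n for equal sums) together with the shell capacity 2(2l + 1) is easy to tabulate. The sketch below is an idealization under that rule alone and ignores the deviations in the d- and f-shells mentioned above; the function names are illustrative and do not come from the text.

```python
LETTERS = "spdfghi"  # spectroscopic letters for l = 0, 1, 2, ...


def capacity(l):
    """Greatest number of electrons in a shell with given n and l."""
    return 2 * (2 * l + 1)


def filling_order(max_sum):
    """Shells (n, l) with n + l <= max_sum, ordered by n + l and then by n."""
    shells = [(n, l) for n in range(1, max_sum + 1)
              for l in range(n) if n + l <= max_sum]
    return sorted(shells, key=lambda shell: (shell[0] + shell[1], shell[0]))


def label(shell):
    n, l = shell
    return f"{n}{LETTERS[l]}"


def electrons_before(target, max_sum=8):
    """Number of electrons accommodated before the shell `target` starts to fill."""
    total = 0
    for shell in filling_order(max_sum):
        if label(shell) == target:
            return total
        total += capacity(shell[1])
    raise ValueError(f"shell {target} not found")


order = [label(shell) for shell in filling_order(7)]
print(order)
print(electrons_before("4s"), electrons_before("4f"))
```

Under these assumptions the 4s-shell starts after the eighteen electrons of argon, and the 4f-shell would start once fifty-six electrons have been placed.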

Rare-earth elements. The next value of n + l is 7, the least n being
n = 4. Hence, beginning with the 57th place (in actuality, with the
58th place) the 4f-shell can begin to fill, acquiring at once two
4f-electrons. This shell is already inside the atom as a result of the
form of the potential distribution within the atom. The screening
of the nuclear charge by atomic electrons leads, at large distances
away from the nucleus, to the potential decreasing like $1/r^4$ instead
of $1/r$ (see Sec. 44). If we combine the potential energy of an electron,
calculated with allowance for screening, with the centrifugal energy,
it turns out that the d- and f-states possess a minimum resultant
effective potential energy deep inside the atom (see Sec. 5, footnote
to p. 45). Indeed, the centrifugal energy is greater than the potential
energy both close to the nucleus (with allowance made for screening)
as well as far away from the nucleus. Therefore, the effective
potential energy $U_M$ is positive for large r as well as for small r. In other
words, for the d- and f-states, the $U_M$ curve goes higher for large r
than for the s- and p-states, and it turns out that the effective
potential well for d- and f-electrons is situated closer to the nucleus than
to the boundaries of the s- and p-electron shells. Thus, the d- and
f-shells are, as it were, filled inside the atom. But the chemical
properties of atoms depend mainly upon the outer electrons which,
in filling the 4f-shell, change very little. This is how the group of
2 (2·3 + 1) = 14 chemically similar elements, termed rare-earth
elements, originates.

It should be pointed out that the d- and f-shells are not filled
successively as a result of “competition” with outer shells: for example,
there are three d-electrons and two s-electrons in V₂₃, five d-electrons
and one s-electron in the next element, Cr₂₄, while Mn₂₅ also has five
d-electrons but two s-electrons.

The statistical theory of the atom, which will be set out in Sec. 44,
permits us, in rough outline, to find the potential distribution inside
an atom. It becomes possible, from this distribution, to predict
rather accurately the places in the periodic system where elements
with l = 2 and 3 appear.

The 5f-shell fills up (beginning with thorium) in a whole group
of elements similar to the rare earths. A large part of this group
consists of the artificially produced transuranium elements.

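The wave equation of a two-electron system. We shall now formulate
Pauli’s principle using wave functions. The simplest way to do this
is to consider a two-electron system. The wave equation for two
electrons may be written thus:

$$\hat{\mathscr{H}}\,\Phi \equiv -\frac{h^2}{2m}\,(\Delta_1 + \Delta_2)\,\Phi + U(\mathbf{r}_1, \mathbf{r}_2)\,\Phi = \mathscr{E}\,\Phi. \qquad (33.2)$$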

Here $\Delta_1$ and $\Delta_2$ are the Laplacian operators with respect to the variables
of the first and second electrons, and $U(\mathbf{r}_1, \mathbf{r}_2)$ is the potential energy
of their interaction with the external field and with each other.
For example, in a helium atom

$$U(\mathbf{r}_1, \mathbf{r}_2) = -\frac{2e^2}{r_1} - \frac{2e^2}{r_2} + \frac{e^2}{|\mathbf{r}_1 - \mathbf{r}_2|}. \qquad (33.3)$$

The wave function depends upon the spatial and spin variables
of both particles:

$$\Phi = \Phi(\mathbf{r}_1, \sigma_1; \mathbf{r}_2, \sigma_2). \qquad (33.4)$$

The interaction of spin magnetic moment with orbital motion
is weak. Therefore, to a first approximation, the spin-orbital
interaction can be neglected in the potential energy operator. This
corresponds to $U(\mathbf{r}_1, \mathbf{r}_2)$ in equation (33.3). If the effect of spin motion
upon orbital motion is small, then the probability of a certain value
of spin and coordinate is equal to the product of the probabilities
of both values, and the probability amplitude $\Phi$ also divides into
a product of amplitudes:

$$\Phi(\mathbf{r}_1, \sigma_1; \mathbf{r}_2, \sigma_2) = \Psi(\mathbf{r}_1, \mathbf{r}_2)\, \chi(\sigma_1, \sigma_2). \qquad (33.5)$$


The probability amplitude of orbital motion satisfies equation
(33.2), provided $\hat{\mathscr{H}}$ does not involve the spin operator. But even
when the system is placed in an external homogeneous magnetic
field H, the following operator is added to $\hat{\mathscr{H}}$:

$$\frac{eh}{mc}\,[(\hat{\boldsymbol{\sigma}}_1 \mathbf{H}) + (\hat{\boldsymbol{\sigma}}_2 \mathbf{H})] = \frac{eh}{mc}\, H\, (\hat{\sigma}_{1z} + \hat{\sigma}_{2z}), \qquad (33.6)$$

where it is taken that the z-axis coincides with the direction of the
field (the minus sign is replaced by a plus sign because the electronic
charge is negative). The action of the operator $\hat{\sigma}_{1z} + \hat{\sigma}_{2z}$ on the
spin function simply gives the total spin projection of the system.
For this reason, in the presence of an external homogeneous magnetic
field, this operator is replaced by a number which is added to the total
energy of the system.

The symmetry of the operator $\hat{\mathscr{H}}$ with respect to particle
interchange. When examining the operator $\hat{\mathscr{H}}$ in (33.2) we see that it is
completely symmetrical with respect to a coordinate interchange
for both particles, i.e., it does not change its form if the first electron
is called the second, and the second electron the first:

$$(\mathbf{r}_1, \sigma_1; \mathbf{r}_2, \sigma_2) \rightarrow (\mathbf{r}_2, \sigma_2; \mathbf{r}_1, \sigma_1). \qquad (33.7)$$

But equation (33.2) is linear. Therefore, if the form does not change
due to operation (33.7), then the wave function can only be multiplied
by some constant number P:

$$\Phi(\mathbf{r}_1, \sigma_1; \mathbf{r}_2, \sigma_2) = P\,\Phi(\mathbf{r}_2, \sigma_2; \mathbf{r}_1, \sigma_1). \qquad (33.8)$$

Because $\mathbf{r}_1$, $\sigma_1$ and $\mathbf{r}_2$, $\sigma_2$ are involved in the same way in all the
equations, we can interchange them in (33.8), obtaining

$$\Phi(\mathbf{r}_2, \sigma_2; \mathbf{r}_1, \sigma_1) = P\,\Phi(\mathbf{r}_1, \sigma_1; \mathbf{r}_2, \sigma_2). \qquad (33.9)$$

Substituting (33.9) in (33.8), we shall have

$$\Phi(\mathbf{r}_1, \sigma_1; \mathbf{r}_2, \sigma_2) = P^2\,\Phi(\mathbf{r}_1, \sigma_1; \mathbf{r}_2, \sigma_2),$$

or

$$P^2 = 1, \qquad P = \pm 1. \qquad (33.10)$$

In this comparatively simple case, when there are only two
particles, the transformation is similar to the symmetry transformation
for a wave function under reflection [see (31.38)].

The permutation (interchange) operator for coordinates and spin variables.
We can define a coordinate permutation operator for electrons,
$\hat{P}_r$, such that

$$\hat{P}_r\,\Psi(\mathbf{r}_1, \mathbf{r}_2) = \Psi(\mathbf{r}_2, \mathbf{r}_1). \qquad (33.11)$$

If the wave equation is symmetrical with respect to interchange
of $\mathbf{r}_1$ and $\mathbf{r}_2$ (without interchanging the spin variables $\sigma_1$ and $\sigma_2$)
then, by repeating the foregoing argument, we see that the
eigenvalues of $\hat{P}_r$ for two electrons are equal to ±1.

An analogous operator is also defined for spin:

$$\hat{P}_\sigma\,\chi(\sigma_1, \sigma_2) = \chi(\sigma_2, \sigma_1), \qquad (33.12)$$

where the eigenvalues of $\hat{P}_\sigma$ are likewise equal to ±1.

We denote the set of orbital quantum numbers of the first electron
by the letter $n_1$ (in place of $n_1$, $l_1$, $k_{l_1}$), and those of the second
electron by the letter $n_2$. Then the orbital wave function $\Psi$ is written in
more detail as

$\Psi = \Psi(n_1, \mathbf{r}_1; n_2, \mathbf{r}_2)$.

It follows from requirement (33.10) that

$\hat{P}_r \Psi(n_1, \mathbf{r}_1; n_2, \mathbf{r}_2) = \Psi(n_1, \mathbf{r}_2; n_2, \mathbf{r}_1) = \pm \Psi(n_1, \mathbf{r}_1; n_2, \mathbf{r}_2)$. (33.13)

The function in (33.13) with an upper sign is termed symmetric;
with a lower sign, antisymmetric.

The wave function of a two-electron system. Introducing, in addition,
the spin quantum numbers $k_{\sigma_1}$ and $k_{\sigma_2}$, which determine
the form of the spin wave functions (see exercise 3, Sec. 32), we write
the total wave function of a two-electron system as

$\Phi(n_1, k_{\sigma_1}, \mathbf{r}_1, \sigma_1; n_2, k_{\sigma_2}, \mathbf{r}_2, \sigma_2)$.




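The total permutation of spin and spatial coordinates in this
function occurs as a result of the action of the $\hat{P}$ operator, which is

$\hat{P} = \hat{P}_r \hat{P}_\sigma$. (33.14)

Operating with (33.14) on the function $\Phi$, we have

$\hat{P}\,\Phi(n_1, k_{\sigma_1}, \mathbf{r}_1, \sigma_1; n_2, k_{\sigma_2}, \mathbf{r}_2, \sigma_2) = \Phi(n_1, k_{\sigma_1}, \mathbf{r}_2, \sigma_2; n_2, k_{\sigma_2}, \mathbf{r}_1, \sigma_1)$. (33.15)

According to (33.10) this function is also either symmetric or
antisymmetric. But it can now be seen immediately that only the
antisymmetric function satisfies the Pauli principle. Indeed, let
the states of both electrons be identical, i.e., $n_1 = n_2$, $k_{\sigma_1} = k_{\sigma_2}$. Then,
if the function $\Phi$ is antisymmetric, we obtain

$\hat{P}\,\Phi(n_1, k_{\sigma_1}, \mathbf{r}_1, \sigma_1; n_1, k_{\sigma_1}, \mathbf{r}_2, \sigma_2) = \Phi(n_1, k_{\sigma_1}, \mathbf{r}_2, \sigma_2; n_1, k_{\sigma_1}, \mathbf{r}_1, \sigma_1) =$

$= -\Phi(n_1, k_{\sigma_1}, \mathbf{r}_1, \sigma_1; n_1, k_{\sigma_1}, \mathbf{r}_2, \sigma_2) =$

$= \Phi(n_1, k_{\sigma_1}, \mathbf{r}_1, \sigma_1; n_1, k_{\sigma_1}, \mathbf{r}_2, \sigma_2)$. (33.16)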

By definition, the operator $\hat{P}$ interchanges only the variables $\mathbf{r}$
and $\sigma$, and by no means the quantum numbers $n$, $k_\sigma$. The first
equation of (33.16) denotes the result of a $\hat{P}$ operation, the second
takes into account the antisymmetry of the wave function, while
the third is obtained from the first by permutation of all four
arguments relating to the electrons. The possibility of such a permutation
for any function is obvious, since it does not matter which
particle is considered first and which second when writing down
the wave function: the interchange of the four values $n_1, k_{\sigma_1}, \mathbf{r}_2, \sigma_2$
and $n_1, k_{\sigma_1}, \mathbf{r}_1, \sigma_1$ in the last equality of (33.16) simply does not
denote anything: it is immaterial which arguments are written
first, those relating to the first electron, i.e., $n_1, k_{\sigma_1}, \mathbf{r}_1, \sigma_1$, or those
relating to the second electron, $n_1, k_{\sigma_1}, \mathbf{r}_2, \sigma_2$. Hence, the function
$\Phi(n_1, k_{\sigma_1}, \mathbf{r}_1, \sigma_1; n_1, k_{\sigma_1}, \mathbf{r}_2, \sigma_2)$ is equal to itself with the sign reversed,
i.e., it becomes zero identically.

This property is possessed only by an antisymmetric function
and not by a symmetric function; the latter would become identically
equal to itself. But if the antisymmetric wave function of two electrons
occurring in identical states is identically equal to zero, the
probability amplitude of this state of a system of two electrons
is equal to zero for any values of the variables $\mathbf{r}_1, \mathbf{r}_2, \sigma_1, \sigma_2$. Only
an antisymmetric function is compatible with the Pauli principle.

The same applies to the wave function for a many-electron system:
it is antisymmetric with respect to a simultaneous permutation of
spatial and spin variables for any electron pair. This is the generalized
formulation of the Pauli principle.
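
The argument around (33.16) can be illustrated numerically. The following is only a sketch, not part of the original text: it assumes NumPy, and it replaces the continuous variables $(\mathbf{r}, \sigma)$ of each electron by a single hypothetical discrete index, so that a one-electron state is just a vector of amplitudes (`u`, `v`) and a two-electron function is a matrix `Phi[x1, x2]`. Interchanging the electrons is then a matrix transpose.

```python
import numpy as np

def two_particle(u, v, sign):
    """Phi[x1, x2] = u(x1) v(x2) + sign * u(x2) v(x1); sign = +1 or -1."""
    return np.outer(u, v) + sign * np.outer(v, u)

rng = np.random.default_rng(0)
u = rng.normal(size=6)   # hypothetical one-electron state "n1"
v = rng.normal(size=6)   # hypothetical one-electron state "n2"

anti = two_particle(u, v, -1)
sym = two_particle(u, v, +1)

# Interchanging the two electrons = transposing the matrix of amplitudes.
swap_changes_sign = np.allclose(anti.T, -anti)
swap_keeps_sym = np.allclose(sym.T, sym)

# Put both electrons in the same state: only the antisymmetric
# combination vanishes identically.
anti_same_state = two_particle(u, u, -1)
sym_same_state = two_particle(u, u, +1)

print(swap_changes_sign, swap_keeps_sym)
print(np.abs(anti_same_state).max(), np.abs(sym_same_state).max())
```

Under these assumptions the antisymmetric combination is zero at every pair of arguments when the two states coincide, while the symmetric one is not, which is the content of the statement that only an antisymmetric function is compatible with the Pauli principle.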




Particles with half-integral spin. Experiment shows that all elementary
particles with half-integral spin obey the Pauli principle:
protons, neutrons, electrons, and positrons. Complex particles,
consisting of an even number of elementary particles with half-integral
spin, have a symmetric wave function because, for a complete interchange
of all variables relating to such a complex particle, we must
make an even number of permutations of the elementary particle
variables it consists of. But by changing the sign an even number
of times we do not change it at all. For this reason, nuclei with even
atomic weights (for example, D$^2$, He$^4$, O$^{16}$, etc.) and, therefore,
having symmetric wave functions are not subject as units to the
Pauli principle, while He$^3$, Li$^7$, etc., have an antisymmetric wave
function, that is to say, they are subject to the Pauli principle.

Elementary particles not subject to the Pauli principle. Light quanta
do not obey the Pauli principle since there can be an unlimited
number of quanta in a state with a given wave vector $\mathbf{k}$ and given
polarization. All particles with integral spin possess a wave function
which is symmetric with respect to a complete permutation of the
variables relating to any pair of particles.

The Pauli principle and the limiting transition to classical theory.
The Pauli principle enables us to understand why the wave properties
of light quanta are conserved in the limiting transition to
classical theory, while the wave properties of electrons are
not.

We shall consider quanta in definite states, i.e., having a certain
polarization and wave vector. The number of such quanta can be
infinitely large, since quanta are not subject to the Pauli principle.
We note that this was not introduced as a supplementary hypothesis
concerning the properties of light quanta, but was directly obtained
in Sec. 27 in the quantization of electromagnetic field equations:
the number of quanta in a state $N_{\mathbf{k},\sigma}$ is the quantum number for
the corresponding oscillator. If this quantum number is large then
the motion of the oscillator becomes classical, and, as we know,
its oscillation amplitude is proportional to the amplitude of a field
with a given polarization and wave vector. Thus, the limiting transition
yields a classical wave pattern.

In accordance with the Pauli principle, there cannot be more
than one electron in each state. Therefore, the probability-amplitude
absolute values are always limited by the normalization to unity
and, consequently, do not pass to wave amplitudes which can be
defined classically.

The ortho- and para-states of two electrons. Let us now return
to the case for which the wave function can be represented in the
form (33.5). Since the whole product is antisymmetric, one of its
factors must be symmetric and the other antisymmetric. This simple
result refers only to the two-electron problem.




Let us consider the wave function for two electrons. Since the
spin of each electron is equal to $\frac{1}{2}$ (in atomic units), the resultant
spin can only be equal to zero or unity. Both these states of a system
of two electrons have special names. The state with spin unity is
termed the ortho-state, while that with spin equal to zero is called
the para-state.

As has already been said, the magnetic interaction of spins is
small. If we can neglect it, then it is easy to write down the spin
wave functions for the ortho- and para-states. Let $\chi(k_{\sigma_1}; \sigma_1)$ be a
function of the spin variable $\sigma_1$ for the first particle, assuming, as
we know, only two values, $\sigma_1 = 1$ and $\sigma_1 = 2$; $k_{\sigma_1}$ denotes the eigenvalue
of the spin projection. Depending upon whether $k_{\sigma_1}$ is equal
to $\frac{1}{2}$ or $-\frac{1}{2}$, the function $\chi$ has the form shown in exercise 3, Sec. 32.

Without assigning a definite form to $\chi$, we write down the spin
wave function for two particles which do not have a spin magnetic
interaction:

$\chi(k_{\sigma_1}, \sigma_1; k_{\sigma_2}, \sigma_2) = \chi(k_{\sigma_1}, \sigma_1)\,\chi(k_{\sigma_2}, \sigma_2)$, (33.17)


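i.e., the probability amplitude for both particles breaks up into the
product of the amplitudes for each particle separately. However,
it must be taken into account that this function must be either
symmetric or antisymmetric. If $k_{\sigma_1} = k_{\sigma_2}$, then (33.17) is symmetric
by itself:

$\chi(k_{\sigma_1}, \sigma_1; k_{\sigma_2}, \sigma_2) = \begin{cases} \chi(\tfrac{1}{2}, \sigma_1)\,\chi(\tfrac{1}{2}, \sigma_2), \\ \chi(-\tfrac{1}{2}, \sigma_1)\,\chi(-\tfrac{1}{2}, \sigma_2). \end{cases}$ (33.18)

For $k_{\sigma_1} \neq k_{\sigma_2}$ we must form either a symmetric or an antisymmetric
combination of $\chi$:

$\chi_{sym} = \frac{1}{\sqrt{2}}\left[\chi(\tfrac{1}{2}, \sigma_1)\,\chi(-\tfrac{1}{2}, \sigma_2) + \chi(\tfrac{1}{2}, \sigma_2)\,\chi(-\tfrac{1}{2}, \sigma_1)\right]$, (33.19)

$\chi_{antisym} = \frac{1}{\sqrt{2}}\left[\chi(\tfrac{1}{2}, \sigma_1)\,\chi(-\tfrac{1}{2}, \sigma_2) - \chi(\tfrac{1}{2}, \sigma_2)\,\chi(-\tfrac{1}{2}, \sigma_1)\right]$. (33.20)

The factor $\frac{1}{\sqrt{2}}$ is introduced for normalization.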

The magnitude of the spin projection, i.e., 0, 1 or -1, depends
upon the choice of the z-axis. But the symmetry or antisymmetry 
of a wave function is an internal property and cannot depend upon 
the choice of coordinate axes. Therefore, the state (33.19) must be 
regarded as one state together with (33.18), if we judge by the total 
spin value. They are distinguished by the spin projections, and the 
total number of these states is three, as is required for a total spin 
equal to 1. The upper line of (33.18), as is evident, corresponds to 




a total projection 1, the lower line to a projection —1, and (33.19) 
to a projection 0. (33.20) corresponds to a total spin of zero. In the 
accepted terminology, the state with unity spin is to be regarded 
as the ortho-state while that with zero spin, the para-state. 
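
The assignment of (33.18)-(33.20) to total spin 1 and 0 can be checked with a small numerical sketch. It is an illustration only and makes assumptions not in the text: NumPy is used, each spin function $\chi(\pm\frac{1}{2}, \sigma)$ is represented by a two-component column (`up`, `down`), two-particle functions are Kronecker products, and the spin operators are the usual Pauli matrices divided by two (units of $h$).

```python
import numpy as np

up = np.array([1.0, 0.0])     # chi(+1/2, sigma), assumed column form
down = np.array([0.0, 1.0])   # chi(-1/2, sigma)

def pair(a, b):
    """Product function chi_a(sigma_1) chi_b(sigma_2)."""
    return np.kron(a, b)

# (33.18) upper line, (33.19), (33.18) lower line: the ortho-states.
triplet = [pair(up, up),
           (pair(up, down) + pair(down, up)) / np.sqrt(2),
           pair(down, down)]
# (33.20): the para-state.
singlet = (pair(up, down) - pair(down, up)) / np.sqrt(2)

# Spin interchange operator P_sigma on the four-dimensional product space.
swap = np.zeros((4, 4))
for i in range(2):
    for j in range(2):
        swap[2 * j + i, 2 * i + j] = 1.0

# One-electron spin operators and the total spin of the pair.
sx = np.array([[0, 1], [1, 0]]) / 2
sy = np.array([[0, -1j], [1j, 0]]) / 2
sz = np.array([[1, 0], [0, -1]]) / 2
I2 = np.eye(2)
total = [np.kron(s, I2) + np.kron(I2, s) for s in (sx, sy, sz)]
S2 = sum(S @ S for S in total)   # eigenvalue S(S+1)
Sz = total[2]

def expect(op, state):
    return float(np.real(state.conj() @ op @ state))

for name, st in zip(("+1", "0", "-1"), triplet):
    print("ortho", name, expect(S2, st), expect(Sz, st))
print("para", expect(S2, singlet), expect(Sz, singlet))
```

In this representation the three symmetric functions give $S(S+1) = 2$ with projections 1, 0, -1, and the antisymmetric function gives $S(S+1) = 0$, in agreement with the counting of states in the text.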

This definition of ortho- and para-states from the symmetry of
the spin wave function also holds for particles with spin other than $\frac{1}{2}$.
But the resultant spin in the ortho- and para-states turns out, in
this case, to be ambiguously related to the symmetry of the function.
The example of deuterons with spin 1 will be examined in
Sec. 41.

Ortho- and para-states of helium. The two electrons in the helium
atom can occur either in the ortho-state or the para-state. In the
first case, the atom has spin unity, in the second case, zero. The
symmetric and antisymmetric spin functions are the eigenfunctions
of the spin-commutation operator $\hat{P}_\sigma$; when operated upon by
the operator $\hat{P}_\sigma$ they give $\pm 1$, i.e., the eigenvalues of $\hat{P}_\sigma$. To the
approximation (33.2)-(33.3), the Hamiltonian* commutes
with $\hat{P}_\sigma$, so that $\hat{P}_\sigma$ is an integral of motion. Therefore, transitions
between the ortho- and para-states, during which the total spin
is not conserved, are far less probable than transitions with conservation
of spin.

Only when spin-orbital interaction is taken into account, when
the wave function cannot be expressed as a product of the form
(33.5), is $\hat{P}_\sigma$ not an independent integral of motion. But the corresponding
terms in the Hamiltonian,** which describe the spin-orbit
interaction, are inversely proportional to $c^2$. To this approximation,
it is only the total permutation operator of the spin and
spatial variables for both electrons $\hat{P}$ that is an integral of motion,
because the total wave function of the two electrons is always
antisymmetric in accordance with the Pauli principle.

The eigenfunction of a hydrogen molecule in a zero approximation. 
Concluding this section we shall consider the quantum mechanical 
explanation for the homopolar chemical bond. Such a bond occurs, 
for example, between the two atoms in a hydrogen molecule. It 
was first considered by Heitler and London. 

We assume that the atoms are independent in the zero approximation.
Each electron is situated close to its own nucleus. We
shall denote the nuclei by the letters a and b, and the electrons by
the numbers 1 and 2. In the initial approximation, the interaction


* The Hamiltonian means the Hamiltonian operator.

** See A. I. Akhiezer and V. B. Berestetsky, Quantum Electrodynamics,
GTTI, 1953, equation (37.10). [English translation by Consultants Bureau,
Inc. New York, N. Y., 1957.]




between atoms is not taken into account. But this does not mean
that the wave functions of two electrons can be taken in the form

$\Psi = \psi(r_{a1})\,\psi(r_{b2})$,

because this function is neither symmetric nor antisymmetric with
respect to the interchange of the electron coordinates. Neglecting
the spin-orbital interaction, we must write the spatial wave function
in one of the two forms:

$\Psi = \psi(r_{a1})\,\psi(r_{b2}) + \psi(r_{a2})\,\psi(r_{b1})$, (33.21)

or

$\Psi = \psi(r_{a1})\,\psi(r_{b2}) - \psi(r_{a2})\,\psi(r_{b1})$, (33.22)

assuming that the total wave function $\Phi$ is obtained by multiplication
of the spatial function by the spin function of opposite symmetry.
In this form the wave functions are compatible with Pauli’s principle.

We write $r_{a1}$ and $r_{b2}$ in scalar form because the wave functions
of a hydrogen atom in the ground state do not depend upon the
angle.

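The wave equation for a hydrogen molecule has the following
form:

$\left[-\frac{h^2}{2m_p}\Delta_a - \frac{h^2}{2m_p}\Delta_b - \frac{h^2}{2m}\Delta_1 - \frac{h^2}{2m}\Delta_2 - \frac{e^2}{r_{a1}} - \frac{e^2}{r_{b2}}\right.$

$\left. - \frac{e^2}{r_{a2}} - \frac{e^2}{r_{b1}} + \frac{e^2}{r_{12}} + \frac{e^2}{r_{ab}}\right]\Psi = \mathcal{E}\,\Psi$. (33.23)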

The first two terms describe the motion of the nuclei of the molecule.
They involve the mass of the proton $m_p$ in the denominator, and are
therefore exceedingly small compared with the terms describing
the motion of the electrons. Physically, this means that the nuclei
move considerably slower than the electrons, so that we can find
the electron wave function for a fixed distance between the nuclei.
Then $\mathcal{E}$ is a function of the distance between the nuclei. If this function
has a minimum, corresponding to a stable equilibrium for a
given electronic state, then it becomes possible for atoms to form
a molecule. We shall not in future write the terms corresponding to
the nuclear kinetic energy; they must be taken into account when
we consider the vibrational, rotational or translational motion of
molecules, though the very position of stable equilibrium, which
is determined by the electronic motion, can be found without allowance
for $-\frac{h^2}{2m_p}\Delta_a$ and $-\frac{h^2}{2m_p}\Delta_b$.

The terms of the Hamiltonian appearing in the first line of (33.23)
(without $-\frac{h^2}{2m_p}\Delta_a$ and $-\frac{h^2}{2m_p}\Delta_b$) will be denoted by $\hat{\mathcal{H}}_0$, where the index 0 indicates the degree of
the approximation. The second line involves terms due to atomic




interactions: the attraction of electrons to “alien” nuclei, and the
Coulomb repulsion between the electrons and between the nuclei. We shall
call this part $\hat{\mathcal{H}}_1$ and consider it a perturbation: this is true, strictly
speaking, only in a qualitative sense.

Perturbation method. Using the notation $\hat{\mathcal{H}}_0$ and $\hat{\mathcal{H}}_1$, the wave
equation is written as

$\hat{\mathcal{H}}\Psi = (\hat{\mathcal{H}}_0 + \hat{\mathcal{H}}_1)\Psi = \mathcal{E}\,\Psi$. (33.24)

The energy eigenvalue can be expanded as the sum of the energies
of the noninteracting hydrogen atoms [from (31.34)] and the interaction
energy of the atoms:

$\mathcal{E} = \mathcal{E}_0 + \mathcal{E}_1$. (33.25)

We consider the operator $\hat{\mathcal{H}}_1$ and the energy $\mathcal{E}_1$ as corrections.
Accordingly, we separate the wave function into a zero approximation
function $\Psi_0$ given by one of the expressions (33.21) or (33.22), and a
correction $\Psi_1$:

$\Psi = \Psi_0 + \Psi_1$. (33.26)

We shall neglect the quantities $\hat{\mathcal{H}}_1\Psi_1$ and $\mathcal{E}_1\Psi_1$ because, in our
approximation, they can be considered as being of the second order
of magnitude.

Substituting (33.25) and (33.26) in (33.24) and omitting these small
terms, we obtain

$\hat{\mathcal{H}}_0\Psi_0 + \hat{\mathcal{H}}_1\Psi_0 + \hat{\mathcal{H}}_0\Psi_1 = \mathcal{E}_0\Psi_0 + \mathcal{E}_1\Psi_0 + \mathcal{E}_0\Psi_1$. (33.27)

But from the definition of $\Psi_0$ we have

$\hat{\mathcal{H}}_0\Psi_0 = \mathcal{E}_0\Psi_0$, (33.28)

since $\mathcal{E}_0$ is the zero approximation energy and $\Psi_0$ is an eigenfunction
of $\hat{\mathcal{H}}_0$.

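We multiply the remaining terms by $\Psi_0$ and integrate over the
volume $dV_1\,dV_2$ of both electrons*:

$\int \Psi_0 \hat{\mathcal{H}}_0 \Psi_1\,dV_1\,dV_2 + \int \Psi_0 \hat{\mathcal{H}}_1 \Psi_0\,dV_1\,dV_2 =$

$= \mathcal{E}_1 \int \Psi_0^2\,dV_1\,dV_2 + \mathcal{E}_0 \int \Psi_0 \Psi_1\,dV_1\,dV_2$. (33.29)

We can transform the first term on the left making use of the
fact that $\hat{\mathcal{H}}_0$ is an Hermitian operator ($\Psi_0$ is a real wave function):

$\int \Psi_0 \hat{\mathcal{H}}_0 \Psi_1\,dV_1\,dV_2 = \int \Psi_1 \hat{\mathcal{H}}_0 \Psi_0\,dV_1\,dV_2 = \mathcal{E}_0 \int \Psi_0 \Psi_1\,dV_1\,dV_2$, (33.30)

* For simplicity, we consider here the real Hamiltonian and the real wave
functions; in the more general case, we must multiply the left by $\Psi_0^*$.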




so that it cancels with the last term on the right, in accordance with
(33.28). The same could have been proved by using a definite form
for $\hat{\mathcal{H}}_0$ in the given case.

Finally, from (33.29) we obtain

$\mathcal{E}_1 = \frac{\int \Psi_0 \hat{\mathcal{H}}_1 \Psi_0\,dV_1\,dV_2}{\int \Psi_0^2\,dV_1\,dV_2}$. (33.31)

The denominator of this expression is the square of the normalization
factor of the function $\Psi_0$, which square appears because the
expressions (33.21) and (33.22) are not normalized to unity. Changing
to a normalized wave function,

$\Psi_0' = \frac{\Psi_0}{\sqrt{\int \Psi_0^2\,dV_1\,dV_2}}$,

we represent the energy correction to a first approximation in the
form

$\mathcal{E}_1 = \int \Psi_0' \hat{\mathcal{H}}_1 \Psi_0'\,dV_1\,dV_2$. (33.31a)

This expression is equal to the average of the perturbation energy
over nonperturbed motion [see (30.24)].

It can be seen from the very deduction of equation (33.31), in
which only the general Hermitian property for operators is used,
that the result is of a general nature.
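
Because the result is general, it can be tried out on a finite matrix instead of the hydrogen molecule. The sketch below is an illustration under stated assumptions, not the calculation of the text: it uses NumPy, a hypothetical diagonal matrix `H0` with nondegenerate levels in place of $\hat{\mathcal{H}}_0$, a random real symmetric matrix `H1` in place of $\hat{\mathcal{H}}_1$, and a small parameter `lam` that scales the perturbation. The zero-approximation vector is deliberately left unnormalized, so that the denominator of (33.31) is needed.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
H0 = np.diag(np.arange(1.0, n + 1))   # unperturbed levels 1, 2, ..., 6
A = rng.normal(size=(n, n))
H1 = (A + A.T) / 2                     # real symmetric perturbation

E0 = H0[0, 0]                          # zero-approximation energy
psi0 = np.zeros(n)
psi0[0] = 3.0                          # eigenvector of H0, not normalized

def first_order(lam):
    """E0 + E1, with E1 taken from the quotient form (33.31)."""
    E1 = lam * (psi0 @ H1 @ psi0) / (psi0 @ psi0)
    return E0 + E1

def exact(lam):
    """Lowest eigenvalue of H0 + lam * H1 by direct diagonalization."""
    return np.linalg.eigvalsh(H0 + lam * H1)[0]

lams = (1e-2, 1e-3)
errors = [abs(exact(lam) - first_order(lam)) for lam in lams]
for lam, err in zip(lams, errors):
    print(lam, err)
```

When the perturbation is reduced tenfold, the discrepancy between the exact level and the first-order estimate should fall roughly a hundredfold, which is what one expects if the neglected terms are of the second order of magnitude.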

Bound state of hydrogen molecule. From (33.21) or (33.22), we can
substitute either a symmetric or an antisymmetric coordinate function
in the formula for the energy of a hydrogen molecule. Evaluating
the integral shows that $\mathcal{E}(r_{ab})$ has a minimum for the symmetric
form of the spatial wave function only. The depth of this minimum
corresponds approximately to the binding energy of a hydrogen
molecule. We cannot expect very good agreement with experiment
here, since the approximation used was more qualitative than quantitative
in nature.

In calculating the energy, the following integral is of essential
importance:

$\int \psi(r_{a1})\,\psi(r_{b2})\,\hat{\mathcal{H}}_1\,\psi(r_{a2})\,\psi(r_{b1})\,dV_1\,dV_2$, (33.32)

and is termed the exchange integral. It cannot be correlated with
any classical quantity because it involves only probability amplitudes
and not densities.

We note that the antisymmetric spatial function (33.22) possesses
a nodal surface between the nuclei because it changes sign when
the places of the nuclei a and b are exchanged. A symmetric function
does not have nodes and therefore corresponds to a smaller
total energy, i.e., a more stable state. This function is multiplied
by an antisymmetric spin function so that a stable molecular state
has a zero resultant spin. Thus, the homopolar bond of two hydrogen
atoms forming a molecule is related to a “saturation,” as it were,
of spins. There is no longer a stable equilibrium position for a third
hydrogen atom near a hydrogen molecule.

The tendency towards spin saturation of electron pairs is pronounced
in homopolar bonds.


Exercises

1) Find the possible states of a system of two d-electrons with the same
principal quantum numbers.

Each electron can occur in ten states (orbital projection, spin projection):

A: 2, 1/2;   B: 1, 1/2;   C: 0, 1/2;   D: -1, 1/2;   E: -2, 1/2;
F: 2, -1/2;  G: 1, -1/2;  H: 0, -1/2;  I: -1, -1/2;  J: -2, -1/2.

The states with non-negative projections of spin and orbital moment are

AB: 3,1; AC: 2,1; AD: 1,1; AE: 0,1; AF: 4,0; AG: 3,0; AH: 2,0;
AI: 1,0; AJ: 0,0; BC: 1,1; BD: 0,1; BF: 3,0; BG: 2,0; BH: 1,0; BI: 0,0;
CF: 2,0; CG: 1,0; CH: 0,0; DF: 1,0; DG: 0,0; EF: 0,0.

Choosing the states with maximum angular-momentum projections, we
obtain three resultant states with zero spin:

$^1S$, $^1D$, $^1G$ or $^1S_0$, $^1D_2$, $^1G_4$,

and two states with unity spin:

$^3P$, $^3F$ or $^3P_0$, $^3P_1$, $^3P_2$ and $^3F_2$, $^3F_3$, $^3F_4$.

2) Show that, in a system of four p-electrons with the same principal numbers,
the states are the same as in a system of two p-electrons; in other words,
that two electrons have the same states as two “holes.”
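
The counting used in exercise 1 can be mechanized. The following sketch is not from the text; it uses only the Python standard library, and the function name `terms` and its return format (pairs of multiplicity $2S+1$ and orbital moment $L$) are choices made here for illustration. It enumerates all ways of placing `n` electrons in the $2(2l+1)$ one-electron states allowed by the Pauli principle, tabulates the total projections, and then repeatedly picks out the state with the maximum projections, as in the solution above.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

def terms(l, n):
    """Terms (2S+1, L) of n equivalent electrons with orbital moment l."""
    half = Fraction(1, 2)
    one_electron = [(m, s) for m in range(-l, l + 1) for s in (half, -half)]

    # Pauli principle: each one-electron state is used at most once,
    # so the n-electron states are combinations without repetition.
    table = Counter()
    for combo in combinations(one_electron, n):
        ML = sum(m for m, _ in combo)
        MS = sum(s for _, s in combo)
        table[(ML, MS)] += 1

    found = []
    while any(count > 0 for count in table.values()):
        # Largest orbital projection left, then largest spin projection.
        L = max(ML for (ML, MS), c in table.items() if c > 0)
        S = max(MS for (ML, MS), c in table.items() if c > 0 and ML == L)
        found.append((int(2 * S + 1), L))
        # Remove the (2L+1)(2S+1) states belonging to this term.
        for ML in range(-L, L + 1):
            MS = -S
            while MS <= S:
                table[(ML, MS)] -= 1
                MS += 1
    return sorted(found)

print(terms(2, 2))               # two d-electrons
print(terms(1, 2), terms(1, 4))  # two p-electrons and four p-electrons
```

With `L` coded as 0, 2, 4 for S, D, G and 1, 3 for P, F, the first call reproduces the five terms of exercise 1, and the second line gives identical lists for two and four p-electrons, as asserted in exercise 2.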


Sec. 34. The Quantum Theory of Radiation

In this section we shall find the probability of an excited atom
emitting a light quantum in unit time, and we shall compare the
probabilities of such radiation transitions as correspond to various
changes in the atomic states. But first we must deduce a general
formula for the probability of quantum transitions (this formula
will also be used in Sec. 37).

Transitions between states with the same energy. Let us suppose 
that a system has two states corresponding to the same energy but 
different in some other respect. For example, in this section we 




will consider an excited atom having energy excess $\mathcal{E}_1 - \mathcal{E}_0$ above
the ground state. This atom is capable of emitting a light quantum
with energy $h\omega = \mathcal{E}_1 - \mathcal{E}_0$, in which case the atom will go to the
ground state. Strictly speaking, there is only one state consisting
of an atom and electromagnetic field with a certain energy $\mathcal{E}_1$ (if
there are no other quanta in the field). The energy of such a system
is rigorously determined, though the state is not defined in more
detail.

Also possible is the following approach to the problem. Let an
atom at the initial instant of time be in an excited state but capable
of the spontaneous emission of a quantum. Then the energy of the
atom is no longer specified with full rigour, but lies in some narrow
interval $\Delta\mathcal{E}$, where $\Delta\mathcal{E} \sim \frac{2\pi h}{\Delta t}$; $\Delta t$ is the mean lifetime of the
atom in the excited state before radiation [see (28.15)]. If the mean
lifetime of the atom in the excited state $\Delta t$ is such that $\frac{2\pi h}{\Delta t}$ is
considerably less than the energy level spacing of the atom, then the
energy uncertainty can be neglected to a first approximation,
assuming that the atom initially occurred in a state with an accurate
energy value $\mathcal{E}_1$; it is also necessary to calculate the probability
that, in a certain interval of time t, the atom will go to the ground
state, and a quantum with energy $h\omega = \mathcal{E}_1 - \mathcal{E}_0$ will appear in the
electromagnetic field.

The reason for the transition is interaction with the electromagnetic
field. Here the lifetime of the atom in the excited state is so great
that $\frac{2\pi h}{\Delta t} \ll \mathcal{E}_1 - \mathcal{E}_0$. For this reason, the interaction of the atom
with the electromagnetic field can be interpreted as a small perturbation
superimposed on the excited atom with energy $\mathcal{E}_1$.

The same type of problem concerning the transition probability
due to a perturbation can also be formulated for other transitions.
For example, if the total excitation energy of an atom is greater
than its ionization energy, it is possible for an electron to be emitted
from the atom without radiation. In this case the excited state of
the atom and the ion + electron state belong to the same energy.
Each of them separately does not have a strictly defined energy.

Transition probability. A radiation transition with the emission
of a quantum is caused by interaction between an atom and an
electromagnetic field. We shall suppose for the time being that this
interaction is “switched off”; then the energy of the atom and field
separately becomes an exact integral of motion. We shall call its
eigenvalue in the initial state $\mathcal{E}_1$. Then, if the interaction is “turned
on,” a finite probability exists of the system making a transition
to some state which, energetically, is very close to but otherwise
very different from the initial state; for example, the atom was
excited in the initial state and there were no quanta in the field,




while in the final state the atom went to the ground state and a 
quantum appeared in the field. 

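Let us divide the Hamiltonian of the system into two terms:
$\hat{\mathcal{H}} = \hat{\mathcal{H}}^{(0)} + \hat{\mathcal{H}}^{(1)}$, where $\hat{\mathcal{H}}^{(0)}$ corresponds to the separated atom and
field while $\hat{\mathcal{H}}^{(1)}$ describes the interaction. We then deduce a general
formula for the transition probability, and apply it to radiation.
We shall therefore call $\hat{\mathcal{H}}^{(0)}$ the Hamiltonian of the unperturbed
system, and regard $\hat{\mathcal{H}}^{(1)}$ as a small perturbation causing the transition.
The eigenfunctions and eigenvalues of the operator $\hat{\mathcal{H}}^{(0)}$ are determined
from the equation

$-\frac{h}{i}\frac{\partial \psi_n^{(0)}}{\partial t} = \hat{\mathcal{H}}^{(0)}\psi_n^{(0)} = \mathcal{E}_n\,\psi_n^{(0)}$. (34.1)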

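Allowing for perturbation, the wave function satisfies the equation

$-\frac{h}{i}\frac{\partial \psi}{\partial t} = (\hat{\mathcal{H}}^{(0)} + \hat{\mathcal{H}}^{(1)})\,\psi$. (34.2)

Considering that $\hat{\mathcal{H}}^{(1)}$ is a small perturbation, we represent the wave
function in the form

$\psi = \psi^{(0)} + \psi^{(1)}$; (34.3)

the “product” $\hat{\mathcal{H}}^{(1)}\psi^{(1)}$ will be neglected as being of the second order.
Then, for $\psi^{(1)}$ we obtain the nonhomogeneous equation

$-\frac{h}{i}\frac{\partial \psi^{(1)}}{\partial t} - \hat{\mathcal{H}}^{(0)}\psi^{(1)} = \hat{\mathcal{H}}^{(1)}\psi^{(0)}$. (34.4)

We shall look for $\psi^{(1)}$ in the form of an eigenfunction expansion
of the operator $\hat{\mathcal{H}}^{(0)}$:

$\psi^{(1)} = \sum_m c_m(t)\,\psi_m^{(0)}$, (34.5)

each of the functions $\psi_m^{(0)}$ satisfying the homogeneous equation (34.1).
Substituting the series (34.5) in the nonhomogeneous equation and
using the indicated property of the functions $\psi_m^{(0)}$, we arrive at the
following equality:

$-\frac{h}{i}\sum_m \frac{dc_m}{dt}\,\psi_m^{(0)} = \hat{\mathcal{H}}^{(1)}\psi^{(0)}$. (34.6)

The coefficients $c_m$ can be determined therefrom by taking advantage
of the orthogonality property of the eigenfunctions (30.6). For
this it is necessary to multiply both sides of (34.6) by $\psi_n^{(0)*}$ and integrate
over a volume. Then only the term $-\frac{h}{i}\frac{dc_n}{dt}$ remains on the left,
while on the right a certain integral is obtained which is characteristic
of the perturbation method set out here:

$-\frac{h}{i}\frac{dc_n}{dt} = \int \psi_n^{(0)*}\,\hat{\mathcal{H}}^{(1)}\,\psi^{(0)}\,dV$. (34.7)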


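In order to integrate this equation we must determine the time
dependence of the right-hand side. It involves wave functions [that
satisfy equation (34.1)] together with their time factor in the form
(24.21). It is assumed that the operator $\hat{\mathcal{H}}^{(1)}$ does not depend explicitly
on time. Then

$\int \psi_n^{(0)*}\,\hat{\mathcal{H}}^{(1)}\,\psi^{(0)}\,dV = e^{\frac{i(\mathcal{E}_n - \mathcal{E}_1)t}{h}} \left[\int \psi_n^{(0)*}\,\hat{\mathcal{H}}^{(1)}\,\psi^{(0)}\,dV\right]_{t=0}$, (34.8)

where the integral multiplied by the exponent does not depend
upon time. We have supposed that at the initial instant of time
the system was in a state with energy $\mathcal{E}_1$; in other words, that
$|c_1(0)| = 1$, $c_{n \neq 1}(0) = 0$. Therefore, equation (34.7) is integrated thus:

$-\frac{h}{i}\,c_n = \frac{e^{\frac{i(\mathcal{E}_n - \mathcal{E}_1)t}{h}} - 1}{\frac{i}{h}(\mathcal{E}_n - \mathcal{E}_1)} \left[\int \psi_n^{(0)*}\,\hat{\mathcal{H}}^{(1)}\,\psi^{(0)}\,dV\right]_{t=0}$, (34.9)

or, once again introducing the exponential factor under the integral
sign, i.e., reverting to the functions $\psi^{(0)}$, we obtain

$c_n(t) = \frac{1 - e^{\frac{i(\mathcal{E}_1 - \mathcal{E}_n)t}{h}}}{\mathcal{E}_1 - \mathcal{E}_n} \int \psi_n^{(0)*}\,\hat{\mathcal{H}}^{(1)}\,\psi^{(0)}\,dV$. (34.10)

Consequently, the probability that at the instant of time t the
system will be in a state with wave function $\psi_n^{(0)}$ is, from (30.13),
equal to

$|c_n(t)|^2 = \frac{2 - 2\cos\frac{(\mathcal{E}_1 - \mathcal{E}_n)t}{h}}{(\mathcal{E}_1 - \mathcal{E}_n)^2} \left|\int \psi_n^{(0)*}\,\hat{\mathcal{H}}^{(1)}\,\psi^{(0)}\,dV\right|^2$. (34.11)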

Matrix elements. We shall be concerned with the expression (34.11)
somewhat later. First of all let us introduce a system of notation
which, in general, is very convenient in quantum mechanics. The
integral (34.7) for any pair of eigenfunctions $\psi_n$, $\psi_k$ and for an arbitrary
operator $\hat{X}$ is denoted thus:

$X_{nk} = \int \psi_n^*\,\hat{X}\,\psi_k\,dV$, (34.12)



where the integration is performed over all the independent variables
involved in the Hamiltonian $\hat{\mathcal{H}}$.

The quantities $X_{nk}$ form a square table so that the index n will
designate the rows, and k the columns:

$\begin{pmatrix}
X_{11} & X_{12} & X_{13} & \dots & X_{1k} & \dots \\
X_{21} & X_{22} & X_{23} & \dots & X_{2k} & \dots \\
X_{31} & X_{32} & X_{33} & \dots & X_{3k} & \dots \\
\dots & \dots & \dots & \dots & \dots & \dots \\
X_{n1} & X_{n2} & X_{n3} & \dots & X_{nk} & \dots \\
\dots & \dots & \dots & \dots & \dots & \dots
\end{pmatrix}$. (34.13)

Such a table is termed a matrix in mathematics, while the separate
quantities $X_{nk}$ are called matrix elements. The right-hand side of
(34.7) contains the matrix element $\mathcal{H}^{(1)}_{n1}$.

We note an important property of the matrix elements of Hermitian
operators. In accordance with the Hermitian condition (30.3)

$\int \psi_n^*\,\hat{X}\,\psi_k\,dV = \left(\int \psi_k^*\,\hat{X}\,\psi_n\,dV\right)^*$, (34.14)

where the conjugate sign* on the right refers to the whole integral.

Proceeding now from the definition (34.12), we write:

$X_{nk} = X_{kn}^*$. (34.15)


A matrix whose elements satisfy equation (34.15) is termed Hermitian. 

The relationship between matrix elements of different quantities.
Let us take the matrix elements of both sides of the operator equalities
(30.35) and (30.37), and put the time derivative before the integral:

$p_{nk} = m\,\frac{dx_{nk}}{dt}$, (34.16)

$\frac{dp_{nk}}{dt} = -\left(\frac{\partial U}{\partial x}\right)_{nk}$. (34.17)


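The time dependence of the matrix elements was found in equation
(34.8), namely

$x_{nk}(t) = e^{\frac{i(\mathcal{E}_n - \mathcal{E}_k)t}{h}}\,x_{nk}(0)$, $p_{nk}(t) = e^{\frac{i(\mathcal{E}_n - \mathcal{E}_k)t}{h}}\,p_{nk}(0)$. (34.18)

Therefore,

$\frac{dx_{nk}}{dt} = \frac{i(\mathcal{E}_n - \mathcal{E}_k)}{h}\,x_{nk}$, $\frac{dp_{nk}}{dt} = \frac{i(\mathcal{E}_n - \mathcal{E}_k)}{h}\,p_{nk}$. (34.19)

Thus, matrix elements depend harmonically upon time.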

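The energy difference $\mathcal{E}_n - \mathcal{E}_k$ can be conveniently represented
by means of the Bohr frequency condition (see beginning of Sec. 23):

$\omega_{nk} = \frac{\mathcal{E}_n - \mathcal{E}_k}{h}$. (34.20)

Therefore, the operator relationships (34.16) and (34.17), rewritten
for matrix elements, appear thus:

$p_{nk} = i m \omega_{nk}\,x_{nk}$, (34.21)

$i \omega_{nk}\,p_{nk} = -\left(\frac{\partial U}{\partial x}\right)_{nk}$. (34.22)

The matrix form of the equations of quantum mechanics was
found by Heisenberg.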
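
Relation (34.21) can be checked on a system whose matrices are known in closed form. The sketch below is an illustration, not part of the text: it assumes NumPy, takes the linear harmonic oscillator of Sec. 26 as the example, sets $h = m = \omega = 1$ for convenience, and builds the coordinate and momentum matrices from the standard lowering matrix `a` truncated to the first `N` levels. These representations are the usual textbook ones and are an assumption here rather than formulas quoted from this book.

```python
import numpy as np

h, m, omega = 1.0, 1.0, 1.0          # units chosen for illustration
N = 8                                # number of oscillator levels kept
n = np.arange(N)

# Lowering matrix: a[n-1, n] = sqrt(n); rows are n, columns are k.
a = np.diag(np.sqrt(n[1:]), k=1)

x = np.sqrt(h / (2 * m * omega)) * (a + a.T)
p = 1j * np.sqrt(m * h * omega / 2) * (a.T - a)

E = h * omega * (n + 0.5)                    # oscillator energy levels
omega_nk = (E[:, None] - E[None, :]) / h     # Bohr frequencies (34.20)

rhs = 1j * m * omega_nk * x                  # right-hand side of (34.21)

print(np.allclose(p, rhs))                   # element-by-element comparison
print(np.allclose(x, x.conj().T), np.allclose(p, p.conj().T))
```

The comparison is made element by element, so truncating the matrices does not affect it. The last line checks the Hermitian property (34.15) for both matrices.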

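The probability for transition to a continuous spectrum. Let us now
investigate the expression for the transition probability (34.11),
rewriting it in matrix notation:

$w_n(t) = 4 \sin^2\frac{(\mathcal{E}_1 - \mathcal{E}_n)\,t}{2h} \cdot \frac{|\mathcal{H}^{(1)}_{n1}|^2}{(\mathcal{E}_1 - \mathcal{E}_n)^2}$. (34.23)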


In the examples dealing with radiation and ionization that we
have mentioned, the final state of the system belonged to a continuous
energy spectrum. Indeed, the energy spectrum of an electromagnetic
field is continuous since the field can contain quanta of any frequency
$\omega$. In the ionization example, the spectrum of the electron emitted
from the atom was continuous because the motion of the electron
was infinite.

If the state n belongs to a continuous spectrum, it is more interesting
to determine the total probability of transition to any of the states
with energy $\mathcal{E}_n$, i.e., to find the integral of (34.23) with respect to
$d\mathcal{E}_n$. Now it is not advisable to state the final energy $\mathcal{E}_n$ since it
varies continuously; it is better simply to write $\mathcal{E}$. Then there are
$dN(\mathcal{E})$ states contained within the energy interval between $\mathcal{E}$ and
$\mathcal{E} + d\mathcal{E}$. The example of $dN(\mathcal{E})$ was given in Sec. 25, equation (25.25).
(The transition to a continuous spectrum in Sec. 25 is achieved
simply by means of an infinite increase in the box dimensions, which
are not contained in any physical result. The distance between neighbouring
levels is then infinitely reduced.)

So let

$dN(\mathcal{E}) = z(\mathcal{E})\,d\mathcal{E}$. (34.24)

The total probability of transition to the continuous spectrum is

$W = \int w(\mathcal{E}_1, \mathcal{E})\,dN(\mathcal{E}) = \int 4 \sin^2\frac{(\mathcal{E}_1 - \mathcal{E})\,t}{2h} \cdot \frac{|\mathcal{H}^{(1)}(\mathcal{E}_1, \mathcal{E})|^2\,z(\mathcal{E})\,d\mathcal{E}}{(\mathcal{E}_1 - \mathcal{E})^2}$. (34.25)




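For clarity in notation we shall put the indices $\mathcal{E}_1$, $\mathcal{E}$ of $\hat{\mathcal{H}}^{(1)}$ in
brackets and not as subscripts, treating them as the arguments
of the function, which in fact they are.

We shall denote the argument of the sine by the letter $\xi$:

$\frac{(\mathcal{E} - \mathcal{E}_1)\,t}{2h} = \xi$.

Passing to the integration variable $\xi$ we obtain

$W = \frac{2t}{h} \int \frac{\sin^2 \xi}{\xi^2}\,\left|\mathcal{H}^{(1)}\!\left(\mathcal{E}_1, \mathcal{E}_1 + \frac{2h\xi}{t}\right)\right|^2 z\!\left(\mathcal{E}_1 + \frac{2h\xi}{t}\right) d\xi$. (34.26)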


The function $\frac{\sin^2 \xi}{\xi^2}$ has a principal maximum for $\xi = 0$. Its next
maximum is already twenty times smaller. For this reason, in the
integral (34.26), the main part is played by the values of $\xi$ of the
order unity. But then the instant of time t can always be chosen
so that $\frac{2h\xi}{t}$ is considerably less than $\mathcal{E}_1$. In other words, it is
permissible, in the arguments of the functions $\mathcal{H}^{(1)}(\mathcal{E}_1, \mathcal{E})$ and $z(\mathcal{E})$,
to replace $\mathcal{E}$ simply by $\mathcal{E}_1$ and to take out the functions
$|\mathcal{H}^{(1)}(\mathcal{E}_1, \mathcal{E} = \mathcal{E}_1)|^2$ and $z(\mathcal{E} = \mathcal{E}_1)$ from under the integral sign. It is shown
thereby that if the time t is sufficiently long, the energies of the initial
and final states $\mathcal{E}_1$ and $\mathcal{E}$ are defined so accurately that they can be
considered simply equal to one another, in accordance with the law
of conservation of energy in the transition. Naturally, the law of
conservation of energy holds always, but, for sufficiently small
values of t, it is impossible to determine the energy of the final state,
for the uncertainty relation (28.15) for the given case is of the form
$(\mathcal{E} - \mathcal{E}_1)\,t \sim 2\pi h$. Hence, if t tends to infinity, the precise equality
$\mathcal{E} = \mathcal{E}_1$ is obtained.

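Since the function $\frac{\sin^2 \xi}{\xi^2}$ decreases rapidly with increasing $\xi$, the
integration should be extended from $-\infty$ to $\infty$. Since the remaining
values have been taken out from under the integral sign, the integral
itself can be evaluated. It is

$\int_{-\infty}^{\infty} \frac{\sin^2 \xi}{\xi^2}\,d\xi = \pi$. (34.27)

From this

$W = \frac{2\pi}{h}\,|\mathcal{H}^{(1)}(\mathcal{E}_1, \mathcal{E} = \mathcal{E}_1)|^2\,z(\mathcal{E}_1) \cdot t$. (34.28)

Then the transition probability in unit time is

$\frac{dW}{dt} = \frac{2\pi}{h}\,|\mathcal{H}^{(1)}(\mathcal{E}_1, \mathcal{E} = \mathcal{E}_1)|^2\,z(\mathcal{E}_1)$. (34.29)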



We write the second argument $\mathcal{E} = \mathcal{E}_1$ to emphasize that the state
$\psi(\mathcal{E})$ coincides with $\psi(\mathcal{E}_1)$ only with respect to energy. The formula
(34.29) has very many applications.
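
The passage from (34.25) to (34.28) can be checked by carrying out the integral (34.25) numerically. The sketch below is only an illustration with assumed inputs: it uses NumPy, sets $h = 1$, takes $|\mathcal{H}^{(1)}(\mathcal{E}_1, \mathcal{E})|^2$ and $z(\mathcal{E})$ to be constants (`V2` and `z`, arbitrary small values chosen here), and replaces the infinite range by a wide finite interval. With constant factors the only approximation left is the truncation of the tails of $\sin^2\xi/\xi^2$.

```python
import numpy as np

h = 1.0
E1 = 0.0        # energy of the initial state
V2 = 1e-5       # |H1(E1, E)|^2, assumed independent of E
z = 50.0        # density of final states z(E), assumed constant

def total_probability(t, half_width=200.0, points=400_001):
    """Riemann-sum estimate of the integral (34.25) at time t."""
    E = np.linspace(E1 - half_width, E1 + half_width, points)
    dE = E[1] - E[0]
    xi = (E - E1) * t / (2 * h)
    # 4 sin^2(xi) / (E - E1)^2 = (t/h)^2 (sin(xi)/xi)^2; np.sinc(y) is
    # sin(pi y)/(pi y), so sin(xi)/xi = np.sinc(xi / pi), finite at xi = 0.
    w = (t / h) ** 2 * V2 * np.sinc(xi / np.pi) ** 2
    return np.sum(w * z) * dE

rate = 2 * np.pi / h * V2 * z     # right-hand side of (34.29)
for t in (5.0, 10.0, 20.0):
    print(t, total_probability(t) / t, rate)
```

For each of the three times the ratio $W/t$ should agree with the rate (34.29) to well within one per cent, i.e., the total probability grows linearly with time once the constant factors have been taken out from under the integral sign.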

The matrix element corresponding to the emission of a quantum.
With the aid of expression (34.29) it is possible to obtain rigorously an
expression for radiation intensity. This result is based on quantization
of the electromagnetic field performed in Sec. 27. We shall not give
the other, less rigorous, result based on the analogy between classical
equations and the equations for matrix elements.

In order to simplify subsequent computations, we shall, from the
start, take advantage of the law of conservation of energy for radiation.
In considering transitions of an atom from a state with energy $\mathcal{E}_1$
to a state with energy $\mathcal{E}_0$, we take only those quanta which satisfy
the energy conservation law in accordance with the Bohr frequency
condition $\omega = \frac{\mathcal{E}_1 - \mathcal{E}_0}{h}$. Further, we shall first of all consider quanta
with a definite direction of the wave vector $\mathbf{k}$, and a definite polarization
$\sigma$. In addition, we assume that there were no quanta of this
type in the field initially, i.e., in the initial state $N_{\mathbf{k},\sigma} = 0$.

In this case the perturbation energy operator is the product of two 
operators: 

(vA). (34.30) 

This expression is obtained from (15.32), if the term that is linear in $\mathbf{A}$
is retained in the equation, and if we put $\varphi = 0$.

The wave function of the atom and the field in a nonperturbed
state, i.e., with the interaction between them “switched off,” is
expressed as the product of the wave functions of the atom and the
field. The wave function of the field is represented as the product of
the wave functions of separate oscillators with different $\mathbf{k}$, $\sigma$. All
these functions are orthogonal and normalized. Therefore, in calculating
the matrix element of the quantity $\mathbf{A}_{\mathbf{k}}^{\sigma}$ or of the coordinate $Q_{\mathbf{k}}^{\sigma}$ of
the oscillator with given $\mathbf{k}$, $\sigma$, we must take the wave functions
corresponding to the given oscillator. In accordance with the
normalization condition, the integral over all the remaining coordinates
of the field gives unity.

In the vector potential we take the term relating to $\mathbf{k}$ and $\sigma$:


- ikr 


, (34.31) 


where we have used equations (27.20) and (27.21) that express the 
field amplitudes in terms of real variables. 

We must calculate the matrix element describing the transition
between two states, with no quantum described by wave vector $\mathbf{k}$
and polarization $\sigma$ in the first one, and with only one such quantum
in the second one. We shall write these states with subscripts 0 and 1.
Then, from (34.21),

(^k)oi “ *^10 (Qk)oi ~ (Qk)lU > (34.32) 

since $\omega_{10}$ is equal to the energy difference between the initial and
final states of the field divided by $h$, i.e., just equal to the frequency
of the emitted quantum $\omega_{\mathbf{k}}$. Substituting this into the expression for
the matrix element $(\mathbf{A}_{\mathbf{k}}^{\sigma})_{01}$ we find that

(A£)oi = 2]/'^“- <■£ , (34.33) 

since the coefficient of becomes zero. 

For simplicity we shall temporarily omit the indices $\mathbf{k}$ and $\sigma$.
We must evaluate the integral

$$Q_{01} = \int \varphi_0\,Q\,\varphi_1\,dQ. \tag{34.34}$$

Here, the field-oscillator wave functions are denoted by $\varphi_0$ and $\varphi_1$
in order not to confuse them with the atom wave functions.

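We know the functions $\varphi_0$ and $\varphi_1$ from Sec. 26. From equations
(26.22) and (26.23) we have

$$\varphi_0 = g_0\,e^{-\omega Q^2/2h}, \qquad \varphi_1 = g_1\,Q\,e^{-\omega Q^2/2h}. \tag{34.35}$$

The coefficients $g_0$ and $g_1$ are found from the normalization condition
(see exercise 1, Sec. 26):

$$g_0^2\int_{-\infty}^{\infty} e^{-\omega Q^2/h}\,dQ = g_0^2\sqrt{\frac{\pi h}{\omega}} = 1, \qquad g_0 = \left(\frac{\omega}{\pi h}\right)^{1/4};$$

$$g_1^2\int_{-\infty}^{\infty} Q^2 e^{-\omega Q^2/h}\,dQ = g_1^2\,\frac{1}{2}\sqrt{\frac{\pi h^3}{\omega^3}} = 1, \qquad g_1 = \left(\frac{4\omega^3}{\pi h^3}\right)^{1/4}.$$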

We note that the product $Q\varphi_0$ is proportional to $\varphi_1$. Hence, the
integral (34.34) would have vanished due to the orthogonality
condition, if any other function, $\varphi_2, \varphi_3, \ldots, \varphi_n$, had been substituted, in
place of $\varphi_1$, into the integral. Therefore, only one quantum can be
emitted with a given frequency, direction, and polarization. The
same can also be shown for any arbitrary initial state of the field.
Absorption of quanta also occurs singly. For $Q_{01}$ we have

$$Q_{01} = g_0 g_1 \int_{-\infty}^{\infty} Q^2 e^{-\omega Q^2/h}\,dQ = \sqrt{\frac{h}{2\omega}}. \tag{34.36}$$
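The value of this oscillator matrix element can be checked numerically. The sketch below is only an illustration: it assumes the ground and first excited oscillator functions in the form $g_0 e^{-\omega Q^2/2h}$ and $g_1 Q e^{-\omega Q^2/2h}$ with unit oscillator mass, and the function name, the integration window and the step count are choices made here, not part of the text.

```python
import math


def oscillator_q01(omega, h=1.0, half_width=12.0, steps=20000):
    """Trapezoidal estimate of Q_01 = integral of phi_0 * Q * phi_1 dQ.

    Assumes phi_0 = g0*exp(-omega*Q^2/(2h)) and phi_1 = g1*Q*exp(-omega*Q^2/(2h)),
    with g0, g1 fixed by normalization (unit oscillator mass).
    """
    g0 = (omega / (math.pi * h)) ** 0.25
    g1 = (4.0 * omega**3 / (math.pi * h**3)) ** 0.25
    scale = math.sqrt(h / omega)          # natural length of the oscillator
    start = -half_width * scale
    dq = 2.0 * half_width * scale / steps
    total = 0.0
    for i in range(steps + 1):
        q = start + i * dq
        weight = 0.5 if i in (0, steps) else 1.0   # trapezoid end weights
        gauss = math.exp(-omega * q * q / (2.0 * h))
        total += weight * (g0 * gauss) * q * (g1 * q * gauss) * dq
    return total


# Compare with the closed form sqrt(h / (2*omega)).
numeric = oscillator_q01(omega=2.0)
closed_form = math.sqrt(1.0 / (2.0 * 2.0))
```

Replacing the factor `g1 * q` by a higher oscillator function would make the same integral vanish, which is the orthogonality argument used for single-quantum emission.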



The dipole approximation. It is now necessary to calculate the 
matrix element given by two atomic states: 

= ijVo (VAoi) ^^dV. (34.37) 

Substituting from (34.33) and (34.36), we reduce this matrix element
to the form

)io = e iK V) ^xdV. (34.38) 

The wave function of the discrete spectrum of an atom differs from
zero in a region of about $10^{-8}$ cm, i.e., of the order of atomic dimensions.
The wavelength of visible light is about $0.6 \times 10^{-4}$ cm, i.e., several
thousand times greater. Therefore, we can consider that the phase
of a wave changes very little in the region of the atom, and we remove
the exponential factor from the integration, taking it at some mean
point (for example, the nucleus).

This corresponds to the dipole approximation defined in Sec. 19,
the wavelength being considerably greater than the dimensions of
the radiating system. The other condition for the applicability
of the dipole approximation is that the electron velocity must be
considerably less than the velocity of light, which is the case
in atoms of small and medium atomic weight.

To the dipole approximation we have 

(ff<i))i„ = dV . (34.39) 

From (34.18) the velocity matrix element is directly expressed 
in terms of the coordinate matrix element 

^10 — f <>^01 ^10 ~ fc>icr2o, (34.40) 

^ ^ 

because, according to the law of conservation of energy, $\omega_{10} = (\mathcal{E}_1 - \mathcal{E}_0)/h$
is equal to the frequency of the radiated light.

The square of the modulus of the matrix element is

$$|\mathcal{H}^{(1)}_{10}|^2 = e^2\,\frac{2\pi h\omega}{V}\,|\mathbf{e}_{\mathbf{k}}^{\sigma}\mathbf{r}_{10}|^2. \tag{34.41}$$

We shall now take into account the fact that an emitted quantum
can have two different polarizations. If we are not specially interested
in the probability of quantum emission with a given polarization, then
the probability must be summed over the polarizations, i.e., over $\sigma$.
To begin with, let us assume the vector $\mathbf{k}$ to be in the direction of the
z-axis. Then the unit vector $\mathbf{e}_{\mathbf{k}}^{\sigma}$ can have two directions: along the
x-axis and along the y-axis. Accordingly,

$$\sum_{\sigma}|\mathcal{H}^{(1)}_{10}|^2 = e^2\,\frac{2\pi h\omega}{V}\left(|x_{10}|^2 + |y_{10}|^2\right). \tag{34.42}$$


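Let us find the average of this expression over all possible directions
of quantum emission. It is then obvious that

$$\overline{|x_{10}|^2} = \overline{|y_{10}|^2} = \overline{|z_{10}|^2} = \frac{1}{3}|\mathbf{r}_{10}|^2. \tag{34.43}$$

By performing this averaging after summation with respect to $\sigma$,
we obtain

$$\overline{\sum_{\sigma}|\mathcal{H}^{(1)}_{10}|^2} = \frac{4\pi h\omega e^2}{3V}\,|\mathbf{r}_{10}|^2. \tag{34.44}$$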


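In order to find the probability of emission of a quantum in unit
time, we must multiply (34.44) by $\dfrac{2\pi}{h}\,z(\mathcal{E})$. $z(\mathcal{E})$ is found from
equation (25.24), in which, for quanta of frequency $\omega$ emitted in all
directions, we must put

$$z(\mathcal{E}) = \frac{dN(\omega)}{d\mathcal{E}} = \frac{V\omega^2}{2\pi^2 c^3 h}. \tag{34.45}$$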


Finally, from equation (34.29), we find the expression for the
probability to a dipole approximation:

$$\frac{dW}{dt} = \frac{4\omega^3}{3hc^3}\,e^2\,|\mathbf{r}_{10}|^2. \tag{34.46}$$


We can write the product $e\mathbf{r}_{10}$ as $\mathbf{d}_{10}$, i.e., as the dipole moment
matrix element.

The intensity of radiation is equal to the radiation probability in
unit time multiplied by the energy of the quantum:

$$\frac{d\mathcal{E}}{dt} = \frac{4\omega^4}{3c^3}\,|\mathbf{d}_{10}|^2. \tag{34.47}$$


This equation greatly resembles the classical formula (19.28). However,
we have the square of the modulus $|\mathbf{d}_{10}|^2$ in place of the
square of the second derivative of the dipole moment $\ddot{\mathbf{d}}^2$. The
correspondence between classical and quantum theory displayed here may be
demonstrated by means of the matrix equations (34.16) and (34.17)
too. Directly applied to electrodynamics, it leads to equation (34.47).
We have given a more rigorous deduction, based on the quantization
of electrodynamical equations, in order to illustrate the generality of
the methods of quantum theory.

Compared with the classical formula (19.28), the quantum expression
contains an extra factor of two (4/3 instead of 2/3). This is explained
in the following way. We represent a classical dipole moment, varying
harmonically, in the following manner:

$$\mathbf{d} = \mathbf{d}_1 e^{i\omega t} + \mathbf{d}_1^* e^{-i\omega t}, \qquad \ddot{\mathbf{d}} = -\omega^2\left(\mathbf{d}_1 e^{i\omega t} + \mathbf{d}_1^* e^{-i\omega t}\right). \tag{34.48}$$

The terms $\mathbf{d}_1 e^{i\omega t}$ and $\mathbf{d}_1^* e^{-i\omega t}$ depend upon time like the matrix
elements $\mathbf{d}_{10}$ and $\mathbf{d}_{01}$. Let us form the time average of $(\ddot{\mathbf{d}})^2$. The terms
involving $e^{2i\omega t}$ and $e^{-2i\omega t}$ drop out in averaging, and there remains

$$\overline{(\ddot{\mathbf{d}})^2} = \omega^4\,(2\,\mathbf{d}_1\mathbf{d}_1^*) = 2\omega^4|\mathbf{d}_1|^2.$$

But it is the quantum formula which corresponds to the mean
radiation intensity $h\omega\,\dfrac{dW}{dt}$, so that the factor of two is due to the
time averaging of the square of the dipole moment, given in the
form (34.48).
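The factor of two from time averaging can be checked with a few lines of arithmetic. The sketch below is illustrative only: it takes a single Cartesian component of the dipole moment, with a complex amplitude `d1` chosen arbitrarily here, and averages the square of its second time derivative over one period; the function name and sample count are assumptions of this example.

```python
import cmath
import math


def mean_square_second_derivative(d1, omega, samples=1000):
    """Average of (d''(t))^2 over one period for d = d1*e^{i w t} + conj(d1)*e^{-i w t}."""
    period = 2.0 * math.pi / omega
    total = 0.0
    for n in range(samples):
        t = period * n / samples
        d = d1 * cmath.exp(1j * omega * t) + d1.conjugate() * cmath.exp(-1j * omega * t)
        d_ddot = -omega**2 * d            # second derivative of a harmonic term
        total += d_ddot.real ** 2         # d is real; its imaginary part is rounding noise
    return total / samples


d1 = 0.3 + 0.4j                            # arbitrary amplitude, |d1|^2 = 0.25
omega = 2.0
average = mean_square_second_derivative(d1, omega)
expected = 2.0 * omega**4 * abs(d1) ** 2   # the 2*omega^4*|d1|^2 of the text
```

Multiplying `expected` by the classical coefficient 2/3 reproduces the coefficient 4/3 of the quantum expression, with `d1` playing the part of the matrix element.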

The expression (34.47) confirms what was said in Sec. 22 about an
atom being stable in quantum theory: radiation is always associated
with a transition of the atom from one state to another. But no
atomic state exists for which the energy is less than the energy of the
ground state; that is why the atom can exist in the ground state for
an indefinitely long time.

The selection rules for the magnetic quantum number. It follows
from equation (34.41) that, if $\mathbf{r}_{10} = 0$, the intensity is equal to zero,
at any rate to a dipole approximation. We shall now find the
conditions for which $\mathbf{r}_{10}$ differs from zero. First of all we notice that if
the vector defining the polarization direction is equal to $\mathbf{e}^{\sigma}$ then, to
a dipole approximation, radiation of such a quantum is possible
provided the matrix element for the projection of the electron radius
vector along the direction of quantum polarization differs from zero.
Let us assume that the quantum polarization is along the z-axis.
Let the magnetic quantum number of the electron be equal to $k$
before the transition, and $k'$ after the transition. Then the dependence
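of the wave function upon azimuth is given by the equations

$$\psi_1 = f_1(r, \vartheta)\,e^{ik\varphi}, \qquad \psi_0 = f_0(r, \vartheta)\,e^{ik'\varphi},$$

because $e^{ik\varphi}$ and $e^{ik'\varphi}$ are eigenfunctions of the angular-momentum
z-projection. Hence

$$z_{10} = \int f_0^*(r, \vartheta)\,r\cos\vartheta\,f_1(r, \vartheta)\,r^2\sin\vartheta\,dr\,d\vartheta \int_0^{2\pi} e^{i\varphi(k-k')}\,d\varphi;$$

$$\int_0^{2\pi} e^{i\varphi(k-k')}\,d\varphi = \begin{cases} \left.\dfrac{e^{i\varphi(k-k')}}{i(k-k')}\right|_0^{2\pi} = 0 & \text{for } k \neq k', \\[2ex] 2\pi & \text{for } k = k'. \end{cases}$$

$z = r\cos\vartheta$ in polar coordinates and, therefore, does not depend
upon $\varphi$. For this reason, the matrix element $z_{10}$ differs from zero
only when $k' = k$.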



Instead of considering plane-polarized radiation along x or y, let
us take circularly polarized radiation in the xy-plane. In such radiation
there is a constant phase shift between the x and the y components
equal to $\pm\pi/2$ (see Sec. 17, Fig. 25). Consequently, we must determine the
matrix elements of the quantities

$$(x + e^{\pm i\pi/2}\,y)_{10} = (x \pm iy)_{10} = (r\sin\vartheta\,e^{\pm i\varphi})_{10}.$$

Substituting the expressions for the wave functions involving $\varphi$
explicitly, we find

$$\int_0^{2\pi} e^{i\varphi(k - k' \pm 1)}\,d\varphi = 0, \quad \text{if } k' \neq k \pm 1. \tag{34.49}$$

Hence, radiation which is circularly polarized in the xy-plane can
only be emitted if the magnetic quantum number changes by $\pm 1$.

The rules that determine what change of quantum number governs
the emission of a given radiation are called selection rules.

The selection rules for dipole radiation with respect to the magnetic
quantum number forbid the changing of $k$ by more than unity.
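The azimuthal integrals behind these selection rules are easy to tabulate. The following sketch is an illustration under stated assumptions: the angular factor of the operator is written as $e^{i\varphi\cdot\text{shift}}$, with `shift = 0` for $z$ and `shift = +1` or `-1` for $x \pm iy$; the function names and the sample count are chosen here and do not come from the text.

```python
import cmath
import math


def azimuthal_integral(k, k_prime, shift, samples=64):
    """Integral over 0..2*pi of exp(i*phi*(k - k_prime + shift)) d(phi).

    A uniform sum over one full period is exact for integer exponents
    smaller in magnitude than `samples`.
    """
    m = k - k_prime + shift
    dphi = 2.0 * math.pi / samples
    return sum(cmath.exp(1j * m * n * dphi) for n in range(samples)) * dphi


def is_allowed(k, k_prime, shift, tol=1e-9):
    """True when the azimuthal factor of the dipole matrix element is nonzero."""
    return abs(azimuthal_integral(k, k_prime, shift)) > tol


# Tabulate which final values k' survive for an initial k = 1.
for shift, label in ((0, "z"), (1, "x + iy"), (-1, "x - iy")):
    survivors = [kp for kp in range(-3, 4) if is_allowed(1, kp, shift)]
    print(label, survivors)
```

For each operator only one final value of the magnetic quantum number survives, so the change of $k$ never exceeds unity.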

The selection rules for the azimuthal quantum number and parity.
The magnetic quantum number is the angular-momentum projection.
Since the angular-momentum projection does not change by more
than unity, the angular momentum itself (i.e., the azimuthal quantum
number) cannot change by more than unity.

But $l$ for a separate electron cannot remain unchanged, because
then the functions $\psi_1$ and $\psi_0$ must have the same parity. Here, the
product $\psi_0^*\,z\,\psi_1$ will turn out to be an odd function while its integral,
i.e., the matrix element $\int \psi_0^*\,z\,\psi_1\,dV$, will become identically zero.
In exactly the same way $\int \psi_0^*\,x\,\psi_1\,dV$ and $\int \psi_0^*\,y\,\psi_1\,dV$ will also become
zero. This is why, for a dipole transition of one electron in the atom,
the azimuthal quantum number changes by $\pm 1$.

Angular momentum and parity of a light quantum. As was indicated
in Sec. 13, an electromagnetic field possesses angular momentum. If
from equation (13.28) we determine the angular momentum of a
quantum emitted during dipole radiation, it comes out equal to unity.
And the state of the quantum is odd because it is determined by the
parity of the dipole-moment vector components $\mathbf{d}$, which, obviously,
change sign for the interchange $x \to -x$, $y \to -y$, $z \to -z$. Hence, the
selection rules for the azimuthal quantum number and parity of the
state of the atom must be interpreted as the conservation laws of
total angular momentum and total parity of the atom + quantum
system in radiation. Clearly, if the angular momentum of a quantum
is equal to unity, the angular momentum of the atom cannot change
by more than unity during radiation.

The selection rules for spin and total angular momentum. If the spin 
is in no way related to the orbital motion, the spin functions for the 
initial and final states must be the same, otherwise the transition 
dipole moment is equal to zero due to the orthogonality of spin
functions that correspond to different spin eigenvalues.

This selection rule is approximate in character and is valid for light
atoms. Taking into account the spin-orbital interaction, we must
consider the selection rules for the total angular momentum $j = l \pm \sigma$
[see (22.15)]. Since the angular momentum of a dipole quantum is
equal to unity, we obtain the condition for $j' - j$: $j' = j$ or $j' = j \pm 1$.
Here, the parity of the state must change. However, since the parity
is not directly related to $j$ but only to $l$, the transition $j' = j$ is also
possible. But the transition from $j = 0$ to $j' = 0$ is forbidden, because
in this transition the quantum cannot acquire the angular momentum.
It is necessary to note that the angular momenta for quanta of higher
multipole order than dipole can only be greater than unity, so that the
transition from $j = 0$ to $j' = 0$ is forbidden for all approximations, and
not only to the dipole approximation.

The selection rules for many-electron atoms. By considering a light
quantum as a particle with unit angular momentum, it is easy to
obtain the selection rules also for cases when the states of more than
one electron change. Neglecting the spin-orbital interaction, the
selection rules are the following: $S' = S$, $L' = L$ or $L' = L \pm 1$, and the
parity is reversed. The transition $L' = L$ is possible here, because, in
a many-electron system, parity is not related to total angular
momentum.

Magnetic dipole radiation. A system of charges may radiate as a 
magnetic dipole as well as an electric dipole. Magnetic dipole radiation 
is usually related to a change of spin projection $k_{\sigma}$. Since the spin of
an electron is one-half, the angular momentum of an atom changes 
by unity for a “flip” of the spin of an electron and for an unchanged 
orbital angular momentum. The moment of a magnetic dipole quantum 
is equal to unity just like the moment of an electric dipole quantum. 
But the parities of the electric and magnetic quanta are reversed. 
Indeed, the components of electric dipole moment change signs in an 
inversion of the coordinate system (31.35), while the magnetic-moment 
components do not change signs because the magnetic moment, like 
the angular momentum, is a pseudovector (see Sec. 16). 

As was pointed out in Sec. 19, the intensity of magnetic dipole
radiation is less than the intensity of electric dipole radiation, their
ratio being $(v/c)^2$, where $v$ is the charge velocity and $c$ is the velocity
of light. This ratio is about 10 -® for light elements. 




Quadrupole radiation. In Sec. 19 it was shown that radiation is
possible due to the change of quadrupole moment for the system.
Here, electric quadrupole quanta occurring in an even state are
radiated because the electric quadrupole moment is an even coordinate
function. The angular momentum of a quadrupole quantum is equal
to two.

Quadrupole radiation can occur when dipole radiation is forbidden
by the selection rules. From Sec. 19, quadrupole radiation is obtained
when taking into account the retardation inside the system. The order
of magnitude of this retardation is determined by the ratio of the
dimensions of the system to the wavelength of the emitted light.
Therefore, the probability of quadrupole radiation is less than the
probability of dipole radiation in the ratio $(r/\lambda)^2$, where $r$ is the size of the
system.

$\lambda \sim 0.5 \times 10^{-4}$ cm for visible light while the atomic dimensions are
$r \sim 0.5 \times 10^{-8}$ cm. Therefore, in order of magnitude, the probability
of a quadrupole transition is $10^8$ times less than the probability of a
dipole transition.

Metastable atoms. If an atom can go from an excited state to the
ground state only by means of a transition which is forbidden in dipole
radiation, it remains excited considerably longer than for a dipole
transition. If the transition is strongly forbidden, it may remain excited
for a very long time (even on the ordinary scale, and not the atomic scale).
Such an atom is termed metastable. Usually, in gases which are not
highly rarefied a metastable atom gives up its excitation energy to
other atoms in collisions and not by means of radiation. Radiation will
then not be observed. But in highly rarefied matter, for example in
the solar corona or in a gaseous nebula, the spectral lines due to the
de-excitation of metastable atoms are very bright. For example, in
the spectra of nebulae, there occurs an intense magnetic dipole line
of doubly-ionized oxygen atoms.

Nuclear isomerism. Transitions with very large $\Delta j$ (up to 5) are
observed in nuclei. For small excitation energies, of the order of several
tens of kilovolts, metastable nuclei have very large de-excitation
times (days or months). Such nuclei are called isomers with respect
to the basic unexcited state of the nucleus. The phenomenon of nuclear
isomerism in artificially radioactive nuclei was first discovered by
I. V. Kurchatov and L. I. Rusinov (in Br$^{80}$).

The totally forbidden transition. The transition from j = 0 to j' = 0, 
with an energy of 1,414 kev is observed in the RaC nucleus. Since the 
radiation in this case is completely forbidden, the nucleus simply 
ejects an electron from the inner atomic shell by means of an electro¬ 
static interaction; this may be explained as follows. 

If an internal nuclear rearrangement occurs, the charge distribution
inside the nucleus somehow changes. For a 0 → 0 transition, one spherically
symmetrical charge distribution is rearranged into another, which is
also symmetrical, but with a different radial dependence. Therefore,
in accordance with Gauss' theorem, only the electric field inside the
nucleus is changed. The field outside the nucleus cannot change; for
instance, it cannot radiate quanta. The wave functions of the s-states
of the electrons differ from zero in the nucleus. It follows that a change
of field inside the nucleus is capable of influencing an electron and
imparting to it an energy sufficient for ejection from the atom. In
accordance with the law of conservation of energy, the electron, upon
ejection from the atom, will have an energy equal to the energy
difference of the two spherically symmetrical states of the nucleus minus
the binding energy in the atom.

It may be stated generally that the ejection of electrons from an
atom shortens the lifetime of metastable isomeric nuclei, since it
makes transitions more probable.

Sec. 35. The Atom in a Constant External Field 

A classical analogue. In considering the behaviour of a system of 
charges situated in an external magnetic field, it is very convenient 
to proceed from the idea of the Larmor precession of magnetic moment 
around the field. The only component of the angular momentum con¬ 
served in such precession is that directed along the field, both trans¬ 
versal components averaged over the precessional motion being 
zero. 

The situation in quantum mechanics is analogous, with the differ¬ 
ence that the projections perpendicular to the field do not exist as 
physical quantities. In this way a simple correspondence is established 
between the integrals of classical and quantum mechanics. The angu¬ 
lar-momentum projection on the magnetic field is such a corresponding 
quantity; it can be called a quantum integral of motion. 
An external magnetic field superimposed on an atom perturbs its
state in a definite way. The Hamiltonian operator for such an atom
may be divided into the operator $\mathcal{H}^{(0)}$ for the unperturbed atom and
the perturbation operator $\mathcal{H}^{(1)}$ due to the magnetic field.

Addition of magnetic moments. Let us first of all write down the
operator $\mathcal{H}^{(1)}$ explicitly. It was shown in Sec. 32 that spin motion does
not produce the same magnetic moment as orbital motion, namely, the
magnetic moment for orbital motion is

$$\boldsymbol{\mu}_{orb} = \frac{e}{2mc}\,\mathbf{L}, \tag{35.1}$$

and the spin magnetic moment

$$\boldsymbol{\mu}_{sp} = \frac{e}{mc}\,\mathbf{S}. \tag{35.2}$$

Therefore the total magnetic moment is

$$\boldsymbol{\mu} = \boldsymbol{\mu}_{orb} + \boldsymbol{\mu}_{sp} = \frac{e}{2mc}\,(\mathbf{L} + 2\mathbf{S}). \tag{35.3}$$

Hence, the magnetic moments are not combined according to the
same law as mechanical moments:

$$\mathbf{J} = \mathbf{L} + \mathbf{S}. \tag{35.4}$$

Comparing (35.3) and (35.4), we see that the magnetic moment of an
atom is not proportional to its mechanical moment.

In accordance with (15.35), the perturbation energy caused by the
magnetic field is equal to (the moments are expressed in $h$ units)

$$\mathcal{H}^{(1)} = -(\boldsymbol{\mu}\mathbf{H}) = \frac{eh}{2mc}\,\mathbf{H}(\mathbf{L} + 2\mathbf{S}) = \mu_0\,\mathbf{H}(\mathbf{J} + \mathbf{S}). \tag{35.5}$$

Here $\mu_0 = \dfrac{eh}{2mc}$ is the Bohr magneton. The plus sign resulted
because the charge of the electron is $-e$. We note that the magnetic
energy in expression (15.35) was defined as a correction to the
Hamiltonian, i.e., to the energy expressed in terms of momenta. Therefore,
in quantum theory it is directly interpreted in terms of operators.

The vector model of the atom. To a first approximation, the energy
correction is equal to the mean value of the perturbing energy taken
over unperturbed motion [see (33.31)]. Therefore, we first find the
unperturbed state of the atom without the superimposition of a magnetic
field. Let us suppose that a normal coupling exists in the atom
between the total spin and the total orbital angular momentum
(see Sec. 33), i.e., all the orbital angular momenta of the electrons are
combined in one resultant orbital angular momentum L, and all the
spin angular momenta are combined in one resultant spin angular
momentum S. Examples of such orbital and spin angular-momentum
composition were given in Sec. 33 (in the text and in the exercises).
For example, in combining the angular momenta of two $np$-electrons,
the following states are obtained: $^1D$, $^3P$, and $^1S$. All these states are
formed in accordance with the Pauli principle, and possess spatial
wave functions of different forms. For this reason, in all three states,
the energy for purely electrostatic electron interactions differs by
magnitudes of the order of an atomic unit, i.e., by several
electron-volts.

Let us choose the ground state of these states. In accordance with
Hund's first rule, this is the $^3P$ state. We have not written the
subscript J here because it can have three values: J = 2, J = 1, and J = 0.
Accordingly, we have written 3 on the upper left. The states which
differ only in J, for identical L and S, are considerably closer to each
other than the three states with differing S or L listed above.

Let us estimate the order of magnitude for multiplet level splitting,
i.e., the spacing of levels with different J. A magnetic field of moment
$\mu$ is of the order $\mu/r^3$, so that the interaction energy of two moments is
$\sim \mu^2/r^3$. To evaluate the order of magnitude we put one Bohr magneton
in place of $\mu$, i.e., $10^{-20}$, and $r \sim 0.5 \times 10^{-8}$. This results in an
interaction energy of the order $10^{-15}$, i.e., thousandths of an electron-volt
(in practice, greater multiplet splitting is observed due to larger $\mu$
and smaller effective radius values). In any case, the levels $^3P_2$, $^3P_1$,
and $^3P_0$, which are comparatively different from the other two levels
$^1D$ and $^1S$, occur close to one another. The $^3P$ level is split into three
fine-structure levels which, in the given case, corresponds to the
superscript 3. If $L < S$, the number of components of the multiplet splitting
is determined by L.
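The order-of-magnitude estimate for the multiplet splitting can be reproduced directly. The sketch below assumes Gaussian (CGS) units, so that the moment is in erg/gauss, the distance in centimetres and the energy in ergs; the conversion constant and the function name are introduced here for illustration and are not part of the text.

```python
ERG_PER_ELECTRON_VOLT = 1.602e-12   # assumed conversion factor, erg per eV


def dipole_interaction_energy_erg(mu, r):
    """Order-of-magnitude interaction energy mu^2 / r^3 of two magnetic moments."""
    return mu**2 / r**3


# Rounded inputs quoted in the text: one Bohr magneton and an atomic radius.
energy_erg = dipole_interaction_energy_erg(mu=1e-20, r=0.5e-8)
energy_ev = energy_erg / ERG_PER_ELECTRON_VOLT
print(f"{energy_erg:.1e} erg = {energy_ev:.1e} eV")
```

With these rounded inputs the result lies somewhat below $10^{-15}$ erg, that is, below a thousandth of an electron-volt, which is several thousand times smaller than the electrostatic differences of a few electron-volts between terms.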

Each of the levels with a given J corresponds to a definite
configuration of the vectors L and S. In classical theory, we would say that
L and S are parallel in the state with J = 2, antiparallel for J = 0,
and perpendicular for J = 1. Of course, the latter is not at all
meaningful in quantum theory because only one angular-momentum
projection exists. The projection of S on L is equal to zero for J = 1,
and the other projections do not exist.

At the beginning of this section we indicated that, in the classical
analogy, those components which are not conserved are, in some way,
averaged over the Larmor precession of angular momenta and yield
zero. In this case, we are not concerned with precession in an external
magnetic field, but with that in the internal field of the magnetic
moments themselves. Since J is an exact integral of motion we can, in
a visual demonstration, consider that the direction of J is fixed in space,
while the triangle consisting of the vectors L, S, and J precesses about
J in space. In the cases for which J = 2 and J = 0 the triangle
degenerates to a straight line. Thus, to each of the multiplet levels there
corresponds a definite vector model given by L, S, and J. We note
that this refers to normal coupling.

An external magnetic field H causes the
vectors L, S to precess about its direction. It
is simplest here to consider the two opposing
limiting cases. We shall examine them.

A weak external field. Let the external field
be weak compared with the effective internal
field that the multiplet level splitting is due to.
Since the Larmor precession frequency is proportional
to the magnetic field, the triangle LSJ in
this case rotates about the side J considerably faster than the precession
about H. During the time of one rotation about H, the triangle can
rotate very many times about J. Therefore, the coupling of the vectors
L, S, and J in the triangle is not disrupted, as it were, due to the
internal magnetic forces forming the triangle being large compared
with the external magnetic force. We have shown this idea in
Fig. 44.

Fig. 44

Let us now find the correction due to the magnetic field. In
calculating the mean value of the perturbation energy from the unperturbed
motion, it is convenient to make use of the Larmor precession model.
In this case, two forms of precession must be considered: that of the
triangle LSJ about J and the precession of J about the magnetic field.

Equation (35.5) involves the vectors J and S. It is very simple to
average J: we must take its projection on the magnetic field, $J_z$. We
shall consider that the z-axis coincides with the direction of H. The
projection of S upon H is not meaningful because the vector S together
with the triangle LSJ rotates considerably more rapidly about J
than about H. The component of S perpendicular to J is averaged by
the precession in motion unperturbed by the external field. There
remains the projection parallel to J and equal to

$$\mathbf{S}_J = \mathbf{J}\,\frac{(\mathbf{SJ})}{J^2}. \tag{35.6}$$

Obviously, the projection of this vector upon H is equal to $J_z\,\dfrac{(\mathbf{SJ})}{J^2}$.
Thus, the mean value of $\mathcal{H}^{(1)}$ is proportional to $J_z$ and is equal to

$$\mathcal{E}^{(1)} = \overline{\mathcal{H}^{(1)}} = \mu_0 H J_z\left(1 + \frac{(\mathbf{SJ})}{J^2}\right). \tag{35.7}$$

It is now necessary to give a quantum meaning to the product
$(\mathbf{SJ})$. From the definition of J (35.4) we have

$$\mathbf{L} = \mathbf{J} - \mathbf{S}. \tag{35.8}$$

Squaring this equation, we get

$$L^2 = J^2 + S^2 - 2(\mathbf{SJ}). \tag{35.9}$$

Expressing the square of the angular momentum in $h$ units, in
accordance with (30.29), we have

$$L^2 = L(L+1).$$

Let us make similar substitutions for $J^2$ and $S^2$. Therefore

$$\frac{(\mathbf{SJ})}{J^2} = \frac{J(J+1) + S(S+1) - L(L+1)}{2J(J+1)}. \tag{35.10}$$

Substituting (35.10) in (35.7) we obtain finally

$$\mathcal{E}^{(1)} = \overline{\mathcal{H}^{(1)}} = \mu_0 H J_z\left[1 + \frac{J(J+1) + S(S+1) - L(L+1)}{2J(J+1)}\right]. \tag{35.11}$$

Thus, the fine-structure level with a given J is split into as many
levels as there are different projections of J on the magnetic field,
i.e., $2J+1$ levels. For given L and S, the following definite factor
corresponds to each value of J:

$$g = 1 + \frac{J(J+1) + S(S+1) - L(L+1)}{2J(J+1)}. \tag{35.12}$$

It is called the Landé factor.

For example, for $J = 1$, $S = L = 1$ we obtain

$$g = 1 + \frac{2 + 2 - 2}{2 \cdot 2} = \frac{3}{2}.$$

Analogously, for $J = 2$, $S = L = 1$

$$g = 1 + \frac{6 + 2 - 2}{2 \cdot 6} = \frac{3}{2}.$$

The level with $J = 0$ does not split.
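The Landé factor and the resulting weak-field level pattern can be tabulated with exact fractions. The sketch below is an illustration only: the function names are invented here, the shifts are returned in units of $\mu_0 H$, and the case $J = 0$ is handled separately because the formula for the factor is then undefined while the level simply does not split.

```python
from fractions import Fraction


def lande_factor(J, L, S):
    """g = 1 + [J(J+1) + S(S+1) - L(L+1)] / [2J(J+1)], defined for J > 0."""
    J, L, S = Fraction(J), Fraction(L), Fraction(S)
    if J == 0:
        raise ValueError("the Lande factor is undefined for J = 0")
    return 1 + (J * (J + 1) + S * (S + 1) - L * (L + 1)) / (2 * J * (J + 1))


def weak_field_shifts(J, L, S):
    """Energy shifts g*J_z in units of mu_0*H for J_z = -J, ..., J."""
    J = Fraction(J)
    if J == 0:
        return [Fraction(0)]              # a J = 0 level does not split
    g = lande_factor(J, L, S)
    count = int(2 * J) + 1                # number of projections, 2J + 1
    return [g * (-J + n) for n in range(count)]


# The 3P multiplet (L = 1, S = 1) used as the example in the text.
for J in (2, 1, 0):
    print(J, [str(shift) for shift in weak_field_shifts(J, 1, 1)])
```

Half-integer momenta can be passed as `Fraction(1, 2)`; for a single s-electron (`J = 1/2`, `L = 0`, `S = 1/2`) the same function returns the pure spin value 2.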

Splitting in a strong field. The representation of splitting set out
here corresponds to reality only as long as the magnetic field is so
weak that the spacing between the $2J+1$ levels $\mu_0 g H J_z$ in the
magnetic field is small compared with that between the unsplit
multiplet levels themselves with differing J. When the splitting in the
magnetic field is comparable with that of the multiplet itself, or is somewhat
greater, the pattern becomes more complicated, but in a strong field it
once again becomes very simple.

Therefore we shall consider the opposite extreme case, when the
external field is strong compared with the internal field, so that
the coupling between the vectors L, S, J in the triangle is disrupted.
The necessity for this disruption in a sufficiently strong magnetic
field can be explained by the fact that S precesses twice as fast as L.
Then, from the classical analogy, each of the vectors S and L precesses
independently about the magnetic field, so that the correction to
the energy is given by a different expression from (35.11):

$$\mathcal{E}^{(1)} = \overline{\mathcal{H}^{(1)}} = \mu_0 H\,(L_z + 2S_z). \tag{35.13}$$

Here $L_z$ is the projection of the orbital angular momentum upon
the z-axis, and $S_z$ is the total-spin projection of the atom upon the
same axis (in $h$ units). Naturally, the total values of L and S are not
changed by the magnetic field, though the distribution of levels
in a strong magnetic field is not related to the multiplet structure,
as was the case in a weak field, but only to the possible projections
of L and S on the magnetic field. The vectors L and S precess about
the field far more rapidly than they precess about J without the
field. This is why the coupling in the triangle is disrupted.

The projections $S_z$ and $L_z$ are changed by unity, therefore all
the levels $\mathcal{E}^{(1)}$ in expression (35.13) are equidistant. Of course, certain
values of $\mathcal{E}^{(1)}$ may be repeated several times if the sum $L_z + 2S_z$
assumes the same value in several ways. For example, if $L = 1$, $S = 1$,
then we get the following range of values of the sum: 1 + 2 = 3,
0 + 2 = 2, 1 + 0 = 1, −1 + 2 = 1, 0 + 0 = 0, −1 + 0 = −1, 1 − 2 = −1,
0 − 2 = −2, −1 − 2 = −3; there are in all seven equidistant
values, and 1 and −1 are obtained in two ways (i.e., each of them
from the confluence of two levels), so that there are nine states in
all. We note that in a weak field the same multiplet split thus: J = 2
into 5 levels, J = 1 into 3 levels, and J = 0 did not split. As was to
be expected, the total number of different states in the strong and
weak fields is the same.
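The counting of strong-field levels can be repeated mechanically. The sketch below is an illustration restricted to integer L and S; the function names are chosen here, and the level positions are given as the values of $L_z + 2S_z$, i.e., in units of $\mu_0 H$.

```python
from collections import Counter


def strong_field_levels(L, S):
    """Map each value of L_z + 2*S_z to the number of states that produce it.

    Integer L and S only (an assumption of this sketch).
    """
    counts = Counter()
    for Lz in range(-L, L + 1):
        for Sz in range(-S, S + 1):
            counts[Lz + 2 * Sz] += 1
    return dict(sorted(counts.items()))


def weak_field_state_count(L, S):
    """Total number of states summed over J = |L - S|, ..., L + S."""
    return sum(2 * J + 1 for J in range(abs(L - S), L + S + 1))


levels = strong_field_levels(1, 1)
print(levels)                       # distinct equidistant levels and their multiplicities
print(sum(levels.values()), weak_field_state_count(1, 1))
```

For L = 1, S = 1 the dictionary has seven keys, the keys 1 and −1 each carry two states, and both totals are nine, in agreement with the enumeration in the text.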

The radiation spectrum for level splitting in a strong field. Lot us 
now see what .spectral lines appear when light is emitted from an 
atom situated in a magnetic field. To begin with, let us consider a 
strong field, because the pattern of the .splitting of spectral lines is 
simpler in this case than in that of a weak field. Both levels, upper 
and lower, resulting from two multiplets are split into a certain 
number of equidistant levels in accordance with formula (35.13). 

Let the radiation be observed in a direction perpendicular to the
magnetic field. The radiation polarization vector is perpendicular
to the direction of propagation, i.e., it is directed either along the
magnetic field or in a third perpendicular direction, say along the
x-axis (the magnetic field is along the z-axis). The selection rules
for radiation polarized along z and along x are different. For
polarization along the z-axis, the orbital magnetic quantum number k
must be conserved. $S_z$ is also conserved for any polarization as long
as the spin-orbit interaction is neglected. Therefore, all lines polarized
along the z-axis, i.e., along the magnetic field, have the same frequency,
which corresponds to the energy difference of the two initial levels
prior to splitting in the magnetic field: the correction (35.13)
is cancelled in calculating the difference $\mathcal{E} - \mathcal{E}'$.
A wave polarized along the x-axis can be represented as the sum of two
waves circularly polarized with opposite directions of polarization. The
selection rule for these lines is that k can change only by ±1.
Consequently, the radiation polarized along the x-axis has a frequency
that differs from the initial frequency by $\pm\frac{eH}{2mc}$. In observing
the spectral lines emitted perpendicularly to a strong magnetic field,
the original line is thus split into three lines separated by an interval
which is equal to the Larmor frequency for the given field.

If we drill a hole in the shoe of an electromagnet it is possible to
observe radiation propagated along the magnetic field. It is circularly
polarized in the xy-plane. The selection rules for right- and left-hand
circular polarization correspond to a change of k by ±1, so that
there will be observed two lines spaced from the centre by
$\pm\frac{eH}{2mc}$. Thus, when the field is switched on, the original
line will split into two lines separated by an interval equal to twice
the Larmor frequency.



Exactly the same picture is found in the classical oscillatory motion
of a charge situated in a magnetic field. This problem was considered
in exercise 6, Sec. 21.

The effect of the splitting of spectral lines in a magnetic field 
was discovered by Zeeman before the quantum theory of the atom 
appeared. Therefore, the then accepted theoretical explanation of 
the Zeeman effect corresponded to the classical problem, where it 
was considered that the charge performs an oscillatory motion. 

However, in observing spectra, this classical picture applies only 
in strong magnetic fields such that the splitting of lines obtained 
is considerably greater than the spacing between multiplet levels. 
Under these conditions the Zeeman effect is termed normal, because 
outwardly it corresponds to the theoretical ideas of the time at which 
it was discovered. It may be noticed that a field which is strong 
for one multiplet can still be weak for another. 

Spectral-line splitting in a weak magnetic field. The Zeeman effect 
in a weak magnetic field is termed anomalous. A spectral pattern 
is obtained which is entirely different from the classical. First of all, 
the number of splitting components can differ from the normal. 
The distances between them are also quite different. 

As an example let us consider the anomalous Zeeman effect in
the so-called D-line doublet of sodium. This line is double without
an external magnetic field. It corresponds to the two transitions
$^2P_{1/2} \to {}^2S_{1/2}$ and $^2P_{3/2} \to {}^2S_{1/2}$. The $^2P$
level has an orbital angular momentum 1 and spin 1/2. Therefore, the
resultant value of the total angular momentum J can be 1 + 1/2 = 3/2
and 1 - 1/2 = 1/2. This is where we get the fine doublet structure of
the $^2P$ level in the absence of an external field. The $^2S$ level
cannot split without a field because it has an orbital angular momentum
of zero. The double D-line in the sodium spectrum arises in the
transition from the doublet level to the single. According to our rough
estimate of the fine-structure splitting, the difference in frequency
between its components amounts to about one thousandth of the mean
frequency of the doublet. The $^2P_{1/2}$ level is lower than the
$^2P_{3/2}$ level.

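Let us now calculate the Landé factor for the three levels.

1) $^2P_{3/2}$: $J = 3/2$, $L = 1$, $S = 1/2$,

$$g = 1 + \frac{\frac{3}{2}\cdot\frac{5}{2} + \frac{1}{2}\cdot\frac{3}{2} - 1\cdot 2}{2\cdot\frac{3}{2}\cdot\frac{5}{2}} = \frac{4}{3}.$$

2) $^2P_{1/2}$: $J = 1/2$, $L = 1$, $S = 1/2$,

$$g = 1 + \frac{\frac{1}{2}\cdot\frac{3}{2} + \frac{1}{2}\cdot\frac{3}{2} - 1\cdot 2}{2\cdot\frac{1}{2}\cdot\frac{3}{2}} = \frac{2}{3}.$$

3) $^2S_{1/2}$: $J = 1/2$, $L = 0$, $S = 1/2$,

$$g = 1 + \frac{\frac{1}{2}\cdot\frac{3}{2} + \frac{1}{2}\cdot\frac{3}{2}}{2\cdot\frac{1}{2}\cdot\frac{3}{2}} = 2.$$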

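In accordance with equation (35.11), we have an expression for
the energy of the $^2P_{3/2}$ state in a magnetic field. For conciseness
we shall denote the quantity $\frac{ehH}{2mc}$ by the single letter β. Then

$$\mathcal{E}^{(1)} = \frac{4}{3}\,\beta J_z.$$

But $J_z$ takes on four values: 3/2, 1/2, -1/2, -3/2. Hence, in a field,
the $^2P_{3/2}$ level splits into four levels, whose energy differences
from the central, unperturbed, state are

$$\mathcal{E}^{(1)}(-3/2) = -2\beta, \quad \mathcal{E}^{(1)}(-1/2) = -\tfrac{2}{3}\beta, \quad \mathcal{E}^{(1)}(+1/2) = \tfrac{2}{3}\beta, \quad \mathcal{E}^{(1)}(+3/2) = 2\beta$$

respectively.

We obtain two energy values for the $^2P_{1/2}$ fine-structure level:

$$\mathcal{E}^{(1)}(-1/2) = -\tfrac{1}{3}\beta, \quad \mathcal{E}^{(1)}(+1/2) = \tfrac{1}{3}\beta.$$

And, finally, for the lower level $^2S_{1/2}$, we get

$$\mathcal{E}^{(1)}(-1/2) = -\beta, \quad \mathcal{E}^{(1)}(+1/2) = \beta.$$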


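Let us now find the spectral pattern. We start with the
$^2P_{1/2} \to {}^2S_{1/2}$ transitions. The oscillations polarized along
the field obey the selection rule $\Delta J_z = 0$. Hence, their
frequencies are shifted relative to the central position by

$$\mathcal{E}^{(1)}({}^2P_{1/2}, -1/2) - \mathcal{E}^{(1)}({}^2S_{1/2}, -1/2) = -\tfrac{1}{3}\beta + \beta = \tfrac{2}{3}\beta$$

and by

$$\mathcal{E}^{(1)}({}^2P_{1/2}, 1/2) - \mathcal{E}^{(1)}({}^2S_{1/2}, 1/2) = \tfrac{1}{3}\beta - \beta = -\tfrac{2}{3}\beta.$$

Unlike the normal Zeeman effect, a double line has been obtained
also for radiation polarized along the magnetic field.

For perpendicular polarizations we have

$$\mathcal{E}^{(1)}({}^2P_{1/2}, -1/2) - \mathcal{E}^{(1)}({}^2S_{1/2}, 1/2) = -\tfrac{1}{3}\beta - \beta = -\tfrac{4}{3}\beta \quad \text{(right-handed polarization)},$$

$$\mathcal{E}^{(1)}({}^2P_{1/2}, 1/2) - \mathcal{E}^{(1)}({}^2S_{1/2}, -1/2) = \tfrac{1}{3}\beta + \beta = \tfrac{4}{3}\beta \quad \text{(left-handed polarization)}.$$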



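Let us take the transition $^2P_{3/2} \to {}^2S_{1/2}$. If the oscillation
is polarized along the field we once again have, of course, two lines,
though with other spacing:

$$\mathcal{E}^{(1)}({}^2P_{3/2}, -1/2) - \mathcal{E}^{(1)}({}^2S_{1/2}, -1/2) = -\tfrac{2}{3}\beta + \beta = \tfrac{1}{3}\beta,$$

$$\mathcal{E}^{(1)}({}^2P_{3/2}, 1/2) - \mathcal{E}^{(1)}({}^2S_{1/2}, 1/2) = \tfrac{2}{3}\beta - \beta = -\tfrac{1}{3}\beta.$$

We have, for both circular polarizations:

$$\mathcal{E}^{(1)}({}^2P_{3/2}, -3/2) - \mathcal{E}^{(1)}({}^2S_{1/2}, -1/2) = -2\beta + \beta = -\beta,$$

$$\mathcal{E}^{(1)}({}^2P_{3/2}, -1/2) - \mathcal{E}^{(1)}({}^2S_{1/2}, 1/2) = -\tfrac{2}{3}\beta - \beta = -\tfrac{5}{3}\beta.$$

These are the results for right-handed polarization. The corresponding
splitting for left-handed polarization is $\beta$ and $\tfrac{5}{3}\beta$.

Thus, one component of the D-line is split into six Zeeman components,
and the other into four.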
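
The counting of Zeeman components can be sketched as a short enumeration. The code below assumes the Landé factors 4/3, 2/3 and 2 for the three levels, a first-order shift g·J_z in units of β, and the selection rules ΔJ_z = 0 (polarization along the field) and ΔJ_z = ±1 (circular polarizations); all names are illustrative.

```python
from fractions import Fraction as F


def sublevels(J, g):
    """Map each projection J_z to its energy shift in units of beta."""
    J, g = F(J), F(g)
    return {-J + i: g * (-J + i) for i in range(int(2 * J) + 1)}


def zeeman_lines(upper, lower):
    """Line shifts (units of beta) allowed by Delta J_z = 0 and Delta J_z = +-1."""
    lines = {"parallel": [], "circular": []}
    for m_up, e_up in upper.items():
        for m_low, e_low in lower.items():
            dm = m_up - m_low
            if dm == 0:
                lines["parallel"].append(e_up - e_low)
            elif abs(dm) == 1:
                lines["circular"].append(e_up - e_low)
    return {kind: sorted(shifts) for kind, shifts in lines.items()}


s_half = sublevels(F(1, 2), 2)
d1 = zeeman_lines(sublevels(F(1, 2), F(2, 3)), s_half)   # 2P1/2 -> 2S1/2
d2 = zeeman_lines(sublevels(F(3, 2), F(4, 3)), s_half)   # 2P3/2 -> 2S1/2

for name, lines in (("2P1/2 -> 2S1/2", d1), ("2P3/2 -> 2S1/2", d2)):
    total = len(lines["parallel"]) + len(lines["circular"])
    print(name, total, lines)
```

The enumeration gives four components for one line of the doublet and six for the other, with the shifts ±2/3, ±4/3 and ±1/3, ±1, ±5/3 in units of β.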


In the given case, the Zeeman effect remains anomalous as long
as β is negligibly small compared with one thousandth of a volt,
or the magnetic field is very much less than 5,000 CGSE units.

A diagram of the splitting is shown in Fig. 45.

The atom in an electric field (Stark effect). The multiplet levels
for a certain total angular momentum J split in an electric field, too.
We shall consider first of all the case of a weak field, when the level
shift caused by the field is small compared with natural multiplet
splitting.

First of all, we must bear in mind that the angular-momentum
projection on the electric field is determined only within the accuracy
of the sign, because the angular momentum is a pseudovector while
the electric field is a real vector. In reversing all the coordinate signs,
the electric-field components change sign while the angular-momentum
components do not change sign. But since the choice of right-handed
or left-handed coordinate system is arbitrary, the projections
of the angular momenta on the electric field are physically determined
only to the accuracy of the sign. If J is an integer, the number
of its projections which differ in absolute value is equal to J + 1
(0, 1, ..., J), while if J is a half-integral number then the total number
of projections is J + 1/2 (1/2, 3/2, ..., J). For example, if J = 1/2, there
is only one nonnegative projection. Therefore, the state with angular
momentum 1/2 is not split by an electric field, at any rate as long
as the coupling between L and S is not disrupted. For comparison
we note that the magnetic field splits the state with J = 1/2 into two
states because the magnetic field, like the angular momentum, is
a pseudovector.

In a stronger electric field the coupling between L and S is disrupted.
In this case the scheme of splitting is the following. The vector L
is integral. It has L + 1 projections on the electric field. We must
project S onto its projection. But since L and S are both pseudovectors,
the number of projections of S upon L is already equal to 2S + 1.
The only exception is when the projection of L upon the field is equal
to zero. This level splits into S + 1 or S + 1/2 levels according to
whether S is integral or half-integral.

The square-law Stark effect. The amount of splitting is determined
by the relative shift of neighbouring levels. As was shown in Sec. 33
(33.31a), the shift of an energy level is equal to the average of the
perturbation energy for unperturbed motion. Proceeding from (14.28),
we have the following expression for the perturbation energy in
a homogeneous electric field

$$\hat{\mathcal{H}}^{(1)} = -(\mathbf{d}\mathbf{E}). \tag{35.14}$$

But it is easy to see that the average of this quantity is equal to
zero. Indeed, the wave function of an atomic state with given J
is always odd or even (with the exception of hydrogen, see below).
Therefore, the product $\psi_J^{*}\psi_J$ must be even. From (30.24),
the average of $\hat{\mathcal{H}}^{(1)}$ is equal to

$$\overline{\mathcal{H}^{(1)}} = -e\mathbf{E}\int \psi_J^{*}\,\mathbf{r}\,\psi_J\,dV. \tag{35.15}$$

But the integrand is an odd function, so that its integral is identically
equal to zero.

Level splitting is obtained only to a second approximation, if
into (35.15) we substitute wave functions which have already been
perturbed by the external field. This splitting is governed by a square-law
field dependence.

The linear Stark effect. In a hydrogen atom the electron energy
depends only upon the principal quantum number n and does not
depend upon l. Therefore, the state with $\mathcal{E} = \mathcal{E}_n$ is
represented as a superposition of states with l varying from 0 to n - 1.
But the wave function is even for even l, and odd for odd l. Hence, the
function with $\mathcal{E} = \mathcal{E}_n$ does not have a definite parity,
so that the integral (35.15) does not become zero. Therefore, in the
hydrogen atom we observe line splitting which depends linearly upon the
electric field.*

Highly excited atomic states always more or less resemble hydrogen-atom
states, because the nucleus and the atomic residue act upon
an electron, which has receded far from the nucleus, in a way similar
to a point charge. The energies of these states depend upon l in
accordance with the expression (31.46). These states give a linear
Stark effect if the perturbation produced by the field shifts the levels
more strongly than they are split in l.

Ionization of the atom by a constant field. A constant electric field 
not only shifts the energy levels of an atom, but also qualitatively 
changes its whole state. 

Let us write down the potential energy of an electron in an atom
situated in an external electric field E which is directed along the
z-axis:

$$U = U_0(r) + eEz. \tag{35.16}$$

For a sufficiently large and negative z the potential energy far away 
from the atom is less than in the atom. The potential well in the atom 
is separated from the region of large negative z (where the potential 
energy can be still less) by a potential barrier. But there is always 
the probability of a spontaneous electron transition through the 
potential barrier into the free state. Transitions of this type were 
considered in Sec. 28 as applied to alpha disintegration. 

Any state of an atom put in a constant electric field may be ionized, 
but, naturally, if the field is weak the probability of ionization be¬ 
comes vanishingly small. In a strong field the potential barrier be¬ 
comes transparent, especially for highly excited atomic states. If 
the time for the spontaneous ejection of an electron in such a state 
turns out to be less than the radiation time, the corresponding line 
in the spectrum disappears. 

Thus, a weak perturbation inside an atom (the atomic unit of field
intensity is $5.13\cdot 10^{9}$ V/cm, so that the external field is
always small compared with the atomic field) essentially affects
the state since the conditions at infinity change. But if the broadening
of the atomic levels is still small compared with the distance between
them, they can be regarded, as before, as discrete.

Exercise 

Construct a diagram for the splitting of the multiplet and the 

transitions in a strong and weak magnetic field. 


* The relativistic expression (38.28) for the energy of a hydrogen atom
involves n and j. The orbital angular momentum l = j ± 1/2 for a given j,
so that a state with given n and j (in the same way as to a nonrelativistic
approximation) does not have definite parity and yields a linear Stark effect.



Sec. 36. Quantum Theory of Dispersion

The classical theory of dispersion, a brief outline of which was 
given in exercise 19, Sec. 16, proceeds from the concept of a charge 
elastically bound in an atom. The forced oscillations of these charges 
under the action of a sinusoidally varying field lead to an electrical 
polarization of the medium proportional to the field. Whence the 
dielectric constant can be easily calculated as a function of the fre¬ 
quency. 

The classical theory of dispersion is in good agreement with ex¬ 
periment. Yet, at the present time it is well known that the charges 
in atoms are by no means bound by elastic forces. For this reason, 
the success of classical dispersion theory may appear incomprehensible. 

Even though the charges are not bound by elastic forces, there exist 
quantities, relating to the motion of the charges, which vary har¬ 
monically with time: these are the coordinate matrix elements 
[see (34.18)]. Similar harmonic oscillations occur, as is well known, 
in the classical mechanics of elastically bound particles. The dipole 
moment of an atom, induced by an external alternating field, is 
expressed in terms of the dipole-moment matrix elements directly 
related to the coordinate matrix elements. In the present section, 
a quantum theory of dispersion will be formulated which will lead 
to the same expression for dielectric constant as classical theory; 
it will also indicate which quantities should correspond to each 
other in both theories. 

The wave equation for an atom in a given field of radiation. In
order to calculate the dipole moment induced by a field, we must
first of all determine the wave function of the atom in the external
field. In contrast to the previous section, where the behaviour of
an atom in a constant external field was studied, we shall here
consider the interaction of an atom with an alternating external field
which varies according to the law

$$\mathbf{E} = \mathbf{E}_0\cos\omega t. \tag{36.1}$$

It turns out to be more convenient here to write down the field
straightway in real form, instead of taking the real part of the final
result, in order to have a real Hamiltonian.

The wavelength of a light ray is rightly considered large compared 
with atomic dimensions (this was confirmed by the estimate in Sec. 34), 
so that the field E may be considered homogeneous: its phase is 
constant over the whole atom. 

We determined the energy for a system of charges in an external
homogeneous field in Sec. 14 [see (14.28) and (35.14)]. The correction
to the Hamiltonian due to a homogeneous electric field looks like

$$\hat{\mathcal{H}}^{(1)} = -(\mathbf{d}\mathbf{E}). \tag{36.2}$$

If we call the Hamiltonian of an unperturbed system
$\hat{\mathcal{H}}^{(0)}$, then Schrödinger's equation will be of the form

$$-\frac{h}{i}\frac{\partial\psi}{\partial t} = \hat{\mathcal{H}}^{(0)}\psi + \hat{\mathcal{H}}^{(1)}\psi. \tag{36.3}$$

Separating the wave function into an unperturbed part $\psi^{(0)}$ and
a perturbation $\psi^{(1)}$ and regarding the perturbation as relatively
small, we obtain an equation which we have already used in Sec. 34:

$$-\frac{h}{i}\frac{\partial\psi^{(1)}}{\partial t} = \hat{\mathcal{H}}^{(0)}\psi^{(1)} + \hat{\mathcal{H}}^{(1)}\psi^{(0)}. \tag{36.4}$$

Expansion in eigenfunctions. We seek the unknown function $\psi^{(1)}$
in the form of a wave function expansion with time-dependent
coefficients:

$$\psi^{(1)} = \sum_n c_n(t)\,\psi_n^{(0)}. \tag{36.5}$$

We obtained an equation in (34.7), Sec. 34, for the expansion
coefficients

$$-\frac{h}{i}\frac{dc_n}{dt} = \int \psi_n^{(0)*}\,\hat{\mathcal{H}}^{(1)}\,\psi^{(0)}\,dV. \tag{36.6}$$

The right-hand side of this equation depends upon time in a different
way from that in equation (34.7), because the perturbation operator
$\hat{\mathcal{H}}^{(1)}$ involves time explicitly [see (36.1)]. Let us
consider that the unperturbed state of the atom is its ground state,
which we shall write with a subscript 0, i.e., $\psi_0^{(0)}$. Then there
will simply be the matrix element $\mathbf{d}_{0n}$ on the right-hand
side of (36.6) multiplied by $-\mathbf{E}_0\cos\omega t$. The time
dependence for the matrix element was found in Sec. 34. Using the
notation (34.20) we can write

$$\mathbf{d}_{0n} = e^{i\omega_{n0}t}\,\mathbf{d}'_{0n}. \tag{36.7}$$

The representation of a matrix element together with its time
dependence is termed the Heisenberg representation, and that without
the dependence, in the form $\mathbf{d}'_{0n}$, is the Schrödinger
representation. Substituting (36.7) in (36.6) we obtain

$$-\frac{h}{i}\frac{dc_n}{dt} = -(\mathbf{E}_0\mathbf{d}'_{0n})\,e^{i\omega_{n0}t}\cos\omega t. \tag{36.8}$$

In order to integrate this equation we must impose a certain initial
condition upon $c_n$. It is natural to suppose that the external field
acts for a sufficiently long time so that all the transition processes
related to turning on the field do not affect the states. We can assume,
for example, that the external field depends upon time according
to the law:

$$\mathbf{E} = \mathbf{E}_0\,e^{\alpha t}\cos\omega t \ \text{ for } t < 0, \qquad \mathbf{E} = \mathbf{E}_0\cos\omega t \ \text{ for } t \geq 0, \tag{36.9}$$

i.e., the amplitude gradually rises with time to the value $\mathbf{E}_0$.
This law for the change of the field must be substituted into (36.8),
integration performed from $-\infty$ to any t, and α must tend to zero.
After this, at each instant of time (t < 0 or t ≥ 0) there will be a single
dependence of $c_n$ upon t:

$$c_n = \frac{1}{2h}\left\{\frac{e^{i(\omega_{n0}+\omega)t}}{\omega_{n0}+\omega} + \frac{e^{i(\omega_{n0}-\omega)t}}{\omega_{n0}-\omega}\right\}(\mathbf{E}_0\mathbf{d}'_{0n}). \tag{36.10}$$
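
As a check on the integration, one can verify numerically that the coefficient (36.10), read as c_n = (1/2h)[e^{i(ω_n0+ω)t}/(ω_n0+ω) + e^{i(ω_n0-ω)t}/(ω_n0-ω)](E_0 d'_0n), satisfies the equation -(h/i) dc_n/dt = -(E_0 d'_0n) e^{iω_n0 t} cos ωt. The sketch below is only an illustration: the units are arbitrary, `coupling` stands for the scalar product (E_0 d'_0n), and the derivative is taken by a central difference.

```python
import cmath
import math


def c_n(t, w_n0, w, coupling, h=1.0):
    """Expansion coefficient in the form of (36.10); coupling = (E_0 d'_0n)."""
    plus = cmath.exp(1j * (w_n0 + w) * t) / (w_n0 + w)
    minus = cmath.exp(1j * (w_n0 - w) * t) / (w_n0 - w)
    return coupling / (2.0 * h) * (plus + minus)


def residual(t, w_n0, w, coupling, h=1.0, dt=1e-5):
    """|lhs - rhs| of -(h/i) dc_n/dt = -coupling * exp(i w_n0 t) * cos(w t)."""
    dc_dt = (c_n(t + dt, w_n0, w, coupling, h)
             - c_n(t - dt, w_n0, w, coupling, h)) / (2.0 * dt)
    lhs = -(h / 1j) * dc_dt
    rhs = -coupling * cmath.exp(1j * w_n0 * t) * math.cos(w * t)
    return abs(lhs - rhs)


# Sample values (arbitrary units, frequency away from resonance).
print(residual(t=0.7, w_n0=3.0, w=1.0, coupling=0.4))
```

The residual is at the level of the finite-difference error as long as ω is not close to ω_n0, where the second denominator becomes small.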


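Induced dipole moment. The mean value of the dipole moment
is calculated according to the general formula (30.24) for mean values:

$$\bar{\mathbf{d}} = \int\left(\psi^{(0)*} + \psi^{(1)*}\right)\mathbf{d}\left(\psi^{(0)} + \psi^{(1)}\right)dV. \tag{36.11}$$

The quadratic term in $\psi^{(1)}$ must, of course, be discarded, since
the calculations are performed to the accuracy of terms proportional
to E in the first degree. In addition, the term
$\int\psi^{(0)*}\,\mathbf{d}\,\psi^{(0)}\,dV$ does not depend at all upon E
and, therefore, is irrelevant to the problem of polarization produced by
an external field. Also, this term is usually equal to zero, as indicated
in the previous section in connection with the expression (35.15).
Hence, the mean dipole moment responsible for dispersion is

$$\bar{\mathbf{d}} = \int\left(\psi^{(0)*}\,\mathbf{d}\,\psi^{(1)} + \psi^{(1)*}\,\mathbf{d}\,\psi^{(0)}\right)dV. \tag{36.12}$$

We shall substitute here the expansion (36.5) and integrate the
series term by term:

$$\bar{\mathbf{d}} = \sum_n\left(c_n\int\psi_0^{*}\,\mathbf{d}\,\psi_n\,dV + c_n^{*}\int\psi_n^{*}\,\mathbf{d}\,\psi_0\,dV\right). \tag{36.13}$$

The integrals involved here are once again dipole-moment matrix
elements. Substituting their expressions from (36.7), we write the
mean dipole moment as

$$\bar{\mathbf{d}} = \sum_n\left(c_n\,e^{i\omega_{0n}t}\,\mathbf{d}'_{n0} + c_n^{*}\,e^{i\omega_{n0}t}\,\mathbf{d}'_{0n}\right). \tag{36.14}$$

Substituting $c_n$ from expression (36.10), we finally obtain:

$$\bar{\mathbf{d}} = \frac{1}{2h}\sum_n\left[\left(\frac{e^{i\omega t}}{\omega_{n0}+\omega} + \frac{e^{-i\omega t}}{\omega_{n0}-\omega}\right)\mathbf{d}'_{n0}\,(\mathbf{E}_0\mathbf{d}'_{0n}) + \left(\frac{e^{-i\omega t}}{\omega_{n0}+\omega} + \frac{e^{i\omega t}}{\omega_{n0}-\omega}\right)\mathbf{d}'_{0n}\,(\mathbf{E}_0\mathbf{d}'_{n0})\right]. \tag{36.15}$$

Here we could already have written $\mathbf{d}_{n0}$ in place of
$\mathbf{d}'_{n0}$, because the time factors of $\mathbf{d}_{n0}$ and
$\mathbf{d}_{0n}$ cancel.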



Polarization. In order to calculate the polarization of atoms by
a light-wave field, it is sufficient to know only the dipole moment
projection on the field. If, for example, the electric field of an incident
wave is directed along the x-axis, then the expression (36.15) involves
only dipole-moment transition components directed along the
x-axis, i.e., the matrix elements of x:

$$\bar{d}_x = \frac{e^2}{2h}\sum_n\left\{\left(\frac{e^{i\omega t}}{\omega_{n0}+\omega} + \frac{e^{-i\omega t}}{\omega_{n0}-\omega}\right) + \left(\frac{e^{-i\omega t}}{\omega_{n0}+\omega} + \frac{e^{i\omega t}}{\omega_{n0}-\omega}\right)\right\}|x_{0n}|^2\,E_0. \tag{36.16}$$

We have made use of the Hermitian nature of matrix elements
expressed by the relationship (34.15). In other words, we have put
$|x_{0n}|^2$ in place of $x_{0n}x_{n0}$.

Now, by performing a simple algebraical transformation and
introducing the electric field E itself instead of its amplitude $E_0$,
we have:

$$\bar{d}_x = \frac{1}{h}\sum_n\frac{2\omega_{n0}\,e^2\,|x_{0n}|^2}{\omega_{n0}^2-\omega^2}\,E. \tag{36.17}$$


The dispersion formula. Let us consider the polarization of a medium
P = Nd, where N is the number of atoms in unit volume.* The electric
induction is related to the electric field and polarization by the
relationship (16.23) which, in the given case, is of the form

$$\mathbf{D} = \mathbf{E} + 4\pi\mathbf{P} = \left(1 + \frac{4\pi Ne^2}{h}\sum_n\frac{2\omega_{n0}\,|x_{0n}|^2}{\omega_{n0}^2-\omega^2}\right)\mathbf{E}. \tag{36.18}$$

But $\mathbf{D} = \varepsilon\mathbf{E}$ from the definition of dielectric
constant, so that

$$\varepsilon = 1 + \frac{4\pi Ne^2}{h}\sum_n\frac{2\omega_{n0}\,|x_{0n}|^2}{\omega_{n0}^2-\omega^2}. \tag{36.19}$$

* Such additivity of dipole moments is true only for gases.

We note that this expression is correct only when the frequency
of the incident radiation is not close to one of the natural frequencies
of the atom $\omega_{n0}$. Otherwise the denominator in (36.19), and
correspondingly in all the previous equations up to (36.10), can become
zero; in any case, it becomes small. But then the perturbation caused
by the field is large, and it cannot be regarded as weak, utilizing
the expansion $\psi = \psi^{(0)} + \psi^{(1)}$ and neglecting
$|\psi^{(1)}|^2$. Physically, this means that if the frequency of the
incident light is close to the frequency of one of the absorption lines
(or, what is just the same, the emission lines), we must take into account
the damping in the amplitude of the excited atomic states due to
radiation. In other words, the amplitudes of the excited atomic states
must not be taken with a purely imaginary time dependence, but of the
form (28.14):

r„‘ 

(36.20) 

The quantum evaluation of $\Gamma_n$ for radiation damping is rather
complicated, and we shall not deal with it.

A comparison of classical and quantum dispersion formulae. We
shall now go over to a comparison of the quantum formula of dispersion
(36.19) and the classical formula. We shall write the latter for the case
when there are many types of oscillation whose frequencies are
$\omega_{n0}$ (instead of a single frequency $\omega_0$). We shall separate
the total number of oscillators N into parts corresponding to each
separate frequency $\omega_{n0}$:

$$N = \sum_n N_n. \tag{36.21}$$

If we introduce the relative fractions of each oscillation by the formula

$$f_n = \frac{N_n}{N}, \tag{36.22}$$

then it is obvious that

$$\sum_n f_n = 1. \tag{36.23}$$

Let us suppose that the frequency of the incident radiation is
not close to any one of the natural frequencies $\omega_{n0}$. Then the
classical dispersion formula is generalized to the case of many
frequencies in the following way:

$$\varepsilon = 1 + \frac{4\pi Ne^2}{m}\sum_n\frac{f_n}{\omega_{n0}^2-\omega^2}. \tag{36.24}$$

Comparing it with the quantum formula of dispersion (36.19),
we see that both formulae become identical if we put

$$f_n = \frac{2m\omega_{n0}}{h}\,|x_{0n}|^2. \tag{36.25}$$

But to make this equation meaningful we must, in accordance
with (36.23), impose upon the right-hand side of (36.25) the condition

$$\sum_n\frac{2m\omega_{n0}}{h}\,|x_{0n}|^2 = 1. \tag{36.26}$$

Let us prove that this is actually what occurs. To do so we write
the commutation relation between $\hat{p}_x$ and x [see (29.3)]

$$\hat{p}_x x - x\hat{p}_x = \frac{h}{i}. \tag{36.27}$$



We multiply it by $\psi_0^{*}$ on the left and $\psi_0$ on the right (we
omit the upper index 0) and integrate over the whole volume. We can take
advantage of the normalization condition on the right-hand side
of the equality, i.e., of the fact that $\int|\psi_0|^2\,dV = 1$:

$$\int\psi_0^{*}\left(\hat{p}_x x - x\hat{p}_x\right)\psi_0\,dV = \frac{h}{i}. \tag{36.28}$$

We expand the products $x\psi_0$ and $\psi_0^{*}x$ in an eigenfunction
series $\psi_n$, according to the general formula (30.8):

$$x\psi_0 = \sum_n a_n\psi_n, \qquad \psi_0^{*}x = \sum_n a'_n\psi_n^{*}. \tag{36.29}$$

The expansion coefficients are determined from (30.11):

$$a_n = \int\psi_n^{*}\,x\,\psi_0\,dV, \qquad a'_n = \int\psi_0^{*}\,x\,\psi_n\,dV. \tag{36.30}$$

In other words, they are equal to the matrix elements $x_{0n}$ and
$x_{n0}$. Now, substituting the expansion (36.29) into (36.28), we obtain

$$\sum_n\left(x_{0n}\int\psi_0^{*}\,\hat{p}_x\,\psi_n\,dV - x_{n0}\int\psi_n^{*}\,\hat{p}_x\,\psi_0\,dV\right) = \frac{h}{i}. \tag{36.31}$$

But this expression contains the momentum matrix elements, which
can be replaced by coordinate matrix elements by equation (34.21):

$$\int\psi_0^{*}\,\hat{p}_x\,\psi_n\,dV = (p_x)_{n0} = im\omega_{0n}x_{n0},$$

$$\int\psi_n^{*}\,\hat{p}_x\,\psi_0\,dV = (p_x)_{0n} = im\omega_{n0}x_{0n}.$$

After this substitution, equality (36.31) can be easily reduced to
the required form (36.26) if we take advantage of the fact that
$x_{n0} = x_{0n}^{*}$ and $\omega_{0n} = -\omega_{n0}$.
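
The sum rule (36.26), read as Σ_n (2mω_n0/h)|x_0n|² = 1, can be checked numerically on a simple model. The sketch below uses a particle in a one-dimensional box, which is not discussed in the text and is chosen only because its wave functions and levels are elementary; h denotes Planck's constant divided by 2π, units are arbitrary, and all names are illustrative.

```python
import math


def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * step)
    return total * step / 3.0


def box_state(n, width):
    """Normalized n-th wave function of a particle in a box [0, width]."""
    return lambda x: math.sqrt(2.0 / width) * math.sin(n * math.pi * x / width)


def oscillator_strengths(n_max, m=1.0, h=1.0, width=1.0):
    """f_n = 2 m w_n0 |x_0n|^2 / h for transitions from the ground state n = 1."""
    ground = box_state(1, width)
    e_ground = (math.pi * h) ** 2 / (2.0 * m * width ** 2)
    strengths = []
    for n in range(2, n_max + 1):
        excited = box_state(n, width)
        x_0n = simpson(lambda x: excited(x) * x * ground(x), 0.0, width)
        w_n0 = (n * n - 1) * e_ground / h   # (E_n - E_1) / h
        strengths.append(2.0 * m * w_n0 * x_0n ** 2 / h)
    return strengths


f = oscillator_strengths(40)
print(f[0], sum(f))
```

In this model almost the whole sum comes from the first transition, and the partial sum over the first forty levels already differs from unity by much less than one part in a thousand.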

We note that the oscillator fractions $f_n$ (they are also called
"oscillator strengths") are proportional to the same matrix elements as
are involved in the probabilities of radiation or absorption of the
appropriate quanta. Therefore, the dispersion properties of a substance
may be associated with the intensity of the spectral lines emitted by it.

Incoherent scattering. In addition to the dipole moment d deter¬ 
mined by equation (36.11), we can also calculate the transition 
moments corresponding to radiation with a frequency which is less 
than that of the incident light. In other words, we can calculate the 
intensity of light scattering with a change of frequency. Such scattering 
is termed incoherent. 

A very important case is when the radiation energy, which remains
in the substance upon incoherent scattering, contributes to exciting
the oscillatory motion of the molecules. This phenomenon was
discovered by L. I. Mandelshtam together with G. S. Landsberg and,
independently, by Raman. It is frequently accompanied by the
excitation of oscillations which do not manifest themselves in the direct
absorption of quanta as a result of the appropriate selection rules for
molecular oscillations. In this case, incoherent scattering yields
important information concerning the molecular structure of substances.

Sec. 37. Quantum Theory of Scattering 

The effective cross-section concept in quantum theory. The concept 
of an effective scattering cross-section of particles, which was defined 
in Sec. 6 in terms of classical mechanics, is directly extended to 
quantum mechanics. Indeed, the differential effective scattering 
cross-section of the particles inside a given solid angle is the ratio 
of the number of scattered particles in this element of angle to the 
flux density of the incident particles. Since flux and flux density 
can be defined quantum-mechanically, the effective cross-section has 
the same sense in quantum theory as it has in classical theory. 

In practice, however, it is very difficult to calculate the effective 
cross-section. Therefore, we shall consider certain special cases in 
which the solution to the problem is comparatively simple. 

The Born approximation. Let us suppose that a particle with
energy $\mathcal{E}$ is scattered in a given potential field U. We shall
first consider the case of $\mathcal{E} \gg U$. Then, the change in the
wave vector of the particle in the field is of the order

$$\frac{\sqrt{2m(\mathcal{E}-U)}}{h} - \frac{\sqrt{2m\mathcal{E}}}{h} \approx -\frac{1}{h}\sqrt{\frac{m}{2\mathcal{E}}}\;U.$$

If the dimensions of the region in which the field acts are of the
order a then the total phase change of the wave function in the
scattering field is estimated as

$$\sqrt{\frac{m}{2\mathcal{E}}}\;\frac{Ua}{h} \sim \frac{Ua}{hv}.$$

This quantity must be considerably smaller than unity in order
that the perturbation produced by the field may be regarded as
weak. In the case when $U \gg \mathcal{E}$, the wave number is estimated
as $\frac{\sqrt{2mU}}{h}$. It follows then that the criterion of smallness
for a phase change is $\frac{\sqrt{2mU}}{h}\,a \ll 1$ (upper estimate).

Under these conditions the action of the field U must be regarded
as a weak perturbation imposed upon the wave function.

We shall proceed from the general formula (34.29) for the transition
probability. Let the initial momentum of the incident particle equal
$\mathbf{p}$ prior to scattering in a centre-of-mass system (see Sec. 6),
and $\mathbf{p}'$ after scattering. We consider the scattering to be
elastic, so that $p = p'$. To a zero approximation we choose the wave
functions $\psi^{(0)}(\mathbf{p})$ and $\psi^{(0)}(\mathbf{p}')$ in the form
of plane waves, which corresponds to free motion. We write them as

$$\psi^{(0)}(\mathbf{p}) = \frac{1}{\sqrt{V}}\,e^{i\frac{\mathbf{p}\mathbf{r}}{h}}, \qquad \psi^{(0)}(\mathbf{p}') = \frac{1}{\sqrt{V}}\,e^{i\frac{\mathbf{p}'\mathbf{r}}{h}}. \tag{37.1}$$

These functions are normalized to unity in the volume V (which,
of course, falls out of the final result). The approximation (37.1)
for $\psi^{(0)}(\mathbf{p})$, $\psi^{(0)}(\mathbf{p}')$ is called a Born
approximation.

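The function $\psi^{(0)}(\mathbf{p})$ corresponds to a flux density
$\frac{v}{V}$; this is immediately seen from (24.20):

$$\mathbf{j} = \frac{h}{2mi}\left(\psi^{(0)*}\nabla\psi^{(0)} - \psi^{(0)}\nabla\psi^{(0)*}\right) = \frac{\mathbf{p}}{mV} = \frac{\mathbf{v}}{V}. \tag{37.2}$$

From (37.1), the matrix element for the transition probability that
appears in (34.29) is

$$\mathcal{H}^{(1)}(\mathbf{p},\mathbf{p}') = \frac{1}{V}\int e^{i\frac{(\mathbf{p}-\mathbf{p}')\mathbf{r}}{h}}\,U(\mathbf{r})\,dV. \tag{37.3}$$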

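In order to find the scattering probability, we must multiply the
square modulus of (37.3) by $\frac{2\pi}{h}$ and by the number of final
states $z(\mathcal{E})$ in a unit energy interval and in the element of
solid angle $d\Omega$. This number can be determined directly from (25.26)
if we take into account that the fraction of the states corresponding to
a solid-angle element $d\Omega$ is $\frac{d\Omega}{4\pi}$. Therefore,

$$z(\mathcal{E}) = \frac{dN(\mathcal{E})}{d\mathcal{E}}\,\frac{d\Omega}{4\pi} = \frac{V m^{3/2}\sqrt{2\mathcal{E}}}{8\pi^3h^3}\,d\Omega. \tag{37.4}$$

The differential effective scattering cross-section inside a solid-angle
element $d\Omega$ is equal to the scattering probability in unit time
[defined from (34.29)], divided by the incident flux density
$\frac{v}{V}$. Therefore,

$$d\sigma = \frac{m^2}{4\pi^2h^4}\left|\int e^{i\frac{(\mathbf{p}-\mathbf{p}')\mathbf{r}}{h}}\,U(\mathbf{r})\,dV\right|^2 d\Omega. \tag{37.5}$$

The integral appearing here is the matrix element calculated for
two functions $\psi^{(0)}(\mathbf{p})$, $\psi^{(0)}(\mathbf{p}')$ with
normalization over unit volume, V = 1. Therefore, introducing
$\mathbf{k} = \frac{\mathbf{p}}{h}$, $\mathbf{k}' = \frac{\mathbf{p}'}{h}$,
we write

$$U_{\mathbf{k}\mathbf{k}'} = \int e^{i(\mathbf{k}-\mathbf{k}')\mathbf{r}}\,U(\mathbf{r})\,dV, \tag{37.6}$$

$$d\sigma = \frac{m^2}{4\pi^2h^4}\,|U_{\mathbf{k}\mathbf{k}'}|^2\,d\Omega. \tag{37.7}$$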


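Scattering by a central field. Simplifications appear in expression
(37.6) if the field U is central, i.e., if it depends only upon r. Let
us calculate $U_{\mathbf{k}\mathbf{k}'}$ for this case. In defining the
polar angle $\vartheta$ we choose the direction of the vector
$\mathbf{k} - \mathbf{k}'$ as the polar axis. Then

$$U_{\mathbf{k}\mathbf{k}'} = \int e^{i(\mathbf{k}-\mathbf{k}')\mathbf{r}}\,U(r)\,dV = 2\pi\int_0^\infty r^2\,dr\,U(r)\int_0^\pi e^{i|\mathbf{k}-\mathbf{k}'|r\cos\vartheta}\sin\vartheta\,d\vartheta. \tag{37.8}$$

Bearing in mind that $\sin\vartheta\,d\vartheta = -d\cos\vartheta$, we can
integrate with respect to $\vartheta$ immediately, obtaining

$$U_{\mathbf{k}\mathbf{k}'} = 2\pi\int_0^\infty r^2\,dr\,U(r)\,\frac{e^{i|\mathbf{k}-\mathbf{k}'|r} - e^{-i|\mathbf{k}-\mathbf{k}'|r}}{i|\mathbf{k}-\mathbf{k}'|r} = \frac{4\pi}{|\mathbf{k}-\mathbf{k}'|}\int_0^\infty r\,U(r)\sin\left(|\mathbf{k}-\mathbf{k}'|r\right)dr. \tag{37.9}$$

As we have already said, k = k'. Therefore, the vector difference
is easily expressed in terms of the deflection angle θ for the particle:

$$|\mathbf{k}-\mathbf{k}'|^2 = 2k^2 - 2(\mathbf{k}\mathbf{k}') = 2k^2(1-\cos\theta) = 4k^2\sin^2\frac{\theta}{2}. \tag{37.10}$$

This can also be seen from a geometrical construction. We have

$$U_{\mathbf{k}\mathbf{k}'} = \frac{2\pi}{k\sin\frac{\theta}{2}}\int_0^\infty r\,U(r)\sin\left(2kr\sin\frac{\theta}{2}\right)dr. \tag{37.11}$$

Thus, a calculation of $U_{\mathbf{k}\mathbf{k}'}$ reduces to calculation of
a single integral (37.11).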
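
The single integral (37.11), read as U_kk' = [2π / (k sin(θ/2))] ∫ r U(r) sin(2kr sin(θ/2)) dr, is easy to evaluate numerically. The sketch below does this for a screened Coulomb potential U = A e^{-μr}/r, which is an assumption made only for illustration (it is not the field treated in the text) and for which the integral is known in closed form, 4πA/(q² + μ²) with q = 2k sin(θ/2).

```python
import math


def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * step)
    return total * step / 3.0


def born_matrix_element(r_times_U, k, theta, r_max, n=20000):
    """Evaluate (37.11); r_times_U(r) must return r * U(r)."""
    half = math.sin(theta / 2.0)
    q = 2.0 * k * half
    integral = simpson(lambda r: r_times_U(r) * math.sin(q * r), 0.0, r_max, n)
    return 2.0 * math.pi / (k * half) * integral


# Screened Coulomb field (illustrative assumption): r * U(r) = A * exp(-mu * r).
A, mu, k, theta = 1.0, 1.0, 1.0, math.pi / 2.0
numeric = born_matrix_element(lambda r: A * math.exp(-mu * r), k, theta, r_max=40.0)
q = 2.0 * k * math.sin(theta / 2.0)
analytic = 4.0 * math.pi * A / (q * q + mu * mu)
print(numeric, analytic)
```

Passing r·U(r) rather than U(r) avoids evaluating the potential at r = 0, where it is singular.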

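Rutherford's formula. For the case of a Coulomb field,
$U = \pm\frac{Ze^2}{r}$. The integral $U_{\mathbf{k}\mathbf{k}'}$ is found
in the following artificial manner. We define the integral
$\int_0^\infty\sin x\,dx$ thus:

$$\int_0^\infty\sin x\,dx = \lim_{\alpha\to 0}\int_0^\infty e^{-\alpha x}\sin x\,dx = \lim_{\alpha\to 0}\frac{1}{1+\alpha^2} = 1.$$

Then

$$\int_0^\infty\sin ax\,dx = \frac{1}{a}$$

and, finally,

$$U_{\mathbf{k}\mathbf{k}'} = \pm\frac{2\pi Ze^2}{k\sin\frac{\theta}{2}}\int_0^\infty\sin\left(2kr\sin\frac{\theta}{2}\right)dr = \pm\frac{\pi Ze^2}{k^2\sin^2\frac{\theta}{2}}. \tag{37.12}$$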
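
The artificial definition of the integral of sin x rests on the damped integral ∫ e^{-αx} sin x dx taken from 0 to ∞, which equals 1/(1 + α²) and tends to unity as α tends to zero. A minimal numerical sketch of this limit follows; the cut-off at 40/α and the step of 0.01 are arbitrary choices of the example, not part of the argument.

```python
import math


def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * step)
    return total * step / 3.0


def damped_sine_integral(alpha, step=0.01):
    """Integral of exp(-alpha x) sin x from 0 to 40 / alpha (tail is negligible)."""
    x_max = 40.0 / alpha
    n = 2 * int(x_max / (2.0 * step))   # even number of intervals
    return simpson(lambda x: math.exp(-alpha * x) * math.sin(x), 0.0, x_max, n)


results = {alpha: damped_sine_integral(alpha) for alpha in (0.5, 0.1, 0.05)}
for alpha, value in results.items():
    print(alpha, value, 1.0 / (1.0 + alpha * alpha))
```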


Substituting this in (37.7), we obtain a final expression for the
differential effective scattering cross-section:

$$d\sigma = \frac{Z^2e^4\,d\Omega}{4m^2v^4\sin^4\frac{\theta}{2}}, \tag{37.13}$$

where we have taken advantage of the fact that p = hk = mv. This
result curiously agrees with the precise classical Rutherford formula
(6.19).

It turns out that the result (37.13) is also obtained from a precise 
solution of the wave equation for the case of a Coulomb field. Thus, 
Rutherford’s formula is extended to quantum mechanics unchanged. 

The Born approximation in the theory of scattering by a Coulomb
field can be regarded as a series expansion in square powers of the
charge, or, more precisely, of $Ze^2$. But since the precise formula does
not involve powers higher than $(Ze^2)^2$, the result of the Born
approximation coincided with the precise result.

We shall now estimate the limits of applicability of the method
under consideration for the Coulomb field. To do this, we make use
of the first criterion established at the beginning of the section for
the applicability of this method. Since the product Ua in this case
is equal to $Ze^2$, we arrive at the following condition:

$$\frac{Ua}{hv} = \frac{Ze^2}{hv} \ll 1. \tag{37.14}$$

The quantity $e^2/hc = 1/137$. Therefore, we write (37.14) otherwise
thus:

$$\frac{Z}{137}\,\frac{c}{v} \ll 1. \tag{37.15}$$
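
To get a feeling for the condition (37.15), read here as Zc/(137v) ≪ 1, the parameter can be tabulated for a few cases. The sketch below uses the rounded value 1/137 quoted in the text; the sample values of Z and v/c are arbitrary.

```python
INVERSE_FINE_STRUCTURE = 137.0   # rounded value of hc / e^2 used in the text


def born_parameter(Z, v_over_c):
    """Left-hand side of the applicability condition: Z c / (137 v)."""
    return Z / (INVERSE_FINE_STRUCTURE * v_over_c)


# Sample cases (illustrative): light and heavy nuclei at several velocities.
for Z, beta in ((2, 0.5), (30, 0.5), (90, 0.5), (90, 1.0)):
    print(Z, beta, round(born_parameter(Z, beta), 3))
```

Even in the limit v → c the parameter for Z = 90 is about two thirds, so it cannot be treated as small.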


But Z ~ 90 for heavy elements, so that (37.15) is not satisfied in
general. Of course, Rutherford's formula is applicable to nonrelativistic
particles in this case too, because it is exact; but in calculating
a correction, for example, arising from a distortion of the nuclear
field by the field of atomic electrons, the Born approximation yields
an incorrect result.

With the condition (37.15), formula (37.13) is applicable also to the
scattering of relativistic particles at small angles provided m is
replaced by $\frac{m}{\sqrt{1 - v^2/c^2}}$.

The collision parameter (aiming distance) and angular momentum.
The Born approximation cannot be used when large forces act upon
a particle, even if they are concentrated in a small region. First of all
let us define what is meant by a "small" region.

It is convenient here to compare the classical aiming distance ρ
(see Sec. 6) with the angular-momentum eigenvalue in quantum
mechanics. For large angular-momentum eigenvalues, when the
quasi-classical approximation is applicable, we can give the following
estimate:

$$hl \sim mv\rho. \tag{37.16}$$

Whence

$$\rho \sim \frac{hl}{mv} = \frac{\lambda l}{2\pi}, \tag{37.17}$$

where λ is the de Broglie wavelength.

It can be seen from here that to a change in the angular momentum
by unity there corresponds an increment of $\frac{\lambda}{2\pi}$ in the
aiming distance. Accordingly, the smallest collision parameter is given by
l = 0 and $\rho \sim \frac{\lambda}{2\pi}$. Here the particle is scattered
in the s-state.

Let us consider the case when the radius of action of the forces is
less than $\frac{\lambda}{2\pi}$. Then a particle with an angular momentum
other than zero hardly at all experiences scattering. We have shown [see
(31.12)] that the wave function for a particle with angular momentum
l becomes zero, like $r^l$, at the origin. Therefore, the probability of
finding a particle with l > 0 in the region of action of the forces is
very small if the radius of action of the forces is much less than
$\frac{\lambda}{2\pi}$.

Separating the wave function with zero angular momentum. Let
us take the term corresponding to $l = 0$ out of the wave function.
To do this, it is necessary to expand the function (37.1), i.e., a plane
wave, in a series of eigenfunctions of the operator of the square of the
angular momentum. The function corresponding to $l = 0$ is especially
simple: it does not depend upon the angle. Indeed, this operator
involves only angular derivatives. Operating upon a function which
does not depend on the angle is equivalent to multiplying the function
by zero. Normalizing the angular function of the s-state to unity, we find

$$\varphi_0 = \frac{1}{\sqrt{4\pi}}; \quad \text{then} \quad \int |\varphi_0|^2\, d\Omega = 1.$$




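The expansion coefficient for this function is, according to the
general formula (30.11),

$$c(r) = \int \varphi_0\, \psi^0(\mathbf{p})\, d\Omega = \frac{1}{\sqrt{4\pi V}} \int_0^{\pi} e^{ikr\cos\theta}\, 2\pi \sin\theta\, d\theta = 2\sqrt{\frac{\pi}{V}}\, \frac{\sin kr}{kr}. \tag{37.18}$$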


$c(r)$ satisfies the radial wave equation for a free particle (31.6),
if we put $U = 0$, $l = 0$ in it. In actual fact, equation (31.7), which is
obtained from (31.5) by substituting $\psi = \chi/r$ when $l = 0$, $U = 0$, has
the solution

$$\chi = \sin \frac{\sqrt{2m\mathcal{E}}}{\hbar}\, r.$$

But $\sqrt{2m\mathcal{E}}/\hbar = k$, so that the function $\chi$ is proportional to $r\,c(r)$.
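The angular integral in (37.18) can be checked numerically. The sketch below, which assumes nothing beyond the Python standard library, evaluates the average of the plane-wave factor $e^{ikr\cos\theta}$ over the sphere by a midpoint rule and compares it with $\sin kr / kr$; the function name `s_wave_average` and the grid size are illustrative choices.

```python
import cmath
import math


def s_wave_average(kr, n=4000):
    """Midpoint-rule estimate of (1/2) * integral_0^pi exp(i kr cos t) sin t dt.

    Up to the constant factor sqrt(4 pi / V) this is the coefficient c(r)
    of (37.18), so the result should equal sin(kr) / kr.
    """
    h = math.pi / n
    total = 0j
    for j in range(n):
        t = (j + 0.5) * h
        total += cmath.exp(1j * kr * math.cos(t)) * math.sin(t)
    return 0.5 * h * total


if __name__ == "__main__":
    for kr in (0.5, 3.0, 10.0):
        print(kr, s_wave_average(kr).real, math.sin(kr) / kr)
```

The imaginary part of the average vanishes to within the accuracy of the quadrature, as it should for a function that is symmetric under $\theta \to \pi - \theta$.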

$c(r)$ tends to a finite limit when $r = 0$. This corresponds to the
boundary condition for a radial $\psi$-function at the coordinate origin.
Let there now be, close to the origin, a scattering field $U(r)$, which
diminishes so rapidly with distance that $U(r) = 0$ when $r \sim \lambda/2\pi$.
In its radial dependence, the s-state wave function satisfies, as before,
equations (31.5)-(31.7) for $r > \lambda/2\pi$, because no forces act upon the
particle in this region. The solution to (31.5), which is more general
than (37.18), is

$$c'(r) = A \cdot 2\sqrt{\frac{\pi}{V}}\, \frac{\sin(kr + \delta)}{kr}, \tag{37.19}$$

where $\delta$ is some phase shift depending upon the definite form of the
potential $U(r)$. Naturally, the solution (37.19) cannot be extended
to $r < \lambda/2\pi$, because the particle in this region is no longer free from the
action of forces.

We shall show that the effective scattering cross-section is expressed
in terms of $\delta$. An example of determining $\delta$ is given in exercises
2 and 3.

Determining the scattered wave. Let us suppose that, in some way,
an exact solution $\psi'$ of the wave equation (31.12) has become known
in a given scattering field $U(r)$. This solution must satisfy the same
conditions at infinity as a plane wave, because these conditions
correspond to the particle scattering problem. We represent the function
$\psi'$ in the form of the sum

$$\psi' = \psi^0(\mathbf{p}) + \psi_{\text{scat}}. \tag{37.20}$$




This equation separates the function $\psi^0(\mathbf{p})$, which corresponds to
the incident wave, from the complete solution of the wave equation.
The second term $\psi_{\text{scat}} = \psi' - \psi^0(\mathbf{p})$ describes the scattered particles.

If we now expand $\psi'$ and $\psi^0(\mathbf{p})$ in a series of eigenfunctions of the
square of the angular momentum, it turns out that a short-range force
field distorts only the term of $\psi'$ which corresponds to $l = 0$. All the
remaining terms of both functions are the same. As a result, when
$r > \lambda/2\pi$ we obtain

$$\psi_{\text{scat}} = [c'(r) - c(r)]\,\varphi_0 = 2\sqrt{\frac{\pi}{V}} \left[\frac{A\sin(kr+\delta) - \sin kr}{kr}\right] \frac{1}{\sqrt{4\pi}}. \tag{37.21}$$

But for large $r$ the scattered particles can move only away from the
scatterer. This means that $\psi_{\text{scat}}$ involves only the function $e^{ikr}/r$ and
does not contain the function $e^{-ikr}/r$. Indeed, if we put $\psi = e^{ikr}/r$ in
(24.20), then the flux $j$ will acquire a positive sign, while if we
substitute $e^{-ikr}/r$ the flux will be negative, i.e., incoming.

We write $\psi_{\text{scat}}$ in complex form:

$$\psi_{\text{scat}} = \frac{1}{2i\sqrt{V}\,kr}\left[A\left(e^{i(kr+\delta)} - e^{-i(kr+\delta)}\right) - \left(e^{ikr} - e^{-ikr}\right)\right]. \tag{37.22}$$

To exclude the incoming wave $e^{-ikr}$, we put $Ae^{-i\delta} = 1$, whence

$$A = e^{i\delta}, \tag{37.23}$$

$$\psi_{\text{scat}} = \frac{1}{2i\sqrt{V}}\,\frac{e^{ikr}}{kr}\left(e^{2i\delta} - 1\right). \tag{37.24}$$

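The effective scattering cross-section. From (24.20), the flux density
of the scattered particles at infinity is

$$j = \frac{\hbar k}{m}\, |\psi_{\text{scat}}|^2 = \frac{\hbar}{4mkVr^2}\, \left|e^{2i\delta} - 1\right|^2. \tag{37.25}$$

The total effective scattering cross-section $\sigma$ is equal to the whole
flux $4\pi r^2 j$ divided by the flux density of the incident particles
$\frac{v}{V} = \frac{\hbar k}{mV}$:

$$\sigma = \frac{\pi}{k^2}\, \left|e^{2i\delta} - 1\right|^2. \tag{37.26}$$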

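Passing to real quantities we write

$$\left|e^{2i\delta} - 1\right|^2 = \left(e^{2i\delta} - 1\right)\left(e^{-2i\delta} - 1\right) = 2 - 2\cos 2\delta = 4\sin^2\delta,$$

so that

$$\sigma = \frac{4\pi}{k^2}\, \sin^2\delta. \tag{37.27}$$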
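The passage from (37.26) to (37.27) is easy to confirm numerically. The following minimal sketch computes the s-wave cross-section in both forms; the function names and the units (any system in which $k$ is an inverse length) are illustrative assumptions.

```python
import cmath
import math


def sigma_from_amplitude(k, delta):
    """Cross-section in the complex form (37.26): (pi / k^2) |exp(2 i delta) - 1|^2."""
    return math.pi * abs(cmath.exp(2j * delta) - 1.0) ** 2 / k ** 2


def sigma_s_wave(k, delta):
    """Cross-section in the real form (37.27): (4 pi / k^2) sin^2(delta)."""
    return 4.0 * math.pi * math.sin(delta) ** 2 / k ** 2


if __name__ == "__main__":
    for delta in (0.1, 0.7, math.pi / 2):
        print(delta, sigma_from_amplitude(2.0, delta), sigma_s_wave(2.0, delta))
```

Both forms agree for any phase shift, and the largest value, reached at $\delta = \pi/2$, is $4\pi/k^2$.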




It can be seen from here that the greatest value of $\sigma$ for the
scattering of particles in the s-state is $4\pi/k^2$. This formula has many
applications in nuclear physics since nuclear forces are large and
short-range.

Since scattered particles occur in the s-state, their distribution in 
a centre-of-mass system is isotropic, i.e., it does not depend upon the 
scattering angle. This agrees with the statement made at the end of 
Sec. 6. 

Exercises 


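1) Find the effective scattering cross-section of fast particles by hydrogen
atoms in the ground state.

The wave function for the ground state of a hydrogen atom with $n = 1$,
$l = 0$ is

$$\psi_0 = B e^{-\xi} = B e^{-me^2 r/\hbar^2},$$

because, of the polynomial $\chi$, there remains only the first term, which does
not depend upon $\xi$. The coefficient $B$ is found from the normalisation
condition

$$\int_0^\infty \psi_0^2\, 4\pi r^2\, dr = 1.$$

The potential interaction energy of the charge $e$ with the atom is

$$U = -\frac{e^2}{r} + e^2 \int \frac{\psi_0^2(r')\, dV'}{|\mathbf{r} - \mathbf{r}'|}.$$

The first term of the integral $U_{\mathbf{k}\mathbf{k}'}$ was found in the text. It is

$$-\frac{\pi e^2}{V k^2 \sin^2\frac{\theta}{2}}.$$

We integrate the second term in the following way:

$$\int e^{i(\mathbf{k}-\mathbf{k}')\mathbf{r}}\, dV \int \frac{\psi_0^2(r')\, dV'}{|\mathbf{r}-\mathbf{r}'|} = \int \psi_0^2(r')\, e^{i(\mathbf{k}-\mathbf{k}')\mathbf{r}'}\, dV' \int \frac{e^{i(\mathbf{k}-\mathbf{k}')(\mathbf{r}-\mathbf{r}')}}{|\mathbf{r}-\mathbf{r}'|}\, dV.$$

In the last integral it is necessary to take the origin at the point $\mathbf{r}'$, so
that it reduces to the same form as (37.12):

$$\int \frac{e^{i(\mathbf{k}-\mathbf{k}')\mathbf{r}}}{r}\, dV = \frac{\pi}{k^2 \sin^2\frac{\theta}{2}}.$$

Hence,

$$U_{\mathbf{k}\mathbf{k}'} = -\frac{\pi e^2}{V k^2 \sin^2\frac{\theta}{2}} \left[1 - \int \psi_0^2(r')\, e^{i(\mathbf{k}-\mathbf{k}')\mathbf{r}'}\, dV'\right].$$

The quantity inside the brackets is called the screening factor. Evaluating
it in the same manner as (37.8), we obtain

$$\int \psi_0^2(r')\, e^{i(\mathbf{k}-\mathbf{k}')\mathbf{r}'}\, dV' = \frac{2\pi}{k \sin\frac{\theta}{2}} \int_0^\infty \psi_0^2(r') \sin\left(2kr' \sin\frac{\theta}{2}\right) r'\, dr'.$$

The integral reduces to the form

$$\int_0^\infty x \sin ax\; e^{-bx}\, dx = -\frac{\partial}{\partial b} \int_0^\infty \sin ax\; e^{-bx}\, dx = -\frac{\partial}{\partial b}\, \frac{a}{a^2+b^2} = \frac{2ab}{(a^2+b^2)^2}.$$

Here, $a = 2k\sin\frac{\theta}{2}$, $b = \frac{2me^2}{\hbar^2}$, so that the screening factor is

$$1 - \frac{1}{\left(1 + \frac{a^2}{b^2}\right)^2} = 1 - \frac{1}{\left[1 + \left(\frac{\hbar v}{e^2}\right)^2 \sin^2\frac{\theta}{2}\right]^2}.$$

It was assumed in the last transformation that the scattered particle is an
electron. Then, strictly speaking, we should have formed a function which
is antisymmetrical together with the function of the atomic electron; this
we did not do. The final formula for the effective scattering cross-section
differs from (37.13) by the square of the screening factor. We note that this
factor is correctly obtained only in the Born approximation, in contrast to
Rutherford's formula (37.13), which is exact.

For $\theta = 0$, the effective cross-section turns out to be finite, because $\theta = 0$
corresponds to large aiming distances, when the nuclear charge is screened
by the charge of an electron.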
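The elementary integral used in exercise 1 and the behaviour of the screening factor at small angles can be checked with a short script. This is only a sketch: the closed form $2ab/(a^2+b^2)^2$ is compared with a midpoint-rule quadrature, and the screening factor is written in terms of the ratio $a/b$; the function names, the cut-off `x_max`, and the grid size are assumptions made for the illustration.

```python
import math


def integral_closed_form(a, b):
    """Closed form of integral_0^inf x sin(a x) exp(-b x) dx."""
    return 2.0 * a * b / (a * a + b * b) ** 2


def integral_numeric(a, b, x_max=40.0, n=40000):
    """Midpoint rule on [0, x_max]; assumes b is of order 1 so the tail is negligible."""
    h = x_max / n
    total = 0.0
    for j in range(n):
        x = (j + 0.5) * h
        total += x * math.sin(a * x) * math.exp(-b * x)
    return h * total


def screening_factor(a_over_b):
    """Screening factor 1 - (1 + a^2 / b^2)^(-2) for the hydrogen ground state."""
    return 1.0 - 1.0 / (1.0 + a_over_b ** 2) ** 2


if __name__ == "__main__":
    print(integral_closed_form(1.5, 1.0), integral_numeric(1.5, 1.0))
    for t in (0.0, 0.1, 1.0, 10.0):
        print(t, screening_factor(t))
```

At small $a/b$ the factor behaves as $2a^2/b^2 \propto \sin^2(\theta/2)$, so its square cancels the $\sin^{-4}(\theta/2)$ of the Rutherford formula and the cross-section stays finite at $\theta = 0$.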

2) Calculate the effective scattering cross-section for a particle by an
impermeable sphere of radius $a$, which is very much less than $\frac{\lambda}{2\pi} = \frac{1}{k}$.

In accordance with (25.1), the wave function at the surface of the
impermeable sphere becomes zero. Hence, the solution (37.19) has the form

$$c'(r) = A \cdot 2\sqrt{\frac{\pi}{V}}\, \frac{\sin k(r - a)}{kr}.$$

From this, $\delta = -ka$, while

$$\sigma = \frac{4\pi}{k^2}\, \sin^2 ka.$$

But $ka \ll 1$ from the conditions, so that $\sin ka \approx ka$.

Finally, $\sigma = 4\pi a^2$, i.e., the effective scattering radius is twice the radius
of the sphere. In classical theory $\sigma = \pi a^2$ (see exercise 1, Sec. 6).
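A short numerical sketch of exercise 2: with the phase shift $\delta = -ka$, the s-wave cross-section tends to $4\pi a^2$ as $ka \to 0$, four times the classical value $\pi a^2$. The function name and the sample values are illustrative assumptions.

```python
import math


def hard_sphere_sigma(k, a):
    """s-wave cross-section (4 pi / k^2) sin^2(delta) with delta = -k a."""
    delta = -k * a
    return 4.0 * math.pi * math.sin(delta) ** 2 / k ** 2


if __name__ == "__main__":
    a = 1.0
    for k in (0.5, 0.1, 0.001):
        ratio = hard_sphere_sigma(k, a) / (math.pi * a * a)
        print(k, ratio)  # ratio to the classical cross-section pi a^2
```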

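3) Examine the scattering of particles with energy $\mathcal{E}$ in the s-state by a
spherical potential well of constant depth $|U_0|$ and radius $a$; consider that
there exist the following relations:

$$\frac{\mathcal{E}}{|U_0|} \ll \varepsilon \ll 1, \qquad |U_0| = \frac{\pi^2\hbar^2}{8ma^2}\,(1 + \varepsilon)$$

(see Sec. 26). For $\varepsilon > 0$, there exists in the well a bound state of a particle
of energy close to the upper boundary of the well. Unlike Fig. 38, it is
assumed here that $U$ becomes zero for $r > a$. Express the cross-section in
terms of the energy level $\mathcal{E}_0$.

The conjugation condition for the wave functions with $r = a$ is of the form

$$\frac{k \cos(ka + \delta)}{\sin(ka + \delta)} = \frac{\varkappa \cos \varkappa a}{\sin \varkappa a},$$

where $\varkappa = \frac{\sqrt{2m(\mathcal{E} + |U_0|)}}{\hbar}$, $k = \frac{\sqrt{2m\mathcal{E}}}{\hbar}$. Neglecting $ka$, we obtain
an expression for the effective cross-section:

$$\sigma = \frac{4\pi}{k^2}\, \sin^2\delta = \frac{4\pi}{k^2(1 + \cot^2\delta)} = \frac{4\pi}{k^2 + \varkappa^2 \cot^2 \varkappa a}.$$

In accordance with the conditions imposed upon $\mathcal{E}$, $|U_0|$ we have,
approximately,

$$\cot \varkappa a \approx -\frac{\pi\varepsilon}{4}.$$

From (25.41) the condition for finding the level $\mathcal{E}_0$ is of the form

$$\varkappa_0 \cot \varkappa_0 a = -\tilde{\varkappa}_0,$$

where $\varkappa_0 = \frac{\sqrt{2m(|U_0| - |\mathcal{E}_0|)}}{\hbar}$. Supposing that $\frac{|\mathcal{E}_0|}{|U_0|} \ll \varepsilon$, we obtain

$$\tilde{\varkappa}_0 = \frac{\pi^2\varepsilon}{8a}, \quad \text{i.e.,} \quad \frac{|\mathcal{E}_0|}{|U_0|} = \frac{\pi^2\varepsilon^2}{16}.$$

This confirms the assumption concerning the order of smallness of
$|\mathcal{E}_0|/|U_0|$, since $\varepsilon \gg \varepsilon^2$. For the cross-section we finally have

$$\sigma = \frac{4\pi}{k^2 + \tilde{\varkappa}_0^2},$$

where $\tilde{\varkappa}_0 = \frac{\sqrt{2m|\mathcal{E}_0|}}{\hbar}$. We note that the formula obtained also holds
for $\varepsilon < 0$, when in actuality there is no level at all in the well. In this case
we talk about a "virtual" level. The straight line in Fig. 39 intersects the
first half-cycle of the sinusoid just before $\varkappa a = \pi/2$.

A similar case occurs when neutrons are scattered by protons with
antiparallel particle spins.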
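The approximate formula of exercise 3 can be compared with the cross-section obtained directly from the matching condition. The sketch below works in units $\hbar = m = a = 1$ (an assumption made only for the illustration), finds the bound-state decay constant by bisection, and evaluates both expressions; the bracket used for the bisection, the value $\varepsilon = 0.05$, and all function names are hypothetical choices.

```python
import math


def cot(x):
    return math.cos(x) / math.sin(x)


def bound_state_decay_constant(u0, a=1.0):
    """Solve kappa0 * cot(kappa0 * a) = -q for q, where kappa0 = sqrt(2 u0 - q^2).

    Units hbar = m = 1. The bracket [0, 0.5] is an assumption that holds
    for a well only slightly deeper than the threshold depth pi^2 / 8.
    """
    def f(q):
        kappa0 = math.sqrt(2.0 * u0 - q * q)
        return kappa0 * cot(kappa0 * a) + q

    lo, hi = 0.0, 0.5
    if f(lo) >= 0.0 or f(hi) <= 0.0:
        raise ValueError("no sign change in the assumed bracket")
    for _ in range(200):  # plain bisection
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sigma_matched(k, u0, a=1.0):
    """Cross-section 4 pi / (k^2 + kappa^2 cot^2(kappa a)), with k a neglected."""
    kappa = math.sqrt(2.0 * u0 + k * k)
    return 4.0 * math.pi / (k * k + (kappa * cot(kappa * a)) ** 2)


def sigma_level(k, q):
    """Cross-section expressed through the level: 4 pi / (k^2 + q^2)."""
    return 4.0 * math.pi / (k * k + q * q)


if __name__ == "__main__":
    eps = 0.05
    u0 = math.pi ** 2 / 8.0 * (1.0 + eps)
    q = bound_state_decay_constant(u0)
    print(q, math.pi ** 2 * eps / 8.0)
    print(sigma_matched(0.05, u0), sigma_level(0.05, q))
```

For small $\varepsilon$ and $\mathcal{E}/|U_0| \ll \varepsilon$ the two cross-sections agree to within a few per cent, and both greatly exceed the geometrical value $\pi a^2$.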


Sec. 38. The Relativistic Wave Equation for an Electron


The equation for a spinless particle. Schrödinger's equation (24.11)
is formed on the basis of the nonrelativistic relationship between
energy and momentum

$$\mathcal{E} = \frac{p^2}{2m} + U.$$

Therefore, it can be applied only to electrons whose velocity is
considerably less than that of light, and whose kinetic energy is considerably
less than the rest energy:




$$T = \frac{p^2}{2m} \ll mc^2.$$


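Immediately after Schrödinger obtained the nonrelativistic equation,
the first attempts were made to build a relativistic wave equation
(Fock, Klein, Gordon). In formula (21.30)

$$(\mathcal{E} - e\varphi)^2 = c^2\left(\mathbf{p} - \frac{e}{c}\mathbf{A}\right)^2 + m^2c^4$$

$-\frac{\hbar}{i}\frac{\partial}{\partial t}$ was substituted in place of $\mathcal{E}$, as is usual in quantum
mechanics, and in place of $\mathbf{p}$, the operator $\frac{\hbar}{i}\nabla$. In this way a wave equation
was obtained in relativistically invariant form

$$\left(\frac{\hbar}{i}\frac{\partial}{\partial t} + e\varphi\right)\left(\frac{\hbar}{i}\frac{\partial}{\partial t} + e\varphi\right)\psi = c^2\left(\frac{\hbar}{i}\nabla - \frac{e}{c}\mathbf{A}\right)\left(\frac{\hbar}{i}\nabla - \frac{e}{c}\mathbf{A}\right)\psi + m^2c^4\psi, \tag{38.1}$$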

which equation, however, is not applicable to electrons. The fact of 
the matter is that equation (38.1) does not take into account the spin 
of the electron, because it involves only a single wave function. Yet 
in Sec. 32 we saw that a particle with spin 1/2 must be described by 
at least two wave functions. These two functions could be introduced 
into the nonrelativistic equation purely formally, assuming that each 
of them satisfies it. But the interaction of spin and orbit is a relativ¬ 
istic effect; therefore, a correct equation for fast electrons must take 
it into account automatically, without any additional hypotheses 
concerning spin magnetic moment. This equation must involve op¬ 
erators which act upon the spin degree of freedom. 

The inapplicability of equation (38.1) to the electron was very
quickly seen; the fine structure of the levels of the hydrogen atom
obtained from this equation was incorrect. A nonspin equation cannot
explain, first of all, the number of splitting components; this is
decisively against it.

Charged particles without spin (mesons) take part in nuclear
interactions. Equation (38.1) can be applied to them, at least if it is
shown that such mesons can be regarded, to some sort of approximation,
separately from protons and neutrons.

But for electrons one has to form a relativistic wave equation that 
takes spin into account. Such an equation was obtained by Dirac. 

The Dirac equation. Following the line of Dirac's argument, we
begin with the equation for a free electron. The starting relationship is

$$\mathcal{E} = \sqrt{c^2p^2 + m^2c^4} = \sqrt{c^2(p_x^2 + p_y^2 + p_z^2) + m^2c^4}. \tag{38.2}$$




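Instead of $\mathcal{E}$ and $\mathbf{p}$ we must substitute the derivatives $-\frac{\hbar}{i}\frac{\partial}{\partial t}$ and
$\frac{\hbar}{i}\nabla$. However, to do this it is necessary to define the meaning of the
square root of an operator. Dirac supposed that, in the operator sense,
a root is equal to an expression like

$$\sqrt{c^2(p_x^2 + p_y^2 + p_z^2) + m^2c^4} = c\,(\hat\alpha_x p_x + \hat\alpha_y p_y + \hat\alpha_z p_z) + \hat\beta mc^2, \tag{38.3}$$

where $\hat\alpha_x$, $\hat\alpha_y$, $\hat\alpha_z$, and $\hat\beta$ act on the internal degrees of freedom of the
electron such as, for example, the spin degree of freedom.

Let us square both sides of the equation and attempt to choose the
operators $\hat\alpha_x$, $\hat\alpha_y$, $\hat\alpha_z$, and $\hat\beta$ in such a way as to obtain an identity, i.e.,
so as to eliminate terms of the type $p_xp_y, \ldots, mc^3p_x, \ldots$:

$$\begin{aligned}
c^2(p_x^2 + p_y^2 + p_z^2) + m^2c^4 ={}& c^2(\hat\alpha_x^2p_x^2 + \hat\alpha_y^2p_y^2 + \hat\alpha_z^2p_z^2) + m^2c^4\hat\beta^2 \\
&+ c^2(\hat\alpha_x\hat\alpha_y + \hat\alpha_y\hat\alpha_x)\,p_xp_y + c^2(\hat\alpha_x\hat\alpha_z + \hat\alpha_z\hat\alpha_x)\,p_xp_z + c^2(\hat\alpha_y\hat\alpha_z + \hat\alpha_z\hat\alpha_y)\,p_yp_z \\
&+ mc^3(\hat\alpha_x\hat\beta + \hat\beta\hat\alpha_x)\,p_x + mc^3(\hat\alpha_y\hat\beta + \hat\beta\hat\alpha_y)\,p_y + mc^3(\hat\alpha_z\hat\beta + \hat\beta\hat\alpha_z)\,p_z.
\end{aligned}$$

Hence, the operators must be subject to the conditions

$$\begin{aligned}
&\hat\alpha_x^2 = \hat\alpha_y^2 = \hat\alpha_z^2 = \hat\beta^2 = 1, \\
&\hat\alpha_x\hat\alpha_y + \hat\alpha_y\hat\alpha_x = \hat\alpha_x\hat\alpha_z + \hat\alpha_z\hat\alpha_x = \hat\alpha_y\hat\alpha_z + \hat\alpha_z\hat\alpha_y = 0, \\
&\hat\alpha_x\hat\beta + \hat\beta\hat\alpha_x = \hat\alpha_y\hat\beta + \hat\beta\hat\alpha_y = \hat\alpha_z\hat\beta + \hat\beta\hat\alpha_z = 0.
\end{aligned} \tag{38.4}$$

These operator equalities greatly resemble the spin operator
relationships (32.10), (32.11). It can already be seen from this that the
operators $\hat\alpha_x$, $\hat\alpha_y$, $\hat\alpha_z$, and $\hat\beta$ at least act upon the spin degree of freedom of
an electron. To the accuracy of the factor 1/4, the relations (38.4) agree
with (34.11) and (32.13) for $\hat\sigma_x$, $\hat\sigma_y$, and $\hat\sigma_z$.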

But the operators $\hat\alpha$ and $\hat\sigma$ are not identical. This can easily be seen
by proceeding from the opposite: assume that $\hat\alpha_x = \hat\sigma_x$, $\hat\alpha_y = \hat\sigma_y$, $\hat\alpha_z = \hat\sigma_z$.
In order to obtain the wave equation we must equate the right-hand
side of (38.3) to $-\frac{\hbar}{i}\frac{\partial}{\partial t}$. Let us perform an inversion of the
coordinate system. All momentum components will change sign so
that the sign of $p_z$ will also change. But the operator $\hat\alpha_z$ in front of $p_z$,
if it equals $\hat\sigma_z$, should not interchange the wave-function components.
Therefore, the equality between the left- and right-hand sides of the
wave equation breaks down when the coordinate system is inverted;
but this should not be. Therefore, $\hat\alpha \neq \hat\sigma$.

The necessity for a four-component wave function. We shall be
confronted with the same difficulty if we consider that the operators
$\hat\alpha_x$, $\hat\alpha_y$, $\hat\alpha_z$ act upon the same wave functions as those appearing with
$-\frac{\hbar}{i}\frac{\partial}{\partial t}$ on the other side of the equation. Therefore, we must assume

that coordinate differentiation, on the one hand, and time differentia¬ 
tion, on the other, are applied to different pairs of functions, of which 
one pair changes sign in inversion of the coordinate system while the 
other does not. This is sufficient to ensure invariance of the equation 
with respect to inversion. 

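Thus, we shall say that the wave functions depend not only upon
the spin variable $\sigma$, but also upon some other internal variable $\rho$,
which also takes on two values.

Let us define the operators which are completely analogous to spin
operators and which act upon the variable $\sigma$ and upon the variable $\rho$.
Bearing in mind that the factor 1/2, by which the spin operators
were multiplied in Sec. 32, is no longer needed, we shall write
$\hat\sigma_1$, $\hat\sigma_2$, $\hat\sigma_3$ and, correspondingly, $\hat\rho_1$, $\hat\rho_2$, and $\hat\rho_3$ for the variable $\rho$. These
operate upon the wave function analogously:

$$\hat\sigma_1 \begin{pmatrix} \psi(1,\rho) \\ \psi(2,\rho) \end{pmatrix} = \begin{pmatrix} \psi(2,\rho) \\ \psi(1,\rho) \end{pmatrix}; \quad
\hat\sigma_2 \begin{pmatrix} \psi(1,\rho) \\ \psi(2,\rho) \end{pmatrix} = \begin{pmatrix} -i\psi(2,\rho) \\ i\psi(1,\rho) \end{pmatrix}; \quad
\hat\sigma_3 \begin{pmatrix} \psi(1,\rho) \\ \psi(2,\rho) \end{pmatrix} = \begin{pmatrix} \psi(1,\rho) \\ -\psi(2,\rho) \end{pmatrix}; \tag{38.5}$$

$$\hat\rho_1 \begin{pmatrix} \psi(\sigma,1) \\ \psi(\sigma,2) \end{pmatrix} = \begin{pmatrix} \psi(\sigma,2) \\ \psi(\sigma,1) \end{pmatrix}; \quad
\hat\rho_2 \begin{pmatrix} \psi(\sigma,1) \\ \psi(\sigma,2) \end{pmatrix} = \begin{pmatrix} -i\psi(\sigma,2) \\ i\psi(\sigma,1) \end{pmatrix}; \quad
\hat\rho_3 \begin{pmatrix} \psi(\sigma,1) \\ \psi(\sigma,2) \end{pmatrix} = \begin{pmatrix} \psi(\sigma,1) \\ -\psi(\sigma,2) \end{pmatrix}. \tag{38.6}$$

From formulae (32.10) and (32.11) we find the basic relations for the
operators $\hat\sigma$ and $\hat\rho$:

$$\begin{aligned}
&\hat\sigma_1^2 = \hat\sigma_2^2 = \hat\sigma_3^2 = 1; \\
&\hat\sigma_1\hat\sigma_2 = -\hat\sigma_2\hat\sigma_1 = i\hat\sigma_3; \quad \hat\sigma_2\hat\sigma_3 = -\hat\sigma_3\hat\sigma_2 = i\hat\sigma_1; \quad \hat\sigma_3\hat\sigma_1 = -\hat\sigma_1\hat\sigma_3 = i\hat\sigma_2,
\end{aligned} \tag{38.7}$$

and similarly for $\hat\rho$.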

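All the operators $\hat\sigma$ are commutative with all the operators $\hat\rho$
because they operate upon different variables. In order that the
operators $\hat\alpha_x$, $\hat\alpha_y$, $\hat\alpha_z$ together should satisfy the "anticommutative"
relations with the operator $\hat\beta$, as in the third line of (38.4), we arrange
that all three components of $\hat\alpha$ are proportional to one operator of the
components of $\hat\rho$, for example, $\hat\rho_1$, and $\hat\beta$ is simply $\hat\rho_3$. We notice that
$\hat\rho_1$ interchanges the functions while $\hat\rho_3$ does not. To have the operators
$\hat\alpha_x$, $\hat\alpha_y$, and $\hat\alpha_z$ anticommutative, as in the second line of (38.4), we
put them proportional to $\hat\sigma_1$, $\hat\sigma_2$, $\hat\sigma_3$, respectively. Thus,

$$\hat\alpha_x = \hat\rho_1\hat\sigma_1, \quad \hat\alpha_y = \hat\rho_1\hat\sigma_2, \quad \hat\alpha_z = \hat\rho_1\hat\sigma_3, \quad \hat\beta = \hat\rho_3. \tag{38.8}$$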




It is obvious that, as a result of the commutative nature of $\hat\rho$ and $\hat\sigma$
and of the definition of the operators (38.5) and (38.6), all the
operators formed in (38.8) satisfy the conditions (38.4).
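That the operators (38.8) satisfy the conditions (38.4) can also be verified by direct multiplication of matrices. The sketch below is one possible check in plain Python: it represents the operators acting on $\sigma$ and $\rho$ by 2×2 matrices of the Pauli type, builds the four-row operators as Kronecker products with $\rho$ as the slowly varying index (which matches the numbering $\psi_1 = \psi(1,1)$, $\psi_2 = \psi(2,1)$, $\psi_3 = \psi(1,2)$, $\psi_4 = \psi(2,2)$), and tests every relation of (38.4). All helper names are illustrative assumptions.

```python
# 2x2 matrices of the Pauli type, written without the factor 1/2.
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]
I2 = [[1, 0], [0, 1]]


def kron(r, s):
    """Kronecker product: r acts on the slow index rho, s on the fast index sigma."""
    return [[r[i][j] * s[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] + ba[i][j] for j in range(n)] for i in range(n)]


def close(a, b, tol=1e-12):
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))


# The choice (38.8): alpha = rho_1 * sigma, beta = rho_3.
ALPHA_X = kron(S1, S1)
ALPHA_Y = kron(S1, S2)
ALPHA_Z = kron(S1, S3)
BETA = kron(S3, I2)


def satisfies_38_4():
    """Return True if all squares are unity and all distinct pairs anticommute."""
    ops = [ALPHA_X, ALPHA_Y, ALPHA_Z, BETA]
    identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
    zero = [[0] * 4 for _ in range(4)]
    for i, a in enumerate(ops):
        if not close(matmul(a, a), identity):
            return False
        for b in ops[i + 1:]:
            if not close(anticommutator(a, b), zero):
                return False
    return True


if __name__ == "__main__":
    print(satisfies_38_4())
```

Replacing `S1` by `S2` in the first factor of the three products gives the alternative choice with $\hat\rho_2$, which passes the same check.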

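Thus, the wave function in the Dirac equation has four components,
according to the number of values for $\sigma$ and $\rho$ ($\sigma = 1, 2$; $\rho = 1, 2$).
For convenience in future we shall number the symbols $\sigma$, $\rho$ from one
to four, putting

$$\psi_1 = \psi(1,1), \quad \psi_2 = \psi(2,1), \quad \psi_3 = \psi(1,2), \quad \psi_4 = \psi(2,2).$$

It is convenient to write the functions $\psi_1$, $\psi_2$, $\psi_3$, $\psi_4$ in columns, as in
Sec. 32. Now using (38.5) and (38.6), we find out how the operators
$\hat\alpha_x$, $\hat\alpha_y$, $\hat\alpha_z$, and $\hat\beta$ act upon the four-component wave function

$$\psi = \begin{pmatrix} \psi(1,1) \\ \psi(2,1) \\ \psi(1,2) \\ \psi(2,2) \end{pmatrix} = \begin{pmatrix} \psi_1 \\ \psi_2 \\ \psi_3 \\ \psi_4 \end{pmatrix}:$$

$$\hat\alpha_x\psi = \hat\rho_1\hat\sigma_1\psi = \hat\rho_1 \begin{pmatrix} \psi(2,1) \\ \psi(1,1) \\ \psi(2,2) \\ \psi(1,2) \end{pmatrix} = \begin{pmatrix} \psi(2,2) \\ \psi(1,2) \\ \psi(2,1) \\ \psi(1,1) \end{pmatrix} = \begin{pmatrix} \psi_4 \\ \psi_3 \\ \psi_2 \\ \psi_1 \end{pmatrix};$$

$$\hat\alpha_y\psi = \begin{pmatrix} -i\psi_4 \\ i\psi_3 \\ -i\psi_2 \\ i\psi_1 \end{pmatrix}; \quad
\hat\alpha_z\psi = \begin{pmatrix} \psi_3 \\ -\psi_4 \\ \psi_1 \\ -\psi_2 \end{pmatrix}; \quad
\hat\beta\psi = \begin{pmatrix} \psi_1 \\ \psi_2 \\ -\psi_3 \\ -\psi_4 \end{pmatrix}. \tag{38.9}$$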


The choice of operators (38.8) and (38.9) is not unique. It is possible
to form other operators with the same properties. For example, one
could have chosen $\hat\rho_2$ instead of $\hat\rho_1$. We shall examine below (exercise 4)
the implications of this fact.

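The Dirac equation in expanded form. Summarizing, according to
(38.2), (38.3) and (38.8) Dirac's equation can be written as

$$-\frac{\hbar}{i}\frac{\partial\psi}{\partial t} = c\,(\hat{\boldsymbol\alpha}\hat{\mathbf{p}})\,\psi + mc^2\hat\beta\psi. \tag{38.10}$$

In accordance with (38.9), this equation must be understood as a
system of four equations, which we shall write explicitly, first of all
replacing $-\frac{\hbar}{i}\frac{\partial\psi}{\partial t}$ by $\mathcal{E}\psi$ (because $\psi$ is proportional to the factor
$e^{-i\mathcal{E}t/\hbar}$):

$$\begin{aligned}
\mathcal{E}\psi_1 &= c\,(p_x\psi_4 - ip_y\psi_4 + p_z\psi_3) + mc^2\psi_1, \\
\mathcal{E}\psi_2 &= c\,(p_x\psi_3 + ip_y\psi_3 - p_z\psi_4) + mc^2\psi_2, \\
\mathcal{E}\psi_3 &= c\,(p_x\psi_2 - ip_y\psi_2 + p_z\psi_1) - mc^2\psi_3, \\
\mathcal{E}\psi_4 &= c\,(p_x\psi_1 + ip_y\psi_1 - p_z\psi_2) - mc^2\psi_4.
\end{aligned} \tag{38.11}$$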


As usual, here $p_x$, $p_y$, $p_z$ are the components of the vector $\frac{\hbar}{i}\nabla$, i.e.,
$\frac{\hbar}{i}\frac{\partial}{\partial x}$, $\frac{\hbar}{i}\frac{\partial}{\partial y}$, $\frac{\hbar}{i}\frac{\partial}{\partial z}$. The system of equations (38.11) is applied to a
free electron since it does not involve the scalar potential and
components of the vector potential. Therefore, the coordinate dependence of
the wave function is determined by the factor $e^{i(\mathbf{p}\mathbf{r})/\hbar}$,
so that the whole group of four $\psi$ has the form

$$\psi = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{pmatrix} e^{\,i(\mathbf{p}\mathbf{r})/\hbar - i\mathcal{E}t/\hbar}, \tag{38.12}$$

where the amplitudes $a_1$, $a_2$, $a_3$, and $a_4$ do not depend upon coordinates.
The action of the operators $\hat p_x$, $\hat p_y$, $\hat p_z$ on this group of four $\psi$ leads
simply to a multiplication of its components by $p_x$, $p_y$, $p_z$.
Consequently, the differential equation (38.10) for a free electron leads
to the algebraic equation

$$\mathcal{E}a = c\,(\hat{\boldsymbol\alpha}\mathbf{p})\,a + mc^2\hat\beta a, \tag{38.13}$$

where by $a$ is understood the whole column

$$a = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{pmatrix}.$$

Here, operator properties are preserved only by $\hat\alpha_x$, $\hat\alpha_y$, $\hat\alpha_z$, and $\hat\beta$,
which rearrange the amplitudes $a$ in the same way as the functions $\psi$.
In other words, the amplitudes depend upon the internal variables $\sigma$
and $\rho$.

Energy eigenvalues. We apply the operator

$$\hat{\mathcal{E}} = c\,(\hat{\boldsymbol\alpha}\mathbf{p}) + mc^2\hat\beta \tag{38.14}$$

to both sides of equality (38.13). As a result of the anticommutation
properties of the Dirac operators (38.4), only their squares, equal to
unity, remain on the right. Hence, the following equation will result:

$$\mathcal{E}^2 a = c^2p^2 a + m^2c^4 a. \tag{38.15}$$

Here, the components of $a$ are no longer interchanged because there
are numbers in front of $a$, and not operators. All four equations (38.15)
have the same form $\mathcal{E}^2 a_1 = c^2p^2 a_1 + m^2c^4 a_1$, etc. For these equalities
to be satisfied we must subject the energy to the usual relativistic
relationship, i.e., we cancel $a_1$:

$$\mathcal{E}^2 = c^2p^2 + m^2c^4. \tag{38.16}$$
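The step from (38.13) to (38.15) can be illustrated by applying the right-hand side of the system (38.11) twice to an arbitrary set of amplitudes. The sketch below is a minimal check under assumed units ($m = c = 1$ by default) and with arbitrary test values; the function names are illustrative. It also shows that $(\hat{\mathcal{E}} \pm \mathcal{E})\,a$ gives amplitudes belonging to the energies $\pm\mathcal{E}$.

```python
import math


def dirac_rhs(a, p, m=1.0, c=1.0):
    """Right-hand side of (38.11) acting on the amplitudes a = (a1, a2, a3, a4)."""
    px, py, pz = p
    a1, a2, a3, a4 = a
    mc2 = m * c * c
    return (
        c * (px * a4 - 1j * py * a4 + pz * a3) + mc2 * a1,
        c * (px * a3 + 1j * py * a3 - pz * a4) + mc2 * a2,
        c * (px * a2 - 1j * py * a2 + pz * a1) - mc2 * a3,
        c * (px * a1 + 1j * py * a1 - pz * a2) - mc2 * a4,
    )


def energy(p, m=1.0, c=1.0):
    """Positive root of E^2 = c^2 p^2 + m^2 c^4."""
    px, py, pz = p
    return math.sqrt(c * c * (px * px + py * py + pz * pz) + m * m * c ** 4)


def eigen_amplitudes(p, sign=+1, m=1.0, c=1.0):
    """Amplitudes (H + sign * E) v for the trial column v = (1, 0, 0, 0)."""
    v = (1.0, 0.0, 0.0, 0.0)
    e = energy(p, m, c)
    hv = dirac_rhs(v, p, m, c)
    return tuple(h + sign * e * x for h, x in zip(hv, v))


if __name__ == "__main__":
    p = (0.3, -1.2, 2.0)  # arbitrary test momentum
    a = (1.0, 2j, -0.5, 0.25 + 1j)  # arbitrary test amplitudes
    twice = dirac_rhs(dirac_rhs(a, p), p)
    print([t / x for t, x in zip(twice, a)], energy(p) ** 2)
```

Applying the operator twice multiplies every amplitude by the same number $c^2p^2 + m^2c^4$, in agreement with (38.15).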




The last equality refers simply to magnitudes. Thus the energy
eigenvalue for a free electron, determined from the Dirac equation, is

$$\mathcal{E} = \pm\sqrt{c^2p^2 + m^2c^4}. \tag{38.17}$$

The two signs in the equality correspond to the internal degree
of freedom which an electron possesses in addition to spin. Only the
plus sign is taken in classical mechanics, since free electrons do not
have negative energy. The square root in (38.17) is not less than
$mc^2$ in absolute value, so that a region of width $2mc^2$ exists in which
the energy cannot occur. But all the quantities in classical equations
vary continuously; therefore, once the energy has been defined with
a positive sign, it cannot jump across this forbidden region of width
$2mc^2$, and remains positive all the time. In other words, energy which
is defined as positive in the initial conditions remains positive from the
equations of motion.

Negative kinetic energies in the Dirac theory. In quantum theory it
is shown that discontinuous transitions, too, are possible between
different states. For example, an electron with energy greater than
$mc^2$ could emit a light quantum and remain with an energy less than
$-mc^2$. But such electrons with negative energy and mass are not
observed in nature. Their properties would be very strange: upon
radiating light, they would each time reduce their energy and, as it
were, drop into a state with $\mathcal{E} = -\infty$. All the electrons in the universe
would rather quickly fall into this state; but, as we see, this has not
happened!

Thus, Dirac’s equation admits of the possibility of states, which 
cannot simply be excluded, because electrons may transfer to them 
from other observable states. But, on the other hand, there are no 
electrons in nature with negative energy $-\sqrt{m^2c^4 + c^2p^2}$. At the
same time, the Dirac equation describes quite correctly a great assem¬ 
blage of electron properties: as we shall soon see, it yields a relationship 
between the spin and magnetic moment of an electron that agrees with 
experiment, it leads to an accurate formula for the fine-structure levels 
of the hydrogen atom, etc. In addition, mathematical investigations 
show that there is no essentially different relativistically invariant 
wave equation for a particle with spin one half and mass differing from 
zero. Therefore, one should not simply reject the Dirac equation; it 
is better to attempt to supplement it with some kind of hypothesis. 

Vacuum in the Dirac theory. Dirac suggested that vacuum should be 
redefined. Earlier, vacuum was understood as being a state of matter 
in which there are no charges, say electrons. A vacuum must now be 
called that state in which all negative energy levels are occupied by 
electrons. That this redefinition is not verbal but has a physical mean¬ 
ing will be seen very soon from what follows. 




If all the negative energy levels are occupied, then, in accordance 
with the Pauli principle, no electrons can transfer to them from positive 
energy states. Thus, the Pauli principle is necessary for relativistic 
quantum theory to be able, in general, to describe the properties of 
electrons. This is the basic reason why the Pauli principle is necessary 
as an element of quantum mechanics. In order to avoid misunder¬ 
standing we shall give a somewhat fuller definition of a “vacuum” 
in field theory, which has a rather different meaning from that in 
experimental physics. In field theory, vacuum signifies the ground 
state of the field; for an electromagnetic field, for example, it is the 
state of the field in which there are no quanta. We saw in Sec. 27 that
this state is endowed with observable physical properties. 

In exactly the same way, if all the negative energy states are occu¬ 
pied, then all the remaining electrons can no longer reduce their energy 
by making a transition to negative states. When there are no electrons 
with positive energy, electrons can no longer reduce their energy in 
any way if all the negative energy states are occupied. This explains 
the definition of vacuum as a ground state. 

Pair production. All observed phenomena occur, so to speak, on the
background of a state in which the negative kinetic-energy levels are
filled.

However, this "background" can manifest its existence in a real
physical process. Close to a nucleus, a quantum with energy greater
than $2mc^2$ is capable of effecting the emission of an electron from a
negative-energy to a positive-energy state. Proximity to the nucleus is
necessary so as to satisfy the law of conservation of momentum. For
the proof of this simple statement, see exercise 1.
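The role of the nucleus can be illustrated with a small kinematic estimate. The sketch below is not the proof referred to in the exercise; it is a simple argument under stated assumptions: the quantum has energy $\mathcal{E}_\gamma$ and momentum $\mathcal{E}_\gamma/c$, the pair takes the whole energy (the recoil energy of the nucleus is neglected), and the largest total momentum of the pair is reached when both particles move in the same direction with equal energies. The rest energy 0.511 MeV is an approximate value, and the function names are illustrative.

```python
import math

ELECTRON_REST_ENERGY_MEV = 0.511  # approximate m c^2 of the electron, in MeV


def pair_threshold(mc2=ELECTRON_REST_ENERGY_MEV):
    """Smallest quantum energy that can lift an electron across the gap: 2 m c^2."""
    return 2.0 * mc2


def max_pair_momentum_c(photon_energy, mc2=ELECTRON_REST_ENERGY_MEV):
    """Largest value of c * |p_total| for a pair with total energy photon_energy.

    For collinear particles with equal energies this equals
    sqrt(E^2 - (2 m c^2)^2), which is always less than E.
    """
    if photon_energy < pair_threshold(mc2):
        raise ValueError("photon energy is below the pair-production threshold")
    return math.sqrt(photon_energy ** 2 - (2.0 * mc2) ** 2)


def momentum_left_for_nucleus_c(photon_energy, mc2=ELECTRON_REST_ENERGY_MEV):
    """Least momentum (times c) that a third body must take up."""
    return photon_energy - max_pair_momentum_c(photon_energy, mc2)


if __name__ == "__main__":
    print(pair_threshold())
    for e_gamma in (1.5, 5.0, 50.0):
        print(e_gamma, momentum_left_for_nucleus_c(e_gamma))
```

The momentum deficit is positive at every energy above the threshold, so a single quantum cannot form a pair in free space; a nearby nucleus can absorb the difference while taking up very little energy.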

But after an electron has been removed from a negative energy 
state there will remain a “hole,” i.e., an unoccupied level. In an elec¬ 
tric field, electrons with negative mass (mass has the same sign as 
energy) do not move against the field, towards the anode, but along 
the field, to the cathode, against the applied force. And with them 
moves the hole which thus behaves like an electron of positive charge 
and mass. 

Experiment will show that as a result of the ejection of an electron
from a negative energy state, two charges have appeared: negative
and positive. Such a positive electron, or positron, was discovered by
Anderson after Dirac had formulated his theory of the background.
The attitude towards the Dirac equation was somewhat suspicious
before the discovery of the positron, while the idea of background was
considered far-fetched and intended only to hide the defects of the
theory.

In actual fact Dirac's theory is an unusual example of scientific
foresight. The discovery of the "antiproton" (i.e., a proton with
negative charge) by Segrè once again confirmed the generality of
Dirac's conceptions concerning particles with spin 1/2.




Pair annihilation. When a positron and electron meet they can
annihilate each other if the electron transfers to an unoccupied level
belonging to a negative energy state. Its energy will be imparted to
electromagnetic radiation in the form of two or three quanta. A single
quantum cannot result when annihilation occurs in free space because,
in this case, momentum is not conserved, in the same way that a single
quantum cannot form a pair in free space. Single-quantum
annihilation, too, is possible in a nuclear field.

The Dirac equation and quantum electrodynamics. Modern quantum 
electrodynamics is based upon the quantum theory of the electro¬ 
magnetic field (Sec. 27) and the Dirac electron theory, with account 
taken of direct and reverse transitions from negative energy to posi¬ 
tive energy. It actually describes an electron and a positron in a com¬ 
pletely symmetrical way: the nonsymmetry of the charge that has 
appeared in our terminology as a result of the fact that the positron 
was defined as a “hole,” is only apparent. The background of elec¬ 
trons with negative energy may be, as it were, subtracted from the 
equations without altering the physical content of the theory, and so 
the equations become symmetrical with respect to the sign of the 
charge. The concept of particles and antiparticles is extended also to 
particles without spin (for example, $\pi^+$- and $\pi^-$-mesons).

It is characteristic that the relativistic quantum theory describes 
change in the number of particles: electrodynamics treats of ab¬ 
sorption and emission of quanta, electron theory is concerned with 
the creation and annihilation of pairs. 

The electron-positron-photon field. Electrons together with a field
form a sort of unified electron-positron-electromagnetic field. Insofar
as interaction exists between electrons and positrons, on the one
hand, and quanta, on the other, the division of a unified field into
charges and quanta becomes, in a certain sense, artificial and, in any
case, approximate. The strength of interaction between field and
charge is determined by the dimensionless parameter

$$\frac{e^2}{\hbar c} = \frac{1}{137}.$$

Since this parameter is a rather small number, the approximation 
arising from the separated charges and field seems satisfactory. 

As was mentioned in Sec. 27, the theory still has certain difficulties.
The most difficult problem, even the approach to which is not
known, consists in an explanation of the number $\frac{1}{137}$: since it is an
abstract number, the theory should, in principle, derive it from certain
general physical principles. But these principles have not yet been
formulated.

Nevertheless, all problems in which quantum electrodynamics is 
used in calculating experimental quantities have complete and unique 




solutions. Therefore, despite certain imperfections, the quantum theo¬ 
ry of the electromagnetic field possesses the essential features of a 
correct theory: agreement with experiment and a very specific mecha¬ 
nism for calculation. 

The quantum theory for other fields is quite a different matter. 
The only other comparatively satisfactory situation is in the theory of 
the field responsible for the beta disintegration of nuclei and other 
so-called "weak interactions," like $\pi$-$\mu$-meson decay, etc. (see below).

As regards the theory of nuclear forces, all that is known from a
series of unsuccessful attempts is what form the theory cannot have.
However, there are many experimental facts that permit of a
conclusion concerning the physical nature of nuclear forces. These forces
are undoubtedly related, at least in part, with the so-called $\pi$-mesons,
particles with mass 273 times that of the electron mass. These mesons
play the part of quanta in the field of nuclear forces. But they interact
with nucleons (i.e., with protons and neutrons) so intensely that it is
doubtful whether there is any sense in a separate consideration of
nucleons and $\pi$-mesons as opposed to electrodynamics, where, to an
initial approximation, electrons and quanta may be considered
separately.

In addition to $\pi$-mesons, there are heavy $K$-mesons, which
disintegrate into three, and sometimes only into two, $\pi$-mesons.

Upon decay, $\pi$-mesons give $\mu$-mesons, which interact weakly with
nuclei. The role played by such weakly interacting particles in the
general scheme of nuclear forces is mysterious in the extreme.

An analysis of experimental data shows that the three basic types
of elementary interactions differ essentially as to their strength:

1) The strongest are nuclear interactions. These include, for example,
interactions between $\pi$-mesons and nucleons.

2) Electromagnetic interactions between quanta and charged ele¬ 
mentary particles are approximately one hundred times weaker than 
nuclear interactions. 

3) Interactions which are related to beta disintegration or, for 
example, to the decay of heavy mesons of mass 900 electron masses 
into 2 or 3 $\pi$-mesons are weaker than nuclear interactions by a factor
of 10«. 

Landau, and also Lee and Yang, have shown that the laws for weak 
interactions cannot be invariant with respect to a simple transfor¬ 
mation from a right-handed to a left-handed coordinate system (in¬ 
version). For the interaction to remain invariant, it is necessary, 
simultaneously with inversion, to transfer from particles to antipar¬ 
ticles, i.e., from electrons to positrons, from protons to antiprotons, 
from $\pi^+$- to $\pi^-$-mesons, etc.

Thus, the simple law of conservation of parity, which obtains for 
nuclear and electromagnetic interactions, changes its form for weak 
interactions. Starting with the principle of “combined parity,”
R. Feynman and M. Gell-Mann (and, independently, R. Marshak
and Sudarshan) have succeeded in constructing a universal Hamiltonian
for all the weak interactions. The original form was suggested
by Fermi. It contains a new universal constant of the order of
$10^{-49}$ erg·cm³.

Classification of elementary particles. It is rather difficult to define 
exactly just what an “elementary” particle is. At the beginning of 
this century, the atoms of elements were considered to be elementary
particles, since they were thought to be indivisible. Now we know 
that atoms consist of electrons, protons, and neutrons, and we are 
quite sure that these latter do not consist of still other particles of a 
more “elementary” nature in the same sense as atoms do. 

Several elementary particles transform into one another. In some
cases we can precalculate the laws of such transformations, for instance,
for electrons, positrons, and photons, or, to a lesser extent, for
beta transformations; but we know very little about strong nuclear
interactions, which are undoubtedly also connected with some conversion
of elementary particles.

To visualize this situation better let us consider the following process:
a neutron emits a negative π-meson, and a proton absorbs it.
The neutron converts into a proton, and the proton into a neutron,
and the whole process of emission and reabsorption can be treated
as an “exchange” interaction. This sort of interaction is to some
extent analogous to electromagnetic interaction, where a photon is
emitted by one electron and absorbed by another, but unlike the
electromagnetic forces, we can describe the mechanism of nuclear forces
only in words. All attempts to do more have as yet failed.

Despite the lack of a theory of elementary particles, it is now possible 
to bring them into some order. This classification is due to M. Gell-Mann.

We shall not describe the experimental proofs of the existence of all 
the elementary particles listed below; such proof can be found elsewhere.
Let us be satisfied in stating that their actual existence is quite
definite, in contrast to the “existence” of the ephemeral elementary 
particles which disappear from the pages of scientific papers after a 
careful investigation. 

The Gell-Mann classification is based primarily on particle interaction.
First, a particle exists which is capable of electromagnetic
interaction only. This is the photon, or the quantum of electromagnetic
radiation. Another group of particles is not capable of strong nuclear
interactions: the μ-meson, the electron, and the neutrino, the so-called
“leptons” (light particles). Still another group of particles,
consisting of π- and K-mesons, is capable of nuclear interactions. The
masses of these particles are intermediate between those of nucleons
and leptons. π- and K-mesons can appear and disappear in the nuclear
transformations of other highly energetic particles; no conservation
law for their number exists, but the charge conservation law is never 
violated. The name baryon (heavy particles) has been given to a 
fourth group of elementary particles. This group consists of stable
nucleons (neutron and proton), and unstable hyperons: Λ, Σ, and Ξ,
which transform spontaneously into nucleons. A conservation law
concerning the baryon number exists, and is as strong as the charge 
conservation law. 

All the particles listed above, except the photon and the π⁰-meson,
have counterparts: antiparticles. It is very important that a particle
need not be electrically charged to be able to have an antiparticle,
as witness the antineutron. The process of particle-antiparticle
interaction is one of annihilation; thus, the antineutron and the neutron
are mutually annihilated, and in the process create π-mesons. True
neutral particles are only those that do not possess antiparticles, or,
in other words, such that physically coincide with them in a
transformation from the ordinary world into the “antiworld.” This means
a mathematical transformation of all the wave equations interchanging
positrons and electrons, negative antiprotons and protons, etc.
(some think that antinebulae actually exist in the universe).

If the physical laws governing the antiworld are the same as those
that govern our ordinary world, then the Hamiltonian of electromagnetic
interaction must be invariant under transformation from
one to the other. The sign of the charge evidently changes in such a
transformation, and the charge enters the Hamiltonian multiplied
by the amplitude of the electromagnetic field, the vector potential $\mathbf{A}$,
so the latter must change its sign simultaneously. We conclude that
the amplitude of the photon is odd with respect to a transformation
from the ordinary world to the antiworld. The π⁰-meson decays into
two photons; consequently its amplitude must be even.

Such parity of true neutral particles is one of their very important 
characteristics. 

We now pass to a nontrivial point in the Gell-Mann classification.
We first consider the decay of the electrically neutral Λ particle

$$\Lambda \to \pi^0 + n \quad\text{or}\quad \Lambda \to \pi^- + p$$

(both are possible). The mean time of such decay is of the order of
$10^{-10}$ sec, but the Λ particle itself was created in a nuclear collision
which lasted less than $10^{-21}$ sec. It appears rather puzzling that a
particle that can be created so quickly should disappear so slowly. 
It seems to contradict the general reversibility of physical laws in 
time. This was why the Λ particle came to be called “strange.” The
only explanation is that both processes are of a totally different
nature. The generation of a particle is due to a strong interaction, and
the decay, to a weak interaction. It is enough to suppose that the
Λ particle is always created from a nucleon accompanied by a
K-meson. This has been verified experimentally, though indirectly. Such
a process does not violate the baryon conservation law. If Λ and K
particles take part in the interaction simultaneously, it is strong,
and if only one of them transforms into something else, the interaction
is weak.

To distinguish between these two types of interaction, Gell-Mann
introduced a new characteristic of mesons and of baryons: their
“strangeness,” $S$. It is defined as follows:

$$\frac{Q}{e} = \tau_z + \frac{n}{2} + \frac{S}{2}.$$

Here, $Q$ is the charge of the particle, $e$ is the magnitude of the
elementary charge, $n$ is the difference between the number of baryons and
antibaryons (1 for baryons, −1 for antibaryons, and 0 for mesons),
and $\tau_z$ is the z-component of the isotopic spin (see Sec. 32). As all the
heavy particles take part in strong interactions, a definite value of $\tau_z$
can be ascribed to each one of them; $n$ must be taken equal to 1 for
each baryon. For nucleons, $\tau_z$ is $\pm\frac{1}{2}$, as they can have only two values
of charge: 1 and 0. Λ has no charge and can have only $\tau_z = 0$. Σ has
three values of charge: 1, 0, −1, and equal values of $\tau_z$. Lastly, Ξ is
either neutral ($\tau_z = 1/2$) or negative ($\tau_z = -1/2$). Substituting these
values of charge and $\tau_z$ into the definition of strangeness, we find that
nucleons have $S = 0$, Λ and Σ hyperons have $S = -1$, and Ξ has
$S = -2$. It is noteworthy that Σ⁺ and Σ⁻ are not particle and
antiparticle, as both have $n = 1$. Each of them has an antibaryon, for
which $n = -1$.

For K- and π-mesons, $n = 0$, since they are not baryons. This gives
$S = 0$ for π-mesons and $S = \pm 1$ for K-mesons. Unlike Σ⁺ and Σ⁻, K⁺ and
K⁻ are related as particle and antiparticle.
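
The strangeness assignments quoted above can be checked mechanically. The following is a minimal sketch, not part of the original text: it solves the defining relation $Q/e = \tau_z + n/2 + S/2$ for $S$. The function name `strangeness`, the particle labels, and the $\tau_z$ values used for the π- and K-mesons are assumptions introduced for illustration (the text states $\tau_z$ explicitly only for the baryons).

```python
from fractions import Fraction as F

def strangeness(charge, baryon_number, tau_z):
    """Solve Q/e = tau_z + n/2 + S/2 for S."""
    return 2 * (F(charge) - F(tau_z)) - baryon_number

# (charge in units of e, n, tau_z). Baryon values follow the text;
# the meson tau_z values are assumed standard assignments.
PARTICLES = {
    "p": (1, 1, F(1, 2)),
    "n": (0, 1, F(-1, 2)),
    "Lambda": (0, 1, 0),
    "Sigma+": (1, 1, 1),
    "Sigma0": (0, 1, 0),
    "Sigma-": (-1, 1, -1),
    "Xi0": (0, 1, F(1, 2)),
    "Xi-": (-1, 1, F(-1, 2)),
    "pi+": (1, 0, 1),
    "pi0": (0, 0, 0),
    "pi-": (-1, 0, -1),
    "K+": (1, 0, F(1, 2)),
    "K-": (-1, 0, F(-1, 2)),
}

STRANGENESS = {name: strangeness(*q) for name, q in PARTICLES.items()}

for name, s in STRANGENESS.items():
    print(f"{name:7s} S = {s}")
```

Exact fractions are used so that the half-integer values of $\tau_z$ do not introduce rounding error.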

Then a selection rule is defined: the given interaction is strong 
only when the resulting strangeness of all particles entering into the 
reaction is conserved. For instance, every interaction of nucleons and 
π-mesons only is strong (if it does not violate any other conservation
laws, except strangeness). 

The simultaneous creation of Λ and K particles belongs to strong
interactions also, since one of them has $S = 1$, and the other $S = -1$.
But the spontaneous decay of the Λ particle into a nucleon and a
π-meson is due to a weak interaction, because, here, the strangeness
is not conserved.
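
As an illustration of the selection rule, the sketch below computes the change in total strangeness for a reaction and labels it accordingly. It is a hedged example, not the author's procedure: the names `STRANGENESS`, `delta_s`, and `classify` are hypothetical, the label "K0" for the neutral K-meson is an assumption, and no other conservation law (charge, baryon number, energy) is checked.

```python
# Strangeness values as quoted in the text; "K0" is an assumed label for the
# neutral K-meson created together with the Lambda particle.
STRANGENESS = {
    "p": 0, "n": 0,
    "pi+": 0, "pi0": 0, "pi-": 0,
    "Lambda": -1, "Sigma+": -1, "Sigma0": -1, "Sigma-": -1,
    "Xi0": -2, "Xi-": -2,
    "K+": 1, "K0": 1, "K-": -1,
}

def delta_s(initial, final):
    """Change of total strangeness between initial and final particles."""
    return sum(STRANGENESS[x] for x in final) - sum(STRANGENESS[x] for x in initial)

def classify(initial, final):
    """Apply the strangeness selection rule only; nothing else is checked."""
    change = abs(delta_s(initial, final))
    if change == 0:
        return "strong"
    if change == 1:
        return "weak"
    return "more strongly forbidden"

print(classify(["pi-", "p"], ["Lambda", "K0"]))  # creation of Lambda with K
print(classify(["Lambda"], ["p", "pi-"]))        # Lambda decay
print(classify(["Xi-"], ["Lambda", "pi-"]))      # first step of a cascade
print(classify(["Xi-"], ["n", "pi-"]))           # direct decay to a nucleon
```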

Transitions with $\Delta S = 2$ are forbidden more strongly than with
$\Delta S = 1$. That is why the Ξ particle must decay first into a Λ or Σ
particle, which in turn decays into nucleons and π-mesons. These
statements agree with the cascade nature of Ξ decay.

There is no reason as yet to attribute definite values of $\tau_z$ and $S$
to leptons, since they do not take part in strong interactions. 

The transition to the nonrelativistic wave equation. It is instructive,
in comparing the relativistic wave equation with Schrödinger’s equation,
to perform the limiting transition. We shall consider that the
energy of the electron is positive and that its velocity $v$ is considerably
less than the velocity of light. Then $\mathcal{E}$ differs from $mc^2$ by a small
quantity $\frac{mv^2}{2}$. If we take $mc^2\psi_1$ and $mc^2\psi_2$ to the left-hand side in the
first and second equations (38.11), the components $\psi_1$ and $\psi_2$ are
multiplied by the quantity $\mathcal{E} - mc^2$, i.e., by $\frac{mv^2}{2}$. The wave-function
components $\psi_3$ and $\psi_4$ appear on the right, multiplied by components
of $c\mathbf{p}$. It follows that, for a positive energy, the components $\psi_3$ and $\psi_4$
are less than $\psi_1$ and $\psi_2$ in the ratio $\frac{v}{c}$. The same follows from the second
two equations (38.11): the components $\psi_3$ and $\psi_4$ appear on the right
multiplied by approximately $2mc^2$, and $\psi_1$ and $\psi_2$ are on the left with a
factor of the order $cp$.

For negative energies the components $\psi_3$ and $\psi_4$ are large, while
$\psi_1$ and $\psi_2$ are less in the ratio $\frac{v}{c}$.

Consequently, in the nonrelativistic limit Dirac’s equation is reduced
to two-component form (as is required according to Sec. 32) for
a description of particles with spin $\frac{1}{2}$.

Spin magnetic moment. We shall now show how spin magnetic moment
is obtained. It is first of all necessary to write down the Dirac
equation in an electromagnetic field. We know that for a transition
from the equation for a free particle to an equation for a particle in a
field it is necessary to replace the momentum $\mathbf{p}$ by $\mathbf{p} - \frac{e}{c}\mathbf{A}$, where $\mathbf{A}$
is the vector potential, and the energy $\mathcal{E}$ by $\mathcal{E} - e\varphi$ (see Sec. 21).
Thus, the Dirac equation in the presence of an electromagnetic field
is of the form

$$(\mathcal{E} - e\varphi)\,\psi = c\,\boldsymbol{\alpha}\cdot\left(\mathbf{p} - \frac{e}{c}\mathbf{A}\right)\psi + \beta mc^2\psi. \tag{38.18}$$

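As was pointed out, the relations between the wave-function
components in the nonrelativistic limit without field appear as

$$\psi_3 = \frac{1}{2mc}\left[p_z\psi_1 + (p_x - ip_y)\psi_2\right], \qquad \psi_4 = \frac{1}{2mc}\left[(p_x + ip_y)\psi_1 - p_z\psi_2\right]. \tag{38.19}$$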

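It is convenient to write down these relations, according to Sec. 32,
with the aid of the operators $\boldsymbol{\sigma}$, the definition of which involves the
factor $\frac{\hbar}{2}$ [see (32.2), (32.8), (32.9)]:

$$\psi' = \frac{1}{mc\hbar}\,\boldsymbol{\sigma}\cdot\left(\mathbf{p} - \frac{e}{c}\mathbf{A}\right)\psi. \tag{38.20}$$

Here, the two small wave-function components $\psi_3$ and $\psi_4$ are for short
denoted by $\psi'$, and the two large components $\psi_1$ and $\psi_2$ are called $\psi$.
In addition, the momentum $\mathbf{p}$ is replaced by $\mathbf{p} - \frac{e}{c}\mathbf{A}$.

We shall also call $\mathcal{E} = mc^2 + \mathcal{E}'$, where $\mathcal{E}'$ is the energy value which
appears in the nonrelativistic theory. Then, after replacing $\mathbf{p}$ by
$\mathbf{p} - \frac{e}{c}\mathbf{A}$, we obtain from the first two equations of (38.11)

$$(\mathcal{E}' - e\varphi)\,\psi = \frac{2c}{\hbar}\,\boldsymbol{\sigma}\cdot\left(\mathbf{p} - \frac{e}{c}\mathbf{A}\right)\psi'. \tag{38.21}$$

We can now eliminate $\psi'$ from the equations, so that only the relations
for the large components $\psi$ remain.

Substituting $\psi'$ in (38.21) from (38.20), we shall have

$$(\mathcal{E}' - e\varphi)\,\psi = \frac{2}{m\hbar^2}\left[\boldsymbol{\sigma}\cdot\left(\mathbf{p} - \frac{e}{c}\mathbf{A}\right)\right]\left[\boldsymbol{\sigma}\cdot\left(\mathbf{p} - \frac{e}{c}\mathbf{A}\right)\right]\psi. \tag{38.22}$$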

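When squaring the operator $\boldsymbol{\sigma}\cdot\left(\mathbf{p} - \frac{e}{c}\mathbf{A}\right)$, we must take into account
the commutation relations between the components of $\boldsymbol{\sigma}$ and also
between $\mathbf{p}$ and $\mathbf{A}$. Taking advantage of the fact that $\sigma_x^2 = \sigma_y^2 = \sigma_z^2 = \frac{\hbar^2}{4}$,
and calling $p_x - \frac{e}{c}A_x \equiv P_x$, and analogously for $P_y$ and $P_z$,
we shall first of all have

$$(\mathcal{E}' - e\varphi)\,\psi = \left[\frac{1}{2m}\left(P_x^2 + P_y^2 + P_z^2\right) + \frac{2}{m\hbar^2}\left(\sigma_x\sigma_y P_xP_y + \sigma_y\sigma_x P_yP_x\right) + \ldots\right]\psi. \tag{38.23}$$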

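Further, we utilize the fact that $\sigma_x\sigma_y = -\sigma_y\sigma_x = \frac{i\hbar}{2}\sigma_z$ [see (32.13)],
and also the commutation relations of the form

$$\begin{aligned}
P_xP_y - P_yP_x &= -\frac{e}{c}\left(p_xA_y + A_xp_y - p_yA_x - A_yp_x\right) \\
&= -\frac{e}{c}\left(p_xA_y - A_yp_x\right) + \frac{e}{c}\left(p_yA_x - A_xp_y\right) \\
&= \frac{e\hbar}{ci}\left(\frac{\partial A_x}{\partial y} - \frac{\partial A_y}{\partial x}\right) = -\frac{e\hbar}{ic}H_z
\end{aligned} \tag{38.24}$$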


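[cf. (30.36)]. Substituting this in (38.23), we obtain an equation for
the components $\psi$:

$$(\mathcal{E}' - e\varphi)\,\psi = \left[\frac{1}{2m}\left(\mathbf{p} - \frac{e}{c}\mathbf{A}\right)^2 - \frac{e}{mc}\left(\sigma_xH_x + \sigma_yH_y + \sigma_zH_z\right)\right]\psi.$$

Going over to vector notation, we arrive at the nonrelativistic
wave equation

$$\mathcal{E}'\psi = \left[\frac{1}{2m}\left(\mathbf{p} - \frac{e}{c}\mathbf{A}\right)^2 + e\varphi - \frac{e}{mc}\,(\boldsymbol{\sigma}\cdot\mathbf{H})\right]\psi. \tag{38.25}$$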

Compared with the Hamiltonian operator for a spinless particle,
the Hamiltonian of an electron involves an additional term:

$$-\frac{e}{mc}\,(\boldsymbol{\sigma}\cdot\mathbf{H}). \tag{38.26}$$

But since $\boldsymbol{\sigma}$ is an additional mechanical moment, we see that the
electron has an additional magnetic moment

$$\boldsymbol{\mu}_\sigma = \frac{e\hbar}{mc}\,\boldsymbol{\sigma} \tag{38.27}$$

in accordance with what was affirmed in (32.17) ($\boldsymbol{\sigma}$ here is a
dimensionless operator, i.e., the spin measured in units of $\hbar$). Spin differs
from orbital angular momentum in that
its magnetic moment does not contain a factor 2 in the denominator.
Thus, the so-called spin magnetic anomaly follows naturally from the
Dirac equation.
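
The algebraic step that produces the spin term is the noncommutativity of the components of $\boldsymbol{\sigma}$. The following sketch, which is an illustration and not part of the original derivation, checks these relations numerically for the dimensionless Pauli matrices (the spin operators of the text with the factor $\hbar/2$ removed, so that $\sigma_x\sigma_y = i\sigma_z$ and $\sigma_x^2 = 1$). The helper names `matmul` and `scaled` are hypothetical.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scaled(c, a):
    """Multiply every element of matrix a by the number c."""
    return [[c * x for x in row] for row in a]

IDENTITY = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

# Squares equal unity: this gives the ordinary kinetic term P^2 / 2m.
print(matmul(SX, SX) == IDENTITY, matmul(SY, SY) == IDENTITY)

# Products of different components are antisymmetric: sx*sy = -sy*sx = i*sz.
# Combined with the commutator of P_x and P_y, this gives the term in H_z.
print(matmul(SX, SY) == scaled(1j, SZ), matmul(SY, SX) == scaled(-1j, SZ))
```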

The radiation-field correction to the magnetic moment. Equation 
(38.27) is, of course, correct only in the nonrelativistic limit. But even 
in this limit it is not completely exact. As was indicated in Sec. 27,
the state of an electromagnetic field in which there are no quanta 
interacts with charged particles. Strictly speaking, insofar as there 
is an interaction between the charges and the field, the state of each 
separately cannot be defined with complete precision. It is therefore 
not surprising that any state of a field is in some way perturbed by the 
presence of charges, and any state of the charges is perturbed by the 
field. As a result of this, the magnetic moment of an electron, as is
shown by the rather exact calculations of Schwinger and others, is
greater than one magneton by a very small quantity, whose relative
fraction is $\frac{e^2}{2\pi\hbar c}$. This result is in complete agreement with
experiment.

The magnetic moments of the proton and neutron do not at all 
agree with the Dirac theory. For instance, on Dirac’s theory, a neutral 
particle (the neutron) should not have a magnetic moment at all. 
In actual fact, the neutron possesses a magnetic moment directed 
opposite to the spin. 

This is usually explained by the strong interaction between the 
nucleon and the nuclear force field or, as it is sometimes called, the 
meson field. There is a certain analogy here with the correction to
the magnetic moment of the electron. This correction is small because 
the interaction constant is small. Nuclear interaction is very strong,
and so the result is a large “correction,” if one can use that expression
for a quantity which, in the case of a proton, is twice as great as the 
basic magnetic moment given by the Dirac theory. 

At the present time we are unable to calculate the magnetic moment 
of a nucleon, since no theory of nuclear forces exists. 

Nevertheless, the nucleon is undoubtedly to some extent a Dirac
particle, as confirmed by the existence of the antiproton. 

Energy eigenvalues of a hydrogen atom. In accordance with the
Dirac equation, the energy eigenvalues of a hydrogen atom, or of
any single-electron atom, are calculated in the following way:

$$\frac{\mathcal{E}}{mc^2} = \left[1 + \left(\frac{\alpha Z}{n - \left(j + \frac{1}{2}\right) + \sqrt{\left(j + \frac{1}{2}\right)^2 - \alpha^2Z^2}}\right)^2\right]^{-1/2} - 1, \tag{38.28}$$

where $n$ is the principal quantum number, $j$ is the total
electron angular momentum, and $\alpha = \frac{e^2}{\hbar c} = \frac{1}{137}$. If we regard $\alpha Z$ as small
compared with unity, then the nonrelativistic formula (31.34) results.
It follows from formula (38.28) that the states $2p_{1/2}$ and $2s_{1/2}$, with
the same $n = 2$ and $j = \frac{1}{2}$, have the same energy. In practice, these
states of the atom are somewhat split as a result of the interaction
of light quanta with the ground state of the field. The calculated
splitting agrees with experiment with considerable accuracy.
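
A short numerical sketch may help in reading formula (38.28). It assumes the form of the formula with the rest energy subtracted, so that the result is the binding energy in units of $mc^2$; the function names `dirac_energy` and `bohr_energy`, and the numerical value used for $\alpha$, are assumptions made for illustration.

```python
import math

ALPHA = 1.0 / 137.036  # fine-structure constant (assumed numerical value)

def dirac_energy(n, j, z=1, alpha=ALPHA):
    """Energy from (38.28) in units of m c^2, rest energy subtracted."""
    k = j + 0.5
    denom = n - k + math.sqrt(k * k - (alpha * z) ** 2)
    return 1.0 / math.sqrt(1.0 + (alpha * z / denom) ** 2) - 1.0

def bohr_energy(n, z=1, alpha=ALPHA):
    """Nonrelativistic limit: -(alpha Z)^2 / (2 n^2) in units of m c^2."""
    return -((alpha * z) ** 2) / (2.0 * n * n)

# The formula depends only on n and j, so 2s(1/2) and 2p(1/2) coincide,
# while 2p(3/2) lies slightly higher (fine structure).
print(dirac_energy(2, 0.5), dirac_energy(2, 1.5), bohr_energy(2))
```

Because the formula contains only $n$ and $j$, the degeneracy of $2s_{1/2}$ and $2p_{1/2}$ is built in; the sketch makes visible the much smaller difference between $j = 1/2$ and $j = 3/2$.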


Exercises 

1) Prove that a quantum cannot give rise to an electron-positron pair 
in free space in the absence of an additional external field. 

The conservation laws in the absence of a field are written thus:

$$-\sqrt{m^2c^4 + c^2p^2} + \hbar\omega = \sqrt{m^2c^4 + c^2p_1^2}, \qquad \mathbf{p} + \frac{\hbar\omega}{c}\,\mathbf{n} = \mathbf{p}_1.$$

Here, $\mathbf{p}$ is the electron momentum in a negative energy state, $\mathbf{n}$ is a unit
vector in the direction of the quantum momentum, $\mathbf{p}_1$ is the electron momentum
in a positive energy state. Substituting $\mathbf{p}_1$ in the first equality and squaring
the left- and right-hand sides, it is easy to see that this equation is not
satisfied.
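
The incompatibility of the two conservation laws can also be seen without squaring. The following is offered as a sketch of one possible route (the triangle inequality), not as the calculation the text has in mind. From the momentum equation, $\hbar\omega/c = |\mathbf{p}_1 - \mathbf{p}|$, so that for $m > 0$

```latex
\begin{aligned}
\hbar\omega = c\,|\mathbf{p}_1 - \mathbf{p}|
  &\le c\,|\mathbf{p}_1| + c\,|\mathbf{p}| \\
  &< \sqrt{m^2c^4 + c^2p_1^2} + \sqrt{m^2c^4 + c^2p^2},
\end{aligned}
```

whereas the energy equation requires $\hbar\omega$ to be exactly equal to the sum of the two square roots. The strict inequality shows that both laws cannot hold together.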

Another method of proof is based on simple reasoning. A transition to 
another inertial system can always make the energy of a quantum less than 
$2mc^2$. A quantum cannot give rise to a pair in such a system, simply because
it has insufficient energy. But what is impossible in one reference system 
is impossible in all systems, because the possibility or impossibility of an 
event does not depend upon the choice of the reference system. 




The preceding argument no longer holds if pair production is considered 
close to a nucleus. Here, the nucleus is at rest in one reference system and 
in motion in another. Where the energy of the quantum is less than $2mc^2$,
the moving nucleus will “help” it to give rise to a pair. Naturally, it is in no
way possible for a quantum to give rise to a pair if its energy in the rest system
of the nucleus is less than $2mc^2$.

2) Obtain the solution to the Dirac equation for a free electron.

Let us equate $\psi_1$ to zero. Then the first equation of (38.11) is satisfied if
we take $\psi_3 = Ac\,(p_x - ip_y)$, $\psi_4 = -Ac\,p_z$. The second equation of (38.11)
gives

$$\psi_2 = \frac{Ac^2\left(p_x^2 + p_y^2 + p_z^2\right)}{\mathcal{E} - mc^2} = \frac{A\left(\mathcal{E}^2 - m^2c^4\right)}{\mathcal{E} - mc^2} = A\left(\mathcal{E} + mc^2\right).$$

The third equation of (38.11) reduces to the identity

$$\left(\mathcal{E} + mc^2\right)\psi_3 = Ac\left(\mathcal{E} + mc^2\right)(p_x - ip_y) = c\,(p_x - ip_y)\,\psi_2 = Ac\,(p_x - ip_y)\left(\mathcal{E} + mc^2\right).$$

The fourth equation of (38.11) also reduces to an identity. The number $A$
is determined from the normalization condition

$$|\psi_1|^2 + |\psi_2|^2 + |\psi_3|^2 + |\psi_4|^2 = 1$$

or

$$A^2\left[\left(\mathcal{E} + mc^2\right)^2 + c^2p_x^2 + c^2p_y^2 + c^2p_z^2\right] = 1, \qquad A = \frac{1}{\sqrt{2\mathcal{E}\left(\mathcal{E} + mc^2\right)}}.$$

The components $\psi_3$ and $\psi_4$ are small compared with $\psi_2$ if $v \ll c$. Therefore
the solution corresponds to positive energy. Another solution with positive
energy is obtained if we take $\psi_2 = 0$. Negative energy solutions are obtained
if we choose $\psi_3 = 0$ or $\psi_4 = 0$.
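
The solution of this exercise can be checked numerically. The sketch below is an illustration under a stated assumption: equations (38.11) are not reproduced in this section, so they are taken here in the standard component form implied by the steps of the solution, with $\psi_1 = 0$, $\psi_2 = A(\mathcal{E} + mc^2)$, $\psi_3 = Ac(p_x - ip_y)$, $\psi_4 = -Acp_z$. The function name `free_solution_check` is hypothetical.

```python
import math

def free_solution_check(p, m=1.0, c=1.0):
    """Return residuals of the four component equations and the norm."""
    px, py, pz = p
    mc2 = m * c * c
    energy = math.sqrt(mc2 * mc2 + c * c * (px * px + py * py + pz * pz))
    a = 1.0 / math.sqrt(2.0 * energy * (energy + mc2))

    p_minus = complex(px, -py)  # p_x - i p_y
    p_plus = complex(px, py)    # p_x + i p_y

    psi1 = 0.0
    psi2 = a * (energy + mc2)
    psi3 = a * c * p_minus
    psi4 = -a * c * pz

    # Assumed form of (38.11): left-hand side minus right-hand side.
    r1 = (energy - mc2) * psi1 - c * (pz * psi3 + p_minus * psi4)
    r2 = (energy - mc2) * psi2 - c * (p_plus * psi3 - pz * psi4)
    r3 = (energy + mc2) * psi3 - c * (pz * psi1 + p_minus * psi2)
    r4 = (energy + mc2) * psi4 - c * (p_plus * psi1 - pz * psi2)

    norm = sum(abs(x) ** 2 for x in (psi1, psi2, psi3, psi4))
    return (r1, r2, r3, r4), norm

residuals, norm = free_solution_check((0.3, -0.2, 0.5))
print(max(abs(r) for r in residuals), norm)
```

Units with $m = c = 1$ are used by default; any momentum may be passed in.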

3) Show that from Dirac’s equation there follows a charge-conservation
equation which is analogous to (24.16):

$$\frac{\partial}{\partial t}\,|\psi|^2 = -\operatorname{div}\left(\psi^*\,c\boldsymbol{\alpha}\,\psi\right),$$

where $|\psi|^2 = |\psi_1|^2 + |\psi_2|^2 + |\psi_3|^2 + |\psi_4|^2$.

Write down equation (38.18) and its complex conjugate; multiply the
first by $\psi^*$ and the second by $\psi$; subtract the second from the first and utilize
the Hermitian nature of the operators $\boldsymbol{\alpha}$ and $\beta$.

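4) Show that if $\psi$ is a solution with positive energy $\mathcal{E}$, then $\rho_2\psi$ is a solution
with negative energy $-\mathcal{E}$.

The equation for $\psi$ is

$$\mathcal{E}\psi = c\,(\boldsymbol{\alpha}\cdot\mathbf{p})\,\psi + mc^2\beta\psi.$$

Whence

$$\mathcal{E}\rho_2\psi = c\rho_2\,(\boldsymbol{\alpha}\cdot\mathbf{p})\,\psi + mc^2\rho_2\beta\psi = -\left[c\,(\boldsymbol{\alpha}\cdot\mathbf{p}) + mc^2\beta\right]\rho_2\psi.$$

This proves that a negative energy solution cannot be avoided.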

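5) Prove that the operators

$$\sigma_x = \frac{\hbar}{2i}\,\alpha_y\alpha_z, \qquad \sigma_y = \frac{\hbar}{2i}\,\alpha_z\alpha_x, \qquad \sigma_z = \frac{\hbar}{2i}\,\alpha_x\alpha_y,$$

acting upon four-component functions, are spin operators.

We have

$$\sigma_x^2 = -\frac{\hbar^2}{4}\,\alpha_y\alpha_z\alpha_y\alpha_z = \frac{\hbar^2}{4}\,\alpha_y^2\alpha_z^2 = \frac{\hbar^2}{4},$$

$$\sigma_x\sigma_y = -\frac{\hbar^2}{4}\,\alpha_y\alpha_z\alpha_z\alpha_x = -\frac{\hbar^2}{4}\,\alpha_y\alpha_x = \frac{i\hbar}{2}\,\sigma_z,$$

$$\sigma_y\sigma_x = -\frac{i\hbar}{2}\,\sigma_z,$$

so that the spin operators determined here possess all the required properties
(see Sec. 32). This can also be seen from the definition of $\boldsymbol{\alpha}$ in terms of $\boldsymbol{\sigma}$ and
$\rho$. We notice that the spin operators do not permute functions of the first
pair $\psi_1$, $\psi_2$ and of the second pair $\psi_3$, $\psi_4$, but make permutations only inside
each pair.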

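6) Show that according to the Dirac equation only the sum of the orbital
and spin angular momenta and not each angular momentum separately
satisfies the angular-momentum conservation law.

The total angular momentum is defined as

$$\mathbf{J} = \mathbf{M} + \boldsymbol{\sigma} = [\mathbf{r}\mathbf{p}] + \boldsymbol{\sigma}, \qquad J_z = xp_y - yp_x + \frac{\hbar}{2i}\,\alpha_x\alpha_y.$$

We calculate the commutator with the Hamiltonian:

$$\begin{aligned}
\hat{\mathcal{H}}J_z - J_z\hat{\mathcal{H}} &= \left[c\,(\alpha_xp_x + \alpha_yp_y + \alpha_zp_z) + \beta mc^2\right]\left[xp_y - yp_x + \frac{\hbar}{2i}\,\alpha_x\alpha_y\right] \\
&\quad - \left[xp_y - yp_x + \frac{\hbar}{2i}\,\alpha_x\alpha_y\right]\left[c\,(\alpha_xp_x + \alpha_yp_y + \alpha_zp_z) + \beta mc^2\right] \\
&= c\alpha_xp_y\,(p_xx - xp_x) - c\alpha_yp_x\,(p_yy - yp_y) + \frac{\hbar c}{2i}\,p_x\,(\alpha_x\alpha_x\alpha_y - \alpha_x\alpha_y\alpha_x) \\
&\quad + \frac{\hbar c}{2i}\,p_y\,(\alpha_y\alpha_x\alpha_y - \alpha_x\alpha_y\alpha_y) \\
&= \frac{\hbar c}{i}\left(\alpha_xp_y - \alpha_yp_x + p_x\alpha_y - p_y\alpha_x\right) = 0.
\end{aligned}$$

The Hamiltonian is commutative also with the square of the total angular
momentum $J^2 = J_x^2 + J_y^2 + J_z^2$. The integrals of motion are $J^2$ and $J_z$, and
not $\sigma_z$ and $M_z$ separately.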


PART IV 

STATISTICAL PHYSICS 

Sec. 39. The Equilibrium Distribution of Molecules in an Ideal Gas

The subject of statistical physics. The methods of quantum mechanics
set out in the third part make it possible, in principle, to describe
any assembly of electrons, atoms, and molecules comprising a
macroscopic body.

In practice, however, even the problem of an atom with two
electrons presents such great mathematical difficulties that nobody, so
far, has solved it completely. It is all the more impossible not only
to solve but even to write down the wave equation for a macroscopic
body consisting, for example, of $10^{23}$ atoms with their electrons.

Yet in large systems, we encounter certain general laws of motion
for which it is not necessary to know the wave function of the system
to describe them. Let us give one very simple example of such a law.
We shall suppose that there is only one molecule contained in a large,
completely empty vessel. If the motion of this molecule is not defined
beforehand, the probability of finding it in any half of the vessel is
equal to 1/2. If there are two molecules in the same vessel, the
probability of finding them in the same half of the vessel simultaneously
is equal to $(1/2)^2 = 1/4$. The probability of finding all of a gas, consisting
of $N$ particles, in the same half of the vessel (if the vessel is filled with
gas) is $(1/2)^N$, i.e., an unimaginably small number. On the average,
there will always be an approximately equal number of molecules in
each half of the vessel. The greater the number of molecules forming
the gas, the closer to unity will be the ratio of the number of molecules
in both halves of the vessel, no matter at what time they are
observed.
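
The numbers involved in this example can be made concrete with a short sketch. It is an illustration only: the function names are hypothetical, and the estimate of the relative fluctuation, $1/\sqrt{N}$, is the standard binomial result for independent molecules, which is not derived in the text.

```python
import math

def prob_all_in_given_half(n_molecules):
    """Probability (1/2)^N that all N molecules are in one given half."""
    return 0.5 ** n_molecules

def log10_prob_all_in_given_half(n_molecules):
    """Decimal logarithm of (1/2)^N; usable when the value underflows."""
    return -n_molecules * math.log10(2.0)

def relative_fluctuation(n_molecules):
    """Standard deviation of the number in one half divided by its mean.

    For N independent molecules the count is binomial with mean N/2 and
    standard deviation sqrt(N)/2, so the ratio is 1/sqrt(N).
    """
    return 1.0 / math.sqrt(n_molecules)

print(prob_all_in_given_half(1), prob_all_in_given_half(2))
print(log10_prob_all_in_given_half(100))
print(relative_fluctuation(10 ** 22))
```

Already for one hundred molecules the probability is of the order of $10^{-30}$, while for a macroscopic number of molecules the relative deviation from equal filling is far below anything observable.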

This approximate equality for the number of molecules in equal
volumes of the same vessel gives an almost obvious example of a
statistical law applicable only to a large assembly of objects. In addition
to a spatial distribution, molecules possess a definite velocity
distribution, which, however, can in no way be uniform (if only because the
probability of an infinitely large velocity is equal to zero).

Statistical physics studies the laws governing the motion of large 
assemblies of electrons, atoms, quanta, molecules, etc. The problem 
of the velocity distribution of gas molecules is one of the simplest 
that is solved by the methods of statistical physics. 

Statistical physics introduces a series of new quantities, which cannot
be defined in terms of single-body dynamics or the dynamics of a
small number of bodies. An example of such a statistical quantity is
temperature, which turns out to be closely related to the mean energy 
of a gas molecule. If a gas is confined only to one half of a vessel, and the 
barrier dividing the vessel is then removed, the gas will itself uniformly 
fill both halves. Similarly, if the velocity distribution of the molecules 
is disturbed in some way, then, as a result of collisions between the 
molecules there will be established a very definite statistical distribution,
which, for constant external conditions, will be maintained
approximately for an indefinitely long time. This example involving 
collisions shows that regularity in statistics arises not only because a 
large assembly of objects is taken, but also because they interact. 

The statistical law in quantum mechanics. Quantum mechanics also
describes statistical regularities, but relating to a separate object.
Here, the statistical regularity manifests itself in a very large number
of identical experiments with identical objects, and is in no way related
to the interaction of these objects. For example, the electrons in a
diffraction experiment may pass through a crystal with arbitrary time
intervals and nevertheless give exactly the same statistical picture
for the blackening of a photographic plate as if they had passed through
the crystal simultaneously.

Regularities in alpha disintegration cannot be accounted for by the
fact that there are a very large number of nuclei: since there is
practically no interaction between nuclei inducing the process, the
statistical character predicted by quantum mechanics is only manifested
for a large number of identical objects; it is by no means due to their
number. In this connection, a description of phenomena in quantum
mechanics involves the concept of probability phase, similar to the
concept of the phase of a light wave.

In principle, the wave equation can also be applied to systems
consisting of a large number of particles. The solution of such an equation
represents a detailed quantum-mechanical description of the state
of the system. Let us suppose that as a result of the solution of the
wave equation we have obtained a certain spectrum of energy
eigenvalues of the system

$$\mathcal{E}_0,\ \mathcal{E}_1,\ \mathcal{E}_2,\ \ldots,\ \mathcal{E}_n,\ \ldots \tag{39.1}$$

in states with wave functions

$$\psi_0,\ \psi_1,\ \psi_2,\ \ldots,\ \psi_n,\ \ldots$$

Then the wave function for any state, as was shown in Sec. 30, can
be represented in the form of a sum of $\psi$-functions of states with
definite energy values:

$$\psi = \sum_n c_n\psi_n. \tag{39.2}$$

The square of the modulus

$$w_n = |c_n|^2 \tag{39.3}$$

gives the probability (when the energy of a system in the state $\psi$ is
measured) that the result will be the $n$th value.

The expansion (39.2) makes it possible to determine not only amplitudes,
but also relative probability phases corresponding to a detailed
quantum-mechanical description of the system.

The methods of statistical physics make it possible straightway
to determine approximately the quantities $w_n = |c_n|^2$, i.e., the
probabilities themselves omitting their phases. For this reason, the
wave function of the system cannot be determined from them,
although it is possible to find the practically important mean values
of quantities that characterize macroscopic bodies (for example,
their mean energy).
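
That the mean energy depends only on the probabilities $w_n = |c_n|^2$ and not on the phases of the amplitudes can be illustrated with a small sketch. The function names and the sample amplitudes and energy values below are hypothetical, chosen only for the demonstration.

```python
import cmath

def probabilities(amplitudes):
    """Return w_n = |c_n|^2, normalized so that the sum equals unity."""
    norm = sum(abs(c) ** 2 for c in amplitudes)
    return [abs(c) ** 2 / norm for c in amplitudes]

def mean_energy(amplitudes, energies):
    """Mean energy: sum over n of w_n times the n-th energy eigenvalue."""
    return sum(w * e for w, e in zip(probabilities(amplitudes), energies))

energies = [0.0, 1.0, 2.0]  # sample eigenvalues, arbitrary units
c_a = [0.6, 0.8, 0.0]       # amplitudes with zero phases
# Same moduli, different phases: a different wave function.
c_b = [0.6 * cmath.exp(1.3j), 0.8 * cmath.exp(-0.4j), 0.0]

print(mean_energy(c_a, energies), mean_energy(c_b, energies))
```

The two sets of amplitudes describe different states, yet they give the same $w_n$ and the same mean energy, which is why the wave function cannot be recovered from the probabilities alone.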

In this section we shall consider how to calculate the probability 
$w_n$ as applied to an ideal gas.

Ideal gases. An ideal gas is a system of particles whose interaction 
can be neglected. The interaction resulting from collisions between 
molecules is essential only when the statistical distribution $w_n$ is in
the process of being established. When this distribution becomes 
established the effect of interaction is very slight. 

As regards condensed (i.e., solid and liquid) bodies, the molecules 
are all the time in vigorous interaction, so that the statistical distri¬ 
bution depends essentially upon the forces acting between the bodies. 

But even in a gas the particles must not be regarded as absolutely 
independent. For example, Pauli’s principle imposes essential
limitations on the possible quantum states of a gas. We shall take these
limitations into account when calculating probabilities.

The states of separate particles of a gas. In order to distinguish
the states of separate particles from the state of the gas as a whole
we shall denote their energies by the letter $\varepsilon$ and the energy of the
whole gas by $\mathcal{E}$. Thus, for example, if the gas is contained in a
rectangular potential well (see Sec. 25), then the energy values for each
particle are calculated according to equation (25.19).

Let $\varepsilon$ take on the following series of values:

$$\varepsilon = \varepsilon_0,\ \varepsilon_1,\ \varepsilon_2,\ \ldots,\ \varepsilon_k,\ \ldots \tag{39.4}$$

where there are $n_0$ particles in the state with energy $\varepsilon_0$ and in general
there are $n_k$ particles with energy $\varepsilon_k$ in the gas. Then the total
energy of the gas is

$$\mathcal{E} = \sum_k n_k\varepsilon_k. \tag{39.5}$$

By giving different combinations of numbers $n_k$, we will obtain
the total energy values forming the series (39.1).

We have repeatedly seen that the energy value $\varepsilon_k$ does not yet
define the particle states. For example, the energy of a hydrogen
atom depends only upon the principal quantum number $n$,* so
that the atom can have $2n^2$ states for a given energy [see (33.1)].
This number, $2n^2$, is called the weight of the state with energy $\varepsilon_n$.
But it is also possible to place the system under such conditions that
the energy value defines the state in principle uniquely. We note,
first of all, that in all atoms except hydrogen the energy depends
not only on $n$, but on $l$ also, i.e., on the azimuthal quantum number.
Further, account of the interaction between spin and orbit shows
that there is a dependence of the energy upon the total angular
momentum $j$ and, finally, if the atom is placed in an external magnetic
field, the energy also depends upon the projection of angular
momentum on the magnetic field. Thus, one energy value corresponds
uniquely to one state of the atom, and vice versa.

In a magnetic field all the states with the same principal
quantum number are split. We now consider how the states of a
gas in a closed vessel are split. We shall suppose that the vessel is
of the form of a box with incommensurable squares of the sides
$a_1^2$, $a_2^2$, $a_3^2$. Then, in accordance with equation (25.19), the energy
of the particles is proportional to the quantity $\frac{n_1^2}{a_1^2} + \frac{n_2^2}{a_2^2} + \frac{n_3^2}{a_3^2}$,
where $n_1$, $n_2$, $n_3$ are positive integers. Any combination of these
integers gives one and only one number for the incommensurable values
$a_1^2$, $a_2^2$, $a_3^2$. Therefore, specification of the energy defines all three
integers $n_1$, $n_2$, $n_3$. If the particles possess an intrinsic angular
momentum, we can, so to speak, remove the degeneracy by placing
the gas in a magnetic field (an energy eigenvalue to which there
correspond several states of a system is termed degenerate). We shall
first consider only completely removed degeneracy.
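
The removal of degeneracy by incommensurable sides can be illustrated by direct enumeration. The sketch below is an assumption-laden example, not taken from the text: it counts how many triples $(n_1, n_2, n_3)$ share each value of $n_1^2/a_1^2 + n_2^2/a_2^2 + n_3^2/a_3^2$, first for a cube and then for a box with $a_1^2 : a_2^2 : a_3^2 = 1 : \sqrt{2} : \sqrt{3}$. The function name `level_degeneracies`, the choice of sides, and the cutoff `n_max` are hypothetical; energies are compared after rounding to nine decimals, which is adequate for the small quantum numbers used here.

```python
from collections import Counter
from itertools import product
import math

def level_degeneracies(a_squared, n_max=4):
    """Count states (n1, n2, n3) per value of sum of n_i^2 / a_i^2."""
    counts = Counter()
    for n1, n2, n3 in product(range(1, n_max + 1), repeat=3):
        energy = (n1 * n1 / a_squared[0]
                  + n2 * n2 / a_squared[1]
                  + n3 * n3 / a_squared[2])
        counts[round(energy, 9)] += 1
    return counts

cube = level_degeneracies((1.0, 1.0, 1.0))
skew = level_degeneracies((1.0, math.sqrt(2.0), math.sqrt(3.0)))

print("cube: levels", len(cube), "largest degeneracy", max(cube.values()))
print("incommensurable box: levels", len(skew),
      "largest degeneracy", max(skew.values()))
```

In the cube, permutations of the same three integers give the same energy; in the box with incommensurable squares of the sides, each of the 64 enumerated states has its own energy value.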

States of an ideally closed system. We shall now consider the energy
spectrum of a gas completely isolated from possible external influences
and consisting of absolutely noninteracting particles. For simplicity,
we shall assume that one value of energy corresponds to each
state of the system as a whole, and, conversely, one state corresponds
to each energy value. This assumption is true if all the energy
eigenvalues for each particle are incommensurable numbers.** We shall
call these numbers $\varepsilon_k$. Then, if there are $n_k$ particles in the $k$-th
state, the total energy of the gas is equal to $\mathcal{E} = \sum_k n_k\varepsilon_k$. But, for
incommensurable $\varepsilon_k$, it is possible in principle to determine all
$n_k$ from this equation, provided $\mathcal{E}$ is precisely specified. It is clear,
however, that the energy of a gas consisting of a sufficiently large
number of particles must be specified with truly exceptional
accuracy for it to be possible to really find all $n_k$ from $\mathcal{E}$.

* Not to be confused with $n_k$!

** In a rectangular box the state $\varepsilon(n_1, n_2, n_3)$ has an energy which is
commensurable with the energy of state $\varepsilon(2n_1, 2n_2, 2n_3)$. Therefore, the energy
of all states can be incommensurable only in a box of more complex form than
rectangular.

It is not a question of determining the state of an individual
particle from its energy but of finding the state of the whole gas
from the sum of the energies of all of its particles. Every interval
of values $d\mathcal{E}$, even very small (though not infinitely small), will
include very many eigenvalues $\mathcal{E}$. Each of them corresponds to its
own set of values $n_k$, i.e., to a definite state of the system as a whole.

States of a nonideally closed system. Energy is an exact integral
of motion only in an ideally closed system. The state of such a system
is maintained for an indefinitely long time, and the conservation
of the quantity $\mathcal{E}$ provides for the constancy of all $n_k$. But nature
does not (and cannot) have ideally closed systems. Every system 
interacts in some way with the surrounding medium. We will regard 
this interaction as weak and will determine how it affects the be¬ 
haviour of the system. 

Let us assume that the interaction with the medium does not
noticeably disturb the quantum levels of separate particles.
Nevertheless, every level $\varepsilon_k$ ceases being a precise number and receives
a small, though finite, width $\Delta\varepsilon_k$. This is sufficient for the meaning
of the equation $\mathcal{E} = \sum_k n_k \varepsilon_k$ to change in a most radical way: in a
system consisting of a large number of particles, the equation
containing approximate quantities $\varepsilon_k$ no longer defines the numbers $n_k$.

In other words, an interaction with the surrounding medium,
no matter how weak, makes impossible an accurate determination
of the state from the total energy $\mathcal{E}$.

Transitions between close-energy states. In an ideally closed system,
transitions were forbidden for all states corresponding to an energy
interval $d\mathcal{E}$ because the energy conservation law held strictly. For
weak interaction with the medium, all transitions that do not change
the total energy of the gas as a whole are possible to an accuracy
which is, in general, compatible with the determination of the energy
of a nonideally closed system.

Let us suppose that the interaction with the medium is so weak
that, for some small interval of time, it is possible, in principle, to
determine all the quantities $n_k$ and thus to give the total energy of
the gas $\mathcal{E} = \sum_k n_k \varepsilon_k$. But over a large interval of time the state of




418 


STATISTICAL PHYSICS 


[Part IV 


the gas can now vary within the limits of that interval of total energy
which is given by the inaccuracy in the energy of separate states
$\Delta\varepsilon_k$. All transitions will occur that are compatible with the
approximate equation $\mathcal{E} = \sum_k n_k (\varepsilon_k \pm \Delta\varepsilon_k)$. Naturally, the state in which
all $\Delta\varepsilon_k$ are of one sign is extremely improbable; this is why the
double symbol ± is written. We must find the state that is formed
as a result of all possible transitions in the interval $d\mathcal{E}$.

The probabilities of direct and reverse transitions. A very important
relation exists between the probabilities for a direct and reverse
transition. Let us first of all consider this relation on the basis of equation
(34.29), which is obtained as a first approximation in perturbation
theory. Let there be two states A and B in the system, with wave
functions $\psi_A$ and $\psi_B$. The same value of energy corresponds to
these states, within the limits of inaccuracy given by the
interaction of the system with the medium. In the interval $d\mathcal{E}$, both
states may be regarded as belonging to a continuous spectrum.
Then, from (34.29), the probability of a transition from A to B in
unit time is equal to $\frac{2\pi}{\hbar} |\mathcal{H}_{AB}|^2 g_B$, and from B to A, $\frac{2\pi}{\hbar} |\mathcal{H}_{BA}|^2 g_A$,
where

$$\mathcal{H}_{AB} = \int \psi_A^* \mathcal{H} \psi_B \, dV, \qquad \mathcal{H}_{BA} = \int \psi_B^* \mathcal{H} \psi_A \, dV$$

(the weights of the states are denoted by $g_A$, $g_B$). But, if $g_A = g_B$,
then the probabilities for direct and reverse transitions, which we
shall call $W_{AB}$ and $W_{BA}$, are equal because $|\mathcal{H}_{AB}|^2 = |\mathcal{H}_{BA}|^2$.
Naturally, the transition is only possible because the energies $\mathcal{E}_A$ and
$\mathcal{E}_B$ are not defined with complete accuracy, and a small interval
$d\mathcal{E}$ is given in which the energy spectrum is continuous. In an ideally
closed system we would have $\mathcal{E}_A \neq \mathcal{E}_B$.

The relationship we have found only holds to a first approximation
in the perturbation method. However, there is also an accurate
relationship that can be deduced from the general principles of
quantum mechanics. In accordance with the accurate relationship,
the probabilities for transitions from A to B and from B* to A*
are equal; here A* and B* differ from A and B in the signs of all
the linear- and angular-momentum components.

The equal probability for states with the same energy. We have
seen that, due to interaction with the medium, transitions will occur,
in the system, between all kinds of states A, B, C, ... belonging
to the same energy interval $d\mathcal{E}$. If we wait long enough the system
will pass equal intervals of time in the states A, B, C, .... This is
most easily proven indirectly, supposing first of all that the
probabilities for direct and reverse transitions are simply equal: $W_{AB} = W_{BA}$.
The refinement $W_{AB} = W_{B^*A^*}$ does not make any essential change.

For simplicity we shall consider only two states such that $W_{AB} = W_{BA}$,
and denote by $t_A$ and $t_B$ the times the system spends in them.
We at first assume that $t_A$ is greater than $t_B$, so that the system
will more frequently change from A to B than from B to A. But
this cannot continue over an indefinitely long time, because if the
ratio $t_A : t_B$ increases, the system will finally be constantly in A
despite the fact that a transition is possible from A to B. Only the
equality $t_A = t_B$ can hold for an indefinite time (on the average)
on account of the fact that direct and reverse transitions occur on
the average with equal frequency. The same argument shows that
if there are many states for which direct and reverse transitions
are equally probable, then over a sufficiently long period of time
the system will, on the average, spend the same time in each state.
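
The statement that equal direct and reverse transition probabilities lead to equal occupation of all states can be illustrated with a small balance calculation. This is a sketch under stated assumptions: the three-state system, the numerical values in `W` and the helper `evolve` are hypothetical, and the calculation follows the probability of finding the system in each state (which, for a long run, plays the role of the fraction of time spent there).

```python
def evolve(p, w, steps):
    """Iterate p_A <- p_A + sum_B w[A][B] * (p_B - p_A) for a symmetric table w."""
    n = len(p)
    p = list(p)
    for _ in range(steps):
        p = [p[a] + sum(w[a][b] * (p[b] - p[a]) for b in range(n)) for a in range(n)]
    return p


# Hypothetical per-step transition probabilities with W_AB = W_BA.
W = [
    [0.00, 0.10, 0.02],
    [0.10, 0.00, 0.05],
    [0.02, 0.05, 0.00],
]

# Start with the system certainly in the first state.
final = evolve([1.0, 0.0, 0.0], W, 2000)
print(final)  # all three entries approach the same value
```

The individual values of the transition probabilities affect only how fast the equal distribution is reached, not the distribution itself.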

We can suppose that $t_A = t_{A^*}$, because the states A and A* differ
only in the signs of all linear and angular momenta (and also the
sign of the external magnetic field, which must also be changed so
that the magnetic energy of all particles is the same in states A
and A*). If we proceed from the natural assumption that $t_A = t_{A^*}$,
then all the preceding argument can be extended to the case when
$W_{AB} = W_{B^*A^*}$.

We have thus seen that the system spends the same time in all
states (with the same weight) that belong to the same total energy
interval $d\mathcal{E}$.

The probability of a separate state. We will call the limit of the
ratio $t_A/t$, when t increases indefinitely, the probability of the state $q_A$.
The equality of all $t_A$ implies that corresponding states are equally
probable. But this allows us to define the probability of each state
directly. Indeed, if there are P states, then $\sum_{A=1}^{P} q_A = 1$, because
$\sum_{A=1}^{P} t_A = t$. But since the states are, as proven, equiprobable, we find
that $q_A = 1/P$. Similarly, the probability that a tossed coin will
fall heads is equal to 1/2, since the occurrence of heads or tails is
practically equiprobable.

Hence, the problem of finding probability is reduced to that of 
combinatorial analysis. But in order to use this analysis we must 
determine which states of the system can be regarded as physically 
different. When computing the total number P we must take each 
such state once. 

Specification of gas states in statistics. If a gas consists of identical 
particles, for example, electrons, helium atoms, etc., then its state 
is precisely given if we know how many particles occur in each one 
of the states. It is not meaningful to ask which particles occur in a 
certain state, since identical particles cannot, in principle, be
distinguished from one another. If the spin of the particles is half-integral
then Pauli’s exclusion principle must hold and in each state there
will occur either one particle or none at all.

As an illustration for calculating the number of states of a system
as a whole, let us suppose that there are only two particles and that
each particle can have only two states a and b ($\varepsilon_a = \varepsilon_b$), each with
weight unity. In all, three different states of the system are
conceivable:

1) both particles in state a; state b is unoccupied;

2) the same in state b; a is unoccupied;

3) one particle in each state.

In view of the indistinguishability of the particles, state (3) must
be counted once because the interchange of identical particles between
states does not have meaning. If, in addition, the particles are subject
to the exclusion principle, then only the third state is possible.

Thus, if the exclusion principle is applicable to the particles then 
the system can have only one state and, if it is not, then three states 
are possible. Pauli exclusion greatly reduces the number of possible 
states of a system. A system of two different particles, for example, 
an electron and proton, would have four states because these particles 
can obviously be distinguished. 

Let us further consider the example of three particles occupying
three states. If Pauli exclusion is operative, then one, and only one,
state of the system as a whole is possible: one particle occurs in each
separate state. If there is no exclusion, then the indistinguishable
particles can be distributed thus: one in each quantum state (one state
of the system), or two particles in one state and the third in one of
the two remaining states (this gives six states for the system as a
whole), or all three particles in any one quantum state (three states).
Thus we have obtained 1 + 6 + 3 = 10 states for the system as a whole.

If these three particles differed, for example, if they were $\pi^+$-,
$\pi^0$-, and $\pi^-$-mesons of zero spin, then each of them could have any
one of three states independently of the others. Consequently, a
system of three such particles could, as a whole, have $3^3 = 27$ states.
Later on we shall derive a general formula for calculating the number
of states.
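
The counts quoted in these examples can be reproduced by direct enumeration. The sketch below is illustrative only; the function name `count_states` and the dictionary keys are hypothetical labels. Indistinguishable particles without exclusion correspond to selections with repetition, particles obeying exclusion to selections without repetition, and distinguishable particles to ordered assignments.

```python
from itertools import combinations, combinations_with_replacement, product


def count_states(n_particles, n_states):
    """Count states of the whole system for the three kinds of particles."""
    states = range(n_states)
    return {
        # identical particles, any number per state
        "no_exclusion": sum(1 for _ in combinations_with_replacement(states, n_particles)),
        # identical particles, at most one per state
        "exclusion": sum(1 for _ in combinations(states, n_particles)),
        # different particles, each placed independently
        "distinguishable": sum(1 for _ in product(states, repeat=n_particles)),
    }


print(count_states(2, 2))
print(count_states(3, 3))
```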

Particles not subject to the Pauli exclusion principle. There is no
sense, for future deductions, in considering that each state of a
particle of given energy has unit weight. We shall denote the weight
of a state with energy $\varepsilon_k$ by the letter $g_k$. In other words, $g_k$ different
states of a particle have the same energy $\varepsilon_k$. For every particle
these states are equally probable.

Let us assume that $n_k$ particles have energy $\varepsilon_k$ and are not subject
to Pauli exclusion. It is required to calculate the number of ways
that these $n_k$ particles can be distributed in $g_k$ states. We shall




call this number $P_{g_k n_k}$. In accordance with what we have proved
above, the probability for each arrangement as a whole is $1/P_{g_k n_k}$.

In order to calculate $P_{g_k n_k}$ we will, as is usual in combinatorial
analysis, call the state a “box” and the particle a “ball.” The problem
is: how many ways are there of placing $n_k$ balls in $g_k$ boxes without
numbering the balls, i.e., without desiring to know which ball lies
in which box? If the particles are not subject to the Pauli exclusion
principle then we must suppose that each box can accommodate
any number of balls.

Let us mix up all the boxes and all the balls so that we obtain
$n_k + g_k$ objects. From these objects we take any box and put it
aside. The $n_k + g_k - 1$ objects which remain are then randomly
taken from the common pile, irrespective of whether they are box
or ball, and placed in one row with this box from left to right. The
following series may be obtained:

bx, bl, bl, bx, bx, bl, bl, bl, bx, bl, bx, bl, bl, bx, bx, bx.

Since a box must appear on the left, the remaining objects can be
distributed among themselves in $(n_k + g_k - 1)!$ ways.

We now throw each ball into the closest box on the left. In the
distribution we have used there will be two balls in the first box,
none in the second box, three in the third, one in the fourth, and
so on. There are $(n_k + g_k - 1)!$ distributions in all, but they are
not all distinguishable. Indeed, if we place the second ball in the
position of the first or any other one, nothing will change in the
series shown. There are $n_k!$ permutations between the balls. In
exactly the same way the boxes can be interchanged with the boxes
because it does not matter in which order these boxes appear. Only
we must not touch the first box, because it always appears on the
left by convention. In all, there are $(g_k - 1)!$ permutations of the
boxes. It follows that, of all the possible $(n_k + g_k - 1)!$ arrangements
in the series, only the following number of arrangements is different:


$$P_{g_k n_k} = \frac{(n_k + g_k - 1)!}{n_k!\,(g_k - 1)!}. \tag{39.6}$$

If, for example, $n_k = 3$, $g_k = 3$, then $P_{g_k n_k} = 5!/(3!\,2!) = 10$, which is
what we have already seen from direct computation.
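
Formula (39.6) can be compared with a brute-force count of all ways of filling the boxes. This is a minimal sketch; the function names are hypothetical, and the brute-force version simply lists every set of occupation numbers of the $g_k$ states whose sum is $n_k$.

```python
from itertools import product
from math import factorial


def bose_count_formula(n, g):
    """Number of arrangements according to (39.6)."""
    return factorial(n + g - 1) // (factorial(n) * factorial(g - 1))


def bose_count_brute(n, g):
    """Count occupation tuples (one entry per state) that hold n balls in total."""
    return sum(1 for occ in product(range(n + 1), repeat=g) if sum(occ) == n)


print(bose_count_formula(3, 3), bose_count_brute(3, 3))
```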

Particles subject to Pauli exclusion. In the case of particles subject
to Pauli exclusion the calculation of $P_{g_k n_k}$ is still simpler. Indeed,
here we always have the inequality $n_k \le g_k$, because not more than
one particle occurs in each state. Therefore, of the total number of
$g_k$ states $n_k$ are occupied.

The number of ways in which we can choose $n_k$ states is equal
to the number of combinations of $g_k$ things $n_k$ at a time:

$$P_{g_k n_k} = \frac{g_k!}{n_k!\,(g_k - n_k)!}. \tag{39.7}$$

There are as many possible different states in the case of $n_k \le g_k$,
and there is one particle or none at all in any of the $g_k$ states.
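
The same kind of check applies to (39.7). In this sketch (hypothetical function names) each of the $g_k$ states is either empty or holds one particle, and the brute-force count keeps the patterns with exactly $n_k$ occupied states.

```python
from itertools import product
from math import factorial


def fermi_count_formula(n, g):
    """Number of arrangements according to (39.7); requires n <= g."""
    return factorial(g) // (factorial(n) * factorial(g - n))


def fermi_count_brute(n, g):
    """Count patterns of 0/1 occupations of g states with n particles in total."""
    return sum(1 for occ in product((0, 1), repeat=g) if sum(occ) == n)


print(fermi_count_formula(3, 3), fermi_count_formula(2, 5))
```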

The most probable distribution of particles among states. The
numbers $g_k$ and $n_k$ refer to a single definite energy. The total number
of states of a gas is equal to the product of the numbers $P_{g_k n_k}$ for
all the states separately:

$$P = \prod_k P_{g_k n_k}. \tag{39.8}$$

So far we have only used combinatorial analysis. And besides it has
been shown that all states, taken separately, are equally
probable. The quantity P depends upon the distribution of particles
among the states. It can be seen that, in fact, a gas is always close
to a state where the distribution of separate particles among the
states corresponds to the maximum value of P possible for a given
total energy $\mathcal{E}$ and for a given total number of particles.

We shall explain this statement by a simple example from gambling,
as is usually done in probability theory (most easily seen here is the
manifestation of large-number laws in a game of chance). Let a coin
be tossed N times. The probability that it will fall heads in a single
toss is equal to 1/2. The probability that it will fall heads all N times is
equal to $(1/2)^N$. The probability that it will fall N - 1 times heads,
and once tails, is equal to $(1/2)^{N-1} \times 1/2 \times N$, because this single
occasion can turn out to be any one, from the first to the last, and
the probabilities for mutually exclusive events are additive. The
probability for a double tails is equal to $(1/2)^{N-2} (1/2)^2 \, \frac{N(N-1)}{2}$.
The last factor shows how many ways two events can be chosen from
the total number N (the number of combinations of N two at a time).
In general, the probability that the coin will fall tails k times is

$$q_k = \left(\frac{1}{2}\right)^{N-k} \left(\frac{1}{2}\right)^{k} \frac{N!}{k!\,(N-k)!}.$$

The sum of all probabilities is, of course, unity:

$$\sum_{k=0}^{N} q_k = \left(\frac{1}{2}\right)^{N} \left[1 + N + \frac{N(N-1)}{1 \cdot 2} + \frac{N(N-1)(N-2)}{1 \cdot 2 \cdot 3} + \dots \right] = 1,$$

because the sum of binomial coefficients is equal to $2^N$.

Considering the series $q_k$, we can see that $q_k$ increases up to the
middle of the sum, i.e., as far as $k = N/2$, and then decreases
symmetrically with respect to the middle of the sum. Indeed, the kth
term is obtained from the (k - 1)th term multiplied by $\frac{N - k + 1}{k}$,
so that the terms increase as long as $N/2 > k$.

Every separate series for tails appearing is in every way equally
probable with all other series. The probability for any given series
is equal to $(1/2)^N$. But if we are not interested in the sequence of
heads and tails, but only in their total number, then the probabilities
will be equal to the numbers $q_k$. For $N \gg 1$, the function $q_k$ has a
very sharp maximum at $k = N/2$ and rapidly falls away on both
sides of $N/2$. If we call the total number of N tosses a “game,” then
in the overwhelming majority of games, heads will be obtained
approximately $N/2$ times (if N is large). The probability maximum
will be sharper, the greater N is. We will not, here, refine this as
applied to the game of pitch and toss (see exercise 1), but will return
to the calculation of the number of states of a gas.
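
The sharpening of the maximum with growing N can be seen by computing the probabilities $q_k$ directly. The following sketch uses hypothetical helper names and arbitrarily chosen game lengths; `share_within` adds up the probabilities of all games in which the number of tails differs from $N/2$ by no more than a given amount.

```python
from math import comb


def q(k, n_tosses):
    """Probability of exactly k tails in n_tosses tosses of a fair coin."""
    return comb(n_tosses, k) / 2**n_tosses


def share_within(n_tosses, delta):
    """Probability that the number of tails lies within delta of n_tosses / 2."""
    half = n_tosses // 2
    total = sum(comb(n_tosses, k) for k in range(half - delta, half + delta + 1))
    return total / 2**n_tosses


# Deviations of at most 5 per cent of N from the middle value N/2.
print(share_within(100, 5))
print(share_within(10000, 500))
```

With the same relative tolerance, a much larger share of the long games falls inside the window, which is the sense in which the maximum becomes sharper.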

On the basis of the equal probability for the direct and reverse
processes between any pair of states, we have shown that any
previously defined distribution of particles among states has the same
probability of being established for a given total energy. In the
same way, every separate sequence of heads and tails in each separate
game is of equal probability. But, if we do not specify the states
of a gas by denoting which of the $g_k$ states with a given energy
are filled, and give only the total number of particles in a state with
energy $\varepsilon_k$, then we obtain a probability distribution with a
maximum similar to the probability distribution of games according to
the total number of occurrences of tails irrespective of their sequence.
The only difference is that in the example of pitch and toss the
probability depends upon one parameter k, and the probability
for the distribution of gas particles among states depends upon
all $n_k$.

Our problem is to find this distribution for particles with integral 
and half-integral spins. 

It is most convenient to look for the maximum of the logarithm
of P rather than P itself. $\ln P$ is a monotonic function of its argument
and assumes a maximum value at the same time as the argument P.

Stirling’s formula. In calculations we shall require logarithms
of factorials. For the factorials of large numbers, there is a
convenient approximate formula which we shall deduce here.

It is obvious that

$$\ln n! = \ln (1 \cdot 2 \cdot 3 \cdot 4 \dots n) = \sum_{k=1}^{n} \ln k.$$

The logarithms of large numbers vary rather slowly since the difference
$\ln (n + 1) - \ln n$ is inversely proportional to n. Therefore, the sum
can be replaced by an integral:

$$\ln n! = \sum_{k=1}^{n} \ln k \approx \int_0^n \ln k \, dk = n \ln n - n = n \ln \frac{n}{e}. \tag{39.9}$$

This is the well-known mathematical formula of Stirling in somewhat
simplified form. It becomes more accurate the greater n is.
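
How quickly the simplified formula (39.9) becomes accurate can be checked against the exact logarithm of the factorial. This is a sketch with hypothetical function names; it uses the standard library function `math.lgamma`, for which `lgamma(n + 1)` equals $\ln n!$. The refined form with the additional term $\frac{1}{2}\ln 2\pi n$ is the standard correction and is included only for comparison.

```python
from math import lgamma, log, pi


def ln_factorial(n):
    """Exact ln n! via the log-gamma function."""
    return lgamma(n + 1)


def stirling_simple(n):
    """Simplified Stirling formula (39.9): n ln n - n."""
    return n * log(n) - n


def stirling_refined(n):
    """Stirling formula with the correction term (1/2) ln(2 pi n)."""
    return n * log(n) - n + 0.5 * log(2 * pi * n)


def relative_error(n):
    return (ln_factorial(n) - stirling_simple(n)) / ln_factorial(n)


for n in (10, 100, 1000):
    print(n, relative_error(n))
```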

Additional maximum conditions. And so we must look for the values
of the numbers $n_k$ for which the quantity

$$S = \ln P = \ln \prod_k P_{g_k n_k} \tag{39.10}$$

is a maximum at a given total energy

$$\mathcal{E} = \sum_k n_k \varepsilon_k \tag{39.11}$$

and for a given total number of particles

$$N = \sum_k n_k. \tag{39.12}$$

This kind of extremal condition is termed bound, because
additional conditions (39.11) and (39.12) are imposed upon it.

We shall first of all find $n_k$ for particles which are not subject
to Pauli exclusion, i.e., those with integral spin. To do this we must
substitute the expression for P from (39.6) in (39.10), take the
differential dS with respect to all $n_k$, and equate it to zero. We have

$$S = \ln P = \ln \prod_k \frac{(g_k + n_k - 1)!}{n_k!\,(g_k - 1)!} = \sum_k \ln \frac{(g_k + n_k - 1)!}{n_k!\,(g_k - 1)!}. \tag{39.13}$$

We substitute here the expression for factorials according to
Stirling’s formula (39.9):

$$S = \sum_k \left[ (g_k + n_k - 1) \ln \frac{g_k + n_k - 1}{e} - n_k \ln \frac{n_k}{e} - (g_k - 1) \ln \frac{g_k - 1}{e} \right]. \tag{39.14}$$

Since $g_k$ is a large number, unity can naturally be neglected
everywhere. We must, of course, differentiate with respect to $n_k$ in formula
(39.14), because $g_k$ is the given number of all states. Then

$$dS = \sum_k dn_k \left[ \ln (g_k + n_k) - \ln n_k \right] = \sum_k dn_k \ln \frac{g_k + n_k}{n_k} = 0. \tag{39.15}$$

It must not be concluded from this equation that the coefficients
of every $dn_k$ are equal to zero, because $n_k$ are dependent quantities.



The relationship between them is given by the two equations (39.11)
and (39.12) and, in differential form, is as follows:

$$d\mathcal{E} = \sum_k \varepsilon_k \, dn_k = 0, \tag{39.16}$$

$$dN = \sum_k dn_k = 0. \tag{39.17}$$

From these equations, we could eliminate any two of the numbers
$dn_k$, substitute them in (39.15), and afterwards regard the remaining
$dn_k$ as independent quantities. Then their coefficients may be regarded
as equal to zero.

The method of undetermined coefficients. The elimination of
dependent quantities is most conveniently achieved by the method
of undetermined coefficients. This makes it possible to preserve
the symmetry between all $n_k$. Let us multiply equation (39.16)
by an indefinite coefficient which we denote by $-1/\theta$; the meaning
of this notation will be explained later. We multiply the second
equation (39.17) by a coefficient which we denote $\mu/\theta$, so that we
have introduced, as is required from the number of supplementary
conditions, two quantities, $\theta$ and $\mu$. After this we combine all three
equations (39.15)-(39.17) and regard all $dn_k$ as independent, and
$\theta$ and $\mu$ as unknown values which should be determined from equations
(39.11) and (39.12). The maximum condition is now written as

$$dS - \frac{d\mathcal{E}}{\theta} + \frac{\mu \, dN}{\theta} = 0. \tag{39.18}$$

We look for the extremum of one quantity $S - \frac{\mathcal{E}}{\theta} + \frac{\mu N}{\theta}$, and
then choose $\theta$ and $\mu$ so that the energy and number of particles
equal the given values. But if the extremum is determined for
one function without conditions, then all its arguments become
mutually independent, and we are entitled to equate any differential
$dn_k$ to zero regardless of the other differentials.

Equation (39.18), written in terms of $dn_k$, has the following form:

$$\sum_k dn_k \left[ \ln \frac{g_k + n_k}{n_k} - \frac{\varepsilon_k}{\theta} + \frac{\mu}{\theta} \right] = 0. \tag{39.19}$$

Bose-Einstein distribution. Let us now put all the differentials
except $dn_k$ equal to zero. According to what we have just said this
is justified. Then, for equation (39.19) to hold, we must put the
coefficient of $dn_k$ equal to zero:

$$\ln \frac{g_k + n_k}{n_k} - \frac{\varepsilon_k}{\theta} + \frac{\mu}{\theta} = 0. \tag{39.20}$$

Naturally, this equation holds for all k. Solving it with respect
to $n_k$, we arrive at the required most probable distribution of the
number of particles according to state:

$$n_k = \frac{g_k}{e^{\frac{\varepsilon_k - \mu}{\theta}} - 1}. \tag{39.21}$$

This formula is called the Bose-Einstein distribution. As to particles
for which this distribution is applicable, they are said to obey
Bose-Einstein statistics or, for short, Bose statistics. They have either
integral or zero spin. The unknown quantities $\theta$ and $\mu$, i.e., the
parameters in the distribution, are given by equations as functions of
N and $\mathcal{E}$:

$$\sum_k \frac{g_k \varepsilon_k}{e^{\frac{\varepsilon_k - \mu}{\theta}} - 1} = \mathcal{E}, \tag{39.22}$$

$$\sum_k \frac{g_k}{e^{\frac{\varepsilon_k - \mu}{\theta}} - 1} = N, \tag{39.23}$$

so that the problem posed of finding the most probable values of
$n_k$ is, in principle, solved.
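Fermi-Dirac distribution. We shall now find the quantities $n_k$
for the case when the particles are subject to Pauli exclusion. In
accordance with (39.7) and Stirling’s formula we have for the
quantity S:

$$S = \sum_k \ln \frac{g_k!}{n_k!\,(g_k - n_k)!} = \sum_k \left[ g_k \ln \frac{g_k}{e} - n_k \ln \frac{n_k}{e} - (g_k - n_k) \ln \frac{g_k - n_k}{e} \right]. \tag{39.24}$$

Differentiating S and substituting into equation (39.18), we obtain

$$dS - \frac{d\mathcal{E}}{\theta} + \frac{\mu \, dN}{\theta} = \sum_k dn_k \left[ \ln \frac{g_k - n_k}{n_k} - \frac{\varepsilon_k}{\theta} + \frac{\mu}{\theta} \right] = 0, \tag{39.25}$$

whence, by the same method, we arrive at the extremum condition:

$$\ln \frac{g_k - n_k}{n_k} - \frac{\varepsilon_k}{\theta} + \frac{\mu}{\theta} = 0,$$

and the required distribution appears thus:

$$n_k = \frac{g_k}{e^{\frac{\varepsilon_k - \mu}{\theta}} + 1}. \tag{39.26}$$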




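Here, $n_k \le g_k$, as is the case for particles subject to Pauli exclusion.
For such particles, formula (39.26) is called a Fermi-Dirac
distribution. The parameters $\theta$ and $\mu$ are determined analogously to (39.22)
and (39.23):

$$\sum_k \frac{g_k \varepsilon_k}{e^{\frac{\varepsilon_k - \mu}{\theta}} + 1} = \mathcal{E}, \tag{39.27}$$

$$\sum_k \frac{g_k}{e^{\frac{\varepsilon_k - \mu}{\theta}} + 1} = N. \tag{39.28}$$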
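
In practice the parameter $\mu$ has to be found numerically from the condition on the number of particles. The following sketch solves (39.28) for $\mu$ by bisection at a fixed $\theta$; the levels, weights, search interval and function names are illustrative assumptions. Bisection is applicable because the left-hand side of (39.28) grows monotonically with $\mu$.

```python
from math import exp


def fermi_dirac(g, eps, mu, theta):
    """Most probable occupation number according to (39.26)."""
    return g / (exp((eps - mu) / theta) + 1.0)


def total_particles(levels, weights, mu, theta):
    return sum(fermi_dirac(g, e, mu, theta) for g, e in zip(weights, levels))


def solve_mu(levels, weights, n_total, theta, lo=-50.0, hi=50.0, tol=1e-12):
    """Find mu such that the occupation numbers add up to n_total (bisection)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if total_particles(levels, weights, mid, theta) < n_total:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Hypothetical example: five equally spaced levels of weight 2, half filled.
levels = [0.0, 1.0, 2.0, 3.0, 4.0]
weights = [2.0] * 5
theta = 0.5
mu = solve_mu(levels, weights, 5.0, theta)
occupations = [fermi_dirac(g, e, mu, theta) for g, e in zip(weights, levels)]
print(mu, occupations)
```

For this symmetric half-filled example $\mu$ comes out at the middle level, and no occupation number exceeds the weight of its level.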

Concerning the parameters $\theta$ and $\mu$. The parameter $\theta$ is an essentially
positive quantity, because otherwise it would be impossible to satisfy
equations (39.22), (39.23) and (39.27), (39.28). Indeed, there is no
upper limit to the energy spectrum of gas particles. For an infinitely
large $\varepsilon$ and $\theta < 0$, we would obtain $e^{\frac{\varepsilon - \mu}{\theta}} = 0$, so that, by itself, a Bose
distribution would lead to the absurd result $n_k < 0$. In (39.23),
on the left, we would have the negative infinite quantity $-\sum_k g_k$,
which can in no way equal N. Similarly, a Fermi distribution would
lead to infinite positive quantities on the left-hand sides of (39.27)
and (39.28); and this is impossible for finite N and $\mathcal{E}$ on the right.
Therefore,

$$\theta > 0. \tag{39.29}$$

In the following section it will be shown that the quantity $\theta$ is
proportional to the absolute temperature of the gas.

The quantity $\mu$ is very important in the theory of chemical and
phase equilibria. These applications will be considered later (see
the end of Sec. 46 and the succeeding sections of the book).

The weight of a state. Here we give a few more formulae for the
weight of a state of an ideal gas particle. The weight of a state with
energy between $\varepsilon$ and $\varepsilon + d\varepsilon$ is given by the formula (25.26), whose
left-hand side we shall denote now by $dg(\varepsilon)$. In addition we assume
that the particles have an eigenmoment j, so that we must take
into account the number of possible projections of j, equal to $2j + 1$:

$$dg(\varepsilon) = (2j + 1) \frac{V m^{3/2} \varepsilon^{1/2} \, d\varepsilon}{\sqrt{2} \, \pi^2 \hbar^3}. \tag{39.30}$$

For electrons $j = 1/2$, so that $2j + 1 = 2$.

For light quanta we must use formula (25.24), replacing K in
it by $\omega/c$ and multiplying by two, according to the number of possible
polarizations of the quantum:

$$dg(\omega) = \frac{V \omega^2 \, d\omega}{\pi^2 c^3}. \tag{39.31}$$

It is also useful to know the weight of a state whose linear momentum
components lie between $p_x$ and $p_x + dp_x$, $p_y$ and $p_y + dp_y$, $p_z$ and $p_z + dp_z$.
It is determined in accordance with (25.23), also with account taken
of the factor $2j + 1$. Thus, for electrons, we obtain

$$dg(p) = \frac{2 V \, dp_x \, dp_y \, dp_z}{(2\pi\hbar)^3}. \tag{39.32}$$


Exercises

1) Write down the formula for the probability that heads are obtained k
times for large N, where k is close to the maximum of $q_k$.

The general formula for probability is of the form:

$$q_k = \frac{N!}{(N - k)!\,k!} \, 2^{-N}.$$

We shall consider the numbers N and k as large. It is more convenient here
to use Stirling’s formula in a somewhat more exact form than (39.9), namely:

$$\ln N! = N \ln \frac{N}{e} + \frac{1}{2} \ln 2\pi N.$$

We put $k = \frac{N}{2} + x$, $N - k = \frac{N}{2} - x$, where x is a quantity small
compared with $\frac{N}{2}$. Then, in the correction terms of Stirling’s formula, $\frac{1}{2} \ln 2\pi k$
and $\frac{1}{2} \ln 2\pi (N - k)$, the quantity x can be neglected. We expand the
denominator in a series up to $x^2$:

$$\ln (N - k)! = \ln \left( \frac{N}{2} - x \right)! = \frac{N}{2} \ln \frac{N}{2e} - x \ln \frac{N}{2} + \frac{x^2}{N} + \frac{1}{2} \ln \pi N,$$

$$\ln k! = \ln \left( \frac{N}{2} + x \right)! = \frac{N}{2} \ln \frac{N}{2e} + x \ln \frac{N}{2} + \frac{x^2}{N} + \frac{1}{2} \ln \pi N.$$

The correction terms are

$$\frac{1}{2} \left( \ln 2\pi N - 2 \ln 2\pi \frac{N}{2} \right) = \frac{1}{2} \ln \frac{2}{\pi N}.$$

Substituting this in the expression for $q_k$ and taking antilogarithms we
arrive at the required formula:

$$q(x) = \sqrt{\frac{2}{\pi N}} \, e^{-\frac{2x^2}{N}}.$$

q has a maximum at x = 0 and dies away on both sides. q is reduced e times
in the interval $x_e = \sqrt{N/2}$, characterizing the sharpness of the maximum.

Compared with the whole interval of variation of x, the interval $x_e$ comprises
$\frac{x_e}{N} = \frac{1}{\sqrt{2N}}$. For example, for N = 1,000, the maximum is approximately
equal to 1/40. The ratio $x_e/N$ is about 2 per cent, so that, basically, the heads
fall between 475 and 525 times. The probability that heads (or tails) will fall
400 times out of a thousand is equal to $\frac{1}{40} e^{-\frac{2 \cdot 100^2}{1000}} = \frac{1}{40} e^{-20}$. In other
words, it is $e^{20}$ times, i.e., several hundreds of millions of times, less than
the probability for heads appearing five hundred times.
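
The numbers quoted in exercise 1 can be compared with the exact binomial probabilities. This is a sketch with hypothetical function names; `q_gauss` is the approximate formula obtained above, and `q_exact` is the exact expression for $q_k$.

```python
from math import comb, exp, pi, sqrt


def q_exact(k, n_tosses):
    """Exact probability of k heads in n_tosses tosses."""
    return comb(n_tosses, k) / 2**n_tosses


def q_gauss(x, n_tosses):
    """Approximation sqrt(2 / (pi N)) * exp(-2 x^2 / N), with x = k - N/2."""
    return sqrt(2.0 / (pi * n_tosses)) * exp(-2.0 * x * x / n_tosses)


N = 1000
print(q_exact(500, N), q_gauss(0, N), 1 / 40)

# Ratio of the probabilities of 400 and of 500 heads, exact and approximate.
ratio_exact = q_exact(400, N) / q_exact(500, N)
print(ratio_exact, exp(-20.0))
```

At the maximum the two expressions agree closely. Far out on the wing (x = 100) the approximate ratio $e^{-20}$ has the right order of magnitude but is no longer precise, since the expansion was carried only to $x^2$.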

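2) Verify that the probability q has been normalized, i.e., that $\int q(x) \, dx = 1$.

Since the probability decreases very rapidly with increase in the absolute
value of x, the integration can be extended from $-\infty$ to $\infty$ without
noticeable error. Then

$$\int_{-\infty}^{\infty} q(x) \, dx = \sqrt{\frac{2}{\pi N}} \int_{-\infty}^{\infty} e^{-\frac{2x^2}{N}} \, dx = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-\xi^2} \, d\xi.$$

We shall now show that the integral appearing here is indeed equal to $\sqrt{\pi}$.
We shall call it I:

$$I = \int_{-\infty}^{\infty} e^{-\xi^2} \, d\xi.$$

Squaring, we get

$$I^2 = \int_{-\infty}^{\infty} e^{-\xi^2} \, d\xi \int_{-\infty}^{\infty} e^{-\eta^2} \, d\eta = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-(\xi^2 + \eta^2)} \, d\xi \, d\eta.$$

The integration spreads over the whole $\xi\eta$-plane. Let us go over to polar
coordinates:

$$\xi = \rho \cos \varphi, \qquad \eta = \rho \sin \varphi.$$

Instead of $d\xi \, d\eta$ we must put $\rho \, d\rho \, d\varphi$, as is usual for polar coordinates.
Then,

$$I^2 = \int_0^{\infty} e^{-\rho^2} \rho \, d\rho \int_0^{2\pi} d\varphi = \frac{1}{2} \cdot 2\pi = \pi,$$

or

$$I = \sqrt{\pi},$$

so that

$$\int_{-\infty}^{\infty} q(x) \, dx = 1.$$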
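
The result of exercise 2 can be confirmed by a direct numerical integration. The sketch below uses a simple midpoint rule written out by hand (the helper `integrate`, the integration limits and the number of steps are illustrative choices); the finite limits stand in for the infinite ones because the integrand is negligible beyond them.

```python
from math import exp, pi, sqrt


def integrate(f, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


# The integral I of exp(-xi^2), to be compared with sqrt(pi).
gauss_integral = integrate(lambda xi: exp(-xi * xi), -10.0, 10.0)
print(gauss_integral, sqrt(pi))

# Normalization of q(x) for N = 1000 over the whole range of x.
N = 1000
norm = integrate(lambda x: sqrt(2.0 / (pi * N)) * exp(-2.0 * x * x / N), -500.0, 500.0)
print(norm)
```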


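3) Find the mean square deviation, for the occurrence of heads, from the
most probable number, i.e., $\overline{x^2}$.

We have

$$\overline{x^2} = \int_{-\infty}^{\infty} x^2 q(x) \, dx = \sqrt{\frac{2}{\pi N}} \int_{-\infty}^{\infty} x^2 e^{-\frac{2x^2}{N}} \, dx = \frac{N}{2\sqrt{\pi}} \int_{-\infty}^{\infty} \xi^2 e^{-\xi^2} \, d\xi.$$

In order to calculate the integral we make use of the result of exercise 2,
writing $\alpha \xi^2$ in the exponent instead of $\xi^2$:

$$\int_{-\infty}^{\infty} e^{-\alpha \xi^2} \, d\xi = \sqrt{\frac{\pi}{\alpha}}.$$

We differentiate both sides of this equation with respect to $\alpha$:

$$\int_{-\infty}^{\infty} \xi^2 e^{-\alpha \xi^2} \, d\xi = \frac{\sqrt{\pi}}{2} \, \alpha^{-3/2}.$$

Putting $\alpha = 1$ we arrive at the required formula:

$$\int_{-\infty}^{\infty} \xi^2 e^{-\xi^2} \, d\xi = \frac{\sqrt{\pi}}{2}.$$

We notice, incidentally, that

$$\int_{-\infty}^{\infty} \xi^4 e^{-\xi^2} \, d\xi = \frac{3}{4} \sqrt{\pi},$$

and in general

$$\int_{-\infty}^{\infty} \xi^{2n} e^{-\xi^2} \, d\xi = \frac{1 \cdot 3 \cdot 5 \dots (2n - 1)}{2^n} \sqrt{\pi}.$$

For $\overline{x^2}$, we obtain

$$\overline{x^2} = \frac{N}{4}.$$

Expressing N in terms of $\overline{x^2}$, we can write down the probability-distribution
law:

$$q(x) \, dx = \frac{1}{\sqrt{2\pi \overline{x^2}}} \, e^{-\frac{x^2}{2\overline{x^2}}} \, dx.$$

Thus, the width of the distribution is very simply related to the mean
square deviation of the quantity from its most probable value*:

$$x_e = \sqrt{2\overline{x^2}}.$$

Of course, this relationship holds only for an exponential distribution of the
type obtained in exercise 1.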
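
The value N/4 for the mean square deviation can also be checked against the exact binomial probabilities, without the approximation of exercise 1. This is a minimal sketch with a hypothetical function name; it keeps all intermediate sums as integers so that the result is not affected by rounding.

```python
from math import comb, sqrt


def mean_square_deviation(n_tosses):
    """Exact average of x^2 = (k - N/2)^2 over the binomial probabilities q_k."""
    total = sum((2 * k - n_tosses) ** 2 * comb(n_tosses, k)
                for k in range(n_tosses + 1))
    return total / (4 * 2**n_tosses)


N = 1000
x2 = mean_square_deviation(N)
print(x2, N / 4)

# Width of the distribution expressed through the mean square deviation.
print(sqrt(2 * x2), sqrt(N / 2))
```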


Sec. 40. Boltzmann Statistics 

(Translational Motion of a Molecule. Gas in an External Field)

Boltzmann distribution. Long before the Bose and Fermi quantum
distribution formulae (39.21) and (39.26) had been obtained, Boltzmann
derived a classical energy distribution law for the molecules
of an ideal gas. This law is obtained from both quantum laws by
means of a limiting process. We shall perform this transition purely
formally at first, and then decide which real conditions it corresponds to.

* For $x = x_e$, the probability q(x) decreases e times compared with q(0);
it is for this reason that $x_e$ characterizes the width of the distribution.


oc. 40] 


BOLTZMANN STATISTICS 


431 


Let $\varepsilon$ be measured from zero, and let the ratio $\mu/\theta$ be large in absolute
value and negative. Then

$$e^{\frac{\varepsilon - \mu}{\theta}}$$

is considerably greater than unity for all $\varepsilon$. Here, unity in the
denominator of both distribution formulae can be neglected as compared
with the exponent, and both the Bose and Fermi formulae take
on the same limiting form

$$n_k = g_k \, e^{\frac{\mu - \varepsilon_k}{\theta}}. \tag{40.1}$$

This is the Boltzmann distribution. Let us now determine the constant
$\mu$ from the normalization condition for the distribution:

$$\sum_k n_k = N. \tag{40.2}$$
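
The way both quantum formulae merge into the Boltzmann form can be seen by comparing the occupation per state, $n_k / g_k$, for the three distributions as a function of the single variable $(\varepsilon_k - \mu)/\theta$. This is a sketch; the function names and the two sample values of the argument are illustrative choices.

```python
from math import exp


def bose(x):
    """n_k / g_k from (39.21), with x = (eps_k - mu) / theta."""
    return 1.0 / (exp(x) - 1.0)


def fermi(x):
    """n_k / g_k from (39.26)."""
    return 1.0 / (exp(x) + 1.0)


def boltzmann(x):
    """n_k / g_k from (40.1)."""
    return exp(-x)


for x in (0.5, 10.0):
    print(x, bose(x), fermi(x), boltzmann(x))
```

For the large argument the three values coincide to within a relative difference of the order of $e^{-x}$, while for the small argument they differ substantially.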

Let us suppose that the gas molecules possess some internal degrees
of freedom (in addition to the external translational degrees of freedom)
that may be related to electron excitation, the vibration of nuclei
with respect to each other, and the rotation of the molecule in space.
The energy of all these degrees of freedom is quantized. Without
defining it more exactly for the time being, we can write the total
energy of a molecule $\varepsilon$ in the form of a sum of the energies of
translational and internal motion:

$$\varepsilon = \frac{p^2}{2m} + \varepsilon^{(i)}. \tag{40.3}$$

Accordingly, the weight of a state of given energy is also represented
as the product of two weights: one relates to translational motion
and is given by the formula (39.32), while the other we denote simply
by $g_{(i)}$ (we also agree to include in it the factor $2j + 1$):

$$dg = g_{(i)} \frac{V \, dp_x \, dp_y \, dp_z}{(2\pi\hbar)^3}. \tag{40.4}$$

Therefore, formula (40.2) can be written thus:

$$\frac{V e^{\frac{\mu}{\theta}}}{(2\pi\hbar)^3} \sum_i g_{(i)} e^{-\frac{\varepsilon^{(i)}}{\theta}} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-\frac{p_x^2 + p_y^2 + p_z^2}{2m\theta}} \, dp_x \, dp_y \, dp_z = N. \tag{40.5}$$

Expanding the translational-motion energy into $\frac{p_x^2}{2m} + \frac{p_y^2}{2m} + \frac{p_z^2}{2m}$,
we see that the momentum integral is represented as the product
of three integrals of the form

$$\int_{-\infty}^{\infty} e^{-\frac{p_x^2}{2m\theta}} \, dp_x.$$



These integrals are easily calculated from the second formula
of exercise 3, Sec. 39. Each of them is equal to $\sqrt{2\pi m \theta}$, so that
condition (40.5) reduces to the form

$$e^{-\frac{\mu}{\theta}} = \frac{V}{N} \left( \frac{m\theta}{2\pi\hbar^2} \right)^{3/2} \sum_i g_{(i)} e^{-\frac{\varepsilon^{(i)}}{\theta}}. \tag{40.6}$$

If the gas is monatomic then the quantities $\varepsilon^{(i)}$ refer to electron
excitations. Therefore, if $\varepsilon^{(1)} \gg \theta$, then, actually, only the zero term
appears in the summation over the states*. But since the energy
is measured from $\varepsilon^{(0)}$ as from zero, the whole summation, actually,
reduces only to the zero term $g_{(0)}$. It is of the order of unity. For
example, when the ground state has angular momentum 1/2, $g_{(0)} = 2$.
We then obtain the condition for the applicability of Boltzmann
statistics in the form

$$\frac{V}{N} \left( \frac{m\theta}{2\pi\hbar^2} \right)^{3/2} \gg 1. \tag{40.7}$$

For the inequality (40.7) to be satisfied, it is sufficient to satisfy
one of two conditions:

1) the density of the gas is very small, i.e., the volume occupied
by the gas at a given temperature $\theta$ is large;

2) the temperature $\theta$ for a given volume V is very high.

In the case when the gas is not monatomic, these conditions are
quantitatively changed somewhat because $\sum_i g_{(i)} e^{-\frac{\varepsilon^{(i)}}{\theta}}$ is also some
function of $\theta$. But qualitatively, the conditions of applicability of
Boltzmann statistics still hold.

Classical and quantum statistics. We have seen that for small
densities or high temperatures the quantum distribution laws for
a gas pass into the classical Boltzmann law. From now on we shall
agree to call the Bose and Fermi statistics quantum statistics and
the Boltzmann statistics classical, regardless of whether the energy
spectrum is discrete or continuous. Those statistics will be termed
quantum for which the indistinguishability of separate particles is
taken into account. In other words, a quantum definition of the state
of a system lies at the basis of quantum statistics: the number of
particles in all quantum states must be given. The classical definition
of the state of a system indicates which particles are found in the
given states. The Boltzmann formula (40.1) can be obtained from
this classical definition.

Maxwell distribution. In this section we will not be concerned with
the statistics of the internal motion of molecules, and will consider


* The relation between $\theta$ and temperature is given by formula (40.25).




only their translational motion. In accordance with (40.3), the energy
of the translational motion of molecules is separable from their
internal energy. Therefore, the Boltzmann distribution breaks up
into the product of two factors. We are not interested in the first
of the two factors, but the second, relating to translational motion,
is of the form

$$e^{-\frac{p^2}{2m\theta}}.$$


The weight of a state relating to a given absolute value p is obtained
by changing to polar coordinates in formula (39.32):

$$dg(p) = \frac{4\pi V p^2\,dp}{(2\pi h)^3} \tag{40.8}$$

[cf. (25.24)].

Thus, the distribution according to the kinetic energies of
translational motion is written in the form:

$$dn(p) = A\,e^{-\frac{p^2}{2m\theta}}\,p^2\,dp. \tag{40.9}$$

It is applicable both to monatomic and polyatomic gases if m is the
mass of a molecule as a whole.

The constant A is found from the normalization condition

$$A\int_0^\infty e^{-\frac{p^2}{2m\theta}}\,p^2\,dp = N. \tag{40.10}$$

The value of the integral was found in exercise 3, Sec. 39. From this
we obtain

$$A = \frac{4N}{\sqrt{\pi}\,(2m\theta)^{3/2}}. \tag{40.11}$$


In place of the momentum distribution of molecules, it is sometimes
useful to have their velocity distribution. For this it is sufficient
to substitute p = mv in the distribution (40.9):

$$dn(v) = N\,\frac{4}{\sqrt{\pi}}\left(\frac{m}{2\theta}\right)^{3/2}
e^{-\frac{mv^2}{2\theta}}\,v^2\,dv. \tag{40.12}$$


This distribution had already been deduced by Maxwell, before
Boltzmann, and for this reason it is called the Maxwell distribution.

In Fig. 46 we have plotted the ratio $dn(v)/dv$ on the ordinate. For
small v, this quantity is close to zero because of the factor $v^2$ in the
equation for the weight of a state; after the zero point it reaches a
maximum and exponentially decreases to zero again for large velocities.
We thus see that a gas contains molecules with every possible velocity
value.

The velocities of gas molecules. The greatest number of molecules
have a velocity corresponding to the maximum of the distribution
curve shown in Fig. 46. This maximum is determined from equation
(40.12). The corresponding velocity is termed the most probable;
it is

$$v_{m.p.} = \sqrt{\frac{2\theta}{m}}. \tag{40.13}$$


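We find the mean velocity by calculating the integral (we omit
the factor N, because the mean value of velocity relates to a single
molecule):

$$\bar v = \frac{4}{\sqrt{\pi}}\left(\frac{m}{2\theta}\right)^{3/2}
\int_0^\infty e^{-\frac{mv^2}{2\theta}}\,v^3\,dv = \sqrt{\frac{8\theta}{\pi m}}. \tag{40.14}$$

The mean square velocity is also interesting:

$$\overline{v^2} = \frac{4}{\sqrt{\pi}}\left(\frac{m}{2\theta}\right)^{3/2}
\int_0^\infty e^{-\frac{mv^2}{2\theta}}\,v^4\,dv = \frac{3\theta}{m} \tag{40.15}$$

(the result of exercise 3, Sec. 39 is used in the derivation).
The ratio $\sqrt{\overline{v^2}} : \bar v : v_{m.p.} = \sqrt{3} : \sqrt{8/\pi} : \sqrt{2}$.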
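
The three characteristic velocities can be checked numerically. The following is only a sketch: the function name, the midpoint integration rule, and the choice of units in which $\theta/m = 1$ are assumptions made for illustration, not part of the text.

```python
import math

def maxwell_velocities(theta_over_m, n_steps=200_000):
    """Return (v_mp, v_mean, v_rms) for the Maxwell distribution (40.12).

    theta_over_m is theta / m in units of velocity squared; the integrals
    are taken per molecule (the factor N is omitted) by the midpoint rule.
    """
    s = theta_over_m
    v_max = 12.0 * math.sqrt(s)          # the tail beyond this is negligible
    dv = v_max / n_steps
    norm = 4.0 / math.sqrt(math.pi) * (1.0 / (2.0 * s)) ** 1.5
    mean_v = 0.0
    mean_v2 = 0.0
    for i in range(n_steps):
        v = (i + 0.5) * dv
        weight = norm * math.exp(-v * v / (2.0 * s)) * v * v * dv
        mean_v += v * weight
        mean_v2 += v * v * weight
    v_mp = math.sqrt(2.0 * s)            # maximum of v^2 * exp(-v^2 / (2 s))
    return v_mp, mean_v, math.sqrt(mean_v2)

v_mp, v_mean, v_rms = maxwell_velocities(theta_over_m=1.0)
```

With $\theta/m = 1$ the three returned values should reproduce $\sqrt{2}$, $\sqrt{8/\pi}$ and $\sqrt{3}$ respectively.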


The mean energy of a single molecule is equal to

$$\bar\varepsilon = \frac{m\overline{v^2}}{2} = \frac{3}{2}\theta, \tag{40.16}$$

and the mean energy for the whole gas is N times greater:

$$\mathcal{E} = N\bar\varepsilon = \frac{3}{2}N\theta. \tag{40.17}$$

This result relates to the energy of translational motion of the
molecules. Numerical evaluations of velocity will be performed below.

The relationship between energy density and pressure. We shall
now derive a very important relationship between the density of
kinetic energy of a gas and its pressure. This relationship holds for any
statistics and depends only upon the form of the expression for energy
in terms of momentum.

The pressure of a gas is defined as the force with which the gas acts
upon unit area perpendicular to its direction. This force is equal to
the normal (to the surface) component of momentum transmitted by the




gas molecules in unit time. Let the direction of the normal to the
surface coincide with the x-axis. We first choose those molecules which
have a velocity component along the x-axis equal to $v_x$. They will
reach the surface of a volume in unit time if they initially were situated
in a layer of width $v_x$. Let us cut out a cylinder from this layer with
base of unit area and height equal to $v_x$. The volume of this cylinder
is $v_x$. If $dn(v_x)$ is now the number of molecules whose velocity
component normal to the surface is $v_x$, then the density of these molecules
is $\frac{dn(v_x)}{V}$. There are $\frac{dn(v_x)}{V}\,v_x$ such molecules in a cylinder of volume $v_x$.

Each of them, upon elastically colliding with a wall, will reverse its
normal velocity component, and the wall will receive a momentum

$$mv_x - (-mv_x) = 2mv_x. \tag{40.18}$$

Thus, all the gas molecules having a velocity $v_x$ transfer to the wall
in unit time a momentum

$$2mv_x\cdot\frac{dn(v_x)}{V}\cdot v_x = 2mv_x^2\,\frac{dn(v_x)}{V}. \tag{40.19}$$

In order to obtain the gas pressure on the wall we must integrate
(40.19) over all $v_x$ from 0 to $\infty$, and not from $-\infty$ to $\infty$, because
molecules moving away from the wall will not strike it. Thus, the
pressure of the gas on the wall is

$$p = \frac{2m}{V}\int_0^\infty v_x^2\,dn(v_x) = \frac{m}{V}\int_{-\infty}^{\infty} v_x^2\,dn(v_x). \tag{40.20}$$

On the other hand, the mean kinetic energy of the gas is

$$\mathcal{E} = \frac{m}{2}\int_{-\infty}^{\infty} v_x^2\,dn(v_x)
+ \frac{m}{2}\int_{-\infty}^{\infty} v_y^2\,dn(v_y)
+ \frac{m}{2}\int_{-\infty}^{\infty} v_z^2\,dn(v_z)
= \frac{3m}{2}\int_{-\infty}^{\infty} v_x^2\,dn(v_x), \tag{40.21}$$

because the mean values of the squares of all the velocity components
are identical.

Comparing now (40.20) and (40.21) we find that the gas pressure is
equal to two thirds of the density of its kinetic energy:

$$p = \frac{2}{3}\,\frac{\mathcal{E}}{V}. \tag{40.22}$$

This result was published by D. Bernoulli, as early as 1738, a century 
and a half before statistical physics began to develop as an independ¬ 
ent science. 




Only two assumptions have been used in the derivation of (40.22):
identical values of the three velocity projections are equiprobable
and the kinetic energy is equal to $\frac{mv^2}{2}$. The concrete form of the
distribution function is not essential.

The Clapeyron equation. If a gas is subject to Boltzmann statistics,
then, in accordance with (40.17), the mean kinetic energy $\mathcal{E}$ is equal
to $\frac{3}{2}N\theta$. Substituting this in (40.22) we obtain

$$pV = N\theta. \tag{40.23}$$

But from the definition of absolute temperature

$$pV = RT. \tag{40.24}$$

From this we obtain the relationship between "statistical"
temperature $\theta$, measured in ergs, and the temperature T, measured in
degrees Centigrade:

$$\theta = \frac{R}{N}\,T = 1.38\times10^{-16}\,T. \tag{40.25}$$

The ratio $k = \frac{R}{N}$ is called Boltzmann's constant. It is equal to
$1.38\times10^{-16}$. The temperature can also be measured in electron-volts,
one electron-volt being equal to $1.59\times10^{-12}$ erg. Translating ergs into
degrees with the aid of Boltzmann's constant, we find that 1 ev =
= 11,600°.
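
As a rough check of this conversion, the sketch below translates between degrees, ergs and electron-volts. The constant names and function names are hypothetical, and the numerical constants are present-day values assumed for illustration rather than the rounded figures quoted in the text.

```python
# Present-day CGS values (assumptions, not the book's rounded figures).
K_BOLTZMANN_ERG_PER_DEG = 1.380649e-16   # Boltzmann's constant k = R / N
ERG_PER_EV = 1.602176634e-12             # one electron-volt in ergs

def theta_from_degrees(t_degrees):
    """Statistical temperature theta (ergs) for absolute temperature T, eq. (40.25)."""
    return K_BOLTZMANN_ERG_PER_DEG * t_degrees

def degrees_from_ev(theta_ev):
    """Absolute temperature corresponding to theta given in electron-volts."""
    return theta_ev * ERG_PER_EV / K_BOLTZMANN_ERG_PER_DEG

degrees_per_ev = degrees_from_ev(1.0)
theta_room = theta_from_degrees(300.0)
```

With these constants one electron-volt comes out close to the 11,600° quoted above.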

As is known, the specific heat of an ideal monatomic gas is equal
to $\frac{3}{2}R$, thus corresponding to an energy $\frac{3}{2}RT$. Replacing RT by
$N\theta$, we find $\mathcal{E} = \frac{3}{2}N\theta$ in agreement with (40.17).

The relationship (40.25) allows us to calculate the mean velocity
of gas molecules without using the Avogadro number N. Indeed,

$$\bar v = \sqrt{\frac{8\theta}{\pi m}} = \sqrt{\frac{8kT}{\pi m}} = \sqrt{\frac{8RT}{\pi M}},$$

where M is the molecular weight of the gas. For example, the mean
velocity of hydrogen molecules at a temperature of 300° K is

$$\bar v = \sqrt{\frac{8\cdot 8.3\cdot10^{7}\cdot 300}{\pi\cdot 2}} \approx 1{,}800 \text{ m/sec}.$$

This value is comparable with the exit velocity of a gas into a
vacuum or with the velocity of sound [see (47.30)].
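
The hydrogen estimate is easy to reproduce. In this sketch the function name is an assumption; the gas constant is the rounded value $8.3\times10^{7}$ erg/(mole·deg) that appears in the text, and the result is converted from cm/sec to m/sec.

```python
import math

R_GAS = 8.3e7  # erg / (mole * degree), rounded value used in the text

def mean_velocity_cm_per_sec(t_degrees, molecular_weight):
    """Mean molecular velocity sqrt(8 R T / (pi M)) in cm/sec."""
    return math.sqrt(8.0 * R_GAS * t_degrees / (math.pi * molecular_weight))

# Hydrogen molecules (M = 2) at 300 degrees K, converted to m/sec.
v_hydrogen = mean_velocity_cm_per_sec(300.0, 2.0) / 100.0
# Oxygen (M = 32) for comparison: four times slower, since v ~ 1/sqrt(M).
v_oxygen = mean_velocity_cm_per_sec(300.0, 32.0) / 100.0
```

The hydrogen value comes out just under 1,800 m/sec, consistent with the rounded figure above.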

The thermonuclear reaction. When nuclei collide, reactions are
possible between them that proceed with the release of energy. For
example, in a deuteron-deuteron collision one of two reactions can occur
(besides elastic scattering):




$$D_1^2 + D_1^2 = \begin{cases} He_2^3 + n_0^1, \\ H_1^3 + H_1^1. \end{cases}$$

Here $H_1^3$ is tritium and $n_0^1$ a neutron. Another example is

$$Li_3^6 + D_1^2 = 2He_2^4.$$

In order that charged nuclei may be able effectively to collide, they
must overcome the potential barrier of Coulomb repulsion, which was
considered in Sec. 28. The dependence upon energy for the probability
of passing through the potential barrier is basically determined by the
barrier factor [see the first term on the right in (28.12)]:

$$e^{-\frac{2\pi Z_1Z_2e^2}{hv_\parallel}}. \tag{40.26}$$

Here, $Z_1e$ and $Z_2e$ are the charges of the colliding nuclei and $v_\parallel$ is the
relative velocity along their joining line [recall that (28.12) refers to
one-dimensional motion].

The reaction can be produced by accelerating the particles in a
discharge tube. But charged particles, striking a substance, mainly
spend their energy on ionization and excitation of the atoms. And
since, according to (40.26), the probability of the reaction at small
energy is vanishingly small, the majority of incident particles do not
cause a reaction. Of the total number of particles it turns out that
10 ~®—10'® are effective. Therefore, the energy yield of the reaction 
is considerably less than the total energy spent in accelerating the 
beam of particles. 

The situation is different if the substance used for the reaction is
at a very high temperature, of the order of $10^7$ degrees. At this
temperature, the nuclei of the heated substance already react at a
sufficiently high rate, and transmission of energy to electrons does not
occur because their mean energy is the same as that of the nuclei.

Let us calculate the rate of a nuclear reaction occurring under such
conditions. It is termed thermonuclear.

Let the effective cross-section for the reaction between nuclei with
relative velocity $v_\parallel$ be $\sigma(v_\parallel)$. We assume that different nuclei react:
we shall call them 1 and 2. Let us construct on each nucleus 2 a cylinder
with base area $\sigma(v_\parallel)$ and height numerically equal to $v_\parallel$. Then, by
definition of $\sigma(v_\parallel)$, all those nuclei 1 which occur in the volume of
these cylinders and which have velocity $v_\parallel$ relative to nuclei 2 will
be involved in the reaction in unit time.

The number of such events in unit volume and unit time is equal
to the product

$$n_1 n_2\,v_\parallel\,\sigma(v_\parallel)\,dq(v_\parallel), \tag{40.27}$$

where $n_1$ and $n_2$ are the numbers of nuclei 1 and 2 in unit volume, and
$dq(v_\parallel)$ is the probability that the relative velocity is equal to $v_\parallel$.




Indeed, a cylinder of volume $v_\parallel\sigma(v_\parallel)$ can be constructed on each
nucleus 2, and there will be $n_1 v_\parallel\sigma(v_\parallel)$ nuclei 1 in each cylinder. The
velocity distribution of the nuclei is taken into account by multiplying
by $dq(v_\parallel)$. If 1 and 2 are identical nuclei, then expression (40.27)
must be halved so that each reaction is not taken into account twice.
We indicate this by the factor (2) in the denominator of expression
(40.28).

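Let us now determine the probability factor $dq(v_\parallel)$. The
absolute-velocity distribution is given by the product of two Maxwellian
factors of the form

$$e^{-\frac{m_1v_1^2}{2\theta}}\,e^{-\frac{m_2v_2^2}{2\theta}}.$$

In the exponent of this expression is the sum of the energies of both
nuclei. In accordance with formula (3.17), it can be split into the
kinetic energy of the motion of the centre of mass of the nuclei and the
kinetic energy of their relative motion. Hence, in the product a factor
is separated that gives the relative-velocity distribution:

$$e^{-\frac{m_1v_1^2}{2\theta}}\,e^{-\frac{m_2v_2^2}{2\theta}}
= e^{-\frac{(m_1\mathbf{v}_1+m_2\mathbf{v}_2)^2}{2(m_1+m_2)\theta}}\,e^{-\frac{mv^2}{2\theta}}
= e^{-\frac{(m_1\mathbf{v}_1+m_2\mathbf{v}_2)^2}{2(m_1+m_2)\theta}}\,
e^{-\frac{mv_\parallel^2}{2\theta}}\,e^{-\frac{mv_\perp^2}{2\theta}},$$

where $m = \frac{m_1m_2}{m_1+m_2}$ [see (3.20)], $v^2 = v_\parallel^2 + v_\perp^2$.

Normalizing the distribution over $v_\parallel$ to unity and passing to the
reduced mass m, we obtain an expression for the probability that the value
of relative velocity along the line joining the nuclei will be $v_\parallel$:

$$dq(v_\parallel) = \sqrt{\frac{m}{2\pi\theta}}\,e^{-\frac{mv_\parallel^2}{2\theta}}\,dv_\parallel.$$

The barrier factor (40.26) depends upon $v_\parallel$.

Thus, the overall rate of a thermonuclear reaction is

$$\frac{n_1n_2}{(2)}\sqrt{\frac{m}{2\pi\theta}}\int_0^\infty
\sigma(v_\parallel)\,v_\parallel\,e^{-\frac{mv_\parallel^2}{2\theta}}\,dv_\parallel. \tag{40.28}$$

Taking into account the barrier factor, we write the dependence of
effective cross-section upon the velocity as

$$\sigma(v_\parallel) = \sigma_0(v_\parallel)\,e^{-\frac{2\pi Z_1Z_2e^2}{hv_\parallel}}.$$

The factor $\sigma_0$ here depends considerably less upon the velocity than
the exponential function.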




The integral in (40.28) now reduces to the form:

$$\int_0^\infty \sigma_0(v_\parallel)\,v_\parallel\,
e^{-\frac{2\pi Z_1Z_2e^2}{hv_\parallel}-\frac{mv_\parallel^2}{2\theta}}\,dv_\parallel. \tag{40.29}$$

It can be calculated to a good approximation in the case when the
temperature is so low that the greater part of the reaction proceeds
at the "tail" of the Maxwellian distribution, at a velocity greater than
the mean. Let us show how this calculation is done.

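We denote the argument of the exponential in the integrand thus:

$$f(v_\parallel) = \frac{2\pi Z_1Z_2e^2}{hv_\parallel} + \frac{mv_\parallel^2}{2\theta}
\equiv \frac{a}{v_\parallel} + \frac{b}{2}v_\parallel^2,\qquad
a = \frac{2\pi Z_1Z_2e^2}{h},\quad b = \frac{m}{\theta}.$$

We find the minimum of the function $f(v_\parallel)$ from the condition

$$\frac{df}{dv_\parallel} = -\frac{a}{v_\parallel^2} + bv_\parallel = 0;\qquad
v_\parallel^0 = \left(\frac{a}{b}\right)^{1/3}. \tag{40.30}$$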


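We shall see that the basic contribution to the integral is given by
values of $v_\parallel$ close to $v_\parallel^0$. Near the minimum, $f(v_\parallel)$ can be represented
in the form

$$f(v_\parallel) = f(v_\parallel^0) + \frac{1}{2}(v_\parallel - v_\parallel^0)^2\left(\frac{d^2f}{dv_\parallel^2}\right)_0
= \frac{3}{2}\sqrt[3]{a^2b} + \frac{3}{2}b\,(v_\parallel - v_\parallel^0)^2, \tag{40.31}$$

and the integral (40.29) is written as

$$\int_0^\infty \sigma_0(v_\parallel)\,v_\parallel\,
e^{-f(v_\parallel^0)-\frac{3}{2}b(v_\parallel - v_\parallel^0)^2}\,dv_\parallel. \tag{40.32}$$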


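The minimum of $f(v_\parallel)$ corresponds to the velocity $v_\parallel^0$ at which the
greatest number of reactions occur. The ratio of the velocity $v_\parallel^0$ to the
mean relative velocity $\overline{|v_\parallel|}$ is, according to (40.30),

$$\frac{v_\parallel^0}{\overline{|v_\parallel|}} = \sqrt{\frac{\pi}{2}}\,\sqrt[3]{a}\,\sqrt[6]{b}
\sim \left(\frac{Z_1Z_2e^2}{h}\right)^{1/3}\left(\frac{m}{\theta}\right)^{1/6}, \tag{40.33}$$

since

$$\overline{|v_\parallel|} = \sqrt{\frac{2\theta}{\pi m}} = \sqrt{\frac{2}{\pi b}}.$$

We shall call the temperature low if the ratio $\frac{v_\parallel^0}{\overline{|v_\parallel|}}$ is several times
greater than unity. Then the maximum of the integrand of (40.32)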




is very sharp at the point $v_\parallel = v_\parallel^0$, because it decreases e times when
$v_\parallel$ deviates from $v_\parallel^0$ by an amount $\sqrt{\frac{2}{3b}}$, which is considerably less
than $v_\parallel^0$.

It was therefore justifiable to terminate the expansion in (40.31)
with the second term. In addition, the quantities $\sigma_0(v_\parallel)$ and $v_\parallel$
can be taken outside the integral sign at $v_\parallel = v_\parallel^0$. The error in
both approximations is of the order $\frac{\overline{|v_\parallel|}}{v_\parallel^0}$. The integration can be taken
from $-\infty$ to $\infty$ because the integrand rapidly decreases as $v_\parallel$
recedes from $v_\parallel^0$, so that the error is exponentially small.

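Thus,

$$\int_0^\infty \sigma_0(v_\parallel)\,v_\parallel\,e^{-f(v_\parallel^0)-\frac{3}{2}b(v_\parallel - v_\parallel^0)^2}\,dv_\parallel
\approx \sigma_0(v_\parallel^0)\,v_\parallel^0\,e^{-f(v_\parallel^0)}\int_{-\infty}^{\infty}
e^{-\frac{3}{2}b(v_\parallel - v_\parallel^0)^2}\,dv_\parallel
= \sigma_0(v_\parallel^0)\,v_\parallel^0\,\sqrt{\frac{2\pi}{3b}}\;e^{-f(v_\parallel^0)}. \tag{40.34}$$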


Substituting the values a and b and using (40.28), we find the
expression for the rate of a thermonuclear reaction

$$\frac{n_1n_2}{(2)\sqrt{3}}\;\sigma_0(v_\parallel^0)\,v_\parallel^0\;
e^{-3\left(\frac{\pi^2Z_1^2Z_2^2e^4m}{2h^2\theta}\right)^{1/3}},\qquad
v_\parallel^0 = \left(\frac{2\pi Z_1Z_2e^2\theta}{hm}\right)^{1/3}. \tag{40.35}$$

The exponential factor depends very strongly upon the temperature.
For example, for a reaction in deuterium, this factor changes 3,600
times when the temperature is increased from 100 to 200 ev.
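
The strength of this temperature dependence can be estimated with a short sketch. It assumes the exponent has the form $f(v_\parallel^0) = \frac{3}{2}a^{2/3}b^{1/3}$ with $a = 2\pi Z_1Z_2e^2/h$ and $b = m/\theta$, as follows from minimizing $a/v + bv^2/2$; the function name and the present-day constants are assumptions, so the result agrees with the quoted factor of 3,600 only in order of magnitude.

```python
import math

# Present-day CGS constants (assumptions made for this illustration).
E2_OVER_HBAR = 2.99792458e10 / 137.035999   # e^2 / hbar, cm/sec
ERG_PER_EV = 1.602176634e-12
DEUTERON_MASS = 3.3436e-24                  # grams

def barrier_minimum(z1, z2, reduced_mass, theta_erg):
    """Return (v0, f0): position and value of the minimum of a/v + b*v**2/2."""
    a = 2.0 * math.pi * z1 * z2 * E2_OVER_HBAR
    b = reduced_mass / theta_erg
    v0 = (a / b) ** (1.0 / 3.0)
    f0 = 1.5 * a ** (2.0 / 3.0) * b ** (1.0 / 3.0)
    return v0, f0

m_reduced = DEUTERON_MASS / 2.0             # two identical deuterons
v0_100, f_100 = barrier_minimum(1, 1, m_reduced, 100.0 * ERG_PER_EV)
v0_200, f_200 = barrier_minimum(1, 1, m_reduced, 200.0 * ERG_PER_EV)

# Growth of the exponential factor exp(-f0) between 100 ev and 200 ev.
growth = math.exp(f_100 - f_200)
```

With these constants the growth factor is a few thousand; the exact figure is sensitive to the values adopted for the constants.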

Thermonuclear reactions are the source of stellar energy and 
for this reason play as important a part in our life as chemical 
reactions! 

Ideal gas in an external field. We shall now consider an ideal gas 
acted upon by an external field with potential U. The potential ener¬ 
gy can depend both upon the position of the molecules in space as well 
as their orientation (if the gas is not monatomic). 

The total energy of a molecule is 

$$\varepsilon = \frac{p^2}{2m} + \varepsilon^{(i)} + U. \tag{40.36}$$

If U depends upon the position of a molecule in space, i.e.,
U = U(x, y, z), then we must pass from a finite volume V, in the weight
factor (40.4), to an infinitely small volume dV = dx dy dz. Then part of
the distribution function that depends upon coordinates x, y, z can be




separated, and a formula is obtained defining the dependence of gas
density upon coordinates:

$$dn(x, y, z) = n_0\,e^{-\frac{U}{\theta}}\,dx\,dy\,dz. \tag{40.37}$$

Here, the potential energy calibration is U(0, 0, 0) = 0, and the gas
density at this point is equal to $n_0$. For example, in a gravitational
field, U = mgz, so that

$$dn(z) = n_0\,e^{-\frac{mgz}{\theta}}\,dz. \tag{40.38}$$

It should be noted that in the earth's atmosphere the "barometric"
formula (40.38) is applicable only qualitatively, because air
temperature is not constant with height.

In addition, the “barometric” formula indicates that the composition 
of the air must vary with height as a result of the different molecular 
weights of nitrogen, oxygen, and other gases. Actually, the air
composition is almost uniform vertically because of vigorous mixing
processes.

The nonequilibrium state of planetary atmospheres. In place of the
approximate expression for the potential energy in a gravitational
field, let us substitute its exact expression (3.4). Let us first of all
express the constant a in formula (3.4) in terms of more convenient
quantities. The force of gravity at the earth's surface is $-mg$ and,
from the general gravitational law, it is equal to $-\frac{a}{r_0^2}$, where $r_0$ is the
radius of the earth. From this $a = mgr_0^2$, so that $U = -\frac{mgr_0^2}{r}$. Therefore,
the gas density must vary with height according to the law

$$n = n_\infty\,e^{\frac{mgr_0^2}{r\theta}}. \tag{40.39}$$

This quantity remains finite even at an infinite distance from the
earth, and since the exponential is equal to unity at infinity we have
called the proportionality factor $n_\infty$.

Near the earth, where $r = r_0$, the density is greater than $n_\infty$ by as
many times as the quantity

$$e^{\frac{mgr_0}{\theta}} = e^{\frac{Mgr_0}{RT}}$$

is greater than unity.

The radius of the earth $r_0 \approx 6.4\times10^{8}$ cm, $g \approx 10^{3}$ cm/sec$^2$. From
this we obtain for oxygen

$$\frac{Mgr_0}{RT} = \frac{32\cdot10^{3}\cdot6.4\cdot10^{8}}{8.3\cdot10^{7}\cdot300} \approx 800.$$




In actual fact the density of the earth's atmosphere at infinity is
equal to zero. Therefore, it follows from formula (40.39) that the
atmosphere cannot arrive at the most probable state when in the
earth's gravitational field, and is gradually dispersed into space.
The most probable state of a gas is called statistical equilibrium
(see Sec. 45). The equilibrium density of the atmosphere at infinity
is $e^{800}$ times less than at the earth's surface. Therefore the present
state of the atmosphere is very close to equilibrium. For the moon,
equilibrium has been reached: its atmosphere has completely
escaped!

A kinetic interpretation of the dispersion of planetary atmospheres.
It is easy to understand the reason for the recession of gases to infinity.
Any particle whose velocity exceeds 11.5 km/sec is capable of
overcoming the earth's attraction: its motion is infinite. In accordance
with the Maxwell distribution (40.12) a gas will always have molecules
with every possible velocity. In literal notation, the velocity of
molecules capable of going to infinity is defined by the equation

$$\frac{mv^2}{2} = mgr_0. \tag{40.40}$$

Taking $v^2$ from this equation and substituting into the Maxwell
distribution, we once again obtain the exponential $e^{-\frac{mgr_0}{\theta}}$ for the
fraction of molecules capable of leaving the atmosphere. It is easy
to estimate the number of such molecules in the atmosphere at any
instant of time. The earth's surface is $5\times10^{18}$ cm$^2$. There is about
1,030 gm of air above every square centimetre, i.e., about 35 moles.
Hence, the total number of molecules in the atmosphere is
$5\times10^{18}\times35\times6\times10^{23} = 10^{44}$, and the fraction of molecules of
velocity greater than 11.5 km/sec is $e^{-800} = 10^{-347}$. Therefore the mean
number of molecules capable of leaving the earth at each instant is only
$10^{-303}$. Of course, those molecules close to the earth's surface will not
be able to "carry" their energy to the upper layers of the atmosphere
because of collisions with other molecules.
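
The orders of magnitude in this estimate can be reproduced as follows. This is a sketch under the rounded input values quoted in the text; the function and variable names are hypothetical. Logarithms are used throughout because an exponential as small as this underflows ordinary floating-point arithmetic, and the final powers of ten shift noticeably with the rounding of the exponent.

```python
import math

R_GAS = 8.3e7          # erg / (mole * degree), rounded
G_ACCEL = 1.0e3        # cm / sec^2, rounded
EARTH_RADIUS = 6.4e8   # cm

def escape_exponent(molecular_weight, t_degrees):
    """Dimensionless exponent M g r0 / (R T) from (40.39) and (40.40)."""
    return molecular_weight * G_ACCEL * EARTH_RADIUS / (R_GAS * t_degrees)

exponent = escape_exponent(32.0, 300.0)              # oxygen at 300 degrees K
log10_fraction = -exponent / math.log(10.0)          # log10 of exp(-exponent)

total_molecules = 5e18 * 35 * 6e23                   # area * moles per cm^2 * Avogadro
log10_escaping = math.log10(total_molecules) + log10_fraction
```

Without rounding the exponent it comes out slightly above 800, so the logarithms differ by about ten units from those obtained with a rounded exponent; the conclusion that the escaping fraction is negligible is unaffected.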

The dielectric constant of a gas. We shall now consider a gas whose
molecules have a constant dipole moment in a constant and uniform
electric field. Those molecules can have characteristic dipole moments
for which there is some preferred direction: NO, CO, H₂O (along the
altitude of the triangle passing through O), NH₃ (along the axis of
symmetry of a three-sided pyramid). The more symmetrical
molecules do not possess moments: H₂, O₂, CH₄ (tetrahedron), CO₂ (this
proves that the CO₂ molecule has the form of a rod with the C atom
in the centre).

Rotational motion is quantized. In the next section it will be shown 
that for all gases except hydrogen, at a temperature of several tens of 
degrees (from absolute zero), the states with large quantum numbers 




are already excited. In these states the motion may be regarded as
classical. Then the total rotational energy of a molecule simply breaks
down into the kinetic energy of rotation (see Sec. 9) and potential
energy, which depends upon the orientation of the dipole moment
relative to the external electric field:

$$U = -(\mathbf{d}\mathbf{E}) = -dE\cos\vartheta$$

[see (14.28)]. In classical motion, the potential and kinetic energies
may be regarded as quantities instead of operators. Therefore, in the
Boltzmann distribution the factor that depends only upon potential
energy is split off:

$$dn(\vartheta) = A\,e^{\frac{dE\cos\vartheta}{\theta}}\sin\vartheta\,d\vartheta. \tag{40.41}$$

Here, $\sin\vartheta\,d\vartheta$ is proportional to the element of solid angle in which the
vector d lies [cf. (6.15)].

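Let us now determine the electric polarization of a gas in an external
field. For this we must calculate the mean projection of the dipole
moment upon the electric field, i.e.,

$$\overline{d\cos\vartheta} = d\,\frac{\int_0^\pi e^{\frac{dE\cos\vartheta}{\theta}}\cos\vartheta\sin\vartheta\,d\vartheta}
{\int_0^\pi e^{\frac{dE\cos\vartheta}{\theta}}\sin\vartheta\,d\vartheta}. \tag{40.42}$$

It turns out that it is sufficient merely to find an expression for
the integral in the denominator, because equation (40.42) can be
rewritten thus:

$$\overline{d\cos\vartheta} = \theta\,\frac{\partial}{\partial E}\ln\int_0^\pi
e^{\frac{dE\cos\vartheta}{\theta}}\sin\vartheta\,d\vartheta. \tag{40.43}$$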


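Indeed, differentiating the integral with respect to the parameter E,
we revert to (40.42). The integral can be calculated using the fact that
$\sin\vartheta\,d\vartheta = -d(\cos\vartheta)$:

$$\int_0^\pi e^{\frac{dE\cos\vartheta}{\theta}}\sin\vartheta\,d\vartheta
= \int_{-1}^{1} e^{\frac{dE}{\theta}x}\,dx
= \frac{2\theta}{Ed}\sinh\frac{Ed}{\theta}. \tag{40.44}$$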


The integral in formula (40.43) is called a statistical integral. For 
quantized energy levels, it is replaced in the general case by a statisti¬ 
cal sum. The expression for the summation and integral will be met 
with many times again. It is very convenient in calculating mean 
values. 




Substituting (40.44) in (40.43), we obtain an expression for the mean
projection of the dipole moment on the electric field

$$\overline{d\cos\vartheta} = d\left(\coth\frac{Ed}{\theta} - \frac{\theta}{Ed}\right). \tag{40.45}$$

This expression was obtained by Langevin. Let us investigate the
right-hand side in two limiting cases: $E \ll \frac{\theta}{d}$ (weak field) and $E \gg \frac{\theta}{d}$
(strong field).

If the field is weak then we can use the expansion of coth x in terms
of x:

$$\coth x = \frac{1}{x} + \frac{x}{3},$$

whence

$$\overline{d\cos\vartheta} = \frac{d^2E}{3\theta}. \tag{40.46}$$

The polarization of the gas is

$$P = N\,\overline{d\cos\vartheta} = \frac{Nd^2E}{3\theta}, \tag{40.47}$$

and the dielectric constant is calculated from the definitions of
induction (16.23) and (16.29):

$$D = E + 4\pi P = E\left(1 + \frac{4\pi Nd^2}{3\theta}\right) = \varepsilon E. \tag{40.48}$$

In a strong field $\coth\frac{Ed}{\theta}$ tends to unity and $\frac{\theta}{Ed}$ tends to zero.
Therefore, $\overline{\cos\vartheta}$ tends to unity. This means that all the dipoles are
orientated along the external field and saturation sets in. Then
$D = E + 4\pi Nd$. We notice that, for T = 300°K, this case would
correspond to a field $E > 10^7$ v/cm, which is considerably greater than
the breakdown potential.
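
Both limits of the Langevin expression can be verified numerically. The sketch below is illustrative only: the function names are assumptions, x stands for the dimensionless ratio $Ed/\theta$, and the mean cosine is also evaluated by direct integration over $u = \cos\vartheta$ as an independent check.

```python
import math

def langevin(x):
    """Mean cosine coth(x) - 1/x from (40.45), with x = E*d/theta."""
    if abs(x) < 1e-4:
        # Series expansion; avoids cancellation between coth(x) and 1/x.
        return x / 3.0 - x ** 3 / 45.0
    return 1.0 / math.tanh(x) - 1.0 / x

def mean_cosine_numeric(x, n_steps=20_000):
    """Ratio of the integrals in (40.42) over u = cos(angle), by the midpoint rule."""
    du = 2.0 / n_steps
    numerator = 0.0
    denominator = 0.0
    for i in range(n_steps):
        u = -1.0 + (i + 0.5) * du
        weight = math.exp(x * u)
        numerator += u * weight
        denominator += weight
    return numerator / denominator

weak = langevin(0.01)      # close to x / 3, the weak-field result (40.46)
strong = langevin(50.0)    # close to unity: saturation
check = mean_cosine_numeric(2.0)
```

The weak-field value follows the linear law x/3, while for large x the mean cosine approaches unity as 1 - 1/x.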

Paramagnetism of gases. We shall now find how the magnetic
permeability $\chi$ is calculated. Here we must take into account the fact that
the magnetic moment is related to the mechanical moment of electrons, 
and the latter is a quantized quantity, i.e., it takes on a discrete 
series of values. Usually an electronic mechanical moment does not 
have a value greater than several units, so that the limiting transition 
to classical theory cannot be performed. An atom can also have a 
magnetic moment (as opposed to an electric dipole moment). There¬ 
fore, let us determine the magnetic susceptibility arising from the 
orientation of atomic magnetic moments in an external field H. 

Let us suppose that an atom in the ground state possesses an orbi¬ 
tal angular momentum L, a spin angular momentum S and a total 
angular momentum J. In other words, the ground state is a multiplet 
state. Let the multiplet splitting (fine structure) be defined by the 




energy interval $\Delta$, so that the level with the closest value $J \pm 1$ differs
from the ground level by the quantity $\Delta$. If the energy of the ground
level is $\varepsilon_0$, then the closest level has an energy $\varepsilon_0 + \Delta$. The ratio of the
number of atoms in the ground state to the number of atoms in the
closest state, belonging to the given multiplet, is, according to (40.1),

$$\frac{2J+1}{2(J\pm1)+1}\cdot\frac{e^{-\frac{\varepsilon_0}{\theta}}}{e^{-\frac{\varepsilon_0+\Delta}{\theta}}}
= \frac{2J+1}{2(J\pm1)+1}\,e^{\frac{\Delta}{\theta}}. \tag{40.49}$$

Thus, if the multiplet splitting $\Delta$ is considerably greater than $\theta$,
the majority of the atoms are in the ground state. If they are placed
in an external magnetic field then each of the multiplet levels is
split into 2J + 1 levels, corresponding to its value of J. Suppose that
the field corresponds to the anomalous Zeeman effect in the sense
that the splitting of each multiplet level in the magnetic field is
considerably less than the fine-structure splitting $\Delta$ as defined in Sec. 35.

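Then, from (35.11), the energy of an atom in the ground state is

$$\varepsilon = \varepsilon_0 + \mu_0 g_L H J_z, \tag{40.50}$$

where $g_L$ is the Lande factor [see (36.12)] and $\mu_0$ is the Bohr magneton.

The number of such atoms is given by the Boltzmann distribution

$$n(J_z) = A\,e^{-\frac{\mu_0g_LHJ_z}{\theta}}. \tag{40.51}$$

We must again determine the mean value of the magnetic moment
projection on the field:

$$-\overline{\mu_z} = \mu_0g_L\,\frac{\sum_{J_z=-J}^{J} J_z\,e^{-\frac{\mu_0g_LHJ_z}{\theta}}}
{\sum_{J_z=-J}^{J} e^{-\frac{\mu_0g_LHJ_z}{\theta}}}
= -\theta\,\frac{\partial}{\partial H}\ln\sum_{J_z=-J}^{J} e^{-\frac{\mu_0g_LHJ_z}{\theta}}. \tag{40.52}$$

We have put the minus sign on the left because the electronic charge
is negative. Formula (40.52) involves a statistical sum. The summation
is performed only over those levels which are obtained when the ground
state of the multiplet is split in the magnetic field, since the number of
atoms in an excited state is small.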

Summing the geometric progression, it is not difficult here to obtain 
a general formula similar to the Langevin formula (40.45). But we 
shall confine ourselves to the case of a weak field, when the exponen¬ 
tial function is expanded in a series. The expansion must be taken 




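up to the second-order term inclusive because the sum of the terms which
are linear in $J_z$ is equal to zero:

$$\sum_{J_z=-J}^{J}\left[1 - \frac{\mu_0g_LHJ_z}{\theta}
+ \frac{1}{2}\left(\frac{\mu_0g_LHJ_z}{\theta}\right)^2\right]
= 2J + 1 + \frac{(\mu_0g_LH)^2}{2\theta^2}\sum_{J_z=-J}^{J} J_z^2.$$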


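We calculated the sum of $J_z^2$ in Sec. 30 [see (30.27)]. Using the value
for the sum then obtained, we write the required mean moment
thus:

$$-\overline{\mu_z} = -\theta\,\frac{\partial}{\partial H}\ln\left[2J + 1
+ \frac{(\mu_0g_LH)^2}{6\theta^2}\,J(J+1)(2J+1)\right]
= -\frac{1}{3}\,\frac{\mu_0^2g_L^2HJ(J+1)}{\theta}, \tag{40.53}$$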


where we have once again neglected terms of higher order in H. 

Formula (40.53) is completely analogous to (40.46) for the electric
moment of dipole molecules produced by a field. The characteristic
magnetic moment is represented by the quantity $\mu_0g_L\sqrt{J(J+1)}$,
so that the Lande factor $g_L$ takes into account the spin magnetic
anomaly. Thus, magnetic susceptibility can be calculated from data
obtained from spectroscopic observations.
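
The weak-field result can be compared with the exact statistical sum over the 2J + 1 magnetic sublevels. In the sketch below the function names are assumptions; x denotes the dimensionless ratio $\mu_0g_LH/\theta$, the moment is expressed in units of $\mu_0g_L$, and the weak-field law is taken to be $xJ(J+1)/3$, as obtained by expanding the sum to second order in x.

```python
import math

def mean_moment_exact(j, x):
    """Mean of -J_z with Boltzmann weights exp(-x * J_z), in units of mu0 * g_L."""
    n_levels = int(round(2 * j)) + 1
    jz_values = [-j + k for k in range(n_levels)]
    weights = [math.exp(-x * jz) for jz in jz_values]
    statistical_sum = sum(weights)
    return -sum(jz * w for jz, w in zip(jz_values, weights)) / statistical_sum

def mean_moment_weak_field(j, x):
    """Second-order expansion of the statistical sum: x * J * (J + 1) / 3."""
    return x * j * (j + 1.0) / 3.0

j = 1.5
exact_weak = mean_moment_exact(j, 0.01)
approx_weak = mean_moment_weak_field(j, 0.01)
exact_strong = mean_moment_exact(j, 50.0)   # saturation: tends to J
```

For small x the two values agree up to terms of order $x^3$, while in a strong field the mean moment saturates at J (in the same units).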

Paramagnetism of rare earths. There are almost no elements for
which we can completely verify formula (40.53) as applied to the
gaseous state. But in rare earths the moment of the electronic cloud
is due to the 4f-shell, which occurs, as mentioned in Sec. 33, deep
inside the atom. When such an atom is part of a crystal lattice, the
4f-shell is but slightly subjected to the action of the electric field
of the neighbouring atoms so that its state may be regarded as being
almost the same as for a free atom of a rare-earth element. Therefore,
(40.53) is applicable to those chemical compounds of rare-earth
elements where other elements do not possess a characteristic magnetic
moment. Its agreement with experiment is very satisfactory for almost
all the elements of the rare-earth group.


Exercises 

1) Find the mean relative velocity of two molecules of different gases
occurring in a mixture.

The relative velocity distribution is given by a formula similar to the $v_\parallel$
distribution but written for all three velocity components. This formula is
similar to (40.12), but it involves the reduced mass $m = \frac{m_1m_2}{m_1+m_2}$ instead
of the mass of a single molecule. Hence, like (40.14), the mean relative velocity
turns out equal to




$$\bar v = \sqrt{\frac{8\theta}{\pi m}} = \sqrt{\frac{8\theta\,(m_1+m_2)}{\pi m_1m_2}}.$$

If the molecules are identical, their mean relative velocity is $\sqrt{2}$ times the
mean absolute velocity.

2) Calculate the rate of a bimolecular reaction if the effective
cross-section depends upon the velocity component (along the line joining
the nuclei) in the following way:

$$\sigma(v_\parallel) = \begin{cases} 0, & v_\parallel < \sqrt{\dfrac{2A}{m}}, \\[2mm]
\sigma_0, & v_\parallel > \sqrt{\dfrac{2A}{m}}. \end{cases}$$

Then, from the general formula (40.28), we find

$$\frac{n_1n_2}{(2)}\sqrt{\frac{m}{2\pi\theta}}\;\sigma_0\int_{\sqrt{2A/m}}^{\infty}
v_\parallel\,e^{-\frac{mv_\parallel^2}{2\theta}}\,dv_\parallel
= \frac{n_1n_2}{(2)}\;\sigma_0\sqrt{\frac{\theta}{2\pi m}}\;e^{-\frac{A}{\theta}}.$$

The decisive quantity in this result is the exponential factor $e^{-\frac{A}{\theta}}$. The
quantity A is called the activation energy. It is equal to the height of the
potential barrier over which the colliding particles must pass in order that the
reaction may occur. Unlike a thermonuclear reaction, it is assumed here that the
motion of the reacting particles is classical. Transitions below the barrier make
a vanishingly small contribution in chemical reactions.


Sec. 41. Boltzmann Statistics
(Vibrational and Rotational Molecular Motion) 

Molecular energy levels. In order to apply statistics to gases consist¬ 
ing of molecules, we must classify the energy levels of the molecules. 
The fact that a nucleus is considerably heavier than an electron, and, 
therefore, moves much slower, is very helpful here. We have used this 
in Sec. 33, when considering the question of the binding energy of 
two hydrogen atoms in a hydrogen molecule. The eigenfunction 
can be found for any relative positions of the nuclei. In a diatomic 
molecule the position of the nuclei is defined by a single parameter— 
the distance between them. The energy eigenvalue of the electrons 
depends upon this distance. Adding the energy of Coulomb repulsion of 
the nuclei to the electron energy, we obtain, for a given electron wave- 
function, the energy of the molecule as a function of the distance be¬ 
tween the nuclei. For example, in a hydrogen molecule, the curves for 
this relationship are of different form in the case of parallel and anti- 
parallel spin orientations (Fig. 47). The lower curve refers to the state 
with a symmetric spatial wave function and antiparallel spins, while 
the upper curve relates to the states with an antisymmetric spatial 




function and parallel spins. The lower curve has a minimum at $r = r_e$
so that hydrogen atoms may form a molecule only in a definite elec¬ 
tron state. 

In the general case, several different electronic states can have a 
minimum. The distances between corresponding potential curves
are defined from a wave equation of the type (33.23). In this equation 
we can neglect terms involving the masses of the nuclei in the denomi¬ 
nator. Therefore, the energy scale separating different electronic states 
of the molecules is the same as for an atom, i.e., from one to ten elec¬ 
tron-volts. 

Close to the minimum of potential energy, nuclei may perform small
oscillations. To a first approximation, these oscillations are harmonic
so that their energy is given by the general formula (26.21):

$$\varepsilon_v = h\omega\left(v + \frac{1}{2}\right). \tag{41.1}$$

Here, v is called the vibrational quantum number of the molecule.
This number is, naturally, integral. Fig. 47 shows a more general
dependence of energy upon v, taking into account that the potential
energy curve is not a parabola as in Fig. 41. However, practically, the
deviations from formula (41.1) affect but little the statistical
quantities, because dissociation occurs when oscillations with
large v are excited (see Sec. 51).

[Fig. 47]

The frequency $\omega$ depends upon the electronic state in
which nuclear oscillations occur. In accordance with the
general formulae for frequency (7.10)-(7.12), the frequency
is inversely proportional to the square root of the reduced
mass of the nuclei. Therefore, the vibrational quantum is
considerably less than the distance between electronic levels,
which is independent of the nuclear mass. It is of the order
of tenths of an electron-volt.


In addition to vibrational motion, a molecule with two atoms may
also perform rotational motion. Rotation is most simply taken into
account when the resultant spin of the electrons is equal to zero and
the total orbital angular-momentum projection of the electrons on
a line joining the nuclei is also equal to zero. These conditions are
satisfied in the electronic ground state for nearly all molecules, with the
exception of $\mathrm{O_2}$, the resultant spin of which is equal to unity (but the
projection of the electronic moment on the axis is zero), and NO,
where the spin is one half (and the orbital angular-momentum
projection of the electrons on the axis is equal to unity). Disregarding these
exceptions, we may consider a molecule of two atoms as a solid rotator,
i.e., a system of two point masses at a fixed distance $r_e$ corresponding
to the minimum of the lower potential curve in Fig. 47 (see exercise 2,
Sec. 30; our case corresponds to $J_3 = 0$ and $k = 0$, so that the closest
excited level in $k$, with $k = 1$, is moved to infinity. The rotational
moment of the rotator is perpendicular to the line joining the nuclei
since its projection on this line is equal to zero).

As we know from Sec. 5 [see (5.6)] the rotational energy of two
particles is

$$\varepsilon_r = \frac{M^2}{2 m r_e^2},$$

where $m$ is the reduced mass and $m r_e^2$ is equal to the moment of inertia
of the rotator. Going over to the quantum formula, we substitute
the angular-momentum eigenvalue. It is usual to denote it by the
letter $K$, so that

$$\varepsilon_r = \frac{h^2 K (K + 1)}{2 m r_e^2}. \tag{41.2}$$


This formula corresponds to the energy of a symmetric top with
$k = 0$ (cf. exercise 2, Sec. 30). It involves the mass of the nuclei in the
denominator. Therefore, the distances between neighbouring
rotational levels are of the order of a thousandth of an electron-volt and less.

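Thus, to a good approximation, the total energy of a two-atom
molecule can be written in the form of a sum with three terms:

$$\varepsilon = \varepsilon_e + h\omega\left(v + \tfrac{1}{2}\right) + \frac{h^2 K (K + 1)}{2 m r_e^2}, \tag{41.3}$$

where $\varepsilon_e \sim m^0$ (i.e., it is independent of the nuclear mass $m$),
$\omega \sim m^{-1/2}$, and the rotational term is proportional to $1/m$.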

The excitation of electronic levels. If we substitute the expression
(41.3) in the Boltzmann distribution, the latter separates into the
product of three distributions involving electronic, rotational, and
vibrational states. Let us suppose that a gas is considered with temperature
not exceeding several thousand degrees, for example, 2,000-3,000°.
Then if the energy of electronic excitation is several electron-volts
(1 ev = 11,600°, since the temperature can be defined in energy units),
the fraction of molecules in excited electronic states is a very small
number: $e^{-\varepsilon_e/\theta}$. In those cases when there are very low electron levels,
the Boltzmann factor may also be other than a small quantity. But,
as a rule, dissociation of the molecules sets in earlier than excitation
of their electronic levels (see Sec. 51).

Excitation of vibrational levels. Let us examine the vibrational states.
For generality we may consider not only molecules with two atoms,
but also polyatomic molecules. If the oscillations of such molecules
are harmonic, we can make the transition to normal coordinates, as
was shown in Sec. 7. Then the vibrational energy assumes the form of
a sum of the energies of independent harmonic oscillators. The energy
levels for each such harmonic oscillator are given by a formula of the
form (41.1) with a frequency $\omega$ corresponding to a given normal
oscillation.

Molecular oscillations are basically divided into two types: “valent,”
in which the distances between neighbouring nuclei mutually change,
and “deformational,” where only the angles between the “valence
directions” change. For example, in a $\mathrm{CO_2}$ molecule, having a
straight-line equilibrium form O = C = O, valent oscillations alter the distance
between the carbon nucleus and the oxygen nuclei, while
deformational oscillations move the C nucleus out of the straight-line
configuration. The frequencies of deformational vibrations are several times
less than those of valent oscillations. The estimate $h\omega \sim 0.1$ ev
relates to valent oscillations.

In any case, if the vibrational energy breaks up into the sum of 
energies of separate independent oscillations, then the distribution 
function also splits into the product of distribution functions for each 
separate oscillation. 

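Let us calculate the mean energy for one normal oscillation:

$$\bar\varepsilon_\omega = \frac{\sum_{v=0}^{\infty} h\omega\left(v + \tfrac{1}{2}\right) e^{-h\omega(v + 1/2)/\theta}}{\sum_{v=0}^{\infty} e^{-h\omega(v + 1/2)/\theta}} = \theta^2 \frac{\partial}{\partial\theta} \ln \sum_{v=0}^{\infty} e^{-h\omega(v + 1/2)/\theta}; \tag{41.4}$$

here we have used the same transformation as in the derivation of
(40.43) and (40.62). Formula (41.4) involves the statistical sum for a
harmonic oscillator. The sum of the geometric progression inside the
logarithm sign is very easily calculated. Indeed,

$$\sum_{v=0}^{\infty} e^{-h\omega(v + 1/2)/\theta} = e^{-h\omega/2\theta} \sum_{v=0}^{\infty} e^{-h\omega v/\theta} = \frac{e^{-h\omega/2\theta}}{1 - e^{-h\omega/\theta}}. \tag{41.5}$$

Substituting this in (41.4) and differentiating, we get

$$\bar\varepsilon_\omega = \frac{h\omega}{2} + \frac{h\omega}{e^{h\omega/\theta} - 1}. \tag{41.6}$$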


The first term in (41.6) simply denotes the zero energy of an
oscillation of given frequency. The oscillation possesses this energy at
absolute zero because then the second term in (41.6) does not contribute
anything. The second term has a very simple meaning. If we write
the mean energy in terms of the mean vibrational quantum number $\bar v$,

$$\bar\varepsilon_\omega = \frac{h\omega}{2} + h\omega\,\bar v, \tag{41.7}$$

then it is obvious that

$$\bar v = \frac{1}{e^{h\omega/\theta} - 1}. \tag{41.8}$$

For this reason, the factor $(e^{h\omega/\theta} - 1)^{-1}$ signifies the mean number of
quanta possessed by a vibration at a temperature $\theta = kT$. At a low
temperature, $\bar v$ is close to zero. For example, for oxygen and nitrogen,
$h\omega$ is about 0.2 ev, or 2,000-3,000°. Therefore, at room temperature
oxygen and nitrogen occur in the ground vibrational state. In the
case of hydrogen the reduced mass of a molecule is 14 times less than
that of nitrogen. Its vibrational quantum is close to 6,000°. In
polyatomic molecules, where deformational vibrations occur, such
oscillations can be excited at temperatures of the order of 300-600°.
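
The estimate above can be checked with a few lines of arithmetic. The following is only a sketch: the function name `mean_quanta` is an illustrative choice, the sample value $h\omega = 0.2$ ev is the rough figure quoted in the text for oxygen and nitrogen, and the conversion 1 ev = 11,600° is the one given earlier in this section.

```python
import math

EV_IN_DEGREES = 11600.0  # 1 ev expressed in degrees, as quoted in the text


def mean_quanta(h_omega_ev, temperature_deg):
    """Mean vibrational quantum number from (41.8): 1 / (exp(h*omega/theta) - 1)."""
    x = h_omega_ev * EV_IN_DEGREES / temperature_deg  # the ratio h*omega / theta
    return 1.0 / math.expm1(x)


if __name__ == "__main__":
    # Illustrative temperatures only; 0.2 ev is the rough value from the text.
    for t in (300.0, 1000.0, 3000.0):
        print(t, mean_quanta(0.2, t))
```

At room temperature the result is far below unity, which is the sense in which the molecules "occur in the ground vibrational state"; only at a few thousand degrees does the mean number of quanta become comparable with one.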

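Vibrational energy at high temperatures. If the temperature is very
high compared with $h\omega$, then $e^{h\omega/\theta}$ can be replaced by the expansion
$1 + h\omega/\theta$. Substituting this in (41.6), we obtain

$$\bar\varepsilon_\omega = \frac{h\omega}{2} + \theta. \tag{41.9}$$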


The first term does not relate to thermal excitation. Besides, it is
considerably less than $\theta$. Thus it turns out that at a sufficiently high
temperature the mean energy per oscillation is equal to $\theta$, irrespective
of the frequency. The same can be obtained by proceeding from the
nonquantized expression for the energy of a harmonic oscillator:

$$\varepsilon = \frac{p^2}{2m} + \frac{m\omega^2 q^2}{2}. \tag{41.10}$$


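Substituting this in the Boltzmann distribution and calculating the
mean energy, we have

$$\bar\varepsilon_\omega = \frac{\int_{-\infty}^{\infty} dp \int_{-\infty}^{\infty} dq\; \varepsilon\, e^{-\varepsilon/\theta}}{\int_{-\infty}^{\infty} dp \int_{-\infty}^{\infty} dq\; e^{-\varepsilon/\theta}} = \theta^2 \frac{\partial}{\partial\theta} \ln \left[ \int_{-\infty}^{\infty} dp \int_{-\infty}^{\infty} dq\; e^{-\varepsilon/\theta} \right]. \tag{41.11}$$

The statistical integral inside the logarithm is calculated in the usual
way:

$$\int_{-\infty}^{\infty} e^{-p^2/2m\theta}\, dp \int_{-\infty}^{\infty} e^{-m\omega^2 q^2/2\theta}\, dq = \sqrt{2\pi m\theta}\,\sqrt{\frac{2\pi\theta}{m\omega^2}} = \frac{2\pi\theta}{\omega}. \tag{41.12}$$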

Whence $\bar\varepsilon_\omega = \theta$. Then the total vibrational energy of a gas occurring
at a frequency $\omega$ is

$$\mathcal{E}_\omega = N\bar\varepsilon_\omega = N\theta = RT, \tag{41.13}$$

and its contribution to the specific heat is correspondingly equal to $R$
[see (40.17)]. Thus, at a high temperature the specific heat due to
vibrational degrees of freedom tends to a constant limit.
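
How quickly the constant limit is approached can be seen by differentiating the thermal part of (41.6) with respect to $\theta$. The sketch below does this in closed form; the function name and the dimensionless variable $x = h\omega/\theta$ are conveniences introduced here, and the specific heat is measured per oscillator in units of $k$.

```python
import math


def vibrational_specific_heat(x):
    """Specific heat of one oscillator in units of k, with x = h*omega/theta.

    Differentiating h*omega / (exp(x) - 1) with respect to theta gives
    c = x**2 * exp(x) / (exp(x) - 1)**2.
    """
    return x * x * math.exp(x) / math.expm1(x) ** 2


if __name__ == "__main__":
    # Small x means high temperature, large x means low temperature.
    for x in (0.1, 1.0, 3.0, 10.0):
        print(x, vibrational_specific_heat(x))
```

For small $x$ the value is close to unity (the classical limit), while for large $x$ it is exponentially small, in agreement with the statement that vibrations are "frozen" at low temperature.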

The excitation of rotational levels.* Let us now consider rotational
energy. The weight of a state with a given value of moment $K$ is,
as usual, equal to $2K + 1$, in accordance with the number of possible
projections of $K$. Especially interesting is the case when a diatomic
molecule consists of two identical nuclei. In classifying the states of
such a molecule it is necessary to take nuclear spin into account.
Indeed, the wave equation for a molecule consisting of identical atoms
does not change form when the nuclei are interchanged. Therefore,
if the nuclei have half-integral spin, the wave function must be
antisymmetric with respect to the interchange of both nuclei, while if
they have integral spin it must be symmetric. The symmetry of the
eigenfunction of a molecule is determined by the symmetry of its
factors [in the approximation (41.3) it is separated into factors]:
electronic, vibrational, rotational, and nuclear spin. The electronic
term of most molecules does not change when the nuclei are
interchanged. The vibrational function depends only upon the absolute
value of the distance between the nuclei and therefore does not change
either. The rotational eigenfunction is even with respect to this
permutation in the case of even $K$, and odd for odd $K$. Therefore, if
the nuclear spin is half-integral, then the spin function must be
antisymmetric for even $K$ and symmetric for odd $K$, so that the resultant
wave function may always be antisymmetric. If the nuclear spin is
integral, the position is reversed, and if it is equal to zero, then odd $K$
are in general excluded because then the spin factor simply does not
exist.

Rotational energy of para- and ortho-hydrogen. We shall now consider
the rotational states of a hydrogen molecule. The total nuclear spin
for hydrogen can equal unity (the ortho-state) and zero (the
para-state). The weight of a state with spin 1 is 3 and that with spin 0 is 1.
The state with $K = 0$ is even in the rotational wave function. Hence,
it must be odd in the spin function, i.e., it must have spin 0 (see
Sec. 33). But the state with zero moment possesses the least rotational
energy. Therefore, only para-hydrogen is stable close to absolute
zero.

* The hypothesis that the rotation of molecules participates in the thermal
motion of a gas was put forward by M. V. Lomonosov as far back as 1745.

At a temperature other than zero all those states, for which the
Boltzmann factor $e^{-h^2 K(K+1)/2 m r_e^2 \theta}$ is of the order unity, are excited. Taking
the moment of inertia of a hydrogen molecule to be equal to
$0.45 \times 10^{-40}$ g·cm², we can see that already at $T = 300$° K the summation over
all odd moments

$$\sum_{K=1,3,\ldots} (2K + 1)\, e^{-h^2 K(K+1)/2 m r_e^2 \theta}$$

differs from the summation over even moments by several thousandths.
But since the states with odd moments are, for hydrogen, nuclear-spin
ortho-states, each state with odd moment has an additional
weight factor 3 according to the number of projections of spin 1.
Thus, at room temperature, 3/4 of hydrogen is ortho-hydrogen and
1/4 para-hydrogen. If hydrogen is rapidly cooled the ratio 3:1 is
retained for a long time because the ortho-para-transition proceeds
slowly. Such a state is obviously nonequilibrium since all the
hydrogen in an equilibrium state, at a temperature close to absolute zero,
must be in the para-state.
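
The claim that the odd and even sums nearly coincide at room temperature is easy to test numerically. In the sketch below the rotational constant is taken, as an assumption, to be about 87° in temperature units (the value this section quotes for hydrogen further on); the function names are illustrative only.

```python
import math


def rotational_sum(b_over_theta, parity, k_max=60):
    """Sum of (2K+1) exp(-B K (K+1) / theta) over even (parity=0) or odd (parity=1) K."""
    return sum(
        (2 * k + 1) * math.exp(-b_over_theta * k * (k + 1))
        for k in range(parity, k_max + 1, 2)
    )


def ortho_fraction(b_over_theta):
    """Equilibrium fraction of ortho-hydrogen; odd K carry the nuclear-spin weight 3."""
    even = rotational_sum(b_over_theta, 0)
    odd = rotational_sum(b_over_theta, 1)
    return 3.0 * odd / (3.0 * odd + even)


if __name__ == "__main__":
    x = 87.0 / 300.0  # assumed B of about 87 degrees, room temperature
    print(rotational_sum(x, 0), rotational_sum(x, 1), ortho_fraction(x))
```

With these inputs the two sums differ by a fraction of a per cent and the ortho fraction comes out close to 3/4; at a temperature of a few tens of degrees the same function gives an equilibrium ortho fraction near zero.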

One of the methods of obtaining pure para-hydrogen is to adsorb
hydrogen onto any substance that disrupts the molecular bonds
during adsorption, for example, activated carbon. When the hydrogen
is desorbed by pressure reduction at low temperature, it passes
into the para-state. If the hydrogen is then heated to room
temperature it stays in the para-state for quite a long time.

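Let us now write down the formulae for the mean rotational energy
of ortho- and para-hydrogen. For simplicity we shall denote the
factor $h^2/2 m r_e^2$ in the rotational energy by the letter $B$. Then

$$\bar\varepsilon_{para} = \frac{\sum_{K=0,2,4,\ldots} (2K + 1)\, B K(K + 1)\, e^{-BK(K+1)/\theta}}{\sum_{K=0,2,4,\ldots} (2K + 1)\, e^{-BK(K+1)/\theta}} = \theta^2 \frac{\partial}{\partial\theta} \ln \left[ \sum_{K=0,2,4,\ldots} (2K + 1)\, e^{-BK(K+1)/\theta} \right]. \tag{41.14}$$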


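The difference between $\bar\varepsilon_{ortho}$ and $\bar\varepsilon_{para}$ is that the summation is
performed over odd $K$. For a mixture at room temperature

$$\bar\varepsilon_r = \tfrac{1}{4}\bar\varepsilon_{para} + \tfrac{3}{4}\bar\varepsilon_{ortho}. \tag{41.15}$$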




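At very low temperature, it is sufficient to retain only the term with
$K = 2$ in the summation (41.14), so that

$$\bar\varepsilon_{para} = \theta^2 \frac{\partial}{\partial\theta} \ln\left(1 + 5 e^{-6B/\theta}\right) \approx 30 B e^{-6B/\theta}. \tag{41.16}$$

For ortho-hydrogen we obtain

$$\bar\varepsilon_{ortho} = \theta^2 \frac{\partial}{\partial\theta} \ln\left(3 e^{-2B/\theta} + 7 e^{-12B/\theta}\right) \approx \frac{6B e^{-2B/\theta} + 84 B e^{-12B/\theta}}{3 e^{-2B/\theta}} = 2B\left(1 + 14 e^{-10B/\theta}\right). \tag{41.17}$$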
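
The range of validity of the low-temperature formula (41.16) can be gauged by comparing it with the full even-$K$ sums of (41.14). The following sketch measures energies in units of $B$ and temperatures by the ratio $\theta/B$; both function names are introduced here for illustration.

```python
import math


def para_energy_exact(theta_over_b, k_max=40):
    """Mean rotational energy of para-hydrogen in units of B, from the even-K sums in (41.14)."""
    x = 1.0 / theta_over_b  # the ratio B / theta
    ks = range(0, k_max + 1, 2)
    weights = [(2 * k + 1) * math.exp(-x * k * (k + 1)) for k in ks]
    energies = [k * (k + 1) for k in ks]
    return sum(w * e for w, e in zip(weights, energies)) / sum(weights)


def para_energy_low_t(theta_over_b):
    """Low-temperature approximation (41.16), 30 exp(-6 B / theta), in units of B."""
    return 30.0 * math.exp(-6.0 / theta_over_b)


if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0):
        print(t, para_energy_exact(t), para_energy_low_t(t))
```

For $\theta$ well below $B$ the two expressions agree closely; once $\theta$ becomes comparable with $B$ the single-term formula overestimates the energy, because the statistical sum in the denominator is no longer close to unity.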

The determination of nuclear spins from rotational specific heat.
The rotational specific heat of hydrogen makes it possible to determine
the spin of a proton. Let us consider formula (41.17). In it, the first
term is a constant. It is due to the fact that a molecule of
ortho-hydrogen would have a rotational energy $2B$ even at absolute zero.
This energy does not contribute to the specific heat because it does
not depend upon the temperature. Defining specific heat as the
derivative $\partial\bar\varepsilon/\partial\theta$, we see that for a sufficiently low temperature the ratio of
the specific heats of ortho- to para-hydrogen tends to zero as

$$e^{-4B/\theta}.$$

Therefore, if ordinary hydrogen is rapidly cooled to a low temperature,
its rotational specific heat will be determined by a quarter of its
molecules in the para-state. It will be four times less than the rotational
specific heat of pure para-hydrogen at the same temperature.

Thus, by measuring the specific heat of the equilibrium state of 
hydrogen at low temperature (i.e., the para-state) and of rapidly 
cooled hydrogen, we can determine the spin of a proton or, knowing 
the spin from other data, we can show that protons are subject to 
Pauli exclusion because they possess an antisymmetric wave function. 

The rotational specific heat of molecules consisting of different atoms.
Diatomic molecules that do not consist of identical atoms possess
equal nuclear-spin weights for states with odd and even $K$. Therefore,
their mean rotational energy is expressed thus:

$$\bar\varepsilon_r = \theta^2 \frac{\partial}{\partial\theta} \ln \left[ \sum_{K=0}^{\infty} (2K + 1)\, e^{-BK(K+1)/\theta} \right]. \tag{41.18}$$

The sum inside the logarithm cannot be written in finite form, but
it is easily tabulated. Let us evaluate the temperature at which use
of an integral as a substitute for the summation is justified. Thus,
for hydrogen

$$B = \frac{h^2}{2 m r_e^2} = \frac{(1.05 \cdot 10^{-27})^2}{1.67 \cdot 10^{-24}\,(0.74)^2 \cdot 10^{-16}}\ \text{erg},$$

which corresponds to a temperature of 87° K.

Here, $m$ is the reduced mass of two protons, equal to half the proton
mass; $r_e \approx 0.74 \times 10^{-8}$ cm (whence we obtained the moment of inertia
used above). For other gases $B$ is of the order of several degrees so
that for all temperatures at which these gases are not in the liquid
state the ratio $B/\theta$ is a small quantity. To a good approximation, the
summation in (41.18) may be replaced by an integral. If we take

$$K(K + 1) = x,$$

then

$$(2K + 1)\, dK = 2K + 1 = dx \quad (dK = 1)$$

and

$$\sum_{K=0}^{\infty} (2K + 1)\, e^{-BK(K+1)/\theta} \approx \int_0^{\infty} e^{-Bx/\theta}\, dx = \frac{\theta}{B}. \tag{41.19}$$
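
The quality of the replacement (41.19) depends only on the ratio $B/\theta$, which makes it simple to examine numerically. The sketch below sums the series directly until the terms become negligible; the function name and the stopping tolerance are choices made here.

```python
import math


def rotational_partition_sum(b_over_theta, tol=1e-16):
    """Direct summation of (2K+1) exp(-B K (K+1) / theta) over K = 0, 1, 2, ..."""
    total, k = 0.0, 0
    while True:
        term = (2 * k + 1) * math.exp(-b_over_theta * k * (k + 1))
        total += term
        # The terms first grow and then decay; stop once they no longer matter.
        if k > 0 and term < tol * total:
            return total
        k += 1


if __name__ == "__main__":
    for x in (0.01, 0.1, 1.0):
        print(x, rotational_partition_sum(x), 1.0 / x)
```

For $B/\theta = 0.01$ the sum and the integral $\theta/B$ agree to better than one per cent, whereas for $B/\theta$ of order unity (hydrogen near 87° K) the integral is a poor substitute for the sum.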


Substituting this in (41.18), we have an expression for the rotational
energy of a diatomic molecule or any linear molecule:

$$\bar\varepsilon_r = \theta = \frac{RT}{N}. \tag{41.20}$$


We note that the concepts of “high” temperature for vibrations and
rotations do not coincide in the least. With respect to the rotational
specific heat of oxygen, the temperature must be higher than 10° K
to be regarded as high, while with respect to vibrational specific heat,
it must be above 2,000° K. Therefore, in a very wide range of
temperatures, in particular at room temperature, the specific heats of
diatomic gases are constant, and consist of a translational part $\tfrac{3}{2}R$ and
a rotational part equal to $R$, so that the total specific heat is $\tfrac{5}{2}R$.

It may be seen by numerical computation that the rotational
specific heat does not tend to a constant limit monotonically, but passes
through a maximum at $\theta = 0.81 B$, equal to $1.1 R$.
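
The numerical computation referred to here can be sketched as follows. The specific heat is obtained from the energy fluctuation, $c = (\overline{\varepsilon^2} - \bar\varepsilon^2)/\theta^2$, which is what differentiating (41.18) with respect to $\theta$ gives; energies are measured in units of $B$, the result is per molecule in units of $k$, and the grid scan is a deliberately crude way of locating the maximum. All names are illustrative.

```python
import math


def rotational_specific_heat(theta_over_b, k_max=200):
    """Rotational specific heat per molecule in units of k for unlike nuclei."""
    x = 1.0 / theta_over_b  # the ratio B / theta
    z = e1 = e2 = 0.0
    for k in range(k_max + 1):
        energy = k * (k + 1)  # level energy in units of B
        w = (2 * k + 1) * math.exp(-x * energy)
        z += w
        e1 += w * energy
        e2 += w * energy * energy
    mean, mean_sq = e1 / z, e2 / z
    return (mean_sq - mean * mean) * x * x


def find_maximum(start=0.3, stop=3.0, steps=2700):
    """Scan theta/B on a uniform grid and return (theta/B, c) at the largest c."""
    grid = [start + (stop - start) * i / steps for i in range(steps + 1)]
    return max(((t, rotational_specific_heat(t)) for t in grid), key=lambda p: p[1])


if __name__ == "__main__":
    print(find_maximum())
```

The scan places the maximum near $\theta/B \approx 0.8$ with a height of roughly 1.1, after which the curve descends slowly toward the classical value of unity.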

The rotational energy for a polyatomic molecule will be calculated 
in Sec. 47. 


Exercise 

Find the rotational energy of para- and ortho-deuterium. 

Particles with integral spin have a symmetric wave function. Let us now
consider a system of two particles with integral spin, for example, a deuterium
molecule. For comparison we shall also take two particles with spin zero. The
spin function of the latter is identically equal to unity; therefore their orbital
wave function can be only symmetrical. With respect to the rotational function,
interchange of the nuclei is equivalent to a reflection at the coordinate origin.
Hence, if the spin of a deuteron were zero then the spectrum of molecular
deuterium would show the lines corresponding to odd rotational quantum numbers
to be absent. In actual fact they exist in the deuterium spectrum, and the weight
of states with even $K$ is twice as great as for those with odd $K$. This is seen from
the relative intensity of spectral lines that correspond to transitions from the
appropriate states.

We shall show that for a deuteron spin of unity, the weight of the
ortho-states turns out twice the weight of the para-states. A spin projection of unity
takes on three values: 1, 0, −1. We denote the spin wave functions (of both
deuterons) that correspond to these projections as $\psi_1(1)$, $\psi_1(0)$, $\psi_1(-1)$;
$\psi_2(1)$, $\psi_2(0)$, $\psi_2(-1)$. Let us form all the spin wave functions of deuterium that
correspond to a total spin projection 0; we shall only take symmetric and
antisymmetric combinations:

Symmetric functions                               Antisymmetric functions

$\psi_1(1)\psi_2(-1) + \psi_1(-1)\psi_2(1)$,      $\psi_1(1)\psi_2(-1) - \psi_1(-1)\psi_2(1)$.
$\psi_1(0)\psi_2(0)$,

For the total spin projection ±1, we obtain

$\psi_1(1)\psi_2(0) + \psi_1(0)\psi_2(1)$,        $\psi_1(1)\psi_2(0) - \psi_1(0)\psi_2(1)$,
$\psi_1(-1)\psi_2(0) + \psi_1(0)\psi_2(-1)$,      $\psi_1(-1)\psi_2(0) - \psi_2(-1)\psi_1(0)$.

And for a total projection ±2 we have

$\psi_1(1)\psi_2(1)$,
$\psi_1(-1)\psi_2(-1)$.


The symmetric state has a maximum spin projection of two. Hence, the
state for which the spins are parallel is symmetric. But there are six symmetric
spin wave-function projections in all, and spin 2 has 2·2 + 1 = 5 projections.
Hence, of the functions with zero resultant projection, we can construct one
function corresponding to a zero projection of spin 2. The other function with
zero resultant projection corresponds to a resultant spin 0.

In all, deuterium has six ortho-states with a symmetric spin wave function.
A spin unity has states given by an antisymmetric spin function because the
maximum spin projection in those states is equal to unity. Thus, there are three
para-states. An even rotational function of a deuterium molecule corresponds
to the ortho-states, and an odd rotational function corresponds to the
para-states. Then the total function is symmetric, as the case should be for integral
particle spins. The weight (due to spin) for the ortho-states is six and for the
para-states it is three. Therefore, the statistical sum of ortho-deuterium is

$$6 \sum_{K=0,2,4,\ldots} (2K + 1)\, e^{-BK(K+1)/\theta},$$

and for para-deuterium it is equal to

$$3 \sum_{K=1,3,5,\ldots} (2K + 1)\, e^{-BK(K+1)/\theta}.$$




Here the equilibrium state at absolute zero is the ortho-state. The energies
of both states [see (41.16) and (41.17)] are

$$\bar\varepsilon_{ortho} \approx 30 B e^{-6B/\theta}, \qquad \bar\varepsilon_{para} \approx B\left(2 + 28 e^{-10B/\theta}\right).$$

Compared with hydrogen, the ortho- and para-states are interchanged here.
Close to absolute zero, the basic contribution to the specific heat is given only
by the ortho-state. Two thirds of all the molecules in equilibrium deuterium
occur in this state at room temperature. Therefore, the rotational specific heat
of rapidly cooled deuterium is less than that of equilibrium deuterium at the
same temperature in the ratio 2/3. Thus, by measuring this ratio we can show
that the spin of a deuteron is equal to unity and not zero.


Sec. 42. The Application of Statistics to the Electromagnetic Field
and to Crystalline Bodies

The statistical equilibrium of matter and radiation. In this section
we shall first of all consider radiation in a state of statistical equilibrium 
with matter. The conditions for such equilibrium are achieved inside 
a closed cavity in an opaque body. The walls of the opaque cavity 
absorb radiation of all frequencies and hence they also radiate all 
frequencies: if a direct quantum transition is permissible, then the 
reverse transition is also permissible. Therefore, radiation arrives 
at a statistical equilibrium with matter, that is, in unit time there 
is an equal amount of absorbed and emitted energy of electromagnetic 
radiation per unit surface of the cavity for every direction, frequency, 
and polarization. 

An equilibrium density of radiation energy is thus set up in the
cavity. It can be shown that in this case the temperature of radiation
is equal to the temperature of the walls. The necessity of this will
be especially clearly seen in the sections dealing with the fundamentals
of thermodynamics (Secs. 45 and 46); for the time being we shall
merely note that it is natural to regard the temperatures of systems
in equilibrium as identical.

The absolutely black body. Equilibrium radiation can be
experimentally studied by making a small aperture in the wall of the cavity:
if it is of sufficiently small dimensions the equilibrium state will not
be noticeably changed. Radiation incident on such an aperture
from outside the cavity is absorbed in it and does not get outside.
In this sense the aperture resembles a black body which does not
reflect light rays. For this reason it is called an “absolutely black body,”
and the equilibrium radiation coming from the aperture is called
“black-body radiation.”




This term is somewhat paradoxical since it contradicts the obvious 
picture. Indeed, an absolutely black body in equilibrium radiates 
more than a nonblack body because it absorbs more, and in equilib¬ 
rium the radiation and absorption are equal. If a body having a 
cavity and aperture is brought to an incandescent state, the aperture 
will exhibit the brightest glow. 

The statistics of an oscillator field representation. Planck’s formula. 
In this section we shall consider the application of statistics to 
equilibrium radiation. For this it is necessary to quantize the radiation. 
Unlike the statistics of a gas, the statistics of radiation does not 
permit a limiting transition to equations, with the quantum of ac¬ 
tion being eliminated entirely. This will become clear a little later. 

In quantizing the field, a double approach is possible. Firstly,
a field may be represented as a set of linear harmonic oscillators
by characterizing each oscillator with a definite wave vector $\mathbf{k}$ and
polarization $\sigma$ ($\sigma = 1, 2$). It is obvious that all these oscillators are
different (as to their $\mathbf{k}$ and $\sigma$). The quantum properties of such
oscillators are not apparent in calculating the number of states of the
field; their only manifestation is that the energy of each of them
cannot be equated to an arbitrary number, but belongs to an
oscillator-energy spectrum, i.e., is equal to $h\omega\left(n + \tfrac{1}{2}\right)$, where $n$ is an integer.

When an oscillator is in thermal equilibrium, the mean number
of its vibrational quanta is given by a formula similar to (41.8):

$$\bar n = \frac{1}{e^{h\omega/\theta} - 1}. \tag{42.1}$$

The energy of each quantum is equal to $h\omega$ and the number of
oscillations with frequency $\omega$ is, according to (25.24),

$$dg(\omega) = \frac{V \omega^2\, d\omega}{\pi^2 c^3}. \tag{42.2}$$

Here, in contrast to formula (25.24), both possible polarizations
of oscillation with a given frequency are taken into account, and
$k = \omega/c$ has been substituted. Hence the energy of an electromagnetic
field in the frequency interval $d\omega$ is

$$d\mathcal{E}(\omega) = \frac{V h \omega^3\, d\omega}{\pi^2 c^3 \left(e^{h\omega/\theta} - 1\right)}. \tag{42.3}$$


The radiation spectrum of the sun is close to this frequency distri¬ 
bution. 

The statistics of light quanta. Let us now approach formula (42.1)
from another direction. We have said that the electromagnetic
field is viewed as an assemblage of elementary particles, light quanta.
Quanta of the same frequency, direction, and polarization are
indistinguishable from one another. Therefore quantum statistics
are applicable to them as to particles. At the same time quanta
have integral angular momenta; this was mentioned in Sec. 34.
Therefore they are not subject to Pauli exclusion, and possess a Bose
and not Fermi distribution. But, as opposed to gas molecules, which
are subject to a Bose distribution, the number of quanta is not a
constant quantity, since quanta may be absorbed and radiated.
This is why the supplementary condition (39.12) does not apply
to quanta.

It is easy to pass from the general Bose distribution to a special
case, when condition (39.12) is not imposed; for this it is sufficient
to put equal to zero the parameter $\mu$, by which equation (39.12)
is multiplied ($\mu$ was introduced to satisfy the condition $N = \text{const}$).
Then the Bose distribution is simplified:

$$\bar n = \frac{1}{e^{\varepsilon/\theta} - 1}. \tag{42.4}$$

Taking into account that for a quantum $\varepsilon = h\omega$, we once again
obtain (42.1). Thus, formula (42.1) denotes either the mean
vibrational quantum number of an oscillator in an assembly subject
to Boltzmann statistics, or the mean number of light quanta subject
to Bose statistics. As we have already said, certain oscillators obey
Boltzmann statistics; they are differentiated by the numbers
$n_1$, $n_2$, $n_3$, $\sigma$ (see Sec. 27), while the statistics of distinguishable particles
is nonquantum. Let it be recalled that we differentiate between
quantum and nonquantum statistics according as the particles are
distinguishable or not.

The impossibility of the limiting transition $h \to 0$ in the statistics
of the electromagnetic field. Let us now turn, for a time, to the
oscillator picture. On classical theory, the mean energy of an oscillator
is equal to $\theta$ [see (41.11)-(41.13)]. If we multiply it by $dg(\omega)$, the
classical Rayleigh-Jeans formula for the energy of equilibrium
radiation results:

$$d\mathcal{E}(\omega)_{class} = \frac{V \omega^2\, d\omega}{\pi^2 c^3}\,\theta. \tag{42.5}$$

But this formula is obviously inadequate for large frequencies;
upon integration with respect to $\omega$ it gives an infinite total energy.
It was precisely here, in statistics, that the classical representations
first so obviously failed. Therefore, in 1900, Planck proposed
formula (42.3); it was here that the quantum of action appeared for the
first time in physics.
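
The relation between Planck's formula and the Rayleigh-Jeans formula is conveniently expressed through their ratio, in which the common factor $V\omega^2 d\omega/\pi^2 c^3$ cancels. The sketch below evaluates this ratio as a function of $x = h\omega/\theta$; the function name is an illustrative choice.

```python
import math


def planck_to_rayleigh_jeans(x):
    """Ratio of the Planck density (42.3) to the Rayleigh-Jeans density (42.5).

    With x = h*omega/theta the ratio reduces to x / (exp(x) - 1).
    """
    return x / math.expm1(x)


if __name__ == "__main__":
    for x in (0.01, 0.1, 1.0, 10.0):
        print(x, planck_to_rayleigh_jeans(x))
```

The ratio is close to unity for small $x$ and falls off exponentially for large $x$, which is what removes the divergence of the classical total energy.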

Formula (42.5) is correct only for frequencies that satisfy the
inequality $h\omega \ll \theta$.

The total energy of equilibrium radiation. It is easy to find the
total energy of equilibrium electromagnetic radiation from formula
(42.3). Integrating with respect to $\omega$, we obtain

$$\mathcal{E} = \frac{V h}{\pi^2 c^3} \int_0^{\infty} \frac{\omega^3\, d\omega}{e^{h\omega/\theta} - 1} = \frac{V \theta^4}{\pi^2 c^3 h^3} \int_0^{\infty} \frac{x^3\, dx}{e^x - 1}. \tag{42.6}$$

The integral in (42.6) is merely an abstract number, equal to $\pi^4/15$
(see Appendix, p. 586), so that the required energy is proportional
to the fourth power of the absolute temperature (the
Stefan-Boltzmann law).
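
The value of the dimensionless integral in (42.6) can be confirmed by direct quadrature. The sketch below uses a plain midpoint rule on a finite interval; the upper limit and the number of steps are arbitrary choices made here, adequate because the integrand decays exponentially.

```python
import math


def stefan_integral(upper=60.0, steps=200_000):
    """Midpoint-rule estimate of the integral of x**3 / (exp(x) - 1) from 0 to `upper`."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h  # midpoint of the i-th subinterval
        total += x ** 3 / math.expm1(x)
    return total * h


if __name__ == "__main__":
    print(stefan_integral(), math.pi ** 4 / 15.0)
```

The two printed numbers should agree to many decimal places.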


Radiation from an absolutely black body. The result (42.6) can
be verified from the emissivity of an “absolutely black body.” It
is easy to relate it to the energy $\mathcal{E}$. For this it is sufficient to calculate
how many quanta fall from inside in unit time upon unit surface
of a cavity, normal to the surface. We have indicated that if we
take away a small section of the wall, radiation will pass through
the aperture with the same composition as that falling on the wall.

The velocity of each quantum is $c$, so that its normal component
is equal to $c \cos\vartheta$, where $\vartheta$ is the angle with the normal. In unit
time these quanta will strike a square centimetre of the wall from
the whole volume of a cylinder with base 1 cm² and height $c \cos\vartheta$.
The energy included in the volume of this cylinder is equal to
$\frac{\mathcal{E}}{V} c \cos\vartheta$. The fraction of quanta flying in unit solid angle is equal to $\frac{1}{4\pi}$,
so that the total energy falling on a square centimetre of the wall
in unit time is

$$\frac{\mathcal{E} c}{4\pi V} \int_0^{2\pi} d\varphi \int_0^{\pi/2} \sin\vartheta \cos\vartheta\, d\vartheta = \frac{c\,\mathcal{E}}{4V} = \frac{\pi^2 \theta^4}{60\, c^2 h^3} = \frac{\pi^2 k^4}{60\, c^2 h^3}\, T^4. \tag{42.7}$$


The constant in front of $T^4$ is equal to $5.67 \times 10^{-5}$ erg/cm²·sec·deg⁴.
Formula (42.7) cannot be directly applied to an incandescent
solid body without ascertaining to what extent it may be regarded
as black.
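
The numerical value of the constant follows directly from (42.7). The sketch below evaluates it in CGS units; the values of the constants are modern ones assumed here (with $h$ understood, as throughout this book, as Planck's constant divided by $2\pi$), and the names are illustrative.

```python
import math

# CGS values of the constants (assumed here, not taken from the text).
K_BOLTZMANN = 1.380649e-16   # erg/deg
H_BAR = 1.054571817e-27      # erg*sec, Planck's constant divided by 2*pi
C_LIGHT = 2.99792458e10      # cm/sec


def stefan_boltzmann_constant():
    """Coefficient of T**4 in (42.7): pi**2 k**4 / (60 c**2 h**3)."""
    return math.pi ** 2 * K_BOLTZMANN ** 4 / (60.0 * C_LIGHT ** 2 * H_BAR ** 3)


if __name__ == "__main__":
    print(stefan_boltzmann_constant())  # erg / (cm^2 sec deg^4)
```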

Due to the fact that the sun's luminous shell (photosphere) is
nearly opaque to radiation, the spectrum it emits is close to the
equilibrium spectrum (42.3), even though it does not exactly
coincide with it. The temperature of the photosphere, as determined
from (42.3), is approximately 5,700°.

The pressure of equilibrium radiation. It is also easy to calculate
the pressure of equilibrium radiation. It is convenient in doing so
to apply the same reasoning that led to formula (42.7). Now, however,
instead of calculating the number of quanta, it is necessary to
calculate their normal component of momentum transmitted through
a square centimetre of surface. This component is equal to the
quantum energy $h\omega$ divided by $c$ and multiplied by $\cos\vartheta$. Therefore,
unlike formula (42.7), we must integrate $\cos^2\vartheta$ instead of $\cos\vartheta$.
In addition, for every incident quantum in the equilibrium state
there is a similar quantum radiated in the reverse direction, so that
the transferred momentum is doubled. Whence the pressure is

$$p = \frac{2\mathcal{E}}{4\pi V} \int_0^{2\pi} d\varphi \int_0^{\pi/2} \cos^2\vartheta \sin\vartheta\, d\vartheta = \frac{\mathcal{E}}{3V}, \tag{42.8}$$

i.e., one third of the energy density. The same would be obtained
from the derivation of equation (40.22) if the momentum were put
equal to $\varepsilon/c$ instead of $mv$. We note that in Lebedev's experiments,
where the pressure of a directed beam was measured, and not of
light arriving uniformly from all directions, $p = \mathcal{E}/V$; the pressure
of the directed beam is equal to the energy density without the
factor 1/3 (see Sec. 17).

From (42.8) and (42.6), the pressure of electromagnetic radiation
increases in proportion to the fourth power of the temperature while
the gas pressure is proportional to the first power. Therefore, radiation
pressure will always predominate at a sufficiently high temperature.

At high temperatures the pressure of a substance can always be
calculated from the ideal-gas formula, because the interaction energy
between particles becomes small compared with their kinetic energy.
Hence,

$$p = \frac{N\theta}{V}.$$

By considering that atoms are dissociated into nuclei and electrons,
it is easy to express the ratio $N/V$ in terms of the mass density. Let
us suppose that the substance consists of hydrogen. Then for every
proton there is one electron. If the density of the substance is $\rho$,
then the ratio $N/V$ is $2\rho/m$, where $m$ is the mass of a proton and
the factor 2 takes into account the electron. This gives

$$p = \frac{2\rho\theta}{m}. \tag{42.9}$$

From (42.8) and (42.6), the radiation pressure $p_r$ is

$$p_r = \frac{\pi^2 \theta^4}{45\, c^3 h^3}.$$

From this we obtain the relationship between density and
temperature when the radiation pressure becomes equal to the gas
pressure:

$$\rho = \frac{\pi^2 m\, \theta^3}{90\, c^3 h^3}.$$

For example, for a density $\rho = 1$ gm/cm³, both pressures become
equal if the temperature is equal to $4 \times 10^7$ deg. Radiation pressure
is important in the interiors of certain classes of stars.
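
The quoted temperature can be recovered by equating the gas pressure (42.9) to the radiation pressure $\mathcal{E}/3V$ and solving for $\theta$. The sketch below does this in CGS units; the constants are modern values assumed here (with $h$ again meaning Planck's constant divided by $2\pi$), and the function name is illustrative.

```python
import math

# CGS values of the constants (assumed here, not taken from the text).
K_BOLTZMANN = 1.380649e-16   # erg/deg
H_BAR = 1.054571817e-27      # erg*sec, Planck's constant divided by 2*pi
C_LIGHT = 2.99792458e10      # cm/sec
PROTON_MASS = 1.67262e-24    # g


def equal_pressure_temperature(density):
    """Temperature (deg) at which 2*rho*theta/m equals pi**2 theta**4 / (45 c**3 h**3)."""
    theta_cubed = 90.0 * C_LIGHT ** 3 * H_BAR ** 3 * density / (math.pi ** 2 * PROTON_MASS)
    return theta_cubed ** (1.0 / 3.0) / K_BOLTZMANN


if __name__ == "__main__":
    print(equal_pressure_temperature(1.0))  # density in g/cm^3
```

Since the temperature varies only as the cube root of the density, a thousandfold change in density shifts the crossover temperature by a factor of ten.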

The frequency corresponding to the maximum radiation-energy
density in a spectral interval $d\omega$. The maximum energy in the
distribution occurs at a frequency determined from the equation

$$\frac{d}{d\omega}\left(\frac{\omega^3}{e^{h\omega/\theta} - 1}\right) = 0. \tag{42.11}$$

Performing the differentiation, we have

$$1 - e^{-h\omega_0/\theta} = \frac{h\omega_0}{3\theta}. \tag{42.12}$$

This equation has a single solution with respect to $h\omega_0/\theta$:

$$\frac{h\omega_0}{\theta} = 2.822. \tag{42.13}$$
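
The root of (42.12) is readily found by simple iteration. Writing $x = h\omega_0/\theta$, the equation becomes $x = 3(1 - e^{-x})$, and repeated substitution converges because the derivative of the right-hand side is well below unity near the root. The starting value, tolerance, and function name below are choices made for this sketch.

```python
import math


def wien_root(x0=3.0, tol=1e-12, max_iter=200):
    """Solve 1 - exp(-x) = x/3 for x = h*omega_0/theta by iterating x = 3 (1 - exp(-x))."""
    x = x0
    for _ in range(max_iter):
        x_new = 3.0 * (1.0 - math.exp(-x))
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("fixed-point iteration did not converge")


if __name__ == "__main__":
    print(wien_root())
```

The trivial root $x = 0$ of the same equation is not reached from a starting value near 3, since the iteration is repelled from zero.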


Thus, the frequency corresponding to maximum energy in the
spectrum of black-body radiation is directly proportional to the absolute
temperature (Wien's law):

$$\omega_0 = \frac{2.822\,\theta}{h}. \tag{42.14}$$


We notice that the numerical coefficient in the formula would have
been different if we had considered the wavelength distribution
instead of frequency distribution (see exercise 1). It is interesting
to note that the corresponding wavelength in the solar spectrum is
very close to that for the maximum sensitivity of the human eye.
The curve of the distribution $\omega^3/(e^{h\omega/\theta} - 1)$ is shown in Fig. 48.

Fig. 48

Spontaneous and forced emission of quanta. At the beginning
of this section we pointed out that thermal equilibrium between
atoms and radiation is attained in a closed cavity. The presence
of atoms capable of radiation and absorption is necessary in general
in order that the radiation may arrive at equilibrium; this is because
separate oscillators, corresponding to normal oscillations of the
electromagnetic field, are completely independent of one another,
and any initial nonequilibrium distribution is maintained until
there is an exchange of quanta via absorbing atoms.

In Sec. 34 we derived an expression for the probability of light
emission by an atom. According to (34.46), the radiation probability
in unit time is

(42.15) 

We shall now consider atoms which are in thermal equilibrium with
matter. Let the frequency $\omega_{10}$ satisfy the relationship
$h\omega_{10} = \varepsilon_1 - \varepsilon_0$, where $\varepsilon_1$ and
$\varepsilon_0$ are the energies of two atomic states. In equilibrium,
atoms with energy $\varepsilon_1$ radiate as many quanta with frequency
$\omega_{10}$ as are absorbed by atoms with energy $\varepsilon_0$.

In accordance with the principle of detailed balance, the probabili¬ 
ties for direct and reverse transitions are connected by the following 
relation: 

$$g_1W_{10} = g_0W_{01}. \tag{42.16}$$

Indeed, the first-approximation formula of perturbation theory
(34.29) is applicable to radiation and absorption processes, since the
interaction of matter with radiation may be regarded as weak. From
this formula, the probabilities for the transitions $1 \to 0$ and
$0 \to 1$ are, respectively,

$$W_{10} = \frac{2\pi}{h}\,|\mathcal{H}'_{10}|^2\,g_0, \qquad W_{01} = \frac{2\pi}{h}\,|\mathcal{H}'_{01}|^2\,g_1. \tag{42.17}$$

But according to the Hermitian condition (34.15), the squares of
the moduli of the matrix elements $|\mathcal{H}'_{01}|^2$ and
$|\mathcal{H}'_{10}|^2$ are the same, so that if we multiply expressions
(42.17) by the weights of the initial states, the result will be
equation (42.16).

The formula for the probability of absorption relates to the case
when a single quantum of frequency $\omega_{10}$ existed in the field
before absorption. If there were $n(\omega_{10})$ such quanta before
absorption, then it is natural to assume that the probability of
absorbing one of them in unit time is $n(\omega_{10})$ times greater.
This assumption is justified in electromagnetic-field quantum theory.

We shall therefore assume the probability of absorbing in unit time
one of the $n(\omega_{10})$ identical quanta in the field to be equal to
$n(\omega_{10})\,g_0W_{01}$. In accordance with the principle of detailed
balance we must have the same probability for the reverse transition,
i.e., the emission of a quantum by an atom occurring in state 1 when
there are $n(\omega_{10}) - 1$ such quanta in the field; this is because
the transition is reversed with respect to the one just considered.
We represent both transitions thus:


464 


STATISTICAI. PHYSICS 


[Part IV 


quantum 
absorption 

quantum 
radiation 

Thus, in accordance with the principle of detailed balanee, the prob¬ 
ability of emission of a quantum must likewise be which can 

also be represented as [(n — 1) + 1] !7i IFm- Because of equation (42.16) 
the probabilities for both direct and reverse transitions will be equal. 
Hence, if n --1 quanta exist in the field, then the probability of emis¬ 
sion is proiJortional to n, i.e., to the number of quanta increased by 
unity. If, for exanijile, there were no quanta in the field before emission, 
this factor of jirojiortionality is equal to unity. In this case the emis¬ 
sion is termed spontaneous. But when there are quanta in the field, 
they stimulate, as it were, further emission of quanta with the same 
frequency, direction of jiropagation, and polarization. The emission 
produced hy them is called forced. The existence of forced emission 
can also be proved by means of quantum field theory, just as the pro¬ 
portionality factor n in the absorjition jjrobability. The idea of forced 
emission was introduced by Einstein. 

The derivation of Planck’s formula from the relationship between 
the quantum emission and absorption probabilities. Let us now consider 
atoms in thermal equilibrium with an electromagnetic held. Let the 
quantity n (wio) denote the equilibrium number of quanta. The condi¬ 
tion of statistical eipiilibrium is that atoms occurring in state 0 absorb 
as many quanta with frequency in unit time as are emitted by atoms 
in state 1. Then the number n (wjo) docs not change with time, i.e., 
equilibrium is attained. 

The number of acts of absorption by all the atoms in unit time (from 
state 0 in which there are atoms) is equal to 

■^0 ^^01 ^ (^lo) • (42.18a) 

The number of acts of emission by all atoms in state 1 in unit time is 

JViIPiof«(<o.o)-Mj. (42.18b) 

because, as we have seen, it involves the number of quanta increased 
by unity, i.e., n (w^q)-)-!. 

Naturally, expressions (42.18 a) and (42.18 b) no longer denote the 
probabilities for direct and reverse transitions, but the probabilities 
for transitions from a state with the same number of quanta n (wiq), 
which probabilities lead to a reduction or to an increase by unity of 
the same number. The condition for thermal equilibrium is that these 
probabilities are equal: 

^0 ^^01 ^ (^lo) ~ ^ 10 (‘^lo) + 1] • 


Ist state of the system; 

atom with energy Sq 
n quanta with frequency 
Wio 


2nd state of the system: 
atom with energy Sj 
n -—1 quanta with 
frequency Wio 


(42.19) 


Here, we substitute $N_0$ and $N_1$ from the Boltzmann distribution
(40.1):

$$e^{-\varepsilon_0/\theta}\,g_0W_{01}\,n(\omega_{10}) = e^{-\varepsilon_1/\theta}\,g_1W_{10}\,[n(\omega_{10}) + 1]. \tag{42.20}$$

We now take advantage of the fact that
$h\omega_{10} = \varepsilon_1 - \varepsilon_0$, and also of the
relationship (42.16). Then there remains an equation for the
equilibrium number of quanta $n(\omega_{10})$:

$$e^{h\omega_{10}/\theta}\,n(\omega_{10}) = n(\omega_{10}) + 1. \tag{42.21}$$

Whence Planck's formula is immediately obtained:

$$n(\omega_{10}) = \frac{1}{e^{h\omega_{10}/\theta} - 1}. \tag{42.22}$$

Thus, the idea of forced emission leads to a correct frequency
distribution of quanta. Note that the part of forced emission is the
more important the greater $n$ is compared with unity. But large $n$
correspond to the classical limit; it follows that forced or induced
emission is by nature classical and spontaneous emission is a quantum
effect.
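
The balance between absorption and emission can be verified numerically.
The following sketch assumes Boltzmann populations
$N_i \propto g_i e^{-\varepsilon_i/\theta}$ with the energy counted from
level 0, fixes $W_{10}$ through the relation $g_1W_{10} = g_0W_{01}$, and
uses Planck's formula for $n$. The function names and the default weights
are illustrative choices, not taken from the text.

```python
import math


def planck_occupation(x):
    """Equilibrium number of quanta for x = h*omega/theta."""
    return 1.0 / math.expm1(x)


def transition_counts(x, g0=1.0, g1=1.0, w01=1.0):
    """Return (absorptions, emissions) per unit time, up to a common factor.

    absorptions = N0 * W01 * n
    emissions   = N1 * W10 * (n + 1), with W10 fixed by g1*W10 = g0*W01.
    """
    n = planck_occupation(x)
    w10 = g0 * w01 / g1
    n0 = g0                    # Boltzmann population of level 0
    n1 = g1 * math.exp(-x)     # Boltzmann population of level 1
    return n0 * w01 * n, n1 * w10 * (n + 1.0)


if __name__ == "__main__":
    for x in (0.1, 1.0, 5.0):
        absorbed, emitted = transition_counts(x, g0=2.0, g1=3.0)
        print(x, absorbed, emitted)
```

For small $x$ the occupation number is large and the "+ 1" is a minor
correction; for large $x$ nearly all emission is spontaneous.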

We notice that the theory of a field consisting of Bose particles 
always leads to the concept of forced emission, provided the principle 
of detailed balance is used. 

The probability of the appearance in the field of the $(n+1)$th
particle is proportional to $n + 1$, while the probability of the
particle disappearing is proportional to $n$. Naturally, if the Bose
particles are charged (as, for example, $\pi$-mesons) then only those
transitions are possible which are compatible with the conservation of
total charge of the system.

As regards Fermi particles, we must bear in mind that a transition
to a filled level is impossible. Therefore, if the probability that a
level is filled is $f$, then the number of transitions to this level in
unit time is proportional to $1 - f$.

The oscillation spectrum for the lattice of a solid body. Let us now 
apply statistics to the crystal lattice of a solid body. As applied to 
the crystal lattice, statistics is in many ways similar to the theory of 
equilibrium radiation. 

The vibrations of atoms in a lattice may be described in normal 
coordinates, after which their energy is reduced to approximately 
the same form as (27.22): it consists of the sum of the energies of sepa¬ 
rate oscillators. To each oscillator, there corresponds a travelling wave 
(in the lattice) of the displacements of atoms from their equilibrium 
positions. An example of such a wave, travelling along a chain of 
atoms, will be given in exercise 4. 




However, there exist the following differences between the set of 
oscillators for an electromagnetic field and those for a solid crystalline 
body. 

1) The number of degrees of freedom for an electromagnetic field
is infinite, so that it always contains all frequencies from zero to
$\infty$. A solid body has a finite number of degrees of freedom equal
to $3N$, where $N$ is the number of atoms. Therefore, the range of
vibrational frequencies extends from zero to some maximum frequency
$\omega_{\max}$.

2) The dependence of frequency upon the wave vector of an
electromagnetic field is defined by the simple law $\omega = ck$. In the
oscillations of a solid body, the frequency depends upon the wave
vector in a very complex manner. Only in the limit of very long waves
(i.e., for small $k$) do the atomic vibrations become elastic
vibrations of a continuous medium, so that the atomic structure of the
crystal can be ignored.

In a continuous medium the frequency is proportional to the wave
vector, $\omega = u_\sigma k$, where the index $\sigma$ denotes that the
wave velocity $u$ depends upon its polarization. Here, as opposed to
the electromagnetic field, in an elastic body there are three wave
polarizations for each $\mathbf{k}$ (when taking into account the atomic
structure of a crystal, waves of another type besides elastic
sometimes occur: see Fig. 49). The directions of polarization depend
upon the elastic properties of crystals and upon $\mathbf{k}$.

In an isotropic elastic body two of these polarizations are transverse,
for which the velocity is equal to $u_t$, and one is longitudinal with
velocity $u_l$, so that $\sigma$ ranges through three values as in the
crystal.

The number of oscillations occurring in a given interval of values
$\mathbf{k}$ can be obtained, as usual, by proceeding from the
relationship between the oscillation number and the wave vector. Each
oscillation is defined by three integers $(n_1, n_2, n_3)$ and $\sigma$.
The wave vector has components proportional to $n_1$, $n_2$, $n_3$:

$$k_x = \frac{2\pi n_1}{a_1}, \qquad k_y = \frac{2\pi n_2}{a_2}, \qquad k_z = \frac{2\pi n_3}{a_3} \tag{42.23}$$

($a_1$, $a_2$, $a_3$ are the crystal dimensions). From this,

$$dn_1\,dn_2\,dn_3 = \frac{a_1a_2a_3\,dk_x\,dk_y\,dk_z}{(2\pi)^3} = \frac{V\,dk_x\,dk_y\,dk_z}{(2\pi)^3}. \tag{42.24}$$


Comparing this formula with (26.22), we notice that now the
denominator is $(2\pi)^3$, while before it was simply $\pi^3$. The
difference is due to the fact that here the numbers $n_1$, $n_2$, $n_3$
are found from the periodicity condition, as for an electromagnetic
field [see (27.4)], and the expansion is performed for travelling waves
instead of standing waves. For this reason, an additional factor $2^3$
appears in the denominator of (42.24) as compared with (26.18). But
here the numbers $n_1$, $n_2$, $n_3$ range through all values from
$-\infty$ to $\infty$, while in Sec. 25 they varied only from zero to
$\infty$, filling one octant. Thus, the total number of states turns out
to be the same, irrespective of the method used to count them: by
travelling waves or by standing waves. And this is the way it should
be [cf. (26.23)].
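
The equality of the two counts can be illustrated by brute force. The
sketch below assumes a cubic box of side $a$ and counts, for one
polarization, all modes with $|\mathbf{k}| \le k_{\max}$ in both schemes:
travelling waves with $k_i = 2\pi n_i/a$ and $n_i$ any integers, and
standing waves with $k_i = \pi n_i/a$ and $n_i$ positive integers. Both
are compared with the continuum value $a^3k_{\max}^3/(6\pi^2)$ that
follows from (42.24). The function names and the argument `ka`
(the product $k_{\max}a$) are introduced here for illustration.

```python
import math


def count_travelling(ka):
    """Modes k = 2*pi*n/a with n_i any integers and |k| <= kmax (ka = kmax*a)."""
    radius = ka / (2.0 * math.pi)
    nmax = int(radius)
    span = range(-nmax, nmax + 1)
    return sum(1 for i in span for j in span for l in span
               if i * i + j * j + l * l <= radius * radius)


def count_standing(ka):
    """Modes k = pi*n/a with n_i positive integers and |k| <= kmax (ka = kmax*a)."""
    radius = ka / math.pi
    nmax = int(radius)
    span = range(1, nmax + 1)
    return sum(1 for i in span for j in span for l in span
               if i * i + j * j + l * l <= radius * radius)


def count_continuum(ka):
    """Continuum estimate V * kmax^3 / (6 * pi^2) for a cube, V = a^3."""
    return ka ** 3 / (6.0 * math.pi ** 2)


if __name__ == "__main__":
    ka = 40.0 * math.pi
    print(count_travelling(ka), count_standing(ka), round(count_continuum(ka)))
```

The standing-wave count comes out a few per cent lower at this size
because modes with a vanishing index are excluded; this is a surface
correction whose relative weight falls off as the box grows.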

The energy of a solid body. It is now easy to write down an expression
for the energy for an interval $dk_x$, $dk_y$, $dk_z$ and for a given
polarization of oscillations $\sigma$. Like in (42.3) we have

$$dE(\mathbf{k}, \sigma) = \frac{V}{(2\pi)^3}\,\frac{h\omega_\sigma(\mathbf{k})\,dk_x\,dk_y\,dk_z}{e^{h\omega_\sigma(\mathbf{k})/\theta} - 1}. \tag{42.25}$$

In order to find the total crystal energy, we must integrate this
expression over all $dk_x\,dk_y\,dk_z$ and sum over $\sigma$. Unlike the
case of an electromagnetic field, here the integration must not be
performed to infinity, but only between limits such that the total
number of oscillations equals the number of degrees of freedom $3N$
(because there are $N$ atoms in the lattice and each one has three
vibrational degrees of freedom):

$$\sum_\sigma\int\frac{V\,dk_x\,dk_y\,dk_z}{(2\pi)^3} = 3N. \tag{42.26}$$

In order to find out what is meant here by summation over $\sigma$, let
us consider the possible types of vibration of a crystal lattice
composed of atoms.

Two types of lattice vibration. If we confine ourselves only to crystal
lattices of elements where there is only a single atom in an elementary
cell, then the index $\sigma$ will indeed range in value from 1 to 3. In
reality, lattices are sometimes of more complex form, and the possible
types of vibration become correspondingly more complicated. This can be
illustrated by a simple example. Let there be two atoms in a cell,
as shown in Fig. 49 by solid and open circles. The length $d$
corresponds to one constant of the lattice. Let us imagine a vibration
of some definite wavelength $\lambda$. In Fig. 49 $\lambda = 4d$ (a single
half-wave is shown). This vibration may be effected in two ways: both
atoms in the elementary cell are either displaced to one side
(Fig. 49a), or in opposite directions (Fig. 49b). The second vibration
corresponds to a greater frequency for a given wavelength than the
first because the restoring force for the second vibration is greater.

Fig. 49

If there are $i$ atoms in an elementary cell then $3i$ types of
vibration exist in the three-dimensional case. Three types correspond
to the case (a) in Fig. 49, when all the $i$ atoms are displaced in the
same direction, and, in the limiting case of long wavelengths, the
whole lattice vibrates like a continuous medium.

The total number of crystal vibrations is equal to $3iN' = 3N$, where
$N'$ is the number of elementary cells. Obviously, the number of
vibrations is equal to the number of degrees of freedom, i.e., three
times the number of atoms in the lattice.

Calculating the energy of a crystal lattice. (42.25) cannot be
integrated in general form because the dependence of frequency upon
$\mathbf{k}$ and $\sigma$ is different for different lattices and for
different types of vibration. We must therefore confine ourselves to
two cases.

a) The temperature $\theta$ is considerably greater than the
limiting-frequency quantum $h\omega_{\max}$. It is then all the more so
greater than the other quanta, so that we can neglect all terms, except
the first, in the exponential series

$$e^{h\omega/\theta} - 1 = \frac{h\omega}{\theta} + \frac{1}{2}\left(\frac{h\omega}{\theta}\right)^2 + \dots$$

Substituting this in (42.25), we obtain a simple expression for the
lattice energy:

$$E = \theta\sum_\sigma\int\frac{V\,dk_x\,dk_y\,dk_z}{(2\pi)^3} = 3N\theta = 3RT. \tag{42.27}$$

Here we have made use of the fact that the total number of lattice
vibrations is equal to the number of its degrees of freedom $3N$. Hence,
the specific heat of the lattice is equal to $3R$ and is the same for
all elements in molar units. This law is well satisfied for very many
elements already at room temperature (the Dulong and Petit law).
Exceptions are, for example, diamond and beryllium, for which the
large frequency $\omega_{\max}$ is due to a relatively small atomic
weight, since frequency is proportional to $M^{-1/2}$.

Expression (42.27) fits the general law for the temperature dependence
of vibrational energy at high temperatures (41.13).

Very frequently, in a crystal lattice we can distinguish the molecules
of the substance of which it is formed. We cannot, of course, draw a
really strict distinction between atomic and molecular crystals but,
qualitatively, this distinction is fully meaningful. In molecular
crystals we can separately consider the motion of atoms inside
molecules (purely vibrational, in the given case) and the motion of
molecules as a whole relative to their equilibrium positions in the
crystal. The latter correspond not only to definite centre-of-mass
coordinates of the molecules in the lattice, but also to certain
distinct orientations in space. Usually all the degrees of freedom of
the motion of molecules in a crystal are vibrational. Solid hydrogen
forms an exception where the molecules rotate almost freely (this
rotation is similar to the rotation of a pendulum if its total energy
is sufficient for transition through its upper position). The frequency
spectrum for all vibrations, translational and rotational, consists of
very many dispersion curves with different $\sigma$ (according to the
number of modes of vibration) and with its frequency dependent upon the
wave vector. This spectrum is complicated by the vibrations of atoms
inside molecules, similar to case (b) in Fig. 49. Since all the
possible vibrations are excited by temperature increases, the
dependence of specific heat upon temperature is of very complex form in
molecular crystals.

b) The temperature is considerably less than $h\omega_{\max}$. Then the
factor

$$\left(e^{h\omega_{\max}/\theta} - 1\right)^{-1}$$

is so small that integration can be taken to infinity without any
essential error, because only the small frequencies, for which the
quantum is of the order $\theta$, contribute noticeably, i.e.,
$h\omega \sim \theta$. For large frequencies, the Planck factor
$\left(e^{h\omega/\theta} - 1\right)^{-1}$ cancels the contributions of
the corresponding vibrations.

However, in the case of small frequencies the lattice vibrations pass
into the vibrations of a continuous medium, for which vibrations the
frequency is related to the wave vector by the simple formula

$$\omega = u_\sigma k. \tag{42.28}$$

The propagation velocity of such waves depends upon the direction
of propagation and upon polarization, but does not depend upon the
absolute value of $k$. The remaining types of vibration, whose frequency
does not become zero for small $k$, are not excited at low temperature
since the corresponding quanta are comparable with $h\omega_{\max}$.

It is expedient to transform the volume element $dk_x\,dk_y\,dk_z$ to
spherical coordinates, i.e., to replace it by the expression
$k^2\,dk\,d\Omega$, where $d\Omega$ is an element of solid angle for the
directions of $\mathbf{k}$. Here, in accordance with what has just been
said, the integration with respect to $k$ must be taken to infinity.

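Thus, we obtain a formula for the total crystal energy at low
temperature:

$$E = \frac{Vh}{(2\pi)^3}\sum_{\sigma=1}^{3}\int d\Omega\int_0^\infty\frac{u_\sigma k^3\,dk}{e^{hu_\sigma k/\theta} - 1}. \tag{42.29}$$

The inner integral is taken in the same way as in the formula for the
energy of an electromagnetic field (42.6), so that

$$E = \frac{\pi V\theta^4}{120\,h^3}\sum_{\sigma=1}^{3}\int\frac{d\Omega}{u_\sigma^3}. \tag{42.30}$$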


Thus the energy of a crystal lattice is proportional to the fourth power
of the absolute temperature, while the specific heat is proportional to
the third power. This refers to temperatures considerably smaller
than $h\omega_{\max}$.

The Debye interpolation formula. P. Debye, the author of the theory
of crystal specific heats at low temperatures which is set out here,
proposed an interpolation formula for intermediate temperatures
when the results (42.30) and (42.27) do not hold. The Debye formula
reduces to both these formulae in the limiting cases of high and low
temperatures. The intermediate interval is described qualitatively,
but in certain agreement with experiment. In order to obtain the
Debye formula, we suppose that the law

$$\omega = \bar{u}k$$

holds for all $k$, where $\bar{u}$ is the usual propagation velocity of
elastic waves. We may even take $3/\bar{u}^3 = 2/u_t^3 + 1/u_l^3$, where
$u_t$ and $u_l$ are the velocities of transverse and longitudinal waves
in a given substance in the polycrystalline state, which velocities are
independent of the direction of propagation of the wave. We define the
upper frequency limit $\omega_{\max}$ from the condition that the total
number of vibrations is equal to $3N$. For this we must go over to
spherical coordinates in (42.26):

$$\frac{V}{2\pi^2}\,\frac{\omega_{\max}^3}{\bar{u}^3} = 3N, \tag{42.31}$$

or, changing to $u_l$, $u_t$, we have

$$\omega_{\max} = \left[\frac{18\pi^2N}{V\left(\dfrac{2}{u_t^3} + \dfrac{1}{u_l^3}\right)}\right]^{1/3}. \tag{42.32}$$

Condition (42.31) is selected so that at high temperatures the correct
law $E = 3N\theta$ is automatically obtained. At medium temperatures
$k_{\max} = \omega_{\max}/\bar{u}$ is substituted as the upper limit in
the integral (42.29) in place of $\infty$, so that the energy expression
has the form

$$E = \frac{3Vh\bar{u}}{2\pi^2}\int_0^{k_{\max}}\frac{k^3\,dk}{e^{h\bar{u}k/\theta} - 1}. \tag{42.33}$$

Changing to the integration variable $x = h\omega/\theta$ and denoting
$h\omega_{\max} = \theta_D$, we can rewrite the lattice energy thus:

$$E = \frac{V}{2\pi^2}\left(\frac{2}{u_t^3} + \frac{1}{u_l^3}\right)\frac{\theta^4}{h^3}\int_0^{\theta_D/\theta}\frac{x^3\,dx}{e^x - 1}. \tag{42.34}$$

At low temperature $\theta_D \gg \theta$, so that the upper limit in the
integral is replaced by infinity. Then the integral is equal to
$\pi^4/15$, and for the energy we have

$$E = \frac{\pi^2V\theta^4}{30\,h^3}\left(\frac{2}{u_t^3} + \frac{1}{u_l^3}\right). \tag{42.35}$$

The exact formula (42.30) assumes the same form if we replace
$u_\sigma$ in it by $u_t$ and $u_l$, which are independent of direction.
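
The two limits of the interpolation formula can be checked numerically.
Eliminating the velocities from (42.34) with the help of (42.32) gives
$E = 9N\theta^4\theta_D^{-3}\int_0^{\theta_D/\theta}x^3\,dx/(e^x - 1)$;
the sketch below evaluates this in the dimensionless form
$E/(N\theta_D)$ as a function of $t = \theta/\theta_D$. Simpson's rule is
used only as a convenient quadrature and is not part of the text; the
function names are illustrative.

```python
import math


def debye_integral(y, steps=2000):
    """Integral of x^3 / (e^x - 1) from 0 to y by Simpson's rule (steps even)."""
    def integrand(x):
        return x ** 3 / math.expm1(x) if x > 0.0 else 0.0

    h = y / steps
    total = integrand(0.0) + integrand(y)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0


def debye_energy(t):
    """Lattice energy per atom in units of theta_D, for t = theta / theta_D."""
    return 9.0 * t ** 4 * debye_integral(1.0 / t)


if __name__ == "__main__":
    for t in (0.02, 0.2, 1.0, 50.0):
        print(f"t = {t:6.2f}   E/(N theta_D) = {debye_energy(t):.6g}")
```

At large $t$ the result approaches $3t$ (the law $E = 3N\theta$), and at
small $t$ it approaches $(3\pi^4/5)\,t^4$, in line with the two limiting
cases discussed above.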

We shall now show how to determine $\theta_D$ from experimental data on
specific heat and, independently, from elastic constants. The following
values of specific heat $C$ are known for tungsten (from the data of
F. F. Lange): $T = 26.2°$K, $C = 0.21$ cal/mol·deg; $T = 38.0°$K,
$C = 0.75$ cal/mol·deg. The cube of the temperature ratio is equal to
3.37, and the ratio of specific heats is 3.68. We may assume that in
the given temperature range the $T^3$ law for specific heat holds.
Substituting $\theta_D = h\omega_{\max}$ in formula (42.35), we determine
$\omega_{\max}$ with the aid of (42.32). This gives

$$C = \frac{\partial E}{\partial\theta} = \frac{12\pi^4}{5}\,N\left(\frac{\theta}{\theta_D}\right)^3.$$

Converting this to heat units, we write

$$C = \frac{12\pi^4}{5}\,R\left(\frac{T}{T_D}\right)^3.$$

Here $R = 1.96$ cal/mol·deg. Substituting the specific heat at the lowest
temperature, we find $T_D = 340°$.
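
As a check of the arithmetic, the $T^3$ law can be inverted for $T_D$.
The sketch assumes the low-temperature form
$C = (12\pi^4/5)\,R\,(T/T_D)^3$ and uses the lowest-temperature data
point and the value of $R$ quoted in the text; the function name is
introduced here for illustration.

```python
import math


def debye_temperature_from_heat(temperature, specific_heat, gas_constant=1.96):
    """Invert C = (12 pi^4 / 5) R (T / T_D)^3 for T_D.

    temperature in deg K, specific_heat and gas_constant in cal/mol*deg.
    """
    ratio = 12.0 * math.pi ** 4 * gas_constant / (5.0 * specific_heat)
    return temperature * ratio ** (1.0 / 3.0)


if __name__ == "__main__":
    print(f"T_D = {debye_temperature_from_heat(26.2, 0.21):.0f} deg")
```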

We now determine $T_D$ by proceeding from the elastic constants for
tungsten. We have to give, without derivation, the formulae which
connect $u_t$ and $u_l$ with the shear modulus and the bulk modulus for
tungsten (see L. D. Landau and E. M. Lifshits, The Mechanics of
Continuous Media, Gostekhizdat, 1953, p. 744 or A. Love, A Treatise
on the Mathematical Theory of Elasticity, Ch. XIII, Cambridge, 1927):

$$u_l = \sqrt{\frac{K + \frac{4}{3}G}{\rho}}, \qquad u_t = \sqrt{\frac{G}{\rho}}.$$

Here, $K$ is the bulk modulus, which, for tungsten, is about
$3.14 \times 10^{12}$ dyne/cm² at low temperature. $G$ is the shear
modulus, equal to $1.35 \times 10^{12}$ dyne/cm². The density of tungsten
is $\rho = 19.3$ gm/cm³. Hence, $u_l = 5 \times 10^5$ cm/sec,
$u_t = 2.64 \times 10^5$ cm/sec. For tungsten the ratio $N/V$ is equal to
$0.635 \times 10^{23}$. Whence, if we calculate it from (42.32),
$\omega_{\max}$ is equal to $4.61 \times 10^{13}$ sec⁻¹ and

$$T_D = \frac{4.61 \times 10^{13} \times 1.05 \times 10^{-27}}{1.38 \times 10^{-16}} = 352°.$$
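
The chain of substitutions from the elastic constants to $T_D$ can be
retraced in a few lines. The sketch assumes the isotropic relations
$u_l = \sqrt{(K + 4G/3)/\rho}$ and $u_t = \sqrt{G/\rho}$ together with
(42.32), and uses the tungsten data quoted above; the constants
`H_BAR` and `K_B` are standard CGS values, and the function name is an
illustrative choice.

```python
import math

H_BAR = 1.054e-27  # erg * sec (written h in the text)
K_B = 1.38e-16     # erg / deg


def debye_from_elastic(bulk, shear, density, atoms_per_cm3):
    """Return (u_l, u_t, omega_max, T_D) in CGS units from elastic constants."""
    u_l = math.sqrt((bulk + 4.0 * shear / 3.0) / density)
    u_t = math.sqrt(shear / density)
    inv_cubes = 2.0 / u_t ** 3 + 1.0 / u_l ** 3
    omega_max = (18.0 * math.pi ** 2 * atoms_per_cm3 / inv_cubes) ** (1.0 / 3.0)
    return u_l, u_t, omega_max, H_BAR * omega_max / K_B


if __name__ == "__main__":
    u_l, u_t, omega_max, t_d = debye_from_elastic(3.14e12, 1.35e12, 19.3, 0.635e23)
    print(f"u_l = {u_l:.3g} cm/sec, u_t = {u_t:.3g} cm/sec")
    print(f"omega_max = {omega_max:.3g} 1/sec, T_D = {t_d:.0f} deg")
```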




The agreement with what was obtained from specific heat turns
out to be even better than could have been expected, because the
elastic constants do not strictly refer to the temperature at which
the specific heat was determined, and also because tungsten is a
crystalline substance and its elastic properties are characterized by
three moduli of elasticity instead of two (see Landau and Lifshits,
loc. cit., p. 675). For a number of substances we have the following
values of Debye temperature $T_D$: Pb 88°, Na 172°, Cu 315°,
Fe 453°, Be 1,000°, diamond 1,860° (all from absolute zero).

At high temperature, $\theta \gg \theta_D$, we must put
$e^x - 1 \approx x$, so that

$$E = \frac{V}{6\pi^2}\left(\frac{2}{u_t^3} + \frac{1}{u_l^3}\right)\frac{\theta\,\theta_D^3}{h^3} = 3N\theta, \tag{42.36}$$

which is what we demanded.

For $\theta \sim \theta_D$, formula (42.34) agrees with experiment
qualitatively. We note that we must not expect complete agreement,
because the initial assumptions made in deriving this formula are not
quantitative in character. It is not worth the attempt to make formula
(42.34) more accurate, without taking into account the exact form of
the dependence of $\omega$ upon $\mathbf{k}$. The attempts at correcting
this formula, which are sometimes made, are simply in the nature of
adjustments.


Exercises 


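1) Write down the formula for the wavelength distribution of black-body
radiation energy.

Proceeding from the fact that $\omega = 2\pi c/\lambda$, we have

$$dE(\lambda) = \frac{16\pi^2hcV\,d\lambda}{\lambda^5\left(e^{2\pi hc/\lambda\theta} - 1\right)}.$$

The maximum is defined by the equation

$$\frac{2\pi hc}{\lambda\theta} = 4.965.$$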


2) Show that if Bose particles interact with a Boltzmann gas the
probability of a particle appearing in a certain state is proportional
to $n + 1$, where $n$ is the number of particles already in that state,
and the probability of a particle disappearing is proportional to $n$.

Let the energy of a Boltzmann particle be $\varepsilon$ and that of a
Bose particle, $\eta$. Let us consider the process in which there occurs
the transition

$$\varepsilon + \eta \to \varepsilon' + \eta',$$

i.e., the interaction of these particles changes their initial state
with energies $\varepsilon$, $\eta$ to a state with energies
$\varepsilon'$ and $\eta'$. In statistical equilibrium we must observe
the balance

$$W_{\varepsilon\varepsilon'}\,N_\varepsilon\,n_\eta\,(1 + n_{\eta'}) = W_{\varepsilon'\varepsilon}\,N_{\varepsilon'}\,n_{\eta'}\,(1 + n_\eta),$$

where $W_{\varepsilon\varepsilon'}$ is the probability of direct
transition and $W_{\varepsilon'\varepsilon}$ is the reverse-transition
probability. Putting

$$N_\varepsilon = g_\varepsilon\,e^{(\mu - \varepsilon)/\theta}, \qquad N_{\varepsilon'} = g_{\varepsilon'}\,e^{(\mu - \varepsilon')/\theta},$$

$$n_\eta = \left(e^{(\eta - \mu_1)/\theta} - 1\right)^{-1}, \qquad n_{\eta'} = \left(e^{(\eta' - \mu_1)/\theta} - 1\right)^{-1},$$

we see that the balance equation is satisfied if
$W_{\varepsilon\varepsilon'} = W_{\varepsilon'\varepsilon}$. For
simplicity we have put $g_\varepsilon = g_{\varepsilon'}$. The presence
of spontaneous emission is due to the Bose distribution.

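3) Find the total number of quanta in black-body radiation at a given
temperature.

$$N = \frac{V}{\pi^2c^3}\int_0^\infty\frac{\omega^2\,d\omega}{e^{h\omega/\theta} - 1} = \frac{V\theta^3}{\pi^2h^3c^3}\int_0^\infty\frac{x^2\,dx}{e^x - 1}.$$

Further (see Appendix),

$$\int_0^\infty\frac{x^2\,dx}{e^x - 1} = \int_0^\infty\sum_{n=1}^\infty e^{-nx}x^2\,dx = \sum_{n=1}^\infty\int_0^\infty e^{-nx}x^2\,dx = 2\sum_{n=1}^\infty\frac{1}{n^3}.$$

The sum is approximately 1.2, so that

$$N = \frac{2.4}{\pi^2}\,\frac{V\theta^3}{h^3c^3}.$$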
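
The value of the sum is easy to confirm by direct summation. The sketch
below adds up a finite number of terms of $\sum 1/n^3$ and forms the
numerical coefficient $2\sum n^{-3}/\pi^2$ that multiplies
$V\theta^3/(h^3c^3)$; the function name and the number of terms are
arbitrary choices made for illustration.

```python
import math


def inverse_cube_sum(terms=10000):
    """Partial sum of 1/n^3 for n = 1..terms (the tail is below 1/(2*terms^2))."""
    return sum(1.0 / n ** 3 for n in range(1, terms + 1))


def quanta_coefficient(terms=10000):
    """Coefficient of V*theta^3/(h^3 c^3) in the total number of quanta."""
    return 2.0 * inverse_cube_sum(terms) / math.pi ** 2


if __name__ == "__main__":
    print(f"sum = {inverse_cube_sum():.6f}, coefficient = {quanta_coefficient():.4f}")
```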


4) The atoms are situated in the form of a linear chain. We shall denote
the displacement of the $n$th atom by $a_n$. The force acting between
the $n$th and $(n+1)$th atoms is equal to $\alpha\,(a_{n+1} - a_n)$. Find
the equations for the vibrations of the chain. Ignore the interaction
between the more distant neighbours.

The vibration equation for the $n$th atom is

$$m\ddot{a}_n = \alpha\,(a_{n+1} + a_{n-1} - 2a_n).$$

We look for $a_n$ in the form

$$a_n = b(t)\,e^{ifn}.$$

Substituting this in the initial equation, we find, after cancelling
$e^{ifn}$,

$$m\ddot{b}(t) = \alpha\,b(t)\left(e^{if} + e^{-if} - 2\right) = 2\alpha\,b(t)\,(\cos f - 1) = -4\alpha\sin^2\frac{f}{2}\cdot b(t),$$

so that the oscillation frequency for a given value of $f$ is

$$\omega = 2\sqrt{\frac{\alpha}{m}}\,\sin\frac{f}{2}.$$

If the distance between the atoms is $d$ then $n = x/d$, where $x$ is
the equilibrium position of the $n$th atom. Putting $f/d = k$, we have
$e^{ifn} = e^{ikx}$, so that $f$ can be called the wave vector,
considering that the length is measured in units of $d$. For small $f$,
as was asserted, the frequency is proportional to $f$:

$$\omega \approx \sqrt{\frac{\alpha}{m}}\,f.$$
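
The dispersion law of the chain can be verified by substituting a
travelling wave back into the equation of motion. The sketch below
assumes the real solution $a_n(t) = \cos(fn - \omega t)$ with
$\omega = 2\sqrt{\alpha/m}\,|\sin(f/2)|$ and evaluates the residual
$m\ddot{a}_n - \alpha(a_{n+1} + a_{n-1} - 2a_n)$, which should vanish.
The function names and default parameter values are illustrative.

```python
import math


def chain_frequency(f, alpha=1.0, m=1.0):
    """Oscillation frequency of the monatomic chain for wave number f (units of 1/d)."""
    return 2.0 * math.sqrt(alpha / m) * abs(math.sin(f / 2.0))


def equation_residual(f, n, t, alpha=1.0, m=1.0):
    """m * a_n'' - alpha * (a_{n+1} + a_{n-1} - 2 a_n) for a_n = cos(f*n - omega*t)."""
    omega = chain_frequency(f, alpha, m)

    def displacement(j):
        return math.cos(f * j - omega * t)

    acceleration = -omega ** 2 * displacement(n)
    force = alpha * (displacement(n + 1) + displacement(n - 1) - 2.0 * displacement(n))
    return m * acceleration - force


if __name__ == "__main__":
    print(equation_residual(0.7, 3, 1.3, alpha=2.0, m=0.5))
    print(chain_frequency(0.01) / 0.01)  # close to sqrt(alpha/m) for small f
```

The frequency reaches its maximum $2\sqrt{\alpha/m}$ at $f = \pi$, which
is the one-dimensional analogue of the limiting frequency
$\omega_{\max}$ of the lattice.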




Sec. 43. Bose Distribution 


The choice of sign of $\mu$. The Bose distribution has very peculiar
properties at low temperatures. We shall suppose that the atoms
do not have spin; such, for example, are helium atoms with atomic
weight 4. Both the electrons in the cloud of the helium atom and the
protons and neutrons in the helium nucleus are in the $1s$-state. They
all go in pairs and by the Pauli principle the spins are antiparallel.
Therefore, the resultant spin is zero.

From (39.30), the weight of the state of a spinless particle is

$$dg(\varepsilon) = \frac{Vm^{3/2}\sqrt{\varepsilon}\,d\varepsilon}{\sqrt{2}\,\pi^2h^3}. \tag{43.1}$$

The normalization condition (39.23) looks like

$$N = \frac{Vm^{3/2}}{\sqrt{2}\,\pi^2h^3}\int_0^\infty\frac{\sqrt{\varepsilon}\,d\varepsilon}{e^{(\varepsilon - \mu)/\theta} - 1}. \tag{43.2}$$


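This condition can be satisfied only for negative $\mu$. Indeed, if we
suppose that $\mu$ is greater than zero, then the denominator of the
integrand will be negative for $\varepsilon < \mu$ because then
$e^{(\varepsilon - \mu)/\theta} < 1$. But this is impossible because the
distribution function is, by its very meaning, a positive quantity.

Hence, $\mu < 0$. At high temperatures the Bose distribution passes
into the Boltzmann distribution in accord with (40.6).

The sign of $\partial\mu/\partial\theta$. As the temperature diminishes,
$\mu$ decreases in absolute value. This can be shown generally with the
aid of (43.2). Differentiating this equation as an implicit function we
have

$$\frac{\partial\mu}{\partial\theta} = -\frac{1}{\theta}\,\frac{\displaystyle\int_0^\infty\frac{(\varepsilon - \mu)\,e^{(\varepsilon - \mu)/\theta}\sqrt{\varepsilon}\,d\varepsilon}{\left(e^{(\varepsilon - \mu)/\theta} - 1\right)^2}}{\displaystyle\int_0^\infty\frac{e^{(\varepsilon - \mu)/\theta}\sqrt{\varepsilon}\,d\varepsilon}{\left(e^{(\varepsilon - \mu)/\theta} - 1\right)^2}}. \tag{43.3}$$

The integrands in (43.3) are essentially positive quantities
[$(\varepsilon - \mu) > 0$, because $\mu < 0$], and therefore
$\partial\mu/\partial\theta < 0$. Hence, as $\theta$ decreases, the
absolute value $|\mu|$ diminishes monotonically since $\mu$ must
increase.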




We shall now show that $\mu$ becomes zero at a temperature other
than zero. To do this we put $\mu = 0$ in (43.2) and find the
corresponding value $\theta = \theta_0$:

$$N = \frac{Vm^{3/2}}{\sqrt{2}\,\pi^2h^3}\int_0^\infty\frac{\sqrt{\varepsilon}\,d\varepsilon}{e^{\varepsilon/\theta_0} - 1} = \frac{Vm^{3/2}\theta_0^{3/2}}{\sqrt{2}\,\pi^2h^3}\int_0^\infty\frac{\sqrt{x}\,dx}{e^x - 1}. \tag{43.4}$$

The integral is simply a pure number: it is equal to 2.31 (see
Appendix). Therefore equation (43.4) is satisfied by a value of
$\theta_0$ that is different from zero.

Bose condensation. What will happen when the temperature is
reduced further? $\mu$ cannot go from negative to positive values since,
as we have shown at the beginning of the section, this would lead
to negative probability values. $\mu$ cannot become negative once again,
because $\partial\mu/\partial\theta$ is always less than zero so that
$\mu$ varies only monotonically, if it is at all capable of varying.
Therefore, the only possibility is for $\mu$ to remain equal to zero
after it has once attained its zero value. But then equation (43.2) is
no longer satisfied if the temperature is less than $\theta_0$ and $N$
does not change. On the contrary, it can be seen from (43.4) that if we
define the number of particles as

$$N' = \frac{Vm^{3/2}}{\sqrt{2}\,\pi^2h^3}\int_0^\infty\frac{\sqrt{\varepsilon}\,d\varepsilon}{e^{\varepsilon/\theta} - 1} = \frac{2.31\,Vm^{3/2}\theta^{3/2}}{2^{1/2}\pi^2h^3} \tag{43.5}$$

for $\theta < \theta_0$, it decreases with the temperature in
proportion to $\theta^{3/2}$. What happens to the remaining particles
which number $N - N'$? As opposed to light quanta these particles
cannot be absorbed. Therefore, they will pass into a state which is not
taken into account in the normalizing integral (43.2). The only state
of this kind possesses an energy equal to zero: due to the factor
$\sqrt{\varepsilon}$ it does not contribute anything to the integral
(43.4). In normalization we can isolate the particles occurring in the
zero state in a separate term. If a finite number of particles go to
the zero-energy state, they will naturally fall out of the integral.
$N'$ particles remain continuously distributed, but with the value
$\mu = 0$. Thus, at a temperature $\theta < \theta_0$, the whole
distribution consists of an infinitely narrow "peak" at
$\varepsilon = 0$ and of particles distributed according to a
$\left(e^{\varepsilon/\theta} - 1\right)^{-1}$ law. At absolute zero
all the particles are in a zero state: this state of a Bose gas is
obviously defined uniquely. It will be noted that a Boltzmann gas would
behave in an entirely different way when the temperature tended to
zero.
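
The division of particles between the zero-energy state and the
continuous distribution follows directly from the $\theta^{3/2}$ law.
The sketch below assumes $N'/N = (\theta/\theta_0)^{3/2}$ below
$\theta_0$ (the ratio of the expression for $N'$ to (43.4)) and no
particles in the zero state above $\theta_0$; the function name is
introduced here for illustration.

```python
def condensate_fraction(theta, theta0):
    """Fraction (N - N') / N of particles in the zero-energy state of an ideal Bose gas."""
    if theta0 <= 0.0:
        raise ValueError("theta0 must be positive")
    if theta >= theta0:
        return 0.0
    return 1.0 - (theta / theta0) ** 1.5


if __name__ == "__main__":
    for ratio in (0.0, 0.25, 0.5, 0.75, 1.0, 1.5):
        print(f"theta/theta0 = {ratio:4.2f}   fraction = {condensate_fraction(ratio, 1.0):.4f}")
```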




Liquid helium. Helium with atomic weight 4 obeys Bose statistics
since the spin of its nuclei and of the electronic shells is equal to zero.
It is therefore interesting to see whether anything like this “Bose 
condensation” is observed in helium. 

It is difficult to give a unique answer because at low temperature 
helium is a liquid, and the Bose distribution, which relates to an ideal 
gas, does not apply. Nevertheless the qualitative aspect of the result 
obtained for a gas may still hold. Namely, it may be supposed that 
at a certain temperature part of the gas will pass into a zero energy 
state and, accordingly, will not contribute to the specific heat. 

Liquid helium does, in fact, experience a peculiar change of state
at a temperature of 2.19° K (at atmospheric pressure). Speaking of
a monatomic liquid, which is what liquid helium is, it is difficult
to imagine any change of state related to a rearrangement of the
atoms in space. Therefore, it is interesting to compare the actual
temperature of transition in liquid helium with the temperature at
which Bose condensation would occur in gaseous helium of the same
density.

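The density of liquid helium is equal to 0.12 gm/cm³. Whence
the ratio $\dfrac{N}{V} = \dfrac{0.12}{4} \times 6 \times 10^{23} = 0.18 \times 10^{23}$.
Consequently, according to (43.4) the temperature $\theta_0$ is

$$\theta_0 = \left(\frac{0.18 \cdot 10^{23} \cdot 9.86 \cdot 1.41 \cdot 1.18 \cdot 10^{-81}}{2.31 \cdot 17.1 \cdot 10^{-36}}\right)^{2/3} = 3.86 \cdot 10^{-16};$$

$$T_0 = \frac{3.86 \cdot 10^{-16}}{1.38 \cdot 10^{-16}} = 2.8°,$$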
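
The estimate can be reproduced independently. The sketch below solves
(43.4) for $\theta_0$, evaluating the integral as
$\Gamma(3/2)\,\zeta(3/2)$ (a standard identity, not derived in the
text) with the zeta function summed directly and corrected by an
Euler-Maclaurin tail. The constants are standard CGS values and all
function names are illustrative assumptions.

```python
import math

H_BAR = 1.054e-27    # erg * sec (written h in the text)
K_B = 1.381e-16      # erg / deg
AVOGADRO = 6.022e23  # atoms per mole


def zeta(s, terms=1000):
    """Riemann zeta function for s > 1: partial sum plus Euler-Maclaurin tail."""
    head = sum(k ** -s for k in range(1, terms))
    tail = terms ** (1.0 - s) / (s - 1.0) + 0.5 * terms ** -s + s * terms ** (-s - 1.0) / 12.0
    return head + tail


def bose_integral():
    """Integral of sqrt(x) / (e^x - 1) from 0 to infinity."""
    return math.gamma(1.5) * zeta(1.5)


def condensation_temperature(density, atomic_weight):
    """Temperature in degrees at which mu reaches zero, from (43.4).

    density in gm/cm^3; the gas is treated as ideal with spinless atoms.
    """
    number_density = density / atomic_weight * AVOGADRO
    mass = atomic_weight / AVOGADRO
    factor = math.sqrt(2.0) * math.pi ** 2 * number_density / bose_integral()
    theta0 = H_BAR ** 2 / mass * factor ** (2.0 / 3.0)
    return theta0 / K_B


if __name__ == "__main__":
    print(f"integral = {bose_integral():.3f}")
    print(f"T_0 = {condensation_temperature(0.12, 4.0):.2f} deg")
```

Since $\theta_0$ varies as the 2/3 power of the density, the result is
not very sensitive to the exact density assumed for the liquid.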


which is close to the transition temperature. At the transition, the 
specific heat of helium experiences a discontinuity. In the case of 
a Bose gas, only the derivative of the specific heat with respect to 
temperature has a discontinuity. 

Superfluidity. P. L. Kapitsa discovered that below the temperature
of phase transition, liquid helium possesses a most remarkable
property: it is capable of passing through the finest slit without
exhibiting any signs of viscosity. This property was called
superfluidity.

L. D. Landau developed a theory of superfluidity proceeding from 
the supposed quantum-level spectrum for a liquid. On the basis 
of this theory he built the hydrodynamics of a superfluid, which 
differs from conventional hydrodynamics in that each point possesses 
two velocities instead of one: a normal and a superfluid component. 
The occurrence of two velocities means that in a superfluid two types 
of sound vibrations may be propagated: ordinary sound, in which 
pressure and density oscillate, and “second sound,” which is connected 
with the relative motion of the normal and superfluid components. 
The second sound was demonstrated in an experiment carried out 
by V. P. Peshkov using a method proposed by E. M. Lifshits. The
experimentally found velocity of second sound (which is small compared
with the velocity of conventional sound) is in excellent agreement
with Landau’s theory.

The question of the relationship between superfluidity and Bose
condensation cannot be considered fully resolved. It may be suggested
that the superfluid component corresponds to that part of the helium
which has passed to the zero state. This hypothesis is strongly supported
by the fact that the liquid isotope of helium with atomic
weight 3 is not superfluid: the nuclear spin of helium 3 is equal to $\frac{1}{2}$,
so that its atoms are subject to the statistics of Fermi and not Bose.
Accordingly, they cannot all pass into the zero state together: the
Pauli principle does not permit this.

N. N. Bogolyubov showed that a gas which is close to an ideal
gas and consists of Bose particles possesses an energy spectrum which,
according to Landau’s theory, a superfluid liquid should have. However,
no one has so far succeeded in proving theoretically that it is
precisely liquid helium below the transition point that should possess
such a spectrum.


Exercise

Calculate the energy and pressure of a Bose gas below the transition point.
For the energy we have

$$E = \frac{V m^{3/2}}{2^{1/2}\pi^2 h^3} \int_0^\infty \frac{\varepsilon^{3/2}\, d\varepsilon}{e^{\varepsilon/\theta} - 1} = 0.128\, \frac{V m^{3/2}\theta^{5/2}}{h^3}$$

(see Appendix). The pressure is determined from the general relationship
(40.22):

$$p = \frac{2}{3}\frac{E}{V} = 0.085\, \frac{m^{3/2}\theta^{5/2}}{h^3}.$$

Thus, the pressure of a Bose gas below the transition point is independent of
volume and depends only upon the temperature. If we compress such a Bose gas
its particles will go to the zero-energy state. Conversely, upon expansion the
particles will come out of the zero-energy state until there are none left. If
expansion continues the pressure will begin to decrease.


Sec. 44. Fermi Distribution 

The form of the Fermi-distribution curve and its interpretation.
The criterion for the transition from quantum statistics to classical
statistics is given by the inequality (40.7). If the inequality is
reversed, then essentially quantum properties of the statistical
distribution appear. In this section we shall consider the properties of
the Fermi distribution when the inverse inequality

$$\frac{N}{V} \gg g\left(\frac{m\theta}{2\pi h^2}\right)^{3/2} \qquad (44.1)$$

or an equivalent inequality

$$e^{\mu/\theta} \gg 1 \qquad (44.2)$$

is satisfied.

From (30.26) and (39.30), the Fermi-distribution curve is of the
following form:

$$dn(\varepsilon) = \frac{V(2m^3)^{1/2}\varepsilon^{1/2}\, d\varepsilon}{\pi^2 h^3} \cdot \frac{1}{e^{(\varepsilon-\mu)/\theta} + 1}. \qquad (44.3)$$

Here, a weight factor 2 is introduced, since we have put $j = \frac{1}{2}$. The
first factor in (44.3) represents the total number of states between
$\varepsilon$ and $\varepsilon + d\varepsilon$, while the second factor represents the probability that
these states are occupied. We can interpret the function

$$f(\varepsilon) = \frac{1}{e^{(\varepsilon-\mu)/\theta} + 1} \qquad (44.4)$$

as a probability and as the mean number of particles, because $f(\varepsilon)$
is contained between zero and unity. A similar function in the Bose
distribution could only denote the mean number of particles in one
of the quantum states with a given energy, because the Bose-distribution
function $\left(e^{(\varepsilon-\mu)/\theta} - 1\right)^{-1}$ is sometimes even greater than unity
and must not be interpreted as a probability.

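Let us see how the curve $f(\varepsilon)$ behaves when $e^{\mu/\theta} \gg 1$. When $\varepsilon = 0$
we obtain

$$f(0) = \frac{1}{e^{-\mu/\theta} + 1} \approx 1,$$

because $e^{-\mu/\theta}$ is a small number. The quantity $e^{(\varepsilon-\mu)/\theta}$ is also a small
number as long as $\varepsilon$ remains smaller than $\mu$, while $f(\varepsilon)$ is close to
unity, like $f(0)$. Only when $\varepsilon - \mu$ is comparable with $\theta$ is $e^{(\varepsilon-\mu)/\theta}$ of
the order of unity, so that $f(\varepsilon)$ begins to decrease noticeably with
further increase of $\varepsilon$. For $\varepsilon = \mu$, $f(\mu)$ decreases to $\frac{1}{2}$:

$$f(\mu) = \frac{1}{e^{0} + 1} = \frac{1}{2}.$$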




For still greater values of $\varepsilon$, $f(\varepsilon)$ decreases exponentially because
unity can then be neglected in the denominator, and, for $\varepsilon > \mu$, $f(\varepsilon)$
becomes the Boltzmann distribution

$$f(\varepsilon) \approx e^{(\mu-\varepsilon)/\theta}.$$

The Bose distribution also has the same limiting form. The curve
$f(\varepsilon)$ is roughly shown in Fig. 50. The region of $\varepsilon$ where $f(\varepsilon)$ changes
from unity to zero has a width of the order $\theta$, since $e^{(\varepsilon-\mu)/\theta}$ is comparable
with unity only if $|\varepsilon - \mu| \sim \theta$; for smaller $\varepsilon$ the exponential is
considerably smaller than unity, while for larger $\varepsilon$ the exponential is
considerably greater than unity.
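
The behaviour of $f(\varepsilon)$ just described can be checked with a few numbers. The sketch below evaluates the function (44.4) at $\varepsilon = 0$, at $\varepsilon = \mu$ and well above $\mu$; the values $\mu = 1$, $\theta = 0.02$ (in arbitrary energy units) and the function name are illustrative assumptions.

```python
import math


def fermi(eps, mu, theta):
    """Occupation probability f = 1 / (exp((eps - mu)/theta) + 1), as in (44.4)."""
    x = (eps - mu) / theta
    if x > 700.0:          # exp() would overflow; f is zero to machine precision
        return 0.0
    return 1.0 / (math.exp(x) + 1.0)


mu, theta = 1.0, 0.02      # illustrative values with exp(mu/theta) >> 1

f_bottom = fermi(0.0, mu, theta)            # deep below mu: close to unity
f_edge = fermi(mu, mu, theta)               # exactly one half at eps = mu
f_tail = fermi(mu + 10 * theta, mu, theta)  # far above mu
boltzmann_tail = math.exp(-10.0)            # limiting form exp((mu - eps)/theta)

for label, value in [("f(0)", f_bottom), ("f(mu)", f_edge),
                     ("f(mu + 10 theta)", f_tail),
                     ("Boltzmann form", boltzmann_tail)]:
    print(f"{label:>18s} = {value:.6g}")
```

The last two printed numbers should nearly coincide, which illustrates the passage to the Boltzmann form once unity is negligible in the denominator.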

Fermi distribution at absolute zero. We shall call the region of
transition of $f$ from unity to zero the spread region of the Fermi
distribution. As the temperature decreases the spread region narrows
and, at absolute zero, becomes a sharp discontinuity of $f$, so that the
distribution function takes the form of a right angle. Fig. 50 shows this
step by a broken line. The value of $\mu$ at absolute zero is called $\mu_0$.
Hence, at $\theta = 0$, all states with energy less than $\mu_0$ are occupied with
unity probability (i.e., with certainty), while those with energy greater
than $\mu_0$ are empty, also with certainty.

Fig. 50

This result can likewise be obtained directly from Pauli’s principle
without resorting to statistics. From (39.32), a definite interval of
momentum-component values $\Delta p_x$, $\Delta p_y$, $\Delta p_z$ corresponds to one state
of particle motion. If the particle is contained in a box with sides
$a_1$, $a_2$, $a_3$, then it follows from the uncertainty relation (23.4) that

$$\Delta p_x = \frac{2\pi h}{a_1}, \qquad \Delta p_y = \frac{2\pi h}{a_2}, \qquad \Delta p_z = \frac{2\pi h}{a_3},$$

since these quantities show by how much the momentum components
of two particles must differ in order that the particles may be regarded
as occurring in different states of motion.

This follows not only from the uncertainty relation, but can also
be seen strictly when computing the states leading to formulae
(26.23) and (39.32). Here, each state must be identified not with
the volume of the parallelepiped, but with one of its vertices whose
coordinates are given by the three integers $n_1$, $n_2$, $n_3$. The coefficient $2\pi$
in the uncertainty relations is taken so that both definitions for the
number of states agree.

If we plot $p_x$, $p_y$, $p_z$ on coordinate axes, then to each state of spatial
motion of the electron there correspond three quantum numbers
$n_1$, $n_2$, $n_3$. These quantum numbers specify the number of the
parallelepiped with sides $\Delta p_x$, $\Delta p_y$, $\Delta p_z$. It is shown in Fig. 51. All the
space in which the axes $p_x$, $p_y$, $p_z$ are drawn can be filled with such
boxes. Since three quantum numbers correspond to a single box and, in
addition, the state is also given by the spin, there may be two particles
with spin $\frac{1}{2}$ having momentum projections in the same interval $\Delta p_x$, $\Delta p_y$,
$\Delta p_z$. The spins of these two particles are antiparallel.

Fig. 51

Thus, the space $p_x$, $p_y$, $p_z$ may be divided into boxes or cells with
dimensions

$$\Delta p_x \Delta p_y \Delta p_z = \frac{(2\pi h)^3}{a_1 a_2 a_3} = \frac{(2\pi h)^3}{V}, \qquad (44.5)$$

where there are no more than two particles in each cell.

The closer the cell to the coordinate origin, the less the energy it
possesses, because the energy is equal to $\varepsilon = \frac{1}{2m}\left(p_x^2 + p_y^2 + p_z^2\right)$.
In other words, it is proportional to the square of the distance of
the cell from the origin.

Let us now consider the state of a gas at the absolute zero of temperature.
If the gas consisted of only two particles, then at absolute zero
the states of both particles would fill the cell closest to the origin.
In accordance with the Pauli principle, the next two particles cannot
enter the same cell: they are forced to take up positions further
from the origin. As the number of particles increases, cells are filled
which are situated further and further from the origin; but each
time two particles are added they fall into a free cell closest to the
origin, because, by definition, absolute zero corresponds to the least
possible energy of the gas as a whole.

If there are very many particles, their cells will densely fill a sphere
whose centre is the coordinate origin. All states inside the sphere
are filled with unity probability, while those outside the sphere are
free, also with certainty.

The limiting energy of Fermi distribution. If we denote the energy
corresponding to the boundary of the sphere by $\varepsilon_0$, then it can be
seen from Fig. 50 that $\varepsilon_0 = \mu_0$; $\mu_0$ is the limiting energy of a particle
at absolute zero. It is very easy to calculate $\varepsilon_0$ or $\mu_0$. Since at absolute
zero the function $f(\varepsilon)$ is equal to unity for all $\varepsilon < \mu_0$, the total number
of particles $N$ is, from (44.3),

$$N = \frac{V(2m^3)^{1/2}}{\pi^2 h^3}\int_0^{\varepsilon_0} \varepsilon^{1/2}\, d\varepsilon = \frac{V(2m)^{3/2}\varepsilon_0^{3/2}}{3\pi^2 h^3}, \qquad (44.6)$$




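whence

$$\varepsilon_0 = 3^{2/3}\pi^{4/3}\, \frac{h^2}{2m}\left(\frac{N}{V}\right)^{2/3}. \qquad (44.7)$$

The same can be seen without the aid of $f(\varepsilon)$. Indeed, the radius
of the sphere of greatest energy is

$$p_0 = \sqrt{2m\varepsilon_0}.$$

Its volume is

$$\frac{4}{3}\pi p_0^3 = \frac{4}{3}\pi (2m\varepsilon_0)^{3/2}.$$

But this same quantity is equal to the number of filled elementary
cells (with two particles per cell) multiplied by the volume of a single
cell $\frac{(2\pi h)^3}{V}$. Consequently,

$$\frac{4}{3}\pi (2m\varepsilon_0)^{3/2} = \frac{N}{2}\, \frac{(2\pi h)^3}{V}, \qquad (44.8)$$

whence equation (44.7) is again obtained.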

At absolute zero the state of a Fermi gas as a whole is defined
uniquely: in quantum statistics it is necessary to indicate which
states are occupied by separate particles, but it is impossible to
determine by which particles they are filled. In the given case all the
states inside the sphere with limiting energy $\varepsilon_0$ are filled by particles.

The criterion for the closeness of the Fermi distribution to the
distribution at absolute zero (based on the form of the distribution).
At a temperature close to absolute zero thermal excitation can be
imparted only to those particles whose energy is close to $\varepsilon_0 = \mu_0$.
Indeed, as long as $\theta \ll \varepsilon_0$, a thermal excitation of the order $\theta$ cannot
be imparted to a particle whose cell lies deep beneath the surface
$\varepsilon = \varepsilon_0$, because the states between the surface and the given cell
are occupied, and the energy $\theta$ is insufficient to remove the particle
beyond the limits of the surface boundary. Therefore, only those
particles whose energy differs from $\varepsilon_0$ by an amount of the order
of $\theta$ can take up free places. Deeper states will remain densely filled
as before. Thus, the filling probability will be almost equal to unity
for all energies $\varepsilon < \varepsilon_0$, and will fall to zero in a region of the order
of $\theta$ close to $\varepsilon \approx \varepsilon_0$, as shown in Fig. 50.

The criterion that the curve is close to the step is the inequality

$$\theta \ll \varepsilon_0, \qquad (44.9)$$

and this agrees with (44.1) within the accuracy of the numerical
factor. As we shall soon see, the concept of “closeness” of temperature
to absolute zero, according to the criterion (44.9), differs greatly
from the conventional one.




Electrons conducting electricity in metals are usually considered 
as an ideal gas. The main basis for this is the fact that we as yet 
have no better theoretical model. It has not been possible to consider 
electrostatic interactions between electrons sufficiently fully to obtain 
quantitative results that might compare with experiment. This is 
why the phrase “electron gas” in metals is used. In many cases the 
conclusions from such a model are in good agreement with experiment. 

Without considering the electron theory of metals, we shall take
the electron gas only as an example in which condition (44.9) is
satisfied. Let us suppose that there is one conduction electron for
each atom. This assumption appears to be satisfied for alkali metals,
in which the outer electron is weakly bound and is separated from
the atom in a lattice.

Let us find $\varepsilon_0$ for the electron gas in metallic sodium. The density
of sodium is 0.97 and atomic weight 23. Hence, unit volume contains

$$\frac{0.97}{23} \cdot 6.02 \cdot 10^{23} = 0.25 \cdot 10^{23}$$

atoms and as many conduction electrons. Whence, from (44.7),

$$\varepsilon_0 = 2.1 \cdot 4.6 \cdot \frac{h^2}{2m} \cdot 0.08 \cdot 10^{16} = 4.8 \cdot 10^{-12}$$

[the sequence of the numbers is the same as in (44.7)]. In degrees
$\varepsilon_0$ is 34,800. Hence, at all temperatures for which we can speak of
sodium as a metal, the electron gas in it is close to a Fermi gas at
absolute zero. Similar results are also obtained for non-alkali metals,
though with a less reliable value of electron density.
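
The sodium estimate can be reproduced directly from (44.7). The sketch below is only an illustration: the physical constants are standard CGS values supplied here, the symbol $h$ of the text is taken to be Planck's constant divided by $2\pi$ (as the cell volume $(2\pi h)^3/V$ requires), and the function name is hypothetical.

```python
import math

# Standard CGS constants (assumed; not quoted in the text).
HBAR = 1.0546e-27      # erg*s, the quantity written h in the text
M_E = 9.1094e-28       # electron mass, g
K_B = 1.3807e-16       # erg/K
N_AVOGADRO = 6.022e23  # 1/mol


def limiting_energy(n):
    """Limiting (Fermi) energy from (44.7), in erg, for n electrons per cm^3."""
    return (3 * math.pi**2) ** (2 / 3) * HBAR**2 / (2 * M_E) * n ** (2 / 3)


# One conduction electron per sodium atom: density 0.97 g/cm^3, atomic weight 23.
n_sodium = 0.97 / 23.0 * N_AVOGADRO
eps0_sodium = limiting_energy(n_sodium)
t0_sodium = eps0_sodium / K_B

print(f"N/V  = {n_sodium:.3g} cm^-3")
print(f"eps0 = {eps0_sodium:.3g} erg")
print(f"eps0 = {t0_sodium:.3g} degrees")
```

Because the text rounds each factor, the unrounded result comes out a few per cent above 4.8 · 10⁻¹² erg; the conclusion that $\varepsilon_0$ corresponds to tens of thousands of degrees is unaffected.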

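The compressibility of alkali metals. Let us derive a formula for
the compressibility of a Fermi gas at absolute zero. From (44.6),
the energy at absolute zero is

$$E = \frac{V(2m)^{3/2}}{5\pi^2 h^3}\, \varepsilon_0^{5/2}. \qquad (44.10)$$

In accordance with the Bernoulli equation (40.22), the pressure
is equal to two thirds the energy density, i.e.,

$$p = \frac{2}{3}\frac{E}{V} = \frac{2(2m)^{3/2}}{15\pi^2 h^3}\, \varepsilon_0^{5/2} = \frac{3^{2/3}\pi^{4/3}}{5}\, \frac{h^2}{m}\left(\frac{N}{V}\right)^{5/3}. \qquad (44.11)$$

Whence

$$-\frac{1}{V}\frac{dV}{dp} = \frac{3}{5p} = \frac{3^{1/3}}{\pi^{4/3}}\, \frac{m}{h^2}\left(\frac{V}{N}\right)^{5/3} = 0.273 \times 10^{27}\left(\frac{V}{N}\right)^{5/3}. \qquad (44.12)$$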


Ya. I. Frenkel noted that the compressibility of alkali metals 
is close to the compressibility of an electron gas. 




Indeed, expressing $N/V$ in terms of atomic weight and density,
we obtain the following table:

                                                            Li     Na     K      Rb     Cs
$-\frac{1}{V}\frac{dV}{dp} \times 10^{12}$ from equation (44.12)      4.7    13     37     62     79
$-\frac{1}{V}\frac{dV}{dp} \times 10^{12}$ from experimental data     8      15     32     40     61
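
The theoretical row of the table can be recomputed from (44.11) and (44.12). The following is a rough sketch in CGS units: the densities and atomic weights are ordinary handbook values assumed here for illustration (the text does not list them), so the output need not match the printed row exactly, and the function name is hypothetical.

```python
import math

HBAR = 1.0546e-27      # erg*s, written h in the text
M_E = 9.1094e-28       # electron mass, g
N_AVOGADRO = 6.022e23  # 1/mol

# (density in g/cm^3, atomic weight): assumed handbook values, not from the text.
ALKALI_METALS = {
    "Li": (0.534, 6.94),
    "Na": (0.97, 23.0),
    "K": (0.86, 39.1),
    "Rb": (1.53, 85.5),
    "Cs": (1.87, 132.9),
}


def electron_gas_compressibility(density, atomic_weight):
    """-(1/V) dV/dp = 3/(5p) for a Fermi gas at absolute zero, in cm^2/dyne."""
    n = density / atomic_weight * N_AVOGADRO   # one electron per atom
    pressure = (3 * math.pi**2) ** (2 / 3) / 5 * HBAR**2 / M_E * n ** (5 / 3)
    return 3.0 / (5.0 * pressure)


# Express the result in units of 10^-12 cm^2/dyne (equal to 10^-6 per bar).
compressibility = {
    name: electron_gas_compressibility(*data) * 1e12
    for name, data in ALKALI_METALS.items()
}
for name, value in compressibility.items():
    print(f"{name:>2s}: {value:6.1f}")
```

The computed values rise steeply from lithium to caesium, following the $(V/N)^{5/3}$ dependence, and are of the same order as the experimental row.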


In a crystal lattice there are, of course, not only forces of repulsion
between particles, but also cohesive forces. The equilibrium of these
forces with the forces of repulsion determines the characteristic volume
which every condensed body, solid or liquid, has in the absence of
external pressure. Ordinary atmospheric pressure gives a force which
is negligibly small compared with these tremendous forces that keep
bodies in their volumes. In order to change the volume of a body by
only one per cent, pressures are required in the order of tens of
thousands of atmospheres.

The coincidence of theoretical and experimental data indicates that
when alkali metals are compressed the cohesive forces change
insignificantly compared with the forces of repulsion. It is even conceivable
that the state of the valence electrons in alkali metals is perturbed to a
comparatively small degree by the atomic residues, and, to some extent,
is close to an electron gas. Compression affects but little the
electronic shells of the atomic residues, and therefore the compressibility
of alkali metals is close to the compressibility of an ideal Fermi
gas. That this should be so is, of course, not at all obvious beforehand.

Paramagnetism of alkali metals. According to Pauli, the paramagnetism
of alkali metals can also be explained on the basis of the concept
of a free electron gas.

If we place a Fermi gas (consisting of electrons) in a magnetic field,
the energy of the electrons whose spins are parallel to the field will
be equal to $-\beta H$,* while the energy of electrons with opposite direction
of spin will be equal to $+\beta H$. Therefore, if those electrons whose
spin is antiparallel to the field reverse their spin directions, then the
energy of the gas must decrease. But all the places inside the
limiting-energy sphere are occupied; so for an electron to change its spin
direction it must come out of the sphere into a free cell. But this increases
its kinetic energy. Equilibrium is established between electrons with
spins parallel and antiparallel to the field when their total energies
become equal. Indeed, if there occurred a further transition of electrons
into a state with spin parallel to the field, the increase in their
kinetic energy could not be compensated by a reduction in magnetic
energy.

* Here the Bohr magneton is denoted by $\beta$ instead of $\mu$, so as to avoid
confusion with the distribution parameter $\mu$.

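Let there be $n$ electrons which have changed their spin directions.
Then there remain $\frac{N}{2} - n$ electrons with spin antiparallel to the field,
while $\frac{N}{2} + n$ have spins parallel to the field. The limiting energies are
determined from formula (44.8), where we must put $\frac{N}{2} \pm n$ instead
of $\frac{N}{2}$. Whence we obtain the following expression for the limiting
kinetic energy of both types of electrons:

$$\varepsilon_\pm = \left[\frac{3}{4\pi V}\left(\frac{N}{2} \pm n\right)\right]^{2/3}\frac{(2\pi h)^2}{2m}, \qquad (44.13)$$

and the equation for the total limiting energies is

$$\left[\frac{3}{4\pi V}\left(\frac{N}{2} + n\right)\right]^{2/3}\frac{(2\pi h)^2}{2m} - \beta H = \left[\frac{3}{4\pi V}\left(\frac{N}{2} - n\right)\right]^{2/3}\frac{(2\pi h)^2}{2m} + \beta H. \qquad (44.14)$$

Since $n \ll N$, the binomials can be expanded in a series as follows:

$$\left(\frac{N}{2} \pm n\right)^{2/3} \approx \left(\frac{N}{2}\right)^{2/3} \pm \left(\frac{N}{2}\right)^{2/3}\frac{4n}{3N}.$$

Substituting this in (44.14), we find the number of electrons which
change their spin directions in the magnetic field:

$$n = N\beta H\, \frac{3^{1/3}}{2\pi^{4/3}}\, \frac{m}{h^2}\left(\frac{V}{N}\right)^{2/3}. \qquad (44.15)$$

Each of these electrons contributes a term $2\beta$ to the total magnetic
moment of the whole gas, because its moment projection on the magnetic
field has changed from $-\beta$ to $\beta$. The magnetic polarization (that
is, the magnetic moment of unit volume) turns out equal to

$$M = 2\beta\, \frac{n}{V} = \frac{3^{1/3}}{\pi^{4/3}}\, \frac{m\beta^2}{h^2}\left(\frac{N}{V}\right)^{1/3} H, \qquad (44.16)$$

while the magnetic polarizability $\alpha$, defined as the coefficient of $H$
on the right-hand side of this formula, depends only upon the density
of the electron gas and not its temperature:

$$\alpha = \frac{3^{1/3}}{\pi^{4/3}}\, \frac{m\beta^2}{h^2}\left(\frac{N}{V}\right)^{1/3}. \qquad (44.17)$$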


Indeed, alkali metals have a paramagnetism independent of temperature.
Let it be recalled that in accordance with the results of Sec. 40
[see (40.53)] atomic paramagnetism gives a magnetic polarizability
which is inversely proportional to the temperature. Formula (44.17)
agrees satisfactorily with experiment.
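
As an order-of-magnitude illustration, the polarizability (44.17) can be evaluated for sodium. The sketch below uses assumed standard CGS constants (including the Bohr magneton, written $\beta$ in the text) and checks the result against the equivalent form $\alpha = \frac{3}{2}\frac{N}{V}\frac{\beta^2}{\varepsilon_0}$, which follows from (44.7) and (44.17); the function names are hypothetical.

```python
import math

# Standard CGS constants (assumed; not quoted in the text).
HBAR = 1.0546e-27      # erg*s, written h in the text
M_E = 9.1094e-28       # electron mass, g
BETA = 9.274e-21       # Bohr magneton, erg/gauss
N_AVOGADRO = 6.022e23  # 1/mol


def limiting_energy(n):
    """Limiting energy (44.7) in erg for n electrons per cm^3."""
    return (3 * math.pi**2) ** (2 / 3) * HBAR**2 / (2 * M_E) * n ** (2 / 3)


def spin_polarizability(n):
    """Magnetic polarizability (44.17) of the electron gas (dimensionless, CGS)."""
    return 3 ** (1 / 3) / math.pi ** (4 / 3) * M_E * BETA**2 / HBAR**2 * n ** (1 / 3)


n_sodium = 0.97 / 23.0 * N_AVOGADRO
alpha = spin_polarizability(n_sodium)
alpha_check = 1.5 * n_sodium * BETA**2 / limiting_energy(n_sodium)

print(f"alpha from (44.17)            = {alpha:.3g}")
print(f"alpha from 3 n beta^2/(2 eps0) = {alpha_check:.3g}")
```

The two forms agree identically, and the value contains no temperature, in contrast to the $1/\theta$ law of atomic paramagnetism. The diamagnetic correction discussed next is not included in this sketch.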




Diamagnetism of electrons. L. D. Landau has shown that the quantized
motion of electrons in a magnetic field (this motion is similar
to their classical motion in a spiral) leads, in a weak field, to the
appearance of a magnetic moment equal to 1/3 of expression (44.16), and of
opposite sign. The nature of this effect is purely quantum; if we regard
the motion of electrons as classical then the additional magnetic
moment becomes identically zero (see Sec. 46, exercise 13).

If $\beta H$ is of the order of $\theta$, then the polarizability does not depend
monotonically upon the field and exhibits much oscillation as the field
increases. The oscillatory variation of magnetic properties is, in fact,
observed in very many metals.

The potential distribution in an atom. We shall now show how to find
the general form for the electron-density distribution in atoms via
the notion of a Fermi gas. To a certain approximation, the electrons
in heavy atoms resemble a Fermi gas. However, it must be noted that
each electron occurs in the inhomogeneous electric field formed by the
nucleus and the entire configuration of the remaining electrons.

Let us first of all consider a Fermi gas at absolute zero in a potential
field of the form shown in Fig. 52: $U = 0$ for $0 < x < a$, $U = U_1$ for
$a < x < b$, $U = U_2$ for $x > b$. Then the limiting energy of the electrons
must be the same for $0 < x < a$ and for $a < x < b$, because otherwise
the electrons will pass into a region of lesser limiting energy according,
as it were, to the law of communicating vessels. Of course, as a result
of this the total energy will diminish. But the energy of a gas at
absolute zero is the least possible, so that the limiting energy must be
the same in any part of the gas.

The potential energy distribution in an atom is approximately as 
shown in Fig. 53. The potential energy is everywhere negative because 
we have taken it to be zero at infinity. 

The limiting energy of electrons must not be positive anywhere,
because electrons with a positive total energy could leave the atom for
infinity. The limiting energy cannot anywhere be less than the potential
energy. We shall show that it must be equal to zero. If, for example,
it corresponded to the dashed line in Fig. 53, the electron density
would become zero at the point $r = r_0$. But then the electric field would
be zero for all values $r > r_0$, because in the case of spherically symmetrical
charge distribution the action of all the electrons of a neutral atom
balances the action of the nucleus. Accordingly, the potential would
also be zero when $r = r_0$, because potential is the field integral:

$$\varphi(r_0) = \int_{r_0}^{\infty} E\, dr,$$

i.e., the integral with integrand equal to zero at $r \geq r_0$.

Thus, the following three conditions would be satisfied at the point
$r = r_0$: $n(r_0) = 0$, $\varphi(r_0) = 0$, $E(r_0) = 0$ ($n \equiv \frac{dN}{dV}$, i.e., the electron
density). The density is proportional to the 3/2 power of the limiting
kinetic energy $\varepsilon_0$ [see (44.8)]. Here, $\mathcal{E}_0 = \varepsilon_0 - e\varphi$, i.e., the
limiting total energy, must be a constant quantity. But, applying this
equation at the point $r_0$, we see that $\mathcal{E}_0$ must equal zero. It only
remains to show that the point $r_0$ cannot occur at a finite distance away
from the nucleus. As we shall see later, this follows from the equation
for the distribution of potential.

Thus, putting $\mathcal{E}_0 = 0$ we obtain

$$(3\pi^2)^{2/3}\, \frac{h^2}{2m}\, n^{2/3} = e\varphi. \qquad (44.18)$$

Here, from formula (44.7), $\varepsilon_0$ is expressed in terms of density, i.e.,
the charge density is related to potential.

The equation for a self-consistent field. A second relationship between
potential and density is given by the electrostatic equation
(14.7). Since the electronic charge is negative, this equation should be
written with a plus sign on the right-hand side:

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d\varphi}{dr}\right) = 4\pi e n. \qquad (44.19)$$

The density must be eliminated by means of (44.18), with $e\varphi$ inserted
in place of $\varepsilon_0$. Then we obtain

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d\varphi}{dr}\right) = \frac{2^{7/2} m^{3/2} e^{5/2}}{3\pi h^3}\, \varphi^{3/2}. \qquad (44.20)$$


We transform this equation like (19.6). For this, we substitute $\varphi$
in the form

$$\varphi = \frac{Ze}{r}\, \psi(r). \qquad (44.21)$$

The function $\psi$ is nondimensional since $\frac{Ze}{r}$ has the dimensions of
potential. In immediate proximity to the nucleus, $\varphi$ is determined only
by the nucleus because the potential of the nucleus tends to infinity
like $\frac{Ze}{r}$, while the potential of the spatially distributed charge of the
electrons remains finite.

Therefore, close to the nucleus (i.e., when $r \to 0$) $\psi \to 1$. At large
distances from the nucleus, its charge is completely screened by the
electron charge of opposite sign, so that the potential of an atom must
tend to zero more rapidly than $\frac{1}{r}$. This shows that $\psi(\infty) = 0$.

Substituting (44.21) in (44.20), we have an equation for $\psi$:

$$\frac{d^2\psi}{dr^2} = \frac{2^{7/2}}{3\pi}\, \frac{m^{3/2} e^3 Z^{1/2}}{h^3}\, \frac{\psi^{3/2}}{r^{1/2}}. \qquad (44.22)$$


It is convenient to get rid of the dimensional factor on the right-hand
side. To do this, we must introduce a new unit of length similar
to the atomic unit [see (31.21)]:

$$r = \frac{(3\pi)^{2/3}}{2^{7/3}}\, \frac{h^2}{m e^2 Z^{1/3}}\, x. \qquad (44.23)$$

This unit differs from the atomic unit by the factor $\frac{0.889}{Z^{1/3}}$. After the
introduction of a nondimensional variable $x$, equation (44.22) reduces
to standard form (the Thomas-Fermi equation):

$$\frac{d^2\psi}{dx^2} = \frac{\psi^{3/2}}{x^{1/2}}. \qquad (44.24)$$


Now it does not involve atomic number. Both boundary conditions
for $\psi$ ($\psi(0) = 1$ and $\psi(\infty) = 0$) are also the same for all atoms.
Therefore, it is sufficient to integrate equation (44.24) once with these
boundary conditions.
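
One way to carry out this single integration numerically is a shooting method: guess the initial slope $\psi'(0)$, integrate outward, and bisect between slopes for which $\psi$ crosses the axis and slopes for which it turns upward. The sketch below is an illustration under stated assumptions rather than the procedure used in the text: the substitution $t = \sqrt{x}$, the Runge-Kutta step, the cut-off and all function names are choices made here.

```python
def integrate(slope0, t_max=6.0, dt=0.002):
    """Integrate psi'' = psi**1.5 / sqrt(x) with psi(0) = 1, psi'(0) = slope0.

    With t = sqrt(x) and u = dpsi/dx the system becomes
        dpsi/dt = 2 t u,   du/dt = 2 psi**1.5,
    which has no singularity at the origin. Returns (verdict, peak):
    verdict is -1 if psi crosses zero, +1 if psi starts to increase,
    0 if neither happens before t_max; peak is the largest x*psi seen.
    """
    def rhs(t, psi, u):
        return 2.0 * t * u, 2.0 * max(psi, 0.0) ** 1.5

    t, psi, u, peak = 0.0, 1.0, slope0, 0.0
    while t < t_max:
        # One classical fourth-order Runge-Kutta step.
        k1p, k1u = rhs(t, psi, u)
        k2p, k2u = rhs(t + dt / 2, psi + dt / 2 * k1p, u + dt / 2 * k1u)
        k3p, k3u = rhs(t + dt / 2, psi + dt / 2 * k2p, u + dt / 2 * k2u)
        k4p, k4u = rhs(t + dt, psi + dt * k3p, u + dt * k3u)
        psi += dt / 6 * (k1p + 2 * k2p + 2 * k3p + k4p)
        u += dt / 6 * (k1u + 2 * k2u + 2 * k3u + k4u)
        t += dt
        if psi < 0.0:
            return -1, peak
        if u > 0.0:
            return +1, peak
        peak = max(peak, t * t * psi)
    return 0, peak


def initial_slope(iterations=40):
    """Bisect on psi'(0) between a curve that crosses zero and one that turns up."""
    low, high = -2.0, -1.0
    for _ in range(iterations):
        mid = 0.5 * (low + high)
        verdict, _ = integrate(mid)
        if verdict < 0:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


slope = initial_slope()
_, peak_x_psi = integrate(slope)
print(f"psi'(0) = {slope:.4f}")
print(f"max of x*psi = {peak_x_psi:.3f}")
```

The second printed number is the maximum of the product $x\psi$ along the integral curve that approaches the axis asymptotically.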

If we return to the dimensional radius $r$, the function $\psi(x)$ gives the
potential distribution for each $Z$:

$$\psi = \psi(1.125\, Z^{1/3} r). \qquad (44.25)$$

If the distance from the nucleus is expressed in terms of $x$, the electron
density distribution is the same for all atoms to which the statistical
method is applicable, i.e., for all elements of large and medium
atomic weight. But the same $x$ denotes a geometrical distance inversely
proportional to $Z^{1/3}$, as can be seen from (44.23). Therefore, in heavy
atoms, the main part of the electrons is concentrated closer to the
nucleus than in the lighter atoms.

The accuracy of the Thomas-Fermi equation (44.24) is determined 
by the quantity as can be shown from a strictly quantum- 

mechanical derivation by using a quasi-classical approximation. There¬ 
fore, equation (44.24) cannot, of course, be applied to the very lightest 
atoms that contain few electrons. 




Substantiation of the boundary conditions for equation (44.24). The
integral curves of equation (44.24) begin at the point $\psi = 1$ for $x = 0$,
and fall with increasing $x$, accounting thereby for the screening effect,
i.e., weakening of the nuclear field by the atomic electrons. The
diminishing function may either pass through a minimum, without attaining
$\psi = 0$, and then begin to increase, or it may intersect the $x$-axis at
a certain point $x = x_0$, or it may tend to this axis asymptotically.
The first possibility must be rejected at once, because it results
in an infinite total number of electrons, proportional to $\int_0^\infty \psi^{3/2} x^{1/2}\, dx$
[see (44.18) and (44.21)] (if we take $\psi(\infty) > 0$). It is impossible to cut
off the integration at some $x_0$ when $\psi(x_0) > 0$, since this would correspond
to a limiting total energy not equal to zero.

If we take the second possibility, then the total number of electrons
has a finite value and will be proportional to $\int_0^{x_0} \psi^{3/2} x^{1/2}\, dx$. The electron
density and, hence, the electric field of a neutral atom also, must, by
definition, become zero at the point $x = x_0$, since the nuclear charge
in it is completely screened by electrons. In accordance with (44.21)
the electric field will be determined by the expression

$$E = -\frac{d\varphi}{dr} = \frac{Ze\psi}{r^2} - \frac{Ze}{r}\frac{d\psi}{dr}.$$

But the condition $\frac{d\psi}{dr} = 0$ is satisfied where $E = 0$ and $\psi = 0$. Therefore,
the point $x = x_0$ must correspond to tangency of the integral
curve with the $x$-axis, and not to intersection. In the general case, the
integral curve close to the point of tangency has the following form:

$$\psi = a(x - x_0)^{2+k} + \ldots,$$

where $k$ is a positive number. Terms with higher powers of $(x - x_0)$ are denoted
by the dots. Substituting this expansion in equation (44.24), we have

$$(2 + k)(1 + k)\, a\, (x - x_0)^{k} = \frac{a^{3/2}}{x_0^{1/2}}\, (x - x_0)^{3 + \frac{3k}{2}},$$

whence it follows that $k = -6$, in spite of the assumption that
$k > 0$. Hence, tangency of the integral curve with the $x$-axis is impossible
at a finite distance from the origin, and asymptotic tangency
must be assumed. And the condition $\frac{d\psi}{dx} = 0$ is automatically satisfied
at infinity.

The charge distribution in positive ions. In positive ions, the charge
of all the electrons does not completely screen the nuclear charge,
so that, at the point where $\psi = 0$, the condition $\frac{d\psi}{dx} = 0$ should
not be satisfied. The electron density distribution in an ion is given
by the integral curves intersecting the $x$-axis. The point of intersection
determines the radius of the ion $r_0$.

The order for filling the electron shells. From the potential distribution
in an atom, we can determine the values of $Z$ for which $d$-
and $f$-electrons first appear in the atom.

We first of all note that the electron density distribution in an
atom must be associated with the angular-momentum distribution
of the electrons. As we have already indicated, the limiting momentum
of electrons is proportional to the $\frac{1}{3}$ power of the electron density.
Therefore, close to the nucleus, where the electron density is great,
the limiting momentum is also great, while at large distances from
the nucleus, the limiting momentum is small. But the angular momentum
of an electron is determined by the product of the momentum
by the distance to the nucleus, and close to the nucleus it is small
despite the large limiting momentum. At large distances from the
nucleus the angular momentum becomes small, this time as a result
of the smallness of the limiting momentum. Hence, somewhere
at medium distances, the angular momentum attains a maximum
which is larger, the greater the electron density. Therefore, in heavy
atoms with a large electron density, we find larger values of angular
momentum. In order to find the greatest values of angular momentum
that are possible for a given $Z$, we shall proceed from the classical
expression for energy in a central field [see (5.7)]

$$\mathcal{E} = \frac{p_r^2}{2m} + \frac{M^2}{2mr^2} - \frac{Ze^2\psi}{r}. \qquad (44.26)$$


We must put $\mathcal{E} = 0$ for the boundary energy, in accordance with
the basic assumption (44.18). Then, for the radial component of
momentum we obtain the expression

$$p_r = \sqrt{\frac{2mZe^2\psi}{r} - \frac{M^2}{r^2}}. \qquad (44.27)$$

We can substitute $h^2 l(l+1)$ in place of $M^2$. But since formula
(44.26) is written to a quasi-classical approximation, a better result
is obtained if we also take the quasi-classical approximation for $M^2$.
It can be calculated using the same methods as those for determining
the energy eigenvalues from formula (29.18). To this approximation
$M^2 = h^2\left(l + \frac{1}{2}\right)^2$. We notice that $\left(l + \frac{1}{2}\right)^2$ differs from $l(l+1)$ only by
a quarter.

We write (44.27) in the following form:

$$p_r = \frac{1}{r}\sqrt{2mZe^2\psi\, r - h^2\left(l + \frac{1}{2}\right)^2}. \qquad (44.28)$$

Let us now express the factor $r$ in the radicand in terms of the
nondimensional quantity $x$ according to formula (44.23). Then $p_r$
will be

$$p_r = \frac{h}{r}\sqrt{1.778\, Z^{2/3}\, x\psi - \left(l + \frac{1}{2}\right)^2}. \qquad (44.29)$$

For $p_r$ to be a real quantity, the radicand must remain positive
in a certain interval of values $x$. But since $x\psi = 0$ when $x = 0$ and
$x = \infty$, this interval is finite and contains the maximum point of the
function $x\psi$. The maximum is equal to 0.488. Thus, the whole interval
in which $p_r$ is a real quantity is contracted into a point for the
value of $Z_l$ at which

$$1.778 \cdot 0.488 \cdot Z_l^{2/3} = \left(l + \frac{1}{2}\right)^2, \qquad (44.30)$$

the curve $y = 1.778\, Z^{2/3}\, x\psi$ being tangential to the constant straight
line $y = \left(l + \frac{1}{2}\right)^2$.

It follows that a given value of $l$ in an atom may occur when $Z$
satisfies the condition

$$Z \geq 0.155\, (2l + 1)^3. \qquad (44.31)$$

According to this equation, electrons having $l = 2$ will occur for
$Z = 19$, while $f$-electrons ($l = 3$) will occur when $Z = 53$. There will
be better agreement if we take the coefficient 0.17 instead of
0.155.
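
The arithmetic leading from (44.30) to (44.31) is short enough to check directly. The sketch below solves (44.30) for $Z$, recovers the coefficient of $(2l+1)^3$, and evaluates the threshold for $l = 2$ and $l = 3$ with both coefficients mentioned above; the function name is hypothetical.

```python
# Solve 1.778 * 0.488 * Z**(2/3) = (l + 1/2)**2 for Z, as in (44.30):
#   Z = (l + 1/2)**3 / (1.778 * 0.488)**1.5 = coefficient * (2l + 1)**3.
coefficient = 1.0 / (8.0 * (1.778 * 0.488) ** 1.5)


def threshold_z(l, coeff=0.155):
    """Smallest Z allowed by (44.31) for orbital quantum number l."""
    return coeff * (2 * l + 1) ** 3


print(f"coefficient from (44.30): {coefficient:.4f}")
for l, letter in [(2, "d"), (3, "f")]:
    print(f"{letter}-electrons (l = {l}): "
          f"Z = {threshold_z(l):.1f} with 0.155, "
          f"Z = {threshold_z(l, 0.17):.1f} with 0.17")
```

With the coefficient 0.17 the thresholds move to about 21 and 58, which is the better agreement referred to in the text.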

Using the numerical form of the function $\psi(x)$, it can be shown
that the $d$- and $f$-shells are formed mainly deep inside the atom,
as was shown in Sec. 33.

The approximate integral formula for the Fermi distribution. In
conclusion, let us consider a Fermi gas not at absolute zero, but
at a temperature other than zero yet satisfying the inequality
(44.9).

It is convenient first to derive a general formula for the integral
of the Fermi distribution that holds for $\theta \ll \varepsilon_0$.

Let us take the integral

$$I = \int_0^\infty \frac{\gamma'(\varepsilon)\, d\varepsilon}{e^{(\varepsilon-\mu)/\theta} + 1}, \qquad (44.32)$$

where $\gamma'(\varepsilon)$ is some power function, for example $\sqrt{\varepsilon}$, etc.

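We integrate (44.32) by parts:

$$I = \left.\frac{\gamma(\varepsilon)}{e^{(\varepsilon-\mu)/\theta} + 1}\right|_0^\infty + \frac{1}{\theta}\int_0^\infty \frac{\gamma(\varepsilon)\, e^{(\varepsilon-\mu)/\theta}\, d\varepsilon}{\left(e^{(\varepsilon-\mu)/\theta} + 1\right)^2} = -\frac{\gamma(0)}{e^{-\mu/\theta} + 1} + \frac{1}{\theta}\int_0^\infty \frac{\gamma(\varepsilon)\, e^{(\varepsilon-\mu)/\theta}\, d\varepsilon}{\left(e^{(\varepsilon-\mu)/\theta} + 1\right)^2}. \qquad (44.33)$$

Let us write the second factor in the integrand thus:

$$\frac{e^{(\varepsilon-\mu)/\theta}}{\left(e^{(\varepsilon-\mu)/\theta} + 1\right)^2} = \frac{1}{\left(e^{(\varepsilon-\mu)/\theta} + 1\right)\left(e^{(\mu-\varepsilon)/\theta} + 1\right)}. \qquad (44.34)$$

The denominator of this expression is large both for $\varepsilon < \mu$ and for
$\varepsilon > \mu$. The exponential $e^{(\mu-\varepsilon)/\theta}$ is large in the first case, while $e^{(\varepsilon-\mu)/\theta}$ is large
in the second case. Therefore, the whole expression differs noticeably
from zero only in a narrow range of values $\varepsilon$, different from $\mu$ by
an amount of the order of $\theta$. Let us expand the function $\gamma(\varepsilon)$ within
this range and let us terminate the expansion with the quadratic term:

$$\gamma(\varepsilon) = \gamma(\mu) + (\varepsilon - \mu)\,\gamma'(\mu) + \frac{(\varepsilon - \mu)^2}{2}\,\gamma''(\mu). \qquad (44.35)$$

We substitute this expansion in (44.33). Taking into account that
the second factor in the integrand is very small for $\varepsilon = 0$, we can
perform the integration to $\varepsilon = -\infty$ without making any perceivable
error. In addition, we shall neglect the quantity $e^{-\mu/\theta}$ in the integrated
term of (44.33). From this we obtain

$$I = -\gamma(0) + \frac{\gamma(\mu)}{\theta}\int_{-\infty}^{\infty}\frac{d\varepsilon}{\left(e^{(\varepsilon-\mu)/\theta} + 1\right)\left(e^{(\mu-\varepsilon)/\theta} + 1\right)} + \frac{\gamma'(\mu)}{\theta}\int_{-\infty}^{\infty}\frac{(\varepsilon - \mu)\, d\varepsilon}{\left(e^{(\varepsilon-\mu)/\theta} + 1\right)\left(e^{(\mu-\varepsilon)/\theta} + 1\right)} + \frac{\gamma''(\mu)}{2\theta}\int_{-\infty}^{\infty}\frac{(\varepsilon - \mu)^2\, d\varepsilon}{\left(e^{(\varepsilon-\mu)/\theta} + 1\right)\left(e^{(\mu-\varepsilon)/\theta} + 1\right)}. \qquad (44.36)$$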




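The first integral is calculated immediately; it is

$$\frac{1}{\theta}\int_{-\infty}^{\infty}\frac{d\varepsilon}{\left(e^{(\varepsilon-\mu)/\theta} + 1\right)\left(e^{(\mu-\varepsilon)/\theta} + 1\right)} = -\left.\frac{1}{e^{(\varepsilon-\mu)/\theta} + 1}\right|_{-\infty}^{\infty} = 1. \qquad (44.37)$$

We change the integration variable in the second and third integrals,
assuming

$$\varepsilon - \mu = \theta x.$$

Then the second integral reduces to the form

$$\int_{-\infty}^{\infty}\frac{(\varepsilon - \mu)\, d\varepsilon}{\left(e^{(\varepsilon-\mu)/\theta} + 1\right)\left(e^{(\mu-\varepsilon)/\theta} + 1\right)} = \theta^2\int_{-\infty}^{\infty}\frac{x\, dx}{(e^x + 1)(e^{-x} + 1)} = 0, \qquad (44.38)$$

because the integrand is an odd function. Finally, the third integral
(see Appendix) is

$$\int_{-\infty}^{\infty}\frac{x^2\, dx}{(e^x + 1)(e^{-x} + 1)} = \frac{\pi^2}{3}. \qquad (44.39)$$

Thus, the required integral appears in the form of the following
expansion:

$$I = \gamma(\mu) - \gamma(0) + \frac{\pi^2}{6}\,\theta^2\gamma''(\mu) = \int_0^\mu \gamma'(\varepsilon)\, d\varepsilon + \frac{\pi^2}{6}\,\theta^2\gamma''(\mu). \qquad (44.40)$$

The zero term in this expansion corresponds to the form that the
Fermi distribution has at absolute zero; indeed, if $f = 1$ for $0 < \varepsilon < \mu$,
then $I$ will just equal $\int_0^\mu \gamma'(\varepsilon)\, d\varepsilon$. The first term, which is linear in
$\theta$, drops out of the expansion. This is clearly evident from the following.
The electrons which escape the limiting-energy sphere leave
behind unoccupied levels, so-called “holes.” To a first approximation,
these holes are symmetrically distributed with respect to
the occupied levels lying above the limiting energy. In Fig. 50 this
can be seen from the fact that the shaded areas are approximately
equal.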

Finally, the quadratic term contributes the desired correction to 
the integral I. 
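
The expansion (44.40) can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the trial function $\psi'(\varepsilon) = \sqrt{\varepsilon}$ and the illustrative values $\mu = 1$, $\theta = 0.05$ (both chosen here only for the test), evaluates the integral by Simpson's rule, and compares it with the zero-temperature term and with the quadratic correction.

```python
import math

def fermi_integral(dpsi, mu, theta, n=20000):
    """Simpson's-rule estimate of I = int_0^inf psi'(eps) / (exp((eps-mu)/theta) + 1) d eps."""
    upper = mu + 40.0 * theta      # beyond this point the Fermi factor is about e^-40
    h = upper / n                  # n must be even for Simpson's rule

    def integrand(eps):
        return dpsi(eps) / (math.exp((eps - mu) / theta) + 1.0)

    total = integrand(0.0) + integrand(upper)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * integrand(k * h)
    return total * h / 3.0

def expansion_44_40(psi, d2psi, mu, theta):
    """Right-hand side of (44.40): psi(mu) - psi(0) + (pi^2/6) theta^2 psi''(mu)."""
    return psi(mu) - psi(0.0) + math.pi ** 2 / 6.0 * theta ** 2 * d2psi(mu)

# Illustrative (assumed) values: psi'(eps) = sqrt(eps), so psi = (2/3) eps^(3/2).
mu, theta = 1.0, 0.05
numeric = fermi_integral(math.sqrt, mu, theta)
approx = expansion_44_40(lambda e: 2.0 / 3.0 * e ** 1.5,
                         lambda e: 0.5 / math.sqrt(e), mu, theta)
zero_temperature = 2.0 / 3.0 * mu ** 1.5

print(numeric, approx, zero_temperature)
```

With these values the quadratic term shifts the zero-temperature result by about $2\times10^{-3}$, and the numerical integral should agree with (44.40) to much better than that, the residual being of order $\theta^{4}$.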




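The specific heat of a Fermi gas. We shall now apply the result (44.40)
to the calculation of specific heat. To do this we write down the expressions
for the energy and the total number of particles:

$$\mathcal{E} = \frac{V(2m^{3})^{1/2}}{\pi^{2}h^{3}}\int_0^\infty\frac{\varepsilon^{3/2}\,d\varepsilon}{e^{(\varepsilon-\mu)/\theta}+1}, \tag{44.41}$$

$$N = \frac{V(2m^{3})^{1/2}}{\pi^{2}h^{3}}\int_0^\infty\frac{\varepsilon^{1/2}\,d\varepsilon}{e^{(\varepsilon-\mu)/\theta}+1}. \tag{44.42}$$

We apply formula (44.40) and obtain

$$\mathcal{E} = \frac{2V(2m^{3})^{1/2}}{5\pi^{2}h^{3}}\,\mu^{5/2}\left(1 + \frac{5\pi^{2}}{8}\,\frac{\theta^{2}}{\mu^{2}}\right), \tag{44.43}$$

$$N = \frac{2V(2m^{3})^{1/2}}{3\pi^{2}h^{3}}\,\mu^{3/2}\left(1 + \frac{\pi^{2}}{8}\,\frac{\theta^{2}}{\mu^{2}}\right), \tag{44.44}$$

because the function $\psi'(\varepsilon)$ equalled $\varepsilon^{3/2}$ for the first integral and
$\varepsilon^{1/2}$ for the second integral.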

Using these formulae let us find the specific heat. From the definition
of specific heat we have

$$C = \frac{d\mathcal{E}}{d\theta} = \frac{\partial\mathcal{E}}{\partial\theta} + \frac{\partial\mathcal{E}}{\partial\mu}\,\frac{\partial\mu}{\partial\theta}. \tag{44.45}$$

We calculate the derivative $\partial\mu/\partial\theta$ from the second equation,
differentiating it as an implicit function at constant $N$:

$$\frac{\partial\mu}{\partial\theta} = -\frac{\partial N/\partial\theta}{\partial N/\partial\mu} = -\frac{\pi^{2}}{6}\,\frac{\theta}{\mu}. \tag{44.46}$$

We have both times omitted differentiating the coefficient of $\theta^{2}$ with
respect to $\mu$, because $\theta$ is regarded as small. Substituting (44.46) in (44.45),
we write the specific heat as

$$C = \frac{V(2m^{3})^{1/2}}{3h^{3}}\,\mu^{1/2}\,\theta. \tag{44.47}$$

Finally, in place of $\mu$ we must substitute the expression for the
limiting energy $\varepsilon_0$ (44.7). Then the specific heat will be expressed in
terms of the gas density and temperature:

$$C = \frac{\pi^{2}}{2}\,N\,\frac{\theta}{\varepsilon_0}. \tag{44.48}$$

Thus, the specific heat per electron is approximately $5\,\theta/\varepsilon_0$, which,
according to (44.9), is a very small quantity. For example, we estimated
that for sodium $\varepsilon_0 \approx 34{,}800^\circ$, so that $\theta/\varepsilon_0 \sim 0.01$ at room
temperature. The specific heat of a Fermi gas per electron at room temperature
is 0.05. This must be compared with the specific heat for a Boltzmann
gas, equal to 1.5 from Sec. 40 (if $\theta$ is expressed in ergs, the
specific heat $C$ is an abstract quantity).

It is easy to see why the specific heat of a Fermi gas is considerably
less than the specific heat of a Boltzmann gas: not all the electrons in
a Fermi distribution are capable of being thermally excited, but only
those whose energy is close to the limiting energy. This is why the
specific heat of a Fermi gas turns out equal to a few per cent of $N$.
A specific heat $\tfrac{3}{2}N$ is obtained only when all the electrons are capable
of being thermally excited.
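
The estimate for sodium can be reproduced with a few lines of arithmetic. This is a hedged sketch: it assumes the per-electron specific heat $(\pi^{2}/2)\,\theta/\varepsilon_0$ (the standard low-temperature result, whose coefficient $\pi^{2}/2 \approx 4.93$ is the "approximately 5" of the text), the value $\varepsilon_0 \approx 34{,}800^\circ$ quoted above, and a room temperature of 300 degrees, which is an assumption made here.

```python
import math

def fermi_specific_heat_per_electron(theta_over_eps0):
    """Electronic specific heat per electron, (pi^2 / 2) * theta / eps0 (dimensionless)."""
    return math.pi ** 2 / 2.0 * theta_over_eps0

BOLTZMANN_SPECIFIC_HEAT = 1.5          # classical value per particle, in the same units

room_ratio = 300.0 / 34800.0           # assumed room temperature over the quoted eps0
c_rounded = fermi_specific_heat_per_electron(0.01)       # with theta/eps0 rounded to 0.01
c_sodium = fermi_specific_heat_per_electron(room_ratio)  # with the unrounded ratio

print(c_rounded, c_sodium, c_sodium / BOLTZMANN_SPECIFIC_HEAT)
```

Either way the result is a few per cent of the classical value 1.5, which is the point of the comparison.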

Difficulties in the classical electron theory of metals. Considerable
difficulty was experienced in the prequantum theory of metals because
the electron gas in a metal does not have an experimentally noticeable
specific heat at room temperature. The specific heat of a metal does not
exceed the value 3 per atom [see (42.32)]. Yet if the number of electrons
present equalled the number of atoms, then, according to classical
statistics, the metal would have a specific heat 3 + 3/2 = 9/2 per
atom, which is never observed.

If we apply Fermi statistics to electrons, then, as we have just
seen, the difficulty with specific heat is removed.

At low temperature the specific heat of the crystal lattice of a metal
is proportional to $\theta^{3}$ [see (42.35)]. Therefore, if the temperature is
sufficiently low, the electronic specific heat begins to predominate
and can be measured. Measurements show that at very low temperatures
the specific heat of metals is indeed proportional to $\theta$. As can be
seen from (44.48), if we know the specific heat we can also determine
the number of electrons per atom. It is a curious fact that bismuth,
which in many respects is not a typical metal, has a very small number
of conduction electrons.


Exercises 

1) Find the equilibrium concentration of electrons and positrons in some
volume not containing charges at low temperature.

In place of the conservation of the number of particles we must take into
account the conservation of charge in the formation and annihilation of
electron-positron pairs. Denoting the number of electrons in a given quantum state
by the letter $f$, and the number of positrons by the letter $f'$, we have, in place
of (39.23), the following supplementary condition:

$$\sum_k g_k\,(f_k - f'_k) = 0.$$

Determining $f$ and $f'$, which give the maximum of the function $S = \ln P$
with the supplementary condition indicated, we obtain the distribution
functions for electrons and positrons:

$$f = \frac{1}{e^{(\varepsilon-\mu)/\theta}+1}, \qquad f' = \frac{1}{e^{(\varepsilon+\mu)/\theta}+1},$$

where the constant $\mu$ is the same. The total number of electrons must equal
the total number of positrons, i.e.,

$$\int\frac{p^{2}\,dp}{e^{(\varepsilon-\mu)/\theta}+1} = \int\frac{p^{2}\,dp}{e^{(\varepsilon+\mu)/\theta}+1}.$$

This equation has a solution only for $\mu = 0$. Hence, the total number of
electrons in unit volume is

$$\frac{N}{V} = \frac{1}{\pi^{2}h^{3}}\int_0^\infty\frac{p^{2}\,dp}{e^{\varepsilon/\theta}+1}.$$

Let us calculate this integral when $\theta \ll mc^{2}$. We can take the nonrelativistic
approximation for the energy, $\varepsilon = mc^{2} + p^{2}/2m$, and represent the
distribution function in the form $e^{-\varepsilon/\theta}$.

Whence we have the equilibrium electron density

$$\frac{N}{V} = 2\left(\frac{m\theta}{2\pi h^{2}}\right)^{3/2} e^{-mc^{2}/\theta}.$$

This quantity is equal to 1/cm$^{3}$ for $\theta \approx 8$ kev. The energy of the
electromagnetic field per unit volume at the same temperature is $0.6\times10^{18}$ ergs,
while only $1.6\times10^{-6}$ erg is released in pair annihilation. The energy of electrons
and positrons will be close to the electromagnetic field energy only when $\theta$ is
of the order of $mc^{2}$.

2) Find the limiting energy of a superdense electron gas, for which the
dependence of energy upon momentum is in the main extremely relativistic:
$\varepsilon = cp$. Determine the density at which the gas may be regarded as
ultrarelativistic.

In place of equation (44.8), we have

$$\frac{4\pi}{3}\,\frac{\varepsilon_0^{3}}{c^{3}} = \frac{N}{2V}\,(2\pi h)^{3},$$

so that

$$\varepsilon_0 = 2\pi h c\left(\frac{3N}{8\pi V}\right)^{1/3}.$$

The rest energy can be neglected if

$$\varepsilon_0 \gg mc^{2},$$

so that the condition for the density is written in the form

$$\frac{N}{V} \gg \frac{1}{3\pi^{2}}\left(\frac{mc}{h}\right)^{3} \approx 10^{30}\ \text{electrons/cm}^{3}.$$

Since $\varepsilon_0$ involves $(N/V)^{1/3}$, the inequality must be strong. The energy of
such an ultrarelativistic gas is given by the expression

$$\mathcal{E} = \frac{3}{4}\,N\varepsilon_0.$$

3) Find the number of electrons passing through the surface of a metal
in unit time if only those electrons can cross the surface for which the velocity
component normal to the wall is greater than $v_{0x}$. This quantity satisfies the
inequality

$$\frac{m v_{0x}^{2}}{2} - \mu \gg \theta.$$

In other words, the energy of the emerging electrons differs from the limiting
energy by an amount considerably greater than $\theta$ (thermionic emission).

The number of electrons with velocity $v_x$ falling on a square centimetre of
surface in one second is

$$v_x\,dn(v_x),$$

where $dn(v_x)$ is the density of electrons having a given value of velocity
projection $v_x$. Like (44.3), we write $dn(v_x)$ in the form

$$dn(v_x) = \frac{2m^{3}\,dv_x\,dv_y\,dv_z}{(2\pi h)^{3}}\,\frac{1}{e^{(\varepsilon-\mu)/\theta}+1},$$

where $\varepsilon = \frac{m}{2}\left(v_x^{2} + v_y^{2} + v_z^{2}\right)$. The surface of a metal is crossed only by
those electrons for which the difference $\varepsilon - \mu$ is considerably greater than
$\theta$, so that we are justified in passing from a Fermi distribution to a
distribution of the Boltzmann type, but with the same value of $\mu$ as in the
Fermi distribution. In other words, we take only the “tail” of the Fermi curve
where $\varepsilon - \mu \gg \theta$. Whence, the required electron flux is

$$\frac{2m^{3}}{(2\pi h)^{3}}\,e^{\mu/\theta}\int_{v_{0x}}^{\infty} v_x\,e^{-mv_x^{2}/2\theta}\,dv_x\int_{-\infty}^{\infty}e^{-mv_y^{2}/2\theta}\,dv_y\int_{-\infty}^{\infty}e^{-mv_z^{2}/2\theta}\,dv_z = \frac{4\pi m\theta^{2}}{(2\pi h)^{3}}\,\exp\left[-\frac{1}{\theta}\left(\frac{m v_{0x}^{2}}{2} - \mu\right)\right].$$

If we apply an electric field to the metal, the maximum current that can be
extracted at a given temperature (saturation current) is determined by this
formula. Since it relates to electrons in a metal, the quantity $\mu$ is close to $\mu_0$
(i.e., to the limiting energy at absolute zero) and does not depend upon
temperature.

It will be noticed that if we apply a very strong electric field to the metal,
electrons will emerge from it overcoming the potential barrier which appears
at the boundary under such conditions (cold emission). But this requires very
large fields. Cold emission is analogous to the ionization of atoms in the Stark
effect (see Sec. 35).

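4) Calculate the total energy of the electrons in an atom in accordance with
the Thomas-Fermi statistical model.

From (44.10), the kinetic energy of the electrons is

$$\mathcal{E}_{kin} = \int_0^\infty\frac{2^{3/2}m^{3/2}}{5\pi^{2}h^{3}}\,(e\varphi)^{5/2}\cdot 4\pi r^{2}\,dr,$$

because the limiting kinetic energy of the electrons is $e\varphi$.

We substitute $Ze^{2}\psi/r$ instead of $e\varphi$ and go to nondimensional variables
(44.23). Then for $\mathcal{E}_{kin}$ we obtain

$$\mathcal{E}_{kin} = \frac{6}{5}\left(\frac{4}{3\pi}\right)^{2/3}\frac{me^{4}}{h^{2}}\,Z^{7/3}\int_0^\infty\frac{\psi^{5/2}}{\sqrt{x}}\,dx.$$

The potential energy is divided into two parts: the interaction energy of the
electrons with the nucleus, equal to

$$\mathcal{E}'_{pot} = -\int_0^\infty\frac{Ze^{2}}{r}\,n\cdot 4\pi r^{2}\,dr,$$

where the electron density $n$ is determined from (44.18), and the interaction energy
between the electrons themselves,

$$\mathcal{E}''_{pot} = \frac{1}{2}\int_0^\infty e\left(\frac{Ze}{r} - \varphi\right)n\cdot 4\pi r^{2}\,dr.$$

The factor $\tfrac{1}{2}$ takes into account that each electron should be counted once.
Combining both parts of the potential energy, we have

$$\mathcal{E}_{pot} = \mathcal{E}'_{pot} + \mathcal{E}''_{pot} = -\frac{1}{2}\int_0^\infty\left(e\varphi + \frac{Ze^{2}}{r}\right)n\cdot 4\pi r^{2}\,dr.$$

Substituting the quantity $n$, we arrive at the following expression for the
potential energy:

$$\mathcal{E}_{pot} = -\left(\frac{4}{3\pi}\right)^{2/3}\frac{me^{4}}{h^{2}}\,Z^{7/3}\left[\int_0^\infty\frac{\psi^{5/2}}{\sqrt{x}}\,dx + \int_0^\infty\frac{\psi^{3/2}}{\sqrt{x}}\,dx\right].$$

The integrals appearing in the energy expression are easily calculated using
equation (44.24). Namely,

$$\int_0^\infty\frac{\psi^{3/2}}{\sqrt{x}}\,dx = \int_0^\infty\psi''\,dx = -\psi'(0),$$

because $\psi'(\infty) = 0$. The second integral is transformed by parts:

$$\int_0^\infty\frac{\psi^{5/2}}{\sqrt{x}}\,dx = 2\sqrt{x}\,\psi^{5/2}\Big|_0^\infty - 5\int_0^\infty\sqrt{x}\,\psi^{3/2}\psi'\,dx = -5\int_0^\infty x\,\psi'\psi''\,dx = -\frac{5}{2}\,x\,\psi'^{2}\Big|_0^\infty + \frac{5}{2}\int_0^\infty\psi'^{2}\,dx = \frac{5}{2}\int_0^\infty\psi'^{2}\,dx,$$

since the integrated expressions are equal to zero. Further,

$$\int_0^\infty\psi'^{2}\,dx = \psi\psi'\Big|_0^\infty - \int_0^\infty\psi\psi''\,dx = -\psi'(0) - \int_0^\infty\frac{\psi^{5/2}}{\sqrt{x}}\,dx,$$

since $\psi(0) = 1$. Hence,

$$\int_0^\infty\frac{\psi^{5/2}}{\sqrt{x}}\,dx = -\frac{5}{7}\,\psi'(0).$$

Substituting these integral values in the expressions for $\mathcal{E}_{kin}$ and $\mathcal{E}_{pot}$,
we notice that $\mathcal{E}_{pot} = -2\,\mathcal{E}_{kin}$, so that the total energy is $-\mathcal{E}_{kin}$ (this result
is also obtained in the exact theory, and not only in the statistical model).

The quantity $\psi'(0)$ is equal to $-1.589$, whence we obtain the following
formula for the total binding energy of all the electrons in an atom:

$$\mathcal{E} = -0.769\,\frac{me^{4}}{h^{2}}\,Z^{7/3} = -20.94\,Z^{7/3}\ \text{ev}.$$

For example, for uranium $\mathcal{E} = -8\times10^{5}$ ev, or $-1.6\,mc^{2}$.

A relationship of the form $Z^{7/3}$ is also easy to obtain, without calculation,
in the following way. Coulomb forces fall off slowly with distance. Therefore,
all the electrons interact with each other in pairs, so that there are about $Z^{2}$
pairs. From (44.23), the mean distance between the electrons decreases like
$Z^{-1/3}$. This is what yields $Z^{7/3}$. We notice that in the case of nuclei the total
binding energy is proportional to the first power of the number of particles (within
wide limits). This points to the short-range character of nuclear forces: each
nucleon (i.e., proton or neutron) does not interact with all the other nucleons,
but only with the “nearest.”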
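
The numerical coefficient of Exercise 4 can be recomputed as a rough check. The sketch below is an illustration under stated assumptions: it takes the prefactor $(6/7)(4/3\pi)^{2/3}$ that follows from the standard Thomas-Fermi length unit (assumed here to be what (44.23) defines), the value $\psi'(0) = -1.589$ quoted in the text, and the modern value $me^{4}/h^{2} \approx 27.21$ ev, which gives 20.93 rather than the 20.94 printed above.

```python
import math

PSI_PRIME_0 = -1.589     # slope of the Thomas-Fermi function at the origin (from the text)
HARTREE_EV = 27.2114     # m e^4 / h^2 in electron-volts, with h the reduced constant

def thomas_fermi_coefficient(psi_prime_0=PSI_PRIME_0):
    """Total energy in units of (m e^4 / h^2) Z^(7/3): (6/7) (4 / 3 pi)^(2/3) psi'(0)."""
    return 6.0 / 7.0 * (4.0 / (3.0 * math.pi)) ** (2.0 / 3.0) * psi_prime_0

def thomas_fermi_energy_ev(z):
    """Total electron energy of a neutral atom of atomic number z, in ev."""
    return thomas_fermi_coefficient() * HARTREE_EV * z ** (7.0 / 3.0)

coefficient = thomas_fermi_coefficient()
uranium_ev = thomas_fermi_energy_ev(92)
uranium_in_mc2 = uranium_ev / 511.0e3   # electron rest energy is about 511 kev

print(coefficient, uranium_ev, uranium_in_mc2)
```

The uranium value comes out near $-8\times10^{5}$ ev, or about $-1.6\,mc^{2}$, in line with the figures quoted in the exercise.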


Sec. 45. Gibbs Statistics 

In this section we shall consider the general statistical method of
Gibbs applied to any system consisting of a sufficiently large number
of particles, irrespective of whether these systems are solid, liquid,
or gaseous. It is very difficult to treat this method rigorously by
proceeding only from the equations of quantum mechanics; it is probably
still more difficult to do so classically, since the concept of probability
does not exist in classical mechanics (to say nothing of the fact that
the application of classical mechanics to the motion of microparticles
is by no means always justified). The derivation of the basic principles
of Gibbs statistics is somewhat intuitive in character and is justified
by the fact that the statistics agree with a vast quantity of experimental
facts.

Of course, in principle, it would be well to substantiate the statistical
method in such a way as to be certain beforehand of its agreement
with experiment, proceeding from the sole fact that quantum mechanics
agrees with experiment; but, as yet, we have no such quantitative
treatment at our disposal.

The quasi-closed system. Fundamental to statistics is the concept 
of a quasi-closed system, i.e., a system occurring in weak interaction 
with the surrounding medium. This interaction does not essentially 
destroy the structure of the system, but governs transitions between 
those of its states which correspond to close separate energy levels of 
the closed system. In a system consisting of a sufficiently large number 
of particles, an energy interval (due to the quasi-closed nature of 
the system) contains an exceptionally large number of separate energy 
levels or, more exactly, states corresponding to separate, exceptionally 
close, energy levels of an ideal closed system. It is this that makes the 
application of statistics possible. 

Statistical equilibrium. As was shown in Sec. 39, all these separate
states are equally probable, in other words, the system spends the
same amount of time in each of them. If a study is being made of the
behaviour of a macroscopic system, the essential thing to know is not
its detailed state (characterized by a certain wave function), but a
large group of states to which the state of the system belongs most of
the time.

For example, let us consider $N$ particles of an ideal monatomic gas
that possess a total kinetic energy $\mathcal{E}$.

The separate states of the gas are equiprobable; in other words,
the state where a single particle has all the energy $\mathcal{E}$ and the remaining
particles have zero kinetic energy is as probable as that where all the
particles have an equal energy $\mathcal{E}/N$ and a strictly identical direction of
momentum along a chosen axis (for the time being, we ignore the
possibility of Pauli exclusion). But for the greater part of the time
the gas occurs in a state which is incomparably closer to the equilibrium
distribution of energy than to the exceptional state where a single
particle has obtained all the energy. Most probable is the state which is
described sufficiently well by a Bose, or a Fermi, distribution, depending
upon whether the gas particles have integral or half-integral spin.
If the gas is close to the most probable distribution, then, for
constant interaction conditions with the external medium, its
state will all the time be close to the most probable state. Any significant
deviation from the most probable state is of vanishingly small
probability.

The whole group of equally probable microscopic states in which a
system exists for the greater part of the time is called the statistical
equilibrium state of the system. It is defined in far less detail than in
the case of the states in quantum mechanics, but fully enough for a
description of the macroscopic system as a whole.

The concept of statistical equilibrium can be applied to any sufficiently
large system of particles, irrespective of whether they interact
as weakly as the particles of an ideal gas, or as strongly as the particles
in a solid or liquid body. It will be recalled that in proving the
equiprobability of microstates (Sec. 39) it was not assumed that the system
consisted of noninteracting particles. The greater the number of
microstates of a system, the more probable its state. Here, only those
microstates are taken into account which are compatible with the law
of conservation of energy, i.e., those belonging to the energy interval
of a quasi-closed system.

Probability distribution in subsystems. Instead of considering a
quasi-closed system in an external medium, it is more convenient to
proceed from a large ideally closed system and divide it into individual
quasi-closed subsystems. The quasi-closed nature of subsystems manifests
itself when the surface layer of each subsystem, through which
interaction with the surrounding subsystems is effected, produces
but a small effect on the processes taking place inside the volume.
Interaction between subsystems leads to the establishment of statistical
equilibrium over the whole large system, while equilibrium in a
subsystem is established by its internal interactions.

Let us suppose that equilibrium has been established inside a subsystem.
What is the probability that its energy is between $\mathcal{E}$ and
$\mathcal{E} + d\mathcal{E}$? To this interval there correspond $g(\mathcal{E})$ equally probable
microstates. As we know, $g(\mathcal{E})$ is called the weight of a state with a
given energy $\mathcal{E}$.

Since all the separate microstates are equally probable, the probability
$P(\mathcal{E})$ of the states is directly proportional to $g(\mathcal{E})$:

$$P(\mathcal{E}) = \rho(\mathcal{E})\,g(\mathcal{E}), \tag{45.1}$$

where $\rho(\mathcal{E})$ is a function which we have to determine in the present
section.

Separate quasi-independent subsystems may be regarded as very
large molecules of a Boltzmann gas. It is natural to consider that
such a “gas” is subject to Boltzmann statistics since macroscopic
subsystems differ from one another. It follows from this that the
distribution function must be of Boltzmann form:

$$\rho(\mathcal{E}) \sim e^{-\mathcal{E}/\theta}.$$

This result is preliminary and intuitive. A more strict derivation
is given below based on the properties of the function $\rho(\mathcal{E})$, which
will now be established.


Liouville’s theorem. We shall prove that the function $\rho(\mathcal{E})$ is constant
during the interval of time within which a quasi-closed system
may be regarded as closed, i.e., the remaining subsystems do not
noticeably affect its state.

The weight of a state $g(\mathcal{E})$ is defined by the number of microstates
whose energy is between $\mathcal{E}$ and $\mathcal{E} + d\mathcal{E}$. Each of these microstates is
characterized by a definite set of integrals of motion (for example, for
a monatomic ideal gas, the group of momenta of the separate systems).
Therefore, $g(\mathcal{E})$ is a constant quantity.

The probability $P(\mathcal{E})$ is defined as $\lim\,t(\mathcal{E})/t$ when $t$ tends to
infinity (see Sec. 39). Here, $t$ denotes the observation time for the whole
closed system, which includes the given quasi-closed subsystem. Therefore,
by its very meaning, $P(\mathcal{E})$ cannot depend upon time, because
this is a resultant average quantity for large intervals of time. But
if $P(\mathcal{E})$ is a constant quantity and $g(\mathcal{E})$, as a function of the integrals
of motion, is also constant, then $\rho(\mathcal{E})$ is also independent of time and
is an integral of motion. But since all the integrals of motion are, in
principle, known from mechanics, $\rho$ must be their function. In other
words, $\rho$ cannot depend upon quantities that vary with time, and,
apart from $\mathcal{E}$, depends only upon the integrals of motion. More exactly,
$\rho$ remains constant over intervals of time for which the quasi-closed
subsystem may be regarded as closed. The statement concerning the
constancy of $\rho(\mathcal{E})$ is known as Liouville’s theorem. At the end of this
section, a classical formulation of Liouville’s theorem will be given
that is more vivid than a quantum formulation.

The theorem of multiplication of probabilities. Over a certain interval
of time, quasi-closed subsystems may be regarded as independent.
Then the well-known theorem of probability multiplication can be
applied: the probability that one of the subsystems is in a state A and
another in a state B is equal to the product of the probabilities
corresponding to states A and B:

$$P_{AB} = P_A\,P_B. \tag{45.2}$$

The statistical weights of the states, $g_A$ and $g_B$, are of course
multiplied because they relate to different subsystems:

$$g_{AB} = g_A\cdot g_B.$$

Thus,

$$P_{AB} = P_A\,P_B = \rho_{AB}\,g_{AB} = \rho_A g_A\cdot\rho_B g_B. \tag{45.3}$$

It follows from formulae (45.2) and (45.3) that

$$\rho_{AB} = \rho_A\cdot\rho_B. \tag{45.4}$$

In other words, the probability density for two quasi-independent
subsystems is a multiplicative function, i.e., it is obtained by multiplying
the separate $\rho$ functions.


Gibbs distribution. The logarithm of probability density is an additive
quantity, i.e., it is equal to the sum of the logarithms of this
quantity for each subsystem separately:

$$\ln\rho_{AB} = \ln\rho_A + \ln\rho_B. \tag{45.5}$$

We know from Liouville’s theorem that $\ln\rho$ is, in addition, an
integral of motion. Hence, $\ln\rho$ is an additive integral of motion.

In Sec. 4 of Part One we listed the additive integrals of motion:
energy, linear momentum, and angular momentum. For $\ln\rho$ to be an
additive integral of motion, it must depend linearly upon energy,
linear momentum, and angular momentum. If we choose a reference
system in which the subsystem as a whole does not move, then the
linear momentum and angular momentum will be equal to zero and
the logarithm of the probability density will turn out to be a linear
function only of energy.

In other words, the following relationship results:

$$\ln\rho = a\mathcal{E} + b. \tag{45.6}$$

The coefficient $a$ must be the same for all subsystems of the large
system because, otherwise, $\ln\rho$ will not have the properties of an
additive function. If $a$ is the same for two subsystems, then these
two subsystems yield

$$\ln\rho_{AB} = \ln\rho_A + \ln\rho_B = a(\mathcal{E}_A + \mathcal{E}_B) + (b_A + b_B) = a\mathcal{E}_{AB} + b_{AB}, \tag{45.7}$$

whence the additivity of $\ln\rho$ can be seen.

The probability of an infinitely large energy must be infinitely small,
so that $a < 0$.

We shall write

$$a = -\frac{1}{\theta}. \tag{45.8}$$

The meaning of the quantity $\theta$ is the same as in the previous sections:
it is the temperature multiplied by the Boltzmann constant. Indeed,
for an ideal gas, a single molecule can be regarded as a separate
subsystem, and then the Gibbs distribution of the form

$$\rho = e^{\,b - \mathcal{E}/\theta}$$

becomes the Boltzmann distribution, proportional to $e^{-\varepsilon/\theta}$.

In addition, we denote

$$b = \frac{F}{\theta}. \tag{45.9}$$

Finally, the required distribution function is

$$\rho(\mathcal{E}) = e^{(F-\mathcal{E})/\theta}. \tag{45.10}$$

The normalization condition. The following condition is imposed
upon the function $\rho(\mathcal{E})$:

$$\sum_{\mathcal{E}} P(\mathcal{E}) = \sum_{\mathcal{E}}\rho(\mathcal{E})\,g(\mathcal{E}) = 1, \tag{45.11}$$

since $\sum t(\mathcal{E}) = t$. This simply means that the probability of finding a
subsystem in any of the possible states compatible with the conservation
laws is equal to unity. With the aid of the normalization condition
(45.11) we can express the quantity $F$ as a function of $\theta$.
It is sufficient for this to substitute the Gibbs distribution (45.10)
into (45.11) and perform summation over all possible states. The
factor $e^{F/\theta}$, as a constant quantity, is taken outside the summation.
Hence, we have the following equation for finding $F$:

$$e^{-F/\theta} = \sum_{\mathcal{E}} e^{-\mathcal{E}/\theta}\,g(\mathcal{E}). \tag{45.12}$$

As we know, the expression on the right-hand side is called a
statistical sum.
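
Equation (45.12) can be illustrated on a toy spectrum. The sketch below is only an illustration: the three energy levels and their weights are hypothetical values chosen for the example, and the function names are introduced here. It computes $F$ from the statistical sum and confirms that the resulting probabilities $P(\mathcal{E}) = e^{(F-\mathcal{E})/\theta} g(\mathcal{E})$ add up to unity, as (45.11) requires.

```python
import math

def free_energy(levels, theta):
    """F defined by exp(-F/theta) = sum of g * exp(-E/theta) over (E, g) pairs, eq. (45.12)."""
    statistical_sum = sum(g * math.exp(-energy / theta) for energy, g in levels)
    return -theta * math.log(statistical_sum)

def gibbs_probabilities(levels, theta):
    """P(E) = exp((F - E)/theta) * g(E) for every level, eqs. (45.1) and (45.10)."""
    f = free_energy(levels, theta)
    return [g * math.exp((f - energy) / theta) for energy, g in levels]

# Hypothetical spectrum: (energy, statistical weight) pairs in arbitrary units.
levels = [(0.0, 1), (1.0, 3), (2.0, 5)]
theta = 0.7
probabilities = gibbs_probabilities(levels, theta)

print(free_energy(levels, theta), probabilities, sum(probabilities))
```

Because $F$ is fixed by the normalization alone, any spectrum and any positive $\theta$ give probabilities that sum to one.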

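The mean energy of a subsystem. The mean values of quantities in
statistics are determined in the following way. Let the quantity $f$
assume a value $f_A$ in any state A. Then, if the probability of this state
is equal to $P_A$, the mean value will be defined as

$$\bar{f} = \sum_A f_A\,P_A. \tag{45.13}$$

For example,

$$\bar{\mathcal{E}} = \sum_A \mathcal{E}_A\,P_A = \sum_{\mathcal{E}}\mathcal{E}\,\rho(\mathcal{E})\,g(\mathcal{E}), \tag{45.14}$$

because

$$P(\mathcal{E}) = \rho(\mathcal{E})\,g(\mathcal{E}).$$

Let us substitute the Gibbs distribution in (45.14). Then we obtain
an expression for the mean energy:

$$\bar{\mathcal{E}} = \sum_{\mathcal{E}}\mathcal{E}\,e^{(F-\mathcal{E})/\theta}\,g(\mathcal{E}). \tag{45.15}$$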


Energy fluctuations. Generally, the state of a subsystem is, to a
certain extent, characterized by the mean quantity. For this purpose,
any statistics, and not only physical statistics, makes use of mean
quantities; a constant mean quantity makes it possible to estimate
the order of magnitude of a variable.

However, if a variable quantity exhibits a wide scatter, the mean
does not describe it sufficiently well.

Therefore, in addition to the mean value of the energy in a subsystem,
it is interesting to know its mean dispersion. These two mean
quantities define the state of a subsystem considerably better than
$\bar{\mathcal{E}}$ by itself. But if we average the quantity

$$\Delta\mathcal{E} = \mathcal{E} - \bar{\mathcal{E}}, \tag{45.16}$$

the result is identical zero. Indeed, a second averaging of the constant
quantity $\bar{\mathcal{E}}$ in no way changes it: the mean of $\bar{\mathcal{E}}$ is again equal to $\bar{\mathcal{E}}$.
Whence

$$\overline{\Delta\mathcal{E}} = \bar{\mathcal{E}} - \bar{\mathcal{E}} = 0. \tag{45.17}$$

It is therefore expedient to average the quantity $(\Delta\mathcal{E})^{2} = (\mathcal{E} - \bar{\mathcal{E}})^{2}$.
Since it is essentially positive, a deviation of $\mathcal{E}$ from the mean value
(to either side) makes a contribution. The desired mean quantity may
be written somewhat differently:

$$\overline{(\Delta\mathcal{E})^{2}} = \overline{(\mathcal{E} - \bar{\mathcal{E}})^{2}} = \overline{\mathcal{E}^{2} - 2\mathcal{E}\bar{\mathcal{E}} + \bar{\mathcal{E}}^{2}} = \overline{\mathcal{E}^{2}} - 2\bar{\mathcal{E}}\,\bar{\mathcal{E}} + \bar{\mathcal{E}}^{2} = \overline{\mathcal{E}^{2}} - \bar{\mathcal{E}}^{2}, \tag{45.18}$$

where we have taken advantage of the fact that the mean of a constant
quantity $\bar{\mathcal{E}}^{2}$ is equal to itself, and also that the constant factor
can be taken outside the averaging symbol:

$$\overline{\mathcal{E}\bar{\mathcal{E}}} = \bar{\mathcal{E}}\,\bar{\mathcal{E}} = \bar{\mathcal{E}}^{2}.$$

$\sqrt{\overline{(\Delta\mathcal{E})^{2}}}$ is called the absolute fluctuation of the energy. This quantity
characterizes the average extent to which the energy deviates from
its mean value. The ratio of the absolute fluctuation $\sqrt{\overline{(\Delta\mathcal{E})^{2}}}$ to the
modulus $|\bar{\mathcal{E}}|$ is called the relative fluctuation of the energy. It is a
measure of the relative fraction of the energy deviation from its mean
value.

Naturally, the definitions of absolute and relative fluctuation retain
their meaning also for other quantities that describe a subsystem, and
not only for energy.

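Calculating the energy fluctuation from the Gibbs distribution. We
now apply the Gibbs distribution to the calculation of energy fluctuation
in a subsystem. To do this, we differentiate, with respect to $\theta$,
the following two identities from which $F$ and $\bar{\mathcal{E}}$ are determined:

$$\sum_{\mathcal{E}} e^{(F-\mathcal{E})/\theta}\,g(\mathcal{E}) = 1.$$

This expresses the normalization condition (45.11), and the definition
of mean energy (45.15)

$$\bar{\mathcal{E}} = \sum_{\mathcal{E}}\mathcal{E}\,e^{(F-\mathcal{E})/\theta}\,g(\mathcal{E}).$$

$\mathcal{E}$ and $g(\mathcal{E})$ are purely mechanical quantities and do not depend upon
the Gibbs distribution parameter $\theta$. Therefore, only $F$, $\bar{\mathcal{E}}$, and, of
course, $\theta$ itself need be differentiated with respect to $\theta$. We thus
obtain

$$\sum_{\mathcal{E}}\left(\frac{1}{\theta}\frac{\partial F}{\partial\theta} - \frac{F-\mathcal{E}}{\theta^{2}}\right)e^{(F-\mathcal{E})/\theta}\,g(\mathcal{E}) = 0, \tag{45.19}$$

$$\sum_{\mathcal{E}}\left(\frac{1}{\theta}\frac{\partial F}{\partial\theta} - \frac{F-\mathcal{E}}{\theta^{2}}\right)\mathcal{E}\,e^{(F-\mathcal{E})/\theta}\,g(\mathcal{E}) = \frac{\partial\bar{\mathcal{E}}}{\partial\theta}. \tag{45.20}$$

From (45.19) we have

$$\frac{1}{\theta}\frac{\partial F}{\partial\theta} = \sum_{\mathcal{E}}\frac{F-\mathcal{E}}{\theta^{2}}\,e^{(F-\mathcal{E})/\theta}\,g(\mathcal{E}) = \frac{F-\bar{\mathcal{E}}}{\theta^{2}}. \tag{45.21}$$

Substituting this in (45.20) we find

$$\frac{\partial\bar{\mathcal{E}}}{\partial\theta} = \sum_{\mathcal{E}}\left(\frac{F-\bar{\mathcal{E}}}{\theta^{2}} - \frac{F-\mathcal{E}}{\theta^{2}}\right)\mathcal{E}\,e^{(F-\mathcal{E})/\theta}\,g(\mathcal{E}).$$

As a constant, the quantity $\bar{\mathcal{E}}$ may be taken outside the averaging
sign. In accordance with (45.18), this gives

$$\theta^{2}\,\frac{\partial\bar{\mathcal{E}}}{\partial\theta} = \overline{\mathcal{E}^{2}} - \bar{\mathcal{E}}^{2} = \overline{(\Delta\mathcal{E})^{2}}. \tag{45.22}$$

Whence the relative fluctuation is

$$\frac{\sqrt{\overline{(\Delta\mathcal{E})^{2}}}}{\bar{\mathcal{E}}} = \frac{\theta}{\bar{\mathcal{E}}}\sqrt{\frac{\partial\bar{\mathcal{E}}}{\partial\theta}}. \tag{45.23}$$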
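
The identity (45.22) lends itself to a direct numerical test. The following sketch assumes a hypothetical ladder of sixty equally spaced, non-degenerate levels and the value $\theta = 1.5$ (both chosen here for illustration only); it computes the variance of the energy from the Gibbs distribution and compares it with $\theta^{2}\,\partial\bar{\mathcal{E}}/\partial\theta$, the derivative being taken by a central difference.

```python
import math

def mean_energy(levels, theta, power=1):
    """Gibbs average of E**power over (E, g) pairs, as in eq. (45.15) for power = 1."""
    weights = [g * math.exp(-energy / theta) for energy, g in levels]
    total = sum(weights)
    return sum(w * energy ** power for w, (energy, _) in zip(weights, levels)) / total

def energy_variance(levels, theta):
    """Mean square deviation of the energy, eq. (45.18)."""
    return mean_energy(levels, theta, power=2) - mean_energy(levels, theta) ** 2

def theta_squared_slope(levels, theta, step=1e-5):
    """theta^2 times d(mean energy)/d(theta), left-hand side of eq. (45.22)."""
    slope = (mean_energy(levels, theta + step) - mean_energy(levels, theta - step)) / (2.0 * step)
    return theta ** 2 * slope

# Hypothetical spectrum: 60 equally spaced levels of unit weight.
levels = [(float(n), 1) for n in range(60)]
theta = 1.5
variance = energy_variance(levels, theta)
from_derivative = theta_squared_slope(levels, theta)
relative_fluctuation = math.sqrt(variance) / mean_energy(levels, theta)

print(variance, from_derivative, relative_fluctuation)
```

For a single small system such as this one the relative fluctuation is of order unity; it becomes small only when many such subsystems are combined.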


But this quantity is inversely proportional to the square root of the
number of particles in the subsystem, because the energy, as an
additive quantity, is proportional to the number of particles.

Let us illustrate this in the case of an ideal gas. From (40.17),
$\bar{\mathcal{E}} = \tfrac{3}{2}N\theta$. Hence, the relative fluctuation is $\sqrt{2/(3N)}$. For example,
$N = 2.7\times10^{19}$ for 1 cm$^{3}$ of a gas under normal conditions, so that the
relative energy fluctuation is a few parts in $10^{10}$.

For the greater part of its time, the energy of 1 cm$^{3}$ of gas differs
from its mean value by this small fraction. Nevertheless, in the later
statistical development it is more convenient to regard the energy of
a subsystem as only slightly fluctuating, and not strictly constant, as
in an absolutely closed system.

Naturally, the relative fluctuation for a separate gas molecule is
not a small quantity. Thus, from (40.14) and (40.15), the relative fluctuation
in the velocity of a molecule is

$$\frac{\sqrt{\overline{v^{2}} - \bar{v}^{2}}}{\bar{v}} = \sqrt{\frac{3\theta/m - 8\theta/\pi m}{8\theta/\pi m}} = \sqrt{\frac{3\pi}{8} - 1} \approx 0.42.$$

Thus, we have shown that the probability of a given value of subsystem
energy $\mathcal{E}$ has a very sharp maximum close to $\mathcal{E} = \bar{\mathcal{E}}$. This
maximum becomes sharper, the larger the subsystem.
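
The two numbers just quoted are easy to reproduce. The sketch below is a small illustration, assuming the Maxwell-distribution results $\bar{v}^{2} = 8\theta/\pi m$ and $\overline{v^{2}} = 3\theta/m$ (taken here to be what (40.14) and (40.15) state) and the particle number $N = 2.7\times10^{19}$ from the text.

```python
import math

def ideal_gas_relative_energy_fluctuation(n_particles):
    """sqrt(2 / (3 N)), from the mean energy (3/2) N theta and eq. (45.23)."""
    return math.sqrt(2.0 / (3.0 * n_particles))

def relative_speed_fluctuation():
    """Relative fluctuation of the speed of one molecule, in units where theta/m = 1."""
    mean_speed_squared = 8.0 / math.pi   # square of the mean speed
    mean_square_speed = 3.0              # mean of the squared speed
    return math.sqrt(mean_square_speed - mean_speed_squared) / math.sqrt(mean_speed_squared)

gas_fluctuation = ideal_gas_relative_energy_fluctuation(2.7e19)
molecule_fluctuation = relative_speed_fluctuation()

print(gas_fluctuation, molecule_fluctuation)
```

The contrast between a value near $10^{-10}$ for a cubic centimetre of gas and a value near 0.4 for a single molecule is the $1/\sqrt{N}$ behaviour described above.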

Entropy. Proceeding from the probability density for a subsystem,
we can form an analogous function for a closed system. Utilizing the
fact that probabilities are multiplicative quantities we obtain

$$P = \prod_i P_i = \prod_i \rho_i\,g_i = \prod_i\rho_i\cdot\prod_i g_i. \tag{45.24}$$

We now make use of the explicit form of the Gibbs distribution (45.10).
Then for the product of all the functions $\rho_i$ we obtain

$$\prod_i\rho_i = \prod_i e^{(F_i-\mathcal{E}_i)/\theta} = e^{\sum_i F_i/\theta}\cdot e^{-\sum_i\mathcal{E}_i/\theta} = e^{\left(\sum_i F_i - \sum_i\mathcal{E}_i\right)/\theta}.$$

But if the large system is closed, $\sum_i\mathcal{E}_i = \mathcal{E} = \text{const}$ and, hence,
$\prod_i\rho_i = \text{const}$ also. Thus, the probability of a state is proportional to
the statistical weight of that state:

$$P \sim \prod_i g_i = G. \tag{45.25}$$

And all states of the system with the same energy are equally probable:
the probability of $G(\mathcal{E})$ states is proportional to the number
$G(\mathcal{E})$ (see Sec. 39).

As has already been repeatedly pointed out, ideally closed systems
do not exist in nature. Speaking of a “closed” system, we mean that
equilibrium for its subsystems is established more rapidly than the
entire large system attains equilibrium with the surrounding medium.
During the time that it takes for equilibrium to be established between
the subsystems, the additive integrals of the large system do
not have time to change noticeably. Hence, we can distinguish between
the concept of statistical equilibrium in the whole system and in
its subsystems.

It is obvious that statistical equilibrium in the large system is
maintained for a longer period than the equilibrium established inside
the subsystems alone. Therefore, the probability of the fuller equilibrium
is, simply by definition, greater than the probability of the less
full equilibrium. From (45.25), a measure of the probability for a
large system is the statistical weight of its state. Therefore, the closer
a system is to statistical equilibrium, the greater the statistical weight
of the state of a closed system, so that $G(\mathcal{E})$ is a measure of the closeness
of a large system to equilibrium. In this way we may also regard
the quantity $g_i(\mathcal{E})$ of each $i$th subsystem as a measure of its closeness
to equilibrium (internally) for those time intervals during which the
subsystem may be regarded as quasi-closed.

For any, not too small, interval of time we can indicate systems
which remain almost closed during this interval. For them the quantity
$G$ is a measure of the equilibrium of their states: the larger $G$ is,
the closer the subsystems of a given “closed” system are to equilibrium.

It is more convenient, as a measure of the closeness of a system to
statistical equilibrium, to use $\ln G$ instead of the statistical weight $G$
itself, since it possesses the property of additivity. $\ln G$ is called the
entropy of a system and is denoted by the letter $S$:

$$S = \ln G. \tag{45.26}$$

It was shown in the previous section that the state of a Fermi gas
at absolute zero is defined uniquely: $G = 1$. Hence, the entropy of a
Fermi gas at $\theta = 0$ is $\ln 1 = 0$. At absolute zero, a Bose gas occurs
completely in a zero energy state (see Sec. 43). Hence, its state is also
uniquely defined, i.e., $S = 0$.

The entropy of a subsystem. By definition entropy is an additive
quantity, so that

$$S = \ln G = \ln\prod_i g_i = \sum_i\ln g_i = \sum_i S_i. \tag{45.27}$$

It is seen from this equation that it is natural to call $\ln g_i = S_i$ the
entropy of a subsystem. In order to calculate it, it is convenient to
make use of the Gibbs distribution function for a subsystem. As was
shown in this section, the energy of a quasi-closed subsystem is very
close to a constant value, namely, to its average value, but is not
strictly equal to it. Therefore, the formula for the entropy of a subsystem
can be successfully applied also to a “closed” system, whose
energy is strictly constant. The error here is determined by the relative
fluctuation of quantities in the subsystem, i.e., it is negligibly
small.

The entropy of a quasi-closed subsystem, equal to $\ln g_i(\mathcal{E}_i)$, should
be represented as $\ln g_i(\bar{\mathcal{E}}_i)$, where $\bar{\mathcal{E}}_i$ is the mean value of the energy
in a given subsystem in the case of “frozen” interaction with other
subsystems. In other words, in finding this value $\bar{\mathcal{E}}_i$ it is taken that
the subsystems do not arrive at a more complete equilibrium during
the time considered.

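We now take advantage of the fact that the energy fluctuations
are small, and can replace the normalization condition (45.11) by the
following simple relationship:

$$\sum_{\mathcal{E}_i}\rho_i(\mathcal{E}_i)\,g_i(\mathcal{E}_i) \approx \rho_i(\bar{\mathcal{E}}_i)\,g_i(\bar{\mathcal{E}}_i) = 1. \tag{45.28}$$

Substituting $g_i(\bar{\mathcal{E}}_i)$ into the definition for the entropy of a subsystem,
we find

$$S_i = \ln g_i(\bar{\mathcal{E}}_i) = \ln\frac{1}{\rho_i(\bar{\mathcal{E}}_i)}. \tag{45.29}$$

But since a logarithm is a slowly varying function, we can replace
the logarithm of $\rho_i$ taken at the mean energy by the mean value of the
logarithm:

$$S_i = \overline{\ln\frac{1}{\rho_i}}. \tag{45.30}$$

The resultant error is the smaller, the larger the subsystem, because
the relative fluctuations tend to zero as the subsystem increases.

Substituting $\rho_i$ from the Gibbs distribution (45.10) into (45.30),
we find the following expression for entropy (omitting the index $i$):

$$S = \overline{\ln\frac{1}{\rho}} = -\overline{\ln\left(e^{(F-\mathcal{E})/\theta}\right)} = \frac{\bar{\mathcal{E}} - F}{\theta}. \tag{45.31}$$

Comparing this with (45.21), we obtain

$$S = -\frac{\partial F}{\partial\theta}. \tag{45.32}$$

Replacing $F$ by $\bar{\mathcal{E}} - \theta S$ in this equation, we find, after differentiation,

$$S = -\frac{\partial(\bar{\mathcal{E}} - \theta S)}{\partial\theta} = -\frac{\partial\bar{\mathcal{E}}}{\partial\theta} + S + \theta\,\frac{\partial S}{\partial\theta}; \qquad \theta\,\frac{\partial S}{\partial\theta} = \frac{\partial\bar{\mathcal{E}}}{\partial\theta}.$$

Here, differentiation occurs with respect to $\theta$ under constant external
conditions, of which $\bar{\mathcal{E}}$ and $F$ can also be a function. Eliminating $d\theta$
we find

$$\theta = \frac{d\bar{\mathcal{E}}}{dS}. \tag{45.33}$$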


Phase space. In conclusion we give a classical formulation of Liouville's
theorem. In classical mechanics particle coordinates and momenta exist
simultaneously. A system of $N$ particles is characterized by a set of $3N$ coordinates $q_i$
and $3N$ momenta $p_i$. As was shown in Sec. 10, these variables may be regarded
as independent. In order to make the set of $6N$ variables more vivid, we shall
plot them on the axes of a $6N$-dimensional coordinate system. This does not
introduce anything fundamentally new but it somewhat simplifies the consideration
due to associations with two- and three-dimensional space which are
inherent in geometrical modes of expression. Such an imaginary $6N$-dimensional
space is termed a phase space in mechanics. A single point in phase space
specifies the state of a whole mechanical system, because the point is defined
by all $3N$ coordinates and $3N$ momenta. As time passes the momenta and
coordinates vary in accordance with Hamilton's equations (10.18):

$$\frac{dq_i}{dt} = \frac{\partial H}{\partial p_i}, \qquad \frac{dp_i}{dt} = -\frac{\partial H}{\partial q_i}. \tag{45.34}$$


A point describing the state of a system moves in phase space, tracing
out a path called the phase trajectory. As an example, let us find the phase
trajectory of a system with one degree of freedom, the linear harmonic
oscillator. It is obtained from the energy conservation law

$$\frac{p^2}{2m} + \frac{m \omega^2 x^2}{2} = H = \mathcal{E} = \text{const}.$$

This is the equation of an ellipse with semiaxes $\sqrt{2m\mathcal{E}}$ and $\sqrt{2\mathcal{E}/(m\omega^2)}$.
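
The ellipse can be checked numerically. The following is only a sketch: it assumes the standard solution $x = x_{max} \cos \omega t$, $p = m\dot{x}$ of the oscillator, and the function names and the default values of `m` and `omega` are illustrative choices, not taken from the text.

```python
import math


def oscillator_state(t, energy, m=1.0, omega=2.0):
    """Return x(t), p(t) and the two semiaxes for a given energy."""
    x_max = math.sqrt(2.0 * energy / (m * omega ** 2))  # coordinate semiaxis
    p_max = math.sqrt(2.0 * m * energy)                 # momentum semiaxis
    x = x_max * math.cos(omega * t)
    p = -p_max * math.sin(omega * t)                    # p = m * dx/dt
    return x, p, x_max, p_max


def ellipse_residual(t, energy, m=1.0, omega=2.0):
    """(p/p_max)^2 + (x/x_max)^2 - 1; vanishes if the point lies on the ellipse."""
    x, p, x_max, p_max = oscillator_state(t, energy, m, omega)
    return (p / p_max) ** 2 + (x / x_max) ** 2 - 1.0
```

Evaluating `ellipse_residual` at several instants gives values that differ from zero only by rounding error, so the phase point stays on the same ellipse at all times.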

Classical Liouville's theorem. We shall now consider a set of systems with
the same Hamiltonian, but with somewhat different initial conditions. These
systems will move along different phase paths. Let us suppose that the initial
points densely fill some volume element in phase space. As each point moves,
this volume element will be displaced similar to the volume of a flowing liquid.
We shall prove that the magnitude of the volume remains unchanged when the
points move, so that the motion resembles the flow of an incompressible liquid.

A volume element of phase space is expressed analogously to a three-dimensional
volume element:

$$d\Gamma \equiv dp_1 \cdots dp_{3N}\, dq_1 \cdots dq_{3N}. \tag{45.35}$$

We assume that at the initial instant of time there are $dn$ phase points in this
volume element, i.e., the initial coordinates and momenta of the $dn$ systems
with the same Hamiltonian are contained between $p_1$ and $p_1 + dp_1$, $p_2$ and
$p_2 + dp_2$, etc. The density of the systems in phase space is determined by the
usual equation

$$dn = \rho\, d\Gamma. \tag{45.36}$$


We shall now prove that the density $\rho$ does not change in the case of motion
of the phase points. For simplicity in notation, we shall consider a two-dimensional
space, though all the reasoning can be directly extended to a $6N$-dimensional
space. We shall proceed from the fact that the total number of points $dn$ is
conserved in motion. Therefore, a decrease in the number of points inside some
fixed volume $dV$ in unit time is equal to their flux across the surface bounding
the volume. In the case of a two-dimensional region $dV = dp\, dq$, the flux across
the side $dp$ is equal to $\rho(q)\, \dot{q}(q)\, dp$ (see Fig. 54), and the flux across the opposite
side is $\rho(q + dq)\, \dot{q}(q + dq)\, dp$. The total flux across all four sides is

$$\rho(q + dq)\, \dot{q}(q + dq)\, dp - \rho(q)\, \dot{q}(q)\, dp + \rho(p + dp)\, \dot{p}(p + dp)\, dq - \rho(p)\, \dot{p}(p)\, dq = \left[ \frac{\partial (\rho \dot{q})}{\partial q} + \frac{\partial (\rho \dot{p})}{\partial p} \right] dp\, dq,$$

where the same expansions have been used as in the derivation of the
Gauss-Ostrogradsky theorem (Sec. 11). The resulting flux must be equated to the
decrease of particles in the volume, i.e., to the quantity $-\frac{\partial \rho}{\partial t}\, dp\, dq$. As a
result, the same equation is obtained as that of charge conservation in
electrodynamics [see (12.18)]:

$$\frac{\partial \rho}{\partial t} = -\frac{\partial (\rho \dot{q})}{\partial q} - \frac{\partial (\rho \dot{p})}{\partial p}. \tag{45.37}$$


Fig. 54 


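But from Hamilton's equations (45.34)

$$\frac{\partial \dot{q}}{\partial q} = -\frac{\partial \dot{p}}{\partial p} = \frac{\partial^2 H}{\partial p\, \partial q},$$

so that equation (45.37) may be rewritten as

$$\frac{d\rho}{dt} \equiv \frac{\partial \rho}{\partial t} + \dot{q}\, \frac{\partial \rho}{\partial q} + \dot{p}\, \frac{\partial \rho}{\partial p} = 0. \tag{45.38}$$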


In other words, the density of phase points is an 
integral of motion. This is Liouville’s theorem in 
classical formulation. 


Strictly speaking, the classical density function p defined here has a meaning 
other than the quantum probability density in formula (45.1). We shall now 
show in what way these concepts are closely related. 

If we observe the same quasi-closed subsystem very many times, taking
different instants of time as origins, the result will be something in the nature
of a set of identical systems with differing initial conditions. The definition of
phase point density (45.36) is analogous to the definition of probability density
(45.1), because, to the classical approximation, statistical weight is written in
the following manner:

$$dG = \frac{d\Gamma}{(2\pi h)^{3N}} \tag{45.39}$$

[cf. (39.32)], putting $dV = dx\, dy\, dz$, i.e., for an infinitely small volume.

The difference in the two definitions of density is that the probability of
a state in quantum mechanics can be determined directly as a limit,
while in classical mechanics we are to understand by probability the limit
$\lim_{n \to \infty} n(\mathcal{E})/n$,
where $n(\mathcal{E})$ denotes the number of times that a phase point appears in
the region of the state under consideration with given energy and $n$ is the
total number of observations. All the individual results of observations are
regarded as equally probable. But the proof of this equiprobability is the basic
difficulty in a nonquantum version of statistics.




Closed and quasi-closed systems. We have seen that two methods 
of statistical construction are possible. 

1) Proceeding from an ideally closed system, in which the energy
is strictly conserved, the problem is to determine the entropy as
$\ln G(\mathcal{E})$. The state for which the entropy is maximum is the
equilibrium state. The temperature is then determined as
$\theta = \partial \mathcal{E}/\partial S$.

2) Proceeding from a quasi-closed system, introduce temperature
from the Gibbs distribution. Then entropy is calculated from the
formula $\overline{\ln (1/\rho)}$.

Both methods are, of course, equivalent, but it is far more convenient
to use the Gibbs distribution in various applications.


Exercises 

1) Verify Liouville's theorem in the case of four points in a gravitational
field. At the initial instant of time, the points form a rectangle with sides
$\Delta z$ and $\Delta p_z$ in a two-dimensional phase plane.

Two points possessing larger initial momenta will fall faster. Therefore, in
the motion of the phase points, the rectangle becomes a parallelogram. Its height
will equal $\Delta p_z$ and its base $\Delta z$, so that its area will remain equal to the initial
area of the rectangle. Conservation of phase volume is equivalent to conservation
of phase density, i.e., to Liouville's theorem.
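
The statement of Exercise 1 can be illustrated by a short calculation. This is a sketch only, assuming motion in a uniform field, $z(t) = z_0 + p_0 t/m - g t^2/2$, $p_z(t) = p_0 - mgt$; the names `fall`, `polygon_area`, `phase_area` and all numerical values are illustrative and do not come from the text.

```python
def fall(z0, p0, t, m=1.0, g=9.8):
    """Phase point (z, p_z) at time t for a particle in a uniform field."""
    z = z0 + p0 * t / m - 0.5 * g * t * t
    p = p0 - m * g * t
    return z, p


def polygon_area(points):
    """Area of a polygon from its vertices, taken in order (shoelace formula)."""
    total = 0.0
    n = len(points)
    for k in range(n):
        z1, p1 = points[k]
        z2, p2 = points[(k + 1) % n]
        total += z1 * p2 - z2 * p1
    return abs(total) / 2.0


def phase_area(t, z0=0.0, p0=1.0, dz=0.3, dp=0.5):
    """Area at time t of the figure that was initially a dz-by-dp rectangle."""
    corners = [(z0, p0), (z0 + dz, p0), (z0 + dz, p0 + dp), (z0, p0 + dp)]
    return polygon_area([fall(z, p, t) for z, p in corners])
```

For these values `phase_area(t)` stays equal to `dz * dp` at any `t`, up to rounding: the rectangle is sheared into a parallelogram of the same area.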

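2) Verify Liouville's theorem for the three harmonic oscillators:

$$p_1 = \sqrt{2m\mathcal{E}}\, \sin \omega t; \quad p_2 = \sqrt{2m(\mathcal{E} + \Delta\mathcal{E})}\, \sin \omega t; \quad p_3 = \sqrt{2m\mathcal{E}}\, \sin (\omega t + \delta);$$

$$x_1 = \sqrt{\frac{2\mathcal{E}}{m\omega^2}}\, \cos \omega t; \quad x_2 = \sqrt{\frac{2(\mathcal{E} + \Delta\mathcal{E})}{m\omega^2}}\, \cos \omega t; \quad x_3 = \sqrt{\frac{2\mathcal{E}}{m\omega^2}}\, \cos (\omega t + \delta).$$

The area of the triangle is expressed in terms of the coordinates of its vertices
in the following way:

$$\text{area} = \frac{1}{2} \left[ (p_2 x_3 - p_3 x_2) - (p_1 x_3 - p_3 x_1) + (p_1 x_2 - p_2 x_1) \right],$$

whence, substituting coordinates and momenta, the statement is proved.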
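
The substitution in Exercise 2 can also be done numerically. The sketch below assumes the three phase points in the form $p = \sqrt{2m\mathcal{E}} \sin(\omega t + \text{phase})$, $x = \sqrt{2\mathcal{E}/(m\omega^2)} \cos(\omega t + \text{phase})$; the function names and the sample values of the energy, its increment, the phase shift, `m` and `omega` are illustrative assumptions.

```python
import math


def vertices(t, energy=1.0, d_energy=0.2, delta=0.3, m=1.0, omega=2.0):
    """Phase points (p, x) of the three oscillators at time t."""
    def point(e, phase):
        p = math.sqrt(2.0 * m * e) * math.sin(omega * t + phase)
        x = math.sqrt(2.0 * e / (m * omega ** 2)) * math.cos(omega * t + phase)
        return p, x

    return [point(energy, 0.0), point(energy + d_energy, 0.0), point(energy, delta)]


def triangle_area(points):
    """Area of the triangle from the coordinates of its vertices."""
    (p1, x1), (p2, x2), (p3, x3) = points
    signed = 0.5 * ((p2 * x3 - p3 * x2) - (p1 * x3 - p3 * x1) + (p1 * x2 - p2 * x1))
    return abs(signed)
```

Each bracket has the form $p_a x_b - p_b x_a$, which is proportional to the sine of the phase difference and so does not contain the time; accordingly `triangle_area(vertices(t))` returns the same value for every `t`.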

3) Find how the entropy of an ideal Boltzmann gas, consisting of $N$ separate
atoms, depends upon energy and volume.

We proceed from the definitions (45.39) and (45.26):

$$S = \ln G = \ln \int \frac{d\tau_{p_1}\, d\tau_{p_2} \cdots d\tau_{p_N}\; dx_1 \cdots dz_N}{(2\pi h)^{3N}} = \ln \frac{V^N \int d\tau_{p_1} \cdots d\tau_{p_N}}{(2\pi h)^{3N}},$$

where $d\tau_p = dp_x\, dp_y\, dp_z$ for a separate atom.

It is convenient, first of all, to evaluate the integral for all states in which
the energy of the gas is less than the given value $\mathcal{E}$. The momenta of the atoms
for all states with energy less than $\mathcal{E}$ satisfy the inequality

$$p_{x1}^2 + p_{y1}^2 + p_{z1}^2 + p_{x2}^2 + p_{y2}^2 + p_{z2}^2 + \ldots + p_{xN}^2 + p_{yN}^2 + p_{zN}^2 < 2m\mathcal{E}$$

(the momenta of the atoms are numbered, corresponding to nonquantum
statistics). The $3N$-dimensional space region over which the integration is
performed is analogous to a sphere in a three-dimensional space. The coordinates
of the points inside the sphere satisfy the inequality

$$x^2 + y^2 + z^2 < R^2,$$

where $R$ is the radius of the sphere. The radius of the $3N$-dimensional sphere is
$\sqrt{2m\mathcal{E}}$. Therefore, it is clear from the dimensionality that the volume in a
$3N$-dimensional space is proportional to $(2m\mathcal{E})^{3N/2}$, just as in three dimensions
it is proportional to $R^3$ (the $N$-dependent coefficient in this exercise will not be
determined). The number of states between $\mathcal{E}$ and $\mathcal{E} + d\mathcal{E}$ is proportional to
the derivative of this volume with respect to energy, i.e., it involves
the term $\mathcal{E}^{3N/2 - 1}$.

The entropy is equal to the logarithm of the statistical weight, so that it involves
$\left( \frac{3N}{2} - 1 \right) \ln \mathcal{E}$. Neglecting unity compared with $\frac{3N}{2}$, we obtain a
formula expressing entropy in terms of the energy and volume of the gas:

$$S = \frac{3N}{2} \ln \mathcal{E} + N \ln V + \text{const}.$$
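
As a check on the result of Exercise 3, the temperature can be found from it by means of (45.33). The sketch below assumes the entropy in the form $S = \frac{3N}{2} \ln \mathcal{E} + N \ln V$ with the constant dropped, and evaluates $\theta = \partial \mathcal{E}/\partial S$ at constant volume by a central difference; the function names and the step size are illustrative.

```python
import math


def entropy(energy, volume, n_atoms):
    """S = (3N/2) ln E + N ln V, with the additive constant omitted."""
    return 1.5 * n_atoms * math.log(energy) + n_atoms * math.log(volume)


def temperature(energy, volume, n_atoms, rel_step=1e-6):
    """theta = dE/dS at constant volume, estimated by a central difference."""
    h = energy * rel_step
    d_entropy = entropy(energy + h, volume, n_atoms) - entropy(energy - h, volume, n_atoms)
    return 2.0 * h / d_entropy
```

The value returned agrees with $2\mathcal{E}/3N$ whatever the volume, i.e., the energy of the gas is $\frac{3}{2} N \theta$, as should be the case for a monatomic ideal gas.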


Sec. 46. Thermodynamic Quantities 

Statistics and thermodynamics. The results of the previous section 
may appear somewhat abstract if they are not related to the real, 
measurable properties of macroscopic bodies, for example, specific 
heat, thermal expansion, compressibility under pressure, etc. In
turn, these properties are determined as derivatives of energy and
volume with respect to temperature, of volume with respect to pressure,
and, in general, as derivatives of various mean values and parameters
in the Gibbs distribution with respect to other quantities.

A quantity such as $\theta$ (called the distribution modulus) is essentially
defined by the way that it appears in the Gibbs distribution, since,
in calculating mean values, $\theta$ appears under the summation or integral
sign as a parameter. Therefore $\theta$ is one of the quantities specifying
the macroscopic state of a system, since the mean values characterizing
this state are related to it.

The properties of the mean macroscopic quantities defining the 
state of a body form the subject of thermodynamics. These properties 
are expressed in the form of a series of relationships, differential
and integral, which will be obtained and interpreted in the present
section.

Historically, thermodynamics appeared before statistics. It was 
usual to base it upon two postulates or laws (see below) which were 
supported by a vast quantity of experimental facts. The “laws” are 
now no longer postulates, since they are based on statistical methods. 

It must not be thought that with the advent of statistics thermodynamics
lost its importance. Thermodynamics shows how real,
experimentally observed macroscopic quantities defining the thermal,
chemical, etc., properties of macroscopic bodies are interrelated.
In those cases where it is impossible to calculate some quantity
by statistical methods (due to a lack of knowledge of the elementary
laws of force interaction, or because of great mathematical complexity),
thermodynamics shows how this quantity may be found,
directly or indirectly, from measurement.

But statistics is not only the basis of thermodynamics. Above 
all, statistics indicates the way that thermodynamic quantities 
can be calculated from the microscopic structure of bodies. In addition, 
statistics makes it possible to calculate in advance by how much
the actual values differ from their mean values. As we have seen,
this type of deviation is measured by the fluctuations of the quantities
[see (45.22)]. Under certain conditions, fluctuations manifest
themselves in such a way that they can be recorded experimentally
(see Sec. 48).

Thermodynamics is one of the most important chapters of statistical
physics. Therefore, we shall describe it on the basis of statistics,
preferring a systematic to an historical account.

Quantity of heat. A quasi-closed macroscopic system spends the 
greater part of its time in a state of statistical equilibrium. In this 
state, the actual values of quantities are almost constant, and are 
close to their mean values. 

The closeness to the equilibrium in a system is defined by the 
total entropy of all its subsystems, on the assumption that inter¬ 
action of the system with the surrounding medium occurs considerably 
more slowly than interaction between its subsystems. 

We shall now consider in what way interaction between subsystems 
affects the macroscopic quantities characterizing their states. 

Let two bodies be brought into contact in such a way that the 
external conditions and the number of particles in each of them 
remain unchanged. Then, in accordance with the differential equation 
(45.33), the mean energy increment of each subsystem is proportional 
to its entropy increment: 

$$d\bar{\mathcal{E}} = \theta\, dS, \tag{46.1}$$

where, after the stipulations concerning the constancy of external 
conditions and the dimensions of the bodies have been made, the 
partial differentials are replaced by total differentials. 

The total energy increment for two bodies isolated from external 
action is equal to zero: 

$$d\bar{\mathcal{E}}_1 + d\bar{\mathcal{E}}_2 = 0. \tag{46.2}$$

The total entropy increment is positive or equal to zero, because,
as a result of interaction, the bodies will arrive at statistical equilibrium
between themselves, and this equilibrium is, of course, more
complete than the equilibrium inside each of them. Hence,

$$dS_1 + dS_2 \geq 0. \tag{46.3}$$

Using (46.1) and (46.2), we obtain

$$d\bar{\mathcal{E}}_1 \left( \frac{1}{\theta_1} - \frac{1}{\theta_2} \right) \geq 0. \tag{46.4}$$

If $\theta_1 > \theta_2$, then $d\bar{\mathcal{E}}_1 < 0$, i.e., the first system transmits energy
to the second. The transmission of energy is entirely due to contact
interaction, i.e., to the microscopic forces between molecules. The
energy transferred in this manner is termed heat, so that heat is not
"a form of energy" but a mode of energy transmission (we shall examine
this question in more detail further on).
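
The direction of heat transfer can be illustrated on two monatomic ideal gases in contact. This is a sketch under stated assumptions: the entropy of each gas is taken as $S = \frac{3N}{2} \ln \mathcal{E}$ (the volume term is constant and is dropped), so that $\theta = 2\mathcal{E}/3N$; the function names and the numbers used are illustrative.

```python
import math


def entropy(energy, n_atoms):
    """Energy-dependent part of the entropy of a monatomic ideal gas."""
    return 1.5 * n_atoms * math.log(energy)


def theta(energy, n_atoms):
    """Distribution modulus from theta = dE/dS for the entropy above."""
    return 2.0 * energy / (3.0 * n_atoms)


def total_entropy_change(e1, n1, e2, n2, d_e1):
    """Change of S1 + S2 when body 1 gains d_e1 and body 2 loses the same amount."""
    before = entropy(e1, n1) + entropy(e2, n2)
    after = entropy(e1 + d_e1, n1) + entropy(e2 - d_e1, n2)
    return after - before
```

With body 1 the hotter one, `total_entropy_change` is positive only for negative `d_e1`, in agreement with (46.4). When the two values of `theta` coincide, a transfer in either direction lowers the total entropy.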

In formula (46.4), $\theta_1$ and $\theta_2$ are parameters in the Gibbs distribution
for each of the subsystems separately. As long as these parameters
differ, the subsystems cannot occur as a unit in a state of
statistical equilibrium. Approximation to equilibrium occurs as a
result of heat transfer, with the heat always going to the subsystem
in which the parameter $\theta$ is smaller. Only when $\theta_1$ and $\theta_2$ are the same
are macroscopic quantities of heat no longer transferred, and the
energy of each subsystem exhibits only small fluctuations in the
vicinity of its equilibrium value. If one of the systems is an ideal
Boltzmann gas, then, as we know, $\theta$ is proportional to the absolute
temperature, since the Gibbs distribution for a gas as a whole leads
to a Boltzmann distribution for the individual molecules with the
same parameter $\theta$. The absolute temperature of a gas can be determined
from independent (not thermal) measurements in the Clapeyron
equation $pV = RT$. It is natural to consider that the quantity $\theta$,
for any system other than an ideal gas, is also nothing other than
temperature. If a system is in equilibrium with an ideal gas, then
its value $\theta$ is proportional to the absolute temperature of the gas.
Thus, $\theta$ is the temperature measured in absolute units (ergs) if the
ideal gas is taken as a thermometric substance. A little later in this
section a definition of temperature will be given which does not
depend upon the choice of the thermometric substance.

A Gibbs distribution occurs for any group of quasi-independent
subsystems, including those that have not arrived at a state of mutual
statistical equilibrium. Even though the quantity $\theta$ in this case,
too, is, by definition, the same for all subsystems, which follows
from the multiplicativity of the distribution function $\rho(\mathcal{E})$ [see
(45.4)-(45.8)], it must not be regarded as equal to the temperature
of the large system, which, generally speaking, cannot be defined
for a system not in equilibrium. If the subsystems in this case are
in internal equilibrium, they are characterized by their own Gibbs
distribution, which cannot be represented by a factor involved in
the Gibbs distribution of the large system because the parameters $\theta$
of both distributions are different. Thus, the distribution modulus $\theta$
of an equilibrium system is a measure of its temperature.

Taking the example of temperature, it can be seen that quantities 
which are defined statistically can be identified with actually measured 
thermodynamic quantities. Any statistical quantity can be regarded 
as defined when, and only when, there is given a unique group of 
operations (of measurement and calculation) relating this quantity 
to real macroscopic quantities or to the microscopic parameters 
of a system which are found (or can be found) from experiment. 

Work. The Hamiltonian function of a system usually depends
not only upon generalized coordinates and momenta that vary
according to dynamical laws, but also upon certain arbitrarily chosen
parameters. The intensity of the external field, for example, may
be such a parameter. The energy spectrum of the system, and hence
the mean energy $\bar{\mathcal{E}}$ also, depends upon the parameters appearing
in the Hamiltonian.

These parameters, transformed according to a given law, are
termed the external parameters of the system. We denote them by
the letter $\lambda$, where $\lambda$ may mean any quantity of this type. As $\lambda$ varies,
the mean energy of the system also varies. It is obvious that it can
only vary at the expense of some external source of energy. Since $\lambda$
is a mechanical and not statistical quantity ($\lambda$ is involved in the
Hamiltonian!), the variation in $\lambda$ is due to certain external mechanical
work performed on the system, for example, a falling weight or a
rotating motor.

The mechanical work performed with changing $\lambda$ can be represented
as

$$dA = \Lambda\, d\lambda, \tag{46.5}$$

where it is natural to call the quantity $\Lambda$ the generalized force (since
work is equal to the product of "force" $\Lambda$ and "distance" $d\lambda$). If
the entire energy change is due only to change in the external parameter
$\lambda$, then

$$d\bar{\mathcal{E}} = \overline{\frac{\partial \mathcal{E}}{\partial \lambda}}\, d\lambda. \tag{46.6}$$

In formula (46.5), $dA$ is the work performed on an external object
due to a decrease in the energy of the system, so that $dA = -d\bar{\mathcal{E}}$.

Comparing (46.5) and (46.6), we see that the mean quantity $\overline{\partial \mathcal{E}/\partial \lambda}$
is equal to the generalized force taken with opposite sign:

$$\overline{\frac{\partial \mathcal{E}}{\partial \lambda}} = -\Lambda. \tag{46.7}$$


The most frequent external parameter of a system is the volume 
that it occupies. In mechanical terms, this may be visualized by 
considering that the potential energy of any particle included in 
the given system is equal to infinity beyond the boundaries of the 
volume, i.e., an infinite amount of work is required even to remove 
a single particle from the volume. This is how the volume appears 
in the Hamiltonian of a system. 

We shall consider that a system occupies the volume of some cylinder
with a movable piston. The force acting on unit area of the piston
is called the pressure and is denoted by the letter $p$. Then, if the
whole area of the piston is $f$, the force acting on it is $pf$. When the
piston is displaced through a distance $dx$, a quantity of work $dA = pf\, dx$
is performed on it. But the product $f\, dx$ is equal to the volume
increment $dV$ of the system. Hence, the change in energy
for the system is

$$d\bar{\mathcal{E}} = -p\, dV. \tag{46.8}$$

This type of energy change, produced by a change in the external
parameters, is called work performed on a system.

It can be seen from formula (46.8) that pressure is a generalized
force $\Lambda$ related to a volume increment $dV$.

The first law of thermodynamics. It has already been pointed out
that energy can be transmitted from one body to another by purely
contact interaction, without any change in the macroscopic parameters
or in particle interchange. This type of energy transfer is
called heat transfer. It can be seen from this definition that the total
energy increment of a body is made up of the work $dA$, performed
by the body, and the quantity of heat $dQ$ transmitted to the body:

$$d\bar{\mathcal{E}} = dQ - dA. \tag{46.9}$$

The sign in front of $dA$ denotes that, if the body performs work,
its energy diminishes.

The statement (46.9) looks like an identity from which the quantity
of heat $dQ$ is defined. Indeed, proceeding from a statistical interpretation
of thermodynamics, we can be sure beforehand that the
law of conservation of energy is applicable to thermal processes.
Any energy imparted to a system without change of its external
parameters must be transmitted by a contact. It is this energy that
we have called the quantity of heat.

But thermodynamics appeared before statistics. Equation (46.9), 
interpreted thermodynamically, signifies that a quantity of heat 
can be measured in units of mechanical work, or that work can be 
measured in heat-quantity units. In other words, equation (46.9) 
extends the law of conservation of energy to thermal phenomena. 
Therefore, the establishment of the mechanical equivalent of heat
by Mayer, Joule and Helmholtz signified a whole stage in the development
of physical knowledge. Even though the earlier views on
heat, as the disguised motion of molecules, were close to the modern
statistical interpretation of thermal phenomena, they did not as
yet contain any quantitative relationships. Therefore, the theory
of heat could develop only after the relationship between thermal
and mechanical quantities had been experimentally demonstrated.
After the basic principles of thermodynamics had already been
formulated, statistics began to develop as a physical, quantitative
theory.

Equation (46.9) can be reduced to another form. For this we must
note that the energy of a body is a unique function of its state. Imagine
a certain periodic process in which heat is delivered to a body and
work is taken from it; this occurs in heat engines. We integrate
equation (46.9) over one work cycle:

$$\oint d\bar{\mathcal{E}} = \oint dQ - \oint dA. \tag{46.10}$$

The energy has the same value at the beginning and at the end of the
cycle, thus producing the periodicity condition. Therefore, the total
energy change in one cycle $\oint d\bar{\mathcal{E}}$ is equal to zero. Hence,

$$\oint dQ = \oint dA. \tag{46.11}$$

The work performed by a heat engine in one cycle is equal to the heat
delivered to the engine in that cycle. It is impossible to construct an
engine which would work without an external supply of energy (heat).
This statement is called "the first law of thermodynamics." An imaginary
engine, performing work without an external source of energy,
is called a perpetual motion engine of the first kind. The inevitable
lack of success of all attempts to build such an engine finally led to
the negative postulate called the first law of thermodynamics. Of
course, if a statistical consideration lies at the basis of thermodynamics,
then the first law emerges from the purely mechanical law of
conservation of energy.

Neither work nor quantity of heat, taken separately, characterizes
the state of the body to which they are transmitted. In accordance
with equation (46.11), a body may perform any number of work
cycles, returning each time to the initial state. In doing so it obtains
any amount of heat and performs any amount of work equal to this
quantity of heat. Therefore, it is not correct to speak of the "heat
reserve" that a body possesses. It only possesses an energy reserve
which varies due to heat transfer and performance of work. Work and
heat are not "forms of energy," but modes (macroscopic and microscopic)
of energy transfer. This can be seen mathematically from the
fact that $dA$ and $dQ$ are not total differentials of any quantities.
For example, $dA = p\, dV$. But pressure is a function not only of
volume, but also of temperature. Thus, for an ideal gas $p = \frac{N\theta}{V}$, so
that $dA = \frac{N\theta}{V}\, dV$. This equation cannot be integrated until we
know how the temperature $\theta$ varies with volume $V$ in the given
process. And so the quantity of heat and work characterize a process
performed by a body and do not characterize the state of the body.
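
That the work depends on the process can be seen from a numerical integration of $dA = \frac{N\theta}{V}\, dV$ along two different paths joining the same pair of states. The sketch is illustrative only: the two paths, the parametrization by `s` from 0 to 1, the units ($N = 1$, volume from 1 to 2, temperature from 1 to 2) and all function names are assumptions made for the example.

```python
import math


def work(path, n_atoms=1.0, steps=20000):
    """Integrate dA = (N * theta / V) dV along path(s), 0 <= s <= 1 (midpoint rule)."""
    total = 0.0
    for k in range(steps):
        v_start, _ = path(k / steps)
        v_end, _ = path((k + 1) / steps)
        v_mid, theta_mid = path((k + 0.5) / steps)
        total += n_atoms * theta_mid / v_mid * (v_end - v_start)
    return total


def expand_then_heat(s):
    """Isothermal expansion V: 1 -> 2 at theta = 1, then heating at V = 2."""
    if s <= 0.5:
        return 1.0 + 2.0 * s, 1.0
    return 2.0, 1.0 + 2.0 * (s - 0.5)


def heat_then_expand(s):
    """Heating theta: 1 -> 2 at V = 1, then isothermal expansion V: 1 -> 2."""
    if s <= 0.5:
        return 1.0, 1.0 + 2.0 * s
    return 1.0 + 2.0 * (s - 0.5), 2.0
```

Both paths lead from the state ($V = 1$, $\theta = 1$) to the state ($V = 2$, $\theta = 2$), yet the first gives work close to $\ln 2$ and the second close to $2 \ln 2$.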

In certain cases the quantity of heat transferred in a process can
be expressed very simply. If, for example, the volume of a body does
not vary (an isochoric process) then $dA = 0$. In general, $dA = 0$ if
the external parameters $\lambda$ are constant, the quantity of heat being
equal to the change in energy of the body:

$$dQ = d\bar{\mathcal{E}}, \qquad Q = \Delta\bar{\mathcal{E}} \quad (V = \text{const}). \tag{46.12}$$

If the pressure does not change (an isobaric process), $dA = p\, dV = d(pV)$. Then

$$dQ = d\bar{\mathcal{E}} + d(pV) = d(\bar{\mathcal{E}} + pV).$$

The quantity

$$\bar{\mathcal{E}} + pV = I, \tag{46.13}$$

like energy, is a unique function of the state of the body. It is called
the heat function, or the enthalpy, of a body and is denoted by the
letter $I$. Thus, the quantity of heat in an isobaric process is equal to
the change of enthalpy of the body:

$$dQ = dI, \qquad Q = \Delta I \quad (p = \text{const}). \tag{46.14}$$

Reversible processes. A definite state of statistical equilibrium corresponds
to each value of the external parameters $\lambda$ that describe a subsystem
of any closed system. We can, for example, visualize a substance
under a piston in a nonthermally isolated cylinder. In this
case the substance and its external medium should be regarded as a
single system. The external parameter defining the state of the system
is, in this case, the volume $V$ occupied by the substance.

For every value of the volume, a state of statistical equilibrium is
established between the substance and its surrounding medium when
the temperature of the substance and the medium is the same, and
the total entropy has a maximum corresponding to a given value of
the total energy of the whole system and to the volume $V$ under the
piston.

Let us now suppose that the external parameter of the subsystem
$\lambda$ varies so slowly that for every value of $\lambda$ a total equilibrium inside
the system has time to build up. In other words, the state of the system
depends only upon the value of $\lambda$ at the given instant. Maximum entropy
corresponds to this value of $\lambda$, so that the system is all the time
in a state of statistical equilibrium. But this means that in such a
process the system never approaches a more complete statistical
equilibrium, because it is not brought out of equilibrium. And since
entropy is a measure of the fullness of the equilibrium, it does not
vary with $\lambda$.

The constancy of entropy for slow variations in $\lambda$ can be explained
in the following way. Entropy is the logarithm of the number of
equiprobable states of a system in a certain range of energy values
close to $\bar{\mathcal{E}}$. If $\lambda$ varies infinitely slowly, the entire large system may be
considered at each instant as conservative so that all its states remain
equally probable (a rapid variation may stimulate transitions in some
definite direction and, in this way, destroy the equiprobable nature
of the states that follows from the principle of detailed equilibrium).
The total number of states is conserved for slow variations of $\lambda$.
The most probable range of states having an equal probability of
occurrence is, in principle, determined (over all states, if they are
known), purely by combinatorial analysis, and cannot therefore depend
upon the particular value of $\lambda$ for which the states are taken. For
this reason, the number of states in the most probable region, and
hence the entropy also, are conserved.

Thus, to each value of $\lambda$ there corresponds a completely defined
state of the system, quite independently of the way in which the
value of $\lambda$ varied prior to this, provided it varied slowly enough. Let
$\lambda$ change first from $\lambda_1$ to $\lambda_2$, and then from $\lambda_2$ to $\lambda_1$. Then, when $\lambda$
varies in the reverse direction, the system goes through the same
series of values that it passed through in the direct variation. This is
why the process is termed reversible.

We can imagine the following two limiting cases. 

1) The subsystem and the surrounding medium are all the time in
statistical equilibrium, so that their temperature is the same. If the
external medium is sufficiently large, its temperature does not change
at all, and, hence, the temperature of the subsystem also remains
unchanged in the process. Such a reversible process is termed isothermal.
In an isothermal process, the total entropy of the whole system
is conserved while the entropy of the subsystem, and, hence, of the
medium also, varies.

2) The parameter $\lambda$ varies so rapidly that an approach to statistical
equilibrium between the medium and subsystem does not have time
to occur, but at the same time the variation is so slow that the equilibrium
inside the subsystem is not affected. Such a process would
occur if the system were separated from the medium by an ideal thermally
insulating barrier. Since the thermal transfer process is, in
general, rather slow, we can easily imagine such rapid variations in
the parameter $\lambda$ that there is not time enough for the heat to be transferred.
In this process, the entropy of the subsystem and medium is
conserved separately. For this reason, it is termed isentropic (or
adiabatic).

Later we shall also consider some irreversible processes. 

The second law of thermodynamics. Let us find an expression for
the quantity of heat received by a system in a reversible process. As
usual, we shall consider the given system to be a subsystem of some
large closed system. The state of such a quasi-equilibrium subsystem
is completely determined at each instant by its entropy and external
parameters. According to (46.1) and (46.6), the energy increment,
for a constant number of particles, is expressed in terms of the entropy
increment in the following way:

$$d\bar{\mathcal{E}} = \theta\, dS + \overline{\frac{\partial \mathcal{E}}{\partial \lambda}}\, d\lambda, \tag{46.15}$$

which, from (46.5) and (46.7), can be also written as

$$d\bar{\mathcal{E}} = \theta\, dS - \Lambda\, d\lambda = \theta\, dS - dA, \tag{46.16}$$

whence it follows that

$$\theta\, dS = d\bar{\mathcal{E}} + dA. \tag{46.17}$$

But the right-hand side of the last equation is nothing other than
the quantity of heat $dQ$ received by the system. Hence, in a reversible
process

$$dQ = \theta\, dS. \tag{46.18}$$

This is one of the most important equations in thermodynamics. It
determines the entropy increment of a system in terms of the quantity
of heat directly measured from experiment. It is significant that the
quantity of heat obtained by a system in a process depends upon the
development of the process, while the entropy increment is determined
only by the initial and final states of the system. The quotient obtained
from the division of an infinitely small quantity of heat (received
by the subsystem in a reversible process) by the temperature is the
total differential

$$\frac{dQ}{\theta} = dS. \tag{46.19}$$

If an irreversible process occurs inside the system, equation (46.19)
may not hold. Indeed, let the system consist of two subsystems at
different temperatures. In the process of temperature equalization,
such a system approaches statistical equilibrium and its entropy
increases. But no heat reaches the system from outside, so that $dQ$
for the whole system is equal to zero and $dS > 0$.

Let us consider another example of an irreversible process. Let a
gas expand into a vacuum. The phase volume $\Delta\Gamma$ [see (45.35), (45.39)]
naturally increases, since the geometrical volume increases. But this
means that the entropy also increases. When expanding into a vacuum,
the gas does not perform work (since there are no opposition forces)
and does not receive heat. In other words, it may be regarded as a
closed system whose entropy increases as statistical equilibrium is
approached (when a gas expands isothermally in a cylinder situated
in an external medium, the entropy of the gas also increases, but the
entropy of the medium decreases to the same extent). Consequently,
the entropy increment in the case of an irreversible expansion of a gas
into a vacuum is positive, and the quantity of heat transferred is
equal to zero.

The two foregoing examples show that if an irreversible process 
occurs inside a system, then 

$$\frac{dQ}{\theta} < dS, \qquad (46.20)$$

since entropy increases without any heat transfer. 

But if the given system irreversibly exchanges heat with other 
systems, and no irreversible processes occur inside it, then equation 
(46.19) is applicable. 

Let us now determine, with the aid of equation (46.18), the amount 
of work that can be performed by a heat engine. By this term we 
understand a machine which periodically obtains heat from some thermal
reservoir and, at the expense of this heat, performs work. According
to the first law of thermodynamics, the total work performed by
the engine in a single work cycle is equal to the quantity of heat
received in that cycle (46.10). If the engine operates reversibly, then
the quantity of heat is expressed in accordance with (46.18). Therefore

$$\int dA = \int \theta\,dS. \qquad (46.21)$$

It follows from this that if the temperature of the working substance
remains constant over one working cycle, the work is identically zero:

$$\int dA = \theta \int dS = 0, \qquad (46.22)$$

since the initial state in a periodic process coincides with the final
state, and the entropy is a single-valued function of the state, so that
$\int dS = 0$. In irreversible processes $dQ < \theta\,dS$, so that $\int dA < 0$ if the
temperature is constant. By keeping the temperature constant, we
can obtain a periodic process only at the expense of external work.

It follows from equation (46.22) that a heat engine cannot work 
using heat received from the surrounding medium, because by definition
the medium is at a constant temperature. The statement
formulated here is known as the second law of thermodynamics. 


An imaginary engine operating from heat obtained from the surrounding
medium is called a perpetual motion engine of the second kind.
In an axiomatic account of thermodynamics, the impossibility of 
constructing such an engine is postulated (naturally on the basis of 
numerous failures to make a perpetual motion engine of the second 
kind), and the subsequent proofs are indirect; first it is assumed that 
the statement to be proved is false, and then the feasibility of such a 
perpetual motion engine is derived therefrom (see exercise 5). 

A perpetual motion engine must not be confused with the so-called
free engine, like the wind engine, which gets its energy from the
sun heating up the earth.

Efficiency. For a heat engine to work, it is necessary to have the
working cycle at two temperatures at least. The higher temperature
$\theta_1$ is usually called the source temperature, while the lower temperature
$\theta_2$ is the sink temperature. The work in one cycle is equal to

$$\int dA = \theta_1 \int_a^b dS + \theta_2 \int_b^a dS, \qquad (46.23)$$

where the limit $a$ refers to the initial and final state, and the limit $b$
refers to the intermediate state. But $\int_a^b dS = -\int_b^a dS$, so that the work
in one cycle is

$$\int dA = (\theta_1 - \theta_2) \int_a^b dS. \qquad (46.24)$$

The total quantity of heat given up by the source is $\int_a^b dQ = \theta_1 \int_a^b dS$.


The efficiency of an engine is the term used for the ratio of the work 
obtained from it to the quantity of heat taken from the source, since 
the principal losses are associated with obtaining this heat. From 
(46.24), the efficiency of a reversible engine (denoted by E) is 


$$E = \frac{\int dA}{\int_a^b dQ} = 1 - \frac{\theta_2}{\theta_1}. \qquad (46.25)$$


This equation shows that the efficiency of a reversible engine depends 
only upon the temperatures of the source and sink, and nothing else. 
Actually, $\theta_2$ is either the temperature of the surrounding medium or
a somewhat higher temperature. To increase $E$ we must increase $\theta_1$.

Equation (46.25) shows that the efficiency of a reversible engine
can be taken to determine the absolute thermodynamic scale of 
temperature, independent of a thermometric substance. This scale 
coincides with the gas thermometer scale. 
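As a small illustration of the efficiency formula, the following sketch evaluates $1 - \theta_2/\theta_1$ and the work per cycle $(\theta_1 - \theta_2)\int_a^b dS$ from (46.24). The function names are hypothetical, and the temperatures may be given in any units in which they are proportional to $\theta$.

```python
def reversible_efficiency(theta_source, theta_sink):
    """Efficiency of a reversible engine between source and sink temperatures."""
    if not 0 < theta_sink <= theta_source:
        raise ValueError("require 0 < theta_sink <= theta_source")
    return 1.0 - theta_sink / theta_source

def work_per_cycle(theta_source, theta_sink, entropy_transferred):
    """Work in one reversible cycle, following (46.24)."""
    return (theta_source - theta_sink) * entropy_transferred

efficiency = reversible_efficiency(600.0, 300.0)
work = work_per_cycle(600.0, 300.0, 2.0)
heat_from_source = 600.0 * 2.0  # theta_1 times the entropy taken from the source
print(efficiency, work / heat_from_source)
```

Both printed numbers coincide, because the efficiency is by definition the ratio of the work to the heat taken from the source.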


The efficiency of an irreversible engine is less than the efficiency of a
reversible engine operating at the same temperature difference and at a
given source temperature. Indeed, when (46.20) is taken into account,
equation (46.24) is replaced by the inequality $\int dA < (\theta_1 - \theta_2)(S_b - S_a)$.

This is why, given the same quantity of heat taken from the 
source, the work done by an irreversible engine is less than that of a 
reversible engine. 

In axiomatic thermodynamics this statement is proved indirectly 
(see exercise 5). The efficiency of an irreversible engine is less because
part of the heat obtained from the source is not spent in useful work, 
but in overcoming frictional forces or it is dissipated into the ambient 
medium in the engine itself, through the walls of the working cylinder, 
for example. 

It should be noted that an ideally reversible engine would have to 
operate infinitely slowly, for otherwise finite deviations from statistical 
equilibrium would arise in it. Approximation to statistical equilibrium 
is always an irreversible process. 

Thermodynamic identities for energy and enthalpy. Proceeding from
the general equation (46.9), we can write down a general equation
for the differential of the mean energy of a system in the case of a
constant number of particles, if we take the volume as an external
parameter:

$$d\bar{\mathcal{E}} = \theta\,dS - p\,dV. \qquad (46.26)$$

$dS$ in this formula denotes the entropy increment, which is due to the
reversible process in the system and the interaction with the surrounding
medium. The state of a homogeneous system with a constant
number of particles is defined by two quantities: volume and entropy.
This can be seen from the number of independent parameters appearing
in the Gibbs distribution: $S$ and $\lambda = V$ can be taken instead of $\theta$ and
$\lambda = V$; the energy of such a system is a single-valued function of entropy
and volume. Let us take the total differential of the function $\bar{\mathcal{E}}(S, V)$:

$$d\bar{\mathcal{E}} = \left(\frac{\partial \bar{\mathcal{E}}}{\partial S}\right)_V dS + \left(\frac{\partial \bar{\mathcal{E}}}{\partial V}\right)_S dV, \qquad (46.27)$$

where the index of the derivative denotes which quantity is kept
constant during differentiation. Comparing (46.26) and (46.27), we
have

$$\theta = \left(\frac{\partial \bar{\mathcal{E}}}{\partial S}\right)_V, \qquad p = -\left(\frac{\partial \bar{\mathcal{E}}}{\partial V}\right)_S. \qquad (46.28)$$


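Differentiating $\theta$ with respect to $V$, and $p$ with respect to $S$, we obtain
an equation between the cross derivatives:

$$\left(\frac{\partial \theta}{\partial V}\right)_S = -\left(\frac{\partial p}{\partial S}\right)_V = \frac{\partial^2 \bar{\mathcal{E}}}{\partial V\,\partial S}. \qquad (46.29)$$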


The enthalpy or heat function $I$ is connected with the energy by the
relationship

$$I = \bar{\mathcal{E}} + pV$$

[see (46.13)]. Whence an expression follows for the total differential
of enthalpy

$$dI = \theta\,dS + V\,dp, \qquad (46.30)$$


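which gives a series of differential relations

$$\theta = \left(\frac{\partial I}{\partial S}\right)_p, \qquad V = \left(\frac{\partial I}{\partial p}\right)_S, \qquad \left(\frac{\partial \theta}{\partial p}\right)_S = \left(\frac{\partial V}{\partial S}\right)_p = \frac{\partial^2 I}{\partial p\,\partial S}. \qquad (46.31)$$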


Equations (46.26) and (46.30) are known as thermodynamic identities. 
These identities permit expressing certain thermodynamic quantities
in terms of others.

Free energy. If an irreversible process occurs in a system, then,
from (46.20), $dQ \le \theta\,dS$. Substituting this inequality into the equation
of the first law (46.9), we have

$$d\bar{\mathcal{E}} \le \theta\,dS - dA. \qquad (46.32)$$

Thus, the work performed by the system will always satisfy the inequality

$$-dA \ge d\bar{\mathcal{E}} - \theta\,dS. \qquad (46.33)$$


Let the process occur at a constant temperature. Then (46.33) can 
be written as a relationship between total differentials 

$$-dA \ge d(\bar{\mathcal{E}} - \theta S). \qquad (46.34a)$$

If the system performs positive work, $dA > 0$. Reversing the sign of
the inequality, we get

$$dA \le -d(\bar{\mathcal{E}} - \theta S). \qquad (46.34b)$$

The quantity

$$\bar{\mathcal{E}} - \theta S \equiv F \qquad (46.35)$$

[cf. (45.31)] appears in the Gibbs distribution (45.10); it is called the
free energy of the system.

It follows from the inequality (46.34b) that the greatest amount 
of work that can be performed by a system at constant temperature 
is equal to the change in F, taken with opposite sign: 

$$A_{\max} = -\Delta F. \qquad (46.36)$$

Thus the work is equal to its maximum value in a reversible process. 
The inequality (46.34a) has a somewhat different meaning: it determines
the least amount of work which must be performed on the system
in order to produce the given change of state in it:

$$(-A)_{\min} = \Delta F. \qquad (46.37)$$

The entropy of the system and the surrounding medium (taken 
together) is conserved in these processes, and the inequality (46.32) 
becomes an equality. 

Consider the following example. Let an ideal gas expand into a 
vacuum. No work is performed so that energy is conserved. But the 
energy of an ideal gas depends only upon its temperature (see Sec. 40), 
and not upon volume. Therefore, the temperature does not change 
during expansion into a vacuum. As we have seen, the entropy of 
the gas increases. Then the minimum work required to return the gas 
to its original volume at the same temperature is equal to the change 
in its free energy during expansion. The entropy of the gas will decrease
in such a reversible compression, but on the other hand the
entropy of the surrounding medium will increase to the same extent. 
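The example can be put into numbers with a brief sketch. It relies on an assumption that is only justified later (Sec. 47): at fixed temperature the entropy of an ideal gas of $N$ particles depends on volume as $N \ln V$, so that $\Delta S = N \ln(V_2/V_1)$ and $\Delta F = -\theta\,\Delta S$. The function name and arguments are illustrative.

```python
import math

def free_expansion(n_particles, theta, v_initial, v_final):
    """Entropy and free-energy change of an ideal gas expanding into a vacuum.

    Assumption: ideal gas at constant temperature, entropy change
    N ln(V_final / V_initial), energy unchanged.
    """
    delta_s = n_particles * math.log(v_final / v_initial)
    delta_f = -theta * delta_s           # the energy does not change, so dF = -theta dS
    min_work_to_restore = -delta_f       # least work done on the gas to compress it back
    return delta_s, delta_f, min_work_to_restore

delta_s, delta_f, min_work = free_expansion(1.0, 2.0, 1.0, 2.0)
print(delta_s, delta_f, min_work)
```

Doubling the volume raises the entropy by $N \ln 2$, and the least work needed to restore the original volume at the same temperature is $N\theta \ln 2$.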

It is easy to obtain a thermodynamic identity for free energy. 
Differentiating the relationship between total and free energy, and 
substituting the identity (46.26), we obtain 

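$$dF = -S\,d\theta - p\,dV, \qquad (46.38)$$

whence it is easy to find an expression for the entropy and pressure,
and also an equation between the cross derivatives:

$$S = -\left(\frac{\partial F}{\partial \theta}\right)_V, \qquad p = -\left(\frac{\partial F}{\partial V}\right)_\theta, \qquad \left(\frac{\partial S}{\partial V}\right)_\theta = \left(\frac{\partial p}{\partial \theta}\right)_V. \qquad (46.39)$$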

The relations (46.39) are convenient in that the independent variables 
are volume and temperature, which can be directly measured experimentally.
Yet the thermodynamic identity for energy (46.26) involves
entropy as an independent variable. But the entropy itself must be 
calculated, for example, by integration of (46.19). 

From (45.12), the free energy F is expressed in terms of a statistical 
sum 

$$F = -\theta \ln \sum_n e^{-\mathcal{E}_n/\theta}. \qquad (46.40)$$

The right-hand side of this equation is expressed in terms of the
temperature $\theta$ and the external parameters $\lambda$ involved in the characteristic
values of $\mathcal{E}_n$. But $\theta$ and $\lambda$ are those very independent variables
which are chosen in the identity (46.38). Therefore, for a determination
of all the thermodynamic quantities it is sufficient to calculate the
statistical sum $\sum_n e^{-\mathcal{E}_n/\theta}$. The actual calculation of this sum for an
arbitrary system entails enormous mathematical difficulties. Actually,
it is calculated only for ideal gases and crystals, and also for systems
which deviate but little from ideal. It should be noted that even if it
were possible to evaluate the statistical sum for some actual substance,
say liquid water, the thermodynamic laws obtained with such very
great difficulty would apply only to water and not to liquids generally.
But the properties of ideal gases and crystals follow from statistics 
in a very general way. 
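To show how the thermodynamic quantities follow from the statistical sum alone, here is a minimal sketch for a hypothetical system with only two energy levels (this toy spectrum is an assumption made for illustration and does not appear in the text). The free energy is computed from (46.40), the entropy from $S = -\partial F/\partial\theta$ by a central difference, and the result is compared with the definition $F = \bar{\mathcal{E}} - \theta S$.

```python
import math

def free_energy(levels, theta):
    """F = -theta * ln(sum over states of exp(-E/theta)), as in (46.40)."""
    statistical_sum = sum(math.exp(-energy / theta) for energy in levels)
    return -theta * math.log(statistical_sum)

def mean_energy(levels, theta):
    """Mean energy with Gibbs weights exp(-E/theta)."""
    weights = [math.exp(-energy / theta) for energy in levels]
    return sum(e * w for e, w in zip(levels, weights)) / sum(weights)

def entropy(levels, theta, step=1e-6):
    """S = -dF/dtheta, estimated by a central finite difference."""
    upper = free_energy(levels, theta + step)
    lower = free_energy(levels, theta - step)
    return -(upper - lower) / (2.0 * step)

levels = [0.0, 1.0]   # hypothetical two-level spectrum, energies in units of theta
theta = 0.7
f_value = free_energy(levels, theta)
s_value = entropy(levels, theta)
e_value = mean_energy(levels, theta)
print(f_value, e_value - theta * s_value)  # the two numbers should agree closely
```

The agreement of the two printed values is a numerical check of the relation between the free energy, the mean energy and the entropy, within the accuracy of the finite difference.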

Thermodynamic potential. Let us now determine the maximum
work that can be performed by a system at constant temperature and
pressure, equal to the temperature and pressure of the external medium.
We note that in a homogeneous system with a constant number
of particles, where there are no phase or chemical transitions, the
state is completely defined by the temperature and pressure, since
the thermodynamic identities for such systems involve two independent
variables. In this case the specification of two quantities determines
all the rest. But if a system consists of two phases of the same substance,
for example, a liquid and its vapour, then the relationship
between the fractions of liquid and gaseous substances may be quite
arbitrary for a given temperature and pressure.

Work is performed in increasing the volume of a system. We can
imagine, for example, a system in a cylinder under a piston, and the
piston rod connected with some object capable of changing only
its mechanical energy: by means of a flywheel or load. In addition,
on expansion of the system work is done on the external medium.
If we call the work on the object $A$, then the total work performed
is equal to $-A - p\,\Delta V = -(A + \Delta(pV))$. Here, $p$ is the pressure
in the external medium, which pressure in the process considered
is equal to the pressure in the system. Since, by convention, the
temperature of the system does not change, we have, from (46.33),

$$-(A + \Delta(pV)) \ge \Delta(\bar{\mathcal{E}} - \theta S)$$

or

$$-A \ge \Delta(\bar{\mathcal{E}} - \theta S + pV). \qquad (46.41)$$

The quantity $\bar{\mathcal{E}} - \theta S + pV$ is, obviously, a function of the state of
the system. It is called the thermodynamic potential and is denoted
by the letter $\Phi$:

$$\Phi = \bar{\mathcal{E}} - \theta S + pV. \qquad (46.42)$$

Thus, the maximum work which a system can perform at constant 
temperature and pressure is equal to the change in its thermodynamic 
potential (with reversed sign) 

$$A_{\max} = -\Delta\Phi. \qquad (46.43)$$

This work is performed in a reversible process. 

When equilibrium is established in a system, work cannot be
performed. Then $\Phi$ attains a minimum, because, according to (46.43),
the work is performed at the expense of a decrease in $\Phi$. When
$A_{\max} = 0$, $\Phi$ cannot decrease further. It has already been pointed
out that the process can occur at constant temperature and pressure
with a phase transition or chemical transformation; hence, the
equilibrium condition here is that $\Phi$ should be minimum.

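Let us now find the thermodynamic relationships for $\Phi$. From
(46.42)

$$\Phi = F + pV. \qquad (46.44)$$

Differentiating this equation and substituting $dF$ from (46.38), we
obtain

$$d\Phi = dF + p\,dV + V\,dp = -S\,d\theta - p\,dV + p\,dV + V\,dp = -S\,d\theta + V\,dp. \qquad (46.45)$$

Whence it follows, in familiar fashion, that

$$S = -\left(\frac{\partial \Phi}{\partial \theta}\right)_p, \qquad V = \left(\frac{\partial \Phi}{\partial p}\right)_\theta, \qquad -\left(\frac{\partial S}{\partial p}\right)_\theta = \left(\frac{\partial V}{\partial \theta}\right)_p = \frac{\partial^2 \Phi}{\partial p\,\partial \theta}. \qquad (46.46)$$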


The thermodynamic potential depends only upon quantities
that characterize the state of a body: its temperature and pressure.
At the same time $\Phi$ is, of course, an additive quantity; if two equal
volumes of the same substance are joined at the same temperature
and pressure, the common thermodynamic potential will be twice
as great as it was for each part separately. Therefore, we can write

$$\Phi = N\mu(p, \theta). \qquad (46.47)$$

Here, $\mu$ is the thermodynamic potential related to a single molecule
of the substance; $\mu$ is also called the chemical potential. We shall
show later on that for ideal gases it is identical to the parameter $\mu$
in the distribution function (see Sec. 39). It is obvious that

$$\mu = \left(\frac{\partial \Phi}{\partial N}\right)_{p,\theta}. \qquad (46.48)$$


If the system consists of several types of molecule, for example, 
a solution of one substance in another or a mixture of gases, then 
the state is determined not only by the temperature and pressure, 
but also by the concentrations of the substances. The concentration 
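of the $i$th substance in a mixture is

$$c_i = \frac{N_i}{\sum_k N_k}. \qquad (46.49)$$

The chemical potential of the $i$th substance in a mixture is naturally
expressed by analogy with (46.48):

$$\mu_i = \left(\frac{\partial \Phi}{\partial N_i}\right)_{p,\theta}, \qquad (46.50)$$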


where $\mu_i$ depends upon $p$, $\theta$ and all the concentrations $c_1, c_2, \ldots$

Regarding $N_i$ as variables, we can write the total differential
of $\Phi$ in the following way:

$$d\Phi = -S\,d\theta + V\,dp + \sum_i \mu_i\,dN_i. \qquad (46.51)$$

This equation generalizes (46.45) for the case of a variable number 
of particles. 

Since the transition from $\bar{\mathcal{E}}$ to $F$ and $\Phi$ does not involve the number
of particles $N_i$, we can similarly generalize the differential relations
(46.26) and (46.38):

$$d\bar{\mathcal{E}} = \theta\,dS - p\,dV + \sum_i \mu_i\,dN_i, \qquad (46.52)$$

$$dF = -S\,d\theta - p\,dV + \sum_i \mu_i\,dN_i. \qquad (46.53)$$

For a constant volume and for one type of molecule, (46.52) reduces
to the form

$$d\bar{\mathcal{E}} = \theta\,dS + \mu\,dN. \qquad (46.54)$$

But this equation coincides with (39.18), whence it can be seen that
the quantity $S$ introduced in Sec. 39 is the entropy of a gas and $\mu$
is its chemical potential.

Entropy in classical and quantum statistics. Let us compare the 
definition of entropy based on classical and on quantum laws of 
motion. In the latter case, entropy is defined as the logarithm of 
the number of states of a system for a certain energy value. When 
passing to a quasi-classical approximation, the number of states 
of the system is equal to the phase volume $\Delta\Gamma$ it occupies divided
by $(2\pi h)^n$, where $n$ is the number of degrees of freedom [see (45.39)].
The logarithm of this ratio represents the entropy in the corresponding
approximation. But statistics appeared before quantum mechanics.
Therefore, entropy was originally defined in statistics as the logarithm
of the denominate number $\Delta\Gamma$. In this definition, entropy depends
upon the choice of units: if, for example, the unit of mass is changed
by a factor of two, then $n \ln 2$ must be added to the entropy. But since
the units of measurement are arbitrary, it follows from this that in 
classical statistics entropy was defined only within the accuracy 
of an arbitrary additive constant. Only the change of entropy had 
strict meaning. 

In quantum statistics, entropy is defined as the logarithm of an 
abstract number, and therefore does not depend upon the choice of 
units of measurement. 


The temperature of a system is equal to absolute zero when the
system is in the ground state, i.e., when it has the least possible energy.
This state has a weight equal to unity, so that the entropy, or logarithm
of the weight, becomes zero at the absolute zero of temperature.
This statement is known as Nernst’s theorem, which is sometimes 
called the third law of thermodynamics. Certain consequences of 
Nernst’s theorem will be considered below (see exercise 6). 


Exercises

1) Find the ratio between the specific heats at constant volume and at 
constant pressure. 

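With the aid of (46.18), we find, from the definition of specific heat,

$$C_V = \theta\left(\frac{\partial S}{\partial \theta}\right)_V, \qquad C_p = \theta\left(\frac{\partial S}{\partial \theta}\right)_p.$$

The derivatives can be rewritten in the following way:

$$\left(\frac{\partial S}{\partial \theta}\right)_V = -\frac{(\partial V/\partial \theta)_S}{(\partial V/\partial S)_\theta}, \qquad \left(\frac{\partial S}{\partial \theta}\right)_p = -\frac{(\partial p/\partial \theta)_S}{(\partial p/\partial S)_\theta},$$

from the formulae for the derivatives of implicit functions. The partial derivatives
with the same subscript may be cancelled like fractions, since the
differentials in them have the same sense. This gives

$$\frac{C_p}{C_V} = \frac{(\partial p/\partial V)_S}{(\partial p/\partial V)_\theta} = \frac{(\partial V/\partial p)_\theta}{(\partial V/\partial p)_S},$$

so that specific heats are related in the way that isothermal compressibility
relates to isentropic compressibility. It is sufficient to measure only three
of the four quantities $C_p$, $C_V$, $(\partial V/\partial p)_S$, $(\partial V/\partial p)_\theta$.

2) Find the derivative $(\partial \bar{\mathcal{E}}/\partial V)_\theta$.

We substitute $\bar{\mathcal{E}} = F + \theta S$ and, according to (46.39), transform the
derivatives:

$$\left(\frac{\partial \bar{\mathcal{E}}}{\partial V}\right)_\theta = \left(\frac{\partial F}{\partial V}\right)_\theta + \theta\left(\frac{\partial S}{\partial V}\right)_\theta = -p + \theta\left(\frac{\partial p}{\partial \theta}\right)_V.$$

If the pressure is known as a function of temperature and volume, the energy
can be calculated only to the accuracy of an arbitrary temperature function:

$$\bar{\mathcal{E}} = \int dV\left[-p + \theta\left(\frac{\partial p}{\partial \theta}\right)_V\right] + f(\theta).$$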

Therefore, it must always be remembered that a determination of the relationship
$p = p(V, \theta)$ does not yield complete information about the thermodynamic
properties of a substance. In addition, any pressure term depending
linearly upon temperature will not affect the energy, since it is eliminated
from the equation obtained. For example, in all ideal gases $p = \frac{N\theta}{V}$, and the
energy depends upon the temperature in a rather complicated way if discrete
quantum levels must be taken in the statistical summations.

3) Answer: V + 

4) Find the difference between the specific heats at constant volume and 
at constant pressure. 

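The quantity of heat at constant pressure is equal to $dI$, and at constant
volume, to $d\bar{\mathcal{E}}$ [see (46.14) and (46.12)]:

$$C_p = \left(\frac{\partial I}{\partial \theta}\right)_p, \qquad C_V = \left(\frac{\partial \bar{\mathcal{E}}}{\partial \theta}\right)_V.$$

We transform $C_p$:

$$C_p = \left(\frac{\partial \bar{\mathcal{E}}}{\partial \theta}\right)_p + p\left(\frac{\partial V}{\partial \theta}\right)_p.$$

Further, representing energy as $\bar{\mathcal{E}} = \bar{\mathcal{E}}[\theta, V(p, \theta)]$, we write the derivative
$(\partial \bar{\mathcal{E}}/\partial \theta)_p$ in the form

$$\left(\frac{\partial \bar{\mathcal{E}}}{\partial \theta}\right)_p = \left(\frac{\partial \bar{\mathcal{E}}}{\partial \theta}\right)_V + \left(\frac{\partial \bar{\mathcal{E}}}{\partial V}\right)_\theta \left(\frac{\partial V}{\partial \theta}\right)_p.$$

Whence

$$C_p - C_V = \left[\left(\frac{\partial \bar{\mathcal{E}}}{\partial V}\right)_\theta + p\right]\left(\frac{\partial V}{\partial \theta}\right)_p = \theta\left(\frac{\partial p}{\partial \theta}\right)_V \left(\frac{\partial V}{\partial \theta}\right)_p,$$

where we have used the result of exercise 2. The derivative $(\partial V/\partial \theta)_p$ is transformed
thus:

$$\left(\frac{\partial V}{\partial \theta}\right)_p = -\frac{(\partial p/\partial \theta)_V}{(\partial p/\partial V)_\theta}.$$

Whence it follows that

$$C_p - C_V = -\theta\,\frac{(\partial p/\partial \theta)_V^2}{(\partial p/\partial V)_\theta}.$$

It will later be shown rigorously that $(\partial p/\partial V)_\theta < 0$, i.e., the pressure can only
increase with decrease in volume (otherwise the state of the system is dynamically
unstable, which is obvious as it is). Therefore $C_p > C_V$ always. For an ideal gas,

$$\left(\frac{\partial p}{\partial \theta}\right)_V = \frac{N}{V}, \qquad \left(\frac{\partial p}{\partial V}\right)_\theta = -\frac{N\theta}{V^2},$$

so that $C_p - C_V = N$.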
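The final formula of this exercise lends itself to a quick numerical check. The sketch below evaluates $-\theta(\partial p/\partial\theta)_V^2/(\partial p/\partial V)_\theta$ by central differences for any equation of state supplied as a function `pressure(volume, theta)`; the function names and the step size are choices made here for illustration, and the ideal-gas law $p = N\theta/V$ is used as the test case.

```python
def cp_minus_cv(pressure, volume, theta, rel_step=1e-5):
    """Difference of specific heats from an equation of state p(V, theta).

    Uses C_p - C_V = -theta * (dp/dtheta)_V**2 / (dp/dV)_theta, with the
    partial derivatives estimated by central finite differences.
    """
    d_volume = volume * rel_step
    d_theta = theta * rel_step
    dp_dtheta = (pressure(volume, theta + d_theta)
                 - pressure(volume, theta - d_theta)) / (2.0 * d_theta)
    dp_dvolume = (pressure(volume + d_volume, theta)
                  - pressure(volume - d_volume, theta)) / (2.0 * d_volume)
    return -theta * dp_dtheta ** 2 / dp_dvolume

N_PARTICLES = 3.0  # illustrative particle number

def ideal_gas_pressure(volume, theta):
    """Ideal-gas equation of state p = N * theta / V."""
    return N_PARTICLES * theta / volume

difference = cp_minus_cv(ideal_gas_pressure, volume=2.0, theta=1.5)
print(difference)  # close to N_PARTICLES for an ideal gas
```

For any other equation of state the same function can be reused, provided $(\partial p/\partial V)_\theta$ is negative at the chosen point.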

5) Accepting the second law of thermodynamics as a postulate, prove
that the efficiency of a reversible engine is always greater than the efficiency
of an irreversible engine, working with the same temperature difference
between source and sink.

The proof is indirect. Let a reversible engine and an irreversible engine
obtain the same quantity of heat $Q_1$ from a source, but let the irreversible
engine give a smaller quantity of heat $Q_2'$ to the sink than the reversible engine.
The reversible engine may be made to work as a refrigerator, i.e., to take
heat from a cold reservoir and to deliver it to a hot reservoir at the expense
of external work. In order to return the quantity of heat $Q_1$ to the hot reservoir,
in accordance with our assumption, the reversible engine must take a larger
quantity of heat $Q_2$ from the cold reservoir than the irreversible engine delivered.
But it will then turn out that the hot reservoir does not deliver heat at all,
and the cold reservoir delivers a quantity of heat $Q_2 - Q_2'$, at the expense
of which useful work is performed equal to the difference between the work
of the irreversible engine and the work of the reversible engine operating
as a refrigerator. The surrounding medium can be taken as the cold reservoir, so
that useful work will be performed at the expense of heat obtained from the
surrounding medium; this contradicts the second law of thermodynamics.

6) Prove that the specific heat of a system tends to zero when the temperature
tends to absolute zero. Do the same for $(\partial V/\partial \theta)_p$.

The entropy is related to the specific heat $C$ by the relation

$$S = \int_0^\theta \frac{C\,d\theta}{\theta},$$

where the lower limit of the integral is put equal to zero from Nernst’s theorem.
For the integral to have meaning, we must demand that $\lim_{\theta \to 0} C = 0$. In addition,

$$\left(\frac{\partial V}{\partial \theta}\right)_p = -\left(\frac{\partial S}{\partial p}\right)_\theta.$$

The limit of the last derivative is also equal to zero
as $\theta$ tends to zero, because $\lim_{\theta \to 0} S = 0$ in the case of an arbitrary pressure.

7) Show that the sum of the enthalpy and kinetic energy is conserved in 
the motion of a substance without any internal heat exchange and without 
exchange of heat with the external medium.

Let a certain mass of substance be transferred from a volume $V_1$, pressure
$p_1$, and energy $\mathcal{E}_1$ to $V_2$, $p_2$, and $\mathcal{E}_2$ respectively. In order to displace this mass
from the volume $V_1$ at a pressure $p_1$, an amount of work $p_1 V_1$ must be done.
Therefore, in going to $p_2$, $V_2$, a work $p_1 V_1 - p_2 V_2$ is done. The total change
of energy of the given mass, in a reference system fixed in it, is equal to
$\mathcal{E}_1 - \mathcal{E}_2 + p_1 V_1 - p_2 V_2 = I_1 - I_2$. Since there is no heat exchange, this
quantity can be equal only to the change in kinetic energy:

$$I_1 - I_2 = \frac{m v_2^2}{2} - \frac{m v_1^2}{2}, \qquad \text{or} \qquad I_1 + \frac{m v_1^2}{2} = I_2 + \frac{m v_2^2}{2}.$$

In future we shall relate this equation to unit mass of the substance, and write
it in the form

$$I + \frac{v^2}{2} = \text{const},$$

where $I$ is the enthalpy of unit mass.


8) Find the propagation velocity of small isentropic disturbances in an
isotropic medium [in other words, neglecting heat transmission and considering
that $p = p(\rho)$].

If the initial position of a particle is described by a single coordinate $a$,
and the displaced position by the coordinate $x$, measured in the same direction,
then the equation of conservation of mass is the following:

$$\rho_0\,da = \rho\,dx$$

($\rho$ is the density, $\rho_0$ is the initial density). Whence

$$\rho = \rho_0\left(\frac{\partial x}{\partial a}\right)^{-1}.$$

The force acting on an element of mass $\rho\,dx$ is

$$-p(x + dx) + p(x) = -\frac{\partial p}{\partial x}\,dx.$$

According to Newton’s Second Law, this force is equal to the product of mass
and acceleration, i.e.,

$$\rho\,dx\,\frac{\partial^2 x}{\partial t^2} = -\frac{\partial p}{\partial x}\,dx = -\left(\frac{\partial p}{\partial \rho}\right)_S \frac{\partial \rho}{\partial x}\,dx.$$

Considering the displacements small, we see that the derivative $\partial x/\partial a$ is close
to unity, so that the second derivative is a small quantity. The result, therefore,
is the approximate equation

$$\frac{\partial^2 x}{\partial a^2} - \frac{1}{(\partial p/\partial \rho)_S}\,\frac{\partial^2 x}{\partial t^2} = 0.$$

It coincides with the wave equation of the form (17.4), which describes the
propagation of disturbances with velocity $c$. In the given case, the propagation
velocity of the process is nothing other than the velocity of sound. It is equal to

$$u = \sqrt{\left(\frac{\partial p}{\partial \rho}\right)_S}.$$

9) A substance flows in a tube of constant cross-section without heat 
exchange, but with internal friction. Show that the maximum entropy is 
attained where the flow rate is equal to the velocity of sound. 

The following conservation laws apply:

$$\rho v = j = \text{const}, \qquad I + \frac{v^2}{2} = \text{const}.$$

Substituting the velocity from the first equation, we obtain

$$I + \frac{j^2}{2\rho^2} = \text{const}.$$

Let us differentiate this equation, considering that the enthalpy is expressed
in terms of the independent variables $S$ and $p$:

$$\left(\frac{\partial I}{\partial S}\right)_p dS + \left(\frac{\partial I}{\partial p}\right)_S dp - \frac{j^2}{\rho^3}\,d\rho = 0.$$

Close to the entropy maximum $dS = 0$, the derivative $(\partial I/\partial p)_S = V = 1/\rho$,
so that

$$\left(\frac{dp}{d\rho}\right)_{dS = 0} = \frac{j^2}{\rho^2} = v^2.$$

For constant entropy $(\partial p/\partial \rho)_S = u^2$, so that $v = u$ at maximum entropy.

If there is a flux in the tube for which $v < u$ (“subsonic flow”), the value
$v = u$ can only be attained at the tube outlet because, otherwise, the entropy,
on reaching a maximum somewhere in the tube, would have to decrease in
the subsequent flow, which is impossible.

10) A substance flows without heat exchange or friction, i.e., isentropically,
in a tube of continuously variable cross-section $f$. Show that the velocity in
subsonic flow increases with decrease in cross-section, but in supersonic flow,
it increases with $f$. The flow is considered as one-dimensional because of the
smooth variation of $f$.

From the law of conservation of mass

$$f \rho v = \text{const},$$

whence it follows that

$$\frac{df}{f} + \frac{d\rho}{\rho} + \frac{dv}{v} = 0.$$

Taking into account that entropy is constant, we can write:

$$\frac{d\rho}{\rho} = \frac{1}{\rho}\left(\frac{\partial \rho}{\partial p}\right)_S dp = \frac{dp}{\rho u^2}.$$

Differentiating the equation $I + \frac{v^2}{2} = \text{const}$ at constant entropy, we have

$$\frac{dp}{\rho} + v\,dv = 0.$$

Whence it follows that

$$\frac{dp}{\rho} = -v\,dv$$

and finally

$$\frac{dv}{v}\left(\frac{v^2}{u^2} - 1\right) = \frac{df}{f},$$

which proves the statement. In order to obtain flow with supersonic velocity
at the outlet of the tube, we must pass it through a Laval nozzle, i.e., along
a tube whose aperture first decreases, so that $v = u$ at the narrowest place,
after which $v$ becomes greater than $u$ and continues to increase.
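The sign rule contained in the last relation can be checked with a few lines of code. The sketch below simply solves $\frac{dv}{v}\left(\frac{v^2}{u^2} - 1\right) = \frac{df}{f}$ for the relative change of velocity; the function name is hypothetical, and the sample velocities are given in units of the velocity of sound.

```python
def relative_velocity_change(relative_area_change, v, u):
    """Return dv/v for a given df/f in isentropic one-dimensional flow.

    v is the flow velocity and u the velocity of sound; the relation is
    singular at v == u (the narrowest place of a Laval nozzle).
    """
    mach_squared = (v / u) ** 2
    if mach_squared == 1.0:
        raise ValueError("relation is singular where v equals u")
    return relative_area_change / (mach_squared - 1.0)

# Subsonic flow: a narrowing tube (df/f < 0) accelerates the substance.
subsonic = relative_velocity_change(-0.01, v=0.5, u=1.0)
# Supersonic flow: a widening tube (df/f > 0) accelerates the substance.
supersonic = relative_velocity_change(0.01, v=2.0, u=1.0)
print(subsonic, supersonic)  # both positive
```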


11) A piston is in movement with constant velocity $v$, into a cylinder with
cross-section $f$, filled with a substance with initial pressure $p_0$ and initial density
$\rho_0$. The enthalpy $I$ per unit mass is regarded as a known function of $p$ and $\rho$.
Formulate a system of equations, from which it is possible to determine the
displacement velocity of the boundary between the compressed and noncompressed
substance, and also the density and pressure of the compressed
substance.

The compressed substance moves with velocity $v$ equal to the piston velocity.
The boundary between the compressed and noncompressed substance
has a certain velocity $D$. The compressed substance has a velocity $v - D$
relative to this boundary, and the noncompressed substance has velocity $D$.

Let us pass to a reference system moving together with the interface.
Then the mass conservation law is expressed as follows:

$$f \rho_0 D = f \rho (D - v). \qquad (*)$$

We shall consider a cylindrical volume of the substance passing through
the boundary in unit time. The length of this cylinder in the compressed substance
is equal to $D - v$, while its mass is $f \rho (D - v)$, so that its momentum
is equal to $f \rho (D - v)^2$. The momentum in the noncompressed substance
equalled $f \rho_0 D^2$. A resultant force $(p_0 - p) f$ acted on this cylinder, whence
the conservation equation

$$f(p_0 + \rho_0 D^2) = f[p + \rho (D - v)^2]. \qquad (**)$$

The third equation expresses the absence of heat exchange (see exercise 7):

$$I_0 + \frac{D^2}{2} = I + \frac{(D - v)^2}{2}. \qquad (***)$$

At the interface, a discontinuous change occurs in the density, pressure, and
velocity of the substance. This surface is called a shock wave. We can determine
$D$, $\rho$, and $p$ from the three conservation laws, if the form of the function
$I(p, \rho)$ is known. These quantities will be determined specifically in exercise 7,
Sec. 47, where it will also be shown that the compression process in a
shock wave is irreversible.
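To see the three conservation laws at work, one may take a definite form of $I(p, \rho)$. The sketch below assumes, purely for illustration, $I = \frac{\gamma}{\gamma - 1}\frac{p}{\rho}$ with a constant $\gamma$ (an ideal gas with constant specific heats; the exercise itself leaves $I$ general). Under that assumption, eliminating $\rho$ and $p$ from (*), (**), (***) gives the quadratic $D^2 - \frac{\gamma + 1}{2} v D - \gamma p_0/\rho_0 = 0$, whose positive root is used. The function names and the numerical values are hypothetical.

```python
import math

def enthalpy(p, rho, gamma=1.4):
    """Enthalpy per unit mass, assuming I = gamma / (gamma - 1) * p / rho."""
    return gamma / (gamma - 1.0) * p / rho

def piston_shock(p0, rho0, v, gamma=1.4):
    """Shock velocity D, density and pressure behind the shock for piston speed v."""
    sound_speed_squared = gamma * p0 / rho0
    half_coefficient = 0.25 * (gamma + 1.0) * v
    # Positive root of D**2 - (gamma + 1) / 2 * v * D - gamma * p0 / rho0 = 0.
    shock_velocity = half_coefficient + math.sqrt(half_coefficient ** 2 + sound_speed_squared)
    rho = rho0 * shock_velocity / (shock_velocity - v)   # from (*)
    p = p0 + rho0 * shock_velocity * v                   # from (*) and (**)
    return shock_velocity, rho, p

p0, rho0, v = 1.0e5, 1.2, 100.0   # illustrative values only
D, rho, p = piston_shock(p0, rho0, v)

# Residual of the third law (***): should vanish up to rounding error.
energy_before = enthalpy(p0, rho0) + D ** 2 / 2.0
energy_after = enthalpy(p, rho) + (D - v) ** 2 / 2.0
print(D, rho, p, energy_before - energy_after)
```

The density and pressure behind the shock come out larger than the initial ones, and $D$ exceeds both the piston velocity and the initial velocity of sound, as one expects for a compression wave.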

12) Show that the classical expression for a statistical sum does not depend 
upon the constant magnetic field in which the system is situated. 

The classical expression for the statistical sum (or more exactly, for the integral)
is

$$Z = \int e^{-\mathcal{E}(\mathbf{p}_1, \ldots, \mathbf{p}_N;\, \mathbf{r}_1, \ldots, \mathbf{r}_N)/\theta}\, d\tau_{p_1}\, d\tau_{p_2} \ldots d\tau_{p_N}\, dV_1\, dV_2 \ldots dV_N.$$

When the system is placed in a magnetic field the momenta of the particles
change according to the formula $\mathbf{p} \to \mathbf{p} - \frac{e}{c}\mathbf{A} \equiv \mathbf{P}$. Passing (for the phase-volume
element) from $d\tau_p$ to $d\tau_P$, we find that the statistical sum appears
the same as in the absence of field because the new notation for the integration
variable does not change anything:

$$Z = \int e^{-\mathcal{E}/\theta}\, d\tau_{P_1}\, d\tau_{P_2} \ldots d\tau_{P_N}\, dV_1\, dV_2 \ldots dV_N.$$


Thus, classical mechanics cannot describe the magnetic properties of a substance.

13) Express the entropy of an ideal gas in terms of the occupation numbers $n_k$
for all three statistics ($g_k = 1$ everywhere).

Using the expressions for $S$ in equations (39.14) and (39.25), we find:
Bose statistics:

$$S = \sum_k [(n_k + 1)\ln(n_k + 1) - n_k \ln n_k];$$

Fermi statistics:

$$S = -\sum_k [(1 - n_k)\ln(1 - n_k) + n_k \ln n_k];$$

Boltzmann statistics corresponding to $n_k \ll 1$:

$$S = -\sum_k n_k \ln\frac{n_k}{e}.$$

If the weight is not equal to unity, then, introducing $n_k = g_k f_k$, we obtain
for all three statistics:

$$S_{\text{Bose}} = \sum_k g_k[(f_k + 1)\ln(f_k + 1) - f_k \ln f_k],$$

$$S_{\text{Fermi}} = -\sum_k g_k[(1 - f_k)\ln(1 - f_k) + f_k \ln f_k],$$

$$S_{\text{Boltzmann}} = -\sum_k g_k f_k \ln\frac{f_k}{e}.$$
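That the Bose and Fermi expressions both pass into the Boltzmann one for small occupation numbers can be verified numerically. The sketch below evaluates the contribution of a single state with $g_k = 1$ to each entropy; the function names are chosen here for illustration, and the convention $x \ln x \to 0$ as $x \to 0$ is used for empty or completely filled states.

```python
import math

def _x_log_x(x):
    """x * ln(x), with the limiting value 0 at x = 0."""
    return 0.0 if x == 0.0 else x * math.log(x)

def entropy_bose(n):
    """Contribution of one state to the Bose entropy: (n+1) ln(n+1) - n ln n."""
    return (n + 1.0) * math.log1p(n) - _x_log_x(n)

def entropy_fermi(n):
    """Contribution of one state to the Fermi entropy: -[(1-n) ln(1-n) + n ln n]."""
    return -(_x_log_x(1.0 - n) + _x_log_x(n))

def entropy_boltzmann(n):
    """Contribution of one state to the Boltzmann entropy: -n ln(n/e) = n - n ln n."""
    return n - _x_log_x(n)

small_n = 1e-6
print(entropy_bose(small_n), entropy_fermi(small_n), entropy_boltzmann(small_n))
```

For $n_k \ll 1$ the three values differ only by terms of order $n_k^2$, which is the sense in which Boltzmann statistics is the common limit of the other two.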


Sec. 47. The Thermodynamic Properties of Ideal Gases in Boltzmann 

Statistics 

In this section we shall consider certain consequences that follow
from the general principles of thermodynamics as applied to ideal
gases. We shall suppose that the gas density is sufficiently small
for Boltzmann statistics to be applied to its molecules. This does
not mean that the motion of the molecules should be regarded as
nonquantum; the quantization of rotational, vibrational (and all
the more so, electronic) levels of a molecule must be taken into
account in all cases when the spacing between neighbouring levels
is comparable with $\theta$ (i.e., $kT$) or greater than $\theta$. Even when the level
spacing is sufficiently small compared with $\theta$, as is the case of translational
motion, the quantum of action should be left in the formula
for the statistical weight of the states, since it would be impossible
otherwise to obtain a unique expression for entropy.

Deviations from Boltzmann statistics that occur in gases at low
temperatures or high densities are sometimes called “degeneracies.”
One should differentiate between deviations from the characteristic
ideal gas state, due to the interaction between molecules, and quantum
deviations from classical statistics. Of course, there also arise corrections
which are due to the effect of both factors together.

Free energy of an ideal gas. As was indicated in the preceding 
section, it is convenient, when calculating thermodynamic quantities, 
to proceed from the expression for free energy. 

We shall start with formula (46.40), reducing the statistical sum 
to the form that it takes for a Boltzmann gas. For this it is necessary 
to take into account that, by definition, a statistical sum is calculated 
over all the physically different states of a gas. But the state of the
gas does not change if all possible molecular permutations are performed
over the individual states; in nonquantum statistics such a
permutation can be defined. The number of permutations of $N$
molecules is equal to $N!$.

The total energy of an ideal gas separates into the sum of the 
energies of all of its molecules: 

N 

;=i 


where k is the number of the quantum state. 

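Substituting the expression for $\mathcal{E}$ into the statistical sum (46.40),
and dividing this sum by the number of permutations of the molecules,
we obtain

$$F = -\theta \ln \frac{\sum_{(k)} e^{-\frac{1}{\theta}\sum_{i=1}^{N}\varepsilon_i^{(k)}}}{N!} = -\theta \ln \frac{\prod_{i=1}^{N}\sum_{k} e^{-\varepsilon_i^{(k)}/\theta}}{N!} = -\theta \ln \frac{\left(\sum_{k} e^{-\varepsilon^{(k)}/\theta}\right)^{N}}{N!}. \tag{47.1}$$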


The second summation over (k) relates to all possible combinations
of the energy of the separate molecules $\varepsilon_i^{(k)}$. Here, we have made
use of the fact that the energy spectrum is the same for all molecules
(if the gas consists of molecules of one type). The summation in
(47.1) is performed over the spectrum of a single molecule. Replacing
$N!$ by its expression in Stirling’s formula, we arrive at a general
formula for the free energy of an ideal gas under Boltzmann statistics:

$$F = -N\theta \ln \frac{e \sum_{k} e^{-\varepsilon^{(k)}/\theta}}{N}. \tag{47.2}$$
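The step from the factor $N!$ to the form $F = -N\theta\ln(ez/N)$, where $z$ denotes the single-molecule statistical sum, rests only on Stirling's formula. The following minimal sketch compares the two expressions numerically; the function names and the numerical values of N, z and $\theta$ are illustrative assumptions, not taken from the text.

```python
import math

def free_energy_exact(n_molecules, z_single, theta):
    """F = -theta * ln(z**N / N!), with ln N! evaluated through lgamma."""
    return -theta * (n_molecules * math.log(z_single)
                     - math.lgamma(n_molecules + 1))

def free_energy_stirling(n_molecules, z_single, theta):
    """F = -N * theta * ln(e * z / N), i.e. N! replaced by (N/e)**N."""
    return -n_molecules * theta * math.log(math.e * z_single / n_molecules)

# Illustrative values only (assumed, not from the text).
n, z, theta = 10**6, 5.0e7, 1.0
exact = free_energy_exact(n, z, theta)
approx = free_energy_stirling(n, z, theta)
relative_error = abs(exact - approx) / abs(exact)
print(exact, approx, relative_error)
```

The discrepancy is of the order of $\theta \ln N$, which is negligible next to a free energy proportional to N.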

Summation over translational degrees of freedom. It is expedient, 
in the statistical summation over the states of a separate molecule, 
to separate the translational degrees of freedom and represent the 
energy in the form 


Sec. 47] 


THE THERMODYNAMIC PROPERTIES OF IDEAL GASES


537


$$\varepsilon = \frac{p^2}{2m} + \varepsilon^{(i)}. \tag{47.3}$$

It is taken here that the energy does not depend upon the coordinates 
of the centre of mass of the molecule. 

The statistical weight of a state with momentum p is equal to

$$g = g^{(i)}\,\frac{dp_x\,dp_y\,dp_z\,dx\,dy\,dz}{(2\pi h)^3}; \tag{47.4}$$

$g^{(i)}$ denotes the weight referring to an energy level $\varepsilon^{(i)}$. Integration
over x, y, z contributes the factor $\int dx\,dy\,dz = V$ to the statistical
sum. The integration over momenta is performed in a familiar manner:

$$\int_{-\infty}^{\infty} e^{-\frac{p_x^2}{2m\theta}}\,dp_x = \sqrt{2\pi m\theta}. \tag{47.5}$$


Thus, the free energy for an ideal gas reduces to the following
form:

$$F = -N\theta \ln\left[\frac{eV}{N}\,f(\theta)\right]. \tag{47.6}$$

A relationship is obtained here between free energy and volume.
The function $f(\theta)$ depends upon the molecular structure.

Thermodynamic quantities of an ideal gas. It is easy to determine
pressure from formula (47.6). From (46.39) we obtain

$$p = -\frac{\partial F}{\partial V} = \frac{N\theta}{V}, \tag{47.7}$$

i.e., the well-known Clapeyron equation. The thermodynamic potential
is

$$\Phi = F + pV = F + N\theta = -N\theta \ln\left[\frac{V}{N}\,f(\theta)\right],$$

but here it is expressed in terms of volume. To be able to use identity
(46.45) we must, in addition, replace $V/N$ by $\theta/p$, whence a final formula
is obtained for the thermodynamic potential of an ideal gas:

$$\Phi = -N\theta \ln \frac{\theta f(\theta)}{p}. \tag{47.8}$$

We find the chemical potential with the aid of (46.47) or (46.48):

$$\mu = -\theta \ln \frac{\theta f(\theta)}{p}. \tag{47.9}$$

The entropy of an ideal gas is

$$S = -\frac{\partial F}{\partial \theta} = N \ln\left[\frac{eV}{N}\,f(\theta)\right] + N\theta\,\frac{f'(\theta)}{f(\theta)}. \tag{47.10}$$




This expression does not agree with Nernst’s theorem. In actual 
fact, of course, we must apply to a gas at very low temperatures, 
not Boltzmann statistics but quantum statistics, even neglecting 
the fact that at low temperatures the gas actually condenses. 

The energy is equal to $\mathcal{E} = F + \theta S$, or

$$\mathcal{E} = N\theta^2\,\frac{f'(\theta)}{f(\theta)}. \tag{47.11}$$

Thus, the energy of an ideal Boltzmann gas, expressed in terms
of temperature, does not depend upon volume at all. The mean
energy of a single molecule $\bar{\varepsilon} = \mathcal{E}/N$ depends only upon the temperature
of the gas. This is not only because there is no force interaction
between the gas molecules, but also because the properties of the gas 
are described by classical statistics. In the quantum statistics of 
ideal gases, the energy of a molecule depends both upon volume 
and temperature. It should be noted that the variables in formula 
(47.11) do not correspond to the identity (46.26). In order to make 
use of this identity, temperature must be eliminated from (47.10) 
and substituted in (47.11), but this is very difficult to do in the general 
form. 

The enthalpy of an ideal gas is

$$I = \mathcal{E} + pV = N\theta^2\,\frac{f'(\theta)}{f(\theta)} + N\theta. \tag{47.12}$$

Like energy, it depends only on the temperature. 
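The chain of results above (pressure, entropy, energy) follows from differentiating a single function $F(V, \theta) = -N\theta\ln[(eV/N)f(\theta)]$. The sketch below checks this numerically with central finite differences. It assumes, purely for illustration, the model $f(\theta) = a\,\theta^{3/2}$ (a structureless molecule); the helper names and numerical values are hypothetical.

```python
import math

def free_energy(volume, theta, n=1.0, a=1.0, power=1.5):
    """F = -N*theta*ln(e*V*f/N) with the assumed model f = a*theta**power."""
    return -n * theta * math.log(math.e * volume * a * theta**power / n)

def derivative(func, x, step=1e-6):
    """Central finite-difference estimate of d(func)/dx."""
    return (func(x + step) - func(x - step)) / (2 * step)

n, volume, theta = 1.0, 2.0, 3.0   # illustrative values only
pressure = -derivative(lambda v: free_energy(v, theta, n), volume)
entropy = -derivative(lambda t: free_energy(volume, t, n), theta)
energy = free_energy(volume, theta, n) + theta * entropy

print(pressure, n * theta / volume)   # Clapeyron equation p = N*theta/V
print(energy, 1.5 * n * theta)        # energy independent of the volume
```

With this choice of $f(\theta)$ the energy comes out as $\frac{3}{2}N\theta$ whatever volume is supplied, in line with the statement that the energy of a Boltzmann gas depends only on temperature.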

A mixture of ideal gases. Since the molecules of ideal gases do not
interact, the free energy of a mixture is additive and reduces to the
sum of the free energies of all the components:

$$F = -\sum_i N_i\theta \ln\left[\frac{eV}{N_i}\,f_i(\theta)\right]. \tag{47.13}$$

The pressure of the mixture is calculated in the usual way:

$$p = -\frac{\partial F}{\partial V} = \frac{\theta}{V}\sum_i N_i. \tag{47.14}$$

If we introduce the partial pressure of the ith component of the
mixture, i.e., its contribution $p_i$ to the total gas pressure p, then

$$p_i = \frac{N_i\theta}{V} = \frac{N_i\,p}{\sum_k N_k}, \tag{47.15}$$

so that the total pressure appears as the sum of partial pressures.




This, of course, refers only to ideal gases. The thermodynamic potential
of a mixture is

$$\Phi = -\sum_i N_i\theta \ln \frac{\theta f_i(\theta)}{p_i}. \tag{47.16}$$

The chemical potential of the ith component is determined from
formula (46.60). It turns out to be equal to

$$\mu_i = -\theta \ln \frac{\theta f_i(\theta)}{p_i}. \tag{47.17}$$

These formulae are very important in the theory of chemical equilibria 
in gases. 

Rotational energy of a gas. Let us now calculate the statistical sum 
over the rotational degrees of freedom of molecules. Since we wish 
to obtain simple thermodynamic formulae, in this section we shall 
confine ourselves to the case of nonquantum rotational motion (the 
quantized case was considered in Sec. 41). This means that the temperature
satisfies the condition

$$\theta \gg \frac{h^2}{J}. \tag{47.18}$$

Here, J is the molecular moment of inertia. At room temperatures, 
the condition (47.18) is satisfied for all gases, including hydrogen. 

The expression for the mean energy of a molecule (47.11) involves 
only the logarithmic derivative of the statistical sum, so that the 
constant factors in this sum are not essential. But the value of the
sum as such is important in many statistical applications (chemical
equilibria). In order to calculate this value we must take into account
that the summation is taken over the physically different rotational
states. For example, diatomic molecules of H₂ or O₂ take on coincident
positions when rotated through 180° about an axis perpendicular to 
the line joining the nuclei. The position of a diatomic molecule in 
space is given by two angles, the azimuthal and polar angles, and can 
be represented by a single point on the surface of a sphere of unit 
radius. But the physically different molecular orientations corre¬ 
spond only to half of this sphere. 

The position of a nonlinear molecule in space is given with the aid
of the Eulerian angles (see Sec. 9). If the molecule possesses any
form of symmetry, then the statistical summation over all possible
orientations should be divided by the number of rotations leaving
the molecule invariant. For example, the ammonia molecule NH₃
has the form of a pyramid with a regular triangular base. Its rotational
statistical sum, taken over all rotations in space, must be divided by 3.
The benzene molecule C₆H₆ has a regular hexagonal form. A hexagon
does not only coincide with itself for 60° rotation in its plane, but also
for 180° rotation about an axis joining opposite vertices. Hence, its
statistical sum is taken over 1/(2·6) = 1/12 part of all orientations.

It is now easy to write down a classical expression for the rotational
statistical sum of a diatomic (or, in general, linear) molecule:

$$Z_{\mathrm{rot}} = \frac{4\pi}{\sigma} \iint e^{-\frac{M_1^2 + M_2^2}{2J\theta}}\,\frac{dM_1\,dM_2}{(2\pi h)^2}; \tag{47.19}$$

the $4\pi$ factor takes into account all orientations in space. If the
diatomic molecule consists of different atoms, $\sigma = 1$; if it consists of
identical atoms, $\sigma = 2$. The quantum statistical sum for an oxygen
molecule, whose nuclei do not have spin, is taken only over even
rotational states (see Sec. 41). In the classical limit, this is taken into
account by the factor 1/2. When the nuclei of a molecule possess
spins we must multiply the statistical sum by the quantities $2s + 1$,
taken for all the nuclei. In the linear triatomic molecule CO₂ (O=C=O),
we must also put $\sigma = 2$. Thus, for a linear molecule

$$Z_{\mathrm{rot}} = \frac{4\pi}{\sigma}\,\frac{2\pi J\theta}{(2\pi h)^2} = \frac{2J\theta}{\sigma h^2}. \tag{47.20}$$


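The position of a nonlinear molecule in space is given by the orientation
of its arbitrary axis and the rotation angle about this axis.
Consequently, all rotations in space introduce the factor $4\pi \cdot 2\pi = 8\pi^2$
into the statistical sum for a nonlinear molecule, and the result is
the following expression:

$$Z_{\mathrm{rot}} = \frac{8\pi^2}{\sigma} \iiint e^{-\frac{1}{\theta}\left(\frac{M_1^2}{2J_1} + \frac{M_2^2}{2J_2} + \frac{M_3^2}{2J_3}\right)}\,\frac{dM_1\,dM_2\,dM_3}{(2\pi h)^3} = \frac{8\pi^2}{\sigma}\,\frac{\left(\sqrt{2\pi\theta}\right)^3 \sqrt{J_1 J_2 J_3}}{(2\pi h)^3} = \frac{(2\theta)^{3/2}\sqrt{\pi J_1 J_2 J_3}}{\sigma h^3}. \tag{47.21}$$

Thus, the contribution of the rotational energy to the total energy
of the gas, equal to $N\theta^2\,\frac{\partial}{\partial\theta}\ln Z_{\mathrm{rot}}$, amounts to $N\theta$ in the case of a linear
molecule, and $\frac{3}{2}N\theta$ in the case of a nonlinear molecule.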
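The classical rotational sum for a linear molecule, $2J\theta/(\sigma h^2)$, can be compared with a direct sum over quantized levels. The sketch below assumes the standard rigid-rotator spectrum $h^2K(K+1)/2J$ with weights $2K+1$ (an assumption brought in for the illustration, with $\sigma = 1$) and works with the dimensionless temperature $t = J\theta/h^2$; all function names are hypothetical.

```python
import math

def z_rot_classical(t, sigma=1):
    """Classical sum 2*J*theta/(sigma*h**2), written with t = J*theta/h**2."""
    return 2.0 * t / sigma

def z_rot_quantum(t, k_max=2000):
    """Sum of (2K+1)*exp(-K(K+1)/(2t)) over rigid-rotator levels (sigma = 1)."""
    return sum((2 * k + 1) * math.exp(-k * (k + 1) / (2.0 * t))
               for k in range(k_max))

# Ratio quantum/classical at low, intermediate and high temperature.
ratios = {t: z_rot_quantum(t) / z_rot_classical(t) for t in (0.5, 5.0, 50.0)}
for t, ratio in ratios.items():
    print(t, ratio)
```

The ratio approaches unity as $t$ grows, which is the content of the condition $\theta \gg h^2/J$; at $t$ of order unity the classical expression is noticeably in error.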

The vibrational energy of a molecule. The energy of a molecule
performing small oscillations can be represented, according to Sec. 7,
in the form

$$\varepsilon_{\mathrm{vib}} = \frac{1}{2}\sum_{\alpha=1}^{n}\left(P_\alpha^2 + \omega_\alpha^2 Q_\alpha^2\right) + U_0, \tag{47.22}$$

where $Q_\alpha$ are the normal oscillation coordinates, $P_\alpha$ are the corresponding
momenta, $U_0$ is the potential energy in the equilibrium position
(including the energy of the vibrational ground state). If $h\omega_\alpha \ll \theta$,
then the vibrational summation may be taken for nonquantized
motion:

$$Z_{\mathrm{vib}} = e^{-\frac{U_0}{\theta}} \prod_{\alpha=1}^{n} \iint e^{-\frac{1}{2\theta}\left(P_\alpha^2 + \omega_\alpha^2 Q_\alpha^2\right)}\,\frac{dP_\alpha\,dQ_\alpha}{2\pi h} = e^{-\frac{U_0}{\theta}} \prod_{\alpha=1}^{n} \frac{\theta}{h\omega_\alpha}. \tag{47.23}$$

The vibrational motion makes a contribution $n\theta + U_0$ to the
mean energy of the molecule. Usually $h\omega_\alpha \gg \frac{h^2}{J}$, so that a temperature
region exists that satisfies the two inequalities

$$h\omega_\alpha \gg \theta \gg \frac{h^2}{J}. \tag{47.24}$$


The vibrational quanta of the molecules are not yet excited at these
temperatures, while the rotational specific heat is already constant.
Thus, in the case of nitrogen and oxygen the total specific heat amounts
to $\frac{3}{2}N + N = \frac{5}{2}N$ for temperatures from several tens to many
hundreds of degrees. Under these conditions, gases (for instance, air)
are subject to the equipartition law with a reduced number of degrees
of freedom. The last expression means that each dynamical variable
(coordinate or momentum) entering into the Hamiltonian quadratically
gives, in the classical limit, an amount $\frac{N\theta}{2}$ in the mean energy of
the gas.

The energy of the higher vibrational states is not governed by the 
simple formula (47.22). This can be seen from Fig. 47, where the lower 
curve is close to a parabola (harmonic oscillations) near the equilibrium 
position, and very different from a parabola (anharmonic oscillations)
close to the dissociation limit. If the temperature is so high that
oscillations (with large quantum numbers) close to the dissociation
limit are excited, then the greater part of the molecules will have 
separated into atoms. At lower temperatures, the anharmonic nature 
of the oscillations affects but little the value of the statistical sum. 

Thermodynamic quantities for a gas governed by the equipartition
law. The specific heat of a gas governed by the principle of equipartition
of energy is constant over a wide range of temperatures. Hence,
the ratio of specific heats

$$\gamma = \frac{C_p}{C_V} = \frac{C_V + N}{C_V} \tag{47.25}$$

is also constant.

It will be convenient, in certain further applications, to express
the thermodynamic quantities in terms of $\gamma$. The function $f(\theta)$ is
proportional to $\theta^{C_V/N} e^{-U_0/\theta} = \theta^{\frac{1}{\gamma-1}} e^{-U_0/\theta}$.
Whence we obtain an expression for the energy of a gas, which energy
we shall write here to the accuracy of the constant term $NU_0$:

$$\mathcal{E} = C_V\theta = \frac{N\theta}{\gamma-1} = \frac{pV}{\gamma-1}. \tag{47.26}$$

The enthalpy is equal to

$$I = \mathcal{E} + pV = \frac{\gamma}{\gamma-1}\,pV = \frac{\gamma}{\gamma-1}\,N\theta. \tag{47.27}$$

To the accuracy of the constant term, the entropy is

$$S = N\ln V + \frac{N}{\gamma-1}\ln\theta = \frac{N}{\gamma-1}\ln pV^{\gamma}. \tag{47.28}$$

Whence we get an equation for an adiabatic process in a gas governed
by the equipartition principle:

$$pV^{\gamma} = \text{const}. \tag{47.29}$$

$\gamma$ is often called the adiabatic index. From exercise 9 of the preceding
section we find an expression for the velocity of sound

$$u = \sqrt{\gamma pV}, \tag{47.30}$$

where V is the volume of unit mass.

Let us bring together the laws from which the specific heat of a gas
subject to the equipartition principle is calculated. At a sufficiently
high temperature there is a specific heat $\frac{N}{2}$ per rotational degree
of freedom, and also per translational degree of freedom, since each
such degree of freedom contributes one squared term to the energy
expression, of the form $\frac{p^2}{2m}$ or $\frac{M^2}{2J}$.
Hence, the integral of the distribution function acquires either
a factor $\sqrt{2\pi m\theta}$ or $\sqrt{2\pi J\theta}$; this yields the term $\frac{N\theta}{2}$ when calculating
the energy. If we can replace the summation by an integral, the
vibrational degree of freedom contains two variables appearing
quadratically in the energy [see (47.22)], thus yielding the mean
energy $\theta$.

To summarize, at a sufficiently high temperature, each vibrational
degree of freedom, if it is strongly excited, makes a contribution N
to the specific heat. If we apply the equipartition principle, then the
specific heat of a molecule consisting of i atoms, which are not in
one line ($i > 2$), is equal to $(3i - 3)N$, and if the atoms form a line
in the equilibrium position, it is $\left(3i - \frac{5}{2}\right)N$.

Thus, for a triatomic molecule of triangular form (for example
H₂O), a specific heat $6N$ is obtained for full excitation of all the degrees
of freedom (besides electronic), and the ratio $C_p/C_V = 7/6$. If the vibrations
are not yet excited, then $C_V = 3N$ and $C_p/C_V = 4/3$. At the
lowest temperature, only the translational degrees of freedom remain,
as in the case of a monatomic gas, which gives $C_V = \frac{3}{2}N$ and $C_p/C_V = 5/3$.

If the atoms of a triatomic molecule form a line (for example CO₂),
then the maximum specific heat $C_V = \frac{13}{2}N$ and $C_p/C_V = 15/13$, i.e.,
$C_V$ is greater for a linear molecule than for a triangular molecule.
But if vibrations are not excited then $C_V = \frac{5}{2}N$, which is now less
than for a triangular molecule. Such an intersection of the specific
heat curves of CO₂ and H₂O with change of temperature is actually
observed.
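The counting rules of the equipartition principle are easily mechanized. The sketch below is a minimal illustration with hypothetical function names: it returns $C_V/N$ for a molecule of a given number of atoms, counting 1/2 per translational or rotational degree of freedom and 1 per excited vibrational degree of freedom, and then the adiabatic index $\gamma = (C_V + N)/C_V$.

```python
from fractions import Fraction

def cv_per_molecule(atoms, linear, vibrations_excited):
    """C_V / N from equipartition for a molecule made of `atoms` atoms."""
    if atoms == 1:
        return Fraction(3, 2)                      # translation only
    rotational = Fraction(1) if linear else Fraction(3, 2)
    n_vib = 3 * atoms - (5 if linear else 6)       # number of normal modes
    vibrational = n_vib if vibrations_excited else 0
    return Fraction(3, 2) + rotational + vibrational

def adiabatic_index(cv):
    """gamma = (C_V + N) / C_V, with cv = C_V / N."""
    return (cv + 1) / cv

# Triangular triatomic (H2O-like) and linear triatomic (CO2-like) molecules.
for name, linear in (("triangular", False), ("linear", True)):
    for excited in (True, False):
        cv = cv_per_molecule(3, linear, excited)
        print(name, "vibrations excited:", excited, cv, adiabatic_index(cv))
```

Exact fractions are used so that values such as 13/2 and 15/13 appear without rounding.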

Adiabatic demagnetization. Of great interest is the process of 
isentropic (adiabatic) demagnetization. In Sec. 40 we considered the 
paramagnetism of the salts of rare-earth elements due to the free 
rotation of the magnetic moments of unfilled shells. Such moments 
may be interpreted as a “gas.” 

Let us suppose that a salt is magnetized to saturation at low temper¬ 
ature and is then suddenly demagnetized. Its entropy does not have 
time to change. But if all the moments are orientated in one direction, 
the entropy is small because this state is obtained in a small number 
of ways (in one way, in the limit). When the field is rapidly removed, 
the entropy will remain small only due to a big drop in the temperature. 
This method has been used to obtain temperatures of several thou¬ 
sandths of a degree above absolute zero. 

Exercises 

1) Find the work and the quantity of heat obtained by a gas in an isothermal
process.

The work is equal to the change in free energy:

$$A = -N\theta \ln \frac{V_2}{V_1},$$

where $V_1$ and $V_2$ are the initial and final volumes. The quantity of heat
is expressed in terms of entropy change:

$$Q = \theta\,(S_2 - S_1) = N\theta \ln \frac{V_2}{V_1}.$$

A and Q are equal and opposite in sign, because energy remains unchanged
at constant temperature.

2) Two portions of different gases occurring at the same temperature and
pressure are mixed. Find the increase in entropy.

$$\Delta S = N_1 \ln \frac{\theta}{p_1} + N_2 \ln \frac{\theta}{p_2} - N_1 \ln \frac{\theta}{p} - N_2 \ln \frac{\theta}{p},$$

where $p_1$ and $p_2$ are the partial pressures of both gases after mixing. Whence

$$\Delta S = N_1 \ln \frac{p}{p_1} + N_2 \ln \frac{p}{p_2} = N_1 \ln \frac{N_1 + N_2}{N_1} + N_2 \ln \frac{N_1 + N_2}{N_2}.$$

If two portions of the same gas are mixed under the same conditions, the
entropy will equal $(N_1 + N_2) \ln \frac{\theta}{p}$ after mixing, so that $\Delta S = 0$, as it should
be. This would not have occurred if the factor $N!$ in the statistical sum had
not been introduced into the expression for free energy. Due to this factor,
only the summation over physically different states of a gas appears in the
free energy, and the entropy cannot change when two portions of the same
gas are combined at the same temperature and at equal pressure.
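For a quick numerical feel for this exercise, the entropy of mixing of two different ideal gases at equal temperature and pressure may be taken as $N_1\ln\frac{N_1+N_2}{N_1} + N_2\ln\frac{N_1+N_2}{N_2}$ (entropy being dimensionless, as elsewhere in this part). The sketch below evaluates it; the function name and the `same_gas` switch are hypothetical conveniences.

```python
import math

def mixing_entropy(n1, n2, same_gas=False):
    """Entropy increase on mixing at equal temperature and pressure."""
    if same_gas:
        # Identical molecules: no new physically different states arise.
        return 0.0
    total = n1 + n2
    return n1 * math.log(total / n1) + n2 * math.log(total / n2)

print(mixing_entropy(1.0, 1.0))                  # 2 ln 2 for equal amounts
print(mixing_entropy(3.0, 1.0))
print(mixing_entropy(3.0, 1.0, same_gas=True))   # no change for one gas
```

For equal amounts of two different gases the increase is $\ln 2$ per molecule.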

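3) Calculate the free energy of a gas in a centrifuge, of radius R and length l,
rotating with angular velocity $\omega$. Find the mean square distance of the particle
from the axis.

The centrifugal force is equal to $m\omega^2 r$, which corresponds to an effective
potential energy $U = -\int m\omega^2 r\,dr = -\frac{m\omega^2 r^2}{2}$.

Whence we obtain an expression for the free energy

$$F = -N\theta \ln\left[\frac{e f(\theta)}{N}\,2\pi l \int_0^R e^{\frac{m\omega^2 r^2}{2\theta}}\,r\,dr\right] = -N\theta \ln\left[\frac{e f(\theta)}{N}\,\frac{2\pi l\,\theta}{m\omega^2}\left(e^{\frac{m\omega^2 R^2}{2\theta}} - 1\right)\right].$$

The free energy satisfies the general relation $dF = -S\,d\theta - \Lambda\,d\lambda$
[cf. (46.38), where $\lambda = V$].

Regarding $\omega^2$ as an external parameter $\lambda$, we determine the mean square
distance of the particle from the axis:

$$\overline{r^2} = -\frac{2}{Nm}\,\frac{\partial F}{\partial(\omega^2)} = \frac{R^2}{1 - e^{-\frac{m\omega^2 R^2}{2\theta}}} - \frac{2\theta}{m\omega^2},$$

because, if $\omega^2 = \lambda$, then

$$N\,\overline{\frac{\partial U}{\partial \lambda}} = \frac{\partial F}{\partial \lambda} = -\Lambda \quad \text{[cf. (40.7)]}.$$

At very large angular velocities $\overline{r^2} \to R^2$, and at small velocities $\overline{r^2} = \frac{R^2}{2}$.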


4) Find the velocity of stationary exit of a gas in vacuum, considering the
adiabatic index as constant.

$$v = \sqrt{2I} = \sqrt{\frac{2\gamma}{\gamma-1}\,pV} = \sqrt{\frac{2}{\gamma-1}}\;u.$$


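5) Find the maximum flux density of a gas for an adiabatic transition from a
pressure $p_0$ and specific volume $V_0$ to a pressure p.

From the adiabatic equation, $V = V_0\left(\frac{p_0}{p}\right)^{1/\gamma}$.

Whence we find the final enthalpy:

$$I = \frac{\gamma}{\gamma-1}\,pV = \frac{\gamma}{\gamma-1}\,p_0^{1/\gamma}\,V_0\,p^{\frac{\gamma-1}{\gamma}}.$$

The flux density is

$$j = \frac{v}{V} = \frac{1}{V_0}\left(\frac{p}{p_0}\right)^{1/\gamma} \sqrt{\frac{2\gamma}{\gamma-1}\left(p_0 V_0 - p_0^{1/\gamma}\,V_0\,p^{\frac{\gamma-1}{\gamma}}\right)}.$$

Denoting $\beta = p/p_0$, we obtain

$$j = \sqrt{\frac{2\gamma}{\gamma-1}\,\frac{p_0}{V_0}}\;\sqrt{\beta^{\frac{2}{\gamma}} - \beta^{\frac{\gamma+1}{\gamma}}}.$$

Hence, the maximum flux density is obtained at

$$\beta_{\max} = \left(\frac{2}{\gamma+1}\right)^{\frac{\gamma}{\gamma-1}}.$$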
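The position of the maximum can be checked by brute force. The sketch below assumes that, with $\beta = p/p_0$, the flux density is proportional to $\sqrt{\beta^{2/\gamma} - \beta^{(\gamma+1)/\gamma}}$ and that the maximum lies at $\beta_{\max} = \left(\frac{2}{\gamma+1}\right)^{\gamma/(\gamma-1)}$; it compares this closed form with a search on a uniform grid. The function names and grid size are illustrative choices.

```python
import math

def flux_shape(beta, gamma):
    """beta-dependent factor of the flux density for adiabatic outflow."""
    value = beta ** (2.0 / gamma) - beta ** ((gamma + 1.0) / gamma)
    return math.sqrt(max(0.0, value))   # guard against round-off below zero

def beta_max_formula(gamma):
    """Pressure ratio p/p0 at which the flux density is largest."""
    return (2.0 / (gamma + 1.0)) ** (gamma / (gamma - 1.0))

def beta_max_numeric(gamma, points=100001):
    """Brute-force search for the maximum on a uniform grid in (0, 1)."""
    grid = [i / (points - 1) for i in range(1, points - 1)]
    return max(grid, key=lambda b: flux_shape(b, gamma))

gamma = 1.4   # diatomic gas with unexcited vibrations
print(beta_max_formula(gamma), beta_max_numeric(gamma))
```

For $\gamma = 7/5$ the critical pressure ratio is a little above one half.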


6) Show that the flux density is maximum when the flow velocity is equal 
to the velocity of sound at the given point. 

Note. If the pressure at the output is made less than $p_0\beta_{\max}$, the exit velocity
will no longer vary and will remain equal to the local value of the velocity of
sound. Any disturbance occurring in the flow is propagated with the velocity of
sound. When the flow velocity is equal to the velocity of sound, the disturbance
is carried away by the flow. Therefore, if the external pressure $p_1$ is less than
$p_0\beta_{\max}$, it will no longer affect the exit. In order to obtain supersonic exit, a
Laval nozzle must be used (see exercise 10 of the previous section) for the flow
velocity to equal the local velocity of sound at the narrowest part of the nozzle.

7) Derive a relationship between pressure and density in a shock wave propagated
in an ideal gas of constant specific heat, and also find the sudden entropy
change in the shock wave (see exercise 11, Sec. 46).

From equation (*) in the exercise mentioned, p. 534, we have a relationship
between v and D:

$$\frac{v}{D} = \frac{\rho - \rho_0}{\rho}.$$

Whence, from equation (**), we obtain

$$v^2 = \frac{(p - p_0)(\rho - \rho_0)}{\rho\rho_0}, \qquad D^2 = \frac{\rho}{\rho_0}\,\frac{p - p_0}{\rho - \rho_0}.$$

We now substitute the expressions for $v^2$ and $\frac{v}{D}$, and also I from (47.27), into
equation (***), and find a relation between density and pressure for shock
compression:

$$\frac{p}{p_0} = \frac{(\gamma+1)\rho - (\gamma-1)\rho_0}{(\gamma+1)\rho_0 - (\gamma-1)\rho}.$$

It must not be thought that for a shock compression from $\rho_0$ to $\rho$, the pressure
actually follows this equation, passing through the entire intermediate series
of values corresponding to the intermediate values of $\rho$. The relationship
obtained shows what final values of p may be obtained when the gas is compressed
with given initial $p_0$, $\rho_0$ in the shock wave. The greatest density
increase (for $p/p_0 \to \infty$) corresponds to $\frac{\rho}{\rho_0} = \frac{\gamma+1}{\gamma-1}$, i.e., the density in the
shock wave cannot be increased by more than $\frac{\gamma+1}{\gamma-1}$ times. For example, when
$\gamma = 7/5$, the quantity $\rho/\rho_0 < 6$. Using the formula (47.28) for entropy, we find

$$\Delta S = \frac{N}{\gamma-1}\left[\ln \frac{(\gamma+1)\rho - (\gamma-1)\rho_0}{(\gamma+1)\rho_0 - (\gamma-1)\rho} - \gamma \ln \frac{\rho}{\rho_0}\right].$$

If the density does not change significantly, we can expand this expression
in a series in powers of $\epsilon = \frac{\rho}{\rho_0} - 1$. It then turns out that

$$\Delta S = \frac{\gamma(\gamma+1)}{12}\,N\epsilon^3.$$

Hence, $\Delta S$ is third order with respect to $\epsilon$ or, what is just the same, with
respect to the gas-compression velocity v. Therefore, we can neglect the change
in entropy for a slow gas compression, and consider that the gas is compressed
reversibly.
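The third-order smallness of the entropy jump can be verified numerically. The sketch below assumes the shock relation in the form $\frac{p}{p_0} = \frac{(\gamma+1)\rho - (\gamma-1)\rho_0}{(\gamma+1)\rho_0 - (\gamma-1)\rho}$ and the entropy $S = \frac{N}{\gamma-1}\ln pV^{\gamma}$ with $V = 1/\rho$, and compares the exact entropy change per molecule with the leading term $\frac{\gamma(\gamma+1)}{12}\epsilon^3$, where $\epsilon = \rho/\rho_0 - 1$. The function names are hypothetical.

```python
import math

def pressure_ratio(rho_ratio, gamma):
    """Shock relation: p/p0 as a function of rho/rho0 for an ideal gas."""
    num = (gamma + 1.0) * rho_ratio - (gamma - 1.0)
    den = (gamma + 1.0) - (gamma - 1.0) * rho_ratio
    return num / den

def entropy_jump(rho_ratio, gamma):
    """Entropy change per molecule, from S = N/(gamma-1) * ln(p * V**gamma)."""
    return (math.log(pressure_ratio(rho_ratio, gamma))
            - gamma * math.log(rho_ratio)) / (gamma - 1.0)

def entropy_jump_weak(rho_ratio, gamma):
    """Leading term of the expansion in eps = rho/rho0 - 1."""
    eps = rho_ratio - 1.0
    return gamma * (gamma + 1.0) * eps ** 3 / 12.0

gamma = 1.4
for rho_ratio in (1.01, 1.1, 2.0, 5.0):
    print(rho_ratio, pressure_ratio(rho_ratio, gamma),
          entropy_jump(rho_ratio, gamma), entropy_jump_weak(rho_ratio, gamma))
```

For a one per cent compression the two expressions agree to within a few per cent, while for strong shocks the cubic estimate fails and the pressure ratio grows without bound as $\rho/\rho_0$ approaches $(\gamma+1)/(\gamma-1)$.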

To the same approximation, the shock wave velocity assumes the following
form:

$$D = \sqrt{\frac{\gamma p_0}{\rho_0}},$$

which coincides with the velocity of sound in an ideal gas. Thus, a weak shock
wave degenerates to a sound wave. This holds for any substance.

A compression sound wave sent behind another sound wave in a substance 
will overtake it, because it will be moving in a substance having a certain
velocity and, in addition, initially compressed by the first wave. A sufficiently 
large number of such waves, sent one after another, must finally merge to 
form a shock wave (of finite amplitude) propagated in the substance faster 
than sound. 


Sec. 48. Fluctuations

The reversibility of mechanical equations with respect to time. 
The initial and final states of a system in classical mechanics are 
uniquely interrelated: one of them completely determines the other. 
This is expressed mathematically by the fact that if the signs of all 
the velocities are reversed, the entire motion of the system will be
in the reverse direction. Changing the sign of the velocities is formally
equivalent to changing the sign of the time. But if the sign of the
time is changed, the form of Lagrange’s equations will not change.
It is still easier to see the invariance of the equations of Newton’s
Second Law with respect to a change of t to $-t$: these equations
involve only second derivatives with respect to time, which preserve 
their sign in such a change. 
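This reversibility is easily demonstrated on a computer. The sketch below integrates a harmonic oscillator with the velocity-Verlet scheme (a time-symmetric integration method chosen here for illustration; it is not part of the text), then reverses the velocity and integrates for the same number of steps. The oscillator, step size and initial data are arbitrary assumptions.

```python
def velocity_verlet(x, v, dt, steps, omega=1.0):
    """Integrate x'' = -omega**2 * x with the velocity-Verlet scheme."""
    a = -omega**2 * x
    for _ in range(steps):
        x += v * dt + 0.5 * a * dt * dt
        a_new = -omega**2 * x
        v += 0.5 * (a + a_new) * dt
        a = a_new
    return x, v

x0, v0 = 1.0, 0.3
x1, v1 = velocity_verlet(x0, v0, dt=0.01, steps=1000)    # forward motion
x2, v2 = velocity_verlet(x1, -v1, dt=0.01, steps=1000)   # velocities reversed

# The system retraces its path: (x2, -v2) coincides with (x0, v0)
# up to floating-point round-off.
print(x2 - x0, -v2 - v0)
```

The return to the initial state is exact up to round-off because the scheme, like Newton's equations themselves, is unchanged when the sign of the time step is reversed.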

In order to perform the same transition in electrodynamical 
equations, we must first of all change the signs of all the currents. 
In order that Maxwell’s equations (12.24)-(12.27) should not change, 
we must also change the sign of the magnetic field together with 
the current, and leave the sign of the electric field unchanged.
Since the magnetic field is an axial vector, or a pseudovector (see 
Sec. 16), the choice of sign is a matter of pure convenience. 

Quantum mechanical equations also preserve their form when t 
is changed to — t. In the simplest case, when the Hamiltonian operator 
is real (i.e., when it does not involve i), the transition $t \to -t$ simply


Sec. 48] 


FLUCTUATIONS


547 


denotes the simultaneous transition $\psi \to \psi^*$, which can be directly
seen from Schrödinger’s equation (24.11). But the function $\psi^*$ is
completely equivalent to $\psi$ (no matter which of them is regarded
as conjugate). In more complicated cases, when the operator $\hat{\mathcal{H}}$ is
complex, we can likewise always pass from the function $\psi$ to another
(physically fully equivalent to it) together with the transition from
t to $-t$.

Statistics and reversibility in time. We shall now examine the way 
that the laws of statistical mechanics relate to time inversion. 
Statistical mechanics states that if at some initial instant of time 
a system is deviated from statistical equilibrium, then, in the over¬ 
whelming majority of cases, it will subsequently approach equilibrium. 
A system which is already in equilibrium will remain in equilibrium, 
no matter what imaginable changes in the sign of time are performed 
in the mechanical equations describing the detailed microscopic state 
of the system. Therefore, a situation arises that is rather paradoxical 
at first sight: statistical laws, which appear noninvariant with respect 
to time inversion, are derived from the equations of mechanics! 

The problem stated in classical mechanics and statistics. Let us 
examine this paradox in the limits of the classical laws of motion. 
First take the following example. Let a gas occupy one half of a vessel 
divided by a partition. After this partition is removed the gas will 
occupy the whole vessel. Let us follow the motion of each molecule
of the gas in this irreversible process (in classical mechanics this is,
in principle, possible). The motion of all the molecules is represented
in phase space by the displacement of a single point along a phase
trajectory. If in the state of statistical equilibrium we mentally change
the signs of all the velocities, the phase point in its imaginary movement
will be displaced in the reverse direction, and all the gas will
collect in one half of the vessel. Since any equilibrium state of the gas
is attainable from a nonequilibrium state, and both velocity signs
are, a priori, equiprobable, the gas must come out of the statistical
equilibrium state as often as it enters it, which, it would appear,
is never observed.

In actual fact, in statistics, equilibrium is not just any strictly 
defined state, but a whole range of states in which a closed system 
spends the greater part of its time. The phase point roams about in 
the equilibrium range for an extremely long time before spontaneously 
leaving it for any considerable distance. Through the vast majority 
of phase points in the statistical equilibrium region there pass trajec¬ 
tories which almost never enter regions that correspond noticeably 
to nonequilibrium states. 

If we choose a certain section of the equilibrium range, we may say 
that the system emerges from it just as frequently as it returns to it, 
but that in the vast majority of cases it does not go “far.” 




Therefore, the apparent irreversibility of statistics is due to the
way the problem is stated in it: the system does not remain for long
in nonequilibrium states, and therefore rapidly enters equilibrium
states; it remains for a very long time in equilibrium states, so that
the probability of spontaneously leaving these states can, in the
majority of cases, be neglected.

Quantum mechanics and the irreversibility of transitions. The
principle of detailed balance (see Sec. 39) is fundamental to quantum
statistics. In accordance with this principle, the probabilities of direct 
and inverse transitions are equal between two states having the same 
statistical weight. However, it by no means follows from this prin¬ 
ciple that the probability of transition from an equilibrium to a non¬ 
equilibrium state is the same as for a transition from a nonequilibrium 
to an equilibrium state. A statistically equilibrium state includes very 
many equiprobable microstates, while a nonequilibrium state con¬ 
tains a comparatively small number of microstates: the reason why 
the system spends the greater part of its time in equilibrium is that
there are incomparably more equilibrium states than nonequilibrium 
states. Each given microstate, belonging to the set of statistical equilib¬ 
rium states, passes to another state from the same range with over¬ 
whelming probability, while to a nonequilibrium state it passes with 
a negligible probability. A nonequilibrium state passes preferentially 
to an equilibrium state because the transition to a state of less equilib¬ 
rium can occur in an incomparably smaller number of equally prob¬ 
able ways. It is for this reason that a system “tends,” as it were, to 
equilibrium, despite the identical probability for direct and inverse
transitions between any two initially equiprobable microstates. 

Poisson’s formula. The spontaneous transition of a system from an 
equilibrium state to a noticeably nonequilibrium state is of very 
small probability, but is not completely impossible. Deviations of 
actual values from their averages are more probable, the smaller the 
system in which they occur. If, for example, gas molecules are observed
in a cube with a side $10^{-6}$ cm, then, under normal conditions
(0 °C, 760 mm Hg) the mean number of molecules is in all 27. The
molecules may leave for neighbouring portions, so that their actual
number in a certain volume will exhibit a very noticeable deviation
from the number 27.

It is very easy to determine the probability that there will be N
molecules in a given volume V, if there are $N_0$ molecules contained
in the total volume $V_0$. The probability of finding a single molecule
in the volume V is obviously equal to $\frac{V}{V_0}$. Therefore, the probability
of finding N molecules in the volume V, and $N_0 - N$ molecules in the
remaining portion of the volume, is equal to

$$w_N = \frac{N_0!}{(N_0 - N)!\,N!}\left(\frac{V}{V_0}\right)^{N}\left(1 - \frac{V}{V_0}\right)^{N_0 - N}. \tag{48.1}$$




An analogous formula was derived in Sec. 39 for the probability of 
obtaining tails k times. 

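Let the total number of molecules $N_0$ be arbitrarily large and let
N be any number, though considerably less than $N_0$. We replace the
factorial ratio thus:

$$\frac{N_0!}{(N_0 - N)!} = N_0(N_0 - 1)\cdots(N_0 - N + 1) \approx N_0^{N},$$

and, as $N_0 \to \infty$,

$$\left(1 - \frac{V}{V_0}\right)^{N_0 - N} \to e^{-\frac{N_0 V}{V_0}} = e^{-\bar{N}}, \qquad \left(\frac{V}{V_0}\right)^{N} = \frac{\bar{N}^{N}}{N_0^{N}},$$

where $\bar{N} = \frac{N_0 V}{V_0}$ by definition of $\bar{N}$. Substituting all the obtained expressions
into the initial formula, we find the required probability

$$w_N = e^{-\bar{N}}\,\frac{\bar{N}^{N}}{N!} \tag{48.2}$$

(Poisson’s formula). It will be shown in exercise 1 that at large $\bar{N}$
the distribution (48.2) has a very sharp maximum at $N = \bar{N}$.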
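The quality of the Poisson approximation is easily checked for the example of 27 molecules. The sketch below evaluates the exact binomial probability of finding N out of $N_0$ molecules in a fraction $V/V_0$ of the volume and compares it with the Poisson expression $e^{-\bar{N}}\bar{N}^{N}/N!$. The total number $N_0 = 10^6$ and the function names are illustrative assumptions.

```python
import math

def binomial_probability(n, n_total, volume_fraction):
    """Exact probability of n out of n_total molecules in a fraction V/V0."""
    return (math.comb(n_total, n)
            * volume_fraction ** n
            * (1.0 - volume_fraction) ** (n_total - n))

def poisson_probability(n, n_mean):
    """Poisson's formula, evaluated through logarithms to avoid overflow."""
    return math.exp(-n_mean + n * math.log(n_mean) - math.lgamma(n + 1))

n_total = 10**6                     # assumed size of the large system
n_mean = 27.0                       # mean number in the small volume
fraction = n_mean / n_total         # V / V0

for n in (10, 20, 27, 35, 45):
    print(n, binomial_probability(n, n_total, fraction),
          poisson_probability(n, n_mean))
```

Even for a mean of only 27 molecules the probability of finding, say, 10 or 45 molecules is small but far from negligible, which is the sense in which fluctuations are noticeable in small volumes.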

Fluctuation probability. Here, we shall obtain a general formula 
for fluctuation probability in a subsystem of a large system. The small 
volume of gas just considered may be taken as a special case of such 
a subsystem. 

Let it be that a certain deviation from statistical equilibrium has
occurred in the subsystem. The entire large system will thus have
deviated somewhat from equilibrium. The ratio of the probabilities
for the equilibrium and nonequilibrium states of the large system is
equal to the ratio of the statistical weights of the states

$$\frac{w}{w_0} = \frac{G}{G_0}, \tag{48.3}$$

where w and G refer to the large system. The index 0 denotes the
equilibrium value.

Expressing the statistical weight in terms of entropy ($S = \ln G$),
we obtain

$$\frac{w}{w_0} = e^{S - S_0}. \tag{48.4}$$

Formula (48.4) can be given a somewhat different form. Since the
large system is closed, its energy remains unchanged for fluctuation
in the subsystem: $\mathcal{E} = \mathcal{E}_0$. But the total energy and the free energy
F are related thus: $F = \mathcal{E} - \theta S$, $F_0 = \mathcal{E}_0 - \theta S_0$. It follows from these
equations that the change in entropy of a system undergoing fluctuation
is equal to the change in free energy, taken with opposite
sign, divided by the temperature:

$$S - S_0 = -\frac{F - F_0}{\theta} = -\frac{\Delta F}{\theta}. \tag{48.5}$$

The change in free energy is expressed, in accordance with (46.37),
in terms of the minimum work:

$$\Delta F = A_{\min}.$$

Here, $A_{\min}$ is the minimum external work which must be performed
on the system in order to produce this fluctuation reversibly, i.e.,
without change in entropy.

Thus, the fluctuation probability is defined by the following formula
derived by Einstein:

$$w \sim e^{-\frac{A_{\min}}{\theta}}. \tag{48.6}$$

It must be borne in mind that fluctuation occurs spontaneously, without
expending any external work. This was taken into account by
the equation $\mathcal{E} = \mathcal{E}_0$. The same deviation of actual values from
equilibrium values in the subsystem can be produced without expecting
a fluctuation in it, by performing work $A_{\min}$ reversibly.

Fluctuations of thermodynamic quantities. The minimum work
expression may be reduced to a more convenient form for actual 
calculations. We shall consider that the large system has been divided 
into two parts: a small part, in which a fluctuation occurs and sta¬ 
tistical equilibrium is spontaneously disrupted, and the remaining 
part, in which the variation of quantities is reversible. In other words, 
the fluctuation has produced a deviation from equilibrium only in a 
small part of the system. The quantities referring to this part will be
written without any indices, while those relating to the entire remain¬ 
ing part will be primed, and equilibrium quantities will be written 
with 0 index. 

By definition, the minimum work is calculated in the case of constant 
entropy of the whole system; i.e., as if instead of a fluctuation occur¬ 
ring there is some change in the quantities at the expense of external 
action which does not destroy the statistical equilibrium. Given 
external action, the work is equal to the change in the energy of the 
system: 

$$A_{\min} = \Delta\mathcal{E} + \Delta\mathcal{E}'. \tag{48.7}$$

The work here is equal to the energy change taken with positive
sign (48.7) because, by definition, $A_{\min}$ is performed on the system.




The changes in the quantities in the large system are very small,
being less the larger the system, so that $\Delta\mathcal{E}'$ may be replaced from the
thermodynamic identity (46.26):

$$\Delta\mathcal{E}' = \theta_0\,\Delta S' - p_0\,\Delta V'. \tag{48.8}$$

As already pointed out, $A_{\min}$ is calculated in the case of reversible
process. Therefore, $\Delta S' = -\Delta S$ and, in addition, $\Delta V' = -\Delta V$, of
course. Hence,

$$A_{\min} = \Delta\mathcal{E} - \theta_0\,\Delta S + p_0\,\Delta V. \tag{48.9}$$


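Large fluctuations are highly improbable. Therefore, the quantities
$\Delta S$ and $\Delta V$ should be regarded as small in the subsystem also; but
it is now necessary, here, to make a series expansion up to second
order quantities since, otherwise, $A_{\min}$ would be identically equal to
zero (close to the maximum, the entropy expansion can begin only
with quadratic terms):

$$\Delta\mathcal{E} = \left(\frac{\partial\mathcal{E}}{\partial S}\right)_V \Delta S + \left(\frac{\partial\mathcal{E}}{\partial V}\right)_S \Delta V + \frac{1}{2}\left[\frac{\partial^2\mathcal{E}}{\partial S^2}(\Delta S)^2 + 2\,\frac{\partial^2\mathcal{E}}{\partial S\,\partial V}\,\Delta S\,\Delta V + \frac{\partial^2\mathcal{E}}{\partial V^2}(\Delta V)^2\right].$$

Since $\left(\frac{\partial\mathcal{E}}{\partial S}\right)_V = \theta_0$, $\left(\frac{\partial\mathcal{E}}{\partial V}\right)_S = -p_0$, only the quadratic terms remain
in the expression for $A_{\min}$. These terms may be represented
in somewhat different form. Taking advantage of the fact that

$$\frac{\partial^2\mathcal{E}}{\partial S^2} = \left(\frac{\partial\theta}{\partial S}\right)_V, \qquad \frac{\partial^2\mathcal{E}}{\partial V^2} = -\left(\frac{\partial p}{\partial V}\right)_S, \qquad \frac{\partial^2\mathcal{E}}{\partial V\,\partial S} = \left(\frac{\partial\theta}{\partial V}\right)_S = -\left(\frac{\partial p}{\partial S}\right)_V,$$

we write $A_{\min}$ as

$$A_{\min} = \frac{1}{2}\left(\Delta\theta\,\Delta S - \Delta p\,\Delta V\right).$$

And so Einstein’s formula for fluctuation probability is transformed
thus:

$$w \sim e^{\frac{\Delta p\,\Delta V - \Delta\theta\,\Delta S}{2\theta}}, \tag{48.11}$$

where the 0 index is omitted from $\theta$.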

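Let us find the probability for volume and temperature fluctuations.
To do this we replace $\Delta p$ and $\Delta S$ by their expressions in
terms of volume and temperature:

$$\Delta p = \left(\frac{\partial p}{\partial V}\right)_\theta \Delta V + \left(\frac{\partial p}{\partial\theta}\right)_V \Delta\theta, \qquad \Delta S = \left(\frac{\partial S}{\partial V}\right)_\theta \Delta V + \left(\frac{\partial S}{\partial\theta}\right)_V \Delta\theta.$$

But according to (46.39)
$\left(\frac{\partial S}{\partial V}\right)_\theta = \left(\frac{\partial p}{\partial\theta}\right)_V$,
so that the mixed terms cancel and the right-hand side of equation
(48.11) is represented as the product of two factors dependent only
upon $\Delta V$ and $\Delta\theta$:

$$W \propto \exp\left[\frac{1}{2\theta}\left(\frac{\partial p}{\partial V}\right)_\theta (\Delta V)^2\right]\,\exp\left[-\frac{C_V}{2\theta^2}(\Delta\theta)^2\right], \tag{48.12}$$

where $C_V = \theta\left(\frac{\partial S}{\partial\theta}\right)_V$.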

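It is now easy to determine the mean square fluctuations
$\overline{(\Delta V)^2}$ and $\overline{(\Delta\theta)^2}$. For the time
being we write

$$\alpha = -\frac{1}{2\theta}\left(\frac{\partial p}{\partial V}\right)_\theta. \tag{48.13}$$

Then the square of the volume fluctuation is easily written in the
form

$$\overline{(\Delta V)^2} = -\frac{\partial}{\partial\alpha}\ln\int_{-\infty}^{\infty} e^{-\alpha(\Delta V)^2}\,d(\Delta V) = -\frac{\partial}{\partial\alpha}\ln\sqrt{\frac{\pi}{\alpha}} = \frac{1}{2\alpha}. \tag{48.14}$$

The integration was justifiably extended from $-\infty$ to $\infty$,
because the integrand is very small for large $\Delta V$.

We finally arrive at the formula

$$\overline{(\Delta V)^2} = -\frac{\theta}{\left(\dfrac{\partial p}{\partial V}\right)_\theta}. \tag{48.15}$$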


It must be remembered that this is not a volume fluctuation in general,
but only one at constant temperature. At constant entropy, for example,
the expression would have been different. The square of the temperature
fluctuation is found analogously:

$$\overline{(\Delta\theta)^2} = \frac{\theta^2}{C_V}. \tag{48.16}$$


This fluctuation is calculated for constant volume. 

We notice that the square of the fluctuation of volume is directly
proportional to the first power of the additive quantity
$\left(\frac{\partial V}{\partial p}\right)_\theta$, since volume is an
additive quantity. Hence, the relative volume fluctuation
$\sqrt{\overline{(\Delta V)^2}}/V$ is inversely proportional to the square
root of the dimensions of the system. This statement, as applied to
energy, was expressed in Sec. 45. The temperature fluctuation
$\sqrt{\overline{(\Delta\theta)^2}}$ is inversely proportional to the
square root of the specific heat and, for this reason, naturally, also
decreases together with the dimensions of the subsystem.
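The scaling with the size of the subsystem can be made concrete for an ideal gas. The sketch below is only an illustration under stated assumptions: it takes the Clapeyron equation $pV = N\theta$, for which (48.15) gives $\overline{(\Delta V)^2} = V^2/N$, and a specific heat $C_V = cN$ with $c = 3/2$ per molecule (a monatomic gas, an assumption not made in the text). The function names are hypothetical.

```python
import math


def relative_volume_fluctuation(n_particles: float) -> float:
    """sqrt(mean (dV)^2) / V for an ideal gas at constant temperature.

    With pV = N*theta, (dp/dV)_theta = -N*theta/V**2, so formula (48.15)
    gives mean (dV)^2 = V**2 / N and the relative value is 1/sqrt(N).
    """
    return 1.0 / math.sqrt(n_particles)


def relative_temperature_fluctuation(n_particles: float,
                                     heat_capacity_per_particle: float = 1.5) -> float:
    """sqrt(mean (dtheta)^2) / theta from (48.16) with C_V = c * N.

    The default c = 3/2 (monatomic ideal gas) is an illustrative assumption.
    """
    return 1.0 / math.sqrt(heat_capacity_per_particle * n_particles)


if __name__ == "__main__":
    # Subsystem sizes chosen only for illustration.
    for n in (1e6, 1e12, 1e18):
        print(f"N = {n:.0e}: dV/V ~ {relative_volume_fluctuation(n):.1e}, "
              f"dtheta/theta ~ {relative_temperature_fluctuation(n):.1e}")
```

Both relative fluctuations fall off as the inverse square root of the number of particles, which is the statement made above for any additive quantity.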

The quantity $\theta$ is the modulus of the Gibbs distribution for the
entire large system. When a fluctuation occurs in the subsystem,
$\theta$, naturally, does not coincide with its temperature, i.e., with
its distribution modulus, which refers to the time interval during which
the subsystem is quasi-independent of the large system. During such a
time interval, $\theta$ is not the temperature of the large system
either, since $\theta$ has the meaning of temperature only in equilibrium.

The temperature and energy of a system are not related quite
unambiguously: at a given energy, temperature can experience slight
fluctuations, and at a given temperature, the energy fluctuates.

Thermodynamic inequalities. From formulae (48.15) and (48.16)
there follow the very important thermodynamic inequalities:

$$\left(\frac{\partial p}{\partial V}\right)_\theta < 0, \qquad C_V > 0. \tag{48.17}$$

The state of a substance can be stable only when these inequalities
are satisfied. If the equation of state of a substance indicates that
these inequalities break down at certain $p$, $V$, and $\theta$, then the
substance is unstable for such $p$, $V$, and $\theta$ and must break up
into separate phases (liquid and vapour, for example), to which other
values of $V$ correspond.

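The mean product of the fluctuations of two quantities. Let us now
consider together the fluctuations of volume and entropy. In this case,
the formula for fluctuation probability looks like this:

$$W \propto \exp\left\{\frac{1}{2\theta}\left[\left(\frac{\partial p}{\partial V}\right)_S (\Delta V)^2 + 2\left(\frac{\partial p}{\partial S}\right)_V \Delta V\,\Delta S - \left(\frac{\partial\theta}{\partial S}\right)_V (\Delta S)^2\right]\right\}. \tag{48.18}$$

Here, the expression on the right-hand side no longer separates into
the product of two factors that depend on each variable separately.
Therefore, besides the volume fluctuation at constant entropy, and the
entropy fluctuation at constant volume, the mean value of the product
of their fluctuations also differs from zero:
$\overline{\Delta V\,\Delta S} \neq 0$. Let us calculate this mean from
formula (48.18). We write (48.18) in shortened notation:

$$W \propto e^{-\alpha_{11}(\Delta V)^2 - 2\alpha_{12}\,\Delta V\,\Delta S - \alpha_{22}(\Delta S)^2}. \tag{48.19}$$

In this notation, the required quantity appears thus:

$$\overline{\Delta V\,\Delta S} = -\frac{1}{2}\,\frac{\partial}{\partial\alpha_{12}}\ln\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-\alpha_{11}(\Delta V)^2 - 2\alpha_{12}\,\Delta V\,\Delta S - \alpha_{22}(\Delta S)^2}\,d(\Delta V)\,d(\Delta S).$$

In order to calculate the integral, we write the quadratic expression
in the exponent in the form of a sum of quadratic terms:

$$\alpha_{11}(\Delta V)^2 + 2\alpha_{12}\,\Delta V\,\Delta S + \alpha_{22}(\Delta S)^2 = \alpha_{11}\left(\Delta V + \frac{\alpha_{12}}{\alpha_{11}}\Delta S\right)^2 + \frac{\alpha_{11}\alpha_{22} - \alpha_{12}^2}{\alpha_{11}}(\Delta S)^2.$$

After this we change the variables in the integral, denoting

$$\xi = \Delta V + \frac{\alpha_{12}}{\alpha_{11}}\Delta S.$$

The integration variable $\xi$ varies within the same limits as
$\Delta V$ and $\Delta S$, i.e., from $-\infty$ to $\infty$. The integral
in (48.19) is

$$\int_{-\infty}^{\infty} e^{-\alpha_{11}\xi^2}\,d\xi \int_{-\infty}^{\infty} e^{-\frac{\alpha_{11}\alpha_{22} - \alpha_{12}^2}{\alpha_{11}}(\Delta S)^2}\,d(\Delta S) = \sqrt{\frac{\pi}{\alpha_{11}}}\,\sqrt{\frac{\pi\alpha_{11}}{\alpha_{11}\alpha_{22} - \alpha_{12}^2}} = \frac{\pi}{\sqrt{\alpha_{11}\alpha_{22} - \alpha_{12}^2}}. \tag{48.20}$$

From this we obtain the required mean value:

$$\overline{\Delta V\,\Delta S} = -\frac{\alpha_{12}}{2\left(\alpha_{11}\alpha_{22} - \alpha_{12}^2\right)}. \tag{48.21}$$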


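We shall now show that this mean quantity is nothing other than
$-\theta\left(\frac{\partial S}{\partial p}\right)_\theta = \theta\left(\frac{\partial V}{\partial\theta}\right)_p$.
Comparing (48.18) with (48.19), we consider the inverse quantity

$$\frac{1}{\overline{\Delta V\,\Delta S}} = -\frac{1}{\theta}\left[\left(\frac{\partial p}{\partial S}\right)_V + \left(\frac{\partial p}{\partial V}\right)_S \frac{(\partial\theta/\partial S)_V}{(\partial p/\partial S)_V}\right] = -\frac{1}{\theta}\left[\left(\frac{\partial p}{\partial S}\right)_V + \left(\frac{\partial p}{\partial V}\right)_S \left(\frac{\partial V}{\partial S}\right)_\theta\right],$$

where use has been made of
$\left(\frac{\partial p}{\partial S}\right)_V = -\left(\frac{\partial\theta}{\partial V}\right)_S$.
But if the pressure is represented as a function of entropy and volume
in the form $p = p\,[S, V(S, \theta)]$, then the latter expression is
$-\frac{1}{\theta}\left(\frac{\partial p}{\partial S}\right)_\theta$,
whence it follows that

$$\overline{\Delta V\,\Delta S} = -\theta\left(\frac{\partial S}{\partial p}\right)_\theta = \theta\left(\frac{\partial V}{\partial\theta}\right)_p.$$

The volume and entropy fluctuations are said to be related, or
correlated. This is understandable, since if the volume of a system
increases, then the statistical weight of its state (i.e., its entropy)
also increases.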
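The closed form for the mean product under a two-variable Gaussian law can be checked by direct numerical integration. The sketch below assumes a weight $e^{-\alpha_{11}x^2 - 2\alpha_{12}xy - \alpha_{22}y^2}$ with a positive definite quadratic form (the convention in which the one-variable law $e^{-\alpha x^2}$ gives a mean square $1/2\alpha$); the function names, the grid size and the integration box are illustrative choices, not part of the text.

```python
import math


def mean_product_closed_form(a11: float, a12: float, a22: float) -> float:
    """Mean of x*y for the weight exp(-(a11*x^2 + 2*a12*x*y + a22*y^2))."""
    return -a12 / (2.0 * (a11 * a22 - a12 * a12))


def mean_product_numeric(a11: float, a12: float, a22: float,
                         n: int = 300, half_width: float = 8.0) -> float:
    """Same mean by the midpoint rule on an n-by-n grid.

    The box extends half_width standard deviations in each variable,
    which makes the neglected tails negligible.
    """
    det = a11 * a22 - a12 * a12
    if a11 <= 0.0 or det <= 0.0:
        raise ValueError("the quadratic form must be positive definite")
    sx = math.sqrt(a22 / (2.0 * det))   # standard deviation of x
    sy = math.sqrt(a11 / (2.0 * det))   # standard deviation of y
    hx = 2.0 * half_width * sx / n
    hy = 2.0 * half_width * sy / n
    numerator = 0.0
    normalisation = 0.0
    for i in range(n):
        x = -half_width * sx + (i + 0.5) * hx
        for j in range(n):
            y = -half_width * sy + (j + 0.5) * hy
            w = math.exp(-(a11 * x * x + 2.0 * a12 * x * y + a22 * y * y))
            numerator += x * y * w
            normalisation += w
    return numerator / normalisation


if __name__ == "__main__":
    coefficients = (1.0, 0.5, 2.0)   # arbitrary illustrative values
    print(mean_product_closed_form(*coefficients))
    print(mean_product_numeric(*coefficients))
```

A positive $\alpha_{12}$ gives a negative mean product and vice versa, so the sign of the correlation is fixed by the sign of the mixed coefficient alone.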

Scattering of light by fluctuations. Because of fluctuations, no
medium can be completely homogeneous. For this reason, electrodynamical
equations, for which the constants of the medium $\varepsilon$ and $\chi$
are regarded as being fixed and everywhere the same, are, strictly
speaking, nowhere valid. There always exist small imperfections
in the homogeneity which must affect the propagation of light in the
medium. Plane waves cannot be propagated in a nonhomogeneous
medium; fluctuations cause scattering of the waves. Let us consider
the quantity of scattered energy as a function of the frequency.

We shall consider that only the dielectric constant $\varepsilon$
experiences fluctuations (since $\chi$ is always close to unity in
transparent media). The wavelength of visible light $\lambda$ is about
half a micron, which is considerably greater than the mean dimensions of
regions in which any noticeable fluctuations occur; this is because very
many molecules are still contained in subsystems whose volume is small
compared with $\lambda^3$, and this is so even in gases under normal
conditions.

The period of oscillation in a light wave is of the order $10^{-15}$ sec,
and is considerably less than the time during which fluctuation 
occurs. A time of at least 10-^"= 10-^^ sec is required for the establish¬ 
ment of statistical equilibrium in the very smallest subsystem. For 
example, under normal conditions, the time interval between two 
collisions of a gas molecule is about 10~® sec, and there is absolutely 
no reason for equilibrium to be established in the condensed phase 
a million times faster. The velocities of the molecules are very close
in all phases at the same temperature, and interaction, in establishing
equilibrium, must be transmitted over distances not less than
10-« = 10-^ cm. 

We can, therefore, consider that in the region where fluctuation
has occurred the parameters of state of the medium, including
polarizability, have changed somewhat. The polarization of the medium,
produced in this region by a harmonic light wave, depends upon
time in accordance with the same law that determines the electric
field of an incident wave, i.e., like $e^{-i\omega t}$. Since the
dimensions of the region are very much smaller than the wavelength of
the incident light, the polarization has the same phase over the whole
region. Consequently, the polarization may be integrated over the
whole region, so that a resultant dipole moment is obtained
proportional to $e^{-i\omega t}$. The light scattering problem must be
considered here in the dipole approximation, in accord with the
condition $r \ll \lambda$ being satisfied (see Sec. 19).

The total scattered energy is proportional to the square of the
second dipole-moment derivative, i.e., to $\omega^4$ or $\lambda^{-4}$,
provided we neglect the way that polarizability depends upon frequency.

When light from the sun passes through the earth's atmosphere,
the blue rays are scattered more than the red, because the
wavelengths of the blue rays are shorter. Therefore, the blue portions
of the solar spectrum predominate in the scattered light from the
sky. This explains the colour of the sky.
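The inverse fourth-power dependence on wavelength can be put into numbers. The sketch below only evaluates the ratio $(\lambda_{\mathrm{ref}}/\lambda)^4$; the two wavelengths, 450 nm for blue and 700 nm for red, are illustrative assumptions, and the frequency dependence of the polarizability is neglected as in the text.

```python
def relative_scattering(wavelength: float, reference_wavelength: float) -> float:
    """Scattered energy at `wavelength` relative to `reference_wavelength`.

    Uses only the lambda**(-4) law; both arguments must be in the same units.
    """
    return (reference_wavelength / wavelength) ** 4


if __name__ == "__main__":
    blue, red = 450e-9, 700e-9   # metres; illustrative values
    ratio = relative_scattering(blue, red)
    print(f"blue light is scattered about {ratio:.1f} times more strongly than red")
```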




Exercises 

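1) Write down Poisson's formula (48.2) for large $N$ and $\overline{N}$.

We represent (48.2) as

$$w_N = \frac{1}{\sqrt{2\pi N}}\,e^{-\overline{N} + N\ln\overline{N} - N\ln N + N},$$

where $N!$ is written to the same accuracy as in exercise 1, Sec. 39.
Further, we must express $\ln\frac{\overline{N}}{N}$ as
$-\ln\left(1 + \frac{N - \overline{N}}{\overline{N}}\right)$ and expand in
a series up to the second term inclusively. This leads to the Gaussian
distribution:

$$w_N = \frac{1}{\sqrt{2\pi\overline{N}}}\,e^{-\frac{(N - \overline{N})^2}{2\overline{N}}}, \qquad \overline{(\Delta N)^2} = \overline{N}.$$

The same value $\overline{(\Delta N)^2}$ is obtained from the exact
Poisson formula if we write

$$\overline{N^2} = \sum_N N^2 w_N = e^{-\overline{N}}\sum_N \frac{N^2\,\overline{N}^{\,N}}{N!} = e^{-\overline{N}}\,\overline{N}\frac{\partial}{\partial\overline{N}}\left(\overline{N}\frac{\partial}{\partial\overline{N}}\,e^{\overline{N}}\right) = \overline{N}^{\,2} + \overline{N}; \qquad \overline{N^2} - \overline{N}^{\,2} = \overline{N}.$$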


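2) Find the pressure fluctuation for constant entropy, and the entropy
fluctuation for constant pressure.

Answers:

$$\overline{(\Delta S)^2} = C_p; \qquad \overline{(\Delta p)^2} = -\theta\left(\frac{\partial p}{\partial V}\right)_S.$$

3) Find the mean value of $\Delta\theta\,\Delta p$.

Answer:

$$\overline{\Delta\theta\,\Delta p} = \frac{\theta^2}{C_V}\left(\frac{\partial p}{\partial\theta}\right)_V.$$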


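4) Find the fluctuation of the energy and the number of quanta for an
electromagnetic field of given frequency.

Proceeding from the expression

$$\overline{\mathcal{E}}_\omega = \frac{\hbar\omega}{e^{\hbar\omega/\theta} - 1},$$

we obtain, with the aid of the formula derived in Sec. 45 (45.22),

$$\overline{(\Delta\mathcal{E}_\omega)^2} = \theta^2\,\frac{\partial\overline{\mathcal{E}}_\omega}{\partial\theta} = \frac{(\hbar\omega)^2\,e^{\hbar\omega/\theta}}{\left(e^{\hbar\omega/\theta} - 1\right)^2}.$$

This formula can be represented thus:

$$\overline{(\Delta\mathcal{E}_\omega)^2} = (\hbar\omega)^2\left[\frac{1}{e^{\hbar\omega/\theta} - 1} + \frac{1}{\left(e^{\hbar\omega/\theta} - 1\right)^2}\right] = \hbar\omega\,\overline{\mathcal{E}}_\omega + \overline{\mathcal{E}}_\omega^{\,2}.$$

Introducing the number of quanta of given frequency,
$N_\omega = \mathcal{E}_\omega/\hbar\omega$, we have

$$\overline{(\Delta N_\omega)^2} = \overline{N}_\omega + \overline{N}_\omega^{\,2}.$$

The fluctuation of the number of quanta appears differently from the
fluctuation of the number of particles of a Boltzmann gas (cf. exercise 1).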

5) A vertically hanging mathematical pendulum performs fluctuation
oscillations about the equilibrium position. Find the mean square
deviation angle.

Let us denote the length of the pendulum by $l$ and its mass by $m$. The
potential energy at a deflection angle $\varphi$ is equal to
$\frac{1}{2}\,mgl\,\varphi^2$. In this case it is the minimum work
appearing in the fluctuation probability. This gives

$$\overline{\varphi^2} = \frac{\theta}{mgl}.$$

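6) Determine by how much the energy flux of a plane electromagnetic wave
decreases in unit length in a gas as a result of density fluctuations.

The dielectric constant of the gas is

$$\varepsilon = 1 + 4\pi n\beta,$$

where $n$ is the number of molecules in unit volume, $\beta$ is the
polarizability of a single molecule. The additional dipole moment
introduced in some volume $V$ by the density fluctuation is equal to

$$d = \beta E\,\Delta N = \frac{\varepsilon - 1}{4\pi n}\,E\,\Delta N.$$

The square of its second time derivative is

$$\ddot{d}^{\,2} = \omega^4 E^2 (\Delta N)^2\left(\frac{\varepsilon - 1}{4\pi n}\right)^2.$$

After averaging over fluctuations, we obtain

$$\overline{\ddot{d}^{\,2}} = \omega^4 E^2\,\overline{N}\left(\frac{\varepsilon - 1}{4\pi n}\right)^2.$$

The attenuation in the energy flux of a plane light wave over unit
length (so that $\overline{N}$ is replaced by $n$) is equal to

$$\frac{\dfrac{2}{3c^3}\,\omega^4 E^2\,n\left(\dfrac{\varepsilon - 1}{4\pi n}\right)^2}{\dfrac{c}{4\pi}\,E^2} = \frac{\omega^4(\varepsilon - 1)^2}{6\pi n c^4} = \frac{8\pi^3(\varepsilon - 1)^2}{3n\lambda^4}.$$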

Sec. 49. Phase Equilibrium 

Separation into phases. A substance consisting of molecules of a
single type is characterized by four quantities: the number of
particles, the temperature, the pressure, and the volume. Only three
of the four quantities are independent, since the equation of state
must always be satisfied. Thus, for an ideal gas the Clapeyron equation

$$pV = N\theta$$

holds.




An ideal gas uniformly fills the whole of its permissible volume
and in this sense is more an exception than the rule. Thus, for example,
if we take one gram of water at a temperature of 20° C, then, no matter
what the positive pressure, it is impossible to make it uniformly
occupy a volume of 10 cm³ (concerning negative pressures, see below
in this section). One gram of water at 20° C placed in such a volume
separates into two parts, liquid and gaseous; in other words, it
does not remain homogeneous. And a certain, very definite,
equilibrium pressure is established in the system.

In the state of statistical equilibrium, the mean number of
molecules going from water to steam in unit time is equal to the mean
number of molecules going from steam to water. It will be seen
immediately that this condition cannot be satisfied for all pressures:
the number of molecular impacts against the liquid surface is directly
proportional to the pressure, whereas the number of evaporating
molecules depends very weakly upon the pressure. Therefore, at a
given temperature, only one pressure corresponds to equilibrium
between liquid and vapour. Under other conditions, separation
may occur into a liquid and a solid, into a gas and a solid, or into solids
of various crystalline modifications or, in general, into phases.

The condition for phase equilibrium. Equilibrium pressure can
be determined by the methods of statistical physics and does not
require a detailed examination of the transition from one phase to
another.

In the equilibrium state, the temperature and pressure in both
phases are, of course, the same. This condition is necessary though
not sufficient for equilibrium. In addition, a sufficient condition
is that the thermodynamic potential be a minimum (see Sec. 46).
The thermodynamic potential is additive: it is equal to the sum of
the potentials of both phases, and the condition of it being a minimum
is written as follows:

$$d\Phi = d\Phi_1 + d\Phi_2 = 0. \tag{49.1}$$

For a given temperature and pressure, the entire change of $\Phi_1$ and
$\Phi_2$ can occur only due to a change in the number of particles:

$$d\Phi_1 = \mu_1\,dN_1, \qquad d\Phi_2 = \mu_2\,dN_2. \tag{49.2}$$

But as many molecules leave one phase as enter another:

$$dN_1 = -dN_2.$$

Whence it follows that

$$(\mu_1 - \mu_2)\,dN_1 = 0. \tag{49.3}$$

Since $dN_1$ is any number, the phase-equilibrium condition consists
in the equality of chemical potentials:

$$\mu_1(p, \theta) = \mu_2(p, \theta). \tag{49.4}$$




This equation may be represented in the form of a curve in the
$p$, $\theta$ plane. In other words, to a certain temperature there
corresponds a very definite pressure.

Three phases of the same substance can also occur in equilibrium.
In this case the equilibrium condition is

$$\mu_1(p, \theta) = \mu_2(p, \theta) = \mu_3(p, \theta). \tag{49.5}$$

These two equations define a single point in the $p$, $\theta$ plane
(the triple point). Out of it come the equilibrium curves between each
two of the three phases (see Fig. 55).

Heat of transition. Usually, two phases of the same substance
differ greatly from one another; their specific volume, entropy,
energy, and other additive quantities experience a discontinuity at
the transition point.

Let us find the quantity of heat released (or absorbed) at the
transition point. Since the transition occurs at constant pressure,
the quantity of heat is equal to the change in the heat function.
We shall refer this heat to a single molecule, so that the heat function
must also be referred to a single molecule. Such a heat function will
be denoted by $i$ (to distinguish it from $I$), while the entropy
referred to a single molecule will be denoted by $s$. The heat of
transition referred to a single molecule is correspondingly equal to

$$q = i_2 - i_1. \tag{49.6}$$

The heat function is connected with the thermodynamic potential
by the relation $I = \Phi + \theta S$. Going over to quantities which
relate to a single molecule, and applying (46.48), we obtain

$$i = \mu + \theta s. \tag{49.7}$$

Whence

$$q = \mu_2 - \mu_1 + \theta(s_2 - s_1).$$

But in equilibrium $\mu_1 = \mu_2$, so that the heat of transition is
equal to the temperature multiplied by the entropy change:

$$q = \theta(s_2 - s_1). \tag{49.8}$$

This result is quite understandable since phase transition is a
reversible process.

The Clausius-Clapeyron equation. Let us consider two phases of
the same substance occurring in mutual equilibrium. We suppose
that the temperature in the equilibrium system is changed somewhat.
It is required to determine how the pressure must be changed so as
to keep the phase equilibrium intact. In other words, the derivative
$dp/d\theta$ must be determined along the equilibrium curve.

The dependence of equilibrium pressure on temperature is given
in the form of an implicit function (49.4). Hence, the derivative is
found according to the usual rule:

$$\frac{dp}{d\theta} = -\frac{\left[\dfrac{\partial(\mu_1 - \mu_2)}{\partial\theta}\right]_p}{\left[\dfrac{\partial(\mu_1 - \mu_2)}{\partial p}\right]_\theta}. \tag{49.9}$$

From (46.48)

$$\left(\frac{\partial\mu}{\partial p}\right)_\theta = v, \qquad \left(\frac{\partial\mu}{\partial\theta}\right)_p = -s, \tag{49.10}$$

where $v$ is the volume referred to a single molecule. Multiplying the
numerator and denominator of the right-hand side of (49.9) by $\theta$,
and making use of (49.8), we obtain the required equation:

$$\frac{dp}{d\theta} = \frac{q}{\theta\,(v_2 - v_1)}, \tag{49.11}$$

which is known as the Clausius-Clapeyron equation.

Let us assume that a transition is considered for which $q$ is positive,
for example, fusion. Then the sign of the derivative $dp/d\theta$ is
dependent upon which phase has the greater specific volume: liquid or
solid. For example, the specific volume of water at the melting point is
less than the specific volume of ice, so that $dp/d\theta$ is a negative
quantity. If the pressure above an equilibrium system of water and ice is
raised, the melting temperature falls.
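The sign and the order of magnitude of the slope for melting ice can be estimated from (49.11). The sketch below rewrites the equation per unit mass with the temperature in kelvins, $dp/dT = L/[T\,(v_{\text{liquid}} - v_{\text{solid}})]$, which follows on multiplying numerator and denominator by the number of molecules per unit mass. The numerical inputs are rounded handbook-style values assumed for illustration only, and the function name is hypothetical.

```python
def clausius_clapeyron_slope(latent_heat: float, temperature: float,
                             v_final: float, v_initial: float) -> float:
    """dp/dT along the equilibrium curve, in Pa/K for SI inputs.

    latent_heat: heat of transition per unit mass (J/kg), positive for fusion
    temperature: transition temperature (K)
    v_final, v_initial: specific volumes of the two phases (m^3/kg)
    """
    return latent_heat / (temperature * (v_final - v_initial))


if __name__ == "__main__":
    # Assumed rounded values for water and ice near the melting point.
    slope = clausius_clapeyron_slope(
        latent_heat=3.34e5,     # J/kg
        temperature=273.15,     # K
        v_final=1.000e-3,       # liquid water, m^3/kg
        v_initial=1.091e-3,     # ice, m^3/kg
    )
    print(f"dp/dT = {slope:.3e} Pa/K")
```

With these inputs the slope is negative and of the order of a hundred atmospheres per degree, so a very large pressure is needed to lower the melting point noticeably.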

In the transition to the gaseous phase (vaporization, if the transition
is from a liquid, or sublimation, if the transition is from a solid body)
we have the inequality $v_2 \gg v_1$. Neglecting $v_1$ in equation (49.11)
and replacing $v_2$ by $\theta/p$, we obtain

$$\frac{d\ln p}{d\ln\theta} = \frac{q}{\theta}. \tag{49.12}$$


This derivative is always positive. Therefore, the water equilibrium
curves close to the triple point may be represented approximately as
shown in Fig. 55. The equilibrium curve between water and ice has a
negative derivative in accordance with what has been said.
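If the heat of transition $q$ is treated as independent of temperature (an additional assumption, not made in the text), the relation $d\ln p/d\ln\theta = q/\theta$ integrates to $p = p_0\exp[-q\,(1/\theta - 1/\theta_0)]$. The sketch below implements this and checks the logarithmic derivative by finite differences. Temperature and $q$ are measured in the same units, so $q$ may be given in kelvins as $q/k$; the sample figures for water are rough illustrative values and the function names are hypothetical.

```python
import math


def vapour_pressure(theta: float, theta0: float, p0: float, q: float) -> float:
    """Equilibrium vapour pressure for constant q.

    theta, theta0 and q must share the same units (for example kelvins,
    with q standing for q/k). p0 is the pressure at theta0.
    """
    return p0 * math.exp(-q * (1.0 / theta - 1.0 / theta0))


def logarithmic_slope(theta: float, theta0: float, p0: float, q: float,
                      rel_step: float = 1e-6) -> float:
    """Central-difference estimate of d ln p / d ln theta."""
    h = theta * rel_step
    up = math.log(vapour_pressure(theta + h, theta0, p0, q))
    down = math.log(vapour_pressure(theta - h, theta0, p0, q))
    return (up - down) / (math.log(theta + h) - math.log(theta - h))


if __name__ == "__main__":
    # Rough illustrative figures for water: q/k about 4900 K, boiling at
    # 373.15 K under 101325 Pa.
    q_over_k, t_boil, p_boil = 4900.0, 373.15, 101325.0
    for t in (293.15, 333.15, 373.15):
        print(f"T = {t:.2f} K: p ~ {vapour_pressure(t, t_boil, p_boil, q_over_k):.0f} Pa")
```

Because $q$ in fact varies with temperature, the numbers printed for temperatures far from the reference point are only estimates.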

Fig. 55

Van der Waals' equation. We shall now show how, from the equation of
state of a substance, it is possible to ascertain the necessity of a
phase transition. It is convenient for this to make use of the
well-known van der Waals equation of state for "real gases." This
equation cannot strictly be derived from the fundamentals of statistical
mechanics on any assumptions, and neither is it supported by accurate
quantitative experiments. Nevertheless, it is the simplest of the
equations suited for a qualitative description of a very wide range
of states, from an ideal gas to its condensation into a liquid. Let us
remind ourselves how the van der Waals equation is formed.

It is first of all assumed that a gas cannot be compressed
indefinitely, but only to a certain volume $b$, which is related to the
characteristic volume of all the molecules. This is taken into account
in the Clapeyron equation $pV = N\theta$ by putting $V - b$ instead of $V$
(in actual fact this has no strict foundation, even if the molecules
are regarded as solid spheres). Over large distances between
molecules the acting forces are those of attraction, which fall off
rapidly; in the absence of such forces, condensation into a liquid would
be impossible altogether. These forces reduce the pressure. The
reduction in pressure is inversely proportional to the square of the
volume occupied by the gas; this can be shown by the following reasoning.
The gas pressure on a wall is proportional to the density of its kinetic
energy. The kinetic energy of a molecule incident on a wall decreases
due to the attraction of this molecule to the other molecules occurring
in the volume. This attraction is due principally to the pairwise
interactions of the molecules, because the contribution of triple
interactions is slight in the case of small gas densities. The number of
interacting pairs is proportional to the square of the gas density, or
inversely proportional to the square of the volume it occupies. Since
the energy stems from the attractive forces, its density is negative,
and it results in a reduced pressure. The van der Waals equation is
finally written thus:

$$p = \frac{N\theta}{V - b} - \frac{a}{V^2}, \tag{49.13}$$


where the second term takes into account the attractive forces.
This term in the equation of state can also be rigorously substantiated
by the methods of statistical mechanics, but, of course, only at
densities which are still very much less than the density of the liquid
phase, where each molecule is in constant interaction with many
neighbouring molecules. Therefore, the exact equation of state for a
real liquid must be immeasurably more complex than the van der Waals
equation. It is doubtful whether it is possible to write a single exact
equation applicable to a wide class of liquids.

Van der Waals' equation and phase transition. We shall now show
how, from the van der Waals equation, the existence of a range of
states, in which the substance separates into gaseous and liquid
phases, can be demonstrated. Equation (49.13) is third degree in
volume. It must have three real roots for certain values of $\theta$
and $p$. In other words, the pressure-versus-volume curve at constant
temperature (isotherm) is of the form ABFD, as shown in Fig. 56. But
the derivative $\left(\frac{\partial p}{\partial V}\right)_\theta$ is
positive between the points B and F and, in accordance with the first
inequality of (48.17), the state of the substance is unstable if
$\left(\frac{\partial p}{\partial V}\right)_\theta > 0$. Hence, the
necessity for the separation of the substance into two phases in this
region.

The portion of the curve AB corresponds to the liquid state (small
volume). As the pressure is reduced the liquid expands to the point
K, after which change occurs along the straight line KL. The points
K and L are uniquely defined from the equality condition of the
chemical potentials (49.4), and the intermediate points along the
line correspond to a mixture of liquid in a state corresponding to
K, and vapour in a state L. We notice that the position of the point
K, at a temperature corresponding to the given isotherm, is defined
uniquely.


The portion KB is not absolutely unstable, since on it
$\left(\frac{\partial p}{\partial V}\right)_\theta < 0$.
The states of this portion can be attained without allowing the
formation of vapour bubbles in the liquid (a superheated liquid).
For this the liquid must be free from foreign agents, for example,
bubbles of dissolved gases, which favour vaporization. Sometimes
the portion KB lies partly below the abscissa axis, thus corresponding
to a negative pressure, i.e., an extension of the liquid. A liquid can
indeed be extended if it adheres everywhere to the walls of the vessel
and does not have a free surface. The portion FL corresponds to
a supercooled vapour, which can be obtained if condensation centres
are prevented from forming. Such condensation nuclei or centres
easily arise from ions, for example. This is the underlying principle
of the Wilson cloud-chamber for the observation of the tracks of
charged particles.

The critical point. At a sufficiently high temperature, the first
term on the right in the van der Waals equation predominates over
the second. The equation then becomes very similar to the Clapeyron
equation for a volume $V - b$. But this equation has only one real
root for each value of $p$. This corresponds to the well-known fact
that at high temperatures a substance does not split into two phases
at all.

Let us find the temperature at which separation into phases ceases.
On the corresponding isotherm A'CD' (Fig. 56) the points B and
F, where the derivative $\left(\frac{\partial p}{\partial V}\right)_\theta$
becomes zero, merge into one point C, and the region of unstable states
disappears. All three roots of equation (49.13) merge at the point C, so
that C corresponds to the triple root of this equation. But the
expansion of the function with respect to the difference $V - V_c$ must
begin with a third-order term if $V_c$ is a triple root. The linear and
quadratic terms in the expansion become zero if the first and second
pressure derivatives with respect to volume are equal to zero at the
point C. It is easy from this to determine the position of the point C
from the van der Waals equation.

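Let us write down the condition that the first and second
derivatives become zero:

$$\left(\frac{\partial p}{\partial V}\right)_\theta = -\frac{N\theta_c}{(V_c - b)^2} + \frac{2a}{V_c^3} = 0, \tag{49.14}$$

$$\left(\frac{\partial^2 p}{\partial V^2}\right)_\theta = \frac{2N\theta_c}{(V_c - b)^3} - \frac{6a}{V_c^4} = 0. \tag{49.15}$$

From this we obtain

$$\frac{V_c - b}{2} = \frac{V_c}{3}, \qquad V_c = 3b. \tag{49.16}$$

Then, from (49.14), we find

$$a = \frac{N\theta_c V_c^3}{2(V_c - b)^2} = \frac{27}{8}\,N\theta_c b,$$

so that

$$\theta_c = \frac{8a}{27Nb}. \tag{49.17}$$

The pressure at the point C is determined from the van der Waals
equation:

$$p_c = \frac{N\theta_c}{V_c - b} - \frac{a}{V_c^2} = \frac{a}{27b^2}. \tag{49.18}$$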
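The critical constants can be verified numerically against the van der Waals equation itself. The sketch below evaluates $V_c = 3b$, $\theta_c = 8a/27Nb$ and $p_c = a/27b^2$, then checks by finite differences that the first and second volume derivatives of the pressure vanish at that point. The function names and the unit values of $N$, $a$ and $b$ are illustrative assumptions.

```python
def vdw_pressure(volume: float, theta: float, n: float, a: float, b: float) -> float:
    """Van der Waals equation: p = N*theta/(V - b) - a/V**2."""
    return n * theta / (volume - b) - a / volume ** 2


def critical_point(n: float, a: float, b: float):
    """Critical volume, temperature and pressure of a van der Waals substance."""
    v_c = 3.0 * b
    theta_c = 8.0 * a / (27.0 * n * b)
    p_c = a / (27.0 * b * b)
    return v_c, theta_c, p_c


def first_derivative(f, x: float, h: float = 1e-3) -> float:
    """Central-difference first derivative."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def second_derivative(f, x: float, h: float = 1e-3) -> float:
    """Central-difference second derivative."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


if __name__ == "__main__":
    n, a, b = 1.0, 1.0, 1.0   # arbitrary units, for illustration only
    v_c, theta_c, p_c = critical_point(n, a, b)
    isotherm = lambda v: vdw_pressure(v, theta_c, n, a, b)
    print("p on the critical isotherm at V_c:", isotherm(v_c), "expected", p_c)
    print("dp/dV at C:", first_derivative(isotherm, v_c))
    print("d2p/dV2 at C:", second_derivative(isotherm, v_c))
    print("p_c V_c / (N theta_c):", p_c * v_c / (n * theta_c))
```

The last printed ratio is the same for any choice of $a$, $b$ and $N$, which is what makes a universal reduced form of the equation possible.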


If we represent the phase equilibrium curve in the $p$, $\theta$ plane,
then this curve will end in the point $p = p_c$, $\theta = \theta_c$. C is
called the critical point. Separation into phases does not occur at
temperatures $\theta > \theta_c$.

The critical point can exist only on the equilibrium curve between
two such phases, which have no feature that is incapable of varying
continuously. An example of such a feature is the regularity of crystal
structure: in principle, the position of an atom in an ideal crystal
defines the position of the whole crystal (with the exception, naturally,
of its orientation in space). Yet, the position of an atom in a liquid
affects only the position of its closest neighbours. And so for certain
substances a continuous transition between the solid crystalline phase
and the liquid phase is impossible. The curve dividing the crystalline
and liquid phases cannot end and cannot, therefore, have a critical
point.




The law of corresponding states. Eliminating the constants $a$, $b$
and $N$ with the aid of (49.16), (49.17), and (49.18), we have

$$b = \frac{V_c}{3}, \qquad a = 3V_c^2\,p_c, \qquad N = \frac{8}{3}\,\frac{p_c V_c}{\theta_c}. \tag{49.19}$$

The last of these three equations shows by how much the equation
of state of a substance differs, at the critical point, from that of an
ideal gas: $p_c V_c = \frac{3}{8}N\theta_c$. But we should note that, as a
general rule, this relationship is not really satisfied. As has already
been mentioned, the van der Waals equation is qualitative in character,
and so there is nothing surprising in the fact that for real substances
$p_c V_c \neq \frac{3}{8}N\theta_c$. If we now substitute (49.19) in
(49.13), we get

$$\frac{p}{p_c} = \frac{8\,\theta/\theta_c}{3V/V_c - 1} - 3\left(\frac{V_c}{V}\right)^2. \tag{49.20}$$

Formula (49.20) expresses a special form of the so-called law of
corresponding states: for two different substances, the ratios
$p/p_c$, $V/V_c$ and $\theta/\theta_c$ are related by a single universal
equation. It should be noted that in general form the law of
corresponding states, especially for substances of similar structure, is
satisfied better in practice than the specific formula (49.20) based on
the interpolation van der Waals equation, because this, more general,
law does not impose a definite functional form on the equation of state.
However, there are, of course, deviations from the law of corresponding
states also: the ratios $V/V_c$ are not strictly the same for two
substances having identical $p/p_c$ and $\theta/\theta_c$.

The properties of a substance close to the critical point. Let us now
investigate the properties of a substance close to the critical point in
general form, without assuming that the van der Waals equation
(49.13) holds. We shall only make use of the fact that the derivative
$\left(\frac{\partial p}{\partial V}\right)_\theta$, close to the critical
point, must tend to zero like the square of the difference $V - V_c$,
because the expansion of $p$ on the critical isotherm begins with a term
proportional to $(V - V_c)^3$. At temperatures sufficiently close to
critical, $\left(\frac{\partial p}{\partial V}\right)_\theta$ differs from
its value on the critical isotherm by first-order quantities in
$(\theta - \theta_c)$, because the relationship between pressure and
temperature close to the critical point does not exhibit any
peculiarities. Thus, the expansion in the critical region is of the form

$$\left(\frac{\partial p}{\partial V}\right)_\theta = -\lambda\,(V - V_c)^2 - \nu\,(\theta - \theta_c). \tag{49.21}$$

Here, $\nu > 0$, because the inequality
$\left(\frac{\partial p}{\partial V}\right)_\theta < 0$ must always be
satisfied at temperatures higher than critical. Therefore, $\lambda > 0$
also. At temperatures lower than critical,
$\left(\frac{\partial p}{\partial V}\right)_\theta$ becomes zero at two
points. They correspond to B and F in Fig. 56:

$$V_B - V_c = -\sqrt{\frac{\nu}{\lambda}\,(\theta_c - \theta)}, \qquad V_F - V_c = \sqrt{\frac{\nu}{\lambda}\,(\theta_c - \theta)}. \tag{49.22}$$


Let us now find the points on the isotherm that correspond to K
and L, i.e., to phase equilibrium.

For this we make use of the phase equilibrium condition
$\mu_K = \mu_L$. It is conveniently written in the form of an integral
taken along the isotherm on which the points K and L lie:

$$\int_K^L d\mu = 0. \tag{49.23}$$

Multiplying by $N$ and then replacing $d\Phi$ by $V\,dp$ for
$\theta = \text{const}$, we obtain

$$\int_K^L d\Phi = \int_K^L V\,dp = \int_K^L (V - V_c)\,dp,$$

because the integral $\int_K^L dp$ is equal to $p_L - p_K$ and becomes
zero according to the condition $p_L = p_K$. Now substituting the initial
expression (49.21) we see that the equality of chemical potentials
reduces to the requirement

$$\int_{V_K}^{V_L} (V - V_c)\left[\lambda\,(V - V_c)^2 + \nu\,(\theta - \theta_c)\right]dV = 0. \tag{49.24}$$

The integrand is odd in $V - V_c$. Therefore the integral becomes zero
if at the integration limits (i.e., $V_K - V_c$ and $V_L - V_c$) the
values of $V - V_c$ are equal and opposite in sign or, in other words,
the volumes $V_L$ and $V_K$ differ equally from critical.

We also represent the condition of pressure equality in integral
form:

$$\int_K^L dp = \int_{V_K}^{V_L} \left(\frac{\partial p}{\partial V}\right)_\theta dV = 0. \tag{49.25}$$

Substituting (49.21) here and integrating, we obtain

$$\frac{\lambda}{3}(V_L - V_c)^3 + \nu\,(V_L - V_c)(\theta - \theta_c) - \frac{\lambda}{3}(V_K - V_c)^3 - \nu\,(V_K - V_c)(\theta - \theta_c) = 0.$$

Making use of the fact that $V_K - V_c = -(V_L - V_c)$, we obtain the
required equation

$$\frac{\lambda}{3}(V_L - V_c)^3 + \nu\,(V_L - V_c)(\theta - \theta_c) = 0,$$

from which it follows that

$$V_L - V_c = V_c - V_K = \sqrt{\frac{3\nu}{\lambda}\,(\theta_c - \theta)}. \tag{49.26}$$


Thus, close to the critical point, the region of absolutely unstable
states is narrower, in the ratio $1 : \sqrt{3}$, than the whole region
where phase separation occurs.
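The ratio of the two widths can be checked on the van der Waals isotherm in the reduced form (49.20). The sketch below is a numerical illustration, not part of the original argument: it locates B and F from the zeros of the slope, finds the equilibrium pressure by requiring $\int_K^L (p - p_{\text{eq}})\,dV = 0$ (the equal-chemical-potential condition at equal pressures), and compares $(V_L - V_K)/(V_F - V_B)$ with $\sqrt{3}$. It uses plain bisection throughout and is restricted, by assumption, to reduced temperatures between 27/32 and 1, where the minimum at B lies at positive pressure. All function names are hypothetical.

```python
import math


def pressure(v: float, t: float) -> float:
    """Reduced van der Waals isotherm: p = 8t/(3v - 1) - 3/v**2."""
    return 8.0 * t / (3.0 * v - 1.0) - 3.0 / (v * v)


def slope(v: float, t: float) -> float:
    """Derivative of the reduced pressure with respect to reduced volume."""
    return -24.0 * t / (3.0 * v - 1.0) ** 2 + 6.0 / v ** 3


def primitive(v: float, t: float) -> float:
    """Antiderivative of pressure(v, t) with respect to v."""
    return (8.0 * t / 3.0) * math.log(3.0 * v - 1.0) + 3.0 / v


def bisect(f, lo: float, hi: float, steps: int = 100) -> float:
    """Root of f between lo and hi, assuming f changes sign there."""
    f_lo = f(lo)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def coexistence(t: float):
    """Return (p_eq, v_K, v_B, v_F, v_L) for a reduced temperature t."""
    if not 27.0 / 32.0 < t < 1.0:
        raise ValueError("this sketch assumes 27/32 < t < 1")
    v_min = 1.0 / 3.0 + 1e-9                      # just above the excluded volume
    v_b = bisect(lambda v: slope(v, t), v_min, 1.0)
    v_f = bisect(lambda v: slope(v, t), 1.0, 100.0)
    p_b, p_f = pressure(v_b, t), pressure(v_f, t)

    def volumes(p: float):
        v_k = bisect(lambda v: pressure(v, t) - p, v_min, v_b)
        hi = 2.0 * v_f
        while pressure(hi, t) > p:                # widen until the root is bracketed
            hi *= 2.0
        v_l = bisect(lambda v: pressure(v, t) - p, v_f, hi)
        return v_k, v_l

    def area(p: float) -> float:
        v_k, v_l = volumes(p)
        return primitive(v_l, t) - primitive(v_k, t) - p * (v_l - v_k)

    p_eq = bisect(area, p_b, p_f)
    v_k, v_l = volumes(p_eq)
    return p_eq, v_k, v_b, v_f, v_l


if __name__ == "__main__":
    for t in (0.9, 0.99, 0.999):
        p_eq, v_k, v_b, v_f, v_l = coexistence(t)
        ratio = (v_l - v_k) / (v_f - v_b)
        print(f"t = {t}: p_eq = {p_eq:.4f}, (V_L - V_K)/(V_F - V_B) = {ratio:.4f}")
    print("sqrt(3) =", math.sqrt(3.0))
```

The ratio approaches $\sqrt{3}$ only as the temperature tends to critical; further away the higher terms of the expansion, which (49.21) omits, become noticeable.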

Let us now find the heat of transition close to the critical point.
By definition we have

$$Q = \theta_c\,(S_L - S_K) = \theta_c\left(\frac{\partial S}{\partial V}\right)_\theta (V_L - V_K). \tag{49.27}$$

At the critical point, the derivative
$\left(\frac{\partial S}{\partial V}\right)_\theta$ maintains a finite
value [it is equal to $\left(\frac{\partial p}{\partial\theta}\right)_V$].
Therefore, by (49.26), the heat of transition is proportional to
$\sqrt{\theta_c - \theta}$. Right at the critical point it becomes zero,
as expected.

Close to the critical point, the density of the substance experiences
large statistical fluctuations, because density fluctuations are
inversely proportional to
$\left(\frac{\partial p}{\partial V}\right)_\theta$; as we have seen in
the previous section, this results in a strong scattering of light. As a
result of this scattering, the substance acquires a certain turbidity,
similar to the turbidity of opal (critical opalescence).

Phase transitions of the second kind. At the phase transition point
the thermodynamic potentials of both phases are equal. The other
additive quantities (such as entropy, energy, and volume) experience
discontinuities. But there also exist phase transitions for which not
the additive quantities themselves are discontinuous, but only their
derivatives: specific heat, compressibility, etc. An example of such
a transition was already given in Sec. 43; this is the transition of
helium at a temperature of 2.2° K.

The specific heat at the transition point changes discontinuously.
Another example is the transition of iron from the ferromagnetic to
the nonferromagnetic state at 770° C (the Curie point).

Phase transitions of the second kind are very frequently observed
in crystals. In this case they correspond to a certain change in the
translational or vibrational symmetry of the lattice. Since the form
of the symmetry cannot change continuously (the symmetry property
either exists or it does not), symmetry always changes
discontinuously. If an entropy discontinuity is then experienced, we have
a phase transition of the first kind; if the entropy is continuous, and
the derivatives experience discontinuities, the transition is of the
second kind.

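Let us interrelate the derivative discontinuities of various quantities
on the lines of phase transitions of the second kind. Since entropy and
volume are continuous, we write

$$\Delta S = S_2 - S_1 = 0; \qquad \Delta V = V_2 - V_1 = 0. \tag{49.28}$$

Let us differentiate these equations with respect to temperature along
the transition line. We then obtain

$$\Delta\left(\frac{\partial S}{\partial\theta}\right)_p + \frac{dp}{d\theta}\,\Delta\left(\frac{\partial S}{\partial p}\right)_\theta = 0, \tag{49.29}$$

$$\Delta\left(\frac{\partial V}{\partial\theta}\right)_p + \frac{dp}{d\theta}\,\Delta\left(\frac{\partial V}{\partial p}\right)_\theta = 0. \tag{49.30}$$

Here $\frac{dp}{d\theta}$ denotes the derivative of pressure with respect
to temperature along the transition curve. Further,
$\left(\frac{\partial S}{\partial p}\right)_\theta = -\left(\frac{\partial V}{\partial\theta}\right)_p$
[see (46.46)] and
$\left(\frac{\partial S}{\partial\theta}\right)_p = \frac{C_p}{\theta}$.
Whence, after eliminating
$\Delta\left(\frac{\partial V}{\partial\theta}\right)_p$, we obtain

$$\Delta C_p = -\theta\left(\frac{dp}{d\theta}\right)^2 \Delta\left(\frac{\partial V}{\partial p}\right)_\theta. \tag{49.31}$$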


Thus, along phase-transition lines of the second kind, the specific heat 
discontinuity at constant pressure is associated with a compressibility 
discontinuity. A similar expression can also easily be found for the 
discontinuity in specific heat at constant volume. 
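
As a rough numerical illustration of (49.31), the following Python sketch
evaluates the specific-heat jump from the slope of the transition line and
the jump in $(\partial V/\partial p)_\theta$. The function name and the sample
numbers are hypothetical and serve only to show the sign and scaling of the
relation; temperature is taken in energy units, as in the text.

```python
def delta_cp(theta, dp_dtheta, delta_dv_dp):
    """Jump in C_p along a second-kind transition line, following (49.31).

    theta        -- temperature in energy units
    dp_dtheta    -- slope dp/dtheta of the transition line
    delta_dv_dp  -- jump in (dV/dp) at constant theta between the two phases
    """
    return -theta * dp_dtheta ** 2 * delta_dv_dp


# Hypothetical inputs: the phase with the larger compressibility
# (more negative dV/dp) also has the larger C_p.
example_jump = delta_cp(theta=2.0, dp_dtheta=3.0, delta_dv_dp=-0.5)
print(example_jump)
```

Because $(dp/d\theta)^2$ is never negative, the sign of the specific-heat jump is
always opposite to the sign of the jump in $(\partial V/\partial p)_\theta$.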

Sometimes a phase-transition line of the first kind becomes a
phase-transition line of the second kind at some point. If the transition
is associated with a change in symmetry, then neither line can simply
terminate.

The thermodynamic theory of phase transitions of the second kind
has been developed by L. D. Landau (see L. D. Landau and E. M. Lifshits,
Statistical Physics, Gostekhizdat, 1951).


Exercises 


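1) Find the specific heat of one of the phases of a substance along the curve
of phase transitions.

From the definition of specific heat

$$c = \theta\frac{dS}{d\theta} = \theta\left[\left(\frac{\partial S}{\partial\theta}\right)_p + \frac{dp}{d\theta}\left(\frac{\partial S}{\partial p}\right)_\theta\right] = C_p - \theta\frac{dp}{d\theta}\left(\frac{\partial V}{\partial\theta}\right)_p = C_p - \frac{q}{V_2 - V_1}\left(\frac{\partial V}{\partial\theta}\right)_p,$$

where $dp/d\theta$ is taken along the transition curve and, in the last step,
it has been expressed through the heat of transition $q$ by means of the
Clausius-Clapeyron equation.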




2) Show that $C_p$ becomes infinite at the critical point.

Use the result of exercise 4, Sec. 46, and the condition defining the critical
point.

3) Find the discontinuity in specific heat at constant volume (expressed in
terms of the compressibility discontinuity) along a phase-transition line of
the second kind.


Sec. 50. Weak Solutions 

Weak solutions and ideal gases. Weak solutions exhibit many regularities
which make them similar to ideal gases. The reason for this
similarity can be easily seen in the fact that the molecules of a dissolved
substance in a weak solution interact just as little as ideal gas
molecules. But the molecules of a dissolved substance interact strongly
with the surrounding molecules of the solvent, whence the differences
between a solution and a gas.

The thermodynamic potential of a weak solution. We shall proceed
from the general expression for free energy in classical statistics:

$$F = -\theta\ln{\int}' e^{-\mathcal{E}/\theta}\,d\Gamma \tag{50.1}$$

[we have omitted the unessential factor containing $2\pi h$]. The integral is taken
over all physically different states of the system. If one takes into
account the identity of all the $N$ molecules of the solvent and the $n$ molecules
of the dissolved substance, one can extend the statistical integral
over the entire phase space and divide it by the total number of permutations
of all identical particles. The number of such permutations
is $N!\,n!$.

We now write down the thermodynamic potential of a weak solution

$$\Phi = F + pV = \left(-\theta\ln\int e^{-\mathcal{E}/\theta}\,d\Gamma + \theta\ln N! + pV\right) + \theta\ln n!, \tag{50.2}$$

where the integral extends over the whole phase space of the system.
Let us expand the expression in brackets, on the right-hand side of
(50.2), in powers of the small quantity $n$, taking into account that
the zeroth term in the expansion is the thermodynamic potential
of a pure solvent $\Phi_0$. In addition, we replace $\ln n!$ by $n\ln\frac{n}{e}$ according
to Stirling's formula:

$$\Phi = \Phi_0 + nB(p,\theta,N) + n\theta\ln\frac{n}{e}. \tag{50.3}$$

We can refine the dependence of $B(p, \theta, N)$ on the number of solvent
particles $N$ by noting that the thermodynamic potential must be an
additive function of $N$ and $n$. In other words, if $N$ and $n$ increase a
certain number of times, for example, twice, $\Phi$ must also increase
by a factor of two. But in (50.3), this requirement is satisfied directly
only by the potential of the pure solvent $\Phi_0$, equal to $N\mu_0$, where
$\mu_0$ is the chemical potential of the pure solvent. For the second and
third terms to be additive, let us first write the third term in the form

$$\theta n\ln\frac{n}{e} = \theta n\ln\frac{n}{eN} + \theta n\ln N.$$

After this, the thermodynamic potential will look like

$$\Phi = N\mu_0(p,\theta) + n\theta\ln\frac{n}{eN} + n\,(B + \theta\ln N).$$

In order to obtain an additive expression, we must demand that
the function $B + \theta\ln N$ should not be dependent on $N$ at all; we
denote it by $\chi(p,\theta)$. The result is a general expression for the
thermodynamic potential of a weak solution:

$$\Phi = N\mu_0(p,\theta) + n\theta\ln\frac{n}{eN} + n\chi(p,\theta). \tag{50.4}$$


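The chemical potential of a solvent in solution is equal to

$$\mu = \frac{\partial\Phi}{\partial N} = \mu_0 - \frac{n\theta}{N}, \tag{50.5}$$

while the chemical potential of the solute is

$$\mu' = \frac{\partial\Phi}{\partial n} = \theta\ln\frac{n}{N} + \chi(p,\theta). \tag{50.6}$$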
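
The two chemical potentials can be checked numerically. The sketch below
is an illustration only: it takes the weak-solution potential in the form
$\Phi = N\mu_0 + n\theta\ln\frac{n}{eN} + n\chi$, treats $\mu_0$ and $\chi$ as constants
at fixed $p$ and $\theta$, and compares central finite differences of $\Phi$ with
the closed-form derivatives. All numerical values and helper names are
hypothetical.

```python
import math


def phi(N, n, theta, mu0, chi):
    """Weak-solution potential: N*mu0 + n*theta*ln(n/(e*N)) + n*chi."""
    return N * mu0 + n * theta * math.log(n / (math.e * N)) + n * chi


def central_diff(f, x, h=1e-3):
    """Second-order central finite difference of f at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


# Hypothetical values (theta in energy units, n << N).
N, n, theta, mu0, chi = 1000.0, 10.0, 1.5, -2.0, 0.3

mu_solvent_numeric = central_diff(lambda N_: phi(N_, n, theta, mu0, chi), N)
mu_solute_numeric = central_diff(lambda n_: phi(N, n_, theta, mu0, chi), n)

mu_solvent_formula = mu0 - n * theta / N            # solvent: mu0 - n*theta/N
mu_solute_formula = theta * math.log(n / N) + chi   # solute: theta*ln(n/N) + chi

print(mu_solvent_numeric, mu_solvent_formula)
print(mu_solute_numeric, mu_solute_formula)
```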


Osmotic pressure. Certain semipermeable membranes pass solvent
molecules freely, but do not pass molecules of the dissolved substance.
The solvent must be in statistical equilibrium on both sides of such a
membrane. But this is possible only when the chemical potentials of
the pure solvent and the solvent in the solution beyond the membrane
are equal. The temperature of the substance on both sides of the membrane
is, of course, the same, for otherwise equilibrium could not set
in. Only the pressure can differ, provided the pressure difference is
held in check by the membrane. Denoting the pressure difference by
$\Delta p$, we obtain the equilibrium condition:

$$\mu_0(p,\theta) = \mu(p+\Delta p,\theta) = \mu_0(p+\Delta p,\theta) - \frac{n\theta}{N}. \tag{50.7}$$

Let us expand $\mu_0$ in a series in powers of $\Delta p$, to the linear approximation.
This expansion is justified, since the pressure difference for a
liquid $\Delta p$ is a small quantity. Therefore,

$$\mu_0(p+\Delta p,\theta) = \mu_0(p,\theta) + \frac{\partial\mu_0}{\partial p}\,\Delta p. \tag{50.8}$$




But the derivative $\partial\mu_0/\partial p$ is equal to the volume of a single molecule of
pure solvent:

$$\frac{\partial\mu_0}{\partial p} = \frac{V}{N}.$$

Whence we obtain the equation

$$V\,\Delta p = n\theta. \tag{50.9}$$

The excess pressure $\Delta p$ in the solution is called the osmotic pressure.
Equation (50.9) bears a striking resemblance to the Clapeyron equation
for ideal gases. It was originally found experimentally and served
as the basis for the formulation of a thermodynamic theory of solutions.
We obtained equation (50.9) by proceeding from the general
principles of statistics.
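
To get a feeling for the magnitudes involved, equation (50.9) can be
evaluated in ordinary units, with $\theta = k_B T$. The sketch below is only an
illustration: the function name and the sample solution (0.1 mole of an
undissociated solute in one litre at 298.15 K) are assumptions, not data
from the text.

```python
K_B = 1.380649e-23     # Boltzmann constant, J/K
N_A = 6.02214076e23    # Avogadro number, 1/mol


def osmotic_pressure(n_solute, volume_m3, temperature_k):
    """Osmotic pressure in Pa from V * dp = n * theta, with theta = k_B * T."""
    theta = K_B * temperature_k
    return n_solute * theta / volume_m3


# Hypothetical example: 0.1 mol of solute molecules in 1 litre of solution.
dp = osmotic_pressure(0.1 * N_A, 1.0e-3, 298.15)
print(dp)  # about 2.5e5 Pa, i.e. roughly 2.4 atm
```

Even a weak solution thus produces an osmotic pressure of a few atmospheres,
exactly as an ideal gas of the same particle density would.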

Phase equilibrium of a solvent (Raoult's laws). We shall now consider
another case, when equilibrium is also established between solvent
molecules. Let the solution occur in equilibrium with another phase
of the solvent, while the solute does not pass into this phase. We find
the displacement of the phase equilibrium curve in the $p, \theta$ plane.

Let us call the chemical potential of the phase into which the solute
does not pass $\mu_1$. Then the phase-equilibrium condition of the pure
solvent is determined by the equation

$$\mu_1(p,\theta) = \mu_0(p,\theta), \tag{50.10}$$

while the equilibrium of the other phase of the solvent and solution is
displaced and is given by the following condition:

$$\mu_1(p+\Delta p,\theta+\Delta\theta) = \mu_0(p+\Delta p,\theta+\Delta\theta) - \frac{n\theta}{N}. \tag{50.11}$$

Let us expand the chemical potentials in a series in $\Delta p$ and $\Delta\theta$:

$$\mu_1(p+\Delta p,\theta+\Delta\theta) - \mu_0(p+\Delta p,\theta+\Delta\theta) = \mu_1(p,\theta) - \mu_0(p,\theta) + \left[\frac{\partial}{\partial p}(\mu_1-\mu_0)\right]\Delta p + \left[\frac{\partial}{\partial\theta}(\mu_1-\mu_0)\right]\Delta\theta = (v_1 - v_0)\,\Delta p - (s_1 - s_0)\,\Delta\theta, \tag{50.12}$$

where $v$ and $s$ denote the volume and the entropy per molecule in the
corresponding phase.

We shall now assume that the pressure in the system is the same as
above a pure solvent, i.e., that $\Delta p = 0$. Then the equilibrium-temperature
displacement $\Delta\theta$ will be defined:

$$\Delta\theta = \frac{n\theta}{N(s_1 - s_0)} = \frac{n\theta^2}{Q}, \tag{50.13}$$

where $Q = N\theta(s_1 - s_0)$ is the heat of the phase transition of the pure
solvent. For vapourization $Q > 0$; therefore, $\Delta\theta > 0$ if the solute does
not pass into vapour, so that the equilibrium temperature is raised.
Indeed, the solution has a higher boiling point than the pure solvent.




Let us now suppose that the solute does not pass into the solid phase
of the solvent. Then $Q$ is the heat of solidification, $Q < 0$. It is seen
from this that the fusion temperature of the solution is lower than that
of the pure solvent. The use of cooling mixtures is based on this property
of solutions.

Let us now consider equilibrium at a given temperature, $\Delta\theta = 0$.
Then the reduction in equilibrium pressure over the solution is determined
from (50.12):

$$\Delta p = -\frac{n\theta}{N(v_1 - v_0)}. \tag{50.14}$$

If a solution is in equilibrium with vapour, then $v_1 \gg v_0$. The product
$Nv_1$ is the volume of the entire solvent in the vapour state. If it were
possible to transform the solute to vapour together with the solvent,
the partial pressure of the molecules of the solute would equal the
reduction in the equilibrium pressure above the solution. The relative
pressure reduction $-\Delta p/p$ is equal to the concentration of the solution
$n/N$.
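
Both of Raoult's laws are easy to evaluate in molar units. The sketch
below assumes that the boiling-point shift (50.13) may be rewritten as
$\Delta T = (n/N)\,R\,T^2/L$, with $L$ the molar heat of vapourization, and that the
relative lowering of the vapour pressure is simply $n/N$. The function
names and the sample data for water (boiling point 373.15 K, heat of
vapourization about 40.66 kJ/mole) are illustrative assumptions.

```python
R = 8.314462618  # gas constant, J/(mol K)


def boiling_point_elevation(x, t_boil_k, heat_vap_j_per_mol):
    """Shift of the boiling point for solute-to-solvent ratio x = n/N."""
    return x * R * t_boil_k ** 2 / heat_vap_j_per_mol


def relative_vapour_pressure_lowering(x):
    """Relative reduction -dp/p of the vapour pressure above a weak solution."""
    return x


# Hypothetical example: 1 mole of nonvolatile solute per kilogram of water
# (one kilogram of water contains about 55.51 moles).
x = 1.0 / 55.51
dT = boiling_point_elevation(x, 373.15, 40.66e3)
print(dT)                                    # roughly half a kelvin
print(relative_vapour_pressure_lowering(x))  # about 1.8 per cent
```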

Solute equilibrium. A solution is termed saturated if it is in equilibrium
with the dissolved substance. The equilibrium condition consists
in that the chemical potential of a pure solute, $\mu'_0$, is equal to its
chemical potential in the dissolved state:

$$\mu'_0 = \mu' = \theta\ln\frac{n_0}{N} + \chi(p,\theta), \tag{50.15}$$

where $n_0$ is the number of solute molecules in the saturated solution.
We have supposed that the saturated solution is also still regarded as
weak, i.e., $n_0 \ll N$.

If a pure substance occurs in a gaseous state, then its chemical potential
depends upon pressure according to the law [see (47.17)]

$$\mu'_0 = \theta\ln p + f_1(\theta). \tag{50.16}$$

The function $\chi(p, \theta)$ is but slightly dependent on the external pressure:
$\chi(p, \theta)$ is determined by the properties of the condensed phase, which
do not change when the external pressure varies over several atmospheres.
Comparing (50.15) and (50.16) and taking antilogarithms, we
find that the equilibrium concentration of the dissolved gas is proportional
to its pressure above the liquid (Henry's law)

$$\frac{n_0}{N} = a(\theta)\,p. \tag{50.17}$$

The coefficient of $p$ depends very weakly on pressure.

Heat of solution. The heat of solution is equal to the difference in
the heat functions of the substances comprising a solution before and
after being dissolved. The heat function is related to the thermodynamic
potential in the following way:

$$I = \Phi - \theta\frac{\partial\Phi}{\partial\theta} = -\theta^2\frac{\partial}{\partial\theta}\frac{\Phi}{\theta}. \tag{50.18}$$

Therefore, the heat of solution is equal to

$$Q = -\theta^2\frac{\partial}{\partial\theta}\,\frac{1}{\theta}\left(N\mu_0 + n\theta\ln\frac{n}{eN} + n\chi - n\mu'_0 - N\mu_0\right), \tag{50.19}$$

where $\mu'_0$ is the chemical potential of the pure dissolved substance. The
quantities appearing here may be expressed in terms of the concentration
of the saturated solution $n_0/N$ with the aid of the saturation
condition (50.15). This yields

$$Q = -n\theta^2\frac{\partial}{\partial\theta}\ln\frac{N}{n_0}. \tag{50.20}$$

The heat of solution for one molecule is equal to $Q/n = q$, or

$$q = -\theta^2\frac{\partial}{\partial\theta}\ln\frac{N}{n_0} = \frac{\theta^2}{n_0}\frac{\partial n_0}{\partial\theta}. \tag{50.21}$$

Thus, if the concentration of a saturated solution increases with temperature,
then heat is absorbed in dissolution.
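
In practice (50.21) is often used in reverse, to estimate the heat of
solution from the measured temperature dependence of the saturation
concentration. The sketch below assumes that $q$ is constant between two
temperatures, so that (50.21) integrates to
$\ln(x_2/x_1) = q\,(1/\theta_1 - 1/\theta_2)$ with $x = n_0/N$. The function name and
the synthetic data are hypothetical.

```python
import math


def heat_of_solution_per_molecule(theta1, x1, theta2, x2):
    """Estimate q from two saturation concentrations x = n0/N.

    Assumes q is constant on [theta1, theta2], so that
    ln(x2/x1) = q * (1/theta1 - 1/theta2). Temperatures are in energy units.
    """
    return math.log(x2 / x1) / (1.0 / theta1 - 1.0 / theta2)


# Synthetic data generated from x(theta) = exp(-q/theta) with q = 0.2 ev.
q_true = 0.2
theta_a, theta_b = 0.025, 0.030
x_a = math.exp(-q_true / theta_a)
x_b = math.exp(-q_true / theta_b)

q_estimate = heat_of_solution_per_molecule(theta_a, x_a, theta_b, x_b)
print(q_estimate)  # recovers the value used to generate the data
```

A solubility that grows with temperature gives $q > 0$ (heat is absorbed),
while a solubility that falls with temperature gives $q < 0$.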

The Le Chatelier-Braun principle. Let us suppose that heat is supplied
to a saturated solution in equilibrium with the solute. Then, if
$\partial n_0/\partial\theta > 0$, part of the substance will further dissolve, and
the heat is spent not only in raising the temperature but also in
dissolving. But if $\partial n_0/\partial\theta < 0$, then some of the substance comes
out of solution, on which, in accordance with (50.21), heat is also
expended. In both cases, changes occur in the equilibrium system that
counteract the external action (raising of the temperature). The foregoing
example illustrates a general rule, known in thermodynamics as the
Le Chatelier-Braun principle. The Clausius-Clapeyron equation can be
examined on the basis of this principle.

The phase rule. Let us suppose that there are $k$ substances (components)
distributed in the form of solutions of arbitrary concentration
over $f$ phases. How many parameters define the equilibrium state of
such a system?

The chemical potentials of the substances depend upon temperature,
pressure, and relative concentrations. The concentrations of all the
substances in any phase satisfy the equations

$$\sum_{i=1}^{k} c_i^{\,j} = 1, \tag{50.22}$$

since, by definition, the concentrations are equal to

$$c_i^{\,j} = \frac{N_i^{\,j}}{\sum_{m=1}^{k} N_m^{\,j}}.$$

The equilibrium condition consists in the equality of the chemical
potentials of each of the $k$ substances over all $f$ phases:

$$\begin{aligned}
\mu_1^1(p,\theta,c_1^1,c_2^1,\ldots,c_k^1) &= \mu_1^2(p,\theta,c_1^2,c_2^2,\ldots,c_k^2) = \ldots = \mu_1^f(p,\theta,c_1^f,c_2^f,\ldots,c_k^f),\\
&\;\;\vdots\\
\mu_k^1(p,\theta,c_1^1,c_2^1,\ldots,c_k^1) &= \ldots = \mu_k^f(p,\theta,c_1^f,c_2^f,\ldots,c_k^f).
\end{aligned} \tag{50.23}$$

Here the superscript always denotes phase, while the subscript denotes
the substance.

Equation (50.23) involves $k$ concentrations in $f$ phases, and two other
variables (temperature and pressure), so that there are $kf + 2$ variables
in all.

There are $f - 1$ equations (50.23) for each substance and, in addition,
the concentrations satisfy $f$ equations (50.22), so that in all there are
$k(f - 1) + f$ equations for determining $kf + 2$ variables. The number
of independent variables which may vary arbitrarily is equal to the
difference between the number of variables and the number of equations,
i.e.,

$$r = kf + 2 - k(f - 1) - f = k - f + 2. \tag{50.24}$$

The quantity $r$ is called the number of thermodynamic degrees of
freedom of the system. (50.24) expresses the Gibbs phase rule: the
number of degrees of freedom is equal to the number of components,
minus the number of phases, plus two.

For example, if a single substance occurs in equilibrium in two
phases, then $r = 1$; in such a system, one may change arbitrarily a single
variable: temperature or pressure. In a two-component, two-phase
system, there are two degrees of freedom: the component concentration
in one of the phases can be varied together with the temperature
or pressure.
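
The counting argument behind (50.24) can be restated as a few lines of
Python. This is only a sketch; the function name is hypothetical, and
the check simply repeats the tally of variables and equations given in
the text.

```python
def degrees_of_freedom(k, f):
    """Gibbs phase rule: variables minus equations for k components, f phases."""
    if k < 1 or f < 1:
        raise ValueError("need at least one component and one phase")
    variables = k * f + 2          # k concentrations per phase, plus p and theta
    equations = k * (f - 1) + f    # chemical-potential equalities plus normalizations
    r = variables - equations
    if r < 0:
        raise ValueError("too many phases for this number of components")
    return r


print(degrees_of_freedom(1, 2))  # one substance, two phases
print(degrees_of_freedom(2, 2))  # two components, two phases
print(degrees_of_freedom(1, 3))  # one substance, three phases (triple point)
```

For a single substance the result vanishes at three coexisting phases,
so that such an equilibrium is possible only at an isolated point of the
$p, \theta$ plane.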

Strong electrolytes. The thermodynamic properties of solutions of
strong electrolytes exhibit noticeable deviations from the laws obtained
in this section for the solutions of neutral substances. It is natural
to look for the cause of these deviations in the fact that ions interact
electrostatically. This type of interaction was not taken into account
at all in the theory of weak solutions.

Aqueous electrolyte solutions exhibit very strong dissociation, the
cause of which can be qualitatively understood if we take into account
that the static dielectric constant of water is equal to 81. The potential
energy for the atomic interactions of a heteropolar molecule in a solution
is less than in vacuum by a factor of roughly $\varepsilon$, so that in water the
atoms of such a molecule are more weakly bound by a factor of 81.
Thermal motion in the solution disrupts the bonds, and instead of
molecules we have ions in the solution. The interaction forces between
ions are Coulomb forces; they fall off with distance much more slowly
than the interaction forces between neutral molecules. Let us determine
the correction to the chemical potential for a weak solution
that is due to the interaction forces between ions.

The ionic atmosphere. For simplicity, we shall consider an electrolyte
which contains only singly charged ions of both signs, for example,
H$^+$ and Cl$^-$ in a weak aqueous solution. Both positive and negative
ions can occur close to a positive ion. The density of positive ions near
an ion of the same sign is reduced, compared with the mean density,
by the Boltzmann factor $\exp(-e\varphi/\theta)$, where $\varphi$ is the potential produced by
the charge distribution at a given point in the vicinity of the positive
ion. The negative-charge density at the same point is increased by
the Boltzmann factor $\exp(e\varphi/\theta)$. If $\rho_0$ is the mean ion density, then the
charge density at the point of the solution considered is

$$e\rho_0\exp\left(-\frac{e\varphi}{\theta}\right) - e\rho_0\exp\left(\frac{e\varphi}{\theta}\right) = -2e\rho_0\sinh\frac{e\varphi}{\theta}.$$

From equation (14.6), this quantity multiplied by $-\frac{4\pi}{\varepsilon}$ is equal to
the Laplacian of potential [we must introduce $\frac{1}{\varepsilon}$ into equation (14.6)
in order to take into account the reduction of the electric field in the
medium]. This equation of electrostatics, written down straightway
in spherical coordinates, is of the form

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d\varphi}{dr}\right) = \frac{8\pi}{\varepsilon}\,e\rho_0\sinh\frac{e\varphi}{\theta}. \tag{50.25}$$


We shall consider the solution to this equation for points in space
for which the inequality $e\varphi \ll \theta$ is satisfied. It is only this region which
is of practical interest. If the indicated inequality is satisfied, then
$\sinh\frac{e\varphi}{\theta}$ is replaced by its argument, and equation (50.25) becomes
linear:

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d\varphi}{dr}\right) = \frac{8\pi\rho_0e^2}{\varepsilon\theta}\,\varphi. \tag{50.26}$$

We shall look for a solution to (50.26) in the form of the usual Coulomb
potential corrected by the screening factor $\xi(r)$:

$$\varphi = \frac{e}{\varepsilon r}\,\xi(r). \tag{50.27}$$

Then, substituting into (50.26) leads to an equation for $\xi$:

$$\frac{d^2\xi}{dr^2} = \frac{8\pi\rho_0e^2}{\varepsilon\theta}\,\xi. \tag{50.28}$$

Putting

$$\kappa = \sqrt{\frac{8\pi\rho_0e^2}{\varepsilon\theta}}, \tag{50.29}$$

we obtain a solution of (50.28) in the form

$$\xi(r) = \exp(-\kappa r). \tag{50.30}$$

Of the two solutions of (50.28), only the one with a minus sign in the
exponent is retained. The constant of integration is put equal to unity,
since screening should not have any effect at small distances from the
ion; close to the ion, $\varphi(r)$ becomes the usual Coulomb potential.

From (50.27) and (50.30) we have

$$\varphi = \frac{e}{\varepsilon r}\exp(-\kappa r). \tag{50.31}$$

The screening factor decreases by a factor of e (the base of natural
logarithms) over a distance $1/\kappa$ from the ion.
$1/\kappa$ is termed the radius of the ion cloud surrounding the given ion. In
actual fact, there is no "cloud," but simply an increased charge
density of opposite sign and a reduced charge density of the same
sign, which leads to the screening of the given ion field at a distance $1/\kappa$.
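
A quick estimate shows how small the radius of the ion cloud is. The
sketch below evaluates (50.29) in Gaussian units for a hypothetical
0.1 mole per litre solution of singly charged ions in water at 298.15 K,
with the dielectric constant 81 quoted in the text. The function name,
the chosen concentration, and the temperature are assumptions made for
illustration.

```python
import math

E_CHARGE = 4.80320e-10   # elementary charge, esu
K_B = 1.380649e-16       # Boltzmann constant, erg/K
N_A = 6.02214076e23      # Avogadro number, 1/mol


def debye_kappa(rho0_cm3, epsilon, temperature_k):
    """Inverse radius of the ion cloud, kappa of (50.29), in 1/cm.

    rho0_cm3 is the mean density of ions of ONE sign, in 1/cm^3.
    """
    theta = K_B * temperature_k
    return math.sqrt(8.0 * math.pi * rho0_cm3 * E_CHARGE ** 2 / (epsilon * theta))


# Hypothetical example: 0.1 mol/litre of each ion species.
rho0 = 0.1 * N_A / 1000.0
radius_cm = 1.0 / debye_kappa(rho0, 81.0, 298.15)
print(radius_cm)  # about 1e-7 cm, i.e. a few interatomic distances
```

Since $\kappa$ grows as the square root of the ion density, diluting the solution
a hundredfold enlarges the radius of the cloud only tenfold.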

Thermodynamic quantities for strong electrolytes. The potential
produced by the ion cloud at a certain point is equal to the difference
between the potential $\varphi(r)$ and the potential of the free charge not
surrounded by ions. At the point where the original positive ion is
situated, this additional potential is

$$\delta\varphi = \lim_{r\to0}\frac{e}{\varepsilon r}\left[\exp(-\kappa r) - 1\right] = -\frac{e\kappa}{\varepsilon}.$$

It is of finite magnitude. It was for this reason that the solution
(50.31) could be extrapolated to such small distances away from
the ion that the linear approximation (50.26) does not, strictly speaking,
hold. The additional free energy of the ion can be calculated
as the work produced in taking the charge $e$ to a point of potential
$\delta\varphi$. It is necessary, here, to take into account that the potential
$\delta\varphi$ is produced by the charge $e$ itself. Therefore, the work done in
introducing the charge is not equal to $e\,\delta\varphi$, but to the integral
$\int_0^e \delta\varphi\,de$. Here, the charge is the external parameter $\lambda$, while the
potential $\delta\varphi$ is the generalized force $\Lambda$ (see Sec. 46):

$$\delta F = \delta A = \int_0^e \delta\varphi\,de = -\frac{e^2\kappa}{3\varepsilon}, \tag{50.32}$$

where the fact that $\kappa \propto e$ was taken into account in the integration.
From this it is easy to obtain the addition to the thermodynamic
potential for the ion, which is associated with the correction
to the free energy:


8y., ^8A- . (50.33) 


3 t is inversely proportional to the square root of the volume, so 
that “ — Y ■ • Since S (i refers to a single ion, the addition 

per ion pair (comprising a single molecule) is the required correction 
to the chemical potential of the solute 

S(x' = 2.1 =-l/-5- W ^ - (50-34) 


where p is the solvent density ( 


/ molecules \ 


- ■ 


To the thermodynamic potential of the whole solution we add the 
quantity 


p e* 
0 


(50.36) 


while for the chemical potential of the solvent of a strong electrolyte
we obtain the correction


_^S<D _ J_'|/8n n=* pe« 
d'N ~ 3 F 0 ■ 


(50.36) 


This theory of strong electrolytes was formulated by Debye and
Hückel.

Implications of the Debye-Hückel theory. If we introduce $\delta\mu$
into equation (50.7) for the osmotic pressure, then instead of (50.9)
we get

Ap.F = 2n6(l-|l/^^-‘>;-] (50.37) 

(the factor 2 takes the complete dissociation into account). The
correction factor inside the brackets tends to unity as the solution
concentration $n/N$ decreases, but its derivative with respect to
concentration tends to infinity.

The additional term involved in equation (50.15), which determines
the concentration of a saturated solution, is proportional to $\sqrt{n}$.
It also contributes infinite additions to the derivatives as $n$ tends to
zero.


Sec. 51. Chemical Equilibria 

Reversible and irreversible reactions. Like all processes whose
velocities do not coincide with the rate of change of the external
parameters of a statistical system, chemical reactions, which proceed
with finite velocity, are irreversible. For example, when an explosive
mixture (hydrogen and oxygen) burns, water vapour is produced
irreversibly.

If a certain quantity of the oxyhydrogen mixture is prepared in
a closed vessel, the state of the mixture will be thermodynamically
unstable with respect to the reaction. True, the reaction by no means
proceeds directly according to the "gross equation" 2H$_2$ + O$_2$ = 2H$_2$O.
To do so, the molecules would have to overcome very
high potential barriers. In actual fact, the reaction must proceed
through stages involving the intermediate unstable substances OH,
H, O with nonsaturated valences; these are the so-called active
centres.

The initial formation of active centres is very difficult, so at room
temperature an oxyhydrogen mixture may be preserved indefinitely.
But if active centres are somehow produced (by a powerful electric
spark, for example) then they are renewed and multiplied in the
course of the reaction (a chain reaction).* When the multiplication
of active centres is fast enough, the reaction proceeds explosively.

But chemical reactions never proceed to the end. If an explosion
is produced in a sufficiently strong vessel (bomb), then the final
equilibrium state will contain hydrogen, oxygen, and water vapour
in certain (strictly definite) concentrations that depend upon the
temperature and pressure and the initial composition of the mixture.
This final equilibrium state is termed chemical equilibrium.

When the state in an equilibrium system is changed slowly, the
equilibrium will shift in one direction or other, i.e., the quantity
of initial or final products may increase. But these chemical reactions
proceed with the same velocity as that with which the external
conditions change. Hence, such reactions are reversible, as are all
processes whose velocity is not established spontaneously but is all
the time equal to the velocity of change of the quantities that
determine the equilibrium state of the system.

Chemical equilibrium. The state of chemical equilibrium can be
found with the aid of the thermodynamic functions of the substances
involved in the "gross equation" reaction, quite independently of
the mechanism by which the reaction proceeds. This is why the
theory of chemical equilibria had already been formulated in the
nineteenth century, while the study of the velocity of chemical
reactions is still developing vigorously at the present time. In this
sense, the situation is similar to that in statistics in general, i.e.,
in the science of equilibria, and in kinetics, the science of the
velocities of macroscopic processes.


* The majority of chain reactions are associated with an active centre. This
was established by N. N. Semyonov, the discoverer of chain reactions, and his
pupils (and independently by C. N. Hinshelwood).




For a given temperature and pressure, chemical equilibrium is
attained only when the thermodynamic potential in the reacting
mixture has a minimum

$$d\Phi = 0. \tag{51.1}$$

When $p$ = const and $\theta$ = const, the minimum condition, $d\Phi = 0$,
appears thus:

$$d\Phi = \sum_i \mu_i\,dN_i = 0. \tag{51.2}$$

Here, $\mu_i$ is the chemical potential of the $i$th substance appearing
in the "gross equation" reaction. For an oxyhydrogen mixture,
for example, the only substances of this kind are hydrogen, oxygen,
and water vapour. But the numbers $dN_i$ are not arbitrary: they
change as the reaction proceeds, and are therefore interrelated by
the reaction equation. In other words, $N_i$ may vary only in equivalent
(stoichiometric) quantities. For example, if we take the reaction

2CO + O$_2$ = 2CO$_2$,

then $dN_{\mathrm{CO}} : dN_{\mathrm{O_2}} : dN_{\mathrm{CO_2}} = -2 : -1 : 2$. In the reaction of the
thermal dissociation of hydrogen

H$_2$ = 2H

$dN_{\mathrm{H_2}} : dN_{\mathrm{H}} = -1 : 2$. In general, the number $dN_i$ is proportional
to the equivalent of the given substance in the reaction, $\nu_i$;
equation (51.2) can also be rewritten thus:

$$\sum_i \mu_i\nu_i = 0. \tag{51.3}$$

This equation expresses the condition of chemical equilibrium in
a system.

Law of mass action. Equation (51.3) is especially useful when
we have an explicit expression for the chemical potential of the
reacting substances, as in a weak solution or in an ideal gas, for
example. In the latter case, the equilibrium concentrations of the
substances may be determined if there are sufficient data about
the structure of all the molecules in equilibrium.

The chemical potential for a gas in a mixture of ideal gases is
[see (47.17)]

$$\mu_i = -\theta\ln\frac{\theta f_i(\theta)}{p_i}, \tag{51.4}$$

where $f_i$ is a statistical sum taken over all momentum values of the
molecule as a whole, and also over all its rotational, vibrational,
and electronic states. The latter are essential only when they occur
close to the ground state of the molecule and are far from its dissociation
limit. If they occur close to the dissociation limit, the molecule
decomposes before such highly excited states can in any way
affect the values of the statistical sums (see exercise 2).

Substituting the expression for chemical potential into the chemical
equilibrium condition (51.3), and cancelling the common factor $\theta$, we obtain

$$\sum_i \nu_i\ln p_i = \sum_i \nu_i\ln(\theta f_i).$$

Taking antilogarithms of this equation, we obtain the equilibrium
condition expressed in terms of partial pressures:

$$\prod_i p_i^{\nu_i} = \prod_i(\theta f_i)^{\nu_i} \equiv K. \tag{51.5}$$

This equation can also be written in terms of the relative concentrations
of the substances by replacing the partial pressures with
the aid of (47.15):

$$\prod_i c_i^{\nu_i} = p^{-\sum_i\nu_i}\prod_i(\theta f_i)^{\nu_i} = p^{-\sum_i\nu_i}K. \tag{51.6}$$

Here, $c_i$ denotes the concentration of the $i$th component of the mixture:

$$c_i = \frac{N_i}{N}. \tag{51.7}$$

The pressure on the right-hand side of (51.6) has still to be expressed
in terms of the initial pressure or in terms of the initial density;
this can always be done easily with the Clapeyron equation, if
we take into account the change in the number of particles (relative
to the initial number of particles), for a given equilibrium of the
chemical reaction.

The component concentrations depend upon the initial quantities
of the original substances involved in the reaction. Thus, the
equilibrium concentrations also depend upon these quantities (or masses).
Therefore, equation (51.6) expresses the so-called law of mass action.

The quantity appearing on the right-hand side of equation (51.5)
is called the equilibrium constant of the given reaction, because it
does not involve the concentrations of the mixture. Its dimensionality
is $[p^{\sum_i\nu_i}]$.

Heat of reaction. The heat of a chemical reaction occurring at
constant pressure is defined as the difference in the heat functions
of the reacting substances before and after the reaction. It is convenient
to write this heat as calculated for a single elementary act of
the reaction

$$Q_p = -\theta^2\frac{\partial}{\partial\theta}\frac{\delta\Phi}{\theta} \tag{51.8}$$

[cf. (50.18)]. But in an elementary act $\delta\Phi = \sum_i\mu_i\nu_i$, so that the
heat of reaction is

$$Q_p = -\theta^2\frac{\partial}{\partial\theta}\sum_i\frac{\mu_i\nu_i}{\theta}. \tag{51.9}$$

This expression denotes the heat absorbed during the reaction.
The heat evolved would have to be defined with opposite sign.

When the law of mass action applies, the heat of reaction is expressed
in terms of the equilibrium constant $K$:

$$Q_p = \theta^2\frac{\partial\ln K}{\partial\theta}. \tag{51.10}$$

Formula (51.10) agrees with the Le Chatelier-Braun principle,
which can be easily seen from the following argument. If
$\partial\ln K/\partial\theta > 0$, then, as the temperature increases, the equilibrium
tends towards a predominance of those substances that enter into the
reaction equation with positive coefficients $\nu_i$. The concentrations of these
substances appear in the numerator on the left-hand side of equation
(51.6). But then, according to (51.10), the system absorbs heat,
so that reactions occur in it which oppose the rise in temperature.
The increase or decrease in the temperature of an equilibrium system
can produce reversible reactions in it in any desired direction.


Exercises


1) Write down the equations of the law of mass action for the reaction
2CO + O$_2$ = 2CO$_2$, if $a$ moles of CO and $b$ moles of O$_2$ initially take part in the
reaction.

Let $x$ moles of O$_2$ react; then $2x$ moles of CO enter into the reaction with
them, and $2x$ moles of CO$_2$ are formed. In all, there are $a + b - 3x + 2x = a + b - x$
moles of different substances in the system. The concentrations are,
respectively,

$$c_{\mathrm{CO}} = \frac{a - 2x}{a + b - x}, \qquad c_{\mathrm{O_2}} = \frac{b - x}{a + b - x}, \qquad c_{\mathrm{CO_2}} = \frac{2x}{a + b - x},$$

so that the equilibrium equation appears thus:

$$\frac{(2x)^2(a + b - x)}{(a - 2x)^2(b - x)} = pK.$$

Here, $p$ is the equilibrium pressure, which differs from the pressure of the
original substances $p_0$ (for the same temperature) by the factor $\frac{a + b - x}{a + b}$. Whence
the equation for the required quantity $x$ is

$$\frac{4x^2}{(a - 2x)^2(b - x)} = \frac{p_0K}{a + b}.$$
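
The final equation of this exercise is a cubic in $x$, and in practice it is
solved numerically. The sketch below uses plain bisection on the interval
$0 < x < \min(a/2, b)$, where the left-hand side $4x^2/[(a-2x)^2(b-x)]$ rises
monotonically from zero to infinity, so that there is exactly one root.
The function name and the sample values of $a$, $b$ and $p_0K$ are hypothetical.

```python
def co_oxidation_extent(a, b, p0_K, iterations=200):
    """Moles x of O2 reacted in 2CO + O2 = 2CO2 at equilibrium.

    Solves 4*x**2 / ((a - 2*x)**2 * (b - x)) = p0_K / (a + b) by bisection.
    a, b -- initial moles of CO and O2; p0_K -- product of the initial
    pressure and the equilibrium constant (same units as a and b require).
    """
    rhs = p0_K / (a + b)

    def residual(x):
        # Negative below the root, positive above it.
        return 4.0 * x ** 2 - rhs * (a - 2.0 * x) ** 2 * (b - x)

    lo, hi = 0.0, min(a / 2.0, b)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical example: stoichiometric mixture, a = 2, b = 1, p0*K = 3.
x_eq = co_oxidation_extent(2.0, 1.0, 3.0)
print(x_eq)
```

A larger value of $p_0K$ pushes the root towards $\min(a/2, b)$, i.e. towards
complete combustion.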




2) Calculate the equilibrium constant for the thermal dissociation of nitrogen,
using the following data.

The ground state for a nitrogen atom is $^4S$. The first excited state lies 2.4 ev
higher ($^2D$), the next lies 3.5 ev above the ground state ($^2P$).

The formation energy of an N$_2$ molecule at absolute zero is equal to 9.76 ev
(this value is now reliably established).

The moment of inertia for the ground state is $J = 13.84\times10^{-40}$ g·cm$^2$.
The vibrational quantum of a molecule is equal to 0.287 ev. In the ground state
of the molecule, the orbital and spin angular momenta of the electrons do
not have projections on the line joining the nuclei. The first excited state lies
higher than 6 ev above the ground state of the molecule.

The statistical sum for the atoms is

$$f_{\mathrm{N}} = \frac{(2\pi m_{\mathrm{N}}\theta)^{3/2}}{(2\pi h)^3}\left(4 + 2\cdot5\,e^{-2.4/\theta} + 2\cdot3\,e^{-3.5/\theta}\right).$$

Here and in future, it is convenient to express $\theta$ in electron-volts, taking into
account that 1 ev corresponds to 11,600°. We shall confine ourselves to temperatures
for which the statistical sum for a molecule involves only the electronic
ground state. Then we obtain [see (47.21)]:

_ 

, (27cWfi*0)3/2 2 kJQ 4-n 6 ® 

/n* = —(2 „A)aT" ■ 

1 „ 0 


The equilibrium constant for the reaction N 2 = 2N is, from (51.5), equal to 


K: 


111 

/n, 


I 2.4 3.5\a „ , „ / hw\ 9.76 

= ( 44 - 106 84-66 9 ) Oe ® ■ 


To illustrate, let us find the fraction of dissociated molecules, if the temperature
is equal to 1 ev and there are $2.7\times10^{19}$ molecules in 1 cm$^3$. Then the
equation of the law of mass action is


4 a:® 

1 — a: 


=0.494 . 10« . 5.90 . 10-» = 24.2. 


Here, the factor before the exponential is equal to ~ $5\times10^5$, while the exponential
is equal to $5.9\times10^{-5}$. The equilibrium degree of dissociation is $x = 0.88$.
Thus, when the temperature is equal to only one tenth of the dissociation energy,
88% of all the molecules have already dissociated. The predominance of the
pre-exponential factor over the exponential is explained, for such relatively low
temperatures, by the fact that the statistical weight of the dissociated state is
determined by the entire volume occupied by the gas, while the nondissociated
state is determined only by the volume of the molecules; therefore, at atmospheric
density of the gas ($2.7\times10^{19}$ molecules/cm$^3$), dissociation is already highly
probable.
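
The degree of dissociation follows from the quadratic equation
$4x^2/(1 - x) = R$, where $R$ is the numerical right-hand side of the
mass-action equation. The sketch below simply takes the positive root;
the function name is hypothetical, and the input 24.2 is the value quoted
in the illustration above.

```python
import math


def dissociated_fraction(rhs):
    """Positive root of 4*x**2 / (1 - x) = rhs, i.e. 4x^2 + rhs*x - rhs = 0."""
    return (-rhs + math.sqrt(rhs * rhs + 16.0 * rhs)) / 8.0


x = dissociated_fraction(24.2)
print(round(x, 3))  # about 0.87, close to the value quoted in the text
```

For $R \gg 1$ the root approaches unity as $1 - 4/R$, so that the gas is almost
fully dissociated once the pre-exponential factor outweighs the small
Boltzmann exponential.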

3) Find the degree of thermal ionization for helium as a function of its temperature
and pressure. Do not take second ionization into account.

The first ionization potential of helium is 24.47 ev, while the first excited
state lies 20.5 ev above the ground state.

The ionization equilibrium satisfies the law of mass action

$$\frac{c_e\,c_{\mathrm{He}^+}}{c_{\mathrm{He}}} = p^{-1}K.$$




The statistical sums here are: 

fe- 

Ieb^ '■ 
/ho = 


{2-r:meQpi^ 
{2nhf ’ 
(2teWi£o6)®^^ 


( 2 rtA)=> 


24.47 


(2ji miio 0)3/2 _—L 

(2T:hf 


(the factor 2 takes into account the spin of the electron and Ho+ ion). 
From this wo express the equilibritim constant as 


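$$K = 4\,\frac{(2\pi m_e \theta)^{3/2}}{(2\pi h)^3}\,\theta\, e^{-24.47/\theta}
= \left(\frac{2}{\pi^3}\right)^{1/2} \frac{m_e^{3/2}\,\theta^{5/2}}{h^3}\, e^{-24.47/\theta}.$$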


If the initial pressure of helium is $p_0$, then the ionization-equilibrium equation 
assumes the following form: 

$$\frac{x^2}{1-x} = \frac{K}{p_0}.$$

To illustrate, we shall take a temperature of 4 ev and a molecular density of
$2.7 \times 10^{19}$ molecules/cm$^3$. Then the equation of ionization equilibrium
is written numerically thus:

$$\frac{x^2}{1-x} = 3.47 \cdot 10^{3} \cdot 2.19 \cdot 10^{-3} = 7.6, \qquad x = 0.90.$$

Here, as in the previous example, the pre-exponential factor predominates 
over the exponential, equal to $2.19 \times 10^{-3}$, as a result of the large statistical 
weight of the ionized state. Excited states of the helium atom make a negligible 
contribution to the statistical sum. At higher temperatures, the first ionization 
is practically complete, so that there are simply no neutral atoms which could 
be excited. 
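
The helium estimate can be reproduced numerically. The following is a minimal sketch, assuming the ionization equation in the form $x^2/(1-x) = (4/n_0)\,(m_e\theta/2\pi h^2)^{3/2}\, e^{-24.47/\theta}$, where $n_0$ is the initial density of atoms and $h$ is Planck's constant divided by $2\pi$, as elsewhere in the book. The CGS constants and the function names are assumptions introduced here, not taken from the text.

```python
import math

M_E = 9.109e-28    # electron mass, g (assumed value)
HBAR = 1.0546e-27  # Planck's constant divided by 2*pi, erg*s (assumed value)
EV = 1.602e-12     # erg per electron-volt (assumed value)


def ionization_factors(theta_ev, density_cm3, potential_ev=24.47, weight=4.0):
    """Return (pre-exponential factor, exponential) of the ionization equation.

    weight = 4 is the product of the spin factors 2 for the electron and the ion.
    """
    theta = theta_ev * EV
    pre = weight * (M_E * theta / (2.0 * math.pi * HBAR ** 2)) ** 1.5 / density_cm3
    return pre, math.exp(-potential_ev / theta_ev)


def ionization_degree(rhs):
    """Positive root x of x**2 / (1 - x) = rhs."""
    return (-rhs + math.sqrt(rhs * rhs + 4.0 * rhs)) / 2.0


pre, boltz = ionization_factors(4.0, 2.7e19)
x = ionization_degree(pre * boltz)
print(f"pre-exponential = {pre:.3g}, exponential = {boltz:.3g}, x = {x:.2f}")
```

With these constants the pre-exponential factor comes out a few per cent above the value used in the text, and the degree of ionization is again about 0.9.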

4) Relate the e.m.f. of a primary cell to the heat of the chemical reaction 
occurring in it. 

By definition, the e.m.f. is the work done in carrying unit charge around a 
conducting circuit. If a primary cell is connected in the circuit, then, in
traversing it, reversible chemical reactions occur that neutralize the ions at
the electrodes. The work done in a reversible reaction is equal to the change in
thermodynamic potential, while the heat is equal to the change in the heat function. 
Whence, from equation (51.9), we obtain 


Sec. 52. Surface Phenomena 

The thermodynamic potential of a surface layer. We have so far 
considered only the volume properties of a substance, so that all 
the results relating to phase and chemical equilibria, and to equilibria 
in solutions, refer, strictly speaking, to very large systems. 

The surface layers separating different substances or different 
phases of the same substance exhibit special properties, which, 
however, depend both on the nature and the states of the volumes 
in contact. 



The thermodynamic potential for unit surface of contact of two 
media depends upon the temperature $\theta$ and pressure $p$ in the
surrounding media. In equilibrium, $\theta$ and $p$ are constant over the whole 
surface. Interaction between the contacting portions occurs across 
their boundaries. The dimensions of the boundaries are proportional 
to the first power of the linear dimensions of these portions, while 
their areas are proportional to the squares of the linear dimensions. 
Therefore, for sufficiently large dimensions, the portions of the
surface layer may be regarded as quasi-independent subsystems, in 
the same way as we considered volume subsystems. Hence, the 
thermodynamic potential of a surface is additive for the same reason 
as the volume potential is. If the thermodynamic potential for unit 
interface surface of two media is denoted by $\alpha$ and the magnitude 
of the whole contact surface is $\Sigma$, then, by virtue of additivity, the 
potential for the whole surface will be equal to 

$$\Phi = \alpha \Sigma. \qquad (52.1)$$

Surface tension. The work done at constant pressure and temperature
is equal to the change in thermodynamic potential (Sec. 46). 
Therefore, in changing surface area by unity, work is performed 
equal to $\alpha$. This work is called the surface tension of two given media. 

It is easy to demonstrate the relation between this definition of 
surface tension and its elementary definition. Let a film of liquid 
be stretched on a rigid H-shaped wire frame, closed by a movable 
member, completing a rectangular liquid surface. If the length of 
the movable member is equal to unity, then a force acts on it (from 
the direction of the film) equal to twice the surface tension of the 
film (because the film has two sides). When the cross-member is 
displaced by unit length, the force of surface tension performs work 
numerically equal to double its magnitude. But the total surface 
of the film increases in this case by two units, so that the work done 
in increasing the surface by unity is indeed equal to the “force” 
of surface tension. 

When the surface area is increased, part of the atoms leave the 
volume for the surface layer; to do so it is necessary in part to 
overcome the attractive forces due to other atoms. This is what 
explains the origin of the work that is lost (or gained, depending 
upon the nature of the volumes in contact) in increasing the area. 
The surface tension of a condensed phase at the boundary with a 
vacuum is, of course, always positive. 

The thermodynamic potential is at a minimum when in equilibrium. 
In this case, the minimum is attained simply for the least area. 
Therefore, a liquid film, stretched on a certain (in the general case a 
nonplane) frame assumes the least possible surface area for the given 
frame. A liquid drop in ideal equilibrium assumes a spherical form 
that has the least surface area for a given volume. 



The heat of surface increase. When surface area increases, heat is 
evolved in addition to the performance of work. Since the process of 
surface increase is reversible, the heat is determined from the general 
formula (46.18), $Q = \theta\,\Delta S$. The entropy of the surface is calculated 
from formula (46.46). Substituting the thermodynamic potential for 
the surface (52.1) into this formula, we find an expression for the heat: 

$$Q = -\theta\,\frac{\partial \alpha}{\partial \theta}\,\Delta\Sigma, \qquad (52.2)$$

where $\Delta\Sigma$ is the increase in surface area and $\alpha$ is the surface tension.
Thus, heat may be evolved or absorbed depending upon the sign of
$\partial\alpha/\partial\theta$.

The equilibrium vapour pressure above a drop. The phase equilibrium 
condition changes if the surface thermodynamic potential is taken 
into account in addition to that for volume. Of course, the general 
condition $d\Phi = 0$ still holds, but it no longer reduces to the form 
(49.4), $\mu_1 = \mu_2$. Instead, it is necessary to write, generally, 

$$\frac{\partial \Phi_1}{\partial N} = \frac{\partial \Phi_2}{\partial N}. \qquad (52.3)$$


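Let the subscript 1 refer to the vapour phase contained in a large 
volume, and the subscript 2 relate to a small liquid drop of radius $R$. 
Then, for the first phase, 

$$\frac{\partial \Phi_1}{\partial N} = \mu_1, \qquad (52.4)$$

while for the second phase, 

$$\frac{\partial \Phi_2}{\partial N} = \frac{\partial}{\partial N}\left(N\mu_2 + \alpha\Sigma\right). \qquad (52.5)$$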


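The derivative of the second term is calculated in the following way: 

$$\frac{\partial (\alpha\Sigma)}{\partial N} = \alpha\,\frac{\partial \Sigma}{\partial N} = 8\pi\alpha R\,\frac{\partial R}{\partial N}. \qquad (52.6)$$

If the density of the liquid is $\rho$ molecules/cm$^3$, then
$R = \left(\dfrac{3N}{4\pi\rho}\right)^{1/3}$, so that 

$$\frac{\partial R}{\partial N} = \frac{1}{3}\,\frac{R}{N}. \qquad (52.7)$$

Substituting this in (52.6) and now expressing $N$ in terms of $R$, we 
obtain 

$$\frac{\partial (\alpha\Sigma)}{\partial N} = \frac{2\alpha}{\rho R}. \qquad (52.8)$$

Thus, the equilibrium condition between the vapour and liquid drop is 

$$\mu_1(p, \theta) = \mu_2(p, \theta) + \frac{2\alpha}{\rho R}. \qquad (52.9)$$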



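We represent the pressure $p$ as $p_0 + \Delta p$, where $p_0$ is the equilibrium 
pressure above a plane surface. The expansion of chemical potential in 
powers of $\Delta p$ gives [cf. (50.12)] 

$$(v_1 - v_2)\,\Delta p = \frac{2\alpha}{\rho R}. \qquad (52.10)$$

Neglecting the specific volume of the liquid compared with the specific 
volume of the vapour, and taking $v_1 = \theta/p_0$ for the vapour regarded as
an ideal gas, we find a final expression for the excess pressure: 

$$\Delta p = \frac{2\alpha p_0}{\rho\,\theta R}. \qquad (52.11)$$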

The analogous expression for the pressure in a vapour bubble within 
a liquid leads to the same formula, but with opposite sign. 

The stability of supersaturated phases. We have thus seen that the 
equilibrium vapour pressure over the convex surface of a drop is 
greater than that over a plane surface, while that over the concave 
surface of a bubble is less. This explains the relative stability of
supersaturated phases, which was mentioned in Sec. 49. If a liquid drop 
appears in a supersaturated vapour with pressure $p' > p_0$, whose 
radius is less than 

$$R = \frac{2\alpha p_0}{\rho\,\theta\,(p' - p_0)}, \qquad (52.12)$$


then this drop evaporates once again, and further condensation on it 
is highly improbable, since it is a fluctuation phenomenon. Only if 
inequality (52.12) is reversed can the drop begin to grow. But the 
spontaneous formation of a large drop, like any large fluctuation, is 
highly improbable. Therefore, condensation usually begins on small 
nuclei that are already in the vapour, for example, ions. 
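
To give a feeling for the magnitudes involved, the sketch below evaluates the critical radius $R = 2\alpha p_0 / [\rho\theta(p' - p_0)]$ for several supersaturations $p'/p_0$. The numbers for water near room temperature (surface tension, number density of molecules, temperature) are assumptions introduced here for illustration; they do not come from the text, and the function name is hypothetical.

```python
K_B = 1.381e-16  # Boltzmann constant, erg/K (assumed value)


def critical_radius(alpha, rho, theta, supersaturation):
    """Critical drop radius in cm.

    alpha           -- surface tension, erg/cm^2
    rho             -- number density of molecules in the liquid, 1/cm^3
    theta           -- temperature in energy units, erg
    supersaturation -- ratio p'/p0, must exceed 1
    """
    if supersaturation <= 1.0:
        raise ValueError("the vapour must be supersaturated (p'/p0 > 1)")
    return 2.0 * alpha / (rho * theta * (supersaturation - 1.0))


# Assumed illustrative data for water at about 293 K.
alpha_water = 72.0        # erg/cm^2
rho_water = 3.34e22       # molecules/cm^3
theta_room = K_B * 293.0  # erg

radii = {s: critical_radius(alpha_water, rho_water, theta_room, s)
         for s in (1.01, 1.1, 2.0)}
for s, r in radii.items():
    print(f"p'/p0 = {s}: critical radius = {r:.2e} cm")
```

For a supersaturation of ten per cent the critical radius is of the order of $10^{-6}$ cm under these assumptions, that is, a drop containing very many molecules, which illustrates why its spontaneous formation is improbable.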

In exactly the same way we can explain why a highly purified 
superheated liquid does not boil. Boiling of a liquid consists in the 
formation of vapour bubbles in its volume. In order that such a bubble 
should not collapse due to external pressure, the equilibrium pressure 
of the vapour must be at least that of the external atmospheric pressure 
above the liquid. But if the equilibrium vapour pressure above a 
plane surface is only equal to the external pressure, then there is not 
sufficient pressure in the bubble for equilibrium. Therefore, a bubble 
of insufficient size cannot grow. 


APPENDIX 



The summation with the upper sign may be reduced to the summation with 
the lower sign. Indeed, 


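$$1 - \frac{1}{2^{n+1}} + \frac{1}{3^{n+1}} - \frac{1}{4^{n+1}} + \dots
= 1 + \frac{1}{2^{n+1}} + \frac{1}{3^{n+1}} + \frac{1}{4^{n+1}} + \dots
- 2 \cdot \frac{1}{2^{n+1}} \left(1 + \frac{1}{2^{n+1}} + \frac{1}{3^{n+1}} + \dots\right)
= \left(1 - \frac{1}{2^{n}}\right) \left(1 + \frac{1}{2^{n+1}} + \frac{1}{3^{n+1}} + \frac{1}{4^{n+1}} + \dots\right).$$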
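
The reduction of the alternating series to the series with positive signs can be checked by direct summation. The sketch below is only a numerical illustration for $n = 2$ and $n = 3$, where plain partial sums converge quickly; the function names and the number of terms are arbitrary choices.

```python
def positive_sum(n, terms=20000):
    """Partial sum of 1 + 1/2**(n+1) + 1/3**(n+1) + ..."""
    return sum(1.0 / k ** (n + 1) for k in range(1, terms + 1))


def alternating_sum(n, terms=20000):
    """Partial sum of 1 - 1/2**(n+1) + 1/3**(n+1) - ..."""
    return sum((-1.0) ** (k + 1) / k ** (n + 1) for k in range(1, terms + 1))


differences = {}
for n in (2, 3):
    lhs = alternating_sum(n)
    rhs = (1.0 - 0.5 ** n) * positive_sum(n)
    differences[n] = abs(lhs - rhs)
    print(f"n = {n}: alternating = {lhs:.8f}, (1 - 1/2^n) * positive = {rhs:.8f}")
```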


Finally, the summation involving positive signs has the following values: 


n                                   1/2      1        3/2      2        5/2      3

$\sum_{k=1}^{\infty} 1/k^{n+1}$     2.612    1.645    1.341    1.202    1.127    1.0823
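
The tabulated sums can be reproduced numerically. For half-integer $n$ the series converges slowly, so the sketch below sums the first terms directly and estimates the remainder with the first terms of the Euler-Maclaurin formula; this particular method and the cutoff are choices made here, not something taken from the text.

```python
def series_sum(s, cutoff=1000):
    """Approximate the sum over k >= 1 of k**(-s) for s > 1.

    Terms with k < cutoff are summed directly; the remainder is estimated
    by the integral, half of the first omitted term, and one derivative
    correction (Euler-Maclaurin).
    """
    head = sum(k ** -s for k in range(1, cutoff))
    tail = (cutoff ** (1.0 - s) / (s - 1.0)
            + 0.5 * cutoff ** -s
            + s * cutoff ** (-s - 1.0) / 12.0)
    return head + tail


table = {n: series_sum(n + 1.0) for n in (0.5, 1.0, 1.5, 2.0, 2.5, 3.0)}
for n, value in table.items():
    print(f"n = {n}: sum = {value:.4f}")
```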


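For odd n, the following formulae obtain: 

$$\sum_{k=1}^{\infty} \frac{1}{k^2} = \frac{\pi^2}{6}, \qquad \sum_{k=1}^{\infty} \frac{1}{k^4} = \frac{\pi^4}{90}.$$

Therefore, 

$$\int_0^\infty \frac{x^3\,dx}{e^x - 1} = 3! \sum_{k=1}^{\infty} \frac{1}{k^4} = \frac{\pi^4}{15}.$$

We also note that 

$$\int_0^\infty \frac{x^{1/2}\,dx}{e^x - 1} = \frac{\sqrt{\pi}}{2} \sum_{k=1}^{\infty} \frac{1}{k^{3/2}} = \frac{\sqrt{\pi}}{2} \cdot 2.612 = 2.31.$$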

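We have met with the integral (44.39): 

$$\int_{-\infty}^{\infty} \frac{x^2\,dx}{(e^x + 1)(1 + e^{-x})}
= -2 \int_0^\infty x^2\, d\!\left(\frac{1}{e^x + 1}\right)
= 4 \int_0^\infty \frac{x\,dx}{e^x + 1}
= 4 \cdot \frac{1}{2} \cdot \frac{\pi^2}{6} = \frac{\pi^2}{3}.$$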
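
The value $\pi^2/3$ of this integral can be confirmed by numerical quadrature. The sketch below uses Simpson's rule on a finite interval, relying on the identity $(e^x + 1)(1 + e^{-x}) = 2 + 2\cosh x$; the interval and the number of steps are arbitrary choices made for illustration.

```python
import math


def integrand(x):
    """x**2 / ((e**x + 1) * (1 + e**-x)), written as x**2 / (2 + 2*cosh(x))."""
    return x * x / (2.0 + 2.0 * math.cosh(x))


def simpson(f, a, b, steps):
    """Composite Simpson rule with an even number of steps."""
    if steps % 2:
        raise ValueError("steps must be even")
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


# The integrand decays like x**2 * exp(-|x|), so cutting off at |x| = 40 is harmless.
value = simpson(integrand, -40.0, 40.0, 8000)
print(f"numerical value = {value:.6f}, pi^2/3 = {math.pi ** 2 / 3.0:.6f}")
```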


BIBLIOGRAPHY 


Part I 

1. *L. D. Landau and E. M. Lifshits, Mechanics, Gostekhizdat, Moscow-Leningrad (1958). 
2. *G. K. Suslov, Theoretical Mechanics, Gostekhizdat, Moscow-Leningrad (1944). 
3. *T. Levi-Civita and U. Amaldi, Lezioni di meccanica razionale, Bologna (1930). 
4. E. T. Whittaker, Analytical Dynamics (1937). A Treatise on the Analytical Dynamics of Particles and Rigid Bodies, Cambridge (1927). 
5. A. Sommerfeld, Mechanik (1944). 
6. H. Goldstein, Classical Mechanics, Cambridge (1950). 


Part II 

7. *N. E. Kochin, Vector Calculus, GITTL (1934). 
8. L. D. Landau and E. M. Lifshits, Field Theory, Gostekhizdat, Moscow-Leningrad (1948). 
9. *I. E. Tamm, Fundamentals of the Theory of Electricity, Gostekhizdat, Moscow-Leningrad (1954). 
10. *Abraham and Becker, Theorie der Elektrizität, Bd. I, Leipzig (1932). 
11. *R. Becker, Elektronentheorie. 
12. A. Einstein, The Meaning of Relativity, Princeton (1953). 
13. P. G. Bergmann, Introduction to the Theory of Relativity (1942). 
14. *Einstein and Modern Physics, Gostekhizdat (1956). 


Part III 

15. L. D. Landau and E. M. Lifshits, Quantum Mechanics, Part 1 (1948). 
16. *D. I. Blokhintsev, Fundamentals of Quantum Mechanics (1944), 2nd ed. (1949). 

17. A. Fok, The Principles of Quantum Mechanics (1932). 

18. H. A. Bethe and E. E. Salpeter, Quantum Mechanics of One- and Two- 
Electron Systems, Springer-Verlag (1957). 

19. P. A. M. Dirac, The Principles of Quantum Mechanics, Oxford (1957). 

20. E. V. Shpolskii, Atomic Physics, Parts 1 and 2 (1960). 


Part IV 

21. L. D. Landau and E. M. Lifshits, Statistical Physics (1960). 

22. *V. G. Levich, Introduction to Statistical Physics (1954). 
23. *M. A. Leontovich, Statistical Physics (1944). 
24. *M. A. Leontovich, Introduction to Thermodynamics (1950). 
25. *E. Fermi, Molecules and Crystals (1947); (Moleküle und Kristalle, 1938). 
26. *A. G. Samoilovich, Statistical Physics (1955). 


* As far as the author knows, these have not been translated into English. 


SUBJECT INDEX 


Aberration of light: 200 
Absolute thermodynamic scale of temperature: 522 
— electrostatic system of units: 106 
— black body: 467 
Absorption of light: 170 
Action: 82 
— for a particle in an electromagnetic field: 218 
— for an electromagnetic field: 117 
— in the theory of relativity: 211 
Additive quantity: 16 
— integral of motion: 36 
Adiabatic process: 520 
— demagnetization: 543 
Alpha disintegration: 283 
Amplitude: 61 
Analogy, Mandelshtam: 282 
—, optical-mechanical: 235 
Angle, Eulerian: 79 
—, solid: 54 
Angular momentum: 36 
— of a field: 124 
—, as a generalized momentum: 42 
—, composition of: 309, 337 
Antiparticle: 401 
Antiproton: 401 
Aphelion: 45 
Atmosphere, planetary: 441 
Atom, vector model of: 369 
Atomic units: 316 
Atoms, hydrogen-like: 322 
—, metastable: 367 
Barrier factor: 282 
— for alpha disintegration: 284 
Biot-Savart law: 137 
Bohr theory of atomic structure: 229 
— magneton: 330 
— quantum conditions: 290 
Boltzmann distribution: 430 
Born approximation: 385 
Bose condensation: 474 
Centre of mass: 26 
Proper time: 205 
Chemical equilibrium: 576 
— potential: 568 
Circulation of a vector: 96 
Close action: 134 
Coefficient, mutual induction: 160 
—, self-induction: 160 
Collisions, elastic: 61 
—, inelastic: 60 
Commutative relations for operator: 298 
Condition, Lorentz: 116 
Conductors: 149 
Constant, equilibrium: 579 


Constant, Planck’s: 230 
Constraints, ideal rigid: 16 
Contraction of the length scale: 198 
—, time interval: 197 
Coordinate system: 11 
—, centre of mass: 49 
—, inertial: 15 
—, laboratory: 49 
—, rotating: 69 
Coordinates, curvilinear: 101 
—, generalized: 12 
—, normal: 66 
—, normal electromagnetic field: 275 
—, spherical: 24 
Coupling, j-j: 339 
—, Russell-Saunders: 338 
Critical point: 562 
Curl: 96 
Current, displacement: 111 
Cyclic variables: 43 
Deflection of light rays in a gravitational field: 210 
Degree of freedom: 11 
—, thermodynamic: 573 
Density, charge: 109 
—, current: 110 
—, energy: 122 
—, probability: 244 
—, probability flux: 250 
Diamagnetic substance: 161 
Diamagnetism of electrons: 485 
Dielectric: 149 
— constant: 163, 162, 442 
Diffraction, electron: 239 
—, x-ray: 239 
Dipole approximation: 185, 362 
Dispersion: 379 
— formula: 382 
Distribution, Boltzmann: 430 
—, Bose-Einstein: 426, 474 
—, Fermi-Dirac: 426 
—, Gibbs: 502 
—, Maxwell: 432, 477 
Divergence, vector: 94 
Domains: 162 
Effect, anomalous Zeeman: 374 
—, Compton: 231 
—, Doppler: 207 
—, linear Stark: 377 
—, normal Zeeman: 374 
—, square-law Stark: 377 
Effective cross-section, differential: 64, 385 
—, total: 391 
Efficiency: 522 
Eigenfunctions: 253 



Eigenvalues: 253 
—, angular-momentum projection: 294 
—, degenerate: 416 
— of square of angular momentum: 308 
Einstein formula for fluctuation probability: 550 
— radiation laws: 464 
Elements, rare-earth: 342 
—, transuranium: 243 
Emission of quanta, spontaneous and forced: 462 
—, magnetic dipole: 188, 366 
—, quadrupole: 188, 367 
Energy: 31 
—, field: 121 
—, hydrogen atom: 410 
—, in the theory of relativity: 213 
—, of potential gauge calibration: 44 
—, potential: 19 
—, rest: 213 
—, solid body: 467 
—, calculation of: 468 
—, total: 32 
Energy levels, electronic: 448 
—, fine structure of: 330 
—, multiplet: 339 
—, rotational: 452 
—, spectroscopic notation of: 339 
—, vibrational: 449 
Entropy: 506 
— of a subsystem: 507 
Equation, Clapeyron: 436 
—, Clausius-Clapeyron: 559 
—, Dirac: 395 
—, Schrödinger: 246 
—, Thomas-Fermi: 487 
—, van der Waals: 560 
Equations, Euler: 78 
—, Hamilton: 89 
—, Lagrange: 19 
—, Maxwell: 109, 113 
—, in complex form: 157 
Excited state of a system: 254 
Expansion, eigenfunction: 304 
Experiment, Fizeau: 200 
—, Michelson’s: 191 
—, Stern-Gerlach: 295 
Factor, Landé: 372 
Fermi-Dirac distribution: 426, 477 
Ferromagnetism: 151 
Filled levels, shell: 370 
Fine structure of atomic levels: 330 
Fizeau experiment: 201 
Fluctuation, absolute energy: 504 
—, relative energy: 504 
Force: 14 
—, central: 24 

—, centrifugal: 73 
—, Coriolis: 72 
—, electromotive: 106 
—, friction: 16 
—, inertia: 71 
—, Lorentz: 220 
—, magnetomotive: 112 
— oscillator: 384 
Formula, barometric: 441 
—, Debye: 470 
—, dispersion: 382 
—, Planck: 464 
—, Poisson: 548 
—, Rutherford: 55, 387 
—, Stirling: 423 
Formulae, Fresnel: 172 
Free energy: 524 
Frequency, of oscillation: 58 
Function, Hamilton: 88 
—, Lagrange: 22 
—, for field: 117 
Galilean transformation: 68 
Gauge calibration, potential: 115 
— invariance: 116 
Gibbs distribution: 502 
Gradient: 98 
Ground state, electromagnetic field: 277 
— of a system: 264 
Hamilton’s equation: 89 
— function (Hamiltonian): 88 
Harmonic oscillator: 61, 266 
Heat of reaction: 579 
— of solution: 571 
Helium, orthostate of: 349 
—, parastate of: 349 
Ideal gases: 415 
Index, adiabatic: 542 
—, refractive: 170 
Induction, electric: 148 
—, magnetic: 148 
Integral principles: 82 
Integrals of motion: 31 
Intensity, dipole radiation: 187 
Interval: 202 
Intervals, space and time: 203 
Invariant electromagnetic field: 223 
Inversion: 320, 403 
Irreversible process: 520 
Isomers, nuclear: 367 
Isothermal process: 519 
Kepler problem: 47 
Lagrange’s equation: 22 
—, integrals of motion: 31 
Lagrangian: 19 
—, field: 117 
Landé factor: 372 
Laplacian operator: 100