Rudolf Kippenhahn Alfred Weigert Stellar Structure and Evolution 1996
STELLAR STRUCTURE & EVOLUTION
Prof. Dr. Conny Aerts
KU Leuven, Belgium
Master Program in Astronomy & Astrophysics
(NL: Master Sterrenkunde)
6 Study Points in the European Credit system (ECTS)
Contents
Preface
PART I: BASIC INTRODUCTION TO ASTRONOMY
1 Observational framework of astronomy
1.1 Magnitudes and colour indices
1.2 Spectral types and luminosity classes
1.2.1 The formation of spectral lines in the stellar spectrum
1.2.2 Spectral types
1.2.3 Luminosity classes
1.2.4 Stellar atmosphere models
1.3 The Hertzsprung-Russell diagram
1.4 Stars in our Milky Way
1.5 Galaxies in the Universe
1.6 Starting point of this course
PART II: STELLAR STRUCTURE
2 A simple equation of state: an ideal gas with radiation
2.1 Introduction to thermodynamics, applied to stars
2.1.1 Thermodynamic equilibrium
2.1.2 The first law of thermodynamics
2.1.3 The entropy
2.1.4 The specific heats
2.2 An ideal gas with radiation
2.2.1 The classical ideal gas law applied to stars
2.2.2 The mean molecular weight
2.2.3 The internal energy of an ideal gas
2.2.4 The contribution of the photon gas
3 Classical mechanics applied to stellar structure
3.1 Some preliminaries
3.2 Coordinates
3.2.1 Eulerian description
3.2.2 Lagrangian description
3.3 Poisson’s equation
3.4 Conservation of momentum
3.4.1 Hydrostatic equilibrium
3.4.2 Simple solutions
3.4.3 The equation of motion in case of spherical symmetry
3.5 Conservation of energy
3.5.1 The virial theorem
3.5.2 Conservation of energy in stars
3.5.3 The different time scales
4 Additional relevant equations of state
4.1 Polytropes
4.2 The degenerate electron gas
4.3 The Chandrasekhar limit
4.4 Schematic representation of the relevant equations of state
5 Energy transport
5.1 Energy transport by radiation
5.1.1 Mean free path
5.1.2 The temperature gradient
5.1.3 The diffusion approximation
5.1.4 The Rosseland mean opacity
5.2 Energy transport by conduction
5.3 Stability analysis
5.3.1 Dynamical instability
5.3.2 Buoyancy frequency and semiconvection
5.4 Convective energy transport
5.4.1 Mixing length theory
5.4.2 A computational scheme
5.4.3 The parametric implementation
6 The chemical composition of stellar matter
6.1 The relative mass fractions
6.2 Variations of chemical composition due to nuclear reactions
6.3 Effective cross sections
6.4 Nuclear burning cycles
6.4.1 Basic concepts
6.4.2 Big Bang nucleosynthesis
6.4.3 Hydrogen burning
6.4.4 Helium burning
6.4.5 Fusion of heavier elements
6.5 Summary: the periodic table filled via the nuclear reactions in stars
7 Complications: mixing of chemical elements due to transport processes
7.1 Convective mixing and nuclear burning
7.2 Convective boundary mixing aka CBM
7.3 Mixing due to rotational instabilities and waves
7.3.1 Models with rotation
7.3.2 Rotational and pulsational mixing
7.4 Microscopic atomic diffusion
8 Numerical computation of stellar structure and evolution models
8.1 The full system of basic equations
8.2 Time scales and simplifications
8.3 Boundary conditions
8.3.1 Central boundary conditions
8.3.2 Boundary conditions for the surface
8.4 A simple numerical scheme: the Henyey method
8.5 The MESA stellar structure and evolution code
PART III: STELLAR EVOLUTION
9 Star formation
9.1 The interstellar medium
9.2 The Jeans criterion
9.3 Fragmentation
9.4 The formation of a protostar
9.5 Hayashi tracks in the HR diagram
9.6 Evolution of the protostar towards the zero-age main sequence
10 The main sequence or core-hydrogen burning phase
10.1 Zero-age main sequence models
10.2 The mass-luminosity and mass-radius relations
10.3 Chemical evolution on the main sequence
10.4 The end of core-hydrogen burning
10.5 Later stages of evolution
11 Evolution of a star with 8 M⊙ ≲ M ≲ 15 M⊙
11.1 The Hertzsprung gap for stars with M ≳ 2.3 M⊙
11.2 Helium burning
11.3 Later evolution stages
11.4 Burning cycles
11.5 Explosive versus non-explosive evolution
11.6 Neutron stars
11.6.1 Supernova explosion
11.6.2 The neutrino flux and the r-process
11.6.3 Pulsars
12 Evolution of a star with M ≲ 8 M⊙
12.1 Post-main-sequence evolution
12.2 The helium flash
12.3 Evolution after the helium flash
12.4 AGB stars
12.4.1 The circumstellar envelope and mass loss during the AGB
12.4.2 Thermal pulses, Hot Bottom Burning and the 3rd dredge-up
12.4.3 The s-process in AGB stars
12.5 Post-AGB stars
12.6 White dwarfs
13 Evolution of a star with M ≳ 15 M⊙
13.1 The spectra of hot massive stars with mass loss
13.2 Basic characteristics of radiation-driven stellar winds
13.3 Mass loss and terminal wind speed
13.3.1 Thomson scattering in the stellar wind
13.3.2 LBVs, WR stars and the Eddington limit
13.3.3 A realistic description of a line-driven stellar wind: the CAK-model
13.4 Consequences of mass loss for stellar evolution
13.5 Example: the evolution of a star with an initial mass of 60 M⊙
13.6 Black holes
13.7 Chemical evolution of galaxies
13.7.1 Chemical enrichment by stellar evolution
13.7.2 Initial mass function
13.7.3 Global enrichment of the Universe
PART IV: BINARY EVOLUTION
14 Binary stars and their evolution
14.1 Observational classification of close binary stars
14.2 The Roche model
14.3 Determination of the orbital and fundamental parameters of binaries
14.3.1 Orbital elements
14.3.2 Masses and radii of main-sequence stars
14.3.3 Masses of white dwarfs
14.3.4 Type Ia supernovae
14.3.5 Masses of neutron stars
14.4 Mass transfer and evolution of close binaries
14.4.1 Tidal effects: circularisation and synchronisation
14.4.2 Mass transfer
14.4.3 Effect of mass transfer on the orbital parameters
14.4.4 The common envelope phase
14.5 Some binary scenarios
THE END
A Planck’s radiation laws
B Values of some physical and astronomical constants
C Some key references for this discipline, used in these lecture notes
Preface
Stars are the building blocks of the Universe as a whole, and of galaxies and exoplanetary systems in
particular. Stars determine the dynamical and chemical evolution of galaxies and of their multiple systems
such as star clusters, binaries, and planetary systems. Stars are responsible for the production of nearly all
chemical elements and as such they determine the chemical enrichment of their host galaxy and of any form
of biological life occurring on planets. During their lives, stars feed the products of nucleosynthesis
back into the interstellar medium through stellar winds. Massive stars also contribute to the chemical and
dynamical evolution of their host galaxy at the end of their lives, when they explode as supernovae. It is
thanks to this chemical enrichment of galaxies, and of our Milky Way in particular, that life on Earth has
been able to emerge and evolve. The derivation of distances in the Universe is based on our knowledge of
stars. This is also the case for the determination of the age of the Universe. We cannot but conclude that
models of stellar structure and evolution play a pivotal role in most subjects of modern astrophysics.
With this in mind, it is both natural and advisable to have a mandatory 6 ECTS
Master course that describes these essential constituents of the Universe in some detail. At Bachelor
level, most students have already taken descriptive courses covering various topics in astronomy, e.g., (exo)planets,
stars, the interstellar medium, galaxies, and cosmology. These courses may only have touched upon each
of these topics, depending on the background of the student. Some students may enter the MSc in
A&A at KU Leuven without any prior knowledge of astronomy. Our aim with this course is to
get everyone comfortably on board, despite the large diversity in prior knowledge and scientific or cultural
background.
⋆
⋆ ⋆
Computations of stellar structure and evolution models are based on our current knowledge of the physical
properties of matter and radiation occurring in stars. By comparing the computed models with ever more precise
observations, we are able to test the assumptions on the input physics used in the model computations.
This is the only practical method to evaluate stellar evolution models, because stellar interior conditions are
so extreme in terms of temperature, density, and pressure that it is impossible to carry out laboratory tests
under the appropriate circumstances and for the proper time scales. This is what distinguishes observational
stellar astronomy from experimental physics.
The structure and evolution of stars are mainly determined by the microscopic properties of stellar matter,
in particular the equation of state of the gas, the energy transport, the nuclear reactions, and the interaction
of radiation and matter. The equation of state determines the relations between the various thermodynamic
properties like temperature, density, and pressure inside the star. This equation of state is quite simple in the
case of stellar interiors because the prevalent high temperatures imply that matter is almost entirely ionized.
However, some complications occur, for example when the stellar gas is only partially ionized in the layers
near the stellar surface. In these regions, we need to take the degree of ionization into account. This level
of ionization depends on the interaction between the different gas components. In the atmospheres of
cool stars, molecules may appear alongside neutral atoms, influencing the equation of state. On the other
hand, in the hot cores of evolved stars, the temperature may become so high that energy loss due to the
production of neutrinos has to be taken into account. Also, the density may become so high that the pressure
will be dominated by degenerate electrons or neutrons. In such a case, the ideal gas law is no longer valid
and the laws of quantum physics have to be considered to arrive at a proper equation of state.
A detailed description of the physical properties of stellar interiors requires in-depth study. Fortunately,
it is possible to derive an appropriate description of stellar structure and evolution without going into the full
details and still obtain a solid understanding of how stars live their lives. The theory described in this course
is elegant, has an impressive predictive power, and combines several branches of mathematics, physics,
chemistry, and computer science. Furthermore, via modern implementations of the theory into an efficient
set of modules called MESA, detailed numerical computations of this theory can be performed (with much
gratitude to the MESA developers team!). In that way, we are able to predict how the complicated internal
stellar structure changes during a star’s life and what the final end product of the stellar evolution will be: a
white dwarf, a neutron star or a black hole.
⋆
⋆ ⋆
The current 6 ECTS course Stellar Structure & Evolution (SSE) aims to give a stimulating derivation and
description of the physics of stars to Master students. This way, students will be able to understand the
origin, life and death of stars and learn how this cycle influences the chemical evolution of galaxies and of
the Universe as a whole. Introductory bachelor courses may have suggested that the branch of physics as a
scientific domain is composed of several separate sub-branches, like mechanics, thermodynamics, nuclear
physics, quantum mechanics, etc. In this stellar structure & evolution course, these seemingly separate
segments come together in a natural way, taking chemistry on board as well. This course is thus
intended for students who have a basic knowledge of mathematics, physics, and to some extent chemistry.
We have tried to keep technical jargon to a minimum and to compose a course that is accessible to all
students at the start of the Master in Astronomy & Astrophysics, whatever their background in science.
The first chapter recalls some of the basics of bachelor courses, limited to those aspects and
definitions that are essential as a starting point for the current course. Some of these aspects of astrophysics were
already treated in Leuven Bachelor courses, which Master students who did their Bachelor studies elsewhere
might not have encountered in their education. The ingredients that are of major importance here are
therefore briefly repeated in this Master course.
Conny Aerts, September 2021.
Chapter 1
Observational framework of astronomy
All we know about stars and other celestial bodies is derived from observations of their electromagnetic
spectrum. The amount of light radiated by a star is mainly determined by its size and by the temperature and
chemical composition near its surface. The challenge is to decode this radiated light and deduce information
that is not directly observable, such as the stellar mass, age, internal temperature, pressure and chemical composition.
The brightness or luminosity $L$ of an object is the amount of energy it radiates per second. This quantity
is a crucial parameter for a star’s life, together with its mass $M$. Neither of these quantities can be measured
directly. The amount of energy coming from a star that can be measured with an instrument depends on the
distance $d$ between star and observer. The determination of distances is therefore an important
part of observational astronomy.
Often we will assume that stars roughly behave like black bodies, following Planck’s radiation laws. A
short description of these laws and their interpretation can be found in Appendix A. The relation between
the temperature, luminosity and radius of a black body is given by the Stefan-Boltzmann law:
$$L = 4\pi\sigma R^2 T^4, \tag{1.1}$$
where $\sigma = 2\pi^5 k^4/(15 c^2 h^3)$ is a constant, with $k$ the Boltzmann constant, $h$ the Planck constant
and $c$ the speed of light. The effective temperature $T_{\rm eff}$ of a star with radius $R$ and luminosity $L$ is then
defined as the temperature of a spherical black body with the same radius and luminosity: $L = 4\pi R^2 \sigma T_{\rm eff}^4$. In this way the
effective temperature is a measure of the temperature in the star’s photosphere, the region near the
outer layers of the star from which the radiation escapes. The Sun has an effective temperature of $\simeq 5780$ K.
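As a quick numerical check of Eq. (1.1), the sketch below evaluates $\sigma$ from its definition and then the luminosity of a black body with the solar radius and effective temperature. The numerical values of the constants and the solar radius are assumptions taken from standard tables, not from this chapter (Appendix B may list slightly different values).

```python
import math

# Physical constants in SI units (assumed standard values)
K_B = 1.380649e-23   # Boltzmann constant k [J/K]
H = 6.62607015e-34   # Planck constant h [J s]
C = 2.99792458e8     # speed of light c [m/s]

# Stefan-Boltzmann constant from the definition below Eq. (1.1)
SIGMA = 2 * math.pi**5 * K_B**4 / (15 * C**2 * H**3)  # [W m^-2 K^-4]

R_SUN = 6.957e8      # solar radius [m] (assumed nominal value)


def luminosity(radius_m, t_eff_k):
    """Luminosity [W] of a spherical black body, Eq. (1.1)."""
    return 4 * math.pi * SIGMA * radius_m**2 * t_eff_k**4


def effective_temperature(luminosity_w, radius_m):
    """Invert Eq. (1.1): effective temperature [K] from L and R."""
    return (luminosity_w / (4 * math.pi * SIGMA * radius_m**2)) ** 0.25


L_SUN = luminosity(R_SUN, 5780.0)
print(f"sigma = {SIGMA:.4e} W m^-2 K^-4")
print(f"L for R = R_sun, T_eff = 5780 K: {L_SUN:.3e} W")
```

Because $L \propto T_{\rm eff}^4$, a 1% change in the adopted effective temperature changes the computed luminosity by about 4%.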
An important consequence of Planck’s radiation laws is Wien’s displacement law.
This law states that the wavelength at which the maximum flux is radiated is entirely determined by the
temperature of the radiating body, according to $\lambda_{\max} = (2.9/T)$ mm, where the temperature $T$ is expressed in
kelvin. This implies that the Sun’s radiation peaks around 500 nm. Humans and planets radiate
from infrared to submm and mm wavelengths. Stars have effective temperatures ranging from about 3000 K
to some 50 000 K. Consequently, they radiate predominantly in the UV and at visual wavelengths.
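Wien's displacement law is easy to evaluate numerically. The following minimal sketch uses the rounded constant 2.9 mm K quoted above; the temperature of 300 K for a body at room temperature is an illustrative assumption.

```python
WIEN_CONSTANT_MM_K = 2.9  # rounded value used in the text [mm K]


def peak_wavelength_nm(temperature_k):
    """Wavelength [nm] at which a black body of the given temperature peaks."""
    if temperature_k <= 0:
        raise ValueError("temperature must be positive")
    return WIEN_CONSTANT_MM_K / temperature_k * 1.0e6  # 1 mm = 1e6 nm


examples = [
    ("Sun (5780 K)", 5780.0),
    ("hot star (50 000 K)", 50000.0),
    ("body at 300 K (assumed)", 300.0),
]
for label, temperature in examples:
    print(f"{label}: {peak_wavelength_nm(temperature):.0f} nm")
```

The Sun comes out near 500 nm, as stated above, while a body at room temperature peaks around 10 micrometres, in the infrared.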
1.1 Magnitudes and colour indices
A system of magnitudes is a logarithmic scale for the amount of radiative energy received from an
astronomical source. For two sources, the magnitude difference of source 2 relative to
source 1 is given by
$$m_2 - m_1 = -2.5 \log\frac{S_2}{S_1}, \tag{1.2}$$
where $S$ is the amount of radiative energy received per unit time and log denotes the base-10 logarithm. Hence, if
more energy is received from source 2 than from source 1, the magnitude of source 2 is lower than the
magnitude of source 1.
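Eq. (1.2) and its inverse can be written as two small functions. This is a minimal sketch; the function names are our own and the logarithm is taken to base 10.

```python
import math


def magnitude_difference(s2, s1):
    """m2 - m1 for received energies per unit time s2 and s1, Eq. (1.2)."""
    if s1 <= 0 or s2 <= 0:
        raise ValueError("received energies must be positive")
    return -2.5 * math.log10(s2 / s1)


def energy_ratio(delta_m):
    """Inverse of Eq. (1.2): S2/S1 for a magnitude difference m2 - m1."""
    return 10.0 ** (-0.4 * delta_m)


# A source 100 times brighter than another has a magnitude lower by 5.
print(magnitude_difference(100.0, 1.0))
# One magnitude fainter corresponds to a factor 10**(-0.4) in received energy.
print(energy_ratio(1.0))
```

The sign convention is the point to remember: a brighter source has a smaller magnitude.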
The concept of magnitudes goes back to the Greek astronomer Hipparchus of Nicaea. In the second
century BC, Hipparchus classified all stars visible with the naked eye and divided them into
six classes. The brightest stars were in class 1 and the faintest ones in class 6. Only in the nineteenth century was the
more mathematical definition (1.2) introduced, and it was matched as closely as possible to Hipparchus’
classification. In order to achieve this, the zero point of the magnitude scale had to be established in a
well-determined way. Eq. (1.2) was then rewritten as
$$m = (m_1 + 2.5 \log S_1) - 2.5 \log S = C - 2.5 \log S. \tag{1.3}$$
Fixing the constant $C$ thus comes down to determining the zero point of the magnitude scale.
Another important aspect is the wavelength range covered by the magnitude system. One
extreme is the so-called bolometric magnitude $m_{\rm bol}$. This is the magnitude of the
object when all wavelengths of the electromagnetic spectrum are considered. The other
extreme is a monochromatic magnitude $m_\lambda$, which is the magnitude when only one wavelength $\lambda$
is considered. In practice we use neither bolometric nor monochromatic magnitudes, but rather magnitudes
related to a selected wavelength range. Hence, this range has to be specified.
The amount of received radiative energy $S$ in definition (1.3) can be described as follows. Define $S_\lambda$
as the amount of radiative energy an observer receives at wavelength $\lambda$ through continuum radiation (see
below for a description). A fraction $\eta_\lambda$ of the radiation might be blocked from the observer by absorption
lines in the electromagnetic spectrum. The received radiation at wavelength $\lambda$ is then $S_\lambda(1-\eta_\lambda)$. Moreover,
let us assume that the sensitivity and the efficiency of the instrument used to measure the radiative energy are
described by the function $\phi(\lambda)$. This way, we can describe the amount of received radiative energy as
$$S = \int_0^\infty S_\lambda (1-\eta_\lambda)\,\phi(\lambda)\,d\lambda. \tag{1.4}$$
The standard system used for the determination of magnitudes is the UBV system designed by Johnson
and Morgan. The functions $\phi_U(\lambda)$, $\phi_B(\lambda)$, $\phi_V(\lambda)$ are determined by the wavelength filters used at the
telescope. These functions peak at wavelengths of 365, 440 and 548 nm, respectively. The
visual magnitude is then defined as
$$m_V = C_V - 2.5 \log \int_0^\infty S_\lambda (1-\eta_\lambda)\,\phi_V(\lambda)\,d\lambda, \tag{1.5}$$
where the constant $C_V$ is fixed in such a way that the visual magnitude matches as closely as possible the
magnitude classification determined by Hipparchus. The $U$ and $B$ magnitudes are defined
analogously to Eq. (1.5). The zero point of the magnitude scale was chosen in such a way that $m_U = m_B = m_V = 0$ for
the bright star Vega, which has spectral type A0V (see below for the definition).
When the magnitudes are corrected for interstellar absorption and for extinction due to the Earth’s atmosphere,
the magnitude is determined only by the amount of radiative energy a source emits per unit
time and by the distance of the source to the observer. To eliminate the differences in distance, the sources are
hypothetically placed at an equal distance from the Sun. Indeed, when the sources are at an equal distance, the
differences in magnitude are solely determined by the differences in the amount of radiative energy. This
reasoning leads to the introduction of an absolute magnitude scale. For a spherical star, the product
$S_\lambda(1-\eta_\lambda)$ in Eq. (1.4) can be linked to the outward radiation flux $F_\lambda^+$ at the stellar surface:
$$4\pi R^2 F_\lambda^+ = 4\pi d^2 S_\lambda (1-\eta_\lambda), \tag{1.6}$$
where $d$ is the distance between the star and the observer and $R$ is the stellar radius. We thus have
$$S_\lambda (1-\eta_\lambda) = \left(\frac{R}{d}\right)^2 F_\lambda^+. \tag{1.7}$$
The expression for what we call the apparent magnitude is then
$$m = C - 2.5 \log\left[\left(\frac{R}{d}\right)^2 \int_0^\infty F_\lambda^+\,\phi(\lambda)\,d\lambda\right]. \tag{1.8}$$
Hence, the apparent magnitude is determined not only by the amount of emitted energy but also by the
distance of the star to the observer.
We now introduce the absolute magnitude $M$ of a star. This is the apparent magnitude the star would
have if it were placed at a distance of 10 parsec from the Sun. The difference between the absolute
and apparent magnitude is then
$$M - m = 2.5 \log\left(\frac{10\,\mathrm{pc}}{d}\right)^2. \tag{1.9}$$
This can also be written as
$$M = m + 5 - 5 \log d_{\rm pc}, \tag{1.10}$$
where $d_{\rm pc}$ is the distance in parsec. A parsec is some 3.26 light years, which corresponds to about $3 \times 10^{13}$ km.
The difference $m - M$ is called the distance modulus.
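Relation (1.10) and its inverse can be sketched as follows. The example values for the Sun (apparent visual magnitude of about -26.74 at a distance of one astronomical unit, with 1 au taken as 4.848e-6 pc) are assumptions from standard tables, not from this chapter.

```python
import math


def absolute_magnitude(apparent_magnitude, distance_pc):
    """Absolute magnitude M from m and the distance in parsec, Eq. (1.10)."""
    if distance_pc <= 0:
        raise ValueError("distance must be positive")
    return apparent_magnitude + 5.0 - 5.0 * math.log10(distance_pc)


def distance_from_modulus(apparent_magnitude, absolute_mag):
    """Distance [pc] from the distance modulus m - M (inverse of Eq. 1.10)."""
    return 10.0 ** ((apparent_magnitude - absolute_mag + 5.0) / 5.0)


AU_IN_PC = 4.848e-6        # one astronomical unit in parsec (assumed value)
M_V_APPARENT_SUN = -26.74  # apparent visual magnitude of the Sun (assumed value)

M_V_SUN = absolute_magnitude(M_V_APPARENT_SUN, AU_IN_PC)
print(f"Absolute visual magnitude of the Sun: {M_V_SUN:.2f}")
```

By construction, a star at exactly 10 pc has $M = m$, which is a convenient sanity check for any implementation.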
Finally, we introduce the term colour index. Colour indices of stars are differences between the magnitudes
of the same star at different wavelengths. With the three magnitudes $U$, $B$, $V$, two commonly used
colour indices are constructed: $U - B$ and $B - V$.
If we apply relation (1.10) to two different magnitudes of the same star and subtract term by term, we
find
$$M_2 - M_1 = m_2 - m_1. \tag{1.11}$$
The difference in apparent magnitude is a quantity that is easily measured. The colour indices thus are a
measure of an intrinsic characteristic of the star. The index B − V , e.g., is a good measure for the effective
temperature of the star.
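To illustrate how a colour index traces temperature, the sketch below evaluates the integral of Eq. (1.4) numerically for black bodies and forms an instrumental $B - V$ colour. Several ingredients are assumptions made purely for illustration: the Planck function stands in for $S_\lambda$ (the factor $(R/d)^2$ cancels in a colour), line blocking is ignored ($\eta_\lambda = 0$), the filters are modelled as Gaussians with a hypothetical width of 90 nm centred on the wavelengths quoted above, and the zero points are set to $C_B = C_V$, so the numbers are not on the Vega-based Johnson scale.

```python
import math

H = 6.62607015e-34   # Planck constant [J s]
C = 2.99792458e8     # speed of light [m/s]
K_B = 1.380649e-23   # Boltzmann constant [J/K]


def planck_lambda(wavelength_m, temperature_k):
    """Black-body spectral radiance B_lambda [W m^-3 sr^-1]."""
    x = H * C / (wavelength_m * K_B * temperature_k)
    return 2 * H * C**2 / wavelength_m**5 / math.expm1(x)


def gaussian_filter(wavelength_m, centre_m, fwhm_m):
    """Hypothetical Gaussian passband with unit peak transmission."""
    sigma = fwhm_m / (2 * math.sqrt(2 * math.log(2)))
    return math.exp(-0.5 * ((wavelength_m - centre_m) / sigma) ** 2)


def received_energy(temperature_k, centre_nm, fwhm_nm=90.0, eta=0.0, n=2000):
    """Trapezoidal estimate of Eq. (1.4), integrated from 200 to 1000 nm."""
    lo, hi = 200e-9, 1000e-9
    step = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        lam = lo + i * step
        weight = 0.5 if i in (0, n) else 1.0
        passband = gaussian_filter(lam, centre_nm * 1e-9, fwhm_nm * 1e-9)
        total += weight * planck_lambda(lam, temperature_k) * (1 - eta) * passband
    return total * step


def instrumental_colour(temperature_k):
    """m_B - m_V with equal zero points (not calibrated on Vega)."""
    s_b = received_energy(temperature_k, 440.0)
    s_v = received_energy(temperature_k, 548.0)
    return -2.5 * math.log10(s_b / s_v)


for temperature in (3500.0, 5780.0, 20000.0):
    print(f"T = {temperature:7.0f} K: instrumental B-V = {instrumental_colour(temperature):+.2f}")
```

The colour decreases monotonically with increasing temperature, which is why $B - V$ can serve as a temperature indicator once the zero points are calibrated.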
1.2 Spectral types and luminosity classes
In Figure 1.1 the stellar flux is shown as a function of the wavelength $\lambda$ for different types of stars. They
are ordered from cool at the top to hot at the bottom. The hottest stars are blue and their spectra show
absorption lines of ionized atoms. Cool stars, on the other hand, radiate particularly strongly at red wavelengths
and show absorption lines of neutral atoms and molecules.
1.2.1 The formation of spectral lines in the stellar spectrum
Spectra of astronomical objects show continua with superimposed spectral lines. The latter can occur in absorption
as well as in emission relative to the local continuum. We now briefly describe the formation of
continua and spectral lines. For a more extensive description we refer to the courses Radiative Processes
in Astronomy and Stellar Atmospheres, taught in the Bachelor of Physics and the Master of Astronomy &
Astrophysics at KU Leuven, respectively.
Spectral lines
Spectral lines are the result of discrete energy transitions like the jumps of an electron between bound levels
in an atom or an ion. These are described as bound – bound transitions or bb-transitions. Excitation to a
higher energy level can on the one hand be caused by the absorption of kinetic energy (collisional excitation)
or on the other hand by the absorption of a photon (radiative excitation). Analogously the de-excitation to
a lower energy level can be caused by a collision (collisional de-excitation) or by emission of a photon
(radiative de-excitation).
The energy exchanged in a bb-transition always equals the energy difference $h\nu = E_{mn}$, where $E_{mn} = E_m - E_n$ is the difference in energy between the levels $m$ and $n$ ($m > n$). The
photons involved must therefore have the specific wavelength $\lambda = hc/E_{mn}$.
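As an example of $\lambda = hc/E_{mn}$, the sketch below computes line wavelengths for hydrogen. The level energies $E_n = -13.6057\,\mathrm{eV}/n^2$ come from the Bohr model and are an assumption introduced here for illustration; they are not derived in this chapter.

```python
H = 6.62607015e-34    # Planck constant [J s]
C = 2.99792458e8      # speed of light [m/s]
EV = 1.602176634e-19  # one electronvolt [J]
RYDBERG_EV = 13.6057  # hydrogen ionization energy [eV] (assumed Bohr-model value)


def level_energy_ev(n):
    """Energy [eV] of hydrogen level n in the Bohr model (assumption)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return -RYDBERG_EV / n**2


def line_wavelength_nm(m, n):
    """Wavelength [nm] of the bb-transition between levels m > n."""
    if m <= n:
        raise ValueError("upper level m must exceed lower level n")
    e_mn = (level_energy_ev(m) - level_energy_ev(n)) * EV  # E_mn in joule
    return H * C / e_mn * 1e9


print(f"H-alpha (3 -> 2): {line_wavelength_nm(3, 2):.1f} nm")
print(f"Lyman-alpha (2 -> 1): {line_wavelength_nm(2, 1):.1f} nm")
```

The first transition lies in the red part of the visual spectrum, the second in the far UV, which shows how strongly the line wavelength depends on the lower level involved.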
Spectral lines are always linked to discrete bb-processes. However, emission lines are not always the
result of radiative de-excitation while absorption lines are not always the result of radiative excitation. The
origin of a spectral line depends on the radiative transport throughout the medium. In general, spectral
lines are the result of extra bb-processes that occur at specific line wavelengths in the medium, next to the
processes that define the continuum spectrum at that wavelength.
Figure 1.1: Optical stellar spectra of main-sequence stars with approximately the same chemical composition
but with increasing effective temperature from top to bottom. (From Sparke & Gallagher, 2000,
Cambridge University Press)
Continua
Continua are the result of non-discrete processes in which photons are absorbed or emitted. First, there are
bound-free transitions (or bf-transitions) of atoms and ions. For example, an electron is released from
a bound state n through the absorption of a photon with an energy higher than or equal to the ionization
energy $E_{\infty n} = E_\infty - E_n$ of that level. This is called radiative ionization. Conversely, the capture of a free
electron can lead to a bound state, in which case a photon is emitted with an energy higher than or
equal to $E_{\infty n}$. This is called radiative recombination. Ionization and recombination can also occur via
absorption or emission of kinetic energy without any involvement of photons. These processes are called collisional
ionization and collisional recombination. The free states of the electron above the ionization limit are
not discrete because the free electron can have an arbitrarily large kinetic energy $m_e v^2/2$, in other
words $h\nu = E_{\infty n} + m_e v^2/2$.
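The energy balance for radiative ionization can be made concrete with a short sketch. It assumes hydrogen with ionization energy 13.598 eV / n^2 from level n (an assumption based on the simple hydrogen level formula, not stated in the text) and a non-relativistic electron; all names are illustrative.

```python
import math

# Energy balance for radiative ionization: h*nu = E_inf,n + m_e * v**2 / 2.
# Assumption: hydrogen, with ionization energy E_ion / n**2 from level n.

HC_EV_NM = 1239.841984      # h*c in eV nm
E_ION_H = 13.598            # ionization energy from the ground state, in eV
EV_TO_J = 1.602176634e-19   # joule per electronvolt
M_E = 9.1093837e-31         # electron mass in kg


def ionization_energy_ev(n: int) -> float:
    """Energy needed to ionize hydrogen from level n, in eV."""
    return E_ION_H / n**2


def threshold_wavelength_nm(n: int) -> float:
    """Longest photon wavelength that can still ionize from level n."""
    return HC_EV_NM / ionization_energy_ev(n)


def electron_speed_m_s(wavelength_nm: float, n: int):
    """Speed of the released electron, or None if the photon cannot ionize."""
    excess_ev = HC_EV_NM / wavelength_nm - ionization_energy_ev(n)
    if excess_ev < 0:
        return None                      # photon energy below the threshold
    return math.sqrt(2.0 * excess_ev * EV_TO_J / M_E)


if __name__ == "__main__":
    print(f"Threshold from n=2: {threshold_wavelength_nm(2):.1f} nm")
    print(f"Electron speed for a 300 nm photon: {electron_speed_m_s(300.0, 2):.3e} m/s")
```

Any photon with a wavelength shorter than the threshold can ionize, which is why bound-free processes produce a continuum rather than a line.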
Bound-bound excitation and de-excitation, as well as bound-free ionization and recombination, can
occur through the absorption or emission of radiative energy from photons as well as through the absorption
or emission of kinetic energy through the collision of particles.
Furthermore, there are also free-free transitions (or ff-transitions), also called Bremsstrahlung.
This is the emission or absorption of photons as a result of the acceleration or deceleration of a charged
particle in a Coulomb field, for example during a collision between an ion and an electron.
Photon creation, photon destruction, photon scattering
We can group the bb-processes into three pairs:
1. collisional excitation followed by radiative de-excitation. This results in the creation of a photon. In
this case kinetic energy is transformed into radiation;
2. radiative excitation followed by collisional de-excitation. This results in the destruction of a photon.
In this case radiation is transformed into kinetic energy;
3. radiative excitation followed by radiative de-excitation. This is called scattering of a photon. In this
case only a repartition of radiation occurs.
When photon scattering occurs, at least the direction of the scattered photon differs from that of the
incoming photon. For the lower atomic levels, photon scattering is an important process because the decay time for
radiative de-excitation is very short at these low levels (typically $10^{-10}$ to $10^{-9}$ seconds). When a line
transition is caused by scattering from the ground state, it is called a resonance line, and the scattering
process is called resonant scattering. Resonance lines represent the lowest possible energy transitions from
the ground state. Consequently, the transitions involved have a short lifetime, and therefore occur very
frequently at high as well as at low densities (there is always a huge reservoir of electrons in the
ground state waiting for a photon to arrive so that they can make a line transition based on photon scattering).
Figure 1.2: Energy levels of the hydrogen atom. The bound levels approach the ionisation limit at
13.598 eV. For each of the first four hydrogen levels the bound-bound transitions are designated by vertical lines with
the name and wavelength of the corresponding spectral line. The limit of each series is also indicated. The
Lyman lines are found in the UV, the Balmer lines in the visual and the Paschen and Brackett lines in the
infrared. (From Rob Rutten, 2003, Lecture Notes, Utrecht University, NL)
1.2.2 Spectral types
Since the birth of spectroscopy in the second half of the 19th century, astronomers have classified the stars
in classes according to the strength of their Balmer lines. These are absorption lines of neutral hydrogen
(H I, see Figure 1.2) [1]. In this way the A stars were defined as the ones with the strongest Balmer lines, the B stars
as those with the second-strongest, and so on. At the end of the 19th century the Harvard
astronomer Antonia Maury realized that the strength of all spectral lines, not only the Balmer lines, followed
a neat sequence when she ordered the classes as O B A F G K M. This is illustrated in Figure 1.3.
On this basis the first large-scale classification of stellar spectra was carried out at Harvard College
Observatory. This was possible thanks to the financial contribution of Mrs. Draper, who wanted a beautiful,
lasting memorial to her deceased husband. Henry Draper was the first person ever to photograph a stellar
spectrum. The classification was carried out between 1886 and 1924 under the leadership of Annie
Cannon [2]. Almost 240 000 stars were classified in the Henry Draper Catalogue (in exchange for 250 000
dollars from Mrs. Draper). Nowadays, there are 400 000 classified stars if we include those of the catalogue’s
supplement.
[1] In astronomy we use a dedicated notation for ions and for the lines they create in a spectrum. For example, when we speak of
“iron four” and write Fe IV, we mean the spectrum of the Fe3+ ion. H I therefore means the spectrum of neutral hydrogen.
Figure 1.3: The spectral sequence of Harvard. These example spectra are printed in such a way that the
absorption lines are dark on a bright background of the continuum radiation of the star. The wavelengths are,
as is common in astronomy, expressed in Ångström (1 Å = 0.1 nm = $10^{-8}$ cm). The brightest parts of the
spectrum shift from the early-type stars (O and B) to the late-type stars (GKM). (From Rob Rutten, 2003,
Lecture Notes, Utrecht University, NL)
Today we know that the sequence of Maury is one of descending effective temperature and that the
line strength in the spectra is determined by the ionization law of Saha and the Boltzmann equation. This
very important interpretation was made by Cecilia Payne-Gaposchkin (1925) [3] in her doctoral study. She
demonstrated that stars are mainly made of hydrogen (∼ 70%), helium (∼ 28%) and only
∼ 2% of heavier elements (called metals for short) [4].
[2] Note that women were not allowed to study astronomy, except as a hobby. This was the case until the classification of tens
of thousands of stars had to be carried out. The idea and plan to go ahead with this classification were not only the work of
women; the implementation also required such patient work, at a very low wage, that Edward Pickering, the director
of the Harvard College Observatory, did not find men who were willing to do this task. This is how women entered professional astronomy
in the first half of the 20th century.
[3] Cecilia studied astronomy at Cambridge, UK, with Sir Arthur Eddington. Eddington was however convinced that women were
not apt to do research in astronomy. Intrigued as she was by astronomy, and displeased with Eddington’s attitude, Cecilia left for
Harvard, USA, where she was welcomed to continue with her research work. She did this brilliantly.
Each of the seven classes was subsequently divided into ten subclasses, from 0 for the hottest to 9
for the coolest stars in a class. This scheme is still in use today (so Mrs. Draper should be pleased). The
Sun is a star of spectral type G2. Recently, an additional class L was added for very cool stars detected in
infrared observations. The first classes of hot stars (OBA) are often called early-type stars, while the
classes at the end of the classification sequence (GKML) are called late-type stars.
The effective temperature of O stars is higher than 30 000 K. Their strongest spectral lines are those of
singly ionized helium (He II) and doubly ionized carbon (C III). Their Balmer lines are weak
because almost all the hydrogen in the photosphere is ionized at such high temperatures. The spectra
of B stars do have strong Balmer lines and also strong lines of neutral helium (He I). Their temperature ranges
from 12 000 K at B9 to 25 000 K at B0.
The A stars have temperatures around 10 000 K and are cool enough to keep the hydrogen in
their photosphere neutral. Next to very strong Balmer lines they also have many lines of singly ionized metals, like
calcium. Also prominent in their spectra is the so-called Balmer jump at 365 nm (3646 Å, see Figure 1.2).
F stars have weaker Balmer lines than A stars. In their spectra lines of neutral metals are visible. In G
stars like the Sun, the singly ionized calcium lines (Ca II) at 3934 and 3968 Å are prominent. These were discovered
by Fraunhofer in 1815. He labelled all lines he could distinguish in the solar spectrum from A to K, going from
red to blue wavelengths. To this day, the strongest calcium lines are therefore still called the H and K lines.
The D line of neutral sodium (Na I) is also prominent in G stars.
In the spectra of K stars we mainly distinguish lines of neutral metals and of molecules like TiO
(titanium oxide). M stars are mostly cooler than 4 000 K at their surface. Their spectra therefore show deep
absorption bands of TiO and VO (vanadium oxide) as well as lines of neutral metals. In the even cooler L
stars, the sodium D lines are prominent, together with broad molecular bands.
1.2.3 Luminosity classes
The lines in a stellar spectrum do not only give us information about the effective temperature and chemical
composition, but also about the value of the surface gravity. This quantity refers to the gravitational acceleration
at the surface of the star, namely $g \equiv GM/R^2$, with M the stellar mass [5] and R the stellar radius. In astronomy, quantities
are mostly expressed in the cgs system. This is due to historical reasons, but also because this gives “handy
numbers” for some important observational parameters characterizing stars. We refer to Appendix B for the
values of physical and astronomical constants, in this system as well as in the SI system. This gives values
for log g between 1 and 5 for most stars, and log g from 6 to 8 for compact stellar remnants.
[4] Although in other branches of science, carbon, oxygen, nitrogen, iron, . . . are not grouped under the single denominator “metals”,
it is very meaningful to do so in astronomy. This is because hydrogen and helium (and a tiny bit of lithium) were formed
within half an hour after the Big Bang, while all other elements were only formed afterwards through nucleosynthesis in stellar
interiors.
[5] For the mass and the absolute magnitude of a star the same symbol is used. It is always clear from the context which quantity
is meant.
Figure 1.4: Optical stellar spectra of three stars of spectral type A, but belonging to a different luminosity
class. (From Sparke & Gallagher, 2000, Cambridge University Press)
In Figure 1.4 the spectra of three A stars are shown. The upper star is a dwarf, like the Sun, the middle
star is a giant and the lower one is a supergiant. Their log g decreases from top to bottom because the radius
increases in that direction. The dwarf star is therefore much denser than the giant and the supergiant, and the atoms in
its atmosphere are more crowded. This has an effect on the spectral lines
because they are subject to Stark pressure broadening: the width of a spectral
line at a well-defined temperature is mainly a function of the pressure experienced by the atoms responsible
for the line.
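The decrease of log g from dwarf to supergiant can be checked numerically with the definition of the surface gravity in cgs units. The sketch below is only illustrative: the constants are rounded standard values, and the masses and radii chosen for the three example stars are hypothetical, not taken from Figure 1.4.

```python
import math

# Surface gravity g = G*M/R**2 in cgs units, returned as log10(g).
G_CGS = 6.674e-8     # gravitational constant in cm^3 g^-1 s^-2
M_SUN_G = 1.989e33   # solar mass in g
R_SUN_CM = 6.957e10  # solar radius in cm


def log_g(mass_msun: float, radius_rsun: float) -> float:
    """log10 of the surface gravity (cgs) for a mass and radius in solar units."""
    g = G_CGS * mass_msun * M_SUN_G / (radius_rsun * R_SUN_CM) ** 2
    return math.log10(g)


if __name__ == "__main__":
    # Hypothetical (mass, radius) pairs in solar units, for illustration only.
    examples = {
        "Sun": (1.0, 1.0),
        "dwarf": (2.0, 2.0),
        "giant": (3.0, 15.0),
        "supergiant": (10.0, 100.0),
    }
    for name, (mass, radius) in examples.items():
        print(f"{name:>10s}: log g = {log_g(mass, radius):.2f}")
```

Because g scales as 1/R^2, a hundredfold increase in radius lowers log g by 4, which is far more than a modest increase in mass can compensate.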
Therefore stars are not only divided into classes according to their temperature, but also according to their
gravity. Most stars are dwarf stars like the Sun. Calling them dwarfs is somewhat misleading because the
hottest dwarfs are much bigger than the Sun, with a radius of about 10 R⊙. Giants and supergiants are in
any case much bigger, with radii of 10 to 100 and 100 to 1000 times
the radius of the Sun, respectively. According to Eq. (1.1), they have a much higher luminosity than dwarf stars
with the same temperature. Moreover, there are also white dwarfs, which differ very much from
the Sun. Actually, these are no longer stars, but compact stellar remnants left over at the end of stellar
evolution. This is also the case for neutron stars. Such remnants are so dense that their surface gravity is very high:
log g lies between 7 and 8 (in cgs) for white dwarfs and is far higher still for neutron stars. This is because their radius
is very small, typically 0.01 R⊙ (i.e. about
the Earth’s radius) for a white dwarf and only a few tens of kilometers for a neutron star.
In practice, stars are classified in luminosity classes: VII for white dwarfs, VI for subdwarfs, V for
dwarfs, IV for subgiants, III for normal giants, II for bright giants, and I for supergiants. The supergiants
are further subdivided into Ia and Ib according to their luminosity (Ia for the most luminous, Ib for the less luminous). Often
lowercase letters are added to the luminosity class, based on the appearance of specific characteristics of the
spectral lines. In this way stars whose Balmer lines are observed in emission are labelled with a lowercase
“e” after the luminosity class, hot stars with N III and He II lines in emission get an “f”, etc.
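The notation described above (spectral class, subclass, luminosity class and optional lowercase suffix) can be unpacked mechanically. The following is a minimal sketch, assuming a simplified format such as "G2V" or "B0Ia" with a single-digit subclass; real catalogue entries are more varied, and the function and dictionary names are hypothetical.

```python
import re

# Luminosity classes as listed in the text.
LUMINOSITY_CLASSES = {
    "VII": "white dwarf",
    "VI": "subdwarf",
    "V": "dwarf",
    "IV": "subgiant",
    "III": "normal giant",
    "II": "bright giant",
    "Ia": "supergiant (more luminous)",
    "Ib": "supergiant (less luminous)",
    "I": "supergiant",
}

# Longer Roman numerals are listed first so that, e.g., "III" is not read as "II".
_PATTERN = re.compile(r"([OBAFGKML])([0-9])(VII|VI|IV|V|III|II|Ia|Ib|I)?([a-z]*)")


def parse_spectral_type(text: str) -> dict:
    """Split a simplified spectral type such as 'G2V' into its components."""
    match = _PATTERN.fullmatch(text.replace(" ", ""))
    if match is None:
        raise ValueError(f"unrecognised spectral type: {text!r}")
    letter, digit, lum, suffix = match.groups()
    return {
        "class": letter,
        "subclass": int(digit),                      # 0 = hottest, 9 = coolest
        "luminosity_class": lum,                     # None if not given
        "description": LUMINOSITY_CLASSES.get(lum),  # None if not given
        "suffix": suffix,                            # e.g. "e" for emission
    }


if __name__ == "__main__":
    for spectral_type in ("G2V", "B0Ia", "B2Ve", "A3III"):
        print(spectral_type, parse_spectral_type(spectral_type))
```

For the Sun, "G2V" is split into class G, subclass 2 and luminosity class V (dwarf).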
1.2.4 Stellar atmosphere models
In practice, the effective temperature and gravity of a star are estimated by comparing its spectrum with
those of other stars for which these values are already known. So-called model atmospheres are also used
here. These are computer models calculating how the radiation is transported through a stellar atmosphere
with a given effective temperature, log g and chemical composition. Model atmospheres are then calibrated
on stars for which high-resolution spectra with a wide wavelength coverage are available and of which the
effective temperature, gravity, chemical composition, interstellar absorption and distance are all well known
(so-called standard stars or calibration stars).
Computed model atmospheres also make it possible to determine the surface abundances of elements of a star,
based on high-resolution high signal-to-noise spectra. The purpose of such research is to determine the
amount of heavy elements in the stellar atmosphere and to confront that with the outcome of evolutionary
models. The abundances are always expressed in relation to those of the Sun. Abundance determination
is of major importance for the interpretation of stellar spectra of evolved stars, as treated in Part III of this
course. The modelling and interpretation of measured stellar spectra is discussed in detail in the KU Leuven
Bachelor and Master courses Radiative Processes in Astronomy and Stellar Atmospheres.
1.3 The Hertzsprung-Russell diagram
The Hertzsprung-Russell diagram (HR diagram, named after the American astronomer Henry Russell and
the Danish astronomer Ejnar Hertzsprung) gives an important statistical relation for stars by means of a
diagram. The diagram represents the evolution of stars and is thus a basic diagnostic for the discussion of
stellar evolution theory. A very schematic representation of the HR diagram according to the luminosity
classes is shown in Figure 1.5, while Figure 1.6 shows the connection between the effective temperature and
spectral type on the x-axis, with several “well-known” stars named in the graph.
Russell was the first to study the relation between the spectral type and the absolute magnitude $M_V$
of stars. To this end, he constructed a diagram in which he plotted the absolute magnitude versus the spectral
type. Hertzsprung, in turn, noticed a difference between dwarf stars and giant stars for late
spectral types. Often the colour index B − V is put on the abscissa instead of the spectral type or effective
temperature. This is then referred to as the colour-luminosity diagram (CD) or colour-magnitude diagram
(CMD). The use of the colour index has the advantage that this observable is a continuous variable, which
is not the case for the spectral type, which is a so-called categorical (discrete) data type. In this way, one
can incorporate much fainter stars in a C(M)D, because these can be observed photometrically and not
necessarily spectroscopically.
Figure 1.5: Schematic representation of the HR diagram, where the luminosity (with respect to the solar value) is
represented as a function of the effective temperature. The positions of the dwarfs on the main sequence, the red
giants, the supergiants and the white dwarfs are indicated. The Sun is a main-sequence dwarf star of spectral
type G2, with an effective temperature of 5 780 K (see Figure 1.6 for its position with respect to other bright
stars; source: tim-thompson.com)
In the schematic HR diagram shown in Figure 1.5 we notice that the stars are not scattered in an arbitrary
way. Certain combinations of luminosity and effective temperature occur much more frequently than others.
The vast majority of stars is situated in one of the following groups: the main sequence, the (red) giants and
supergiants groups, or the white dwarfs group. Most of the stars belong to the main sequence, going from
stars with negative absolute visual magnitude and low colour index (hot blue stars) to stars with a high
absolute visual magnitude and high colour index (red dwarfs). The Sun is a rather ordinary main-sequence
star of spectral type G2V with an absolute visual magnitude MV = 4.79 and effective temperature of
5 780 K.
Since the absolute magnitude of a star is only known if its visual magnitude and distance are known,
the determination of precise distances is important to deduce the positions of stars in the HR diagram. In the
current era, the satellite mission Gaia of the European Space Agency ESA is determining precise distances
of about a billion stars in our Milky Way. The first and second data releases of the mission [6], DR1 and
DR2, took place on 14 September 2016 and 25 April 2018, respectively. The DR1 data already allowed
the construction of a very detailed observational HR diagram, albeit for the close vicinity of the Sun. All stars
in Gaia DR1 with a relative parallax precision better than 20% are shown in the HR diagram in
Figure 1.7. A particular distribution of the stars is again obvious. The main sequence and the red giant groups
are prominently present. Gaia was not yet able to measure the distances of the large number
of distant white dwarfs accurately enough for them to be part of DR1. That is why this group of stars is not densely
populated in the observational Gaia DR1 HR diagram. This is also the case for massive OB-type stars of all
luminosity classes. These are far away from the Sun and their distance could not be determined precisely
enough to be part of the DR1 Gaia HR diagram.
[6] http://gea.esac.esa.int/archive/
Figure 1.6: HR diagram, where the luminosity (with respect to the solar value) is represented as a function of the
effective temperature and corresponding spectral type. The positions of various bright stars are indicated.
(Source: Wikipedia)
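How a parallax turns an apparent magnitude into an absolute one can be sketched in a few lines. The code below assumes the standard distance modulus M = m − 5 log10(d / 10 pc), which is not written out in this passage, a naive distance d = 1/parallax, and no interstellar extinction; the 20% relative precision cut mirrors the selection described for the Gaia DR1 diagram. Function names are illustrative.

```python
import math


def distance_pc(parallax_mas: float) -> float:
    """Naive distance in parsec from a parallax in milliarcseconds."""
    return 1000.0 / parallax_mas


def absolute_magnitude(apparent_mag: float, parallax_mas: float) -> float:
    """Absolute magnitude from the distance modulus, ignoring extinction."""
    return apparent_mag - 5.0 * math.log10(distance_pc(parallax_mas) / 10.0)


def passes_precision_cut(parallax_mas: float, error_mas: float,
                         max_relative_error: float = 0.20) -> bool:
    """True if the relative parallax uncertainty is better than the limit."""
    return parallax_mas > 0 and error_mas / parallax_mas < max_relative_error


if __name__ == "__main__":
    # Hypothetical star: apparent magnitude 10.0, parallax 10 +/- 1 mas.
    m, plx, err = 10.0, 10.0, 1.0
    if passes_precision_cut(plx, err):
        print(f"d = {distance_pc(plx):.0f} pc, M = {absolute_magnitude(m, plx):.2f}")
```

A star placed at exactly 10 pc (parallax 100 mas) has equal apparent and absolute magnitudes, which is a convenient sanity check.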
Figures 1.8 and 1.9 show the latest results from DR2 for 32 open and 14 globular clusters, respectively.
We come back to the issue of stellar ageing and its metallicity dependence from clusters in Chapter 10.
Figure 1.7: Observational CMD for the solar neighbourhood constructed on the basis of measurements by
the ESA satellite Gaia, from DR1 (Gaia Collaboration, Brown et al., A&A, Vol. 595, id.A2, 23 pp., 2016). All stars
whose parallax was measured with a relative precision better than 20% are shown. The contours indicate
the boundaries of the regions containing 10, 30, and 50% of all the stars shown.
Subsequent Gaia data releases with even more precise parallaxes, including as well the astrometric solutions
for binaries, are foreseen for 2021 and beyond 2022.
The version of the HR diagram used for studies of stellar evolution theory has the logarithm of the
luminosity against the effective temperature of the star and includes so-called evolutionary tracks. This
diagram is shown on the cover of these lecture notes. It is the diagram we shall use throughout this course.
Figure 1.8: Observational CMD of 32 open clusters constructed on the basis of measurements by the ESA
satellite Gaia, from DR2 (Gaia Collaboration, Babusiaux et al., A&A, Vol. 616, id.A10, 29 pp., 2018). The
colour coding is according to age.
1.4 Stars in our Milky Way
Our Milky Way is a spiral galaxy consisting of a central bulge with a radius of some kpc and an extensive
flat disk, surrounded by a halo of star clusters (Figure 1.10). The Sun is situated in the disk at some 8 kpc
away from the bulge. Around the Sun, approximately one star per 10 pc$^3$ is found, so the interstellar space is
rather empty. The interstellar medium consists mostly of gas and dust, efficiently absorbing and re-emitting
stellar radiation. This radiation shows us that the interstellar medium is mainly composed of H−, H and H2.
More complex molecules, like O, HCN and CS also occur.
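To get a feeling for how empty the solar neighbourhood is, the stellar density quoted above can be turned into a typical distance between stars. This is only an order-of-magnitude sketch: it assumes stars are spread uniformly and uses rounded values for the parsec and the solar radius.

```python
PC_IN_M = 3.0857e16      # one parsec in metres
R_SUN_M = 6.957e8        # solar radius in metres


def mean_separation_pc(stars_per_pc3: float) -> float:
    """Typical spacing between stars for a uniform number density."""
    return stars_per_pc3 ** (-1.0 / 3.0)


if __name__ == "__main__":
    density = 1.0 / 10.0                     # one star per 10 cubic parsec
    spacing = mean_separation_pc(density)
    ratio = spacing * PC_IN_M / (2.0 * R_SUN_M)
    print(f"typical spacing: {spacing:.2f} pc")
    print(f"spacing in solar diameters: {ratio:.1e}")
```

The spacing comes out at roughly two parsec, tens of millions of times the diameter of a Sun-like star.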
The disk of the Milky Way is composed of a thin part (300 to 400 pc) where dark gas and dust clouds
are found and where new stars are being formed all the time. In addition, there is a thick disk (1000
to 1500 pc) where star formation took place earlier in the history of the Milky Way. The stars which were
formed there clearly contain fewer metals. The bulge consists of a dense nucleus of massive stars and a black
hole with a mass of a few million solar masses.
The disk as well as the bulge of the Milky Way rotate around its centre. Stars in the disk move on
circular orbits with a velocity of some 200 km s$^{-1}$, so it takes the Sun about 250 million years to complete one
orbit. The stars in the bulge move randomly at velocities of some tens of km s$^{-1}$. The stars in
the metal-poor globular clusters do not undergo a global rotational movement around the bulge in the
centre of the galaxy. They move randomly and their orbits are often very eccentric so that they are mainly
situated far above the disk and only once in a while pass through it. At that moment, they lose their gas,
which remains captured in the disk.
Figure 1.9: Observational CMD of 14 globular clusters constructed on the basis of measurements by the
ESA satellite Gaia, from DR2 (Gaia Collaboration, Babusiaux et al., A&A, Vol. 616, id.A10, 29
pp., 2018). The colour coding is according to metallicity.
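The orbital period of the Sun quoted above follows directly from its distance to the Galactic centre and its orbital velocity. The sketch below assumes a perfectly circular orbit at constant speed and uses rounded conversion factors.

```python
import math

KM_PER_KPC = 3.0857e16       # kilometres in one kiloparsec
SECONDS_PER_YEAR = 3.156e7   # seconds in one year (rounded)


def orbital_period_myr(radius_kpc: float, speed_km_s: float) -> float:
    """Period of a circular orbit, in millions of years."""
    circumference_km = 2.0 * math.pi * radius_kpc * KM_PER_KPC
    period_s = circumference_km / speed_km_s
    return period_s / SECONDS_PER_YEAR / 1.0e6


if __name__ == "__main__":
    # Values from the text: 8 kpc from the centre, 200 km/s orbital velocity.
    print(f"Orbital period of the Sun: {orbital_period_myr(8.0, 200.0):.0f} Myr")
```

With the values from the text the result is close to the quoted 250 million years.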
In general, the stars are divided into different populations: on the one hand according to their metallicity
and on the other hand according to their location and movement in the galaxy (see Figure 1.10). Population I
stars have a relatively high metallicity and are concentrated in or near the galactic plane. They move
according to the rotation of the galaxy. Population II stars, on the other hand, have an extremely low metallicity.
They are found far from the galactic plane and do not follow the rotation of the disk of the Milky
Way. The interpretation of this division in populations is that the population II stars were formed before
the matter in the galaxy had collapsed into a disk, while the population I stars were born later on in the disk,
a process that continues up to the present day and into the future. Given that there is a thick and a thin disk, the classification of the stars
formed in the thick disk in terms of only two populations is not that simple. Moreover, a
third category of Population III stars has been introduced to denote stars born straight after the Big Bang, having no metals at
all. We will discuss these populations again near the end of the course when we recapitulate the chemical
enrichment in galaxies.
The total mass in the Milky Way Galaxy is about $60 \times 10^9$ M⊙ and the luminosity is about $20 \times 10^9$ L⊙.
The mass in the halo is only $10^9$ M⊙. From the motion of the clusters and stars far away from our Milky
Way, we can deduce that its total mass must be much larger than the one we find on the basis of the observed
stellar population. This leads to the introduction of the concept of dark matter in galaxies. One assumes,
without thorough justification, that this dark matter must mainly be situated in the dark halo of
galaxies. The search for the missing dark matter is an active research topic in modern astronomy.
Figure 1.10: Overview of the components of our Milky Way. More explanation: see text. (From Sparke &
Gallagher, 2000, Cambridge University Press)
For a detailed study of galaxies and their structure, we refer to the KU Leuven master course Galaxies
and Cosmology.
1.5 Galaxies in the Universe
Galaxies are perceived as little light clouds in the sky. They were discovered in the 1920s as “nebulae”
but as telescopes improved Edwin Hubble concluded that these nebulae were composed of
individual stars. The diameter of a galaxy typically is a few thousand light years. Each galaxy contains
between a million and 1012 stars. Almost all light that we receive from the galaxies is emitted by their stars.
Galaxies also contain gas and dust clouds.
Figure 1.11: Classification scheme of galaxies. Explanation: see text. (From Sparke & Gallagher, 2000,
Cambridge University Press)
Galaxies are classified according to their shape in optical light. In Figure 1.11 the famous Hubble
classification scheme is shown (adapted version). Although large galaxies emit most of the light, the small
dwarf galaxies are dominantly present in the Universe.
Elliptical galaxies E are “flat” and show little or no structure. They contain only a small amount of cool
gas and therefore no young blue stars can be formed. Their brightest star population is mainly made up of
red giants and asymptotic giant branch stars (see further in the course for a definition). They mainly occur
in large clusters of galaxies and the largest of them, the cD systems, are found in the core of the
cluster. Their stars do not move in an organized way, like the rotational motion in spiral galaxies, but
move randomly through the galaxy. In less bright elliptical galaxies, on the contrary, the stars follow a
common rotational movement. The faintest of these galaxies are divided into two groups: dwarf ellipticals dE
and dwarf spheroidals dSph. Lenticular systems have, aside from a central bulge, also a rotating disk. They
are indicated as S0 and form the transition between elliptical galaxies and spiral galaxies. S0s have no
gas and dust, but a thin and fast-rotating disk like spiral galaxies.
Spiral galaxies are easily observable thanks to their radiation at blue wavelengths coming from the
spiral arms where O and B stars are located between the gas and the dust. The spiral arms are areas where
efficient star formation is happening through the collapse of molecular clouds of H and H2. Approximately
half of the lenticular systems have a central bar. These form the sequence of barred systems SBa, . . . ,
SBd parallel to those without a bar. The latter are classified from Sa to Sd depending on whether the
central bulge is more or less pronounced in comparison with the fast-rotating disk. Our Milky Way is of
type Sc. The Sm and SBm systems, finally, are called Magellanic systems after their prototype, the
Large Magellanic Cloud (LMC). The Magellanic Cloud itself rotates with an average velocity of 80 km s$^{-1}$,
which is three times slower than our own Milky Way.
Furthermore, there are also small blue galaxies without any structure. The smallest among those are
called dwarf irregular galaxies. They differ from dwarf spheroidals because they contain young hot stars and
gas. In that sense, dwarf spheroidals are dwarf irregulars which have already lost their gas. Another type are the
so-called “starburst” galaxies, where stars have been formed recently after gas was expelled by supernova
explosions. The gas is often drawn towards the centre of the galaxy, where subsequently a lot of young stars are
born, packed within a small distance of each other (a few parsecs). In this way multiple episodes of efficient star
formation occur. This class also includes the interacting or colliding galaxies, where merging can
lead to new star formation areas.
It is clear that the galaxies are no isolated islands but influence each other’s evolution. The
Local Group, including our own galaxy, consists of some 40 galaxies centred around our Milky Way and
Andromeda (the closest massive neighbour), at a distance of a megaparsec. Like Andromeda, our Milky
Way has more than ten known satellites. Aside from these, only one small elliptical system and a
few randomly moving systems occur in the Local Group. The mutual gravitational attraction within the
Local Group was evidently strong enough to overcome the global expansion of the Universe. Thus
we approach Andromeda with a velocity of 120 km s$^{-1}$ and the other members of the Group move with a
velocity which differs by less than 60 km s$^{-1}$ from the common motion of our Milky Way and Andromeda.
Therefore the galaxies within the Local Group have a kinetic energy too low to escape from it.
Other clusters of galaxies in our neighbourhood are the Virgo and Coma clusters at respectively 20 and
70 Mpc. These large structures obviously form major complexes within the otherwise mainly empty volume
of the Universe. About 50 percent of all systems are found in a cluster where the density is high enough
to counteract the cosmological expansion. The Universe is indeed not static, but expands: all clusters of
galaxies move away from each other and thus away from us. This expansion started with the Big Bang.
This happened quite “recently”, i.e., only about 13.7 billion years ago, which is only three times the age of
the Sun (and Earth).
The classification of galaxies by itself is not a very tantalizing topic. The interesting aspect is
its astrophysical interpretation. This situation is similar to stellar classification: it only becomes interesting
and meaningful when these classifications also turn out to have a physical meaning, as Cecilia Payne
discovered. Concerning the evolution of galaxies, we distinguish between their chemical evolution based on
the evolution of the stars and the dynamical evolution due to the motion and dynamical interactions of all
the galaxy’s constituents (and those that may have come from outside in the case of a merger event). The
chemical evolution is totally determined by the evolution of the stars that form the galaxy. This is further
discussed in Part III of this course. Studies of the dynamical evolution of galaxies are based on N-body
simulations. This topic is not treated here but we refer to the Master course Dynamics of Stellar Systems,
taught at the University of Ghent. KU Leuven students are invited to take the course in the framework of the
Master in Astronomy and Astrophysics at KU Leuven.
1.6 Starting point of this course
The main purpose of this course is to understand the evolution of the stars in the Universe, i.e., the life cycle
of stars. We will put the main focus on the evolution of single stars, but briefly touch upon the complications
arising from binary evolution as well. The more complex evolution of multiple stars, with binaries as a
particular category, follows other pathways. Indeed, tidal forces and spatial restrictions prevent each of the
components from evolving as if there were no companion. This implies specific phenomena such as mass transfer
and angular momentum exchange between the components. We will highlight some of the main aspects
of binary evolution but refer to the biennial Master courses Binary Stars and High Energy Astrophysics for
thorough coverage of this topic.
The theoretical HR diagram representing stellar evolution is shown on the cover page of these lecture
notes. The abscissa contains the effective temperature of the stars and the ordinate shows the logarithm of
the stellar luminosity. The evolutionary tracks of stars of different mass are shown. These are the pathways
of stellar evolution. The main goal of this course is to understand the position of the stars and their evolution
tracks, as indicated for stellar models in the HR diagram on the cover. To reach this aim, we first study the
internal structure of stars; this is the subject of Part II of this course. Once we have dealt with the basic
concepts and equations describing the stellar structure, we study the life cycle of a star, as represented by
the evolution tracks in the HR diagram. We treat the evolution of single stars of different birth masses in
Part III. The short Part IV is dedicated to basic aspects of binary evolution.
The World-Wide-Web offers many illustrations and photos of stars (and their planets), star clusters, and
galaxies in different evolutionary stages. These illustrations lose much of their quality when printed and,
moreover, we care about our environment and aim for a sustainable planet. Therefore I direct the readers to
the internet to look for illustrations of the objects in the Universe and to admire their beauty. Here are some
links:
• https://www.eso.org/public/images/
• https://www.esa.int/ESA−Multimedia/Images
• http://hubblesite.org/gallery/showcase/text.shtml
• http://antwrp.gsfc.nasa.gov/apod/
Professional astronomers are sometimes called “Archivists of the Cosmos”. Astronomy is
a specialisation that promotes open access, proper documentation, and exhaustive collection of information
and data in electronic form (from telescopes and instruments operated from laboratories on Earth as well as
from satellites in space). Two astronomical databases available on the World-Wide-Web are indispensable
sources of information for astronomers:
• https://ui.adsabs.harvard.edu/classic-form
• http://simbad.u-strasbg.fr/simbad/
The first site indexes the international peer-reviewed articles published in astronomy and includes links
to the most common journals and magazines. You can search by author name, name of
the star, journal, etc. The second is a database where you can consult and, if necessary, query the
known data of each known celestial body. Anyone who wants to specialise in astronomy will
definitely use these databases frequently.
Chapter 2
A simple equation of state: an ideal gas with radiation
Describing the stellar structure requires the knowledge of the characteristics of stellar material. This chapter
describes the thermodynamic properties of the stellar gas. The general assumption that we make is that,
at every location in the star, the gas is in a state of thermodynamic equilibrium. With this assumption, we
do not have to take into account the detailed reactions between the particles, like atoms, electrons, ions,
photons. . . , which are the building blocks of the gas. The average characteristics of the gas can be described
in terms of local variables and the relations between them. Consequently, at a given temperature, density and
chemical composition, it will be possible to determine all other variables like pressure and internal energy.
The specification of these relations is called the determination of the equation of state (EOS) of the gas.
In this chapter we will discuss one example of an EOS which is highly relevant for stars. Other realistic
EOS will be discussed in Chapter 4. First, we will focus on some basic notions and briefly recall some basic
relations of thermodynamics which are relevant in the context of stellar structure.
2.1 Introduction to thermodynamics, applied to stars
The changes in the state of the gas of which a star is composed are important during the evolution of the
star. The basic equation that describes the evolution of the characteristics of a gas is the first law of thermo-
dynamics. We will now summarize the notions of thermodynamics, which are important to understand the
internal stellar structure.
2.1.1 Thermodynamic equilibrium
Classical thermodynamics describes the systems that have a uniform temperature and chemical composition
and are in mechanical and thermodynamic equilibrium. In general, these conditions are not met in stars.
The state of mechanical equilibrium is reached when in each point the pressure force is compensated
by the sum of all other acting forces. In astronomy, this is called hydrostatic equilibrium.
We now consider a volume consisting of radiation and matter that is adiabatically enclosed. This
means there is no possible heat exchange with the surroundings. When the mechanical equilibrium is
accompanied by a single temperature in the volume, there is a mechanical and thermal equilibrium.
In general the system consists of reacting elements with varying concentration over time due to chem-
ical reactions that occur. When the density and the temperature remain constant, the relative concentrations
of the particles remain in equilibrium. In this case, we are dealing with chemical equilibrium. When both
chemical and thermal equilibrium are reached, the system does not change anymore. This is called: thermo-
dynamic equilibrium.
Although classical thermodynamics is not strictly valid in stars, it can be used for the description of
stellar structure. The reason is that the star can be divided in a large number of layers, which can each be
taken fairly thin so that these enclose the characteristics of equilibrium in the sense of classical thermo-
dynamics. In astronomy, this state is called local thermodynamic equilibrium, or LTE. If LTE is a close
approximation, the basic laws of the classical thermodynamics are valid within each layer of the star, even
when the star is not in thermodynamic equilibrium as a whole.
2.1.2 The first law of thermodynamics
We consider the work that is connected with the volume change of a system. Assume that P is the pressure
at the surface of the system and that the system undergoes a change dV of its volume. The work that is done by the
pressure is dW = PdV. In astronomy the work per unit mass, w, is used, by
the introduction of the specific volume v = 1/ρ, i.e., v is the volume that is taken by one unit mass. We then
get dw = Pdv.
We now consider an infinitesimal thermodynamic transformation of the system, corresponding with
infinitesimal variations of the pressure, the density and the temperature. Define dq as the amount of heat
that is absorbed by the system per unit mass and dw as the work done per unit mass by the system. The first
law of thermodynamics states that the differential du ≡ dq − dw is a total differential. The first law thus
allows us to define a function u, which will be called the internal energy per unit mass of the system. By
consequence, the internal energy of the system can be altered by doing work or by heating or cooling the
system.
The first law of thermodynamics gives the relation between the added heat dq, the internal energy u
and the specific volume v = 1/ρ (each defined per unit mass):
$$dq = du + P\,dv. \tag{2.1}$$
An adiabatic process is a process that occurs in such a way that no heat enters or leaves the system: dq = 0.
For an adiabatic process the change of the internal energy is opposite to the work done by the system.
When dw is negative, like when you have a compression, the internal energy increases, which is mostly
accompanied by an increase in temperature. On the other hand dw > 0 implies a decrease of the internal
energy, accompanied by a decrease in temperature. When a process like compression or expansion occurs
quickly, it will be approximately adiabatic because heat exchange with the surroundings occurs much more slowly.
When there is no work done during the adiabatic process, the internal energy of the system does not
change. It is however possible that there are alterations in P, ρ, T .
2.1.3 The entropy
Suppose that a system runs through a series of states of thermodynamic equilibrium. This is called a quasi-
static transformation. Such a quasi-static transformation is called a reversible transformation when during
the transformation no energy is lost due to effects like friction. By consequence, a reversible transformation
can be run through in two opposite directions.
We let the system run through a reversible cycle, first in one direction, and afterwards in the opposite
direction. We then define the quantity s of the system by ds ≡ dq/T ; s is called the entropy of the system
and is defined per unit mass. According to the first law, ds is also a total differential, given by ds = du/T + (P/T) dv.
The entropy of a system is solely defined for states of thermodynamic equilibrium. Moreover, we can only
determine the variation of the entropy from the first law. We want to stress that the relation du = T ds − P dv
does not suppose any variation in chemical composition.
2.1.4 The specific heats
From a mathematical point of view it is useful to introduce general specific heats
$$c_\alpha \equiv \left(\frac{\partial q}{\partial T}\right)_\alpha, \tag{2.2}$$
i.e., cα is the amount of heat a system has to absorb to raise its temperature by one unit while the quantity α is kept constant. From a
physical point of view, two kinds of specific heats are particularly relevant:
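$$c_P \equiv \left(\frac{dq}{dT}\right)_P = \left(\frac{\partial u}{\partial T}\right)_P + P\left(\frac{\partial v}{\partial T}\right)_P, \qquad c_v \equiv \left(\frac{dq}{dT}\right)_v = \left(\frac{\partial u}{\partial T}\right)_v. \tag{2.3}$$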
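When searching for a relation between cP and cv, we consider general equations of state ρ = ρ(P, T) and u = u(ρ, T). In general, ρ and u also depend on the chemical composition, but here we assume it
is constant. We now define the derivatives:
$$\alpha \equiv \left(\frac{\partial \ln\rho}{\partial \ln P}\right)_T = -\frac{P}{v}\left(\frac{\partial v}{\partial P}\right)_T, \qquad \delta \equiv -\left(\frac{\partial \ln\rho}{\partial \ln T}\right)_P = \frac{T}{v}\left(\frac{\partial v}{\partial T}\right)_P. \tag{2.4}$$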
The equation of state can then be written as
$$\frac{d\rho}{\rho} = \alpha\,\frac{dP}{P} - \delta\,\frac{dT}{T}. \tag{2.5}$$
We now use (2.1) and
$$du = \left(\frac{\partial u}{\partial v}\right)_T dv + \left(\frac{\partial u}{\partial T}\right)_v dT \tag{2.6}$$
to determine the change ds = dq/T of the specific entropy:
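$$ds = \frac{dq}{T} = \frac{1}{T}\left[\left(\frac{\partial u}{\partial v}\right)_T + P\right]dv + \frac{1}{T}\left(\frac{\partial u}{\partial T}\right)_v dT. \tag{2.7}$$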
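Since ds is a total differential, we have ∂²s/∂T∂v = ∂²s/∂v∂T. Applying this to the former equation,
we get
$$\frac{\partial}{\partial T}\left[\frac{1}{T}\left(\frac{\partial u}{\partial v}\right)_T + \frac{P}{T}\right] = \frac{1}{T}\frac{\partial^2 u}{\partial T\,\partial v}. \tag{2.8}$$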
Carrying out the differentiation on the left-hand side of the equation, we get
$$\left(\frac{\partial u}{\partial v}\right)_T = T\left(\frac{\partial P}{\partial T}\right)_v - P. \tag{2.9}$$
This relation is called the reciprocity relation.
To get cP − cv we first deduce an expression for (∂u/∂T)P in which we take P and T as independent
variables. From (2.6) we get
$$\frac{du}{dT} = \left(\frac{\partial u}{\partial T}\right)_v + \left(\frac{\partial u}{\partial v}\right)_T \frac{dv}{dT}, \tag{2.10}$$
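and thus
$$\left(\frac{\partial u}{\partial T}\right)_P = \left(\frac{\partial u}{\partial T}\right)_v + \left(\frac{\partial u}{\partial v}\right)_T \left(\frac{\partial v}{\partial T}\right)_P = \left(\frac{\partial u}{\partial T}\right)_v + \left(\frac{\partial v}{\partial T}\right)_P \left[T\left(\frac{\partial P}{\partial T}\right)_v - P\right], \tag{2.11}$$
hereby using (2.9). Together with the definition of the specific heats, this last result gives: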
$$c_P - c_v = T\left(\frac{\partial v}{\partial T}\right)_P \left(\frac{\partial P}{\partial T}\right)_v. \tag{2.12}$$
On the other hand we use the right hand sides in the definitions of α and δ of Eqs (2.4) to obtain:
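$$\frac{P\delta}{T\alpha} = -\frac{\left(\partial v/\partial T\right)_P}{\left(\partial v/\partial P\right)_T} = \left(\frac{\partial P}{\partial T}\right)_v. \tag{2.13}$$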
Using T(∂v/∂T)P = vδ = δ/ρ, we then get the basic relation
$$c_P - c_v = \frac{P\delta^2}{T\rho\alpha}. \tag{2.14}$$
We establish that the difference of the specific heats can be determined entirely from the derivatives of the
equation of state.
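Relation (2.14) can be checked numerically for any equation of state ρ(P, T). The sketch below is only an illustration: it assumes an ideal gas as the example equation of state (treated in Section 2.2), a cgs value for the gas constant per unit mass, and hypothetical values for µ, P and T. The function names are not part of the course material.

```python
import math

R_GAS = 8.314e7  # assumed cgs value of the gas constant per unit mass [erg K^-1 g^-1]


def rho_ideal(P, T, mu):
    """Example equation of state rho(P, T): an ideal gas, rho = mu * P / (R * T)."""
    return mu * P / (R_GAS * T)


def log_derivative(f, x, eps=1e-4):
    """Centred finite-difference estimate of d ln f / d ln x."""
    upper = math.log(f(x * math.exp(eps)))
    lower = math.log(f(x * math.exp(-eps)))
    return (upper - lower) / (2.0 * eps)


def alpha_delta(rho, P, T):
    """alpha and delta as defined in Eq. (2.4), for a callable rho(P, T)."""
    alpha = log_derivative(lambda p: rho(p, T), P)
    delta = -log_derivative(lambda t: rho(P, t), T)
    return alpha, delta


def cp_minus_cv(rho, P, T):
    """Right-hand side of Eq. (2.14): P * delta^2 / (T * rho * alpha)."""
    alpha, delta = alpha_delta(rho, P, T)
    return P * delta**2 / (T * rho(P, T) * alpha)


mu = 0.61                 # hypothetical mean molecular weight
P, T = 1.0e17, 1.0e7      # illustrative pressure [dyn cm^-2] and temperature [K]


def eos(pressure, temperature):
    return rho_ideal(pressure, temperature, mu)


alpha, delta = alpha_delta(eos, P, T)
difference = cp_minus_cv(eos, P, T)
print(alpha, delta, difference, R_GAS / mu)
```

For this example equation of state the derivatives come out as α = δ = 1 up to rounding, and the difference of the specific heats reduces to R/µ. Any other callable ρ(P, T) can be passed in the same way.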
Now we want to rewrite the first law of thermodynamics in terms of the variation of the pressure and
the temperature. Therefore we first write:
$$dq = du + P\,dv = \left(\frac{\partial u}{\partial T}\right)_v dT + \left[\left(\frac{\partial u}{\partial v}\right)_T + P\right]dv. \tag{2.15}$$
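Using first (2.9), and then the definition of v and (2.13), we find
$$dq = \left(\frac{\partial u}{\partial T}\right)_v dT + T\left(\frac{\partial P}{\partial T}\right)_v dv = c_v\,dT - T\left(\frac{\partial P}{\partial T}\right)_v \frac{1}{\rho^2}\,d\rho = c_v\,dT - \frac{P\delta}{\rho\alpha}\frac{d\rho}{\rho}, \tag{2.16}$$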
which can be rewritten as
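$$dq = c_v\,dT - \frac{P\delta}{\rho\alpha}\left(\alpha\,\frac{dP}{P} - \delta\,\frac{dT}{T}\right) = \left(c_v + \frac{P\delta^2}{T\rho\alpha}\right)dT - \frac{\delta}{\rho}\,dP. \tag{2.17}$$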
We then find, via (2.14)
$$dq = c_P\,dT - \frac{\delta}{\rho}\,dP. \tag{2.18}$$
For adiabatic transformations the entropy remains constant ds = dq/T = 0. We now define the
adiabatic temperature gradient ∇ad as follows:
$$\nabla_{\rm ad} \equiv \left(\frac{\partial \ln T}{\partial \ln P}\right)_s, \tag{2.19}$$
in which the lower index s indicates that the definition is valid for constant entropy. From (2.18) we deduce
that (dT/dP)s = δ/(ρcP). This leads to an expression for ∇ad:
$$\nabla_{\rm ad} = \left(\frac{P}{T}\frac{dT}{dP}\right)_s = \frac{P\delta}{T\rho c_P}. \tag{2.20}$$
∇ad defines the temperature variation perceived by the particles in a mass element of a system when this
element suffers a pressure variation as a result of adiabatic expansion. This is an expansion which causes
no heat exchange with the surroundings. What happens is this: mass elements which are heated deeply
in the star rise, because, due to their lower density, they are lighter than their surroundings. Due to this
rise the mass elements end up in the higher layers where the density is lower and they therefore expand.
The expansion of the mass elements causes a decrease in temperature of the gas. ∇ad is the value of this
temperature change. The pressure as well as the temperature decrease moving outwardly. The value of the
decrease in pressure is given by the equation of hydrostatic equilibrium (see below) and once this value is
determined, we can compute ∇ad.
2.2 An ideal gas with radiation
2.2.1 The classical ideal gas law applied to stars
The presumption of thermodynamic equilibrium implicitly assumes that the conditions in the gas do not
remarkably alter on an average free path length and during the average time between two collisions of the
gas particles. With the term gas particle we do not only mean the material particles like atoms or electrons,
but also photons. The condition of thermodynamic equilibrium is surely met in stellar interiors, where the
density is high. It is not valid anymore in the stellar atmosphere.
A remarkable simplification is obtained when we take the high temperatures in the stellar interiors into
account. After all, in most stars the gas can be considered as fully ionized, in other words only consisting of
nuclei and free electrons without any internal degrees of freedom. The particles in such a gas do not interact.
Such a gas is called an ideal gas.
The well-known formulation of the ideal gas law for gas particles of one single type in a volume is:
$$PV = NkT, \tag{2.21}$$
with P the pressure, V the volume, N the number of gas particles in the volume, T the temperature and
k Boltzmann’s constant (see Appendix B) given by R/NA with R the gas constant and NA = g/mu (where
the atomic mass unit mu is expressed in grams) Avogadro’s number. In stellar media, we cannot easily specify the number
of particles in the volume, and therefore we choose to work with densities. We represent the number of
particles per unit volume as n = N/V. This way we can also write the ideal gas law as follows:
$$P = n\,\frac{R}{N_A}\,T = n\,m_u\,R\,T, \tag{2.22}$$
where in the last expression we use the gas constant R with a dimension (energy per K and per unit mass), i.e., R = k/mu.
We now define the molecular weight µ as the particle mass expressed in mu. In our definition, µ is
dimensionless (instead of having the dimension of mass per mole as is common habit in thermodynamics).
The density of the stellar material is in fact the product of the number of particles per volume unit and the
mass of the particles. This way, we find nmu = ρ/µ and for the ideal gas law, we get:
$$P = \frac{R}{\mu}\,\rho\,T. \tag{2.23}$$
This is the usual notation of the equation of state of an ideal gas consisting of one type of particles in
astrophysics.
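The equivalence of the particle form (2.21) and the density form (2.23) can be verified with a few lines of code. This is a minimal sketch: the constants are cgs values assumed here (the notes refer to Appendix B for them), and the density, temperature and µ are arbitrary illustrative numbers.

```python
import math

K_B = 1.380649e-16     # Boltzmann constant [erg K^-1], assumed cgs value
M_U = 1.66053907e-24   # atomic mass unit [g], assumed cgs value
R_GAS = K_B / M_U      # gas constant per unit mass [erg K^-1 g^-1]


def number_density(rho, mu):
    """Particles per cm^3 from n * m_u = rho / mu."""
    return rho / (mu * M_U)


def pressure_from_particles(n, T):
    """Eq. (2.21) divided by the volume: P = n k T."""
    return n * K_B * T


def pressure_from_density(rho, T, mu):
    """Eq. (2.23): P = (R / mu) * rho * T."""
    return R_GAS / mu * rho * T


rho, T, mu = 1.0, 1.0e6, 1.0   # illustrative values: g cm^-3, K, atomic hydrogen
n = number_density(rho, mu)
P_particles = pressure_from_particles(n, T)
P_density = pressure_from_density(rho, T, mu)
print(n, P_particles, P_density)
```

Both routes give the same pressure, which is simply the statement that k/mu = R.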
2.2.2 The mean molecular weight
In the stellar interior close to the stellar core all matter is ionized. This means that there is one free electron
per hydrogen atom and for each helium atom there are two free electrons. In reality, we thus have a gas
mixture consisting of two types of particles, the ions (each composed of different components – protons
and neutrons) and the free electrons. This mixture is again an ideal gas when both components meet the
requirements of the ideal gas law.
The composition of the stars is very simple compared with the one of materials on Earth. Due to the
high pressure and temperature the stellar interior is almost entirely composed of fully ionized matter. In
such a medium it would be sufficient to describe the different types of nuclei, which we will call particles in
the future. To each type of particle we will assign an index i. With Xi we denote the relative mass fraction
of the particles of type i, i.e., the fraction of one unit of mass consisting of type i particles. From this, we
get:
$$\sum_i X_i = 1. \tag{2.24}$$
The chemical state of the gas mixture composed of fully ionized nuclei and free electrons is described
by specifying all Xi, which have a molecular weight µi and a charge Zi. For ni particles per unit volume with a
partial density ρi, we have Xi = ρi/ρ and
$$n_i = \frac{\rho_i}{\mu_i m_u} = \frac{\rho}{m_u}\frac{X_i}{\mu_i}. \tag{2.25}$$
We ignore the mass of the electrons vis-a-vis the mass of the ions (see Appendix B for the mass of both).
The total pressure P of the gas mixture is the sum of the partial pressures:
$$P = P_e + \sum_i P_i = \left(n_e + \sum_i n_i\right)kT, \tag{2.26}$$
where Pe is the pressure of the free electrons, Pi the partial pressure due to the type i particles and supposing
that each of the components is an ideal gas. The contribution of one fully ionized atom of type i to the total
number of particles (nucleus and Zi free electrons) is 1 + Zi, from which
$$n = n_e + \sum_i n_i = \sum_i (1 + Z_i)\,n_i. \tag{2.27}$$
This expression together with (2.25) and (2.26) gives the following new expression for the total pressure
$$P = R\sum_i \frac{X_i(1 + Z_i)}{\mu_i}\,\rho\,T. \tag{2.28}$$
This result can be reduced to the form (2.23) when we introduce the mean molecular weight:
$$\mu \equiv \left(\sum_i \frac{X_i(1 + Z_i)}{\mu_i}\right)^{-1}. \tag{2.29}$$
This way we can treat a gas mixture of components, each of which is an ideal gas, as one uniform
ideal gas. We only have to replace the molecular weight µ in Eq. (2.23) by the mean molecular weight µ.
The definition of the mean molecular weight can easily be adjusted for a neutral gas where all electrons
are still bound to their nucleus. In this case we simply replace the factor 1 + Zi by 1. With this description
we can handle all situations with fully ionized matter or with neutral atoms.
The mean molecular weight depends on the chemical composition. Let us consider a chemical compo-
sition based on a fraction X of hydrogen, Y of helium and Z of heavy elements so that X + Y + Z = 1.
The fraction of heavy elements generally comes from j different elements, Z = Σj Zj, having mass
number Aj. The average number of free electrons that is released when a heavy element with fraction
Zj is fully ionized is Aj/2.
When all atoms are ionized, the mean molecular weight can be written as follows:
$$\mu = \left[\frac{X(1 + 1)}{1} + \frac{Y(1 + 2)}{4} + \sum_j \frac{Z_j(1 + A_j/2)}{A_j}\right]^{-1}. \tag{2.30}$$
In practice, all terms Zj/Aj drop out, because their contribution is negligible (consider that Z ≈ 2–3%).
We then get:
$$\mu = \left(2X + \frac{3Y}{4} + \frac{1}{2}(1 - X - Y)\right)^{-1} = \left(\frac{3X}{2} + \frac{Y}{4} + \frac{1}{2}\right)^{-1}. \tag{2.31}$$
In the central layers of a newborn star like the Sun (X = 0.717, Y = 0.270, Z = 0.013) we then find
µ = 0.61. In the case of pure, fully ionized hydrogen we find µ = 1/2. For a fully ionized helium
gas we find, by contrast, µ = 4/3.
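The three values quoted above follow directly from (2.29) and (2.31). The following sketch reproduces them; the function names and the tuple layout (Xi, Zi, µi) are choices made here for illustration only.

```python
def mean_molecular_weight(species):
    """Eq. (2.29): species is a list of (X_i, Z_i, mu_i) for fully ionized nuclei."""
    return 1.0 / sum(X * (1 + Z) / A for X, Z, A in species)


def mu_fully_ionized(X, Y):
    """Eq. (2.31): simplified form in which the Z_j / A_j terms are neglected."""
    return 1.0 / (1.5 * X + 0.25 * Y + 0.5)


mu_sun = mu_fully_ionized(0.717, 0.270)              # composition quoted in the text
mu_hydrogen = mean_molecular_weight([(1.0, 1, 1)])   # pure ionized hydrogen
mu_helium = mean_molecular_weight([(1.0, 2, 4)])     # pure ionized helium
print(round(mu_sun, 2), mu_hydrogen, round(mu_helium, 3))  # prints: 0.61 0.5 1.333
```

The general function accepts any list of fully ionized species, whereas the simplified one only needs X and Y.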
When we are dealing with a neutral gas, we get
$$\mu = \left[\frac{X}{1} + \frac{Y}{4} + \sum_j \frac{Z_j}{A_j}\right]^{-1}, \tag{2.32}$$
or simplified:
$$\mu = \left(X + \frac{Y}{4}\right)^{-1} \tag{2.33}$$
where again all contributions Zj/Aj were neglected. For the outer stellar layers of the Sun we thus find
µ = 1.29.
In reality, the outer layer of cool stars will not contain any ionized gas. On the other hand all the
atoms in the inner layers will be fully ionized. Somewhere in the star there is thus a critical layer where
both ionized and non-ionized matter of a given chemical element occur. This is called a partial
ionization layer. The ionization of hydrogen, e.g., requires 13.6 eV. The first ionization of helium needs
24.6 eV, etc. We conclude from this that the first partial ionization layer of helium is located deeper inside
the star than the partial ionization layer of hydrogen. Analogously, the second partial ionization layer of
helium is located deeper inside the star than the first partial ionization layer. When the temperature is higher
than roughly 200 000 K all hydrogen and helium is fully ionized.
When the stellar material is partially ionized, we have to consider all the different ionization states
when determining µ and thus it is not possible to compute this quantity analytically as we did for the fully
ionized and fully neutral case. In general the ratio of the number of particles in the ionization state
(r + 1) to the number of particles in the ionization state r is described by the ionization law of Saha:
$$\frac{N_{r+1}}{N_r} = \frac{1}{N_e}\,\frac{2U_{r+1}}{U_r}\left(\frac{2\pi m_e kT}{h^2}\right)^{3/2}\exp\left[-\chi_r/kT\right], \tag{2.34}$$
with Ne the electron density, me the mass of the electron, χr the energy needed to ionize a particle from
state r to state r + 1, and Ur+1 and Ur the so-called partition functions of the ionization states r + 1 and r.
These last ones are found from the Boltzmann distribution:
$$\frac{n_{r,s}}{N_r} = \frac{g_{r,s}}{U_r}\exp\left[-\chi_{r,s}/kT\right], \tag{2.35}$$
with nr,s the number of particles per cm³ in level s of ionization state r, gr,s the statistical weight of
that level, χr,s the excitation energy of that level measured from the ground state (r, 1), $N_r \equiv \sum_s n_{r,s}$
the total particle density in all levels of the ionization state r, and Ur:
$$U_r \equiv \sum_s g_{r,s}\exp\left[-\chi_{r,s}/kT\right]. \tag{2.36}$$
The excitation energy χr,s is the energy difference between the excited state (r, s) and the ground state
(r, 1). The statistical weights gr,s measure the degeneracy of the levels as a consequence of magnetic fine
splitting. In absence of a magnetic field, these are equal to two (spin “up” or “down” for the proton or the
electron).
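To get a feeling for the Saha law (2.34), it can be evaluated for a gas of pure hydrogen. The sketch below rests on several assumptions that are not part of the text: the partition functions are approximated by the ground-state weights (U0 = 2 for neutral hydrogen, U1 = 1 for the bare proton), the number density of hydrogen nuclei is a hypothetical value, and the physical constants are assumed cgs values.

```python
import math

K_B = 1.380649e-16         # Boltzmann constant [erg K^-1]
M_E = 9.1093837e-28        # electron mass [g]
H_PLANCK = 6.62607015e-27  # Planck constant [erg s]
EV = 1.602176634e-12       # 1 eV in erg
CHI_H = 13.6 * EV          # ionization energy of hydrogen, as quoted in the text


def saha_rhs(T, chi=CHI_H, u_ratio=0.5):
    """N_{r+1} * N_e / N_r from Eq. (2.34); u_ratio = U_{r+1} / U_r (assumed 1/2)."""
    thermal = (2.0 * math.pi * M_E * K_B * T / H_PLANCK**2) ** 1.5
    return 2.0 * u_ratio * thermal * math.exp(-chi / (K_B * T))


def hydrogen_ionization_fraction(T, n_h):
    """Ionized fraction x of pure hydrogen with n_h nuclei per cm^3.

    With N_e = N_1 = x * n_h and N_0 = (1 - x) * n_h, Eq. (2.34) becomes
    x^2 / (1 - x) = b, with b = saha_rhs(T) / n_h.
    """
    b = saha_rhs(T) / n_h
    if b == 0.0:
        return 0.0
    # Positive root of x^2 + b x - b = 0, written in a numerically stable form.
    return 2.0 * b / (b + math.sqrt(b * b + 4.0 * b))


n_h = 1.0e16  # hypothetical number density of hydrogen nuclei [cm^-3]
fractions = [hydrogen_ionization_fraction(T, n_h) for T in (5.0e3, 1.0e4, 2.0e4)]
for T, x in zip((5.0e3, 1.0e4, 2.0e4), fractions):
    print(f"T = {T:8.0f} K   x = {x:.3e}")
```

Under these assumptions hydrogen goes from almost neutral to almost fully ionized over a fairly narrow temperature range, which is the behaviour behind the partial ionization layers described above.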
We also notice that the mean molecular weight changes during the evolution of a star, since the mutual
fractions X,Y,Z change as a consequence of the nuclear reactions. The mean molecular weight changes
during the evolution layer by layer, as the efficiency of the nuclear reactions is extremely temperature dependent.
Because of this, the star builds up a gradient of µ in its deepest layers during its life. As of now, we
will use the simplified notation µ to still indicate the mean molecular weight inside the star.
Finally, for subsequent use in the case where the electrons are responsible for the dominant pressure
rather than the ions, we wish to determine the mean molecular weight per free electron µe. For a fully
ionized gas every nucleus i delivers Zi free electrons and we get
$$\mu_e = \left(\sum_i \frac{X_i Z_i}{\mu_i}\right)^{-1}. \tag{2.37}$$
Since for all elements that are heavier than helium, the approximation µi/Zi ≈ 2 is valid, we find
$$\mu_e = \left(X + \frac{1}{2}Y + \frac{1}{2}(1 - X - Y)\right)^{-1} = \frac{2}{1 + X}. \tag{2.38}$$
This result will be used in Chapters 12 and 13.
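As a quick check of the approximation µi/Zi ≈ 2, the general expression (2.37) can be compared with the closed form (2.38). In this sketch the heavy elements are lumped into oxygen (Zi = 8, µi = 16) purely for illustration; that choice, like the function names, is an assumption made here.

```python
def mu_e_general(species):
    """Eq. (2.37): species is a list of (X_i, Z_i, mu_i) for fully ionized nuclei."""
    return 1.0 / sum(X * Z / A for X, Z, A in species)


def mu_e_approx(X):
    """Eq. (2.38): mu_e = 2 / (1 + X)."""
    return 2.0 / (1.0 + X)


# Composition quoted earlier in the text; metals represented by oxygen as an example.
mixture = [(0.717, 1, 1), (0.270, 2, 4), (0.013, 8, 16)]
mu_e_mix = mu_e_general(mixture)
mu_e_closed = mu_e_approx(0.717)
print(mu_e_mix, mu_e_closed)
```

The two values agree because oxygen satisfies µi/Zi = 2 exactly. For pure hydrogen the closed form gives µe = 1 and for a hydrogen-free gas µe = 2.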
2.2.3 The internal energy of an ideal gas
For an ideal gas (α = δ = 1), Eq. (2.14) simplifies to the well-known result cP − cv = R/µ, from which
we can deduce that cP > cv. Note that in classical thermodynamics we find cP − cv = R for an ideal gas.
That we find a factor 1/µ here is due to the fact that we work per unit of mass in astronomy rather than per
mole.
From the reciprocity relation we find that
$$\left(\frac{\partial u}{\partial v}\right)_T = 0. \tag{2.39}$$
From this we deduce that the internal energy of an ideal gas is only a function of its temperature.
The distribution of the velocity v in an ideal gas consisting of classical particles (thus ignoring
relativistic effects) is given by the Maxwell distribution function:
$$f(v) = 4\pi v^2\left(\frac{m}{2\pi kT}\right)^{3/2}\exp\left(-\frac{mv^2}{2kT}\right), \tag{2.40}$$
with m the mass of the particle. This distribution function is defined such that f(v)dv represents the
probability that the particle has a velocity between v and v + dv. The function f is normalised such that
$$\int_0^\infty f(v)\,dv = 1. \tag{2.41}$$
The maximum of the distribution, i.e., the most likely velocity, is given by $\sqrt{2kT/m}$. On the other hand, the
average velocity is equal to
$$\langle v\rangle = \int_0^\infty v f(v)\,dv = \left(\frac{8kT}{\pi m}\right)^{1/2} \tag{2.42}$$
and the average quadratic velocity is given by
$$\langle v^2\rangle = \int_0^\infty v^2 f(v)\,dv = \frac{3kT}{m}. \tag{2.43}$$
From this equation we deduce that the average kinetic energy per particle equals 3kT/2. The average kinetic
energy density, which is the average amount of kinetic energy per unit of mass, is therefore found by dividing
3kT/2 by the average mass of a particle. This average mass is nothing else than µmu, so that we find an
average kinetic energy density equal to 3kT/2µmu. Since k/mu = R, we finally find 3RT/2µ for the
average kinetic energy density per unit mass.
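The moments (2.41) to (2.43) of the Maxwell distribution can be confirmed by direct numerical integration. The sketch below assumes protons as example particles, an illustrative temperature of 10^7 K, and a simple composite Simpson rule written out by hand; none of these choices come from the notes.

```python
import math

K_B = 1.380649e-16      # Boltzmann constant [erg K^-1]
M_P = 1.67262192e-24    # proton mass [g], used as an example particle


def maxwell(v, m, T):
    """Maxwell speed distribution, Eq. (2.40)."""
    prefactor = (m / (2.0 * math.pi * K_B * T)) ** 1.5
    return 4.0 * math.pi * v**2 * prefactor * math.exp(-m * v**2 / (2.0 * K_B * T))


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


T = 1.0e7                                  # illustrative temperature [K]
v_peak = math.sqrt(2.0 * K_B * T / M_P)    # most likely speed
v_max = 12.0 * v_peak                      # the tail beyond this is negligible

norm = simpson(lambda v: maxwell(v, M_P, T), 0.0, v_max)
mean_v = simpson(lambda v: v * maxwell(v, M_P, T), 0.0, v_max)
mean_v2 = simpson(lambda v: v**2 * maxwell(v, M_P, T), 0.0, v_max)

print(norm)                                             # compare with Eq. (2.41)
print(mean_v, math.sqrt(8.0 * K_B * T / (math.pi * M_P)))  # compare with Eq. (2.42)
print(0.5 * M_P * mean_v2, 1.5 * K_B * T)               # kinetic energy per particle
```

The last line compares the numerically integrated mean kinetic energy per particle with 3kT/2.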
The internal energy of the ideal gas is in general given by the sum of the kinetic energy due to thermal
motion and the ionization energy. A fully ionized gas or an entirely neutral gas have no ionization energy.
In this case, we thus find for the internal energy of the gas that
$$u = \frac{3RT}{2\mu}. \tag{2.44}$$
The average internal energy per unit mass is equal to 3P/2ρ in the limit of a classical ideal gas consisting
of only one type of particles.
From the expression of u we immediately find
$$c_v = \left(\frac{\partial u}{\partial T}\right)_v = \frac{3}{2}\frac{R}{\mu}. \tag{2.45}$$
Consequently cP − cv = R/µ then gives
$$c_P = \frac{5}{2}\frac{R}{\mu}, \tag{2.46}$$
from which we can deduce that
$$\gamma \equiv \frac{c_P}{c_v} = \frac{5}{3}. \tag{2.47}$$
We then find ∇ad = 2/5 for an ideal gas that is entirely composed of either fully ionized matter
or neutral atoms. This means that the temperature variation of an ideal gas in adiabatic
compression or expansion follows $T \sim P^{2/5}$.
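The chain from (2.45) to ∇ad = 2/5 is short enough to evaluate directly. The following sketch assumes a cgs value for R and a hypothetical µ; the pressure ratio of 32 is chosen only because 32^(2/5) = 4 makes the result easy to read.

```python
R_GAS = 8.314e7  # assumed cgs value of the gas constant per unit mass [erg K^-1 g^-1]


def ideal_gas_thermo(mu):
    """Return (c_v, c_P, gamma, nabla_ad) for a fully ionized or neutral ideal gas."""
    c_v = 1.5 * R_GAS / mu          # Eq. (2.45)
    c_p = c_v + R_GAS / mu          # Eq. (2.14) with alpha = delta = 1
    gamma = c_p / c_v               # Eq. (2.47)
    # Eq. (2.20) with delta = 1 and P / (rho * T) = R / mu
    nabla_ad = (R_GAS / mu) / c_p
    return c_v, c_p, gamma, nabla_ad


def adiabatic_temperature(T1, P1, P2, nabla_ad):
    """Temperature after an adiabatic change from P1 to P2, using T ~ P**nabla_ad."""
    return T1 * (P2 / P1) ** nabla_ad


c_v, c_p, gamma, nabla_ad = ideal_gas_thermo(mu=0.61)   # hypothetical mu
T2 = adiabatic_temperature(T1=1.0e6, P1=1.0, P2=32.0, nabla_ad=nabla_ad)
print(gamma, nabla_ad, T2)
```

An adiabatic compression that raises the pressure by a factor 32 thus raises the temperature by a factor of about 4, independent of µ.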
For an ideal gas we can link the pressure, volume and density variations as follows:
$$\frac{dP}{P} = -\frac{c_P}{c_v}\frac{dv}{v} = -\gamma\,\frac{dv}{v} = \gamma\,\frac{d\rho}{\rho}, \tag{2.48}$$
which can be rewritten as
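$$\left(\frac{\partial \ln P}{\partial \ln\rho}\right)_s = \gamma; \qquad \left(\frac{\partial \ln P}{\partial \ln T}\right)_s = \frac{\gamma}{\gamma - 1}; \qquad \left(\frac{\partial \ln T}{\partial \ln\rho}\right)_s = \gamma - 1. \tag{2.49}$$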
These expressions are only valid when the motion of the gas particles is the only contribution to the internal
energy, like in the case of a fully ionized or entirely neutral ideal gas. The expressions are not valid in more
general conditions. Nevertheless it is useful to define the adiabatic variations through similar equations for
such more general conditions. Therefore the following adiabatic exponents are used:
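$$\Gamma_1 \equiv \left(\frac{d\ln P}{d\ln\rho}\right)_s, \qquad \frac{\Gamma_2}{\Gamma_2 - 1} \equiv \left(\frac{d\ln P}{d\ln T}\right)_s, \qquad \Gamma_3 \equiv \left(\frac{d\ln T}{d\ln\rho}\right)_s + 1, \tag{2.50}$$
which satisfy
$$\frac{\Gamma_1}{\Gamma_3 - 1} = \frac{\Gamma_2}{\Gamma_2 - 1}. \tag{2.51}$$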
These definitions are not based on any hypothesis regarding the equation of state. For a fully ionized ideal
gas we recover Γ1 = Γ2 = Γ3 = 5/3.
We finally define the isothermal speed of sound a by
$$a^2 \equiv \frac{R}{\mu}\,T. \tag{2.52}$$
In the case of an isothermal ideal gas we can thus also formulate the ideal gas law as follows:
$$P = a^2\rho, \tag{2.53}$$
with a constant. We will use this formulation in the description of the star formation process (see Part III of
the lecture notes).
2.2.4 The contribution of the photon gas
Due to the high temperatures in stellar interiors the photons considerably contribute to the pressure and
the internal energy of the gas. The pressure in the star is therefore not completely determined by the gas
pressure but there is also a component coming from the pressure due to the photon gas. This radiation
pressure accounts for a considerable fraction of the total pressure in the stellar core of all stars and also in
the photosphere of hot massive stars.
The radiation can very well be approximated by that of a black body. The energy
density of a black body is described by the radiation law of Planck (also see Appendix A):
$$u_\nu(T) = \frac{2h\nu^3}{c^2}\left(\exp(h\nu/kT) - 1\right)^{-1}. \tag{2.54}$$
Since the photons carry along momentum, there is a pressure connected with the radiation. This radiation
pressure is given by $P_{\rm rad} = aT^4/3$ with a the radiation constant (see Appendix A). The energy density
per unit mass corresponding to this radiation pressure is $u = aT^4/\rho = 3P_{\rm rad}/\rho$. We can conclude that the
energy density per unit mass is 3P/2ρ for a non-relativistic ideal gas and 3P/ρ for the photon gas.
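According to the first law (2.1) we see that, for an adiabatic variation of a photon gas,
$$0 = dq = du + P\,dv = du + P\,d\left(\frac{1}{\rho}\right) = 3\,d\left(\frac{P}{\rho}\right) + P\,d\left(\frac{1}{\rho}\right) = 4P\,d\left(\frac{1}{\rho}\right) + \frac{3}{\rho}\,dP = -\frac{4P}{\rho^2}\,d\rho + \frac{3}{\rho}\,dP. \tag{2.55}$$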
From this we can conclude Γ1 = 4/3. On the other hand we see
$$0 = dq = d\left(\frac{aT^4}{\rho}\right) + \frac{1}{3}aT^4\,d\left(\frac{1}{\rho}\right) = -\frac{4aT^4}{3\rho^2}\,d\rho + \frac{4aT^3}{\rho}\,dT, \tag{2.56}$$
from which we can deduce that Γ3 = 4/3. From (2.51) we then also find Γ2 = 4/3.
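The intermediate steps leading from (2.55) and (2.56) to the three adiabatic exponents can be written out explicitly. The lines below only rearrange those two relations and compare them with the definitions (2.50) and the identity (2.51):

```latex
\begin{align*}
(2.55):\quad & \frac{3}{\rho}\,dP = \frac{4P}{\rho^2}\,d\rho
  \;\Rightarrow\; \frac{dP}{P} = \frac{4}{3}\,\frac{d\rho}{\rho}
  \;\Rightarrow\; \Gamma_1 = \frac{4}{3}, \\
(2.56):\quad & \frac{4aT^3}{\rho}\,dT = \frac{4aT^4}{3\rho^2}\,d\rho
  \;\Rightarrow\; \frac{dT}{T} = \frac{1}{3}\,\frac{d\rho}{\rho}
  \;\Rightarrow\; \Gamma_3 - 1 = \frac{1}{3}, \\
(2.51):\quad & \frac{\Gamma_2}{\Gamma_2 - 1} = \frac{\Gamma_1}{\Gamma_3 - 1} = 4
  \;\Rightarrow\; \Gamma_2 = \frac{4}{3}.
\end{align*}
```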
When the system consists of a mixture of particles that behave like an ideal gas and of radiation, the
total pressure is given by
$$P = P_{\rm gas} + P_{\rm rad} = \frac{R}{\mu}\,\rho\,T + \frac{a}{3}\,T^4. \tag{2.57}$$
Often a measure for the contribution of the radiation pressure is defined by introducing β ≡ Pgas/P,
which is equivalent to 1 − β = Prad/P. For β = 0 the gas pressure is zero and for β = 1 the radiation
pressure is zero. Fixing a value for β is thus the same as establishing a mutual link between the gas and
radiation pressure. Obviously, β changes when we move from the stellar interior to the stellar surface. For
stars with M ≥ 10 M⊙, β ≠ 0 in the whole star, even in the area near the stellar surface. For very massive
stars Pgas is even negligible compared to Prad. On the other hand, Prad is negligible near the stellar surface
for stars like the Sun or cooler.
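The relative importance of gas and radiation pressure in (2.57) is easy to evaluate for given local conditions. In the sketch below the constants are assumed cgs values, and the two sets of (ρ, T, µ) are illustrative guesses: one roughly resembling the centre of a Sun-like star, the other a hotter and much less dense gas. Neither set is taken from the notes.

```python
R_GAS = 8.314e7      # gas constant per unit mass [erg K^-1 g^-1], assumed cgs value
A_RAD = 7.5657e-15   # radiation constant a [erg cm^-3 K^-4], assumed cgs value


def pressures(rho, T, mu):
    """Gas and radiation pressure, the two terms of Eq. (2.57), in dyn cm^-2."""
    p_gas = R_GAS / mu * rho * T
    p_rad = A_RAD * T**4 / 3.0
    return p_gas, p_rad


def beta(rho, T, mu):
    """beta = P_gas / P, the gas-pressure fraction defined in the text."""
    p_gas, p_rad = pressures(rho, T, mu)
    return p_gas / (p_gas + p_rad)


beta_sun_like = beta(rho=150.0, T=1.5e7, mu=0.61)   # illustrative Sun-like centre
beta_hot = beta(rho=1.0, T=4.0e7, mu=0.61)          # hypothetical hot, low-density gas
print(beta_sun_like, beta_hot)
```

With these numbers the radiation pressure is a tiny correction in the first case, while in the second case it is comparable to the gas pressure, which reflects the steep T⁴ dependence of Prad.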
Chapter 3
Classical mechanics applied to stellar structure
In this chapter, we discuss the equations of classical mechanics relevant for the study of stellar structure.
When we derive and solve these equations, we will use some of the thermodynamic relations that were
discussed in the previous chapter. We first consider a few key simplifications to develop the theory and will
come back to these in Chapter 7.
3.1 Some preliminaries
In this chapter, we will be approximating a star as a non-rotating non-magnetic gaseous sphere. Obviously,
the theory will only be valid for single stars that are rotating “slowly” and have only a “weak” magnetic
field. In that case, the forces acting upon a fluid element in the star are the pressure force and gravity,
while the Coriolis, centrifugal, Lorentz and any tidal forces can be ignored. These assumptions imply a
tremendous simplification: all quantities are in this case constant in concentric spheres and one spatial
coordinate suffices to describe them. How good an approximation is this?
The aspect of multiplicity and tidal forces and interactions will be treated in the final chapter of these
lecture notes. We do not discuss it here other than remarking that, the higher the birth mass of a star, the
more likely it is to reside in a binary or multiple system. On average, half of the stars occur in binaries so
this cannot be ignored. But in order to understand how binary or multiple stars live their life, one must first
understand how single stars evolve and this is the major topic of this course.
The assumption that the Lorentz force can be ignored is very reasonable for the majority of stars when
studying stellar evolution. The magnetic field of the Sun (and similar stars) causes spectacular phenomena
at the solar surface, like solar flares and coronal mass ejections. However, in the overall life cycle of the
Sun, these effects do not play a major role, because they are local phenomena limited to the solar outer
layers and the corona, where the density of matter is immensely low. Magnetic effects do play a role for
the circumstellar environment, such as the planetary systems consisting of planets, moons, asteroids, comets
and other rocky material orbiting their host star. However, stellar evolution is dominantly determined by
internal physical processes taking place deep inside the star, in and near its core, where the pressure gradient
and gravity are the dominant acting forces.
We make a distinction in terms of birth mass when it comes to the importance of the interplay between
magnetic effects and rotation during most of the lifetime of a star. As we will discuss in Part III of the course,
star formation will be accompanied by the occurrence of extensive convective regions (see Chapter 5) in a
rotating sphere (see Chapter 9). In such circumstances, a magnetic dynamo gets created. Stars born with
a mass M ≲ 1.3 M⊙ will keep a convective outer envelope while they are burning hydrogen in their core,
which constitutes by far the longest phase of their evolution. All these stars tend to be slow rotators during
almost their entire life. This is due to an efficient slow-down of their rotation caused by magnetic braking.
This phenomenon is induced by the fossil magnetic field originating in their convective envelope during the
process of star formation. This dynamo along with angular momentum loss via a thin stellar wind, which
in case of the Sun causes a mass loss of some $10^{-14}$ M⊙ per year, is very effective in slowing down the
rotation. As we will discuss further in these notes, stars born with a mass M ≳ 1.3 M⊙ will have a radiative
outer envelope at birth. They do not sustain a magnetic dynamo in their envelope at birth and their rotation
is not slowed down by magnetic braking. Either way, the magnetic fields, if any, lead to a Lorentz force that
is far less important than the pressure and gravity forces in the stellar interior, aside from a few exceptional
stars with a very strong magnetic field. When we encounter an evolutionary stage in which the Lorentz force
does play an important role, such as during the star formation process and with the formation and evolution
of neutron stars, we will explicitly state this further on in the lecture notes, but for the bulk of the course it
is fine to ignore it.
The above makes it clear that the assumption of slow rotation is harder to justify for stars born with
M ≳ 1.3 M⊙. Some of these stars even rotate at a considerable fraction of their so-called critical rotation
velocity (see Chapter 7). For such stars the effects of the Coriolis and centrifugal forces can be substantial.
As long as the rotational velocity remains below, say, 50% of the critical velocity, the star is not seriously
flattened at its polar regions by rotation. So at first instance, we will ignore any rotational effects. We do this
because it brings about a huge mathematical simplification. Indeed, when we ignore the centrifugal force,
the star does not deviate from a sphere. Another major reason for not considering rotation in the description
of the stellar structure is that we only have very limited knowledge about the internal rotation laws in stars.
Since we neither have a good star formation theory that explains how the stellar interior rotates at birth nor
a good theory of angular momentum transport in stellar interiors, we start off by ignoring rotation in the
theory. We come back to this issue in Chapter 7.
Figure 3.1: We use the mass within the sphere with radius r as an independent variable for the description
of the equations determining the stellar structure. (From Kippenhahn et al. 2012)
3.2 Coordinates
3.2.1 Eulerian description
In the approximation of a non-rotating non-magnetic spherically symmetric star, all functions are well described
by one spatial coordinate. The distance r, measured from the stellar centre to the fluid element, is a
natural choice for this spatial coordinate. The distance r can vary from r = 0 to r = R, where R is the
stellar radius.
To describe the evolution of the quantities in time, we introduce the time coordinate t. If we use the two
independent variables r and t to compute the stellar structure, we use the so-called Eulerian description. All
other quantities are then determined as a function of r and t. Examples are the density ρ = ρ(r, t), pressure
P = P (r, t), temperature T (r, t), etc.
We now want to describe the effect of the mass distribution in the star to compute the gravitational
force. In order to do so, we define the function m(r, t) as the mass in a sphere with radius r at the time t; m
varies with r and t according to
$$ dm = 4\pi r^2 \rho\, dr - 4\pi r^2 \rho v\, dt. \tag{3.1} $$
The first term on the right-hand side of this equation is the mass in a spherical shell with thickness dr (see
Figure 3.1). This term expresses the variation of m(r, t) as the result of a variation of r at constant t:
$$ \frac{\partial m}{\partial r} = 4\pi r^2 \rho. \tag{3.2} $$
Equation (3.2) is the first of the basic equations that determine the stellar structure in the Eulerian description.
The second term on the right-hand side of Eq. (3.1) represents the spherically symmetric mass flow
through the sphere with a constant radius r, as a result of an outward velocity v in the time span dt:
$$ \frac{\partial m}{\partial t} = -4\pi r^2 \rho v. \tag{3.3} $$
Taking the derivative of Eq. (3.2) with respect to t and that of Eq. (3.3) with respect to r, and equating both these
derivatives, we get the well-known continuity equation for spherical symmetry:
$$ \frac{\partial \rho}{\partial t} = -\frac{1}{r^2}\,\frac{\partial (\rho r^2 v)}{\partial r}. \tag{3.4} $$
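As a sketch of the intermediate step, written out here for convenience and using only Eqs. (3.2) and (3.3), the two mixed derivatives of m read
$$ \frac{\partial}{\partial t}\left(\frac{\partial m}{\partial r}\right) = 4\pi r^2\,\frac{\partial \rho}{\partial t}, \qquad \frac{\partial}{\partial r}\left(\frac{\partial m}{\partial t}\right) = -4\pi\,\frac{\partial (\rho r^2 v)}{\partial r}. $$
Since r and t are independent variables in the Eulerian description, the factor $r^2$ can be taken outside the time derivative. Equating the two expressions and dividing by $4\pi r^2$ then gives Eq. (3.4).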
3.2.2 Lagrangian description
As will become clear later on, it is often more efficient to work with a Lagrangian coordinate instead of
the Eulerian coordinate r. This is a spatial coordinate connected to a fluid element that does not change in
the course of time. In the Lagrangian description, we characterize a fluid element by m, which is the mass
contained in a concentric sphere at a given time t0.
The new independent variables then are m and t and all other quantities are written in terms of these
variables. An example is again the density ρ = ρ(m, t), and now also the distance r of the fluid element to
the stellar centre: r = r(m, t). In the stellar centre we have m = 0 and at the surface m = M , the total
mass of the star. This example already shows a great advantage of the Lagrangian description: as opposed
to the large variation in the radius R during a star’s lifetime, the independent variable m varies, to a good
approximation, over the constant interval [0,M ] for more than 90% of the star’s lifetime.
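There is an unambiguous connection between the coordinates r and m. For the partial derivatives with respect to
both variables, the following formulae apply:
$$ \frac{\partial}{\partial m} = \frac{\partial r}{\partial m}\,\frac{\partial}{\partial r}, \qquad \left(\frac{\partial}{\partial t}\right)_m = \left(\frac{\partial r}{\partial t}\right)_m \frac{\partial}{\partial r} + \left(\frac{\partial}{\partial t}\right)_r. \tag{3.5} $$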
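If we apply the first of these derivatives to m, we get
$$ 1 = \frac{\partial m}{\partial r}\,\frac{\partial r}{\partial m}, $$
which gives the following equation if we fill in the relation (3.2):
$$ \frac{\partial r}{\partial m} = \frac{1}{4\pi r^2 \rho}. \tag{3.6} $$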
This differential equation describes the spatial behaviour of the function r(m, t). It replaces equation
(3.2) and is the first basic stellar structure equation in the Lagrangian description. By substituting this
equation in the first relation of (3.5), we find the following relation between the two operators:
$$ \frac{\partial}{\partial m} = \frac{1}{4\pi r^2 \rho}\,\frac{\partial}{\partial r}. \tag{3.7} $$
The second equation of (3.5) is the main reason to use the Lagrangian description. The time derivative
on the left-hand side describes the change in time of a function when following a given fluid element. The
laws of conservation for time-dependent spherical stars are just simple expressions for this time derivative.
If we were to work in terms of the local time derivative $(\partial/\partial t)_r$, terms with the velocity $(\partial r/\partial t)_m$ would
appear explicitly, which is not the case in the Lagrangian formalism.
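The relation between the two coordinates can be checked on the simplest possible case. The following minimal sketch assumes a homogeneous sphere, for which $m = (4\pi/3)\rho r^3$ can be inverted by hand, and compares a finite-difference estimate of ∂r/∂m with the right-hand side of Eq. (3.6). The function names and the dimensionless values ρ = 1, m = 1 are illustrative choices, not part of the course notes.

```python
import math

def radius_of_mass(m, rho):
    """r(m) for a homogeneous sphere, from m = (4*pi/3) * rho * r**3."""
    return (3.0 * m / (4.0 * math.pi * rho)) ** (1.0 / 3.0)

def dr_dm_numeric(m, rho, h=1e-6):
    """Central finite-difference estimate of dr/dm at fixed time."""
    return (radius_of_mass(m + h, rho) - radius_of_mass(m - h, rho)) / (2.0 * h)

def dr_dm_structure(m, rho):
    """Right-hand side of Eq. (3.6): 1 / (4*pi*r**2*rho)."""
    r = radius_of_mass(m, rho)
    return 1.0 / (4.0 * math.pi * r**2 * rho)

# Dimensionless toy values (assumed for illustration): rho = 1, m = 1
numeric = dr_dm_numeric(1.0, 1.0)
exact = dr_dm_structure(1.0, 1.0)
print(numeric, exact)
```

The two printed numbers should agree to many digits, which is simply Eq. (3.6) evaluated for this special density law.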
3.3 Poisson’s equation
In a spherically symmetric body, the modulus of the gravitational acceleration $\vec{g}$ at a distance r from the centre
does not depend on the mass located at a larger distance than r away from the centre, i.e., $g = |\vec{g}|$ depends
only on r and on the mass that is located in the concentric sphere with radius r, which we have defined
as m:
$$ g = \frac{Gm}{r^2}, \tag{3.8} $$
with $G = 6.673 \times 10^{-11}\ \mathrm{m^3\,kg^{-1}\,s^{-2}}$ the gravitational constant in SI units.
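As a quick numerical illustration of Eq. (3.8), the sketch below evaluates g at the solar surface. The value of G is the one quoted above; the solar mass and radius are nominal values assumed here, and the function name is a hypothetical helper.

```python
G = 6.673e-11        # m^3 kg^-1 s^-2, value quoted in the text
M_SUN = 1.989e30     # kg (assumed nominal solar mass)
R_SUN = 6.957e8      # m  (assumed nominal solar radius)

def gravitational_acceleration(m, r):
    """Eq. (3.8): g = G m / r^2 for the mass m enclosed within radius r."""
    return G * m / r**2

g_surface = gravitational_acceleration(M_SUN, R_SUN)
print(f"g at the solar surface ~ {g_surface:.0f} m/s^2")
```

With these inputs the result is about 274 m s⁻², roughly 28 times the gravitational acceleration at the surface of the Earth.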
In general the gravitational field in a star can be described by a gravitational potential Φ, which is a
solution of Poisson's equation:
$$ \nabla^2 \Phi = 4\pi G\rho, \tag{3.9} $$
in which $\nabla^2$ stands for the Laplace operator. For spherically symmetric configurations, Poisson's equation
simplifies to:
$$ \frac{1}{r^2}\,\frac{\partial}{\partial r}\left(r^2\,\frac{\partial \Phi}{\partial r}\right) = 4\pi G\rho. \tag{3.10} $$
The gravitational acceleration vector $\vec{g}$ points towards the stellar centre and is written in spherical
coordinates as $\vec{g} = (-g, 0, 0)$ with $g = |\vec{g}| > 0$. The vector $\vec{g} = -g\,\vec{e}_r$ is derived from the potential Φ through
the equation $\vec{g} = -\vec{\nabla}\Phi$. For a spherically symmetric star, only the partial derivative with respect to r differs
from zero and we get:
$$ g = \frac{\partial \Phi}{\partial r}. \tag{3.11} $$
Using the expressions (3.11) and (3.8) we get
$$ \frac{\partial \Phi}{\partial r} = \frac{Gm}{r^2}. \tag{3.12} $$
Integration of the expression (3.12) gives
$$ \Phi = \int_0^r \frac{Gm}{r^2}\, dr + \text{constant}. \tag{3.13} $$
Figure 3.2: Representation of a state of hydrostatic equilibrium: the outward pointing pressure force has to
compensate the inward pointing gravitational force. This can only be the case when the force at the inner
boundary of the shell is larger than the one at the outer boundary. (From Kippenhahn et al. 2012)
The integration constant is chosen in such a way that Φ vanishes for r → ∞. Moreover, Φ is minimal at
the stellar centre.
3.4 Conservation of momentum
3.4.1 Hydrostatic equilibrium
We cannot observe structural changes for most of the stars in real time. This implies that the stellar material
is not accelerated noticeably, implying that all forces acting on fluid elements must compensate each other.
This mechanical equilibrium is called hydrostatic equilibrium. Suppose that we are dealing with a gaseous
non-rotating star which has no magnetic field or close companion. In such a case, the only acting forces are the
gravitational force and the pressure force.
Let us consider, for a given time t, a thin spherical mass shell with infinitesimal thickness dr at a
distance r from the stellar centre. Per unit area of the shell, the mass is ρ dr and the weight of the shell
is −gρ dr, which represents the gravitational force that is pointed towards the stellar centre. To prevent
the fluid elements of the shell from being accelerated towards the centre, they have to experience a net
pressure force that is exactly equal in magnitude to the gravitational force, but pointed outwards. This implies that there
is a higher pressure at the inside of the shell ($P_i$) than at the outside of the shell ($P_e$). We refer to Figure 3.2.
The total force per unit area on the shell as a consequence of these different pressures is:
$$ P_i - P_e = -\frac{\partial P}{\partial r}\, dr. \tag{3.14} $$
The sum of the forces as a consequence of gravity and pressure has to be zero, so
$$ \frac{\partial P}{\partial r} + \rho g = 0. \tag{3.15} $$
This equation is rewritten by using Eq. (3.8) to find the equation of hydrostatic equilibrium:
$$ \frac{\partial P}{\partial r} = -\frac{Gm}{r^2}\,\rho. \tag{3.16} $$
It is the second basic equation describing the stellar structure in Eulerian form.
If we choose m to be the independent variable, then we get the Lagrangian form of the hydrostatic
equilibrium by multiplying Eq. (3.16) with $\partial r/\partial m = (4\pi r^2 \rho)^{-1}$ following Eq. (3.6), while using the first
relation of (3.5):
$$ \frac{\partial P}{\partial m} = -\frac{Gm}{4\pi r^4}. \tag{3.17} $$
3.4.2 Simple solutions
Until now, we have only concentrated on the mechanical problem that is linked to the gravitational field and
the pressure stratification in the star. As such, we deduced two basic equations, taking the following form in
the Lagrangian formalism:
$$ \frac{\partial r}{\partial m} = \frac{1}{4\pi r^2 \rho}, \qquad \frac{\partial P}{\partial m} = -\frac{Gm}{4\pi r^4}. \tag{3.18} $$
We will now search for provisional solutions for these differential equations.
We search for a solution for the three unknown functions r, P, ρ and therefore need an additional relation between
at least two of these three quantities. In some special situations we can write the density ρ as a function of
r and P or of m and P. In that case we are dealing with ordinary differential equations because time does
not play an explicit role. An example of this is a homogeneous sphere with ρ = constant. A more realistic
physical example is given by the so-called barotropic solutions where ρ = ρ(P), for example an ideal gas
at a constant temperature. A class of simple barotropic solutions which are important for studying stellar
structures are the polytropes. We will go more deeply into this special class of equations of state later on.
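The homogeneous sphere mentioned above is simple enough to solve both numerically and by hand. The sketch below integrates the Eulerian equation of hydrostatic equilibrium (3.16) inward from the surface, assuming P = 0 there, and compares the resulting central pressure with the closed-form value $P_c = 3GM^2/(8\pi R^4)$ that follows from the same equation for ρ = constant. The solar mass and radius are assumed nominal values and the function names are illustrative; a real star is far from homogeneous, so the number is only an order-of-magnitude indication.

```python
import math

G = 6.673e-11      # m^3 kg^-1 s^-2, value quoted in the text
M_SUN = 1.989e30   # kg (assumed nominal solar mass)
R_SUN = 6.957e8    # m  (assumed nominal solar radius)

def central_pressure_numeric(M, R, n_shells=10_000):
    """Integrate dP/dr = -G m rho / r^2 inward from P(R) = 0 for rho = const."""
    rho = 3.0 * M / (4.0 * math.pi * R**3)
    dr = R / n_shells
    pressure = 0.0
    for i in range(n_shells):
        r_mid = R - (i + 0.5) * dr                 # midpoint of the current shell
        m = 4.0 / 3.0 * math.pi * rho * r_mid**3   # mass enclosed within r_mid
        dP_dr = -G * m * rho / r_mid**2            # Eq. (3.16)
        pressure -= dP_dr * dr                     # step inward by dr
    return pressure

def central_pressure_analytic(M, R):
    """Closed form for a homogeneous sphere: 3 G M^2 / (8 pi R^4)."""
    return 3.0 * G * M**2 / (8.0 * math.pi * R**4)

p_num = central_pressure_numeric(M_SUN, R_SUN)
p_ana = central_pressure_analytic(M_SUN, R_SUN)
print(f"{p_num:.3e} Pa (numeric) vs {p_ana:.3e} Pa (analytic)")
```

For solar values both results are of the order of $10^{14}$ Pa. The midpoint rule is used because the integrand is linear in r for constant density, so the two numbers agree to rounding error.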
In general, though, the density is not only a function of the pressure, but it also depends on the temperature:
ρ = ρ(P, T). A well-known example is that of an ideal gas. If we are dealing with an equation
of state in which the temperature plays a role, it becomes much more difficult to determine the internal
structure of a self-gravitating gas sphere. The mechanical structure then depends on the temperature
stratification, which in turn is coupled to the production and the transport of energy in the star. To describe
this situation, we need to add more equations to solve the stellar structure.
3.4.3 The equation of motion in case of spherical symmetry
The equation of hydrostatic equilibrium (3.16) is a special case of conservation of momentum. When accelerated
motions occur in the spherically symmetric star, the inertia of the fluid elements must be taken into
account. Below, we limit ourselves to the Lagrangian description.
Consider again a thin shell with mass dm at a distance r from the stellar centre. This shell experiences
a force per unit area $f_P$ due to the pressure gradient given by (3.14). This equation can be rewritten as
$$ f_P = -\frac{\partial P}{\partial m}\, dm. \tag{3.19} $$
The gravitational force per unit area acting on the shell is given by
$$ f_g = -g\,\frac{dm}{4\pi r^2} = -\frac{Gm}{r^2}\,\frac{dm}{4\pi r^2}, \tag{3.20} $$
in which we made use of (3.8). If the sum of the pressure force and the gravitational force is non-zero, the
shell will be accelerated according to
$$ \frac{dm}{4\pi r^2}\,\frac{\partial^2 r}{\partial t^2} = f_P + f_g. \tag{3.21} $$
From this and using (3.19) and (3.20), we derive the equation of motion:
$$ \frac{1}{4\pi r^2}\,\frac{\partial^2 r}{\partial t^2} = -\frac{\partial P}{\partial m} - \frac{Gm}{4\pi r^4}. \tag{3.22} $$
If only the pressure gradient were active, there would be an outward acceleration (since ∂P/∂m < 0); with only
the gravitational force at play, there would be an inward acceleration. The equation of motion is reduced to
the equation of hydrostatic equilibrium when all fluid elements are at rest or move in the radial direction at
a constant velocity. When the two terms on the right-hand side of the equation of motion nearly compensate each
other, the assumption of hydrostatic equilibrium is a good approximation, and the star will evolve through
quasi-equilibrium states.
Let us assume in a thought experiment that there is a deviation from hydrostatic equilibrium because
of a sudden disappearance of the pressure force. The inertial term on the left-hand side of the equation of motion
would then need to compensate for the gravitational term on the right-hand side. We define a characteristic
time scale $\tau_{\rm ff}$, connected to the implosion of the star due to the sudden disappearance of the pressure force:
$$ \left|\frac{\partial^2 r}{\partial t^2}\right| \equiv \frac{R}{\tau_{\rm ff}^2}, \tag{3.23} $$
in which R is the radius of the star. Using the equation of motion (3.22), we can write $\tau_{\rm ff}$ as:
$$ \tau_{\rm ff} \approx \left(\frac{R}{g}\right)^{1/2}. \tag{3.24} $$
This shows that τff is a mean value for the free-fall time over a distance of the order of the stellar radius
caused by the sudden disappearance of the pressure force.
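In a similar way, we can define the characteristic timescale $\tau_{\rm expl}$ for the expansion of the star caused by
the sudden disappearance of the gravitational force:
$$ \left|\frac{\partial^2 r}{\partial t^2}\right| = \frac{R}{\tau_{\rm expl}^2} = 4\pi r^2 \left|\frac{\partial P}{\partial m}\right| = \frac{1}{\rho}\left|\frac{\partial P}{\partial r}\right| \approx \frac{P}{\rho R}, \tag{3.25} $$
in which we have replaced $|\partial P/\partial r|$ by P/R. This yields
$$ \tau_{\rm expl} \approx R\left(\frac{\rho}{P}\right)^{1/2}. \tag{3.26} $$
Because $\sqrt{P/\rho}$ is a measure for the average speed of sound in the stellar interior, we can interpret $\tau_{\rm expl}$ as
the mean time it takes a sound wave to travel from the stellar centre to the stellar surface.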
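When a star is in a state of hydrostatic equilibrium, the two terms on the right-hand side of the equation of motion (3.22) are
approximately equal in magnitude, yielding $\tau_{\rm ff} \approx \tau_{\rm expl}$. We define the hydrostatic timescale $\tau_{\rm hydro}$ as the time needed to restore
hydrostatic equilibrium in the star after a small perturbation. Using $g \approx GM/R^2$, Eq. (3.24) yields
$$ \tau_{\rm hydro} \approx \left(\frac{R^3}{GM}\right)^{1/2} \approx \frac{1}{2}\,(G\bar{\rho})^{-1/2}, \tag{3.27} $$
with $\bar{\rho}$ the mean density of the star.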
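To get a feeling for the size of the hydrostatic timescale, the sketch below evaluates both forms of Eq. (3.27) for the Sun. The solar mass and radius are assumed nominal values, the density in the second form is taken to be the mean density $M/(\tfrac{4}{3}\pi R^3)$, and the function names are illustrative.

```python
import math

G = 6.673e-11      # m^3 kg^-1 s^-2, value quoted in the text
M_SUN = 1.989e30   # kg (assumed nominal solar mass)
R_SUN = 6.957e8    # m  (assumed nominal solar radius)

def tau_hydro(M, R):
    """First form of Eq. (3.27): sqrt(R^3 / (G M))."""
    return math.sqrt(R**3 / (G * M))

def tau_hydro_from_density(M, R):
    """Second form of Eq. (3.27): 0.5 / sqrt(G * mean density)."""
    mean_density = M / (4.0 / 3.0 * math.pi * R**3)
    return 0.5 / math.sqrt(G * mean_density)

t1 = tau_hydro(M_SUN, R_SUN)
t2 = tau_hydro_from_density(M_SUN, R_SUN)
print(f"tau_hydro ~ {t1 / 60:.0f} min, density form ~ {t2 / 60:.0f} min")
```

Both forms give roughly half an hour for the Sun. They differ by a few per cent only because the factor 1/2 in Eq. (3.27) is a rounded value of $\sqrt{3/(4\pi)}$.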
The equations mentioned above that describe the stellar structure are special cases of the equations
known from hydrodynamics and are only valid for spherically symmetric bodies.
3.5 Conservation of energy
3.5.1 The virial theorem
The virial theorem is not so important for solving most physical problems. In the study of stellar structures,
however, it is of major importance, since it connects two dominant energy reservoirs and helps to deduce
predictions and interpretations for certain evolutionary stages in the life of the star.
If we multiply the left-hand side of the Lagrangian form of the hydrostatic equilibrium (3.17) by $4\pi r^3$
and integrate over the mass in the interval [0, M] from the centre to the stellar surface, we get, after integration by parts,
$$ \int_0^M 4\pi r^3\,\frac{\partial P}{\partial m}\, dm = \left[4\pi r^3 P\right]_0^M - \int_0^M 12\pi r^2\,\frac{\partial r}{\partial m}\,P\, dm. \tag{3.28} $$
The term between square brackets vanishes as r = 0 in the stellar centre and P = 0 at the stellar surface.
On the other hand, we can reduce the integrand of the second term on the right-hand side by using (3.6) to
obtain 3P/ρ. Finally we get
$$ \int_0^M \frac{Gm}{r}\, dm = 3\int_0^M \frac{P}{\rho}\, dm, \tag{3.29} $$
where we get the left-hand side of (3.29) by substituting the right-hand side of (3.17) for ∂P/∂m in the left-hand side of (3.28).
Both sides of equation (3.29) have the dimension of an energy. We define the gravitational energy $E_g$ of the
star by
$$ E_g \equiv -\int_0^M \frac{Gm}{r}\, dm. \tag{3.30} $$
Let us consider a unit mass at a position r. The potential energy of this unit mass, as a consequence of
the gravitational field of the mass m situated within a radius r, is −Gm/r. We can see that $E_g$
is the potential energy of all fluid elements dm of the star, which is normalised to zero at infinity. An energy
$-E_g$ (> 0) is necessary to disperse all mass elements to infinity, while this amount of energy is released when
an infinitely extended cloud contracts into a star.
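Equation (3.29) can be verified numerically on a model whose pressure profile is known in closed form. The sketch below assumes a homogeneous sphere in hydrostatic equilibrium, for which Eq. (3.16) with P(R) = 0 gives $P(r) = \tfrac{2\pi}{3} G\rho^2 (R^2 - r^2)$, and works in units with G = M = R = 1. Both the choice of model and the function name are illustrative assumptions.

```python
import math

def virial_integrals(n_shells=2000):
    """Both sides of Eq. (3.29) for a homogeneous sphere with G = M = R = 1."""
    G, M, R = 1.0, 1.0, 1.0
    rho = 3.0 * M / (4.0 * math.pi * R**3)
    dr = R / n_shells
    gravitational = 0.0   # integral of (G m / r) dm
    thermal = 0.0         # 3 * integral of (P / rho) dm
    for i in range(n_shells):
        r = (i + 0.5) * dr                                   # shell midpoint
        m = 4.0 / 3.0 * math.pi * rho * r**3                 # enclosed mass
        dm = 4.0 * math.pi * r**2 * rho * dr                 # Eq. (3.2)
        pressure = 2.0 * math.pi / 3.0 * G * rho**2 * (R**2 - r**2)
        gravitational += G * m / r * dm
        thermal += 3.0 * pressure / rho * dm
    return gravitational, thermal

lhs, rhs = virial_integrals()
print(lhs, rhs)
```

In these units both integrals come out close to 3/5, which corresponds to the familiar result $E_g = -\tfrac{3}{5} GM^2/R$ for a homogeneous sphere.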
When all fluid elements within a star expand or contract together, $E_g$ will gradually increase or
decrease, respectively. This must then also be true for the integral on the right-hand side of Eq. (3.29). Here,
we stress that the contraction or expansion must take place on a timescale that is much longer than $\tau_{\rm hydro}$ since
otherwise Eq. (3.29) will not be valid.
To discover the meaning of the term on the right-hand side of Eq. (3.29), we consider an ideal gas:
$$ \frac{P}{\rho} = \frac{\mathcal{R}}{\mu}\,T = (c_P - c_v)\,T = (\gamma - 1)\,c_v T. \tag{3.31} $$
For a mono-atomic gas γ = 5/3 and we get $P/\rho = \tfrac{2}{3}u$ with $u = c_v T$ the internal energy of the ideal gas
per unit mass. If we define
$$ E_i \equiv \int_0^M u\, dm \tag{3.32} $$
as the total internal energy of the star, we deduce from Eq. (3.29), in the case of a mono-atomic ideal gas,
$$ E_g = -2E_i. \tag{3.33} $$
This result is the virial theorem for a mono-atomic ideal gas.
For a general equation of state we define the quantity ζ by
$$ \zeta u \equiv \frac{3P}{\rho}. \tag{3.34} $$
For an ideal gas we have ζ = 3(γ − 1). In the mono-atomic case (γ = 5/3) this gives ζ = 2. For a gas
solely composed of photons we have γ = 4/3, $P = aT^4/3$ and $u\rho = aT^4$ with a the radiation constant,
leading to ζ = 1. When ζ is constant in the star, Eq. (3.29) gives the more general result that
$$ \zeta E_i + E_g = 0. \tag{3.35} $$
We now define the total energy W of the star as $W \equiv E_i + E_g$, for which W < 0 for a gravitationally bound
system. Based on (3.35) we then get
$$ W = (1 - \zeta)\,E_i = \frac{\zeta - 1}{\zeta}\,E_g. \tag{3.36} $$
From this we deduce that the total energy of a gas of photons is zero.
If the star expands or contracts in a way so as to maintain the hydrostatic equilibrium, then $E_g$ and $E_i$
will vary and the total energy will change. The gas will then radiate energy. If we define the total energy
loss by radiation per unit of time as the luminosity L of the star, we can deduce, following the conservation
of energy, that (dW/dt) + L = 0, which, via (3.36), implies that
$$ L = (\zeta - 1)\,\frac{dE_i}{dt} = -\frac{\zeta - 1}{\zeta}\,\frac{dE_g}{dt}. \tag{3.37} $$
When all mass shells contract simultaneously, $dE_g/dt < 0$ and we get $L = dE_i/dt = -\tfrac{1}{2}\,dE_g/dt > 0$
for a mono-atomic ideal gas. This means that half of the energy that is released as a consequence of the
contraction is radiated and the other half is used to heat up the star.
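The energy bookkeeping of Eq. (3.37) can be made explicit for any constant ζ. The sketch below returns the fraction of the released gravitational energy that is radiated, (ζ − 1)/ζ, and the fraction that goes into internal energy, 1/ζ, which follows from Eq. (3.35). The function name is a hypothetical helper introduced for illustration.

```python
def contraction_energy_split(zeta):
    """Fractions of the released gravitational energy that are radiated and stored."""
    if zeta <= 0:
        raise ValueError("zeta must be positive")
    radiated = (zeta - 1.0) / zeta   # L / (-dEg/dt), from Eq. (3.37)
    stored = 1.0 / zeta              # (dEi/dt) / (-dEg/dt), from Eq. (3.35)
    return radiated, stored

print(contraction_energy_split(2.0))  # mono-atomic ideal gas: (0.5, 0.5)
print(contraction_energy_split(1.0))  # pure photon gas: (0.0, 1.0)
```

The two fractions always add up to one. For ζ = 1 nothing is radiated, which is consistent with the vanishing total energy of a photon gas found from Eq. (3.36).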
Equation (3.37) shows that L is of the order of $|dE_g/dt|$. This way we can define a characteristic time
scale
$$ \tau_{\rm HK} \equiv \frac{|E_g|}{L} \approx \frac{E_i}{L}, \tag{3.38} $$
which is called the Helmholtz-Kelvin time scale (referring to the two physicists who deduced this as the
evolution timescale for a contracting or expanding star). A rough estimate of $|E_g|$ is
$$ |E_g| \approx \frac{G\bar{m}^2}{\bar{r}} \approx \frac{GM^2}{2R}, \tag{3.39} $$
in which $\bar{m}$ and $\bar{r}$ represent average values for m and r over the star (which we have replaced by M/2 and
R/2). This way we get
$$ \tau_{\rm HK} \approx \frac{GM^2}{2RL}. \tag{3.40} $$
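As an order-of-magnitude illustration of Eq. (3.40), the sketch below evaluates the Helmholtz-Kelvin time scale for the Sun. The solar mass, radius and luminosity are assumed nominal values, and the function name is illustrative.

```python
G = 6.673e-11            # m^3 kg^-1 s^-2, value quoted in the text
M_SUN = 1.989e30         # kg (assumed nominal solar mass)
R_SUN = 6.957e8          # m  (assumed nominal solar radius)
L_SUN = 3.828e26         # W  (assumed nominal solar luminosity)
SECONDS_PER_YEAR = 3.156e7

def tau_helmholtz_kelvin(M, R, L):
    """Eq. (3.40): G M^2 / (2 R L), in seconds for SI inputs."""
    return G * M**2 / (2.0 * R * L)

tau_hk_years = tau_helmholtz_kelvin(M_SUN, R_SUN, L_SUN) / SECONDS_PER_YEAR
print(f"tau_HK ~ {tau_hk_years:.1e} yr")
```

The result is of the order of $10^7$ years, far longer than the hydrostatic timescale of about half an hour but much shorter than the age of the Sun.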
During certain stages in the life of the star Eg is the main energy source and the star evolves on a timescale
τHK. For a detailed description of stellar evolution we refer to part III of this course, but we now already
stress why the virial theorem, together with the energy transport equation (see Chapter 5), is of major im-
portance for the star’s life.
The temperature of the star decreases from the inner regions towards the outer regions. This implies
that energy is transported outwards through the star and is radiated away into space, i.e., energy is taken
away from the stellar interior. If there is no nuclear energy source any more, e.g. when all H in the stellar core has been
converted into He, the star can only deliver the necessary energy by contracting. This contraction happens
slowly, so that the star stays in hydrostatic equilibrium during the contraction. The star has no other choice
but to contract because shrinking is the only way to cover the energy loss. It does so on the
Helmholtz-Kelvin timescale. The timescale that is necessary to recover from a pressure perturbation is much shorter,
namely $\tau_{\rm hydro}$. This means that, during the slow contraction process of the star, a new pressure equilibrium
can always be established quasi-instantaneously: during the contraction the virial theorem stays valid. Thus,
when the star contracts, half of the released potential energy is radiated while the other half is used to increase
the temperature of the gas (for a mono-atomic ideal gas). Due to the increase in temperature, the temperature gradient is raised, causing
even more energy to be radiated, and consequently stronger contraction of the star is needed. Due to this vicious
circle, the stellar core keeps on shrinking and getting hotter until the temperature is high enough for the next
fusion process (for example, helium burning can start when $T \approx 10^8$ K). Afterwards the star can again radiate
without shrinking for a long period of time.
Figure 3.3: Representation of the quantity l, which depicts the amount of energy that radiates per second
through a sphere with radius r. (From Kippenhahn et al. 2012)
3.5.2 Conservation of energy in stars
We define l(r) as the net amount of energy, integrated over all frequencies, that is radiated per second
through a sphere with radius r. We assume there is no infinitely high energy source in the stellar centre.
This way the function l is zero in the stellar centre. It also equals the total luminosity L of the star at the
stellar surface. Between r = 0 and r = R, l is a complicated function that depends on the distribution of all
energy sources occurring in the different stellar layers. This way, l encloses the energy that is transported
by radiation as well as conduction and convection. In the following chapter, we will focus on these means
of energy transport, all of which require a temperature gradient. In the function l, we do not take into
account a possible energy flux as a result of neutrinos. After all, they have a negligible interaction with the
stellar material and we shall treat the neutrino flux, which does not require a temperature gradient, always
separately.
Local conservation of energy
Consider a spherically symmetric mass shell at radius r, with thickness dr and mass dm. Denote the energy per second
that enters the shell at its inner side by l and the energy per second that leaves through the outer side of the shell by l + dl
(see Figure 3.3). The surplus dl can be provided by nuclear reactions, by cooling, or by the contraction or
expansion of the shell. In a stationary situation dl only originates from the release of energy coming from
nuclear reactions. If we represent the nuclear energy released per unit mass and per unit time by ε, we get
$$ dl = 4\pi r^2 \rho\,\varepsilon\, dr = \varepsilon\, dm \tag{3.41} $$
or
$$ \frac{\partial l}{\partial m} = \varepsilon. \tag{3.42} $$
The quantity ε in general depends on the temperature, the density and the abundances of the different react-
ing nuclear particles.
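In the stationary case, Eq. (3.42) means that l(m) is just the running integral of ε over mass, starting from l = 0 in the centre. The sketch below illustrates this with a discretised toy star. The ten equal-mass shells and the values of ε are purely hypothetical numbers chosen for illustration, not a stellar model, and the function name is an assumption.

```python
def luminosity_profile(shell_masses, epsilon):
    """Cumulative l at the outer edge of each shell, from dl = epsilon * dm (Eq. 3.41)."""
    profile = [0.0]                      # l = 0 in the stellar centre
    for dm, eps in zip(shell_masses, epsilon):
        profile.append(profile[-1] + eps * dm)
    return profile

# Hypothetical toy star: 10 equal-mass shells, energy generation only in the inner two
shell_masses = [0.1] * 10                # in units of the total mass M
epsilon = [5.0, 5.0] + [0.0] * 8         # arbitrary units per unit mass
l = luminosity_profile(shell_masses, epsilon)
print(l[0], l[2], l[-1])
```

Outside the energy-producing core the profile stays flat: in this stationary toy model l already equals the surface luminosity at the edge of the second shell.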
For a non-stationary shell, dl can differ from zero, even when no nuclear reactions take place. Indeed,
the shell can alter its internal energy and, moreover, exchange mechanical energy with neighbouring shells.
In that case we write instead of (3.42)
$$ dq = \left(\varepsilon - \frac{\partial l}{\partial m}\right) dt, \tag{3.43} $$
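in which dq is the heat added to the shell per unit mass. If we express dq by means of the first law of
thermodynamics, we get
$$ \frac{\partial l}{\partial m} = \varepsilon - \frac{\partial u}{\partial t} - P\,\frac{\partial v}{\partial t} = \varepsilon - \frac{\partial u}{\partial t} + \frac{P}{\rho^2}\,\frac{\partial \rho}{\partial t}, \tag{3.44} $$
where v = 1/ρ here denotes the specific volume, not the velocity of Eq. (3.1).
Keeping the thermodynamic relation (2.18) in mind, we can write this expression in terms of the pressure
and the temperature:
$$ \frac{\partial l}{\partial m} = \varepsilon - c_P\,\frac{\partial T}{\partial t} + \frac{\delta}{\rho}\,\frac{\partial P}{\partial t}. \tag{3.45} $$
This equation is the third equation that describes the stellar structure.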
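Often the terms containing a time derivative in Eq. (3.45) are treated together in a so-called source
function $\varepsilon_g$:
$$ \varepsilon_g \equiv -T\,\frac{\partial s}{\partial t} = -c_P\,\frac{\partial T}{\partial t} + \frac{\delta}{\rho}\,\frac{\partial P}{\partial t} = -c_P T\left(\frac{1}{T}\,\frac{\partial T}{\partial t} - \frac{\nabla_{\rm ad}}{P}\,\frac{\partial P}{\partial t}\right), \tag{3.46} $$
where we have used ds = dq/T and Eq. (2.20) for $\nabla_{\rm ad}$.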
Let us now look into the energy change due to neutrinos. Neutrinos can be produced in large amounts as a
by-product of nuclear reactions (see later on, description of the different burning cycles). However,
the mean free path of a neutrino in a typical stellar medium is about 100 parsec! Even in the stellar core
of a main-sequence star they still have a mean free path of about 3000 R⊙. The stellar material is thus
fully transparent for neutrinos and therefore they can easily transport the energy they are carrying with them
to the stellar surface (this assumption is no longer valid in the final stages of the life of a star). This is
the reason why we treat the influence of neutrinos separately from the energy fluxes requiring a temperature
gradient. The only fluid elements that are influenced by neutrinos are those where neutrinos are formed. In
such regions, the neutrinos can cause a decrease of energy. We define $\varepsilon_\nu$ (> 0) as the energy that is taken
from the stellar material per unit mass and per unit time in the form of neutrinos. The total equation for local
conservation of energy then is
$$ \frac{\partial l}{\partial m} = \varepsilon - \varepsilon_\nu + \varepsilon_g. \tag{3.47} $$
The energy that is transported per second by neutrinos is called the neutrino luminosity and is given by
$$ L_\nu \equiv \int_0^M \varepsilon_\nu\, dm. \tag{3.48} $$
As mentioned already l = 0 in the stellar centre and l = L at the stellar surface. For an intermediate
value of r, l is not necessarily monotonically increasing and can even become higher than L or negative. An
example of this is an expanding star where L is lower than the energy produced by the nuclear reactions in
the central parts due to the expansion ($\varepsilon_g < 0$). A strong neutrino loss can induce l < 0 in some stellar
layers.
Since neutrinos can leave the star without any problems after their creation by numerous nuclear
reactions, they provide direct information about these reactions and hence on the physical conditions in the
stellar core. This offers a unique opportunity to probe stellar interiors. However, due to their very large
mean free path it is difficult to detect neutrinos in laboratories on Earth. Nevertheless, it is possible to catch
neutrinos produced by hydrogen burning in the Sun. In one of the successful detections the neutrinos are
caught by the reaction
$$ \nu_e + {}^{37}\mathrm{Cl} \rightarrow e^- + {}^{37}\mathrm{Ar}. \tag{3.49} $$
One of the original successful detectors was based on a volume filled with 380 000 litres of C$_2$Cl$_4$ (a common
cleaning fluid). Despite this gigantic volume only one neutrino was detected every other day. This is much
less than the number of neutrinos that is predicted using the solar models. This problem was known for
30 years as the solar neutrino problem. In another experiment, one looked at the scattering of neutrinos off
electrons in a volume of 680 tons of water. In contrast with the Cl experiment, the direction of the neutrinos is
measured, from which it could be deduced that they are produced by the Sun. Also in this case, the number
of detections was far too low compared to the theoretical predictions.
A solution to the problem was found after realising that the detectors were insensitive to the less
common types of neutrinos. The Cl and electron-scattering experiments are indeed only sensitive to a small fraction of
the total neutrino production in the Sun, namely only the high-energy electron-neutrinos. The above
experiments are not sensitive to mu- and tau-neutrinos. Two additional experiments are sensitive to the majority
of the produced neutrinos. They are based on a reaction of the neutrino with $^{71}$Ga. The gallium
experiments gave results that were closer to the theoretical expectations, but until 2001 considerable differences
remained. The solution was found thanks to hundreds of researchers active at the Sudbury Neutrino
Observatory in Canada, who developed a new generation of neutrino detectors. They could confirm that a part
of the solar neutrinos change their character from electron-neutrinos into mu- or tau-neutrinos by the time
they arrive on Earth. Estimates of the sum of the three types of neutrinos correspond fairly well to models
of the Sun. The leading team of scientists was rewarded with the Nobel Prize in Physics for having brought
a solution to the solar neutrino problem.
Given the challenges of detecting the neutrinos coming from the Sun, it is of course even more difficult to
detect neutrinos produced by other stars. However, neutrinos produced during supernova explosions have
been detected. The famous supernova SN 1987A in the Large Magellanic Cloud gave rise to the detection
of 20 of its emitted neutrinos in two different detectors, both situated in the Northern Hemisphere. This
offered a very precise test of nuclear reactions during supernova explosions (see later on).
Global conservation of energy
When describing the virial theorem we have limited ourselves to taking into account the internal energy $E_i$
and the gravitational energy $E_g$. We neglected the nuclear energy as well as the energy of the neutrinos and
the kinetic energy of the fluid elements (for example due to stellar oscillations). If we now redefine the total
energy of the star as $W = E_{\rm kin} + E_g + E_i + E_n$ with $E_n$ the nuclear energy content of the whole star, then
the equation that describes the global conservation of energy is given by:
$$ \frac{d}{dt}\left(E_{\rm kin} + E_g + E_i + E_n\right) + L + L_\nu = 0. \tag{3.50} $$
3.5.3 The different time scales
Suppose that the luminosity of the star is only caused by the release of nuclear energy. If L is constant, this
energy loss can take place during the nuclear time scale that is defined by
$$ \tau_n \equiv \frac{E_n}{L}. \tag{3.51} $$
$E_n$ represents the energy reservoir that can be released by nuclear reactions. The main reactions during the largest
fraction of the stellar life are those that achieve the fusion of four $^1$H nuclei into one $^4$He nucleus. This
hydrogen burning releases an energy of $6.3 \times 10^{18}$ erg g$^{-1}$, which corresponds to a mass deficit of ∼ 0.7%
(it is equal to the total mass of four protons minus the mass of a helium nucleus, divided by the total mass
of four protons; see Appendix B). The nuclear time scale shows the total life span a star can have based on
the production of nuclear energy. Later we will show that the luminosity of a star is a strongly increasing
function of the stellar mass. Because of this, the nuclear time scale decreases very fast with increasing mass.
A star with initial mass of 30 M⊙, for example, can only live for about 5 million years while a star with half
a solar mass barely had enough time to evolve in the current Universe with its age of ∼ 13.79 ± 0.02 billion
years.
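A rough solar estimate of the nuclear time scale can be sketched as follows. The energy release per gram is the value quoted above; the solar mass and luminosity are assumed nominal values in cgs units, and the fraction of the stellar mass that actually takes part in hydrogen burning (set to 10% here) is an assumption made purely for illustration. The sketch also converts the energy release into the corresponding fractional mass deficit through E = mc².

```python
Q_H = 6.3e18             # erg/g released by hydrogen burning (value from the text)
C_LIGHT = 2.998e10       # cm/s
M_SUN = 1.989e33         # g     (assumed nominal solar mass)
L_SUN = 3.828e33         # erg/s (assumed nominal solar luminosity)
SECONDS_PER_YEAR = 3.156e7
BURNED_FRACTION = 0.1    # assumed fraction of the stellar mass that is burned

# Fractional mass deficit implied by the energy release per gram
mass_deficit = Q_H / C_LIGHT**2

def tau_nuclear(M, L, burned_fraction=BURNED_FRACTION):
    """Eq. (3.51) with E_n approximated by burned_fraction * M * Q_H."""
    return burned_fraction * M * Q_H / L

tau_n_years = tau_nuclear(M_SUN, L_SUN) / SECONDS_PER_YEAR
print(f"mass deficit ~ {mass_deficit:.2%}, tau_n ~ {tau_n_years:.1e} yr")
```

Under these assumptions the nuclear time scale of the Sun comes out at about $10^{10}$ years. The result scales linearly with the assumed burned fraction, so only its order of magnitude is meaningful.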
The relation between the different time scales for the Sun is (see exercises):
$$ \tau_n \gg \tau_{\rm HK} \gg \tau_{\rm hydro}. \tag{3.52} $$
This relation is valid for all stars for which hydrogen or helium burning is the main energy source. The
relation between these time scales helps us to simplify the equation that expresses the conservation of energy.
Let us consider the four terms that occur in Eq. (3.45) for a star of which the properties are changing considerably
on a time scale τ, which can be small or large compared to $\tau_{\rm HK}$. A cause of this change could be, for
example, the depletion of a certain nuclear fuel inside the core. For an ideal gas we can easily approximate
the terms in Eq. (3.45) by:
$$ \begin{aligned}
\left|\frac{\partial l}{\partial m}\right| &\approx \frac{L}{M} \approx \frac{E_i}{\tau_{\rm HK} M}, \\
\varepsilon &\approx \frac{L}{M} = \frac{E_n}{M \tau_n} \approx \frac{E_i}{\tau_{\rm HK} M}, \\
\left|c_P\,\frac{\partial T}{\partial t}\right| &\approx \frac{c_P T}{\tau}, \\
\left|\frac{\delta}{\rho}\,\frac{\partial P}{\partial t}\right| &\approx \frac{\mathcal{R}}{\mu}\,\frac{T}{\tau} \approx \frac{c_P T}{\tau} \approx \frac{E_i}{\tau M}.
\end{aligned} \tag{3.53} $$
In the case of $\tau \gg \tau_{\rm HK}$ the values of the last two expressions given in Eqs (3.53) are far below those of
the first two expressions and we can neglect the time-dependent terms in the energy equation ($|\varepsilon_g| \ll \varepsilon$). The
latter then reduces to $\partial l/\partial m = \varepsilon$ as in (3.42). This approximation is valid when the burning of hydrogen
or helium determines the stellar evolution ($\tau = \tau_n$) and implies a huge simplification when computing stellar
models. These models are in full mechanical and thermal equilibrium.
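The size of the neglected terms can be made explicit by dividing the last estimate in (3.53) by the second one. This is only an order-of-magnitude sketch, using nothing beyond the approximations listed in (3.53):
$$ \frac{|\varepsilon_g|}{\varepsilon} \approx \frac{E_i/(\tau M)}{E_i/(\tau_{\rm HK} M)} = \frac{\tau_{\rm HK}}{\tau}. $$
For evolution on the nuclear time scale, $\tau = \tau_n$, this ratio is $\tau_{\rm HK}/\tau_n \ll 1$ by virtue of (3.52), which is why the time-dependent terms can be dropped in that regime.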
In contrast, if $\tau \ll \tau_{\rm HK}$, the values of the right-hand sides of the last two equations given in (3.53) are
large compared to those of the first two equations. This means that the time-dependent terms in the energy
equation compensate each other to a very good approximation, implying that dq/dt ≈ 0. In this case, we are
dealing with a quasi-adiabatic change. An example of this is a star pulsating on a time scale $\tau \ll \tau_{\rm HK}$. The
variable luminosity of a pulsating star is the consequence of variations in $\varepsilon_g$ and not in ε. For an extensive
description of pulsating stars we refer to the course Asteroseismology in the Leuven Master of Astronomy
and Astrophysics, while the course Theory of Stellar Oscillations fully describes the theoretical aspects of
this research field.
As the reader will have noticed, the determination of the time scales is somewhat arbitrary. We could just as
well have taken R or R/10 as the average distance to use in the expressions, instead of R/2; a similar
remark is valid for the average mass. However, it is not the intention to determine accurate values for the
time scales but rather to have an idea of their order of magnitude.
Finally, when deducing the relation between the different time scales we have implicitly assumed
that the stellar quantities change linearly. However, when only certain parts of the star have to be considered
because of non-uniform variations, the above-mentioned argumentation is no longer appropriate because
local rather than global time scales should be taken into consideration.
Chapter 4
Additional relevant equations of state
The temperature does not occur in Eqs (3.18). For certain equations of state, this allows us to separate these
two equations from the thermo-energetic equations that are also necessary to define the stellar structure. We
will now discuss two such equations of state that are important for the life of the star.
4.1 Polytropes
We take a star in hydrostatic equilibrium and use the Eulerian description. For a time-independent stellar
model the gravitational potential has to fulfill the following equations:
$$
\frac{dP}{dr} = -\rho\frac{d\Phi}{dr}, \qquad
\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d\Phi}{dr}\right) = 4\pi G\rho.
\tag{4.1}
$$
When ρ does not depend on T, i.e. ρ = ρ(P), this relation can be substituted into Eqs (4.1), which then form
a system of two equations for the two unknowns P and Φ. These equations can be solved without having to
use the equation describing the energy transport (see next chapter).
We assume a simple relation between the pressure and the density of the form
$$
P = K\rho^{\gamma} = K\rho^{1+\frac{1}{n}},
\tag{4.2}
$$
in which K, γ and n are constants. An equation of state of the form (4.2) is called a polytrope. K is the
polytropic constant and γ the polytropic exponent. Instead of γ, often the polytropic index n is used, which
is defined as n ≡ 1/(γ − 1).
In general K is constant for one specific star, but it can take different values for different stars. For an
isothermal ideal gas, the equation of state can be written as $P = (\mathcal{R}T_0/\mu)\rho$. In this case, we are
dealing with a polytrope with $K = \mathcal{R}T_0/\mu$, γ = 1, n = ∞. For an ideal mono-atomic gas with negligible
radiation pressure, ∇ad = 2/5. This means that $T \sim P^{2/5}$. Furthermore, in this case µ = constant, and
therefore $T \sim P/\rho$, so we finally get $P \sim \rho^{5/3}$. This is again a polytrope, this time with γ = 5/3, n = 3/2.
A homogeneous gaseous sphere can be seen as a special case of (4.2) for γ = ∞, n = 0. We thus conclude
that polytropes indeed occur in the case of simple equations of state that already have the form (4.2) as well
as in the case of an ideal gas when an extra relation between temperature and pressure can be deduced.
For a polytropic relation (4.2), the first equation of the system (4.1) can be transformed into
$$
\frac{d\Phi}{dr} = -\gamma K\rho^{\gamma-2}\frac{d\rho}{dr}.
\tag{4.3}
$$
For γ ≠ 1 this equation can be integrated to give
$$
\rho = \left(\frac{-\Phi}{(n+1)K}\right)^{n},
\tag{4.4}
$$
in which we have used the definition of n and the integration constant was chosen such that Φ = 0 at the
stellar surface. When we substitute (4.4) in the second equation of (4.1), we obtain an ordinary differential
equation for Φ:
$$
\frac{d^2\Phi}{dr^2} + \frac{2}{r}\frac{d\Phi}{dr} = 4\pi G\left(\frac{-\Phi}{(n+1)K}\right)^{n}.
\tag{4.5}
$$
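The integration that leads from (4.3) to (4.4) is short enough to write out. The sketch below uses only γ = 1 + 1/n from (4.2) and the choice Φ = 0 where ρ = 0 (the stellar surface); the integration constant C is a label introduced here for illustration.

```latex
\begin{align*}
\Phi &= -\gamma K \int \rho^{\gamma-2}\, d\rho
      = -\frac{\gamma}{\gamma-1}\, K \rho^{\gamma-1} + C, \\
\frac{\gamma}{\gamma-1} &= n+1, \qquad \gamma - 1 = \frac{1}{n}, \qquad
  C = 0 \quad (\Phi = 0 \ \text{where}\ \rho = 0), \\
\Phi &= -(n+1)\, K \rho^{1/n}
  \quad\Longrightarrow\quad
  \rho = \left(\frac{-\Phi}{(n+1)K}\right)^{n}.
\end{align*}
```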
We now define the dimensionless variables z and w by
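$$
z = Ar \quad\text{with}\quad
A^2 = \frac{4\pi G}{(n+1)^n K^n}\,(-\Phi_c)^{n-1} = \frac{4\pi G}{(n+1)K}\,\rho_c^{\frac{n-1}{n}}, \qquad
w = \frac{\Phi}{\Phi_c} = \left(\frac{\rho}{\rho_c}\right)^{1/n},
\tag{4.6}
$$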
where the subscript “c” indicates the stellar centre. In the centre, we have r = z = 0, Φ = Φc, ρ = ρc and
so w = 1. Substituting these variables into (4.5), we get
$$
\frac{d^2w}{dz^2} + \frac{2}{z}\frac{dw}{dz} + w^n = 0,
\tag{4.7}
$$
which again can be transformed into
$$
\frac{1}{z^2}\frac{d}{dz}\left(z^2\frac{dw}{dz}\right) + w^n = 0.
\tag{4.8}
$$
Equation (4.8) is the Lane-Emden equation. We search for solutions of this equation that remain finite in
the stellar centre. This condition is met when dw/dz(0) = 0. In general, we have to determine solutions of
the Lane-Emden equation numerically, since only for n = 0, 1, 5 analytic solutions exist. The function w is
represented in Figure 4.1 for the two cases n = 3 and n = 3/2.
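A numerical solution of (4.7) can be sketched in a few lines. The example below is a minimal illustration, not the method used to produce Figure 4.1: it assumes a fixed-step fourth-order Runge-Kutta scheme, starts slightly off-centre with the series w ≈ 1 − z²/6 + n z⁴/120 to avoid the 2/z singularity, and stops at the first zero of w. The function name `lane_emden`, the step size and the cut-off `z_max` are choices made here for illustration.

```python
import math

def lane_emden(n, h=1e-3, z_max=50.0):
    """Integrate w'' + (2/z) w' + w^n = 0 with w(0) = 1, w'(0) = 0.

    Returns (z_n, -z_n^2 w'(z_n)) at the first zero z_n of w.
    Raises ValueError if no zero is found below z_max (e.g. for n = 5).
    """
    def rhs(z, w, u):
        # u = dw/dz; w is clipped at 0 so that w**n stays real for non-integer n
        return u, -max(w, 0.0) ** n - 2.0 * u / z

    # Series start close to the centre
    z = h
    w = 1.0 - z**2 / 6.0 + n * z**4 / 120.0
    u = -z / 3.0 + n * z**3 / 30.0

    while z < z_max:
        k1w, k1u = rhs(z, w, u)
        k2w, k2u = rhs(z + h / 2, w + h / 2 * k1w, u + h / 2 * k1u)
        k3w, k3u = rhs(z + h / 2, w + h / 2 * k2w, u + h / 2 * k2u)
        k4w, k4u = rhs(z + h, w + h * k3w, u + h * k3u)
        w_new = w + h / 6 * (k1w + 2 * k2w + 2 * k3w + k4w)
        u_new = u + h / 6 * (k1u + 2 * k2u + 2 * k3u + k4u)
        if w_new <= 0.0:
            # Linear interpolation to the point where w = 0
            frac = w / (w - w_new)
            z_n = z + frac * h
            u_n = u + frac * (u_new - u)
            return z_n, -z_n**2 * u_n
        z, w, u = z + h, w_new, u_new
    raise ValueError("no zero of w found below z_max")

for index in (1.5, 3.0):
    z_n, m_n = lane_emden(index)
    print(f"n = {index}: z_n = {z_n:.4f}, -z^2 dw/dz = {m_n:.4f}")
```

For n = 1 the result can be compared with the analytic solution w = sin z / z, whose first zero lies at z = π.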
Figure 4.1: The solutions for the Lane-Emden equation (4.8) for n = 3/2 and n = 3. (From Kippenhahn et
al. 2012)
Imagine we have found a solution w(z) of the Lane-Emden equation for which w(0) = 1 and dw/dz(0) = 0. Following (4.6), the radial dependence of the density is then given by
$$
\rho(r) = \left[\frac{-\Phi_c}{(n+1)K}\right]^{n} w^n(Ar).
\tag{4.9}
$$
For the pressure, the definition (4.2) then gives $P(r) = P_c\,w^{n+1}(Ar)$ with
$P_c = K\rho_c^{\gamma}$. Finally we deduce an expression for the mass within the sphere with radius r:
$$
m(r) = \int_0^r 4\pi\rho r^2\,dr = 4\pi\rho_c\int_0^r w^n r^2\,dr = 4\pi\rho_c\,\frac{r^3}{z^3}\int_0^z w^n z^2\,dz,
\tag{4.10}
$$
where we used (4.6). According to the Lane-Emden equation (4.8), the integrand satisfies
$w^n z^2 = -\frac{d}{dz}\left(z^2\,dw/dz\right)$, so the integral evaluates to $-z^2\,dw/dz$. The mass can then be written as
$$
m(r) = 4\pi\rho_c r^3\left(-\frac{1}{z}\frac{dw}{dz}\right).
\tag{4.11}
$$
4.2 The degenerate electron gas
If a gas reaches a very high density, it can no longer be described by the ideal gas law. At high densities
quantum mechanical effects come into play and such a gas is then called a degenerate gas. A schematic
comparison between “ordinary” and degenerate matter in a neutral gas is shown in Figure 4.2. In case “a”
the electrons move in a normal way, i.e., in their shells around the nuclei, while in case “b” the mutual distance
between the nuclei is so small that the electrons cannot move in their shells anymore and form a “gas” that
moves in between the nuclei.
Figure 4.2: A schematic representation of the difference between ordinary (a) and degenerate (b) matter for
a neutral gas. In ordinary matter the inner electron shells are still intact. In degenerate matter the nuclei
are closer to each other than half of the diameter of the smallest possible electron shell. Because of this,
the electrons cannot move according to their shells but must move freely between the nuclei. In this way,
they form a “gas”. This degenerate electron gas exerts a huge pressure. (Image courtesy of Prof. E. van den
Heuvel, University of Amsterdam, NL)
Quantum mechanics states that there cannot be two identical particles that have the same position
and velocity, within the accuracy in which these can be measured according to the uncertainty relation of
Heisenberg. This law is called the Pauli Exclusion Principle. In other words: if two electrons are found very
close to each other, they cannot have exactly the same velocity.
In a low-density gas the average velocity of the particles is determined by the temperature. When the
temperature is high, the mean velocity of the particles is high. The gas pressure depends on the velocity of
the particles. Because the distance between the particles is large, the constraint that is put on the velocities
of particles by the exclusion principle has no effect. Such a gas is then called an ideal gas (see Chapter 2).
The situation is different for a gas that is compressed to a high density: all possible low-velocity states are in
this case filled up, forcing many particles to have high velocities. These velocities are much higher than
the ones the particles would have in a low-density gas with the same temperature.
When the density of the degenerate gas is extremely high, the velocities with which the particles are forced
to move approach the speed of light. Such a gas is called a relativistic degenerate gas. Because
the uncertainty relation contains the product of the mass and the velocity, the lightest particles will become
degenerate first. In a normal gas, these are the electrons.
We consider a gas with a density high enough for pressure ionization to occur. This effect occurs
when no bound atoms are found because the orbital radius a of the electrons becomes comparable to or larger
than half of the distance d between two atoms. In the case of neutral hydrogen, a and d are given by
$$
a = a_0\nu^2, \qquad d \approx \left(\frac{3}{4\pi n_H}\right)^{1/3},
\tag{4.12}
$$
with $a_0 = 5.3\times10^{-9}$ cm the Bohr radius, ν the principal quantum number and $n_H$ the number of hydrogen
atoms per unit volume. A gas will experience no pressure ionization as long as a < d/2, which implies the
following condition for the principal quantum number:
$$
\nu^2 < \left(\frac{3}{4\pi n_H}\right)^{1/3}\frac{1}{2a_0}.
\tag{4.13}
$$
In the centre of the Sun, we have $\rho_c \approx 170$ g/cm$^3$, $n_H \approx 10^{26}$ cm$^{-3}$ and so the condition under which pressure
ionization will not occur is $\nu^2 < 0.13$. This means that even the ground state of the hydrogen atom
cannot occur and that all hydrogen atoms in the centre of the Sun have to be ionized. In stellar centres we
are always dealing with pressure-ionized gases, for which the electrons become degenerate first and, if that
is insufficient to provide enough pressure, the nuclei also become degenerate.
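The bound on ν² in (4.13) is easy to check numerically. The sketch below simply evaluates the right-hand side for the solar-centre value of n_H quoted in the text; the function name `max_nu_squared` is a label introduced here for illustration.

```python
import math

A0_CM = 5.3e-9  # Bohr radius in cm, value quoted in the text

def max_nu_squared(n_h):
    """Right-hand side of (4.13): upper bound on nu^2 for bound states to survive."""
    d = (3.0 / (4.0 * math.pi * n_h)) ** (1.0 / 3.0)  # mean separation, see (4.12)
    return d / (2.0 * A0_CM)

bound = max_nu_squared(1e26)  # n_H of order 1e26 cm^-3 in the solar centre
print(f"nu^2 < {bound:.2f}")
```

Since the bound is well below 1, not even the ground state ν = 1 satisfies the condition.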
We now study free electrons that occur in a pressure-ionized gas. In the local space of momentum
$p_x, p_y, p_z$, each electron is represented as a spherically symmetric “cloud”. Representing the absolute value
of the momentum by p (with $p^2 = p_x^2 + p_y^2 + p_z^2$), the distribution function of the momentum of the electrons
in a classical gas is a Maxwellian distribution function (2.40), which we now write down in terms of the
momentum rather than velocity:
$$
f(p) = \frac{4\pi p^2}{(2\pi m_e kT)^{3/2}}\exp\left(-\frac{p^2}{2m_e kT}\right).
\tag{4.14}
$$
The maximum of this distribution function occurs at $p_{max} = (2m_e kT)^{1/2}$. When there is a decrease in
temperature T, the maximum shifts to a lower p-value and the value of the maximum of f(p) increases (see
Figure 4.3).
The number of free electrons with particle density $n_e$ occurring in a volume dV of a pressure-ionized
gas and having a momentum in the interval [p, p + dp], is obtained by multiplying the distribution function
with $n_e\,dV$; this way we obtain the so-called Boltzmann distribution function:
$$
n_e f(p)\,dp\,dV = n_e\frac{4\pi p^2}{(2\pi m_e kT)^{3/2}}\exp\left(-\frac{p^2}{2m_e kT}\right)dp\,dV.
\tag{4.15}
$$
We now leave classical mechanics behind and take the quantum mechanical principles into account. Since
the electrons have to fulfil the Pauli principle, there is a restriction on the number of electrons that can occur
in a given state. Each quantum cell of the six-dimensional phase space $(x, y, z, p_x, p_y, p_z)$ can only contain
two electrons. The volume of such a quantum cell is $dp_x\,dp_y\,dp_z\,dV = h^3$, with h Planck’s constant. In the
shell [p, p + dp] of momentum space, there are $4\pi p^2\,dp\,dV/h^3$ quantum cells, which can contain at most
$8\pi p^2\,dp\,dV/h^3$ electrons. These quantum mechanical considerations thus give an upper limit for the number
of electrons:
$$
f(p)\,dp\,dV \le \frac{8\pi p^2\,dp\,dV}{h^3}.
\tag{4.16}
$$
Figure 4.3: Maxwellian distribution functions f(p) are shown as a function of the momentum p (thin lines)
for an electron gas with density $n_e = 10^{28}$ cm$^{-3}$ (which corresponds to a density of $\rho = 1.66\times10^{4}$ g cm$^{-3}$
for µe = 1) for different temperatures. The bold line shows the upper limit, imposed by the Pauli principle.
(From Kippenhahn et al. 2012)
This quantum-mechanical upper limit for f(p) is indicated as the parabola in Figure 4.3. We deduce that the
Boltzmann distribution for ne = constant contradicts the quantum mechanical upper limit at
sufficiently low temperatures. The same result holds for T = constant and sufficiently high densities, since the
Boltzmann distribution is proportional to ne. Therefore, we have to abandon the classical description and
take the quantum mechanical effects into account when the gas temperature is too low or when the electron
density becomes too high, because the classical distribution function would then exceed the upper limit
imposed by the Pauli principle.
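The conflict between (4.15) and (4.16) can be made quantitative. The ratio of the Boltzmann distribution to the Pauli limit is largest for p → 0, where it equals n_e h³ / [2 (2π m_e kT)^{3/2}]; setting this to 1 gives a rough temperature below which the classical description must fail. The sketch below evaluates that estimate for the density used in Figure 4.3. The function names and the rounded CGS constants are choices made here for illustration.

```python
import math

H = 6.62607e-27      # Planck constant [erg s]
K_B = 1.380649e-16   # Boltzmann constant [erg/K]
M_E = 9.10938e-28    # electron mass [g]

def pauli_ratio(p, n_e, temperature):
    """n_e f(p) from (4.15) divided by the Pauli limit 8 pi p^2 / h^3 from (4.16)."""
    mkt = M_E * K_B * temperature
    maxwell = n_e * 4 * math.pi * p**2 / (2 * math.pi * mkt) ** 1.5 * math.exp(-p**2 / (2 * mkt))
    return maxwell / (8 * math.pi * p**2 / H**3)

def critical_temperature(n_e):
    """Temperature at which the ratio reaches 1 in the limit p -> 0."""
    return (n_e * H**3 / 2.0) ** (2.0 / 3.0) / (2 * math.pi * M_E * K_B)

n_e = 1e28  # cm^-3, the density of Figure 4.3
T_crit = critical_temperature(n_e)
print(f"classical description fails below roughly T = {T_crit:.2e} K")
```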
Let us consider an electron gas where the electrons have the lowest possible energy (T = 0 K). The state
in which all these electrons have the lowest energy possible, while still complying with the Pauli principle, is
the one where all phase cells are populated with two electrons up to a certain value of the momentum, denoted
$p_F$, while all the other cells are empty:
$$
f(p) = \frac{8\pi p^2}{h^3} \quad\text{for } p \le p_F, \qquad
f(p) = 0 \quad\text{for } p > p_F.
\tag{4.17}
$$
Figure 4.4: The distribution function f(p) as a function of the momentum p for a completely degenerate
electron gas at absolute zero temperature and with density $n_e = 10^{28}$ cm$^{-3}$. (From Kippenhahn et al.
2012)
The distribution function is shown in Figure 4.4. The total number of electrons in the volume dV is given by
$$
n_e\,dV = dV\int_0^{p_F}\frac{8\pi p^2}{h^3}\,dp = \frac{8\pi}{3h^3}\,p_F^3\,dV.
\tag{4.18}
$$
For a given electron density, we then find the Fermi momentum $p_F \sim n_e^{1/3}$. For non-relativistic electrons the
Fermi energy is $E_F = p_F^2/2m_e \sim n_e^{2/3}$. We can see that, even though the temperature of the electron gas
is zero, the electrons have a non-zero energy that can be as large as $E_F$. When the electron density
is very high, the velocities of the fastest electrons can reach a considerable fraction of the speed of light.
Therefore we have to use expressions for the total energy and the momentum that are deduced according to
the theory of special relativity:
$$
p = \frac{m_e v}{\sqrt{1-v^2/c^2}}, \qquad
E_{tot} = \frac{m_e c^2}{\sqrt{1-v^2/c^2}} = m_e c^2\sqrt{1+\frac{p^2}{m_e^2c^2}},
\tag{4.19}
$$
with $m_e$ the rest mass of the electron. The kinetic energy of the electron is connected with the total energy
by $E = E_{tot} - m_e c^2$.
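To get a feel for the numbers behind (4.18) and (4.19), the sketch below inverts (4.18) for the Fermi momentum and evaluates the corresponding velocity and kinetic energy at the density used in Figures 4.3 and 4.4. The function names and rounded CGS constants are choices made here for illustration, not taken from the text.

```python
import math

H = 6.62607e-27    # Planck constant [erg s]
M_E = 9.10938e-28  # electron mass [g]
C = 2.99792e10     # speed of light [cm/s]

def fermi_momentum(n_e):
    """Invert (4.18): p_F = h (3 n_e / (8 pi))^(1/3)."""
    return H * (3.0 * n_e / (8.0 * math.pi)) ** (1.0 / 3.0)

def fermi_velocity_over_c(n_e):
    """v_F / c from p = m_e v / sqrt(1 - v^2/c^2) in (4.19)."""
    x = fermi_momentum(n_e) / (M_E * C)
    return x / math.sqrt(1.0 + x * x)

def fermi_kinetic_energy(n_e):
    """E = E_tot - m_e c^2 evaluated at p = p_F, in erg."""
    x = fermi_momentum(n_e) / (M_E * C)
    return M_E * C**2 * (math.sqrt(1.0 + x * x) - 1.0)

n_e = 1e28  # cm^-3
x_f = fermi_momentum(n_e) / (M_E * C)
beta_f = fermi_velocity_over_c(n_e)
e_f = fermi_kinetic_energy(n_e)
print(f"p_F/(m_e c) = {x_f:.3f}, v_F/c = {beta_f:.3f}, E_F = {e_f:.2e} erg")
```

At this density the fastest electrons already move at roughly a quarter of the speed of light, which is why the relativistic expressions are needed at higher densities.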
To deduce an equation of state for a degenerate electron gas we have to consider an expression for the
pressure of the electrons. The pressure is by definition the flux of momentum through a unit surface per
unit time. Let us consider a surface element dσ with normal vector $\vec{n}$ (see Figure 4.5). An arbitrary unit vector
$\vec{s}$ defines the angle θ that is enclosed by $\vec{n}$ and $\vec{s}$. We now determine the number of electrons that move
through dσ per second within the solid angle $d\Omega_s$ around the direction $\vec{s}$. We limit ourselves to electrons with
a momentum in the interval [p, p + dp]. At the position of the surface element there are $f(p)\,dp\,d\Omega_s/(4\pi)$
electrons per unit volume that have the appropriate momentum and direction. Each second,
$f(p)\,dp\,d\Omega_s\,v(p)\cos\theta\,d\sigma/(4\pi)$ electrons move through the surface dσ within the solid angle
$d\Omega_s$. Here v(p) is the velocity defined by (4.19). Each electron has a momentum with absolute value p
and with direction $\vec{s}$. Its component in the direction of $\vec{n}$ is p cos θ. We then obtain the total flux of
momentum in the direction $\vec{n}$ by integrating over all directions $\vec{s}$ of a sphere and over all absolute values p.
In this way we find an electron pressure $P_e$
$$
P_e = \int_\Omega\int_0^\infty f(p)\,v(p)\,p\cos^2\theta\,\frac{dp\,d\Omega_s}{4\pi}
    = \frac{8\pi}{3h^3}\int_0^{p_F} p^3\,v(p)\,dp,
\tag{4.20}
$$
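in which we have replaced f(p) by (4.17). Using the velocity v(p) that follows from (4.19), we then find
$$
P_e = \frac{8\pi c}{3h^3}\int_0^{p_F} p^3\,\frac{p/(m_e c)}{\left[1+p^2/(m_e^2c^2)\right]^{1/2}}\,dp
    = \frac{8\pi c^5 m_e^4}{3h^3}\int_0^{x}\frac{\xi^4\,d\xi}{(1+\xi^2)^{1/2}},
\tag{4.21}
$$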
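where we have introduced the new variables $\xi \equiv p/(m_e c)$, $x \equiv p_F/(m_e c)$. It can be shown that the integral
on the right-hand side of this expression is given by
$$
\frac{1}{8}\left[x\left(2x^2-3\right)\left(1+x^2\right)^{1/2} + 3\sinh^{-1}x\right]
= \frac{x}{8}\left(2x^2-3\right)\left(x^2+1\right)^{1/2} + \frac{3}{8}\ln\left[x+\left(1+x^2\right)^{1/2}\right]
\equiv \frac{1}{8}\,g(x)
$$
so that
$$
P_e = \frac{\pi m_e^4c^5}{3h^3}\,g(x).
\tag{4.22}
$$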
With the help of the definition of x we finally write the number of electrons as
$$
n_e = \frac{\rho}{\mu_e m_u} = \frac{8\pi m_e^3c^3}{3h^3}\,x^3.
\tag{4.23}
$$
These last two equations define the function $P_e(n_e)$.
To find an expression for the equation of state $P_e(\rho)$, we first deduce the asymptotic behaviour of the
function g(x). To this end we write x as
$$
x = \frac{p_F}{m_e c} = \frac{v_F/c}{(1-v_F^2/c^2)^{1/2}} \quad\text{or}\quad \frac{v_F^2}{c^2} = \frac{x^2}{1+x^2},
\tag{4.24}
$$
in which vF is the velocity of the electrons with a momentum p = pF. When x ≪ 1, then vF/c ≪ 1 and
the electrons clearly move slower than the speed of light (non-relativistic limit). On the other hand, x ≫ 1
implies that vF/c → 1. The higher x, the more electrons move relativistically and, for very large x, almost
all electrons move relativistically. The function g(x) shows the following asymptotic behaviour:
$$
x\to0:\ g(x)\to\frac{8}{5}x^5, \qquad x\to\infty:\ g(x)\to2x^4.
\tag{4.25}
$$
When x≪ 1, relativistic effects can be neglected; (4.22) gives in this limit
$$
P_e = \frac{8\pi m_e^4c^5}{15h^3}\,x^5.
\tag{4.26}
$$
Figure 4.5: A surface element dσ with normal vector $\vec{n}$ and an arbitrary unit vector $\vec{s}$, which is the axis of
the solid angle $d\Omega_s$. (From Kippenhahn et al. 2012)
If we substitute the expression for x given in (4.23), then we get
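$$
P_e = \frac{1}{20}\left(\frac{3}{\pi}\right)^{2/3}\frac{h^2}{m_e}\,n_e^{5/3}
    = \frac{1}{20}\left(\frac{3}{\pi}\right)^{2/3}\frac{h^2}{m_e}\left(\frac{\rho}{\mu_e m_u}\right)^{5/3},
\tag{4.27}
$$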
where we used that $\rho = n_e\mu_e m_u$ in the last step. We notice that this equation of state has the form of a
polytrope with γ = 5/3, n = 3/2.
For x ≫ 1 we are in the extreme relativistic limit and we find the following equation for the electron
pressure
$$
P_e = \frac{2\pi m_e^4c^5}{3h^3}\,x^4.
\tag{4.28}
$$
If we again substitute x from (4.23), this now gives
$$
P_e = \left(\frac{3}{\pi}\right)^{1/3}\frac{hc}{8}\,n_e^{4/3}
    = \left(\frac{3}{\pi}\right)^{1/3}\frac{hc}{8}\left(\frac{\rho}{\mu_e m_u}\right)^{4/3}.
\tag{4.29}
$$
We again find a polytrope, this time with γ = 4/3, n = 3.
For both extremes of the degenerate electron gas (relativistic and non-relativistic), we find a polytropic
equation of state (where the function w was shown in Figure 4.1) in which the constant K is only determined
by physical constants. This is in contrast with the examples in the previous section, where K was a free
constant that can vary from star to star.
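Since K follows entirely from physical constants and µe, it can be evaluated directly. The sketch below reads the two constants off (4.27) and (4.29) in CGS units and also estimates the density at which the two limiting pressures coincide, as a rough marker for the transition between the regimes. The function names and rounded constants are choices made here for illustration.

```python
import math

H = 6.62607e-27     # Planck constant [erg s]
C = 2.99792e10      # speed of light [cm/s]
M_E = 9.10938e-28   # electron mass [g]
M_U = 1.66054e-24   # atomic mass unit [g]

def k_nonrelativistic(mu_e):
    """K in P_e = K rho^(5/3), read off from (4.27)."""
    return (3 / math.pi) ** (2 / 3) * H**2 / (20 * M_E) / (mu_e * M_U) ** (5 / 3)

def k_relativistic(mu_e):
    """K in P_e = K rho^(4/3), read off from (4.29)."""
    return (3 / math.pi) ** (1 / 3) * H * C / 8 / (mu_e * M_U) ** (4 / 3)

def crossover_density(mu_e):
    """Density [g/cm^3] at which (4.27) and (4.29) give the same pressure."""
    return (k_relativistic(mu_e) / k_nonrelativistic(mu_e)) ** 3

for mu_e in (1.0, 2.0):
    print(mu_e, k_nonrelativistic(mu_e), k_relativistic(mu_e), crossover_density(mu_e))
```

The crossover density scales linearly with µe, because the ratio of the two constants scales as µe^(1/3).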
When the temperature is not zero, not all electrons will be packed in cells with a momentum that is
as low as possible. For temperatures which are high enough, the electrons will comply with Boltzmann
statistics. There is a continuous transition from a state of full degeneracy to a state of a non-degenerate gas.
This is called partial degeneracy. The occupation of the phase cells then follows Fermi-Dirac
statistics, which contain a degeneracy parameter ψ ∈ [−∞,∞]. This parameter shows which fraction of the
phase cells is filled and depends on ne and T. In this case, the equation of state cannot be described as a
simple analytic relation between the electron pressure and the density. For ψ → −∞ we recover the electron
pressure of the ideal gas approximation, Pe = nekT, in the case of the non-relativistic partially degenerate
electron gas. For a non-relativistic partially degenerate gas with ψ ≫ 1 (high level of degeneracy) we recover
the equation of state (4.27). For the relativistic limit of strong degeneracy (ψ → +∞), we find the equation
of state (4.29).
An important graph is the one where the temperature is plotted against the density and where the validity
areas with different equations of state are indicated. This graph is the outcome of one of the exercises.
4.3 The Chandrasekhar limit
We now look into a polytropic model in which the pressure is provided by a non-relativistic degenerate
electron gas. In such a model, the central density and mean density increase with increasing stellar mass.
However, when the density increases, the electron gas becomes more and more relativistic. We can imagine
that we are evolving to a star with a relativistic core where the pressure is described by a polytrope with
n = 3 (see 4.29) and a non-relativistic envelope with a pressure given by a polytrope with n = 3/2 (see
4.27). Hence a transition will occur, where the pressure takes a value between the two expressions (4.27) and
(4.29). The physicist Chandrasekhar was the first to look into such models in order to understand the
so-called white dwarfs (see Chapter 12).
A natural question is how such a model varies with increasing mass. At low M the whole model stays
non-relativistic and a polytrope with n = 3/2 gives an adequate description. When the central density is
high enough, an ever larger part of the stellar core will become relativistic. We expect that the star eventually
evolves into a state in which all particles move relativistically and the pressure is described by a polytrope
with polytropic index n = 3. This view has the following interesting property. From the definition of the
variable z, we find
$$
R \sim \rho_c^{\frac{1-n}{2n}},
\tag{4.30}
$$
thus from $M \sim \rho_c R^3$, it follows that
$$
M \sim \rho_c^{\frac{3-n}{2n}}.
\tag{4.31}
$$
We can thus conclude that the mass of a polytrope with n = 3 does not depend on the central density:
M = constant. Therefore, there is only one admitted mass for a fully degenerate relativistic electron gas
that meets the requirements of a polytrope with n = 3. This mass is totally determined by physical constants
and by the values of z and w′ at the first zero of the solution w of the polytrope with n = 3. The numerical
value of this only admitted mass is
$$
M_{Ch} = \frac{5.836}{\mu_e^2}\,M_\odot.
\tag{4.32}
$$
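The coefficient in (4.32) can be reproduced approximately from the earlier results. Evaluating (4.11) at the surface and using (4.6) with n = 3 gives M = 4π [K/(πG)]^{3/2} (−z² dw/dz), with K read off from (4.29). The sketch below assumes the standard tabulated value −z² dw/dz ≈ 2.01824 at the first zero of the n = 3 solution, which is not quoted in the text, and uses rounded modern constants, so the result should only be expected to agree with (4.32) to within about a percent.

```python
import math

H = 6.62607e-27      # Planck constant [erg s]
C = 2.99792e10       # speed of light [cm/s]
G = 6.6743e-8        # gravitational constant [cm^3 g^-1 s^-2]
M_U = 1.66054e-24    # atomic mass unit [g]
M_SUN = 1.98841e33   # solar mass [g]

# -z^2 dw/dz at the first zero of the n = 3 Lane-Emden solution
# (standard tabulated value, an assumption not taken from the text)
MINUS_Z2_DWDZ_N3 = 2.01824

def chandrasekhar_mass(mu_e):
    """Mass of the n = 3 polytrope with K from (4.29), in solar masses."""
    k = (3 / math.pi) ** (1 / 3) * H * C / 8 / (mu_e * M_U) ** (4 / 3)
    mass_g = 4 * math.pi * (k / (math.pi * G)) ** 1.5 * MINUS_Z2_DWDZ_N3
    return mass_g / M_SUN

print(f"mu_e = 1: {chandrasekhar_mass(1.0):.3f} M_sun")
print(f"mu_e = 2: {chandrasekhar_mass(2.0):.3f} M_sun")
```

The central density has dropped out of the expression, in line with (4.31) for n = 3.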
This mass is called the Chandrasekhar limit. It indicates the end point of the convergence process of
models with increasing central density for which the pressure is delivered by a fully relativistic degenerate
electron gas. This limiting mass (4.32) is very low if one keeps in mind that there are many stars that have
a much larger birth mass. However, all stars that have not yet started the ultimate end stage of their lives
have an equation of state that differs a lot from that of a degenerate electron gas and so this mass limit is not of any
relevance for them. For white dwarfs, however, a degenerate electron gas does occur, as we shall discuss in
Chapter 12. For these stars, µe = 2 is a good approximation and we find the condition
$$
M < M_{Ch} = 1.46\,M_\odot.
\tag{4.33}
$$
Figure 4.6: State of a stellar gas, where the central temperature expressed in K is shown as a function of the
central density expressed in g/cm$^3$. (From Kippenhahn et al. 2012)
Even though we deduced the Chandrasekhar limit by using a polytropic model, the result is almost the
same when considering a more realistic equation of state, because, for extremely high densities, the pressure
of the electron gas converges to a pressure which is well described by a polytropic law with γ = 4/3, n = 3.
When we work with a more realistic, non-polytropic model, we find MCh = 1.44M⊙. Until now, no white
dwarf has been found with a mass that exceeds MCh. Chandrasekhar received the Nobel Prize in Physics
for his studies of white dwarfs.
4.4 Schematic representation of the relevant equations of state
In Figure 4.6 the different equations of state of the gas are shown in a (temperature-density)-diagram. Above
the dotted line, the radiation pressure dominates. Below the solid line the electrons are degenerate, either
in the relativistic limit (to the right of the dashed line) or in the non-relativistic
limit (to the left of the dashed line). The thick dashed line shows the central conditions for a model
representative of the Sun.
So far we have neglected the interaction between the ions. This is no longer justified for high densities
and low temperatures, because their Coulomb interaction becomes important. Instead of moving around freely,
the ions will, under the right circumstances, arrange themselves in a regular lattice so that their energy is minimal. Using
crystallization theory we can compute for which combinations of temperature and density these effects start
to dominate. This crystallization area is marked by the dashed-dotted line in Figure 4.6. In the interior of
stars that have not died yet, the densities are high but so are the temperatures. Therefore the crystallization
area is not important for such stars. Cooling white dwarfs, however, end up in this area, since their density
essentially remains constant while they move down in the HR diagram along the cooling track (see further on).
In this way they obtain a crystallized core of carbon and oxygen. Cool white dwarfs are thus in fact giant
diamonds in the sky; they are abundantly present in the Universe!
Chapter 5
Energy transport
The energy radiated by a star at its surface is created in the hot central parts. Thus, energy is transported
through the stellar material. This energy transport is possible thanks to the existence of a temperature
gradient. Depending on the local conditions, the transport is done by radiation, conduction or convection.
Ions, atoms and electrons are constantly exchanged between cooler and warmer regions while they interact
with photons. The temperature differences between surrounding layers determine how the energy transport
occurs. In this chapter, we discuss the equation that describes the energy transport. This equation is the next
one in the system of equations that describe the stellar structure.
5.1 Energy transport by radiation
We start off with a few rough estimates of crucial quantities that characterize radiative energy transport by
photons. This will enable us to simplify the formalism.
5.1.1 Mean free path
A first estimation concerns the length of the mean free path ℓf of a photon located at a certain point in the
star where the density is ρ:
$$
\ell_f = \frac{1}{\kappa\rho},
\tag{5.1}
$$
with κ the “mean” absorption coefficient or opacity, which is the microscopic radiative effective cross section
per unit mass, averaged over all frequencies. First, we clarify the meaning of an effective cross section
and the mean free path. These concepts are introduced in the general context of collision probabilities.
What is the condition for two particles to collide? When we consider two spherical particles A and B,
with respective radii ra and rb, they collide when the distance d between their centres is equal to or smaller than
the sum of the radii: d ≤ ra + rb. Equivalently, this condition can be expressed by requiring that the centre
of the particle B (the projectile) must be within or on a circle centred around A with a radius r = ra + rb.
Consequently, the collision can be regarded as the collision between a stationary particle with radius ra + rb
and a point-like incoming particle. The spherical stationary particle A can be further simplified to a disk
(a target), perpendicular to the direction of motion of the incoming particle B. The surface area of this disk is
called the microscopic effective cross section, and is equal to $\kappa = \pi(r_a + r_b)^2$.
We now consider a bar-shaped plane-parallel stellar layer with dimensions l × l × dx, with thickness dx
so small that the individual targets in the layer in the direction parallel to dx do not overlap. Furthermore,
dx is oriented parallel to the direction of the incoming projectile. We assume that the density in the
plane-parallel layer is ρ. In total, the layer contains $\rho l^2 dx$ targets. These targets have a total joint effective
cross section given by $\kappa\rho l^2 dx$. The impact probability of an incoming projectile is defined by the ratio of
the surface covered by a unit mass of targets, with respect to the total surface of the layer, and hence is
$\kappa\rho l^2 dx/l^2 = \rho\kappa\,dx$. The product κρ is called the macroscopic effective cross section per unit mass. This
quantity has the dimension of a reciprocal length.
Let the probability of a collision with one incoming particle be p, then on average 1/p particles have
to be directed to the plane-parallel layer to obtain one collision. In the case described above, an average
of 1/(ρκ dx) particles have to be sent into the plane-parallel layer to cause one collision over the
distance dx. The mean distance a particle will travel before it collides with a target in the layer is then
dx/(ρκ dx) = 1/(ρκ). This mean distance is called the mean free path. In the case that the
projectiles are photons, we will denote the mean free path as ℓf.
The opacity depends on the interaction between radiation and matter. Specifically, it depends on the
detailed distribution of atoms in the gas, the population of the energy levels, the ionisation states, and the
equation of state of the gas. The computation of κ is complex and requires intense efforts. This type of
work is taken up by various dedicated international research teams. These specialised activities result in the
publication of opacity tables, which describe the value of κ as a function of the density, the temperature,
and the chemical composition.
A few simple yet relevant approximations for the opacity exist. They provide a rough idea of the
dependence on the thermodynamical state of the gas described by ρ and T . Kramers’s approximation is the
best-known:
$$
\kappa = \kappa_0\,\rho\,T^{-3.5},
\tag{5.2}
$$
in which κ0 is a constant that depends on the chemical composition. This density and temperature
dependence of the opacity is appropriate in the stellar interior of low-mass stars, where the temperature
remains relatively low. In the core of massive stars, scattering by free electrons (Thomson scattering) dominates
the opacity, which makes it independent of the density and the temperature. The latter is also valid
when the gas is fully ionized. In this case, $\kappa_e = 0.2(1+X) \approx 0.4$ cm$^2$/g, with X the hydrogen mass fraction, is a good approximation for the
opacity. It provides a lower limit for κ, since bound-bound transitions in partially ionised atoms are responsible
for a large fraction of the opacity. At temperatures below 6 000 to 10 000 K, the absorption of photons
by the negatively charged hydrogen atom (hydrogen atom with one additional electron, H⁻) dominates the
opacity. This situation occurs in the atmosphere of stars with a mass below one solar mass. The required
electrons are provided by the ionisation of metals. In this case, the opacity is proportional to the density of
H⁻ and hence to the density of the electrons. Put differently, the opacity is set by the degree of ionisation,
and increases with increasing temperature, contrary to expression (5.2) which is valid in stellar interiors. As
a typical value for κ in a star, we can regard the case of ionised hydrogen in the stellar core: κ ≈ 1 cm$^2$/g.
Taking the average density of the Sun, $\bar{\rho}_\odot = 3M_\odot/4\pi R_\odot^3 = 1.4$ g/cm$^3$, and using κe, one obtains
an upper limit for the mean free path of photons in the Sun: ℓf ≈ 2 cm! Photons hence experience many
interactions before they get from the location of their creation (the stellar core) to the stellar surface. This
means that, in general, stellar matter is very opaque. This is not valid anymore in the photosphere of a star
or in red (super)giants, where the mean free path of a photon is much larger.
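The estimate of the photon mean free path can be reproduced with a few lines of arithmetic. The sketch below evaluates (5.1) with the electron-scattering opacity and the mean solar density; the solar mass and radius are rounded values assumed here, and X = 1 is chosen only because it reproduces the rounded κe ≈ 0.4 cm²/g used in the text.

```python
import math

M_SUN = 1.989e33   # solar mass [g], rounded value assumed here
R_SUN = 6.96e10    # solar radius [cm], rounded value assumed here

def electron_scattering_opacity(x_hydrogen):
    """kappa_e = 0.2 (1 + X) in cm^2/g, with X the hydrogen mass fraction."""
    return 0.2 * (1.0 + x_hydrogen)

def mean_free_path(kappa, rho):
    """Equation (5.1): l_f = 1 / (kappa rho), in cm."""
    return 1.0 / (kappa * rho)

rho_mean = 3 * M_SUN / (4 * math.pi * R_SUN**3)
l_f = mean_free_path(electron_scattering_opacity(1.0), rho_mean)
print(f"mean density = {rho_mean:.2f} g/cm^3, mean free path = {l_f:.1f} cm")
```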
5.1.2 The temperature gradient
A typical value for the temperature gradient in a star like the Sun can be obtained by comparing the core
($T_C \approx 10^7$ K) and surface temperature ($T_S \approx 10^4$ K), with respect to the stellar dimensions:
$$
\frac{\Delta T}{\Delta r} \approx \frac{T_C - T_S}{R_\odot} \approx 1.4\times10^{-4}\ \mathrm{K\,cm^{-1}},
\tag{5.3}
$$
i.e., 14 K per kilometer.
Over the mean free path of a photon, the stellar interior is thus almost perfectly isothermal. The
difference in temperature over this length scale is only $\Delta T = \ell_f\,(dT/dr) \approx 3\times10^{-4}$ K. The relative
temperature difference at a point with temperature $T = 10^7$ K is $\Delta T/T \sim 3\times10^{-11}$. This
value shows that the physical state in the stellar interior must indeed be very close to thermal equilibrium,
and hence that the radiation can be very well approximated by that of a black body, for which the energy
density is proportional to $T^4$. Since $\Delta u/u = 4\,\Delta T/T$, the relative anisotropy of the radiation is only $\sim 10^{-10}$. Despite
its extremely small value, this anisotropy is responsible for the outward transport of energy and in this way for
the enormous luminosity of the star. A fraction of $10^{-10}$ of the flux radiated through a surface of 1 cm$^2$ of a
black body with a temperature of $10^7$ K is still a factor 1000 larger than the flux emitted at the solar
surface!
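The chain of estimates in this subsection can be checked with simple arithmetic. The sketch below uses the temperatures and mean free path quoted in the text; the solar radius and the solar effective temperature of 5772 K are values assumed here, since the text does not quote them.

```python
R_SUN = 6.96e10     # solar radius [cm], rounded value assumed here
T_CORE = 1.0e7      # K, value used in the text
T_SURF = 1.0e4      # K, value used in the text
L_F = 2.0           # cm, mean free path from Section 5.1.1
T_EFF_SUN = 5772.0  # K, solar effective temperature (assumed, not quoted in the text)

grad_T = (T_CORE - T_SURF) / R_SUN   # mean temperature gradient, cf. (5.3)
delta_T = L_F * grad_T               # temperature change over one mean free path
rel_temp = delta_T / T_CORE          # relative temperature difference
anisotropy = 4 * rel_temp            # u ~ T^4, hence du/u = 4 dT/T

# Black-body fluxes scale as T^4, so the Stefan-Boltzmann constant cancels in the ratio
flux_ratio = 1e-10 * (T_CORE / T_EFF_SUN) ** 4

print(f"gradient   = {grad_T:.2e} K/cm")
print(f"delta T    = {delta_T:.1e} K")
print(f"anisotropy = {anisotropy:.1e}")
print(f"flux ratio = {flux_ratio:.0f}")
```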
5.1.3 The diffusion approximation
Radiative energy transport in a star occurs because there is more outward radiation (coming from the hot matter
close to the core) than inward radiation (coming from the cooler outer layers). The estimations described
above show that the mean free path of the “transporting particles” (photons) is extremely small with respect
to the characteristic length scale over which the transport occurs (i.e., the stellar radius): $\ell_f/R_\odot \approx 3\times10^{-11}$.
In such a case, the energy transport can be treated as a diffusion process, which implies a significant simplification
of the formalism. We repeat that this approximation is not valid in the photosphere of a star.
General description
First, we recall the diffusion equation as often discussed in physics. In general the diffusive flux $\vec{f}$ of
particles per unit surface, per unit time, averaged over all frequencies, between areas with different particle
densities n (expressed per unit volume), is given by
$$
\vec{f} = -D\,\vec{\nabla}n.
\tag{5.4}
$$
Here, the diffusion coefficient D is set by the velocity v of the particles and their mean free path $\ell_d$:
$$
D = \frac{1}{3}\,v\,\ell_d.
\tag{5.5}
$$
This form of the diffusion equation is general. Below, we recall how it is derived.
Consider a layer of gas, in which the motion of the particles happens in one direction, e.g., along the
x-axis. We want to determine the stream of particles through a fictitious plane perpendicular to the x-axis.
The number of particles per unit volume to the left of the plane is denoted by $n_-$, the particle density to the
right of the plane by $n_+$. To be able to travel through the plane in a time interval Δt, the particles must
initially be closer to the plane than a distance $v_x\Delta t$, where $v_x$ is their velocity in the x-direction. We assume
random motion of the particles. Hence, half of the particles within a distance $v_x\Delta t$ will move towards the plane,
the other half will move away from the plane. The resulting stream of particles through the plane per unit
time is:
$$
f_x = \frac{n_- v_x\Delta t}{2\Delta t} - \frac{n_+ v_x\Delta t}{2\Delta t} = \frac{(n_- - n_+)\,v_x}{2}.
\tag{5.6}
$$
Each of the particles can travel a distance $\ell_x$ before interacting with another particle. Hence, we can connect
the difference in particle density left and right of the plane to the mean free path:
$$n_+ - n_- = \frac{dn}{dx}\,\Delta x = \frac{dn}{dx}\,2\ell_x. \tag{5.7}$$
The flux in the x-direction becomes:
$$f_x = -\ell_x v_x\,\frac{dn}{dx}. \tag{5.8}$$
We assume that there is no preferential direction. In that case, the average velocity of the particles is equally
large in the three spatial directions. Hence, the velocity in the x-direction is $v_x \simeq v/\sqrt{3}$. A similar reasoning
applies for the mean free path: $\ell_x = \ell/\sqrt{3}$. We find
$$f_x = -\frac{1}{3}\,\ell v\,\frac{dn}{dx}. \tag{5.9}$$
The generalisation of this equation to three dimensions gives Eqs (5.4) and (5.5).
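As a rough numerical illustration of Eq. (5.5) applied to photons, the sketch below converts the ratio of mean free path to solar radius quoted at the start of this section (about 3×10⁻¹¹) into a mean free path and a diffusion coefficient. The CGS values of the solar radius and the speed of light are standard constants; the function and variable names are only illustrative.

```python
# Rough illustration of D = v * l / 3 (Eq. 5.5) for photons, in CGS units.
C_LIGHT = 2.998e10   # speed of light [cm/s]
R_SUN = 6.957e10     # solar radius [cm]


def diffusion_coefficient(velocity, mean_free_path):
    """Diffusion coefficient D = v * l / 3, Eq. (5.5)."""
    return velocity * mean_free_path / 3.0


# Mean free path implied by the ratio l_f / R_sun ~ 3e-11 quoted in the text.
mfp_photon = 3e-11 * R_SUN                       # about 2 cm
d_photon = diffusion_coefficient(C_LIGHT, mfp_photon)

print(f"photon mean free path ~ {mfp_photon:.2f} cm")
print(f"photon diffusion coefficient ~ {d_photon:.2e} cm^2/s")
```

A mean free path of about 2 cm against a stellar radius of order 10¹¹ cm is what justifies treating the transport as diffusion.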
Application to stellar gas
To obtain the corresponding radiative energy flux $\vec{f}$ in a star, averaged over all frequencies, we replace $n$
by the energy density of a black body (this time per unit of volume to be able to directly make use of the
diffusion equation) $u = aT^4$. The velocity $v$ is replaced by the speed of light $c$, and $\ell_d$ by $\ell_f$ given in (5.1).
Because of the spherical symmetry of the star, $\vec{f}$ only has a radial component $f_r = |\vec{f}| = f$ and $\vec{\nabla} u$ reduces
to a derivative in the radial direction:
$$\frac{\partial u}{\partial r} = 4aT^3\,\frac{\partial T}{\partial r}. \tag{5.10}$$
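Because the equations of the stellar structure are described in units per mass, we express the mean free path
in terms of the opacity per unit mass, $\ell_f = 1/(\kappa\rho)$, and the diffusion equation becomes:
$$f = -\frac{4ac}{3}\,\frac{T^3}{\kappa\rho}\,\frac{\partial T}{\partial r}. \tag{5.11}$$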
This equation can be formally considered as an equation describing heat conduction by writing it as
$$\vec{f} = -k_{\rm rad}\,\vec{\nabla} T, \tag{5.12}$$
with
$$k_{\rm rad} \equiv \frac{4ac}{3}\,\frac{T^3}{\kappa\rho} \tag{5.13}$$
the conduction coefficient for radiative transport. When we solve equation (5.11) for the temperature gradient
and replace $f$ by the local luminosity $l = 4\pi r^2 f$, we obtain
$$\frac{\partial T}{\partial r} = -\frac{3}{16\pi ac}\,\frac{\kappa\rho\,l}{r^2 T^3}. \tag{5.14}$$
Finally, after transformation to the independent variable $m$, we obtain the basic equation for radiative energy
transport:
$$\frac{\partial T}{\partial m} = -\frac{3}{64\pi^2 ac}\,\frac{\kappa\,l}{r^4 T^3}. \tag{5.15}$$
This equation is called the Eddington equation of energy transport through radiation.
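Equations (5.14) and (5.15) translate directly into code. The following is a minimal sketch in CGS units; the local values of κ, ρ, l, r and T are assumed purely for illustration and are not taken from a stellar model. The last line checks that the two forms agree once the standard relation dr/dm = 1/(4πr²ρ) is used for the change of variable.

```python
import math

A_RAD = 7.5657e-15   # radiation density constant a [erg cm^-3 K^-4]
C_LIGHT = 2.998e10   # speed of light [cm/s]


def dT_dr_rad(kappa, rho, lum, r, T):
    """Eq. (5.14): radiative temperature gradient with respect to radius [K/cm]."""
    return -3.0 * kappa * rho * lum / (16.0 * math.pi * A_RAD * C_LIGHT * r**2 * T**3)


def dT_dm_rad(kappa, lum, r, T):
    """Eq. (5.15): the same gradient with respect to the mass coordinate [K/g]."""
    return -3.0 * kappa * lum / (64.0 * math.pi**2 * A_RAD * C_LIGHT * r**4 * T**3)


# Assumed, purely illustrative local values (CGS).
kappa, rho, lum, r, T = 1.0, 1.0, 3.8e33, 3.5e10, 3.0e6

grad_r = dT_dr_rad(kappa, rho, lum, r, T)
grad_m = dT_dm_rad(kappa, lum, r, T)

# Change of variable: dT/dm = (dT/dr) * dr/dm with dr/dm = 1 / (4 pi r^2 rho).
grad_m_from_r = grad_r / (4.0 * math.pi * r**2 * rho)
print(grad_r, grad_m, grad_m_from_r)
```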
We stress that this simple approximation is not valid close to the stellar surface. Indeed, due to the low
densities, the mean free path of the photons there becomes comparable to the remaining distance they have to
travel to reach the stellar surface. Hence the diffusion approximation breaks down in the stellar atmosphere,
and a much more complicated differential equation needs to be solved to describe the energy transport. In the
current course, we limit ourselves to the region in the star where the diffusion approximation is justified. For
a description of the energy transport in the stellar atmosphere, we refer to the courses Radiative Processes
in Astronomy and Stellar Atmospheres of the Master of Astronomy & Astrophysics at KU Leuven.
5.1.4 The Rosseland mean opacity
The equations described above are independent of the frequency ν because f, l and κ are defined as “averages”
over all frequencies. Here we discuss a useful and appropriate method to determine this average
opacity κ. We indicate the dependence of κ on the frequency ν by adding the lower index ν. We do the
same for all relevant frequency-dependent quantities $\kappa_\nu$, $\ell_\nu$, $D_\nu$, $u_\nu$ and so on. The diffusive radiation flux
$\vec{f}_\nu$ in the frequency interval $[\nu, \nu + d\nu]$ can be described as
$$\vec{f}_\nu = -D_\nu\,\vec{\nabla} u_\nu \quad \text{with} \quad D_\nu = \frac{1}{3}\,c\,\ell_\nu = \frac{c}{3\kappa_\nu\rho}. \tag{5.16}$$
The energy density in the frequency interval $[\nu, \nu + d\nu]$ is given by
$$u_\nu(T) = \frac{4\pi}{c}\,B_\nu(T) = \frac{8\pi h}{c^3}\,\frac{\nu^3}{\exp(h\nu/kT) - 1}, \tag{5.17}$$
in which $B_\nu(T)$ and $u_\nu(T)$ are the Planck functions for the intensity and the energy density of a black body,
respectively (see Appendix A). Hence, we find
$$\vec{\nabla} u_\nu = \frac{4\pi}{c}\,\frac{\partial B}{\partial T}\,\vec{\nabla} T. \tag{5.18}$$
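The latter equation yields, together with (5.16), the following expression for the total, frequency-integrated
flux $\vec{f}$:
$$\vec{f} = \int_0^\infty \vec{f}_\nu\,d\nu = -\left(\frac{4\pi}{3\rho}\int_0^\infty \frac{1}{\kappa_\nu}\,\frac{\partial B}{\partial T}\,d\nu\right)\vec{\nabla} T. \tag{5.19}$$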
The equation has the same form as (5.12), but now with
$$k_{\rm rad} = \frac{4\pi}{3\rho}\int_0^\infty \frac{1}{\kappa_\nu}\,\frac{\partial B}{\partial T}\,d\nu. \tag{5.20}$$
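If we compare this expression for $k_{\rm rad}$ to the one given in (5.13), we obtain a useful method to average the
absorption coefficients:
$$\frac{1}{\kappa} \equiv \frac{\pi}{acT^3}\int_0^\infty \frac{1}{\kappa_\nu}\,\frac{\partial B}{\partial T}\,d\nu. \tag{5.21}$$
This is the so-called Rosseland mean opacity. Given that
$$\int_0^\infty \frac{\partial B}{\partial T}\,d\nu = \frac{acT^3}{\pi}, \tag{5.22}$$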
the Rosseland mean opacity is a harmonic average with weight functions ∂B/∂T . It is simple to compute
once the function κν is known in the form of opacity tables, as described above.
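To make the averaging procedure concrete, the sketch below evaluates the Rosseland mean numerically in the dimensionless variable x = hν/kT, in which ∂B/∂T is proportional to x⁴eˣ/(eˣ − 1)². The integral of this weight equals 4π⁴/15, the dimensionless counterpart of (5.22). The Simpson integrator, the cut-off x_max = 60 and the step-shaped test opacity are assumptions made for illustration only; real calculations use tabulated κν.

```python
import math


def rosseland_weight(x):
    """Shape of dB/dT as a function of x = h nu / (k T): x^4 e^x / (e^x - 1)^2."""
    if x == 0.0:
        return 0.0  # the weight behaves like x^2 for small x
    return x**4 * math.exp(-x) / math.expm1(-x) ** 2


def simpson(func, a, b, n=6000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def rosseland_mean(kappa_of_x, x_max=60.0):
    """Harmonic mean of kappa weighted by dB/dT, Eq. (5.21), expressed in x."""
    numerator = simpson(lambda x: rosseland_weight(x) / kappa_of_x(x), 0.0, x_max)
    norm = simpson(rosseland_weight, 0.0, x_max)
    return norm / numerator


# A grey (frequency-independent) opacity is returned unchanged.
kappa_grey = rosseland_mean(lambda x: 0.4)

# Hypothetical step opacity: transparent below x = 4, opaque above.
kappa_step = rosseland_mean(lambda x: 1.0 if x < 4.0 else 100.0)

print(kappa_grey, kappa_step)
```

Because the mean is harmonic, the transparent part of the spectrum dominates: the step opacity yields a mean much closer to 1 than to 100.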
To derive the physical interpretation of the Rosseland mean opacity, we rewrite $\vec{f}_\nu = -D_\nu\vec{\nabla} u_\nu$ using
the expressions (5.16), (5.17) and (5.18):
$$\vec{f}_\nu = -\left(\frac{1}{\kappa_\nu}\,\frac{\partial B}{\partial T}\right)\frac{4\pi}{3\rho}\,\vec{\nabla} T. \tag{5.23}$$
This result shows that, for a given point in the star (given ρ and $\vec{\nabla} T$), the integrand in expression (5.21) is
proportional to the net energy flux $\vec{f}_\nu$ at every frequency. The Rosseland mean is hence constructed in such a
way that the highest weight is given to frequencies with maximal energy flux.
A downside of the Rosseland mean is that the opacity κ of a mixture of two different gases with
opacities κ1 and κ2 is not equal to the sum of the individual opacities: κ ≠ κ1 + κ2. Therefore, it is
not sufficient to know the Rosseland mean for the two different gases that both occur in the gas mixture,
to be able to determine the Rosseland mean of the mixture. Suppose, for example, that the gas contains
a hydrogen fraction X and a helium fraction Y; then the Rosseland mean opacity must be computed for
$\kappa_\nu = X\kappa_\nu({\rm H}) + Y\kappa_\nu({\rm He})$. Each time the abundance ratio Y/X changes, $\kappa_\nu$ has to be recomputed before the
Rosseland opacity can be evaluated using expression (5.21).
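The non-additivity is easy to see in a toy model. The sketch below replaces the integral in (5.21) by a sum over two frequency bins with equal weight; the bin weights and the opacity values are invented for illustration and carry no physical meaning. Gas 1 is opaque in the first bin and transparent in the second, gas 2 the reverse.

```python
def rosseland_mean_binned(kappas, weights):
    """Weighted harmonic mean over frequency bins (discrete analogue of Eq. 5.21)."""
    total_weight = sum(weights)
    return total_weight / sum(w / k for k, w in zip(kappas, weights))


weights = [0.5, 0.5]        # assumed share of dB/dT in each bin
kappa_gas1 = [10.0, 0.1]    # opaque in bin A, transparent in bin B
kappa_gas2 = [0.1, 10.0]    # the reverse

mean1 = rosseland_mean_binned(kappa_gas1, weights)
mean2 = rosseland_mean_binned(kappa_gas2, weights)

# The monochromatic opacities add bin by bin; the means do not.
kappa_mix = [k1 + k2 for k1, k2 in zip(kappa_gas1, kappa_gas2)]
mean_mix = rosseland_mean_binned(kappa_mix, weights)

print(f"sum of the individual means: {mean1 + mean2:.3f}")
print(f"mean of the mixture:         {mean_mix:.3f}")
```

Each gas on its own leaves a transparent window that dominates its harmonic mean, whereas the mixture has no window left, so its mean is far larger than the sum of the individual means.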
Until now, we have assumed that the energy flux is only the result of a diffusion process in which
photons take part. In the following sections, we will discuss two other ways of energy transport. Therefore,
we will from now on indicate all quantities that relate to radiative energy transfer with a lower index “rad”,
e.g. $\kappa_{\rm rad}$, $\vec{f}_{\rm rad}$, etc.
5.2 Energy transport by conduction
Energy transport via heat conduction occurs through collisions induced by the thermal motion of particles
such as electrons and atomic nuclei in ionised matter, and atoms and molecules in non-ionised matter. In
“common” stellar material, conduction is not an important energy transport mechanism. Although the effective
cross section for collisions of particles is relatively low in the stellar interior (approximately $10^{-20}$ cm$^2$
per particle), the high density implies that the mean free path is many orders of magnitude smaller than that
of photons. Moreover, the velocity of the particles is only a fraction of the speed of light c. As a result, the
diffusion coefficient D is much smaller than the one for radiative transport via photons.
This situation changes, however, when considering the cores of evolved stars in which the electron
gas is degenerate. The density in a degenerate electron gas is enormous, typically $10^6$ g cm$^{-3}$, but, on
the other hand, the velocities attained by the electrons are a significant fraction of c. The degeneracy increases
the mean free path significantly. As a result, the diffusion coefficient becomes large, and heat conduction
becomes an important energy transport mechanism that dominates over the radiative transport.
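The energy flux caused by heat conduction $\vec{f}_{\rm cd}$ can also be described by the diffusion formula
$\vec{f}_{\rm cd} = -k_{\rm cd}\vec{\nabla} T$. The sum of the radiative and conductive flux can be written as
$$\vec{f} = \vec{f}_{\rm rad} + \vec{f}_{\rm cd} = -\left(k_{\rm rad} + k_{\rm cd}\right)\vec{\nabla} T. \tag{5.24}$$
Similar to (5.13), we can formally write the conduction coefficient $k_{\rm cd}$ as
$$k_{\rm cd} = \frac{4ac}{3}\,\frac{T^3}{\kappa_{\rm cd}\,\rho}, \tag{5.25}$$
in which we have introduced the conductive opacity $\kappa_{\rm cd}$. The total energy flux becomes
$$\vec{f} = -\frac{4ac}{3}\,\frac{T^3}{\rho}\left(\frac{1}{\kappa_{\rm rad}} + \frac{1}{\kappa_{\rm cd}}\right)\vec{\nabla} T. \tag{5.26}$$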
This equation shows that we can formally obtain the same equation as the one we obtained in the purely
radiative case (5.11), if we replace 1/κ by 1/κrad + 1/κcd. The transport mechanism that dominates the
sum, is the one which has the highest “transparency”.
Equation (5.15) with adapted κ is now valid for radiative and conductive transport. We reformulate
the equation in a form which will prove to be convenient later on. Under the assumption of hydrostatic
equilibrium, we divide (5.15) by (3.17) and obtain
$$\frac{(\partial T/\partial m)}{(\partial P/\partial m)} = \frac{3}{16\pi acG}\,\frac{\kappa\,l}{m T^3}. \tag{5.27}$$
We define the ratio between the partial derivatives on the left-hand side as (dT/dP )rad: the variation of
T with depth, where depth is expressed in terms of pressure (the pressure is a monotonically increasing
function towards the stellar center). For a star in hydrostatic equilibrium that transports energy via radiation
and conduction, (dT/dP )rad has the meaning of a gradient that describes the temperature variation with
depth. Using the common abbreviation
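$$\nabla_{\rm rad} \equiv \left(\frac{d\ln T}{d\ln P}\right)_{\rm rad}, \tag{5.28}$$
we obtain for (5.27)
$$\nabla_{\rm rad} = \frac{3}{16\pi acG}\,\frac{\kappa\,l\,P}{m T^4}, \tag{5.29}$$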
in which κ refers to the combined opacity of radiative and conductive transport. ∇rad is called the radiative
temperature gradient. It is the local logarithmic derivative of the temperature with respect to the pressure
that would be necessary if the entire luminosity had to be transported through radiation only.
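A minimal sketch of how (5.26) and (5.29) combine in practice: the radiative and conductive opacities are first merged into a single κ, which is then inserted in the expression for ∇rad. The numerical inputs below are assumed for illustration and do not come from a stellar model; the constants are standard CGS values.

```python
import math

A_RAD = 7.5657e-15   # radiation density constant a [erg cm^-3 K^-4]
C_LIGHT = 2.998e10   # speed of light [cm/s]
G_GRAV = 6.674e-8    # gravitational constant [cm^3 g^-1 s^-2]


def combined_opacity(kappa_rad, kappa_cd):
    """Effective opacity with 1/kappa = 1/kappa_rad + 1/kappa_cd, cf. Eq. (5.26)."""
    return 1.0 / (1.0 / kappa_rad + 1.0 / kappa_cd)


def nabla_rad(kappa, lum, P, m, T):
    """Radiative temperature gradient, Eq. (5.29)."""
    return 3.0 * kappa * lum * P / (16.0 * math.pi * A_RAD * C_LIGHT * G_GRAV * m * T**4)


# Assumed, purely illustrative local values (CGS).
kappa_rad, kappa_cd = 1.0, 1.0e3   # conduction negligible: kappa_cd >> kappa_rad
lum, P, m, T = 3.8e33, 6.0e14, 1.9e33, 4.0e6

kappa = combined_opacity(kappa_rad, kappa_cd)
grad = nabla_rad(kappa, lum, P, m, T)
print(f"kappa = {kappa:.4f} cm^2/g, nabla_rad = {grad:.4f}")
```

The combined opacity is always smaller than the smaller of the two inputs, which expresses that the more “transparent” channel carries most of the flux.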
We note that ∇rad and ∇ad are defined differently, and have, apart from different numerical values, a
different physical meaning. ∇rad describes a local derivative that connects P and T in two neighbouring
fluid elements, while ∇ad is a thermodynamical derivative, that describes the thermal variation of a single
fluid element during adiabatic expansion/compression.
Again we define a characteristic time scale, this time based on equation (5.29): the thermal time scale
or time scale for thermal adaptation $\tau_{\rm th}$. One can show that $\tau_{\rm th} \approx \tau_{\rm HK}$ when considering the value of
these time scales averaged over the entire star. This means that the Helmholtz-Kelvin time scale can be
interpreted as the period of time that a thermal fluctuation needs to travel from the stellar core to the stellar
surface. Despite the equivalence between the two time scales, it is best to use them separately. In most cases,
the Helmholtz-Kelvin time scale is used to describe the entire star, while the time scale of thermal adaptation
is used for specific local layers in the star, where its value can differ strongly from layer to layer.
5.3 Stability analysis
Until now, we have assumed strict spherical symmetry. We hence assume that all functions are constant
over concentric spheres. In practice, small fluctuations occur, e.g. the thermal motion of gas particles. Such
local disturbances can be neglected, under the condition that they never grow to macroscopic, non-spherical,
local motions. This means that we are allowed to maintain spherical symmetry in the basic equations if we
consider the variables as accurate average values over the concentric spheres.
Macroscopic motions, however, can have a large impact on the stellar structure. They can “mix”
stellar material, and moreover transport energy. The latter is because hot fluid elements will rise, while cool
fluid elements will sink. This energy transport mechanism is called convection. Whether or not convection
occurs in a certain stellar layer depends on whether small fluctuations remain small, or are able to grow
and become larger. In other words, it is a question of stability. Therefore, we will first derive criteria for
stability with respect to local, non-spherically symmetric disturbances, before we discuss convective energy
transport.
5.3.1 Dynamical instability
The starting point for the discussion of dynamical instability is the assumption that moving fluid elements do
not have a sufficient amount of time to exchange a substantial fraction of their heat with their surroundings.
In other words, these elements move in an adiabatic$^1$ way. Consider the situation where physical quantities
such as temperature, density, etc. are not constant over the surface of a concentric sphere inside a star, but
have small local fluctuations. In our treatment of the global stellar structure, we assumed that the physical
quantities derived in the previous sections are good averages over the concentric spheres.
For a local description we will represent a fluctuation by considering a fluid element (with lower index
“e”) in which the physical quantities are slightly different than those in the surroundings (lower index “s”)
of the element. For a quantity A, we define the difference DA between the element and the surroundings
as $DA \equiv A_e - A_s$. Assume there is a small temperature fluctuation, e.g. the fluid element is slightly hotter
than the surroundings with DT > 0. At first, one would then also expect an excess in pressure
DP. However, the fluid element will expand to restore the pressure equilibrium with the surroundings. This
expansion occurs at the speed of sound, i.e., much faster than any other possible motion of the element.
Hence, we can assume that the pressure in the element is always in equilibrium with that of the surroundings:
DP = 0. In other words, we assume that the fluid element and its surroundings are in hydrostatic
equilibrium.
In case of an ideal gas with ρ ∼ P/T, the excess in temperature DT leads to Dρ < 0. The element becomes
less dense (“lighter”) than its surroundings and will start to feel buoyancy (principle of Archimedes),
given by −gDρ per unit volume, which lifts up the element. Temperature fluctuations hence lead to element
movements in the radial direction. To test the stability of a layer with respect to local temperature
fluctuations, one can thus equivalently take a radial displacement ∆r > 0 as the initial perturbation of the element.
Consider a fluid element that is in equilibrium with its surroundings at its original position r, but that
is lifted by a perturbation to a position r + ∆r (see Figure 5.1). The density difference between the element
and its surroundings at location r + ∆r is
$$D\rho = \left[\left(\frac{d\rho}{dr}\right)_e - \left(\frac{d\rho}{dr}\right)_s\right]\Delta r, \tag{5.30}$$
in which $(d\rho/dr)_e$ represents the change in density of the element due to its rise. The other derivative has
a similar meaning: it is the density gradient of the surroundings. Dρ induces a radial component
$K_r = -g\,D\rho/\rho$ of the force $\vec{K}$ per unit mass. This is the so-called buoyancy force of Archimedes. When Dρ < 0,
the element is less dense (“lighter”) than its new surroundings at r + ∆r, and $K_r > 0$. This means that the
force $\vec{K}$ is pointing outward. This situation is unstable, because the element will be lifted up even further
away from its initial location r. On the other hand, $K_r < 0$ when Dρ > 0. In this case, $\vec{K}$ points inward.
The element is “heavier” than the other elements in its new surroundings, and the element will be pulled
back down. The equilibrium is restored, and the layer remains stable. The condition for stability hence reads
$$\left(\frac{d\rho}{dr}\right)_e - \left(\frac{d\rho}{dr}\right)_s > 0. \tag{5.31}$$
$^1$ When fluid elements do have enough time to exchange a substantial fraction of their heat with their surroundings, they move
diabatically. In astronomy, however, the word “non-adiabatic” is used for the latter movement, which is in fact a double negation.
Figure 5.1: The fluid element “e” with initial position r in the gas with local conditions T, P, ρ is lifted
due to a fluctuation with respect to its surroundings “s” to a position r + ∆r, where the conditions are
T + ∆T, P + ∆P, ρ + ∆ρ.
In practice, however, it is difficult to apply this criterion because it is based on the knowledge of the density
gradient, a quantity that does not appear in the basic equations of the stellar structure. It would be more
convenient if we could derive a criterion based on the temperature gradient, because the latter appears in the
equation that describes radiative and conductive energy transport.
To evaluate $(d\rho/dr)_e$ correctly, one has to determine, in principle, the energy exchange between the
element and its surroundings. Here, we make the approximation that there is no heat exchange, i.e., the
element moves adiabatically. For areas in the deep stellar interior, this is a good approximation. To convert
the derivative of the density into a derivative of the temperature, we consider the equation of state
ρ = ρ(P, T, µ) in its differential form:
$$\frac{d\rho}{\rho} = \alpha\,\frac{dP}{P} - \delta\,\frac{dT}{T} + \varphi\,\frac{d\mu}{\mu}. \tag{5.32}$$
The quantities α and δ were defined previously. In expression (5.32), we allow for a change in chemical
composition, which is characterised by the molecular weight µ. We assume that dµ = 0 for the element
that carries its chemical composition with it. However, it is possible that dµ ≠ 0 for the surroundings when
the element arrives in a layer with a different chemical composition than the layer at its initial location. In
analogy to α and δ, which are to be evaluated for constant T, µ and P, µ, respectively, we define:
$$\varphi \equiv \left(\frac{\partial \ln\rho}{\partial \ln\mu}\right)_{P,T}. \tag{5.33}$$
For an ideal mono-atomic gas, we have ρ ∼ Pµ/T and hence α = δ = ϕ = 1.
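Using (5.32), the stability criterion (5.31) can now be written as
$$\left(\frac{\alpha}{P}\frac{dP}{dr}\right)_e - \left(\frac{\delta}{T}\frac{dT}{dr}\right)_e - \left(\frac{\alpha}{P}\frac{dP}{dr}\right)_s + \left(\frac{\delta}{T}\frac{dT}{dr}\right)_s - \left(\frac{\varphi}{\mu}\frac{d\mu}{dr}\right)_s > 0. \tag{5.34}$$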
The two terms that contain the pressure gradient cancel because of the assumption DP = 0.
We define the pressure scale height $H_P$:
$$H_P \equiv -\frac{dr}{d\ln P} = -P\,\frac{dr}{dP}. \tag{5.35}$$
The local pressure scale height is a length scale. It is the distance over which the pressure drops by a factor
e (the base of the natural logarithm). Because P decreases with increasing r, $H_P > 0$. Expressed in terms
of $H_P$, the condition for hydrostatic equilibrium is $H_P = P/(\rho g)$. Typical values are $H_P \simeq 1.4\times10^7$ cm in
the photosphere of the Sun, and approximately $5.2\times10^9$ cm at a depth of $R_\odot/2$.
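The photospheric value can be checked with a short calculation. For an ideal gas, P/ρ = kT/(µ m_u), so the hydrostatic form of the scale height depends only on T, µ and g. The sketch below uses assumed photospheric values for the Sun (T ≈ 5772 K, g ≈ 2.74×10⁴ cm s⁻², and µ ≈ 1.3 for largely neutral gas); these inputs are not given in the text.

```python
K_BOLTZ = 1.380649e-16   # Boltzmann constant [erg/K]
M_U = 1.66054e-24        # atomic mass unit [g]


def pressure_scale_height(P, rho, g):
    """H_P = P / (rho g), the hydrostatic form of Eq. (5.35) [cm]."""
    return P / (rho * g)


def scale_height_ideal_gas(T, mu, g):
    """Same quantity for an ideal gas, for which P / rho = k T / (mu m_u)."""
    return K_BOLTZ * T / (mu * M_U * g)


# Assumed solar photospheric values (CGS); mu ~ 1.3 for largely neutral gas.
h_p_sun = scale_height_ideal_gas(T=5772.0, mu=1.3, g=2.74e4)
print(f"H_P in the solar photosphere ~ {h_p_sun:.2e} cm")   # roughly 1.3e7 cm
```

The result is within a few percent of the value of about 1.4×10⁷ cm quoted above; the small difference reflects the approximate inputs.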
When multiplying all terms of (5.34) with $H_P > 0$, and taking into account δ > 0, the stability criterion
becomes
$$\left(\frac{d\ln T}{d\ln P}\right)_s < \left(\frac{d\ln T}{d\ln P}\right)_e + \frac{\varphi}{\delta}\left(\frac{d\ln\mu}{d\ln P}\right)_s. \tag{5.36}$$
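The step from (5.34) to (5.36) can be written out explicitly. The following is a sketch of the intermediate algebra, using only the ingredients already introduced (cancelling pressure terms, dµ = 0 for the element, and the definition (5.35) of the pressure scale height):

```latex
\begin{align*}
&\text{After the pressure terms cancel, (5.34) reduces to}\\
&\qquad -\left(\frac{\delta}{T}\frac{dT}{dr}\right)_e
        +\left(\frac{\delta}{T}\frac{dT}{dr}\right)_s
        -\left(\frac{\varphi}{\mu}\frac{d\mu}{dr}\right)_s > 0 .\\
&\text{From (5.35), } H_P\,\frac{d\ln T}{dr} = -\frac{d\ln T}{d\ln P}
 \text{ and } H_P\,\frac{d\ln \mu}{dr} = -\frac{d\ln \mu}{d\ln P},
 \text{ so multiplying by } H_P > 0 \text{ gives}\\
&\qquad \delta\left(\frac{d\ln T}{d\ln P}\right)_e
        -\delta\left(\frac{d\ln T}{d\ln P}\right)_s
        +\varphi\left(\frac{d\ln \mu}{d\ln P}\right)_s > 0 .\\
&\text{Dividing by } \delta > 0 \text{ and rearranging yields (5.36).}
\end{align*}
```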
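In analogy to the quantities ∇rad and ∇ad, we define three new derivatives:
$$\nabla \equiv \left(\frac{d\ln T}{d\ln P}\right)_s, \qquad \nabla_e \equiv \left(\frac{d\ln T}{d\ln P}\right)_e, \qquad \nabla_\mu \equiv \left(\frac{d\ln\mu}{d\ln P}\right)_s. \tag{5.37}$$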
∇ and ∇µ are spatial derivatives, to be evaluated in the new surroundings of the fluid element. They describe
the variation of T and µ with depth, in which P is a probe for that depth. ∇e describes the variation of T
in the element during its motion. Also here, the position of the element is expressed in terms of pressure
P. ∇e and ∇ad are defined in a similar way, because both describe the temperature variation of the gas in
the fluid element, when it experiences a change in pressure. On the other hand, ∇rad and ∇µ describe the
spatial variation of T and µ in the surroundings. When ∇ = ∇rad, the energy transport is fully done by
radiation (and conduction). However, when ∇ < ∇rad, a part of the transport is done by convection.
The condition for stability becomes:
$$\nabla < \nabla_e + \frac{\varphi}{\delta}\,\nabla_\mu. \tag{5.38}$$
In a layer where the energy transport is done exclusively by radiation, we have ∇ = ∇rad. We now investigate
the stability of such a layer assuming that fluid elements move adiabatically (∇e = ∇ad). The condition for
stability now reads
$$\nabla_{\rm rad} < \nabla_{\rm ad} + \frac{\varphi}{\delta}\,\nabla_\mu. \tag{5.39}$$
This stability criterion is known as the criterion of Ledoux for dynamical stability.$^2$
In a region in the star with a homogeneous chemical composition, we deduce the Schwarzschild criterion
for dynamical stability
$$\nabla_{\rm rad} < \nabla_{\rm ad}. \tag{5.40}$$
When, in both criteria, the left-hand side is larger than the right-hand side, the layer is dynamically unstable.
This means that the energy transport via radiation would impose too large a temperature gradient, and hence
a switch to convection is needed to carry away the energy. When both sides in the equation are equal, there
is marginal stability. The difference between the two criteria is only important in layers where the chemical
composition changes in the radial direction. This occurs in layers close to the core of evolved stars, where
the heavy chemical elements are produced deeper in the core than the light elements, and hence µ changes
strongly going inward. The last term on the right-hand side in the Ledoux criterion has a stabilising effect,
because a mass element with heavier matter will be lifted up to a surrounding with lighter material. The
buoyancy force will push the heavier fluid element back down, to its initial position. When the criteria
of Ledoux or Schwarzschild are met, the energy transport is done exclusively by radiation, and one has
∇ = ∇rad.
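The two criteria are simple inequalities and can be evaluated layer by layer. The sketch below is a minimal illustration; the function names and the sample gradients are hypothetical, and the defaults ϕ = δ = 1 correspond to the ideal-gas case mentioned above.

```python
def schwarzschild_stable(nabla_rad, nabla_ad):
    """Eq. (5.40): True if the layer is dynamically stable (homogeneous composition)."""
    return nabla_rad < nabla_ad


def ledoux_stable(nabla_rad, nabla_ad, nabla_mu, phi=1.0, delta=1.0):
    """Eq. (5.39): True if the layer is dynamically stable, including the mu-gradient."""
    return nabla_rad < nabla_ad + (phi / delta) * nabla_mu


NABLA_AD = 0.4   # adiabatic gradient 2/5, away from ionisation zones

# Hypothetical layers: (nabla_rad, nabla_mu)
layers = {"shallow gradient": (0.25, 0.0),
          "steep gradient, homogeneous": (0.60, 0.0),
          "steep gradient, mu increasing inward": (0.60, 0.3)}

for name, (n_rad, n_mu) in layers.items():
    print(name,
          "| Schwarzschild stable:", schwarzschild_stable(n_rad, NABLA_AD),
          "| Ledoux stable:", ledoux_stable(n_rad, NABLA_AD, n_mu))
```

The third layer shows the stabilising effect of the µ-gradient: it is unstable according to Schwarzschild but stable according to Ledoux.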
Convective motion only occurs in a star when the criteria of Ledoux or Schwarzschild are not fulfilled.
This happens when:
• l(r)/m(r) is large, i.e., when the energy production within a radius r is very large. This occurs in
massive stars, and they therefore have a convective core.
• the opacity κ is large. This occurs in (the outer layers of) stars with low surface temperatures.
• ∇ad is small. This occurs mostly in partial ionisation zones of hydrogen, in the outer layers of cool
stars, because cP becomes very large there (the absorbed heat is mostly used to further ionise the
matter, not to heat it).
In this case, small perturbations will grow to a large amplitude until the whole region “boils”, with convective
spheres carrying away a part of the energy. The convective energy transport must be treated as described in
the next section. Convection occurs in the inner regions of high-mass stars and in the outer layers of cool
stars. The different temperature gradients in the current Sun are shown in Figure 5.2. As mentioned before,
∇rad becomes very large in the outer layers of the Sun due to the strong increase in opacity. Moreover, ∇ad
drops strongly below 2/5 in the ionisation zones of hydrogen and helium. In the regions which are stable
with respect to convection, ∇ equals ∇rad and the energy transport is fully radiative. In almost the entire
convective zone, ∇ is only a bit larger than ∇ad, except in a very thin layer at the upper part of the convective
envelope. Figure 5.3 compares the temperature gradients in a sun-like star to those in a star of 4 M⊙.
$^2$ Named after the University of Liège astrophysicist Prof. Paul Ledoux.
Figure 5.2: The effective temperature gradient ∇ in the current Sun (full line) and the adiabatic temperature
gradient ∇ad (dashed line). In the radiative region, r ≤ 0.72R⊙, ∇ = ∇rad. In the convective envelope, ∇ is
almost perfectly equal to ∇ad and the full and dashed lines coincide, except very near the surface, where
radiation has no problem escaping the star efficiently. (Figure courtesy of Prof. J. Christensen-Dalsgaard,
Aarhus University, DK)
We note that the criteria for stability are local criteria. Hence they can be evaluated easily for a specific
layer when the local quantities P, T and ρ are known, without further knowledge of the other parts of the
star. On the other hand, it is clear that convective motions do not only depend on local forces (as assumed
when deriving the criteria). These convective motions can influence the entire stellar structure, because in
reality each layer is coupled to all neighbouring layers via the basic equations. For certain purposes, the “reaction”
of the whole star to convection needs to be considered. An example is the precise determination of the
boundaries of the convective zone, where fluid elements that were accelerated elsewhere “overshoot” until
their motion is stopped. It is still unclear how important this overshooting is, although it is of great importance
when computing evolution models of stars that are born with a convective core. We come back to this issue
further on, but we first need to discuss convective energy transport in detail.
Figure 5.3: Comparison between the temperature gradients (denoted in this plot by a different symbol than ∇) in a
sun-like star with a convective envelope and in a star of 4 M⊙ with a convective core. (Figure courtesy of
Prof. Onno Pols, Lecture Notes on Stellar Structure and Evolution, Radboud University Nijmegen, NL)
5.3.2 Buoyancy frequency and semiconvection
In a zone that is stable against convection, a fluid element that gets displaced by moving up will be pulled
back down until it is again situated at its equilibrium position, thanks to the action of the buoyancy force of
Archimedes. This oscillatory motion of the fluid elements depends on the local gravitational acceleration,
density, pressure, and chemical composition of the gas and happens with the so-called Brunt-Väisälä frequency,
or buoyancy frequency in brief, whose square is defined as
$$N^2 \equiv g\left(\frac{1}{\Gamma_1}\,\frac{d\ln P}{dr} - \frac{d\ln\rho}{dr}\right). \tag{5.41}$$
We reformulate its physical meaning via Fig. 5.1: when the density of the displaced fluid element is lower
than the one of its surroundings, it will experience a net buoyancy accelerating the element further to the
surface and we have convective instability. In that case, $N^2 < 0$. On the other hand, when the density in the
fluid element is larger than the one of its surroundings, buoyancy forces it back to its original equilibrium
position. This results in an oscillatory motion with a local frequency given by N(r).
In this way, we find that regions stable against convection have $N^2 > 0$, while convective regions have
$N^2 < 0$. As an alternative to using temperature gradients, the condition for convective instability can thus
also be written as:
$$\frac{d\ln\rho}{d\ln P} < \frac{1}{\Gamma_1}. \tag{5.42}$$
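The link between the sign of N² in (5.41) and the inequality (5.42) involves one sign change that is easy to overlook. A short sketch of the argument, using only that g > 0 and that the pressure decreases outward:

```latex
\begin{align*}
N^2 < 0
 \;&\Longleftrightarrow\;
 \frac{1}{\Gamma_1}\,\frac{d\ln P}{dr} - \frac{d\ln\rho}{dr} < 0
 && (g > 0)\\
 \;&\Longleftrightarrow\;
 \frac{1}{\Gamma_1} - \frac{d\ln\rho}{d\ln P} > 0
 && \left(\text{division by } \frac{d\ln P}{dr} < 0 \text{ reverses the inequality}\right)\\
 \;&\Longleftrightarrow\;
 \frac{d\ln\rho}{d\ln P} < \frac{1}{\Gamma_1}.
\end{align*}
```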
For a star fusing hydrogen into helium, N increases with time in the deep stellar interior because the
helium abundance increases due to the nuclear burning at the expense of hydrogen. In such a case lighter
gas surrounding the helium core is situated on top of heavier material and this adds to the stability. Thus
N increases throughout the core-hydrogen burning phase of evolution. This can be seen from the hydrogen
mass fraction profiles and the corresponding $N^2$ profiles of an evolving stellar model in Fig. 5.4. It is also
seen in this figure that $N^2 = 0$ in the convective core of the models, which decreases in mass fraction as the
evolution proceeds.
The dependence of $N^2$ on the chemistry of the star is most easily seen for the case of an ideal gas. In
that case, one can approximate $N^2$ as
$$N^2 \simeq \frac{g}{H_P}\left[\delta\left(\nabla_{\rm ad} - \nabla\right) + \varphi\,\nabla_\mu\right]. \tag{5.43}$$
The µ-gradient thus affects the local behaviour of N(r) in the radiatively stratified layers of the star. In
a dynamically stable layer, a displaced fluid element is pulled back by buoyancy. However, a vibrational
instability may occur in a layer that is unstable according to the Schwarzschild criterion but stable
according to the Ledoux criterion:
$$\nabla_{\rm ad} < \nabla_{\rm rad} < \nabla_{\rm ad} + \frac{\varphi}{\delta}\,\nabla_\mu. \tag{5.44}$$
In this case, the fluid element is constantly rising and being pulled back down due to these two conditions.
This instability may have consequences for stellar structure when it concerns the motion of fluid elements
in layers with a different chemical composition. As shown in Fig. 5.4 and further discussed in Chapter 7 and
Part III, this occurs in the near-core regions of massive stars, because their convective core recedes during
the core-hydrogen burning phase. The receding convective core leaves behind a so-called µ-gradient zone
in which the right term in Eq. (5.44) becomes considerable such that vibrational instability sets in. In such a
case, the fluid elements oscillate between regions of different chemical composition, introducing slow mixing,
called semiconvection. Depending on its (unknown) efficiency of chemical mixing, the semiconvection
may or may not introduce stratification in the mixed layer, resulting in plateaus in the profiles of chemical elements.
For efficient semiconvective mixing, regions where ∇µ = 0 may result and their stability properties
will thus be altered by the semiconvection. For example, the hydrogen profile in the near-core region of
a massive star may become stratified; in that case, the semiconvective mixing may bring hydrogen in that
zone. This influx of hydrogen subsequently changes the value of ∇rad since it depends dominantly on the
hydrogen mass fraction (recall that the opacity ∼ (1+X) and is hence dominated by the hydrogen fraction).
As a result, the balance between the precise values of ∇rad and ∇µ, which are the decisive factors to have
convection or not, depends on the (in)efficiency of the semiconvection. In practice, semiconvection occurs
along with several other sources of mixing and their time scales and efficiencies decide which of the sources
dominate. We treat the topic of chemical mixing in stellar interiors in the dedicated Chapter 7.
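The ideal-gas approximation (5.43) and the double inequality (5.44) can be combined into a small classification routine. The sketch below is only an illustration: the labels, function names and sample values are assumptions, and the defaults δ = ϕ = 1 hold for an ideal gas.

```python
def brunt_vaisala_sq(g, H_P, nabla, nabla_ad, nabla_mu, delta=1.0, phi=1.0):
    """Squared buoyancy frequency N^2 [s^-2] from the approximation Eq. (5.43)."""
    return g / H_P * (delta * (nabla_ad - nabla) + phi * nabla_mu)


def classify_layer(nabla_rad, nabla_ad, nabla_mu, delta=1.0, phi=1.0):
    """Classify a layer using the Schwarzschild and Ledoux limits, cf. Eq. (5.44)."""
    ledoux_limit = nabla_ad + (phi / delta) * nabla_mu
    if nabla_rad < nabla_ad:
        return "radiative"        # stable according to both criteria
    if nabla_rad < ledoux_limit:
        return "semiconvective"   # Schwarzschild-unstable, Ledoux-stable
    return "convective"           # unstable according to both (includes marginal cases)


# Assumed illustrative values (CGS): g and H_P of the order found in the solar photosphere.
g, H_P, nabla_ad = 2.74e4, 1.4e7, 0.4

print(classify_layer(0.25, nabla_ad, 0.0))
print(classify_layer(0.50, nabla_ad, 0.3))
print(classify_layer(0.90, nabla_ad, 0.3))

# In a radiative layer (nabla = nabla_rad < nabla_ad) the result is N^2 > 0.
n_sq = brunt_vaisala_sq(g, H_P, nabla=0.25, nabla_ad=nabla_ad, nabla_mu=0.0)
print(f"N^2 = {n_sq:.3e} s^-2")
```

In the semiconvective case, N² evaluated with ∇ = ∇rad remains positive because the µ-gradient term outweighs the negative thermal term, which is the stabilising effect described in the text.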
Figure 5.4: Top: Main-sequence evolutionary track in the HR diagram computed with the MESA code (see
Chapter 8) for a non-rotating star of 3.25 M⊙. Three stellar models are selected along the track indicated
by the three coloured circles, corresponding to the zero-age main sequence, mid-main sequence, and near
the terminal-age main sequence, respectively. Bottom: The corresponding profiles of the hydrogen mass
fraction X and the squared buoyancy frequency $N^2$ are shown as a function of fractional mass coordinate.
(Figure courtesy of Dr. May Gade Pedersen)
As a side remark connected with the Ledoux criterion, let us imagine that a vibrational instability
occurs in a critical layer in the envelope of the star. A layer fulfilling the Ledoux but not the Schwarzschild
criterion may coincide with a transition between an adiabatic and a non-adiabatic gas layer. In a chemically
homogeneous layer, compression is accompanied by a local temperature increase, resulting in a reduced
opacity following Kramers' law. As a reaction, the layer expands, cools down and the instability is damped.
However, when the critical layer is not chemically homogeneous, such as in a partial ionisation layer, the
absorbed radiation will dominantly be used to further ionise the material in that layer rather than heating it
up. As a consequence, the density increase is dominant in the opacity behaviour and the latter increases
upon compression, blocking even more radiative flux. This leads to an expansion beyond equilibrium,
which can grow in time if circumstances are optimal in the sense that radiative damping in the non-adiabatic
layer becomes inferior to the excitation and growth of the vibrational instability. This type of vibrational
instability is thus strongly connected with the opacity behaviour in that critical layer and is therefore called
an opacity-driven instability. It is responsible for several types of pulsations that may occur in stars. We
refer to the Master courses Asteroseismology and Theory of Stellar Oscillations for a detailed description of
the conditions for, and consequences of the growth and damping of vibrational instabilities.
5.4 Convective energy transport
When the opacity and the amount of energy to be transported become too large, radiation alone cannot effi-
ciently transport the energy in a stable way. In that case, convection takes over the task of energy transport.
Convective energy transport is the exchange of energy between hotter and cooler layers of a dynamically
unstable region by means of moving macroscopic fluid elements. The hot convective cells move up,
while the cool cells sink (cf. Figure 5.1). The moving cells will
dissolve in their new environment and hence release their excess or lack of heat. Because the density near
the stellar core is very high, convection can be a very efficient way to transport energy.
The detailed theoretical treatment of convective motions in stars appears extremely complicated and
a full description based on first principles does not exist. This is not surprising, because even convective
motions in a kettle with boiling water give rise to complex hydrodynamical motions that are not yet un-
derstood. Solving the hydrodynamical equations for stars, taking into account convection, has until now
only been done in simplified conditions that could be tested in the laboratory. Convection in stars, however,
occurs in extreme circumstances in which turbulent motions can transport large amounts of energy in a very
compressible gas, which has a pressure, density, temperature and gravity that can vary over many orders of
magnitude from layer to layer.
5.4.1 Mixing length theory
Many attempts have been made to take convection into account as accurately as possible. We limit
ourselves here to the description of a long-standing and simple approximation, namely the “mixing
length” theory. This theory allows for a local treatment of convection in a fairly simple way and is a good
approximation for regions near the stellar core. The mixing length theory was developed for stars in
hydrostatic equilibrium and assumes that convection is time-independent.
The mixing length theory states that convection can be compared to heat transport by molecules. The
transporting particles are macroscopic “spheres” (cf. Figure 5.1) instead of molecules, and their mean free
path (“mixing length”) is the distance over which they move before dissolving in their new environment.
The total energy flux $l/4\pi r^2$ at a certain location now consists of the sum of the radiative flux $f_{\rm rad}$ (in which
we incorporate a possible contribution of conduction) and the convective flux $f_{\rm con}$.
In Eq. (5.29), we have defined ∇rad as the gradient necessary to transport the total energy flux by
radiation. A part of the flux is now, however, transported through convection, which means that the true
unknown gradient ∇ of the layer will be smaller:
$$f_{\rm rad} + f_{\rm con} = \frac{4acG}{3}\,\frac{T^4 m}{\kappa P r^2}\,\nabla_{\rm rad} \tag{5.45}$$
and
$$f_{\rm rad} = \frac{4acG}{3}\,\frac{T^4 m}{\kappa P r^2}\,\nabla. \tag{5.46}$$
Herein, ∇ is a new quantity to be determined. For this we need an expression for $f_{\rm con}$. It is assumed that
the convective element moves radially over a distance $\ell_m$ with a velocity $v_{\rm conv}$. Subsequently it arrives in a
cooler environment, where the element has a temperature excess $DT$. In this new environment, it dissolves
and releases its excess of internal energy. Because of the assumption of pressure balance ($DP = 0$), the
released heat per unit mass is $c_P\,DT$. The local convective energy flux corresponding to this heat exchange is therefore
$f_{\rm con} = \rho\,v_{\rm conv}\,c_P\,DT$.
To actually compute $f_{\rm con}$, a variety of additional assumptions is considered, e.g., about the amount of
work done by the sphere before dissolving in its new surroundings, about the fraction of this work that is
transformed into kinetic energy of the sphere and the surrounding fluid elements, about radiative energy loss
of the sphere (as it ends up in a cooler environment), etc. In this way, ∇ can be estimated. The assumptions
made, as well as the computational scheme, can be found in the next subsection.
5.4.2 A computational scheme
We assume for all the fluid elements that their motion started only as a very small disturbance. Then we can
choose the initial values of DT0 and v0 equal to zero. Due to differences in the temperature gradient and in
the buoyancy, DT and vconv will change as the element sinks or rises. This will happen until the element,
after travelling over a distance ℓm (the “mixing length”), is dissolved in its new environment and loses its
identity. The elements that enter a concentric sphere with radius r at a given time have a different vconv and
DT , as they started their motion at a different distance, between 0 and ℓm. We therefore assume that the
“average” element has travelled over a distance ℓm/2 when it enters the concentric sphere. We then have
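$$\frac{DT}{T} = \frac{1}{T}\,\frac{\partial (DT)}{\partial r}\,\frac{\ell_m}{2} = (\nabla - \nabla_e)\,\frac{\ell_m}{2}\,\frac{1}{H_P}. \tag{5.47}$$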
The density difference is, due to the assumption DP = 0 and Dµ = 0, simply Dρ/ρ = −δDT/T and the
buoyancy is kr = −g(Dρ/ρ). We assume that half of this value has acted on the element when it moved
forward over a distance ℓm/2. The work supplied then is:
$$\frac{1}{2}\,k_r\,\frac{\ell_m}{2} = g\delta\,(\nabla - \nabla_e)\,\frac{\ell_m^2}{8H_P}. \tag{5.48}$$
Next we assume that half of this work is converted into kinetic energy of the element ($v_{\rm conv}^2/2$ per unit mass)
and that the other half is transferred to elements in the surroundings that were “pushed aside”. That way we
obtain the average velocity $v_{\rm conv}$ of elements that pass through the sphere:
$$v_{\rm conv}^2 = g\delta\,(\nabla - \nabla_e)\,\frac{\ell_m^2}{8H_P}. \tag{5.49}$$
When we substitute this result and (5.47) in the expression for the average convective flux we get
$$f_{\rm con} = \rho\,c_P\,T\,\sqrt{g\delta}\,\frac{\ell_m^2}{4\sqrt{2}}\,H_P^{-3/2}\,(\nabla - \nabla_e)^{3/2}. \tag{5.50}$$
We still need to determine the expression for ∇−∇e.
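Equations (5.47), (5.49) and (5.50) can be evaluated directly once a value of ∇−∇e is assumed. The following is a minimal sketch only: the function names and all numerical input values are hypothetical illustrations in cgs units and are not taken from a stellar model. It also checks that (5.50) agrees with the product ρ v cP DT from which it was derived.

```python
import math

def temperature_excess(T, grad_excess, l_m, H_P):
    """Eq. (5.47): DT = T * (grad - grad_e) * l_m / (2 * H_P)."""
    return T * grad_excess * l_m / (2.0 * H_P)

def convective_velocity(g, delta, grad_excess, l_m, H_P):
    """Eq. (5.49): v_conv^2 = g * delta * (grad - grad_e) * l_m^2 / (8 * H_P)."""
    return math.sqrt(g * delta * grad_excess * l_m**2 / (8.0 * H_P))

def convective_flux(rho, c_P, T, g, delta, grad_excess, l_m, H_P):
    """Eq. (5.50): convective flux in closed form."""
    return (rho * c_P * T * math.sqrt(g * delta) * l_m**2
            / (4.0 * math.sqrt(2.0)) * H_P**(-1.5) * grad_excess**1.5)

# Hypothetical illustrative values (cgs), NOT from a real stellar model.
rho, c_P, T = 0.1, 3.5e8, 1.0e6      # g cm^-3, erg g^-1 K^-1, K
g, delta = 5.0e4, 1.0                # cm s^-2, delta = 1 for an ideal gas
l_m, H_P = 8.0e9, 5.0e9              # mixing length and pressure scale height, cm
grad_excess = 1.0e-6                 # assumed value of (grad - grad_e)

DT = temperature_excess(T, grad_excess, l_m, H_P)
v_conv = convective_velocity(g, delta, grad_excess, l_m, H_P)
f_con = convective_flux(rho, c_P, T, g, delta, grad_excess, l_m, H_P)
f_check = rho * v_conv * c_P * DT    # f_con = rho * v * c_P * DT

print(f"DT = {DT:.3e} K, v_conv = {v_conv:.3e} cm/s")
print(f"f_con = {f_con:.3e} erg cm^-2 s^-1 (check: {f_check:.3e})")
```

The two flux values agree because (5.50) is simply the product of ρ cP with the velocity from (5.49) and the temperature excess from (5.47).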
We consider the variation of the temperature $T_e$ inside the element with a diameter d, a surface S and a volume
V when it moves with a velocity $v_{\rm conv}$. This temperature variation has two possible causes: adiabatic
compression or expansion on the one hand and exchange of heat with the environment through radiation
on the other hand. First we derive the total energy loss λ per unit of time of the sphere. We consider a fluid
element with an excess temperature DT > 0. Consequently the element radiates into its new environment.
Besides the radial energy flux, which transports energy from the stellar centre to the stellar surface, a local
non-radial flux $\vec{f}$ will occur that carries the excess energy of the element to its environment. According to
(5.12) and (5.13)
$$f = |\vec{f}| = \frac{4acT^3}{3\kappa\rho}\left|\frac{\partial T}{\partial n}\right|, \tag{5.51}$$
where ∂/∂n denotes differentiation perpendicular to the surface of the sphere. Assume that the
element is a sphere with a diameter d. We set
$$\frac{\partial T}{\partial n} \approx \frac{2DT}{d}. \tag{5.52}$$
The radiative energy loss λ per unit of time through the surface S of the sphere then is
$$\lambda = Sf = \frac{8acT^3}{3\kappa\rho}\,DT\,\frac{S}{d}. \tag{5.53}$$
The quantity λ is a kind of “luminosity” of the sphere that represents the change of its thermal energy. The
energy loss λ per unit of time results in a decrease of the temperature as heat is passed to the environment
through radiation. This decrease in temperature per unit of length is, in case of pressure equilibrium, given by $\lambda/(\rho V c_P v_{\rm conv})$.
The total temperature variation per unit of length caused by the two effects, being adiabatic compression
or expansion and exchange of heat with the environment through radiation, then is
$$\left(\frac{dT}{dr}\right)_e = \left(\frac{dT}{dr}\right)_{\rm ad} - \frac{\lambda}{\rho V c_P v_{\rm conv}}. \tag{5.54}$$
When we multiply this by HP/T we obtain
$$\nabla_e - \nabla_{\rm ad} = \frac{\lambda H_P}{\rho V c_P v_{\rm conv} T}, \tag{5.55}$$
where we can replace λ by (5.53), with the average value of DT given in (5.47). The resulting equation then has
a pre-factor $\ell_m S/(V d)$, which we choose equal to $6/\ell_m$ (the value for a sphere with a diameter $\ell_m$). Finally we
obtain the following result:
$$\Gamma \equiv \frac{\nabla_e - \nabla_{\rm ad}}{\nabla - \nabla_e} = \frac{8acT^3}{\ell_m v_{\rm conv}\,\kappa\rho^2 c_P}. \tag{5.56}$$
Except for a missing value for $\ell_m$, we have obtained five equations, namely (5.45), (5.46), (5.49), (5.50)
and (5.56), for the five unknowns $f_{\rm rad}$, $f_{\rm con}$, $v_{\rm conv}$, ∇e and ∇, where the local quantities P, T, ρ, l, m, $c_P$,
∇ad, ∇rad and g are known.
It can be shown that these five equations can be transformed into a single cubic equation whose
unknown is a complicated combination of all the unknowns. We will not consider the full solution space of the
problem here, but refer to the actual implementation in numerical codes. Rather, we have tried to show how hard
it is to take the convective transport into account accurately, and that the current theory is based on many
assumptions, some of which are more plausible than others. We will restrict our discussion to a few relevant
limiting cases here:
• Γ → ∞: it can be shown that this case implies that ∇e → ∇ad and ∇ → ∇ad. A negligible excess
of ∇ with respect to the adiabatic value is apparently sufficient to transport the total luminosity. This
limiting case occurs in the areas near the stellar core of massive stars where the density is very high
and in the layers of the photosphere of low-mass stars where the opacity is very large (cf. Figure 5.3).
In this case we do not have to solve the equation of the mixing length theory as ∇ ≈ ∇ad is a good
approximation (see Figure 5.2 for the Sun). In that way, we are not subject to the uncertainties and
limitations of this theory for this area.
• Γ → 0: this limiting case corresponds to ∇ → ∇rad. This means that the convective
transport is inefficient and cannot transport a substantial fraction of the luminosity. We find
in this case F → Frad, and again ∇ is known without the need of the mixing length theory. This limiting case
occurs in the photosphere of massive stars and in stellar cores of stars of low mass (see Figure 5.2).
The situation is more complicated for regimes in between these two limiting cases. In that case, the equations
of the mixing length theory actually need to be solved and will yield ∇ad < ∇ < ∇rad. In that situation,
the convection is said to be super-adiabatic.
5.4.3 The parametric implementation
The weak point of the mixing length theory (and other variants) is that there is no physical basis to determine
a value for ℓm. Therefore, the mixing length is always considered as a free parameter, usually expressed in
terms of the local pressure scale height: ℓm = αmltHP . To choose a plausible value, it is assumed that the
main part of the convective energy transport is done by the largest spheres, and that these can only travel
over a short distance, not much longer than their own diameter, before losing their identity. For the Sun, it
is found from helioseismology that αmlt ≈ 1.75. This value delivers stellar models that provide the best
agreement with high-precision solar observations of very different kinds (including the solar oscillations). It
is assumed that the energy transport in all other stars has characteristics similar to those of the Sun, and typically
αmlt ∈ [1.5, 2.5] is used when computing the convective transport in stellar models, but it can be as low as
0.5 for very metal-poor stars.
In addition to the energy transport, convection has another important effect on the life of a star. Con-
vection is responsible for the mixing of stellar material on a time scale that is much shorter than other
relevant time scales mentioned before. Hence, when determining the stellar structure and evolution, one can
assume that convection causes instantaneous mixing of all chemical species in the entire convective zone.
Convection makes a clear mark on the chemical history of the star. We come back to this in Chapter 7.
Finally, we remark that we have so far neglected the influence of stellar (internal) rotation, and hence
the Coriolis and centrifugal forces that are induced by it, on the stellar structure. The most important effect
of non-rigid internal rotation is the efficient mixing of stellar material it brings about, as is the case with
convection. We also treat this in Chapter 7.
Chapter 6
The chemical composition of stellar matter
6.1 The relative mass fractions
The chemical composition of stellar matter is extremely important because it determines the basic charac-
teristics such as radiation and energy production due to nuclear reactions. These reactions, in turn, change
the chemical composition. Nuclear reactions fix the lifetime of a star.
The chemical composition of the star at a time t is described by the functions Xi = Xi(m, t) with
m ∈ [0,M ]. It is useful to take m as the independent variable when describing the chemical composition.
Indeed, if we were to use a description in terms of r, then the functions Xi(r, t), and all other functions
that depend on the chemical composition, would change with each small expansion or contraction with
conservation of mass.
Often, one uses the particle number per volume ni for particles with mass mi: Xi = mini/ρ. Usually,
the number of different species i, and hence the number of Xi, can be kept low, because (i) most types of
particles are rare, (ii) they have little influence on the stellar structure, or (iii) they have a constant abundance
in time. For most purposes, it is sufficient to define only the mass fractions of hydrogen, helium, and “all
other” elements (also called “heavy elements” or “metals”) together. This is noted as
$$X \equiv X_{\rm H}, \quad Y \equiv X_{\rm He}, \quad Z \equiv 1 - X - Y. \tag{6.1}$$
For an “average” star in our Milky Way, X is in the interval [0.68, 0.73]. The mass fraction of heavy
elements, on the other hand, varies strongly from star to star, and lies between $Z = 10^{-6}$ and about
$Z = 0.04$. This has important consequences for our understanding of the chemical evolution in the Universe.
At the time of the Big Bang, hydrogen and helium were created, and almost no other elements (apart from
some lithium; see later on). This explains the relatively small ranges in the mass fractions X,Y . All
heavier elements were created by nucleosynthesis in stars. During the late evolutionary stages of stars, a
large fraction of their mass is lost to the interstellar medium, either through a strong stellar wind on the
asymptotic giant branch, or due to a supernova explosion (see Part III). Hence, the interstellar medium is
enriched with heavy elements, which are then incorporated in new stars that are formed in this medium.
Consequently, the broad range of Z values must be interpreted as a broad range of stellar ages. The stars
with a low Z are the first-generation stars which were formed before significant chemical enrichment of the
interstellar medium took place.
The nuclear reactions will obviously alter the initial composition X,Y,Z , and make this simple picture
more complex. For certain purposes, e.g. the determination of isotope ratios (see later on), the description
in terms of just 3 types Xi is not sufficient. We come back to the relative distribution of particles within the
Z group, specifically the distribution of C, N, and O isotopes which are important for the hydrogen burning.
6.2 Variations of chemical composition due to nuclear reactions
Assume that the mass fractions Xi can only change due to occurrence of nuclear reactions, which alter
atomic nuclei of type i within a fluid element. The frequency of occurrence of a certain reaction is called the
reaction rate rlm. It is equal to the number of reactions per unit mass and per unit time converting particles
of type l into particles of type m. In general, particles of type i can be influenced by several reactions, of
which some will destroy ($r_{ik}$) while some will create particles ($r_{ji}$). The reactions govern the change of $n_i$
over time. Because $X_i = m_i n_i/\rho$, we have:
$$\frac{\partial X_i}{\partial t} = m_i\left[\sum_j r_{ji} - \sum_k r_{ik}\right], \quad i = 1, \ldots, I \tag{6.2}$$
for all elements of type 1, . . . , I involved in the reactions. When more than one particle of type i is created
or destroyed per reaction, this can be taken into account by multiplying the corresponding term in the sum
with a factor that is equal to the number of particles i involved in the reaction.
The reaction p→ q that transforms a particle of type p into a particle q, is connected to a loss or gain of
energy epq. In the equation that expresses the conservation of energy, we have defined the energy production
ε per unit mass and per unit time; ε contains the contributions from several reactions and can be written in
terms of the reaction rates:
$$\varepsilon = \sum_{p,q} \varepsilon_{pq} = \sum_{p,q} r_{pq}\,e_{pq}. \tag{6.3}$$
We now define the energy that is generated when a unit mass of particles of type p is transformed into
particles of type q: qpq = epq/mp. For simple cases, it is useful to rewrite (6.2) in terms of ε because
this quantity already occurs in the equation of energy conservation. When all reactions deliver a positive
contribution to ε, we can write (6.2) as:
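$$\frac{\partial X_i}{\partial t} = \sum_j \frac{m_i}{m_j}\,\frac{\varepsilon_{ji}}{q_{ji}} - \sum_k \frac{\varepsilon_{ik}}{q_{ik}} \equiv E_i. \tag{6.4}$$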
When I different types of particles simultaneously participate in the nuclear network of reactions, equations
(6.2) or (6.4) form a system of I differential equations. Because one of the latter equations can be replaced
by the normalisation condition (2.24), I − 1 reaction equations are needed to complete the system of basic
equations that describe the stellar structure.
In simple cases, it is sufficient to add only one reaction equation. This occurs when hydrogen burning
is the only origin of nuclear energy production. Representing the energy production of all types of hydrogen
burning by εH , the only equation that needs to be considered is
$$\frac{\partial X}{\partial t} = -\frac{\varepsilon_H}{q_H}, \tag{6.5}$$
with ∂Y/∂t = −∂X/∂t, and qH the energy gain per unit mass when hydrogen is transformed into helium.
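As a worked illustration of (6.5), the sketch below estimates qH and the resulting rate of hydrogen depletion. It rests on assumptions that are not part of the derivation above: qH is approximated as 26.731 MeV released per four atomic mass units of hydrogen, the atomic mass unit is taken as 1.6605 × 10^-24 g, and the energy production rate of 2 erg g^-1 s^-1 is a hypothetical round value of the order of the solar luminosity per unit solar mass. All variable and function names are illustrative.

```python
MEV_TO_ERG = 1.6022e-6      # 1 MeV in erg (1 eV = 1.6022e-12 erg)
M_U_GRAM = 1.6605e-24       # atomic mass unit in g (assumed standard value)
Q_PER_HE4_MEV = 26.731      # energy released per 4He nucleus formed, in MeV

# Energy gain per gram of hydrogen converted into helium (approximation:
# four hydrogen nuclei are taken to weigh four atomic mass units).
q_H = Q_PER_HE4_MEV * MEV_TO_ERG / (4.0 * M_U_GRAM)   # erg g^-1

def dX_dt(eps_H):
    """Eq. (6.5): rate of change of the hydrogen mass fraction, in s^-1."""
    return -eps_H / q_H

def dY_dt(eps_H):
    """Helium gains what hydrogen loses: dY/dt = -dX/dt."""
    return -dX_dt(eps_H)

eps_H = 2.0                  # hypothetical energy production rate, erg g^-1 s^-1
X0 = 0.70                    # hypothetical initial hydrogen mass fraction
SECONDS_PER_YEAR = 3.156e7

# Time needed to exhaust the hydrogen at a constant rate eps_H.
tau_years = X0 / abs(dX_dt(eps_H)) / SECONDS_PER_YEAR

print(f"q_H = {q_H:.3e} erg/g")
print(f"dX/dt = {dX_dt(eps_H):.3e} per second")
print(f"depletion time at constant eps_H: {tau_years:.2e} yr")
```

Because the rate is held constant and all the hydrogen is assumed to be available for burning, the depletion time is only an order-of-magnitude estimate.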
In the following, we abbreviate the equations describing the time evolution of the mass fractions as
$$\frac{\partial X_i}{\partial t} = E_i, \tag{6.6}$$
i.e. the rate of change of Xi due to nuclear reactions is denoted symbolically as Ei. Previously, we defined
a general nuclear time scale τn = En/L. For each type of nuclear burning, a nuclear time scale τn,i can be
defined. This is the time scale upon which a particle of type i gets exhausted by nuclear burning.
6.3 Effective cross sections
The reaction between particles is mostly caused by the strong interaction (a.k.a. strong nuclear force), which
occurs between nucleons (protons and neutrons). The range of the strong interaction is determined by
the extent of the particle under consideration. The Coulomb potential of a particle determines whether
the nuclear attraction force or the Coulomb repulsion dominates. The transition between both occurs at
a distance $r_0$, almost equal to the radius of the particle, which is typically of the order of $10^{-13}$ cm (see
Figure 6.1). For a reaction to take place, the different particles have to be brought very close to each other
to overcome the Coulomb repulsion. In practice, this means that the particles must almost touch each other.
One can show that the Coulomb potential is mostly determined by the charge of the particles and that
its value is of the order of MeV. This immediately shows how hard it is to make a nuclear reaction take
place, since the typical kinetic energy of particles, given by 3kT/2, is only of the order of $10^3$ eV. The
typical kinetic energy is three orders of magnitude too small to overcome the Coulomb potential and to
initiate nuclear reactions. Hence, in terms of classical mechanics, we find that nuclear fusion cannot
occur. This explains why scientists in the first quarter of the 20th century were convinced that nuclear
fusion reactions could not be responsible for the luminosity of stars.
Why are nuclear fusion reactions possible, then? This is a result of quantum mechanical effects. From
quantum mechanics, we know that there is a non-zero probability that particles can overcome the Coulomb
potential and hence react through the process of tunnelling, where a particle tunnels through a barrier even
if its energy seems inadequate according to the laws of classical mechanics. Because of the extent of the
Coulomb potential (see Figure 6.1), however, the probability that quantum-mechanical tunnelling takes place
is small, and consequently the occurrence of nuclear reactions in the stellar interiors is a slow process.
Figure 6.1: Schematic representation of the Coulomb potential of a particle. For r < r0, the nuclear
attraction force dominates; for r > r0, Coulomb repulsion dominates. (From Kippenhahn et al. 2012)
The very low energies make it extremely difficult to determine the effective cross section of a
nuclear reaction, i.e., the probability that a reaction will occur, under conditions relevant for stellar interi-
ors. The effective cross section depends on the relative velocity of the particles. This velocity, in turn, is
determined by the temperature and the relative energy of the nuclei. Moreover, it depends on the presence
of other particles in the gas, that can partially shield the charges of the nuclei and hence can influence re-
action rates, depending on the thermodynamical state of the gas. In principle, the effective cross sections
can be determined experimentally. However, the experiments in the laboratory are performed under condi-
tions that are quite different from those in the stellar interior, which makes extrapolation of the results quite
challenging.
The nuclear physicist Gamow pioneered this field of research and derived expressions for the
effective cross sections in stellar interiors. We will not go into the details of these derivations but only
mention that the cross section depends strongly on the charges of the particles involved in the reaction,
because these charges determine the shape of the Coulomb potential. There is also a strong temperature
dependence, because the temperature determines the kinetic energy of the particles. Gamow found that the
effective cross section is proportional to
$$\exp\left(-\frac{\pi Z_i Z_j e^2}{\varepsilon_0 h v}\right)\exp\left(-\frac{m v^2}{2kT}\right),$$
in which $\varepsilon_0$ is the permittivity of vacuum and $v$ the relative velocity. The first
exponential factor increases with increasing velocity $v$, while the second decreases. Hence we obtain a maximal
probability for the nuclear reaction to take place at a certain velocity. This maximum is known as the Gamow peak and
occurs at a velocity $v = (\pi Z_i Z_j e^2 kT/\varepsilon_0 h m)^{1/3}$. To determine the reaction rate, we must integrate the
effective cross section over all possible velocities. It can be shown, however, that the latter is proportional to
the velocity corresponding to the Gamow peak (we do not go into details of the derivation here). We deduce
that reactions between particles with low charge occur faster, and that reactions can take place even at
relatively low temperatures. Moreover, we find that reactions of heavier elements need higher temperatures.
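The location of the Gamow peak can be evaluated numerically. The sketch below is an illustration under stated assumptions: it works in SI units, uses rounded standard values for the physical constants, interprets m as the reduced mass of the two reacting particles (an assumption, since m is not specified further above), and takes T = 1.5 × 10^7 K as a hypothetical temperature of the order of that in the solar core. The function name is hypothetical.

```python
import math

# Rounded physical constants in SI units (assumed standard values).
E_CHARGE = 1.602e-19       # elementary charge, C
EPS0 = 8.854e-12           # vacuum permittivity, F/m
H_PLANCK = 6.626e-34       # Planck constant, J s
K_B = 1.381e-23            # Boltzmann constant, J/K
M_PROTON = 1.6726e-27      # proton mass, kg

def gamow_peak_velocity(Z_i, Z_j, m, T):
    """Velocity that maximises exp(-pi Zi Zj e^2 / (eps0 h v)) * exp(-m v^2 / (2kT)).

    Setting the derivative of the exponent to zero gives
    v^3 = pi * Zi * Zj * e^2 * k * T / (eps0 * h * m).
    """
    v_cubed = math.pi * Z_i * Z_j * E_CHARGE**2 * K_B * T / (EPS0 * H_PLANCK * m)
    return v_cubed ** (1.0 / 3.0)

# Two protons; m is taken as their reduced mass (assumption).
T = 1.5e7                              # hypothetical temperature, K
m_reduced = M_PROTON / 2.0
v_peak = gamow_peak_velocity(1, 1, m_reduced, T)

# Kinetic energy at the peak compared with the thermal energy kT.
E_peak = 0.5 * m_reduced * v_peak**2
ratio = E_peak / (K_B * T)

print(f"v_peak = {v_peak:.3e} m/s")
print(f"E_peak = {E_peak / E_CHARGE / 1e3:.2f} keV, E_peak / kT = {ratio:.2f}")
```

The peak lies at an energy several times kT: the reacting particles are drawn from the fast tail of the velocity distribution, where the tunnelling factor is much less restrictive. Raising the charges Z_i, Z_j shifts the peak to higher velocities, in line with the statement that heavier elements need higher temperatures.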
Determining the effective cross sections is an active domain within nuclear astrophysics. Gradually,
the conditions of experimental research are converging towards the true conditions that govern in stellar
interiors. We refer to the course Nuclear Astrophysics at ULB, a course offered to the Master in Astronomy
& Astrophysics students at KU Leuven, for more details.
6.4 Nuclear burning cycles
The life of a star is governed by thermonuclear fusion, which, as we learned, is induced by thermal motion
and quantum mechanical effects. Several light particles fuse into heavier elements. In the discussion on
the energy production in stars by nuclear reactions, we limit ourselves to a summary of the most important
reactions. Instead of the thermonuclear fusion of a certain element, the term burning of the element is used.
The different types of burning occur at quite different temperatures.
When a star evolves on a time scale that is comparable to the reaction rates, we have to take into
account the network of nuclear reactions to obtain an accurate estimate of the energy production. The
total ε is the sum over all possible reactions, and the “bookkeeping” of all changing abundances must be
followed strictly. Very often, however, a much simpler procedure suffices to determine ε. We discuss the
most important burning mechanisms occurring in stars in the following sections, but first we define a few
basic concepts.
6.4.1 Basic concepts
In Figure 6.2, we show different forms of the simplest elements in nature, namely hydrogen and helium.
The top row displays the two possible ionization stages of hydrogen (H) and the bottom row of helium (He).
Each nucleus consists of a number of protons, indicated by the atomic number Z , and a number of neutrons
N . The atomic mass number A is the sum of both: A = Z +N . Not all combinations (N,Z) are allowed
in a nucleus. The stable (N,Z) combinations populate a narrow strip in the (N,Z) diagram, the so-called
stability valley. It expresses that too neutron-rich or too proton-rich nuclei are unstable; too neutron-rich
nuclei are subject to β− decay, while too proton-rich nuclei experience β+ decay. The β+ and β− decay are
both manifestations of weak interactions (a.k.a. the weak nuclear force). In β+ decay, a proton changes into
a neutron by emission of a neutrino and a positron (i.e., the positively charged antiparticle of the electron).
The β− decay, on the other hand, changes a neutron into a proton by emission of an antineutrino and an
electron.
A given number of protons Z can be combined with only a limited number of different neutron numbers
N. As an example, 6 protons can only form a stable nucleus with 6, 7, or 8 neutrons (see Figure 6.3). Nuclei
having the same number of protons but with a different number of neutrons, are called the isotopes of an
element. The isotope is indicated with $^{A}$X, in which X is the element, and A the mass number.
Figure 6.2: The ionization stages of H (top) and He (bottom). (From sciexplorer.blogspot.com)
Figure 6.3: The stable isotopes of carbon. (From sciencestruck.com)
Nucleons can, just like electrons, only occupy certain specific energy levels, and they display a shell
structure. A nucleus is very stable when a proton or neutron shell is fully occupied (in analogy to the noble
gases where the outer electron shell is fully occupied). This phenomenon occurs for the so-called magic
numbers of N or Z: 2, 8, 20, 28, 50, 82, 126. These numbers will be important later on, when we discuss
the s-process in Part III of the course. Moreover, the nuclei with an even number of protons are more stable
than nuclei with an odd number of protons. The same holds for neutrons. Pairs of protons and neutrons with
opposite spin are more stable than unpaired protons or neutrons.
Define α as the projectile, e.g., a proton, and X the target. Assume that both react to form the end
products β and Y . We then write the reactions as follows:
α+X → Y + β, (6.7)
or shorter, X(α, β)Y . Most nuclear reactions that occur in stars are exothermal. This means that they release
energy. For the reaction (6.7), we have an energy balance
$$m_\alpha c^2 + m_X c^2 = m_\beta c^2 + m_Y c^2 + Q, \tag{6.8}$$
in which Q is the energy produced per reaction which is added to the system. Q is of the order of MeV.
Before the fusion process, the involved particles j have a total mass $\sum_j M_j$, which is different from the
mass $M_y$ of the product that will be created by the reaction. The mass deficit is
$$\Delta M = \sum_j M_j - M_y \tag{6.9}$$
and corresponds to an energy given by $E = \Delta M\,c^2$. This energy feeds the energy balance in the star.
An example is hydrogen burning (see later), in which four $^{1}$H nuclei with a total mass of $4 \times 1.0081\,m_u$ are
transformed into a single $^{4}$He nucleus with a mass of $4.0039\,m_u$, where one $m_u$ (i.e., atomic mass unit) equals
1/12 of the mass of a $^{12}$C atom. For this value we refer to Appendix A. During the hydrogen burning, for
each $^{4}$He nucleus that has been created, a mass of $2.85 \times 10^{-2}\,m_u$ has “disappeared”, which corresponds to
0.7% of the initial mass. The corresponding energy is about 26.5 MeV (with 1 eV = $1.6022 \times 10^{-12}$ erg).
The current luminosity (energy loss) of the Sun corresponds to a mass loss of $L_\odot/c^2 = 4.25 \times 10^{12}$ g s$^{-1}$.
When we assume that in total $1\,M_\odot$ of hydrogen will be converted into helium, then 0.7% of that will be
converted into energy. With the currently observed luminosity of the Sun, it can “survive” for $3 \times 10^{18}$ s or
≈ $10^{11}$ years. In practice, only 10% of the total mass in the Sun can take part in nuclear fusion, and hence
the lifetime of the Sun is ≈ $10^{10}$ years. The current Sun has used up about half of that energy reservoir.
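The arithmetic of this estimate can be reproduced in a few lines. In the sketch below, the nuclear masses are the ones quoted above; the energy equivalent of the atomic mass unit (931.494 MeV), the solar luminosity, the solar mass, the speed of light and the length of a year are assumed standard values that do not appear in the text, and all variable names are illustrative.

```python
# Masses quoted in the text, in atomic mass units m_u.
M_H1 = 1.0081
M_HE4 = 4.0039

# Assumed standard values (not taken from the text).
MU_IN_MEV = 931.494          # energy equivalent of one atomic mass unit, MeV
L_SUN = 3.846e33             # solar luminosity, erg s^-1
M_SUN = 1.989e33             # solar mass, g
C_LIGHT = 2.998e10           # speed of light, cm s^-1
SECONDS_PER_YEAR = 3.156e7

# Mass deficit per 4He nucleus formed, Eq. (6.9).
delta_m = 4.0 * M_H1 - M_HE4
fraction = delta_m / (4.0 * M_H1)        # fraction of the initial mass converted
energy_mev = delta_m * MU_IN_MEV         # E = delta_m * c^2, expressed in MeV

# Mass lost by the Sun per second through radiation.
mass_loss_rate = L_SUN / C_LIGHT**2      # g s^-1

# Lifetime if 1 M_sun of hydrogen were burnt at the present luminosity,
# and if only 10% of the mass takes part in nuclear fusion.
t_full_yr = fraction * M_SUN / mass_loss_rate / SECONDS_PER_YEAR
t_core_yr = 0.1 * t_full_yr

print(f"mass deficit = {delta_m:.4f} m_u ({100 * fraction:.2f}% of the initial mass)")
print(f"energy per 4He nucleus = {energy_mev:.1f} MeV")
print(f"solar mass loss rate = {mass_loss_rate:.2e} g/s")
print(f"lifetime: {t_full_yr:.1e} yr (all mass), {t_core_yr:.1e} yr (10% of the mass)")
```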
The mass deficit is connected to the fact that the involved nuclei have a different binding energy EB .
This binding energy is the energy needed to break up the nucleus into its nucleons (protons and neutrons).
Put differently, EB is the energy that is gained when a certain number of free protons and neutrons are
brought together starting from infinity, to create a particle. Consider a nucleus with mass $M_k$ and a mass
number A, that consists of Z protons with mass $m_p$, and A−Z neutrons with mass $m_n$. The binding energy
$E_B$ is then given by:
$$E_B = \left[(A - Z)\,m_n + Z\,m_p - M_k\right] c^2. \tag{6.10}$$
Figure 6.4: The binding energy per nucleon versus the mass number A. The curve is smoothed to remove
the variations induced by the internal shell structure of the nuclei. (From researchgate.net)
When comparing different nuclei with each other, it is better to use the average binding energy
per nucleon: $f = E_B/A$, which is called the binding fraction. Aside from hydrogen, helium, and lithium
nuclei, all elements appear to have a binding fraction of about 8 MeV. This shows that the nuclear attraction
force only reaches nucleons in the immediate surroundings. A representation of f as a function of A is shown
in Figure 6.4. We notice that f increases with increasing A, starting from hydrogen, until a maximum of
8.5 MeV is reached at A = 56 ($^{56}$Fe). After this point, f decreases again. So $^{56}$Fe is the most strongly
bound, or most stable, nucleus. Figure 6.4 shows that the nuclear fusion that transforms light elements into
more stable heavier elements delivers energy. However, each fusion reaction that transforms $^{56}$Fe into
a heavier element costs energy (one can only gain energy from fission of heavy
nuclei). In this sense, the creation of $^{56}$Fe is a natural end point for nuclear fusion in stars.
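Equation (6.10) and the binding fraction f = EB/A can be illustrated for two nuclei. The sketch below is a minimal example in which all masses are assumed standard values that are not given in the text: proton and neutron masses, and nuclear masses obtained from tabulated atomic masses by subtracting the electron masses (neglecting the electron binding energy). The function name is hypothetical. Since the curve in Figure 6.4 is smoothed, individual values computed this way need not coincide exactly with the numbers read from the figure.

```python
MU_IN_MEV = 931.494       # energy equivalent of one atomic mass unit, MeV (assumed)
M_PROTON = 1.007276       # proton mass in m_u (assumed standard value)
M_NEUTRON = 1.008665      # neutron mass in m_u (assumed standard value)

def binding_fraction(A, Z, nuclear_mass):
    """Binding energy per nucleon f = E_B / A in MeV, with E_B from Eq. (6.10).

    nuclear_mass is the mass M_k of the bare nucleus in atomic mass units.
    """
    mass_defect = (A - Z) * M_NEUTRON + Z * M_PROTON - nuclear_mass
    return mass_defect * MU_IN_MEV / A

# Assumed nuclear masses in m_u (atomic mass minus Z electron masses).
f_he4 = binding_fraction(A=4, Z=2, nuclear_mass=4.001506)
f_fe56 = binding_fraction(A=56, Z=26, nuclear_mass=55.920673)

print(f"f(4He)  = {f_he4:.2f} MeV per nucleon")
print(f"f(56Fe) = {f_fe56:.2f} MeV per nucleon")
```

The ordering of the two values reflects the trend described above: the nucleons in $^{56}$Fe are more strongly bound than those in $^{4}$He.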
In the following, the quantities ε and ρ will be expressed in units of erg g$^{-1}$ s$^{-1}$ and g cm$^{-3}$,
respectively. The temperature T will be expressed in the dimensionless form $T_n = T/10^n$ K.
6.4.2 Big Bang nucleosynthesis
Before we discuss the most important processes of nucleosynthesis in stars, it is interesting to consider the
production of elements at the very beginning of the Universe. The bulk of the current amount of helium in
the Universe originated from nucleosynthesis during the first few minutes after the Big Bang. This is often
referred to as Big Bang Nucleosynthesis, although also a few other light elements besides hydrogen and
helium were created.
Consider the very early Universe, at the moment when it has cooled to about $10^{10}$ K. The only nuclei
that existed at these temperatures were protons and neutrons. In normal circumstances, a neutron β− decays
into a proton after 15 minutes. However, at these high temperatures and densities, protons and neutrons are
constantly transformed into each other:
$$\nu_e + n \rightleftharpoons e^- + p \quad \text{and} \quad \bar{\nu}_e + p \rightleftharpoons e^+ + n. \tag{6.11}$$
Because neutrons are heavier than protons, it requires more energy to create a neutron. The ratio between
the number of neutrons and the number of protons comes from the Boltzmann factor
$$\frac{N_n}{N_p} = \exp\left[-\Delta m\,c^2/kT\right], \tag{6.12}$$
in which ∆m indicates the mass difference between a neutron and a proton. This mass difference is
equivalent to 1.3 MeV/$c^2$. The Boltzmann factor in equation (6.12) implies that the ratio of neutrons to protons
quickly decreased when the temperature decreased because of the expansion of the Universe. This decrease
implied that the reactions in (6.11) became less frequent, until they became impossible when the
temperature dropped below $10^{10}$ K. At that moment, the ratio of neutrons to protons was about 1/5. Fifteen
minutes later, it was 1/7 due to β− decay, and the Universe had cooled enough to allow particle-particle
interactions.
At a temperature of $10^9$ K, primordial deuterium ($^{2}$H) formed, as well as $^{3}$He. These nuclei were
subsequently converted into alpha particles. Because $^{4}$He is by far the most stable of these different nuclei,
almost all neutrons that existed in the Universe at the time were captured in alpha particles. Moreover,
the absence of stable nuclei with mass numbers between 5 and 8 prohibited the creation of heavier nuclei,
except $^{7}$Li.
Hence, the Big Bang nucleosynthesis formed a primordial soup of deuterium, $^{3}$He, $^{4}$He, and $^{7}$Li,
containing all neutrons, together with a large excess of free protons. We can estimate the abundance of
the primordial helium, because it is directly related to the neutron/proton ratio of 1/7. For each pair of
neutrons, there are 14 protons. These nucleons can form one $^{4}$He nucleus, with 12 protons left. In other
words, $16\,m_u$ of nucleons (protons and neutrons) produce one helium nucleus with a mass of $4\,m_u$. The
fraction of the mass converted into helium therefore is 4/16 or 25%. Big Bang nucleosynthesis has resulted in
a Universe where 25% of the mass is captured in helium, and the remaining 75% in hydrogen. This was the
material from which the first stars were formed.
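The two steps of this estimate, the Boltzmann factor (6.12) and the counting of nucleons, can be written out as a short sketch. The Boltzmann constant in MeV per kelvin is an assumed standard value, the function names are hypothetical, and the helium estimate assumes, as in the text, that all neutrons end up in $^{4}$He and that protons and neutrons each weigh one atomic mass unit.

```python
import math

K_B_MEV = 8.617e-11        # Boltzmann constant in MeV/K (assumed standard value)
DELTA_M_C2 = 1.3           # neutron-proton mass difference in MeV, from the text

def neutron_proton_ratio(T):
    """Eq. (6.12): equilibrium ratio N_n / N_p at temperature T (in K)."""
    return math.exp(-DELTA_M_C2 / (K_B_MEV * T))

def helium_mass_fraction(n_over_p):
    """Mass fraction in 4He if every neutron is locked into a 4He nucleus.

    Each 4He nucleus takes 2 neutrons and 2 protons, so the helium mass is
    twice the neutron mass: Y = 2 * N_n / (N_n + N_p).
    """
    return 2.0 * n_over_p / (1.0 + n_over_p)

ratio_freeze = neutron_proton_ratio(1.0e10)    # ratio when the reactions stop
Y_primordial = helium_mass_fraction(1.0 / 7.0) # ratio after neutron decay

print(f"N_n/N_p at 1e10 K = {ratio_freeze:.2f}")
print(f"primordial helium mass fraction = {Y_primordial:.2f}")
```

With a neutron/proton ratio of 1/7 the function returns 2/8, the same 4/16 obtained above by counting 2 neutrons and 14 protons.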
6.4.3 Hydrogen burning
The result of hydrogen burning is the fusion of four $^{1}$H nuclei into a single $^{4}$He nucleus. The difference
in binding energy sums up to 26.731 MeV, which corresponds to a so-called mass deficit of some 0.7%.
The energy that is released in this way is roughly a factor 10 higher than for any other fusion process that
can occur in a star. Different reaction chains exist, which in general occur simultaneously in a star. For the
hydrogen burning, two chains are important: the proton-proton chain (pp chain) and the carbon-nitrogen-
oxygen cycle or CNO cycle. Below, we have a closer look at both.
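The two numbers quoted for hydrogen burning (26.731 MeV and a mass deficit of about 0.7%) can be checked from atomic masses. The sketch below uses rounded literature values for the masses of 1H and 4He and for the energy equivalent of the atomic mass unit; these constants are assumptions added here and do not come from the text.

```python
# Atomic masses in unified atomic mass units (rounded literature values; assumption).
M_H1 = 1.00782503
M_HE4 = 4.00260325
U_TO_MEV = 931.494  # energy equivalent of 1 u in MeV

delta_m = 4 * M_H1 - M_HE4        # mass deficit in u
q_mev = delta_m * U_TO_MEV        # released energy in MeV
fraction = delta_m / (4 * M_H1)   # relative mass deficit

print(f"Q = {q_mev:.3f} MeV, mass deficit = {100 * fraction:.2f}%")
```

Using atomic rather than nuclear masses means the annihilation energy of the two positrons is included, which is consistent with the total of 26.731 MeV quoted above.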
The proton-proton chain
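The pp chain is named after the first reaction of the chain, in which two protons are converted into a deuterium
nucleus 2H (often also denoted as 2D), which in turn reacts with another proton to form 3He:
$$ {}^{1}\mathrm{H} + {}^{1}\mathrm{H} \rightarrow {}^{2}\mathrm{H} + e^{+} + \nu_e, \qquad {}^{2}\mathrm{H} + {}^{1}\mathrm{H} \rightarrow {}^{3}\mathrm{He} + \gamma. \tag{6.13} $$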
Herein, e+ represents a positron, γ a photon, and νe a neutrino. The first of these reactions (the pp
reaction) is unusual in comparison to the other fusion processes, because one of the protons has to undergo a β+
decay at closest approach, so that a proton is converted into a neutron. The β+ decay is a process that
is caused by the weak nuclear force, and therefore is an interaction having a low probability to occur (i.e., its
effective cross section is small). The average time for this reaction to take place is of order $10^{9}$ years, and
the close encounter of the protons needs to happen when they have sufficiently high kinetic energy for the β+
decay to take place. This translates into the requirement of a temperature above ∼ $10^{7}$ K in order for the protons
to overcome the electrostatic repulsion. It is impossible to reproduce this reaction in a laboratory. The second
reaction is deuterium burning. It occurs very fast, in a matter of seconds, as it is a consequence
of the strong nuclear force. Deuterium burning can already set in at a temperature of about $10^{6}$ K. Hence,
as explained further in Chapter 9, protostars still in their formation process burn their primordial deuterium
before the pp reaction can occur. Their initial nucleosynthesis thus delivers 3He isotopes and produces
radiation before the full hydrogen burning, which transforms four 1H nuclei into 4He and requires a temperature
above $5 \times 10^{6}$ K, can proceed in equilibrium.
After 3He is formed via the pp reaction, the pp chain can be completed to form a 4He nucleus, also
known as an α particle, in the time span of a few hundred years. This can be done via three branches: pp1, pp2,
and pp3. All of these start from 3He and have 4He as their end product. Starting from 4 protons, the
full pp chains are:
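pp1:
$$ \begin{aligned} {}^{1}\mathrm{H} + {}^{1}\mathrm{H} &\rightarrow {}^{2}\mathrm{H} + e^{+} + \nu_e \\ {}^{2}\mathrm{H} + {}^{1}\mathrm{H} &\rightarrow {}^{3}\mathrm{He} + \gamma \\ {}^{3}\mathrm{He} + {}^{3}\mathrm{He} &\rightarrow {}^{4}\mathrm{He} + {}^{1}\mathrm{H} + {}^{1}\mathrm{H}, \end{aligned} \tag{6.14} $$
pp2:
$$ \begin{aligned} {}^{3}\mathrm{He} + {}^{4}\mathrm{He} &\rightarrow {}^{7}\mathrm{Be} + \gamma \\ {}^{7}\mathrm{Be} + e^{-} &\rightarrow {}^{7}\mathrm{Li} + \nu_e \,(+\gamma) \\ {}^{7}\mathrm{Li} + {}^{1}\mathrm{H} &\rightarrow {}^{4}\mathrm{He} + {}^{4}\mathrm{He}, \end{aligned} \tag{6.15} $$
pp3:
$$ \begin{aligned} {}^{7}\mathrm{Be} + {}^{1}\mathrm{H} &\rightarrow {}^{8}\mathrm{B} + \gamma \\ {}^{8}\mathrm{B} &\rightarrow {}^{8}\mathrm{Be} + e^{+} + \nu_e \\ {}^{8}\mathrm{Be} &\rightarrow {}^{4}\mathrm{He} + {}^{4}\mathrm{He}. \end{aligned} \tag{6.16} $$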
Herein, e− is an electron. The numbers 1, 2, and 3 indicate the importance of the sub-chain with increasing
temperature. The branch pp1 requires T6 ≥ 5, pp2 requires T7 ≥ 1.5, and pp3 needs T7 ≥ 2.4. In the Sun,
83.6% of the luminosity is delivered by the sub-chain pp1, 16.4% via pp2, and 0.015% via pp3. The
reactions in the pp chain occur at very different rates. The pp reaction itself is by far the slowest
reaction (about a factor $10^{18}$ slower than the other reactions). To reach completion of pp1, the first two
reactions described in (6.13) must have occurred twice. The reaction 2H(p, γ)3He in the pp1 chain
is so fast that the abundance of deuterium remains very low. The last reaction of the pp1 chain is again
slower than the second, but still much faster than the pp reaction itself. When the temperature increases, the
abundance of 3He decreases, so the first reaction of pp2 becomes more important (starting from T7 ≈ 1−2).
The pp2 chain continues with electron capture by 7Be, which is almost independent of temperature, unlike
the alternative reaction, proton capture by 7Be, in pp3. 7Be(p, γ)8B will start to dominate 7Be(e−, ν)7Li at
T6 ≈ 24. The 8B nucleus, produced by proton capture, is unstable with respect to positron decay with a
half-life of 0.8 s. The neutrino released in the latter process, as well as the neutrino from the electron
capture by 7Be, have been detected in solar neutrino experiments. The final reaction in the pp3 chain is the
decay of 8Be into two α particles. This reaction is important, not only because it finalises the pp3 chain, but
also because the inverse reaction determines He burning at higher temperatures (see later on).
Because of the different amounts of energy lost via neutrinos in each of the three sub-chains, the energy
gain per produced α particle is different, and adds up to Q = 26.2, 25.7, 19.2 MeV for pp1, pp2, and pp3,
respectively. The “effective” Qeff can be determined as an average of the energy produced by the three pp
chains. The released nuclear energy of the pp chains is:
$$ \varepsilon_{pp} = \frac{r_{pp}\,Q_{\rm eff}}{\rho} \approx \psi\, f_{11}\, g_{11}\, \frac{2.57\times10^{4}\,\rho X^{2}}{T_9^{2/3}} \exp\left(-\frac{3.381}{T_9^{1/3}}\right) \quad (\text{in erg g}^{-1}\,\text{s}^{-1}). \tag{6.17} $$
In this expression, f11 is a shielding factor for the reaction considered, g11 is a factor based on a
fourth-order polynomial fit in T9 to experimental laboratory results and attains a value near 1, and ψ is a factor
between 1 and 2 depending on the relative contributions of pp1, pp2, and pp3. Overall, the formula in Eq. (6.17) is
a parametric fit to measured and tabulated values of nuclear reaction rates. These fits are made by research
groups active in nuclear astrophysics based upon their experimental data and differ somewhat from group
to group. This topic is an active field of research at the Université Libre de Bruxelles (ULB, see Angulo et
al., 1999, Nuclear Physics A, Vol. 656, p. 3 – 183, for a reference often used in stellar evolution codes). The
temperature dependence of the reaction rate of the pp chain decreases from ∼ $T^{6}$ at temperatures T6 = 5,
down to ∼ $T^{3.5}$ for T6 ≈ 20.
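The quoted temperature exponents follow directly from the logarithmic derivative of Eq. (6.17). The sketch below evaluates that derivative numerically; it assumes ψ = f11 = g11 = 1 and arbitrary default values for ρ and X (neither choice affects the exponent), and the function names are introduced here for illustration.

```python
import math

def eps_pp(T, rho=1.0, X=0.7, psi=1.0, f11=1.0, g11=1.0):
    """Eq. (6.17): pp-chain energy generation rate in erg g^-1 s^-1 (T in K)."""
    T9 = T / 1e9
    return (psi * f11 * g11 * 2.57e4 * rho * X**2 / T9 ** (2 / 3)
            * math.exp(-3.381 / T9 ** (1 / 3)))

def temperature_exponent(eps, T, h=1e-4):
    """Numerical d ln(eps) / d ln(T) by a central difference in ln T."""
    up, down = T * math.exp(h), T * math.exp(-h)
    return (math.log(eps(up)) - math.log(eps(down))) / (2 * h)

for T6 in (5, 20):
    nu = temperature_exponent(eps_pp, T6 * 1e6)
    print(f"T6 = {T6:2d}: eps_pp ~ T^{nu:.2f}")
```

Under these assumptions the exponent comes out close to 6 at T6 = 5 and close to 3.5 at T6 = 20, in line with the values given in the text. A temperature-dependent g11 or ψ would shift these numbers slightly.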
The carbon-nitrogen-oxygen cycle
The CNO cycle describes a second chain of reactions that leads to hydrogen burning. For this cycle to work,
the presence of certain isotopes of carbon, nitrogen and oxygen is required. The reactions that occur at
temperatures typical for stellar interiors, are:
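$$ \begin{aligned} {}^{12}\mathrm{C} + {}^{1}\mathrm{H} &\rightarrow {}^{13}\mathrm{N} + \gamma \\ {}^{13}\mathrm{N} &\rightarrow {}^{13}\mathrm{C} + e^{+} + \nu_e \\ {}^{13}\mathrm{C} + {}^{1}\mathrm{H} &\rightarrow {}^{14}\mathrm{N} + \gamma \\ {}^{14}\mathrm{N} + {}^{1}\mathrm{H} &\rightarrow {}^{15}\mathrm{O} + \gamma \\ {}^{15}\mathrm{O} &\rightarrow {}^{15}\mathrm{N} + e^{+} + \nu_e \\ {}^{15}\mathrm{N} + {}^{1}\mathrm{H} &\rightarrow {}^{12}\mathrm{C} + {}^{4}\mathrm{He} \end{aligned} \tag{6.18} $$
or
$$ \begin{aligned} {}^{15}\mathrm{N} + {}^{1}\mathrm{H} &\rightarrow {}^{16}\mathrm{O} + \gamma \\ {}^{16}\mathrm{O} + {}^{1}\mathrm{H} &\rightarrow {}^{17}\mathrm{F} + \gamma \\ {}^{17}\mathrm{F} &\rightarrow {}^{17}\mathrm{O} + e^{+} + \nu_e \\ {}^{17}\mathrm{O} + {}^{1}\mathrm{H} &\rightarrow {}^{14}\mathrm{N} + {}^{4}\mathrm{He} \end{aligned} \tag{6.19} $$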
The general structure of the CNO cycle consists of a series of proton captures by isotopes of C, N, or O,
interspersed with β+ decays, which all have half-lives of 100 – 1000 s. The cycle always ends
with a proton capture that induces the formation of an α particle.
The first set of reactions, indicated in (6.18), constitutes the CN cycle because only isotopes of C and N
occur as catalysts. The full CNO cycle occurs when 16O is already abundantly present, or when the reaction
15N(p, γ)16O has occurred often enough to have generated the necessary amount of this oxygen isotope. The
occurrence of the full CNO cycle is a factor 1000 less likely than that of the CN cycle. The end product of
the full CNO cycle is not only an α particle, but also a 14N isotope that can again feed the CN cycle.
A detailed description of the CNO cycle burning is extremely complicated because many isotopes are
involved in a cyclic way. The energy production, as well as the detailed abundances of all isotopes, depend
on the initial concentrations of the catalysts, on reaction rates, on the temperature, and on the age of the star.
Here, we will not describe all reactions in the cycle in detail. Rather, we discuss the consequences of the
most important chain in the CNO cycle.
The key reaction of the CNO cycle is 14N(p, γ)15O. This reaction is relatively slow, and is based on the
14N isotope, which is present in both cycles. As in the pp chain, it is the slowest reaction that determines
how the cycle evolves. When the temperature is high enough to activate hydrogen
burning via the CNO cycles for a substantial time in a stable way, almost all available C, N, and O will be
converted into the 14N isotope, which then becomes the most abundant byproduct of the hydrogen burning.
The energy production rate εCNO is also determined by the slowest reaction, 14N(p, γ)15O. A good
estimate for the average energy Qeff released per produced α particle, determined for all reactions that
occur in the CNO cycle, is 24.97 MeV. The released nuclear energy can then be determined as follows:
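$$ \varepsilon_{\rm CNO} \approx g_{14,1}\, \frac{8.24\times10^{25}\,\rho X Z}{T_9^{2/3}} \exp\left(-\frac{15.231}{T_9^{1/3}} - \left(\frac{T_9}{0.8}\right)^{2}\right) \quad (\text{in erg g}^{-1}\,\text{s}^{-1}), \tag{6.20} $$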
where g14,1 again stands for a polynomial fit, this time of third order in T9, following Angulo et al. (1999).
Figure 6.5: The fraction of the total energy production by the CNO cycle throughout the stellar interior for
stars on the zero-age-main-sequence (ZAMS, see later on for a formal definition) with a mass between 1
and 3M⊙. (From Kippenhahn et al. 2012)
The temperature dependence of the reaction rate of the CNO cycle is much stronger than that of the pp chain;
approximately ∼ $T^{18}$ for T6 ≈ 20.
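To make the comparison between the two hydrogen-burning channels concrete, the sketch below evaluates Eqs. (6.17) and (6.20) side by side. It assumes ψ = f11 = g11 = g14,1 = 1, and it uses illustrative values for ρ, X, and Z that are not taken from this section; the crossover temperature it finds is therefore only indicative.

```python
import math

def eps_pp(T, rho, X):
    """Eq. (6.17) with psi = f11 = g11 = 1 (assumption); T in K."""
    T9 = T / 1e9
    return 2.57e4 * rho * X**2 / T9 ** (2 / 3) * math.exp(-3.381 / T9 ** (1 / 3))

def eps_cno(T, rho, X, Z):
    """Eq. (6.20) with g_{14,1} = 1 (assumption); T in K."""
    T9 = T / 1e9
    return (8.24e25 * rho * X * Z / T9 ** (2 / 3)
            * math.exp(-15.231 / T9 ** (1 / 3) - (T9 / 0.8) ** 2))

def log_slope(func, T, h=1e-4):
    """Numerical d ln(func) / d ln(T) by a central difference in ln T."""
    return (math.log(func(T * math.exp(h))) - math.log(func(T * math.exp(-h)))) / (2 * h)

def crossover_temperature(rho, X, Z, lo=5e6, hi=5e7):
    """Bisection (in ln T) for the temperature where eps_CNO equals eps_pp."""
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if eps_cno(mid, rho, X, Z) < eps_pp(mid, rho, X):
            lo = mid   # pp chain still dominates: the crossover is hotter
        else:
            hi = mid
    return math.sqrt(lo * hi)

rho, X, Z = 100.0, 0.71, 0.014   # illustrative values only
nu_cno = log_slope(lambda T: eps_cno(T, rho, X, Z), 2e7)
T_cross = crossover_temperature(rho, X, Z)
print(f"CNO exponent at T6 = 20: {nu_cno:.1f}")
print(f"eps_CNO = eps_pp near T6 = {T_cross / 1e6:.1f}")
```

With these inputs the CNO exponent at T6 = 20 is close to 18, and the two rates become equal at a temperature between T6 = 17 and T6 = 18. The density cancels in the ratio of the two rates, so only X and Z affect the crossover.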
In Figure 6.5, we show the contribution of the CNO cycle to the total energy produced by hydrogen
burning for stars with a mass between 1 and 3M⊙ as a function of position in the star (represented as l/L).
It is clear that the CNO cycle is the dominant energy source for stars more massive than 2M⊙.
6.4.4 Helium burning
The nuclear reactions burning helium consist of a gradual fusion of several α particles, resulting in isotopes
12C, 16O, .... These reactions occur at a temperature much higher than the temperature needed for hydrogen
burning. The typical condition is T8 > 1.
The first and foremost reaction is the one in which 12C is formed from three α particles: the triple
alpha reaction. This reaction occurs in two steps, because a simultaneous close encounter of three particles
is highly unlikely:
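$$ \begin{aligned} {}^{4}\mathrm{He} + {}^{4}\mathrm{He} &\rightleftharpoons {}^{8}\mathrm{Be} + \gamma - 93\,\mathrm{keV} \\ {}^{8}\mathrm{Be} + {}^{4}\mathrm{He} &\rightarrow {}^{12}\mathrm{C} + \gamma + 7.4\,\mathrm{MeV}. \end{aligned} \tag{6.21} $$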
In the first step, 8Be is formed temporarily from two α particles in an endothermic reaction, i.e., it costs
energy, and the ground state of this 8Be isotope has an energy that is about 90 keV higher than that of the
two α particles. Due to this, the isotope decays on a short time scale of $10^{-16}$ s, falling apart into two α
particles again. This seems like a very short decay time, but the high density in the stellar interior brings
about a sufficient probability of another α capture, used to form 12C. The energy production per unit mass of
the reactions given in (6.21) is a factor 20 lower than that of the CNO cycle and can be approximated by
$$ \varepsilon_{3\alpha} \approx 5.1\times10^{11}\, f_{3\alpha}\, \rho^{2}\, Y^{3}\, T_8^{-3} \exp\left(-44.027/T_8\right) \quad (\text{in erg g}^{-1}\,\text{s}^{-1}), \tag{6.22} $$
where f3α is the screening factor of the triple-α reaction. The triple-α reaction is very temperature
dependent, as can be seen from the exponential factor involving the temperature (recall that T8 ≈ 1).
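The steepness of the triple-α rate can be quantified from Eq. (6.22): the logarithmic derivative with respect to temperature is 44.027/T8 − 3. The sketch below evaluates the rate and this exponent; the screening factor is set to 1 and the density and helium mass fraction are assumed, illustrative values that do not come from the text.

```python
import math

def eps_3alpha(T, rho, Y, f3a=1.0):
    """Eq. (6.22): triple-alpha energy generation rate in erg g^-1 s^-1 (T in K)."""
    T8 = T / 1e8
    return 5.1e11 * f3a * rho**2 * Y**3 * T8**-3 * math.exp(-44.027 / T8)

def exponent_3alpha(T):
    """Analytic d ln(eps_3alpha) / d ln(T) = 44.027 / T8 - 3."""
    return 44.027 / (T / 1e8) - 3.0

# Illustrative helium-core conditions (assumed values, not from the text).
rho, Y = 1.0e4, 0.98
for T8 in (1.0, 2.0):
    T = T8 * 1e8
    print(f"T8 = {T8}: eps ~ T^{exponent_3alpha(T):.1f}, "
          f"eps_3alpha = {eps_3alpha(T, rho, Y):.3e} erg/g/s")
```

At T8 = 1 the exponent is about 41, far steeper than for either hydrogen-burning channel, and it drops to about 19 at T8 = 2.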
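Once enough 12C is formed by the triple α reaction, additional α capture can occur and nuclei of 16O,
20Ne, etc. can be produced, releasing energy:
$$ \begin{aligned} {}^{12}\mathrm{C} + {}^{4}\mathrm{He} &\rightarrow {}^{16}\mathrm{O} + \gamma + 7.16\,\mathrm{MeV} \\ {}^{16}\mathrm{O} + {}^{4}\mathrm{He} &\rightarrow {}^{20}\mathrm{Ne} + \gamma + 4.73\,\mathrm{MeV} \\ &\;\;\vdots \end{aligned} \tag{6.23} $$
There is considerable uncertainty in the first of these two reactions and hence also in the share of initial 4He
that takes part in each of (6.21) and (6.23).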
In summary, during helium burning, the reactions given in (6.21) and (6.23) occur simultaneously,
and the total energy production εHe consists of essentially three contributions: εHe = ε3α + ε12,α + ε16,α.
Initially, shortly after the onset of the helium burning, the triple-α reaction dominates as 4He is much more
abundant than 12C and 16O. With increasing temperature and more carbon available, the (rather poorly
known) reaction 12C(α, γ)16O comes relatively more into play and even becomes dominant due to the
third power dependence on the relative mass fraction of 4He in Eq. (6.22). The final abundances of 12C
and 16O at the end of the helium-burning phase are critical for the further stellar life, yet they are quite
uncertain. Improving knowledge of the 12C(α, γ)16O reaction is therefore an active field of research in
nuclear astrophysics.
6.4.5 Fusion of heavier elements
Carbon burning
After helium burning, the central core consists mostly of a mixture of 12C and 16O. If the temperature is
high enough at that time, say T8 ≈ 5− 10, the process of carbon burning starts. For this type of burning, as
for all types to follow, the situation is so complex that computations are based on rough approximations. A
first difficulty is that the first reaction in the carbon burning, 12C+12C, results in a 24Mg isotope, which can
then decay in many different ways:
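$$ {}^{12}\mathrm{C} + {}^{12}\mathrm{C} \rightarrow \left\{ \begin{array}{lr} {}^{24}\mathrm{Mg} + \gamma & 13.93 \\ {}^{23}\mathrm{Mg} + n & -2.61 \\ {}^{23}\mathrm{Na} + p & 2.24 \\ {}^{20}\mathrm{Ne} + \alpha & 4.62 \\ {}^{16}\mathrm{O} + 2\alpha & -0.11 \end{array} \right. \tag{6.24} $$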
in which we have indicated Q in MeV in the last column. Note that the second and the last reactions are
endothermic. The relative frequency of the different decay channels depends on the temperature and differs
strongly from one channel to another. The most probable channels are those that result in 23Na+p and 20Ne+α.
These two occur about equally frequently at temperatures that are not too high (T9 < 3).
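The Q values listed in (6.24) can be cross-checked from atomic masses via Q = (mass of reactants − mass of products) c². The sketch below does this for the five channels; the masses are rounded literature values added here as an assumption, so agreement with the tabulated numbers is only expected at the level of a few hundredths of an MeV.

```python
U_TO_MEV = 931.494  # energy equivalent of 1 u in MeV

# Atomic masses in u, rounded literature values (assumption; not from the text).
MASS = {
    "n": 1.00866492, "1H": 1.00782503, "4He": 4.00260325,
    "12C": 12.0, "16O": 15.99491462, "20Ne": 19.99244018,
    "23Na": 22.98976928, "23Mg": 22.99412394, "24Mg": 23.98504170,
}

def q_value(reactants, products):
    """Q = (mass of reactants - mass of products) * c^2, in MeV."""
    return (sum(MASS[x] for x in reactants) - sum(MASS[x] for x in products)) * U_TO_MEV

channels = {
    "24Mg + gamma": ["24Mg"],
    "23Mg + n": ["23Mg", "n"],
    "23Na + p": ["23Na", "1H"],   # atomic 1H mass keeps the electron count balanced
    "20Ne + alpha": ["20Ne", "4He"],
    "16O + 2 alpha": ["16O", "4He", "4He"],
}
for label, products in channels.items():
    q = q_value(["12C", "12C"], products)
    print(f"12C + 12C -> {label:14s} Q = {q:6.2f} MeV")
```

The signs come out as stated in the text: the 23Mg + n and 16O + 2α channels have negative Q and therefore cost energy.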
The next difficulty is that the produced protons and α particles find themselves at temperatures far above
those at which ordinary hydrogen and helium burning operate, so that they react immediately with other nuclei
in the mixture. As a result, very complicated reaction chains occur. As an
example, we mention 12C(p, γ)13N(e+ν)13C(α, n)16O, which creates, amongst other products, a neutron.
All details of such chains need to be accounted for when determining the average energy production. As a
rough approximation, an average Q of ≈ 13MeV is taken for each 12C+12C reaction with all subsequent
reaction chains. The end products of the carbon burning are mostly 16O, 20Ne, 24Mg and 28Si.
Oxygen burning etc.
For the reaction 16O+16O to occur, a temperature of T9 > 1 is required. Due to the high temperatures,
the protons and α particles react with other nuclei in the gas. Also the neutrons react with other particles,
since they are not hampered by Coulomb repulsion. Like in the carbon burning chain, the reactions can be
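continued through different channels:
$$ {}^{16}\mathrm{O} + {}^{16}\mathrm{O} \rightarrow \left\{ \begin{array}{l} {}^{32}\mathrm{S} + \gamma \\ {}^{31}\mathrm{P} + p \\ {}^{31}\mathrm{S} + n \\ {}^{28}\mathrm{Si} + \alpha \\ {}^{24}\mathrm{Mg} + 2\alpha. \end{array} \right. \tag{6.25} $$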
A large number of chain reactions follow, which create, aside from Al, Mg and Ne, large amounts of free
neutrons, protons and α particles. These will in turn react with the 28Si isotopes and gradually produce
heavier elements.
When the oxygen has been burned, a new stage of contraction and heating of the stellar core starts.
One may expect that the next burning cycle, the burning of magnesium, would start. However, before the
temperature is sufficiently high for this burning to take place, another type of reaction occurs. Because of
the ever increasing temperature, the thermal energy of the photons has strongly increased as well. At a
temperature of ≈ $10^{9}$ K, the typical photon energy kT is about 0.1 MeV, so that photons in the high-energy
tail of the Planck distribution have energies of the order of MeV. Such highly
energetic photons can cause photo-dissociation of the nuclei in the gas. Photo-dissociation is a process in
which radiation energy is absorbed by matter, breaking up a nucleus (as opposed to the aforementioned
burning cycles, which convert matter into radiation). A related process, in which a high-energy photon is
converted into an electron-positron pair, occurs when the photon energy hν exceeds the rest-mass energy
of the electron-positron pair: $h\nu > 2 m_e c^{2}$.
An example of photo-dissociation is:
$$ {}^{32}\mathrm{S} + \gamma \rightleftharpoons {}^{28}\mathrm{Si} + {}^{4}\mathrm{He}. \tag{6.26} $$
The double (right-left) arrow indicates that, after the formation of the α particle, it can react with other
nuclei, such as 28Si, and the inverse reaction can take place. There are many analogous reactions that take
place through photo-dissociation, involving the nuclei of 32S, 36Ar, 40Ca, 52Fe and 56Ni. Furthermore, the
absorption of protons and neutrons, as well as the decay of unstable nuclei, need to be added to the reactions.
Hence, extremely complicated reaction chains come into being, which eventually lead to the formation of
heavier elements. This whole process is, misleadingly, called silicon burning.
Figure 6.6: The periodic table as filled by nucleosynthesis in stars, where the number of protons is indicated
for each of the chemical elements. (Figure courtesy of Prof. Jennifer Johnson, Ohio State University, USA)
When this process is continued, and there is enough time available, the formation of 56Fe will be
completed. Because the 56Fe isotope is strongly bound (see Figure 6.4), it is the only survivor. When there
is not enough time to form 56Fe, however, 56Ni will become the most abundant element resulting from the
silicon burning. This situation occurs in supernova explosions (see Part III).
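As a rough check on the photon energies involved in the photo-dissociation stage discussed above, the sketch below converts a temperature of $10^{9}$ K into a thermal energy kT and compares it with the pair-creation threshold. The physical constants are rounded standard values added here as an assumption, and the factor 2.701 is the mean photon energy of a Planck spectrum in units of kT.

```python
K_B_EV = 8.617333e-5      # Boltzmann constant in eV/K
M_E_C2_MEV = 0.510999     # electron rest-mass energy in MeV

def kT_mev(T):
    """Thermal energy k_B T in MeV for a temperature T in K."""
    return K_B_EV * T / 1e6

T = 1e9
kT = kT_mev(T)
pair_threshold = 2 * M_E_C2_MEV
print(f"kT at 1e9 K       : {kT:.3f} MeV")
print(f"mean photon energy: {2.701 * kT:.3f} MeV")   # <E> = 2.701 kT for a Planck spectrum
print(f"pair threshold    : {pair_threshold:.3f} MeV = {pair_threshold / kT:.1f} kT")
```

The typical photon at this temperature carries a few tenths of an MeV, so MeV-scale processes are driven by the high-energy tail of the photon distribution; that tail grows rapidly as the core heats up further.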
6.5 Summary: the periodic table filled via the nuclear reactions in stars
While we have gone over the most important nuclear reactions in the previous subsections, additional reac-
tions connected with the slow and rapid neutron capture will be discussed in Part III of this course, as they
occur in particular phases of stellar evolution. Figure 6.6 provides an overview of all the chemical elements
produced in stars. These chemical yields form the basis of the chemical enrichment of exoplanetary systems,
galaxies and of the Universe as a whole.
Chapter 7
Complications: mixing of chemical elements due to transport processes
7.1 Convective mixing and nuclear burning
The mixing of stellar matter due to turbulent convective motions is a process that is active on a
very short time scale compared to the slow variations in chemical composition induced by nuclear reactions,
as these happen on a nuclear time scale. At first thought, one could therefore consider it justified to
assume that the chemical composition in a convective layer is homogeneous, hence ∂Xi/∂m ≈ 0. Within
the mass interval of the convection zone, one could therefore consider all Xi ≈ $\bar{X}_i$ to be constant. At
the edges of the convective zone, a discontinuity may occur, i.e., the “outer” value may be different from
the “inner” value $\bar{X}_i$. However, apart from the mass fractions of the species of type i, the positions of
the convection zones change with time (cf. Figure 5.4). It is thus clear that the chemical composition in a
stellar layer can change, even if there are no nuclear reactions taking place in this layer. In particular, this
happens when the border of the convective zone penetrates into a zone with a different, non-homogeneous
chemical composition. In this way, the products of previous nuclear reactions can be traced because they
can get transported throughout the star. Such element transport may bring processed material up to the
stellar surface, if mixing occurs throughout the stellar layers between the centre of the star and its surface.
On the other hand, “fresh fuel” can be transported into the zone where nuclear burning occurs and this can
drastically change the stellar evolution. Indeed, fresh fuel leads to a prolongation of the nuclear burning
cycle for stars with a convective core. Care must therefore be taken to describe chemical mixing properly
by solving the transport equations, rather than making too crude approximations.
The rate of change of the mass fraction Xi of species of type i is caused by various processes aside
from the nuclear burning, denoted as Ei. Some of the extra mixing due to element transport behaves
diffusively (e.g., when caused by gradients of physical quantities), while other processes are advective (e.g.,
large coherent motions of fluid elements such as circulation due to rotation). When the rate of change of Xi
happens much faster than the nuclear timescale, as is the case in a convection zone, it is customary to
approximate ∂Xi/∂t by a diffusion equation, mainly for computational convenience. In the simplest case of
chemical mixing due to convective motions, along with nuclear fusion, the transport equation describing the
time-variability of the mass fractions can be written in Lagrangian format as
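$$ \frac{\partial X_i}{\partial t} = E_i + \frac{\partial}{\partial m}\left[ \left(4\pi\rho r^{2}\right)^{2} D_{\rm conv}(m)\, \frac{\partial X_i}{\partial m} \right], \tag{7.1} $$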
where the diffusion coefficient associated with the convective mixing described by mixing length theory is
given by
$$ D_{\rm conv} = \frac{1}{3}\,\alpha_{\rm mlt}\, H_p\, v_{\rm conv} \tag{7.2} $$
and is thus expressed in the physical unit cm$^{2}$ s$^{-1}$ when adopting the cgs system.
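To illustrate how Eqs. (7.1) and (7.2) act in practice, the sketch below evaluates the convective diffusion coefficient for a set of hypothetical input values and then applies a simple explicit finite-difference update to a step-like composition profile. It is a toy model under stated assumptions: no nuclear burning (Ei = 0), a constant coefficient K standing in for (4πρr²)² D, dimensionless units, and zero flux through both edges of the mixed region. It is not how a stellar evolution code solves the equation.

```python
def d_conv(alpha_mlt, H_p, v_conv):
    """Eq. (7.2): convective diffusion coefficient in cm^2 s^-1 (cgs inputs)."""
    return alpha_mlt * H_p * v_conv / 3.0

def diffuse(X, K, dm, dt, n_steps):
    """Explicit, conservative update of dX/dt = d/dm (K dX/dm) with zero-flux edges.

    K stands for (4 pi rho r^2)^2 * D and is taken constant here (assumption).
    """
    X = list(X)
    for _ in range(n_steps):
        # Diffusive fluxes at cell interfaces; no flux through the two boundaries.
        flux = [0.0] + [K * (X[j + 1] - X[j]) / dm for j in range(len(X) - 1)] + [0.0]
        X = [X[j] + dt * (flux[j + 1] - flux[j]) / dm for j in range(len(X))]
    return X

# Hypothetical values for a convective core, chosen only for illustration.
D = d_conv(alpha_mlt=1.8, H_p=5.0e9, v_conv=1.0e4)
print(f"D_conv = {D:.2e} cm^2/s, mixing time ~ H_p^2/D = {(5.0e9)**2 / D:.2e} s")

# Step profile in dimensionless units: X = 0.3 in the inner half, 0.7 in the outer half.
n, K = 50, 1.0
dm = 1.0 / n
dt = 0.4 * dm**2 / K          # below the explicit stability limit dm^2 / (2 K)
X0 = [0.3] * (n // 2) + [0.7] * (n - n // 2)
X1 = diffuse(X0, K, dm, dt, 20000)
print(f"initial spread = {max(X0) - min(X0):.3f}, final spread = {max(X1) - min(X1):.2e}")
print(f"mean before = {sum(X0) / n:.6f}, mean after = {sum(X1) / n:.6f}")
```

The flux form of the update conserves the total amount of the species while the profile flattens, which is the behaviour expected of efficient convective mixing. With the hypothetical inputs used here the mixing time is of the order of days, far shorter than any nuclear time scale.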
7.2 Convective boundary mixing aka CBM
It is difficult to precisely locate the boundary layer between radiative and convective zones. Even though
the stability criterion (5.1) allows us to derive the zones where convection takes place inside the star, a com-
plication occurs in the transition layers between convective and radiative zones, hereafter termed convective
boundary layers. The fluid elements inside a convection zone experience a turbulent motion with velocity,
vconv. When they reach the convective boundary layer, their inertia will prevent them from stopping abruptly,
i.e., they will “overshoot” from the convection zone into the radiative layer over an unknown distance, which
we denote here with the parameter αov (in analogy to αmlt, it is expressed in the unit Hp). The way in which
the fluid elements overshoot the convective boundary depends on the location of the convection zone inside
the star and the physical circumstances at that position. It may be different for convective envelopes than for
convective cores.
Values for αov cannot be derived from first principles. Their derivation is an active research field within
3D hydrodynamical simulation studies. Such studies have indicated at least three physical processes that
may come into play: 1) convective penetration by plumes leading to superadiabatic mixing over a distance;
2) subadiabatic convective core overshooting due to thermal diffusion over a distance described by means
of an exponentially decaying mixing profile; 3) turbulent entrainment that occurs over a dissipation length
scale. All of these cases imply a different and uncalibrated level and functional form of convective boundary
mixing (abbreviated as CBM from here on) and have a different temperature gradient in the transition layer. We
use the global symbolic notation of the free parameter αov to express the unknown length scale over which
the fluid elements move from inside a convective region into the radiative adjacent zones, irrespective of the
profile that captures the efficiency of the mixing. Equation (7.1) gets extended to include CBM as follows:
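$$ \frac{\partial X_i}{\partial t} = E_i + \frac{\partial}{\partial m}\left[ \left(4\pi\rho r^{2}\right)^{2} \left(D_{\rm conv} + D_{\rm ov}\right) \frac{\partial X_i}{\partial m} \right], \tag{7.3} $$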
where the unknown profile due to CBM connected with the motions of the fluid elements beyond the
convective boundary is captured by the local diffusion coefficient Dov. Both Dconv(m) and Dov(m) involve
at least one free parameter (denoted here as αmlt and αov).
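One of the functional forms mentioned above, the exponentially decaying diffusive profile, can be sketched as follows. The specific expression used here, D0 exp(−2 (r − r0)/(f_ov Hp)), is the form commonly adopted in stellar evolution codes and is an assumption added for illustration; the text itself does not spell out the formula. All numerical inputs are hypothetical.

```python
import math

def d_overshoot(r, r0, D0, f_ov, H_p):
    """Exponentially decaying overshoot mixing, D0 * exp(-2 (r - r0) / (f_ov H_p)).

    r0 is the radius where the profile is attached to the convective core and
    D0 the convective diffusion coefficient there. Inside r0 this sketch
    simply returns D0.
    """
    if r <= r0:
        return D0
    return D0 * math.exp(-2.0 * (r - r0) / (f_ov * H_p))

# Hypothetical numbers in cgs units, for illustration only.
r0, H_p, D0 = 1.0e10, 5.0e9, 1.0e13
z = 0.05 * H_p                          # distance beyond the convective boundary
for f_ov in (0.005, 0.020, 0.040):
    D = d_overshoot(r0 + z, r0, D0, f_ov, H_p)
    print(f"f_ov = {f_ov:.3f}: D_ov(r0 + 0.05 H_p) = {D:.3e} cm^2/s")
```

A larger f_ov makes the mixing coefficient fall off more slowly beyond the boundary, so more material from the radiative layer is brought into the core. This single parameter plays the role of αov for this particular profile shape.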
[Figure 7.1, two panels: evolutionary tracks for 1.7 M⊙ and 5 M⊙; horizontal axis Teff [K], vertical axis log L/L⊙]
Figure 7.1: Evolutionary tracks for two masses as indicated, computed with the MESA stellar evolution
code (see Chapter 8). The tracks were computed for X = 0.71, Z = 0.014 and for three different values
of convective core overshooting in the approximation of a diffusive exponentially decaying profile with pa-
rameter fov=0.005 (full lines), 0.020 (dashed lines), 0.040 (dotted lines) and with constant envelope mixing
Denv = 100 cm$^{2}$ s$^{-1}$. The thick dot indicates the first model along the track whose hydrogen mass fraction
in the core, Xc < $10^{-4}$. (Figure reproduced from data in Johnston et al., 2019, A&A, Vol. 632, id.A74 by
courtesy of Dr. Cole Johnston, KU Leuven)
For stars with a core that is convective due to nuclear reactions taking place in the central regions, the
lack of calibration of the physics in the convective boundary layers implies a serious limitation. Indeed,
the CBM influences the amount of matter that can be brought into the central regions where nuclear fusion
takes place. The higher the CBM, the more fresh fuel reaches the nuclear reactor and hence the longer the
nuclear fusion can go on. Figure 7.1 shows the impact of CBM on evolutionary tracks for stellar models
with a convective core. The CBM not only affects the evolutionary tracks, but has a major impact on the
star’s core mass and on its ageing. For this reason, calibration of the amount of matter in the convective
core of a star, Mcore, via observational estimation of the profile Dov(m) is a crucial piece of information to
predict the evolution of stars with M > 1.5M⊙ for which the extent of the convective core directly relates
to the amount of stellar matter that takes part in the nuclear fusion. In this sense, the parameter αov is a
critical unknown in the theory of stellar structure. Independent methods to determine αov observationally
are explained in the courses Asteroseismology and Binary Stars.
7.3 Mixing due to rotational instabilities and waves
7.3.1 Models with rotation
As already mentioned in Chapter 3, rotation is common in all stars throughout their lives. Rotation acts upon
stellar structure models in at least three ways: it deforms the star from spherical symmetry, it leads to higher
polar than equatorial flux due to gravity darkening, and it induces various instabilities and chemical mixing
in the stellar interior. The level of confidence in how to treat these effects is different for the three aspects.
Figure 7.2: Left: Core/near-core rotation frequencies derived from nonradial oscillations for 1210 stars
observed with the NASA Kepler space telescope. The stars span the entire stellar evolution from the ZAMS
to white dwarfs. Right: stars with an additional measurement of the envelope/surface rotation frequency.
We refer to the course Asteroseismology and to the review paper Aerts et al., 2019, ARAA, Vol. 57, p.35 –
78 for details.
By definition, the critical (or break-up) velocity is reached when the outwardly directed centrifugal
acceleration is equal to the inward effective gravitational acceleration at any one place on the stellar surface.
Usually the Roche approximation is adopted, which assumes that the mass concentration inside the star
is not distorted by the rotation. In this case, the polar (Rp) and equatorial (Re) radii of the star differ by
at most a factor 3/2, which is reached at critical rotation: Re,crit/Rp,crit = 3/2. This leads to the critical
rotation frequency given by
$$ \Omega_{\rm crit} = \sqrt{\frac{GM}{R_{e,{\rm crit}}^{3}}} = \sqrt{\frac{8\,GM}{27\,R_{p,{\rm crit}}^{3}}}, $$
with Re,crit and Rp,crit the critical equatorial and polar radius of
the star, respectively. This is the solution for the critical rotation frequency when the Eddington parameter
Γ = κL/(4πcGM⋆), which we introduce in Chapter 13, satisfies Γ < 0.639. We do not treat the other solution here
but refer to the monograph by Maeder (2009) for details. The vast majority of stars rotate at less than half of
the critical rotation frequency, implying that their polar radius is between 90% and 100% of their equatorial
radius. The deformation of most stars thus remains limited.
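The statement about the limited deformation can be checked within the Roche approximation. Requiring the Roche potential to be constant over the stellar surface, and using the expression for the critical rotation frequency in terms of the polar radius, gives 1/x + (4/27) ω² x² = 1 for x = Re/Rp and ω = Ω/Ωcrit. This relation is not written out in the text; it is a standard result added here, under the further assumption that the polar radius does not change with rotation. The sketch below solves it by bisection.

```python
def equatorial_to_polar_ratio(omega):
    """Re/Rp on a Roche surface for omega = Omega / Omega_crit (0 <= omega <= 1).

    Solves 1/x + (4/27) omega^2 x^2 = 1 for x = Re/Rp by bisection. The polar
    radius is assumed not to change with rotation (an approximation).
    """
    def f(x):
        return 1.0 / x + (4.0 / 27.0) * omega**2 * x**2 - 1.0

    lo, hi = 1.0, 1.5        # f(lo) >= 0 >= f(hi) for 0 <= omega <= 1
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for omega in (0.0, 0.5, 0.9, 1.0):
    x = equatorial_to_polar_ratio(omega)
    print(f"Omega/Omega_crit = {omega:.1f}: Re/Rp = {x:.4f}, Rp/Re = {1 / x:.4f}")
```

At half the critical rotation frequency the polar radius is still about 96% of the equatorial radius, while the ratio Re/Rp only approaches 3/2 very close to critical rotation.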
Gravity darkening was first discussed by von Zeipel in 1924. It stands for a reduction in the flux and
hence in the effective temperature of the star resulting from the reduced gravity in the equatorial regions
compared to the polar ones. Usually, the von Zeipel effect is expressed as
$$ T_{\rm eff} = T_{\rm eff,p} \left( \frac{g_{\rm eff}}{g_{\rm eff,p}} \right)^{\beta}, \tag{7.4} $$
where Teff,p and geff,p are the effective temperature and effective gravity at the pole of the star. For a
radiative envelope as considered by von Zeipel, β ≃ 0.25. In the presence of a convective envelope, β
is usually assumed to be β ≲ 0.1. This limited knowledge of the exponent β, and along with it a
non-symmetrical stellar wind (see Chapter 13), implies a non-trivial treatment in stellar evolution computations
in the presence of fast rotation.
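A short numerical illustration of Eq. (7.4) shows how strongly the choice of β matters. The gravity ratio of 0.8 used below is an assumed, illustrative value and does not refer to a particular star.

```python
def teff_ratio(g_ratio, beta):
    """Eq. (7.4): Teff / Teff,p for a local g_eff / g_eff,p and exponent beta."""
    return g_ratio ** beta

# Illustrative case: the effective gravity at the equator is 80% of the polar value.
for label, beta in (("radiative envelope", 0.25), ("convective envelope", 0.10)):
    ratio = teff_ratio(0.8, beta)
    print(f"{label:20s} beta = {beta:.2f}: Teff,eq / Teff,p = {ratio:.4f}")
```

For the same reduction in gravity, the equatorial temperature drop is more than twice as large for β = 0.25 as for β = 0.1.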
The prediction Re,crit/Rp,crit = 3/2 of the Roche approximation, along with von Zeipel’s formula
(7.4), can be evaluated directly from interferometric measurements of stellar surfaces. Such observations
indeed show that fast rotators are oblate and that their surface properties and winds are not spherically
symmetric. However, such stars are rare, while the computation of 2D equilibrium models in the presence
of rotation comes with major uncertainty even in its simplest aspects of the local surface and its flux. For
this reason, computations of 2D models of stellar interiors are often restricted to static polytropic models.
A good compromise for stars rotating faster than about half of the critical rotation rate is to use only the
spherically symmetric component of the centrifugal force in the equation of hydrostatic equilibrium:
$$ \frac{\partial P}{\partial r} = -\frac{G m \rho}{r^{2}} + \frac{2}{3}\,\rho\, r\, \Omega^{2}, \tag{7.5} $$
while sticking to 1D evolutionary models.
7.3.2 Rotational and pulsational mixing
Stellar models with rotation usually adopt the approximation of shellular rotation, where one assumes that
the chemical composition and the angular velocity remain constant on isobars. As such, the ratio of the
rotation frequency inside the star, Ω(r), with respect to Ωcrit (or the accompanying vrot(r)/vcrit) is used as
input for the numerical computations of the stellar models. Even 1D models of slowly rotating stars based
upon the simplified Eq. (7.5) face challenges, as such models cannot explain the latest space asteroseismic
data shown in Figure 7.2.
From a theoretical perspective, rotation is expected to induce a myriad of processes and instabilities in
the stellar interior, leading to transport of both angular momentum and of chemical species. This is exten-
sively discussed in the review paper by Aerts et al. (2019), to which we refer for details. In summary, the
processes can be classified into four main categories: meridional circulation, hydrodynamical instabilities,
magnetorotational instabilities, and internal gravity waves. The concept of rotational mixing in stellar evo-
lution theory as treated in the literature usually stands for the macroscopic element transport and chemical
mixing due to the action of circulation and all instabilities together. Further, in analogy to rotational mixing,
the term pulsational mixing is sometimes used for macroscopic element transport in radiative zones caused
by internal gravity waves.
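The Eulerian transport equation controlling the evolution of the specific angular momentum, r²Ω(r), in the
presence of rotation reads
$$ \frac{\partial}{\partial t}\left(r^{2}\Omega\right) = \frac{1}{5\rho r^{2}} \frac{\partial}{\partial r}\left[\rho r^{4}\,\Omega\, U(r)\right] + \frac{1}{\rho r^{2}} \frac{\partial}{\partial r}\left(\rho r^{4} D_{\rm shear} \frac{\partial \Omega}{\partial r}\right). \tag{7.6} $$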
Here U(r) is the radial component of the velocity due to meridional circulation and the local diffusion
coefficient Dshear(r) represents the vertical mixing due to a variety of shear instabilities occurring between
layers subject to different velocities (see Maeder, 2009).
Figure 7.3: Schematic representation of mixing profiles, Dmix(r), due to various transport processes in
stars with a convective core (indicated in grey) and a radiative envelope, for a model with CBM (purple)
described by exponentially decaying diffusive core overshooting (upper) or convective penetration (lower)
and different types of envelope mixing (pink). (Figure courtesy of Dr. May Gade Pedersen, KU Leuven)
The transport of the chemical species due to rotation can be approximated as a diffusive process in
the presence of strong horizontal turbulence due to shear instabilities. For this reason, the diffusive part
in the chemical composition equations in Eq. (7.3) gets extra terms due to various effects of rotation, each
with their own local diffusion coefficient. Aside from rotation, additional causes of element mixing are
also considered, particularly in transition layers that are stable against the Schwarzschild criterion, but un-
stable against the Ledoux criterion, both for the case of ∇µ > 0 (called semiconvective mixing as already
discussed) and ∇µ < 0 (called thermohaline mixing, which occurs in evolved stars due to dredge ups –
see Part III). For a star with CBM, semiconvection is not of importance as the convective overshooting or
convective penetration largely dominate over semiconvection.
Overall, one has to deal with a multitude of (uncalibrated!) extra diffusion coefficients that affect
the chemical composition profiles of the star, aside from Dconv(m) and Dov(m) included in Eq. (7.3). For
rotation, these have been grouped as Dshear(m) and Deff(m), adopting the notation by Maeder (2009), where
the latter is the diffusive effect due to meridional circulation in the presence of strong horizontal turbulence,
while the former stands for the vertical shear due to the joint effect connected with all sorts of rotational
(and possibly magnetic) instabilities.
Pulsational mixing profiles due to internal gravity waves, adopting a diffusion approximation, have
been derived from hydrodynamical simulations, resulting in a diffusion coefficient depending on the density
as $D_{\rm IGW}(r) \sim D_{\rm IGW} \cdot \rho^{-\gamma}(r)$ with γ ∈ [0.5; 1], where the proportionality factor $D_{\rm IGW}$ remains unknown.
Figure 7.3 offers a schematic representation of some macroscopic diffusive mixing profiles that have been
considered in recent stellar evolution computations. The level of efficiency of the chemical mixing at the
position where the purple and pink profiles coincide remains uncalibrated so far. The particular shape of the
third panel from the left in Figure 7.3 is due to the drop in velocity U(r) ≃ 0 in some of the envelope layers
near m/M⋆ ≃ 0.5 in a 5 M⊙ star just after its birth. In general, the mixing profiles as indicated in Figure 7.3
vary strongly during the evolution of the star, but it is poorly understood how, given limitations in the theory
of element and angular momentum transport.
It also remains unknown how the diffusion coefficients in the equation describing the element trans-
port on the one hand, and the angular momentum transport on the other hand, scale with respect to each
other. In practice one therefore introduces extra unknown scaling factors (as free parameters, with value
between 0 and 1) between each of these two versions of the diffusion coefficients. As a consequence of this
zoo of uncalibrated local diffusion coefficients, major uncertainty occurs in the profiles of the mass frac-
tions, Xi(m, t), in the radiative zones of stellar interiors throughout the evolution of stars. For this reason,
so-called standard evolution models do not include any angular momentum and element transport in the
radiative zones.
Rotational or pulsational mixing are expected to (partly) homogenize the chemical mixture and thus
lead to flatter Xi(r) profiles in the layers where they are active compared to the case where no such extra
mixing occurs in the core boundary layer and envelope. Moreover, envelope mixing may transport elements
produced by the CNO cycle to the surface of the star. As highlighted in Chapter 6, 14N is the most important
byproduct of the CNO cycle. Hence CBM may transport it to the deep bottom of the stellar envelope, where
it can then be picked up and transported to the surface by any form of efficient envelope mixing. This is
observed in massive stars and is one of the major reasons to consider rotational mixing in stellar evolution
theory (see Figure 7.4).
The extra rotational or pulsational mixing processes occurring in the near-core boundary layers and in
the stellar envelope are assumed to happen on short time scales compared to the evolutionary time scale.
Rotating models including these ingredients therefore often ignore the element transport due to microscopic
atomic diffusion giving rise to concentrations of particular chemical species (see next subsection). However,
there is no justified physical reason for this “computationally convenient” simplification when the time scales
and levels of these processes are similar.
Figure 7.4: Predicted surface nitrogen abundance as a function of the projected rotational velocity v sin i
measured from spectral line broadening for observed stars (indicated as dots). The results for stellar models
with convective penetration as CBM over a distance αov = 0.35Hp and rotational envelope mixing based
on meridional circulation and shear instabilities (following Dmix(r) according to the lower right panel of
Figure 7.3) are indicated in colour. The observed stars reveal that these models with rotational mixing are only
partly able to explain the observed N excess at the stellar surface. In particular, these models are challenged
by the measured N excess in slowly rotating stars. (From Brott et al., 2011, A&A, Vol. 530, A115, 20pp.)
7.4 Microscopic atomic diffusion
Aside from full and instantaneous mixing in convective regions and full or partial mixing in the convective
boundary and radiative layers, the profiles of the mass fractions Xi(r, t) may also change due to microscopic
transport processes. In this section, we consider element transport caused by microscopic atomic diffusion,
which may induce changes in Xi(r, t) caused by gradients operating in the radiative layers of the star. These
gradients may introduce lower or higher concentrations of particular chemical species at particular layers
within a radiatively stratified zone because heavy elements sink, while light elements surface.
A key aspect of assessing the importance of microscopic atomic diffusion is that the time scales on
which it acts are very different for the atmosphere than for the interior of the star. Diffusion time scales are
typically less than a century for the stellar atmosphere, while millions to billions of years for the interior
regions. Here, we limit ourselves to those aspects of atomic diffusion that act on long time scales comparable
to the contraction or nuclear time scales of the star.
Following numerous studies, four different aspects of microscopic atomic diffusion are considered in
stellar models. These occur due to pressure, temperature, and concentration gradients on the one hand, and
radiative forces on the other hand. While pressure and temperature gradients augment the concentration
of more massive species towards the centre of the star, concentration gradients have the opposite effect.
Generally, these three microscopic processes together lead to a larger concentration of heavier elements in
the deeper interior of the star. On the other hand, radiative forces levitate species with an efficiency that
depends on the details of the atomic structure of the involved isotopes. The calculation of the appropriate
radiative accelerations is therefore challenging in terms of computational needs.
Radiative levitation due to accelerations acting upon isotopes can be computed from atomic data, by
treating the appropriate multi-component gas. This requires evaluations of the frequency-dependent absorp-
tion coefficients derived from Coulomb potentials, taking into account partial ionization, and this for all the
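layers inside the star. Once the overall local velocities wi for each of the species i involved in the atomic
diffusion are computed, they can be inserted in the equation governing the time evolution of the mass fraction Xi:
$$\frac{\partial X_i}{\partial t} = E_i - \frac{\partial}{\partial m}\left(4\pi r^2 \rho X_i w_i\right) + \frac{\partial}{\partial m}\left[\left(4\pi\rho r^2\right)^2 \left(D_{\rm conv} + D_{\rm ov} + D_{\rm env}\right)\frac{\partial X_i}{\partial m}\right], \tag{7.7}$$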
where the 2nd term on the right-hand side is the result of the radiative levitation acting upon species Xi and
the 3rd term is due to microscopic and macroscopic transport of the chemical species due to convection,
CBM, envelope mixing and atomic diffusion together. The computation of wi necessary to solve the set of
I Eqs (7.7) at each step of the evolution, is a major CPU challenge. This computational inconvenience is
the reason why radiative levitation is most often omitted in stellar evolution computations. Moreover, for
stars with a radiative or a very thin convective envelope (i.e., born with M ≳ 1.3 M⊙), the sinking of heavy
elements from the surface towards the interior happens on such a short time scale that it results in unrealistic
stellar atmosphere models depleted in all elements heavier than H. For this reason, such stellar models often
require a further ad-hoc turbulent diffusion coefficient to be at work near the surface along with a thin stellar
wind, to keep the metals in the appropriate position compatible with surface abundance measurements. For
all these reasons, stellar models are often computed for the simplest case of atomic diffusion, namely helium
settling at the interface of the thick convective outer envelope and the radiative interior of low-mass cool
stars. This is entirely appropriate for stars like the Sun, with sufficiently cool envelope layers to prevent
radiative levitation from occurring. Mixing at this interface due to helium settling is well calibrated for the Sun
from helioseismology.
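To make the structure of the diffusive term in Eq. (7.7) concrete, the following is a minimal sketch of one implicit time step for that term alone, i.e. with Ei = 0 and wi = 0. The function name `diffuse_step`, the cell-centred grid, the zero-flux boundaries and all numbers are assumptions chosen for illustration (dimensionless toy values, not stellar ones); the quantity `sigma_face` stands for (4πρr²)²(Dconv + Dov + Denv) evaluated between neighbouring cells.

```python
import numpy as np

def diffuse_step(X, dm, sigma_face, dt):
    """One backward-Euler step of dX/dt = d/dm( sigma * dX/dm ).

    X          : mass fraction in each cell (length n)
    dm         : mass of each cell (length n)
    sigma_face : (4 pi rho r^2)^2 * D on the n-1 interior cell faces
    dt         : time step
    The diffusive flux is set to zero at the centre and at the surface.
    """
    n = len(X)
    dm_face = 0.5 * (dm[:-1] + dm[1:])   # mass distance between cell centres
    w = sigma_face / dm_face             # flux through face k: w[k] * (X[k+1] - X[k])
    L = np.zeros((n, n))
    for k in range(n - 1):
        # cell k gains what flows in through face k, cell k+1 loses it
        L[k, k] -= w[k] / dm[k]
        L[k, k + 1] += w[k] / dm[k]
        L[k + 1, k + 1] -= w[k] / dm[k + 1]
        L[k + 1, k] += w[k] / dm[k + 1]
    # Solve (I - dt L) X_new = X_old (dense solve, fine for a small toy grid)
    return np.linalg.solve(np.eye(n) - dt * L, X)

# Dimensionless toy setup: hydrogen-depleted core below m = 0.3, unprocessed envelope above
n = 50
dm = np.full(n, 1.0 / n)
m_centre = (np.arange(n) + 0.5) / n
X_old = np.where(m_centre < 0.3, 0.3, 0.7)
sigma_face = np.full(n - 1, 1e-2)
X_new = diffuse_step(X_old, dm, sigma_face, dt=1.0)

print("total hydrogen before:", np.sum(X_old * dm))
print("total hydrogen after: ", np.sum(X_new * dm))
```

Writing the term in flux form keeps the total mass of each species conserved while the composition step is smoothed out, which is the flattening of the Xi profiles described in the text. A production code would use a banded solver and couple this step to the nuclear and settling terms.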
Atomic diffusion impacts the concentration of species in the stellar interior on time scales that are
relevant for stellar evolution. Its effects are hard to unravel when only looking at a star’s luminosity and
effective temperature, which are the two quantities that define the evolutionary tracks in an HR diagram.
Models with and without atomic diffusion usually differ far less than typical observational errors of L or
log g plotted versus Teff . Given that the confrontation between data and theory in an HR diagram is often
the only available test of stellar evolutionary theory for lack of additional data (particularly for faint
or very distant stars), and given the computational requirements, microscopic atomic diffusion tends to be
ignored in stellar and galactic astrophysics. It is, however, required to be considered for archaeological
studies of the Milky Way as they rely on chemical tagging and ageing of evolved stars because the effects
of atomic diffusion accumulate along the evolution of the star.
Chapter 8
Numerical computation of stellar structure
and evolution models
In this chapter we first give an overview of the full system of basic 1D stellar structure equations deduced
in the former chapters. We ignore the Coriolis and centrifugal forces due to rotation in the equation of
motion, but we do consider the effect of mixing due to transport processes by means of uncalibrated diffusion
coefficients in the chemical equations describing the mass fractions. Furthermore we discuss the boundary
conditions that are required for the computation of stellar models and give one of the relatively simple ways
of numerical analysis revealing how the models can be computed. Subsequently, we turn our attention to
the modern state-of-the-art software suite MESA, which forms the computational tool used for the lab work
of this SSE course.
8.1 The full system of basic equations
When we put together all relevant derived equations for a spherically symmetric star, we get the following
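system of differential equations in Lagrangian form:
$$\begin{aligned}
\frac{\partial r}{\partial m} &= \frac{1}{4\pi r^2 \rho}, \\
\frac{\partial P}{\partial m} &= -\frac{Gm}{4\pi r^4} - \frac{1}{4\pi r^2}\frac{\partial^2 r}{\partial t^2}, \\
\frac{\partial l}{\partial m} &= \varepsilon_n - \varepsilon_\nu - c_P\frac{\partial T}{\partial t} + \frac{\delta}{\rho}\frac{\partial P}{\partial t} = \varepsilon_n - \varepsilon_\nu + \varepsilon_g, \\
\frac{\partial T}{\partial m} &= -\frac{GmT}{4\pi r^4 P}\,\nabla, \\
\frac{\partial X_i}{\partial t} &= E_i - \frac{\partial}{\partial m}\left(4\pi\rho r^2 X_i w_i\right) + \frac{\partial}{\partial m}\left[\left(4\pi\rho r^2\right)^2 \left(D_{\rm conv} + D_{\rm ov} + D_{\rm env}\right)\frac{\partial X_i}{\partial m}\right], \quad i = 1, \ldots, I.
\end{aligned} \tag{8.1}$$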
The last equation is in fact a system of I equations in which one equation can be replaced by the normalising
condition $\sum_i X_i = 1$. These equations describe the variation of the mass fractions Xi of the relevant species
i = 1, . . . , I considered in the chemical mixture of the star. In general ∇ stands for d lnT/d ln P , but when
the energy transport is only established by radiation (and conduction), ∇ is replaced by ∇rad, which was
defined in (5.29). When convective energy transport is important, ∇ in the fourth equation has to be replaced
by a value derived from the adopted approximation for the theory of convection. The fourth equation
assumes that the star is in hydrostatic equilibrium. As good practice in computational astrophysics,
users of stellar evolution codes should always check whether this condition is fulfilled for the numerical models
produced by the scientific software.
In the system (8.1) of coupled differential equations we can identify some partial sub-systems. The first
two equations describe the mechanical part, which is only linked to the thermonuclear part via the density,
which in turn depends on the temperature. When the density is no longer linked to the temperature, we can
solve the first two equations without taking the other three into account. We then get the mechanical structure
expressed as r(m) and P(m). The polytropic solutions are an example. The last system of equations in
(8.1) describes the overall chemical aspect of the problem.
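As an illustration of a purely mechanical structure, the sketch below integrates the standard Lane-Emden equation for a polytrope of index n, θ'' + (2/ξ)θ' + θⁿ = 0 with θ(0) = 1 and θ'(0) = 0, and returns the first zero ξ1 (the dimensionless stellar radius). The Lane-Emden form, the fixed-step Runge-Kutta integrator and the function name `lane_emden_first_zero` are assumptions made here for the example; the sketch is only meant for indices n < 5, for which the polytrope has a finite radius.

```python
import math

def lane_emden_first_zero(n, h=1e-3):
    """Integrate the Lane-Emden equation for index n with classical RK4.

    Returns the first zero xi_1 of theta(xi). Intended for 0 <= n < 5.
    """
    def rhs(xi, theta, dtheta):
        # theta'' = -theta^n - (2/xi) theta'; clip theta at 0 to avoid complex powers
        return dtheta, -max(theta, 0.0) ** n - 2.0 * dtheta / xi

    # Start slightly off-centre using the series expansion around xi = 0
    xi = h
    theta = 1.0 - h**2 / 6.0 + n * h**4 / 120.0
    dtheta = -h / 3.0 + n * h**3 / 30.0

    while xi < 1000.0:
        k1 = rhs(xi, theta, dtheta)
        k2 = rhs(xi + h / 2, theta + h / 2 * k1[0], dtheta + h / 2 * k1[1])
        k3 = rhs(xi + h / 2, theta + h / 2 * k2[0], dtheta + h / 2 * k2[1])
        k4 = rhs(xi + h, theta + h * k3[0], dtheta + h * k3[1])
        theta_new = theta + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        dtheta_new = dtheta + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        if theta_new <= 0.0:
            # Linear interpolation for the zero crossing inside the last step
            return xi + h * theta / (theta - theta_new)
        xi, theta, dtheta = xi + h, theta_new, dtheta_new
    raise ValueError("no zero found: the polytrope has no finite radius in range")

xi1_n0 = lane_emden_first_zero(0)   # analytic value: sqrt(6)
xi1_n1 = lane_emden_first_zero(1)   # analytic value: pi
print("n = 0:", xi1_n0, " n = 1:", xi1_n1)
```

The two indices used here have closed-form solutions, which makes them convenient checks before moving to indices such as n = 1.5 or n = 3 that require numerical integration.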
The equations in the system (8.1) contain functions that describe the characteristics of the stellar
material, like ρ, εn, εν, κ, cP, ∇ad, δ and via Ei also the nuclear reaction rates rjk for the chosen network of
isotopes. We assume that these are known as a function of P, T and of the chemical mixture at birth (t = t0)
described by the mass fractions Xi(m, t0) of the isotopes. In other words, we assume the equation of state
to be known, as well as the Rosseland mean opacity, the equations for the other thermodynamical character-
istics of the stellar material, the nuclear reaction rates, the energy production and energy loss by neutrinos:
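$$\begin{aligned}
\rho &= \rho(P, T, X_i), & \kappa &= \kappa(P, T, X_i), \\
c_P &= c_P(P, T, X_i), & \delta &= \delta(P, T, X_i), & \nabla_{\rm ad} &= \nabla_{\rm ad}(P, T, X_i), \\
r_{jk} &= r_{jk}(P, T, X_i), & \varepsilon_n &= \varepsilon_n(P, T, X_i), & \varepsilon_\nu &= \varepsilon_\nu(P, T, X_i).
\end{aligned} \tag{8.2}$$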
Definitions for cP , δ and ∇ad were given in Chapter 2. To compute them, we need to adopt an equation of
state throughout the star.
As already mentioned, the Rosseland mean κ is a good approximation for the opacity, except for the
outer stellar layers. The atmosphere requires a special approach for the energy transport because the mean
free path of the photons does not comply with the requirement of the diffusion approximation that this path
is much shorter than the distance the photons still have to pass before being radiated into space. Therefore
we have to solve a much more complicated energy transport equation in the stellar atmosphere. We do not
go into this matter in this course, but we refer to the course Stellar Atmospheres in the KU Leuven Master
of Astronomy & Astrophysics. Here, we simply assume that we have proper atmosphere models available
to be used as boundary conditions to close the system of coupled differential Eqs (8.1).
Let us now make the balance of the number of equations versus the number of unknowns, taking (8.2)
into account. All input functions described in (8.2) can be replaced by functions of P, T and Xi. For I
different types of species, the set of Eqs (8.1) constitutes a system of I + 4 differential equations
for the I+4 unknowns r, P, T, l,X1, . . . ,XI . The independent variables are m and t. When we assume that
the total mass of the star does not change in time (no mass loss), and when we denote the birth of the star
with t0, we have to compute solutions of the stellar structure equations in the intervals 0 ≤ m ≤M, t ≥ t0.
We have to solve a system of coupled non-linear partial differential equations. We will only find the
physically relevant solutions if we impose a correct number of appropriate boundary conditions for m = 0
and m = M and if we know the initial values of the unknown functions. To figure out for which functions
we have to know the initial values, we replace the time derivatives of P and T in the third equation of
Eqs (8.1) by the time derivative of the entropy s, −T ∂s/∂t, based on Eq. (3.46). We conclude that we can
solve the entire system Eqs (8.1) if we have initial values for the functions r(m, t0), ṙ(m, t0), s(m, t0) and
Xi(m, t0). After we have found the appropriate initial values and physically justified boundary conditions
have been formulated, it comes down to solving the system Eqs (8.1) for a given equation of state and input
physics. A solution r(m), P (m), T (m), l(m),Xi(m) for a given time t is called a stellar model.
8.2 Time scales and simplifications
There are three types of time derivatives in the system Eqs (8.1). Each of them is connected with a charac-
teristic time scale. The term with ∂2r/∂t2 was used to estimate the hydrostatic time scale τhydr, the time
derivatives in the third equation resulted in the definition of the Helmholtz-Kelvin time scale τHK and the
time derivatives in the last set of equations led to the nuclear time scale τn.
We have already shown that the inertia term in the second equation of Eqs (8.1) can be neglected if
the star evolves slowly compared to the hydrostatic time scale. When the evolution of the star is dominated
by the stable nuclear energy production, we can replace this second equation by the equation of hydrostatic
equilibrium since the Helmholtz-Kelvin time scale as well as the nuclear time scale are much longer than
the hydrostatic time scale. In this case, we only have to know initial values for the functions s(m, t0) and
Xi(m, t0) to solve the system of equations.
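The separation of the three time scales can be checked with a few lines of arithmetic. The sketch below uses the common order-of-magnitude estimates τhydr ≈ (R³/GM)^(1/2), τHK ≈ GM²/(RL) and τn ≈ 0.1 × 0.007 Mc²/L; these particular prefactors (10% of the mass burnt, 0.7% of the rest mass released in hydrogen fusion) and the rounded solar constants are assumptions made for this example and may differ by factors of order unity from the definitions derived earlier in the course.

```python
import math

# Approximate solar and physical constants in cgs units
G = 6.674e-8        # gravitational constant [cm^3 g^-1 s^-2]
c = 2.998e10        # speed of light [cm s^-1]
M_sun = 1.989e33    # [g]
R_sun = 6.957e10    # [cm]
L_sun = 3.828e33    # [erg s^-1]
YEAR = 3.156e7      # [s]

def time_scales(M, R, L, burn_fraction=0.1, mass_to_energy=0.007):
    """Order-of-magnitude hydrostatic, Helmholtz-Kelvin and nuclear time scales in seconds."""
    tau_hydr = math.sqrt(R**3 / (G * M))
    tau_hk = G * M**2 / (R * L)
    tau_n = burn_fraction * mass_to_energy * M * c**2 / L
    return tau_hydr, tau_hk, tau_n

tau_hydr, tau_hk, tau_n = time_scales(M_sun, R_sun, L_sun)
print(f"hydrostatic      : {tau_hydr / 60:.0f} minutes")
print(f"Helmholtz-Kelvin : {tau_hk / YEAR:.2e} years")
print(f"nuclear          : {tau_n / YEAR:.2e} years")
```

For the Sun the three estimates differ by many orders of magnitude, which is what justifies dropping the inertia term during core-hydrogen burning.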
8.3 Boundary conditions
Setting boundary conditions for the system of equations (8.1) is an important part of the problem to be
solved. The influence of the chosen boundary conditions on the solutions is often difficult to interpret. The
reason is that the boundary conditions for the stellar structure cannot be limited to one end of the mass
interval [0,M ], but have to be divided into conditions for the stellar centre and for the stellar surface. The
boundary conditions in the stellar core are quite simple in comparison to those for the stellar surface. These
latter have to be related to observational quantities and rely on a much more complicated energy transfer
equation. Here, we will only discuss a star in full equilibrium during the phase of core-hydrogen burning.
8.3.1 Central boundary conditions
We will search for central values for the unknowns r, l, P, T . We can immediately determine two boundary
conditions for the stellar centre (m = 0). Since the density has to remain finite, r = 0 has to be valid and
since the energy sources also have to remain finite, also l = 0 has to be valid. There are, on the contrary, no
conditions that we can impose to figure out the values for the central pressure PC and the central temperature
TC . We thus have only two boundary conditions and we will have to work each time with a two-parameter
solution for a given TC and PC . Therefore it is useful to investigate the behaviour of the four unknowns
close to the stellar core m → 0 at a certain point in time t = t0. The first equation of the system Eqs (8.1)
can be written as
$$d\left(r^3\right) = \frac{3}{4\pi\rho}\,dm. \tag{8.3}$$
For a constant density ρ = ρC (so for low values of m) we can integrate this equation. This results in
$$r = \left(\frac{3}{4\pi\rho_C}\right)^{1/3} m^{1/3}, \tag{8.4}$$
in which the integration constant was chosen to be zero to comply with the demand of r(m = 0) = 0. We
can see this result as the first term of a series expansion for r around m = 0. A similar integration of the
energy equation with condition l(m = 0) = 0 gives
$$l = \left(\varepsilon_n - \varepsilon_\nu + \varepsilon_g\right)_C\, m. \tag{8.5}$$
When we now substitute (8.4) in the equation of the hydrostatic equilibrium we get for low values of m:
$$\frac{dP}{dm} = -\frac{G}{4\pi}\left(\frac{4\pi\rho_C}{3}\right)^{4/3} m^{-1/3}, \tag{8.6}$$
which can be integrated to obtain
$$P - P_C = -\frac{3G}{8\pi}\left(\frac{4\pi\rho_C}{3}\right)^{4/3} m^{2/3}. \tag{8.7}$$
Furthermore the pressure gradient has to disappear in the stellar core as follows from the equation of hydro-
static equilibrium dP/dr ∼ m/r2 ∼ r3/r2 → 0.
For the variation of the temperature close to the centre, we limit to the radiative case, in which
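$$\frac{dT}{dm} = -\frac{3}{64\pi^2 ac}\,\frac{\kappa l}{r^4 T^3}. \tag{8.8}$$
For P → PC and T → TC the opacity will converge to a certain value κC. When we substitute l by (8.5)
and r by (8.4) we can integrate (8.8) for small m values. We then get
$$T^4 - T_C^4 = -\frac{1}{2ac}\left(\frac{3}{4\pi}\right)^{2/3} \kappa_C \left(\varepsilon_n - \varepsilon_\nu + \varepsilon_g\right)_C\, \rho_C^{4/3}\, m^{2/3} \tag{8.9}$$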
when the energy transport in the core is radiative.
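The central expansions (8.4), (8.5), (8.7) and (8.9) translate directly into code, and a numerical derivative offers a quick consistency check against the differential equations they were derived from. In the sketch below the function names and the rounded solar-like central values (ρC, PC, TC, κC and the net energy generation rate) are assumptions chosen only to obtain sensible magnitudes.

```python
import math

G = 6.674e-8         # gravitational constant [cm^3 g^-1 s^-2]
A_RAD = 7.5657e-15   # radiation density constant a [erg cm^-3 K^-4]
C_LIGHT = 2.998e10   # speed of light [cm s^-1]

def r_centre(m, rho_c):
    """Eq. (8.4): radius of the sphere containing mass m at constant density."""
    return (3.0 * m / (4.0 * math.pi * rho_c)) ** (1.0 / 3.0)

def l_centre(m, eps_c):
    """Eq. (8.5): eps_c stands for (eps_n - eps_nu + eps_g) at the centre."""
    return eps_c * m

def P_centre(m, P_c, rho_c):
    """Eq. (8.7): pressure close to the centre."""
    return P_c - (3.0 * G / (8.0 * math.pi)
                  * (4.0 * math.pi * rho_c / 3.0) ** (4.0 / 3.0) * m ** (2.0 / 3.0))

def T_centre(m, T_c, rho_c, eps_c, kappa_c):
    """Eq. (8.9): temperature close to the centre for radiative transport."""
    drop = (1.0 / (2.0 * A_RAD * C_LIGHT) * (3.0 / (4.0 * math.pi)) ** (2.0 / 3.0)
            * kappa_c * eps_c * rho_c ** (4.0 / 3.0) * m ** (2.0 / 3.0))
    return (T_c**4 - drop) ** 0.25

# Illustrative, rounded solar-like central values (assumed, cgs units)
rho_c, P_c, T_c = 150.0, 2.4e17, 1.57e7
eps_c, kappa_c = 17.0, 1.2
m = 1e-3 * 1.989e33   # one thousandth of a solar mass

print("r =", r_centre(m, rho_c), "cm")
print("l =", l_centre(m, eps_c), "erg/s")
print("P/P_c =", P_centre(m, P_c, rho_c) / P_c)
print("T/T_c =", T_centre(m, T_c, rho_c, eps_c, kappa_c) / T_c)
```

In a Henyey-type scheme these expressions play the role of the innermost interval, where the regular difference equations cannot be used because r and l vanish at m = 0.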
8.3.2 Boundary conditions for the surface
It is complicated to deduce appropriate boundary conditions for the surface. As a very rough approach we
could take at first instance the naive conditions P → 0 and T → 0 for m→M . This indeed expresses that
P and T take very small values at the stellar surface in comparison with the values in the stellar interior, but
in the end the temperature and the pressure at the stellar surface are not zero.
The next step is going to a sphere which we can call the surface of the star and which defines the stellar
radius r = R. In the study of the stellar atmosphere, one uses the photosphere, i.e., the sphere where the
optical depth, defined as
$$\tau \equiv \int_R^\infty \kappa\rho\,dr = \kappa_{\rm phot}\int_R^\infty \rho\,dr, \tag{8.10}$$
equals 2/3. Here κphot represents a mean opacity of the photosphere. In hydrostatic equilibrium, the pressure
in this photosphere is determined by the matter above it. The gravity can be assumed as constant, g = GM/R²,
in this area because the photosphere is a thin shell containing only a very small amount of matter.
With the help of (3.15) and (8.10) we then get for τ = 2/3
$$P_{r=R} = \int_R^\infty g\rho\,dr = \frac{GM}{R^2}\int_R^\infty \rho\,dr = \frac{GM}{R^2}\,\frac{2}{3}\,\frac{1}{\kappa_{\rm phot}}. \tag{8.11}$$
The temperature in the photosphere is, to a good approximation, given by the effective temperature of the
star.
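The two photospheric conditions can be evaluated in a few lines. The sketch below implements Eq. (8.11) for the pressure and takes the temperature equal to the effective temperature, using the usual definition L = 4πR²σTeff⁴ (assumed here, as it is not restated in this section). The function names and the opacity value κphot = 0.2 cm² g⁻¹ are placeholders for illustration, not calibrated quantities.

```python
import math

G = 6.674e-8          # gravitational constant [cm^3 g^-1 s^-2]
SIGMA_SB = 5.6704e-5  # Stefan-Boltzmann constant [erg cm^-2 s^-1 K^-4]
M_sun, R_sun, L_sun = 1.989e33, 6.957e10, 3.828e33   # approximate solar values (cgs)

def photospheric_pressure(M, R, kappa_phot):
    """Eq. (8.11): pressure at optical depth 2/3 for constant gravity g = GM/R^2."""
    return (2.0 / 3.0) * G * M / (R**2 * kappa_phot)

def effective_temperature(L, R):
    """Effective temperature from L = 4 pi R^2 sigma T_eff^4."""
    return (L / (4.0 * math.pi * R**2 * SIGMA_SB)) ** 0.25

kappa_phot = 0.2   # placeholder mean photospheric opacity [cm^2 g^-1]
P_phot = photospheric_pressure(M_sun, R_sun, kappa_phot)
T_phot = effective_temperature(L_sun, R_sun)
print(f"P(r=R) = {P_phot:.3e} dyn cm^-2, T(r=R) = {T_phot:.0f} K")
```

Both values are many orders of magnitude below the central pressure and temperature, consistent with the remark that P → 0 and T → 0 is a rough but not absurd first approximation.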
The photospheric boundary conditions deduced for Tr=R and Pr=R give two relations between the
surface values for the functions P, T, l, r that are certainly an improvement compared to the naive boundary
conditions P → 0 and T → 0. The weakest point when using them is that they were deduced for an area
where the basic assumption we made for the description of the radiative energy transport, namely that the
mean free path of a photon is much shorter than the distance to the stellar surface, is no longer valid. In fact,
a much more complicated energy transport equation should be used in the photosphere. We again refer to
the course Stellar Atmospheres.
In practice the transition of the solutions that are valid “inside” the atmosphere towards the ones that
are valid “outside” of it will happen by choosing a fitting point mf where both solutions will be coupled to
each other. Hence, mf has to be situated deep enough into the stellar interior because the assumptions made
to deduce the equations should still be valid. We then get solutions of these equations in the fitting point:
$r_f^{\rm in}, P_f^{\rm in}, T_f^{\rm in}, l_f^{\rm in}$. On the other hand, mf has to be close enough to M so that we can use the simplification
of an outer layer in thermal equilibrium where l = L. The smaller M − mf, the less energy
can be gathered or released in the outer layer. In the study of stellar atmospheres, the solutions for the four
unknown functions $r_f^{\rm out}, P_f^{\rm out}, T_f^{\rm out}, l_f^{\rm out}$ are computed. One can show that these solutions are functions of
the parameters R and L. The boundary conditions are the four solutions constructed for the interior in such
a way that they are equal to the ones computed for the atmosphere:
$$r_f^{\rm in} = r_f^{\rm out}, \quad P_f^{\rm in} = P_f^{\rm out}, \quad T_f^{\rm in} = T_f^{\rm out}, \quad l_f^{\rm in} = l_f^{\rm out}. \tag{8.12}$$
Those four boundary conditions can be met because we have enough free parameters: TC and PC for the
internal solutions and R and L for the external solutions.
For numerical applications (see next section) the following procedure is used. In the point mf we find
solutions for the outside of the atmosphere: $r_f^{\rm out}(R, L), P_f^{\rm out}(R, L), T_f^{\rm out}(R, L), l_f^{\rm out}(R, L)$ by numerical
integration of the equations that are relevant in the atmosphere. The last function is very simple: $l_f^{\rm out} = L$.
The first one can be inverted without any problems which leads to $R = R(r_f^{\rm out}, L)$. This equation is
now used to express the R-dependence of the other two functions: $P_f^{\rm out}(R(r_f^{\rm out}, L), L) \equiv \pi(r_f^{\rm out}, L)$ and
$T_f^{\rm out}(R(r_f^{\rm out}, L), L) \equiv \theta(r_f^{\rm out}, L)$, where π and θ are known functions of $r_f^{\rm out}$ and $l_f^{\rm out} = L$. We now
replace the variables for the outside by their equivalents at the inside, taking the fitting conditions (8.12) into
account:
$$P_f^{\rm in} = \pi(r_f^{\rm in}, L), \quad T_f^{\rm in} = \theta(r_f^{\rm in}, L). \tag{8.13}$$
These are two boundary conditions for the internal solutions. They were constructed in a way that, when a
good internal solution is found, this can always be coupled to the external solution in a continuous way.
8.4 A simple numerical scheme: the Henyey method
An analytical solution of the system Eqs (8.1) is not possible for realistic equations of state. We should
thus search for numerical solutions to solve the system of equations. Due to computational demands the
calculations of the solutions for the whole system have only been possible during the last half century.
Before modern computers were available, simpler stellar models, like polytropes, had to be used to predict
stellar evolution. One of the numerical methods that has seen widespread use to compute solutions of the
system Eqs (8.1) is the Henyey method, which we will discuss now. This method is particularly suitable to
compute stellar models during the core-hydrogen burning stage. Further in this chapter, and in the course lab
work, we introduce the students to a modern state-of-the-art stellar evolution code that is able to compute
stellar models at all the stages of the evolution and for all mass ranges.
The Henyey method is a very practical method to solve differential equations with boundary conditions
at both ends of the solution interval. A first approximate starting solution is proposed and evaluated. By
means of an iterative process the starting solution is gradually improved until an appropriate solution is
obtained that meets a predetermined precision. At each iteration step corrections are being applied to all
variables and in all grid points so that the effect of these variations to the final solution, including the
boundary conditions, is taken into account.
For spherical stars in hydrostatic equilibrium we have to solve the system Eqs (8.1), where we replace
the second equation by ∂P/∂m = −Gm/(4πr⁴), with the corresponding boundary conditions as described in
the previous section. For standard stellar models without any transport processes (i.e., for ∂Xi/∂m = 0),
the set of equations allows us to solve two separate partial systems. We restrict the description to this
simplified case, where one can first solve the “spatial” system for the given Xi(m) and afterwards apply
the last system of equations of Eqs (8.1) for a small time step Δt. After this one can again solve the first
partial system for the new Xi(m), etc. We will now describe in detail how to solve the spatial system for
such standard models. For more sophisticated models with the chemical mixing as discussed in Chapter 7,
we refer to the use of the MESA code as illustrated during the MESA Lab work.
We will limit ourselves to solving models in full equilibrium: $\dot{r} = \dot{P} = \dot{T} = 0$. The Xi(m) are taken
as known input parameters at each grid point, so that the input physics given in (8.2) reduces, in the given
system, to functions of P and T only. We then only have to solve for four unknown functions r, P, T, l in
the interval [0, M] for a given M. We write these four equations as
$$\frac{dy_i}{dm} = f_i(y_1, \ldots, y_4), \quad i = 1, \ldots, 4, \tag{8.14}$$
where we have introduced the following abbreviations y1 = r, y2 = P, y3 = T, y4 = l.
The following step is the discretisation of the equations (8.14), by replacing them by difference equations
for a finite mass interval $[m^j, m^{j+1}]$. We will indicate the values of the variables at each end of the
mass interval $[m^j, m^{j+1}]$ with upper indices: $y_1^j, y_1^{j+1}, \ldots, y_4^j, y_4^{j+1}$. The functions fi on the right-hand
side of Eqs (8.14) have to be evaluated at an average argument, which is indicated with $y_i^{j+1/2}$. A logical
choice for these arguments is the arithmetic or geometric mean of $y_i^j$ and $y_i^{j+1}$. Let us now define the four
functions:
$$A_i^j \equiv \frac{y_i^j - y_i^{j+1}}{m^j - m^{j+1}} - f_i\left(y_1^{j+1/2}, \ldots, y_4^{j+1/2}\right), \quad i = 1, \ldots, 4, \tag{8.15}$$
then the following difference equations
$$A_i^j = 0, \quad i = 1, \ldots, 4 \tag{8.16}$$
replace the differential equations Eqs (8.14) that we want to solve.
The two boundary conditions for the outside of the atmosphere are fixed in a fitting point mf . We
choose this point as the one with upper index j = 1. These two boundary conditions give a relation between
the four variables $y_1^1, \ldots, y_4^1$ in the point $m^1 = m_f$. With the definitions
$$B_1 \equiv y_2^1 - \pi(y_1^1, y_4^1), \quad B_2 \equiv y_3^1 - \theta(y_1^1, y_4^1) \tag{8.17}$$
the boundary conditions (8.13) are given by
$$B_i = 0, \quad i = 1, 2. \tag{8.18}$$
We now consider the whole interval in m, from $m^K = 0$ to the fitting point $m^1 = m_f$. We divide this
area into K − 1 partial intervals by choosing K grid points, which do not have to be equidistant. In the inner
interval for m, between the central point $m^K = 0$ and $m^{K-1}$ we use the series expansions (8.4), (8.5), (8.7),
and (8.9) for the four variables. These four equations are of the form
$$C_i\left(y_1^{K-1}, \ldots, y_4^{K-1}, y_2^K, y_3^K\right) = 0, \quad i = 1, \ldots, 4, \tag{8.19}$$
in which the requirement $y_1^K = y_4^K = 0$ (r = l = 0 in the centre) is already incorporated.
In the K grid points we have 4K − 2 unknown variables, since $y_1^K = y_4^K = 0$. These unknowns have
to meet (8.18) for the first point, (8.16) for all intervals except the last (j = 1, . . . , K − 2) and (8.19) for the
last interval. In total we have 2 + 4(K − 2) + 4 = 4K − 2 equations, which we can write schematically as
$$\begin{aligned}
B_i &= 0, & i &= 1, 2, \\
A_i^j &= 0, & i &= 1, \ldots, 4, \quad j = 1, \ldots, K - 2, \\
C_i &= 0, & i &= 1, \ldots, 4.
\end{aligned} \tag{8.20}$$
We search a solution for a given M and Xi(m), which occur as input parameters in the equations. We also
need a first rough estimate of the values of the unknowns: $(y_i^j)_1$ for i = 1, . . . , 4; j = 1, . . . , K. Since the
$(y_i^j)_1$ are only approximations, they will not meet (8.20):
$$B_i(1) \neq 0, \quad A_i^j(1) \neq 0, \quad C_i(1) \neq 0. \tag{8.21}$$
We now derive corrections $\delta y_i^j$ for all variables in all grid points so that the second approximation
$(y_i^j)_2 = (y_i^j)_1 + \delta y_i^j$ of the arguments will make the functions $B_i$, $A_i^j$ and $C_i$ disappear. The corrections $\delta y_i^j$ of the
arguments deliver corrections $\delta B_i, \delta A_i^j, \delta C_i$ of the functions. We thus demand that
$$B_i(1) + \delta B_i = 0, \quad A_i^j(1) + \delta A_i^j = 0, \quad C_i(1) + \delta C_i = 0. \tag{8.22}$$
For corrections that are small enough, we can expand $\delta B_i, \delta A_i^j, \delta C_i$ in a series of increasing powers of $\delta y_i^j$
and only keep the linear terms of this series. For B1 this is for example
$$\delta B_1 \approx \frac{\partial B_1}{\partial y_1^1}\delta y_1^1 + \frac{\partial B_1}{\partial y_2^1}\delta y_2^1 + \frac{\partial B_1}{\partial y_3^1}\delta y_3^1 + \frac{\partial B_1}{\partial y_4^1}\delta y_4^1. \tag{8.23}$$
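Thanks to the linearisation procedure the conditions (8.22) become:
$$\begin{aligned}
\frac{\partial B_i}{\partial y_1^1}\delta y_1^1 + \ldots + \frac{\partial B_i}{\partial y_4^1}\delta y_4^1 &= -B_i, \\
\frac{\partial A_i^j}{\partial y_1^j}\delta y_1^j + \ldots + \frac{\partial A_i^j}{\partial y_4^j}\delta y_4^j + \frac{\partial A_i^j}{\partial y_1^{j+1}}\delta y_1^{j+1} + \ldots + \frac{\partial A_i^j}{\partial y_4^{j+1}}\delta y_4^{j+1} &= -A_i^j, \\
\frac{\partial C_i}{\partial y_1^{K-1}}\delta y_1^{K-1} + \ldots + \frac{\partial C_i}{\partial y_4^{K-1}}\delta y_4^{K-1} + \frac{\partial C_i}{\partial y_2^K}\delta y_2^K + \frac{\partial C_i}{\partial y_3^K}\delta y_3^K &= -C_i,
\end{aligned} \tag{8.24}$$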
where the indices i and j can take the same values as in (8.20). We again have 4K − 2 (linear inhomogeneous)
equations for as many unknown corrections $\delta y_i^j$ (since $\delta y_1^K = \delta y_4^K = 0$ following the boundary conditions).
When calculating (8.22) all functions $B_i, A_i^j, C_i$ and all their derivatives have to be evaluated with the
first approximations $(y_i^j)_1$ as arguments. The scheme (8.24) that should be solved can be written much
more compactly in matrix form:
$$H \begin{pmatrix} \delta y_1^1 \\ \vdots \\ \delta y_3^K \end{pmatrix} = -\begin{pmatrix} B_1 \\ \vdots \\ C_4 \end{pmatrix}. \tag{8.25}$$
Here H is the Henyey matrix, whose elements are the derivatives in the left-hand side of (8.24).
When H has a determinant different from zero, we can solve the system of linear equations and compute
the corrections $\delta y_i^j$. In turn, these lead to a better, second approximation of the unknowns $(y_i^j)_2$.
When we use these as arguments in the equations (8.20), we will generally still find
$$B_i(2) \neq 0, \quad A_i^j(2) \neq 0, \quad C_i(2) \neq 0, \tag{8.26}$$
because we only worked in the linear approximation and, moreover, numerical inaccuracies are always
involved. Therefore we take a second iteration step where we determine new corrections following the same
method which leads to a third approximation: $(y_i^j)_3 = (y_i^j)_2 + \delta y_i^j$. We keep on going with this iteration
process until the approximate solution is close enough to the solution we are searching for, following a
predetermined stop criterion. In this way one determines the entire stellar structure of a star in equilibrium,
given the mass and the chemical composition in the different layers at birth.
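The iteration just described can be tried out on a much smaller problem than a stellar model. The sketch below applies the same ingredients (difference equations evaluated at interval midpoints, one boundary function at each end, linearised corrections from a matrix equation) to the toy two-point boundary value problem y'' = 2y³ with y(0) = 1 and y(1) = 1/2, written as dy1/dx = y2, dy2/dx = 2y1³, whose exact solution is y = 1/(1 + x). The toy problem, the function names `residuals` and `newton_relaxation`, and the use of a dense matrix with finite-difference derivatives are all assumptions made for brevity; an actual Henyey code uses analytic derivatives and exploits the block structure of H.

```python
import numpy as np

def residuals(y, x):
    """Stack the boundary function B, the interval functions A_i^j and the boundary function C."""
    Y = y.reshape(-1, 2)                         # row j holds (y1, y2) at grid point j
    y_mid = 0.5 * (Y[:-1] + Y[1:])               # arithmetic mean, the analogue of y^{j+1/2}
    f = np.column_stack([y_mid[:, 1], 2.0 * y_mid[:, 0] ** 3])
    A = (Y[:-1] - Y[1:]) / (x[:-1] - x[1:])[:, None] - f
    B = Y[0, 0] - 1.0                            # boundary condition at the first grid point
    C = Y[-1, 0] - 0.5                           # boundary condition at the last grid point
    return np.concatenate(([B], A.ravel(), [C]))

def newton_relaxation(y, x, tol=1e-10, max_iter=20, eps=1e-7):
    """Iterate y -> y + delta_y with H delta_y = -F until all residuals are below tol."""
    y = np.array(y, dtype=float)
    for n_iter in range(max_iter):
        F = residuals(y, x)
        if np.max(np.abs(F)) < tol:
            return y, n_iter
        H = np.empty((y.size, y.size))
        for k in range(y.size):                  # finite-difference column k of the matrix H
            y_pert = y.copy()
            y_pert[k] += eps
            H[:, k] = (residuals(y_pert, x) - F) / eps
        y = y + np.linalg.solve(H, -F)           # apply the linearised corrections
    raise RuntimeError("relaxation did not converge")

x = np.linspace(0.0, 1.0, 101)
# First rough estimate: a straight line through the two boundary values
guess = np.column_stack([1.0 - 0.5 * x, np.full_like(x, -0.5)]).ravel()
solution, n_iter = newton_relaxation(guess, x)
y1 = solution.reshape(-1, 2)[:, 0]
print("iterations:", n_iter)
print("max |y1 - exact|:", np.max(np.abs(y1 - 1.0 / (1.0 + x))))
```

As in the stellar case, the number of equations (one boundary function, two per interval, one boundary function) equals the number of unknowns, and the remaining difference from the exact solution after convergence is set by the grid spacing rather than by the iteration.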
In Figures 8.1 – 8.5 we show the functions m(r), P(r), ρ(r), T(r) and l(r) (logarithmic
scale), derived with the Henyey method for a star with an initial mass of 1M⊙ (left panels) and of
15M⊙ (right panels) just after birth. The initial chemical composition in the whole star was X = 0.74, Y =
0.24, Z = 0.02 using the solar chemical mixture. Further we assumed an ideal gas with radiation taking
ionisation effects into account as EOS. For the energy production, the nuclear networks described by the pp
chains and the CNO cycle were used. Convective energy transport has been taken into account by using the
mixing length theory as described in Chapter 5. All other sources of chemical mixing were ignored.
From Figure 8.1 we deduce that the mass is strongly concentrated near the stellar centre: approximately
80% of the mass of the sun-like star is situated within a sphere with r = 0.4R⊙, so within a fraction 0.064
of the total volume. For a star with 15 M⊙, 80% of the mass is situated in the inner half of the radius,
which corresponds to a fraction of 0.125 of the total volume. We conclude that the mass is more
concentrated towards the stellar interior the less massive the star is. The outer layers contribute
negligibly to the total mass of the star.
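The volume fractions quoted here follow from the fact that the volume of a sphere scales as the cube of its radius. A short Python check, assuming spherical symmetry (the function name is chosen for illustration):

```python
def volume_fraction(r_over_R):
    """Fraction of the stellar volume enclosed within relative radius r/R."""
    return r_over_R ** 3

for x in (0.2, 0.4, 0.5):
    print(f"r/R = {x:.1f} -> V/V_total = {volume_fraction(x):.3f}")
```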
The luminosity is even more concentrated than the mass (see Figure 8.5): 90% of the luminosity is
created within r = 0.2R, so within a fraction 0.008 of the volume of the star. It is in that central core
that the nuclear fusion occurs. In all surrounding layers the energy is only transported outwards; there
l(r) = L is constant. For the sun-like star, the density strongly peaks in the centre and at r = 0.5R⊙ it
has already decreased by a factor of 100. The profile of the pressure follows that of the density. For a
massive star the decrease in density and pressure happens more gradually than for the sun-like star. The
temperature changes gradually and has "only" decreased by a factor of 3 at r = 0.5R⊙. The temperature
decreases rapidly near the surface of the star, because the radiation can escape easily from there. The
models of the current Sun (age approximately $5 \times 10^9$ years) are characterised by a chemical
composition X ≈ 0.35, Y ≈ 0.636, Z = 0.014 in the stellar core, while the initial chemical composition
X = 0.718, Y = 0.268, Z = 0.014 is still valid for the areas with r > 0.2R⊙. The mass, density, pressure,
temperature, and luminosity have not changed much since its birth.

Figure 8.1: The mass distribution m(r) as a function of the position inside the star for a star of 1 M⊙
(on the left) and of 15 M⊙ (on the right).

Figure 8.2: The pressure P(r) as a function of the position in the star for a star of 1 M⊙ (on the left)
and of 15 M⊙ (on the right).

Figure 8.3: The density ρ(r) as a function of the position in the star for a star of 1 M⊙ (on the left)
and of 15 M⊙ (on the right).

Figure 8.4: The temperature T(r) as a function of the position in the star for a star of 1 M⊙ (on the
left) and of 15 M⊙ (on the right).

Figure 8.5: The luminosity l(r) as a function of the position in the star for a star of 1 M⊙ (on the left)
and of 15 M⊙ (on the right).
8.5 The MESA stellar structure and evolution code
We now turn to more realistic and more complex stellar models. As part of this course, the students learn
how to use and interpret the outcome of a state-of-the-art stellar evolution code, which relies on far
more sophisticated numerical methods than the Henyey method described in the previous section and allows
for the inclusion of many more physical effects, such as transport processes and the chemical mixing they
induce, as described in Chapter 7. This is the MESA code, whose first version was released to the
public in 2011. The code is upgraded continuously by its team of developers, led by Bill Paxton at the
University of California at Santa Barbara, following suggestions from an active community of hundreds of
users worldwide, among them members of the Institute of Astronomy of KU Leuven.
Students of this SSE course are invited to read the history of MESA, the motivation of the developers,
and the terms of reference at the MESA website:
http://mesa.sourceforge.net/
The MESA code is described and discussed in five extensive peer-reviewed instrument papers, which
we use as study material for the Lab work of this master course:
1. Paxton B., Bildsten L., Dotter A., Herwig F., Lesaffre P., Timmes F., 2011, “Modules for Experiments
in Stellar Astrophysics (MESA)”, The Astrophysical Journal Supplement Series, 192, article id. 3,
35 pp.
2. Paxton B., Cantiello M., Arras P., Bildsten L., Brown E. F., Dotter A., Mankovich C., Montgomery
M. H., Stello D., Timmes F. X., Townsend R., 2013, “Modules for Experiments in Stellar Astrophysics
(MESA): Planets, Oscillations, Rotation, and Massive Stars”, The Astrophysical Journal Supplement
Series, 208, article id. 4, 42 pp.
3. Paxton B., Marchant P., Schwab J., Bauer E. B., Bildsten L., Cantiello M., Dessart L., Farmer R., Hu
H., Langer N., Townsend R. H. D., Townsley D. M., Timmes F. X., 2015, “Modules for Experiments
in Stellar Astrophysics (MESA): Binaries, Pulsations, and Explosions”, The Astrophysical Journal
Supplement Series, 220, article id. 15, 44 pp.
4. Paxton B., Schwab J., Bauer E. B., Bildsten L., Blinnikov S., Duffell P., Farmer R., Goldberg J. A.,
Marchant P., Sorokina E., Thoul A., Townsend R. H. D., Timmes F. X., 2018, “Modules for Experiments
in Stellar Astrophysics (MESA): Convective Boundaries, Element Diffusion, and Massive Star
Explosions”, The Astrophysical Journal Supplement Series, 234, article id. 34, 50 pp.
5. Paxton B., Smolec R., Schwab J., Gautschy A., Bildsten L., Cantiello M., Dotter A., Farmer R.,
Goldberg J. A., Jermyn A. S., Kanbur S. M., Marchant P., Thoul A., Townsend R. H. D., Wolf W. M.,
Zhang M., Timmes F. X., 2019, “Modules for Experiments in Stellar Astrophysics (MESA): Pulsating
Variable Stars, Rotation, Convective Boundaries, and Energy Conservation”, The Astrophysical
Journal Supplement Series, 243, article id. 10, 44 pp.
Students are encouraged to download these papers and read them carefully as preparation for their
practical MESA lab tasks. The MESA software and the basic framework of the code will be explained to the
students as part of the project work included in this course. The aim is that the students learn how to
compute numerical stellar models with a modern code that is currently in use by numerous active researchers
in the field of stellar astrophysics. In this way, any student who passes this course will be equipped with
a modern tool to construct important building blocks for many other topics in modern astrophysics: models
of stellar interiors in all stages of stellar evolution.
Chapter 9
Star formation
9.1 The interstellar medium
The existence of interstellar matter was proven in the early 1900s based on observations of the binary star
δ Orionis. Because of the motion of the two components in the binary system, the spectral lines of the stars
shift back and forth in wavelength (Doppler shift) according to their orbital motion. Measurements of the
spectral lines of δ Orionis revealed a spectral absorption line of calcium that did not follow the shifts of
the other spectral lines. In 1904, Hartmann rightly concluded that this absorption line must be caused by
matter in between δ Orionis and the Earth.
Another indication of the presence of matter between the stars is provided by the dark regions in the
Milky Way. Initially scientists thought these regions, where far fewer stars are seen, were intrinsically
depleted in stars. In reality, these dark regions are concentrations of dust that block the stellar light at
these locations. In the direction of such dark clouds, only the stars in front of the cloud are visible,
which explains why we see fewer stars in that direction. Interstellar dust has a typical temperature
between 10 and 100 K.
In addition to interstellar dust, interstellar gas is present everywhere in space. It is observed in the
form of very narrow absorption lines in the spectra of stars. The gas has the same composition as young,
newborn stars. The average density of the gas is extremely low, approximately 1 atom per cm$^3$. The
temperature is around 1000 K. It therefore mostly consists of neutral atoms, primarily hydrogen. In the
surroundings of hot stars, the gas is ionised due to the UV radiation, and can heat up to 10,000 K, because
the neutral H atoms can absorb the UV photons. The photons with an energy E > 13.6 eV ionise H, and put an
energy E − 13.6 eV into the kinetic energy of the electron that is released. Via collisions with other
electrons and protons, this kinetic energy is distributed over the gas as internal energy.
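The energy balance of a single photoionisation can be illustrated numerically. The Python sketch below converts a photon wavelength into a photon energy and subtracts the 13.6 eV ionisation energy of hydrogen. The example wavelengths are assumptions chosen for illustration, and the function name is hypothetical.

```python
H_PLANCK = 6.62607015e-34   # Planck constant in J s
C_LIGHT = 2.99792458e8      # speed of light in m/s
EV = 1.602176634e-19        # 1 eV in J
E_ION_H = 13.6              # ionisation energy of hydrogen in eV

def photoelectron_energy_ev(wavelength_nm):
    """Kinetic energy (eV) given to the electron, or None if E < 13.6 eV."""
    e_photon = H_PLANCK * C_LIGHT / (wavelength_nm * 1e-9) / EV
    if e_photon < E_ION_H:
        return None          # photon cannot ionise hydrogen from the ground state
    return e_photon - E_ION_H

print(photoelectron_energy_ev(80.0))    # assumed UV photon, ionising
print(photoelectron_energy_ev(121.6))   # Lyman-alpha photon, not ionising
```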
The interstellar matter is not homogeneously distributed in space, but localised in the disk of the Milky
Way, more specifically in the spiral arms. Moreover, local concentrations occur: interstellar clouds. These
have a diameter of a few tens of parsec, a temperature between 10 and 200 K, and a density of roughly 10 to
1000 atoms per cm$^3$. The interstellar clouds are cooler than the low-density gas, because the radiation
that heats the matter cannot penetrate deeply. In dense interstellar clouds, molecules can also exist,
primarily H$_2$ and, in much lower quantities, complex molecules such as CH$_3$OH and H$_2$CO. Such clouds
are called molecular clouds.
From detailed studies of the absorption properties of the interstellar dust, we know it consists of
minuscule particles of carbon and silicate with a diameter of the order of 1 µm, encapsulated in a thin
layer of ice (H$_2$O). The mass of an interstellar cloud lies between 100 and $10^5$ M⊙, but the dust
particles constitute only a minor fraction (∼ 1%) of this total mass. The remaining 99% consists of gas in
the form of neutral H or H$_2$ molecules, and He atoms.
Stars are formed from the matter in molecular clouds. This happens when such a cloud becomes
gravitationally unstable and collapses. The clouds are opaque to visual radiation, which implies that the
formation process of young stars is not well known. Recently, better infra-red and mm instruments have
been used, which has increased our understanding of star formation. Here, we first derive a criterion for
cloud collapse. Next, we will discuss several stages between the collapse and the birth of a new star. We
necessarily have to keep the topic of star formation short in this SSE course and refer to the dedicated
master course on this topic for details.
9.2 The Jeans criterion
Consider an infinitely extended, homogeneous cloud at rest. The density, temperature, and gravitational
potential are then constant. Strictly speaking, this is not a consistent equilibrium state, because the
Poisson equation, $\nabla^2 \Phi = 4\pi G \rho$, implies $\rho = 0$ for a constant potential. Nonetheless,
we consider this state with a non-zero density. Even for a more realistic equilibrium configuration, the
result derived below does not change much.
A perturbation is applied to the medium in equilibrium. This perturbation can be caused by e.g. a
supernova explosion in the vicinity, or by the passage of a density wave caused by the spiral arms in the
Milky Way. The gas must satisfy the equation of motion

$$ \frac{d\vec{v}}{dt} = \frac{\partial \vec{v}}{\partial t} + (\vec{v} \cdot \vec{\nabla}) \vec{v} = -\frac{1}{\rho} \vec{\nabla} P - \vec{\nabla} \Phi \qquad (9.1) $$

and the continuity equation

$$ \frac{\partial \rho}{\partial t} + \vec{v} \cdot \vec{\nabla} \rho + \rho \vec{\nabla} \cdot \vec{v} = 0. \qquad (9.2) $$

Moreover, the Poisson equation has to be met, and we assume that the equation of state for an isothermal
ideal gas is valid:

$$ P = a^2 \rho, \qquad (9.3) $$

with $a$ the isothermal speed of sound, see expressions (2.52) and (2.53). In equilibrium, we have
$\rho = \rho_0 =$ constant, $T = T_0 =$ constant and $\vec{v}_0 = \vec{0}$. $\Phi_0$ is determined from the
condition $\nabla^2 \Phi_0 = 4\pi G \rho_0$.
We now perturb the equilibrium and determine the effect of the perturbation on the physical quantities.
We only consider a small perturbation, and hence neglect non-linear effects. The quantities are
written as

$$ \rho = \rho_0 + \rho_1, \quad P = P_0 + P_1, \quad \Phi = \Phi_0 + \Phi_1, \quad \vec{v} = \vec{v}_1, \qquad (9.4) $$

where the functions with lower index 1 now have a spatial and a temporal dependence. If we substitute (9.4)
into the equations that have to be met, the following system of differential equations is obtained:

$$ \frac{\partial \vec{v}_1}{\partial t} = -\vec{\nabla} \left( \Phi_1 + a^2 \frac{\rho_1}{\rho_0} \right), \qquad
\frac{\partial \rho_1}{\partial t} + \rho_0 \vec{\nabla} \cdot \vec{v}_1 = 0, \qquad
\nabla^2 \Phi_1 = 4\pi G \rho_1. \qquad (9.5) $$
We have assumed that the cloud remains isothermal during the perturbation. This approximation is valid
as long as the cloud is able to radiate the released gravitational energy efficiently. The system of
Eqs (9.5) consists of linear, homogeneous partial differential equations with constant coefficients. We can
hence find solutions proportional to $\exp[i(kx + \omega t)]$, so that

$$ \frac{\partial}{\partial x} = i k, \quad \frac{\partial}{\partial y} = \frac{\partial}{\partial z} = 0, \quad \frac{\partial}{\partial t} = i \omega, \qquad (9.6) $$

where $k$ is the wavenumber and $\omega$ the frequency of the solution. With $v_{1x} = v_1$,
$v_{1y} = v_{1z} = 0$, we find, based on (9.5), the following equations:

$$ \omega v_1 + \frac{k a^2}{\rho_0} \rho_1 + k \Phi_1 = 0, \qquad
k \rho_0 v_1 + \omega \rho_1 = 0, \qquad
4\pi G \rho_1 + k^2 \Phi_1 = 0. \qquad (9.7) $$
This homogeneous linear system of three equations for the three unknowns $v_1$, $\rho_1$, $\Phi_1$ only has
solutions different from zero if the determinant

$$ \begin{vmatrix} \omega & \dfrac{k a^2}{\rho_0} & k \\ k \rho_0 & \omega & 0 \\ 0 & 4\pi G & k^2 \end{vmatrix} $$

is equal to zero. For $k \neq 0$, this implies the condition

$$ \omega^2 = k^2 a^2 - 4\pi G \rho_0. \qquad (9.8) $$
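As a check on Eq. (9.8), the determinant can be expanded along its first row. The worked step below assumes the ordering $(v_1, \rho_1, \Phi_1)$ of the unknowns, as in (9.7):

```latex
\begin{aligned}
0 &= \omega \left( \omega k^2 - 0 \right)
   - \frac{k a^2}{\rho_0} \left( k \rho_0 \, k^2 - 0 \right)
   + k \left( k \rho_0 \, 4\pi G - 0 \right) \\
  &= k^2 \omega^2 - k^4 a^2 + 4\pi G \rho_0 k^2
   = k^2 \left( \omega^2 - k^2 a^2 + 4\pi G \rho_0 \right).
\end{aligned}
```

Dividing by $k^2 \neq 0$ gives $\omega^2 = k^2 a^2 - 4\pi G \rho_0$.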
For sufficiently large wavenumbers k, the right-hand side of the equation is positive, so that ω is real
and the perturbation varies periodically in time. Because the amplitude does not increase, this solution
represents a stable equilibrium situation. In the limit of infinitely large k, the second term on the
right-hand side of Eq. (9.8) can be neglected, so that $\omega^2 = k^2 a^2$, which is the dispersion
relation for isothermal sound waves. We thus find that, for very short waves (high k), the influence of the
gravitational force can be neglected. Any compression will, in this case, be restored by an increased
pressure, and the perturbations travel through the medium at the speed of sound.
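When $k^2 < 4\pi G \rho_0 / a^2$, ω is a purely imaginary number of the form $\pm i\xi$, with ξ a real
number. Perturbations $\sim \exp(\pm \xi t)$ occur, which grow or decay exponentially in time, so that the
equilibrium is disturbed. We now define a characteristic wavenumber $k_{\rm J}$ and a characteristic
wavelength $\lambda_{\rm J}$:

$$ k_{\rm J}^2 \equiv \frac{4\pi G \rho_0}{a^2}, \qquad \lambda_{\rm J} \equiv \frac{2\pi}{k_{\rm J}}. \qquad (9.9) $$

The perturbations with wavenumber $k < k_{\rm J}$ (or wavelength $\lambda > \lambda_{\rm J}$) cause an
instability. Instability hence occurs when

$$ \lambda > \lambda_{\rm J} \quad \text{with} \quad \lambda_{\rm J} = \left( \frac{\pi}{G \rho_0} \right)^{1/2} a. \qquad (9.10) $$

Condition (9.10) is called the Jeans criterion, after J. Jeans who derived it in 1902.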
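To get a feeling for the numbers, the dispersion relation (9.8) and the Jeans length (9.10) can be evaluated in cgs units. The Python sketch below uses the isothermal sound speed from $a^2 = \mathcal{R}T/\mu$ and, as an assumed example, a cloud of neutral hydrogen with a density of $10^{-24}$ g cm$^{-3}$ and a temperature of 100 K. The function names are chosen for illustration.

```python
import math

G = 6.674e-8       # gravitational constant in cm^3 g^-1 s^-2
R_GAS = 8.314e7    # gas constant in erg K^-1 mol^-1
PARSEC = 3.086e18  # 1 pc in cm

def sound_speed(T, mu):
    """Isothermal sound speed a = sqrt(R T / mu) in cm/s."""
    return math.sqrt(R_GAS * T / mu)

def jeans_length(rho0, T, mu):
    """Jeans wavelength of Eq. (9.10) in cm."""
    return math.sqrt(math.pi / (G * rho0)) * sound_speed(T, mu)

def omega_squared(k, rho0, T, mu):
    """Right-hand side of the dispersion relation (9.8)."""
    return k**2 * sound_speed(T, mu)**2 - 4.0 * math.pi * G * rho0

rho0, T, mu = 1e-24, 100.0, 1.0          # assumed example values
lam_J = jeans_length(rho0, T, mu)
print(f"lambda_J = {lam_J / PARSEC:.0f} pc")
for factor in (0.5, 2.0):                # shorter and longer than lambda_J
    k = 2.0 * math.pi / (factor * lam_J)
    state = "stable" if omega_squared(k, rho0, T, mu) > 0 else "unstable"
    print(f"lambda = {factor} lambda_J: {state}")
```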
Physically, the following happens: after a small compression of a set of plane-parallel layers, gravity
overcomes the pressure force, and the layers are compressed to narrow zones. The rate at which this
compression occurs can be estimated by only considering the influence of gravity in Eq. (9.8). We have
$i\omega \approx (G\rho_0)^{1/2}$ and the corresponding time scale $\tau \approx (G\rho_0)^{-1/2}$. The
latter is equivalent to the previously defined free-fall time scale in Eq. (3.24). The time scale for
thermal adaptation, on the other hand, is much shorter if efficient “cooling agents” are present in the
cloud. Water and carbon monoxide molecules, in which many different kinds of rotational and vibrational
transitions are possible, are indeed able to remove heat and hence quickly cool the cloud during
contraction. The thermal time scale in an interstellar cloud is of the order of hundreds of years. To a
good approximation, the collapse occurs isothermally as long as these molecules are present.
It can be shown that the Jeans criterion is still valid when considering realistic configurations, e.g., a
spherically symmetric gas cloud. Depending on the assumed geometry, the factors in Eq. (9.9) for
$\lambda_{\rm J}$ will differ a bit. For a given equilibrium state, there is a critical mass, termed the
Jeans mass. Gas clouds with a mass exceeding the Jeans mass are gravitationally unstable and will collapse
due to a perturbation. We can estimate the Jeans mass as follows:

$$ \begin{aligned}
M_{\rm J} &= \frac{4\pi}{3} \rho_0 \lambda_{\rm J}^3
 = \frac{4\pi}{3} \rho_0 \left( \frac{\pi}{G \rho_0} \right)^{3/2} \left( \frac{\mathcal{R} T}{\mu} \right)^{3/2}
 = \frac{4\pi}{3} \left( \frac{\pi \mathcal{R}}{G \mu} \right)^{3/2} T^{3/2} \rho_0^{-1/2} \\
&\approx (1\ \text{to}\ 5) \times 10^5\, M_\odot \left( \frac{T}{100\,{\rm K}} \right)^{3/2} \left( \frac{\rho_0}{10^{-24}\,{\rm g\,cm^{-3}}} \right)^{-1/2} \mu^{-3/2},
\end{aligned} \qquad (9.11) $$

where we have used that $a^2 = \mathcal{R}T/\mu$, with $\mathcal{R}$ the gas constant. Typical values for
interstellar clouds consisting of neutral hydrogen are: $\rho_0 = 10^{-24}$ g cm$^{-3}$, T = 100 K, and
µ = 1. With these values, the Jeans mass is $M_{\rm J} \approx (1\ \text{to}\ 5) \times 10^5$ M⊙.
This means that the cloud can only collapse according to the Jeans criterion for masses significantly
larger than stellar masses.
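The estimate (9.11) can be reproduced numerically. The Python sketch below evaluates the mass of a sphere of radius $\lambda_{\rm J}$, which is the convention of (9.11), in cgs units for the typical values quoted above. It gives a value near the upper end of the quoted range; other geometric conventions lower the prefactor. The function name is hypothetical.

```python
import math

G = 6.674e-8       # gravitational constant in cm^3 g^-1 s^-2
R_GAS = 8.314e7    # gas constant in erg K^-1 mol^-1
M_SUN = 1.989e33   # solar mass in g

def jeans_mass(rho0, T, mu=1.0):
    """Jeans mass of Eq. (9.11) in solar masses (sphere of radius lambda_J)."""
    mass_g = (4.0 * math.pi / 3.0
              * (math.pi * R_GAS / (G * mu)) ** 1.5
              * T ** 1.5 * rho0 ** -0.5)
    return mass_g / M_SUN

m_typical = jeans_mass(1e-24, 100.0, 1.0)
print(f"M_J = {m_typical:.2e} M_sun")
# Isothermal contraction: a 100 times higher density lowers M_J tenfold.
print(f"ratio = {jeans_mass(1e-22, 100.0) / m_typical:.2f}")
```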
9.3 Fragmentation
How are stars formed from a gas cloud that gravitationally collapses? It is assumed that a collapsing cloud
with a mass exceeding the Jeans mass will fragment. During the contraction, fragments are created which
themselves become unstable and contract at a faster rate than the initial cloud. If this process indeed occurs,
it implies that smaller sub-masses condense from the cloud.
As noted previously, the contraction occurs isothermally. Hence, the Jeans mass decreases according
to $\rho^{-1/2}$ during the contraction; in other words, the Jeans mass becomes smaller than the original
mass of the gas cloud when it started contracting. When the Jeans mass has decreased below half the initial
mass, the cloud splits up into two segments, which both collapse individually. Such fragmentation will
continue as long as the collapse occurs isothermally. We derived the Jeans criterion under the assumption
of a medium in equilibrium, and hence the theory is not strictly valid for a cloud that is already
contracting, but as long as the time scale arguments remain valid, the results hold.
What are the end products of the fragmentation process? Finding a detailed answer to this question
based on the equations of hydrodynamics and thermodynamics is beyond the scope of this course. As said,
we limit ourselves to a reasoning based on time scales: when does the time scale for thermal adaptation
become comparable to the free-fall time scale? At that moment, the contraction will not occur isothermally
anymore, but adiabatically. For a mono-atomic ideal gas, we have $\nabla_{\rm ad} = 2/5$, so that
$T \sim P^{2/5}$, and because $P \sim \rho T$ the temperature changes as $T \sim \rho^{2/3}$. The Jeans mass
then is proportional to $T^{3/2} \rho^{-1/2} \sim \rho^{1/2}$. Hence, we find that the Jeans mass will
increase during an adiabatic collapse. As a result, the existing fragments will stop fragmenting further.
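The intermediate steps of this scaling argument can be written out explicitly, using only the relations quoted above for a mono-atomic ideal gas:

```latex
\begin{aligned}
T \propto P^{2/5}, \quad P \propto \rho T
  &\;\Longrightarrow\; T^{5/2} \propto \rho T
   \;\Longrightarrow\; T^{3/2} \propto \rho
   \;\Longrightarrow\; T \propto \rho^{2/3}, \\
M_{\rm J} \propto T^{3/2} \rho^{-1/2}
  &\;\Longrightarrow\; M_{\rm J} \propto \rho \cdot \rho^{-1/2} = \rho^{1/2}.
\end{aligned}
```

For comparison, in the isothermal case $T$ is constant, so that $M_{\rm J} \propto \rho^{-1/2}$ decreases during the contraction.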
The characteristic time scale for free-fall of a fragment is $(G\rho)^{-1/2}$. The total energy that needs
to be radiated to keep a constant temperature is of the order of the gravitational potential energy
$E_g \approx GM^2/R$, in which $M$ and $R$ are the mass and “radius” of the fragment. An energy $A$ of the
order

$$ A \equiv \frac{G M^2}{R} (G\rho)^{1/2} = \left( \frac{3}{4\pi} \right)^{1/2} \frac{G^{3/2} M^{5/2}}{R^{5/2}} \qquad (9.12) $$

per unit time needs to be radiated to keep the fragmentation isothermal. Let us now assume thermal
equilibrium, which is a good approximation at the end of the fragmentation process because the matter
becomes opaque. Then, however, the fragment cannot radiate more energy than a black body with the same
temperature. The fragment radiates an energy per unit time given by $B = 4\pi f \sigma T^4 R^2$, with σ
Stefan-Boltzmann’s constant (see Appendix A) and $f$ a dimensionless parameter with a value between 0 and 1
that takes into account the fact that less energy is radiated than in the case of a black body. The
condition for isothermal collapse is $A \ll B$, and the transition to adiabatic contraction will occur at
$A \approx B$. The latter condition is met when

$$ M^5 = \frac{64 \pi^3}{3} \frac{\sigma^2 f^2 T^8 R^9}{G^3}. \qquad (9.13) $$
Fragmentation stops when the Jeans mass is equal to the mass given in Eq. (9.13). We replace $M$ in
Eq. (9.13) by $M_{\rm J}$, $R$ by $(3M_{\rm J}/4\pi\rho)^{1/3}$, and eliminate ρ using Eq. (9.11). Hence, we
obtain the Jeans mass at the end of the fragmentation process:

$$ M_{\rm J,end} = \left( \frac{4^6 \pi^{15}}{3^8} \right)^{1/4} \frac{1}{(\sigma G^3)^{1/2}} \left( \frac{\mathcal{R}}{\mu} \right)^{9/4} f^{-1/2} T^{1/4} = 0.17\, M_\odot \frac{T^{1/4}}{f^{1/2}}, \qquad (9.14) $$

with $\mathcal{R}$ the gas constant and $T$ in K, and where we have set µ = 1. We now take a typical
temperature of 1000 K as the temperature of the smallest fragments. Subsequently we assume that deviations
from the isothermal state occur when $f = 0.1$, i.e. when the fragment loses 10% of its maximum possible
energy loss. The Jeans mass at the end of the fragmentation process hence is ∼ 3 M⊙. This result does not
change dramatically when varying the temperature and the value of $f$ within reasonable limits. We conclude
that fragmentation stops when the fragments have reached a mass of the order of a few tenths to several
tens of solar masses, not the order of a planetary mass, nor the mass of a star cluster.
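The numerical coefficient in Eq. (9.14) and the resulting mass of about 3 M⊙ can be checked with a few lines of Python. The sketch below evaluates the analytic expression in cgs units; the function name is hypothetical and the constants are standard values rounded to four or five digits.

```python
import math

G = 6.674e-8        # gravitational constant in cm^3 g^-1 s^-2
SIGMA = 5.6704e-5   # Stefan-Boltzmann constant in erg cm^-2 s^-1 K^-4
R_GAS = 8.314e7     # gas constant in erg K^-1 mol^-1
M_SUN = 1.989e33    # solar mass in g

def jeans_mass_end(T, f, mu=1.0):
    """Jeans mass at the end of fragmentation, Eq. (9.14), in solar masses."""
    prefactor = (4.0**6 * math.pi**15 / 3.0**8) ** 0.25
    mass_g = (prefactor / math.sqrt(SIGMA * G**3)
              * (R_GAS / mu) ** 2.25 * f ** -0.5 * T ** 0.25)
    return mass_g / M_SUN

coefficient = jeans_mass_end(1.0, 1.0)   # value for T = 1 K and f = 1
m_end = jeans_mass_end(1000.0, 0.1)      # values adopted in the text
print(f"coefficient = {coefficient:.2f} M_sun, M_J,end = {m_end:.1f} M_sun")
```

Because of the weak dependence on $T^{1/4}$ and $f^{-1/2}$, changing the temperature or $f$ by a factor of ten changes the result by less than a factor of about three.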
9.4 The formation of a protostar
The Jeans criterion we have derived is based on a first-order perturbation method, and gives the condition
under which a perturbation of the equilibrium state will grow exponentially. This theory, however, does not
provide insight into the end product of the collapse. We now describe the different stages between the
collapse and the birth of the star.
When the fragmentation process stops, the different fragments continue to contract. The gravitational
force is still dominating, and the pressure gradient can be neglected at first. We can approximate the
collapse as a free fall of a homogeneous sphere. The time scale of this free fall is comparable to the
time scale one finds when the pressure force suddenly disappears from the equation of motion, and is about
$10^5$ to $10^7$ years. This time scale is no longer accurate near the centre of the fragment, because the
pressure force becomes important there, which stops the collapse.
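The quoted range of free-fall time scales can be related to the density through the order-of-magnitude estimate $\tau \approx (G\rho)^{-1/2}$. The Python sketch below evaluates this estimate; the two densities are assumptions chosen for illustration and are not taken from the text.

```python
import math

G = 6.674e-8      # gravitational constant in cm^3 g^-1 s^-2
YEAR = 3.156e7    # 1 year in s

def free_fall_time_years(rho):
    """Order-of-magnitude free-fall time (G rho)^(-1/2) in years."""
    return 1.0 / math.sqrt(G * rho) / YEAR

for rho in (1e-22, 1e-18):   # assumed mean densities in g cm^-3
    print(f"rho = {rho:.0e} g/cm^3 -> tau = {free_fall_time_years(rho):.1e} yr")
```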
We now follow the process of collapse for a homogeneous cloud with a mass of 1 M⊙ after the fragmentation
process has ended. To a good approximation, the instability keeps the outer layers of the sphere at a
quasi-constant radius while the inner matter undergoes a free fall. Hence the density increases very rapidly
in the inner parts, while the density in the outer parts of the fragment barely varies. Once a small central
concentration appears, it will inevitably continue to grow and an irreversible process has started. The
free-fall time scale for a sphere with radius $r$ is of the order of $[G\rho(r)]^{-1/2}$, in which ρ
indicates the average density in the sphere with radius $r$. When ρ increases towards the centre, the
free-fall time scale decreases in that direction. Hence, the inner spheres will collapse faster than the
outer spheres, and the density difference will become even more pronounced. Eventually, the fragment will
evolve from a density distribution ρ = constant to $\rho \sim r^{-2}$.
The collapse of the central part occurs in free-fall as long as the matter can lose the released
gravitational energy. A part of this energy is radiated in the infra-red. Another part is captured in the
form of differential rotation. Matter with a small angular momentum $r^2\Omega(r)$ (per unit mass), with
$\Omega(r)$ the rotation frequency at position $r$ from the centre, will undergo a dynamical collapse on a
free-fall time scale. Matter on the outer edge of the fragment, on the other hand, has a much larger angular
momentum and will not just fall towards the centre, but will spiral in around the star-to-be (see
Figure 9.1).

[Figure 9.1 panels: CLASS 0, CLASS I, CLASS II, CLASS III. Left: log νFν versus log ν (Hz). Right: sketches
labelled core, protostar, active disk, passive disk, remnant disk and star, with scale bars of 5000 AU,
500 AU, 50 AU and 5 AU.]
Figure 9.1: The different stages of the star formation process of a single fragment in a schematic picture
(right) and the accompanying theoretically predicted spectral energy distributions (left). For details: see
text. (Figure courtesy of Dr. Bram Acke, KU Leuven)
A further increase of the density will cause an adiabatic increase of the temperature. Hence the pressure
will rise until free fall is stopped. A central core in hydrostatic equilibrium forms, surrounded by a
(still) collapsing envelope. At this moment, the mass of the core is approximately 1/200 M⊙, the radius is
1000 R⊙. Typical values for the central density and temperature are $\rho_c = 2 \times 10^{-10}$ g cm$^{-3}$,
$T_c = 170$ K. The free-fall velocity at the edge of the core is approximately 75 km s$^{-1}$. When the mass
of the core continues to increase, while its radius decreases, this velocity will exceed the local sound
speed. Hence, a shock wave will be generated, that separates the hydrostatic “interior” from the supersonic
“rain” on the core. In the shock front, the in-falling matter comes to a stop and transfers its kinetic
energy to internal energy of the core. In this way, the accreting core is heated.
In the core, the gas consists primarily of hydrogen in molecular form. However, when the temperature
rises to ∼2000 K, the H2 molecules will dissociate. A mixture of atomic and molecular hydrogen is obtained.
This mixture has a very high opacity and the cooling mechanism becomes much less efficient. At the start
of the dissociation process, the larger part of the energy that is injected into the core via the shock wave, will
be used to dissociate all molecular hydrogen. The shock wave rapidly extinguishes, before it can reach the
outer layers of the fragment. During the stage of strong dissociation, the hydrostatic equilibrium in the core
is broken, and the latter contracts again. This happens when the mass in the core has roughly doubled, and
its radius halved. This second collapse lasts as long as the gas is partially dissociated.
When all hydrogen has been converted to its atomic form, a dynamically stable sub-core has formed
in the star-to-be. This sub-core has a mass of approximately $1.5 \times 10^{-3}$ M⊙ and a radius of
1.3 R⊙. The central density has increased to about $10^{-2}$ g cm$^{-3}$, and the central temperature is
about $2 \times 10^4$ K. Again, a shock front is formed, at the edge of the sub-core. This front is much
more energetic than the first one, and now does reach the surface of the fragment: the early protostar
shows its first luminosity. A schematic representation of the two shock fronts is given in Figure 9.2.
The evolution of the core of the fragment with mass 1M⊙, starting from the original Jeans instability,
is schematically shown in Figure 9.3. The evolution starts on the left with an isothermal collapse. When the
matter becomes opaque, the temperature increases adiabatically. The temperature increase is topped off by
the dissociation of H2. The central compression occurs adiabatically as long as the accretion time scale of
the core (or the sub-core if it exists already) remains short compared to the Helmholtz-Kelvin time scale.
The more molecular hydrogen is depleted from the envelope, the longer the accretion time scale becomes.
At a certain moment, the latter time scale will exceed the Helmholtz-Kelvin time scale and the accretion
will gradually cease: a protostar is born, and its mass will not increase anymore. We now make a sidestep
before answering the question what happens to the protostar before it becomes a newborn star.
Figure 9.2: The collapse of a gas cloud with a mass of 1 M⊙. (a) After about 0.4 million years, the cloud
has a dense, opaque core. The collapse stops at the edge of that core, and develops a shock front between
the core, in hydrostatic equilibrium, and the envelope, which is still in free fall. (b) When the core
becomes dynamically unstable because of H$_2$ dissociation, a second collapse of the core occurs.
Consequently a second shock front develops, but now at much smaller $r$. (c) The velocity modulus $|v|$ (in
cm s$^{-1}$) and density ρ (in g cm$^{-3}$) as a function of $r$ (in cm). The regions of the shock waves are
characterised by large variations in velocity. (From Kippenhahn et al. 2012)
9.5 Hayashi tracks in the HR diagram
Let us consider the limiting case of a fully convective star, i.e., a star where the convective zone
stretches all the way from the stellar core to the stellar photosphere, so that only the stellar atmosphere
remains radiative. The Hayashi track is the location in the HR diagram where fully convective stars with a
certain mass and chemical composition reside. There is a separate Hayashi track for each mass and chemical
composition as these affect the temperature and chemical gradients occurring in the Ledoux criterion for
convective instability. The Hayashi tracks are situated on the right side of the HR diagram, at effective
temperatures between 3000 K and 5000 K. A good approximation for the Hayashi tracks in the HR diagram is:

$$ \log T_{\rm eff} \simeq 0.05 \log L + 0.2 \log M + {\rm constant}. \qquad (9.15) $$

The tracks are steep, with a slope $\partial \log L / \partial \log T_{\rm eff} \simeq 20$. The positive
coefficient of $\log M$ implies that the Hayashi tracks shift to the left in the HR diagram with increasing
stellar mass.

Figure 9.3: The central evolution of a cloud with mass 1 M⊙, starting from the isothermal collapse to the
ignition of hydrogen. The central temperature $T_c$ (in Kelvin) is shown as a function of the central
density $\rho_c$ (in g cm$^{-3}$). The dotted line is an extrapolation which indicates that the stage of
thermal adaptation, that follows after the adiabatic compression, results in the ignition of hydrogen in
the core. (From Kippenhahn et al. 2012)
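The consequences of the approximation (9.15) can be made concrete with a short calculation. The Python sketch below uses only the two coefficients of (9.15); the unknown additive constant cancels in ratios, and the function name and the example ratios are assumptions chosen for illustration.

```python
import math

COEFF_L = 0.05   # coefficient of log L in Eq. (9.15)
COEFF_M = 0.2    # coefficient of log M in Eq. (9.15)

def teff_ratio(mass_ratio=1.0, lum_ratio=1.0):
    """Ratio of effective temperatures on Hayashi tracks, from Eq. (9.15)."""
    dlog = COEFF_L * math.log10(lum_ratio) + COEFF_M * math.log10(mass_ratio)
    return 10.0 ** dlog

slope = 1.0 / COEFF_L                     # d log L / d log Teff at fixed mass
print(f"slope = {slope:.0f}")
# 4 Msun versus 0.25 Msun at equal luminosity: hotter by a factor 16^0.2.
print(f"Teff ratio = {teff_ratio(mass_ratio=16.0):.2f}")
# A factor 100 drop in luminosity at fixed mass changes Teff by only ~20%.
print(f"Teff ratio = {teff_ratio(lum_ratio=0.01):.2f}")
```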
The exact determination of the Hayashi tracks does not only depend on the stellar mass and chemical
composition of the star, but also on the details of the convection theory used. In Figure 9.4, Hayashi tracks
are shown for stars with masses ranging from 0.25 to 4M⊙. The Hayashi tracks are located far away from
the main sequence for high stellar masses, and approach the main sequence at masses below ∼ 0.5M⊙.
Stars with such low masses are fully convective main-sequence stars since their Hayashi track crosses the
main sequence.
The Hayashi track indicates the border between a permitted and a forbidden region in the HR diagram.
Positions to the right of the Hayashi track cannot occur for a star in hydrostatic and convective
equilibrium. The latter means that variations of quantities connected to the convective cells occur so
slowly that the convection has had enough time to adapt to the new situation. Because hydrostatic and
convective equilibrium are quickly restored, stars can only occur to the right of the Hayashi track for a
very short period.

Figure 9.4: The position of Hayashi tracks for stars with masses between 0.25 and 4 M⊙, for a chemical
composition X = 0.70, Y = 0.28, Z = 0.02. The ZAMS is indicated as a reference. (Figure courtesy of
Prof. Onno Pols, Lecture Notes on Stellar Structure and Evolution, Radboud University Nijmegen, NL)
During some stages of stellar evolution, stars closely approach, or even coincide with their Hayashi
track. The position of the Hayashi tracks hence influences stellar evolution.
9.6 Evolution of the protostar towards the zero-age main sequence
After the dynamical collapse stage described in Sect. 9.4, the protostar reaches quasi-hydrostatic
equilibrium. As long as the protostar has a central temperature below that needed to ignite hydrogen
burning, it can only use contraction to generate the energy needed to counteract gravity. The star is not
yet in thermal equilibrium. It contracts slowly on a Helmholtz-Kelvin time scale while still accreting the
last remaining matter from the surroundings. During this evolution stage, $\tau_{\rm HK} \propto M^{-2.5}$
is a good approximation for the contraction time scale. The virial theorem states that a part of the
gravitational contraction energy is converted into internal energy, and the other part is responsible for
the protostar’s luminosity. The opacity of the matter in the protostar remains very high as long as the
matter is not ionised, and nuclear burning does not occur yet, hence convection is the only way to
transport energy. The protostar resides on the Hayashi track. During the contraction, the protostar keeps
an almost constant effective temperature, but its radius decreases, hence so does its luminosity
(proportional to $R^2$). Thus, it moves downward in the HR diagram. This is illustrated in Figure 9.5 by the
red arrows.

Figure 9.5: Henyey tracks of evolving models representing contracting protostars between their Hayashi
track and the ZAMS. The contraction time scales are indicated for the various masses. (Source: Wikipedia)
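The downward motion along the Hayashi track follows directly from the definition of the effective temperature. As a short worked step, assuming a strictly constant $T_{\rm eff}$ during the contraction:

```latex
\begin{aligned}
L &= 4\pi R^2 \sigma T_{\rm eff}^4
  \;\Longrightarrow\;
  \log L = 2 \log R + 4 \log T_{\rm eff} + \text{constant}, \\
\left. \Delta \log L \right|_{T_{\rm eff}} &= 2\, \Delta \log R .
\end{aligned}
```

Halving the radius at constant effective temperature thus lowers the luminosity by a factor of 4.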
In higher mass protostars, the central temperature increases faster than the central density. In lower
mass stars, it is the opposite: the central density increased faster than the central temperature. Thus for
low-mass protostars, the radiation of half the contraction energy has a harder time to take place and the
star remains convective and so closer to the Hayashi track. Because the internal temperature in the core of
the protostar keeps increasing steadily, the gas eventually becomes ionised. The opacity in the core drops,
and the convective zone disappears from the core. This implies that the protostar leaves its Hayashi track,
because it is not fully convective anymore. It will continue its evolution with a radiative contraction of the
core, and hence moves along the so-called Henyey tracks to the left in the HR diagram. As the central
temperature rises, the contraction of the protostar’s core takes place in an increasingly transparent environment,
so the protostar’s luminosity stops decreasing and starts increasing. The star has now become
a pre-main-sequence star, or pre-MS star in brief, as indicated by the orange–yellow–white–blue tracks in
Fig. 9.5.
While the pre-MS star continues its way along the Henyey track to the left in the HR diagram, its central
temperature reaches the value needed to initiate deuterium burning (approximately $10^6$ K). We recall that
deuterium is immediately burned into $^3$He and creates a photon. The less massive the protostar, the closer it
will be to its Hayashi track when the first nuclear reactions based on primordial deuterium occur. The full pp chains cannot be completed yet, however, because the pp reaction requires a higher temperature and the full
pp chains only occur in equilibrium for temperatures above $5 \times 10^6$ K. Hence the $^3$He isotopes cannot reach
an equilibrium concentration needed to have the full hydrogen burning in action. As a result, the temperature
sensitivity of the nuclear reactions at this stage is higher (typically by a factor 3) than it would be if the pp chains
occurred in equilibrium. This implies that the pre-MS star develops a convective core. In pre-MS stars with
a mass below ∼ 1.1M⊙, this convective core will disappear when the pp chains, with all corresponding
chemical reactions, occur in equilibrium. More massive stars will rapidly switch from deuterium burning
to hydrogen burning through the CNO cycle (cf. Figure 6.5). This kind of burning, however, is much more
sensitive to temperature than the pp chains. Hence, these stars will maintain their convective core during the
entire stage of hydrogen core burning.
The accretion continues during almost the entire pre-MS stage, on a Helmholtz-Kelvin time scale.
Protostars with a mass above ∼9 M⊙ evolve so quickly from their Hayashi track to the main sequence
(time scale far below a million years, see Figure 9.5), that they are not visible during their pre-MS stage,
also because they remain embedded in a thick circumstellar envelope of in-falling matter. These massive
stars hence only light up in the HR diagram when they have already reached the ZAMS. At that point, the infall
of matter stops due to the strong outward radiation. The pre-MS stage of high-mass stars is thus poorly
known. Pre-MS stars with masses between ∼1.6 and 9M⊙ end their accretion stage before they reach the
main sequence. Such objects are called Herbig Ae/Be stars. Pre-MS stars with masses below ∼1.6 M⊙ are
called T Tauri stars.
Observationally, the HR diagrams of young star clusters (e.g.
the Pleiades, h and χ Persei) indeed show massive stars on the main sequence, while stars of low mass are still
in their contraction phase, occupying the region to the right of the main sequence. Many of these stars
are indeed T Tauri stars. Observations of Herbig Ae/Be stars and T Tauri stars show that both groups of
stars experience active surface phenomena and differential rotation. The combination of this rotation and
the convection in the outer stellar layers may induce a chaotic magnetic field. The latter transports the
available angular momentum to the surface, where part of it can be lost through a stellar wind. This
wind escapes via the stellar polar axis, because of the presence of the accretion disk in the plane of the
equator. Hence, a bipolar outflow forms and this stops the accretion process before the star has reached the
main sequence (Figure 9.1). During their pre-MS stage, the dust disk around T Tauri and Herbig Ae/Be stars
disappears. The details of how this happens are not yet clear but this phase coincides with the formation of a
planetary system. Whether planet formation in such disks is a common, or rather exceptional phenomenon
is under intense study nowadays. For more details, we refer to the Master course Star and Planetary System
Formation.
Once the hydrogen burning can occur in equilibrium, and fully dominates the energy production, the
star reaches a state of thermal equilibrium, and the contraction stops. This point in time is called the zero-age
main sequence, abbreviated as ZAMS, as already used a few times in figure captions above. The star is
Figure 9.6: Pre-MS HR diagram with evolutionary tracks for solar-type metallicity (black) and isochrones
for ages of 0.1, 1, 10, and 100 Myr (grey) for birth masses covering 0.1 to 6 M⊙. (Source: Wikipedia
Commons, based on Stahler & Palla, 2004, “The Formation of Stars”, Wiley-VCH Verlag GmbH & Co.
KGaA)
now “born”. When the energy from contraction drops below a percent of the total energy budget, the internal
structure of the star requires a reorganisation. The two energy sources, gravitational contraction energy and
nuclear energy, switch in importance. Both have a very different influence on the stellar structure. The
gravitational energy production scales as $\varepsilon_g \sim T$, while the hydrogen burning processes are much more concentrated
towards the stellar centre, with temperature dependencies $\varepsilon_{\rm pp} \sim T^5$ and $\varepsilon_{\rm CNO} \sim T^{18}$. The nuclear energy
rapidly becomes the dominant energy source, and the evolution of the star is from this moment on fully
governed by hydrogen burning. In Figure 9.6, we show pre-MS evolutionary tracks until the ZAMS, for
different stellar masses, computed with the MESA code.
Finally, we note that a contracting sphere with a mass below a certain threshold mass will never reach a
central temperature high enough to start hydrogen burning. Protostars with a mass below 0.08 M⊙ will never
be able to achieve hydrogen burning in full equilibrium, and hence never reach the ZAMS. These “failed”
stars are fully convective during their contraction stage. Contraction is responsible for the production of
the luminosity as long as no nuclear reactions take place. The central density keeps increasing, which
in a protostar with a mass below ≃ 0.08M⊙ leads to electron degeneracy before the pp reaction ignites.
This electron degeneracy prevents a further increase of the temperature, and the latter will never become
sufficiently high to perform hydrogen burning in full equilibrium. Such objects are called brown dwarfs. A
star-to-be is doomed to become a brown dwarf when its mass is not high enough to ignite the full hydrogen
burning cycle in the core before degeneracy sets in. The limiting mass depends somewhat on the initial
chemical composition and on the degree of ionisation, as this affects the mean molecular weight and hence
the central gas pressure. Taking this into account in numerical models for Big Bang nucleosynthesis and for
more recent metal mass fractions, one obtains a minimum mass for a main-sequence star in the range 0.06 –
0.09 M⊙.
Brown dwarfs with mass above about 0.065 M⊙ can burn both deuterium and $^7$Li during their pre-MS
contraction but those with lower mass cannot. Hence, these “high mass” brown dwarfs produce $^3$He and
photons but not $^4$He, as their central temperature remains too low to overcome the Coulomb barrier of the $^3$He – $^3$He reaction. Their thermonuclear energy thus mostly comes from the burning of their primordial
deuterium. Once that is finished, the brown dwarf can only cool. About $10^7$ years after the initiation of the
deuterium burning, the luminosity of brown dwarfs drops with age as $L \sim \tau^{-1.2}$. A rough luminosity-mass-age
relation holds:
$$\frac{L}{L_\odot} \simeq \left(\frac{M}{M_\odot}\right)^{2.6} \left(\frac{\tau}{10^7\,\mathrm{yr}}\right)^{-1.2}. \qquad (9.16)$$
Lower-mass brown dwarfs do not fuse their primordial Li and hence spectroscopy focusing on the absence or
presence of Li spectral lines offers an observational method to distinguish a low-mass star from a low-mass
brown dwarf.
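As a quick way to explore Eq. (9.16), the relation can be evaluated numerically. The sketch below implements the equation exactly as printed, without any additional normalisation factor; the function name and the sample mass and ages are hypothetical choices for illustration, and the relation is only meant for ages beyond roughly $10^7$ years.

```python
def brown_dwarf_luminosity(mass_msun, age_yr):
    """Rough L/Lsun of a cooling brown dwarf, following Eq. (9.16) as printed.

    mass_msun : mass in solar masses
    age_yr    : time since the start of deuterium burning, in years
    """
    if mass_msun <= 0 or age_yr <= 0:
        raise ValueError("mass and age must be positive")
    return mass_msun**2.6 * (age_yr / 1.0e7) ** -1.2


# Hypothetical 0.05 Msun brown dwarf at three ages.
for age in (1.0e8, 1.0e9, 1.0e10):
    lum = brown_dwarf_luminosity(0.05, age)
    print(f"age = {age:.0e} yr  ->  L/Lsun = {lum:.2e}")
```

Each factor of ten in age lowers the luminosity by a factor $10^{1.2} \approx 16$, which illustrates why old brown dwarfs are so hard to detect.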
At the other (planetary) mass end, one distinguishes brown dwarfs and gaseous planets by considering
a limiting mass between the two of about 0.013M⊙, because this is the amount of gas necessary to reach
high enough central temperature and density to fuse deuterium ($T_c \simeq 10^6$ K). However, this value for the
mass limit is not meaningful for rocky planets, where it is rather placed at ≈ 0.025M⊙. Low-mass brown
dwarfs and high-mass planets thus overlap in their mass ranges.
Brown dwarfs were for some time held responsible for the so-called “missing mass” in the Universe.
Because of their low luminosity, it is difficult to observe them. Hence, it was thought that a significant
fraction of the mass in the Universe might be “hidden” below the detection limit. An accurate estimation of
the mass in the Universe is of great importance for the set-up of cosmological models (see Master course
Galaxies and Cosmology). In this framework, the search for brown dwarfs remains quite topical.
Chapter 10
The main sequence or core-hydrogen
burning phase
10.1 Zero-age main sequence models
We now consider a series of stellar models in mechanical and thermal equilibrium with the same chemical
composition, but with different masses. The stars have arrived on the ZAMS, as explained in the previous
chapter, and experience hydrogen burning in their core in full equilibrium. This nuclear burning is their
source of energy for a very long time. As of now, the stars evolve on a time scale τn, which is much longer
than the time span τHK covered by the star during its formation history.
The consumption of hydrogen in the core occurs at such a low rate that the star spends almost its entire
life on the main sequence (about 90%). Most stars we observe are therefore main-sequence stars. The age of
the star is usually expressed starting from the zero-age main sequence (i.e., t = 0 at the ZAMS). We recall
that the latter is defined as the moment when the central hydrogen burning occurs in full equilibrium
and becomes by far the most important energy source (i.e., the contribution of the contraction energy drops
below a percent).
Equilibrium models of main-sequence stars in the stage of central hydrogen burning can be determined
based on the scheme described in Chapter 8. In Figure 10.1, the position of the stellar models at the time
of the ZAMS is shown in an HR diagram for a range in masses between 0.1M⊙ and 100M⊙, for an initial
hydrogen composition given by the mass fraction X = 0.70, and for two values of the metallicity. The lumi-
nosity and effective temperature increase with increasing mass. Connecting these initial values for different
stellar masses delivers the entire ZAMS. As can be seen in Figure 10.1, there is a clear tight connection
between the values of the mass and luminosity of the stellar models. Hence, given the effective temperature
dependence of L, also the mass and radius must be tightly connected.
Stellar models at the ZAMS indeed comply with so-called homology relations. We elaborate on this
Figure 10.1: The zero-age main sequence (ZAMS) in the Hertzsprung-Russell diagram for stellar models with X = 0.70 and two values of the metallicity Z. The positions of the models for different masses between 0.1
and 100M⊙ are indicated with plus signs, revealing that metal-poorer stars are bluer than metal-rich ones.
(Figure courtesy of Prof. Onno Pols, Lecture Notes on Stellar Structure and Evolution, Radboud University
Nijmegen, NL)
in a bit more detail in the next section, from a pragmatic user approach rather than performing detailed
theoretical/analytical computations (the latter can be found in Kippenhahn et al. 2012, Chapters 20 to 22 but
are skipped here). Before going into these details, we compare the ZAMS mass-luminosity (ML) and mass-
radius (MR) relations obtained from the theory with data that does not rely on models of stellar interiors.
This can be done for a large coverage of the mass range from unevolved detached binary stars (we omit
details here but refer to Chapter 14). The outcome is shown in Figure 10.2 and reveals quite good agreement
between theory and observations, particularly if one keeps in mind that the models are for the ZAMS only,
while the observations represent stars covering the entire main sequence. We discuss this figure further
in the next subsection but note here that this good agreement between the measurements and full lines is
remarkable, as it occurs over an extraordinarily extensive range of mass and luminosity: a factor ∼ 200 in
mass and a factor $10^8$ in luminosity!
Figure 10.2: The full line indicates the ZAMS mass-luminosity (left) and mass-radius (right) relation de-
rived from the computed stellar models with Z = 0.02 already shown in Figure 10.1. The dashed lines
show the approximations for the ML and MR relations discussed in the text. The coloured symbols are
observations based on dynamical masses and radii, which were determined in a model-independent way.
The uncertainties of these measurements are smaller than the symbol sizes. Blue stars: visual binaries; red
plusses: spectroscopic binaries. (Figure courtesy of Prof. Onno Pols, Lecture Notes on Stellar Structure and
Evolution, Radboud University Nijmegen, NL)
10.2 The mass-luminosity and mass-radius relations
We can derive and study mass-luminosity and mass-radius relationships by using the outcome of model
computations based on the methods in Chapter 8 and by subsequently fitting the outcome with mathematical
functions. Given the tight relationship between the mass, effective temperature, and luminosity revealed in
Figures 10.1 and 10.2, the fact that $L \sim R^2 T_{\rm eff}^4$ and that the mean molecular weight is determined by the
initial chemistry (X, Y, Z), we consider approximations of the form:
$$L \sim M^{\eta_1}\mu^{\eta_2}; \qquad R \sim M^{\xi_1}\mu^{\xi_2}. \qquad (10.1)$$
Looking at the outcome of the numerical integrations to obtain the ZAMS models in Figure 10.2, we can
see that the slope in the ML-relation, i.e., the values of η1, η2 (and hence of ξ1 and ξ2), will depend on the
chosen mass interval. When restricting the EOS to an ideal gas, it can be shown analytically that η1 = 3
and η2 = 4 (Kippenhahn et al. 2012). Given that M can vary by three orders of magnitude, while µ hardly
changes, it is meaningful to fix η2 = 4 and only estimate η1. Doing this for the entire stellar mass range, the
best approximation to the ZAMS models is found for η1 ≈ 3.3; this is the dashed line indicated in the left
panel of Figure 10.2. It can also be seen from that figure that the highest value of the slope η1 is attained
by restricting the fit to the mass range [1, 10] M⊙, giving η1 = 3.9. The appreciable decrease of η1 for the
highest masses is due to the increasing importance of the radiation pressure in the EOS and of mass loss (cf.
the so-called Eddington luminosity, which will be discussed in Chapter 13). For ZAMS stars with a mass
below half the solar value, η1 ≈ 5.
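To get a feeling for what these exponents imply, the power law of Eq. (10.1) can be evaluated as a ratio between two stars, so that no absolute normalisation is needed. The sketch below is a minimal illustration: the function name is hypothetical, the default exponents η1 = 3.3 and η2 = 4 are the values quoted above, and the 10% change in mean molecular weight is an arbitrary example.

```python
def luminosity_ratio(mass_ratio, mu_ratio=1.0, eta1=3.3, eta2=4.0):
    """Return L2/L1 for two ZAMS stars, assuming L ~ M**eta1 * mu**eta2."""
    if mass_ratio <= 0 or mu_ratio <= 0:
        raise ValueError("ratios must be positive")
    return mass_ratio**eta1 * mu_ratio**eta2


# Global fit (eta1 = 3.3) across a factor 200 in mass.
span = luminosity_ratio(200.0)
# Steeper local slope quoted for the [1, 10] Msun interval.
local = luminosity_ratio(10.0, eta1=3.9)
# Hypothetical 10% higher mean molecular weight at fixed mass.
mu_effect = luminosity_ratio(1.0, mu_ratio=1.1)

print(f"factor 200 in mass, eta1 = 3.3: L ratio = {span:.2e}")
print(f"factor 10 in mass,  eta1 = 3.9: L ratio = {local:.2e}")
print(f"10% higher mu at fixed mass   : L ratio = {mu_effect:.2f}")
```

With the global exponent, a factor 200 in mass corresponds to a luminosity span of a few times $10^7$, the same order of magnitude as the range covered by the data in Figure 10.2.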
As for the mass-radius relation for ZAMS stars, we find roughly ξ1 ≈ 0.79 for models with mass
lower than the solar value, while the more massive stars have ξ1 ≈ 0.57. This steeper relationship for low-
mass stars is readily seen in the model-independent data for the dynamical masses and radii of the highest
accuracy from detached eclipsing binaries assembled by Serenelli et al. (2021, A&A Rev., Vol. 29, id.4,
141pp.) shown in Fig. 10.3. This figure also illustrates that the validity of the mass-luminosity, mass-radius,
and mass-temperature relationships is limited to main-sequence stars. For the more global mass regime,
the dashed line in the right panel of Figure 10.2 is drawn for ξ1 = 0.81. The full line in the right panel
of Figure 10.2 is the outcome of the theory and reveals a clear “kink” in the mass-radius relation around
M = 1M⊙, also in line with the more extended sample of benchmark stars in Figure 10.3. The reason is
that stars with an effective temperature lower than that of the Sun, have a much more extended convective
envelope (see also Figure 10.4 discussed further in the text), which induces an increase of the radius with
respect to a star of similar mass that would hypothetically have a radiative outer layer. This leads to a
steeper increase of the mass-radius relation as the mass decreases. This trend cannot be continued into the
high-mass regime, because such stars have radiative outer zones.
Overall, the agreement between the data and the dashed/full lines in Figure 10.2 is good, given that
these curves represent relationships based on just one exponent. In general the radius, and hence
luminosity of a star depends as well on its metallicity (cf. Figure 10.1) and so the mass-radius and mass-
luminosity relations must also be metal-dependent. Rather than taking this into account via the expression of
the molecular weight, which does not change much in numerical value for various stars, we can also consider
a fit including Z , as this is a quantity that can be measured from stellar spectroscopy. The metallicity
dependence is rather modest but well observable. A good mathematical fit valid for a fixed mass is
$$R \sim Z^{1/6} \quad \text{and} \quad L \sim Z^{-7/6}. \qquad (10.2)$$
Hence, since $T_{\rm eff}^4 \sim L/R^2$, we find $T_{\rm eff} \sim Z^{-9/24}$. As a result, metal-poor stars are bluer and more luminous
than their metal-rich counterparts of the same mass, as was already revealed in Figure 10.1, except for the
highest masses where a strong radiation-driven wind comes into play (Chapter 13).
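For completeness, the intermediate step behind the exponent of the effective temperature can be written out explicitly. It uses only Eq. (10.2) and the relation between luminosity, radius, and effective temperature:

```latex
\begin{align}
T_{\rm eff}^4 &\sim \frac{L}{R^2} \sim \frac{Z^{-7/6}}{\left(Z^{1/6}\right)^2}
              = Z^{-7/6 - 2/6} = Z^{-9/6}, \\
T_{\rm eff}   &\sim \left(Z^{-9/6}\right)^{1/4} = Z^{-9/24} = Z^{-3/8}.
\end{align}
```

Both exponents are negative, so at fixed mass a lower metallicity gives a higher luminosity and a higher effective temperature, consistent with the trend seen in Figure 10.1.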
10.3 Chemical evolution on the main sequence
During the main-sequence stage, the energy loss at the stellar surface is compensated by the energy pro-
duction due to hydrogen burning. The chemical evolution of the star is mostly concentrated in the direct
environment of the stellar core, because the energy production is strongly dependent on temperature. The
Figure 10.3: Mass-radius and mass-temperature relations of “benchmark stars” assembled in the review
paper of Serenelli et al. (2021, A&A Rev., Vol. 29, id.4, 141pp.). These are detached eclipsing binaries for
which the masses and radii were determined with an accuracy better than 2% for the high-mass stars and
gradually down to 1% for the low-mass stars, while their metallicity is also available for both components
from spectroscopy, either via spectral disentangling or from double-lined composite spectra. For the binaries
containing at least one low-mass star with M ≤ 0.7M⊙, a relative error of 3% in mass was allowed. The
insets zoom in on the stars with M ≤ 1.0M⊙. Cyan triangles are pre-MS stars while red squares represent
evolved stars. (Figure courtesy of the author, produced for inclusion in Serenelli et al., 2021, A&ARev, Vol.
29, id.4, 141pp.)
central part of the star, where hydrogen fusion occurs, contains between 10 and 30% of the total stellar
Figure 10.4: The values of the mass coordinate m/M from the centre to the surface of the star are drawn as a
function of the total stellar mass for the ZAMS models shown in Figure 10.1. Areas indicated with “clouds”
are zones in the star where the energy transport is done via convection. The two full lines indicate values of
m/M where r is equal to one fourth and half of the total radius R. The dashed lines indicate mass shells in
which 50% and 90% of the total luminosity L is produced. (From Kippenhahn et al. 2012)
mass, depending on the mass regime and on the occurrence or absence of CBM. When convection occurs,
turbulent motions cause an efficient mixing of the stellar matter, and thus a larger volume is influenced. In
Figure 10.4, the convection zones are shown as a function of stellar mass for standard models without extra
mixing (i.e., no CBM nor chemical mixing in the envelope). We see that there are no convective zones near
the stellar core of stars with masses below 1 solar mass, and that the extent of the central convective core
increases with increasing stellar mass. A schematic representation of the location of the convective zones
according to stellar mass is shown in Figure 10.5, in which the left cartoon represents stars with M ≳ 2 M⊙,
the middle one those with 1 M⊙ ≲ M ≲ 2 M⊙, and the right one stars with M ≲ 1 M⊙.
For stars with masses between 0.1 and 1M⊙ with a radiative core, the change in hydrogen content due
to hydrogen burning is easily determined. The variation of X for a certain fluid element is proportional to
the local value of εH when there is no convective mixing going on. This means that the change in hydrogen
abundance after a time interval t is given by ΔX ∼ εH t. Hence, the chemical evolution can be traced
Figure 10.5: Sketch of the convection zones in ZAMS stars. Left: Stars with masses above 2M⊙; middle:
stars with masses between 1 and 2M⊙; right: stars with masses below 1M⊙. (Figure courtesy of Prof. J.
Christensen-Dalsgaard)
easily throughout the entire hydrogen-burning stage. At the end of the main-sequence stage, X → 0 in the
stellar core. The effective temperature of these low-mass stars barely changes during the main sequence.
For more massive stars, the helium production is much more concentrated towards the centre, because
there is a much larger temperature dependence for the CNO cycle than for the pp chains. The convection in
the central parts is so efficient and rapid that the stellar core can be regarded as homogeneous in composition
at all times. Within the core, we hence have ΔX ∼ εH t, in which εH is an average value of the energy
production over the total stellar core.
The evolution of the size of the convective core during the main sequence for stars with M > 1M⊙
depends on the stellar mass. For stars with M ≳ 1.6 M⊙, ∇rad changes mostly because of the variation in
opacity. In the core we have κ ∼ (1 + X), hence the opacity will decrease as X decreases. For these stars, the convective
core will thus shrink when they evolve along the main sequence. This was already illustrated by the change
in the buoyancy frequency shown in Figure 5.4. As a result, products of the nucleosynthesis are left behind
in a region around the convective core. The stars get a slightly larger radiative outer zone, in which the
temperature drops faster than in a convective zone, and hence the star becomes cooler at the surface, which
induces a movement to the right in the HR diagram.
For M ≲ 1.6 M⊙, on the other hand, ∇rad changes mostly due to the contribution of ℓ(r)/m(r) = ε.
Because εpp ∼ X2 and εCNO ∼ XZ , the relative importance of the CNO cycle in the energy production in-
creases with respect to the pp chains as the star evolves along the main sequence. The pp chains get a harder
and harder time to occur in equilibrium, which makes the reactions more temperature sensitive. Because
the CNO burning is concentrated in a smaller core, ℓ(r)/m(r) increases. This increase dominates over the
decrease in opacity, which implies that the radiative temperature gradient increases during the star’s main-
sequence evolution, hence the convective core grows. In reality, both opacity and energy production change
simultaneously, and their combined effect on ∇rad must be considered. In stars with M < 1.3 M⊙, the pp chains remain the dominant energy source. The effective temperature of these stars is mostly determined
by the large outer convective zone, and barely changes as a result of the slightly larger convective core. In
stars with 1.3 M⊙ < M < 1.6 M⊙, the CNO cycle is already the dominant energy production mechanism
(see Figure 6.5), and the outer convective layer is very thin. These stars evolve to the right during the main
sequence evolution.
The time a star spends on the main sequence depends on its mass, because the luminosity depends
strongly on mass. If we denote the energy reservoir available from hydrogen burning by EH, the star can
remain on the main sequence for τH ≡ EH/L. As a rough approximation, we assume that a fixed fraction of
the mass in hydrogen MH is available for hydrogen burning. Under this assumption, EH ∼ MH ∼ M. Although
the luminosity L changes for stars during their main-sequence stage, we can use the mass-luminosity relation
for ZAMS models to estimate τH. We thus find the following dependence of the main-sequence lifetime on
stellar mass:
$$\tau_{\rm H}(M) \sim \frac{M}{L} \sim M^{1-\eta_1}. \qquad (10.3)$$
For the average exponent η1 = 3.3 of the mass-luminosity relation, we find that $\tau_{\rm H}(M) \sim M^{-2.3}$: the
main-sequence lifetime decreases rapidly with increasing mass. A typical value is $2 \times 10^8$ years for a 5 M⊙
star, and $10^{10}$ years for the Sun.
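The scaling of Eq. (10.3) is easily turned into a rough lifetime estimate by anchoring it to the solar value. The sketch below assumes a solar main-sequence lifetime of $10^{10}$ years and the average exponent η1 = 3.3, both quoted above; the function name and the list of masses are hypothetical choices for the example, and the result is an order-of-magnitude estimate only.

```python
TAU_SUN_YR = 1.0e10  # main-sequence lifetime of the Sun quoted in the text
ETA1 = 3.3           # average mass-luminosity exponent quoted in the text


def ms_lifetime_yr(mass_msun, eta1=ETA1, tau_sun_yr=TAU_SUN_YR):
    """Rough main-sequence lifetime in years from tau_H ~ M**(1 - eta1)."""
    if mass_msun <= 0:
        raise ValueError("mass must be positive")
    return tau_sun_yr * mass_msun ** (1.0 - eta1)


# Illustrative masses in solar units.
for mass in (0.5, 1.0, 2.0, 5.0, 10.0):
    print(f"M = {mass:5.1f} Msun  ->  tau_H ~ {ms_lifetime_yr(mass):.1e} yr")
```

For a 5 M⊙ star this gives a value between 2 and 3 × $10^8$ years, in line with the typical value mentioned above.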
The faster main-sequence evolution of more massive stars is clearly confirmed by the observational
studies of the HR diagram of star clusters. These are concentrations of stars on the sky, so close to each other
they must be physically linked. There are two types of star clusters. Galactic or open clusters contain stars
of Population I and are located in the Galactic disk, where they easily get disrupted depending on their mass
content and the surroundings in the disk. Indeed, they contain typically only a few hundred stars and are not
strongly gravitationally bound. Globular clusters, on the other hand, consist of millions of Population II stars.
They are found at great distances away from the Galactic disk and do not get disrupted easily. All stars in
a cluster are approximately equally distant to Earth, which implies that differences in apparent magnitudes
are equal to differences in absolute magnitudes for all cluster members – see Equation (1.10). Hence, a
diagram of apparent magnitude versus colour has the same shape as a diagram of the absolute magnitude
versus colour.
All stars in a cluster were born more or less simultaneously, and therefore have the same age τcluster. Consequently, all stars with a mass exceeding a certain limit Mlimit will already have left the main sequence,
while stars with a smaller mass M < Mlimit are still in the stage of hydrogen burning in the core. Observa-
tions of stars in clusters confirm this scenario. In Figure 10.6, we show the contrast in HR diagram between a
young and an old cluster. In the bottom panel, the HR diagram of the young double cluster h and χ Persei is
shown, in which the lower-mass stars are still pre-MS stars evolving towards the ZAMS, while more massive
stars are already on the ZAMS and the most massive stars have become red supergiants. In the top panel,
the HR diagram of the old star cluster M 5 is shown, in which the more massive stars have clearly evolved
off the main sequence, while the low-mass stars are still on the main sequence. The horizontal branch (see
Chapter 12 for a definition) is clearly visible in this old cluster.
In Figure 10.7, colour-magnitude diagrams of several galactic clusters are shown schematically. The difference in main-sequence
lifetime as a function of initial stellar mass has the following important application for star clusters.
Figure 10.6: The colour-magnitude diagram for a typical globular cluster (M 5), consisting of Population II
(old) stars (top), and the young galactic double cluster h&χ Persei, containing Population I stars (bottom).
(Source: Wikipedia)
Figure 10.7: A schematic representation of colour-magnitude diagrams of several galactic star clusters. The
age scale on the right-hand side is based on evolutionary model computations for Population I stars. The
turn-off point of each cluster reveals its age. (Source: Wikipedia)
The limit mass which indicates whether or not a star in the cluster is still on the main sequence, is given by
the condition τcluster = τH(Mlimit). This condition is the basis for the age determination of stellar clusters.
The turn-off point determines the age of the cluster, indicated on the right-hand side of the figure. The older
the cluster, the lower the turn-off from the main sequence towards the red giant branch of the cluster will
be situated (see Figure 10.7). The example of h&χ Persei (see Figure 10.6) shows that the low-mass stars in
extremely young clusters have not yet reached the main sequence. The study of these young stars reveals
details of the evolution of protostars while contracting towards the main sequence.
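The condition τcluster = τH(Mlimit) can be inverted analytically when the simple power law $\tau_{\rm H} \sim M^{1-\eta_1}$ is adopted. The sketch below does this under the same rough assumptions as before (η1 = 3.3 and a solar main-sequence lifetime of $10^{10}$ years); the function names and the sample ages are hypothetical, and real age determinations rely on isochrones from full evolutionary models rather than on this scaling.

```python
def ms_lifetime_yr(mass_msun, eta1=3.3, tau_sun_yr=1.0e10):
    """Rough main-sequence lifetime in years from tau_H ~ M**(1 - eta1)."""
    return tau_sun_yr * mass_msun ** (1.0 - eta1)


def turnoff_mass_msun(cluster_age_yr, eta1=3.3, tau_sun_yr=1.0e10):
    """Mass (in Msun) whose main-sequence lifetime equals the cluster age."""
    if cluster_age_yr <= 0:
        raise ValueError("cluster age must be positive")
    return (cluster_age_yr / tau_sun_yr) ** (1.0 / (1.0 - eta1))


# Illustrative cluster ages, from very young to globular-cluster-like.
for age in (1.0e7, 1.0e8, 1.0e9, 1.0e10):
    print(f"age = {age:.0e} yr  ->  M_limit ~ {turnoff_mass_msun(age):.1f} Msun")
```

The older the cluster, the lower the limiting mass, which is the numerical counterpart of the turn-off point moving down the main sequence.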
Also the influence of the chemical composition on the stellar evolution is noteworthy. This was already
revealed in Figures 1.9 and 10.1. In terms of cluster ageing, one has to keep in mind that globular clusters
consist of Population II stars, which are metal-poor and hence have lower opacities than their Population I
metal-rich analogues in galactic clusters. The age determination of globular clusters is used as a limit to the
age of the Universe, and is in that sense important for observational cosmology.
Figure 10.8: Multiple populations, due to various levels of initial helium (see model fits in the inset), in the
globular cluster NGC 2808. (From Piotto et al., ApJL, Vol. 661, L53, 2007)
Star cluster studies constitute an entire research field by themselves in astrophysics, particularly since
Gaia DR2 revealed spectacular improvements of the CMD morphologies (cf. Figures 1.8 and 1.9). These
new astrometric data still have to be digested by the astronomical community, about a decade after the discovery
of multiple main sequences in globular clusters due to different chemical compositions (mainly helium)
caused by different populations in terms of generations of stars. Young open clusters, on the other hand,
have extended main-sequence turnoffs. These are, up to the present day, assigned to differences in stellar
rotation, although binarity certainly also plays a role. The two phenomena are illustrated in Figures 10.8 and
10.9. So far, CBM has been kept constant for all stars in isochrone fitting of young open clusters, leading
to typically 30% relative uncertainty in isochrone-based ages. This is a serious limitation, given that
asteroseismology shows a whole range in αov to occur in stars of the same mass, age, and rotation. Improv-
ing stellar ageing from isochrone computations by allowing stars that belong to the same binary or cluster
to have different levels of CBM and envelope mixing (following Chapter 7), offers a new explanation for
extended main-sequence turnoffs in clusters, cf. Figures 10.11 and 10.12 discussed in the next subsection.
Figure 10.9: The extended main-sequence turnoff of the open cluster NGC 2818, compared with evolution-
ary tracks of rotating stellar models, failing to explain the core helium burning cluster members. (From
Bastian et al., MNRAS, Vol. 480, p.3739, 2018)
10.4 The end of core-hydrogen burning
The evolutionary track of a core-hydrogen burning star of 7 M⊙ in the HR diagram is shown in the left
panel of Figure 10.10. This track is computed for standard models without CBM in the core boundary
layers and without extra chemical mixing in the radiative envelope. From point A on the main sequence,
the star moves up and to the right, on its way to B. The increase in luminosity is due to the increase in
mean molecular weight in the core, because of the conversion of hydrogen to helium – see Eq. (2.31) and
recall that P ∼ T/µ and $L \sim T^4$. When almost all hydrogen has been used up (X = 5%), a minimal
effective temperature is reached (point B). This stage is called the terminal-age-main-sequence (TAMS).
The star will soon experience an energy crisis. Because the central temperature is far too low to start helium
burning, the stellar core starts to contract and the star evolves to the left in the HR diagram. The evolution
is accelerated, because the remainder of hydrogen in the core is consumed very quickly. At the end of the
hydrogen core burning (point C), the star consists of a core containing helium. This core is still not hot
enough to start helium burning. The helium core is surrounded by a hydrogen-rich envelope. Due to the
increase in temperature near the core when the star evolved from B to C, the temperature at the bottom of
the envelope is high enough to ignite hydrogen burning in this region and thus produce the nuclear energy
needed to counterbalance gravitational contraction. In this way, a stage of hydrogen shell burning is initiated.
Figure 10.10: Hertzsprung-Russell diagrams with evolutionary tracks for Population I stars during the stage
of core hydrogen burning. The ZAMS is indicated with a dashed line. (a) For a 7M⊙ star. The points A,
B, and C correspond to the time of stellar birth (ZAMS), minimal Teff , and exhaustion of hydrogen in the
core, respectively. The dotted line indicates the subsequent stellar evolution after the core hydrogen burning
stage. (b) For stars with a mass between 4 and 8M⊙. (c) For stars with a mass between 1 and 3M⊙. (From
Kippenhahn et al. 2012)
The evolutionary track shown in the left panel of Figure 10.10 is representative for all stars with a
large convective core. This is illustrated in the middle panel of the same figure. The increase in luminosity
between points A and B is larger for stars with higher masses, while the variation in effective temperature
remains more or less the same. Figure 10.11 illustrates, for stars with a birth mass between 3 and 7 M⊙, the
major effect of extra mixing in the convective boundary layer (whatever its physical cause) on the duration
of the main sequence (dotted tracks), compared to the case where no extra mixing occurs aside from
convective mixing in the core (full tracks). The reason for this large difference is that the extra mixing at
the bottom of the radiative envelope transports fresh hydrogen into the convective core, drastically extending
the core-hydrogen burning phase. This also implies that, by the time the star reaches the dashed-dotted line,
it has a more massive helium core and a higher luminosity compared to the standard models where no extra
mixing occurs outside the convective core.
For stars with a mass lower than that of the Sun, the evolution tracks are different. This is indicated in
Figure 10.11: Evolutionary tracks computed with MESA to illustrate the effect of convective boundary and
envelope mixing. The models are based on an exponentially decaying Dov(r) with fov = 0.04 (MESA
notation) and mixing due to internal gravity waves at the level of Denv(r) = 1000 cm² s⁻¹ (see Figure 7.3)
in the deep bottom of the radiative envelope (dotted grey tracks) compared to the case where these two types
of chemical mixing are at a low level (fov = 0.01 and Denv(r) = 1 cm² s⁻¹; full grey lines). The dashed and
dashed-dotted lines indicate the exhaustion of the central hydrogen mass fraction. (Figure courtesy of Dr.
Cole Johnston)
the right panel of Figure 10.10. These stars do not have a convective core and are thus not subject to convec-
tive core mixing. They experience a more gradual transition from core to shell hydrogen burning, building
up their helium core starting in the stellar centre and continuously increasing from the centre outwards.
The onset of hydrogen burning in a shell has important consequences for the internal stellar structure.
The structure and change of the helium core depend on the mass and chemical composition of the star. A
core in thermal and hydrostatic equilibrium without internal source of energy does not contribute to the
luminosity and therefore has to be isothermal (since l ∼ dT/dr). However, the pressure delivered by the
core must be sufficient to compensate for the gravitational contraction due to the envelope on top of it. This
is only the case if the mass of the helium core remains below the so-called Schönberg-Chandrasekhar limit
(MSC). This limit is defined as the maximum mass that a non-fusing, isothermal He core can have while
Figure 10.12: The grey area in this HR diagram is an isochrone cloud, which is a zone covered by stars of
equal age, having different levels of CBM and envelope mixing between the limits indicated in Figure 10.11.
(From Johnston et al. 2019, A&A, Vol. 632, id.474, 11pp.)
still being able to support the overlying envelope. This limit is expressed as the ratio of the He core mass
to the total stellar mass and lies between roughly 7% and 15% of the total stellar mass. The helium core
can only remain isothermal until its mass reaches MSC . After that, the core cannot provide enough pressure
force to counteract the gravitational force experienced by the envelope, so it starts to contract. It can be
shown that, under the assumption of an isothermal ideal gas,
$$ M_{\rm SC} \approx 0.37 \left( \frac{\mu_{\rm env}}{\mu_{\rm core}} \right)^{2} M \approx 0.10\ \mathrm{to}\ 0.15\, M . \qquad (10.4) $$
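As a rough numerical illustration of Eq. (10.4), the sketch below evaluates the limiting core mass fraction for one choice of composition. The expression used for the mean molecular weight of a fully ionised gas, µ ≈ 1/(2X + 3Y/4 + Z/2), and the envelope and core compositions are assumptions made for this example only; the function names are hypothetical.

```python
def mean_molecular_weight(X, Y, Z):
    """Mean molecular weight of a fully ionised gas (assumed approximation)."""
    return 1.0 / (2.0 * X + 0.75 * Y + 0.5 * Z)


def schoenberg_chandrasekhar_fraction(mu_env, mu_core):
    """Limiting core mass fraction M_SC / M following Eq. (10.4)."""
    return 0.37 * (mu_env / mu_core) ** 2


# Illustrative compositions (assumed, not taken from the text)
mu_env = mean_molecular_weight(X=0.70, Y=0.28, Z=0.02)   # hydrogen-rich envelope
mu_core = mean_molecular_weight(X=0.00, Y=0.98, Z=0.02)  # hydrogen-exhausted core
q_sc = schoenberg_chandrasekhar_fraction(mu_env, mu_core)

print(f"mu_env = {mu_env:.3f}, mu_core = {mu_core:.3f}, M_SC/M = {q_sc:.3f}")
```

With these assumed compositions the fraction comes out near 0.08, at the lower end of the range quoted in the text; a smaller contrast between µenv and µcore gives a larger limit.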
Stars with an initial mass above 2 to 3 M⊙ (depending on their metallicity) already have a helium core
mass above the Schönberg-Chandrasekhar limit at the TAMS, because their convective core homogeneously
mixed all material entering the core. Their helium core therefore starts to contract on a Helmholtz-Kelvin
time scale immediately after the TAMS.
Stars that have not yet reached the Schönberg-Chandrasekhar limit at the TAMS maintain an inert
isothermal helium core in hydrostatic equilibrium that does not contribute to the luminosity and where the
pressure is partially delivered by degenerate electrons and partially by the ions. For these stars, the stellar
structure consists of a helium core with mass Mcore = q0M , surrounded by a hydrogen-rich envelope with
mass (1 − q0)M , with q0M < MSC . This is displayed schematically in Figure 10.13. The luminosity
of these stars only originates from hydrogen shell burning at the bottom of the envelope. The functions
Figure 10.13: Schematic temperature profile in an equilibrium model with an isothermal inert helium core
with mass q0M < MSC . Hydrogen shell burning occurs in the shaded region, which is located at the bottom
of the stellar envelope. (From Kippenhahn et al. 2012)
that describe the stellar structure can therefore be evaluated separately for the inert helium core and for the
surrounding envelope, and be connected to each other at the boundary.
10.5 Later stages of evolution
The evolution of the star beyond the TAMS depends on the initial stellar mass:
• Stars with a birth mass just above the minimum of ∼ 0.08 M⊙ are fully convective. This implies
that the helium created through the hydrogen burning is constantly and fully mixed in the entire star.
Such stars never reach a sufficiently high temperature to burn helium. They will hence lead to helium
white dwarfs. Currently, such white dwarfs are hypothetical in the sense that none exist yet in our
Universe, since it is still too young to allow for such stars to have evolved past the core-hydrogen
burning stage.
• Stars born with a mass below some 0.5M⊙ will initiate hydrogen-shell burning past the TAMS, while
the helium core contracts. The minimum mass for a star to enable helium burning is, however, some
0.47 M⊙. Hence these stars will die as hydrogen-helium white dwarfs. Also in this case, our present
Universe is too young to allow for such white dwarfs to have formed.
• After central hydrogen burning, the stars with 0.5 M⊙ ≲ M ≲ 2.3 M⊙, where the precise boundary masses
depend on the metallicity, have a degenerate He core. They start the helium burning explosively with
a series of off-centre thermal runaways (collectively called the “helium flash” – see Chapter 12). They
will end as a carbon-oxygen (CO) white dwarf.
• After central helium burning, the stars with intermediate mass 2.3 M⊙ ≲ M ≲ 8 M⊙ have a (partially)
degenerate CO core. The central temperature of the stars with 2.3 M⊙ ≲ M ≲ 6 M⊙ never reaches
8 × 10⁸ K and these stars cannot start carbon burning. They live on, relying on the hydrogen and
helium shell burning. They end as a carbon-rich white dwarf.
Stars with 6 M⊙ ≲ M ≲ 8 M⊙ can ignite a few nuclear reactions of carbon burning and end as an O,
Ne, Mg white dwarf, provided that their CO core does not reach too high a level of degeneracy when
they achieve a central temperature of 8 × 10⁸ K. In stars whose CO core reaches a high
degree of degeneracy, an off-centre thermal runaway may occur when the temperature required for
carbon burning is reached, leading to an explosion similar to the helium flash but now
called the carbon flash. This flash may or may not be fatal for the star, depending on the efficiency of neutrino
cooling, as we will discuss for the helium flash in Chapter 12.
• For stars with mass M ≳ 8 M⊙, the core never contains degenerate matter. Their central temperature
keeps on increasing with every core contraction cycle. They pass through all successive burning cycles
until their core consists of iron. They end their life as a supernova with a neutron star or black hole as
a remnant.
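The mass ranges in the list above can be summarised as a simple lookup. The sketch below is only a compact restatement of that list: the function name is hypothetical, the boundary masses are the approximate values quoted above (which depend on metallicity and, as discussed later, on mass loss), and the special cases mentioned in the text (such as a possibly fatal carbon flash) are ignored.

```python
def end_product(m_birth):
    """Return the end product listed above for a birth mass in solar masses.

    The boundaries (0.08, 0.5, 2.3, 6 and 8 Msun) are the approximate values
    quoted in the list; real boundaries shift with metallicity and mass loss.
    """
    if m_birth < 0.08:
        raise ValueError("below the minimum stellar mass of about 0.08 Msun")
    if m_birth < 0.5:
        return "helium or hydrogen-helium white dwarf"
    if m_birth < 2.3:
        return "CO white dwarf (after a helium flash)"
    if m_birth < 6.0:
        return "carbon-rich white dwarf (no carbon burning)"
    if m_birth < 8.0:
        # Assumes the CO core does not become too degenerate (no fatal carbon flash)
        return "O-Ne-Mg white dwarf"
    return "supernova with a neutron star or black hole remnant"


for mass in (0.3, 1.0, 5.0, 7.0, 10.0):
    print(f"{mass:5.1f} Msun -> {end_product(mass)}")
```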
Figure 10.14: Global picture of the evolution for stars of (X,Z) = (0.7, 0.02) and masses of 1, 2, 3, 5, 7
and 10 M⊙ in the HR diagram (left panel) and in the central temperature versus central density plane (right
panel). The dotted lines are the ZAMS, while the dashed lines in the right panel show the limits for the
various EOS (as in Figure 4.6). The 1 M⊙ curves are characteristic of those of lower-mass stars, i.e., the
central core becomes degenerate after the main sequence and helium will get ignited in a thermal runaway
at the tip of the red giant branch. The 2 M⊙ model will also undergo a He flash but the 3 M⊙ model will
start burning helium quietly under non-degenerate conditions and hit degeneracy only in its CO core formed
from the core helium burning. The 5 M⊙ model is representative of stars undergoing quiet He ignition and
He burning causing a loop in the HR diagram, while never reaching a high enough temperature to ignite
carbon due to electron degeneracy. The 7 M⊙ model does so, but in a state of degeneracy. The 10 M⊙ model will
undergo all burning cycles in the centre without entering the area of electron degeneracy. (Figure courtesy
of Prof. Onno Pols, Lecture Notes on Stellar Structure and Evolution, Radboud University Nijmegen, NL)
Figure 10.14 shows a comprehensive HR diagram (left panel) and the properties of the core physics (right
panel) summarizing some of these various options. In the following chapters, we discuss the stellar evolu-
tionary paths beyond the TAMS in detail.
Chapter 11
Evolution of a star with 8 M⊙ ≲ M ≲ 15 M⊙
11.1 The Hertzsprung gap for stars with M ≳ 2.3 M⊙
We will now first take a look at the further evolution of all stars with M ≳ 2.3 M⊙ consisting of a shrinking
helium core with a mass higher than the Schönberg-Chandrasekhar limit that is surrounded by a hydrogen
envelope with hydrogen burning in a shell. The evolution of the internal structure and the evolutionary track
of a Population I star of 5M⊙ is shown in the HR diagram in Figure 11.1. The different layers in the star
are characterized by their m-value in units of M⊙. Grey areas are convection zones. The red hatched zones
indicate areas with nuclear burning.
The transition from central to shell hydrogen burning takes place at point C. At that moment the ¹H in
the core is exhausted, hence the burning stops and the region is no longer convective. The hydrogen burning
ignites in a relatively broad shell surrounding the He core. This shell becomes thinner as the evolution
progresses, while the helium core will gain more mass and therefore shrinks faster. After point C, the
evolution with a shrinking helium core happens on a contraction time scale.
Due to the core contraction the core is no longer in thermal equilibrium, which means that the time
derivative in the equation of energy conservation can no longer be ignored. The increase in central tempera-
ture is accompanied by an increase of the local energy production (virial theorem!) and, consequently, of the
local luminosity. As a reaction to the shrinking core, the layers above the shell burning hydrogen expand.
The density in the core, while increasing, remains sufficiently low to avoid electron degeneracy, which is
the case for all stars with initial mass above ∼2.3M⊙. For such stars the contraction of the core thus leads
to a local increase in temperature.
When a temperature of 10⁸ K is reached, the central helium burning starts (point D). The star has found
a new energy source in the core, hence its contraction stops. A state of thermal and hydrostatic equilibrium
sets in in the core. The contraction of the core between the points C and D takes about one
Helmholtz-Kelvin time scale (≃ 3 × 10⁶ years for a star with 5 M⊙). During this time interval the outer layers have
Figure 11.1: Left: evolutionary track of a 5 M⊙ star with (X,Z) = (0.70, 0.02) without CBM. Right:
time evolution of the stellar interior in a so-called Kippenhahn diagram, where the evolutionary phases
correspond to those labeled in the left panel. Dark/light grey areas are convective/semiconvective. The red
hatched regions are areas where nuclear fusion is taking place, with dark red zones delivering 5 times more
energy than light red. (Figure courtesy of Prof. Onno Pols, Lecture Notes on Stellar Structure and Evolution,
Radboud University Nijmegen, NL)
expanded and the stellar radius has increased substantially, by about a factor of 25! The star has evolved
into a red giant at point D of the HR diagram. The expansion to a red giant occurs so rapidly that the
probability to catch stars in the transition stage from C to D is low. This is called the Hertzsprung gap in the
HR diagram: it is the area between the main sequence and the red giant branch with a deficiency of observed
stars.
The evolution of a star described above and shown in Figure 11.1 remains qualitatively the same
for all massive stars having a helium core above the Schönberg-Chandrasekhar limit at the end of the main
sequence and whose helium burning starts before electron degeneracy occurs (M ≳ 2.3 M⊙). In this stage
of their lives such stars do not experience a strong stellar wind (M ≲ 15 M⊙). These stars all move in a very
short time towards the area close to their Hayashi track in the HR diagram.
Figure 11.1 represents a stellar model without CBM. However, observational evidence for the occur-
rence of an overshoot zone is piling up from asteroseismology and from eclipsing binary modelling (see
Chapter 14). A pair of evolutionary tracks for stars with 3 and 5 M⊙ is shown in Figure 11.2. One track
(full lines) stands for standard models without CBM and the other track (dashed lines) shows the important
impact of the presence of CBM: it gives the star a more massive He core, a higher luminosity, and a longer
main-sequence duration as it has more fuel available.
Irrespective of the occurrence of CBM or not, the models share a common property for their envelope:
Figure 11.2: HR diagram with two types of evolutionary tracks from the ZAMS to the red giant branch for
models with 3M⊙ and 5 M⊙, this time for (X,Z) = (0.718, 0.014). The full black line concerns standard
models with only convective mixing in the core and no additional CBM, while the dashed red line represents
models with an average level of CBM. (Figure courtesy of Dr. May Gade Pedersen)
its expansion as the star becomes a red giant causes a reduction of the radiated energy (virial theorem,
this time applied to an expansion). This implies that the models show a marked “dip” in luminosity at a
temperature of about log Teff ≃ 3.7 in Figures 11.1 and 11.2. This was also already visible in Figure 7.1.
There is no “analytical” mathematical expression for the evolutionary track towards the red giant branch
once the hydrogen-shell burning is initiated. The stellar evolution models computed from numerical integra-
tion of the system of differential equations that we discussed in Chapter 8 all show this result, irrespective
of the code used. From a physics standpoint we do understand that the star will swell, because the cool
outer stellar layers become convective to transport the energy produced in the shell fusing hydrogen. Aside
from the hydrogen shell burning, also the contraction of the helium core delivers energy, and half of this
energy will have to be transported as well (once more the virial theorem). The temperature gradient in case
of convective energy transport is lower than in the case of radiative transport, causing the temperature to
decrease more slowly going outwards in convective zones compared to the case of the radiative tempera-
ture gradient. To cool down sufficiently up to the stellar surface, the star thus has to expand and its radius
becomes substantially larger. This expansion causes a decrease of temperature in the outer envelope (via
the virial theorem: half of the expansion energy is taken from the luminosity and half of it is used to cool
the star). This way the internal energy in the envelope decreases, together with the luminosity of the star,
despite the strongly increased radius.
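That the luminosity can drop while the radius grows follows directly from the relation L = 4πR²σTeff⁴. The sketch below evaluates the luminosity ratio for a given radius growth factor and change in log Teff. The function name and the numerical values are hypothetical, chosen only to illustrate the scaling; they are not read off the tracks in the figures.

```python
def luminosity_ratio(radius_factor, delta_log_teff):
    """L_after / L_before for L = 4*pi*R^2*sigma*Teff^4.

    radius_factor  : R_after / R_before
    delta_log_teff : log10(Teff_after) - log10(Teff_before)
    """
    return radius_factor ** 2 * 10.0 ** (4.0 * delta_log_teff)


# Hypothetical expansion: the radius grows tenfold while log Teff drops by 0.55 dex
ratio = luminosity_ratio(10.0, -0.55)
print(f"L_after / L_before = {ratio:.2f}")  # smaller than 1: the star gets fainter
```

In this scaling the luminosity decreases whenever Teff falls by more than the square root of the radius growth factor.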
11.2 Helium burning
At the time when central helium burning is about to start, the star of 5 M⊙ without CBM has Mcore ≃ 0.6 M⊙
and is situated in the surroundings of its Hayashi track. It has an extensive outer convective zone with a depth
reaching the position corresponding to m ≈ 0.9M⊙ for the example of the star with 5M⊙ (see Figure 11.1).
This is deeper than the maximum extent of the convective core during the main sequence (≈ 1.25M⊙ at the
ZAMS). The higher the birth mass, the deeper this envelope convective zone penetrates into the layers where
the chemical composition changed due to the CNO burning. This way, convective mixing in the envelope
disperses nuclear reaction products in the envelope and brings them to the surface of the star. This happens
between points D and E and is called the first dredge-up.
The dominant reaction in the central helium burning is 3α → ¹²C. While the abundance of ¹²C increases,
the reaction ¹²C + α → ¹⁶O gradually gains in importance, so that ¹⁶O takes over the lead in chemical
composition from ¹²C. Indeed, in the stage when ⁴He gets exhausted, the destruction of ¹²C in favour of ¹⁶O
is larger than the production of ¹²C by the 3α reaction. Consequently the abundance of ¹²C starts to decrease
again after having reached a maximum.
Depending on whether CBM has been active or not, the stage of central helium burning lasts between
25% (without) and 15% (with average CBM) of the main-sequence duration (22 Myr for the model in Fig-
ure 11.1). At first sight, that seems surprisingly long keeping in mind that the luminosity of the star (i.e.,
the energy consumption) is higher, that the central core where the helium burning is taking place is much
smaller than for hydrogen burning and that the energy gain is less than 10% of that delivered by hydrogen
burning. The reason for this extensive time span is that the largest part of the energy production
is not delivered by the helium burning in this stage, but by the hydrogen shell burning. At point E the
helium burning is responsible for less than 10% of the total energy production. However, this modest energy
production in the core is sufficient to counter the gravitational contraction and to keep the star as a whole
in thermal equilibrium. Towards the end of the helium core burning, both fusions deliver about an equal
contribution to the nuclear energy.
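The statement that helium burning yields less than 10% of the energy of hydrogen burning can be checked per unit mass of fuel from the mass defects of the net reactions 4 ¹H → ⁴He and 3 ⁴He → ¹²C. The sketch below does this with rounded standard atomic masses, which are assumptions brought in for this example (they are not given in the text). Neutrino losses and the additional ¹²C + α → ¹⁶O reaction are ignored.

```python
U_TO_MEV = 931.494  # energy equivalent of one atomic mass unit, in MeV

# Atomic masses in atomic mass units (rounded standard values, assumed here)
M_H1 = 1.00782503
M_HE4 = 4.00260325
M_C12 = 12.0  # exact by definition of the atomic mass unit

q_hydrogen = (4 * M_H1 - M_HE4) * U_TO_MEV       # net 4 1H -> 4He
q_triple_alpha = (3 * M_HE4 - M_C12) * U_TO_MEV  # 3 4He -> 12C

per_nucleon_h = q_hydrogen / 4        # MeV released per nucleon of hydrogen fuel
per_nucleon_he = q_triple_alpha / 12  # MeV released per nucleon of helium fuel
ratio = per_nucleon_he / per_nucleon_h

print(f"H burning : {q_hydrogen:6.2f} MeV, {per_nucleon_h:.2f} MeV per nucleon")
print(f"He burning: {q_triple_alpha:6.2f} MeV, {per_nucleon_he:.2f} MeV per nucleon")
print(f"ratio per unit mass of fuel: {ratio:.1%}")
```

Under these assumptions the ratio comes out at roughly 9%, consistent with the "less than 10%" quoted above.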
After the point E the star has found a new energy balance and the convection in the envelope slowly
retreats: the star moves downwards along its Hayashi track towards F. It subsequently moves to the left in the
HR diagram. The bluest point G corresponds to the time when ∼75% of the central helium burning stage has
passed. At that moment the central helium mass fraction has dropped to about Y ≈ 0.25. Afterwards, the
star again returns to its Hayashi track (towards point H) as its central helium gets more and more depleted.
11.3 Later evolution stages
The core helium burning stops when the entire supply of ⁴He is exhausted and has been converted into ¹²C, ¹⁶O and
²⁰Ne. The precise relative abundances of these produced elements depend on the temperature, the
mass, the initial chemical composition, and the occurrence (or not) of CBM. The burning is now displaced
to a concentric shell that surrounds the CO core (as of point H). While the helium shell keeps on burning,
the CO core gets heavier and contracts. The situation is now similar to the one just before the central helium
burning started.
In this stage of its life, the star has two types of shell burning that produce the necessary energy:
hydrogen shell burning in a shell that is situated at the bottom of the envelope and helium burning in the
shell right above the CO core. The CO core contracts, helium is produced between the two burning shells,
and the outer envelope expands and becomes convective. In the HR diagram the star moves up from point H
to J.
The temperature in the hydrogen shell decreases continuously. Depending on the birth mass it may
become lower than the temperature needed to keep the process of hydrogen burning going. In that case
and at that time, there is only a contracting CO core surrounded by an area above the helium shell where
all layers expand. In that situation, the luminosity rapidly increases as a consequence of the fast increasing
mass of the CO core. Whether a second loop in the HR diagram arises in addition to the one shown in
Figure 11.1, or not, depends on the mass, the initial metallicity, the nuclear burning efficiency, the opacities,
CBM, etc.
In Figure 11.1 we notice that the outer convective zone reaches deeper and deeper into the stellar
interior as the evolution proceeds. At a certain moment this zone contains about 80% of the mass and its
bottom clearly interferes with the area where the hydrogen shell burning in the preceding millions of years
has taken place. In this area all ¹H is transformed into ⁴He and almost all ¹²C and ¹⁶O into ¹⁴N. These nuclei
are transported to the surface by the convective cells during the late evolutionary stages. This is called the
second dredge-up.
11.4 Burning cycles
The evolution scenario described above is fairly complicated, when considering the position in the HR diagram
as it depends on the details of the physical properties of the star as a whole. The evolution process is however
less complicated at the level of the evolution of the stellar core. When we extrapolate the stages of central
hydrogen and helium burning, the central core undergoes subsequent cycles of nuclear fusion that can be
represented schematically as follows:
                      core burning
                    ↗              ↘
      core heating up                fuel exhaustion
                    ↖              ↙
                      core contraction
The burning in a given time frame will gradually use all fuel that is available in the convective core.
The exhausted core will then contract, increasing the central temperature until it is high enough to initiate
the following burning cycle. As long as this scheme is continued, heavier nuclei keep being produced in the
Figure 11.3: Schematic illustration (not to scale!) of the internal “onion structure” of a highly evolved
massive star. A few typical values of the mass, the temperature and the density are given in cgs-units. (From
Kippenhahn et al. 2012)
stellar centre. These new heavier elements are homogeneously mixed by the convection in the core, which
shrinks at the onset of each new cycle: after central hydrogen burning we get an extensive helium core, in
which a smaller CO core is formed by helium burning etc.
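The cyclic scheme described above can be written down as a simple loop. The sketch below is purely schematic: it contains no physics, the function and variable names are hypothetical, and the list of fuels (H, He, C, O, Si) is taken from the burning stages named elsewhere in these notes. It only records the order of events and which fuels are left behind in shells around the core.

```python
FUELS = ["H", "He", "C", "O", "Si"]  # core fuels, in order of ignition


def run_burning_cycles(last_fuel="Si"):
    """List the events of successive core burning cycles up to last_fuel.

    Returns (events, shells), where shells holds the fuels left burning in
    shells around the core, ordered from the outermost shell inwards.
    """
    events, shells = [], []
    for fuel in FUELS[: FUELS.index(last_fuel) + 1]:
        events.append(f"core {fuel} burning")
        events.append(f"{fuel} exhausted in the core")
        shells.append(fuel)  # burning moves to a shell around the new core
        events.append("core contracts and heats up")
    if last_fuel == "Si":
        events.append("cycle stops: iron core, no exothermal fusion left")
    else:
        events.append("cycle stops: next fuel is never ignited")
    return events, shells


events, shells = run_burning_cycles("He")
print("\n".join(events))
print("shells, outermost first:", shells)
```

Calling `run_burning_cycles("He")` mimics a star that never ignites carbon, while the default argument runs through all cycles up to the formation of an iron core.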
Each time the central fuel is exhausted and the burning stops, the next burning cycle in the core cannot
immediately start, but a transition period of shell burning will take place. This shell burning occurs in the
hottest layer still containing fuel at that moment. A burning shell can survive several subsequent central
burning cycles, each of which in turn creates a new shell. Several shell burnings can thus take place simultaneously.
They are separated by mass shells with a different chemical composition, where the occurring
elements are gradually heavier as the shell is situated deeper into the star. This is called the onion model,
and is represented in Figure 11.3. Depending on the temperature differences occurring in the core at every
new cycle, burning can be reactivated in a shell that was no longer active. The burning
cycles after the hydrogen and helium burning in the core are all so short that the chance
to observe a star in this stage of its life is small.
11.5 Explosive versus non-explosive evolution
The scheme above can be interrupted temporarily or definitively. On the one hand, a temporary interruption
can occur when the density in the central core is so high that degeneracy sets in. When the degeneracy
parameter ψ starts to increase, the electron pressure helps to counterbalance gravity so there is less need to
contract as ψ increases. As the mass of the core increases due to the shell burning, contraction continues.
The cycles of core burning will continue as long as electron degeneracy is avoided or can be lifted. For
example, the central core of a star with initial mass lower than ∼ 6 M⊙ will never become sufficiently
hot to start carbon burning. On the other hand, while discussing the nuclear burning mechanisms we have
mentioned that ⁵⁶Fe is the most stable isotope. Therefore, the iterative core burning scheme stops definitively
when the inner core entirely consists of ⁵⁶Fe and exothermal fusion is not possible anymore.
It is obvious that we now have to make a distinction in terms of birth mass for the further evolution of
the star. Whether the mass of a star comes close to the boundary masses (2.3, 6 and 8 M⊙) depends strongly
on the mass loss it undergoes during its evolution. Up to now we did not take into account the effects of
mass loss, but a large mass loss in the form of a strong dust-driven stellar wind does occur at the end of the
stellar evolution for stars with M ≲ 8 M⊙, while stars born with a higher mass experience a radiation-driven
wind during the evolved stages for M ≳ 8 M⊙ and already during the main sequence for M ≳ 15 M⊙. The
influence of mass loss on stellar evolution is a complicated problem. The mass loss of a star with initial
mass below 8 M⊙ is such that a final core mass below the Chandrasekhar limit of 1.44 M⊙ is left, while stars
born with a mass above 8 M⊙ end up with a core mass above the Chandrasekhar limit.
We now have to make a distinction between stars more massive than 8M⊙ at birth and stars born with
a lower mass to describe the further evolution. In this chapter, we discuss the further evolution of a star with
an initial mass higher than 8 M⊙ but lower than 15 M⊙, i.e., we consider stars that are left with a core mass
higher than 1.44 M⊙ at the end of the various burning cycles. The evolution of stars with M ≲ 8 M⊙ and
M ≳ 15 M⊙ will be treated in subsequent chapters.
11.6 Neutron stars
11.6.1 Supernova explosion
For stars with 8 M⊙ ≲ M ≲ 15 M⊙ the CO core is not degenerate after helium burning. During the
contraction following the central helium burning, the central temperature increases sufficiently to induce
carbon, oxygen, and silicon burning in succession. These final cycles elapse very fast. For a star with 15 M⊙,
carbon burning produces enough energy for about 5 000 years, oxygen burning for about 1.7 years, and
silicon burning for just a few days! The end of the silicon burning stage, which mainly produces ⁵⁶Ni,
entails a serious problem for the star: it is no longer able to generate energy through nuclear reactions in the
core and to balance the gravitational force.
These stars thus complete the whole burning cycle until they have built up an Fe core. As mentioned
above, the stable state inevitably comes to an end: gravity is the winning force and the core collapses
very quickly. With the collapse of the core, the material in the envelope is accelerated to a velocity that can
amount to half the speed of light. This is the consequence of the enormous gravitational force by which the
particles of the collapsing core attract those in the envelope. These accelerated envelope particles suddenly
come to a stop when they collide with the very dense core of the star: their kinetic energy is converted
into heat and a strong temperature increase occurs. The temperature of the core of the star rises up to
T > 10¹⁰ K. This time the increased energy does not result in the start of a new burning cycle. On the
contrary, the increase of the temperature implies that the photons get a higher energy and consequently the
photo-dissociation of the nuclei dominates. Because of this, the heavy nuclei that were formed during the
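last burning cycle are broken up. First the elements of the iron group are transformed into α particles:
$$ ^{56}\mathrm{Ni} + \gamma \rightarrow 14\,^{4}\mathrm{He}, \qquad ^{54}\mathrm{Fe} + \gamma \rightarrow 13\,^{4}\mathrm{He} + 2n, \qquad ^{56}\mathrm{Fe} + \gamma \rightarrow 13\,^{4}\mathrm{He} + 4n, \ \ldots \qquad (11.1) $$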
While energy was generated during the building up of these heavy isotopes, the process of dissociation costs
energy, i.e., these are endothermal reactions. The required energy is provided by the contraction of the core.
The resulting increase in temperature subsequently implies also a photo-dissociation of each α particle:
$$ ^{4}\mathrm{He} + \gamma \rightarrow 2\,^{1}\mathrm{H} + 2n, \qquad (11.2) $$
again requiring energy and thus accelerating the contraction even more. At this point, the whole sequence
of nucleosynthesis to build up the chemistry deep inside the star gets undone in less than a second. . .
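The photo-dissociation reactions (11.1) and (11.2) can be checked for conservation of charge and baryon number, and their energy cost can be estimated from mass differences. The sketch below does both. The atomic masses are rounded standard values introduced as assumptions for this example, and the dictionary and function names are hypothetical.

```python
U_TO_MEV = 931.494  # energy equivalent of one atomic mass unit, in MeV

# (Z, A) = (proton number, mass number) of each species
SPECIES = {"56Ni": (28, 56), "54Fe": (26, 54), "56Fe": (26, 56),
           "4He": (2, 4), "1H": (1, 1), "n": (0, 1)}

# Products of the reactions in Eqs. (11.1) and (11.2)
REACTIONS = {
    "56Ni": {"4He": 14},
    "54Fe": {"4He": 13, "n": 2},
    "56Fe": {"4He": 13, "n": 4},
    "4He": {"1H": 2, "n": 2},
}


def is_balanced(target, products):
    """Check that proton number and mass number are conserved."""
    z_out = sum(count * SPECIES[name][0] for name, count in products.items())
    a_out = sum(count * SPECIES[name][1] for name, count in products.items())
    return SPECIES[target] == (z_out, a_out)


# Atomic masses in atomic mass units (rounded standard values, assumed here)
M_FE56, M_HE4, M_H1, M_N = 55.93493633, 4.00260325, 1.00782503, 1.00866492

cost_fe56 = (13 * M_HE4 + 4 * M_N - M_FE56) * U_TO_MEV  # 56Fe -> 13 4He + 4n
cost_alpha = (2 * M_H1 + 2 * M_N - M_HE4) * U_TO_MEV    # 4He -> 2 1H + 2n

for target, products in REACTIONS.items():
    print(target, "balanced:", is_balanced(target, products))
print(f"energy absorbed per 56Fe nucleus: {cost_fe56:.1f} MeV")
print(f"energy absorbed per alpha particle: {cost_alpha:.1f} MeV")
```

Both costs are positive, which is the sense in which these reactions are endothermal: under these assumptions roughly 124 MeV per ⁵⁶Fe nucleus and roughly 28 MeV per α particle have to be supplied by the contraction of the core.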
The photo-dissociation results in a mixture of protons, electrons and neutrons. This results in a drastic
increase of the core density and consequently the electrons and protons are forced to combine to form
neutrons. The density becomes so high that the neutrons subsequently collide with one another. The drastic
increase in pressure results in a shock wave that propagates through the outer layers of the star surrounding
the core full of neutrons. Part of the energy of the shock wave gets dumped in the remains of the core of
the star. Another part gets diverted as neutrinos. Because of the high densities, large quantities of neutrinos
get caught by the outer layers of the star. The result of this dumping of neutrino-energy is that the layers
surrounding the core get expelled: the star explodes as a supernova and temporarily becomes as bright as a
galaxy! The table represented in Figure 11.4 lists some quantitative values of the nuclear burning properties
of a 15 M⊙ star from birth until the supernova explosion.
Supernovae are classified as Type I or Type II according to observational characteristics. A
supernova is of Type II when hydrogen lines occur in the spectrum and of Type I when such
lines are absent. Each supernova, however, has its own characteristic shape of the light curve, and many
subclasses have been introduced so far. Type II supernovae are not observed in old stellar populations, like
the elliptical galaxies, but are observed in galaxies with spiral arms rich in gas and dust. Type I supernovae are
observed everywhere and are split into categories Ia and Ib/c. In general Type II supernovae are associated
with the collapse of the iron core of a single massive star; a more appropriate term than Type II is
core-collapse supernova. At the time of implosion, these stars still have hydrogen-rich envelopes, explaining the
detection of hydrogen lines in the spectrum. Because massive stars evolve much faster than low-mass stars,
elliptical galaxies already have their core-collapse supernovae behind them. Type Ia supernovae originate when
a star crosses the Chandrasekhar mass limit. This mainly occurs through accretion in a binary system,
Figure 11.4: Typical values for the central temperature, density, and duration of each of the core burning
cycles of a 15 M⊙ star prior to the formation of a neutron star.
where mass transfer from a star to a white dwarf takes place (see Chapter 14 and the MSc courses Binary Stars
and High Energy Astrophysics). Due to the constant formation of white dwarfs and close binaries in a
given population, Type Ia supernovae are observed in all types of galaxies, i.e., in young as well as in old
populations.
The observational classification in Type I and II supernovae does not always correspond to the astrophysical
interpretation, i.e., the division between whether or not an iron core collapses and expels its outer
layers. Type Ib/c supernovae originate from the explosion of a massive star in spiral galaxies, more particu-
larly a Wolf-Rayet star that has lost all of its hydrogen due to a very strong radiation-driven stellar wind (see
Chapter 13). The spectrum of their exploding cores hardly shows any hydrogen lines, while it does concern
a collapsing iron core.
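The mismatch between the observational classes and the physical mechanisms can be made explicit in a small lookup. The sketch below only restates the scheme described above; the function names and the wording of the origins are hypothetical, and the observational criterion that separates Type Ia from Type Ib/c is not covered here.

```python
def observational_type(has_hydrogen_lines):
    """Observational main class based on hydrogen lines in the spectrum."""
    return "Type II" if has_hydrogen_lines else "Type I"


# Physical origin of each (sub)type, as summarised in the text
ORIGIN = {
    "II": "iron core collapse of a massive star with a hydrogen-rich envelope",
    "Ib/c": "iron core collapse of a massive star that lost its hydrogen",
    "Ia": "star crossing the Chandrasekhar limit, mainly by accretion in a binary",
}


def is_core_collapse(subtype):
    """True when the (sub)type involves a collapsing iron core."""
    return subtype in ("II", "Ib/c")


for subtype, origin in ORIGIN.items():
    main_class = observational_type(has_hydrogen_lines=(subtype == "II"))
    print(f"{subtype:5s} {main_class:8s} core collapse: {is_core_collapse(subtype)!s:5s} ({origin})")
```

The output shows that the observational Type I groups together two different mechanisms: only Type Ia is not a core-collapse event.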
11.6.2 The neutrino flux and the r-process
At the very high densities achieved during the collapse of the stellar core, the electrons very efficiently
come close to the nuclei, where they can transform protons into neutrons. While free neutrons are unstable
particles that decay with a mean lifetime of about 15 minutes in non-degenerate matter, they no longer decay in degenerate matter:
the stellar core becomes a neutron star. The pressure becomes so high that the neutrons become degenerate.
Figure 11.5: Light curve of the supernova that exploded in 1987 in the Large Magellanic Cloud. This
supernova was easily visible by eye in the Southern Hemisphere. Note the long, almost linear decrease
of the brightness during the first months following the explosion and the hump in brightness. The latter
corresponds to the energy production supplied by the decay of 56Co.
This degenerate neutron gas will be able to prevent a further gravitational collapse.
The equation of state for a degenerate neutron gas is not yet well established. Consequently, the upper
limit for the mass of a neutron star cannot be derived precisely. Current estimates of the upper limit are around
2 M⊙. This is only slightly larger than the mass limit for a degenerate electron gas. The observational
determination of the mass of a neutron star is mainly done on the basis of binary stars of which one of the
components is a neutron star (see Chapter 14). However, just as for white dwarfs (cf. Chapter 12), these stars
are subject to binary evolution and may represent a different mass distribution. At any rate, using such a
binary approach, the masses found are compatible with the upper limit of 2 M⊙ (taking into account errors).
A neutron star has a radius of a few tens of km.
A detailed picture of the formation of a neutron star is not available. Models for the equation of
state contain several parameters whose values are not well constrained. The current models predict that the
interior temperature will drop to $10^{8}$ K within about 100 years after formation. This cooling occurs
as a consequence of the strong neutrino flux. This neutrino flux is produced by the capture of electrons.
Indeed, the 56Ni isotopes formed during the silicon burning are unstable with regard to electron capture.
Consequently this isotope decays to a 56Fe isotope as follows:
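$$
\begin{aligned}
{}^{56}\mathrm{Ni} + e^- &\rightarrow {}^{56}\mathrm{Co} + \nu_e, \\
{}^{56}\mathrm{Co} + e^- &\rightarrow {}^{56}\mathrm{Fe} + \nu_e.
\end{aligned}
\tag{11.3}
$$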
The first reaction has a half-life of 6.1 days and the second one of 77 days. This radioactive decay is
Figure 11.6: Schematic representation of the r-process in a (N,Z) diagram. Indicated are the reaction chains
that represent the capture of neutrons, followed by β− decay, as a result of which heavy stable isotopes
originate. (From Wanajo, S., et al., 2004, ApJ, 606, 1057–1069)
responsible for the luminosity observed months after the explosion. All (limited) theoretical models of
neutron star formation predict that high neutrino fluxes already leave the star before the explosion becomes
optically visible. Indeed, for supernova 1987A (for its light curve, see Figure 11.5) in the
Large Magellanic Cloud, 20 neutrinos were measured at the correct energy in two neutrino detectors in
the Northern Hemisphere (Japan and USA), about 6 hours before the discovery of the optical flash. This
number of neutrinos is compatible with the predicted neutrino production according to the nuclear reactions
described above. It is remarkable that the neutrinos first passed through the Earth before being detected,
as the supernova exploded in the Southern Hemisphere. These were the first neutrinos from a supernova
to be measured directly. As such they delivered a very
important and successful test of the until then uncertain calculations of the nuclear reactions, as described
above, during the final stage of a massive star.
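As an illustration of the decay chain (11.3), the following minimal Python sketch evaluates the standard two-step decay (Bateman) solution with the half-lives quoted above. The function names and the normalisation to one unit of 56Ni at t = 0 are assumptions made for this example; the sketch only tracks the number of nuclei and does not model the light curve itself.

```python
import math

# Half-lives quoted in the text (days).
T_HALF_NI = 6.1   # 56Ni -> 56Co
T_HALF_CO = 77.0  # 56Co -> 56Fe

LAMBDA_NI = math.log(2) / T_HALF_NI  # decay constants in 1/day
LAMBDA_CO = math.log(2) / T_HALF_CO


def abundances(t_days, n_ni0=1.0):
    """Return (N_Ni, N_Co, N_Fe) at time t for an initially pure 56Ni sample."""
    n_ni = n_ni0 * math.exp(-LAMBDA_NI * t_days)
    # Bateman solution for the intermediate member of a two-step chain.
    n_co = n_ni0 * LAMBDA_NI / (LAMBDA_CO - LAMBDA_NI) * (
        math.exp(-LAMBDA_NI * t_days) - math.exp(-LAMBDA_CO * t_days)
    )
    # Whatever is neither Ni nor Co has already decayed to stable 56Fe.
    n_fe = n_ni0 - n_ni - n_co
    return n_ni, n_co, n_fe


def decay_rates(t_days, n_ni0=1.0):
    """Return the number of 56Ni and 56Co decays per day at time t."""
    n_ni, n_co, _ = abundances(t_days, n_ni0)
    return LAMBDA_NI * n_ni, LAMBDA_CO * n_co


if __name__ == "__main__":
    for t in (0, 10, 30, 100, 300):
        n_ni, n_co, n_fe = abundances(t)
        print(f"t = {t:3d} d: Ni = {n_ni:.3e}, Co = {n_co:.3e}, Fe = {n_fe:.3e}")
```

With these half-lives, 56Ni is essentially gone after a few weeks. From then on the decay rate is set by 56Co and halves every 77 days, which is the regime relevant for the luminosity observed months after the explosion.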
The 56Fe isotope is, as mentioned before, the most stable isotope in nature. Nevertheless, processes
responsible for the production of elements heavier than this isotope occur, such as the s- and the r-process,
which are abbreviations for “slow” and “rapid neutron capture”. In these processes, neutrons are captured by nuclei.
They can consequently only occur when there is an efficient production of neutrons. A free
neutron is not stable and decays with a half-life of only about 10 minutes. However, since a neutron has no electrical
charge, it can easily reach any nucleus (no Coulomb repulsion) in dense matter. The probability for a nucleus
to capture a neutron depends on the density of the neutrons, the relative velocity of the nucleus and the
neutron, and the mass number. A nucleus with a magic number of neutrons, i.e., an isotope with a closed
neutron shell, is less likely to capture an extra neutron. The s-process mainly happens in AGB stars
(“Asymptotic Giant Branch”, see next chapter), while the r-process happens during supernova explosions.
The thermonuclear reactions that take place before and during the supernova explosion indeed produce
elements heavier than iron because the production of neutrons during the explosion is efficient enough to
produce neutron-rich nuclei past the iron peak in a stable way. We can represent the neutron capture as
follows:
$$
\begin{aligned}
(Z, A) + n &\rightarrow (Z, A+1) + \gamma, \\
(Z, A+1) + n &\rightarrow (Z, A+2) + \gamma, \\
&\;\;\vdots
\end{aligned}
\tag{11.4}
$$
When the consecutive nuclei are unstable, they decay very rapidly through a β− decay:
$$
(Z, A) \rightarrow (Z+1, A) + e^- + \bar{\nu}_e. \tag{11.5}
$$
Herein $\bar{\nu}_e$ represents an antineutrino. Such a decay does not happen if a new neutron gets
captured in the meantime. As such, very heavy nuclei can originate before they get the time to decay. In the r-process (“r”
for “rapid”: the capture of neutrons is quick with regard to the β− decay), the neutron density needs to
be of the order of $10^{22}$ cm$^{-3}$. Consequently, the path of the r-process elements in the (N,Z) diagram (see
Figure 11.6) is located deep in the neutron-rich area, far from the valley of stability. The required large
neutron production can only be realised during supernova explosions.
The matter in the core of the star just before and just after the supernova explosion indeed consists of a
substantial number of neutrons. As a result, the r-process can take place during the cooling down stage after
the explosion. This production is the main source of the heavy elements
we find in nature today.
11.6.3 Pulsars
Neutron stars must spin very fast around their axis. This is a consequence of the conservation of angular
momentum. When collapsing, the dimensions of the star are greatly reduced: the radius shrinks from a
few million km to about 20 km. Consequently, the angular velocity increases by a factor of about $10^{10}$. The
accompanying angular frequency is a few tens per second. During the collapse,
the strength of the magnetic field of the star increases by about the same factor, because the magnetic flux is conserved. Stars with a weak magnetic
field of a few Gauss prior to collapse suddenly get a magnetic field of about $10^{10}$ to $10^{12}$ Gauss.
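The scaling behind these numbers can be checked with a short calculation. The sketch below is only an order-of-magnitude illustration: it assumes a uniformly rotating sphere of fixed mass (moment of inertia proportional to M R², so that ω R² is conserved) and a frozen-in magnetic field (B R² conserved). The function names and the initial radius of 2 million km are choices made for this example.

```python
def spin_up_factor(r_initial_km, r_final_km):
    """Growth factor of the angular velocity when a sphere of fixed mass
    shrinks at constant angular momentum (I * omega = const, I ~ M * R**2)."""
    return (r_initial_km / r_final_km) ** 2


def compressed_field(b_initial_gauss, r_initial_km, r_final_km):
    """Field strength after collapse, assuming magnetic flux conservation
    (B * R**2 = const). This is an idealisation, not a detailed model."""
    return b_initial_gauss * (r_initial_km / r_final_km) ** 2


if __name__ == "__main__":
    r_i, r_f = 2.0e6, 20.0  # km; "a few million km" down to about 20 km
    print(f"spin-up factor: {spin_up_factor(r_i, r_f):.1e}")
    for b in (1.0, 100.0):  # Gauss before collapse
        print(f"B = {b:5.1f} G -> {compressed_field(b, r_i, r_f):.1e} G")
```

Both quantities scale with the square of the radius ratio, which is why the spin-up factor and the amplification of the magnetic field come out at the same order of magnitude.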
Already in 1934, two years after the discovery of the neutron, the astronomers W. Baade and F. Zwicky
predicted, based on theoretical considerations, the existence of neutron stars as the collapsed cores left after supernova explosions.
It took until 1967, however, before the first neutron star was discovered. PhD student
Jocelyn Bell from Cambridge (UK) discovered a source of radio-emission in the sky. This source exhibited
Figure 11.7: Sketch of the model of a pulsar. The radio waves of a pulsar are emitted in two beams, departing
from the two magnetic poles of the neutron star. The star rotates around an axis, which is inclined with
respect to the magnetic axis. The beams of radio-emission rotate with the star, similar to the beams of light of a lighthouse.
When a beam sweeps over the Earth, we observe a pulse of radio-emission. (From quora.com)
very regular pulses of radio waves, reaching Earth at intervals of about one second. Such an object
is called a pulsar. The only stars known in 1967 able to spin around their axis in one second were white
dwarfs (see next chapter) and that was consequently the initial explanation given for the pulsar. In November
1968, however, a pulsar was discovered in the Crab Nebula. It emitted 30 pulses per second (the Crab pulsar).
It was known that the Crab Nebula was the rapidly expanding remnant of a supernova explosion, because
a dazzlingly bright star had appeared at that spot on the 4th of July 1054. This supernova was even visible
during daytime for a couple of weeks. In Japanese, Chinese and Korean chronicles, the appearance of this
“super-bright new” (super-nova) star is elaborately recorded and the evolution of the brightness is tabulated.
Figure 11.8: Pulse profiles of 45 pulsars. Each pulse profile depicts how the intensity of the radio-emission
received from the pulsar varies during one pulse period. Some pulsars have a pulse consisting of only one
peak, others can have two or even three peaks. The pulse periods vary between 0.1 and a few seconds.
(Figure courtesy of Prof. Ed van den Heuvel, University of Amsterdam, NL)
The very short pulse period of the Crab pulsar makes it impossible for this object to be a white dwarf,
because the rotation rate at the surface of the star would exceed the critical rotation rate at which
the centrifugal force balances gravity. As such, it became clear that it must be a fast rotating neutron star. The radio waves
are generated above the strong magnetic poles of the star, in the form of beams (see Figure 11.7). The
neutron star rotates around an axis inclined with regard to the magnetic axis. Consequently, the beams
sweep over the Earth at regular intervals. As for the rotating light beam of a lighthouse, we observe the radio-emission
as regular pulses. The discovery of the Crab pulsar was immensely important: it was realised that
pulsars are neutron stars and moreover that neutron stars are the final product of core-collapse supernovae.
The supervisor of Jocelyn Bell received the Nobel prize in physics for this discovery¹.
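The critical-rotation argument used above for the Crab pulsar can be made quantitative with a rough estimate. The sketch below assumes a rigid sphere for which the equatorial centrifugal acceleration equals gravity at ω²_crit = G M / R³. The masses and radii adopted for the white dwarf and the neutron star are illustrative values chosen for this example, not numbers taken from these notes.

```python
import math

G = 6.674e-11     # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30  # solar mass, kg


def breakup_period(mass_msun, radius_km):
    """Shortest rotation period (s) before the equatorial centrifugal
    acceleration exceeds gravity: omega_crit**2 = G * M / R**3."""
    mass = mass_msun * M_SUN
    radius = radius_km * 1.0e3
    omega_crit = math.sqrt(G * mass / radius**3)
    return 2.0 * math.pi / omega_crit


CRAB_PERIOD = 1.0 / 30.0  # s, from the 30 pulses per second quoted in the text

# Illustrative (assumed) masses and radii.
p_wd = breakup_period(1.0, 5000.0)  # white dwarf of 1 solar mass, 5000 km
p_ns = breakup_period(1.4, 10.0)    # neutron star of 1.4 solar masses, 10 km

if __name__ == "__main__":
    print(f"Crab pulse period:       {CRAB_PERIOD:.4f} s")
    print(f"white dwarf, P_min:      {p_wd:.2f} s")
    print(f"neutron star, P_min:     {p_ns:.2e} s")
```

For these assumed values the white dwarf cannot rotate faster than once every few seconds, whereas the neutron star could spin far faster than the Crab pulsar does, in line with the conclusion drawn above.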
Meanwhile many more pulsars have been found. In Figure 11.8 we show the original pulse profiles of
about fifty pulsars. We notice very different shapes, from narrow and symmetrical to broad and asymmetrical.
The shape of the pulse profile depends on, among other things, the geometry of the magnetic field and
the inclination of the rotation axis with regard to the observer.
¹ Still today, it is considered a major and unacceptable scandal that the (female) PhD student was not included in this prestigious
award; she was, after all, the one who made the discovery. It did not stop Jocelyn from developing a beautiful career in astronomy in the
UK and from continuing to fight for women’s rights in this profession. Today she is still a very active Emerita.
Chapter 12
Evolution of a star with M ≲ 8 M⊙
12.1 Post-main-sequence evolution
In contrast to stars with M ≳ 2.3 M⊙, stars with a lower mass evolve in a qualitatively different way after the
exhaustion of the hydrogen in the central parts. There are several reasons for this. First, these low-mass stars
have no or very small convective cores. Stars with a mass smaller than that of the Sun have no convective
core and consequently produce a helium core with a very low mass in which no mixing takes
place. Because of this there will be a very gradual transition from central to shell hydrogen burning, much
less drastic than when convection mixes the helium core and CBM occurs adjacent to it. The mass taking
part in the hydrogen-shell burning is typically ≲ 0.1 M⊙.
Second, electron degeneracy is important during or immediately after the main sequence (cf. Figure 10.14).
The pressure in the central parts of the star is not only produced by the ions that obey the ideal gas law, but
partially by the degenerate electron gas. Consequently these stars can handle a larger fractional helium mass
before contraction of the core sets in, i.e., their Schönberg-Chandrasekhar limit is at the highest level in the
expression given in Eq. (10.4): MSC is typically near 15% of the birth mass, depending on the chemical
composition and on the degree of degeneracy in the core. Therefore the core of these stars can remain in
equilibrium thanks to the (partially) degenerate electrons in the inert isothermal helium core. There is no
need for a rapid contraction of the core to start up central helium burning and there is no analogue of the
Hertzsprung gap for low-mass stars, also because such stars start much closer to their Hayashi track for their
further evolution.
During the first stage after the hydrogen burning in the core, the hydrogen shell burning starts and the
mass of the core grows at a very slow pace. The temperature of the core remains constant and far below the
temperature needed for helium burning. Therefore, for these stars, the shell burning in between the central
hydrogen and central helium burning stages is a phase on a nuclear time scale, as long as their core has a
mass below their Schönberg-Chandrasekhar limit. Because this phase is slow, we expect to observe many
low-mass stars in this stage of hydrogen shell burning, which is called the subgiant phase.
Figure 12.1: Left: evolutionary track of a 1M⊙ star with (X,Z) = (0.70, 0.02). Right: corresponding
Kippenhahn diagram. The colours and hatchings have the same meaning as in Figure 11.1. (Figure courtesy
of Prof. Onno Pols, Lecture Notes on Stellar Structure and Evolution, Radboud University Nijmegen, NL)
The hydrogen shell burning causes an increase of the mass of the helium core. Consequently the
brightness increases steadily, while the hydrogen-rich envelope located above the hydrogen shell source
expands. At first the brightness only changes slightly while the star moves to the right in the HR diagram.
This movement cannot be maintained very long since the star is located close to its Hayashi track. However,
the star needs to expand its envelope further, to counterbalance the ever deeper reaching convective envelope.
This necessarily leads to a relatively strong increase of the brightness due to the increasing radius. In this
stage, the brightness will increase by a factor of 100 while Mcore keeps on growing. The star climbs up the
red giant branch (RGB), see Figure 12.1.
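To get a feeling for what a factor of 100 in brightness means for the radius, one can use the Stefan-Boltzmann law, L = 4πR²σT_eff⁴. The sketch below is a simple illustration: the function name is chosen for this example, and the default of a constant effective temperature is an idealisation of a star climbing almost vertically along its Hayashi track.

```python
import math


def radius_ratio(luminosity_ratio, teff_ratio=1.0):
    """Return R2/R1 from L = 4 * pi * R**2 * sigma * Teff**4, given the
    ratios L2/L1 and Teff2/Teff1 (the constants cancel in the ratio)."""
    return math.sqrt(luminosity_ratio) / teff_ratio**2


if __name__ == "__main__":
    # Constant effective temperature: the radius grows as the square root of L.
    print(radius_ratio(100.0))
    # Assumed mild cooling by 20% along the ascent: the radius grows even more.
    print(radius_ratio(100.0, teff_ratio=0.8))
```

At constant effective temperature, the hundredfold increase in luminosity corresponds to a tenfold increase in radius; any cooling of the surface during the ascent requires an even larger expansion.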
Figure 12.1 shows the evolutionary track, as well as the inner structure of a star with initial mass 1 M⊙,
born with an initial chemistry represented by the mass fractions (X,Z) = (0.70, 0.02). During the central
hydrogen burning the star first moves up and later on evolves to the right in the HR diagram. Its age is about 9 Gyr
at the bluest point B. Between B and C the central hydrogen burning stops and shell burning continues.
Because of that, the helium core becomes more massive and slowly contracts, while the envelope expands.
This takes about 2 Gyr. The star’s track is located very close to its Hayashi track, implying, as mentioned
before, that the star inevitably needs to climb up the red giant branch towards the point D.
Aside from a slightly different main-sequence evolution, all stars with M ≲ 2.3 M⊙ end up in more
or less the same point D on the RGB. Stars with 1.1 M⊙ ≲ M ≲ 1.5 M⊙ do have a convective core during
core-hydrogen burning but do not reach their Schönberg-Chandrasekhar limit at the TAMS. On the other
hand, stars with 1.5 M⊙ ≲ M ≲ 2.3 M⊙ do reach this limit at the TAMS so they start core contraction and
envelope expansion on a thermal time scale τKH, passing through a small Hertzsprung gap prior to reaching
point D.
The radius of the star, and therefore also its brightness, increases. That the star moves close to the
Hayashi track can also be seen from the inner structure, revealing that the outer convective zone contains
about 75% of the total mass. The convection zone thus reaches layers already containing heavy nuclei
produced by nuclear reactions. The star goes through its first dredge-up. The turbulent convective motions
imply that the processed material gets transported homogeneously throughout the envelope and in particular
to the surface of the star. As a consequence, the surface composition is substantially changed after the first
dredge-up: Li and C abundances as well as the carbon isotopic ratio 12C/13C decline, while 3He and the 14N
abundances increase, providing a marked decrease in the C/N abundance ratio at the surface.
The monotonic increase of the brightness along the RGB above point D gets interrupted
at point E. Indeed, at that stage, the hydrogen burning shell reaches the region that was replenished with fresh
hydrogen at the expense of heavier nuclei by the bottom of the convective envelope at the epoch of the first
dredge-up. This lowered µ at that position compared to what it used to be prior to the first dredge-up. As a
consequence, the luminosity of the star all of a sudden decreases. Depending on the mass and metallicity,
this happens around log(L/L⊙) ∈ [1.5, 2]. The star thus makes a small down- and upward zig-zag along the
RGB. This means it passes through this luminosity range three times, going up, down, and back up again. Due to this we observe an
overpopulation of stars in that part of the RGB, called the “luminosity bump”.
When the outwardly moving hydrogen burning shell reaches the discontinuity in µ created by the deep
convective envelope, it finds a reservoir of 3He, feeding the reaction 3He(3He,2p)4He and creating protons.
This lowers µ locally and hence delivers a negative ∇µ. Indeed, we have
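$$
{}^{3}\mathrm{He} + {}^{3}\mathrm{He} \rightarrow 2\,{}^{1}\mathrm{H} + {}^{4}\mathrm{He}, \quad \text{hence } \mu: 6/6 \rightarrow 6/7.
$$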
On the other hand, the following reactions also take place:
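$$
\left.
\begin{aligned}
{}^{3}\mathrm{He} + {}^{4}\mathrm{He} &\rightarrow {}^{7}\mathrm{Be} + \gamma \\
{}^{7}\mathrm{Be} + e^- &\rightarrow {}^{7}\mathrm{Li} + \nu_e\,(+\gamma) \\
{}^{7}\mathrm{Li} + {}^{1}\mathrm{H} &\rightarrow {}^{4}\mathrm{He} + {}^{4}\mathrm{He}
\end{aligned}
\right\} \quad \text{hence } \mu: 7/6 \rightarrow 8/6,
$$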
implying an increase in µ and thus a positive ∇µ. In the case that the reaction 3He(3He,2p)4He is dominant,
the process of thermohaline mixing becomes active. This represents a slow mixing process acting on the
local thermal (expansion) timescale. The condition for the occurrence of thermohaline mixing is
$$
\frac{\varphi}{\delta}\,\nabla_\mu \le \nabla - \nabla_{\rm ad} \le 0, \tag{12.1}
$$
i.e., it operates in regions that are stable against convection according to the Ledoux criterion and where an
inversion in the mean molecular weight is present. Thermohaline mixing may transport chemical species
between the H burning shell and the convective envelope. The efficiency of the thermohaline mixing is
regulated by the local abundance of 3He and 4He: the second set of reactions depends linearly on the
local abundance of 3He and 4He, while the first reaction depends quadratically on the abundance of 3He.
For increasing initial mass, the second set of reactions becomes more important than the first reaction and
creates a composition barrier because µ increases, while the first reaction leads to a decrease in µ. For this
reason, we find a threshold in mass at about 1.5 M⊙ above which thermohaline mixing is not efficient.
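The changes in µ quoted for the two reaction channels above can be reproduced by simple particle counting. The sketch below assumes fully ionised matter, so that each nucleus contributes its Z free electrons, and takes µ as the total mass number divided by the total number of free particles. The function and dictionary names are chosen for this example. For the second channel it follows the bookkeeping of the text, comparing the 3He + 4He pair with the two 4He nuclei that result.

```python
from fractions import Fraction

# (mass number A, charge Z) of the fully ionised species involved.
SPECIES = {"1H": (1, 1), "3He": (3, 2), "4He": (4, 2)}


def mean_molecular_weight(counts):
    """mu = (total mass number) / (number of free particles), counting each
    nucleus together with its Z free electrons (full ionisation assumed)."""
    mass = sum(n * SPECIES[s][0] for s, n in counts.items())
    particles = sum(n * (1 + SPECIES[s][1]) for s, n in counts.items())
    return Fraction(mass, particles)


# Channel 1: 3He + 3He -> 2 1H + 4He
mu_before_33 = mean_molecular_weight({"3He": 2})           # 6/6
mu_after_33 = mean_molecular_weight({"1H": 2, "4He": 1})   # 6/7

# Channel 2: 3He + 4He -> ... -> 2 4He (bookkeeping as in the text)
mu_before_34 = mean_molecular_weight({"3He": 1, "4He": 1})  # 7/6
mu_after_34 = mean_molecular_weight({"4He": 2})             # 8/6

if __name__ == "__main__":
    print("3He+3He channel:", mu_before_33, "->", mu_after_33)
    print("3He+4He channel:", mu_before_34, "->", mu_after_34)
```

The first channel increases the number of free particles per unit mass and thus lowers µ, while the second channel raises it, which is the competition that decides whether an inversion in µ can develop.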
The evolution calculations of stars born with a different mass lead to similar results as those shown in
Figure 12.1. The main-sequence tracks differ depending on the birth mass M ≲ 2.3 M⊙, but close to
their (slightly different) Hayashi tracks, all these evolutionary tracks come together. For all of these evolved
stellar models, the central parts of the star have become dense enough to treat these parts independently
from the stellar envelope (and thus from the total mass). Stars with a different mass but a similar core mass
Mcore will display the same luminosity and will occupy the same position in the HR diagram.
When MSC is reached along the RGB, the helium core will quickly start to contract, and the upper
stellar layers will expand. The star reaches the tip of the red giant branch in point F. Numerical calculations
demonstrate that the temperature in the core rises strongly (virial theorem!), but not in the very inner part of
the degenerate core. This phenomenon of an off-centre temperature increase has to do with the occurrence
of a temperature inversion in the inner core, due to a neutrino cooling process. Indeed, the contracting core
also implies a contraction of the hydrogen-burning shell that surrounds it and the core mass thus grows
efficiently in this fast contraction phase. In the inner stellar core, the plasmon neutrino cooling process
sets in at that stage, which implies that energetic photons can decay into two neutrinos. The accompanying
neutrino production gives rise to an energy flux that can easily escape the star. This neutrino cooling thus
gives rise to an energy loss from the inner stellar core. The efficiency of the production of plasmon neutrinos
increases with density, and so the temperature profile of the star is inverted locally, in the sense that the core
becomes cooler in the centre than in its surrounding layers. Thus, the temperature for helium ignition is first
reached in a shell surrounding the very inner core.
Typical values for the density and temperature in the off-centre core regions are $10^{6}$ g cm$^{-3}$ and $10^{8}$ K
(see Figure 10.14). That way the temperature necessary for the triple alpha reaction is reached, starting the
helium burning off-centre. This takes place when Mcore ≈ 0.47 M⊙, quite independently of the value of M
on the ZAMS. But the stellar matter in the core is in an advanced state of degeneracy and the helium burning
is unstable in such an environment. Indeed, the energy production by the nuclear fusion is not accompanied
by an increasing outward ion pressure force, since the electrons are mainly responsible for the outward
pressure, independently of the temperature. Thus, the nuclear energy is not used to expand but rather to
increase the temperature. Due to the large temperature dependence of the He burning, a thermal runaway
occurs, which ends the quiet evolution of the star on the RGB in point F.
12.2 The helium flash
The thermal runaway originating from the ignition of helium burning in the degenerate core region has a time
scale of the order of the thermal time scale of the helium burning in the off-centre region. The temperature
increases, while the matter neither expands nor contracts (the pressure is not related to the temperature).
There is no work done and therefore there is an enormous overproduction of nuclear energy. For a few
seconds, the local luminosity l reaches a maximum of about $10^{11}$ L⊙ (the luminosity of a whole galaxy!):
Figure 12.2: Scheme indicating the change in temperature during the helium flash. After the ignition temperature
of helium is reached in the degenerate core, the temperature increases without giving rise to an
increase in pressure or density until the degeneracy is lifted (near the dashed line). Thereafter, a stage of
stable central helium burning in a non-degenerate regime occurs. (From Kippenhahn et al. 2012)
the star is said to experience a helium flash.
Figure 12.2 shows the behaviour of the temperature as a function of the density during the flash. The
increase in temperature at a constant density first results in a removal of the degeneracy and, afterwards,
in an expansion of the region. By removing the degeneracy, the helium burning becomes stable given that
the expansion causes the temperature to reach an equilibrium value. Stable He burning is first reached in a
shell surrounding the inner core, and gradually occurs deeper and deeper in the star until the stage of stable
central helium burning occurs. Thus, after the main off-centre flash, several secondary sub-flashes occur
deeper and deeper inside the star, each of which heats the layer below it. In the end, the nuclear energy
produced in the core is carried away gradually by an expansion and cooling after the He (sub-)flashes until
the degeneracy is completely lifted and the temperature reaches an equilibrium value appropriate for stable
central helium burning.
The path followed by the star in the HR diagram as a consequence of the helium flashes is as follows.
Just before the flash the luminosity reached a maximal value, which was delivered only by the hydrogen shell
burning and the core contraction. During the helium flash the area of hydrogen shell burning becomes so
thin that the shell disappears on a time scale of ≈ $10^{-3}$ year. The energy produced immediately after the
helium flash by the helium burning (first in a shell that gradually moves deeper in, later on centrally) is much
lower than that produced by the hydrogen shell burning before the main flash. Consequently the luminosity
will decrease substantially and the star arrives at point G in Figure 12.1 when its core helium burning occurs
in equilibrium. Once the core He burning happens in equilibrium, the hydrogen shell burning restarts in a
thin shell as well, such that He production occurs again in the envelope.
12.3 Evolution after the helium flash
After the violent stage of the helium flash, a quieter stage of central helium burning in a non-degenerate
environment follows. The star has a luminosity of about 100 L⊙ and is again located close to its Hayashi
track. The star has now arrived on the horizontal branch (see Figure 12.3). The position of arrival of the
star on the horizontal branch depends on its core mass, its envelope mass, and its chemical composition
at that time. Differences in the positions observed thus reflect a difference in mass loss that must have
occurred before the helium flash and/or a difference in metallicity. By analogy with the zero-age main
sequence (ZAMS), this stage is called the ZAHB: “zero-age-horizontal-branch”, although this terminology
is sometimes limited to metal-poor stars as in Figure 12.3. For Population I stars, such as the model in Figure 12.1,
one usually refers to this phase as the core helium burning phase (Point G to H).
For metal-poor stars, the entire ZAHB is covered by stars of about the same core mass but with a
clearly different envelope mass, if we consider the same metallicity. Stars that have suffered the largest
mass loss are located on the left side of the ZAHB, while stars with a lower mass loss occupy the ZAHB on
the right. In practice this discrimination cannot be made easily, because the different observed positions of
the stars on the horizontal branch reflect the evolution of the star during its stay on the horizontal branch,
as well as the different chemical composition. The star makes a loop to the left and back to the right due to
the growing mass of the core caused by the hydrogen shell burning while the helium burns in the core. A
different position on the horizontal branch thus reflects a combination of a different chemical composition
upon arrival on the ZAHB, a different core composition and mass, and a difference in envelope mass. This
situation is depicted separately in a schematic way in Figure 12.3.
Upon arrival on the ZAHB, the star has a homogeneous non-degenerate helium core with a mass
Mcore ≈ 0.47M⊙. The core is surrounded by a hydrogen rich envelope with a mass M − Mcore. The
total luminosity is delivered by the slow central helium burning and the hydrogen shell burning, which
started again in a thin shell after the He (sub-)flash(es). The mass of the helium core increases due to the
shell burning, while the helium burning forms a central convective CO-core within the helium core. Thus,
in fact, shell burning soon occurs in two shells, one surrounding the tiny CO core and one on top of the He
core and below the H-rich envelope. The masses of these shells will grow during the following stage. Thus,
the luminosity increases slowly during the evolution on the horizontal branch, resulting in a non-negligible
extended region in the HR diagram, with the ZAHB as a lower limit.
Cluster stars are born together and have the same initial chemical composition; they will cluster together
on the horizontal branch. For metal-rich (young) clusters, the red giants will be located on
the red (cool) side of the horizontal branch with a relatively low luminosity, because of their larger opacity
(compared to metal-poor stars). This is the reason why the region between Points G and H in Figure 12.1
is called the red clump. This phenomenon is also visible for stars with the same metallicity that do not
belong to a cluster, in particular also for the red giants in the environment of the Sun (see Figure 1.7). The
horizontal branch of the globular cluster M5 shown in Figure 10.6 looks entirely different from the red clump
for stars in our environment. The horizontal branch of M5 is located at high temperatures and luminosity
and is stretched out over a large temperature range. On the one hand this is due to the lower metallicity of the
stars in M5 in comparison to the stars close to us and on the other hand this reflects the age of the cluster,
as a result of which the horizontal-branch evolution already took longer. In general we find that the more
Figure 12.3: The position of metal-poor stellar models on the horizontal branch with a similar helium core
but a different total mass and a different metallicity (indicated as XCNO in the plot). All models have
X = 0.65 in the envelope. The solid line indicates a series of models with a constant XCNO = 0.01 but
with different masses, ranging from 0.6 to 1.25M⊙. The dotted line indicates a series of models with a
constant mass 1.25M⊙ but with a varying chemical composition XCNO ranging from $10^{-5}$ to 0.01. The
dashed line on the left is the main sequence and the one on the right is the Hayashi track for 1.25M⊙. (From
Kippenhahn et al. 2012)
metal-poor the cluster, the higher the temperature and the luminosity of the horizontal-branch stars.
The theoretically determined evolutionary tracks for the horizontal branch are very hard to compare
to observations at a great level of detail due to relatively large observational errors. In any case the tracks
always start on the ZAHB and the star arrives (possibly after a few loops) near its Hayashi track at the
time when the central helium gets exhausted. The phase of core helium burning lasts about 100 to 150 Myr,
depending on the occurrence of CBM surrounding the burning convective helium core or not. The post-
horizontal-branch evolutionary tracks are all located above the horizontal-branch tracks (see Figures 12.1
and 12.4). At the end of the core helium burning, the star is said to have arrived on the asymptotic giant
Figure 12.4: The ZAHB in the HR diagram and evolution tracks representing the HB evolution. The bold
solid line is the ZAHB for models with a helium core of a mass 0.475M⊙ and a hydrogen rich envelope
with X = 0.699, Y = 0.3 of different mass M −Mcore. The total mass is indicated for several models
(bold dots). The subsequent evolution is depicted for the three models by solid lines (slow evolution) and
by dashed lines (fast evolution). The slow evolutionary stages are those of central helium burning combined
with hydrogen shell burning ($10^{7}$ year) on the one hand, and of hydrogen and helium shell burning on the
other hand. In between, a fast stage occurs during the transition from central helium burning to helium shell
burning. (From Kippenhahn et al. 2012)
branch (AGB, see Figure 12.5).
During the evolution on the horizontal branch the stars cross, as indicated in Figure 12.5, the instability
strip, where the Cepheids (indicated by “W”) and the RR Lyrae (RR) stars are found. These stars experience
large-amplitude radial oscillations driven by a heat mechanism similar to a Carnot cycle in thermodynamics.
Therefore they shrink and expand periodically while the spherical symmetry is preserved. For a detailed
description of the observational characteristics of pulsating stars we refer to the course Asteroseismology.
12.4 AGB stars
Up to now we only treated the evolution of a star with a mass lower than 2.3M⊙ in this chapter. For a
further description of the evolution we pick up the stars whose evolution we did not describe until the end
of their life in the previous chapter, i.e., we unite all stars born with a mass below 8M⊙. After the central
helium burning, these stars have all arrived on the asymptotic giant branch, where they undergo two shell
Figure 12.5: The evolution of three stellar models in the HR diagram. After the helium flash the stars end
up on the horizontal branch. Subsequently they evolve to the upper right and arrive at the asymptotic giant
branch. The dashed line indicates the classical instability strip, where the RR Lyrae (RR) stars and Cepheids
(W) are found. (From Kippenhahn et al. 2012)
burnings. A zoom of their Kippenhahn diagram is offered in Figure 12.6, following Figure 11.1. As of the
start of the early AGB phase, all these stars follow a common evolution towards their death. During the early
AGB phase (point H) the He burning shifts from the centre to a shell. The H-burning shell extinguishes, and
point K indicates the time when the second dredge-up occurs. The H-burning shell is re-ignited
some time later at point J. This is the start of the double shell-burning phase, which soon leads to thermal
pulses of the He burning shell. At this stage, the star also undergoes strong mass loss, removing the
stellar envelope in a time frame of about a million years, leaving the degenerate CO core as a cooling white
dwarf. We now discuss these events in a bit more detail.
Figure 12.6: Zoom in on the Kippenhahn diagram of the 5 M⊙ star already shown in Figure 11.1 for the end of
the core helium burning phase and the AGB. (Figure courtesy of Prof. Onno Pols, Lecture Notes on Stellar
Structure and Evolution, Radboud University Nijmegen, NL)
12.4.1 The circumstellar envelope and mass loss during the AGB
During this stage of its life, the star is subject to a considerable mass loss caused by a dust-driven stellar
wind. Even though it is poorly known when the mass loss starts and how strongly it varies during the AGB, it is
clear that it is affected by large-amplitude radial pulsations of the star driven by the H partial ionization zone
via a thermodynamical Carnot cycle (also known as the opacity mechanism). The circumstances are favourable
for a stellar wind to occur, given that the star has meanwhile expanded so much that it has a very large radius
but only very limited mass in its envelope. This envelope makes the star look like a red supergiant. Stars on
the asymptotic giant branch have radii between 100 and 500R⊙ and an effective temperature between 2 200
and 3 500 K. A cartoon is presented in Figure 12.7.
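To illustrate how loosely bound the outer layers of such an extended star are, the short sketch below compares the surface escape velocity of the Sun with that of an AGB star. The AGB mass and radius used (1 M⊙, 300 R⊙) are illustrative assumptions chosen within the radius range quoted above, not values taken from a model.

```python
import math

G = 6.674e-11      # gravitational constant [m^3 kg^-1 s^-2]
M_SUN = 1.989e30   # solar mass [kg]
R_SUN = 6.957e8    # solar radius [m]

def escape_velocity_kms(mass_msun, radius_rsun):
    """Surface escape velocity sqrt(2 G M / R), returned in km/s."""
    mass = mass_msun * M_SUN
    radius = radius_rsun * R_SUN
    return math.sqrt(2.0 * G * mass / radius) / 1.0e3

v_sun = escape_velocity_kms(1.0, 1.0)
# Assumed, illustrative AGB parameters: 1 Msun and 300 Rsun.
v_agb = escape_velocity_kms(1.0, 300.0)
print(f"Sun: {v_sun:.0f} km/s, AGB star: {v_agb:.0f} km/s")
```

At fixed mass the escape velocity scales as R^(-1/2), so a star of a few hundred solar radii retains its surface layers far less firmly than a main-sequence star of the same mass.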
AGB stars thus consist of a small hot CO core that is strongly gravitationally bound and a very large cool
“loose” envelope of which the outer layers are only very weakly gravitationally bound. Because of this,
the AGB stars can easily undergo a substantial mass loss. Due to the pulsations, an extensive circumstellar
Figure 12.7: Cartoon of an AGB star during its thermally pulsing phase. The CO core is degenerate and
very compact, and is surrounded by two burning shells (indicated in red). The convective envelope is very
extensive in size and only loosely bound to the interior. It experiences a strong stellar wind, creating a
circumstellar (CS) envelope. (Figure courtesy of Prof. Onno Pols, Lecture Notes on Stellar Structure and
Evolution, Radboud University Nijmegen, NL)
envelope of gas and dust forms. This envelope can be considered as the third part of the star. The temper-
ature of the star decreases typically from 3 500 K in the envelope to only some 10 K on the outside of the
circumstellar envelope. At such a low temperature, (complex) molecules (among which the OH and the CO
molecule) and dust grains form. The latter determine not only the spectral characteristics of the AGB star
in the infra-red, but also the further evolution of the star. Indeed, the dust in the envelope is very efficient in
absorbing the stellar light due to its large opacity. This environment succeeds in transforming the absorbed
light into outward motion of the molecules. This dust-driven slow wind has an outflow velocity, also named
terminal wind speed, of typically v∞ ∼ 15 km s−1.
The theory of a dust-driven stellar wind is poorly developed, given the mathematical, physical and
chemical complexities at play in such a cool low-density environment, which implies non-equilibrium chem-
istry. The basic idea is that a fraction of the stellar radiation is converted into momentum of the dust particles
and that these then move away from the star. Consequently, one makes the reasonable assumption that the
mass loss is proportional to the luminosity of the star and inversely proportional to M/R, i.e., to the square of the escape velocity. To
predict the mass loss, an empirical correlation is used, calibrated by observations of mass loss in the infra-
red. This formalism was suggested for the first time by Reimers, hence the term Reimers wind. The mass
loss caused in this way amounts, to a good approximation, to
$$\dot{M}_{\rm Reimers} = 10^{-13}\,\frac{L}{L_\odot}\,\frac{R}{R_\odot}\,\frac{M_\odot}{M}\;M_\odot\,{\rm yr}^{-1}. \qquad (12.2)$$
Here the convention is used that the mass loss is a positive quantity expressed in solar masses per year.
Typical values range from $10^{-7}$ M⊙ yr$^{-1}$ to $10^{-4}$ M⊙ yr$^{-1}$. The dust-driven wind thus relies on a mechanism
that can convert the stellar radiation into wind momentum in an efficient way, thanks to the absorption
capacities of the dust particles in the cool circumstellar envelope.
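As a quick numerical illustration of Eq. (12.2), the following minimal sketch evaluates the Reimers rate for one set of stellar parameters. The function name and the example values (L = 5000 L⊙, R = 300 R⊙, M = 1 M⊙) are assumptions made for illustration; the prefactor is the one quoted in the text.

```python
def reimers_mass_loss(lum_lsun, radius_rsun, mass_msun, coeff=1.0e-13):
    """Reimers mass-loss rate of Eq. (12.2) in Msun/yr.

    All inputs are in solar units; coeff is the prefactor quoted in the text.
    """
    return coeff * lum_lsun * radius_rsun / mass_msun

# Illustrative (assumed) AGB parameters.
mdot = reimers_mass_loss(lum_lsun=5000.0, radius_rsun=300.0, mass_msun=1.0)
print(f"Mdot = {mdot:.2e} Msun/yr")
```

For these assumed parameters the rate falls at the lower end of the range of typical values quoted above.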
Meanwhile different alternative formulations to the Reimers treatment for a dust-driven stellar wind
have been proposed. Moreover the Reimers approach was deduced for Population I stars and adaptations are
needed for metal-poor stars. At present, the physical mechanism behind a Reimers wind
is insufficiently understood to include it in stellar evolution calculations in a consistent way, i.e., by abandoning
conservation of mass and replacing it by a proper dust-driven wind theory. On the other hand, we cannot
neglect the mass loss on the AGB to treat the stellar evolution properly. One then proceeds as follows: for
two subsequent time steps along the AGB track, the Reimers formula (or a variant) is computed. In that way
the total mass loss between the two time steps is computed (assuming a constant mass loss), and this amount
of mass is subtracted from the envelope mass the star had during the previous step. Thus one still assumes
that hydrostatic equilibrium and conservation of mass apply for each time step at which the structure models
are computed, but each time the envelope mass is decreased. As the AGB stage is very short compared to the
total life span (typically only about one million years versus billions of years), it is assumed that this mode
of operation is quite a good approximation to predict the further life cycle of the star.
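The step-by-step procedure described above can be sketched as a simple loop. This is only a toy illustration under strong assumptions: the luminosity and radius are held fixed, whereas a real evolution code recomputes the stellar structure at every step, and the function names, step size and stellar parameters are hypothetical.

```python
def reimers_mass_loss(lum_lsun, radius_rsun, mass_msun, coeff=1.0e-13):
    """Reimers mass-loss rate of Eq. (12.2) in Msun/yr (solar units in)."""
    return coeff * lum_lsun * radius_rsun / mass_msun

def strip_envelope(total_mass, core_mass, lum_lsun, radius_rsun, dt_yr, n_steps):
    """Return the total mass after each time step.

    Within a step the mass-loss rate is taken to be constant, and the lost
    mass is subtracted from the envelope of the previous step.
    ASSUMPTION: L and R stay fixed; a real code updates them each step.
    """
    masses = [total_mass]
    for _ in range(n_steps):
        current = masses[-1]
        mdot = reimers_mass_loss(lum_lsun, radius_rsun, current)
        envelope = current - core_mass
        lost = min(mdot * dt_yr, envelope)  # never remove more than the envelope
        masses.append(current - lost)
    return masses

# Illustrative (assumed) values: 100 steps of 10^4 yr, i.e. about one million years.
history = strip_envelope(total_mass=1.0, core_mass=0.6, lum_lsun=5000.0,
                         radius_rsun=300.0, dt_yr=1.0e4, n_steps=100)
print(f"final mass = {history[-1]:.3f} Msun")
```

Because the rate scales as 1/M, the mass loss accelerates as the envelope is removed, even in this fixed-L, fixed-R toy setup.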
In recent decades, accurate observations of bright AGB stars in the infra-red, with a high spatial and
spectral resolution, have become available. These observations revealed that the empirically determined
formalisms of a dust-driven wind fail to explain the details of the outflow. Incorporating a better approach
to describe the mass loss, especially its time dependence, requires the full coupling between the dust-driven
wind mechanism and the dynamical stellar pulsations. This requires taking into account the details of the
radiative energy transport through the low-density cool circumstellar envelope, where molecules and dust
grains of various sorts are present. Moreover the assumption of conservation of mass needs to be abandoned,
and this basic equation needs to be replaced by a good wind description, abandoning the stationary boundary
conditions to solve the differential equations of stellar structure. The uncertainties in the physics of such a
complex non-equilibrium chemical system are still severe. Hence, one opts for the simple description in terms
of the Reimers approach. The improvement of this aspect of stellar evolution for low- and intermediate-mass
stars is an active research domain in stellar astrophysics, in which members of the Institute of Astronomy of
the KU Leuven are active.
12.4.2 Thermal pulses, Hot Bottom Burning and the 3rd dredge-up
The two shells that burn and deliver energy during the AGB are separated by a helium layer. The outer
shell, at the bottom of the hydrogen envelope, burns hydrogen and increases the mass of the helium layer.
When the inner shell, located on top of the CO core, is hot enough it burns helium resulting in a heavier CO
core and a decrease in mass of the helium layer. In principle, both shell burnings could constantly go on
simultaneously in a stable way, provided that both shells grow outwards at the same pace. This does not happen
in reality because of the large difference in the conditions (temperature!) in which hydrogen and helium
burning occur. Consequently both shells produce energy in a cyclic manner and the mass of the helium
intershell changes in a quasi-periodic way. During most of each cycle the hydrogen shell burns, while
the inner helium shell is not hot enough for burning. Consequently the mass of the intershell consisting of
helium increases. When there is no energy source below this helium layer, it will start to shrink. Because of
this it heats up, until its bottom is hot enough to ignite the helium burning. However the helium burning in
this thin shell is not stable as the material at that position already has a high degree of electron degeneracy. A
thermal runaway, termed a helium shell flash, occurs, resulting in a typical energy production of about $10^{8}$ L⊙
and lasting a century. This energy is absorbed by the layers lying on top, so these expand and cool down.
As these layers contain the hydrogen shell burning layer, the latter comes to a halt. During a short period
of time the helium shell burning makes the CO core heavier and gives rise to an intershell convective zone
(ICZ) in the helium layer that is needed to efficiently remove the energy produced. During this helium
shell burning the shell moves outwards and constantly comes closer to the area where the hydrogen burning
stopped. Due to the outward heating the hydrogen shell burning layer ignites again. Because the hydrogen
burning is much less sensitive to the temperature, and provides more energy, the hydrogen can burn again in
a stable way. In the meantime the helium burning stops, because this shell has cooled too much due to its
outward displacement. Hence a new cycle begins.
In that manner, periodic cycles of thermal pulses arise. These occur about every $10^{3}$ to $10^{5}$ years
depending on the mass of the star. The star has arrived on the TP-AGB: “thermally pulsing AGB”. As the
stellar evolution progresses, the thermal pulses gain in strength, while the time span between them decreases.
The number of thermal pulses that occur depends on the initial mass, the metallicity and the mass loss that
the star experiences in this stage. The pulses continue as long as there is enough hydrogen in the outer
envelope to restart the hydrogen burning.
The luminosity and the surface temperature can change substantially with every thermal pulse. The
difference is more pronounced the less mass there is above the two burning shells. Because of the large
changes in luminosity and temperature the star makes fierce movements in the HR diagram. In Figure 12.9
we show the evolutionary tracks of a star with 0.6M⊙ that experiences 11 pulses. Nowadays clues are found
using interferometric observations in the infra-red that the mass loss of some of the TP-AGB stars indeed
changes cyclically, with periods that are compatible with the occurrence of thermal pulses (but also with
shorter timescales of a few hundreds of years that are not yet fully understood).
The helium shell burning that starts during the thermal pulse transforms 4He into 12C and 16O. Through
Figure 12.8: The region around the two burning shells inside an AGB star experiencing two thermal pulses is
shown. Convective regions are shown in grey, where ICZ denotes intershell convection zones caused by the
He-shell flash. The H-exhausted intershell region is indicated by the thin red solid line and the He-exhausted
core mass by the dashed red line. Thick red lines indicate when nuclear burning is active in these shells. The
hatched blue region indicates a shell (“pocket”) rich in 13C formed at the interface of the H-rich envelope
and the C-rich intershell region, following an episode of 3rd dredge-up. (Figure courtesy of Prof. Onno
Pols, Lecture Notes on Stellar Structure and Evolution, Radboud University Nijmegen, NL)
the energy production of this burning a convection zone arises in the intershell in-between the two burning
shells. This convection layer brings the 12C and 16O isotopes to the hydrogen shell burning zone. With each
following cycle of hydrogen shell burning starting when the bottom of the hydrogen layer is hot enough due
to the moving burning helium shell, the 12C and 16O isotopes are transformed into 14N through CNO-type
hydrogen burning. In this phase of stellar evolution, this hydrogen burning is called hot bottom burning
(HBB). HBB thus prevents AGB stars from becoming carbon-rich at their surface.
The luminosity of the AGB star is entirely determined by its core mass and is independent of the mass
of the envelope. To a good approximation, we have
$$\frac{L}{L_\odot} = 6\times10^{4}\left(\frac{M_{\rm core}}{M_\odot} - \frac{1}{2}\right). \qquad (12.3)$$
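A minimal numerical sketch of the core-mass luminosity relation of Eq. (12.3) is given below. The function name and the sample core masses are illustrative assumptions; the relation itself is only the approximation quoted in the text.

```python
def agb_luminosity(core_mass_msun):
    """AGB luminosity in Lsun from the core mass in Msun, following Eq. (12.3)."""
    return 6.0e4 * (core_mass_msun - 0.5)

# Illustrative (assumed) core masses spanning the range of CO cores.
for m_core in (0.6, 0.8, 1.0):
    print(f"M_core = {m_core:.1f} Msun -> L = {agb_luminosity(m_core):.0f} Lsun")
```

The linear dependence shows why a star climbs the AGB as its CO core grows: every additional 0.1 M⊙ of core mass adds about 6000 L⊙.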
In order to remove the luminosity efficiently, the star has a large convective envelope, as is always the case
above a layer with shell burning. The outer convective zone reaches deep enough to transport the products
of the hydrogen and helium shell burning to the stellar surface in the case of AGB stars with a relatively high
birth mass, i.e., M ≳ 4 M⊙. In this way, these stars experience cycles of a so-called third dredge-up. Stars
with M ≲ 4 M⊙ never climb high enough on the AGB to experience these 3rd dredge-up episodes due to
their relatively low core mass, which implies that they have no need for such an extensive outer convective
envelope. They will thus not experience efficient 3rd dredge-ups and the products of nucleosynthesis of
these low-mass AGB stars formed during the two shell burnings will not be revealed at their surface.
12.4.3 The s-process in AGB stars
In between two thermal pulses, 14N is produced through the hydrogen shell burning, and the convective layer
between the two shells transports these 14N isotopes to the helium burning shell during the next pulse. In that
region, the following chain of reactions takes place: 14N(α, γ)18F(β+ν)18O(α, γ)22Ne. For a pulse in a rather
massive AGB star the temperature reaches a value high enough to also burn the 22Ne isotopes in the reaction 22Ne(α, n)25Mg. This reaction produces a neutron. Another, much more efficient
neutron source has been mentioned in the discussion of the carbon burning, but to make this active, sufficient 13C isotopes need to be brought to the helium burning shell. It concerns 12C(p, γ)13N(e+ν)13C(α, n)16O.
The latter reaction is much faster than 22Ne(α, n)25Mg but it does require a proton concentration of about
$10^{-4}$. This can be met when hydrogen-rich material is transported to 12C-rich areas during the pulses, as a
result of which a 13C pocket can be formed (see Figure 12.8).
The neutron sources mentioned above can be strong enough to form elements beyond the iron peak
through the s-process. Neutron capture has been described in the discussion of the r-process. In the s-
process, neutron capture continues until too many neutrons have been captured and the isotope gets too far
outside the stability valley in the (N,Z) domain. The nucleus then experiences a β− decay:
$$(Z, A) \rightarrow (Z+1, A) + e^{-} + \bar{\nu}_e. \qquad (12.4)$$
The path of the s-process is located along the neutron-rich border of the stability valley in the (N,Z) diagram, but less deep into the neutron-rich region than for the r-process (see Figure 11.6). The name “s-process” refers to the fact
that the neutron capture is slow compared to the β− decay, in contrast to the case of the r-process. The
s-process takes place for neutron number densities of the order of $10^{8}$ to $10^{12}$ cm$^{-3}$ and strongly depends on
the metallicity of the star.
Due to the neutron sources discussed above, the s-process takes place in AGB stars that experience
thermal pulses. Typical s-process elements are on the one hand those of the light s-process elements group (i.e.,
the strontium peak). These are the elements with N = 50: Strontium (Sr, Z = 38), Yttrium (Y, Z = 39),
Zirconium (Zr, Z = 40) and Technetium (Tc, Z = 43). On the other hand there are the numerous heavy
s-process elements of the barium peak with N = 82, of which barium (Ba, Z = 56) is the main example. The
s-process product Tc is important as its isotope decays already after about $10^{5}$ years. This implies that the detection
of Tc spectral lines in the spectrum of a star is the most direct diagnostic to prove the AGB character of a
star, since Tc can only be formed from the s-process in the AGB phase. The precise details of the transport
of s-process elements to the surface of AGB stars by dredge-up are not well known. An efficient episode of
Figure 12.9: The evolutionary track after central helium burning of a star with core mass 0.6M⊙ and
chemical composition X = 0.749, Y = 0.25. The track increases along the AGB until thermal pulses
(indicated by bold dots) occur. The change in position in the HR diagram during a pulse is shown for pulses
9 and 10. Before the last pulse, the track has reached the domain of the white dwarfs. The main sequence,
the horizontal branch and the line of constant radius for white dwarfs are also depicted. (From Kippenhahn
et al. 2012)
cyclic 3rd dredge-ups only occurs for the more massive AGB stars because of its strong dependence on the
precise extent and location of the outer convective layer.
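The argument that Tc lines point to recent s-process nucleosynthesis rests on simple exponential decay, which can be sketched as follows. The default half-life of 2.1 × 10^5 years is the laboratory value for 99Tc and is an assumption added here (the text only quotes a decay time of order 10^5 years); the function name is hypothetical.

```python
def surviving_fraction(time_yr, half_life_yr=2.1e5):
    """Fraction of a radioactive isotope left after time_yr.

    ASSUMPTION: the default half-life is that of 99Tc (about 2.1e5 yr),
    a value not given in the text itself.
    """
    return 0.5 ** (time_yr / half_life_yr)

# Compare a few elapsed times, from one AGB lifetime scale up to 10^8 yr.
for t in (1.0e5, 1.0e6, 1.0e7, 1.0e8):
    print(f"t = {t:.0e} yr -> fraction left = {surviving_fraction(t):.3e}")
```

After a few million years essentially no Tc is left, so any Tc seen at the surface must have been produced and dredged up during the AGB phase itself.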
12.5 Post-AGB stars
As mentioned above, the number of thermal pulses experienced by an AGB star during the TP-AGB stage
depends on the birth and core mass, the mass loss and the metallicity of the star. The stars keep climbing the
AGB as their CO core becomes more massive – see Eq. (12.3). Meanwhile the stars lose a large part of their
envelope through the coupling between the pulsations and the dust-driven stellar wind. The thermal pulses
keep returning as long as the hydrogen layer has a mass of ≳ 0.1 M⊙. When it becomes less massive, the
hydrogen burning can no longer continue and consequently the helium layer and the stellar core no longer
grow in mass. A last thermal pulse occurs when the helium layer contracts one last time until it is hot enough
to start the last cycle of helium shell burning.
When the mass of the hydrogen envelope is reduced to ≲ 0.03 M⊙, the pulsations also come to an
end, and as a result the mass loss decreases quickly and stops. The effective temperature of the star starts
to increase when the mass loss stops. This is caused by the disappearance of the outer envelope and the
appearance of the hotter inner layers surrounding the shrinking stellar core. The star leaves the AGB and
starts its post-AGB stage. This takes only about 10 000 years. During this stage the luminosity of the star
remains nearly constant, as it is only determined by the core mass – see Eq. (12.3). The effective temperature,
on the contrary, keeps rising due to the contraction of the core on the one hand and the visibility of hotter
inner regions on the other hand.
The last thermal pulse can still occur during the post-AGB stage, and even somewhat later during the
hot part of the cooling track of white dwarfs. This is due to the short post-AGB phase, while the contraction
of the core still continues. As a result the helium burning can start one last time. Such a last thermal
pulse takes place in about 25% of the post-AGB stars. In that case the star returns fast to the AGB as it
experiences again shell burning. This is called the born-again scenario, where the star very quickly crosses
the HR diagram. Subsequently it returns to the white-dwarf stage in a time interval of typically 200 years.
Depending on the core mass, the star experiences a radiation-driven stellar wind (see next chapter) and
becomes a hydrogen-deficient helium burning star consisting of a CO core surrounded by surface layers
enriched with helium, carbon and oxygen.
Even though the post-AGB stage is short, it is extremely useful for observational tests about the 3rd
dredge-up and the associated mixing in the intershell and outer convection zone. With the pulsations, the
dust envelope also disappears and consequently the products of nucleosynthesis on the surface of the star
can be studied. The chemical analysis of post-AGB stars on the basis of high-resolution spectroscopy is an
active domain in stellar astrophysics in which members of the Institute of Astronomy play a leading role.
This subject receives ample treatment in the Master course Stellar Atmospheres in Leuven.
As the effective temperature rises to a value of about 30 000 K, the circumstellar matter becomes
ionized. The star has become a planetary nebula. Not every post-AGB star becomes a planetary nebula as in
some cases the circumstellar dust shell is already too far away from the star before the effective temperature
surpasses 30 000 K and/or it contains too little mass. The thermal pulses experienced by the AGB star are an
envelope phenomenon which does not affect the CO core. The latter has the characteristics of a white dwarf.
From the fact that there are far more white dwarfs than planetary nebulae, we deduce that the planetary-
nebula stage needs to be much shorter, even when taking into account that not every post-AGB star lights
up as a planetary nebula. The planetary-nebula stage takes about $10^{5}$ years.
At the end of the post-AGB phase, the mass of the CO core is between 0.6 and 1.1M⊙, the higher
masses resulting from stars with a birth mass between 6 and 8M⊙. Because there are many more stars born
with a low mass than with a high mass, we expect the masses of the CO cores to peak around 0.6M⊙. This
is indeed in accordance with the mass distribution observed for white dwarfs.
12.6 White dwarfs
As mentioned in the previous chapter, stars with an intermediate mass of 2.3 M⊙ ≲ M ≲ 8 M⊙ finally
develop a degenerate CO core after the stage of helium burning. The precise mass of this core depends on
the (up to now not yet fully understood) mechanism of mass loss on the AGB. When the core mass of the
post-AGB star, with an initial mass of M ≲ 8 M⊙, is below the Chandrasekhar limit, a fully degenerate star
is left at the end of the evolution: a white dwarf is born after the post-AGB stage. The study of star clusters
confirms that stars with an initial mass ≳ 6 M⊙ can indeed end up as white dwarfs. A number of white
dwarfs have been found in star clusters whose main-sequence turn-off point lies at a birth
mass of about 6 M⊙. In those clusters, stars with M ≲ 6 M⊙ are still located on the main sequence; consequently the
few white dwarfs have to be the end product of stars with an initial mass M ≳ 6 M⊙. These stars clearly
have lost a large amount of mass as AGB stars.
White dwarfs have sizes comparable to that of the Earth and their mass is about $3\times10^{5}$ times larger than
that of the Earth. The white dwarfs are a homogeneous class of stellar remnants. They form a well-defined
sequence in the B−V , MV diagram. The coolest objects detected have a luminosity of about $3\times10^{-5}$ L⊙.
The strong correlation between the luminosity (or MV ) and the effective temperature (or B−V ) shows that
the radii of white dwarfs need to be very similar, i.e., R ≈ 0.01R⊙. From observed values of their surface
gravity, it is deduced that the masses of single white dwarfs are very similar, with a strong peak around
M ≈ 0.6M⊙. For white dwarfs located in a binary system a much larger range in mass has been observed.
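The link between luminosity, effective temperature and radius used in this argument is the Stefan-Boltzmann law, L ∝ R² T_eff⁴. The sketch below, written in solar units, checks that a radius of 0.01 R⊙ and the faintest quoted luminosity give a temperature near the cool end of the white-dwarf range. The solar effective temperature of 5772 K and the function name are assumptions added for this illustration.

```python
T_SUN = 5772.0  # nominal solar effective temperature [K] (assumed reference value)

def effective_temperature(lum_lsun, radius_rsun):
    """Effective temperature in K from L = 4 pi R^2 sigma T^4, in solar units."""
    return T_SUN * lum_lsun ** 0.25 / radius_rsun ** 0.5

# Coolest white dwarfs quoted in the text: L ~ 3e-5 Lsun with R ~ 0.01 Rsun.
t_cool = effective_temperature(3.0e-5, 0.01)
print(f"T_eff = {t_cool:.0f} K")
```

Since all white dwarfs have nearly the same radius, the luminosity is set almost entirely by the temperature, which is why they trace a narrow sequence in the colour-magnitude diagram.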
White dwarfs consist mainly of C and O, with thin outer layers of He and possibly H. The ratios depend
on the efficiency of the helium burning. In general, the more massive white dwarfs contain more carbon.
From spectroscopic observations it is deduced that the composition of the stellar atmosphere can be quite
different. The most frequent are white dwarfs with an atmosphere mainly consisting of hydrogen. These are
DA white dwarfs. 80% of the known white dwarfs are of type DA. There is also a group of white dwarfs
with atmospheres mainly consisting of helium. These are the DB white dwarfs. Their percentage is about
20%. A very small number of white dwarfs has an atmosphere with a special chemical composition and
does not belong to the two main classes. They are divided into different classes depending on the observed
spectral lines of certain chemical elements. The effective temperatures of white dwarfs cover a large interval
from 200 000 K to 4 000 K. The majority of these stars thus have a temperature higher than that of the Sun, hence
the term “white” dwarf.
In a star with a mass smaller than 1.44 M⊙, the degenerate electron gas is capable of counteracting the
enormous gravitational force. The less massive the white dwarf, the more non-degenerate matter still exists
Figure 12.10: Schematic representation of a mass-radius relation of a “classical” white dwarf structure
according to the theory of Chandrasekhar for a fully relativistic degenerate electron gas. The non-relativistic
case is shown for comparison. (Figure courtesy of Prof. Onno Pols, Lecture Notes on Stellar Structure and
Evolution, Radboud University Nijmegen, NL)
in the outer layers. As characteristic for configurations consisting of degenerate matter, the mechanical and
thermal properties are decoupled. On the one hand, the mechanical structure is well described by the pressure
of a degenerate electron gas, for which an expression was derived in Chapter 4.
The non-degenerate ions, which obey the ideal gas law, are on the other hand responsible for the mass of the
white dwarf.
It can be shown that white dwarfs follow a mass-radius relation, i.e. the radius of the white dwarfs only
depends on the mass and not on the temperature. Moreover, from the mass-radius relation it is deduced
that the radius is smaller as the mass is larger, i.e., the mass is inversely proportional to the volume. This
“classical white dwarf structure” is shown in Figure 12.10. On both sides of the mass interval, corrections
are needed as the classical theory deduced by Chandrasekhar does not apply any more. As such, the more
accurately determined Chandrasekhar limit is only 1.44 M⊙.
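The statement that the mass is inversely proportional to the volume corresponds to the scaling R ∝ M^(-1/3) of the non-relativistic case. The sketch below illustrates that scaling only; the normalisation (R = 0.01 R⊙ at M = 0.6 M⊙) is an assumption taken from the typical values quoted earlier, and the relation breaks down near the Chandrasekhar limit, where relativistic corrections are needed.

```python
def wd_radius(mass_msun, ref_mass=0.6, ref_radius=0.01):
    """White-dwarf radius in Rsun from the non-relativistic scaling R ~ M^(-1/3).

    ASSUMPTION: normalised to R = 0.01 Rsun at M = 0.6 Msun; not valid close
    to the Chandrasekhar limit.
    """
    return ref_radius * (mass_msun / ref_mass) ** (-1.0 / 3.0)

for m in (0.4, 0.6, 1.0):
    r = wd_radius(m)
    # M * R^3 is proportional to mass times volume and stays constant.
    print(f"M = {m:.1f} Msun -> R = {r:.4f} Rsun, M*R^3 = {m * r**3:.2e}")
```

The constant product M·R³ in the output is a direct check that more massive white dwarfs are smaller.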
The thermal properties are responsible for the radiation and the further evolution of the white dwarf.
In the deep interior of the white dwarf, the matter is degenerate and the energy transport happens very
efficiently through conduction, in which the degenerate electrons rather than the photons transport the heat. In the outer
layers, the energy transport happens differently. The outer layers contain less degenerate matter and the
energy transport happens through radiation or convection. The outer layers consist of normal gas that acts
as a very effective insulating layer, causing the white dwarf to cool very gradually and slowly. Thus we
have a non-degenerate outer layer in which the temperature is much lower and that insulates the degenerate
isothermal hot core. Because of this, the white dwarf is optically faint but luminous in X-rays since those
wavelengths capture the hot core.
As no nuclear reactions take place any more, the radiation that the white dwarf emits must come from a
different energy reservoir. For white dwarfs the energy needed to account for the luminosity is obtained from
the cooling of the ions: L ∼ T . An extremely small gravitational contraction takes place due to the cooling
as only the ion pressure decreases and not the electron pressure, the latter being the most important of both
by far. Half of the gravitational energy that is released by the contraction supplies the luminosity, the other
half is used to increase the Fermi-energy of the electrons. Finally, the result of this cooling mechanism is
that the white dwarf evolves to form a “black dwarf”: the contraction stops completely and all energy is in the
form of Fermi energy at that point. The typical cooling time for a white dwarf of 1 M⊙ and L/L⊙ = $10^{-3}$
is $10^{9}$ years. The oldest observed white dwarfs have an age comparable to the age of our Milky Way.
Chapter 13
Evolution of a star with M ≳ 15 M⊙
Stars born with a mass above ∼ 15M⊙ evolve differently from those that are less massive. This is because
they are subject to a stellar wind from their birth onwards. Their continuous outflow of hot stellar gas
influences their life cycle between their birth and explosion as a supernova. An example of such a star is
shown in Figure 13.1: it is clear that we cannot properly describe the evolution of this star if we do not take
its mass loss into account. The basic assumption of conservation of mass is not justified for such stars.
13.1 The spectra of hot massive stars with mass loss
The discovery of the expansion of the atmosphere and the mass loss of massive stars by a stellar wind
has mainly been established since the “International Ultraviolet Explorer”, launched in 1978, intensively
observed such stars. Before the ultra-violet (UV) spectrum was available, it was assumed that conservation
of mass was not a bad approximation during the main-sequence stage. However, from the spectral lines in
the UV part of the spectrum of such stars, it became evident that they have a fast expanding atmosphere and
constantly experience mass loss already during the core hydrogen burning phase.
The continuum radiation in the stellar wind of hot stars is dominated by scattering processes, as the
densities of the wind are usually low. More specifically, it concerns the scattering of photons by free electrons,
which is wavelength-independent. In addition, photon scattering by resonance lines of ions moving in the
wind occurs efficiently. The spectral lines formed in the stellar wind can be easily distinguished from those
formed in the photosphere due to their large broadening and large displacement with regard to the rest wave-
length. In general, line-profile shapes depend on the efficiency of the creation, scattering and destruction of
photons. Spectral lines can therefore occur as absorption lines, emission lines or as a combination of both.
When an ion in a stellar wind collides with an electron, it can use this electron to recombine. The
most probable collision recombination is the one to the ground state of the ion. The ion can, however, also
recombine to an excited state and subsequently descend in the energy-level-diagram through a sequence of
Figure 13.1: Image made with the Hubble Space Telescope of the Luminous Blue Variable η Carinae. (From
hubblesite.org)
radiative de-excitations. In this case each de-excitation is accompanied by the emission of a photon. This
process therefore produces a considerable number of photons in the stellar wind. Lines belonging to specific
electron transitions, that have a high probability to be fed by collisional recombination followed by radiative
de-excitation, can display in this way a clear surplus of radiation: these spectral lines occur in emission. The
process of photon creation is responsible for the Hα emission and the infra-red emission lines in the wind
of hot stars. Photon creation requires much larger densities than photon scattering. It therefore only takes
place in the most dense parts of the stellar wind, not far from the stellar photosphere. Photon destruction
hardly occurs in the stellar wind. Most ions occur in their ground state. After radiative excitation, the decay
time for radiative de-excitation is very short. Moreover, the particle density in the stellar wind is very low.
Consequently, the ion will not be able to experience a collisional de-excitation in time. The absorbed photon
will therefore experience photon scattering and will not be annihilated by photon destruction.
When the spectral line consists of both an absorption and an emission component, it is called a P Cygni
profile, as was first observed for the supergiant P Cygni. Most of the P Cygni profiles are formed by resonant
scattering. Examples of P Cygni profiles are shown in Figure 13.2 for the stars ζ Pup and τ Sco. Compare
these spectral lines with those of stars without substantial mass loss, as shown in Figures 1.1 and 1.4. It
is evident that the profiles of the lines in Figure 13.2 look different. The formation and interpretation of a
P Cygni profile can be understood qualitatively as follows. We consider a simple model of a spherical wind
where the velocity increases with the distance away from the star (see left panel of Figure 13.3). An observer
recognises four areas that contribute to the formation of a spectral line:
1. the STAR, emitting continuum radiation with a possible photospheric absorption component at the
wavelength λ0 of the spectral line,
2. the cylinder F in front of the stellar disc. The gas in F moves towards the observer with a velocity
between v ≃ 0 and v∞,
3. the cylinder O, located behind the star and occulted by it. The gas in O moves away from the
observer, but the radiation from this area does not reach the observer,
4. the areas H around the star, which the observer would see as a “halo” around the star if the wind
could be spatially resolved. The gas in H has negative as well as positive velocity components with
regard to the observer.
Figure 13.2: Observed P Cygni profiles of the N V doublet (upper panels) and the O VI doublet (lower
panels) in the UV spectrum of the massive stars ζ Pup (O4 supergiant) and τ Sco (B0 dwarf). The rest
wavelength is indicated by the arrows. The doublet lines merge into one strong P Cygni profile in the case of ζ Pup.
They are observed separately for τ Sco. The spectrum of τ Sco also displays many narrow photospheric
absorption lines. The profiles of both stars reach large negative velocities, pointing to matter outflow in the
direction of the observer. (From Lamers & Cassinelli 1999)
Figure 13.3: Left panel: geometry of a spherically symmetric stellar wind with increasing outward velocity.
The observer discriminates four areas STAR, F, O, H (see text for explanation). Right panel: the contribution
of the star (the continuum flux), the absorption by F and the emission by H. The P Cygni profile covers the
interval [−v∞, v∞] in velocity and is the sum of these three contributions. (From Lamers & Cassinelli 1999)
In the right panel of Figure 13.3 the contributions from the four different areas to the formation of the spectral
line are shown. The STAR provides continuum radiation with a photospheric absorption line. The area F in
front of the star scatters the photons that leave the star and consequently some of them disappear from the
line-of-sight. These photons would reach the observer if there were no stellar wind. The removal of stellar
photons by the accelerated wind particles results in a blue-shifted absorption component with a Doppler shift
between −v∞ and 0 km s−1. For optically thin lines the absorption component does not reach a flux equal
to 0, as photons are also scattered into the line of sight towards the observer from area F. For optically
thick lines the flux can be fully blocked from the observer. The halo H scatters radiation coming from the
stellar photosphere in all directions. Part of that radiation moves in the direction of the observer. This part
induces an emission component with a Doppler shift between −v∞ and v∞, with the biggest contribution
centred at velocity 0 km s−1. The net result of all these contributions is obtained by summing them up.
The P Cygni profiles of resonance lines of hot stars occur in the UV part of the spectrum. The
conclusion that these stars lose mass through a stellar wind therefore had to wait until this part of the
electromagnetic spectrum could be observed, which requires a space mission. The detection of a P Cygni
profile immediately allows one to deduce the maximum velocity occurring in the wind, denoted as v∞.
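As a minimal sketch of how v∞ follows from a P Cygni profile, the snippet below converts the wavelength of the blue edge of the absorption trough into a velocity with the non-relativistic Doppler formula. The function name and the example wavelengths are hypothetical illustrations, not measurements taken from Figure 13.2.

```python
C_KM_S = 299_792.458  # speed of light in km/s

def terminal_velocity_km_s(lambda_rest, lambda_blue_edge):
    """Non-relativistic Doppler velocity (km/s) of the blue edge of the absorption trough.

    Both wavelengths must be given in the same unit (e.g. Angstrom).
    """
    return C_KM_S * (lambda_rest - lambda_blue_edge) / lambda_rest

# Hypothetical measurement: rest wavelength 1550 A, blue absorption edge at 1538 A.
v_inf = terminal_velocity_km_s(1550.0, 1538.0)
print(f"v_inf is roughly {v_inf:.0f} km/s")
```

The non-relativistic formula is adequate here because the terminal velocities of hot-star winds are of order 1% of the speed of light.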
13.2 Basic characteristics of radiation-driven stellar winds
The stellar winds of massive stars are caused by a different mechanism than the dust-driven winds during the
AGB phase. Hot stars have a radiation-driven wind, also termed a line-driven wind. The mechanism
of such a stellar wind is well understood and can be derived mathematically. Due to the limited time, this
derivation will not be discussed in this course. We will summarise a few of the concepts and results.
The two most important parameters that describe the stellar wind of massive stars and that can be
derived from observations are the mass-loss rate Ṁ and the terminal velocity of the stellar wind, v∞. The
terminal wind velocities of hot stars have values up to 3000 km s−1 (this is 1% of the speed of light!) and
are thus of a completely different order than the slow dust-driven winds. For hot stars with M ≳ 15 M⊙
the mass loss is important from their birth onwards as it influences the stellar evolution (think about the
mass-luminosity relation and the main-sequence life time!).
Each photon created in the core of the star by nuclear reactions has an energy hν and a momentum
hν/c. The total momentum loss per unit of time that the star encounters by emitting photons at its surface is
given by L/c = 4πR²F/c, with F the stellar flux. The corresponding mass loss per unit of time is L/c².
Conversely, the loss of momentum per unit of time due to a stellar wind is given by Ṁv∞. Observationally,
it is found that Ṁv∞ ≃ L/c. This means that an efficient mechanism must be at work in the stellar wind.
Apparently, this mechanism is capable of absorbing almost all photons that leave the star and transferring
their momentum to the wind ions.
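The comparison between wind momentum and photon momentum can be made concrete with a short calculation. The sketch below computes the ratio Ṁv∞ / (L/c) in cgs units; the function name, the rounded physical constants and the example stellar parameters are assumptions chosen for illustration only.

```python
C = 2.998e10       # speed of light in cm/s
L_SUN = 3.828e33   # solar luminosity in erg/s
M_SUN = 1.989e33   # solar mass in g
YEAR = 3.156e7     # one year in s

def wind_momentum_ratio(mdot_msun_yr, v_inf_km_s, lum_lsun):
    """Ratio of the wind momentum rate (Mdot * v_inf) to the photon momentum rate (L / c)."""
    mdot_g_s = mdot_msun_yr * M_SUN / YEAR
    v_inf_cm_s = v_inf_km_s * 1.0e5
    return mdot_g_s * v_inf_cm_s / (lum_lsun * L_SUN / C)

# Hypothetical O-type supergiant: Mdot = 2e-6 Msun/yr, v_inf = 2000 km/s, L = 5e5 Lsun.
eta = wind_momentum_ratio(2.0e-6, 2000.0, 5.0e5)
print(f"Mdot * v_inf / (L/c) = {eta:.2f}")
```

A ratio of order unity for such input values illustrates the statement that the wind carries away a momentum comparable to that of the stellar radiation field.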
The gas that escapes from the star into the interstellar medium has a certain kinetic energy. The amount
of kinetic energy that the stellar wind transmits to the interstellar medium per unit of time is ½ Ṁ v∞². To
know the effect of the stellar wind on the environment, we need to derive values for Ṁ and v∞. We also
need these values to evaluate the effect of the mass lost due to the wind on the evolution of the star. For a
star that experiences a stationary spherically symmetric wind, the mass-loss rate at a certain point in the wind is
related to the density and the velocity at that point:
$$\dot{M} = 4\pi r^2 \rho(r)\, v(r), \qquad (13.1)$$
where r is the distance from the point in the wind to the centre of the star, and ρ and v are the density
and velocity of the wind at that certain point, respectively. Equation (13.1) expresses that no material is
destroyed or created in the wind and consequently it is always the same amount of gas that passes through
a sphere at a distance r from the star. This equation for a stationary wind and its accompanying mass loss
replace the equation for the conservation of mass in the system Eqs (8.1) of the stellar structure equations.
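A short sketch of Eq. (13.1) in use: given a mass-loss rate and the velocity at some radius, the continuity equation fixes the local wind density, and the terminal velocity fixes the kinetic power delivered to the interstellar medium. Function names and the example numbers are hypothetical; the constants are rounded cgs values.

```python
import math

M_SUN = 1.989e33   # solar mass in g
YEAR = 3.156e7     # one year in s
R_SUN = 6.957e10   # solar radius in cm

def wind_density(mdot_g_s, r_cm, v_cm_s):
    """Density (g/cm^3) from the stationary continuity equation Mdot = 4 pi r^2 rho v."""
    return mdot_g_s / (4.0 * math.pi * r_cm**2 * v_cm_s)

def wind_kinetic_power(mdot_g_s, v_inf_cm_s):
    """Kinetic energy transmitted to the surroundings per unit time (erg/s): 0.5 * Mdot * v_inf^2."""
    return 0.5 * mdot_g_s * v_inf_cm_s**2

# Hypothetical wind: Mdot = 1e-6 Msun/yr, evaluated at r = 2R for a star with R = 20 Rsun,
# where the local velocity is taken to be 1000 km/s and the terminal velocity 2000 km/s.
mdot = 1.0e-6 * M_SUN / YEAR
r = 2.0 * 20.0 * R_SUN
v = 1.0e8
rho = wind_density(mdot, r, v)
power = wind_kinetic_power(mdot, 2.0e8)
print(f"rho = {rho:.2e} g/cm^3, kinetic power = {power:.2e} erg/s")
```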
The gas that escapes from the outer stellar layers is accelerated. It has a low radial velocity, typically
of order 10 km s−1, at the stellar photosphere and is accelerated up to a high velocity at a large distance
away from the star. At a very large distance from the stellar centre, the velocity experienced by a particle in
the stellar wind approaches asymptotically the terminal wind velocity: v∞ = v(r → ∞). The distribution
of the velocity in the stellar wind as a function of the radial distance r to the stellar centre is called the
wind velocity-law v(r). The observations of stellar winds give rise to a velocity described by a so-called
βwind-law:
$$v(r) \simeq v_0 + (v_\infty - v_0)\left(1-\frac{R}{r}\right)^{\beta_{\rm wind}}. \qquad (13.2)$$
This velocity-law describes a general increase in v with the radial distance, with v0 the velocity at the
photosphere r = R and v∞ the terminal wind velocity at large distance, with v0 ≪ v∞. The wind velocity
is thus assumed to be time-independent. The parameter βwind describes how steep the velocity-law is. Hot
stars have a velocity-law that is quite well described by βwind ≃ 0.8. Particles in these winds thus experience
a large acceleration and reach a velocity equal to 80% of the terminal velocity at a distance ≃ 4R, i.e., at
≃ 3R beyond the stellar surface.
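The numbers quoted above can be checked with a few lines of code. This is a minimal sketch of Eq. (13.2) with the radius expressed in units of R; the function name and default arguments are illustrative choices, and v0 is set to zero since v0 ≪ v∞.

```python
def beta_law_velocity(r_over_R, v_inf, v0=0.0, beta=0.8):
    """Wind velocity from the beta-law, Eq. (13.2), with the radius given in units of R."""
    if r_over_R < 1.0:
        raise ValueError("the beta-law is only defined for r >= R")
    return v0 + (v_inf - v0) * (1.0 - 1.0 / r_over_R) ** beta

# Fraction of the terminal velocity reached at r = 4R for beta = 0.8.
fraction = beta_law_velocity(4.0, v_inf=1.0)
print(f"v(4R) / v_inf = {fraction:.3f}")
```

For comparison, calling the same function with beta=0.5 gives the square-root law of a point-source CAK wind.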
13.3 Mass loss and terminal wind speed
We study the hydrodynamics of the stellar wind with the goal to derive an expression for the mass loss and
the terminal velocity of the wind. The assumption of hydrostatic equilibrium is no longer justified. The
equation of motion for a spherically symmetric configuration has the general form of Eq. (3.22) that we
rewrite here as:
$$\frac{1}{4\pi r^2}\frac{\partial^2 r}{\partial t^2} = -\frac{\partial P}{\partial m} - \frac{Gm}{4\pi r^4}.$$
Herein P is the total pressure: P = Pgas + Prad. The outward directed acceleration due to the radiation
pressure thus reduces the effect of the inwardly directed gravitational acceleration. We can write the balance
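of the forces on the right-hand side of the equation above differently:
$$\frac{1}{4\pi r^2}\frac{\partial^2 r}{\partial t^2} = -\frac{\partial P_{\rm gas}}{\partial m} + \frac{g_{\rm grav}+g_{\rm rad}}{4\pi r^2}, \qquad (13.3)$$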
where ggrav represents the gravitational acceleration and grad the radiative acceleration. We have to deter-
mine grad and solve the equation.
The stellar wind of massive hot stars is driven by the scattering of photons by ions. The ions are
responsible for the spectral lines and therefore the wind is called a line-driven wind. The computation of
grad hence requires a study of the radiative transport in the outer stellar atmosphere. Here, we only consider
the result (the derivation requires the calculation of a complicated integral of the radiation pressure, defined
as the radiation flux per unit of surface). In analogy to the force exerted by the gradient of the gas pressure,
we find the force exerted by the gradient of the radiation pressure at position r in the stellar wind:
$$g_{\rm rad}(r) = \frac{1}{c}\int_0^\infty \kappa_\nu(r)\,F_\nu(r)\,d\nu. \qquad (13.4)$$
Hot stars radiate most of their energy at UV wavelengths (see Figure 13.4). In this wavelength interval,
the atmospheres of such stars have numerous absorption lines. The opacity of the absorption lines is much
higher in the UV than the one of the continuum radiation. The opacity of one strong absorption line, for
example the C IV resonance line at 1550 Å, can easily be a million times higher than the opacity of electron
scattering.
Figure 13.4: The fraction of the stellar radiation from hot stars absorbed in the UV by the stellar wind is
indicated by the shaded parts for stars of different effective temperatures. (From Abbott, 1982, Astrophysical
Journal, Volume 259, p.282)
The large radiation pressure experienced by the ions due to the absorption lines would not be an efficient
driving mechanism for the mass loss without the Doppler effect. In a static atmosphere with a strong line
absorption, the photons radiated outwards will be absorbed or scattered by an ion in the atmosphere if they
have the appropriate wavelength. These photons are intercepted in the lower layers, so the ions in the outer
layers see few “suitable” photons, i.e., photons with the appropriate wavelength to make a line transition,
passing by. Thus the radiative acceleration grad in the outer layers of the atmosphere due to line absorption
will be weak. When the outer atmosphere is dynamical, however, there is a velocity gradient causing the ions
in the outer layers to see the photons Doppler-shifted to the red by an amount covering the velocity range
[−v∞,+v∞]. Thanks to this Doppler shift, the ions can “use” many more
photons to form spectral lines. Consequently, the ions can absorb the radiation coming from the photosphere
very efficiently. This is the basis for the efficient driving mechanism causing the stellar wind in hot stars.
13.3.1 Thomson scattering in the stellar wind
First we only consider the radiative acceleration grad caused by the continuum opacity from a point source
of radiation as if all photons come from the core of the star. For the continuum radiation it concerns photons
that are scattered by free electrons in the stellar wind. It is called Thomson scattering and it is independent
of the frequency of the photon. The frequency-independent effective cross section of one electron is
$$\kappa_T = \frac{8\pi}{3}\left(\frac{e^2}{m_e c^2}\right)^2, \qquad (13.5)$$
where me represents the mass of the electron and e its charge. The opacity caused by the electron scattering
is given by
$$\kappa_e = \frac{\kappa_T\, n_e}{\rho}, \qquad (13.6)$$
where ne is the number of electrons per cm³. The latter depends on the mass fractions X, Y, Z and the
degree of ionization in the wind. For early-type massive stars the opacity at the base of the wind can be
approximated by relying on the mean molecular weight per electron, defined by µe mu = ρ/ne with mu the
atomic mass unit, and the expression for µe in the case of fully ionised material given by Eq. (2.38). This
leads to the approximation given by κe ≈ 0.20 (1 + X) cm²/g. Thus, in the approximation that the degree of ionization is constant
throughout the whole wind, the radiative acceleration due to Thomson scattering is κeF/c, i.e., it only
depends on the position in the stellar wind. At a distance r > R in the wind, following Eq. (13.4) and the
approximation of Thomson scattering by free electrons, the radiative acceleration is therefore given by
$$g_e(r) = \frac{\kappa_e L}{4\pi r^2 c} = \frac{GM}{r^2}\,\Gamma_e, \qquad (13.7)$$
with
$$\Gamma_e \equiv \frac{\kappa_e L}{4\pi c G M}. \qquad (13.8)$$
For main-sequence stars with M ≲ 15 M⊙, Γe ≃ 0 and we can simply neglect the stellar wind caused
by the continuum radiation. For more massive main-sequence stars and hot giants and supergiants, Γe is
significantly different from zero.
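The following sketch evaluates the electron-scattering opacity and Γe numerically. It assumes that the fully ionised expression referred to as Eq. (2.38) is the standard µe = 2/(1 + X); the function names, the rounded cgs constants and the example stellar parameters are likewise assumptions made for illustration.

```python
import math

SIGMA_T = 6.652e-25  # Thomson cross section in cm^2
M_U = 1.6605e-24     # atomic mass unit in g
G = 6.674e-8         # gravitational constant in cgs
C = 2.998e10         # speed of light in cm/s
L_SUN = 3.828e33     # solar luminosity in erg/s
M_SUN = 1.989e33     # solar mass in g

def kappa_e(hydrogen_mass_fraction):
    """Electron-scattering opacity (cm^2/g), assuming full ionisation with mu_e = 2 / (1 + X)."""
    mu_e = 2.0 / (1.0 + hydrogen_mass_fraction)
    return SIGMA_T / (mu_e * M_U)

def gamma_e(lum_lsun, mass_msun, hydrogen_mass_fraction=0.70):
    """Gamma_e of Eq. (13.8): ratio of the Thomson radiative acceleration to gravity."""
    kappa = kappa_e(hydrogen_mass_fraction)
    return kappa * lum_lsun * L_SUN / (4.0 * math.pi * C * G * mass_msun * M_SUN)

print(f"kappa_e(X=0.70) = {kappa_e(0.70):.3f} cm^2/g")
print(f"Gamma_e for a Sun-like star = {gamma_e(1.0, 1.0):.1e}")
# Hypothetical massive star with M = 60 Msun and L = 8e5 Lsun.
print(f"Gamma_e for the massive star = {gamma_e(8.0e5, 60.0):.2f}")
```

Because Γe scales with L/M, it is negligible for low-mass stars and only becomes a sizeable fraction of unity for the most luminous ones.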
We find that the radiative acceleration caused by the continuum radiation has the same r-dependence
as the gravitational acceleration. The corresponding force, however, is opposite to the gravitational force,
and will diminish the effect of gravity. Therefore, we can merge both terms in an effective gravitational
acceleration:
$$g_{\rm eff}(r) \equiv -\frac{GM}{r^2}\left[1-\Gamma_e\right]. \qquad (13.9)$$
The radiation pressure caused by the continuum radiation can overcome gravity when Γe > 1. In practice,
however, Γe < 1. The continuum opacity alone thus cannot be the source of the stellar wind.
13.3.2 LBVs, WR stars and the Eddington limit
The quantity Γe is a function of L/M . We now consider the total opacity κ resulting from all radiative
processes. We then find an upper limit for the luminosity of a star by expressing that gravity is capable of
holding the gaseous sphere together:
$$L < \frac{4\pi c G M}{\kappa}. \qquad (13.10)$$
When this condition is not met, the star cannot exist. This leads us to the critical luminosity that cannot be
exceeded, termed the Eddington luminosity:
$$L_{\rm Edd} \equiv \frac{4\pi c G M}{\kappa}. \qquad (13.11)$$
The corresponding condition for L/M is called the Eddington limit.
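To get a feeling for the numbers, Eq. (13.11) can be evaluated directly. In this sketch the default opacity is the electron-scattering value for X = 0.70 (about 0.34 cm²/g), which is an assumption: the total opacity κ of a real star is larger, so the true limit is lower. The function name and constants are illustrative.

```python
import math

G = 6.674e-8       # gravitational constant in cgs
C = 2.998e10       # speed of light in cm/s
L_SUN = 3.828e33   # solar luminosity in erg/s
M_SUN = 1.989e33   # solar mass in g

def eddington_luminosity_lsun(mass_msun, kappa=0.34):
    """Eddington luminosity of Eq. (13.11) in solar units, for an opacity kappa in cm^2/g."""
    return 4.0 * math.pi * C * G * mass_msun * M_SUN / (kappa * L_SUN)

for mass in (1.0, 20.0, 60.0):
    print(f"M = {mass:5.1f} Msun -> L_Edd = {eddington_luminosity_lsun(mass):.2e} Lsun")
```

The limit scales linearly with M and inversely with κ, so doubling the opacity halves the maximum luminosity a star of given mass can sustain.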
Stars can get close to their Eddington limit when they have a very large energy flux and/or when the
opacity becomes very large. In that case, they cannot be very stable and the slightest disturbance helping the
radiation to overcome gravity results in a large loss of matter. This is the case for luminous OB supergiants
and so-called Luminous Blue Variables (LBVs) and Wolf-Rayet (WR) stars. LBVs are very massive stars
that experience an unstable state in their evolution. The outwardly directed acceleration caused by the
strong radiation pressure is so large in LBVs that, with the slightest perturbation, it can overcome the
inward gravitational acceleration. Consequently an unstable state occurs, resulting in severe mass loss. The
outbursts of an LBV can last for decades and can be very irregular, with long periods of equilibrium in
between. The star η Carinae, an image of which was shown in Figure 13.1, is an LBV.
A star is called a Wolf-Rayet (WR) star when a hot helium core remains after the evolution of a massive
star that lost its entire outer envelope due to an extremely strong radiation pressure. WR stars are thus the
successors of the LBVs. In the spectrum of such a WR star we find mainly emission lines caused by the
fast expanding envelope. Due to the presence of this envelope, it is hard to define the stellar photosphere.
The effective temperature of a Wolf-Rayet star is about 30 000 to 50 000 K. These stars had an original mass
above 40 M⊙ and lost so much mass during their evolution via their stellar wind that they have only about
4 M⊙ left at this stage of their evolution. The WR stars are divided into two groups: the carbon-rich
WC stars and the nitrogen-rich WN stars. These classes are subdivided in WC5 – WC9 and WN3 – WN8
depending on the presence of particular spectral lines in the observed spectrum. The WN and WC varieties
represent different evolutionary stages. WN stars evolve to WC stars as more stellar material is lost through
the stellar wind. The LBVs as well as the WR stars are located close to their Eddington limit.
The application of the mass-luminosity relationship results in an upper limit for the mass of a star in
order for the gravity to keep the gaseous sphere together. We thus find that the main sequence must have
an upper limit in mass. When we only take the continuum opacity by scattering into account, we find an
upper limit for the mass of about 150M⊙. In practice we know that stars with a much lower mass already
experience a strong mass loss. This is due to the large line opacity of massive stars.
13.3.3 A realistic description of a line-driven stellar wind: the CAK-model
The radiative acceleration caused by the spectral lines is no longer proportional to 1/r². Consequently it
can no longer be combined with the gravitational acceleration in a simple way. To find an accurate
expression for grad, we need to take into account the combined effect of all the spectral lines.
This requires the determination of ionisation degrees
and excitation states of a large number of energy levels for a large number of ions. As an illustration we
show in Figure 13.5 parts of the spectrum of the B1III giant ξ1 CMa from the far ultra-violet up to and
including the visible. The depth and width of each of these spectral lines is described by its line absorption
coefficient. This coefficient depends on the temperature of the gas, the density, the pressure, the abundance
of the element, the ionisation state of the gas, the collision probabilities, the line transition probabilities,
etc. The number of spectral lines that needs to be taken into account to compute the radiative acceleration
caused by line radiation reduces drastically (by a factor 10⁵!) when limiting to the strong resonance lines.
This is a good approximation due to the low density in the stellar wind. The astronomers Castor, Abbott &
Klein (CAK) came up with this approximation in their 1975 theory, which is still in use today and is called
the CAK-theory.
The spread of the (resonance) lines over the wavelengths is not homogeneous as shown in Figures 13.4
and 13.5. There are regions in the spectrum where they hardly occur, while they are abundant in others,
even overlapping sometimes. Once the radiative force of all individual lines is computed and tabulated,
the corresponding overall radiative acceleration, grad, can be parametrised. The aim then is to solve the
equation of motion and to derive expressions for the mass loss and the wind velocity.
The cumulative effect of an ensemble of non-overlapping absorption lines was parametrised by Castor,
Abbott & Klein through a power law of the line opacity: N(κℓν) ∼ (κℓν)^(αCAK−2) with 0 < αCAK < 1 and
κℓν the opacity of the spectral line ℓ at a frequency ν (or wavelength λ = c/ν). In this empirical law, the
power αCAK determines how optically thin (value below 0.4) or optically thick (value close to 1) the line
ℓ at frequency ν is. They found this result empirically from a list of a few hundred observed carbon lines.
They gave each line a weight according to the value νℓFν/F (see Figure 13.4), so that a higher weight is
associated with a stronger line. The proportionality constant is chosen such that κ0N(κ0) = 1 with κ0 the
opacity of the strongest line in the entire stellar spectrum. After these first empirical results and with the
increase of available computational power, many more spectral lines were used to determine grad.
Figure 13.4 shows graphically which fractions of the stellar flux in the UV are absorbed in spectral lines
and thus drive the stellar wind. These are substantial fractions of the produced flux, explaining why the
line-driven stellar wind can be such an efficient mass loss mechanism: a large part of the stellar flux is
converted into wind momentum.
Figure 13.5: Parts of the spectrum of the star ξ1 CMa (B1III). The derivation of grad(r) requires the accurate
determination of the absorption coefficients of all spectral lines in a large wavelength range, including the
UV part of the spectrum.
The parametrisation of CAK results in:
$$g^{\rm CAK}_{\rm rad}(r) = \frac{K L}{r^2}\left(\frac{1}{\rho}\frac{dv}{dr}\right)^{\alpha_{\rm CAK}}, \qquad (13.12)$$
with K a constant and αCAK determined by the relative proportion of optically thin (read:
weak) and optically thick (read: strong) absorption lines in the spectrum. With this result, one can
determine the effect of g^CAK_rad on the motion of particles in the wind by substituting the expression for
g^CAK_rad in the equation of motion and solving it numerically under the assumption of a stationary wind.
Using the expression for an ideal gas for the gas pressure, the solutions in the approximation of a point
source of photons read (derivation omitted here, see MSc course Stellar Atmospheres):
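$$\dot{M}_{\rm CAK} = \left(\frac{\kappa_e}{4\pi c}\right)^{1/\alpha_{\rm CAK}} \frac{4\pi}{\kappa_e v_{\rm th}}\,\alpha_{\rm CAK}\,(0.32)^{1/\alpha_{\rm CAK}} \left(\frac{1-\alpha_{\rm CAK}}{GM(1-\Gamma_e)}\right)^{(1-\alpha_{\rm CAK})/\alpha_{\rm CAK}} L^{1/\alpha_{\rm CAK}} \qquad (13.13)$$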
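and
$$v_{\rm CAK}(r) = v_\infty\sqrt{1-\frac{R}{r}} = \sqrt{\frac{\alpha_{\rm CAK}}{1-\alpha_{\rm CAK}}}\,\sqrt{\frac{2(1-\Gamma_e)GM}{R}}\,\sqrt{1-\frac{R}{r}}. \qquad (13.14)$$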
In this approximation it is thus found that βwind = 1/2.
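Equation (13.14) states that, in the point-source approximation, the terminal velocity is the effective escape velocity multiplied by √(αCAK/(1 − αCAK)). The sketch below evaluates this; the function names, the rounded cgs constants and the example stellar parameters (including αCAK = 0.6) are hypothetical inputs, not values taken from the course text.

```python
import math

G = 6.674e-8       # gravitational constant in cgs
M_SUN = 1.989e33   # solar mass in g
R_SUN = 6.957e10   # solar radius in cm

def effective_escape_velocity_km_s(mass_msun, radius_rsun, gamma_e=0.0):
    """Escape velocity at the surface, reduced by the factor (1 - Gamma_e), in km/s."""
    v_sq = 2.0 * (1.0 - gamma_e) * G * mass_msun * M_SUN / (radius_rsun * R_SUN)
    return math.sqrt(v_sq) / 1.0e5

def cak_terminal_velocity_km_s(mass_msun, radius_rsun, gamma_e, alpha_cak):
    """Terminal velocity of Eq. (13.14): sqrt(alpha / (1 - alpha)) times the effective escape velocity."""
    v_esc = effective_escape_velocity_km_s(mass_msun, radius_rsun, gamma_e)
    return math.sqrt(alpha_cak / (1.0 - alpha_cak)) * v_esc

def cak_velocity(r_over_R, v_inf):
    """Velocity law of Eq. (13.14), i.e. a beta-law with beta = 1/2 and v0 = 0."""
    return v_inf * math.sqrt(1.0 - 1.0 / r_over_R)

# Hypothetical O star: M = 40 Msun, R = 15 Rsun, Gamma_e = 0.3, alpha_CAK = 0.6.
v_inf = cak_terminal_velocity_km_s(40.0, 15.0, 0.3, 0.6)
print(f"v_inf = {v_inf:.0f} km/s, v(4R) = {cak_velocity(4.0, v_inf):.0f} km/s")
```

For αCAK = 0.5 the prefactor equals one, so the terminal velocity in this approximation coincides with the effective escape velocity.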
In practice the star is not a point source, and this needs to be corrected for, especially for wind particles
that are located close to the stellar surface. The corrections are made by redoing the calculation but this time
in the approximation that the stellar surface has the correct spherical shape. Various additional numerical
upgrades have been made to the original CAK theory to compute wind mass-loss predictions. In the left
panel of Figure 13.6 we show recent results for the mass-loss rates as a function of luminosity for O-type
stars at the metallicities of our Milky Way and the Magellanic Clouds computed with the FASTWIND code,
originally developed by Prof. Joachim Puls at Munich Observatory and currently under active development
at KU Leuven by Prof. Jon Sundqvist. The dependence of the mass loss on metallicity is obvious from
this graph. Most stellar evolution codes, including MESA, take the mass loss rate predictions by Vink et
al. (2001, A&A, Volume 369, p.574) as the standard values for the regime of OB-type stars. However, it
has lately become clear that these rates predict systematically too high values for Ṁ according to various
types of modern observations. A comparison between the latest Leuven FASTWIND mass-loss rates and those
based on the older Vink et al. recipes is shown in the right panel of Figure 13.6 and reveals an
overestimation by a factor ∼ 3 when using the Vink prescriptions. Students of this SSE course may want to
take this into account in their MESA Labs!
The mass loss caused by line-driven stellar winds has a large effect on the evolution of stars with an
initial mass larger than about 25 M⊙, i.e., O-type dwarfs on the main sequence. These stars experience
substantial mass loss during their entire life. Stars with 15 M⊙ ≲ M ≲ 25 M⊙ also experience a stellar
wind, but it only becomes strong when leaving the main sequence. Consequently all of these stars play a
crucial role in the chemical enrichment of the interstellar medium, from their birth until their explosion as
core-collapse supernova.
For the stars born with a mass above 25 M⊙, the mass lost in the core-hydrogen burning phase due
to the stellar wind implies a change in the Mass-Luminosity relation and, given the Mass-Radius relation
valid for the main sequence, the mass loss also affects the radius of the star. Both the mass and radius in
their turn determine the escape velocity of the star. Figure 13.7 shows the Mass-Luminosity relation and
the ratio of the terminal wind speed with respect to the escape velocity as a function of luminosity taking
into account the effects of the line-driven wind for dwarfs, giants, and supergiants. Calibrations of these
numerical computations by means of detached eclipsing double-lined binaries are hardly available, as the
catalogues of such objects typically cover masses until about 25 M⊙.
Figure 13.6: Left: Mass-loss rates as a function of luminosity for O-type stars resulting from modern CAK-
based radiation-driven wind computations done with the FASTWIND code at KU Leuven. Right: Most
stellar evolution codes, including MESA, take the mass-loss rate predictions by Vink et al. (2001, A&A,
Volume 369, p.574). The right panel compares these with our Leuven results. Squares represent dwarfs,
bullets are for giants and diamonds for supergiants. (From Bjorklund et al., 2020, A&A, Vol. 648, id.A36,
16pp.)
Figure 13.7: Left: Mass-Luminosity relation for O-type stars undergoing a radiation-driven wind. Right:
the ratio of the terminal wind speed versus the escape velocity from the star as a function of luminosity.
(From Bjorklund et al., 2020, A&A, Vol. 648, id.A36, 16pp.)
13.4 Consequences of mass loss for stellar evolution
In Figure 13.8 we show the results of stellar model calculations during hydrogen and helium burning where
the mass loss due to a line-driven stellar wind was taken into account. Several approximations are being
considered to determine evolutionary tracks. In principle one should make a fully self-consistent integration
of the system of differential equations Eqs (8.1) such that the conservation of mass is replaced by the equa-
tion for a stationary stellar wind as in Eq. (13.1) and the equation of motion by Eq. (13.3) with grad given
by Eq. (13.12). However, as the free parameter αCAK is a non-integer, this implies a serious mathematical
complication of the system of differential equations. Moreover, the boundary conditions at the surface need
to be adapted, as the hydrostatic equilibrium is no longer valid in a dynamical atmosphere where the ions
experience an accelerated motion. Also recall that the energy transport equation in Eqs (8.1) relied on the
assumption of hydrostatic equilibrium. For these reasons, a different road is taken for the calculation of
evolutionary tracks, as was done for the AGB (only here we have a much better theory for the determination
of Ṁ and v∞). The mass is simply reduced during each time step by an amount ṀCAK multiplied
by the duration of the time interval, and the system of Eqs (8.1) is solved. Clearly this approach does not
result in a complete consistency between the used mass loss and the evolutionary stage for which the model
is computed. A fully consistent integration of the system of differential equations, including a dynamical
atmosphere with an outflow, is not at hand yet.
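The bookkeeping described above, reducing the mass by Ṁ times the time step, can be sketched as a simple explicit loop. This is only an illustration of the mass update: the function name and its arguments are hypothetical, and the solution of the structure equations Eqs (8.1) after each step is represented by a comment rather than implemented.

```python
def evolve_mass(m0_msun, mdot_fn, t_end_yr, n_steps):
    """Reduce the stellar mass step by step: M -> M - Mdot(t, M) * dt.

    mdot_fn(t, m) must return the mass-loss rate in Msun/yr at time t (yr) and mass m (Msun).
    """
    dt = t_end_yr / n_steps
    mass, time = m0_msun, 0.0
    for _ in range(n_steps):
        mass -= mdot_fn(time, mass) * dt
        time += dt
        # In a stellar evolution code, the structure equations would be solved here
        # for the new mass before taking the next step.
    return mass

# Hypothetical example: constant rate of 1e-6 Msun/yr during 1e6 yr for a 30 Msun star.
final_mass = evolve_mass(30.0, lambda t, m: 1.0e-6, 1.0e6, 1000)
print(f"final mass = {final_mass:.3f} Msun")
```

Passing a function for the mass-loss rate makes it straightforward to swap in a luminosity-dependent prescription without changing the loop.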
Stars with a birth mass M > 60M⊙ experience such a strong stellar wind during their main sequence
and their hydrogen shell burning stage that their entire hydrogen envelope is blown away. After these stages,
they are left with a naked helium core. Consequently they stay in the blue part of the HR diagram. The stellar
wind of stars with 25 M⊙ ≲ M ≲ 60 M⊙ is not strong enough to blow away the entire hydrogen envelope
during the main sequence. These stars very quickly become red supergiants after the TAMS. At that stage,
the stellar wind does blow away the remaining hydrogen envelope. When the mass of the hydrogen rich
envelope has decreased through mass loss, the convective energy transport in this envelope can no longer
happen in equilibrium. Consequently, the outer envelope contracts until it is in radiative equilibrium, in
other words the radius decreases. In that way, the stars can never remain red supergiants due to their mass
loss and they have to return to the blue side of the HR diagram. In practice, red supergiants are missing
for L > 5 × 10⁵ L⊙ (or in terms of absolute bolometric magnitude: Mbol < −9.5). This observed upper
limit for the distribution of stars in the HR diagram is called the Humphreys-Davidson limit, named after
its discoverers. The limit is a slant line with a negative slope for effective temperatures between 50 000
and 10 000 K, and a horizontal line for cooler temperatures. In Figure 13.9 we show the upper part of the
HR diagram with the observed Humphreys-Davidson limit.
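The correspondence between the luminosity limit and the bolometric magnitude quoted above can be verified with the usual magnitude definition. The sketch assumes a solar absolute bolometric magnitude of 4.74, a value that is not given in this text; the function name is illustrative.

```python
import math

M_BOL_SUN = 4.74  # assumed absolute bolometric magnitude of the Sun

def bolometric_magnitude(lum_lsun):
    """Absolute bolometric magnitude for a luminosity given in solar units."""
    return M_BOL_SUN - 2.5 * math.log10(lum_lsun)

print(f"M_bol(5e5 Lsun) = {bolometric_magnitude(5.0e5):.2f}")
```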
Finally, stars with 15 M⊙ ≲ M ≲ 25 M⊙ do experience mass loss, but it remains so limited that they
never lose their outer hydrogen envelope. Because of that they succeed in traversing to the red part of the
HR diagram and proceed with their evolution from there on, like the stars discussed in the former two chapters.
They end their lives as a neutron star.
As can be established from Figure 13.8, the main sequence broadens considerably compared with the one
for stars with M ≲ 15 M⊙, as mass loss prolongs the stay on the main sequence. Since the mass loss
decreases the mass of the star, the luminosity will also decrease (mass-luminosity relation). Massive stars
that experience mass loss therefore have a lower luminosity than stars born with the same mass but without
a stellar wind. The mass loss thus indeed prolongs the life span on the main sequence. In Figure 13.10
both effects, lower luminosity and lower mass, are schematically depicted for a star with an initial mass
of 30 M⊙. The three evolutionary tracks represent a different value of the mass loss. This mass loss is
calibrated in units Ṁ = N L/c², where N indicates the number of resonance lines in the UV responsible
for the line driving. In practice, N ≃ 100. The luminosity of the star decreases as the mass loss increases.
The mass at the end of the main-sequence stage and the length of this stage are also indicated. The effect
of a longer lasting main sequence is reinforced when models with CBM are computed. Indeed, due to the
overshooting more hydrogen is brought into the core, making the main sequence last longer. Moreover,
the helium burning starts at a higher effective temperature for these stars (Figure 13.8), as they have a less
thick hydrogen envelope. Finally, the stars with M ≳ 25 M⊙ also evolve very quickly to the blue side
of the HR diagram. All of this implies that there is no analogy for the Hertzsprung gap for M ≳ 15 M⊙.
Observations confirm this, as can be seen in Figure 13.9.
Figure 13.8: Evolutionary tracks of massive stars with an initial composition X = 0.73 and Z = 0.02,
taking into account the mass loss as described in the text. The tracks indicated by a solid line assume the
Schwarzschild criterion for convection, use the mixing-length theory, and ignore CBM. For comparison,
parts of the tracks are indicated by a dashed line: these are based on a variant of the mixing-length theory
but otherwise have the same input physics. The shaded areas indicate the main sequence and the start of the
helium burning. The first bold dot is the time when the first triple α reaction starts, the second dot (when
present) indicates the time when the carbon burning starts. (From Maeder 2009)
Figure 13.9: The upper part of the HR diagram. Thin lines represent evolutionary tracks. The original
Humphreys-Davidson limit is indicated by a bold dot-circle line, while the dotted line represents its refine-
ment. The grey area contains red supergiants, which are not shown individually for reasons of clarity. (From
Fitzpatrick & Garmany, 1990, Astrophysical Journal, Volume 363, p.119)
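To see what the calibration Ṁ = N L/c² means in practical units, the sketch below converts it to solar masses per year. The function name, the rounded cgs constants and the example luminosity of 10⁵ L⊙ are assumptions made for illustration.

```python
C = 2.998e10       # speed of light in cm/s
L_SUN = 3.828e33   # solar luminosity in erg/s
M_SUN = 1.989e33   # solar mass in g
YEAR = 3.156e7     # one year in s

def mdot_from_line_count(n_lines, lum_lsun):
    """Mass-loss rate Mdot = N * L / c^2, converted to Msun/yr."""
    mdot_g_s = n_lines * lum_lsun * L_SUN / C**2
    return mdot_g_s * YEAR / M_SUN

# Hypothetical star with L = 1e5 Lsun and N = 100.
print(f"Mdot = {mdot_from_line_count(100, 1.0e5):.1e} Msun/yr")
```

With N = 1 the same function gives the mass equivalent of the radiated energy alone, which is about two orders of magnitude smaller than the wind mass loss for N ≃ 100.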
Observations show that the part in between the TAMS and the start of the helium burning stage is
populated by stars with a different chemical composition at their surface. This reflects that stars moving to
the right, as well as stars moving to the left occur in the same area of the HR diagram. The massive stars
burn hydrogen through the CNO cycle and mainly produce nitrogen at the expense of carbon when they
have reached the TAMS. The mass loss additionally makes the outer hydrogen layers disappear during the
short red giant stage. Consequently the products of the nuclear reactions reach the stellar surface while the
stars move back to the blue part of the HR diagram. This explains the existence of blue
• ON stars. These are massive stars with a very high nitrogen abundance and a very low carbon abun-
dance;
• WR stars with a high He abundance and a very low H abundance at their surface. The WN stars have
a high nitrogen abundance through the hydrogen burning via the CN cycle while WC stars have a high
carbon abundance due to the triple α reaction.
Figure 13.10: Evolutionary tracks for a star with an initial mass of 30M⊙ for different values of the mass
loss during the main-sequence stage. The mass loss is expressed as Ṁ = N L/c². (From Lamers &
Levesque 2017)
13.5 Example: the evolution of a star with an initial mass of 60M⊙
We show the evolution of a star with a birth mass of 60 M⊙ in Figure 13.11. The upper boundary in the bottom
panel indicates the decreasing mass due to the mass loss. During the stage of central hydrogen burning
(from A to C), H is converted into He via the CN cycle. This stage only lasts 3.7 × 10⁶ years and causes an
increase in He and ¹⁴N, as well as a decrease of H and ¹²C in the extensive convective core. Meanwhile, the
stellar mass decreases substantially through the mass loss. The latter increases from 1.4 × 10⁻⁶ M⊙/year on
the ZAMS to 7.0 × 10⁻⁶ M⊙/year at the TAMS.
In stage B, when the main-sequence stage has almost come to an end, the products of the hydrogen
burning through the CN-cycle in the core are transported to the surface of the star by an extensive convective
zone. The star reaches the stage where the vertical lines occur at point B in the bottom panel of Figure 13.11.
At that time the N and He abundances increase at the stellar surface while H and C abundances decrease: the
star becomes an N-rich ON star. When the hydrogen supply in the stellar core is exhausted, the core starts
to contract, making the star move to the left in the HR diagram. This motion continues until the temperature
in the area around the core is high enough to induce hydrogen shell burning. This happens at point C. At
that time, the star has already lost about 15 M⊙ and the burning shell surrounds a core containing about 30 M⊙.
Once hydrogen burning starts in a shell, the outer layers expand and the star moves to the right
Figure 13.11: Top panel: the evolutionary track of a star with a birth mass of 60 M⊙ in the
HR diagram. Bottom panel: the internal stellar structure as a function of time. The stages of different
spectral classes are indicated. In the areas with “clouds” the energy transport happens through convection.
Areas with diagonal lines indicate where the nuclear burning takes place. In the areas with the thin vertical
lines the original chemical composition changes substantially. The time axis is divided into three parts. The
characters in each of the panels denote the specific stages in the life of the star, as discussed in the text.
(From Maeder 2009)
in the HR diagram. Shortly after that, the star becomes unstable and turns into an LBV (stages E → F). It
suffers from strong mass loss, of the order of 5 × 10⁻⁴ M⊙/year. In total it loses about 5 M⊙ during this
LBV stage, which lasts about 10 000 years. Due to the strong mass loss, the He-enriched layers appear at the
stellar surface, where the He/H mass ratio equals about 0.4. While the star is busy with its post-main-sequence
evolution and is moving to the right in the HR diagram, it loses so much mass during the LBV stage that
its expansion is stopped. In other words, it reaches the Humphreys-Davidson limit, and consequently the
envelope contracts again and the star moves back to the left in the HR diagram (after point E). The star is
now transformed into a luminous, relatively small hot star with a He-rich and N-rich photosphere.
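As a rough consistency check on the numbers quoted so far for the 60 M⊙ model, the following minimal Python sketch multiplies the quoted mass-loss rates by the quoted durations. Taking the simple average of the ZAMS and TAMS rates for the main sequence is our own simplifying assumption, not the way the evolutionary model was computed, and all function and variable names are illustrative.

```python
# Rough mass-loss budget for the 60 Msun model, using only numbers quoted in the text.
# ASSUMPTION (not from the source): the main-sequence mass-loss rate is approximated
# by the plain average of its ZAMS and TAMS values.

def mass_lost(rate_msun_per_yr, duration_yr):
    """Mass lost (in Msun) at a constant rate over a given duration."""
    return rate_msun_per_yr * duration_yr

# Main sequence (A -> C): rate rises from 1.4e-6 to 7.0e-6 Msun/yr over 3.7e6 yr.
ms_mean_rate = 0.5 * (1.4e-6 + 7.0e-6)
ms_loss = mass_lost(ms_mean_rate, 3.7e6)

# LBV stage: about 5e-4 Msun/yr for about 10 000 yr.
lbv_loss = mass_lost(5.0e-4, 1.0e4)

print(f"main-sequence mass loss ~ {ms_loss:.1f} Msun")
print(f"LBV mass loss           ~ {lbv_loss:.1f} Msun")
```

Within the crudeness of the averaging, the main-sequence estimate is close to the "about 15 M⊙" lost by point C, and the LBV estimate reproduces the "about 5 M⊙" quoted above.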
The star becomes a WR star of type WN during the stages F → G. It continues to lose its hydrogen
envelope and receives its energy from the combination of He-burning in the core and hydrogen shell burning,
as long as there is enough hydrogen left in the outer envelope. The mass loss during the WR stage remains
substantial, about 3 × 10⁻⁵ M⊙/year. After about 4 × 10⁶ years (stage G) the outer layers are lost to such
an extent that the carbon-rich layers appear at the stellar surface. The C-rich material was formed by the
triple-α reaction and is brought to the surface through convective motions. The star now becomes a WR star
of type WC with a C-rich and He-rich envelope.
Meanwhile the star keeps contracting (G → H). The luminosity remains nearly constant until it reaches
a radius of about 0.8R⊙ and an effective temperature of about 200 000 K. However, an observer does not
“see” this temperature due to the circumstellar material. The large mass loss results in an optically thick
wind. The radiation that escapes in the direction of the observer stems from that wind, where the temperature
is about 30 000 K. The radius of the area where the wind is optically thick is about 10 R⊙. This radiation is
observed from the near UV until the IR. Observed WR stars therefore have a much lower effective temper-
ature and a much larger radius than those deduced from stellar evolution calculations. The optically thick
wind makes it very hard to estimate the observational parameters of the WR stars.
The stage of central He-burning (D → H) lasts about 6 × 10⁵ years and is followed by a stage of C-burning.
The latter takes only 2 000 years. The next stages, burning ever heavier elements, end in less than
a year. The star will explode as a core-collapse supernova and a black hole remains. During its evolution
before the supernova explosion, the star has ejected 38 M⊙ of its material into the interstellar
medium: 29 M⊙ of hydrogen that was partially enriched with ¹⁴N, 8 M⊙ of helium and 1 M⊙ of carbon and
oxygen. The interaction of the ejected material with the surrounding interstellar medium often causes a WR
(ring) nebula.
13.6 Black holes
Once a star with M ≳ 25 M⊙ has passed through its different burning cycles and passed the LBV and
WR stages, there is no return: the star will soon end its life as a supernova. The (still quite uncertain)
theoretical EOS options for neutron stars impose an upper limit of about 2 M⊙ for the mass of such a
remnant. For compact objects that are more massive, there is at this moment no known mechanism capable
of counteracting gravity. Such objects collapse into what is called a black hole. Where neutron stars
were already extreme in their density, rotation, and magnetic fields, black holes are the ultimate form of
compactness that a massive star can evolve into. By definition, the collapsed star will no longer be directly
observable by electromagnetic radiation. Only a strong gravitational field remains.
The theoretical description of a black hole, and even the whole concept of such an object, is fully
based on the theory of general relativity. This is outside the scope of this course and is studied in separate
courses throughout the BSc and MSc Physics educational tracks at KU Leuven. Nevertheless, even simple
arguments allow us to infer the bizarre situation that arises when the radius of a star with a certain mass becomes so small
that the escape velocity approaches the speed of light. This limiting radius is known as the Schwarzschild
radius and is given by
$$R_{\rm Sch} \approx \frac{2GM}{c^{2}} \simeq 3\,\frac{M}{M_\odot}\ \mathrm{km}. \qquad (13.15)$$
Classical mechanics would never yield a region r ≤ R_Sch from which photons can no
longer escape, as photons have no mass. However, the concept of the Schwarzschild radius gives an intuitive
explanation for the “blackness” of a black hole.
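The simple argument behind Eq. (13.15) is that the Newtonian escape velocity, v_esc = sqrt(2GM/r), equals c at r = 2GM/c². A minimal Python sketch of this arithmetic follows; the rounded physical constants and the function names are our own choices for illustration.

```python
import math

# Rounded physical constants in SI units (illustrative values).
G = 6.674e-11       # gravitational constant [m^3 kg^-1 s^-2]
C = 2.998e8         # speed of light [m/s]
M_SUN = 1.989e30    # solar mass [kg]

def schwarzschild_radius_km(mass_msun):
    """Schwarzschild radius 2GM/c^2 in km for a mass given in solar masses."""
    return 2.0 * G * mass_msun * M_SUN / C**2 / 1.0e3

def escape_velocity(mass_msun, radius_km):
    """Newtonian escape velocity sqrt(2GM/r) in m/s."""
    return math.sqrt(2.0 * G * mass_msun * M_SUN / (radius_km * 1.0e3))

for mass in (1.0, 6.0, 25.0):
    r_sch = schwarzschild_radius_km(mass)
    ratio = escape_velocity(mass, r_sch) / C
    print(f"M = {mass:5.1f} Msun: R_Sch = {r_sch:6.1f} km, v_esc/c = {ratio:.3f}")
```

The radius scales linearly with mass, at close to 3 km per solar mass, in line with the right-hand side of Eq. (13.15).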
Despite the recent “imaging” of the supermassive black hole at the centre of the galaxy M87, stellar
black holes remain hard to detect. One approach is by detecting X-rays emitted by the matter falling into
the black hole. Another, easier, method is by detecting the motion of the visual component in a binary star,
where the other component is a black hole (see Chapter 14). In that way the existence of black holes can be
proven. Nowadays many black holes in binary stars are known. The best-known and first-found example
is Cygnus X-1. This object is an X-ray source for which the binary character reveals a massive O-type
supergiant companion. From the estimation of the mass of this component and the inclination of the orbital
plane, the mass of the invisible compact star is estimated to be 6 M⊙, which rules out a neutron star.
Meanwhile, many other such obvious examples of binary systems with an invisible yet massive compact
companion have been found, the so-called “X-ray binaries”. In those objects, the invisible companions all
have a mass above the (uncertain) upper limit for a neutron star. Black holes in binary systems are discussed
in the master courses Binary Stars and High Energy Astrophysics and we briefly touch upon them in the
final chapter of this SSE course.
Summary
We summarize a few of the results we have described in this chapter up to now. Mass loss due to a radiation-
driven stellar wind has a considerable effect on the life cycle of stars with a birth mass above 15M⊙, and
consequently on that of the galaxies containing such stars. This mass loss:
1. changes the chemical composition of the stellar surface,
2. drastically changes the life span of the star during certain evolutionary stages,
3. explains the occurrence of circumstellar matter around black holes,
4. considerably changes the post-main-sequence evolutionary tracks in the HR diagram,
5. explains the lack of very luminous red supergiants,
6. explains the existence of LBVs and WR stars, which are the precursors of black holes,
7. and, last but not least, results in a considerable enrichment of heavy elements in the interstellar
medium, and thereby determines the chemical evolution of galaxies.
13.7 Chemical evolution of galaxies
13.7.1 Chemical enrichment by stellar evolution
In Figure 13.12 a schematic representation is shown of the chemical enrichment of the interstellar medium
through the mass loss of massive stars. The figure shows the masses in terms of percentage of helium and
heavier elements that are expelled by the stellar wind and during the supernova explosion of massive stars. A
division is made for helium, carbon, oxygen and elements of the silicon-iron group. For the Population I star,
we notice a clear increase in the importance of the stellar wind in terms of expelled helium for increasing
mass. Around 35M⊙, the contribution of the wind and of the explosion are about equal. With increasing
mass, the chemical enrichment of helium by the stellar wind becomes much more important than the one
during the explosion. The latter is of no significance for stars with a mass above 60M⊙. The enrichment of
elements heavier than hydrogen and helium by the stellar wind only becomes important for masses above
50M⊙, during their WC stage. Even for these stars, the enrichment during the supernova explosion is
dominant compared to the enrichment due to the stellar wind. For the metal-poor star, the wind is a lot less
strong as the line-driving is heavily dependent on metals. As revealed by Figure 13.12, the fractional mass
of helium and carbon by the dust-driven wind emitted by AGB stars is significant and less dependent on
metallicity than the line-driven wind of massive stars. The fractional masses residing in the three types of
remnants, i.e., white dwarf, neutron star or black hole, are also indicated.
13.7.2 Initial mass function
Continuous star formation results in a steady decrease of the population of massive stars. Such stars are so
short-lived on a galactic time scale that their relative number is determined by the fractional amount of gas in a
galaxy. So even if massive stars were born with a similar probability as low-mass stars, they would still
become less common as the galaxy evolves. This effect is enhanced because the probability
to form massive stars is much lower than the probability to form low-mass stars.
Suppose that star formation is independent of the location in the galaxy and of its age. In that case,
the number of stars that form at a given time and in a given volume only depends on the mass. Denote the
number of stars born with a mass in the interval [M,M + dM ] as
dN = Φ(M) dM. (13.16)
Figure 13.12: Mass fractions ejected by a stellar wind and during the supernova explosion as a function of
birth mass, for two metallicities. The fractional mass left behind in the form of a white dwarf, a neutron star
or a black hole is also indicated. (From Maeder, A., 1992, A&A, Vol. 264, p.105–120)
In this expression, Φ(M) is called the birth function. It was estimated empirically based on observations of
main-sequence stars in the environment of the Sun by Salpeter in 1955. Salpeter made a histogram of the
stars according to the luminosity using the mass-luminosity relation and he assumed that the life span of the
main sequence was proportional to M/L. Meanwhile observations have improved substantially, as well as
the estimate of the life span of the main sequence by stellar evolution models. An adapted version of the
Salpeter distribution is:
$$\Phi(M) \sim M^{-2.35 \pm 0.3}. \qquad (13.17)$$
The initial mass function (IMF), ξ(M), is defined as follows. We write the amount of mass that is
found at a given time and in a given place in stars with a mass in the interval [M,M + dM ] as
M dN = ξ(M) dM. (13.18)
When we then combine Eqs (13.16), (13.17), and (13.18), we obtain for the IMF:
$$\xi(M) \sim \left(\frac{M}{M_\odot}\right)^{-1.35 \pm 0.3}. \qquad (13.19)$$
For the environment of the Sun, deviations from this approximation for low mass stars and particularly for
brown dwarfs are found. This does not come as a surprise as it is very hard to characterise the stars with
the lowest masses and brown dwarfs, and consequently there is a bias of the IMF exponent towards high
masses, i.e., it is hard to describe the IMF in terms of one single exponent.
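To get a feeling for what Eqs (13.17) and (13.19) imply, the power laws can be integrated analytically. The Python sketch below does this for the central exponent only. The mass limits of 0.1 and 100 M⊙ and the use of a single power law over the full range are assumptions made purely for illustration (the text above stresses that a single exponent fails at the lowest masses), and the function names are ours.

```python
# Salpeter-type birth function Phi(M) ~ M**(-ALPHA), Eq. (13.17), central exponent only.
ALPHA = 2.35

# ASSUMPTION (not from the text): lower and upper stellar mass limits in Msun.
M_MIN, M_MAX = 0.1, 100.0

def power_law_integral(lo, hi, exponent):
    """Integral of M**exponent dM from lo to hi (valid for exponent != -1)."""
    p = exponent + 1.0
    return (hi**p - lo**p) / p

def number_fraction(m_lo, m_hi):
    """Fraction of all stars born with a mass in [m_lo, m_hi], from dN = Phi dM."""
    return (power_law_integral(m_lo, m_hi, -ALPHA)
            / power_law_integral(M_MIN, M_MAX, -ALPHA))

def mass_fraction(m_lo, m_hi):
    """Fraction of the stellar mass born in [m_lo, m_hi], from M dN = xi dM."""
    return (power_law_integral(m_lo, m_hi, 1.0 - ALPHA)
            / power_law_integral(M_MIN, M_MAX, 1.0 - ALPHA))

print(f"number fraction above 15 Msun: {number_fraction(15.0, M_MAX):.4f}")
print(f"mass fraction above 15 Msun:   {mass_fraction(15.0, M_MAX):.4f}")
```

With these assumed limits, stars above 15 M⊙ are of the order of one in a thousand by number but contain close to a tenth of the stellar mass. Both numbers depend strongly on the adopted lower mass limit.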
In general, the star formation rate changed quite drastically over the history of galaxies and these
changes produce variations over time in the local birth functions and in the overall IMF. Moreover, the
main-sequence lifetimes of massive stars overlap with the duration of the star formation process of lower-mass
stars. This must imply that the original IMF is different from the current one, i.e., that the assumption
of the star formation rate being constant or only slowly varying with time is not justified. The distribution
function in Eq. (13.19) hence is and remains a semi-empirical result which cannot be but of limited precision.
We presently lack a sound theory that has one simple IMF description as its solution, because of the
complexity of the evolutionary aspects that need to be taken into account. Appropriate star formation rates
should indeed include the history of the most massive stars, star bursts, the evolution of OB associations and
star clusters, along with the evolution of the field stars. All those complex evolutionary phenomena lead to
variations in the IMF from parsec scales to entire galaxies. Improvement of the IMF based on the theory of
star formation and stellar evolution is a whole branch of research by itself.
13.7.3 Global enrichment of the Universe
The fractional mass returned to the interstellar medium by a generation of stars and the number of compact
remnants resulting from that generation (white dwarfs, neutron stars, black holes) is obtained after con-
volving the chemical enrichment shown as a function of the fractional mass in Figure 13.12 with the birth
function in Eq. (13.17). This hence delivers the chemical enrichment per generation. Figure 13.13 shows this for a
generation of 1000 massive stars that have all exploded as supernovae. It is this enrichment
that is subsequently used in chemical evolution models of galaxies. The helium enrichment is mainly due
to stars with a mass below 20M⊙. Heavier elements are provided by stars with M > 20M⊙, with carbon
and nitrogen as an exception. As there are many more low-mass stars than massive stars, the major part of
Figure 13.13: Convolution of the stellar yields (as indicated in Figure 13.12) for a sample of 1000 massive
stars of solar metallicity that have exploded as supernovae with the Salpeter birth function in Eq. (13.17).
This reveals the overall chemical enrichment delivered by this one generation of massive stars. Left panel:
for non-rotating stellar models; right panel: for models with rotational mixing. The dotted areas show the
wind contributions. (From Hirschi et al., 2005, A&A, Vol. 433, p.1013–1022)
the carbon production in the Universe is provided by AGB stars. This scenario implies that stars of a later
generation are always born with a higher metallicity Z .
The winners of this whole galactic evolution are the compact remnants, as well as the brown dwarfs
and the low-mass main-sequence stars that did not have enough time yet since the Big Bang to evolve away
from the main sequence. At the end of the galactic evolution, when all available gas will be locked in these
low-mass objects, the star formation stops entirely. On a galactic scale we can summarize the evolution
processes as follows:
1. the amount of available gas decreases constantly. The number of gas and dust clouds therefore de-
creases steadily;
2. the luminosity of the galaxy, provided by the luminosities of the individual stars, decreases, as the
relative number of massive stars strongly decreases in favour of the number of compact remnants and
low mass stars;
3. the galaxy becomes more and more metal rich.
The strongest chemical enrichment took place in the early life of the galaxy, i.e. the degree of enrichment
decreases strongly in time. Consider the Sun as an example. Its age is about 1/3 of the age of our Milky
Way and its metallicity Z is about 0.014. The metallicity of the youngest stars in our galaxy is 0.04, while
that of the oldest stars is about 0.0003. This means that Z has increased by a factor ∼50 during the first 2/3
of the lifetime of the galaxy, and afterwards only by a factor ∼3.
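The two enrichment factors follow directly from the three metallicities quoted above, as this short Python check shows (the variable names are ours):

```python
# Metallicities quoted in the text.
z_oldest = 0.0003    # oldest stars in the galaxy
z_sun = 0.014        # the Sun
z_youngest = 0.04    # youngest stars in the galaxy

early_factor = z_sun / z_oldest      # growth of Z before the Sun formed
late_factor = z_youngest / z_sun     # growth of Z since the Sun formed

print(f"early enrichment factor ~ {early_factor:.0f}")
print(f"late enrichment factor  ~ {late_factor:.1f}")
```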
It thus becomes clear that the division of stars into only three Populations, as mentioned in the first
chapter, is too simple. The division has grown historically, mainly based on the position of the stars in and
around the galactic plane. In practice, it is impossible to group the stars in a discrete division of Populations,
as we have a continuous variation in Z . There are old Population I stars that formed between Populations
I and II and there are also extreme Population II stars with a metallicity lower than the average population
II star as they originated during the earliest stages after the galaxy formation. The latter stars are nowadays
called Population III stars, although they may have already obtained some metal fraction.
In any case we conclude that the mass fraction of heavy elements is very small even in the youngest
stars. The absolute mass of heavy elements that a star is born with nowadays is, however, substantial
when we compare it to the masses of exoplanets around host stars. For the Sun, e.g., we find that its mass
of heavy elements by far exceeds that of all planets in our solar system. The gaseous planets still contain
some primordial gas, but all other bodies in the solar system consist of heavy elements that were present
in the proto-solar dust cloud from which the Sun originated. The source of these heavy elements is
nucleosynthesis. We thus find that most atoms in our body have once belonged to stars and that most of
them experienced dramatic explosions.
Chapter 14
Binary stars and their evolution
More than half of the observed stars belong to multiple systems. In some binary systems, stars move at
such a considerable distance that their evolution is not influenced by the companion. Such wide binaries
follow the evolutionary theory of single stars as outlined in the course so far, hence these do not require an
extra chapter. Here, we deal with systems of stars that are gravitationally bound to each other and move
around a common centre of gravity in such a way that they influence each other’s evolution. The fraction
of binarity for main-sequence stars changes according to the masses of the stars involved, as shown in
Figures 14.1, 14.2, and 14.3. The latter two figures were reproduced from the pioneering study by Sana
et al. (2012), which revealed that binary interaction dominates the evolution of O-type stars, such that one
cannot ignore binarity when it comes to understanding and modelling the evolution of the most massive
stars in the Universe. The previous chapter has to be recalled keeping this in mind. In fact, the LBV shown
in Figure 13.1 is a binary and SN1987A was one as well.
This chapter offers only a very concise overview of close binary systems and their evolution, keeping
in mind that the biennial MSc courses Binary Stars and High-Energy Astrophysics are fully dedicated to this
topic and obviously provide many more details. Compact binaries are the end products of binary evolution.
These include the progenitors of gravitational-wave sources and are subject to general relativity, which is not treated here. As for the
origin of binaries, we do not go into details. Binary formation is mainly attributed to two processes: tidal
capture due to close encounters (e.g., in globular clusters) and fragmentation as discussed for single stars in
Chapter 9.
In a close binary, each star is distorted by the tidal force induced by its companion. More precisely,
a close binary is defined as a binary of which the distance between the components is comparable with or
only slightly larger than their own dimensions. The evolution of a star in such a close binary is different
from the evolution of an isolated star because of the proximity of the companion. Because of its presence,
the primary component cannot grow without limit during its evolution. As discussed in the previous chapters,
all evolved stars, once beyond the TAMS, go through giant or supergiant phases increasing their radius
drastically. The evolution of a star in a close binary will therefore change once the evolved primary comes
close to its companion. We recall from the discussion of stellar evolution theory of a single star that the most
Figure 14.1: Top panel: the y-axis shows the fraction of main-sequence primaries with at least one stellar
main-sequence companion having a mass at least 10% of the primary’s mass shown on the x-axis. Bottom
panel: the average frequency of stellar main-sequence companions having a mass at least 10% of the pri-
mary’s mass per primary. The red cross is for pre-MS binaries. These statistics exclude sub-stellar brown
dwarf companions, as well as compact remnant companions such as white dwarfs, neutron stars, or black
holes. (From Moe, M., 2019, Memorie della Societa Astronomica Italiana, Vol.90, p.347)
massive of the two components will be the first to evolve away from the main sequence.
14.1 Observational classification of close binary stars
It has long been known that gas streams from one component to the other occur in close binaries. This is a
result of the tidal forces and in some cases even of the physical interaction of two stars in contact. Such gas
streams were observed for the first time in the spectrum of the star β Lyrae during one of its eclipses.
Mass transfer is a necessary ingredient to understand binary stars, as shown by the so-called Algol
paradox. This paradox was derived from observations of the binary Algol in the 1940s. Algol is composed
of a red giant and a main-sequence star. Because the red giant is the more evolved star, it should be the more
massive one. However, measurements of the orbital radial velocity showed that the red giant is less massive
Figure 14.2: Cumulative number distributions of orbital periods (left panel) and of mass ratios (right panel)
for O-type objects. The horizontal solid line and the dark green area indicate the most probable intrinsic
number of binaries and its 1σ uncertainty, corresponding to an intrinsic binary fraction of 69±9%. The
horizontal dashed line indicates the most probable simulated number of detected binaries. (From Sana et al.,
2012, Science, Vol. 337, p. 444)
than the unevolved main-sequence star. This is the case for many similar binary systems. The fact that the
more evolved star is the less massive one stands in sharp contrast with the evolution theory in the previous
chapters. The paradox finds its solution by realising that the more evolved star has already undergone
considerable mass loss. The ejected mass is captured by its companion, making the latter the more massive
one of the system. Algol and the similar semi-detached systems clearly demonstrate the occurrence of mass
transfer between binary components.
The evolution of close binaries is mainly determined by the increase of the stellar radii, because the
size of the stars determines the start of mass transfer. The episodes of the fastest increase of the radius
correspond with transitions from central to shell burning, i.e., when the core contracts and the outer layers
expand. It is thus most plausible that mass transfer in a close binary occurs during these phases.
Observers usually define the primary component, or simply the primary, as the brighter star of the pair
and assign it the mass M1. The other star is then called the secondary, with mass M2. As we just discussed in
the case of Algol, this primary need not be the more massive one, as the mass ratio (denoted as
q ≡ M2/M1) may already have been reversed during phases of mass transfer. The star undergoing the mass loss
is called the donor, while the star accreting this mass is called the gainer. Much of the discussion to follow
will be based on Newton’s theory of gravity and on Kepler’s laws. We will not derive these laws here as this
Figure 14.3: Schematic representation of the relative importance of different binary interaction processes.
All percentages are expressed in terms of the fraction of born O-type stars, including the single and binary
ones, either as the initially more massive component (the primary), or the less massive one (the secondary).
(Figure courtesy of Prof. Selma de Mink, reproduced from Sana et al., 2012, Science, Vol. 337, p. 444)
was already done in detail in BSc courses.
As is often the case in astronomy, objects are divided into different categories or classes. The case of
binary stars is no exception. In fact, there are several types of classifications for binaries: based on observed
properties, based on physical principles, based on type of mass transfer. A first classification is based on
observational data:
• visual binaries: two stars in orbit around each other, far enough apart that we can see them as two stars
(by eye or by telescope). These binaries are situated relatively close to the Sun and have a relatively
wide orbit. Visual binaries are highly important for detailed mass determination.
• composite spectrum binaries: two stars in orbit around each other but generally too close to be seen
separately; their binary nature is revealed by spectral lines characteristic of two different stars (having
different or similar spectral types).
• eclipsing binaries: two stars in orbit around each other that periodically occult (or eclipse) one another
in the line-of-sight of an observer. This implies that the line-of-sight lies close to the orbital plane,
i.e., that the orbit is seen nearly edge-on. Like visual binaries, eclipsing binaries are important for
fundamental parameter determination.
• ellipsoidal variables: characterised by a double-wave light curve due to a non-spherical shape of the
components. There are no eclipses, which means that the line-of-sight is substantially inclined with
respect to the orbital plane.
• astrometric binaries: the binary nature is revealed because the visible star shows wobbles in its proper
motion.
• spectroscopic binaries: the binary nature is revealed by the back and forth motion of spectral lines
in the spectrum. Spectroscopic binaries are discovered from periodic changes in radial velocity, a
quantity deduced from the spectrum. In case of a circular orbit, we can deduce from Kepler’s third
law ($V \sim P^{-1/3}$) that the largest variation in the radial velocity curve is observed in close systems
with a short orbital period. That is why, for spectroscopic binaries, unlike for visual binaries, a
selection effect occurs towards close systems. This class is subdivided into:
1. SB1 or single-lined spectroscopic binaries: only the spectrum of one of the components is detected.
These are very common. Indeed, the spectrum of the brighter component usually dominates
that of the fainter component, because stellar luminosities can differ by orders of magnitude.
In that case only the spectrum of the brighter component is seen.
2. SB2 or double-lined spectroscopic binaries: the spectra of both components are detected simultaneously.
This means that the two stars cannot differ much in effective temperature. They are
thus less common than SB1s.
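The scaling V ∼ P^(−1/3) used for the spectroscopic binaries above follows from combining the circular orbital speed, V = 2πa/P, with Kepler’s third law, a³ ∝ (M1 + M2) P², which gives V = [2πG(M1 + M2)/P]^(1/3). A minimal Python sketch follows; note that it evaluates the relative orbital speed of the two stars, not the observed radial-velocity amplitude of one component, which additionally depends on the mass ratio and the inclination. The rounded constants and the function name are illustrative.

```python
import math

G = 6.674e-11       # gravitational constant [m^3 kg^-1 s^-2]
M_SUN = 1.989e30    # solar mass [kg]
DAY = 86400.0       # one day [s]

def relative_orbital_speed_kms(total_mass_msun, period_days):
    """Relative orbital speed (km/s) of two stars on a circular orbit."""
    period_s = period_days * DAY
    speed = (2.0 * math.pi * G * total_mass_msun * M_SUN / period_s) ** (1.0 / 3.0)
    return speed / 1.0e3

# Same total mass, increasingly wide orbits: the speed drops as P**(-1/3).
for period in (1.0, 8.0, 64.0, 512.0):
    speed = relative_orbital_speed_kms(2.0, period)
    print(f"P = {period:6.1f} d: V = {speed:6.1f} km/s")
```

Each factor of 8 in the orbital period halves the speed, which illustrates why short-period systems are the easiest to detect spectroscopically.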
14.2 The Roche model
We study a binary system consisting of two stars with respective masses M1 and M2. When both stars are
spherically symmetric, their gravitational potential is ∼ 1/r. In this case, both components move around
the centre of mass of the system on elliptical orbits. The orbital period is connected with the
extent of the system via Kepler’s third law:
$$P = 2\pi \sqrt{\frac{a^{3}}{G\,(M_1 + M_2)}}, \qquad (14.1)$$
where a is the semimajor axis of the orbit. When e is the orbit’s eccentricity, the separation between both
components varies from a(1−e) at periastron to a(1+e) at apastron. In case of a circular orbit, a is simply
the separation between the two stars.
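Eq. (14.1) and the periastron and apastron separations are easy to evaluate numerically. The Python sketch below uses rounded SI constants; the function names and the example binary (two stars of 10 and 5 M⊙ with a = 30 R⊙ and e = 0.3) are hypothetical and serve only as an illustration.

```python
import math

G = 6.674e-11          # gravitational constant [m^3 kg^-1 s^-2]
M_SUN = 1.989e30       # solar mass [kg]
R_SUN = 6.957e8        # solar radius [m]
AU = 1.495978707e11    # astronomical unit [m]
DAY = 86400.0          # one day [s]

def orbital_period_days(a_m, m1_msun, m2_msun):
    """Orbital period from Eq. (14.1) for a semimajor axis given in metres."""
    total_mass = (m1_msun + m2_msun) * M_SUN
    return 2.0 * math.pi * math.sqrt(a_m**3 / (G * total_mass)) / DAY

def separation_range(a, e):
    """Separation at periastron, a(1 - e), and at apastron, a(1 + e)."""
    return a * (1.0 - e), a * (1.0 + e)

# Hypothetical close binary: 10 + 5 Msun, a = 30 Rsun, e = 0.3.
period = orbital_period_days(30.0 * R_SUN, 10.0, 5.0)
peri, apo = separation_range(30.0, 0.3)
print(f"P = {period:.2f} d, separation between {peri:.1f} and {apo:.1f} Rsun")
```

As a sanity check, inserting a = 1 au and a total mass of 1 M⊙ returns a period of about one year.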
We want to assess how long one of the stars can expand before mass transfer occurs. For simplicity,
and also because this will turn out to be a valid approximation prior to the expansion of the more massive
binary component (see below), we assume that the binary system has a circular orbit and the components
are rotating with a constant angular frequency Ωorb = 2π/Porb.
Figure 14.4: The coordinate system with origin in the stellar centre of the primary component.
We consider a coordinate system with origin in the centre of mass of the binary. In an inertial system
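the velocity $\vec{v}$ of a particle with position vector $\vec{r}$ is given by
$$\vec{v} = \dot{\vec{r}} + \vec{\Omega}_{\rm orb} \times \vec{r}. \qquad (14.2)$$
Here $\dot{\vec{r}} \equiv {\rm d}\vec{r}/{\rm d}t$ stands for the motion of the particle with respect to a co-rotating coordinate system, while
$\vec{\Omega}_{\rm orb} \times \vec{r}$ represents the velocity of the point $\vec{r}$ with respect to the inertial system. Similarly, the acceleration
$\vec{a}$ of a particle expressed in an inertial system is provided by
$$\vec{a} = \dot{\vec{v}} + \vec{\Omega}_{\rm orb} \times \vec{v}, \qquad (14.3)$$
and using Eq. (14.2) this can be transformed into
$$\vec{a} = \ddot{\vec{r}} + 2\,\vec{\Omega}_{\rm orb} \times \dot{\vec{r}} + \vec{\Omega}_{\rm orb} \times (\vec{\Omega}_{\rm orb} \times \vec{r}). \qquad (14.4)$$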
The second term on the right hand side of this equation is the Coriolis acceleration and the third term is the
centrifugal acceleration. Let us consider a Cartesian coordinate system (X, Y, Z) that co-rotates in such a
way that the Z-axis coincides with the rotation axis: $\vec{\Omega}_{\rm orb} = (0, 0, \Omega_{\rm orb})$. For $\vec{r} = (X, Y, Z)$ we can deduce
the equality
$$\vec{\Omega}_{\rm orb} \times (\vec{\Omega}_{\rm orb} \times \vec{r}) = \vec{\nabla}\left[-\frac{1}{2}\,\Omega_{\rm orb}^{2}\left(X^{2} + Y^{2}\right)\right] \equiv \vec{\nabla}\Phi_{\Omega_{\rm orb}}. \qquad (14.5)$$
The centrifugal acceleration can thus be derived from a potential $\Phi_{\Omega_{\rm orb}}$ and acts perpendicular to the
direction of the rotation axis. The acceleration $\vec{a}$ of a mass element is controlled by the forces acting upon
it (we again work per unit mass):
$$\vec{a} = -\frac{1}{\rho}\,\vec{\nabla}P - \vec{\nabla}\Phi_{G}, \qquad (14.6)$$
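with $\Phi_{G}$ the gravitational potential due to the two stars. This way, the equation of motion for a unit mass
expressed in rotating coordinates becomes:
$$\ddot{\vec{r}} + 2\,\vec{\Omega}_{\rm orb} \times \dot{\vec{r}} = -\frac{1}{\rho}\,\vec{\nabla}P - \vec{\nabla}\Phi_{G} - \vec{\nabla}\Phi_{\Omega_{\rm orb}}. \qquad (14.7)$$
Besides this, also the continuity equation
$$\frac{\partial \rho}{\partial t} = -\vec{\nabla} \cdot (\rho\,\dot{\vec{r}}) \qquad (14.8)$$
and the Poisson equation
$$\vec{\nabla}^{2}\Phi_{G} = 4\pi G \rho \qquad (14.9)$$
must be fulfilled.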
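The identity in Eq. (14.5), on which the centrifugal term in the equation of motion rests, can be checked numerically. The Python sketch below compares the double cross product with a finite-difference gradient of the centrifugal potential at an arbitrary point. The helper names, the test point, and the value of Ωorb are arbitrary choices made for illustration.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def centrifugal_potential(r, omega):
    """Phi_Omega = -(1/2) Omega^2 (X^2 + Y^2), as in Eq. (14.5)."""
    x, y, _ = r
    return -0.5 * omega**2 * (x * x + y * y)

def gradient(f, r, h=1.0e-6):
    """Central-difference gradient of a scalar function f at the point r."""
    grad = []
    for i in range(3):
        plus, minus = list(r), list(r)
        plus[i] += h
        minus[i] -= h
        grad.append((f(plus) - f(minus)) / (2.0 * h))
    return tuple(grad)

omega = 0.7                # arbitrary angular frequency
r = (1.3, -0.4, 2.0)       # arbitrary point (X, Y, Z)
omega_vec = (0.0, 0.0, omega)

lhs = cross(omega_vec, cross(omega_vec, r))
rhs = gradient(lambda p: centrifugal_potential(p, omega), r)
print("Omega x (Omega x r):", lhs)
print("grad Phi_Omega:     ", rhs)
```

Both vectors have components (−Ω²X, −Ω²Y, 0), so they agree up to rounding errors of the finite-difference step.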
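We will discuss below that the tidal forces cause stars to end up in a state of co-rotation. In this case
$\dot{\vec{r}} = \vec{0}$ and the equation of motion becomes
$$\frac{1}{\rho}\,\vec{\nabla}P + \vec{\nabla}\Phi = \vec{0} \quad \text{with} \quad \Phi = \Phi_{G} + \Phi_{\Omega_{\rm orb}}. \qquad (14.10)$$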
This equation implies that surfaces of equal P and surfaces of equal Φ coincide, so it follows that P and
ρ are functions of the total potential Φ. In particular, the pressure and the density tend to zero at the stellar
surface, so the stellar surface is determined by the shape of the surface Φ = constant. We
conclude that if we want to deduce the stellar shape in a binary system, we have to determine the shape of
the equipotential surfaces Φ = constant.
Determining the general shape of the equipotential surfaces is complex. Through the Poisson equation
ΦG depends on the density distribution in each of the stars. The solution for an equipotential surface can
therefore only be obtained in a numerical way. In practice, an approximate solution is introduced following
the French physicist E. Roche (1820–1883). In the Roche approximation it is assumed that the gravitational
field of each of the stars can be approximated as if the star is not disturbed by rotation or by its companion.
In other words, we suppose that the mass of each of the stars is concentrated in the stellar centre. In this
case
$$\Phi_G = -\frac{GM_1}{|\vec{r} - \vec{r}_1|} - \frac{GM_2}{|\vec{r} - \vec{r}_2|} \tag{14.11}$$
is the solution to the Poisson equation, where the stellar centre of the first component is situated at $\vec{r}_1$ and
that of the second component at $\vec{r}_2$. This approach is good up to the level of a few percent because most
stars have a strong mass concentration towards their centre.
To deduce an explicit expression for the Roche equipotentials, it is convenient to shift to a coordinate
system with origin in the centre of the primary component (see Figure 14.4). We consider Cartesian coordinates
$(x, y, z)$ with the $z$-axis coinciding with the rotation axis. The $x$-axis connects the two stellar centres
and points towards the secondary component. The $y$-axis is chosen so as to achieve a right-handed rotating
coordinate system. The coordinates of the stellar centre of the secondary component then are $(a, 0, 0)$, with $a$ the orbital separation. The
binary's centre of mass is located in the point with coordinates $(\mu a, 0, 0)$, with $\mu \equiv M_2/(M_1 + M_2)$. The
transformation formulae between the systems $(X,Y,Z)$ and $(x, y, z)$ are
$$x = X + \mu a, \quad y = Y, \quad z = Z. \tag{14.12}$$
[Figure 14.5 panel labels: Roche lobe; detached; semi-detached, mass transfer (RLOF); common envelope]
Figure 14.5: Top panel: Sections in the orbital plane of Roche equipotentials, for a binary system with mass
ratio q = M2/M1 = 0.6. The positions of the Lagrangian points $L_1, \ldots, L_5$ (black dots) are indicated, as
well as that of the centre of mass (plus symbol) of the system. Bottom panel: A schematic representation
of the equipotentials as a function of distance along the axis joining the two stars, with top panel a detached
binary, middle panel a semi-detached binary, and bottom panel a contact binary. (Figure courtesy of Prof.
A. Jorissen, Université Libre de Bruxelles, Belgium).
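In these new coordinates, the Roche potential is given by
$$\Phi = \Phi_G + \Phi_{\Omega_{\rm orb}} = -\frac{GM_1}{(x^2 + y^2 + z^2)^{1/2}} - \frac{GM_2}{((x-a)^2 + y^2 + z^2)^{1/2}} - \frac{1}{2}\Omega_{\rm orb}^2\left[(x - \mu a)^2 + y^2\right] \tag{14.13}$$
and the Roche equipotentials are given by $\Phi$ = constant.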
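A minimal sketch of how Eq. (14.13) can be evaluated is given below. It assumes dimensionless units $G = M_1 + M_2 = a = 1$, so that $\Omega_{\rm orb} = 1$ for a circular Keplerian orbit. The function names are illustrative. As a sanity check it uses the standard result, not derived in these notes, that the triangular Lagrangian point $L_4$ forms an equilateral triangle with the two stellar centres, so the gradient of $\Phi$ should vanish there.

```python
import math

def roche_potential(x, y, z, q):
    """Roche potential of Eq. (14.13) in units G = M1 + M2 = a = 1.

    q = M2/M1 is the mass ratio; the primary sits at the origin and the
    secondary at (1, 0, 0). In these units Omega_orb = 1.
    """
    mu = q / (1.0 + q)                       # mu = M2 / (M1 + M2)
    r1 = math.sqrt(x * x + y * y + z * z)    # distance to the primary
    r2 = math.sqrt((x - 1.0) ** 2 + y * y + z * z)  # distance to the secondary
    return -(1.0 - mu) / r1 - mu / r2 - 0.5 * ((x - mu) ** 2 + y * y)

def roche_gradient(x, y, z, q, h=1e-5):
    """Central-difference gradient of the Roche potential."""
    gx = (roche_potential(x + h, y, z, q) - roche_potential(x - h, y, z, q)) / (2 * h)
    gy = (roche_potential(x, y + h, z, q) - roche_potential(x, y - h, z, q)) / (2 * h)
    gz = (roche_potential(x, y, z + h, q) - roche_potential(x, y, z - h, q)) / (2 * h)
    return gx, gy, gz

q = 0.6                                       # mass ratio used in Figure 14.5
x4, y4 = 0.5, math.sqrt(3.0) / 2.0            # assumed location of L4
print(roche_potential(x4, y4, 0.0, q))
print(roche_gradient(x4, y4, 0.0, q))         # all components close to zero
```

Evaluating `roche_potential` on a grid in the plane z = 0 and contouring the result would give sections like those in the top panel of Figure 14.5.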
Figure 14.6: Schematic representation of stable RLOF occurring near the tip of the RGB and leading to a
configuration of a subdwarf binary according to the properties as indicated. (Figure courtesy of Prof. Philipp
Podsiadlowski, Oxford University, UK)
In Figure 14.5 we show the equipotentials for z = 0, coinciding with the orbital plane. The equipotential
surfaces close to each star are almost spherically symmetric because the influence of the other star is
minimal there. The presence of the companion distorts the surfaces in two different ways. Because of the tidal
forces due to the gravitational field of the companion, the equipotential surface gets an elongated spheroidal
shape. The co-rotation of the star with the orbital frequency causes the spheroid to be flattened by the
centrifugal force, such that the axis of symmetry of the flattened spheroid coincides with the rotation axis.
The net effect of both of these quasi-ellipsoidal distortions is that the star has its maximum dimension along
the connecting line of the two stellar centres, while it is minimal in the direction of the rotation axis.
The further from the star, the more distorted the equipotential surfaces are, until they eventually reach
the critical potential. The critical equipotential surface contains the inner Lagrangian point L1, which is a saddle point of the potential.
In three dimensions, the equipotential surfaces are approximately axisymmetric around the connecting line
between the stellar centres. The equipotential surface through the point L1 thus encloses two volumes, one around each star,
which are called the Roche lobes. Figure 14.5 reveals the three most important maxima of the potential along the line joining the stars, corresponding with
the three Lagrangian points L1, L2, L3. These can be found by solving the following equations
$$\frac{\partial\Phi}{\partial x} = 0, \quad y = 0, \quad z = 0. \tag{14.14}$$
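Eq. (14.14) has no closed-form solution, but the three roots are easily found numerically. The sketch below is one possible approach, assuming units $G = M_1 + M_2 = a = \Omega_{\rm orb} = 1$ and a simple bisection. The function names are illustrative, and which of the two outer points is called L2 or L3 is left open here, since conventions differ.

```python
def dphi_dx(x, mu):
    """d(Phi)/dx along the x-axis (y = z = 0) for the Roche potential,
    in units G = M1 + M2 = a = Omega_orb = 1, with mu = M2 / (M1 + M2)."""
    return ((1.0 - mu) * x / abs(x) ** 3
            + mu * (x - 1.0) / abs(x - 1.0) ** 3
            - (x - mu))

def bisect(f, lo, hi, n_iter=200):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

def collinear_points(q):
    """Return x/a of the three collinear Lagrangian points for q = M2/M1:
    (between the stars, beyond the secondary, beyond the primary)."""
    mu = q / (1.0 + q)
    f = lambda x: dphi_dx(x, mu)
    eps = 1e-6                       # stay clear of the singularities at x = 0, 1
    x_inner = bisect(f, eps, 1.0 - eps)
    x_beyond_secondary = bisect(f, 1.0 + eps, 3.0)
    x_beyond_primary = bisect(f, -3.0, -eps)
    return x_inner, x_beyond_secondary, x_beyond_primary

x_inner, x_out_sec, x_out_prim = collinear_points(0.6)   # q of Figure 14.5
print(x_inner, x_out_sec, x_out_prim)
```

For equal masses the inner point lies exactly halfway between the stars; for q < 1 it shifts towards the less massive secondary.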
When both stars are unevolved, each component is situated within its Roche lobe and the stars are
almost spherically symmetric. This is called a detached system. When the stars evolve, they expand. Each
star can only expand until it has filled its Roche lobe. In case of further expansion, mass transfer will occur
to the other component via L1. From this moment on, the system is semi-detached and experiences Roche
lobe overflow (RLOF). When both stars have expanded beyond their Roche lobes, a contact system is formed.
A diagram for these three cases is shown in Figure 14.5. The envelope of a contact system can co-rotate
with the orbital motion until the expansion reaches beyond L2. When the binary expands even further, it
loses mass via L2 and the mass is no longer forced to co-rotate. In this case $\dot{\vec{r}} \neq \vec{0}$ and the Roche potentials
are no longer relevant.
The tidal forces get stronger as a star fills its Roche lobe. The Roche lobe radius $R_L$ is defined in such
a way that the volume filled by the Roche lobe equals $(4\pi/3) R_L^3$. The ratio $R_L/a$ is fully determined by the
ratio of the components' masses. Once the star's radius equals the radius of the Roche lobe, the tidal effects
are so strong that the stars will move in a circular orbit and will rotate synchronously. This state will be
achieved first in the outer layers.
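These notes do not give an explicit expression for the ratio of the Roche lobe radius to the orbital separation. As an illustration only, the sketch below uses the widely quoted fitting formula of Eggleton (1983), which is an assumption brought in here and not part of the text above. In it, `q` is the mass of the star whose lobe is computed divided by the mass of its companion.

```python
import math

def roche_lobe_radius_ratio(q):
    """Approximate R_L / a from the Eggleton (1983) fitting formula.

    q = (mass of the star under consideration) / (mass of its companion).
    This formula is an external assumption, not taken from these notes.
    """
    q13 = q ** (1.0 / 3.0)
    q23 = q13 * q13
    return 0.49 * q23 / (0.6 * q23 + math.log(1.0 + q13))

for q in (0.6, 1.0, 1.0 / 0.6):
    print(q, roche_lobe_radius_ratio(q))
```

The ratio depends on the mass ratio only and grows with it, so the more massive component has the larger Roche lobe.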
14.3 Determination of the orbital and fundamental parameters of binaries
14.3.1 Orbital elements
Binary stars are subject to the equations describing a two-body orbital motion. In general they move on
elliptical orbits, characterised by a semi-major axis a, an eccentricity e, and an orbital period Porb. It is
customary to describe the position and orientation of the orbit with respect to an inertial coordinate frame
whose z-axis points towards the observer. The orbit is then described with the three so-called Euler angles.
The first angle is the inclination $i$. It is defined as the angle between the line-of-sight and the axis of the
angular momentum vector of the binary, such that zero inclination corresponds to a pole-on view and $i = 90^\circ$
to a view in the orbital plane. A second angle, $\Omega$ (not to be confused with the orbital angular frequency $\Omega_{\rm orb}$), describes the longitude of the ascending node. The
orientation of the orbit within its own plane is described by the longitude of periastron $\omega$.
The position of the stars as a function of time is given by Kepler's equation. Thus, we add a sixth
parameter describing the motion of the bodies, by fixing the time of periastron passage $T$. The six quantities
$(a, e, i, \omega, \Omega, T)$ thus determine the binary orbit in three-dimensional space. They are termed the orbital
elements. These definitions as well as the equations of the two-body problem are fully described in the
biennial MSc course Binary Stars. The students not attending this elective course are referred to Chapter 2
of the monograph “An Introduction to Close Binary Stars” by Hilditch (2001) for details. Moreover, Chapter
3 of that book describes in detail how to determine the orbital elements from various types of data.
14.3.2 Masses and radii of main-sequence stars
Important quantities that can be determined from data, besides the six parameters describing the orbit, are
• the radial velocity of the centre of mass of the binary, termed the gamma-velocity or also the systemic
velocity;
• the semi-amplitude of the radial-velocity curves of the primary K1 and of the secondary K2;
• the mass function, defined as
$$f(M) \equiv \frac{M_2^3 \sin^3 i}{(M_1 + M_2)^2}.$$
As shown in Hilditch (2001), the mass function can be determined for each spectroscopic binary from
its orbital period $P_{\rm orb}$, $K_1$ and $e$:
$$f(M) = P_{\rm orb} \left(\frac{K_1}{29.79}\right)^3 (1 - e^2)^{3/2},$$
where $P_{\rm orb}$ should be expressed in years and $K_1$ in km s$^{-1}$ to obtain $f(M)$ in solar masses. The mass function is one
equation containing three unknowns. It represents a lower limit for the mass of the secondary (reached for $M_1 = 0$ and $i = 90^\circ$).
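The use of the mass function can be illustrated with a short sketch. It assumes that the period entering the formula is the orbital period in years and that the definition reads $M_2^3 \sin^3 i/(M_1+M_2)^2$. The function names and the input values are hypothetical, and the secondary mass is recovered with a simple bisection once a primary mass and an inclination are assumed.

```python
import math

def mass_function(p_orb_yr, k1_kms, ecc):
    """f(M) in solar masses from P_orb [yr], K1 [km/s] and eccentricity e."""
    return p_orb_yr * (k1_kms / 29.79) ** 3 * (1.0 - ecc ** 2) ** 1.5

def secondary_mass(f_m, m1, incl_deg, upper=1.0e4):
    """Solve M2^3 sin^3(i) / (M1 + M2)^2 = f(M) for M2 by bisection.

    The left-hand side grows monotonically with M2, so the root is unique.
    Masses are in solar masses; 'upper' is an arbitrary search limit.
    """
    s3 = math.sin(math.radians(incl_deg)) ** 3
    g = lambda m2: m2 ** 3 * s3 / (m1 + m2) ** 2 - f_m
    lo, hi = 0.0, upper
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Hypothetical single-lined binary: P = 0.05 yr, K1 = 40 km/s, e = 0.2
f_m = mass_function(0.05, 40.0, 0.2)
print(f_m)                                   # lower limit on M2 (M1 = 0, i = 90 deg)
print(secondary_mass(f_m, 1.0, 90.0))        # M2 if M1 = 1 Msun and i = 90 deg
print(secondary_mass(f_m, 1.0, 60.0))        # a lower inclination implies a larger M2
```

The normalisation 29.79 km s$^{-1}$ is the orbital velocity of the Earth, so a body with $K_1 = 29.79$ km s$^{-1}$ on a circular one-year orbit gives $f(M) = 1$ M⊙.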
As explained in Hilditch (2001), double-lined spectroscopic binaries (SB2s) allow the derivation of the mass ratio, as we can deduce the ratio of
the semi-major axes of both components from their semi-amplitudes. Eclipsing SB2s deliver the
masses and radii of the individual components, because a value of the inclination angle can be derived as
well. Visual or astrometric binaries also provide good constraints, but only for eclipsing SB2s do we get all
four of M1, M2, R1, R2 with very high accuracy, from Kepler's laws. There are about 100 binary systems
nowadays for which such dynamical masses and radii can be derived with 1% accuracy, without having to
rely on stellar models. Most of these are included in the review paper of unevolved binaries covering a large
mass range by Torres et al. (2010), to which we refer for the values.
Recall that the mass-luminosity relations (MLRs) of ZAMS stellar models were evaluated by means of binaries in Figure 10.2.
We noted a very good agreement between observations and theory, for a large mass range covering
a huge range in luminosities. This agreement between theory and observations at the level of 1% in the
masses and radii is especially impressive because we compare theoretical ZAMS models with observations
of stars that cover the entire main sequence. Note, however, that internal mixing, both in the form of CBM
and envelope mixing, occurs during the evolution on the main sequence (see Chapter 7). This creates more
massive helium cores. The study by Torres et al. (2010) did not take this into account. An improved homogeneous
analysis for 11 of the massive binaries was performed by Tkachenko et al. (2020), taking
CBM into account and revealing the need for higher core masses than those obtained when ignoring CBM. In
this way, binaries play a critical role in evaluating single-star models in the core-hydrogen burning phase, and in
evaluating the internal mixing and angular momentum transport processes discussed in Chapter 7.
14.3.3 Masses of white dwarfs
The derivation of accurate masses of white dwarfs is evidently important to test the Chandrasekhar mass
limit of 1.44 M⊙. Opportunities to derive model-independent masses of single white dwarfs, or white dwarfs
in wide binaries, are scarce because the luminosity of the white dwarf is usually much lower than
that of its companion star. Three well-known visual binaries with a white dwarf component are Sirius
(α CMa, A1V primary, orbital period of 50 years), Procyon (α CMi, F5IV primary, orbital period of 41 years) and
$o^2$ Eri BC (M4.5V primary, orbital period of 252 years). The masses of these white dwarfs are, respectively,
0.94, 0.65 and 0.43 M⊙. Asteroseismology also delivers good mass estimates for tens of white dwarfs. The
conclusion is that all mass estimates so far are compatible with the Chandrasekhar limit. The histogram of
the masses of white dwarfs shows a clear maximum at M = 0.58M⊙. This is understood in terms of the
initial mass function: stars born with 1.5 M⊙ end their lives as white dwarfs with a mass below 0.65 M⊙.
The determination of white dwarf masses is also important to try and understand the mass loss during
the AGB. Model computations predict that Sirius B descended from a star with initial mass between 3 and
4 M⊙. All white dwarfs in open clusters with a turn-off mass near 6 M⊙ have a remnant mass of 1.2 M⊙.
This shows that stars with birth masses up to 6 M⊙ have no problem at all losing enough mass to end their
life as a white dwarf with a mass below the Chandrasekhar limit.
White dwarfs occur frequently in cataclysmic variables. These are close binaries consisting of a cool
low-mass main-sequence star losing mass to its white-dwarf companion. The masses of these white dwarfs
are determined with less accuracy than for a visual binary because the white dwarf itself is not detected.
Rather, emission lines due to the infall of matter from the donor onto the accretion disk around the gainer are
observed. These emission lines follow the orbital motion and thus allow one to derive how the white dwarf itself
moves in its orbit. This is possible whenever the emission lines are clearly detected and not too broad (in
order to derive a precise value for the velocity changes). Usually, the estimate of the orbital inclination is
less accurate in this case than for a visual binary and this limits the accuracy of the masses. The white
dwarfs in cataclysmic variables have higher masses than the single white dwarfs.
Figure 14.7: Schematic presentation of a binary whose least evolved component transfers mass to a compact
companion, creating an accretion disk around the latter. The position in the disk where matter is accreted
reveals a hot spot, giving rise to emission lines in the spectrum of the binary. This allows one to determine the
orbital motion, even if the gainer itself is invisible. (From Pringle & Wade 1985)
14.3.4 Type Ia supernovae
The standard model for the formation of Type-I supernovae was already mentioned in Chapter 11. A white dwarf
in a close binary, whose mass is already close to the Chandrasekhar limit, keeps accreting
mass from its donor (cf. cartoon in Figure 14.7). At a certain point, the Chandrasekhar limit is surpassed
and a nuclear reaction transforming carbon into oxygen takes place in degenerate matter. This leads to
a thermonuclear runaway and hence to an explosion. Such Type-I supernovae are classified in different
subcategories according to the spectral lines observed in their spectrum. Type-Ia supernovae are observed
most frequently. They show spectral lines of O, Mg, Si, Ca, and Fe and their Fe and Co lines remain visible
several months after the explosion. They are found in all types of galaxies and are thus ideally suited as
distance indicators. Type-Ia supernovae are important in observational cosmology, to derive the size and age
of the expanding Universe.
The uncertainties in 3D hydrodynamical simulations that predict the explosion of Type-I supernovae are
numerous. However, it is clear that two parameters are crucial in the evolution: the mass of the white dwarf
and its accretion rate. Different combinations of these parameters lead to quite different types of explosions
in terms of the remnants. Some explosions are so violent that they leave no remnant; others result again
in a white dwarf which is less massive than the one that went over the Chandrasekhar limit. An important
chemical aspect of Type-Ia supernovae is the presence of $^{56}$Ni. This isotope decays into $^{56}$Co
which, in its turn, decays into $^{56}$Fe with a half-life of 77 days. This picture is fully confirmed by the
observation of Co lines several months after the explosion of Type-Ia supernovae.
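The decay chain can be followed with the standard two-step (Bateman) solution, sketched below. The 77-day half-life of $^{56}$Co is the value quoted above. The half-life of about 6.1 days adopted for $^{56}$Ni is an assumption added here, since the text does not state it, and the function name is illustrative.

```python
import math

def decay_chain(t_days, t_half_ni=6.1, t_half_co=77.0):
    """Fractions of 56Ni, 56Co and 56Fe at time t, starting from pure 56Ni.

    t_half_ni (days) is an assumed value not given in the text;
    t_half_co (days) is the half-life quoted in the text.
    """
    lam_ni = math.log(2.0) / t_half_ni
    lam_co = math.log(2.0) / t_half_co
    ni = math.exp(-lam_ni * t_days)
    co = lam_ni / (lam_co - lam_ni) * (math.exp(-lam_ni * t_days)
                                       - math.exp(-lam_co * t_days))
    fe = 1.0 - ni - co
    return ni, co, fe

for t in (0.0, 10.0, 30.0, 100.0, 200.0):
    ni, co, fe = decay_chain(t)
    print(f"t = {t:5.0f} d  Ni = {ni:.3f}  Co = {co:.3f}  Fe = {fe:.3f}")
```

Under these assumptions a sizeable fraction of the original nuclei is still in the form of $^{56}$Co after a few months, in line with the observed persistence of Co lines.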
14.3.5 Masses of neutron stars
A degenerate neutron gas is obviously able to counterbalance gravity in a neutron star, up to a mass of about
2.5 M⊙. Unlike the case for a degenerate electron gas, the EOS of a degenerate neutron gas is still debated,
with various formulations under investigation. These EOS candidates lead to different limiting masses (in
analogy with the Chandrasekhar limit), and thus accurate observational mass determination of neutron stars
is of utmost importance for high-energy physics.
The mass determination of neutron stars is based on data of X-ray binaries. Low-mass X-ray binaries
(LMXBs) are analogous to cataclysmic variables, with the gainer a neutron star or a black hole instead of a
white dwarf. For the same amount of mass transfer, an LMXB produces much more energy output and at
shorter wavelengths because the potential well of a neutron star is ∼1000 times deeper than that of a white
dwarf. This number is even higher for a black hole. The matter falling into the gainer's gravitational well
radiates energy in the form of X-rays, hence the name. The surface temperature of such sources is about
$10^6$ – $10^7$ K, which places the radiated energy in the X-ray range of the electromagnetic spectrum. A high-mass
X-ray binary (HMXB) results from a massive OB-type main-sequence star dumping material on a neutron
star in a close binary. The magnetic field lines of the neutron star capture the material of the main-sequence
donor, which has a strong stellar wind throughout its life (cf. Chapter 13), and transport this matter to the
magnetic poles of the gainer. When the captured matter falls onto the neutron star, a high-energy stream
emerges and sends out X-rays. The orbital periods of X-ray binaries vary from minutes to tens of days.
In terms of mass determination, X-ray binaries can be treated in the same way as double-lined spec-
troscopic binaries. We do not observe the neutron star directly, but its motion around its companion can
be derived with high precision in case of a pulsar thanks to the light-time effect. The pulses of the neutron
star are shorter whenever it moves towards the observer, while they are longer when the neutron star moves
away from the observer. This effect allows one to determine the mass, provided that the inclination angle is
known. The inclination is estimated from the eclipse of the X-rays by the companion and its uncertainty is
the largest limitation for the neutron star mass estimate. Derived masses range from 0.5 to 2.5 M⊙, hence
most neutron stars have masses not much above the Chandrasekhar limit. A unique opportunity to derive the
mass of neutron stars with an extremely high accuracy occurs for binary pulsars. One can reach accuracy
levels of order 0.001 M⊙ in that way.
14.4 Mass transfer and evolution of close binaries
In general, the components of unevolved close binaries rotate much more slowly than single stars in the same
stage of their evolution. Observations also show that many stars in close binary systems are synchronised
with the orbital motion. Both results are attributed to the effects of tidal forces and show that these are
important in close binaries.
Figure 14.8: Schematic representation of the difference between the properties of typical cases giving rise
to a HMXB and a LMXB. (Figure courtesy of Prof. Ed van den Heuvel)
14.4.1 Tidal effects: circularisation and synchronisation
If we want to study the dynamical effects caused by tidal forces in a close binary, it is essential to first
determine the external gravitational field caused by the two components. Whenever the two components of
a binary are located close to each other, the tidal forces will distort one or both of the components. In
this case, the gravitational potential is no longer a simple function of 1/r but extra terms occur. These extra
terms appear because the mass distribution is no longer spherically symmetric in the affected components.
The extra terms in the gravitational potential cause a variation of the order of a few percent. When the stars
are moving around each other, the extra terms produce a force that causes a variation in energy and angular
momentum of the orbit. Among other things, this variation may cause apsidal motion, so the orbit is no
longer a closed ellipse. In this way, the close binary will evolve into a circular system.
An equilibrium (or static) tide occurs when the stars are constantly in a state of hydrostatic equilibrium
in a circular orbit and with their rotation synchronised with the orbital motion. In that case, the distortion of
the components is static in a coordinate system rotating with the binary. In the general case of an eccentric
non-synchronised binary, the tidal forces experienced by the components change throughout the orbital motion.
Such time-dependent tides are called dynamic tides. The tidal interaction causes the system to evolve
towards a circularised orbit and a state of co-rotation with the orbital period. As long as the stars are subjected
to dynamic tides, they experience tidal forces with a variable amplitude. In this way, the stars are
forced to oscillate. Hence, tides can be described as a superposition of forced oscillations of a spherically
symmetric star. Given that the tidal forces imply periodic perturbations, resonances may occur between the
dynamic tides and low-frequency gravity-mode oscillations (see course Asteroseismology for a definition).
These resonances intensify the effects of the tidal forces considerably and guide mixing via tidal friction
and angular momentum transport in the system.
Theoretical calculations (omitted here) show that the tidal forces first give rise to the synchronisation
of the components, while the process of circularisation takes more time. The two following mechanisms
explain synchronisation in close binaries:
1. In a star consisting of a convective core and a radiative envelope, the forced oscillations are damped
by viscous effects in the outer stellar layers. This dissipates the pulsation energy. Following this
dissipation, the companion exerts a torque on the other star causing the synchronisation of the rotation
with the companion’s orbital motion. For relatively close binaries, the torque is strong enough to
synchronise the system within a time span shorter than the nuclear time scale of the star. This radiative
damping of dynamic tides is the most efficient mechanism for the synchronisation of the rotation of
the close binary components without a convective envelope.
2. In the components of close binaries with a convective envelope, turbulent convection can slow down
the equilibrium tide in relation to the tide-generating potential. A torque is induced and this leads to
synchronisation. In this case, there is a higher degree of uncertainty for the time scale upon which
synchronisation is induced than for stars with a radiative envelope.
These theoretical results are based on the simplified assumptions that the primary is spherically symmetric,
rotates uniformly in its interior and with the axis of rotation perpendicular to the orbital plane. Ignoring the
effects of the Coriolis force in the treatment of the forced oscillations leads to the following expressions for
the time scales of synchronisation and circularisation of the orbit:
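$$\tau_{\rm circ} \approx 10^6 \cdot \frac{1}{q}\left[\frac{1+q}{2}\right]^{5/3} P_{\rm orb}^{16/3}; \qquad \tau_{\rm syn} \approx 10^4 \cdot \left[\frac{1+q}{2q}\right]^2 P_{\rm orb}^4 \tag{14.15}$$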
(result taken from the seminal paper by J. P. Zahn, 1977, “Tidal Friction in Close Binaries”, A&A, Vol. 500,
pp. 121–132). The orbital periods in these formulae must be expressed in days to get the results in years.
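A quick numerical reading of Eq. (14.15) is sketched below. The function names are illustrative, $q$ is the mass ratio as used in the equation, and periods are in days with results in years, as stated above.

```python
def tau_circ_yr(q, p_orb_days):
    """Circularisation time scale of Eq. (14.15), in years."""
    return 1.0e6 * (1.0 / q) * ((1.0 + q) / 2.0) ** (5.0 / 3.0) * p_orb_days ** (16.0 / 3.0)

def tau_syn_yr(q, p_orb_days):
    """Synchronisation time scale of Eq. (14.15), in years."""
    return 1.0e4 * ((1.0 + q) / (2.0 * q)) ** 2 * p_orb_days ** 4

for p in (1.0, 2.0, 5.0, 10.0):
    print(p, tau_syn_yr(1.0, p), tau_circ_yr(1.0, p))
```

The steep dependence on the orbital period ($P_{\rm orb}^4$ and $P_{\rm orb}^{16/3}$) means that doubling the period lengthens the synchronisation time by a factor 16 and the circularisation time by a factor of about 40.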
Mechanism 2 above is effective during the pre-MS phase. Hence some pre-MS stars may arrive on the
ZAMS with already circularised orbits (and are hence expected to be synchronised), particularly for binary
protostars with orbital periods below some 10 days.
We conclude that, either during the pre-MS phase or on the main sequence, the dynamic tides cause
close binaries to evolve into a state of minimal energy: two co-rotating stars in a circular orbit. The
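time scale for circularisation is, according to Eq. (14.15), considerably longer than the one for synchronisation (by a factor of order 100 for $q \approx 1$ and $P_{\rm orb} \approx 1$ d). Both time scales are shorter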
than the nuclear time scale. It is therefore appropriate to assume that close binaries are synchronised and
circularised by the time that the primary reaches the TAMS. Should this not be the case, then this state will
soon thereafter be achieved once RLOF starts.
14.4.2 Mass transfer
The evolution of a star in a close binary is different from the evolution of a single star because mass transfer
occurs and therefore the evolution cannot just be described in terms of the initial birth mass. The evolution
of a star in a close binary is mainly determined by the size of its Roche lobe. In describing the evolution
of the components we rely on simple theoretical principles and adopt the convention that the donor is star 1
and the gainer is star 2. Because more massive stars evolve faster, we assume that the birth mass of star 1 is
higher than the one of star 2.
When star 1 fills its Roche lobe as a result of its evolution, the evolution is guided by the mass transfer
via L1 as soon as the size of the star becomes larger than the size of its Roche lobe. The stream of
gas particles from the atmosphere of star 1 through L1 to the Roche lobe of star 2 behaves as if the gas
leaks from the stellar atmosphere of star 1 into a vacuum, because star 2 is not yet filling its Roche lobe.
Therefore, the flow velocity of the gas equals the speed of sound in the atmosphere of star 1, which can be
approximated as $v_s \approx 15\,T_4^{1/2}$ km s$^{-1}$ (again with the notation $T_4 \equiv T/10^4$ K). Since the temperature of
the star can range from about 3 000 K for an M dwarf to some 30 000 K for a late O dwarf, $v_s$ varies roughly between
10 and 30 km s$^{-1}$.
To understand what happens with the gas stream when it is accelerated towards star 2 after having
passed L1, a second relevant velocity is considered, namely the dynamical velocity in the binary, which
equals the orbital velocity of the stars. This can be deduced from Kepler's laws. A good approximation
for a system with not too different masses is
$$v_{\rm orb} \approx 100 \left(\frac{M_1 + M_2}{M_\odot}\right)^{1/3} \left(\frac{P}{\rm d}\right)^{-1/3} {\rm km\,s^{-1}}. \tag{14.16}$$
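The two velocity scales can be compared with a few lines of code. The sketch assumes the usual scaling of the sound speed with the square root of the temperature, normalised to 15 km s$^{-1}$ at $10^4$ K, together with Eq. (14.16). The function names and the example system are hypothetical.

```python
import math

def sound_speed_kms(temp_k):
    """Sound speed in km/s, assuming v_s = 15 * sqrt(T / 10^4 K)."""
    return 15.0 * math.sqrt(temp_k / 1.0e4)

def orbital_velocity_kms(m_total_msun, p_orb_days):
    """Orbital velocity of Eq. (14.16) in km/s (masses in Msun, period in days)."""
    return 100.0 * m_total_msun ** (1.0 / 3.0) * p_orb_days ** (-1.0 / 3.0)

# Hypothetical system: total mass 2 Msun, period 5 d, donor temperature 30 000 K
v_s = sound_speed_kms(3.0e4)
v_orb = orbital_velocity_kms(2.0, 5.0)
print(v_s, v_orb, v_orb / v_s)    # the last number is the Mach number of the stream
```

The ratio of the two velocities gives an order-of-magnitude Mach number for the stream once it has been accelerated in the potential of star 2.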
Hence, the sound speed of the gas particles in the stream is far below the orbital velocity in the system so
they get accelerated to a supersonic velocity. This fact implies that the transferred material stays within
a well-defined stream after it has passed L1 because it moves at a velocity much higher than its original
natural velocity vs. Numerical simulations show the gas stream to stay within the Roche lobe of star 2. Two
scenarios can occur:
1. The incoming gas stream immediately collides with the stellar surface of star 2. In this case, the
energy gained by the gas stream as a result of the attraction by star 2, is dissipated in a shock at the
stellar surface of star 2. Almost all of the transferred material is immediately accreted by star 2. There
is conservation of mass and of orbital angular momentum in the whole system.
2. The incoming gas stream is too far away from the stellar surface of star 2 and it is not captured at once.
In this case, the gas stream moves around star 2 and, at a given moment, collides with itself at a point
near star 2 within the Roche lobe. Since the stream moves supersonically, the collision dissipates
kinetic energy, causing heating of the gas stream. This gas stream will radiate thermal energy and cool
down. Since the collision occurs near star 2, we can neglect the influence of star 1 on the further
trajectory of the gas stream. The stream evolves to a state of minimal energy, which corresponds with
a circular ring around star 2. The radius of the orbit of this ring, $R_H$, is such that its orbital velocity
equals the tangential velocity of the stream at the point of collision with itself.
Because the material in the gas ring radiates energy, some of the particles move closer towards star
2. The conservation of orbital angular momentum implies that particles will be found further away
from the star as well. So, the gas ring will evolve to a disk around star 2. This way, an accretion disk
is formed around star 2. Matter from this disk can then easily be accreted by star 2. In this scenario,
there is also conservation of mass and of orbital angular momentum (cf. Figure 14.6).
The continuous flow of matter coming from the cooler donor and passing via L1 creates a so-called
hot spot at the exterior of the accretion disk (cf. Figure 14.7). This hot spot causes energetic radiation,
which can be detected as an excess of radiation at ultraviolet or at even shorter wavelengths according
to the level of heating that occurs in the hot spot.
14.4.3 Effect of mass transfer on the orbital parameters
Conservative mass transfer
The orbital period is the quantity of a binary that is the easiest to determine observationally. The orbital
period changes as a result of the mass transfer from donor to gainer. We confine ourselves to the description
of circular orbits, because we can roughly assume that the tidal force acts in such a way that close binaries
evolve towards a circular configuration as argued above.
In a system of coordinates of which the origin coincides with the mass centre of the close binary, each
of the stars moves in a circular orbit around the mass centre of the system. The orbits of stars 1 and 2 then
have radii:
$$a_1 = a\,\frac{M_2}{M_1 + M_2} \quad {\rm and} \quad a_2 = a\,\frac{M_1}{M_1 + M_2}, \tag{14.17}$$
in which $a$ is the orbital separation. The total angular momentum in the system is approximately
$$J_{\rm orb} = M_1 a_1^2\,\Omega_{\rm orb} + M_2 a_2^2\,\Omega_{\rm orb}, \tag{14.18}$$
in which $\Omega_{\rm orb} = 2\pi/P_{\rm orb}$. Two additional terms occur when the stars do not rotate synchronously with
the orbit, i.e., $\Omega_{\rm orb} \neq \Omega_{\rm rot}$. However, those rotational contributions are so small that they are negligible
in comparison with the terms on the right hand side of Eq. (14.18) describing the orbital angular
momentum because the stellar radii are smaller than the orbital radii. With the help of Eqs (14.17) for the
radii, the equation for the angular momentum can be transformed into
$$J_{\rm orb} = \frac{M_1 M_2}{M_1 + M_2}\,a^2\,\Omega_{\rm orb}. \tag{14.19}$$
When calculating the evolution of close binaries, it is often assumed that the process of mass transfer
is conservative. This means that there is no mass loss from the binary system and there is no loss of angular
momentum. In this case,
$$\frac{d}{dt}M = \frac{d}{dt}(M_1 + M_2) = 0 \quad {\rm and} \quad \frac{dJ_{\rm orb}}{dt} = 0. \tag{14.20}$$
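By taking the time derivative of Eq. (14.19), we get
$$\frac{\dot{J}_{\rm orb}}{J_{\rm orb}} = 2\frac{\dot{a}}{a} + \frac{\dot{\Omega}_{\rm orb}}{\Omega_{\rm orb}} + \frac{\dot{M}_1}{M_1} + \frac{\dot{M}_2}{M_2} - \frac{\dot{M}}{M}, \tag{14.21}$$
and, taking the conservation of mass and angular momentum into account, this can be reduced to
$$2\frac{\dot{a}}{a} + \frac{\dot{\Omega}_{\rm orb}}{\Omega_{\rm orb}} + \frac{\dot{M}_1}{M_1} + \frac{\dot{M}_2}{M_2} = 0. \tag{14.22}$$
By applying Kepler's 3rd law,
$$a^3 = \frac{G(M_1 + M_2)}{4\pi^2}\,P_{\rm orb}^2, \tag{14.23}$$
this result can be transformed into the following conditions for the variation of the orbital period and of the
orbital separation, respectively:
$$\frac{\dot{P}_{\rm orb}}{P_{\rm orb}} = \frac{3\dot{M}_1 (M_1 - M_2)}{M_1 M_2}, \tag{14.24}$$
$$\frac{\dot{a}}{a} = \frac{2\dot{M}_1 (M_1 - M_2)}{M_1 M_2}. \tag{14.25}$$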
Conservative mass transfer causes the orbital period to change at a rate that is determined by the donor's mass
loss. The orbital period as well as the orbital separation decrease because of the mass loss of star 1, given
that $M_1 > M_2$ and $\dot{M}_1 < 0$. The ratio $R_L/a$ decreases because the importance of the decreasing mass ratio
$M_1/M_2$ is higher than that of the decreasing separation. This means that the donor's Roche lobe shrinks,
accelerating even more the mass loss and causing it to increase rapidly.
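For conservative transfer, Eq. (14.19) combined with Kepler's third law gives $J_{\rm orb} \propto M_1 M_2 \sqrt{a/(M_1+M_2)}$, so constant $J_{\rm orb}$ and constant total mass imply $a \propto (M_1 M_2)^{-2}$ and $P_{\rm orb} \propto (M_1 M_2)^{-3}$. The sketch below tabulates these integrated relations for a hypothetical system with initial masses 2 and 1 M⊙. The function names are illustrative.

```python
import math

def separation_ratio(m1, m2, m1_0, m2_0):
    """a / a_0 for conservative transfer (J_orb and M1 + M2 constant)."""
    return (m1_0 * m2_0 / (m1 * m2)) ** 2

def period_ratio(m1, m2, m1_0, m2_0):
    """P_orb / P_orb,0 for conservative transfer."""
    return (m1_0 * m2_0 / (m1 * m2)) ** 3

m1_0, m2_0 = 2.0, 1.0            # hypothetical initial masses in Msun
total = m1_0 + m2_0
for m1 in (2.0, 1.75, 1.5, 1.25, 1.0):
    m2 = total - m1
    print(m1, m2, separation_ratio(m1, m2, m1_0, m2_0),
          period_ratio(m1, m2, m1_0, m2_0))
```

The separation and period shrink while the donor is the more massive star, reach a minimum at equal masses, and grow again once the mass ratio has reversed, consistent with the signs of Eqs. (14.24) and (14.25).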
The gas stream thus initiated transports angular momentum from the donor to the gainer. The mass loss,
which is constantly accelerated, keeps on going until the mass ratio of the stars reverses, causing the orbital
period and separation to increase again. This tends to increase the donor's Roche lobe as well. On the other hand, the
size of the Roche lobe relative to the separation keeps on decreasing because of the mass loss of the donor. The net result from both
effects is a minimal increase of the Roche lobe radius. This puts an end to the process of accelerated mass
loss, and the loss of mass again occurs at a slower rate (on a thermal or even nuclear time scale rather than a
dynamical time scale). Most close binaries transferring mass are discovered in this phase of their evolution.
At a certain point in the evolution, the radius of the Roche lobe again becomes larger than the stellar
radius. At that moment, the separation between the components is much larger than the initial separation
before the start of the mass transfer, because the mass difference between the components is now higher.
Therefore, the thermodynamical equilibrium gets restored in the donor. The stellar surface of the donor
disconnects with the Roche lobe of star 2 and the system gets detached. The donor now consists of a helium
core surrounded by a very thin envelope that dissipates after a short while. Depending on the duration of
the mass transfer and the parameters of the close binary, the donor is transformed into a helium star, a white
dwarf or a neutron star.
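As a rough numerical illustration of the conservative case described above, the following sketch assumes a
circular orbit in which both the total mass M = M1 + M2 and the orbital angular momentum
Jorb = M1 M2 (G a / M)^(1/2) stay constant. Under these assumptions a ∝ (M1 M2)^-2 and, by Kepler's third law,
P ∝ (M1 M2)^-3. The function name and the example masses are illustrative choices and do not come from the text.

```python
def conservative_orbit(m1, m2, m1_0, m2_0):
    """Return (a/a0, P/P0) after conservative mass transfer on a circular orbit.

    Assumption: M1 + M2 and J_orb are both conserved, so that
    a scales as (M1*M2)**-2 and P scales as (M1*M2)**-3.
    """
    total_0 = m1_0 + m2_0
    if abs((m1 + m2) - total_0) > 1e-9 * total_0:
        raise ValueError("total mass must be conserved")
    ratio = (m1_0 * m2_0) / (m1 * m2)
    return ratio**2, ratio**3


# Illustrative system (hypothetical values): 5 Msun donor, 3 Msun gainer.
m1_0, m2_0 = 5.0, 3.0
for transferred in (0.0, 0.5, 1.0, 2.0, 3.0):
    m1, m2 = m1_0 - transferred, m2_0 + transferred
    a_rel, p_rel = conservative_orbit(m1, m2, m1_0, m2_0)
    print(f"M1={m1:.1f}  M2={m2:.1f}  a/a0={a_rel:.3f}  P/P0={p_rel:.3f}")
```

In this idealised picture the separation passes through its minimum when M1 = M2 and widens once the mass
ratio has reversed, in line with the qualitative description above.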
Non-conservative mass transfer
In the case of non-conservative mass transfer, mass is lost from the system via L2. In that case, the evolution
of the orbit is more complex. Assume that a fraction β of the transferred mass leaves the system. In that
case, we have
\[ \dot{M} = \beta\,\dot{M}_1 \quad\text{and}\quad \dot{M}_2 = -(1-\beta)\,\dot{M}_1. \tag{14.26} \]
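From
\[ \frac{\dot{J}_{\rm orb}}{J_{\rm orb}} = \frac{1}{2}\frac{\dot{a}}{a} + \frac{\dot{M}_1}{M_1} + \frac{\dot{M}_2}{M_2} - \frac{1}{2}\frac{\dot{M}}{M}, \tag{14.27} \]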
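we get
\[ \frac{\dot{J}_{\rm orb}}{J_{\rm orb}} = \frac{1}{2}\frac{\dot{a}}{a} + \frac{\dot{M}_1}{M_1 M_2 M}\left[ M_2 M - (1-\beta) M_1 M - \frac{1}{2}\beta M_2 M_1 \right] \tag{14.28} \]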
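and
\[ \frac{3}{2}\frac{\dot{a}}{a} = \frac{\dot{P}}{P} = 3\,\frac{\dot{J}_{\rm orb}}{J_{\rm orb}} + 3\,\frac{\dot{M}_1}{M_1 M_2}\left[ (M_1 - M_2) - \beta M_1 \left( 1 - \frac{M_2}{2M} \right) \right]. \tag{14.29} \]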
The second term on the right-hand side is less negative than in the case of conservative mass transfer and
turns positive somewhat before the mass ratio inverts. The first term, however, is always negative and implies
an increased reduction of the orbital separation compared to the conservative case.
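The statement that the second term changes sign somewhat before the mass ratio inverts can be checked
numerically. The sketch below evaluates only the square bracket of Eq. (14.29), divided by M2 and written in
terms of q = M1/M2, and locates its zero by bisection. Since Ṁ1 < 0, the term is negative where the bracket is
positive and vice versa. The function names, the search interval and the sample values of β are illustrative
assumptions.

```python
def bracket_14_29(q, beta):
    """Square bracket of Eq. (14.29) divided by M2, with q = M1/M2.

    (M1 - M2) - beta*M1*(1 - M2/(2M))  ->  (q - 1) - beta*q*(1 - 1/(2(q + 1)))
    """
    return (q - 1.0) - beta * q * (1.0 - 1.0 / (2.0 * (q + 1.0)))


def sign_change_mass_ratio(beta, q_lo=0.5, q_hi=1000.0, tol=1e-12):
    """Bisect for the mass ratio q at which the bracket changes sign."""
    f_lo = bracket_14_29(q_lo, beta)
    f_hi = bracket_14_29(q_hi, beta)
    if f_lo * f_hi > 0.0:
        raise ValueError("no sign change in the search interval")
    while q_hi - q_lo > tol:
        q_mid = 0.5 * (q_lo + q_hi)
        if bracket_14_29(q_mid, beta) * f_lo > 0.0:
            q_lo = q_mid   # same sign as the lower end: root lies above
        else:
            q_hi = q_mid   # sign differs (or zero): root lies at or below
    return 0.5 * (q_lo + q_hi)


for beta in (0.0, 0.25, 0.5, 0.75):
    print(f"beta = {beta:.2f}: bracket vanishes at M1/M2 = {sign_change_mass_ratio(beta):.4f}")
```

For β = 0 the zero lies at M1 = M2, as in the conservative case. For β > 0 it lies at M1/M2 > 1, that is,
while the donor is still the more massive star.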
Effect of mass loss due to a stellar wind
For binary systems involving massive components, one also has to take into account the mass loss by a
radiation-driven wind. The same is true for lower mass binaries which undergo a dust-driven wind. Such
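winds also affect the orbital elements of the binary. Assume that one of the components undergoes mass
loss at a rate Ṁ1. In that case, there is a loss of orbital angular momentum:
\[ \dot{J}_{\rm orb} = \dot{M}_1\, a_1^2\, \Omega_{\rm orb}, \tag{14.30} \]
where a1 = a M2/M is the distance of star 1 from the centre of mass.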
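We have
\[ \frac{\dot{J}_{\rm orb}}{J_{\rm orb}} = \frac{\dot{M}_1\, a_1^2\, \Omega_{\rm orb}}{(a^2\, \Omega_{\rm orb}\, M_1 M_2)/M} = \frac{\dot{M}_1 M_2}{M_1 M} \tag{14.31} \]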
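and also
\[ \frac{\dot{J}_{\rm orb}}{J_{\rm orb}} = \frac{2}{3}\left( \frac{\dot{M}_1}{M} - 2\,\frac{\dot{\Omega}_{\rm orb}}{\Omega_{\rm orb}} \right) + \frac{\dot{\Omega}_{\rm orb}}{\Omega_{\rm orb}} + \frac{\dot{M}_1}{M_1} - \frac{\dot{M}_1}{M} = -\frac{1}{3}\frac{\dot{M}_1}{M} + \frac{1}{3}\frac{\dot{P}}{P} + \frac{\dot{M}_1}{M_1} \tag{14.32} \]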
such that
\[ \frac{3}{2}\frac{\dot{a}}{a} = \frac{\dot{P}}{P} = -2\,\frac{\dot{M}_1}{M}. \tag{14.33} \]
In this case, both the orbital period and the separation increase such that the mass transfer is slowed down.
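To make the period change concrete, the relation Ṗ/P = −2 Ṁ1/M of Eq. (14.33) can be integrated for a
companion of constant mass M2, which gives P M² = constant. The sketch below compares this closed form with a
direct numerical integration. The function names and the example masses and period are hypothetical, and only
the period relation is used.

```python
import math


def period_after_wind(p0, m1_0, m2, m1):
    """Orbital period after star 1 has gone from m1_0 to m1 through a wind.

    Closed-form integral of dP/P = -2 dM1/M with M = M1 + M2 and M2 fixed,
    i.e. P * M**2 = constant.
    """
    return p0 * ((m1_0 + m2) / (m1 + m2)) ** 2


def integrate_period(p0, m1_0, m2, m1, steps=100_000):
    """Midpoint-rule integration of d(ln P) = -2 dM1 / (M1 + M2) as a cross-check."""
    dm = (m1 - m1_0) / steps
    ln_p = math.log(p0)
    for i in range(steps):
        m1_mid = m1_0 + (i + 0.5) * dm
        ln_p += -2.0 * dm / (m1_mid + m2)
    return math.exp(ln_p)


# Hypothetical massive binary: 40 + 20 Msun, P = 10 d, wind removes 10 Msun from star 1.
p_closed = period_after_wind(10.0, 40.0, 20.0, 30.0)
p_numeric = integrate_period(10.0, 40.0, 20.0, 30.0)
print(f"closed form: {p_closed:.4f} d, numerical: {p_numeric:.4f} d")
```

Because Ṁ1 < 0, the period grows as the total mass decreases, consistent with the widening of the orbit
described in the text.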
14.4.4 The common envelope phase
The common envelope (CE) formalism in a binary was introduced to explain the existence of short-period
binaries with white dwarf components. The situation of a CE binary, compared with a detached and a
semi-detached system, is sketched in the bottom right panel of Figure 14.5. During the CE phase, a spiral-in of
the secondary towards the primary occurs, because the companion experiences drag forces which make it
move into the envelope of the giant primary. This sets in soon after the CE is formed, i.e., when the mass
ratio is high. In that stage, the CE is not in co-rotation and the envelope is heated.
The outcome of the CE phase is determined by the energy balance within the system, assuming angular
momentum conservation. In this model, the orbital energy of the binary is used to expel the envelope of the
giant with some unknown efficiency. The orbital energy Eorb released in the spiral-in process may be used
to eject the envelope with an efficiency α,
\[ \alpha\,(E_{\rm orb,f} - E_{\rm orb,i}) = E_{\rm env}, \tag{14.34} \]
where Eenv is the binding energy of the ejected envelope, and the subscripts i and f denote initial and final
values before and after the CE phase. In principle, one expects 0 < α ≤ 1. However, in order to explain
observed binaries one often finds that α exceeds unity. This indicates that other energy sources contribute to
the ejection of the envelope, e.g. the luminosity of the giant. Still, a very high value for α is not anticipated,
since it would be physically difficult to explain where such a large amount of energy would come from. The
poorly understood physics of the CE phase does not allow us to set a hard limit on α.
It is reasonable to assume that the secondary does not accrete matter since the mass transfer time-scale
is short. Expression (14.34) can then be written as
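\[ \alpha \left( -\frac{G M_{\rm remnant} M_2}{2 a_f} + \frac{G M_{\rm giant} M_2}{2 a_i} \right) = -\frac{G M_{\rm giant} M_{\rm env}}{\lambda R_{\rm giant}}, \tag{14.35} \]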
where we have expressed Eenv in terms of the structural parameter λ. The combined parameter αλ can be
calculated from Eq. (14.35). To isolate α, one usually takes λ = 0.5, but an appropriate calculation should
take into account that λ depends on the stellar structure.
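Equation (14.35) can be rearranged to give the post-CE separation af for a chosen value of αλ. The following
is a minimal sketch under the assumptions that the envelope mass is Menv = Mgiant − Mremnant and that the
secondary accretes nothing. The function name and the example system are hypothetical. G cancels, so masses
and lengths may be given in any consistent units (here solar units).

```python
def ce_final_separation(m_giant, m_remnant, m2, a_i, r_giant, alpha=1.0, lam=0.5):
    """Solve Eq. (14.35) for the final separation a_f (same unit as a_i).

    Assumptions: M_env = M_giant - M_remnant, no accretion onto the secondary.
    """
    m_env = m_giant - m_remnant
    # Rearranged (14.35), after dividing by -alpha*G:
    #   M_rem*M2/(2 a_f) = M_giant*M2/(2 a_i) + M_giant*M_env/(alpha*lam*R_giant)
    rhs = m_giant * m2 / (2.0 * a_i) + m_giant * m_env / (alpha * lam * r_giant)
    return m_remnant * m2 / (2.0 * rhs)


# Hypothetical system: 1.5 Msun giant with a 0.5 Msun core, 0.8 Msun companion,
# a_i = 200 Rsun, R_giant = 100 Rsun.
for alpha in (0.25, 0.5, 1.0):
    a_f = ce_final_separation(1.5, 0.5, 0.8, 200.0, 100.0, alpha=alpha, lam=0.5)
    print(f"alpha = {alpha:.2f}: a_f = {a_f:.2f} Rsun")
```

A lower efficiency α requires a deeper spiral-in to release the same envelope binding energy, so af shrinks
as α decreases.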
The total binding energy of the envelope consists of the gravitational binding energy and the internal
energy U,
\[ E_{\rm bind} = \int_{M_{\rm remnant}}^{M_{\rm giant}} \left( -\frac{GM}{r} + U \right) dm. \tag{14.36} \]
It is uncertain how much of the internal energy could contribute to the ejection of (part of) the envelope.
This uncertainty is expressed in a parameter αth:
\[ E_{\rm env} = \int_{M_{\rm remnant}}^{M_{\rm giant}} \left( -\frac{GM}{r} \right) dm + \alpha_{\rm th} \int_{M_{\rm remnant}}^{M_{\rm giant}} U\, dm. \tag{14.37} \]
Figure 14.9: Two possible outcomes of a common envelope phase of binary evolution. (Figure courtesy of
Prof. Philipp Podsiadlowski, Oxford University, UK)
Expression (14.37) can be regarded as the effective binding energy of the envelope and is used to derive λ.
The first phase of mass transfer of observed double white dwarfs cannot be described by the standard
α formalism, nor by stable RLOF (which is graphically depicted in Figure 14.6). For binaries with mass
ratio close to unity, the common envelope is formed by a runaway mass transfer rather than a decay of the
orbit. In this case, the angular momentum of the orbit is so large that the common envelope is brought into
co-rotation. Consequently, there are no drag forces that can convert orbital energy into heat and kinetic
energy. This scenario is described in terms of the angular momentum balance, the so-called γ-formalism.
Figure 14.10: Scenario of binary evolution leading to a double white-dwarf binary. (Figure courtesy of Prof.
Alain Jorissen, Université Libre de Bruxelles, Belgium)
Figure 14.11: Scenario of binary evolution leading to the formation of a compact binary consisting of a
pulsar and a white dwarf. (Figure courtesy of Prof. Ed van den Heuvel, University of Amsterdam, NL)
The assumption is that the orbital angular momentum carried away by the envelope is γ times the initial
orbital angular momentum,
\[ \frac{\gamma J_i}{M_{\rm giant} + M_2} = \frac{J_i - J_f}{M_{\rm env}}. \tag{14.38} \]
Figure 14.12: Scenario of a massive binary evolution. (From https://hfstevance.com/ccsne)
Although the γ-formalism was originally developed for double white dwarfs, it has also been put forward to
explain systems in which a main-sequence star transfers mass to a neutron star or black hole, and the same
physical picture applies to systems where a red giant overflows onto a main-sequence star. The treatment gives
an upper limit for γ, since the angular momentum carried away by the envelope cannot be higher than the
angular momentum of the secondary,
\[ \gamma_{\rm max} = \frac{M_{\rm giant} + M_2}{M_{\rm env}} - \frac{(M_{\rm giant} + M_2)^2}{M_{\rm env}\,(M_{\rm core} + M_2)} \exp\left( -\frac{M_{\rm env}}{M_2} \right). \tag{14.39} \]
Figure 14.13: Formation scenario of a single black hole or a black hole–black hole binary giving rise to
gravitational wave emission. (From Mapelli, M., 2020, Frontiers in Astronomy and Space Sciences, Volume 7, id.38)
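The γ-prescription of Eq. (14.38) can be turned into a final separation in the same way as the α-prescription.
The sketch below is one possible implementation, assuming circular orbits with
Jorb = M1 M2 (G a / (M1 + M2))^(1/2), a remnant of mass Mcore = Mgiant − Menv and no accretion onto the
secondary. The function name and the example values are hypothetical.

```python
def gamma_final_separation(m_giant, m_core, m2, a_i, gamma):
    """Post-CE separation from Eq. (14.38), in the same unit as a_i.

    Assumptions: circular orbits, remnant mass m_core, no accretion onto M2.
    """
    m_env = m_giant - m_core
    # Eq. (14.38) rewritten as J_f/J_i = 1 - gamma * M_env / (M_giant + M2)
    j_ratio = 1.0 - gamma * m_env / (m_giant + m2)
    if j_ratio <= 0.0:
        raise ValueError("gamma too large: no orbital angular momentum left")
    # For a circular orbit a scales as J**2 * (M1 + M2) / (M1 * M2)**2
    mass_factor = (m_giant / m_core) ** 2 * (m_core + m2) / (m_giant + m2)
    return a_i * j_ratio**2 * mass_factor


# Hypothetical system: 2 Msun giant with a 0.5 Msun core and a 1 Msun companion.
for gamma in (1.0, 1.5, 1.75):
    a_f = gamma_final_separation(2.0, 0.5, 1.0, 100.0, gamma)
    print(f"gamma = {gamma:.2f}: a_f/a_i = {a_f / 100.0:.3f}")
```

The strong dependence of af on γ in this toy example illustrates why the parameter has to be calibrated
against observed systems.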
In conclusion, the orbital evolution during a common envelope phase is uncertain. In computations of
binary evolution, this phase is described by the free parameters α or γ (or both), for lack of a theory based
on first principles. There are basically two major outcomes of this phase, as illustrated in Figure 14.9 for a
low-mass binary: ejection of the envelope leaving a compact binary, or a merger product.
14.5 Some binary scenarios
Given the above theoretical considerations, simplifications, and uncertainties in the various theories, numerous
binary scenarios across stellar evolution have been constructed and studied over the years. We show four
of those in cartoon-like figures, ordered from low-mass to high-mass components as end products, in
Figures 14.10 to 14.13. We stress that these are just some of the numerous binary evolution channels published in
the literature. It is obvious that major uncertainties come into play in the various channels and in the stellar
structure models accompanying those channels. The scenarios mainly aim to understand the intermediate
and end products of close binary evolution. The theories can deliver only very rough models of stellar
interiors in a qualitative sense. Their level of quantitative detail comes nowhere near that resulting from
the quite simple and elegant theory of single-star evolution described in the other chapters of this course.
Many more details of binary stars and their evolution are covered in the biennial twin MSc courses
Binary Stars and High Energy Astrophysics.
Appendix A
Planck’s radiation laws
Most of the electromagnetic radiation that exists in the Universe has a thermal origin. Each source
with a temperature T displays a characteristic spectrum in which the intensity Bν(T ) and the energy density
uν(T ) can be described, to a good approximation, by Planck’s radiation law:
\[ u_\nu(T)\,d\nu = \frac{4\pi}{c}\,B_\nu(T)\,d\nu = \frac{8\pi h}{c^3}\,\frac{\nu^3}{\exp(h\nu/kT) - 1}\,d\nu. \tag{A.1} \]
Here, k is Boltzmann’s constant (see Appendix B). The derivation of this law is discussed in the course
Natuurkunde III and will not be treated here. An object that radiates following the law (A.1) is called a
black body.
Planck’s law can also be written as a function of the wavelength, which is more practical to compare
with observations of stars:
\[ B_\lambda(T)\,d\lambda = \frac{2hc^2}{\lambda^5}\,\frac{1}{\exp(hc/\lambda kT) - 1}\,d\lambda. \tag{A.2} \]
In Figure A.1 we show Bλ(T ) for different temperatures. The intensity clearly tends to zero
in the limit of very small and very large wavelengths. The effective temperatures of stars lie roughly
between 3 000 K and 30 000 K. Consequently, they radiate in the so-called optical window of the
electromagnetic spectrum. Notice how strongly the intensity changes at blue wavelengths for temperatures relevant
for stars.
The curves in Figure A.1 reach a maximum whose position depends on the temperature:
\[ \lambda_{\rm max}\,T = 2898\ \mu{\rm m\,K}. \tag{A.3} \]
This is called Wien’s displacement law. The surface temperature of the Sun is almost 5 800 K.
Consequently, the intensity of the solar radiation has a maximum around 500 nm. Planets and warm dust have
temperatures of about 290 K, and therefore radiate with a maximum in the infra-red, at about 10 µm. Cool
molecular clouds with a temperature of 10 K radiate in the far infra-red up to the mm range.
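The link between Planck's law (A.2) and Wien's displacement law (A.3) can be verified numerically. The sketch
below evaluates Bλ(T ) in SI units with the constants of Appendix B and searches a wavelength grid for the
maximum. The function names, the grid resolution and the search window (centred with the Wien constant itself,
purely for convenience) are illustrative choices.

```python
import math

H = 6.6260755e-34      # Planck constant [J s]
C = 2.99792458e8       # speed of light [m s^-1]
K = 1.380658e-23       # Boltzmann constant [J K^-1]
WIEN_GUESS = 2.898e-3  # [m K]; used only to centre the search window


def planck_lambda(wavelength, temperature):
    """B_lambda(T) of Eq. (A.2) in SI units (W m^-2 sr^-1 m^-1)."""
    x = H * C / (wavelength * K * temperature)
    return 2.0 * H * C**2 / wavelength**5 / math.expm1(x)


def peak_wavelength(temperature, n=100_000):
    """Wavelength [m] of maximum B_lambda, found by a plain grid search."""
    lo = 0.2 * WIEN_GUESS / temperature
    hi = 5.0 * WIEN_GUESS / temperature
    best_lam, best_b = lo, planck_lambda(lo, temperature)
    for i in range(1, n + 1):
        lam = lo + (hi - lo) * i / n
        b = planck_lambda(lam, temperature)
        if b > best_b:
            best_lam, best_b = lam, b
    return best_lam


for t in (5800.0, 290.0, 10.0):
    lam = peak_wavelength(t)
    print(f"T = {t:7.1f} K   lambda_max = {lam * 1e6:10.4f} micron   "
          f"lambda_max*T = {lam * t * 1e6:8.1f} micron K")
```

The three temperatures are the ones quoted in the text for the Sun, for planets and warm dust, and for cool
molecular clouds.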
Figure A.1: Black body radiation for different temperatures. top panel: T varies from 5000 K (lowest curve)
to 9000 K (upper curve) with steps of 1000 K; bottom panel: T varies from 9000 K (lowest curve) to 25000 K
(upper curve) with steps of 4000 K
The total energy density, integrated over all frequencies, is given by
\[ u(T) = \int_0^\infty u_\nu(T)\,d\nu = a T^4, \tag{A.4} \]
with a the radiation constant (see Appendix B). From this we conclude that the energy of thermal radiation
depends strongly on the temperature of the object that emits the radiation. The energy flux per unit of surface
of a black body is given by Stefan-Boltzmann’s law:
\[ B(T) = \sigma T^4, \tag{A.5} \]
where σ is the constant of Stefan-Boltzmann (see Appendix B).
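Two quick consistency checks on (A.4) and (A.5) are sketched below, using the constants of Appendix B. They
rely on two standard relations that are not stated in this appendix and are therefore assumptions of the
example: a = 4σ/c, and L = 4πR²σT⁴ for a spherical black body of radius R. The function names are
illustrative.

```python
import math

SIGMA = 5.67051e-8   # Stefan-Boltzmann constant [J m^-2 s^-1 K^-4]
C = 2.99792458e8     # speed of light [m s^-1]
R_SUN = 6.9598e8     # solar radius [m]
L_SUN = 3.8515e26    # solar luminosity [J s^-1]


def radiation_constant():
    """Radiation constant from the standard relation a = 4*sigma/c [J m^-3 K^-4]."""
    return 4.0 * SIGMA / C


def blackbody_luminosity(radius, temperature):
    """Luminosity of a spherical black body, L = 4*pi*R^2*sigma*T^4 [J s^-1]."""
    return 4.0 * math.pi * radius**2 * SIGMA * temperature**4


print(f"a = {radiation_constant():.4e} J m^-3 K^-4")
ratio = blackbody_luminosity(R_SUN, 5800.0) / L_SUN
print(f"L(R_sun, 5800 K) / L_sun = {ratio:.3f}")
```

The ratio comes out close to unity, which shows that a black body of solar radius at 5 800 K is a reasonable
first approximation of the Sun's total output.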
Planck’s law is usually a good first approximation for the description of the radiation intensity of stars.
However, the radiation we receive from stars does not come from one single layer in the stellar atmosphere,
and therefore cannot be characterised by one unique temperature. Moreover, absorption and sometimes
emission lines occur in the intensity spectra of stars. Thanks to quantum physics, they can be interpreted in
terms of transitions in atomic nuclei (for the γ- and X-rays), in electron shells (for UV, visual and infra-red
wavelengths) and in molecules (infra-red and mm wavelengths). A detailed analysis of the stellar spectrum
allows us to interpret the physical condition and the chemical composition of the outer layers of the star.
Appendix B
Values of some physical and astronomical
constants
In astronomy, quantities are usually still expressed in cgs units. However, students are (as they should
be!) more familiar with the SI system. I leave it to you which units you use in this course; I have
mixed them myself. Below we list a few physical and astronomical constants and other commonly
used quantities in the cgs as well as the SI system. We also give a few conversion formulas to other units.
Physical constants:
Constant              Symbol      cgs units                                   SI units
Speed of light        c         = 2.99792458 × 10^10 cm s^-1                  2.99792458 × 10^8 m s^-1
Gravitation           G         = 6.67259 × 10^-8 cm^3 g^-1 s^-2              6.67259 × 10^-11 m^3 kg^-1 s^-2
Atomic mass unit      mu        = 1.6605390 × 10^-24 g                        1.6605390 × 10^-27 kg
Electron mass         me        = 9.1093836 × 10^-28 g                        9.1093836 × 10^-31 kg
Proton mass           mp        = 1.6726219 × 10^-24 g                        1.6726219 × 10^-27 kg
Neutron mass          mn        = 1.6749275 × 10^-24 g                        1.6749275 × 10^-27 kg
Mass helium nucleus   mHe       = 6.6446572 × 10^-24 g                        6.6446572 × 10^-27 kg
Electron charge       e         = 1.60217733 × 10^-20 c esu                   1.60217733 × 10^-19 Coulomb
Planck                h = 2πħ   = 6.6260755 × 10^-27 erg s                    6.6260755 × 10^-34 J s
Boltzmann             k         = 1.380658 × 10^-16 erg K^-1                  1.380658 × 10^-23 J K^-1
Gas                   R         = 8.314510 × 10^7 erg K^-1 mol^-1             8.314510 J K^-1 mol^-1
Radiation             a         = 7.5646 × 10^-15 erg cm^-3 K^-4              7.5646 × 10^-16 J m^-3 K^-4
Stefan-Boltzmann      σ         = 5.67051 × 10^-5 erg cm^-2 s^-1 K^-4         5.67051 × 10^-8 J m^-2 s^-1 K^-4
Astronomical constants:
Constant              Symbol      cgs units                                   SI units
Radius Sun            R⊙        = 6.9598 × 10^10 cm                           6.9598 × 10^8 m
Mass Sun              M⊙        = 1.9891 × 10^33 g                            1.9891 × 10^30 kg
Luminosity Sun        L⊙        = 3.8515 × 10^33 erg s^-1                     3.8515 × 10^26 J s^-1
Astronomical unit     AU        = 1.49598 × 10^13 cm                          1.49598 × 10^11 m
Parsec                pc        = 3.08568 × 10^18 cm                          3.08568 × 10^16 m
Light year            ly        = 9.461 × 10^17 cm                            9.461 × 10^15 m
Conversion:
From Ångström to cm:             1 Å = 10^-8 cm
From Newton to dyne:             1 N = 10^5 dyne
From Joule to erg:               1 J = 10^7 erg
From electronvolt to erg:        1 eV = 1.60217733 × 10^-12 erg
From atmosphere to dyne cm^-2:   1 atm = 1.01325 × 10^6 dyne cm^-2
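Since both unit systems are used side by side, a small helper for moving between them may be convenient. The
sketch below uses only the conversion factors listed above. The helper names are hypothetical, and the last
function assumes that Avogadro's number equals (1 g/mol)/mu, which holds to the precision quoted in the table.

```python
M_TO_CM = 1.0e2    # 1 m  = 10^2 cm
KG_TO_G = 1.0e3    # 1 kg = 10^3 g
J_TO_ERG = 1.0e7   # 1 J  = 10^7 erg


def g_si_to_cgs(g_si):
    """Convert G from m^3 kg^-1 s^-2 to cm^3 g^-1 s^-2."""
    return g_si * M_TO_CM**3 / KG_TO_G


def power_si_to_cgs(watts):
    """Convert a luminosity from J s^-1 to erg s^-1."""
    return watts * J_TO_ERG


def gas_constant_cgs(k_cgs, m_u_cgs):
    """Gas constant R = k * N_A [erg K^-1 mol^-1], assuming N_A = (1 g/mol) / m_u."""
    return k_cgs / m_u_cgs


print(f"G     = {g_si_to_cgs(6.67259e-11):.5e} cm^3 g^-1 s^-2")
print(f"L_sun = {power_si_to_cgs(3.8515e26):.4e} erg s^-1")
print(f"R     = {gas_constant_cgs(1.380658e-16, 1.6605390e-24):.6e} erg K^-1 mol^-1")
```

The printed values can be compared directly with the cgs column of the tables above.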
Appendix C
Some key references for this discipline, used
in these lecture notes
The notes for this course are based on the following standard works, from which many of the illustrations
were taken, as indicated in the figure captions.
Aerts, C., Christensen-Dalsgaard, J., Kurtz, D. W., 2010, “Asteroseismology”, Springer-Verlag
Aerts, C., Mathis, S., Rogers, T.M., 2019, “Angular Momentum Transport in Stellar Interiors”, Annual Review
of Astronomy & Astrophysics, Volume 57, pp. 35–78
Cox, J.P., Giuli, R.T., 1968, “Principles of Stellar Structure”, Volume I & II, Gordon & Breach, New York
Hansen, C.J., Kawaler, S.D., Trimble, V., 2004, “Stellar Interiors: Physical Principles, Structure, and Evolution”,
Second Edition, Springer-Verlag
Hilditch R.W., 2001, “An Introduction to Close Binary Stars”, Cambridge University Press
Kippenhahn, R., Weigert, A., Weiss, A., 2012, “Stellar Structure and Evolution”, 2nd edition, Springer-
Verlag
Lamers, H.J.G.L.M., Cassinelli, J.P., 1999, “Introduction to Stellar Winds”, Cambridge University Press
Lamers, H.J.G.L.M., Levesque, E.M., 2017, “The Evolution of Massive Stars of 25–120 M⊙: Dominated by
Mass Loss”, IOP Publishing
Maeder, A., 2009, “Physics, Formation and Evolution of Rotating Stars”, Springer-Verlag
Pringle J.E., Wade R.A., 1985, “Interacting Binary Stars”, Cambridge University Press
Tassoul, J.-L., Tassoul, M., 2004, “A concise history of solar and stellar physics”, Princeton University Press
Tkachenko, A., Pavlovski, K., Johnston, C., et al., 2020, “The mass discrepancy in intermediate- and high-
mass eclipsing binaries: The need for higher convective core masses”, Astronomy & Astrophysics, Volume
637, id. A60, 20pp.
Torres, G., Andersen, J., Gimenez, A., 2010, “Accurate masses and radii of normal stars: modern results
and applications”, Astronomy & Astrophysics Review, Volume 18, pp. 67–126
Weiss, A., Hillebrandt, W., Thomas, H.-C., Ritter, H., 2004, “Cox & Giuli’s principles of stellar structure:
Extended Second Edition”, Cambridge Scientific Publishers