STELLAR STRUCTURE & EVOLUTION

STELLAR STRUCTURE & EVOLUTION Prof. Dr. Conny Aerts KU Leuven, Belgium Master Program in Astronomy & Astrophysics (NL: Master Sterrenkunde) 6 Study Points in the European Credit system (ECTS)


Contents

Preface xi

PART I: BASIC INTRODUCTION TO ASTRONOMY 1

1 Observational framework of astronomy 1

1.1 Magnitudes and colour indices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

1.2 Spectral types and luminosity classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

1.2.1 The formation of spectral lines in the stellar spectrum . . . . . . . . . . . . . . . . . 4

1.2.2 Spectral types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

1.2.3 Luminosity classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

1.2.4 Stellar atmosphere models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

1.3 The Hertzsprung-Russell diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

1.4 Stars in our Milky Way . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

1.5 Galaxies in the Universe . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

1.6 Starting point of this course . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20


PART II: STELLAR STRUCTURE 25

2 A simple equation of state: an ideal gas with radiation 25

2.1 Introduction to thermodynamics, applied to stars . . . . . . . . . . . . . . . . . . . . . . . . 25

2.1.1 Thermodynamic equilibrium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

2.1.2 The first law of thermodynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

2.1.3 The entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

2.1.4 The specific heats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

2.2 An ideal gas with radiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

2.2.1 The classical ideal gas law applied to stars . . . . . . . . . . . . . . . . . . . . . . . 30

2.2.2 The mean molecular weight . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

2.2.3 The internal energy of an ideal gas . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

2.2.4 The contribution of the photon gas . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

3 Classical mechanics applied to stellar structure 37

3.1 Some preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

3.2 Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

3.2.1 Eulerian description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

3.2.2 Lagrangian description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

3.3 Poisson’s equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

3.4 Conservation of momentum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

3.4.1 Hydrostatic equilibrium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

3.4.2 Simple solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

3.4.3 The equation of motion in case of spherical symmetry . . . . . . . . . . . . . . . . 44

3.5 Conservation of energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

3.5.1 The virial theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

3.5.2 Conservation of energy in stars . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

3.5.3 The different time scales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

4 Additional relevant equations of state 53

4.1 Polytropes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

4.2 The degenerate electron gas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

4.3 The Chandrasekhar limit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

4.4 Schematic representation of the relevant equations of state . . . . . . . . . . . . . . . . . . 63

5 Energy transport 65

5.1 Energy transport by radiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

5.1.1 Mean free path . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

5.1.2 The temperature gradient . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

5.1.3 The diffusion approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

5.1.4 The Rosseland mean opacity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

5.2 Energy transport by conduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

5.3 Stability analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

5.3.1 Dynamical instability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

5.3.2 Buoyancy frequency and semiconvection . . . . . . . . . . . . . . . . . . . . . . . 78

5.4 Convective energy transport . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81

5.4.1 Mixing length theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81

5.4.2 A computational scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82

5.4.3 The parametric implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

6 The chemical composition of stellar matter 87

6.1 The relative mass fractions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

6.2 Variations of chemical composition due to nuclear reactions . . . . . . . . . . . . . . . . . 88

6.3 Effective cross sections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89

6.4 Nuclear burning cycles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

6.4.1 Basic concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

6.4.2 Big Bang nucleosynthesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95

6.4.3 Hydrogen burning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96

6.4.4 Helium burning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99

6.4.5 Fusion of heavier elements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100

6.5 Summary: the periodic table filled via the nuclear reactions in stars . . . . . . . . . . . . . . 102

7 Complications: mixing of chemical elements due to transport processes 103

7.1 Convective mixing and nuclear burning . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103

7.2 Convective boundary mixing aka CBM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104

7.3 Mixing due to rotational instabilities and waves . . . . . . . . . . . . . . . . . . . . . . . . 105

7.3.1 Models with rotation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

7.3.2 Rotational and pulsational mixing . . . . . . . . . . . . . . . . . . . . . . . . . . . 107

7.4 Microscopic atomic diffusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110

8 Numerical computation of stellar structure and evolution models 113

8.1 The full system of basic equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114

8.2 Time scales and simplifications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115

8.3 Boundary conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116

8.3.1 Central boundary conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116

8.3.2 Boundary conditions for the surface . . . . . . . . . . . . . . . . . . . . . . . . . . 117

8.4 A simple numerical scheme: the Henyey method . . . . . . . . . . . . . . . . . . . . . . . 118

8.5 The MESA stellar structure and evolution code . . . . . . . . . . . . . . . . . . . . . . . . 124

PART III: STELLAR EVOLUTION 129

9 Star formation 129

9.1 The interstellar medium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129

9.2 The Jeans criterion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130

9.3 Fragmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133

9.4 The formation of a protostar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134

9.5 Hayashi tracks in the HR diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137

9.6 Evolution of the protostar towards the zero-age main sequence . . . . . . . . . . . . . . . . 139

10 The main sequence or core-hydrogen burning phase 145

10.1 Zero-age main sequence models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145

10.2 The mass-luminosity and mass-radius relations . . . . . . . . . . . . . . . . . . . . . . . . 147

10.3 Chemical evolution on the main sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . 148

10.4 The end of core-hydrogen burning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156

10.5 Later stages of evolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160

11 Evolution of a star with 8 M⊙ ≲ M ≲ 15 M⊙ 163

11.1 The Hertzsprung gap for stars with M ≳ 2.3 M⊙ . . . . . . . . . . . . . . . . . . . . . . . . 163

11.2 Helium burning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166

11.3 Later evolution stages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166

11.4 Burning cycles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167

11.5 Explosive versus non-explosive evolution . . . . . . . . . . . . . . . . . . . . . . . . . . . 169

11.6 Neutron stars . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169

11.6.1 Supernova explosion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169

11.6.2 The neutrino flux and the r-process . . . . . . . . . . . . . . . . . . . . . . . . . . 171

11.6.3 Pulsars . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174

12 Evolution of a star with M ≲ 8 M⊙ 179

12.1 Post-main-sequence evolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179

12.2 The helium flash . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182

12.3 Evolution after the helium flash . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184

12.4 AGB stars . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186

12.4.1 The circumstellar envelope and mass loss during the AGB . . . . . . . . . . . . . . 188

12.4.2 Thermal pulses, Hot Bottom Burning and the 3rd dredge-up . . . . . . . . . . . . . 191

12.4.3 The s-process in AGB stars . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193

12.5 Post-AGB stars . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195

12.6 White dwarfs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196

13 Evolution of a star with M ≳ 15 M⊙ 199

13.1 The spectra of hot massive stars with mass loss . . . . . . . . . . . . . . . . . . . . . . . . 199

13.2 Basic characteristics of radiation-driven stellar winds . . . . . . . . . . . . . . . . . . . . . 203

13.3 Mass loss and terminal wind speed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204

13.3.1 Thomson scattering in the stellar wind . . . . . . . . . . . . . . . . . . . . . . . . . 206

13.3.2 LBVs, WR stars and the Eddington limit . . . . . . . . . . . . . . . . . . . . . . . 207

13.3.3 A realistic description of a line-driven stellar wind: the CAK-model . . . . . . . . . 208

13.4 Consequences of mass loss for stellar evolution . . . . . . . . . . . . . . . . . . . . . . . . 211

13.5 Example: the evolution of a star with an initial mass of 60M⊙ . . . . . . . . . . . . . . . . 215

13.6 Black holes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217

13.7 Chemical evolution of galaxies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219

13.7.1 Chemical enrichment by stellar evolution . . . . . . . . . . . . . . . . . . . . . . . 219

13.7.2 Initial mass function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219

13.7.3 Global enrichment of the Universe . . . . . . . . . . . . . . . . . . . . . . . . . . . 221

PART IV: BINARY EVOLUTION 227

14 Binary stars and their evolution 227

14.1 Observational classification of close binary stars . . . . . . . . . . . . . . . . . . . . . . . . 228

14.2 The Roche model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231

14.3 Determination of the orbital and fundamental parameters of binaries . . . . . . . . . . . . . 236

14.3.1 Orbital elements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236

14.3.2 Masses and radii of main-sequence stars . . . . . . . . . . . . . . . . . . . . . . . . 237

14.3.3 Masses of white dwarfs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238

14.3.4 Type Ia supernovae . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239

14.3.5 Masses of neutron stars . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240

14.4 Mass transfer and evolution of close binaries . . . . . . . . . . . . . . . . . . . . . . . . . . 240

14.4.1 Tidal effects: circularisation and synchronisation . . . . . . . . . . . . . . . . . . . 241

14.4.2 Mass transfer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243

14.4.3 Effect of mass transfer on the orbital parameters . . . . . . . . . . . . . . . . . . . 244

14.4.4 The common envelope phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247

14.5 Some binary scenarios . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253

THE END 257

A Planck’s radiation laws 257

B Values of some physical and astronomical constants 261

C Some key references for this discipline, used in these lecture notes 263


Preface

Stars are the building blocks of the Universe as a whole, and of galaxies and exoplanetary systems in

particular. Stars determine the dynamical and chemical evolution of galaxies and of their multiple systems

such as star clusters, binaries, and planetary systems. Stars are responsible for the production of nearly all

chemical elements and as such they determine the chemical enrichment of their host galaxy and any forms

of biological life occurring on planets (if any). During their lives, the products of nucleosynthesis are fed

back into the interstellar medium through stellar winds. Massive stars also contribute to the chemical and

dynamical evolution of their host galaxy at the end of their life when they explode as a supernova. It is

thanks to this chemical enrichment of galaxies, and of our Milky Way in particular, that life on Earth has

been able to emerge and evolve. The derivation of distances in the Universe is based on our knowledge of

stars. This is also the case for the determination of the age of the Universe. We cannot but conclude that

models of stellar structure and evolution play a pivotal role in most subjects of modern astrophysics.

Keeping this in mind, it is not only natural but also advisable to have a mandatory 6 ECTS

master course describing these essential constituents of the Universe in quite some detail. At Bachelor

level, most students already had descriptive courses covering various topics in astronomy, e.g., (exo)planets,

stars, interstellar medium, galaxies, cosmology, etc. These courses might only have touched upon each

of these many topics, depending on the background of the student. Some students may enter the MSc in

A&A at KU Leuven without any prior knowledge of astronomy whatsoever. Our aim with this course is to

get everyone comfortably aboard, despite the large diversity in prior knowledge and scientific or cultural

background.

⋆ ⋆

Computations of stellar structure and evolution models are based on our current knowledge of the physical

properties of matter and radiation occurring in stars. By comparing the computed models with ever more precise

observations, we are able to test the assumptions on the input physics used in the model computations.

This is the only practical method to evaluate stellar evolution models, because stellar interior conditions are

so extreme in terms of temperature, density, and pressure that it is impossible to carry out laboratory tests

under the appropriate circumstances and for the proper time scales. This is what distinguishes observational

stellar astronomy from experimental physics.

The structure and evolution of stars is mainly determined by the microscopic properties of stellar matter,


in particular the equation of state of the gas, the energy transport, the nuclear reactions, and the interaction

of radiation and matter. The equation of state determines the relations between the various thermodynamic

properties like temperature, density, and pressure inside the star. This equation of state is quite simple in the

case of stellar interiors because the prevalent high temperatures imply that matter is almost entirely ionized.

However, some complications occur, for example when the stellar gas is only partially ionized in the layers

near the stellar surface. In these regions, we need to take the degree of ionization into account. This level

of ionization depends on the interaction between the different gas components. In the stellar atmosphere of

cool stars, molecules may appear alongside neutral atoms, influencing the equation of state. On the other

hand, in the hot cores of evolved stars, the temperature may become so high that energy loss due to the

production of neutrinos has to be taken into account. Also, the density may become so high that the pressure

will be dominated by degenerate electrons or neutrons. In such a case, the ideal gas law is no longer valid

but the laws of quantum physics have to be considered to arrive at a proper equation of state.

A detailed description of the physical properties of stellar interiors needs in-depth studies. Fortunately,

it is possible to derive an appropriate description of stellar structure and evolution without going into the full

details and obtain a good solid understanding of how stars live their life. The theory described in this course

is elegant, has an impressive predictive power, and combines several branches of mathematics, physics,

chemistry, and computer science. Furthermore, via modern implementations of the theory into an efficient

set of modules called MESA, detailed numerical computations of this theory can be performed (with much

gratitude to the MESA developers team!). In that way, we are able to predict how the complicated internal

stellar structure changes during a star’s life and what the final end product of the stellar evolution will be: a

white dwarf, a neutron star or a black hole.

⋆ ⋆

The current 6 ECTS course Stellar Structure & Evolution (SSE) aims to give a stimulating derivation and

description of the physics of stars to Master students. This way, students will be able to understand the

origin, life and death of stars and learn how this cycle influences the chemical evolution of galaxies and of

the Universe as a whole. Introductory bachelor courses may have suggested that the branch of physics as a

scientific domain is composed of several separate sub-branches, like mechanics, thermodynamics, nuclear

physics, quantum mechanics, etc. In this stellar structure & evolution course, these seemingly separate

segments nicely come together in a natural way, taking chemistry onboard as well. This course is thus

intended for students who have a basic knowledge of mathematics, physics, and to some extent chemistry.

We tried to limit specific technical jargon to a minimum and to compose a course that is accessible to all

students at the start of the Master in Astronomy & Astrophysics, whatever their background in science.

The first chapter recalls some of the basics of bachelor courses, limited to those aspects and definitions that are essential as a starting point for the current course. Some of these aspects of astrophysics were already treated in Leuven Bachelor courses, which Master students who did their Bachelor studies elsewhere might not have encountered in their education. The ingredients that are of major importance here are therefore briefly repeated in this Master course.

Conny Aerts, September 2021.


PART I: BASIC INTRODUCTION TO STELLAR ASTROPHYSICS


Chapter 1

Observational framework of astronomy

All we know about stars and other celestial bodies is derived from observations of their electromagnetic spectrum. The amount of light radiated by a star is mainly determined by its size and by the temperature and chemical composition near its surface. The challenge is to decode this radiated light and deduce information that is not directly observable, such as the stellar mass, age, internal temperature, pressure, and chemical composition.

The brightness or luminosity $L$ of an object is the amount of energy it radiates per second. Together with the mass $M$, this quantity is a crucial parameter for a star's life. Neither of these quantities can be measured directly. The amount of energy coming from a star that can be measured with an instrument depends on the distance $d$ between star and observer. The determination of distances is therefore an important part of observational astronomy.

Often we will assume that stars roughly behave like black bodies, following Planck's radiation laws. A short description of these laws and their interpretation can be found in Appendix A. The relation between the temperature, luminosity, and radius of a black body is given by the Stefan-Boltzmann law:

$$L = 4\pi\sigma R^2 T^4, \tag{1.1}$$

where $\sigma = 2\pi^5 k^4/(15 c^2 h^3)$ is a constant, with $k$ the Boltzmann constant, $h$ the Planck constant, and $c$ the speed of light. The effective temperature $T_{\rm eff}$ of a star with radius $R$ and luminosity $L$ is then defined as the temperature of a spherical black body with the same radius and luminosity: $L = 4\pi R^2 \sigma T_{\rm eff}^4$. The effective temperature is thus a measure of the temperature in the star's photosphere, the region near the outer layers of the star from which the radiation escapes. The Sun has an effective temperature of ≃ 5780 K.

An important consequence of Planck's radiation laws is Wien's displacement law. This law states that the wavelength at which the maximum flux is radiated is entirely determined by the temperature of the radiating body, according to $\lambda_{\max} = (2.9/T)$ mm, where the temperature $T$ is expressed in kelvin. This implies that the Sun's radiation is at its maximum around 500 nm. Humans and planets radiate from infrared to submm and mm wavelengths. Stars have effective temperatures ranging from about 3000 K to some 50 000 K. Consequently, they radiate predominantly in the UV and at visual wavelengths, with the coolest stars peaking in the near-infrared.
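As a quick numerical check of Eq. (1.1) and of Wien's law, the following minimal Python sketch evaluates $\sigma$ from the expression given above and reproduces the solar values quoted in the text. The solar radius of $6.957 \times 10^8$ m is an assumption that is not given in these notes, and the function names are purely illustrative.

```python
import math

# Physical constants in SI units.
K_B = 1.380649e-23      # Boltzmann constant [J/K]
H = 6.62607015e-34      # Planck constant [J s]
C = 2.99792458e8        # speed of light [m/s]

# Stefan-Boltzmann constant from sigma = 2 pi^5 k^4 / (15 c^2 h^3).
SIGMA = 2.0 * math.pi**5 * K_B**4 / (15.0 * C**2 * H**3)

R_SUN = 6.957e8         # ASSUMPTION: solar radius [m], not quoted in the text


def luminosity(radius_m, t_eff_k):
    """Black-body luminosity L = 4 pi sigma R^2 T^4 in watts (Eq. 1.1)."""
    return 4.0 * math.pi * SIGMA * radius_m**2 * t_eff_k**4


def wien_peak_nm(temperature_k):
    """Wavelength of maximum flux in nm, using lambda_max = (2.9 / T) mm."""
    return 2.9 / temperature_k * 1.0e6  # convert mm to nm


sun_luminosity = luminosity(R_SUN, 5780.0)
sun_peak_nm = wien_peak_nm(5780.0)
print(f"sigma   = {SIGMA:.4e} W m^-2 K^-4")
print(f"L_sun   = {sun_luminosity:.3e} W")
print(f"lam_max = {sun_peak_nm:.0f} nm")
```

With these inputs the peak wavelength comes out close to the 500 nm mentioned above, and the luminosity is of the order of $10^{26}$ W.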


1.1 Magnitudes and colour indices

A system of magnitudes is a logarithmic scale for the amount of radiative energy received from an astronomical source. For two sources, the difference between the magnitude of source 2 and that of source 1 is given by

$$m_2 - m_1 = -2.5 \log \frac{S_2}{S_1}, \tag{1.2}$$

where $S$ is the amount of radiative energy received per unit time and $\log$ denotes the base-10 logarithm. Consequently, if more energy is received from source 2 than from source 1, the magnitude of source 2 is lower than the magnitude of source 1.
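The sign convention of Eq. (1.2) is easy to get wrong, so a small Python sketch may help. It implements the relation and its inverse; the function names are illustrative and not part of the course material.

```python
import math


def magnitude_difference(s2, s1):
    """Return m2 - m1 for received energies per unit time S2 and S1 (Eq. 1.2)."""
    return -2.5 * math.log10(s2 / s1)


def flux_ratio(delta_m):
    """Invert Eq. (1.2): return S2 / S1 for a given m2 - m1."""
    return 10.0 ** (-0.4 * delta_m)


# A source delivering 100 times more energy has a magnitude that is 5 lower.
dm = magnitude_difference(100.0, 1.0)

# A source that is one magnitude brighter delivers 100**(1/5) times more energy.
one_mag = flux_ratio(-1.0)

print(f"m2 - m1 for S2/S1 = 100: {dm:.2f}")
print(f"S2/S1 for m2 - m1 = -1: {one_mag:.4f}")
```

A difference of five magnitudes thus corresponds to a factor of 100 in received energy, and one magnitude to a factor of about 2.512.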

The concept of magnitudes goes back to the Greek astronomer Hipparchus of Nicaea. In the second century BC, Hipparchus classified all stars visible to the naked eye and divided them into six classes. The brightest stars were in class 1 and the faintest ones in class 6. Only in the nineteenth century was the more mathematical definition (1.2) introduced, and it was matched as closely as possible to Hipparchus' classification. To achieve this, the zero point of the magnitude scale had to be established in a well-determined way. Eq. (1.2) was then rewritten as

$$m = (m_1 + 2.5 \log S_1) - 2.5 \log S = C - 2.5 \log S. \tag{1.3}$$

Fixing the constant $C$ thus comes down to determining the zero point of the magnitude scale.

Another important aspect is the wavelength range covered by the magnitude system. One extreme is the so-called bolometric magnitude $m_{\rm bol}$: the magnitude of the object when all wavelengths of the electromagnetic spectrum are considered. The other extreme is a monochromatic magnitude $m_\lambda$, which is the magnitude when only one wavelength $\lambda$ is considered. In practice we use neither bolometric nor monochromatic magnitudes, but rather magnitudes related to a selected wavelength range. Hence, this range has to be specified.

The amount of received radiative energy $S$ in definition (1.3) can be described as follows. Define $S_\lambda$ as the amount of radiative energy an observer receives at wavelength $\lambda$ through continuum radiation (see below for a description). A fraction $\eta_\lambda$ of the radiation might be blocked from the observer by absorption lines in the electromagnetic spectrum. The received radiation at wavelength $\lambda$ is then $S_\lambda(1-\eta_\lambda)$. Moreover, let us assume that the sensitivity and the efficiency of the instrument used to measure the radiative energy are described by the function $\phi(\lambda)$. The amount of received radiative energy is then

$$S = \int_0^\infty S_\lambda (1-\eta_\lambda)\,\phi(\lambda)\,d\lambda. \tag{1.4}$$

The standard system used for the determination of magnitudes is the UBV system designed by Johnson and Morgan. The functions $\phi_U(\lambda)$, $\phi_B(\lambda)$, $\phi_V(\lambda)$ are determined by the wavelength filters used at the telescope. These functions peak at wavelengths of 365, 440, and 548 nm, respectively. The visual magnitude is then defined as

$$m_V = C_V - 2.5 \log \int_0^\infty S_\lambda (1-\eta_\lambda)\,\phi_V(\lambda)\,d\lambda, \tag{1.5}$$


where the constant $C_V$ is fixed in such a way that the visual magnitude matches the magnitude classification of Hipparchus as closely as possible. The $U$ and $B$ magnitudes are defined analogously to Eq. (1.5). The zero point of the magnitude scale was chosen in such a way that $m_U = m_B = m_V = 0$ for the bright star Vega, which has spectral type A0V (see below for the definition).

When the magnitudes are corrected for interstellar absorption and for extinction due to the Earth's atmosphere, the magnitude is determined only by the amount of radiative energy a source emits per unit time and by the distance of the source to the observer. To rule out the differences in distance, the sources are hypothetically placed at an equal distance from the Sun. Indeed, when the sources are at an equal distance, the differences in magnitude are solely determined by the differences in the amount of emitted radiative energy. This reasoning leads to the introduction of an absolute magnitude scale. For a spherical star, the product $S_\lambda(1-\eta_\lambda)$ in Eq. (1.4) can be linked to the outward radiation flux $F_\lambda^+$:

$$4\pi R^2 F_\lambda^+ = 4\pi d^2 S_\lambda (1-\eta_\lambda), \tag{1.6}$$

where $d$ is the distance between the star and the observer and $R$ is the stellar radius. We thus have

$$S_\lambda (1-\eta_\lambda) = \left(\frac{R}{d}\right)^2 F_\lambda^+. \tag{1.7}$$

The expression for what we call the apparent magnitude is then

$$m = C - 2.5 \log \left[ \left(\frac{R}{d}\right)^2 \int_0^\infty F_\lambda^+\,\phi(\lambda)\,d\lambda \right]. \tag{1.8}$$

Hence, the apparent magnitude is not only determined by the amount of emitted energy but also by the

distance of the star to the observer.

We now introduce the absolute magnitude $M$ of a star. This is the apparent magnitude the star would have if it were placed at a distance of 10 parsec from the Sun. The difference between the absolute and apparent magnitude is then

$$M - m = 2.5 \log \left(\frac{10\,\mathrm{pc}}{d}\right)^2. \tag{1.9}$$

This can also be written as

$$M = m + 5 - 5 \log d_{\rm pc}, \tag{1.10}$$

where $d_{\rm pc}$ is the distance in parsec. A parsec is about 3.26 light years, which corresponds to roughly $3.09 \times 10^{13}$ km. The difference $m - M$ is called the distance modulus.
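Relation (1.10) and the distance modulus lend themselves to a short numerical illustration. The Python sketch below is only an example under stated assumptions: the apparent visual magnitude of the Sun ($-26.74$) and the conversion of 206 264.806 astronomical units per parsec are values not given in these notes, and the function names are illustrative.

```python
import math

AU_PER_PC = 206264.806   # ASSUMPTION: astronomical units per parsec
M_V_SUN_APPARENT = -26.74  # ASSUMPTION: apparent visual magnitude of the Sun


def absolute_magnitude(m, d_pc):
    """Eq. (1.10): M = m + 5 - 5 log10(d_pc)."""
    return m + 5.0 - 5.0 * math.log10(d_pc)


def distance_modulus(d_pc):
    """m - M for a source at distance d_pc (in parsec)."""
    return 5.0 * math.log10(d_pc) - 5.0


def distance_from_modulus(mu):
    """Invert the distance modulus: distance in parsec for a given m - M."""
    return 10.0 ** (mu / 5.0 + 1.0)


# A star of apparent magnitude 10 at 100 pc has absolute magnitude 5.
m_abs_star = absolute_magnitude(10.0, 100.0)

# The Sun, seen from 1 au, moved to the standard distance of 10 pc.
m_abs_sun = absolute_magnitude(M_V_SUN_APPARENT, 1.0 / AU_PER_PC)

print(f"M (m = 10, d = 100 pc) = {m_abs_star:.2f}")
print(f"M_V of the Sun         = {m_abs_sun:.2f}")
print(f"d for m - M = 7        = {distance_from_modulus(7.0):.1f} pc")
```

Under these assumptions the Sun would be a rather faint star of absolute visual magnitude close to 4.8 when seen from 10 pc.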

Finally, we introduce the term colour index. Colour indices of stars are differences between the magnitudes of the same star at different wavelengths. With the three magnitudes $U$, $B$, $V$, two commonly used colour indices are constructed: $U - B$ and $B - V$.

If we apply relation (1.10) to two different magnitudes of the same star and subtract term by term, we find

$$M_2 - M_1 = m_2 - m_1. \tag{1.11}$$


The difference in apparent magnitude is easily measured and, by Eq. (1.11), independent of distance. The colour indices are thus a measure of an intrinsic characteristic of the star. The index $B - V$, for example, is a good measure of the effective temperature of the star.
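To illustrate why $B - V$ traces the effective temperature, the following Python sketch evaluates the integral of Eq. (1.4) numerically and forms a synthetic colour index. It is a rough sketch under several assumptions that are not part of these notes: the stars are treated as black bodies, line blocking is ignored ($\eta_\lambda = 0$), the filter functions are Gaussians of an arbitrarily chosen width centred on the peak wavelengths quoted above, and Vega is replaced by a 9600 K black body to fix the zero point. The resulting numbers are therefore qualitative only.

```python
import math

H = 6.62607015e-34   # Planck constant [J s]
C = 2.99792458e8     # speed of light [m/s]
K_B = 1.380649e-23   # Boltzmann constant [J/K]

# Filter peak wavelengths quoted in the text [nm].
FILTER_PEAK_NM = {"U": 365.0, "B": 440.0, "V": 548.0}
FILTER_SIGMA_NM = 40.0   # ASSUMPTION: Gaussian filter width, purely illustrative
T_REFERENCE_K = 9600.0   # ASSUMPTION: black-body stand-in for Vega


def planck_lambda(wavelength_m, temperature_k):
    """Planck function B_lambda(T) in W m^-3 sr^-1."""
    x = H * C / (wavelength_m * K_B * temperature_k)
    return 2.0 * H * C**2 / (wavelength_m**5 * math.expm1(x))


def band_energy(band, temperature_k, n_steps=2000):
    """Trapezoidal estimate of the integral in Eq. (1.4) with eta_lambda = 0.

    The geometric factor (R/d)^2 is omitted because it cancels in a colour index.
    """
    peak = FILTER_PEAK_NM[band]
    lower = peak - 5.0 * FILTER_SIGMA_NM
    step = 10.0 * FILTER_SIGMA_NM / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        wavelength_nm = lower + i * step
        phi = math.exp(-0.5 * ((wavelength_nm - peak) / FILTER_SIGMA_NM) ** 2)
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * phi * planck_lambda(wavelength_nm * 1.0e-9, temperature_k)
    return total * step


def colour_index(band_1, band_2, temperature_k):
    """Synthetic colour m_band1 - m_band2, zero for the reference black body."""
    def raw(t):
        return -2.5 * math.log10(band_energy(band_1, t) / band_energy(band_2, t))
    return raw(temperature_k) - raw(T_REFERENCE_K)


for t_eff in (4000.0, 5780.0, 9600.0, 20000.0):
    print(f"T_eff = {t_eff:7.0f} K   B - V = {colour_index('B', 'V', t_eff):+.2f}")
```

In this toy model $B - V$ decreases monotonically with increasing temperature: cool stars come out with positive values and hot stars with negative ones.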

1.2 Spectral types and luminosity classes

Figure 1.1 shows the stellar flux as a function of wavelength $\lambda$ for different types of stars, ordered from cool at the top to hot at the bottom. The hottest stars are blue and their spectra show absorption lines of ionized atoms. Cool stars, on the other hand, radiate most strongly at red wavelengths and show absorption lines of neutral atoms and molecules.

1.2.1 The formation of spectral lines in the stellar spectrum

Spectra of astronomical objects show continua with spectral lines superposed on them. The latter can occur in absorption as well as in emission relative to the local continuum. We will now briefly describe the formation of continua and spectral lines. For a more extensive description we refer to the courses Radiative Processes in Astronomy and Stellar Atmospheres, taught in the Bachelor of Physics and the Master of Astronomy & Astrophysics at KU Leuven, respectively.

Spectral lines

Spectral lines are the result of discrete energy transitions, such as the jumps of an electron between bound levels in an atom or an ion. These are called bound-bound transitions or bb-transitions. Excitation to a higher energy level can be caused either by the absorption of kinetic energy (collisional excitation) or by the absorption of a photon (radiative excitation). Analogously, de-excitation to a lower energy level can be caused by a collision (collisional de-excitation) or by the emission of a photon (radiative de-excitation).

The energy exchange accompanying a bb-transition always involves an energy difference $h\nu = E_{mn}$, where $E_{mn} = E_m - E_n$ is the difference in energy between the levels $m$ and $n$ ($m > n$). The photons involved must therefore have the specific wavelength $\lambda = hc/E_{mn}$.
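As a concrete instance of $\lambda = hc/E_{mn}$, the Python sketch below computes the wavelengths of two hydrogen lines. The hydrogen level energies $E_n = -13.6057\,\mathrm{eV}/n^2$ (Bohr model, infinite nuclear mass) are an assumption brought in for illustration and are not derived in these notes; the function names are illustrative as well.

```python
H = 6.62607015e-34     # Planck constant [J s]
C = 2.99792458e8       # speed of light [m/s]
EV = 1.602176634e-19   # joule per electronvolt
RYDBERG_EV = 13.605693  # ASSUMPTION: hydrogen ionization energy [eV]


def hydrogen_level_ev(n):
    """Bohr-model energy of hydrogen level n in eV (assumption, see lead-in)."""
    return -RYDBERG_EV / n**2


def transition_wavelength_nm(e_upper_ev, e_lower_ev):
    """Wavelength lambda = h c / E_mn in nm for a bound-bound transition."""
    e_mn_joule = (e_upper_ev - e_lower_ev) * EV
    return H * C / e_mn_joule * 1.0e9


# Transition between levels m = 3 and n = 2 (the H-alpha line).
h_alpha_nm = transition_wavelength_nm(hydrogen_level_ev(3), hydrogen_level_ev(2))

# Transition between levels m = 2 and n = 1 (the Lyman-alpha line).
ly_alpha_nm = transition_wavelength_nm(hydrogen_level_ev(2), hydrogen_level_ev(1))

print(f"H-alpha:     {h_alpha_nm:.1f} nm")
print(f"Lyman-alpha: {ly_alpha_nm:.1f} nm")
```

The first line falls in the red part of the visual spectrum and the second in the UV, which shows how strongly the line wavelength depends on the energy difference between the levels.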

Spectral lines are always linked to discrete bb-processes. However, emission lines are not always the result of radiative de-excitation, and absorption lines are not always the result of radiative excitation. The origin of a spectral line depends on the radiative transport throughout the medium. In general, spectral lines are the result of extra bb-processes that occur at specific line wavelengths in the medium, in addition to the processes that define the continuum spectrum at that wavelength.


Figure 1.1: Optical stellar spectra of main-sequence stars with approximately the same chemical composition but with increasing effective temperature from top to bottom. (From Sparke & Gallagher, 2000, Cambridge University Press)


Continua

Continua are the result of non-discrete processes in which photons are absorbed or emitted. First, there are bound-free transitions (or bf-transitions) of atoms and ions. For example, an electron is released from a bound state $n$ through the absorption of a photon with an energy higher than or equal to the ionization energy $E_{\infty n} = E_\infty - E_n$ from that level. This is called radiative ionization. Conversely, the capture of a free electron into a bound state is accompanied by the emission of a photon with an energy higher than or equal to $E_{\infty n}$. This is called radiative recombination. Ionization and recombination can also occur via the absorption or release of kinetic energy without any involvement of photons. These processes are called collisional ionization and collisional recombination. The free states of the electron above the ionization level are not discrete because the free electron can have an arbitrarily large amount of kinetic energy $m_e v^2/2$; in other words, $h\nu = E_{\infty n} + m_e v^2/2$.
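
The energy balance $h\nu = E_{\infty n} + m_e v^2/2$ can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: hydrogen levels with $E_{\infty n} = 13.598\,\mathrm{eV}/n^2$, a non-relativistic electron, and function names of our own choosing.

```python
import math

HC_EV_ANGSTROM = 12398.42      # h*c in eV * Angstrom
E_ION_H_EV = 13.598            # hydrogen ionisation energy from n = 1 [eV]
EV_IN_ERG = 1.602177e-12       # 1 eV in erg (cgs)
M_E_GRAM = 9.109384e-28        # electron mass [g]


def ionisation_energy_ev(n):
    """E_inf,n = E_inf - E_n for hydrogen level n (assumed 13.598 eV / n^2)."""
    return E_ION_H_EV / n**2


def series_limit_angstrom(n):
    """Longest wavelength that can still ionise hydrogen from level n."""
    return HC_EV_ANGSTROM / ionisation_energy_ev(n)


def electron_speed_cm_s(wavelength_angstrom, n):
    """Speed of the freed electron from h nu = E_inf,n + m_e v^2 / 2 (non-relativistic)."""
    excess_ev = HC_EV_ANGSTROM / wavelength_angstrom - ionisation_energy_ev(n)
    if excess_ev < 0:
        raise ValueError("photon energy is below the ionisation threshold")
    return math.sqrt(2.0 * excess_ev * EV_IN_ERG / M_E_GRAM)


if __name__ == "__main__":
    print(f"Balmer limit (n = 2): {series_limit_angstrom(2):.0f} Angstrom")
    print(f"Electron speed for a 3000 Angstrom photon: {electron_speed_cm_s(3000.0, 2):.2e} cm/s")
```

Any photon with a wavelength shorter than the series limit can ionise from level $n$, which is why bf-transitions produce a continuum rather than a line.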

Bound-bound excitation and de-excitation, as well as bound-free ionization and recombination, can

occur through the absorption or emission of radiative energy from photons as well as through the absorption

or emission of kinetic energy through the collision of particles.

Furthermore, there are also free-free transitions (or ff-transitions), also called Bremsstrahlung. This is the emission or absorption of photons as a result of the acceleration or deceleration of a charged particle in a Coulomb field, for example during a collision between an ion and an electron.

Photon creation, photon destruction, photon scattering

We can group the bb-processes into three pairs:

1. collisional excitation followed by radiative de-excitation. This results in the creation of a photon. In

this case kinetic energy is transformed into radiation;

2. radiative excitation followed by collisional de-excitation. This results in the destruction of a photon.

In this case radiation is transformed into kinetic energy;

3. radiative excitation followed by radiative de-excitation. This is called scattering of a photon. In this

case only a repartition of radiation occurs.

When photon scattering occurs, at least the direction of the scattered photon differs from that of the incoming photon. For the lower atomic levels, photon scattering is an important process because the decay time for radiative de-excitation is very short at these low levels (typically $10^{-10}$ to $10^{-9}$ seconds). When a line transition is caused by scattering from the ground state, it is called a resonance line. In this case, the scattering process is called resonant scattering. Resonance lines represent the lowest possible energy transitions from the ground state. Consequently, the transitions involved have a short lifetime, and therefore occur very frequently at high densities as well as at low densities (there is always a huge reservoir of electrons in the ground state waiting for a photon to arrive so that they can make a line transition based on photon scattering).


Figure 1.2: Energy levels for the hydrogen atom. The bound levels approach the ionisation boundary at 13.598 eV. For each of the first four hydrogen levels, the transitions are indicated by vertical lines with the name and wavelength of the corresponding spectral line. The limit of each series is also indicated. The Lyman lines are found in the UV, the Balmer lines in the visual and the Paschen and Brackett lines in the infra-red. (From Rob Rutten, 2003, Lecture Notes, Utrecht University, NL)

1.2.2 Spectral types

Since the birth of spectroscopy in the second half of the 19th century, astronomers have classified the stars

in classes according to the strength of their Balmer lines. These are absorption lines of neutral hydrogen

(HI, see Figure 1.2) 1. This way the A stars were defined as the ones with the strongest Balmer lines, B stars

are second according to the strength of these lines and so on. At the end of the 19th century the Harvard

astronomer Antonia Maury realized that the strength of all spectral lines, not only the Balmer lines, followed

a nice sequence when she ordered the classes according to O B A F G K M. This is illustrated in Figure 1.3.

On this basis the first large-scale stellar spectrum classification was carried out at Harvard College

Observatory. This was possible thanks to the financial contribution of Mrs. Draper, who wanted a beautiful, lasting memorial to her deceased husband. Henry Draper was the first person ever to photograph a stellar spectrum. The classification was carried out between 1886 and 1924 under the leadership of Annie

1 In astronomy we use a different notation for ions and for the lines they create in a spectrum. For example, when we speak of “iron four” and write down Fe IV, we mean the spectrum of the Fe$^{3+}$ ion. H I therefore means the spectrum of neutral hydrogen.


Figure 1.3: The spectral sequence of Harvard. These example spectra are printed in such a way that the absorption lines are dark on a bright background of the continuum radiation of the star. The wavelengths are, as is common in astronomy, expressed in ångström (1 Å = 0.1 nm = $10^{-8}$ cm). The brightest parts of the spectrum shift in wavelength from the early-type stars (O and B) to the late-type stars (GKM). (From Rob Rutten, 2003, Lecture Notes, Utrecht University, NL)

Cannon 2. Almost 240 000 stars were classified in the Henry Draper Catalogue (in exchange for 250 000 dollars from Mrs. Draper). Nowadays, there are 400 000 classified stars if we include those of the catalogue’s supplement.

Today we know that the sequence of Maury is one of descending effective temperature and that the line strength in the spectra is determined by the ionization law of Saha and the Boltzmann equation. This very important interpretation was made by Cecilia Payne-Gaposchkin (1925) 3 in her doctoral study. She demonstrated that stars are mainly made of hydrogen (∼ 70%) and helium (∼ 28%), with only

2 Note that women were not allowed to study astronomy, except as a hobby, until the classification of tens of thousands of stars had to be carried out. The idea and plan to go ahead with this classification were not only the work of women; the implementation also required such patient work, at such a low wage, that Edward Pickering, director of the Harvard College Observatory, could not find men who were willing to do this task. This is how women entered professional astronomy in the first half of the 20th century.

3 Cecilia studied astronomy at Cambridge, UK, with Sir Arthur Eddington. Eddington was, however, convinced that women were not apt to do research in astronomy. Intrigued as she was by astronomy, and displeased with Eddington’s attitude, Cecilia left for Harvard, USA, where she was welcomed to continue her research work. She did this brilliantly.


∼ 2% of heavier elements (called metals for short) 4.

Each of the seven classes was subsequently divided into ten subclasses, from 0 for the hottest to 9 for the coolest stars in one class. This scheme is still in use today (so Mrs. Draper should be pleased). The

Sun is a star of spectral type G2. Recently, an additional class L was added for very cool stars detected by

infra-red observations. Often the first classes of the hot stars (OBA) are called early-type stars while the

classes at the end of the classification series (GKML) are called late-type stars.

The effective temperature of O stars is higher than 30 000 K. Their strongest spectral lines are those of singly ionized helium (He II) and doubly ionized carbon (C III). Their Balmer lines are weak because almost all the hydrogen in the photosphere is ionized at such high temperatures. The spectra of B stars do have strong Balmer lines and also strong lines of neutral helium (He I); their temperature ranges from 12 000 K at B9 to 25 000 K at B0.

The A stars have temperatures around 10 000 K and are cool enough to keep the hydrogen neutral in their photosphere. Next to very strong Balmer lines they also have many lines of singly ionized metals, like calcium. Also prominent in their spectra is the so-called Balmer jump at 365 nm (3646 Å, see Figure 1.2).
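
The behaviour of the Balmer lines along the sequence (weak in O stars, strongest in A stars) can be sketched numerically by combining the Boltzmann equation with the Saha ionization law. The Python sketch below is an illustration only and rests on several assumptions of ours: a pure-hydrogen gas, a constant electron pressure of 200 dyn cm$^{-2}$ (an arbitrary illustrative value), partition functions $Z_{\rm I} = 2$ and $Z_{\rm II} = 1$, and neglect of all levels above $n = 2$.

```python
import math

# cgs constants
K_B = 1.380649e-16        # Boltzmann constant [erg/K]
H_PLANCK = 6.62607e-27    # Planck constant [erg s]
M_E = 9.109384e-28        # electron mass [g]
EV = 1.602177e-12         # 1 eV in erg
CHI_H = 13.598 * EV       # hydrogen ionisation energy [erg]


def n2_over_n1(T):
    """Boltzmann equation: ratio of H atoms in n = 2 to n = 1 (g_n = 2 n^2)."""
    delta_e = CHI_H * (1.0 - 1.0 / 4.0)          # E_2 - E_1
    return 4.0 * math.exp(-delta_e / (K_B * T))


def ionised_over_neutral(T, p_e):
    """Saha equation for hydrogen, assuming Z_I = 2 and Z_II = 1."""
    thermal = (2.0 * math.pi * M_E * K_B * T / H_PLANCK**2) ** 1.5
    return (2.0 * K_B * T / p_e) * 0.5 * thermal * math.exp(-CHI_H / (K_B * T))


def fraction_in_n2(T, p_e=200.0):
    """Fraction of all hydrogen (neutral + ionised) that sits in level n = 2."""
    ratio = n2_over_n1(T)
    neutral_in_n2 = ratio / (1.0 + ratio)         # N_2 / N_I, ignoring n > 2
    neutral_fraction = 1.0 / (1.0 + ionised_over_neutral(T, p_e))
    return neutral_in_n2 * neutral_fraction


def peak_temperature(p_e=200.0):
    """Temperature (on a 100 K grid) where the n = 2 population is largest."""
    return max(range(5000, 25001, 100), key=lambda T: fraction_in_n2(T, p_e))


if __name__ == "__main__":
    print("Balmer absorbers peak near", peak_temperature(), "K")
```

Under these assumptions the population of the lower level of the Balmer lines peaks near 10 000 K: at lower temperatures too few atoms are excited to $n = 2$, while at higher temperatures most of the hydrogen is ionized.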

F stars have weaker Balmer lines than A stars. In their spectra, lines of neutral metals are visible. In G stars like the Sun, the singly ionized calcium lines (Ca II) at 3934 Å and 3968 Å are prominent. These were discovered by Fraunhofer in 1815. He labelled all the lines he could distinguish in the solar spectrum from A to K, going from red to blue wavelengths. The strongest calcium lines are therefore still called the H and K lines today. The D line of neutral sodium (Na I) is also prominent in G stars.

In the spectra of K stars we can mainly distinguish lines of neutral metals and of molecules like TiO (titanium oxide). M stars are mostly cooler than 4 000 K at their surface. Therefore we can find deep absorption bands of TiO and VO (vanadium oxide) as well as lines of neutral metals. In the even cooler L stars, the sodium D lines are prominent, next to broad molecular bands.

1.2.3 Luminosity classes

The lines in a stellar spectrum give us information not only about the effective temperature and chemical composition, but also about the value of the surface gravity. This quantity refers to the gravitational acceleration at the surface of the star, namely $g \equiv GM/R^2$, with $M$ the stellar mass 5 and $R$ the stellar radius. In astronomy, quantities are mostly expressed in the cgs system. This is due to historical reasons, but also because this gives “handy numbers” for some important observational parameters characterizing stars. We refer to Appendix B for the values of physical and astronomical constants, in this system as well as in the SI system. This gives values

4 Although in other branches of science carbon, oxygen, nitrogen, iron, . . . are not grouped under the single denominator “metals”, it is very meaningful to do so in astronomy. This is because hydrogen and helium (and a tiny bit of lithium) were formed within half an hour after the Big Bang, while all other elements were only formed afterwards through nucleosynthesis in stellar interiors.

5 For the mass and the absolute magnitude of a star the same symbol is used. It is always clear from the context which quantity is meant.


Figure 1.4: Optical stellar spectra of three stars of spectral type A, but belonging to a different luminosity

class. (From Sparke & Gallagher, 2000, Cambridge University Press)

for log g between 1 and 5 for most stars, and log g from 6 to 8 for compact stellar remnants.
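
To get a feeling for these numbers, the following Python sketch evaluates $\log g$ with $g = GM/R^2$ in cgs units. The masses and radii of the three example objects are round, illustrative values chosen by us, not measurements of particular stars.

```python
import math

G_CGS = 6.674e-8       # gravitational constant [cm^3 g^-1 s^-2]
M_SUN = 1.989e33       # solar mass [g]
R_SUN = 6.957e10       # solar radius [cm]


def log_g(mass_msun, radius_rsun):
    """log10 of the surface gravity g = G M / R^2, with g in cm s^-2."""
    g = G_CGS * mass_msun * M_SUN / (radius_rsun * R_SUN) ** 2
    return math.log10(g)


if __name__ == "__main__":
    # Illustrative (assumed) masses and radii in solar units
    examples = {
        "Sun (dwarf)": (1.0, 1.0),
        "supergiant": (10.0, 100.0),
        "white dwarf": (0.6, 0.01),
    }
    for name, (mass, radius) in examples.items():
        print(f"{name:12s} log g = {log_g(mass, radius):.2f}")
```

The Sun comes out at $\log g \approx 4.4$, the assumed supergiant between 1 and 2, and the assumed white dwarf at about 8, showing how strongly $\log g$ responds to the stellar radius.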

In Figure 1.4 the spectra of three A stars are shown. The top star is a dwarf, like the Sun, the middle star is a giant and the bottom one is a supergiant. Their log g decreases from top to bottom because the radius increases in that direction. So the dwarf star is much denser than the giant and the supergiant, and the atoms in this star are thus more crowded. This has an effect on the spectral lines because they are subject to Stark pressure broadening. This implies that the width of a spectral line at a well-defined temperature is mainly a function of the pressure experienced by the atoms responsible for the line.

Therefore stars are not only divided into classes according to their temperature, but also following their

gravity. Most stars are dwarf stars like the Sun. Calling them dwarfs is somewhat misleading because the

hottest dwarfs are much bigger than the Sun, having a radius of about 10 R⊙. Giants and supergiants are in

any case much bigger, with radii of respectively a factor 10 to 100, and a factor 100 to 1000 times bigger

than the radius of the Sun. According to Eq. (1.1), they have a much higher luminosity than dwarf stars

with the same temperature. Moreover, there are also white dwarfs, which differ very much from the Sun. Actually, these are no longer stars, but compact stellar remnants left over at the end of stellar evolution. This is also the case for neutron stars. Compact remnants have such a high density that the gravity of a white dwarf reaches log g between 7 and 8 (in cgs), while that of a neutron star is far higher still. This is because their radius is very small, typically 0.01 R⊙ (i.e. about the Earth’s radius) for a white dwarf and only of the order of ten kilometers for a neutron star.

In practice, stars are classified in luminosity classes: VII for white dwarfs, VI for subdwarfs, V for dwarfs, IV for subgiants, III for normal giants, II for bright giants, and I for supergiants. The last are subdivided into Ia and Ib according to their luminosity (Ia being the most luminous, Ib less luminous). Often lowercase letters are added to the luminosity class, based on the appearance of specific characteristics of the spectral lines. In this way, stars whose Balmer lines are observed in emission are labelled with a lowercase “e” after the luminosity class, hot stars with N III and He II lines in emission get an “f”, etc.

1.2.4 Stellar atmosphere models

In practice, the effective temperature and gravity of a star are estimated by comparing its spectrum with those of other stars for which these values are already known. So-called model atmospheres are also used for this purpose. These are computer models calculating how radiation is transported through a stellar atmosphere with a given effective temperature, log g and chemical composition. Model atmospheres are then calibrated on stars for which high-resolution spectra with a wide wavelength coverage are available and whose effective temperature, gravity, chemical composition, interstellar absorption and distance are all well known (so-called standard stars or calibration stars).

Computed model atmospheres also allow one to determine the surface abundances of the elements in a star, based on high-resolution, high signal-to-noise spectra. The purpose of such research is to determine the amount of heavy elements in the stellar atmosphere and to compare it with the outcome of evolutionary models. The abundances are always expressed in relation to those of the Sun. Abundance determination

is of major importance for the interpretation of stellar spectra of evolved stars, as treated in Part III of this

course. The modelling and interpretation of measured stellar spectra is discussed in detail in the KU Leuven

Bachelor and Master courses Radiative Processes in Astronomy and Stellar Atmospheres.

1.3 The Hertzsprung-Russell diagram

The Hertzsprung-Russell diagram (HR diagram, named after the American astronomer Henry Russell and the Danish astronomer Ejnar Hertzsprung) displays an important statistical relation for stars. The diagram represents the evolution of stars and is thus a basic diagnostic for the discussion of

stellar evolution theory. A very schematic representation of the HR diagram according to the luminosity

classes is shown in Figure 1.5, while Figure 1.6 shows the connection between the effective temperature and

spectral type on the x-axis, with several “well-known” stars named in the graph.

Russell was the first to study the relation between the spectral type and the absolute magnitude MV

of stars. To this end, he constructed a diagram in which he plotted the absolute magnitude versus the spectral

type. On the other hand, Hertzsprung noticed a difference between dwarf stars and giant stars for late

spectral types. Often the colour index B − V is put on the abscissa instead of the spectral type or effective

temperature. This is then referred to as the colour-luminosity diagram (CD) or colour-magnitude diagram

(CMD). The use of the colour index has the advantage that this observable is a continuous variable, which

is not the case for the spectral type, which is a so-called categorical (discrete) data-type. In this way, one


Figure 1.5: Schematic representation of the HR diagram, where the luminosity (with respect to the solar value) is represented as a function of the effective temperature. The positions of the dwarfs on the main sequence, the red giants, the supergiants and the white dwarfs are indicated. The Sun is a main-sequence dwarf star of spectral type G2, with an effective temperature of 5 780 K (see Figure 1.6 for its position with respect to other bright stars; source: tim-thompson.com)

can incorporate much fainter stars in a C(M)D, because these can be observed photometrically and not necessarily spectroscopically.

In the schematic HR diagram shown in Figure 1.5 we notice that the stars are not scattered in an arbitrary

way. Certain combinations of luminosity and effective temperature occur much more frequently than others.

The vast majority of stars is situated in one of the following groups: the main sequence, the (red) giants and

supergiants groups, or the white dwarfs group. Most of the stars belong to the main sequence, going from

stars with negative absolute visual magnitude and low colour index (hot blue stars) to stars with a high

absolute visual magnitude and high colour index (red dwarfs). The Sun is a rather ordinary main-sequence

star of spectral type G2V with an absolute visual magnitude MV = 4.79 and effective temperature of

5 780 K.

Since the absolute magnitude of a star is only known if its visual magnitude and distance are known,

the determination of precise distances is important to deduce the positions of stars in the HR diagram. In the

current era, the satellite mission Gaia of the European Space Agency (ESA) is determining precise distances of about a billion stars in our Milky Way. The first and second data releases of the mission 6, DR1 and DR2, took place on 14 September 2016 and 25 April 2018, respectively. The DR1 data already allowed a very detailed observational HR diagram to be composed, albeit for the close vicinity of the Sun. All stars

6http://gea.esac.esa.int/archive/


Figure 1.6: HR diagram, where the luminosity (with respect to the solar value) is represented as a function of the effective temperature and corresponding spectral type. The positions of various bright stars are indicated. (Source: Wikipedia)

in Gaia DR1 with a relative parallax precision better than 20% are shown in the HR diagram in Figure 1.7. A particular distribution of the stars is again obvious. The main sequence and the red giant groups are prominently present. Gaia was not yet able to measure the distances of the large number of distant white dwarfs accurately enough for them to be part of DR1. That is why this group of stars is not densely populated in the observational Gaia DR1 HR diagram. This is also the case for massive OB-type stars of all luminosity classes. These are far away from the Sun and their distances could not be determined precisely enough to be part of the DR1 Gaia HR diagram.
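
The two steps described above, selecting stars on relative parallax precision and converting apparent to absolute magnitude, can be sketched in a few lines of Python. This is a simplified illustration, not the procedure of the Gaia collaboration: it assumes the distance is simply the inverse of the parallax, ignores interstellar absorption, and uses made-up example stars and field names.

```python
import math


def absolute_magnitude(apparent_mag, parallax_mas):
    """M = m - 5 log10(d / 10 pc), with d [pc] = 1000 / parallax [mas]; no extinction."""
    distance_pc = 1000.0 / parallax_mas
    return apparent_mag - 5.0 * math.log10(distance_pc / 10.0)


def select_precise(stars, max_relative_error=0.20):
    """Keep the stars whose relative parallax error is below the threshold."""
    return [s for s in stars
            if s["parallax_mas"] > 0
            and s["parallax_error_mas"] / s["parallax_mas"] < max_relative_error]


# Made-up example stars (not real catalogue entries)
stars = [
    {"name": "A", "V": 5.0, "parallax_mas": 100.0, "parallax_error_mas": 0.3},
    {"name": "B", "V": 12.0, "parallax_mas": 2.0, "parallax_error_mas": 0.3},
    {"name": "C", "V": 14.0, "parallax_mas": 0.5, "parallax_error_mas": 0.3},
]

for star in select_precise(stars):
    m_abs = absolute_magnitude(star["V"], star["parallax_mas"])
    print(star["name"], round(m_abs, 2))
```

For a fixed measurement error, the most distant star (C) has the smallest parallax and fails the 20% criterion, which illustrates why distant white dwarfs and OB stars are underrepresented in the DR1 diagram.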

Figures 1.8 and 1.9 show the latest results from DR2 for 32 open and 14 globular clusters, respectively.

We come back to the issue of stellar ageing and its metallicity dependence from clusters in Chapter 10.


Figure 1.7: Observational CMD for the solar neighbourhood constructed on the basis of measurements by the ESA satellite Gaia, from DR1 (Gaia Collaboration, Brown et al., A&A, Vol. 595, id.A2, 23 pp., 2016). All stars whose parallax was measured with a relative precision better than 20% are shown. The contours indicate the boundaries of the regions containing 10, 30, and 50% of all the stars shown.

Subsequent Gaia data releases with even more precise parallaxes, including as well the astrometric solutions

for binaries, are foreseen for 2021 and beyond 2022.

The version of the HR diagram used for studies of stellar evolution theory has the logarithm of the

luminosity against the effective temperature of the star and includes so-called evolutionary tracks. This

diagram is shown on the cover of these lecture notes. It is the diagram we shall use throughout this course.


Figure 1.8: Observational CMD of 32 open clusters constructed on the basis of measurements by the ESA

satellite Gaia, from DR2 (Gaia Collaboration, Babusiaux et al., A&A, Vol. 616, id.A10, 29 pp., 2018). The

colour coding is according to age.

1.4 Stars in our Milky Way

Our Milky Way is a spiral galaxy consisting of a central bulge with a radius of some kpc and an extensive

flat disk, surrounded by a halo of star clusters (Figure 1.10). The Sun is situated in the disk at some 8 kpc from the Galactic centre. Around the Sun, approximately one star per 10 pc$^3$ is found, so the interstellar space is

rather empty. The interstellar medium consists mostly of gas and dust, efficiently absorbing and re-emitting

stellar radiation. This radiation shows us that the interstellar medium is mainly composed of H−, H and H2.

More complex molecules, like CO, HCN and CS, also occur.

The disk of the Milky Way is composed of a thin part (300 to 400 pc) where dark gas and dust clouds

are found and where new stars are being formed all the time. On the other hand, there is a thick disk (1000

to 1500 pc) where star formation took place earlier on in the history of the Milky Way. The stars which were

formed there clearly contain fewer metals. The bulge consists of a dense nucleus of massive stars and a black hole with a mass of around four million solar masses.

The disk as well as the bulge of the Milky Way rotate around its centre. Stars in the disk move on circular orbits with a velocity of some 200 km s$^{-1}$, so it takes the Sun about 250 million years to complete one orbit. The stars in the bulge move randomly at velocities of some tens of km s$^{-1}$. The stars in the metal-poor spherical star clusters do not undergo a global rotational movement around the bulge in the centre of the galaxy. They move randomly and their orbits are often very eccentric, so that they are mainly situated far above the disk and only once in a while pass through it. At such moments, they lose their gas,


Figure 1.9: Observational CMD of 14 globular clusters constructed on the basis of measurements by the ESA satellite Gaia, from DR2 (Gaia Collaboration, Babusiaux et al., A&A, Vol. 616, id.A10, 29 pp., 2018). The colour coding is according to metallicity.

which remains captured in the disk.
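
The orbital period of the Sun quoted above follows directly from the circumference of its orbit and its velocity. A minimal Python check, assuming a perfectly circular orbit at 8 kpc from the centre and a constant speed of 200 km s$^{-1}$:

```python
import math

KM_PER_PC = 3.0857e13          # kilometres in one parsec
SECONDS_PER_YEAR = 3.156e7     # seconds in one year


def orbital_period_myr(radius_kpc, speed_km_s):
    """Period of a circular orbit, P = 2 pi r / v, in millions of years."""
    circumference_km = 2.0 * math.pi * radius_kpc * 1000.0 * KM_PER_PC
    return circumference_km / speed_km_s / SECONDS_PER_YEAR / 1.0e6


if __name__ == "__main__":
    print(f"Orbital period of the Sun: {orbital_period_myr(8.0, 200.0):.0f} Myr")
```

The result is close to the 250 million years mentioned in the text.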

In general, the stars are divided in different populations: on the one hand according to their metallicity

and on the other hand according to their location and movement in the galaxy (see Figure 1.10). Popula-

tion I stars have a relatively high metallicity and are concentrated in or near the galactic plane. They move

according to the rotation of the galaxy. Population II stars, on the other hand, have an extremely low metal-

licity. They are found far from the galactic plane and do not follow the rotation of the disk of the Milky

Way. The interpretation of this division into populations is that the Population II stars were formed before the matter in the galaxy had collapsed into a disk, while the Population I stars were born later in the disk, a process that continues up to the present day and into the future. Given that there is a thick and a thin disk, the classification of the stars formed in the thick disk in terms of only two populations is not that simple. Moreover, a third category of Population III stars has been introduced to denote stars born straight after the Big Bang, having no metals at all. We will discuss these populations again near the end of the course, when we recapitulate the chemical enrichment in galaxies.

The total mass in the Milky Way Galaxy is about $60 \times 10^9$ M⊙ and the luminosity is about $20 \times 10^9$ L⊙. The mass in the halo is only $10^9$ M⊙. From the motion of the clusters and stars far away from our Milky Way, we can deduce that its total mass must be much more than the one we find on the basis of the observed stellar population. This leads to the introduction of the concept of dark matter in galaxies. One assumes, without thorough argumentation, that this dark matter must mainly be situated in the dark halo of galaxies. The search for the missing dark matter is an active research topic in modern astronomy.
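
The way in which orbital motion constrains mass can be sketched as follows. For a circular orbit with velocity $v$ at radius $r$ around a spherically symmetric mass distribution, the enclosed mass is $M(<r) = v^2 r / G$. Both the spherical symmetry and the example values used below are simplifying assumptions of ours, meant only to show the order of magnitude.

```python
G_CGS = 6.674e-8        # gravitational constant [cm^3 g^-1 s^-2]
M_SUN = 1.989e33        # solar mass [g]
CM_PER_KPC = 3.0857e21  # centimetres in one kiloparsec


def enclosed_mass_msun(radius_kpc, speed_km_s):
    """Mass inside a circular orbit, M = v^2 r / G (spherical distribution assumed)."""
    v = speed_km_s * 1.0e5            # km/s -> cm/s
    r = radius_kpc * CM_PER_KPC       # kpc -> cm
    return v**2 * r / G_CGS / M_SUN


if __name__ == "__main__":
    # The Sun's orbit: 8 kpc and 200 km/s
    print(f"Mass inside the solar orbit: {enclosed_mass_msun(8.0, 200.0):.1e} Msun")
    # Hypothetical tracer at 50 kpc that still moves at 200 km/s
    print(f"Mass inside 50 kpc:          {enclosed_mass_msun(50.0, 200.0):.1e} Msun")
```

The mass inside the solar orbit comes out at several times $10^{10}$ M⊙, comparable to the mass quoted above. If objects much further out still move at a similar velocity, as assumed in the second example, the enclosed mass grows in proportion to the radius and exceeds the mass found from the observed stellar population.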


Figure 1.10: Overview of the components of our Milky Way. More explanation: see text. (From Sparke & Gallagher, 2000, Cambridge University Press)

For a detailed study of galaxies and their structure, we refer to the KU Leuven master course Galaxies

and Cosmology.

1.5 Galaxies in the Universe

Galaxies are perceived as little light clouds in the sky. They were discovered in the 1920s as “nebulae”, but as telescopes improved, Edwin Hubble concluded that these nebulae were composed of individual stars. The diameter of a galaxy typically is a few thousand light years. Each galaxy contains between a million and $10^{12}$ stars. Almost all light that we receive from the galaxies is emitted by their stars.

Galaxies also contain gas and dust clouds.


Figure 1.11: Classification scheme of galaxies. Explanation: see text. (From Sparke & Gallagher, 2000, Cambridge University Press)

Galaxies are classified according to their shape in optical light. In Figure 1.11 the famous Hubble

classification scheme is shown (adapted version). Although large galaxies emit most of the light, small dwarf galaxies are the most numerous in the Universe.

Elliptical galaxies (E) are “flat” and show little or no structure. They contain only a small amount of cool gas, so no young blue stars can be formed. Their brightest stellar population consists mainly of red giants and asymptotic giant branch stars (see further in the course for a definition). They mainly occur in large clusters of galaxies and the largest of them, the cD systems, are found in the cores of clusters. Their stars do not move in an organized way, like the rotational motion in spiral galaxies, but move randomly through the galaxy. In less bright elliptical galaxies, on the contrary, the stars follow a common rotational motion. The faintest of these galaxies are divided into two groups: dwarf ellipticals dE and dwarf spheroidals dSph. Lenticular systems have, aside from a central bulge, also a rotating disk. They are denoted S0 and form the transition between elliptical galaxies and spiral galaxies. S0s have no gas and dust, but do have a thin and fast-rotating disk like spiral galaxies.

Spiral galaxies are easily observable thanks to their radiation at blue wavelengths coming from the spiral arms, where O and B stars are located amid the gas and dust. The spiral arms are areas where efficient star formation is happening through the collapse of molecular clouds of H and H2. Approximately half of the lenticular systems have a central bar. These form the sequence of barred systems SBa, . . . , SBd, parallel to the sequence without a bar. The latter are classified from Sa to Sd depending on whether the central bulge is more or less pronounced in comparison with the fast-rotating disk. Our Milky Way is of type Sc. The Sm and SBm systems, finally, are called Magellanic systems after their prototype, the Large Magellanic Cloud (LMC). The Large Magellanic Cloud itself rotates with an average velocity of 80 km s$^{-1}$, which is three times slower than our own Milky Way.

Furthermore, there are also small blue galaxies without any structure. The smallest among those are called dwarf irregular galaxies. They differ from dwarf spheroidals because they contain young hot stars and gas. In that sense, dwarf spheroidals are dwarf irregulars which have already lost their gas. Another type are the so-called “starburst” galaxies, where stars have been formed recently after gas was expelled by supernova explosions. The gas is often funnelled to the centre of the galaxy, where subsequently a lot of young stars are born, packed within a small region (a few parsecs). In this way multiple episodes of efficient star formation occur. This class also includes the interacting or colliding galaxies, where merging can lead to new star formation areas.

It is clear that galaxies are not isolated islands but influence each other’s evolution. The Local Group, including our own galaxy, consists of some 40 galaxies centred around our Milky Way and Andromeda (the closest massive neighbour, at a distance of about a megaparsec). Like Andromeda, our Milky Way has more than ten known satellites. Aside from these, only one small elliptical system and a few other randomly moving systems occur in the Local Group. The mutual gravitational attraction within the Local Group is obviously strong enough to overcome the global expansion of the Universe. Thus we approach Andromeda with a velocity of 120 km s$^{-1}$, and the other members of the Group move with velocities that differ by less than 60 km s$^{-1}$ from the common motion of our Milky Way and Andromeda. Therefore the galaxies within the Local Group have a kinetic energy too low to escape from it.

Other clusters of galaxies in our neighbourhood are the Virgo and Coma clusters at respectively 20 and

70 Mpc. These large structures obviously form major complexes within the otherwise mainly empty volume

of the Universe. About 50 percent of all systems are found in a cluster where the density is high enough

to counteract the cosmological expansion. The Universe is indeed not static, but expands: all clusters of

galaxies move away from each other and thus away from us. This expansion started with the Big Bang.

This happened quite “recently”, i.e., only about 13.7 billion years ago, which is only three times the age of

the Sun (and Earth).

The classification of galaxies by itself is not a very tantalizing topic. The interesting aspect is the astrophysical interpretation. This situation is similar to stellar classification: it only becomes interesting

and meaningful when these classifications also turn out to have a physical meaning, like Cecilia Payne

discovered. Concerning the evolution of galaxies, we distinguish between their chemical evolution based on

the evolution of the stars and the dynamical evolution due to the motion and dynamical interactions of all

the galaxy’s constituents (and those that may have come from outside in the case of a merger event). The

chemical evolution is totally determined by the evolution of the stars that form the galaxy. This is further

discussed in Part III of this course. Studies of the dynamical evolution of galaxies are based on N -body

simulations. This topic is not treated here but we refer to the Master course Dynamics of Stellar Systems,

taught at the University of Ghent. KU Leuven students are invited to take the course in the framework of the


Master in Astronomy and Astrophysics at KU Leuven.

1.6 Starting point of this course

The main purpose of this course is to understand the evolution of the stars in the Universe, i.e., the life cycle

of stars. We will put the main focus on the evolution of single stars, but briefly touch upon the complications

arising from binary evolution as well. The more complex evolution of multiple stars, with binaries as a

particular category, follows other pathways. Indeed, tidal forces and spatial restrictions prevent each of the components from evolving as if there were no companion. This implies specific phenomena such as mass transfer

and angular momentum exchange between the components. We will highlight some of the main aspects

of binary evolution but refer to the biennial Master courses Binary Stars and High Energy Astrophysics for

thorough coverage of this topic.

The theoretical HR diagram representing stellar evolution is shown on the cover page of these lecture

notes. The abscissa contains the effective temperature of the stars and the ordinate shows the logarithm of

the stellar luminosity. The evolutionary tracks of stars of different mass are shown. These are the pathways

of stellar evolution. The main goal of this course is to understand the position of the stars and their evolution

tracks, as indicated for stellar models in the HR diagram on the cover. To reach this aim, we first study the

internal structure of stars; this is the subject of part II of this course. Once we have dealt with the basic

concepts and equations describing the stellar structure, we study the life cycle of a star, as represented by

the evolution tracks in the HR diagram. We treat the evolution of single stars of different birth masses in

Part III. The short Part IV is dedicated to basic aspects of binary evolution.

The World-Wide-Web offers many illustrations and photos of stars (and their planets), star clusters, and

galaxies in different evolutionary stages. These illustrations lose much of their quality when printed and,

moreover, we care about our environment and aim for a sustainable planet. Therefore I direct the readers to

the internet to look for illustrations of the objects in the Universe and to admire their beauty. Here are some

links:

• https://www.eso.org/public/images/

• https://www.esa.int/ESA_Multimedia/Images

• http://hubblesite.org/gallery/showcase/text.shtml

• http://antwrp.gsfc.nasa.gov/apod/

Professional astronomers are occasionally called “Archivists of the Cosmos”. Astronomy is

a specialisation that promotes open access, proper documentation, and exhaustive collection of information

and data in electronic form (from telescopes and instruments operative from laboratories on Earth as well as

from satellites in space). Two astronomical databases available on the World-Wide-Web are indispensable

sources of information for astronomers:

• https://ui.adsabs.harvard.edu/classic-form

• http://simbad.u-strasbg.fr/simbad/

The first site consists of all international peer-reviewed articles published in astronomy and includes links

to the most common journals and magazines. You can enter a search on the name of the author, name of

the star, the magazine, etc. The second is a database where you can consult and if necessary query the

known factsheet and data of each known celestial body. Anyone who wants to specialise in astronomy will

definitely use these databases frequently.

PART II : STELLAR STRUCTURE

Chapter 2

A simple equation of state: an ideal gas with

radiation

Describing the stellar structure requires the knowledge of the characteristics of stellar material. This chapter

describes the thermodynamic properties of the stellar gas. The general assumption that we make is that,

at every location in the star, the gas is in a state of thermodynamic equilibrium. With this assumption, we

do not have to take into account the detailed reactions between the particles, like atoms, electrons, ions,

photons. . . , which are the building blocks of the gas. The average characteristics of the gas can be described

in terms of local variables and the relations between them. Consequently, at a given temperature, density and

chemical composition, it will be possible to determine all other variables like pressure and internal energy.

The specification of these relations is called the determination of the equation of state (EOS) of the gas.

In this chapter we will discuss one example of an EOS which is highly relevant for stars. Other realistic

EOS will be discussed in Chapter 4. First, we will focus on some basic notions and briefly recall some basic

relations of thermodynamics which are relevant in the context of stellar structure.

2.1 Introduction to thermodynamics, applied to stars

The changes in the state of the gas of which a star is composed are important during the evolution of the

star. The basic equation that describes the evolution of the characteristics of a gas is the first law of thermo-

dynamics. We will now summarize the notions of thermodynamics, which are important to understand the

internal stellar structure.

2.1.1 Thermodynamic equilibrium

Classical thermodynamics describes the systems that have a uniform temperature and chemical composition

and are in mechanical and thermodynamic equilibrium. In general, these conditions are not met in stars.

The state of mechanical equilibrium is reached when in each point the pressure force is compensated

by the sum of all other acting forces. In astronomy, this is called hydrostatic equilibrium.

We now consider a volume consisting of radiation and matter that is adiabatically enclosed. This

means there is no possible heat exchange with the surroundings. When the mechanical equilibrium is

accompanied by a single temperature in the volume, there is a mechanical and thermal equilibrium.

In general the system consists of reacting elements with varying concentration over time due to chem-

ical reactions that occur. When the density and the temperature remain constant, the relative concentrations

of the particles remain in equilibrium. In this case, we are dealing with chemical equilibrium. When both

chemical and thermal equilibrium are reached, the system does not change anymore. This is called: thermo-

dynamic equilibrium.

Although classical thermodynamics is not strictly valid in stars, it can be used for the description of

stellar structure. The reason is that the star can be divided into a large number of layers, each of which can be

taken thin enough to satisfy the conditions of equilibrium in the sense of classical thermodynamics.

In astronomy, this state is called local thermodynamic equilibrium, or LTE. If LTE is a close

approximation, the basic laws of the classical thermodynamics are valid within each layer of the star, even

when the star is not in thermodynamic equilibrium as a whole.

2.1.2 The first law of thermodynamics

We consider the work that is connected with the volume change of a system. Assume that P is the pressure

at the surface of the system and that the surface is displaced such that the volume changes by dV. The work done by the

system against this pressure is dW = P dV. In astronomy the work per unit mass, w, is used, by

introducing the specific volume v = 1/ρ, i.e., v is the volume taken by one unit of mass. We then

get dw = P dv.

We now consider an infinitesimal thermodynamic transformation of the system, corresponding with

infinitesimal variations of the pressure, the density and the temperature. Define dq as the amount of heat

that is absorbed by the system per unit mass and dw as the work done per unit mass by the system. The first

law of thermodynamics states that the differential du ≡ dq − dw is a total differential. The first law thus

allows us to define a function u, which will be called the internal energy per unit mass of the system. By

consequence, the internal energy of the system can be altered by doing work or by heating or cooling the

system.

The first law of thermodynamics gives the relation between the added heat dq, the internal energy u

and the specific volume v = 1/ρ (each defined per unit mass):

$$ dq = du + P\,dv. \tag{2.1} $$

An adiabatic process is a process that occurs in such a way that no heat enters or leaves the system: dq = 0.

For an adiabatic process the change of the internal energy is opposite to the work done by the system: du = −dw.

When dw is negative, as in a compression, the internal energy increases, which is usually

accompanied by an increase in temperature. On the other hand dw > 0 implies a decrease of the internal

energy, accompanied by a decrease in temperature. When a process like compression or expansion occurs

quickly, it will be approximately adiabatic because heat exchange with the surroundings occurs much more slowly.

When there is no work done during the adiabatic process, the internal energy of the system does not

change. It is however possible that there are alterations in P, ρ, T .

2.1.3 The entropy

Suppose that a system runs through a series of states of thermodynamic equilibrium. This is called a quasi-

static transformation. Such a quasi-static transformation is called a reversible transformation when during

the transformation no energy is lost due to effects like friction. By consequence, a reversible transformation

can be run through in two opposite directions.

We let the system run through a reversible cycle, first in one direction, and afterwards in the opposite

direction. We then define the quantity s of the system by ds ≡ dq/T ; s is called the entropy of the system

and is defined per unit mass. According to the first law, ds is also a total differential, given by $ds = du/T + (P/T)\,dv$.

The entropy of a system is solely defined for states of thermodynamic equilibrium. Moreover, we can only

determine the variation of the entropy from the first law. We want to stress that the relation $du = T\,ds - P\,dv$ assumes that the chemical composition does not vary.

2.1.4 The specific heats

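From a mathematical point of view it is useful to introduce general specific heats

$$ c_\alpha \equiv \left(\frac{\partial q}{\partial T}\right)_\alpha, \tag{2.2} $$

i.e., $c_\alpha$ is the amount of heat per unit mass a system has to absorb to raise its temperature by one unit while the quantity $\alpha$ is kept fixed. From a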

physical point of view, two kinds of specific heats are particularly relevant:

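$$ c_P \equiv \left(\frac{dq}{dT}\right)_P = \left(\frac{\partial u}{\partial T}\right)_P + P\left(\frac{\partial v}{\partial T}\right)_P, \qquad c_v \equiv \left(\frac{dq}{dT}\right)_v = \left(\frac{\partial u}{\partial T}\right)_v. \tag{2.3} $$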

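When searching for a relation between $c_P$ and $c_v$, we consider general equations of state $\rho = \rho(P, T)$ and $u = u(\rho, T)$. In general, $\rho$ and $u$ also depend on the chemical composition, but here we assume the latter

is constant. We now define the derivatives:

$$ \alpha \equiv \left(\frac{\partial \ln\rho}{\partial \ln P}\right)_T = -\frac{P}{v}\left(\frac{\partial v}{\partial P}\right)_T, \qquad \delta \equiv -\left(\frac{\partial \ln\rho}{\partial \ln T}\right)_P = \frac{T}{v}\left(\frac{\partial v}{\partial T}\right)_P. \tag{2.4} $$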

The equation of state can then be written as

$$ \frac{d\rho}{\rho} = \alpha\,\frac{dP}{P} - \delta\,\frac{dT}{T}. \tag{2.5} $$

We now use (2.1) and

$$ du = \left(\frac{\partial u}{\partial v}\right)_T dv + \left(\frac{\partial u}{\partial T}\right)_v dT \tag{2.6} $$

to determine the change ds = dq/T of the specific entropy:

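$$ ds = \frac{dq}{T} = \frac{1}{T}\left[\left(\frac{\partial u}{\partial v}\right)_T + P\right]dv + \frac{1}{T}\left(\frac{\partial u}{\partial T}\right)_v dT. \tag{2.7} $$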

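Since ds is a total differential, $\partial^2 s/\partial T\,\partial v = \partial^2 s/\partial v\,\partial T$. Applying this to the previous equation

gives

$$ \frac{\partial}{\partial T}\left[\frac{1}{T}\left(\frac{\partial u}{\partial v}\right)_T + \frac{P}{T}\right] = \frac{1}{T}\,\frac{\partial^2 u}{\partial T\,\partial v}. \tag{2.8} $$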

Carrying out the differentiation on the left-hand side of this equation, we get

$$ \left(\frac{\partial u}{\partial v}\right)_T = T\left(\frac{\partial P}{\partial T}\right)_v - P. \tag{2.9} $$

This relation is called the reciprocity relation.

To get cP − cv we first deduce an expression for (∂u/∂T )P in which we take P and T as independent

variables. From (2.6) we get

$$ \frac{du}{dT} = \left(\frac{\partial u}{\partial T}\right)_v + \left(\frac{\partial u}{\partial v}\right)_T \frac{dv}{dT}, \tag{2.10} $$

and thus

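$$ \left(\frac{\partial u}{\partial T}\right)_P = \left(\frac{\partial u}{\partial T}\right)_v + \left(\frac{\partial u}{\partial v}\right)_T\left(\frac{\partial v}{\partial T}\right)_P = \left(\frac{\partial u}{\partial T}\right)_v + \left(\frac{\partial v}{\partial T}\right)_P\left[T\left(\frac{\partial P}{\partial T}\right)_v - P\right], \tag{2.11} $$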

hereby using (2.9). Together with the definition of the specific heats, this last result gives:

$$ c_P - c_v = T\left(\frac{\partial v}{\partial T}\right)_P\left(\frac{\partial P}{\partial T}\right)_v. \tag{2.12} $$

On the other hand we use the right hand sides in the definitions of α and δ of Eqs (2.4) to obtain:

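$$ \frac{P\delta}{T\alpha} = -\frac{\left(\partial v/\partial T\right)_P}{\left(\partial v/\partial P\right)_T} = \left(\frac{\partial P}{\partial T}\right)_v. \tag{2.13} $$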

Using $T(\partial v/\partial T)_P = v\delta = \delta/\rho$, we then get the basic relation

$$ c_P - c_v = \frac{P\delta^2}{T\rho\alpha}. \tag{2.14} $$

We establish that the difference of the specific heats can be determined entirely from the derivatives of the

equation of state.
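The statement above can be checked numerically. The following is a minimal sketch, not part of the original notes: it estimates α and δ of Eq. (2.4) by central finite differences for any function ρ(P, T) and then evaluates Eq. (2.14). The function names, the step size `eps`, the CGS value of the gas constant per unit mass and the sample values of P, T and µ are all assumptions chosen for illustration; the ideal-gas equation of state used as test case is the one introduced in Section 2.2.

```python
import math

# Assumed CGS value of the gas constant per unit mass, R = k / m_u (erg K^-1 g^-1).
R_GAS = 8.314462e7

def rho_ideal(P, T, mu=0.61):
    """Ideal-gas equation of state, solved for the density (illustrative test case)."""
    return mu * P / (R_GAS * T)

def eos_derivatives(rho_of_PT, P, T, eps=1.0e-4):
    """Return (alpha, delta) of Eq. (2.4) via central differences in ln P and ln T."""
    up, down = math.exp(eps), math.exp(-eps)
    alpha = (math.log(rho_of_PT(P * up, T)) - math.log(rho_of_PT(P * down, T))) / (2.0 * eps)
    delta = -(math.log(rho_of_PT(P, T * up)) - math.log(rho_of_PT(P, T * down))) / (2.0 * eps)
    return alpha, delta

def cp_minus_cv(rho_of_PT, P, T):
    """Difference of the specific heats from Eq. (2.14): P delta^2 / (T rho alpha)."""
    alpha, delta = eos_derivatives(rho_of_PT, P, T)
    return P * delta**2 / (T * rho_of_PT(P, T) * alpha)

# Illustrative state (assumed values, not taken from the notes).
P_example, T_example = 1.0e17, 1.0e7
print(eos_derivatives(rho_ideal, P_example, T_example))
print(cp_minus_cv(rho_ideal, P_example, T_example), R_GAS / 0.61)
```

For the ideal gas the two printed numbers in the last line should agree closely, since α = δ = 1 reduces Eq. (2.14) to R/µ. Any other ρ(P, T) can be passed in the same way.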

Now we want to rewrite the first law of thermodynamics in terms of the variation of the pressure and

the temperature. Therefore we first write:

$$ dq = du + P\,dv = \left(\frac{\partial u}{\partial T}\right)_v dT + \left[\left(\frac{\partial u}{\partial v}\right)_T + P\right]dv. \tag{2.15} $$

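Using first (2.9), and then the definition of v and (2.13), we find

$$ dq = \left(\frac{\partial u}{\partial T}\right)_v dT + T\left(\frac{\partial P}{\partial T}\right)_v dv = c_v\,dT - T\left(\frac{\partial P}{\partial T}\right)_v \frac{1}{\rho^2}\,d\rho = c_v\,dT - \frac{P\delta}{\rho\alpha}\,\frac{d\rho}{\rho}, \tag{2.16} $$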

which can be rewritten as

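$$ dq = c_v\,dT - \frac{P\delta}{\rho\alpha}\left(\alpha\,\frac{dP}{P} - \delta\,\frac{dT}{T}\right) = \left(c_v + \frac{P\delta^2}{T\rho\alpha}\right)dT - \frac{\delta}{\rho}\,dP. \tag{2.17} $$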

We then find, via (2.14)

$$ dq = c_P\,dT - \frac{\delta}{\rho}\,dP. \tag{2.18} $$

For adiabatic transformations the entropy remains constant: $ds = dq/T = 0$. We now define the

adiabatic temperature gradient $\nabla_{\rm ad}$ as follows:

$$ \nabla_{\rm ad} \equiv \left(\frac{\partial \ln T}{\partial \ln P}\right)_s, \tag{2.19} $$

in which the lower index s indicates that the definition is valid for constant entropy. From (2.18) we deduce

that $(dT/dP)_s = \delta/(\rho c_P)$. This leads to an expression for $\nabla_{\rm ad}$:

$$ \nabla_{\rm ad} = \left(\frac{P}{T}\,\frac{dT}{dP}\right)_s = \frac{P\delta}{T\rho c_P}. \tag{2.20} $$

∇ad describes the temperature variation experienced by a mass element of a system when this

element undergoes a pressure variation as a result of adiabatic expansion, i.e., an expansion without

heat exchange with the surroundings. What happens is this: mass elements that are heated deep

in the star rise because, due to their lower density, they are lighter than their surroundings. Due to this

rise the mass elements end up in higher layers where the density is lower and they therefore expand.

The expansion of the mass elements causes a decrease in temperature of the gas. ∇ad measures this

temperature change relative to the pressure change. Both the pressure and the temperature decrease outward. The value of the

decrease in pressure is given by the equation of hydrostatic equilibrium (see below) and once this value is

determined, we can compute ∇ad.

2.2 An ideal gas with radiation

2.2.1 The classical ideal gas law applied to stars

The assumption of thermodynamic equilibrium implicitly requires that the conditions in the gas do not

change appreciably over a mean free path and during the mean time between two collisions of the

gas particles. By gas particles we mean not only material particles like atoms or electrons,

but also photons. The condition of thermodynamic equilibrium is surely met in stellar interiors, where the

density is high. It is not valid anymore in the stellar atmosphere.

A considerable simplification is obtained when we take the high temperatures in the stellar interiors into

account. After all, in most stars the gas can be considered as fully ionized, in other words only consisting of

nuclei and free electrons without any internal degrees of freedom. The particles in such a gas do not interact.

Such a gas is called an ideal gas.

The well-known formulation of the ideal gas law for gas particles of one single type in a volume is:

$$ PV = NkT, \tag{2.21} $$

with P the pressure, V the volume, N the number of gas particles in the volume, T the temperature and

k Boltzmann’s constant (see Appendix B) given by R/NA with R the gas constant and NA = g/mu (by

which mu is expressed in grams) Avogadro’s number. In stellar media, we cannot easily specify the number

of particles in the volume, and therefore we choose to work with densities. We represent the number of

particles per unit volume as n = N/V. This way we can also write the ideal gas law as follows:

$$ P = n\,\frac{R}{N_A}\,T = n\,m_u R\,T, \tag{2.22} $$

where in the last expression we use the gas constant per unit mass (energy per K and per unit mass).

We now define the molecular weight µ as the particle mass expressed in mu. In our definition, µ is

dimensionless (instead of having the dimension of mass per mole as is common habit in thermodynamics).

The density of the stellar material is in fact the product of the number of particles per volume unit and the

mass of the particles, $\rho = n\,\mu\,m_u$. This way, we find $n\,m_u = \rho/\mu$ and for the ideal gas law, we get:

$$ P = \frac{R}{\mu}\,\rho T. \tag{2.23} $$

This is the usual notation of the equation of state of an ideal gas consisting of one type of particles in

astrophysics.
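As a quick consistency check of Eqs. (2.21) to (2.23), the sketch below evaluates the pressure once from the number density (P = nkT) and once from the mass density (P = RρT/µ). It is only an illustration: the CGS constants are standard values, while the density, temperature and molecular weight are assumed round numbers of the order expected in a solar-type core, not values taken from these notes.

```python
# Standard CGS constants (assumed values for this sketch).
K_B = 1.380649e-16       # Boltzmann constant, erg/K
M_U = 1.66053907e-24     # atomic mass unit, g
R_GAS = K_B / M_U        # gas constant per unit mass, erg K^-1 g^-1

def pressure_from_number_density(n, T):
    """Ideal gas law in the form P = n k T, with n in particles per cm^3."""
    return n * K_B * T

def pressure_from_mass_density(rho, T, mu):
    """Ideal gas law in the form of Eq. (2.23), P = (R / mu) rho T."""
    return R_GAS * rho * T / mu

# Hypothetical illustrative state: rho in g/cm^3, T in K, dimensionless mu.
rho_example, T_example, mu_example = 150.0, 1.5e7, 0.61
n_example = rho_example / (mu_example * M_U)   # from n m_u = rho / mu

print(pressure_from_number_density(n_example, T_example))
print(pressure_from_mass_density(rho_example, T_example, mu_example))
```

Both calls return the same pressure (in dyn/cm²) up to rounding, which is just the statement that n m_u = ρ/µ links the two forms of the law.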

2.2.2 The mean molecular weight

In the stellar interior close to the stellar core all matter is ionized. This means that there is one free electron

per hydrogen atom and for each helium atom there are two free electrons. In reality, we thus have a gas

mixture consisting of two types of particles, the ions (each composed of different components – protons

and neutrons) and the free electrons. This mixture is again an ideal gas when both components meet the

requirements of the ideal gas law.

The composition of the stars is very simple compared with the one of materials on Earth. Due to the

high pressure and temperature the stellar interior is almost entirely composed of fully ionized matter. In

such a medium it is sufficient to describe the different types of nuclei, which we will call particles from

now on. To each type of particle we will assign an index i. With Xi we denote the relative mass fraction

of the particles of type i, i.e., the fraction of one unit of mass consisting of type i particles. From this, we

get:

$$ \sum_i X_i = 1. \tag{2.24} $$

The chemical state of the gas mixture composed of fully ionized nuclei and free electrons is described

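by specifying all $X_i$ of the nuclei, which have a molecular weight $\mu_i$ and a charge $Z_i$. For $n_i$ particles per unit volume with a

partial density $\rho_i$, we have $X_i = \rho_i/\rho$ and

$$ n_i = \frac{\rho_i}{\mu_i m_u} = \frac{\rho}{m_u}\,\frac{X_i}{\mu_i}. \tag{2.25} $$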

We ignore the mass of the electrons vis-a-vis the mass of the ions (see Appendix B for the mass of both).

The total pressure P of the gas mixture is the sum of the partial pressures:

$$ P = P_e + \sum_i P_i = \left(n_e + \sum_i n_i\right)kT, \tag{2.26} $$

where Pe is the pressure of the free electrons, Pi the partial pressure due to the type i particles and supposing

that each of the components is an ideal gas. The contribution of one fully ionized atom of type i to the total

number of particles (nucleus and $Z_i$ free electrons) is $1 + Z_i$, from which

$$ n = n_e + \sum_i n_i = \sum_i (1 + Z_i)\,n_i. \tag{2.27} $$

This expression together with (2.25) and (2.26) gives the following new expression for the total pressure

$$ P = R \sum_i \frac{X_i(1 + Z_i)}{\mu_i}\,\rho T. \tag{2.28} $$

This result can be brought into the form of (2.23) when we introduce the mean molecular weight:

$$ \mu \equiv \left(\sum_i \frac{X_i(1 + Z_i)}{\mu_i}\right)^{-1}. \tag{2.29} $$

This way we can treat a gas mixture of components that are each an ideal gas as one uniform

ideal gas. We only have to replace the molecular weight µ in Eq. (2.23) by the mean molecular weight µ.

The definition of the mean molecular weight can easily be adjusted for a neutral gas where all electrons

are still bound to their nucleus. In this case we simply replace the factor 1 + Zi by 1. With this description

we can handle all situations with fully ionized matter or with neutral atoms.

The mean molecular weight depends on the chemical composition. Let us consider a chemical compo-

sition based on a fraction X of hydrogen, Y of helium and Z of heavy elements so that X + Y + Z = 1.

The fraction of heavy elements generally comes from several different elements j: $Z = \sum_j Z_j$, each having mass

number $A_j$. The average number of free electrons released per atom when these heavy elements with mass fraction

$Z_j$ are fully ionized is approximately $A_j/2$.

When all atoms are ionized, the mean molecular weight can be written as follows:

$$ \mu = \left[\frac{X(1 + 1)}{1} + \frac{Y(1 + 2)}{4} + \sum_j \frac{Z_j(1 + A_j/2)}{A_j}\right]^{-1}. \tag{2.30} $$

In practice, all terms $Z_j/A_j$ drop out, because their contribution is negligible (consider that $Z \approx 2$ to $3\%$).

We then get:

$$ \mu = \left(2X + \frac{3Y}{4} + \frac{1}{2}(1 - X - Y)\right)^{-1} = \left(\frac{3X}{2} + \frac{Y}{4} + \frac{1}{2}\right)^{-1}. \tag{2.31} $$

In the central layers of a newborn star like the Sun (X = 0.717, Y = 0.270, Z = 0.013) we then find

µ = 0.61. In the case of pure, fully ionized hydrogen we find µ = 1/2. In the case of a pure, fully ionized helium

gas we find instead µ = 4/3.

When we are dealing with a neutral gas, we get

$$ \mu = \left(\frac{X}{1} + \frac{Y}{4} + \sum_j \frac{Z_j}{A_j}\right)^{-1}, \tag{2.32} $$

or simplified:

$$ \mu = \left(X + \frac{Y}{4}\right)^{-1}, \tag{2.33} $$

where again all contributions Zj/Aj were neglected. For the outer stellar layers of the Sun we thus find

µ = 1.29.
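The expressions (2.29), (2.31) and (2.33) are easy to evaluate for any composition. The sketch below is a minimal illustration with hypothetical function names: the general routine takes a list of (mass fraction, molecular weight, charge) triples as in Eq. (2.29), and the two shortcuts implement the simplified formulas for a fully ionized and a fully neutral gas.

```python
def mean_molecular_weight(species, ionized=True):
    """Eq. (2.29): species is a list of (X_i, mu_i, Z_i) tuples.

    For a neutral gas the factor (1 + Z_i) is replaced by 1, as described in the text.
    """
    total = 0.0
    for mass_fraction, mu_i, charge in species:
        particles_per_atom = 1.0 + charge if ionized else 1.0
        total += mass_fraction * particles_per_atom / mu_i
    return 1.0 / total

def mu_fully_ionized(X, Y):
    """Simplified Eq. (2.31), heavy elements treated with A_j/2 electrons each."""
    return 1.0 / (1.5 * X + 0.25 * Y + 0.5)

def mu_neutral(X, Y):
    """Simplified Eq. (2.33), contributions Z_j / A_j neglected."""
    return 1.0 / (X + 0.25 * Y)

# Composition quoted in the text for the central layers of a newborn Sun.
X_sun, Y_sun = 0.717, 0.270
print(mu_fully_ionized(X_sun, Y_sun))
print(mu_neutral(X_sun, Y_sun))
print(mean_molecular_weight([(1.0, 1.0, 1.0)]))   # pure ionized hydrogen
```

With the newborn-Sun fractions quoted above, the fully ionized value rounds to 0.61, while the simplified neutral-gas formula (2.33) evaluates to about 1.27. The slightly larger value quoted in the text presumably reflects a somewhat different adopted composition.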

In reality, the outer layer of cool stars will not contain any ionized gas. On the other hand all the

atoms in the inner layers will be fully ionized. Somewhere in the star there is thus a critical layer where

both ionized and non-ionized atoms of a given chemical element occur. This is called a partial

ionization layer. The ionization of hydrogen, e.g., requires 13.6 eV. The first ionization of helium needs

24.6 eV, etc. We conclude from this that the first partial ionization layer of helium is located deeper inside

the star than the partial ionization layer of hydrogen. Analogously, the second partial ionization layer of

helium is located deeper inside the star than the first partial ionization layer. When the temperature is higher

than roughly 200 000 K all hydrogen and helium is fully ionized.

When the stellar material is partially ionized, we have to consider all the different ionization states

when determining µ and thus it is not possible to compute this quantity analytically as we did for the fully

ionized and fully neutral case. In general the proportion of the number of particles in the ionization state

(r + 1) to the number of particles in the ionization state r is described by the ionization law of Saha:

$$ \frac{N_{r+1}}{N_r} = \frac{1}{N_e}\,\frac{2U_{r+1}}{U_r}\left(\frac{2\pi m_e kT}{h^2}\right)^{3/2}\exp\left[-\chi_r/kT\right], \tag{2.34} $$

with Ne the electron density, me the mass of the electron, χr the energy needed to ionize a particle from

state r to state r + 1, and Ur+1 and Ur the so-called partition functions of the ionization states r + 1 and r.

These last ones are found from the Boltzmann distribution:

$$ \frac{n_{r,s}}{N_r} = \frac{g_{r,s}}{U_r}\exp\left[-\chi_{r,s}/kT\right], \tag{2.35} $$

with $n_{r,s}$ the number of particles per cm$^3$ in level $s$ of ionization state $r$, $g_{r,s}$ the statistical weight of

that level, $\chi_{r,s}$ the excitation energy of that level measured from the ground state $(r, 1)$, $N_r \equiv \sum_s n_{r,s}$

the total particle density in all levels of the ionization state $r$, and $U_r$ given by

$$ U_r \equiv \sum_s g_{r,s}\exp\left[-\chi_{r,s}/kT\right]. \tag{2.36} $$

The excitation energy χr,s is the energy difference between the excited state (r, s) and the ground state

(r, 1). The statistical weights gr,s measure the degeneracy of the levels as a consequence of magnetic fine

splitting. In absence of a magnetic field, these are equal to two (spin “up” or “down” for the proton or the

electron).
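To see how sharply the Saha equation (2.34) switches a gas from neutral to ionized, one can solve it for a pure hydrogen gas. The sketch below is illustrative only and rests on assumptions that are not in the notes: the partition functions are truncated to U = 2 for neutral hydrogen (ground state only) and U = 1 for the bare proton, the gas contains nothing but hydrogen with total number density `n_H`, and standard CGS constants are used. With ionization fraction x, N_e = x n_H and N_{r+1}/N_r = x/(1 − x), so that Eq. (2.34) becomes the quadratic x²/(1 − x) = K.

```python
import math

# Standard CGS constants (assumed values for this sketch).
K_B = 1.380649e-16        # Boltzmann constant, erg/K
H_PLANCK = 6.62607015e-27 # Planck constant, erg s
M_E = 9.1093837e-28       # electron mass, g
EV = 1.602176634e-12      # one electronvolt in erg

def saha_rhs_times_ne(T, chi_ev, u_upper, u_lower):
    """Right-hand side of Eq. (2.34) multiplied by N_e, in cm^-3."""
    thermal = (2.0 * math.pi * M_E * K_B * T / H_PLANCK**2) ** 1.5
    return 2.0 * u_upper / u_lower * thermal * math.exp(-chi_ev * EV / (K_B * T))

def hydrogen_ionization_fraction(T, n_H):
    """Ionization fraction x of pure hydrogen with number density n_H (cm^-3).

    Assumes U = 2 for neutral hydrogen and U = 1 for the proton (truncated
    partition functions) and solves x^2 / (1 - x) = K for 0 <= x <= 1.
    """
    K = saha_rhs_times_ne(T, 13.6, 1.0, 2.0) / n_H
    if K == 0.0:
        return 0.0
    # Numerically stable form of x = (-K + sqrt(K^2 + 4K)) / 2.
    return 2.0 / (1.0 + math.sqrt(1.0 + 4.0 / K))

for temperature in (5.0e3, 1.0e4, 2.0e4):
    print(temperature, hydrogen_ionization_fraction(temperature, 1.0e16))
```

The number density of 10^16 cm^-3 is a hypothetical value chosen for illustration. The fraction rises from nearly zero to nearly one over a narrow temperature range, which is why partial ionization is confined to fairly thin layers.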

We also note that the mean molecular weight changes during the evolution of a star, since the relative

fractions X, Y, Z change as a consequence of the nuclear reactions. The mean molecular weight changes

during the evolution layer by layer, as the efficiency of the nuclear reactions is extremely temperature dependent.

Because of this, the star builds up a gradient of µ in its deepest layers during its life. From now on, we

will use the simplified notation µ to indicate the mean molecular weight inside the star.

Finally, for subsequent use in the case where the electrons are responsible for the dominant pressure

rather than the ions, we wish to determine the mean molecular weight per free electron µe. For a fully

ionized gas every nucleus i delivers Zi free electrons and we get

$$ \mu_e = \left(\sum_i X_i Z_i/\mu_i\right)^{-1}. \tag{2.37} $$

Since for all elements that are heavier than helium, the approximation µi/Zi ≈ 2 is valid, we find

$$ \mu_e = \left(X + \frac{1}{2}Y + \frac{1}{2}(1 - X - Y)\right)^{-1} = \frac{2}{1 + X}. \tag{2.38} $$

This result will be used in Chapters 12 and 13.
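A short numerical illustration of Eqs. (2.37) and (2.38), with hypothetical function names: the general sum is compared with the closed form 2/(1 + X) for a hydrogen and helium mixture, for which the two expressions agree exactly.

```python
def mu_e_general(species):
    """Eq. (2.37): species is a list of (X_i, mu_i, Z_i) tuples for fully ionized nuclei."""
    return 1.0 / sum(mass_fraction * charge / mu_i for mass_fraction, mu_i, charge in species)

def mu_e_simplified(X):
    """Eq. (2.38): mean molecular weight per free electron, using mu_i / Z_i = 2 beyond hydrogen."""
    return 2.0 / (1.0 + X)

# Hydrogen (mu = 1, Z = 1) and helium (mu = 4, Z = 2) only; X is the hydrogen mass fraction.
X = 0.717
mixture = [(X, 1.0, 1.0), (1.0 - X, 4.0, 2.0)]
print(mu_e_general(mixture), mu_e_simplified(X))
```

The limits are worth remembering: pure hydrogen gives one free electron per atomic mass unit, while hydrogen-free matter gives one per two atomic mass units.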

2.2.3 The internal energy of an ideal gas

For an ideal gas (α = δ = 1), Eq. (2.14) simplifies to the well-known result cP − cv = R/µ, from which

we can deduce that cP > cv. Note that in classical thermodynamics we find cP − cv = R for an ideal gas.

That we find a factor 1/µ here is due to the fact that we work per unit of mass in astronomy rather than per

mole.

From the reciprocity relation we find that

$$ \left(\frac{\partial u}{\partial v}\right)_T = 0. \tag{2.39} $$

From this we deduce that the internal energy of an ideal gas is only a function of its temperature.

The distribution of the velocity v in an ideal gas consisting of classical particles (thus ignoring rela-

tivistic effects) is given by the Maxwell distribution function :

$$ f(v) = 4\pi v^2\left(\frac{m}{2\pi kT}\right)^{3/2}\exp\left(-\frac{mv^2}{2kT}\right), \tag{2.40} $$

with m the mass of the particle. This distribution function is defined such that f(v)dv represents the probability

that the particle has a velocity between v and v + dv. The function f is normalised such that

$$ \int_0^\infty f(v)\,dv = 1. \tag{2.41} $$

The maximum of the distribution, i.e., the most likely velocity, is given by $\sqrt{2kT/m}$. On the other hand, the

average velocity is equal to

$$ \langle v \rangle = \int_0^\infty v f(v)\,dv = \left(\frac{8kT}{\pi m}\right)^{1/2} \tag{2.42} $$

and the average quadratic velocity is given by

$$ \langle v^2 \rangle = \int_0^\infty v^2 f(v)\,dv = \frac{3kT}{m}. \tag{2.43} $$

From this equation we deduce that the average kinetic energy per particle equals 3kT/2. The average kinetic

energy density, which is the average amount of kinetic energy per unit of mass, is therefore found by dividing

3kT/2 by the average mass of a particle. This average mass is nothing else than µmu, so that we find an

average kinetic energy density equal to 3kT/2µmu. Since k/mu = R, we finally find 3RT/2µ for the

average kinetic energy density per unit mass.
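The moments (2.41) to (2.43) of the Maxwell distribution can be verified by direct numerical integration. The sketch below uses a plain composite Simpson rule written out by hand, so that it needs only the standard library. The particle mass (one atomic mass unit), the temperature and the upper integration limit of twelve most-probable speeds are assumptions made for the illustration.

```python
import math

K_B = 1.380649e-16     # Boltzmann constant, erg/K (standard CGS value)
M_U = 1.66053907e-24   # atomic mass unit, g (standard CGS value)

def maxwell_pdf(v, m, T):
    """Maxwell speed distribution of Eq. (2.40)."""
    return (4.0 * math.pi * v * v * (m / (2.0 * math.pi * K_B * T)) ** 1.5
            * math.exp(-m * v * v / (2.0 * K_B * T)))

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) sub-intervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0

def maxwell_moments(m, T):
    """Return (normalisation, <v>, <v^2>) by integrating up to 12 most-probable speeds."""
    v_max = 12.0 * math.sqrt(2.0 * K_B * T / m)   # the integrand is negligible beyond this
    norm = simpson(lambda v: maxwell_pdf(v, m, T), 0.0, v_max)
    mean = simpson(lambda v: v * maxwell_pdf(v, m, T), 0.0, v_max)
    mean_sq = simpson(lambda v: v * v * maxwell_pdf(v, m, T), 0.0, v_max)
    return norm, mean, mean_sq

norm, mean, mean_sq = maxwell_moments(M_U, 1.0e7)
print(norm)
print(mean, math.sqrt(8.0 * K_B * 1.0e7 / (math.pi * M_U)))
print(0.5 * M_U * mean_sq, 1.5 * K_B * 1.0e7)   # mean kinetic energy per particle vs 3kT/2
```

The last line compares the numerically integrated mean kinetic energy per particle with 3kT/2, which is the step used above to arrive at 3RT/2µ per unit mass.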

The internal energy of the ideal gas is in general given by the sum of the kinetic energy due to thermal

motion and the ionization energy. A fully ionized gas or an entirely neutral gas have no ionization energy.

In this case, we thus find for the internal energy of the gas that

$$ u = \frac{3RT}{2\mu}. \tag{2.44} $$

The average internal energy per unit mass is equal to 3P/2ρ in the limit of a classical ideal gas consisting

of only one type of particles.

From the expression of u we immediately find

$$ c_v = \left(\frac{\partial u}{\partial T}\right)_v = \frac{3}{2}\,\frac{R}{\mu}. \tag{2.45} $$

Consequently $c_P - c_v = R/\mu$ then gives

$$ c_P = \frac{5}{2}\,\frac{R}{\mu}, \tag{2.46} $$

from which we can deduce that

$$ \gamma \equiv \frac{c_P}{c_v} = \frac{5}{3}. \tag{2.47} $$

We then find $\nabla_{\rm ad} = 2/5$ for an ideal gas that is composed entirely of either fully ionized matter

or neutral atoms. This means that the temperature variation of an ideal gas in adiabatic

compression or expansion follows $T \propto P^{2/5}$.
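The value 2/5 follows directly from Eq. (2.20) once the ideal-gas expressions for ρ, δ and c_P are inserted. The sketch below evaluates Eq. (2.20) for an arbitrary state and applies the resulting power law between two pressures. Function names, the CGS gas constant and the sample numbers are assumptions for illustration only.

```python
# Assumed CGS value of the gas constant per unit mass, R = k / m_u (erg K^-1 g^-1).
R_GAS = 8.314462e7

def nabla_ad(P, T, rho, delta, c_P):
    """Adiabatic temperature gradient from Eq. (2.20): P delta / (T rho c_P)."""
    return P * delta / (T * rho * c_P)

def nabla_ad_ideal_gas(P, T, mu):
    """Eq. (2.20) for a fully ionized or fully neutral ideal gas (delta = 1, c_P = 5R / 2mu)."""
    rho = mu * P / (R_GAS * T)
    c_P = 2.5 * R_GAS / mu
    return nabla_ad(P, T, rho, 1.0, c_P)

def adiabatic_temperature(T1, P1, P2, nabla=0.4):
    """Temperature after an adiabatic change from P1 to P2, assuming constant nabla_ad."""
    return T1 * (P2 / P1) ** nabla

print(nabla_ad_ideal_gas(1.0e17, 1.0e7, 0.61))
print(adiabatic_temperature(1.0e6, 1.0, 32.0))   # compression by a factor 32 in pressure
```

The result of the first call does not depend on the chosen P, T or µ, as expected from the derivation. The second call shows how weakly the temperature responds: a 32-fold increase in pressure raises T by only a factor 32^(2/5).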

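For an adiabatic change of an ideal gas we can link the pressure, volume and density variations as follows:

$$ \frac{dP}{P} = -\frac{c_P}{c_v}\,\frac{dv}{v} = -\gamma\,\frac{dv}{v} = \gamma\,\frac{d\rho}{\rho}, \tag{2.48} $$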

which can be rewritten as

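$$ \left(\frac{\partial \ln P}{\partial \ln\rho}\right)_s = \gamma\,; \qquad \left(\frac{\partial \ln P}{\partial \ln T}\right)_s = \frac{\gamma}{\gamma - 1}\,; \qquad \left(\frac{\partial \ln T}{\partial \ln\rho}\right)_s = \gamma - 1. \tag{2.49} $$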

These expressions are only valid when the motion of the gas particles is the only contribution to the internal

energy, like in the case of a fully ionized or entirely neutral ideal gas. The expressions are not valid in more

general conditions. Nevertheless it is useful to define the adiabatic variations through similar equations for

such more general conditions. Therefore the following adiabatic exponents are used:

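$$ \Gamma_1 \equiv \left(\frac{d\ln P}{d\ln\rho}\right)_s, \qquad \frac{\Gamma_2}{\Gamma_2 - 1} \equiv \left(\frac{d\ln P}{d\ln T}\right)_s, \qquad \Gamma_3 \equiv \left(\frac{d\ln T}{d\ln\rho}\right)_s + 1, \tag{2.50} $$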

which satisfy

$$ \frac{\Gamma_1}{\Gamma_3 - 1} = \frac{\Gamma_2}{\Gamma_2 - 1}. \tag{2.51} $$

These definitions are not based on any hypothesis regarding the equation of state. For a fully ionized ideal

gas we recover Γ1 = Γ2 = Γ3 = 5/3.

We finally define the isothermal speed of sound a by

$$ a^2 \equiv \frac{R}{\mu}\,T. \tag{2.52} $$

In the case of an isothermal ideal gas we can thus also formulate the ideal gas law as follows:

$$ P = a^2\rho, \tag{2.53} $$

with a constant. We will use this formulation in the description of the star formation process (see Part III of

the lecture notes).

2.2.4 The contribution of the photon gas

Due to the high temperatures in stellar interiors the photons considerably contribute to the pressure and

the internal energy of the gas. The pressure in the star is therefore not completely determined by the gas

pressure but there is also a component coming from the pressure due to the photon gas. This radiation

pressure accounts for a considerable fraction of the total pressure in the stellar core of all stars and also in

the photosphere of hot massive stars.

The radiation field is very well approximated by that of a black body. The energy

density of a black body is described by the radiation law of Planck (also see Appendix A):

$$ u_\nu(T) = \frac{2h\nu^3}{c^2}\left(\exp(h\nu/kT) - 1\right)^{-1}. \tag{2.54} $$

Since the photons carry along momentum, there is a pressure connected with the radiation. This radiation

pressure is given by $P_{\rm rad} = aT^4/3$ with $a$ the radiation constant (see Appendix A). The energy density

per unit mass corresponding to this radiation pressure is $u = aT^4/\rho = 3P_{\rm rad}/\rho$. We can conclude that the

energy density per unit mass is $3P/(2\rho)$ for a non-relativistic ideal gas and $3P/\rho$ for the photon gas.

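According to the first law (2.1) we see that, for an adiabatic variation of a photon gas,

$$ 0 = dq = du + P\,dv = du + P\,d\!\left(\frac{1}{\rho}\right) = 3\,d\!\left(\frac{P}{\rho}\right) + P\,d\!\left(\frac{1}{\rho}\right) = 4P\,d\!\left(\frac{1}{\rho}\right) + \frac{3}{\rho}\,dP = -\frac{4P}{\rho^2}\,d\rho + \frac{3}{\rho}\,dP. \tag{2.55} $$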

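From this we find $dP/P = (4/3)\,d\rho/\rho$, so that $\Gamma_1 = 4/3$. On the other hand we see

$$ 0 = dq = d\!\left(\frac{aT^4}{\rho}\right) + \frac{1}{3}aT^4\,d\!\left(\frac{1}{\rho}\right) = -\frac{4aT^4}{3\rho^2}\,d\rho + \frac{4aT^3}{\rho}\,dT, \tag{2.56} $$

from which we find $dT/T = (1/3)\,d\rho/\rho$, so that $\Gamma_3 = 4/3$. From (2.51) we then also find $\Gamma_2 = 4/3$.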

When the system consists of a mixture of particles that behave like an ideal gas and of radiation, the

total pressure is given by

$$ P = P_{\rm gas} + P_{\rm rad} = \frac{R}{\mu}\,\rho T + \frac{a}{3}\,T^4. \tag{2.57} $$

Often a measure for the contribution of the radiation pressure is defined by introducing β ≡ Pgas/P ,

which is equivalent to 1 − β = Prad/P. For β = 0 the gas pressure is zero and for β = 1 the radiation

pressure is zero. Fixing a value for β is thus the same as establishing a mutual link between the gas and

radiation pressure. Obviously, β changes when we move from the stellar interior to the stellar surface. For

stars with M ≥ 10 M⊙, β ≠ 1 in the whole star (i.e., the radiation pressure is non-negligible everywhere), even in the area near the stellar surface. For very massive

stars Pgas is even negligible compared to Prad. On the other hand, Prad is negligible near the stellar surface

for stars like the Sun or cooler.
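Equation (2.57) and the definition of β translate into a few lines of code. The sketch below is illustrative only: the CGS values of the gas constant per unit mass and of the radiation constant are standard, whereas the densities, temperatures and mean molecular weight passed to the functions are assumed round numbers, not stellar-model output.

```python
# Standard CGS constants (assumed values for this sketch).
R_GAS = 8.314462e7     # gas constant per unit mass, erg K^-1 g^-1
A_RAD = 7.565733e-15   # radiation constant a, erg cm^-3 K^-4

def gas_and_radiation_pressure(rho, T, mu):
    """Return (P_gas, P_rad) of Eq. (2.57) in dyn/cm^2."""
    p_gas = R_GAS * rho * T / mu
    p_rad = A_RAD * T**4 / 3.0
    return p_gas, p_rad

def beta(rho, T, mu):
    """beta = P_gas / P, so that 1 - beta = P_rad / P."""
    p_gas, p_rad = gas_and_radiation_pressure(rho, T, mu)
    return p_gas / (p_gas + p_rad)

# Hypothetical illustrative states: a dense, moderately hot gas and a hotter, more dilute one.
print(beta(150.0, 1.5e7, 0.61))
print(beta(5.0, 4.0e7, 0.61))
```

Because P_rad grows as T^4 while P_gas grows only linearly with T and ρ, β drops quickly toward hotter and less dense conditions, which is the trend described above for massive stars.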

Chapter 3

Classical mechanics applied to stellar

structure

In this chapter, we discuss the equations of classical mechanics relevant for the study of stellar structure.

When we derive and solve these equations, we will use some of the thermodynamic relations that were

discussed in the previous chapter. We first consider a few key simplifications to develop the theory and will

come back to these in Chapter 7.

3.1 Some preliminaries

In this chapter, we will be approximating a star as a non-rotating non-magnetic gaseous sphere. Obviously,

the theory will only be valid for single stars that are rotating “slowly” and have only a “weak” magnetic

field. In that case, the forces acting upon a fluid element in the star are the pressure force and gravity,

while the Coriolis, centrifugal, Lorentz and any tidal forces can be ignored. These assumptions imply a

tremendous simplification: all quantities are in this case constant in concentric spheres and one spatial

coordinate suffices to describe them. How good an approximation is this?

The aspect of multiplicity and tidal forces and interactions will be treated in the final chapter of these

lecture notes. We do not discuss it here other than remarking that, the higher the birth mass of a star, the

more likely it is to reside in a binary or multiple system. On average, half of the stars occur in binaries so

this cannot be ignored. But in order to understand how binary or multiple stars live their life, one must first

understand how single stars evolve and this is the major topic of this course.

The assumption that the Lorentz force can be ignored is very reasonable for the majority of stars when

studying stellar evolution. The magnetic field of the Sun (and similar stars) causes spectacular phenomena

at the solar surface, like solar flares and coronal mass ejections. However, in the overall life cycle of the

Sun, these effects do not play a major role, because they are local phenomena limited to the solar outer

layers and the corona, where the density of matter is immensely low. Magnetic effects do play a role for

the circumstellar environment, such as the planetary systems consisting of planets, moons, asteroids, comets

and other rocky material orbiting their host star. However, stellar evolution is dominantly determined by

internal physical processes taking place deep inside the star, in and near its core, where the pressure gradient

and gravity are the dominant acting forces.

We make a distinction in terms of birth mass when it comes to the importance of the interplay between

magnetic effects and rotation during most of the lifetime of a star. As we will discuss in Part III of the course,

star formation will be accompanied by the occurrence of extensive convective regions (see Chapter 5) in a

rotating sphere (see Chapter 9). In such circumstances, a magnetic dynamo gets created. Stars born with

a mass M ≲ 1.3 M⊙ will keep a convective outer envelope while they are burning hydrogen in their core,

which constitutes by far the longest phase of their evolution. All these stars tend to be slow rotators during

almost their entire life. This is due to an efficient slow-down of their rotation caused by magnetic braking.

This phenomenon is induced by the fossil magnetic field originating in their convective envelope during the

process of star formation. This dynamo along with angular momentum loss via a thin stellar wind, which

in the case of the Sun causes a mass loss of some 10^{-14} M⊙ per year, is very effective in slowing down the

rotation. As we will discuss further in these notes, stars born with a mass M ≳ 1.3 M⊙ will have a radiative

outer envelope at birth. They do not sustain a magnetic dynamo in their envelope at birth and their rotation

is not slowed down by magnetic braking. Either way, the magnetic fields, if any, lead to a Lorentz force that

is far less important than the pressure and gravity forces in the stellar interior, aside from a few exceptional

stars with a very strong magnetic field. When we encounter an evolutionary stage in which the Lorentz force

does play an important role, such as during the star formation process and with the formation and evolution

of neutron stars, we will explicitly state this further on in the lecture notes, but for the bulk of the course it

is fine to ignore it.

The above makes it clear that the assumption of slow rotation is harder to justify for stars born with

M ≳ 1.3 M⊙. Some of these stars even rotate at a considerable fraction of their so-called critical rotation

velocity (see Chapter 7). For such stars the effects of the Coriolis and centrifugal forces can be substantial.

As long as the rotational velocity remains below, say, 50% of the critical velocity, the star is not seriously

flattened at its polar regions by rotation. So, in the first instance, we will ignore any rotational effects. We do this

because it brings about a huge mathematical simplification. Indeed, when we ignore the centrifugal force,

the star does not deviate from a sphere. Another major reason for not considering rotation in the description

of the stellar structure is that we only have very limited knowledge about the internal rotation laws in stars.

Since we neither have a good star formation theory that explains how the stellar interior rotates at birth nor

a good theory of angular momentum transport in stellar interiors, we start off by ignoring rotation in the

theory. We come back to this issue in Chapter 7.


Figure 3.1: We use the mass within the sphere with radius r as an independent variable for the description

of the equations determining the stellar structure. (From Kippenhahn et al. 2012)

3.2 Coordinates

3.2.1 Eulerian description

In the approximation of a non-rotating, non-magnetic, spherically symmetric star, all functions are well described

by one spatial coordinate. The distance r, measured from the stellar centre to the fluid element, is a

natural choice for this spatial coordinate. The distance r can vary from r = 0 to r = R, where R is the

stellar radius.

To describe the evolution of the quantities in time, we introduce the time coordinate t. If we use the two

independent variables r and t to compute the stellar structure, we use the so-called Eulerian description. All

other quantities are then determined as a function of r and t. Examples are the density ρ = ρ(r, t), pressure

P = P (r, t), temperature T (r, t), etc.

We now want to describe the effect of the mass distribution in the star to compute the gravitational

force. In order to do so, we define the function m(r, t) as the mass in a sphere with radius r at the time t; m varies with r and t according to

\[ dm = 4\pi r^2 \rho\, dr - 4\pi r^2 \rho v\, dt. \tag{3.1} \]

The first term on the right-hand side of this equation is the mass in a spherical shell with thickness dr (see

Figure 3.1). This term expresses the variation of m(r, t) as the result of a variation of r at constant t:

\[ \frac{\partial m}{\partial r} = 4\pi r^2 \rho. \tag{3.2} \]

Equation (3.2) is the first of the basic equations that determine the stellar structure in the Eulerian description.

The second term on the right-hand side of Eq. (3.1) represents the spherically symmetric mass flow


through the sphere of constant radius r, as a result of an outward velocity v during the time span dt:

\[ \frac{\partial m}{\partial t} = -4\pi r^2 \rho v. \tag{3.3} \]

Taking the derivative of Eq. (3.2) with respect to t and the one of Eq. (3.3) to r, and equating both these

derivatives, we get the well-known continuity equation for spherical symmetry:

\[ \frac{\partial \rho}{\partial t} = -\frac{1}{r^2}\frac{\partial (\rho r^2 v)}{\partial r}. \tag{3.4} \]
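The intermediate step behind Eq. (3.4) can be written out as follows. This is only a sketch of the algebra, assuming that m(r, t) is smooth enough for the mixed partial derivatives to commute:

```latex
\begin{aligned}
\frac{\partial}{\partial t}\left(\frac{\partial m}{\partial r}\right)
  &= 4\pi r^2\,\frac{\partial \rho}{\partial t}
  && \text{(from Eq. 3.2, with } r \text{ held fixed)}, \\
\frac{\partial}{\partial r}\left(\frac{\partial m}{\partial t}\right)
  &= -4\pi\,\frac{\partial (\rho r^2 v)}{\partial r}
  && \text{(from Eq. 3.3)}, \\
\Rightarrow\quad
\frac{\partial \rho}{\partial t}
  &= -\frac{1}{r^2}\,\frac{\partial (\rho r^2 v)}{\partial r}.
\end{aligned}
```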

3.2.2 Lagrangian description

As will become clear later on it is often more efficient to work with a Lagrangian coordinate instead of

the Eulerian coordinate r. This is a spatial coordinate connected to a fluid element that does not change in

the course of time. In the Lagrangian description, we characterize a fluid element by m, which is the mass

contained in a concentric sphere at a given time t0.

The new independent variables then are m and t and all other quantities are written in terms of these

variables. An example is again the density ρ = ρ(m, t), and now also the distance r of the fluid element to

the stellar centre: r = r(m, t). In the stellar centre we have m = 0 and at the surface m = M , the total

mass of the star. This example already shows a great advantage of the Lagrangian description: as opposed

to the large variation in the radius R during a star’s lifetime, the independent variable m varies, to a good

approximation, over the constant interval [0,M ] for more than 90% of the star’s lifetime.

There is an unambiguous connection between the coordinates r and m. For the partial derivatives to

both variables, the following formulae apply:

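\[ \frac{\partial}{\partial m} = \frac{\partial r}{\partial m}\,\frac{\partial}{\partial r}, \qquad \left(\frac{\partial}{\partial t}\right)_m = \left(\frac{\partial r}{\partial t}\right)_m \frac{\partial}{\partial r} + \left(\frac{\partial}{\partial t}\right)_r. \tag{3.5} \]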

If we apply the first of these derivatives to m, we get

\[ 1 = \frac{\partial m}{\partial r}\,\frac{\partial r}{\partial m}, \]

which gives the following equation if we fill in the relation (3.2):

\[ \frac{\partial r}{\partial m} = \frac{1}{4\pi r^2 \rho}. \tag{3.6} \]

This differential equation describes the spatial behaviour of the function r(m, t). It replaces equation

(3.2) and is the first basic stellar structure equation in the Lagrangian description. By substituting this


equation in the upper relation of (3.5), we find the following relation between the two operators:

\[ \frac{\partial}{\partial m} = \frac{1}{4\pi r^2 \rho}\,\frac{\partial}{\partial r}. \tag{3.7} \]

The second equation of (3.5) is the main reason to use the Lagrangian description. The time derivative

on the left-hand side describes the change of a function of time when following a given fluid element. The

laws of conservation for time-dependent spherical stars are just simple expressions for this time derivative.

If we worked in terms of the local time derivative (∂/∂t)_r, terms with the velocity (∂r/∂t)_m would

appear explicitly, which is not the case in the Lagrangian formalism.
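As a small numerical illustration of Eq. (3.6), the sketch below uses a toy star of uniform density, for which r(m) is known in closed form, and compares a finite-difference estimate of ∂r/∂m with 1/(4πr²ρ). The uniform density, the numerical values and the function names are assumptions made for this example only.

```python
import math

def radius_of_mass(m, rho):
    """Radius enclosing mass m in a sphere of uniform density rho (toy model)."""
    return (3.0 * m / (4.0 * math.pi * rho)) ** (1.0 / 3.0)

def dr_dm_structure(m, rho):
    """Right-hand side of Eq. (3.6): 1 / (4 pi r^2 rho)."""
    r = radius_of_mass(m, rho)
    return 1.0 / (4.0 * math.pi * r**2 * rho)

def dr_dm_numeric(m, rho, rel_step=1e-6):
    """Central finite difference of r(m) with respect to the mass coordinate."""
    h = rel_step * m
    return (radius_of_mass(m + h, rho) - radius_of_mass(m - h, rho)) / (2.0 * h)

rho = 1.0e3   # kg/m^3, arbitrary illustrative density
m = 1.0e29    # kg, arbitrary mass coordinate
print(dr_dm_numeric(m, rho), dr_dm_structure(m, rho))
```

The two printed numbers should agree closely, which is just Eq. (3.6) evaluated for this particular density law.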

3.3 Poisson’s equation

In a spherically symmetric body, the modulus of the gravitational acceleration $\vec{g}$ at a distance r from the centre

does not depend on the mass located at a larger distance than r from the centre, i.e., $g = |\vec{g}|$ depends only

on r and on the mass that is located in the concentric sphere with radius r, which we have defined

as m:

\[ g = \frac{Gm}{r^2}, \tag{3.8} \]

with G = 6.673 × 10^{-11} m^3 kg^{-1} s^{-2} the gravitational constant in SI units.
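To get a feeling for the numbers in Eq. (3.8), the following sketch evaluates g at the solar surface, where m = M. The solar mass and radius used here are assumed round values that are not quoted in this section.

```python
G = 6.673e-11      # m^3 kg^-1 s^-2, value quoted in the text
M_SUN = 1.989e30   # kg, assumed solar mass
R_SUN = 6.957e8    # m, assumed solar radius

def gravitational_acceleration(m, r):
    """Eq. (3.8): g = G m / r^2, with m the mass inside radius r."""
    return G * m / r**2

g_surface = gravitational_acceleration(M_SUN, R_SUN)
print(f"g at the solar surface ~ {g_surface:.0f} m/s^2")
```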

In general the gravitational field in a star can be described by a gravitational potential Φ, which is a

solution of Poisson’s equation:

\[ \nabla^2 \Phi = 4\pi G \rho, \tag{3.9} \]

in which $\nabla^2$ stands for the Laplace operator. For spherically symmetric configurations, Poisson’s equation

simplifies to:

\[ \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2 \frac{\partial \Phi}{\partial r}\right) = 4\pi G \rho. \tag{3.10} \]

The gravitational acceleration vector $\vec{g}$ points towards the stellar centre and is written in spherical

coordinates as $\vec{g} = (-g, 0, 0)$ with $g = |\vec{g}| > 0$. The vector $\vec{g} = -g\,\vec{e}_r$ is derived from the potential Φ via

the equation $\vec{g} = -\vec{\nabla}\Phi$. For a spherically symmetric star, only the partial derivative with respect to r differs

from zero and we get:

\[ g = \frac{\partial \Phi}{\partial r}. \tag{3.11} \]

Using the expressions (3.11) and (3.8) we get

\[ \frac{\partial \Phi}{\partial r} = \frac{Gm}{r^2}. \tag{3.12} \]

Integration of the expression (3.12) gives

\[ \Phi = \int_0^r \frac{Gm}{r^2}\, dr + \text{constant}. \tag{3.13} \]


Figure 3.2: Representation of a state of hydrostatic equilibrium: the outward pointing pressure force has to

compensate the inward pointing gravitational force. This can only be the case when the force at the inner

boundary of the shell is larger than the one at the outer boundary. (From Kippenhahn et al. 2012)

The integration constant is chosen in such a way that Φ vanishes for r → ∞. Moreover, Φ is minimal at

the stellar centre.

3.4 Conservation of momentum

3.4.1 Hydrostatic equilibrium

We cannot observe structural changes for most of the stars in real time. This implies that the stellar material

is not accelerated noticeably, implying that all forces acting on fluid elements must compensate each other.

This mechanical equilibrium is called hydrostatic equilibrium. Suppose that we are dealing with a gaseous,

non-rotating star which has no magnetic field or close companion. In such a case, the only acting forces are the

gravitational force and the pressure force.

Let us consider, for a given time t, a thin spherical mass shell with infinitesimal thickness dr at a

distance r from the stellar centre. Per unit area, the mass of the shell is ρ dr and the weight of the shell

is −gρ dr, which represents the gravitational force that points towards the stellar centre. To avoid that

the fluid elements of the shell are accelerated into the direction of the centre, they have to experience a net

pressure force that is exactly equal to the gravitational force, but pointed outwardly. This implies that there

is a higher pressure at the inside of the shell (Pi) than at the outside of the shell (Pe). We refer to Figure 3.2.

The total force per surface unit on the shell as a consequence of these different pressures is:

\[ P_i - P_e = -\frac{\partial P}{\partial r}\, dr. \tag{3.14} \]


The sum of the forces as a consequence of gravity and pressure has to be zero, so

\[ \frac{\partial P}{\partial r} + \rho g = 0. \tag{3.15} \]

This equation is rewritten by using Eq. (3.8) to find the equation of hydrostatic equilibrium:

\[ \frac{\partial P}{\partial r} = -\frac{Gm}{r^2}\,\rho. \tag{3.16} \]

It is the second basic equation describing the stellar structure in Eulerian form.

If we choose m to be the independent variable, then we get the Lagrangian form of the hydrostatic

equilibrium by multiplying Eq. (3.16) with ∂r/∂m = (4πr^2 ρ)^{-1} following Eq. (3.6), while using the first

relation of (3.5):

\[ \frac{\partial P}{\partial m} = -\frac{Gm}{4\pi r^4}. \tag{3.17} \]

3.4.2 Simple solutions

Until now, we have only concentrated on the mechanical problem that is linked to the gravitational field and

the pressure stratification in the star. As such, we deduced two basic equations, taking the following form in

the Lagrangian formalism:

\[ \frac{\partial r}{\partial m} = \frac{1}{4\pi r^2 \rho}, \qquad \frac{\partial P}{\partial m} = -\frac{Gm}{4\pi r^4}. \tag{3.18} \]

We will now search for provisional solutions for these differential equations.

These two equations contain the three unknown functions r, P and ρ, so a solution requires an additional relation between

at least two of these three quantities. In some special situations we can write the density ρ as a function of

r and P or of m and P. In that case we are dealing with ordinary differential equations because time does

not play an explicit role. An example of this is a homogeneous sphere with ρ = constant. A more realistic

physical example is given by the so-called barotropic solutions where ρ = ρ(P), for example an ideal gas

at a constant temperature. A class of simple barotropic solutions which are important for studying stellar

structure are the polytropes. We will go more deeply into this special class of equations of state later on.
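The homogeneous sphere mentioned above can be worked through numerically. The sketch below integrates the Eulerian form of hydrostatic equilibrium, Eq. (3.16), inward from P(R) = 0 with constant ρ and compares the central pressure with the closed form 3GM²/(8πR⁴) that follows for this special case. The solar mass and radius are assumed values, and a real star is far from homogeneous, so the result is only an order-of-magnitude illustration.

```python
import math

G = 6.673e-11  # m^3 kg^-1 s^-2, value quoted in the text

def central_pressure_uniform(M, R, n_shells=1000):
    """Integrate dP/dr = -G m rho / r^2 inward from P(R) = 0 for constant rho."""
    rho = M / (4.0 / 3.0 * math.pi * R**3)
    dr = R / n_shells
    P = 0.0
    for i in range(n_shells):
        r_mid = R - (i + 0.5) * dr                      # midpoint of the shell
        m_mid = 4.0 / 3.0 * math.pi * r_mid**3 * rho    # mass inside r_mid
        P += G * m_mid * rho / r_mid**2 * dr            # pressure rises inward
    return P

def central_pressure_analytic(M, R):
    """Closed form for the homogeneous sphere: 3 G M^2 / (8 pi R^4)."""
    return 3.0 * G * M**2 / (8.0 * math.pi * R**4)

M_SUN, R_SUN = 1.989e30, 6.957e8  # kg and m, assumed solar values
p_numeric = central_pressure_uniform(M_SUN, R_SUN)
p_analytic = central_pressure_analytic(M_SUN, R_SUN)
print(f"P_c (numeric)  = {p_numeric:.3e} Pa")
print(f"P_c (analytic) = {p_analytic:.3e} Pa")
```

Because the integrand is linear in r for constant density, the midpoint rule used here reproduces the analytic value up to rounding.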

In general, though, the density is not only a function of the pressure, but it also depends on the temperature:

ρ = ρ(P, T). A well-known example is that of an ideal gas. If we are dealing with an equation

of state in which the temperature plays a role, it becomes much more difficult to determine the internal

structure of a self-gravitating gas sphere. The mechanical structure then depends on the temperature

stratification, which in turn is coupled to the production and the transport of energy in the star. To describe

this situation, we need to add more equations to solve the stellar structure.


3.4.3 The equation of motion in case of spherical symmetry

The equation of hydrostatic equilibrium (3.16) is a special case of conservation of momentum. When accel-

erated motions occur in the spherically symmetric star, the inertia of the fluid elements must be taken into

account. Below, we limit ourselves to the Lagrangian description.

Consider again a thin shell with mass dm at a distance r from the stellar centre. This shell experiences

a force per unit area f_P due to the pressure gradient given by (3.14). This equation can be rewritten as

\[ f_P = -\frac{\partial P}{\partial m}\, dm. \tag{3.19} \]

The gravitational force per unit area acting on the shell is given by

\[ f_g = -\frac{g\, dm}{4\pi r^2} = -\frac{Gm}{r^2}\,\frac{dm}{4\pi r^2}, \tag{3.20} \]

in which we made use of (3.8). If the sum of the pressure force and the gravitational force is non-zero, the

shell will be accelerated according to

\[ \frac{dm}{4\pi r^2}\,\frac{\partial^2 r}{\partial t^2} = f_P + f_g. \tag{3.21} \]

From this and using (3.19) and (3.20), we derive the equation of motion:

\[ \frac{1}{4\pi r^2}\,\frac{\partial^2 r}{\partial t^2} = -\frac{\partial P}{\partial m} - \frac{Gm}{4\pi r^4}. \tag{3.22} \]

If only the pressure gradient were active, there would be an outward acceleration (since ∂P/∂m < 0); with only

the gravitational force at play, there would be an inward acceleration. The equation of motion is reduced to

the equation of hydrostatic equilibrium when all fluid elements are at rest or move in the radial direction at

a constant velocity. When the two terms on the right-hand side of the equation of motion compensate each

other, the assumption of hydrostatic equilibrium is a good approximation, and the star will evolve through

quasi-equilibrium states.

Let us assume in a thought experiment that there is a deviation from hydrostatic equilibrium because

of a sudden disappearance of the pressure force. The inertial term on the left-hand side of the equation of motion

would then need to compensate for the gravitational term on the right-hand side. We define a characteristic

time scale τ_ff, connected to the implosion of the star due to the sudden disappearance of the pressure force:

\[ \left|\frac{\partial^2 r}{\partial t^2}\right| \equiv \frac{R}{\tau_{\rm ff}^2}, \tag{3.23} \]

in which R is the radius of the star. Using the equation of motion (3.22), we can write τff as:

\[ \tau_{\rm ff} \approx \left(\frac{R}{g}\right)^{1/2}. \tag{3.24} \]

This shows that τff is a mean value for the free-fall time over a distance of the order of the stellar radius

caused by the sudden disappearance of the pressure force.


In a similar way, we can define the characteristic timescale τexpl for the expansion of the star caused by

the sudden disappearance of the gravitational force:

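\[ \left|\frac{\partial^2 r}{\partial t^2}\right| = \frac{R}{\tau_{\rm expl}^2} = 4\pi r^2 \left|\frac{\partial P}{\partial m}\right| = \frac{1}{\rho}\left|\frac{\partial P}{\partial r}\right| \approx \frac{P}{\rho R}, \tag{3.25} \]

in which we have replaced |∂P/∂r| by P/R. This yields

\[ \tau_{\rm expl} \approx R \left(\frac{\rho}{P}\right)^{1/2}. \tag{3.26} \]

Because $\sqrt{P/\rho}$ is a measure for the average speed of sound in the stellar interior, we can interpret τ_expl as

the mean time it takes a sound wave to travel from the stellar centre to the stellar surface.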

When a star is in a state of hydrostatic equilibrium, the two terms on the right-hand side of Eq. (3.22) are approximately

equal in magnitude, yielding τ_ff ≈ τ_expl. We define the hydrostatic timescale τ_hydro as the time needed to restore

hydrostatic equilibrium in the star after a small perturbation. Using g ≈ GM/R^2, Eq. (3.24) yields

\[ \tau_{\rm hydro} \approx \left(\frac{R^3}{GM}\right)^{1/2} \approx \frac{1}{2}\,(G\bar{\rho})^{-1/2}, \tag{3.27} \]

with $\bar{\rho}$ the mean density of the star.
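A quick evaluation of Eq. (3.27) for the Sun shows how short the hydrostatic timescale is. The sketch below computes both forms of the estimate; the solar mass and radius are assumed values not quoted in this section, and the mean density is taken as M/(4πR³/3).

```python
import math

G = 6.673e-11      # m^3 kg^-1 s^-2, value quoted in the text
M_SUN = 1.989e30   # kg, assumed solar mass
R_SUN = 6.957e8    # m, assumed solar radius

def tau_hydro(M, R):
    """First form of Eq. (3.27): (R^3 / (G M))^(1/2), in seconds."""
    return math.sqrt(R**3 / (G * M))

def tau_hydro_from_density(M, R):
    """Second form of Eq. (3.27): 0.5 * (G * mean density)^(-1/2), in seconds."""
    rho_mean = M / (4.0 / 3.0 * math.pi * R**3)
    return 0.5 / math.sqrt(G * rho_mean)

t_first = tau_hydro(M_SUN, R_SUN)
t_second = tau_hydro_from_density(M_SUN, R_SUN)
print(f"tau_hydro ~ {t_first / 60:.0f} min (first form), {t_second / 60:.0f} min (second form)")
```

Both forms give a value of roughly half an hour for these inputs, differing only by the rounding of the numerical prefactor to 1/2.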

The equations mentioned above that describe the stellar structure are special cases of the equations

known from hydrodynamics and are only valid for spherically symmetric bodies.

3.5 Conservation of energy

3.5.1 The virial theorem

The virial theorem is not so important for solving most physical problems. In the study of stellar structures,

however, it is of major importance, since it connects two dominant energy reservoirs and helps to deduce

predictions and interpretations for certain evolutionary stages in the life of the star.

If we multiply the left-hand side of the Lagrangian form of the hydrostatic equilibrium (3.17) by 4πr^3

and integrate over the mass in the interval [0, M] from the centre to the stellar surface, integration by parts gives

\[ \int_0^M 4\pi r^3 \frac{\partial P}{\partial m}\, dm = \left[4\pi r^3 P\right]_0^M - \int_0^M 12\pi r^2 \frac{\partial r}{\partial m} P\, dm. \tag{3.28} \]

The term between square brackets vanishes as r = 0 in the stellar centre and P = 0 at the stellar surface.

On the other hand, we can reduce the integrand of the second term on the right-hand side by using (3.6) to

obtain 3P/ρ. Finally, we get

\[ \int_0^M \frac{Gm}{r}\, dm = 3\int_0^M \frac{P}{\rho}\, dm, \tag{3.29} \]


where the left-hand side of (3.29) follows from inserting the right-hand side of (3.17) for ∂P/∂m in the left-hand side of (3.28).

Both sides of equation (3.29) have the dimension of an energy. We define the gravitational energy Eg of the

star by

\[ E_g \equiv -\int_0^M \frac{Gm}{r}\, dm. \tag{3.30} \]

Let us consider a unit mass at a position r. The potential energy of this unit mass, due to

the gravitational field of the mass m situated within a radius r, is −Gm/r. We can see that Eg

is the potential energy of all fluid elements dm of the star, which is normalised to zero at infinity. An energy

−Eg(> 0) is necessary to expand all mass elements to infinity, while this amount of energy is released when

an infinite cloud contracts into a star.

When all fluid elements within a star expand or contract together, Eg will gradually increase or de-

crease, respectively. This must then also be true for the integral in the right-hand side of Eq. (3.29). Here,

we stress that the contraction or expansion must occur on a timescale that is much longer than τ_hydro, since

otherwise Eq. (3.29) will not be valid.

To discover the meaning of the term in the right-hand side of Eq. (3.29), we consider an ideal gas:

\[ \frac{P}{\rho} = \frac{\mathcal{R}}{\mu}\, T = (c_P - c_v)\, T = (\gamma - 1)\, c_v T. \tag{3.31} \]

For a mono-atomic gas γ = 5/3 and we get P/ρ = (2/3)u with u = c_v T the internal energy of the ideal gas

per unit mass. If we define

\[ E_i \equiv \int_0^M u\, dm \tag{3.32} \]

as the total internal energy of the star, we deduce from Eq. (3.29), in the case of an ideal gas,

Eg = −2Ei. (3.33)

This result is the virial theorem for a mono-atomic ideal gas.
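Equation (3.29), on which the virial theorem rests, can be checked numerically for a simple model. The sketch below assumes a sphere of uniform density in hydrostatic equilibrium, for which P(r) = (2π/3)Gρ²(R² − r²), and evaluates both sides of (3.29) by a midpoint sum over mass shells. The uniform density and the solar values are assumptions made for the example; for this model both sides should equal 3GM²/(5R).

```python
import math

G = 6.673e-11  # m^3 kg^-1 s^-2, value quoted in the text

def virial_integrals_uniform(M, R, n_shells=2000):
    """Return both sides of Eq. (3.29) for a uniform-density sphere:
    (integral of G m / r dm, 3 * integral of P / rho dm)."""
    rho = M / (4.0 / 3.0 * math.pi * R**3)
    dr = R / n_shells
    lhs = 0.0
    rhs = 0.0
    for i in range(n_shells):
        r = (i + 0.5) * dr                                     # shell midpoint
        m = 4.0 / 3.0 * math.pi * r**3 * rho                   # enclosed mass
        dm = 4.0 * math.pi * r**2 * rho * dr                   # shell mass, Eq. (3.2)
        P = 2.0 * math.pi / 3.0 * G * rho**2 * (R**2 - r**2)   # hydrostatic pressure
        lhs += G * m / r * dm
        rhs += 3.0 * P / rho * dm
    return lhs, rhs

M_SUN, R_SUN = 1.989e30, 6.957e8  # kg and m, assumed solar values
lhs, rhs = virial_integrals_uniform(M_SUN, R_SUN)
expected = 3.0 * G * M_SUN**2 / (5.0 * R_SUN)
print(f"{lhs:.4e}  {rhs:.4e}  {expected:.4e}")
```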

For a general equation of state we define the quantity ζ by

\[ \zeta u \equiv \frac{3P}{\rho}. \tag{3.34} \]

For an ideal gas we have ζ = 3(γ − 1). In the mono-atomic case (γ = 5/3) this gives ζ = 2. For a gas

solely composed of photons we have γ = 4/3, P = aT^4/3 and uρ = aT^4 with a the radiation constant,

leading to ζ = 1. When ζ is constant in the star, Eq. (3.29) gives the more general result that

\[ \zeta E_i + E_g = 0. \tag{3.35} \]

We now define the total energy W of the star as W ≡ Ei + Eg, for which W < 0 for a gravitationally bound

system. Based on (3.35) we then get

\[ W = (1 - \zeta) E_i = \frac{\zeta - 1}{\zeta} E_g. \tag{3.36} \]


From this we deduce that the total energy of a gas of photons is zero.

If the star expands or contracts in a way so as to maintain the hydrostatic equilibrium, then Eg and Ei

will vary and the total energy will change. The gas will then radiate energy. If we define the total energy

loss by radiation per unit of time as the luminosity L of the star, we can deduce, following the conservation

of energy, that (dW/dt) + L = 0, which, via (3.36) implies that

\[ L = (\zeta - 1)\frac{dE_i}{dt} = -\frac{\zeta - 1}{\zeta}\frac{dE_g}{dt}. \tag{3.37} \]

When all mass shells contract simultaneously, dEg/dt < 0 and we get L = dEi/dt = −0.5 dEg/dt > 0

for a mono-atomic ideal gas. This means that half of the energy that is released as a consequence of the

contraction is radiated and the other half is used to heat up the star.

Equation (3.37) shows that L is of the order of |dEg/dt|. This way we can define a characteristic time

scale

\[ \tau_{\rm HK} \equiv \frac{|E_g|}{L} \approx \frac{E_i}{L}, \tag{3.38} \]

which is called the Helmholtz-Kelvin time scale (referring to the two physicists that deduced this as the

evolution timescale for a contracting or expanding star). A rough estimate of |Eg| is

\[ |E_g| \approx \frac{G\bar{m}^2}{\bar{r}} \approx \frac{GM^2}{2R}, \tag{3.39} \]

in which $\bar{m}$ and $\bar{r}$ represent average values for m and r over the star (which we have replaced by M/2 and

R/2). This way we get

\[ \tau_{\rm HK} \approx \frac{GM^2}{2RL}. \tag{3.40} \]
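For orientation, Eq. (3.40) can be evaluated for the Sun. The sketch below does this in SI units; the solar mass, radius and luminosity and the length of a year are assumed values not quoted in this section, and given the rough averages used in (3.39) only the order of magnitude is meaningful.

```python
G = 6.673e-11            # m^3 kg^-1 s^-2, value quoted in the text
M_SUN = 1.989e30         # kg, assumed solar mass
R_SUN = 6.957e8          # m, assumed solar radius
L_SUN = 3.828e26         # W, assumed solar luminosity
SECONDS_PER_YEAR = 3.156e7

def tau_helmholtz_kelvin(M, R, L):
    """Eq. (3.40): tau_HK ~ G M^2 / (2 R L), in seconds."""
    return G * M**2 / (2.0 * R * L)

tau_hk_years = tau_helmholtz_kelvin(M_SUN, R_SUN, L_SUN) / SECONDS_PER_YEAR
print(f"tau_HK (Sun) ~ {tau_hk_years:.1e} yr")
```

With these inputs the estimate comes out at a few times ten million years, far longer than the hydrostatic timescale.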

During certain stages in the life of the star Eg is the main energy source and the star evolves on a timescale

τHK. For a detailed description of stellar evolution we refer to part III of this course, but we now already

stress why the virial theorem, together with the energy transport equation (see Chapter 5), is of major im-

portance for the star’s life.

The temperature of the star is decreasing from the inner regions towards the outer regions. This implies

that energy is transported outwardly through the star and is radiated away into space, i.e., energy is taken

away from the stellar interior. If there is no nuclear energy source left, e.g. when all H in the stellar core is converted

into He, the star can only deliver the necessary energy by contracting. This contraction happens

slowly, so that the star stays in hydrostatic equilibrium during the contraction. The star has no other choice

but to contract because shrinking is the only way to cover the energy loss. It does so on the

Helmholtz-Kelvin timescale. The timescale that is necessary to recover from a pressure perturbation is much shorter,

namely τhydro. This means that, during the slow contraction process of the star, a new pressure equilibrium

can always be installed quasi-instantaneously: during the contraction the virial theorem stays valid. Thus,

when the star contracts, half of the gained potential energy is radiated while the other half is used to increase

the temperature of the gas. Due to the increase in temperature, the temperature gradient is raised, causing

even more energy radiation and consequently stronger contraction of the star is needed. Due to this vicious

circle, the stellar core keeps on shrinking and getting hotter until the temperature is high enough for a next

fusion process (for example, when T ≈ 10^8 K, helium burning can start). Afterwards the star can again radiate

without shrinking for a long period of time.


Figure 3.3: Representation of the quantity l, which depicts the amount of energy that radiates per second

through a sphere with radius r. (From Kippenhahn et al. 2012)

3.5.2 Conservation of energy in stars

We define l(r) as the net amount of energy, integrated over all frequencies, that is radiated per second

through a sphere with radius r. We assume there is no infinitely high energy source in the stellar centre.

This way the function l is zero in the stellar centre. It also equals the total luminosity L of the star at the

stellar surface. Between r = 0 and r = R, l is a complicated function that depends on the distribution of all

energy sources occurring in the different stellar layers. Thus, l includes the energy that is transported

by radiation as well as conduction and convection. In the following chapter, we will focus on these means

of energy transport, all of which require a temperature gradient. In the function l, we do not take into

account a possible energy flux as a result of neutrinos. After all, they have a negligible interaction with the

stellar material and we shall treat the neutrino flux, which does not require a temperature gradient, always

separately.

Local conservation of energy

Consider a spherically symmetric mass shell at radius r, with thickness dr and mass dm. Denote the energy

that enters the inner part of the shell by l and the energy that leaves along the outer side of the shell by l + dl

(see Figure 3.3). The surplus dl can be provided by nuclear reactions, by cooling, or by the contraction or

expansion of the shell. In a stationary situation dl only originates from the release of energy coming from

nuclear reactions. If we represent the nuclear energy released per unit mass and per time unit by ε, we get

\[ dl = 4\pi r^2 \rho\, \varepsilon\, dr = \varepsilon\, dm \tag{3.41} \]

or

\[ \frac{\partial l}{\partial m} = \varepsilon. \tag{3.42} \]

The quantity ε in general depends on the temperature, the density and the abundances of the different react-

ing nuclear particles.
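A discrete version of Eq. (3.42) makes the behaviour of l(m) concrete. The sketch below sums ε dm over mass shells for a purely hypothetical energy generation profile in which all nuclear energy is released uniformly in the innermost 10% of the mass; this profile, the shell count and the solar values (in cgs units) are assumptions for illustration and not a stellar model.

```python
def luminosity_profile(eps, dm):
    """Discrete Eq. (3.42): l_k is the sum of eps_j * dm_j over shells j <= k."""
    profile = []
    total = 0.0
    for e, d in zip(eps, dm):
        total += e * d
        profile.append(total)
    return profile

N_SHELLS = 100
M_SUN_G = 1.989e33      # g, assumed solar mass
L_SUN_ERG = 3.828e33    # erg/s, assumed solar luminosity

dm = [M_SUN_G / N_SHELLS] * N_SHELLS
# Hypothetical profile: constant eps in the inner 10% of the mass, zero outside.
eps_core = L_SUN_ERG / (0.1 * M_SUN_G)
eps = [eps_core if k < N_SHELLS // 10 else 0.0 for k in range(N_SHELLS)]

l = luminosity_profile(eps, dm)
print(f"l at 5% of the mass : {l[4]:.3e} erg/s")
print(f"l at the surface    : {l[-1]:.3e} erg/s")
```

In this stationary toy case l rises through the energy-producing core and stays constant, equal to L, in the layers above it.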


For a non-stationary shell, dl can differ from zero, even when no nuclear reactions take place. Indeed,

the shell can alter its internal energy and, moreover, exchange mechanical energy with neighbouring shells.

In that case we write instead of (3.42)

\[ dq = \left(\varepsilon - \frac{\partial l}{\partial m}\right) dt, \tag{3.43} \]

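in which dq is the heat added to the shell per unit mass. If we substitute dq using the first law of

thermodynamics, dq = du + P dv, with v = 1/ρ the specific volume (not to be confused with the velocity v of Section 3.2), we get

\[ \frac{\partial l}{\partial m} = \varepsilon - \frac{\partial u}{\partial t} - P\frac{\partial v}{\partial t} = \varepsilon - \frac{\partial u}{\partial t} + \frac{P}{\rho^2}\frac{\partial \rho}{\partial t}. \tag{3.44} \]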

Keeping the thermodynamic relation (2.18) in mind, we can write this expression in terms of the pressure

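and the temperature:

\[ \frac{\partial l}{\partial m} = \varepsilon - c_P \frac{\partial T}{\partial t} + \frac{\delta}{\rho}\frac{\partial P}{\partial t}. \tag{3.45} \]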

This equation is the third equation that describes the stellar structure.

Often the terms containing a time derivative in Eq. (3.45) are treated together in a so-called source

function εg:

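\[ \varepsilon_g \equiv -T\frac{\partial s}{\partial t} = -c_P\frac{\partial T}{\partial t} + \frac{\delta}{\rho}\frac{\partial P}{\partial t} = -c_P T\left(\frac{1}{T}\frac{\partial T}{\partial t} - \frac{\nabla_{\rm ad}}{P}\frac{\partial P}{\partial t}\right), \tag{3.46} \]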

where we have used ds = dq/T and Eq. (2.20) for ∇ad.

Let us now look into the energy change due to neutrinos. Neutrinos can occur in large amounts as a

side product of nuclear reactions (see later on, description of the different burning cycles). On the other

hand, the mean free path of a neutrino in a typical stellar medium is about 100 parsec! In the stellar core

of a main-sequence star they still have a mean free path of about 3000 R⊙. The stellar material is thus

fully transparent for neutrinos and therefore they can easily transport the energy they are carrying with them

to the stellar surface (this assumption is no longer valid in the final stages of the life of a star). This is

the reason why we treat the influence of neutrinos separately from the energy fluxes requiring a temperature

gradient. The only fluid elements that are influenced by neutrinos are those where neutrinos are formed. In

such regions, the neutrinos can cause a decrease of energy. We define εν(> 0) as the energy that is taken

from the stellar material per unit mass and per time unit in the form of neutrinos. The total equation for local

conservation of energy then is

\[ \frac{\partial l}{\partial m} = \varepsilon - \varepsilon_\nu + \varepsilon_g. \tag{3.47} \]

The energy that is transported per second by neutrinos is called the neutrino luminosity and is given by

\[ L_\nu \equiv \int_0^M \varepsilon_\nu\, dm. \tag{3.48} \]

As mentioned already l = 0 in the stellar centre and l = L at the stellar surface. For an intermediate

value of r, l is not necessarily monotonically increasing and can even become higher than L or negative. An

example of this is an expanding star where L is lower than the energy produced by the nuclear reactions in

the central parts due to the expansion (εg < 0). A strong neutrino loss can induce l < 0 in some stellar

layers.


Since neutrinos can leave the star without any problems after their creation by numerous nuclear re-

actions, they provide direct information about these reactions and hence on the physical conditions in the

stellar core. This offers a unique opportunity to probe stellar interiors. However, due to their very large

free path it is difficult to detect neutrinos in laboratories on Earth. Nevertheless, it is well possible to catch

neutrinos produced by hydrogen burning in the Sun. In one of the successful detections the neutrinos are

caught by the reaction

\[ \nu_e + {}^{37}\mathrm{Cl} \rightarrow e^- + {}^{37}\mathrm{Ar}. \tag{3.49} \]

One of the original successful detectors was based on a volume filled with 380 000 litres of C2Cl4 (a standard

dry-cleaning fluid). Despite this gigantic volume, only one neutrino was detected every other day. This is much

less than the number of neutrinos that is predicted using the solar models. This problem was known for

30 years as the solar neutrino problem. In another experiment, one looked at the scattering of neutrinos off

electrons in a volume of 680 tons of water. In contrast with the Cl experiment, the direction of the neutrinos is

measured, from which it could be deduced that they are produced by the Sun. Also in this case, the number

of detections was far too low compared to the theoretical predictions.

A solution to the problem was found after realising that the detectors were insensitive to the more un-

common types of neutrinos. The Cl and electron experiments are indeed only sensitive to a small fraction of

the total neutrino production in the Sun, namely only the high-energy electron-neutrinos. The above exper-

iments are not sensitive to mu- and tau-neutrinos. Two additional experiments are sensitive to the majority

of the produced neutrinos. They are based on a reaction of the neutrino with 71Ga. The gallium experi-

ments gave results that were closer to the theoretical expectations, but until 2001 considerable differences

remained. The solution was found thanks to hundreds of researchers active at the Sudbury Neutrino Obser-

vatory in Canada, who developed a new generation of neutrino detectors. They could confirm that a part

of the solar neutrinos change their character from electron-neutrinos into mu- or tau-neutrinos by the time

they arrive on Earth. Estimates of the sum of the three types of neutrinos fairly well correspond to models

of the Sun. The leading team of scientists was rewarded with the Nobel Prize in Physics for having brought

a solution to the solar neutrino problem.

Given the challenges to detect the neutrinos coming from the Sun, it is of course even more difficult to

detect neutrinos produced by other stars. However, neutrinos produced during supernova explosions have

been detected. The famous supernova SN 1987A in the Large Magellanic Cloud gave rise to the detection

of 20 of its emitted neutrinos in two different detectors, both situated in the Northern Hemisphere. This

offered a very precise test of nuclear reactions during supernova explosions (see later on).

Global conservation of energy

When describing the virial theorem we have limited ourselves to taking into account the internal energy Ei

and the gravitational energy Eg. We neglected the nuclear energy as well as the energy of the neutrinos and

the kinetic energy of the fluid elements (for example due to stellar oscillations). If we now redefine the total

energy of the star as W = Ekin +Eg +Ei +En with En the nuclear energy-content of the whole star, then

the equation that describes the global conservation of energy is given by:

\[ \frac{d}{dt}\left(E_{\rm kin} + E_g + E_i + E_n\right) + L + L_\nu = 0. \tag{3.50} \]


3.5.3 The different time scales

Suppose that the luminosity of the star is only caused by the release of nuclear energy. If L is constant, this

energy loss can take place during the nuclear time scale that is defined by

\[ \tau_n \equiv \frac{E_n}{L}. \tag{3.51} \]

En represents the energy reservoir that can be released by nuclear reactions. The main reactions during the largest

fraction of the stellar life are those that achieve the fusion of four ^1H nuclei into one ^4He nucleus. This

hydrogen burning releases an energy of 6.3 × 10^{18} erg g^{-1}, which corresponds to a mass deficit of ∼ 0.7%

(it is equal to the total mass of four protons minus the mass of a helium nucleus, divided by the total mass

of four protons; see Appendix B). The nuclear time scale shows the total life span a star can have based on

the production of nuclear energy. Later we will show that the luminosity of a star is a strongly increasing

function of the stellar mass. Because of this, the nuclear time scale decreases very fast with increasing mass.

A star with initial mass of 30 M⊙, for example, can only live for about 5 million years while a star with half

a solar mass barely had enough time to evolve in the current Universe with its age of 13.79 ± 0.02 billion

years.
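The nuclear time scale of Eq. (3.51) can be estimated for the Sun with the energy yield quoted above. The sketch below is a rough estimate only: it assumes that about 10% of the stellar mass is available for hydrogen burning and treats that mass as pure hydrogen, and it uses assumed solar values in cgs units. The burn fraction in particular is an illustrative assumption, not a number from these notes.

```python
E_PER_GRAM = 6.3e18       # erg/g released by hydrogen burning (value from the text)
M_SUN_G = 1.989e33        # g, assumed solar mass
L_SUN_ERG = 3.828e33      # erg/s, assumed solar luminosity
BURN_FRACTION = 0.1       # assumed fraction of the mass that is burnt
SECONDS_PER_YEAR = 3.156e7

def tau_nuclear(M, L, burn_fraction=BURN_FRACTION):
    """Eq. (3.51): tau_n = E_n / L, with E_n = burn_fraction * M * E_PER_GRAM."""
    E_n = burn_fraction * M * E_PER_GRAM
    return E_n / L

tau_n_years = tau_nuclear(M_SUN_G, L_SUN_ERG) / SECONDS_PER_YEAR
print(f"tau_n (Sun) ~ {tau_n_years:.1e} yr")
```

Under these assumptions the estimate is of the order of ten billion years; changing the assumed burn fraction rescales the result linearly.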

The relation between the different time scales for the Sun is (see exercises):

\[ \tau_n \gg \tau_{\rm HK} \gg \tau_{\rm hydro}. \tag{3.52} \]

This relation is valid for all stars for which hydrogen or helium burning is the main energy source. The re-

lation between these time scales helps us to simplify the equation that expresses the conservation of energy.

Let us consider the four terms that occur in Eq. (3.45) for a star of which the properties are changing consid-

erably on a time scale τ , which can be small or large compared to τHK. A cause of this change could be, for

example, the depletion of a certain nuclear fuel inside the core. For an ideal gas we can easily approximate

the terms in Eq. (3.45) by:

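$$
\begin{aligned}
\frac{\partial l}{\partial m} &\approx \frac{L}{M} \approx \frac{E_i}{\tau_{\rm HK} M}, \\
\varepsilon &\approx \frac{L}{M} = \frac{E_n}{M \tau_n} \approx \frac{E_i}{\tau_{\rm HK} M}, \\
c_P \frac{\partial T}{\partial t} &\approx \frac{c_P T}{\tau}, \\
\frac{\delta}{\rho}\frac{\partial P}{\partial t} &\approx \frac{\mathcal{R}}{\mu}\frac{T}{\tau} \approx \frac{c_P T}{\tau} \approx \frac{E_i}{\tau M}.
\end{aligned}
\tag{3.53}
$$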

In the case of τ >> τHK the values of the last two expressions given in Eqs (3.53) are far below those of

the first two expressions and we can neglect the time-dependent terms in the energy equation (|εg| << ε). The

latter then reduces to ∂l/∂m = ε like in (3.42). This approximation is valid when the burning of hydrogen

or helium determines the stellar evolution (τ = τn) and implies a huge simplification when computing stellar

models. These models are in full mechanical and thermal equilibrium.

In contrast, if τ << τHK, the values of the right-hand sides of the last two equations given in (3.53) are

large compared to those of the first two equations. This means that the time-dependent terms in the energy

equation compensate each other to a very good approximation, implying that dq/dt ≈ 0. In this case, we are

dealing with a quasi-adiabatic change. An example of this is a star pulsating on a time scale τ << τHK. The

variable luminosity of a pulsating star is the consequence of variations in εg and not in ε. For an extensive

description of pulsating stars we refer to the course Asteroseismology in the Leuven Master of Astronomy

and Astrophysics, while the course Theory of Stellar Oscillations fully describes the theoretical aspects of

this research field.

As the reader will have noticed, the determination of the time scales is somewhat arbitrary. We just as

well could have taken R or R/10 as the average distance to use in the expressions, instead of R/2; a similar

remark is valid for the average mass. However, it is not the intention to determine accurate values for the

time scales but rather to have an idea of their order of magnitude.

Finally, when deducing the relation between the different time scales we have implicitly assumed that the stellar quantities change linearly. However, when only certain parts of the star have to be considered because of non-uniform variations, the above argumentation is no longer appropriate, because local rather than global time scales should be taken into consideration.

Chapter 4

Additional relevant equations of state

The temperature does not occur in Eqs (3.18). For certain equations of state, this allows us to separate these two equations from the thermo-energetic equations that are also necessary to define the stellar structure. We will now discuss two such equations of state that are important for the life of the star.

4.1 Polytropes

We take a star in hydrostatic equilibrium and use the Eulerian description. For a time-independent stellar

model the gravitational potential has to fulfill the following equations:

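$$
\begin{aligned}
\frac{dP}{dr} &= -\rho\frac{d\Phi}{dr}, \\
\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d\Phi}{dr}\right) &= 4\pi G\rho.
\end{aligned}
\tag{4.1}
$$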

When ρ does not depend on T, i.e., ρ = ρ(P), this relation can be substituted into Eqs (4.1), which then form a system of two equations for the two unknowns P and Φ. These equations can be solved without having to use the equation describing the energy transport (see next chapter).

We assume that we have a simple relation between the pressure and the density which looks like this:

$$P = K\rho^{\gamma} = K\rho^{1+\frac{1}{n}}, \tag{4.2}$$

in which K, γ and n are constants. An equation of state of the form (4.2) is called a polytrope. K is the

polytropic constant and γ the polytropic exponent. Instead of γ, often the polytropic index n is used, which

is defined as n ≡ 1/(γ − 1).

In general K is constant for one specific star, but it can take different values for different stars. For an isothermal ideal gas, the equation of state can be written as follows: $P = (\mathcal{R}T_0/\mu)\rho$. In this case, we are dealing with a polytrope with $K = \mathcal{R}T_0/\mu$, γ = 1, n = ∞. For an ideal mono-atomic gas with a negligible radiation pressure: $\nabla_{\rm ad} = 2/5$. This means that $T \sim P^{2/5}$. Furthermore, in this case µ = constant, and therefore $T \sim P/\rho$, so we finally get $P \sim \rho^{5/3}$. This is again a polytrope, this time with γ = 5/3, n = 3/2. A homogeneous gaseous sphere can be seen as a special case of (4.2) for γ = ∞, n = 0. We thus conclude that polytropes indeed occur in the case of simple equations of state that already have the form (4.2) as well as in the case of an ideal gas when an extra relation between temperature and pressure can be deduced.

For a polytropic relation (4.2), the first equation of the system (4.1) can be transformed into

$$\frac{d\Phi}{dr} = -\gamma K\rho^{\gamma-2}\frac{d\rho}{dr}. \tag{4.3}$$

For $\gamma \neq 1$ this equation can be integrated to result in:

$$\rho = \left(\frac{-\Phi}{(n+1)K}\right)^n, \tag{4.4}$$

in which we have used the definition of n and the integration constant was chosen in a way that Φ = 0 at the stellar surface. When we substitute (4.4) in the second equation of (4.1), we obtain an ordinary differential equation for Φ:

$$\frac{d^2\Phi}{dr^2} + \frac{2}{r}\frac{d\Phi}{dr} = 4\pi G\left(\frac{-\Phi}{(n+1)K}\right)^n. \tag{4.5}$$

We now define the dimensionless variables z and w by

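$$
\begin{aligned}
z &= Ar \quad \text{with} \quad A^2 = \frac{4\pi G}{(n+1)^n K^n}(-\Phi_c)^{n-1} = \frac{4\pi G}{(n+1)K}\,\rho_c^{\frac{n-1}{n}}, \\
w &= \frac{\Phi}{\Phi_c} = \left(\frac{\rho}{\rho_c}\right)^{1/n},
\end{aligned}
\tag{4.6}
$$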

where the subscript “c” indicates the stellar centre. In the centre, we have r = z = 0, Φ = Φc, ρ = ρc and so w = 1. Substituting these variables into (4.5), we get

$$\frac{d^2w}{dz^2} + \frac{2}{z}\frac{dw}{dz} + w^n = 0, \tag{4.7}$$

which again can be transformed into

$$\frac{1}{z^2}\frac{d}{dz}\left(z^2\frac{dw}{dz}\right) + w^n = 0. \tag{4.8}$$

Equation (4.8) is the Lane-Emden equation. We search for solutions of this equation that remain finite in

the stellar centre. This condition is met when dw/dz(0) = 0. In general, we have to determine solutions of

the Lane-Emden equation numerically, since only for n = 0, 1, 5 analytic solutions exist. The function w is

represented in Figure 4.1 for the two cases n = 3 and n = 3/2.
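A numerical solution of the Lane-Emden equation can be sketched as follows. This is a minimal illustration, not the method used to produce Figure 4.1: it applies a classical fourth-order Runge-Kutta scheme to Eq. (4.7), written as a first-order system in w and u = dw/dz, and starts slightly off-centre with the series w ≈ 1 − z²/6 to avoid the 1/z term at z = 0. The function name, the step size and the upper bound `z_max` are our own choices.

```python
def lane_emden(n, dz=1e-3, z_max=50.0):
    """Integrate w'' + (2/z) w' + w**n = 0 with w(0) = 1, w'(0) = 0 (Eq. 4.7).

    Returns (z_n, -z_n**2 * w'(z_n)) at the first zero z_n of w.
    """
    def rhs(z, w, u):
        # u = dw/dz; max(w, 0) avoids a fractional power of a negative number
        return u, -max(w, 0.0) ** n - 2.0 * u / z

    # Series start close to the centre: w = 1 - z^2/6, w' = -z/3.
    z = dz
    w = 1.0 - z * z / 6.0
    u = -z / 3.0
    while z < z_max:
        k1w, k1u = rhs(z, w, u)
        k2w, k2u = rhs(z + dz / 2, w + dz / 2 * k1w, u + dz / 2 * k1u)
        k3w, k3u = rhs(z + dz / 2, w + dz / 2 * k2w, u + dz / 2 * k2u)
        k4w, k4u = rhs(z + dz, w + dz * k3w, u + dz * k3u)
        w_new = w + dz / 6 * (k1w + 2 * k2w + 2 * k3w + k4w)
        u_new = u + dz / 6 * (k1u + 2 * k2u + 2 * k3u + k4u)
        if w_new <= 0.0:
            # Linear interpolation between the last two steps for the zero of w.
            frac = w / (w - w_new)
            z_n = z + frac * dz
            u_n = u + frac * (u_new - u)
            return z_n, -z_n**2 * u_n
        z, w, u = z + dz, w_new, u_new
    raise ValueError("no zero of w found below z_max")


for index in (1.5, 3.0):
    z_n, mass_factor = lane_emden(index)
    print(f"n = {index}: first zero z_n = {z_n:.4f}, -z^2 dw/dz = {mass_factor:.4f}")
```

A convenient check is the case n = 1, for which the analytic solution w = sin z / z has its first zero at z = π. For n = 5 the solution has no zero at finite z, so the function raises an error for that index.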

Figure 4.1: The solutions for the Lane-Emden equation (4.8) for n = 3/2 and n = 3. (From Kippenhahn et

al. 2012)

Imagine we have found a solution w(z) of the Lane-Emden equation for which w(0) = 1 and dw/dz(0) = 0. Following (4.6), the radial dependence of the density is then given by

$$\rho(r) = \left[\frac{-\Phi_c}{(n+1)K}\right]^n w^n(Ar). \tag{4.9}$$

For the pressure, we then find from the definition (4.2) that $P(r) = P_c\,w^{n+1}(Ar)$ with $P_c = K\rho_c^{\gamma}$. Finally we deduce an expression for the mass within the sphere with radius r:

$$m(r) = \int_0^r 4\pi\rho r^2\,dr = 4\pi\rho_c\int_0^r w^n r^2\,dr = 4\pi\rho_c\frac{r^3}{z^3}\int_0^z w^n z^2\,dz, \tag{4.10}$$

where we used (4.6). According to the Lane-Emden equation (4.8), the integrand $w^n z^2$ equals $-\frac{d}{dz}\left(z^2\,dw/dz\right)$, so the integral evaluates to $-z^2\,dw/dz$. The mass can then be written as

$$m(r) = 4\pi\rho_c r^3\left(-\frac{1}{z}\frac{dw}{dz}\right). \tag{4.11}$$

4.2 The degenerate electron gas

If a gas reaches a very high density, it can no longer be described by the ideal gas law. At high densities quantum mechanical effects come into play and such a gas is then called a degenerate gas. A schematic comparison between “ordinary” and degenerate matter in a neutral gas is shown in Figure 4.2. In case “a” the electrons move in a normal way, i.e., in their shells around the nuclei, while in case “b” the mutual distance between the nuclei is so small that the electrons cannot move in their shells anymore and form a “gas” that moves in between the nuclei.

Figure 4.2: A schematic representation of the difference between ordinary (a) and degenerate (b) matter for

a neutral gas. In ordinary matter the inner electron shells are still intact. In degenerate matter the nuclei

are closer to each other than half of the diameter of the smallest possible electron shell. Because of this,

the electrons cannot move according to their shells but must move freely between the nuclei. In this way,

they form a “gas”. This degenerate electron gas exerts a huge pressure. (Image courtesy of Prof. E. van den

Heuvel, University of Amsterdam, NL)

Quantum mechanics states that there cannot be two identical particles that have the same position

and velocity, within the accuracy in which these can be measured according to the uncertainty relation of

Heisenberg. This law is called the Pauli Exclusion Principle. In other words: if two electrons are found very

close to each other, they cannot have exactly the same velocity.

In a low-density gas the average velocity of the particles is determined by the temperature. When the

temperature is high, the mean velocity of the particles is high. The gas pressure depends on the velocity of

the particles. Because the distance between the particles is large, the constraint that is put on the velocities

of particles by the exclusion principle has no effect. Such a gas is then called an ideal gas (see Chapter 2).

The situation is different for a gas that is compressed to a high density: all possible low velocities are in this case filled up, forcing many particles to have high velocities. These velocities are much higher than the ones the particles would have in a low-density gas with the same temperature. When the density of the degenerate gas is extremely high, the velocities with which the particles are forced to move approach the speed of light. Such a gas is called a relativistic degenerate gas. Because

the uncertainty relation contains the product of the mass and the velocity, the lightest particles will become

degenerate first. In a normal gas, these are the electrons.

We consider a gas with a sufficiently high density for pressure ionization to occur. This effect occurs when no bound atoms can exist because the orbital radius a of the electrons becomes comparable to or larger

than half of the distance d between two atoms. In the case of neutral hydrogen, a and d are given by

$$a = a_0\nu^2, \qquad d \approx \left(\frac{3}{4\pi n_{\rm H}}\right)^{1/3}, \tag{4.12}$$

with $a_0 = 5.3 \times 10^{-9}$ cm the Bohr radius, ν the principal quantum number and $n_{\rm H}$ the number of hydrogen atoms per unit volume. A gas will experience no pressure ionization as long as a < d/2, which implies the following condition for the principal quantum number:

$$\nu^2 < \left(\frac{3}{4\pi n_{\rm H}}\right)^{1/3}\frac{1}{2a_0}. \tag{4.13}$$

In the centre of the Sun, we have $\rho_c \approx 170$ g/cm$^3$, $n_{\rm H} \approx 10^{26}$ cm$^{-3}$, and so the condition under which pressure ionization does not occur is $\nu^2 < 0.13$. This means that not even the ground state (ν = 1) of the hydrogen atom can exist and that all hydrogen atoms in the centre of the Sun have to be ionized. In stellar centres we always deal with pressure-ionized gases, for which the electrons become degenerate first and, if that is insufficient to provide enough pressure, the nuclei also become degenerate.
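The bound on ν² for the solar centre follows directly from Eqs (4.12) and (4.13). The short sketch below evaluates it with the values of a0 and nH quoted in the text; the function name is ours.

```python
import math

A0_CM = 5.3e-9   # Bohr radius [cm], as quoted in the text


def max_nu_squared(n_h):
    """Upper bound on nu^2 from Eq. (4.13) for a hydrogen number density n_h [cm^-3]."""
    d = (3.0 / (4.0 * math.pi * n_h)) ** (1.0 / 3.0)   # mean separation, Eq. (4.12)
    return d / (2.0 * A0_CM)


bound_sun_centre = max_nu_squared(1e26)
# The ground state has nu = 1, so it survives only if the bound exceeds 1.
ground_state_survives = bound_sun_centre > 1.0

print(f"nu^2 < {bound_sun_centre:.2f}; ground state can exist: {ground_state_survives}")
```

Since the bound scales as nH^(-1/3), the ground state would only survive at densities lower by roughly a factor of (1/0.13)³, i.e., several hundred times lower than in the solar centre.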

We now study free electrons that occur in a pressure-ionized gas. In the local space of momentum $(p_x, p_y, p_z)$, the electrons are represented as a spherically symmetric “cloud”. Representing the absolute value of the momentum by p (with $p^2 = p_x^2 + p_y^2 + p_z^2$), the distribution function of the momentum of the electrons in a classical gas is a Maxwellian distribution function (2.40), which we now write down in terms of the momentum rather than velocity:

$$f(p) = \frac{4\pi p^2}{(2\pi m_e kT)^{3/2}}\exp\left(-\frac{p^2}{2m_e kT}\right). \tag{4.14}$$

The maximum of this distribution function occurs at $p_{\max} = (2m_e kT)^{1/2}$. When there is a decrease in temperature T, the maximum shifts to a lower p-value and the value of the maximum of f(p) increases (see Figure 4.3).

The number of free electrons with particle density ne occurring in a volume dV of a pressure-ionized

gas and having a momentum in the interval [p, p + dp], is obtained by multiplying the distribution function

with nedV ; this way we obtain the so-called Boltzmann distribution function:

$$n_e f(p)\,dp\,dV = n_e\frac{4\pi p^2}{(2\pi m_e kT)^{3/2}}\exp\left(-\frac{p^2}{2m_e kT}\right)dp\,dV. \tag{4.15}$$

We now leave classical mechanics behind and take the quantum mechanical principles into account. Since the electrons have to obey the Pauli principle, there is a restriction on the number of electrons that can occur in a given state. Each quantum cell of the six-dimensional phase space (x, y, z, px, py, pz) can contain at most two electrons. The volume of such a quantum cell is $dp_x\,dp_y\,dp_z\,dV = h^3$, with h Planck’s constant. In the shell [p, p + dp] of momentum space, there are $4\pi p^2\,dp\,dV/h^3$ quantum cells, which can contain at most $8\pi p^2\,dp\,dV/h^3$ electrons. These quantum mechanical considerations thus give an upper limit for the number of electrons:

$$f(p)\,dp\,dV \le \frac{8\pi p^2}{h^3}\,dp\,dV. \tag{4.16}$$

Figure 4.3: Maxwellian distribution functions f(p) are shown as a function of the momentum p (thin lines) for an electron gas with density $n_e = 10^{28}$ cm$^{-3}$ (which corresponds to a density of $\rho = 1.66 \times 10^4$ g cm$^{-3}$ for $\mu_e = 1$) for different temperatures. The bold line shows the upper limit, imposed by the Pauli principle. (From Kippenhahn et al. 2012)

This quantum-mechanical upper limit for f(p) is indicated as the parabola in Figure 4.3. We deduce that the

Boltzmann distribution for ne = constant is in contradiction with the quantum mechanical upper limit for

extremely low temperatures. The same result holds for T =constant and sufficiently high densities, since the

Boltzmann distribution is proportional to ne. Therefore, we have to abandon the classical description and

take the quantum mechanical effects into account when the gas temperature is too low or when the electron

density becomes too high. In those cases, the classical distribution function would exceed the upper limit imposed by the Pauli principle.
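The conflict between the classical distribution and the quantum mechanical limit can be made explicit with a few lines of code. The sketch below compares the classical distribution per unit volume, n_e f(p) from Eq. (4.15), with the limit 8πp²/h³ from Eq. (4.16). The threshold temperature is our own rearrangement of these two expressions: their ratio is largest for p → 0, where the exponential tends to 1, so the classical curve stays below the limit everywhere only if n_e h³ / [2 (2π m_e k T)^(3/2)] ≤ 1. The constants are rounded cgs values.

```python
import math

H = 6.626e-27      # Planck constant [erg s]
K_B = 1.381e-16    # Boltzmann constant [erg/K]
M_E = 9.109e-28    # electron mass [g]


def boltzmann(p, n_e, temp):
    """Classical distribution n_e f(p) of Eq. (4.15), per unit volume and unit momentum."""
    a = 2.0 * M_E * K_B * temp
    return n_e * 4.0 * math.pi * p**2 / (math.pi * a) ** 1.5 * math.exp(-p**2 / a)


def pauli_limit(p):
    """Quantum mechanical upper limit 8 pi p^2 / h^3 of Eq. (4.16)."""
    return 8.0 * math.pi * p**2 / H**3


def threshold_temperature(n_e):
    """Temperature below which n_e f(p) exceeds the Pauli limit for p -> 0."""
    return (n_e * H**3 / 2.0) ** (2.0 / 3.0) / (2.0 * math.pi * M_E * K_B)


n_e = 1e28                                   # electron density of Figure 4.3 [cm^-3]
t_threshold = threshold_temperature(n_e)
print(f"classical description fails below T ~ {t_threshold:.1e} K for n_e = {n_e:.0e} cm^-3")
```

The threshold scales as n_e^(2/3), which expresses the statement above that either a low temperature or a high density forces us to abandon the classical description.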

Let us consider an electron gas where the electrons have the lowest possible energy (T = 0 K). The state in which all these electrons have the lowest energy possible, while still complying with the Pauli principle, is the one where all phase cells are populated with two electrons up to a certain value of the momentum, denoted as $p_{\rm F}$, while all the other cells are empty:

$$
f(p) =
\begin{cases}
\dfrac{8\pi p^2}{h^3} & \text{for } p \le p_{\rm F}, \\[4pt]
0 & \text{for } p > p_{\rm F}.
\end{cases}
\tag{4.17}
$$

Figure 4.4: The distribution function f(p) as a function of the momentum p for a completely degenerate electron gas at absolute zero temperature and with density $n_e = 10^{28}$ cm$^{-3}$. (From Kippenhahn et al. 2012)

The distribution function is shown in Figure 4.4. The total number of electrons in the volume dV is given by

$$n_e\,dV = dV\int_0^{p_{\rm F}}\frac{8\pi p^2}{h^3}\,dp = \frac{8\pi}{3h^3}\,p_{\rm F}^3\,dV. \tag{4.18}$$

For a given electron density, we then find the Fermi momentum $p_{\rm F} \sim n_e^{1/3}$. For non-relativistic electrons the Fermi energy is $E_{\rm F} = p_{\rm F}^2/2m_e \sim n_e^{2/3}$. We can see that, even though the temperature of the electron gas is zero, the electrons have a non-zero energy that can be as high as $E_{\rm F}$. When the electron density is very high, the velocities of the fastest electrons can reach a considerable fraction of the speed of light. Therefore we have to use expressions for the total energy and the momentum that are deduced according to the theory of special relativity:

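$$
p = \frac{m_e v}{\sqrt{1 - v^2/c^2}}, \qquad
E_{\rm tot} = \frac{m_e c^2}{\sqrt{1 - v^2/c^2}} = m_e c^2\sqrt{1 + \frac{p^2}{m_e^2 c^2}},
\tag{4.19}
$$

with $m_e$ the rest mass of the electron. The kinetic energy of the electron is connected with the total energy by $E = E_{\rm tot} - m_e c^2$.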
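To get a feeling for the numbers, the sketch below evaluates the Fermi momentum from Eq. (4.18) and the kinetic energy at p = pF from Eq. (4.19) for the density used in Figures 4.3 and 4.4. The function names and the rounded cgs constants are ours.

```python
import math

H = 6.626e-27      # Planck constant [erg s]
M_E = 9.109e-28    # electron mass [g]
C = 2.998e10       # speed of light [cm/s]


def fermi_momentum(n_e):
    """p_F from Eq. (4.18): n_e = 8 pi p_F^3 / (3 h^3)."""
    return H * (3.0 * n_e / (8.0 * math.pi)) ** (1.0 / 3.0)


def fermi_kinetic_energy(n_e):
    """Relativistic kinetic energy E = E_tot - m_e c^2 at p = p_F, from Eq. (4.19)."""
    x = fermi_momentum(n_e) / (M_E * C)
    return M_E * C**2 * (math.sqrt(1.0 + x * x) - 1.0)


n_e = 1e28                                        # [cm^-3]
x_fermi = fermi_momentum(n_e) / (M_E * C)         # dimensionless momentum p_F / (m_e c)
e_rel = fermi_kinetic_energy(n_e)
e_nonrel = fermi_momentum(n_e) ** 2 / (2.0 * M_E) # non-relativistic E_F = p_F^2 / 2 m_e

print(f"p_F / (m_e c) = {x_fermi:.3f}")
print(f"E_F (relativistic) / E_F (non-relativistic) = {e_rel / e_nonrel:.3f}")
```

At this density pF is about a quarter of me c, so the non-relativistic expression for the Fermi energy is still accurate to within a few per cent; the difference grows quickly at higher densities.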

To deduce an equation of state for a degenerate electron gas we have to consider an expression for the pressure of the electrons. The pressure is by definition the flux of momentum through a unit surface per unit time. Let us consider a surface element dσ with normal vector $\vec{n}$ (see Figure 4.5). An arbitrary unit vector

$\vec{s}$ defines the angle θ that is enclosed by $\vec{n}$ and $\vec{s}$. We now determine the number of electrons that move through dσ per second within the solid angle $d\Omega_s$ around the direction $\vec{s}$. We restrict ourselves to electrons with a momentum in the interval [p, p + dp]. At the position of the surface element there are $f(p)\,dp\,d\Omega_s/(4\pi)$ electrons per unit volume within this solid angle that have the appropriate momentum. Hence $f(p)\,dp\,d\Omega_s\,v(p)\cos\theta\,d\sigma/(4\pi)$ electrons move per second through the surface dσ within the solid angle $d\Omega_s$. Here v(p) is the velocity that follows from (4.19). Each electron has a momentum with absolute value p and with direction $\vec{s}$. Its component in the direction of $\vec{n}$ is p cos θ. We then obtain the total flux of momentum in the direction $\vec{n}$ by integrating over all directions $\vec{s}$ of a sphere and over all absolute values p. In this way we find the electron pressure $P_e$:

$$P_e = \int_{\Omega}\int_0^{\infty} f(p)\,v(p)\,p\cos^2\theta\,dp\,\frac{d\Omega_s}{4\pi} = \frac{8\pi}{3h^3}\int_0^{p_{\rm F}} p^3 v(p)\,dp, \tag{4.20}$$

in which we have replaced f(p) by (4.17). By using the expression for p given in (4.19) we then find

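$$P_e = \frac{8\pi c}{3h^3}\int_0^{p_{\rm F}} p^3\,\frac{p/(m_e c)}{\left[1 + p^2/(m_e^2 c^2)\right]^{1/2}}\,dp = \frac{8\pi c^5 m_e^4}{3h^3}\int_0^x\frac{\xi^4\,d\xi}{(1+\xi^2)^{1/2}}, \tag{4.21}$$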

where we have introduced the new variables ξ ≡ p/(mec), x ≡ pF/(mec). It can be shown that the integral

in the right-hand side of this expression is given by

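$$\frac{1}{8}\left[x\left(2x^2-3\right)\left(1+x^2\right)^{1/2} + 3\sinh^{-1}x\right] = \frac{x}{8}\left(2x^2-3\right)\left(x^2+1\right)^{1/2} + \frac{3}{8}\ln\left[x + \left(1+x^2\right)^{1/2}\right] \equiv \frac{1}{8}g(x)$$

so that

$$P_e = \frac{\pi m_e^4 c^5}{3h^3}\,g(x). \tag{4.22}$$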

With the help of the definition of x we finally write the electron number density as

$$n_e = \frac{\rho}{\mu_e m_u} = \frac{8\pi m_e^3 c^3}{3h^3}\,x^3. \tag{4.23}$$

These last two equations define the function Pe(ne).

To find an expression for the equation of state Pe(ρ), we first deduce the asymptotic behaviour of the

function g(x). Therefore we write x as

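$$x = \frac{p_{\rm F}}{m_e c} = \frac{v_{\rm F}/c}{\left(1 - v_{\rm F}^2/c^2\right)^{1/2}} \quad \text{or} \quad \frac{v_{\rm F}^2}{c^2} = \frac{x^2}{1+x^2}, \tag{4.24}$$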

in which $v_{\rm F}$ is the velocity of the electrons with a momentum $p = p_{\rm F}$. When x ≪ 1, then vF/c ≪ 1 and the electrons clearly move slower than the speed of light (non-relativistic limit). On the other hand, x ≫ 1 implies that vF/c → 1. The higher x, the more electrons move relativistically and, for very large x, almost all electrons move relativistically. The function g(x) shows the following asymptotic behaviour:

$$x \to 0:\ g(x) \to \frac{8}{5}x^5, \qquad x \to \infty:\ g(x) \to 2x^4. \tag{4.25}$$
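The asymptotic behaviour (4.25) is easy to verify numerically. The sketch below implements g(x) exactly as defined above Eq. (4.22) and prints its ratio to the two limiting forms; the helper names are ours.

```python
import math


def g(x):
    """g(x) = x (2x^2 - 3) sqrt(1 + x^2) + 3 asinh(x), as defined above Eq. (4.22)."""
    return x * (2.0 * x * x - 3.0) * math.sqrt(1.0 + x * x) + 3.0 * math.asinh(x)


def g_nonrelativistic(x):
    """Limit of g(x) for x -> 0, Eq. (4.25)."""
    return 8.0 / 5.0 * x**5


def g_relativistic(x):
    """Limit of g(x) for x -> infinity, Eq. (4.25)."""
    return 2.0 * x**4


for x in (0.05, 0.1, 1.0, 10.0, 100.0):
    ratio_nr = g(x) / g_nonrelativistic(x)
    ratio_rel = g(x) / g_relativistic(x)
    print(f"x = {x:6.2f}   g/g_nonrel = {ratio_nr:.4f}   g/g_rel = {ratio_rel:.4f}")
```

For very small x the two terms in g(x) nearly cancel, so a direct evaluation in floating-point arithmetic loses accuracy; in that regime the series form of the non-relativistic limit is the safer choice.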

When x ≪ 1, relativistic effects can be neglected; (4.22) gives in this limit

$$P_e = \frac{8\pi m_e^4 c^5}{15h^3}\,x^5. \tag{4.26}$$

Figure 4.5: A surface element dσ with normal vector $\vec{n}$ and an arbitrary unit vector $\vec{s}$, which is the axis of the solid angle $d\Omega_s$. (From Kippenhahn et al. 2012)

If we substitute the expression for x given in (4.23), then we get

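$$P_e = \frac{1}{20}\left(\frac{3}{\pi}\right)^{2/3}\frac{h^2}{m_e}\,n_e^{5/3} = \frac{1}{20}\left(\frac{3}{\pi}\right)^{2/3}\frac{h^2}{m_e}\left(\frac{\rho}{\mu_e m_u}\right)^{5/3}, \tag{4.27}$$

where we used that $\rho = n_e\mu_e m_u$ in the last step. We notice that this equation of state has the form of a polytrope with γ = 5/3, n = 3/2.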

For x ≫ 1 we are in the extreme relativistic limit and we find the following equation for the electron

pressure

$$P_e = \frac{2\pi m_e^4 c^5}{3h^3}\,x^4. \tag{4.28}$$

If we again substitute x based on (4.23) this gives now

$$P_e = \left(\frac{3}{\pi}\right)^{1/3}\frac{hc}{8}\,n_e^{4/3} = \left(\frac{3}{\pi}\right)^{1/3}\frac{hc}{8}\left(\frac{\rho}{\mu_e m_u}\right)^{4/3}. \tag{4.29}$$

We again find a polytrope, this time with γ = 4/3, n = 3.

For both extremes of the degenerate electron gas (relativistic and non-relativistic), we find a polytropic equation of state (for which the function w was shown in Figure 4.1) in which the constant K is determined only by physical constants. This is in contrast with the examples in the previous section, where K was a free constant that can vary from star to star.
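The polytropic constants can be read off from Eqs (4.27) and (4.29) by writing both in the form P_e = K ρ^γ. The sketch below evaluates them in cgs units; the rounded constants and the function names are ours, and the "crossover density" where the two limiting expressions give the same pressure is only a rough indicator of where relativistic effects become important, not a result stated in the text.

```python
import math

H = 6.626e-27       # Planck constant [erg s]
C = 2.998e10        # speed of light [cm/s]
M_E = 9.109e-28     # electron mass [g]
M_U = 1.6605e-24    # atomic mass unit [g]


def k_nonrelativistic(mu_e):
    """K in P_e = K rho^(5/3), read off from Eq. (4.27)."""
    return (3.0 / math.pi) ** (2.0 / 3.0) * H**2 / (20.0 * M_E) / (mu_e * M_U) ** (5.0 / 3.0)


def k_relativistic(mu_e):
    """K in P_e = K rho^(4/3), read off from Eq. (4.29)."""
    return (3.0 / math.pi) ** (1.0 / 3.0) * H * C / 8.0 / (mu_e * M_U) ** (4.0 / 3.0)


def crossover_density(mu_e):
    """Density [g/cm^3] at which Eqs (4.27) and (4.29) give the same pressure."""
    return (k_relativistic(mu_e) / k_nonrelativistic(mu_e)) ** 3


for mu_e in (1.0, 2.0):
    print(f"mu_e = {mu_e}: K_nonrel = {k_nonrelativistic(mu_e):.3e}, "
          f"K_rel = {k_relativistic(mu_e):.3e}, "
          f"crossover density = {crossover_density(mu_e):.1e} g/cm^3")
```

Apart from the composition-dependent factor µe, both constants contain nothing but h, c, me and mu, which is the point made in the paragraph above.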

When the temperature is not zero, not all electrons will be packed in cells with a momentum that is

as low as possible. For temperatures which are high enough, the electrons will comply with Boltzmann

statistics. There is a continuous transition of a state of full degeneracy to a state of a non-degenerate gas.

This is called partial degeneracy. The distribution of the phase cells then follows a so-called Fermi-Dirac

statistic, that contains a degeneracy parameter ψ ∈ [−∞,∞]. This parameter shows which fraction of the

phase cells is filled and depends on ne and T . In this case, the equation of state cannot be described as a

simple analytic relation between the electron pressure and the density. For ψ → −∞ we recover the electron pressure of the ideal gas approximation, $P_e = n_e kT$, in the case of the non-relativistic partially degenerate electron gas. For a non-relativistic partially degenerate gas with ψ ≫ 1 (high level of degeneracy) we recover the equation of state (4.27). For the relativistic limit of strong degeneracy (ψ → +∞), we find the equation of state (4.29).

An important graph is the one where the temperature is plotted against the density and where the validity

areas with different equations of state are indicated. This graph is the outcome of one of the exercises.

4.3 The Chandrasekhar limit

We now look into a polytropic model in which the pressure is connected with a non-relativistic degenerate

electron gas. In such a medium, the central density and mean density increase with increasing stellar mass.

However, when the density increases, the electron gas becomes more and more relativistic. We can imagine

that we are evolving to a star with a relativistic core where the pressure is described by a polytrope with n = 3 (see Eq. 4.29) and a non-relativistic envelope with a pressure given by a polytrope with n = 3/2 (see Eq. 4.27). Hence a transition will occur, where the pressure takes a value between the two expressions (4.27) and (4.29). The physicist Chandrasekhar was the first to look into such models in order to understand the so-called white dwarfs (see Chapter 12).

A natural question is how such a model varies with increasing mass. At low M the whole model stays

non-relativistic and a polytrope with n = 3/2 gives an adequate description. When the central density is

high enough, an ever larger part of the stellar core will become relativistic. We expect that the star eventually

evolves into a state in which all particles move relativistically and the pressure is described by a polytrope

with polytropic index n = 3. This view has the following interesting property. From the definition of the

variable z, we find

$$R \sim \rho_c^{\frac{1-n}{2n}}, \tag{4.30}$$

thus from $M \sim \rho_c R^3$, it follows that

$$M \sim \rho_c^{\frac{3-n}{2n}}. \tag{4.31}$$

We can thus conclude that the mass of a polytrope with n = 3 does not depend on the central density: M = constant. Therefore, there is only one admitted mass for a fully degenerate relativistic electron gas that meets the requirements of a polytrope with n = 3. This mass is totally determined by physical constants and by the values of z and w′ at the first zero of w for the polytrope with n = 3. The numerical value of this unique mass is

$$M_{\rm Ch} = \frac{5.836}{\mu_e^2}\,M_\odot. \tag{4.32}$$

This mass is called the Chandrasekhar limit. It indicates the end point of the convergence process of

models with increasing central density for which the pressure is delivered by a fully relativistic degenerate

electron gas. This limiting mass (4.32) is very low if one keeps in mind that there are many stars that have

a much larger birth mass. However, all stars that have not yet started the ultimate end stage of their lives,

have an equation of state that differs a lot from a degenerate electron gas and so this mass limit is not of any

Figure 4.6: State of a stellar gas, where the central temperature expressed in K is shown as a function of the central density expressed in g/cm$^3$. (From Kippenhahn et al. 2012)

relevance for them. For white dwarfs, however, a degenerate electron gas does occur, as we shall discuss in

Chapter 12. For these stars, µe = 2 is a good approximation and we find the condition

$$M < M_{\rm Ch} = 1.46\,M_\odot. \tag{4.33}$$
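The numerical value in Eq. (4.32) can be reproduced, to within the rounding of the constants, by combining Eqs (4.6), (4.11) and (4.29). For n = 3, Eq. (4.6) gives A² = πG ρc^(2/3)/K, so that the total mass M = 4πρc R³ (−w′/z) evaluated at the first zero of w becomes M = 4π (−z² w′) (K/(πG))^(3/2), independent of ρc. The sketch below evaluates this expression; the Lane-Emden constant −z² w′ = 2.01824 for n = 3 is a standard tabulated value that we adopt as an assumption, and the solar mass and the other rounded constants are also not taken from the text.

```python
import math

G = 6.674e-8                # gravitational constant [cm^3 g^-1 s^-2]
H = 6.626e-27               # Planck constant [erg s]
C = 2.998e10                # speed of light [cm/s]
M_U = 1.6605e-24            # atomic mass unit [g]
M_SUN = 1.989e33            # solar mass [g] (assumed nominal value)
MASS_FACTOR_N3 = 2.01824    # ASSUMPTION: tabulated -z^2 dw/dz at the first zero of w, n = 3


def chandrasekhar_mass(mu_e):
    """Mass [g] of an n = 3 polytrope supported by a relativistic degenerate electron gas."""
    # Polytropic constant of Eq. (4.29), written as P_e = K rho^(4/3).
    k_rel = (3.0 / math.pi) ** (1.0 / 3.0) * H * C / 8.0 / (mu_e * M_U) ** (4.0 / 3.0)
    return 4.0 * math.pi * MASS_FACTOR_N3 * (k_rel / (math.pi * G)) ** 1.5


for mu_e in (1.0, 2.0):
    print(f"mu_e = {mu_e}: M_Ch = {chandrasekhar_mass(mu_e) / M_SUN:.2f} M_sun")
```

Because K scales as µe^(-4/3) and the mass as K^(3/2), the limiting mass scales as µe^(-2), in agreement with Eq. (4.32).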

Even though we deduced the Chandrasekhar limit by using a polytropic model, the result is almost the

same when considering a more realistic equation of state, because, for extremely high densities, the pressure

of the electron gas converges to a pressure which is well described by a polytropic law with γ = 4/3, n = 3.

When we work with a more realistic, non-polytropic model, we find MCh = 1.44M⊙. Until now, no white

dwarf has been found with a mass that exceeds MCh. Chandrasekhar received the Nobel Prize in Physics

for his studies of white dwarfs.

4.4 Schematic representation of the relevant equations of state

In Figure 4.6 the different equations of state of the gas are shown in a (temperature-density)-diagram. Above the dotted line, the radiation pressure dominates. Below the solid line the electrons are degenerate, either in the relativistic limit (to the right of the dashed line) or in the non-relativistic limit (to the left of the dashed line). The thick dashed line shows the central conditions for a model representative of the Sun.

So far we have neglected the interaction between the ions. This is no longer justified for high densities and low temperatures, because their Coulomb interaction becomes important. Instead of moving around freely, the ions will, under the right circumstances, arrange themselves in an ordered lattice so that their energy is minimal. Using crystallization theory we can compute for which combinations of temperature and density these effects start to dominate. This crystallization area is marked by the dashed-dotted line in Figure 4.6. In the interior of stars that have not died yet, the densities are high but so are the temperatures. Therefore the crystallization area is not important for these stars. Cooling white dwarfs, however, end up in this area, since their density essentially remains constant but they move down in the HR diagram along the cooling track (see further on). In this way they obtain a crystallized core of carbon and oxygen. Cool white dwarfs are thus in fact giant diamonds in the sky; they are abundantly present in the Universe!

Chapter 5

Energy transport

The energy radiated by a star at its surface is created in the hot central parts. Thus, energy is transported

through the stellar material. This energy transport is possible thanks to the existence of a temperature

gradient. Depending on the local conditions, the transport is done by radiation, conduction or convection.

Ions, atoms and electrons are constantly exchanged between cooler and warmer regions while they interact

with photons. The temperature differences between surrounding layers determine how the energy transport

occurs. In this chapter, we discuss the equation that describes the energy transport. This equation is the next

one in the system of equations that describe the stellar structure.

5.1 Energy transport by radiation

We start off with a few rough estimates of crucial quantities that characterize radiative energy transport by

photons. This will enable us to simplify the formalism.

5.1.1 Mean free path

A first estimation concerns the length of the mean free path ℓf of a photon located at a certain point in the

star where the density is ρ:

$$\ell_f = \frac{1}{\kappa\rho}, \tag{5.1}$$

with κ the “mean” absorption coefficient or opacity, which is the microscopic radiative effective cross sec-

tion per unit mass, averaged over all frequencies. First, we clarify the meaning of an effective cross section

and the mean free path. These concepts are introduced in the general context of collision probabilities.

What is the condition for two particles to collide? When we consider two spherical particles A and B,

with respective radii $r_a$ and $r_b$, they collide when the distance d between their centres is equal to or smaller than the sum of the radii: $d \le r_a + r_b$. Equivalently, this condition can be expressed by requiring that the centre of the particle B (the projectile) must be within or on a circle centred around A with a radius $r = r_a + r_b$. Consequently, the collision can be regarded as the collision between a stationary particle with radius $r_a + r_b$ and a point-shaped incoming particle. The spherical stationary particle A can be further simplified to a disk (a target), perpendicular to the direction of motion of the incoming particle B. The surface of this disk is called the microscopic effective cross section, and is equal to $\kappa = \pi(r_a + r_b)^2$.

We now consider a slab-shaped plane-parallel stellar layer with dimensions $l \times l \times dx$, with thickness dx so small that the individual targets in the layer do not overlap along the direction parallel to dx. Furthermore, dx is oriented parallel to the direction of the incoming projectile. We assume that the density in the plane-parallel layer is ρ. In total, the layer contains a mass $\rho l^2 dx$ of targets. With κ the effective cross section per unit mass, these targets have a total joint effective cross section given by $\kappa\rho l^2 dx$. The impact probability of an incoming projectile is defined by the ratio of the surface covered by the targets to the total surface of the layer, and hence is $\kappa\rho l^2 dx/l^2 = \rho\kappa\,dx$. The product κρ is called the macroscopic effective cross section. This quantity has the dimension of a reciprocal length.

Let the probability of a collision for one incoming particle be p; then on average 1/p particles have to be directed at the plane-parallel layer to obtain one collision. In the case described above, an average of $1/(\rho\kappa\,dx)$ particles have to be sent into the plane-parallel layer to cause one collision over the distance dx. The mean distance a particle travels before it collides with a target then is $dx/(\rho\kappa\,dx) = 1/(\rho\kappa)$. This mean distance is called the mean free path. In the case that the projectiles are photons, we denote the mean free path as ℓf.

The opacity depends on the interaction between radiation and matter. Specifically, it depends on the

detailed distribution of atoms in the gas, the population of the energy levels, the ionisation states, and the

equation of state of the gas. The computation of κ is complex and requires intense efforts. This type of

work is taken up by various dedicated international research teams. These specialised activities result in the

publication of opacity tables, which describe the value of κ as a function of the density, the temperature,

and the chemical composition.

A few simple yet relevant approximations for the opacity exist. They provide a rough idea of the

dependence on the thermodynamical state of the gas described by ρ and T . Kramers’s approximation is the

best-known:

$$\kappa = \kappa_0\,\rho\,T^{-3.5}, \tag{5.2}$$

in which κ0 is a constant that depends on the chemical composition. This density- and temperature-

dependence of the opacity is appropriate in the stellar interior of low-mass stars, where the temperature

remains relatively low. In the core of massive stars, scattering by free electrons (Thomson scattering) dom-

inates the opacity, which makes it independent of the density and the temperature. The latter is also valid

when the gas is fully ionized. In this case, $\kappa_e = 0.2(1+X) \approx 0.4$ cm$^2$/g is a good approximation for the

opacity. It provides a lower limit for κ, since bound-bound transitions in partially ionised atoms are respon-

sible for a large fraction of the opacity. At temperatures below 6 000 to 10 000 K, the absorption of photons

by the negatively charged hydrogen atom (hydrogen atom with one additional electron, H−) dominates the

opacity. This situation occurs in the atmosphere of stars with a mass below one solar mass. The required

electrons are provided by the ionisation of metals. In this case, the opacity is proportional to the density of

H− and hence to the density of the electrons. Put differently, the opacity is set by the degree of ionisation,

and increases with increasing temperature, contrary to expression (5.2) which is valid in stellar interiors. As

a typical value for κ in a star, we can regard the case of ionised hydrogen in the stellar core: $\kappa \approx 1$ cm$^2$/g.

Taking the average density of the Sun, $\bar{\rho}_\odot = 3M_\odot/(4\pi R_\odot^3) = 1.4$ g/cm$^3$, and using $\kappa_e$, one obtains an upper limit for the mean free path of photons in the Sun: $\ell_f \approx 2$ cm! Photons hence experience many

interactions before they get from the location of their creation (the stellar core) to the stellar surface. This

means that, in general, stellar matter is very opaque. This is not valid anymore in the photosphere of a star

or in red (super)giants, where the mean free path of a photon is much larger.
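The estimate of the photon mean free path can be reproduced in a few lines. The sketch below uses Eq. (5.1) with the electron-scattering opacity quoted in the text; the solar mass and radius are nominal values that we assume, and the function name is ours.

```python
import math

M_SUN = 1.989e33    # solar mass [g] (assumed nominal value)
R_SUN = 6.957e10    # solar radius [cm] (assumed nominal value)
KAPPA_E = 0.4       # electron-scattering opacity [cm^2/g], from the text


def mean_free_path(kappa, rho):
    """Photon mean free path l_f = 1 / (kappa rho) of Eq. (5.1), in cm."""
    return 1.0 / (kappa * rho)


rho_mean = 3.0 * M_SUN / (4.0 * math.pi * R_SUN**3)   # mean solar density [g/cm^3]
ell_f = mean_free_path(KAPPA_E, rho_mean)
ratio = ell_f / R_SUN                                 # mean free path in units of R_sun

print(f"mean density   = {rho_mean:.2f} g/cm^3")
print(f"mean free path = {ell_f:.1f} cm")
print(f"l_f / R_sun    = {ratio:.1e}")
```

Since the actual opacity is larger than the electron-scattering value and the density in the deep interior exceeds the mean density, the true mean free path there is even shorter than this upper limit.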

5.1.2 The temperature gradient

A typical value for the temperature gradient in a star like the Sun can be obtained by comparing the core

(TC ≈ 10⁷ K) and surface temperature (TS ≈ 10⁴ K), with respect to the stellar dimensions:

$$\frac{\Delta T}{\Delta r} \approx \frac{T_C - T_S}{R_\odot} \approx 1.4\times10^{-4}\ \mathrm{K\,cm^{-1}}, \tag{5.3}$$

i.e., 14 K per kilometer.

Over the mean free path of a photon, the stellar interior is thus almost perfectly isothermal. The

difference in temperature over this length scale is only ΔT = ℓf(dT/dr) ≈ 3 × 10⁻⁴ K. The relative

temperature difference at a point with temperature T = 10⁷ K is therefore ΔT/T ∼ 3 × 10⁻¹¹. This

value shows that the physical state in the stellar interior must indeed be very close to thermal equilibrium,

and hence that the radiation can be very well approximated by that of a black body, for which the energy

density is proportional to T⁴. Hence, the relative anisotropy of the radiation, 4ΔT/T, is only ∼ 10⁻¹⁰. Despite

its extremely small value, this anisotropy is responsible for the outward transport of energy and thus for

the enormous luminosity of the star. A fraction of 10⁻¹⁰ of the flux radiated through a surface of 1 cm² of a

black body with a temperature of 10⁷ K is still a factor 1000 larger than the flux emitted at the solar

surface!
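These order-of-magnitude numbers are easy to check. The sketch below assumes the rounded values used in the text (TC = 10⁷ K, TS = 10⁴ K, ℓf = 2 cm) and, as an additional assumption not stated in the text, a solar effective temperature of 5772 K for the surface flux:

```python
SIGMA_SB = 5.6704e-5    # Stefan-Boltzmann constant, erg cm^-2 s^-1 K^-4
R_SUN = 6.957e10        # cm (standard value, assumed)
T_CORE = 1.0e7          # K, rounded core temperature from the text
T_SURF = 1.0e4          # K, rounded surface temperature from the text
T_EFF_SUN = 5772.0      # K, solar effective temperature (assumption)
ELL_F = 2.0             # cm, photon mean free path from the text

grad_T = (T_CORE - T_SURF) / R_SUN            # mean gradient, K/cm
delta_T = ELL_F * grad_T                      # temperature change over one mean free path
anisotropy = 4.0 * delta_T / T_CORE           # u ~ T^4  =>  du/u = 4 dT/T

flux_net = anisotropy * SIGMA_SB * T_CORE**4  # net flux carried by the anisotropy
flux_surface = SIGMA_SB * T_EFF_SUN**4        # flux emitted at the solar surface
ratio = flux_net / flux_surface

print(f"gradient         : {grad_T:.2e} K/cm = {grad_T * 1e5:.1f} K/km")
print(f"delta T over l_f : {delta_T:.1e} K")
print(f"anisotropy       : {anisotropy:.1e}")
print(f"flux ratio       : {ratio:.0f}")
```

The ratio comes out at the level of 10³, consistent with the factor 1000 quoted above.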

5.1.3 The diffusion approximation

Radiative energy transport in a star occurs because there is more outward radiation (coming from the hot mat-

ter close to the core) than inward radiation (coming from the cooler outer layers). The estimations described

above show that the mean free path of the “transporting particles” (photons) is extremely small with respect

to the characteristic length scale over which the transport occurs (i.e., the stellar radius): ℓf/R⊙ ≈ 3 × 10⁻¹¹.

In such a case, the energy transport can be treated as a diffusion process, which implies a significant simpli-

fication of the formalism. We repeat that this approximation is not valid in the photosphere of a star.

General description

First, we recall the diffusion equation as often discussed in physics. In general, the diffusive flux $\vec{f}$ of

particles per unit surface, per unit time, averaged over all frequencies, between areas with different particle

densities n (expressed per unit volume), is given by

$$\vec{f} = -D\,\vec{\nabla} n. \tag{5.4}$$

Here, the diffusion coefficient D is set by the velocity v of the particles and their mean free path ℓd:

$$D = \frac{1}{3}\,v\,\ell_d. \tag{5.5}$$

This form of the diffusion equation is general. Below, we recall how it is derived.

Consider a layer of gas, in which the motion of the particles happens in one direction, e.g., along the

x-axis. We want to determine the stream of particles through a fictitious plane perpendicular to the x-axis.

The number of particles per unit volume to the left of the plane is denoted by n−, the particle density to the

right of the plane by n+. To be able to travel through the plane in a time interval Δt, the particles must

initially be closer to the plane than a distance vx Δt, where vx is their velocity in the x-direction. We assume

random motion of the particles. Hence, half of the particles within a distance vx Δt will move towards the plane,

the other half will move away from the plane. The resulting stream of particles through the plane per unit

time is:

$$f_x = \frac{n_-\,v_x\,\Delta t}{2\,\Delta t} - \frac{n_+\,v_x\,\Delta t}{2\,\Delta t} = \frac{(n_- - n_+)\,v_x}{2}. \tag{5.6}$$

Each of the particles can travel a distance ℓx before interacting with another particle. Hence, we can connect

the difference in particle density left and right of the plane to the mean free path:

$$n_+ - n_- = \frac{dn}{dx}\,\Delta x = \frac{dn}{dx}\,2\ell_x. \tag{5.7}$$

The flux in the x-direction becomes:

$$f_x = -\ell_x\,v_x\,\frac{dn}{dx}. \tag{5.8}$$

We assume that there is no preferential direction. In that case, the average velocity of the particles is equally

large in the three spatial directions. Hence, the velocity in the x-direction is vx ≃ v/√3. A similar reasoning

applies for the mean free path: ℓx = ℓ/√3. We find

$$f_x = -\frac{1}{3}\,\ell\,v\,\frac{dn}{dx}. \tag{5.9}$$

The generalisation of this equation to three dimensions gives Eqs (5.4) and (5.5).

Application to stellar gas

To obtain the corresponding radiative energy flux $\vec{f}$ in a star, averaged over all frequencies, we replace n by the energy density of a black body (this time per unit of volume, to be able to directly make use of the

diffusion equation), u = aT⁴. The velocity v is replaced by the speed of light c, and ℓd by ℓf given in (5.1).

Because of the spherical symmetry of the star, $\vec{f}$ only has a radial component fr = |$\vec{f}$| = f, and $\vec{\nabla}u$ reduces

to a derivative in the radial direction:

$$\frac{\partial u}{\partial r} = 4aT^3\,\frac{\partial T}{\partial r}. \tag{5.10}$$

Inserting this into the diffusion equation (5.4) with D = cℓf/3, and using ℓf = 1/(κρ) because the opacity κ

is expressed per unit mass, we obtain:

$$f = -\frac{4ac}{3}\,\frac{T^3}{\kappa\rho}\,\frac{\partial T}{\partial r}. \tag{5.11}$$

This equation can be formally considered as an equation describing heat conduction by writing it as

$$\vec{f} = -k_{\rm rad}\,\vec{\nabla}T, \tag{5.12}$$

with

$$k_{\rm rad} \equiv \frac{4ac}{3}\,\frac{T^3}{\kappa\rho} \tag{5.13}$$

the conduction coefficient for radiative transport. When we solve equation (5.11) for the temperature gradient

and replace f by the local luminosity l = 4πr²f, we obtain

$$\frac{\partial T}{\partial r} = -\frac{3}{16\pi ac}\,\frac{\kappa\rho\,l}{r^2T^3}. \tag{5.14}$$

Finally, after transformation to the independent variable m (using ∂r/∂m = 1/(4πr²ρ)), we obtain the basic equation for radiative energy

transport:

$$\frac{\partial T}{\partial m} = -\frac{3}{64\pi^2 ac}\,\frac{\kappa\,l}{r^4T^3}. \tag{5.15}$$

This equation is called the Eddington equation of energy transport through radiation.
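As a consistency check on (5.14) and (5.15), the sketch below evaluates both gradients and verifies that they differ by the factor 4πr²ρ. The local conditions (κ, ρ, l, r, T) are purely hypothetical illustration values, not taken from a solar model:

```python
import math

A_RAD = 7.5657e-15        # radiation density constant a, erg cm^-3 K^-4
C_LIGHT = 2.99792458e10   # speed of light c, cm/s

def dT_dr(kappa, rho, l, r, T):
    """Radiative temperature gradient with respect to radius, Eq. (5.14), in K/cm."""
    return -3.0 * kappa * rho * l / (16.0 * math.pi * A_RAD * C_LIGHT * r**2 * T**3)

def dT_dm(kappa, l, r, T):
    """Radiative temperature gradient with respect to mass, Eq. (5.15), in K/g."""
    return -3.0 * kappa * l / (64.0 * math.pi**2 * A_RAD * C_LIGHT * r**4 * T**3)

# Hypothetical local conditions (cgs), chosen only for illustration.
kappa = 1.0        # cm^2/g
rho = 1.4          # g/cm^3
l = 3.828e33       # erg/s
r = 3.5e10         # cm
T = 4.0e6          # K

grad_r = dT_dr(kappa, rho, l, r, T)
grad_m = dT_dm(kappa, l, r, T)

# The two forms are linked by dm = 4 pi r^2 rho dr.
grad_m_from_r = grad_r / (4.0 * math.pi * r**2 * rho)

print(f"dT/dr = {grad_r:.3e} K/cm")
print(f"dT/dm = {grad_m:.3e} K/g")
print("consistent:", math.isclose(grad_m, grad_m_from_r, rel_tol=1e-12))
```

Both gradients are negative, as expected for a temperature that decreases outward while l > 0.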

We stress that this simple approximation is not valid close to the stellar surface. Indeed, due to the low

densities, the mean free path of the photons there becomes comparable to the remaining distance they have to

travel to reach the stellar surface. Hence the diffusion approximation breaks down in the stellar atmosphere,

and a much more complicated differential equation needs to be solved to describe the energy transport. In the

current course, we limit ourselves to the region in the star where the diffusion approximation is justified. For

a description of the energy transport in the stellar atmosphere, we refer to the courses Radiative Processes

in Astronomy and Stellar Atmospheres of the Master of Astronomy & Astrophysics at KU Leuven.

5.1.4 The Rosseland mean opacity

The equations described above are independent of the frequency ν because f, l and κ are defined as “av-

erages” over all frequencies. Here we discuss a useful and appropriate method to determine this average

opacity κ. We indicate the dependence of κ on the frequency ν by adding the lower index ν. We do the

same for all relevant frequency-dependent quantities κν, ℓν, Dν, uν and so on. The diffusive radiation flux $\vec{f}_\nu$ in the frequency interval [ν, ν + dν] can be described as

$$\vec{f}_\nu = -D_\nu\,\vec{\nabla}u_\nu \quad\text{with}\quad D_\nu = \frac{1}{3}\,c\,\ell_\nu = \frac{c}{3\kappa_\nu\rho}. \tag{5.16}$$

The energy density in the frequency interval [ν, ν + dν] is given by

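$$u_\nu(T) = \frac{4\pi}{c}\,B_\nu(T) = \frac{8\pi h}{c^3}\,\frac{\nu^3}{\exp(h\nu/kT) - 1}, \tag{5.17}$$

in which Bν(T) and uν(T) are the Planck functions for the intensity and the energy density of a black body,

respectively (see Appendix A). Hence, we find

$$\vec{\nabla}u_\nu = \frac{4\pi}{c}\,\frac{\partial B}{\partial T}\,\vec{\nabla}T. \tag{5.18}$$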

The latter equation yields, together with (5.16), the following expression for the total, frequency-integrated

flux ~f :

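$$\vec{f} = \int_0^\infty \vec{f}_\nu\,d\nu = -\left(\frac{4\pi}{3\rho}\int_0^\infty \frac{1}{\kappa_\nu}\,\frac{\partial B}{\partial T}\,d\nu\right)\vec{\nabla}T. \tag{5.19}$$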

The equation has the same form as (5.12), but now with

$$k_{\rm rad} = \frac{4\pi}{3\rho}\int_0^\infty \frac{1}{\kappa_\nu}\,\frac{\partial B}{\partial T}\,d\nu. \tag{5.20}$$

If we compare this expression for krad to the one given in (5.13), we obtain a useful method to average the

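absorption coefficients:

$$\frac{1}{\kappa} \equiv \frac{\pi}{acT^3}\int_0^\infty \frac{1}{\kappa_\nu}\,\frac{\partial B}{\partial T}\,d\nu. \tag{5.21}$$

This is the so-called Rosseland mean opacity. Given that

$$\int_0^\infty \frac{\partial B}{\partial T}\,d\nu = \frac{acT^3}{\pi}, \tag{5.22}$$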

the Rosseland mean opacity is a harmonic average with weight functions ∂B/∂T . It is simple to compute

once the function κν is known in the form of opacity tables, as described above.

To derive the physical interpretation of the Rosseland mean opacity, we rewrite ~fν = −Dν~∇uν using

the expressions (5.16), (5.17) and (5.18):

$$\vec{f}_\nu = -\left(\frac{1}{\kappa_\nu}\,\frac{\partial B}{\partial T}\right)\frac{4\pi}{3\rho}\,\vec{\nabla}T. \tag{5.23}$$

This result shows that, for a given point in the star (given ρ and ~∇T ), the integrand in expression (5.21) is

proportional to the net energy flux ~fν for all frequencies. The Rosseland mean is hence created in such a

way that the highest weight is given to frequencies with maximal energy flux.

A downside of the Rosseland mean is that the opacity κ of a mixture of two different gases with

opacities κ1 and κ2, is not equal to the sum of the individual opacities: κ ≠ κ1 + κ2. Therefore, it is

not sufficient to know the Rosseland mean for the two different gases that both occur in the gas mixture,

to be able to determine the Rosseland mean of the mixture. Suppose, for example, that the gas contains

a hydrogen fraction X and a helium fraction Y , then the Rosseland mean opacity must be computed for

κν = Xκν(H) + Y κν(He). Each time the abundance Y/X changes, κν has to be recomputed before the

Rosseland opacity can be evaluated using expression (5.21).
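The harmonic nature of the Rosseland mean, and its non-additivity, can be illustrated numerically. The sketch below works in the dimensionless variable x = hν/kT, in which the weight ∂B/∂T is proportional to x⁴eˣ/(eˣ − 1)²; the integral of this weight is 4π⁴/15, the dimensionless counterpart of (5.22). The two monochromatic opacities (a grey one and a ν⁻³ law) are hypothetical toy functions chosen for illustration, not real opacity tables:

```python
import math

def weight(x):
    """Dimensionless Rosseland weight x^4 e^x / (e^x - 1)^2, with x = h nu / (k T)."""
    ex = math.exp(-x)
    return x**4 * ex / (1.0 - ex)**2   # same expression, written to avoid overflow

def integrate(func, x_max=60.0, n=60000):
    """Midpoint rule on (0, x_max); the weight is negligible beyond x ~ 60."""
    h = x_max / n
    return h * sum(func((i + 0.5) * h) for i in range(n))

def rosseland_mean(kappa_of_x):
    """Harmonic mean of kappa_nu weighted by dB/dT, following Eq. (5.21)."""
    numerator = integrate(weight)
    denominator = integrate(lambda x: weight(x) / kappa_of_x(x))
    return numerator / denominator

# Toy monochromatic opacities in cm^2/g (hypothetical).
def kappa_grey(x):
    return 0.4

def kappa_power(x):
    return 10.0 * x**-3

def kappa_mix(x):
    return kappa_grey(x) + kappa_power(x)

norm = integrate(weight)
k_grey = rosseland_mean(kappa_grey)
k_power = rosseland_mean(kappa_power)
k_mix = rosseland_mean(kappa_mix)

print(f"weight normalisation : {norm:.4f} (4 pi^4 / 15 = {4 * math.pi**4 / 15:.4f})")
print(f"grey opacity         : {k_grey:.4f}")
print(f"power-law opacity    : {k_power:.4f}")
print(f"mixture              : {k_mix:.4f}")
print(f"sum of the two means : {k_grey + k_power:.4f}")
```

For these toy functions the mean of the mixture exceeds the sum of the individual means, which is why the monochromatic opacities have to be added before the frequency average is taken.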

Until now, we have assumed that the energy flux is only the result of a diffusion process in which

photons take part. In the following sections, we will discuss two other ways of energy transport. Therefore,

we will from now on indicate all quantities that relate to radiative energy transfer with a lower index “rad”,

e.g. κrad, $\vec{f}_{\rm rad}$, etc.

5.2 Energy transport by conduction

Energy transport via heat conduction occurs through collisions induced by the thermal motion of particles

such as electrons and atomic nuclei in ionised matter, and atoms and molecules in non-ionised matter. In

“common” stellar material, conduction is not an important energy transport mechanism. Although the effective

cross section for collisions of particles is relatively low in the stellar interior (approximately 10⁻²⁰ cm²

per particle), the high density implies that the mean free path is many orders of magnitude smaller than that

of photons. Moreover, the velocity of the particles is only a fraction of the speed of light c. As a result, the

diffusion coefficient D is much smaller than the one for radiative transport via photons.

This situation changes, however, when considering the stellar cores of evolved stars in which the electron

gas is degenerate. The density in a degenerate electron gas is enormous: typically 10⁶ g cm⁻³, but, on

the other hand, the velocities attained by the electrons are a significant fraction of c. The degeneracy increases

the mean free path significantly. As a result, the diffusion coefficient becomes large, and heat conduction

becomes an important energy transport mechanism, that dominates over the radiative transport.

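The energy flux caused by heat conduction, $\vec{f}_{\rm cd}$, can also be described by the diffusion formula $\vec{f}_{\rm cd} = -k_{\rm cd}\vec{\nabla}T$. The sum of the radiative and conductive flux can be written as

$$\vec{f} = \vec{f}_{\rm rad} + \vec{f}_{\rm cd} = -\left(k_{\rm rad} + k_{\rm cd}\right)\vec{\nabla}T. \tag{5.24}$$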

Similar to (5.13), we can formally write the conduction coefficient kcd as

$$k_{\rm cd} = \frac{4ac}{3}\,\frac{T^3}{\kappa_{\rm cd}\,\rho}, \tag{5.25}$$

in which we have introduced the conductive opacity κcd. The total energy flux becomes

$$\vec{f} = -\frac{4ac}{3}\,\frac{T^3}{\rho}\left(\frac{1}{\kappa_{\rm rad}} + \frac{1}{\kappa_{\rm cd}}\right)\vec{\nabla}T. \tag{5.26}$$

This equation shows that we can formally obtain the same equation as the one we obtained in the purely

radiative case (5.11), if we replace 1/κ by 1/κrad + 1/κcd. The transport mechanism that dominates the

sum, is the one which has the highest “transparency”.
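The way the two opacities combine is the same as for resistors in parallel. A minimal sketch, with hypothetical opacity values chosen only to contrast the two regimes described above:

```python
def combined_opacity(kappa_rad, kappa_cd):
    """Effective opacity from 1/kappa = 1/kappa_rad + 1/kappa_cd (see Eq. 5.26)."""
    return 1.0 / (1.0 / kappa_rad + 1.0 / kappa_cd)

# Hypothetical values in cm^2/g, for illustration only.
cases = {
    "ordinary stellar matter (conduction inefficient)": (1.0, 1.0e4),
    "degenerate core (conduction efficient)": (1.0, 1.0e-3),
}

for label, (kappa_rad, kappa_cd) in cases.items():
    kappa = combined_opacity(kappa_rad, kappa_cd)
    dominant = "radiation" if kappa_rad < kappa_cd else "conduction"
    print(f"{label}: kappa = {kappa:.4g} cm^2/g, dominated by {dominant}")
```

The combined opacity is always smaller than the smaller of the two, so the more "transparent" channel sets the total.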

Equation (5.15) with adapted κ is now valid for radiative and conductive transport. We reformulate

the equation in a form which will prove to be convenient later on. Under the assumption of hydrostatic

equilibrium, we divide (5.15) by (3.17) and obtain

$$\frac{(\partial T/\partial m)}{(\partial P/\partial m)} = \frac{3}{16\pi acG}\,\frac{\kappa\,l}{m\,T^3}. \tag{5.27}$$

We define the ratio between the partial derivatives on the left-hand side as (dT/dP )rad: the variation of

T with depth, where depth is expressed in terms of pressure (the pressure is a monotonically increasing

function towards the stellar center). For a star in hydrostatic equilibrium that transports energy via radiation

and conduction, (dT/dP )rad has the meaning of a gradient that describes the temperature variation with

depth. Using the common abbreviation

$$\nabla_{\rm rad} \equiv \left(\frac{d\ln T}{d\ln P}\right)_{\rm rad}, \tag{5.28}$$

we obtain for (5.27)

$$\nabla_{\rm rad} = \frac{3}{16\pi acG}\,\frac{\kappa\,l\,P}{m\,T^4}, \tag{5.29}$$

in which κ refers to the combined opacity of radiative and conductive transport. ∇rad is called the radiative

temperature gradient. It is the local logarithmic derivative of the temperature with respect to the pressure

that would be necessary if the entire luminosity had to be transported through radiation only.
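Expression (5.29) is a purely local quantity and can be evaluated directly. The following sketch uses hypothetical values for κ, l, P, m and T (not taken from a stellar model) simply to show the evaluation and the linear dependence on κ and l:

```python
import math

A_RAD = 7.5657e-15        # radiation density constant a, erg cm^-3 K^-4
C_LIGHT = 2.99792458e10   # speed of light c, cm/s
G_GRAV = 6.674e-8         # gravitational constant G, cm^3 g^-1 s^-2

def nabla_rad(kappa, l, P, m, T):
    """Radiative temperature gradient d ln T / d ln P, Eq. (5.29)."""
    return 3.0 * kappa * l * P / (16.0 * math.pi * A_RAD * C_LIGHT * G_GRAV * m * T**4)

# Hypothetical local conditions (cgs), for illustration only.
kappa = 1.0        # cm^2/g, combined radiative + conductive opacity
l = 3.828e33       # erg/s, local luminosity
P = 1.0e15         # dyn/cm^2
m = 1.0e33         # g, mass enclosed
T = 4.0e6          # K

value = nabla_rad(kappa, l, P, m, T)
print(f"nabla_rad = {value:.4f}")
print(f"doubling kappa gives {nabla_rad(2 * kappa, l, P, m, T):.4f}")
```

A larger opacity or a larger l/m steepens the gradient that radiation alone would require.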

We note that ∇rad and ∇ad are defined differently, and have, apart from different numerical values, a

different physical meaning. ∇rad describes a local derivative that connects P and T in two neighbouring

fluid elements, while ∇ad is a thermodynamical derivative, that describes the thermal variation of a single

fluid element during adiabatic expansion/compression.

Again we define a characteristic time scale, this time based on equation (5.29): the thermal time scale

or time scale for thermal adaptation τth. One can show that τth ≈ τHK when considering the value of

these time scales averaged over the entire star. This means that the Helmholtz-Kelvin time scale can be

interpreted as the period of time that a thermal fluctuation needs to travel from the stellar core to the stellar

surface. Despite the equivalence between the two time scales, it is best to use them separately. In most cases,

the Helmholtz-Kelvin time scale is used to describe the entire star, while the time scale of thermal adaptation

is often used for specific local layers in the star, for which its value can differ strongly from layer to layer.

5.3 Stability analysis

Until now, we have assumed strict spherical symmetry. We hence assume that all functions are constant

over concentric spheres. In practice, small fluctuations occur, e.g. the thermal motion of gas particles. Such

local disturbances can be neglected, under the condition that they never grow to macroscopic, non-spherical,

local motions. This means that we are allowed to maintain spherical symmetry in the basic equations if we

consider the variables as accurate average values over the concentric spheres.

Such macroscopic motions, however, can have a large impact on the stellar structure. They can “mix”

stellar material, and moreover transport energy. The latter is because hot fluid elements will rise, while cool

fluid elements will sink. This energy transport mechanism is called convection. Whether or not convection

occurs in a certain stellar layer depends on whether small fluctuations remain small, or are able to grow

and become larger. In other words, it is a question of stability. Therefore, we will first derive criteria for

stability with respect to local, non-spherically symmetric disturbances, before we discuss convective energy

transport.

5.3.1 Dynamical instability

The starting point for the discussion of dynamical instability is the assumption that moving fluid elements do

not have a sufficient amount of time to exchange a substantial fraction of their heat with their surroundings.

In other words, these elements move in an adiabatic¹ way. Consider the situation where physical quantities

such as temperature, density, etc. are not constant at the edge of a concentric sphere inside a star, but

have small local fluctuations. In our treatment of the global stellar structure, we assumed that the physical

quantities derived in the previous sections are good averages over the concentric spheres.

For a local description we will represent a fluctuation by considering a fluid element (with lower index

“e”) in which the physical quantities are slightly different than those in the surroundings (lower index “s”)

of the element. For a quantity A, we define the difference DA between the element and the surroundings

as DA ≡ Ae −As. Assume there is a small temperature fluctuation, e.g. the fluid element is slightly hotter

than the surroundings with DT > 0. At first instance, one would then also expect an excess in pressure

DP . However, the fluid element will expand to restore the pressure equilibrium with the surroundings. This

expansion occurs at the speed of sound, i.e., much faster than any other possible motion of the element.

Hence, we can assume that the pressure in the element is always in equilibrium with that of the surround-

ings: DP = 0. In other words, we assume that the fluid element and its surroundings are in hydrostatic

equilibrium.

In case of an ideal gas with ρ ∼ P/T, the excess in temperature DT > 0 leads to Dρ < 0. The element becomes

less dense (“lighter”) than its surroundings and will start to feel a buoyancy force (principle of Archimedes),

given by −g Dρ per unit volume, which lifts up the element. Temperature fluctuations hence lead to element movements in

the radial direction. To test the stability of a layer with respect to local temperature fluctuations, one can

thus equivalently take a radial displacement Δr > 0 as the initial perturbation of the element.

Consider a fluid element that is in equilibrium with its surroundings at its original position r, but that

is lifted by a perturbation to a position r + Δr (see Figure 5.1). The density difference between the element

and its surroundings at location r + Δr is

$$D\rho = \left[\left(\frac{d\rho}{dr}\right)_e - \left(\frac{d\rho}{dr}\right)_s\right]\Delta r, \tag{5.30}$$

in which (dρ/dr)e represents the change in density of the element due to its rise. The other derivative has

a similar meaning: it is the density gradient of the surroundings. Dρ induces a radial component Kr = −g Dρ/ρ of the force $\vec{K}$ per unit mass. This is the so-called buoyancy force of Archimedes. When Dρ < 0,

the element is less dense (“lighter”) than its new surroundings at r + Δr, and Kr > 0. This means that the

force $\vec{K}$ is pointing outward. This situation is unstable, because the element will be lifted up even further

away from its initial location r. On the other hand, Kr < 0 when Dρ > 0. In this case, $\vec{K}$ points inward.

The element is “heavier” than the other elements in its new surroundings, and the element will be pulled

back down. The equilibrium is restored, and the layer remains stable. The condition for stability hence reads

$$\left(\frac{d\rho}{dr}\right)_e - \left(\frac{d\rho}{dr}\right)_s > 0. \tag{5.31}$$

¹ When fluid elements do have enough time to exchange a substantial fraction of their heat with their surroundings, they move

diabatically. In astronomy, however, the word “non-adiabatic” is used for the latter movement, which is in fact a double negation.

Figure 5.1: The fluid element “e”, initially at position r in the gas with local circumstances T, P, ρ, is lifted

due to a fluctuation with respect to its surroundings “s” to a position r + Δr, where the circumstances are

T + ΔT, P + ΔP, ρ + Δρ.

In practice, however, it is difficult to apply this criterion because it is based on the knowledge of the density

gradient, a quantity that does not appear in the basic equations of the stellar structure. It would be more

convenient if we could derive a criterion based on the temperature gradient, because the latter appears in the

equation that describes radiative and conductive energy transport.

To evaluate (dρ/dr)e correctly, one has to determine, in principle, the energy exchange between the

element and its surroundings. Here, we make the approximation that there is no heat exchange, i.e., the

element moves adiabatically. For areas in the deep stellar interior, this is a good approximation. To convert

the derivative of the density into a derivative of the temperature, we consider the equation of state ρ = ρ(P, T, µ) in its differential form:

$$\frac{d\rho}{\rho} = \alpha\,\frac{dP}{P} - \delta\,\frac{dT}{T} + \phi\,\frac{d\mu}{\mu}. \tag{5.32}$$

The quantities α and δ were defined previously. In expression (5.32), we allow for a change in chemical

composition, which was characterised by the molecular weight µ. We assume that dµ = 0 for the element

that carries its chemical composition with it. However, it is possible that dµ ≠ 0 for the surroundings when

the element arrives in a layer with a different chemical composition than the layer at its initial location. In

analogy to α and δ, which are to be evaluated for constant T, µ and P, µ, respectively, we define:

$$\phi \equiv \left(\frac{\partial\ln\rho}{\partial\ln\mu}\right)_{P,T}. \tag{5.33}$$

For an ideal mono-atomic gas, we have ρ ∼ Pµ/T and hence α = δ = ϕ = 1.

Using (5.32), the stability criterion (5.31) can now be written as

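$$\left(\frac{\alpha}{P}\frac{dP}{dr}\right)_e - \left(\frac{\delta}{T}\frac{dT}{dr}\right)_e - \left(\frac{\alpha}{P}\frac{dP}{dr}\right)_s + \left(\frac{\delta}{T}\frac{dT}{dr}\right)_s - \left(\frac{\phi}{\mu}\frac{d\mu}{dr}\right)_s > 0. \tag{5.34}$$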

The sum of the two terms that contain the pressure gradient is zero because of the assumption DP = 0.

We define the pressure scale height HP :

$$H_P \equiv -\frac{dr}{d\ln P} = -P\,\frac{dr}{dP}. \tag{5.35}$$

The local pressure scale height is a length scale. It is the distance over which the pressure drops by a factor

e (the base of the natural logarithm). Because P decreases with increasing r, HP > 0. Expressed in terms

of HP, the condition for hydrostatic equilibrium is: HP = P/(ρg). Typical values are HP ≃ 1.4 × 10⁷ cm in

the photosphere of the Sun, and approximately 5.2 × 10⁹ cm at a depth of R⊙/2.
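The photospheric value can be recovered from HP = P/(ρg) combined with the ideal gas law, which gives HP = kT/(µ m_u g). The sketch below assumes a photospheric temperature of 5772 K and a mean molecular weight µ ≈ 1.3 for largely neutral gas; both are assumptions made here, not values stated in the text:

```python
K_B = 1.380649e-16      # Boltzmann constant, erg/K
M_U = 1.66053907e-24    # atomic mass unit, g
G_GRAV = 6.674e-8       # gravitational constant, cm^3 g^-1 s^-2
M_SUN = 1.989e33        # g (standard value, assumed)
R_SUN = 6.957e10        # cm (standard value, assumed)

def pressure_scale_height(T, mu, g):
    """H_P = P / (rho g) for an ideal gas, i.e. k T / (mu m_u g), in cm."""
    return K_B * T / (mu * M_U * g)

g_surface = G_GRAV * M_SUN / R_SUN**2   # surface gravity of the Sun, cm/s^2
T_PHOT = 5772.0                         # K, assumed photospheric temperature
MU_PHOT = 1.3                           # assumed mean molecular weight (neutral gas)

H_P = pressure_scale_height(T_PHOT, MU_PHOT, g_surface)
print(f"surface gravity       : {g_surface:.3e} cm/s^2")
print(f"pressure scale height : {H_P:.2e} cm = {H_P / 1e5:.0f} km")
```

The result is of order 10⁷ cm, i.e. a few hundred kilometres, in agreement with the photospheric value quoted above.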

When multiplying all terms of (5.34) with HP > 0, and taking into account δ > 0, the stability criterion

becomes

$$\left(\frac{d\ln T}{d\ln P}\right)_s < \left(\frac{d\ln T}{d\ln P}\right)_e + \frac{\phi}{\delta}\left(\frac{d\ln\mu}{d\ln P}\right)_s. \tag{5.36}$$

In analogy to the quantities ∇rad and ∇ad, we define three new derivatives:

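$$\nabla \equiv \left(\frac{d\ln T}{d\ln P}\right)_s, \qquad \nabla_e \equiv \left(\frac{d\ln T}{d\ln P}\right)_e, \qquad \nabla_\mu \equiv \left(\frac{d\ln\mu}{d\ln P}\right)_s. \tag{5.37}$$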

∇ and ∇µ are spatial derivatives, to be evaluated in the new surroundings of the fluid element. They describe

the variation of T and µ with depth, in which P is a probe for that depth. ∇e describes the variation of T in the element during its motion. Also here, the position of the element is expressed in terms of pressure

P . ∇e and ∇ad are defined in a similar way, because both describe the temperature variation of the gas in

the fluid element, when it experiences a change in pressure. On the other hand, ∇rad and ∇µ describe the

spatial variation of T and µ in the surroundings. When ∇ = ∇rad, the energy transport is fully done by

radiation (and conduction). However, when ∇ < ∇rad, a part of the transport is done by convection.

The condition for stability becomes:

$$\nabla < \nabla_e + \frac{\phi}{\delta}\,\nabla_\mu. \tag{5.38}$$

In a layer where the energy transport is uniquely done by radiation, we have ∇ = ∇rad. We now investigate

the stability of such a layer assuming that fluid elements move adiabatically (∇e = ∇ad). The condition for

stability now reads

$$\nabla_{\rm rad} < \nabla_{\rm ad} + \frac{\phi}{\delta}\,\nabla_\mu. \tag{5.39}$$

This stability criterion is known as the criterion of Ledoux for dynamical stability.²

In a region in the star with a homogeneous chemical composition, we deduce the Schwarzschild crite-

rion for dynamical stability

∇rad < ∇ad. (5.40)

When, in both criteria, the left-hand side is larger than the right-hand side, the layer is dynamically unstable.

This means that the energy transport via radiation would impose too large a temperature gradient, and hence

a switch to convection is needed to carry away the energy. When both sides in the equation are equal, there

is marginal stability. The difference between the two criteria is only important in layers where the chemical

composition changes in the radial direction. This occurs in layers close to the core of evolved stars, where

the heavy chemical elements are produced deeper in the core than the light elements, and hence µ changes

strongly going inward. The last term on the right-hand side in the Ledoux criterion has a stabilising effect,

because a mass element with heavier matter will be lifted up to a surrounding with lighter material. The

buoyancy force will push the heavier fluid element back down, to its initial position. When the criteria

of Ledoux or Schwarzschild are met, the energy transport is done exclusively by radiation, and one has

∇ = ∇rad.
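The two criteria differ only through the µ-gradient term, which the following minimal sketch makes explicit. The function names and the sample gradients are hypothetical; ∇ad = 0.4 corresponds to an ideal mono-atomic gas, and ϕ = δ = 1 is assumed by default:

```python
def schwarzschild_stable(nabla_rad, nabla_ad):
    """Schwarzschild criterion, Eq. (5.40): stable if nabla_rad < nabla_ad."""
    return nabla_rad < nabla_ad

def ledoux_stable(nabla_rad, nabla_ad, nabla_mu, phi=1.0, delta=1.0):
    """Ledoux criterion, Eq. (5.39): stable if nabla_rad < nabla_ad + (phi/delta) nabla_mu."""
    return nabla_rad < nabla_ad + (phi / delta) * nabla_mu

NABLA_AD = 0.4  # ideal mono-atomic gas

# Hypothetical layers: (nabla_rad, nabla_mu)
layers = {
    "homogeneous, shallow gradient": (0.25, 0.0),
    "homogeneous, steep gradient": (0.60, 0.0),
    "steep gradient with mu-gradient": (0.50, 0.2),
}

for name, (n_rad, n_mu) in layers.items():
    s = schwarzschild_stable(n_rad, NABLA_AD)
    l = ledoux_stable(n_rad, NABLA_AD, n_mu)
    print(f"{name}: Schwarzschild stable = {s}, Ledoux stable = {l}")
```

In a chemically homogeneous layer (∇µ = 0) the two functions always agree; only the third layer, with a stabilising µ-gradient, is judged differently by the two criteria.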

Convective motion only occurs in a star when the criteria of Ledoux or Schwarzschild are not fulfilled.

This happens when:

• l(r)/m(r) is large, i.e., when the energy production within a radius r is very large. This occurs in

massive stars, and they therefore have a convective core.

• the opacity κ is large. This occurs in (the outer layers of) stars with low surface temperatures.

• ∇ad is small. This occurs mostly in partial ionisation zones of hydrogen, in the outer layers of cool

stars, because cP becomes very large there (the absorbed heat is mostly used to further ionise the

matter, not to heat it).

In this case, small perturbations will grow to a large amplitude until the whole region “boils”, with convective

spheres carrying away a part of the energy. The convective energy transport must be treated as described in

the next section. Convection occurs in the inner regions of high-mass stars and in the outer layers of cool

stars. The different temperature gradients in the current Sun are shown in Figure 5.2. As mentioned before,

∇rad becomes very large in the outer layers of the Sun due to the strong increase in opacity. Moreover, ∇ad

drops strongly below 2/5 in the ionisation zones of hydrogen and helium. In the regions which are stable

with respect to convection, ∇ equals ∇rad and the energy transport is fully radiative. In almost the entire

convective zone, ∇ is only a bit larger than ∇ad, except in a very thin layer at the upper part of the convective

envelope. Figure 5.3 compares the temperature gradients in a sun-like star to those of a star of 4 M⊙.

² Named after the University of Liège astrophysicist Prof. Paul Ledoux.

Figure 5.2: The temperature gradients in the current Sun. The full line shows the effective

temperature gradient ∇. The dashed line is the adiabatic temperature gradient ∇ad. In the radiative region,

which covers r ≤ 0.72 R⊙, ∇ = ∇rad. In the convective envelope, ∇ is almost perfectly equal to

∇ad and the full and dashed lines coincide, except very near the surface, where radiation has no problem

escaping the star efficiently. (Figure courtesy of Prof. J. Christensen-Dalsgaard, Aarhus University, DK)

We note that the criteria for stability are local criteria. Hence they can be evaluated easily for a specific

layer when the local quantities P, T and ρ are known, without further knowledge of the other parts of the

star. On the other hand, it is clear that convective motions do not only depend on local forces (as assumed

when deriving the criteria). These convective motions can influence the entire stellar structure, because in

reality each layer is coupled to all neighbouring layers via the basic equations. For certain purposes, the “reaction”

of the whole star to convection needs to be considered. An example is the precise determination of the

boundaries of the convective zone, where fluid elements that were accelerated elsewhere “overshoot” until

their motion is stopped. It is still unclear how important this overshooting is, while it is of great importance

when determining evolution models of stars that get born with a convective core. We come back to this issue

further on, but we first need to discuss convective energy transport in detail.

Figure 5.3: Comparison between the temperature gradients (denoted in this plot by a symbol other than ∇) in a

sun-like star with a convective envelope and in a star of 4 M⊙ with a convective core. (Figure courtesy of

Prof. Onno Pols, Lecture Notes on Stellar Structure and Evolution, Radboud University Nijmegen, NL)

5.3.2 Buoyancy frequency and semiconvection

In a zone that is stable against convection, a fluid element that gets displaced by moving up will be pulled

back down until it is again situated at its equilibrium position, thanks to the action of the buoyancy force of

Archimedes. This oscillatory motion of the fluid elements depends on the local gravitational acceleration,

density, pressure, and chemical composition of the gas and happens with the so-called Brunt-Väisälä frequency,

or buoyancy frequency in brief, whose square is defined as

$$N^2 \equiv g\left(\frac{1}{\Gamma_1}\,\frac{d\ln P}{dr} - \frac{d\ln\rho}{dr}\right). \tag{5.41}$$

We reformulate its physical meaning via Fig. 5.1: when the density of the displaced fluid element is lower

than the one of its surroundings, it will experience a net buoyancy accelerating the element further to the

surface and we have convective instability. In that case, N² < 0. On the other hand, when the density in the

fluid element is larger than the one of its surroundings, buoyancy forces it back to its original equilibrium

position. This results in an oscillatory motion with a local frequency given by N(r).

In this way, we find that regions stable against convection have N² > 0, while convective regions have

N² < 0. As an alternative to using temperature gradients, the condition for convective instability can thus

also be written as:

$$\frac{d\ln\rho}{d\ln P} < \frac{1}{\Gamma_1}. \tag{5.42}$$

For a star fusing hydrogen into helium, N increases with time in the deep stellar interior because the

helium abundance increases due to the nuclear burning at the expense of hydrogen. In such a case lighter

gas surrounding the helium core is situated on top of heavier material and this adds to the stability. Thus

N increases throughout the core-hydrogen burning phase of evolution. This can be seen from the hydrogen

mass fraction profiles and the corresponding N² profiles of an evolving stellar model in Fig. 5.4. It is also

seen in this figure that N² = 0 in the convective core of the models, which decreases in mass fraction as the

evolution proceeds.

The dependence of N² on the chemistry of the star is most easily seen for the case of an ideal gas. In

that case, one can approximate N² as

$$N^2 \simeq \frac{g}{H_P}\left[\delta\left(\nabla_{\rm ad} - \nabla\right) + \phi\,\nabla_\mu\right]. \tag{5.43}$$

The µ-gradient thus affects the local behaviour of N(r) in the radiatively stratified layers of the star. In

a dynamically stable layer, a displaced fluid element is pulled back by buoyancy. However, a vibrational

instability may occur in a layer that is unstable according to the Schwarzschild criterion (5.40),

but stable according to the Ledoux criterion (5.39):

$$\nabla_{\rm ad} < \nabla_{\rm rad} < \nabla_{\rm ad} + \frac{\phi}{\delta}\,\nabla_\mu. \tag{5.44}$$

In this case, the fluid element is constantly rising and being pulled back down due to these two conditions.

This instability may have consequences for stellar structure when it concerns the motion of fluid elements

in layers with a different chemical composition. As shown in Fig. 5.4 and further discussed in Chapter 7 and

Part III, this occurs in the near-core regions of massive stars, because their convective core recedes during

the core-hydrogen burning phase. The receding convective core leaves behind a so-called µ-gradient zone

in which the right term in Eq. (5.44) becomes considerable such that vibrational instability sets in. In such a

case, the fluid elements oscillate between regions of different chemical composition, introducing slow mix-

ing, called semiconvection. Depending on its (unknown) efficiency of chemical mixing, the semiconvection

may or may not introduce stratification in the mixed layer, resulting in plateaus in the profiles of chemical elements.

For efficient semiconvective mixing, regions where ∇µ = 0 may result and their stability properties

will thus be altered by the semiconvection. For example, the hydrogen profile in the near-core region of

a massive star may become stratified; in that case, the semiconvective mixing may bring hydrogen in that

zone. This influx of hydrogen subsequently changes the value of ∇rad since it depends dominantly on the

hydrogen mass fraction (recall that the opacity ∼ (1+X) and is hence dominated by the hydrogen fraction).

As a result, the balance between the precise values of ∇rad and ∇µ, which are the decisive factors to have

convection or not, depends on the (in)efficiency of the semiconvection. In practice, semiconvection occurs

along with several other sources of mixing and their time scales and efficiencies decide which of the sources

dominate. We treat the topic of chemical mixing in stellar interiors in the dedicated Chapter 7.
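The role of the µ-gradient in (5.43) can be made concrete with a short sketch. The values of g, HP and the gradients below are hypothetical illustration values (HP is taken of the order quoted earlier for a depth of R⊙/2), and ϕ = δ = 1 is assumed as for an ideal gas:

```python
import math

def brunt_vaisala_sq(g, H_P, nabla, nabla_ad, nabla_mu=0.0, phi=1.0, delta=1.0):
    """Squared buoyancy frequency from the ideal-gas approximation, Eq. (5.43), in s^-2."""
    return (g / H_P) * (delta * (nabla_ad - nabla) + phi * nabla_mu)

def buoyancy_period(n_squared):
    """Oscillation period 2 pi / N in seconds; None where N^2 <= 0 (no oscillation)."""
    if n_squared <= 0.0:
        return None
    return 2.0 * math.pi / math.sqrt(n_squared)

# Hypothetical local conditions (cgs).
g = 1.0e5          # cm/s^2
H_P = 5.2e9        # cm
NABLA_AD = 0.4

n2_stable = brunt_vaisala_sq(g, H_P, nabla=0.25, nabla_ad=NABLA_AD)
n2_unstable = brunt_vaisala_sq(g, H_P, nabla=0.50, nabla_ad=NABLA_AD)
n2_mu = brunt_vaisala_sq(g, H_P, nabla=0.50, nabla_ad=NABLA_AD, nabla_mu=0.2)

for label, n2 in [("sub-adiabatic layer", n2_stable),
                  ("super-adiabatic, homogeneous", n2_unstable),
                  ("super-adiabatic with mu-gradient", n2_mu)]:
    period = buoyancy_period(n2)
    status = f"period = {period:.0f} s" if period else "convectively unstable"
    print(f"{label}: N^2 = {n2:.2e} s^-2, {status}")
```

In the third case a positive ∇µ turns a negative N² into a positive one, which is the stabilising effect of the composition gradient discussed above.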

As a side remark connected with the Ledoux criterion, let us imagine that a vibrational instability

occurs in a critical layer in the envelope of the star. A layer fulfilling the Ledoux but not the Schwarzschild


Figure 5.4: Top: Main-sequence evolutionary track in the HR diagram computed with the MESA code (see

Chapter 8) for a non-rotating star of 3.25 M⊙. Three stellar models are selected along the track indicated

by the three coloured circles, corresponding to the zero-age main sequence, mid-main sequence, and near

the terminal-age main sequence, respectively. Bottom: The corresponding profiles of the hydrogen mass

fraction X and the squared buoyancy frequency N² are shown as a function of fractional mass coordinate.

(Figure courtesy of Dr. May Gade Pedersen)

criterion may coincide with a transition between an adiabatic and a non-adiabatic gas layer. In a chemically

homogeneous layer, compression is accompanied by a local temperature increase, resulting in a reduced

opacity following Kramers’ law. As a reaction, the layer expands, cools down and the instability is damped.

However, when the critical layer is not chemically homogeneous, such as in a partial ionisation layer, the


absorbed radiation will dominantly be used to further ionise the material in that layer rather than heating it

up. As a consequence, the density increase is dominant in the opacity behaviour and the latter increases

upon compression, blocking even more radiative flux. This leads to an expansion beyond equilibrium,

which can grow in time if circumstances are optimal in the sense that radiative damping in the non-adiabatic

layer becomes weaker than the excitation and growth of the vibrational instability. This type of vibrational

instability is thus strongly connected with the opacity behaviour in that critical layer and is therefore called

an opacity-driven instability. It is responsible for several types of pulsations that may occur in stars. We

refer to the Master courses Asteroseismology and Theory of Stellar Oscillations for a detailed description of

the conditions for, and consequences of, the growth and damping of vibrational instabilities.
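
The sign change of the opacity response can be illustrated with a rough estimate. The sketch below assumes a power-law opacity κ ∝ ρⁿ T⁻ˢ (Kramers’ law has n = 1, s = 3.5) and an adiabatic relation d ln T = (Γ3 − 1) d ln ρ. The function name is hypothetical, and the reduced value Γ3 − 1 = 0.1 for a partial ionisation layer is an illustrative assumption, not a value taken from the text; in a real ionisation zone the opacity exponents change as well.

```python
def opacity_response(gamma3_minus_1, n=1.0, s=3.5):
    """Return d ln(kappa) / d ln(rho) for adiabatic compression.

    Assumes kappa proportional to rho**n * T**(-s) and
    d ln T = gamma3_minus_1 * d ln rho.
    """
    return n - s * gamma3_minus_1


# Ideal monatomic gas: T rises steeply, so the opacity drops on compression.
fully_ionised = opacity_response(2.0 / 3.0)

# Partial ionisation layer (illustrative value): the energy goes into
# ionisation, T hardly rises, and the density increase dominates.
partial_ionisation = opacity_response(0.1)

print(fully_ionised, partial_ionisation)
```

A negative response damps the perturbation, while a positive one blocks more flux upon compression and can drive the instability.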

5.4 Convective energy transport

When the opacity and the amount of energy to be transported become too large, radiation alone cannot effi-

ciently transport the energy in a stable way. In that case, convection takes over the task of energy transport.

Convective energy transport is the exchange of energy by macroscopic fluid elements moving between hotter and cooler

layers in a dynamically unstable region. The hot convective cells move up, while the cool cells sink (cf. Figure 5.1). The moving cells will

dissolve in their new environment and hence release their excess or lack of heat. Because the density near

the stellar core is very high, convection can be a very efficient way to transport energy.

The detailed theoretical treatment of convective motions in stars appears extremely complicated and

a full description based on first principles does not exist. This is not surprising, because even convective

motions in a kettle with boiling water give rise to complex hydrodynamical motions that are not yet un-

derstood. Solving the hydrodynamical equations for stars, taking into account convection, has until now

only been done in simplified conditions that could be tested in the laboratory. Convection in stars, however,

occurs in extreme circumstances in which turbulent motions can transport large amounts of energy in a very

compressible gas, which has a pressure, density, temperature and gravity that can vary over many orders of

magnitude from layer to layer.

5.4.1 Mixing length theory

Many attempts have been made to take into account convection as accurately as possible. We limit our-

selves here to the description of a long-existing and simple approximation, i.e., the one of the “mixing

length” theory. This theory allows for the local treatment of convection in a fairly simple way and is a good

approximation at hand for regions near the stellar core. The mixing length theory was developed for stars in

hydrostatic equilibrium and assumes that convection is time-independent.

The mixing length theory states that convection can be compared to heat transport by molecules. The

transporting particles are macroscopic “spheres” (cf. Figure 5.1) instead of molecules, and their mean free

path (“mixing length”) is the distance over which they move before dissolving in their new environment.


The total energy flux l/(4πr²) at a certain location now consists of the sum of the radiative flux frad (in which

we incorporate a possible contribution of conduction) and the convective flux fcon.

In Eq. (5.29), we have defined ∇rad as the gradient necessary to transport the total energy flux by

radiation. A part of the flux is now, however, transported through convection, which means that the true

unknown gradient ∇ of the layer will be smaller:

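$$f_{\rm rad} + f_{\rm con} = \frac{4acG}{3}\,\frac{T^4 m}{\kappa P r^2}\,\nabla_{\rm rad} \tag{5.45}$$

and

$$f_{\rm rad} = \frac{4acG}{3}\,\frac{T^4 m}{\kappa P r^2}\,\nabla. \tag{5.46}$$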

Herein, ∇ is a new quantity to be determined. For this we need an expression for fcon. It is assumed that

the convective element moves radially over a distance ℓm with a velocity vconv. Subsequently it arrives in a

cooler environment, where the element has a temperature excess DT . In this new environment, it dissolves

and releases its excess of internal energy. Because of the assumption of pressure balance (DP = 0), the

released heat per unit mass is cP DT. The local convective energy flux corresponding to this heat exchange is therefore

fcon = ρ vconv cP DT.

To actually compute fcon, a variety of additional assumptions is considered, e.g., about the amount of

work done by the sphere before dissolving in its new surroundings, about the fraction of this work that is

transformed into kinetic energy of the sphere and the surrounding fluid elements, about radiative energy loss

of the sphere (as it ends up in a cooler environment), etc. In this way, ∇ can be estimated. The assumptions

made, as well as the computational scheme, can be found in the next subsection.

5.4.2 A computational scheme

We assume for all the fluid elements that their motion started only as a very small disturbance. Then we can

choose the initial values of DT0 and v0 equal to zero. Due to differences in the temperature gradient and in

the buoyancy, DT and vconv will change as the element sinks or rises. This will happen until the element,

after travelling over a distance ℓm (the “mixing length”), is dissolved in its new environment and loses its

identity. The elements that enter a concentric sphere with radius r at a given time have a different vconv and

DT , as they started their motion at a different distance, between 0 and ℓm. We therefore assume that the

“average” element has travelled over a distance ℓm/2 when it enters the concentric sphere. We then have

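$$\frac{DT}{T} = \frac{1}{T}\,\frac{\partial (DT)}{\partial r}\,\frac{\ell_m}{2} = (\nabla - \nabla_e)\,\frac{\ell_m}{2}\,\frac{1}{H_P}. \tag{5.47}$$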

The density difference is, due to the assumption DP = 0 and Dµ = 0, simply Dρ/ρ = −δDT/T and the

buoyancy is kr = −g(Dρ/ρ). We assume that half of this value has acted on the element when it moved

forward over a distance ℓm/2. The work supplied then is:

$$\frac{1}{2}\,k_r\,\frac{\ell_m}{2} = g\delta\,(\nabla - \nabla_e)\,\frac{\ell_m^2}{8H_P}. \tag{5.48}$$


Next we assume that half of this work is converted into kinetic energy of the element (vconv²/2 per unit mass)

and that the other half is transferred to elements in the surroundings that were “pushed aside”. That way we

obtain the average velocity vconv of elements that pass through the sphere:

$$v_{\rm conv}^2 = g\delta\,(\nabla - \nabla_e)\,\frac{\ell_m^2}{8H_P}. \tag{5.49}$$

When we substitute this result and (5.47) in the expression for the average convective flux we get

$$f_{\rm con} = \rho\,c_P T \sqrt{g\delta}\,\frac{\ell_m^2}{4\sqrt{2}}\,H_P^{-3/2}\,(\nabla - \nabla_e)^{3/2}. \tag{5.50}$$

We still need to determine the expression for ∇−∇e.
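
The chain from Eq. (5.47) to Eq. (5.50) can be checked numerically with a minimal sketch. The function names and the input values below are assumptions made for illustration (cgs units, loosely representative of a deep stellar envelope); the code only evaluates the expressions of the text and verifies that fcon = ρ vconv cP DT agrees with the closed form (5.50).

```python
import math


def mlt_element(excess, l_m, h_p, g, delta, rho, c_p, temp):
    """Average DT, v_conv and f_con for a given excess = nabla - nabla_e."""
    if excess < 0:
        raise ValueError("nabla - nabla_e must be non-negative")
    d_temp = temp * excess * l_m / (2.0 * h_p)                        # Eq. (5.47)
    v_conv = math.sqrt(g * delta * excess * l_m**2 / (8.0 * h_p))     # Eq. (5.49)
    f_con = rho * v_conv * c_p * d_temp                               # rho v c_P DT
    return d_temp, v_conv, f_con


def f_con_closed_form(excess, l_m, h_p, g, delta, rho, c_p, temp):
    """Convective flux from Eq. (5.50)."""
    return (rho * c_p * temp * math.sqrt(g * delta) * l_m**2
            / (4.0 * math.sqrt(2.0)) * h_p**-1.5 * excess**1.5)


# Illustrative (assumed) local values in cgs units.
values = dict(excess=1e-6, l_m=1.75 * 5e9, h_p=5e9, g=5e4, delta=1.0,
              rho=0.05, c_p=3.5e8, temp=1e6)
d_temp, v_conv, f_con = mlt_element(**values)
print(d_temp, v_conv, f_con, f_con_closed_form(**values))
```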

We consider the temperature variation of the element, with temperature Te, diameter d, surface S and volume

V, when it moves with a velocity vconv. This temperature variation has two possible causes: adiabatic

compression or expansion on the one hand and exchange of heat with the environment through radiation

on the other hand. First we derive the total energy loss λ per unit of time of the sphere. We regard a fluid

element with an excess temperature DT > 0. Consequently the element radiates into its new environment.

Besides the radial energy flux, which transports energy from the stellar centre to the stellar surface, a local

non-radial flux f⃗ will occur that carries the excess energy of the element to its environment. According to

(5.12) and (5.13)

$$f = |\vec{f}\,| = \frac{4acT^3}{3\kappa\rho}\,\frac{\partial T}{\partial n}, \tag{5.51}$$

where ∂/∂n has the meaning of differentiation perpendicular to the surface of the sphere. Assume that the

element is a sphere with a diameter d. We put that

$$\frac{\partial T}{\partial n} \approx \frac{2DT}{d}. \tag{5.52}$$

The radiative loss λ per unit of time through the surface S of the sphere then is

$$\lambda = Sf = \frac{8acT^3}{3\kappa\rho}\,DT\,\frac{S}{d}. \tag{5.53}$$

The quantity λ is a kind of “luminosity” of the sphere that represents the change of its thermal energy. The

energy loss λ per unit of time results in a decrease of the temperature as heat is passed to the environment

through radiation. This decrease in temperature per unit of length is, in case of pressure equilibrium, given by λ/(ρV cP vconv).

The total temperature variation per unit of length caused by the two effects, being adiabatic compression

or expansion and exchange of heat with the environment through radiation, then is

$$\left(\frac{dT}{dr}\right)_e = \left(\frac{dT}{dr}\right)_{\rm ad} - \frac{\lambda}{\rho V c_P v_{\rm conv}}. \tag{5.54}$$

When we multiply this by HP/T we obtain

$$\nabla_e - \nabla_{\rm ad} = \frac{\lambda H_P}{\rho V c_P v_{\rm conv} T}, \tag{5.55}$$


where we can replace λ by (5.53) with an average for DT given in (5.47). The resulting equation then has

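a pre-factor ℓmS/(V d), that we choose equal to 6/ℓm (the value for a sphere with a diameter ℓm). Finally we

obtain the following result

$$\frac{1}{\Gamma} \equiv \frac{\nabla_e - \nabla_{\rm ad}}{\nabla - \nabla_e} = \frac{8acT^3}{\ell_m v_{\rm conv}\kappa\rho^2 c_P}. \tag{5.56}$$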

Except for a missing value for ℓm we have obtained five equations, being (5.45), (5.46), (5.49), (5.50)

and (5.56), for five unknowns frad, fcon, vconv, ∇e and ∇, where the local quantities P , T , ρ, l, m, cP ,

∇ad, ∇rad and g are known.

It can be shown that these five equations can be transformed into one equation of degree three with as

an unknown a complicated combination of all unknowns. We will not consider the full solution space of the

problem here, but refer to the actual implementation in numerical codes. Rather, we tried to show how hard

it is to accurately take into account the convective transport and that the current theory is based on many

assumptions, some of which are more plausible than others. We will restrict our discussion to a few relevant

limiting cases here:

• Γ → ∞: it can be shown that this case implies that ∇e → ∇ad and ∇ → ∇ad. A negligible excess

of ∇ with respect to the adiabatic value is apparently sufficient to transport the total luminosity. This

limiting case occurs in the areas near the stellar core of massive stars where the density is very high

and in the layers of the photosphere of low-mass stars where the opacity is very large (cf. Figure 5.3).

In this case we do not have to solve the equation of the mixing length theory as ∇ ≈ ∇ad is a good

approximation (see Figure 5.2 for the Sun). In that way, we are not subject to the uncertainties and

limitations of this theory for this area.

• Γ → 0: this limiting case corresponds to the demand that ∇ → ∇rad. This means that the convective

transport is inefficient and can absolutely not transport a substantial fraction of the luminosity. We find

in this case that the total flux is carried by radiation, frad + fcon → frad, and again ∇ is known without the need of the mixing length theory. This limiting case

occurs in the photosphere of massive stars and in stellar cores of stars of low mass (see Figure 5.2).

The situation is more complicated for regimes in between these two limiting cases. In that case, the equations

of the mixing length theory actually need to be solved and will yield ∇ad < ∇ < ∇rad. In that situation,

the convection is said to be super-adiabatic.
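
As an illustration of how the intermediate regime can be handled numerically, the following sketch reduces the five equations to a single cubic and solves it by bisection. It is not the scheme of any particular stellar evolution code. It assumes g = Gm/r² and HP = P/(ρg), and it introduces two auxiliary symbols that do not appear in the text: x = (∇ − ∇e)^(1/2) and the dimensionless quantity u = 4acT³ (8HP/gδ)^(1/2) / (κρ²cP ℓm²). With these, Eqs. (5.49) and (5.56) give ∇e − ∇ad = 2ux, and Eqs. (5.45), (5.46) and (5.50) give x³ = (2u/3)(∇rad − ∇). The function name is hypothetical.

```python
import math


def solve_mlt(nabla_rad, nabla_ad, u, tol=1e-14, max_iter=200):
    """Return (nabla, nabla_e) for a convectively unstable layer.

    Solves x**3 + (2u/3) * (x**2 + 2*u*x - W) = 0 with W = nabla_rad - nabla_ad
    and x = sqrt(nabla - nabla_e), using bisection on [0, sqrt(W)].
    """
    w = nabla_rad - nabla_ad
    if w <= 0 or u < 0:
        raise ValueError("requires nabla_rad > nabla_ad and u >= 0")

    def residual(x):
        return x**3 + (2.0 * u / 3.0) * (x * x + 2.0 * u * x - w)

    # residual(0) <= 0 and residual(sqrt(W)) > 0, and the residual increases
    # monotonically for x >= 0, so there is exactly one root in the bracket.
    lo, hi = 0.0, math.sqrt(w)
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    x = 0.5 * (lo + hi)
    nabla_e = nabla_ad + 2.0 * u * x
    nabla = nabla_e + x * x
    return nabla, nabla_e


for u in (1e-6, 1.0, 1e3):
    print(u, solve_mlt(nabla_rad=0.5, nabla_ad=0.4, u=u))
```

Small u returns ∇ close to ∇ad and large u returns ∇ close to ∇rad, in line with the two limiting cases; intermediate values give ∇ad < ∇e < ∇ < ∇rad.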

5.4.3 The parametric implementation

The weak point of the mixing length theory (and other variants) is that there is no physical basis to determine

a value for ℓm. Therefore, the mixing length is always considered as a free parameter, usually expressed in

terms of the local pressure scale height: ℓm = αmltHP . To choose a plausible value, it is assumed that the

main part of the convective energy transport is done by the largest spheres, and that these can only travel

over a short distance, not much longer than their own diameter, before losing their identity. For the Sun, it

is found from helioseismology that αmlt ≈ 1.75. This value delivers stellar models that provide the best


accordance with high-precision solar observations of very different kind (including the solar oscillations). It

is assumed that the energy transport in all other stars has similar characteristics as for the Sun, and typically

αmlt ∈ [1.5, 2.5] is used when computing the convective transport in stellar models, but it can be as low as

0.5 for very metal-poor stars.
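
A minimal sketch of the parametrisation ℓm = αmlt HP follows. It assumes hydrostatic equilibrium, so that HP = P/(ρg); the function names are hypothetical, and the input values are illustrative numbers of the order expected near the solar photosphere, not values taken from the text.

```python
def pressure_scale_height(pressure, rho, g):
    """Pressure scale height H_P = P / (rho g), assuming hydrostatic equilibrium."""
    return pressure / (rho * g)


def mixing_length(alpha_mlt, pressure, rho, g):
    """Mixing length l_m = alpha_mlt * H_P in the same length unit as H_P."""
    return alpha_mlt * pressure_scale_height(pressure, rho, g)


# Illustrative (assumed) cgs values: P in dyn cm^-2, rho in g cm^-3, g in cm s^-2.
l_m = mixing_length(alpha_mlt=1.75, pressure=1e5, rho=3e-7, g=2.74e4)
print(f"l_m = {l_m / 1e5:.0f} km")
```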

In addition to the energy transport, convection has another important effect on the life of a star. Con-

vection is responsible for the mixing of stellar material on a time scale that is much shorter than other

relevant time scales mentioned before. Hence, when determining the stellar structure and evolution, one can

assume that convection causes instantaneous mixing of all chemical species in the entire convective zone.

Convection makes a clear mark on the chemical history of the star. We come back to this in Chapter 7.

Finally, we remark that we have so far neglected the influence of stellar (internal) rotation, and hence

the Coriolis and centrifugal forces that are induced by it, on the stellar structure. The most important effect

of non-rigid internal rotation is the efficient mixing of stellar material it brings about, as is the case with

convection. We also treat this in Chapter 7.


Chapter 6

The chemical composition of stellar matter

6.1 The relative mass fractions

The chemical composition of stellar matter is extremely important because it determines the basic charac-

teristics such as radiation and energy production due to nuclear reactions. These reactions, in turn, change

the chemical composition. Nuclear reactions fix the lifetime of a star.

The chemical composition of the star at a time t is described by the functions Xi = Xi(m, t) with

m ∈ [0,M ]. It is useful to take m as the independent variable when describing the chemical composition.

Indeed, if we were to use a description in terms of r, then the functions Xi(r, t), and all other functions

that depend on the chemical composition, would change with each small expansion or contraction with

conservation of mass.

Often, one uses the particle number per volume ni for particles with mass mi: Xi = mi ni/ρ. Usually,

the number of different species i, and hence the number of Xi, can be kept low, because (i) most types of

particles are rare, (ii) they have little influence on the stellar structure, or (iii) they have a constant abundance

in time. For most purposes, it is sufficient to define only the mass fractions of hydrogen, helium, and “all

other” elements (also called “heavy elements” or “metals”) together. This is noted as

X ≡ XH, Y ≡ XHe, Z ≡ 1−X − Y. (6.1)

For an “average” star in our Milky Way, X is in the interval [0.68,0.73]. The mass fraction of heavy

elements, on the other hand, varies strongly from star to star, and lies between Z = 10⁻⁶ and about

Z = 0.04. This has important consequences for our understanding of the chemical evolution in the Universe.

At the time of the Big Bang, hydrogen and helium were created, and almost no other elements (apart from

some lithium; see later on). This explains the relatively small ranges in the mass fractions X,Y . All

heavier elements were created by nucleosynthesis in stars. During the late evolutionary stages of stars, a

large fraction of their mass is lost to the interstellar medium, either through a strong stellar wind on the


asymptotic giant branch, or due to a supernova explosion (see Part III). Hence, the interstellar medium is

enriched with heavy elements, which are then incorporated in new stars that are formed in this medium.

Consequently, the broad range of Z values must be interpreted as a broad range of stellar ages. The stars

with a low Z are the first-generation stars which were formed before significant chemical enrichment of the

interstellar medium took place.

The nuclear reactions will obviously alter the initial composition X,Y,Z , and make this simple picture

more complex. For certain purposes, e.g. the determination of isotope ratios (see later on), the description

in terms of just 3 types Xi is not sufficient. We come back to the relative distribution of particles within the

Z group, specifically the distribution of C, N, and O isotopes which are important for the hydrogen burning.

6.2 Variations of chemical composition due to nuclear reactions

Assume that the mass fractions Xi can only change due to occurrence of nuclear reactions, which alter

atomic nuclei of type i within a fluid element. The frequency of occurrence of a certain reaction is called the

reaction rate rlm. It is equal to the number of reactions per unit mass and per unit time converting particles

of type l into particles of type m. In general, particles of type i can be influenced by several reactions, of

which some will destroy (rik) while some will create particles (rji). The reactions govern the change of ni over time. Because Xi = mi ni/ρ, we have:

$$\frac{\partial X_i}{\partial t} = m_i\left[\sum_j r_{ji} - \sum_k r_{ik}\right], \qquad i = 1, \ldots, I \tag{6.2}$$

for all elements of type 1, . . . , I involved in the reactions. When more than one particle of type i is created

or destroyed per reaction, this can be taken into account by multiplying the corresponding term in the sum

with a factor that is equal to the number of particles i involved in the reaction.
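
The bookkeeping expressed by Eq. (6.2) can be sketched in a few lines. The function name, the species labels and the numerical rates below are hypothetical; each reaction is assumed to convert exactly one particle of type l into one particle of type m, with rates given per unit mass and per unit time as in the text.

```python
def abundance_rates(masses, reaction_rates):
    """Evaluate dX_i/dt = m_i (sum_j r_ji - sum_k r_ik) for every species.

    masses:         dict mapping species name to particle mass (g)
    reaction_rates: dict mapping (source, product) to r_lm (reactions g^-1 s^-1)
    """
    dxdt = {species: 0.0 for species in masses}
    for (source, product), rate in reaction_rates.items():
        dxdt[source] -= masses[source] * rate     # destruction term r_ik
        dxdt[product] += masses[product] * rate   # creation term r_ji
    return dxdt


# Hypothetical chain a -> b -> c with equal particle masses.
masses = {"a": 2.0e-24, "b": 2.0e-24, "c": 2.0e-24}
rates = {("a", "b"): 1.0e5, ("b", "c"): 4.0e4}
print(abundance_rates(masses, rates))
```

Because the particle masses are equal in this toy chain, the rates of change sum to zero, as required by the normalisation of the mass fractions.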

The reaction p→ q that transforms a particle of type p into a particle q, is connected to a loss or gain of

energy epq. In the equation that expresses the conservation of energy, we have defined the energy production

ε per unit mass and per unit time; ε contains the contributions from several reactions and can be written in

terms of the reaction rates:

$$\varepsilon = \sum_{p,q}\varepsilon_{pq} = \sum_{p,q} r_{pq}\,e_{pq}. \tag{6.3}$$

We now define the energy that is generated when a unit mass of particles of type p is transformed into

particles of type q: qpq = epq/mp. For simple cases, it is useful to rewrite (6.2) in terms of ε because

this quantity already occurs in the equation of energy conservation. When all reactions deliver a positive

contribution to ε, we can write (6.2) as:

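$$\frac{\partial X_i}{\partial t} = \left[\sum_j \frac{m_i}{m_j}\,\frac{\varepsilon_{ji}}{q_{ji}} - \sum_k \frac{\varepsilon_{ik}}{q_{ik}}\right] \equiv E_i. \tag{6.4}$$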


When I different types of particles simultaneously participate in the nuclear network of reactions, equations

(6.2) or (6.4) form a system of I differential equations. Because one of the latter equations can be replaced

by the normalisation condition (2.24), I − 1 reaction equations are needed to complete the system of basic

equations that describe the stellar structure.

In simple cases, it is sufficient to add only one reaction equation. This occurs when hydrogen burning

is the only origin of nuclear energy production. Representing the energy production of all types of hydrogen

burning by εH , the only equation that needs to be considered is

$$\frac{\partial X}{\partial t} = -\frac{\varepsilon_H}{q_H}, \tag{6.5}$$

with ∂Y/∂t = −∂X/∂t, and qH the energy gain per unit mass when hydrogen is transformed into helium.
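
To get a feeling for the magnitudes in Eq. (6.5), the sketch below estimates qH from the energy released per helium nucleus and evaluates ∂X/∂t. The function names are hypothetical. The 26.5 MeV per ⁴He nucleus and the proton mass of 1.0081 mu are the values quoted later in this chapter; the energy production rate of 1.9 erg g⁻¹ s⁻¹ is an assumed value, roughly the solar luminosity divided by the solar mass.

```python
MEV_TO_ERG = 1.6022e-6      # 1 MeV in erg
M_U = 1.6605e-24            # atomic mass unit in g
SECONDS_PER_GYR = 3.156e16


def q_hydrogen(energy_mev=26.5, proton_mass_u=1.0081):
    """Energy gain per gram of hydrogen converted into helium (erg/g)."""
    return energy_mev * MEV_TO_ERG / (4.0 * proton_mass_u * M_U)


def dx_dt(eps_h, q_h):
    """Rate of change of the hydrogen mass fraction, Eq. (6.5), in s^-1."""
    return -eps_h / q_h


q_h = q_hydrogen()
change_per_gyr = dx_dt(1.9, q_h) * SECONDS_PER_GYR
print(f"q_H = {q_h:.3e} erg/g, dX/dt = {change_per_gyr:.4f} per Gyr")
```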

In the following, we abbreviate the equations describing the time evolution of the mass fractions as

$$\frac{\partial X_i}{\partial t} = E_i, \tag{6.6}$$

i.e. the rate of change of Xi due to nuclear reactions is denoted symbolically as Ei. Previously, we defined

a general nuclear time scale τn = En/L. For each type of nuclear burning, a nuclear time scale τn,i can be

defined. This is the time scale upon which a particle of type i gets exhausted by nuclear burning.

6.3 Effective cross sections

The reaction between particles is mostly caused by the strong interaction (a.k.a. strong nuclear force), which

occurs between nucleons (protons and neutrons). The range of the strong interaction is determined by

the extent of the particle under consideration. The Coulomb potential of a particle determines whether

the nuclear attraction force or the Coulomb repulsion dominates. The transition between both occurs at

a distance r0, almost equal to the radius of the particle, which is typically of the order of 10⁻¹³ cm (see

Figure 6.1). For a reaction to take place, the different particles have to be brought very close to each other

to overcome the Coulomb repulsion. In practice, this means that the particles must almost touch each other.

One can show that the Coulomb potential is mostly determined by the charge of the particles and that

its value is of the order of MeV. This immediately shows how hard it is to make a nuclear reaction take

place, since the typical kinetic energy of particles, given by 3kT/2, is only of the order of 10³ eV. The

typical kinetic energy is three orders of magnitude too small to overcome the Coulomb potential and to

initiate nuclear reactions. Hence, in terms of classical mechanics, we find that nuclear fusion cannot

occur. This explains why scientists in the first quarter of the 20th century were convinced that nuclear

fusion reactions could not be responsible for the luminosity of stars.
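
The orders of magnitude in this argument can be verified with a short calculation. The sketch below is an illustration under stated assumptions: it evaluates the Coulomb energy ZiZje²/r0 of two protons at r0 = 10⁻¹³ cm in Gaussian units and compares it with 3kT/2 at an assumed temperature of 10⁷ K. The function names are hypothetical.

```python
E_CHARGE_ESU = 4.8032e-10   # elementary charge in esu
K_BOLTZ = 1.3807e-16        # Boltzmann constant in erg/K
ERG_PER_EV = 1.6022e-12


def coulomb_barrier_ev(z_i, z_j, r0_cm=1e-13):
    """Coulomb energy Z_i Z_j e^2 / r0 in eV (Gaussian units)."""
    return z_i * z_j * E_CHARGE_ESU**2 / r0_cm / ERG_PER_EV


def thermal_energy_ev(temp):
    """Typical kinetic energy 3kT/2 in eV."""
    return 1.5 * K_BOLTZ * temp / ERG_PER_EV


barrier = coulomb_barrier_ev(1, 1)
thermal = thermal_energy_ev(1e7)   # assumed temperature of 10^7 K
print(f"barrier = {barrier / 1e6:.2f} MeV, 3kT/2 = {thermal:.0f} eV, "
      f"ratio = {barrier / thermal:.0f}")
```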

Why are nuclear fusion reactions possible, then? This is a result of quantum mechanical effects. From

quantum mechanics, we know that there is a non-zero probability that particles can overcome the Coulomb

potential and hence react through the process of tunnelling, where a particle tunnels through a barrier even


Figure 6.1: Schematic representation of the Coulomb potential of a particle. For r < r0, the nuclear

attraction force dominates; for r > r0, Coulomb repulsion dominates. (From Kippenhahn et al. 2012)

if its energy seems inadequate according to the laws of classical mechanics. Because of the extent of the

Coulomb potential (see Figure 6.1), however, the probability that quantum-mechanical tunnelling takes place

is small, and consequently the occurrence of nuclear reactions in the stellar interiors is a slow process.

The very low energies involved make it extremely difficult to determine the effective cross section of a

nuclear reaction, i.e., the probability that a reaction will occur, under conditions relevant for stellar interi-

ors. The effective cross section depends on the relative velocity of the particles. This velocity, in turn, is

determined by the temperature and the relative energy of the nuclei. Moreover, it depends on the presence

of other particles in the gas, that can partially shield the charges of the nuclei and hence can influence re-

action rates, depending on the thermodynamical state of the gas. In principle, the effective cross sections

can be determined experimentally. However, the experiments in the laboratory are performed under condi-

tions that are quite different from those in the stellar interior, which makes extrapolation of the results quite

challenging.

The nuclear physicist Gamow pioneered this field of research, and derived expressions for the

effective cross sections for stellar interiors. We will not go into details of these derivations but rather only

mention that the cross section depends strongly on the charges of the particles involved in the reaction,

because these charges determine the shape of the Coulomb potential. There is also a strong temperature

dependence, because this quantity determines the kinetic energy of the particles. Gamow found that the

effective cross section ∼ exp(−πZiZje²/ε0hv) exp(−mv²/2kT), in which ε0 is the permittivity of vacuum and h is Planck’s constant. The first

exponential factor increases with increasing velocity v, while the second decreases. Hence we obtain a maximal

probability at a certain velocity for the nuclear reaction to take place. It is known as the Gamow peak and


occurs at a velocity v = (πZiZje²kT/ε0hm)^(1/3). To determine the reaction rate, we must integrate the

effective cross section over all possible velocities. It can be shown, however, that the latter is proportional to

the velocity corresponding to the Gamow peak (we do not go into details of the derivation here). We deduce

that reactions between particles with low charge occur faster, and that reactions can take place even at

relatively low temperatures. Moreover, we find that reactions of heavier elements need higher temperatures.
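
The position of the Gamow peak can be checked numerically. The sketch below evaluates the logarithm of the product of the two exponential factors in SI units and compares its maximum with the analytic peak velocity. Two choices are assumptions of this example and not statements of the text: the mass m is taken to be the reduced mass of the two reacting particles, and the temperature of 1.5 × 10⁷ K is a representative value for the solar centre. The function names are hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19   # C
EPS0 = 8.8541878128e-12      # F/m
H_PLANCK = 6.62607015e-34    # J s
K_BOLTZ = 1.380649e-23       # J/K
M_PROTON = 1.67262192e-27    # kg
JOULE_PER_KEV = 1.602176634e-16


def log_probability(v, z_i, z_j, mass, temp):
    """Logarithm of exp(-pi Z_i Z_j e^2 / (eps0 h v)) * exp(-m v^2 / (2 k T))."""
    a = math.pi * z_i * z_j * E_CHARGE**2 / (EPS0 * H_PLANCK)
    return -a / v - mass * v**2 / (2.0 * K_BOLTZ * temp)


def gamow_peak_velocity(z_i, z_j, mass, temp):
    """Analytic velocity at which the product of both factors is maximal."""
    return (math.pi * z_i * z_j * E_CHARGE**2 * K_BOLTZ * temp
            / (EPS0 * H_PLANCK * mass)) ** (1.0 / 3.0)


# Two protons: reduced mass m_p / 2 (assumption of this sketch).
mass, temp = M_PROTON / 2.0, 1.5e7
v_peak = gamow_peak_velocity(1, 1, mass, temp)
e_peak_kev = 0.5 * mass * v_peak**2 / JOULE_PER_KEV
print(f"v_peak = {v_peak:.3e} m/s, E_peak = {e_peak_kev:.1f} keV")
```

The peak energy comes out several times larger than kT, which illustrates that the reactions are dominated by particles in the high-energy tail of the velocity distribution.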

Determining the effective cross sections is an active domain within nuclear astrophysics. Gradually,

the conditions of experimental research are converging towards the true conditions that govern in stellar

interiors. We refer to the course Nuclear Astrophysics at ULB, a course offered to the Master in Astronomy

& Astrophysics students at KU Leuven, for more details.

6.4 Nuclear burning cycles

The life of a star is governed by thermonuclear fusion, which, as we learned, is induced by thermal motion

and quantum mechanical effects. Several light particles fuse into heavier elements. In the discussion on

the energy production in stars by nuclear reactions, we limit ourselves to a summary of the most important

reactions. Instead of the thermonuclear fusion of a certain element, the term burning of the element is used.

The different types of burning occur at quite different temperatures.

When a star evolves on a time scale that is comparable to the reaction rates, we have to take into

account the network of nuclear reactions to obtain an accurate estimate of the energy production. The

total ε is the sum over all possible reactions, and the “bookkeeping” of all changing abundances must be

followed strictly. Very often, however, a much simpler procedure suffices to determine ε. We discuss the

most important burning mechanisms occurring in stars in the following sections, but first we define a few

basic concepts.

6.4.1 Basic concepts

In Figure 6.2, we show different forms of the simplest elements in nature, namely hydrogen and helium.

The top row displays the two possible ionization stages of hydrogen (H) and the bottom row of helium (He).

Each nucleus consists of a number of protons, indicated by the atomic number Z , and a number of neutrons

N . The atomic mass number A is the sum of both: A = Z +N . Not all combinations (N,Z) are allowed

in a nucleus. The stable (N,Z) combinations populate a narrow strip in the (N,Z) diagram, the so-called

stability valley. It expresses that too neutron-rich or too proton-rich nuclei are unstable; too neutron-rich

nuclei are subject to β− decay, while too proton-rich nuclei experience β+ decay. The β+ and β− decay are

both manifestations of weak interactions (a.k.a. the weak nuclear force). In β+ decay, a proton changes into

a neutron by emission of a neutrino and a positron (i.e., the positively charged antiparticle of the electron).

The β− decay, on the other hand, changes a neutron into a proton by emission of an antineutrino and an

electron.

A given number of protons Z can be combined with only a limited number of different neutron numbers


Figure 6.2: The ionization stages of H (top) and He (bottom). (From sciexplorer.blogspot.com)

Figure 6.3: The stable isotopes of carbon. (From sciencestruck.com)


N . As an example, 6 protons form a stable nucleus only with 6 or 7 neutrons; the nucleus with 8 neutrons (¹⁴C) is long-lived but radioactive (see Figure 6.3). Nuclei

having the same number of protons but with a different number of neutrons, are called the isotopes of an

element. The isotope is indicated with ᴬX, in which X is the element, and A the mass number.

Nucleons can, just like electrons, only occupy certain specific energy levels, and they display a shell

structure. A nucleus is very stable when a proton or neutron shell is fully occupied (in analogy to the noble

gases where the outer electron shell is fully occupied). This phenomenon occurs for the so-called magic

numbers of N or Z: 2, 8, 20, 28, 50, 82, 126. These numbers will be important later on, when we discuss

the s-process in Part III of the course. Moreover, the nuclei with an even number of protons are more stable

than nuclei with an odd number of protons. The same holds for neutrons. Pairs of protons and neutrons with

opposite spin are more stable than unpaired protons or neutrons.

Define α as the projectile, e.g., a proton, and X the target. Assume that both react to form the end

products β and Y . We then write the reactions as follows:

α+X → Y + β, (6.7)

or shorter, X(α, β)Y . Most nuclear reactions that occur in stars are exothermal. This means that they release

energy. For the reaction (6.7), we have an energy balance

$$m_\alpha c^2 + m_X c^2 = m_\beta c^2 + m_Y c^2 + Q, \tag{6.8}$$

in which Q is the energy produced per reaction which is added to the system. Q is of the order of MeV.

Before the fusion process, the involved particles j have a total mass Σj Mj, which is different from the

mass My of the product that will be created by the reaction. The mass deficit is

$$\Delta M = \sum_j M_j - M_y \tag{6.9}$$

and corresponds to an energy given by E = ΔM c². This energy feeds the energy balance in the star.

An example is hydrogen burning (see later), in which four ¹H nuclei with a total mass of 4 × 1.0081 mu are

transformed into a single ⁴He nucleus with a mass 4.0039 mu, where one mu (i.e., atomic mass unit) equals

1/12 of the mass of a ¹²C isotope. For this value we refer to Appendix A. During the hydrogen burning, for

each ⁴He nucleus that has been created, a mass of 2.85 × 10⁻² mu has “disappeared”, which corresponds to

0.7% of the initial mass. The corresponding energy is about 26.5 MeV (with 1 eV = 1.6022 × 10⁻¹² erg).

The current luminosity (energy loss) of the Sun corresponds to a mass loss of L⊙/c² = 4.25 × 10¹² g s⁻¹.

When we assume that in total 1 M⊙ of hydrogen will be converted into helium, then 0.7% of that will be

converted into energy. With the currently observed luminosity of the Sun, it can “survive” for 3 × 10¹⁸ s or

≈ 10¹¹ year. In practice, only 10% of the total mass in the Sun can take part in nuclear fusion, and hence

the lifetime of the Sun is ≈ 10¹⁰ year. The current Sun has used up about half of that energy reservoir.
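
The arithmetic of this estimate can be retraced with a few lines of code. The nuclear masses are the values quoted in the text; the solar luminosity, the solar mass, the conversion factor mu c² ≈ 931.494 MeV and the length of the year are assumed standard values that the text does not state explicitly.

```python
C_LIGHT = 2.99792458e10    # cm/s
MEV_PER_U = 931.494        # m_u c^2 in MeV (assumed standard value)
L_SUN = 3.828e33           # erg/s (assumed nominal solar luminosity)
M_SUN = 1.989e33           # g (assumed solar mass)
SECONDS_PER_YEAR = 3.156e7

m_h, m_he = 1.0081, 4.0039              # masses quoted in the text, in m_u

delta_m = 4.0 * m_h - m_he              # mass deficit per helium nucleus (m_u)
fraction = delta_m / (4.0 * m_h)        # fraction of the mass that "disappears"
energy_mev = delta_m * MEV_PER_U        # energy released per helium nucleus

mass_loss_rate = L_SUN / C_LIGHT**2     # g/s radiated away by the Sun
lifetime_all = fraction * M_SUN / mass_loss_rate / SECONDS_PER_YEAR
lifetime_core = 0.1 * lifetime_all      # only 10% of the mass takes part

print(f"mass deficit = {delta_m:.4f} m_u ({100 * fraction:.2f}%)")
print(f"energy per 4He = {energy_mev:.1f} MeV")
print(f"mass loss rate = {mass_loss_rate:.2e} g/s")
print(f"lifetime = {lifetime_all:.1e} yr (all mass), {lifetime_core:.1e} yr (10%)")
```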

The mass deficit is connected to the fact that the involved nuclei have a different binding energy EB .

This binding energy is the energy needed to break up the nucleus into its nucleons (protons and neutrons).

Put differently, EB is the energy that is gained when a certain number of free protons and neutrons are

brought together starting from infinity, to create a particle. Consider a nucleus with mass Mk and a mass


Figure 6.4: The binding energy per nucleon versus the mass number A. The curve is smoothed to remove

the variations induced by the internal shell structure of the nuclei. (From researchgate.net)

number A, that consists of Z protons with mass mp, and A−Z neutrons with mass mn. The binding energy

EB is then given by:

$$E_B = \left[(A - Z)\,m_n + Z\,m_p - M_k\right] c^2. \tag{6.10}$$

When comparing different elemental nuclei with each other, it is better to use the average binding energy

per nucleon: f = EB/A, which is called the binding fraction. Aside from hydrogen, helium, and lithium

nuclei, all elements appear to have a binding fraction of about 8 MeV. This shows that the nuclear attraction

force only reaches nucleons in the immediate surroundings. A representation of f as a function of A is shown

in Figure 6.4. We notice that f increases with increasing A, starting from hydrogen, until a maximum is

reached at 8.5 MeV at A = 56 (56Fe). After this point, f decreases again. So 56Fe is the most strongly

bound, or most stable, nucleus. Figure 6.4 shows that the nuclear fusion that transforms light elements into

more stable heavier elements, delivers energy. However, each nuclear reaction that transforms 56Fe into

a heavier element, induces a loss of energy from fusion (one can only gain energy from fission of heavy

nuclei). In this sense, the creation of 56Fe is a natural end point for nuclear fusion in stars.
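As a small worked example of Eq. (6.10), the sketch below evaluates $E_B$ and the binding fraction $f = E_B/A$ for two light nuclei. The rest-mass energies (in MeV) are standard tabulated values that are not given in the text and are included here as assumptions; the function name is chosen for illustration only.

```python
# Binding energy from Eq. (6.10), with all masses expressed as m*c^2 in MeV.
# The numerical masses are standard tabulated values (assumed, not from the text).
M_P = 938.272   # proton rest-mass energy [MeV]
M_N = 939.565   # neutron rest-mass energy [MeV]

def binding_energy(A, Z, m_nucleus):
    """E_B = [(A - Z) m_n + Z m_p - M_k] c^2, returned in MeV."""
    return (A - Z) * M_N + Z * M_P - m_nucleus

nuclei = {                       # name: (A, Z, nuclear rest-mass energy [MeV])
    "2H": (2, 1, 1875.613),
    "4He": (4, 2, 3727.379),
}

for name, (A, Z, m) in nuclei.items():
    e_b = binding_energy(A, Z, m)
    print(f"{name}: E_B = {e_b:.2f} MeV, f = E_B/A = {e_b / A:.2f} MeV per nucleon")
```

The binding fraction of $^4$He comes out several times larger than that of deuterium, which is the steep initial rise of the curve described above.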

In the following, the quantities $\varepsilon$ and $\rho$ will be expressed in units of erg g$^{-1}$ s$^{-1}$ and g cm$^{-3}$, respectively. The temperature $T$ will be expressed in the dimensionless form $T_n = T/10^n\,$K.


6.4.2 Big Bang nucleosynthesis

Before we discuss the most important processes of nucleosynthesis in stars, it is interesting to consider the

production of elements at the very beginning of the Universe. The bulk of the current amount of helium in

the Universe originated from nucleosynthesis during the first few minutes after the Big Bang. This is often

referred to as Big Bang Nucleosynthesis, although also a few other light elements besides hydrogen and

helium were created.

Consider the very early Universe, at the moment when it has cooled to about $10^{10}$ K. The only nuclei that existed at these temperatures were protons and neutrons. In normal circumstances, a neutron $\beta^-$ decays into a proton after 15 minutes. However, at these high temperatures and densities, protons and neutrons are constantly transformed into each other:

$$\nu_e + n \rightleftharpoons e^- + p \quad \text{and} \quad \bar{\nu}_e + p \rightleftharpoons e^+ + n. \tag{6.11}$$

Because neutrons are heavier than protons, it requires more energy to create a neutron. The ratio between the number of neutrons and the number of protons comes from the Boltzmann factor

$$\frac{N_n}{N_p} = \exp\left[-\Delta m\, c^2 / kT\right], \tag{6.12}$$

in which $\Delta m$ indicates the mass difference between a neutron and a proton. This mass difference is equivalent to 1.3 MeV/$c^2$. The Boltzmann factor in equation (6.12) implies that the ratio of neutrons to protons quickly decreased when the temperature decreased because of the expansion of the Universe. This decrease implied that the reactions in (6.11) became less frequent, until they became impossible when the temperature dropped below $10^{10}$ K. At that moment, the ratio of neutrons to protons was about 1/5. Fifteen minutes later, it was 1/7 due to $\beta^-$ decay, and the Universe had cooled enough to allow particle-particle interactions.
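The Boltzmann factor of Eq. (6.12) is easy to evaluate numerically. The sketch below uses the mass difference of 1.3 MeV quoted above; the value of the Boltzmann constant in MeV/K is a standard constant not given in the text, and the chosen temperatures are illustrative.

```python
import math

K_B_MEV = 8.617e-11   # Boltzmann constant [MeV/K] (standard value, assumed)
DELTA_M_C2 = 1.3      # (m_n - m_p) c^2 [MeV], as quoted in the text

def neutron_to_proton(T):
    """Equilibrium ratio N_n/N_p from Eq. (6.12) at temperature T [K]."""
    return math.exp(-DELTA_M_C2 / (K_B_MEV * T))

for T in (3e10, 2e10, 1e10):
    print(f"T = {T:.0e} K : N_n/N_p = {neutron_to_proton(T):.2f}")
```

At $T = 10^{10}$ K the ratio is roughly 0.22, consistent with the value of about 1/5 at which the reactions (6.11) stop.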

At a temperature of $10^9$ K, primordial deuterium ($^2$H) formed, as well as $^3$He. These nuclei were subsequently converted into alpha particles. Because $^4$He is by far the most stable of these different nuclei, almost all neutrons that existed in the Universe at the time were captured in alpha particles. Moreover, the absence of stable nuclei with mass numbers between 5 and 8 prohibited the creation of heavier nuclei, except $^7$Li.

Hence, the Big Bang nucleosynthesis formed a primordial soup of deuterium, $^3$He, $^4$He, and $^7$Li, containing all neutrons, and with that a large excess of free protons. We can estimate the abundance of the primordial helium, because it is directly related to the neutron/proton ratio of 1/7. For each couple of neutrons, there are 14 protons. These nucleons can form one $^4$He nucleus, with 12 protons left. In other words, $16\,m_u$ of nucleons (protons and neutrons) produce one helium nucleus with a mass of $4\,m_u$. The fraction of the mass converted into helium therefore is 4/16 or 25%. Big Bang nucleosynthesis has resulted in a Universe where 25% of the mass is captured in helium, and the remaining 75% in hydrogen. This was the material from which the first stars were formed.
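The counting argument above generalises to any neutron/proton ratio. As a minimal sketch (the function name is chosen here for illustration), assume that every neutron ends up in a $^4$He nucleus and that protons and neutrons both weigh $1\,m_u$:

```python
from fractions import Fraction

def helium_mass_fraction(n_over_p):
    """Mass fraction locked in 4He if all neutrons end up in alpha particles.

    Each 4He nucleus uses 2 neutrons and 2 protons, so N_n neutrons give a helium
    mass of 2*N_n (in units of m_u) out of a total mass of N_n + N_p.
    """
    return 2 * n_over_p / (1 + n_over_p)

Y = helium_mass_fraction(Fraction(1, 7))
print(Y, float(Y))
```

With a ratio of 1/7 this returns exactly 1/4, the 25% helium mass fraction derived above; with the earlier ratio of 1/5 it would have been 1/3.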


6.4.3 Hydrogen burning

The result of hydrogen burning is the fusion of four $^1$H nuclei into a single $^4$He nucleus. The difference in binding energy sums up to 26.731 MeV, which corresponds to a so-called mass deficit of some 0.7%. The energy that is released in this way is roughly a factor 10 higher than for any other fusion process that can occur in a star. Different reaction chains exist, which in general occur simultaneously in a star. For the hydrogen burning, two chains are important: the proton-proton chain (pp chain) and the carbon-nitrogen-oxygen cycle or CNO cycle. Below, we have a closer look at both.
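The quoted energy release and mass deficit can be checked from atomic masses. The masses below are standard tabulated values in atomic mass units and are assumptions in the sense that they are not listed in the text. Using atomic (rather than nuclear) masses automatically accounts for the annihilation of the two positrons.

```python
# Energy released when four 1H atoms are converted into one 4He atom.
# Atomic masses are standard tabulated values (assumed, not from the text).
U_TO_MEV = 931.494    # energy equivalent of 1 atomic mass unit [MeV]
M_H1 = 1.00782503     # atomic mass of 1H [u]
M_HE4 = 4.00260325    # atomic mass of 4He [u]

delta_m = 4 * M_H1 - M_HE4        # mass deficit [u]
q_value = delta_m * U_TO_MEV      # released energy, neutrinos included [MeV]
fraction = delta_m / (4 * M_H1)   # relative mass deficit

print(f"mass deficit = {delta_m:.6f} u")
print(f"Q            = {q_value:.2f} MeV")
print(f"fraction     = {100 * fraction:.2f} %")
```

The result agrees with the 26.731 MeV and the mass deficit of some 0.7% stated above.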

The proton-proton chain

The pp chain is named after the first reaction of the chain, in which two protons are converted into a deuterium nucleus $^2$H (often also noted as $^2$D), which in turn reacts with another proton to form $^3$He:

$$^1{\rm H} + {}^1{\rm H} \rightarrow {}^2{\rm H} + e^+ + \nu_e, \qquad {}^2{\rm H} + {}^1{\rm H} \rightarrow {}^3{\rm He} + \gamma. \tag{6.13}$$

Herein, $e^+$ represents a positron, $\gamma$ a photon, and $\nu_e$ a neutrino. The first one of these reactions (the pp reaction) is unusual in comparison to the other fusion processes, because the protons have to undergo a $\beta^+$ decay at their closest approach, so that a proton is converted into a neutron. The $\beta^+$ decay is a process that is caused by the weak nuclear force, and therefore is an interaction having a low probability to occur (i.e., its effective cross section is small). The average time for this reaction to take place is of order $\sim 10^9$ years and the close encounter of the protons needs to happen when they have sufficiently high kinetic energy for the $\beta^+$ decay to take place. This translates into the demand of a temperature above $\sim 10^7$ K in order for the protons to overcome the electrostatic repulsion. It is impossible to create this reaction in a laboratory. The second reaction is deuterium burning. This burning occurs very fast, in a matter of seconds, as it is a consequence of the strong nuclear force. Deuterium burning can already set in at a temperature of about $10^6$ K. Hence, as explained further in Chapter 9, protostars still in their formation process burn their primordial deuterium before the pp reaction can occur. Their initial nucleosynthesis thus delivers $^3$He isotopes and produces radiation before the full hydrogen burning, transforming four $^1$H into $^4$He and requiring a temperature above $5 \times 10^6$ K, can be done in equilibrium.

After $^3$He is formed via the pp reaction, the pp chain can be completed to form a $^4$He nucleus, also known as an α particle, in the time span of a few 100 years. This can be done via three branches pp1, pp2, pp3. All of these start from $^3$He and have $^4$He as their end product. So based initially on 4 protons, we have as full pp chains:

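$$\text{pp1}: \quad \begin{aligned} ^1{\rm H} + {}^1{\rm H} &\rightarrow {}^2{\rm H} + e^+ + \nu_e \\ ^2{\rm H} + {}^1{\rm H} &\rightarrow {}^3{\rm He} + \gamma \\ ^3{\rm He} + {}^3{\rm He} &\rightarrow {}^4{\rm He} + {}^1{\rm H} + {}^1{\rm H}, \end{aligned} \tag{6.14}$$

$$\text{pp2}: \quad \begin{aligned} ^3{\rm He} + {}^4{\rm He} &\rightarrow {}^7{\rm Be} + \gamma \\ ^7{\rm Be} + e^- &\rightarrow {}^7{\rm Li} + \nu_e\,(+\gamma) \\ ^7{\rm Li} + {}^1{\rm H} &\rightarrow {}^4{\rm He} + {}^4{\rm He}, \end{aligned} \tag{6.15}$$

$$\text{pp3}: \quad \begin{aligned} ^7{\rm Be} + {}^1{\rm H} &\rightarrow {}^8{\rm B} + \gamma \\ ^8{\rm B} &\rightarrow {}^8{\rm Be} + e^+ + \nu_e \\ ^8{\rm Be} &\rightarrow {}^4{\rm He} + {}^4{\rm He}. \end{aligned} \tag{6.16}$$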

Herein, $e^-$ is an electron. The numbers 1, 2, and 3 indicate the importance of the sub-chain with increasing temperature. The branch pp1 requires $T_6 \geq 5$, pp2 requires $T_7 \geq 1.5$, and pp3 needs $T_7 \geq 2.4$. In the Sun, 83.6% of the luminosity is delivered by the sub-chain pp1, 16.4% via pp2, and 0.015% via pp3. The different reactions in the pp chain occur at very different rates. The pp reaction itself is by far the slowest reaction (about a factor $10^{18}$ slower than the other reactions). To reach completion of pp1, the first two reactions described in (6.13) must have occurred at least twice. The reaction $^2$H(p, γ)$^3$He in the pp1 chain is so fast that the abundance of deuterium remains very low. The last reaction of the pp1 chain is again slower than the second, but still much faster than the pp reaction itself. When the temperature increases, the abundance of $^3$He decreases, so the first reaction of pp2 becomes more important (starting from $T_7 \approx 1 - 2$). The pp2 chain continues with electron capture by $^7$Be, which is almost independent of temperature, unlike the alternative reaction, proton capture by $^7$Be, in pp3. $^7$Be(p, γ)$^8$B will start to dominate $^7$Be($e^-$, ν)$^7$Li at $T_6 \approx 24$. The $^8$B nucleus, produced by proton capture, is unstable with respect to positron decay with a half-life of 0.8 s. The neutrino released in the latter process, as well as the neutrino from the electron capture by $^7$Be, have been detected in solar neutrino experiments. The final reaction in the pp3 chain is the decay of $^8$Be into two α particles. This reaction is important, not only because it finalises the pp3 chain, but also because the inverse reaction determines He burning at higher temperatures (see later on).

Because of the different amounts of energy lost via neutrinos in each of the three sub-chains, the energy gain per produced α particle is different, and adds up to $Q = 26.2, 25.7, 19.2$ MeV for pp1, pp2, and pp3 respectively. The “effective” $Q_{\rm eff}$ can be determined as an average of the energy produced by the three pp chains. The released nuclear energy of the pp chains is:

$$\varepsilon_{pp} = \frac{r_{pp}\, Q_{\rm eff}}{\rho} \approx \psi\, f_{11}\, g_{11}\, \frac{2.57 \times 10^4\, \rho X^2}{(T_9)^{2/3}} \exp\left(-3.381/(T_9)^{1/3}\right) \quad (\text{in erg g}^{-1}\,\text{s}^{-1}). \tag{6.17}$$

In this expression, $f_{11}$ is a shielding factor for the reaction considered, $g_{11}$ is a factor based on a polynomial fit of fourth order in $T_9$ to experimental laboratory results and attains a value near 1, and $\psi$ is a factor between 1 and 2 depending on the relative contributions of pp1, pp2, and pp3. Overall, the formula in Eq. (6.17) is a parametric fit to measured and tabulated values of nuclear reactions. These fits are made by research groups active in nuclear astrophysics based upon their experimental data and differ somewhat from group to group. This topic is an active field of research at the Université Libre de Bruxelles (ULB, see Angulo et al., 1999, Nuclear Physics A, Vol. 656, p. 3 – 183, for a reference often used in stellar evolution codes). The temperature dependence of the reaction rate of the pp chain decreases from $\sim T^6$ at temperatures $T_6 = 5$, down to $\sim T^{3.5}$ for $T_6 \approx 20$.
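The quoted temperature exponents follow directly from Eq. (6.17). The sketch below evaluates the fit and its logarithmic derivative $\nu = \partial \ln \varepsilon_{pp} / \partial \ln T$ numerically. It is an illustration under simplifying assumptions: $\psi$, $f_{11}$ and $g_{11}$ are all set to 1, and the density and hydrogen mass fraction are arbitrary illustrative values (they do not affect $\nu$).

```python
import math

def eps_pp(rho, X, T, psi=1.0, f11=1.0, g11=1.0):
    """Eq. (6.17): pp-chain energy generation rate [erg g^-1 s^-1].

    psi, f11 and g11 default to 1, which is a simplifying assumption.
    """
    T9 = T / 1e9
    return (psi * f11 * g11 * 2.57e4 * rho * X**2
            / T9**(2.0 / 3.0) * math.exp(-3.381 / T9**(1.0 / 3.0)))

def temperature_exponent(eps, T, dlnT=1e-4, **kwargs):
    """Central-difference estimate of d ln(eps) / d ln(T) at fixed rho and X."""
    up = eps(T=T * math.exp(dlnT), **kwargs)
    down = eps(T=T * math.exp(-dlnT), **kwargs)
    return (math.log(up) - math.log(down)) / (2.0 * dlnT)

for T6 in (5, 10, 15, 20):
    nu = temperature_exponent(eps_pp, T6 * 1e6, rho=100.0, X=0.7)
    print(f"T6 = {T6:2d}: eps_pp ~ T^{nu:.1f}")
```

The exponent drops from close to 6 at $T_6 = 5$ to close to 3.5 at $T_6 = 20$, in line with the values given above.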

The carbon-nitrogen-oxygen cycle

The CNO cycle describes a second chain of reactions that leads to hydrogen burning. For this cycle to work,

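the presence of certain isotopes of carbon, nitrogen and oxygen is required. The reactions that occur at temperatures typical for stellar interiors, are:

$$\begin{aligned} ^{12}{\rm C} + {}^1{\rm H} &\rightarrow {}^{13}{\rm N} + \gamma \\ ^{13}{\rm N} &\rightarrow {}^{13}{\rm C} + e^+ + \nu_e \\ ^{13}{\rm C} + {}^1{\rm H} &\rightarrow {}^{14}{\rm N} + \gamma \\ ^{14}{\rm N} + {}^1{\rm H} &\rightarrow {}^{15}{\rm O} + \gamma \\ ^{15}{\rm O} &\rightarrow {}^{15}{\rm N} + e^+ + \nu_e \\ ^{15}{\rm N} + {}^1{\rm H} &\rightarrow {}^{12}{\rm C} + {}^4{\rm He} \end{aligned} \tag{6.18}$$

− or −

$$\begin{aligned} ^{15}{\rm N} + {}^1{\rm H} &\rightarrow {}^{16}{\rm O} + \gamma \\ ^{16}{\rm O} + {}^1{\rm H} &\rightarrow {}^{17}{\rm F} + \gamma \\ ^{17}{\rm F} &\rightarrow {}^{17}{\rm O} + e^+ + \nu_e \\ ^{17}{\rm O} + {}^1{\rm H} &\rightarrow {}^{14}{\rm N} + {}^4{\rm He} \end{aligned} \tag{6.19}$$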

The general structure of the CNO cycle consists of a series of proton captures by isotopes of C, N, or O, intermixed with $\beta^+$ decays, processes which all have half-lives of 100 – 1000 s. The cycle always ends with a proton capture that induces the formation of an α particle.

The first set of reactions, indicated in (6.18), constitutes the CN cycle because only isotopes of C and N occur as catalysts. The full CNO cycle occurs when $^{16}$O is already abundantly present, or when the reaction $^{15}$N(p, γ)$^{16}$O has occurred enough to have generated the necessary amount of this oxygen isotope. The occurrence of the full CNO cycle is a factor 1000 less likely than that of the CN cycle. The end product of the full CNO cycle is not only an α particle, but also a $^{14}$N isotope that can again feed the CN cycle.

A detailed description of the CNO cycle burning is extremely complicated because many isotopes are

involved in a cyclic way. The energy production, as well as the detailed abundances of all isotopes, depend

on the initial concentrations of the catalysts, on reaction rates, on the temperature, and on the age of the star.

Here, we will not describe all reactions in the cycle in detail. Rather, we discuss the consequences of the

most important chain in the CNO cycle.

The key reaction of the CNO cycle is $^{14}$N(p, γ)$^{15}$O. This reaction is relatively slow, and is based on the $^{14}$N isotope which is present in both cycles. Like in the pp chain, it is the slowest reaction which is the most important to determine how the cycle evolves. When the temperature is high enough to activate hydrogen burning via the CNO cycles for a substantial time in a stable way, almost all available C, N, and O will be converted into the $^{14}$N isotope, which then becomes the most abundant byproduct of the hydrogen burning.

The energy production $\varepsilon_{\rm CNO}$ is also determined by the slowest reaction $^{14}$N(p, γ)$^{15}$O. A good estimate for the average $Q_{\rm eff}$, determined for all reactions that occur in the CNO cycle, is 24.97 MeV. The released nuclear energy can then be determined as follows:

$$\varepsilon_{\rm CNO} \approx g_{14,1}\, \frac{8.24 \times 10^{25}\, \rho X Z}{(T_9)^{2/3}} \exp\left(-15.231/(T_9)^{1/3} - (T_9/0.8)^2\right) \quad (\text{in erg g}^{-1}\,\text{s}^{-1}), \tag{6.20}$$

where $g_{14,1}$ again stands for a polynomial fit, this time of third order in $T_9$ following Angulo et al. (1999).


Figure 6.5: The fraction of the total energy production by the CNO cycle throughout the stellar interior for

stars on the zero-age-main-sequence (ZAMS, see later on for a formal definition) with a mass between 1

and 3M⊙. (From Kippenhahn et al. 2012)

The temperature dependence of the reaction rate of the CNO cycle is much stronger than that of the pp chain; approximately $\sim T^{18}$ for $T_6 \approx 20$.

In Figure 6.5, we show the contribution of the CNO cycle to the total energy produced by hydrogen burning for stars with a mass between 1 and $3\,M_\odot$ as a function of position in the star (represented as $l/L$). It is clear that the CNO cycle is the dominant energy source for stars more massive than $2\,M_\odot$.
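To illustrate why the CNO cycle takes over in hotter (more massive) stars, the sketch below compares the fits (6.17) and (6.20) and locates the temperature at which they are equal. This is a rough illustration only: all correction factors ($\psi$, $f_{11}$, $g_{11}$, $g_{14,1}$) are set to 1, and the composition ($X = 0.7$, $Z = 0.01$) is an assumed illustrative choice, so the resulting number should not be read as a precise prediction.

```python
import math

def eps_pp(rho, X, T):
    """Eq. (6.17) with psi = f11 = g11 = 1 (simplifying assumption)."""
    T9 = T / 1e9
    return 2.57e4 * rho * X**2 / T9**(2.0 / 3.0) * math.exp(-3.381 / T9**(1.0 / 3.0))

def eps_cno(rho, X, Z, T):
    """Eq. (6.20) with g14,1 = 1 (simplifying assumption)."""
    T9 = T / 1e9
    return (8.24e25 * rho * X * Z / T9**(2.0 / 3.0)
            * math.exp(-15.231 / T9**(1.0 / 3.0) - (T9 / 0.8)**2))

def nu_cno(T):
    """Analytic d ln(eps_CNO) / d ln(T) obtained by differentiating Eq. (6.20)."""
    T9 = T / 1e9
    return -2.0 / 3.0 + 15.231 / (3.0 * T9**(1.0 / 3.0)) - 2.0 * (T9 / 0.8)**2

def crossover_temperature(X, Z, lo=5e6, hi=5e7):
    """Bisection for the temperature where eps_CNO = eps_pp (density cancels)."""
    def excess(T):
        return eps_cno(1.0, X, Z, T) / eps_pp(1.0, X, T) - 1.0
    for _ in range(60):
        mid = math.sqrt(lo * hi)
        if excess(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

T_cross = crossover_temperature(X=0.7, Z=0.01)   # illustrative composition
print(f"eps_CNO = eps_pp at T ~ {T_cross:.2e} K")
print(f"CNO temperature exponent at T6 = 20: {nu_cno(2e7):.1f}")
```

Under these assumptions the two rates are equal at a temperature somewhat below $2 \times 10^7$ K, and the exponent of the CNO rate at $T_6 = 20$ is close to the value of 18 quoted above.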

6.4.4 Helium burning

The nuclear reactions burning helium consist of a gradual fusion of several α particles, resulting in isotopes $^{12}$C, $^{16}$O, ... These reactions occur at a temperature much higher than the temperature needed for hydrogen burning. The typical condition is $T_8 > 1$.

The first and foremost reaction is the one in which $^{12}$C is formed from three α particles: the triple alpha reaction. This reaction occurs in two steps, because a simultaneous close encounter of three particles is highly unlikely:

$$\begin{aligned} ^4{\rm He} + {}^4{\rm He} &\rightleftharpoons {}^8{\rm Be} + \gamma - 93\ {\rm keV} \\ ^8{\rm Be} + {}^4{\rm He} &\rightarrow {}^{12}{\rm C} + \gamma + 7.4\ {\rm MeV}. \end{aligned} \tag{6.21}$$

In the first step $^8$Be is formed temporarily from two α particles in an endothermic reaction, i.e., it costs energy and the ground state of this $^8$Be isotope has an energy that is about 90 keV higher than that of the two α particles. Due to this, the isotope decays on a short time scale of $10^{-16}$ s, falling apart into two α particles again. This seems like a very short decay time, but the high density in the stellar interior brings about a suitable probability of another α capture, used to form $^{12}$C. The energy production per unit mass of the reactions given in (6.21) is a factor 20 lower than that of the CNO cycle and can be approximated by

$$\varepsilon_{3\alpha} \approx 5.1 \times 10^{11}\, f_{3\alpha}\, \rho^2\, Y^3\, T_8^{-3} \exp\left(-44.027/T_8\right) \quad (\text{in erg g}^{-1}\,\text{s}^{-1}), \tag{6.22}$$

where $f_{3\alpha}$ is the screening factor of the triple-α reaction. The triple-α reaction is very temperature dependent, as can be seen from the high exponent for the factor involving the temperature (recall that $T_8 \approx 1$).
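The strength of this temperature dependence can be made explicit by differentiating Eq. (6.22): $\partial \ln \varepsilon_{3\alpha} / \partial \ln T = -3 + 44.027/T_8$. The sketch below evaluates this exponent and the effect of a 1% temperature increase. The screening factor is set to 1, and the density and helium mass fraction are arbitrary illustrative values (assumptions, not from the text).

```python
import math

def eps_3alpha(rho, Y, T, f3a=1.0):
    """Eq. (6.22): triple-alpha energy generation rate [erg g^-1 s^-1]."""
    T8 = T / 1e8
    return 5.1e11 * f3a * rho**2 * Y**3 / T8**3 * math.exp(-44.027 / T8)

def nu_3alpha(T):
    """Analytic d ln(eps_3alpha) / d ln(T) from Eq. (6.22)."""
    return -3.0 + 44.027 / (T / 1e8)

rho, Y = 1.0e4, 1.0   # illustrative values only
boost = eps_3alpha(rho, Y, 1.01e8) / eps_3alpha(rho, Y, 1.00e8)

print(f"exponent at T8 = 1 : {nu_3alpha(1e8):.1f}")
print(f"exponent at T8 = 2 : {nu_3alpha(2e8):.1f}")
print(f"1% hotter at T8 = 1 -> rate x {boost:.2f}")
```

At $T_8 = 1$ the rate scales roughly as $T^{41}$, so a temperature increase of only 1% raises the energy production by about 50%.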

Once enough $^{12}$C is formed by the triple-α reaction, additional α capture can occur and nuclei of $^{16}$O, $^{20}$Ne, etc. can be produced, releasing energy:

$$\begin{aligned} ^{12}{\rm C} + {}^4{\rm He} &\rightarrow {}^{16}{\rm O} + \gamma + 7.16\ {\rm MeV} \\ ^{16}{\rm O} + {}^4{\rm He} &\rightarrow {}^{20}{\rm Ne} + \gamma + 4.73\ {\rm MeV} \\ &\ \ \vdots \end{aligned} \tag{6.23}$$

There is considerable uncertainty in the first of these two reactions and hence also in the share of initial $^4$He that takes part in each of (6.21) and (6.23).

In summary, during helium burning, the reactions given in (6.21) and (6.23) occur simultaneously, and the total energy production $\varepsilon_{\rm He}$ consists of essentially three contributions: $\varepsilon_{\rm He} = \varepsilon_{3\alpha} + \varepsilon_{12,\alpha} + \varepsilon_{16,\alpha}$. Initially, shortly after the onset of the helium burning, the triple-α reaction dominates as $^4$He is much more abundant than $^{12}$C and $^{16}$O. With increasing temperature and more carbon available, the (rather poorly known) reaction $^{12}$C(α, γ)$^{16}$O comes relatively more into play and becomes even dominant due to the third power dependence on the relative mass fraction of $^4$He in Eq. (6.22). The final abundances of $^{12}$C and $^{16}$O at the end of the helium-burning phase are critical for the further stellar life, yet they are quite uncertain. Improving knowledge of the $^{12}$C(α, γ)$^{16}$O reaction is therefore an active field of research in nuclear astrophysics.

6.4.5 Fusion of heavier elements

Carbon burning

After helium burning, the central core consists mostly of a mixture of $^{12}$C and $^{16}$O. If the temperature is high enough at that time, say $T_8 \approx 5 - 10$, the process of carbon burning starts. For this type of burning, as for all types to follow, the situation is so complex that computations are based on rough approximations. A first difficulty is that the first reaction in the carbon burning, $^{12}$C + $^{12}$C, results in a $^{24}$Mg isotope, which can then decay in many different ways:

$$\begin{array}{rlr} ^{12}{\rm C} + {}^{12}{\rm C} \rightarrow & ^{24}{\rm Mg} + \gamma & 13.93 \\ \rightarrow & ^{23}{\rm Mg} + n & -2.61 \\ \rightarrow & ^{23}{\rm Na} + p & 2.24 \\ \rightarrow & ^{20}{\rm Ne} + \alpha & 4.62 \\ \rightarrow & ^{16}{\rm O} + 2\alpha & -0.11 \end{array} \tag{6.24}$$

in which we have indicated $Q$ in MeV in the last column. Note that the second and the last reactions are endothermic. The relative frequency of the different decay paths depends on the temperature and is very different. The most probable ways are those that result in $^{23}$Na + p and $^{20}$Ne + α. These two occur equally frequently at not too high a temperature ($T_9 < 3$).

The next difficulty is that the produced protons and α particles experience a temperature far too high for ordinary hydrogen and helium burning, so that they react immediately with other nuclei in the gas. As a result, very complicated reaction chains occur. As an example, we mention $^{12}$C(p, γ)$^{13}$N($e^+\nu$)$^{13}$C(α, n)$^{16}$O, which creates, amongst other elements, a neutron. All details of such chains need to be accounted for when determining the average energy production. As a rough approximation, an average $Q$ of $\approx 13$ MeV is taken for each $^{12}$C + $^{12}$C reaction with all subsequent reaction chains. The end products of the carbon burning are mostly $^{16}$O, $^{20}$Ne, $^{24}$Mg and $^{28}$Si.

Oxygen burning etc.

For the reaction $^{16}$O + $^{16}$O to occur, a temperature of $T_9 > 1$ is required. Due to the high temperatures, the protons and α particles react with other nuclei in the gas. Also the neutrons react with other particles, since they are not hampered by Coulomb repulsion. Like in the carbon burning chain, the reactions can be continued through different channels:

$$\begin{array}{rl} ^{16}{\rm O} + {}^{16}{\rm O} \rightarrow & ^{32}{\rm S} + \gamma \\ \rightarrow & ^{31}{\rm P} + p \\ \rightarrow & ^{31}{\rm S} + n \\ \rightarrow & ^{28}{\rm Si} + \alpha \\ \rightarrow & ^{24}{\rm Mg} + 2\alpha. \end{array} \tag{6.25}$$

A large number of chain reactions follow, which create, aside from Al, Mg and Ne, large amounts of free neutrons, protons and α particles. These will in turn react with the $^{28}$Si isotopes and gradually produce heavier elements.

When the oxygen has been burned, a new stage of contraction and heating of the stellar core starts. One may expect that the next burning cycle, the burning of magnesium, would start. However, before the temperature is sufficiently high for this burning to take place, another type of reaction occurs. Because of the ever increasing temperature, the thermal energy of the photons has strongly increased as well. At a temperature of $\approx 10^9$ K, a large fraction of the photons has an energy in the order of MeV. Such highly energetic photons can cause photo-dissociation of the nuclei in the gas. Photo-dissociation is a process in which radiation is converted into matter (as opposed to the aforementioned burning cycles, which convert matter into radiation). Such a process in which a high-energy photon is converted into an electron-positron pair occurs when the photon energy $h\nu$ exceeds the rest mass of the electron-positron pair: $h\nu > 2 m_e c^2$. An example of photo-dissociation is:

$$^{32}{\rm S} + \gamma \rightleftharpoons {}^{28}{\rm Si} + {}^4{\rm He}. \tag{6.26}$$

The double (right-left) arrow indicates that, after the formation of the α particle, it can react with other nuclei, such as $^{28}$Si, and the inverse reaction can take place. There are many analogous reactions that take place through photo-dissociation, involving the nuclei of $^{32}$S, $^{36}$Ar, $^{40}$Ca, $^{52}$Fe and $^{56}$Ni. Furthermore, the absorption of protons and neutrons, as well as the decay of unstable nuclei, need to be added to the reactions. Hence, extremely complicated reaction chains come into being, which eventually lead to the creation of heavier elements. This whole process is, misleadingly, called silicon burning.

Figure 6.6: The periodic table as filled by nucleosynthesis in stars, where the number of protons is indicated for each of the chemical elements. (Figure courtesy of Prof. Jennifer Johnson, Ohio State University, USA)

When this process is continued, and there is enough time available, the formation of $^{56}$Fe will be completed. Because the $^{56}$Fe isotope is strongly bound (see Figure 6.4), it is the only survivor. When there is not enough time to form $^{56}$Fe, however, $^{56}$Ni will become the most abundant element resulting from the silicon burning. This situation occurs in supernova explosions (see Part III).

6.5 Summary: the periodic table filled via the nuclear reactions in stars

While we have gone over the most important nuclear reactions in the previous subsections, additional reac-

tions connected with the slow and rapid neutron capture will be discussed in Part III of this course, as they

occur in particular phases of stellar evolution. Figure 6.6 provides an overview of all the chemical elements

produced in stars. These chemical yields form the basis of the chemical enrichment of exoplanetary systems,

galaxies and of the Universe as a whole.


Chapter 7

Complications: mixing of chemical elements due to transport processes

7.1 Convective mixing and nuclear burning

The mixing of stellar matter due to turbulent convective motions is a process that is active on a very short time scale compared to the slow variations in chemical composition induced by nuclear reactions, as these happen on a nuclear time scale. Upon first thought, one could therefore consider it justified to assume that the chemical composition is constant throughout a convective layer, hence $\partial X_i/\partial m \approx 0$. Within the mass interval of the convection zone, one could therefore consider all $X_i \approx \bar{X}_i$ to be constant. At the edges of the convective zone, a discontinuity may occur, i.e., the “outer” value may be different from the “inner” value $\bar{X}_i$. However, apart from the mass fractions of the species of type $i$, the positions of the convection zones change with time (cf. Figure 5.4). It is thus clear that the chemical composition in a stellar layer can change, even if there are no nuclear reactions taking place in this layer. In particular, this happens when the border of the convective zone penetrates into a zone with a different, non-homogeneous chemical composition. In this way, the products of previous nuclear reactions can be traced because they can get transported throughout the star. Such element transport may bring processed material up to the stellar surface, if mixing occurs throughout the stellar layers between the centre of the star and its surface. On the other hand, “fresh fuel” can be transported into the zone where nuclear burning occurs and this can drastically change the stellar evolution. Indeed, fresh fuel leads to a prolongation of the nuclear burning cycle for stars with a convective core. Care must therefore be taken to describe chemical mixing properly by solving the transport equations, rather than making too crude approximations.

The rate of change of species of type $i$ with mass fraction $X_i$ is caused by various processes aside from the nuclear burning denoted as $E_i$. Some element transport processes behave diffusively (e.g., when caused by gradients of physical quantities) while others are advective (e.g., large coherent motions of fluid elements such as circulation due to rotation). When the change of $X_i$ happens much faster than the nuclear timescale, as is the case in a convection zone, it is customary to approximate $\partial X_i/\partial t$ by a diffusion equation, mainly for computational convenience. In the simplest case of chemical mixing due to convective motions, along with nuclear fusion, the transport equation describing the time-variability of the mass fractions can be written in Lagrangian format as

$$\frac{\partial X_i}{\partial t} = E_i + \frac{\partial}{\partial m}\left[\left(4\pi\rho r^2\right)^2 D_{\rm conv}(m)\, \frac{\partial X_i}{\partial m}\right], \tag{7.1}$$

where the diffusion coefficient associated with the convective mixing described by mixing length theory is given by

$$D_{\rm conv} = \frac{1}{3}\, \alpha_{\rm mlt}\, H_p\, v_{\rm conv} \tag{7.2}$$

and is thus expressed in the physical unit cm$^2$ s$^{-1}$ when adopting the cgs system.
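To get a feeling for the orders of magnitude involved, the sketch below evaluates Eq. (7.2) and the associated diffusive mixing time $\tau \sim \ell^2 / D_{\rm conv}$ across one pressure scale height. All input numbers ($\alpha_{\rm mlt}$, $H_p$, $v_{\rm conv}$) are purely illustrative assumptions and do not refer to a specific stellar model.

```python
YEAR = 3.156e7   # seconds per year

def d_conv(alpha_mlt, h_p, v_conv):
    """Eq. (7.2): convective diffusion coefficient [cm^2 s^-1]."""
    return alpha_mlt * h_p * v_conv / 3.0

def mixing_time(length, d):
    """Diffusive time scale length^2 / D [s] over a distance `length` [cm]."""
    return length**2 / d

# Illustrative (assumed) values in cgs units.
alpha_mlt = 1.8      # mixing length parameter
h_p = 1.0e10         # pressure scale height [cm]
v_conv = 1.0e4       # convective velocity [cm/s]

D = d_conv(alpha_mlt, h_p, v_conv)
tau = mixing_time(h_p, D)
print(f"D_conv = {D:.1e} cm^2/s")
print(f"mixing time over one H_p = {tau:.1e} s = {tau / YEAR:.2f} yr")
```

For such values the mixing time is far shorter than any nuclear time scale, which is the reason why the composition inside a convection zone can be treated as nearly homogeneous.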

7.2 Convective boundary mixing aka CBM

It is difficult to precisely locate the boundary layer between radiative and convective zones. Even though

the stability criterion (5.1) allows us to derive the zones where convection takes place inside the star, a com-

plication occurs in the transition layers between convective and radiative zones, hereafter termed convective

boundary layers. The fluid elements inside a convection zone experience a turbulent motion with velocity,

vconv. When they reach the convective boundary layer, their inertia will prevent them from stopping abruptly,

i.e., they will “overshoot” from the convection zone into the radiative layer over an unknown distance, which

we denote here with the parameter αov (in analogy to αmlt, it is expressed in the unit Hp). The way in which

the fluid elements overshoot the convective boundary depends on the location of the convection zone inside

the star and the physical circumstances at that position. It may be different for convective envelopes than for

convective cores.

Values for αov cannot be derived from first principles. Their derivation is an active research field within

3D hydrodynamical simulation studies. Such studies have indicated at least three physical processes that

may come into play: 1) convective penetration by plumes leading to superadiabatic mixing over a distance;

2) subadiabatic convective core overshooting due to thermal diffusion over a distance described by means

of an exponentially decaying mixing profile; 3) turbulent entrainment that occurs over a dissipation length

scale. All of these cases imply a different and uncalibrated level and functional form of convective boundary

mixing (abbreviated as CBM as of now) and have a different temperature gradient in the transition layer. We

use the global symbolic notation of the free parameter αov to express the unknown length scale over which

the fluid elements move from inside a convective region into the radiative adjacent zones, irrespective of the

profile that captures the efficiency of the mixing. Equation (7.1) gets extended to include CBM as follows:

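$$\frac{\partial X_i}{\partial t} = E_i + \frac{\partial}{\partial m}\left[\left(4\pi\rho r^2\right)^2 \left(D_{\rm conv} + D_{\rm ov}\right) \frac{\partial X_i}{\partial m}\right], \tag{7.3}$$

where the unknown profile due to CBM connected with the motions of the fluid elements beyond the convective boundary is captured by the local diffusion coefficient $D_{\rm ov}$. Both $D_{\rm conv}(m)$ and $D_{\rm ov}(m)$ involve at least one free parameter (denoted here as $\alpha_{\rm mlt}$ and $\alpha_{\rm ov}$).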


[Figure 7.1: two panels showing $\log L/L_\odot$ versus $T_{\rm eff}$ [K], one for $1.7\,M_\odot$ and one for $5\,M_\odot$.]

Figure 7.1: Evolutionary tracks for two masses as indicated, computed with the MESA stellar evolution code (see Chapter 8). The tracks were computed for $X = 0.71$, $Z = 0.014$ and for three different values of convective core overshooting in the approximation of a diffusive exponentially decaying profile with parameter $f_{\rm ov} = 0.005$ (full lines), 0.020 (dashed lines), 0.040 (dotted lines) and with constant envelope mixing $D_{\rm env} = 100$ cm$^2$ s$^{-1}$. The thick dot indicates the first model along the track whose hydrogen mass fraction in the core, $X_c < 10^{-4}$. (Figure reproduced from data in Johnston et al., 2019, A&A, Vol. 632, id.A74 by courtesy of Dr. Cole Johnston, KU Leuven)

For stars with a core that is convective due to nuclear reactions taking place in the central regions, the

lack of calibration of the physics in the convective boundary layers implies a serious limitation. Indeed,

the CBM influences the amount of matter that can be brought into the central regions where nuclear fusion

takes place. The higher the CBM, the more fresh fuel reaches the nuclear reactor and hence the longer the

nuclear fusion can go on. Figure 7.1 shows the impact of CBM on evolutionary tracks for stellar models

with a convective core. The CBM not only affects the evolutionary tracks, but has a major impact on the

star’s core mass and on its ageing. For this reason, calibration of the amount of matter in the convective

core of a star, Mcore, via observational estimation of the profile Dov(m) is a crucial piece of information to

predict the evolution of stars with M > 1.5M⊙ for which the extent of the convective core directly relates

to the amount of stellar matter that takes part in the nuclear fusion. In this sense, the parameter αov is a

critical unknown in the theory of stellar structure. Independent methods to determine αov observationally

are explained in the courses Asteroseismology and Binary Stars.
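The caption of Figure 7.1 refers to a diffusive, exponentially decaying overshoot profile with parameter $f_{\rm ov}$, but the functional form is not written out in this chapter. The sketch below therefore rests on an assumption: it adopts one commonly used parametrisation, $D_{\rm ov}(z) = D_0 \exp\left[-2z/(f_{\rm ov} H_p)\right]$, with $z$ the distance beyond the convective boundary. The values of $D_0$ and $H_p$ are illustrative only; $D_{\rm env}$ is taken from the figure caption.

```python
import math

def d_ov(z, d0, f_ov, h_p):
    """Assumed exponential profile D_ov(z) = D_0 exp(-2 z / (f_ov H_p)) [cm^2/s]."""
    return d0 * math.exp(-2.0 * z / (f_ov * h_p))

def mixing_extent(d0, d_floor, f_ov, h_p):
    """Distance z [cm] at which the assumed D_ov profile has dropped to d_floor."""
    return 0.5 * f_ov * h_p * math.log(d0 / d_floor)

H_P = 1.0e10     # pressure scale height at the boundary [cm] (illustrative)
D0 = 1.0e12      # mixing coefficient near the boundary [cm^2/s] (illustrative)
D_ENV = 100.0    # envelope mixing level from the caption of Figure 7.1 [cm^2/s]

for f_ov in (0.005, 0.020, 0.040):
    z = mixing_extent(D0, D_ENV, f_ov, H_P)
    print(f"f_ov = {f_ov:.3f}: D_ov reaches D_env after {z / H_P:.2f} H_p")
```

In this parametrisation the extent of the extra mixed region scales linearly with $f_{\rm ov}$, which gives an impression of why the three tracks in Figure 7.1 differ.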

7.3 Mixing due to rotational instabilities and waves

7.3.1 Models with rotation

As already mentioned in Chapter 3, rotation is common in all stars throughout their lives. Rotation acts upon

stellar structure models in at least three ways: it deforms the star from spherical symmetry, it leads to higher

polar than equatorial flux due to gravity darkening, and it induces various instabilities and chemical mixing

in the stellar interior. The level of confidence in how to treat these effects is different for the three aspects.


Figure 7.2: Left: Core/near-core rotation frequencies derived from nonradial oscillations for 1210 stars

observed with the NASA Kepler space telescope. The stars span the entire stellar evolution from the ZAMS

to white dwarfs. Right: stars with an additional measurement of the envelope/surface rotation frequency.

We refer to the course Asteroseismology and to the review paper Aerts et al., 2019, ARAA, Vol. 57, p.35 –

78 for details.

By definition, the critical (or break-up) velocity is reached when the outwardly directed centrifugal acceleration is equal to the inward effective gravitational acceleration at any one place on the stellar surface. Usually the Roche approximation is adopted, which assumes that the mass concentration inside the star is not distorted by the rotation. In this case, the polar ($R_p$) and equatorial ($R_e$) radii of the star differ by at most a factor 3/2, with $R_{e,{\rm crit}}/R_{p,{\rm crit}} = 3/2$ at critical rotation. This leads to the critical rotation frequency given by $\Omega_{\rm crit} = \sqrt{GM/R_{e,{\rm crit}}^3} = \sqrt{8GM/27R_{p,{\rm crit}}^3}$, with $R_{e,{\rm crit}}$ and $R_{p,{\rm crit}}$ the critical equatorial and polar radius of the star, respectively. This is the solution for the critical rotation frequency when the Eddington parameter $\Gamma = \kappa L/4\pi c G M_\star$, which we introduce in Chapter 13, satisfies $\Gamma < 0.639$. We do not treat the other solution here but refer to the monograph by Maeder (2009) for details. The vast majority of stars rotate at less than half of the critical rotation frequency, implying that their polar radius is between 90% and 100% of their equatorial radius. The deformation of most stars thus remains limited.
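As a numerical illustration of the Roche-model expression above, the sketch below evaluates $\Omega_{\rm crit}$ and the corresponding equatorial velocity for a star with solar mass and radius. The solar values and the gravitational constant are standard adopted numbers (not from the text), and taking the polar radius equal to the present solar radius is an assumption made for the example.

```python
import math

G = 6.674e-8       # gravitational constant [cgs]
M_SUN = 1.989e33   # solar mass [g] (adopted value)
R_SUN = 6.957e10   # solar radius [cm] (adopted value)

def omega_crit_from_polar(M, R_p_crit):
    """Omega_crit = sqrt(8 G M / (27 R_p,crit^3)) in the Roche approximation [rad/s]."""
    return math.sqrt(8.0 * G * M / (27.0 * R_p_crit**3))

def omega_crit_from_equatorial(M, R_e_crit):
    """Omega_crit = sqrt(G M / R_e,crit^3) [rad/s]."""
    return math.sqrt(G * M / R_e_crit**3)

R_p = R_SUN            # assumption: polar radius equal to the solar radius
R_e = 1.5 * R_p        # Roche model at critical rotation: R_e,crit / R_p,crit = 3/2

omega = omega_crit_from_polar(M_SUN, R_p)
v_crit = omega * R_e   # equatorial break-up velocity [cm/s]

print(f"Omega_crit = {omega:.3e} rad/s (period {2 * math.pi / omega / 3600:.1f} h)")
print(f"v_crit     = {v_crit / 1e5:.0f} km/s")
print(f"same from R_e: {omega_crit_from_equatorial(M_SUN, R_e):.3e} rad/s")
```

Both forms of $\Omega_{\rm crit}$ give the same number, as they must when $R_{e,{\rm crit}} = \tfrac{3}{2} R_{p,{\rm crit}}$.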

Gravity darkening was first discussed by von Zeipel in 1924. It stands for a reduction in the flux and

hence in the effective temperature of the star resulting from the reduced gravity in the equatorial regions

compared to the polar ones. Usually, the von Zeipel effect is expressed as

$$T_{\rm eff} = T_{\rm eff,p} \left(\frac{g_{\rm eff}}{g_{\rm eff,p}}\right)^{\beta}, \tag{7.4}$$

where $T_{\rm eff,p}$ and $g_{\rm eff,p}$ are the effective temperature and effective gravity at the pole of the star. For a radiative envelope as considered by von Zeipel, $\beta \simeq 0.25$. In the presence of a convective envelope, $\beta$ is usually assumed to be $\beta \lesssim 0.1$. This limited knowledge of the exponent $\beta$, and along with it a non-symmetrical stellar wind (see Chapter 13), implies a non-trivial treatment in stellar evolution computations in the presence of fast rotation.
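The effect of the exponent $\beta$ in Eq. (7.4) can be illustrated with a short calculation. The polar temperature and the ratio of equatorial to polar effective gravity used below are arbitrary illustrative values (assumptions), and $\beta = 0.08$ is simply one possible choice satisfying $\beta \lesssim 0.1$.

```python
def t_eff_local(t_eff_pole, g_ratio, beta):
    """Eq. (7.4): local effective temperature for g_ratio = g_eff / g_eff,p."""
    return t_eff_pole * g_ratio**beta

T_POLE = 10000.0   # polar effective temperature [K] (illustrative)
G_RATIO = 0.5      # equatorial over polar effective gravity (illustrative)

for label, beta in (("radiative envelope", 0.25), ("convective envelope", 0.08)):
    t_eq = t_eff_local(T_POLE, G_RATIO, beta)
    print(f"{label:20s} beta = {beta:.2f}: T_eff,eq = {t_eq:.0f} K")
```

For the same reduction in gravity, the equator of a star with a radiative envelope is thus considerably cooler than that of a star with a convective envelope.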

The prediction Re,crit/Rp,crit = 3/2 of the Roche approximation, along with von Zeipel’s formula

(7.4), can be evaluated directly from interferometric measurements of stellar surfaces. Such observations

indeed show that fast rotators are oblate and that their surface properties and winds are not spherically

symmetric. However, such stars are rare, while the computation of 2D equilibrium models in the presence

of rotation comes with major uncertainty even in its simplest aspects of the local surface and its flux. For

this reason, computations of 2D models of stellar interiors are often restricted to static polytropic models.

A good compromise for stars rotating faster than about half of the critical rotation rate is to use only the

spherically symmetric component of the centrifugal force in the equation of hydrostatic equilibrium:

$$\frac{\partial P}{\partial r} = -\frac{G m \rho}{r^2} + \frac{2}{3}\,\rho\, r\, \Omega^2, \tag{7.5}$$

while sticking to 1D evolutionary models.
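As a rough illustration of Eq. (7.5), the Python sketch below evaluates the pressure gradient and the ratio of the centrifugal to the gravitational term. The solar values and the rotation rate (a period of about 25 days) are approximate and serve only as an example; the function names are hypothetical.

```python
import math

G = 6.674e-11  # gravitational constant in SI units


def dP_dr(r, m, rho, omega):
    """Right-hand side of Eq. (7.5) in SI units."""
    gravity = -G * m * rho / r**2
    centrifugal = (2.0 / 3.0) * rho * r * omega**2
    return gravity + centrifugal


def centrifugal_to_gravity(r, m, omega):
    """Ratio of the centrifugal term to the gravity term in Eq. (7.5)."""
    return (2.0 / 3.0) * r**3 * omega**2 / (G * m)


# Approximate solar surface values, for illustration only
M_SUN, R_SUN = 1.989e30, 6.957e8
omega_sun = 2.0 * math.pi / (25.0 * 86400.0)
print(centrifugal_to_gravity(R_SUN, M_SUN, omega_sun))
```

For a slow rotator such as the Sun the ratio is tiny, which is why the correction only matters for fast rotators.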

7.3.2 Rotational and pulsational mixing

Stellar models with rotation usually adopt the approximation of shellular rotation, where one assumes that

the chemical composition and the angular velocity remain constant on isobars. As such, the ratio of the

rotation frequency inside the star, Ω(r), with respect to Ωcrit (or the accompanying vrot(r)/vcrit) is used as

input for the numerical computations of the stellar models. Even 1D models of slowly rotating stars based

upon the simplified Eq. (7.5) face challenges, as such models cannot explain the latest space asteroseismic

data shown in Figure 7.2.

From a theoretical perspective, rotation is expected to induce a myriad of processes and instabilities in

the stellar interior, leading to transport of both angular momentum and of chemical species. This is exten-

sively discussed in the review paper by Aerts et al. (2019), to which we refer for details. In summary, the

processes can be classified into four main categories: meridional circulation, hydrodynamical instabilities,

magnetorotational instabilities, and internal gravity waves. The concept of rotational mixing in stellar evo-

lution theory as treated in the literature usually stands for the macroscopic element transport and chemical

mixing due to the action of circulation and all instabilities together. Further, in analogy to rotational mixing,

the term pulsational mixing is sometimes used for macroscopic element transport in radiative zones caused

by internal gravity waves.

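The Eulerian transport equation controlling the evolution of the specific angular momentum, r²Ω(r), in the presence of rotation reads

$$\frac{\partial}{\partial t}\left(r^2\Omega\right) = \frac{1}{5\rho r^2}\frac{\partial}{\partial r}\left[\rho r^4 \Omega\, U(r)\right] + \frac{1}{\rho r^2}\frac{\partial}{\partial r}\left(\rho r^4 D_{\rm shear}\frac{\partial \Omega}{\partial r}\right). \tag{7.6}$$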

Here U(r) is the radial component of the velocity due to meridional circulation and the local diffusion

coefficient Dshear(r) represents the vertical mixing due to a variety of shear instabilities occurring between

layers subject to different velocities (see Maeder, 2009).

Figure 7.3: Schematic representation of mixing profiles, Dmix(r), due to various transport processes in

stars with a convective core (indicated in grey) and a radiative envelope, for a model with CBM (purple)

described by exponentially decaying diffusive core overshooting (upper) or convective penetration (lower)

and different types of envelope mixing (pink). (Figure courtesy of Dr. May Gade Pedersen, KU Leuven)

The transport of the chemical species due to rotation can be approximated as a diffusive process in

the presence of strong horizontal turbulence due to shear instabilities. For this reason, the diffusive part

in the chemical composition equations in Eq. (7.3) gets extra terms due to various effects of rotation, each

with their own local diffusion coefficient. Aside from rotation, additional causes of element mixing are

also considered, particularly in transition layers that are stable against the Schwarzschild criterion, but un-

stable against the Ledoux criterion, both for the case of ∇µ > 0 (called semiconvective mixing as already

discussed) and ∇µ < 0 (called thermohaline mixing, which occurs in evolved stars due to dredge ups –

see Part III). For a star with CBM, semiconvection is not of importance as the convective overshooting or

convective penetration largely dominate over semiconvection.

Overall, one has to deal with a multitude of (uncalibrated!) extra diffusion coefficients that affect

the chemical composition profiles of the star, aside from Dconv(m) and Dov(m) included in Eqs (7.3). For

rotation, these have been grouped as Dshear(m) and Deff(m), adopting the notation by Maeder (2009), where the latter is the diffusive effect due to meridional circulation in the presence of strong horizontal turbulence, while the former stands for the vertical shear due to the joint effect connected with all sorts of rotational (and possibly magnetic) instabilities.

Pulsational mixing profiles due to internal gravity waves, adopting a diffusion approximation, have

been derived from hydrodynamical simulations, resulting in a diffusion coefficient depending on the density

as $D_{\rm IGW}(r) \sim D_{\rm IGW} \cdot \rho^{-\gamma}(r)$ with $\gamma \in [0.5, 1]$, where the proportionality factor $D_{\rm IGW}$ remains unknown.

Figure 7.3 offers a schematic representation of some macroscopic diffusive mixing profiles that have been

considered in recent stellar evolution computations. The level of efficiency of the chemical mixing at the

position where the purple and pink profiles coincide remains uncalibrated so far. The particular shape of the

third panel from the left in Figure 7.3 is due to the drop in velocity U(r) ≃ 0 in some of the envelope layers

near m/M⋆ ≃ 0.5 in a 5 M⊙ star just after its birth. In general, the mixing profiles as indicated in Figure 7.3

vary strongly during the evolution of the star, but it is poorly understood how, given limitations in the theory

of element and angular momentum transport.
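The density scaling of the wave-induced mixing quoted above can be sketched in a few lines of Python. The normalisation by a reference density `rho_ref`, the amplitude `d0` and the toy density profile are assumptions introduced here to obtain a dimensionally consistent example; they are not calibrated values.

```python
import numpy as np


def d_igw(rho, d0, gamma=0.5, rho_ref=1.0):
    """Mixing profile D_IGW(r) = d0 * (rho / rho_ref)**(-gamma).

    d0 is the (unknown) free amplitude and rho_ref a hypothetical
    reference density at which D_IGW equals d0.
    """
    if not 0.5 <= gamma <= 1.0:
        raise ValueError("gamma is expected in the range [0.5, 1]")
    return d0 * (np.asarray(rho, dtype=float) / rho_ref) ** (-gamma)


# Toy density profile decreasing outwards (illustrative numbers only)
rho = np.array([10.0, 1.0, 0.1, 0.01])
for gamma in (0.5, 1.0):
    print(gamma, d_igw(rho, d0=1.0e2, gamma=gamma))
```

Since the density drops outwards, such a profile makes the mixing more efficient towards the stellar surface, with the steepness set by γ.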

It also remains unknown how the diffusion coefficients in the equation describing the element trans-

port on the one hand, and the angular momentum transport on the other hand, scale with respect to each

other. In practice one therefore introduces extra unknown scaling factors (as free parameters, with value

between 0 and 1) between each of these two versions of the diffusion coefficients. As a consequence of this

zoo of uncalibrated local diffusion coefficients, major uncertainty occurs in the profiles of the mass frac-

tions, Xi(m, t), in the radiative zones of stellar interiors throughout the evolution of stars. For this reason,

so-called standard evolution models do not include any angular momentum and element transport in the

radiative zones.

Rotational or pulsational mixing is expected to (partly) homogenize the chemical mixture and thus

lead to flatter Xi(r) profiles in the layers where they are active compared to the case where no such extra

mixing occurs in the core boundary layer and envelope. Moreover, envelope mixing may transport elements

produced by the CNO cycle to the surface of the star. As highlighted in Chapter 6, 14N is the most important

byproduct of the CNO cycle. Hence CBM may transport it to the deep bottom of the stellar envelope, where

it can then be picked up and transported to the surface by any form of efficient envelope mixing. This is

observed in massive stars and is one of the major reasons to consider rotational mixing in stellar evolution

theory (see Figure 7.4).

The extra rotational or pulsational mixing processes occurring in the near-core boundary layers and in

the stellar envelope are assumed to happen on short time scales compared to the evolutionary time scale.

Rotating models including these ingredients therefore often ignore the element transport due to microscopic

atomic diffusion giving rise to concentrations of particular chemical species (see next subsection). However,

there is no justified physical reason for this “computationally convenient” simplification when the time scales

and levels of these processes are similar.

Figure 7.4: Predicted surface nitrogen abundance as a function of the projected rotational velocity v sin i measured from spectral line broadening for observed stars (indicated as dots). The results for stellar models with convective penetration as CBM over a distance αov = 0.35 Hp and rotational envelope mixing based on meridional circulation and shear instabilities (following Dmix(r) according to the lower right panel of Figure 7.3) are indicated in colour. The observed stars reveal that these models with rotational mixing are only

partly able to explain the observed N excess at the stellar surface. In particular, these models are challenged

by the measured N excess in slowly rotating stars. (From Brott et al., 2011, A&A, Vol. 530, A115, 20pp.)

7.4 Microscopic atomic diffusion

Aside from full and instantaneous mixing in convective regions and full or partial mixing in the convective

boundary and radiative layers, the profiles of the mass fractions Xi(r, t) may also change due to microscopic

transport processes. In this section, we consider element transport caused by microscopic atomic diffusion,

which may induce changes in Xi(r, t) caused by gradients operating in the radiative layers of the star. These

gradients may introduce lower or higher concentrations of particular chemical species at particular layers

within a radiatively stratified zone because heavy elements sink, while light elements surface.

A key aspect of assessing the importance of microscopic atomic diffusion is that the time scales on

which it acts are very different for the atmosphere than for the interior of the star. Diffusion time scales are

typically less than a century for the stellar atmosphere, while millions to billions of years for the interior

regions. Here, we limit ourselves to those aspects of atomic diffusion that act on long time scales comparable

to the contraction or nuclear time scales of the star.

Following numerous studies, four different aspects of microscopic atomic diffusion are considered in

stellar models. These occur due to pressure, temperature, and concentration gradients on the one hand, and

radiative forces on the other hand. While pressure and temperature gradients augment the concentration

of more massive species towards the centre of the star, concentration gradients have the opposite effect.

Generally, these three microscopic processes together lead to a larger concentration of heavier elements in

the deeper interior of the star. On the other hand, radiative forces levitate species with an efficiency that

depends on the details of the atomic structure of the involved isotopes. The calculation of the appropriate

radiative accelerations is therefore challenging in terms of computational needs.

Radiative levitation due to accelerations acting upon isotopes can be computed from atomic data, by

treating the appropriate multi-component gas. This requires evaluations of the frequency-dependent absorp-

tion coefficients derived from Coulomb potentials, taking into account partial ionization, and this for all the

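layers inside the star. Once the overall local velocities wi for each of the species i involved in the atomic diffusion are computed, they can be inserted in the equation governing the time evolution of the mass fraction Xi:

$$\frac{\partial X_i}{\partial t} = E_i - \frac{\partial}{\partial m}\left(4\pi r^2 \rho X_i w_i\right) + \frac{\partial}{\partial m}\left[\left(4\pi\rho r^2\right)^2 \left(D_{\rm conv} + D_{\rm ov} + D_{\rm env}\right)\frac{\partial X_i}{\partial m}\right], \tag{7.7}$$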

where the 2nd term on the right-hand side is the result of the radiative levitation acting upon species Xi and

the 3rd term is due to microscopic and macroscopic transport of the chemical species due to convection,

CBM, envelope mixing and atomic diffusion together. The computation of wi necessary to solve the set of

I Eqs (7.7) at each step of the evolution, is a major CPU challenge. This computational inconvenience is

the reason why radiative levitation is most often omitted in stellar evolution computations. Moreover, for

stars with a radiative or a very thin convective envelope (i.e., born with M ≳ 1.3 M⊙), the sinking of heavy elements from the surface towards the interior happens on such a short time scale that it results in unrealistic stellar atmosphere models depleted in all elements heavier than H. For this reason, such stellar models often require a further ad-hoc turbulent diffusion coefficient to be at work near the surface along with a thin stellar wind, to keep the metals in the appropriate position compatible with surface abundance measurements. For all these reasons, stellar models are often computed for the simplest case of atomic diffusion, namely helium settling at the interface of the thick convective outer envelope and the radiative interior of low-mass cool stars. This is entirely appropriate for stars like the Sun, with sufficiently cool envelope layers to prevent radiative levitation from occurring. Mixing at this interface due to helium settling is well calibrated for the Sun

from helioseismology.

Atomic diffusion impacts the concentration of species in the stellar interior on time scales that are

relevant for stellar evolution. Its effects are hard to unravel when only looking at a star’s luminosity and

effective temperature, which are the two quantities that define the evolutionary tracks in an HR diagram.

Models with and without atomic diffusion usually differ far less than typical observational errors of L or

log g plotted versus Teff . Given that the confrontation between data and theory in an HR diagram is often

the only assessment to evaluate stellar evolutionary theory by lack of additional data (particularly for faint

or very distant stars), and given the computational requirements, microscopic atomic diffusion tends to be

ignored in stellar and galactic astrophysics. It is, however, required to be considered for archaeological

studies of the Milky Way as they rely on chemical tagging and ageing of evolved stars because the effects

of atomic diffusion accumulate along the evolution of the star.

Chapter 8

Numerical computation of stellar structure and evolution models

In this chapter we first give an overview of the full system of basic 1D stellar structure equations deduced

in the former chapters. We ignore the Coriolis and centrifugal forces due to rotation in the equation of

motion, but we do consider the effect of mixing due to transport processes by means of uncalibrated diffusion

coefficients in the chemical equations describing the mass fractions. Furthermore we discuss the boundary conditions that are required for the computation of stellar models and present one relatively simple numerical method that shows how the models can be computed. Subsequently, we turn our attention to

the modern state-of-the-art software suite MESA, which forms the computational tool used for the lab work

of this SSE course.

8.1 The full system of basic equations

When we put together all relevant derived equations for a spherically symmetric star, we get the following

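system of differential equations in Lagrangian form:

$$
\begin{aligned}
\frac{\partial r}{\partial m} &= \frac{1}{4\pi r^2 \rho},\\
\frac{\partial P}{\partial m} &= -\frac{G m}{4\pi r^4} - \frac{1}{4\pi r^2}\frac{\partial^2 r}{\partial t^2},\\
\frac{\partial l}{\partial m} &= \varepsilon_n - \varepsilon_\nu - c_P \frac{\partial T}{\partial t} + \frac{\delta}{\rho}\frac{\partial P}{\partial t} = \varepsilon_n - \varepsilon_\nu + \varepsilon_g,\\
\frac{\partial T}{\partial m} &= -\frac{G m T}{4\pi r^4 P}\,\nabla,\\
\frac{\partial X_i}{\partial t} &= E_i - \frac{\partial}{\partial m}\left(4\pi\rho r^2 X_i w_i\right) + \frac{\partial}{\partial m}\left[\left(4\pi\rho r^2\right)^2 \left(D_{\rm conv} + D_{\rm ov} + D_{\rm env}\right)\frac{\partial X_i}{\partial m}\right], \quad i = 1, \ldots, I.
\end{aligned}
\tag{8.1}
$$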

The last equation is in fact a system of I equations in which one equation can be replaced by the normalising condition $\sum_i X_i = 1$. These equations describe the variation of the mass fractions Xi of the relevant species

i = 1, . . . , I considered in the chemical mixture of the star. In general ∇ stands for d lnT/d ln P , but when

the energy transport is only established by radiation (and conduction), ∇ is replaced by ∇rad, which was

defined in (5.29). When convective energy transport is important, ∇ in the fourth equation has to be replaced

by a value derived from the adopted approximation for the theory of convection. The fourth equation

assumes that the star is in hydrostatic equilibrium. As good practice in computational astrophysics,

users of stellar evolution codes should always check if this condition is fulfilled for the numerical models

achieved as outcome of the scientific software.
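Such a check can be sketched in Python as follows. The function compares the numerical derivative dP/dm of a model profile with the right-hand side of the hydrostatic equation, −Gm/(4πr⁴). The profile arrays and the constant-density toy star used to exercise the function are assumptions made for this illustration; a real evolution code provides its own output columns.

```python
import numpy as np

G = 6.674e-11  # gravitational constant in SI units


def hydrostatic_residual(m, r, P):
    """Relative mismatch between numerical dP/dm and -G m / (4 pi r^4)."""
    dP_dm = np.gradient(P, m)               # finite-difference derivative
    rhs = -G * m / (4.0 * np.pi * r**4)     # hydrostatic equilibrium
    return np.abs(dP_dm - rhs) / np.abs(rhs)


# Toy model: a sphere of constant density, for which P(r) is known analytically
rho, M = 1.41e3, 1.989e30
m = np.linspace(0.01 * M, M, 2000)          # avoid the singular point m = 0
r = (3.0 * m / (4.0 * np.pi * rho)) ** (1.0 / 3.0)
P = (2.0 * np.pi / 3.0) * G * rho**2 * (r[-1] ** 2 - r**2)

residual = hydrostatic_residual(m, r, P)
print("largest interior residual:", residual[1:-1].max())
```

The end points are excluded in the printed maximum because the one-sided differences used there are less accurate; residuals far above the discretisation error would point to a model that is not in hydrostatic equilibrium.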

In the system (8.1) of coupled differential equations we can identify some partial sub-systems. The first

two equations describe the mechanical part, which is only linked to the thermonuclear part via the density,

which in turn depends on the temperature. When the density is no longer linked to the temperature, we can

solve the first two equations without taking the other three into account. We then get the mechanical structure

expressed as r(m) and P (m). An example are the polytropic solutions. The last system of equations in

(8.1) describes the overall chemical aspect of the problem.

The equations in the system (8.1) contain functions that describe the characteristics of the stellar material, like ρ, εn, εν, κ, cP, ∇ad, δ and, via Ei, also the nuclear reaction rates rjk for the chosen network of isotopes. We assume that these are known as a function of P, T and of the chemical mixture at birth (t = t0)

described by the mass fractions Xi(m, t0) of the isotopes. In other words, we assume the equation of state

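to be known, as well as the Rosseland mean opacity, the equations for the other thermodynamical characteristics of the stellar material, the nuclear reaction rates, the energy production and energy loss by neutrinos:

$$
\begin{aligned}
\rho &= \rho(P, T, X_i), & \kappa &= \kappa(P, T, X_i), \\
c_P &= c_P(P, T, X_i), & \delta &= \delta(P, T, X_i), & \nabla_{\rm ad} &= \nabla_{\rm ad}(P, T, X_i), \\
r_{jk} &= r_{jk}(P, T, X_i), & \varepsilon_n &= \varepsilon_n(P, T, X_i), & \varepsilon_\nu &= \varepsilon_\nu(P, T, X_i).
\end{aligned}
\tag{8.2}
$$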

Definitions for cP , δ and ∇ad were given in Chapter 2. To compute them, we need to adopt an equation of

state throughout the star.

As already mentioned, the Rosseland mean κ is a good approximation for the opacity, except for the

outer stellar layers. The atmosphere requires a special approach for the energy transport because the mean

free path of the photons does not comply with the requirement of the diffusion approximation that this path

is much shorter than the distance the photons still have to pass before being radiated into space. Therefore

we have to solve a much more complicated energy transport equation in the stellar atmosphere. We do not

go into this matter in this course, but we refer to the course Stellar Atmospheres in the KU Leuven Master

of Astronomy & Astrophysics. Here, we simply assume that we have proper atmosphere models available

to be used as boundary conditions to close the system of coupled differential Eqs (8.1).

Let us now make the balance of the number of equations versus the number of unknowns, taking (8.2)

into account. All input functions described in (8.2) can be replaced by functions of P, T and Xi. For I different types of species, Eqs (8.1) constitute a system of I + 4 differential equations for the I + 4 unknowns r, P, T, l, X1, . . . , XI. The independent variables are m and t. When we assume that the total mass of the star does not change in time (no mass loss), and when we denote the birth of the star with t0, we have to compute solutions of the stellar structure equations in the intervals 0 ≤ m ≤ M, t ≥ t0.

We have to solve a system of coupled non-linear partial differential equations. We will only find the physically relevant solutions if we impose a correct number of appropriate boundary conditions for m = 0 and m = M and if we know the initial values of the unknown functions. To figure out for which functions we have to know the initial values, we replace the time derivatives of P and T in the third equation of Eqs (8.1) by the time derivative of the entropy s, −T ∂s/∂t, based on Eq. (3.46). We conclude that we can solve the entire system Eqs (8.1) if we have initial values for the functions r(m, t0), ṙ(m, t0), s(m, t0) and

Xi(m, t0). After we have found the appropriate initial values and physically justified boundary conditions

have been formulated, it comes down to solving the system Eqs (8.1) for a given equation of state and input

physics. A solution r(m), P (m), T (m), l(m),Xi(m) for a given time t is called a stellar model.

8.2 Time scales and simplifications

There are three types of time derivatives in the system Eqs (8.1). Each of them is connected with a charac-

teristic time scale. The term with ∂2r/∂t2 was used to estimate the hydrostatic time scale τhydr, the time

derivatives in the third equation resulted in the definition of the Helmholtz-Kelvin time scale τHK and the

time derivatives in the last set of equations led to the nuclear time scale τn.

We have already shown that the inertia term in the second equation of Eqs (8.1) can be neglected if

the star evolves slowly compared to the hydrostatic time scale. When the evolution of the star is dominated

by the stable nuclear energy production, we can replace this second equation by the equation of hydrostatic

equilibrium since the Helmholtz-Kelvin time scale as well as the nuclear time scale are much longer than

the hydrostatic time scale. In this case, we only have to know initial values for the functions s(m, t0) and

Xi(m, t0) to solve the system of equations.
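The separation of the three time scales can be made concrete with a short Python sketch for the Sun. It uses common order-of-magnitude estimates, τhydr ≈ (R³/GM)^{1/2}, τHK ≈ GM²/(RL) and τn ≈ 0.1 × 0.007 Mc²/L; the prefactors (10% of the mass burnt, 0.7% mass-to-energy efficiency) are assumptions and may differ from the definitions adopted earlier in the course.

```python
import math

G, C = 6.674e-11, 2.998e8                    # SI units
M_SUN, R_SUN, L_SUN = 1.989e30, 6.957e8, 3.828e26
YEAR = 3.156e7                               # seconds


def timescales(M, R, L, burn_fraction=0.1, efficiency=0.007):
    """Order-of-magnitude hydrostatic, Helmholtz-Kelvin and nuclear time scales (s)."""
    tau_hydr = math.sqrt(R**3 / (G * M))
    tau_hk = G * M**2 / (R * L)
    tau_n = burn_fraction * efficiency * M * C**2 / L
    return tau_hydr, tau_hk, tau_n


tau_hydr, tau_hk, tau_n = timescales(M_SUN, R_SUN, L_SUN)
print(f"tau_hydr ~ {tau_hydr / 60.0:.0f} min")
print(f"tau_HK   ~ {tau_hk / YEAR:.1e} yr")
print(f"tau_n    ~ {tau_n / YEAR:.1e} yr")
```

With these estimates the hydrostatic time scale is many orders of magnitude shorter than the other two, which is what justifies dropping the inertia term.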

8.3 Boundary conditions

Setting boundary conditions for the system of equations (8.1) is an important part of the problem to be

solved. The influence of the chosen boundary conditions on the solutions is often difficult to interpret. The

reason is that the boundary conditions for the stellar structure cannot be limited to one end of the mass

interval [0,M ], but have to be divided into conditions for the stellar centre and for the stellar surface. The

boundary conditions in the stellar core are quite simple in comparison to those for the stellar surface. These

latter have to be related to observational quantities and rely on a much more complicated energy transfer

equation. Here, we will only discuss a star in full equilibrium during the phase of core-hydrogen burning.

8.3.1 Central boundary conditions

We will search for central values for the unknowns r, l, P, T . We can immediately determine two boundary

conditions for the stellar centre (m = 0). Since the density has to remain finite, r = 0 has to be valid and

since the energy sources also have to remain finite, also l = 0 has to be valid. There are, on the contrary, no

conditions that we can impose to figure out the values for the central pressure PC and the central temperature

TC . We thus have only two boundary conditions and we will have to work each time with a two-parameter

solution for a given TC and PC . Therefore it is useful to investigate the behaviour of the four unknowns

close to the stellar core m → 0 at a certain point in time t = t0. The first equation of the system Eqs (8.1)

can be written as

$$d\left(r^3\right) = \frac{3}{4\pi\rho}\, dm. \tag{8.3}$$

For a constant density ρ = ρC (so for low values of m) we can integrate this equation. This results in

$$r = \left(\frac{3}{4\pi\rho_C}\right)^{1/3} m^{1/3}, \tag{8.4}$$

in which the integration constant was chosen to be zero to comply with the demand of r(m = 0) = 0. We

can see this result as the first term of a series expansion for r around m = 0. A similar integration of the

energy equation with condition l(m = 0) = 0 gives

$$l = \left(\varepsilon_n - \varepsilon_\nu + \varepsilon_g\right)_C\, m. \tag{8.5}$$

When we now substitute (8.4) in the equation of hydrostatic equilibrium, ∂P/∂m = −Gm/(4πr⁴), we get for low values of m:

$$\frac{dP}{dm} = -\frac{G}{4\pi}\left(\frac{4\pi\rho_C}{3}\right)^{4/3} m^{-1/3}, \tag{8.6}$$

which can be integrated to obtain

$$P - P_C = -\frac{3G}{8\pi}\left(\frac{4\pi\rho_C}{3}\right)^{4/3} m^{2/3}. \tag{8.7}$$

Furthermore, the pressure gradient has to vanish in the stellar centre, as follows from the equation of hydrostatic equilibrium: dP/dr ∼ m/r² ∼ r³/r² → 0.

For the variation of the temperature close to the centre, we limit to the radiative case, in which

$$\frac{dT}{dm} = -\frac{3}{64\pi^2 a c}\,\frac{\kappa\, l}{r^4 T^3}. \tag{8.8}$$

For P → PC and T → TC the opacity will converge to a certain value κC. When we substitute l by (8.5) and r by (8.4) we can integrate (8.8) for small m values. We then get

$$T^4 - T_C^4 = -\frac{1}{2ac}\left(\frac{3}{4\pi}\right)^{2/3} \kappa_C \left(\varepsilon_n - \varepsilon_\nu + \varepsilon_g\right)_C\, \rho_C^{4/3}\, m^{2/3} \tag{8.9}$$

when the energy transport in the core is radiative.
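The four series expansions (8.4), (8.5), (8.7) and (8.9) can be collected in one small Python function, as sketched below. The central values used in the example are rough solar-like numbers in SI units, chosen only to exercise the formulas; `eps_c` stands for the net central energy generation (εn − εν + εg)C and all names are hypothetical.

```python
import math

G = 6.674e-11        # gravitational constant (SI)
A_RAD = 7.5657e-16   # radiation density constant a (J m^-3 K^-4)
C_LIGHT = 2.998e8    # speed of light (m/s)


def central_expansion(m, rho_c, P_c, T_c, eps_c, kappa_c):
    """Leading-order r, l, P, T close to the centre (radiative case)."""
    r = (3.0 * m / (4.0 * math.pi * rho_c)) ** (1.0 / 3.0)             # Eq. (8.4)
    l = eps_c * m                                                      # Eq. (8.5)
    P = P_c - (3.0 * G / (8.0 * math.pi)) \
        * (4.0 * math.pi * rho_c / 3.0) ** (4.0 / 3.0) * m ** (2.0 / 3.0)  # Eq. (8.7)
    T4 = T_c**4 - (1.0 / (2.0 * A_RAD * C_LIGHT)) \
        * (3.0 / (4.0 * math.pi)) ** (2.0 / 3.0) \
        * kappa_c * eps_c * rho_c ** (4.0 / 3.0) * m ** (2.0 / 3.0)        # Eq. (8.9)
    return r, l, P, T4 ** 0.25


# Rough solar-like central values (illustrative assumptions)
rho_c, P_c, T_c = 1.5e5, 2.3e16, 1.57e7
eps_c, kappa_c = 1.7e-3, 0.1
m_inner = 1.0e-4 * 1.989e30   # mass of the innermost shell
r, l, P, T = central_expansion(m_inner, rho_c, P_c, T_c, eps_c, kappa_c)
print(r, l, P / P_c, T / T_c)
```

For such a small enclosed mass the pressure and temperature differ from their central values by well under a per cent, so the leading-order terms suffice.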

8.3.2 Boundary conditions for the surface

It is complicated to deduce appropriate boundary conditions for the surface. As a very rough approach we

could take at first instance the naive conditions P → 0 and T → 0 for m→M . This indeed expresses that

P and T take very small values at the stellar surface in comparison with the values in the stellar interior, but

in the end the temperature and the pressure at the stellar surface are not zero.

The next step is going to a sphere which we can call the surface of the star and which defines the stellar

radius r = R. In the study of the stellar atmosphere, one uses the photosphere, i.e., the sphere where the

optical depth, defined as

$$\tau \equiv \int_R^\infty \kappa\rho\, dr = \kappa_{\rm phot} \int_R^\infty \rho\, dr, \tag{8.10}$$

equals 2/3. Here κphot represents a mean opacity of the photosphere. In hydrostatic equilibrium, the pressure

in this photosphere is determined by the matter above it. The gravity can be assumed to be constant, g = GM/R², in this area because the photosphere is a thin shell containing only a very small amount of matter.

With the help of (3.15) and (8.10) we then get for τ = 2/3

$$P_{r=R} = \int_R^\infty g\rho\, dr = \frac{GM}{R^2}\int_R^\infty \rho\, dr = \frac{GM}{R^2}\,\frac{2}{3}\,\frac{1}{\kappa_{\rm phot}}. \tag{8.11}$$

The temperature in the photosphere is, to a good approximation, given by the effective temperature of the

star.
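The photospheric pressure of Eq. (8.11) is easy to evaluate, as the Python sketch below shows. The mean photospheric opacity adopted in the example is a hypothetical value; in practice κphot itself depends on the photospheric pressure and temperature, so the relation has to be solved iteratively.

```python
G = 6.674e-11  # gravitational constant in SI units


def photospheric_pressure(M, R, kappa_phot):
    """Pressure at optical depth 2/3 from Eq. (8.11), in Pa for SI input."""
    surface_gravity = G * M / R**2
    return (2.0 / 3.0) * surface_gravity / kappa_phot


# Solar mass and radius with an assumed mean opacity of 0.03 m^2/kg
M_SUN, R_SUN = 1.989e30, 6.957e8
print(photospheric_pressure(M_SUN, R_SUN, kappa_phot=0.03))
```

A lower opacity means one has to look deeper to reach τ = 2/3, hence a higher photospheric pressure.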

The photospheric boundary conditions deduced for Tr=R and Pr=R give two relations between the

surface values for the functions P, T, l, r that are certainly an improvement compared to the naive boundary

conditions P → 0 and T → 0. The weakest point when using them is that they were deduced for an area

where the basic assumption we made for the description of the radiative energy transport, namely that the

mean free path of a photon is much shorter than the distance to the stellar surface, is no longer valid. In fact,

a much more complicated energy transport equation should be used in the photosphere. We again refer to

the course Stellar Atmospheres.

In practice, the transition from the interior solutions (labelled “in”) to the exterior, atmospheric solutions (labelled “out”) happens by choosing a fitting point mf where both solutions are coupled to each other. On the one hand, mf has to be situated deep enough into the stellar interior that the assumptions made to deduce the interior equations are still valid. We then get solutions of these equations in the fitting point: $r_f^{\rm in}, P_f^{\rm in}, T_f^{\rm in}, l_f^{\rm in}$. On the other hand, mf has to be close enough to M so that we can use the simplification of an outer layer in thermal equilibrium where l = L. The smaller M − mf, the less energy can be gathered or released in the outer layer. In the study of stellar atmospheres, the solutions for the four unknown functions $r_f^{\rm out}, P_f^{\rm out}, T_f^{\rm out}, l_f^{\rm out}$ are computed. One can show that these solutions are functions of the parameters R and L. The boundary conditions require that the four solutions constructed for the interior are equal to the ones computed for the atmosphere:

$$r_f^{\rm in} = r_f^{\rm out}, \quad P_f^{\rm in} = P_f^{\rm out}, \quad T_f^{\rm in} = T_f^{\rm out}, \quad l_f^{\rm in} = l_f^{\rm out}. \tag{8.12}$$

Those four boundary conditions can be met because we have enough free parameters: TC and PC for the

internal solutions and R and L for the external solutions.

For numerical applications (see next section) the following procedure is used. In the point mf we find the exterior solutions $r_f^{\rm out}(R, L)$, $P_f^{\rm out}(R, L)$, $T_f^{\rm out}(R, L)$, $l_f^{\rm out}(R, L)$ by numerical integration of the equations that are relevant in the atmosphere. The last function is very simple: $l_f^{\rm out} = L$. The first one can be inverted without any problems, which leads to $R = R(r_f^{\rm out}, L)$. This equation is now used to express the R-dependence of the other two functions: $P_f^{\rm out}(R(r_f^{\rm out}, L), L) \equiv \pi(r_f^{\rm out}, L)$ and $T_f^{\rm out}(R(r_f^{\rm out}, L), L) \equiv \theta(r_f^{\rm out}, L)$, where π and θ are known functions of $r_f^{\rm out}$ and $l_f^{\rm out} = L$. We now replace the variables for the outside by their equivalents at the inside, taking the fitting conditions (8.12) into account:

$$P_f^{\rm in} = \pi(r_f^{\rm in}, L), \quad T_f^{\rm in} = \theta(r_f^{\rm in}, L). \tag{8.13}$$

These are two boundary conditions for the internal solutions. They were constructed in a way that, when a

good internal solution is found, this can always be coupled to the external solution in a continuous way.

8.4 A simple numerical scheme: the Henyey method

An analytical solution of the system Eqs (8.1) is not possible for realistic equations of state. We should

thus search for numerical solutions to solve the system of equations. Due to computational demands the

calculations of the solutions for the whole system have only been possible during the last half century.

Before modern computers were available, simpler stellar models, like polytropes, had to be used to predict

stellar evolution. One of the numerical methods that has seen widespread use to compute solutions of the

system Eqs (8.1) is the Henyey method, which we will discuss now. This method is particularly suitable to

compute stellar models during the core-hydrogen burning stage. Further in this chapter, and in the course lab

work, we introduce the students into a modern state-of-the-art stellar evolution code that is able to compute

stellar models at all the stages of the evolution and for all mass ranges.

The Henyey method is a very practical method to solve differential equations with boundary conditions

at both ends of the solution interval. A first approximate starting solution is proposed and evaluated. By

means of an iterative process the starting solution is gradually improved until an appropriate solution is

obtained that meets a predetermined precision. At each iteration step corrections are being applied to all

variables and in all grid points so that the effect of these variations to the final solution, including the

boundary conditions, is taken into account.

For spherical stars in hydrostatic equilibrium we have to solve the system Eqs (8.1), where we replace

the second equation by ∂P/∂m = −Gm/4πr4, with the corresponding boundary conditions as described in

the previous subsection. For standard stellar models without any transport processes (i.e., for ∂Xi/∂m = 0),

the set of equations allows us to solve two separate partial systems. We restrict the description to this

simplified case, where one can first solve the “spatial” system for the given Xi(m) and afterwards apply

the last system of equations of Eqs (8.1) for a small time step Δt. After this one can again solve the first

partial system for the new Xi(m), etc. We will now describe in detail how to solve the spatial system for

such standard models. For more sophisticated models with the chemical mixing as discussed in Chapter 7,

we refer to the use of the MESA code as illustrated during the MESA Lab work.

We limit ourselves to solving models in full equilibrium: $\ddot{r} = \dot{P} = \dot{T} = 0$. The input physics given in (8.2) can, in the given system, be replaced by their dependencies on P and T. We then only have to define initial values for Xi(m) for each point, which we can choose as known parameters, and solve for four unknown functions r, P, T, l in the interval [0, M] for a given M. We write these four equations as

$$\frac{dy_i}{dm} = f_i(y_1, \ldots, y_4), \quad i = 1, \ldots, 4, \tag{8.14}$$

where we have introduced the following abbreviations y1 = r, y2 = P, y3 = T, y4 = l.

The following step is the discretisation of the equations (8.14), by replacing them by difference equations for a finite mass interval $[m^j, m^{j+1}]$. We indicate the values of the variables at each end of the mass interval $[m^j, m^{j+1}]$ with upper indices: $y_1^j, y_1^{j+1}, \ldots, y_4^j, y_4^{j+1}$. The functions fi on the right-hand side of Eqs (8.14) have to be evaluated in an average argument, which is indicated with $y_i^{j+1/2}$. A logical choice for these arguments is the arithmetic or geometric mean of $y_i^j$ and $y_i^{j+1}$. Let us now define the four functions:

$$A_i^j \equiv \frac{y_i^j - y_i^{j+1}}{m^j - m^{j+1}} - f_i\left(y_1^{j+1/2}, \ldots, y_4^{j+1/2}\right), \quad i = 1, \ldots, 4, \tag{8.15}$$

then the following difference equations

$$A_i^j = 0, \quad i = 1, \ldots, 4 \tag{8.16}$$

replace the differential equations Eqs (8.14) that we want to solve.
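The residual functions of Eq. (8.15) can be sketched in Python as follows, using the arithmetic mean for the mid-interval arguments. The right-hand side `f` is passed as a generic callable; the constant-slope toy problem and the grid are assumptions chosen so that the residuals vanish exactly. In the scheme described in the text the innermost interval is treated with the series expansions instead, which this sketch ignores.

```python
import numpy as np


def henyey_residuals(m, y, f):
    """Residuals A_i^j of Eq. (8.15) for every interval [m^j, m^(j+1)].

    m : array of shape (K,), grid points ordered from the fitting point inwards
    y : array of shape (K, 4), trial values of (r, P, T, l) at the grid points
    f : callable mapping an array of shape (4,) to the four right-hand sides
    """
    m = np.asarray(m, dtype=float)
    y = np.asarray(y, dtype=float)
    y_mid = 0.5 * (y[:-1] + y[1:])                      # arithmetic mean
    slopes = (y[:-1] - y[1:]) / (m[:-1] - m[1:])[:, None]
    rhs = np.array([f(y_half) for y_half in y_mid])
    return slopes - rhs                                  # shape (K - 1, 4)


# Toy problem: dy_i/dm = c_i, solved exactly by y_i = c_i * m
c = np.array([1.0, -2.0, 0.5, 3.0])
m_grid = np.linspace(1.0, 0.0, 6)
y_trial = np.outer(m_grid, c)
A = henyey_residuals(m_grid, y_trial, lambda y_half: c)
print(np.abs(A).max())
```

For a trial solution that does not satisfy the equations, the entries of the returned array are the mismatches that the iteration has to drive to zero.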

The two boundary conditions for the outside of the atmosphere are fixed in a fitting point mf. We choose this point as the one with upper index j = 1. These two boundary conditions give a relation between the four variables $y_1^1, \ldots, y_4^1$ in the point $m^1 = m_f$. With the definitions

$$B_1 \equiv y_2^1 - \pi(y_1^1, y_4^1), \quad B_2 \equiv y_3^1 - \theta(y_1^1, y_4^1) \tag{8.17}$$

the boundary conditions (8.13) are given by

Bi = 0, i = 1, 2. (8.18)

119

We now consider the whole interval in $m$, from $m^K = 0$ to the fitting point $m^1 = m_f$. We divide this interval into $K-1$ subintervals by choosing $K$ grid points, which do not have to be equidistant. In the innermost interval, between the central point $m^K = 0$ and $m^{K-1}$, we use the series expansions (8.4), (8.5), (8.7), and (8.9) for the four variables. These four equations are of the form

$$C_i\left(y_1^{K-1}, \ldots, y_4^{K-1}, y_2^K, y_3^K\right) = 0, \quad i = 1, \ldots, 4, \tag{8.19}$$

in which the requirement $y_1^K = y_4^K = 0$ ($r = l = 0$ in the centre) is already incorporated.

At the $K$ grid points we have $4K - 2$ unknown variables, since $y_1^K = y_4^K = 0$. These unknowns have to satisfy (8.18) at the first point, (8.16) for all intervals except the last ($j = 1, \ldots, K-2$), and (8.19) for the last interval. In total we have $2 + 4(K-2) + 4 = 4K - 2$ equations, which we can write schematically as

$$\begin{aligned}
B_i &= 0, & i &= 1, 2, \\
A_i^j &= 0, & i &= 1, \ldots, 4, \quad j = 1, \ldots, K-2, \\
C_i &= 0, & i &= 1, \ldots, 4.
\end{aligned} \tag{8.20}$$

We search for a solution for a given $M$ and $X_i(m)$, which occur as input parameters in the equations. We also need a first rough estimate of the values of the unknowns: $(y_i^j)_1$ for $i = 1, \ldots, 4$; $j = 1, \ldots, K$. Since the $(y_i^j)_1$ are only approximations, they will not satisfy (8.20):

$$B_i(1) \neq 0, \quad A_i^j(1) \neq 0, \quad C_i(1) \neq 0. \tag{8.21}$$

We now derive corrections $\delta y_i^j$ for all variables at all grid points so that the second approximation $(y_i^j)_2 = (y_i^j)_1 + \delta y_i^j$ of the arguments makes the functions $B_i$, $A_i^j$ and $C_i$ vanish. The corrections $\delta y_i^j$ of the arguments produce corrections $\delta B_i$, $\delta A_i^j$, $\delta C_i$ of the functions. We thus demand that

$$B_i(1) + \delta B_i = 0, \quad A_i^j(1) + \delta A_i^j = 0, \quad C_i(1) + \delta C_i = 0. \tag{8.22}$$

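For corrections that are small enough, we can expand $\delta B_i$, $\delta A_i^j$, $\delta C_i$ in powers of $\delta y_i^j$ and keep only the linear terms. For $B_1$ this gives, for example,

$$\delta B_1 \approx \frac{\partial B_1}{\partial y_1^1}\,\delta y_1^1 + \frac{\partial B_1}{\partial y_2^1}\,\delta y_2^1 + \frac{\partial B_1}{\partial y_3^1}\,\delta y_3^1 + \frac{\partial B_1}{\partial y_4^1}\,\delta y_4^1. \tag{8.23}$$

With this linearisation, the conditions (8.22) become

$$\begin{aligned}
\frac{\partial B_i}{\partial y_1^1}\,\delta y_1^1 + \ldots + \frac{\partial B_i}{\partial y_4^1}\,\delta y_4^1 &= -B_i, \\
\frac{\partial A_i^j}{\partial y_1^j}\,\delta y_1^j + \ldots + \frac{\partial A_i^j}{\partial y_4^j}\,\delta y_4^j + \frac{\partial A_i^j}{\partial y_1^{j+1}}\,\delta y_1^{j+1} + \ldots + \frac{\partial A_i^j}{\partial y_4^{j+1}}\,\delta y_4^{j+1} &= -A_i^j, \\
\frac{\partial C_i}{\partial y_1^{K-1}}\,\delta y_1^{K-1} + \ldots + \frac{\partial C_i}{\partial y_4^{K-1}}\,\delta y_4^{K-1} + \frac{\partial C_i}{\partial y_2^K}\,\delta y_2^K + \frac{\partial C_i}{\partial y_3^K}\,\delta y_3^K &= -C_i,
\end{aligned} \tag{8.24}$$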

where the indices $i$ and $j$ take the same values as in (8.20). We again have $4K-2$ (linear inhomogeneous) equations for as many unknown corrections $\delta y_i^j$ (since $\delta y_1^K = \delta y_4^K = 0$ following the boundary conditions). When evaluating the linearised conditions (8.24), all functions $B_i$, $A_i^j$, $C_i$ and all their derivatives have to be computed with the first approximations $(y_i^j)_1$ as arguments. The scheme (8.24) can be written much more compactly in matrix form:

$$H \begin{pmatrix} \delta y_1^1 \\ \vdots \\ \delta y_3^K \end{pmatrix} = - \begin{pmatrix} B_1 \\ \vdots \\ C_4 \end{pmatrix}. \tag{8.25}$$

Here $H$ is the Henyey matrix, whose elements are the derivatives on the left-hand side of (8.24).

When $H$ has a non-zero determinant, we can solve the system of linear equations and compute the corrections $\delta y_i^j$. In turn, these lead to a better, second approximation of the unknowns, $(y_i^j)_2$. When we take these as arguments in Eqs (8.20), we will still find

$$B_i(2) \neq 0, \quad A_i^j(2) \neq 0, \quad C_i(2) \neq 0, \tag{8.26}$$

because we only worked in the linear approximation and, moreover, numerical inaccuracies are always involved. Therefore we take a second iteration step, in which we determine new corrections by the same method, leading to a third approximation: $(y_i^j)_3 = (y_i^j)_2 + \delta y_i^j$. We continue this iteration until the approximate solution is close enough to the solution we are searching for, according to a predetermined stopping criterion. In this way one determines the entire structure of a star in equilibrium, given the mass and the chemical composition in the different layers at birth.
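The relaxation scheme of this section can be sketched on a toy problem. The example below is only an illustration under stated assumptions, not the stellar-structure system itself: it replaces the four equations (8.14) by a hypothetical two-variable boundary-value problem ($y_1' = y_2$, $y_2' = -e^{y_1}$, with $y_1 = 0$ at both ends of $[0, 1]$), uses the arithmetic mean for the mid-interval arguments, and builds the matrix of derivatives by finite differences rather than analytically. The names `f`, `residuals`, `jacobian` and `relax` are chosen for this sketch only.

```python
import numpy as np

def f(y):
    """Right-hand sides of the toy system y1' = y2, y2' = -exp(y1) (an assumption)."""
    return np.array([y[1], -np.exp(y[0])])

def residuals(Y, x):
    """Stack one boundary condition, all difference equations A_i^j, and the
    second boundary condition into a single vector (analogue of Eq. 8.20)."""
    K = len(x)
    G = np.empty(2 * K)
    G[0] = Y[0, 0]                                   # y1 = 0 at the first grid point
    for j in range(K - 1):
        y_mid = 0.5 * (Y[j] + Y[j + 1])              # arithmetic mean y^{j+1/2}
        G[1 + 2 * j: 3 + 2 * j] = (Y[j + 1] - Y[j]) / (x[j + 1] - x[j]) - f(y_mid)
    G[-1] = Y[-1, 0]                                 # y1 = 0 at the last grid point
    return G

def jacobian(Y, x, eps=1e-7):
    """Matrix of derivatives of all residuals with respect to all unknowns,
    approximated by one-sided finite differences (toy analogue of H)."""
    n = Y.size
    G0 = residuals(Y, x)
    H = np.empty((n, n))
    for k in range(n):
        Yp = Y.flatten()                             # flatten() returns a fresh copy
        Yp[k] += eps
        H[:, k] = (residuals(Yp.reshape(Y.shape), x) - G0) / eps
    return H

def relax(x, Y, tol=1e-9, max_iter=20):
    """Iterate Y -> Y + dY, with H dY = -G, until the corrections are below tol."""
    for it in range(max_iter):
        dY = np.linalg.solve(jacobian(Y, x), -residuals(Y, x)).reshape(Y.shape)
        Y = Y + dY
        if np.max(np.abs(dY)) < tol:
            break
    return Y, it + 1

x = np.linspace(0.0, 1.0, 41)                        # K = 41 grid points
Y, n_iter = relax(x, np.zeros((41, 2)))              # crude first approximation: all zeros
print(n_iter, Y[20, 0])                              # iterations used, y1 at x = 0.5
```

As in the text, the number of unknowns ($2K$ here) equals the number of equations ($2 + 2(K-1)$), and each iteration solves one linear system for the corrections. A production code would exploit the block structure of the matrix instead of a dense solve.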

In Figures 8.1 to 8.5 we show the functions m(r), P(r), ρ(r), T(r) and l(r) (logarithmic scale), derived with the Henyey method for a star with an initial mass of 1 M⊙ (left panels) and of 15 M⊙ (right panels) just after birth. The initial chemical composition in the whole star was X = 0.74, Y = 0.24, Z = 0.02, using the solar chemical mixture. Further, we assumed an ideal gas with radiation as EOS, taking ionisation effects into account. For the energy production, the nuclear networks describing the pp chains and the CNO cycle were used. Convective energy transport was taken into account by using the mixing length theory as described in Chapter 5. All other sources of chemical mixing were ignored.

From Figure 8.1 we deduce that the mass is strongly concentrated near the stellar centre: approximately 80% of the mass of the sun-like star is situated within a sphere with r = 0.4 R⊙, so within a fraction 0.064 of the total volume. For a star with 15 M⊙, 80% of the mass is situated within the inner half of the radius, which corresponds to a fraction 0.125 of the total volume. We conclude that the less massive the star, the more strongly its mass is concentrated towards the centre. The outer layers contribute negligibly to the total mass of the star.
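The volume fractions quoted in this discussion follow from the cube of the fractional radius. A minimal check, assuming a spherical star and a hypothetical helper name `volume_fraction`:

```python
def volume_fraction(r_over_R):
    """Fraction of the volume of a sphere of radius R enclosed within radius r."""
    return r_over_R ** 3

# Fractional radii used in the text: 0.4 and 0.5 for the mass, 0.2 for the luminosity.
for r_over_R in (0.4, 0.5, 0.2):
    print(f"r/R = {r_over_R}: volume fraction = {volume_fraction(r_over_R):.3f}")
```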

The luminosity is even more concentrated than the mass (see Figure 8.5): 90% of the luminosity is created within r = 0.2R, so within a fraction 0.008 of the volume of the star. It is in that central core that the nuclear fusion occurs. In all surrounding layers the energy is only transported outwards; there l(r) = L is constant. For the sun-like star, the density strongly peaks in the centre and at r = 0.5 R⊙ it has already decreased by a factor of 100. The pressure profile follows that of the density. For a massive star the decrease in density and pressure is more gradual than for the sun-like star. The temperature changes gradually and has "only" decreased by a factor of 3 at r = 0.5 R⊙. The temperature decreases rapidly near the surface of the star, because the radiation can escape easily from there. The models of the current Sun (age approximately 5 × 10^9 years) are characterised by a chemical composition X ≈ 0.35, Y ≈ 0.636, Z = 0.014 in the stellar core, while the initial chemical composition X = 0.718, Y = 0.268, Z = 0.014 is still valid for the regions with r > 0.2 R⊙. The mass, density, pressure, temperature, and luminosity have not changed much since its birth.

Figure 8.1: The mass distribution m(r) as a function of the position inside the star for a star of 1 M⊙ (on the left) and of 15 M⊙ (on the right).

Figure 8.2: The pressure P(r) as a function of the position in the star for a star of 1 M⊙ (on the left) and of 15 M⊙ (on the right).

Figure 8.3: The density ρ(r) as a function of the position in the star for a star of 1 M⊙ (on the left) and of 15 M⊙ (on the right).

Figure 8.4: The temperature T(r) as a function of the position in the star for a star of 1 M⊙ (on the left) and of 15 M⊙ (on the right).

Figure 8.5: The luminosity l(r) as a function of the position in the star for a star of 1 M⊙ (on the left) and of 15 M⊙ (on the right).

8.5 The MESA stellar structure and evolution code

We now turn to more realistic and more complex stellar models. As part of this course, the students learn how to use and interpret the outcome of a state-of-the-art stellar evolution code, which relies on far more sophisticated numerical methods than the Henyey method described in the previous section and allows for the inclusion of many more physical effects, such as transport processes and the chemical mixing they induce, as described in Chapter 7. This is the MESA code, whose first version was released to the public in 2011. The code is upgraded continuously by its team of developers, led by Bill Paxton at the University of California at Santa Barbara, following suggestions from an active community of hundreds of users worldwide, among them members of the Institute of Astronomy of KU Leuven.

Students of this SSE course are invited to read the history of MESA, the motivation of the developers,

and the terms of reference at the MESA website:


http://mesa.sourceforge.net/

The MESA code is described and discussed in five extensive peer-reviewed instrument papers, which

we use as study material for the Lab work of this master course:

1. Paxton B., Bildsten L., Dotter A., Herwig F., Lesaffre P., Timmes F., 2011, “Modules for Experiments

in Stellar Astrophysics (MESA)”, The Astrophysical Journal Supplement Series, 192, article id. 3,

35 pp.

2. Paxton B., Cantiello M., Arras P., Bildsten L., Brown E. F., Dotter A., Mankovich C., Montgomery

M. H., Stello D., Timmes F. X., Townsend R., 2013, “Modules for Experiments in Stellar Astrophysics

(MESA): Planets, Oscillations, Rotation, and Massive Stars”, The Astrophysical Journal Supplement

Series, 208, article id. 4, 42 pp.

3. Paxton B., Marchant P., Schwab J., Bauer E. B., Bildsten L., Cantiello M., Dessart L., Farmer R., Hu

H., Langer N., Townsend R. H. D., Townsley D. M., Timmes F. X., 2015, “Modules for Experiments

in Stellar Astrophysics (MESA): Binaries, Pulsations, and Explosions”, The Astrophysical Journal

Supplement Series, 220, article id. 15, 44 pp.

4. Paxton B., Schwab J., Bauer E. B., Bildsten L., Blinnikov S., Duffell P., Farmer R., Goldberg J. A., Marchant P., Sorokina E., Thoul A., Townsend R. H. D., Timmes F. X., 2018, “Modules for Experiments in Stellar Astrophysics (MESA): Convective Boundaries, Element Diffusion, and Massive Star Explosions”, The Astrophysical Journal Supplement Series, 234, article id. 34, 50 pp.

5. Paxton B., Smolec R., Schwab J., Gautschy A., Bildsten L., Cantiello M., Dotter A., Farmer R., Goldberg J. A., Jermyn A. S., Kanbur S. M., Marchant P., Thoul A., Townsend R. H. D., Wolf W. M., Zhang M., Timmes F. X., 2019, “Modules for Experiments in Stellar Astrophysics (MESA): Pulsating Variable Stars, Rotation, Convective Boundaries, and Energy Conservation”, The Astrophysical Journal Supplement Series, 243, article id. 10, 44 pp.

Students are encouraged to download these papers and read them carefully as preparation for their practical MESA lab tasks. The MESA software and the basic framework of the code will be explained to the students as part of the project work included in this course. The aim is that the students learn how to compute numerical stellar models with a modern code that is currently in use by numerous active researchers in the field of stellar astrophysics. In this way, any student who passes this course will be equipped with a modern tool to construct important building blocks for many other topics in modern astrophysics: models of stellar interiors in all stages of stellar evolution.


PART III: STELLAR EVOLUTION


Chapter 9

Star formation

9.1 The interstellar medium

The existence of interstellar matter was demonstrated in the early 1900s, based on observations of the binary star δ Orionis. Because of the motion of the two components in the binary system, the spectral lines of the stars shift back and forth in wavelength (Doppler shift) according to their orbital motion. Measurements of the spectral lines of δ Orionis revealed a spectral absorption line of calcium that did not follow the shifts of the other spectral lines. In 1904, Hartmann correctly concluded that this absorption line must be caused by matter in between δ Orionis and the Earth.

Another indication for the presence of matter between the stars is provided by the dark regions in the Milky Way. Initially scientists thought these regions, where far fewer stars are seen, were intrinsically depleted in stars. In reality, these dark regions are concentrations of dust that block the stellar light at these locations. In the direction of such dark clouds, only the stars in front of the cloud are visible, which explains why we see fewer stars in that direction. Interstellar dust has a typical temperature between 10 and 100 K.

Besides interstellar dust, interstellar gas is present everywhere in space. It is observed in the form of very narrow absorption lines in the spectra of stars. The gas has the same composition as young, newborn stars. The average density of the gas is extremely low, approximately 1 atom per cm^3. The temperature is around 1000 K. The gas therefore mostly consists of neutral atoms, primarily hydrogen. In the surroundings of hot stars, the gas is ionised by the UV radiation and can heat up to 10,000 K, because the neutral H atoms can absorb the UV photons. Photons with an energy E > 13.6 eV ionise H, and put an energy E − 13.6 eV into the kinetic energy of the electron that is released. Via collisions with other electrons and protons, this kinetic energy is distributed over the gas as internal energy.

The interstellar matter is not homogeneously distributed in space, but concentrated in the disk of the Milky Way, more specifically in the spiral arms. Moreover, local concentrations occur: interstellar clouds. These have a diameter of a few tens of parsec, a temperature between 10 and 200 K, and a density of roughly 10 to 1000 atoms per cm^3. The interstellar clouds are cooler than the low-density gas, because the radiation that heats the matter cannot penetrate deeply. In dense interstellar clouds, molecules can also exist, primarily H2 and, in much lower quantities, complex molecules such as CH3OH and H2CO. Such clouds are called molecular clouds.

From detailed studies of the absorption properties of the interstellar dust, we know it consists of minuscule particles of carbon and silicate with a diameter of the order of 1 µm, encapsulated in a thin layer of ice (H2O). The mass of an interstellar cloud is between 100 and 10^5 M⊙, but the dust particles constitute only a minor fraction (∼ 1%) of this total mass. The remaining 99% consists of gas in the form of neutral H or H2 molecules, and He atoms.

Stars are formed from the matter in molecular clouds. This happens when such a cloud becomes gravitationally unstable and collapses. The clouds are opaque to visual radiation, which is why the formation process of young stars is difficult to observe and not well known. Recently, better infra-red and mm instruments have been used, which has increased our understanding of star formation. Here, we first derive a criterion for cloud collapse. Next, we discuss several stages between the collapse and the birth of a new star. We necessarily have to keep the topic of star formation short in this SSE course and refer to the dedicated master course on this topic for details.

9.2 The Jeans criterion

Consider an infinitely extended, homogeneous cloud at rest. The density, temperature, and gravitational potential are then constant. This is not a consistent equilibrium state, because for constant $\Phi$ the Poisson equation, $\nabla^2\Phi = 4\pi G\rho$, implies that $\rho = 0$. Nonetheless, we consider this state with a non-zero density. Even for a more realistic equilibrium configuration, the result derived below does not change much.

A perturbation is applied to the medium in equilibrium. This perturbation can be caused by, e.g., a supernova explosion in the vicinity, or by the passage of a density wave caused by the spiral arms in the Milky Way. The gas must satisfy the equation of motion

$$\frac{d\vec{v}}{dt} = \frac{\partial\vec{v}}{\partial t} + (\vec{v}\cdot\vec{\nabla})\vec{v} = -\frac{1}{\rho}\vec{\nabla}P - \vec{\nabla}\Phi \tag{9.1}$$

and the continuity equation

$$\frac{\partial\rho}{\partial t} + \vec{v}\cdot\vec{\nabla}\rho + \rho\,\vec{\nabla}\cdot\vec{v} = 0. \tag{9.2}$$

Moreover, the Poisson equation has to be satisfied, and we assume that the equation of state for an isothermal ideal gas is valid:

$$P = a^2\rho, \tag{9.3}$$

with $a$ the isothermal speed of sound, see expressions (2.52) and (2.53). In equilibrium, we have $\rho = \rho_0 =$ constant, $T = T_0 =$ constant and $\vec{v}_0 = \vec{0}$. $\Phi_0$ is determined from the condition $\nabla^2\Phi_0 = 4\pi G\rho_0$.

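We now perturb the equilibrium and determine the effect of the perturbation on the physical quantities. We only consider a small perturbation, and hence neglect non-linear effects. The quantities are written as

$$\rho = \rho_0 + \rho_1, \quad P = P_0 + P_1, \quad \Phi = \Phi_0 + \Phi_1, \quad \vec{v} = \vec{v}_1, \tag{9.4}$$

where the functions with lower index 1 now have a spatial and a temporal dependence. If we substitute (9.4) into the equations that have to be satisfied and keep only the terms linear in the perturbations, the following system of differential equations is obtained:

$$\begin{aligned}
\frac{\partial\vec{v}_1}{\partial t} &= -\vec{\nabla}\left(\Phi_1 + a^2\frac{\rho_1}{\rho_0}\right), \\
\frac{\partial\rho_1}{\partial t} + \rho_0\,\vec{\nabla}\cdot\vec{v}_1 &= 0, \\
\nabla^2\Phi_1 &= 4\pi G\rho_1.
\end{aligned} \tag{9.5}$$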

We have assumed that the cloud remains isothermal during the perturbation. This approximation is valid as long as the cloud is able to radiate the released gravitational energy efficiently. The system of Eqs (9.5) consists of linear, homogeneous partial differential equations with constant coefficients. We can hence find solutions proportional to $\exp[i(kx + \omega t)]$, so that

$$\frac{\partial}{\partial x} = ik, \quad \frac{\partial}{\partial y} = \frac{\partial}{\partial z} = 0, \quad \frac{\partial}{\partial t} = i\omega, \tag{9.6}$$

where $k$ is the wavenumber and $\omega$ the frequency of the solution. With $v_{1x} = v_1$, $v_{1y} = v_{1z} = 0$, we find, based on (9.5), the following equations:

$$\begin{aligned}
\omega v_1 + \frac{k a^2}{\rho_0}\rho_1 + k\Phi_1 &= 0, \\
k\rho_0 v_1 + \omega\rho_1 &= 0, \\
4\pi G\rho_1 + k^2\Phi_1 &= 0.
\end{aligned} \tag{9.7}$$

This homogeneous linear system of three equations for the three unknowns $v_1, \rho_1, \Phi_1$ only has non-zero solutions if the determinant

$$\begin{vmatrix}
\omega & \dfrac{k a^2}{\rho_0} & k \\
k\rho_0 & \omega & 0 \\
0 & 4\pi G & k^2
\end{vmatrix}$$

is equal to zero. For $k \neq 0$, this implies the condition

$$\omega^2 = k^2a^2 - 4\pi G\rho_0. \tag{9.8}$$

For sufficiently large wavenumbers $k$, the right-hand side of Eq. (9.8) is positive, so $\omega$ is real and the perturbation varies periodically in time. Because the amplitude does not increase, this solution represents a stable equilibrium. In the limit of infinitely large $k$, the second term on the right-hand side of Eq. (9.8) can be neglected, so that $\omega^2 = k^2a^2$, which is the dispersion relation for isothermal sound waves. We thus find that, for very short waves (high $k$), the influence of the gravitational force can be neglected. Each compression will, in this case, be restored by an increased pressure, and the perturbations travel through the medium at the speed of sound.

When $k^2 < 4\pi G\rho_0/a^2$, $\omega$ is purely imaginary, of the form $\pm i\xi$ with $\xi$ a real number. Perturbations $\sim \exp(\pm\xi t)$ then occur, which grow or decay exponentially in time, so that the equilibrium is unstable. We now define a characteristic wavenumber $k_{\rm J}$ and a characteristic wavelength $\lambda_{\rm J}$:

$$k_{\rm J}^2 \equiv \frac{4\pi G\rho_0}{a^2}, \qquad \lambda_{\rm J} \equiv \frac{2\pi}{k_{\rm J}}. \tag{9.9}$$

The perturbations with wavenumber $k < k_{\rm J}$ (or wavelength $\lambda > \lambda_{\rm J}$) cause an instability. Instability hence occurs when

$$\lambda > \lambda_{\rm J} \quad\text{with}\quad \lambda_{\rm J} = \left(\frac{\pi}{G\rho_0}\right)^{1/2} a. \tag{9.10}$$

Condition (9.10) is called the Jeans criterion, after J. Jeans, who derived it in 1902.
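The sign change of $\omega^2$ at the Jeans wavenumber can be checked numerically. The sketch below evaluates Eq. (9.8) and Eq. (9.10) in cgs units; the function names and the rounded physical constants are assumptions of this example, and the cloud parameters ($T = 100$ K, $\mu = 1$, $\rho_0 = 10^{-24}$ g cm$^{-3}$) are the typical values quoted for neutral-hydrogen clouds elsewhere in this section.

```python
import math

G = 6.674e-8      # gravitational constant [cm^3 g^-1 s^-2] (rounded)
R_GAS = 8.314e7   # gas constant [erg K^-1 mol^-1] (rounded)
PC = 3.086e18     # one parsec [cm] (rounded)

def sound_speed(T, mu):
    """Isothermal sound speed a, from a^2 = R T / mu."""
    return math.sqrt(R_GAS * T / mu)

def omega_squared(k, a, rho0):
    """Dispersion relation (9.8): omega^2 = k^2 a^2 - 4 pi G rho0."""
    return k**2 * a**2 - 4.0 * math.pi * G * rho0

def jeans_wavelength(a, rho0):
    """Jeans wavelength (9.10): lambda_J = a * sqrt(pi / (G rho0))."""
    return a * math.sqrt(math.pi / (G * rho0))

a = sound_speed(T=100.0, mu=1.0)
rho0 = 1e-24
lam_J = jeans_wavelength(a, rho0)
k_J = 2.0 * math.pi / lam_J

print(f"lambda_J = {lam_J / PC:.0f} pc")
print("omega^2 at 2 k_J  :", omega_squared(2.0 * k_J, a, rho0))   # positive: oscillation
print("omega^2 at k_J / 2:", omega_squared(0.5 * k_J, a, rho0))   # negative: instability
```

With these inputs the Jeans wavelength comes out at roughly 200 pc, so only perturbations on scales larger than a typical cloud diameter would be unstable for such a diffuse medium.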

Physically, the following happens: after a small compression of a set of plane-parallel layers, gravity overcomes the pressure force, and the layers are compressed into narrow zones. The rate at which this compression occurs can be estimated by considering only the influence of gravity in Eq. (9.8). To order of magnitude we have $i\omega \approx \sqrt{G\rho_0}$ and the corresponding time scale $\tau \approx 1/\sqrt{G\rho_0}$. The latter is equivalent to the free-fall time scale defined in Eq. (3.24). The time scale for thermal adaptation, on the other hand, is much shorter if efficient “cooling agents” are present in the cloud. Water and carbon monoxide molecules, in which many different kinds of rotational and vibrational transitions are possible, are indeed able to remove heat and hence quickly cool the cloud during contraction. The thermal time scale in an interstellar cloud is of the order of hundreds of years. To a good approximation, the collapse occurs isothermally as long as these molecules are present.

It can be shown that the Jeans criterion is still valid when considering realistic configurations, e.g., a

spherically symmetric gas cloud. Depending on the assumed geometry, the factors in Eq. (9.9) for λJ will

differ a bit. For a given equilibrium state, there is a critical mass, termed the Jeans mass. Gas clouds with a

mass exceeding the Jeans mass are gravitationally unstable and will collapse due to a perturbation. We can

estimate the Jeans mass as follows:

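$$\begin{aligned}
M_{\rm J} &= \frac{4\pi}{3}\rho_0\lambda_{\rm J}^3 = \frac{4\pi}{3}\rho_0\left(\frac{\pi}{G\rho_0}\right)^{3/2}\left(\frac{\mathcal{R}T}{\mu}\right)^{3/2} = \frac{4\pi}{3}\left(\frac{\mathcal{R}\pi}{G\mu}\right)^{3/2}T^{3/2}\rho_0^{-1/2} \\
&\approx 1\ \text{to}\ 5\times10^{5}\,M_\odot \left(\frac{T}{100\,{\rm K}}\right)^{3/2}\left(\frac{\rho_0}{10^{-24}\,{\rm g\,cm^{-3}}}\right)^{-1/2}\mu^{-3/2},
\end{aligned} \tag{9.11}$$

where we have used that $a^2 = \mathcal{R}T/\mu$, with $\mathcal{R}$ the gas constant. Typical values for interstellar clouds consisting of neutral hydrogen are: $\rho_0 = 10^{-24}$ g cm$^{-3}$, $T = 100$ K, and $\mu = 1$. With these values, the Jeans mass is $M_{\rm J} \approx 1$ to $5\times10^{5}\,M_\odot$. This means that a cloud can collapse according to the Jeans criterion only if its mass is much larger than a stellar mass.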
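The numerical value in Eq. (9.11) can be reproduced with a few lines of code. The sketch below evaluates the first form of Eq. (9.11), a sphere of radius $\lambda_{\rm J}$, in cgs units; the function name `jeans_mass` and the rounded constants are assumptions of this example, and a different geometric convention would change the prefactor.

```python
import math

G = 6.674e-8       # gravitational constant [cm^3 g^-1 s^-2] (rounded)
R_GAS = 8.314e7    # gas constant [erg K^-1 mol^-1] (rounded)
M_SUN = 1.989e33   # solar mass [g] (rounded)

def jeans_mass(T, rho0, mu=1.0):
    """Jeans mass in grams: (4 pi / 3) rho0 lambda_J^3 with a^2 = R T / mu."""
    a_squared = R_GAS * T / mu
    lam_J = math.sqrt(math.pi * a_squared / (G * rho0))
    return 4.0 * math.pi / 3.0 * rho0 * lam_J**3

m_typical = jeans_mass(T=100.0, rho0=1e-24) / M_SUN
print(f"M_J = {m_typical:.2e} M_sun")

# Scaling check: M_J is proportional to T^(3/2) and to rho0^(-1/2).
print(jeans_mass(400.0, 1e-24) / jeans_mass(100.0, 1e-24))   # factor 4^(3/2) = 8
print(jeans_mass(100.0, 1e-22) / jeans_mass(100.0, 1e-24))   # factor 100^(-1/2) = 0.1
```

For the typical values in the text this gives about $5\times10^{5}\,M_\odot$, close to the upper value quoted in Eq. (9.11).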


9.3 Fragmentation

How are stars formed from a gas cloud that gravitationally collapses? It is assumed that a collapsing cloud

with a mass exceeding the Jeans mass will fragment. During the contraction, fragments are created which

themselves become unstable and contract at a faster rate than the initial cloud. If this process indeed occurs,

it implies that smaller sub-masses condense from the cloud.

As noted previously, the contraction occurs isothermally. Hence, the Jeans mass decreases as $\rho^{-1/2}$ during the contraction; in other words, the Jeans mass becomes smaller than the mass the gas cloud had when it started contracting. When the Jeans mass has decreased below half the initial mass, the cloud splits up into two fragments, which both collapse individually. Such fragmentation will continue as long as the collapse occurs isothermally. We derived the Jeans criterion under the assumption of a medium in equilibrium, and hence the theory is not strictly valid for a cloud that is already contracting, but as long as the time scale argument remains valid, the results hold.

What are the end products of the fragmentation process? Finding a detailed answer to this question based on the equations of hydrodynamics and thermodynamics is beyond the scope of this course. As said, we limit ourselves to a reasoning based on time scales: when does the time scale for thermal adaptation become comparable to the free-fall time scale? At that moment, the contraction will no longer occur isothermally, but adiabatically. For a mono-atomic ideal gas, we have $\nabla_{\rm ad} = 2/5$, so that $T \sim P^{2/5}$, and because $P \sim \rho T$ the temperature changes as $T \sim \rho^{2/3}$. The Jeans mass is then proportional to $T^{3/2}\rho^{-1/2} \sim \rho^{1/2}$. Hence, we find that the Jeans mass will increase during an adiabatic collapse. As a result, the existing fragments will stop fragmenting further.
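The scaling argument of this paragraph can be written out step by step. This is only a worked restatement of the relations quoted in the text, assuming a mono-atomic ideal gas ($\nabla_{\rm ad} = 2/5$) and $P \propto \rho T$:

```latex
\begin{align*}
T \propto P^{2/5} \propto (\rho T)^{2/5}
  \;&\Rightarrow\; T^{3/5} \propto \rho^{2/5}
  \;\Rightarrow\; T \propto \rho^{2/3}, \\
M_{\rm J} \propto T^{3/2}\rho^{-1/2}
  \;&\propto\; \left(\rho^{2/3}\right)^{3/2}\rho^{-1/2}
  = \rho^{\,1/2}.
\end{align*}
```

In the isothermal case the same expression gives $M_{\rm J} \propto \rho^{-1/2}$, so the sign of the exponent flips at the transition from isothermal to adiabatic contraction.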

The characteristic time scale for free fall of a fragment is $(G\rho)^{-1/2}$. The total energy that needs to be radiated to keep the temperature constant is of the order of the gravitational potential energy $E_g \approx GM^2/R$, in which $M$ and $R$ are the mass and “radius” of the fragment. With $\rho = 3M/(4\pi R^3)$, an energy $A$ of the order

$$A \equiv \frac{GM^2}{R}(G\rho)^{1/2} = \left(\frac{3}{4\pi}\right)^{1/2}\frac{G^{3/2}M^{5/2}}{R^{5/2}} \tag{9.12}$$

per unit time needs to be radiated to keep the fragmentation isothermal. Let us now assume thermal equilibrium, which is a good approximation at the end of the fragmentation process because the matter becomes opaque. Then, however, the fragment cannot radiate more energy than a black body with the same temperature. The fragment radiates an energy per unit time given by $B = 4\pi f\sigma T^4R^2$, with $\sigma$ the Stefan-Boltzmann constant (see Appendix A) and $f$ a dimensionless parameter with a value between 0 and 1 that takes into account the fact that less energy is radiated than in the case of a black body. The condition for isothermal collapse is $A \ll B$, and the transition to adiabatic contraction will occur at $A \approx B$. The latter condition is met when

$$M^5 = \frac{64\pi^3}{3}\,\frac{\sigma^2f^2T^8R^9}{G^3}. \tag{9.13}$$
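The step from $A \approx B$ to Eq. (9.13) can be spelled out as follows. This worked derivation assumes a homogeneous fragment with $\rho = 3M/(4\pi R^3)$ in Eq. (9.12), so that $A = (3/4\pi)^{1/2}G^{3/2}M^{5/2}R^{-5/2}$:

```latex
\begin{align*}
\left(\frac{3}{4\pi}\right)^{1/2}\frac{G^{3/2}M^{5/2}}{R^{5/2}}
  &= 4\pi f\sigma T^4R^2 \\
M^{5/2}
  &= 4\pi\left(\frac{4\pi}{3}\right)^{1/2}\frac{f\sigma T^4R^{9/2}}{G^{3/2}} \\
M^{5}
  &= 16\pi^2\cdot\frac{4\pi}{3}\,\frac{f^2\sigma^2T^8R^9}{G^3}
   = \frac{64\pi^3}{3}\,\frac{\sigma^2f^2T^8R^9}{G^3}.
\end{align*}
```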

Fragmentation stops when the Jeans mass is equal to the mass given in Eq. (9.13). We replace $M$ in Eq. (9.13) by $M_{\rm J}$ and $R$ by $(3M_{\rm J}/4\pi\rho)^{1/3}$, and eliminate $\rho$ using Eq. (9.11). Hence, we obtain the Jeans mass at the end of the fragmentation process:

$$M_{\rm J,end} = \left(\frac{4^6\pi^{15}}{3^8}\right)^{1/4}\frac{1}{(\sigma G^3)^{1/2}}\left(\frac{\mathcal{R}}{\mu}\right)^{9/4}f^{-1/2}\,T^{1/4} = 0.17\,M_\odot\,\frac{T^{1/4}}{f^{1/2}}, \tag{9.14}$$

where we have set µ = 1 and T is expressed in K. We now take a typical temperature of 1000 K as the temperature of the smallest fragments. We further assume that deviations from the isothermal state occur when f = 0.1, i.e. when the fragment radiates 10% of its maximum possible energy loss. The Jeans mass at the end of the fragmentation process hence is ∼ 3 M⊙. This result does not change dramatically when the temperature and the value of f are varied within reasonable limits. We conclude that fragmentation stops when the fragments have reached a mass of the order of a few tenths to several tens of solar masses, not the order of a planetary mass, nor the mass of a star cluster.
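The numbers in this paragraph can be verified by evaluating Eq. (9.14) directly. The sketch below works in cgs units; the function name `jeans_mass_end` and the rounded constants are assumptions of this example.

```python
import math

G = 6.674e-8       # gravitational constant [cm^3 g^-1 s^-2] (rounded)
R_GAS = 8.314e7    # gas constant [erg K^-1 mol^-1] (rounded)
SIGMA = 5.6704e-5  # Stefan-Boltzmann constant [erg cm^-2 s^-1 K^-4] (rounded)
M_SUN = 1.989e33   # solar mass [g] (rounded)

def jeans_mass_end(T, f, mu=1.0):
    """Jeans mass at the end of fragmentation, Eq. (9.14), in grams.
    T is the temperature in K, f the emission efficiency between 0 and 1."""
    prefactor = (4**6 * math.pi**15 / 3**8) ** 0.25
    return (prefactor / math.sqrt(SIGMA * G**3)
            * (R_GAS / mu) ** 2.25 * f ** -0.5 * T ** 0.25)

coefficient = jeans_mass_end(T=1.0, f=1.0) / M_SUN     # the 0.17 M_sun coefficient
m_end = jeans_mass_end(T=1000.0, f=0.1) / M_SUN        # values adopted in the text
print(f"coefficient = {coefficient:.2f} M_sun, M_J,end = {m_end:.1f} M_sun")
```

Because the result depends only on $T^{1/4}$ and $f^{-1/2}$, changing $T$ by a factor of ten alters the mass by less than a factor of two, which is why the estimate is robust.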

9.4 The formation of a protostar

The Jeans criterion we have derived is based on a first-order perturbation method, and gives the condition

under which a perturbation of the equilibrium state will grow exponentially. This theory, however, does not

provide insight in the end product of the collapse. We now describe the different stages between the collapse

and the birth of the star.

When the fragmentation process stops, the different fragments continue to contract. The gravitational force is still dominating, and the pressure gradient can be neglected at first. We can approximate the collapse as a free fall of a homogeneous sphere. The time scale of this free fall is comparable to the time scale one finds when the pressure force suddenly disappears from the equation of motion, and is about 10^5 to 10^7 years. This time scale is no longer accurate near the centre of the fragment, because the pressure force becomes important there, which stops the collapse.

We now follow the process of collapse for a homogeneous cloud with a mass of 1 M⊙ after the fragmentation process has ended. To a good approximation, the instability keeps the outer layers of the sphere at a quasi-constant radius while the inner matter undergoes a free fall. Hence the density increases very rapidly in the inner parts, while the density in the outer parts of the fragment barely varies. Once a small central concentration appears, it will inevitably continue to grow and an irreversible process has started. The free-fall time scale for a sphere with radius $r$ is of the order of $[G\bar{\rho}(r)]^{-1/2}$, in which $\bar{\rho}$ indicates the average density in the sphere with radius $r$. When $\bar{\rho}$ increases towards the centre, the free-fall time scale decreases in that direction. Hence, the inner spheres will collapse faster than the outer spheres, and the density difference will become even more pronounced. Eventually, the fragment will evolve from a density distribution $\rho =$ constant to $\rho \sim r^{-2}$.

The collapse of the central part occurs in free fall as long as the matter can lose the released gravitational energy. A part of this energy is radiated in the infra-red. Another part is captured in the form of differential rotation. Matter with a small angular momentum $r^2\Omega(r)$ (per unit mass), with $\Omega(r)$ the rotation frequency at position $r$ from the centre, will undergo a dynamical collapse on a free-fall time scale. Matter on the outer edge of the fragment, on the other hand, has a much larger angular momentum and will not just fall towards the centre, but will spiral in around the star-to-be (see Figure 9.1).

[Figure 9.1 panels, labelled Class 0, Class I, Class II and Class III. Left: spectral energy distributions against log ν (Hz) from 11 to 15, with vertical ticks from −9 to −7. Right: schematic with the labels core, protostar, active disk, passive disk, star and remnant disk, and scale bars of 5000 AU, 5 AU, 50 AU and 500 AU.]

Figure 9.1: The different stages of the star formation process of a single fragment in a schematic picture (right) and the accompanying theoretically predicted spectral energy distributions (left). For details: see text. (Figure courtesy of Dr. Bram Acke, KU Leuven)

A further increase of the density will cause an adiabatic increase of the temperature. Hence the pressure will rise until free fall is stopped. A central core in hydrostatic equilibrium forms, surrounded by a (still) collapsing envelope. At this moment, the mass of the core is approximately (1/200) M⊙ and its radius is 1000 R⊙. Typical values for the central density and temperature are $\rho_c = 2\times10^{-10}$ g cm$^{-3}$ and $T_c = 170$ K. The free-fall velocity at the edge of the core is approximately 75 km s$^{-1}$. When the mass of the core continues to increase, while its radius decreases, this velocity will exceed the local sound speed. Hence, a shock wave will be generated that separates the hydrostatic “interior” from the supersonic “rain” on the core. In the shock front, the in-falling matter comes to a stop and transfers its kinetic energy to internal energy of the core. In this way, the accreting core is heated.

In the core, the gas consists primarily of hydrogen in molecular form. However, when the temperature

rises to ∼2000 K, the H2 molecules will dissociate. A mixture of atomic and molecular hydrogen is obtained.

This mixture has a very high opacity and the cooling mechanism becomes much less efficient. At the start

of the dissociation process, the larger part of the energy that is injected into the core via the shock wave, will

be used to dissociate all molecular hydrogen. The shock wave rapidly extinguishes, before it can reach the

outer layers of the fragment. During the stage of strong dissociation, the hydrostatic equilibrium in the core

is broken, and the latter contracts again. This happens when the mass in the core has roughly doubled, and

its radius halved. This second collapse lasts as long as the gas is partially dissociated.

When all hydrogen has been converted to its atomic form, a dynamically stable sub-core has formed in the star-to-be. This sub-core has a mass of approximately $1.5\times10^{-3}\,M_\odot$ and a radius of $1.3\,R_\odot$. The central density has increased to about $10^{-2}$ g cm$^{-3}$, and the central temperature is about $2\times10^{4}$ K. Again, a shock front is formed, at the edge of the sub-core. This front is much more energetic than the first one, and now does reach the surface of the fragment: the early protostar shows its first luminosity. A schematic representation of the two shock fronts is given in Figure 9.2.

The evolution of the core of the fragment with mass 1M⊙, starting from the original Jeans instability,

is schematically shown in Figure 9.3. The evolution starts on the left with an isothermal collapse. When the

matter becomes opaque, the temperature increases adiabatically. The temperature increase is topped off by

the dissociation of H2. The central compression occurs adiabatically as long as the accretion time scale of

the core (or the sub-core if it exists already) remains short compared to the Helmholtz-Kelvin time scale.

The more molecular hydrogen is depleted from the envelope, the longer the accretion time scale becomes.

At a certain moment, the latter time scale will exceed the Helmholtz-Kelvin time scale and the accretion

will gradually cease: a protostar is born, and its mass will not increase anymore. We now make a sidestep

before answering the question what happens to the protostar before it becomes a newborn star.


Figure 9.2: The collapse of a gas cloud with a mass of 1M⊙. (a) After about 0.4 million years, the cloud

has a dense, opaque core. The collapse stops at the edge of that core, and develops a shock front between

the core, in hydrostatic equilibrium, and the envelope, which is still in free fall. (b) When the core becomes

dynamically unstable because of H2 dissociation, a second collapse of the core occurs. Consequently a

second shock front develops, but now at much smaller r. (c) The velocity modulus |v| (in cm s−1) and

density ρ (in g cm−3) as a function of r (in cm). The regions of the shock waves are characterised by large

variations in velocity. (From Kippenhahn et al. 2012)

9.5 Hayashi tracks in the HR diagram

Let us consider the limiting case of a fully convective star, i.e., a star where the convective zone stretches all

the way from the stellar core to the stellar photosphere, so that only the stellar atmosphere remains radiative.

The Hayashi track is the location in the HR diagram where fully convective stars with a certain mass and

chemical composition reside. There is a separate Hayashi track for each mass and chemical composition

as these affect the temperature and chemical gradients occurring in the Ledoux criterion for convective

instability. The Hayashi tracks are situated on the right side of the HR diagram, at effective temperatures


Figure 9.3: The central evolution of a cloud with mass 1M⊙, starting from the isothermal collapse to the

ignition of hydrogen. The central temperature Tc (in Kelvin) is shown as a function of the central density

ρc (in g cm^−3). The dotted line is an extrapolation which indicates that the stage of thermal adaptation, that

follows after the adiabatic compression, results in the ignition of hydrogen in the core. (From Kippenhahn

et al. 2012)

between 3000 K and 5000 K. A good approximation for the Hayashi tracks in the HR diagram is:

$$\log T_{\rm eff} \simeq 0.05 \log L + 0.2 \log M + \mathrm{constant}. \tag{9.15}$$

The slope of the steep tracks is ∂ log L/∂ log Teff ≃ 20. The positive coefficient of log M in Eq. (9.15) implies that the Hayashi tracks shift to the

left in the HR diagram with increasing stellar mass.
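The two statements above follow directly from the coefficients in Eq. (9.15). The following minimal Python sketch makes the arithmetic explicit; the function names are illustrative, and because the additive constant in Eq. (9.15) is not specified, only relative changes are computed.

```python
import math

# Coefficients of the approximate Hayashi-track relation (9.15):
#   log Teff ≃ A_L * log L + A_M * log M + constant
A_L = 0.05
A_M = 0.2


def hayashi_slope():
    """Slope d(log L)/d(log Teff) along a track of fixed mass."""
    return 1.0 / A_L


def teff_ratio_for_mass_ratio(mass_ratio):
    """Factor by which Teff changes, at fixed L, when M changes by mass_ratio."""
    return mass_ratio ** A_M


print("slope d log L / d log Teff:", hayashi_slope())
print("Teff factor when the mass doubles:", teff_ratio_for_mass_ratio(2.0))
```

A factor larger than one for a doubled mass means a hotter track, i.e., a shift to the left in the HR diagram.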

The exact determination of the Hayashi tracks does not only depend on the stellar mass and chemical

composition of the star, but also on the details of the convection theory used. In Figure 9.4, Hayashi tracks

are shown for stars with masses ranging from 0.25 to 4M⊙. The Hayashi tracks are located far away from

the main sequence for high stellar masses, and approach the main sequence at masses below ∼ 0.5M⊙.

Stars with such low masses are fully convective main-sequence stars since their Hayashi track crosses the

main sequence.

The Hayashi track indicates the border between a permitted and a forbidden region in the HR diagram.

Positions to the right of the Hayashi track cannot occur for a star in hydrostatic and convective equilibrium.

The latter means that variations of quantities connected to the convective cells occur so slowly that the convection

has had enough time to adapt to the new situation. Because hydrostatic and convective equilibrium

are quickly restored, stars can only occur to the right of the Hayashi track for a very short period.

Figure 9.4: The position of Hayashi tracks for stars with masses between 0.25 and 4M⊙, for a chemical

composition X = 0.70, Y = 0.28, Z = 0.02. The ZAMS is indicated as a reference. (Figure courtesy of

Prof. Onno Pols, Lecture Notes on Stellar Structure and Evolution, Radboud University Nijmegen, NL)

During some stages of stellar evolution, stars closely approach, or even coincide with their Hayashi

track. The position of the Hayashi tracks hence influences stellar evolution.

9.6 Evolution of the protostar towards the zero-age main sequence

After the dynamical collapse stage described in Sect. 9.4, the protostar reaches quasi-hydrostatic equilib-

rium. As long as the protostar has a central temperature below that needed to ignite hydrogen burning, it

can only use contraction to generate the energy needed to counteract gravity. The star is not yet in thermal

equilibrium. It contracts slowly on a Helmholtz-Kelvin time scale while still accreting the last remaining

matter from the surroundings. During this evolution stage, τHK ∼ M^−2.5 is a good approximation for the

contraction time scale. The virial theorem states that a part of the gravitational contraction energy is con-

verted into internal energy, and the other part is responsible for the protostar’s luminosity. The opacity of


Figure 9.5: Henyey tracks of evolving models representing contracting protostars between their Hayashi

track and the ZAMS. The contraction time scales are indicated for the various masses. (Source: Wikipedia)

the matter in the protostar remains very high as long as the matter is not ionised, and nuclear burning does

not occur yet, hence convection is the only way to transport energy. The protostar resides on the Hayashi

track. During the contraction, the protostar keeps an almost constant effective temperature, but its radius

decreases, hence so does its luminosity (proportional to R^2 at fixed effective temperature). Thus, it moves downward in the HR diagram.

This is illustrated in Figure 9.5 by the red arrows.
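The descent along the Hayashi track can be illustrated with the Stefan-Boltzmann law, L = 4πR²σTeff⁴: at fixed effective temperature the luminosity scales as R². The sketch below is only an illustration; the effective temperature of 4000 K and the sequence of radii are assumed values, not taken from the models in Figure 9.5.

```python
import math

SIGMA_SB = 5.670374419e-8  # Stefan-Boltzmann constant [W m^-2 K^-4]
R_SUN = 6.957e8            # nominal solar radius [m]
L_SUN = 3.828e26           # nominal solar luminosity [W]


def luminosity_lsun(radius_rsun, teff_kelvin):
    """Luminosity in solar units from L = 4 pi R^2 sigma Teff^4."""
    radius_m = radius_rsun * R_SUN
    return 4.0 * math.pi * radius_m ** 2 * SIGMA_SB * teff_kelvin ** 4 / L_SUN


teff = 4000.0  # assumed, roughly constant along the Hayashi track
for radius in (4.0, 2.0, 1.0):  # assumed contraction sequence [Rsun]
    print(f"R = {radius:3.1f} Rsun -> L = {luminosity_lsun(radius, teff):6.3f} Lsun")
```

Each halving of the radius lowers the luminosity by a factor of four, which is the downward motion in the HR diagram described above.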

In higher mass protostars, the central temperature increases faster than the central density. In lower

mass stars, it is the opposite: the central density increases faster than the central temperature. Thus for

low-mass protostars, the radiation of half the contraction energy has a harder time to take place and the

star remains convective and so closer to the Hayashi track. Because the internal temperature in the core of

the protostar keeps increasing steadily, the gas eventually becomes ionised. The opacity in the core drops,

and the convective zone disappears from the core. This implies that the protostar leaves its Hayashi track,

because it is not fully convective anymore. It will continue its evolution with a radiative contraction of the

core, and hence moves on the so-called Henyey tracks to the left in the HR diagram. Because the contraction

in the protostar’s core occurs in an increasingly transparent environment as the central

temperature rises, the protostar’s luminosity will stop decreasing and start increasing. The star has now become

a pre-main-sequence star, or pre-MS star in brief, as indicated by the orange–yellow–white–blue tracks in


Fig. 9.5.

While the pre-MS star continues its way along the Henyey track to the left in the HR diagram, its central

temperature reaches the value needed to initiate deuterium burning (approximately 10^6 K). We recall that

deuterium is immediately burned into ^3He with the emission of a photon. The less massive the protostar, the closer it

will be to its Hayashi track when the first nuclear reactions based on primordial deuterium occur. The full pp chains cannot be completed yet, however, because the pp reaction requires a higher temperature and the full

pp chains only occur in equilibrium for temperatures above 5 × 10^6 K. Hence the ^3He isotopes cannot reach

an equilibrium concentration needed to have the full hydrogen burning in action. As a result, the temperature

sensitivity of the nuclear reactions at this stage is higher (typically by a factor 3) than if the pp chains

occurred in equilibrium. This implies that the pre-MS star develops a convective core. In pre-MS stars with

a mass below ∼ 1.1M⊙, this convective core will disappear when the pp chains, with all corresponding

chemical reactions, occur in equilibrium. More massive stars will rapidly switch from deuterium burning

to hydrogen burning through the CNO cycle (cf. Figure 6.5). This kind of burning, however, is much more

sensitive to temperature than the pp chains. Hence, these stars will maintain their convective core during the

entire stage of hydrogen core burning.

The accretion continues during almost the entire pre-MS stage, on a Helmholtz-Kelvin time scale.

Protostars with a mass above ∼ 9M⊙ evolve so quickly from their Hayashi track to the main sequence

(time scale far below a million years, see Figure 9.5), that they are not visible during their pre-MS stage,

also because they remain embedded in a thick circumstellar envelope of in-falling matter. These massive

stars hence only light up in the HR diagram when they have already reached the ZAMS. At that point, the

in-fall of matter stops due to the strong outward radiation. The pre-MS stage of high-mass stars is thus poorly

known. Pre-MS stars with masses between ∼1.6 and 9M⊙ end their accretion stage before they reach the

main sequence. Such objects are called Herbig Ae/Be stars. Pre-MS stars with masses below ∼1.6 M⊙ are

called T Tauri stars.

From an observational point of view, the HR diagrams of young star clusters (e.g.

the Pleiades, h and χ Persei) indeed show massive stars on the main sequence, while stars of low mass are still

in their contraction phase, occupying the region to the right of the main sequence. Many of these stars

are indeed T Tauri stars. Observations of Herbig Ae/Be stars and T Tauri stars show that both groups of

stars experience active surface phenomena and differential rotation. The combination of this rotation and

the convection in the outer stellar layers may induce a chaotic magnetic field. The latter transports the

available angular momentum to the surface, where part of it can be lost through a stellar wind. This

wind escapes via the stellar polar axis, because of the presence of the accretion disk in the plane of the

equator. Hence, a bipolar outflow forms and this stops the accretion process before the star has reached the

main sequence (Figure 9.1). During their pre-MS stage, the dust disk around T Tauri and Herbig Ae/Be stars

disappears. The details of how this happens are not yet clear but this phase coincides with the formation of a

planetary system. Whether planet formation in such disks is a common, or rather exceptional phenomenon

is under intense study nowadays. For more details, we refer to the Master course Star and Planetary System

Formation.

Figure 9.6: Pre-MS HR diagram with evolutionary tracks for solar-type metallicity (black) and isochrones

for ages of 0.1, 1, 10, and 100 Myr (grey) for birth masses covering 0.1 to 6 M⊙. (Source: Wikimedia

Commons, based on Stahler & Palla, 2004, “The Formation of Stars”, Wiley-VCH Verlag GmbH & Co.

KGaA)

Once the hydrogen burning can occur in equilibrium, and fully dominates the energy production, the

star reaches a state of thermal equilibrium, and the contraction stops. This point in time is called the zero-age

main sequence, abbreviated as ZAMS, as already used a few times in figure captions above. The star is

now “born”. When the energy from contraction drops below a percent of the total energy budget, the internal

structure of the star requires a reorganisation. The two energy sources, gravitational contraction energy and

nuclear energy, switch in importance. Both have a very different influence on the stellar structure. The

gravitational energy production scales as εg ∼ T, while the hydrogen burning processes are much more concentrated

towards the stellar centre with temperature dependencies εpp ∼ T^5 and εCNO ∼ T^18. The nuclear energy

rapidly becomes the dominant energy source, and the evolution of the star is from this moment on fully

governed by hydrogen burning. In Figure 9.6, we show pre-MS evolutionary tracks until the ZAMS, for

different stellar masses, computed with the MESA code.
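To get a feeling for how different the three temperature dependencies quoted above are, the following short Python sketch evaluates the relative increase of each energy generation rate for a 10% rise in temperature. The power-law exponents are those quoted in the text; the 10% increase is an arbitrary illustrative choice.

```python
# Power-law exponents nu in eps ~ T^nu, as quoted in the text.
EXPONENTS = {"contraction": 1, "pp chains": 5, "CNO cycle": 18}


def boost(temperature_ratio, exponent):
    """Factor by which eps ~ T^exponent grows when T grows by temperature_ratio."""
    return temperature_ratio ** exponent


for name, nu in EXPONENTS.items():
    print(f"{name:12s}: eps increases by a factor {boost(1.10, nu):.2f}")
```

The steep response of the nuclear rates is why nuclear burning concentrates towards the hottest, central layers, whereas the contraction energy is released throughout a much larger part of the star.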

Finally, we note that a contracting sphere with a mass below a certain threshold mass will never reach a

central temperature high enough to start hydrogen burning. Protostars with a mass below 0.08 M⊙ will never

be able to achieve hydrogen burning in full equilibrium, and hence never reach the ZAMS. These “failed”


stars are fully convective during their contraction stage. Contraction is responsible for the production of

the luminosity as long as no nuclear reactions take place. The central density keeps increasing, which

in a protostar with a mass below ≃ 0.08M⊙ leads to electron degeneracy before the pp reaction ignites.

This electron degeneracy prevents a further increase of the temperature, and the latter will never become

sufficiently high to perform hydrogen burning in full equilibrium. Such objects are called brown dwarfs. A

star-to-be is doomed to become a brown dwarf when its mass is not high enough to ignite the full hydrogen

burning cycle in the core before degeneracy sets in. The limiting mass depends somewhat on the initial

chemical composition and on the degree of ionisation, as this affects the mean molecular weight and hence

the central gas pressure. Taking this into account in numerical models for Big Bang nucleosynthesis and for

more recent metal mass fractions, one obtains a minimum mass for a main-sequence star in the range 0.06 –

0.09 M⊙.

Brown dwarfs with mass above about 0.065M⊙ can burn both deuterium and ^7Li during their pre-MS

contraction but those with lower mass cannot. Hence, these “high mass” brown dwarfs produce ^3He and

photons but not ^4He as their central temperature remains too low to overcome the Coulomb barrier of the ^3He – ^3He reaction. Their thermonuclear energy thus mostly comes from the burning of their primordial

deuterium. Once that is finished, the brown dwarf can only cool. About 10^7 years after the initiation of the

deuterium burning, the luminosity of brown dwarfs drops with age as L ∼ τ^−1.2. A rough

luminosity-mass-age relation holds:

$$\frac{L}{L_\odot} \simeq \left(\frac{M}{M_\odot}\right)^{2.6} \left(\frac{\tau}{10^{7}\,\mathrm{yr}}\right)^{-1.2}. \tag{9.16}$$

Lower-mass brown dwarfs do not fuse their primordial Li and hence spectroscopy focusing on the absence or

presence of Li spectral lines offers an observational method to distinguish a low-mass star from a low-mass

brown dwarf.
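As a quick illustration of the rough luminosity-mass-age relation (9.16), the sketch below evaluates it for a sample object. The function name and the example values (0.05 M⊙ at an age of 1 Gyr) are illustrative assumptions; the relation itself is only meant for ages beyond about 10^7 yr.

```python
def brown_dwarf_luminosity(mass_msun, age_yr):
    """Rough L/Lsun from relation (9.16); intended for ages of about 1e7 yr or more."""
    return mass_msun ** 2.6 * (age_yr / 1.0e7) ** (-1.2)


# Illustrative example: a 0.05 Msun brown dwarf at an age of 1 Gyr.
example = brown_dwarf_luminosity(0.05, 1.0e9)
print(f"L = {example:.2e} Lsun")

# Fading between 100 Myr and 1 Gyr for the same mass.
fading = brown_dwarf_luminosity(0.05, 1.0e9) / brown_dwarf_luminosity(0.05, 1.0e8)
print(f"fading factor per decade in age: {fading:.3f}")
```

Each factor of ten in age reduces the luminosity by a factor 10^1.2, roughly 16, which illustrates why old brown dwarfs are so hard to detect.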

At the other (planetary) mass end, one distinguishes brown dwarfs and gaseous planets by considering

a limiting mass between the two of about 0.013M⊙, because this is the amount of gas necessary to reach

high enough central temperature and density to fuse deuterium (Tc ≃ 10^6 K). However, this value for the

mass limit is not meaningful for rocky planets, where it is rather placed at ≈ 0.025M⊙. Low-mass brown

dwarfs and high-mass planets thus overlap in their mass ranges.
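The mass limits quoted in this section can be summarised in a small classification helper. This is a deliberately simplified sketch: it uses sharp boundaries at 0.013 M⊙ and 0.08 M⊙ taken from the text, whereas the text stresses that the limits depend on composition and that the mass ranges overlap. The conversion factor to Jupiter masses (about 1047 Jupiter masses per solar mass) is added for convenience and is not part of the original notes.

```python
MSUN_IN_MJUP = 1047.35    # one solar mass in Jupiter masses (approximate)
DEUTERIUM_LIMIT = 0.013   # [Msun] approximate deuterium-burning limit from the text
HYDROGEN_LIMIT = 0.08     # [Msun] approximate hydrogen-burning limit from the text


def classify(mass_msun):
    """Crude label based only on mass; real boundaries are fuzzy."""
    if mass_msun < DEUTERIUM_LIMIT:
        return "gaseous planet"
    if mass_msun < HYDROGEN_LIMIT:
        return "brown dwarf"
    return "star"


for mass in (0.001, 0.05, 1.0):
    print(f"{mass:6.3f} Msun = {mass * MSUN_IN_MJUP:7.1f} MJup -> {classify(mass)}")
```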

Brown dwarfs were for some time held responsible for the so-called “missing mass” in the Universe.

Because of their low luminosity, it is difficult to observe them. Hence, it was thought that a significant

fraction of the mass in the Universe might be “hidden” below the detection limit. An accurate estimation of

the mass in the Universe is of great importance for the set-up of cosmological models (see Master course

Galaxies and Cosmology). In this framework, the search for brown dwarfs remains quite topical.


Chapter 10

The main sequence or core-hydrogen burning phase

10.1 Zero-age main sequence models

We now consider a series of stellar models in mechanical and thermal equilibrium with the same chemical

composition, but with different masses. The stars have arrived on the ZAMS, as explained in the previous

chapter, and experience hydrogen burning in their core in full equilibrium. This nuclear burning is their

source of energy for a very long time. As of now, the stars evolve on a time scale τn, which is much longer

than the time span τHK covered by the star during its formation history.
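The contrast between the two time scales can be made concrete with order-of-magnitude estimates for the Sun. The sketch below uses the standard estimates τHK ≈ GM²/(RL) and τn ≈ ε f M c²/L. The values f = 0.1 (fraction of the stellar mass that takes part in hydrogen burning) and ε = 0.007 (fraction of rest mass released by hydrogen fusion) are common textbook assumptions adopted here for illustration.

```python
G = 6.674e-11      # gravitational constant [m^3 kg^-1 s^-2]
C = 2.998e8        # speed of light [m s^-1]
M_SUN = 1.989e30   # solar mass [kg]
R_SUN = 6.957e8    # solar radius [m]
L_SUN = 3.828e26   # solar luminosity [W]
YEAR = 3.156e7     # one year [s]


def helmholtz_kelvin_time_yr(mass, radius, luminosity):
    """Order-of-magnitude contraction time scale G M^2 / (R L), in years."""
    return G * mass ** 2 / (radius * luminosity) / YEAR


def nuclear_time_yr(mass, luminosity, burned_fraction=0.1, efficiency=0.007):
    """Order-of-magnitude hydrogen-burning time scale, in years (assumed f and eps)."""
    return efficiency * burned_fraction * mass * C ** 2 / luminosity / YEAR


t_hk = helmholtz_kelvin_time_yr(M_SUN, R_SUN, L_SUN)
t_n = nuclear_time_yr(M_SUN, L_SUN)
print(f"tau_HK ~ {t_hk:.1e} yr, tau_n ~ {t_n:.1e} yr, ratio ~ {t_n / t_hk:.0f}")
```

For solar values the nuclear time scale exceeds the Helmholtz-Kelvin time scale by a few hundred, in line with the statement above.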

The consumption of hydrogen in the core occurs at such a low rate that the star spends almost its entire

life on the main sequence (about 90%). Most stars we observe are therefore main-sequence stars. The age of

the star is usually expressed starting from the zero-age main sequence (i.e., t = 0 at the ZAMS). We repeat

that the latter is defined as the moment when the central hydrogen burning occurs in full equilibrium

and becomes by far the most important energy source (i.e., the contribution of the contraction energy drops

below a percent).

Equilibrium models of main-sequence stars in the stage of central hydrogen burning can be determined

based on the scheme described in Chapter 8. In Figure 10.1, the position of the stellar models at the time

of the ZAMS is shown in an HR diagram for a range in masses between 0.1M⊙ and 100M⊙, for an initial

hydrogen composition given by the mass fraction X = 0.70, and for two values of the metallicity. The lumi-

nosity and effective temperature increase with increasing mass. Connecting these initial values for different

stellar masses delivers the entire ZAMS. As can be seen in Figure 10.1, there is a clear tight connection

between the values of the mass and luminosity of the stellar models. Hence, given the effective temperature

dependence of L, also the mass and radius must be tightly connected.

Stellar models at the ZAMS indeed comply with so-called homology relations. We elaborate on this


Figure 10.1: The zero-age main sequence (ZAMS) in the Hertzsprung-Russell diagram for stellar models with X = 0.70 and two values of the metallicity Z. The positions of the models for different masses between 0.1

and 100M⊙ are indicated with plus signs, revealing that metal-poorer stars are bluer than metal-rich ones.

(Figure courtesy of Prof. Onno Pols, Lecture Notes on Stellar Structure and Evolution, Radboud University

Nijmegen, NL)

in a bit more detail in the next section, from a pragmatic user approach rather than performing detailed

theoretical/analytical computations (the latter can be found in Kippenhahn et al. 2012, Chapters 20 to 22 but

are skipped here). Before going into these details, we compare the ZAMS mass-luminosity (ML) and mass-

radius (MR) relations obtained from the theory with data that does not rely on models of stellar interiors.

This can be done for a large coverage of the mass range from unevolved detached binary stars (we omit

details here but refer to Chapter 14). The outcome is shown in Figure 10.2 and reveals quite good agreement

between theory and observations, particularly if one keeps in mind that the models are for the ZAMS only,

while the observations represent stars covering the entire main sequence. We discuss this figure further

in the next subsection but note here that this good agreement between the measurements and full lines is

remarkable, as it occurs over an extraordinarily extensive range of mass and luminosity: a factor ∼ 200 in

mass and a factor 10^8 in luminosity!


Figure 10.2: The full line indicates the ZAMS mass-luminosity (left) and mass-radius (right) relation de-

rived from the computed stellar models with Z = 0.02 already shown in Figure 10.1. The dashed lines

show the approximations for the ML and MR relations discussed in the text. The coloured symbols are

observations based on dynamical masses and radii, which were determined in a model-independent way.

The uncertainties of these measurements are smaller than the symbol sizes. Blue stars: visual binaries; red

plusses: spectroscopic binaries. (Figure courtesy of Prof. Onno Pols, Lecture Notes on Stellar Structure and

Evolution, Radboud University Nijmegen, NL)

10.2 The mass-luminosity and mass-radius relations

We can derive and study mass-luminosity and mass-radius relationships by using the outcome of model

computations based on the methods in Chapter 8 and by subsequently fitting the outcome with mathematical

functions. Given the tight relationship between the mass, effective temperature, and luminosity revealed in

Figures 10.1 and 10.2, the fact that L ∼ R^2 Teff^4 and that the mean molecular weight µ is determined by the

initial chemistry (X,Y,Z), we consider approximations of the form:

$$L \sim M^{\eta_1} \mu^{\eta_2}; \qquad R \sim M^{\xi_1} \mu^{\xi_2}. \tag{10.1}$$

Looking at the outcome of the numerical integrations to obtain the ZAMS models in Figure 10.2, we can

see that the slope in the ML-relation, i.e., the values of η1, η2 (and hence of ξ1 and ξ2), will depend on the

chosen mass interval. When restricting the EOS to an ideal gas, it can be shown analytically that η1 = 3


and η2 = 4 (Kippenhahn et al. 2012). Given that M can change by 3 orders of magnitude, while µ hardly

changes, it is meaningful to fix η2 = 4 and only estimate η1. Doing this for the entire stellar mass range, the

best approximation to the ZAMS models is found for η1 ≈ 3.3 – this is the dashed line indicated in the left

panel of Figure 10.2. It can also be seen from that figure that the highest value of the slope η1 is attained

by restricting the fit to the mass range [1, 10]M⊙, giving η1 = 3.9. The appreciable decrease of η1 for the

highest masses is due to the increase of importance of the radiation pressure in the EOS and of mass loss (cf.

the so-called Eddington luminosity, which will be discussed in Chapter 13). For ZAMS stars with a mass

below half the solar value, η1 ≈ 5.
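Estimating η1 amounts to fitting a straight line to log L versus log M over the chosen mass interval. The following minimal Python sketch shows such a least-squares fit. The data are synthetic, generated from an exact power law with exponent 3.3, and merely stand in for the ZAMS model output; they are not the actual model values of Figure 10.2.

```python
import math


def fit_power_law_exponent(masses, luminosities):
    """Least-squares slope of log10(L) versus log10(M)."""
    xs = [math.log10(m) for m in masses]
    ys = [math.log10(lum) for lum in luminosities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic stand-in for model output (solar units): exact power law L = M^3.3.
masses = [0.5, 1.0, 2.0, 5.0, 10.0]
luminosities = [m ** 3.3 for m in masses]

eta1 = fit_power_law_exponent(masses, luminosities)
print(f"fitted eta_1 = {eta1:.3f}")
```

Restricting the `masses` list to a sub-interval is the analogue of fitting only part of the mass range, which is how the interval-dependent values of η1 quoted above arise.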

As for the mass-radius relation for ZAMS stars, we find roughly ξ1 ≈ 0.79 for models with mass

lower than the solar value, while the more massive stars have ξ1 ≈ 0.57. This steeper relationship for low-

mass stars is readily seen in the model-independent data for the dynamical masses and radii of the highest

accuracy from detached eclipsing binaries assembled by Serenelli et al. (2021, A&A Rev., Vol. 29, id.4,

141pp.) shown in Fig. 10.3. This figure also illustrates that the validity of the mass-luminosity, mass-radius,

and mass-temperature relationships is limited to main-sequence stars. For the more global mass regime,

the dashed line in the right panel of Figure 10.2 is drawn for ξ1 = 0.81. The full line in the right panel

of Figure 10.2 is the outcome of the theory and reveals a clear “kink” in the mass-radius relation around

M = 1M⊙, also in line with the more extended sample of benchmark stars in Figure 10.3. The reason is

that stars with an effective temperature lower than that of the Sun, have a much more extended convective

envelope (see also Figure 10.4 discussed further in the text), which induces an increase of the radius with

respect to a star of similar mass that would hypothetically have a radiative outer layer. This leads to a

steeper increase of the mass-radius relation as the mass decreases. This trend cannot be continued into the

high-mass regime, because such stars have radiative outer zones.

Overall, the agreement between the data and the dashed/full lines in Figure 10.2 is good, given that

these curves represent relationships based on just one exponent. In general the radius, and hence

luminosity of a star depends as well on its metallicity (cf. Figure 10.1) and so the mass-radius and mass-

luminosity relations must also be metal-dependent. Rather than taking this into account via the expression of

the molecular weight, which does not change much in numerical value for various stars, we can also consider

a fit including Z , as this is a quantity that can be measured from stellar spectroscopy. The metallicity

dependence is rather modest but well observable. A good mathematical fit valid for a fixed mass is

$$R \sim Z^{1/6} \quad \text{and} \quad L \sim Z^{-7/6}. \tag{10.2}$$

Hence, since Teff^4 ∼ L/R^2 we find Teff ∼ Z^(−9/24). As a result, metal-poor stars are bluer and more luminous

than their metal-rich counterparts of the same mass, as was already revealed in Figure 10.1, except for the

highest masses where a strong radiation-driven wind comes into play (Chapter 13).
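The exponent of the metallicity dependence of the effective temperature follows from combining the two fits in Eq. (10.2) with Teff^4 ∼ L/R^2. The short sketch below does this bookkeeping with exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

# Exponents of Z in the fits of Eq. (10.2), valid at fixed mass.
R_EXP = Fraction(1, 6)    # R ~ Z^(1/6)
L_EXP = Fraction(-7, 6)   # L ~ Z^(-7/6)

# Teff^4 ~ L / R^2  =>  exponent of Z in Teff is (L_EXP - 2 * R_EXP) / 4.
TEFF_EXP = (L_EXP - 2 * R_EXP) / 4

print("Teff ~ Z^(%s)" % TEFF_EXP)
```

The result, −9/24 = −3/8, is negative, so a lower metallicity gives a higher effective temperature at fixed mass.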

10.3 Chemical evolution on the main sequence

During the main-sequence stage, the energy loss at the stellar surface is compensated by the energy pro-

duction due to hydrogen burning. The chemical evolution of the star is mostly concentrated in the direct

environment of the stellar core, because the energy production is strongly dependent on temperature. The


Figure 10.3: Mass-radius and mass-temperature relations of “benchmark stars” assembled in the review

paper of Serenelli et al. (2021, A&A Rev., Vol. 29, id.4, 141pp.). These are detached eclipsing binaries for

which the masses and radii were determined with an accuracy better than 2% for the high-mass stars and

gradually down to 1% for the low-mass stars, while their metallicity is also available for both components

from spectroscopy, either via spectral disentangling or from double-lined composite spectra. For the binaries

containing at least one low-mass star with M ≤ 0.7M⊙, a relative error of 3% in mass was allowed. The

insets zoom in on the stars with M ≤ 1.0M⊙. Cyan triangles are pre-MS stars while red squares represent

evolved stars. (Figure courtesy of the author, produced for inclusion in Serenelli et al., 2021, A&A Rev., Vol.

29, id.4, 141pp.)

central part of the star, where hydrogen fusion occurs, contains between 10 and 30% of the total stellar


Figure 10.4: The values of the mass coordinate m/M from the centre to the surface of the star are drawn as a

function of the total stellar mass for the ZAMS models shown in Figure 10.1. Areas indicated with “clouds”

are zones in the star where the energy transport is done via convection. The two full lines indicate values of

m/M where r is equal to one fourth and half of the total radius R. The dashed lines indicate mass shells in

which 50% and 90% of the total luminosity L is produced. (From Kippenhahn et al. 2012)

mass, depending on the mass regime and on the occurrence or absence of CBM. When convection occurs,

turbulent motions cause an efficient mixing of the stellar matter, and thus a larger volume is influenced. In

Figure 10.4, the convection zones are shown as a function of stellar mass for standard models without extra

mixing (i.e., neither CBM nor chemical mixing in the envelope). We see that there are no convective zones near

the stellar core of stars with masses below 1 solar mass, and that the extent of the central convective core

increases with increasing stellar mass. A schematic representation of the location of the convective zones

according to stellar mass, is shown in Figure 10.5, in which the left cartoon represents stars with M ≳ 2M⊙,

the middle one those with 1M⊙ ≲ M ≲ 2M⊙, and the right one stars with M ≲ 1M⊙.

For stars with masses between 0.1 and 1M⊙ with a radiative core, the change in hydrogen content due

to hydrogen burning is easily determined. The variation of X for a certain fluid element is proportional to

the local value of εH when there is no convective mixing going on. This means that the change in hydrogen

abundance after a time interval Δt is given by ΔX ∼ εH Δt. Hence, the chemical evolution can be traced


Figure 10.5: Sketch of the convection zones in ZAMS stars. Left: Stars with masses above 2M⊙; middle:

stars with masses between 1 and 2M⊙; right: stars with masses below 1M⊙. (Figure courtesy of Prof. J.

Christensen-Dalsgaard)

easily throughout the entire hydrogen-burning stage. At the end of the main-sequence stage, X → 0 in the

stellar core. The effective temperature of these low-mass stars barely changes during the main sequence.

For more massive stars, the helium production is much more concentrated towards the center, because

there is a much larger temperature dependence for the CNO cycle than for the pp chains. The convection in

the central parts is so efficient and rapid, that the stellar core can be regarded homogeneous in composition

at all times. Within the core, we hence have ΔX ∼ εH Δt, in which εH is an average value of the energy

production over the total stellar core.

The evolution of the size of the convective core during the main sequence for stars with M > 1M⊙

depends on the stellar mass. For stars with M ≳ 1.6M⊙, ∇rad changes mostly because of the variation in

opacity. In the core we have κ ∼ (1 +X), hence the opacity will decrease. For these stars, the convective

core will thus shrink when they evolve along the main sequence. This was already illustrated by the change

in the buoyancy frequency shown in Figure 5.4. As a result, products of the nucleosynthesis are left behind

in a region around the convective core. The stars get a slightly larger radiative outer zone, in which the

temperature drops faster than in a convective zone, and hence the star becomes cooler at the surface, which

induces a movement to the right in the HR diagram.

For M ≲ 1.6M⊙, on the other hand, ∇rad changes mostly due to the contribution of ℓ(r)/m(r) = ε.

Because εpp ∼ X^2 and εCNO ∼ XZ, the relative importance of the CNO cycle in the energy production

increases with respect to the pp chains as the star evolves along the main sequence. The pp chains find it increasingly

hard to occur in equilibrium, which makes the reactions more temperature sensitive. Because

the CNO burning is concentrated in a smaller core, ℓ(r)/m(r) increases. This increase dominates over the

decrease in opacity, which implies that the radiative temperature gradient increases during the star’s main-

sequence evolution, hence the convective core grows. In reality, both opacity and energy production change


simultaneously, and their combined effect on ∇rad must be considered. In stars with M < 1.3M⊙, the pp chains remain the dominant energy source. The effective temperature of these stars is mostly determined

by the large outer convective zone, and barely changes as a result of the slightly larger convective core. In

stars with 1.3M⊙ < M < 1.6M⊙, the CNO cycle is already the dominant energy production mechanism

(see Figure 6.5), and the outer convective layer is very thin. These stars evolve to the right during the main

sequence evolution.

The time a star spends on the main sequence depends on its mass, because the luminosity depends

strongly on mass. If we denote the energy reservoir available from hydrogen burning by EH, the star can

remain on the main sequence for τH ≡ EH/L. As a rough approximation, we assume that a fixed fraction of

the mass in hydrogen MH is available for hydrogen burning. In this assumption EH ∼MH ∼M . Although

the luminosity L changes for stars during their main-sequence stage, we can use the mass-luminosity relation

for ZAMS models to estimate τH. We thus find the following dependence of the main-sequence lifetime on

stellar mass:

$$\tau_{\rm H}(M) \sim \frac{M}{L} \sim M^{1-\eta_1}. \tag{10.3}$$

For the average exponent η1 = 3.3 of the mass-luminosity relation, we find that τH(M) ∼ M^−2.3: the

main-sequence lifetime decreases rapidly with increasing mass. A typical value is 2 × 10^8 years for a 5M⊙

star, and 10^10 years for the Sun.
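The scaling of Eq. (10.3) is easily turned into a rough lifetime estimate by normalising to the solar value of 10^10 years. The sketch below does this for the average exponent η1 = 3.3; the function name and the normalisation approach are illustrative, and the result is only as good as the single-exponent mass-luminosity relation.

```python
TAU_SUN_YR = 1.0e10  # main-sequence lifetime of the Sun quoted in the text [yr]
ETA1 = 3.3           # average mass-luminosity exponent quoted in the text


def ms_lifetime_yr(mass_msun, eta1=ETA1):
    """Rough main-sequence lifetime from tau_H ~ M^(1 - eta1), solar-normalised."""
    return TAU_SUN_YR * mass_msun ** (1.0 - eta1)


for mass in (0.5, 1.0, 5.0, 20.0):
    print(f"M = {mass:5.1f} Msun -> tau_H ~ {ms_lifetime_yr(mass):.1e} yr")
```

For 5 M⊙ this gives a few times 10^8 years, consistent with the typical value quoted above.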

The faster main-sequence evolution of more massive stars is clearly confirmed by the observational

studies of the HR diagram of star clusters. These are concentrations of stars on the sky, so close to each other

they must be physically linked. There are two types of star clusters. Galactic or open clusters contain stars

of Population I and are located in the Galactic disk, where they easily get disrupted depending on their mass

content and the surroundings in the disk. Indeed, they contain typically only a few hundred stars and are not

strongly gravitationally bound. Globular clusters, on the other hand, consist of millions of Population II stars.

They are found at great distances away from the Galactic disk and do not get disrupted easily. All stars in

a cluster are at approximately the same distance from Earth, which implies that differences in apparent magnitudes

are equal to differences in absolute magnitudes for all cluster members – see Equation (1.10). Hence, a

diagram of apparent magnitude versus colour has the same shape as a diagram of the absolute magnitude

versus colour.
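The argument can be checked numerically with the standard distance-modulus relation m − M = 5 log10(d / 10 pc), which is assumed here to be what Equation (1.10) expresses (interstellar extinction is neglected). The cluster distance and the apparent magnitudes below are made-up illustrative values.

```python
import math


def distance_modulus(distance_pc):
    """m - M for a source at distance_pc parsec, without extinction."""
    return 5.0 * math.log10(distance_pc / 10.0)


def absolute_magnitude(apparent_mag, distance_pc):
    """Absolute magnitude from apparent magnitude and distance."""
    return apparent_mag - distance_modulus(distance_pc)


cluster_distance_pc = 2300.0       # hypothetical cluster distance
apparent = [8.5, 11.2, 14.0]       # hypothetical apparent magnitudes of members
absolute = [absolute_magnitude(m, cluster_distance_pc) for m in apparent]

# Magnitude differences between members are the same in both systems.
print(apparent[1] - apparent[0], absolute[1] - absolute[0])
```

Because every member is shifted by the same distance modulus, the cluster sequence keeps its shape and is only displaced vertically in the diagram.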

All stars in a cluster were born more or less simultaneously, and therefore have the same age τcluster. Consequently, all stars with a mass exceeding a certain limit Mlimit will already have left the main sequence,

while stars with a smaller mass M < Mlimit are still in the stage of hydrogen burning in the core. Observa-

tions of stars in clusters confirm this scenario. In Figure 10.6, we show the contrast in HR diagram between a

young and an old cluster. In the bottom panel, the HR diagram of the young double cluster h and χ Persei is

shown, in which the lower-mass stars are still pre-MS stars evolving towards the ZAMS, while more massive

stars are already on the ZAMS and the most massive stars have become red supergiants. In the top panel,

the HR diagram of the old star cluster M 5 is shown, in which the more massive stars have clearly evolved

off the main sequence, while the low-mass stars are still on the main sequence. The horizontal branch (see

Chapter 12 for a definition) is clearly visible in this old cluster.

In Figure 10.7, the evolutionary tracks of several galactic clusters are indicated. The difference in main-sequence

age as a function of initial stellar mass has the following important application for star clusters.


Figure 10.6: The colour-magnitude diagram for a typical globular cluster (M 5), consisting of Population II

(old) stars (top), and the young galactic double cluster h&χ Persei, containing Population I stars (bottom).

(Source: Wikipedia)


Figure 10.7: A schematic representation of colour-magnitude diagrams of several galactic star clusters. The

age scale on the right-hand side is based on evolutionary model computations for Population I stars. The

turn-off point of each cluster reveals its age. (Source: Wikipedia)

The limit mass which indicates whether or not a star in the cluster is still on the main sequence, is given by

the condition τcluster = τH(Mlimit). This condition is the basis for the age determination of stellar clusters.

The turn-off point determines the age of the cluster, indicated on the right-hand side of the figure. The older

the cluster, the lower the turn-off from the main sequence towards the red giant branch of the cluster will

be situated (see Figure 10.7). The example of h&χ Persei (see Figure 10.6) shows that the low-mass stars in

extremely young clusters have not yet reached the main sequence. The study of these young stars reveals

details of the evolution of protostars while contracting towards the main sequence.
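
The condition τcluster = τH(Mlimit) can be inverted for the turn-off mass once a relation τH(M) is available. The sketch below assumes a crude power law, τH ≈ 10 Gyr × (M/M⊙)^(−2.5). This is a common order-of-magnitude scaling, not the result of the evolutionary model computations behind Figure 10.7; real age determinations rely on isochrones. The constants and function names are assumptions of this sketch.

```python
TAU_SUN_GYR = 10.0   # assumed main-sequence lifetime of a 1 Msun star
EXPONENT = -2.5      # assumed slope of the lifetime-mass scaling

def ms_lifetime_gyr(mass_msun):
    """Approximate core-hydrogen burning lifetime tau_H(M) in Gyr."""
    return TAU_SUN_GYR * mass_msun ** EXPONENT

def turnoff_mass_msun(cluster_age_gyr):
    """Solve tau_cluster = tau_H(M_limit) for M_limit (in Msun)."""
    return (cluster_age_gyr / TAU_SUN_GYR) ** (1.0 / EXPONENT)

for age in (0.01, 0.1, 1.0, 10.0):
    print(f"age = {age:6.2f} Gyr -> M_limit ~ {turnoff_mass_msun(age):.2f} Msun")
```

The older the cluster, the lower the mass at the turn-off, in line with the trend described above.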

Also the influence of the chemical composition on the stellar evolution is noteworthy. This was already

revealed in Figures 1.9 and 10.1. In terms of cluster ageing, one has to keep in mind that globular clusters

consist of Population II stars, which are metal-poor and hence have lower opacities than their Population I

metal-rich analogues in galactic clusters. The age determination of globular clusters is used as a limit to the

age of the Universe, and is in that sense important for observational cosmology.


Figure 10.8: Multiple populations, due to various levels of initial helium (see model fits in the inset), in the

globular cluster NGC 2808. (From Piotto et al., ApJL, Vol. 661, L53, 2007)

Star cluster studies constitute an entire research field by themselves in astrophysics, particularly since

Gaia DR2 revealed spectacular improvements of the CMD morphologies (cf. Figures 1.8 and 1.9). These

new astrometric data still have to be digested by the astronomical community, about a decade after the discovery

of multiple main sequences in globular clusters due to different chemical compositions (mainly helium)

caused by different populations in terms of generations of stars. Young open clusters, on the other hand,

have extended main-sequence turnoffs. These are, up to the present day, attributed to differences in stellar

rotation, although binarity certainly also plays a role. The two phenomena are illustrated in Figures 10.8 and

10.9. So far, CBM has been kept constant for all stars in isochrone fitting of young open clusters, leading

to typically 30% relative uncertainty in the ages derived from isochrones. This is a serious limitation, given that

asteroseismology shows a whole range in αov to occur in stars of the same mass, age, and rotation. Improv-

ing stellar ageing from isochrone computations by allowing stars that belong to the same binary or cluster

to have different levels of CBM and envelope mixing (following Chapter 7), offers a new explanation for

extended main-sequence turnoffs in clusters, cf. Figures 10.11 and 10.12 discussed in the next subsection.


Figure 10.9: The extended main-sequence turnoff of the open cluster NGC 2818, compared with evolution-

ary tracks of rotating stellar models, failing to explain the core helium burning cluster members. (From

Bastian et al., MNRAS, Vol. 480, p.3739, 2018)

10.4 The end of core-hydrogen burning

The evolutionary track of a core-hydrogen burning star of 7 M⊙ in the HR diagram is shown in the left

panel of Figure 10.10. This track is computed for standard models without CBM in the core boundary

layers and without extra chemical mixing in the radiative envelope. From point A on the main sequence,

the star moves up and to the right, on its way to B. The increase in luminosity is due to the increase in

mean molecular weight in the core, because of the conversion of hydrogen to helium – see Eq. (2.31) and

recall that P ∼ T/µ and L ∼ T⁴. When almost all hydrogen has been used up (X = 5%), a minimal

effective temperature is reached (point B). This stage is called the terminal-age main sequence (TAMS).

The star will soon experience an energy crisis. Because the central temperature is far too low to start helium

burning, the stellar core starts to contract and the star evolves to the left in the HR diagram. The evolution

is accelerated, because the remainder of hydrogen in the core is consumed very quickly. At the end of the

hydrogen core burning (point C), the star consists of a core containing helium. This core is still not hot

enough to start helium burning. The helium core is surrounded by a hydrogen-rich envelope. Due to the

increase in temperature near the core when the star evolved from B to C, the temperature at the bottom of

the envelope is high enough to ignite hydrogen burning in this region and thus produce the nuclear energy

needed to counterbalance gravitational contraction. In this way, a stage of hydrogen shell burning is initiated.
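
To get a feeling for the size of the change in mean molecular weight between points A and B, the sketch below uses the standard expression for a fully ionized gas, µ ≈ 1/(2X + 3Y/4 + Z/2). Whether this is exactly the form of Eq. (2.31) is an assumption here. The compositions are illustrative, except for the central hydrogen fraction X = 0.05 quoted above for point B.

```python
def mean_molecular_weight(X, Z):
    """Mean molecular weight of a fully ionized gas with Y = 1 - X - Z."""
    Y = 1.0 - X - Z
    return 1.0 / (2.0 * X + 0.75 * Y + 0.5 * Z)

Z = 0.02                                      # illustrative metal mass fraction
mu_zams = mean_molecular_weight(0.70, Z)      # unprocessed core material
mu_tams = mean_molecular_weight(0.05, Z)      # core near point B
print(f"mu(ZAMS core) = {mu_zams:.3f}, mu(TAMS core) = {mu_tams:.3f}")
```

Under these assumptions the central mean molecular weight roughly doubles during core-hydrogen burning.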


Figure 10.10: Hertzsprung-Russell diagrams with evolutionary tracks for Population I stars during the stage

of core hydrogen burning. The ZAMS is indicated with a dashed line. (a) For a 7M⊙ star. The points A,

B, and C correspond to the time of stellar birth (ZAMS), minimal Teff , and exhaustion of hydrogen in the

core, respectively. The dotted line indicates the subsequent stellar evolution after the core hydrogen burning

stage. (b) For stars with a mass between 4 and 8M⊙. (c) For stars with a mass between 1 and 3M⊙. (From

Kippenhahn et al. 2012)

The evolutionary track shown in the left panel of Figure 10.10 is representative for all stars with a

large convective core. This is illustrated in the middle panel of the same figure. The increase in luminosity

between points A and B is larger for stars with higher masses, while the variation in effective temperature

remains more or less the same. Figure 10.11 illustrates the major effect of the presence of extra mixing in the

convective boundary layer (whatever its physical cause) on the duration of the main sequence (dotted tracks)

compared to the case where no extra mixing aside from convective mixing in the core occurs (full tracks)

for stars with birth mass between 3 and 7 M⊙. The reason for this large difference is that the extra mixing at

the bottom of the radiative envelope transports fresh hydrogen to the convective core, extending drastically

the core-hydrogen burning phase. This also implies that, by the time the star reaches the dashed-dotted line,

it has a more massive helium core and a higher luminosity compared to the standard models where no extra

mixing occurs outside the convective core.

For stars with a mass lower than that of the Sun, the evolution tracks are different. This is indicated in


Figure 10.11: Evolutionary tracks computed with MESA to illustrate the effect of convective boundary and

envelope mixing. The models are based on an exponentially decaying Dov(r) with fov = 0.04 (MESA

notation) and mixing due to internal gravity waves at the level of Denv(r) = 1000 cm² s⁻¹ (see Figure 7.3)

in the deep bottom of the radiative envelope (dotted grey tracks) compared to the case where these two types

of chemical mixing are at low value (fov = 0.01 and Denv(r) = 1 cm² s⁻¹; full grey lines). The dashed and

dashed-dotted lines indicate the exhaustion of the central hydrogen mass fraction. (Figure courtesy of Dr.

Cole Johnston)

the right panel of Figure 10.10. These stars do not have a convective core and are thus not subject to convec-

tive core mixing. They experience a more gradual transition from core to shell hydrogen burning, building

up their helium core starting in the stellar centre and continuously increasing from the centre outwards.

The onset of hydrogen burning in a shell has important consequences for the internal stellar structure.

The structure and change of the helium core depend on the mass and chemical composition of the star. A

core in thermal and hydrostatic equilibrium without internal source of energy does not contribute to the

luminosity and therefore has to be isothermal (since l ∼ dT/dr). However, the pressure delivered by the

core must be sufficient to support the weight of the envelope on top of it. This

is only the case if the mass of the helium core remains below the so-called Schönberg-Chandrasekhar limit

(MSC). This limit is defined as the maximum mass that a non-fusing, isothermal He core can have while


Figure 10.12: The grey area in this HR diagram is an isochrone cloud, which is a zone covered by stars of

equal age, having different levels of CBM and envelope mixing between the limits indicated in Figure 10.11.

(From Johnston et al. 2019, A&A, Vol. 632, id.474, 11pp.)

still being able to support the overlying envelope. This limit is expressed as the ratio of the He core mass

to the total stellar mass and lies between roughly 7% and 15% of the total stellar mass. The helium core

can only remain isothermal until its mass reaches MSC . After that, the core cannot provide enough pressure

force to counteract the gravitational force experienced by the envelope, so it starts to contract. It can be

shown that, under the assumption of an isothermal ideal gas,

$$M_{\rm SC} \approx 0.37 \left(\frac{\mu_{\rm env}}{\mu_{\rm core}}\right)^{2} M \approx 0.10 \ \text{to} \ 0.15\,M . \qquad (10.4)$$
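
Equation (10.4) is easy to evaluate for given mean molecular weights. In the sketch below, the values µenv ≈ 0.6 and µcore ≈ 1.3 are illustrative assumptions for a hydrogen-rich envelope and a nearly pure helium core; they are not taken from the text. The function names are hypothetical.

```python
def schoenberg_chandrasekhar_fraction(mu_env, mu_core):
    """Limiting core mass fraction M_SC / M from Eq. (10.4)."""
    return 0.37 * (mu_env / mu_core) ** 2

def core_must_contract(core_mass, total_mass, mu_env=0.6, mu_core=1.3):
    """True if an inert isothermal core exceeds the limit (masses in the same units)."""
    return core_mass / total_mass > schoenberg_chandrasekhar_fraction(mu_env, mu_core)

q_sc = schoenberg_chandrasekhar_fraction(0.6, 1.3)
print(f"M_SC / M ~ {q_sc:.3f}")
# A 0.6 Msun helium core in a 5 Msun star (core mass fraction 0.12).
print(core_must_contract(0.6, 5.0))
```

With these assumed molecular weights the limiting fraction falls near the lower end of the range quoted above; a less helium-enriched core, with smaller µcore, gives a larger fraction.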

Stars with an initial mass above 2 to 3 M⊙ (depending on their metallicity) already have a helium core

mass above the Schönberg-Chandrasekhar limit at the TAMS, because their convective core homogeneously

mixed all material entering the core. Their helium core therefore starts to contract on a Helmholtz-Kelvin

time scale immediately after the TAMS.

Stars that have not yet reached the Schönberg-Chandrasekhar limit at the TAMS maintain an inert

isothermal helium core in hydrostatic equilibrium that does not contribute to the luminosity and where the

pressure is partially delivered by degenerate electrons and partially by the ions. For these stars, the stellar

structure consists of a helium core with mass Mcore = q0M , surrounded by a hydrogen-rich envelope with

mass (1 − q0)M , with q0M < MSC . This is displayed schematically in Figure 10.13. The luminosity

of these stars only originates from hydrogen shell burning at the bottom of the envelope. The functions


Figure 10.13: Schematic temperature profile in an equilibrium model with an isothermal inert helium core

with mass q0M < MSC . Hydrogen shell burning occurs in the shaded region, which is located at the bottom

of the stellar envelope. (From Kippenhahn et al. 2012)

that describe the stellar structure can therefore be evaluated separately for the inert helium core and for the

surrounding envelope, and be connected to each other at the boundary.

10.5 Later stages of evolution

The evolution of the star beyond the TAMS depends on the initial stellar mass:

• Stars with a birth mass just above the minimum of ∼ 0.08 M⊙ are fully convective. This implies

that the helium created through the hydrogen burning is constantly and fully mixed in the entire star.

Such stars never reach a sufficiently high temperature to burn helium. They will hence lead to helium

white dwarfs. Currently, such white dwarfs are hypothetical in the sense that none exist yet in our

Universe, since it is still too young to allow for such stars to have evolved past the core-hydrogen

burning stage.

• Stars born with a mass below some 0.5M⊙ will initiate hydrogen-shell burning past the TAMS, while

the helium core contracts. The minimum mass for a star to enable helium burning is, however, some

0.47 M⊙. Hence these stars will die as hydrogen-helium white dwarfs. Also in this case, our present

Universe is too young to allow for such white dwarfs to have formed.

• After central hydrogen burning the stars with 0.5 ≲ M ≲ 2.3 M⊙, where the precise boundary masses

depend on the metallicity, have a degenerate He core. They start the helium burning explosively with

a series of off-centre thermal runaways (collectively called the “helium flash” – see Chapter 12). They

will end as a carbon-oxygen (CO) white dwarf.

• After central helium burning the stars with intermediate mass 2.3 M⊙ ≲ M ≲ 8 M⊙ have a (partially)

degenerate CO core. The central temperature of the stars with 2.3 M⊙ ≲ M ≲ 6 M⊙ never reaches

8 × 10⁸ K and these stars cannot start carbon burning. They live on, relying on the hydrogen and

helium shell burning. They end as a carbon-rich white dwarf.


Stars with 6 M⊙ ≲ M ≲ 8 M⊙ can ignite a few nuclear reactions of carbon burning and end as an O,

Ne, Mg white dwarf, provided that their CO core does not reach too high a level of degeneracy when

they achieve a central temperature of 8 × 10⁸ K. In the case of stars whose CO core reaches a high

degree of degeneracy, an off-centre thermal runaway may occur when the temperature required for

carbon burning is reached, leading to an explosion similar to the helium flash but now

called the carbon flash. This flash may be fatal for the star, or not, depending on the efficiency of neutrino

cooling, as we will discuss for the helium flash in Chapter 12.

• For stars with mass M ≳ 8 M⊙, the core never contains degenerate matter. Their central temperature

keeps on increasing with every core contraction cycle. They pass through all successive burning cycles

until their core consists of iron. They end their life as a supernova with a neutron star or black hole as

a remnant.
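
The mass ranges in the list above can be summarized as a simple lookup. The sketch below hard-codes the boundary masses quoted in the text (0.08, 0.5, 2.3, 6 and 8 M⊙). As noted above, the real boundaries depend on metallicity, and the sketch ignores mass loss and the carbon-flash caveat for stars of 6 to 8 M⊙. The function name is hypothetical.

```python
def end_product(birth_mass_msun):
    """Expected end product for a given birth mass, following the list in the text."""
    m = birth_mass_msun
    if m < 0.08:
        return "no hydrogen burning (below the minimum stellar mass)"
    if m < 0.5:
        return "helium or hydrogen-helium white dwarf"
    if m < 2.3:
        return "CO white dwarf (after a helium flash)"
    if m < 6.0:
        return "carbon-rich white dwarf (no carbon burning)"
    if m < 8.0:
        return "O-Ne-Mg white dwarf"
    return "supernova leaving a neutron star or black hole"

for m in (0.3, 1.0, 5.0, 7.0, 10.0):
    print(f"{m:5.1f} Msun -> {end_product(m)}")
```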

Figure 10.14: Global picture of the evolution for stars of (X,Z) = (0.7, 0.02) and masses of 1, 2, 3, 5, 7

and 10 M⊙ in the HR diagram (left panel) and in the central temperature versus central density plane (right

panel). The dotted lines are the ZAMS, while the dashed lines in the right panel show the limits for the

various EOS (as in Figure 4.6). The 1 M⊙ curves are characteristic of those of lower-mass stars, i.e., the

central core becomes degenerate after the main sequence and helium will get ignited in a thermal runaway

at the tip of the red giant branch. The 2 M⊙ model will also undergo a He flash but the 3 M⊙ model will

start burning helium quietly under non-degenerate conditions and hit degeneracy only in its CO core formed

from the core helium burning. The 5 M⊙ model is representative of stars undergoing quiet He ignition and

He burning causing a loop in the HR diagram, while never reaching a high enough temperature to ignite

carbon due to electron degeneracy. The 7 M⊙ model does so, but in a state of degeneracy. The 10 M⊙ will

undergo all burning cycles in the centre without entering the area of electron degeneracy. (Figure courtesy

of Prof. Onno Pols, Lecture Notes on Stellar Structure and Evolution, Radboud University Nijmegen, NL)


Figure 10.14 shows a comprehensive HR diagram (left panel) and the properties of the core physics (right

panel) summarizing some of these various options. In the following chapters, we discuss the stellar evolu-

tionary paths beyond the TAMS in detail.


Chapter 11

Evolution of a star with 8 M⊙ ≲ M ≲ 15 M⊙

11.1 The Hertzsprung gap for stars with M ≳ 2.3 M⊙

We will now first take a look at the further evolution of all stars with M ≳ 2.3 M⊙ consisting of a shrinking

helium core with a mass higher than the Schönberg-Chandrasekhar limit that is surrounded by a hydrogen

envelope with hydrogen burning in a shell. The evolution of the internal structure and the evolutionary track

of a Population I star of 5M⊙ is shown in the HR diagram in Figure 11.1. The different layers in the star

are characterized by their m-value in units of M⊙. Grey areas are convection zones. The red hatched zones

indicate areas with nuclear burning.

The transition from central to shell hydrogen burning takes place in point C. At that moment the ¹H in

the core is exhausted, so the burning stops and the region is no longer convective. The hydrogen burning

ignites in a relatively broad shell surrounding the He core. This shell becomes thinner as the evolution

progresses, while the helium core will gain more mass and therefore shrinks faster. After point C, the

evolution with a shrinking helium core happens on a contraction time scale.

Due to the core contraction the core is no longer in thermal equilibrium, which means that the time

derivative in the equation of energy conservation can no longer be ignored. The increase in central tempera-

ture is accompanied by an increase of the local energy production (virial theorem!) and, consequently, of the

local luminosity. As a reaction to the shrinking core, the layers above the shell burning hydrogen expand.

The density in the core, while increasing, remains sufficiently low to avoid electron degeneracy, which is

the case for all stars with initial mass above ∼2.3M⊙. For such stars the contraction of the core thus leads

to a local increase in temperature.

When a temperature of 10⁸ K is reached, the central helium burning starts (point D). The star has found

a new energy source in the core, hence its contraction stops. A state of thermal and hydrostatic equilibrium

sets in in the core. The contraction of the core between the points C and D takes about one

Helmholtz-Kelvin time scale (≃ 3 × 10⁶ years for a star with 5 M⊙). During this time interval the outer layers have


Figure 11.1: Left: evolutionary track of a 5 M⊙ star with (X,Z) = (0.70, 0.02) without CBM. Right:

time evolution of the stellar interior in a so-called Kippenhahn diagram, where the evolutionary phases

correspond to those labeled in the left panel. Dark/light grey areas are convective/semiconvective. The red

hatched regions are areas where nuclear fusion is taking place, with dark red zones delivering 5 times more

energy than light red. (Figure courtesy of Prof. Onno Pols, Lecture Notes on Stellar Structure and Evolution,

Radboud University Nijmegen, NL)

expanded and the stellar radius has increased substantially, by about a factor of 25! The star has evolved

into a red giant in point D of the HR diagram. The expansion to a red giant occurs so rapidly that the

probability to catch stars in the transition stage from C to D is low. This is called the Hertzsprung gap in the

HR diagram: it is the area between the main sequence and the red giant branch with a deficiency of observed

stars.
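
The link between the expansion and the move to the red side of the HR diagram follows from L = 4πR²σTeff⁴. The sketch below assumes, purely for illustration, that the luminosity is the same at points C and D (the track in Figure 11.1 in fact shows a dip in between) and uses an assumed starting temperature that is not read from the figure.

```python
def teff_ratio(radius_ratio, luminosity_ratio=1.0):
    """Teff_after / Teff_before from L = 4 pi R^2 sigma Teff^4."""
    return luminosity_ratio ** 0.25 / radius_ratio ** 0.5

ratio = teff_ratio(25.0)          # radius grows by a factor of 25
teff_start = 17000.0              # assumed illustrative Teff near point C [K]
print(f"Teff drops by a factor {1.0 / ratio:.1f}: "
      f"{teff_start:.0f} K -> {teff_start * ratio:.0f} K")
```

At fixed luminosity, the effective temperature scales as R^(−1/2), so a radius increase by a factor of 25 lowers Teff by a factor of 5.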

The evolution of a star described above and shown in Figure 11.1 remains qualitatively the same

for all massive stars having a helium core above the Schönberg-Chandrasekhar limit at the end of the main

sequence and whose helium burning starts before electron degeneracy occurs (M ≳ 2.3 M⊙). In this stage

of their lives such stars do not experience a strong stellar wind (M ≲ 15 M⊙). These stars all move in a very

short time towards the area close to their Hayashi track in the HR diagram.

Figure 11.1 represents a stellar model without CBM. However, observational evidence for the occur-

rence of an overshoot zone is piling up from asteroseismology and from eclipsing binary modelling (see

Chapter 14). A pair of evolutionary tracks for stars with 3 and 5 M⊙ is shown in Figure 11.2. One track

(full lines) stands for standard models without CBM and the other track (dashed lines) shows the important

impact of the presence of CBM: it gives the star a more massive He core, a higher luminosity, and a longer

main-sequence duration as it has more fuel available.

Irrespective of the occurrence of CBM or not, the models share a common property for their envelope:


Figure 11.2: HR diagram with two types of evolutionary tracks from the ZAMS to the red giant branch for

models with 3M⊙ and 5 M⊙, this time for (X,Z) = (0.718, 0.014). The full black line concerns standard

models with only convective mixing in the core and no additional CBM, while the dashed red line represents

models with an average level of CBM. (Figure courtesy of Dr. May Gade Pedersen)

its expansion as the star becomes a red giant causes a reduction of the radiated energy (virial theorem,

this time applied to an expansion). This implies that the models show a marked “dip” in luminosity at a

temperature of about log Teff ≃ 3.7 in Figures 11.1 and 11.2. This was also already visible in Figure 7.1.

There is no “analytical” mathematical expression for the evolutionary track towards the red giant branch

once the hydrogen-shell burning is initiated. The stellar evolution models computed from numerical integra-

tion of the system of differential equations that we discussed in Chapter 8 all show this result, irrespective

of the code used. From a physics standpoint we do understand that the star will swell, because the cool

outer stellar layers become convective to transport the energy produced in the shell fusing hydrogen. Aside

from the hydrogen shell burning, also the contraction of the helium core delivers energy, and half of this

energy will have to be transported as well (once more the virial theorem). The temperature gradient in case

of convective energy transport is lower than in the case of radiative transport, causing the temperature to

decrease more slowly going outwards in convective zones compared to the case of the radiative tempera-

ture gradient. To cool down sufficiently up to the stellar surface, the star thus has to expand and its radius

becomes substantially larger. This expansion causes a decrease of temperature in the outer envelope (via

the virial theorem: half of the expansion energy is taken from the luminosity and half of it is used to cool

the star). This way the internal energy in the envelope decreases, together with the luminosity of the star,

despite the strongly increased radius.


11.2 Helium burning

At the time when central helium burning is about to start, the star of 5 M⊙ without CBM has Mcore ≃ 0.6 M⊙

and is situated in the surroundings of its Hayashi track. It has an extensive outer convective zone with a depth

reaching the position corresponding to m ≈ 0.9M⊙ for the example of the star with 5M⊙ (see Figure 11.1).

This is deeper than the maximum extent of the convective core during the main sequence (≈ 1.25M⊙ at the

ZAMS). The higher the birth mass, the deeper this envelope convective zone penetrates into the layers where

the chemical composition changed due to the CNO burning. This way, convective mixing in the envelope

disperses nuclear reaction products in the envelope and brings them to the surface of the star. This happens

between points D and E and is called the first dredge-up.

The dominant reaction in the central helium burning is 3α → ¹²C. While the abundance of ¹²C increases,

the reaction ¹²C + α → ¹⁶O gradually gains importance, so that ¹⁶O eventually takes over the lead in

chemical composition from ¹²C. Indeed, in the stage where ⁴He gets exhausted, the destruction of ¹²C in

favour of ¹⁶O is larger than the production of ¹²C by the 3α reaction. Consequently the abundance of ¹²C

starts to decrease again after having reached a maximum.
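
The rise and subsequent decline of the ¹²C abundance can be reproduced with a toy network. The sketch below integrates the mass fractions of ⁴He, ¹²C and ¹⁶O with a simple Euler scheme. The rate constants are arbitrary and dimensionless, chosen only to show the qualitative behaviour; they are not physical reaction rates, and the function name is hypothetical.

```python
def toy_helium_burning(k_3alpha=1.0, k_c_alpha=1.0, dt=1e-3, t_end=60.0):
    """Euler integration of a toy network in mass fractions (Y, X_C, X_O).

    3 alpha -> 12C      at rate k_3alpha * Y**3
    12C + alpha -> 16O  at rate k_c_alpha * X_C * Y
    The rate constants are arbitrary, not physical reaction rates.
    """
    Y, X_C, X_O = 1.0, 0.0, 0.0
    history = [(0.0, Y, X_C, X_O)]
    steps = int(round(t_end / dt))
    for i in range(1, steps + 1):
        r3 = k_3alpha * Y**3          # mass of He turned into 12C per unit time
        rc = k_c_alpha * X_C * Y      # mass of 12C turned into 16O per unit time
        Y += dt * (-r3 - rc / 3.0)    # each capture uses 4 mass units of He per 12 of C
        X_C += dt * (r3 - rc)
        X_O += dt * (4.0 / 3.0) * rc
        history.append((i * dt, Y, X_C, X_O))
    return history

history = toy_helium_burning()
peak_time, _, peak_c, _ = max(history, key=lambda row: row[2])
_, y_end, c_end, o_end = history[-1]
print(f"12C peaks at X_C = {peak_c:.3f} (t = {peak_time:.1f}), "
      f"ends at X_C = {c_end:.3f}, X_O = {o_end:.3f}, Y = {y_end:.3f}")
```

Because the 3α rate scales with Y³ while the α capture on ¹²C scales with Y, the production of ¹²C fades faster than its destruction as helium runs out, which produces the maximum.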

Depending on whether CBM has been active or not, the stage of central helium burning lasts between

15% (with average CBM) and 25% (without) of the main-sequence duration (22 Myr for the model in Figure 11.1).

At first sight, that seems surprisingly long keeping in mind that the luminosity of the star (i.e.,

the energy consumption) is higher, that the central core where the helium burning is taking place is much

smaller than for hydrogen burning and that the energy gain is below 10% of that delivered by hydrogen burning.

The reason for this extensive time span is that the largest part of the energy production

is not delivered by the helium burning in this stage, but by the hydrogen shell burning. At point E the

helium burning is responsible for less than 10% of the total energy production. However, this modest energy

production in the core is sufficient to counter the gravitational contraction and to keep the star as a whole

in thermal equilibrium. Towards the end of the helium core burning, both fusions deliver about an equal

contribution to the nuclear energy.
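
The statement that helium burning yields less than 10% of the energy of hydrogen burning can be verified per unit mass of fuel. The sketch below uses standard Q-values for 4 ¹H → ⁴He and for the 3α reaction as assumed inputs (the former includes the part carried away by neutrinos). The subsequent ¹²C + α → ¹⁶O step is not included.

```python
Q_4H_TO_HE_MEV = 26.73     # energy released by 4 1H -> 4He (including neutrino losses)
Q_3ALPHA_MEV = 7.275       # energy released by 3 4He -> 12C

e_hydrogen = Q_4H_TO_HE_MEV / 4.0    # MeV per nucleon of fuel
e_helium = Q_3ALPHA_MEV / 12.0       # MeV per nucleon of fuel
ratio = e_helium / e_hydrogen
print(f"H burning: {e_hydrogen:.2f} MeV/nucleon, He burning: {e_helium:.2f} MeV/nucleon, "
      f"ratio = {ratio:.3f}")
```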

After the point E the star has found a new energy balance and the convection in the envelope slowly

retreats: the star moves downwards along its Hayashi track towards F. It subsequently moves to the left in the

HR diagram. The bluest point G corresponds to the time when ∼75% of the central helium burning stage has

passed. At that moment the central helium mass fraction has dropped to about Y ≈ 0.25. Afterwards, the

star again returns to its Hayashi track (towards point H) as its central helium gets more and more depleted.

11.3 Later evolution stages

The core helium burning stops when the entire supply of ⁴He is exhausted and has been converted into ¹²C, ¹⁶O and ²⁰Ne.

The precise relative abundances of these produced elements depend on the temperature, the

mass, the initial chemical composition, and the occurrence (or not) of CBM. The burning is now displaced

to a concentric shell that surrounds the CO core (as of point H). While the helium shell keeps on burning,


the CO core gets heavier and contracts. The situation is now similar to the one just before the central helium

burning started.

In this stage of its life, the star has two types of shell burning that produce the necessary energy:

hydrogen shell burning in a shell that is situated at the bottom of the envelope and helium burning in the

shell right above the CO core. The CO core contracts, helium is produced between the two burning shells,

and the outer envelope expands and becomes convective. In the HR diagram the star moves up from point H

to J.

The temperature in the hydrogen shell decreases continuously. Depending on the birth mass it may

become lower than the temperature needed to keep the process of hydrogen burning going. In that case

and at that time, there is only a contracting CO core surrounded by an area above the helium shell where

all layers expand. In that situation, the luminosity rapidly increases as a consequence of the fast increasing

mass of the CO core. Whether a second loop in the HR diagram arises in addition to the one shown in

Figure 11.1, or not, depends on the mass, the initial metallicity, the nuclear burning efficiency, the opacities,

CBM, etc.

In Figure 11.1 we notice that the outer convective zone reaches deeper and deeper into the stellar

interior as the evolution proceeds. At a certain moment this zone contains about 80% of the mass and its

bottom clearly interferes with the area where the hydrogen shell burning in the preceding millions of years

has taken place. In this area all ¹H is transformed into ⁴He and almost all ¹²C into ¹⁶O and ¹⁴N. These nuclei

are transported to the surface by the convective cells during the late evolutionary stages. This is called the

second dredge-up.

11.4 Burning cycles

The evolution scenario described above is fairly complicated, when considering the position in the HR diagram

as it depends on the details of the physical properties of the star as a whole. The evolution process is however

less complicated at the level of the evolution of the stellar core. When we extrapolate the stages of central

hydrogen and helium burning, the central core undergoes subsequent cycles of nuclear fusion that can be

represented schematically as follows:

                  core burning
               ↗                ↘
   heating up of core        exhaustion of fuel
               ↖                ↙
                core contraction

The burning in a given time frame will gradually use all fuel that is available in the convective core.

The exhausted core will then contract, increasing the central temperature until it is high enough to initiate

the following burning cycle. As long as this scheme is continued, heavier nuclei keep being produced in the


Figure 11.3: Schematic illustration (not to scale!) of the internal “onion structure” of a highly evolved

massive star. A few typical values of the mass, the temperature and the density are given in cgs-units. (From

Kippenhahn et al. 2012)

stellar centre. These new heavier elements are homogeneously mixed by the convection in the core, which

shrinks at the onset of each new cycle: after central hydrogen burning we get an extensive helium core, in

which a smaller CO core is formed by helium burning etc.

Each time the central fuel is exhausted and the burning stops, the next burning cycle in the core cannot

immediately start, but a transition period of shell burning will take place. This shell burning occurs in the

hottest layer still containing fuel at that moment. Shell burnings can survive different subsequent central

burning cycles, which in turn create a new shell. Several shell burnings can thus take place simultaneously.

They are separated by mass shells with a different chemical composition, where the occurring

elements are gradually heavier as the shell is situated deeper into the star. This is called the onion model,

and is represented in Figure 11.3. Depending on the temperature differences occurring in the core at every

new cycle, a given shell burning can be activated again in the shell that was no longer active. The burning

cycles after the hydrogen and helium burning in the core all have such a short time interval that the chance

to observe a star in this stage of its life is small.


11.5 Explosive versus non-explosive evolution

The scheme above can be interrupted temporarily or definitively. A temporary interruption

can occur when the density in the central core is so high that degeneracy sets in. When the degeneracy

parameter ψ starts to increase, the electron pressure helps to counterbalance gravity so there is less need to

contract as ψ increases. As the mass of the core increases due to the shell burning, contraction continues.

The cycles of core burning will continue as long as electron degeneracy is avoided or can be lifted. A definitive

interruption can occur in two ways. On the one hand, the central core of a star with initial mass lower than

∼ 6 M⊙ will never become sufficiently hot to start carbon burning. On the other hand, while discussing the

nuclear burning mechanisms we have mentioned that ⁵⁶Fe is the most stable isotope. Therefore, the iterative

core burning scheme stops definitively when the inner core entirely consists of ⁵⁶Fe and exothermal fusion is not possible anymore.

It is obvious that we now have to make a distinction in terms of birth mass for the further evolution of

the star. Whether the mass of a star comes close to the boundary masses (2.3, 6 and 8 M⊙) depends strongly

on the mass loss it undergoes during its evolution. Up to now we did not take into account the effects of

mass loss, but a large mass loss in the form of a strong dust-driven stellar wind does occur at the end of the

stellar evolution for stars with M ≲ 8 M⊙ while stars born with a higher mass experience a radiation-driven

wind during the evolved stages for M ≳ 8 M⊙ and already during the main sequence for M ≳ 15 M⊙. The

influence of mass loss on stellar evolution is a complicated problem. The mass loss of a star with initial

mass below 8 M⊙ is such that a final core mass below the Chandrasekhar limit of 1.44 M⊙ is left, while stars

born with a mass above 8 M⊙ end up with a core mass above the Chandrasekhar limit.

We now have to make a distinction between stars more massive than 8M⊙ at birth and stars born with

a lower mass to describe the further evolution. In this chapter, we discuss the further evolution of a star with

an initial mass higher than 8 M⊙ but lower than 15 M⊙, i.e., we consider stars that are left with a core mass

higher than 1.44 M⊙ at the end of the various burning cycles. The evolution of stars with M ≲ 8 M⊙ and

M ≳ 15 M⊙ will be treated in subsequent chapters.

11.6 Neutron stars

11.6.1 Supernova explosion

For stars with 8M⊙ <∼ M <∼ 15M⊙ the CO core is not degenerate after helium burning. During the

contraction following the central helium burning, the central temperature increases sufficiently to induce

in turn carbon, oxygen, and silicon burning. These final cycles elapse very fast. For a star with 15 M⊙,

carbon burning sustains the star for about 5 000 years, oxygen burning for about 1.7 years, and

silicon burning for just a few days! The end of the silicon burning stage, which mainly produces 56Ni,

entails a serious problem for the star: it is no longer able to generate energy through nuclear reactions in the

core and to balance the gravitational force.

These stars thus complete the whole burning cycle until they have built up an Fe core. As mentioned above,

the stable state inevitably comes to an end: gravity is the winning force and the core collapses

very quickly. With the collapse of the core, the material in the envelope is accelerated to a velocity that can

amount to half the speed of light. This is the consequence of the enormous gravitational force by which the

particles of the collapsing core attract those in the envelope. These accelerated envelope particles suddenly

come to a stop when they collide with the very dense core of the star: their kinetic energy is converted

into heat and a strong temperature increase occurs. The temperature of the core of the star rises up to

T > 10^10 K. This time the increased energy does not result in the start of a new burning cycle. On the

contrary, the increase of the temperature implies that the photons get a higher energy and consequently the

photo-dissociation of the nuclei dominates. Because of this, the heavy nuclei that were formed during the

last burning cycle are broken up. First the nuclei of the iron group are transformed into α particles:

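\[
\begin{aligned}
{}^{56}\mathrm{Ni} + \gamma &\rightarrow 14\,{}^{4}\mathrm{He}, \\
{}^{54}\mathrm{Fe} + \gamma &\rightarrow 13\,{}^{4}\mathrm{He} + 2n, \\
{}^{56}\mathrm{Fe} + \gamma &\rightarrow 13\,{}^{4}\mathrm{He} + 4n, \quad \ldots
\end{aligned}
\tag{11.1}
\]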

While energy was generated during the building up of these heavy isotopes, the process of dissociation costs

energy, i.e., these are endothermal reactions. The required energy is provided by the contraction of the core.

The resulting increase in temperature subsequently implies also a photo-dissociation of each α particle:

\[
{}^{4}\mathrm{He} + \gamma \rightarrow 2\,{}^{1}\mathrm{H} + 2n, \tag{11.2}
\]

again requiring energy and thus accelerating the contraction even more. At this point, the whole sequence

of nucleosynthesis to build up the chemistry deep inside the star gets undone in less than a second. . .
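As a rough illustration of why the reactions (11.1) and (11.2) drain energy from the core, the sketch below estimates the energy cost of each photo-dissociation from nuclear binding energies. The binding energies are rounded values from standard nuclear mass tables and are an assumption of this example, not numbers taken from these notes. The function names are purely illustrative.

```python
# Rounded total binding energies in MeV (assumed values from standard
# nuclear mass tables; the last digit should be treated as approximate).
BINDING_MEV = {"He4": 28.30, "Fe54": 471.76, "Fe56": 492.26, "Ni56": 483.99}

MEV_TO_ERG = 1.602e-6   # 1 MeV expressed in erg
AVOGADRO = 6.022e23     # nuclei per mole


def dissociation_cost_mev(parent, n_alpha):
    """Energy (MeV) needed to split one `parent` nucleus into n_alpha alpha
    particles plus free neutrons. Free neutrons carry no binding energy, so
    the cost is the difference in total binding energy."""
    return BINDING_MEV[parent] - n_alpha * BINDING_MEV["He4"]


def cost_per_gram_erg(cost_mev, mass_number):
    """Convert a cost per nucleus (MeV) into a cost per gram of matter (erg/g),
    approximating the molar mass by the mass number."""
    return cost_mev * MEV_TO_ERG * AVOGADRO / mass_number


# Reactions of Eq. (11.1): iron-group nuclei -> alpha particles (+ neutrons)
cost_ni56 = dissociation_cost_mev("Ni56", 14)
cost_fe54 = dissociation_cost_mev("Fe54", 13)
cost_fe56 = dissociation_cost_mev("Fe56", 13)

# Reaction of Eq. (11.2): one alpha particle -> 2 protons + 2 neutrons
cost_alpha = BINDING_MEV["He4"]

for name, cost, a in [("56Ni", cost_ni56, 56), ("54Fe", cost_fe54, 54),
                      ("56Fe", cost_fe56, 56), ("4He", cost_alpha, 4)]:
    print(f"{name}: {cost:7.2f} MeV per nucleus, "
          f"{cost_per_gram_erg(cost, a):.2e} erg/g")
```

With these inputs every cost is positive, i.e. all reactions are endothermal, and the dissociation of 56Fe requires about 124 MeV per nucleus. This energy has to be supplied by the contraction of the core.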

The photo-dissociation results in a mixture of protons, electrons and neutrons. This results in a drastic

increase of the core density and consequently the electrons and protons are forced to combine to form

neutrons. The density becomes so high that the neutrons subsequently collide with one another. The drastic

increase in pressure results in a shock wave that propagates through the outer layers of the star surrounding

the core full of neutrons. Part of the energy of the shock wave gets dumped in the remains of the core of

the star. Another part gets diverted as neutrinos. Because of the high densities, large quantities of neutrinos

get caught by the outer layers of the star. The result of this dumping of neutrino energy is that the layers

surrounding the core get expelled: the star explodes as a supernova and temporarily becomes as bright as a

galaxy! The table represented in Figure 11.4 lists some quantitative values of the nuclear burning properties

of a 15 M⊙ star from birth until the supernova explosion.

Supernovae are classified as Type I or Type II according to observational characteristics. A

supernova is of Type II when hydrogen lines occur in its spectrum and of Type I when such

lines are absent. Each supernova, however, has its own characteristic shape of the light curve, and many

subclasses have been introduced so far. Type II supernovae are not observed in old stellar populations, like

the elliptical galaxies, but are observed in galaxies with spiral arms rich in gas and dust. Type I supernovae are

observed everywhere and are split into categories Ia and Ib/c. In general, Type II supernovae are associated

with the collapse of the iron core of a single massive star; a more appropriate term than Type II is

core-collapse supernova. At the time of implosion, these stars still have hydrogen-rich envelopes, explaining the

detection of hydrogen lines in the spectrum. Because massive stars evolve much faster than low-mass stars,

elliptical galaxies already have their core-collapse supernovae behind them. Type Ia supernovae originate when

a star crosses the Chandrasekhar mass limit. This mainly occurs through accretion in a binary system,

Figure 11.4: Typical values for the central temperature, density, and duration of each of the core burning

cycles of a 15 M⊙ star prior to the formation of a neutron star.

where mass is transferred from a companion star to a white dwarf (see Chapter 14 and the MSc courses Binary Stars

and High Energy Astrophysics). Due to the constant formation of white dwarfs and close binaries in a

given population, Type Ia supernovae are observed in all types of galaxies, i.e., in young as well as in old

populations.

The observational classification into Type I and II supernovae does not always correspond to the astrophysical

interpretation, i.e., the division according to whether or not an iron core collapses and expels its outer

layers. Type Ib/c supernovae originate from the explosion of a massive star in spiral galaxies, more particularly

a Wolf-Rayet star that has lost all of its hydrogen due to a very strong radiation-driven stellar wind (see

Chapter 13). Their spectra hardly show any hydrogen lines, even though the explosion does result from

a collapsing iron core.

11.6.2 The neutrino flux and the r-process

At the very high densities achieved during the collapse of the stellar core, the electrons very efficiently

come close to the nuclei, where they can transform protons into neutrons. While free neutrons are unstable

particles that decay with a half-life of about 10 minutes in non-degenerate matter, they no longer decay in degenerate matter:

the stellar core becomes a neutron star. The pressure becomes so high that the neutrons become degenerate.

Figure 11.5: Light curve of the supernova that exploded in 1987 in the Large Magellanic Cloud. This

supernova was easily visible by eye in the Southern Hemisphere. Note the long, almost linear decrease

of the brightness during the first months following the explosion and the hump in brightness. The latter

corresponds to the energy production supplied by the decay of 56Co.

This degenerate neutron gas will be able to prevent a further gravitational collapse.

The equation of state for a degenerate neutron gas is not yet well established. Consequently, the upper

limit for the mass of the neutron star cannot be derived precisely. Current estimates of the upper limit are around

2 M⊙. This is only slightly larger than the mass limit for a degenerate electron gas. The observational

determination of the mass of a neutron star is mainly done on the basis of binary stars of which one of the

components is a neutron star (see Chapter 14). However, just as for white dwarfs (cf. Chapter 12) these stars

are subject to binary evolution and may represent a different mass distribution. At any rate, using such a

binary approach, the masses found are compatible with the upper limit of 2 M⊙ (taking into account errors).

A neutron star has a radius of roughly 10 to 20 km.

A detailed picture of the formation of a neutron star is not available. Models for the equation of

state contain several parameters whose values are not well constrained. The current models predict that the

interior temperature will drop to 10^8 K after formation in a timespan of about 100 years. This cooling occurs

as a consequence of the strong neutrino flux. This neutrino flux is produced by the capture of electrons.

Indeed, the 56Ni isotopes formed during the silicon burning are unstable with regard to electron capture.

Consequently this isotope decays to an 56Fe isotope as follows :

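\[
\begin{aligned}
{}^{56}\mathrm{Ni} + e^{-} &\rightarrow {}^{56}\mathrm{Co} + \nu_e, \\
{}^{56}\mathrm{Co} + e^{-} &\rightarrow {}^{56}\mathrm{Fe} + \nu_e.
\end{aligned}
\tag{11.3}
\]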

The first reaction has a half-life of 6.1 days and the second one of 77 days. This radioactive decay is

Figure 11.6: Schematic representation of the r-process in a (N,Z) diagram. Indicated are the reaction chains

that represent the capture of neutrons, followed by β− decay, as a result of which heavy stable isotopes

originate. (From Wanajo, S., et al., 2004, ApJ, 606, 1057–1069)

responsible for the luminosity observed months after the explosion. All (limited) theoretical models of

neutron star formation predict that high neutrino fluxes already leave the star before the explosion becomes

optically visible. Indeed, for supernova 1987A (for the light curve of SN 1987A, see Figure 11.5) in the

Large Magellanic Cloud, 20 neutrinos were measured at the correct energy in two neutrino detectors in

the Northern Hemisphere (Japan and USA), about 6 hours before the discovery of the optical flash. This

number of neutrinos is compatible with the predicted neutrino production according to the nuclear reactions

described above. It is remarkable that the neutrinos first passed through the Earth before being detected,

as the supernova exploded in the Southern Hemisphere. These were the first neutrinos from a supernova

to be detected directly. As such they delivered a very

important and successful test for the until then uncertain calculations of the nuclear reactions, as described

above, during the final stage of a massive star.
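The two-step decay of Eq. (11.3) can be followed in time with the standard solution for a radioactive decay chain. The sketch below is only an illustration: it uses the half-lives of 6.1 and 77 days quoted above and assumes that all material starts as 56Ni at t = 0. The function names are hypothetical.

```python
import math

HALF_LIFE_NI_DAYS = 6.1    # 56Ni -> 56Co, value quoted in the text
HALF_LIFE_CO_DAYS = 77.0   # 56Co -> 56Fe, value quoted in the text

LAMBDA_NI = math.log(2) / HALF_LIFE_NI_DAYS   # decay constants in 1/day
LAMBDA_CO = math.log(2) / HALF_LIFE_CO_DAYS


def abundances(t_days, n0=1.0):
    """Return the amounts of (56Ni, 56Co, 56Fe) after t_days, assuming that
    all n0 nuclei are 56Ni at t = 0 (two-member decay chain solution)."""
    ni = n0 * math.exp(-LAMBDA_NI * t_days)
    co = (n0 * LAMBDA_NI / (LAMBDA_CO - LAMBDA_NI)
          * (math.exp(-LAMBDA_NI * t_days) - math.exp(-LAMBDA_CO * t_days)))
    fe = n0 - ni - co   # the total number of nuclei is conserved
    return ni, co, fe


def cobalt_peak_time_days():
    """Time at which the 56Co abundance is largest (dN_Co/dt = 0)."""
    return math.log(LAMBDA_NI / LAMBDA_CO) / (LAMBDA_NI - LAMBDA_CO)


for t in (0, 10, 50, 100, 200, 400):
    ni, co, fe = abundances(t)
    print(f"t = {t:3d} d: 56Ni = {ni:.3f}, 56Co = {co:.3f}, 56Fe = {fe:.3f}")
print(f"56Co abundance peaks after about {cobalt_peak_time_days():.0f} days")
```

Under these assumptions the 56Ni is essentially gone after a few weeks, so that the luminosity in the following months is governed by the much slower decay of 56Co.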

The 56Fe isotope is, as aforementioned, the most stable isotope in nature. Nevertheless, processes

responsible for the production of elements heavier than this isotope occur, such as the s- and the r-process,

which stand for “slow” and “rapid” neutron capture. In these processes, neutrons are captured by nuclei.

These processes can consequently only occur when there is an efficient production of neutrons. A free

neutron is not stable and decays with a half-life of only about 10 minutes. However, since a neutron has no electrical

charge, it can easily reach any nucleus (no Coulomb repulsion) in dense matter. The probability for a nucleus

to capture a neutron depends on the density of the neutrons, the relative velocity of the nucleus and the

neutron, and the mass number. A nucleus with a magic number of neutrons, i.e., an isotope with a closed

neutron shell, is less likely to capture an extra neutron. The s-process mainly happens in AGB stars

(“Asymptotic Giant Branch”, see next chapter), while the r-process happens during supernova explosions.

The thermo-nuclear reactions that take place before and during the supernova explosion indeed produce

elements heavier than iron because the production of neutrons during the explosion is efficient enough to

produce neutron-rich nuclei past the iron peak in a stable way. We can represent the neutron capture as

follows :

\[
\begin{aligned}
(Z, A) + n &\rightarrow (Z, A+1) + \gamma, \\
(Z, A+1) + n &\rightarrow (Z, A+2) + \gamma, \\
&\;\;\vdots
\end{aligned}
\tag{11.4}
\]

When the consecutive nuclei are unstable, they decay very rapidly through a β− decay:

\[
(Z, A) \rightarrow (Z+1, A) + e^{-} + \bar{\nu}_e. \tag{11.5}
\]

Herein the emitted neutrino is an electron antineutrino. Such a decay does not happen if meanwhile a new neutron gets

captured. As such, very heavy nuclei can originate before they get the time to decay. In the r-process (“r”

for “rapid”: the capture of neutrons is quick with regard to the β− decay), the neutron density needs to

be of the order of 10^22 cm^-3. Consequently, the path of the r-process elements in the (N,Z) diagram (see

Figure 11.6) is located deeply into the neutron-rich area, far from the valley of stability. The required large

neutron production can only be realised during supernova explosions.

The matter in the core of the star just before and just after the supernova explosion indeed consists of a

substantial number of neutrons. As a result, the r-process can take place during the cooling down stage after

the explosion. This production is the main source of the heavy elements

we find today in nature.

11.6.3 Pulsars

Neutron stars need to spin very fast around their axis. This is a consequence of the conservation of angular

momentum. When collapsing, the dimensions of the star are largely reduced: the radius shrinks from a

few million km to about 20 km. Consequently, the angular velocity will increase by a factor of 10^10. The

accompanying rotation frequency is a few tens per second. Because magnetic flux is conserved during the contraction,

the strength of the magnetic field of the star increases by about the same factor. Stars with a weak magnetic

field of a few Gauss prior to collapse suddenly get a magnetic field of about 10^10 − 10^12 Gauss.
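The orders of magnitude quoted here follow from two simple scalings, sketched below. The example assumes that the collapsing region keeps its mass and rotates rigidly, so that its moment of inertia scales as R² and conservation of angular momentum gives Ω ∝ R⁻². It further assumes that the magnetic flux B R² is conserved, so that B ∝ R⁻² as well. The initial radius of 2 million km and the initial field strengths are illustrative choices within the ranges mentioned in the text.

```python
def contraction_factor(r_initial_km, r_final_km):
    """Factor by which both the angular velocity (angular momentum
    conservation, I ~ M R^2) and the magnetic field strength (flux
    conservation, B R^2 = const) grow when the radius shrinks."""
    return (r_initial_km / r_final_km) ** 2


R_INITIAL_KM = 2.0e6   # "a few million km" (illustrative choice)
R_FINAL_KM = 20.0      # radius after the collapse, as quoted in the text

factor = contraction_factor(R_INITIAL_KM, R_FINAL_KM)
print(f"Spin-up and field amplification factor: {factor:.1e}")

# Illustrative initial field strengths in Gauss
for b_initial in (1.0, 10.0, 100.0):
    print(f"B_initial = {b_initial:5.1f} G  ->  B_final = {b_initial * factor:.1e} G")
```

Initial fields between 1 and 100 Gauss thus end up between 10^10 and 10^12 Gauss under these idealised assumptions.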

Already in 1934, two years after the discovery of the neutron, the astronomers W. Baade and F. Zwicky

predicted the existence of neutron stars as the collapsed core after a supernova explosion based on theoretical

considerations. It took until 1967, however, before the first neutron star was discovered. PhD student

Jocelyn Bell from Cambridge (UK) discovered a source of radio-emission in the sky. This source emitted

strong pulses of radio waves towards Earth at very regular intervals of about one second.

Figure 11.7: Sketch of the model of a pulsar. The radio waves of a pulsar are emitted in two beams, departing

from the two magnetic poles of the neutron star. The star rotates around an axis, which is inclined with

respect to the magnetic axis. The beams of radio-emission rotate with the star, similar to the beams of light of a

lighthouse. When a beam sweeps over the Earth, we observe a pulse of radio-emission. (From quora.com)

Such an object

is called a pulsar. The only stars known in 1967 able to spin around their axis in one second were white

dwarfs (see next chapter) and that was consequently the initial explanation given to the pulsar. In November

1968, however, a pulsar was discovered in the Crab Nebula. It emitted 30 pulses per second (the Crab pulsar).

It was known that the Crab Nebula was the rapidly expanding remnant of a supernova explosion, because

a dazzling bright star had appeared on that spot on the 4th of July 1054. This supernova was even visible

during day time for a couple of weeks. In Japanese, Chinese and Korean chronicles, the appearance of this

“super-bright new” (super-nova) star is elaborately recorded and the evolution of the brightness is tabulated.

Figure 11.8: Pulse profiles of 45 pulsars. Each pulse profile depicts how the intensity of the radio-emission

received from the pulsar varies during one pulse period. Some pulsars have a pulse consisting of only one

peak, others can have two or even three peaks. The pulse periods vary between 0.1 and a few seconds.

(Figure courtesy of Prof. Ed van den Heuvel, University of Amsterdam, NL)

The very short pulse period of the Crab pulsar makes it impossible for this object to be a white dwarf,

because the rotation rate at the surface of the star would exceed the critical rotation rate at which

the centrifugal force balances gravity. As such, it became clear that it must be a fast rotating neutron star. The radio waves

are generated above the strong magnetic poles of the star, in the form of beams (see Figure 11.7). The

neutron star rotates around an axis inclined with regard to the magnetic axis. Consequently, the beams

sweep over the Earth at regular intervals. As for the rotating beam of a lighthouse, we observe the

radio-emission as regular pulses. The discovery of the Crab pulsar was immensely important: it was realised that

pulsars are neutron stars and moreover that neutron stars are the final product of core-collapse supernovae.

The supervisor of Jocelyn Bell received the Nobel prize in physics for this discovery¹.
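The argument based on the critical rotation rate can be made quantitative with a simple estimate. Equating the centrifugal and gravitational accelerations at the equator of a spherical star gives a minimum rotation period P_min = 2π (R³/GM)^(1/2). The sketch below evaluates this for a white dwarf and for a neutron star. The white dwarf mass and radius (1 M⊙, 5000 km) are illustrative assumptions of this example, while the neutron star values (1.44 M⊙, 20 km) follow the numbers used in this chapter. Deformation of the star by rotation is ignored.

```python
import math

G_CGS = 6.674e-8      # gravitational constant in cm^3 g^-1 s^-2
M_SUN_G = 1.989e33    # solar mass in g


def breakup_period_s(mass_msun, radius_km):
    """Shortest rotation period (s) for which gravity can still balance the
    centrifugal force at the equator of a spherical star."""
    radius_cm = radius_km * 1.0e5
    return 2.0 * math.pi * math.sqrt(radius_cm**3 / (G_CGS * mass_msun * M_SUN_G))


CRAB_PERIOD_S = 1.0 / 30.0   # 30 pulses per second, as quoted in the text

p_white_dwarf = breakup_period_s(1.0, 5000.0)   # assumed typical white dwarf
p_neutron_star = breakup_period_s(1.44, 20.0)   # values used in this chapter

print(f"White dwarf : P_min = {p_white_dwarf:.2f} s")
print(f"Neutron star: P_min = {p_neutron_star * 1e3:.2f} ms")
print(f"Crab pulsar : P     = {CRAB_PERIOD_S * 1e3:.1f} ms")
```

With these inputs, the observed period of the Crab pulsar is far shorter than the minimum period of the white dwarf but comfortably longer than that of the neutron star.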

Meanwhile many more pulsars have been found. In Figure 11.8 we show the original pulse profiles of

about fifty pulsars. We notice very different shapes, from narrow and symmetrical to broad and asymmetri-

cal. The shape of the pulse profile depends on, among other things, the geometry of the magnetic field and

the inclination of the rotation axis with regard to the observer.

¹ Still today, it is considered a major and unacceptable scandal that the (female) PhD student was not included in this prestigious

award; she was after all the one who made the discovery. It did not stop Jocelyn from developing a beautiful career in astronomy in the

UK and from continuing to fight for women’s rights in this profession. Today she still is a very active Emerita.

Chapter 12

Evolution of a star with M <∼ 8M⊙

12.1 Post-main-sequence evolution

Contrary to stars with M >∼ 2.3M⊙, stars with a lower mass evolve in a qualitatively different way after the

exhaustion of the hydrogen in the central parts. There are several reasons for this. First, these low-mass stars

have no or very small convective cores. For stars with a mass smaller than the Sun there is no convective

core and consequently these objects produce a helium core with a very low mass in which no mixing takes

place. Because of this there will be a very gradual transition from central to shell hydrogen burning, much

less drastic than when convection mixes the helium core and CBM occurs adjacent to it. The mass taking

part in the hydrogen-shell burning is typically <∼ 0.1M⊙.

Electron degeneracy is important during or immediately after the main sequence (cf. Figure 10.14).

The pressure in the central parts of the star is not only produced by the ions that obey the ideal gas law, but

partially by the degenerate electron gas. Consequently these stars can handle a larger fractional helium mass

before contraction of the core sets in: their Schönberg-Chandrasekhar limit is at the highest level in the

expression given in Eq. (10.4), with MSC typically near 15% of the birth mass, depending on the chemical

composition and on the degree of degeneracy in the core. Therefore the core of these stars can remain in

equilibrium thanks to the (partially) degenerate electrons in the inert isothermal helium core. There is no

need for a rapid contraction of the core to start up central helium burning and there is no analogue of the

Hertzsprung gap for low-mass stars, also because such stars start much closer to their Hayashi track for their

further evolution.

During the first stage after the hydrogen burning in the core, the hydrogen shell burning starts and the

mass of the core grows at a very slow pace. The temperature of the core remains constant and far below the

temperature needed for helium burning. Therefore, for these stars, the shell burning in-between the central

hydrogen and central helium burning stages is a phase on a nuclear time scale, as long as their core has a

mass below their Schönberg-Chandrasekhar limit. Because of this slow phase, we expect to observe many

low-mass stars in this stage of hydrogen shell burning, which is called the subgiant phase.

Figure 12.1: Left: evolutionary track of a 1M⊙ star with (X,Z) = (0.70, 0.02). Right: corresponding

Kippenhahn diagram. The colours and hatchings have the same meaning as in Figure 11.1. (Figure courtesy

of Prof. Onno Pols, Lecture Notes on Stellar Structure and Evolution, Radboud University Nijmegen, NL)

The hydrogen shell burning causes an increase of the mass of the helium core. Consequently the

brightness increases steadily, while the hydrogen-rich envelope located above the hydrogen shell source

expands. At first the brightness only changes slightly while the star moves to the right in the HR diagram.

This movement cannot be maintained very long since the star is located close to its Hayashi track. However,

the star needs to expand its envelope further, to counterbalance the ever deeper reaching convective envelope.

This necessarily leads to a relatively strong increase of the brightness due to the increasing radius. In this

stage, the brightness will increase by a factor of 100 while Mcore keeps on growing. The star climbs up the

red giant branch (RGB), see Figure 12.1.
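The link between the increase in brightness and the increase in radius follows from L = 4πR²σT_eff⁴. The short sketch below illustrates this, under the simplifying assumption that the effective temperature stays roughly constant while the star climbs along its nearly vertical Hayashi track. The function name is illustrative.

```python
def radius_ratio(luminosity_ratio, teff_ratio=1.0):
    """Ratio R_final / R_initial implied by L = 4 pi R^2 sigma Teff^4 for a
    given ratio of luminosities and of effective temperatures."""
    return luminosity_ratio ** 0.5 / teff_ratio ** 2


# Brightness increases by a factor of 100 at (assumed) constant Teff
print(radius_ratio(100.0))

# Same increase in brightness if the star also cools by 10% (illustrative)
print(radius_ratio(100.0, teff_ratio=0.9))
```

At constant effective temperature, a factor of 100 in luminosity thus corresponds to a factor of 10 in radius. Any cooling of the surface requires an even larger expansion.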

Figure 12.1 shows the evolutionary track, as well as the inner structure of a star with initial mass 1 M⊙,

born with an initial chemistry represented by the mass fractions (X,Z) = (0.70, 0.02). During the central

hydrogen burning the star first moves up and later on evolves to the right in the HR diagram. It is about 9 Gyr old

at the bluest point B. Between B and C the central hydrogen burning stops and shell burning continues.

Because of that, the helium core becomes more massive and slowly contracts, while the envelope expands.

This takes about 2 Gyr. The star’s track is located very close to its Hayashi track, implying, as mentioned

before, that the star inevitably needs to climb up the red giant branch towards the point D.

Aside from a slightly different main-sequence evolution, all stars with M <∼ 2.3M⊙ end up in more

or less the same point D on the RGB. Stars with 1.1M⊙ <∼ M <∼ 1.5M⊙ do have a convective core during

core-hydrogen burning but do not reach their Schönberg-Chandrasekhar limit at the TAMS. On the other

hand, stars with 1.5M⊙ <∼ M <∼ 2.3M⊙ do reach this limit at the TAMS so they start core contraction and

envelope expansion on a thermal time scale τKH, passing through a small Hertzsprung gap prior to reaching

point D.

The radius of the star, and therefore its brightness as well, increases. That the star moves close to the

Hayashi track can also be seen from the inner structure, revealing that the outer convective zone contains

about 75% of the total mass. The convection zone thus reaches layers already containing heavy nuclei

produced by nuclear reactions. The star goes through its first dredge-up. The turbulent convective motions

imply that the processed material gets transported homogeneously throughout the envelope and in particular

to the surface of the star. As a consequence, the surface composition is substantially changed after the first

dredge-up: Li and C abundances as well as the carbon isotopic ratio 12C/13C decline, while 3He and the 14N

abundances increase, providing a marked decrease in the C/N abundance ratio at the surface.

The monotonically increasing character of the brightness along the RGB above point D gets interrupted

in point E. Indeed, at that stage, the hydrogen burning shell reaches the region that got replenished by fresh

hydrogen at the expense of heavier nuclei by the bottom of the convective envelope at the epoch of the first

dredge-up. This lowered µ in that position compared to what it used to be prior to the first dredge-up. As a

consequence, the luminosity of the star all of a sudden decreases. Depending on the mass and metallicity,

this happens around log(L/L⊙) ∈ [1.5, 2]. The star thus makes a small down- and upward zig-zag along the

RGB. This means it passes there three times, going up, down, and back up again. Due to this we observe an

overpopulation of stars in that part of the RGB, called the “luminosity bump”.

When the outwardly moving hydrogen burning shell reaches the discontinuity in µ created by the deep

convective envelope, it finds a reservoir of 3He, feeding the reaction 3He(3He,2p)4He and creating protons.

This lowers µ locally and hence delivers a negative ∇µ. Indeed, we have

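\[
{}^{3}\mathrm{He} + {}^{3}\mathrm{He} \rightarrow 2\,{}^{1}\mathrm{H} + {}^{4}\mathrm{He}, \quad \text{hence } \mu: 6/6 \rightarrow 6/7.
\]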

On the other hand, the following reactions also take place:

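\[
\begin{aligned}
{}^{3}\mathrm{He} + {}^{4}\mathrm{He} &\rightarrow {}^{7}\mathrm{Be} + \gamma, \\
{}^{7}\mathrm{Be} + e^{-} &\rightarrow {}^{7}\mathrm{Li} + \nu_e\,(+\gamma), \\
{}^{7}\mathrm{Li} + {}^{1}\mathrm{H} &\rightarrow {}^{4}\mathrm{He} + {}^{4}\mathrm{He},
\end{aligned}
\quad \text{hence } \mu: 7/6 \rightarrow 8/6,
\]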

implying an increase in µ and thus a positive ∇µ. In the case that the reaction 3He(3He,2p)4He is dominant,

the process of thermohaline mixing becomes active. This represents a slow mixing process acting on the

local thermal (expansion) timescale. The condition for the occurrence of thermohaline mixing is

\[
\frac{\varphi}{\delta}\,\nabla_\mu \le \nabla - \nabla_{\rm ad} \le 0, \tag{12.1}
\]

i.e., it operates in regions that are stable against convection according to the Ledoux criterion and where an

inversion in the mean molecular weight is present. Thermohaline mixing may transport chemical species

between the H burning shell and the convective envelope. The efficiency of the thermohaline mixing is

regulated by the local abundance of 3He and 4He: the second set of reactions depends linearly on the

local abundance of 3He and 4He, while the first reaction depends quadratically on the abundance of 3He.

For increasing initial mass, the second set of reactions becomes more important than the first reaction and

creates a composition barrier because µ increases, while the first reaction leads to a decrease in µ. For this

reason, we find a threshold in mass at about 1.5 M⊙ above which thermohaline mixing is not efficient.
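The changes in µ quoted for the two reaction channels, as well as the condition (12.1), can be checked with a few lines of code. The sketch below assumes fully ionised matter, so that each nucleus of charge Z contributes 1 + Z free particles, and it follows the bookkeeping of the text in comparing the 3He + 4He pair with the two 4He nuclei it ends up as. The default φ/δ = 1 corresponds to an ideal gas. All function names and the sample gradients are illustrative.

```python
def mean_molecular_weight(nuclei):
    """Mean molecular weight of a fully ionised mixture, given as a list of
    (mass number A, charge Z) pairs: total mass number divided by the
    number of free particles (nuclei plus electrons)."""
    total_mass = sum(a for a, z in nuclei)
    total_particles = sum(1 + z for a, z in nuclei)
    return total_mass / total_particles


H1, HE3, HE4 = (1, 1), (3, 2), (4, 2)

# 3He + 3He -> 2 1H + 4He
mu_a_before = mean_molecular_weight([HE3, HE3])      # 6/6
mu_a_after = mean_molecular_weight([H1, H1, HE4])    # 6/7

# 3He + 4He -> ... -> 4He + 4He (bookkeeping as in the text)
mu_b_before = mean_molecular_weight([HE3, HE4])      # 7/6
mu_b_after = mean_molecular_weight([HE4, HE4])       # 8/6


def thermohaline_unstable(grad_mu, grad_minus_grad_ad, phi_over_delta=1.0):
    """Evaluate condition (12.1): (phi/delta) * grad_mu <= grad - grad_ad <= 0."""
    return phi_over_delta * grad_mu <= grad_minus_grad_ad <= 0.0


print(mu_a_before, mu_a_after, mu_b_before, mu_b_after)
print(thermohaline_unstable(grad_mu=-1e-3, grad_minus_grad_ad=-1e-4))  # mu inversion
print(thermohaline_unstable(grad_mu=+1e-3, grad_minus_grad_ad=-1e-4))  # no inversion
```

Only the first channel lowers µ and can therefore produce the inversion of the mean molecular weight that is needed for thermohaline mixing.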

The evolution calculations of stars born with a different mass lead to similar results as those shown in

Figure 12.1. The main sequence tracks are different depending on the birth mass M <∼ 2.3M⊙, but close to

their (slightly different) Hayashi tracks, all these evolutionary tracks come together. For all of these evolved

stellar models, the central parts of the star have become dense enough to treat these parts independently

from the stellar envelope (and thus from the total mass). Stars with a different mass but a similar core mass

Mcore will display the same luminosity and will occupy the same position in the HR diagram.

When MSC is reached along the RGB, the helium core will quickly start to contract, and the upper

stellar layers will expand. The star reaches the tip of the red giant branch in point F. Numerical calculations

demonstrate that the temperature in the core rises strongly (virial theorem!), but not in the very inner part of

the degenerate core. This phenomenon of an off-centre temperature increase has to do with the occurrence

of a temperature inversion in the inner core, due to a neutrino cooling process. Indeed, the contracting core

also implies a contraction of the hydrogen-burning shell that surrounds it and the core mass thus grows

efficiently in this fast contraction phase. In the inner stellar core, the plasmon neutrino cooling process

sets in at that stage, which implies that energetic photons can decay into two neutrinos. The accompanying

neutrino production gives rise to an energy flux that can easily escape the star. This neutrino cooling thus

gives rise to an energy loss from the inner stellar core. The efficiency of the production of plasmon neutrinos

increases with density, and so the temperature profile of the star is inverted locally, in the sense that the core

becomes cooler in the centre than in its surrounding layers. Thus, the temperature for helium ignition is first

reached in a shell surrounding the very inner core.

Typical values for the density and temperature in the off-centre core regions are 10^6 g cm^-3 and 10^8 K

(see Figure 10.14). That way the temperature necessary for the triple alpha reaction is reached, starting the

helium burning off-centre. This takes place when Mcore ≈ 0.47M⊙, quite independently of the value of M

on the ZAMS. But the stellar matter in the core is in an advanced state of degeneracy and the helium burning

is unstable in such an environment. Indeed, the energy production by the nuclear fusion is not accompanied

by an increasing outward ion pressure force, since the electrons are mainly responsible for the outward

pressure, independently of the temperature. Thus, the nuclear energy is not used to expand but rather to

increase the temperature. Due to the large temperature dependence of the He burning, a thermal runaway

occurs, which ends the quiet evolution of the star on the RGB in point F.
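To see why the pressure hardly reacts to the temperature, one can compare the pressure of the degenerate electrons with the ideal-gas pressure of the helium ions at the typical conditions quoted above. The sketch below is a rough estimate only: it assumes completely degenerate, non-relativistic electrons (at 10^6 g cm^-3 the electrons are in fact mildly relativistic, so the electron pressure is somewhat overestimated) and a pure helium composition with µ_e = 2 and µ_ion = 4.

```python
import math

H_PLANCK = 6.626e-27   # erg s
M_ELECTRON = 9.109e-28 # g
M_U = 1.6605e-24       # atomic mass unit in g
K_BOLTZMANN = 1.381e-16  # erg/K


def electron_degeneracy_pressure(rho, mu_e=2.0):
    """Pressure (dyn cm^-2) of a completely degenerate, non-relativistic
    electron gas. It depends on the density only, not on the temperature."""
    n_e = rho / (mu_e * M_U)   # electron number density in cm^-3
    return (1.0 / 20.0) * (3.0 / math.pi) ** (2.0 / 3.0) \
        * H_PLANCK**2 / M_ELECTRON * n_e ** (5.0 / 3.0)


def ion_pressure(rho, temperature, mu_ion=4.0):
    """Ideal-gas pressure (dyn cm^-2) of the ions."""
    return rho * K_BOLTZMANN * temperature / (mu_ion * M_U)


RHO = 1.0e6    # g cm^-3, typical value quoted in the text
T_IGN = 1.0e8  # K, typical value quoted in the text

p_e = electron_degeneracy_pressure(RHO)
p_ion = ion_pressure(RHO, T_IGN)
print(f"P_electron = {p_e:.2e}, P_ion = {p_ion:.2e}, ratio = {p_e / p_ion:.1f}")

# Doubling the temperature doubles the ion pressure, but the total pressure
# changes by only a few per cent.
p_total_before = p_e + p_ion
p_total_after = p_e + ion_pressure(RHO, 2.0 * T_IGN)
print(f"Relative change in total pressure: {p_total_after / p_total_before - 1.0:.1%}")
```

In this estimate the electrons provide more than ten times the pressure of the ions, so that even a doubling of the temperature barely changes the total pressure. The nuclear energy therefore heats the gas instead of expanding it.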

12.2 The helium flash

The thermal runaway originating from the ignition of helium burning in the degenerate core region has a time

scale of the order of the thermal time scale of the helium burning in the off-centre region. The temperature

increases, while the matter does not expand nor contract (the pressure is not related to the temperature).

There is no work done and therefore there is an enormous overproduction of nuclear energy. For a few

seconds, the local luminosity l reaches a maximum of about 10^11 L⊙ (the luminosity of a whole galaxy!):

Figure 12.2: Scheme indicating the change in temperature during the helium flash. After the ignition temperature

of helium is reached in the degenerate core, the temperature increases without giving rise to an

increase in pressure or density until the degeneracy is lifted (near the dashed line). Thereafter, a stage of

stable central helium burning in a non-degenerate regime occurs. (From Kippenhahn et al. 2012)

the star is said to experience a helium flash.

Figure 12.2 shows the behaviour of the temperature as a function of the density during the flash. The

increase in temperature at a constant density first results in a removal of the degeneracy and, afterwards,

in an expansion of the region. By removing the degeneracy, the helium burning becomes stable given that

the expansion causes the temperature to reach an equilibrium value. Stable He burning is first reached in a

shell surrounding the inner core, and gradually occurs deeper and deeper in the star until the stage of stable

central helium burning occurs. Thus, after the main off-centre flash, several secondary sub-flashes occur

deeper and deeper inside the star, which each heat the interior layer below it. In the end, the nuclear energy

produced in the core is carried away gradually by an expansion and cooling after the He (sub-)flashes until

the degeneracy is completely lifted and the temperature reaches an equilibrium value appropriate for central

stable helium burning.

The path followed by the star in the HR diagram as a consequence of the helium flashes is as follows.

Just before the flash the luminosity reached a maximal value which was only delivered by the hydrogen shell

burning and the core contraction. During the helium flash the area of hydrogen shell burning becomes so

thin that the shell disappears on a time scale of ≈ 10^-3 year. The energy produced immediately after the

helium flash by the helium burning (first in a shell that gradually moves deeper in, later on centrally) is much

lower than that produced by the hydrogen shell burning before the main flash. Consequently the luminosity

will decrease substantially and the star arrives at point G in Figure 12.1 when its core helium burning occurs

in equilibrium. Once the core He burning happens in equilibrium, the hydrogen shell burning restarts in a

thin shell as well, such that He production re-occurs in the envelope.

12.3 Evolution after the helium flash

After the violent stage of the helium flash, a quieter stage of central helium burning in a non-degenerate

environment follows. The star has a luminosity of about 100 L⊙ and is again located close to its Hayashi

track. The star has now arrived on the horizontal branch (see Figure 12.3). The position of arrival of the

star on the horizontal branch depends on its core mass, its envelope mass, and its chemical composition

at that time. Differences in the positions observed thus reflect a difference in mass loss that must have

occurred before the helium flash and/or a difference in metallicity. By analogy with the zero-age main

sequence (ZAMS), this stage is called the ZAHB: “zero-age-horizontal-branch”, although this terminology

is sometimes limited to metal-poor stars as in Figure 12.3. For Population I stars, as the model in Figure 12.1,

one usually refers to this phase as the core helium burning phase (Point G to H).

For metal-poor stars, the entire ZAHB is covered by stars of about the same core mass but with a

clearly different envelope mass, if we consider the same metallicity. Stars that have suffered the largest

mass loss are located on the left side of the ZAHB, while stars with a lower mass loss occupy the ZAHB on

the right. In practice this discrimination cannot be made easily, because the different observed positions of

the stars on the horizontal branch reflect the evolution of the star during its stay on the horizontal branch,

as well as the different chemical composition. The star makes a loop to the left and back to the right due to

the growing mass of the core caused by the hydrogen shell burning while the helium burns in the core. A

different position on the horizontal branch thus reflects a combination of a different chemical composition

upon arrival on the ZAHB, a different core composition and mass, and a difference in envelope mass. This

situation is depicted separately in a schematic way in Figure 12.3.

Upon arrival on the ZAHB, the star has a homogeneous non-degenerate helium core with a mass

Mcore ≈ 0.47M⊙. The core is surrounded by a hydrogen rich envelope with a mass M − Mcore. The

total luminosity is delivered by the slow central helium burning and the hydrogen shell burning, which

started again in a thin shell after the He (sub-)flash(es). The mass of the helium core increases due to the

shell burning, while the helium burning forms a central convective CO-core within the helium core. Thus,

in fact, shell burning soon occurs in two shells, one surrounding the tiny CO core and one on top of the He

core and below the H-rich envelope. The masses of these shells will grow during the following stage. Thus,

the luminosity increases slowly during the evolution on the horizontal branch, resulting in a non-negligible

extended region in the HR diagram, with the ZAHB as a lower limit.

Cluster stars are born together and have the same initial chemical composition; they will cluster together

on the horizontal branch. For metal-rich (young) clusters, the red giants will be located on

the red (cool) side of the horizontal branch with a relatively low luminosity, because of their larger opacity

(compared to metal-poor stars). This is the reason why the region between Points G and H in Figure 12.1

is called the red clump. This phenomenon is also visible for stars with the same metallicity that do not

belong to a cluster, in particular also for the red giants in the environment of the Sun (see Figure 1.7). The

horizontal branch of the globular cluster M5 shown in Figure 10.6 looks entirely different from the red clump

for stars in our environment. The horizontal branch of M5 is located at high temperatures and luminosity

and is stretched out over a large temperature range. On one hand this is due to the lower metallicity of the

stars in M5 in comparison to the stars close to us and on the other hand this reflects the age of the cluster,

as a result of which the horizontal-branch evolution already took longer. In general we find that the more


Figure 12.3: The position of metal-poor stellar models on the horizontal branch with a similar helium core

but a different total mass and a different metallicity (indicated as XCNO in the plot). All models have

X = 0.65 in the envelope. The solid line indicates a series of models with a constant XCNO = 0.01 but

with different masses, ranging from 0.6 to 1.25M⊙. The dotted line indicates a series of models with a

constant mass 1.25M⊙ but with a varying chemical composition XCNO ranging from 10−5 to 0.01. The

dashed line on the left is the main sequence and the one on the right is the Hayashi track for 1.25M⊙. (From

Kippenhahn et al. 2012)

metal-poor the cluster, the higher the temperature and the luminosity of the horizontal-branch stars.

The theoretically determined evolutionary tracks for the horizontal branch are very hard to compare

to observations at a great level of detail due to relatively large observational errors. In any case the tracks

always start on the ZAHB and the star arrives (possibly after a few loops) near its Hayashi track at the

time when the central helium gets exhausted. The phase of core helium burning lasts about 100 to 150 Myr,

depending on whether or not CBM occurs around the convective helium-burning core. The

post-horizontal-branch evolutionary tracks are all located above the horizontal-branch tracks (see Figures 12.1

and 12.4). At the end of the core helium burning, the star is said to have arrived on the asymptotic giant


Figure 12.4: The ZAHB in the HR diagram and evolution tracks representing the HB evolution. The bold

solid line is the ZAHB for models with a helium core of a mass 0.475M⊙ and a hydrogen rich envelope

with X = 0.699, Y = 0.3 of different mass M −Mcore. The total mass is indicated for several models

(bold dots). The subsequent evolution is depicted for the three models by solid lines (slow evolution) and

by dashed lines (fast evolution). The slow evolutionary stages are those of central helium burning combined

with hydrogen shell burning (10⁷ year) on the one hand, and of hydrogen and helium shell burning on the

other hand. In between, a fast stage occurs during the transition from central helium burning to helium shell

burning. (From Kippenhahn et al. 2012)

branch (AGB, see Figure 12.5).

During the evolution on the horizontal branch the stars cross, as indicated in Figure 12.5, the instability

strip, where the Cepheids (indicated by “W”) and the RR Lyrae (RR) stars are found. These stars experience

large-amplitude radial oscillations driven by a heat mechanism similar to a Carnot cycle in thermodynamics.

Therefore they shrink and expand periodically while the spherical symmetry is preserved. For a detailed

description of the observational characteristics of pulsating stars we refer to the course Asteroseismology.

12.4 AGB stars

Up to now we only treated the evolution of a star with a mass lower than 2.3M⊙ in this chapter. For a

further description of the evolution we pick up the stars whose evolution we did not describe until the end

of their life in the previous chapter, i.e., we unite all stars born with a mass below 8M⊙. After the central

helium burning, these stars have all arrived on the asymptotic giant branch, where they undergo two shell


Figure 12.5: The evolution of three stellar models in the HR diagram. After the helium flash the stars end

up on the horizontal branch. Subsequently they evolve to the upper right and arrive at the asymptotic giant

branch. The dashed line indicates the classical instability strip, where the RR Lyrae (RR) stars and Cepheids

(W) are found. (From Kippenhahn et al. 2012)

burnings. A zoom of their Kippenhahn diagram is offered in Figure 12.6, following Figure 11.1. As of the

start of the early AGB phase, all these stars follow a common evolution towards their death. During the early

AGB phase (point H) the He burning shifts from the centre to a shell. The H-burning shell extinguishes and

the star reaches point K, which indicates the time when the second dredge-up occurs. The H-burning shell is re-ignited

some time later at point J. This is the start of the double shell-burning phase, which soon leads to thermal

pulses of the He burning shell. At this stage, the star also undergoes strong mass loss, removing the

stellar envelope in a time frame of about a million years, leaving the degenerate CO core as a cooling white

dwarf. We now discuss these events in a bit more detail.


Figure 12.6: Zoom in on the Kippenhahn diagram of the 5 M⊙ model already shown in Figure 11.1 for the end of

the core helium burning phase and the AGB. (Figure courtesy of Prof. Onno Pols, Lecture Notes on Stellar

Structure and Evolution, Radboud University Nijmegen, NL)

12.4.1 The circumstellar envelope and mass loss during the AGB

During this stage of its life, the star is subject to a considerable mass loss caused by a dust-driven stellar

wind. Even though it is poorly known when the mass loss starts and how strongly it varies during the AGB, it is

clear that it is affected by large-amplitude radial pulsations of the star driven by the H partial ionization zone

via a thermodynamical Carnot cycle (also known as the opacity mechanism). The circumstances are good

for a stellar wind to occur, given that the star meanwhile expanded so much that it has a very extensive radius

but only very limited mass in its envelope. This envelope makes the star look like a red supergiant. Stars on

the asymptotic giant branch have radii between 100 and 500R⊙ and an effective temperature between 2 200

and 3 500 K. A cartoon is presented in Figure 12.7.

AGB stars thus consist of a small hot CO core that is strongly gravitationally bound and a very large cool

“loose” envelope of which the outer layers are only very weakly gravitationally bound. Because of this,

the AGB stars can easily undergo a substantial mass loss. Due to the pulsations, an extensive circumstellar


Figure 12.7: Cartoon of an AGB star during its thermally pulsing phase. The CO core is degenerate and

very compact, and is surrounded by two burning shells (indicated in red). The convective envelope is very

extensive in size and only loosely bound to the interior. It experiences a strong stellar wind, creating a

circumstellar (CS) envelope. (Figure courtesy of Prof. Onno Pols, Lecture Notes on Stellar Structure and

Evolution, Radboud University Nijmegen, NL)

envelope of gas and dust forms. This envelope can be considered as the third part of the star. The temper-

ature of the star decreases typically from 3 500 K in the envelope to only some 10 K on the outside of the

circumstellar envelope. At such a low temperature, (complex) molecules (among which the OH and the CO

molecule) and dust grains form. The latter determine not only the spectral characteristics of the AGB star

in the infra-red, but also the further evolution of the star. Indeed, the dust in the envelope is very efficient in

absorbing the stellar light due to its large opacity. This environment succeeds in transforming the absorbed


light into outward motion of the molecules. This dust-driven slow wind has an outflow velocity, also named

terminal wind speed, of typically v∞ ∼ 15 km s⁻¹.

The theory of a dust-driven stellar wind is poorly developed, given the mathematical, physical and

chemical complexities at play in such a cool low-density environment, which implies non-equilibrium chem-

istry. The basic idea is that a fraction of the stellar radiation is converted into momentum of the dust particles

and that these then move away from the star. Consequently, one makes the reasonable assumptions that the

mass loss is proportional to the luminosity of the star and inversely proportional to the escape velocity. To

predict the mass loss, an empirical correlation is used, calibrated by observations of mass loss in the infra-

red. This formalism was suggested for the first time by Reimers, hence the term Reimers wind. The mass

loss caused in this way, amounts to a good approximation to

$$\dot{M}_{\rm Reimers} = 10^{-13}\, \frac{L}{L_\odot}\, \frac{R}{R_\odot}\, \frac{M_\odot}{M}\ M_\odot\,{\rm yr}^{-1}. \qquad (12.2)$$

Here the convention is used that the mass loss is a positive quantity expressed in solar masses per year.

Typical values range from 10⁻⁷ M⊙ yr⁻¹ to 10⁻⁴ M⊙ yr⁻¹. The dust-driven wind thus relies on a mechanism

that can convert the stellar radiation into wind momentum in an efficient way, thanks to the absorption

capacities of the dust particles in the cool circumstellar envelope.
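
As a minimal sketch of how Eq. (12.2) is evaluated in practice, the following Python function returns the Reimers mass-loss rate for a star given in solar units. The function name and the example values of L, R and M are assumptions chosen for illustration; they do not come from a specific stellar model in these notes.

```python
import math


def reimers_mass_loss(L, R, M, prefactor=1e-13):
    """Mass-loss rate in Msun/yr following Eq. (12.2).

    L, R and M are the luminosity, radius and mass in solar units.
    The rate is returned as a positive quantity, as in the text.
    """
    if L <= 0 or R <= 0 or M <= 0:
        raise ValueError("L, R and M must be positive")
    return prefactor * L * R / M


# Illustrative (assumed) AGB parameters, not taken from a specific model
mdot = reimers_mass_loss(L=5000.0, R=300.0, M=1.0)
print(f"Reimers mass-loss rate: {mdot:.2e} Msun/yr")
```

For these assumed values the rate is about 1.5 × 10⁻⁷ M⊙ yr⁻¹, at the lower end of the typical range quoted above. Because the rate scales as L R / M, it grows as the star climbs the AGB and sheds mass.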

Meanwhile different alternative formulations to the Reimers treatment for a dust-driven stellar wind

have been proposed. Moreover the Reimers approach was deduced for Population I stars and adaptations are

needed for metal-poor stars. At present, the physical mechanism behind a Reimers wind

is insufficiently understood to include it in stellar evolution calculations in a consistent way, i.e., by abandoning

conservation of mass and replacing it by a proper dust-driven wind theory. On the other hand, we cannot

neglect the mass loss on the AGB to treat the stellar evolution properly. One then proceeds as follows: for

two subsequent time steps along the AGB track, the Reimers formula (or a variant) is computed. In that way

the total mass loss between the two time steps is computed (assuming a constant mass loss), and this amount

of mass is subtracted from the envelope mass the star had during the previous step. Thus one still assumes

that hydrostatic equilibrium and conservation of mass apply for each time step that the structure models

are computed, but each time the envelope mass is decreased. As the AGB stage is very short compared to the

total life span (typically only about one million years versus billions of years), it is assumed that this mode

of operation is quite a good approximation to predict the further life cycle of the star.
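
The stepwise procedure described above can be sketched as follows. This is only an illustration under strong assumptions: the luminosity and radius are held fixed at assumed values, whereas a real evolution code would recompute the hydrostatic structure (and hence L and R) at every time step. The function name, the time step and the input values are hypothetical.

```python
def strip_envelope(m_total, m_core, L, R, dt_yr, n_steps, prefactor=1e-13):
    """Return the list of total masses (Msun) after each time step.

    At every step the Reimers rate of Eq. (12.2) is evaluated, assumed
    constant over the step, and the lost mass is taken from the envelope.
    L and R (solar units) are kept fixed here, which is an assumption.
    """
    masses = [m_total]
    for _ in range(n_steps):
        m_env = m_total - m_core
        if m_env <= 0.0:
            break  # envelope gone: only the core is left
        mdot = prefactor * L * R / m_total   # Msun/yr
        dm = min(mdot * dt_yr, m_env)        # never remove more than the envelope
        m_total -= dm
        masses.append(m_total)
    return masses


# Assumed example: 1 Msun star with a 0.6 Msun core, followed for 1 Myr
masses = strip_envelope(m_total=1.0, m_core=0.6, L=5000.0, R=300.0,
                        dt_yr=1.0e4, n_steps=100)
print(f"mass after {len(masses) - 1} steps: {masses[-1]:.3f} Msun")
```

With these fixed, rather modest values of L and R only part of the envelope is removed in a million years; the rate increases strongly once L and R grow along the AGB.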

Over the last decades, accurate observations of bright AGB stars in the infra-red, with a high spatial and

spectral resolution, have become available. These observations revealed that the empirically determined

formalisms of a dust-driven wind fail to explain the details of the outflow. Incorporating a better approach

to describe the mass loss, especially its time dependence, requires the full coupling between the dust-driven

wind mechanism and the dynamical stellar pulsations. This requires taking into account the details of the

radiative energy transport through the low-density cool circumstellar envelope, where molecules and dust

grains of various sorts are present. Moreover the assumption of conservation of mass needs to be abandoned,

and this basic equation needs to be replaced by a good wind description, abandoning the stationary boundary

conditions to solve the differential equations of stellar structure. The uncertainties in the physics of such a

complex non-equilibrium chemical system are still severe. Hence, one opts for the simple description in terms

of the Reimers approach. The improvement of this aspect of stellar evolution for low- and intermediate-mass

stars is an active research domain in stellar astrophysics, in which members of the Institute of Astronomy of


the KU Leuven are active.

12.4.2 Thermal pulses, Hot Bottom Burning and the 3rd dredge-up

The two shells that burn and deliver energy during the AGB are separated by a helium layer. The outer

shell, at the bottom of the hydrogen envelope, burns hydrogen and increases the mass of the helium layer.

When the inner shell, located on top of the CO core, is hot enough it burns helium resulting in a heavier CO

core and a decrease in mass of the helium layer. In principle, both shell burnings could constantly go on

simultaneously in a stable way, provided that both shells grow outwards at the same pace. This does not happen

in reality because of the large difference in the conditions (temperature!) in which hydrogen and helium

burning occur. Consequently both shells produce energy in a cyclic manner and the mass of the helium

intershell changes in a quasi-periodic way. During most part of these cycles the hydrogen shell burns, while

the inner helium shell is not hot enough for burning. Consequently the mass of the intershell consisting of

helium increases. When there is no energy source below this helium layer, it will start to shrink. Because of

this it heats up, until its bottom is hot enough to ignite the helium burning. However the helium burning in

this thin shell is not stable as the material at that position already has a high degree of electron degeneracy. A

thermal runaway, termed a helium shell flash, occurs, resulting in a typical energy production of about 10⁸ L⊙

and lasting a century. This energy is absorbed by the layers lying on top, so these expand and cool down.

As these layers contain the hydrogen shell burning layer, the latter comes to a halt. During a short period

of time the helium shell burning makes the CO core heavier and gives rise to an intershell convective zone

(ICZ) in the helium layer that is needed to efficiently remove the energy produced. During this helium

shell burning the shell moves outwards and constantly comes closer to the area where the hydrogen burning

stopped. Due to the outward heating the hydrogen shell burning layer ignites again. Because the hydrogen

burning is much less sensitive to the temperature, and provides more energy, the hydrogen can burn again in

a stable way. In the mean time the helium burning stops, because this shell has cooled too much due to its

outward displacement. Hence a new cycle begins.

In that manner, periodic cycles of thermal pulses arise. These occur about every 10³ − 10⁵ years

depending on the mass of the star. The star has arrived on the TP-AGB: “thermally pulsing AGB”. As the

stellar evolution progresses, the thermal pulses gain in strength, while the time span between them decreases.

The number of thermal pulses that occur depends on the initial mass, the metallicity and the mass loss that

the star experiences in this stage. The pulses continue as long as there is enough hydrogen in the outer

envelope to restart the hydrogen burning.

The luminosity and the surface temperature can change substantially with every thermal pulse. The

difference is more pronounced the less mass there is above the two burning shells. Because of the large

changes in luminosity and temperature the star makes fierce movements in the HR diagram. In Figure 12.9

we show the evolutionary tracks of a star with 0.6M⊙ that experiences 11 pulses. Nowadays clues are found

using interferometric observations in the infra-red that the mass loss of some of the TP-AGB stars indeed

changes cyclically, with periods that are compatible with the occurrence of thermal pulses (but also with

shorter timescales of a few hundreds of years that are not yet fully understood).

The helium shell burning that starts during the thermal pulse transforms 4He into 12C and 16O. Through


Figure 12.8: The region around the two burning shells inside an AGB star experiencing two thermal pulses is

shown. Convective regions are shown in grey, where ICZ denotes intershell convection zones caused by the

He-shell flash. The H-exhausted intershell region is indicated by the thin red solid line and the He-exhausted

core mass by the dashed red line. Thick red lines indicate when nuclear burning is active in these shells. The

hatched blue region indicates a shell (“pocket”) rich in 13C formed at the interface of the H-rich envelope

and the C-rich intershell region, following an episode of 3rd dredge-up. (Figure courtesy of Prof. Onno

Pols, Lecture Notes on Stellar Structure and Evolution, Radboud University Nijmegen, NL)

the energy production of this burning a convection zone arises in the intershell in-between the two burning

shells. This convection layer brings the 12C and 16O isotopes to the hydrogen shell burning zone. With each

following cycle of hydrogen shell burning starting when the bottom of the hydrogen layer is hot enough due

to the moving burning helium shell, the 12C and 16O isotopes are transformed into 14N through CNO-type

hydrogen burning. In this phase of stellar evolution, this hydrogen burning is called hot-bottom-burning

(HBB). HBB thus prevents AGB stars from becoming carbon-rich at their surface.

The luminosity of the AGB star is entirely determined by its core mass and is independent of the mass

of the envelope. To a good approximation, we have

$$\frac{L}{L_\odot} = 6 \times 10^{4} \left( \frac{M_{\rm core}}{M_\odot} - \frac{1}{2} \right). \qquad (12.3)$$

In order to remove the luminosity efficiently, the star has a large convective envelope, as is always the case


above a layer with shell burning. The outer convective zone reaches deep enough to transport the products

of the hydrogen and helium shell burning to the stellar surface in the case of AGB stars with a relatively high

birth mass, i.e., M ≳ 4 M⊙. In this way, these stars experience cycles of a so-called third dredge-up. Stars

with M ≲ 4 M⊙ never climb high enough on the AGB to experience these 3rd dredge-up episodes due to

their relatively low core mass, which implies that they have no need for such an extensive outer convective

envelope. They will thus not experience efficient 3rd dredge-ups and the products of nucleosynthesis of

these low-mass AGB stars formed during the two shell burnings will not be revealed at their surface.
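
The core mass-luminosity relation of Eq. (12.3) is simple enough to evaluate directly. The sketch below, with hypothetical function names, computes the AGB luminosity for a given core mass and inverts the relation; it assumes the approximation only makes sense for core masses above 0.5 M⊙, where the right-hand side is positive.

```python
import math


def agb_luminosity(m_core):
    """Luminosity in Lsun from the core mass in Msun, following Eq. (12.3)."""
    if m_core <= 0.5:
        raise ValueError("Eq. (12.3) gives a positive luminosity only for m_core > 0.5")
    return 6.0e4 * (m_core - 0.5)


def core_mass_from_luminosity(L):
    """Inverse of Eq. (12.3): core mass in Msun for a luminosity in Lsun."""
    return 0.5 + L / 6.0e4


for m_core in (0.6, 0.8, 1.1):
    print(f"M_core = {m_core:.1f} Msun -> L = {agb_luminosity(m_core):.0f} Lsun")
```

A core of 0.6 M⊙ corresponds to about 6000 L⊙ in this approximation, independent of how much envelope mass is left.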

12.4.3 The s-process in AGB stars

In between two thermal pulses, 14N is produced through the hydrogen shell burning, and the convective layer

between the two shells transports these 14N isotopes to the helium burning shell during the next pulse. In that

region, the following chain of reactions takes place: 14N(α, γ)18F(β⁺ν)18O(α, γ)22Ne. For a pulse in a rather

massive AGB star the temperature reaches a value high enough to also burn the 22Ne isotopes in the reaction 22Ne(α, n)25Mg. This reaction accomplishes the production of a neutron. Another, much more efficient

neutron source has been mentioned in the discussion of the carbon burning, but to make this active, sufficient 13C isotopes need to be brought to the helium burning shell. It concerns 12C(p, γ)13N(e⁺ν)13C(α, n)16O.

The latter reaction is much faster than 22Ne(α, n)25Mg but it does require a proton concentration of about

10⁻⁴. This can be met when hydrogen-rich material is transported to 12C-rich areas during the pulses, as a

result of which a 13C pocket can be formed (see Figure 12.8).

The neutron sources mentioned above can be strong enough to form elements beyond the iron peak

through the s-process. Neutron capture has been described in the discussion of the r-process. In the s-

process, neutron capture continues until too many neutrons have been captured and the isotope gets too far

outside the stability valley in the (N,Z) domain. The nucleus then experiences a β⁻ decay:

$$(Z, A) \rightarrow (Z+1, A) + e^{-} + \bar{\nu}_e. \qquad (12.4)$$

The path of the s-process is located along the neutron-rich border of the stability valley in the (N,Z) diagram, but less deep than for the r-process (see Figure 11.6). The name “s-process” refers to the fact

that the neutron capture is slow compared to the β⁻ decay, in contrast to the case of the r-process. The

s-process takes place for neutron number densities of the order of 10⁸ − 10¹² cm⁻³ and strongly depends on

the metallicity of the star.

Due to the neutron sources discussed above, the s-process takes place in AGB stars that experience

thermal pulses. Typical s-process elements are on one hand those of the light s-process elements group (i.e.,

the strontium peak). These are the elements with N = 50: Strontium (Sr, Z = 38), Yttrium (Y, Z = 39),

Zirconium (Zr, Z = 40) and Technetium (Tc, Z = 43). On the other hand there are the numerous heavy

s-process elements of the barium peak with N = 82, of which barium (Ba, Z = 56) is the main example. The

s-process product Tc is important as its isotope decays already after 10⁵ year. This implies that the detection

of Tc spectral lines in the spectrum of a star is the most direct diagnostic to prove the AGB character of a

star, since Tc can only be formed from the s-process in the AGB phase. The precise details of the transport

of s-process elements to the surface of AGB stars by dredge-up are not well known. An efficient episode of


Figure 12.9: The evolutionary track after central helium burning of a star with core mass 0.6M⊙ and

chemical composition X = 0.749, Y = 0.25. The track increases along the AGB until thermal pulses

(indicated by bold dots) occur. The change in position in the HR diagram during a pulse is shown for pulses

9 and 10. Before the last pulse, the track has reached the domain of the white dwarfs. The main sequence,

the horizontal branch and the line of constant radius for white dwarfs are also depicted. (From Kippenhahn

et al. 2012)

cyclic 3rd dredge-ups only occurs for the more massive AGB stars because of its strong dependence on the

precise extent and location of the outer convective layer.
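
The argument that Tc must have been produced recently can be made quantitative with a simple exponential decay law. The sketch below assumes a half-life of 2.1 × 10⁵ years, the approximate value for the isotope 99Tc; this number and the function name are assumptions added for illustration and are not quoted in the text above.

```python
def surviving_fraction(t_yr, half_life_yr=2.1e5):
    """Fraction of an initial amount of a radioactive isotope left after t_yr years.

    The default half-life is an assumed value, roughly that of 99Tc.
    """
    return 0.5 ** (t_yr / half_life_yr)


for t in (1.0e5, 1.0e6, 1.0e7):
    print(f"t = {t:.0e} yr -> surviving fraction = {surviving_fraction(t):.2e}")
```

After ten million years essentially nothing is left, while the preceding evolution of the star lasted far longer. Any Tc seen at the surface must therefore have been synthesized and dredged up during the current AGB phase.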


12.5 Post-AGB stars

As mentioned above, the number of thermal pulses experienced by an AGB star during the TP-AGB stage

depends on the birth and core mass, the mass loss and the metallicity of the star. The stars keep climbing the

AGB as their CO core becomes more massive – see Eq. (12.3). Meanwhile the stars lose a large part of their

envelope through the coupling between the pulsations and the dust-driven stellar wind. The thermal pulses

keep returning as long as the hydrogen layer has a mass of ≳ 0.1 M⊙. When it becomes less massive, the

hydrogen burning can no longer continue and consequently the helium layer and the stellar core no longer

grow in mass. A last thermal pulse occurs when the helium layer contracts one last time until it is hot enough

to start the last cycle of helium shell burning.

When the mass of the hydrogen envelope is reduced to ≲ 0.03 M⊙, the pulsations also come to an

end, and as a result the mass loss decreases quickly and stops. The effective temperature of the star starts

to increase when the mass loss stops. This is caused by the disappearance of the outer envelope and the

appearance of the hotter inner layers surrounding the shrinking stellar core. The star leaves the AGB and

starts its post-AGB stage. This takes only about 10 000 years. During this stage the luminosity of the star

remains nearly constant, as it is only determined by the core mass – see Eq. (12.3). The effective temperature,

on the contrary, keeps rising due to the contraction of the core on the one hand and the visibility of hotter

inner regions on the other hand.

The last thermal pulse can still occur during the post-AGB stage, and even somewhat later during the

hot part of the cooling track of white dwarfs. This is due to the short post-AGB phase, while the contraction

of the core still continues. As a result the helium burning can start one last time. Such a last thermal

pulse takes place in about 25% of the post-AGB stars. In that case the star returns fast to the AGB as it

experiences again shell burning. This is called the born-again scenario, where the star very quickly crosses

the HR diagram. Subsequently it returns to the white-dwarf stage in a time interval of typically 200 years.

Depending on the core mass, the star experiences a radiation-driven stellar wind (see next chapter) and

becomes a hydrogen-deficient helium burning star consisting of a CO core surrounded by surface layers

enriched with helium, carbon and oxygen.

Even though the post-AGB stage is short, it is extremely useful for observational tests about the 3rd

dredge-up and the associated mixing in the intershell and outer convection zone. With the pulsations, the

dust envelope also disappears and consequently the products of nucleosynthesis on the surface of the star

can be studied. The chemical analysis of post-AGB stars on the basis of high-resolution spectroscopy is an

active domain in stellar astrophysics in which members of the Institute of Astronomy play a leading role.

This subject receives ample treatment in the Master course Stellar Atmospheres in Leuven.

As the effective temperature rises to a value of about 30 000 K, the circumstellar matter becomes

ionized. The star has become a planetary nebula. Not every post-AGB star becomes a planetary nebula as in

some cases the circumstellar dust shell is already too far away from the star before the effective temperature

surpasses 30 000 K and/or it contains too little mass. The thermal pulses experienced by the AGB star are an

envelope phenomenon which does not affect the CO core. The latter has the characteristics of a white dwarf.

From the fact that there are far more white dwarfs than planetary nebulae, we deduce that the planetary-

nebula stage needs to be much shorter, even when taking into account that not every post-AGB star lights


up as a planetary nebula. The planetary-nebula stage takes about 10⁵ year.

At the end of the post-AGB phase, the mass of the CO core is between 0.6 and 1.1M⊙, the higher

masses resulting from stars with a birth mass between 6 and 8M⊙. Because there are many more stars born

with a low mass than with a high mass, we expect the masses of the CO cores to peak around 0.6M⊙. This

is indeed in accordance with the mass distribution observed for white dwarfs.

12.6 White dwarfs

As mentioned in the previous chapter, stars with an intermediate mass of 2.3 M⊙ ≲ M ≲ 8 M⊙ finally

develop a degenerate CO core after the stage of helium burning. The precise mass of this core depends on

the (up to now not yet fully understood) mechanism of mass loss on the AGB. When the core mass of the

post-AGB star, with an initial mass of M ≲ 8 M⊙, is below the Chandrasekhar limit, a fully degenerate star

is left at the end of the evolution: a white dwarf is born after the post-AGB stage. The study of star clusters

confirms that stars with an initial mass ≳ 6 M⊙ can indeed end up as white dwarfs. A number of white

dwarfs has been found in star clusters with the turning point of the main sequence below stars with birth

mass 6 M⊙. In those clusters, stars with M ≲ 6 M⊙ are still located on the main sequence; consequently the

few white dwarfs have to be the end product of stars with an initial mass M ≳ 6 M⊙. These stars clearly

have lost a large amount of mass as AGB stars.

White dwarfs have sizes comparable to that of the Earth and their mass is about 3×10⁵ times larger than

that of the Earth. The white dwarfs are a homogeneous class of stellar remnants. They form a well-defined

sequence in the B−V , MV diagram. The coolest objects detected have a luminosity of about 3×10⁻⁵ L⊙.

The strong correlation between the luminosity (or MV ) and the effective temperature (or B−V ) shows that

the radii of white dwarfs need to be very similar, i.e., R ≈ 0.01R⊙. From observed values of their surface

gravity, it is deduced that the masses of single white dwarfs are very similar, with a strong peak around

M ≈ 0.6M⊙. For white dwarfs located in a binary system a much larger range in mass has been observed.
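
The link between luminosity, effective temperature and radius used in this argument is the Stefan-Boltzmann law, which in solar units reads L/L⊙ = (R/R⊙)² (Teff/Teff,⊙)⁴. The following sketch evaluates it for a white dwarf; the solar effective temperature of 5772 K and the function names are assumptions introduced for the example.

```python
import math

T_SUN = 5772.0  # K, assumed nominal solar effective temperature


def luminosity(R, teff):
    """Luminosity in Lsun for a radius in Rsun and effective temperature in K."""
    return R**2 * (teff / T_SUN) ** 4


def radius(L, teff):
    """Radius in Rsun for a luminosity in Lsun and effective temperature in K."""
    return math.sqrt(L) * (T_SUN / teff) ** 2


# A white dwarf of radius 0.01 Rsun at the cool end of the observed range
L_cool = luminosity(R=0.01, teff=4000.0)
print(f"L = {L_cool:.1e} Lsun for R = 0.01 Rsun and Teff = 4000 K")
```

For R = 0.01 R⊙ and Teff = 4 000 K this gives a few times 10⁻⁵ L⊙, consistent with the luminosity of the coolest objects quoted above. Conversely, a tight relation between L and Teff along the white-dwarf sequence implies nearly equal radii.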

The white dwarfs mainly consist of C, O, and thin outer layers of He and possibly H. The ratios depend

on the efficiency of the helium burning. In general, the more massive white dwarfs contain more carbon.

From spectroscopic observations it is deduced that the composition of the stellar atmosphere can be quite

different. The most frequent are white dwarfs with an atmosphere mainly consisting of hydrogen. These are

DA white dwarfs. 80% of the known white dwarfs are of type DA. There is also a group of white dwarfs

with atmospheres mainly consisting of helium. These are the DB white dwarfs. Their percentage is about

20%. A very small number of white dwarfs has an atmosphere with a special chemical composition and

does not belong to the two main classes. They are divided into different classes depending on the observed

spectral lines of certain chemical elements. The effective temperatures of white dwarfs cover a large interval

from 200 000 K to 4 000 K. The majority of these stars thus has a temperature higher than the Sun, therefore

the term “white” dwarf has been introduced.

In a star with a mass smaller than 1.44 M⊙, the degenerate electron gas is capable of counteracting the

enormous gravitational force. The less massive the white dwarf, the more non-degenerate matter still exists


Figure 12.10: Schematic representation of a mass-radius relation of a “classical” white dwarf structure

according to the theory of Chandrasekhar for a fully relativistic degenerate electron gas. The non-relativistic

case is shown for comparison. (Figure courtesy of Prof. Onno Pols, Lecture Notes on Stellar Structure and

Evolution, Radboud University Nijmegen, NL)

in the outer layers. As characteristic for configurations consisting of degenerate matter, the mechanical and

thermal properties are decoupled. On one hand, the mechanical structure is well described by electron pres-

sure belonging to a gas consisting of degenerate electrons. For this an expression was derived in Chapter 4.

The non-degenerate ions, which obey the ideal gas law, are on the other hand responsible for the mass of the

white dwarf.

It can be shown that white dwarfs follow a mass-radius relation, i.e. the radius of the white dwarfs only

depends on the mass and not on the temperature. Moreover, from the mass-radius relation it is deduced

that the radius is smaller as the mass is larger, i.e., the mass is inversely proportional to the volume. This

“classical white dwarf structure” is shown in Figure 12.10. On both sides of the mass interval, corrections

are needed as the classical theory derived by Chandrasekhar no longer applies. As such, the more

accurately determined Chandrasekhar limit is only 1.44 M⊙.
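As a small numerical illustration of the statement that the mass is inversely proportional to the volume, the sketch below evaluates the implied scaling R ∝ M^(−1/3). It only applies to the classical relation away from both ends of the mass interval; the function name `radius_ratio` is an illustrative choice and not part of the course material.

```python
def radius_ratio(mass_ratio: float) -> float:
    """Return R2/R1 for a given M2/M1, assuming M * V = constant with V ∝ R^3.

    This is the classical (non-relativistic) white dwarf scaling R ∝ M^(-1/3);
    it is not valid close to the Chandrasekhar limit.
    """
    if mass_ratio <= 0.0:
        raise ValueError("mass_ratio must be positive")
    return mass_ratio ** (-1.0 / 3.0)


if __name__ == "__main__":
    # Doubling the mass shrinks the radius by roughly 20% in this approximation.
    for q in (0.5, 1.0, 2.0):
        print(f"M2/M1 = {q:.1f}  ->  R2/R1 = {radius_ratio(q):.3f}")
```

In this approximation a white dwarf that is eight times more massive has half the radius.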

The thermal properties are responsible for the radiation and the further evolution of the white dwarf.

In the deep interior of the white dwarf, the matter is degenerate and the energy transport happens very

efficiently through conduction, in which the degenerate electrons rather than the photons transport the heat. In the outer

layers, the energy transport happens differently. The outer layers contain less degenerate matter and the

energy transport happens through radiation or convection. The outer layers consist of normal gas that acts

as a very effective insulating layer, causing the white dwarf to cool very gradually and slowly. Thus we

have a non-degenerate outer layer in which the temperature is much lower and that insulates the degenerate

isothermal hot core. Because of this, the white dwarf is optically faint but luminous in X-rays since those

wavelengths capture the hot core.

As no nuclear reactions take place any more, the radiation that the white dwarf emits must come from a

different energy reservoir. For white dwarfs the energy needed to account for the luminosity is obtained from

the cooling of the ions: L ∼ T . An extremely small gravitational contraction takes place due to the cooling

as only the ion pressure decreases and not the electron pressure, the latter being the most important of both

by far. Half of the gravitational energy that is released by the contraction supplies the luminosity, the other

half is used to increase the Fermi-energy of the electrons. Finally, the result of this cooling mechanism is

that the white dwarf evolves to form a “black dwarf”: the contraction stops completely and all energy is in the

form of Fermi energy at that point. The typical cooling time for a white dwarf of 1 M⊙ and L/L⊙ = 10⁻³

is 10⁹ years. The oldest observed white dwarfs have an age comparable to the age of our Milky Way.

Chapter 13

Evolution of a star with M ≳ 15 M⊙

Stars born with a mass above ∼ 15M⊙ evolve differently from those that are less massive. This is because

they are subject to a stellar wind from their birth onwards. Their continuous outflow of hot stellar gas

influences their life cycle between their birth and explosion as a supernova. An example of such a star is

shown in Figure 13.1: it is clear that we cannot properly describe the evolution of this star if we do not take

its mass loss into account. The basic assumption of conservation of mass is not justified for such stars.

13.1 The spectra of hot massive stars with mass loss

The expansion of the atmosphere and the mass loss of massive stars through a stellar wind

were mainly established after the “International Ultraviolet Explorer”, launched in 1978, intensively

observed such stars. Before the ultra-violet (UV) spectrum was available, it was assumed that conservation

of mass was not a bad approximation during the main-sequence stage. However, from the spectral lines in

the UV part of the spectrum of such stars, it became evident that they have a fast expanding atmosphere and

constantly experience mass loss already during the core hydrogen burning phase.

The continuum radiation in the stellar wind of hot stars is dominated by scattering processes, as the

densities of the wind are usually low. More specifically, it concerns the scattering of photons by free electrons,

which is wavelength-independent. In addition, photon scattering by resonance lines of ions moving in the

wind occurs efficiently. The spectral lines formed in the stellar wind can be easily distinguished from those

formed in the photosphere due to their large broadening and large displacement with regard to the rest wave-

length. In general, line-profile shapes depend on the efficiency of the creation, scattering and destruction of

photons. Spectral lines can therefore occur as absorption lines, emission lines or as a combination of both.

When an ion in a stellar wind collides with an electron, it can use this electron to recombine. The

most probable collision recombination is the one to the ground state of the ion. The ion can, however, also

recombine to an excited state and subsequently descend in the energy-level-diagram through a sequence of

Figure 13.1: Image made with the Hubble Space Telescope of the Luminous Blue Variable η Carinae. (From

hubblesite.org)

radiative de-excitations. In this case each de-excitation is accompanied by the emission of a photon. This

process therefore produces a considerable amount of photons in the stellar wind. Lines belonging to specific

electron transitions, that have a high probability to be fed by collisional recombination followed by radiative

de-excitation, can display in this way a clear surplus of radiation: these spectral lines occur in emission. The

process of photon creation is responsible for the Hα emission and the infra-red emission lines in the wind

of hot stars. Photon creation requires much larger densities than photon scattering. It therefore only takes

place in the most dense parts of the stellar wind, not far from the stellar photosphere. Photon destruction

hardly occurs in the stellar wind. Most ions occur in their ground state. After radiative excitation, the decay

time for radiative de-excitation is very short. Moreover, the particle density in the stellar wind is very low.

Consequently, the ion will not be able to experience a collisional de-excitation in time. The absorbed photon

will therefore experience photon scattering and will not be annihilated by photon destruction.

When the spectral line consists of both an absorption and an emission component, it is called a P Cygni

profile, as was first observed for the supergiant P Cygni. Most of the P Cygni profiles are formed by resonant

scattering. Examples of P Cygni profiles are shown in Figure 13.2 for the stars ζ Pup and τ Sco. Compare

these spectral lines with those of stars without substantial mass loss, as shown in Figures 1.1 and 1.4. It

is evident that the profiles of the lines in Figure 13.2 look different. The formation and interpretation of a

P Cygni profile can be understood qualitatively as follows. We consider a simple model of a spherical wind

where the velocity increases with the distance away from the star (see left panel of Figure 13.3). An observer

recognises four areas that contribute to the formation of a spectral line:

1. the STAR emitting continuum radiation with a possible photospheric absorption component at a wave-

length λ0 of the spectral line,

2. the cylinder F in front of the stellar disc. The gas in F moves towards the observer with a velocity

between v ≃ 0 and v∞,

Figure 13.2: Observed P Cygni profiles of the N V doublet (upper panels) and the O VI doublet (lower

panels) in the UV spectrum of the massive stars ζ Pup (O4 supergiant) and τ Sco (B0 dwarf). The rest

wavelength is indicated by the arrows. The doublet lines merge in one strong P Cygni profile in case of ζ Pup.

They are observed separately for τ Sco. The spectrum of τ Sco also displays many narrow photospheric

absorption lines. The profiles of both stars reach large negative velocities, pointing at matter outflow in the

direction of the observer. (From Lamers & Cassinelli 1999)

Figure 13.3: Left panel: geometry of a spherically symmetric stellar wind with increasing outward velocity.

The observer discriminates four areas STAR, F, O, H (see text for explanation). Right panel: the contribution

of the star (the continuum flux), the absorption by F and the emission by H. The P Cygni profile covers the

interval [−v∞, v∞] in velocity and is the sum of these three contributions. (From Lamers & Cassinelli 1999)

3. the cylinder O is located behind the star and is occulted by it. The gas in O moves away from the

observer, but the radiation from this area does not reach the observer.

4. the areas H around the star, which the observer would observe as a “halo” around the star if the wind

could be spatially resolved. The gas in H has negative as well as positive velocity components with

regard to the observer.

In the right panel of Figure 13.3 the contributions from the four different areas to the formation of the spectral

line are shown. The STAR provides continuum radiation with a photospheric absorption line. The area F in

front of the star scatters the photons that leave the star and consequently some of them disappear from the

line-of-sight. These photons would reach the observer when there is no stellar wind. The removal of stellar

photons by the accelerated wind particles results in a blue-shifted absorption component with a Doppler shift

between −v∞ and 0 km s⁻¹. For optically thin matter the absorption component does not reach a flux equal

to 0 as there is also scattering in the line of sight in the direction of the observer from area F. For optically

thick lines the flux can be fully blocked from the observer. The halo H scatters radiation coming from the

stellar photosphere in all directions. Part of that radiation moves in the direction of the observer. This part

induces an emission component with a Doppler shift between −v∞ and v∞, with the biggest contribution

centred at velocity 0 km s−1. The net result of all these contributions is obtained by summing them up.
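The summation of the three contributions can be mimicked with a purely schematic toy model. The sketch below is not a radiative transfer calculation: the shapes (a sine-shaped absorption trough for area F and a triangular emission hump for the halo H), the parameter values and the name `p_cygni_toy` are arbitrary assumptions chosen only to reproduce the qualitative profile.

```python
import math


def p_cygni_toy(v, v_inf=2000.0, depth=0.6, emission_peak=0.5):
    """Schematic normalised flux at line-of-sight velocity v (km/s).

    Negative v means blue-shifted (gas moving towards the observer).
    All shapes and default values are illustrative assumptions.
    """
    continuum = 1.0  # STAR: flat continuum, photospheric line ignored here

    # Area F: removes stellar photons, only between -v_inf and 0.
    absorption = 0.0
    if -v_inf <= v <= 0.0:
        absorption = depth * math.sin(math.pi * (-v) / v_inf)

    # Halo H: scatters photons towards the observer between -v_inf and +v_inf,
    # with the largest contribution centred at zero velocity.
    emission = 0.0
    if abs(v) <= v_inf:
        emission = emission_peak * (1.0 - abs(v) / v_inf)

    # The observed profile is the sum of the three contributions.
    return continuum - absorption + emission


if __name__ == "__main__":
    for v in range(-2500, 2501, 500):
        print(f"v = {v:6d} km/s   flux = {p_cygni_toy(v):.3f}")
```

Even with these crude shapes the result shows a blue-shifted trough, a red-shifted emission peak, and a return to the continuum outside [−v∞, v∞].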

The P Cygni profiles of resonance lines occur in the UV part of the spectrum for hot stars. Therefore the

conclusion of outflow by means of a stellar wind had to wait until this part of the electromagnetic spectrum

could be observed, and this requires a space mission. The detection of P Cygni profiles immediately yields

the maximum velocity occurring in the wind, denoted as v∞.

13.2 Basic characteristics of radiation-driven stellar winds

The stellar winds of massive stars are caused by a different mechanism than the dust-driven winds during the

AGB. In the case of hot stars the wind is radiation-driven, also termed line-driven. The mechanism

of such a stellar wind is well understood and can be mathematically derived. The latter will not be discussed

in this course due to the limited time. We will summarise a few of the concepts and results.

The two most important parameters that describe the stellar wind of massive stars and that can be

derived from observations are the mass-loss rate Ṁ and the terminal velocity of the stellar wind: v∞. The

terminal wind velocities of hot stars have values up to 3000 km s−1 (this is 1% of the speed of light!) and

are thus of a completely different order than the slow dust-driven winds. For hot stars with M ≳ 15 M⊙

the mass loss is important from their birth onwards as it influences the stellar evolution (think about the

mass-luminosity relation and the main-sequence life time!).

Each photon created in the core of the star by nuclear reactions has an energy hν and a momentum

hν/c. The total momentum loss the star encounters by emitting photons at its surface is given by L/c = 4πR²F/c, with F the stellar flux. The corresponding mass loss is L/c². Conversely, the loss of momentum

due to a stellar wind is given by Ṁv∞. Observationally, it is found that Ṁv∞ ≃ L/c. This means that an

efficient mechanism must be at work in the stellar wind. Apparently, this mechanism is capable of absorbing

almost all photons that leave the star and convert them into kinetic energy of the wind ions.
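The comparison between the wind momentum rate Ṁv∞ and the photon momentum rate L/c can be evaluated numerically. The sketch below is a minimal example in CGS units; the constants are rounded, the function names are illustrative, and the input values in the usage lines are hypothetical numbers, not measurements of a particular star.

```python
# Rounded CGS constants
C_LIGHT = 2.998e10   # speed of light [cm/s]
L_SUN = 3.828e33     # solar luminosity [erg/s]
M_SUN = 1.989e33     # solar mass [g]
YEAR = 3.156e7       # one year [s]


def wind_efficiency(mdot_msun_yr, v_inf_km_s, lum_lsun):
    """Ratio of the wind momentum rate (Mdot * v_inf) to the photon momentum rate (L / c)."""
    mdot = mdot_msun_yr * M_SUN / YEAR   # [g/s]
    v_inf = v_inf_km_s * 1.0e5           # [cm/s]
    lum = lum_lsun * L_SUN               # [erg/s]
    return mdot * v_inf / (lum / C_LIGHT)


def photon_mass_loss(lum_lsun):
    """Mass equivalent of the radiated energy, L / c^2, in solar masses per year."""
    return lum_lsun * L_SUN / C_LIGHT**2 * YEAR / M_SUN


if __name__ == "__main__":
    # Hypothetical input values for a luminous hot star.
    print("wind efficiency:", wind_efficiency(1.0e-6, 2000.0, 1.0e6))
    print("L/c^2 [Msun/yr]:", photon_mass_loss(1.0e6))
```

A ratio of order unity means that the wind carries a momentum comparable to that of the radiation field, which is the point made in the text.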

The gas that escapes from the star into the interstellar medium, has a certain kinetic energy. The amount

of kinetic energy that the stellar wind transmits to the interstellar medium per unit of time is ½Ṁv∞². To

know the effect of the stellar wind on the environment, we need to derive values for Ṁ and v∞. We also

need these values to evaluate the effect of the mass lost due to the wind on the evolution of the star. For a

star that experiences a stationary spherically symmetric wind, the mass loss at a certain point in the wind is

related to the density and the velocity at that point:

$$\dot{M} = 4\pi r^2 \rho(r)\, v(r), \qquad (13.1)$$

where r is the distance from the point in the wind to the centre of the star, and ρ and v are the density

and velocity of the wind at that certain point, respectively. Equation (13.1) expresses that no material is

destroyed or created in the wind and consequently it is always the same amount of gas that passes through

a sphere at a distance r from the star. This equation for a stationary wind and its accompanying mass loss

replace the equation for the conservation of mass in the system Eqs (8.1) of the stellar structure equations.
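Equation (13.1) can be inverted to obtain the wind density at any radius once the mass-loss rate and the local velocity are known. The following sketch assumes CGS units throughout; the function names are illustrative.

```python
import math


def mass_loss_rate(r, rho, v):
    """Mass-loss rate [g/s] through a sphere of radius r [cm], Eq. (13.1)."""
    return 4.0 * math.pi * r**2 * rho * v


def wind_density(mdot, r, v):
    """Density [g/cm^3] implied by Eq. (13.1) for a stationary spherical wind."""
    if r <= 0.0 or v <= 0.0:
        raise ValueError("r and v must be positive")
    return mdot / (4.0 * math.pi * r**2 * v)
```

At constant velocity the density drops as 1/r²; in the accelerating part of the wind it drops faster, because v(r) increases outwards.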

The gas that escapes from the outer stellar layers is accelerated. It has a low radial velocity, typically

of order 10 km s−1, at the stellar photosphere and is accelerated up to a high velocity at a large distance

away from the star. At a very large distance from the stellar centre, the velocity experienced by a particle in

the stellar wind approaches asymptotically the terminal wind velocity: v∞ = v(r → ∞). The distribution

of the velocity in the stellar wind as a function of the radial distance r to the stellar centre is called the

wind velocity-law v(r). The observations of stellar winds give rise to a velocity described by a so-called

βwind-law:

$$v(r) \simeq v_0 + (v_\infty - v_0)\left(1 - \frac{R}{r}\right)^{\beta_{\rm wind}}. \qquad (13.2)$$

This velocity-law describes a general increase in v with the radial distance, with v0 the velocity at the

photosphere r = R and v∞ the terminal wind velocity at large distance, with v0 ≪ v∞. The wind velocity

is thus assumed to be time-independent. The parameter βwind describes how steep the velocity-law is. Hot

stars have a velocity-law that is quite well described by βwind ≃ 0.8. Particles in these winds thus experience

a large acceleration and reach a velocity equal to 80% of the terminal velocity at a distance ≃ 4R, i.e., at

≃ 3R beyond the stellar surface.
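The statement about reaching 80% of the terminal velocity at r ≃ 4R can be checked directly from Eq. (13.2). The sketch below assumes v0 ≪ v∞ (here v0 = 0 by default) and expresses the radius in units of R; the function name is illustrative.

```python
def beta_law(r_over_R, v_inf, v0=0.0, beta=0.8):
    """Wind velocity from the beta-law, Eq. (13.2), at radius r = r_over_R * R."""
    if r_over_R < 1.0:
        raise ValueError("the law is defined for r >= R only")
    return v0 + (v_inf - v0) * (1.0 - 1.0 / r_over_R) ** beta


if __name__ == "__main__":
    # Fraction of the terminal velocity reached at a few radii for beta = 0.8.
    for x in (1.0, 1.5, 2.0, 4.0, 10.0):
        print(f"r = {x:5.1f} R   v/v_inf = {beta_law(x, 1.0):.3f}")
```

For βwind = 0.8 the fraction at r = 4R is 0.75^0.8 ≈ 0.79, in agreement with the value quoted above.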

13.3 Mass loss and terminal wind speed

We study the hydrodynamics of the stellar wind with the goal to derive an expression for the mass loss and

the terminal velocity of the wind. The assumption of hydrostatic equilibrium is no longer justified. The

equation of motion for a spherically symmetric configuration has the general form of Eq. (3.22) that we

rewrite here as:

$$\frac{1}{4\pi r^2}\frac{\partial^2 r}{\partial t^2} = -\frac{\partial P}{\partial m} - \frac{Gm}{4\pi r^4}.$$

Herein P is the total pressure: P = Pgas + Prad. The outward directed acceleration due to the radiation

pressure thus reduces the effect of the inwardly directed gravitational acceleration. We can write the balance

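of the forces on the right-hand side of the equation above differently:

$$\frac{1}{4\pi r^2}\frac{\partial^2 r}{\partial t^2} = -\frac{\partial P_{\rm gas}}{\partial m} + \frac{g_{\rm grav} + g_{\rm rad}}{4\pi r^2}, \qquad (13.3)$$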

where ggrav represents the gravitational acceleration and grad the radiative acceleration. We have to deter-

mine grad and solve the equation.

The stellar wind of massive hot stars is driven by the scattering of photons by ions. The ions are

responsible for the spectral lines and therefore the wind is called a line-driven wind. The computation of

grad hence requires a study of the radiative transport in the outer stellar atmosphere. Here, we only consider

the result (the derivation requires the calculation of a complicated integral of the radiation pressure, defined

as the radiation flux per unit of surface). In analogy to the force exerted by the gradient of the gas pressure,

we find the force exerted by the gradient of the radiation pressure at position r in the stellar wind:

$$g_{\rm rad}(r) = \frac{1}{c}\int_0^\infty \kappa_\nu(r)\, F_\nu(r)\, d\nu. \qquad (13.4)$$

Hot stars radiate most of their energy at UV wavelengths (see Figure 13.4). In this wavelength interval,

the atmospheres of such stars have numerous absorption lines. The opacity of the absorption lines is much

higher in the UV than the one of the continuum radiation. The opacity of one strong absorption line, for

example the C IV resonance line at 1550 Å, can easily be a million times higher than the opacity of electron

scattering.

The large radiation pressure experienced by the ions due to the absorption lines would not be an efficient

driving mechanism for the mass loss without the Doppler effect. In a static atmosphere with a strong line

Figure 13.4: The fraction of the stellar radiation from hot stars absorbed in the UV by the stellar wind is

indicated by the shaded parts for stars of different effective temperatures. (From Abbott, 1982, Astrophysical

Journal, Volume 259, p.282)

absorption, the photons radiated outwards will be absorbed or scattered by an ion in the atmosphere if they

have the appropriate wavelength. The ions in the outer layers see few “suitable” photons, i.e., photons with

the appropriate wavelength to make a line transition, passing by. Thus the radiative acceleration grad in the

outer layers of the atmosphere due to line absorption will be weak. When the outer atmosphere is dynamical,

however, there is a velocity gradient causing the ions in the outer layers to see the photons red-shifted by

amounts covering the velocity range [−v∞, +v∞]. Thanks to this Doppler shift, the ions can “use” many more

photons to form spectral lines. Consequently, the ions can absorb the radiation coming from the photosphere

very efficiently. This is the basis for the efficient driving mechanism causing the stellar wind in hot stars.

13.3.1 Thomson scattering in the stellar wind

First we only consider the radiative acceleration grad caused by the continuum opacity from a point source

of radiation as if all photons come from the core of the star. For the continuum radiation it concerns photons

that are scattered by free electrons in the stellar wind. It is called Thomson scattering and it is independent

of the frequency of the photon. The frequency-independent effective cross section of one electron is

$$\kappa_{\rm T} = \frac{8\pi}{3}\left(\frac{e^2}{m_e c^2}\right)^2, \qquad (13.5)$$

where me represents the mass of the electron and e its charge. The opacity caused by the electron scattering

is given by

$$\kappa_e = \kappa_{\rm T}\,\frac{n_e}{\rho}, \qquad (13.6)$$

where ne is the number of electrons per cm³. The latter depends on the mass fractions X, Y, Z and the

degree of ionization in the wind. For early-type massive stars the opacity at the base of the wind can be

approximated by relying on the mean molecular weight per electron defined by μe mu = ρ/ne and the

expression for μe in the case of fully ionised material given by Eq. (2.38). This leads to the approximation

given by κe ≈ 0.20 (1 + X) cm²/g. Thus, in the approximation that the degree of ionization is constant

throughout the whole wind, the radiative acceleration due to Thomson scattering is κeF/c, i.e., it only

depends on the position in the stellar wind. At a distance r > R in the wind, following Eq. (13.4) and the

approximation of Thomson scattering by free electrons, the radiative acceleration is therefore given by

$$g_e(r) = \frac{\kappa_e L}{4\pi r^2 c} = \frac{GM}{r^2}\,\Gamma_e, \qquad (13.7)$$

with

$$\Gamma_e \equiv \frac{\kappa_e L}{4\pi c G M}. \qquad (13.8)$$

For main-sequence stars with M ≲ 15 M⊙, Γe ≃ 0 and we can simply neglect the stellar wind caused

by the continuum radiation. For more massive main-sequence stars and hot giants and supergiants, Γe is

significantly different from zero.

We find that the radiative acceleration caused by the continuum radiation has the same r-dependence

as the gravitational acceleration. The corresponding force, however, is opposite to the gravitational force,

and will diminish the effect of gravity. Therefore, we can merge both terms in an effective gravitational

acceleration:

$$g_{\rm eff}(r) \equiv -\frac{GM}{r^2}\left[1 - \Gamma_e\right]. \qquad (13.9)$$

The radiation pressure caused by the continuum radiation can overcome gravity when Γe > 1. In practice,

however, Γe < 1. The continuum opacity alone thus cannot be the source of the stellar wind.
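The quantities of this subsection can be evaluated with a few lines of code. The sketch below works in CGS (Gaussian) units with rounded constants. It assumes that Eq. (2.38) gives μe = 2/(1 + X) for fully ionised material, which is the standard expression but is not repeated in this chapter; the function names and the example values for L and M are illustrative assumptions.

```python
import math

# Rounded CGS constants (electron charge in Gaussian units)
E_CHARGE = 4.8032e-10   # [esu]
M_E = 9.1094e-28        # electron mass [g]
M_U = 1.6605e-24        # atomic mass unit [g]
C_LIGHT = 2.998e10      # [cm/s]
G_NEWTON = 6.674e-8     # [cm^3 g^-1 s^-2]
M_SUN = 1.989e33        # [g]
L_SUN = 3.828e33        # [erg/s]


def thomson_cross_section():
    """Thomson cross section of one electron [cm^2], Eq. (13.5)."""
    r_e = E_CHARGE**2 / (M_E * C_LIGHT**2)   # classical electron radius
    return 8.0 * math.pi / 3.0 * r_e**2


def kappa_e(X):
    """Electron-scattering opacity [cm^2/g], Eq. (13.6) with n_e/rho = 1/(mu_e m_u)."""
    mu_e = 2.0 / (1.0 + X)   # assumption: fully ionised gas
    return thomson_cross_section() / (mu_e * M_U)


def gamma_e(lum_lsun, mass_msun, X=0.7):
    """Ratio of Thomson radiative acceleration to gravity, Eq. (13.8)."""
    lum = lum_lsun * L_SUN
    mass = mass_msun * M_SUN
    return kappa_e(X) * lum / (4.0 * math.pi * C_LIGHT * G_NEWTON * mass)


def g_eff(r_cm, mass_msun, gamma):
    """Effective gravitational acceleration [cm/s^2], Eq. (13.9)."""
    return -G_NEWTON * mass_msun * M_SUN / r_cm**2 * (1.0 - gamma)


if __name__ == "__main__":
    print("kappa_e / (1 + X):", kappa_e(0.0))
    print("Gamma_e for the Sun:", gamma_e(1.0, 1.0))
    # Hypothetical luminous star: L = 1e6 Lsun, M = 60 Msun
    print("Gamma_e, luminous star:", gamma_e(1.0e6, 60.0))
```

The first printed value recovers the coefficient 0.20 cm²/g quoted above, and the solar value of Γe is of order 10⁻⁵, which is why the continuum-driven contribution is negligible for low-mass main-sequence stars.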

13.3.2 LBVs, WR stars and the Eddington limit

The quantity Γe is a function of L/M . We now consider the total opacity κ resulting from all radiative

processes. We then find an upper limit for the luminosity of a star by expressing that gravity is capable of

holding the gaseous sphere together:

$$L < \frac{4\pi c G M}{\kappa}. \qquad (13.10)$$

When this condition is not met, the star cannot exist. This leads us to the critical luminosity that cannot be

exceeded, termed the Eddington luminosity:

$$L_{\rm Edd} \equiv \frac{4\pi c G M}{\kappa}. \qquad (13.11)$$

The corresponding condition for L/M is called the Eddington limit.
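To get a feeling for the numbers, Eq. (13.11) can be evaluated as follows. This is a minimal sketch in CGS units with rounded constants; the default opacity of 0.34 cm²/g corresponds to electron scattering alone with X = 0.7 and is an assumption made for the example, since the total opacity κ is generally larger.

```python
import math

# Rounded CGS constants
C_LIGHT = 2.998e10    # [cm/s]
G_NEWTON = 6.674e-8   # [cm^3 g^-1 s^-2]
M_SUN = 1.989e33      # [g]
L_SUN = 3.828e33      # [erg/s]


def eddington_luminosity(mass_msun, kappa=0.34):
    """Eddington luminosity [erg/s] from Eq. (13.11) for opacity kappa [cm^2/g]."""
    if kappa <= 0.0:
        raise ValueError("kappa must be positive")
    return 4.0 * math.pi * C_LIGHT * G_NEWTON * mass_msun * M_SUN / kappa


if __name__ == "__main__":
    for m in (1.0, 15.0, 60.0):
        ratio = eddington_luminosity(m) / L_SUN
        print(f"M = {m:5.1f} Msun   L_Edd = {ratio:.3e} Lsun")
```

With this opacity L_Edd is a few times 10⁴ L⊙ per solar mass; a larger κ lowers the limit proportionally.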

Stars can get close to their Eddington limit when they have a very large energy flux and/or when the

opacity becomes very large. In that case, they cannot be very stable and the slightest disturbance helping the

radiation to overcome gravity results in a large loss of matter. This is the case for luminous OB supergiants

and so-called Luminous Blue Variables (LBVs) and Wolf-Rayet (WR) stars. LBVs are very massive stars

that experience an unstable state in their evolution. The outwardly directed acceleration caused by the

strong radiation pressure is so large in LBVs that, with the slightest perturbation, it can overcome the

inward gravitational acceleration. Consequently an unstable state occurs, resulting in severe mass loss. The

outbursts of an LBV can last for decades and can be very irregular, with long periods of equilibrium in

between. The star η Carinae, shown in Figure 13.1, is an LBV.

A star is called a Wolf-Rayet (WR) star when a hot helium core remains after the evolution of a massive

star that lost its entire outer envelope due to an extremely strong radiation pressure. WR stars are thus the

successors of the LBVs. In the spectrum of such a WR star we find mainly emission lines caused by the

fast expanding envelope. Due to the presence of this envelope, it is hard to define the stellar photosphere.

The effective temperature of a Wolf-Rayet star is about 30 000 to 50 000 K. These stars had an original mass

above 40 M⊙ and lost so much mass during their evolution via their stellar wind that they have only about

4 M⊙ left at this stage of their evolution. The WR stars are divided into two groups: the carbon-rich

WC stars and the nitrogen-rich WN stars. These classes are subdivided in WC5 – WC9 and WN3 – WN8

depending on the presence of particular spectral lines in the observed spectrum. The WN and WC varieties

represent different evolutionary stages. WN stars evolve to WC stars as more stellar material is lost through

the stellar wind. The LBVs as well as the WR stars are located close to their Eddington limit.

The application of the mass-luminosity relationship results in an upper limit for the mass of a star in

order for the gravity to keep the gaseous sphere together. We thus find that the main sequence must have

an upper limit in mass. When we only take the continuum opacity by scattering into account, we find an

upper limit for the mass of about 150M⊙. In practice we know that stars with a much lower mass already

experience a strong mass loss. This is due to the large line opacity of massive stars.

13.3.3 A realistic description of a line-driven stellar wind: the CAK-model

The radiative acceleration caused by the spectral lines no longer is proportional to 1/r2. Consequently it

can no longer be combined with the gravitational acceleration in a simple way. We need to take into account

all the spectral lines to find an accurate description of grad, i.e., the “combined effect” of all lines has to be

regarded to determine an accurate expression for grad. This requires the determination of ionisation degrees

and excitation states of a large number of energy levels for a large number of ions. As an illustration we

show in Figure 13.5 parts of the spectrum of the B1III giant ξ1 CMa from the far ultra-violet up to and

including the visible. The depth and width of each of these spectral lines is described by its line absorption

coefficient. This coefficient depends on the temperature of the gas, the density, the pressure, the abundance

of the element, the ionisation state of the gas, the collision probabilities, the line transition probabilities,

etc. The number of spectral lines that needs to be taken into account to compute the radiative acceleration

caused by line radiation is reduced drastically (by a factor 10⁵!) when limiting to the strong resonance lines.

This is a good approximation due to the low density in the stellar wind. The astronomers Castor, Abbott &

Klein (CAK) came up with this approximation in their 1975 theory, which is still in use today and is called

the CAK-theory.

The spread of the (resonance) lines over the wavelengths is not homogeneous as shown in Figures 13.4

and 13.5. There are regions in the spectrum where they hardly occur, while they are abundant in others,

even overlapping sometimes. Once the radiative force of all individual lines is computed and tabulated,

the corresponding overall radiative acceleration, grad, can be parametrised. The aim then is to solve the

equation of motion and to derive expressions for the mass loss and the wind velocity.

The cumulative effect of an ensemble of non-overlapping absorption lines was parametrised by Castor,

Abbott & Klein through a power law of the line opacity: $N(\kappa_{\ell\nu}) \sim (\kappa_{\ell\nu})^{\alpha_{\rm CAK}-2}$ with 0 < αCAK < 1 and

κℓν the opacity of the spectral line ℓ at a frequency ν (or wavelength λ = c/ν). In this empirical law, the

power αCAK determines how optically thin (value below 0.4) or optically thick (value close to 1) the line

ℓ at frequency ν is. They found this result empirically from a list of a few hundred observed carbon lines.

They gave each line a weight according to the value νℓFν/F (see Figure 13.4), so that a higher weight is

associated with a stronger line. The proportionality constant is chosen such that κ0N(κ0) = 1 with κ0 the

opacity of the strongest line in the entire stellar spectrum. After these first empirical results and with the

increase of available computational power, many more spectral lines were used to determine grad.

Figure 13.4 shows graphically which fractions of the stellar flux in the UV are absorbed in spectral lines

to drive the stellar wind. It concerns substantial fractions of the produced flux, explaining why the line-

driven stellar wind can be such an efficient mass loss mechanism: a large part of the stellar flux is converted

into wind momentum.

The parametrisation of CAK results in:

$$g_{\rm rad}^{\rm CAK}(r) = \frac{K L}{r^2}\left(\frac{1}{\rho}\frac{dv}{dr}\right)^{\alpha_{\rm CAK}}, \qquad (13.12)$$

with K a constant and αCAK determined by the relative proportion of the number of optically thin (read:

weak) and optically thick (read: strong) absorption lines in the spectrum. With this result, one can determine

the effect of $g_{\rm rad}^{\rm CAK}$ on the motion of particles in the wind by substituting the expression for $g_{\rm rad}^{\rm CAK}$ in

the equation of motion and solving it numerically under the assumption of a stationary wind. Using the

expression for an ideal gas for the gas pressure, the solutions in the approximation of a point source of photons

read (derivation omitted here, see MSc course Stellar Atmospheres):

$$\dot{M}_{\rm CAK} = \left(\frac{\kappa_e}{4\pi c}\right)^{1/\alpha_{\rm CAK}} \frac{4\pi}{\kappa_e v_{\rm th}}\,\alpha_{\rm CAK}\,(0.32)^{1/\alpha_{\rm CAK}} \left(\frac{1-\alpha_{\rm CAK}}{GM(1-\Gamma_e)}\right)^{(1-\alpha_{\rm CAK})/\alpha_{\rm CAK}} L^{1/\alpha_{\rm CAK}} \qquad (13.13)$$

and

$$v_{\rm CAK}(r) = v_\infty \sqrt{1 - \frac{R}{r}} = \sqrt{\frac{\alpha_{\rm CAK}}{1-\alpha_{\rm CAK}}\,\frac{2(1-\Gamma_e)GM}{R}}\;\sqrt{1 - \frac{R}{r}}. \qquad (13.14)$$

In this approximation it is thus found that βwind = 1/2.

Figure 13.5: Parts of the spectrum of the star ξ1 CMa (B1III). The derivation of grad(r) requires the accurate

determination of the absorption coefficients of all spectral lines in a large wavelength range, including the

UV part of the spectrum.
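The point-source CAK velocity law of Eq. (13.14) is easy to evaluate: the terminal velocity is the effective escape velocity multiplied by √(αCAK/(1 − αCAK)). The sketch below uses CGS units with rounded constants. The function names are illustrative and the value αCAK = 0.6 in the usage lines is an assumed example, not a fitted value. The last function only expresses the luminosity scaling Ṁ ∝ L^(1/αCAK) of Eq. (13.13) at otherwise fixed parameters.

```python
import math

# Rounded CGS constants
G_NEWTON = 6.674e-8   # [cm^3 g^-1 s^-2]
M_SUN = 1.989e33      # [g]
R_SUN = 6.957e10      # [cm]


def escape_velocity(mass_msun, radius_rsun, gamma_e=0.0):
    """Effective escape velocity [cm/s], sqrt(2 (1 - Gamma_e) G M / R)."""
    return math.sqrt(2.0 * (1.0 - gamma_e) * G_NEWTON * mass_msun * M_SUN
                     / (radius_rsun * R_SUN))


def cak_terminal_velocity(mass_msun, radius_rsun, alpha, gamma_e=0.0):
    """Terminal wind velocity [cm/s] from Eq. (13.14), point-source approximation."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    return math.sqrt(alpha / (1.0 - alpha)) * escape_velocity(mass_msun, radius_rsun, gamma_e)


def cak_velocity(r_over_R, v_inf):
    """Velocity at radius r = r_over_R * R, i.e. a beta-law with beta = 1/2."""
    return v_inf * math.sqrt(1.0 - 1.0 / r_over_R)


def mdot_luminosity_scaling(lum_ratio, alpha):
    """Factor by which Mdot changes when L changes by lum_ratio, all else fixed."""
    return lum_ratio ** (1.0 / alpha)


if __name__ == "__main__":
    # Hypothetical O-type star: 40 Msun, 15 Rsun, Gamma_e = 0.3, alpha = 0.6
    v_inf = cak_terminal_velocity(40.0, 15.0, alpha=0.6, gamma_e=0.3)
    print(f"v_inf = {v_inf / 1.0e5:.0f} km/s")
    print(f"v(2R) = {cak_velocity(2.0, v_inf) / 1.0e5:.0f} km/s")
```

For αCAK = 1/2 the terminal velocity equals the effective escape velocity; larger αCAK gives faster winds and a shallower dependence of Ṁ on L.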

In practice the star is not a point source, and this needs to be corrected for, especially for wind particles

that are located close to the stellar surface. The corrections are made by redoing the calculation but this time

in the approximation that the stellar surface has the correct spherical shape. Various additional numerical

upgrades have been made to the original CAK theory to compute wind mass-loss predictions. In the left

panel of Figure 13.6 we show recent results for the mass-loss rates as a function of luminosity for O-type

stars at the metallicities of our Milky Way and the Magellanic Clouds computed with the FASTWIND code,

originally developed by Prof. Joachim Puls at Munich Observatory and currently under active development

at KU Leuven by Prof. Jon Sundqvist. The dependence of the mass loss on metallicity is obvious from

this graph. Most stellar evolution codes, including MESA, take the mass loss rate predictions by Vink et

al. (2001, A&A, Volume 369, p.574) as the standard values for the regime of OB-type stars. However, it

has lately become clear that these rates predict systematically too high values for Ṁ according to various

types of modern observations. A comparison between the latest Leuven FASTWIND mass-loss rates and those

based on the older Vink et al. recipes is shown in the right panel of Figure 13.6 and reveals a factor ∼ 3 overestimation when using the Vink prescriptions. Students of this SSE course may want to take this into

account in their MESA Labs!

The mass loss caused by line-driven stellar winds has a large effect on the evolution of stars with an

initial mass larger than about 25 M⊙, i.e., O-type dwarfs on the main sequence. These stars experience

substantial mass loss during their entire life. Stars with 15 M⊙ ≲ M ≲ 25 M⊙ also experience a stellar

wind, but it only becomes strong when leaving the main sequence. Consequently all of these stars play a

crucial role in the chemical enrichment of the interstellar medium, from their birth until their explosion as

a core-collapse supernova.

For the stars born with a mass above 25 M⊙, the mass lost in the core-hydrogen burning phase due

to the stellar wind implies a change in the Mass-Luminosity relation and, given the Mass-Radius relation

valid for the main sequence, the mass loss also affects the radius of the star. Both the mass and radius in

their turn determine the escape velocity of the star. Figure 13.7 shows the Mass-Luminosity relation and

the ratio of the terminal wind speed with respect to the escape velocity as a function of luminosity taking

into account the effects of the line-driven wind for dwarfs, giants, and supergiants. Calibrations of these

numerical computations by means of detached eclipsing double-lined binaries are hardly available, as the

catalogues of such objects typically cover masses up to about 25 M⊙.

Figure 13.6: Left: Mass-loss rates as a function of luminosity for O-type stars resulting from modern CAK-

based radiation-driven wind computations done with the FASTWIND code at KU Leuven. Right: Most

stellar evolution codes, including MESA, take the mass-loss rate predictions by Vink et al. (2001, A&A,

Volume 369, p.574). The right panel compares these with our Leuven results. Squares represent dwarfs,

bullets are for giants and diamonds for supergiants. (From Bjorklund et al., 2020, A&A, Vol. 648, id.A36,

16pp.)

Figure 13.7: Left: Mass-Luminosity relation for O-type stars undergoing a radiation-driven wind. Right:

the ratio of the terminal wind speed versus the escape velocity from the star as a function of luminosity.

(From Bjorklund et al., 2020, A&A, Vol. 648, id.A36, 16pp.)

13.4 Consequences of mass loss for stellar evolution

In Figure 13.8 we show the results of stellar model calculations during hydrogen and helium burning where the mass loss due to a line-driven stellar wind was taken into account. Several approximations are made to compute these evolutionary tracks. In principle one should make a fully self-consistent integration of the system of differential equations Eqs (8.1), in which the conservation of mass is replaced by the equation for a stationary stellar wind as in Eq. (13.1) and the equation of motion by Eq. (13.3) with g_rad given by Eq. (13.12). However, as the free parameter α_CAK is a non-integer, this implies a serious mathematical complication of the system of differential equations. Moreover, the boundary conditions at the surface need to be adapted, as hydrostatic equilibrium is no longer valid in a dynamical atmosphere where the ions experience an accelerated motion. Also recall that the energy transport equation in Eqs (8.1) relied on the assumption of hydrostatic equilibrium. For these reasons, a different road is taken for the calculation of evolutionary tracks, as was done for the AGB (only here we have a much better theory for the determination of Ṁ and v∞). The mass is simply reduced during each time step by an amount Ṁ_CAK multiplied by the duration of the time interval, and the system of Eqs (8.1) is solved. Clearly this approach does not result in complete consistency between the adopted mass loss and the evolutionary stage for which the model is computed. A fully consistent integration of the system of differential equations, including a dynamical atmosphere with an outflow, is not yet available.
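The decoupled procedure described above (reduce the mass by the mass-loss rate times the time step, then solve the structure equations for the new mass) can be sketched as follows. This is only a schematic illustration: the function `evolve_mass` and the callable `mdot_of` are hypothetical names, the mass-loss prescription is a placeholder, and the solution of Eqs (8.1) is represented by a comment rather than implemented.

```python
def evolve_mass(m_initial, mdot_of, dt_years, n_steps):
    """Explicit mass update M -> M - Mdot * dt for each time step.

    m_initial : initial mass in solar masses
    mdot_of   : hypothetical callable (mass, time) -> mass-loss rate in M_sun/yr
    dt_years  : length of one time step in years
    n_steps   : number of time steps
    Returns the list of masses, one entry per model.
    """
    masses = [m_initial]
    mass = m_initial
    for step in range(n_steps):
        time = step * dt_years
        mdot = mdot_of(mass, time)
        mass = mass - mdot * dt_years
        # A real evolution code would now solve the structure equations
        # (Eqs (8.1) in the text) for a star of this reduced mass.
        masses.append(mass)
    return masses


if __name__ == "__main__":
    # Placeholder prescription: a constant rate of 1e-6 M_sun/yr (assumed value).
    track = evolve_mass(60.0, lambda mass, time: 1.0e-6, 1.0e5, 10)
    print(track[-1])
```

The sketch makes the inconsistency mentioned in the text explicit: the rate used during a step is evaluated from the model at the start of that step, not from a wind solution coupled to the interior.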

Stars with a birth mass M > 60M⊙ experience such a strong stellar wind during their main sequence

and their hydrogen shell burning stage that their entire hydrogen envelope is blown away. After these stages,

they are left with a naked helium core. Consequently they stay in the blue part of the HR diagram. The stellar

wind of stars with 25 M⊙ ≲ M ≲ 60 M⊙ is not strong enough to blow away the entire hydrogen envelope

during the main sequence. These stars very quickly become red supergiants after the TAMS. At that stage,

the stellar wind does blow away the remaining hydrogen envelope. When the mass of the hydrogen rich

envelope has decreased through mass loss, the convective energy transport in this envelope can no longer

happen in equilibrium. Consequently, the outer envelope contracts until it is in radiative equilibrium, in

other words the radius decreases. In that way, the stars can never remain red supergiants due to their mass

loss and they have to return to the blue side of the HR diagram. In practice, red supergiants are missing

for L > 5 × 10^5 L⊙ (or in terms of absolute bolometric magnitude: M_bol < −9.5). This observed upper

limit for the distribution of stars in the HR diagram is called the Humphreys-Davidson limit, named after

its discoverers. The limit is a slanted line with a negative slope for effective temperatures between 50 000

and 10 000 K, and a horizontal line for cooler temperatures. In Figure 13.9 we show the upper part of the

HR diagram with the observed Humphreys-Davidson limit.
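The correspondence between the luminosity limit and the bolometric magnitude quoted above follows from the definition of the bolometric magnitude. A minimal sketch, assuming a solar absolute bolometric magnitude of 4.74 (a value not given in this text):

```python
import math

M_BOL_SUN = 4.74  # assumed solar absolute bolometric magnitude


def bolometric_magnitude(luminosity_lsun):
    """Absolute bolometric magnitude for a luminosity given in solar units."""
    return M_BOL_SUN - 2.5 * math.log10(luminosity_lsun)


if __name__ == "__main__":
    # Luminosity of the horizontal part of the Humphreys-Davidson limit.
    print(round(bolometric_magnitude(5.0e5), 2))  # -> -9.51
```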

Finally, stars with 15 M⊙ ≲ M ≲ 25 M⊙ do experience mass loss, but it remains so limited that they never lose their outer hydrogen envelope. Because of that they succeed in traversing to the red part of the HR diagram and proceed with their evolution from there on, as the stars discussed in the former two chapters. They end their lives as neutron stars.

As can be established from Figure 13.8, the main sequence broadens considerably compared with the one for stars with M ≲ 15 M⊙, as mass loss prolongs the stay on the main sequence. Since the mass loss decreases the mass of the star, the luminosity will also decrease (mass-luminosity relation). Massive stars that experience mass loss therefore have a lower luminosity than stars born with the same mass but without a stellar wind. The mass loss thus indeed prolongs the life span on the main sequence. In Figure 13.10 both effects, lower luminosity and lower mass, are schematically depicted for a star with an initial mass of 30 M⊙. The three evolutionary tracks represent different values of the mass loss. This mass loss is calibrated in units Ṁ = N L/c^2, where N indicates the number of resonance lines in the UV responsible for the line driving. In practice, N ≃ 100. The luminosity of the star decreases as the mass loss increases.

The mass at the end of the main-sequence stage and the length of this stage are also indicated. The effect


Figure 13.8: Evolutionary tracks of massive stars with an initial composition X = 0.73 and Z = 0.02 taking into account the mass loss as described in the text. The tracks indicated by a solid line assume the

Schwarzschild criterion for convection, use the mixing-length theory, and ignore CBM. For comparison,

parts of the tracks are indicated by a dashed line: these are based on a variant of the mixing-length theory

but otherwise have the same input physics. The shaded areas indicate the main sequence and the start of the

helium burning. The first bold dot is the time when the first triple α reaction starts, the second dot (when

present) indicates the time when the carbon burning starts. (From Maeder 2009)


Figure 13.9: The upper part of the HR diagram. Thin lines represent evolutionary tracks. The original

Humphreys-Davidson limit is indicated by a bold dot-circle line, while the dotted line represents its refinement.

The grey area contains red supergiants, which are not shown individually for reasons of clarity. (From

Fitzpatrick & Garmany, 1990, Astrophysical Journal, Volume 363, p.119)

of a longer lasting main sequence is reinforced when models with CBM are computed. Indeed, due to the

overshooting more hydrogen is brought into the core, making the main sequence last longer. Moreover,

the helium burning starts at a higher effective temperature for these stars (Figure 13.8), as they have a less

thick hydrogen envelope. Finally, the stars with M ≳ 25 M⊙ also evolve very quickly to the blue side

of the HR diagram. All of this implies that there is no analogue of the Hertzsprung gap for M ≳ 15 M⊙.

Observations confirm this, as can be seen in Figure 13.9.

Observations show that the part in between the TAMS and the start of the helium burning stage is

populated by stars with a different chemical composition at their surface. This reflects that stars moving to

the right, as well as stars moving to the left occur in the same area of the HR diagram. The massive stars

burn hydrogen through the CNO cycle and mainly produce nitrogen at the expense of carbon when they

have reached the TAMS. The mass loss additionally makes the outer hydrogen layers disappear during the

short red giant stage. Consequently the products of the nuclear reactions reach the stellar surface while the

stars move back to the blue part of the HR diagram. This explains the existence of blue

• ON stars. These are massive stars with a very high nitrogen abundance and a very low carbon abundance;

• WR stars with a high He abundance and a very low H abundance at their surface. The WN stars have

a high nitrogen abundance through the hydrogen burning via the CN cycle while WC stars have a high

carbon abundance due to the triple α reaction.


Figure 13.10: Evolutionary tracks for a star with an initial mass of 30M⊙ for different values of the mass

loss during the main-sequence stage. The mass loss is expressed as Ṁ = N L/c^2. (From Lamers &

Levesque 2017)
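To get a feeling for the calibration Ṁ = N L/c^2 used in Figure 13.10, the expression can be converted into solar masses per year. The sketch below is only an order-of-magnitude illustration: the function name, the rounded physical constants, and the example luminosity of 10^5 L⊙ are assumptions made here and are not taken from the figure.

```python
L_SUN = 3.828e26   # solar luminosity in W
M_SUN = 1.989e30   # solar mass in kg
C = 2.998e8        # speed of light in m/s
YEAR = 3.156e7     # one year in s


def mdot_msun_per_year(n_lines, luminosity_lsun):
    """Mass-loss rate N * L / c^2, converted to solar masses per year."""
    mdot_kg_per_s = n_lines * luminosity_lsun * L_SUN / C**2
    return mdot_kg_per_s * YEAR / M_SUN


if __name__ == "__main__":
    # N = 100 as quoted in the text; L = 1e5 L_sun is an assumed example value.
    print(f"{mdot_msun_per_year(100, 1.0e5):.2e} M_sun/yr")
```

For N = 1 and L = L⊙ the same expression gives the mass equivalent of the radiated solar luminosity, which provides a quick sanity check on the unit conversion.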

13.5 Example: the evolution of a star with an initial mass of 60M⊙

We show the evolution of a star with a birth mass of 60 M⊙ in Figure 13.11. The upper limit in the bottom

panel indicates the decreasing mass due to the mass loss. During the stage of central hydrogen burning

(from A to C), H is converted into He via the CN cycle. This stage only lasts 3.7 × 10^6 years and causes an

increase in He and 14N, as well as a decrease of H and 12C in the extensive convective core. Meanwhile, the

stellar mass decreases substantially through the mass loss. The latter increases from 1.4 × 10^−6 M⊙/year on

the ZAMS to 7.0 × 10^−6 M⊙/year at the TAMS.

In stage B, when the main-sequence stage has almost come to an end, the products of the hydrogen

burning through the CN-cycle in the core are transported to the surface of the star by an extensive convective

zone. The star reaches the stage where the vertical lines occur at point B in the bottom panel of Figure 13.11.

At that time the N and He abundances increase at the stellar surface while H and C abundances decrease: the

star becomes a N-rich ON star. When the hydrogen supply is completely finished in the stellar core, it starts

to contract, making the star move to the left in the HR diagram. This motion continues until the temperature

in the area around the core is high enough to induce hydrogen shell burning. This happens in point C. At

that time, the star already lost about 15M⊙ and the burning shell surrounds a core containing about 30M⊙.
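The quoted mass loss of about 15 M⊙ by point C can be checked roughly against the mass-loss rates given for the ZAMS and the TAMS. The sketch below assumes that the rate rises linearly in time between these two values over the 3.7 × 10^6 years of core-hydrogen burning; this linear interpolation is an assumption of the example, not a statement about the actual models.

```python
def mass_lost_linear(rate_start, rate_end, duration_years):
    """Mass lost (in M_sun) for a rate that varies linearly in time.

    For a linear ramp the time integral equals the mean of the two
    end-point rates multiplied by the duration.
    """
    return 0.5 * (rate_start + rate_end) * duration_years


if __name__ == "__main__":
    # Rates in M_sun/yr and duration in years, as quoted in the text.
    print(mass_lost_linear(1.4e-6, 7.0e-6, 3.7e6))  # about 15.5
```

The result is close to the 15 M⊙ mentioned in the text, so the quoted rates and duration are mutually consistent at this level of approximation.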

By starting the hydrogen burning in a shell, the outer layers expand and the star moves to the right


Figure 13.11: Top panel: the evolutionary track of a star with an initial birth mass of 60M⊙ in the

HR diagram. Bottom panel: The internal stellar structure as a function of time. The stages of different

spectral classes are indicated. In the areas with “clouds” the energy transport happens through convection.

Areas with diagonal lines indicate where the nuclear burning takes place. In the areas with the thin vertical

lines the original chemical composition changes substantially. The time-axis is divided in three parts. The

characters in each of the panels denote the specific stages in the life of the star, as discussed in the text.

(From Maeder 2009)


in the HR diagram. Shortly after that, the star becomes unstable and turns into an LBV (stages E → F). It

suffers from strong mass loss, of the order of 5 × 10^−4 M⊙/year. In total it loses about 5 M⊙ during this

LBV stage, which takes about 10 000 years. Due to the strong mass loss, the He-enriched layers appear at the

stellar surface, where the He/H mass ratio equals about 0.4. While the star is busy with its post-main-sequence

evolution and is moving to the right in the HR diagram, it loses so much mass during the LBV stage that

its expansion is stopped. In other words, it reaches the Humphreys-Davidson limit, and consequently the

envelope contracts again and the star moves back to the left in the HR diagram (after point E). This star is

now transformed into a luminous, relatively small hot star with a He-rich and N-rich photosphere.

The star becomes a WR star of type WN during the stages F → G. It continues to lose its hydrogen

envelope and receives its energy from the combination of He-burning in the core and hydrogen shell burning,

as long as there is enough hydrogen left in the outer envelope. The mass loss during the WR stage remains

substantial, about 3 × 10^−5 M⊙/year. After about 4 × 10^6 years (stage G) the outer layers are lost to such

an extent that the carbon-rich layers appear at the stellar surface. The C-rich material was formed by the

triple-α reaction and is brought to the surface through convective motions. The star now becomes a WR star

of type WC with a C-rich and He-rich envelope.

Meanwhile the star keeps contracting (G → H). The luminosity remains nearly constant until it reaches

a radius of about 0.8R⊙ and an effective temperature of about 200 000 K. However, an observer does not

“see” this temperature due to the circumstellar material. The large mass loss results in an optically thick

wind. The radiation that escapes in the direction of the observer stems from that wind, where the temperature

is about 30 000 K. The radius of the area where the wind is optically thick is about 10 R⊙. This radiation is

observed from the near UV until the IR. Observed WR stars therefore have a much lower effective temperature

and a much larger radius than those deduced from stellar evolution calculations. The optically thick

wind makes it very hard to estimate the observational parameters of the WR stars.

The stage of central He-burning (D → H) lasts about 6 × 10^5 years and is followed by a stage of C-burning.

The latter takes only 2 000 years. The next stages, burning ever heavier elements, end in less than

a year. The star will explode as a core-collapse supernova and a black hole remains. During its evolution

before the supernova explosion, the star has emitted 38M⊙ of its material and transmitted it to the interstellar

medium: 29M⊙ of hydrogen that was partially enriched with 14N, 8M⊙ of helium and 1M⊙ of carbon and

oxygen. The interaction of the emitted material with the surrounding interstellar medium often causes a WR

(ring) nebula.
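The mass budget of this example can be summarised with a few lines of arithmetic. The sketch below only re-uses the numbers quoted in the text; the pre-supernova mass is simply the difference between the birth mass and the total ejected mass, and the dictionary keys are our own labels.

```python
INITIAL_MASS = 60.0  # birth mass in M_sun

# Ejected masses in M_sun, as quoted in the text.
ejected = {
    "hydrogen (partly 14N-enriched)": 29.0,
    "helium": 8.0,
    "carbon and oxygen": 1.0,
}

total_ejected = sum(ejected.values())
pre_supernova_mass = INITIAL_MASS - total_ejected

# Mass lost during the LBV stage: rate (M_sun/yr) times duration (yr).
lbv_loss = 5.0e-4 * 1.0e4

if __name__ == "__main__":
    print(total_ejected)       # 38.0
    print(pre_supernova_mass)  # 22.0
    print(lbv_loss)
```

The LBV stage thus accounts for roughly 5 of the 38 M⊙ returned to the interstellar medium, even though it lasts only about 10 000 years.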

13.6 Black holes

Once a star with M ≳ 25 M⊙ has passed through its different burning cycles and passed the LBV and

WR stages, there is no return: the star will soon end its life as a supernova. The (still quite uncertain)

theoretical EOS options for neutron stars impose an upper limit of about 2 M⊙ for the mass of such a

remnant. For compact objects that are more massive, there is at this moment no known mechanism capable

of counteracting gravity. Such objects collapse into, what is called, a black hole. Where neutron stars

were already extreme in their density, rotation, and magnetic fields, black holes are the ultimate form of


compactness that a massive star can evolve into. By definition, the collapsed star will no longer be directly

observable by electromagnetic radiation. Only a strong gravitational field remains.

The theoretical description of a black hole, and even the whole concept of such an object, is fully

based on the theory of general relativity. This is outside the scope of this course and is studied in separate

courses throughout the BSc and MSc Physics educational tracks at KU Leuven. Nevertheless, even simple

arguments allow one to infer the bizarre situation when the radius of a star with a certain mass becomes so small

that the escape velocity approaches the speed of light. This limiting radius is known as the Schwarzschild

radius and is given by

\[ R_{\rm Sch} \approx \frac{2GM}{c^2} \simeq 3\,\frac{M}{M_\odot}\ \mathrm{km}. \tag{13.15} \]

Classical mechanics would never result in a gravitational field for r ≤ R_Sch from which photons can no

longer escape, as photons have no mass. However, the concept of the Schwarzschild radius gives an intuitive

explanation for the “blackness” of a black hole.
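The numerical coefficient of about 3 km per solar mass in Eq. (13.15) can be verified directly. A minimal sketch, with rounded physical constants and a function name chosen here for illustration:

```python
G = 6.674e-11      # gravitational constant in m^3 kg^-1 s^-2
C = 2.998e8        # speed of light in m/s
M_SUN = 1.989e30   # solar mass in kg


def schwarzschild_radius_km(mass_msun):
    """Radius 2GM/c^2 in km for a mass given in solar masses."""
    return 2.0 * G * mass_msun * M_SUN / C**2 / 1.0e3


if __name__ == "__main__":
    print(round(schwarzschild_radius_km(1.0), 2))   # -> 2.95
    print(round(schwarzschild_radius_km(10.0), 1))  # -> 29.5
```

A stellar-mass black hole of 10 M⊙ therefore has a Schwarzschild radius of only about 30 km, comparable to the size of a city.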

Despite the recent “imaging” of the supermassive black hole at the centre of the galaxy M87, stellar

black holes remain hard to detect. One approach is by detecting X-rays emitted by the matter falling into

the black hole. Another, easier, method is by detecting the motion of the visual component in a binary star,

where the other component is a black hole (see Chapter 14). In that way the existence of black holes can be

proven. Nowadays many black holes in binary stars are known. The best-known and first-found example

is Cygnus X-1. This object is an X-ray source for which the binary character reveals a massive O-type

supergiant companion. From the estimation of the mass of this component and the inclination of the orbital

plane, the mass of the invisible compact star is estimated to be 6 M⊙, which excludes a neutron star.

Meanwhile, many other such obvious examples of binary systems with an invisible yet massive compact

companion have been found, the so called “X-ray binaries”. In those objects, the invisible companions all

have a mass above the (uncertain) upper limit for a neutron star. Black holes in binary systems are discussed

in the master courses Binary Stars and High Energy Astrophysics and we briefly touch upon them in the

final chapter of this SSE course.

Summary

We summarize a few of the results we have described in this chapter up to now. Mass loss due to a radiation-

driven stellar wind has a considerable effect on the life cycle of stars with a birth mass above 15M⊙, and

consequently on that of the galaxies containing such stars. This mass loss:

1. changes the chemical composition of the stellar surface,

2. drastically changes the life span of the star during certain evolutionary stages,

3. explains the occurrence of circumstellar matter around black holes,

4. considerably changes the post-main-sequence evolutionary tracks in the HR diagram,


5. explains the lack of very luminous red supergiants,

6. explains the existence of LBVs and WR stars, which are the precursors of black holes,

7. and, last but not least, results in a considerable enrichment of heavy elements in the interstellar

medium, and thereby determines the chemical evolution of galaxies.

13.7 Chemical evolution of galaxies

13.7.1 Chemical enrichment by stellar evolution

In Figure 13.12 a schematic representation is shown of the chemical enrichment of the interstellar medium

through the mass loss of massive stars. The figure shows the masses in terms of percentage of helium and

heavier elements that are expelled by the stellar wind and during the supernova explosion of massive stars. A

division is made for helium, carbon, oxygen and elements of the silicon-iron group. For the Population I star,

we notice a clear increase in the importance of the stellar wind in terms of expelled helium for increasing

mass. Around 35M⊙, the contribution of the wind and of the explosion are about equal. With increasing

mass, the chemical enrichment of helium by the stellar wind becomes much more important than the one

during the explosion. The latter is of no significance for stars with a mass above 60M⊙. The enrichment of

elements heavier than hydrogen and helium by the stellar wind only becomes important for masses above

50M⊙, during their WC stage. Even for these stars, the enrichment during the supernova explosion is

dominant compared to the enrichment due to the stellar wind. For the metal-poor star, the wind is a lot less

strong as the line-driving is heavily dependent on metals. As revealed by Figure 13.12, the fractional mass

of helium and carbon by the dust-driven wind emitted by AGB stars is significant and less dependent on

metallicity than the line-driven wind of massive stars. The fractional masses residing in the three types of

remnants, i.e., white dwarf, neutron star or black hole, are also indicated.

13.7.2 Initial mass function

Continuous star formation results in a steady decrease of the population of massive stars. Such stars are so

short-lived on a galactic time scale that their relative number is determined by the fractional amount of gas in a

galaxy. So even if massive stars were born with a similar probability as low-mass stars, they would still

become less common as the galaxy evolves. This effect is enhanced as we know that the probability

to form massive stars is much lower than the probability to form low-mass stars.

Suppose that star formation is independent of the location in the galaxy and of its age. In that case,

the number of stars that form at a given time and in a given volume only depends on the mass. Denote the

number of stars born with a mass in the interval [M,M + dM ] as

dN = Φ(M) dM. (13.16)


Figure 13.12: Mass fractions ejected by a stellar wind and during the supernova explosion as a function of

birth mass, for two metallicities. The fractional mass left behind in the form of a white dwarf, a neutron star

or a black hole is also indicated. (From Maeder, A., 1992, A&A, Vol. 264, p.105–120)

In this expression, Φ(M) is called the birth function. It was estimated empirically based on observations of

main-sequence stars in the environment of the Sun by Salpeter in 1955. Salpeter made a histogram of the

stars according to the luminosity using the mass-luminosity relation and he assumed that the life span of the


main sequence was proportional to M/L. Meanwhile observations have improved substantially, as well as

the estimate of the life span of the main sequence by stellar evolution models. An adapted version of the

Salpeter distribution is:

\[ \Phi(M) \sim M^{-2.35 \pm 0.3}. \tag{13.17} \]

The initial mass function (IMF), ξ(M), is defined as follows. We write the amount of mass that is

found at a given time and in a given place in stars with a mass in the interval [M,M + dM ] as

M dN = ξ(M) dM. (13.18)

When we then combine Eqs (13.16), (13.17), and (13.18), we obtain for the IMF:

\[ \xi(M) \sim \left(\frac{M}{M_\odot}\right)^{-1.35 \pm 0.3}. \tag{13.19} \]

For the environment of the Sun, deviations from this approximation for low mass stars and particularly for

brown dwarfs are found. This does not come as a surprise as it is very hard to characterise the stars with

the lowest masses and brown dwarfs, and consequently there is a bias of the IMF exponent towards high

masses, i.e., it is hard to describe the IMF in terms of one single exponent.
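To illustrate what the birth function in Eq. (13.17) and the IMF in Eq. (13.19) imply, one can integrate both power laws analytically. The sketch below computes the fraction of stars, by number and by mass, that are born above a given mass. The integration limits of 0.1 and 100 M⊙ and the function names are assumptions made for this example; in particular, extending the single exponent down to 0.1 M⊙ ignores the low-mass deviations just mentioned.

```python
def power_law_integral(exponent, lower, upper):
    """Integral of M**exponent dM between lower and upper (exponent != -1)."""
    p = exponent + 1.0
    return (upper**p - lower**p) / p


def fractions_above(m_cut, alpha=2.35, m_min=0.1, m_max=100.0):
    """Number and mass fractions of stars born with M > m_cut.

    The birth function is taken as Phi(M) ~ M**(-alpha), so the mass
    per unit mass interval is xi(M) = M * Phi(M) ~ M**(1 - alpha).
    """
    number = (power_law_integral(-alpha, m_cut, m_max)
              / power_law_integral(-alpha, m_min, m_max))
    mass = (power_law_integral(1.0 - alpha, m_cut, m_max)
            / power_law_integral(1.0 - alpha, m_min, m_max))
    return number, mass


if __name__ == "__main__":
    number_fraction, mass_fraction = fractions_above(15.0)
    print(f"number fraction: {number_fraction:.4f}")
    print(f"mass fraction:   {mass_fraction:.3f}")
```

Under these assumptions, stars above 15 M⊙ make up only about one in a thousand stars by number but close to a tenth of the mass locked into stars at birth, which illustrates why they matter so much for chemical enrichment despite being rare.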

In general, the star formation rate changed quite drastically over the history of galaxies and these

changes produce variations over time in the local birth functions and in the overall IMF. Moreover, the

main-sequence lifetimes of massive stars overlap with the duration of the star formation process of lower-mass

stars. This must imply that the original IMF is different from the current one, i.e., that the assumption

of the star formation rate being constant or only slowly varying with time is not justified. The distribution

function in Eq. (13.19) hence is and remains a semi-empirical result which cannot be but of limited precision.

We presently lack a sound theory that has one simple IMF description as its solution, because of the

complexity of the evolutionary aspects that need to be taken into account. Appropriate star formation rates

should indeed include the history of the most massive stars, star bursts, the evolution of OB associations and

star clusters, along with the evolution of the field stars. All those complex evolutionary phenomena lead to

variations in the IMF from parsec scales to entire galaxies. Improvement of the IMF based on the theory of

star formation and stellar evolution is a whole branch of research by itself.

13.7.3 Global enrichment of the Universe

The fractional mass returned to the interstellar medium by a generation of stars and the number of compact

remnants resulting from that generation (white dwarfs, neutron stars, black holes) are obtained after convolving

the chemical enrichment shown as a function of the fractional mass in Figure 13.12 with the birth

function in Eq. (13.17). This hence delivers the chemical enrichment per generation. This is shown for a

generation of 1000 massive stars that have all exploded as supernovae in Figure 13.13. It is this enrichment

that is subsequently used in chemical evolution models of galaxies. The helium enrichment is mainly due

to stars with a mass below 20M⊙. Heavier elements are provided by stars with M > 20M⊙, with carbon

and nitrogen as an exception. As there are many more low-mass stars than massive stars, the major part of


Figure 13.13: Convolution of the stellar yields (as indicated in Figure 13.12) for a sample of 1000 massive

stars of solar metallicity that have exploded as supernovae with the Salpeter birth function in Eq. (13.17).

This reveals the overall chemical enrichment delivered by this one generation of massive stars. Left panel:

for non-rotating stellar models; right panel: for models with rotational mixing. The dotted areas show the

wind contributions. (From Hirschi et al., 2005, A&A, Vol. 433, p.1013–1022)

the carbon production in the Universe is provided by AGB stars. This scenario implies that stars of a later

generation are always born with a higher metallicity Z.

The winners of this whole galactic evolution are the compact remnants, as well as the brown dwarfs

and the low-mass main-sequence stars that did not have enough time yet since the Big Bang to evolve away

from the main sequence. At the end of the galactic evolution, when all available gas will be locked in these

low-mass objects, star formation stops entirely. On a galactic scale we can summarize the evolution

processes as follows:

1. the amount of available gas decreases constantly. The number of gas and dust clouds therefore decreases steadily;

2. the luminosity of the galaxy, provided by the luminosities of the individual stars, decreases, as the

relative number of massive stars strongly decreases in favour of the number of compact remnants and

low mass stars;

3. the galaxy becomes more and more metal rich.


The strongest chemical enrichment took place in the early life of the galaxy, i.e. the degree of enrichment

decreases strongly in time. Consider the Sun as an example. Its age is about 1/3 of the age of our Milky

Way and its metallicity Z is about 0.014. The metallicity of the youngest stars in our galaxy is 0.04, while

that of the oldest stars is about 0.0003. This means that Z has increased by a factor ∼50 during the first 2/3

of the life time of the galaxy, and afterwards only by a factor ∼3.
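The two enrichment factors quoted above follow directly from the three metallicities given in the text, as the following short check illustrates (the variable names are our own):

```python
Z_OLDEST = 0.0003    # metallicity of the oldest stars
Z_SUN = 0.014        # metallicity of the Sun
Z_YOUNGEST = 0.04    # metallicity of the youngest stars

# Increase of Z before the Sun formed and since the Sun formed.
early_factor = Z_SUN / Z_OLDEST
late_factor = Z_YOUNGEST / Z_SUN

if __name__ == "__main__":
    print(round(early_factor))    # -> 47
    print(round(late_factor, 1))  # -> 2.9
```

The ratios of about 47 and 2.9 correspond to the rounded factors of roughly 50 and 3 used in the text.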

It thus becomes clear that the division of stars in only three Populations, as mentioned in the first

chapter, is too simple. The division has grown historically, mainly based on the position of the stars in and

around the galactic plane. In practice, it is impossible to group the stars in a discrete division of Populations,

as we have a continuous variation in Z. There are old Population I stars that formed between Populations

I and II and there are also extreme Population II stars with a metallicity lower than the average population

II star as they originated during the earliest stages after the galaxy formation. The latter stars are nowadays

called Population III stars, although they may have already obtained some metal fraction.

In any case we conclude that the mass fraction of heavy elements is very small even in the youngest

stars. The absolute mass of heavy elements that the star is born with nowadays is, however, substantial

when we compare it to the masses of exoplanets around host stars. For the Sun, e.g., we find that its mass

of heavy elements exceeds by far those of all planets in our solar system. The gaseous planets still contain

some primordial gas, but all other bodies in the solar system consist of heavy elements that were present

in the proto-solar dust cloud from which the Sun has originated. The source of these heavy elements is

nucleosynthesis. We thus find that most atoms in our body have once belonged to stars and that most of

them experienced dramatic explosions.


PART IV: BINARY EVOLUTION


Chapter 14

Binary stars and their evolution

More than half of the observed stars belong to multiple systems. In some binary systems, stars move at

such a considerable distance that their evolution is not influenced by the companion. Such wide binaries

follow the evolutionary theory of single stars as outlined in the course so far, hence these do not require an

extra chapter. Here, we deal with systems of stars that are gravitationally bound to each other and move

around a common centre of gravity in such a way that they influence each other’s evolution. The fraction

of binarity for main-sequence stars changes according to the masses of the stars involved, as shown in

Figures 14.1, 14.2, and 14.3. The latter two figures were reproduced from the pioneering study by Sana

et al. (2012), which revealed that binary interaction dominates the evolution of O-type stars, such that one

cannot ignore binarity when it comes to understanding and modelling the evolution of the most massive

stars in the Universe. The previous chapter has to be recalled keeping this in mind. In fact, the LBV shown

in Figure 13.1 is a binary and SN1987A was one as well.

This chapter offers only a very concise overview of close binary systems and their evolution, keeping

in mind that the biennial MSc courses Binary Stars and High-Energy Astrophysics are fully dedicated to this

topic and obviously provide many more details. Compact binaries are the end products of binary evolution.

These are subject to general relativity (not treated here) including gravitational wave progenitors. As for the

origin of binaries, we do not go into details. Binary formation is mainly attributed to two processes: tidal

capture due to close encounters (e.g., in globular clusters) and fragmentation as discussed for single stars in

Chapter 9.

In a close binary, each star is distorted by the tidal force induced by its companion. More precisely,

a close binary is defined as a binary of which the distance between the components is comparable with or

only slightly larger than their own dimensions. The evolution of a star in such a close binary is different

from the evolution of an isolated star because of the proximity of the companion. Because of its presence,

the primary component cannot grow unlimitedly during evolution. As discussed in the previous chapters,

all evolved stars, once beyond the TAMS, go through giant or supergiant phases increasing their radius

drastically. The evolution of a star in a close binary will therefore change once the evolved primary comes

close to its companion. We recall from the discussion of stellar evolution theory of a single star that the most


Figure 14.1: Top panel: the y-axis shows the fraction of main-sequence primaries with at least one stellar

main-sequence companion having a mass at least 10% of the primary’s mass shown on the x-axis. Bottom

panel: the average frequency of stellar main-sequence companions having a mass at least 10% of the primary’s

mass per primary. The red cross is for pre-MS binaries. These statistics exclude sub-stellar brown

dwarf companions, as well as compact remnant companions such as white dwarfs, neutron stars, or black

holes. (From Moe, M., 2019, Memorie della Societa Astronomica Italiana, Vol.90, p.347)

massive of both components will first evolve away from the main sequence.

14.1 Observational classification of close binary stars

It has long been known that gas streams from one component to the other occur in close binaries. This is a

result of the tidal forces and in some cases even of the physical interaction of two stars in contact. Such gas

streams were observed for the first time in the spectrum of the star β Lyrae during one of its eclipses.

Mass transfer is a necessary ingredient to understand binary stars, as shown by the so called Algol

paradox. This paradox was derived from observations of the binary Algol in the 1940s. Algol is composed

of a red giant and a main-sequence star. Because the red giant is the most evolved star, it should be the most

massive one. However, measurements of the orbital radial velocity showed that the red giant is less massive


Figure 14.2: Cumulative number distributions of orbital periods (left panel) and of mass ratios (right panel)

for O-type objects. The horizontal solid line and the dark green area indicate the most probable intrinsic

number of binaries and its 1σ uncertainty, corresponding to an intrinsic binary fraction of 69±9%. The

horizontal dashed line indicates the most probable simulated number of detected binaries. (From Sana et al.,

2012, Science, Vol. 337, p. 444)

than the unevolved main-sequence star. This is the case for many similar binary systems. The fact that the

most evolved star is the less massive one stands in sharp contrast with the evolution theory in the previous

chapters. The paradox finds its solution by realising that the more evolved star has already encountered

considerable mass loss. The ejected mass is captured by its companion making the latter the most massive

one of the system. Algol and the similar semi-detached systems clearly demonstrate the occurrence of mass

transfer between binary components.

The evolution of close binaries is mainly determined by the increase of the stellar radii, because the

size of the stars determines the start of mass transfer. The episodes of the fastest increase of the radius

correspond with transitions from central to shell burning, i.e., when the core contracts and the outer layers

expand. It is thus most plausible that mass transfer in a close binary occurs during these phases.

Observers usually define the primary component, or simply the primary, as the brightest star of the pair

and give it mass M1. The other star is then called the secondary, with mass M2. As we just discussed in

the case of Algol, this primary need not be the most massive one, as the mass ratio (denoted as

q ≡ M2/M1) may already have switched during phases of mass transfer. The star undergoing the mass loss

is called the donor, while the star accreting this mass is called the gainer. Much of the discussion to follow

will be based on Newton’s theory of gravity and on Kepler’s laws. We will not derive these laws here as this


Figure 14.3: Schematic representation of the relative importance of different binary interaction processes.

All percentages are expressed in terms of the fraction of born O-type stars, including the single and binary

ones, either as the initially more massive component (the primary), or the less massive one (the secondary).

(Figure courtesy of Prof. Selma de Mink, reproduced from Sana et al., 2012, Science, Vol. 337, p. 444)

was already done in detail in BSc courses.

As is often the case in astronomy, objects are divided into different categories or classes. The case of

binary stars is no exception. In fact, there are several types of classification for binaries: based on observed

properties, on physical principles, or on the type of mass transfer. A first classification is based on

observational data:

• visual binaries: two stars in orbit around each other, far enough apart that we can see them as two stars

(by eye or by telescope). These binaries are situated relatively close to the Sun and have a relatively

wide orbit. Visual binaries are highly important for detailed mass determination.

• composite spectrum binaries: two stars in orbit around each other but generally too close to be seen

separately; their binary nature is revealed by spectral lines characteristic of two different stars (having

different or similar spectral types).

• eclipsing binaries: two stars in orbit around each other that periodically occult (or eclipse) one another


in the line-of-sight of an observer. This implies that the line-of-sight cannot be much inclined with

respect to the orbital plane. Like visual binaries, eclipsing ones are important for fundamental parameter

determination.

• ellipsoidal variables: characterised by a double-wave light curve due to a non-spherical shape of the

components. There are no eclipses, which means that the orbital plane is strongly inclined with

respect to the line-of-sight.

• astrometric binaries: the binary nature is revealed because the visible star shows wobbles in its proper

motion.

• spectroscopic binaries: the binary nature is revealed by the back and forth motion of spectral lines

in the spectrum. Spectroscopic binaries are discovered from periodic changes in radial velocity, a

quantity deduced from the spectrum. In case of a circular orbit, we can deduce from the third Kepler

law ($V \sim P^{-1/3}$) that the largest variation in the radial velocity curve is observed in close systems

with a short orbital period. That is why, for spectroscopic binaries, unlike for visual binaries, a

selection effect occurs towards close systems. This class is subdivided:

1. SB1 or single-lined spectroscopic binaries: only the spectrum of one of the components is detected.

These are very common. Indeed, because the luminosity is a quantity which can vary by orders of

magnitude from one star to another, there is a high probability that the spectrum of the brightest

component fully dominates the spectrum of the fainter component, so that only the former is detected.

2. SB2 or double-lined spectroscopic binaries: the spectra of both components are detected simultaneously.

This means that the two stars cannot be very different in effective temperature. They are

thus less common than SB1s.

14.2 The Roche model

We study a binary system consisting of two stars with respective masses M1 and M2. When both stars are

spherically symmetric, their gravitational potential is ∼ 1/r. In this case, both components move around

the centre of mass of the system according to an elliptical orbit. The orbital period is connected with the

extent of the system via the third Kepler law:

$$P = 2\pi \left[ \frac{a^3}{G(M_1 + M_2)} \right]^{1/2}, \tag{14.1}$$

where a is the semimajor axis of the orbit. When e is the orbit’s eccentricity, the separation between both

components varies from a(1−e) at periastron to a(1+e) at apastron. In case of a circular orbit, a is simply

the separation between the two stars.
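As a quick numerical illustration of Eq. (14.1), the following minimal sketch evaluates the orbital period for a given semimajor axis and pair of masses, together with the periastron and apastron separations a(1−e) and a(1+e). The function names, the choice of units (au, solar masses, days) and the example binary are assumptions made for this illustration only.

```python
import math

GM_SUN = 1.32712440018e20   # G * M_sun in m^3 s^-2
AU = 1.495978707e11         # astronomical unit in m
DAY = 86400.0               # seconds per day


def orbital_period_days(a_au, m1_msun, m2_msun):
    """Orbital period from Kepler's third law, Eq. (14.1), in days."""
    a = a_au * AU
    gm = GM_SUN * (m1_msun + m2_msun)
    return 2.0 * math.pi * math.sqrt(a ** 3 / gm) / DAY


def separation_range_au(a_au, e):
    """Separation at periastron, a(1-e), and at apastron, a(1+e)."""
    return a_au * (1.0 - e), a_au * (1.0 + e)


if __name__ == "__main__":
    # Sanity check: a test particle at 1 au around 1 M_sun takes about one year.
    print(orbital_period_days(1.0, 1.0, 0.0))
    # Hypothetical close massive binary: 10 + 6 M_sun at a = 0.1 au, e = 0.3.
    print(orbital_period_days(0.1, 10.0, 6.0))
    print(separation_range_au(0.1, 0.3))
```

The first call is only a consistency check of the constants; the second shows how short the periods of close massive binaries are for separations of a fraction of an au.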

We want to assess how far one of the stars can expand before mass transfer occurs. For simplicity,

and also because this will turn out to be a valid approximation prior to the expansion of the more massive

binary component (see below), we assume that the binary system has a circular orbit and the components

are rotating with a constant angular frequency Ωorb = 2π/Porb.


Figure 14.4: The coordinate system with origin in the stellar centre of the primary component.

We consider a coordinate system with origin in the centre of mass of the binary. In an inertial system

the velocity $\vec{v}$ of a particle with position vector $\vec{r}$ is given by

$$\vec{v} = \dot{\vec{r}} + \vec{\Omega}_{\rm orb} \times \vec{r}. \tag{14.2}$$

Here $\dot{\vec{r}} \equiv {\rm d}\vec{r}/{\rm d}t$ stands for the motion of the particle with respect to a co-rotating coordinate system, while $\vec{\Omega}_{\rm orb} \times \vec{r}$ represents the velocity of the point $\vec{r}$ with respect to the inertial system. Similarly, the acceleration

$\vec{a}$ of a particle expressed in an inertial system is provided by

$$\vec{a} = \dot{\vec{v}} + \vec{\Omega}_{\rm orb} \times \vec{v}, \tag{14.3}$$

and using Eq. (14.2) this can be transformed into

$$\vec{a} = \ddot{\vec{r}} + 2\,\vec{\Omega}_{\rm orb} \times \dot{\vec{r}} + \vec{\Omega}_{\rm orb} \times (\vec{\Omega}_{\rm orb} \times \vec{r}). \tag{14.4}$$

The second term on the right hand side of this equation is the Coriolis acceleration and the third term is the

centrifugal acceleration. Let us consider a Cartesian coordinate system (X,Y,Z) that co-rotates in such a

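way that the Z-axis coincides with the rotation axis: $\vec{\Omega}_{\rm orb} = (0, 0, \Omega_{\rm orb})$. For $\vec{r} = (X, Y, Z)$ we can deduce

the equality

$$\vec{\Omega}_{\rm orb} \times (\vec{\Omega}_{\rm orb} \times \vec{r}) = \vec{\nabla}\left[ -\frac{1}{2}\,\Omega_{\rm orb}^2 \left( X^2 + Y^2 \right) \right] \equiv \vec{\nabla}\Phi_{\Omega_{\rm orb}}. \tag{14.5}$$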
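The intermediate steps behind Eqs. (14.4) and (14.5) can be written out as follows. This is a sketch that assumes a constant angular velocity vector (as adopted above for the circular orbit), so that its time derivative vanishes; the time derivative of the velocity is taken in the co-rotating system, as in Eq. (14.3).

```latex
\begin{align*}
\vec{a} &= \dot{\vec{v}} + \vec{\Omega}_{\rm orb} \times \vec{v}
  = \frac{{\rm d}}{{\rm d}t}\left( \dot{\vec{r}} + \vec{\Omega}_{\rm orb} \times \vec{r} \right)
  + \vec{\Omega}_{\rm orb} \times \left( \dot{\vec{r}} + \vec{\Omega}_{\rm orb} \times \vec{r} \right) \\
 &= \ddot{\vec{r}} + \vec{\Omega}_{\rm orb} \times \dot{\vec{r}}
  + \vec{\Omega}_{\rm orb} \times \dot{\vec{r}}
  + \vec{\Omega}_{\rm orb} \times (\vec{\Omega}_{\rm orb} \times \vec{r}), \\[4pt]
\vec{\Omega}_{\rm orb} \times \vec{r} &= \Omega_{\rm orb}\,(-Y,\; X,\; 0), \\
\vec{\Omega}_{\rm orb} \times (\vec{\Omega}_{\rm orb} \times \vec{r})
 &= -\Omega_{\rm orb}^2\,(X,\; Y,\; 0)
  = \vec{\nabla}\left[ -\tfrac{1}{2}\,\Omega_{\rm orb}^2 \left( X^2 + Y^2 \right) \right].
\end{align*}
```

The two identical cross terms in the second line combine into the Coriolis term, and the last line shows explicitly that the centrifugal term has no component along the rotation axis.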

The centrifugal force can thus be written as the gradient of a potential $\Phi_{\Omega_{\rm orb}}$ and acts perpendicular to the

direction of the rotation axis. The acceleration $\vec{a}$ of a mass element is controlled by the forces acting upon

it (we again work per unit mass):

$$\vec{a} = -\frac{1}{\rho}\vec{\nabla}P - \vec{\nabla}\Phi_G, \tag{14.6}$$


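with $\Phi_G$ the gravitational potential due to the two stars. This way, the equation of motion for a unit mass

expressed in rotating coordinates becomes:

$$\ddot{\vec{r}} + 2\,\vec{\Omega}_{\rm orb} \times \dot{\vec{r}} = -\frac{1}{\rho}\vec{\nabla}P - \vec{\nabla}\Phi_G - \vec{\nabla}\Phi_{\Omega_{\rm orb}}. \tag{14.7}$$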

Besides this, the continuity equation

$$\frac{\partial \rho}{\partial t} = -\vec{\nabla} \cdot (\rho\,\dot{\vec{r}}) \tag{14.8}$$

and the Poisson equation

$$\nabla^2 \Phi_G = 4\pi G \rho \tag{14.9}$$

must also be fulfilled.

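We will discuss below that the tidal forces cause stars to end up in a state of co-rotation. In this case

$\dot{\vec{r}} = \vec{0}$ and the equation of motion becomes

$$\frac{1}{\rho}\vec{\nabla}P + \vec{\nabla}\Phi = \vec{0} \quad {\rm with} \quad \Phi = \Phi_G + \Phi_{\Omega_{\rm orb}}. \tag{14.10}$$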

This equation implies that surfaces of equal P and surfaces of equal Φ coincide, so it follows that P and

ρ are functions of the total potential Φ. In particular, the pressure and the density tend to zero at the

stellar surface, so the stellar surface is determined by the shape of the surface Φ = constant. We

conclude that if we want to deduce the stellar shape in a binary system, we have to determine the shape of

the equipotential surfaces Φ = constant.

Determining the general shape of the equipotential surfaces is complex. Through the Poisson equation

ΦG depends on the density distribution in each of the stars. The solution for an equipotential surface can

therefore only be obtained in a numerical way. In practice, an approximate solution is introduced following

the French physicist E. Roche (1820–1883). In the Roche approximation it is assumed that the gravitational

field of each of the stars can be approximated as if the star is not disturbed by rotation or by its companion.

In other words, we suppose that the mass of each of the stars is concentrated in the stellar centre. In this

case

$$\Phi_G = -\frac{GM_1}{|\vec{r} - \vec{r}_1|} - \frac{GM_2}{|\vec{r} - \vec{r}_2|} \tag{14.11}$$

is the solution to the Poisson equation, where the stellar centre of the first component is situated at $\vec{r}_1$ and

that of the second component at $\vec{r}_2$. This approach is good up to the level of a few percent because most

stars have a strong mass concentration towards their centre.

To deduce an explicit expression for the Roche equipotentials, it is convenient to shift to a coordinate

system with origin in the centre of the primary component (see Figure 14.4). We consider Cartesian coordinates

(x, y, z) with the z-axis coinciding with the rotation axis. The x-axis connects the two stellar centres

and points towards the secondary component. The y-axis is chosen so as to achieve a right-handed rotating

coordinate system. The coordinates of the stellar centre of the secondary component then are (a, 0, 0). The

binary’s centre of mass is located in the point with coordinates (µa, 0, 0), with µ ≡ M2/(M1 + M2). The

transformation formulae between the systems (X, Y, Z) and (x, y, z) are

$$x = X + \mu a, \quad y = Y, \quad z = Z. \tag{14.12}$$


Figure 14.5: Top panel: Sections in the orbital plane of Roche equipotentials, for a binary system with mass

ratio q = M2/M1 = 0.6. The positions of the Lagrangian points L1, ..., L5 (black dots) are indicated, as

well as that of the centre of mass (plus symbol) of the system. Bottom panel: A schematic representation

of the equipotentials as a function of distance along the axis joining the two stars, with top panel a detached

binary, middle panel a semi-detached binary, and bottom panel a contact binary. (Figure courtesy of Prof.

A. Jorissen, Université Libre de Bruxelles, B).

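In these new coordinates, the Roche potential is given by

$$\Phi = \Phi_G + \Phi_{\Omega_{\rm orb}} = -\frac{GM_1}{(x^2 + y^2 + z^2)^{1/2}} - \frac{GM_2}{((x - a)^2 + y^2 + z^2)^{1/2}} - \frac{1}{2}\,\Omega_{\rm orb}^2 \left[ (x - \mu a)^2 + y^2 \right] \tag{14.13}$$

and the Roche equipotentials are given by Φ = constant.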


Figure 14.6: Schematic representation of stable RLOF occurring near the tip of the RGB and leading to a

configuration of a subdwarf binary according to the properties as indicated. (Figure courtesy of Prof. Philipp

Podsiadlowski, Oxford University, UK)


In Figure 14.5 we show the equipotentials for z = 0, coinciding with the orbital plane. The equipotential

surfaces close to each star are almost spherically symmetric, hence the influence of the other star is

minimal there. The presence of the companion distorts the surfaces in two different ways. Because of the tidal

forces due to the gravitational field of the companion, the equipotential surface gets an elongated spheroidal

shape. The co-rotation of the star with the orbital frequency causes the spheroid to be flattened by the

centrifugal force, such that the axis of symmetry of the flattened spheroid coincides with the rotation axis.

The net effect of both of these quasi-ellipsoidal distortions is that the star has its maximum dimension along

the connecting line of the two stellar centres, while it is minimal in the direction of the rotation axis.

The further from the star, the more distorted the equipotential surfaces are, until they eventually reach

the critical potential. This contains the inner Lagrangian point L1, which is a saddle point of the potential.

In three dimensions, the equipotential surfaces are approximately axisymmetric around the connecting line

between the stellar centres. So the equipotential surface through the point L1 encloses two special volumes,

which are called the Roche lobes. Figure 14.5 reveals the three most important maxima of the potential along

the line joining the two stars, corresponding with the three Lagrangian points L1, L2, L3. These can be found by solving the following equations

$$\frac{\partial \Phi}{\partial x} = 0, \quad y = 0, \quad z = 0. \tag{14.14}$$
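The condition (14.14) has no closed-form solution, but it is straightforward to solve numerically. The sketch below does so with a plain bisection, under the following assumptions made for this illustration: lengths are in units of the separation a, and G(M1 + M2) = 1, so that the square of the orbital angular frequency equals 1 by Kepler's third law and the masses become M1 = 1 − µ and M2 = µ with µ = q/(1 + q). The function names are hypothetical, and the three stationary points are labelled by their position on the x-axis rather than by a particular numbering convention.

```python
def dphi_dx(x, mu):
    """x-derivative of the dimensionless Roche potential (14.13) on the x-axis."""
    return ((1.0 - mu) * x / abs(x) ** 3           # gravity of star 1 at x = 0
            + mu * (x - 1.0) / abs(x - 1.0) ** 3   # gravity of star 2 at x = 1
            - (x - mu))                            # centrifugal term


def bisect(func, lo, hi, tol=1e-12):
    """Plain bisection; assumes func changes sign exactly once on [lo, hi]."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def collinear_points(q, eps=1e-6):
    """Stationary points of the potential on the x-axis for mass ratio q = M2/M1.

    Returns (between the stars, beyond star 2, beyond star 1), in units of a.
    """
    mu = q / (1.0 + q)

    def func(x):
        return dphi_dx(x, mu)

    inner = bisect(func, eps, 1.0 - eps)
    beyond_secondary = bisect(func, 1.0 + eps, 3.0)
    beyond_primary = bisect(func, -3.0, -eps)
    return inner, beyond_secondary, beyond_primary


if __name__ == "__main__":
    print(collinear_points(0.6))   # mass ratio used in Figure 14.5
    print(collinear_points(1.0))   # equal masses: inner point midway between the stars
```

For equal masses the inner point lies exactly midway between the two stars, while for q < 1 it shifts towards the less massive star, in line with the asymmetric lobes drawn in Figure 14.5.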

When both stars are unevolved, each component is situated within its Roche lobe and the stars are

almost spherically symmetric. This is called a detached system. When the stars evolve, they expand. Each

star can only expand until it has filled its Roche lobe. In case of further expansion, mass transfer will occur

to the other component via L1. From this moment on, the system is semi-detached and experiences Roche

lobe overflow (RLOF). When both stars have expanded beyond their Roche lobes, a contact system is formed.

A diagram for these three cases is shown in Figure 14.5. The envelope of a contact system can co-rotate

with the orbital motion until the expansion reaches beyond L2. When the binary expands even further, it

loses mass via L2 and the mass is no longer forced to co-rotate. In this case $\dot{\vec{r}} \neq \vec{0}$ and the Roche potentials

are no longer relevant.

The tidal forces get stronger as a star fills its Roche lobe. The Roche lobe radius $R_L$ is defined in such

a way that the volume filled by the Roche lobe equals $(4\pi/3) R_L^3$. The ratio $R_L/a$, with a the orbital separation, is fully determined by the

ratio of the components’ masses. Once the star’s radius equals the radius of the Roche lobe, the tidal effects

are so strong that the stars will move in a circular orbit and will rotate synchronously. This state will be

achieved first in the outer layers.
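The dependence of the Roche lobe radius on the mass ratio is not given explicitly in these notes. A widely used fitting formula, brought in here as an outside assumption, is the one by Eggleton (1983), which expresses the ratio of the Roche lobe radius to the orbital separation as a function of the ratio of the mass of the star under consideration to the mass of its companion. The sketch below evaluates it; the function name is hypothetical, and note that the argument is not always the q = M2/M1 used in the text: for the lobe of star 1 one needs M1/M2 = 1/q.

```python
import math


def roche_lobe_radius_over_a(q_lobe):
    """Eggleton (1983) fit for R_L / a (assumed here, not taken from the notes).

    q_lobe = (mass of the star whose lobe is wanted) / (mass of its companion).
    """
    q23 = q_lobe ** (2.0 / 3.0)
    return 0.49 * q23 / (0.6 * q23 + math.log(1.0 + q_lobe ** (1.0 / 3.0)))


if __name__ == "__main__":
    q = 0.6  # q = M2/M1 as in Figure 14.5
    print("star 1:", roche_lobe_radius_over_a(1.0 / q))
    print("star 2:", roche_lobe_radius_over_a(q))
```

The more massive star has the larger lobe, and for equal masses each lobe has a volume-equivalent radius of roughly 0.38 times the separation.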

14.3 Determination of the orbital and fundamental parameters of binaries

14.3.1 Orbital elements

Binary stars are subject to the equations describing a two-body orbital motion. In general they move on

elliptical orbits, characterised by a semi-major axis a, an eccentricity e, and an orbital period Porb. It is

customary to describe the position and orientation of the orbit with respect to an inertial coordinate frame


whose z-axis points towards the observer. The orbit is then described with the three so-called Euler angles.

The first angle is the inclination i. It is defined as the angle between the line-of-sight and the axis of the

angular momentum vector of the binary, such that zero inclination corresponds to a pole-on view and i = 90°

to a view in the orbital plane. A second angle, Ωorb, describes the longitude of the ascending node. The

orientation of the orbit within its own plane is described by the longitude of periastron ω.

The position of the stars as a function of time is given by Kepler’s equation. Thus, we add a sixth

parameter describing the motion of the bodies, by fixing the time of periastron passage T . The six quantities

(a, e, i, ω,Ωorb, T ) thus determine the binary orbit in three-dimensional space. They are termed the orbital

elements. These definitions as well as the equations of the two-body problem are fully described in the

biennial MSc course Binary Stars. The students not attending this elective course are referred to Chapter 2

of the monograph “An Introduction to Close Binary Stars” by Hilditch (2001) for details. Moreover, Chapter

3 of that book describes in detail how to determine the orbital elements from various types of data.

14.3.2 Masses and radii of main-sequence stars

Important quantities that can be determined from data, besides the six parameters describing the orbit, are

• the radial velocity of the centre of mass of the binary, termed the gamma-velocity or also the systemic

velocity;

• the semi-amplitude of the radial-velocity curves of the primary K1 and of the secondary K2;

• the mass function, defined as

$$f(M) \equiv \frac{M_2^3 \sin^3 i}{(M_1 + M_2)^2}.$$

As shown in Hilditch (2001), the mass function can be determined for each spectroscopic binary from

its orbital period, $K_1$ and $e$:

$$f(M) = P_{\rm orb, yr} \left( \frac{K_1}{29.79} \right)^3 (1 - e^2)^{3/2},$$

where the orbital period $P_{\rm orb, yr}$ should be expressed in years and $K_1$ in km s$^{-1}$ to obtain f(M) in solar masses. The mass function is one

equation containing three unknowns. It represents a lower limit for the mass of the secondary (reached for $M_1 = 0$ and $i = 90^\circ$).
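The two expressions for the mass function can be compared numerically. The sketch below, with hypothetical function names, evaluates f(M) from the observables (orbital period in years, semi-amplitude in km/s, eccentricity) and from the definition, and also solves the definition for the secondary mass at i = 90°, which is the minimum secondary mass for an assumed primary mass. The example masses are made up for illustration.

```python
import math


def mass_function_from_orbit(p_years, k1_kms, e):
    """f(M) in solar masses from orbital period (yr), K1 (km/s) and eccentricity."""
    return p_years * (k1_kms / 29.79) ** 3 * (1.0 - e ** 2) ** 1.5


def mass_function_from_masses(m1, m2, incl_deg):
    """f(M) from its definition, M2^3 sin^3(i) / (M1 + M2)^2, masses in M_sun."""
    s = math.sin(math.radians(incl_deg))
    return (m2 * s) ** 3 / (m1 + m2) ** 2


def minimum_secondary_mass(f, m1, tol=1e-12):
    """Solve M2^3 / (M1 + M2)^2 = f for M2 (i.e. i = 90 deg) by bisection, f > 0."""
    lo, hi = 0.0, max(m1, 4.0 * f)   # the left-hand side is >= f at this upper bound
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid ** 3 / (m1 + mid) ** 2 < f:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    f = mass_function_from_masses(10.0, 5.0, 60.0)   # hypothetical SB1
    print(f, minimum_secondary_mass(f, 10.0))        # minimum M2 is below the true 5 M_sun
```

Because sin i ≤ 1, the secondary mass recovered for i = 90° is always a lower limit, which is why an SB1 on its own constrains the companion only from below.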

As explained in Hilditch (2001), SB2s allow the derivation of the mass ratio as we can deduce the ratio of

the semi-major axes of both components from their semi-amplitudes. Eclipsing SB2 binaries deliver the

masses and radii of the individual components, because a value of the inclination angle can be derived as

well. Visual or astrometric binaries also provide good constraints, but only for SB2 eclipsers do we get all

four of M1, M2, R1, R2 with very high accuracy, from Kepler’s laws. There are about 100 binary systems

nowadays for which such dynamical masses and radii can be derived with 1% accuracy, without having to

rely on stellar models. Most of these are included in the review paper of unevolved binaries covering a large

mass range by Torres et al. (2010), to which we refer for the values.


Recall that the mass-luminosity relations (MLRs) of ZAMS stellar models were evaluated by means of binaries in Figure 10.2.

We noted a very good agreement between observations and theory, for a large mass range covering

a huge range in luminosities. This agreement between theory and observations at the level of 1% in the

masses and radii is especially impressive because we compare theoretical ZAMS models with observations

of stars that cover the entire main sequence. Noteworthy is that internal mixing, both in the form of CBM

and envelope mixing, occurs during the evolution on the main sequence (see Chapter 7). This creates more

massive helium cores. The study by Torres et al. (2010) did not take this into account. An improved homogeneous

analysis for 11 among the massive binaries was performed by Tkachenko et al. (2020), taking

CBM into account and revealing the need for higher core masses than those obtained when ignoring CBM. In

this way, binaries play a critical role to evaluate single-star models in the core-hydrogen burning phase, to

evaluate the internal mixing and angular momentum transport processes discussed in Chapter 7.

14.3.3 Masses of white dwarfs

The derivation of accurate masses of white dwarfs is evidently important to test the Chandrasekhar mass

limit of 1.44 M⊙. Opportunities to derive model-independent masses of single white dwarfs, or white dwarfs

in wide binaries, are scarce because the luminosity of the white dwarf is usually much lower than

that of its companion star. Three well-known visual binaries with a white dwarf component are Sirius

(α CMa, A1V primary, orbital period of 50 years), α CMi (F5IV primary, orbital period of 41 years) and

o2 Eri BC (M4.5V primary, orbital period of 252 years). The masses of these white dwarfs are, respectively,

0.94, 0.65 and 0.43 M⊙. Asteroseismology also delivers good mass estimates for tens of white dwarfs. The

conclusion is that all mass estimates so far are compatible with the Chandrasekhar limit. The histogram of

the masses of white dwarfs shows a clear maximum at M = 0.58M⊙. This is understood in terms of the

initial mass function: stars born with 1.5 M⊙ end their lives as white dwarfs with a mass below 0.65 M⊙.

The determination of white dwarf masses is also important to try and understand the mass loss during

the AGB. Model computations predict that Sirius B descended from a star with initial mass between 3 and

4 M⊙. All white dwarfs in open clusters with a turn-off point near 6 M⊙ have a remnant mass of 1.2 M⊙.

This shows that stars with birth masses up to 6 M⊙ have no problem at all losing enough mass to end their

life as a white dwarf with a mass below the Chandrasekhar limit.

White dwarfs occur frequently in cataclysmic variables. These are close binaries consisting of a cool

low-mass main-sequence star losing mass to its white-dwarf companion. The masses of these white dwarfs

are determined with less accuracy than for a visual binary because the white dwarf itself is not detected.

Rather, emission lines due to the infall of matter of the donor on the accretion disk around the gainer are

observed. These emission lines follow the orbital motion and thus allow one to derive how the white dwarf itself

moves in its orbit. This is possible whenever the emission lines are clearly detected and not too broad (in

order to derive a precise value for the velocity changes). Usually, the estimate of the orbital inclination is

less accurate in this case compared to a visual binary and this limits the accuracy of the masses. The white

dwarfs in cataclysmic variables have higher masses than the single white dwarfs.


Figure 14.7: Schematic presentation of a binary whose least evolved component transfers mass to a compact

companion, creating an accretion disk around the latter. The position in the disk where matter is accreted

reveals a hot spot, giving rise to emission lines in the spectrum of the binary. This allows one to determine the

orbital motion, even if the gainer itself is invisible. (From Pringle & Wade 1985)

14.3.4 Type Ia supernovae

The standard model for the formation of Type-I supernovae was already mentioned in Chapter 11. A close

binary with a white dwarf companion whose mass is already close to the Chandrasekhar limit keeps accreting

mass from its donor (cf. cartoon in Figure 14.7). At a certain point, the Chandrasekhar limit is surpassed

and a nuclear reaction transforming carbon into oxygen takes place in degenerate matter. This leads to

a thermonuclear runaway and hence to an explosion. Such Type-I supernovae are classified in different

subcategories according to the spectral lines observed in their spectrum. Type-Ia supernovae are observed

most frequently. They show spectral lines of O, Mg, Si, Ca, and Fe and their Fe and Co lines remain visible

several months after the explosion. They are found in all types of galaxies and are thus ideally suited as

distance indicators. Type-Ia supernovae are important in observational cosmology, to derive the size and age

of the expanding Universe.

Uncertainties of 3D hydrodynamical simulations to predict the explosion of Type-I supernovae are

numerous. However, it is clear that two parameters are crucial in the evolution: the mass of the white dwarf

and its accretion rate. Different combinations of these parameters lead to quite different types of explosions

in terms of the remnants. Some explosions are so violent that they leave no remnant; others result again

in a white dwarf which is less massive than the one that went over the Chandrasekhar limit. An important

chemical aspect occurring for Type-Ia supernovae is the presence of $^{56}$Ni. This isotope decays into $^{56}$Co

which, in its turn, decays into $^{56}$Fe with a half-life of 77 days. This picture is fully confirmed from the

observation of Co lines several months after the explosion of Type-Ia supernovae.
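The persistence of Co lines for months follows directly from the decay chain. The sketch below evaluates the standard two-step decay solution for the relative abundances of the three isotopes as a function of time. The 77-day half-life of $^{56}$Co is the value quoted above; the half-life of about 6.1 days adopted for $^{56}$Ni is a literature value that is not given in these notes and is therefore an assumption of this example, as are the function and variable names.

```python
import math

HALF_LIFE_NI56 = 6.1    # days; assumed literature value, not quoted in the text
HALF_LIFE_CO56 = 77.0   # days; value quoted in the text


def decay_chain(t_days, n0=1.0):
    """Abundances (Ni-56, Co-56, Fe-56) at time t for n0 initial Ni-56 nuclei."""
    lam_ni = math.log(2.0) / HALF_LIFE_NI56
    lam_co = math.log(2.0) / HALF_LIFE_CO56
    ni = n0 * math.exp(-lam_ni * t_days)
    # Two-step decay (Bateman) solution for the intermediate isotope.
    co = (n0 * lam_ni / (lam_co - lam_ni)
          * (math.exp(-lam_ni * t_days) - math.exp(-lam_co * t_days)))
    fe = n0 - ni - co
    return ni, co, fe


if __name__ == "__main__":
    for t in (0.0, 10.0, 30.0, 100.0, 200.0):
        print(t, decay_chain(t))
```

With these half-lives the nickel is essentially gone after a few weeks, whereas a substantial fraction of the cobalt is still present after 100 days.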


14.3.5 Masses of neutron stars

A degenerate neutron gas is obviously able to counterbalance gravity in a neutron star, up to a mass of about

2.5 M⊙. Unlike the case for a degenerate electron gas, the EOS of a degenerate neutron gas is still debated,

with various formulations under investigation. These EOS candidates lead to different limiting masses (in

analogy with the Chandrasekhar limit), and thus accurate observational mass determination of neutron stars

is of utmost importance for high-energy physics.

The mass determination of neutron stars is based on data of X-ray binaries. Low-mass X-ray binaries

(LMXBs) are analogous to cataclysmic variables, with the gainer a neutron star or a black hole instead of a

white dwarf. For the same amount of mass transfer, an LMXB produces much more energy output and at

shorter wavelengths because the potential well of a neutron star is ∼1000 times deeper than that of a white

dwarf. This number is even higher for a black hole. The matter falling into the gainer’s gravitational well

radiates energy in the form of X-rays, hence the name. The surface temperature of such sources is about

$10^6$ – $10^7$ K and brings the radiated energy in the X-ray range of the electromagnetic spectrum. A high-mass

X-ray binary (HMXB) results from a massive OB-type main-sequence star dumping material on a neutron

star in a close binary. The magnetic field lines of the neutron star capture the material of the main-sequence

donor, which has a strong stellar wind throughout its life (cf. Chapter 13) and transports this matter to the

magnetic poles of the gainer. When the captured matter falls onto the neutron star, a high-energy stream

emerges and sends out X-rays. The orbital periods of X-ray binaries vary from minutes to tens of days.

In terms of mass determination, X-ray binaries can be treated in the same way as double-lined spectroscopic

binaries. We do not observe the neutron star directly, but its motion around its companion can

be derived with high precision in case of a pulsar thanks to the light-time effect. The intervals between the pulses of the neutron

star are shorter whenever it moves towards the observer, while they are longer when the neutron star moves

away from the observer. This effect allows one to determine the mass, provided that the inclination angle is

known. The inclination is estimated from the eclipse of the X-rays by the companion and its uncertainty is

the largest limitation for the neutron star mass estimate. Derived masses range from 0.5 to 2.5 M⊙, hence

most neutron stars have masses not much above the Chandrasekhar limit. A unique opportunity to derive the

mass of neutron stars with an extremely high accuracy occurs for binary pulsars. One can reach accuracy

levels of order 0.001 M⊙ in that way.
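The size of the pulse-timing effect can be estimated with a first-order Doppler relation between the observed and intrinsic pulse periods. The sketch below is a non-relativistic illustration only: the function names, the sign convention (positive radial velocity means receding) and the example pulsar are assumptions made here, and real pulsar timing uses the full arrival-time analysis.

```python
C_KMS = 299792.458   # speed of light in km/s


def observed_pulse_period(p_rest_s, v_radial_kms):
    """First-order Doppler-shifted pulse period; v_radial > 0 means receding."""
    return p_rest_s * (1.0 + v_radial_kms / C_KMS)


def radial_velocity_from_period(p_obs_s, p_rest_s):
    """Invert the relation above to recover the radial velocity in km/s."""
    return C_KMS * (p_obs_s / p_rest_s - 1.0)


if __name__ == "__main__":
    p0 = 1.0    # hypothetical intrinsic pulse period in seconds
    k = 100.0   # hypothetical orbital radial-velocity semi-amplitude in km/s
    print(observed_pulse_period(p0, -k))   # approaching: shorter period
    print(observed_pulse_period(p0, +k))   # receding: longer period
```

Tracking the pulse period around the orbit therefore yields the radial-velocity curve of the neutron star, in the same way as the shifting spectral lines do for an ordinary star.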

14.4 Mass transfer and evolution of close binaries

In general, the components of unevolved close binaries rotate much more slowly than single stars in the same

stage of their evolution. Observations also show that many stars in close binary systems are synchronised

with the orbital motion. Both results are attributed to the effects of tidal forces and show that these are

important in close binaries.


Figure 14.8: Schematic representation of the difference between the properties of typical cases giving rise

to an HMXB and an LMXB. (Figure courtesy of Prof. Ed van den Heuvel)

14.4.1 Tidal effects: circularisation and synchronisation

If we want to study the dynamical effects caused by tidal forces in a close binary, it is essential to first

determine the external gravitational field caused by the two components. Whenever the two components of

a binary are located close to each other, the tidal forces will distort one or both of the components. In

this case, the gravitational potential is no longer a simple function of 1/r but extra terms occur. These extra


terms appear because the mass distribution is no longer spherically symmetric in the affected components.

The extra terms in the gravitational potential cause a variation of the order of a few percent. When the stars

are moving around each other, the extra terms produce a force that causes a variation in energy and angular

momentum of the orbit. Among other things, this variation may cause apsidal motion, so the orbit is no

longer a closed ellipse. In this way, the close binary will evolve into a circular system.

An equilibrium (or static) tide occurs when the stars are constantly in a state of hydrostatic equilibrium

in a circular orbit and with their rotation synchronised with the orbital motion. In that case, the distortion of

the components is static in a coordinate system rotating with the binary. In the general case of an eccentric

non-synchronised binary, the tidal forces experienced by the components change throughout the orbital motion.

Such time-dependent tides are called dynamic tides. The tidal interaction causes the system to evolve

towards a circularised orbit and a state of co-rotation with the orbital period. As long as the stars are subjected

to dynamic tides, they experience tidal forces with a variable amplitude. In this way, the stars are

forced to oscillate. Hence, tides can be described as a superposition of forced oscillations of a spherically

symmetric star. Given that the tidal forces imply periodic perturbations, resonances may occur between the

dynamic tides and low-frequency gravity-mode oscillations (see course Asteroseismology for a definition).

These resonances intensify the effects of the tidal forces considerably and guide mixing via tidal friction

and angular momentum transport in the system.

Theoretical calculations (omitted here) show that the tidal forces first give rise to the synchronisation

of the components, while the process of circularisation takes more time. The two following mechanisms

explain synchronisation in close binaries:

1. In a star consisting of a convective core and a radiative envelope, the forced oscillations are damped

by viscous effects in the outer stellar layers. This dissipates the pulsation energy. Following this

dissipation, the companion exerts a torque on the other star causing the synchronisation of the rotation

with the companion’s orbital motion. For relatively close binaries, the torque is strong enough to

synchronise the system within a time span shorter than the nuclear time scale of the star. This radiative

damping of dynamic tides is the most efficient mechanism for the synchronisation of the rotation of

the close binary components without a convective envelope.

2. In the components of close binaries with a convective envelope, turbulent convection can slow down

the equilibrium tide in relation to the tide-generating potential. A torque is induced and this leads to

synchronisation. In this case, there is a higher degree of uncertainty for the time scale upon which

synchronisation is induced than for stars with a radiative envelope.

These theoretical results are based on the simplified assumptions that the primary is spherically symmetric,

rotates uniformly in its interior and with the axis of rotation perpendicular to the orbital plane. Ignoring the

effects of the Coriolis force in the treatment of the forced oscillations leads to the following expressions for

the time scales of synchronisation and circularisation of the orbit:

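$$\tau_{\rm circ} \approx 10^6 \cdot \frac{1}{q} \left[ \frac{1 + q}{2} \right]^{5/3} P_{\rm orb}^{16/3}; \qquad \tau_{\rm syn} \approx 10^4 \cdot \left[ \frac{1 + q}{2q} \right]^2 P_{\rm orb}^4 \tag{14.15}$$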

(result taken from the seminal paper by J. P. Zahn, 1977, “Tidal Friction in Close Binaries”, A&A, Vol. 500,

pp. 121–132). The orbital periods in these formulae must be expressed in days to get the results in years.
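The steep dependence of both time scales on the orbital period is easiest to appreciate numerically. The following sketch simply evaluates Eq. (14.15) as written, with the orbital period in days and the result in years; the function names and the sample values of q and of the period are assumptions for illustration.

```python
def tau_circ_years(q, p_days):
    """Circularisation time scale of Eq. (14.15), in years (period in days)."""
    return 1.0e6 * (1.0 / q) * ((1.0 + q) / 2.0) ** (5.0 / 3.0) * p_days ** (16.0 / 3.0)


def tau_syn_years(q, p_days):
    """Synchronisation time scale of Eq. (14.15), in years (period in days)."""
    return 1.0e4 * ((1.0 + q) / (2.0 * q)) ** 2 * p_days ** 4


if __name__ == "__main__":
    for p in (1.0, 2.0, 5.0, 10.0):
        print(p, tau_syn_years(0.6, p), tau_circ_years(0.6, p))
```

Doubling the orbital period lengthens the synchronisation time scale by a factor of 16 and the circularisation time scale by a factor of about 40, so tides are only efficient in the closest systems.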


Mechanism 2 above is effective during the pre-MS phase. Hence some pre-MS stars may arrive on the

ZAMS with already circularised orbits (and are hence expected to be synchronised), particularly for binary

protostars with orbital periods below some 10 days.

We conclude that, either during the pre-MS phase or on the main sequence, the dynamic tides cause

close binaries to evolve into a state of minimal energy: two co-rotating stars in a circular orbit. The

time scale for circularisation is roughly twice the one for synchronisation. Both time scales are shorter

than the nuclear time scale. It is therefore appropriate to assume that close binaries are synchronised and

circularised by the time that the primary reaches the TAMS. Should this not be the case, then this state will

soon thereafter be achieved once RLOF starts.

14.4.2 Mass transfer

The evolution of a star in a close binary is different from the evolution of a single star because mass transfer

occurs and therefore the evolution cannot just be described in terms of the initial birth mass. The evolution

of a star in a close binary is mainly determined by the size of its Roche lobe. In describing the evolution

of the components we rely on simple theoretical principles and adopt the convention that the donor is star 1

and the gainer is star 2. Because more massive stars evolve faster, we assume that the birth mass of star 1 is

higher than the one of star 2.

When star 1 fills its Roche lobe as a result of its evolution, the evolution is guided by the mass transfer

via L1 as soon as the size of the star becomes larger than the size of its Roche lobe. The stream of

gas particles from the atmosphere of star 1 through L1 into the Roche lobe of star 2 behaves as if the gas

leaks from the stellar atmosphere of star 1 into a vacuum, because star 2 does not yet fill its Roche lobe.

Therefore, the flow velocity of the gas equals the speed of sound in the atmosphere of star 1, which can be

approximated as $v_s \approx 15\,T_4^{1/2}$ km s$^{-1}$ (again with the notation $T_4 \equiv T/10^4\,{\rm K}$). Since the temperature of

the star can range from about 3 000 K for an M dwarf to some 30 000 K for a late O dwarf, $v_s$ varies between

roughly 10 and 30 km s$^{-1}$.

To understand what happens with the gas stream when it is accelerated towards star 2 after having

passed L1, a second relevant velocity is considered, namely the dynamical velocity in the binary which

equals the orbital velocity of the stars. This can be deduced from Kepler’s laws. A good approximation for a system with not too different masses is

$$v_{\rm orb} \approx 100 \left(\frac{M_1+M_2}{M_\odot}\right)^{1/3} \left(\frac{P}{\rm d}\right)^{-1/3} {\rm km\,s^{-1}}. \qquad (14.16)$$

Hence, the sound speed of the gas particles in the stream is far below the orbital velocity in the system so

they get accelerated to a supersonic velocity. This fact implies that the transferred material stays within

a well-defined stream after it has passed L1 because it moves at a velocity much higher than its original

natural velocity vs. Numerical simulations show the gas stream to stay within the Roche lobe of star 2. Two

scenarios can occur:

1. The incoming gas stream immediately collides with the stellar surface of star 2. In this case, the

energy gained by the gas stream as a result of the attraction by star 2, is dissipated in a shock at the

stellar surface of star 2. Almost all of the transferred material is immediately accreted by star 2. There

is conservation of mass and of orbital angular momentum in the whole system.

2. The incoming gas stream is too far away from the stellar surface of star 2 and it is not captured at once.

In this case, the gas stream moves around star 2 and, at a given moment, collides with itself in a point

near to star 2 within the Roche lobe. Since the stream moves supersonically, the collision dissipates

kinetic energy, causing heating of the gas stream. This gas stream will radiate thermal energy and cool

down. Since the collision occurs near star 2, we can neglect the influence of star 1 on the further

trajectory of the gas stream. The stream evolves to a state of minimal energy, which corresponds to

a circular ring around star 2. The radius of this ring, $R_H$, is such that its orbital velocity

equals the tangential velocity of the stream at the point where it collides with itself.

Because the material in the gas ring radiates energy, some of the particles move closer towards star

2. The conservation of orbital angular momentum implies that particles will be found further away

from the star as well. So, the gas ring will evolve to a disk around star 2. This way, an accretion disk

is formed around star 2. Matter from this disk can then easily be accreted by star 2. In this scenario,

there is also conservation of mass and of orbital angular momentum (cf. Figure 14.6).

The continuous flow of matter coming from the cooler donor and passing via L1 creates a so-called

hot spot at the exterior of the accretion disk (cf. Figure 14.7). This hot spot causes energetic radiation,

which can be detected as an excess of radiation at ultraviolet or at even shorter wavelengths according

to the level of heating that occurs in the hot spot.
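The comparison between the two velocities used in this section can be sketched numerically. The snippet below assumes the usual scaling of the sound speed with the square root of the temperature, normalised to 15 km s$^{-1}$ at $10^4$ K, and uses the approximation of Eq. (14.16) for the orbital velocity. The function names and the example system (3 M⊙ in total, 5-day orbit, donor atmosphere at 6 000 K) are hypothetical.

```python
def sound_speed_kms(temperature_k):
    """Approximate sound speed in km/s, assuming v_s ~ 15 * sqrt(T / 10^4 K)."""
    return 15.0 * (temperature_k / 1.0e4) ** 0.5


def orbital_velocity_kms(total_mass_msun, period_days):
    """Approximate orbital velocity in km/s from Eq. (14.16)."""
    return 100.0 * total_mass_msun ** (1.0 / 3.0) * period_days ** (-1.0 / 3.0)


if __name__ == "__main__":
    # Hypothetical system, chosen only for illustration.
    v_s = sound_speed_kms(6000.0)
    v_orb = orbital_velocity_kms(3.0, 5.0)
    print(f"v_s = {v_s:.1f} km/s, v_orb = {v_orb:.1f} km/s, "
          f"v_orb / v_s = {v_orb / v_s:.1f}")
```

For such a system the orbital velocity exceeds the sound speed by a large factor, which is the condition for the stream to remain narrow and supersonic after passing L1.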

14.4.3 Effect of mass transfer on the orbital parameters

Conservative mass transfer

The orbital period is the quantity of a binary that is the easiest to determine observationally. The orbital

period changes as a result of the mass transfer from donor to gainer. We confine ourselves to the description

of circular orbits, because we can roughly assume that the tidal force acts in such a way that close binaries

evolve towards a circular configuration as argued above.

In a system of coordinates of which the origin coincides with the mass centre of the close binary, each

of the stars moves in a circular orbit around the mass centre of the system. The orbits of stars 1 and 2 then

have radii:

$$a_1 = a\,\frac{M_2}{M_1+M_2} \quad {\rm and} \quad a_2 = a\,\frac{M_1}{M_1+M_2}, \qquad (14.17)$$

in which a is the orbital separation. The total angular momentum in the system is approximately

$$J_{\rm orb} = M_1 a_1^2\,\Omega_{\rm orb} + M_2 a_2^2\,\Omega_{\rm orb}, \qquad (14.18)$$

in which Ωorb = 2π/Porb. Two additional terms occur when the stars do not rotate synchronously with

the orbit, i.e., $\Omega_{\rm orb} \neq \Omega_{\rm rot}$. However, those rotational contributions are so small that they are negligible

in comparison with the terms on the right hand side of the equation (14.18) describing the orbital angular

momentum because the stellar radii are smaller than the orbital radii. With the help of Eqs (14.17) for the

radii, the equation for the momentum can be transformed into

$$J_{\rm orb} = \frac{M_1 M_2}{M_1+M_2}\,a^2\,\Omega_{\rm orb}. \qquad (14.19)$$

When calculating the evolution of close binaries, it is often assumed that the process of mass transfer

is conservative. This means that there is no mass loss from the binary system and there is no loss of angular

momentum. In this case,

$$\frac{d}{dt}M = \frac{d}{dt}(M_1+M_2) = 0 \quad {\rm and} \quad \frac{dJ_{\rm orb}}{dt} = 0. \qquad (14.20)$$

By taking the time derivative of Eq. (14.19), we get

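$$\frac{\dot J_{\rm orb}}{J_{\rm orb}} = 2\,\frac{\dot a}{a} + \frac{\dot\Omega_{\rm orb}}{\Omega_{\rm orb}} + \frac{\dot M_1}{M_1} + \frac{\dot M_2}{M_2} - \frac{\dot M}{M}, \qquad (14.21)$$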

and, taking the conservation of mass and angular momentum into account, this can be reduced to

$$2\,\frac{\dot a}{a} + \frac{\dot\Omega_{\rm orb}}{\Omega_{\rm orb}} + \frac{\dot M_1}{M_1} + \frac{\dot M_2}{M_2} = 0. \qquad (14.22)$$

By applying Kepler’s 3rd law,

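$$a^3 = \frac{G\,(M_1+M_2)}{4\pi^2}\,P^2, \qquad (14.23)$$

this result can be transformed into the following conditions for the variation of the orbital period and the orbital separation, respectively:

$$\frac{\dot P_{\rm orb}}{P_{\rm orb}} = \frac{3\,\dot M_1\,(M_1-M_2)}{M_1 M_2}, \qquad (14.24)$$

$$\frac{\dot a}{a} = \frac{2\,\dot M_1\,(M_1-M_2)}{M_1 M_2}. \qquad (14.25)$$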

Conservative mass loss causes the orbital period to change at a rate that is determined by the donor’s mass

loss. The orbital period as well as the orbital separation decrease because of the mass loss of star 1, given

that $M_1 > M_2$ and $\dot M_1 < 0$. The ratio $R_L/a$ decreases because the effect of the decreasing mass ratio

$M_1/M_2$ outweighs that of the decreasing separation. This means that the donor’s Roche lobe shrinks,

accelerating even more the mass loss and causing it to increase rapidly.
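Integrating Eq. (14.25) at constant total mass gives $a \propto (M_1 M_2)^{-2}$, and Eq. (14.24) likewise gives $P_{\rm orb} \propto (M_1 M_2)^{-3}$. The sketch below uses these closed forms to follow the orbit as the donor loses mass. The function name and the masses of the example system (a 3 M⊙ donor with a 2 M⊙ gainer) are hypothetical choices for illustration.

```python
def conservative_orbit(m1_initial, m2_initial, m1_final):
    """Return (a_f / a_i, P_f / P_i) for conservative mass transfer.

    Uses a ~ (M1*M2)^-2 and P ~ (M1*M2)^-3, which follow from
    Eqs. (14.24)-(14.25) when M1 + M2 and J_orb are conserved.
    """
    m_total = m1_initial + m2_initial
    m2_final = m_total - m1_final          # the gainer receives all lost mass
    ratio = (m1_initial * m2_initial) / (m1_final * m2_final)
    return ratio ** 2, ratio ** 3


if __name__ == "__main__":
    # Hypothetical system: donor of 3 Msun, gainer of 2 Msun.
    for m1 in (3.0, 2.75, 2.5, 2.0, 1.5, 1.0):
        a_ratio, p_ratio = conservative_orbit(3.0, 2.0, m1)
        print(f"M1 = {m1:4.2f}, M2 = {5.0 - m1:4.2f}: "
              f"a/a_i = {a_ratio:.3f}, P/P_i = {p_ratio:.3f}")
```

The separation and period reach their minimum when the two masses are equal and grow again once the mass ratio has reversed.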

The initiated gas stream transports angular momentum from the donor to the gainer. The mass loss,

which is constantly accelerated, keeps on going until the mass ratio of the stars reverses, causing the orbital

period and separation to increase again. Also the donor’s Roche lobe will increase. On the other hand, the

size of the Roche lobe keeps on decreasing because of the mass loss of the donor. The net result from both

effects is a minimal increase of the Roche lobe radius. This puts an end to the process of accelerated mass

loss, and the loss of mass again occurs at a slower rate (on a thermal or even nuclear time scale rather than a

dynamical time scale). Most close binaries transferring mass are discovered in this phase of their evolution.

At a certain point in the evolution, the radius of the Roche lobe again becomes larger than the stellar

radius. At that moment, the separation between the components is much larger than the initial separation

before the start of the mass transfer, because the mass difference between the components is now higher.

Therefore, the thermodynamical equilibrium gets restored in the donor. The stellar surface of the donor

disconnects with the Roche lobe of star 2 and the system gets detached. The donor now consists of a helium

core surrounded by a very thin envelope that dissipates after a short while. Depending on the duration of

the mass transfer and the parameters of the close binary, the donor is transformed into a helium star, a white

dwarf or a neutron star.

Non-conservative mass transfer

In the case of non-conservative mass transfer, mass is lost from the system via L2. In that case, the evolution

of the orbit is more complex. Assume that a fraction β of the transferred mass leaves the system. In that

case, we have

$$\dot M = \beta\,\dot M_1 \quad {\rm and} \quad \dot M_2 = -(1-\beta)\,\dot M_1. \qquad (14.26)$$

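From

$$\frac{\dot J_{\rm orb}}{J_{\rm orb}} = \frac{1}{2}\,\frac{\dot a}{a} + \frac{\dot M_1}{M_1} + \frac{\dot M_2}{M_2} - \frac{1}{2}\,\frac{\dot M}{M}, \qquad (14.27)$$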

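we get

$$\frac{\dot J_{\rm orb}}{J_{\rm orb}} = \frac{1}{2}\,\frac{\dot a}{a} + \frac{\dot M_1}{M_1 M_2 M} \left[ M_2 M - (1-\beta)\,M_1 M - \frac{1}{2}\,\beta\,M_2 M_1 \right] \qquad (14.28)$$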

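and

$$\frac{3}{2}\,\frac{\dot a}{a} = \frac{\dot P}{P} = 3\,\frac{\dot J_{\rm orb}}{J_{\rm orb}} + \frac{3\,\dot M_1}{M_1 M_2} \left[ (M_1 - M_2) - \beta\,M_1 \left( 1 - \frac{M_2}{2M} \right) \right]. \qquad (14.29)$$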

The second term on the right-hand side is less negative than in the case of conservative mass transfer and

turns positive somewhat before the mass ratio inverts. The first term, however, is always negative and implies

an increased reduction of the orbital separation compared to the conservative case.
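The statement about the sign change of the second term can be checked with a short numerical sketch. It evaluates the bracket $(M_1 - M_2) - \beta M_1 (1 - M_2/2M)$ of Eq. (14.29) and locates, by bisection, the donor mass at which it vanishes. Keeping the gainer mass fixed while scanning $M_1$ is a simplification made here for illustration, and the function names and the values of β are hypothetical.

```python
def mass_term_bracket(m1, m2, beta):
    """Bracket of the second term in Eq. (14.29), with M = M1 + M2."""
    m_total = m1 + m2
    return (m1 - m2) - beta * m1 * (1.0 - m2 / (2.0 * m_total))


def donor_mass_at_sign_change(m2, beta, tol=1e-10):
    """Donor mass M1 >= M2 at which the bracket vanishes (bisection).

    Assumes 0 <= beta <= 0.9, for which the bracket is <= 0 at M1 = M2
    and > 0 at M1 = 10 * M2.
    """
    lo, hi = m2, 10.0 * m2
    while hi - lo > tol * m2:
        mid = 0.5 * (lo + hi)
        if mass_term_bracket(mid, m2, beta) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical gainer of 2 Msun and a few trial values of beta.
    for beta in (0.0, 0.1, 0.5):
        m1_zero = donor_mass_at_sign_change(2.0, beta)
        print(f"beta = {beta:.1f}: bracket vanishes at M1/M2 = {m1_zero / 2.0:.3f}")
```

For β = 0 the bracket vanishes at equal masses, as in the conservative case, while for β > 0 it vanishes while the donor is still the more massive star.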

Effect of mass loss due to a stellar wind

For binary systems involving massive components, one also has to take into account the mass loss by a

radiation-driven wind. The same is true for lower mass binaries which undergo a dust-driven wind. Such

winds also affect the orbital elements of the binary. Assume that one of the components undergoes mass

loss $\dot M_1$. In that case, there is a loss of orbital angular momentum:

$$\dot J_{\rm orb} = \dot M_1\,a_1^2\,\Omega_{\rm orb}. \qquad (14.30)$$

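We have

$$\frac{\dot J_{\rm orb}}{J_{\rm orb}} = \frac{\dot M_1\,a_1^2\,\Omega_{\rm orb}}{(a^2\,\Omega_{\rm orb}\,M_1 M_2)/M} = \dot M_1\,\frac{M_2}{M_1 M} \qquad (14.31)$$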

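and also

$$\frac{\dot J_{\rm orb}}{J_{\rm orb}} = \frac{2}{3} \left( \frac{\dot M_1}{M} - 2\,\frac{\dot\Omega_{\rm orb}}{\Omega_{\rm orb}} \right) + \frac{\dot\Omega_{\rm orb}}{\Omega_{\rm orb}} + \frac{\dot M_1}{M_1} - \frac{\dot M_1}{M} = -\frac{1}{3}\,\frac{\dot M_1}{M} + \frac{1}{3}\,\frac{\dot P}{P} + \frac{\dot M_1}{M_1} \qquad (14.32)$$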

such that

$$\frac{3}{2}\,\frac{\dot a}{a} = \frac{\dot P}{P} = -2\,\frac{\dot M_1}{M}. \qquad (14.33)$$

In this case, both the orbital period and the separation increase such that the mass transfer is slowed down.
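Since only star 1 loses mass, $\dot M = \dot M_1$ and the relation $\dot P/P = -2\,\dot M_1/M$ of Eq. (14.33) integrates to $P M^2 = {\rm constant}$. A minimal sketch of the resulting period change is given below. The function name and the example values (a 50 M⊙ binary losing 5 M⊙ through a wind) are hypothetical.

```python
def period_ratio_after_wind(m_total_initial, mass_lost):
    """Return P_f / P_i after wind mass loss, from P * M^2 = constant."""
    m_total_final = m_total_initial - mass_lost
    if not 0.0 < m_total_final <= m_total_initial:
        raise ValueError("mass_lost must satisfy 0 <= mass_lost < m_total_initial")
    return (m_total_initial / m_total_final) ** 2


if __name__ == "__main__":
    # Hypothetical massive binary: 50 Msun in total, 5 Msun lost in a wind.
    ratio = period_ratio_after_wind(50.0, 5.0)
    print(f"P_f / P_i = {ratio:.3f}")
```

A loss of 10 per cent of the total mass thus lengthens the period by a bit more than 20 per cent under these assumptions.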

14.4.4 The common envelope phase

The common envelope (CE) formalism in a binary was introduced to explain the existence of short-period

binaries with white dwarf components. The situation of a CE binary, compared with a detached and semi-

detached system, is sketched in the bottom right panel of Figure 14.5. During the CE phase, a spiral-in of

the secondary towards the primary occurs, because the companion experiences drag forces which make it

move into the envelope of the giant primary. This sets in soon after the CE is formed, i.e., when the mass

ratio is high. In that stage, the CE is not in co-rotation and the envelope is heated.

The outcome of the CE phase is determined by the energy balance within the system, assuming angular

momentum conservation. In this model, the orbital energy of the binary is used to expel the envelope of the

giant with some unknown efficiency. The orbital energy Eorb released in the spiral-in process, may be used

to eject the envelope with an efficiency α,

$$\alpha\,(E_{\rm orb,f} - E_{\rm orb,i}) = E_{\rm env}, \qquad (14.34)$$

where Eenv is the binding energy of the ejected envelope, and the subscripts i and f denote initial and final

values before and after the CE phase. In principle, one expects 0 < α ≤ 1. However, in order to explain

observed binaries one often finds that α exceeds unity. This indicates that other energy sources contribute to

the ejection of the envelope, e.g. the luminosity of the giant. Still, a very high value for α is not anticipated,

since it would be physically difficult to explain where such a large amount of energy would come from. The

poorly understood physics of the CE phase does not allow us to set a hard limit on α.

It is reasonable to assume that the secondary does not accrete matter since the mass transfer time-scale

is short. Expression (14.34) can then be written as

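$$\alpha \left( -\frac{G M_{\rm remnant} M_2}{2 a_f} + \frac{G M_{\rm giant} M_2}{2 a_i} \right) = -\frac{G M_{\rm giant} M_{\rm env}}{\lambda R_{\rm giant}}, \qquad (14.35)$$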

where we have expressed Eenv in terms of the structural parameter λ. The combined parameter αλ can be

calculated from Eq. (14.35). To isolate α, one usually takes λ = 0.5, but an appropriate calculation should

take into account that λ depends on the stellar structure.
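Equation (14.35) can be solved for the final separation $a_f$ once the product αλ is chosen. The sketch below does this in units where masses are in M⊙ and lengths in R⊙, so that G cancels. It assumes $M_{\rm env} = M_{\rm giant} - M_{\rm remnant}$. The function name and the example values are hypothetical.

```python
def ce_final_separation(m_giant, m_remnant, m2, a_i, r_giant, alpha_lambda):
    """Solve Eq. (14.35) for the final separation a_f.

    Masses in Msun, lengths in Rsun (G cancels on both sides).
    Assumes M_env = M_giant - M_remnant.
    """
    m_env = m_giant - m_remnant
    # Rearranged Eq. (14.35):
    # M_rem*M2/a_f = M_giant*M2/a_i + 2*M_giant*M_env/(alpha*lambda*R_giant)
    inverse_term = (m_giant * m2 / a_i
                    + 2.0 * m_giant * m_env / (alpha_lambda * r_giant))
    return m_remnant * m2 / inverse_term


if __name__ == "__main__":
    # Hypothetical giant of 1.5 Msun with a 0.5 Msun core and a 1 Msun companion.
    a_f = ce_final_separation(m_giant=1.5, m_remnant=0.5, m2=1.0,
                              a_i=200.0, r_giant=100.0, alpha_lambda=0.5)
    print(f"a_f = {a_f:.2f} Rsun (from a_i = 200 Rsun)")
```

With these example numbers the orbit shrinks by more than an order of magnitude, which illustrates how the formalism produces short-period systems.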

The total binding energy of the binary consists of the gravitational binding energy and the internal

energy U ,

$$E_{\rm bind} = \int_{M_{\rm remnant}}^{M_{\rm giant}} \left( -\frac{GM}{r} + U \right) dm. \qquad (14.36)$$

It is uncertain how much of the internal energy could contribute to the ejection of (part of) the envelope.

This uncertainty is expressed in a parameter αth:

$$E_{\rm env} = \int_{M_{\rm remnant}}^{M_{\rm giant}} \left( -\frac{GM}{r} \right) dm + \alpha_{\rm th} \int_{M_{\rm remnant}}^{M_{\rm giant}} U\,dm. \qquad (14.37)$$

Figure 14.9: Two possible outcomes of a common envelope phase of binary evolution. (Figure courtesy of

Prof. Philipp Podsiadlowski, Oxford University, UK)

Expression (14.37) can be regarded as the effective binding energy of the envelope and is used to derive λ.

The first phase of mass transfer of observed double white dwarfs cannot be described by the standard α formalism, nor by stable RLOF (which is graphically depicted in Figure 14.6). For binaries with mass ratio close to unity, the common envelope is formed by a runaway mass transfer rather than a decay of the orbit. In this case, the angular momentum of the orbit is so large that the common envelope is brought into co-rotation. Consequently, there are no drag forces that can convert orbital energy into heat and kinetic energy. This scenario is described in terms of the angular momentum balance, the so-called γ-formalism.

Figure 14.10: Scenario of binary evolution leading to a double white-dwarf binary. (Figure courtesy of Prof. Alain Jorissen, Université Libre de Bruxelles, B)

Figure 14.11: Scenario of binary evolution leading to the formation of a compact binary consisting of a

pulsar and a white dwarf. (Figure courtesy of Prof. Ed van den Heuvel, University of Amsterdam, NL)

The assumption is that the angular momentum carried away per unit mass of the ejected envelope is γ times the initial orbital angular momentum per unit mass of the binary,

$$\gamma\,\frac{J_i}{M_{\rm giant}+M_2} = \frac{J_i - J_f}{M_{\rm env}}. \qquad (14.38)$$

Figure 14.12: Scenario of a massive binary evolution. (From https://hfstevance.com/ccsne)

Figure 14.13: Formation scenario of a single black hole or a binary black hole giving rise to gravitational wave emission. (From Mapelli, M., 2020, Frontiers in Astronomy and Space Sciences, Volume 7, id.38)

Although the γ-formalism was originally developed for double white dwarfs, it has also been put forward to explain systems in which a main-sequence star transfers mass to a neutron star or black hole, and the same physical picture is applicable to systems where a red giant overflows onto a main-sequence star. The treatment gives an upper limit for γ, since the angular momentum carried away by the envelope cannot be higher than the angular momentum of the secondary,

$$\gamma_{\max} = \frac{M_{\rm giant}+M_2}{M_{\rm env}} - \frac{(M_{\rm giant}+M_2)^2}{M_{\rm env}\,(M_{\rm core}+M_2)} \exp\left( -\frac{M_{\rm env}}{M_2} \right). \qquad (14.39)$$

In conclusion, the orbital evolution during a common envelope phase is uncertain. In computations of

binary evolution, this phase is described by the free parameters α or γ (or both), for lack of a theory based

on first principles. There are basically two major outcomes of this phase, as illustrated in Figure 14.9 for a

low-mass binary: ejection of the envelope leaving a compact binary or a merger product.
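The γ-formalism of Eqs. (14.38) and (14.39) can be sketched in a few lines. The snippet below computes $\gamma_{\max}$, the ratio $J_f/J_i$, and the corresponding change in separation. The last step assumes circular orbits, for which Eq. (14.19) combined with Kepler's third law gives $J_{\rm orb} = M_1 M_2 \sqrt{G a/M}$, and it takes $M_{\rm env} = M_{\rm giant} - M_{\rm core}$. Function names and example values are hypothetical.

```python
import math


def gamma_max(m_giant, m_core, m2):
    """Upper limit on gamma from Eq. (14.39)."""
    m_env = m_giant - m_core
    m_total = m_giant + m2
    return (m_total / m_env
            - m_total ** 2 / (m_env * (m_core + m2)) * math.exp(-m_env / m2))


def angular_momentum_ratio(m_giant, m_core, m2, gamma):
    """J_f / J_i from Eq. (14.38)."""
    m_env = m_giant - m_core
    return 1.0 - gamma * m_env / (m_giant + m2)


def separation_ratio(m_giant, m_core, m2, gamma):
    """a_f / a_i, assuming J = M1 * M2 * sqrt(G * a / M) for circular orbits."""
    j_ratio = angular_momentum_ratio(m_giant, m_core, m2, gamma)
    return (j_ratio ** 2
            * (m_core + m2) / (m_giant + m2)
            * (m_giant / m_core) ** 2)


if __name__ == "__main__":
    # Hypothetical giant of 1.5 Msun with a 0.5 Msun core and a 1 Msun companion.
    g_max = gamma_max(1.5, 0.5, 1.0)
    print(f"gamma_max = {g_max:.3f}")
    print(f"a_f / a_i for gamma = 0.9: {separation_ratio(1.5, 0.5, 1.0, 0.9):.3f}")
```

Trying different values of γ shows how sensitively the final separation depends on this free parameter.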

14.5 Some binary scenarios

Given the above theoretical considerations, simplifications, and uncertainties in the various theories, numer-

ous binary scenarios across stellar evolution have been constructed and studied over the years. We show four

of those in cartoon-like figures, with low-mass to high-mass components as end products, in Figures 14.10

to 14.13. We stress that these are just some of the numerous binary evolution channels published in

the literature. It is obvious that major uncertainties come into play in the various channels and in the stellar

structure models accompanying those channels. The scenarios mainly aim to understand the intermediate

and end-products of close binary evolution. The theories cannot but deliver very rough models of stellar

interiors in a qualitative sense. Their level of quantitative detail comes nowhere near the one resulting from

the quite simple and elegant theory of single-star evolution described in the other chapters of this course.

Many more details of binary stars and their evolution are covered in the biennial twin MSc courses

Binary Stars and High Energy Astrophysics.

THE END

Appendix A

Planck’s radiation laws

Most of the electromagnetic radiation that exists in the Universe has a thermal origin. Each source

with a temperature T displays a characteristic spectrum in which the intensity Bν(T ) and the energy density

$u_\nu(T)$ can be described, to a good approximation, by Planck’s radiation law:

$$u_\nu(T)\,d\nu = \frac{4\pi}{c}\,B_\nu(T)\,d\nu = \frac{8\pi h}{c^3}\,\frac{\nu^3}{\exp(h\nu/kT) - 1}\,d\nu. \qquad ({\rm A.1})$$

Here, k is Boltzmann’s constant (see Appendix B). The derivation of this law is discussed in the course

Natuurkunde III and will not be treated here. An object that radiates following the law (A.1) is called a

black body.

Planck’s law can also be written as a function of the wavelength, which is more practical to compare

with observations of stars:

$$B_\lambda(T)\,d\lambda = \frac{2hc^2}{\lambda^5}\,\frac{1}{\exp(hc/\lambda kT) - 1}\,d\lambda. \qquad ({\rm A.2})$$

In Figure A.1 we show $B_\lambda(T)$ for different temperatures. It is clear that the intensity tends to zero

in the limit of very small and very large wavelengths. The effective temperatures of stars lie roughly

between 3 000 K and 30 000 K. Consequently they radiate in the so-called optical window of the electromagnetic

spectrum. Notice how strongly the intensity changes at blue wavelengths for temperatures relevant

for stars.

The curves in Figure A.1 reach a maximum of which the position depends on the temperature:

$$\lambda_{\max}\,T = 2898\ \mu{\rm m\,K}. \qquad ({\rm A.3})$$

This is called Wien’s displacement law. The temperature on the outside of the sun is almost 5 800 K. Con-

sequently the intensity of the solar radiation has a maximum around 500 nm. Planets and warm dust have

temperatures of about 290 K, and therefore radiate with a maximum in the infra-red, at about 10µm. Cool

molecular clouds with a temperature of 10 K radiate in the far infra-red up to the mm range.
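Wien's displacement law can be verified numerically from Eq. (A.2). The sketch below evaluates $B_\lambda(T)$ in cgs units with the constants of Appendix B and locates the maximum on a wavelength grid. The function names, the grid limits and the 1 nm step are illustrative assumptions.

```python
import math

H = 6.6260755e-27    # Planck constant [erg s]
C = 2.99792458e10    # speed of light [cm s^-1]
K = 1.380658e-16     # Boltzmann constant [erg K^-1]


def planck_lambda(wavelength_cm, temperature_k):
    """B_lambda(T) of Eq. (A.2) in cgs units."""
    x = H * C / (wavelength_cm * K * temperature_k)
    if x > 700.0:            # exp(x) would overflow; the intensity is negligible
        return 0.0
    return 2.0 * H * C ** 2 / wavelength_cm ** 5 / math.expm1(x)


def peak_wavelength_nm(temperature_k, lo_nm=100.0, hi_nm=20000.0, step_nm=1.0):
    """Wavelength (nm) of maximum B_lambda on a regular grid."""
    n_steps = int(round((hi_nm - lo_nm) / step_nm))
    best_nm, best_value = lo_nm, 0.0
    for i in range(n_steps + 1):
        lam_nm = lo_nm + i * step_nm
        value = planck_lambda(lam_nm * 1.0e-7, temperature_k)  # nm -> cm
        if value > best_value:
            best_nm, best_value = lam_nm, value
    return best_nm


if __name__ == "__main__":
    for temperature in (5800.0, 290.0):
        numeric = peak_wavelength_nm(temperature)
        wien = 2898.0e3 / temperature      # Eq. (A.3), converted to nm
        print(f"T = {temperature:6.0f} K: grid peak = {numeric:.0f} nm, "
              f"Wien = {wien:.0f} nm")
```

The grid maximum agrees with Eq. (A.3) to within the grid resolution for both the solar and the 290 K case.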

Figure A.1: Black body radiation for different temperatures. top panel: T varies from 5000 K (lowest curve)

to 9000 K (upper curve) with steps of 1000 K; bottom panel: T varies from 9000 K (lowest curve) to 25000 K

(upper curve) with steps of 4000 K

The total energy density, integrated over all frequencies, is given by

$$u(T) = \int_0^\infty u_\nu(T)\,d\nu = a\,T^4, \qquad ({\rm A.4})$$

with a the radiation constant (see Appendix B). From this we conclude that the energy of thermal radiation

strongly depends on the temperature of the object that emits the radiation. The energy flux per unit of surface

of a black body is given by Stefan-Boltzmann’s law:

$$B(T) = \sigma T^4 \qquad ({\rm A.5})$$

where σ is the constant of Stefan-Boltzmann (see Appendix B).
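The constants a and σ are related by $a = 4\sigma/c$, which offers a quick consistency check of the tabulated values in Appendix B. The sketch below also applies Eq. (A.5) to a sphere with the solar radius, using the standard relation $L = 4\pi R^2 \sigma T^4$. The function names and the adopted surface temperature of 5 800 K are assumptions for illustration.

```python
import math

SIGMA = 5.67051e-5    # Stefan-Boltzmann constant [erg cm^-2 s^-1 K^-4]
C = 2.99792458e10     # speed of light [cm s^-1]
A_RAD = 7.5646e-15    # radiation constant [erg cm^-3 K^-4]
R_SUN = 6.9598e10     # solar radius [cm]
L_SUN = 3.8515e33     # solar luminosity [erg s^-1]


def radiation_constant_from_sigma():
    """Radiation constant computed as a = 4 * sigma / c (cgs units)."""
    return 4.0 * SIGMA / C


def blackbody_luminosity(radius_cm, temperature_k):
    """Luminosity of a spherical black body, L = 4 pi R^2 sigma T^4."""
    return 4.0 * math.pi * radius_cm ** 2 * SIGMA * temperature_k ** 4


if __name__ == "__main__":
    a_computed = radiation_constant_from_sigma()
    print(f"4 sigma / c = {a_computed:.4e}  (tabulated a = {A_RAD:.4e})")
    # Assumed surface temperature of 5800 K, as quoted in Appendix A.
    lum = blackbody_luminosity(R_SUN, 5800.0)
    print(f"L(5800 K) / L_sun = {lum / L_SUN:.3f}")
```

Both checks agree with the tabulated constants at the per cent level or better, which is as much as the rounded input temperature allows.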

Planck’s law is usually a good first approximation for the description of the radiation intensity of stars.

However, the radiation we receive from stars does not come from one single layer in the stellar atmosphere,

and therefore cannot be characterised by one unique temperature. Moreover, absorption and sometimes

emission lines occur in the intensity spectra of stars. Thanks to quantum physics, they can be interpreted in

terms of transitions in atomic nuclei (for the γ- and X-rays), in electron shells (for UV, visual and infra-red

wavelengths) and in molecules (infra-red and mm wavelengths). A detailed analysis of the stellar spectrum

allows us to interpret the physical condition and the chemical composition of the outer layers of the star.

Appendix B

Values of some physical and astronomical

constants

In astronomy all quantities are usually still expressed in cgs units. However, students are (as they should

be!) more familiar with the SI system. I leave it up to you which units to use in this course and have

mixed them myself. Below we list a few physical and astronomical constants and other commonly

used quantities in the cgs as well as the SI system. We also give a few conversion formulas to other units.

Physical constants:

Constant               Symbol             cgs units                                                    SI units

Speed of light         $c$                $2.99792458 \times 10^{10}$ cm s$^{-1}$                     $2.99792458 \times 10^{8}$ m s$^{-1}$

Gravitation            $G$                $6.67259 \times 10^{-8}$ cm$^3$ g$^{-1}$ s$^{-2}$           $6.67259 \times 10^{-11}$ m$^3$ kg$^{-1}$ s$^{-2}$

Atomic mass unit       $m_u$              $1.6605390 \times 10^{-24}$ g                               $1.6605390 \times 10^{-27}$ kg

Electron mass          $m_e$              $9.1093836 \times 10^{-28}$ g                               $9.1093836 \times 10^{-31}$ kg

Proton mass            $m_p$              $1.6726219 \times 10^{-24}$ g                               $1.6726219 \times 10^{-27}$ kg

Neutron mass           $m_n$              $1.6749275 \times 10^{-24}$ g                               $1.6749275 \times 10^{-27}$ kg

Mass helium nucleus    $m_{\rm He}$       $6.6446572 \times 10^{-24}$ g                               $6.6446572 \times 10^{-27}$ kg

Electron charge        $e$                $1.60217733 \times 10^{-20}\,c$ esu                         $1.60217733 \times 10^{-19}$ Coulomb

Planck                 $h = 2\pi\hbar$    $6.6260755 \times 10^{-27}$ erg s                           $6.6260755 \times 10^{-34}$ J s

Boltzmann              $k$                $1.380658 \times 10^{-16}$ erg K$^{-1}$                     $1.380658 \times 10^{-23}$ J K$^{-1}$

Gas                    $R$                $8.314510 \times 10^{7}$ erg K$^{-1}$ mol$^{-1}$            $8.314510$ J K$^{-1}$ mol$^{-1}$

Radiation              $a$                $7.5646 \times 10^{-15}$ erg cm$^{-3}$ K$^{-4}$             $7.5646 \times 10^{-16}$ J m$^{-3}$ K$^{-4}$

Stefan-Boltzmann       $\sigma$           $5.67051 \times 10^{-5}$ erg cm$^{-2}$ s$^{-1}$ K$^{-4}$    $5.67051 \times 10^{-8}$ J m$^{-2}$ s$^{-1}$ K$^{-4}$

Astronomical constants:

Constant             Symbol       cgs units                                  SI units

Radius Sun           $R_\odot$    $6.9598 \times 10^{10}$ cm                 $6.9598 \times 10^{8}$ m

Mass Sun             $M_\odot$    $1.9891 \times 10^{33}$ g                  $1.9891 \times 10^{30}$ kg

Luminosity Sun       $L_\odot$    $3.8515 \times 10^{33}$ erg s$^{-1}$       $3.8515 \times 10^{26}$ J s$^{-1}$

Astronomical unit    AU           $1.49598 \times 10^{13}$ cm                $1.49598 \times 10^{11}$ m

Parsec               pc           $3.08568 \times 10^{18}$ cm                $3.08568 \times 10^{16}$ m

Light year           ly           $9.463 \times 10^{17}$ cm                  $9.463 \times 10^{15}$ m

Conversion:

From Ångström to cm: 1 Å = $10^{-8}$ cm

From Newton to dyne: 1 N = $10^{5}$ dyne

From Joule to erg: 1 J = $10^{7}$ erg

From electronvolt to erg: 1 eV = $1.60217733 \times 10^{-12}$ erg

From atmosphere to dyne cm$^{-2}$: 1 atm = $1.01325 \times 10^{6}$ dyne cm$^{-2}$

Appendix C

Some key references for this discipline, used

in these lecture notes

The notes for this course are based on the following standard works, from which many of the illustrations

were taken, as indicated in the figure captions.

Aerts, C., Christensen-Dalsgaard, J., Kurtz, D.W., 2010, “Asteroseismology”, Springer-Verlag

Aerts, C., Mathis, S., Rogers, T.M., 2019, “Angular Momentum Transport in Stellar Interiors”, Annual Review of Astronomy & Astrophysics, Volume 57, pp. 35–78

Cox, J.P., Giuli, R.T., 1968, “Principles of Stellar Structure”, Volume I & II, Gordon & Breach, New York

Hansen, C.J., Kawaler, S.D., Trimble, V., 2004, “Stellar Interiors: Physical Principles, Structure, and Evolution”, Second Edition, Springer-Verlag

Hilditch, R.W., 2001, “An Introduction to Close Binary Stars”, Cambridge University Press

Kippenhahn, R., Weigert, A., Weiss, A., 2012, “Stellar Structure and Evolution”, 2nd edition, Springer-Verlag

Lamers, H.J.G.L.M., Cassinelli, J.P., 1999, “Introduction to Stellar Winds”, Cambridge University Press

Lamers, H.J.G.L.M., Levesque, E.M., 2017, “The Evolution of Massive Stars of 25–120 M⊙: Dominated by Mass Loss”, IOP Publishing

Maeder, A., 2009, “Physics, Formation and Evolution of Rotating Stars”, Springer-Verlag

Pringle, J.E., Wade, R.A., 1985, “Interacting Binary Stars”, Cambridge University Press

Tassoul, J.-L., Tassoul, M., 2004, “A concise history of solar and stellar physics”, Princeton University Press

Tkachenko, A., Pavlovski, K., Johnston, C., et al., 2020, “The mass discrepancy in intermediate- and high-mass eclipsing binaries: The need for higher convective core masses”, Astronomy & Astrophysics, Volume 637, id. A60, 20pp.

Torres, G., Andersen, J., Gimenez, A., 2010, “Accurate masses and radii of normal stars: modern results and applications”, Astronomy & Astrophysics Review, Volume 18, pp. 67–126

Weiss, A., Hillebrandt, W., Thomas, H.-C., Ritter, H., 2004, “Cox & Giuli’s principles of stellar structure: Extended Second Edition”, Cambridge Scientific Publishers
